114061 LG 1.42 How do we Monitor – Components

Email: info@saypro.online Call/WhatsApp: + 27 84 313 7407


Network Data Collection at SLAC

Collect data via SNMP from:

  • Bridges, routers, ethermeters, hubs and switches
  • Data collected includes:
    • # good packets, # kilobytes, pkt size distribution
    • # errors (# of each type of error)
    • # pkts dropped, discarded, buffer/controller overflows
    • top-10 talkers & protocol distributions

Collect data via ping (for response, pkt loss, connectivity) from:

  • critical servers, router interfaces, ethermeters
  • off-site collaborators’ nodes

Other sources:

  • Poll critical Unix network daemons & services (e.g. mail, WWW, name, font, NFS …)
  • ARP caches
  • appearance of new unregistered nodes
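
The packet and byte figures gathered by SNMP are typically running counters, so each poll has to be compared with the previous one before it means anything. The sketch below shows one way to turn two successive readings into a rate, and to summarize a batch of ping samples into loss and response figures. It assumes 32-bit counters (SNMP Counter32, which wraps at 2**32) and at most one wrap between polls; the function names and the use of None for a lost ping are illustrative choices, not something taken from the SLAC tools.

```python
COUNTER32_MODULUS = 2 ** 32  # an SNMP Counter32 wraps to 0 after 2**32 - 1


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Increase between two counter readings, allowing for a single wrap."""
    return (current - previous) % modulus


def per_second_rate(previous, current, interval_seconds):
    """Average per-second rate between two successive counter readings."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return counter_delta(previous, current) / interval_seconds


def ping_summary(round_trip_ms):
    """Summarize ping samples; a value of None marks a lost packet (assumed convention)."""
    sent = len(round_trip_ms)
    received = [rtt for rtt in round_trip_ms if rtt is not None]
    lost = sent - len(received)
    return {
        "sent": sent,
        "lost": lost,
        "loss_percent": 100.0 * lost / sent if sent else 0.0,
        "mean_rtt_ms": sum(received) / len(received) if received else None,
    }
```

If a device is polled so rarely that a counter can wrap more than once between readings, the delta above undercounts, so the polling interval needs to be short relative to the traffic rate.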

Data Analysis at SLAC

Once a day (in the early morning), via batch jobs:

  • The previous day’s data is analyzed and summarized into ASCII files (usually tabular) and graphs
  • Long-term graphs (fortnightly, monthly, 180 days) are updated

Ongoing analysis during the day consists of:

  • Generating files of hourly graphs and other displays of the data collected so far today
  • Bridge, router and ethermeter interface stats
  • Top-10 talkers and subnet protocol usage
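
As a rough illustration of the summarizing step, the sketch below totals traffic per node, keeps the ten largest, and writes them as a plain ASCII table. The input layout (pairs of node name and byte count) and the function names are assumptions made for the example; the source does not describe the actual file formats used.

```python
from collections import Counter


def top_talkers(records, limit=10):
    """Total the bytes per node and return the largest `limit` as (node, bytes) pairs.

    `records` is an iterable of (node, byte_count) pairs (hypothetical layout).
    """
    totals = Counter()
    for node, byte_count in records:
        totals[node] += byte_count
    return totals.most_common(limit)


def format_table(rows, headers=("node", "bytes")):
    """Render (node, bytes) rows as a fixed-width ASCII table."""
    width = max([len(headers[0])] + [len(node) for node, _ in rows])
    lines = [f"{headers[0]:<{width}}  {headers[1]:>12}"]
    for node, total in rows:
        lines.append(f"{node:<{width}}  {total:>12}")
    return "\n".join(lines)
```

Calling `print(format_table(top_talkers(records)))` on a day's records gives one small table per subnet that can be saved as a text file or linked from a report page.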

Data Reduction at SLAC

Analysis generates thousands of reports, most of which are uninteresting.

Reduction examines the analysis reports and extracts the exceptions, e.g.

  • Duplicate IP addresses
  • Appearance of new unregistered nodes
  • Loss of connectivity
  • Data values exceeding thresholds, e.g.
    • CRC & alignment errors > 1 in 10000 packets
    • total utilization on a subnet of > 10% for the day
    • broadcast rate > 150 pkts/sec
    • (shorts+collisions)/good_packets > 10%
    • packet loss from onsite pings > 1% in a day
    • bridge/router overflows and queue drops

Reduction then creates exception reports (for display by WWW) with hypertext links to tables and plots with more information.
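
The threshold checks listed above can be sketched as a single function applied to one subnet's daily figures. The threshold values come from the list; everything else is an assumption for illustration: the dictionary keys, the choice to measure CRC and alignment errors against good packets, and the wording of the messages. Overflows and queue drops are left out because the source gives no threshold for them.

```python
def find_exceptions(name, stats):
    """Return a list of exception messages for one subnet's daily statistics.

    `stats` is a dict with hypothetical keys: good_packets, crc_align_errors,
    shorts, collisions, utilization_percent, broadcast_pkts_per_sec,
    ping_loss_percent.
    """
    exceptions = []
    good = stats["good_packets"]

    # CRC & alignment errors > 1 in 10000 packets (integer form avoids rounding)
    if stats["crc_align_errors"] * 10000 > good:
        exceptions.append(f"{name}: CRC/alignment errors exceed 1 in 10000 packets")

    # (shorts + collisions) / good_packets > 10%
    if (stats["shorts"] + stats["collisions"]) * 10 > good:
        exceptions.append(f"{name}: shorts+collisions exceed 10% of good packets")

    # total utilization on a subnet of > 10% for the day
    if stats["utilization_percent"] > 10:
        exceptions.append(f"{name}: utilization above 10% for the day")

    # broadcast rate > 150 pkts/sec
    if stats["broadcast_pkts_per_sec"] > 150:
        exceptions.append(f"{name}: broadcast rate above 150 pkts/sec")

    # packet loss from onsite pings > 1% in a day
    if stats["ping_loss_percent"] > 1:
        exceptions.append(f"{name}: ping packet loss above 1% for the day")

    return exceptions
```

Only subnets that return a non-empty list would need to appear in the exception report, which is what keeps the thousands of routine reports out of the reviewer's way.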

Alert Notification

The daily WWW-visible exception reports are manually reviewed each working morning and used as input to the morning H. O. T. meeting:

  • 5-15 min open meeting of network ops & development, systems admins, help desk and other interested people
  • covers: scheduled outages and installations, newly identified problems, outstanding/unresolved problems

In addition:

NMS maps show when a managed critical interface becomes inactive (goes red)

SNMP and ping-polling of critical interfaces results in:

  • issuing of X-window pop-up windows
  • phone pages being issued
  • e-mail messages

Security intrusions result in:

  • phone pages being issued by the pager system
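
One plausible way to drive such notifications is to compare each poll with the previous one and alert only on a change of state, so that an interface that stays down does not page someone on every poll. The sketch below assumes poll results are kept as a mapping from interface name to an up/down flag, and it represents each notification channel (pop-up, pager, e-mail) as a plain callable; both are hypothetical stand-ins, not a description of the SLAC system.

```python
def status_changes(previous, current):
    """Compare two polls and return (went_down, came_back) as sorted name lists.

    Each poll is a dict mapping interface name to True (responding) or False.
    An interface missing from `previous` is treated as having been up.
    """
    went_down = sorted(
        name for name, up in current.items() if not up and previous.get(name, True)
    )
    came_back = sorted(
        name for name, up in current.items() if up and not previous.get(name, True)
    )
    return went_down, came_back


def notify(went_down, channels):
    """Send one message per newly failed interface to every channel."""
    messages = [f"critical interface {name} is not responding" for name in went_down]
    for message in messages:
        for send in channels:
            send(message)
    return messages
```

For example, passing `channels=[print]` writes the alerts to the terminal, and a pager or mail sender can be added as another callable in the same list.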

Results

Service Level Expectations:

  • Examples
    • Ping response time for on-site network layer < 10 ms for 95% of samples
    • Network reachability of critical nodes >= 99%
    • Sub-second response for trivial network services (name, font, network daemons (smtp, nfsrpc) …)
    • 95% of trivial mail delivered on site within 10 minutes
    • 95% of requests for SLAC WWW home page served in < 0.1 s
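
Expectations of this form ("X% of samples better than a limit") are easy to check mechanically against a day's measurements. The sketch below tests the ping response and reachability examples; the function names and the input shapes are assumptions, and integer percentages are used so that a result sitting exactly on the boundary is not lost to floating-point rounding.

```python
def meets_response_expectation(samples_ms, limit_ms=10, required_percent=95):
    """True if at least `required_percent` of samples are below `limit_ms`."""
    if not samples_ms:
        raise ValueError("no samples to evaluate")
    within = sum(1 for value in samples_ms if value < limit_ms)
    return within * 100 >= required_percent * len(samples_ms)


def meets_reachability(successful_polls, total_polls, required_percent=99):
    """True if the node answered at least `required_percent` of its polls."""
    if total_polls <= 0:
        raise ValueError("total_polls must be positive")
    return successful_polls * 100 >= required_percent * total_polls
```

The same first function covers the other examples by changing the limit, for instance `limit_ms=100` for the 0.1 second home page target.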

Neftaly Malatjie | CEO | SayPro
Website: www.saypro.online
