Author: Neftaly Malatjie

  • 114061 LG 1.42 How do we Monitor – Components

    Network Data Collection at SLAC

    Collect data via SNMP from:

    • Bridges, routers, ethermeters, hubs and switches
    • Data collected includes:
      • # good packets, # kilobytes, pkt size distribution
      • # errors (# of types of errors)
      • # pkts dropped, discarded, buffer/controller overflows
      • top-10 talkers & protocol distributions

    Collect data via Ping (for response, pkt loss, connectivity) from:

    • critical servers, router interfaces, ethermeters
    • off-site collaborators’ nodes

    Other Sources:

    • Poll critical Unix network daemons & services (e.g. mail, WWW, name, font, NFS …)
    • ARP caches
    • appearance of new unregistered nodes
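
As an illustration of the ARP-cache source, the sketch below flags nodes whose hardware address is not in a registration list, and IP addresses seen with more than one hardware address. The `(ip, mac)` tuple layout, the function names, and the sample addresses are assumptions made for this example; the source does not describe how SLAC stored or compared this data.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

ArpEntry = Tuple[str, str]  # (ip_address, mac_address), hypothetical layout


def unregistered_nodes(entries: Iterable[ArpEntry],
                       registered_macs: Set[str]) -> List[ArpEntry]:
    """Return ARP entries whose MAC address is not in the registered set."""
    return [(ip, mac) for ip, mac in entries if mac not in registered_macs]


def duplicate_ips(entries: Iterable[ArpEntry]) -> Dict[str, Set[str]]:
    """Return IP addresses that were seen with more than one MAC address."""
    macs_by_ip: Dict[str, Set[str]] = defaultdict(set)
    for ip, mac in entries:
        macs_by_ip[ip].add(mac)
    return {ip: macs for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Made-up sample data, for illustration only.
arp_cache = [
    ("10.0.0.1", "aa:aa:aa:aa:aa:aa"),
    ("10.0.0.2", "bb:bb:bb:bb:bb:bb"),
    ("10.0.0.1", "cc:cc:cc:cc:cc:cc"),
]
registered = {"aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb"}

new_nodes = unregistered_nodes(arp_cache, registered)
conflicts = duplicate_ips(arp_cache)
```

In this sample, the third entry is both an unregistered node and the cause of a duplicate IP address.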

    Data Analysis at SLAC

    Once a day (in the early morning), via batch jobs:

    • The previous day’s data is analyzed and summarized into ASCII files (usually tabular) and graphs
    • Long term graphs (fortnightly, monthly, 180 days) are updated

    Ongoing analysis during the day consists of:

    • Generating files of hourly graphs and other displays of data collected so far today:
      • Bridge, router and ethermeter interface stats
      • Top-10 talkers and subnet protocol usage

    Data Reduction at SLAC

    Analysis generates thousands of reports, most of which are uninteresting.

    Reduction examines the analysis reports and extracts the exceptions, e.g.

    • Duplicate IP addresses
    • Appearance of new unregistered nodes
    • Loss of connectivity
    • Data values exceeding thresholds, e.g.
      • CRC & alignment errors > 1 in 10000 packets
      • total utilization on a subnet of > 10% for the day
      • broadcast rate > 150 pkts/sec
      • (shorts+collisions)/good_packets > 10%
      • packet loss from onsite pings > 1% in a day
      • bridge/router overflows and queue drops
    • Creates exception reports (for display by WWW) with hypertext links to tables and plots with more information
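
The threshold checks listed above can be sketched as a small function that takes one day's summarized counters and returns the exceptions it finds. The `SubnetDay` record and its field names are hypothetical, and taking error ratios relative to good packets is an assumption; only the threshold values come from the list above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubnetDay:
    """One day's summarized counters for a subnet (hypothetical layout)."""
    good_packets: int
    crc_align_errors: int
    shorts: int
    collisions: int
    utilization_pct: float   # total utilization for the day, in percent
    broadcast_pps: float     # broadcast rate, in packets per second
    pings_sent: int
    pings_lost: int


def find_exceptions(day: SubnetDay) -> List[str]:
    """Return a description of every threshold that the day's data exceeds."""
    found = []
    # Assumption: error ratios are measured against good packets.
    # Integer cross-multiplication avoids division by zero on idle subnets.
    if day.crc_align_errors * 10000 > day.good_packets:
        found.append("CRC & alignment errors > 1 in 10000 packets")
    if day.utilization_pct > 10:
        found.append("total utilization > 10% for the day")
    if day.broadcast_pps > 150:
        found.append("broadcast rate > 150 pkts/sec")
    if (day.shorts + day.collisions) * 10 > day.good_packets:
        found.append("(shorts+collisions)/good_packets > 10%")
    if day.pings_lost * 100 > day.pings_sent:
        found.append("packet loss from onsite pings > 1% in a day")
    return found


# Made-up sample days, for illustration only.
quiet = SubnetDay(1_000_000, 50, 1000, 2000, 4.0, 20.0, 1000, 2)
noisy = SubnetDay(1_000_000, 500, 60_000, 50_000, 12.5, 200.0, 1000, 20)

for line in find_exceptions(noisy):
    print(line)
```

A day that returns an empty list produces no exception report, which is how reduction keeps the uninteresting reports out of the morning review.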

    Alert Notification

    The daily WWW-visible exception reports are manually reviewed each working morning and used as input to the morning H. O. T. meeting:

    • 5-15 min open meeting of network ops & development, systems admins, help desk and other interested people
    • covers: scheduled outages and installations, newly identified problems, outstanding/unresolved problems

    In addition:

    NMS maps show when a managed critical interface becomes inactive (goes red)

    SNMP and ping-polling of critical interfaces results in:

    • issuing of X-window pop-up windows
    • phone pages being issued
    • e-mail messages

    Security intrusions result in:

    • phone pages being issued by the pager system

    Results

    Service Level Expectations:

    • Examples
      • Ping response time for on-site network layer < 10msec for 95% of samples
      • Network reachability of critical nodes of >= 99%
      • Sub-second response for trivial network services (name, font, network daemons (smtp, nfsrpc) …)
      • 95% of trivial mail delivered on site in 10 minutes
      • 95% of requests for SLAC WWW home page served in < 0.1 secs.
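
Expectations of the form "below X for 95% of samples" can be checked by counting the fraction of samples that fall under the limit. The sketch below is one way to do that; the function names and the sample values are illustrative assumptions, not part of the SLAC tooling.

```python
from typing import Sequence


def fraction_below(samples: Sequence[float], limit: float) -> float:
    """Return the fraction of samples that are strictly below the limit."""
    if not samples:
        raise ValueError("at least one sample is required")
    return sum(1 for s in samples if s < limit) / len(samples)


def meets_expectation(samples: Sequence[float], limit: float,
                      required_fraction: float = 0.95) -> bool:
    """True if at least required_fraction of the samples are below the limit."""
    return fraction_below(samples, limit) >= required_fraction


def reachability(successful_polls: int, total_polls: int) -> float:
    """Return the fraction of polls that reached the node."""
    if total_polls <= 0:
        raise ValueError("total_polls must be positive")
    return successful_polls / total_polls


# Made-up ping round-trip times in milliseconds, for illustration only.
good_day = [2.0] * 96 + [25.0] * 4   # 96% of samples under 10 ms
bad_day = [2.0] * 90 + [25.0] * 10   # 90% of samples under 10 ms

ping_ok = meets_expectation(good_day, limit=10.0)
ping_bad = meets_expectation(bad_day, limit=10.0)
node_ok = reachability(995, 1000) >= 0.99
```

The same `meets_expectation` check applies to the mail-delivery and WWW response expectations by changing the limit and the unit of the samples.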
  • 114061 LG 1.41 What should we monitor

    The ultimate measures of performance are the users’ perceptions of the performance of their networked applications (e.g. WWW, email, a distributed RDBMS, a spreadsheet accessing a distributed file system etc.)

    This performance is affected by the performance of the complete Distributed System, which includes:

    • physical network plant
    • communications devices (e.g. routers, switches), computers and peripherals attached to the network plant
    • host resource utilization
    • software, from device interfaces through operating systems to applications running on computers and devices

    To set and meet user expectations for distributed system performance, we must monitor all of the above.
  • 114061 LG 1.50 SESSION 3: WAN NETWORK ADMINISTRATION

    • On completion of this section you will be able to explain network administration.

    • The explanation identifies the tasks involved and outlines their requirements. 
    • The explanation outlines, for a range of factors, how response times are affected.
    • The explanation outlines the principles of network interconnections. 
    • The explanation outlines and explains network security administration procedures. 
    • Network administration documentation is completed. 

  • 114061 LG 1.39 Characteristics of WAN

    • It connects devices that are separated by a broader geographical area than a LAN
    • It uses Carriers such as phone companies or network providers
    • It uses serial connections.
  • 114061 LG 1.38 FEATURES AND CONSTRAINTS OF WAN

    A wide area network (WAN) is a collection of interconnected local area networks (LANs) and other WANs, typically spanning a country or region. The Internet is the largest example of a WAN. WANs run a wide variety of protocols and provide many different services.

    The major features of WANs are listed below:

    • Multiple computers are connected together
    • It connects devices that are separated by a broader geographical area than a LAN
    • A WAN usually interconnects multiple LANs
    • Communication links between computers are provided by telephone networks, public data networks, satellites etc.
    • Links are of low capacity (that is, low data rate)
    • The bit error rate is higher (1 in 100,000) than that of a LAN
    • WAN data rates are low compared with LAN data transfer rates, and the signal propagation delay is much greater than on a LAN. Typical WAN data rates range from 56 kbps to 155 Mbps, 622 Mbps, 2.4 Gbps or higher
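
To make the data-rate, delay and error-rate figures above concrete, the following sketch estimates how long a file takes to transmit at two of the quoted rates, the one-way propagation delay over a long link, and the expected number of bit errors. The 1 MB file size, the 1000 km distance, and the signal speed of roughly 200,000 km/s (a common approximation for fibre or copper) are assumptions chosen for illustration, and kbps is taken to mean 1000 bits per second.

```python
def transmission_time_s(size_bytes: int, rate_bps: float) -> float:
    """Time to put size_bytes onto a link running at rate_bps bits per second."""
    return size_bytes * 8 / rate_bps


def propagation_delay_s(distance_km: float,
                        speed_km_per_s: float = 200_000.0) -> float:
    """One-way signal delay; the default speed is an assumed approximation."""
    return distance_km / speed_km_per_s


def expected_bit_errors(size_bytes: int, bit_error_rate: float) -> float:
    """Average number of corrupted bits for a transfer of size_bytes."""
    return size_bytes * 8 * bit_error_rate


FILE_SIZE = 1_000_000  # 1 MB file, an assumed example size

slow = transmission_time_s(FILE_SIZE, 56_000)        # 56 kbps link
fast = transmission_time_s(FILE_SIZE, 155_000_000)   # 155 Mbps link
delay = propagation_delay_s(1000)                    # assumed 1000 km link
errors = expected_bit_errors(FILE_SIZE, 1 / 100_000) # BER of 1 in 100,000

print(f"56 kbps:  {slow:.1f} s")
print(f"155 Mbps: {fast:.3f} s")
print(f"propagation delay over 1000 km: {delay * 1000:.1f} ms")
print(f"expected bit errors: {errors:.0f}")
```

Under these assumptions the same file takes over two minutes at 56 kbps and a small fraction of a second at 155 Mbps, while the propagation delay depends only on distance and not on the data rate.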