Note

The following section is for the SP2000 Platforms only.


Link Aggregation (LAG) Monitoring for the SP2000 is a function of the Ethernet bonding driver that detects link faults. It is important in ensuring that the redundant network connection can be sustained if a component fails. The SP2000 cards need LAG Monitoring because the Ethernet ports on the AMCs are not directly connected to the backplane of the ATCA 13U or 3U chassis. Instead, the AMCs connect to an Ethernet switch on the carrier blade, and it is this switch that has ports directly connected to the chassis backplane. As a result, the AMC interfaces cannot directly see any loss of the Ethernet carrier signal on the backplane, and another mechanism (LAG Monitoring) is needed to allow the AMC interfaces to detect backplane link faults.

LAG Monitoring works by periodically validating the Ethernet communication path for each link from the physical network ports on an SP2000 card to the next actively managed network node. That node may be a port on an SCX switch card in a 13U deployment, or a next-hop router in a 3U deployment.

The validation is done by generating an Address Resolution Protocol (ARP) packet to the SCX or next-hop router and waiting for a response. If ARP responses are being lost, the related link is flagged as being in a failed state. Any subsequent Ethernet traffic is redirected to a peer link until the validation succeeds.
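The validation and failover behavior described above can be sketched as a small state model. This is an illustration only, not the bonding driver's implementation: the `Link` and `LagMonitor` names, the `max_missed` threshold, and the choice to stay on the peer link after the original link recovers are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """One slave link in the bond (hypothetical model for illustration)."""
    name: str
    missed: int = 0       # consecutive ARP probes that got no reply
    failed: bool = False  # True once the link is flagged as failed


class LagMonitor:
    """Tracks ARP probe results per link and picks the link that carries traffic."""

    def __init__(self, links, max_missed=3):
        self.links = list(links)
        self.max_missed = max_missed  # assumed threshold, not a documented default
        self.active = self.links[0]

    def record_probe(self, link, reply_received):
        """Record the outcome of one ARP probe sent over `link`."""
        if reply_received:
            # Validation succeeded: clear the failed state.
            link.missed = 0
            link.failed = False
        else:
            link.missed += 1
            if link.missed >= self.max_missed:
                link.failed = True
        self._select_active()

    def _select_active(self):
        """Redirect traffic to a healthy peer if the active link has failed."""
        if not self.active.failed:
            return
        for candidate in self.links:
            if not candidate.failed:
                self.active = candidate
                return
        # No healthy link is available; keep the current one.


eth0, eth1 = Link("eth0"), Link("eth1")
monitor = LagMonitor([eth0, eth1], max_missed=2)
monitor.record_probe(eth0, reply_received=False)
monitor.record_probe(eth0, reply_received=False)
print(monitor.active.name)
```

In this sketch, two consecutive lost replies on the first link flag it as failed and move traffic to the second link.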

To avoid being blocked by L2 routing rules in the related 3U network, the backup slave in the bond uses a unique MAC address for ARP validation. In addition, the LAG Monitoring logic is expanded so that when the active slave receives an ARP query packet from the standby slave, it treats this as an indication that the backup is working and should not fail validation.

The LAG Monitoring statistics provide a historical view of the timing of ARP reply packets from the next-hop routers. Frequent slow (or lost) ARP reply packets may be an indication of network congestion or some other incorrect behavior. The LAG Monitoring statistics are reported by the bonding driver in the /proc/net/bonding/bond0 file. This file is not easily deciphered, and typically only Ribbon Global Product Support will examine the contents for troubleshooting purposes.
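For orientation only, the following sketch shows one way to split bonding status text into bond-level and per-slave fields. The sample text follows the general "key: value" layout of the standard Linux bonding driver and is an assumption. The SP2000-specific LAG Monitoring statistics are not shown in this section and are not reproduced here, so the real file will contain fields that differ from this sample. The function name `parse_bonding_status` is hypothetical.

```python
def parse_bonding_status(text):
    """Split bonding status text into a bond-level dict and a list of per-slave dicts."""
    bond, slaves = {}, []
    current = bond
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank lines and lines without a "key: value" shape
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            # A new slave section starts here.
            current = {}
            slaves.append(current)
        current[key] = value
    return bond, slaves


# Assumed sample in the style of the standard Linux bonding driver output.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth0
MII Status: up
ARP Polling Interval (ms): 1000

Slave Interface: eth0
MII Status: up
Link Failure Count: 0

Slave Interface: eth1
MII Status: up
Link Failure Count: 2
"""

bond, slaves = parse_bonding_status(SAMPLE)
for slave in slaves:
    print(slave["Slave Interface"], slave["Link Failure Count"])
```

On a live system, the same function could be given the contents of /proc/net/bonding/bond0 instead of the sample string. Interpreting the results is still best left to Ribbon Global Product Support.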

Note

It is recommended that the LAG Monitoring timeout default values presented in the Menu UI are not changed. However, if you have a more complex network, slow ARP replies could result in unnecessary bonding failovers or loss of redundancy (that is, if the standby slave is unavailable due to an ARP validation failure). Any change to the timing of the LAG Monitor logic requires an associated change to the Transparent Inter-Process Communication (TIPC) link threshold setting.

To enable LAG Monitoring, the bonding driver needs to know the datadefault IP addresses of the SCX cards if on a 13U chassis, or the basedefault gateway addresses of the next-hop routers (typically ERS-8600 units) if the system is 3U.

Configuring LAG Monitoring includes the following procedures.

To query the datadefault IP address on the 13U


To configure the LAG Monitoring timeout


To confirm the changes in the preceding procedure have been fully enabled

