You can use the ibdiagnet command to inject test packets into the fabric and determine which links are experiencing symbol errors and recovery errors.
On the management controller, type:
# ibdiagnet -c 100 -P all=1
In this instance of the ibdiagnet command, 100 test packets are injected into each link, and the -P all=1 option returns all counters that increment during the test.
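Because the following steps search the command output, it can help to keep a copy of it. The sketch below is one possible way to do that and is not part of the switch documentation. The function name run_link_test and its file path argument are hypothetical. Only the ibdiagnet options shown above are used.

```shell
# Hypothetical helper: run the link test and keep a copy of the output.
# $1 is a file path chosen by the administrator (an assumption).
run_link_test() {
    # tee shows the output on the terminal and writes it to the file.
    ibdiagnet -c 100 -P all=1 2>&1 | tee "$1"
}
```

For example, run_link_test /tmp/linktest.out would leave the output in /tmp/linktest.out, where that path is only an illustration.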
In the output of the ibdiagnet command, search for the symbol_error_counter string.
That line contains the symbol error count in hexadecimal. The preceding lines identify the node and port with the errors. Symbol errors are minor errors. If relatively few occur during the diagnostic, you can continue to monitor the link rather than act immediately.
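The search can be sketched in shell as follows, assuming the command output was saved to a file. The function names show_counter and hex_to_dec are hypothetical. The number of preceding lines to print and the 0x prefix on the count are assumptions about the output layout, so adjust them to match what the switch actually prints. The -B option is available in GNU and BSD grep.

```shell
# Print each line naming the counter, plus the lines just before it.
# $1 = saved output file, $2 = counter name,
# $3 = number of preceding lines to show (defaults to 3, an assumption).
show_counter() {
    grep -B "${3:-3}" "$2" "$1"
}

# Convert a hexadecimal count (assumed to carry a 0x prefix) to decimal.
hex_to_dec() {
    printf '%d\n' "$1"
}
```

For example, show_counter saved_output symbol_error_counter prints the matching lines with their context, and hex_to_dec 0x1f prints 31.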
Note - Based on the 10E-12 bit error rate (BER) in the InfiniBand specification, the maximum allowable symbol error rate is 120 errors per hour.
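To compare an observed count against this limit, the count can be scaled to an hourly rate. The sketch below is only a rough extrapolation and assumes you know the interval, in seconds, over which the errors accumulated. That interval is not reported in this procedure. The function names are hypothetical.

```shell
# Limit taken from the note above: 120 symbol errors per hour.
MAX_SYMBOL_ERRORS_PER_HOUR=120

# $1 = error count (decimal or 0x-prefixed hex), $2 = interval in seconds.
# Integer arithmetic is used, so the result is truncated.
errors_per_hour() {
    echo $(( $1 * 3600 / $2 ))
}

# Succeeds (exit status 0) when the extrapolated rate exceeds the limit.
exceeds_symbol_limit() {
    [ "$(errors_per_hour "$1" "$2")" -gt "$MAX_SYMBOL_ERRORS_PER_HOUR" ]
}
```

For example, 2 errors in 60 seconds scale to 2 x 3600 / 60 = 120 errors per hour, which is at the limit but does not exceed it. 3 errors in the same interval scale to 180 and do exceed it.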
Also in the output of the ibdiagnet command, search for the link_error_recovery_counter string.
That line contains the recovery error count in hexadecimal. The preceding lines identify the node and port with the errors. Recovery errors are major errors, and you must investigate the affected links to find the cause of the rapid symbol error propagation.
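Because any recovery error warrants investigation, a script might list only the nonzero counts. The following is a minimal sketch under stated assumptions: the output was saved to a file, and the count is the only 0x-prefixed token on each link_error_recovery_counter line. The function name is hypothetical, and the -o option is available in GNU and BSD grep.

```shell
# Print every nonzero link_error_recovery_counter value found in file $1.
nonzero_recovery_counts() {
    grep 'link_error_recovery_counter' "$1" |
        grep -o '0x[0-9a-fA-F][0-9a-fA-F]*' |
        while read -r value; do
            # Shell arithmetic accepts 0x-prefixed constants.
            if [ "$(( $value ))" -ne 0 ]; then
                echo "$value"
            fi
        done
}
```

If the function prints nothing, no recovery errors were recorded in the saved output. Otherwise, search for the printed values in the output to find the node and port they belong to.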
Note - The ibdiagnet.log file also contains a log of the test.
Switch Reference, ibdiagnet command