Understanding Hardware Commands
Validates the InfiniBand fabric and reports errors.
ibcheckerrors [-h][-b][-v][-N][topology|-C ca_name -P ca_port -t timeout]
where:
topology is the topology file.
ca_name is the channel adapter name.
ca_port is the channel adapter port.
timeout is the timeout in milliseconds.
This InfiniBand command is a script that uses the topology file created by the ibnetdiscover command to scan the InfiniBand fabric, validate connectivity, and report errors from the port counters.
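The two-step flow described above can be sketched as a small shell function. This is only an illustration: the function name check_fabric and the default path /tmp/ib_topology.txt are hypothetical, and the sketch assumes that ibnetdiscover writes the topology to standard output and that ibcheckerrors accepts the saved file as its topology argument, as in the syntax above.

```shell
#!/bin/sh
# Sketch only: capture the topology once, then check error counters against it.
# check_fabric and the default topology path are hypothetical names for this example.
check_fabric() {
    topo_file=${1:-/tmp/ib_topology.txt}

    # ibnetdiscover prints the discovered topology on standard output,
    # so redirect it into the topology file.
    ibnetdiscover > "$topo_file" || return 1

    # Pass the saved topology file as the positional topology argument.
    ibcheckerrors "$topo_file"
}
```

Running check_fabric with no argument would use the default path; passing a path as the first argument stores the topology there instead.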
The following table describes the options to the ibcheckerrors command and their purposes:
The following example shows how to check error counters for all LIDs in the InfiniBand fabric with the ibcheckerrors command.
# ibcheckerrors
#warn: counter SymbolErrors = 3121 (threshold 10) lid 25 port 255
#warn: counter RcvSwRelayErrors = 48545 (threshold 100) lid 25 port 255
#warn: counter XmtDiscards = 9789 (threshold 100) lid 25 port 255
Error check on lid 25 (Sun DCS 72 QDR FC switch o4nm2-72p-2) port all: FAILED
#warn: counter RcvSwRelayErrors = 56839 (threshold 100) lid 25 port 28
Error check on lid 25 (Sun DCS 72 QDR FC switch o4nm2-72p-2) port 28: FAILED
#warn: counter RcvSwRelayErrors = 56839 (threshold 100) lid 25 port 9
Error check on lid 25 (Sun DCS 72 QDR FC switch o4nm2-72p-2) port 9: FAILED
#warn: counter XmtDiscards = 9714 (threshold 100) lid 25 port 1
Error check on lid 25 (Sun DCS 72 QDR FC switch o4nm2-72p-2) port 1: FAILED
.
.
.
## Summary: 6 nodes checked, 0 bad nodes found
## 142 ports checked, 3 ports have errors beyond threshold
#
Note - The example shows only a portion of the full output.
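Because the full output can be long, it may help to reduce the #warn lines to a compact list. The following is a minimal sketch, assuming the #warn line layout shown in the example. The helper name list_warnings is hypothetical and is not part of the InfiniBand tools.

```shell
#!/bin/sh
# Hypothetical helper: reads ibcheckerrors output on standard input and prints
# one "lid port counter value threshold" row for each #warn line.
list_warnings() {
    awk '/^#warn: counter / {
        # Assumed layout: #warn: counter NAME = VALUE (threshold N) lid L port P
        gsub(/[()]/, "")             # drop the parentheses around the threshold
        print $9, $11, $3, $5, $7    # lid, port, counter, value, threshold
    }'
}

# Demonstration with two #warn lines taken from the example output.
list_warnings <<'EOF'
#warn: counter SymbolErrors = 3121 (threshold 10) lid 25 port 255
Error check on lid 25 (Sun DCS 72 QDR FC switch o4nm2-72p-2) port all: FAILED
#warn: counter XmtDiscards = 9714 (threshold 100) lid 25 port 1
EOF
```

On a live system, the same helper could read the command output directly, for example ibcheckerrors | list_warnings.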
ibcheckerrors man page