14.6 Verify the InfiniBand Fabric in a Multirack Configuration

Use the following commands on the command-line interface (CLI) of the spine switch to verify that the InfiniBand fabric in your multirack configuration is operational:

  1. ibnetdiscover

    Discovers and displays the InfiniBand fabric topology and connections. See Discover the InfiniBand Network Topology in a Multirack Configuration.

  2. ibdiagnet

    Performs diagnostics on the InfiniBand fabric and reports status. See Perform Diagnostics on the InfiniBand Fabric in a Multirack Configuration.

  3. ibcheckerrors

    Checks the entire InfiniBand fabric for errors. See Check for Errors in the InfiniBand Fabric in a Multirack Configuration.
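
The three commands listed above can also be run in one pass. The following is a minimal sketch, assuming a Bourne-compatible shell is available where the commands are run. The function name verify_fabric and the default log directory /tmp/ibverify are hypothetical and are not part of the product. The sketch assumes that a nonzero exit status indicates a problem, so treat such a message only as a prompt to read the saved log.

```shell
#!/bin/sh
# Hypothetical helper (not part of the product): run the three
# verification commands in order and save each command's output
# to a log file in the given directory.
verify_fabric() {
    logdir="${1:-/tmp/ibverify}"    # assumed default location
    mkdir -p "$logdir" || return 1
    failed=0
    for cmd in ibnetdiscover ibdiagnet ibcheckerrors; do
        echo "Running $cmd ..."
        # Assumption: a nonzero exit status is worth flagging.
        if ! "$cmd" > "$logdir/$cmd.log" 2>&1; then
            echo "$cmd exited with a nonzero status; see $logdir/$cmd.log"
            failed=1
        fi
    done
    return "$failed"
}
```

To use the sketch, load the function into the shell and enter verify_fabric, optionally followed by a log directory. The individual commands are described in the sections that follow.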

14.6.1 Discover the InfiniBand Network Topology in a Multirack Configuration

To discover the InfiniBand network topology and build a topology file that the OpenSM Subnet Manager uses, run the following command on the command-line interface (CLI) of the spine switch:

# ibnetdiscover

The topology file is used by InfiniBand commands to scan the InfiniBand fabric, validate the connectivity described in the topology file, and report errors indicated by the port counters.

The output is displayed, as in the following example:

# Topology file: generated on Sat Apr 13 22:28:55 2002
#
# Max of 1 hops discovered
# Initiated from node 0021283a8389a0a0 port 0021283a8389a0a0
vendid=0x2c9
devid=0xbd36
sysimgguid=0x21283a8389a0a3
switchguid=0x21283a8389a0a0(21283a8389a0a0)
Switch   36 "S-0021283a8389a0a0" # "Sun DCS 36 QDR switch localhost" enhanced port 0 lid 15 lmc 0
[23]    "H-0003ba000100e388"[2](3ba000100e38a) # "nsn33-43 HCA-1" lid 14 4xQDR
vendid=0x2c9
devid=0x673c
sysimgguid=0x3ba000100e38b
caguid=0x3ba000100e388
Ca   2 "H-0003ba000100e388" # "nsn33-43 HCA-1"
[2](3ba000100e38a)   "S-0021283a8389a0a0"[23] # lid 14 lmc 0 "Sun DCS 36 QDR switch localhost" lid 15 4xQDR

Note:

The actual output for your InfiniBand fabric will differ from that in the example.
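
When the fabric spans several racks, the topology output can be long, and a quick count of the discovered nodes is often enough for a first check against the expected configuration. The following is a minimal sketch, assuming that each node record begins with the keyword Switch or Ca at the start of a line, as in the example above. The function name count_fabric_nodes is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read ibnetdiscover output on standard input and
# print the number of switch and channel adapter (Ca) node records.
count_fabric_nodes() {
    awk '
        /^Switch[ \t]/ { switches++ }   # node record for a switch
        /^Ca[ \t]/     { cas++ }        # node record for a channel adapter
        END { printf "switches=%d cas=%d\n", switches, cas }
    '
}
```

For example, enter ibnetdiscover | count_fabric_nodes. For the example output above, the sketch prints switches=1 cas=1.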

14.6.2 Perform Diagnostics on the InfiniBand Fabric in a Multirack Configuration

To perform a collection of tests on the InfiniBand fabric and generate several files that contain parameters and aspects of the InfiniBand fabric, run the following command on the command-line interface (CLI) of the spine switch:

# ibdiagnet

In the following example, the ibdiagnet command is limited to link width and link speed checks (the -lw and -ls options) to determine which links are underperforming:

# ibdiagnet -lw 4x -ls 10 -skip all

Loading IBDIAGNET from: /usr/lib/ibdiagnet1.2
-W- Topology file is not specified.
 Reports regarding cluster links will use direct routes.
Loading IBDM from: /usr/lib/ibdm1.2
-I- Using port 0 as the local port.
-I- Discovering ... 2 nodes (1 Switches & 1 CA-s) discovered.
.
.
.
-I- Links With links width != 4x (as set by -lw option)
-I---------------------------------------------------
-I- No unmatched Links (with width != 4x) were found
-I---------------------------------------------------
-I- Links With links speed != 10 (as set by -ls option)
-I---------------------------------------------------
-I- No unmatched Links (with speed != 10) were found
.
.
.
-I- Stages Status Report:
 STAGE                          Errors Warnings
 Bad GUIDs/LIDs Check           0      0
 Link State Active Check        0      0
 Performance Counters Report    0      0
 Specific Link Width Check      0      0
 Specific Link Speed Check      0      0
 Partitions Check               0      0
 IPoIB Subnets Check            0      0
Please see /tmp/ibdiagnet.log for complete log
----------------------------------------------------------------
-I- Done. Run time was 1 seconds.

Note:

The actual output for your InfiniBand fabric will differ from that in the example.
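
The Stages Status Report at the end of the ibdiagnet output summarizes each stage with an error count and a warning count. The following is a minimal sketch of how that report could be scanned for nonzero counts, assuming the report layout shown in the example above (stage name followed by two numeric columns). The function name report_failed_stages is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read ibdiagnet output on standard input and print
# every stage of the Stages Status Report that has a nonzero error or
# warning count. Returns 1 if any such stage is found, 0 otherwise.
report_failed_stages() {
    awk '
        /Stages Status Report/ { in_report = 1; next }
        /^Please see/          { in_report = 0 }
        # A stage line ends with two integers: errors, then warnings.
        in_report && NF >= 3 && $(NF-1) ~ /^[0-9]+$/ && $NF ~ /^[0-9]+$/ {
            if ($(NF-1) > 0 || $NF > 0) { print; bad = 1 }
        }
        END { exit (bad + 0) }
    '
}
```

For example, enter ibdiagnet -lw 4x -ls 10 -skip all | report_failed_stages. For the example output above, the sketch prints nothing and returns 0, because every stage reports 0 errors and 0 warnings.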

14.6.3 Check for Errors in the InfiniBand Fabric in a Multirack Configuration

The ibcheckerrors command uses the topology file to scan the InfiniBand fabric, validate the connectivity described in the topology file, and report errors indicated by the port counters.

On the command-line interface (CLI) of the spine switch, enter the following command. The output is displayed, as in the following example:

# ibcheckerrors

## Summary: 4 nodes checked, 0 bad nodes found
##      34 ports checked, 0 ports have errors beyond threshold

Note:

The actual output for your InfiniBand fabric will differ from that in the example.
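
The two summary lines of the ibcheckerrors output carry the counts that matter for a pass or fail decision. The following is a minimal sketch of how those counts could be extracted, assuming the summary wording shown in the example above. The function name check_errors_summary and its return codes are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read ibcheckerrors output on standard input and
# print the bad-node and bad-port counts from the summary lines.
# Returns 0 if both counts are zero, 1 if either is nonzero, and 2 if
# the summary lines are not found.
check_errors_summary() {
    awk '
        /bad nodes found/ {
            # The count is the field just before the word "bad".
            for (i = 2; i <= NF; i++)
                if ($i == "bad") bad_nodes = $(i-1)
            seen++
        }
        /ports have errors beyond threshold/ {
            # The count is the field just before "ports have".
            for (i = 2; i <= NF; i++)
                if ($i == "ports" && $(i+1) == "have") bad_ports = $(i-1)
            seen++
        }
        END {
            if (seen < 2) { print "summary not found"; exit 2 }
            printf "bad_nodes=%d bad_ports=%d\n", bad_nodes, bad_ports
            exit ((bad_nodes + bad_ports > 0) ? 1 : 0)
        }
    '
}
```

For example, enter ibcheckerrors | check_errors_summary. For the example output above, the sketch prints bad_nodes=0 bad_ports=0 and returns 0.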