You can use the ibswitches command to identify the Sun Network QDR InfiniBand Gateway Switches in the InfiniBand fabric in your Exalogic machine. This command displays the Global Unique Identifier (GUID), name, Local Identifier (LID), and LID mask control (LMC) for each switch. The output of the command is a mapping of GUID to LID for switches in the fabric.
On any command-line interface (CLI), run the following command:
# ibswitches
The output is displayed, as in the following example:
Switch : 0x0021283a8389a0a0 ports 36 "Sun DCS 36 QDR switch localhost" enhanced port 0 lid 15 lmc 0
Note:
The actual output for your InfiniBand fabric will differ from that in the example.
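Because the ibswitches output maps GUIDs to LIDs, it can be convenient to load it into a lookup table. The following is a minimal Python sketch and not part of the InfiniBand tools. The function name parse_ibswitches and the regular expression are illustrative assumptions modeled on the example line above; adjust them if your output is laid out differently.

```python
import re

# Assumed layout of one ibswitches line, modeled on the example output.
# The whitespace inside "enhanced port" is optional so that a fused
# "enhancedport" token is accepted as well.
SWITCH_RE = re.compile(
    r'^Switch\s*:\s*(?P<guid>0x[0-9a-fA-F]+)\s+ports\s+(?P<ports>\d+)\s+'
    r'"(?P<name>[^"]*)"\s+\w+\s*port\s+\d+\s+'
    r'lid\s+(?P<lid>\d+)\s+lmc\s+(?P<lmc>\d+)'
)


def parse_ibswitches(text):
    """Return {guid: {"name", "ports", "lid", "lmc"}} for each switch line."""
    switches = {}
    for line in text.splitlines():
        match = SWITCH_RE.match(line.strip())
        if match is None:
            continue  # skip prompts, blank lines, and unrecognized lines
        switches[match.group("guid")] = {
            "name": match.group("name"),
            "ports": int(match.group("ports")),
            "lid": int(match.group("lid")),
            "lmc": int(match.group("lmc")),
        }
    return switches


# Example line taken from this section.
SAMPLE = (
    'Switch : 0x0021283a8389a0a0 ports 36 '
    '"Sun DCS 36 QDR switch localhost" enhanced port 0 lid 15 lmc 0'
)

switch_map = parse_ibswitches(SAMPLE)
for guid, info in switch_map.items():
    print(guid, "-> LID", info["lid"])
```

In practice the text passed to parse_ibswitches would be the captured output of the ibswitches command rather than a hard-coded sample.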
You can use the ibhosts command to display identity information about the host channel adapters (HCAs) in the InfiniBand fabric in a subnet. This command displays the GUID and name for each HCA.
On the command-line interface (CLI), run the following command:
# ibhosts
The output is displayed, as in the following example:
Ca : 0x0003ba000100e388 ports 2 "nsn33-43 HCA-1"
Ca : 0x5080020000911310 ports 1 "nsn32-20 HCA-1"
Ca : 0x50800200008e532c ports 1 "ib-71 HCA-1"
Ca : 0x50800200008e5328 ports 1 "ib-70 HCA-1"
Ca : 0x50800200008296a4 ports 2 "ib-90 HCA-1"
.
.
.
#
Note:
The output in the example is just a portion of the full output and varies for each InfiniBand topology.
To understand the routing within your InfiniBand fabric, use the ibnetdiscover command, which displays the node-to-node connectivity. The length of the output depends on the size of your fabric. You can also use this command to display the LIDs of HCAs.
On the command-line interface (CLI), enter the following command:
# ibnetdiscover
The output is displayed, as in the following example:
# Topology file: generated on Sat Apr 13 22:28:55 2002
#
# Max of 1 hops discovered
# Initiated from node 0021283a8389a0a0 port 0021283a8389a0a0
vendid=0x2c9
devid=0xbd36
sysimgguid=0x21283a8389a0a3
switchguid=0x21283a8389a0a0(21283a8389a0a0)
Switch 36 "S-0021283a8389a0a0" # "Sun DCS 36 QDR switch localhost" enhanced port 0 lid 15 lmc 0
[23] "H-0003ba000100e388"[2](3ba000100e38a) # "nsn33-43 HCA-1" lid 14 4xQDR
vendid=0x2c9
devid=0x673c
sysimgguid=0x3ba000100e38b
caguid=0x3ba000100e388
Ca 2 "H-0003ba000100e388" # "nsn33-43 HCA-1"
[2](3ba000100e38a) "S-0021283a8389a0a0"[23] # lid 14 lmc 0 "Sun DCS 36 QDR switch localhost" lid 15 4xQDR
Note:
The actual output for your InfiniBand fabric will differ from that in the example.
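To illustrate how the LIDs of HCAs can be read from ibnetdiscover output, the following Python sketch walks the text and collects the LID reported on each port line of a Ca block. It is a rough illustration only: the function name hca_lids and both regular expressions are assumptions derived from the example in this section, not a complete parser for the topology file format.

```python
import re

# A node header such as:  Ca 2 "H-0003ba000100e388" # "nsn33-43 HCA-1"
NODE_RE = re.compile(r'^(Switch|Ca)\s+\d+\s+"([^"]+)"\s+#\s+"([^"]*)"')
# A Ca port line such as: [2](3ba000100e38a) "S-..."[23] # lid 14 lmc 0 ...
CA_PORT_RE = re.compile(
    r'^\[(\d+)\]\([0-9a-fA-F]+\)\s+"[^"]+"\[\d+\]\s+#\s+lid\s+(\d+)'
)


def hca_lids(text):
    """Return one record per HCA port with its node ID, name, port, and LID."""
    records = []
    current = None  # (node type, node ID, description) of the block being read
    for raw in text.splitlines():
        line = raw.strip()
        node = NODE_RE.match(line)
        if node:
            current = node.groups()
            continue
        port = CA_PORT_RE.match(line)
        if port and current and current[0] == "Ca":
            records.append({
                "node": current[1],
                "name": current[2],
                "port": int(port.group(1)),
                "lid": int(port.group(2)),
            })
    return records


# Excerpt of the example output shown in this section.
SAMPLE = '''
switchguid=0x21283a8389a0a0(21283a8389a0a0)
Switch 36 "S-0021283a8389a0a0" # "Sun DCS 36 QDR switch localhost" enhanced port 0 lid 15 lmc 0
[23] "H-0003ba000100e388"[2](3ba000100e38a) # "nsn33-43 HCA-1" lid 14 4xQDR
caguid=0x3ba000100e388
Ca 2 "H-0003ba000100e388" # "nsn33-43 HCA-1"
[2](3ba000100e38a) "S-0021283a8389a0a0"[23] # lid 14 lmc 0 "Sun DCS 36 QDR switch localhost" lid 15 4xQDR
'''

hcas = hca_lids(SAMPLE)
for entry in hcas:
    print(entry["name"], "port", entry["port"], "LID", entry["lid"])
```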
You sometimes need to know the route between two nodes in the InfiniBand fabric. The ibtracert command can provide that information by displaying the GUIDs, ports, and LIDs of the nodes.
On the command-line interface (CLI), run the following command:
# ibtracert slid dlid
where slid is the LID of the source node and dlid is the LID of the destination node in the fabric.
The output is displayed, as in the following example:
# ibtracert 15 14
#
From switch {0x0021283a8389a0a0} portnum 0 lid 15-15 "Sun DCS 36 QDR switch localhost"
[23] -> ca port {0x0003ba000100e38a}[2] lid 14-14 "nsn33-43 HCA-1"
To ca {0x0003ba000100e388} portnum 2 lid 14-14 "nsn33-43 HCA-1"
#
For this example:
The route starts at the switch with GUID 0x0021283a8389a0a0 and uses port 0.
The switch is LID 15, and the switch host's name in the description is Sun DCS 36 QDR switch localhost.
The route leaves the switch through port 23 and enters the HCA through port 2, which has port GUID 0x0003ba000100e38a.
The HCA is LID 14.
Note:
The actual output for your InfiniBand fabric will differ from that in the example.
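A route reported by ibtracert can also be broken into its endpoints and hops programmatically. The following Python sketch is an illustration under stated assumptions: the function name parse_ibtracert is hypothetical, and the regular expressions assume that the bracketed number before the arrow is the port on which the previous node transmits and that the bracketed number after the GUID is the receiving port on the next node.

```python
import re

# "From switch {guid} portnum N lid A-B "name"" and the matching "To" line.
ENDPOINT_RE = re.compile(
    r'^(?P<role>From|To)\s+(?P<type>\w+)\s+\{(?P<guid>0x[0-9a-fA-F]+)\}\s+'
    r'portnum\s+(?P<port>\d+)\s+lid\s+(?P<lid>\d+)-\d+\s+"(?P<name>[^"]*)"'
)
# "[23] -> ca port {guid}[2] lid 14-14 "name"" describes one hop.
HOP_RE = re.compile(
    r'^\[(?P<out_port>\d+)\]\s+->\s+(?P<type>\w+)\s+port\s+'
    r'\{(?P<guid>0x[0-9a-fA-F]+)\}\s*\[(?P<in_port>\d+)\]\s+'
    r'lid\s+(?P<lid>\d+)-\d+\s+"(?P<name>[^"]*)"'
)


def parse_ibtracert(text):
    """Split ibtracert output into source, hops, and destination."""
    route = {"source": None, "hops": [], "destination": None}
    for raw in text.splitlines():
        line = raw.strip()
        endpoint = ENDPOINT_RE.match(line)
        if endpoint:
            key = "source" if endpoint.group("role") == "From" else "destination"
            route[key] = {
                "type": endpoint.group("type"),
                "guid": endpoint.group("guid"),
                "port": int(endpoint.group("port")),
                "lid": int(endpoint.group("lid")),
                "name": endpoint.group("name"),
            }
            continue
        hop = HOP_RE.match(line)
        if hop:
            route["hops"].append({
                "out_port": int(hop.group("out_port")),
                "in_port": int(hop.group("in_port")),
                "type": hop.group("type"),
                "guid": hop.group("guid"),
                "lid": int(hop.group("lid")),
                "name": hop.group("name"),
            })
    return route


# Example output taken from this section.
SAMPLE = '''
From switch {0x0021283a8389a0a0} portnum 0 lid 15-15 "Sun DCS 36 QDR switch localhost"
[23] -> ca port {0x0003ba000100e38a}[2] lid 14-14 "nsn33-43 HCA-1"
To ca {0x0003ba000100e388} portnum 2 lid 14-14 "nsn33-43 HCA-1"
'''

route = parse_ibtracert(SAMPLE)
for hop in route["hops"]:
    print("out port", hop["out_port"], "-> in port", hop["in_port"], "LID", hop["lid"])
```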
If you want to know the link status of a node in the InfiniBand fabric, run the ibportstate command to display the state, width, and speed of a port on that node.
On the command-line interface (CLI), run the following command:
# ibportstate lid port
where lid is the LID of the node in the fabric, and port is the port of the node.
The output is displayed, as in the following example:
# ibportstate 15 23
PortInfo:
# Port info: Lid 15 port 23
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkWidthSupported:..............1X or 4X
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedActive:.................10.0 Gbps
Peer PortInfo:
# Port info: Lid 15 DR path slid 15; dlid 65535; 0,23
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkWidthSupported:..............1X or 4X
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedActive:.................10.0 Gbps
#
Note:
The actual output for your InfiniBand fabric will differ from that in the example.
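When many ports must be checked, the ibportstate fields can be read into a dictionary and tested in a script. The following Python sketch shows one possible approach. The function names parse_ibportstate and link_is_up are hypothetical, and the parser assumes the "Name:....value" layout and the PortInfo and Peer PortInfo headings shown in the example.

```python
import re

# Matches lines such as "LinkState:.......................Active".
FIELD_RE = re.compile(r'^(\w+):\.+(.*)$')


def parse_ibportstate(text):
    """Return {section name: {field: value}} for each section in the output."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # prompts and "# Port info" comment lines
        field = FIELD_RE.match(line)
        if field:
            if current is not None:
                sections[current][field.group(1)] = field.group(2).strip()
        elif line.endswith(":"):
            current = line[:-1]  # "PortInfo" or "Peer PortInfo"
            sections[current] = {}
    return sections


def link_is_up(port_info):
    """True when the logical link is Active and the physical link is LinkUp."""
    return (port_info.get("LinkState") == "Active"
            and port_info.get("PhysLinkState") == "LinkUp")


# Abbreviated version of the example output in this section.
SAMPLE = '''
PortInfo:
# Port info: Lid 15 port 23
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkWidthActive:.................4X
LinkSpeedActive:.................10.0 Gbps
Peer PortInfo:
# Port info: Lid 15 DR path slid 15; dlid 65535; 0,23
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkWidthActive:.................4X
LinkSpeedActive:.................10.0 Gbps
'''

sections = parse_ibportstate(SAMPLE)
for name, info in sections.items():
    print(name, info["LinkWidthActive"], info["LinkSpeedActive"], link_is_up(info))
```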
To help ascertain the health of a node in the fabric, use the perfquery command to display the performance, error, and data counters for that node.
On the command-line interface (CLI), enter the following command:
# perfquery lid port
where lid is the LID of the node in the fabric, and port is the port of the node.
Note:
If a port value of 255 is specified for a switch node, the counters are the total for all switch ports.
For example:
# perfquery 15 23
#
# Port counters: Lid 15 port 23
PortSelect:......................23
CounterSelect:...................0x1b01
SymbolErrors:....................0
.
.
.
VL15Dropped:.....................0
XmtData:.........................20232
RcvData:.........................20232
XmtPkts:.........................281
RcvPkts:.........................281
Note:
The output in the example is just a portion of the full output.
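To spot unhealthy ports quickly, a script can flag every error counter that is not zero. The following Python sketch is one way to do that. The function names are hypothetical, and the sketch assumes that every field other than the selector and traffic fields visible in the example (PortSelect, CounterSelect, XmtData, RcvData, XmtPkts, RcvPkts) is an error counter.

```python
import re

# Matches lines such as "SymbolErrors:....................0".
COUNTER_RE = re.compile(r'^(\w+):\.+(\S+)$')

# Assumption: these fields from the example are selectors or traffic
# counters; every other field is treated as an error counter.
NON_ERROR_FIELDS = {
    "PortSelect", "CounterSelect", "XmtData", "RcvData", "XmtPkts", "RcvPkts",
}


def parse_counters(text):
    """Return {counter name: integer value} for each counter line."""
    counters = {}
    for raw in text.splitlines():
        match = COUNTER_RE.match(raw.strip())
        if match is None:
            continue
        try:
            # Base 0 accepts both decimal ("281") and hex ("0x1b01") values.
            counters[match.group(1)] = int(match.group(2), 0)
        except ValueError:
            continue  # ignore values that are not plain numbers
    return counters


def nonzero_error_counters(counters):
    """Return only the error counters whose value is greater than zero."""
    return {
        name: value
        for name, value in counters.items()
        if name not in NON_ERROR_FIELDS and value != 0
    }


# Abbreviated version of the example output in this section.
SAMPLE = '''
# Port counters: Lid 15 port 23
PortSelect:......................23
CounterSelect:...................0x1b01
SymbolErrors:....................0
VL15Dropped:.....................0
XmtData:.........................20232
RcvData:.........................20232
XmtPkts:.........................281
RcvPkts:.........................281
'''

counters = parse_counters(SAMPLE)
problems = nonzero_error_counters(counters)
print("error counters above zero:", problems)
```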
To list the data counters for a node in the fabric, use the ibdatacounts command.
On the command-line interface (CLI), enter the following command:
# ibdatacounts lid port
where lid is the LID of the node in the fabric, and port is the port of the node.
For example:
# ibdatacounts 15 23
#
XmtData:.........................6048
RcvData:.........................6048
XmtPkts:.........................84
RcvPkts:.........................84
Note:
The actual output for your InfiniBand fabric will differ from that in the example.
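A single reading of the data counters says little about current traffic; the difference between two readings is usually more informative. The following Python sketch compares two ibdatacounts readings. The second reading and the 60-second interval are invented for illustration, the function names are hypothetical, and the conversion to bytes assumes that XmtData and RcvData count 4-octet words, which you should verify for your tools. Counter wraparound is not handled.

```python
import re

# Matches lines such as "XmtData:.........................6048".
COUNT_RE = re.compile(r'^(\w+):\.+(\d+)$')


def parse_datacounts(text):
    """Return {counter name: integer value} from ibdatacounts output."""
    counts = {}
    for raw in text.splitlines():
        match = COUNT_RE.match(raw.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def counter_deltas(earlier, later):
    """Return later minus earlier for every counter present in both readings."""
    return {name: later[name] - earlier[name] for name in earlier if name in later}


def bytes_per_second(data_delta, interval_seconds):
    """Convert a data counter delta to bytes per second.

    Assumption: the data counters are in units of 4 octets.
    """
    return data_delta * 4 / interval_seconds


# First reading: the example output from this section.
FIRST = '''
XmtData:.........................6048
RcvData:.........................6048
XmtPkts:.........................84
RcvPkts:.........................84
'''

# Second reading: hypothetical values, taken 60 seconds later.
SECOND = '''
XmtData:.........................9072
RcvData:.........................9072
XmtPkts:.........................126
RcvPkts:.........................126
'''

deltas = counter_deltas(parse_datacounts(FIRST), parse_datacounts(SECOND))
xmt_rate = bytes_per_second(deltas["XmtData"], 60)
print(deltas, xmt_rate)
```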
If intensive troubleshooting is necessary to resolve a problem, you can use the smpquery command to display very detailed information about a node in the fabric.
On the command-line interface (CLI), enter the following command:
# smpquery switchinfo lid
where lid is the LID of the node in the fabric.
For example:
# smpquery switchinfo 15
#
# Switch info: Lid 15
LinearFdbCap:....................49152
RandomFdbCap:....................0
McastFdbCap:.....................4096
LinearFdbTop:....................16
DefPort:.........................0
DefMcastPrimPort:................255
DefMcastNotPrimPort:.............255
LifeTime:........................18
StateChange:.....................0
LidsPerPort:.....................0
PartEnforceCap:..................32
InboundPartEnf:..................1
OutboundPartEnf:.................1
FilterRawInbound:................1
FilterRawOutbound:...............1
EnhancedPort0:...................1
#
Note:
The actual output for your InfiniBand fabric will differ from that in the example.
If intensive troubleshooting is necessary to resolve a problem, you can use the smpquery command to display very detailed information about a port.
On the command-line interface (CLI), enter the following command:
# smpquery portinfo lid port
where lid is the LID of the node in the fabric, and port is the port of the node.
For example:
# smpquery portinfo 15 23
#
Mkey:............................0x0000000000000000
GidPrefix:.......................0x0000000000000000
Lid:.............................0x0000
SMLid:...........................0x0000
CapMask:.........................0x0
DiagCode:........................0x0000
MkeyLeasePeriod:.................0
LocalPort:.......................0
LinkWidthEnabled:................1X or 4X
LinkWidthSupported:..............1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkDownDefState:................Polling
ProtectBits:.....................0
LMC:.............................0
.
.
.
SubnetTimeout:...................0
RespTimeVal:.....................0
LocalPhysErr:....................8
OverrunErr:......................8
MaxCreditHint:...................85
RoundTrip:.......................16777215
#
Note:
The example shows only a portion of the full output, and the actual output for your InfiniBand fabric will differ.
In the InfiniBand fabric in Exalogic machines, the Subnet Manager and Subnet Administrator assign subnet-specific LIDs to nodes in the fabric. When you use the InfiniBand commands, you must often provide an LID to issue a command to a particular InfiniBand device.
Alternatively, the output of a command might identify InfiniBand devices by their LID. You can create a file that maps node LIDs to node GUIDs, which can help with administering your InfiniBand fabric.
Note:
Creation of the mapping file is not a requirement for InfiniBand administration.
The following procedure creates a file that lists the LID in hexadecimal, the GUID in hexadecimal, and the node description:
Note:
The output in the example is just a portion of the entire file.
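A mapping file of this kind can take many forms. As one illustration, the following Python sketch formats a list of (LID, GUID, description) records into lines that contain the LID in hexadecimal, the GUID in hexadecimal, and the node description. The field widths, the column order, and the function names format_mapping and guid_for_lid are assumptions made for this sketch, and the two records are taken from the examples elsewhere in this section; how you collect the records for your own fabric is left open.

```python
def format_mapping(records):
    """Return the mapping text, one "LID GUID description" line per node.

    Each record is a (lid, guid, description) tuple with integer lid and guid.
    Lines are sorted by LID.
    """
    lines = []
    for lid, guid, description in sorted(records):
        # Assumed layout: 4-digit hex LID, 16-digit hex GUID, free text.
        lines.append(f"0x{lid:04x} 0x{guid:016x} {description}")
    return "\n".join(lines) + "\n"


def guid_for_lid(records, lid):
    """Return the GUID (as a hex string) recorded for a LID, or None."""
    for record_lid, guid, _description in records:
        if record_lid == lid:
            return f"0x{guid:016x}"
    return None


# Records taken from the switch and HCA shown in this section's examples.
RECORDS = [
    (15, 0x0021283A8389A0A0, "Sun DCS 36 QDR switch localhost"),
    (14, 0x0003BA000100E388, "nsn33-43 HCA-1"),
]

mapping_text = format_mapping(RECORDS)
print(mapping_text, end="")
```

To keep the mapping on disk, the returned text can be written to a file of your choice with the standard open() and write() calls.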
If you require full testing of your InfiniBand fabric, you can use the ibdiagnet command to perform many tests with verbose results. The command is a useful tool for determining the overall health of the InfiniBand fabric.
On the command-line interface (CLI), run the following command:
# ibdiagnet -v -r
The ibdiagnet.log file contains the log of the testing.
You can use the ibdiagpath command to perform some of the same comprehensive tests for a particular route.
On the command-line interface (CLI), run the following command:
# ibdiagpath -v -l slid dlid
where slid is the LID of the source node in the fabric, and dlid is the LID of the destination node.
The ibdiagpath.log file contains the log of the testing.
If your fabric has a number of nodes that are suspect, the osmtest command enables you to take a snapshot (inventory file) of your fabric and at a later time compare that file to the present conditions.
Note:
Although this procedure is most useful after initializing the Subnet Manager, it can be performed at any time.
Complete the following steps:
You can use the ibdiagnet command to determine which links are experiencing symbol errors and recovery errors by injecting packets.
On the command-line interface (CLI), run the following command:
# ibdiagnet -c 100 -P all=1
In this instance of the ibdiagnet command, 100 test packets are injected into each link and the -P all=1 option returns all counters that increment during the test.
In the output of the ibdiagnet command, search for the symbol_error_counter string. That line contains the symbol error count in hexadecimal. The preceding lines identify the node and port with the errors. Symbol errors are minor errors, and if there are relatively few during the diagnostic, they can be monitored.
Note:
The InfiniBand specification requires a bit error rate (BER) of 10E-12 or better. At that rate, the maximum allowable symbol error rate is 120 errors per hour.
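The limit of 120 symbol errors per hour can be applied to a measured count with simple arithmetic. The following Python sketch converts a hexadecimal symbol error count and an observation period into an hourly rate and compares it with the limit. The function names and the sample values (0x1e errors in 900 seconds) are hypothetical; the observation period must come from your own record of when the counters were last cleared.

```python
# Limit quoted in the note above: 120 symbol errors per hour.
MAX_SYMBOL_ERRORS_PER_HOUR = 120


def symbol_errors_per_hour(count_hex, elapsed_seconds):
    """Scale a hexadecimal error count to an errors-per-hour rate."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    count = int(count_hex, 16)  # accepts "1e" as well as "0x1e"
    return count * 3600 / elapsed_seconds


def exceeds_limit(count_hex, elapsed_seconds):
    """True when the hourly rate is above the allowable maximum."""
    rate = symbol_errors_per_hour(count_hex, elapsed_seconds)
    return rate > MAX_SYMBOL_ERRORS_PER_HOUR


# Hypothetical reading: 0x1e (30) symbol errors observed over 15 minutes.
rate = symbol_errors_per_hour("0x1e", 900)
print(rate, exceeds_limit("0x1e", 900))
```

With these sample values the rate equals the limit exactly, so the link would be monitored rather than flagged; one more error in the same period would put it over the limit.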
In addition, in the output of the ibdiagnet command, search for the link_error_recovery_counter string. That line contains the recovery error count in hexadecimal. The preceding lines identify the node and port with the errors. Recovery errors are major errors, and the respective links must be investigated for the cause of the rapid symbol error propagation.
Additionally, the ibdiagnet.log file contains the log of the testing.
To perform a quick check of all ports of all nodes in your InfiniBand fabric, you can use the ibcheckstate command.
On the command-line interface (CLI), run the following command:
# ibcheckstate -v
The output is displayed, as in the following example:
# Checking Switch: nodeguid 0x0021283a8389a0a0
Node check lid 15: OK
Port check lid 15 port 23: OK
Port check lid 15 port 19: OK
.
.
.
# Checking Ca: nodeguid 0x0003ba000100e388
Node check lid 14: OK
Port check lid 14 port 2: OK
## Summary: 5 nodes checked, 0 bad nodes found
## 10 ports checked, 0 ports with bad state found
#
Note:
The ibcheckstate command requires time to complete, depending upon the size of your InfiniBand fabric. Without the -v option, the output contains only failed ports. The output in the example is only a small portion of the actual output.