This section introduces Sun Network QDR InfiniBand Gateway Switches, which are also referred to as leaf switches in this guide.
Table 12-1 provides the physical specifications of the Sun Network QDR InfiniBand Gateway Switch.
Table 12-1 NM2-GW Specifications
Dimension | Measurements |
---|---|
Width | 17.52 in. (445.0 mm) |
Depth | 24 in. (609.6 mm) |
Height | 1.75 in. (44.5 mm) |
Weight | 23.0 lbs (11.4 kg) |
With power applied, you can access the CLI of a gateway switch in your Exalogic machine.
The number of gateway switches in your Exalogic machine depends on your purchased Exalogic machine rack configuration. You must access the command-line interfaces of these gateway switches individually.
For example, to access the CLI of a gateway switch, complete the following steps:
If you are using a network management port, begin network communication with the CLI using the ssh command and the host name configured for the gateway switch:
% ssh -l root gateway-name
root@gateway-name's password: password
#
where gateway-name is the host name configured for the gateway switch.
If you do not see this output or prompt, there is a problem with the network communication, host name, or CLI.
If you are using a USB management port, begin serial communication with the CLI as follows:
Connect a serial terminal, terminal server, or workstation with a TIP connection to the USB-to-serial adapter. Configure the terminal or terminal emulator with these settings:
115200 baud, 8 bits, No parity, 1 Stop bit, and No handshaking
Press the Return or Enter key on the serial device several times to synchronize the connection. You might see text similar to the following:
…
CentOS release 5.2 (Final)
Kernel 2.6.27.13-nm2 on an i686
gateway-name login: root
Password: password
#
where gateway-name is the host name assigned to the gateway switch.
If you do not see this output or prompt, there is a problem with the network communication, host name, or command-line interface (CLI).
Note:
Repeat these steps to access the CLI for the other gateway switches in your Exalogic machine.
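Because each gateway switch must be accessed individually, it can help to prepare the ssh command line for every switch up front. The following is a minimal Python sketch that only builds and prints the commands; it does not open any connection. The host names in `GATEWAYS` are hypothetical placeholders, and the helper name `ssh_command` is an assumption made for illustration.

```python
import shlex

# Hypothetical host names; replace with the names configured for your gateway switches.
GATEWAYS = ["gateway-1", "gateway-2"]

def ssh_command(host, user="root"):
    """Build the argument list for the ssh login used in the steps above."""
    return ["ssh", "-l", user, host]

for host in GATEWAYS:
    # shlex.join renders the argument list as a shell command that can be pasted
    print(shlex.join(ssh_command(host)))
```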
For each gateway switch, you can check the status of the CLI, power supplies, fans, and switch chip. Verify that the voltage and temperature values of the gateway switch are within specification:
# showunhealthy
# env_test
Unfavorable output from these commands indicates a hardware fault in the corresponding component. A voltage or temperature that deviates by more than 10% from the provided specification indicates a problem with that component.
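The 10% rule can be written as a small check. The following is a minimal Python sketch, not part of the switch software. The function name `deviates` is hypothetical, and the nominal values are an assumption inferred from the rail labels in the env_test example in this section; use the values from the actual specification where they differ.

```python
def deviates(measured, nominal, tolerance=0.10):
    """Return True if measured differs from nominal by more than the given fraction."""
    return abs(measured - nominal) > tolerance * abs(nominal)

# (label, assumed nominal volts, measured volts from the env_test example)
readings = [
    ("3.3V Main", 3.3, 3.28),
    ("12V", 12.0, 12.06),
    ("5V", 5.0, 5.03),
    ("1.8V", 1.8, 1.80),
]

for name, nominal, measured in readings:
    status = "out of range" if deviates(measured, nominal) else "within 10%"
    print(f"{name}: measured {measured:.2f} V, {status}")
```

For example, a 3.3 V rail is flagged only when the reading falls below 2.97 V or rises above 3.63 V.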
For example, on the CLI of one of the gateway switches, enter the following command to check its status:
# env_test
This command performs a set of checks and displays the overall status of the gateway switch, as in the following example:
Environment test started:
Starting Voltage test:
Voltage ECB OK
Measured 3.3V Main = 3.28 V
Measured 3.3V Standby = 3.37 V
Measured 12V = 12.06 V
Measured 5V = 5.03 V
Measured VBAT = 3.25 V
Measured 1.0V = 1.01 V
Measured I4 1.2V = 1.22 V
Measured 2.5V = 2.52 V
Measured V1P2 DIG = 1.17 V
Measured V1P2 AND = 1.16 V
Measured 1.2V BridgeX = 1.21 V
Measured 1.8V = 1.80 V
Measured 1.2V Standby = 1.20 V
Voltage test returned OK
Starting PSU test:
PSU 0 present
PSU 1 present
PSU test returned OK
Starting Temperature test:
Back temperature 23.00
Front temperature 32.62
SP temperature 26.12
Switch temperature 45, maxtemperature 45
Bridge-0 temperature 41, maxtemperature 42
Bridge-1 temperature 43, maxtemperature 44
Temperature test returned OK
Starting FAN test:
Fan 0 not present
Fan 1 running at rpm 11212
Fan 2 running at rpm 11313
Fan 3 running at rpm 11521
Fan 4 not present
FAN test returned OK
Starting Connector test:
Connector test returned OK
Starting onboard ibdevice test:
Switch OK
Bridge-0 OK
Bridge-1 OK
All Internal ibdevices OK
Onboard ibdevice test returned OK
Environment test PASSED
When the status is operational, you can start the Subnet Manager (SM).
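If you capture the env_test output to a file or variable, the per-test results can be summarized automatically. The following is a minimal Python sketch under stated assumptions: the function name `summarize_env_test` is hypothetical, the sample text is an abbreviated copy of the example output above, and any result other than "Environment test PASSED" is simply treated as not passed, because the exact failure wording is not shown in this guide.

```python
import re

def summarize_env_test(output):
    """Return ({sub-test name: verdict}, overall_passed) from env_test output."""
    results = {}
    lines = [line.strip() for line in output.splitlines()]
    for line in lines:
        # Matches lines such as "Voltage test returned OK"
        m = re.match(r"(.+) test returned (\S+)$", line)
        if m:
            results[m.group(1)] = m.group(2)
    passed = "Environment test PASSED" in lines
    return results, passed

sample = """\
Environment test started:
Starting Voltage test:
Voltage test returned OK
Starting PSU test:
PSU test returned OK
Starting Temperature test:
Temperature test returned OK
Starting FAN test:
FAN test returned OK
Environment test PASSED
"""

results, passed = summarize_env_test(sample)
failed = [name for name, verdict in results.items() if verdict != "OK"]
print("Overall passed:", passed)
print("Sub-tests not OK:", failed)
```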
Note:
Repeat these steps to verify the status of the other gateway switches in your Exalogic machine.
By default, the Subnet Manager (SM) is enabled on the gateway switches in a single Exalogic rack configuration.
However, if the SM is not running on the InfiniBand switches, you can start and activate the SM as follows:
After starting the SM, you can verify that the Link LEDs for cabled links are green. If the Link LED is dark, the link is down. If the Link LED flashes, there are symbol errors.
To check the link status of the cables:
# listlinkup
If the link for a connector is reported as not present, the link at either end of the cable is down. If a port is down, use the enableswitchport 0 portnumber command to bring the port up. Alternatively, use the ibdevreset command to reset the switch chip.
See the Sun Network QDR InfiniBand Gateway Switch Administration Guide, "Enable a Switch Chip Port" and "Reset the Switch Chip".
After making sure that the link is up, you can verify the InfiniBand fabric.
The following is an output example of the listlinkup command:
# listlinkup
Connector 0A Present <-> Switch Port 20 up (Enabled)
Connector 1A Present <-> Switch Port 22 up (Enabled)
Connector 2A Present <-> Switch Port 24 up (Enabled)
. . .
Connector 15A Not present
Connector 0A-ETH Present
Bridge-0-1 Port 0A-ETH-1 up (Enabled)
Bridge-0-1 Port 0A-ETH-2 up (Enabled)
Bridge-0-0 Port 0A-ETH-3 up (Enabled)
Bridge-0-0 Port 0A-ETH-4 up (Enabled)
Connector 1A-ETH Present
Bridge-1-1 Port 1A-ETH-1 up (Enabled)
Bridge-1-1 Port 1A-ETH-2 up (Enabled)
Bridge-1-0 Port 1A-ETH-3 up (Enabled)
Bridge-1-0 Port 1A-ETH-4 up (Enabled)
Connector 0B Present <-> Switch Port 19 up (Enabled)
Connector 1B Present <-> Switch Port 21 up (Enabled)
. . .
Connector 15B Not present
#
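To find the connectors and ports that need attention without reading every line, the listlinkup output can be scanned programmatically. The following is a minimal Python sketch; the function name `check_links` is hypothetical, and the assumption that a port that is not up shows a state word other than "up" in the same position (for example "down") is not confirmed by the example output above.

```python
import re

def check_links(output):
    """Return (connectors reported 'Not present', ports whose state is not 'up')."""
    absent, not_up = [], []
    for line in output.splitlines():
        line = line.strip()
        m = re.match(r"Connector (\S+) Not present", line)
        if m:
            absent.append(m.group(1))
            continue
        # Matches "Switch Port 20 up (Enabled)" and "Port 0A-ETH-1 up (Enabled)"
        m = re.search(r"Port (\S+) (\w+) \((\w+)\)", line)
        if m and m.group(2) != "up":
            not_up.append(m.group(1))
    return absent, not_up

sample = """\
Connector 0A Present <-> Switch Port 20 up (Enabled)
Connector 1A Present <-> Switch Port 22 up (Enabled)
Connector 15A Not present
Connector 0A-ETH Present
Bridge-0-1 Port 0A-ETH-1 up (Enabled)
Connector 0B Present <-> Switch Port 19 up (Enabled)
"""

absent, not_up = check_links(sample)
print("Connectors not present:", absent)
print("Ports not up:", not_up)
```

A port number returned in the second list is a candidate for the enableswitchport command described above.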
Use the following commands on the command-line interface (CLI) to verify that the InfiniBand fabric is operational:
To discover the InfiniBand network topology and build a topology file which is used by the OpenSM Subnet Manager, run the following command on the command-line interface (CLI) of a gateway switch:
# ibnetdiscover
The topology file is used by InfiniBand commands to scan the InfiniBand fabric and validate the connectivity as described in the topology file, and to report errors as indicated by the port counters.
The output is displayed, as in the following example:
# Topology file: generated on Sat Apr 13 22:28:55 2002
#
# Max of 1 hops discovered
# Initiated from node 0021283a8389a0a0 port 0021283a8389a0a0
vendid=0x2c9
devid=0xbd36
sysimgguid=0x21283a8389a0a3
switchguid=0x21283a8389a0a0(21283a8389a0a0)
Switch 36 "S-0021283a8389a0a0" # "Sun DCS 36 QDR switch localhost" enhanced port 0 lid 15 lmc 0
[23] "H-0003ba000100e388"[2](3ba000100e38a) # "nsn33-43 HCA-1" lid 14 4xQDR
vendid=0x2c9
devid=0x673c
sysimgguid=0x3ba000100e38b
caguid=0x3ba000100e388
Ca 2 "H-0003ba000100e388" # "nsn33-43 HCA-1"
[2](3ba000100e38a) "S-0021283a8389a0a0"[23] # lid 14 lmc 0 "Sun DCS 36 QDR switch localhost" lid 15 4xQDR
Note:
The actual output for your InfiniBand fabric will differ from that in the example.
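A quick inventory of the discovered nodes can be extracted from the topology output. The following is a minimal Python sketch that reads only the Switch and Ca header lines; the function name `list_nodes` is hypothetical, and the line layout it expects is assumed from the example in this section rather than from a formal description of the file format.

```python
import re

def list_nodes(topology):
    """Return (type, port count, node id, description) for each Switch or Ca line."""
    nodes = []
    for line in topology.splitlines():
        m = re.match(r'(Switch|Ca)\s+(\d+)\s+"([^"]+)"\s+#\s+"([^"]*)"', line.strip())
        if m:
            nodes.append((m.group(1), int(m.group(2)), m.group(3), m.group(4)))
    return nodes

sample = '''\
Switch 36 "S-0021283a8389a0a0" # "Sun DCS 36 QDR switch localhost" enhanced port 0 lid 15 lmc 0
[23] "H-0003ba000100e388"[2](3ba000100e38a) # "nsn33-43 HCA-1" lid 14 4xQDR
Ca 2 "H-0003ba000100e388" # "nsn33-43 HCA-1"
[2](3ba000100e38a) "S-0021283a8389a0a0"[23] # lid 14 lmc 0 "Sun DCS 36 QDR switch localhost" lid 15 4xQDR
'''

nodes = list_nodes(sample)
for kind, ports, node_id, description in nodes:
    print(f"{kind:<6} {ports:>2} ports  {node_id}  {description}")
```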
To perform a collection of tests on the InfiniBand fabric and generate several files that contain parameters and aspects of the InfiniBand fabric, run the following command on the command-line interface (CLI) on a gateway switch:
# ibdiagnet
In the following example, the ibdiagnet command is minimized to determine which links are utilized:
# ibdiagnet -lw 4x -ls 10 -skip all
Loading IBDIAGNET from: /usr/lib/ibdiagnet1.2
-W- Topology file is not specified.
Reports regarding cluster links will use direct routes.
Loading IBDM from: /usr/lib/ibdm1.2
-I- Using port 0 as the local port.
-I- Discovering ... 2 nodes (1 Switches & 1 CA-s) discovered.
. . .
-I- Links With links width != 4x (as set by -lw option)
-I---------------------------------------------------
-I- No unmatched Links (with width != 4x) were found
-I---------------------------------------------------
-I- Links With links speed != 10 (as set by -ls option)
-I---------------------------------------------------
-I- No unmatched Links (with speed != 10) were found
. . .
-I- Stages Status Report:
STAGE Errors Warnings
Bad GUIDs/LIDs Check 0 0
Link State Active Check 0 0
Performance Counters Report 0 0
Specific Link Width Check 0 0
Specific Link Speed Check 0 0
Partitions Check 0 0
IPoIB Subnets Check 0 0
Please see /tmp/ibdiagnet.log for complete log
----------------------------------------------------------------
-I- Done. Run time was 1 seconds.
Note:
The actual output for your InfiniBand fabric will differ from that in the example.
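The Stages Status Report at the end of the ibdiagnet output lends itself to an automated check. The following is a minimal Python sketch that lists the stages reporting any errors or warnings; the function name `stages_with_problems` is hypothetical, and the nonzero counts in the sample are invented for illustration and do not come from a real run.

```python
import re

def stages_with_problems(report):
    """Return {stage name: (errors, warnings)} for stages with a nonzero count."""
    problems = {}
    for line in report.splitlines():
        # A stage line ends with two integers: the error and warning counts
        m = re.match(r"(.+?)\s+(\d+)\s+(\d+)$", line.strip())
        if m:
            errors, warnings = int(m.group(2)), int(m.group(3))
            if errors or warnings:
                problems[m.group(1)] = (errors, warnings)
    return problems

# Illustrative report; the nonzero counts are made up for this sketch.
sample = """\
STAGE Errors Warnings
Bad GUIDs/LIDs Check 0 0
Link State Active Check 0 0
Specific Link Width Check 2 0
Specific Link Speed Check 0 1
IPoIB Subnets Check 0 0
"""

problems = stages_with_problems(sample)
for stage, (errors, warnings) in problems.items():
    print(f"{stage}: {errors} errors, {warnings} warnings")
```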
Use the ibcheckerrors command, which uses the topology file to scan the InfiniBand fabric, validate the connectivity as described in the topology file, and report errors as indicated by the port counters.
On the command-line interface (CLI), enter the following command:
# ibcheckerrors
## Summary: 4 nodes checked, 0 bad nodes found
## 34 ports checked, 0 ports have errors beyond threshold
Note:
The actual output for your InfiniBand fabric will differ from that in the example.
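The summary lines printed by ibcheckerrors can be reduced to four numbers for logging or alerting. The following is a minimal Python sketch; the function name `parse_summary` and the dictionary keys are hypothetical, and the sample text is the summary from the example above.

```python
import re

def parse_summary(output):
    """Extract node and port counts from the ibcheckerrors summary lines."""
    nodes = re.search(r"(\d+) nodes checked, (\d+) bad nodes found", output)
    ports = re.search(
        r"(\d+) ports checked, (\d+) ports have errors beyond threshold", output
    )
    if not nodes or not ports:
        raise ValueError("summary lines not found")
    return {
        "nodes": int(nodes.group(1)),
        "bad_nodes": int(nodes.group(2)),
        "ports": int(ports.group(1)),
        "bad_ports": int(ports.group(2)),
    }

sample = """\
## Summary: 4 nodes checked, 0 bad nodes found
## 34 ports checked, 0 ports have errors beyond threshold
"""

summary = parse_summary(sample)
healthy = summary["bad_nodes"] == 0 and summary["bad_ports"] == 0
print(summary, "healthy" if healthy else "needs attention")
```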