Oracle® Exalogic Elastic Cloud Machine Owner's Guide
Release EL X2-2, X3-2, and X4-2

E18478-18

13 Using Sun Datacenter InfiniBand Switch 36 in Multirack Configurations

This chapter describes how to set up and configure the Sun Datacenter InfiniBand Switch 36, which is used as the spine switch only in multirack configurations (an Exalogic machine connected to another Exalogic machine, or an Exalogic machine connected to an Oracle Exadata Database Machine). The spine switch is not included in the Exalogic machine quarter-rack configuration.

By using this spine switch, you can connect multiple Exalogic machines or a combination of Exalogic machines and Oracle Exadata Database Machines together on the same InfiniBand fabric.

13.1 Physical Specifications

Table 13-1 provides the physical specifications of the Sun Datacenter InfiniBand Switch 36.

Table 13-1 Sun Datacenter InfiniBand Switch 36 Specifications

Dimension    Measurements
Width        17.52 in. (445.0 mm)
Depth        24 in. (609.6 mm)
Height       1.75 in. (44.5 mm)
Weight       23.0 lbs (11.4 kg)


13.2 Accessing the CLI of a Sun Datacenter InfiniBand Switch 36

The Sun Datacenter InfiniBand Switch 36 is used in the Exalogic machine in multirack configurations only. After you connect the switch in a multirack configuration and apply power, you can access its command-line interface (CLI).

To access the command-line interface (CLI):

  1. If you are using a network management port, begin network communication with the CLI by using the ssh command and the host name configured with the DHCP server:

    % ssh -l root switch-name
    root@switch-name's password: password
    #
    

    where switch-name is the host name assigned to the Sun Datacenter InfiniBand Switch 36.

    If you do not see this output or prompt, there is a problem with the network communication or cabling of the switch.

  2. If you are using a USB management port, begin serial communication with the command-line interface (CLI) as follows:

    1. Connect a serial terminal, terminal server, or workstation with a TIP connection to the USB-to-serial adapter. Configure the terminal or terminal emulator with these settings:

      115200 baud, 8 bits, No parity, 1 Stop bit, and No handshaking

    2. Press the Return or Enter key on the serial device several times to synchronize the connection. You might see text similar to the following:

      …
      CentOS release 5.2 (Final)
      Kernel 2.6.27.13-nm2 on an i686
      
      switch-name login: root
      Password: password
      #
      

      where switch-name is the host name assigned to the Sun Datacenter InfiniBand Switch 36.

      If you do not see this output or prompt, there is a problem with the serial communication or the cabling of the switch.
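A minimal Python sketch of how a management workstation might assemble the ssh invocation from step 1, for example to wrap it in a script. The function name `build_ssh_command` and the option of appending a command to run remotely are assumptions for illustration; this guide shows only the interactive login, and whether the switch accepts a remote command in this way is not confirmed here.

```python
import shlex


def build_ssh_command(switch_name, remote_command=None, user="root"):
    """Return the argv list for an ssh session to the switch.

    switch_name is the host name assigned to the switch. remote_command is
    optional and hypothetical: the guide only documents the interactive login.
    """
    argv = ["ssh", "-l", user, switch_name]
    if remote_command:
        argv.append(remote_command)
    return argv


def as_shell_line(argv):
    """Render the argv list as a single, safely quoted shell line."""
    return shlex.join(argv)
```

Passing the argv list to `subprocess.run` would open the session; ssh prompts for the password unless key-based authentication is configured.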

13.3 Verifying the Switch Status

From the command-line interface (CLI) of the Sun Datacenter InfiniBand Switch 36, you can check the status of the power supplies, fans, and switch chip, and verify that the voltage and temperature values of the switch are within specification:

# showunhealthy
# env_test

Unfavorable output from these commands indicates a hardware fault with the affected component. A voltage or temperature that deviates by more than 10% from the provided specification indicates a problem with the respective component.

For example, on the CLI of the switch, enter the following command to check its status:

# env_test

This command performs a set of checks and displays the overall status of the switch, as in the following example:

Environment test started:
Starting Voltage test:
Voltage ECB OK
Measured 3.3V Main = 3.28 V
Measured 3.3V Standby = 3.37 V
Measured 12V = 12.06 V
Measured 5V = 5.03 V
Measured VBAT = 3.25 V
Measured 1.0V = 1.01 V
Measured I4 1.2V = 1.22 V
Measured 2.5V = 2.52 V
Measured V1P2 DIG = 1.17 V
Measured V1P2 AND = 1.16 V
Measured 1.2V BridgeX = 1.21 V
Measured 1.8V = 1.80 V
Measured 1.2V Standby = 1.20 V
Voltage test returned OK
Starting PSU test:
PSU 0 present
PSU 1 present
PSU test returned OK
Starting Temperature test:
Back temperature 23.00
Front temperature 32.62
SP temperature 26.12
Switch temperature 45, maxtemperature 45
Bridge-0 temperature 41, maxtemperature 42
Bridge-1 temperature 43, maxtemperature 44
Temperature test returned OK
Starting FAN test:
Fan 0 not present
Fan 1 running at rpm 11212
Fan 2 running at rpm 11313
Fan 3 running at rpm 11521
Fan 4 not present
FAN test returned OK
Starting Connector test:
Connector test returned OK
Starting onboard ibdevice test:
Switch OK
Bridge-0 OK
Bridge-1 OK
All Internal ibdevices OK
Onboard ibdevice test returned OK
Environment test PASSED
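The 10% rule for voltages can be sketched in Python as a check over captured env_test output. This is only an illustration under stated assumptions: the nominal value is taken from the number in each rail label (for example, 3.3 from "3.3V Main"), which may not match the provided specification, and rails whose label carries no such number (VBAT, V1P2 DIG, V1P2 AND) are skipped. The function names are hypothetical.

```python
import re

# "Measured 3.3V Main = 3.28 V" -> label "3.3V Main", value "3.28"
MEASURED = re.compile(
    r"^Measured\s+(?P<label>.+?)\s*=\s*(?P<value>\d+(?:\.\d+)?)\s*V\s*$"
)
# Assumption: a number directly followed by "V" in the label is the nominal.
NOMINAL = re.compile(r"(\d+(?:\.\d+)?)V")


def check_voltages(env_test_output, tolerance=0.10):
    """Return (label, nominal, measured) for rails outside the tolerance."""
    out_of_range = []
    for line in env_test_output.splitlines():
        measured_match = MEASURED.match(line.strip())
        if not measured_match:
            continue
        nominal_match = NOMINAL.search(measured_match.group("label"))
        if not nominal_match:
            continue  # nominal not stated in the label, so it is not judged
        nominal = float(nominal_match.group(1))
        measured = float(measured_match.group("value"))
        if abs(measured - nominal) > tolerance * nominal:
            out_of_range.append((measured_match.group("label"), nominal, measured))
    return out_of_range


def env_test_passed(env_test_output):
    """True if the overall result line reports PASSED."""
    return "Environment test PASSED" in env_test_output
```

For the example output above, every rail with a numeric label is within 10% of its label value, so `check_voltages` returns an empty list.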

When the switch status is operational, you can start the Subnet Manager (SM).

13.4 Starting the Subnet Manager in Multirack Configuration Scenarios

The Sun Datacenter InfiniBand Switch 36, referred to as the spine switch, is not connected by default. Therefore, when you connect this switch in a multirack configuration, you must start the Subnet Manager manually on the switch as follows:

  1. On the CLI of the spine switch, run the following command:

    # enablesm

  2. On the CLI of the spine switch, set the Subnet Manager priority as follows:

    # setsmpriority priority

    Note:

    For information about the switches on which the SM should run in various rack configurations and the SM priorities for the switches, see Section 12.3.2, "Running the Subnet Manager in Different Rack Configurations."

    For example, to set the Subnet Manager on the spine switch to priority 8, run the following command on the CLI of the spine switch:

    # setsmpriority 8

    The following output is displayed:

    -------------------------------------------------
    OpenSM 3.2.6_20090717
      Reading Cached Option File: /etc/opensm/opensm.conf
      Loading Cached Option:routing_engine = ftree
      Loading Cached Option:sminfo_polling_timeout = 1000
      Loading Cached Option:polling_retry_number = 3
    Command Line Arguments:
      Priority = 8
      Creating config file template '/tmp/osm.conf'.
      Log File: /var/log/opensm.log
    -------------------------------------------------
    

    For the changes to take effect, restart the Subnet Manager as follows:

    # disablesm

    # enablesm

  3. After assigning the SM priority on the spine switch, run the following command on the CLI of each gateway switch (the Sun Network QDR InfiniBand Gateway Switches, referred to as leaf switches in this guide) to disable the Subnet Manager on that switch:

    # disablesm
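The order of the steps above can be written down as a small Python sketch that produces a checklist of commands for each switch. It runs nothing on the switches; the function name and the host names in the usage note are hypothetical, and the commands are only those shown in this section.

```python
def subnet_manager_plan(spine_switch, leaf_switches, priority):
    """Return ordered (switch, command) pairs for the procedure above."""
    plan = [
        (spine_switch, "enablesm"),                  # step 1: start the SM
        (spine_switch, f"setsmpriority {priority}"),  # step 2: set the priority
        (spine_switch, "disablesm"),                 # restart so the change
        (spine_switch, "enablesm"),                  # takes effect
    ]
    # Step 3: disable the SM individually on each gateway (leaf) switch.
    plan += [(leaf, "disablesm") for leaf in leaf_switches]
    return plan
```

For example, `subnet_manager_plan("spine1", ["leaf1", "leaf2"], 8)` yields four entries for the spine switch followed by one `disablesm` entry per leaf switch.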

13.5 Checking Link Status

After starting the Subnet Manager, you can verify that the Link LEDs for cabled links are green. If the Link LED is dark, the link is down. If the Link LED flashes, there are symbol errors.

To check the link status of the cables:

# listlinkup

If the link for a connector is reported as not present, the link at either end of the cable is down. If a port is down, use the enableswitchport 0 portnumber command to bring the port up. Alternatively, use the ibdevreset command to reset the switch chip.

For details, see "Enable a Switch Chip Port" and "Reset the Switch Chip" in the Sun Datacenter InfiniBand Switch 36 User's Guide.

After making sure that the link is up, you can verify the InfiniBand fabric.

The following is example output of the listlinkup command:

# listlinkup
Connector  0A Present <-> Switch Port 20 up (Enabled)
Connector  1A Present <-> Switch Port 22 up (Enabled)
Connector  2A Present <-> Switch Port 24 up (Enabled)
.
.
.
Connector 15A Not present
Connector 0A-ETH Present
 Bridge-0-1 Port 0A-ETH-1 up (Enabled)
 Bridge-0-1 Port 0A-ETH-2 up (Enabled)
 Bridge-0-0 Port 0A-ETH-3 up (Enabled)
 Bridge-0-0 Port 0A-ETH-4 up (Enabled)
Connector 1A-ETH Present
 Bridge-1-1 Port 1A-ETH-1 up (Enabled)
 Bridge-1-1 Port 1A-ETH-2 up (Enabled)
 Bridge-1-0 Port 1A-ETH-3 up (Enabled)
 Bridge-1-0 Port 1A-ETH-4 up (Enabled)
Connector 0B Present <-> Switch Port 19 up (Enabled)
Connector 1B Present <-> Switch Port 21 up (Enabled)
.
.
.
Connector 15B Not present
#
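As a rough illustration, captured listlinkup output could be scanned in Python for connectors that are not present and ports that are not up. The example above shows only the "up" state, so this sketch assumes that any other word in that position means the port is not up; the function name and return structure are hypothetical.

```python
import re

# "Connector  0A Present <-> Switch Port 20 up (Enabled)"
# "Connector 15A Not present"
CONNECTOR = re.compile(
    r"^Connector\s+(?P<name>\S+)\s+(?P<presence>Present|Not present)"
    r"(?:\s+<->\s+Switch Port\s+(?P<port>\d+)\s+(?P<state>\S+))?"
)
# " Bridge-0-1 Port 0A-ETH-1 up (Enabled)"
BRIDGE_PORT = re.compile(r"^Bridge-\S+\s+Port\s+(?P<name>\S+)\s+(?P<state>\S+)")


def summarize_links(listlinkup_output):
    """Return connectors reported as not present and ports that are not up."""
    not_present, not_up = [], []
    for raw_line in listlinkup_output.splitlines():
        line = raw_line.strip()
        connector = CONNECTOR.match(line)
        if connector:
            if connector.group("presence") == "Not present":
                not_present.append(connector.group("name"))
            elif connector.group("state") not in (None, "up"):
                not_up.append(f"Switch Port {connector.group('port')}")
            continue
        bridge = BRIDGE_PORT.match(line)
        if bridge and bridge.group("state") != "up":
            not_up.append(bridge.group("name"))
    return {"not_present": not_present, "not_up": not_up}
```

For the example output above, the result lists connectors 15A and 15B as not present and no ports as not up.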

13.6 Verifying the InfiniBand Fabric in a Multirack Configuration

Use the following commands on the command-line interface (CLI) of the spine switch to verify that the InfiniBand fabric in your multirack configuration is operational:

  1. ibnetdiscover

    Discovers and displays the InfiniBand fabric topology and connections. See Discovering the InfiniBand Network Topology in a Multirack Configuration.

  2. ibdiagnet

    Performs diagnostics upon the InfiniBand fabric and reports status. See Performing Diagnostics on the InfiniBand Fabric in a Multirack Configuration.

  3. ibcheckerrors

    Checks the entire InfiniBand fabric for errors. See Checking for Errors in the InfiniBand Fabric in a Multirack Configuration.

13.6.1 Discovering the InfiniBand Network Topology in a Multirack Configuration

To discover the InfiniBand network topology and build a topology file which is used by the OpenSM Subnet Manager, run the following command on the command-line interface (CLI) of the spine switch:

# ibnetdiscover

The topology file is used by InfiniBand commands to scan the InfiniBand fabric and validate the connectivity as described in the topology file, and to report errors as indicated by the port counters.

The output is displayed, as in the following example:

# Topology file: generated on Sat Apr 13 22:28:55 2002
#
# Max of 1 hops discovered
# Initiated from node 0021283a8389a0a0 port 0021283a8389a0a0
vendid=0x2c9
devid=0xbd36
sysimgguid=0x21283a8389a0a3
switchguid=0x21283a8389a0a0(21283a8389a0a0)
Switch   36 "S-0021283a8389a0a0" # "Sun DCS 36 QDR switch localhost" enhanced port 0 lid 15 lmc 0
[23]    "H-0003ba000100e388"[2](3ba000100e38a) # "nsn33-43 HCA-1" lid 14 4xQDR
vendid=0x2c9
devid=0x673c
sysimgguid=0x3ba000100e38b
caguid=0x3ba000100e388
Ca   2 "H-0003ba000100e388" # "nsn33-43 HCA-1"
[2](3ba000100e38a)   "S-0021283a8389a0a0"[23] # lid 14 lmc 0 "Sun DCS 36 QDR switch localhost" lid 15 4xQDR

Note:

The actual output for your InfiniBand fabric will differ from that in the example.
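To compare the discovered topology against what you expect to be cabled, the node lines of the ibnetdiscover output can be extracted with a short script. The following Python sketch assumes the line layout of the example above (a "Switch" or "Ca" keyword, a port count, a quoted node identifier, and a quoted description after "#"); the function name is hypothetical.

```python
import re

# 'Switch   36 "S-0021283a8389a0a0" # "Sun DCS 36 QDR switch localhost" ...'
# 'Ca   2 "H-0003ba000100e388" # "nsn33-43 HCA-1"'
NODE = re.compile(
    r'^(?P<kind>Switch|Ca)\s+(?P<ports>\d+)\s+"(?P<node_id>[^"]+)"'
    r'\s+#\s+"(?P<description>[^"]*)"'
)


def list_nodes(ibnetdiscover_output):
    """Return one dict per switch or channel adapter found in the output."""
    nodes = []
    for line in ibnetdiscover_output.splitlines():
        match = NODE.match(line)
        if match:
            nodes.append(
                {
                    "kind": match.group("kind"),
                    "ports": int(match.group("ports")),
                    "id": match.group("node_id"),
                    "description": match.group("description"),
                }
            )
    return nodes
```

Applied to the example output, this returns two entries: the 36-port switch and the 2-port channel adapter "nsn33-43 HCA-1".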

13.6.2 Performing Diagnostics on the InfiniBand Fabric in a Multirack Configuration

To perform a collection of tests on the InfiniBand fabric and generate several files that contain parameters and aspects of the InfiniBand fabric, run the following command on the command-line interface (CLI) of the spine switch:

# ibdiagnet

In the following example, the ibdiagnet command is restricted to checking link width and link speed, with all other checks skipped, to determine which links are underperforming:

# ibdiagnet -lw 4x -ls 10 -skip all

Loading IBDIAGNET from: /usr/lib/ibdiagnet1.2
-W- Topology file is not specified.
 Reports regarding cluster links will use direct routes.
Loading IBDM from: /usr/lib/ibdm1.2
-I- Using port 0 as the local port.
-I- Discovering ... 2 nodes (1 Switches & 1 CA-s) discovered.
.
.
.
-I- Links With links width != 4x (as set by -lw option)
-I---------------------------------------------------
-I- No unmatched Links (with width != 4x) were found
-I---------------------------------------------------
-I- Links With links speed != 10 (as set by -ls option)
-I---------------------------------------------------
-I- No unmatched Links (with speed != 10) were found
.
.
.
-I- Stages Status Report:
 STAGE                         Errors  Warnings
 Bad GUIDs/LIDs Check          0       0
 Link State Active Check       0       0
 Performance Counters Report   0       0
 Specific Link Width Check     0       0
 Specific Link Speed Check     0       0
 Partitions Check              0       0
 IPoIB Subnets Check           0       0
Please see /tmp/ibdiagnet.log for complete log
----------------------------------------------------------------
-I- Done. Run time was 1 seconds.

Note:

The actual output for your InfiniBand fabric will differ from that in the example.
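The Stages Status Report at the end of the ibdiagnet output lends itself to a scripted check. The following Python sketch assumes the layout of the example above (a stage name followed by an error count and a warning count on each row) and reads rows until the first line after the table that does not fit that pattern; the function names are hypothetical.

```python
import re

# " Link State Active Check               0   0 "
STAGE_ROW = re.compile(
    r"^\s*(?P<stage>\S.*?)\s+(?P<errors>\d+)\s+(?P<warnings>\d+)\s*$"
)


def parse_stage_report(ibdiagnet_output):
    """Return {stage name: (errors, warnings)} from the Stages Status Report."""
    stages = {}
    in_report = False
    for line in ibdiagnet_output.splitlines():
        if "Stages Status Report" in line:
            in_report = True
            continue
        if not in_report:
            continue
        row = STAGE_ROW.match(line)
        if row:
            stages[row.group("stage")] = (
                int(row.group("errors")),
                int(row.group("warnings")),
            )
        elif stages:
            break  # first non-row line after the table ends the report
    return stages


def stages_with_errors(stages):
    """Return the names of stages that reported at least one error."""
    return [name for name, (errors, _warnings) in stages.items() if errors > 0]
```

For the example output, all seven stages parse to (0, 0), so `stages_with_errors` returns an empty list.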

13.6.3 Checking for Errors in the InfiniBand Fabric in a Multirack Configuration

Use the ibcheckerrors command to scan the InfiniBand fabric, validate the connectivity as described in the topology file, and report errors as indicated by the port counters.

On the command-line interface (CLI) of the spine switch, enter the following command:

# ibcheckerrors

## Summary: 4 nodes checked, 0 bad nodes found
##      34 ports checked, 0 ports have errors beyond threshold

Note:

The actual output for your InfiniBand fabric will differ from that in the example.
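If you capture the ibcheckerrors output in a script, the two summary lines can be reduced to four numbers. The following Python sketch assumes the summary wording shown in the example above; the function names are hypothetical.

```python
import re

SUMMARY = re.compile(
    r"##\s+Summary:\s+(?P<nodes>\d+)\s+nodes checked,\s+"
    r"(?P<bad_nodes>\d+)\s+bad nodes found\s*"
    r"##\s+(?P<ports>\d+)\s+ports checked,\s+"
    r"(?P<bad_ports>\d+)\s+ports have errors beyond threshold"
)


def parse_ibcheckerrors_summary(output):
    """Return the summary counts as integers, or None if no summary is found."""
    match = SUMMARY.search(output)
    if not match:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


def fabric_is_clean(summary):
    """True if the summary reports no bad nodes and no ports beyond threshold."""
    return summary["bad_nodes"] == 0 and summary["bad_ports"] == 0
```

For the example output, the parsed summary is 4 nodes and 34 ports checked with no bad nodes or ports, so `fabric_is_clean` returns True.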

13.7 Monitoring the Spine Switch Using the Web Interface

  1. Open a web browser and go to the following URL:

    http://switch-IP

    where switch-IP is the IP address of the spine switch.

  2. Log in to the interface as the root user.

  3. Click the Switch/Fabric Monitoring Tools tab.

  4. Click Launch Sun DCS GW Monitor.

    The Fabric Monitor is displayed.

13.8 What Next?

After setting up the Sun Datacenter InfiniBand Switch 36 in a multirack configuration scenario, you can proceed to monitor and control the InfiniBand fabric.