CHAPTER 3

InfiniBand Software on the Solaris Operating System and Linux

InfiniBand is a network architecture for the large-scale interconnection of computing and I/O nodes through a high-speed switched fabric. To operate InfiniBand on a Sun server, you need an InfiniBand host channel adapter (HCA) and an InfiniBand software stack.

This chapter provides an overview of installing and using the InfiniBand software stack on the Solaris OS and on Linux.

Consult the product notes for your server for recent information about supported operating systems, firmware and software updates, and other issues not covered in the main product documentation.

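This chapter includes:

InfiniBand Software on the Solaris Operating System
InfiniBand Support Software for Linux
Internet Protocol Over InfiniBand on Linux
Boot Over InfiniBand on Linux
Verifying the Installation With Linux
Additional InfiniBand Software for Linux
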
InfiniBand Software on the Solaris Operating System

InfiniBand software is bundled with the Solaris 10 OS. The package containing the device driver for the Sun Dual Port 4x QDR IB Host Channel Adapter is SUNWhermon. The driver name is hermon.

InfiniBand Software for Solaris 10

For details about InfiniBand software supported in Solaris 10 Update releases, refer to the documents in the Solaris 10 Release and Installation Collection available at http://docs.sun.com.



Note - The SUNWhermon package that is available in the Solaris 10 10/09 OS and subsequent Solaris Update releases must be used with this IB-HCA PCIe.


The InfiniBand software stack, consisting of the upper layer protocols and transport framework, is included in all of the Solaris software groups described in the Solaris Installation Guide. The SUNWhermon package is included in the Entire+OEM, Entire, and Developer software groups. If you are not using any of these groups, you must explicitly add the SUNWhermon package during initial installation. If the Solaris OS is already installed, use the pkgadd(1M) utility to add the package before using the Sun Dual Port 4x QDR IB Host Channel Adapter PCIe.

Sun Firmware Flash Update Tool for IB-HCAs

The Sun Firmware Flash Update tool in the Solaris 10 OS does not support the Sun Dual Port 4x QDR IB Host Channel Adapter PCIe. You must download a separate package containing that tool from the Oracle Download Webpage at: http://www.sun.com/download/index.jsp.

Go to the Download A-Z tab and search for the "Sun Firmware Flash Utility." Refer to the installation instructions in the package README file.

To check that the correct version is installed, type:


# firmwareflash -v
firmwareflash: version v1.9



Note - This command must display version number 1.9 or higher.
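The version check above can also be scripted. The following is a minimal sketch, not part of the Sun tool. It assumes a POSIX-compatible shell (such as ksh or bash) and the one-line output format shown above. The function name flash_tool_version_ok is hypothetical.

```shell
# Sketch only: succeed (status 0) when the version line reports 1.9 or later.
# Assumed input format, taken from the example above:
#   firmwareflash: version v<major>.<minor>
# Returns 1 when the version is too old and 2 when the line is not recognized.
flash_tool_version_ok() {
    ver=${1##*version v}                  # text after "version v"
    case $ver in
        *.*) ;;                           # need at least <major>.<minor>
        *) return 2 ;;
    esac
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%[!0-9]*}                # keep the leading digits only
    case $major in ''|*[!0-9]*) return 2 ;; esac
    case $minor in '') return 2 ;; esac
    # Compare numerically, so that 1.10 is treated as later than 1.9.
    [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 9 ]; }
}

# Possible use on the server:
#   flash_tool_version_ok "$(firmwareflash -v)" || echo "Update the flash tool"
```

The function takes the version line as an argument rather than running the tool itself, so the comparison can be checked on any system.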



Verify the Installation With the Solaris 10 OS

Before you can verify the installation, you must install the adapter in the chassis, power on the server, and cable the server to an operational InfiniBand switch. Afterward, perform the following steps:

1. Ensure that the cables are connected to the adapter and switches.

2. Verify that the IB Subnet Manager is running on the IB switch or on a host within the subnet.

Refer to the manual for the IB Subnet Manager for more information.

3. Check that the green LED is illuminated for each port that is connected to the switch.

If the green LED is not on, check the cable connections at the adapter and at the switch.

4. Check that the amber LED is illuminated for each port that is connected to the switch.

5. Verify that the IB-HCA ports are up and the driver is attached.

a. Use the cfgadm(1M) command to obtain the state of the installed device.


# cfgadm -als "cols=ap_id:condition" hca
Ap_Id                          Condition
hca:2C90109763F70              ok

If more than one IB-HCA device is installed in the server, a row is displayed for each. Look for the row displaying hca:GUID where GUID is the 64-bit number from the physical label on the IB-HCA card. See Node GUID.

The Condition column must display ok to indicate that the driver has discovered the hardware and bound to it. Refer to the cfgadm_ib(1M) man page for details about the IB-specific extensions.

b. Use the cfgadm(1M) command to obtain port GUIDs for each port on the IB-HCA card.


# cfgadm -als "cols=ap_id:info" hca
Ap_Id                          Information
hca:2C90109763F70              VID: 0x15b3, PID: 0x5a44, #ports: 0x2, port1 GUID: 0x2C90109763F71, port2 GUID: 0x2C90109763F72

If more than one IB-HCA device is installed in the server, a row is displayed for each. Look for the row displaying hca:GUID where GUID is the 64-bit number from the physical label on the IB-HCA card. See Node GUID.

Use the port number and GUID displayed by this command for your IB-HCA device in the following step.

c. Use the cfgadm(1M) command to verify that the IB ports and partitions are configured by the Subnet Manager.


# cfgadm -als "select=type(IB-VPPA),cols=ap_id"
Ap_Id
ib::2C90109763F71,ffff,ipib
ib::2C90109763F72,ffff,ipib

The command displays the Ap_Id column, where each row has the format ib::Port GUID,P_Key,ipib. Match the port GUIDs from the previous command with the port GUIDs in these rows. There must be one row for each port and P_Key set up by the Subnet Manager. If an entry is missing, check the Subnet Manager configuration.
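The checks in Step 5 can be sketched as small shell functions that read saved cfgadm output. This is an illustration under stated assumptions, not a supported tool. It assumes a POSIX-compatible shell with POSIX awk, sed, tr, and grep (on the Solaris 10 OS these may be the /usr/xpg4/bin versions), and it assumes the output layouts shown in the examples above. All function names are hypothetical.

```shell
# Sketch only. Input formats are assumed to match the cfgadm examples above.

# Succeed when the row for hca:<GUID> in "cols=ap_id:condition" output
# (read from stdin) shows the condition "ok". $1 is the node GUID.
hca_condition_ok() {
    awk -v id="hca:$1" '$1 == id && $2 == "ok" { found = 1 }
                        END { exit !found }'
}

# Print the port GUIDs, without the 0x prefix, found in
# "cols=ap_id:info" output read from stdin, one GUID per line.
hca_port_guids() {
    tr ',' '\n' |
    sed -n 's/.*port[0-9][0-9]* GUID: 0x\([0-9A-Fa-f][0-9A-Fa-f]*\).*/\1/p'
}

# Read port GUIDs from stdin and print each one that has no
# "ib::<GUID>,<P_Key>,ipib" row in the file named by $1.
missing_ipoib_ports() {
    while read -r guid; do
        [ -n "$guid" ] || continue
        grep -i "^ib::$guid,[0-9A-Fa-f]*,ipib" "$1" >/dev/null || echo "$guid"
    done
}

# Possible use on the server (file names are placeholders):
#   cfgadm -als "cols=ap_id:info" hca | hca_port_guids > /tmp/guids
#   cfgadm -als "select=type(IB-VPPA),cols=ap_id" > /tmp/vppa
#   missing_ipoib_ports /tmp/vppa < /tmp/guids
```

Any GUID printed by missing_ipoib_ports corresponds to a port with no matching row, which is the case where the Subnet Manager configuration should be checked.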

Sun Firmware Version for IB-HCAs on the Solaris OS

To use this adapter with the Solaris OS, the firmware version must be 2.7.000 or higher. Use the firmwareflash command to display the revision level of your IB-HCA card.


# firmwareflash -l -c IB

Look for the revision number that appears after the Firmware revision string. If more than one HCA device is displayed, look for the Node Image GUID that matches the GUID displayed on the physical GUID label of the IB-HCA card being installed. See Node GUID.
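Comparing the displayed revision with the 2.7.000 minimum can be sketched as a shell function. Because the full layout of the firmwareflash listing is not shown here, the function takes the revision number as an argument instead of parsing the listing. It assumes a POSIX-compatible shell and POSIX awk, and it assumes that revisions are dot-separated numbers. The function name version_at_least is hypothetical.

```shell
# Sketch only: succeed when dotted version $1 is greater than or
# equal to dotted version $2. Fields are compared numerically from
# left to right, and a missing field counts as 0.
version_at_least() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        n = split(have, h, ".")
        m = split(want, w, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            a = h[i] + 0
            b = w[i] + 0
            if (a > b) exit 0
            if (a < b) exit 1
        }
        exit 0      # all fields equal
    }'
}

# Possible use, with the revision copied from the command output:
#   version_at_least 2.7.000 2.7.000 && echo "Firmware is recent enough"
```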

If the firmware version is lower than 2.7.000, update the firmware. Update the firmware on your IB-HCA card only with files specifically approved for the Oracle product. Select and download approved firmware files from:

http://www.mellanox.com/support/firmware_table_Sun.php

After obtaining a firmware image, use the firmwareflash command to install the firmware. Once installed, reboot the system to enable the new firmware.

Using InfiniBand Devices on the Solaris 10 OS

For details about InfiniBand software stack configurations in a Solaris 10 Update release, refer to the System Administration Guide: Devices and File Systems document in the Solaris 10 System Administrator Collection available at http://docs.sun.com.

Section 9 of that guide, titled Using InfiniBand Devices (Overview/Tasks), describes how to set up upper layer protocols such as IPoIB and uDAPL.

Troubleshooting

Check the Sun Dual Port 4x QDR IB Host Channel Adapter PCIe Product Notes (820-6537) for information about known issues discovered using the Solaris 10 OS with your IB-HCA card.

When using IPoIB, verify that the broadcast group is configured by the Subnet Manager in the partition where the IPoIB link will be used.

 


InfiniBand Support Software for Linux

With most supported Linux releases, you must also install the MLNX_OFED software stack. This software is the Mellanox OpenFabrics Enterprise Distribution (OFED) for Linux. Refer to your Linux vendor for software installation recommendations and support.


Download the MLNX_OFED Software and Documentation

1. Go to the Mellanox Technologies web site:

http://www.mellanox.com

2. Select the Products tab.

3. Select InfiniBand SW/Drivers from the menu for Products.

4. Select Linux SW/Drivers from the menu for InfiniBand SW/Drivers.

5. Select the download that corresponds to your operating system.

Follow the instructions provided on the web page to complete the download.

6. Select and download the related documentation for MLNX_OFED.

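At minimum, download copies of the manuals that this chapter refers to, which are offered on the Linux SW/Drivers page:

Mellanox OFED for Linux Installation Guide
Mellanox OFED Stack for Linux User’s Manual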


Install the MLNX_OFED Software on a Sun Server

1. Refer to the Mellanox OFED for Linux Installation Guide you downloaded from the Mellanox web site.

The instructions in that document are correct for installation on a Sun server except for a difference explained in the next step.

2. Whenever a procedure in the Mellanox guide calls for running the mlnxofedinstall script, always include the --without-fw-update option.

This option prevents the MLNX_OFED installation process from automatically updating the firmware on your Sun HCA. Update the firmware on that device only with files specifically approved for the Sun product. You can select and download approved firmware files from:

http://www.mellanox.com/support/firmware_table_Sun.php
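One way to avoid forgetting the option is to run the installer through a small wrapper. The following is a minimal sketch that assumes a POSIX-compatible shell. The function name run_ofed_install is hypothetical, and the path to the mlnxofedinstall script depends on where you unpacked the MLNX_OFED download.

```shell
# Sketch only: run the installer named by $1 with --without-fw-update
# placed ahead of any other arguments that were supplied.
run_ofed_install() {
    installer=$1
    shift
    "$installer" --without-fw-update "$@"
}

# Possible use from the directory that contains the script:
#   run_ofed_install ./mlnxofedinstall
```

The wrapper only adds the option. Any other options described in the Mellanox guide are passed through unchanged.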


Internet Protocol Over InfiniBand on Linux

Support for Internet Protocol Over InfiniBand (IPoIB) is included in the MLNX_OFED software distribution. Details on using IPoIB are included in the Mellanox OFED Stack for Linux User’s Manual.


Boot Over InfiniBand on Linux

Software to enable Boot Over InfiniBand (BoIB) on Linux is available from the Mellanox Technologies web site.


Download the Boot Over IB Software and Documentation

1. Go to the Mellanox Technologies web site:

http://www.mellanox.com

2. Select the Products tab.

3. Select Boot Over IB from the menu for Products.

4. Select Linux SW/Drivers from the menu for InfiniBand SW/Drivers.

5. Select Download.

Follow the instructions provided on the web page to complete the download.

6. Select and download the related documentation for BoIB.

At minimum, download copies of the manuals that are offered on the Boot Over IB page.


Verifying the Installation With Linux

Before you can verify the installation, you must install the adapter in the chassis, power on the server, and cable the server to an operational InfiniBand switch. The InfiniBand switch should automatically recognize InfiniBand servers when they are connected to the fabric.


Verify the Installation With Linux

The InfiniBand switch should automatically recognize the IB-HCA card when it is connected to the fabric if the IB Subnet Manager is running on the switch, or on a host within the subnet.

1. Ensure that the cables are connected to the adapter and switches.

2. Verify that the IB Subnet Manager is running on the IB switch or on a host within the subnet.

Refer to the manual for the IB Subnet Manager for more information.

3. Check that the green LED is illuminated for each port that is connected to the switch.

If the green LED is not on, check the cable connections at the adapter and at the switch.

4. Check that the amber LED is illuminated for each port that is connected to the switch.

5. Verify that the IB-HCA ports are up and the driver is attached:


# ibstat

The output lists each IB-HCA under a device name that contains the string mlx4 (the name of the Linux driver). For each port, the output reports the port state, which indicates whether the port is up or down.
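Reading the port state from the ibstat output can be sketched with a short awk filter. This is an illustration only. It assumes the layout that ibstat commonly prints: a line beginning with CA and the quoted device name, followed by indented Port n:, State:, and Physical state: lines. Check the layout against the output on your system. The function name ib_port_states is hypothetical.

```shell
# Sketch only: summarize ibstat output read from stdin as one line per port:
#   <device> port <n>: <State> / <Physical state>
# Assumed layout: "CA '<device>'", then indented "Port <n>:", "State: ...",
# and "Physical state: ..." lines for each port.
ib_port_states() {
    awk '
        /^CA /                   { ca = $2; gsub("\047", "", ca) }  # drop quotes
        /^[ \t]*Port [0-9]+:/    { port = $2; sub(/:/, "", port) }
        /^[ \t]*State:/          { state = $2 }
        /^[ \t]*Physical state:/ { print ca " port " port ": " state " / " $3 }
    '
}

# Possible use on the server:
#   ibstat | ib_port_states
```

A port that is cabled to the switch and configured by the Subnet Manager is normally expected to report a State of Active.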


Additional InfiniBand Software for Linux

As the popularity of InfiniBand technology increases, the number of Linux distributions and open source organizations producing drivers and tools will increase. For up-to-date information, check with open source organizations (such as http://OpenFabrics.org) and your current vendors.

The OpenFabrics organization develops open source software for InfiniBand. The OpenFabrics Enterprise Distribution (OFED) is the InfiniBand software suite produced by this organization. Various vendors contribute their drivers (and other software components) to OFED.