CHAPTER 3

InfiniBand Software on Linux

InfiniBand is a network architecture that is designed for the large-scale interconnection of computing and I/O nodes through a high-speed switched fabric. To operate InfiniBand on a Sun Blade 8000 Series Modular System, you need an InfiniBand HCA (the ExpressModule) and an InfiniBand software stack.

This chapter provides an overview and installation instructions for the InfiniBand software stack for the Linux operating system.

Consult the Sun Blade 8000 Series Product Notes for the most recent information about supported operating systems, firmware and software updates, and other issues not covered in the main product documentation.


InfiniBand Software for Linux

If you have installed current releases of Red Hat Enterprise Linux Advanced Server (RHEL AS 4-U3 or later) or SUSE Linux Enterprise Server (SLES9 SP3 or later, SLES10) on a Sun Blade Server Module and you have installed the bundled drivers and OFED Release 1.2.5 or later, you do not need to install or configure additional drivers to support the IB ExpressModule (IB EM).

Specifically, RHEL AS 4-U4 contains support in the kernel for IB-HCA hardware produced by Mellanox (mthca driver). The kernel also includes core InfiniBand modules, which provide the interface between the lower-level hardware driver and the upper-layer InfiniBand protocol drivers. The InfiniBand modules provide user space access to InfiniBand hardware.

The kernel also includes the Sockets Direct Protocol (SDP) driver, the IP over InfiniBand (IPoIB) driver, and the SCSI RDMA Protocol (SRP) driver.
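One way to confirm that these kernel drivers are present is to look for their modules in the lsmod listing. The following is a minimal sketch and is not part of the Sun or OFED tooling: missing_modules is a hypothetical helper, and the module names in the usage comment (ib_mthca, ib_ipoib, ib_srp) are assumptions that can differ between kernels and OFED releases.

```shell
#!/bin/sh
# Print each module named on the command line that is absent from
# lsmod-style text read on standard input.
# The function body runs in a subshell so its variables stay local.
missing_modules() (
    # Column 1 of lsmod output is the module name; line 1 is the header.
    loaded=$(awk 'NR > 1 { print $1 }')
    for mod in "$@"; do
        # -x matches whole lines only, -F treats the name as a fixed string.
        printf '%s\n' "$loaded" | grep -qxF -- "$mod" || printf '%s\n' "$mod"
    done
)

# Usage on a live system (not run here; module names are assumptions):
#   lsmod | missing_modules ib_mthca ib_ipoib ib_srp
```

An empty result means every named module is loaded. Any names printed are the modules to investigate.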

RHEL AS 4-U4 includes the user space packages described in TABLE 3-1.

 


TABLE 3-1 RHEL AS 4-U4 Packages
kernel-ib

Base package that is required to support all other packages. Includes the files necessary to configure the kernel portion of the openib stack, create the proper udev rules, add the init script that allows the kernel modules to be selectively loaded at boot, and so on.

dapl

RDMA API that supports the DAT 1.2 specification.

libibcm

InfiniBand Connection Management API.

libibcommon

Common utility functions for the IB diagnostic and management tools.

libibmad

Low-layer IB functions for use by the IB diagnostic and management programs, including MAD, SA, SMP, and other basic IB functions.

libibumad

User MAD library functions that sit on top of the user MAD modules in the kernel. Used by the IB diagnostic and management tools, including OpenSM.

libibverbs

Library that allows user space processes to use InfiniBand "verbs" as described in the InfiniBand Architecture Specification.

libibverbs-utils

Useful subnet and device diagnostic utilities.

libmthca

Device-specific user space driver for Mellanox HCAs (MT23108 InfiniHost and MT25208 InfiniHost III Ex) for use with the libibverbs library.

libipathverbs

Device-specific driver for Pathscale HCAs for use with libibverbs (only available on x86_64 and ia64 systems).

librdmacm

RDMA Connection Management (cm) library.

libsdp

Driver that enables a sockets application to use InfiniBand Sockets Direct Protocol (SDP) instead of TCP transparently and without recompiling the application.

openib-diags

Diagnostic programs and scripts that diagnose the IB subnet.

opensm

Subnet manager software for InfiniBand networks.

opensm-libs

Shared libraries for InfiniBand user space access.

perftest

InfiniBand performance tests.

srptools

In conjunction with the kernel ib_srp driver, allows discovery and use of SCSI class devices via the SCSI RDMA Protocol over InfiniBand.

mstflint

Tool to query and update firmware flash memory attached to Mellanox InfiniBand HCAs.




Note - These package names can change, depending upon the Linux OS.


The packages selected to support any given configuration will vary. TABLE 3-2 lists the packages considered the absolute minimum needed to support the environment described in this guide.


TABLE 3-2 Required Packages for InfiniBand Support

Package                      Command Enabled   Description

kernel-ib                    openibd           IB master control script
openib-diags                 ibstat            IB utility to display HCAs
openib-diags                 ibnetdiscover     IB utility to probe and show the fabric
mstflint                     mstflint          Mellanox utility to update HCA FLASHRAM
libibcommon                  NA                IB support package
libibmad                     NA                IB support package
libibumad                    NA                IB support package
OFED Release 1.2.5 or later  NA                IB support package


If you elected not to install these packages when installing the Linux OS or if you want to upgrade your drivers, you can install these packages at any time from the OS distribution source. You can also download the required files from OpenFabrics.org. For information on both of these procedures, see Installing the InfiniBand Drivers on Linux.
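To find out which of these packages are already installed, you can query the RPM database with rpm -q. The following sketch assumes the package names in TABLE 3-2, which can differ by distribution. missing_packages and the PKG_QUERY override are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Print every package from the argument list that the query command
# reports as not installed. PKG_QUERY defaults to rpm; it can be pointed
# at another command so the function can be tried without an RPM database.
missing_packages() (
    for pkg in "$@"; do
        # rpm -q exits with a nonzero status when the package is not installed.
        "${PKG_QUERY:-rpm}" -q "$pkg" >/dev/null 2>&1 || printf '%s\n' "$pkg"
    done
)

# Usage on a live system (not run here), with the names from TABLE 3-2:
#   missing_packages kernel-ib openib-diags mstflint \
#       libibcommon libibmad libibumad
```

Any package name that the function prints still needs to be installed by one of the procedures in this chapter.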

OpenFabrics Enterprise Distribution for Linux

As the popularity of InfiniBand technology increases, the number of Linux distributions and open source organizations producing drivers and tools will increase. For up-to-date information, check with open source organizations and your current vendors.

The OpenFabrics organization develops open source software for InfiniBand, and the OpenFabrics Enterprise Distribution (OFED) is the InfiniBand software suite that it produces. Various vendors contribute their drivers (and other software components) to OFED.

TABLE 3-3 lists the tested Linux platforms and the corresponding OFED release.


TABLE 3-3 Linux Platforms and OFED Release

Linux Platform

OFED Release

RHEL AS 4-U3 or later

For RHEL AS 4-U3, Sun has tested OFED Release 1.2.5 of the OpenFabrics stack.

Note: RHEL AS 4-U4 includes an older version of OFED, so you must install OFED Release 1.2.5 or a later version.

SLES9 SP3 or later, SLES10

Sun has tested OFED Release 1.2.5 for the SLES10 platform. Note: You must have OFED Release 1.2.5 or a later version.
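Both platforms require OFED Release 1.2.5 or later. The following sketch shows one way a script could compare a release number against that minimum. version_at_least is a hypothetical helper. How you obtain the installed release string is left open here, and the sketch assumes purely numeric, dot-separated fields.

```shell
#!/bin/sh
# Succeed (exit status 0) when dotted version $1 is at least version $2.
# The body runs in a subshell, so "exit" sets the function's return status.
version_at_least() (
    have=$1
    need=$2
    while [ -n "$have" ] || [ -n "$need" ]; do
        # Take the leading field of each version; a missing field counts
        # as 0, so 1.3 is compared as 1.3.0.
        h=${have%%.*}
        n=${need%%.*}
        h=${h:-0}
        n=${n:-0}
        [ "$h" -gt "$n" ] && exit 0
        [ "$h" -lt "$n" ] && exit 1
        # Drop the field just compared; clear the string after the last one.
        case $have in *.*) have=${have#*.} ;; *) have= ;; esac
        case $need in *.*) need=${need#*.} ;; *) need= ;; esac
    done
    exit 0
)

# Usage (the value of "installed" is a placeholder you must supply):
#   installed=1.2.5
#   version_at_least "$installed" 1.2.5 && echo "OFED release is new enough"
```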


OFED contains the following components:

Installing the InfiniBand Drivers on Linux

If you did not install the InfiniBand drivers when installing the Linux OS, you can install them at any time from the OS distribution source or by downloading the necessary files from OpenFabrics.org.

To do so, choose one of the following procedures: To Install IB Drivers From the Linux Distribution Source, or To Install the OFED Package.

If you need to determine whether or not the drivers are already installed, see To Verify Driver Installation on Linux.


To Install IB Drivers From the Linux Distribution Source

1. Obtain the Red Hat Package Manager (RPM) files containing the InfiniBand drivers.

Access to these files depends on your installation configuration (net boot, CD/DVD boot, .iso files, and so on). After you decide on the access method and package selection, you can add the packages to the KickStart configuration file (on RHEL) for automatic inclusion in future installations.



Note - On a 32-bit RHEL4 system, all packages have a .i386.rpm extension (as shown in the following procedure). On a 64-bit RHEL4 system, all packages have a .x86_64.rpm extension instead.


2. Enter the rpm -ivh command for each InfiniBand package that you need to install.

Packages must be installed in the following order:

The following example shows the installation of one package (libibcommon) and the resulting dialog on an RHEL AS 4-U4 32-bit system:


> rpm -ivh libibcommon-1.0-1.i386.rpm
warning: libibcommon-1.0-1.i386.rpm: V3 DSA signature: NOKEY, key ID db42a60e
Preparing...     ##################################### [100%]
1:libibcommon  ########################################### [100%]
> rpm -ivh libibumad-1.0-1.i386.rpm
.
.
.

3. If you are running the CSH or TCSH shell, enter the rehash command to rebuild the shell’s view of available executables.

4. Enter the ibstat command to verify that the OS sees the IB EM.


> ibstat
CA 'mthca0'
     CA type: MT25204
     Number of ports: 1
     Firmware version: 1.1.0
     Hardware version: a0
     Node GUID: 0x001b00000ca72640 
     System image GUID: 0x001b00000ca72643
     Port 1
         State: Active
         Physical state: LinkUp
         Rate: 20
         Base lid: 71
         LMC: 0 
         SM lid: 2
         Capability mask: 0x02510a68
         Port GUID: 0x001b00000ca72641

5. (Optional) You can enter the ibnetdiscover command to verify the presence of an operational IB fabric.

For an example of the output of this command, see To Verify Driver Installation on Linux.

6. (Optional) You can check the status of the ib0 network interface to determine whether the ib_ipoib driver is installed.

For details on this step, see To Install IPoIB Driver.
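The package file names in Step 2 differ only in their suffix between 32-bit and 64-bit RHEL4 systems. The following sketch shows how a script might pick the suffix from the machine name and list the rpm -ivh commands before running them. rpm_suffix and print_install_commands are hypothetical helpers, the package stems in the usage comment come from the example in Step 2, and you must supply the full package list in the required order yourself.

```shell
#!/bin/sh
# Map a machine name (as printed by "uname -m") to the RPM file suffix.
# Only the 32-bit and 64-bit x86 cases from the note in Step 1 are handled.
rpm_suffix() {
    case $1 in
        i?86)   printf '%s\n' '.i386.rpm' ;;
        x86_64) printf '%s\n' '.x86_64.rpm' ;;
        *)      return 1 ;;
    esac
}

# Print (rather than run) one "rpm -ivh" command per package stem, in the
# order given. Arguments: machine name, then stems such as libibcommon-1.0-1.
print_install_commands() (
    suffix=$(rpm_suffix "$1") || exit 1
    shift
    for stem in "$@"; do
        printf 'rpm -ivh %s%s\n' "$stem" "$suffix"
    done
)

# Usage on a live system (not run here):
#   print_install_commands "$(uname -m)" libibcommon-1.0-1 libibumad-1.0-1
```

Printing the commands first lets you review the file names and the order before installing anything.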


To Install the OFED Package



Note - The Dual Port 4x DDR IB PCIe Host Channel Adapter requires OFED Release 1.2.5 or later.


1. On the Sun Blade Server Module, log in as root and copy the required files from the following location:

http://www.openfabrics.org/downloads.htm

The following steps use OFED-1.2.5.tar as an example file name.



Note - You need write access to the files to execute the install script.


2. As root, extract the files by typing:


> tar -zxvf OFED-1.2.5.tar

3. From the OFED-1.2.5 directory, initiate the installation process by typing:


> ./install.sh

4. When the InfiniBand OFED Distribution Software Installation Menu appears, select option 2 (Install OFED Software).

5. When the Select OFED Software menu appears, select option 3 (All packages).

6. When you are asked if you wish to create/install an MPI RPM with gcc, enter n.


The following compiler(s) on your system can be used to build/install MPI:  gcc 
Do you wish to create/install an MPI RPM with gcc? [Y/n]: n

7. Next, you are asked if you wish to create/install an openmpi RPM with gcc. Again, type n.


The following compiler(s) on your system can be used to build/install openmpi:  gcc 
Do you wish to create/install an openmpi RPM with gcc? [Y/n]: n

The installation script then lists the OFED packages that it will build. See the following sample output.


Following is the list of OFED packages that you have chosen (some may have been added by the installation program due to package dependencies):
ib_ipath
ib_ipoib
...
mpitests
ibutils
 
WARNING: This installation program will remove any previously installed IB packages on your machine.
 
Do you want to continue? [Y/n]: Y

8. Type Y to continue, as shown above.

Next, you are prompted to configure InfiniBand IP support.

9. Type Y when asked if you want to include IPoIB configuration files.


Do you want to include IPoIB configuration files (ifcfg-ib*)? [Y/n]: Y

10. Press Enter to accept the default when prompted to enter a temporary directory for OFED.


RPM build process requires a temporary directory.
Please enter the temporary directory [/var/tmp/OFED]: 

11. Press Enter to accept the default when prompted for the OFED installation directory.


Please enter the OFED installation directory [/usr/local/ofed]:

At this point, the installer begins compiling InfiniBand packages. The process of building packages takes approximately 15-20 minutes.

The system displays output like the following:


The MPI_COMPILER_openmpi variable is not defined. Trying the default compiler: gcc 
 
The following compiler(s) will be used to build the openmpi RPM(s): gcc
 
Checking dependencies. Please wait ...
 
Building InfiniBand Software RPMs. Please wait...
 
Building openib RPMs. Please wait... 
.
.
.
33 packages were built
 
Build process finished ...

Installation then begins. See the following message.


Removing previous InfiniBand Software installation
Running /bin/rpm -e libibverbs libibverbs-devel libibverbs-utils...

The actual installation takes about one minute.

Assuming the IB EM hardware is installed (and, therefore, an InfiniBand HCA is present), you are prompted to configure InfiniBand IP support.

12. Enter Y in response to the following prompt:


Do you want to configure IPoIB interfaces [Y/n]? Y

The default IPoIB interface configuration is based on DHCP. A special patch for DHCP is required for supporting IPoIB. The patch is available under:

OFED-1.0/docs/dhcp

If you do not have DHCP, you must change this configuration in the following steps.

The system next displays the current configuration.

13. When asked if you want to change the configuration as displayed, type y.


The current IPOIB configuration for ib0 is:
DEVICE=ib0
BOOTPROTO=dhcp
ONBOOT=yes
Do you want to change this configuration? [y/N]: Y

The configuration script guides you through the changes one at a time. See the following as an example.


Enter an IP Address:10.0.0.52
Enter the Netmask: 255.255.255.0
Enter the Network:10.0.0.0
Enter the Broadcast Address:10.0.0.255
Start Device On Boot? [Y/n]:Y
 
Selected configuration:
 
IPADDR=10.0.0.52
NETMASK=255.255.255.0
NETWORK=10.0.0.0
BROADCAST=10.0.0.255
ONBOOT=yes
 
Do you want to save the selected configuration? [Y/n]: Y

14. Type Y to save the configuration.

If you have entered a valid IP configuration for ib0, you are now properly configured for IPoIB operations.

15. Iterate the InfiniBand configuration over all InfiniBand interfaces.

Enter a valid IP configuration for each network interface.

Once all IPoIB interfaces have been configured, you are prompted as follows to configure OpenSM for the blade.

16. Enter n to complete this part of the installation.


Do you want to configure OpenSM [Y/n]? n

You should see a message like the following.


Installation finished successfully...
Press Enter to continue...

17. Press Enter.

The InfiniBand OFED Distribution Software Installation Menu is displayed.

18. Type Q to exit.

The Sun Blade Server Module is configured now to start up the InfiniBand software on reboot (ONBOOT=yes).

If this is not the desired behavior, edit the /etc/infiniband/openib.conf file and set ONBOOT to no. You can also manually control basic InfiniBand behavior by entering the following command:


/etc/init.d/openibd option 

where option can be start, stop, or status.

19. After a successful installation, reboot the Server Module.

After the reboot, the Server Module should come up as a functional member of the InfiniBand fabric.
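The network and broadcast addresses requested in Step 13 follow from the IP address and netmask, so they can be derived rather than typed by hand. The following sketch reproduces the values from the example in Step 13. The function names are hypothetical, the output only mirrors the "Selected configuration" display and is not written to any file, and the arithmetic assumes a shell with integers wider than 32 bits.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to an integer.
# (Octets are assumed to be plain decimal without leading zeros.)
ip_to_int() (
    IFS=.
    set -- $1          # split the address into its four octets
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Convert an integer back to dotted-quad form.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Print the settings implied by an IP address ($1) and netmask ($2).
ipoib_net_settings() (
    ip=$(ip_to_int "$1")
    mask=$(ip_to_int "$2")
    net=$(( ip & mask ))                        # host bits cleared
    bcast=$(( net | (~mask & 0xFFFFFFFF) ))     # host bits all set
    printf 'IPADDR=%s\nNETMASK=%s\nNETWORK=%s\nBROADCAST=%s\n' \
        "$1" "$2" "$(int_to_ip "$net")" "$(int_to_ip "$bcast")"
)

# Usage with the values from Step 13:
#   ipoib_net_settings 10.0.0.52 255.255.255.0
```

For 10.0.0.52 with netmask 255.255.255.0, the sketch prints NETWORK=10.0.0.0 and BROADCAST=10.0.0.255, which matches the selected configuration shown in Step 13.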


To Verify Driver Installation on Linux

1. Verify that the Linux software driver is installed and attached to the IB EM by typing the openibd status command.

When using the openibd command, type the entire path as shown in this example.


> /etc/init.d/openibd status
     HCA driver loaded
Configured devices:
ib0
Currently active devices:
ib0
     The following modules are also loaded: 
ib_cm
ib_ipoib
.
.
.

This example shows the IB driver installed, running, and presenting one IB-HCA channel or network device (ibn) to the OS. In the example, the Linux network device appears as ib0.

To view details of operational status, type the ibstat command.

The following example shows one operational IB channel into the IB fabric (or network). The LinkUp state indicates active participation in an IB fabric. The port is present as LID 69 and is managed by the subnet manager at LID 2.


> ibstat
CA 'mthca0'
     CA type: MT25204
     Number of ports: 1
     Firmware version: 1.1.0
     Hardware version: a0
     Node GUID: 0x001b00000ca72620 
     System image GUID: 0x001b00000ca72623
     Port 1
         State: Active
         Physical state: LinkUp
         Rate: 20
         Base lid: 69
         LMC: 0 
         SM lid: 2
         Capability mask: 0x02510a68
         Port GUID: 0x001b00000ca72621
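When several Server Modules must be checked, the State and Physical state fields can be extracted from the ibstat output by a script. The following is a minimal sketch that relies only on the field labels visible in the sample output above. ibstat_port_summary and ib_ports_ok are hypothetical helpers, and the layout of ibstat output can differ between releases.

```shell
#!/bin/sh
# Read ibstat-style text on standard input and print one line per port:
#   <CA name> port <n>: <State> / <Physical state>
ibstat_port_summary() {
    awk -v q="'" '
        /^CA /               { ca = $2; gsub(q, "", ca) }   # strip the quotes
        /^[ \t]*Port [0-9]+/ { port = $2; sub(/:$/, "", port) }
        /^[ \t]*State:/      { state = $2 }
        /^[ \t]*Physical state:/ {
            printf "%s port %s: %s / %s\n", ca, port, state, $3
        }
    '
}

# Succeed only if at least one port is listed and every listed port
# is in the Active state with a LinkUp physical state.
ib_ports_ok() {
    ibstat_port_summary | awk '
        { n++ }
        !/: Active \/ LinkUp$/ { bad++ }
        END { exit !(n > 0 && bad == 0) }
    '
}

# Usage on a live system (not run here):
#   ibstat | ibstat_port_summary
#   ibstat | ib_ports_ok && echo "all IB ports are up"
```

For the sample output above, the summary is the single line "mthca0 port 1: Active / LinkUp".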

You can also verify that the InfiniBand fabric is operational by entering the ibnetdiscover command. The output from this command will list all the nodes, as shown in the following sample output.


> ibnetdiscover
#
# Topology file: generated on Thu Jan 11 15:19:59 2007
#
# Max of 4 hops discovered
# Initiated from node 001b00000ca72620 port 001b00000ca72621 
 
vendid=0x8f1
devid=0x5a31
sysimgguid=0x8f10400411ef9
switchguid=0x8f10400411ef8
 
Switch  24 "S-0008f10400411ef8"    # Switch port 0 lid 9
[21]       "H-0002c90109761ea0"[2]
[12]       "S-0005ad00000161ba"[5]
[7]        "H-001b00000ca72630"[1]
[6]        "H-001b00000ca72620"[1]
vendid=0x5ad
devid=0xa87c
sysimgguid=0x5ad01010161b6
switchguid=0x5ad00000161ba 
Switch  8 "S-0005ad00000161ba"    # Switch - U3 port 0 lid 3
[4]       "H-0005ad0000011310"[1]
[3]       "S-0005ad00000161b6"[1]
[2]       "S-0005ad00000161b6"[2] 
[1]       "S-0005ad00000161b8"[3]
[5]       "S-0008f10400411ef8"[12]
.
.
.
vendid=0x2c9
devid=0x6274
sysimgguid=0x1b00000ca72633
caguid=0x1b00000ca72630 
Ca  1 "H-001b00000ca72630"  # 4x DDR IB 10-Port PCIe Network Express Module
[1]     "S-0008f10400411ef8"[7]     # lid 68 lmc 0
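On a large fabric the ibnetdiscover listing is long, so a quick count of the node records can serve as a first sanity check. The following sketch counts the records that begin with Switch or Ca, as they appear in the sample output above. ibnet_node_counts is a hypothetical helper, and the record format is assumed to match that sample.

```shell
#!/bin/sh
# Count the Switch and Ca node records in ibnetdiscover-style text
# read on standard input. Comment lines (starting with #), attribute
# lines (vendid=...), and port lines ([n] ...) are ignored.
ibnet_node_counts() {
    awk '
        $1 == "Switch" { switches++ }
        $1 == "Ca"     { cas++ }
        END { printf "switches=%d cas=%d\n", switches, cas }
    '
}

# Usage on a live system (not run here):
#   ibnetdiscover | ibnet_node_counts
```

A count of zero switches or zero channel adapters suggests that the fabric was not discovered and that the cabling and subnet manager are worth checking.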