APPENDIX B

Diagnostic Software and Troubleshooting

This appendix provides an overview of the SunVTS™ diagnostic application, tips for checking the adapter, and a set of common troubleshooting tasks. It contains the following sections: SunVTS Diagnostic Testing, Using the SunVTS ibhcatest, Troubleshooting Tasks, and Other Useful Utilities.


SunVTS Diagnostic Testing

The SunVTS software runs multiple hardware diagnostic tests from a single user interface and verifies the configuration and functionality of most hardware controllers and devices. It operates primarily from a graphical user interface, so you can set test parameters quickly while a diagnostic test is running.



Note - SunVTS diagnostic software is not currently available for Solaris x86 Operating Systems.


Refer to the SunVTS documents (listed in TABLE B-1) for instructions on how to run and monitor the ibhcatest diagnostic. These SunVTS documents are available online at the following URL:

http://docs.sun.com/app/docs/prod/test.validate/sunvts/index.html

Select the document for the Solaris release on your system.


TABLE B-1 SunVTS Documentation

Title                                                Description
SunVTS 6.0 PS1 Documentation Supplement (819-1804)   Describes the new SunVTS features and tests, including the ibhcatest.
SunVTS 6.0 User’s Guide (817-7664)                   Describes the SunVTS diagnostic environment.
SunVTS 6.0 Test Reference Manual (817-7665)          Describes each SunVTS test, its test options, and its command-line arguments.
SunVTS 6.0 Quick Reference Card (817-7686)           Provides an overview of the user interface.



Using the SunVTS ibhcatest

The ibhcatest diagnostic test checks the functionality of the Sun Dual Port 4x DDR IB Host Channel Adapter PCIe ExpressModule card. You can run this test from the SunVTS user interface or from the command line. See the SunVTS 6.0 Test Reference Manual (817-7665) for more information about the ibhcatest test.

The ibhcatest diagnostic test is included in the SunVTS 6.0 Patch Set 1 and subsequent SunVTS software releases. SunVTS 6.0 Patch Set 1 is available for downloading from the SunSolve℠ web site http://sunsolve.sun.com using the following patch numbers:

The adapter and Tavor device driver must be installed, and the IB port interface must be configured offline for the ibhcatest to run. A loopback cable is not needed because ibhcatest includes an internal loopback test. Use the following procedure when running the ibhcatest command.


To Use the ibhcatest Command

1. Ensure that the SunVTS software and the Tavor driver are installed on your system by typing:


# pkginfo SUNWvts SUNWvtsx SUNWtavor

If a SunVTS software package is not installed, refer to the SunVTS User’s Guide for installation instructions. If the SUNWtavor package is not installed, check your Solaris Operating System documentation for software package information.

2. Unplumb the interface from the system, using the ifconfig command:


# ifconfig ibdn down unplumb

where n is the instance number of the interface.

3. Refer to SunVTS 6.0 PS1 Documentation Supplement (819-1804) for instructions on how to run the ibhcatest command.
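
Steps 1 and 2 of this procedure can be wrapped in a small script. The following is a minimal sketch, not part of the SunVTS software: the function names check_pkgs and unplumb_ibd are hypothetical, and the sketch assumes that pkginfo -q returns a nonzero exit status when a package is not installed.

```shell
# Hypothetical helper: report which of the named packages are missing.
# Assumes "pkginfo -q PKG" prints nothing and exits nonzero when PKG is absent.
check_pkgs() {
    missing=0
    for pkg in "$@"; do
        pkginfo -q "$pkg" || { echo "missing: $pkg"; missing=1; }
    done
    return $missing
}

# Hypothetical helper: take an ibd interface offline (step 2).
# $1 is the instance number n of the interface.
unplumb_ibd() {
    ifconfig "ibd$1" down unplumb
}

# Example usage (as root on the Solaris host, instance 0 assumed):
#   check_pkgs SUNWvts SUNWvtsx SUNWtavor && unplumb_ibd 0
```

Chaining the two helpers with && means the interface is unplumbed only when every required package is present.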


Troubleshooting Tasks

The following tasks can be useful when troubleshooting the IB-HCA and the link.

Check that the following packages are installed:

If an InfiniBand software package is not installed, check your Solaris Operating System documentation for software package information.

See tavor(7D) for error messages and descriptions. When the driver is attached to a port on the adapter, the following message is sent.


tavorn: port m up (link width 4x).

In the message, n is the instance number of the Tavor device and m is the port number on the adapter.

One way to check Tavor messages is by typing the following command:


# dmesg | grep tavor
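
To pick out only the port-up messages, the log output can be filtered further. The following sketch assumes the message format shown above; the function name tavor_ports_up is hypothetical.

```shell
# Hypothetical helper: read log lines on standard input and print
# "instance port width" for each "tavorN: port M up (link width Wx)." line.
# Lines that do not match the assumed format are ignored.
tavor_ports_up() {
    sed -n 's/.*tavor\([0-9][0-9]*\): port \([0-9][0-9]*\) up (link width \([0-9][0-9]*x\)).*/\1 \2 \3/p'
}

# Example usage:
#   dmesg | tavor_ports_up
```

A port that is missing from the output, or that reports a width other than 4x, is a reasonable place to start checking cables and switch ports.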


Other Useful Utilities

These utilities can display status and other information about InfiniBand devices:

cfgadm

The cfgadm utility displays status and other information about the IB-HCA and IB fabric. See cfgadm_ib(1M) for details. For example:


# cfgadm -al
  Ap_Id                     Type       Receptacle   Occupant      Condition      
  hca:21346543210a987       IB-HCA     connected    configured    ok
  ib                        IB-FABRIC  connected    configured    ok
  ib::80020123456789a       IB-IOC     connected    configured    ok
  ib::802abc9876543         IB-IOC     connected    unconfigured  unknown
  ib::80245678,ffff,ipib    IB-VPPA    connected    configured    ok
  ib::12245678,0,nfs        IB-PORT    connected    configured    ok
  ib::21346543,0,hnfs       IB-HCA_SVC connected    configured    ok
  ib::sdp,0                 IB-PSEUDO  connected    configured    ok
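
When the listing is long, it can help to show only the attachment points that need attention. The following sketch assumes the five-column layout shown above; the function name cfgadm_not_ok is hypothetical.

```shell
# Hypothetical helper: read "cfgadm -al" output on standard input and print
# the Ap_Id and Condition of every entry whose Condition is not "ok".
# Assumes the five-column layout: Ap_Id Type Receptacle Occupant Condition.
cfgadm_not_ok() {
    awk '$1 != "Ap_Id" && NF >= 5 && $NF != "ok" { print $1, $NF }'
}

# Example usage:
#   cfgadm -al | cfgadm_not_ok
```

Applied to the sample listing above, the filter would report only the unconfigured IB-IOC entry, whose condition is unknown.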

snoop

The snoop program captures and inspects network packets. See the snoop(1M) man page for details. For example:


# snoop -d ibd1
Using device /dev/ibd1 (promiscuous mode)
    ib-1-167 -> *            ARP C Who is 199.1.1.168, ib-1-168 ?
    ib-1-168 -> ib-1-167     ARP R 199.1.1.168, ib-1-168 is 0:2:4:7:0:0:0:0:a:4:7c:4f:0:2:c9:2:0:0:55:91
    ib-1-167 -> ib-1-168     ICMP Echo request (ID: 35608 Sequence number: 0)
    ib-1-168 -> ib-1-167     ICMP Echo reply (ID: 35608 Sequence number: 0)

netstat

netstat shows network status. See the netstat(1M) man page for details. For example:


# netstat -I ibd1 4
    input   ibd1      output       input  (Total)    output
packets errs  packets errs  colls  packets errs  packets errs  colls
2458394 0     2458268 0     0      2467288 0     2465951 0     0
92233   0     92237   0     0      92247   0     92238   0     0
92703   0     92702   0     0      92709   0     92704   0     0
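
The interval output can be watched for errors automatically. The following sketch assumes the column layout shown above, where the second and fourth fields are the input and output error counts for the selected interface; the function name netstat_errs is hypothetical.

```shell
# Hypothetical helper: read "netstat -I <interface> <interval>" output on
# standard input and print a line for each sample in which the interface
# reports input or output errors. Header lines are skipped because their
# first field is not a number.
netstat_errs() {
    awk '$1 ~ /^[0-9]+$/ && ($2 > 0 || $4 > 0) { print "in_errs=" $2, "out_errs=" $4 }'
}

# Example usage:
#   netstat -I ibd1 4 | netstat_errs
```

Because the first data row typically holds totals accumulated since boot rather than a per-interval sample, a nonzero count on that row alone may reflect older errors.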

kstat

kstat displays kernel statistics. See the kstat(1M) man page for details. For example:


# kstat ibd:1
module: ibd                             instance: 1
name:   ibd1                            class:    net
 
                                                   0
        opackets                        27381595
        opackets64                      27381595
        promisc                         off
        xmt_badinterp                   0
        xmtretry                        4