Sun Datacenter InfiniBand Switch 648 Topic Set

InfiniBand Fabric Problems

The following table lists situations that might occur with the InfiniBand fabric and the corrective steps to resolve them.

Situation
Corrective Steps
Performance of the InfiniBand fabric seems diminished.
  1. Determine if there are errors or problems with the InfiniBand fabric.

    See:

  2. Locate the affected nodes by the GUID provided in the output of the ibdiagnet command.

    See Locate a Switch Chip or Connector From the GUID.

  3. If the problem is at a cable connection, swap the suspect cable with a known good cable or reconnect the cable to a known good remote port and repeat Step 1.

    See Switch Service, servicing an InfiniBand cable.

  4. If the problem still remains at the cable connection, disable and re-enable that port on the line card and repeat Step 1.

    See Disable a Port and Enable a Port.

  5. If the problem is within a line card or fabric card, disable and re-enable the respective port.

    See Disable a Port and Enable a Port.

  6. If the problem still remains within a line card or fabric card, reduce the local deflection of the midplane.

    Loosen the retainer bolts of the affected fabric cards, line cards, or both by 3/4 turn. Alternate between the retainer bolts, turning each one 1/4 turn counterclockwise at a time, and then reseat the fabric card or line card.

    See Switch Service, servicing a fabric card, servicing a line card.

Temporary solution:

  • If the problem still remains, disable the affected port.

    See Disable a Port.

Permanent solution:

  • If the problem still remains, replace the affected component.

    See Switch Service, replacing a fabric card, replacing a line card, replacing an InfiniBand cable.

    See the remote port’s documentation for replacement procedures.

An InfiniBand Link LED is blinking.
  1. Disconnect and properly reconnect both ends of the respective InfiniBand cable.

    See Switch Service, servicing an InfiniBand cable.

  2. If the LED is still blinking, use the ibdiagnet command to determine the significance of the errors.

    See Determine Which Links Are Experiencing Significant Errors.

  3. Determine which connectors map to the affected link.

    See Locate a Switch Chip or Connector From the GUID.

  4. If some of the links are running at 1x or SDR, follow the corrective steps for that situation elsewhere in this table.

  5. Disable and re-enable the respective ports.

    See Disable a Port and Enable a Port.

  6. If the errors are still significant, swap the cable with a known good one or reconnect the cable to a known good remote port, and repeat from Step 2.

  7. Replace the component that the previous steps show to be at fault.

    See Switch Service, replacing an InfiniBand cable, replacing a line card.

    See the remote port’s documentation for replacement procedures.

There are errors on some InfiniBand links.
  1. Clear the error counters.

    See Clear Error Counters.

  2. Start a fabric stress test.

  3. Identify the suspect links using the ibdiagnet command.

    See Determine Which Links Are Experiencing Significant Errors. Look for text like the following:

    -W- lid=0x0006 guid=0x0021283a8816c0a0 dev=48438 Port=34

    Performance Monitor counter : Value

    link_recovery_error_counter : 0x1

    symbol_error_counter : 0x25 (Increase by 3 during ibdiagnet)

  4. For links that are experiencing recovery errors or substantial symbol errors, refer to other parts of this table to help identify the cause and rectify the problem.
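
The check in Steps 3 and 4 can be sketched as a small parser. This is a minimal illustration, not part of the switch software: it assumes the warning text has the same layout as the sample in Step 3, and the function names and the `symbol_threshold` value are hypothetical, because this table does not define how many symbol errors count as substantial.

```python
import re

# Sample text copied from the ibdiagnet output shown in Step 3.
SAMPLE = """\
-W- lid=0x0006 guid=0x0021283a8816c0a0 dev=48438 Port=34
Performance Monitor counter : Value
link_recovery_error_counter : 0x1
symbol_error_counter : 0x25 (Increase by 3 during ibdiagnet)
"""

# Patterns are derived only from the sample above (an assumption).
HEADER = re.compile(
    r"-W- lid=(0x[0-9a-fA-F]+) guid=(0x[0-9a-fA-F]+) dev=(\d+) Port=(\d+)"
)
COUNTER = re.compile(r"(\w+_counter)\s*:\s*(0x[0-9a-fA-F]+)")


def parse_port_warnings(text):
    """Return one dict per '-W- lid=...' warning block found in text."""
    ports = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        if header:
            current = {
                "lid": header.group(1),
                "guid": header.group(2),
                "port": int(header.group(4)),
                "counters": {},
            }
            ports.append(current)
            continue
        counter = COUNTER.match(line)
        if counter and current is not None:
            # Counter values are printed in hexadecimal.
            current["counters"][counter.group(1)] = int(counter.group(2), 16)
    return ports


def suspect_ports(ports, symbol_threshold=16):
    """List (guid, port) pairs with any recovery error or many symbol errors.

    symbol_threshold is a hypothetical cutoff chosen for illustration.
    """
    suspects = []
    for entry in ports:
        counters = entry["counters"]
        recovery = counters.get("link_recovery_error_counter", 0)
        symbols = counters.get("symbol_error_counter", 0)
        if recovery > 0 or symbols >= symbol_threshold:
            suspects.append((entry["guid"], entry["port"]))
    return suspects


if __name__ == "__main__":
    for guid, port in suspect_ports(parse_port_warnings(SAMPLE)):
        print(f"suspect link: guid={guid} port={port}")
```

Each GUID and port pair reported this way still has to be mapped to a physical location, as described in Locate a Switch Chip or Connector From the GUID.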

Some InfiniBand links are running at 1x or SDR.
For a temporary solution:
  1. Identify the suspect links using the ibdiagnet command.

    See Find 1x or SDR or DDR Links in the Fabric. Look for text like the following:

    -W- link with SPD=2.5 found at direct path "1,19"

    From: a Switch PortGUID=0x00066a00d80001dd Port=19

    To: a Switch PortGUID=0x00066a00d80001dd Port=24

  2. Determine which connectors map to the affected link.

    See Locate a Switch Chip or Connector From the GUID.

  3. Verify the cable connection at both ends.

    See Switch Service, servicing an InfiniBand cable.

  4. Disable and re-enable the respective ports.

    See Disable a Port and Enable a Port.

  5. If the previous steps do not rectify the problem, disable the port.

    See Disable a Port.

For a permanent solution:

  1. Perform Step 1 through Step 5 of the temporary solution.

  2. Swap the cable with a known good one or reconnect the cable to a known good remote port, and repeat from Step 1.

  3. Replace the component that the previous steps show to be at fault.

    See Switch Service, replacing an InfiniBand cable, replacing a fabric card, replacing a line card.

    See the remote port’s documentation for replacement procedures.
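
Identifying slow links from saved ibdiagnet output, as in Step 1 of the temporary solution, might be scripted as follows. This is a minimal sketch that assumes the warning text matches the sample shown in that step. It handles only the `SPD=` form of the warning, and the function name is hypothetical.

```python
import re

# Sample text copied from the ibdiagnet output shown in Step 1
# of the temporary solution.
SAMPLE = """\
-W- link with SPD=2.5 found at direct path "1,19"
From: a Switch PortGUID=0x00066a00d80001dd Port=19
To: a Switch PortGUID=0x00066a00d80001dd Port=24
"""

# Patterns are derived only from the sample above (an assumption).
LINK = re.compile(r'-W- link with SPD=([\d.]+) found at direct path "([^"]*)"')
END = re.compile(r"(From|To): .*PortGUID=(0x[0-9a-fA-F]+) Port=(\d+)")


def parse_slow_links(text):
    """Return one dict per reduced-speed link warning found in text."""
    links = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        link = LINK.match(line)
        if link:
            current = {"speed": link.group(1), "path": link.group(2)}
            links.append(current)
            continue
        end = END.match(line)
        if end and current is not None:
            # Store each end of the link under the key "from" or "to".
            current[end.group(1).lower()] = (end.group(2), int(end.group(3)))
    return links


if __name__ == "__main__":
    for entry in parse_slow_links(SAMPLE):
        print(entry["speed"], entry["path"], entry.get("from"), entry.get("to"))
```

The GUID and port pairs at each end of a reported link are the values to look up when determining which connectors map to the affected link.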

Output of InfiniBand software commands provides only GUID and port, not switch chip numbers or CXP connectors.
  1. Use the switch-specific findport command to translate GUID and port combinations to locations in the switch.

    See Locate a Switch Chip or Connector From the GUID.

  2. If the port immediately links to a CXP connector, the findport command identifies that connector.

    See Switch Reference, findport command.
