Sun Network QDR InfiniBand Gateway Switch HTML Document Collection for Firmware Version 2.1
InfiniBand Fabric Problems

This table lists situations that might occur with the InfiniBand fabric and the corrective steps you can take to resolve each one.

Situation
Corrective Steps
After installation, no links are operational.
  1. Verify that there is at least one Subnet Manager active on the InfiniBand fabric.

    See Display Subnet Manager Priority, Controlled Handover State, Prefix, Management Key, and Routing Algorithm.

  2. If no Subnet Manager is active, start the Subnet Manager within the gateway.

    Refer to Gateway Installation, starting the Subnet Manager.

  3. If the previous steps do not rectify the situation, restart the Subnet Manager.

    See Disable the Subnet Manager and Enable the Subnet Manager.

After installation, not all links are operational.
  1. Determine which links are nonoperational.

    See Display Link Status.

  2. For links that are “Down”, disable and re-enable the respective ports.

    See Disable a Switch Chip Port and Enable a Switch Chip Port.

  3. If the previous steps do not rectify the situation, disable the respective port.

    See Disable a Switch Chip Port.

There was a power outage during a firmware update.
  1. If you are able to access the management controller, restart the management controller.

    See Restart the Management Controller.

  2. If you are unable to access the management controller, power cycle the gateway.

    Refer to Gateway Service, removing the gateway from the rack.

    Refer to Gateway Installation, installing the gateway into the rack.

  3. Perform the firmware upgrade again.

    Refer to Gateway Remote Management, upgrading the gateway firmware.

Performance of the InfiniBand fabric seems diminished.
  1. Determine if there are errors or problems with the InfiniBand fabric.

    See:

  2. Locate the affected nodes by the GUID provided in the output of the ibdiagnet command.

    See Locate a Switch Chip or Connector From the GUID and Port.

  3. If the problem is at a cable connection, swap the suspect cable with a known good cable or reconnect the cable to a known good remote port and repeat Step 1.

    Refer to Gateway Service, servicing data cables.

  4. If the problem still remains at the cable connection, disable and re-enable the respective port and repeat Step 1.

    See Disable or Enable an External Port.

Temporary solution:

Permanent solution:

  • If the problem still remains, replace the affected component or the gateway.

    Refer to Gateway Service, servicing data cables.

    Refer to the remote port's documentation for replacement procedures.

    Refer to Gateway Service, removing the gateway from the rack.

    Refer to Gateway Installation, installing the gateway into the rack.

An InfiniBand Link LED is blinking.
  1. Disconnect and properly reconnect both ends of the respective InfiniBand cable.

    Refer to Gateway Service, servicing the data cables.

  2. If the LED is still blinking, determine the significance of the errors through use of the ibdiagnet command.

    See Determine Which Links Are Experiencing Significant Errors.

  3. Determine which connectors map to the affected link by deconstructing the node's GUID and port.

    See Locate a Switch Chip or Connector From the GUID and Port.

  4. If some of the links are running at 1x or SDR, follow the corrective steps for the situation "Some InfiniBand links are running at 1x or SDR" in this table.

  5. Disable and re-enable the respective ports.

    See Disable or Enable an External Port.

  6. If the errors are still significant, swap the cable with a known good one or reconnect the cable to a known good remote port, and repeat from Step 2.

  7. Replace whichever component the previous steps show to be at fault (the cable or the remote port).

    Refer to Gateway Service, servicing the data cables.

    Refer to the remote port's documentation for replacement procedures.

Some InfiniBand links are running at 1x or SDR.
For a temporary solution:
  1. Identify the suspect links using the ibdiagnet command.

    See Find 1x, SDR, or DDR Links in the Fabric. Look for text like this:

    -W- link with SPD=2.5 found at direct path "1,19"

    From: a Switch PortGUID=0x00066a00d80001dd Port=19

    To: a Switch PortGUID=0x00066a00d80001dd Port=24

  2. Determine which connectors map to the affected link by deconstructing the node's GUID and port.

    See Locate a Switch Chip or Connector From the GUID and Port.

  3. Verify the cable connection at both ends.

    Refer to Gateway Service, servicing the data cables.

  4. Disable and re-enable the respective ports.

    See Disable or Enable an External Port.

  5. If the previous steps do not rectify the problem, disable the port.

    See Disable or Enable an External Port.

For a permanent solution:

  1. Perform Step 1 through Step 4 of the temporary solution.

  2. Swap the cable with a known good cable or reconnect the cable to a known good remote port, and repeat from Step 1.

  3. Replace whichever component the previous steps show to be at fault (the cable, the remote port, or the gateway).

    Refer to Gateway Service, servicing the data cables.

    Refer to the remote port's documentation for replacement procedures.

    Refer to Gateway Service, removing the gateway from the rack.

    Refer to Gateway Installation, installing the gateway into the rack.
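
When many links are reported, it can help to extract the GUID and port pairs from the ibdiagnet warnings before mapping them to connectors. The following Python sketch parses only the warning format shown in Step 1 of the temporary solution. The function name find_slow_links and the dictionary layout are hypothetical, and the sketch assumes that every warning follows the three-line pattern in that sample. Other ibdiagnet message formats are not handled.

```python
import re

# Sample text copied from the ibdiagnet warning shown in this table.
SAMPLE = '''-W- link with SPD=2.5 found at direct path "1,19"
From: a Switch PortGUID=0x00066a00d80001dd Port=19
To: a Switch PortGUID=0x00066a00d80001dd Port=24
'''

# Patterns derived only from the sample above (an assumption, not a full grammar).
WARN_RE = re.compile(r'-W- link with SPD=(\S+) found at direct path "([^"]*)"')
END_RE = re.compile(r'(From|To): an? (\S+) PortGUID=(0x[0-9a-fA-F]+) Port=(\d+)')


def find_slow_links(text):
    """Return one dict per reduced-speed link warning found in text."""
    links = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        warn = WARN_RE.match(line)
        if warn:
            # Start a new record for each warning line.
            current = {"speed": warn.group(1), "path": warn.group(2)}
            links.append(current)
            continue
        end = END_RE.match(line)
        if end and current is not None:
            # Attach the "From" or "To" endpoint to the current warning.
            current[end.group(1).lower()] = {
                "node_type": end.group(2),
                "guid": end.group(3),
                "port": int(end.group(4)),
            }
    return links


if __name__ == "__main__":
    for link in find_slow_links(SAMPLE):
        print(link["speed"], link["from"]["guid"], link["from"]["port"],
              "->", link["to"]["guid"], link["to"]["port"])
```

Each GUID and port pair in the result can then be looked up as described in Locate a Switch Chip or Connector From the GUID and Port.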

There are errors on some InfiniBand links.
  1. Clear the error counters.

    See Clear Data and Error Counters.

  2. Start a fabric stress test.

  3. Identify the suspect links using the ibdiagnet command.

    See Determine Which Links Are Experiencing Significant Errors. Look for text like this:

    -W- lid=0x0006 guid=0x0021283a8816c0a0 dev=48438 Port=34

    Performance Monitor counter : Value

    link_recovery_error_counter : 0x1

    symbol_error_counter : 0x25 (Increase by 3 during ibdiagnet)

  4. For links that are experiencing recovery errors or substantial symbol errors, refer to other parts of this table to help identify the cause and rectify the problem.
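
The counter report in Step 3 can also be screened with a short script. The following Python sketch parses the sample format shown above and flags ports that have any link recovery errors or a symbol error count at or above a threshold. The names parse_port_errors and needs_attention are hypothetical, and the default threshold of 10 is an illustrative assumption. This document does not define what counts as "substantial", so choose a value that suits your fabric.

```python
import re

# Sample text copied from the ibdiagnet counter report shown in this table.
SAMPLE = '''-W- lid=0x0006 guid=0x0021283a8816c0a0 dev=48438 Port=34
Performance Monitor counter : Value
link_recovery_error_counter : 0x1
symbol_error_counter : 0x25 (Increase by 3 during ibdiagnet)
'''

# Patterns derived only from the sample above (an assumption, not a full grammar).
HEADER_RE = re.compile(
    r'-W- lid=(0x[0-9a-fA-F]+) guid=(0x[0-9a-fA-F]+) dev=(\d+) Port=(\d+)')
COUNTER_RE = re.compile(
    r'(\w+_counter)\s*:\s*(0x[0-9a-fA-F]+)'
    r'(?: \(Increase by (\d+) during ibdiagnet\))?')


def parse_port_errors(text):
    """Return one dict per reported port, with its counters as integers."""
    ports = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            current = {"lid": header.group(1), "guid": header.group(2),
                       "port": int(header.group(4)), "counters": {}}
            ports.append(current)
            continue
        counter = COUNTER_RE.match(line)
        if counter and current is not None:
            current["counters"][counter.group(1)] = {
                "value": int(counter.group(2), 16),     # counters are hexadecimal
                "increase": int(counter.group(3) or 0),  # 0 when no increase is reported
            }
    return ports


def needs_attention(port, symbol_threshold=10):
    """Flag recovery errors or many symbol errors (threshold is an assumption)."""
    counters = port["counters"]
    recovery = counters.get("link_recovery_error_counter", {"value": 0})["value"]
    symbols = counters.get("symbol_error_counter", {"value": 0})["value"]
    return recovery > 0 or symbols >= symbol_threshold


if __name__ == "__main__":
    for port in parse_port_errors(SAMPLE):
        print(port["guid"], port["port"], needs_attention(port))
```

The "increase" field records how much a counter grew while ibdiagnet was running, which can help distinguish an active problem from errors left over from before the counters were cleared.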

Output of InfiniBand commands provides only GUID and port, not switch chip or QSFP connectors.
  1. Find the location of a node in the gateway by deconstructing the node's GUID and port.

    See Locate a Switch Chip or Connector From the GUID and Port.

  2. Use the dcsport command to provide port-to-connector and connector-to-port mapping.

    See Display the Switch Chip Port to QSFP Connector Mapping.
