CHAPTER 4

Troubleshooting the System

This chapter provides instructions for troubleshooting the Netra CT server. You can troubleshoot the system in several ways, as described in the sections that follow.

In addition, Appendix C lists the error messages that might appear when you are operating or servicing a Netra CT server.


4.1 Troubleshooting the System Using the System Status Panel

You can use the system status panel to troubleshoot the Netra CT server.

4.1.1 Locating and Understanding the System Status Panel

The system status panel on the Netra CT server gives the majority of the troubleshooting information that you need for your server. FIGURE 4-1 shows the locations of the system status panels on the Netra CT servers. FIGURE 4-2 shows the system status panel for the Netra CT 810 server, and FIGURE 4-3 shows the system status panel for the Netra CT 410 server.


FIGURE 4-1 System Status Panel Locations



FIGURE 4-2 System Status Panel (Netra CT 810 Server)



FIGURE 4-3 System Status Panel (Netra CT 410 Server)


4.1.2 Using the System Status Panel LEDs to Troubleshoot the System

When you first power on the Netra CT server, some or all of the green Power LEDs on the system status panel flash on and off for several seconds. Do not attempt to troubleshoot the system until after the LEDs have gone through their initial power-on testing.

Each major component in the Netra CT 810 server and Netra CT 410 server has a set of LEDs on the system status panel that gives the status of that component. Each component has either green Power and amber Okay to Remove LEDs (FIGURE 4-4) or green Power and amber Fault LEDs (FIGURE 4-5).


FIGURE 4-4 Power and Okay to Remove LEDs



FIGURE 4-5 Power and Fault LEDs


TABLE 4-1 lists the LEDs for each component in the Netra CT 810 server, and TABLE 4-2 lists the LEDs for each component in the Netra CT 410 server. Note that the boards in the Netra CT servers all have the green Power LED, and they have either the amber Okay to Remove LED or the amber Fault LED, not both.


TABLE 4-1 System Status Panel LEDs for the Netra CT 810 Server

LED              LEDs Available            Component
---------------  ------------------------  ---------------------------------------------------------
HDD 0            Power and Okay to Remove  Upper hard drive
HDD 1            Power and Okay to Remove  Lower hard drive
Slot 1           Power and Okay to Remove  Host CPU board installed in slot 1
Slots 2 - 7      Power and Okay to Remove  I/O board or satellite CPU board installed in slots 2 - 7
Slot 8           Power and Okay to Remove  Alarm card installed in slot 8
SCB              Power and Fault           System controller board (behind the system status panel)
FAN 1            Power and Fault           Upper fan tray (behind the system status panel)
FAN 2            Power and Fault           Lower fan tray (behind the system status panel)
RMM              Power and Okay to Remove  Removable media module
PDU 1 (DC only)  Power and Fault           Leftmost power distribution unit (behind the server)
PDU 2 (DC only)  Power and Fault           Rightmost power distribution unit (behind the server)
PSU 1            Power and Okay to Remove  Leftmost power supply unit
PSU 2            Power and Okay to Remove  Rightmost power supply unit


 


TABLE 4-2 System Status Panel LEDs for the Netra CT 410 Server

LED              LEDs Available            Component
---------------  ------------------------  -------------------------------------------------------------
Slot 1           Power and Okay to Remove  Alarm card installed in slot 1
Slot 2           Power and Okay to Remove  I/O board or satellite CPU board installed in slot 2
Slot 3           Power and Okay to Remove  Host CPU board installed in slot 3
Slots 4 and 5    Power and Okay to Remove  I/O boards or satellite CPU boards installed in slots 4 and 5
HDD 0            Power and Okay to Remove  Hard drive
SCB              Power and Fault           System controller board (behind the system status panel)
FAN 1            Power and Fault           Upper fan tray (behind the system status panel)
FAN 2            Power and Fault           Lower fan tray (behind the system status panel)
PDU 1 (DC only)  Power and Fault           Power distribution unit (behind the server)
PSU 1            Power and Okay to Remove  Power supply




Note - Do not use the information in TABLE 4-4 to troubleshoot a power supply unit in a server that has only one power supply unit (a Netra CT 410 server or a Netra CT 810 server with only one power supply). To troubleshoot the power supply in a single power supply system, use the LEDs on the power supply itself. See Section 4.6, Troubleshooting a Power Supply Using the Power Supply Unit LEDs for more information. The information given in TABLE 4-4 applies to all other components in the Netra CT 810 server or Netra CT 410 server, including the power supplies in a two-power supply Netra CT 810 server.



 


TABLE 4-3 CompactPCI Board LED States and Meanings

Green Power LED state

Amber Okay to Remove LED state

Meaning

Action

Off

Off

The slot is empty or the system thinks that the slot is empty because the system didn't detect the component when it was inserted.

If there is a component installed in this slot, then one of the following boards is faulty:

  • the board installed in the slot
  • the alarm card
  • the system controller board

Remove and replace the failed board to clear this state.

Blinking

Off

The component is coming up or going down.

Do not remove the component in this state.

On

Off

The component is up and running.

Do not remove the component in this state.

Off

On

The component is powered off.

You can remove the component in this state.

Blinking or Off

On

The component is powered on, but it is offline for some reason (for example, a fault was detected on the component).

Wait until the green Power LED stops blinking. If it does not stop blinking after several seconds, enter cfgadm and verify that the component is in the unconfigured state, then perform the necessary action, depending on the component:

  • Alarm card--When the green Power LED is OFF and the amber Okay to Remove LED is ON, you can remove the alarm card.
  • All other boards--Power off the slot through the alarm card software, then remove the component.

On

On

The component is powered on and is in use, but a fault has been detected on the component.

Deactivate the component using one of the following methods:

  • Use the cfgadm -f -c unconfigure command to deactivate the component. Note that in some cases, this may cause the system to panic, depending on the nature of the component's hardware or software.
  • Halt the system and power off the slot through the alarm card software, then remove the component.

The green Power LED will then give status information:

  • If the green Power LED goes off, then you can remove the component.
  • If the green Power LED remains on, then you must halt the system and power off the slot through the alarm card software.
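The rows of TABLE 4-3 can also be expressed as a small lookup, which might be convenient in a monitoring script. The following Python sketch is illustrative only: the names BOARD_LED_ROWS and lookup_board_leds are hypothetical and are not part of any Netra CT software, and the meanings are shortened from the table. Because the table lists the combination of green Off and amber On under two rows, the lookup returns every row that matches.

```python
# Each entry: (allowed green Power LED states, amber Okay to Remove LED state, meaning).
# Transcribed (and shortened) from TABLE 4-3; the names are hypothetical.
BOARD_LED_ROWS = [
    ({"off"}, "off", "Slot is empty, or the component was not detected."),
    ({"blinking"}, "off", "Component is coming up or going down. Do not remove it."),
    ({"on"}, "off", "Component is up and running. Do not remove it."),
    ({"off"}, "on", "Component is powered off. You can remove it."),
    ({"blinking", "off"}, "on", "Component is powered on but offline."),
    ({"on"}, "on", "Component is in use, but a fault has been detected."),
]


def lookup_board_leds(green, amber):
    """Return the meanings of every TABLE 4-3 row that matches the LED pair."""
    green = green.strip().lower()
    amber = amber.strip().lower()
    return [
        meaning
        for greens, row_amber, meaning in BOARD_LED_ROWS
        if green in greens and amber == row_amber
    ]


if __name__ == "__main__":
    for meaning in lookup_board_leds("Off", "On"):
        print(meaning)
```

When more than one meaning is returned, use cfgadm or the alarm card software, as described in the table, to determine the actual state of the component.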

 


TABLE 4-4 Meanings of Power and Okay to Remove LEDs

LED State     Power LED                                 Okay to Remove LED
------------  ----------------------------------------  ----------------------------------------
On, Solid     Component is installed and configured.    Component is Okay to Remove. You can
                                                        remove the component from the system,
                                                        if necessary.
On, Flashing  Component is installed but is             Not applicable.
              unconfigured or is going through the
              configuration process.
Off           Component was not recognized by the       Component is not Okay to Remove. Do not
              system or is not installed in the slot.   remove the component while the system
                                                        is running.


 


TABLE 4-5 Meanings of Power and Fault LEDs

LED State     Power LED                                 Fault LED
------------  ----------------------------------------  ----------------------------------------
On, Solid     Component is installed and configured.    Component has failed. Replace the
                                                        component.
On, Flashing  Component is installed but is             Not applicable.
              unconfigured or is going through the
              configuration process.
Off           Component was not recognized by the       Component is functioning properly.
              system or is not installed in the slot.



4.2 Troubleshooting the System Using prtdiag

You can troubleshoot the system using the prtdiag command. Log in to the server console and, as root, enter:


# /usr/platform/sun4u/sbin/prtdiag

If you have a Netra CT 810 server, the output on the console is similar to the following:


CODE EXAMPLE 4-1 prtdiag Output for a Netra CT 810 Server
System Configuration: Sun Microsystems  sun4u SPARCengine CP2000 model 140 
(UltraSPARC-IIi 648MHz)
Memory size: 512 Megabytes
platform is : SUNW,NetraCT-810
=============================== FRU Information ===============================
FRU         FRU      FRU        Green     Amber     Miscellaneous
Type        Unit#    Present    LED       LED       Information
----------  -----    -------    -----     -----     --------------------------
Midplane    1        Yes                            Netra ct800
                                                    Properties:
                                                      Version=0
                                                      Maximum Slots=8
SCB         1        Yes        on        off       System Controller Board
                                                    Properties:
                                                      Version=2
                                                      hotswap-mode=basic
SSB         1        Yes                            System Status Panel
CPU         1        Yes        on        off       CPU board 
                                                      temperature(celsius):38
I/O         2        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
                                                    Board Type:Unknown 
                                                    Devices:
                                                      pci
                                                        pci108e,1000
                                                        SUNW,hme
                                                        SUNW,isptwo
I/O         3        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
                                                    Board Type:Unknown 
                                                    Devices:
                                                      pci
                                                        pci108e,1000
                                                        SUNW,hme
                                                        SUNW,isptwo
I/O         4        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
                                                    Board Type:Unknown 
                                                    Devices:
                                                      pci
                                                        pci108e,1000
                                                        SUNW,hme
                                                        SUNW,isptwo
I/O         5        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
                                                    Board Type:Unknown 
                                                    Devices:
                                                      pci
                                                        pci108e,1000
                                                        SUNW,hme
                                                        SUNW,isptwo
I/O         6        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
I/O         7        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
                                                    Board Type:Unknown 
                                                    Devices:
                                                      pci
                                                        pci108e,1000
                                                        SUNW,qfe
                                                        pci108e,1000
                                                        SUNW,qfe
                                                        pci108e,1000
                                                        SUNW,qfe
                                                        pci108e,1000
                                                        SUNW,qfe
                                                        pci1176,608
I/O         8        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
                                                    Board Type:Alarm Card
                                                    Devices:
                                                      pci
                                                        ebus
                                                        ethernet
PDU         1        Yes        on        off       Power Distribution Unit
PDU         2        Yes        on        off       Power Distribution Unit
PSU         1        Yes        on        on        Power Supply Unit
                                                      condition:ok
                                                      temperature:ok
                                                      ps fan:ok
                                                      supply:on
PSU         2        Yes        on        on        Power Supply Unit
                                                      condition:ok
                                                      temperature:ok
                                                      ps fan:ok
                                                      supply:on
FAN         1        Yes        on        off       Fan Tray
                                                      condition:ok
                                                      fan speed:low
FAN         2        Yes        on        off       Fan Tray
                                                      condition:ok
                                                      fan speed:low
HDD         0        Yes        on        off       hard drive
                                                      condition:ok
HDD         1        Yes        on        off       hard drive
                                                      condition:ok
RMM                  Yes        on        on        Removable Media Module
                                                      condition:Unknown 
 
System Board PROM revision:
---------------------------
OBP 3.14.1 2000/04/28 12:56

If you have a Netra CT 410 server, the output on the console is similar to the following:


CODE EXAMPLE 4-2 prtdiag Output for a Netra CT 410 Server
System Configuration: Sun Microsystems  sun4u SPARCengine CP2000 model 140 
(UltraSPARC-IIi 648MHz)
Memory size: 512 Megabytes
platform is : SUNW,NetraCT-410
=============================== FRU Information ===============================
FRU         FRU      FRU        Green     Amber     Miscellaneous
Type        Unit#    Present    LED       LED       Information
----------  -----    -------    -----     -----     --------------------------
Midplane    1        Yes                            Netra ct400
                                                    Properties:
                                                      Version=0
                                                      Maximum Slots=5
SCB         1        Yes        on        off       System Controller Board
                                                    Properties:
                                                      Version=2
                                                      hotswap-mode=basic
SSB         1        Yes                            System Status Panel
I/O         1        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
                                                    Board Type:Alarm Card
                                                    Devices:
                                                      pci
                                                        ebus
                                                        ethernet
I/O         2        Yes        off       off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
CPU         3        Yes        on        off       CPU board
                                                      temperature(celsius):38
I/O         4        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
                                                    Board Type:Unknown 
                                                    Devices:
                                                      pci
                                                        pci108e,1000
                                                        SUNW,hme
                                                        SUNW,isptwo
I/O         5        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
                                                    Board Type:Unknown 
                                                    Devices:
                                                      pci
                                                        pci108e,1000
                                                        SUNW,qfe
                                                        pci108e,1000
                                                        SUNW,qfe
                                                        pci108e,1000
                                                        SUNW,qfe
                                                        pci108e,1000
                                                        SUNW,qfe
PDU         1        Yes        on        off       Power Distribution Unit
PSU         1        Yes        on        off       Power Supply Unit
                                                      condition:ok
                                                      temperature:ok
                                                      ps fan:ok
                                                      supply:on
FAN         1        Yes        on        off       Fan Tray
                                                      condition:ok
                                                      fan speed:low
FAN         2        Yes        on        off       Fan Tray
                                                      condition:ok
                                                      fan speed:low
HDD         0        Yes        on        off       hard drive
                                                      condition:ok
 
System Board PROM revision:
---------------------------
OBP 3.14.1 2000/04/28 12:56
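If you capture the prtdiag output to a file, you can scan the FRU Information table with a short script instead of reading it by eye. The following Python sketch is a minimal example that assumes the column layout shown in CODE EXAMPLE 4-1 and CODE EXAMPLE 4-2; the names FRU_ROW and parse_fru_rows are hypothetical, and the pattern has not been checked against other prtdiag versions. It reports only the LED columns and does not interpret them, because an amber LED that is on means Okay to Remove for some components and Fault for others (see TABLE 4-1 and TABLE 4-2).

```python
import re

# One FRU row: type, optional unit number, presence, optional LED pair, description.
# Continuation lines (Properties, Devices, and so on) start with spaces and are skipped.
FRU_ROW = re.compile(
    r"^(?P<fru>[A-Za-z/]+)\s+(?P<unit>\d+)?\s*(?P<present>Yes|No)\s+"
    r"(?:(?P<green>on|off)\s+(?P<amber>on|off)\s+)?(?P<info>.*\S)\s*$"
)


def parse_fru_rows(text):
    """Return one dict per FRU row found in captured prtdiag output."""
    rows = []
    for line in text.splitlines():
        match = FRU_ROW.match(line)
        if match:
            rows.append(match.groupdict())
    return rows


# Shortened sample in the layout of CODE EXAMPLE 4-2.
SAMPLE = """\
FRU         FRU      FRU        Green     Amber     Miscellaneous
Type        Unit#    Present    LED       LED       Information
----------  -----    -------    -----     -----     --------------------------
Midplane    1        Yes                            Netra ct400
SCB         1        Yes        on        off       System Controller Board
                                                    Properties:
I/O         2        Yes        off       off       CompactPCI IO Slot
PSU         1        Yes        on        off       Power Supply Unit
                                                      condition:ok
"""

if __name__ == "__main__":
    for row in parse_fru_rows(SAMPLE):
        # List FRUs that are present but whose green Power LED is off.
        if row["present"] == "Yes" and row["green"] == "off":
            print(row["fru"], row["unit"], row["info"])
```

In the sample, the I/O slot 2 row is reported because the board is present but its green Power LED is off, which corresponds to the first row of TABLE 4-3.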


4.3 Troubleshooting the System Using Diagnostic Software

Software packages, such as SunVTS, allow you to run diagnostic tests on your system. SunVTS is a validation test suite that is provided as a supplement to the Solaris operating environment. The individual tests can stress a device, system, or resource to detect and pinpoint hardware and software failures and to provide users with informational messages for resolving problems. SunVTS runs at the operating system level.

The following tests are useful when troubleshooting a Netra CT server:

A new utility called diagconf, which is part of the embedded firmware image on the alarm card, is now available. You can use diagconf to set or display the configuration settings for Apost, allowing you to make the tests run on the alarm card more or less thoroughly before the embedded firmware is brought up on the alarm card.

To display the values currently set for Apost, access the alarm card command line interface (CLI) and enter the following command:


hostname cli> diagconf -d

Output similar to the following is displayed, giving you the values currently set for the Apost test on the alarm card:


diag-switch        False
verb-mode          True
stop-on-error      False
diag-level         Max
mfg-mode           Off
hdr-checksum       0xaa
time-stamp         0
record-format-ver  49
post-version       02
reset-status       0xd0000000
post-status        ...
post-msg           Watchdog Reset-------- POST Passed-------------------

Some values are hard-set and cannot be changed by a user, while others can be changed to make the test more or less thorough. To change the value for a test, enter the following command:


hostname cli> diagconf -s command value

where command is the name of the setting that you want to change, and value is the new value that you want to assign to it.

The following table lists the Apost tests that can be changed by a user and the allowable values for each. Any tests not listed in TABLE 4-6 are either hard-set and cannot be changed, or should not be changed by a user.


TABLE 4-6 Apost Tests and Values through diagconf

Command

Value

diag-switch

  • True--Turns the diag-switch test on.
  • False--Turns the diag-switch test off.

verb-mode

  • True--Turns the verb-mode test on.
  • False--Turns the verb-mode test off.

stop-on-error

  • True--Stops the Apost testing when the first error is encountered.
  • False--Continues Apost testing, regardless of the number of errors encountered.

diag-level

  • Off--Turns the diag-level test off.
  • Min--Sets the diag-level test to the minimum level of testing.
  • Max--Sets the diag-level test to the maximum level of testing.
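If you generate diagconf commands from a script, it can help to check each setting against TABLE 4-6 before sending it to the alarm card CLI. The following Python sketch only builds the command string; it does not contact the alarm card. The names ALLOWED_VALUES and diagconf_set_command are hypothetical, and the sketch assumes that values are spelled exactly as they appear in the table.

```python
# User-changeable Apost settings and their allowed values, from TABLE 4-6.
ALLOWED_VALUES = {
    "diag-switch": ("True", "False"),
    "verb-mode": ("True", "False"),
    "stop-on-error": ("True", "False"),
    "diag-level": ("Off", "Min", "Max"),
}


def diagconf_set_command(name, value):
    """Return the 'diagconf -s' command line for a setting listed in TABLE 4-6."""
    if name not in ALLOWED_VALUES:
        raise ValueError(f"{name!r} is not a user-changeable Apost setting")
    if value not in ALLOWED_VALUES[name]:
        allowed = ", ".join(ALLOWED_VALUES[name])
        raise ValueError(f"{value!r} is not valid for {name}; use one of: {allowed}")
    return f"diagconf -s {name} {value}"


if __name__ == "__main__":
    print(diagconf_set_command("diag-level", "Min"))
```

Settings that appear in the diagconf -d output but not in TABLE 4-6, such as mfg-mode, are rejected by this check.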

For more information on these and other tests in the SunVTS test suite, refer to the Computer Systems Release Notes Supplement for Sun Hardware document or the SunVTS documentation on the Solaris on Sun Hardware Answerbook, both included with your Solaris operating environment.


4.4 Troubleshooting the System Using the Power-On Self Test (POST)

When you first power up the Netra CT server, some or all of the green Power LEDs on the system status panel flash on and off for several seconds. The green Power LED for the I/O slot holding the host board (slot 1 in the Netra CT 810 server and slot 3 in the Netra CT 410 server) lights solid green while the green Power LEDs for the remaining components are flashing on and off; this status is an indication that the CPU board passed the power-on self test (POST).

Before any processing occurs on a system, the system must successfully complete the POST. Messages are displayed for each step in the POST process. If there is a critical failure, the system does not complete POST and does not boot. To monitor this process, you must be connected to the TTY A port on the CPU board or rear transition module. See Section 5.2.1, Logging In to the Netra CT Server.

OpenBoot PROM (OBP) variables control the console port. The variables and their possible settings are described as follows.

To see the console output device, enter:


ok printenv output-device

The screen displays output similar to the following:


output-device						ttya

The possible settings for this variable are as follows:

Both ttya and ttyb represent the serial ports on the CPU board. screen represents the display attached to the first frame buffer installed in the system (not present on the Netra CT server). rsc is used by the alarm card.

To see the console input device, enter:


ok printenv input-device

The screen displays output similar to the following:


input-device						ttya

The possible settings for this variable are:

ttya and ttyb represent the serial ports on the CPU board. keyboard represents the standard system keyboard (not present on the Netra CT server). rsc is used by the alarm card. If no system keyboard is connected, the console port defaults to ttya.



Note - Be sure the two variables are consistent with each other. For example, do not set the output-device to screen and the input-device to ttya.
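The consistency rule in the preceding note can be sketched as a simple check. The following Python example is illustrative only. The pairing it uses (ttya with ttya, ttyb with ttyb, screen with keyboard, rsc with rsc) is an assumption based on the device descriptions in this section, not a rule taken from the OpenBoot PROM documentation, and the names MATCHING_INPUT and console_devices_consistent are hypothetical.

```python
# Assumed pairing of output-device settings with their matching input-device settings.
MATCHING_INPUT = {
    "ttya": "ttya",
    "ttyb": "ttyb",
    "screen": "keyboard",
    "rsc": "rsc",
}


def console_devices_consistent(output_device, input_device):
    """Return True if the two OBP console variables point at matching devices."""
    expected = MATCHING_INPUT.get(output_device)
    return expected is not None and expected == input_device


if __name__ == "__main__":
    # The combination that the note warns against.
    print(console_devices_consistent("screen", "ttya"))
```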



Another OBP variable, diag-level, controls the behavior of the POST process. By default, this variable is set to max, which means POST runs a more thorough set of tests against the hardware. This variable can be set to min, which runs a less stringent set of tests against the hardware. A minimum level of POST testing takes less time, so the Solaris operating environment can boot more quickly on a machine with diag-level set to min.

To run the maximum amount of POST tests, enter:


ok setenv diag-level max

To run the minimum amount of POST tests, enter:


ok setenv diag-level min


4.5 Troubleshooting the System Using the Alarm Card Software

For information on troubleshooting using the alarm card software, refer to the Netra CT Server System Administration Guide (819-2743-xx).


4.6 Troubleshooting a Power Supply Using the Power Supply Unit LEDs


Each power supply unit has two LEDs: a green LED and an amber LED. Use the LEDs on the power supply unit to troubleshoot each power supply unit. Because there is one power supply unit in the Netra CT 410 server and two power supply units in the Netra CT 810 server, the actions to take are different. The following sections provide guidelines for each server.

4.6.1 Troubleshooting the Power Supply Unit in the Netra CT 410 Server

Following are the states of the LEDs on the power supply unit in the Netra CT 410 server:

4.6.2 Troubleshooting the Power Supply Units in the Netra CT 810 Server


When both power supply units in a Netra CT 810 server are up and running properly, the green LEDs on both power supply units are ON (note that these are the LEDs on the power supply units themselves, not the LEDs on the system status panel).

If a power supply unit fails, the amber LED on the power supply unit might light, depending on the type of failure that occurs:


If one power supply unit fails (either a soft-fault or a hard-fault), but the other power supply unit is still functioning normally, replace the faulty power supply unit as soon as possible. If both power supply units fail, the action to take varies depending on which of the two types of faults occurred:


If                                          Then
------------------------------------------  -------------------------------------------
Both power supply units go through a        Replace one power supply unit at a time,
soft-fault                                  in order to keep the system up and running.

One power supply unit goes through a        Replace the power supply unit that has a
soft-fault and the other power supply       hard-fault first, in order to keep the
unit goes through a hard-fault              system up and running.

Both power supply units go through a        The system is down. Replace at least one
hard-fault                                  of the power supply units to bring the
                                            system back up again.
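The replacement guidelines for a Netra CT 810 server with two power supply units can be summarized as a small decision function. The following Python sketch is illustrative only: the function name psu_action and the state labels "ok", "soft", and "hard" are hypothetical, and the "No action needed." result for two healthy units is an addition that is not stated in the guidelines above.

```python
VALID_STATES = {"ok", "soft", "hard"}


def psu_action(state_1, state_2):
    """Return the recommended action for the fault states of PSU 1 and PSU 2."""
    states = sorted((state_1, state_2))
    if not set(states) <= VALID_STATES:
        raise ValueError("each state must be 'ok', 'soft', or 'hard'")
    if states == ["ok", "ok"]:
        return "No action needed."
    if "ok" in states:
        # One unit failed (soft or hard) while the other is still functioning.
        return "Replace the faulty power supply unit as soon as possible."
    if states == ["soft", "soft"]:
        return "Replace one power supply unit at a time."
    if states == ["hard", "soft"]:
        return "Replace the power supply unit that has a hard-fault first."
    # Remaining case: both units have a hard-fault.
    return "The system is down. Replace at least one power supply unit."


if __name__ == "__main__":
    print(psu_action("soft", "hard"))
```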



4.7 Troubleshooting a Host CPU Board

This section describes how to troubleshoot problems related to the host board. The information provided here primarily covers situations when the system containing the host board does not boot up or when the board is not fully functional after boot. Only general troubleshooting tips are provided here. No component-level troubleshooting information is included in this section.

The following topics are covered:

Also, the following diagnostic procedures are described:

4.7.1 General Troubleshooting Tips




Caution - High voltages are present in the Netra CT server. To avoid physical injury, follow all the safety rules specified in the Netra CT Server Safety and Compliance Manual when opening the enclosure or when removing and installing a board.



The following general troubleshooting tips are useful in isolating issues related to a host board:

1. Make sure the host board is installed properly in the correct slot in the Netra CT server.

The CPU board is installed in slot 1 in the Netra CT 810 server and in slot 3 in the Netra CT 410 server.

2. Make sure all the necessary cables are attached properly to the host rear transition module.

The following are possible board and rear transition module combinations:

FIGURE 4-6 shows the connectors on the CTC. FIGURE 4-7 shows connectors on the RTM-H.


FIGURE 4-6 Connectors on the Netra CP2140 Host Rear Transition Module



FIGURE 4-7 Connectors on the Netra CP2500 Rear Transition Module Host (RTM-H) Board


4.7.2 Warning, Critical, and Shutdown Temperatures

The following temperatures apply to the Netra CP2500 board:

4.7.3 General Troubleshooting Requirements

The following devices are required to take some of the recommended actions in this section:

4.7.4 Mechanical Failures

Symptom

Unable to insert the CPU board into the backplane.

Action

1. Verify that no mechanical or physical obstructions exist in the slot where the CPU board is going to be installed.

2. Make sure no pins on the board connectors or the CompactPCI backplane connectors are bent or damaged.

3. Verify that the front panel screws are seated and not preventing the board from seating properly.

4.7.5 Power-On Failures

This section provides examples of power-on failure symptoms and suggests actions.

  • Make sure the CPU board is installed properly.

4.7.6 Failures Subsequent to Power-On

Symptom

Cannot connect successfully to a TTY serial port; no POST messages are displayed and keyboard input cannot be sent.

Action

1. Check the TTY cable for proper setup.

2. If you do not see any output after connecting the TTY terminal to the rear transition module, remove the terminal, then connect it to the COM port of the CPU board and try again.

4.7.7 Troubleshooting During POST/OpenBoot PROM and During Boot Process

This section describes problems encountered while running POST and OBP and during the boot process.

Symptom

POST error message displays:


cannot establish network service

Action

This might be a hardware address problem.

  • Check the media access control (MAC) address and the IP address at the server, and add them if they are missing.

Symptom

POST detects Ecache error and a message similar to the following is displayed:


STATUS =FAILED 
TEST =Memory Addr w/ Ecache 
SUSPECT=U5201 and U5202 
MESSAGE=Mem Addr line compare error
addr 00000000.00000000
exp 00000000.00000000
obs 88888888.88888888

Action

This might be a mounting issue with the CPU Mylar film, socket, or heatsink, which could have occurred during transportation or due to severe vibration.

  • Contact Sun's Customer Care Center.




Caution - Any attempt to disassemble or replace the aforementioned devices voids the warranty.



4.7.8 OpenBoot PROM On-Board Diagnostics



Note - For Netra CP2500 boards, pcia-probe-list does not apply.



The following OBP variables are specific to the Netra CT server:

The following section describes the OBP on-board diagnostics. To execute the OBP on-board diagnostics, the system must be at the ok prompt. The OBP on-board diagnostics are listed as follows:

4.7.8.1 watch-clock

The watch-clock command reads a register in the TOD chip and displays the result as a seconds counter. During normal operation, the seconds counter repeatedly increments from 0 to 59 until interrupted by pressing any key on the keyboard. The following identifies the watch-clock output message.


ok watch-clock 
Watching the seconds register of the real time clock chip 
It should be ticking once a second 
Type any key to stop 
49 
ok

4.7.8.2 watch-net and watch-net-all

The watch-net and watch-net-all commands monitor Ethernet packets on the Ethernet interfaces connected to the system. Good packets received by the system are indicated by a period (.). Errors such as the framing error and the cyclic redundancy check (CRC) error are indicated with an X and an associated error description. CODE EXAMPLE 4-3 identifies the watch-net output message and CODE EXAMPLE 4-4 identifies the watch-net-all output message.


CODE EXAMPLE 4-3 watch-net Output Message
ok watch-net 
Hme register test --- succeeded. 
Internal loopback test -- succeeded. 
Transceiver check -- Using Onboard Transceiver - Link Up. passed 
Using Onboard Transceiver - Link Up. 
Looking for Ethernet Packets.  
.  is a Good Packet.  X  is a Bad Packet. 
Type any key to stop. .................................................. ................................................................ ................................................................ ........................................................ 
ok

 


CODE EXAMPLE 4-4 watch-net-all Output Message
ok watch-net-all 
/pci@1f,0/pci@1,1/network@1,1 
Hme register test --- succeeded. 
Internal loopback test -- succeeded. 
Transceiver check -- Using Onboard Transceiver - Link Up. passed 
Using Onboard Transceiver - Link Up. 
Looking for Ethernet Packets.  
.  is a Good Packet.  
X  is a Bad Packet. 
Type any key to stop. ........ ........ ........................................................ ................................................................ ................................................................ .................................... 
ok
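When watch-net or watch-net-all output is saved from a console session, the good and bad packet indicators can be tallied with a short script. The following Python sketch is an illustration under stated assumptions, not a Sun-supplied tool. The function name count_packets is hypothetical. The sketch assumes that the capture is available as a single string and that packet indicators appear only after the "Type any key to stop." prompt. The exact format of the error descriptions that accompany an X is not shown in this chapter, so the sketch ignores any token that is not made up entirely of periods and X characters.

```python
import re

STOP_PROMPT = "Type any key to stop."


def count_packets(captured: str) -> tuple[int, int]:
    """Return (good, bad) packet counts from captured watch-net output."""
    # Only the text after the stop prompt carries packet indicators.
    # Earlier lines (self-test results and the legend) also contain
    # periods and the letter X, so they must be excluded.
    _, found, tail = captured.partition(STOP_PROMPT)
    if not found:
        return (0, 0)
    good = bad = 0
    for token in tail.split():
        # Count only runs made purely of '.' and 'X'. This skips error
        # descriptions and the trailing "ok" prompt.
        if re.fullmatch(r"[.X]+", token):
            good += token.count(".")
            bad += token.count("X")
    return (good, bad)
```

A capture such as CODE EXAMPLE 4-3, which contains no X indicators, yields a bad-packet count of 0.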

4.7.8.3 probe-scsi

The probe-scsi command transmits an inquiry command to the SCSI devices connected to the system unit on-board SCSI interface. If a SCSI device is connected and active, its target address, unit number, device type, and manufacturer name are displayed. CODE EXAMPLE 4-5 shows the probe-scsi output message.


CODE EXAMPLE 4-5 probe-scsi Output Message
ok probe-scsi 
Primary UltraSCSI bus: 
Target 0 
  Unit 0 Disk SEAGATE ST32272W 0876 
Target 6 
  Unit 0 Removable Read Only device TOSHIBA CD-ROM XM-6201TA1037 
ok
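To build an inventory from saved probe-scsi output, the Target and Unit lines can be turned into records. The following Python sketch is illustrative only. The function name parse_probe_scsi and the record keys are hypothetical. The sketch assumes that each device is reported by a Target line and a Unit line, which may appear on the same line or on consecutive lines, and it keeps the device type and manufacturer text together as one description string.

```python
import re


def parse_probe_scsi(captured: str) -> list[dict]:
    """Extract one record per Unit line from captured probe-scsi output."""
    devices = []
    target = None
    for line in captured.splitlines():
        line = line.strip()
        # A Target line sets the target for the Unit lines that follow.
        # The Unit may also appear on the same line as the Target.
        m = re.match(r"Target\s+(\w+)\s*(.*)", line)
        if m:
            target = m.group(1)
            line = m.group(2)
        m = re.match(r"Unit\s+(\d+)\s+(.*)", line)
        if m and target is not None:
            devices.append({
                "target": target,          # kept as text, as printed
                "unit": int(m.group(1)),
                "description": m.group(2).strip(),
            })
    return devices
```

Applied to CODE EXAMPLE 4-5, the function returns two records, one for target 0 and one for target 6.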

4.7.8.4 test alias name, device path, -all

The test command, combined with a device alias name or device path, runs the self-test program for a device. If a device has no self-test program, the message No selftest method for device name is displayed. To run the self-test program for a device, type the test command followed by the device alias or device path name. TABLE 4-7 lists the test alias name selections, a description of each selection, and any required preparation.

 


TABLE 4-7 Selected OpenBoot PROM On-Board Diagnostic Tests

| Type of Test | Description | Preparation |
|---|---|---|
| test screen | Tests system video graphics hardware and monitor. | The diag-switch? NVRAM parameter must be true for the test to execute. |
| test floppy | Tests diskette drive response to commands. | A formatted diskette must be inserted into the diskette drive. |
| test net | Performs internal/external loopback test of the system auto-selected Ethernet interface. | An Ethernet cable must be attached to the system and to an Ethernet tap or hub, or the external loopback test fails. |
| test ttya, test ttyb | Outputs an alphanumeric test pattern on the system serial ports: ttya, serial port A; ttyb, serial port B. | A terminal must be connected to the port being tested to observe the output. |
| test keyboard | Executes the keyboard self-test. | Four keyboard LEDs should flash once, and the message Keyboard Present is displayed. |
| test -all | Sequentially tests system-configured devices that have a self-test. | Tests are executed sequentially in device-tree order (viewed with the show-devs command). |


4.7.9 OpenBoot Diagnostics

OpenBoot Diagnostics is an interactive tool that tests various hardware and peripheral devices. When obdiag is typed at the ok prompt in OBP, the menu shown in CODE EXAMPLE 4-6 is displayed on the screen.

obdiag performs root-cause failure analysis on the referenced devices by testing internal registers, confirming subsystem integrity, and verifying device functionality. To run obdiag:



Caution - Before running obdiag, do not run any other OBP command that might change the hardware state of the board. After obdiag tests are run, always reset the system to bring it to a known state.



1. At the ok prompt, enter obdiag.

This displays the OBDiag menu as shown in CODE EXAMPLE 4-6.

2. At the OBDiag menu prompt, enter a number from the menu (such as 17 to toggle script-debug messages).


CODE EXAMPLE 4-6 OBDiag Menu
0 .... PCI/Cheerio 
1 .... EBUS DMA/TCR Registers 
2 .... Ethernet 
3 .... Ethernet2 <Inactive> 
4 .... Parallel Port 
5 .... Serial Port C (on optional I/O board) <Inactive> 
6 .... Serial Port D (on optional I/O board) <Inactive> 
7 .... NVRAM 
8 .... Floppy 
9 .... Serial port A 
10 ... Serial port B 
11 ... RAS 
12 ... User Flash1 
13 ... User Flash2 
14 ... All Above 
15 ... Quit 
16 ... Display this Menu 
17 ... Toggle Script-debug 
18 ... Enable External Loopback Tests 
19 ... Disable External Loopback Tests 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>
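When the OBDiag menu is captured from a console log, the entries and their <Inactive> markers can be read into a lookup table, for example to see which tests are unavailable on a given board. The following Python sketch is illustrative only. The names MENU_LINE and parse_menu are hypothetical, and the sketch assumes that each entry has the form shown in CODE EXAMPLE 4-6: a number, a run of periods, and a label.

```python
import re

# A menu entry: number, three or more periods, then the label.
MENU_LINE = re.compile(r"^(\d+)\s+\.{3,}\s+(.+?)\s*$")
INACTIVE = "<Inactive>"


def parse_menu(captured: str) -> dict[int, tuple[str, bool]]:
    """Map each menu number to (label, is_active)."""
    entries = {}
    for line in captured.splitlines():
        m = MENU_LINE.match(line.strip())
        if not m:
            continue  # skips the "Enter (...) ===>" prompt line
        label = m.group(2)
        active = not label.endswith(INACTIVE)
        if not active:
            # Drop the marker so only the test name remains.
            label = label[: -len(INACTIVE)].rstrip()
        entries[int(m.group(1))] = (label, active)
    return entries
```

For the menu in CODE EXAMPLE 4-6, entries 3, 5, and 6 are reported as inactive.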

You can type the relevant numbers to run all or some of the tests. If an error is detected, an error message is displayed on the screen. For example, if an error is detected while testing the floppy disk drive, a message similar to the following is displayed on the screen:


TEST= floppy_test  
STATUS= FAILED  
SUBTEST= floppy_id0_read_test  
ERRORS= 1   
TTF= 66   
SPEED= 440 MHz  
PASSES= 1   
MESSAGE= Error: Recalibrate failed. floppy missing, improperly connected, or defective. 
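An error record of this form is a series of KEY= value lines, so it can be read into a dictionary when console logs are scanned for failures. The following Python sketch is illustrative only. The function names parse_obdiag_record and has_failed are hypothetical. The sketch assumes one field per line with an uppercase key to the left of the first equal sign, as in the message above.

```python
def parse_obdiag_record(text: str) -> dict[str, str]:
    """Collect KEY= value lines into a dictionary of strings."""
    record = {}
    for line in text.splitlines():
        # Split at the first '=' only, so values may contain '='.
        key, sep, value = line.partition("=")
        key = key.strip()
        # Treat a line as a field only if its key is all uppercase.
        if sep and key.isupper():
            record[key] = value.strip()
    return record


def has_failed(record: dict[str, str]) -> bool:
    """True when the record carries STATUS= FAILED."""
    return record.get("STATUS") == "FAILED"
```

For the floppy example, the parsed record has SUBTEST set to floppy_id0_read_test and has_failed returns True. The MESSAGE field keeps its full text, including the text after the colon.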

Some of the OBDiag menu options are described in detail in the following paragraphs.

4.7.9.1 PCI/PCIO

The PCI/PCIO diagnostic performs the following:

CODE EXAMPLE 4-7 shows the PCI/PCIO output message.


CODE EXAMPLE 4-7 PCI/PCIO Output Message
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===> 0 
 
TEST= all_pci/PCIO_test  
SUBTEST= vendor_id_test  
SUBTEST= device_id_test  
SUBTEST= mixmode_read  
SUBTEST= e2_class_test  
SUBTEST= status_reg_walk1  
SUBTEST= line_size_walk1  
SUBTEST= latency_walk1  
SUBTEST= line_walk1  
SUBTEST= pin_test  
 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>

4.7.9.2 EBus DMA/TCR Registers

The EBus DMA/TCR registers diagnostic performs the following:

CODE EXAMPLE 4-8 shows the EBus DMA/TCR registers output message.


CODE EXAMPLE 4-8 EBus DMA/TCR Registers Output Message
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===> 1 
 
TEST= all_dma/ebus_test  
SUBTEST= dma_reg_test  
SUBTEST= dma_func_test  
 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>

4.7.9.3 Ethernet

The Ethernet diagnostic performs the following:

CODE EXAMPLE 4-9 shows the Ethernet output message.


CODE EXAMPLE 4-9 Ethernet Output Message
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===> 2 
 
TEST= ethernet_test  
SUBTEST= my_channel_reset  
SUBTEST= hme_reg_test  
SUBTEST= global_reg1_test  
SUBTEST= global_reg2_test  
SUBTEST= bmac_xif_reg_test  
SUBTEST= bmac_tx_reg_test  
SUBTEST= mif_reg_test  
Test only supported for National Phy DP83840A 
SUBTEST= 10mb_xcvr_loopback_test  
selecting internal transceiver 
Test only supported for National Phy DP83840A 
SUBTEST= 100mb_phy_loopback_test  
selecting internal transceiver 
Test only supported for National Phy DP83840A 
SUBTEST= 100mb_twister_loopback_test  
selecting internal transceiver 
Test only supported for National Phy DP83840A 
 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>

4.7.9.4 Parallel Port

The parallel port diagnostic performs the dma_read subtest. This subtest enables the enhanced capability port (ECP) mode configuration, the ECP DMA configuration, and FIFO test mode. It then transfers 16 bytes of data from memory to the parallel-port device and verifies that the data is in the FIFO. CODE EXAMPLE 4-10 shows the parallel port output message.


CODE EXAMPLE 4-10 Parallel Port Output Message
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===> 4 
 
TEST= parallel_port_test  
SUBTEST= dma_read  
 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>



Note - The Netra CP2500 board has no parallel port.



4.7.9.5 Serial Port A

The serial port A diagnostic invokes the uart_loopback test. This test transmits and receives 128 characters and checks the validity of the transaction. CODE EXAMPLE 4-11 shows the serial port A output message.


CODE EXAMPLE 4-11 Serial Port A Output Message
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===> 9 
 
TEST= uarta_test  
 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>



Note - The serial port A diagnostic stalls if the TIP line is installed on serial port A. CODE EXAMPLE 4-12 identifies the serial port A output message when the TIP line is installed on serial port A.




CODE EXAMPLE 4-12 Serial Port A Output Message With TIP Line Installed
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===> 9 
 
TEST= uarta_test   
UART A in use as console - Test not run.  
 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>
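A test that is skipped in this way does not produce a failure record, so it is easy to overlook in a long log. The following Python sketch lists the reasons given for skipped tests. It is illustrative only. The function name find_skipped_tests is hypothetical, and the sketch assumes that every skipped test is reported on a line that ends with "Test not run.", as in CODE EXAMPLE 4-12.

```python
SKIP_SUFFIX = "Test not run."


def find_skipped_tests(captured: str) -> list[str]:
    """Return the reason text from each 'Test not run.' line."""
    reasons = []
    for line in captured.splitlines():
        line = line.strip()
        if line.endswith(SKIP_SUFFIX):
            # Remove the suffix and the " - " separator before it.
            reasons.append(line[: -len(SKIP_SUFFIX)].rstrip(" -"))
    return reasons
```

For CODE EXAMPLE 4-12, the function returns a single reason, "UART A in use as console".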

4.7.9.6 Serial Port B

The serial port B diagnostic is identical to the serial port A diagnostic. CODE EXAMPLE 4-13 identifies the serial port B output message.



Note - The serial port B diagnostic stalls if the TIP line is installed on serial port B.




CODE EXAMPLE 4-13 Serial Port B Output Message
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===> 10 
 
TEST= uartb_test  
 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>

4.7.9.7 NVRAM

The NVRAM diagnostic verifies the NVRAM operation by performing a write and read to the NVRAM. CODE EXAMPLE 4-14 identifies the NVRAM output message.


CODE EXAMPLE 4-14 NVRAM Output Message
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===> 7 
 
TEST= nvram_test  
SUBTEST= write/read_patterns  
SUBTEST= write/read_inverted_patterns  
 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>

4.7.9.8 All Above

The All Above diagnostic runs the tests listed above it in the OBDiag menu to validate the system unit. CODE EXAMPLE 4-15 shows an example of the All Above output message.


CODE EXAMPLE 4-15 All Above Output Message
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===> 14 
 
TEST= all_pci/cheerio_test  
SUBTEST= vendor_id_test  
SUBTEST= device_id_test  
... 
SUBTEST= bmac_xif_reg_test  
SUBTEST= bmac_tx_reg_test  
SUBTEST= mif_reg_test  
SUBTEST= mac_internal_loopback_test  
selecting internal transceiver 
Test only supported for National Phy DP83840A 
... 
SUBTEST= 100mb_twister_loopback_test  
selecting internal transceiver 
Test only supported for National Phy DP83840A 
TEST= ethernet2_test  
TEST= parallel_port_test  
SUBTEST= dma_read  
TEST= uarta_test  
... 
SUBTEST= write/read_patterns  
...  
ttya in use as console - Test not run.  
TEST= usi_test   
ttyb in use as console - Test not run.  
TEST= ras_test  env-monitor = disabled 
SUBTEST= obd-init-i2c-test  
... 
TEST= flash_test  
SUBTEST= flash-supported?  
TEST= flash_test  
SUBTEST= flash-supported?  
 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>
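Because the All Above output is long, it can help to group each SUBTEST line under the TEST line that precedes it. The following Python sketch is illustrative only. The function name group_subtests is hypothetical. The sketch assumes that the first word after TEST= or SUBTEST= is the name, and it ignores any trailing text such as env-monitor = disabled. If a capture has been abridged (lines shown as ...), a subtest can be attributed to the wrong test, because the TEST line that introduced it is missing.

```python
def group_subtests(captured: str) -> dict[str, list[str]]:
    """Map each TEST name to the SUBTEST names that follow it."""
    tests: dict[str, list[str]] = {}
    current = None
    for line in captured.splitlines():
        line = line.strip()
        if line.startswith("TEST="):
            fields = line[len("TEST="):].split()
            if not fields:
                continue
            current = fields[0]
            # A test name that appears twice (for example flash_test)
            # shares one entry.
            tests.setdefault(current, [])
        elif line.startswith("SUBTEST=") and current is not None:
            fields = line[len("SUBTEST="):].split()
            if fields:
                tests[current].append(fields[0])
    return tests
```

A test that maps to an empty list ran no subtests in the capture, which is worth cross-checking against any "Test not run." lines nearby.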