CHAPTER 4

Troubleshooting the System

This chapter gives instructions for troubleshooting the Netra CT server. You can troubleshoot the system in several ways, each covered in its own section of this chapter.

In addition, Appendix C lists the error messages that might appear when you are operating or servicing your Netra CT server.


4.1 Troubleshooting the System Using the System Status Panel

You can use the system status panel to troubleshoot the Netra CT server.

4.1.1 Locating and Understanding the System Status Panel

The system status panel on the Netra CT server gives most of the troubleshooting information that you will need for your server. FIGURE 4-1 shows the locations of the system status panels on the Netra CT servers. FIGURE 4-2 shows the system status panel for the Netra CT 810 server, and FIGURE 4-3 shows the system status panel for the Netra CT 410 server.

 FIGURE 4-1 System Status Panel Locations


 FIGURE 4-2 System Status Panel (Netra CT 810 Server)


 FIGURE 4-3 System Status Panel (Netra CT 410 Server)


4.1.2 Using the System Status Panel LEDs to Troubleshoot the System

When you first power on the Netra CT server, some or all of the green Power LEDs on the system status panel flash on and off for several seconds. Do not attempt to troubleshoot the system until after the LEDs have gone through their initial power-on testing.

Each major component in the Netra CT 810 server or Netra CT 410 server has a set of LEDs on the system status panel that gives the status of that particular component. Each component has either the green Power and the amber Okay to Remove LEDs (FIGURE 4-4) or the green Power and amber Fault LEDs (FIGURE 4-5).

 FIGURE 4-4 Power and Okay to Remove LEDs


 FIGURE 4-5 Power and Fault LEDs


TABLE 4-1 describes which combination of LEDs is used for each component in the Netra CT 810 server, and TABLE 4-2 describes which combination of LEDs is used for each component in the Netra CT 410 server. Note that the components in the Netra CT servers all have the green Power LED, and they will have either the amber Okay to Remove LED or the amber Fault LED, but not both.

TABLE 4-1 System Status Panel LEDs for the Netra CT 810 Server

LED              LEDs Available            Component
---------------  ------------------------  --------------------------------------------------------
HDD 0            Power and Okay to Remove  Upper hard disk drive
HDD 1            Power and Okay to Remove  Lower hard disk drive
Slot 1           Power and Okay to Remove  Host CPU card installed in slot 1
Slots 2 - 7      Power and Okay to Remove  I/O card or satellite CPU card installed in slots 2 - 7
Slot 8           Power and Okay to Remove  Alarm card installed in slot 8
SCB              Power and Fault           System controller board (behind the system status panel)
FAN 1            Power and Fault           Upper fan tray (behind the system status panel)
FAN 2            Power and Fault           Lower fan tray (behind the system status panel)
RMM              Power and Okay to Remove  Removable media module
PDU 1 (DC only)  Power and Fault           Leftmost power distribution unit (behind the server)
PDU 2 (DC only)  Power and Fault           Rightmost power distribution unit (behind the server)
PSU 1            Power and Okay to Remove  Leftmost power supply unit
PSU 2            Power and Okay to Remove  Rightmost power supply unit


 

TABLE 4-2 System Status Panel LEDs for the Netra CT 410 Server

LED              LEDs Available            Component
---------------  ------------------------  ------------------------------------------------------------------
Slot 1           Power and Okay to Remove  Alarm card installed in slot 1
Slot 2           Power and Okay to Remove  I/O card or satellite CPU card installed in slot 2
Slot 3           Power and Okay to Remove  Host CPU card installed in slot 3
Slots 4 and 5    Power and Okay to Remove  I/O cards or satellite CPU cards installed in slots 4 and 5
HDD 0            Power and Okay to Remove  Hard disk drive
SCB              Power and Fault           System controller board (behind the system status panel)
FAN 1            Power and Fault           Upper fan tray (behind the system status panel)
FAN 2            Power and Fault           Lower fan tray (behind the system status panel)
FTC              Power and Fault           Host CPU front transition card or host CPU front termination board
PDU 1 (DC only)  Power and Fault           Power distribution unit (behind the server)
PSU 1            Power and Okay to Remove  Power supply




Note - Do not use the information in TABLE 4-4 to troubleshoot a power supply unit in a server that has only one power supply unit (a Netra CT 410 server, or a Netra CT 810 server with only one power supply). To troubleshoot the power supply in a single-power-supply system, use the LEDs on the power supply itself. Refer to Section 4.6, Troubleshooting a Power Supply Using the Power Supply Unit LEDs, for more information. The information given in TABLE 4-4 applies to all other components in the Netra CT 810 server or Netra CT 410 server, including the power supplies in a Netra CT 810 server with two power supplies.



 

TABLE 4-3 CompactPCI Board LED States and Meanings

Green Power LED state: Off
Amber Okay to Remove LED state: Off
Meaning: The slot is empty, or the system thinks that the slot is empty because it did not detect the card when it was inserted.
Action: If there is a card installed in this slot, then one of the following components is faulty:

  • the card installed in the slot
  • the alarm card
  • the system controller board

Remove and replace the failed component to clear this state.

Green Power LED state: Blinking
Amber Okay to Remove LED state: Off
Meaning: The card is coming up or going down.
Action: Do not remove the card in this state.

Green Power LED state: On
Amber Okay to Remove LED state: Off
Meaning: The card is up and running.
Action: Do not remove the card in this state.

Green Power LED state: Off
Amber Okay to Remove LED state: On
Meaning: The card is powered off.
Action: You can remove the card in this state.

Green Power LED state: Blinking
Amber Okay to Remove LED state: On
Meaning: The card is powered on, but it is offline for some reason (for example, a fault was detected on the card).
Action: Wait several seconds to see if the green Power LED stops blinking. If it does not stop blinking after several seconds, enter cfgadm and verify that the card is in the unconfigured state, then perform the necessary action, depending on the card:

  • Alarm card--You can remove the alarm card in this state.
  • All other cards--Power off the slot through the alarm card software, then remove the card.

Green Power LED state: On
Amber Okay to Remove LED state: On
Meaning: The card is powered on and is in use, but a fault has been detected on the card.
Action: Deactivate the card using one of the following methods:

  • Use the cfgadm -f -c unconfigure command to deactivate the card. Note that in some cases, this may cause the system to panic, depending on the nature of the card hardware or software.
  • Halt the system and power off the slot through the alarm card software, then remove the card.

The green Power LED will then give status information:

  • If the green Power LED goes off, then you can remove the card.
  • If the green Power LED remains on, then you must halt the system and power off the slot through the alarm card software.
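
The LED combinations in TABLE 4-3 can be treated as a simple lookup. The following Python sketch is only an illustration of that idea, not part of the Netra CT software: the names CPCI_LED_STATES, describe_cpci_leds, and safe_to_remove are hypothetical, and the meaning and action strings are condensed from the table.

```python
from typing import Tuple

# (green Power LED, amber Okay to Remove LED) -> (meaning, action).
# The text is condensed from TABLE 4-3; the names are illustrative only.
CPCI_LED_STATES = {
    ("off", "off"): ("Slot is empty, or the card was not detected when inserted.",
                     "If a card is installed, replace the faulty component."),
    ("blinking", "off"): ("Card is coming up or going down.",
                          "Do not remove the card."),
    ("on", "off"): ("Card is up and running.",
                    "Do not remove the card."),
    ("off", "on"): ("Card is powered off.",
                    "You can remove the card."),
    ("blinking", "on"): ("Card is powered on but offline.",
                         "Wait, then check the card state with cfgadm."),
    ("on", "on"): ("Card is in use, but a fault has been detected.",
                   "Deactivate the card before removing it."),
}


def describe_cpci_leds(green: str, amber: str) -> Tuple[str, str]:
    """Return (meaning, action) for the LED pair of one CompactPCI slot."""
    key = (green.strip().lower(), amber.strip().lower())
    if key not in CPCI_LED_STATES:
        raise ValueError(f"unexpected LED combination: {key}")
    return CPCI_LED_STATES[key]


def safe_to_remove(green: str, amber: str) -> bool:
    """True only for 'green off, amber on', the powered-off state."""
    return (green.strip().lower(), amber.strip().lower()) == ("off", "on")


if __name__ == "__main__":
    meaning, action = describe_cpci_leds("Blinking", "On")
    print(meaning, action)
```

The other states that eventually allow removal (for example, blinking green with amber on) still require the checks listed in the table, so the sketch deliberately reports them as not yet safe.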

 

TABLE 4-4 Meanings of Power and Okay to Remove LEDs

LED State: On, Solid
Power LED: Component is installed and configured.
Okay to Remove LED: Component is Okay to Remove. You can remove the component from the system, if necessary.

LED State: On, Flashing
Power LED: Component is installed but is unconfigured or is going through the configuration process.
Okay to Remove LED: Not applicable.

LED State: Off
Power LED: Component was not recognized by the system or is not installed in the slot.
Okay to Remove LED: Component is not Okay to Remove. Do not remove the component while the system is running.


 

TABLE 4-5 Meanings of Power and Fault LEDs

LED State: On, Solid
Power LED: Component is installed and configured.
Fault LED: Component has failed. Replace the component.

LED State: On, Flashing
Power LED: Component is installed but is unconfigured or is going through the configuration process.
Fault LED: Not applicable.

LED State: Off
Power LED: Component was not recognized by the system or is not installed in the slot.
Fault LED: Component is functioning properly.


 


4.2 Troubleshooting the System Using prtdiag

You can troubleshoot the system using the prtdiag command. Log in to the server console and, as root, enter:

# /usr/platform/sun4u/sbin/prtdiag

If you have a Netra CT 810 server, you should get output on the console similar to the following:

 

CODE EXAMPLE 4-1 prtdiag Output for a Netra CT 810 Server
System Configuration: Sun Microsystems  sun4u SPARCengine CP2000 model 140 
(UltraSPARC-IIi 648MHz)
Memory size: 512 Megabytes
platform is : SUNW,NetraCT-810
=============================== FRU Information ===============================
FRU         FRU      FRU        Green     Amber     Miscellaneous
Type        Unit#    Present    LED       LED       Information
----------  -----    -------    -----     -----     --------------------------
Midplane    1        Yes                            Netra ct800
                                                    Properties:
                                                      Version=0
                                                      Maximum Slots=8
SCB         1        Yes        on        off       System Controller Board
                                                     Properties:
                                                       Version=2
                                                       hotswap-mode=basic
SSB         1        Yes                            System Status Panel
CPU         1        Yes        on        off       CPU board 
                                                      temperature(celsius):38
I/O         2        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
                                                    Board Type:Unknown 
                                                    Devices:
                                                      pci
                                                        pci108e,1000
                                                        SUNW,hme
                                                        SUNW,isptwo
I/O         3        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
                                                    Board Type:Unknown 
                                                      Devices:
                                                        pci
                                                          pci108e,1000
                                                          SUNW,hme
                                                          SUNW,isptwo
I/O         4        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
                                                      Board Type:Unknown 
                                                    Devices:
                                                      pci
                                                        pci108e,1000
                                                        SUNW,hme
                                                        SUNW,isptwo
I/O         5        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
                                                      Board Type:Unknown 
                                                     Devices:
                                                       pci
                                                         pci108e,1000
                                                         SUNW,hme
                                                         SUNW,isptwo
I/O         6        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
I/O         7        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
                                                    Board Type:Unknown 
                                                    Devices:
                                                      pci
                                                        pci108e,1000
                                                        SUNW,qfe
                                                        pci108e,1000
                                                        SUNW,qfe
                                                        pci108e,1000
                                                        SUNW,qfe
                                                        pci108e,1000
                                                        SUNW,qfe
                                                        pci1176,608
I/O         8        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
                                                    Board Type:Alarm Card
                                                    Devices:
                                                      pci
                                                        ebus
                                                        ethernet
PDU         1        Yes        on        off       Power Distribution Unit
PDU         2        Yes        on        off       Power Distribution Unit
PSU         1        Yes        on        on        Power Supply Unit
                                                      condition:ok
                                                      temperature:ok
                                                    ps fan:ok
                                                      supply:on
PSU         2        Yes        on        on        Power Supply Unit
                                                      condition:ok
                                                      temperature:ok
                                                      ps fan:ok
                                                      supply:on
FAN         1        Yes        on        off       Fan Tray
                                                      condition:ok
                                                      fan speed:low
FAN         2        Yes        on        off       Fan Tray
                                                      condition:ok
                                                      fan speed:low
HDD         0        Yes        on        off       Hard Disk Drive
                                                      condition:ok
HDD         1        Yes        on        off       Hard Disk Drive
                                                      condition:ok
RMM                  Yes        on        on       Removable Media Module
                                                      condition:Unknown 
 
System Board PROM revision:
---------------------------
OBP 3.14.1 2000/04/28 12:56

 

If you have a Netra CT 410 server, you should get output on the console similar to the following:

CODE EXAMPLE 4-2 prtdiag Output for a Netra CT 410 Server
System Configuration: Sun Microsystems  sun4u SPARCengine CP2000 model 140 
(UltraSPARC-IIi 648MHz)
Memory size: 512 Megabytes
platform is : SUNW,NetraCT-410
=============================== FRU Information ===============================
FRU         FRU      FRU        Green     Amber     Miscellaneous
Type        Unit#    Present    LED       LED       Information
----------  -----    -------    -----     -----     --------------------------
Midplane    1        Yes                            Netra ct400
                                                    Properties:
                                                      Version=0
                                                      Maximum Slots=5
SCB         1        Yes        on        off       System Controller Board
                                                      Properties:
                                                      Version=2
                                                      hotswap-mode=basic
SSB         1        Yes                            System Status Panel
I/O         1        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
                                                    Board Type:Alarm Card
                                                    Devices:
                                                      pci
                                                        ebus
                                                        ethernet
I/O         2        Yes        off        off      CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
CPU         3        Yes        on        off       CPU board
                                                      temperature(celsius):38
I/O         4        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
                                                    Board Type:Unknown 
                                                    Devices:
                                                      pci
                                                        pci108e,1000
                                                        SUNW,hme
                                                        SUNW,isptwo
I/O         5        Yes        on        off       CompactPCI IO Slot
                                                    Properties:
                                                      auto-config=disabled
                                                    Board Type:Unknown 
                                                    Devices:
                                                      pci
                                                        pci108e,1000
                                                        SUNW,qfe
                                                        pci108e,1000
                                                        SUNW,qfe
                                                        pci108e,1000
                                                        SUNW,qfe
                                                        pci108e,1000
                                                        SUNW,qfe
PDU         1        Yes        on        off       Power Distribution Unit
PSU         1        Yes        on        off       Power Supply Unit
                                                      condition:ok
                                                      temperature:ok
                                                      ps fan:ok
                                                      supply:on
FAN         1        Yes        on        off       Fan Tray
                                                      condition:ok
                                                      fan speed:low
FAN         2        Yes        on        off       Fan Tray
                                                      condition:ok
                                                      fan speed:low
HDD         0        Yes        on        off       Hard Disk Drive
                                                      condition:ok
 
System Board PROM revision:
---------------------------
OBP 3.14.1 2000/04/28 12:56
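
When prtdiag output is saved to a file, the FRU table can be scanned for lit Fault LEDs. The Python sketch below is a minimal illustration under several assumptions: the rows follow the column layout shown in CODE EXAMPLE 4-1 and CODE EXAMPLE 4-2, the Amber LED column reflects the amber LED on the system status panel, and the amber LED is a Fault LED only for the SCB, FAN, and PDU rows (per TABLE 4-1 and TABLE 4-2). The function names and the SAMPLE text are hypothetical; SAMPLE is made-up data, not captured output.

```python
from typing import Dict, List, Optional

# FRU types whose amber LED is a Fault LED (TABLE 4-1 and TABLE 4-2).
# For every other FRU type, the amber LED means Okay to Remove.
FAULT_LED_TYPES = {"SCB", "FAN", "PDU"}
FRU_TYPES = {"Midplane", "SCB", "SSB", "CPU", "I/O",
             "PDU", "PSU", "FAN", "HDD", "RMM"}
LED_VALUES = {"on", "off"}


def parse_fru_rows(text: str) -> List[Dict[str, Optional[str]]]:
    """Extract one record per FRU row; indented detail lines are skipped."""
    rows = []
    for line in text.splitlines():
        if not line or line[0].isspace():
            continue  # blank line or indented continuation line
        tokens = line.split()
        if tokens[0] not in FRU_TYPES:
            continue  # header, separator, or unrelated line
        present_at = next(
            (i for i, tok in enumerate(tokens) if tok in ("Yes", "No")), None)
        if present_at is None:
            continue
        # The RMM row has no unit number, so "Yes" is the second token there.
        unit = tokens[1] if present_at == 2 else None
        leds = tokens[present_at + 1:present_at + 3]
        has_leds = len(leds) == 2 and all(v in LED_VALUES for v in leds)
        rows.append({
            "type": tokens[0],
            "unit": unit,
            "present": tokens[present_at],
            "green": leds[0] if has_leds else None,
            "amber": leds[1] if has_leds else None,
        })
    return rows


def faulted_frus(text: str) -> List[str]:
    """List FRUs whose amber LED is a Fault LED and is reported as on."""
    return [f"{row['type']} {row['unit']}"
            for row in parse_fru_rows(text)
            if row["type"] in FAULT_LED_TYPES and row["amber"] == "on"]


# Hypothetical excerpt in the same layout as the examples above.
SAMPLE = """\
SCB         1        Yes        on        off       System Controller Board
PSU         1        Yes        on        on        Power Supply Unit
FAN         1        Yes        on        on        Fan Tray
FAN         2        Yes        on        off       Fan Tray
RMM                  Yes        on        on       Removable Media Module
"""

if __name__ == "__main__":
    print(faulted_frus(SAMPLE))
```

In the sample, only FAN 1 is reported. PSU 1 and the RMM also show an amber LED that is on, but for those components the amber LED is the Okay to Remove LED, so it does not indicate a failure.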


4.3 Troubleshooting the System Using Diagnostic Software

Several software packages, such as SunVTS, allow you to run diagnostic tests on your system. SunVTS is a validation test suite that is provided as a supplement to the Solaris operating environment. The individual tests can stress a device, system, or resource in order to detect and pinpoint specific hardware and software failures, and they provide informational messages to help you resolve any problems found. SunVTS runs at the operating system level.

There are several tests that are particularly useful when troubleshooting a Netra CT server:

A new utility called diagconf, which is also part of the Chorus operating system image on the alarm card, is now available. You can use diagconf to set or display the configuration settings for Apost, allowing you to make the tests run on the alarm card more or less thoroughly before the Chorus operating system is brought up on the alarm card.

To display the values currently set for Apost, access the alarm card command line interface (CLI), and, through the alarm card CLI, enter the following command:

hostname cli> diagconf -d

You should see output similar to the following, giving you the values currently set for the Apost test on the alarm card:

diag-switch        False
verb-mode          True
stop-on-error      False
diag-level         Max
mfg-mode           Off
hdr-checksum       0xaa
time-stamp         0
record-format-ver  49
post-version       02
reset-status       0xd0000000
post-status        ...
post-msg           Watchdog Reset-------- POST Passed-------------------

Some values are hard-set and cannot be changed by a user, while others can be changed to make that particular test more or less thorough. To change the value for a particular test, enter the following command:

hostname cli> diagconf -s command value

where command is the name of the setting that you want to change (as listed in the Command column of TABLE 4-6), and value is the value that you want to assign to it.

The following table lists the Apost tests that can be changed by a user and the allowable values for each. Any tests not listed in TABLE 4-6 are either hard-set and cannot be changed, or should not be changed by a user.

TABLE 4-6 Apost Tests and Values through diagconf

Command        Value
-------------  ----------------------------------------------------------------------------------
diag-switch    • True--Turns the diag-switch test on.
               • False--Turns the diag-switch test off.
verb-mode      • True--Turns the verb-mode test on.
               • False--Turns the verb-mode test off.
stop-on-error  • True--Stops the Apost testing when the first error is encountered.
               • False--Continues Apost testing, regardless of the number of errors encountered.
diag-level     • Off--Turns the diag-level test off.
               • Min--Sets the diag-level test to the minimum level of testing.
               • Max--Sets the diag-level test to the maximum level of testing.
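
Before typing a diagconf -s command at the alarm card CLI, it can help to check the setting name and value against TABLE 4-6. The Python sketch below only builds and validates the command text; it does not talk to the alarm card. The name diagconf_set_command is hypothetical, and the sketch assumes that values are spelled exactly as they appear in the table.

```python
# Settings that a user may change, with their allowed values (TABLE 4-6).
# Assumption: values are spelled and capitalized exactly as in the table.
ALLOWED_VALUES = {
    "diag-switch": {"True", "False"},
    "verb-mode": {"True", "False"},
    "stop-on-error": {"True", "False"},
    "diag-level": {"Off", "Min", "Max"},
}


def diagconf_set_command(command: str, value: str) -> str:
    """Return the 'diagconf -s' command line, or raise if it is not allowed."""
    if command not in ALLOWED_VALUES:
        raise ValueError(f"{command!r} is not a user-changeable setting")
    if value not in ALLOWED_VALUES[command]:
        allowed = ", ".join(sorted(ALLOWED_VALUES[command]))
        raise ValueError(f"{value!r} is not valid for {command}; use one of: {allowed}")
    return f"diagconf -s {command} {value}"


if __name__ == "__main__":
    print(diagconf_set_command("diag-level", "Min"))
```

Settings such as hdr-checksum or post-version are rejected because they do not appear in TABLE 4-6.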

For more information on these and other tests in the SunVTS test suite, refer to the Computer Systems Release Notes Supplement for Sun Hardware document or the SunVTS documentation on the Solaris on Sun Hardware Answerbook, both included with your Solaris operating environment.


4.4 Troubleshooting the System Using the Power-On Self Test (POST)

When you first power on the Netra CT server, some or all of the green Power LEDs on the system status panel flash on and off for several seconds. The green Power LED for the slot holding the CPU card (slot 1 in the Netra CT 810 server and slot 3 in the Netra CT 410 server) turns solid green while the green Power LEDs for the remaining components are still flashing; this indicates that the CPU card has passed the power-on self test (POST).

Before any processing can occur on a system, it must successfully complete the POST. Messages are displayed for each step in the POST process. If there is a critical failure, the system will not complete POST and will not boot. To monitor this process, you must be connected to the TTY A port on the CPU card or CPU transition card. See Section 5.2.1, Logging In to the Netra CT Server.

OpenBoot PROM (OBP) variables control the console port. The variables and their possible settings are described below.

To see the console output device, enter:

ok printenv output-device

The screen will display something similar to the following:

output-device        ttya

The possible settings for this variable are:

  • ttya, ttyb--The serial ports on the CPU card.
  • screen--The display attached to the first frame buffer installed in the system (not present on the Netra CT server).
  • rsc--Used by the alarm card.

To see the console input device, enter:

ok printenv input-device

The screen will display something similar to the following:

input-device        ttya

The possible settings for this variable are:

  • ttya, ttyb--The serial ports on the CPU card.
  • keyboard--The standard system keyboard (not present on the Netra CT server).
  • rsc--Used by the alarm card.

If no system keyboard is connected, the console port defaults to ttya.



Note - Be sure the two variables are consistent with each other. For example, do not set the output-device to screen and the input-device to ttya.
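
The consistency check in the note above can be sketched in Python for use on a saved console log. This is an illustration only: the source gives a single inconsistent example (output-device set to screen with input-device set to ttya), so the pairing table below, which matches each serial port and rsc with itself and keyboard with screen, is an assumption. The function names are hypothetical.

```python
from typing import Tuple

# input-device value -> output-device value treated as consistent.
# Assumption: this pairing is an illustrative reading of the note, which
# only states that screen (output) with ttya (input) is inconsistent.
CONSISTENT_OUTPUT_FOR_INPUT = {
    "ttya": "ttya",
    "ttyb": "ttyb",
    "keyboard": "screen",
    "rsc": "rsc",
}


def parse_printenv_line(line: str) -> Tuple[str, str]:
    """Split a 'name value' line as displayed by printenv in this chapter."""
    name, value = line.split()
    return name, value


def console_devices_consistent(input_device: str, output_device: str) -> bool:
    """True when the two settings point at the same console path."""
    return CONSISTENT_OUTPUT_FOR_INPUT.get(input_device) == output_device


if __name__ == "__main__":
    _, out_dev = parse_printenv_line("output-device        ttya")
    _, in_dev = parse_printenv_line("input-device         ttya")
    print(console_devices_consistent(in_dev, out_dev))
```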



Another OBP variable, diag-level, controls the behavior of the POST process. By default, this variable is set to max, which means that POST runs a more thorough and more verbose set of tests against the hardware. The variable can also be set to min, which runs a less stringent set of tests against the hardware. A minimum level of POST testing also takes less time, so the Solaris operating environment can boot more quickly on a machine with diag-level set to min.

To run the maximum number of POST tests, enter:

ok setenv diag-level max

To run the minimum number of POST tests, enter:

ok setenv diag-level min 


4.5 Troubleshooting the System Using the Alarm Card Software

For information on troubleshooting using the alarm card software, refer to the Netra CT Server System Administration Guide (816-2483-xx).


4.6 Troubleshooting a Power Supply Using the Power Supply Unit LEDs

There are two LEDs on each power supply unit: a green LED and an amber LED. You can use the LEDs on the power supply unit to troubleshoot each power supply unit; however, because there is one power supply unit in the Netra CT 410 server and two power supply units in the Netra CT 810 server, the actions to take are different.

4.6.1 Troubleshooting the Power Supply Unit in the Netra CT 410 Server

Following are the states of the LEDs on the power supply unit in the Netra CT 410 server:

4.6.2 Troubleshooting the Power Supply Units in the Netra CT 810 Server

When both power supply units in a Netra CT 810 server are up and running properly, the green LEDs on both power supply units will be ON (note that these are the LEDs on the power supply units themselves, not the LEDs on the system status panel).

If a power supply unit fails, the amber LED on the power supply unit might light, depending on the type of failure that has occurred:

If one power supply unit fails (either a soft-fault or a hard-fault), but the other power supply unit is still functioning normally, you should replace the faulty power supply unit as soon as possible to keep the system up and running. If both power supply units fail, the action you should take varies depending on which of the two types of fault has occurred:

If:   Both power supply units go through a soft-fault
Then: Replace one power supply unit at a time in order to keep the system up and running.

If:   One power supply unit goes through a soft-fault and the other power supply unit goes through a hard-fault
Then: Replace the power supply unit that has gone through a hard-fault first in order to keep the system up and running.

If:   Both power supply units go through a hard-fault
Then: The system is down and you should replace at least one of the power supply units to bring the system back up again.
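
The replacement rules for a Netra CT 810 server with two power supply units can be summarized as a small decision function. The Python sketch below is illustrative only; the name psu_action and the state labels "ok", "soft", and "hard" are hypothetical, and the "no action" result for two healthy units is an assumption that the source does not state explicitly.

```python
def psu_action(psu1: str, psu2: str) -> str:
    """Map the states of two power supply units to the recommended action.

    Each state is "ok", "soft" (soft-fault), or "hard" (hard-fault).
    """
    states = sorted((psu1, psu2))
    if not set(states) <= {"ok", "soft", "hard"}:
        raise ValueError(f"unknown power supply state in {states}")
    if states == ["ok", "ok"]:
        return "No action required."  # assumption, not stated in the source
    if "ok" in states:
        return "Replace the faulty power supply unit as soon as possible."
    if states == ["soft", "soft"]:
        return "Replace one power supply unit at a time."
    if states == ["hard", "soft"]:
        return "Replace the hard-faulted power supply unit first."
    return "System is down; replace at least one power supply unit."


if __name__ == "__main__":
    print(psu_action("soft", "hard"))
```

Sorting the two states first means that the order in which the units are given does not matter.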



4.7 Troubleshooting a CPU Card

This section describes how to troubleshoot problems related to the CPU card. The information provided here primarily covers those situations when the system containing the CPU card does not boot up or when the CPU card is not fully functional after boot up. Only general troubleshooting tips are provided here. No component level troubleshooting information is included in this section.

The following topics are covered:

The following diagnostic procedures are also described:

4.7.1 General Troubleshooting Tips




Caution - High voltages are present in the Netra CT server. To avoid physical injury, follow all the safety rules specified in the Netra CT Server Safety and Compliance Manual when opening the enclosure and/or removing and installing the board.



The following general troubleshooting tips are useful in isolating the problems related to the CPU card:

1. Make sure the CPU card is installed properly in the correct slot in the Netra CT server.

The CPU card should be installed in slot 1 in the Netra CT 810 server and in slot 3 in the Netra CT 410 server.

2. Make sure all the necessary cables are attached properly to the CPU transition card.

The following figures show the connectors on the different CPU transition cards:



Note - The CPU rear transition card is the same for both the Netra CT 810 server and the Netra CT 410 server; only the location in the rear card cage differs.



 FIGURE 4-6 Connectors on the CPU Front Transition Card (Netra CT 410 Server)


 FIGURE 4-7 Connectors on the CPU Rear Transition Card


4.7.2 General Troubleshooting Requirements

The following devices are generally required to take some of the recommended actions in this section:

4.7.3 Mechanical Failures

Symptom

Unable to insert the CPU card into the backplane.

Action

1. Verify that there are no mechanical and physical obstructions in the slot where the CPU card is going to be installed.

2. Make sure no pins on the board connectors or the CompactPCI backplane connectors are bent or damaged.

4.7.4 Power-On Failures

This section provides examples of power-on failure symptoms and suggested actions. There can be several reasons for the power-on failures.

  • Make sure the CPU card is installed properly.



Note - If both the Ready and Alarm LEDs on the CPU card are green, the board is partially functional and capable of running the power-on self test (POST). This means that the basic functionality of the board is present. If neither of these LEDs is green, and the board is installed properly, the board is not functional. In that case, contact your Sun supplier or field service engineer.



4.7.5 Failures Subsequent to Power-On

Symptom

Cannot connect successfully to a TTY serial port; no POST messages are displayed and keyboard input cannot be sent.

Action

1. Check the TTY cable for proper setup.

2. If you do not see any output after connecting the TTY terminal to the CPU transition card, disconnect the terminal, connect it to the COM port of the CPU card, and try again.

4.7.6 Troubleshooting During POST/OBP and During Boot Process

This section describes certain possible problems encountered while running POST and OBP and during the boot process.

Symptom

POST error message displays:

cannot establish network service 

Action

  • This might be a hardware address problem. Check that the media access control (MAC) address and the IP address are present at the server, and add them if they are missing.

Symptom

POST detects an Ecache error, and a message similar to the following is displayed:

STATUS =FAILED 
TEST =Memory Addr w/ Ecache 
SUSPECT=U5201 and U5202 
MESSAGE=Mem Addr line compare error
addr 00000000.00000000
exp 00000000.00000000
obs 88888888.88888888

Action

  • This might be a mounting issue with the CPU Mylar film, socket, or heatsink, which could have occurred during transportation or because of severe vibration. Contact Sun's Enterprise Services Solution Center.




Caution - Any attempt to disassemble or replace the aforementioned devices will void the warranty.
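
When reporting a POST failure such as the Ecache error shown above, it can be convenient to turn the message into named fields. The Python sketch below is a minimal illustration that assumes the message has the KEY=VALUE and "label value" line layout of that example; the function name parse_post_failure is hypothetical, and other POST messages may be laid out differently.

```python
from typing import Dict


def parse_post_failure(text: str) -> Dict[str, str]:
    """Parse 'KEY =VALUE' lines and 'label value' lines into a dictionary."""
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "=" in line:
            key, value = line.split("=", 1)      # for example: STATUS =FAILED
        else:
            key, _, value = line.partition(" ")  # for example: obs 8888...
        fields[key.strip().lower()] = value.strip()
    return fields


# The message text from the symptom above.
MESSAGE = """\
STATUS =FAILED
TEST =Memory Addr w/ Ecache
SUSPECT=U5201 and U5202
MESSAGE=Mem Addr line compare error
addr 00000000.00000000
exp 00000000.00000000
obs 88888888.88888888
"""

if __name__ == "__main__":
    report = parse_post_failure(MESSAGE)
    print(report["test"], "-", report["suspect"])
    print("expected and observed differ:", report["exp"] != report["obs"])
```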



4.7.7 OpenBoot PROM On-Board Diagnostics

There are several OBP variables specific to the Netra CT server, such as:

The following section describes the OBP on-board diagnostics. To execute the OBP on-board diagnostics, the system must be at the ok prompt. The OBP on-board diagnostics are listed as follows:

4.7.7.1 watch-clock

The watch-clock command reads a register in the NVRAM/TOD chip and displays the result as a seconds counter. During normal operation, the seconds counter repeatedly increments from 0 to 59 until you interrupt it by pressing any key on the PS/2 keyboard. The following example shows the watch-clock output message.

ok watch-clock 
Watching the seconds register of the real time clock chip 
It should be ticking once a second 
Type any key to stop 
49 
ok

4.7.7.2 watch-net and watch-net-all

The watch-net and watch-net-all commands monitor Ethernet packets on the Ethernet interfaces connected to the system. Good packets received by the system are indicated by a period (.). Errors such as the framing error and the cyclic redundancy check (CRC) error are indicated with an X and an associated error description. CODE EXAMPLE 4-3 identifies the watch-net output message and CODE EXAMPLE 4-4 identifies the watch-net-all output message.

CODE EXAMPLE 4-3 watch-net Output Message
ok watch-net 
Hme register test --- succeeded. 
Internal loopback test -- succeeded. 
Transceiver check -- Using Onboard Transceiver - Link Up. passed 
Using Onboard Transceiver - Link Up. 
Looking for Ethernet Packets. 
. is a Good Packet. 
X is a Bad Packet. 
Type any key to stop. 
.................................................. 
................................................................ 
................................................................ 
........................................................ 
ok

 

CODE EXAMPLE 4-4 watch-net-all Output Message
ok watch-net-all 
/pci@1f,0/pci@1,1/network@1,1 
Hme register test --- succeeded. 
Internal loopback test -- succeeded. 
Transceiver check -- Using Onboard Transceiver - Link Up. passed 
Using Onboard Transceiver - Link Up. 
Looking for Ethernet Packets.  
.  is a Good Packet.  
X  is a Bad Packet. 
Type any key to stop. 
........ ........ 
........................................................ 
................................................................ 
................................................................ 
.................................... 
ok
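
If you capture a watch-net or watch-net-all session to a file (for example, through a terminal emulator log), you can tally the good and bad packet markers afterward. The following Python fragment is a minimal sketch under stated assumptions: the packet markers follow the "Type any key to stop." line, error descriptions never consist solely of the characters . and X, and the function name count_packets is hypothetical.

```python
def count_packets(transcript: str) -> tuple[int, int]:
    """Return (good, bad) packet counts from a captured watch-net transcript."""
    marker = "Type any key to stop."
    _, found, tail = transcript.partition(marker)
    if not found:
        raise ValueError("transcript does not contain the packet marker prompt")

    good = bad = 0
    for line in tail.splitlines():
        stripped = line.strip()
        if stripped == "ok":  # the ok prompt ends the captured session
            break
        for token in stripped.split():
            # Count only runs made entirely of packet markers, so that the
            # words of an error description are ignored.
            if set(token) <= {".", "X"}:
                good += token.count(".")
                bad += token.count("X")
    return good, bad
```

A nonzero second value indicates that the interface received at least one bad packet during the capture.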

4.7.7.3 probe-scsi

The probe-scsi command transmits an inquiry command to the SCSI devices connected to the on-board SCSI interface of the system unit. If a SCSI device is connected and active, its target address, unit number, device type, and manufacturer name are displayed. CODE EXAMPLE 4-5 identifies the probe-scsi output message.

CODE EXAMPLE 4-5 probe-scsi Output Message
ok probe-scsi 
Primary UltraSCSI bus: 
Target 0 
  Unit 0 Disk SEAGATE ST32272W 0876 
Target 6 
  Unit 0 Removable Read Only device TOSHIBA CD-ROM XM-6201TA1037 
ok
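
A captured probe-scsi transcript can be turned into a simple device list, for example to compare against the expected configuration. The following Python fragment is a minimal sketch that assumes only the Target and Unit layout shown in CODE EXAMPLE 4-5. The function name parse_probe_scsi and the dictionary keys are hypothetical.

```python
import re

TARGET_RE = re.compile(r"^Target\s+(\d+)\s*(.*)$")
UNIT_RE = re.compile(r"^Unit\s+(\d+)\s+(.*)$")


def parse_probe_scsi(transcript: str) -> list[dict]:
    """Return one entry per responding unit found in a probe-scsi transcript."""
    devices = []
    target = None
    for raw in transcript.splitlines():
        line = raw.strip()
        match = TARGET_RE.match(line)
        if match:
            target = int(match.group(1))
            # A unit may follow the target on the same line or on the next one.
            line = match.group(2).strip()
        match = UNIT_RE.match(line)
        if match and target is not None:
            devices.append({
                "target": target,
                "unit": int(match.group(1)),
                "description": match.group(2).strip(),
            })
    return devices
```

The description field keeps the device type, manufacturer, and model as one string, because the transcript does not delimit them reliably.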

4.7.7.4 test alias name, device path, -all

The test command, combined with a device alias or device path, runs the self-test program for that device. If a device has no self-test program, the following message is displayed: No selftest method for device name. To run the self-test program for a device, type the test command followed by the device alias or device path name. TABLE 4-7 lists the test alias name selections, a description of each selection, and any preparation required.

 

TABLE 4-7 Selected OBP On-Board Diagnostic Tests

| Type of Test | Description | Preparation |
|---|---|---|
| test screen | Tests system video graphics hardware and monitor. | Diag-switch? NVRAM parameter must be true for the test to execute. |
| test floppy | Tests diskette drive response to commands. | A formatted diskette must be inserted into the diskette drive. |
| test net | Performs internal/external loopback test of the system auto-selected Ethernet interface. | An Ethernet cable must be attached to the system and to an Ethernet tap or hub, or the external loopback test fails. |
| test ttya, test ttyb | Outputs an alphanumeric test pattern on the system serial ports: ttya, serial port A; ttyb, serial port B. | A terminal must be connected to the port being tested to observe the output. |
| test keyboard | Executes the keyboard self-test. | Four keyboard LEDs should flash once and a message is displayed: Keyboard Present. |
| test -all | Sequentially tests system-configured devices containing self-test. | Tests are sequentially executed in device-tree order (viewed with the show-devs command). |


4.7.8 OpenBoot Diagnostics (OBDiag)

OpenBoot Diagnostics is an interactive tool that tests various hardware and peripheral devices. When obdiag is typed at the ok prompt in OBP, the menu shown in CODE EXAMPLE 4-6 is displayed on the screen.

OBDiag performs root-cause failure analysis on the referenced devices by testing internal registers, confirming subsystem integrity, and verifying device functionality. To run OBDiag:

1. At the ok prompt, enter obdiag.

This displays the OBDiag menu as shown in CODE EXAMPLE 4-6.

2. At the OBDiag menu prompt, enter a number from the menu (such as 17 to toggle script-debug messages).

CODE EXAMPLE 4-6 OBDiag Menu
0 .... PCI/Cheerio 
1 .... EBUS DMA/TCR Registers 
2 .... Ethernet 
3 .... Ethernet2 <Inactive> 
4 .... Parallel Port 
5 .... Serial Port C (on optional I/O board) <Inactive> 
6 .... Serial Port D (on optional I/O board) <Inactive> 
7 .... NVRAM 
8 .... Floppy 
9 .... Serial port A 
10 ... Serial port B 
11 ... RAS 
12 ... User Flash1 
13 ... User Flash2 
14 ... All Above 
15 ... Quit 
16 ... Display this Menu 
17 ... Toggle Script-debug 
18 ... Enable External Loopback Tests 
19 ... Disable External Loopback Tests 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>




Caution - Prior to running obdiag, do not run any other OBP command that may change the hardware state of the board. After obdiag tests are run, always reset the system to bring it to a known state.



You can type the relevant numbers at this point to run all or some of the tests. If an error is detected, an error message is displayed on the screen. For example, if an error is detected while testing the floppy disk drive, a message similar to the following is displayed:

TEST= floppy_test  
STATUS= FAILED  
SUBTEST= floppy_id0_read_test  
ERRORS= 1   
TTF= 66   
SPEED= 440 MHz  
PASSES= 1   
MESSAGE= Error: Recalibrate failed. floppy missing, improperly connected, or defective. 
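
An OBDiag failure message like the one above is a list of KEY= value lines, so a captured copy is straightforward to load into a dictionary for logging or reporting. The following Python fragment is a minimal sketch that assumes each field sits on its own line with an uppercase key, as in the example. The function name parse_obdiag_record is hypothetical.

```python
def parse_obdiag_record(text: str) -> dict[str, str]:
    """Parse the 'KEY= value' lines of one OBDiag error message into a dict."""
    record = {}
    for line in text.splitlines():
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and key.isupper():
            record[key] = value.strip()
    return record


if __name__ == "__main__":
    sample = "TEST= floppy_test\nSTATUS= FAILED\nSUBTEST= floppy_id0_read_test\nERRORS= 1\n"
    record = parse_obdiag_record(sample)
    if record.get("STATUS") == "FAILED":
        print(record["TEST"], "failed in", record["SUBTEST"])
```

All values are kept as strings. Convert fields such as ERRORS or PASSES to integers only where you need to compare them.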

Some of the individual items on the OBDiag menu are described in further detail in the following paragraphs.

4.7.8.1 PCI/PCIO

The PCI/PCIO diagnostic performs the subtests listed in CODE EXAMPLE 4-7, which identifies the PCI/PCIO output message.

CODE EXAMPLE 4-7 PCI/PCIO Output Message
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===> 0 
 
TEST= all_pci/PCIO_test  
SUBTEST= vendor_id_test  
SUBTEST= device_id_test  
SUBTEST= mixmode_read  
SUBTEST= e2_class_test  
SUBTEST= status_reg_walk1  
SUBTEST= line_size_walk1  
SUBTEST= latency_walk1  
SUBTEST= line_walk1  
SUBTEST= pin_test  
 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>

4.7.8.2 EBus DMA/TCR Registers

The EBus DMA/TCR registers diagnostic performs the subtests listed in CODE EXAMPLE 4-8, which identifies the EBus DMA/TCR registers output message.

CODE EXAMPLE 4-8 EBus DMA/TCR Registers Output Message
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===> 1 
 
TEST= all_dma/ebus_test  
SUBTEST= dma_reg_test  
SUBTEST= dma_func_test  
 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>

4.7.8.3 Ethernet

The Ethernet diagnostic performs the subtests listed in CODE EXAMPLE 4-9, which identifies the Ethernet output message.

CODE EXAMPLE 4-9 Ethernet Output Message
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===> 2 
 
TEST= ethernet_test  
SUBTEST= my_channel_reset  
SUBTEST= hme_reg_test  
SUBTEST= global_reg1_test  
SUBTEST= global_reg2_test  
SUBTEST= bmac_xif_reg_test  
SUBTEST= bmac_tx_reg_test  
SUBTEST= mif_reg_test  
Test only supported for National Phy DP83840A 
SUBTEST= 10mb_xcvr_loopback_test  
selecting internal transceiver 
Test only supported for National Phy DP83840A 
SUBTEST= 100mb_phy_loopback_test  
selecting internal transceiver 
Test only supported for National Phy DP83840A 
SUBTEST= 100mb_twister_loopback_test  
selecting internal transceiver 
Test only supported for National Phy DP83840A 
 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>

4.7.8.4 Parallel Port

The parallel port diagnostic performs the dma_read subtest, which enables ECP mode, ECP DMA configuration, and FIFO test mode. It transfers 16 bytes of data from memory to the parallel port device and then verifies that the data is in TFIFO. CODE EXAMPLE 4-10 identifies the parallel port output message.

CODE EXAMPLE 4-10 Parallel Port Output Message
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===> 4 
 
TEST= parallel_port_test  
SUBTEST= dma_read  
 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>

4.7.8.5 Serial Port A

The serial port A diagnostic invokes the uart_loopback test. This test transmits and receives 128 characters and checks the transaction validity. CODE EXAMPLE 4-11 identifies the serial port A output message.

CODE EXAMPLE 4-11 Serial Port A Output Message
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===> 9 
 
TEST= uarta_test  
 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>



Note - The serial port A diagnostic will stall if the TIP line is installed on serial port A. CODE EXAMPLE 4-12 identifies the serial port A output message when the TIP line is installed on serial port A.



CODE EXAMPLE 4-12 Serial Port A Output Message with TIP Line Installed
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===> 9 
 
TEST= uarta_test   
UART A in use as console - Test not run.  
 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>

4.7.8.6 Serial Port B

The serial port B diagnostic is identical to the serial port A diagnostic. CODE EXAMPLE 4-13 identifies the serial port B output message.



Note - The serial port B diagnostic will stall if the TIP line is installed on serial port B.



CODE EXAMPLE 4-13 Serial Port B Output Message
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===> 10 
 
TEST= uartb_test  
 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>

4.7.8.7 NVRAM

The NVRAM diagnostic verifies the NVRAM operation by performing a write and read to the NVRAM. CODE EXAMPLE 4-14 identifies the NVRAM output message.

CODE EXAMPLE 4-14 NVRAM Output Message
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===> 7 
 
TEST= nvram_test  
SUBTEST= write/read_patterns  
SUBTEST= write/read_inverted_patterns  
 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>

4.7.8.8 All Above

The All Above diagnostic validates the system unit. CODE EXAMPLE 4-15 shows an example of the All Above option output message.

CODE EXAMPLE 4-15 All Above Output Message
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===> 14 
 
TEST= all_pci/cheerio_test  
SUBTEST= vendor_id_test  
SUBTEST= device_id_test  
... 
SUBTEST= bmac_xif_reg_test  
SUBTEST= bmac_tx_reg_test  
SUBTEST= mif_reg_test  
SUBTEST= mac_internal_loopback_test  
selecting internal transceiver 
Test only supported for National Phy DP83840A 
... 
SUBTEST= 100mb_twister_loopback_test  
selecting internal transceiver 
Test only supported for National Phy DP83840A 
TEST= ethernet2_test  
TEST= parallel_port_test  
SUBTEST= dma_read  
TEST= uarta_test  
... 
SUBTEST= write/read_patterns  
...  
ttya in use as console - Test not run.  
TEST= usi_test   
ttyb in use as console - Test not run.  
TEST= ras_test  env-monitor = disabled 
SUBTEST= obd-init-i2c-test  
... 
TEST= flash_test  
SUBTEST= flash-supported?  
TEST= flash_test  
SUBTEST= flash-supported?  
 
Enter (0-14 tests, 15 -Quit, 16 -Menu) ===>
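
Because the All Above option produces a long log, it can be useful to group each SUBTEST line under the TEST line that precedes it. The following Python fragment is a minimal sketch that assumes the TEST= and SUBTEST= layout shown in the code examples in this section. The function name group_subtests is hypothetical.

```python
def group_subtests(log: str) -> list[tuple[str, list[str]]]:
    """Group SUBTEST names under the preceding TEST name, in log order."""
    results: list[tuple[str, list[str]]] = []
    for line in log.splitlines():
        key, sep, value = line.partition("=")
        key = key.strip()
        fields = value.split()
        if not sep or not fields:
            continue
        # Use only the first word of the value: a TEST line can carry extra
        # text, such as "env-monitor = disabled" after ras_test.
        name = fields[0]
        if key == "TEST":
            results.append((name, []))
        elif key == "SUBTEST" and results:
            results[-1][1].append(name)
    return results
```

The result is a list rather than a dictionary because the same test name can appear more than once in a single log, as flash_test does in CODE EXAMPLE 4-15.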