CHAPTER 11

Power-On Self-Test

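This chapter describes the Sun Ultra 45 and Ultra 25 workstation power-on self-test (POST). Topics covered are:

  • POST Overview
  • post Command
  • POST Output
  • Analyzing POST Messages
  • Setting Up for POST
  • Disabling Diagnostics and Auto Boot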


11.1 POST Overview

Power-on self-test (POST) performs tests on workstation core components such as CPU and memory. POST checks low-level interaction between the CPU, caches, memory, JBus, and the I/O bridge chip.

Typing the post command at the ok prompt initiates tests that check the CPU, I/O bridge chip, and memory modules. The output of the post command is directed to the serial port of the system under test. An external display device, such as a serial terminal or a second system running a Tip connection, is required to view this output. Both options are described in Configuring an External Display Device.

11.1.1 Configuring POST Output

The post command uses two variables to determine its output. The command syntax is:


ok post level verbosity

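where level is the diagnostic level (min or max) and verbosity is the output verbosity (min, normal, or max).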

TABLE 11-1 describes the diagnostic levels.


TABLE 11-1 POST Diagnostic Levels

POST Diagnostic Level    Tests Performed
---------------------    ---------------
min                      Testing of CPU, cache, some memory, and I/O bridge chip.
max                      Same tests as min, with additional extensive memory testing.


TABLE 11-2 describes the output verbosity.


TABLE 11-2 POST Output Verbosity

POST Output Verbosity    Output
---------------------    ------
min                      Only Executing Power On Self Test is displayed. (The system executes the test but there is no other output on the display for several minutes.)
normal                   POST banner and major test groups are indicated.
max                      Each step of POST is identified.



11.2 post Command

The post command enables you to override NVRAM settings and execute POST on demand with different diagnostic levels and output verbosity. For example:


ok post level verbosity

where level is min or max, and verbosity is min, normal, or max.

If no diagnostic level or output verbosity is provided, the post command uses the NVRAM settings for diag-level and verbosity. See Changing NVRAM Configuration Parameter Values for more information about these parameters.
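
The argument rules above can be sketched in Python as a small helper that builds the text to type at the ok prompt. This is an illustration only: the function name build_post_command is hypothetical, the accepted values are taken from TABLE 11-1 and TABLE 11-2, and the restriction to "both arguments or neither" is an assumption of the sketch, not a documented rule of the post command.

```python
from typing import Optional

LEVELS = ("min", "max")                  # diagnostic levels, TABLE 11-1
VERBOSITIES = ("min", "normal", "max")   # output verbosities, TABLE 11-2


def build_post_command(level: Optional[str] = None,
                       verbosity: Optional[str] = None) -> str:
    """Return the command text for the ok prompt (hypothetical helper)."""
    if level is None and verbosity is None:
        # With no arguments, POST uses the NVRAM diag-level and verbosity.
        return "post"
    if level is None or verbosity is None:
        # Assumption of this sketch: handle only both-or-neither.
        raise ValueError("give both level and verbosity, or neither")
    if level not in LEVELS:
        raise ValueError(f"level must be one of {LEVELS}")
    if verbosity not in VERBOSITIES:
        raise ValueError(f"verbosity must be one of {VERBOSITIES}")
    return f"post {level} {verbosity}"
```

For example, build_post_command("max", "min") returns the string post max min, which matches the command used in the post max min example in POST Output.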

11.2.1 Diagnostic Levels

You can set the POST diagnostic level to min or max.

TABLE 11-3 summarizes the tests performed at min and max diagnostic levels.


TABLE 11-3 Tests Performed at min and max Diagnostic Levels

min Level

  • Initializes critical CPU resources
  • CPU tests
  • I2C devices read
  • CPU memory
  • CPU DIMMs interconnect checks
  • Internal cache tests
  • CPU memory scrub
  • I/O bridge chip tests

max Level

Same as min level, but with extended memory tests.


11.2.2 Output Verbosity

TABLE 11-4 describes the output seen when output verbosity is set to min, normal, and max.


TABLE 11-4 Output Seen at min, normal, and max Output Verbosity

min Verbosity

Only the following text is displayed: Executing Power On Self Test

normal Verbosity

  • POST banner is displayed.
  • Major test groups are indicated.

max Verbosity

Each step of POST is identified.


Samples of POST output at different diagnostic levels and output verbosities are provided in POST Output.


11.3 POST Output

The contents of the POST output depend on the values of the diagnostic level and output verbosity. For the examples in this section, the Sun Ultra 45 or Ultra 25 workstation was configured with a single CPU and two 1-GByte DIMMs in slots 0 and 1.



Note - The 0> that precedes the output text is the CPU identifier and indicates that the output is from POST. The 1> indicates CPU 1. If you do not see these characters, the output is from the OpenBoot PROM.
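
When POST output is captured to a file, the prefix rule in the preceding note can be applied mechanically. The following Python sketch is illustrative only. The function name split_post_line is hypothetical, and the sketch assumes that a POST line always begins with a decimal CPU number followed immediately by the > character, as in the examples in this chapter.

```python
import re
from typing import Optional, Tuple

# A POST line starts with the CPU identifier and ">", for example "0>".
_POST_LINE = re.compile(r"^(\d+)>(.*)$")


def split_post_line(line: str) -> Tuple[Optional[int], str]:
    """Return (cpu_id, text). cpu_id is None when the line has no POST prefix,
    which the note above attributes to the OpenBoot PROM."""
    match = _POST_LINE.match(line)
    if match is None:
        return None, line
    return int(match.group(1)), match.group(2)
```

A line such as 0>Start Selftest..... yields CPU 0 and the text Start Selftest....., while a line such as Clearing TLBs yields None for the CPU and is treated as OpenBoot PROM output.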



11.3.1 post min normal

The following is the output of POST with min diagnostic level and normal output verbosity. The duration of POST was 90 seconds. The left column of the table is the output. The right column describes what is happening. If the POST output from your system does not match the output in the left column, use the information in the right column to help diagnose the problem.


TABLE 11-5 post min normal Output Comparison

Output Displayed

What Is Happening

ok post min normal

User initiates POST from the OpenBoot PROM ok prompt.

reset reason: 0000.0000.0000.0001
Fire TLU-A OE Error status: 0003.0100.0000.0100
@(#)OBP 4.21.x 2005/09/28 16:12 Sun Ultra 45 or Ultra 25 workstation
Clearing TLBs 
Executing Power On Self Test
Q0>

OpenBoot PROM prepares to run POST.

0>@(#) Sun Ultra 45 POST 4.21.x 2005/10/13 16:57

POST build version and date is displayed.

 /dat/fw/common-source/firmware_re/post/post-build-4.21.x/Ultra/Ultra45/integrated  (firmware_re)
 

POST build path is displayed.

0>Copyright © 2005 Sun Microsystems, Inc. All rights reserved
  SUN PROPRIETARY/CONFIDENTIAL.
  Use is subject to license terms.

Copyright and license are displayed.

0>OBP->POST Call with %o0=00000800.01012000.
0>Diag level set to MIN.
0>Verbosity level set to NORMAL.

CPU0 is acknowledged and POST configuration is identified.

0>Start Selftest.....

Testing is started.

0>CPUs present in system: 0
0>Test CPU(s)....Done

CPU is identified and tested.

0>Interrupt Crosscall....Done

Interrupt handlers are set up and checked.

0>Init Memory....Done
0>Test Memory....Done

Memory is initialized, phase-locked loops (PLL) are reset, and memory is re-initialized and tested.

0>IO-Bridge Tests....Done

I/O bridge is tested.

0>INFO:
0>      POST Passed all devices.
0>
0>POST: Return to OBP.

POST has passed successfully and returns control to the OpenBoot PROM.


11.3.2 post max max

The following section contains the output of POST with max diagnostic level and max output verbosity. The duration of POST was four minutes, 30 seconds. The left column is the output. The right column describes what is happening. If the POST output from your system does not match the output in the left column, use the information in the right column to help diagnose the problem.

Error messages are reported when they are found. Examples of POST messages are shown in Analyzing POST Messages.


TABLE 11-6 post max max Output Comparison

Output Displayed

What Is Happening

ok post max max

User initiates POST from OpenBoot PROM ok prompt.

reset reason: 0000.0000.0000.0001
Fire TLU-A OE Error status: 0003.0100.0000.0100
@(#)OBP 4.21.x 2005/09/28 16:12 Sun Ultra 45
Clearing TLBs 
Executing Power On Self Test
Q0>

OpenBoot PROM prepares to run POST.

0>@(#) Sun Ultra 45 POST 4.21.x 2005/11/05 19:58

POST build version and date is displayed.

 /dat/fw/common-source/firmware_re/post/post-build-4.21.0/Ultra/Ultra45/integrated  (firmware_re)

POST build path is displayed.

0>Copyright © 2005 Sun Microsystems, Inc. All rights reserved
  SUN PROPRIETARY/CONFIDENTIAL.
  Use is subject to license terms.

Copyright and license are displayed.

0>Soft Power-on RST thru SW 
0>OBP->POST Call with %o0=00001000.01014000.
0>Diag level set to MAX.
0>Verbosity level set to MAX.
0>MFG scrpt mode set NORM 
0>I/O port set to TTYA.

CPU0 is acknowledged and POST configuration is read from register.

0>Start Selftest.....
0>CPUs present in system: 0
0>Test CPU(s).....
0>Initialize I2C Controller
0>Init CPU
0>DMMU
0>DMMU TLB DATA RAM Access
0>DMMU TLB TAGS Access
0>IMMU Registers Access
0>IMMU TLB DATA RAM Access
0>IMMU TLB TAGS Access
0>Init mmu regs

CPU, I2C controller, data memory management unit (DMMU), and instruction memory management unit (IMMU) are initialized.

0>Setup L2 Cache
0>L2 Cache Control = 00000000.00f04400 
0>      Size = 00000000.00100000...
0>L2 Cache Tags Test
0>Scrub and Setup L2 Cache

L2 cache is set up and scrubbed (data values set to defaults).

0>Setup and Enable DMMU
0>Setup DMMU Miss Handler

DMMU is set up.

0>Test  Mailbox
0>Scrub Mailbox

Mailbox region is checked and initialized in L2 cache.

0>CPU Tick and Tick Compare Registers Test

Operation of TICK registers is verified.

0>CPU Stick and Stick Compare Registers Test

Operation of STICK registers is verified.

0>Set Timing

Motherboard timing is to be configured.

0> UltraSPARC[TM] IIIi, Version 3.4

CPU version is identified.

0>Interrupt Crosscall.....
0>Setup Int Handlers

Interrupt handlers are set up.

0>MB:   Part-Dash-Rev#:  3753279-02-0C  Serial#:  000225

Motherboard part number and serial number is read from FRU ID.

0>CPU0 DIMM 0: 
0>Part#:  18VDDF12872Y-335D3  Serial#:  71fe1ec9  Date Code:  0506  Rev#:  0300
0>CPU0 DIMM 1: 
0>Part#:  18VDDF12872Y-335D3  Serial#:  71fe1e32  Date Code:  0506  Rev#:  0300

DIMM part numbers, serial numbers, date codes, and revisions are read from the DIMM's internal firmware.

0>Set CPU/System Speed
0>MCR Timing index = 00000000.00000007 
0>..

Jumpers for CPU and JBus frequency are read.

0>Init Memory.....

Memory is initialized.

0>Probe Dimms

Presence of DIMMs is checked.

0>Init Mem Controller Regs

Memory controller registers are initialized.

0>Set JBUS config reg

JBus configuration register is set.

0>IO-Bridge unit 1 init test             
0>Clear TLU loopback for PCI-E

I/O bridge chip is initialized.

0>Do PLL reset
0>Setting timing to 8:1 12:1, system frequency 200 MHz, CPU frequency 1600 MHz

Phase-locked loop (PLL) is reset for the selected frequencies.

ø0>Soft Power-on RST thru SW

Soft reset.

0>PLL Reset.....
0>Initialize I2C Controller
0>Init CPU
0>Init mmu regs
0>Setup L2 Cache
0>L2 Cache Control = 00000000.00f04400 
0>      Size = 00000000.00100000...
0>Setup and Enable DMMU
0>Setup DMMU Miss Handler
0>Scrub Mailbox

Initializations and setups are repeated.

0>Timing is 8:1 12:1, sys 200 MHz, CPU 1600 MHz, mem 133 MHz.

New timing ratios and frequencies are displayed.

0> UltraSPARC[TM] IIIi, Version 3.4
0>Init Memory.....
0>Probe Dimms
0>Init Mem Controller Sequence
0>Clear TLU loopback for PCI-E

Repeated initialization continues.

0>Test Memory.....
0>Select Bank Config
0>Probe and Setup Memory
0>INFO: 2048MB Bank 0, Dimm Type X4 
0>INFO: No memory detected in Bank 1
0>INFO: No memory detected in Bank 2
0>INFO: No memory detected in Bank 3
0>
0>Test Memory.....
0>Select Bank Config
0>Probe and Setup Memory
0>INFO: 2048MB Bank 0, Dimm Type X4 
0>INFO: No memory detected in Bank 1
0>INFO: No memory detected in Bank 2
0>INFO: No memory detected in Bank 3
0>

Memory is probed.

0>Data Bitwalk on Master

CPU data pins are tested.

0> Test Bank 0.

Where found, memory is tested.

0>Address Bitwalk on Master
0>Addr walk mem test on CPU 0 Bank 0: 00000000.00000000 to 00000000.80000000.

CPU address pins are tested.

0>Set Mailbox

Mailbox region is set in memory.

0>Final mc1 is 1000000a.1e581c61

Memory control register1 is set.

0>Setup Final DMMU Entries

Memory is allocated for POST.

0>Post Image Region Scrub

Allocated memory is scrubbed clean.

0>Run POST from Memory

POST is transferred from ROM to RAM memory. POST is executed from memory from this point forward.

0>Verifying checksum on copied image.
0>The Memory's CHECKSUM value is f482.
0>The Memory's Content Size value is 8c57a.
0>Success...  Checksum on Memory Validated.

Copied data is verified.

0>Test CPU Caches.....

CPU internal caches are tested.

0>I-Cache RAM Test
0>I-Cache Tag RAM
0>I-Cache Valid/Predict TAGS Test
0>I-Cache Snoop Tag Field
0>I-Cache Branch Predict Array Test

Instruction cache is tested.

0>Branch Prediction Initialization
0>D-Cache RAM
0>D-Cache Tags
0>D-Cache Micro Tags
0>D-Cache SnoopTags Test
0>W-Cache RAM
0>W-Cache Tags
0>W-Cache Valid bit Test
0>W-Cache Bank valid bit Test
0>W-Cache SnoopTAGS Test

Data and write caches are tested.

0>P-Cache RAM
0>P-Cache Tags
0>P-Cache SnoopTags Test
0>P-Cache Status Data Test

Prefetch cache is tested.

0>8k DMMU TLB 0 Data
0>8k DMMU TLB 1 Data
0>8k DMMU TLB 0 Tags
0>8k DMMU TLB 1 Tags
0>8k IMMU TLB Data
0>8k IMMU TLB Tags

Translation look-aside buffers (TLB) are tested for data and instruction buffers.

0>FPU Registers and Data Path
0>FPU Move Registers

Floating point unit (FPU) is checked.

0>FSR Read/Write

FPU status register is checked.

0>FPU Block Register Test
0>FPU Branch Instructions
0>FPU Functional Test

Additional FPU testing is performed.

0>Scrub Memory

Memory is set to zero.

0>Flush Caches

Caches are set to zero.

0>Functional CPU Tests.....
0>L2-Cache Functional
0>L2-Cache Stress
0>IMMU Functional
0>DMMU Functional
0>I-Cache Functional
0>I-Cache Parity Functional
0>I-Cache Parity Tag
0>I-Cache Snoop Parity Tag
0>D-Cache Functional
0>D-Cache Parity Functional
0>D-Cache Parity Tag Test
0>W-Cache Functional
0>Graphics Functional
0>CPU Superscalar Dispatch
0>SPARC Atomic Instruction Test
0>Non SPARC Atomic Instruction Test
0>SOFTINT Register and Interrupt Test
0>Branch Memory Test
0>Fast ECC test
0>System ECC test

CPU functional checks are executed.

0>XBus SRAM

On-board SRAM is checked.

0>IO-Bridge Quick Read             
0>

 

0>--------------------------------------------------------------
0>--------- IO-Bridge Quick Read Only of CSR and ID ----------
0>--------------------------------------------------------------
0>fire 1 JBUSID  00000400.0f000000 =       
0>                                       fc000002.f03dda23
0>--------------------------------------------------------------
0>fire 1 JBUSCSR 00000400.0f410000 =       
0>                                       00000ff5.13cb6000
0>--------------------------------------------------------------

 

0>IO-Bridge unit 1 jbus perf test 
 
0>IO-Bridge unit 1 int init test 
0>IO-Bridge unit 1 msi init test 
0>IO-Bridge unit 1 ilu init test 
0>IO-Bridge unit 1 tlu init test 
0>IO-Bridge unit 1 lpu init test 
0>IO-Bridge unit 1 link train port A   
0>IO-Bridge unit 1 link train port B
0>IO-Bridge unit 1 interrupt test 

I/O bridge is checked and PCI-E links are trained.

0>IO-Bridge unit 1 Config MB bridges
0>Config port A, bus 2 dev 0 func 0, tag IOBD/PCI-SWITCH
0>Config port A, bus 3 dev 1 func 0, tag IOBD/PCIE-IO
0>Config port A, bus 4 dev 0 func 0, tag IOBD/PCIE-IO-DEVICES
0>Config port A, bus 3 dev 2 func 0, tag IOBD/GBE
0>Config port A, bus 3 dev 3 func 0, tag IOBD/PCIE2
0>Config port A, bus 3 dev 8 func 0, tag IOBD/PCIE1
0>Config port A, bus 3 dev 9 func 0, tag IOBD/PCI-BRIDGE
0>Config port A, bus 9 dev 0 func 0, tag IOBD/PCI-BRIDGE PORT0-SAS
0>Config port A, bus 9 dev 0 func 2, tag IOBD/PCI-BRIDGE PORT1-slot0
0>

On-board PCI bridges and switches are configured.

0>IO-Bridge unit 1 PCI id test 
0>      INFO:10 count read passed for IOBD/PCI-SWITCH! Last read VID:10b5|DID:8532
0>      INFO:10 count read passed for IOBD/PCIE-IO! Last read VID:10b9|DID:5249
0>      INFO:10 count read passed for IOBD/GBE! Last read VID:1166|DID:103
0>      INFO:10 count read passed for IOBD/PCI-BRIDGE! Last read VID:8086|DID:341
0>      INFO:10 count read passed for IOBD/SASHBA! Last read VID:1000|DID:50

The PCI IDs of the on-board devices are checked.

0>Print Mem Config

Memory configuration is to be displayed.

0>Caches : Icache is ON, Dcache is ON, Wcache is ON, Pcache is ON.

Cache status is displayed.

0>Memory interleave set to 0
0> Bank 0 2048MB : 00000000.00000000 -> 00000000.7fffffff.

The amount of memory installed is displayed.

0>Block Memory

Memory is checked by block memory tests.

0>Test 2141192192 bytes on bank 0....
0>0% Done...
0>2% Done...
...
0>98% Done...
0>99% Done...

Memory is checked in bank0.

0>INFO:
0>      POST Passed all devices.
0>
0>POST: Return to OBP.

POST has passed successfully and returns control to the OpenBoot PROM.



11.3.3 post min min

The following is the output of POST with min diagnostic level and min output verbosity. The duration of POST was 90 seconds.


ok post min min
 
Executing Power On Self Test
 
Configuring system memory & CPU(s)
 
Probing system devices
Probing memory
Probing I/O buses
 
Sun Ultra 45, Keyboard present
Copyright 1998-2005 Sun Microsystems, Inc.  All rights reserved.
OpenBoot 4.21.x, 1024 MB memory installed, Serial #53463596.
Ethernet address 0:3:ba:2f:ca:2c, Host ID: 832fca2c.

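POST ran the tests but displayed nothing of its own after the Executing Power On Self Test line. The remaining lines lack the 0> prefix and come from the OpenBoot PROM. Error messages, if any, are reported as they are found. Examples of POST messages are shown in Analyzing POST Messages.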

11.3.4 post max min

The following is the output of POST with max diagnostic level and min output verbosity. The duration of POST was 120 seconds.


ok post max min
 
Executing Power On Self Test

No output other than error messages is displayed. Examples of POST messages are shown in Analyzing POST Messages.


11.4 Analyzing POST Messages

POST has three categories of messages: error messages, warning messages, and info messages.

11.4.1 Error Messages

When an error occurs during POST, an error message is displayed. The error message is bounded by the text ERROR and END_ERROR. Several error messages might be displayed at different times during the POST process for any single error condition.

The following error examples were caused by a defective 1-GByte DIMM in the slot labeled DIMM0. The first error message occurred when the DIMMs were probed:


0>ERROR: TEST = Probe and Setup Memory
0>H/W under test = CPU0 Memory
0>Repair Instructions: Replace items in order listed by 'H/W under test' above
0>MSG = ERROR:  miscompare on mem test!
                Address: 00000000.00000000
                Expected: a5a5a5a5.a5a5a5a5
                Observed: a5a6a5a5.a5a5a5a5
0>END_ERROR

At address 00000000.00000000, there was a test pattern mismatch. A string of a5a6a5a5 was observed when a string of a5a5a5a5 was expected.

The second error message identified where the fault was located:


0>ERROR: TEST = Probe and Setup Memory
0>H/W under test = CPU0: Bank 0  DIMM0, Motherboard
0>Repair Instructions: Replace items in order listed by 'H/W under test' above
0>MSG = Pin 72 failed on CPU0: Bank 0  DIMM0, Motherboard
0>END_ERROR

The DIMM in slot DIMM0 was at fault. Several other error messages were displayed, and a summary was provided:


0>ERROR:
0>      POST top level status has the following failures:
0>              CPU0: Bank 0  DIMM0, Motherboard
0>              CPU0: Bank 1  DIMM0, Motherboard
0>END_ERROR

The DIMM in slot DIMM0 should be replaced. Because memory works in pairs, POST disables both slots DIMM0 and DIMM1. POST returns system status and control to the OpenBoot PROM, which then displays messages regarding the results of POST.

For example:


Power On Selftest Failed.
   CPU: 0 cause: CPU0: Bank 0  DIMM0, Motherboard
ERROR: CPU0 has 2048/4096MB of memory disabled
 
ERROR: POST failed

Because of the error, two DIMM slots (bank 0) have been disabled, so only half of the original memory (2048 of 4096 MBytes) is available for use.



Note - If only two DIMMs were installed and this set of errors occurred, the system would have beeped three times and powered off.
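
Because each POST error message is bounded by ERROR and END_ERROR, a captured log can be split into individual messages with a short script. The following Python sketch is one possible approach, not a Sun-supplied tool. The names parse_post_blocks and _fields are hypothetical, the kind parameter is an assumption that lets the same code handle other messages bounded in the same way, and the sketch assumes that the opening and closing lines always carry the CPU prefix shown in the examples above.

```python
import re
from typing import Dict, List, Optional


def _fields(body: List[str]) -> Dict[str, str]:
    """Pick out the 'KEY = value' lines seen in the examples above."""
    fields = {"raw": "\n".join(body)}
    for text in body:
        for key in ("TEST", "H/W under test", "MSG"):
            prefix = key + " = "
            if text.startswith(prefix):
                fields[key] = text[len(prefix):].strip()
    return fields


def parse_post_blocks(lines: List[str], kind: str = "ERROR") -> List[Dict[str, str]]:
    """Collect messages bounded by '<cpu>><kind>:' and '<cpu>>END_<kind>'."""
    start = re.compile(rf"^\d+>{kind}:\s*(.*)$")
    end = re.compile(rf"^\d+>END_{kind}\s*$")
    blocks: List[Dict[str, str]] = []
    current: Optional[List[str]] = None
    for raw in lines:
        line = raw.rstrip("\n")
        if current is None:
            match = start.match(line)
            if match:
                current = [match.group(1)]   # text after "ERROR:"
            continue
        if end.match(line):
            blocks.append(_fields(current))
            current = None
        else:
            # Drop the CPU prefix from lines inside the message.
            current.append(re.sub(r"^\d+>", "", line))
    return blocks
```

Lines that the OpenBoot PROM prints without a CPU prefix, such as ERROR: POST failed, are deliberately ignored, because the sketch starts a message only on a prefixed line.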



11.4.2 Warning Messages

Warning messages have a structure similar to error messages, but they are bounded by the text WARNING and END_WARNING. Warning messages do not contain a Repair Instructions line.

The following warning message example indicates that there is a DIMM size mismatch in slots DIMM0 and DIMM1:


0>WARNING: TEST = Probe and Setup Memory
0>H/W under test = CPU0 Memory
0>MSG = DIMM size does not match for Dimm set 0, Dimm0=00000000.40000000, Dimm1=00000000.20000000
0>END_WARNING

DIMM0 is a 1-GByte DIMM and DIMM1 is a 512-MByte DIMM.
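
The sizes in the MSG line are hexadecimal byte counts, and the conversion can be checked with a few lines of Python. This is a minimal sketch under one assumption: that the dotted notation represents the upper and lower 32 bits of a 64-bit value. That reading is consistent with the DIMM sizes stated above but is not confirmed elsewhere in this chapter. The function name post_hex_to_bytes is hypothetical.

```python
def post_hex_to_bytes(value: str) -> int:
    """Convert a value such as '00000000.40000000' to an integer.

    Assumption: the text before the dot is the upper 32 bits and the
    text after the dot is the lower 32 bits of one 64-bit number.
    """
    high, low = value.split(".")
    return (int(high, 16) << 32) | int(low, 16)


MBYTE = 1024 * 1024

dimm0_mb = post_hex_to_bytes("00000000.40000000") // MBYTE   # 1024 (1 GByte)
dimm1_mb = post_hex_to_bytes("00000000.20000000") // MBYTE   # 512
```

The two results, 1024 and 512 MBytes, differ, which is the mismatch the warning reports.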

11.4.3 Info Messages

Info messages are simple and are preceded only by the text INFO. Info messages provide noncritical facts, as seen in this example:


0>Probe and Setup Memory
0>INFO: 1024MB Bank 0, Dimm Type X4 
0>INFO: 1024MB Bank 1, Dimm Type X4 
0>INFO: 1024MB Bank 2, Dimm Type X4 
0>INFO: 1024MB Bank 3, Dimm Type X4 

These info messages indicate that a 1-GByte DIMM is installed into each DIMM connector.
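
The per-bank sizes in these info messages can be collected from a captured log to confirm how much memory POST detected. The following Python sketch is illustrative. The function name bank_sizes_mb is hypothetical, and the pattern assumes that the line format is exactly the one shown in the example above.

```python
import re
from typing import Dict, Iterable

# Matches lines such as "0>INFO: 1024MB Bank 0, Dimm Type X4".
_BANK = re.compile(r"INFO:\s*(\d+)MB Bank (\d+)")


def bank_sizes_mb(lines: Iterable[str]) -> Dict[int, int]:
    """Map bank number to the size in MBytes reported by POST."""
    banks: Dict[int, int] = {}
    for line in lines:
        match = _BANK.search(line)
        if match:
            banks[int(match.group(2))] = int(match.group(1))
    return banks
```

Summing the values of the returned dictionary gives the total memory reported. Lines such as INFO: No memory detected in Bank 1 do not match the pattern and are skipped.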


11.5 Setting Up for POST

To execute POST and view its output, you must perform the procedures in the following sections.

11.5.1 Verifying the Baud Rate

Ensure that the communication parameters are correct. Use one of the following procedures:

11.5.1.1 OpenBoot PROM Level Procedure

  • From the ok prompt of the system to run POST, type:


ok setenv ttya-mode 9600,8,n,1,-

11.5.1.2 Solaris OS Level Procedure

  • As superuser in a terminal window of the system to run POST, type:


# eeprom ttya-mode=9600,8,n,1,-
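
The ttya-mode value packs five comma-separated fields. Judging by the serial terminal parameters listed later in this chapter (9600 baud, 8 data bits, no parity, 1 stop bit, no handshaking), the fields are read here as baud rate, data bits, parity, stop bits, and handshake, in that order. The following Python sketch splits the value into named fields so that a setting read back with eeprom can be compared against what the terminal expects. The names TtyMode and parse_tty_mode are hypothetical.

```python
from typing import NamedTuple


class TtyMode(NamedTuple):
    baud: int
    data_bits: int
    parity: str      # "n" in this chapter's setting (no parity)
    stop_bits: str   # kept as text; no assumption about other values
    handshake: str   # "-" in this chapter's setting (no handshaking)


def parse_tty_mode(value: str) -> TtyMode:
    """Split a ttya-mode value such as '9600,8,n,1,-' into its fields."""
    parts = value.split(",")
    if len(parts) != 5:
        raise ValueError(f"expected 5 comma-separated fields, got {len(parts)}")
    baud, data_bits, parity, stop_bits, handshake = parts
    return TtyMode(int(baud), int(data_bits), parity, stop_bits, handshake)
```

For the value used in this chapter, parse_tty_mode("9600,8,n,1,-") gives a baud rate of 9600 and 8 data bits, with the parity, stop bits, and handshake fields left as the literal text n, 1, and -.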

11.5.2 Obtaining the ok Prompt

1. Save all work in progress and close any open applications.

2. As superuser in a terminal window of the system to run POST, type:


# init 0

11.5.3 Configuring an External Display Device

POST directs its output to serial port 1 (TTYA) of the system being tested. You can view this output by connecting a serial terminal or a second system running a Tip connection through a terminal window.

11.5.3.1 Configuring a Serial Terminal


You can view POST output through any VT-100 compatible RS-232 serial terminal. The terminal connects to the TTYA port of the Sun Ultra 45 or Ultra 25 workstation.

The serial ports are DB-9 F connectors. Use a straight-through cable and connect to the serial terminal's DCE port. Configure the serial terminal to the communication parameters listed in TABLE 11-7.


TABLE 11-7 Serial Terminal Communication Parameters

Parameter      Value
---------      -----
Baud           9600
Data bits      8
Parity         None
Stop bits      1
Handshaking    None
Duplex         Full


If a DCE port is not available, then use a crossover cable as illustrated in FIGURE 11-1.

11.5.3.2 Configuring a Second System

Instead of a serial terminal, you can use a second system running a Tip connection through a terminal window.

The second system must have a serial port capable of RS-232 communications. Use a crossover cable (null-modem cable) with the Tip connection.

FIGURE 11-1 shows the wiring for a crossover cable. If your system does not have a DB-9 F connector at the serial port, adapters are available from most computer supply stores or from your Sun Microsystems sales representative.

The following URL provides part numbers for adapters and other Sun cables. You must be a registered SunSolve user to access this URL.

http://sunsolve.sun.com/handbook_pub/Devices/Cables/cables_ext_data.html


FIGURE 11-1 Crossover Cable Wiring Diagram


11.5.3.3 Making a Tip Connection


Making a Tip connection requires configuring the serial port of the second system and using the tip command. The following procedure configures serial port 1 of the second system.

1. As superuser of the second system, edit the /etc/remote file.

2. Replace the hardwire property with the following:


hardwire:\
        :dv=/dev/term/a:br#9600:el=^C^S^Q^U^D:ie=%$:oe=^D:

3. Ensure that the communication parameters are correct. Type:


# eeprom ttya-mode=9600,8,n,1,-

11.5.3.4 Managing Tip Connections

Serial Ports contains the following topics:

11.5.4 Running POST

1. Attach the crossover cable to the system being tested and then to the serial terminal or second system.

2. Start the Tip connection. Type:


# tip hardwire

3. Press the Return key several times to synchronize the handshaking between the two systems.

You should see the ok prompt.

4. Type the post command.

For example:


ok post min max

POST is run. See POST Output for examples of POST output.



Note - POST execution can be aborted by pressing the Ctrl-X keys of the serial terminal or second system. POST then returns control to the OpenBoot PROM.




11.6 Disabling Diagnostics and Auto Boot

Use one of the following procedures to ensure that the diagnostics are turned off and that the system does not auto boot.

11.6.0.1 OpenBoot PROM Level Procedure

  • From the ok prompt of the system to run POST, type:


ok setenv diag-switch? false
ok setenv auto-boot? false

11.6.0.2 Solaris OS Level Procedure

  • As superuser in a terminal window of the system to run POST, type:


# eeprom diag-switch?=false
# eeprom auto-boot?=false