You can use POST for basic hardware verification and diagnosis, and for troubleshooting as described in the following sections.
POST tests critical hardware components to verify functionality before the system boots and accesses software. If POST detects an error, the faulty component is disabled automatically, preventing faulty hardware from potentially harming software.
You can use POST as an initial diagnostic tool for the system hardware. In this case, configure POST to run in maximum mode (diag_mode=service, setkeyswitch=diag, diag_level=max) for thorough test coverage and verbose output.
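As a rough illustration of the maximum-mode settings named above, the following Python sketch compares a snapshot of settings against those three values. The function name, the dictionary layout, and the sample snapshot are hypothetical; the sketch does not query a real service processor.

```python
# Maximum-mode POST settings, as named in the text above.
MAX_MODE = {
    "diag_mode": "service",
    "setkeyswitch": "diag",
    "diag_level": "max",
}


def missing_for_max_mode(current):
    """Return {name: (current_value, required_value)} for every setting
    that differs from the maximum-mode values."""
    return {
        name: (current.get(name), required)
        for name, required in MAX_MODE.items()
        if current.get(name) != required
    }


# Hypothetical snapshot of a server's settings, typed in by hand.
snapshot = {"diag_mode": "normal", "setkeyswitch": "diag", "diag_level": "max"}
print(missing_for_max_mode(snapshot))  # {'diag_mode': ('normal', 'service')}
```

An empty result means the snapshot already matches the maximum-mode configuration.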
This procedure describes how to run POST when you want maximum testing, as in the case when you are troubleshooting a server or verifying a hardware upgrade or repair.
ok #.
sc>
sc> setkeyswitch diag
There are several ways to initiate a reset. Initiating POST Using the powercycle Command shows the powercycle command. For other methods, refer to the Sun Netra T5440 Server Administration Guide.
sc> console
Initiating POST Using the powercycle Command depicts abridged POST output.
If no faults are detected, the system boots.
If POST detects a faulty device, the fault is displayed and the fault information is passed to ALOM CMT CLI for fault handling. Faulty FRUs are identified in fault messages using the FRU name.
POST error messages use the following syntax:
In this syntax, c = the core number, s = the strand number.
Warning and informational messages use the following syntax:
In POST Error Message, POST reports a memory error at FB-DIMM location /SYS/MB/CMP0/BR1/CH0/D0. The error was detected by POST running on core 7, strand 2.
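When POST output is captured to a file, the fields discussed above (core, strand, failing test, FRU) can be pulled out with a short script. The following Python sketch is one possible approach. The line layout is inferred from the sample error message only, not from a formal grammar, and the function name is hypothetical.

```python
import re

# Each POST line starts with "core:strand>"; this prefix pattern is inferred
# from the sample message, not taken from a formal specification.
PREFIX = re.compile(r"^(\d+):(\d+)>\s*(.*)$")

SAMPLE_ERROR = """\
7:2>
7:2>ERROR: TEST = Data Bitwalk
7:2>H/W under test = /SYS/MB/CMP0/BR1/CH0/D0
7:2>Repair Instructions: Replace items in order listed by 'H/W under test' above.
7:2>MSG = Pin 149 failed on /SYS/MB/CMP0/BR1/CH0/D0 (J2001)
7:2>END_ERROR
"""


def parse_post_error(text):
    """Return a dict for the first complete ERROR ... END_ERROR block, or None."""
    result = None
    for line in text.splitlines():
        match = PREFIX.match(line)
        if not match:
            continue
        core, strand, body = int(match.group(1)), int(match.group(2)), match.group(3)
        if body.startswith("ERROR: TEST ="):
            # Start of an error block: record where POST was running.
            result = {"core": core, "strand": strand,
                      "test": body.split("=", 1)[1].strip()}
        elif result is not None and body.startswith("H/W under test ="):
            result["fru"] = body.split("=", 1)[1].strip()
        elif result is not None and body.startswith("MSG ="):
            result["msg"] = body.split("=", 1)[1].strip()
        elif result is not None and body == "END_ERROR":
            return result
    return None


error = parse_post_error(SAMPLE_ERROR)
print(error["core"], error["strand"], error["fru"])  # 7 2 /SYS/MB/CMP0/BR1/CH0/D0
```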
The fault is captured by ALOM CMT CLI, where the fault is logged, the Service Required LED is lit, and the faulty component is disabled.
Refer to showfaults Output.
In this example, /SYS/MB/CMP0/BR1/CH0/D0 is disabled. The system can boot using memory that was not disabled until the faulty component is replaced.
sc> powercycle
Are you sure you want to powercycle the system (y/n)? y
Powering host off at Fri Jul 27 08:11:52 2007
Waiting for host to Power Off; hit any key to abort.
Audit | minor: admin : Set : object = /SYS/power_state : value = soft : success
Chassis | critical: Host has been powered off
Powering host on at Fri Jul 27 08:13:08 2007
Audit | minor: admin : Set : object = /SYS/power_state : value = on : success
Chassis | major: Host has been powered on

Example 1-2 Initiating POST Using the powercycle Command
sc> console
Enter #. to return to ALOM.
2007-07-03 10:25:12.081 0:0:0>@(#)Sun Netra[TM] T5440 POST 4.x.build_119 2007/06/06 09:48
/export/delivery/delivery/4.x/4.x.build_119/post4.x/Niagara/t5440/integrated (root)
2007-07-03 10:25:12.386 0:0:0>Copyright 2007 Sun Microsystems, Inc. All rights reserved
2007-07-03 10:25:12.550 0:0:0>VBSC cmp0 arg is: 00ff00ff.ffffffff
2007-07-03 10:25:12.653 0:0:0>POST enabling threads: 00ff00ff.ffffffff
2007-07-03 10:25:12.766 0:0:0>VBSC mode is: 00000000.00000001
2007-07-03 10:25:12.867 0:0:0>VBSC level is: 00000000.00000001
2007-07-03 10:25:12.966 0:0:0>VBSC selecting POST MAX Testing.
2007-07-03 10:25:13.066 0:0:0>VBSC setting verbosity level 3
2007-07-03 10:25:13.161 0:0:0> Niagara2, Version 2.1
2007-07-03 10:25:13.247 0:0:0> Serial Number: 0fac006b.0e654482
2007-07-03 10:25:13.353 0:0:0>Basic Memory Tests.....
2007-07-03 10:25:13.456 0:0:0>Begin: Branch Sanity Check
2007-07-03 10:25:13.569 0:0:0>End : Branch Sanity Check
2007-07-03 10:25:13.668 0:0:0>Begin: DRAM Memory BIST
2007-07-03 10:25:13.793 0:0:0>................................................................................................
2007-07-03 10:25:38.399 0:0:0>End : DRAM Memory BIST
2007-07-03 10:25:39.547 0:0:0>Sys 166 MHz, CPU 1166 MHz, Mem 332 MHz
2007-07-03 10:25:39.658 0:0:0>L2 Bank EFuse = 00000000.000000ff
2007-07-03 10:25:39.760 0:0:0>L2 Bank status = 00000000.00000f0f
2007-07-03 10:25:39.864 0:0:0>Core available Efuse = ffff00ff.ffffffff
2007-07-03 10:25:39.982 0:0:0>Test Memory.....
2007-07-03 10:25:40.070 0:0:0>Begin: Probe and Setup Memory
2007-07-03 10:25:40.181 0:0:0>INFO: 4096MB at Memory Branch 0
...
2007-07-03 10:29:21.683 0:0:0>INFO:
2007-07-03 10:29:21.686 0:0:0> POST Passed all devices.
2007-07-03 10:29:21.692 0:0:0>POST: Return to VBSC.

Example 1-3 POST Error Message
7:2>
7:2>ERROR: TEST = Data Bitwalk
7:2>H/W under test = /SYS/MB/CMP0/BR1/CH0/D0
7:2>Repair Instructions: Replace items in order listed by 'H/W under test' above.
7:2>MSG = Pin 149 failed on /SYS/MB/CMP0/BR1/CH0/D0 (J2001)
7:2>END_ERROR
7:2>Decode of Dram Error Log Reg Channel 2 bits 60000000.0000108c
7:2> 1 MEC 62 R/W1C Multiple corrected errors, one or more CE not logged
7:2> 1 DAC 61 R/W1C Set to 1 if the error was a DRAM access CE
7:2> 108c SYND 15:0 RW ECC syndrome.
7:2>
7:2> Dram Error AFAR channel 2 = 00000000.00000000
7:2> L2 AFAR channel 2 = 00000000.00000000

Example 1-4 showfaults Output
ok #.
sc> showfaults
Last POST Run: Wed Jun 27 21:29:02 2007
Post Status: Passed all devices
ID FRU Fault
0 /SYS/MB/CMP0/BR1/CH0/D0 SP detected fault: /SYS/MB/CMP0/BR1/CH0/D0 Forced fail (POST)
In most cases, when POST detects a faulty component, POST logs the fault and automatically takes the failed component out of operation by placing the component in the ASR blacklist (see Managing Components With Automatic System Recovery Commands).
In most cases, the replacement of the faulty FRU is detected when the service processor is reset or power cycled. In this case, the fault is automatically cleared from the system. This procedure describes how to identify POST-detected faults and, if necessary, manually clear the fault.
POST-detected faults are distinguished from other kinds of faults by the text Forced fail and by the absence of a UUID number.
See POST Detected Fault.
If no fault is reported, you do not need to do anything else. Do not perform the subsequent steps.
Use the FRU name that was reported in the fault in Step 1. See Using the enablecomponent Command.
The fault is cleared and should not show up when you run the showfaults command. Additionally, the Service Required LED is no longer on.
You must reboot the server for the enablecomponent command to take effect.
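The identification rule in this procedure (the text Forced fail and no UUID) can be sketched in Python for captured showfaults output. The row layout is assumed from the sample output in this section, with each fault on a single line. The function name and regular expressions are hypothetical, and the script only prints the command you would then enter at the sc> prompt.

```python
import re

# A UUID in the usual 8-4-4-4-12 hexadecimal form.
UUID_PATTERN = re.compile(
    r"\b[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\b"
)
# Assumed fault row: numeric ID, a FRU path starting with /SYS, then fault text.
ROW_PATTERN = re.compile(r"^\s*(\d+)\s+(/SYS\S*)\s+(.*)$")


def post_detected_faults(showfaults_text):
    """Return FRU names whose fault rows say 'Forced fail' and carry no UUID."""
    frus = []
    for line in showfaults_text.splitlines():
        row = ROW_PATTERN.match(line)
        if not row:
            continue  # header, status line, or anything that is not a fault row
        fru, fault = row.group(2), row.group(3)
        if "Forced fail" in fault and not UUID_PATTERN.search(fault):
            frus.append(fru)
    return frus


SAMPLE_SHOWFAULTS = """\
Last POST Run: Wed Jun 27 21:29:02 2007
Post Status: Passed all devices
ID FRU Fault
0 /SYS/MB/CMP0/BR1/CH0/D0 SP detected fault: /SYS/MB/CMP0/BR1/CH0/D0 Forced fail (POST)
"""

for fru in post_detected_faults(SAMPLE_SHOWFAULTS):
    print("enablecomponent", fru)  # enablecomponent /SYS/MB/CMP0/BR1/CH0/D0
```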
sc> showfaults
Last POST Run: Wed Jun 27 21:29:02 2007
Post Status: Passed all devices
ID FRU Fault
0 /SYS/MB/CMP0/BR1/CH0/D0 SP detected fault: /SYS/MB/CMP0/BR1/CH0/D0 Forced fail (POST)

Example 1-6 Using the enablecomponent Command
sc> enablecomponent /SYS/MB/CMP0/BR1/CH0/D0

Example 1-7 Verifying Cleared Faults Using the showfaults Command
sc> showfaults
Last POST run: THU MAR 09 16:52:44 2006
POST status: Passed all devices
No failures found in System