Chapter 3
The Service Processor
The key to obtaining diagnostic information for an X6275 Blade Server Module node is its service processor (SP), an independent processor located on a daughter board in the blade.
This chapter includes the following topics:
Your server’s SP is a “baseboard management controller” (BMC). A definition of a BMC might be:
“A baseboard management controller (BMC) is a specialized processor that monitors the physical state of a server using sensors and communicates with the system administrator through a special management connection. The BMC, part of the Intelligent Platform Management Interface (IPMI), is usually mounted on the motherboard of the server that it monitors.”
Each of the two nodes of an X6275 Blade Server Module includes an SP, which runs whenever the server blade is inserted in a powered-on chassis, regardless of whether the node itself is powered on. The SP runs software called the Integrated Lights Out Manager (ILOM), aptly named because the SP and the ILOM program run continuously whenever power is applied to the server.
The ILOM program that runs on each server’s SP is the focal point for all of your diagnostic capability. With the ILOM program that runs on your server’s SP you can:
ILOM has both a web interface and a command-line interface.
You can use IPMItool to interrogate your server’s sensors and view system information without using the ILOM program, but IPMItool requires a functioning SP to gather data.
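As an illustration of this kind of query, the following Python sketch runs `ipmitool sensor list` against an SP over the network and parses the result. It is a minimal sketch under several assumptions: `ipmitool` is installed on the management station, the SP accepts IPMI over LAN (the `lanplus` interface), and the output is pipe-separated with the sensor name, reading, unit, and status in the first four columns. The host, user, and password-file arguments are placeholders, and the sample text is invented for illustration only.

```python
import subprocess


def parse_sensor_list(text):
    """Parse `ipmitool sensor list` style output into a list of dicts.

    Assumes pipe-separated columns whose first four fields are
    name, reading, unit, and status; threshold columns are ignored.
    """
    sensors = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 4 or not fields[0]:
            continue  # skip blank or malformed lines
        sensors.append({
            "name": fields[0],
            "reading": fields[1],
            "unit": fields[2],
            "status": fields[3],
        })
    return sensors


def read_sensors(host, user, password_file):
    """Query the SP over IPMI-over-LAN; this needs a functioning SP."""
    cmd = ["ipmitool", "-I", "lanplus", "-H", host, "-U", user,
           "-f", password_file, "sensor", "list"]
    result = subprocess.run(cmd, capture_output=True, text=True,
                            check=True, timeout=60)
    return parse_sensor_list(result.stdout)


# Illustrative sample only; sensor names and values are invented.
SAMPLE = """\
sensor_a         | 25.000     | degrees C  | ok    | na | na | na | na | na | na
sensor_b         | na         | discrete   | na    | na | na | na | na | na | na
"""

if __name__ == "__main__":
    for s in parse_sensor_list(SAMPLE):
        print(s["name"], s["reading"], s["unit"], s["status"])
```

If the SP is not functioning, `read_sensors` raises an exception (a non-zero exit status or a timeout) rather than returning data.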
If either of the two service processors on your server blade fails, that SP attempts to reboot itself (only the SP, not the server). If the SP cannot reboot, the likely cause is a corrupted firmware image, which you must recover by reflashing the firmware. See Sun Integrated Lights Out Manager 2.0 User’s Guide (820-1188) or Sun Integrated Lights Out Manager (ILOM) 3.0 CLI Procedures Guide (820-6412).
Corrupted firmware on the host might cause the host to stop operating. However, because the server and the SP operate independently of one another, a failed SP does not cause its host server to stop.
The only evidence of a failed SP is that ILOM ceases to work. This means that all of your diagnostic capabilities, including IPMItool, are disabled.
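Because a failed SP shows up only as ILOM no longer responding, a rough first check is whether the SP's management address still accepts connections. The Python sketch below attempts a TCP connection and reports the result. It assumes that the ILOM CLI is reached over SSH on port 22; the address shown is a documentation placeholder. A refused or timed-out connection can also indicate a network problem, so treat the result as a hint, not as proof that the SP has failed.

```python
import socket


def sp_port_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds.

    Port 22 is an assumption (SSH access to the ILOM CLI); adjust it
    to whatever management service your SP is configured to expose.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # "192.0.2.10" is a documentation placeholder, not a real SP address.
    print("SP reachable:", sp_port_open("192.0.2.10", timeout=2.0))
```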
If you have determined that the SP has failed, you must replace the entire server blade as soon as you deem it appropriate.
|Note - Diagnosing why your SP has failed might be of interest to Sun Services, but there is no need for you to attempt such a diagnosis.|
The SP has its own diagnostics program that runs when it boots. This program is called U-Boot Diagnostics and is analogous to the Power-On Self-Test (POST) that runs when you boot a server.
You can observe the output of the U-Boot Diagnostics program through the serial console when you first put your server into service (for example, if you suspect that the SP was damaged in shipping). You can also observe the output at any time by rebooting the SP.
Every time your X6275 server blade’s SP is booted or rebooted, the U-Boot Diagnostics program runs immediately on the SP. It collects data about the functional state of the SP and its components. It also obtains some information about the host. The resulting data is sent to two places:
You can view the results yourself by connecting a terminal directly to the SP’s serial port and watching the SP boot.
|Note - Even if the program finds a fault, the SP still boots if it can. SP faults that arise during booting are not reported to ILOM Fault Management and do not turn on the Service Required LED, although they do get written to the ILOM’s System Event Log (SEL).|
1. Log in to the SP web interface (see To Log In to the SP ILOM Web Interface Directly).
2. Select the Maintenance tab.
3. Select the Reset SP tab.
4. Click the Reset SP button.
1. Log in to the SP CLI (see To Log In to the SP ILOM CLI Directly).
2. Type reset /SP and press Enter.
3. When prompted with Are you sure you want to reset the /SP (y/n)?, type y.
When the SP reboots, the U-Boot Diagnostics program runs immediately, whether or not you are connected to the SP’s serial port.
Without your intervention, the program runs in its default mode (Normal mode).
You can connect a terminal to the SP’s serial port so that you can see the program’s output. As long as you do not interact with the program, it runs in Normal mode just as it would if you were not connected.
You must connect to the SP’s serial port with a terminal if you want to intervene in the running of the U-Boot Diagnostics program.
When you are connected to the serial port, you can see the test results, and you can force the U-Boot Diagnostics program to run in one of two alternate modes: Quick mode or Extended mode. Both of these modes perform more diagnostics than the Normal mode.
As the SP is booting, you see this prompt:
Enter Diagnostics Mode ['q'uick/'n'ormal(default)/e'x'tended(manufacturing mode)]......0
Diagnostic Mode - NORMAL
<DIAGS> Memory Data Bus Test ... PASSED
<DIAGS> Memory Address Bus Test ... PASSED
I2C Probe Test - Motherboard
Bus  Device                       Address  Result
===  ================             =======  ======
2    Sys FRUID ( U22)             0xA0     PASSED
2    Power CPLD ( U40)            0x4e     PASSED
2    CPU/DIMM Fault LEDs ( U78)   0x40     PASSED
2    PCA9555 (Misc) ( U79)        0x42     PASSED
2    LM75 Temp. Sensor 0 ( U18)   0x90     PASSED
2    LM75 Temp. Sensor 1 ( U128)  0x92     PASSED
2    LTC4215 ( U80)               0x96     PASSED
2    DIMM IMAX ( U88)             0x12     PASSED
6    Front Panel LEDs ( U100)     0xC6     PASSED
6    DS1338(RTC) ( U79)           0xD0     PASSED
6    PCA9555(Volt Marg) ( U84)    0x44     PASSED
<DIAGS> PHY #0 R/W Test ... PASSED
<DIAGS> PHY #0 Link Status ... PASSED
<DIAGS> ETHERNET PHY #0, Internal Loopback Test ... PASSED
Host in ON, Skipping HOST-based tests
<DIAGS> Testing PowerCPLD version ... PASSED
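If you capture the serial console output to a file, a short script can scan it for tests that did not pass. The Python sketch below parses output shaped like the Normal mode sample above. It makes two assumptions that this chapter does not confirm: every test line ends with a single result word, and any result other than PASSED deserves attention. The function names and the `SAMPLE_LOG` excerpt are illustrative.

```python
import re

# Matches lines such as "<DIAGS> Memory Data Bus Test ... PASSED".
DIAG_LINE = re.compile(
    r"^<DIAGS>\s*(?P<test>.+?)\s*\.\.\.\s*(?P<result>\S+)\s*$"
)
# Matches I2C probe rows such as "2 Sys FRUID ( U22) 0xA0 PASSED".
I2C_ROW = re.compile(
    r"^(?P<bus>\d+)\s+(?P<device>.+?)\s+(?P<addr>0x[0-9A-Fa-f]+)"
    r"\s+(?P<result>\S+)\s*$"
)


def parse_diag_log(text):
    """Return a list of (test name, result) tuples found in the log."""
    results = []
    for raw in text.splitlines():
        line = raw.strip()
        m = DIAG_LINE.match(line)
        if m:
            results.append((m.group("test"), m.group("result")))
            continue
        m = I2C_ROW.match(line)
        if m:
            name = "I2C bus {} {} @ {}".format(
                m.group("bus"), m.group("device"), m.group("addr"))
            results.append((name, m.group("result")))
    return results


def failures(results):
    """Keep only entries whose result is not PASSED (an assumption)."""
    return [(name, result) for name, result in results
            if result != "PASSED"]


# Excerpt of the Normal mode sample output shown in this chapter.
SAMPLE_LOG = """\
Diagnostic Mode - NORMAL
<DIAGS> Memory Data Bus Test ... PASSED
<DIAGS> Memory Address Bus Test ... PASSED
2 Sys FRUID ( U22) 0xA0 PASSED
6 DS1338(RTC) ( U79) 0xD0 PASSED
<DIAGS> Testing PowerCPLD version ... PASSED
"""

if __name__ == "__main__":
    parsed = parse_diag_log(SAMPLE_LOG)
    print(len(parsed), "tests parsed;", len(failures(parsed)), "not PASSED")
```

Header, separator, and informational lines (for example, the line reporting that host-based tests were skipped) match neither pattern and are ignored.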
|Note - Refer to the Sun x64 Servers Diagnostics Guide for more information on the Quick and Extended modes, including sample output.|