CHAPTER 3

The Service Processor

The key to obtaining diagnostic information for an X6275 Blade Server Module node is its service processor (SP), an independent processor located on a daughter board in the blade.

This chapter includes the following topics:


The Service Processor

Your server’s SP is a “baseboard management controller” (BMC). A definition of a BMC might be:

“A baseboard management controller (BMC) is a specialized processor that monitors the physical state of a server using sensors and communicates with the system administrator through a special management connection. The BMC, part of the Intelligent Platform Management Interface (IPMI), is usually mounted on the motherboard of the server that it monitors.”

Each of the two nodes of an X6275 Blade Server Module includes an SP, which runs whenever the server blade is inserted in a powered-on chassis, regardless of whether the node itself is powered on. The SP runs software called the Integrated Lights Out Manager (ILOM), aptly named because the SP and the ILOM program run continuously whenever power is applied to the server.

ILOM

The ILOM program that runs on each server’s SP is the focal point for all of your diagnostic capability. With the ILOM program that runs on your server’s SP you can:

ILOM has both a web interface and a command-line interface.

IPMItool

You can use IPMItool to interrogate your server’s sensors and view system information without using the ILOM program, but IPMItool requires a functioning SP to gather data.
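
As a rough illustration of querying an SP with IPMItool from a management workstation, the following Python sketch builds (and can optionally run) an ipmitool command line. It assumes that ipmitool is installed locally, that the SP accepts IPMI-over-LAN connections through the lanplus interface, and that the password is supplied in the IPMI_PASSWORD environment variable (the -E option). The function names and the host name sp-node0.example.com are hypothetical.

```python
import subprocess


def build_ipmitool_command(sp_host, user, subcommand):
    """Return the argument list for an ipmitool call over IPMI-over-LAN.

    -I lanplus selects the IPMI v2.0 LAN interface, -H and -U name the SP
    and the account, and -E makes ipmitool read the password from the
    IPMI_PASSWORD environment variable instead of the command line.
    """
    return ["ipmitool", "-I", "lanplus", "-H", sp_host, "-U", user, "-E"] + list(subcommand)


def run_ipmitool(sp_host, user, subcommand, timeout=30):
    """Run ipmitool and return its standard output (raises if the call fails)."""
    cmd = build_ipmitool_command(sp_host, user, subcommand)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True, timeout=timeout)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical SP address and account; substitute your own values.
    print(build_ipmitool_command("sp-node0.example.com", "root", ["sdr", "list"]))
```

Passing ["sel", "list"] instead of ["sdr", "list"] requests the System Event Log rather than the sensor data records. Either call fails if the SP is not functioning, which is consistent with the dependency described above.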

SP Failure

If either of the two service processors on your server blade fails, that SP attempts to reboot itself (only the SP, not the server). If the SP cannot reboot, the likely cause is a corrupted firmware image, which you must recover by reflashing the firmware. See Sun Integrated Lights Out Manager 2.0 User’s Guide (820-1188) or Sun Integrated Lights Out Manager (ILOM) 3.0 CLI Procedures Guide (820-6412).

Corrupted firmware on the host might cause the host to stop operating. However, because the server and the SP operate independently of one another, a failed SP does not cause its host server to stop.

The only evidence of a failed SP is that ILOM ceases to work. This means that all of your diagnostic capabilities, including IPMItool, are disabled.

If you have determined that the SP has failed, you must replace the entire server blade as soon as you deem it appropriate.



Note - Diagnosing why your SP has failed might be of interest to Sun Services, but there is no need for you to attempt such a diagnosis.
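
Because an unresponsive ILOM is the only symptom of a failed SP, a simple reachability probe can serve as a first check. The Python sketch below tries to open a TCP connection to the SP's management port. It assumes the ILOM CLI is reached over SSH on port 22; the function name is hypothetical. A negative result is only a hint, since a network or cabling problem produces the same symptom.

```python
import socket


def sp_management_port_open(sp_host, port=22, timeout=5.0):
    """Return True if a TCP connection to the SP's management port succeeds.

    Port 22 (SSH, used for the ILOM CLI) is an assumed default.
    False means the connection was refused, timed out, or could not be
    routed; it does not by itself prove that the SP has failed.
    """
    try:
        with socket.create_connection((sp_host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If the probe fails repeatedly from a workstation that can reach other devices on the same management network, the SP becomes the more likely culprit.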



Service Processor Diagnostics

The SP has its own diagnostics program that runs when it boots. This program is called U-Boot Diagnostics and is analogous to the Power-On Self-Test (POST) that runs when you boot a server.

You can observe the output of the U-Boot Diagnostics program through the serial console when you first put your server into service (for example, if you suspect that the SP was damaged in shipping). You can also observe the output at any time by rebooting the SP.

The U-Boot Diagnostics Program

Every time your X6275 server blade’s SP is booted or rebooted, the U-Boot Diagnostics program runs immediately on the SP. It collects data about the functional state of the SP and its components. It also obtains some information about the host. The resulting data is sent to two places:

You can view the results yourself by connecting a terminal directly to the SP’s serial port and watching the SP boot.



Note - Even if the program finds a fault, the SP still boots if it can. SP faults that arise during booting are not reported to ILOM Fault Management and do not turn on the Service Required LED, although they do get written to the ILOM’s System Event Log (SEL).



To Reboot the SP Using the ILOM Web Interface

1. Log in to the SP web interface (see To Log In to the SP ILOM Web Interface Directly).

2. Select the Maintenance tab.

3. Select the Reset SP tab.

4. Click the Reset SP button.


To Reboot the SP Using the ILOM CLI

1. Log in to the SP CLI (see To Log In to the SP ILOM CLI Directly).

2. Type reset /SP and press Enter.

3. When the prompt Are you sure you want to reset the /SP (y/n)? appears, type y.

Running the U-Boot Diagnostic Software Without User Intervention

When the SP reboots, the U-Boot Diagnostics program runs immediately, whether or not you are connected to the SP’s serial port.

Without your intervention, the program runs in its default mode (Normal mode).

You can connect a terminal to the SP’s serial port so that you can see the program’s output. As long as you do not interact with the program, it runs in Normal mode just as it would if you were not connected.

Running the U-Boot Diagnostic Software With User Intervention

You must connect to the SP’s serial port with a terminal if you want to intervene in the running of the U-Boot Diagnostics program.

When you are connected to the serial port, you can see the test results, and you can force the U-Boot Diagnostics program to run in one of two alternate modes: Quick mode or Extended mode. Both of these modes perform more diagnostics than the Normal mode.

As the SP is booting, you see this prompt:

Enter Diagnostics Mode ['q'uick/'n'ormal(default)/e'x'tended(manufacturing mode)]......0
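
If you capture the serial console with a script, the mode prompt can be recognized and answered programmatically. The following Python sketch shows only the decision logic; it takes the key letters (q, n, x) from the prompt above. The function names are hypothetical, and actually sending the key to the serial port is left to whatever terminal or serial library you use.

```python
# Key letters as shown in the SP boot prompt.
MODE_KEYS = {"quick": "q", "normal": "n", "extended": "x"}


def is_mode_prompt(line):
    """Return True if a console line is the U-Boot diagnostics mode prompt."""
    return "Enter Diagnostics Mode" in line


def key_for_mode(mode):
    """Return the single key to send for the requested diagnostics mode."""
    try:
        return MODE_KEYS[mode.strip().lower()]
    except KeyError:
        raise ValueError("unknown diagnostics mode: {!r}".format(mode))
```

For example, a capture loop could call is_mode_prompt() on each line it reads and, on a match, write key_for_mode("extended") to the port.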

Sample Output From Normal Mode Test


Diagnostic Mode - NORMAL
<DIAGS> Memory Data Bus Test ... PASSED
<DIAGS> Memory Address Bus Test ... PASSED
I2C Probe Test - Motherboard
          Bus     Device                       Address       Result
          ===     ================             =======       ======
          2       Sys FRUID ( U22)             0xA0          PASSED
          2       Power CPLD ( U40)            0x4e          PASSED
          2       CPU/DIMM Fault LEDs ( U78)   0x40          PASSED
          2       PCA9555 (Misc) ( U79)        0x42          PASSED
          2       LM75 Temp. Sensor 0 ( U18)   0x90          PASSED
          2       LM75 Temp. Sensor 1 ( U128)  0x92          PASSED
          2       LTC4215 ( U80)               0x96          PASSED
          2       DIMM IMAX ( U88)             0x12          PASSED
          6       Front Panel LEDs ( U100)     0xC6          PASSED
          6       DS1338(RTC) ( U79)           0xD0          PASSED
          6       PCA9555(Volt Marg) ( U84)    0x44          PASSED
<DIAGS> PHY #0 R/W Test ... PASSED
<DIAGS> PHY #0 Link Status ... PASSED
<DIAGS> ETHERNET PHY #0, Internal Loopback Test ... PASSED
Host in ON, Skipping HOST-based tests
<DIAGS> Testing PowerCPLD version ... PASSED
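
Output in this format is regular enough to check with a script once it has been captured from the serial console. The Python sketch below extracts the <DIAGS> lines and the I2C probe rows and lists any result that is not PASSED. The line formats are inferred from the sample above only. The function names are hypothetical, and treating every result other than PASSED as a failure is an assumption.

```python
import re

# "<DIAGS> <test name> ... <RESULT>"
DIAG_LINE = re.compile(r"^<DIAGS>\s+(?P<name>.+?)\s+\.\.\.\s+(?P<result>\w+)\s*$")
# "<bus>  <device>  <0xADDR>  <RESULT>" rows of the I2C probe table
I2C_ROW = re.compile(
    r"^\s*(?P<bus>\d+)\s+(?P<device>.+?)\s+(?P<addr>0x[0-9A-Fa-f]+)\s+(?P<result>\w+)\s*$"
)


def parse_uboot_diags(text):
    """Return a list of (test_name, result) tuples found in captured output."""
    results = []
    for line in text.splitlines():
        m = DIAG_LINE.match(line)
        if m:
            results.append((m.group("name"), m.group("result")))
            continue
        m = I2C_ROW.match(line)
        if m:
            name = "I2C bus {} {} @ {}".format(
                m.group("bus"), m.group("device"), m.group("addr")
            )
            results.append((name, m.group("result")))
    return results


def failed_tests(results):
    """Assumption: any result other than PASSED counts as a failure."""
    return [(name, result) for name, result in results if result != "PASSED"]


# A short excerpt of the sample output, used here as test input.
SAMPLE = """\
<DIAGS> Memory Data Bus Test ... PASSED
          2       Sys FRUID ( U22)            0xA0           PASSED
<DIAGS> PHY #0 Link Status ... PASSED
"""

if __name__ == "__main__":
    parsed = parse_uboot_diags(SAMPLE)
    print(len(parsed), "results,", len(failed_tests(parsed)), "not PASSED")
```

Header, separator, and informational lines (such as the Host-based tests message) match neither pattern and are skipped.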



Note - Refer to the Sun x64 Servers Diagnostics Guide for more information on the Quick and Extended modes, including sample output.