Sun Netra T5440 Server


Updated: September 2015
 
 

1.4 Using the Service Processor Firmware for Diagnosis and Repair Verification

The Sun Integrated Lights Out Manager (ILOM) firmware runs on the service processor in the server and enables you to remotely manage and administer your server.

ILOM enables you to remotely run diagnostics, such as power-on self-test (POST), that would otherwise require physical proximity to the server's serial port. You can also configure ILOM to send email alerts of hardware failures, hardware warnings, and other events related to the server or to ILOM.

The service processor runs independently of the server, using the server's standby power. Therefore, ILOM firmware and software continue to function when the server operating system goes offline or when the server is powered off.


Note - Refer to the Integrated Lights Out Management 2.0 Supplement for the Sun Netra T5440 Server for comprehensive ALOM CMT information.

Faults detected by ILOM, POST, and the Solaris Predictive Self-Healing (PSH) technology are forwarded to ILOM for fault handling (ILOM Fault Management).

In the event of a system fault, ILOM ensures that the fault LED is lit, FRU ID PROMs are updated, the fault is logged, and alerts are displayed. Fault messages identify faulty FRUs by FRU name.

Figure 1-7  ILOM Fault Management

image:Figure showing the fault source interfaces

The service processor detects when a fault is no longer present and clears the fault in several ways:

  • Fault recovery – The system automatically detects that the fault condition is no longer present. ILOM extinguishes the Service Required LED and updates the FRU's PROM, indicating that the fault is no longer present.

  • Fault repair – The fault has been repaired by human intervention. In most cases, the service processor detects the repair and extinguishes the Service Required LED. If the service processor does not perform these actions, you must perform these tasks manually with the clearfault or enablecomponent commands.
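The fault handling and clearing behavior described above can be sketched as a small state model. This is an illustrative sketch only, not ILOM code: the FruFaultRecord class, the ClearedBy values, the handle_repair helper, and the EXAMPLE_FRU name used in testing are all hypothetical, and the model simplifies the real system by tracking the Service Required LED per FRU.

```python
from dataclasses import dataclass, field
from enum import Enum


class ClearedBy(Enum):
    """Ways a fault can be cleared (names are hypothetical)."""
    RECOVERY = "fault recovery"  # the condition went away on its own
    REPAIR = "fault repair"      # a person repaired or replaced the FRU
    MANUAL = "manual clear"      # an operator cleared the fault by command


@dataclass
class FruFaultRecord:
    """Hypothetical stand-in for the fault state kept for one FRU."""
    fru_name: str
    faulted: bool = False
    service_required_led: bool = False
    log: list = field(default_factory=list)

    def report_fault(self, source: str) -> None:
        # On a fault: light the LED, mark the FRU record, log the event.
        self.faulted = True
        self.service_required_led = True
        self.log.append(f"fault on {self.fru_name} reported by {source}")

    def clear(self, cleared_by: ClearedBy) -> None:
        # Clearing undoes the same state regardless of what triggered it.
        if not self.faulted:
            return
        self.faulted = False
        self.service_required_led = False
        self.log.append(f"fault on {self.fru_name} cleared by {cleared_by.value}")


def handle_repair(record: FruFaultRecord, detected_by_sp: bool) -> str:
    """Return what the operator still has to do after a repair."""
    if detected_by_sp:
        record.clear(ClearedBy.REPAIR)
        return "nothing: the service processor detected the repair"
    # The fault stays active until someone clears it by hand.
    return "clear the fault manually (clearfault or enablecomponent)"
```

The point of the model is that recovery, detected repair, and manual clearing all end in the same state; they differ only in who notices that the fault is gone.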

The service processor also detects the removal of a FRU, in many cases even if the FRU is removed while the service processor is powered off (that is, if the system power cables are unplugged during service procedures). This detection enables ILOM to determine that a fault diagnosed to a specific FRU has been repaired.


Note - ILOM does not automatically detect hard drive replacement.

Many environmental faults can recover automatically. For example, a temperature that exceeds a threshold might return to normal limits, or an unplugged power supply might be plugged back in. ILOM automatically detects recovery from environmental faults.


Note - No ILOM command is needed to manually repair an environmental fault.
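The automatic recovery of a threshold-based environmental fault, as described above, can be illustrated with a short sketch. It assumes a single upper threshold and no hysteresis, which is a simplification; the function name track_threshold_fault and the event labels are hypothetical and do not correspond to any ILOM interface.

```python
from typing import Iterable, List, Tuple


def track_threshold_fault(readings: Iterable[float],
                          threshold: float) -> List[Tuple[int, str]]:
    """Return (sample index, event) pairs for fault and recovery transitions."""
    events: List[Tuple[int, str]] = []
    faulted = False
    for index, value in enumerate(readings):
        over_limit = value > threshold
        if over_limit and not faulted:
            # Reading crossed the threshold: a fault is raised.
            faulted = True
            events.append((index, "fault"))
        elif not over_limit and faulted:
            # Reading is back within limits: the fault clears by itself.
            faulted = False
            events.append((index, "recovered"))
    return events
```

No operator action appears anywhere in the loop: the fault clears as soon as a reading falls back within limits, which mirrors the note above.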

The Solaris Predictive Self-Healing technology does not monitor hard drives for faults. As a result, the service processor does not recognize hard drive faults and does not light the fault LEDs on either the chassis or the hard drive itself. Use the Solaris message files to view hard drive faults. See Collecting Information From Solaris OS Files and Commands.
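Because hard drive faults surface only in the Solaris message files, a simple filter can help when reviewing them. The following is a minimal sketch, not a supported tool: the function name find_disk_messages is hypothetical, and the default keywords are an assumption that should be adjusted to the driver names and message wording seen on your system.

```python
from typing import Iterable, List, Sequence

# Assumed keywords; change these to match the messages on your system.
DEFAULT_KEYWORDS = ("disk", "scsi")


def find_disk_messages(lines: Iterable[str],
                       keywords: Sequence[str] = DEFAULT_KEYWORDS) -> List[str]:
    """Return the lines that mention any keyword, compared case-insensitively."""
    matches: List[str] = []
    for line in lines:
        lowered = line.lower()
        if any(keyword.lower() in lowered for keyword in keywords):
            matches.append(line.rstrip("\n"))
    return matches
```

To use it, pass an open file object for the system message file (typically /var/adm/messages on Solaris) as the lines argument. A keyword match only narrows the search; each matching line still needs to be read and interpreted.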