The Sun Integrated Lights Out Manager (ILOM) firmware runs on a service processor in the server and enables you to remotely manage and administer your server.
ILOM enables you to remotely run diagnostics, such as power-on self-test (POST), that would otherwise require physical proximity to the server's serial port. You can also configure ILOM to send email alerts of hardware failures, hardware warnings, and other events related to the server or to ILOM.
The service processor runs independently of the server, using the server's standby power. Therefore, ILOM firmware and software continue to function when the server operating system goes offline or when the server is powered off.
Faults detected by ILOM, POST, and the Solaris Predictive Self-Healing (PSH) technology are forwarded to ILOM for fault handling (ILOM Fault Management).
In the event of a system fault, ILOM ensures that the fault LED is lit, FRU ID PROMs are updated, the fault is logged, and alerts are displayed (faulty FRUs are identified in fault messages using the FRU name).
Figure 1-7 ILOM Fault Management
The service processor detects when a fault is no longer present and clears the fault in several ways:
Fault recovery – The system automatically detects that the fault condition is no longer present. ILOM extinguishes the Service Required LED and updates the FRU's PROM, indicating that the fault is no longer present.
Fault repair – The fault has been repaired by human intervention. In most cases, the service processor detects the repair and extinguishes the Service Required LED. If the service processor does not perform these actions, you must perform these tasks manually with the clearfault or enablecomponent commands.
The service processor also detects the removal of a FRU, in many cases even if the FRU is removed while the service processor is powered off (that is, if the system power cables are unplugged during service procedures). This detection lets ILOM recognize that a fault diagnosed to a specific FRU has been repaired.
Many environmental faults can recover automatically. For example, a temperature that exceeded a threshold might return to normal limits, or an unplugged power supply might be plugged back in. The recovery of environmental faults is detected automatically.
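The fault handling and clearing behavior described above can be pictured as a small state model. The following Python sketch is illustrative only and is not ILOM code: the class names, method names, and FRU names are hypothetical, and the rule that the Service Required LED stays lit while any FRU still carries a fault is an assumption made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Fru:
    """Hypothetical stand-in for a field-replaceable unit."""
    name: str
    prom_faulted: bool = False  # fault flag recorded in the FRU ID PROM


@dataclass
class ServiceProcessor:
    """Toy model of the fault handling described in the text (not ILOM code)."""
    frus: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    @property
    def service_required_led(self) -> bool:
        # Assumption: the LED stays lit while any FRU still carries a fault.
        return any(fru.prom_faulted for fru in self.frus.values())

    def add_fru(self, name: str) -> None:
        self.frus[name] = Fru(name)

    def report_fault(self, fru_name: str, source: str) -> None:
        """Faults from any source (ILOM, POST, PSH) are forwarded here."""
        self.frus[fru_name].prom_faulted = True
        self.log.append(f"fault: {fru_name} (detected by {source})")

    def clear_fault(self, fru_name: str, reason: str) -> None:
        """Clear a fault after recovery, a detected repair, or a manual command."""
        self.frus[fru_name].prom_faulted = False
        self.log.append(f"cleared: {fru_name} ({reason})")


sp = ServiceProcessor()
sp.add_fru("PS0")      # hypothetical FRU names
sp.add_fru("FANBD0")
sp.report_fault("PS0", "ILOM")
sp.report_fault("FANBD0", "POST")

sp.clear_fault("PS0", "recovery")       # environmental fault recovered
led_after_one_clear = sp.service_required_led

sp.clear_fault("FANBD0", "repair")      # FRU replaced during service
led_after_all_clear = sp.service_required_led
```

In this model, recovery, detected repair, and a manual clear all go through the same `clear_fault` path. Only the recorded reason differs, which mirrors how the text describes several routes to the same cleared state.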
The Solaris Predictive Self-Healing technology does not monitor the hard drive for faults. As a result, the service processor does not recognize hard drive faults, and will not light the fault LEDs on either the chassis or the hard drive itself. Use the Solaris message files to view hard drive faults. See Collecting Information From Solaris OS Files and Commands.
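Because hard drive faults appear only in the Solaris message files, a small filter can help narrow a long log to the lines of interest. The following Python sketch is an illustration under stated assumptions: it assumes the message file is at the usual Solaris location /var/adm/messages, and the keyword patterns are guesses that would need tuning for the disk driver messages on a given system. The function names are hypothetical.

```python
import re

# Assumed keywords; actual disk driver messages vary by release and driver.
DISK_PATTERN = re.compile(r"\b(sd\d+|ssd\d+|disk|scsi)\b", re.IGNORECASE)
PROBLEM_PATTERN = re.compile(r"\b(warning|error|fail\w*)\b", re.IGNORECASE)


def find_disk_messages(lines):
    """Return lines that mention a disk-like device and a problem keyword."""
    return [
        line.rstrip("\n")
        for line in lines
        if DISK_PATTERN.search(line) and PROBLEM_PATTERN.search(line)
    ]


def scan_message_file(path="/var/adm/messages"):
    """Scan a message file (path is an assumption) for likely disk problems."""
    with open(path, errors="replace") as handle:
        return find_disk_messages(handle)
```

Calling `scan_message_file()` on the server returns the matching lines for manual review. The filter only narrows the search and does not diagnose a fault.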