Oracle® Server X5-4 Service Manual


Updated: December 2015
 
 

Troubleshooting Hardware Faults Using Oracle ILOM

This section provides a troubleshooting procedure that you can use to investigate server hardware faults and, if necessary, prepare the server for service.

When a server hardware fault event occurs, the system lights the Service Action Required indicator and captures the event in the system event log (SEL). If you have set up notification through Oracle ILOM, you also receive an alert through the notification method you chose. When you become aware of a hardware fault, address it immediately.

  1. Log in to the server SP Oracle ILOM web interface.

    Open a browser and enter the IP address of the server SP. At the login screen, type a user name (with administrator privileges) and password. The Summary screen appears.

    The Status section of the Summary screen provides information about the server subsystems, including:

    • Processors

    • Memory

    • Power

    • Cooling

    • Storage

    • Networking

    • I/O Modules

  2. In the Status section of the Summary screen, identify the server subsystem that requires service.
    image:A screen capture showing the Oracle ILOM Status screen.

    In the above example, the Status screen shows that the Memory subsystem requires service. This indicates that a hardware component within the subsystem is in a fault state.

  3. To identify the component, click the subsystem name.

    The subsystem screen appears.


    image:A screen capture of the Oracle ILOM Processors screen.

    The above example shows the processor information screen and indicates that CPU 0 has a problem.

  4. To get more information, click one of the Open Problems links.

    The Open Problems screen provides detailed information, such as the time the event occurred, the component and subsystem name, and a description of the issue. It also includes a link to a KnowledgeBase article.


    Tip  -  The System Log provides a chronological list of all the system events and faults that have occurred since the log was last reset and includes additional information, such as severity levels and error counts. The System Log also includes information on devices not reported in the Subsystem Summary screen. To access it, click the System Log link.

    In this example, the hardware fault with DIMM 8 of CPU 0 requires local/physical access to the server.

  5. Before going to the server, review the server Product Notes document for information related to the issue or the component.

    The Product Notes document contains up-to-date information about the server, including hardware-related issues.

  6. Prepare the server for service.

    See Preparing to Service the Server.

  7. Service the component.

    Note -  After servicing the component, you might need to clear the fault in Oracle ILOM. For more information, refer to the component service procedure.
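
The triage logic in steps 2 through 4 can be sketched in a few lines of Python. This is only an illustration of the reasoning, not an Oracle ILOM API: the status strings, the `OpenProblem` record, the timestamp format, and the sample data are all assumptions, typed in by hand as if copied from the Summary and Open Problems screens.

```python
from dataclasses import dataclass

# Subsystem names as listed in the Status section of the Summary screen.
SUBSYSTEMS = ["Processors", "Memory", "Power", "Cooling",
              "Storage", "Networking", "I/O Modules"]


@dataclass
class OpenProblem:
    """Hypothetical record mirroring the fields shown on the Open Problems screen."""
    time: str          # time the event occurred (ISO 8601 assumed, so strings sort by time)
    subsystem: str
    component: str
    description: str


def subsystems_requiring_service(status):
    """Step 2: return subsystems whose recorded status is not "OK", in screen order.

    Subsystems missing from the mapping are skipped. The "OK" string is an assumption.
    """
    return [name for name in SUBSYSTEMS
            if name in status and status[name] != "OK"]


def problems_for(subsystem, problems):
    """Steps 3 and 4: return the open problems for one subsystem, oldest first."""
    matching = [p for p in problems if p.subsystem == subsystem]
    return sorted(matching, key=lambda p: p.time)


# Hand-typed sample data (hypothetical values).
status = {name: "OK" for name in SUBSYSTEMS}
status["Memory"] = "Service Required"
problems = [
    OpenProblem("2015-12-01T10:15:00", "Memory", "DIMM 8 of CPU 0",
                "placeholder fault description"),
]

for name in subsystems_requiring_service(status):
    for p in problems_for(name, problems):
        print(f"{p.time} {name}: {p.component}: {p.description}")
```

Listing the affected components this way before going to the server makes it easier to check the Product Notes for each one, as step 5 recommends.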