Oracle® X6 Series Servers Administration Guide

Updated: September 2017
 
 

Fault Detection and Diagnostics Overview

The server supports multiple fault detection and diagnostics tools. Fault detection tools, such as the Oracle ILOM Fault Manager, automatically poll the system to detect hardware faults and adverse environmental conditions. Diagnostics tools, such as Oracle VTS, must be run manually and can help you troubleshoot server issues. The following table provides an overview of the fault detection and diagnostics tools that the server supports.

Tool
Description
Documentation
Oracle ILOM Fault Manager
The Oracle ILOM Fault Manager is part of the Oracle ILOM firmware embedded on the server service processor (SP). The fault manager automatically detects hardware faults and adverse environmental conditions on the server. If a problem occurs on the server, Oracle ILOM identifies the problem in the Open Problems table and logs information about the fault in the Event log.
Refer to Protecting Against Hardware Faults: Oracle ILOM Fault Manager, Oracle ILOM User's Guide for System Monitoring and Diagnostics, Firmware Release 3.2.x at:
Oracle Linux Fault Management Architecture (FMA)
Oracle Linux FMA software can optionally be installed on the server through Oracle Hardware Management Pack. You can use Oracle Linux FMA to manage faults detected at the operating system (OS) level in much the same way that you manage faults in Oracle ILOM. Fault diagnosis messages from Linux FMA are maintained in a fault management database, which is shared with Oracle ILOM.
Refer to the Oracle Linux Fault Management Architecture User's Guide at:
Oracle Solaris Fault Management Architecture (FMA)
Oracle Solaris FMA is included with the Oracle Solaris operating system (OS). The fault manager receives data related to hardware and software errors, automatically diagnoses the underlying problem, and responds by trying to take faulty components offline.
Refer to Oracle Solaris Administration: Common Tasks at:
Auto Service Request (ASR)
ASR is an optional support service for Oracle hardware. ASR collects hardware telemetry data from telemetry sources (such as Oracle ILOM) on ASR-enabled systems in your data center. ASR filters this telemetry data and forwards what it determines to be potential faults directly to Oracle, and then automatically initiates a service request. You can configure features of the ASR service from Oracle ILOM.
Go to:
BIOS POST
At system startup, the system BIOS performs a power-on self-test (POST) that checks the hardware on your server to ensure that all components are present and functioning properly. It displays the results of this test on the system console.
To launch the power-on self-test and view the test output, reset the power on the server.
Refer to the BIOS POST section in the Oracle x86 Servers Diagnostics, Applications, and Utilities Guide for Servers with Oracle ILOM 3.1 and Oracle ILOM 3.2.x at:
Oracle VTS
Oracle VTS is a comprehensive diagnostic tool that verifies the connectivity and functionality of most hardware controllers and devices. Oracle VTS is the preferred test for diagnosing I/O and host bus adapter (HBA) problems.
Launch Oracle VTS on a system running the Oracle Solaris operating system. Alternatively, you can download the Oracle VTS ISO image to your Oracle server or to a CD/DVD and then use Oracle ILOM redirection to boot the image.
Refer to the Oracle VTS section in the Oracle x86 Servers Diagnostics, Applications, and Utilities Guide for Servers with Oracle ILOM 3.1 and Oracle ILOM 3.2.x at:
UEFI Diagnostics
UEFI Diagnostics is a suite of diagnostic tests that enables you to detect problems on motherboard components, drives, ports, and slots.
Launch these tests from the Oracle Integrated Lights Out Manager (ILOM) web interface or command-line interface (CLI):
Web:
  1. Navigate to the Host Management > Diagnostics page.

  2. In the Mode drop-down list, select the level of diagnostics you want to run (Enabled, Disabled, Extended, or Manual).

  3. Click Start Diagnostics.

CLI:
  • Use the following command to specify the diagnostics mode:

    set /HOST/diag mode=[enabled|disabled|extended|manual]

  • Use the following command to start the diagnostics:

    start /HOST/diag
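
If you script this two-step CLI sequence, it can help to validate the mode before sending anything to the SP. The following Python sketch is illustrative only: the helper name `uefi_diag_commands` and the constant `VALID_MODES` are hypothetical and not part of Oracle ILOM. Only the two command strings and the four mode names are taken from the steps above. The sketch builds the command text and does not connect to the SP.

```python
from typing import List

# The four mode names listed for the Mode setting above.
VALID_MODES = ("enabled", "disabled", "extended", "manual")


def uefi_diag_commands(mode: str) -> List[str]:
    """Return the two ILOM CLI commands, in order, for the given mode.

    Hypothetical helper for illustration; it only formats strings.
    """
    normalized = mode.strip().lower()
    if normalized not in VALID_MODES:
        raise ValueError(
            "unsupported diagnostics mode %r; expected one of: %s"
            % (mode, ", ".join(VALID_MODES))
        )
    return [
        "set /HOST/diag mode=%s" % normalized,  # step 1: specify the mode
        "start /HOST/diag",                     # step 2: start the diagnostics
    ]


if __name__ == "__main__":
    for command in uefi_diag_commands("extended"):
        print(command)
```

Run as a script, this prints the two commands for the extended mode, which you could then paste into an Oracle ILOM CLI session. Rejecting unknown mode names locally is a design choice of the sketch, not documented Oracle ILOM behavior.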

Refer to one of the following resources: