Oracle ILOM Platform Features for the Sun Fire X4470 Server
Oracle ILOM 3.0 operates on many platforms, supporting features that are common to all platforms. Some Oracle ILOM 3.0 features belong to a subset of platforms and not to all. This chapter describes the features that are specific to Oracle’s Sun Fire X4470 Server.
For detailed information about Oracle ILOM features that are common to all server platforms, see the Oracle Integrated Lights Out Manager (ILOM) 3.0 Documentation Collection, as described in Oracle ILOM 3.0 Common Feature Set Documentation Collection.
The sections that follow describe the Oracle ILOM features that are specific to the Sun Fire X4470 Server.
Supported Sun Fire X4470 Server Firmware
TABLE 2-1 identifies the Oracle ILOM and BIOS firmware versions supported on the Sun Fire X4470 Server.
TABLE 2-1 Supported Platform Firmware
Software Release | Oracle ILOM SP Firmware | BIOS Firmware
1.0 | 3.0.9.10 | 9.1.25.11
1.1 | 3.0.9.25 | 9.2.1.15
1.2.1 | 3.0.14.10a | 9.3.1.15
For information about how to update the firmware on your server, refer to the Oracle ILOM 3.0 Common Feature Set Documentation Collection at:
http://www.oracle.com/pls/topic/lookup?ctx=E19860-01&id=homepage
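As an illustration of how the version pairs in TABLE 2-1 might be checked in a script, the following minimal sketch stores the table as a lookup. The names SUPPORTED_FIRMWARE and release_for are hypothetical and are not part of Oracle ILOM; only the version strings come from the table.

```python
# Version strings transcribed from TABLE 2-1. The dictionary and function
# names are illustrative assumptions, not Oracle ILOM identifiers.
SUPPORTED_FIRMWARE = {
    "1.0": {"ilom": "3.0.9.10", "bios": "9.1.25.11"},
    "1.1": {"ilom": "3.0.9.25", "bios": "9.2.1.15"},
    "1.2.1": {"ilom": "3.0.14.10a", "bios": "9.3.1.15"},
}


def release_for(ilom_version, bios_version):
    """Return the software release whose ILOM and BIOS versions both match, else None."""
    for release, firmware in SUPPORTED_FIRMWARE.items():
        if firmware["ilom"] == ilom_version and firmware["bios"] == bios_version:
            return release
    return None
```

A result of None indicates a combination that does not appear in the table, for example an Oracle ILOM version from one release paired with the BIOS version from another.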
Hardware Management Pack for Single Server Management
The Sun Server Hardware Management Pack (Hardware Management Pack) from Oracle provides tools to help you manage and configure your Oracle servers from the host operating system. To use these tools, you must install the Hardware Management Pack software on your server. After installing the software, you can perform the server management tasks described in TABLE 2-2.
TABLE 2-2 Hardware Management Pack - Server Management Tasks
Server Management Task From Host OS* | Hardware Management Pack Implementation | Tool
Monitor Oracle hardware with host IP address | Use the Hardware Management Agent and the associated Simple Network Management Protocol (SNMP) Plug-ins at the operating-system level to enable in-band monitoring of your Oracle hardware. This in-band monitoring functionality enables you to use your host operating system IP address to monitor your Oracle servers without connecting the Oracle ILOM management port to your network. | Host OS-level management tool
Monitor storage devices, including RAID arrays | Use the Server Storage Management Agent at the operating-system level to enable in-band monitoring of the storage devices configured on your Oracle servers. The Server Storage Management Agent provides an operating-system daemon that gathers information about your server’s storage devices, such as hard disk drives (HDDs) and RAID arrays, and sends this information to the Oracle ILOM service processor. The Storage Monitoring features in Oracle ILOM enable you to view and monitor the information provided by the Server Storage Management Agent. You can access the Storage Monitoring features in Oracle ILOM from the command-line interface (CLI). | Oracle ILOM 3.0 CLI Storage Monitoring features
Configure BIOS CMOS settings, device boot order, and some SP settings | Use the biosconfig CLI tool from the host operating system to configure your Oracle x86 servers’ BIOS CMOS settings, device boot order, and some service processor (SP) settings. | Host OS-level biosconfig CLI
Query, update, and validate firmware versions on supported SAS storage devices | Use the fwupdate CLI tool from the host operating system to query, update, and validate firmware versions on supported storage devices such as SAS host bus adapters (HBAs), embedded SAS storage controllers, LSI SAS storage expanders, and disk drives. | Host OS-level fwupdate CLI
Restore, set, and view Oracle ILOM configuration settings | Use the ilomconfig CLI tool from the host operating system to restore Oracle ILOM configuration settings, as well as to view and set Oracle ILOM properties that are associated with network management, clock configuration, and user management. | Host OS-level ilomconfig CLI
View or create RAID volumes on storage drives | Use the raidconfig CLI tool from the host operating system to view and create RAID volumes on storage drives that are attached to RAID controllers, including storage arrays. | Host OS-level raidconfig CLI
Use IPMItool to access and manage Oracle servers | Use the open source command-line IPMItool from the host operating system to access and manage your Oracle servers via the IPMI protocol. | Host OS-level command-line IPMItool
Download Hardware Management Pack Software
Navigate to the following web site to download the Hardware Management Pack software.
http://support.oracle.com
Hardware Management Pack Documentation
For instructions on installing the management pack software or using its components, see the following Hardware Management Pack documentation:
- Sun Server Hardware Management Pack 2.0 User’s Guide
- Sun Server Management Agent 2.0 User’s Guide
- Sun Server CLI Tools and IPMItool 2.0 User's Guide
For additional details about how to use the Storage Monitoring features in Oracle ILOM, see the Oracle Integrated Lights Out Manager (ILOM) 3.0 Concepts Guide and the Oracle Integrated Lights Out Manager (ILOM) 3.0 CLI Procedures Guide.
For additional details about accessing and managing your server via SNMP or IPMI, see the Oracle Integrated Lights Out Manager (ILOM) 3.0 Management Protocols Reference Guide.
Power Management Policies
This release of Oracle ILOM 3.0 software provides new Power Management policies that are supported on the Sun Fire X4470 Server.
For more information about the latest Oracle ILOM 3.0 Power Management policies, see the Oracle Integrated Lights Out Manager (ILOM) 3.0 Feature Updates and Release Notes.
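This section includes the following topics:
- Host Power Throttling and Recovery
- Service Processor Power-On Policy
- Light Load Efficiency Mode
- Low Line AC Override Mode Policy
- Configure SP Power Management Policies Using the Web Interface
- Configure SP Power Management Policies Using the CLI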
Host Power Throttling and Recovery
The Sun Fire X4470 Server supports a simple mechanism to automatically apply hardware throttles to the CPUs and memory controllers when power demand exceeds the rated capacity of the available power supplies. This can occur when a redundant power supply has failed or has been removed from the system.
When the server’s hardware (power CPLD) determines that power demand has exceeded the system’s available power, it automatically throttles the host processor to reduce its power consumption. The service processor (SP) removes this hardware throttle after it has been applied for 5 seconds. Host power throttling and recovery continues until such action is no longer needed.
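The throttle-and-recover cycle described above can be sketched as a small simulation. This is only a model under stated assumptions: power demand is sampled once per second, each sample is the demand the host would draw if unthrottled, and the function name simulate_throttling is hypothetical. The 5-second hold time is the only value taken from the text.

```python
THROTTLE_HOLD_SECONDS = 5  # from the text: the SP removes the throttle after 5 seconds


def simulate_throttling(demand_watts, capacity_watts):
    """Return, for each one-second sample, whether the hardware throttle is applied.

    demand_watts is a sequence of hypothetical unthrottled power demands,
    one per second; capacity_watts is the available power supply capacity.
    """
    throttled = []
    remaining = 0  # seconds left before the SP removes the current throttle
    for demand in demand_watts:
        if remaining == 0 and demand > capacity_watts:
            # The power CPLD detects the overload and applies the throttle.
            remaining = THROTTLE_HOLD_SECONDS
        throttled.append(remaining > 0)
        if remaining > 0:
            remaining -= 1
    return throttled
```

In this model, if demand is still above capacity when the hold time expires, the throttle is applied again on the next sample, which corresponds to throttling and recovery continuing until no longer needed.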
Service Processor Power-On Policy
The service processor (SP) power-on policy determines the power state of the server when a cold boot is performed on the server. A server cold boot occurs only when AC power is applied to the server.
Service processor power-on policies are mutually exclusive, meaning that if one policy is enabled, the other policy is disabled by default. If both policies are disabled, then the server SP will not apply main power to the server at boot time. A brief description of the SP power-on policies and default settings follows:
- Auto Power-On Host On Boot - When this option is enabled, the SP automatically applies main power to the server. When disabled (default), main power is not applied to the server.
- Set Host Power to Last Power State On Boot - When this option is enabled, the SP automatically applies main power to the server based on the last power state of the server. The SP automatically tracks the last power state and restores the server to its last remembered power state following a power state change of at least 10 seconds. When disabled (default), the last power state is not applied to the server.
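The interaction of the two power-on policies can be summarized in a short sketch. The function and parameter names are hypothetical; the logic follows the descriptions above, including the rule that the two policies are mutually exclusive.

```python
def host_powers_on_at_cold_boot(auto_power_on, last_power_state, last_state_was_on):
    """Decide whether the SP applies main power to the host after AC power is applied.

    auto_power_on      -- Auto Power-On Host On Boot policy is enabled
    last_power_state   -- Set Host Power to Last Power State On Boot policy is enabled
    last_state_was_on  -- last remembered host power state was "on"
    """
    if auto_power_on and last_power_state:
        # The text states the policies are mutually exclusive.
        raise ValueError("only one SP power-on policy can be enabled at a time")
    if auto_power_on:
        return True
    if last_power_state:
        return last_state_was_on
    # Both policies disabled (the default): main power is not applied.
    return False
```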
You can configure SP power-on policies using the Oracle ILOM web interface or the Oracle ILOM command-line interface (CLI). For instructions, see Configure SP Power Management Policies Using the Web Interface and Configure SP Power Management Policies Using the CLI.
Light Load Efficiency Mode
Light Load Efficiency Mode (LLEM) increases system power efficiency by placing power supply unit 1 (PSU1) in warm-standby mode when the system is lightly loaded. LLEM is disabled by default on the Sun Fire X4470 Server.
When PSU1 is in warm-standby mode, PSU0 carries the entire power load. If PSU0 loses AC power or is extracted for replacement, PSU1 takes over the load automatically.
Note - In rare instances, an internal failure might cause PSU0 to lose power faster than PSU1 can take over the load.
Disabling LLEM forces the PSUs to share the power load at all times, causing reduced efficiency during light power loads.
You can configure LLEM using the Oracle ILOM web interface or the Oracle ILOM command-line interface (CLI). For instructions, see Configure SP Power Management Policies Using the Web Interface and Configure SP Power Management Policies Using the CLI.
Low Line AC Override Mode Policy
The Low Line AC Override Mode policy setting is provided to enable special test scenarios of a 4-CPU system using low-line (110 volt) power. Low-line voltage is normally supported only in 2-CPU system configurations. The capacity of each power supply unit (PSU) is roughly 1000 watts at low line. Since the power of a 4-CPU system can exceed 1000 watts by a large amount, enabling this setting results in a loss of PSU redundancy. This setting is disabled by default on the Sun Fire X4470 Server.
Note - The server is rated to have a maximum AC input current of 12 amps (with one or both PSUs working). When the Low Line AC Override policy is enabled, a 4-CPU system can require more than 12 amps total current for both PSUs. In any case, each AC inlet will not exceed 12 amps.
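The loss of redundancy at low line is simple arithmetic, sketched below. The 1000-watt figure is the approximate per-PSU capacity given above; the function name and the example wattages in the usage note are hypothetical.

```python
LOW_LINE_PSU_CAPACITY_WATTS = 1000  # "roughly 1000 watts at low line", per the text


def has_psu_redundancy(system_watts, psu_count=2,
                       psu_capacity_watts=LOW_LINE_PSU_CAPACITY_WATTS):
    """Return True if the load can still be carried after losing one PSU."""
    return system_watts <= (psu_count - 1) * psu_capacity_watts
```

For example, a hypothetical 800-watt load remains redundant with two PSUs, whereas a hypothetical 1400-watt load needs both PSUs and therefore has no redundancy.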
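You can configure the Low Line AC Override Mode policy setting using the Oracle ILOM web interface or the Oracle ILOM command-line interface (CLI). For instructions, see Configure SP Power Management Policies Using the Web Interface and Configure SP Power Management Policies Using the CLI.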
Configure SP Power Management Policies Using the Web Interface
1. Log in to Oracle ILOM using the web interface.
2. Select Configuration --> Policy.
The Policy Configuration page appears.
3. Depending on the SP policy you want to configure, do the following:
- To configure Auto power-on host on boot, select its radio button, then click the Actions drop-down menu and select Enable or Disable.
- To configure Set host power to last power state on boot, select its radio button, then click the Actions drop-down menu and select Enable or Disable.
- To configure Set Light Load Efficiency Mode Policy, select its radio button, then click the Actions drop-down menu and select Enable or Disable.
- To configure Set Low Line AC Override Mode Policy, select its radio button, then click the Actions drop-down menu and select Enable or Disable.
4. Click OK to enable or disable the SP policy.
Configure SP Power Management Policies Using the CLI
1. Log in to Oracle ILOM using the CLI.
2. To show the current power policy settings, type:
-> show /SP/policy
The SP policy properties appear. For example:
/SP/policy
Targets:
Properties:
HOST_AUTO_POWER_ON = disabled
HOST_LAST_POWER_STATE = disabled
LIGHT_LOAD_EFFICIENCY_MODE = enabled
LOW_LINE_AC_OVERRIDE_MODE = disabled
Commands:
cd
set
show
->
In the above output, Host Auto Power On is disabled, Host Last Power State is disabled, Light Load Efficiency Mode is enabled, and Low Line AC Override Mode is disabled.
3. Depending on the SP policy you want to configure, do the following:
- To enable or disable Host Auto Power On, type:
-> set /SP/policy/ HOST_AUTO_POWER_ON=[enabled|disabled]
- To enable or disable Host Last Power State, type:
-> set /SP/policy/ HOST_LAST_POWER_STATE=[enabled|disabled]
- To enable or disable Light Load Efficiency Mode, type:
-> set /SP/policy/ LIGHT_LOAD_EFFICIENCY_MODE=[enabled|disabled]
- To enable or disable Low Line AC Override Mode, type:
-> set /SP/policy/ LOW_LINE_AC_OVERRIDE_MODE=[enabled|disabled]
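If you capture the output of show /SP/policy in a script, the properties can be read and the matching set command composed as in the following sketch. The helper names are hypothetical, and the sample text reproduces the example output shown in this procedure. The sketch only builds command strings; it does not connect to Oracle ILOM.

```python
SAMPLE_OUTPUT = """\
/SP/policy
Targets:
Properties:
HOST_AUTO_POWER_ON = disabled
HOST_LAST_POWER_STATE = disabled
LIGHT_LOAD_EFFICIENCY_MODE = enabled
LOW_LINE_AC_OVERRIDE_MODE = disabled
Commands:
cd
set
show
"""

POLICY_NAMES = (
    "HOST_AUTO_POWER_ON",
    "HOST_LAST_POWER_STATE",
    "LIGHT_LOAD_EFFICIENCY_MODE",
    "LOW_LINE_AC_OVERRIDE_MODE",
)


def parse_properties(show_output):
    """Collect 'NAME = value' lines from captured show output into a dict."""
    properties = {}
    for line in show_output.splitlines():
        name, separator, value = line.partition(" = ")
        if separator:
            properties[name.strip()] = value.strip()
    return properties


def build_set_command(policy, enabled):
    """Compose the CLI command string for one of the four SP policies."""
    if policy not in POLICY_NAMES:
        raise ValueError("unknown SP policy: " + policy)
    state = "enabled" if enabled else "disabled"
    return "set /SP/policy/ {}={}".format(policy, state)
```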
Oracle ILOM Sideband Management
By default, you connect to the server’s service processor (SP) using the out-of-band network management port (NET MGT). The Oracle ILOM sideband management feature enables you to select either the NET MGT port or one of the server’s Gigabit Ethernet ports (NET 0, 1, 2, 3), which are in-band ports, to send and receive Oracle ILOM commands to and from the server SP. In-band ports are also called sideband ports.
The advantage of using a sideband management port to manage the server’s SP is that one fewer cable connection and one fewer network switch port are needed. In configurations where numerous servers are being managed, such as data centers, sideband management can represent significant savings in hardware and network utilization.
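You can configure sideband management using the web interface, the command-line interface (CLI), the BIOS, or IPMI. For special considerations and configuration instructions, see the following sections:
- Special Considerations for Sideband Management
- Configure Sideband Management Using the Web Interface
- Configure Sideband Management Using the CLI
- Configure Sideband Management Using the Host BIOS Setup Utility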
Special Considerations for Sideband Management
When sideband management is enabled in Oracle ILOM, the following conditions might occur:
- Connectivity to the server SP might be lost when the SP management port configuration is changed while you are connected to the SP using a network connection, such as SSH, web, or Oracle ILOM Remote Console.
- In-chip connectivity between the SP and the host operating system might not be supported by the on-board host Gigabit Ethernet controller. If this condition occurs, use a different port or route to transmit traffic between the source and destination targets instead of using L2 bridging/switching.
- Server host power cycles might cause a brief interruption of network connectivity for server Gigabit Ethernet ports (NET 0, 1, 2, 3) that are configured for sideband management. If this condition occurs, configure the adjacent switch/bridge ports as host ports.
Note - If the ports are configured as switch ports and participate in the Spanning Tree Protocol (STP), you might experience longer outages due to spanning tree recalculation.
Configure Sideband Management Using the Web Interface
1. Log in to Oracle ILOM using the web interface.
2. Select Configuration --> Network.
The Network Settings page appears.
3. In the Network Settings page, do the following:
a. Configure a static IP address or select the appropriate options to acquire an IP address automatically.
b. To select a sideband management port, click the Management Port drop-down list and select the desired management port.
The drop-down list enables you to change to any one of the four Gigabit Ethernet ports, /SYS/MB/NETn, where n is 0 to 3. The SP NET MGT port, /SYS/SP/NET0, is the default.
c. Click Save for the changes to take effect.
Configure Sideband Management Using the CLI
1. Log in to Oracle ILOM using the CLI.
Note - Using a serial connection for this procedure eliminates the possibility of losing connectivity during sideband management configuration changes.
2. If you logged in using the serial port, you can assign a static IP address.
For instructions, see the information about assigning an IP address in the Sun Fire X4470 Server Installation Guide.
3. To show the current port settings, type:
-> show /SP/network
The network properties appear. For example:
/SP/network
Targets:
Properties:
commitpending = (Cannot show property)
dhcp_server_ip = none
ipaddress = xx.xx.xx.xx
ipdiscovery = static
ipgateway = xx.xx.xx.xx
ipnetmask = xx.xx.xx.xx
macaddress = 11.11.11.11.11.86
managementport = /SYS/SP/NET0
outofbandmacaddress = 11.11.11.11.11.86
pendingipaddress = xx.xx.xx.xx
pendingipdiscovery = static
pendingipgateway = xx.xx.xx.xx
pendingipnetmask = xx.xx.xx.xx
pendingmanagementport = /SYS/SP/NET0
sidebandmacaddress = 11.11.11.11.11.87
state = enabled
In the above output, the current active macaddress is the same as the SP’s outofbandmacaddress, and the current active managementport is set to the default (/SYS/SP/NET0).
4. To set the SP management port to a sideband port, type the following commands:
-> set /SP/network pendingmanagementport=/SYS/MB/NETn
-> set /SP/network commitpending=true
Where n equals 0, 1, 2, or 3.
5. To view the change, type:
-> show /SP/network
The network properties appear and show that the change has taken effect. For example:
/SP/network
Targets:
Properties:
commitpending = (Cannot show property)
dhcp_server_ip = none
ipaddress = xx.xx.xx.xx
ipdiscovery = static
ipgateway = xx.xx.xx.xx
ipnetmask = xx.xx.xx.xx
macaddress = 11.11.11.11.11.87
managementport = /SYS/MB/NETn
outofbandmacaddress = 11.11.11.11.11.86
pendingipaddress = xx.xx.xx.xx
pendingipdiscovery = static
pendingipgateway = xx.xx.xx.xx
pendingipnetmask = xx.xx.xx.xx
pendingmanagementport = /SYS/MB/NETn
sidebandmacaddress = 11.11.11.11.11.87
state = enabled
In the above output, the macaddress matches the sidebandmacaddress, and the managementport matches the pendingmanagementport.
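The check described in the last step can be expressed as a short sketch that works on captured show /SP/network output. The helper names are hypothetical; the property names are those shown in the example output of this procedure.

```python
def parse_properties(show_output):
    """Collect 'name = value' lines from captured show output into a dict."""
    properties = {}
    for line in show_output.splitlines():
        name, separator, value = line.partition(" = ")
        if separator:
            properties[name.strip()] = value.strip()
    return properties


def sideband_active(properties):
    """Return True if the SP is using a host Ethernet port for management.

    Mirrors the verification in the text: the active MAC address equals the
    sideband MAC address, and the active management port equals the pending
    one and names a host port (/SYS/MB/NETn).
    """
    port = properties.get("managementport", "")
    return (
        port.startswith("/SYS/MB/NET")
        and port == properties.get("pendingmanagementport")
        and properties.get("macaddress") == properties.get("sidebandmacaddress")
    )
```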
Configure Sideband Management Using the Host BIOS Setup Utility
You can access the BIOS Setup Utility screens from the following interfaces:
- Use a USB keyboard, mouse, and VGA monitor connected directly to the server.
- Use a terminal (or terminal emulator connected to a computer) through the serial port on the back panel of the server.
- Connect to the server using the Oracle ILOM Remote Console. To use this interface, you must know the IP address of the server. For instructions on viewing the server IP address, see the Sun Fire X4470 Server Installation Guide.
To configure sideband management using the host BIOS Setup Utility, perform the following steps:
1. Power on or power cycle the server.
2. To enter the BIOS Setup Utility, press the F2 key while the system is performing the power-on self-test (POST).
When BIOS is started, the main BIOS Setup Utility top-level screen appears. This screen provides seven menu options across the top of the screen.
3. In the main screen, select Advanced --> IPMI 2.0 Configuration.
The IPMI 2.0 Configuration screen appears.
4. In the IPMI 2.0 Configuration screen, select the Set LAN Configuration option.
The LAN Configuration screen appears.
5. In the LAN Configuration screen, do the following:
a. Use the left and right arrow keys to select the IP Assignment option and set it to DHCP to acquire the IP address automatically, or set it to Static if manually specifying the IP address.
b. Use the left and right arrow keys to select the Active Management Port option and set the port to a sideband management port (NET0, NET1, NET2, NET3).
The NET MGT port is the default.
c. Select Commit for the change to take effect.
Switch Serial Port Output Between SP and Host Console
You can switch the serial port output of the Sun Fire X4470 Server between the SP console (SER MGT) and the host console (COM1). By default, the SP console is connected to the system serial port. This feature is beneficial for Windows kernel debugging, as it enables you to view non-ASCII character traffic from the host console.
You can switch serial port output using the Oracle ILOM web interface or the Oracle ILOM command-line interface (CLI). For instructions, see Switch Serial Port Output Using the Web Interface and Switch Serial Port Output Using the CLI.
Caution - You should set up the network on the SP before attempting to switch the serial port owner to the host server. If a network is not set up, and you switch the serial port owner to the host server, you will be unable to connect using the CLI or web interface to change the serial port owner back to the SP. To change the serial port owner back to the SP, you must use the Oracle ILOM Preboot Menu to restore access to the serial port over the network. For more information, see the Oracle ILOM Preboot Menu information in the Sun Fire X4470 Server Service Manual.
Switch Serial Port Output Using the Web Interface
1. Log in to Oracle ILOM using the web interface.
2. Select Configuration --> Serial Port.
The Serial Port Settings page appears.
3. To select a serial port owner, click the Owner drop-down list and select the desired serial port owner.
The drop-down list enables you to select either Service Processor or Host Server.
By default, Service Processor is selected.
4. Click Save for your change to take effect.
Switch Serial Port Output Using the CLI
1. Log in to Oracle ILOM using the CLI.
2. To set the serial port owner, type:
-> set /SP/serial/portsharing/ owner=host
By default, owner=SP.
Server Chassis Intrusion Sensor
The /SYS/INTSW sensor is asserted when the server’s top cover is removed while power is being applied to the server. Because this is an improper service action, the sensor serves to alert you to any unauthorized or inadvertent removal of the server’s cover. The sensor therefore gives system administrators confidence that the physical integrity of the server has not been violated, which is particularly beneficial when the server is in a remote or uncontrolled location.
Note - The server cannot be powered on when the server top cover is off and the /SYS/INTSW sensor is asserted. If the server’s top cover is removed while the server is powered-on, the host will immediately employ a non-graceful shutdown to power off the server.
How the /SYS/INTSW Sensor Works
The /SYS/INTSW sensor is asserted when the chassis intrusion switch trips while the server is powered-on. If the AC power cords are connected to the server, power is being applied to the server. Even when you shut down the server’s host, power is still being applied to the server. The only way to remove power from the server completely is to disconnect the server’s AC power cords.
The chassis intrusion switch will trip if the server’s cover is removed, the switch itself is misaligned, or the cover is not properly seated. This sensor is deasserted when the integrity of the server’s chassis is restored, that is, when the removed cover is properly reinstalled, returning the chassis intrusion switch to its closed state.
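The assertion conditions described above reduce to a small truth function, sketched here with hypothetical names. It assumes, as the text states, that connected AC power cords mean power is being applied to the server.

```python
def intsw_asserted(ac_cords_connected, cover_seated, switch_aligned=True):
    """Return True when /SYS/INTSW would be asserted under the rules in the text.

    The intrusion switch trips if the cover is removed or not properly seated,
    or if the switch itself is misaligned. The sensor asserts only while power
    is being applied, that is, while the AC power cords are connected.
    """
    switch_tripped = not (cover_seated and switch_aligned)
    return ac_cords_connected and switch_tripped
```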
Caution - Removing the server’s top cover while the power cord is connected to the system is not an authorized service action. Proper service action requires that host and SP shutdown operations be observed and that the power cords be disconnected from the system before the cover is opened. If proper service actions are taken, you should not see the /SYS/INTSW sensor asserted unless there are other issues, such as a misaligned chassis intrusion switch.
Fault Management
When a server component fails, error telemetry is either captured via the BIOS or is monitored by the Oracle ILOM SP. Oracle ILOM consumes error telemetry from both sources and provides diagnosis in the form of a fault event. The fault event is stored in the Oracle ILOM event log as a fault message. You can use either the Oracle ILOM web interface or the command-line interface (CLI) to manually clear faults.
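This section includes the following topics. The first four topics describe how to examine and clear faults, while the last topic provides reference information for sensors and indicators.
- Determining Faults
- Clearing Faults
- Components With No Fault Diagnosis
- Viewing Sensors Using IPMItool
- Sensors and Indicators Reference Information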
Determining Faults
When a system fault occurs, you can view system indicators and use the Oracle ILOM CLI or web interface to determine the fault:
- LEDs - The Service Required LED will always be illuminated, and the component or subsystem-specific Service LED will be illuminated when applicable.
- Oracle ILOM CLI - Examine fault messages in the Oracle ILOM event log or see a fault summary.
For example:
- To view the Oracle ILOM event log, log in to the Oracle ILOM CLI and type:
-> show /SP/logs/event/list
- To view a fault summary, log in to the Oracle ILOM CLI and type:
-> show /SP/faultmgmt
- Oracle ILOM web interface - Examine fault messages in the Oracle ILOM event log or see a fault summary.
For example:
- To view the Oracle ILOM event log, log in to the Oracle ILOM web interface and select:
System Monitoring --> Event Logs
- To view a fault summary, log in to the Oracle ILOM web interface and select:
System Information --> Fault Management
Clearing Faults
The procedure for clearing a fault differs depending on the type of component.
1. Customer-replaceable units (CRUs) that are hot-swappable and are monitored by the SP will have their faults cleared automatically when the failed component is replaced and the updated status is reported as deasserted.
2. CRUs and field-replaceable units (FRUs) that have a FRUID container with identity information will have their faults cleared automatically when the failed component is replaced, as the SP is able to determine when a component is no longer present in the system.
3. CRUs and FRUs that are not hot-swappable or lack a FRUID container with identity information will not have their faults cleared automatically.
You can use the Oracle ILOM web interface or the command-line interface (CLI) to manually clear faults. For information on how to use the Oracle ILOM web interface or the CLI to clear server faults, see the Oracle ILOM 3.0 Documentation Collection at:
http://www.oracle.com/pls/topic/lookup?ctx=E19860-01&id=homepage
The following types of faults are diagnosed by the Oracle ILOM SP:
- Environmental events - Fan modules, power supplies, ambient temperature, AC power loss, and chassis intrusion switch
- Memory Reference Code (MRC) errors and warnings - Memory initialization and population
- I/O Hub (IOH) uncorrectable error events - Motherboard
- Memory ECC uncorrectable and correctable events - Memory DIMMs
- CPU uncorrectable error events - Processor
- Boot progress events - Power-on, power-off, IPMI, MRC, QPI, BIOS, setup, and boot retries
- Service Processor error events - Oracle ILOM
TABLE 2-3 lists the server component faults that are persistent after a system cold boot and the action to clear the fault.
TABLE 2-3 Component Fault Events
Component | Action to Clear the Fault
Motherboard | Fault is automatically cleared upon component replacement
Memory riser | Fault is automatically cleared upon component replacement
Fan board | Fault is automatically cleared upon component replacement
DDR3 Memory DIMMs | Fault is automatically cleared upon component replacement
CPU module | Clear fault manually after component replacement
PCIe cards | Clear fault manually after component replacement
Fan module | Fault is automatically cleared when the sensor status is OK
Power supply | Fault is automatically cleared when the sensor status is OK
Disk drive | Fault is automatically cleared when the sensor status is OK
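For scripted service checklists, the contents of TABLE 2-3 can be held in a lookup such as the following sketch. The constant and function names are hypothetical; the component names and actions are transcribed from the table.

```python
AUTO_ON_REPLACEMENT = "cleared automatically upon component replacement"
AUTO_ON_SENSOR_OK = "cleared automatically when the sensor status is OK"
MANUAL = "clear manually after component replacement"

# Component names and actions transcribed from TABLE 2-3.
CLEAR_ACTION = {
    "Motherboard": AUTO_ON_REPLACEMENT,
    "Memory riser": AUTO_ON_REPLACEMENT,
    "Fan board": AUTO_ON_REPLACEMENT,
    "DDR3 Memory DIMMs": AUTO_ON_REPLACEMENT,
    "CPU module": MANUAL,
    "PCIe cards": MANUAL,
    "Fan module": AUTO_ON_SENSOR_OK,
    "Power supply": AUTO_ON_SENSOR_OK,
    "Disk drive": AUTO_ON_SENSOR_OK,
}


def needs_manual_clear(component):
    """Return True if the fault must be cleared manually after replacement."""
    return CLEAR_ACTION[component] == MANUAL
```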
In addition to the above faults, the following fault does not require replacement of a faulty part; however, user action is needed to clear it:
fault.security.integrity-compromised@/sys/sp
This fault is generated when the server’s top cover is removed while the AC power cords are still connected to the power supply, that is, power is not completely removed from the server. To clear this fault, replace the server’s top cover and either reboot the server’s SP or remove the AC power cords, and then reconnect the power cords.
Components With No Fault Diagnosis
Certain Sun Fire X4470 Server components do not provide a mechanism to diagnose faults. These include:
- Disk backplane
- DVD player
- Disk drive
- Power supply backplane
- Lithium battery for host and SP real-time clocks
Viewing Sensors Using IPMItool
Sun Fire X4470 Server sensors can be viewed using IPMItool. For information and instructions for viewing sensors using IPMItool, see the Oracle Integrated Lights Out Manager (Oracle ILOM) 3.0 Management Protocols Reference Guide.
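As a rough illustration only, the following sketch composes an IPMItool command line for listing sensor data records over the network. The -I lanplus, -H, and -U options and the sdr list subcommand are standard IPMItool usage; the function name and the example address are hypothetical, and the referenced guide remains the source for the supported procedure.

```python
def ipmitool_sensor_command(host, user, subcommand=("sdr", "list")):
    """Build an argument list for querying sensors on a remote SP with IPMItool.

    host and user are supplied by the caller. No password option is added
    here, so IPMItool is expected to ask for one when the command is run.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, *subcommand]
```

The returned list can be passed to a process launcher such as subprocess.run on a host where IPMItool is installed, for example with the placeholder address 192.0.2.10 and the SP user name.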
Sensors and Indicators Reference Information
The server includes several sensors and indicators that report on hardware conditions. Many of the sensor readings are used to adjust the fan speeds and perform other actions, such as illuminating LEDs and powering off the server.
This section describes the sensors and indicators that Oracle ILOM monitors for the Sun Fire X4470 Server.
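The following components, indicators, and types of sensors are described:
- System Components
- System Indicators
- Temperature Sensors
- Power Supply Fault Sensors
- Fan Speed and Physical Security Sensors
- Power Supply Unit Current, Voltage, and Power Sensors
- Entity Presence Sensors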
Note - For information about how to obtain sensor readings or to determine the state of system indicators in Oracle ILOM, see the Oracle Integrated Lights Out Manager (ILOM) 3.0 CLI Procedures Guide and the Oracle Integrated Lights Out Manager (ILOM) 3.0 Web Interface Procedures Guide.
System Components
TABLE 2-4 describes the system components.
TABLE 2-4 System Components
Component Name | Description
/SYS/DBP | Disk backplane
/SYS/DBP/HDDn | Hard disk n
/SYS/FB | Fan board
/SYS/FB/FANn | Fan n
/SYS/MB | Motherboard
/SYS/MB/NETn | Host network interface n
/SYS/MB/Pn | Processor n
/SYS/MB/Pn/MRn | Processor n; Memory riser n
/SYS/MB/Pn/MRn/Dn | Processor n; Memory riser n; DIMM n
/SYS/MB/PCIE[n, CC] | PCIe slot n, or cluster card
/SYS/PSn | Power supply n
/SYS/SP | Service processor
/SYS/SP/NETn | SP network interface n
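The component names in TABLE 2-4 follow a regular path pattern, so a DIMM path can be split into its processor, memory riser, and DIMM numbers. The sketch below uses a hypothetical helper name and assumes the n placeholders are decimal numbers.

```python
import re

# Pattern for /SYS/MB/Pn/MRn/Dn, where each n is assumed to be a decimal number.
DIMM_PATH = re.compile(r"^/SYS/MB/P(\d+)/MR(\d+)/D(\d+)$")


def parse_dimm_path(path):
    """Return the processor, riser, and DIMM numbers in a DIMM path, or None."""
    match = DIMM_PATH.match(path)
    if match is None:
        return None
    processor, riser, dimm = (int(group) for group in match.groups())
    return {"processor": processor, "riser": riser, "dimm": dimm}
```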
System Indicators
TABLE 2-5 describes the system indicators.
TABLE 2-5 System Indicators
Indicator Name | Description
/SYS/CPU_FAULT | System CPU Fault LED
/SYS/DBP/HDDn/OK2RM | Hard disk n OK-to-Remove LED
/SYS/DBP/HDDn/SERVICE | Hard disk n Service LED
/SYS/FAN_FAULT | System fan Fault LED
/SYS/FB/FANn/OK | Fan n OK LED
/SYS/FB/FANn/SERVICE | Fan n Service LED
/SYS/LOCATE | System Locate indicator LED
/SYS/MB/Pn/SERVICE | Processor n Service LED
/SYS/MB/Pn/MRn/SERVICE | Processor n; Memory riser n Service LED
/SYS/MB/Pn/MRn/Dn/SERVICE | Processor n; Memory riser n; DIMM n; Service indicator
/SYS/MEMORY_FAULT | System memory Fault LED
/SYS/OK | System OK LED
/SYS/PS_FAULT | System power supply Fault LED
/SYS/SERVICE | System Service LED
/SYS/SP/OK | SP OK LED
/SYS/SP/SERVICE | SP Service LED
/SYS/TEMP_FAULT | System temperature Fault LED
Temperature Sensors
TABLE 2-6 describes the temperature sensors.
TABLE 2-6 Temperature Sensors
Sensor Name | Sensor Type | Description
/SYS/DBP/T_AMB | Temperature | Disk backplane ambient temperature sensor
/SYS/MB/T_OUTn | Temperature | Motherboard exhaust temperature n sensor. Note - These sensors are located in the rear of the chassis.
/SYS/T_AMB | Temperature | System ambient temperature sensor. Note - This sensor is located on the underside of the fan board.
/SYS/PSn/T_OUT | Temperature | Power supply n exhaust temperature sensor
Power Supply Fault Sensors
TABLE 2-7 describes the power supply fault sensors. In the table, n designates the numbers 0-1.
TABLE 2-7 Power Supply Sensors
Sensor Name | Sensor Type | Description
/SYS/PSn/V_OUT_OK | Fault | Power supply n output voltage OK
/SYS/PSn/V_IN_ERR | Fault | Power supply n input voltage error
/SYS/PSn/V_IN_WARN | Fault | Power supply n input voltage warning
/SYS/PSn/V_OUT_ERR | Fault | Power supply n output voltage error
/SYS/PSn/I_OUT_ERR | Fault | Power supply n output current error
/SYS/PSn/I_OUT_WARN | Fault | Power supply n output current warning
/SYS/PSn/T_ERR | Fault | Power supply n temperature error
/SYS/PSn/T_WARN | Fault | Power supply n temperature warning
/SYS/PSn/FAN_ERR | Fault | Power supply n fan error
/SYS/PSn/FAN_WARN | Fault | Power supply n fan warning
/SYS/PSn/ERR | Fault | Power supply n error
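Because n in TABLE 2-7 designates the numbers 0-1, the full set of power supply fault sensor names can be generated rather than typed by hand. The following sketch uses hypothetical names; the sensor suffixes are transcribed from the table.

```python
# Sensor name suffixes transcribed from TABLE 2-7.
PSU_FAULT_SUFFIXES = (
    "V_OUT_OK", "V_IN_ERR", "V_IN_WARN", "V_OUT_ERR",
    "I_OUT_ERR", "I_OUT_WARN", "T_ERR", "T_WARN",
    "FAN_ERR", "FAN_WARN", "ERR",
)


def psu_fault_sensor_names(psu_numbers=(0, 1)):
    """Expand /SYS/PSn/<suffix> for each power supply number n."""
    return [
        "/SYS/PS{}/{}".format(n, suffix)
        for n in psu_numbers
        for suffix in PSU_FAULT_SUFFIXES
    ]
```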
Fan Speed and Physical Security Sensors
TABLE 2-8 describes the fan and security sensors. In the table, n designates numbers 0, 1, 2, etc.
TABLE 2-8 Fan and Security Sensors
Sensor Name | Sensor Type | Description
/SYS/FB/FANn/TACH | Fan speed | Fan board; Fan n tachometer
/SYS/INTSW | Physical security | This sensor tracks the state of the chassis intrusion switch. If the server’s top cover is opened while the AC power cords are still connected so that power is being applied to the server, this sensor asserts. If the top cover is subsequently replaced, this sensor is de-asserted. For more information, see Server Chassis Intrusion Sensor.
Power Supply Unit Current, Voltage, and Power Sensors
TABLE 2-9 describes the power supply unit current, voltage, and power sensors. In the table, n designates numbers 0-1.
TABLE 2-9 Power Supply Unit Current, Voltage, and Power Sensors
Sensor Name | Sensor Type | Description
/SYS/PSn/V_IN | Voltage | Power supply n AC input voltage sensor
/SYS/PSn/V_12V | Voltage | Power supply n 12 volt output sensor
/SYS/PSn/V_3V3 | Voltage | Power supply n 3.3 volt output sensor
/SYS/PSn/P_IN | Power | Power supply n input power sensor
/SYS/PSn/P_OUT | Power | Power supply n output power sensor
/SYS/VPS | Power | Server total input power consumption sensor
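As a minimal sketch of how the n placeholder in TABLE 2-9 expands into concrete sensor names, the following Python snippet substitutes the power supply indexes 0 and 1 stated above. The names `PSU_SENSOR_TEMPLATES` and `expand_psu_sensors` are hypothetical helpers for illustration only; they are not part of Oracle ILOM.

```python
# Sensor name templates transcribed from TABLE 2-9; "n" is the power supply index.
PSU_SENSOR_TEMPLATES = [
    "/SYS/PSn/V_IN",
    "/SYS/PSn/V_12V",
    "/SYS/PSn/V_3V3",
    "/SYS/PSn/P_IN",
    "/SYS/PSn/P_OUT",
]


def expand_psu_sensors(templates, indices=(0, 1)):
    """Replace the 'n' in the '/PSn/' path segment with each power supply index."""
    return [t.replace("/PSn/", f"/PS{i}/") for i in indices for t in templates]


sensors = expand_psu_sensors(PSU_SENSOR_TEMPLATES)
sensors.append("/SYS/VPS")  # server-wide sensor, so it carries no index

for name in sensors:
    print(name)
```

Replacing only the `/PSn/` path segment, rather than every letter n, keeps names such as `V_IN` intact.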
Entity Presence Sensors
TABLE 2-10 describes the entity presence sensors. In the table, n designates numbers 0, 1, 2, etc.
TABLE 2-10 Presence Sensors
Sensor Name | Sensor Type | Description
/SYS/DBP/HDDn/PRSNT | Entity presence | Hard drive device present monitor
/SYS/DBP/PRSNT | Entity presence | Disk backplane present monitor
/SYS/FB/FANn/PRSNT | Entity presence | Fan board; Fan n present monitor
/SYS/MB/Pn/PRSNT | Entity presence | Motherboard; CPU n present monitor
/SYS/MB/Pn/MRn/PRSNT | Entity presence | Motherboard; CPU n; Memory riser n present monitor
/SYS/MB/Pn/MRn/Dn/PRSNT | Entity presence | Motherboard; CPU n; Memory riser n; DIMM n present monitor
/SYS/MB/PCIEn/PRSNT | Entity presence | PCIe card n present monitor. Note - n represents PCIe cards 0-9 or the cluster controller (cc) card.
/SYS/PSn/PRSNT | Entity presence | Power supply n present monitor
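The presence sensor names in TABLE 2-10 use one n placeholder per path level, and each placeholder is an independent index. As an illustrative sketch (not an Oracle ILOM interface), the Python snippet below converts each template into a regular expression and reports which template a concrete sensor path matches, together with its indexes. The function names are hypothetical, and the sketch assumes that a path segment ending in a lowercase n is a placeholder, which holds for the templates listed here. Numeric indexes only are handled; the cluster controller card mentioned in the note is not covered.

```python
import re

# Templates transcribed from TABLE 2-10.
PRESENCE_TEMPLATES = [
    "/SYS/DBP/HDDn/PRSNT",
    "/SYS/DBP/PRSNT",
    "/SYS/FB/FANn/PRSNT",
    "/SYS/MB/Pn/PRSNT",
    "/SYS/MB/Pn/MRn/PRSNT",
    "/SYS/MB/Pn/MRn/Dn/PRSNT",
    "/SYS/MB/PCIEn/PRSNT",
    "/SYS/PSn/PRSNT",
]


def template_to_regex(template):
    """Turn each path segment that ends in 'n' into '<prefix>(digits)'."""
    parts = []
    for segment in template.split("/"):
        if segment.endswith("n"):
            parts.append(re.escape(segment[:-1]) + r"(\d+)")
        else:
            parts.append(re.escape(segment))
    return re.compile("/".join(parts))


def match_sensor(path, templates=PRESENCE_TEMPLATES):
    """Return (template, indexes) for the first template matching path, else None."""
    for template in templates:
        m = template_to_regex(template).fullmatch(path)
        if m:
            return template, tuple(int(g) for g in m.groups())
    return None


print(match_sensor("/SYS/MB/P0/MR1/D3/PRSNT"))
```

Using `fullmatch` prevents a shorter template such as `/SYS/MB/Pn/PRSNT` from matching a longer path by prefix.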
SNMP and PET Message Reference Information
This section describes Simple Network Management Protocol (SNMP) and Platform Event Trap (PET) messages that are generated by devices being monitored by Oracle ILOM.
SNMP Traps
SNMP traps are generated by the SNMP agents that are installed on the SNMP devices being managed by Oracle ILOM. Oracle ILOM receives the SNMP traps and converts them into SNMP event messages that appear in the event log. For more information about the SNMP event messages that might be generated on your system, see TABLE 2-11.
TABLE 2-11 SNMP Traps and Corresponding Oracle ILOM Events for Sun Fire X4470 Server
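SNMP Trap Message | Oracle ILOM Event Message | Severity and Description | Sensor Name
Memory Events
sunHwTrapComponentFault | fault.memory.intel.boot-setup-init-failed | Major; A component is suspected of causing a fault | /SYS/
sunHwTrapComponentFault | fault.memory.intel.boot-retries-failed | Major; A component is suspected of causing a fault | /SYS/
sunHwTrapComponentFault | fault.memory.intel.dimm.none | Major; A component is suspected of causing a fault | /SYS/MB
sunHwTrapComponentFault | fault.memory.controller.input-invalid | Major; A component is suspected of causing a fault | /SYS/MB
sunHwTrapComponentFault | fault.memory.controller.init-failed | Major; A component is suspected of causing a fault | /SYS/MB
sunHwTrapComponentFaultCleared | fault.memory.intel.boot-setup-init-failed | Informational; A component fault has been cleared | /SYS/
sunHwTrapComponentFaultCleared | fault.memory.intel.boot-retries-failed | Informational; A component fault has been cleared | /SYS/
sunHwTrapComponentFaultCleared | fault.memory.intel.dimm.none | Informational; A component fault has been cleared | /SYS/MB
sunHwTrapComponentFaultCleared | fault.memory.controller.input-invalid | Informational; A component fault has been cleared | /SYS/MB
sunHwTrapComponentFaultCleared | fault.memory.controller.init-failed | Informational; A component fault has been cleared | /SYS/MB
Service Processor Events
sunHwTrapComponentFault | fault.chassis.device.misconfig | Major; A component is suspected of causing a fault | /SYS/SP
sunHwTrapComponentFault | fault.sp.failed | Major; A component is suspected of causing a fault | /SYS/SP
sunHwTrapComponentFaultCleared | fault.chassis.device.misconfig | Informational; A component fault has been cleared | /SYS/SP
sunHwTrapComponentFaultCleared | fault.sp.failed | Informational; A component fault has been cleared | /SYS/SP
Environmental Events
sunHwTrapComponentFault | fault.chassis.env.temp.over-fail | Major; A component is suspected of causing a fault | /SYS/
sunHwTrapComponentFaultCleared | fault.chassis.env.temp.over-fail | Informational; A component fault has been cleared | /SYS/
sunHwTrapTempCritThresholdExceeded | Lower critical threshold exceeded | Major; A temperature sensor has reported that its value has gone above an upper critical threshold setting or below a lower critical threshold setting | /SYS/MB/T_OUT, /SYS/DBP/T_AMB
sunHwTrapTempCritThresholdExceeded | Upper critical threshold exceeded | Major; A temperature sensor has reported that its value has gone above an upper critical threshold setting or below a lower critical threshold setting | /SYS/MB/T_OUT, /SYS/T_AMB, /SYS/DBP/T_AMB
sunHwTrapTempCritThresholdDeasserted | Lower critical threshold no longer exceeded | Informational; A temperature sensor has reported that its value is in the normal operating range | /SYS/MB/T_OUT, /SYS/DBP/T_AMB
sunHwTrapTempCritThresholdDeasserted | Upper critical threshold no longer exceeded | Informational; A temperature sensor has reported that its value is in the normal operating range | /SYS/MB/T_OUT, /SYS/T_AMB, /SYS/DBP/T_AMB
sunHwTrapTempNonCritThresholdExceeded | Upper noncritical threshold exceeded | Minor; A temperature sensor has reported that its value has gone above an upper noncritical threshold setting or below a lower noncritical threshold setting | /SYS/MB/T_OUT, /SYS/DBP/T_AMB
sunHwTrapTempOk | Upper noncritical threshold no longer exceeded | Informational; A temperature sensor has reported that its value is in the normal operating range | /SYS/MB/T_OUT, /SYS/DBP/T_AMB
sunHwTrapTempFatalThresholdExceeded | Lower fatal threshold exceeded | Critical; A temperature sensor has reported that its value has gone above an upper fatal threshold setting or below a lower fatal threshold setting | /SYS/MB/T_OUT, /SYS/DBP/T_AMB
sunHwTrapTempFatalThresholdExceeded | Upper fatal threshold exceeded | Critical; A temperature sensor has reported that its value has gone above an upper fatal threshold setting or below a lower fatal threshold setting | /SYS/MB/T_OUT, /SYS/T_AMB, /SYS/DBP/T_AMB
sunHwTrapTempFatalThresholdDeasserted | Lower fatal threshold no longer exceeded | Informational; A temperature sensor has reported that its value has gone below an upper fatal threshold setting or above a lower fatal threshold setting | /SYS/MB/T_OUT, /SYS/DBP/T_AMB
sunHwTrapTempFatalThresholdDeasserted | Upper fatal threshold no longer exceeded | Informational; A temperature sensor has reported that its value has gone below an upper fatal threshold setting or above a lower fatal threshold setting | /SYS/MB/T_OUT, /SYS/T_AMB, /SYS/DBP/T_AMB
System Power Events
sunHwTrapComponentFault | fault.chassis.power.missing | Major; A component is suspected of causing a fault | /SYS/
sunHwTrapComponentFault | fault.chassis.power.overcurrent | Major; A component is suspected of causing a fault | /SYS/
sunHwTrapComponentFault | fault.chassis.power.inadequate | Major; A component is suspected of causing a fault | /SYS/
sunHwTrapComponentFaultCleared | fault.chassis.power.missing | Informational; A component fault has been cleared | /SYS/
sunHwTrapComponentFaultCleared | fault.chassis.power.overcurrent | Informational; A component fault has been cleared | /SYS/
sunHwTrapComponentFaultCleared | fault.chassis.power.inadequate | Informational; A component fault has been cleared | /SYS/
sunHwTrapPowerSupplyFault | fault.chassis.env.power.loss | Major; A power supply component is suspected of causing a fault | /SYS/PS
sunHwTrapPowerSupplyFault | fault.chassis.power.ac-low-line | Major; A power supply component is suspected of causing a fault | /SYS/PS
sunHwTrapPowerSupplyFault | fault.chassis.device.wrong | Major; A power supply component is suspected of causing a fault | /SYS/PS
sunHwTrapPowerSupplyFaultCleared | fault.chassis.env.power.loss | Informational; A power supply component fault has been cleared | /SYS/PS
sunHwTrapPowerSupplyFaultCleared | fault.chassis.power.ac-low-line | Informational; A power supply component fault has been cleared | /SYS/PS
sunHwTrapPowerSupplyFaultCleared | fault.chassis.device.wrong | Informational; A power supply component fault has been cleared | /SYS/PS
sunHwTrapPowerSupplyError | Assert | Major; A power supply sensor has detected an error | /SYS/PWRBS, /SYS/PSn/V_IN_ERR, /SYS/PSn/V_IN_WARN, /SYS/PSn/V_OUT_ERR, /SYS/PSn/I_OUT_ERR, /SYS/PSn/I_OUT_WARN, /SYS/PSn/T_ERR, /SYS/PSn/T_WARN, /SYS/PSn/FAN_ERR, /SYS/PSn/FAN_WARN, /SYS/PSn/ERR
sunHwTrapPowerSupplyError | Deassert | Major; A power supply sensor has detected an error | /SYS/PSn/V_OUT_OK
sunHwTrapPowerSupplyOk | Deassert | Informational; A power supply sensor has returned to its normal state | /SYS/PWRBS, /SYS/PSn/V_IN_ERR, /SYS/PSn/V_IN_WARN, /SYS/PSn/V_OUT_ERR, /SYS/PSn/I_OUT_ERR, /SYS/PSn/I_OUT_WARN, /SYS/PSn/T_ERR, /SYS/PSn/T_WARN, /SYS/PSn/FAN_ERR, /SYS/PSn/FAN_WARN, /SYS/PSn/ERR
sunHwTrapPowerSupplyOk | Assert | Informational; A power supply sensor has returned to its normal state | /SYS/PSn/V_OUT_OK
sunHwTrapComponentError | ACPI_ON_WORKING ASSERT | Major; A sensor has detected an error | /SYS/ACPI
sunHwTrapComponentError | ACPI_ON_WORKING DEASSERT | Major; A sensor has detected an error | /SYS/ACPI
sunHwTrapComponentError | ACPI_SOFT_OFF ASSERT | Major; A sensor has detected an error | /SYS/ACPI
sunHwTrapComponentError | ACPI_SOFT_OFF DEASSERT | Major; A sensor has detected an error | /SYS/ACPI
Entity Presence Events
UNKNOWN | ENTITY_PRESENT ASSERT | Informational | /SYS/MB/Pn/PRSNT, /SYS/MB/Pn/MRn/PRSNT, /SYS/MB/PCIEn/PRSNT, /SYS/MB/PCIE_CC/PRSNT
UNKNOWN | ENTITY_PRESENT DEASSERT | Informational | /SYS/MB/Pn/PRSNT, /SYS/MB/Pn/MRn/PRSNT, /SYS/MB/PCIEn/PRSNT, /SYS/MB/PCIE_CC/PRSNT
UNKNOWN | ENTITY_ABSENT ASSERT | Informational | /SYS/MB/Pn/PRSNT, /SYS/MB/Pn/MRn/PRSNT, /SYS/MB/PCIEn/PRSNT, /SYS/MB/PCIE_CC/PRSNT
UNKNOWN | ENTITY_ABSENT DEASSERT | Informational | /SYS/MB/Pn/PRSNT, /SYS/MB/Pn/MRn/PRSNT, /SYS/MB/PCIEn/PRSNT, /SYS/MB/PCIE_CC/PRSNT
UNKNOWN | ENTITY_DISABLED ASSERT | Informational | /SYS/MB/Pn/PRSNT, /SYS/MB/Pn/MRn/PRSNT, /SYS/MB/PCIEn/PRSNT, /SYS/MB/PCIE_CC/PRSNT
UNKNOWN | ENTITY_DISABLED DEASSERT | Informational | /SYS/MB/Pn/PRSNT, /SYS/MB/Pn/MRn/PRSNT, /SYS/MB/PCIEn/PRSNT, /SYS/MB/PCIE_CC/PRSNT
Fans, Hard Drives, and Physical Security Events
sunHwTrapComponentFault | fault.chassis.device.fan.column-fail | Major; A component is suspected of causing a fault | /SYS
sunHwTrapComponentFault | fault.security.enclosure-open | Major; A component is suspected of causing a fault | /SYS
sunHwTrapComponentFaultCleared | fault.chassis.device.fan.column-fail | Informational; A component fault has been cleared | /SYS/
sunHwTrapComponentFaultCleared | fault.security.enclosure-open | Informational; A component fault has been cleared | /SYS/
UNKNOWN | Assert | Informational | /SYS/MB/PCIEn/WIDTH, /SYS/ESMR/ESM/FAULT
UNKNOWN | Deassert | Informational | /SYS/MB/PCIEn/WIDTH, /SYS/ESMR/ESM/FAULT
sunHwTrapSecurityIntrusion | CHASSIS_INTRUSION ASSERT | Major; An intrusion sensor has detected that someone may have physically tampered with the system | /SYS/INTSW
sunHwTrapSecurityIntrusion | CHASSIS_INTRUSION DEASSERT | Major; An intrusion sensor has detected that someone may have physically tampered with the system | /SYS/INTSW
sunHwTrapFanSpeedCritThresholdExceeded | Lower critical threshold exceeded | Major; A fan speed sensor has reported that its value has gone above an upper critical threshold setting or below a lower critical threshold setting | /SYS/FB/FANn/TACH
sunHwTrapFanSpeedCritThresholdDeasserted | Lower critical threshold no longer exceeded | Informational; A fan speed sensor has reported that its value has gone below an upper critical threshold setting or above a lower critical threshold setting | /SYS/FB/FANn/TACH
sunHwTrapFanSpeedFatalThresholdExceeded | Lower fatal threshold exceeded | Critical; A fan speed sensor has reported that its value has gone above an upper fatal threshold setting or below a lower fatal threshold setting | /SYS/FB/FANn/TACH
sunHwTrapFanSpeedFatalThresholdDeasserted | Lower fatal threshold no longer exceeded | Informational; A fan speed sensor has reported that its value has gone below an upper fatal threshold setting or above a lower fatal threshold setting | /SYS/FB/FANn/TACH
System Chassis and I/O Events
sunHwTrapComponentFault | fault.chassis.boot.ipmi-init-failed | Major; A component is suspected of causing a fault | /SYS/
sunHwTrapComponentFault | fault.io.quickpath.qpirc-init-failed | Major; A component is suspected of causing a fault | /SYS/
sunHwTrapComponentFault | fault.io.quickpath.qpirc-failed | Major; A component is suspected of causing a fault | /SYS/
sunHwTrapComponentFault | fault.io.quickpath.mrc-failed | Major; A component is suspected of causing a fault | /SYS/
sunHwTrapComponentFaultCleared | fault.chassis.boot.ipmi-init-failed | Informational; A component fault has been cleared | /SYS/
sunHwTrapComponentFaultCleared | fault.io.quickpath.qpirc-init-failed | Informational; A component fault has been cleared | /SYS/
sunHwTrapComponentFaultCleared | fault.io.quickpath.qpirc-failed | Informational; A component fault has been cleared | /SYS/
sunHwTrapComponentFaultCleared | fault.io.quickpath.mrc-failed | Informational; A component fault has been cleared | /SYS/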
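When filtering the traps in TABLE 2-11 by importance, a lookup keyed on the trap name can help. The following Python sketch records the severity listed for each named trap, with each trap name written as a single identifier, and selects the traps at or above a chosen level. The ordering Informational, Minor, Major, Critical is an assumption made for this example; the table does not rank the levels. The names `TRAP_SEVERITY` and `traps_at_or_above` are hypothetical, and rows whose trap is listed as UNKNOWN are omitted.

```python
# Severity per SNMP trap, transcribed from TABLE 2-11.
TRAP_SEVERITY = {
    "sunHwTrapComponentFault": "Major",
    "sunHwTrapComponentFaultCleared": "Informational",
    "sunHwTrapComponentError": "Major",
    "sunHwTrapTempCritThresholdExceeded": "Major",
    "sunHwTrapTempCritThresholdDeasserted": "Informational",
    "sunHwTrapTempNonCritThresholdExceeded": "Minor",
    "sunHwTrapTempOk": "Informational",
    "sunHwTrapTempFatalThresholdExceeded": "Critical",
    "sunHwTrapTempFatalThresholdDeasserted": "Informational",
    "sunHwTrapPowerSupplyFault": "Major",
    "sunHwTrapPowerSupplyFaultCleared": "Informational",
    "sunHwTrapPowerSupplyError": "Major",
    "sunHwTrapPowerSupplyOk": "Informational",
    "sunHwTrapSecurityIntrusion": "Major",
    "sunHwTrapFanSpeedCritThresholdExceeded": "Major",
    "sunHwTrapFanSpeedCritThresholdDeasserted": "Informational",
    "sunHwTrapFanSpeedFatalThresholdExceeded": "Critical",
    "sunHwTrapFanSpeedFatalThresholdDeasserted": "Informational",
}

# Assumed ordering, least to most severe (not stated in the table).
SEVERITY_ORDER = ["Informational", "Minor", "Major", "Critical"]


def traps_at_or_above(minimum):
    """Return the sorted trap names whose severity is at least `minimum`."""
    threshold = SEVERITY_ORDER.index(minimum)
    return sorted(
        name
        for name, severity in TRAP_SEVERITY.items()
        if SEVERITY_ORDER.index(severity) >= threshold
    )


for name in traps_at_or_above("Critical"):
    print(name)
```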
PET Event Messages
PET event messages are generated by systems with Alert Standard Format (ASF) or an IPMI baseboard management controller. The PET events provide advance warning of possible system failures. For more information about the PET event messages that might occur on your system, see TABLE 2-12.
TABLE 2-12 PET Messages and Corresponding Oracle ILOM Events for Sun Fire X4470 Server
PET Message | Oracle ILOM Event Message | Severity and Description | Sensor Name
System Power Events
petTrapACPIPowerStateS5G2SoftOffAssert | SystemACPI 'ACPI_ON_WORKING' | Informational; System ACPI Power State S5/G2 (soft-off) was asserted | /SYS/ACPI
petTrapACPIPowerStateS5G2SoftOffDeassert | System ACPI Power State : ACPI : S5/G2: soft-off : Deasserted | Informational; System ACPI Power State S5/G2 (soft-off) was deasserted | /SYS/ACPI
petTrapACPIPowerStateS0G0WorkingAssert | System ACPI Power State : ACPI : S0/G0: working : Asserted | Informational; System ACPI Power State S0/G0 (working) was asserted | /SYS/ACPI
petTrapACPIPowerStateS0G0WorkingDeassert | System ACPI Power State : ACPI : S0/G0: working : Deasserted | Informational; System ACPI Power State S0/G0 (working) was deasserted | /SYS/ACPI
petTrapPowerSupplyStateAssertedAssert | PowerSupply sensor DEASSERT | Informational; Power Supply is connected to AC Power | /SYS/PSn/V_OUT_OK, /SYS/PSn/V_IN_ERR, /SYS/PSn/V_IN_WARN, /SYS/PSn/V_OUT_ERR, /SYS/PSn/I_OUT_ERR, /SYS/PSn/I_OUT_WARN, /SYS/PSn/T_ERR, /SYS/PSn/T_WARN, /SYS/PSn/FAN_ERR, /SYS/PSn/FAN_WARN, /SYS/PSn/ERR
petTrapPowerSupplyStateDeassertedAssert | PowerSupply sensor ASSERT | Warning; Power Supply is disconnected from AC Power | /SYS/PSn/V_OUT_OK, /SYS/PSn/V_IN_ERR, /SYS/PSn/V_IN_WARN, /SYS/PSn/V_OUT_ERR, /SYS/PSn/I_OUT_ERR, /SYS/PSn/I_OUT_WARN, /SYS/PSn/T_ERR, /SYS/PSn/T_WARN, /SYS/PSn/FAN_ERR, /SYS/PSn/FAN_WARN, /SYS/PSn/ERR
Entity Presence Events
petTrapEntityPresenceEntityPresentAssert | Entity Presence : PCIE1/PRSNT : Present : Asserted | Informational; The Entity identified by the Entity ID is present | /SYS/PCIEn/PRSNT, /SYS/PCIE_CC/PRSNT
petTrapEntityPresenceEntityAbsentDeassert | Entity Presence : PCIE1/PRSNT : Absent : Deasserted | Informational; The Entity identified by the Entity ID is present | /SYS/PCIEn/PRSNT, /SYS/PCIE_CC/PRSNT
petTrapEntityPresenceEntityAbsentAssert | Entity Presence : PCIE1/PRSNT : Absent : Asserted | Informational; The Entity identified by the Entity ID is absent | /SYS/PCIEn/PRSNT, /SYS/PCIE_CC/PRSNT
petTrapEntityPresenceEntityPresentDeassert | Entity Presence : PCIE1/PRSNT : Present : Deasserted | Informational; The Entity identified by the Entity ID for the sensor is absent | /SYS/PCIEn/PRSNT, /SYS/PCIE_CC/PRSNT
petTrapEntityPresenceEntityDisabledAssert | Entity Presence : PCIE1/PRSNT : Disabled : Asserted | Informational; The Entity identified by the Entity ID is present, but has been disabled | /SYS/PCIE4/PRSNT, /SYS/PCIE6/PRSNT, /SYS/PCIE_CC/PRSNT
petTrapEntityPresenceEntityDisabledDeassert | Entity Presence : PCIE1/PRSNT : Disabled : Deasserted | Informational; The Entity identified by the Entity ID is present and has been enabled | /SYS/PCIE4/PRSNT, /SYS/PCIE6/PRSNT, /SYS/PCIE_CC/PRSNT
petTrapEntityPresenceDeviceInsertedAssert | Entity Presence : PS0/PRSNT : DevicePresent | Informational; A device is present or has been inserted | /SYS/PSn/PRSNT, /SYS/FB/FANn/PRSNT, /SYS/DBP/HDDn/PRSNT
petTrapEntityPresenceDeviceRemovedAssert | Entity Presence : PS0/PRSNT : DeviceAbsent | Informational; A device is absent or has been removed | /SYS/PSn/PRSNT, /SYS/FB/FANn/PRSNT, /SYS/DBP/HDDn/PRSNT
Environmental Events
petTrapTemperatureUpperNonRecoverableGoingLowDeassert | Temperature Upper non-critical threshold has been exceeded | Major; Temperature has decreased below upper non-recoverable threshold | /SYS/MB/T_OUT, /SYS/DBP/T_AMB, /SYS/T_AMB
petTrapTemperatureUpperCriticalGoingLowDeassert | Temperature Lower non-critical threshold has been exceeded | Warning; Temperature has decreased below upper critical threshold | /SYS/MB/T_OUT, /SYS/DBP/T_AMB, /SYS/T_AMB
petTrapTemperatureUpperNonRecoverableGoingHigh | Temperature Lower non-critical threshold no longer exceeded | Critical; Temperature has increased above upper non-recoverable threshold | /SYS/MB/T_OUT, /SYS/DBP/T_AMB, /SYS/T_AMB
petTrapTemperatureUpperCriticalGoingHigh | Temperature Lower fatal threshold has been exceeded | Major; Temperature has increased above upper critical threshold | /SYS/MB/T_OUT, /SYS/DBP/T_AMB, /SYS/T_AMB
Fans, Hard Drives, and Physical Security Events
petTrapPhysicalSecurityChassisIntrusionStateDeassertedAssert | Physical Security : INTSW : State Deasserted | Informational; Physical security: chassis intrusion alarm cleared | /SYS/INTSW
petTrapPhysicalSecurityChassisIntrusionStateAssertedAssert | Physical Security : INTSW : State Asserted | Warning; Physical security breach: chassis intrusion | /SYS/INTSW
petTrapFanLowerCriticalGoingLow | Fan Lower fatal threshold has been exceeded | Major; Fan speed has decreased below lower critical threshold | /SYS/FB/FANn/TACH
petTrapFanLowerCriticalGoingHighDeassert | Fan Lower fatal threshold no longer exceeded | Warning; Fan speed has increased above lower critical threshold | /SYS/FB/FANn/TACH
petTrapDriveSlotDriveFaultAssert | Drive Slot : DBP/HDD0/STATE : Drive Fault : Asserted | Critical; HDD Fault has been detected. A corresponding HDD Fault LED is ON | DBP/HDDn/STATE
petTrapDriveSlotDriveFaultDeassert | Drive Slot : DBP/HDD0/STATE : Drive Fault : Deasserted | Informational; HDD Fault has been cleared. An HDD Fault LED that was ON is now OFF | DBP/HDDn/STATE
petTrapDriveSlotPredictiveFailureAssert | Drive Slot : DBP/HDD0/STATE : Predictive Failure : Asserted | Major; HDD Predictive Failure has been detected | DBP/HDDn/STATE
petTrapDriveSlotReadyToRemoveAssert | Drive Slot : DBP/HDD0/STATE : Hot Spare : Asserted | Informational; A drive has been unmounted and is ready to be physically removed. A corresponding OK-to-Remove LED is ON | DBP/HDDn/STATE
petTrapDriveSlotReadyToRemoveDeassert | Drive Slot : DBP/HDD0/STATE : Hot Spare : Deasserted | Informational; A drive is no longer ready to be physically removed. It has either been removed or mounted again. A corresponding OK-to-Remove LED is OFF | DBP/HDDn/STATE
petTrapDriveSlotPredictiveFailureDeassert | Drive Slot : DBP/HDD0/STATE : Predictive Failure : Deasserted | Informational; Hard Disk Predictive Failure state has been cleared | DBP/HDDn/STATE
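Many of the Oracle ILOM event messages in TABLE 2-12 are fields separated by a colon with a space on each side, for example `Entity Presence : PCIE1/PRSNT : Present : Asserted`. The Python sketch below splits such a message into its parts. The field names `sensor_type`, `sensor`, `state`, and `direction` are labels chosen for this example and are not Oracle terminology. The parser handles only the three-field and four-field forms shown in the table; free-text messages, such as the temperature and fan threshold entries, raise an error.

```python
from typing import NamedTuple, Optional


class PetEvent(NamedTuple):
    sensor_type: str          # for example "Entity Presence" or "Drive Slot"
    sensor: str               # for example "PCIE1/PRSNT"
    state: str                # for example "Present" or "Drive Fault"
    direction: Optional[str]  # "Asserted", "Deasserted", or None if absent


def parse_pet_message(message: str) -> PetEvent:
    """Split a ' : '-delimited event message into named fields."""
    parts = [part.strip() for part in message.split(" : ")]
    if len(parts) == 4:
        return PetEvent(parts[0], parts[1], parts[2], parts[3])
    if len(parts) == 3:
        # Three-field form, such as "Physical Security : INTSW : State Asserted".
        return PetEvent(parts[0], parts[1], parts[2], None)
    raise ValueError(f"unrecognized event message format: {message!r}")


event = parse_pet_message("Drive Slot : DBP/HDD0/STATE : Drive Fault : Asserted")
print(event.sensor, event.state, event.direction)
```

Splitting on the three-character delimiter, rather than on a bare colon, keeps values such as `S5/G2: soft-off` together as one field.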
Oracle Integrated Lights Out Manager (ILOM) 3.0 Supplement for Sun Fire X4470 Server | E21741-02
Copyright © 2011, Oracle and/or its affiliates. All rights reserved.