CHAPTER 3
Server Management and ILOM Notes and Issues
This chapter describes server management and ILOM service processor (SP) issues that apply to the Sun Blade X6275 server module.
The following information is available from the server module ILOM when FRUs are inserted:
ILOM provides a restricted shell that gives access to logs and to commands for viewing and searching them, such as grep and less. It also provides administrative commands such as uptime, as well as commands to safely reboot the SP.
Use the root account to enter the restricted shell through the spsh shell.
Note - You cannot enter the restricted shell from an spsh shell that was spawned through ipmitool.
From the ILOM web interface, access the power values from Power Management > Allocation > Power Allocation Plan, Target Limit. The value can be in watts or a percent between:
Note - Installed Hardware Minimum power is the recommended minimum power you can set and should be regarded as a reference.
Capping the power to this minimum value causes two issues:
1. CPU performance will be severely downgraded.
2. You may see a "power violation" in the CLI and ILOM SEL log as described in Details of the Error Message. This occurs because the minimum power calculation is difficult to make exact.
Take the calculation’s accuracy into account for each component and for different usage patterns. The Violation status occurs because, under the current usage pattern, the system cannot reduce power below the Installed Hardware Minimum power.
Caution - To avoid the issue described above, do not cap the power to this minimum value.
On the "Consumption" tab under "Power Management" in the ILOM web interface, one of the following warnings might appear for the target limit:
Warning: /Peak Permitted/ exceeds /Target Limit/
Warning: /Actual Power/ exceeds /Target Limit/.
Through the ILOM CLI, the event is recorded as follows:
The ILOM SEL records an IPMI log entry similar to the following:
ID = 10e2 : 10/27/2009 : 14:28:56 : Power Supply : PWRBS : State Asserted
As viewed from the Chassis Monitoring Module (CMM) ILOM interface, the power budget shown in the CMM is on a per-blade basis. For the Sun Blade X6275, it shows the total power consumption of the blade (both nodes together).
A policy has been added to the CMM ILOM to support the Sun Cooling Door that might be used with your chassis. Sun supports two types of cooling doors: the Sun Cooling Door 5200 and the Sun Cooling Door 5600.
To configure the Sun Cooling Door policy using the ILOM web interface or CLI, see the Sun Integrated Lights Out Manager (ILOM) 3.0 Feature Update and Release Notes (820-7329) for detailed information. The Integrated Lights Out Manager (ILOM) 3.0 document collection is available from:
http://www.oracle.com/pls/topic/lookup?ctx=ilom30&id=homepage
If the Chassis Monitoring Module (CMM) is offline (due to a problem with the CMM or because the CMM is going through the boot process), the Sun Blade X6275 will not power on.
1. Ensure that the CMM is online before booting the Sun Blade X6275.
2. To power on the blade, run start -force.
According to the IPMI specification, the locate LED on the front of the blade is supposed to turn itself off after 15 seconds. However, Oracle has determined that this might not give the customer sufficient time to physically locate the system. For this reason, Oracle has chosen to deviate from the IPMI specification and set the default time-out value to 30 minutes.
You can choose to turn the locate LED off by using one of the following methods:
Wait 30 minutes for the locate LED to turn off automatically.
You can use the preconfigured ILOM default user account to recover a lost password or re-create the root account. The default user account cannot be changed or deleted and is only available through a local serial console connection (refer to the Sun Blade X6275 Server Module Service Manual). In order to access the default user account, you must prove physical presence.
To prove physical presence for a node of the Sun Blade X6275 server module, press the Locate button for that node on the server module front panel when prompted by ILOM. For information about the server module front panel and indicators, refer to the Sun Blade X6275 Server Module Installation Guide.
When using a node’s ILOM CLI to review power consumption, the following command can be used:
This displays output similar to:
When logged into node 0, the available_power listed is actually the combined available power for the entire blade (node 0 plus node 1). To calculate the available power for node 0, subtract the available_power number listed when logged into node 1 from the total available_power number listed for node 0.
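The subtraction described above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample wattages are hypothetical, and the two inputs are assumed to be the available_power values read manually from each node's ILOM CLI.

```python
def node0_available_power(reported_on_node0: float, reported_on_node1: float) -> float:
    """Return the available power, in watts, that belongs to node 0 alone.

    reported_on_node0: available_power shown when logged in to node 0
                       (the combined figure for node 0 plus node 1).
    reported_on_node1: available_power shown when logged in to node 1.
    """
    if reported_on_node1 > reported_on_node0:
        # The combined figure can never be smaller than node 1's share.
        raise ValueError("node 1 value exceeds the combined blade value")
    return reported_on_node0 - reported_on_node1


# Hypothetical readings in watts, not taken from a real system.
print(node0_available_power(500, 230))  # 270
```

The guard is there because a node 1 reading larger than the node 0 reading would suggest that the two values were swapped or read at different times.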
This section contains fixed and open issues for Oracle Integrated Lights Out Manager (ILOM).
The following issues have been fixed.
When the X6275 blade is in the Sun Blade 6048 chassis, the sensor list shown from node 1 incorrectly shows the output expected when the blade is inserted into a Sun Blade 6000 chassis. Shared sensors, such as BLx/STATE, BLx/ERR, FMx/ERR, FMx/Fy/TACH, PSx/PRSNT, PSx/Sy/V_OUT_OK, and PSx/Sy/V_IN_ERR, are not all shown.
SP Faults on fan modules are not clearing. Once the SP logs the error, there appears to be no way to clear the condition.
The FAULT capability on the chassis FRUs (specifically NEMs, FMs, and the CMM) was not functioning correctly. FAULT capability was added to all chassis FRUs.
Note - This applies to CMM 2.0.3.13 only.
When the blade chassis boots up, there must be at least one Sun Blade X6275 server module in the chassis. Then, the CMM will enable the Sun Blade X6275 server module mode, which supports two nodes in one blade. Otherwise, if you install your first Sun Blade X6275 server after the blade chassis boots, only one of the blade’s nodes will be seen by the ILOM CMM web interface.
When running pecitool on a single CPU, the reference numbers can appear to be reversed.
Note - This issue has been fixed in ILOM 3.0.8.10.
The CPU reference numbers appear to be reversed due to an error in the motherboard silkscreen. The reference numbers were internally reversed in hostdiag to accommodate the silkscreen. The Sun Blade X6275 does not need this reversal, as the CPU designators appear correctly on the motherboard.
Note - This issue has been fixed in ILOM 3.0.8.10.
Some of the blade Field Replaceable Unit (FRU) information might become corrupt or unavailable when viewed through the CMM.
Workaround: If this happens, log in to the desired blade node from the ILOM web interface or CLI to read the FRU data.
After a CMM reboot, there might be an erroneous hot insertion event for the Sun Blade X6275 server module logged in the CMM event log, even though the blade was not removed from the chassis.
You may safely ignore this event.
On rare occasions, the host will not be able to establish remote control redirection (RKVM session) through the ILOM service processor. This might happen after you change the service processor IP address. You might also see the following warning messages:
Could not connect to host <new_ip>. Please verify your host IP or name.
Workaround: Retry the remote connection. If that does not solve the problem, log in to the host service processor using ILOM and reset the service processor. You can do this using the ILOM web interface or CLI.
In rare cases, you may see a number of false events in the System Event Log for Node1 of the Sun Blade X6275 server module relating to NEM and PEM removal. There are no such events listed in the System Event Log for Node0.
The false messages in the Node1 System Event Log can be safely ignored. However, please contact your Sun service provider so that Sun can track these occurrences.
Due to a memory leak in the ILOM software, repeated use of ILOM to monitor sensors and components may result in ILOM becoming sluggish, erratic, and/or non-responsive.
Workaround: Reset the Sun Blade X6275’s service processor or the chassis CMM, depending on which device becomes sluggish, erratic, and/or non-responsive.
The Sun Blade X6275 server module BIOS does not set the service processor time at POST.
Workaround: Set the service processor time using the ILOM interface or Network Time Protocol (NTP). IPMI commands can also be used to set the service processor clock.
Several IPMI critical events are logged in the SEL at every service processor boot. In the ILOM event log, you will see messages such as these after the SP boots:
ID = b : 03/09/2009 : 09:46:34 : Entity Presence : PEM/PRSNT : Device Absent
ID = 2f : 02/23/2009 : 11:43:39 : Power Supply : PS1/AC1_ERR : State Deasserted
ID = 2e : 02/23/2009 : 11:43:39 : Power Supply : PS0/PWROK0 : State Deasserted
ID = 2c : 02/23/2009 : 11:43:38 : Power Supply : PS1/AC2_ERR : State Deasserted
ID = 2b : 02/23/2009 : 11:43:38 : Power Supply : PS0/PWROK1 : State Deasserted
These events can safely be ignored.
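When reviewing many such entries, it can help to split each line into its fields. The following Python sketch does that for lines shaped like the ones quoted above. It assumes the fields are always separated by " : " and that the ID is hexadecimal (as values such as 2f suggest); the names SelEvent and parse_sel_line are hypothetical and are not part of ILOM or ipmitool.

```python
from datetime import datetime
from typing import NamedTuple


class SelEvent(NamedTuple):
    event_id: int        # ID field, interpreted as hexadecimal (assumption)
    timestamp: datetime  # date and time fields combined
    sensor_type: str     # for example "Power Supply"
    sensor: str          # for example "PS1/AC1_ERR"
    state: str           # for example "State Deasserted"


def parse_sel_line(line: str) -> SelEvent:
    """Parse one 'ID = ... : date : time : type : sensor : state' line."""
    # The time field contains colons without surrounding spaces,
    # so splitting on " : " keeps it in one piece.
    fields = [field.strip() for field in line.split(" : ")]
    if len(fields) != 6 or not fields[0].startswith("ID ="):
        raise ValueError(f"unrecognized SEL line: {line!r}")
    event_id = int(fields[0].split("=", 1)[1].strip(), 16)
    timestamp = datetime.strptime(f"{fields[1]} {fields[2]}", "%m/%d/%Y %H:%M:%S")
    return SelEvent(event_id, timestamp, fields[3], fields[4], fields[5])


event = parse_sel_line(
    "ID = 2f : 02/23/2009 : 11:43:39 : Power Supply : PS1/AC1_ERR : State Deasserted"
)
print(event.sensor, "->", event.state)  # PS1/AC1_ERR -> State Deasserted
```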
The correct command for running ipmiflash -I pci is:
ipmiflash -I pci write ILOM.pkg :: --platform-type vayu_QDR_IB --id-num 38 -l 0xa0000
Note - The "::" is required in this command.
When performing a firmware upgrade of ILOM, you are given the choice of preserving the configuration information for the current ILOM version before it is upgraded. This includes information configured by the user (account information, network configurations, management settings, and so on). This information is stored in the SP and will be used if you ever decide to go back to the previous version of ILOM.
-> load -source tftp://serverfolder/ILOM-version-Sun_Blade_X6275M2.pkg
Typically, you would opt to preserve existing configurations in case you need to roll back to the previous version of ILOM after an upgrade. However, if you choose not to preserve ILOM configurations during the upgrade and answer no to the prompt Preserve existing configuration (y/n)?, the configurations might be saved anyway.
This action is harmless and can occur intermittently.
Workaround: Try performing the upgrade again, answering no to the prompt to Preserve existing configuration (y/n)? during an upgrade.
The following issues are open.
If you reset the SP to factory defaults when the host is on, permitted power is miscalculated and power operations might have strange results.
1. From the CLI, set /SP reset_to_defaults=factory.
An error message might occur under the following conditions:
1. Log in to the SP from the web interface through Internet Explorer 8.
2. Go to Configuration > Serial Port Settings.
3. Set the Serial Port sharing from Service Processor to Host Server, then click the Save button. An error message window appears with the following:
Error: Unable to get serial port property
4. After the OK button is clicked, the baud rate of the host serial port turns blank, as seen below.
(The baud rate turns blank only in Internet Explorer).
This error message will appear in both Firefox and Internet Explorer. The error message does not occur when the port sharing is used from CLI.
While the SP or BIOS firmware is being upgraded, the green LED should blink slowly (1 Hz), with 0.5 second on and 0.5 second off.
Currently, the ILOM code does not change the state of the green LED. If the LED is steady on, it remains steady on during the firmware upgrade.
When you register blades, they might contain invalid reference IDs. The error message is:
products cannot be registered, they contain invalid reference ids
If the service processor firmware is flashed using ipmiflash over the USB interface by specifying the -I usb parameter, the file transfer is terminated and flashing of the SP fails. Therefore, the following command will fail:
# ipmiflash -I usb -U root write SP_FW.pkg
351K [sending...]unexpected response to our file-upload command
Workaround: The Service Processor can be flashed using the open option in ipmiflash as follows:
# ipmiflash -I open -U root write SP_FW.pkg config
The Service Processor can also be flashed using the pci option by replacing the parameter "open" with "pci" in the above example.
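As a small illustration of the workaround, the Python sketch below assembles the argument list for the two interfaces described as working and refuses the usb interface. It only builds the list and does not run anything. The function name build_flash_argv is hypothetical, and the arguments are copied from the example command above; no other ipmiflash options are assumed.

```python
from typing import List

# Interfaces the workaround describes as usable; "usb" is the one reported to fail.
WORKING_INTERFACES = ("open", "pci")


def build_flash_argv(interface: str, package: str, user: str = "root") -> List[str]:
    """Build the ipmiflash argument list in the form shown in the workaround."""
    if interface == "usb":
        raise ValueError("flashing over -I usb is reported to fail; use open or pci")
    if interface not in WORKING_INTERFACES:
        raise ValueError(f"interface not covered by the workaround: {interface!r}")
    return ["ipmiflash", "-I", interface, "-U", user, "write", package, "config"]


print(" ".join(build_flash_argv("open", "SP_FW.pkg")))
# ipmiflash -I open -U root write SP_FW.pkg config
```

Keeping the command as a list rather than a single string makes it straightforward to pass to a process launcher later without shell quoting problems.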
Flashing of the firmware can also be done from the ILOM Web interface, CLI or through the Preboot Menu. For more information please refer to Chapters 2 and 3 of the Sun Integrated Lights Out Manager 2.0 Supplement for the Sun Blade X6275 Server Module, 820-6851. This document is available from:
http://download.oracle.com/docs/cd/E19464-01/index.html
If you set the system serial port baud rate from 9600 to 115200 in the system BIOS and then save the new settings, the new settings are not propagated to the system’s service processor.
Workaround: Change the serial port baud rate of the service processor through the
Keyboard Problem With Multiple JavaRConsole Sessions Open to Same Service Processor (6795975)
Workaround: If this occurs, perform one of the following actions.
If you are upgrading the CMM ILOM image using the web interface and have five or more ILOM CLI sessions open, the CMM might run out of memory and might become unresponsive, reset, or both.
Workaround: Do not invoke more than four ILOM CLI sessions while upgrading firmware from the CMM ILOM web interface. Close those that are not in use.
When the ILOM start /SYS command is issued to power on the host, it will occasionally fail with the following message:
start: Insufficient power available for this operation: The chassis Available Power must exceed the chassis Ticketed Power by greater than the power budget requirement of this blade (see power ticket denied message in the CMM event log)
The above message might not accurately describe the correct reason for the failure of the host system to power on. Although insufficient available power is one possible cause, other factors such as hardware malfunction, system faults on the peer node of the same blade, or chassis CMM failures might result in the same error.
If you encounter this error, do the following to help in identifying the source of the problem:
When you back up the configuration without entering a pass phrase, you do not receive a warning message saying that sensitive data will not be backed up. Instead, the backup occurs immediately.
Similarly, when you restore a configuration backup without entering the pass phrase, you do not receive a message asking for the pass phrase. The restore occurs immediately.
After CMM reset, the SP SEL log shows meaningless fan failure messages. Two nodes might show different messages, as follows:
3. You may see this kind of message in node 0:
11 IPMI Log critical Fri Jul 24 18:47:50 2009 ID = 6 : 07/24/2009 : 18:47:50 : Fan : FM6/ERR : Predictive Failure Deasserted
And a different report in node 1:
19 IPMI Log critical Fri Jul 24 18:48:49 2009 ID = b : 07/24/2009 : 18:48:49 : Fan : FM5/ERR : Predictive Failure Deasserted
18 IPMI Log critical Fri Jul 24 18:48:49 2009 ID = a : 07/24/2009 : 18:48:49 : Fan : FM0/ERR : Predictive Failure Deasserted
Workaround: You can ignore this kind of "Predictive Failure Deasserted" error event in both nodes.
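If you collect event log output in a file, the ignorable entries can be filtered out before review. The Python sketch below drops lines that end with a fan module "Predictive Failure Deasserted" event and keeps everything else. It assumes the log lines have the same layout as the samples above; the function name and the pattern are illustrative and are not part of ILOM.

```python
import re
from typing import Iterable, List

# Matches the tail of entries such as
# "... : Fan : FM6/ERR : Predictive Failure Deasserted"
BENIGN_FAN_EVENT = re.compile(r": Fan : FM\d+/ERR : Predictive Failure Deasserted\s*$")


def drop_benign_fan_events(lines: Iterable[str]) -> List[str]:
    """Return only the lines that are not ignorable fan deassert events."""
    return [line for line in lines if not BENIGN_FAN_EVENT.search(line)]


sample = [
    "11 IPMI Log critical Fri Jul 24 18:47:50 2009 ID = 6 : 07/24/2009 : 18:47:50 : Fan : FM6/ERR : Predictive Failure Deasserted",
    "ID = 10e2 : 10/27/2009 : 14:28:56 : Power Supply : PWRBS : State Asserted",
]
for line in drop_benign_fan_events(sample):
    print(line)  # only the Power Supply entry is printed
```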
On rare occasions, a node might fail to power on when you power cycle batches of blade nodes, either by individually issuing a power-on command using ipmitool or start /SYS, or by powering on a Sun Blade 6048 Modular System chassis with a rack full of blades. The failed node returns an OFF status when an IPMI power status query is made.
Workaround: If you encounter this issue, try the following:
If more than ten ILOM CLI sessions are opened, system response time can degrade proportionally. When ILOM web interface sessions are included among the ten open sessions, performance is likely to degrade at a higher rate.
Workaround: Do not invoke more than ten ILOM CLI sessions. Close those that are not in use to optimize performance. Close ILOM web interface sessions first for the best results.
If you reset the SP to factory defaults when the host is on, you might no longer see DIMM FRU information in ILOM.
1. Open a terminal window and log in to the node ILOM SP using an SSH connection.
2. From the prompt, power off the node host by entering the command:
3. Reset the SP by entering the command:
-> set /SP reset_to_defaults=factory
4. Reboot the node SP by entering the command:
5. After the SP successfully reboots, power on the node host by entering the command:
You should now be able to view DIMM FRU information using ILOM.
Copyright © 2012, Oracle and/or its affiliates. All rights reserved.