CHAPTER 3

Server Management and ILOM Notes and Issues

This chapter describes server management and ILOM service processor (SP) issues that apply to the Sun Blade X6275 server module.


General Notes and Issues

Dynamic Field Replaceable Unit (FRU) ID Information

The following information is available from the server module ILOM when FRUs are inserted:

Restricted Bash Shell

ILOM provides a restricted shell that gives access to logs and to commands for viewing and searching them, such as grep and less. It also provides administrative commands such as uptime, as well as commands to safely reboot the SP.


Entering the Restricted Shell

Use the root account to enter the restricted shell through the spsh shell.



Note - You cannot enter the restricted shell from an spsh shell that was spawned through ipmitool.


Power Values in ILOM Web Interface

From the ILOM web interface, access the power values from Power Management > Allocation > Power Allocation Plan, Target Limit. The value can be in watts or a percent between:



Note - Installed Hardware Minimum power is the recommended minimum power you can set and should be regarded as a reference.


Capping the power at this minimum value causes two issues:

1. CPU performance is severely degraded.

2. You might see a "power violation" in the CLI and in the ILOM SEL log, as described in Details of the Error Message. This happens because the minimum power calculation is difficult to make exact.

The accuracy of the calculation varies with each component and with the usage pattern. The violation status occurs because, under the current usage pattern, the system cannot reduce power below the Installed Hardware Minimum power.



Caution - To avoid the issue described above, do not cap the power to this minimum value.


Details of the Error Message

On the "Consumption" tab under "Power Management" in the ILOM web interface, the following warnings might be displayed for the target limit:

Warning: /Peak Permitted/ exceeds /Target Limit/

Warning: /Actual Power/ exceeds /Target Limit/.

Through the ILOM CLI, the event is recorded as follows:

/SP/powermgmt/budget

Properties:

activation_state = enabled

status = violation

The ILOM SEL records an IPMI log similar to the following:

ID = 10e2 : 10/27/2009 : 14:28:56 : Power Supply : PWRBS : State Asserted
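When scanning a saved copy of the SEL for this condition, it can help to split each record into fields. The following is a minimal sketch, not part of ILOM: it assumes that records follow the "ID = ... : date : time : type : name : event" layout shown above, with dates in month/day/year order, and the names SelRecord, parse_sel_line, and is_power_budget_violation are hypothetical. Actual log lines might differ between ILOM releases.

```python
from datetime import datetime
from typing import NamedTuple


class SelRecord(NamedTuple):
    record_id: int       # hexadecimal ID from the log, stored as an integer
    timestamp: datetime
    sensor_type: str
    sensor_name: str
    event: str


def parse_sel_line(line: str) -> SelRecord:
    """Parse one 'ID = ... : date : time : type : name : event' line."""
    # Split on " : " so the colons inside the time field are left alone.
    fields = [field.strip() for field in line.split(" : ")]
    if len(fields) != 6 or not fields[0].startswith("ID ="):
        raise ValueError(f"unrecognized SEL line: {line!r}")
    record_id = int(fields[0].split("=", 1)[1].strip(), 16)
    # Assumption: dates are written month/day/year, as in the sample above.
    timestamp = datetime.strptime(f"{fields[1]} {fields[2]}", "%m/%d/%Y %H:%M:%S")
    return SelRecord(record_id, timestamp, fields[3], fields[4], fields[5])


def is_power_budget_violation(record: SelRecord) -> bool:
    """True for a PWRBS 'State Asserted' record like the sample above."""
    return (
        record.sensor_type == "Power Supply"
        and record.sensor_name == "PWRBS"
        and record.event == "State Asserted"
    )


record = parse_sel_line(
    "ID = 10e2 : 10/27/2009 : 14:28:56 : Power Supply : PWRBS : State Asserted"
)
print(hex(record.record_id), record.timestamp.isoformat(), is_power_budget_violation(record))
```

Matching on the sensor name and event text, rather than on the record ID, keeps the check independent of where the record falls in the log.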

Reading Sun Blade X6275 Power Consumption in the CMM

As viewed from the Chassis Monitoring Module (CMM) ILOM interface, the power budget is shown on a per-blade basis. For the Sun Blade X6275, the CMM shows the total power consumption of the blade (both nodes together).

Enabling and Disabling the Sun Cooling Door

A policy has been added to the CMM ILOM to support the Sun Cooling Door that might be used with your chassis. Sun supports two types of cooling doors: the Sun Cooling Door 5200 and the Sun Cooling Door 5600.

To configure the Sun Cooling Door policy using the ILOM web interface or CLI, see the Sun Integrated Lights Out Manager (ILOM) 3.0 Feature Update and Release Notes, (820-7329) for detailed information. The Integrated Lights Out Manager (ILOM) 3.0 document collection is available from:

http://www.oracle.com/pls/topic/lookup?ctx=ilom30&id=homepage

Sun Blade X6275 Does Not Boot if the CMM is Off-Line

If the Chassis Monitoring Module (CMM) is offline (due to a problem with the CMM or because the CMM is going through the boot process), the Sun Blade X6275 will not power on.

Workaround:

1. Ensure that the CMM is online before booting the Sun Blade X6275.

2. To power on the blade, run start -force.

Locate LED Programmed to Stay On For 30 Minutes (6793865)

According to the IPMI specification, the locate LED on the front of the blade is supposed to turn itself off after 15 seconds. However, Oracle has determined that this might not give the customer sufficient time to physically locate the system. For this reason, Oracle has chosen to deviate from the IPMI specification and set the default time-out value to 30 minutes.

You can choose to turn the locate LED off by using one of the following methods:

Wait 30 minutes for the locate LED to turn off automatically.

Proving Physical Presence (6881237)

You can use the preconfigured ILOM default user account to recover a lost password or re-create the root account. The default user account cannot be changed or deleted and is only available through a local serial console connection (refer to the Sun Blade X6275 Server Module Service Manual). In order to access the default user account, you must prove physical presence.

To prove physical presence for a node of the Sun Blade X6275 server module, press the Locate button for the node on the server module front panel when prompted by ILOM. For information about the server module front panel and indicators, refer to the Sun Blade X6275 Server Module Installation Guide.

Understanding the Node available_power Statistic (6892763)

When using a node’s ILOM CLI to review power consumption, the following command can be used:

-> show /SP/powermgmt

This displays output similar to:

/SP/powermgmt

Targets:

budget

powerconf

Properties:

actual_power = 56

permitted_power = 190

allocated_power = 190

available_power = 380

threshold1 = 0

threshold2 = 0

Where:

When logged into node 0, the available_power listed is actually the combined available power for the entire blade (node 0 plus node 1). To calculate the available power for node 0, subtract the available_power number listed when logged into node 1 from the total available_power number listed for node 0.
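The subtraction described above can be sketched as follows. This is an illustration only: the node 0 figures come from the sample output above, the node 1 value of 190 is an assumed number chosen for the example, and the helper names parse_properties and node0_available_power are hypothetical.

```python
def parse_properties(show_output: str) -> dict:
    """Collect 'name = value' integer properties from `show` output."""
    properties = {}
    for line in show_output.splitlines():
        name, separator, value = line.partition("=")
        if separator and value.strip().isdigit():
            properties[name.strip()] = int(value.strip())
    return properties


def node0_available_power(node0_output: str, node1_output: str) -> int:
    """Blade total (as reported on node 0) minus the figure reported on node 1."""
    blade_total = parse_properties(node0_output)["available_power"]
    node1_power = parse_properties(node1_output)["available_power"]
    return blade_total - node1_power


# Output of `show /SP/powermgmt` on node 0, taken from the sample above.
NODE0_OUTPUT = """
/SP/powermgmt
  Properties:
    actual_power = 56
    permitted_power = 190
    allocated_power = 190
    available_power = 380
"""

# Hypothetical node 1 output; the value 190 is an assumption for illustration.
NODE1_OUTPUT = """
/SP/powermgmt
  Properties:
    available_power = 190
"""

print(node0_available_power(NODE0_OUTPUT, NODE1_OUTPUT))  # 380 - 190 = 190
```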


ILOM Fixed and Open Issues

This section contains fixed and open issues for Oracle Integrated Lights Out Manager (ILOM).

ILOM Fixed Issues

The following issues have been fixed.


TABLE 3-1 ILOM Fixed Issues

Description of Issue | Status | Release Fixed
(Fixed) Sensor List For X6275 Blade Node 1 Is Not Correct When In Sun Blade 6048 (6924167) | Fixed | 2.3
(Fixed) Blades Cannot Clear Fan Faults (6920801) | Fixed | 2.3
(Fixed) During Chassis Boot, at Least One Sun Blade X6275 Server Module Must Be Installed | Fixed | 2.1
(Fixed) pecitool Shows Wrong CPU Number (6890473) | Fixed | 2.1
(Fixed) Hostdiag Reports CPU Number Reversed In 2.0.3.xx and 3.0.4.10 (6857083) | Fixed | 2.1
(Fixed) FRU Properties Change Intermittently (6804445) | Fixed | 2.0
(Fixed) Erroneous Chassis Hot Insertion Event Logged After CMM Reboot (6797938) | Fixed | 2.1
(Fixed) Host Intermittently Cannot Connect to RKVM Session (6783184) | Fixed | 2.1
(Fixed) CMM ILOM Interface Becomes Unresponsive After Repeated Use (6798257) | Fixed | 2.1
(Fixed) BIOS Does Not Set Service Processor Time (6801525) | Fixed | 2.1
(Fixed) Confusing Critical Events Logged in SEL at Service Processor Boot (6808890) | Fixed | 2.1
(Fixed) ipmiflash -I pci Causes SP to Lose All Network Connections (6850823) | Fixed | 2.1
(Fixed) ILOM Configurations Are Preserved During Upgrade Even After Specifying “No” (6971164) | Fixed | 2.7


(Fixed) Sensor List For X6275 Blade Node 1 Is Not Correct When In Sun Blade 6048 (6924167)

When the X6275 blade is in the Sun Blade 6048 chassis, the sensor list shown from node 1 incorrectly shows the output for when the blade is inserted into a Sun Blade 6000. Shared sensors, such as BLx/STATE, BLx/ERR, FMx/ERR, FMx/Fy/TACH, PSx/PRSNT, PSx/Sy/V_OUT_OK and PSx/Sy/V_IN_ERR are not all shown.

(Fixed) Blades Cannot Clear Fan Faults (6920801)

SP Faults on fan modules are not clearing. Once the SP logs the error, there appears to be no way to clear the condition.

The FAULT capability on the chassis FRUs, specifically, NEMs, FMs and the CMM was not functioning correctly. FAULT capability was added to all chassis FRUs.

(Fixed) During Chassis Boot, at Least One Sun Blade X6275 Server Module Must Be Installed



Note - This applies to CMM 2.0.3.13 only.


When the blade chassis boots up, there must be at least one Sun Blade X6275 server module in the chassis. Then, the CMM will enable the Sun Blade X6275 server module mode, which supports two nodes in one blade. Otherwise, if you install your first Sun Blade X6275 server after the blade chassis boots, only one of the blade’s nodes will be seen by the ILOM CMM web interface.

(Fixed) pecitool Shows Wrong CPU Number (6890473)

When running pecitool on a single CPU, reference numbers can appear to be reversed.



Note - This issue has been fixed in ILOM 3.0.8.10.


(Fixed) Hostdiag Reports CPU Number Reversed In 2.0.3.xx and 3.0.4.10 (6857083)

The CPU reference numbers appear to be reversed due to an error in the motherboard silkscreen. The reference numbers were internally reversed in hostdiag to accommodate the silkscreen. The Sun Blade X6275 does not need this reversal, as the CPU designators appear correctly on the motherboard.



Note - This issue has been fixed in ILOM 3.0.8.10.


(Fixed) FRU Properties Change Intermittently (6804445)

Some of the blade Field Replaceable Unit (FRU) information might become corrupt or unavailable when viewed through the CMM.

Workaround: If this happens, log in to the desired blade node from the ILOM web interface or CLI to read the FRU data.

(Fixed) Erroneous Chassis Hot Insertion Event Logged After CMM Reboot (6797938)

After a CMM reboot, there might be an erroneous hot insertion event for the Sun Blade X6275 server module logged in the CMM event log, even though the blade was not removed from the chassis.

You may safely ignore this event.

(Fixed) Host Intermittently Cannot Connect to RKVM Session (6783184)

On rare occasions, the host will not be able to establish remote control redirection (RKVM session) through the ILOM service processor. This might happen after you change the service processor IP address. You might also see the following warning messages:

Video redirection error

or

Could not connect to host <new_ip>. Please verify your host IP or name.

Workaround: Retry the remote connection. If that does not solve the problem, log in to the host service processor using ILOM and reset the service processor. You can do this using the ILOM web interface or CLI.

(Fixed) Blade Node1 System Event Log Lists False /SYS/NEM1 or /SYS/PEM Hot Removal Messages (6791106)

In rare cases, you may see a number of false events in the System Event Log for Node1 of the Sun Blade X6275 server module relating to NEM and PEM removal. There are no such events listed in the System Event Log for Node0.

The false messages in the Node1 System Event Log can be safely ignored. However, please contact your Sun service provider so that Sun might track these occurrences.

(Fixed) CMM ILOM Interface Becomes Unresponsive After Repeated Use (6798257)

Due to a memory leak in the ILOM software, repeated use of ILOM to monitor sensors and components may result in ILOM becoming sluggish, erratic, and/or non-responsive.

Workaround: Reset the Sun Blade X6275’s service processor or the chassis CMM, depending on which device becomes sluggish, erratic, or non-responsive.

(Fixed) BIOS Does Not Set Service Processor Time (6801525)

The Sun Blade X6275 server module BIOS does not set the service processor time at POST.

Workaround: Set the service processor time using the ILOM interface or Network Time Protocol (NTP). IPMI commands can also be used to set the service processor clock.

(Fixed) Confusing Critical Events Logged in SEL at Service Processor Boot (6808890)

Several IPMI critical events are logged in SEL at every service processor boot. In the ILOM event log, you will see these messages after the SP boot.

ID = b : 03/09/2009 : 09:46:34 : Entity Presence : PEM/PRSNT : Device Absent

ID = 2f : 02/23/2009 : 11:43:39 : Power Supply : PS1/AC1_ERR : State Deasserted

ID = 2e : 02/23/2009 : 11:43:39 : Power Supply : PS0/PWROK0 : State Deasserted

ID = 2c : 02/23/2009 : 11:43:38 : Power Supply : PS1/AC2_ERR : State Deasserted

ID = 2b : 02/23/2009 : 11:43:38 : Power Supply : PS0/PWROK1 : State Deasserted

These events can safely be ignored.

(Fixed) ipmiflash -I pci Causes SP to Lose All Network Connections (6850823)

The correct command for running ipmiflash -I pci is:

ipmiflash -I pci write ILOM.pkg :: --platform-type vayu_QDR_IB --id-num 38 -l 0xa0000



Note - The "::" is required in this command.


(Fixed) ILOM Configurations Are Preserved During Upgrade Even After Specifying “No” (6971164)

When performing a firmware upgrade of ILOM, you are given the choice of preserving the configuration information for the current ILOM version before it is upgraded. This includes information configured by the user (account information, network configurations, management settings, and so on). This information is stored in the SP and is used if you ever decide to go back to the previous version of ILOM.

For example:

-> load -source tftp://serverfolder/ILOM-version-Sun_Blade_X6275M2.pkg

Typically, you would opt to preserve existing configurations in case you need to roll back to the previous version of ILOM after an upgrade. However, if you choose to not preserve ILOM configurations during the upgrade and answer no to the prompt to Preserve existing configuration (y/n)?, the configurations might be saved anyway.

This action is harmless and can occur intermittently.

Workaround: Try performing the upgrade again, answering no to the prompt to Preserve existing configuration (y/n)? during an upgrade.

ILOM Open Issues

The following issues are open.


TABLE 3-2 ILOM Open Issues

Description of Issue | Status
Resetting the SP to Factory Defaults With Host Powered On Causes Permitted Power Miscalculation (6960011) | Open
Set Port Sharing Error Message From SP via SP Web Interface (6895495) | Open
Green LED Should Slow Blink (1 Hz) During FW Upgrade (6862377) | Open
Products Cannot Be Registered (6861523) | Open
Ipmiflash Over USB Interface Fails Due to Unexpected Response to File-upload Command (6856369) | Open
Setting the Serial Baud Rate in the System BIOS Does Not Propagate to the Service Processor (6784341) | Open
CMM ILOM Becomes Unresponsive With Multiple CLI Sessions Open (6780171) | Open
Blade Power On Issues With the start /SYS Command (6784708) | Open
Missing Warning Message While Doing Backup Configuration Without Pass Phrase (6859295) | Open
After CMM is Reset Fan and Blade Events in Blade Event Logs Show Different Events (6864597) | Open
Powering On Batches of Blades Might Cause a Node to Fail to Power On (6813202) | Open
More Than Ten Open ILOM CLI Sessions Degrades Performance (6787190) | Open
Resetting the SP to Factory Defaults With Host Powered On Causes DIMM FRU Information to Be Lost (6970476, 6913602) | Open


Resetting the SP to Factory Defaults With Host Powered On Causes Permitted Power Miscalculation (6960011)

If you reset the SP to factory defaults when the host is on, permitted power is miscalculated and power operations might have strange results.

Workaround:

1. From the CLI, enter set /SP reset_to_defaults=factory.

2. Reseat the blade.

Set Port Sharing Error Message From SP via SP Web Interface (6895495)

An error message might occur under the following conditions:

1. Log in to the SP from the web interface through Internet Explorer 8.

2. Go to Configuration > Serial Port Settings.

3. Set the serial port sharing from Service Processor to Host Server, then click the Save button. An error message window appears with the following:

Error: Unable to get serial port property

4. After you click the OK button, the baud rate of the host serial port turns blank. (The baud rate turns blank only in Internet Explorer.)


The error message appears in both Firefox and Internet Explorer. It does not occur when port sharing is set from the CLI.

Green LED Should Slow Blink (1 Hz) During FW Upgrade (6862377)

While the SP or BIOS firmware is being upgraded, the green LED should blink slowly (1 Hz), 0.5 second on and 0.5 second off.

Currently, the ILOM code does not change the state of the green LED. If the LED is solid on, it remains solid on during the firmware upgrade.

Products Cannot Be Registered (6861523)

When you register blades, registration might fail because the blades contain invalid reference IDs. The error message is:

products cannot be registered, they contain invalid reference ids

Ipmiflash Over USB Interface Fails Due to Unexpected Response to File-upload Command (6856369)

If the Service Processor firmware is flashed using ipmiflash over the USB interface (by specifying the -I usb parameter), the file transfer is terminated and flashing of the SP fails. For example, the following command fails:

# ipmiflash -I usb -U root write SP_FW.pkg

351K [sending...]unexpected response to our file-upload command

(ccode = 0x0c)

Workaround: The Service Processor can be flashed using the open option in ipmiflash as follows:

# ipmiflash -I open -U root write SP_FW.pkg config

The Service Processor can also be flashed using the pci option by replacing the parameter "open" with "pci" in the above example.
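If you script the workaround, a small helper can keep the unsupported usb interface from being selected by mistake. The following is a sketch only: it simply assembles the command line shown above, and the names SUPPORTED_INTERFACES and ipmiflash_command are hypothetical and not part of ipmiflash.

```python
import shlex

# Interfaces that the workaround above describes as usable; "usb" is the
# interface that fails in this issue, so it is deliberately left out.
SUPPORTED_INTERFACES = ("open", "pci")


def ipmiflash_command(package: str, interface: str = "open") -> list:
    """Build the argument list for the workaround command shown above."""
    if interface not in SUPPORTED_INTERFACES:
        raise ValueError(
            f"interface {interface!r} is not one of {SUPPORTED_INTERFACES}"
        )
    return ["ipmiflash", "-I", interface, "-U", "root", "write", package, "config"]


# Print the command instead of running it, so it can be reviewed first.
print(shlex.join(ipmiflash_command("SP_FW.pkg")))
print(shlex.join(ipmiflash_command("SP_FW.pkg", interface="pci")))
```

Building the command as a list rather than a single string avoids quoting problems if the package path contains spaces.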

Flashing of the firmware can also be done from the ILOM Web interface, CLI or through the Preboot Menu. For more information please refer to Chapters 2 and 3 of the Sun Integrated Lights Out Manager 2.0 Supplement for the Sun Blade X6275 Server Module, 820-6851. This document is available from:

http://download.oracle.com/docs/cd/E19464-01/index.html

Setting the Serial Baud Rate in the System BIOS Does Not Propagate to the Service Processor (6784341)

If you set the system serial port baud rate from 9600 to 115200 in the system BIOS and then save the new settings, the new settings are not propagated to the system’s service processor.

Workaround: Change the serial port baud rate of the service processor through ILOM.

Keyboard Problem With Multiple JavaRConsole Sessions Open to Same Service Processor (6795975)



Note - When multiple JavaRConsole sessions are opened to the same service processor on the Sun Blade X6275 server module, the additional session’s keyboard interface may not work. The first session’s keyboard is not affected.


Workaround: If this occurs, perform one of the following actions.

OR

CMM ILOM Becomes Unresponsive With Multiple CLI Sessions Open (6780171)

If you are upgrading the CMM ILOM image using the web interface and have five or more ILOM CLI sessions open, the CMM might run out of memory and become unresponsive or reset.

Workaround: Do not invoke more than four ILOM CLI sessions while upgrading firmware from the CMM ILOM web interface. Close those that are not in use.

Blade Power On Issues With the start /SYS Command (6784708)

When the ILOM start /SYS command is issued to power on the host, it will occasionally fail with the following message:

start: Insufficient power available for this operation: The chassis Available Power must exceed the chassis Ticketed Power by greater than the power budget requirement of this blade (see power ticket denied message in the CMM event log)

The above message might not accurately describe the correct reason for the failure of the host system to power on. Although insufficient available power is one possible cause, other factors such as hardware malfunction, system faults on the peer node of the same blade, or chassis CMM failures might result in the same error.

If you encounter this error, do the following to help in identifying the source of the problem:

Missing Warning Message While Doing Backup Configuration Without Pass Phrase (6859295)

When you back up the configuration without entering a pass phrase, no warning is displayed stating that sensitive data will not be backed up. The backup starts immediately.

Likewise, when you restore such a backup without entering the pass phrase, you are not prompted for the pass phrase. The restore starts immediately.

After CMM is Reset Fan and Blade Events in Blade Event Logs Show Different Events (6864597)

After CMM reset, the SP SEL log shows meaningless fan failure messages. Two nodes might show different messages, as follows:

1. Reset CMM.

2. Check the logs in the blade.

3. You may see this kind of message in node 0:

11 IPMI Log critical Fri Jul 24 18:47:50 2009 ID = 6 : 07/24/2009 : 18:47:50 : Fan : FM6/ERR : Predictive Failure Deasserted

And a different report in node 1:

19 IPMI Log critical Fri Jul 24 18:48:49 2009 ID = b : 07/24/2009 : 18:48:49 : Fan : FM5/ERR : Predictive Failure Deasserted

18 IPMI Log critical Fri Jul 24 18:48:49 2009 ID = a : 07/24/2009 : 18:48:49 : Fan : FM0/ERR : Predictive Failure Deasserted

Workaround: You can ignore this kind of "Predictive Failure Deasserted" error event in both nodes.
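To separate these ignorable records from the rest of a saved event log, a filter along the following lines can be used. It is a sketch that assumes the log lines keep the " : "-separated layout shown above; the function name is_ignorable_fan_event is hypothetical.

```python
def is_ignorable_fan_event(log_line: str) -> bool:
    """True for Fan 'Predictive Failure Deasserted' lines like those above."""
    # The last three " : "-separated fields are sensor type, name, and event.
    fields = [field.strip() for field in log_line.split(" : ")]
    return (
        len(fields) >= 3
        and fields[-3] == "Fan"
        and fields[-1] == "Predictive Failure Deasserted"
    )


LOG_LINES = [
    "11 IPMI Log critical Fri Jul 24 18:47:50 2009 ID = 6 : 07/24/2009 : "
    "18:47:50 : Fan : FM6/ERR : Predictive Failure Deasserted",
    "ID = 10e2 : 10/27/2009 : 14:28:56 : Power Supply : PWRBS : State Asserted",
]

# Keep only the records that still need attention.
remaining = [line for line in LOG_LINES if not is_ignorable_fan_event(line)]
print(len(remaining))
```

The second sample line is the power budget record from earlier in this chapter; it is included only to show a record that the filter keeps.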

Powering On Batches of Blades Might Cause a Node to Fail to Power On (6813202)

On rare occasions, when you power cycle batches of blade nodes, either by issuing individual power-on commands using ipmitool or start /SYS, or by powering on a Sun Blade 6048 Modular System chassis with a rack full of blades, a node might fail to power on. The failed node returns an OFF status when an IPMI power status query is made.

Workaround: If you encounter this issue, try the following:

More Than Ten Open ILOM CLI Sessions Degrades Performance (6787190)

If more than ten ILOM CLI sessions are opened, system response time can degrade proportionally. When ILOM web interface sessions are included among the 10 open sessions, performance is likely to degrade at a higher rate.

Workaround: Do not invoke more than ten ILOM CLI sessions. Close those that are not in use to optimize performance. Close ILOM web interface sessions first for the best results.

Resetting the SP to Factory Defaults With Host Powered On Causes DIMM FRU Information to Be Lost (6970476, 6913602)

If you reset the SP to factory defaults when the host is on, you might no longer see DIMM FRU information in ILOM.

Workaround:

1. Open a terminal window and log in to the node ILOM SP using an SSH connection.

2. From the prompt, power off the node host by entering the command:

-> stop /SYS

3. Reset the SP by entering the command:

-> set /SP reset_to_defaults=factory

4. Reboot the node SP by entering the command:

-> reset /SP

5. After the SP successfully reboots, power on the node host by entering the command:

-> start /SYS

You should now be able to view DIMM FRU information using ILOM.

 
