CHAPTER 7

Resolved Issues and Documentation Errata

This chapter describes Sun Fire X4140, X4240, and X4440 Servers Product Notes resolved issues.

The following topics are covered:

Resolved Issues
Partially Resolved Issues
Documentation Errata


Resolved Issues

Issues that have been corrected since the previous issue of this document are described below for reference.

OpROM Error Message in the POST Screen When LSI/Adaptec BIOS Utility is Invoked (6713600)

For Sun Fire X4140, X4240, and X4440 servers with an LSI or Adaptec HBA controller, if the adapter's BIOS utility is invoked with Ctrl+A or Ctrl+C, other option ROMs may be prevented from loading, and a message such as "Error: Not Enough Space For Option Rom" might display. This is due to the extra memory space used by these utilities.

If the other option ROMs are required for that boot, it may be necessary to reboot once after running the utility.

Refer to CR 6678276.

X4140 SMBIOS Reports 32 DIMMs Instead Of 16 DIMMs Physically Available (6764995)

When prtdiag is run on Sun Fire X4140 servers, SMBIOS reports 32 DIMMs present instead of the 16 DIMMs physically available.

SGXPCIESAS-R-INT-Z Card Firmware Should Be At Revision 15825 Or Newer For Best Performance (6747434)

Update the firmware on the internal HBA card, SGXPCIESAS-R-INT-Z, to revision 15825 or newer for best performance in X4140, X4240, and X4440 servers. Refer to the Diagnostic Guide for update procedures.

Hard Disk Drive Power Management Using autopm

The Solaris OS provides the ability to perform power management. It can be configured to automatically power off idle system components.



Note - To save power, we recommend that you enable power management for your hard drives.


The file /etc/power.conf contains the configuration settings. It is initialized during boot-up, and can be initialized from the command line by entering the command pmconfig.

The autopm entry in /etc/power.conf is used to enable or disable power management on a system-wide basis. The format of the autopm entry is:

autopm behavior

where behavior can be one of the following:


Behavior   Description

default    Systems that fall under the United States Environmental Protection Agency's Energy Star Memorandum of Understanding #3 have automatic device Power Management enabled. All others do not.

enable     Automatic device Power Management is started when this entry is encountered.

disable    Automatic device Power Management is stopped when this entry is encountered.


For additional information, see the man pages for pmconfig(1M) and power.conf(4).
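As a rough illustration of the autopm entry format described above, the following Python sketch rewrites the autopm line in the text of a power.conf file. The function name set_autopm and the sample file content are assumptions made for this example; they are not part of Solaris. The sketch only edits text and does not touch /etc/power.conf itself.

```python
# Hypothetical helper: set the system-wide autopm behavior in power.conf text.
VALID_BEHAVIORS = {"default", "enable", "disable"}


def set_autopm(conf_text, behavior):
    """Return conf_text with exactly one active 'autopm <behavior>' entry."""
    if behavior not in VALID_BEHAVIORS:
        raise ValueError("unsupported autopm behavior: %r" % (behavior,))
    new_entry = "autopm " + behavior
    out = []
    replaced = False
    for line in conf_text.splitlines():
        fields = line.split()
        # An active entry starts with the keyword; commented lines start with '#'.
        if fields and fields[0] == "autopm":
            if not replaced:
                out.append(new_entry)
                replaced = True
            continue  # drop any duplicate autopm entries
        out.append(line)
    if not replaced:
        out.append(new_entry)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "# illustrative power.conf content\nautopm default\n"
    print(set_autopm(sample, "enable"), end="")
```

After the edited text is written back to /etc/power.conf, running pmconfig reinitializes the settings, as noted above.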

Redirecting the Server Console to the Serial Port Using Solaris Operating System Commands (6623089)

You can redirect the server console to the serial port by using ILOM or by entering the following operating system command from the prompt:

% eeprom console=ttya

Using raidctl on a Failed Drive Can Produce an Error (6590675)

If you are running Solaris 10 8/07 and experience a drive failure in a RAID volume, attempting to run raidctl(1M) might give the following error:

bash-3.00# raidctl 
Device record is invalid.

raidctl(1M) will function correctly once the failed drive is replaced.

Workaround: Use ILOM as an alternative online means of checking drive status during a failure. Then use the drive’s ready-to-replace LED when you are ready to replace the failed drive with a new one.

Sun Fire Servers with StorageTek SAS 8-Port Internal HBA Report SERR Events During Boot (6603801)

If your server comes with a Sun StorageTek SAS 8-Port Internal Host Bus Adapter (LSI 3081E-S based), part number SG-XPCIE8SAS-I-Z or SG-PCIE8SAS-I-Z, the HBA reports PCI SERR events via FMA every time Solaris is booted.

The likely cause of the SERR is that when FMA scans the server’s PCI bus, it probes for a function that does not exist on the HBA.

Executing an fmdump -e command after the system has booted yields four events similar to those shown in this example:

Sep 21 16:05:56.8682 ereport.io.pciex.rc.nfe-msg
Sep 21 16:05:56.8682 ereport.io.pci.sec-rserr
Sep 21 16:05:56.8682 ereport.io.pci.sec-ma
Sep 21 16:05:56.8682 ereport.io.pci.sserr

These error messages can be ignored.
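When reviewing fmdump -e output on an affected server, it can be convenient to hide the four known events so that anything else stands out. The following Python sketch shows one possible way to do that. The function name unexpected_events is an assumption made for this example, and the filter matches on event class only, so it would also hide the same classes if they were reported for a different device.

```python
# Event classes listed in the example above, which can be ignored on servers
# with the StorageTek SAS 8-Port Internal HBA.
KNOWN_BENIGN_CLASSES = {
    "ereport.io.pciex.rc.nfe-msg",
    "ereport.io.pci.sec-rserr",
    "ereport.io.pci.sec-ma",
    "ereport.io.pci.sserr",
}


def unexpected_events(fmdump_output):
    """Return the event lines whose class is not in the known benign set."""
    remaining = []
    for line in fmdump_output.splitlines():
        fields = line.split()
        # Skip blank lines and anything that is not an event line (for
        # example a column header): the class is assumed to be the last field.
        if not fields or not fields[-1].startswith("ereport."):
            continue
        if fields[-1] in KNOWN_BENIGN_CLASSES:
            continue
        remaining.append(line.strip())
    return remaining


if __name__ == "__main__":
    import sys

    # Example use: fmdump -e | python filter_events.py  (script name is hypothetical)
    for event in unexpected_events(sys.stdin.read()):
        print(event)
```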

Time of Day Jumps Disrupting Time-Controlled Applications (6613085)

On rare occasions, Sun Fire X4140, X4240, and X4440 servers running the Solaris OS might reset the system date and time (to either a past or a future time). This can affect time-controlled applications.

Workaround: If you encounter this problem, reset the system time and date through the OS or system BIOS.

MAC Failure During Late Packet Collision Might Hang System (6648502)

On rare occasions, a collision occurring during the preparation of the last byte of a transmit packet’s FCS might cause a Sun Fire server’s onboard Nvidia NIC to hang and require a system reboot. This issue has been found to occur on X4140, X4240 and X4440 servers (utilizing the GbE port of the Nvidia MCP55 chipset) running Solaris and will only occur with an incorrectly configured network connection.

Workaround: To avoid the hang condition, set the onboard Nvidia NIC and the network switches (or hub and link stations) to the same speed and duplex settings. Either configure the Nvidia NIC and each connected port for the same duplex mode, or configure both for auto-negotiation.

Solaris GRUB Might Fail to Find Onboard Nvidia Network Interface Cards (6617677)

In the default configuration, recently released versions of Solaris (Nevada) GRUB might fail to find the server’s onboard Nvidia NICs.

Workaround: Use the server’s BIOS Setup to change the mode for the onboard NICs to “MAC Bridge mode.”

Solaris FMA Might Report Inaccurate PCI-e Slot Number Information (6653828)

For PCI-e cards with a bridge chip, functions behind that bridge might have an erroneous slot number in the IRQ routing table. If an error is reported by Solaris FMA (Fault Management Architecture) for one of the functions behind that bridge chip, it might report the wrong slot number.

Workaround: If Solaris FMA reports a problem related to a function on a PCI-e card, carefully consider both the slot number reported (which might not be accurate) and the configuration of option cards in the server to help you discern which PCI-e card has the problem.

(RHEL 4.5) Shows Incorrect System CPU Speed When AMD PowerNow Feature is Turned On (6614369)

Systems running RHEL 4.5 do not show the correct CPU speed when the AMD PowerNow! feature is turned on.

Workaround: A patch is required for the cpuspeed function to work properly. Go to the Sun download web site and look for the cpuspeed.zip patch file.

Sun Fire X4240 Server Does Not Boot From CD or External Bootable Device (6669327)

A Sun Fire X4240 that is fully populated with 16 internal hard disks might not boot from the onboard CD or any external bootable device.

Workaround: If you experience this issue, temporarily unplug one of the server’s internal hard disks to boot from a CD or external device. Replace the disk once you no longer need to boot the server from a CD or external device.

QLogic 4G HBA Periodically Does Not Show Up in the Device List (6642133)

On Sun Fire X4240 systems running Solaris, a QLogic FC HBA installed in slot 5 might periodically go offline and not show up in the device list. An example of the error messages logged in /var/adm/messages follows:

Dec 13 04:20:40 x4240p1-01 qlc: [ID 308975 kern.warning] WARNING: qlc(5): login fabric port failed D_ID=fffffch, error=100h
Dec 13 04:20:40 x4240p1-01 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(5): Link OFFLINE
Dec 13 04:22:15 x4240p1-01 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(5): Link ONLINE

Although you can successfully ping the server, you cannot log in or access it from the console. The offline condition is not silent, and there is no data corruption.

There is currently no workaround.
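Because the condition is logged, one way to keep track of it is to scan /var/adm/messages for the qlc link transitions shown in the example above. The following Python sketch extracts those transitions. The function name link_transitions is an assumption made for this example, and the pattern assumes that log lines follow the format of the sample messages.

```python
import re

# Matches the qlc link state notices shown in the sample messages, for
# example "qlc(5): Link OFFLINE". The number is the qlc instance.
LINK_RE = re.compile(r"qlc\((\d+)\): Link (OFFLINE|ONLINE)")


def link_transitions(messages):
    """Return (timestamp, instance, state) tuples for qlc link transitions."""
    events = []
    for line in messages.splitlines():
        match = LINK_RE.search(line)
        if match:
            # The first three fields are assumed to be month, day, and time.
            timestamp = " ".join(line.split()[:3])
            events.append((timestamp, int(match.group(1)), match.group(2)))
    return events


if __name__ == "__main__":
    with open("/var/adm/messages") as log:
        for timestamp, instance, state in link_transitions(log.read()):
            print(timestamp, "qlc(%d)" % instance, state)
```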

Properties Displayed for Non-Existent Fans (6639289)

The ILOM service processor provides monitoring of fans. On certain models, there is no third fan module, but the ILOM GUI and CLI interfaces display properties for the non-existent fan modules.

Workaround: You can safely ignore information for fan module 3 (fm3) on models which do not support it.

Modifications to ILOM Serial Port Configurations Might Not Be Saved, or Change Unexpectedly (6632937)

The ILOM service processor provides the capability to customize the speed of external and server serial ports. In certain rare circumstances, such as modification of many ILOM properties at the same time, the ILOM serial port configuration might not be saved, or the baud rate might change at unexpected times.

Workaround: Retry the serial port configuration, and reset the ILOM service processor if the serial port stops working. Refer to the Sun Integrated Lights Out Manager User’s Guide for information on how to reset the service processor.

Serial Port Settings Made Using ILOM Web Interface or CLI Might Not Save Properly (6648398)

The ILOM service processor provides configuration for serial port settings for the external serial management port, and host port. In certain rare circumstances, when using the ILOM Web interface or CLI interface to configure the serial port, the settings might not properly save and the ILOM service processor becomes unresponsive.

Workaround: To clear the condition, shut down and remove AC power from the server. If the problem occurs persistently, upgrade the service processor firmware using the latest Sun Installation Assistant (SIA) for your server to recover.

Sun Fire X4140 Server Does Not Send ILOM Email Alerts (6649656)

The ILOM service processor provides IPMI, SNMP, and e-mail alerts to monitor the server. On the X4140 Server, e-mail alerts are not sent.

Workaround: Use an alternative alert notification method (IPMI or SNMP). You can use the remote syslog capability which provides logging of events to remote syslog servers, or the ipmievd event daemon distributed with the ipmitool package included on the Tools and Drivers CD for your server. These all allow remote monitoring of events on the server.

No SNMP Trap Sent When a Sensor Event Occurs (6675315)

The ILOM service processor provides alert mechanisms when events occur. IPMI PET traps, SNMP traps, and email alerts are supported. In certain circumstances, these alert mechanisms do not function correctly. For example, using the ILOM command line interface (CLI) to configure alerts might not work. This does not cause any problems with server operation.

This issue has been seen on Sun Fire X4140, X4240 and X4440 servers with ILOM version 2.0.2.3.

Workaround: Use the ILOM web GUI interface to configure alerts, rather than ILOM CLI, to avoid issues when modifying and displaying alerts.

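If the SNMP alert functionality cannot be enabled, use one of the other supported alert mechanisms, such as IPMI PET traps or email alerts.

Other remote notification mechanisms include the remote syslog capability and the ipmievd event daemon distributed with the ipmitool package.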

False Chassis Intrusion Events Logged (6671003, 6676862)

In some X4140 systems with a single AC power supply, spurious chassis intrusion events might get logged by the service processor even when the top cover is properly installed. These events are caused by noise generated from early production power supplies (part number 300-2015-05). These spurious events do not otherwise affect the system.

The events are characterized by chassis intrusion asserting and then, 5-6 seconds later, deasserting in the ILOM event log. This behavior is similar to what you would see if you plugged a second AC supply into a powered system.

Workaround: Until a long term fix is available, adding a second AC supply into systems that frequently exhibit this issue should significantly reduce the noise and associated false events.

BIOS and ILOM Display Different System GUID (6650248)

The IPMI specification provides for a System GUID (Globally Unique Identifier) which uniquely identifies the server and is viewable through the server’s ILOM. Likewise, the server BIOS also provides a GUID.

The BIOS and ILOM each display the system GUID information differently.

Example (dmidecode)

...

Handle 0x0001
DMI type 1, 27 bytes.
System Information
Manufacturer: Sun Microsystems
Product Name: Sun Fire X4240
Version: 1.00
Serial Number: 0747QCD00F
UUID: 00000000-0000-0000-0000-00144F8D2F26
Wake-up Type: Power Switch

Example (ipmitool)

...

FRU Device Description : /UUID (ID 100)
Product Extra : 080020FFFFFFFFFFFFFF0144F8D2F26

Ethernet Activity Light Stays Steady On During High Traffic (6630669)

During high traffic, the host Ethernet green activity LED stays on. Typically, you would expect to see flickering to indicate pauses in traffic, but in this case the LED does not flicker. This does not indicate a problem and can be ignored.

Error Message During POST Waits for F1 Keystroke to Resume (6680490)

If the server’s BIOS POST encounters non-fatal errors or warnings, the system might pause with the message “Press F1 to resume”.

Workaround: If this message is encountered, press F1 to allow the system to continue with the boot process.

AMD Erratum 326: Misaligned Load Operation Might Cause Processor Core Hang (6682358)

Under a rare and highly specific set of internal timing conditions, load operations with a misaligned operand might hang the system processor core. Any instruction loading data from memory without a LOCK prefix where the first byte and the last byte are in separate octal words might cause this condition.

There is currently no workaround for this issue. A future release of the server’s BIOS might contain a workaround which will prevent the problem.

Disabling All PCI-e Card Option ROMs (OpROMs) Makes the System Non-Functional (6678276)

On a Sun Fire X4440 server, if you go into the BIOS Setup program and disable all option card OpROMs, the system will hang during option ROM initialization at boot and become non-functional.

Workaround: Do not disable all PCI-e option card OpROMs. If you need extra OpROM space, disable one PCI-e card OpROM at a time until you achieve the desired results, but do not disable all of the PCI-e option card OpROMs at once.

Errors Generated for Non-Existent Entries in PIRQ_Tables (6609245)

The IRQ routing table on this system has two table entries corresponding to PCI bus numbers 0x90 and 0x91 that do not correspond to real devices. On rare occasions, the operating system might log, or system management software might note, warnings that these are unknown devices. In addition, these table entries list a PCI slot number 4 which conflicts with a real PCI slot.

You can safely ignore errors on devices with bus numbers 0x90 and 0x91.

Intel D33025 PRO/1000 PT Desktop Adapter Does Not Allow Network Boot (6663738)

If you require a network boot option from a network card, do not use the Intel single Gigabit Ethernet PCI-e adapter (Intel D33025 Pro/1000 PT). This card does not support network boot in X4140, X4240 or X4440 servers.

Workaround: Use one of the onboard network interfaces or a PCI-e option card supported for your Sun server to perform the network boot.

Displays Error During Boot with Option Cards Installed in Slots 2, 4 or 5 (6648377)

When the BIOS builds the MPS table, a routine checks which IRQs are assigned to any PCIe option card slots before making IRQs available for legacy devices. However, the code does not check slots 2, 4, and 5.

This means that IRQ resources used by cards installed in those slots are not accounted for and reserved by the BIOS. During OS boot, the operating system detects this discrepancy and displays the following error message:

TSC: 755147304 CPU0: 0) Chipset: 252: IRQ 23 has no pin (COS vector is 00)

Make sure PCI bridges are assigned to COS

Workaround: This condition can cause a loss of functionality. Option cards inserted in the affected slots disable two of the onboard network interfaces. See “VMware ESX Does Not Detect All Onboard NICs if PCIe Option Card Is Installed in Slots 2, 4 or 5 (6652529, 6623720)” on page 17 for details on specific slots and workaround information.

ILOM 2.0.2.3 on a Sun Fire X4140 Server Becomes Unresponsive After 100+ Days (6787121)

This issue has been resolved in Sun Fire X4140 servers with software release 2.3.

Apparently Leaking Sockets (6789447)

This issue has been resolved in Sun Fire X4140 servers with software release 2.3.

Partially Resolved Issues

The following issue is partially resolved. For the resolution, see “(RHEL 4.5) Sun Fire X4240/X4440 Quad-Core Systems Have Hypertransport Sync Flood Error Under High IO Load (6682186)” on page 30.

Unexpected Reboot Followed by Hyper Transport Sync Flood Error During POST (6682186)

In rare instances, this issue has been reported on Sun Fire X4240 and X4440 servers with AMD Opteron quad-core processors. A “Hypertransport Link Protocol Error” indicates that there is an MMIO mapping overlap or discrepancy in the low 4 GB between PCI space and pure memory.

Message: BIOS handoff failed Occurs During Installation of ESX (6639297)

During the installation of ESX 3.0.2, you might receive a message that states:

echci-hcd 00:2.1:Bios handoff failed

This is a warning only and does not affect system functionality.

You can ignore this message.

BIOS Incorrectly Shows Label of Sun Fire X4240 on a Sun Fire X4440 Server (6689691)

The Sun Fire X4440 server shows an incorrect label of X4240 in the BIOS display. The Service Processor reports the correct product name.

This issue occurs only on a two-socket Sun Fire X4440, and only when the BIOS is not able to communicate with the SP because of a hardware fault or because the SP firmware is unresponsive.

Workaround: To clear this fault, remove all AC cords for a short time, to reset both the SP and the server motherboard.


Documentation Errata

Documentation errata are described below.

In addition to these issues, see the following issues documented under other sections:

Addendum to the Sun Integrated Lights Out Manager 2.0 User’s Guide Contains Information That Does Not Apply

The Addendum to the Sun Integrated Lights Out Manager 2.0 User’s Guide covers a wide range of x64 servers. Thus, some of the information in that document does not apply to the Sun Fire X4140, X4240, and X4440 servers.

The following table shows which topics in the Addendum apply to the servers. All other topics do not apply.


TABLE 1 Topics in the Addendum That Apply to Sun Fire X4140/X4240/X4440 Servers

Topic                                                                                                         PDF page   Page
Maintenance->Configuration Management Window Description Revised                                              13         9
ILOM Configuration Corruption (Workaround 2)                                                                  17         13
Documentation Error: Edit Existing IP Addresses in ILOM Using the CLI Procedure Gives Incorrect Instruction   22         18


Single CPU Configuration DDR2 DIMM Placement

If a single-CPU configuration is ordered, the DDR2 DIMM physical memory layout requires that all memory be located next to the installed CPU, working from the outside in. Do not install memory on the side that does not have a CPU installed.