CHAPTER 2

Software Issues

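This chapter describes software issues related to the Sun Fire X4100 and Sun Fire X4200 servers and includes these topics:

Solaris 10 Operating System Issues

Sun Installation Assistant Issues

Linux Operating System Issues

Windows Server 2003 Operating System Issues

VMware ESX Issues

SunVTS Bootable Diagnostics CD Issues
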

Note - If an issue statement does not specify a particular platform, the issue applies to all platforms.


This chapter uses the following Linux-related acronyms:

RHEL (Red Hat Enterprise Linux) versions are usually cited with a version number (for example, RHEL4) and an update number (for example, U3).

SLES (SUSE Linux Enterprise Server) versions are usually cited with a version number (for example, SLES9) and a service pack number (for example, SLES9 SP3).


Solaris 10 Operating System Issues

Current Issues

Drives Moved From Two-Drive System to Four-Drive System Might Not Operate Correctly (6300178)

On systems that have two hard-disk drives, the drives in Slot 0 and Slot 1 are mapped to the OS as disk 2 and disk 3. Therefore, drives that are configured in Slot 0 or Slot 1 in systems with four hard-disk drives, and then moved into a two-disk system, might not operate correctly.

Workaround

None.

Solaris 10 3/05 x86 OS Patch Cluster Installation Required Before Installing Patches for Some Host Bus Adapters (6312352)

Certain patches for host bus adapters (HBAs), such as the Sun StorEdge Entry-Level Fibre Channel host bus adapter (QLA210), will not work without first installing a Solaris OS patch cluster on systems running Solaris 10 x86 OS and then rebooting the systems.

To install the patch cluster and the QLA210 patch:

1. Install the Solaris 10 3/05 operating system (if it is not already installed).

2. Install the recommended patch cluster.

For instructions on installing the patch cluster, see:

http://patches.sun.com/clusters/10_x86_Recommended.README

3. Install the recommended patch for the HBA.

For example, to install the QLA210 patch (119131-xx):

a. See the instructions at:

http://sunsolve.sun.com/pub-cgi/show.pl?target=patchpage

b. Enter 119131 in the PatchFinder text box.

4. Reboot the system.

Do Not Use raidctl Command in Solaris 10 3/05 OS (6228874)

The raidctl command enables you to manage the RAID controllers from the command-line interface. However, because the raidctl command is not supported on Solaris 10 3/05, using the command might cause the system to panic.

Workaround

A Solaris 10 3/05 patch (119851-13) that resolves this issue is available from the SunSolve download site.

If you do not have the latest Solaris 10 3/05 patch, use the MPT SCSI BIOS to create and manage the RAID volumes.
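Whether the patch is already installed can be checked from a script. The following is a minimal sketch, not a Sun-supplied tool: it assumes that the patch listing (for example, from showrev -p) prints one line per installed patch that begins with "Patch: <id>-<revision>", and the helper name has_patch is hypothetical. It needs a POSIX shell such as ksh or bash rather than the legacy Bourne shell.

```shell
# has_patch ID MINREV
# Reads a patch listing on stdin and succeeds if patch ID is present
# at revision MINREV or later. (Hypothetical helper for illustration.)
has_patch() {
    id=$1
    min=$2
    while read -r label patch rest; do
        [ "$label" = "Patch:" ] || continue
        case $patch in
            "$id"-*)
                rev=${patch#*-}          # text after the dash, e.g. "13"
                if [ "$rev" -ge "$min" ]; then
                    return 0
                fi
                ;;
        esac
    done
    return 1
}

# Example use on the server (not run here):
#   showrev -p | has_patch 119851 13 && echo "raidctl patch is installed"
```

If the check fails, the MPT SCSI BIOS remains the safe way to manage RAID volumes, as described above.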

Ignore Bootup Message: Method or service exit timed out (6297813)

If the input device and output device are set to the serial port (ttya), the following message might appear in the console during bootup:

svc:/system/power:default: Method or service exit timed out. Killing contract 17.

This message does not indicate a problem.

Solaris 10 OS Installation From CD Media Hangs When the Second Disc is Inserted (6374024)

During Solaris 10 OS installation, Solaris might report that it cannot find the second CD even though the second CD is inserted.

Workaround

This problem does not occur if you perform a net install. Solaris is then able to mount and read the CD images. You can also work around this problem by installing from DVD media rather than multiple CDs.

AMD Erratum 131 Warning Message Can Be Safely Ignored During Solaris OS Startup (6438926, 6447850)

Solaris AMD x64 OS support includes a boot-time check for the presence of a BIOS workaround for the AMD Opteron Erratum 131. If the Solaris OS detects that the workaround for Erratum 131 is needed but it is not yet implemented, Solaris logs and displays the following warning message:

WARNING: BIOS microcode patch for AMD Athlon(tm) 64/Opteron(tm) processor erratum 131 was not detected; updating your system’s BIOS to a version containing this microcode patch is HIGHLY recommended or erroneous system operation may occur.

Workaround

The Sun Fire X4100 and Sun Fire X4200 BIOS implements a superset workaround that includes the workaround required for Erratum 131, so this warning message can be safely ignored.


Sun Installation Assistant Issues

Current Issues

RHEL4: Cannot Enable Security-Enhanced Linux (SELinux) (6288799)

The Sun Installation Assistant does not allow SELinux configuration during the installation of RHEL4. The GUI for the SELinux option is disabled during the installation of RHEL4 U1 with the Sun Installation Assistant CD.

Workaround

To configure SELinux, run system-config-securitylevel after the installation.

Ignore Kudzu Messages After Installing RHEL3 or RHEL4 (6290559)

RHEL runs a hardware discovery utility named Kudzu. After you install RHEL3 or RHEL4 with the Sun Installation Assistant, Kudzu displays messages indicating that the Ethernet drivers need to be removed and added again.

The messages Kudzu displays are incorrect. The Ethernet drivers do not need to be changed. Click Ignore when you are prompted to change the hardware configuration.

Resolved Issues

The ext3 File System Reports Errors After Red Hat Linux Installation Using Sun Installation Assistant CD (6336064)

(Fixed in SIA 1.1.6.)

When the Sun Installation Assistant CD is used to install Red Hat Linux, the ext3 file system might report incorrect disk space utilization and file system full errors. This is because the file system was not being unmounted correctly by the utility on the CD.

Workaround

The problem has been fixed in the new version of the Sun Installation Assistant CD (version 1.1.6 or later) that is available on the Sun Download Center web site. Go to the following URL and click on Downloads.

http://www.sun.com/servers/entry/x4100/index.html

If you use the old version of the CD and you see these errors, correct the problem by entering the tune2fs command at a command line, and then reboot the server.


Linux Operating System Issues

Current Issues

RHEL3U9 (32-bit) Reverses Mapping of Ethernet Ports After BIOS Upgrade (6623425)

After an upgrade to BIOS 0ABGA042, RHEL3U9 (32-bit) reverses the order in which it maps physical to logical ports. This can interfere with network operations, including PXE installation of the OS, when not all Ethernet ports are used. This problem has not been observed in other versions of Red Hat Enterprise Linux, including the 64-bit version of RHEL3U9.

Workaround

Set the following kernel parameter:

pci=nosort
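On an installed system, a kernel parameter such as this is typically appended to the kernel line of each boot entry in the boot loader configuration. The following is a minimal sketch under that assumption: it expects a GRUB-style file (such as grub.conf) in which each boot entry has a line beginning with the word kernel, and the helper name add_kernel_param is hypothetical. It reads the configuration on standard input and writes the edited copy to standard output, so the original file is left untouched.

```shell
# add_kernel_param PARAM
# Appends PARAM to every "kernel ..." line read from stdin, unless the
# line already contains it. (Hypothetical helper for illustration.)
add_kernel_param() {
    awk -v p="$1" '
        /^[ \t]*kernel[ \t]/ {
            n = split($0, words, /[ \t]+/)
            found = 0
            for (i = 1; i <= n; i++)
                if (words[i] == p) found = 1
            if (!found) $0 = $0 " " p
        }
        { print }
    '
}

# Example use (review the result before replacing the real file):
#   add_kernel_param pci=nosort < /boot/grub/grub.conf > /tmp/grub.conf.new
```

For a PXE installation, the parameter would instead have to be passed on the installer's boot command line; the exact location depends on the PXE setup in use.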

Hard-Disk Drive Display Omits Disk Listing at Installation When Multiple SCSI Disks Are Attached to System on RHEL4 U4 (6447738)

The hard-disk drive display omits a disk listing during installation if there are many SCSI disks attached to a system. Not all disks are available during the installation.

In addition, the disk-drive display lists the wrong drive type after the installation.

Workaround

None. However, to display the omitted hard-disk drive, use one of the following instructions:

Duplicate Devices Seen by Linux OS if External RAID Array Connects to Server Through Ultra320 SCSI (6220406)

If a RAID array is attached to the system using a Sun StorEdge PCI/PCI-X Single Ultra320 SCSI host bus adapter (Ultra320 SCSI), you might see the following if you enter the command, fdisk -l, depending on which Linux OS you are using:

List of Attached Hard-Disk Drives for the Pyramid (QLogic) and Summit Option Cards is Not Displayed in Red Hat Linux (6460883)

Hard-disk drives for the Pyramid and Summit option cards are not displayed during installation or after the installation is complete on Red Hat Linux.

Exceptions: This behavior was not observed in RHEL4 U3 with a 64-bit processor.

Workaround

Add a device keyword to the installer kickstart file:

device <scsi/eth> xyz_driver [options]

To display the omitted cards, enter the following command in a terminal window:

modprobe qla2400

Here, qla2400 is the HBA driver module that is included with this version of Red Hat Linux software.

After you choose and perform one of the workarounds, reboot the system and run the following command to confirm that the driver is loaded:

fdisk -l 

Graceful Shutdown Not Available on Non-ACPI Supported Linux OS (6278514)

Some Linux OSs, such as RHEL3, do not support the Advanced Configuration and Power Interface (ACPI), which allows a graceful shutdown. On systems running non-ACPI Linux operating systems, only a forceful shutdown is available.

External Hard-Disk Drives Attached to Emulex HBA Are Not Recognized Because RHEL3 U8 Does Not Automatically Load Emulex Drivers (6447329, 6460769)

By design, drivers for external drives are never loaded automatically on RHEL3 U8.

Workaround

There are two possible workarounds:

1. Use either of the following to manually load the driver:

prompt> modprobe xyz_driver

prompt> insmod <path_to_driver>/xyz_driver

2. Save a copy of the original initrd file:

prompt> cd /boot

prompt> cp initrd-<kernel-version>.img initrd-<kernel-version>.img_SAVED

3. Create a new initrd file:

prompt> mkinitrd -f initrd-<kernel-version>.img <kernel-version>

When the system is rebooted the driver will be loaded automatically.



Note - You might have to modify the initrd file entry in the grub.conf file to reflect the initrd file name change. However, be sure to keep an unmodified kernel entry that references the original initrd file in the grub.conf file as a fallback.
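The backup step above can be wrapped in a small guard so that an existing saved copy is never overwritten. This is a minimal sketch only: the helper name backup_initrd is hypothetical, and using the running kernel's version (uname -r) in the example is an assumption that holds only if the initrd being rebuilt belongs to the running kernel.

```shell
# backup_initrd BOOTDIR KERNEL_VERSION
# Copies initrd-<version>.img to initrd-<version>.img_SAVED in BOOTDIR,
# keeping any saved copy that already exists. (Hypothetical helper.)
backup_initrd() {
    img="$1/initrd-$2.img"
    saved="${img}_SAVED"
    if [ ! -f "$img" ]; then
        echo "missing $img" >&2
        return 1
    fi
    if [ -e "$saved" ]; then
        echo "keeping existing $saved" >&2
        return 0
    fi
    cp -p "$img" "$saved"
}

# Example use, followed by the rebuild command from step 3 (not run here):
#   kver=$(uname -r)
#   backup_initrd /boot "$kver" && mkinitrd -f "/boot/initrd-$kver.img" "$kver"
```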


Alternatively, add a device keyword to the installer kickstart file:

device <scsi/eth> xyz_driver [options]

After you choose and perform one of the workarounds, reboot the system and run the following command to confirm that the driver is loaded:

fdisk -l 

Base Versions of Linux Distributions Shipped By Sun Must Be Upgraded to Receive Full Sun Support

The RHEL3, RHEL4, and SLES9 CDs that you can purchase from Sun are the base (initial-release) versions of those operating systems (OSs), not the latest updated versions. Although Sun will help customers install these base versions from the shipped media, customers are expected to upgrade immediately to RHEL3 U6, RHEL4 U3, and SLES9 SP2 to get full Sun support for servers running those OSs.

1. Go to Sun’s download site for these platforms and download the latest Sun Installation Assistant software. The latest version, 1.1.6, is designed to support installation of the base versions of the Linux OSs.

2. Burn the new SIA software to CD.

3. Use the new SIA CD you burned to install the version of the OS that you received from Sun.

Refer to the Sun Fire X4100 and Sun Fire X4200 servers Operating System Installation Guide for detailed instructions.

4. Immediately download the latest update or patches from the Linux vendor’s web site and install them.

Refer to the Sun Fire X4100 and Sun Fire X4200 servers Operating System Installation Guide for detailed instructions.

Unloading QLogic Drivers Might Be Necessary Before Installing Updated Drivers (6312342, 6314923)

When installing the updated QLogic drivers for the QLA210 or QLA2342 option cards, you must manually unload the current drivers or the installation will fail. The modprobe -rv command does not work with these drivers.

Workaround

1. To check for existing QLA drivers, enter the following command:

# lsmod | grep qla

The output should look like this:

qla6322               129536  0
qla2xxx_conf          310536  1
qla2xxx               226960  1 qla6322
scsi_transport_fc      16384  1 qla2xxx
scsi_mod              140800  8 usb_storage,st,sr_mod,sg,qla2xxx,scsi_transport_fc,mptscsih,sd_mod

2. Unload the drivers as shown in the following example:

# rmmod qla6322
# rmmod qla2xxx

3. Load the updated QLA drivers.
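The unload order matters because a module that is still used by another module cannot be removed. As a hedged illustration, the following sketch summarizes the QLA entries in an lsmod listing so that the dependencies are easy to read; the helper name list_qla_modules is hypothetical, and the sketch only reports information, it does not unload anything.

```shell
# list_qla_modules
# Reads lsmod-style output on stdin and prints, for each module whose name
# starts with "qla": the module name, its use count, and the modules that
# use it ("-" if none are listed). (Hypothetical helper for illustration.)
list_qla_modules() {
    awk '$1 ~ /^qla/ { print $1, $3, ($4 == "" ? "-" : $4) }'
}

# Example use on the server (not run here):
#   lsmod | list_qla_modules
```

In the listing shown in step 1, qla2xxx is used by qla6322, which is why qla6322 is removed first in step 2.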

Translation Look-Aside Buffer (TLB) Reload Causes Errors With Certain Linux Software (6296473)



Note - We recommend that RHEL3 users install the most recent OS update on the server to alleviate this issue.


The BIOS Advanced menu (CPU Configuration menu), in the BIOS Setup utility, contains an option named “Speculative TLB Reload.” By default, this setting is enabled, which allows TLB reload.

With this default setting, you might see errors similar to the following on systems running any 64-bit version of RHEL or SLES with Service Pack 1.

Northbridge status a60000010005001b
GART error 11
Lost an northbridge error
NB status: unrecoverable
NB error address 0000000037ff07f8
Error uncorrected

Workaround

To avoid these errors, disable TLB reloading:

1. Reboot the server and press F2 to enter the BIOS Setup utility.

2. Go to the Advanced -> CPU Configuration menu.

3. Use the arrow keys to highlight the Speculative TLB Reload option, and change its setting to Disabled.

This disables TLB reloading.

4. Save your changes and exit the utility.

AMD PowerNow! Might Cause System Clock to Lose Ticks (6281771)

The AMD PowerNow! feature is disabled in the BIOS by default. Before enabling it, verify that your operating system and applications support the PowerNow! feature.

The PowerNow! feature changes CPU clock rates. A loss of timer ticks has been observed while running recent Linux SMP kernels when PowerNow! is enabled. This loss of timer ticks might result in timing errors in the kernel and in user applications. Symptoms might include timers that prematurely time out and the time of day clock appearing to behave erratically.

Workaround

Disable the PowerNow! feature by using the BIOS Setup utility. The menu path to the feature’s screen is Main -> Advanced -> AMD PowerNow Configuration.

RHEL3: I/O Errors Are Displayed When Initializing USB Mass Storage Device (6241851)

RHEL3 displays many I/O errors when a USB device is being initialized. The USB mass storage driver uses the SCSI subsystem to access the device. When a USB mass storage device is attached, the driver attempts to identify it as a SCSI device. The I/O errors displayed are a result of this initialization probe. The I/O errors can be ignored, and the USB device should work properly once it is initialized. This problem and its workaround are documented at:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=156831

RHEL3: Kernel Might Report Incorrect CPU Information on Dual Core Processors (6241701)

When two dual-core processors are installed on a Sun Fire X4200 server, the RHEL3 kernel might report the four cores as hyperthreaded CPUs that all share the physical ID 0. The physical IDs should instead be 0 and 1, one for each processor.
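To see what the kernel reports, the physical ID values in /proc/cpuinfo can be tallied. The following is a minimal sketch that assumes the usual "physical id : N" lines are present in that file; the helper name count_physical_ids is hypothetical.

```shell
# count_physical_ids
# Reads /proc/cpuinfo-style text on stdin and prints one line per distinct
# physical ID: the ID followed by the number of logical CPUs reporting it.
count_physical_ids() {
    awk -F: '
        $1 ~ /^physical id/ {
            id = $2
            gsub(/[ \t]/, "", id)    # strip the padding around the value
            count[id]++
        }
        END { for (id in count) print id, count[id] }
    ' | sort
}

# Example use on the server (not run here):
#   count_physical_ids < /proc/cpuinfo
```

With two dual-core processors, two output lines (one for ID 0 and one for ID 1, each with a count of 2) would match the expected mapping, whereas a single line for ID 0 with a count of 4 would match the behavior described in this issue.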

RHEL3 U5 (64-bit): Ignore Keyboard reset failed Message (6306118)

If a USB keyboard is connected to either the front or back USB port, a system running RHEL3 U5 (64-bit) always shows the following error message in the dmesg output after a reboot:

initialize_kbd: Keyboard reset failed, no ACK

This message does not indicate a problem.

Cannot Access External Storage Attached to Emulex and QLogic HBA Cards During RHEL3 U8 Installation (6447329)

If you have Emulex and QLogic HBA cards installed in your server, you might not be able to access external storage during RHEL3 U8 installation because the installer software does not load the appropriate kernel modules automatically. You therefore cannot perform setup and initialization of any external storage devices that are connected to those HBA cards during RHEL3 U8 installation (for example, disk formatting or RAID set up).

Workaround

Perform the required hard-disk drive configuration manually after the operating system is installed on the local disks. If you use the KickStart automated installation, it is possible to force the installer to load a specific driver with the device and deviceprobe commands. Refer to the Red Hat KickStart documentation for instructions.

Sun Fire X4100 Server Might Reboot When MTU is Set to 9K on Kirkwood Interface (6335741)

The Sun Fire X4100 server might spontaneously reboot when running network traffic over the Kirkwood interface, in a Linux environment. This problem has only been observed when the MTU is set to 9K.

Workaround

None.
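Because no workaround exists, it can help to know which interfaces are configured with a large MTU. The following is a minimal sketch that assumes "9K" corresponds to an MTU of 9000 bytes or more and that the interface list comes from a command such as ip -o link show, which prints one interface per line with an "mtu N" field. The helper name jumbo_ifaces is hypothetical, and the sketch does not identify which interface is the Kirkwood interface.

```shell
# jumbo_ifaces
# Reads one-line-per-interface link listings on stdin and prints the name
# and MTU of every interface whose MTU is 9000 or larger.
jumbo_ifaces() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "mtu" && $(i + 1) + 0 >= 9000) {
                name = $2
                sub(/:$/, "", name)      # "eth0:" -> "eth0"
                print name, $(i + 1)
            }
    }'
}

# Example use on the server (not run here):
#   ip -o link show | jumbo_ifaces
```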

SLES9 64-Bit: Incorrect CPU Speeds Reported When PowerNow! is Enabled (6287519)

On systems running SLES9, incorrect CPU speeds might be reported in /proc/cpuinfo when the PowerNow! option is enabled. The maximum speed may not be reported.

Workaround

Disable the PowerNow! feature by using the BIOS Setup utility. The menu path to the feature’s screen is Main -> Advanced -> AMD PowerNow Configuration.

SLES9 SP1: Multipath Driver Does Not Work After Reboot (6332988)

The SLES9 SP1 multipath driver (mdadm) does not work after a reboot of the host.

Workaround

None.

SLES9 64-Bit: System Does Not Boot With Supported HBA Card Plugged Into Slot 0 (6307424)

On systems running SLES9, if a host bus adapter (HBA) card is plugged in to Slot 0, you might not be able to boot the system. This is because SLES9 enumerates IDE and SCSI devices in scan order, and the BIOS scans PCI devices in ascending order. The scanning priority is:

1. NIC

2. Slot 0

3. SAS

4. Slot 2

5. Slot 3

6. Slot 4

7. Slot 1

If there is only one drive in the system, it is enumerated as /dev/sda. If an external device is later connected to an HBA card in Slot 0, the device will be enumerated as /dev/sda and the internal device will be enumerated as /dev/sdb. However, the SLES9 boot device points to /dev/sda, which is an external device without the OS, and the system cannot boot.

The problem does not occur if the HBA card is plugged in to Slots 1-4, since these slots are scanned later than the on-board LSI controller. This problem is not specific to the server or the HBA card.
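The effect of the scan order on device names can be illustrated with a small simulation. This is a sketch for explanation only, not how the kernel or BIOS is implemented: it assumes one disk per controller, uses the scan order listed above, and the names SCAN_ORDER and enumerate_disks are hypothetical.

```shell
# Scan order as listed above (NIC first, Slot 1 last).
SCAN_ORDER="NIC Slot0 SAS Slot2 Slot3 Slot4 Slot1"

# enumerate_disks CONTROLLER...
# Prints one "/dev/sdX CONTROLLER" line for each named controller, handing
# out device letters in scan order (one disk per controller is assumed).
enumerate_disks() {
    letters="a b c d e f g"
    for ctrl in $SCAN_ORDER; do
        for have in "$@"; do
            if [ "$have" = "$ctrl" ]; then
                letter=${letters%% *}    # take the next free letter
                letters=${letters#* }    # and drop it from the pool
                echo "/dev/sd$letter $ctrl"
            fi
        done
    done
}

# Internal disk only:            enumerate_disks SAS
# Internal disk plus Slot 0 HBA: enumerate_disks SAS Slot0
# Internal disk plus Slot 1 HBA: enumerate_disks SAS Slot1
```

In the second case the Slot 0 device takes /dev/sda and the internal SAS disk moves to /dev/sdb, which is the situation that prevents booting. In the third case the internal disk keeps /dev/sda.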

Workaround

Plug the supported HBA card in to Slots 1-4, and then reboot the system. Also, follow these general guidelines:

Resolved Issues

Infinite Reboot Loop Cycle in RHEL4 U3 With smp Kernel, BIOS 31/34/36, and Single Dual-Core CPU (6466105)

(Fixed in Software 1.3.)

Sun Fire X4200 servers running RHEL4 U3 with BIOS 31, 34, or 36, an smp kernel, and a single dual-core CPU might fall into an infinite reboot loop.

Workaround

Use RHEL4 U1 instead of RHEL4 U3. This fix is planned for a future release.

RHEL3 U7 32-Bit Installation Might Hang When Any PCI Card is in a PCI Slot Other Than PCI 0 (6402552, 6404116, 6404944, 6407997)

(Fixed in Software 1.2.)

When installing RHEL3 U7 32-bit on Sun Fire X4100 or Sun Fire X4200 servers that have a PCI card installed in any slot other than PCI 0, installation might hang. This problem is not observed when installing RHEL3 U7 64-bit.

See FIGURE 2-1 or FIGURE 2-2 for the location of the PCI slots.

Workaround

Use RHEL3 U8 32-bit if you have a PCI card installed in any PCI slot other than PCI 0. (This limitation was fixed in Update 8.)

FIGURE 2-1 Sun Fire X4100 Designation and Speeds of PCI Slots


Diagram showing the locations, designations, and speeds of the PCI slots on the motherboard.

FIGURE 2-2 Sun Fire X4200 Designation and Speeds of PCI Slots


Diagram showing the locations, designations, and speeds of the PCI slots on the motherboard.


Windows Server 2003 Operating System Issues

Current Issues

VGA Output Unavailable After Headless Boot (6598754)

If the system is booted with no monitor connected, the VGA port remains inoperable until the system is rebooted. The console is still accessible through JavaRConsole.

Workaround

Obtain the latest driver from ATI. Version strings are 5.10.2600.6024 (32-bit) and 6.14.10.6025 (64-bit).

Bootup Time Affected by Degraded RAID Volume (6297804)

The bootup time for Windows Server 2003 could be significant (20 minutes or so) if there is a defective disk in the RAID array. Both Windows and firmware retries contribute to the time delay. The defective disk might be recognized by the controller under SAS Topology, but not under RAID Properties.

OS Cannot Be Installed on LSI RAID Array if RAID is Not Recognized as First Storage Device (6297723)

Windows Server 2003 requires that you use the first storage or the existing partition for installation. You cannot install Windows Server 2003 onto an on-board LSI RAID array if:

Alert and Power Failure LEDs Might Illuminate If AMD PowerNow! Feature is Enabled (6310814)

The AMD PowerNow! feature is disabled in the BIOS by default. Before enabling it, verify that your operating system and applications support the PowerNow! feature.

If you enable PowerNow! in a Windows Server 2003 environment, you might see a loss of timer ticks and a decrease in CPU voltage, resulting in alert and power failure LEDs illuminating.

Workaround

Disable the PowerNow! feature by using the BIOS Setup utility. The menu path to the feature’s screen is Main -> Advanced -> AMD PowerNow Configuration.

Windows Utility mkfloppy.exe Does Not Select Correct Floppy Drive if More Than One Floppy Drive is Present

The mkfloppy.exe utility that is included in FloppyPack.zip can be run on any Windows system; it is used to create the Mass Storage Driver floppy that is used during Windows Server 2003 installation.

However, if there is more than one floppy drive present in the system (including USB-attached floppy drives), mkfloppy.exe does not select the correct floppy drive.

Workaround

Ensure that the system has only one floppy drive present when using mkfloppy.exe.

Resolved Issues

Backup/Restore Functions in LSI MyStorage Cause Severe Problems (6456252)

(Fixed in Software 1.2.)

The LSI MyStorage Backup/Restore functionality causes optical drives to become unavailable, after which the LSI controller firmware must be reloaded.

Workaround

Do not use the Backup/Restore functionality. The version of the LSI MyStorage application on the Tools and Drivers CD has the Backup/Restore functionality disabled.

Systems with Under 4 GB Memory Fail to Resume from Hibernation when Running Windows Server 2003 with BIOS 34 (6457304)

(Fixed in Software 1.5.)

Hibernation is disabled by default in the InstallPack.exe for Sun Fire X4100 and Sun Fire X4200 servers, but it can be enabled by the user with the Windows Control Panel Power Options settings.

If a server with BIOS 34 enters the S4 Hibernation state, and it has less than 4 GB of available memory, the server might fail to resume from Hibernation. It will instead attempt to reboot, but hang with a blue-screen crash.



Note - The software memory hole feature is disabled by default in the BIOS. When it is disabled, even if you have 4 GB of memory installed, the system effectively has less than 4 GB of available memory. If you enable the software memory hole feature, 4 GB of installed memory gives 4 GB of available memory. You can enable the software memory hole feature in the BIOS Configuration Utility (Chipset menu -> Memory Configuration screen).


Workaround

Do not enable Hibernation if your server has less than 4 GB available memory.


VMware ESX Issues

Current Issues

ESX Installation Stops (6549480)

While installing ESX Server 2.5.2, 2.5.3, or 2.5.4 in a boot from SAN configuration using an optical drive, the installation may stop after displaying “running /sbin/loader”.

Workaround

When booting from the CD, watch for the “boot:” prompt at the bottom of the screen. When it appears, type

bootfromsan nousb

and press the enter key. The system may also hang when booting from the SAN. Again, watch for the “boot:” prompt; this time, type

nousb

and press the enter key. To have this workaround happen automatically, edit /etc/lilo.conf. Add the keyword nousb to the beginning of every append= line in the file. If there is no append= line, add one:

append="nousb"
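The lilo.conf edit described above can be sketched as a script. The following is an illustration under stated assumptions, not a VMware-supplied procedure: it assumes that existing append= values are enclosed in double quotes, it does not check whether nousb is already present, and when no append= line exists it places the new line at the top of the file so that it acts as a global option. The helper name add_nousb is hypothetical. The edited text is written to standard output so that the original file is not changed.

```shell
# add_nousb FILE
# Prints FILE with nousb added to the beginning of every append= line,
# or with a new append="nousb" line at the top if there is none.
add_nousb() {
    conf=$1
    if grep -q '^[[:space:]]*append=' "$conf"; then
        sed 's/^\([[:space:]]*append="\)/\1nousb /' "$conf"
    else
        echo 'append="nousb"'
        cat "$conf"
    fi
}

# Example use (review the result before replacing the real file):
#   add_nousb /etc/lilo.conf > /tmp/lilo.conf.new
```

Changes to lilo.conf generally take effect only after the boot loader is reinstalled by running /sbin/lilo.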

ESX Does Not See Keyboard and Mouse (6550504)

When installing ESX Server 2.5.4, the keyboard and mouse may become inoperative.

Workaround

Same as for ESX Installation Stops (6549480).


SunVTS Bootable Diagnostics CD Issues

Current Issues

Meter Button in Bootable Diagnostics CD, Version 2.1f Does Not Work (6465167)

SunVTS 6.2 Graphical User Interface (GUI), shipped on the Bootable Diagnostics CD, Version 2.1f, has a Meter button. This Meter button does not work because it requires the Solaris stdperformeter utility, which is not available for bootable diagnostics.

Ignore Messages When Booting from SunVTS Bootable Diagnostics CD .iso Image, Version 2.1f (6470488)

If you boot from the SunVTS Bootable Diagnostics CD .iso image, version 2.1f, through a virtual CD-ROM or on some CD-ROM models, you might see the following messages. These messages are harmless and can be ignored:

Sep  7 03:49:11  scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci1022,7460@6/pci1022,7464@0,1/storage@1/disk@0,0 (sd0):
Sep  7 03:49:11         Error for Command: read(10)       Error Level: Fatal
Sep  7 03:49:11  scsi: [ID 107833 kern.notice]  Requested Block: 109118                    Error Block: 109118
Sep  7 03:49:11  scsi: [ID 107833 kern.notice]  Vendor: AMI                                Serial Number:
Sep  7 03:49:11  scsi: [ID 107833 kern.notice]  Sense Key: Media Error
Sep  7 03:49:11  scsi: [ID 107833 kern.notice]  ASC: 0x11 (unrecovered read error), ASCQ: 0x0, FRU: 0x0

Resolved Issues

SunVTS ramtest Might Cause System to Reboot When Testing More Than Seven Hours (6369893)

(Fixed in Software 1.1.)

A memory test under exclusive mode in SunVTS (version 6.1 and earlier), ramtest, exercises a corner case that does not follow AMD programming guidelines. Therefore, on early Sun Fire X4100 or Sun Fire X4200 servers, ramtest might cause the system to reboot after an extended test run of more than seven hours. Software that follows AMD programming guidelines, as the code generated by most compilers does, functions properly on Sun Fire X4100 and Sun Fire X4200 systems.

Workaround

This problem is fixed in SunVTS version 6.1sp1 and later. To get the latest version of SunVTS, download it from this URL:

http://www.sun.com/oem/products/vts/

If you have SunVTS version 6.1 or earlier, SunVTS pmemtest and vmemtest are suitable memory diagnostics for extended test runs. When performing test runs of more than seven hours, use pmemtest or vmemtest, rather than ramtest.