CHAPTER 3
Software Notes and Issues
This chapter describes software issues related to the Sun Blade X6220 server module. It includes the following subjects:
This section lists issues that are not specific to any operating system or apply to more than one operating system.
If you use the following arguments with the suncfg tool, the system will hang:
The suncfg tool will not be included in the SW 1.2 Tools and Drivers CD.
This section lists issues that are specific to the Solaris operating system.
During a hot-swap, if you insert any of the following PEMs, you might encounter configuration failure errors: SG-(X)PCIE2FCGBE-E-Z, SG-(X)PCIE2FCGBE-Q-Z, or X7284A-Z.
In the Solaris message file, you might see the following error message:
If you manually configure the PEM with the cfgadm command, it will return the following error message:
Reboot Solaris 10 10/08 with the PEM installed.
The UDP application may show an RCR L4_CSUM_ERROR in the Solaris message file.
An incorrect UDP checksum causes this error to appear in the Solaris message file on Solaris 10 10/08 systems with X7287A-Z and X1028A-Z PCI EMs.
This issue will be solved in the next Solaris release. For now, install patch 139570-05 or 138899-07 to fix this problem.
After you hot-plug an X7284A-Z PCI EM into Slot 1 and reboot Solaris 10 5/08, the NIC path name of the second on-board NGE interface changes from nge1 to nge2.
If an X7287A-Z PCI EM is also installed in Slot 0, the NIC path names of the nxge interfaces might change from the original values of nxge0, nxge1, nxge2, and nxge3 to nxge4, nxge5, nxge6, and nxge7.
After you hot-plug an SG-(X)PCIE8SAS-EB-Z PCI ExpressModule (PCI EM) into a Sun Blade X6220 server module with Solaris 10 5/08 installed, a message similar to the following is displayed:
Jun 23 14:07:12 nsgsh-dhcp-217 fmd: [ID 441519 daemon.error] SUNW-MSG-ID: SUNOS-8000-1L, TYPE: Defect, VER: 1, SEVERITY: Minor
Jun 23 14:07:12 nsgsh-dhcp-217 EVENT-TIME: Mon Jun 23 14:07:12 CST 2008
Jun 23 14:07:12 nsgsh-dhcp-217 PLATFORM: Sun Blade X6220 Server Module, CSN: 0111APO-0749BZ 055C , HOSTNAME: nsgsh-dhcp-217
Jun 23 14:07:12 nsgsh-dhcp-217 SOURCE: eft, REV: 1.16
Jun 23 14:07:12 nsgsh-dhcp-217 EVENT-ID: 2e500640-3f7a-c7cc-d33e-d560f3b08735
Jun 23 14:07:12 nsgsh-dhcp-217 DESC: The EFT Diagnosis Engine encountered telemetry for which it is unable to produce a diagnosis. Refer to http://sun.com/msg/SUNOS-8000-1L for more information.
Jun 23 14:07:12 nsgsh-dhcp-217 AUTO-RESPONSE: Error reports from the component will be logged for examination by Sun.
Jun 23 14:07:12 nsgsh-dhcp-217 IMPACT: Automated diagnosis and response for these events will not occur.
Jun 23 14:07:12 nsgsh-dhcp-217 REC-ACTION: Run pkgchk -n SUNWfmd to ensure that fault management software is installed properly. Contact Sun for support.
This message can be safely ignored. Later Solaris releases will eliminate this message.
When upgrading Gemini BIOS from 29 (or lower version) to 30 (or higher), the PCI bus number of the second nge interface changes from 7b to 7c in Solaris. This changes the instance number from nge1 to nge2 in /etc/path_to_inst.
After the BIOS upgrade, you must use nge2, rather than nge1, to refer to the second nge interface.
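To confirm which instance numbers are currently assigned, you can inspect /etc/path_to_inst directly. The following is a minimal sketch, assuming the usual three-field line layout of that file ("physical-path" instance "driver"). The function name list_driver_instances is hypothetical and not part of Solaris. On Solaris, run it with nawk or /usr/xpg4/bin/awk in place of awk, because the default awk does not accept the -v option.

```shell
#!/bin/sh
# List the instances bound to a driver in a path_to_inst-style file.
# Each non-comment line has the form: "physical-path" instance "driver"
# Usage: list_driver_instances <driver> [file]
list_driver_instances() {
    drv_name=$1
    inst_file=${2:-/etc/path_to_inst}
    awk -v name="$drv_name" '
        /^#/ { next }                          # skip comment lines
        $NF == "\"" name "\"" {                # last field is the quoted driver name
            print name $(NF-1), $1             # e.g. nge2 "<physical path>"
        }
    ' "$inst_file"
}
```

For example, list_driver_instances nge prints one line per nge instance, so you can see whether the second interface is listed as nge1 or nge2 after the BIOS upgrade.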
When the Solaris OS is running, Infiniband PCI ExpressModules (PCI EMs) cannot be removed or installed.
Shut down the Solaris OS gracefully before removing or inserting the Infiniband PCI EM.
Every time the Solaris operating system boots, a PCI system error signal (SERR) is logged regarding the SAS3081E-S PCI-E-to-SAS adapter.
The error messages can be ignored.
On systems running Solaris 10 11/06 (or later), the Web interface console services are stopped after the system is reconfigured.
|Note - The Web interface consists of the Sun Java Web Console, and the Sun Java Web User Interface Components. The Console provides a common point for Sun web-based system management applications to be registered and accessed.|
Starting with Solaris 10 11/06, the Web interface starts automatically as an smf service when the OS boots. When a Web interface instance is first created, a self-signed x.509 certificate is generated based on the machine hostname. The hostname is stored in the CN Relative Distinguished Name (RDN) of the Distinguished Name (DN) of the certificate.
For SSL exchanges to succeed, the hostname in the CN RDN of the DN of the certificate must be the same as the hostname of the system. If you sys-unconfig a system and change the hostname, the Web interface self-signed x.509 certificate remains associated with the previous hostname. Any SSL exchange between a client (for example, wcadmin) and the web server will fail because of the hostname mismatch in the x.509 certificate.
Run the following command to remove the entire instance of the web interface from the OS.
/usr/share/webconsole/private/bin/wcremove -i console
Removing the Web interface console instance also deletes the x.509 certificate. The next time the Web interface console is started, a new certificate is generated based on the current hostname.
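The hostname mismatch described above can be illustrated with a small string comparison. This is only a sketch: it assumes you have already obtained the certificate's Distinguished Name as a comma-separated string by some other means, and the function names cn_from_dn and cn_matches_host are hypothetical helpers, not part of the Web console tools.

```shell
#!/bin/sh
# Extract the CN value from a Distinguished Name string such as
# "CN=myhost, OU=Example Unit, O=Example Org" (hypothetical sample).
cn_from_dn() {
    printf '%s\n' "$1" | tr ',' '\n' | sed -n 's/^ *[Cc][Nn] *= *//p' | head -n 1
}

# Return 0 when the CN in the DN equals the given hostname, 1 otherwise.
# Usage: cn_matches_host "<DN string>" "<hostname>"
cn_matches_host() {
    [ "$(cn_from_dn "$1")" = "$2" ]
}
```

If cn_matches_host "$dn" "$(hostname)" fails after a hostname change, the certificate still carries the old name, which is the condition that the wcremove workaround addresses.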
Systest in SunVTS 6.3 might run into a known issue in the libmtsk.so library bundled with Solaris 10 Update 3.
Install patch 120754-05 or later before running SunVTS 6.3 with Solaris 10 Update 3.
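Before running SunVTS, you might want to check whether the patch is already installed at a sufficient revision. The sketch below assumes the output format of showrev -p, in which each line begins with "Patch: <id>-<rev>". The function name has_patch_rev is hypothetical. On Solaris, use nawk or /usr/xpg4/bin/awk in place of awk, because the default awk does not accept the -v option.

```shell
#!/bin/sh
# Succeed if the text on stdin (in `showrev -p` format) lists the patch
# with the given base ID at the given revision or a later one.
# Usage: showrev -p | has_patch_rev 120754 05
has_patch_rev() {
    awk -v id="$1" -v min="$2" '
        $1 == "Patch:" {
            split($2, p, "-")                  # p[1] = base ID, p[2] = revision
            # Adding 0 forces a numeric comparison, so "08" is read as 8.
            if (p[1] == id && p[2] + 0 >= min + 0) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}
```

A nonzero exit status means that the patch is missing or installed at an earlier revision, in which case install patch 120754-05 or later as described above.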
Invoking the Solaris kernel debugger with the command mdb -K -F might cause the system to hang at high IOPL if the console is set to text, which is the default setting.
Set the console to ttya. This causes the system to transfer control of the debugger to the serial console port.
Solaris AMD x64 support includes a boot-time check for the presence of a BIOS workaround for the AMD Opteron Erratum 131. If Solaris detects that the workaround for Erratum 131 is needed but it is not yet implemented, Solaris logs and displays the following warning message:
The Sun Blade X6220 server module BIOS implements a superset workaround that includes the workaround required for Erratum 131. This warning message can be safely ignored.
|Note - This issue was resolved in Solaris 10 5/08.|
This section lists issues that are specific to the Linux operating system.
The following RPM packages are provided in the given directories for RedHat and SuSE respectively:
1. From http://www.sun.com/download/products.xml?id=45a593ce
extract the nxge driver from the Sun_10_Gigabit_Ethernet_driver_update_12.zip file.
2. To install the nxge binaries run the following command:
rpm -ivh sun-nxge-1.0-1.x86_64.rpm
The driver binary is installed as:
The config tool binary is installed as:
The man page is installed as:
3. To load the module using modprobe:
# modprobe nxge
4. Add the nxge interfaces to the /etc/modprobe.conf file for loading at boot time.
alias <if_name> nxge
5. Use the ethtool command to check the properties of each interface:
ethtool -i <if_name>
6. Assign an IP address to the interface by entering the following:
ifconfig <if_name> <IP_address>
7. Verify that the interface works. Enter the following, where <IP_address> is the IP address for another machine on the same subnet as the interface that is being tested:
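As a supplement to step 4 above, the alias lines can be added with a small helper so that repeated runs do not create duplicate entries. This is a minimal sketch: the function name add_nxge_aliases is hypothetical, and the interface names in the usage comment are placeholders for whatever names your system assigns.

```shell
#!/bin/sh
# Append "alias <if_name> nxge" to a modprobe configuration file for each
# interface name given. Names that already have an alias line in the file
# (for any driver) are skipped and left unchanged.
# Usage: add_nxge_aliases /etc/modprobe.conf eth2 eth3
add_nxge_aliases() {
    alias_conf=$1
    shift
    for if_name in "$@"; do
        if ! grep -q "^alias[[:space:]][[:space:]]*${if_name}[[:space:]]" "$alias_conf" 2>/dev/null; then
            printf 'alias %s nxge\n' "$if_name" >> "$alias_conf"
        fi
    done
}
```

Because existing alias lines are never modified, review the file by hand if an interface name is already bound to a different driver.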
Drivers are not currently available for the X1028A-Z and X7287A-Z PCI ExpressModules (PCI EMs) if you are running SLES SP4 64-bit.
When you hot-plug a PCI EM on a system running RHEL 4 Update 6 (both 32- and 64-bit), an "acpiphp_glue: _HPP evaluation failed" message is displayed. After several instances of this message, the hot swap can fail.
If hot swap issues occur within this configuration, reboot the operating system to correct the problem.
If you are running SLES 9 SP4 64-bit, hot-plugging of the X7284A-Z PCI EM is not supported. You will need to power down the Sun Blade X6220 before installing the PCI EM.
When SUSE Linux Enterprise Server 10 SP1 is installed as a fully virtualized guest under Xen, the swap file might not be mounted automatically. This can cause application failure due to unavailable swap space.
Modify the file system configuration as follows.
1. Edit /etc/fstab on the guest.
2. Change the following line:
3. Change the following line:
4. Save the file.
5. Reboot the guest operating system.
These changes should cause the swap file to be mounted automatically.
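After the reboot, you can check that a swap entry is present in /etc/fstab and compare it with the active swap areas reported by swapon -s or /proc/swaps. The sketch below only lists swap entries. It does not reproduce the specific fstab lines to change, and the function name swap_entries is hypothetical.

```shell
#!/bin/sh
# Print the device field of every swap entry in an fstab-style file.
# fstab fields: device, mount point, type, options, dump, pass.
# Usage: swap_entries [file]     (defaults to /etc/fstab)
swap_entries() {
    awk '!/^[ \t]*#/ && $3 == "swap" { print $1 }' "${1:-/etc/fstab}"
}
```

If swap_entries prints a device that does not appear in the swapon -s output, the swap space is configured but not mounted.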
You might encounter less than the expected throughput of 125 MB/second with the onboard NVIDIA Gigabit Ethernet NIC on all supported Linux OSes.
For Red Hat Linux:
1. In a text editor, open /etc/modprobe.conf
2. Add the line: options forcedeth max_interrupt_work=100
|Note - This workaround only works on RHEL 5 U1. The workaround does not work for RHEL 4 U6. See CR6668885 for more information.|
For SUSE Linux:
1. In a text editor, open: /etc/modprobe.conf.local
2. Add the line: options forcedeth max_interrupt_work=100
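The two procedures above differ only in the file that is edited, so they can be sketched as one helper. This sketch assumes that the distribution can be identified by the presence of /etc/SuSE-release or /etc/redhat-release. That detection method and the function names are illustrative assumptions, not part of the documented workaround.

```shell
#!/bin/sh
# Print the modprobe configuration file named in the steps above, chosen by
# which release file exists under the given root directory (default: /).
forcedeth_conf_path() {
    fs_root=${1:-}
    if [ -f "$fs_root/etc/SuSE-release" ]; then
        echo "$fs_root/etc/modprobe.conf.local"
    elif [ -f "$fs_root/etc/redhat-release" ]; then
        echo "$fs_root/etc/modprobe.conf"
    else
        return 1                               # unknown distribution
    fi
}

# Append the forcedeth option line unless an identical line already exists.
# Usage: add_forcedeth_option [root]
add_forcedeth_option() {
    opt_line='options forcedeth max_interrupt_work=100'
    opt_conf=$(forcedeth_conf_path "$1") || return 1
    grep -qxF "$opt_line" "$opt_conf" 2>/dev/null || printf '%s\n' "$opt_line" >> "$opt_conf"
}
```

Running add_forcedeth_option with no argument as root edits the live system file. The option takes effect the next time the forcedeth module is loaded.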
RHEL5.0 has some bugs which might result in file system corruption during heavy I/O to the compact flash (CF). It is not recommended that you use RHEL 5.0 to access the CF for I/O intensive applications.
Update your OS to RHEL 5.1.
If you are running an application on SLES10 SP1 or SLES9 SP3 that requires heavy network activity, the application might fail on the NVIDIA Gigabit Ethernet NEM NIC.
Use a PCI EM Intel NIC instead of the NEM NIC. Part numbers for these PCI EMs are X7282A-Z or X7283A-Z.
If you select the “Everything” option during Red Hat Enterprise Linux 4 (RHEL 4) U4 installation, the installation might require more space than is available on the compact flash (8 GB).
Deselect some packages during the installation so that the installation size requirement fits within the available storage.
RHEL 4U4, 4U5 and RHEL 5 installation is not supported with a root partition on Linux Volume Manager (LVM) for a compact flash-based installation.
This installation will cause a kernel panic on bootup.
Configure RHEL with root on a non-LVM partition.
During installation, choose manual partitioning using Disk Druid. Delete any existing partitions and create two new partitions: a 100 MB partition mounted as /boot, and the rest of the disk mounted as root (/).
Some fully virtualized Red Hat Enterprise Linux 5 guests will hang when they are given 500 MB or 1000 MB of memory.
When setting up a fully virtualized RHEL 5 guest OS, make sure to allocate at least 512 MB or 1024 MB of memory to the guest.
PCI Express hotplugging may not work on RHEL4 U4 and SLES10 operating systems.
Execute the following command before hotplugging PCI ExpressModules:
During the installation of a Linux or Windows operating system using a CD/DVD drive connected via USB, the following message might appear:
Once the CD is inserted, the installation program might not recognize it.
To avoid this issue, you must enable memory hole remapping in the BIOS setup as follows:
1. Power on or reboot the Sun Blade X6220 server module.
2. Press F2 when prompted to enter the BIOS Setup Utility.
3. Navigate to the Chipset menu.
4. Make the following selections in order:
a. NorthBridge Configuration
b. Memory Configuration
c. Memory Hole Remapping
5. Enable Memory Hole Remapping by pressing the + key until the value is set to Enabled.
6. Press F10 to save and exit the BIOS.
When a Sun Blade X6220 server module running the RHEL4 U4 operating system is rebooted, it could hang intermittently at different stages of reboot.
Disable the ACPI2.0 objects in the BIOS by performing the following steps:
1. Select Advanced, CPU Configuration, ACPI2.0 objects, and then select the Disable option.
2. Reboot the system and disable kudzu from running by using the command:
If you are installing RHEL4 U4 x64 version from CD media using an external USB CD-ROM drive, the operating system might report that it cannot find the CD media right before the media check dialog.
Enable memory hole remapping in the BIOS Setup Utility before installing the operating system.
When the Linux OS starts, the following console message might appear for every CPU/Core in the system. You can safely ignore this message.
When installing SLES 10 with the Web interface installation selected, you might receive a blank screen. This occurs because the monitor or LCD screen cannot handle the high refresh rates chosen by default by the installer.
Choose one of the following workarounds:
When Ethernet PCI EMs are inserted and configured, Linux automatically reconfigures the device numbers. For example, eth0 would be renumbered to eth5 or eth4.
You can avoid this issue by binding the device number to the Ethernet device MAC address. To do so, perform the following steps.
1. Type the following command to find the Ethernet device MAC address:
Replace x with the number of the corresponding interface (for example, eth0).
2. Record the Ethernet device MAC address.
3. Edit the ifcfg file /etc/sysconfig/network-scripts/ifcfg-ethx as follows:
a. Type the following command:
b. Add the previously recorded MAC address.
For example, HWADDR=00:09:3D:00:23:8D
4. Save the file to make the modifications permanent.
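Steps 2 and 3b above can be sketched as two small helpers. The first assumes that the MAC address is read from net-tools ifconfig output, which contains a field of the form "HWaddr 00:09:3D:00:23:8D". That source of the address is an assumption, and both function names are hypothetical.

```shell
#!/bin/sh
# Extract the first MAC address from net-tools ifconfig output on stdin.
# Usage: ifconfig eth0 | mac_from_ifconfig
mac_from_ifconfig() {
    sed -n 's/.*HWaddr[ ]*\([0-9A-Fa-f:]\{17\}\).*/\1/p' | head -n 1
}

# Set HWADDR=<mac> in an ifcfg file: replace an existing HWADDR line,
# or append one if the file has none.
# Usage: set_hwaddr /etc/sysconfig/network-scripts/ifcfg-eth0 00:09:3D:00:23:8D
set_hwaddr() {
    cfg_file=$1
    cfg_mac=$2
    if grep -q '^HWADDR=' "$cfg_file" 2>/dev/null; then
        cfg_tmp=$(mktemp) || return 1
        sed "s/^HWADDR=.*/HWADDR=$cfg_mac/" "$cfg_file" > "$cfg_tmp" && cat "$cfg_tmp" > "$cfg_file"
        rm -f "$cfg_tmp"
    else
        printf 'HWADDR=%s\n' "$cfg_mac" >> "$cfg_file"
    fi
}
```

Writing the result back with cat, rather than moving the temporary file into place, keeps the original file's ownership and permissions.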
The following error message might display several times in /var/log/messages:
The following message might also display in /var/log/messages after updating to the mptlinux-4.00.13.00-1 driver:
These messages can be safely ignored.
When the Sun Blade X6220 server module boots, the following message might appear on the screen.
This message can be safely ignored.
RHEL4 U4 might not be accessible via the service processor (SP). This happens when the ILOM service processor does not display serial output from the Linux OS.
You can avoid this issue by ensuring that the following conditions are met:
The OFED1.1 Infiniband driver will not compile on RHEL4 U4 and SLES10 operating systems. Therefore the Infiniband PEM (X1288A-Z) is not supported when using these operating systems. You might receive the following error messages when trying to compile the Infiniband driver:
ERROR: Failed to execute: make -C /lib/modules/18.104.22.168-0.8-smp/build SUBDIRS=/var/tmp/IBGD//tmp/openib/infiniband CONFIG_INFINIBAND=m CONFIG_INFINIBAND_MELLANOX_HCA=m CONFIG_INFINIBAND_IPOIB=m CONFIG_INFINIBAND_USER_CM=n CONFIG_INFINIBAND_SDP=n CONFIG_INFINIBAND_DAPL=n CONFIG_INFINIBAND_DAPL_SRV=n CONFIG_DAT=n CONFIG_INFINIBAND_KDAPL=n CONFIG_INFINIBAND_KDAPLTEST=n CONFIG_INFINIBAND_SRP=n CONFIG_INFINIBAND_SRP_TARGET=n KERNELRELEASE=22.214.171.124-0.8-smp EXTRAVERSION=.21-0.8-smp V=1 modules See /var/tmp/IBGD//tmp/openib/build_kernel_modules.log for more details
When booting a server module running RHEL4 U5, the following error message might appear a few times:
This message can be safely ignored.
This section lists issues that are specific to the Windows operating system.
The Windows Server 2003 R2 operating system might be preinstalled on your system. For more information, see Windows Server 2003 R2 Operating System.
If Windows Server 2003 is booted with the dongle unplugged, or with the dongle plugged in but the VGA cable unplugged, video can be restored only by rebooting Windows. Windows requires that both the dongle and the VGA monitor cable be plugged in before it boots.
Before booting Windows, make sure the dongle and the VGA monitor cable are connected to the server module (blade).
This section lists issues that are specific to the VMware operating system.
VMware ESX 3.0.1 does not support the onboard Ethernet interfaces. A device driver for the on board interfaces is not available. To use or install ESX, you must install the supported PCI ExpressModule for network interfaces.
You must install the supported PCI ExpressModule for network interfaces to use or install ESX.
VMware ESX 3.0.1 numbers network interfaces differently than other operating systems. When a network PCI ExpressModule is installed, the system specifies interface 1 as vmnic0 and interface 0 as vmnic1.
For ESX to operate correctly, the system console must have network connectivity. By default, ESX assigns vmnic0 to the system console.
You should ensure that network interface 1 is the top network interface, and that it is connected and operational.