CHAPTER 2
Software Information and Issues
This chapter contains information related to software. To update your server to the latest available software, go to:
To determine operating systems certified or supported for the servers, go to:
This chapter contains the following sections:
This section lists issues for Sun Fire V20z and Sun Fire V40z Servers running supported Linux operating systems. For all items, check the product web sites for future enhancements.
To prevent data corruption on Linux operating systems, create IM volumes before you install the OS. To create IM volumes, use the LSI Configuration Utility.
If you run RHEL 4 U1 (64-bit) with NSV 2.4.0.2a, you must modify the Makefile to point to the correct directory for the kernel source. The kernel source code in RHEL 4 U1 (64-bit) is not located in the common area (/usr/src/linux) that usually points to the correct version.
To create a soft link to the correct version of the kernel source code, enter the following command:
ln -s full-path-to-kernel-source /usr/src/linux
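The link step can be wrapped in a small check so that a mistyped path does not leave a dangling link. The following is a minimal sketch, not part of the NSV software: the function name link_kernel_source is hypothetical, and the actual location of the kernel source depends on your installation.

```shell
#!/bin/sh
# Sketch: create (or replace) a soft link to the kernel source tree.
# link_kernel_source is a hypothetical helper, not a Sun-supplied tool.
link_kernel_source() {
    src=$1                      # full path to the kernel source
    link=${2:-/usr/src/linux}   # link location named in this section
    if [ ! -d "$src" ]; then
        echo "kernel source directory not found: $src" >&2
        return 1
    fi
    # -n replaces an existing link instead of descending into it
    ln -sfn "$src" "$link"
}
```

Run as root, a call such as link_kernel_source full-path-to-kernel-source creates the link at /usr/src/linux. The second argument exists only so the helper can be tried against a scratch directory first.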
If HPET is disabled on a system with AMD PowerNow! processors and a Linux operating system, follow the instructions below to configure the OS appropriately.
Red Hat Enterprise Linux 4 does not support AMD PowerNow! at this time. AMD PowerNow! must be disabled on all systems.
chkconfig --level 12345 cpuspeed off
RHEL 4 does not support AMD PowerNow! at this time. AMD PowerNow! is not enabled by default on Red Hat Enterprise Linux 4 (x86), so no action is required.
AMD PowerNow! is enabled by default on all systems, so no action is required to enable AMD PowerNow! functionality.
1. Add the boot parameter clock=pmtmr to the file /boot/grub/menu.lst.
AMD PowerNow! must be disabled on all systems.
1. In the /etc/sysconfig/powersave/common file, replace:
POWERSAVE_CPUFREQD_MODULE="off"
This OS does not support AMD PowerNow! for dual-core processors.
1. In the /etc/sysconfig/powersave/common file, replace:
POWERSAVE_CPUFREQD_MODULE="off"
1. Add the boot parameter clock=pmtmr to the file /boot/grub/menu.lst.
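The boot-parameter step can be sketched as follows. This is an illustration under assumptions, not a supported tool: the helper name add_clock_param is hypothetical, and it assumes that each boot entry in menu.lst has a line beginning with the word kernel. It prints the edited file instead of changing it, so the result can be reviewed first.

```shell
#!/bin/sh
# Sketch: print a GRUB menu file with clock=pmtmr appended to every
# kernel line that does not already carry it. add_clock_param is a
# hypothetical helper; review the output before replacing menu.lst.
add_clock_param() {
    awk '
        /^[ \t]*kernel[ \t]/ && !/clock=pmtmr/ { $0 = $0 " clock=pmtmr" }
        { print }
    ' "$1"
}
```

For example, add_clock_param /boot/grub/menu.lst > /tmp/menu.lst.new writes a candidate file that you can compare with the original before copying it into place. Running the helper on its own output leaves the file unchanged.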
Trusted hosts might not work on some implementations without a modification to the OpenSSH script:
Currently, the machine uses this version:
OpenSSH_3.6.1p2, SSH protocols 1.5/2.0, OpenSSL 0x0090701f
To modify it, add this to the script:
PCI-X hot-plug support on Linux requires kernel support as well as a driver to enable the hot-plug capabilities of the AMD-8131 PCI-X chipset element. Linux support of PCI-X hot-plug is an evolving area and a tight interdependency exists between kernel versions and hot-plug drivers. Therefore, be careful to use the correct configuration for your system's hardware/software combination. To download the AMD-8131 PCI-X Tunnel Standard Hot Plug Controller (SHPC) driver, go to:
http://www.amd.com/us-en/Processors/TechnicalResources/0,,30_182_871_9034^9504,00.html
Within the kernel source of your distribution, read the file named pci.txt for additional information about the PCI support layer in Linux.
The PRS version included in this software release affects new products only, because PRS is not field-updatable.
To ensure that the Service Processor and the platform OS time zone data are in sync, set the platform RTC in Greenwich Mean Time (GMT), and use an OS mechanism to adjust platform time to the local time zone.
Installation of certain platform OS SNMP agents might cause a review of SP-GROUP-MIB.mib (1.3.6.1.4.1.9237.2.1.1.7) to time out in the reverse proxy configuration. To avoid this problem, install only the platform OS SNMP agent net-snmp v5.1.2.
With more than 4 Gbyte of memory, if you use the savedump subcommand of lcrash, the 4300 and the 2100 hang, and a watchdog timeout error occurs. To prevent this problem, do not use the savedump subcommand. Instead, use the lkcd dump command.
In order to obtain optimal performance for dual-core systems, customers should ensure that they use the latest kernel patch. The patch is included in SP1 or later.
If you prefer to install the patch, rather than upgrade to SP1 or later, follow the instructions below.
1. Log on to the SUSE Portal at: http://www.novell.com/linux/suse/portal/
2. Select the Patch Support Database link in the right column of the SUSE Linux Portal page.
3. Select By Product on the SUSE Patch Support Database page.
4. Select SUSE Linux Enterprise Server 9 for AMD64 (x86_64) on the Distributions page.
5. Select the most recent kernel patch.
6. Follow the instructions to download and install the packages for SUSE CORE 9 for AMD64 and Intel EM64T (x86_64).
The Trident server video driver, shipped with Red Hat Enterprise Linux (RHEL) 3.0, might cause a system lockup under certain conditions. These conditions are described in Red Hat’s Bugzilla database at:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=113533
To avoid this issue, use the VESA X server video driver bundled with RHEL 3.0.
To select the VESA driver during installation, proceed normally until the Graphical Interface (X) Configuration screen appears. Then expand the Other drivers menu and select “VESA driver (generic).”
To select the VESA driver after installation, switch from using a Trident driver to a VESA driver. For detailed steps, see the RHEL documentation.
The 32-bit version of RHEL 3.0 does not recognize more than 4 Gbyte of physical memory, even when more than 4 Gbyte is installed. This is a limitation of the default kernel.
Some other 32-bit versions of Linux have trouble recognizing more than 4 Gbyte of memory because of limitations of their default kernels. If your OS demonstrates this issue, contact your OS vendor for instructions to get the correct kernel to support your memory configuration.
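To see how much memory a running Linux kernel actually recognizes, you can read the MemTotal field of /proc/meminfo, which is reported in kB and is normally slightly lower than the installed amount. The sketch below is illustrative only: the helper names mem_total_kb and sees_more_than_4gb are hypothetical.

```shell
#!/bin/sh
# Sketch: report whether the kernel sees more than 4 Gbyte of memory.
# mem_total_kb and sees_more_than_4gb are hypothetical helpers; the
# optional argument exists only so the parsing can be tried on a copy.
mem_total_kb() {
    awk '/^MemTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}

sees_more_than_4gb() {
    kb=$(mem_total_kb "$1")
    # 4 Gbyte = 4 * 1024 * 1024 kB
    [ "$kb" -gt 4194304 ]
}
```

If sees_more_than_4gb fails on a server with more than 4 Gbyte installed, the default kernel is the likely cause, as described above.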
See the Sun Fire V20z and Sun Fire V40z Servers--Server Management Guide for detailed instructions on installing this customized OpenIPMI driver.
For the Sun Fire V20z Server, it is recommended that you obtain the optional CD/DVD drive (X9260A) to install software from DVD media.
Note - The Sun Fire V40z Servers have only the DVD/diskette drive as an option.
If SUSE Linux Enterprise Server 8 (SLES 8) is installed from the CD media using the X-windows-based installation utility, a problem might occur during the post-installation steps. The X-windows-based installation utility might revert control to the primary console and display an error message about the ps command. If this happens, you can return control to the X-windows-based installation utility by pressing Ctrl-Alt-F7 simultaneously on your keyboard. At this point you can proceed with normal post-installation setup with SLES 8.
If you have a single SCSI hard drive, the drive can be inserted in any slot. If you have two or more hard disk drives, install the drive with the OS boot sector in the lowest-numbered slot among the populated slots.
To obtain the best performance for dual-core systems running SLES 9, install SP1 or a later service pack. If this is not an option, install patch-9962, released 21 March 2005. The patch is available on the Novell web site at:
http://support.novell.com/techcenter/search/search.do?cmd=displayKC&externalId=2558830537429cdedb543926fd6344a8html
Note - Patch-9962 is not required for systems running SLES 9 SP1 or later.
Note - If you use Red Hat Enterprise Linux (RHEL) 3.0, install the most recent OS update on the server to minimize this issue.
In the BIOS Advanced menu, the “No Spec. TLB Reload” option is disabled by default. This setting allows the Translation Look-Aside Buffer (TLB) to reload.
With the default setting, errors similar to the following were observed on systems running any 64-bit version of Red Hat Linux and also SUSE Linux with Service Pack 1.
Northbridge status a60000010005001b
GART error 11
Lost an northbridge error
NB status: unrecoverable
NB error address 0000000037ff07f8
Error uncorrected
To avoid these errors, disable TLB reloading:
1. Reboot the server and press the F2 key to enter BIOS setup.
2. Navigate to the Advanced > Chipset Configuration BIOS menu.
3. Use the arrow keys to scroll down to the option No Spec. TLB Reload and change its setting from Disabled to Enabled.
This disables TLB reloading and eliminates the error message.
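To confirm that the errors stop after the BIOS change, you can count the GART error lines in a saved copy of the kernel log. This is a minimal sketch: the helper name count_gart_errors is hypothetical, and where your distribution keeps its kernel log is an assumption you need to verify.

```shell
#!/bin/sh
# Sketch: count "GART error" lines in a log file given as $1.
# count_gart_errors is a hypothetical helper, not a Sun-supplied tool.
count_gart_errors() {
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so the exit status is ignored here
    grep -c 'GART error' "$1" || :
}
```

A count that stays at the same value across reboots after the change suggests that no new errors are being logged.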
SUSE Linux Enterprise Server 9 64-bit currently does not work with the boot option of maxcpus=0, the default option for failsafe mode for the Sun Fire V40z dual-core.
After the BIOS finishes booting, a graphical boot screen appears with three options: Linux, Floppy, or Failsafe.
2. Click in the small text edit box below the options.
3. Scroll to the end of the line.
4. Edit the text: change from “maxcpus=0 3” to “maxcpus=3”.
To install Red Hat Enterprise Linux 4 FCS (32-bit) on Sun Fire V40z (chassis 380-1010) with NSV 2.4.0.6, the High Precision Event Timer (HPET) must be disabled in the BIOS. This modification is not necessary with RHEL4 Update 1.
To disable the HPET in the BIOS:
1. Press the F2 key to enter the BIOS Setup Utility.
2. In the Advanced menu, select the HPET Timer option.
3. Change the value to Disabled.
4. Press the F10 key to save your changes.
This section lists issues and considerations regarding Sun Fire V20z and Sun Fire V40z servers using the Solaris operating system. For all items, check the product web sites for future enhancements.
The first compatible version of the Solaris OS is Solaris 9 OS 4/04 or later for the Sun Fire V20z Server and Solaris 9 OS HW 4/04 for the Sun Fire V40z Server. However, certain functionality might be phased in after the initial product release of the server:
Note - The Sun Fire V40z servers have only the DVD/diskette drive as an option.
http://www.sun.com/servers/entry/v20z/downloads.html
http://www.sun.com/servers/entry/v40z/downloads.html
To prevent data corruption on the Solaris ZFS operating systems, create IM volumes before you install the OS. To create IM volumes, use the LSI Configuration Utility.
A security vulnerability in the in.telnetd(1M) daemon in Solaris 10 6/06 and 11/06 might allow unauthorized remote users to gain access to a Solaris host. Patch 120069-02 addresses this issue. Apply the patch manually or run the install.sh script on all Solaris 10 6/06 and 11/06 distributions. For more information, refer to Sun Security Alert 102802. This patch will be incorporated into the Solaris preinstall image at a later date.
Solaris 10 1/06 OS GUI-based installation fails on Sun Fire V40z servers with 64 Gbyte of memory installed (fully populated with 4 Gbyte DIMMs).
The installation does not fail if you use the console text-mode installation.
Solaris 9 OS is a 32-bit OS and is limited to 32 Gbyte of memory. However, addressing 32 Gbyte of physical memory requires a large portion of the 32-bit address space. Applications might not have access to sufficient physical memory.
Sun recommends running the Solaris 10 OS for applications requiring large amounts of physical memory.
Some versions of the Sun Fire V20z and Sun Fire V40z Servers ship with a preinstalled version of the Solaris 10 OS.
If you want to remove the preinstalled version of the Solaris 10 OS from your server, you can simply overwrite it by installing a version of the Linux OS. During the Linux installation process, a warning message that begins as follows might display:
Warning. Unable to align partition properly.
Incorrect partition labels from the preinstalled Solaris 10 OS cause this message, but you can safely ignore it. The error is corrected after a Linux installer changes the partition table.
An operating system can be installed on the server without configuring the service processor or the network share volume (NSV) software. However, if you choose to skip configuration of the service processor and the NSV software, you will not be able to use the remote management capabilities of the system or the diagnostics.
The Sun Installation Assistant CD-ROM helps you to install a supported Linux operating system (OS). It provides a set of Sun-supported drivers that are tested for quality assurance. By using the Sun Installation Assistant CD, you can install the operating system, the appropriate drivers, and additional software on your system. The Assistant eliminates the need to create a Drivers Update diskette.
The Sun Installation Assistant for Sun Fire V20z and Sun Fire V40z Servers CD-ROM might be included in your accessory kit.
You can download the .iso image for this CD-ROM from the product web sites at:
http://www.sun.com/servers/entry/v20z/downloads.jsp
http://www.sun.com/servers/entry/v40z/downloads.jsp
Information about using the Sun Installation Assistant CD-ROM can be found in Chapter 2 of the current version of the Sun Fire V20z and Sun Fire V40z Servers--Linux Operating System Installation Guide.
At the time of this printing, the following versions of the Linux OS are supported by the Sun Installation Assistant:
Note - Sun Installation Assistant does not support Red Hat Enterprise Linux 3 Update 8 and SUSE Linux Enterprise 10. These operating systems can be installed directly without using SIA.
The Sun Installation Assistant does not support using Logical Volume Manager (LVM) with Red Hat Enterprise Linux 3 and updates or SUSE Linux Enterprise Server 8 or 9 and service packs. The current release available from the download site supports LVM with Red Hat Enterprise Linux 4. Future releases will support LVM with SUSE Linux Enterprise Server 9 and service packs.
If you do not install the platform drivers and a correctable ECC error occurs, duplicate error messages of the most recent ECC failure are reported indefinitely.
To avoid this issue, ensure that you install the appropriate level of platform drivers on your server. For more information, refer to the Sun Fire V20z and Sun Fire V40z Servers--Installation Guide.
Note - This issue is resolved in the BIOS update within NSV 2.2.0.6h and newer releases.
If you installed the SUSE Linux Professional 9.0 or SUSE Linux Enterprise Server 8 (SLES8) OS on your server, and are running the LSI driver version 2.05.11 and firmware version 1.03.15, you might encounter performance issues on the internal hard disk drives (HDDs).
Sun recommends updating to LSI driver version 2.05.16 and firmware version 1.03.23. You can use these versions of the driver and firmware for all supported operating systems.
Note - This issue is resolved in NSV versions 2.2.0.6h and higher.
If you are using in-band IPMI functionality with your server, you must unload the OpenIPMI Linux kernel driver before accessing a diskette (floppy disk). If you do not unload the OpenIPMI Linux kernel driver before you access a diskette, diskette writes and management data that is handled by the OpenIPMI Linux kernel driver will be corrupted.
1. To unload the OpenIPMI Linux kernel driver, authenticate as root, then enter the following commands:
$ rmmod ipmi_kcs_drv
$ rmmod ipmi_devintf
$ rmmod ipmi_msghandler
2. After diskette access is complete, restore the in-band IPMI functionality by entering the following commands:
$ modprobe ipmi_devintf
$ modprobe ipmi_kcs_drv
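The two steps above can be combined so that the driver is always reloaded after the diskette access, even if the access fails. The following is a sketch under assumptions: the names run, with_ipmi_unloaded, and DRY_RUN are hypothetical, and the module names and their order are copied from the steps above. It must be run as root.

```shell
#!/bin/sh
# Sketch: run one command with the OpenIPMI driver unloaded, then
# restore in-band IPMI. run, with_ipmi_unloaded, and DRY_RUN are
# hypothetical; module names and order come from the steps above.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"          # print the command instead of running it
    else
        "$@"
    fi
}

with_ipmi_unloaded() {
    # Stop if any module fails to unload; modules unloaded before the
    # failure are not reloaded automatically in this sketch
    run rmmod ipmi_kcs_drv &&
    run rmmod ipmi_devintf &&
    run rmmod ipmi_msghandler || return 1
    status=0
    run "$@" || status=$?
    # Reload even if the diskette command failed
    run modprobe ipmi_devintf
    run modprobe ipmi_kcs_drv
    return "$status"
}
```

With DRY_RUN=1 set, a call such as with_ipmi_unloaded mount /dev/fd0 /mnt/floppy only prints the commands it would run, which is a safe way to check the sequence before using it.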
The most current version of diags contains multiple bug fixes and is available at the following URLs:
http://www.sun.com/servers/entry/v20z/support.jsp
http://www.sun.com/servers/entry/v40z/support.xml
In previous versions of the Diagnostics CD, the README and RESCUECD files served two purposes:
In current and future versions of CD-based diagnostics, these files no longer need to contain user-related information. However, the diagnostics kernel still requires the files to verify the match between the kernel and diagnostics application. As a result the files remain intact, but have no content.
Remote access requires the prior creation of a manager-level user on the platform. See "Create Trusted Host Relationships," in the Systems Management Guide for instructions. To establish a remote command-line interface for CD-based diagnostics tests, use SSH network access:
1. SSH to the platform IP address as the user, setup. If you created a manager-level user on the Service Processor, you are prompted for a username and password to create a new account. You can use any username except: diagUser, setup, or root. When your username and password validate, you are logged off of the system.
2. Now use your username and password to SSH to the platform.
3. To enable only platform diagnostics tests without loading the Service Processor tests, execute the command:
Note - For Service Processor-based diagnostics, the -n argument specifies: Do not boot the platform with diagnostics.
To enable both Service Processor and platform diagnostic tests, execute the command:
This command reboots the platform into diagnostics mode. Wait at least two or three minutes before you attempt to run the tests.
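Implement one of the following in shell or Perl. The first option waits a fixed time and checks the state once. The second option polls the state until the diagnostics load or a time limit passes.
# Option 1: wait a fixed time, then check the state once.
# Assumes diags get state reports the state as its exit status.
diags start
sleep 240
diags get state
rc=$?
if [ "$rc" -eq 0 ]
then
    # run desired tests using the diags run tests command
    :
else
    echo "Diagnostics not loaded in expected time. rc = $rc"
fi

# Option 2: poll while the state is 25 (device error), up to MAX_WAIT seconds.
MAX_WAIT=240     # example value, adjust as needed
SLEEP_TIME=10    # example value, adjust as needed
diags get state
rc=$?
timer=0
while [ "$rc" -eq 25 ] && [ "$timer" -lt "$MAX_WAIT" ]
do
    sleep "$SLEEP_TIME"
    timer=$((timer + SLEEP_TIME))
    diags get state
    rc=$?
done
if [ "$timer" -lt "$MAX_WAIT" ]
then
    # run desired tests using the diags run tests command
    :
else
    echo "Error loading platform diagnostics. rc = $rc"
fi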
4. To determine if the diagnostics tests are available to run, you can execute the command:
The command returns either a success message:
end
if rc == 0
diags run tests -a
In some releases, after you enter the command to start diagnostics (diags start), some diagnostic tests might fail to load. You can check the status of the diagnostics by entering the following command:
If the diagnostics have loaded, the above command will return the following message:
If the diagnostics have not loaded, the above command will return the following message:
To resolve this issue, install NSV 2.4.0.6a.
Platform state changes that are made after diags has been started might not be detected by fan tests, which are dependent on platform power. If you use service processor-based, no-platform mode (diags start -n), set the desired platform state before you load diags.
The sensor for Fan 10 reports 0 RPM during the reboot cycle, but then quickly reports a return to normal operation.
Packet corruption might happen during diagnostics download. If this happens, platform-side diags never comes up. To resolve the issue, follow this procedure:
1. Stop all diagnostics by entering the command:
2. Verify that the server power is off by entering the command:
3. Start diagnostics by entering the command:
4. Repeatedly check the status of diags by entering:
If the problem persists, call Sun Service for further assistance.
Any platform state changes made after the diags command has been started might not be detected by fan tests, which are dependent on platform power. If you use Service-Processor-based, no-platform mode diags start -n, set the desired platform state before you load diagnostics.
Users might lose SSH connection with the platform when the retention.allDimms test runs. Diagnostics continues to run, but the user cannot SSH into the platform after the connection is lost. To avoid this problem, if you use an SSH connection, do not run the nic test.
Generally, downgrading firmware to a version lower than the version shipped with the machine is not supported.
NSV releases 2.4.0.6 and later support the following SP commands:
This command enables you to set the community name to be used by the service processor (SP), itself, as opposed to the proxy community string that is used between the SP and the platform.
There are no restrictions on the length of the community string. Typical names are “private” and “public.” The factory default name of the community string is “public,” so if you run the command sp get snmp community before you set a value, “public” is returned. Set the value to any string without spaces.
For example, for:
$ sp set snmp community COMMUNITY_STRING
localhost# sp set snmp community private
This command returns the community name that is currently being used by the service processor. For example, with the command:
localhost# sp get snmp community
public
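Because the community string must not contain spaces, a script that sets it can check the value first. The sketch below is illustrative: the helper name valid_community is hypothetical, and rejecting an empty string is an added assumption, not a rule stated in this section.

```shell
#!/bin/sh
# Sketch: accept a community string only if it contains no spaces.
# valid_community is a hypothetical helper; rejecting the empty
# string is an assumption, not a documented restriction.
valid_community() {
    case $1 in
        '' | *' '*) return 1 ;;
        *) return 0 ;;
    esac
}
```

A script could then guard the documented command, for example: valid_community "$name" && sp set snmp community "$name".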
In rare cases, when the user enters the inventory get software command, the wrong date is displayed for the latest revision of the server diagnostics software. This does not affect functionality or performance of the product and can be safely ignored.
If you need to access the correct install date for the server diagnostics software, perform the following command sequence:
1. Revert to the default settings of the SP by entering:
$ sp reset to default-settings -a
2. Wait for the SP to reboot, then create a manager account by connecting to the IP address of the SP.
IP-ADDRESS The IP address of the SP when it comes back online.
3. Follow the prompts to create a manager account.
4. Log in to the manager account you just created.
5. Mount the SP containing the Server Diagnostics software.
6. Verify your access to the latest revision dates of the installed software.
The correct installation dates for the latest software revisions are now displayed.
For more information about creating a manager account or mounting the SP, see the Sun Fire V20z and Sun Fire V40z Servers--Installation Guide.
The sp get tdulog command has been enhanced and the sp ssh1 commands have been added. For details, see the Sun Fire V20z and Sun Fire V40z Servers--Server Management Guide.
Note - This issue appears in NSV versions 2.1.0.16 and earlier. It is resolved in NSV 2.2.0.6 and later versions.
While running diagnostics on your server, do not interact with the service processor (SP) through the command-line interface or IPMI.
The sensor commands cannot be used reliably while the diagnostics are running. Issuing sensor commands while diagnostics are loaded might result in “false” or erroneous critical events being logged in the events log. The values returned by the sensors are not reliable in this case.
Note - This issue appears in NSV versions 2.1.0.16 and earlier. It is resolved in NSV 2.2.0.6 and later versions.
Terminating the diagnostics generates critical events on the sensors and system errors. After the diagnostics are terminated and the platform is powered off (the diags terminate command does this automatically), Sun recommends that you clear these events from the event log so that you do not mistake them for actual critical events.
The diagnostics provided with the Sun Fire V20z or Sun Fire V40z Servers are designed for a user who is watching the screen, or for the output to be saved to a file.
TABLE 2-2 and TABLE 2-3 (generated by the SP command sp get events) show the events generated when you run the command diags terminate on your server. You can ignore all of the “critical” errors.
The following steps provide a workaround to clear the false critical events from the event log.
Note - For a complete list of the SP commands, refer to the Sun Fire V20z and Sun Fire V40z Servers--Server Management Guide.
1. Before running the diagnostics, clear the SP events log.
Wait for the diagnostics to come up.
3. Run any or all of the diagnostics tests.
4. Check the SP event log for any errors.
The event log can be stored away for future investigation.
5. Terminate the diagnostics with the command diags terminate.
This step eliminates all of the false critical events that were generated in the previous step.
Note - For more information about clearing the event log, enter the command:
sp delete event --help
mount : Mounting /dev/fd0 on /mnt/floppy failed. No such device.
You can safely ignore this error message.
Note - The floppy-disk drive issue is resolved in the diagnostics included with NSV 2.2.0.6h and later releases.
Copyright © 2008 Sun Microsystems, Inc. All Rights Reserved.