CHAPTER 3
Notes for Servers With Hardware Dash Levels 01 Through 06
This chapter contains information about Sun Fire T1000 Servers with motherboard hardware dash levels 01 through 06.
To determine if these notes apply to your server, see Identifying the Notes for Your Server.
Note - For hardware RAID support, you must install Patch 121130-01 or greater for the Solaris 10 1/06 OS. Hardware RAID support is enabled by default with the Solaris 10 6/06 (or later) Operating System (OS). See Hardware RAID Support.
If you have any technical questions or issues that are not addressed in the Sun Fire T1000 server documentation, contact your local Sun Services representative. For customers in the U.S. or Canada, call 1-800-USA-4SUN (1-800-872-4786). For customers in the rest of the world, find the World Wide Solution Center nearest you by visiting the web site:
http://www.sun.com/service/contacting/solution.html
The Solaris Operating System and Sun Java Enterprise System software are preinstalled on your Sun Fire T1000 server.
If it becomes necessary to reload the software, go to the following web site. You will find instructions for downloading software.
http://www.sun.com/software/preinstall/
Note - If you download a fresh copy of software, that software might not include patches that are mandatory for your Sun Fire T1000 server. After installing the software, see Patch Information for a procedure to check for the presence of patches on the system.
These are the minimum supported versions of firmware and software for this release of the Sun Fire T1000 server:
You must install the following patches if they are not present on your system. To determine if the patches are present, see To Download Patches.
The following patch is mandatory for Sun Cluster software:
The following patches are required for hardware RAID support:
Note - These patches are not included in some versions of preinstalled software on the Sun Fire T1000 server. If the patches are missing from your server, download them from SunSolve as described in To Download Patches.
To Download Patches
1. Determine whether the patches have been installed on your system.
For example, using the showrev command, type the following for each patch number:
For example, if patch 119578-16 or later is installed, your system has the required version of this patch.
Conversely, if no version of the 119578 patch is installed, or if the installed version has an extension of -15 or earlier, you must download and install the new patch.
2. Go to http://www.sun.com/sunsolve to download the patches.
Using the SunSolve PatchFinder tool, specify the base Patch ID number (the first six digits) to access the current release of a patch.
3. Follow the installation instructions provided in a specific patch's README file.
If you add option cards to your server, refer to the documentation and README files for each card to determine if additional patches are needed.
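The comparison described in Step 1 can be sketched in Python. This is a minimal illustration only, assuming that patch IDs have the form base-revision (for example, 119578-16) and that you have already collected the installed patch IDs into a list, for instance from showrev output. The function names and the sample list are hypothetical and are not part of any Sun tool.

```python
def parse_patch_id(patch_id):
    """Split a patch ID such as '119578-16' into (base, revision)."""
    base, _, revision = patch_id.strip().partition("-")
    if not (base.isdigit() and revision.isdigit()):
        raise ValueError(f"not a patch ID: {patch_id!r}")
    return base, int(revision)


def patch_is_satisfied(installed_ids, required_id):
    """Return True if any installed patch has the same base ID and an
    equal or later revision than the required patch."""
    required_base, required_rev = parse_patch_id(required_id)
    for installed in installed_ids:
        base, rev = parse_patch_id(installed)
        if base == required_base and rev >= required_rev:
            return True
    return False


# Hypothetical list of installed patches, for illustration only.
installed = ["119578-15", "121130-01"]
print(patch_is_satisfied(installed, "119578-16"))  # revision -15 is too old

installed.append("119578-22")
print(patch_is_satisfied(installed, "119578-16"))  # revision -22 is sufficient
```

The base ID returned by parse_patch_id is the same six-digit value that Step 2 asks you to enter in the SunSolve PatchFinder tool.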
The chassis cover might be very difficult to remove. If you press too hard on the cover lock button, the front edge of the cover might warp and bind. Also, elastic gasket material on the sides of the chassis might prevent the cover from sliding freely.
To remove the cover, lightly hold down the cover lock button and push the cover slightly toward the front of the chassis (this assists the unlocking action), then slide the cover approximately one half inch (12 mm) toward the rear of the chassis. You can now lift the cover off the chassis.
These are the functionality issues for this release.
The Sun Fire T1000 server is supported by the Sun Explorer 5.2 data collection utility, but is not supported by earlier releases of the utility. Installing Sun Cluster software from the preinstalled Java ES package will automatically install an earlier version of the utility on your system. After installing any of the Java ES software, determine whether an earlier version of the Sun Explorer product has been installed on your system by typing the following:
If an earlier version exists, uninstall it and install version 5.2 or later. To download Sun Explorer 5.2, go to:
When running Sun Explorer 5.2 or later, you must specify the Tx000 option to collect the data from the ALOM-CMT commands on Sun Fire T1000 and Sun Fire T2000 platforms. The Tx000 script is not run by default. To specify the option, type:
For more details, refer to troubleshooting document 83612, Using Sun Explorer on the Tx000 Series Systems. This document is available on the SunSolve web site.
Sun Fire T1000 servers do not have a full implementation of the Solaris predictive self-healing (PSH) feature. The current implementation provides the server with the ability to detect faults, but not the ability to completely diagnose and handle all faults.
If the server detects a PSH-related error, the following message might be generated:
If you see this message on the console or in the /var/adm/messages file, it might be an indication that patch 119578-16 or greater has not been installed. For information on obtaining patches and a list of mandatory patches for the Sun Fire T1000 server, see Patch Information.
If the patch has been installed but you continue to see error messages, contact Sun technical support.
The Sun Fire T1000 server might experience a drop in network performance, most notably when the system is configured to transmit or receive data over all four network ports at high rates. Throughput might be lower than expected, and in some instances network traffic over all four ports might induce a system-wide hang that requires a system reset to recover. If your Sun Fire T1000 server experiences a system hang, contact Sun with details on the fault, system activity, and system configuration. Sun is actively working to resolve this problem.
The system will not power on if memory rank 0 is not populated. Rank 0 sockets must always be filled.
The Solaris PSH facility automatically detects the replacement of DIMMs. However, erroneous fault messages might be displayed when the system is booted, and these messages can mislead you to think that a problem persists when it is actually fixed.
For a procedure to manually clear the fault from all logs so that it is not reported at boot time, see To Manually Clear Fault Logs.
To correct future occurrences of the problem, install patch 119578-22.
Read caching and write caching are both enabled by default for the Sun Fire T1000 server disk drive. The use of the caches increases the read and write performance of the disk drive. However, data in the write cache might be lost if system AC power is interrupted. (A loss of AC power does not present a problem for the read cache.)
If you prefer to disable write caching, use the Solaris format -e command:
Caution - These settings are not saved permanently. You must reset the write cache setting every time the system boots.
1. In the Solaris environment, enter the format expert mode by typing:
3. Select the cache option by typing:
4. Select the write_cache option by typing:
5. Display the current setting for the write cache.
8. Exit from the write_cache mode.
10. Exit from the format command.
TABLE 3-1 lists known bugs for this release of the Sun Fire T1000 server. The CR (change request) IDs are listed in numerical order.
| Description | Workaround |
|---|---|
| | If Solaris power management is required, restart power management manually or reboot the server. If Solaris power management is not required, no action is needed. |
| The system fails to power on if memory rank 0 is not populated. | |
| The iostat -E command reports incorrect vendor information for the SATA drive. | |
| The SunVTS USB keyboard test (usbtest) reports that a keyboard is present when there is no keyboard attached to the server. | |
| When accessing the host through the ALOM-CMT console command, you might experience slow console response. | For optimum responsiveness, access the host through the host network interfaces as soon as the host has completed booting the OS. |
| Executing the ALOM CMT break command and the OpenBoot PROM go command might cause the system to hang or panic. | If the console hangs or panics, use the ALOM CMT reset command to reset the system. |
| Typing unrecognized commands or words at the OBP prompt causes the system to return an erroneous error and might hang the server. This behavior only occurs when you drop into the OBP prompt from Solaris. The erroneous error message is: | Disregard this message. If the console hangs or panics, use the ALOM CMT reset command to reset the system. |
| POST or OBP reset-all generates the alert, Host system has shut down. | This is normal behavior following a reset-all command. The message does not indicate a problem in this situation. |
| The ALOM CMT console history boot and run logs are the same. | |
| SunVTS memory or CPU tests could fail due to lack of system resources. When too many instances of SunVTS functional tests are run in parallel on UltraSPARC® T1 CMT CPU-based (sun4v) entry-level servers with low memory configurations, SunVTS tests might fail due to lack of system resources. For example, you could see an error message similar to the following: | Decrease the number of SunVTS test instances or perform SunVTS functional tests separately. In addition, you can increase the delay value for CPU tests or increase the test memory reserve space. |
| If the clearasrdb command is used to clear a failed DIMM from the asr database and the resetsc command is issued before the clearasrdb command can be completed, ALOM-CMT might not properly reboot and might return the following error message: | After issuing the clearasrdb command, wait 15 seconds before issuing the resetsc command. |
| Sun Net Connect 3.2.2 software does not monitor environmental alarms on the Sun Fire T1000 server. | To receive notification that an environmental error has occurred, use the ALOM-CMT mgt_mailalert feature to have ALOM-CMT send an email when an event occurs. To check whether the environmental status of the server is OK, log on to ALOM-CMT and run the showfaults command. To view a history of any events the server encountered, log on to ALOM-CMT and run the showlogs command. |
| If you issue a break command in the middle of a system boot, and then immediately boot again, the boot process fails with the message, Exception handlers interrupted, please file a bug. | |
| The maximum throughput of the system network ports decreases unexpectedly as the network load increases. | |
| The ALOM CMT showfru command displays epoch timestamps of THU JAN 01 00:00:00 1970. | Ignore timestamps with this date. There is no workaround at this time. |
| SunVTS memory tests might log a warning message similar to the following in rare cases when the ECC Error Monitor (errmon) option is enabled: WARNING: software error encountered while processing /var/fm/fmd/errlog Additional-Information: end-of-file reached | Do not enable the errmon option. |
| False Ereport error messages might be generated for PCI devices. | There is no workaround at this time. The FMA diagnostic software required to eliminate false Ereports for PCI devices is still under development. |
| The poweron command does not power on the system when issued immediately after the ALOM CMT resets. | If you use a script to reset the ALOM-CMT and power on the system, insert a 1-second delay before the poweron command. |
| When SunVTS testing is stopped while dtlbtest is running, dtlbtest fails with the error: No CPUs to test | Upgrade to SunVTS 6.1 PS1 or a subsequent compatible version at this URL: |
| The showcomponent command hangs if you repeatedly loop on the disablecomponent and enablecomponent commands. | |
| Displaying large persistent logs with the showlogs -p p command slows down the ALOM CMT command line interface. | Use the -e flag with the showlogs command to display a specified number of lines of data instead of the entire log. |
| The virtual-console does not accept paste buffers that are greater than 114 characters. As a result, the wanboot NVRAM parameter network-boot-arguments is not set. | Cut and paste in chunks smaller than 114 characters, or do not use cut and paste. |
| The ALOM CMT poweron command can fail and the console device is not available. If another poweron command is issued, it fails with a "Host poweron is already in progress" message. | Reset the ALOM CMT with the resetsc command, then issue the poweron command again. |
| System fault messages and ALOM CMT alerts continue to be generated on boot after the fault has been repaired. | Install patch 119578-22 to avoid the problem. If the patch has not yet been installed, after you replace the faulty FRU, run the showfaults -v command to determine how to clear the fault. For the full procedure for clearing fault messages, see To Manually Clear Fault Logs. |
| | Although they are not stable interfaces, putting DTrace fbt probes on send_one_mondo and send_mondo_set could be used as a workaround. For send_mondo_set, extract the number of CPUs being sent cross calls from the cpuset_t argument. |
| The maximum size of the FMA fltlog file might be restricted. | Remove the restrictions by changing the default log rotation options for the Solaris logadm(1M) command. |
| A momentary pressure on the Power On/Off button does not initiate a normal shutdown. | Use the ALOM-CMT poweron and poweroff commands to power the system on and off. |
| A date changed through the Solaris date command persists across reboots of the Solaris OS but not reboots of the ALOM CMT. | Use only the ALOM-CMT date command. Do not use the Solaris date command. |
| | See Chassis Cover Might Be Difficult to Remove (CR 6376423). |
| At certain stages of the power-on process, if the resetsc command is issued, or if the server loses AC power, the ALOM-CMT record of boot status is not cleared. At the next boot, ALOM-CMT might print the message "Reboot loop detected" and does not power on the system. | Issue the command poweroff -f and attempt to power on again. |
| If host power is removed while POST or OpenBoot PROM is testing a device, the device is disabled. | Use the ALOM-CMT enablecomponent command to reenable the incorrectly blacklisted device. |
| The ALOM CMT sc_powerstatememory record might fail during a power failure, preventing the system from powering up afterward. | Use the ALOM CMT poweroff and poweron commands to cycle power on the host system. If you need to remove AC power from the system, you must wait 5 seconds before reapplying power. |
| A faulty DIMM in rank 0 memory can prevent POST from running. The ALOM CMT showcomponent command does not list any CPUs if POST fails to run. Cycling power or running the resetsc command does not update the showcomponent list. | Replace the faulty DIMM, then run POST to update the device list used by the showcomponent command. |
| The OpenBoot nvramrc script is not evaluated before the probe-all command executes. | |
| System does not automatically recover and reboot after an error that causes a fatal abort. In these situations, you must manually power on the system. | Wait for the message SC Alert: Host system has shut down, then issue the ALOM CMT poweron command. (Caution: a system shutdown takes approximately 1-2 minutes. If you issue a poweron or poweroff command before the SC Alert message appears, the system will enter an uncertain state. If this happens, issue the ALOM-CMT resetsc command first, then issue the poweron command.) |
| False error messages are logged during poweron or system reset. The error messages include this segment: ereport.io.fire.pec.lup | |
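For the virtual-console paste limitation listed in TABLE 3-1, one way to prepare a long value is to split it into pieces shorter than 114 characters before pasting each piece in turn. The following Python sketch is illustrative only. The function name and the sample string are assumptions and are not part of ALOM CMT or OpenBoot.

```python
def split_for_paste(text, limit=114):
    """Split text into consecutive chunks that are each strictly
    shorter than `limit` characters, preserving order and content."""
    if limit < 2:
        raise ValueError("limit must be at least 2")
    size = limit - 1  # largest chunk length that stays below the limit
    return [text[i:i + size] for i in range(0, len(text), size)]


# Hypothetical long argument string (300 characters), for illustration only.
sample = "key=value," * 30
chunks = split_for_paste(sample)

# Show the length of each chunk that would be pasted separately.
print([len(chunk) for chunk in chunks])
```

Joining the chunks back together reproduces the original string, so no characters are lost at the chunk boundaries.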
Instructions for installing, administering, and using your Sun Fire T1000 server are provided in the Sun Fire T1000 server documentation set. The entire documentation set is available for download from the following web site:
http://www.sun.com/documentation/
Note - Information in these product notes supersedes the information in the Sun Fire T1000 documentation set.
To Manually Clear Fault Logs
Perform this procedure after replacing Sun Fire T1000 DIMMs. This procedure clears persistent fault information that creates erroneous fault messages at boot time.
1. Troubleshoot and repair a faulty FRU as described in the Sun Fire T1000 Server Service Manual.
2. Gain access to the ALOM-CMT sc> prompt.
Refer to the Advanced Lights Out Management (ALOM) CMT v1.1 Guide for instructions.
3. Run the showfaults -v command to determine how to clear the fault.
The method you use to clear a fault depends on how the fault is identified by the showfaults command.
Then continue to Step 4.
Then run the enablecomponent command to enable the FRU:
4. Perform the following steps to verify that there are no faults:
a. Set the virtual keyswitch to Diag mode so that POST will run in Service mode.
c. Switch to the system console to view POST output.
Watch the POST output for possible fault messages. The following output is a sign that POST did not detect any faults:
d. Issue the Solaris OS fmadm faulty command.
No memory or DIMM faults should be displayed.
If faults are reported, refer to the Diagnostic Flow Chart in the Sun Fire T1000 Server Service Manual for an approach to diagnose the fault.
5. Gain access to the ALOM-CMT sc> prompt.
6. Run the showfaults command.
If the fault was detected by the host and the fault information persists, the output will be similar to the following example:
sc> showfaults -v
ID Time            FRU               Fault
0  SEP 09 11:09:26 MB/CMP0/CH0/R0/D0 Host detected fault, MSGID: SUN4U-8000-2S UUID: 7ee0e46b-ea64-6565-e684-e996963f7b86
If the showfaults command does not report a fault with a UUID, then you do not need to proceed with the following steps because the fault is cleared.
7. Run the clearfault command.
8. Switch to the system console.
9. Issue the fmadm repair command with the UUID.
Use the same UUID that you used with the clearfault command.
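Because Steps 7 and 9 both take the UUID reported by the showfaults -v command, it can help to extract the UUID once and reuse it. The following Python sketch is a minimal illustration, assuming the command output has already been captured as text in the format shown in the Step 6 example. The function name is hypothetical.

```python
import re

# Matches the 8-4-4-4-12 hexadecimal UUID that follows the "UUID:" label.
UUID_PATTERN = re.compile(
    r"UUID:\s*([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})"
)


def extract_fault_uuids(showfaults_output):
    """Return every UUID found in captured showfaults -v output."""
    return UUID_PATTERN.findall(showfaults_output)


# Captured output, copied from the Step 6 example.
captured = (
    "ID Time            FRU               Fault\n"
    "0  SEP 09 11:09:26 MB/CMP0/CH0/R0/D0 Host detected fault, "
    "MSGID: SUN4U-8000-2S UUID: 7ee0e46b-ea64-6565-e684-e996963f7b86\n"
)

for uuid in extract_fault_uuids(captured):
    print(uuid)
```

An empty result corresponds to the case described in Step 6, where no fault with a UUID is reported and the remaining steps are unnecessary.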
RAID technology allows for the construction of a logical volume, made up of multiple physical disks, to provide data redundancy, increased performance, or both. The Sun Fire T1000 server onboard disk controller supports the following RAID configurations:
You must have the following patches installed on the server before you create RAID volumes:
Note - For servers with HW dash level 07 or later, the following patches are preinstalled.
For information on how to implement hardware RAID on the server, refer to the Sun Fire T1000 Server Administration Guide (part number 819-3249). This document is available alongside the other Sun Fire T1000 manuals at http://www.sun.com/documentation.
Any Sun Fire T1000 server with a single hard disk configuration can be upgraded to a two SAS disk configuration by installing the following hardware:
Note - Patches 123456-01 or greater and 119850-14 or greater are required for this hardware upgrade.
Copyright © 2007, Sun Microsystems, Inc. All Rights Reserved.