CHAPTER 1

Important Information About the Sun Fire T2000 Server

These product notes contain important and late-breaking information about the Sun Fire™ T2000 server.

Refer to the following sections for details:

Instructions for installing, administering, and using the Sun Fire T2000 server are provided in the Sun Fire T2000 server documentation set. The entire documentation set is available for download from the following web site:
http://www.sun.com/documentation

Information described in these product notes supersedes the information in the Sun Fire T2000 documentation set.



Note - Some server output displays the string "Sun Fire T200," but should display "Sun Fire T2000." For more information, read change request (CR) 6331169.




Identifying the Notes for Your Server

The product notes for the Sun Fire T2000 server are presented in the following categories:

Start by reviewing the general information in this chapter, and then review the notes in the chapter that apply to your server based on its part number.


To Determine the Part Number and Which Notes Apply to Your Server

1. Gain access to the ALOM CMT system controller prompt (sc>).

At the Sun Fire T2000 console, type #. (Pound Period).

2. Run the showfru command as follows:


sc> showfru -s MB
SEGMENT: SD
/ManR
/ManR/UNIX_Timestamp32:      TUE APR 24 18:57:57 2006
/ManR/Description:           ASSY,Sun-Fire-T2000,CPU Board
/ManR/Manufacture Location:  Sriracha,Chonburi,Thailand
/ManR/Sun Part No:           Sun_Partnumber 
/ManR/Sun Serial No:         PC1234
/ManR/Vendor:                Celestica
/ManR/Initial HW Dash Level: 01
/ManR/Initial HW Rev Level:  02
/ManR/Shortname:             T2000_MB
/SpecPartNo:                 885-0689-01
sc>

3. Use the Sun_Partnumber from Step 2 and TABLE 1 to determine which notes apply to your server.


TABLE 1 Sun Fire T2000 Sun Part Numbers

Sun_Partnumber    Refer to Notes
5016843           Notes for Servers With Part Number 501-6843
5017501           Notes for Servers With Part Number 501-7501
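
The lookup in Step 3 can be sketched as a small shell function. This is only an illustration: it assumes that the showfru -s MB output has been saved to a text file (for example, from a console log), and the function name notes_for_part and the file name showfru.txt are hypothetical, not part of ALOM CMT or the Solaris OS.

```shell
#!/bin/sh
# Hypothetical helper (not part of ALOM CMT): reads saved `showfru -s MB`
# output on stdin and prints which notes apply, following TABLE 1.
notes_for_part() {
    # Keep only the value that follows "/ManR/Sun Part No:" and strip whitespace.
    part=$(sed -n 's|^/ManR/Sun Part No:[[:space:]]*||p' | tr -d '[:space:]')
    case "$part" in
        5016843) echo "Notes for Servers With Part Number 501-6843" ;;
        5017501) echo "Notes for Servers With Part Number 501-7501" ;;
        *)       echo "Part number '$part' is not listed in TABLE 1" ;;
    esac
}
```

Assuming the output was saved as showfru.txt, the function would be called as notes_for_part < showfru.txt.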



Information For All Sun Fire T2000 Servers

The remaining sections in this chapter describe information that applies to all Sun Fire T2000 Servers.

HBA Cards Installed in PCI-E Slots on Sun Fire T2000 Servers Are Not Recognized by OBP or During Boot (CR 6479274, 6513604, 6513621, 6514875)

When PCI-E cards with X1, X2, or X4 lane widths are installed in the PCI-E slots of a Sun Fire T2000 server, the server can intermittently fail to recognize the cards. The failures occur during device training. X8 PCI-E cards are not affected.

Normally, PCI-E devices are recognized by OpenBoot firmware as:


PCI-E slot 0:  /pci@780/pci@0/pci@8/SUNW,device_name@0
PCI-E slot 1:  /pci@7c0/pci@0/pci@8/SUNW,device_name@0
PCI-E slot 2:  /pci@7c0/pci@0/pci@9/SUNW,device_name@0

When these failures occur, OpenBoot firmware does not show some of the PCI-E devices in the device tree (displayed using the show-devs command at the ok prompt). After the Solaris OS boots, the cards will be missing from the output of the prtdiag -v command, as well. The system may also report a generic FMA message:

SUNW-MSG-ID: SUNOS-8000-1L

Workaround: Without the patches described below, reboot the system repeatedly until the system can see all devices (usually 1 or 2 reboots are necessary).

The issue is resolved on the following platforms:



Note - The firmware patches should be applied to each Sun Fire T2000 system with PCI-E cards installed in one or more of the PCI-E slots.



Sun Now Offers and Supports Sun 4 GByte DIMMs for Sun Fire T2000 Servers

Instructions for installing DIMMs are provided in the Sun Fire T2000 Server Service Manual.

4 GByte DIMMs may not be mentioned in the service manual, but the DIMM installation instructions apply to all supported DIMMs (512 MB, 1 GB, 2 GB, and 4 GB).

New Features Released in System Firmware 6.3.0

System Firmware 6.3.0 includes ALOM CMT v1.3. There are several new features in ALOM CMT v1.3:

For further information on the new features of ALOM CMT v1.3, refer to the Advanced Lights Out Management (ALOM) CMT v1.3 Guide (819-7981-10).

Preparing for Changes to the Networking Framework

Changes to the networking framework in upcoming software releases might require system administrators or developers to update references to ipge interfaces. To prepare for this change, note the locations of all references to the names of networking frameworks. For example, if you reference the name of an ipge interface in a system configuration file, note the location now. Alternatively, you might choose to limit the number of applications explicitly configured to use this interface.

Mandatory /etc/system File Entries

This section describes the entries that must be present in the /etc/system file to ensure optimal functionality of the server. These entries resolve CRs 6274126* and 6344888 (see Chapter 3, TABLE 3-3).

The following entry must be in the /etc/system file:

set pcie:pcie_aer_ce_mask=0x1

If you have a Sun Fire T2000 Server with part number 501-6843 and it is running the Solaris™ 10 3/05 HW2 Operating System, you must also have the following entry:

set segkmem_lpsize=0x400000


To Check and Create the Mandatory /etc/system File Entries

Perform this procedure in the following circumstances:

1. Log in as superuser.

2. Check the /etc/system file to see if the mandatory lines are in the file.


# more /etc/system
*ident  "@(#)system     1.18 05/06/27 SMI" /* SVR4 1.5 */
*
* SYSTEM SPECIFICATION FILE
.
.
.
set pcie:pcie_aer_ce_mask=0x1
set segkmem_lpsize=0x400000     <--See footnote[1]
.

3. If the entries are not there, add them.

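Use an editor to edit the /etc/system file and add the missing lines. Add the segkmem_lpsize line only if your server has part number 501-6843 and is running the Solaris 10 3/05 HW2 OS.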

Reboot the server.
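
As an alternative to editing the file by hand, the check in Step 2 and the addition in Step 3 can be sketched as a shell function. This is a minimal sketch, not a supported tool: the function name ensure_system_entry is hypothetical, it assumes a grep that accepts the POSIX -F and -x options (on some Solaris releases that is /usr/xpg4/bin/grep rather than /usr/bin/grep), and it matches only lines that are exactly identical to the entry.

```shell
#!/bin/sh
# Hypothetical helper: ensure_system_entry FILE 'set ...'
# Appends the entry to FILE only when no identical line is already present.
ensure_system_entry() {
    file=$1
    entry=$2
    # -F: treat the entry as a fixed string; -x: match whole lines only.
    if grep -Fx -- "$entry" "$file" >/dev/null 2>&1; then
        echo "present: $entry"
    else
        printf '%s\n' "$entry" >> "$file"
        echo "added: $entry"
    fi
}
```

Run as superuser, a call such as ensure_system_entry /etc/system 'set pcie:pcie_aer_ce_mask=0x1' would add the line only if it is missing. Consider trying the function on a copy of the file first, and reboot the server after any change.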

Replace Motherboards With Approved Replacements

Over time, different motherboards were manufactured for this server, and not all motherboards are interchangeable. If you replace the motherboard, you must replace it with a motherboard that has the same part number or an approved alternative motherboard (approved alternative part numbers are listed on the Sun Services Substitution List). The part number of the motherboard can be determined by visual inspection of the part number label on the motherboard or by using the showfru SC command.

Sun Explorer Requires the Tx000 Option

When running Sun Explorer 5.2 or greater, you must specify the Tx000 option to collect the data from the ALOM CMT commands on the Sun Fire T2000 server. The script is not run by default. The following example shows how to run the script.


# /opt/SUNWexplo/bin/explorer -w default,Tx000

For more details, refer to the troubleshooting document, Using Sun Explorer on the Tx000 Series Systems. This document is available on the SunSolve web site:
http://www.sun.com/sunsolve

Running SunVTS CPU Tests ... Causes Shutdown Due to Watchdog Timeout (CR 6498483)

CoolThreads servers running SunVTS CPU tests have encountered Solaris watchdog timeouts that lead to system shutdown.

Workaround: Set the ALOM CMT sys_autorestart variable to none while running SunVTS, so that ALOM CMT issues a warning message but does not reset the server.

T2000 Correctable Memory Errors in POST Can Be Confusing (CR 6479408)

POST error messages regarding unsupported memory configurations can be misleading. In situations where memory rank 0 (zero) is fully populated, the following message can be ignored safely.

ERROR: Using unsupported memory configuration

Recognizing Erroneous Error Messages

The implementation of the Solaris Predictive Self-Healing (PSH) software provided on this release of Sun Fire T2000 systems causes most systems to display a few erroneous error messages.

Erroneous Boot Time Messages

The following messages usually occur two or three times when the system is booted. These errors are logged and can be viewed with the fmdump command as shown in the following example:


# fmdump -ev
 
TIME                 CLASS              ENA
 
Nov 04 10:56:06.6096 ereport.io.fire.pec.rto   0x00002d1a86f87002
Nov 04 10:56:06.6100 ereport.io.fire.pec.rto   0x00002d1a9d2f2002
Nov 04 10:56:06.6100 ereport.io.fire.pec.rnr   0x00002d1a9d2f2002

These errors are not an indication of faulty devices. Once you confirm that your messages match the example shown, you can ignore them. If you see different error messages, contact your Sun Service representative for support.
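
The comparison against the example can be sketched as a shell filter. This is an illustration under stated assumptions: the function name unexpected_ereports is hypothetical, and the sketch assumes that the event class is the fourth whitespace-separated field, as in the fmdump output shown above.

```shell
#!/bin/sh
# Hypothetical helper: reads `fmdump -e` style output on stdin and prints any
# event class other than the two boot-time classes shown in these notes.
unexpected_ereports() {
    awk '
        $4 ~ /^ereport\./ &&
        $4 != "ereport.io.fire.pec.rto" &&
        $4 != "ereport.io.fire.pec.rnr" { print $4 }
    ' | sort -u
}
```

For example, fmdump -e | unexpected_ereports prints nothing when only the two classes from the example are logged. Any class that it does print falls outside the example and is a reason to contact your Sun Service representative.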

Fault Messages Displayed While Booting From Disk, After Booting From the Network (CR 6424812)

If you boot from the hard drive (boot disk) after booting from the network (boot net), and your server is using System Firmware Version 6.1.9, PSH fault messages might be displayed.

Disregard these messages. You can clear the messages from the PSH fault logs by following the instructions in the Sun Fire T2000 Server Service Manual (819-2548).

Example of the fault messages displayed at boot time:


SUNW-MSG-ID: SUN4-8000-5A, TYPE: Defect, VER: 1, SEVERITY: Critical
EVENT-TIME: Fri May 12 09:37:06 EDT 2006
PLATFORM: SUNW,Sun-Fire-T200, CSN: -, HOSTNAME: wgs94-181
SOURCE: eft, REV: 1.13
EVENT-ID: c788de32-a378-cc46-ad4b-97ce105fb175
DESC: 
A problem was detected in the PCI-Express subsystem software.
  Refer to http://sun.com/msg/SUN4-8000-5A for more information.
AUTO-RESPONSE: This fault does not have an automated response agent and thus requires interaction 
from the user and/or Sun Services.
IMPACT: Loss of services provided by the device instances associated with
this problem
REC-ACTION: Ensure latest driver and patch are installed.  Use fmdump -v -u <EVENT_ID> to identify the module/package, or contact Sun for support.

Example of displaying the messages with the fmdump command:


# fmdump -v -u 755528c5-0bcd-4810-fd86-a34baead30c8
TIME                 UUID                                 SUNW-MSG-ID
May 11 17:07:10.3877 755528c5-0bcd-4810-fd86-a34baead30c8 SUN4-8000-5A
   50%  defect.io.fire.pciex.driver
         FRU: pkg:///SUNWcakr
        rsrc: mod:///mod-name=px/mod-id=25
 
   50%  defect.io.fire.pciex.driver
         FRU: pkg:///SUNWipged
        rsrc: mod:///mod-name=ipge/mod-id=119

Example of displaying the System Firmware Version from the Service Controller:


sc> showhost version
System Firmware 6.1.9 Sun Fire[TM] T2000 2006/03/27 08:05
 
Host flash versions:
   Reset V1.1.4
   Hypervisor 1.1.1 2006/02/24 06:38
   OBP 4.20.3 2006/03/21 14:46
   Sun Fire[TM] T2000 POST 4.20.2 2006/03/02 19:31 
sc>
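
To confirm whether a server is running the affected firmware, the version can be pulled out of saved showhost version output. The following is a minimal sketch: the function name firmware_version and the file name showhost.txt are hypothetical, and the pattern assumes the "System Firmware" line has the layout shown in the example above.

```shell
#!/bin/sh
# Hypothetical helper: reads saved `showhost version` output on stdin and
# prints the System Firmware version number from the first matching line.
firmware_version() {
    sed -n 's/^System Firmware \([0-9][0-9.]*\).*/\1/p' | head -n 1
}
```

Assuming the output was saved as showhost.txt, a test such as [ "$(firmware_version < showhost.txt)" = "6.1.9" ] succeeds on a server where the fault messages described in this section can be expected.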

Erroneous Messages Displayed After a Repair (CR 6369961)

The Solaris PSH facility automatically detects the replacement of the motherboard and DIMMs. However, erroneous fault messages might be displayed when the system is booted, and these messages can mislead you into thinking that a problem persists when it is actually fixed. To correct this situation, you must install the Sun Fire T2000 mandatory patch, 119578-2.

Erroneous Fault Messages Displayed After a Solaris OS JumpStart Installation

If you perform a Solaris JumpStart™ installation of a Sun Fire T2000 server, the server will display erroneous PSH fault messages at boot time. To correct this situation, you must install the Sun Fire T2000 mandatory patches and make changes to the /etc/system file. In addition, you should also clear the PSH and ALOM CMT fault logs to prevent the erroneous messages from being reported again. The steps to do this are described in To Configure the System After a JumpStart Installation.

Example of Erroneous Fault Messages at boot time:


SUNW-MSG-ID: SUN4-8000-0Y, TYPE: Fault, VER: 1, SEVERITY: Critical
EVENT-TIME: Fri Jan 27 22:17:36 GMT 2006
PLATFORM: SUNW,Sun-Fire-T200, CSN: -, HOSTNAME: xx
SOURCE: eft, REV: 1.13
EVENT-ID: d79b51d1-aca0-c786-aa50-c8f35ea0fba3
DESC: A problem was detected in the PCI-Express subsystem.
Refer to http://sun.com/msg/SUN4-8000-0Y for more information.
AUTO-RESPONSE: One or more device instances may be disabled
IMPACT: Loss of services provided by the device instances associated with this fault
REC-ACTION: Schedule a repair procedure to replace the affected device. Use fmdump -v -u EVENT_ID to identify the device or contact Sun for support.

Example of displaying the messages with the fmdump command:


# fmdump -v -u d79b51d1-aca0-c786-aa50-c8f35ea0fba3
TIME                 UUID                                 SUNW-MSG-ID
Jan 27 22:01:58.8757 d79b51d1-aca0-c786-aa50-c8f35ea0fba3 SUN4-8000-0Y
  100%  fault.io.fire.asic
         FRU: hc://product-id=SUNW,Sun-Fire-T200/component=IOBD
        rsrc: hc:///ioboard=0/hostbridge=0/pciexrc=0
Jan 27 22:17:36.5980 d79b51d1-aca0-c786-aa50-c8f35ea0fba3 SUN4-8000-0Y
  100%  fault.io.fire.asic
         FRU: hc://product-id=SUNW,Sun-Fire-T200/component=IOBD
        rsrc: hc:///ioboard=0/hostbridge=0/pciexrc=0


To Configure the System After a JumpStart Installation

This procedure describes how to configure the Sun Fire T2000 server after a JumpStart installation so that erroneous fault messages are not reported.

1. Install the mandatory patches on the server.

2. Update the /etc/system file.

See Mandatory /etc/system File Entries.

3. Use the fmadm faulty command to list the UUID of each erroneous fault.


# fmadm faulty

4. Clear each fault that was listed in the preceding step.


# fmadm repair d79b51d1-aca0-c786-aa50-c8f35ea0fba3

5. Clear the persistent logs as shown in the following example.


# cd /var/fm/fmd
# rm e* f* c*/eft/* r*/*

6. Reset the Solaris PSH modules as shown.


# fmadm reset cpumem-diagnosis
# fmadm reset cpumem-retire
# fmadm reset eft
# fmadm reset io-retire

7. Reset the faults at the ALOM CMT prompt:

a. Gain access to the ALOM CMT sc> prompt.

Refer to the Advanced Lights Out Management (ALOM) CMT v1.3 Guide for instructions.

b. Run the showfaults -v command to see the UUID of any faults.


sc> showfaults -v
ID Time              FRU               Fault
0 Jan 27 22:01 hc://product-id=SUNW,Sun-Fire-T200/component=IOBD Host detected fault, MSGID: 
SUN4-8000-0Y UUID: d79b51d1-aca0-c786-aa50-c8f35ea0fba3

c. Run the clearfault command with the UUID provided in the showfaults output:


sc> clearfault d79b51d1-aca0-c786-aa50-c8f35ea0fba3
Clearing fault from all indicted FRUs...
Fault cleared.

8. If faults continue to be reported, the server might have a faulty component. Refer to the Sun Fire T2000 Server Service Manual for diagnostic procedures.
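
When several faults are listed, collecting the UUIDs for Steps 3 and 4 can be sketched as a shell filter. This is only a sketch: the function name extract_uuids is hypothetical, it assumes that each UUID appears as a separate whitespace-delimited word (as in the showfaults and fmdump examples in this section), and it assumes a grep that accepts the POSIX -E and -x options.

```shell
#!/bin/sh
# Hypothetical helper: prints each distinct UUID found on stdin, one per line.
extract_uuids() {
    # Put every word on its own line, then keep only the words that are UUIDs.
    tr -s ' \t' '\n' |
        grep -Ex '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}' |
        sort -u
}

# Possible use on the Solaris host (review the list before repairing anything):
#   fmadm faulty | extract_uuids | while read uuid; do fmadm repair "$uuid"; done
```

The loop in the closing comment is left commented out on purpose, so that the list of UUIDs can be checked against the erroneous faults before anything is cleared.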

 

Documentation Errata

Error Regarding Date Synchronization in the ALOM CMT Guide

There is an error in the documentation of the showdate command in published versions of the ALOM CMT guide. The erroneous text follows:

Displays the ALOM CMT date. The Solaris OS and ALOM CMT time are synchronized, but ALOM CMT time is expressed in Coordinated Universal Time (UTC) rather than local time.

The correct text should be:

Displays the ALOM CMT date. ALOM CMT time is expressed in Coordinated Universal Time (UTC) rather than local time. The Solaris OS and ALOM CMT time are not synchronized.

Typographic Error in Translated Versions of the Sun Fire T2000 Server Installation Guide

There might be a typographical error in the translated versions of the Sun Fire T2000 Server Installation Guide. The error is not present in the English version.

The error is located in Chapter 2, in the section titled "To Boot the Solaris Operating System," in the example in Step 2.

The incorrect example shows the following:


ok boot / pci@7c0/pci@0/pci@2/pci@0,2/LSILogic,sas@4/disk@0,0p

There is a space after the first / that should not be there.

The following line shows the correct example:


ok boot /pci@7c0/pci@0/pci@2/pci@0,2/LSILogic,sas@4/disk@0,0p


[1] Only needed on Sun Fire T2000 servers with part number 501-6843 that are running the Solaris 10 3/05 HW2 OS.