SPARC Enterprise M8000/M9000 Servers Product Notes

This document includes these sections:


Supported Firmware and Software Versions

The following firmware and software versions are supported in this release:



Caution - CR ID 6534471: the system may panic or trap during normal operation. Implement the workaround for CR ID 6534471, or check for the availability of a patch and install it immediately. This CR is listed in the section Solaris Issues and Workarounds.



If you plan to boot your SPARC Enterprise M8000/M9000 server from a Solaris WAN boot server on the network, you must upgrade the wanboot executable. See Booting From a WAN Boot Server for details.



Note - For the latest information on supported firmware and software versions, see Software Resources.




Solaris Patch Information

These are the mandatory patches for the SPARC Enterprise M8000/M9000 servers. Install the patches in the following order:



Note - See Software Resources for information on how to find the latest patches. Installation information and README files are included in the patch download.




Known Issues

This section describes known hardware and software issues in this release.

General Functionality Issues and Limitations



Caution - Use of DR in an unsupported configuration might result in a domain panic or might hang the system.




Notes for Dual eXtended System Control Facility (XSCF) Unit

Because the dual eXtended System Control Facility (XSCF) unit configuration will be supported in a future release, some points differ from what is described in the SPARC Enterprise M8000 and M9000 servers documentation.


Hardware Installation and Service Issues

This section describes hardware-specific issues and workarounds.

Issues and Workarounds

TABLE 1 lists known hardware issues and possible workarounds.


TABLE 1 Hardware Issues and Workarounds

CR ID

Description

Workaround

6433420

The domain console may display a Mailbox time-out or IOCB interrupt time-out error during boot.

Issue a reset-all command from the OBP (OK) prompt and reboot.

6488846

During boot, the domain console may display a checksum error for the SG(X)PCI2SCSIU320-Z SCSI controller I/O card.

Check for the availability of the latest controller card firmware.

 



Hardware Documentation Updates

This section contains late-breaking hardware information that became known after the documentation set was published.


TABLE 2 Documentation Updates

Title

Page Number

Update

All SPARC Enterprise M8000/M9000 servers documentation

 

All DVD references are now referred to as CD-RW/DVD-RW.

SPARC Enterprise M8000/M9000 Servers Service Manual

Page 4-23

In section 4.7 (procedures for powering the system on and off), the instructions have changed.

See Power-On/Off Procedures of the Server with Expansion Cabinet.

SPARC Enterprise M8000/M9000 Servers Service Manual

Page 20-12

In section 20.2.1 (backplane replacement), the instructions for setting torque have changed:

"For tightening the power bar, choose a torque depending on the bolt size.

For M8 bolts, use a torque of 8.24 N.m (84 kgf.cm).

For M6 bolts, use a torque of 3.73 N.m (38 kgf.cm)."

SPARC Enterprise M8000/M9000 Servers Service Manual

Page 21-1

In section 21.1 (sensor unit replacement), the description of the figures has changed:

"FIGURE 21-1, FIGURE 21-2, and FIGURE 21-3 show the mounting locations of the SNSUs of SPARC Enterprise M8000 Server, SPARC Enterprise M9000 Server (base cabinet), and the base cabinet of SPARC Enterprise M9000 Server with the expansion cabinet respectively."




Note - The following information supersedes the information in the SPARC Enterprise M8000/M9000 Servers Service Manual.



Power-On/Off Procedures of the Server with Expansion Cabinet

On a server with an expansion cabinet, always turn the mainline switches on or off in the order described below.

Power-On:

1. Turn on all the mainline switches of the expansion cabinet.

If a power cabinet is connected for the dual power feed option, also turn on all the mainline switches of the power cabinet.

2. Turn on all the mainline switches of the base cabinet.

If a power cabinet is connected for the dual power feed option, also turn on all the mainline switches of the power cabinet.

Power-Off:

1. Turn off all the mainline switches of the base cabinet.

If a power cabinet is connected for the dual power feed option, also turn off all the mainline switches of the power cabinet.

2. Turn off all the mainline switches of the expansion cabinet.

If a power cabinet is connected for the dual power feed option, also turn off all the mainline switches of the power cabinet.
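
The switching order can be summarized in a short Python sketch. The function name and the cabinet labels are hypothetical, chosen only for illustration. The sketch also assumes that, with the dual power feed option, each cabinet's power cabinet is switched in the same step as that cabinet.

```python
def mainline_switch_order(action, dual_power_feed=False):
    """Return the order in which to operate the mainline switches.

    action: "on" or "off". The cabinet labels are illustrative names,
    not identifiers used by the server or its firmware.
    """
    if action not in ("on", "off"):
        raise ValueError("action must be 'on' or 'off'")

    def group(cabinet):
        # Assumption: with the dual power feed option, the power cabinet
        # is switched in the same step as the cabinet it is connected to.
        steps = [cabinet]
        if dual_power_feed:
            steps.append("power cabinet for " + cabinet)
        return steps

    # Power-on: expansion cabinet first, then base cabinet.
    order = ["expansion cabinet", "base cabinet"]
    if action == "off":
        # Power-off: the reverse order.
        order.reverse()
    return [step for cabinet in order for step in group(cabinet)]
```

Calling `mainline_switch_order("off")` returns the base cabinet first and the expansion cabinet second, mirroring the Power-Off steps.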


Software Issues

This section describes specific software and firmware issues and workarounds.

XCP Issues and Workarounds

TABLE 3 lists XCP issues and possible workarounds.


TABLE 3 XCP Issues and Workarounds

CR ID

Description

Workaround

6486286

 

Domain console connection does not cancel shell when disconnected.

Always log out of the Solaris OS before exiting the console connection.

If you accidentally disconnect the domain console without logging out:

  • Log in again to the domain console
  • Log out
  • Exit the console connection

6519877

 

All domains must be powered off before upgrading the XCP firmware.

Power off domains before using the flashupdate command to upgrade XCP firmware.

6521896

If you log in to the XSCF Unit while it is still booting, you might get a bash$ prompt instead of the XSCF> prompt, and be unable to perform most operations.

Log out of the bash$ prompt and wait for the SCF to finish booting.

6526186

Hot-plugging of the IOU onboard device card (IOUA) is not supported at this time.

There is no workaround. Check for the availability of a patch for this feature.

6529635

The showdomainstatus -a command shows domain status as Powered Off, but the showboards -a command shows the domain is testing.

Use the showboards command to check the status of system power.

The showdomainstatus command takes a longer time to show the correct status.

6532036

 

Some commands which update configuration data take a relatively long time to execute.

Do not cancel set* commands. They appear to hang, but eventually complete in about 30 seconds.

6533158

The fault (memory.block.ue) is encountered and reported periodically.

An uncorrectable error exists in a DIMM and the DIMM should be replaced.

6537345

When using the XSCF Web to import a firmware image, if the image is corrupted (for example, if the browser window is closed during import), the flashupdate command might later report an internal error. CR ID 6537996 is similar.

Use the command getflashimage -d to delete the corrupted image. If necessary, reboot the XSCF Unit, then use the flashupdate command again to clear the internal error.

 

6537408

Attempting to move a COD board using the moveboard command might fail.

Use the deleteboard and addboard commands instead of the moveboard command.

6538022

The XSCF firmware monitors itself and if it detects any inconsistencies, it forces a reboot.

There is no workaround. Allow the XSCF Unit to finish rebooting. It will return to normal operation within approximately five minutes.

6538564

Using the rebootxscf command might result in a process down error, and possibly an FMA event with MSG ID SCF-8005-NE.

There is no workaround. Check for the availability of a patch for this defect.

 

6543260

The showaudit all command shows a long list of defaults in the policy section after the database is cleared.

To clear the non-existent user default settings, run the following commands:

setaudit -a opl=enable

setaudit -a opl=default


Solaris Issues and Workarounds

TABLE 4 lists Solaris issues and possible workarounds.


TABLE 4 Solaris Issues and Workarounds

CR ID

Description

Workaround

6303418

A SPARC Enterprise M9000 with a single domain and 11 or more fully populated system boards may hang under heavy stress.

Do not exceed 170 CPU strands.

Limit the number of CPU strands to one per CPU core by using the Solaris psradm command to disable the excess CPU strands. For example, disable all odd-numbered CPU strands.

6459540

The DAT72 internal tape drive might time out during tape operations.

The device might also be identified by the system as a QIC drive.

 

Update the Solaris /kernel/drv/st.conf file with the following lines:

 

tape-config-list = "QUANTUM DAT DAT72-00", "QUANTUM DAT DAT72-00", "CFGQUANTUMDATDAT7200", "SEAGATE DAT DAT72-00", "SEAGATE DAT DAT72-00", "CFGSEAGATEDAT7200";

 

CFGQUANTUMDATDAT7200 = 2,0x34,0,0x18619,4,0x47,0x47,0x47,0x47,3,0,600,600,600,600,600,10800;

 

CFGSEAGATEDAT7200 = 2,0x34,0,0x18619,4,0x47,0x47,0x47,0x47,3,0,600,600,600,600,600,10800;

6472153

If you create a Solaris install image or boot image on a non-SPARC Enterprise M8000/M9000 sun4u server and use it on a SPARC Enterprise M8000/M9000 sun4u server, the console's TTY flags will not be set correctly. This can cause the console to lose characters during stress.

Telnet into the SPARC Enterprise M8000/M9000 server to reset the console's TTY flags as follows:

# sttydefs -r console
# sttydefs -a console -i "9600 hupcl opost onlcr crtscts" -f "9600"

6485555

On-board Gigabit Ethernet NVRAM corruption could occur due to a race condition.

If the NVRAM is corrupted, the device is not recognized as a network device. Contact your service representative to replace the FRU.

6498283

Using the DR deleteboard command while psradm operations are running on a domain might cause a system panic.

 

There is no workaround. Check for the availability of a patch for this defect.

 

6505921

Correctable error on the system PCIe bus controller generates an invalid fault.

Create a file /etc/fm/fmd/fmd.conf containing the following lines:

setprop client.buflim 40m

setprop client.memlim 40m

6508432

A large number of spurious PCIe correctable errors can be recorded in the FMA error log.

 

Add the following entry to /etc/system to prevent the problem:

set pcie:pcie_aer_ce_mask = 0x2001

6510779

On a large single domain configuration, the system may incorrectly report very high load average at times.

There is no workaround. Check for the availability of a patch for this defect.

 

6510861

When using the PCIe Dual-Port Ultra320 SCSI controller card (SG-(X)PCIE2SCSIU320Z), a PCIe correctable error causes a Solaris panic.

There is no workaround. Check for the availability of a patch for this defect.

 

6522017

Domains using the ZFS file system cannot use DR.

There is no workaround.

 

6527781

The cfgadm command fails while moving the DVD/DAT drive between two domains.

There is no workaround. To reconfigure the DVD/tape drive, execute reboot -r from the domain exhibiting the problem.

6530178

The DR addboard command can hang. Once the problem is observed, further DR operations are blocked. Recovery requires a reboot of the domain.

There is no workaround. Check for the availability of a patch for this defect.

6531036

The error message network initialization failed appears repeatedly after a boot net installation.

There is no workaround. Check for the availability of a patch for this defect.

6534471

Systems may panic/trap during normal operation.

  • Make sure you have the correct /etc/system parameter:
    set heaplp_use_stlb=0
  • If a change to the parameter does not correct the problem, check for the availability of a patch for this defect.

6536564

Faults in I/O devices might not be diagnosed correctly by the Solaris Fault Management Architecture and result in a defect.eft.undiagnosable_problem, or might be diagnosed as fault.io.* but identify the wrong IOU.

If Solaris panics and reboots due to an I/O fault, use fmdump -eV to view the error report. The device path in the error report will indicate where the error was detected, which will help to isolate the I/O fault.

6539084

PCIe Quad-port Gigabit Ethernet Adapter UTP card might panic during a reboot.

There is no workaround. Check for the availability of a patch for this defect.

6539909

Do not use the following I/O cards for network access when you are using the boot net install command to install the Solaris OS:

  • 4447A-Z/X4447A-Z, PCIe Quad-port Gigabit Ethernet Adapter UTP
  • 1027A-Z/X1027A-Z, PCIe Dual 10 Gigabit Ethernet Fiber XFP

Use an alternate type of network card or onboard network device to install the Solaris OS via the network.

6542632

Memory leak in PCIe module if driver attach fails.

There is no workaround. Check for the availability of a patch for this defect.

6545685

If the system has detected Correctable Memory Errors (CE) at power-on self-test (POST), the domains might incorrectly degrade 4 or 8 DIMMs.

Increase the memory patrol timeout values used via the following setting in /etc/system:

set mc-opl:mc_max_rewrite_loop = 10000
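
For CR ID 6303418 in TABLE 4, the following Python sketch illustrates one way to build the psradm command line that takes all odd-numbered CPU strands offline. It assumes the processor IDs are read from psrinfo-style output, where each line begins with a processor ID. The function name and the sample data are hypothetical, and the sketch only builds the command rather than running it.

```python
def odd_strand_offline_command(psrinfo_output):
    """Build a psradm command that takes odd-numbered CPU strands offline.

    psrinfo_output: text in the form printed by the Solaris psrinfo
    command, where each line starts with a processor ID.
    Returns the command as a list of arguments, or None if no
    odd-numbered processor ID is found.
    """
    odd_ids = []
    for line in psrinfo_output.splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # skip blank or unexpected lines
        cpu_id = int(fields[0])
        if cpu_id % 2 == 1:
            odd_ids.append(cpu_id)
    if not odd_ids:
        return None
    # psradm -f takes the listed processors offline.
    return ["psradm", "-f"] + [str(i) for i in sorted(odd_ids)]


# Illustrative sample only; real psrinfo output lists every strand.
sample = """\
0       on-line   since 03/14/2007 19:11:40
1       on-line   since 03/14/2007 19:11:40
2       on-line   since 03/14/2007 19:11:40
3       on-line   since 03/14/2007 19:11:40
"""
command = odd_strand_offline_command(sample)
```

Review the generated argument list before running it on a domain, because the mapping of processor IDs to CPU cores should be confirmed for the actual configuration.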


Identifying Permanent Memory in a Target Board

Dynamic reconfiguration is not recommended for production use if the target board (SB/XSB) has permanent memory.

1. Log in to XSCF.

2. Type the following command:


XSCF> showdevices -d domain_id

The following example shows a display of the showdevices -d command where 0 is the domain_id.


XSCF> showdevices -d 0
 
...
 
Memory:
-------
           board    perm  base                domain  target deleted remaining
DID XSB   mem MB  mem MB  address             mem MB  XSB    mem MB  mem MB
00  00-0    8192       0  0x0000000000000000   24576
00  00-2    8192    1674  0x000003c000000000   24576
00  00-3    8192       0  0x0000034000000000   24576
 
...

A non-zero value in column 4, perm mem MB, indicates the presence of permanent memory.

The example shows permanent memory on 00-2, with 1674 MB.

If the board includes permanent memory, when you execute the deleteboard command or the moveboard command, the following notice appears:


System may be temporarily suspended, proceed? [y|n]:

3. If a board includes permanent memory, enter n to cancel the DR command.


System may be temporarily suspended, proceed? [y|n]:n
disconnect SB5
DR operation canceled by operator.
XSCF>
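
The check in step 2 can also be sketched in Python. This is a minimal illustration that assumes the Memory rows have the layout shown in the example above (DID, XSB, board mem MB, perm mem MB, base address). The function name and the regular expression are hypothetical and are not part of XSCF.

```python
import re

# Matches the memory rows shown in the example: DID, XSB, board mem MB,
# perm mem MB, base address. Later columns are not needed here.
ROW = re.compile(r"^\s*(\d+)\s+(\d+-\d+)\s+(\d+)\s+(\d+)\s+0x[0-9a-fA-F]+")


def boards_with_permanent_memory(showdevices_output):
    """Return {XSB: perm mem MB} for rows with non-zero permanent memory."""
    result = {}
    for line in showdevices_output.splitlines():
        match = ROW.match(line)
        if match:
            xsb, perm_mb = match.group(2), int(match.group(4))
            if perm_mb > 0:
                result[xsb] = perm_mb
    return result


example = """\
DID XSB   mem MB  mem MB  address             mem MB  XSB    mem MB  mem MB
00  00-0    8192       0  0x0000000000000000   24576
00  00-2    8192    1674  0x000003c000000000   24576
00  00-3    8192       0  0x0000034000000000   24576
"""
print(boards_with_permanent_memory(example))  # {'00-2': 1674}
```

A non-empty result means that at least one board holds permanent memory and a deleteboard or moveboard operation on it should be canceled, as described in step 3.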

Booting From a WAN Boot Server

To support booting the SPARC Enterprise M8000/M9000 server from a WAN boot server:

1. Install the Solaris 10 11/06 OS on the WAN boot server.

2. Copy the wanboot executable from that release to the appropriate location on the install server. If you need further instructions, refer to:

http://docs.sun.com/app/docs/doc/817-5504/6mkv4nh65?a=view

3. Create a WAN boot miniroot from the Solaris 10 11/06 OS. If you need further instructions, refer to:

http://docs.sun.com/app/docs/doc/817-5504/6mkv4nh63?a=view

If you do not upgrade the wanboot executable, the SPARC Enterprise M8000/M9000 server will panic, with messages similar to the following:

krtld: load_exec: fail to expand cpu/$CPU
krtld: error during initial load/link phase
panic - boot: exitto64 returned from client program

See http://docs.sun.com/app/docs/doc/817-5504/6mkv4nh5i?a=view for more information on WAN boot.

Abbreviated Man Page for getflashimage

Synopsis

getflashimage [-v] [[-q] -{y|n}] [-u user] [-p proxy [-t proxy_type]] url

getflashimage -l

getflashimage [[-q] -{y|n}] [-d]

getflashimage -h

Description

The getflashimage (8) command downloads a firmware image file for use by the flashupdate (8) command. If any previous image files of the firmware are present on the XSCF unit, they are deleted prior to downloading the new version. You must have platadm or fieldeng privileges to run this command.

Options and Operand

The following table describes the most commonly used options and operand.


-d

Deletes all previous firmware image files still on the XSCF unit, then exits.

-l

Lists firmware image files that are still on the XSCF unit, then exits.

-u user

Specifies the user name when logging in to a remote ftp or http server that requires authentication. You will be prompted for a password.

url

Specifies the URL of the firmware image to download.


Examples

CODE EXAMPLE 1 Downloading Using a User Name and Password

This example uses the optional -u user option.


XSCF> getflashimage -u jsmith \
http://imageserver/images/FFXCP1041.tar.gz 
Existing versions: 
        Version                Size  Date 
        FFXCP1040.tar.gz   46827123  Wed Mar 14 19:11:40 2007
Warning: About to delete old versions.
Continue? [y|n]: y 
Password: [not echoed]
Removing FFXCP1040.tar.gz.
  0MB received
  1MB received
  2MB received
...
  43MB received
  44MB received 
  45MB received
Download successful: 46827KB at 1016.857KB/s 

CODE EXAMPLE 2 Listing Available Firmware Image Files

XSCF> getflashimage -l 
Existing versions: 
        Version                Size  Date 
        FFXCP1040.tar.gz   46827123  Wed Mar 14 19:11:40 2007

CODE EXAMPLE 3 Deleting All Previous Firmware Image Files

XSCF> getflashimage -d 
Existing versions:
        Version                Size  Date
        FFXCP1040.tar.gz   46827123  Wed Mar 14 19:11:40 2007
Warning: About to delete old versions.
Continue? [y|n]: y 
Removing FFXCP1040.tar.gz.
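
When the listing needs to be processed by a script, the output format shown in these examples can be parsed as in the following Python sketch. It assumes that each image row consists of a file name, a numeric size, and a date, as in CODE EXAMPLE 2. The function name is hypothetical and is not part of XSCF.

```python
def parse_image_listing(listing):
    """Return (file name, size value, date string) tuples from listing text."""
    images = []
    for line in listing.splitlines():
        fields = line.split(None, 2)
        # Image rows have a file name, a numeric size, and a date;
        # headers, prompts, and progress lines do not match this shape.
        if len(fields) == 3 and fields[1].isdigit():
            images.append((fields[0], int(fields[1]), fields[2].strip()))
    return images


listing = """\
Existing versions:
        Version                Size  Date
        FFXCP1040.tar.gz   46827123  Wed Mar 14 19:11:40 2007
"""
images = parse_image_listing(listing)
```

An empty list indicates that no image row was found in the text that was passed in.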


Software Documentation Updates

This section contains late-breaking information on the software documentation that became known after the documentation set was published.


TABLE 5 Software Documentation Updates

Document

Page Number

Change

All SPARC Enterprise M4000/M5000 servers documentation

 

All DVD references are now referred to as CD-RW/DVD-RW.

The list of supported browsers in the SPARC Enterprise M4000/M5000/M8000/M9000 Servers XSCF User's Guide is erroneous.

Page 905

The list of web browsers supported by the XSCF Web includes:

  • Microsoft Internet Explorer 6.0 or later
  • Firefox 2.0 or later
  • Mozilla 1.7.x or later
  • Netscape Navigator 7.1 or later

ioxadm (8) man page

 

The Privileges section of the ioxadm (8) man page is incomplete.

The following description is complete:

  • With platop privileges, you can use the operands: env, list.
  • With platadm privileges, you can use the operands: env, list, locator, poweroff, poweron.
  • With fieldeng privileges, you can use the operands: env, list, locator, poweroff, poweron, reset, and setled.

showldap (8) man page

showlookup (8) man page

showcodusage (8) man page

showemailreport (8) man page

 

The man pages for showldap, showlookup, showcodusage, and showemailreport do not state that these commands are available with the fieldeng privilege.

 

getflashimage (8) man page

 

In XCP104x, the new command getflashimage is available, which can be used to download firmware images in place of the XSCF Web.

An abbreviated man page for getflashimage is included in Abbreviated Man Page for getflashimage.

setaudit (8) man page

showaudit (8) man page

 

 

The setaudit and showaudit man pages are incorrect with respect to audit class information.

The following are the audit classes and their values:

ACS_SYSTEM 1

ACS_WRITE 2

ACS_READ 4

ACS_LOGIN 8

ACS_AUDIT 16

ACS_DOMAIN 32

ACS_USER 64

ACS_PLATFORM 128

ACS_MODES 256

SPARC Enterprise M4000/M5000/M8000/M9000 Servers XSCF User's Guide

Page D-5

Frequently Asked Questions (FAQ) in "Troubleshooting XSCF and FAQ"

The option for OS dump is not "request" but "panic".

Correction:

1. First, execute the reset(8) command with the panic option from the XSCF Shell.

SPARC Enterprise M4000/M5000/M8000/M9000 Servers XSCF Reference Manual

ioxadm(8) command

The required privileges for the ioxadm(8) command are as follows:

Required Privileges Commands

platop env, list

platadm env, list, locator, poweroff, poweron

fieldeng env, list, locator, poweroff, poweron, reset, setled

 

Unless otherwise specified, the corrections here also apply to the man pages that XSCF provides, and they supersede the information in those man pages.

SPARC Enterprise M4000/M5000/M8000/M9000 Servers Administration Guide

 

Hot-plugging of the IOU onboard device card (IOUA) is not supported at this time.
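
The ioxadm(8) privilege corrections in TABLE 5 can be expressed as a simple lookup, sketched here in Python. The dictionary and function names are hypothetical. The privilege and operand names are the ones listed in the table.

```python
# Operands permitted for each privilege, as listed in TABLE 5.
IOXADM_OPERANDS = {
    "platop": {"env", "list"},
    "platadm": {"env", "list", "locator", "poweroff", "poweron"},
    "fieldeng": {"env", "list", "locator", "poweroff", "poweron",
                 "reset", "setled"},
}


def can_use_operand(privilege, operand):
    """Return True if the privilege permits the given ioxadm operand."""
    return operand in IOXADM_OPERANDS.get(privilege, set())
```

For example, `can_use_operand("platadm", "reset")` is False, because reset and setled are listed only for the fieldeng privilege.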