SPARC Enterprise M8000/M9000 Servers Product Notes



Supported Firmware and Software Versions

The following firmware and software versions are supported in this release:



Caution - CR ID 6534471: The system might panic or trap during normal operation. This bug has been fixed in Solaris 10 8/07. For systems running Solaris 10 11/06, either upgrade to Solaris 10 8/07 or apply patch 120011-08. This CR is listed in the section Solaris Issues and Workarounds.




Note - All SPARC Enterprise M8000/M9000 servers must be upgraded to XCP 1050 to support adding future COD Right To Use (RTU) licenses. Contact your local Service Representative for assistance.


If you plan to boot your SPARC Enterprise M8000/M9000 server from a Solaris WAN boot server on the network, you must upgrade the wanboot executable. See Booting From a WAN Boot Server for details.



Note - For the latest information on supported firmware and software versions, see Software Resources.



Solaris Patch Information

The following patches are mandatory for Sun SPARC Enterprise M8000/M9000 servers running Solaris 10 11/06 OS. These patches are not required for servers running Solaris 10 8/07 OS.



Note - Each patch ID listed below includes a revision level, shown as a two-digit suffix. Check SunSolve.Sun.COM for the latest patch revision. See Software Resources for information on how to find the latest patches.


Install the patches in the following order:

After installing patch 118833-36, reboot your domain before proceeding.

Install version 125100-08 at minimum. See the 125100-08 README file for a list of other patch requirements.

After installing patch 125127-01, reboot your domain before proceeding.


Known Issues

This section describes known hardware and software issues in this release.

General Functionality Issues and Limitations



Caution - For DR and hot-plug issues, see TABLE 3, Solaris Issues and Workarounds.



Notes for Dual eXtended System Control Facility (XSCF) Unit

The dual eXtended System Control Facility (XSCF) unit is a feature that will be supported in a future release. As a result, this release differs in several points from what is described in the SPARC Enterprise M8000 and M9000 servers documentation.


Hardware Installation and Service Issues

This section describes hardware-specific issues and workarounds.

Issues and Workarounds

TABLE 1 lists known hardware issues and possible workarounds.


TABLE 1 Hardware Issues and Workarounds

CR ID

Description

Workaround

6433420

The domain console may display a Mailbox time-out or IOCB interrupt time-out error during boot.

Issue a reset-all command from the OpenBoot PROM (ok) prompt and reboot.

6488846

During boot, the domain console may display a checksum error for the SG(X)PCI2SCSIU320-Z SCSI controller I/O card.

Check for the availability of the latest controller card firmware.

 

6557379

Power cables are not redundant on single power feed servers without the dual power feed option.

On servers that have single power feed, all power cables must be connected and powered on at all times.



Software and Firmware Issues

This section describes specific software and firmware issues and workarounds.

XCP Issues and Workarounds

TABLE 2 lists XCP issues and possible workarounds.


TABLE 2 XCP Issues and Workarounds

CR ID

Description

Workaround

6486286

 

Disconnecting from the domain console does not terminate the shell session.

Always log out of the Solaris OS before exiting the console connection.

If you accidentally disconnect the domain console without logging out:

  • Log in again to the domain console
  • Log out
  • Exit the console connection

6519877

 

All domains must be powered off before upgrading the XCP firmware.

Power off domains before using the flashupdate command to upgrade XCP firmware.

6521896

If you log in to the XSCF Unit while it is still booting, you might get a bash$ prompt instead of the XSCF> prompt, and be unable to perform most operations.

Log out of the bash$ prompt and wait for the XSCF Unit to finish booting.

6529635

The showdomainstatus -a command shows domain status as Powered Off, but the showboards -a command shows the domain is testing.

Use the showboards command to check the status of domain power.

The showdomainstatus command takes longer to show the correct status.

6532036

 

Some commands which update configuration data take a relatively long time to execute.

Do not cancel set* commands. They appear to hang, but eventually complete in about 30 seconds.

6533158

The fault (memory.block.ue) is encountered and reported periodically.

An uncorrectable error exists in a DIMM and the DIMM should be replaced.

6537345

When using the XSCF Web to import a firmware image, if the image is corrupted, the flashupdate command might later report an internal error.

Import a firmware image again. Reboot the XSCF Unit, then use the flashupdate command again to clear the internal error.

6538564

Using the rebootxscf command might result in a process down error, and possibly an FMA event with MSG ID SCF-8005-NE.

There is no workaround. Check for the availability of a patch for this defect.

 

6543260

The showaudit all command shows a long list of defaults in the policy section after the database is cleared.

To clear the non-existent user default settings, run the following commands:

setaudit -a opl=enable

setaudit -a opl=default

6565422

The Latest communication field in showarchiving is not updated regularly.

Disabling and re-enabling archiving refreshes the Latest communication field in showarchiving output.

6573729

An I/O error results when the snapshot CLI attempts to write to a USB stick that has write protection set.

Do not use write-protected USB devices to collect a snapshot.

6577801

An incorrect domain state is reported. After the sendbreak command is issued to a domain, showdomainstatus continues to show the state as ‘Running’ when the domain is actually at the ‘ok’ prompt.

There is no workaround. This is a side effect of the sendbreak operation.

6588650

On occasion, the system is unable to perform DR operations after an XSCF failover or XSCF reboot.

There is no workaround. Check for the availability of a patch for this defect.

6595501

If an invalid SMTP server is configured, a subsequent attempt to disable email service (using the setemailreport CLI) may block for up to 30 minutes.

Wait for the CLI to complete. The rest of the system will function normally during this time.

  • The CLI can also be aborted by ^C. Note that the operation (disabling emailreport) is completed, even if ^C is used.
  • showemailreport can be used to confirm that the service has been disabled.

Solaris Issues and Workarounds

TABLE 3 lists Solaris issues and possible workarounds.


TABLE 3 Solaris Issues and Workarounds

CR ID

Description

Workaround

5076574

A PCIe error can lead to an invalid fault diagnosis on a large M9000/M8000 domain.

Create a file /etc/fm/fmd/fmd.conf containing the following lines:

setprop client.buflim 40m

setprop client.memlim 40m

6303418

A SPARC Enterprise M9000 with a single domain and 11 or more fully populated system boards may hang under heavy stress.

Do not exceed 170 CPU strands.

Limit the number of CPU strands to one per CPU core by using the Solaris psradm command to disable the excess CPU strands. For example, disable all odd-numbered CPU strands.

 

This bug has been fixed in Solaris 10 8/07.

6348554

Using the cfgadm -c disconnect command on the following cards might hang the command:

  • SG-XPCIE2FC-QF4 Sun StorageTek Enterprise Class 4Gb Dual-Port Fibre Channel PCI-E HBA
  • SG-XPCIE1FC-QF4 Sun StorageTek Enterprise Class 4Gb Single-Port Fibre Channel PCI-E HBA
  • SG-XPCI2FC-QF4 Sun StorageTek Enterprise Class 4Gb Dual-Port Fibre Channel PCI-X HBA
  • SG-XPCI1FC-QF4 Sun StorageTek Enterprise Class 4Gb Single-Port Fibre Channel PCI-X HBA

Do not perform the cfgadm -c disconnect operation on the affected cards.

 

6459540

The DAT72 internal tape drive might time out during tape operations.

The device might also be identified by the system as a QIC drive.

 

 

Add the following definition to /kernel/drv/st.conf:

 

tape-config-list=
"SEAGATE DAT    DAT72-000",
"SEAGATE_DAT____DAT72-000",
"SEAGATE_DAT____DAT72-000";
SEAGATE_DAT____DAT72-000=1,0x34,0,0x9639,4,0x00,0x8c,0x8c,
0x8c,3;

There are four spaces between “SEAGATE DAT” and “DAT72-000”.

6472153

If you create a Solaris Flash archive on a non-SPARC Enterprise M8000/M9000 sun4u server and install it on a SPARC Enterprise M8000/M9000 sun4u server, the console’s TTY flags will not be set correctly. This can cause the console to lose characters during stress.

 

 

Just after installing Solaris OS from a Solaris Flash archive, telnet into the SPARC Enterprise M8000/M9000 server to reset the console’s TTY flags as follows:

# sttydefs -r console
# sttydefs -a console -i "9600 hupcl opost onlcr crtscts" -f "9600"

 

This procedure is required only once.

6498283

Using the DR deleteboard command while psradm operations are running on a domain might cause a system panic.

 

There is no workaround. Check for the availability of a patch for this defect.

 

This bug has been fixed in Solaris 10 8/07.

6508432

 

 

A large number of spurious PCIe correctable errors can be recorded in the FMA error log.

 

Add the following entry to /etc/system to prevent the problem:

set pcie:pcie_aer_ce_mask = 0x2001

 

This bug has been fixed in Solaris 10 8/07.

6510779

On a large single-domain configuration, the system may at times incorrectly report a very high load average.

There is no workaround. Check for the availability of a patch for this defect.

 

6522017

DR and ZFS may not be used in the same domain.

Set the maximum size of the ZFS ARC lower. For detailed assistance please contact Sun Service.

6527781

 

 

The cfgadm command fails while moving the DVD/DAT drive between two domains.

There is no workaround. To reconfigure the DVD/tape drive, execute reboot -r from the domain exhibiting the problem.

 

This bug has been fixed in Solaris 10 8/07.

6530178

The DR addboard command can hang. Once the problem is observed, further DR operations are blocked. Recovery requires a reboot of the domain.

There is no workaround. Check for the availability of a patch for this defect.

 

This bug has been fixed in Solaris 10 8/07.

6531036

The error message network initialization failed appears repeatedly after a boot net installation.

There is no workaround.

6534471

Systems may panic/trap during normal operation.

Make sure you have the correct /etc/system parameter:
set heaplp_use_stlb=0

 

This bug has been fixed in Solaris 10 8/07.

6539909

Do not use the following I/O cards for network access when you are using the boot net install command to install the Solaris OS:

  • X4447A-Z/X4447A-Z, PCIe Quad-port Gigabit Ethernet Adapter UTP
  • X1027A-Z/X1027A-Z, PCIe Dual 10 Gigabit Ethernet Fiber XFP

When running Solaris 10 11/06, use an alternate type of network card or onboard network device to install the Solaris OS via the network.

 

This defect does not exist in Solaris 10 8/07.

 

6545685

If the system has detected Correctable Memory Errors (CE) at power-on self-test (POST), the domains might incorrectly degrade 4 or 8 DIMMs.

Increase the memory patrol timeout values used via the following setting in /etc/system:

set mc-opl:mc_max_rewrite_loop = 20000

6546188

The system panics when running hot-plug (cfgadm) and DR operations (addboard and deleteboard) on the following cards:

  • X4447A-Z, PCI-e Quad-port Gigabit Ethernet Adapter UTP
  • X1027A-Z1, PCI-e Dual 10 Gigabit Ethernet Fiber XFP Low profile Adapter

There is no workaround. Check for the availability of a patch for this defect.

 

6551356

The system panics when running hot-plug (cfgadm) to configure a previously unconfigured card. The message "WARNING: PCI Expansion ROM is not accessible" appears on the console shortly before the system panic. The following cards are affected by this defect:

  • X4447A-Z, PCI-e Quad-port Gigabit Ethernet Adapter UTP
  • X1027A-Z1, PCI-e Dual 10 Gigabit Ethernet Fiber XFP Low profile Adapter

Perform cfgadm -c disconnect to completely remove the card. After waiting at least 10 seconds, the card may be configured back into the domain using the cfgadm -c configure command.

 

6556742

The system panics when DiskSuite cannot read the metadb during DR. This bug affects the following cards:

  • SG-XPCIE2FC-QF4, 4Gb PCI-e Dual-Port Fibre Channel HBA
  • SG-XPCIE1FC-QF4, 4Gb PCI-e Single-Port Fibre Channel HBA
  • SG-XPCI2FC-QF4, 4Gb PCI-X Dual-Port Fibre Channel HBA
  • SG-XPCI1FC-QF4, 4Gb PCI-X Single-Port Fibre Channel HBA

The panic can be avoided if a duplicate copy of the metadb is accessible through another host bus adapter. Alternatively, apply patch 125166-06.

 

6559504

Messages of the form "nxge: NOTICE: nxge_ipp_eccue_valid_check: rd_ptr = nnn wr_ptr = nnn" will be observed on the console.

These messages can be safely ignored.

 

 

6563785

Hot-plug operation with the following cards might fail if a card is disconnected and then immediately reconnected:

  • SG-XPCIE2SCSIU320Z Sun StorageTek PCI-E Dual-Port Ultra320 SCSI HBA
  • SGXPCI2SCSILM320-Z Sun StorageTek PCI Dual-Port Ultra320 SCSI HBA

After disconnecting a card, wait for a few seconds before re-connecting.

6564332

Hot-plug operations on Sun Crypto Accelerator (SCA)6000 cards can cause SPARC Enterprise M8000/M9000 servers to panic or hang.

Version 1.0 of the SCA6000 driver does not support hot-plug; do not attempt hot-plug operations with it. Version 1.1 of the SCA6000 driver and firmware will support hot-plug operations after the required bootstrap firmware upgrade has been performed.

6564934

Performing a DR deleteboard operation on a board which includes Permanent Memory when using the following network cards will result in broken connections:

  • X4447A-Z, PCI-e Quad-port Gigabit Ethernet Adapter UTP
  • X1027A-Z1, PCI-e Dual 10 Gigabit Ethernet Fiber XFP Low profile Adapter

Re-configure the affected network interfaces after the completion of the DR operation. For basic network configuration procedures, refer to the ifconfig man page.

 

 

6568417

After a successful CPU DR deleteboard operation, the system panics when the following network interfaces are in use:

  • X4447A-Z, PCI-e Quad-port Gigabit Ethernet Adapter UTP
  • X1027A-Z1, PCI-e Dual 10 Gigabit Ethernet Fiber XFP Low profile Adapter

Add the following line to /etc/system and reboot the system:

 

set ip:ip_soft_rings_cnt=0

6571370

Use of the following cards has been observed to cause data corruption in stress tests under laboratory conditions:

  • X4447A-Z, PCI-e Quad-port Gigabit Ethernet Adapter UTP
  • X1027A-Z1, PCI-e Dual 10 Gigabit Ethernet Fiber XFP Low profile Adapter

Add the following line in /etc/system and reboot:

set nxge:nxge_rx_threshold_hi=0

 

6584984

The busstat(1M) command with the -w option might cause domains to reboot.

 

There is no workaround. Do not use the busstat(1M) command with the -w option on pcmu_p.

6589833

The DR addboard command might cause a system hang if you are adding a Sun StorageTek Enterprise Class 4Gb Dual-Port Fibre Channel PCI-E HBA card (SG-XPCIE2FC-QF4) at the same time that an SAP process is attempting to access storage devices attached to this card. The chance of a system hang is increased if the following cards are used for heavy network traffic:

  • X4447A-Z, PCI-e Quad-port Gigabit Ethernet Adapter UTP
  • X1027A-Z1, PCI-e Dual 10 Gigabit Ethernet Fiber XFP Low profile Adapter

There is no workaround. Check for the availability of a patch for this defect.

6592302

An unsuccessful DR operation leaves memory partially configured.

To recover, add the board back to the domain with the addboard -d command, then retry the deleteboard command.
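
For CR 6303418 in TABLE 3, the suggestion to disable all odd-numbered CPU strands with psradm could be scripted. The following Python sketch is illustrative only. It assumes that the processor IDs have already been collected on the domain (for example, from psrinfo output) and that the odd-numbered IDs are the excess strands, as the workaround's example implies. The function name and the sample IDs are hypothetical, and the sketch only builds the psradm -f command line; it does not run it.

```python
def build_psradm_offline_command(processor_ids):
    """Build a 'psradm -f' command line for the odd-numbered processor IDs.

    Returns None when there is nothing to take offline.
    """
    odd_ids = sorted(pid for pid in processor_ids if pid % 2 == 1)
    if not odd_ids:
        return None
    return "psradm -f " + " ".join(str(pid) for pid in odd_ids)


# Hypothetical sample: a domain that reports processor IDs 0 through 7.
sample_ids = range(8)
print(build_psradm_offline_command(sample_ids))
```

Review the generated command against the actual strand numbering of the domain before running it.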


Identifying Permanent Memory in a Target Board

1. Log in to XSCF.

2. Type the following command:


XSCF> showdevices -d domain_id

The following example shows a display of the showdevices -d command where 0 is the domain_id.


XSCF> showdevices -d 0
 
...
 
Memory:
-------
           board    perm  base                domain  target deleted remaining
DID XSB   mem MB  mem MB  address             mem MB  XSB     mem MB    mem MB
00  00-0    8192       0  0x0000000000000000   24576
00  00-2    8192    1674  0x000003c000000000   24576
00  00-3    8192       0  0x0000034000000000   24576
 
...

A non-zero value in column 4, perm mem MB, indicates the presence of permanent memory.

The example shows permanent memory on 00-2, with 1674 MB.

If the board includes permanent memory, when you execute the deleteboard command or the moveboard command, the following notice appears:


System may be temporarily suspended, proceed? [y|n]:
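
The check on the perm mem MB column can also be done on captured showdevices output. The following Python sketch is a hypothetical helper, not part of XSCF. It assumes the Memory rows have the layout shown in the example above (DID, XSB, board mem MB, perm mem MB, base address) and that the output has been saved as text.

```python
import re

# Sample text copied from the showdevices -d 0 example in this section.
SAMPLE_OUTPUT = """\
Memory:
-------
             board     perm       base                  domain   target deleted remaining
DID XSB   mem MB  mem MB  address             mem MB  XSB    mem MB  mem MB
00  00-0    8192       0  0x0000000000000000   24576
00  00-2    8192    1674  0x000003c000000000   24576
00  00-3    8192       0  0x0000034000000000   24576
"""

# Assumed row shape: DID, XSB, board MB, perm MB, then a hexadecimal base address.
MEMORY_ROW = re.compile(r"^\s*(\d+)\s+(\d+-\d+)\s+(\d+)\s+(\d+)\s+0x[0-9a-fA-F]+")


def boards_with_permanent_memory(text):
    """Map each XSB that has a non-zero perm mem MB value to that value."""
    found = {}
    for line in text.splitlines():
        match = MEMORY_ROW.match(line)
        if match and int(match.group(4)) > 0:
            found[match.group(2)] = int(match.group(4))
    return found


print(boards_with_permanent_memory(SAMPLE_OUTPUT))
```

For the sample above, the helper reports only XSB 00-2, which matches the 1674 MB of permanent memory noted in the example.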

Booting From a WAN Boot Server

To support booting the SPARC Enterprise M8000/M9000 server from a WAN boot server:

1. Install the Solaris 10 11/06 OS on the WAN boot server.

2. Copy the wanboot executable from that release to the appropriate location on the install server. If you need further instructions, refer to the Solaris 10 Installation Guide: Network-Based Installations or refer to:

http://docs.sun.com/app/docs/doc/817-5504/6mkv4nh65?a=view

3. Create a WAN boot miniroot from the Solaris 10 11/06 OS. If you need further instructions, refer to:

http://docs.sun.com/app/docs/doc/817-5504/6mkv4nh63?a=view

If you do not upgrade the wanboot executable, the SPARC Enterprise M8000/M9000 server will panic, with messages similar to the following:

krtld: load_exec: fail to expand cpu/$CPU
krtld: error during initial load/link phase
panic - boot: exitto64 returned from client program

See http://docs.sun.com/app/docs/doc/817-5504/6mkv4nh5i?a=view for more information on WAN boot.

Abbreviated Man Page for getflashimage

This section provides an abbreviated man page for getflashimage.

Synopsis

getflashimage [-v] [[-q] -{y|n}] [-u user] [-p proxy [-t proxy_type]] url

getflashimage -l

getflashimage [[-q] -{y|n}] [-d]

getflashimage -h

Description

The getflashimage (8) command downloads a firmware image file for use by the flashupdate (8) command. If any previous image files of the firmware are present on the XSCF unit, they are deleted prior to downloading the new version. You must have platadm or fieldeng privileges to run this command.

Options and Operand

The following table describes the most commonly used options and operand.


-d 

Deletes all previous firmware image files still on the XSCF unit, then exits.

-l 

Lists firmware image files that are still on the XSCF unit, then exits.

-u user

Specifies the user name when logging in to a remote ftp or http server that requires authentication. You will be prompted for a password.

url

Specifies the URL of the firmware image to download.


Examples

CODE EXAMPLE 1 Downloading Using a User Name and Password

This example uses the optional -u user option.


XSCF> getflashimage -u jsmith \
http://imageserver/images/FFXCP1041.tar.gz 
Existing versions: 
        Version                Size  Date 
        FFXCP1040.tar.gz   46827123  Wed Mar 14 19:11:40 2007
Warning: About to delete old versions.
Continue? [y|n]: y 
Password: [not echoed]
Removing FFXCP1040.tar.gz.
  0MB received
  1MB received
  2MB received
...
  43MB received
  44MB received 
  45MB received
Download successful: 46827KB at 1016.857KB/s 

CODE EXAMPLE 2 Listing Available Firmware Image Files

XSCF> getflashimage -l 
Existing versions: 
        Version                Size  Date 
        FFXCP1040.tar.gz   46827123  Wed Mar 14 19:11:40 2007

CODE EXAMPLE 3 Deleting All Previous Firmware Image Files

XSCF> getflashimage -d 
Existing versions:
        Version                Size  Date
        FFXCP1040.tar.gz   46827123  Wed Mar 14 19:11:40 2007
Warning: About to delete old versions.
Continue? [y|n]: y 
Removing FFXCP1040.tar.gz.
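
If the listing from getflashimage -l is captured as text, it can be checked before a download replaces the existing image. The following Python sketch is a hypothetical helper, not an XSCF feature. It assumes the listing layout shown in CODE EXAMPLE 2 (file name, size, date) and that image file names end in .tar.gz, as in the examples. The unit of the Size column is not stated in this document, so the value is kept as a plain integer.

```python
# Sample text copied from CODE EXAMPLE 2.
SAMPLE_LISTING = """\
Existing versions:
        Version                Size  Date
        FFXCP1040.tar.gz   46827123  Wed Mar 14 19:11:40 2007
"""


def parse_image_listing(text):
    """Return one dict per firmware image row found in the listing text."""
    images = []
    for line in text.splitlines():
        # Split into at most three fields so the date keeps its inner spaces.
        fields = line.split(None, 2)
        if len(fields) == 3 and fields[0].endswith(".tar.gz") and fields[1].isdigit():
            images.append(
                {"name": fields[0], "size": int(fields[1]), "date": fields[2].strip()}
            )
    return images


for image in parse_image_listing(SAMPLE_LISTING):
    print(image["name"], image["size"], image["date"])
```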


Software Documentation Updates

This section contains late-breaking information on the software documentation that became known after the documentation set was published.


TABLE 4 Software Documentation Updates

Document

Page Number

Change

All SPARC Enterprise M4000/M5000/M8000/M9000 servers documentation

 

All DVD references are now referred to as CD-RW/DVD-RW.

The list of supported browsers in the SPARC Enterprise M4000/M5000/M8000/M9000 Servers XSCF User’s Guide is erroneous.

Page 9-5

The list of web browsers supported by the XSCF Web includes:

  • Microsoft Internet Explorer 6.0 or later
  • Firefox 2.0 or later
  • Mozilla 1.7 or later
  • Netscape Navigator 7.1 or later

SPARC Enterprise M4000/M5000/M8000/M9000 Servers Administration Guide

Page 2

The following note will be added:

Note: The XSCF firmware requires that all domains have the SUNWsckmr and SUNWsckmu.u packages. Since the Core System, Reduced Network, and Minimal System versions of the Solaris OS do not automatically install these packages, you must do so on any domains that do not already have them.

SPARC Enterprise M4000/M5000/M8000/M9000 Servers Dynamic Reconfiguration (DR) User’s Guide

Page 2-15

Update 2.3: “Conditions and Settings Using Solaris OS”

The following caution will be added:

Caution: DR is not initially supported on domains with one of the following Solaris software groups installed: Core System, Reduced Network, or Minimal System. To use DR on such a domain, you first must install the SUNWsckmr and SUNWsckmu.u packages.

SPARC Enterprise M4000/M5000/M8000/M9000 Servers XSCF User’s Guide

Page D-5

Frequently Asked Questions (FAQ) in “Troubleshooting XSCF and FAQ”

The option for OS dump is not "request" but "panic".

Correction:

1. First, execute the reset(8) command with the panic option from the XSCF Shell.

ioxadm (8) man page

 

 

The Privileges section of the ioxadm (8) man page is incomplete.

The following description is complete:

  • With platop privileges, you can use the operands: env, list.
  • With platadm privileges, you can use the operands: env, list, locator, poweroff, poweron.
  • With fieldeng privileges, you can use the operands: env, list, locator, poweroff, poweron, reset, and setled.

showldap (8) man page

showlookup (8) man page

showemailreport (8) man page

 

The man pages for showldap, showlookup, and showemailreport do not state that these commands are available with the fieldeng privilege.

 

getflashimage (8) man page

 

 

In XCP104x, the new command getflashimage is available, which can be used to download firmware images in place of the XSCF Web.

An abbreviated man page for getflashimage is included in Abbreviated Man Page for getflashimage.

setaudit (8) man page

showaudit (8) man page

 

 

The setaudit and showaudit man pages are incorrect with respect to audit class information.

The following are the audit classes and their values:

ACS_SYSTEM 1

ACS_WRITE 2

ACS_READ 4

ACS_LOGIN 8

ACS_AUDIT 16

ACS_DOMAIN 32

ACS_USER 64

ACS_PLATFORM 128

ACS_MODES 256
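
The audit class values listed above are distinct powers of two. The following Python sketch records them in an enumeration and shows how a combined value could be decoded. Treating the values as a bit mask is an assumption drawn only from the numbers themselves; this document does not state how setaudit or showaudit combine classes. The names AuditClass and class_names are hypothetical.

```python
from enum import IntFlag


class AuditClass(IntFlag):
    """Audit classes and values as listed in TABLE 4."""
    ACS_SYSTEM = 1
    ACS_WRITE = 2
    ACS_READ = 4
    ACS_LOGIN = 8
    ACS_AUDIT = 16
    ACS_DOMAIN = 32
    ACS_USER = 64
    ACS_PLATFORM = 128
    ACS_MODES = 256


def class_names(value):
    """List the class names whose bit is set in value (bit-mask assumption)."""
    return [member.name for member in AuditClass if member & value]


# Hypothetical combined value: ACS_LOGIN (8) plus ACS_AUDIT (16).
print(class_names(24))
```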