Sun SPARC Enterprise M8000/M9000 Servers Product Notes
This document includes these sections:
The following firmware and software versions are supported in this release:
If you plan to boot your Sun SPARC Enterprise M8000/M9000 server from a Solaris WAN boot server on the network, you must upgrade the wanboot executable. See Booting From a WAN Boot Server for details.
Note - For the latest information on supported firmware and software versions, see Software Resources.
The following patches are mandatory for Sun SPARC Enterprise M8000/M9000 servers running Solaris 10 11/06 OS. These patches are not required for servers running Solaris 10 8/07 OS.
Note - The patches include a revision level, shown as a two-digit suffix. Check SunSolve.Sun.COM for the latest patch revision. See Software Resources for information on how to find the latest patches.
Install the following patches in numerical order:
1. 118833-xx (minimum revision -36. Reboot your domain before proceeding.)
2. 125100-xx (minimum revision -10)
See the patch README file for a list of other patch requirements.
3. 123839-xx (minimum revision -07)
4. 120068-xx (minimum revision -03)
5. 125424-xx (minimum revision -01)
6. 118918-xx (minimum revision -24)
7. 120222-xx (minimum revision -21)
8. 125127-xx (minimum revision -01. Reboot your domain before proceeding.)
9. 125670-xx (minimum revision -02)
10. 125166-xx (minimum revision -05)
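The ordered list and minimum revisions above lend themselves to a mechanical check. The following Python sketch is illustrative only and is not part of any Sun tooling. It assumes that the installed patches have been captured as text in which each installed patch appears on a line beginning with `Patch:` (the layout printed by `showrev -p`); the function names `installed_revisions` and `missing_patches` are hypothetical.

```python
import re

# Mandatory patches for Solaris 10 11/06, in installation order,
# with the minimum revision taken from the list above.
REQUIRED = [
    ("118833", 36),
    ("125100", 10),
    ("123839", 7),
    ("120068", 3),
    ("125424", 1),
    ("118918", 24),
    ("120222", 21),
    ("125127", 1),
    ("125670", 2),
    ("125166", 5),
]

# Assumption: only the ID that follows "Patch:" at the start of a line is an
# installed patch; IDs after "Obsoletes:" or "Requires:" are ignored.
PATCH_LINE_RE = re.compile(r"^Patch:\s+(\d{6})-(\d{2})\b", re.MULTILINE)


def installed_revisions(text):
    """Return {patch_base: highest_installed_revision} found in text."""
    found = {}
    for base, rev in PATCH_LINE_RE.findall(text):
        found[base] = max(found.get(base, -1), int(rev))
    return found


def missing_patches(text):
    """List required patches, in install order, that are absent or too old."""
    have = installed_revisions(text)
    return [
        "%s-%02d" % (base, min_rev)
        for base, min_rev in REQUIRED
        if have.get(base, -1) < min_rev
    ]
```

An empty result means every listed patch is present at or above its minimum revision. The sketch does not check the reboots required after patches 118833 and 125127, nor the additional requirements named in the patch README files.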
This section describes known hardware and software issues in this release.
Caution - For dynamic reconfiguration (DR) and hot-plug issues, see TABLE 4, Solaris Issues and Workarounds.
This section describes hardware-specific issues and workarounds.
TABLE 1 lists known hardware issues and possible workarounds.
This section contains late-breaking hardware information that became known after the documentation set was published.
The following information supersedes the information in the Sun SPARC Enterprise M8000/M9000 Servers Site Planning Guide.
The following figures correct the description in Section 1.2.2.2, "Bottom View of the Components".
FIGURE 1 shows the Sun SPARC Enterprise M8000 Server + Power Cabinet Bottom View.
FIGURE 2 shows the Sun SPARC Enterprise M9000 Server (Base Cabinet) + Power Cabinet Bottom View.
FIGURE 1 Sun SPARC Enterprise M8000 Server + Power Cabinet Bottom View
FIGURE 2 Sun SPARC Enterprise M9000 Server (Base Cabinet) + Power Cabinet Bottom View
This section describes specific software and firmware issues and workarounds.
TABLE 3 lists XCP issues and possible workarounds.
TABLE 4 lists Solaris issues and possible workarounds.
A PCIe error can lead to an invalid fault diagnosis on a large M9000/M8000 domain. |
Create the file /etc/fm/fmd/fmd.conf containing the following lines: |
|
A Sun SPARC Enterprise M9000 with a single domain and 11 or more fully populated system boards might hang under heavy stress. |
Do not exceed 170 CPU strands. Limit the number of CPU strands to one per CPU core by using the Solaris psradm command to disable the excess CPU strands. For example, disable all odd-numbered CPU strands. |
|
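As an illustration of the CPU strand workaround above, the following Python sketch builds the argument list for a `psradm -f` call that takes the odd-numbered strands offline. It is a sketch under stated assumptions, not a tested procedure: the processor IDs are supplied by the caller (for example, read from the first column of `psrinfo` output), the helper names are hypothetical, and the limit of 170 is the figure quoted in the workaround.

```python
MAX_STRANDS = 170  # limit quoted in the workaround above


def odd_strand_ids(processor_ids):
    """Return the odd-numbered processor IDs, sorted ascending."""
    return sorted(pid for pid in set(processor_ids) if pid % 2 == 1)


def psradm_offline_command(processor_ids):
    """Build the argument list that takes the odd-numbered strands offline.

    Returns None when there is no odd-numbered strand to disable.
    """
    odd = odd_strand_ids(processor_ids)
    if not odd:
        return None
    return ["psradm", "-f"] + [str(pid) for pid in odd]


def within_strand_limit(processor_ids):
    """True if the strands left online after the change do not exceed the limit."""
    remaining = len(set(processor_ids)) - len(odd_strand_ids(processor_ids))
    return remaining <= MAX_STRANDS
```

For processor IDs 0 through 7 the helper yields the arguments for `psradm -f 1 3 5 7`. Confirm on the actual domain that disabling the odd-numbered IDs leaves one strand per core before running the resulting command.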
Using the cfgadm -c disconnect command on the following cards might cause the command to hang:
|
Do not perform the cfgadm -c disconnect operation on the affected cards. Check http://sunsolve.sun.com for patch 126670-01. |
|
The Solaris cfgadm(1M) command does not unconfigure a DVD drive from a domain on a Sun SPARC Enterprise M8000/M9000 server. |
Disable the Volume Management Daemon (vold) before unconfiguring a DVD drive with the cfgadm(1M) command. To disable vold, stop the daemon by issuing the command /etc/init.d/volmgt stop. After the device has been removed or inserted, restart the daemon by issuing the command /etc/init.d/volmgt start. |
|
The DAT72 internal tape drive might time out during tape operations. The device might also be identified by the system as a QIC drive. |
Add the following definition to /kernel/drv/st.conf: |
|
If you create a Solaris Flash archive on a non-Sun SPARC Enterprise M8000/M9000 sun4u server and install it on a Sun SPARC Enterprise M8000/M9000 sun4u server, the console’s TTY flags will not be set correctly. This can cause the console to lose characters during stress. |
Immediately after installing the Solaris OS from a Solaris Flash archive, telnet into the Sun SPARC Enterprise M8000/M9000 server and reset the console’s TTY flags as follows: # sttydefs -r console
|
|
Using the DR deleteboard command while psradm operations are running on a domain might cause a system panic. |
There is no workaround. Check for the availability of a patch for this defect. |
|
A large number of spurious PCIe correctable errors can be recorded in the FMA error log. |
To mask these errors, add the following entry to /etc/system and reboot the system: |
|
On a large single domain configuration, the system might incorrectly report very high load average at times. |
There is no workaround. Check for the availability of a patch for this defect. |
|
When using the PCIe Dual-Port Ultra320 SCSI controller card (SG-(X)PCIE2SCSIU320Z), a PCIe correctable error causes a Solaris panic. |
Add the following entry to /etc/system to prevent the problem: |
|
Memory translation warning messages might appear during boot if memory banks were disabled due to excessive errors. |
After the system is rebooted, the fmadm repair command may be used to prevent a recurrence of the problem on the next boot. |
|
When a domain reboots, SCF might not be able to service other domains that share the same physical board. A DR operation can then exceed the default timeout period, and a panic can occur. |
This workaround is not needed if no physical board is shared among multiple domains. Otherwise, increase the DR timeout period by adding the following statement to /etc/system and rebooting your system: |
|
Set the maximum size of the ZFS ARC lower. For detailed assistance contact Sun Service. |
||
The incorrect motherboard may be identified by fmdump for CPU faults after reboot. |
None at this time. Check for the availability of a patch for this defect. |
|
The cfgadm command fails while moving the DVD/DAT drive between two domains. |
There is no workaround. To reconfigure the DVD/tape drive, execute reboot -r from the domain exhibiting the problem. |
|
The showhardconf(8) command on the XSCF cannot display information for PCI cards installed in the External I/O Expansion Unit if the External I/O Expansion Unit is configured using PCI hot-plug. |
There is no workaround. When each PCI card in the External I/O Expansion Unit is configured using PCI hot-plug, the PCI card information is displayed correctly. |
|
The DR addboard command can hang. Once the problem is observed, further DR operations are blocked. Recovery requires a reboot of the domain. |
There is no workaround. Check for the availability of a patch for this defect. |
|
The error message network initialization failed appears repeatedly after a boot net installation. |
||
An error condition can occur when a physical board is shared between 2 domains. |
If the board is shared between domains, do not use DR at the same time on this shared board. |
|
Make sure you have the correct /etc/system parameter and reboot the system:
|
||
There is a low probability of a domain panic during reboot when the Sun Quad GbE UTP x8 PCIe (X4447A-Z) card is present in a domain. |
There is no workaround. Check for the availability of a patch for this defect. Check http://sunsolve.sun.com for patch 125670-01. |
|
Do not use the following I/O cards for network access when you are using the boot net install command to install the Solaris OS: |
When running Solaris 10 11/06, use an alternate type of network card or onboard network device to install the Solaris OS via the network. |
|
When the kcage daemon is expanding the kcage area and the user stack exists in the expanded area, that area is demapped, which might cause a ptl_1 panic during flushw handler execution. |
There is no workaround. Check for the availability of a patch for this defect. |
|
If the system has detected Correctable Memory Errors (CE) at power-on self-test (POST), the domains might incorrectly degrade 4 or 8 DIMMs. |
Increase the memory patrol timeout values used via the following setting in /etc/system and reboot the system: |
|
The system panics when running hot-plug (cfgadm) and DR operations (addboard and deleteboard) on the following cards: |
For Solaris 10 8/07, check http://sunsolve.sun.com for patch 127741-01. For Solaris 10 11/06, check http://sunsolve.sun.com for patch 125670-04. |
|
The system panics when running hot-plug (cfgadm) to configure a previously unconfigured card. The message "WARNING: PCI Expansion ROM is not accessible" will be seen on the console shortly before the system panic. The following cards are affected by this defect: |
DO NOT use cfgadm -c unconfigure to disconnect the I/O card. Use cfgadm -c disconnect to completely remove the card. After waiting at least 10 seconds, the card might be configured back into the domain using the cfgadm -c configure command. For Solaris 10 8/07, check http://sunsolve.sun.com for patch 127741-01. |
|
The system panics when DiskSuite cannot read the metadb during DR. This bug affects the following cards: |
The panic can be avoided when a duplicate copy of the metadb is accessible via another Host Bus Adaptor. Alternatively, apply the patch: check http://sunsolve.sun.com for patch 125166-06. |
|
Messages of the form nxge: NOTICE: nxge_ipp_eccue_valid_check: rd_ptr = nnn wr_ptr = nnn will be observed on the console with the following cards: |
These messages can be safely ignored. For Solaris 10 8/07, check http://sunsolve.sun.com for patch 127741-01. |
|
Hot-plug operation with the following cards might fail if a card is disconnected and then immediately reconnected: |
After disconnecting a card, wait for a few seconds before re-connecting. Check http://sunsolve.sun.com for patch 127750-01. |
|
Hot-plug operations on Sun Crypto Accelerator (SCA)6000 cards can cause Sun SPARC Enterprise M8000/M9000 servers to panic or hang. |
Version 1.0 of the SCA6000 driver does not support hot-plug, so hot-plug operations should not be attempted with it. Version 1.1 of the SCA6000 driver and firmware supports hot-plug operations after the required bootstrap firmware upgrade has been performed. |
|
Performing a DR deleteboard operation on a board which includes Permanent Memory when using the following network cards results in broken connections: |
Re-configure the affected network interfaces after the completion of the DR operation. For basic network configuration procedures, refer to the ifconfig man page. Check http://sunsolve.sun.com for patch 127741-01. |
|
After a successful CPU DR deleteboard operation, the system panics when the following network interfaces are in use: |
Add the following line to /etc/system and reboot the system: Check http://sunsolve.sun.com for patch 127111-02. |
|
Use of the following cards has been observed to cause data corruption in stress tests under laboratory conditions: |
Add the following line in /etc/system and reboot the system: set nxge:nxge_rx_threshold_hi=0 For Solaris 10 8/07, check http://sunsolve.sun.com for patch 127741-01. For Solaris 10 11/06, check http://sunsolve.sun.com for patch 125670-04. |
|
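Several workarounds in this table add a `set` line to /etc/system. The Python sketch below shows one way to add such a line without duplicating it, using the `nxge:nxge_rx_threshold_hi=0` setting quoted above as the example. It is an assumption-laden illustration: it works on the file contents as a string, the function name `ensure_system_setting` is hypothetical, and it only recognizes an existing entry written without spaces around the equals sign.

```python
def ensure_system_setting(text, setting):
    """Return /etc/system contents with a `set <setting>` line present.

    Lines starting with '*' are comments in /etc/system and are skipped
    when looking for an existing entry.
    """
    wanted = "set " + setting
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("*"):
            continue
        # Collapse runs of whitespace so "set   a:b=0" still matches.
        if " ".join(stripped.split()) == wanted:
            return text  # already present, nothing to change
    if text and not text.endswith("\n"):
        text += "\n"
    return text + wanted + "\n"
```

Calling the function a second time on its own output returns the text unchanged. A reboot is still required for the setting to take effect, as the workaround states.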
On Sun SPARC Enterprise M8000/M9000 platforms, one of the columns in the IO Devices section of the output from prtdiag -v is "Type". This reports "PCIe", "PCIx", "PCI" or "UNKN" for each device. The algorithm used to compute this value is incorrect. It reports "PCI" for PCI-X leaf devices and "UNKN" for legacy PCI devices. |
||
Do not start an XSCF failover while a DR operation is running. Wait for a DR operation to finish before starting the failover. If you start the failover first, wait for the failover to finish before starting the DR operation. |
||
After using the addfru or replacefru command to hot-plug a CMU, further DR operations might fail with a misleading message regarding the board being unavailable for DR. |
When performing the addfru and replacefru commands, it is mandatory to run diagnostic tests. If you forget to run the diagnostic tests during addfru/replacefru then either run testsb to test the CMU or remove the CMU/IOU with the deletefru command and then use the addfru command with the diagnostic tests. |
|
The busstat(1M) command with the -w option might cause domains to reboot. |
There is no workaround. Do not use the busstat(1M) command with the -w option on pcmu_p. |
|
Permanent memory DR operation during XSCF failover might cause domain panic. |
Do not start an XSCF failover while a DR operation is running. Wait for a DR operation to finish before starting the failover. If you start the failover first, wait for the failover to finish before starting the DR operation. |
|
prtdiag does not show all IO devices of the following cards: |
||
When the SB is added to the system by the addboard command, the information on the main console path is missing in the SRAM on |
There is no workaround. Check for the availability of a patch for this defect. |
|
The DR addboard command might cause a system hang if you are adding a Sun StorageTek Enterprise Class 4Gb Dual-Port Fibre Channel PCI-E HBA card (SG-XPCIE2FC-QF4) at the same time that an SAP process is attempting to access storage devices attached to this card. The chance of a system hang is increased if the following cards are used for heavy network traffic: |
There is no workaround. Check for the availability of a patch for this defect. |
|
Unsuccessful DR operation leaves memory partially configured. |
It might be possible to recover by adding the board back to the domain with an addboard -d command. |
2. Type the following command:
The following example shows a display of the showdevices -d command where 0 is the domain_id.
The entry for column 4 perm mem MB indicates the presence of permanent memory if the value is non-zero.
The example shows permanent memory on 00-2, with 1674 MB.
If the board includes permanent memory, when you execute the deleteboard command or the moveboard command, the following notice is displayed:
There are two steps that must be completed prior to upgrading:
1. Delete any routes configured on the lan#0 and lan#1 interfaces (failover interfaces).
Note - The applynetwork -n command will not run unless some network configuration has changed. Resetting the hostname (sethostname) to exactly its current value will prompt the command to run.
The following example shows two routes that must be deleted.
At the last applynetwork prompt, answer "y" to reset and continue.
2. Delete any accounts named ’admin’.
Use the showuser -lu command to list all XSCF accounts. Any accounts named admin must be deleted prior to upgrading to XCP 1050 or later. This account name is reserved in XCP 1050 and higher. Use the deleteuser command to delete the account.
Note - For more information on admin accounts, see TABLE 5, Software Documentation Updates.
Note - Do not access the XSCF units via the "Takeover IP address".
Note - LAN connections are disconnected when the XSCF resets. Use the XSCF serial connection to simplify the XCP upgrade procedure.
1. Log in to the XSCF#0 on an account with platform administrative privileges.
2. Verify that there are no faulted or deconfigured components by using the showstatus command.
If no failures are found, showstatus returns the message "No failures found in System Initialization". If anything else is listed, contact your authorized service representative before proceeding.
4. Confirm that all domains are stopped:
5. Move the key position on the operator panel from Locked to Service.
6. Collect an XSCF snapshot to archive the system status for future reference.
7. Upload the XCP 1060 upgrade image by using the getflashimage command.
The BUI on XSCFU#0 can also be used to upload the XCP 1060 upgrade image.
8. Update the firmware by using the flashupdate(8) command.
XSCF> flashupdate -c update -m xcp -s 1060
Specify the XCP version to be updated. In this example, it is 1060.
9. Confirm completion of the update.
Confirm that no abnormality occurs while updating XSCF_B#0.
10. Confirm that both the current and reserve banks of XSCFU#0 display the updated XCP versions.
XSCF> version -c xcp
XSCF#0 (Active )
XCP0 (Reserve): 1060
XCP1 (Current): 1060
XSCF#1 (Standby)
XCP0 (Reserve): 0000
XCP1 (Current): 0000
If the Current and Reserve banks on XSCF#0 do not indicate XCP revision 1060, contact your authorized service representative.
11. Confirm the newly introduced ’servicetag’ facility is enabled.
When a system is upgraded from XCP 104x to XCP 1050 or later, the newly introduced ’servicetag’ facility is not automatically enabled.
a. Check the ’servicetag’ facility status by using the ’showservicetag’ CLI.
b. If it is currently disabled, you must enable it.
c. An XSCF reboot is required for the ’servicetag’ facility to be enabled.
Note - Service tags are used by Sun Service. Fujitsu customers cannot enable service tags.
d. Wait until XSCF firmware reaches the ready state.
This can be confirmed when the READY LED of the XSCF remains lit, or the message ’XSCF Initialize complete’ appears on the serial console.
12. Turn off all of the server’s power switches for 30 seconds.
13. After 30 seconds, turn the power switches back on.
14. Wait until XSCF firmware reaches the ready state.
This can be confirmed when the READY LEDs of XSCF_B#0 and XSCF_B#1 remain lit.
15. Log in to XSCFU#0 using a serial connection or LAN connection.
16. Confirm no abnormality occurred by using showlogs error -v and showstatus commands.
If you encounter any hardware abnormality of the XSCF, contact your authorized service representative.
17. Confirm and update the imported XCP image again.
Specify the XCP version to be updated. In this example, it is 1060. XSCF#1 will be updated, and then XSCF#0 will be updated again.
When the firmware update for XSCF#0 is complete, XSCF#1 is active.
18. Log in to XSCFU#1 using a serial connection or LAN connection.
19. Confirm completion of the update by using the showlogs event command.
Confirm no abnormality is found during the update.
20. Confirm that both the current and reserve banks of XSCFU#0 and XSCFU#1 display the updated XCP versions.
XSCF> version -c xcp
XSCF#1 (Active )
XCP0 (Reserve): 1060
XCP1 (Current): 1060
XSCF#0 (Standby)
XCP0 (Reserve): 1060
XCP1 (Current): 1060
If the Current and Reserve banks on XSCF#0 or XSCF#1 do not indicate XCP revision 1060, contact your authorized service representative.
21. Confirm switching over between XSCFs works properly.
XSCF> switchscf -t Standby
The XSCF unit switch between the Active and Standby states. Continue? [y|n] :y
a. When the READY LED on XSCFU_B#1 remains lit, log in to XSCFU#0 using a serial connection or LAN connection.
b. Confirm switching over between XSCFs using the following commands:
Confirm XSCF#1 is now the standby, and that XSCF#0 has become the active.
Confirm no new errors have been recorded since the check in Step 16.
Confirm that the message "XSCFU entered active state from standby state" appears.
Confirm that the message "No failures found in System Initialization" appears.
23. Log in to XSCFU#0 and confirm all domains start up properly.
24. Check that there are no new errors.
25. Move the key switch on the operator panel from Service to Locked.
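The bank checks in steps 10 and 20 above can be scripted when the output of version -c xcp has been captured as text, for example from a serial console log. The following Python sketch is a minimal illustration that assumes the output layout shown in those steps; the function names `parse_xcp_versions` and `unit_at_version` are hypothetical and are not XSCF commands.

```python
import re

# Patterns for the two kinds of entries in the captured output, e.g.
# "XSCF#0 (Active )" and "XCP0 (Reserve): 1060".
UNIT_RE = re.compile(r"XSCF#(\d+)\s*\(\s*(\w+)\s*\)")
BANK_RE = re.compile(r"XCP(\d+)\s*\(\s*(\w+)\s*\)\s*:\s*(\d{4})")


def parse_xcp_versions(text):
    """Map each XSCF unit number to {'state': ..., 'banks': {name: version}}."""
    units = {}
    current = None
    # Walk unit headers and bank entries in the order they appear in the text,
    # so each bank is attributed to the unit header that precedes it.
    tokens = sorted(
        [(m.start(), "unit", m) for m in UNIT_RE.finditer(text)]
        + [(m.start(), "bank", m) for m in BANK_RE.finditer(text)],
        key=lambda t: t[0],
    )
    for _, kind, m in tokens:
        if kind == "unit":
            current = int(m.group(1))
            units[current] = {"state": m.group(2), "banks": {}}
        elif current is not None:
            units[current]["banks"][m.group(2)] = m.group(3)
    return units


def unit_at_version(units, unit, version):
    """True if the unit reports both Reserve and Current banks at `version`."""
    banks = units.get(unit, {}).get("banks", {})
    return (
        set(banks) >= {"Reserve", "Current"}
        and all(v == version for v in banks.values())
    )
```

Because the patterns are matched over the whole text, the parser accepts the output whether it is stored on one line or several. A result of False for a unit that should have been updated corresponds to the "contact your authorized service representative" case in the procedure.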
To support booting the Sun SPARC Enterprise M8000/M9000 server from a WAN boot server:
1. Install the Solaris 10 11/06 OS on the WAN boot server.
2. Copy the wanboot executable from that release to the appropriate location on the install server. If you need further instructions, refer to the Solaris 10 Installation Guide: Network-Based Installations or refer to:
http://docs.sun.com/app/docs/doc/817-5504/6mkv4nh65?a=view
3. Create a WAN boot miniroot from the Solaris 10 11/06 OS. If you need further instructions, refer to:
http://docs.sun.com/app/docs/doc/817-5504/6mkv4nh63?a=view
If you do not upgrade the wanboot executable, the Sun SPARC Enterprise M8000/M9000 server will panic, with messages similar to the following:
krtld: load_exec: fail to expand cpu/$CPU
krtld: error during initial load/link phase
panic - boot: exitto64 returned from client program
See http://docs.sun.com/app/docs/doc/817-5504/6mkv4nh5i?a=view for more information on WAN boot.
In XCP 105x, the command getflashimage is available, which can be used to download firmware images in place of the XSCF Web.
This section contains late-breaking information on the software documentation that became known after the documentation set was published.
Copyright © 2007, Sun Microsystems, Inc. All Rights Reserved.