CHAPTER 3

Information About Software

This chapter describes Oracle Solaris software issues and workarounds as related to this XCP firmware release. See also Compatible Hardware, Firmware, and Software Matrix.

Your server was shipped with the Oracle Solaris Operating System and Java Enterprise System software preinstalled.


Capacity on Demand (COD)

The XCP 1101 firmware introduces a new release of the Capacity on Demand (COD) feature. See the latest version of the SPARC Enterprise M4000/M5000/M8000/M9000 Servers Capacity on Demand (COD) User’s Guide.

COD Changes

The new version of COD:



Note - The XCP 1100 firmware release introduced support for the showcodactivationhistory(8) command, which lets you view or transfer COD activation history.



Remote Initial Login

In addition to the standard default login, M-Series servers are delivered with a temporary login called admin to enable remote initial login, through a serial port. The admin user privileges are fixed to useradm and cannot be changed. You cannot log in as temporary admin using the standard UNIX user name and password authentication or SSH public key authentication. The temporary admin account has no password, and one cannot be added for it.

The temporary admin account is disabled after someone logged in as the default user or as temporary admin has successfully added the first user with valid password and privileges.

For more information about login account names, see also TABLE 2-1 in Chapter 2.


Fault Management

Fault management software does not differentiate between SPARC64 VII+ and SPARC64 VII processors. Ereport/fault event strings display SPARC64 VII for both processor types, for example:


fault.chassis.SPARC-Enterprise.cpu.SPARC64-VII.core.ce

However, the FRU field of the fault does contain the correct part number, allowing you to identify the processor type. For example:


XSCF> fmdump -v
Nov 19 00:58:18.6244 1147afbe-d006-4d46-8cf2-d9b6e5a893dc SCF-8007-AR
  100%  fault.chassis.SPARC-Enterprise.cpu.SPARC64-VII.way.ce
 
        Problem in: hc:///chassis=0/cmu=1/cpu=0
           Affects: hc:///chassis=0/cmu=1/cpu=0
               FRU: hc://:product-id=SPARC Enterprise M8000:chassis-id=2030638006:server-id=aaa-dc1-3-sf0:serial=PP1032026V:part=CA06620-D061 B1   \371-4929-02:revision=0a01/component=/CMU#1/CPUM#0
          Location: /CMU#1/CPUM#0
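
Because the event string is the same for both processor types, the part= field of the FRU line is the value to compare. The following is a minimal sketch, assuming the fmdump -v output has been copied to a host with a POSIX shell and sed. The function name fru_part is hypothetical and is not an XSCF command, and the FRU string in the example is an abbreviated form of the sample output.

```shell
# fru_part: print the part= field from an fmdump FRU string (illustrative helper).
# Assumes the FRU fields are colon-separated and that part= is followed by
# revision=, as in the sample output.
fru_part() {
  printf '%s\n' "$1" | sed -n 's/.*:part=\([^:]*\):revision=.*/\1/p'
}

# Abbreviated FRU string used only for illustration.
fru_part 'hc://:serial=PP1032026V:part=CA06620-D061:revision=0a01/component=/CMU#1/CPUM#0'
```

The function prints nothing when the string has no part= field, so an empty result means the FRU line should be read by hand.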


Identifying Degraded Memory in a System


To Identify Degraded Memory in a System

1. Log in to XSCF.

2. Type the following command:


XSCF> showstatus

3. Review the output. The following example shows that DIMM 0A on the motherboard unit has degraded memory:


XSCF> showstatus
    MBU_A Status: Normal;
      MEM#0A Status:Degraded
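
On a system with many components, it can help to filter saved showstatus output for degraded units. The following is a minimal sketch, assuming the output has been captured to a host with a POSIX shell and awk. The function name degraded_units is hypothetical, and the sketch assumes each unit name is a single word, as in the example.

```shell
# degraded_units: read showstatus output on stdin and print the name of each
# unit whose status is Degraded (illustrative helper, not an XSCF command).
degraded_units() {
  awk '/Status: *Degraded/ { print $1 }'
}

# Sample input copied from the example output.
printf '    MBU_A Status: Normal;\n      MEM#0A Status:Degraded\n' | degraded_units
```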


Sun Java Enterprise System

The Sun Java Enterprise System software is a comprehensive set of software and life cycle services that make the most of your software investment. It might not include patches that are mandatory for your server.



Note - Due to an issue that arises from the installation of the Java Enterprise System 5 Update 1 on your system, it might be necessary to enable the WebConsole SMF service.



Enabling the Web Console SMF Service


To Enable the Web Console SMF Service

  • Log in to a terminal as root, then enable the service.


# svcadm enable svc:/system/webconsole:console 

If you have to reload the software, go to the following web site for download and installation instructions:

http://myoraclesupport.com

If you download a fresh copy of software, that software might not include patches that are mandatory for your server. After installing the software, verify that all required patches are installed and install any that are not.


Software Functionality Issues and Limitations

This section describes software functionality issues and limitations in this release.


TABLE 3-1 Software Functionality Issues and Limitations

M3000

M4000/M5000

M8000/M9000

Issue

o

o

o

The setsnmp(8) and showsnmp(8) commands do not notify the user of authorization failure. Upon such failure, confirm that the SNMP trap host is working and re-execute the command using the correct user name.

 

o

 

The following functions, which display power consumption, are not supported on M4000/M5000 servers. Any values displayed are invalid:

  • The power operand of the showenvironment(8) command.
  • XSCF Web.

o

o

o

With the settimezone -c adddst command, if you specify eight or more letters for the time zone abbreviation and the Daylight Saving Time name, running the showlogs command causes a segmentation fault and results in an error. [CR 6789066]

Workaround: Specify the time zone abbreviation and the Daylight Saving Time name using seven letters or fewer.

o

 

 

The M3000 server does not support External I/O Expansion Units.

o

o

o

Only M3000 servers with a SPARC64 VII+ (2.86 GHz) processor let you use the raidctl(1M) command to create a hardware RAID volume using the onboard SAS/LSI controller.

All M-Series servers support using the raidctl(1M) command to view disk and controller status, and support using it with any PCI host bus adapter (HBA) installed in the system.

The RAID-creation limitation was once designated as CR 6723202. There is no workaround.
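
For the settimezone limitation in TABLE 3-1 (CR 6789066), the length of a name can be checked before it is passed to the command. The following is a minimal sketch in POSIX shell. The function name tz_name_ok and the sample names are hypothetical, and the check counts characters rather than validating the name itself.

```shell
# tz_name_ok: succeed only if the given name is seven characters or fewer
# (hypothetical pre-check for the CR 6789066 workaround).
tz_name_ok() {
  [ "${#1}" -le 7 ]
}

# Sample names used only for illustration.
for name in JST JDT EXAMPLEDST; do
  if tz_name_ok "$name"; then
    echo "$name: ok"
  else
    echo "$name: too long"
  fi
done
```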



Oracle Solaris OS Issues (CRs) and Workarounds

This section contains information about Oracle Solaris OS issues known at time of publication. The following tables list issues you might encounter, depending in part on which Oracle Solaris OS release you are using.

Known Issues in All Supported Oracle Solaris Releases

TABLE 3-2 lists Oracle Solaris OS issues that you might encounter in any Oracle Solaris release. If your domains are not running the latest Oracle Solaris release, also take notice of CRs fixed in releases later than yours, as noted in the tables that follow.


TABLE 3-2 Solaris OS Issues and Workarounds for All Supported Releases

CR ID

M3000

M4000/ M5000

M8000/ M9000

Description

Workaround

CR 4816837

 

o

o

System hangs when executing parallel hot-plug operation with SP DR in suspend phase.

There is no workaround.

CR 6459540

 

o

o

The DAT72 internal tape drive connected to M4000/M5000/M8000/M9000 servers might time out during tape operations.

The device might also be identified by the system as a QIC drive.

Add the following definition to /kernel/drv/st.conf:

 

tape-config-list=
"SEAGATE DAT    DAT72-000",
"SEAGATE_DAT DAT72-000",
"SEAGATE_DAT DAT72-000";
SEAGATE_DAT DAT72-000=1,0x34,0,0x9639,4,0x00,0x8c,0x8c,
0x8c,3;

 

There are four spaces between SEAGATE DAT and DAT72-000.

CR 6522017

 

o

o

Domains using the ZFS file system cannot use DR.

Set the maximum size of the ZFS ARC lower. For detailed assistance, contact your authorized service representative.

CR 6531036

o

o

o

The error message network initialization failed appears repeatedly after a boot net installation.

There is no workaround.

CR 6532215

o

o

o

volfs or dscp services might fail when a domain is booted.

Restart the service. To avoid the problem, issue the following commands.

# svccfg -s dscp setprop start/timeout_seconds=count: 300
# svccfg -s volfs setprop start/timeout_seconds=count: 300
# svcadm refresh dscp
# svcadm refresh volfs

CR 6588650

 

o

o

On occasion, an M4000/M5000/M8000/M9000 server is unable to perform DR after an XSCF failover to or from the backup XSCF.

There is no workaround.

CR 6589644

 

 

o

When XSCF switchover happens on an M8000/M9000 server after the system board has been added using the addboard command, the console is no longer available.

The console can be recovered by pressing CTRL-q.

CR 6592302

 

o

o

Unsuccessful DR operation leaves memory partially configured.

It might be possible to recover by adding the board back to the domain with an addboard -d command. Otherwise try deleteboard(8) again.

CR 6611966

 

o

o

DR deleteboard(8) and moveboard(8) operations might fail. Example of messages on the domain:

drmach: WARNING: Device driver failure: /pci
dcs: <xxxx> config_change_state: Hardware specific failure: unconfigure SB1: Device driver failure: /pci

Try DR operations again.

CR 6660168

o

o

o

See CR 6660168, removed from this table due to the length of the description.

 

CR 6674266

 

o

o

 

This CR is a duplicate of CR 6611966.

CR 6745410

o

o

o

The boot program ignores the kadb option, which causes the system not to boot.

Use kmdb instead of kadb.

CR 6794630

o

o

o

An attempt to use the GUI to install Oracle Solaris in a domain larger than 2TB might fail.

Use the command-line interface to install the Oracle Solaris OS.

CR 7009469

o

 

 

Creating a RAID using the raidctl(1M) command generates a warning message on initial process. Subsequent RAID configurations do not.

None.
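
For CR 6459540 in TABLE 3-2, the spacing inside the first quoted string matters and is easy to lose when the entry is copied. The following is a minimal sketch that checks a file for the string with exactly four spaces between SEAGATE DAT and DAT72-000. The function name has_dat72_entry is hypothetical, and the sketch assumes a grep that supports the -F option.

```shell
# has_dat72_entry: succeed if file $1 contains the quoted DAT72 inquiry string
# with exactly four spaces between "SEAGATE DAT" and "DAT72-000".
# Illustrative helper; assumes a grep that supports -F (fixed-string matching).
has_dat72_entry() {
  grep -F '"SEAGATE DAT    DAT72-000"' "$1" >/dev/null
}
```

A possible use is has_dat72_entry /kernel/drv/st.conf && echo "entry present", run after the file has been edited.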


CR 6660168

If a ubc.piowbeue-cpu error occurs on a domain, the Oracle Solaris Fault Management cpumem-diagnosis module might fail, causing an interruption in FMA service. If this happens, you will see output similar to the following sample in the console log:


SUNW-MSG-ID: FMD-8000-2K, TYPE: Defect, VER: 1, SEVERITY: Minor
EVENT-TIME: Fri Apr  4 21:41:57 PDT 2008
PLATFORM: SUNW,SPARC-Enterprise, CSN: 2020642002,
HOSTNAME: <hostname>
SOURCE: fmd-self-diagnosis, REV: 1.0
EVENT-ID: 6b2e15d7-aa65-6bcc-bcb1-cb03a7dd77e3
DESC: A Oracle Solaris Fault Manager component has experienced
an error that required the module to be disabled. Refer to
http://sun.com/msg/FMD-8000-2K for more information.
AUTO-RESPONSE: The module has been disabled. Events
destined for the module will be saved for manual diagnosis.
IMPACT: Automated diagnosis and response for subsequent events
associated with this module will not occur.
REC-ACTION: Use fmdump -v -u <EVENT-ID> to locate the module. Use
fmadm reset <module> to reset the module.

Workaround: If fmd service fails, issue the following command on the domain to recover:


# svcadm clear fmd 

Then restart cpumem-diagnosis:


# fmadm restart cpumem-diagnosis
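
The REC-ACTION text in the sample message refers to the event ID. The following is a minimal sketch for extracting that ID from a saved console log so that it can be passed to fmdump -v -u. The function name event_ids is hypothetical, and the sketch assumes that each EVENT-ID field starts its own line, as in the sample.

```shell
# event_ids: read console or log text on stdin and print each EVENT-ID value
# (illustrative helper for use with fmdump -v -u).
event_ids() {
  sed -n 's/^EVENT-ID: *\([0-9a-f-]*\).*/\1/p'
}

# Two lines copied from the sample console output.
printf 'SOURCE: fmd-self-diagnosis, REV: 1.0\nEVENT-ID: 6b2e15d7-aa65-6bcc-bcb1-cb03a7dd77e3\n' | event_ids
```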

Issues Fixed in Oracle Solaris 10 9/10

TABLE 3-3 lists issues that have been fixed in the Oracle Solaris 10 9/10 OS. You might encounter them in earlier releases.


TABLE 3-3 Oracle Solaris OS Issues and Workarounds Fixed in Oracle Solaris 10 9/10

CR ID

M3000

M4000/ M5000

M8000/ M9000

Description

Workaround

CR 6888928

o

o

o

The IPMP interface fails since probe packets are not sent through that interface. The problem occurs with M3000/M4000/M5000/M8000/M9000 servers. Seen on servers running the Oracle Solaris 10 10/09 OS and IPMP, or any Oracle Solaris release running IPMP with Patch 141444-09 installed.

Disable probe-based failure detection. See IPMP Link-based Only Failure Detection with Solaris 10 Operating System (OS) (Doc ID 1008064.1).

CR 6668237

o

o

o

After DIMMs are replaced, the corresponding DIMM faults are not cleared on the domain.

Use the fmadm repair fmri|uuid command to record the repair. Then use the fmadm rotate command to clear out any leftover events.

CR 6872501

o

o

o

Cores are not offlined when requested by the XSCF. This CR affects only Oracle Solaris 10 5/09 and Oracle Solaris 10 10/09 releases.

Use fmdump(1M) with its -v option on the Service Processor to identify the faulty core. Once identified, use psradm(1M) on the domain to offline the core.


 

Issues Fixed in Oracle Solaris 10 10/09

TABLE 3-4 lists issues that have been fixed in the Oracle Solaris 10 10/09 OS. You might encounter them in earlier releases.


TABLE 3-4 Oracle Solaris OS Issues and Workarounds Fixed in Oracle Solaris 10 10/09

CR ID

M3000

M4000/ M5000

M8000/ M9000

Description

Workaround

CR 6572827

o

o

o

The prtdiag -v command reports PCI bus types incorrectly. It reports “PCI” for PCI-X leaf devices and “UNKN” for legacy PCI devices.

There is no workaround.

CR 6724307

 

 

o

Scheduler decisions are occasionally unbalanced.

Sometimes two threads will be on one core (causing both to run at about half speed) while another core is idle. For many OpenMP and similar parallel applications, the application performance is limited by the speed of the slowest thread.

Uneven scheduling is not common, perhaps 1 in 50 or 1 in 100 decisions. But if there are 128 threads running, then the application might have at least one uneven schedule event.

Use processor sets to prevent uneven thread-to-core assignment.

 

CR 6800734

 

o

o

deleteboard hangs in a domain

There is no workaround.

CR 6816913

 

o

o

The XSCF showdevices command displays the incorrect processor cache size for fractional processor cache sizes, such as displaying “5MB” when the correct display would be “5.5MB.”

Use the prtdiag(1M) command on the domain to report processor information.

CR 6821108

 

o

o

DR and showdevices don’t work after XSCF reboot.

Reboot the XSCF service processor twice. Half the security associations (SAs) are deleted the first time and half the second time, so the second addition succeeds and IPsec communication is reestablished.

CR 6827340

o

o

o

DR and Memory patrol might fail due to SCF command error.

There is no workaround.


Issues Fixed in Oracle Solaris 10 5/09

TABLE 3-5 lists issues that have been fixed in the Oracle Solaris 10 5/09 OS. You might encounter them in earlier releases.


TABLE 3-5 Oracle Solaris OS Issues and Workarounds Fixed in Oracle Solaris 10 5/09

CR ID

M3000

M4000/ M5000

M8000/ M9000

Description

Workaround

CR 6588555

 

o

o

Resetting the XSCF during a DR operation on permanent memory might cause domain panic.

Do not start an XSCF reset while a DR operation is underway. Wait for the DR operation to complete before starting the reset.

CR 6623226

o

o

o

The Oracle Solaris command lockstat(1M) or the dtrace lockstat provider might cause a system panic.

Do not use the Oracle Solaris lockstat(1M) command or the dtrace lockstat provider.

CR 6680733

o

o

o

Sun Quad-port Gigabit Ethernet Adapter UTP (QGC) & Sun Dual 10 GigE Fiber XFP Low Profile Adapter (XGF) NICs might panic under high load conditions.

If possible, use the card in an x8 slot. Otherwise, there is no workaround.

CR 6689757

o

o

o

Sun Dual 10 GigE Fiber XFP Low Profile adapter (XGF) with a single or improperly installed XFP optical transceiver might cause the following error to show on the console:

The XFP optical transceiver is broken or missing.

Check and make sure that both XFP optical transceivers are firmly seated in the housing.

Do not mix INTEL and Sun XFP optical transceivers in the same adapter.

Do NOT plumb a port with the ifconfig command if the port does not contain an XFP optical transceiver or it contains one but the transceiver is not in use.

CR 6725885

o

 

 

cfgadm will display non-existent M3000 system boards (SB1 to SB15).

The cfgadm output for SB1-SB15 can be ignored.


Issues Fixed in Oracle Solaris 10 10/08

TABLE 3-6 lists issues that have been fixed in the Oracle Solaris 10 10/08 OS. You might encounter them in earlier releases.


TABLE 3-6 Oracle Solaris OS Issues and Workarounds Fixed in Oracle Solaris 10 10/08

CR ID

M3000

M4000/ M5000

M8000/ M9000

Description

Workaround

CR 6511374

 

o

o

Memory translation warning messages might appear during boot if memory banks were disabled due to excessive errors.

After the system is rebooted, the fmadm repair command can be used to prevent a recurrence of the problem on the next boot.

CR 6533686

 

o

o

When XSCF is low on system resources, DR deleteboard or moveboard operations that relocate permanent memory might fail with one or more of these errors:

SCF busy 
DR parallel copy timeout

This applies only to Quad-XSB configured System Boards hosting multiple domains.

Retry the DR operation at a later time.

CR 6535018

 

 

o

In Oracle Solaris domains that include SPARC64 VII processors, workloads that make heavy use of the Oracle Solaris kernel might not scale as expected when you increase the thread count to a value greater than 256.

For Oracle Solaris domains that include SPARC64 VII processors, limit domains to a maximum of 256 threads.

CR 6556742

o

o

o

The system panics when DiskSuite cannot read the metadb during DR. This bug affects the following cards:

  • SG-XPCIE2FC-QF4, 4-Gigabit PCI-e Dual-Port Fiber Channel HBA
  • SG-XPCIE1FC-QF4, 4-Gigabit PCI-e Single-Port Fiber Channel HBA
  • SG-XPCI2FC-QF4, 4-Gigabit PCI-X Dual-Port Fiber Channel HBA
  • SG-XPCI1FC-QF4, 4-Gigabit PCI-X Single-Port Fiber Channel HBA

Panic can be avoided when a duplicated copy of the metadb is accessible via another Host Bus Adaptor.

CR 6589833

 

o

o

The DR addboard command might cause a system hang if you are adding a Sun StorageTek Enterprise Class 4-Gigabit Dual-Port Fiber Channel PCI-E HBA card (SG-XPCIE2FC-QF4) at the same time that an SAP process is attempting to access storage devices attached to this card. The chance of a system hang is increased if the following cards are used for heavy network traffic:

  • X4447A-Z, PCI-e Quad-port Gigabit Ethernet Adapter UTP
  • X1027A-Z1, PCI-e Dual 10 Gigabit Ethernet Fiber XFP Low profile Adapter

There is no workaround.

CR 6608404

 

o

o

Hot-plug of the X4447A-Z, PCI-e Quad-port Gigabit Ethernet Adapter UTP card in slot 1 might cause other network devices to fail.

To avoid the defect, do not install this card in slot 1.

CR 6614737

 

o

o

The DR deleteboard(8) and moveboard(8) operations might hang if any of the following conditions exist:

  • A DIMM has been degraded.
  • The domain contains system boards with different memory sizes.

Avoid performing DR operations if any of the following conditions exist:

  • Degraded memory - To determine whether the system contains degraded memory, use the XSCF command showstatus.
  • Differing memory sizes - To determine whether the domain contains system boards with different memory sizes, display the list of memory sizes using the XSCF command showdevices or the prtdiag command on the domain.

If a DR command hangs, reboot the domain to recover.

CR 6619224

 

 

o

For Oracle Solaris domains that include SPARC64 VII processors, a single domain of 256 threads or more might hang for an extended period of time under certain unusual situations. Upon recovery, the uptime command will show extremely high load averages.

For Oracle Solaris domains that include SPARC64 VII processors, do not exceed a domain size of 256 virtual processors in a single Oracle Solaris domain. This means a maximum of 32 CPUs in a single domain configuration (maximum configuration for an M8000 server).

CR 6632549

 

o

o

The fmd service on a domain might fail and go into maintenance mode after DR operations.

Issue the following command on the domain:

# svcadm clear fmd

CR 6660197

 

o

o

DR might cause the domain to hang if either of the following conditions exists:

  • A domain contains 256 or more CPUs.
  • Memory error occurred and the DIMM has been degraded.

  1. Set the following parameter in the system specification file (/etc/system):
set drmach:drmach_disable_mcopy=1
  2. Reboot the domain.

CR 6679370

 

o

o

The following message might be output on the console during system boot, addition of the External I/O Expansion Unit using hotplug, or an FMEMA operation by DR:

SUNW-MSG-ID: SUN4-8000-75, TYPE: Fault, VER: 1, SEVERITY: Critical ... DESC: A problem was detected in the PCIExpress subsystem.

Add the following to the /etc/system file, then reboot the domain.

set pcie_expected_ce_mask = 0x2001

CR 6720261

o

o

o

If your domain is running Oracle Solaris 10 5/08 OS, the system might panic/trap during normal operation.

Set the following parameter in the system specification file (/etc/system):

set heaplp_use_stlb=0

Then reboot the domain.

CR 6737039

o

 

 

WAN boot of M3000 servers fails intermittently with a panic early in the boot process. Sample output:

ERROR: Last Trap: Fast Data Access MMU Miss
%TL:1 %TT:68 %TPC:13aacc %TnPC:13aad0 %TSTATE:1605
%PSTATE:16 ( IE:1 PRIV:1 PEF:1 )
DSFSR:4280804b ( FV:1 OW:1 PR:1 E:1 TM:1 ASI:80 NC:1 BERR:1 )
DSFAR:fda6f000 DSFPAR:401020827000 D-TAG:6365206f66206000

Power off and power on the chassis, then retry the operation.
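
Several workarounds in TABLE 3-6 add a set line to /etc/system. Before appending one, it can be useful to check whether the file already sets that tunable. The following is a minimal sketch. The function name has_setting is hypothetical, and the sketch assumes a POSIX-conforming grep (the -E and -q options) and a tunable name that contains no regular-expression metacharacters.

```shell
# has_setting: succeed if file $1 already contains a "set" line for tunable $2,
# for example drmach:drmach_disable_mcopy (illustrative helper).
# Assumes a POSIX grep and a tunable name with no regex metacharacters.
has_setting() {
  grep -Eq "^[[:space:]]*set[[:space:]]+$2[[:space:]]*=" "$1"
}
```

For example, has_setting /etc/system heaplp_use_stlb || echo "not set" reports a missing entry without modifying the file. The check looks only at the tunable name, so an existing line with a different value still counts as present and should be reviewed by hand.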


Issues Fixed in Oracle Solaris 10 5/08

TABLE 3-7 lists issues that have been fixed in the Oracle Solaris 10 5/08 OS. You might encounter them in earlier releases.


TABLE 3-7 Oracle Solaris OS Issues and Workarounds Fixed in Oracle Solaris 10 5/08

CR ID

M3000

M4000/ M5000

M8000/ M9000

Description

Workaround

CR 5076574

 

 

o

A PCIe error can lead to an invalid fault diagnosis on a large M8000/M9000 domain.

Create a file /etc/fm/fmd/fmd.conf containing the following lines:

setprop client.buflim 40m
setprop client.memlim 40m

CR 6348554

 

o

o

Using the cfgadm -c disconnect command on the following cards might hang the command:

  • SG-XPCIE2FC-QF4 - Sun StorageTek Enterprise Class 4-Gigabit Dual-Port Fiber Channel PCI-E HBA
  • SG-XPCIE1FC-QF4 - Sun StorageTek Enterprise Class 4-Gigabit Single-Port Fiber Channel PCI-E HBA
  • SG-XPCI2FC-QF4 - Sun StorageTek Enterprise Class 4-Gigabit Dual-Port Fiber Channel PCI-X HBA
  • SG-XPCI1FC-QF4 - Sun StorageTek Enterprise Class 4-Gigabit Single-Port Fiber Channel PCI-X HBA

Do not perform cfgadm -c disconnect operation on the affected cards.

CR 6402328

 

 

o

If more than six IOUA (Base I/O Card) cards are used in a single domain, a panic might occur under high I/O stress.

Limit the maximum number of IOUAs in a single domain to 6.

 

CR 6472153

 

o

o

If you create an Oracle Solaris Flash archive on a sun4u server other than an M4000/M5000/M8000/M9000 server, then install it on one of these servers, the console’s TTY flags will not be set correctly. This can cause the console to lose characters during stress.

Just after installing the Oracle Solaris OS from an Oracle Solaris Flash archive, telnet into the M4000/M5000/M8000/M9000 server to reset the console’s TTY flags as follows:

# sttydefs -r console
# sttydefs -a console -i "9600 hupcl opost onlcr crtscts" -f "9600"

This procedure is required only once.

CR 6505921

 

 

o

Correctable error on the system PCIe bus controller generates an invalid fault.

Create a file /etc/fm/fmd/fmd.conf containing the following lines:

setprop client.buflim 40m
setprop client.memlim 40m

CR 6522433

 

o

o

The incorrect motherboard might be identified by fmdump for CPU faults after reboot.

Check system status on XSCF.

CR 6527811

 

o

o

The showhardconf(8) command on the XSCF cannot display information about PCI cards installed in the External I/O Expansion Unit if the External I/O Expansion Unit is configured using PCI hot-plug.

There is no workaround. When each PCI card in the External I/O Expansion Unit is configured using PCI hot-plug, the PCI card information is displayed correctly.

CR 6536564

 

o

o

The showlogs(8) and showstatus(8) commands might report the wrong I/O component.

To avoid this problem, issue the following commands on the domain.

# cd /usr/platform/SUNW,SPARC-Enterprise/lib/fm/topo/plugins
# mv ioboard.so ioboard.so.orig
# svcadm restart fmd
 

Contact a service engineer if the following messages are displayed:

SUNW-MSG-ID: SUNOS-8000-1L, TYPE: Defect, VER: 1, SEVERITY: Minor, EVENT-TIME: Sun May 6 18:22:24 PDT 2007 PLATFORM: SUNW,SPARC-Enterprise, CSN: BE80601007, HOSTNAME: sparc

CR 6545143

 

o

o

There is a low probability that a system panic can occur during trap processing of a TLB miss for a user stack address. The problem can occur if the user stack is unmapped concurrently with the user process executing a flush windows trap (ta 3). The panic message will contain the following string:

bad kernel MMU trap at TL 2

There is no workaround.

CR 6545685

 

o

o

If the system has detected correctable memory errors (CE) at power-on self-test (POST), the domains might incorrectly degrade 4 or 8 DIMMs.

Increase the memory patrol timeout values used via the following setting in /etc/system and reboot the system:

set mc-opl:mc_max_rewrite_loop = 20000

CR 6546188

 

o

o

The system panics when running hot-plug (cfgadm) and DR operations (addboard and deleteboard) on the following cards:

  • X4447A-Z, PCI-e Quad-port Gigabit Ethernet Adapter UTP
  • X1027A-Z1, PCI-e Dual 10 Gigabit Ethernet Fiber XFP Low profile Adapter

There is no workaround.

CR 6551356

 

o

o

The system panics when running hot-plug (cfgadm) to configure a previously unconfigured card. The message “WARNING: PCI Expansion ROM is not accessible” will be seen on the console shortly before the system panic. The following cards are affected by this defect:

  • X4447A-Z, PCI-e Quad-port Gigabit Ethernet Adapter UTP
  • X1027A-Z1, PCI-e Dual 10 Gigabit Ethernet Fiber XFP Low profile Adapter

Use cfgadm -c disconnect to completely remove the card. After waiting at least 10 seconds, the card might be configured back into the domain using the cfgadm -c configure command.

CR 6559504

 

o

o

Messages of the form nxge: NOTICE: nxge_ipp_eccue_valid_check: rd_ptr = nnn wr_ptr = nnn will be observed on the console with the following cards:

  • X4447A-Z, PCI-e Quad-port Gigabit Ethernet Adapter UTP
  • X1027A-Z1, PCI-e Dual 10 Gigabit Ethernet Fiber XFP Low profile Adapter

These messages can be safely ignored.

CR 6563785

 

o

o

Hot-plug operation with the following cards might fail if a card is disconnected and then immediately reconnected:

  • SG-XPCIE2SCSIU320Z - Sun StorageTek PCI-E Dual-Port Ultra320 SCSI HBA
  • SGXPCI2SCSILM320-Z - Sun StorageTek PCI Dual-Port Ultra 320 SCSI HBA

After disconnecting a card, wait for a few seconds before re-connecting.

CR 6564934

 

o

o

Performing a DR deleteboard operation on a board which includes Permanent Memory when using the following network cards results in broken connections:

  • X4447A-Z, PCI-e Quad-port Gigabit Ethernet Adapter UTP
  • X1027A-Z1, PCI-e Dual 10 Gigabit Ethernet Fiber XFP Low profile Adapter

Reconfigure the affected network interfaces after the completion of the DR operation. For basic network configuration procedures, refer to the ifconfig man page.

CR 6568417

 

o

o

After a successful CPU DR deleteboard operation, the system panics when the following network interfaces are in use:

  • X4447A-Z, PCI-e Quad-port Gigabit Ethernet Adapter UTP
  • X1027A-Z1, PCI-e Dual 10-Gigabit Ethernet Fiber XFP Low profile Adapter

Add the following line to /etc/system and reboot the system:

set ip:ip_soft_rings_cnt=0 

CR 6571370

 

o

o

Use of the following cards has been observed to cause data corruption in stress tests under laboratory conditions:

  • X4447A-Z, PCI-e Quad-port Gigabit Ethernet Adapter UTP
  • X1027A-Z1, PCI-e Dual 10-Gigabit Ethernet Fiber XFP Low profile Adapter

Add the following line in /etc/system and reboot the system:

set nxge:nxge_rx_threshold_hi=0 

CR 6584984

 

 

o

The busstat(1M) command with -w option might cause M8000/M9000 server domains to reboot.

There is no workaround. Do not use busstat(1M) command with -w option on pcmu_p.

CR 6589546

 

o

o

prtdiag does not show all IO devices of the following cards:

  • SG-XPCIE2FC-EM4 Sun StorageTek Enterprise Class 4-Gigabit Dual-Port Fiber Channel PCI-E HBA
  • SG-XPCIE1FC-EM4 Sun StorageTek Enterprise Class 4-Gigabit Single-Port Fiber Channel PCI-E HBA

Use prtdiag -v for full output.

CR 6663570

 

o

o

DR operations involving the lowest numbered CPU might cause the domain to panic.

Do not use DR to remove the system board that hosts the CPU with the lowest CPU ID. Use the Oracle Solaris prtdiag command to identify the CPU with the lowest CPU ID.


Issues Fixed in Oracle Solaris 10 8/07

TABLE 3-8 lists issues that have been fixed in the Oracle Solaris 10 8/07 OS. You might encounter them in earlier releases.


TABLE 3-8 Oracle Solaris OS Issues and Workarounds Fixed in Oracle Solaris 10 8/07

CR ID

M3000

M4000/ M5000

M8000/ M9000

Description

Workaround

CR 6303418

 

 

o

M9000 server with a single domain and 11 or more fully populated system boards might hang under heavy stress.

Do not exceed 170 CPU threads.

Limit the number of CPU threads to one per CPU core by using the Oracle Solaris psradm command to disable the excess CPU threads. For example, disable all odd-numbered CPU threads.

CR 6416224

 

o

o

System performance can degrade using a single NIC card with more than 5,000 connections.

Use multiple NIC cards to split network connections.

CR 6441349

 

o

o

I/O error can hang the system.

There is no workaround.

CR 6485555

 

o

o

On-board Gigabit Ethernet NVRAM corruption could occur due to a race condition. The window of opportunity for this race condition is very small.

There is no workaround.

CR 6496337

 

o

o

The “cpumem-diagnosis” module might fail to load after uncorrectable error (UE) panic. Systems will function correctly but events normally automatically diagnosed by FMA using this module will require manual diagnosis.

Example:

SUNW-MSG-ID: FMD-8000-2K, TYPE: Defect, VER: 1, SEVERITY: Minor EVENT-TIME: Thu Feb 15 15:46:57 JST 2007 PLATFORM: SUNW,SPARC-Enterprise, CSN: BE80601007, HOSTNAME: col2-ffem7-d0

If the problem has already occurred:

  1. Remove the cpumem-diagnosis file:
# rm /var/fm/fmd/ckpt/cpumem-diagnosis/cpumem-diagnosis
  2. Restart the fmd service:
# svcadm restart fmd

 

To avoid this problem in advance, add the following line to the file /lib/svc/method/svc-dumpadm:

# savedev=none
rm -f /var/fm/fmd/ckpt/cpumem-diagnosis/cpumem-diagnosis
#

CR 6495303

 

o

o

The use of a PCIe Dual-Port Ultra320 SCSI controller card (SG-(X)PCIE2SCSIU320Z) in IOU Slot 1 on a SPARC Enterprise M4000/M5000 server might result in a system panic.

Do not use this card in IOU Slot 1.

CR 6498283

 

o

o

Using the DR deleteboard command while psradm operations are running on a domain might cause a system panic.

There is no workaround.

CR 6499304

 

o

o

Unexpected message is displayed on console and CPU isn’t offlined when numerous correctable errors (CEs) occur.

Example:

SUNW-MSG-ID: FMD-8000-11, TYPE: Defect, VER: 1, SEVERITY: Minor EVENT-TIME: Fri Feb 2 18:31:07 JST 2007, PLATFORM: SPARC-Enterprise, CSN: BE80601035, HOSTNAME: FF2-35-0

Check CPU status on XSCF.

CR 6502204

 

o

o

Unexpected error messages might be displayed on console on booting after CPU UE panic.

Example:

SUNW-MSG-ID: FMD-8000-11, TYPE: Defect, VER: 1, SEVERITY: Minor EVENT-TIME: Tue Jan 9 20:45:08 JST 2007 PLATFORM: SUNW,SPARC-Enterprise, CSN: 2030636002, HOSTNAME: P2-DC1-16-d0

If you see unexpected messages, use the showdomainstatus(8) command to check system status on XSCF.

CR 650275

 

o

o

Inserted or removed hot-plugged PCI card might not output notification message.

There is no workaround.

CR 6508432

 

o

o

A large number of spurious PCIe correctable errors can be recorded in the FMA error log.

 

To mask these errors, add the following entry to /etc/system and reboot the system:

set pcie:pcie_aer_ce_mask = 0x2001

CR 6508434

 

o

 

The domain might panic when an additional PCI-X card is installed or a PCI-X card is replaced using PCI hot-plug.

Do not insert a different type of PCI-X card on the same PCI slot by using PCI hot-plug.

CR 6510861

 

o

o

When using the PCIe Dual-Port Ultra320 SCSI controller card (SG-(X)PCIE2SCSIU320Z), a PCIe correctable error causes an Oracle Solaris panic.

Add the following entry to /etc/system to prevent the problem:

set pcie:pcie_aer_ce_mask = 0x31c1

CR 6520990

 

o

o

When a domain reboots, SCF might not be able to service other domains that share the same physical board. DR operation can exceed the default timeout period and panic can occur.

Increase the DR timeout period by setting the following statement in /etc/system and reboot your system:

set drmach:fmem_timeout = 30

CR 6527781

 

 

o

The cfgadm command fails while moving the DVD/DAT drive between two domains.

There is no workaround. To reconfigure the DVD/tape drive, execute reboot -r from the domain exhibiting the problem.

CR 6530178

 

o

o

DR addboard command can hang. Once the problem is observed, further DR operations are blocked. Recovery requires reboot of the domain.

There is no workaround.

CR 6530288

 

o

o

cfgadm(1M) command might not correctly show Ap_Id format.

There is no workaround.

CR 6534471

 

o

o

Systems might panic/trap during normal operation.

If a patch is not available, disable the kernel large page sTLB programming. In the file /etc/system, change the heaplp_use_stlb variable to 0:

set heaplp_use_stlb=0

CR 6535564

 

o

o

PCI hot-plug to PCI slot #0, #1 or External I/O Expansion Unit might fail on XSB added by DR.

Use DR instead of PCI hot-plug if you need to add or remove a PCI card on the XSB.

CR 6539084

 

o

o

There is a low probability of a domain panic during reboot when the Sun Quad GbE UTP x8 PCIe (X4447A-Z) card is present in a domain.

There is no workaround.

CR 6539909

 

o

o

Do not use the following I/O cards for network access when you are using the boot net install command to install the Oracle Solaris OS:

  • X4447A-Z/X4447A-Z, PCIe Quad-port Gigabit Ethernet Adapter UTP
  • X1027A-Z/X1027A-Z, PCIe Dual 10-Gigabit Ethernet Fiber XFP

Use an alternative type of network card or onboard network device to install the Oracle Solaris OS via the network.

 

CR 6542632

 

o

o

Memory leak in PCIe module if driver attach fails.

There is no workaround.
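
The workaround for CR 6303418 in TABLE 3-8 suggests disabling all odd-numbered CPU threads with the psradm command. The following is a minimal sketch for building that list of IDs from psrinfo-style output, in which the processor ID is the first column. The function name odd_cpu_ids is hypothetical, and the sample input is illustrative rather than captured from a server.

```shell
# odd_cpu_ids: read psrinfo-style output on stdin (processor ID in the first
# column) and print the odd-numbered IDs on one line (illustrative helper).
odd_cpu_ids() {
  awk '$1 ~ /^[0-9]+$/ && $1 % 2 == 1 { printf "%s%s", sep, $1; sep = " " } END { print "" }'
}

# Illustrative input: four processors, IDs 0 through 3.
printf '0 on-line since 04/04/2008\n1 on-line since 04/04/2008\n2 on-line since 04/04/2008\n3 on-line since 04/04/2008\n' | odd_cpu_ids
```

On a domain, the resulting list could be passed to psradm -f, for example psradm -f $(psrinfo | odd_cpu_ids). Review the list first, because the command takes the listed processors offline.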



Software Documentation Updates

This section contains late-breaking information that became known after the documentation set was published or was very recently added.



Note - Online man pages generally are updated more frequently than the SPARC Enterprise M3000/M4000/M5000/M8000/M9000 Servers XSCF Reference Manual. In case of a conflict, check the Last Modified date at the bottom of the man page.



TABLE 3-9 Changes to Man Pages

Man Page

Change

addcodactivation(8), setcod(8), showcod(8), and showcodusage(8)

These man pages still describe the COD headroom feature, which is no longer supported. See Capacity on Demand (COD).

showenvironment(8)

The XCP 1100 firmware release introduced support of the showenvironment air command on M4000/M5000 servers. You can now use it on any M-Series server.

setpasswordpolicy(8)

A more complete description of the -r remember option is:
Sets the number of passwords remembered in password history. Valid values are integers of 1 - 10. The initial setting is 3. A zero value is not supported and prevents further password modifications by the user.
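
Because a zero value prevents further password changes, a script that sets the -r option might validate the value first. The following is a minimal sketch in POSIX shell. The function name remember_ok is hypothetical and is not an XSCF command.

```shell
# remember_ok: succeed only for an integer from 1 to 10, the range given for
# the -r option (hypothetical pre-check, not an XSCF command).
remember_ok() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 10 ]
}

# Check a few candidate values.
for value in 0 3 10 11; do
  if remember_ok "$value"; then
    echo "$value: valid"
  else
    echo "$value: not valid"
  fi
done
```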


 


TABLE 3-10 Changes to Manuals

Document Title

Change

SPARC Enterprise M4000/M5000/M8000/M9000 Servers Capacity on Demand (COD) User’s Guide

This document does not yet include instructions about setting headroom to zero before upgrading to XCP 1101 firmware. See Capacity on Demand (COD).