Oracle Solaris OS Issues

This section describes issues related to the Oracle Solaris OS.

The `cfgadm -al` Command Takes a Long Time to Print Output (Bug ID 15631390, Bug ID 15723609)

Note - This issue was originally listed as CR 6937169.

Note - This issue was fixed in Oracle Solaris 11.

The cfgadm(1M) command for configuring or unconfiguring hot-plug devices takes a long time to complete. For example, the cfgadm -al command could take more than five minutes before it lists the attachment points for all the hot-plug devices.

Workaround: Use the hotplug(1M) command to manage PCIe hot-plug devices.

Note - The workaround using the hotplug command instead of cfgadm -al only works for PCIe devices.

Use the hotplug list -l command to list the status of all hot-plug PCIe slots. For example:

# hotplug list -l | grep PCI-EM
/pci@400/pci@2/pci@0/pci@1 [PCI-EM0] (EMPTY)
/pci@400/pci@1/pci@0/pci@4 [PCI-EM2] (EMPTY)
/pci@400/pci@2/pci@0/pci@2 [PCI-EM1] (EMPTY)
/pci@400/pci@2/pci@0/pci@3 [PCI-EM3] (ENABLED)
/pci@500/pci@1/pci@0/pci@1 [PCI-EM8] (EMPTY)
/pci@500/pci@1/pci@0/pci@2 [PCI-EM10] (ENABLED)
/pci@500/pci@2/pci@0/pci@2 [PCI-EM9] (ENABLED)
/pci@500/pci@2/pci@0/pci@3 [PCI-EM11] (EMPTY)
/pci@600/pci@1/pci@0/pci@4 [PCI-EM4] (EMPTY)
/pci@600/pci@1/pci@0/pci@5 [PCI-EM6] (ENABLED)
/pci@600/pci@2/pci@0/pci@0 [PCI-EM7] (EMPTY)
/pci@600/pci@2/pci@0/pci@5 [PCI-EM5] (EMPTY)
/pci@700/pci@1/pci@0/pci@4 [PCI-EM14] (EMPTY)
/pci@700/pci@2/pci@0/pci@3 [PCI-EM12] (ENABLED)
/pci@700/pci@2/pci@0/pci@4 [PCI-EM13] (EMPTY)
/pci@700/pci@2/pci@0/pci@5 [PCI-EM15] (EMPTY)
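
When the listing is long, it can help to reduce it to the slots that currently hold an active card. The following is a minimal sketch, assuming the three-field output layout shown above (device path, slot name in brackets, state in parentheses). The function name enabled_slots is hypothetical and is not part of the hotplug utility.

```shell
# Hypothetical helper: print the names of all slots in the ENABLED state.
# Reads `hotplug list -l` style lines on standard input, assuming the
# layout shown above: /device/path [SLOT-NAME] (STATE)
enabled_slots() {
    sed -n 's/^.*\[\(.*\)] (ENABLED) *$/\1/p'
}

# Example on a live system (as root):
#   hotplug list -l | enabled_slots
```

With the listing above as input, the helper would print one slot name per line for each (ENABLED) entry and nothing for (EMPTY) slots.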

Use the hotplug disable command to disable a PCIe card.

For example, to disable the EM card in PCI-EM3 and confirm that it is no longer enabled:
```
# hotplug disable /pci@400/pci@2/pci@0/pci@3 PCI-EM3
# hotplug list -l | grep PCI-EM3
/pci@400/pci@2/pci@0/pci@3 [PCI-EM3] (POWERED)
```
You may now physically remove the EM card.

Use the hotplug list command to verify that a card is removed.

For example:

# hotplug list -l | grep PCI-EM
...
/pci@400/pci@2/pci@0/pci@3 [PCI-EM3] (EMPTY)
...

Use the hotplug poweron command to power on a PCIe card.

For example, to power on the EM card in PCI-EM3 and confirm that it has moved to the POWERED state:
```
# hotplug poweron /pci@400/pci@2/pci@0/pci@3 PCI-EM3
# hotplug list -l | grep PCI-EM3
/pci@400/pci@2/pci@0/pci@3 [PCI-EM3] (POWERED) 
```
Use the hotplug enable command to enable a PCIe card.

For example, to enable the EM card in PCI-EM3 and confirm that it has moved to the ENABLED state:
```
# hotplug enable /pci@400/pci@2/pci@0/pci@3 PCI-EM3
# hotplug list -l | grep PCI-EM3
/pci@400/pci@2/pci@0/pci@3 [PCI-EM3] (ENABLED)
```

Note - For more information about the hotplug command, see the hotplug(1M) man page.
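
Each step in the procedure above is confirmed by reading the state of a single slot from the listing. The following is a minimal sketch of that check, assuming the same output layout as in the examples. The function name slot_state is hypothetical.

```shell
# Hypothetical helper: print the state (for example EMPTY, POWERED, or
# ENABLED) of one slot.
#   $1 = slot name, for example PCI-EM3
# Reads `hotplug list -l` style lines on standard input.
slot_state() {
    sed -n "s/^.*\[$1] (\([A-Z]*\)) *\$/\1/p"
}

# Example on a live system (as root):
#   hotplug list -l | slot_state PCI-EM3
```

Because the pattern matches the closing bracket immediately after the name, a query for PCI-EM1 does not also match PCI-EM10 through PCI-EM15.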

Spurious Interrupt Message in System Console (Bug ID 15651697, Bug ID 15771956, Bug ID 15771958)

Note - This issue was originally listed as CR 6963563.

Note - This issue was fixed in System Firmware 8.2.0.a.

During the normal operation of the server, and when running the Oracle VTS system exerciser, you might see the following message in the system console:

date time hostname px: [ID 781074 kern.warning] WARNING: px0: spurious interrupt from ino 0x4
date time hostname px: [ID 548919 kern.info] ehci-0#0
date time hostname px: [ID 100033 kern.info]

Workaround: You can safely ignore this message.

Spurious Error Message During Initial Oracle Solaris OS Installation (Bug ID 15658412)

Note - This issue was originally listed as CR 6971896.

The miniroot is a bootable root file system that includes the minimum Oracle Solaris OS software required to boot the server and configure the OS. The miniroot runs only during the installation process.

When the server boots the miniroot for the initial configuration, you might see the following messages in the system console:

Fatal server error:
InitOutput: Error loading module for /dev/fb
 
giving up.
/usr/openwin/bin/xinit: Network is unreachable (errno 128):
unable to connect to X server
/usr/openwin/bin/xinit: No such process (errno 3): Server error.

The messages indicate that the Xsun server in the Oracle Solaris OS miniroot cannot find a supported driver for the AST graphics device in the service processor. These messages are legitimate, as the miniroot contains only the Xsun environment, and the AST framebuffer (astfb) is supported only in the Xorg environment. The Xorg environment is included in the installed system, so the graphics device might be used when running the installed Oracle Solaris OS.

Workaround: You can safely ignore these messages.

When `diag-switch?` Is Set to `true`, Oracle Solaris OS Fails to Update EEPROM for Automatic Rebooting (Bug ID 15666767)

Note - This issue was originally listed as CR 6982060.

When the Oracle Solaris OS is installed to a device while the OBP diag-switch? parameter is set to true, the Oracle Solaris OS installer fails to update the boot-device parameter with the path of the device where the OS was installed. As a result, the new device path is not used during subsequent automatic system reboots.

Under these conditions, the server will display the following error message, and you will not be able to reboot from the device:

Installing boot information
       - Installing boot blocks (cxtxdxsx)
       - Installing boot blocks (/dev/rdsk/cxtxdxsx)
       - Updating system firmware for automatic rebooting
WARNING: Could not update system for automatic rebooting

On previous systems, the OBP diag-device parameter was used to set the new device path to the boot device when the diag-switch? parameter was set to true. On SPARC T4 systems, the diag-device parameter is no longer supported, and the Oracle Solaris OS installer warns that setting the OBP boot-device parameter is not possible.

Workaround: From the ILOM prompt, set the OBP diag-switch? parameter to false:

-> set /HOST/bootmode script="setenv diag-switch? false"

Alternatively, you can set this parameter at the OBP ok prompt:

ok setenv diag-switch? false
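
If an Oracle Solaris instance is already running on the server, the parameter can also be inspected before starting an installation. The following is a minimal sketch, assuming that the eeprom utility reports the parameter in its usual name=value form. The function name diag_switch_is_false is hypothetical.

```shell
# Hypothetical helper: succeed only if diag-switch? is reported as false.
# Reads the output of `eeprom 'diag-switch?'` on standard input,
# assuming the name=value form, for example: diag-switch?=false
diag_switch_is_false() {
    grep '^diag-switch?=false$' >/dev/null
}

# Example on a live system (as root):
#   eeprom 'diag-switch?' | diag_switch_is_false || echo "diag-switch? is not false"
```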

Memory Allocation Issues With Emulex 8Gb HBAs in a Magma I/O Expansion Box (Bug ID 15666779)

Note - This issue was originally listed as CR 6982072.

Memory allocation errors might occur when four or more Emulex 8Gb FC PCI-Express HBA cards are used in a Magma I/O expansion box connected to an Oracle SPARC T4 series server. The following is an example of the types of messages that might be logged in /var/adm/messages with this configuration:

date time hostname emlxs: [ID 349649 kern.info] [ 8.019A]emlxs22:  ERROR: 301: Memory alloc failed. (BPL Pool buffer[1760]. size=1024)
date time hostname emlxs: [ID 349649 kern.info] [ 8.019A]emlxs20:  ERROR: 301: Memory alloc failed. (BPL Pool buffer[2765]. size=1024)
date time hostname emlxs: [ID 349649 kern.info] [ 8.019A]emlxs24:  ERROR: 301: Memory alloc failed. (BPL Pool buffer[3437]. size=1024)
date time hostname emlxs: [ID 349649 kern.info] [13.0363]emlxs22:  ERROR: 201: Adapter initialization failed. (Unable to allocate memory buffers.)
date time hostname emlxs: [ID 349649 kern.info] [ 5.064D]emlxs22:  ERROR: 201: Adapter initialization failed. (status=c)
date time hostname emlxs: [ID 349649 kern.info] [ B.1949]emlxs22:  ERROR: 101: Driver attach failed. (Unable to initialize adapter.)
date time hostname emlxs: [ID 349649 kern.info] [13.0363]emlxs20:  ERROR: 201: Adapter initialization failed. (Unable to allocate memory buffers.)
date time hostname emlxs: [ID 349649 kern.info] [ 5.064D]emlxs20:  ERROR: 201: Adapter initialization failed. (status=c)
date time hostname emlxs: [ID 349649 kern.info] [ B.1949]emlxs24:  ERROR: 101: Driver attach failed. (Unable to initialize adapter.)
date time hostname emlxs: [ID 349649 kern.info] [13.0363]emlxs24:  ERROR: 201: Adapter initialization failed. (Unable to allocate memory buffers.)
date time hostname emlxs: [ID 349649 kern.info] [ 5.064D]emlxs24:  ERROR: 201: Adapter initialization failed. (status=c)
date time hostname emlxs: [ID 349649 kern.info] [ B.1949]emlxs24:  ERROR: 101: Driver attach failed. (Unable to initialize adapter.)

Workaround: Add the following line in the /kernel/drv/emlxs.conf file:

num-iotags=1024;

Reboot the server for the changes to take effect.
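
When this change is applied by a script, it is worth making sure the line is not added twice. The following is a minimal sketch, assuming a POSIX grep (on Oracle Solaris 10, /usr/xpg4/bin/grep) and that the file has been backed up first. The function name add_conf_line is hypothetical.

```shell
# Hypothetical helper: append a setting to a driver .conf file unless an
# identical line is already present.
#   $1 = path to the .conf file
#   $2 = exact line to add
add_conf_line() {
    if grep -F -x -e "$2" "$1" >/dev/null 2>&1; then
        return 0    # already present, leave the file untouched
    fi
    printf '%s\n' "$2" >> "$1"
}

# Example (as root):
#   add_conf_line /kernel/drv/emlxs.conf 'num-iotags=1024;'
```

The helper compares whole lines only, so a commented-out or differently valued num-iotags entry is not treated as a match and should be reviewed by hand.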

The Fault Management Suite Sometimes Sends Resolved Cases to the SP (Bug ID 15667874, Bug ID 15741999)

Note - This issue was originally listed as CR 6983432.

Note - This issue is fixed in Patch 147790-01: SunOS 5.10: fmd patch, and in Oracle Solaris 11.

This defect causes previously diagnosed and repaired PSH faults from the server to reappear in Oracle ILOM when the host reboots. It manifests itself as an incorrect report of a PSH-diagnosed fault represented through the Oracle ILOM CLI, BUI, and fault LED.

Tip - You can identify this defect by checking to see if the same PSH fault was reported from the server as well. If it was reported only by Oracle ILOM and not from the server, it is probably an example of this defect.

Recovery Action: Use the Oracle ILOM diagnostic and repair tools to identify the error condition and correct it. The following example illustrates how to diagnose and repair a PSH fault diagnosed by the server. This example is based on the Oracle ILOM fault management shell. You could instead use the Oracle ILOM CLI or BUI to accomplish the same results.

Display the fault information.

faultmgmtsp> fmadm faulty
------------------- ------------------------------------ -------------- -------
Time                UUID                                 msgid          Severity
------------------- ------------------------------------ -------------- -------
2011-09-16/15:38:19 af875d87-433e-6bf7-cb53-c3d665e8cd09 SUN4V-8002-6E  Major
 
Fault class : fault.cpu.generic-sparc.strand
 
FRU         : /SYS/MB
              (Part Number: 7015272)
              (Serial Number: 465769T+1130Y6004M)
 
Description : A fault has been diagnosed by the Host Operating System.
 
Response    : The service required LED on the chassis and on the affected
              FRU may be illuminated.
 
Impact      : No SP impact.  Check the Host OS for more information.
 
Action      : The administrator should review the fault on the Host OS.
              Please refer to the Details section of the Knowledge Article
              for additional information.

Check for faults on the server.

# fmadm faulty
#                       <-- Server displays no faults

Verify that the fault shown by Oracle ILOM was repaired on the server.

# fmdump
TIME                 UUID                                 SUNW-MSG-ID
Sep 16 08:38:19.5582 af875d87-433e-6bf7-cb53-c3d665e8cd09 SUN4V-8002-6E
Sep 16 08:40:47.8191 af875d87-433e-6bf7-cb53-c3d665e8cd09 FMD-8000-4M Repaired
Sep 16 08:40:47.8446 af875d87-433e-6bf7-cb53-c3d665e8cd09 FMD-8000-6U Resolved
#

Flush the previously faulty component from the server resource cache.

# fmadm flush /SYS/MB
fmadm: flushed resource history for /SYS/MB
#

Repair the fault in Oracle ILOM.

faultmgmtsp> fmadm repair /SYS/MB
faultmgmtsp> fmadm faulty
No faults found
faultmgmtsp>
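
The key check in this procedure is whether the UUID reported by Oracle ILOM already has a Resolved entry in the fmdump output on the server. The following is a minimal sketch of that check, assuming the fmdump line layout shown above. The function name uuid_resolved is hypothetical.

```shell
# Hypothetical helper: succeed if the given fault UUID has a "Resolved"
# entry in the fmdump output read from standard input.
#   $1 = fault UUID as shown by Oracle ILOM
uuid_resolved() {
    grep -- "$1" | grep 'Resolved *$' >/dev/null
}

# Example on the server (as root):
#   fmdump | uuid_resolved af875d87-433e-6bf7-cb53-c3d665e8cd09 && echo "resolved on host"
```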

Gigabit Ethernet (`nxge`) Driver Not Loading on Systems With Oracle Solaris 10 10/09 OS and Solaris 10 9/10 Patch Bundle (Bug ID 15677751)

Note - This issue was originally listed as CR 6995458.

A problem in the Oracle Solaris 10 10/09 package installation process prevents the nxge alias definition for the SPARC T4 series servers from being entered in the /etc/driver_aliases file. If this alias is not properly defined, the nxge driver cannot be attached.

Workaround: To correct this problem, perform the steps described below.

Note - You must be logged in as root to edit the driver_aliases file.

Add the following line to the /etc/driver_aliases file:
```
nxge "SUNW,niusl-kt"
```
Reboot the server.
Configure the network interfaces.
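
Before editing the file, it can be useful to confirm whether the alias is actually missing. The following is a minimal sketch, assuming the alias line has the form shown above. The function name has_nxge_alias is hypothetical.

```shell
# Hypothetical helper: succeed if the alias file already maps the nxge
# driver to SUNW,niusl-kt.
#   $1 = path to the alias file (normally /etc/driver_aliases)
has_nxge_alias() {
    grep '^nxge[[:space:]][[:space:]]*"SUNW,niusl-kt"' "$1" >/dev/null 2>&1
}

# Example:
#   has_nxge_alias /etc/driver_aliases || echo "nxge alias is missing"
```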

`nxge` Driver Warning Messages Displayed After Reboot (Bug ID 15710067, Bug ID 15777789, Bug ID 15777790)

Note - This issue was originally listed as CR 7037575.

Note - This issue is fixed in Oracle Solaris 11.1.

During reboot, nxge warnings such as the following are displayed in the /var/adm/messages log:

Apr 18 08:35:56 san-t4-4-0-a nxge: [ID 752849 kern.warning] 
 WARNING: nxge3 : nxge_nlp2020_xcvr_init: Unknown type [0x70756f88] detected
Apr 18 08:36:16 san-t4-4-0-a nxge: [ID 752849 kern.warning] 
WARNING: nxge7 : nxge_nlp2020_xcvr_init: Unknown type [0x70756f88] detected

Workaround: These messages can be ignored.

The `trapstat -T` Command Causes Bad Watchdog Resets at TL2 (Bug ID 15720390)

Note - This issue was originally listed as CR 7052070.

In some instances, servers running Oracle Solaris 10 10/09 or Oracle Solaris 10 9/10 might panic when running the trapstat -T command.

Workaround: Add the missing SUNWust1 and SUNWust2 packages from the Oracle Solaris 10 10/09 or Oracle Solaris 10 9/10 media. The Solaris 10 ISO image is available at https://support.oracle.com/epmos/faces/DocumentDisplay?id=1277964.1
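
To see whether a server is affected, you can check which of the two packages are absent. The following is a minimal sketch, assuming the SVR4 pkginfo utility and its -q option, which reports installation status through the exit code only. The function name missing_trapstat_pkgs is hypothetical.

```shell
# Hypothetical helper: print the name of each required package that is
# not installed. Prints nothing when both packages are present.
missing_trapstat_pkgs() {
    for pkg in SUNWust1 SUNWust2; do
        pkginfo -q "$pkg" 2>/dev/null || printf '%s\n' "$pkg"
    done
}

# Example:
#   missing_trapstat_pkgs
```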

Watchdog Timeouts Occur With Heavy Workloads and Maximum Memory Configurations (Bug ID 15737671, Bug ID 15744469, Bug ID 15771943)

Note - This issue was originally listed as CR 7083001.

Note - This issue is fixed in KU 147440-05, and in Oracle Solaris 11.

With certain unusual heavy workloads, especially where a highly processor-intensive workload is bound to CPU 0, the host might appear to suddenly reset back to OBP without any sign of a crash or a panic, with the Oracle ILOM event log containing a “Host watchdog expired” entry. The problem is more prevalent on select systems with full memory configurations.

If you see this sort of sudden reset, display the SP event log using this command from the Oracle ILOM CLI:

-> show /SP/logs/event/list

If you encounter this error, the event list includes an entry labeled “Host watchdog expired.”
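
If the event list has been saved to a text file (for example, by capturing the terminal session), the number of watchdog resets can be counted offline. The following is a minimal sketch that relies only on the entry text quoted above. The function name count_watchdog_resets and the file name in the example are hypothetical.

```shell
# Hypothetical helper: count event log lines that report a watchdog reset.
# Reads saved `show /SP/logs/event/list` output on standard input.
count_watchdog_resets() {
    grep -c 'Host watchdog expired'
}

# Example, with the captured output stored in ilom-events.txt:
#   count_watchdog_resets < ilom-events.txt
```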

Workaround: If you encounter this error, contact your authorized service provider to see if a fix is available.

You can also work around this problem by extending the watchdog period by adding this entry to the Oracle Solaris /etc/system file:

set watchdog_timeout = 60000

This extends the watchdog timeout period to 1 minute (60000 milliseconds).

In extreme cases, you can also disable the watchdog timeout altogether by adding this entry to the /etc/system file:

set watchdog_enabled = 0

A reboot is required for any /etc/system modification to take effect.

If you do not want to reboot the system immediately after editing /etc/system, you can apply an additional temporary workaround that takes effect immediately. As root, type:

# psrset -c -F 0

This command creates a temporary processor set containing only CPU 0, preventing application workloads from using this processor and preventing this issue from occurring.

Note - This command unbinds any threads that were bound to CPU 0.

This temporary processor set will be removed on the next operating system reboot, at which point the /etc/system workaround described above will take effect.

`ereport.fm.fmd.module` Generated During a Reboot of an SDIO Domain (Bug ID 15738845, Bug ID 15742069)

Note - This issue was originally listed as CR 7085231.

Note - This issue is fixed in Oracle Solaris 11.1.

The server module might generate an ereport.fm.fmd.module message during a reboot of an SDIO domain. This ereport indicates that an error occurred on one of the fmd modules, but the fmdump command does not display a valid message (msg).

For example:

# fmdump -eV -c ereport.fm.fmd.module
TIME                           CLASS
Sep 27 2011 06:27:19.954801492 ereport.fm.fmd.module
nvlist version: 0
        version = 0x0
        class = ereport.fm.fmd.module
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = fmd
                authority = (embedded nvlist)
                nvlist version: 0
                        version = 0x0
                        product-id = ORCL,SPARC-T4-1
                        server-id = c193-133
                (end authority)
 
                mod-name = etm
                mod-version = 1.2
        (end detector)
 
        ena = 0x425fc9b065404001
        msg = cannot open write-only transport
        __ttl = 0x1
        __tod = 0x4e81cf37 0x38e91d54

Workaround: You can safely ignore ereport.fm.fmd.module ereports.

Oracle VTS `dtlbtest` Hangs When the CPU Threading Mode Is Set to `max-ipc` (Bug ID 15743740, Bug ID 15744945)

Note - This issue was originally listed as CR 7094158.

The Oracle VTS component stress dtlbtest hangs when max-ipc threading mode is set. This issue is not specific to any processor type, and can happen when both of the following conditions are true:

Only one CPU per core is online.
The total number of online CPUs is less than or equal to 128.

Workaround: Do not run the Oracle VTS Processor test in high stress mode when Oracle VM Server for SPARC is set to max-ipc mode.

Some pciex8086,105f Devices Fail to Attach On Servers Equipped with System Firmware 8.2.0.b (Bug ID 15774699)

Note - This issue was originally listed as CR 7147940.

Note - This issue is fixed in Oracle VTS 7.0, ps13.

In some cases, the server becomes unresponsive after it is upgraded from System Firmware 8.1.0.e or earlier to System Firmware 8.2.1.b or later. Log entries such as the following appear:

e1000g: [ID 801725 kern.warning] WARNING: pciex8086,105f - e1000g[0] : Mapping registers failed

Workaround: Download and install Patch ID 148233-02 before updating the system firmware. This patch is available at http://support.oracle.com.

L2 Cache Uncorrectable Errors Causing a Reboot Abort (Bug ID 15826320)

On rare occasions, when rebooting a server running Oracle Solaris 11, an error similar to the following appears in the system console:

ABORT: ../../../greatlakes/n2/src/err_subr.s, line 0x291: strand_in

In addition, if you perform the fmdump -eV command, the following error appears:

ereport.cpu.generic-sparc.l2data-uc@/host proxied

This error appears on servers running Oracle VM Server for SPARC 2.1.x, which is embedded in all versions of Oracle Solaris 11 up to Oracle Solaris 11 SRU 8. This uncorrectable memory error occurs in the memory scrubbing process during system shutdown, and does not indicate data corruption or memory loss.

Workaround: If you encounter this issue, contact your authorized service provider and upgrade to Oracle VM Server for SPARC 2.2.x.
