CHAPTER 2

Late-Breaking Issues

This chapter describes late-breaking issues and, where available, their workarounds or fixes.


Storms of Events Might Impact Logging of Telemetry Data (CR 6983799)

When a modular system processes a stream (storm) of error events, the service processor might fail to process the error telemetry or log it to the host. This problem can occur when the server module is running system firmware 7.2.10.a or earlier.

Workaround: Upgrade system firmware to 7.3.0 (or later). See Supported Versions of the Oracle Solaris OS, Firmware, and Software.


System May Hang, Panic, Reset, or Power Off While Handling Correctable Events (CR 6983478)

This problem only occurs on multi-socket Sun4v systems that are running system firmware 7.2.10.a or earlier.

Under certain conditions, the server module might hang, panic, or reset while processing fault events. This occurs when the system is handling events that require data from a remote CPU node.

Workaround: Upgrade the system firmware to 7.3.0 (or later). See Supported Versions of the Oracle Solaris OS, Firmware, and Software.

 


Drive OK-to-Remove LED Might Not Work When Using the cfgadm -c unconfigure Command (CR 6946124)

When a SAS2-capable REM is installed in the server module, the cfgadm -c unconfigure command fails to illuminate the drive's OK-to-Remove LED, making it difficult to identify which drive to remove.

Workaround: If you are uncertain about the location of the drive, perform the following procedure.


Manually Locate a Drive

1. Run the format utility and select the device that you need to locate.

Example:


# format 
Searching for disks...done 
AVAILABLE DISK SELECTIONS: 
       0. c0t5000C5000F8AD1FFd0 <SUN300G cyl 46873 alt 2 hd 20 sec 625> 
          /scsi_vhci/disk@g5000c5000f8ad1ff 
       1. c0t5000C5000F8BB997d0 <SUN300G cyl 46873 alt 2 hd 20 sec 625> 
          /scsi_vhci/disk@g5000c5000f8bb997 
       2. c0t5000C50003D3D85Bd0 <SUN72G cyl 14087 alt 2 hd 24 sec 424> 
          /scsi_vhci/disk@g5000c50003d3d85b 
       3. c0t5000C50012EEE447d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848> 
          /scsi_vhci/disk@g5000c50012eee447 
       4. c0t5000C5000258C457d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424> 
          /scsi_vhci/disk@g5000c5000258c457 
       5. c0t5000CCA00A4A924Cd0 <SUN300G cyl 46873 alt 2 hd 20 sec 625> 
          /scsi_vhci/disk@g5000cca00a4a924c 
 
Specify disk (enter its number): 4 
selecting c0t5000C5000258C457d0    <<==

2. Make note of the cNtNdN device name associated with the drive.

For example, in the previous output example, the string to note is
c0t5000C5000258C457d0.

3. Type q to exit the format utility.

4. Find the serial number for the device:

a. Redirect the output of the iostat command to a file.

Example:


# iostat -En > iostat_output

b. In the file, search for the string you noted in Step 2.

You can use an editor and search for the string. In the following example, we are searching for c0t5000C5000258C457d0.


c0t5000C50003D3D85Bd0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 
Vendor: SEAGATE  Product: ST973402SSUN72G  Revision: 0603 Serial No: 0715215EVK 
Size: 73.41GB <73407865856 bytes> 
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
Illegal Request: 0 Predictive Failure Analysis: 0 
c0t5000C5000258C457d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 
Vendor: SEAGATE Product:ST973451SSUN72G Revision: 0302 Serial No:0802V16VTE <<==
Size: 73.41GB <73407865856 bytes> 
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
Illegal Request: 0 Predictive Failure Analysis: 0 
c1t0d0           Soft Errors: 4 Hard Errors: 2 Transport Errors: 0 
Vendor: AMI      Product: Virtual CDROM    Revision: 1.00 Serial No: 
Size: 0.00GB <0 bytes> 
Media Error: 0 Device Not Ready: 0 No Device: 2 Recoverable: 0 
Illegal Request: 4 Predictive Failure Analysis: 0 
c0t5000CCA00A4A924Cd0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 
Vendor: HITACHI  Product: H103030SCSUN300G Revision: A2A8 Serial No: 0950GA0B7E 
Size: 300.00GB <300000000000 bytes> 
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
Illegal Request: 0 Predictive Failure Analysis: 0 
c0t5000C50012EEE447d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 
Vendor: SEAGATE  Product: ST914603SSUN146G Revision: 0768 Serial No: 092180GMM6 
Size: 146.81GB <146810536448 bytes> 
/c0t5000C5000258C457d0 

c. Identify the serial number associated with the string.

In the previous example, 0802V16VTE is the serial number.

5. Change to the directory where you installed the SAS2IRCU utility.

For information on downloading and installing the SAS2IRCU utility, refer to the Sun Storage 6 Gb SAS REM HBA Installation Guide.

6. Find the SAS2 controller number (shown under Index) using the sas2ircu LIST command.

Example:


# ./sas2ircu LIST
LSI Corporation SAS2 IR Configuration Utility.
Version 3.250.02.00 (2009.09.29)
Copyright (c) 2009 LSI Corporation. All rights reserved.
 
 
        Adapter      Vendor  Device                       SubSys  SubSys
Index    Type          ID      ID    Pci Address          Ven ID  Dev ID
-----  ------------  ------  ------  -----------------    ------  ------
  0     SAS2008     1000h    72h   00h:700h:00h:00h      1000h   3180h
SAS2IRCU: Utility Completed Successfully.

7. Redirect the output of the sas2ircu n display command to a file, where n is the controller number from Step 6.

Example:


# ./sas2ircu 0 display > sas2ircu_output

8. In the output file, search for the serial number obtained from Step 4.


# cat sas2ircu_output
 
Device is a Hard disk
 Enclosure #                             : 1              <<==
 Slot #                                  : 1              <<==
 State                                   : Ready (RDY)
 Size (in MB)/(in sectors)               : 70007/143374737
 Manufacturer                            : SEAGATE
 Model Number                            : ST973451SSUN72G
 Firmware Revision                       : 0302
 Serial No                               : 0802V16VTE
 Protocol                                : SAS
 Drive Type                              : SAS_HDD

9. In the output, look for the Enclosure # and Slot # that correspond to this device, then do one of the following:

If the drive is in a server module, the Slot # refers to the slot number on the server module. In the previous example, Slot # 1 corresponds to HDD1 on the front panel of the server module. Locate the drive and do not complete the remaining steps in this procedure.

If the drive is in a storage module, the Slot # refers to the slot number on the storage module. Perform the remaining steps in this procedure.

10. To locate the drive in the storage module, use the sas2ircu LOCATE command.

The locate LED on the drive starts blinking (amber).

Example specifying a drive in enclosure # 6, slot # 7:


# ./sas2ircu 0 LOCATE 6:7 ON

11. After replacing the drive, turn off the locate LED.

Example specifying a drive in enclosure # 6, slot # 7:


# ./sas2ircu 0 LOCATE 6:7 OFF
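
The lookups in Steps 4 through 9 can also be scripted rather than done by hand in an editor. The following is a minimal sketch, not part of the documented procedure. The function names serial_for_device and slot_for_serial are hypothetical, and the parsing assumes that the saved files have the same layout as the iostat -En and sas2ircu display examples shown in this procedure.

```shell
#!/bin/sh
# Sketch only: hypothetical helper functions. They parse saved command
# output and assume the layouts shown in the examples in this procedure.

# Print the serial number from the iostat -En record for one device.
# $1 = cNtNdN device name, $2 = file holding `iostat -En` output
serial_for_device() {
    awk -v dev="$1" '
        $1 == dev { found = 1; next }      # first line of the matching record
        found && /Serial No:/ {
            sub(/.*Serial No: */, "")      # drop everything before the value
            print $1                       # first remaining token is the serial
            exit
        }
    ' "$2"
}

# Print "enclosure:slot" for the drive that has the given serial number.
# $1 = serial number, $2 = file holding `sas2ircu n display` output
slot_for_serial() {
    awk -F ':' -v serial="$1" '
        { n = split($2, parts, " "); val = (n > 0) ? parts[1] : "" }
        /^[ \t]*Enclosure / { encl = val } # remember the most recent enclosure
        /^[ \t]*Slot /      { slot = val } # remember the most recent slot
        /Serial No/ && (val "") == (serial "") { print encl ":" slot; exit }
    ' "$2"
}

# Example use with the files created in Steps 4 and 7:
#   serial=$(serial_for_device c0t5000C5000258C457d0 iostat_output)
#   slot_for_serial "$serial" sas2ircu_output
```

The enclosure:slot pair that slot_for_serial prints has the same form as the argument to the sas2ircu LOCATE command in Steps 10 and 11. Verify the result against the raw output before removing a drive.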


cfgadm Does Not Unconfigure the Path When Multipathing Software Is Enabled (CR 6948701)

The cfgadm -c unconfigure command fails if the specified path is an MPxIO-enabled device.

Workaround: This issue is fixed in the Oracle Solaris 10 9/10 OS and in kernel patch 14909-13 (or later). If you are unable to install the Oracle Solaris 10 9/10 OS or patch 14909-13, perform the following procedure.


Manually Unconfiguring Multipath-Enabled Drives

1. Start the format utility to list the drives and to obtain the device name (such as c0t5000C5000F0E5AFFd0) of the drive you plan to unconfigure.


# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c0t5000C5000F0E5AFFd0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
/scsi_vhci/disk@g5000c5000f0e5aff
1. c0t5000C5000F0FE227d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
 /scsi_vhci/disk@g5000c5000f0fe227

2. To exit the format utility, select one of the drives and type q.


Specify disk (enter its number): 1
selecting c0t5000C5000F0FE227d0

3. Use the mount command to identify whether the device is mounted or if it is a boot drive.

4. Based on your results, do one of the following:

If the drive is not a boot drive, go to Step 5.

If the drive is a boot drive, go to Step 6.

5. Identify the processes running on the drive:

a. Run the fuser command to identify the processes accessing the disk.

b. If you identify a process, use the ps command to further identify the process.

Example:


# ps -ef | grep 1036
root 1036 982 0 11:56:34 pts/2 0:02 dd if=/dev/dsk/c0t5000C5000F0E5AFFd0s2 of=/dev/dsk/c0t5000C5000F0FE227d0s7

c. Kill the processes identified in Step b using kill -9 PID.

d. Use the umount command to unmount any mount points, and then run the sync command to synchronize the disk.

Example:


# umount /mnt
 
# mount | grep c0t5000C5000F0E5AFFd0
 
# sync

e. Remove the disk, and do not continue with subsequent steps in this procedure.

 

6. If the drive is a boot drive, run the following commands to synchronize the drive and shut down the system:


# sync
 
# init 0
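
The mount check in Step 3 can be narrowed to a single device with a small filter. The following is a minimal sketch, not part of the documented procedure. The function name mounts_for_device is hypothetical, and the parsing assumes the Oracle Solaris mount output layout of "mount-point on device options".

```shell
#!/bin/sh
# Sketch only: hypothetical helper. Reads `mount` output on standard input
# and assumes lines of the form "<mount point> on <device> <options> ...".

# Print every mount point whose device field contains the given name.
# Prints nothing if no slice of the device is mounted.
# $1 = cNtNdN device name
mounts_for_device() {
    awk -v dev="$1" '$2 == "on" && index($3, dev) > 0 { print $1 }'
}

# Example use:
#   mount | mounts_for_device c0t5000C5000F0E5AFFd0
```

Empty output suggests that no slice of the device is mounted. If / appears in the output, the device holds the mounted root file system, so treat it as a boot drive.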


False Power Failure Faults Might Be Reported During POST or SunVTS Memory Testing (CR 6895793)

On some Sun Blade T6340 Modular Servers, the following intermittent error message is displayed during POST or SunVTS testing:


Fault   |  critical: "SP detected fault at time Tue Oct 27 18:17:32 2009. Host Power Failure: MB_DC_POK Fault"

Fix: Update the modular server System Firmware to version 7.2.4.f or higher.


Remote Console Does Not Launch When Using Web Interface Connection to CMM (CR 6740614)

This issue occurs on Sun Blade T6340 server modules that are in a modular system chassis with CMM firmware 3.0.3.32.

When you launch the web interface by connecting to the CMM in the chassis where your server module is installed, you can then select the server module within the web interface to connect to it. If you connect this way, however, the ILOM Remote Console does not launch for the Sun Blade T6340 server module.

Workaround: Use the web interface to connect directly to the Sun Blade T6340 server module, not to the CMM.


Memory Configuration Issues at Power On (CR 6730610)

This issue is fixed in System Firmware 7.1.8.a and later versions.

When powering on a Sun Blade T6340 server module, you might encounter the following error messages:


Chassis | major: Jul 27 16:40:17 ERROR: dt_allocprop: prop == NULL:
Not enough memory to expand MD for new property fwd
 
Chassis | major: Jul 27 16:40:17 ERROR: dt_allocnode: Not enough memory to expand MD for new node mblock
 
Chassis | critical: Jul 27 16:41:55 FATAL: The Service Processor software has taken a FATAL configuration error,
 
Chassis | critical: the HOST Process cannot be started.
Chassis | critical: Please examine the logs to determine the reason for failure and then 
Chassis | critical: reset the Service Processor

Workaround: Update the System Firmware version to 7.1.8.a or later.

Other Workarounds: This error occurs when the amount of memory differs greatly between the CMP and MEM modules. For example, it could happen if the memory on CMP0+MEM0 added up to 128 Gbytes, but the memory on CMP1+MEM1 added up to only 16 Gbytes. This imbalance can arise in two different situations, each with its own recovery procedure.

Recovery: Reallocate the FB-DIMMs across the CMP and MEM modules to keep the total number and types of FB-DIMMs the same on each CMP and MEM module.

Recovery: Take one of the following two steps:

If replacing the failed FB-DIMM is not immediately possible or desired, you must disable the corresponding FB-DIMMs on the other CMP/MEM units, as follows:

i. View a list of enabled and disabled devices.

In ILOM: show components

In ALOM compatibility shell: showcomponent

ii. Identify the FB-DIMM devices to be disabled.

For each FB-DIMM device that is disabled, you must disable the corresponding FB-DIMM on each of the other CMP/MEM units. For example, if the following device is disabled:

/SYS/MB/MEM0/CMP0/BR0/CH0/D1

Then you must disable the following devices:

/SYS/MB/MEM1/CMP1/BR0/CH0/D1

/SYS/MB/MEM2/CMP2/BR0/CH0/D1

/SYS/MB/MEM3/CMP3/BR0/CH0/D1

iii. Disable the target FB-DIMM devices.

In ILOM: set /SYS/component component_state=disabled

In ALOM CMT compatibility shell: disablecomponent component
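
The list of corresponding devices in step ii can be generated from the disabled device path. The following is a minimal sketch, not part of the documented procedure. The function name peer_dimms and its second argument (the number of CMP/MEM pairs) are hypothetical, and the path layout is assumed to match the /SYS/MB/MEMn/CMPn/... example above.

```shell
#!/bin/sh
# Sketch only: hypothetical helper. Assumes device paths of the form
# /SYS/MB/MEMn/CMPn/... as in the example above.

# Print the matching FB-DIMM path on every other CMP/MEM pair.
# $1 = path of the disabled device
# $2 = number of CMP/MEM pairs in the system (assumed to be known)
peer_dimms() {
    path=$1
    count=$2
    # Extract the unit number n from the /MEMn/CMPn/ part of the path.
    unit=$(printf '%s\n' "$path" |
        sed -n 's|^.*/MEM\([0-9][0-9]*\)/CMP[0-9][0-9]*/.*$|\1|p')
    [ -n "$unit" ] || return 1             # path does not have the expected form
    i=0
    while [ "$i" -lt "$count" ]; do
        if [ "$i" -ne "$unit" ]; then
            # Rewrite the unit number, keeping the branch, channel, and DIMM.
            printf '%s\n' "$path" |
                sed "s|/MEM[0-9][0-9]*/CMP[0-9][0-9]*/|/MEM$i/CMP$i/|"
        fi
        i=$((i + 1))
    done
}

# Example use:
#   peer_dimms /SYS/MB/MEM0/CMP0/BR0/CH0/D1 4
```

Each printed path is a candidate for the component argument of the disable commands in step iii. Compare the list with the output from step i before disabling anything.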


Procedure for Resetting Root Password to Factory Default Ineffective (CR 6749470)

This issue is fixed in System Firmware 7.1.8.a and later.

The service procedure “To Reset the Root Password to the Factory Default” described in the Sun Blade T6340 Server Module Service Manual does not reset the root password.

Workaround: If possible, update the System Firmware to 7.1.8.a or later.


Command prtdiag -v Might Appear to Hang (CR 6588550)

The prtdiag -v command is slow and could appear to hang. The command might take up to five minutes to complete.

Fix: Update the OS to Oracle Solaris 10 5/09 or install the Oracle Solaris 10 kernel patch 139555-08 (or later).


ALOM Compatibility Shell Command setdate Issue (CR 6586305)

This issue is fixed in System Firmware 7.2.0.

Using the SP setdate command (ALOM compatibility shell) after having configured nondefault logical domains can cause the date on nondefault domains to change.

Workaround: Update the System Firmware to 7.2.0 or later.

Another workaround: Use the setdate command to configure the date on the SP before configuring and saving logical domain configurations.

If you use setdate after nondefault logical domain configurations have been saved, each nondefault domain must be booted to Oracle Solaris and the date corrected. (Refer to the date(1) or ntpdate(1M) man page.)


SunVTS xnetlbtest Might Fail During XAUI Loopback Testing (CR 6603354)

SunVTS xnetlbtest can fail during XAUI loopback testing. Failures occur with this error message:


Excessive packets dropped

Workaround: Do not run SunVTS xnetlbtest on XAUI interfaces.

Fix: Update the OS to Oracle Solaris 10 10/08 or install the Oracle Solaris 10 OS kernel patch 137137-09 (or later).

 
