Solaris 8 2/02 Release Notes Supplement for Sun Hardware

Chapter 6 Sun Midrange Systems Open Issues

This chapter contains the latest information for the Sun Enterprise systems running the Solaris 8 operating environment. These include the Sun Enterprise 6500, 6000, 5500, 5000, 4500, 4000, 3500, and 3000 systems.

The Solaris 8 operating environment includes support for the CPU/memory boards and most I/O boards in the systems mentioned above.

Dynamic Reconfiguration of Sun Enterprise 6x00, 5x00, 4x00, and 3x00 Systems

These release notes provide the latest information on Dynamic Reconfiguration (DR) functionality for Sun Enterprise 6x00, 5x00, 4x00, and 3x00 systems running the Solaris 8 2/02 operating environment from Sun Microsystems. For more information on Sun Enterprise Server Dynamic Reconfiguration, refer to the Dynamic Reconfiguration User's Guide for Sun Enterprise 3x00/4x00/5x00/6x00 Systems.

The Solaris 8 2/02 operating environment includes support for CPU/memory boards and most I/O boards in Sun Enterprise 6x00, 5x00, 4x00, and 3x00 systems.

Supported Hardware

Before proceeding, ensure the system supports dynamic reconfiguration. If you see the following message on your console or in your console logs, the hardware is of an older design and not suitable for dynamic reconfiguration.


Hot Plug not supported in this system
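The check above can be sketched as a small shell helper that searches a saved console log for this message. The function name and the default log path (/var/adm/messages) are assumptions made for illustration; substitute the file where your console output is actually captured.

```shell
# Sketch: look for the "Hot Plug not supported" console message in a log file.
# dr_hardware_supported and the default path are illustrative assumptions.
dr_hardware_supported() {
    log=${1:-/var/adm/messages}
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    if grep 'Hot Plug not supported in this system' "$log" >/dev/null 2>&1; then
        echo "not supported"
        return 1
    fi
    # Absence of the message is only a hint, not proof of DR support.
    echo "no hot-plug warning found"
}
```

For example, dr_hardware_supported /var/adm/messages prints "not supported" when the message is present in that file.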

Supported I/O boards are listed in the “Solaris 8” section on the following Web site:

http://sunsolve5.sun.com/sunsolve/Enterprise-dr

I/O board Type 2 (graphics), Type 3 (PCI), and Type 5 (graphics and SOC+) are not currently supported.

Firmware Notes

FC-AL Disk Arrays or Internal Drives

For Sun StorEdge A5000 disk arrays or for internal FC-AL disks in the Sun Enterprise 3500 system, the firmware version must be ST19171FC 0413 or a later, compatible version. For more information, refer to the “Solaris 8” section at the following Web site:

http://sunsolve5.sun.com/sunsolve/Enterprise-dr

PROM Updates for CPU and I/O Boards

Users of Solaris 8 2/02 software who wish to use Dynamic Reconfiguration must be running CPU PROM version 3.2.22 (firmware patch ID 103346-xx) or a later, compatible version. This firmware is available from the Web site. See “How to Obtain Firmware.”

Older versions of the CPU PROM may display the following message during boot:


Firmware does not support Dynamic Reconfiguration


Caution –

CPU PROM 3.2.16 and earlier versions do not display this message, although they do not support dynamic reconfiguration of CPU/memory boards.


  1. To see your current PROM revision, enter .version and banner at the ok prompt.

    Your display will be similar to the following:


    ok .version 
    Slot  0 - I/O Type 1 FCODE 1.8.22 1999/xx/xx 19:26  iPOST 3.4.22 1999/xx/xx 19:31
    Slot  1 - I/O Type 1 FCODE 1.8.22 1999/xx/xx 19:26  iPOST 3.4.22 1999/xx/xx 19:31
    Slot  2 - CPU/Memory OBP   3.2.22 1999/xx/xx 19:27  POST  3.9.22 1999/xx/xx 19:31
    Slot  3 - I/O Type 4 FCODE 1.8.22 1999/xx/xx 19:27  iPOST 3.4.22 1999/xx/xx 19:31
    Slot  4 - CPU/Memory OBP   3.2.22 1999/xx/xx 19:27  POST  3.9.22 1999/xx/xx 19:31
    Slot  5 - CPU/Memory OBP   3.2.22 1999/xx/xx 19:27  POST  3.9.22 1999/xx/xx 19:31
    Slot  6 - CPU/Memory OBP   3.2.22 1999/xx/xx 19:27  POST  3.9.22 1999/xx/xx 19:31
    Slot  7 - CPU/Memory OBP   3.2.22 1999/xx/xx 19:27  POST  3.9.22 1999/xx/xx 19:31
    Slot  9 - CPU/Memory OBP   3.2.22 1999/xx/xx 19:27  POST  3.9.22 1999/xx/xx 19:31
    Slot 11 - CPU/Memory OBP   3.2.22 1999/xx/xx 19:27  POST  3.9.22 1999/xx/xx 19:31
    Slot 12 - CPU/Memory OBP   3.2.22 1999/xx/xx 19:27  POST  3.9.22 1999/xx/xx 19:31
    Slot 14 - CPU/Memory OBP   3.2.22 1999/xx/xx 19:27  POST  3.9.22 1999/xx/xx 19:31
    ok banner
    16-slot Sun Enterprise E6500
    OpenBoot 3.2.22, 4672 MB memory installed, Serial #xxxxxxxx.
    Ethernet address 8:0:xx:xx:xx:xx, Host ID: xxxxxxxx.
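If the .version output has been saved to a text file (for example, copied from a console session), the minimum-revision check can be sketched as follows. The function name is hypothetical, the input is assumed to have the same column layout as the display above, and a POSIX awk is assumed (nawk on Solaris).

```shell
# Sketch: read saved `.version` output on stdin and list CPU/Memory boards
# whose OBP revision is older than 3.2.22.
# check_obp_versions is an illustrative name, not a Sun-supplied tool.
check_obp_versions() {
    awk -v maj=3 -v min=2 -v pat=22 '
        /CPU\/Memory/ && /OBP/ {
            # The revision is the field that follows the literal "OBP".
            for (i = 1; i < NF; i++) if ($i == "OBP") ver = $(i + 1)
            split(ver, v, ".")
            a = v[1] + 0; b = v[2] + 0; c = v[3] + 0
            if (a < maj || (a == maj && b < min) ||
                (a == maj && b == min && c < pat)) {
                printf "slot %s: OBP %s is too old for DR\n", $2, ver
                bad = 1
            }
        }
        END { exit bad + 0 }   # non-zero exit if any board is too old
    '
}
```

The function prints nothing and returns 0 when every CPU/Memory board is at 3.2.22 or later.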

How to Obtain Firmware

For information about updating your firmware, refer to the “Solaris 8” section at the following Web site:

http://sunsolve5.sun.com/sunsolve/Enterprise-dr

At this site, there is information on how to:

If you cannot use the Web site, contact your Sun support service provider for assistance.

Software Notes

Enabling Dynamic Reconfiguration

In the /etc/system file, two variables must be set to enable dynamic reconfiguration and an additional variable must be set to enable the removal of CPU/memory boards.

  1. Log in as superuser.

  2. To enable dynamic reconfiguration, add the following lines to the /etc/system file:


    set pln:pln_enable_detach_suspend=1
    set soc:soc_enable_detach_suspend=1

  3. To enable the removal of a CPU/memory board, add this line to the /etc/system file:


    set kernel_cage_enable=1

    Setting this variable enables the memory unconfiguration operation.

  4. Reboot the system to put the changes into effect.
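The edits in steps 2 and 3 can be sketched as a shell helper that appends each setting only if it is not already present, so that running it twice does not duplicate lines. The function names are hypothetical, and the grep options used (-q, -x, -F) assume a POSIX grep, which may not be the default grep on every Solaris installation. Run it as superuser and reboot afterward, as in the procedure above.

```shell
# Sketch: idempotently add the DR-related settings to /etc/system.
# add_system_setting and enable_dr_settings are illustrative names.
add_system_setting() {
    sys_file=$1
    sys_line=$2
    # Append the line only when no identical line exists yet.
    grep -qxF "$sys_line" "$sys_file" 2>/dev/null ||
        printf '%s\n' "$sys_line" >> "$sys_file"
}

enable_dr_settings() {
    target=${1:-/etc/system}
    add_system_setting "$target" 'set pln:pln_enable_detach_suspend=1'
    add_system_setting "$target" 'set soc:soc_enable_detach_suspend=1'
    # Needed only for removal of CPU/memory boards.
    add_system_setting "$target" 'set kernel_cage_enable=1'
}
```

Passing a file name, as in enable_dr_settings /tmp/system.test, lets you inspect the result before touching the real /etc/system.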

Quiesce Test

On a large system, the quiesce-test command (cfgadm -x quiesce-test sysctrl0:slotnumber) may run for about a minute. If cfgadm does not find incompatible drivers, no messages are displayed during this time. This is normal behavior.

Disabled Board List

If a board is on the disabled board list, an attempt to connect the board may produce an error message:


# cfgadm -c connect sysctrl0:slotnumber
cfgadm: Hardware specific failure: connect failed: board is disabled: must override with [-f][-o enable-at-boot]

  1. To override the disabled condition, use the force flag (-f) or the enable option (-o enable-at-boot) with the cfgadm command:


    # cfgadm -f -c connect sysctrl0:slotnumber
    

    # cfgadm -o enable-at-boot -c connect sysctrl0:slotnumber
    

  1. To remove all boards from the disabled board list, set the disabled-board-list variable to a null set with the system command:


    # eeprom disabled-board-list=
    

  1. If you are at the OpenBoot prompt, use this command instead of the above to remove all boards from the disabled board list:


    ok set-default disabled-board-list
    

    For further information about the disabled-board-list setting, refer to the section “Specific NVRAM Variables” in the Platform Notes: Sun Enterprise 3x00, 4x00, 5x00, and 6x00 Systems manual in the Solaris on Sun Hardware Collection AnswerBook set in this release.

Disabled Memory List

For information about the OpenBoot PROM disabled-memory-list setting, refer to the section “Specific NVRAM Variables” in the Platform Notes: Sun Enterprise 3x00, 4x00, 5x00, and 6x00 Systems in the Solaris on Sun Hardware Collection AnswerBook set in this release.

Unloading Detach-Unsafe Drivers

If it is necessary to unload detach-unsafe drivers, use the modinfo(1M) command to find the module IDs of the drivers. You can then pass the module IDs to the modunload(1M) command to unload the detach-unsafe drivers.
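As a rough sketch of this lookup, the helper below extracts module IDs from modinfo output by module name. It assumes the usual modinfo column layout (Id, Loadaddr, Size, Info, Rev, Module Name), with the module name in the sixth column; the function name and the driver name in the usage comment are illustrative only.

```shell
# Sketch: print the module ID(s) for a given module name.
# Reads modinfo-style output on stdin; module_ids_for is a hypothetical name.
module_ids_for() {
    awk -v name="$1" '$1 ~ /^[0-9]+$/ && $6 == name { print $1 }'
}

# Possible usage on a live system (driver name is an example):
#   modinfo | module_ids_for qfe | while read id; do modunload -i "$id"; done
```

Matching on the exact module name avoids unloading a different module whose description merely contains the same string.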

Interleaved Memory

A memory board or CPU/memory board that contains interleaved memory cannot be dynamically unconfigured.

To determine if memory is interleaved, use the prtdiag command or the cfgadm command.

To permit DR operations on CPU/memory boards, set the NVRAM memory-interleave property to min.

For related information about interleaved memory, see Memory Interleaving Set Incorrectly After a Fatal Reset (BugID 4156075) and DR: Cannot Unconfigure a CPU/Memory Board That Has Interleaved Memory (BugID 4210234).
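From a running system, the property can be read with eeprom memory-interleave and set with eeprom memory-interleave=min. The sketch below only interprets the value that eeprom prints; it assumes the usual name=value output form, and the function name is hypothetical.

```shell
# Sketch: read one "memory-interleave=<value>" line on stdin and report
# whether the setting permits DR operations on CPU/memory boards.
# interleave_permits_dr is an illustrative name.
interleave_permits_dr() {
    IFS='=' read -r _name value || true
    if [ "$value" = "min" ]; then
        echo "yes"
    else
        echo "no (memory-interleave=$value)"
    fi
}

# Possible usage on a live system:
#   eeprom memory-interleave | interleave_permits_dr
```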

Self-Test Failure During a Connect Sequence

If the error “cfgadm: Hardware specific failure: connect failed: firmware operation error” is displayed during a DR connect sequence, remove the board from the system as soon as possible. The board has failed self-test, and removing the board avoids possible reconfiguration errors that can occur during the next reboot.

If you want to immediately retry the failed operation, you must first remove and reinsert the board, because the board status does not allow further operations.

Known Bugs

The following list is subject to change at any time. For the latest bug and patch information, refer to:

http://sunsolve5.sun.com/sunsolve/Enterprise-dr

cfgadm -v Not Working Properly (BugID 4149371)

The memory test does not give periodic indications that it is still running. During a long test, the user cannot easily determine whether the system has hung.

Workaround: Monitor system progress in another shell or window, using vmstat(1M), ps(1), or similar shell commands.

Memory Interleaving Set Incorrectly After a Fatal Reset (BugID 4156075)

Memory interleaving is left in an incorrect state when a Sun Enterprise x500 server is rebooted after a Fatal Reset. Subsequent DR operations fail. The problem only occurs on systems with memory interleaving set to min.

Workarounds: Two choices are listed below.

  1. To clear the problem after it occurs, manually reset the system at the ok prompt.

  1. To avoid the problem before it occurs, set the NVRAM memory-interleave property to max.

    This causes memory to be interleaved whenever the system is booted. However, you may find this option to be unacceptable, as a memory board containing interleaved memory cannot be dynamically unconfigured. See DR: Cannot Unconfigure a CPU/Memory Board That Has Interleaved Memory (BugID 4210234).

vmstat Output Is Incorrect After Configuring Processors (BugID 4159024)

vmstat shows an unusually high number of interrupts after CPUs are configured. With vmstat running in the background, the interrupt field becomes abnormally large; this does not indicate a problem. In the last row of the example below, the interrupts (in) column has a value of 4294967216:


#  procs     memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s6 s9 s1 --   in   sy   cs us sy id
 0 0 0 437208 146424  0   1  4  0  0  0  0  0  1  0  0   50   65   79  0  1 99
 0 0 0 413864 111056  0   0  0  0  0  0  0  0  0  0  0  198  137  214  0  3 97
 0 0 0 413864 111056  0   0  0  0  0  0  0  0  0  0  0  286  101  200  0  3 97
 0 0 0 413864 111072  0  11  0  0  0  0  0  0  1  0  0 4294967216 43 68 0 0 100

Workaround: Restart vmstat.
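One way to read the value 4294967216 is as a small negative number displayed as an unsigned 32-bit quantity, since 4294967296 (2^32) minus 80 equals 4294967216. This interpretation is an assumption and is not stated in the bug report. The arithmetic can be checked with a short sketch, which assumes a shell with 64-bit arithmetic; the function name is hypothetical.

```shell
# Sketch: reinterpret an unsigned 32-bit value as a signed 32-bit value.
# as_signed32 is an illustrative name; requires 64-bit shell arithmetic.
as_signed32() {
    if [ $(( $1 >= 2147483648 )) -eq 1 ]; then
        echo $(( $1 - 4294967296 ))
    else
        echo "$1"
    fi
}

as_signed32 4294967216    # prints -80
```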

DR: Cannot Unconfigure a CPU/Memory Board That Has Interleaved Memory (BugID 4210234)

A CPU/memory board that has interleaved memory cannot be unconfigured.

To unconfigure and subsequently disconnect a CPU board with memory or a memory-only board, it is necessary to first unconfigure the memory. However, if the memory on the board is interleaved with memory on other boards, the memory cannot currently be unconfigured dynamically.

Memory interleaving can be displayed using the prtdiag or the cfgadm commands.

Workaround: Shut down the system before servicing the board, then reboot afterward. To permit future DR operations on the CPU/memory board, set the NVRAM memory-interleave property to min. See also Memory Interleaving Set Incorrectly After a Fatal Reset (BugID 4156075) for a related discussion on interleaved memory.

DR: Cannot Unconfigure a CPU/Memory Board That Has Permanent Memory (BugID 4210280)

To unconfigure and subsequently disconnect a CPU board with memory or a memory-only board, it is necessary to first unconfigure the memory. However, some memory is not currently relocatable. This memory is considered permanent.

Permanent memory on a board is marked “permanent” in the cfgadm status display:


# cfgadm -s cols=ap_id:type:info
Ap_Id Type Information
ac0:bank0 memory slot3 64Mb base 0x0 permanent
ac0:bank1 memory slot3 empty
ac1:bank0 memory slot5 empty
ac1:bank1 memory slot5 64Mb base 0x40000000

In this example, the board in slot3 has permanent memory and so cannot be removed.

Workaround: Shut down the system before servicing the board, then reboot afterward.
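To find boards with permanent memory without reading the listing by eye, the status display can be filtered as in the sketch below. It assumes the column layout shown in the example above (slot name in the third column, the word “permanent” last on the line); the function name is hypothetical.

```shell
# Sketch: list slots whose memory is marked "permanent".
# Reads `cfgadm -s cols=ap_id:type:info` output on stdin.
# permanent_memory_slots is an illustrative name.
permanent_memory_slots() {
    awk '$2 == "memory" && $NF == "permanent" { print $3 }' | sort -u
}

# Possible usage on a live system:
#   cfgadm -s cols=ap_id:type:info | permanent_memory_slots
```

Applied to the example listing above, the filter prints slot3.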

cfgadm Disconnect Fails When Running Concurrent cfgadm Commands (BugID 4220105)

A cfgadm disconnect operation fails if another cfgadm process is already running on a different board. The message is:


cfgadm: Hardware specific failure:
disconnect failed: nexus error during detach: address

Workaround: Do only one cfgadm operation at a time. If a cfgadm operation is running on one board, wait for it to finish before you start a cfgadm disconnect operation on a second board.
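Where several administrators or scripts may invoke cfgadm, the one-at-a-time rule can be enforced with a simple lock. The sketch below uses mkdir as an atomic lock; the lock path, the retry interval, and the function name are assumptions chosen for illustration, and the lock only protects commands that are started through this wrapper.

```shell
# Sketch: serialize commands with a directory lock.
# LOCKDIR and run_serialized are hypothetical names.
LOCKDIR=${LOCKDIR:-/tmp/cfgadm.lock}

run_serialized() {
    # mkdir succeeds for exactly one caller; others wait and retry.
    until mkdir "$LOCKDIR" 2>/dev/null; do
        sleep 5
    done
    "$@" && status=0 || status=$?
    rmdir "$LOCKDIR"
    return "$status"
}

# Possible usage:
#   run_serialized cfgadm -c disconnect sysctrl0:slotnumber
```

If a wrapped command is killed before rmdir runs, the lock directory must be removed by hand.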

Cannot Drain and/or Detach Sun Enterprise Server Boards That Host QFE Cards (BugID 4231845)

A server configured as a boot server for Solaris 2.5.1-based Intel platform clients has several rpld jobs running, whether or not such devices are in use. These active references prevent DR operations from detaching these devices.

Workaround: Perform the DR detach operation as follows:

  1. Remove or rename the /rplboot directory.

  2. Shut down NFS services with this command:


    # sh /etc/init.d/nfs.server stop
    

  3. Perform the DR detach operation.

  4. Restart NFS services with this command:


    # sh /etc/init.d/nfs.server start
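The four steps above can be sketched as one wrapper that prints the commands by default and runs them only when RUN=1 is set. The function names, the RUN variable, and the /rplboot.disabled name are assumptions for illustration; the DR detach command itself is supplied by the caller, because the steps above do not prescribe one.

```shell
# Sketch: dry-run wrapper for the rpld workaround.
# step, detach_with_rpld_workaround, RUN, and /rplboot.disabled are
# hypothetical names.
step() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

detach_with_rpld_workaround() {
    if [ $# -eq 0 ]; then
        echo "usage: detach_with_rpld_workaround <detach command...>" >&2
        return 2
    fi
    step mv /rplboot /rplboot.disabled      # step 1: rename /rplboot
    step sh /etc/init.d/nfs.server stop     # step 2: stop NFS services
    step "$@"                               # step 3: caller's DR detach command
    step sh /etc/init.d/nfs.server start    # step 4: restart NFS services
}
```

Reviewing the printed commands first, then repeating the call with RUN=1 as superuser, keeps the sequence in the documented order.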