This section contains the latest information about dynamic reconfiguration (DR) functionality for the following midrange servers that are running the Solaris 10 software:
Sun Enterprise 6x00
Sun Enterprise 5x00
Sun Enterprise 4x00
Sun Enterprise 3x00
For more information about Sun Enterprise Server Dynamic Reconfiguration, refer to the Dynamic Reconfiguration User's Guide for Sun Enterprise 3x00/4x00/5x00/6x00 Systems. The Solaris 10 release includes support for all CPU/memory boards and most I/O boards in the systems that are mentioned in the preceding list.
Before proceeding, make sure that the system supports dynamic reconfiguration. If your system is of an older design, the following message appears on your console or in your console logs. Such a system is not suitable for dynamic reconfiguration.
Hot Plug not supported in this system
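As a rough pre-check, a saved console log can be searched for that message. The following is a minimal sketch, not a definitive test: the helper name dr_hotplug_supported is hypothetical, and the log file location (for example /var/adm/messages) is an assumption, because these notes do not name a log file. The absence of the message in one file does not prove that the system supports DR.

```shell
#!/bin/sh
# dr_hotplug_supported LOGFILE   (hypothetical helper name)
# Returns 0 when LOGFILE does not contain the warning,
#         1 when the warning is present,
#         2 when LOGFILE cannot be read.
dr_hotplug_supported() {
    logfile=$1
    [ -r "$logfile" ] || return 2
    if grep "Hot Plug not supported in this system" "$logfile" >/dev/null; then
        return 1
    fi
    return 0
}
```

For example, dr_hotplug_supported /var/adm/messages && echo "no hot-plug warning found" prints the confirmation only when the warning is absent from that file.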
The following I/O boards are not currently supported:
Type 2 (graphics)
Type 3 (PCI)
Type 5 (graphics and SOC+)
This section provides general software information about DR.
To enable dynamic reconfiguration, you must set two variables in the /etc/system file. You must also set an additional variable to enable the removal of CPU/memory boards. Perform the following steps:
Log in as superuser.
Edit the /etc/system file by adding the following lines:
set pln:pln_enable_detach_suspend=1
set soc:soc_enable_detach_suspend=1
To enable the removal of a CPU/memory board, add this line to the file:
Setting this variable enables the memory unconfiguration operation.
Reboot the system to apply the changes.
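Before rebooting, it can help to confirm that both lines from step 2 are present in the file. The sketch below assumes each setting sits on its own line with no leading or trailing whitespace; the function name check_dr_settings is hypothetical. It checks only the two variables named in step 2.

```shell
#!/bin/sh
# check_dr_settings FILE   (hypothetical helper name)
# Prints every required "set" line that is missing from FILE and
# returns 0 only when both lines are present.
check_dr_settings() {
    file=$1
    missing=0
    for setting in \
        "set pln:pln_enable_detach_suspend=1" \
        "set soc:soc_enable_detach_suspend=1"
    do
        # Anchor at both ends so that commented-out or altered lines do not match.
        if grep "^$setting\$" "$file" >/dev/null 2>&1; then
            :
        else
            echo "missing: $setting"
            missing=1
        fi
    done
    return $missing
}
```

Running check_dr_settings /etc/system prints nothing and returns 0 when both lines are in place.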
You start the quiesce test with the following command:
# cfgadm -x quiesce-test sysctrl0:slotnumber
On a large system, the quiesce test might run for up to a minute. If cfgadm finds no incompatible drivers, no messages are displayed during this time.
Attempting to connect a board that is on the disabled board list might produce an error message:
# cfgadm -c connect sysctrl0:slotnumber
cfgadm: Hardware specific failure: connect failed: board is disabled: must override with [-f][-o enable-at-boot]
To override the disabled condition, two options are available:
Using the force flag (-f)
# cfgadm -f -c connect sysctrl0:slotnumber
Using the enable option (-o enable-at-boot)
# cfgadm -o enable-at-boot -c connect sysctrl0:slotnumber
To remove all boards from the disabled board list, choose one of two options depending on the prompt from which you issue the command:
From the superuser prompt, type:
# eeprom disabled-board-list=
From the OpenBoot PROM prompt, type:
OK set-default disabled-board-list
For further information about the disabled-board-list setting, refer to the “Specific NVRAM Variables” section in the Platform Notes: Sun Enterprise 3x00, 4x00, 5x00, and 6x00 Systems manual. This manual is part of the documentation set in this release.
Information about the OpenBoot PROM disabled-memory-list setting is published in this release. See “Specific NVRAM Variables” in the Platform Notes: Sun Enterprise 3x00, 4x00, 5x00, and 6x00 Systems in the Solaris on Sun Hardware documentation.
If you need to unload detach-unsafe drivers, use the modinfo command to find the module IDs of the drivers. You can then pass those module IDs to the modunload command to unload the drivers.
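The lookup can be sketched as a small filter over modinfo output. This sketch assumes the usual modinfo column layout, in which the module ID is the first field and the module name is the sixth; the function name module_id is hypothetical. It uses the awk -v option, so on Solaris it might need nawk or /usr/xpg4/bin/awk in place of awk.

```shell
#!/bin/sh
# module_id NAME   (hypothetical helper name)
# Reads modinfo-style output on standard input and prints the Id
# (1st field) of every module whose name (6th field) equals NAME.
module_id() {
    awk -v name="$1" '$6 == name { print $1 }'
}
```

For example, modinfo | module_id soc prints the ID of the soc module if it is loaded, and that ID can then be passed to modunload -i. The driver name soc is used here only as an illustration.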
Remove the board from the system as soon as possible if the following error message is displayed during a DR connect sequence:
cfgadm: Hardware specific failure: connect failed: firmware operation error
The board has failed self-test, and removing the board avoids possible reconfiguration errors that can occur during the next reboot.
The failed self-test status does not allow further operations. Therefore, if you want to retry the failed operation immediately, you must first remove and then reinsert the board.
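A script that drives DR connect operations could separate the two connect failures described in this section by matching on the message text. The following is only a sketch: the function name classify_connect_error and the category words it prints are hypothetical, and the match strings are taken from the messages quoted in this section.

```shell
#!/bin/sh
# classify_connect_error   (hypothetical helper name)
# Reads cfgadm error text on standard input and prints one word:
#   disabled  - board is on the disabled board list
#   selftest  - firmware operation error (board failed self-test)
#   other     - anything else
classify_connect_error() {
    msg=`cat`
    case $msg in
        *"board is disabled"*)        echo disabled ;;
        *"firmware operation error"*) echo selftest ;;
        *)                            echo other ;;
    esac
}
```

A caller would typically pipe the combined output of the connect command into the function, as in cfgadm -c connect sysctrl0:slotnumber 2>&1 | classify_connect_error, and remove the board when the result is selftest.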
The following list is subject to change at any time.
If a process is holding open a network device, any DR operation that would involve that device fails. Daemons and processes that hold reference counts stop DR operations from completing.
Workaround: As superuser, perform the following steps:
Remove or rename the /rplboot directory.
Shut down NFS services.
# sh /etc/init.d/nfs.server stop
Shut down Boot Server services.
# sh /etc/init.d/boot.server stop
Perform the DR detach operation.
Restart NFS services.
# sh /etc/init.d/nfs.server start
Restart Boot Server services.
# sh /etc/init.d/boot.server start
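The stop, detach, and restart sequence above can be wrapped in one function so that the services are restarted even when the detach fails. This is a sketch under stated assumptions: the names run and dr_detach_with_services_stopped and the DRY_RUN variable are hypothetical, the detach command is supplied by the caller because these notes do not specify it, and the /rplboot step is left to be done by hand.

```shell
#!/bin/sh
# run CMD [ARGS...]   (hypothetical helper)
# Prints the command when DRY_RUN=1; otherwise executes it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# dr_detach_with_services_stopped CMD [ARGS...]   (hypothetical helper)
# Stops the NFS and Boot Server services, runs the caller's detach
# command, restarts both services, and returns the detach status.
dr_detach_with_services_stopped() {
    run sh /etc/init.d/nfs.server stop
    run sh /etc/init.d/boot.server stop
    run "$@"
    status=$?
    run sh /etc/init.d/nfs.server start
    run sh /etc/init.d/boot.server start
    return $status
}
```

Setting DRY_RUN=1 before calling, for instance, dr_detach_with_services_stopped cfgadm -c disconnect sysctrl0:slot3 prints the five commands without running them. The attachment point sysctrl0:slot3 is an illustrative value only.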
Memory interleaving is left in an incorrect state when a Sun Enterprise x500 server is rebooted after a fatal reset. Subsequent DR operations fail. The problem occurs only on systems with memory interleaving set to min.
Workaround: Choose one of the following options:
To clear the problem, manually reset the system at the OK prompt.
To avoid the problem, set the NVRAM memory-interleave property to max.
The second option causes memory to be interleaved whenever the system is booted. However, this option might be unacceptable because a memory board that contains interleaved memory cannot be dynamically unconfigured. See Cannot Unconfigure a CPU/Memory Board That Has Interleaved Memory (4210234).
To unconfigure and subsequently disconnect a CPU board with memory or a memory-only board, first unconfigure the memory. However, if the memory on the board is interleaved with memory on other boards, the memory cannot currently be unconfigured dynamically.
Memory interleaving can be displayed by using the prtdiag or the cfgadm commands.
Workaround: Shut down the system before servicing the board, then reboot afterward. To permit future DR operations on the CPU/memory board, set the NVRAM memory-interleave property to min. See also Memory Interleaving Set Incorrectly After a Fatal Reset (4156075) for a related discussion about interleaved memory.
To unconfigure and subsequently disconnect a CPU board with memory or a memory-only board, first unconfigure the memory. However, some memory cannot currently be relocated. This memory is considered permanent.
Permanent memory on a board is marked “permanent” in the cfgadm status display:
# cfgadm -s cols=ap_id:type:info
Ap_Id       Type     Information
ac0:bank0   memory   slot3 64Mb base 0x0 permanent
ac0:bank1   memory   slot3 empty
ac1:bank0   memory   slot5 empty
ac1:bank1   memory   slot5 64Mb base 0x40000000
In this example, the board in slot3 has permanent memory and so cannot be removed.
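The same check can be sketched as a filter over that status display. The function name permanent_slots is hypothetical, and the field positions are assumed from the example output above: type in the second field, slot in the third, and the word permanent as the last field.

```shell
#!/bin/sh
# permanent_slots   (hypothetical helper name)
# Reads "cfgadm -s cols=ap_id:type:info" output on standard input and
# prints the slot (3rd field) of every memory bank whose last field
# is "permanent".
permanent_slots() {
    awk '$2 == "memory" && $NF == "permanent" { print $3 }'
}
```

Fed the example display, cfgadm -s cols=ap_id:type:info | permanent_slots prints slot3. If both banks on one board are permanent, the slot is printed twice, so the output can be piped through sort -u.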
Workaround: Shut down the system before servicing the board, then reboot afterward.
If a cfgadm process is running on one board, an attempt to simultaneously disconnect a second board fails. The following error message is displayed:
cfgadm: Hardware specific failure: disconnect failed: nexus error during detach: address
Workaround: Run only one cfgadm operation at a time. Allow a cfgadm operation that is running on one board to finish before you start a cfgadm disconnect operation on a second board.
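One way to enforce this in scripts is a cooperative lock around each cfgadm call. The sketch below relies on mkdir failing when the directory already exists; the lock path, the DR_LOCK_DIR variable, and the function name with_dr_lock are all assumptions. It protects only callers that use the wrapper and does nothing about cfgadm commands typed directly.

```shell
#!/bin/sh
# Lock directory location is an assumption; override DR_LOCK_DIR if needed.
DR_LOCK_DIR=${DR_LOCK_DIR:-/tmp/dr_cfgadm.lock}

# with_dr_lock CMD [ARGS...]   (hypothetical helper name)
# Runs CMD only when no other wrapped DR operation holds the lock.
# Returns 1 without running CMD if the lock is already held;
# otherwise returns the exit status of CMD.
with_dr_lock() {
    # mkdir either creates the directory or fails, which makes it a
    # simple test-and-set.
    mkdir "$DR_LOCK_DIR" 2>/dev/null || {
        echo "another DR operation appears to be running ($DR_LOCK_DIR exists)" >&2
        return 1
    }
    "$@"
    status=$?
    rmdir "$DR_LOCK_DIR"
    return $status
}
```

For example, with_dr_lock cfgadm -c disconnect sysctrl0:slotnumber refuses to start while another wrapped operation is still running. If a wrapped command is killed before rmdir runs, the lock directory must be removed by hand.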