This chapter contains information about the DR 3.0 model of dynamic reconfiguration (DR) on the Sun Enterprise™ 10000 server.
The DR 3.0 model is based on the use of the domain configuration server, dcs(1M), to control DR operations. This model includes the Automated DR (ADR) commands, such as addboard(1M), deleteboard(1M), and moveboard(1M). The showusage(1M) command is no longer supported. DR 3.0 includes three new commands:
showdevices(1M) displays the usage of devices (see "Showing Device Information" for more information).
rcfgadm(1M) displays the status of attachment points on the domain.
cfgadm(1M) lists the status of the dynamically reconfigurable hardware resources on the domain (see also cfgadm_sbd(1M) for more information).
Automatic DR enables an application to execute DR operations without requiring user interaction. This ability is provided by an enhanced DR framework that includes a reconfiguration coordination manager (RCM) and a system event facility called sysevent. The RCM enables application-specific loadable modules to register callbacks with the RCM. The callbacks perform preparatory tasks before a DR operation, error recovery during a DR operation, or clean-up after a DR operation. The sysevent facility enables applications to register for system events and to receive notifications of those events. The automatic DR framework interfaces with the RCM and with sysevent to notify applications to give up resources automatically prior to unconfiguring them, and to capture new resources as they are configured into the domain.
Note: Automatic DR is a different feature from Automated DR (ADR).
For more information about RCM, refer to the Solaris 8 System Administration Supplement in the Solaris 8 4/01 Update Collection.
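The callback pattern described above can be illustrated with a toy model. This is only a sketch of the idea (prepare before the operation, recover on failure, clean up afterward); the class `ToyCoordinator` and its methods are hypothetical names invented for illustration and are not the RCM or sysevent interfaces.

```python
class ToyCoordinator:
    """Toy stand-in for a coordination manager; NOT the real RCM interface."""

    def __init__(self):
        self._clients = []

    def register(self, name, prepare, recover, cleanup):
        # Each client supplies three callbacks that take a resource name.
        self._clients.append((name, prepare, recover, cleanup))

    def run_operation(self, resource, operation):
        prepared = []
        try:
            for client in self._clients:
                client[1](resource)      # preparatory task before the operation
                prepared.append(client)
            operation(resource)          # the reconfiguration step itself
        except Exception:
            for client in prepared:
                client[2](resource)      # error recovery for prepared clients
            return False
        for client in prepared:
            client[3](resource)          # clean-up after a successful operation
        return True


log = []
coordinator = ToyCoordinator()
coordinator.register(
    "database",
    prepare=lambda r: log.append(f"prepare {r}"),
    recover=lambda r: log.append(f"recover {r}"),
    cleanup=lambda r: log.append(f"cleanup {r}"),
)
ok = coordinator.run_operation("SB2", lambda r: log.append(f"unconfigure {r}"))
print(ok, log)
```

In this sketch a client that was never prepared is not asked to recover, which keeps the recovery step symmetric with the preparation step.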
The DR feature enables you to hot-swap system boards without bringing the server down. It is used to unconfigure the resources on a faulty system board from a domain so that the system board can be removed from the server. The repaired, or replacement, board can be inserted into the domain while the Solaris Operating Environment is running. DR then configures the resources on the board into the domain.
You must use caution when you add or remove system boards with I/O devices. Before you can remove a board with I/O devices, all of its devices must be closed and all its file systems must be unmounted.
If you remove a board with I/O devices from a domain temporarily and then re-add it before any other boards with I/O devices are added, reconfiguration is not necessary. In this case, the device paths to the board's devices remain unchanged. However, if you add another board with I/O devices after the first board was removed and then re-add the first board, reconfiguration is required because the paths to the devices on the first board have changed.
The Sun Enterprise 10000 server can be divided into domains that contain system boards, I/O boards, and components such as CPUs, memory chips, and CompactPCI cards that are connected to the boards. Domains are electrically isolated from one another in hardware partitions, so a failure in one domain does not affect the other domains in the server.
This section contains procedures that describe how to use the DR 3.0 commands to display domain and device information and to add, delete, move, and physically replace system boards.
Before you attempt to add, move, or delete a board to or from a specific domain, you can use the domain_status(1M) command to determine the domain name and the board number.
Use the domain_status(1M) command to obtain the domain information.
% domain_status -m
The domain_status command with the -m option displays the domain name, the DR model, and the numbers of the system boards in each domain, as in the following example.
% domain_status -m
DOMAIN TYPE                   PLATFORM DR-MODEL OS  SYSBDS
A      Ultra-Enterprise-10000 all-A    2.0      5.8 2
B      Ultra-Enterprise-10000 all-A    3.0      5.8 3 4
C      Ultra-Enterprise-10000 all-A    2.0      5.7 5 6
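If the domain information needs to be checked from a script, output in this layout can be parsed with a few lines of code. The following is a minimal sketch, assuming the whitespace-separated column layout of the example above; the function name `parse_domain_status` is hypothetical, and the sample text is copied from the example rather than captured from a live system.

```python
SAMPLE = """\
DOMAIN TYPE                   PLATFORM DR-MODEL OS  SYSBDS
A      Ultra-Enterprise-10000 all-A    2.0      5.8 2
B      Ultra-Enterprise-10000 all-A    3.0      5.8 3 4
C      Ultra-Enterprise-10000 all-A    2.0      5.7 5 6
"""


def parse_domain_status(text):
    """Return {domain: {"dr_model": str, "os": str, "boards": [int, ...]}}."""
    domains = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 5 or fields[0] == "DOMAIN":
            continue
        name, _type, _platform, dr_model, os_version = fields[:5]
        domains[name] = {
            "dr_model": dr_model,
            "os": os_version,
            # Everything after the OS column is a system board number.
            "boards": [int(b) for b in fields[5:]],
        }
    return domains


domains = parse_domain_status(SAMPLE)
# Domains that run the DR 3.0 model described in this chapter.
dr3_domains = [name for name, info in domains.items() if info["dr_model"] == "3.0"]
print(dr3_domains)
print(domains["B"]["boards"])
```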
Before you attempt to perform any DR operation, use the showdevices(1M) command to display the device information, especially when you are removing devices.
Use the showdevices(1M) command to display the device information for the domain.
% showdevices -v -d A
The above command displays the device information for all of the devices in the domain. Refer to the showdevices(1M) man page to learn how to display device-specific information.
The above command produces the following output for CPUs in domain A (the following is an example only).
CPU
----
domain board id state  speed ecache usage
A      C1    40 online 400   4
A      C1    41 online 400   4
A      C1    42 online 400   4
A      C1    43 online 400   4
A      C2    55 online 400   4
A      C2    56 online 400   4
A      C2    57 online 400   4
A      C2    58 online 400   4
The following output represents an example of the memory output for the showdevices(1M) command above.
Memory drain in progress:
-----------------
             board  perm   base     domain target deleted remaining
domain board mem MB mem MB addr     mem MB board  MB      MB
A      C1    2048   933    0x600000 4096   C2     250     1500
A      C2    2048   0      0x200000 4096
The following output represents an example of the I/O devices output for the showdevices(1M) command above.
IO Devices
----------
domain board device resource          usage
A      IO1   sd0
A      IO1   sd1
A      IO1   sd2
A      IO1   sd3    /dev/dsk/c0t3d0s0 mounted filesystem "/"
A      IO1   sd3    /dev/dsk/c0t3d0s1 dump device (swap)
A      IO1   sd3    /dev/dsk/c0t3d0s1 swap area
A      IO1   sd3    /dev/dsk/c0t3d0s3 mounted filesystem "/var"
A      IO1   sd3    /var/run          mounted filesystem "/var/run"
A      IO1   sd4
A      IO1   sd5
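Because a board with I/O devices cannot be removed while its devices are in use, it can help to extract the busy devices from this output. The following is a minimal sketch, assuming that rows with a resource column indicate usage and that columns are separated by whitespace as in the example; the function name `busy_devices` is hypothetical, and the sample is an abbreviated copy of the example output.

```python
SAMPLE = """\
domain board device resource          usage
A      IO1   sd0
A      IO1   sd3    /dev/dsk/c0t3d0s0 mounted filesystem "/"
A      IO1   sd3    /dev/dsk/c0t3d0s1 swap area
A      IO1   sd4
"""


def busy_devices(text, board):
    """Return (device, resource, usage) tuples for rows that report usage."""
    busy = []
    for line in text.splitlines():
        # At most five fields: domain, board, device, resource, usage.
        # The usage text may itself contain spaces, so limit the split.
        fields = line.strip().split(None, 4)
        if len(fields) < 3 or fields[0] == "domain":
            continue
        if fields[1] == board and len(fields) >= 4:
            usage = fields[4] if len(fields) == 5 else ""
            busy.append((fields[2], fields[3], usage))
    return busy


for device, resource, usage in busy_devices(SAMPLE, "IO1"):
    print(device, resource, usage)
```

An empty result for a board suggests, under these assumptions, that none of its listed devices is reported as in use.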
Refer to the showdevices(1M) man page for a complete list of the options and arguments for this command.
Adding a board to a domain moves the board through several state changes. It is connected to the domain and then configured into the Solaris Operating Environment. After it is connected, it is considered to be part of the physical domain and available to be used by the operating system.
Use the addboard(1M) command to add the board to the domain.
The following example of the addboard(1M) command adds system board 2 to the domain specified by domain_id. Two retries are performed, if necessary, with a wait time of 10 minutes between retries.
% addboard -d domain_id -r 2 -t 600 SB2
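The relationship between the wait time in minutes and the -t value can be made explicit when the command line is assembled by a script. The following sketch assumes that -t is expressed in seconds, which is inferred from the example (600 for a 10-minute wait); the helper name `addboard_command` is hypothetical, and the code only builds the argument list without running anything.

```python
def addboard_command(domain_id, board, retries, wait_minutes):
    """Build the addboard argument list shown in the example above."""
    # Assumption: -t takes seconds (the example pairs 10 minutes with 600).
    timeout_seconds = wait_minutes * 60
    return [
        "addboard",
        "-d", domain_id,
        "-r", str(retries),
        "-t", str(timeout_seconds),
        board,
    ]


cmd = addboard_command("domain_id", "SB2", retries=2, wait_minutes=10)
print(" ".join(cmd))
```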
Deleting a board unconfigures its resources and removes the board from the domain.
You should always check the usage of the components on a board before you delete it from a domain. If the board hosts permanent memory, the memory is moved to another board within the same domain before the board is deleted from the domain. Likewise, if any busy devices are present, you must wait or ensure that the device is no longer being used by the system before you attempt to remove the board.
You must use the power command to power off the board before you physically remove it from the server. The deleteboard(1M) command does not power off the board. Refer to the power(1M) man page for information about the power command, or see the section "To Physically Replace a System Board". In addition, the Sun Enterprise 10000 Systems Service Manual contains complete information about physically removing and replacing boards.
Use the deleteboard(1M) command to delete the board from the domain.
The following example of the deleteboard(1M) command deletes system board 2 from its current domain. Two retries are performed, if necessary, with a wait time of 15 minutes between retries.
% deleteboard -r 2 -t 900 SB2
Moving a board from one domain to another removes the board from the first domain and then connects and configures it into the target domain.
You should always check the usage of the memory and devices on a board before you move it out of a domain. If the board hosts permanent memory, the memory must be moved to another board within the same domain before the board can be moved to another domain. Likewise, if any busy devices are present, you must wait or ensure that the device is no longer being used by the system before you attempt to move the board.
Use the moveboard(1M) command to move the board from one domain to another domain.
The following example of the moveboard(1M) command moves system board 2 from its current domain to the domain specified by domain_id. Two retries are performed, if necessary, with a wait time of 15 minutes between retries.
% moveboard -d domain_id -r 2 -t 900 SB2
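The checks recommended before moving a board (permanent memory and busy devices) can be written as a simple checklist. The following is an illustrative sketch only; the function `move_blockers` and its inputs are hypothetical, and the values would have to be read manually or by a script from showdevices(1M) output.

```python
def move_blockers(perm_mem_mb, busy_devices):
    """List the reasons a board is not yet ready to be moved."""
    blockers = []
    if perm_mem_mb > 0:
        blockers.append(
            f"{perm_mem_mb} MB of permanent memory must first be moved "
            "to another board in the same domain"
        )
    for device in busy_devices:
        blockers.append(f"device {device} is still in use")
    return blockers


# Example values taken from the sample showdevices output for board C1,
# plus one busy device chosen for illustration.
for reason in move_blockers(perm_mem_mb=933, busy_devices=["sd3"]):
    print(reason)
```

An empty list means that neither of the two conditions named in this section applies; it does not guarantee that the move will succeed.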
This section describes how to physically replace a board in a domain by using the commands described in this chapter.
In the following steps, system board 2 is removed from its current domain and replaced by system board 3.
Delete the board from the domain.
% deleteboard -r 2 -t 900 SB2
Power off system board 2.
Refer to the power(1M) man page for information about the power command.
% power -off -sb 2
For complete information about physically removing and replacing boards, refer to the Sun Enterprise 10000 Systems Service Manual. Failure to follow the stated procedures can result in damage to system boards and other components.
Physically remove system board 2 and replace it with system board 3.
Power on system board 3.
% power -on -sb 3
Add system board 3 to the domain.
% addboard -d domain_id -r 2 -t 900 SB3
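The order of the replacement steps can be captured as a plan that is reviewed before anything is run. The following sketch only builds the command lines used in this procedure. It assumes that addboard takes -d domain_id, as in the earlier addboard example; the function name `replacement_plan` is hypothetical, and the physical swap remains a manual step.

```python
def replacement_plan(domain_id, old_board, new_board, retries=2, timeout_seconds=900):
    """Return the ordered command lines for replacing one system board."""
    retry_args = ["-r", str(retries), "-t", str(timeout_seconds)]
    return [
        # 1. Delete the old board from its current domain.
        ["deleteboard", *retry_args, f"SB{old_board}"],
        # 2. Power off the old board (deleteboard does not do this).
        ["power", "-off", "-sb", str(old_board)],
        # 3. Physical removal and replacement happens here, by hand.
        # 4. Power on the replacement board.
        ["power", "-on", "-sb", str(new_board)],
        # 5. Add the replacement board to the domain.
        ["addboard", "-d", domain_id, *retry_args, f"SB{new_board}"],
    ]


plan = replacement_plan("domain_id", old_board=2, new_board=3)
for step in plan:
    print("% " + " ".join(step))
```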