CHAPTER 2
Using DR 3.0 Model
This chapter contains information about using DR model 3.0 on a Sun Enterprise 10000 system that is running version 3.5 of the SSP software and one of the following versions of the Solaris operating environment: Solaris 8 10/01, Solaris 8 02/02, or Solaris 9.
DR model 3.0 uses the domain configuration server, dcs(1M), to control DR operations. DR 3.0 includes Automated DR (ADR) commands such as addboard(1M), deleteboard(1M), and moveboard(1M). DR 3.0 also includes the following commands:
rcfgadm(1M)--displays the status of attachment points on the domain. (See also cfgadm_sbd(1M) for more information.)
cfgadm(1M)--displays the status of attachment points on the domain. (See also cfgadm_sbd(1M) for more information.)
Automatic DR enables an application to perform DR operations without requiring user interaction. This ability is provided by an enhanced DR framework that includes the reconfiguration coordination manager (RCM) and a system event facility called sysevent. The RCM enables application-specific loadable modules to register with it for callbacks. The callbacks perform preparatory tasks before a DR operation, error recovery during it, and clean-up after it. The sysevent facility enables applications to register for notification of system events. The automatic DR framework interfaces with the RCM and with sysevent to notify applications so that they give up resources automatically before the resources are unconfigured, and capture new resources as they are configured into the domain.
The DR feature enables you to hot-swap system boards without bringing the system down. DR is used to unconfigure the resources on a faulty system board from a domain so that the system board can be removed from the system. The repaired (or replacement) board can be inserted into the domain while the Solaris operating environment is running. DR then configures the resources on the board into the domain.
If you need to remove a board with I/O devices from a domain temporarily and then re-add it before any other boards with I/O devices are added, reconfiguration is unnecessary. In this case, device paths to the board devices remain unchanged. However, if you add another board with I/O devices after the first was removed, then re-add the first board, reconfiguration is required because the paths to the devices on the first board have changed.
The Sun Enterprise 10000 system can be divided into domains that contain system boards and the components that are connected to the boards, such as CPUs, memory chips, and CompactPCI cards. Each domain is electrically isolated into hardware partitions, which ensures that a hardware or software failure in one domain does not affect the other domains in the system.
This section contains procedures that describe how to use the DR 3.0 commands. The following procedures are included:
Adding a board to a domain moves the board through several state changes. First the board is connected to the domain, and then it is configured into the Solaris operating environment. After the board is connected, it is considered to be part of the physical domain and available for use by the operating system.
The following example shows how the addboard(1M) command adds system board 2 to the domain specified by domain_id. Two retries are performed, if necessary, with a wait time of 10 minutes (600 seconds) between retries.
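As a minimal sketch of the command line described above, the following fragment assembles and prints an addboard invocation without running it. The -b (board), -d (domain), -r (retry count), and -t (wait time in seconds) options are assumed here; confirm them against the addboard(1M) man page on your SSP. The variable names are illustrative only.

```shell
# Assumed option letters: -b board, -d domain, -r retries, -t wait in seconds.
# Verify against addboard(1M) before use. Nothing is executed on the SSP here.
board=2                   # system board to add
domain=domain_id          # placeholder for the target domain name
retries=2                 # number of retries if the operation fails
wait_secs=$((10 * 60))    # 10 minutes between retries, in seconds

cmd="addboard -b $board -d $domain -r $retries -t $wait_secs"
echo "$cmd"
```

Printing the command first lets you review the board number and timing before you type it at the SSP prompt.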
Always check the usage of the components on a board before you delete it from a domain. If the board hosts permanent memory, the memory is moved to another board within the same domain before the board is deleted from the domain. Likewise, if any busy devices are present, you must wait or ensure that the device is no longer being used by the system before you attempt to remove the board.
Caution - You must use the power command to power off the board before physically removing it from the system. The deleteboard(1M) command does not power off the board. Refer to the power(1M) man page for information about the power command. Also see the section "To Physically Replace a System Board." In addition, the Sun Enterprise 10000 Systems Service Manual contains complete information about physically removing and replacing boards.
The following example of the deleteboard(1M) command deletes system board 2 from its current domain. Two retries are performed, if necessary, with a wait time of 15 minutes (900 seconds) between retries.
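The deletion described above might be expressed as follows. This is a sketch that only prints the command line; the -b, -r, and -t options are assumed, so check deleteboard(1M) before relying on them. No domain argument is shown because the board is deleted from whichever domain currently owns it.

```shell
# Assumed option letters: -b board, -r retries, -t wait in seconds.
# Verify against deleteboard(1M). Nothing is executed on the SSP here.
board=2                   # system board to delete
retries=2                 # number of retries if the operation fails
wait_secs=$((15 * 60))    # 15 minutes between retries, in seconds

cmd="deleteboard -b $board -r $retries -t $wait_secs"
echo "$cmd"
```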
Always check memory usage on a board, and the devices that are connected to it, before moving it out of a domain. If the board hosts permanent memory, the memory must be moved to another board within the same domain before the board can be moved to another domain. Likewise, if a busy device is present, you must wait until the device is no longer being used by the system before you attempt to move the board.
The following example of the moveboard(1M) command moves system board 2 from its current domain to the domain specified by domain_id. Two retries are performed, if necessary, with a wait time of 15 minutes (900 seconds) between retries.
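A minimal sketch of the move described above, assuming that moveboard accepts the -b, -d, -r, and -t options. Confirm the options in moveboard(1M). The fragment prints the command rather than running it.

```shell
# Assumed option letters: -b board, -d destination domain, -r retries,
# -t wait in seconds. Verify against moveboard(1M).
board=2                   # system board to move
domain=domain_id          # placeholder for the destination domain name
retries=2                 # number of retries if the operation fails
wait_secs=$((15 * 60))    # 15 minutes between retries, in seconds

cmd="moveboard -b $board -d $domain -r $retries -t $wait_secs"
echo "$cmd"
```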
In the following steps, system board 2 is removed from its current domain and replaced by system board 3. Two retries are performed, if necessary, with a wait time of 15 minutes (900 seconds) between retries.
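The overall sequence can be outlined as a dry run. This sketch assumes the -b, -d, -r, and -t options for deleteboard and addboard, and assumes that the same retry settings apply to both commands. The power(1M) step is shown only as a reminder, because its options are not given here. Nothing is executed on the SSP.

```shell
# Dry-run outline of replacing board 2 with board 3. Option letters are
# assumptions; verify them against deleteboard(1M) and addboard(1M).
old_board=2
new_board=3
domain=domain_id          # placeholder for the domain that receives board 3
retries=2
wait_secs=$((15 * 60))    # 15 minutes between retries, in seconds

step1="deleteboard -b $old_board -r $retries -t $wait_secs"
step2="power off board $old_board, then swap boards (see power(1M) and the service manual)"
step3="addboard -b $new_board -d $domain -r $retries -t $wait_secs"

printf '%s\n' "1. $step1" "2. $step2" "3. $step3"
```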
Caution - For complete information about physically removing and replacing boards, refer to the Sun Enterprise 10000 Systems Service Manual. Failure to follow the procedures described therein can cause damage to system boards and other components.