CHAPTER 5

SMS DR Procedures - From the SC (High-End Only)

This chapter describes procedures for using DR from the Sun Fire high-end server system controller (SC), which runs the system management services (SMS) software.




Caution - Before you attempt to perform any DR operation on a board or component, determine its state and condition, as described in Preparing to Use DR.



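This chapter covers the following topics:

  • Showing Device Information
  • Showing Platform Information
  • Showing Board Information
  • Adding Boards
  • Deleting Boards
  • Moving Boards
  • Replacing Active System Boards
  • SMS DR Commands and Options
  • Error Message Help System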



Note - If an SMS DR command fails during a DR operation, the board does not return to its original state. If the error is recoverable, you can retry the command. If the error is unrecoverable, you must reboot the domain to use the board.



The SMS DR command rcfgadm(1M) works very much like cfgadm(1M) in the domain and accepts the same options. The main visible difference is that rcfgadm(1M) often requires an additional -d domain_id parameter. This chapter focuses on the other SMS commands. For information about rcfgadm(1M), see the rcfgadm(1M) section later in this chapter and the rcfgadm(1M) man page.


Showing Device Information

Before you perform any DR operation, and especially before you remove devices, use the SMS command showdevices(1M) to display device information.


To Show Device Information

  • Display device information for the domain.


# showdevices -v -d domain_id

showdevices(1M) displays information about all devices in the domain and produces output similar to that in the following tables.


TABLE 5-1 showdevices Sample Output, CPU
domain  board  id  state   speed  ecache  usage
A       SB1    40  online  400    4
A       SB1    41  online  400    4
A       SB1    42  online  400    4
A       SB1    43  online  400    4
A       SB2    55  online  400    4
A       SB2    56  online  400    4
A       SB2    57  online  400    4
A       SB2    58  online  400    4

TABLE 5-2 showdevices Sample Output, UltraSPARC IV+ (showdevices -d G)
domain  board  id   state    speed  ecache  usage
G       SB0    0    on-line  1050   8
G       SB0    1    on-line  1050   8
G       SB0    2    on-line  1050   8
G       SB0    3    on-line  1050   8
G       SB0    4    on-line  1050   8
G       SB0    5    on-line  1050   8
G       SB0    6    on-line  1050   8
G       SB0    7    on-line  1050   8
G       SB9    288  on-line  900    8
G       SB9    289  on-line  900    8
G       SB9    290  on-line  900    8
G       SB9    291  on-line  900    8
G       SB12   384  on-line  900    8
G       SB12   385  on-line  900    8
G       SB12   386  on-line  900    8
G       SB12   387  on-line  900    8

TABLE 5-3 showdevices Sample Output, Memory Drain In-Progress
domain  board  board mem MB  perm mem MB  base addr  domain mem MB  target board  deleted MB  remaining MB
A       SB1    2048          933          0x600000   4096           C2            250         1500
A       SB2    2048          0            0x200000   4096


TABLE 5-4 showdevices Sample Output, IO Devices
domain  board  device  resource           usage
A       101    sd0
A       101    sd1
A       101    sd2
A       101    sd3     /dev/dsk/c0t3d0s0  mounted from filesystem "/"
A       101    sd3     /dev/dsk/c0t3d0s1  dump device (swap)
A       101    sd3     /dev/dsk/c0t3d0s1  swap area
A       101    sd3     /dev/dsk/c0t3d0s3  mounted filesystem "/var"
A       101    sd3     /var/run           mounted filesystem "/var/run"
A       101    sd4
A       101    sd5

   

For more information, see the showdevices(1M) section later in this chapter. For a complete list of options and arguments, and for information about displaying device-specific information, see the showdevices(1M) man page.
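The check for in-use devices described in this section can be sketched as a small filter. This is only an illustration: the helper names `sample_io_output` and `busy_devices` are hypothetical, the sample rows are taken from TABLE 5-4, and the sketch assumes columns are separated by two or more spaces, which may not match the layout that showdevices(1M) prints on a live SC.

```shell
# Sample rows copied from TABLE 5-4. On a live SC the text would come from
# showdevices(1M) instead; the two-space column separation is an assumption.
sample_io_output() {
cat <<'EOF'
domain  board  device  resource           usage
A       101    sd0
A       101    sd3     /dev/dsk/c0t3d0s0  mounted from filesystem "/"
A       101    sd3     /dev/dsk/c0t3d0s1  swap area
A       101    sd4
EOF
}

# busy_devices BOARD  (hypothetical helper, not an SMS command)
# Reads the table on stdin and prints each device on BOARD that has a
# resource and usage entry, once per device.
busy_devices() {
    awk -F '  +' -v board="$1" '
        NR > 1 && $2 == board && NF >= 5 && !seen[$3]++ { print $3 }
    '
}

sample_io_output | busy_devices 101   # prints: sd3
```

A device that appears in this list is still in use, so it is worth resolving that usage before attempting to remove the board.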


Showing Platform Information

Before you attempt to add, move, or delete a board to or from a specific domain, use the showplatform(1M) command to determine the domain ID, the boards available to the domain, and the status of the domain.

You can use the domain ID with all DR commands. You can use the board list to determine the domain to which a specific board is assigned, and you can use the domain status to determine whether or not you can add, delete, or move a board to or from the domain. Use the showplatform(1M) command to determine whether the component is in the available component list (ACL).

You must have the appropriate privileges to use the showplatform(1M) command. See showplatform(1M) for more information, including a table that shows which user groups can use it.


To Show Platform Information

  • List domain and ACL information.


# showplatform

The showplatform(1M) command displays the domain ID, the ACL, and the status of the domain, as in the following example.


ACLs for domain domainA:
        slot0: SB0, SB1, SB2, SB3
        slot1: IO0, IO1, IO2, IO3
 
ACLs for domain domainB:
        slot0: None
        slot1: None
 
 
Domain        Solaris Nodename      Domain Status
 
domainA       sms3-b0               Powered Off
domainB       sms3-b1               Running Solaris
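As a rough sketch, the ACL portion of this output can be checked with a short awk filter. The helper names `sample_acl_output` and `board_in_acl` are hypothetical, and the sketch assumes the ACL block layout shown in the example above; it reads text on standard input and never contacts the SC.

```shell
# ACL block copied from the example output above. On a live SC the text
# would come from showplatform(1M) (assumption: same layout).
sample_acl_output() {
cat <<'EOF'
ACLs for domain domainA:
        slot0: SB0, SB1, SB2, SB3
        slot1: IO0, IO1, IO2, IO3

ACLs for domain domainB:
        slot0: None
        slot1: None
EOF
}

# board_in_acl DOMAIN BOARD  (hypothetical helper, not an SMS command)
# Exits 0 if BOARD is listed in the ACL block for DOMAIN, 1 otherwise.
board_in_acl() {
    awk -v dom="$1" -v board="$2" '
        /^ACLs for domain / {
            name = $4
            sub(/:$/, "", name)          # strip the trailing colon
            in_block = (name == dom)
            next
        }
        in_block && /^[ \t]*slot[0-9]+:/ {
            line = $0
            sub(/^[ \t]*slot[0-9]+:[ \t]*/, "", line)
            n = split(line, items, /,[ \t]*/)
            for (i = 1; i <= n; i++)
                if (items[i] == board) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

if sample_acl_output | board_in_acl domainA SB2; then
    echo "SB2 is in the ACL for domainA"
fi
```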


Showing Board Information

Before you attempt to delete or move a system board, you must query the board to determine its state and the domain to which it is assigned. For more information, including a table showing which user groups can use the command, see showboards(1M) and the showboards(1M) man page.

SC State Models

On the Sun Fire high-end server SC, a board can be in one of four states: unavailable, available, assigned, or active.



Note - The state of a board on the SC is not the same as the state of a board on the domain. For more information about board states on the domain, see DR Concepts.




TABLE 5-5 Board State Conditions on the Sun Fire High-End Systems SC

Name

Description

unavailable

The board is unavailable to the domain. The board has not been added to the ACL for the specified domain, or the board is currently assigned to another domain. Note that boards that are not in the ACL are invisible to the domain. In the unavailable state, the board is not considered part of the specified domain.

available

The board is available to be added to the domain. The board is in the ACL for the domain. Note that the board can be available to any number of domains. In the available state, the board is not considered to be part of the logical domain.

assigned

The board has been assigned to the domain, and might be in the domain's ACL. The board is unavailable to any other domain. In the assigned state, the board is considered to be part of the logical domain.

active

The board has been connected, or has been connected and configured into the Solaris OS, where it is available for use by the operating system. In the active state, the board is considered part of the physical domain.


The showboards(1M) Command

After you have determined the domain ID that contains the board that you want to delete or move, or after you have determined that a particular board has already been assigned to a specific domain, use the showboards(1M) command to determine the state of the board. The board might be in a state that makes it impossible for you to delete or move it.



Note - The output of the showboards(1M) command depends on the privileges of the user. For instance, the platform administrator can obtain information about all of the boards in the server. The domain administrator and domain configurator, however, can obtain the information about only those boards that are assigned and available to the domain(s) to which they have access. For more information, see showboards(1M) and the showboards(1M) man page.




To Show Board Information

  • Display board information for the domain.


# showboards -d domain_id

The command displays board information similar to the following:


Slot  Power  Board Type  Board Status  Test Status  Domain
SB0   On     CPU Board   Active        Passed       A
SB1   -      Empty Slot  Assigned      -            A


You can use the showboards(1M) command to display all assigned and available system boards, and all I/O boards in the domain. See the showboards(1M) man page for more information about showing board information.
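To read the state of one board out of a listing like the one above, a filter along the following lines could be used. The helper names `sample_showboards_output` and `board_status` are hypothetical, and the sketch assumes that columns are separated by two or more spaces; the layout printed by showboards(1M) on a live SC may differ.

```shell
# Sample rows taken from the showboards example above.
sample_showboards_output() {
cat <<'EOF'
Slot  Power  Board Type  Board Status  Test Status  Domain
SB0   On     CPU Board   Active        Passed       A
SB1   -      Empty Slot  Assigned      -            A
EOF
}

# board_status SLOT  (hypothetical helper, not an SMS command)
# Prints the Board Status column for SLOT; exits 1 if SLOT is not listed.
board_status() {
    awk -F '  +' -v slot="$1" '
        $1 == slot { print $4; found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

sample_showboards_output | board_status SB0   # prints: Active
```

Splitting on runs of two or more spaces keeps values such as "CPU Board" and "Empty Slot" together as single fields.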


Adding Boards

Adding a board to a domain moves the board through several state changes. If it is not already assigned, it is first assigned to the domain. Then, it is connected to the domain and configured into the Solaris OS. After it is connected, it is considered part of the physical domain and available for use by the operating system.

You must have the appropriate privileges to add a board to a domain. For more information, including a description of the privileges needed to use this command, see addboard(1M) and the addboard(1M) man page.



Note - Before you use DR to add a Capacity on Demand (COD) board into a domain, make sure the system has enough right-to-use (RTU) licenses available to the target domain to enable each active CPU on the COD board. Otherwise, DR displays a message for each CPU that cannot be enabled in the domain. For more information about the COD option, see the System Management Services (SMS) Administrator Guide.




To Add a Board to a Domain

  • Add the board to the domain.


# addboard -d domain_id board_id

The following example adds system board 2 (SB2) to domain A. Two retries are performed, if necessary, with a wait time of 10 minutes (600 seconds) between retries.


# addboard -d A -r 2 -t 600 SB2
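The -t value is given in seconds, so a wait expressed in minutes has to be converted first. The following sketch shows that arithmetic and assembles the command line from the example above. The helper names are hypothetical, the functions only print text and never run addboard(1M), and the upper bound on waiting assumes one wait precedes each retry.

```shell
# minutes_to_seconds MINUTES  (hypothetical helper)
# Converts a wait time in minutes to the seconds expected by -t.
minutes_to_seconds() {
    echo $(( $1 * 60 ))
}

# build_addboard_cmd DOMAIN BOARD RETRIES WAIT_MINUTES  (hypothetical helper)
# Prints an addboard command line that uses the -d, -r, and -t options.
build_addboard_cmd() {
    printf 'addboard -d %s -r %s -t %s %s\n' \
        "$1" "$3" "$(minutes_to_seconds "$4")" "$2"
}

# max_retry_wait RETRIES WAIT_SECONDS  (hypothetical helper)
# Upper bound on time spent waiting, assuming one wait before each retry.
max_retry_wait() {
    echo $(( $1 * $2 ))
}

build_addboard_cmd A SB2 2 10   # prints: addboard -d A -r 2 -t 600 SB2
max_retry_wait 2 600            # prints: 1200
```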



Note - If the addboard(1M) command fails during a DR operation, the board does not return to its original state. A dxs or dca error message is logged to the domain. If the error is recoverable, you can retry the command. If the error is unrecoverable, you must reboot the domain to use the board.




Deleting Boards

Deleting a board from a domain removes the board from the domain to which it is currently assigned, and in which it might be active. To delete a board, it must be in the assigned or active state.
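The state requirement can be expressed as a simple guard. This is a minimal sketch: `can_delete_board` is a hypothetical helper, not an SMS command, and it only encodes the four SC board states named in TABLE 5-5.

```shell
# can_delete_board STATE  (hypothetical helper)
# Exits 0 if a board in SC state STATE can be deleted from its domain,
# 1 if it cannot, and 2 if STATE is not one of the four SC states.
can_delete_board() {
    # Accept either capitalization, for example "Active" or "active".
    state=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$state" in
        assigned|active)       return 0 ;;
        unavailable|available) return 1 ;;
        *) echo "unknown board state: $1" >&2; return 2 ;;
    esac
}

if can_delete_board Active; then
    echo "board can be deleted"
fi
```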

Always check the usage of the components on a board before you delete it from a domain. If the board hosts permanent memory, the memory is moved to another board within the same domain before the board is deleted from the domain. Likewise, if any devices on the board are busy, wait until the system is no longer using them, or stop their use, before you attempt to remove the board.

A domain administrator can unconfigure and disconnect a board, but cannot unassign a board from a domain unless the board is in the ACL. For more information, including a description of privileges required to use this command, see deleteboard(1M) and the deleteboard(1M) man page.


To Delete a Board From a Domain

  • Delete the board from the domain.


# deleteboard board_id

The following example of the deleteboard(1M) command deletes system board 2 (SB2) from its current domain. Two retries are performed, if necessary, with a wait time of 15 minutes (900 seconds) between retries.


# deleteboard -r 2 -t 900 SB2



Note - If the deleteboard(1M) command fails during a DR operation, the board does not return to its original state. A dxs or dca error message is logged to the domain. If the error is recoverable, you can retry the command. If the error is unrecoverable, you must reboot the domain to use the board.




Moving Boards

Moving a board from one domain to another domain is performed in several steps. First, the board is removed from the domain to which it is currently assigned, and in which it might be active; the board must be in the assigned or active state. Next, it is assigned to the target domain. Then, it is connected to the target domain and configured into the Solaris OS, where it becomes available for use.

Always check the usage of the memory and devices on a board before you move it out of a domain. If the board hosts permanent memory, the memory must be moved to another board within the same domain before the board can be moved to another domain. Likewise, if any devices on the board are busy, wait until the system is no longer using them, or stop their use, before you attempt to move the board.

For more information, including a description of privileges required to use this command, see moveboard(1M) and the moveboard(1M) man page.



Note - Before you use DR to move a COD board into a domain, make sure the system has enough RTU licenses available to the target domain to enable each active CPU on the COD board. Otherwise, DR displays a message for each CPU that cannot be enabled in the domain. For more information about the COD option, see the System Management Services (SMS) Administrator Guide.




To Move a Board

  • Move the board from one domain to another domain.


# moveboard -d domain_id board_id

The following example of the moveboard(1M) command moves system board 2 (SB2) from its current domain to domain A. Two retries are performed, if necessary, with a wait time of 15 minutes (900 seconds) between retries.


# moveboard -d A -r 2 -t 900 SB2



Note - If the moveboard(1M) command fails during a DR operation, the board does not return to its original state. A dxs or dca error message is logged to the domain. If the error is recoverable, you can retry the command. If the error is unrecoverable, you must reboot the domain to use the board.




Replacing Active System Boards

This section describes how to replace a system board that is active in a domain.


To Replace an Active System Board

1. Delete the system board from its current domain.


# deleteboard board_id

The following example removes system board 2 (SB2) from its current domain:


# deleteboard -r 2 -t 900 SB2

2. Add the replacement board to the specified domain.


# addboard -d domain_id board_id

The following example adds system board 3 (SB3) to domain A. Two retries are performed, if necessary, with a wait time of 15 minutes (900 seconds) between retries.


# addboard -d A -r 2 -t 900 SB3
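The two steps above could be chained so that the add is attempted only if the delete succeeds. The sketch below is an illustration under stated assumptions: `run_step` and `replace_board` are hypothetical helpers, and the `DRY_RUN` variable is an invention of this example that defaults to printing the commands rather than executing them.

```shell
# run_step COMMAND [ARGS...]  (hypothetical helper)
# Prints the command when DRY_RUN is 1 (the default here); otherwise runs it.
run_step() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# replace_board DOMAIN OLD_BOARD NEW_BOARD  (hypothetical helper)
# Deletes OLD_BOARD from its current domain, then adds NEW_BOARD to DOMAIN.
# The retry count and wait time match the examples in this procedure.
replace_board() {
    run_step deleteboard -r 2 -t 900 "$2" || {
        echo "deleteboard failed for $2; not adding $3" >&2
        return 1
    }
    run_step addboard -d "$1" -r 2 -t 900 "$3"
}

replace_board A SB2 SB3
# prints:
#   deleteboard -r 2 -t 900 SB2
#   addboard -d A -r 2 -t 900 SB3
```

Stopping after a failed delete leaves the decision about retrying or rebooting the domain to the operator, in line with the notes on failed DR commands.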


SMS DR Commands and Options

This section contains descriptions of the SMS DR commands and related options. For more information about each SMS DR command, see the System Management Services (SMS) Reference Manual.

addboard(1M)

The addboard(1M) command attaches a board to a domain. See Adding Boards and the addboard(1M) man page for more information.


TABLE 5-6 addboard Command Options

Options and Operands

Specifies

board_id

The ID of the board to be added. The board ID corresponds to the board location. For example, SB2 is the board in slot 2. Multiple board identifiers are permitted.

-c function

Configure the board into the specified configuration state. You can add a board by steps. For example, you can assign the board, connect it, then configure it.

-d domain_id

Execute the DR operation in the specified domain.

-f

Force the specified action to occur. Typically, this is a hardware-specific override of a safety feature. Forcing a state change operation can allow use of the hardware resources of an occupant that is not in the ok or unknown conditions, at the discretion of any hardware-dependent safety checks.

-h

Display Help (usage) information.

-n

Answer No to all prompts.

-q

Run in quiet mode. Messages and prompts are not written to standard output. When used alone, -q defaults to the -n option for all prompts.

-r retry_count

If the operation fails, retry the specified number of times.

-t timeout

Wait the specified time, in seconds, between retries.

-y

Answer Yes to all prompts.


TABLE 5-7 describes the privileges needed to use the addboard(1M) command. The platform operator, platform service, and superuser groups cannot initiate this command.


TABLE 5-7 Privileges Needed to Use the addboard Command

Platform Admin

Can assign a board to a domain using the -c option with the assign function.

Domain Admin

Can connect or configure a board into a domain if the board has been assigned to the domain, or if it appears in the ACL for the domain and is not assigned to another domain.

Domain Configurator

Can connect or configure a board into a domain if the board has been assigned to the domain, or if it appears in the ACL for the domain and is not assigned to another domain.


The following example attaches system board 2 (SB2) to domainA. Two retries are performed, if necessary, with a wait time of 10 minutes (600 seconds) between retries.


# addboard -d domainA -r 2 -t 600 SB2



Note - If addboard(1M) fails during a DR operation, the board does not return to its original state. A dxs or dca error message is logged to the domain. If the error is recoverable, you can retry the command. If the error is unrecoverable, you must reboot the domain to use the board.



deleteboard(1M)

The deleteboard(1M) command detaches a board from a domain. See Deleting Boards and the deleteboard(1M) man page for more information.


TABLE 5-8 deleteboard Command Options

Options and Operands

Specifies

board_id

The ID of the board to be deleted. The board ID corresponds to the board location. For example, SB2 is the system board in slot 2. Multiple board identifiers are permitted.

-c function

Configure the board into the specified configuration state. You can delete a board by steps. For example, you can unconfigure the board, disconnect it, and then unassign it.

-f

Force the specified action to occur. Typically, this is a hardware-specific override of a safety feature. Forcing a state change operation can allow use of the hardware resources of an occupant that is not in the ok or unknown conditions, at the discretion of any hardware-dependent safety checks.

-h

Display Help (usage) information.

-n

Answer No to all prompts.

-q

Run in quiet mode. Messages and prompts are not written to standard output. When used alone, -q defaults to the -n option for all prompts.

-r retry_count

If the operation fails, retry the specified number of times.

-t timeout

Wait the specified time, in seconds, between retries.

-y

Answer Yes to all prompts.


TABLE 5-9 describes the privileges needed to use the deleteboard(1M) command. The platform operator, platform service, and superuser groups cannot initiate this command.


TABLE 5-9 Privileges Needed to Use the deleteboard Command

Platform Admin

Can unassign boards that are not active in a domain by using the -c option with the unassign function. If the user also has domain privileges, deleteboard also unconfigures and disconnects the board before it unassigns it.

Domain Admin

Can unconfigure, disconnect, or unassign a board from the domain. The board can be unassigned from the domain only if it appears in the ACL.

Domain Configurator

Can unconfigure, disconnect, or unassign a board from the domain. The board can be unassigned from the domain only if it appears in the ACL.


The following example of the deleteboard(1M) command detaches system board 2 (SB2) from its current domain. The command specifies two retries at 15-minute (900-second) intervals.


# deleteboard -r 2 -t 900 SB2



Note - If deleteboard(1M) fails during a DR operation, the board does not return to its original state. A dxs or dca error message is logged to the domain. If the error is recoverable, you can retry the command. If the error is unrecoverable, you must reboot the domain to use the board.



moveboard(1M)

The moveboard(1M) command detaches a board from a domain, then attaches it to another domain. See Moving Boards and the moveboard(1M) man page for more information.


TABLE 5-10 moveboard Command Options

Options and Operands

Specifies

board_id

The ID of the board to be moved. The board ID corresponds to the board location. For example, SB2 is the system board in slot 2. Multiple board identifiers are permitted.

-c function

Configure the board into the specified configuration state. You can move a board by steps. For example, you can assign the board, connect it, and then configure it.

-d domain_id

Execute the DR operation on the specified domain.

-f

Force the specified action to occur. Typically, this is a hardware-specific override of a safety feature. Forcing a state change operation can allow use of the hardware resources of an occupant that is not in the ok or unknown conditions, at the discretion of any hardware-dependent safety checks.

-h

Display Help (usage) information.

-n

Answer No to all prompts.

-q

Run in quiet mode. Messages and prompts are not written to standard output. When used alone, -q defaults to the -n option for all prompts.

-r retry_count

If the operation fails, retry the specified number of times.

-t timeout

Wait the specified time, in seconds, between retries.

-y

Answer Yes to all prompts.


TABLE 5-11 describes the privileges needed to use the moveboard(1M) command. The platform operator, platform service, and superuser groups cannot initiate this command.


TABLE 5-11 Privileges Needed to Use the moveboard Command

Platform Admin

Can re-assign boards from one domain to another domain by using the -c option with the assign function. The board cannot be active in the domain from which it is being re-assigned.

Domain Admin

Can assign, connect, or configure a board that is in another domain. If the board is active in another domain, the moveboard command unconfigures and disconnects the board from that domain. The board must be in the ACL in order to unassign and re-assign it using moveboard. The moveboard command can connect and configure the board.

The domain administrator must have domain privileges for both domains to use the moveboard(1M) command.

Domain Configurator

Can assign, connect, or configure a board that is in another domain. If the board is active in another domain, the moveboard command unconfigures and disconnects the board from that domain. The board must be in the ACL in order to unassign and re-assign it using moveboard. The moveboard command can connect and configure the board.

The domain configurator must have domain privileges for both domains to use the moveboard(1M) command.


The following example of the moveboard(1M) command moves system board 5 (SB5) from its current domain to domain B. The command specifies two retries at 15-minute (900-second) intervals.


# moveboard -d domainB -r 2 -t 900 SB5



Note - If the moveboard(1M) command fails during a DR operation, the board does not return to its original state. A dxs or dca error message is logged to the domain. If the error is recoverable, you can retry the command. If the error is unrecoverable, you must reboot the domain to use the board.



rcfgadm(1M)

The rcfgadm(1M) command performs DR operations from the SC, providing remote configuration administration operations on attachment points, which are device nodes in the device tree. See the rcfgadm(1M) man page for more information and examples of how to use this command.

TABLE 5-12 describes the rcfgadm(1M) command options and operands.


TABLE 5-12 rcfgadm Command Options

Options and Operands

Specifies

-a

List dynamic attachment points.

-c function

Configure the board into the specified configuration state: connect, disconnect, configure, or unconfigure.

-d domain_id

Execute the DR operation on the specified domain.

-f

Force the specified action.

-h

-h ap_id

-h ap_type

Print the specified help message. If ap_id or ap_type is given, display the hardware-specific help for the attachment point.

-l ap_id | ap_type

List the state and condition of the specified attachment points.

-n

Answer No to all prompts.

-o hardware_options

Use the specified hardware-specific options.

-r retry_count

If the operation fails, retry the specified number of times.

-s listing_options

List the specified listing options.

-T timeout

Wait the specified time, in seconds, between retries.

-t

Test one or more attachment points.

-v

Execute in verbose mode.

-x hardware_function

Use hardware-specific functions.

-y

Answer Yes to all prompts.
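As a rough illustration of how the -d and -c options from TABLE 5-12 combine, the sketch below validates the -c function against the four values the table lists and prints a command line. The helper `build_rcfgadm_cmd` is hypothetical, the attachment point ID SB2 is a placeholder, and nothing is executed; see the rcfgadm(1M) man page for the authoritative syntax.

```shell
# build_rcfgadm_cmd DOMAIN FUNCTION AP_ID  (hypothetical helper)
# Prints an rcfgadm command line, or fails if FUNCTION is not one of the
# -c functions listed in TABLE 5-12.
build_rcfgadm_cmd() {
    case "$2" in
        connect|disconnect|configure|unconfigure) ;;
        *) echo "unsupported -c function: $2" >&2; return 1 ;;
    esac
    printf 'rcfgadm -d %s -c %s %s\n' "$1" "$2" "$3"
}

build_rcfgadm_cmd A configure SB2   # prints: rcfgadm -d A -c configure SB2
```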


TABLE 5-13 describes the privileges needed to use the rcfgadm(1M) command. The platform operator, platform service, and superuser groups cannot initiate this command.


TABLE 5-13 Privileges Needed to Use the rcfgadm Command

Platform Admin

Can assign boards to, or unassign them from, a domain by using the -x option with the assign or unassign function, respectively. To use the unassign function, the board must be assigned and cannot be active in a running domain.

Domain Admin

Can disconnect, connect, configure, or unconfigure a board to or from the domain. Can assign or unassign a board if the board is in the domain's ACL.

Domain Configurator

Can disconnect, connect, configure, or unconfigure a board to or from the domain. Can assign or unassign a board if the board is in the domain's ACL.




Note - If rcfgadm(1M) fails during a DR operation, the board does not return to its original state. A dxs or dca error message is logged to the domain. If the error is recoverable, you can retry the command. If the error is unrecoverable, you must reboot the domain to use the board.



scdrhelp(1M)

The scdrhelp(1M) shell script starts the Sun Fire high-end server dynamic reconfiguration error help system. The help system uses the JavaHelp™ hsviewer script.

All user privilege groups can use this command except domain administrator and domain configurator.

See Error Message Help System and the scdrhelp(1M) man page for more information about this script.

showboards(1M)

The showboards(1M) command displays assignment information and status of system boards in a domain, and indicates whether a board is a Capacity On Demand (COD) board. See Showing Board Information and the showboards(1M) man page for more information.

Although showboards(1M) is not a DR-specific command, Sun suggests you use it with DR commands. TABLE 5-14 describes the showboards(1M) command options.


TABLE 5-14 showboards Command Options

Option

Specifies

-d domain_id

Execute the DR operation on the specified domain.

-h

Display Help (usage) information.

-v

Execute in verbose mode. In this mode the command displays all components, including domain configurable units (DCUs), which include CPUs, PCIs, and SCs.


All user privilege groups can use this command, but domain administrators and domain configurators can show boards only in the domains for which they have privileges.

showdevices(1M)

The showdevices(1M) command displays the configured physical devices on system boards and the resources made available by these devices. Although the showdevices(1M) command is not DR-specific, Sun suggests you use it with DR commands. See Showing Device Information and the showdevices(1M) man page for more information.

Usage information is provided by applications and subsystems that are actively managing system resources. To see the predicted impact of a system board DR operation, do an offline query of managed resources.


TABLE 5-15 showdevices Command Options

Options and Operands

Specifies

board_id

The ID of the board for which to display device information. The board ID corresponds to the board location. For example, SB2 is the system board in slot 2. Multiple board identifiers are permitted.

-d domain_id

Execute the DR operation in the specified domain.

-h

Display Help (usage) information.

-p reports

Show offline query information.

-v

Display information about all I/O devices.


Only the domain administrator and the domain configurator can display device information about a domain, and only for domains for which they have privileges.

showplatform(1M)

The showplatform(1M) command shows the ACL, domain state for each domain, and Capacity on Demand (COD) information. Although the showplatform(1M) command is not DR-specific, Sun suggests you use it with DR commands. See Showing Platform Information and the showplatform(1M) man page for more information.


TABLE 5-16 showplatform Command Options

Options and Operands

Specifies

-d domain_id

Execute the DR operation in the specified domain.

-h

Display Help (usage) information.

-p domains | available | ethernet | cod

Display reports that include information about COD, grouped as specified by:

  • domain state (domains)
  • domain ACL (available)
  • domain ethernet addresses (ethernet)

-v

Display all available command information.


All user privilege groups except the platform service and superuser groups can use this command. However, domain administrators and domain configurators can show platform information only for domains for which they have privileges.


Error Message Help System

The SMS software contains an error message help system that you can use to find a description and recovery procedure for a specific error message.

To start the DR error message help system, use the following command:


# /opt/SUNWSMS/jh/scdrhelp/scdrhelp &

The standard JavaHelp system viewer, hsviewer, displays the DR error messages help system. The viewer consists of a toolbar and two panes: the content pane and the navigation pane, as shown in FIGURE 5-1.


FIGURE 5-1 hsviewer GUI Components


JavaHelp Table of Contents

DR error messages are separated into logical groups according to error type, as shown in FIGURE 5-1. These groups represent the major topics that appear as the top-level headings in the table of contents. Error message numbers and/or abbreviated text appear under their respective group name.

JavaHelp Index

DR error messages are indexed so that key topics are represented in the Index display (FIGURE 5-2). Index topics are embedded when appropriate. For these topics, only the embedded topics are links to error messages.


FIGURE 5-2 JavaHelp Index Display


JavaHelp Search

The DR error messages help system provides a full-text search function. The search database is constructed by the indexing of error message help files.

To find a specific error message, search on a distinctive string of text from the message rather than on the whole message. Avoid numeric values, because they are treated as replaceable text. FIGURE 5-3 shows the search display.


FIGURE 5-3 JavaHelp Search Display