This chapter provides the procedures for shutting down and booting a cluster and individual cluster nodes.
For a high-level description of the related procedures in this chapter, see Table 2–1 and Table 2–2.
The Sun Cluster scshutdown(1M) command stops cluster services in an orderly fashion and cleanly shuts down the entire cluster. You might use the scshutdown command when moving a cluster to a new location. You can also use the command to shut down the cluster if you have data corruption caused by an application error.
Use scshutdown instead of the shutdown or halt commands to ensure proper shutdown of the entire cluster. The Solaris shutdown command is used with the scswitch(1M) command to shut down individual nodes. See How to Shut Down a Cluster or Shutting Down and Booting a Single Cluster Node for more information.
The scshutdown command stops all nodes in a cluster by:
Taking all running resource groups offline.
Unmounting all cluster file systems.
Shutting down active device services.
Running init 0 and bringing all nodes to the OBP ok prompt.
If necessary, you can boot a node in non-cluster mode so that the node does not participate in cluster membership. Non-cluster mode is useful when installing cluster software or for performing certain administrative procedures. See How to Boot a Cluster Node in Non-Cluster Mode for more information.
Task | For Instructions |
---|---|
Stop the cluster - Use scshutdown(1M) | |
Start the cluster by booting all nodes. The nodes must have a working connection to the cluster interconnect to attain cluster membership. | |
Reboot the cluster - Use scshutdown. At the ok prompt, boot each node individually with the boot(1M) command. The nodes must have a working connection to the cluster interconnect to attain cluster membership. | |
Do not use send brk on a cluster console to shut down a cluster node. The command is not supported within a cluster. If you use send brk and then enter go at the ok prompt to reboot, the node panics.
If your cluster is running Oracle® Parallel Server or Real Application Clusters, shut down all instances of the database.
Refer to the Oracle Parallel Server/Real Application Clusters product documentation for shutdown procedures.
Become superuser on any node in the cluster.
Shut down the cluster immediately to OBP.
From a single node in the cluster, type the following command.
    # scshutdown -g0 -y
Verify that all nodes have reached the ok prompt.
Do not power off any nodes until all cluster nodes are at the ok prompt.
If necessary, power off the nodes.
The following example shows the console output when stopping normal cluster operation and bringing down all nodes to the ok prompt. The -g0 option sets the shutdown grace period to zero, and the -y option provides an automatic yes response to the confirmation question. Shutdown messages also appear on the consoles of the other nodes in the cluster.
    # scshutdown -g0 -y
    May 2 10:08:46 phys-schost-1 cl_runtime: WARNING: CMM monitoring disabled.
    phys-schost-1#
    INIT: New run level: 0
    The system is coming down. Please wait.
    System services are now being stopped.
    /etc/rc0.d/K05initrgm: Calling scswitch -S (evacuate)
    The system is down.
    syncing file systems... done
    Program terminated
    ok
See How to Boot a Cluster to restart a cluster that has been shut down.
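The shutdown command shown in this procedure can be wrapped in a small helper that validates the grace period before anything is run. The following is a minimal sketch, not part of Sun Cluster: the function name build_scshutdown_cmd is hypothetical, and the function only prints the command line so that it can be reviewed first. The sketch assumes that the grace period is a whole number of seconds, as with -g0 in the example.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Sun Cluster): print the scshutdown
# command line for a given grace period instead of running it.
build_scshutdown_cmd() {
    grace=${1:-0}
    # Accept only non-negative integers for the grace period.
    case $grace in
        ''|*[!0-9]*)
            echo "grace period must be a non-negative integer: $grace" >&2
            return 1
            ;;
    esac
    # -g sets the grace period; -y answers yes to the confirmation question.
    echo "scshutdown -g$grace -y"
}

# Example: show the command used in this chapter (grace period zero).
build_scshutdown_cmd 0
```

Printing the command rather than executing it keeps the sketch safe to try on any system. Run the printed command as superuser from a single cluster node, as described in the procedure.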
To start a cluster whose nodes have been shut down and are at the ok prompt, boot(1M) each node.
If you make configuration changes between shutdowns, start the node with the most current configuration first. Except in this situation, the boot order of the nodes does not matter.
    ok boot
Messages are displayed on the booted nodes' consoles as cluster components are activated.
Cluster nodes must have a working connection to the cluster interconnect to attain cluster membership.
Verify that the nodes booted without error and are online.
The scstat(1M) command reports the nodes' status.
    # scstat -n
If a cluster node's /var file system fills up, Sun Cluster might not be able to restart on that node. If this problem arises, see How to Repair a Full /var File System.
The following example shows the console output when booting node phys-schost-1 into the cluster. Similar messages appear on the consoles of the other nodes in the cluster.
    ok boot
    Rebooting with command: boot
    ...
    Hostname: phys-schost-1
    Booting as part of a cluster
    NOTICE: Node 1 with votecount = 1 added.
    NOTICE: Node 2 with votecount = 1 added.
    NOTICE: Node 3 with votecount = 1 added.
    ...
    NOTICE: Node 1: attempting to join cluster
    ...
    NOTICE: Node 2 (incarnation # 937690106) has become reachable.
    NOTICE: Node 3 (incarnation # 937690290) has become reachable.
    NOTICE: cluster has reached quorum.
    NOTICE: node 1 is up; new incarnation number = 937846227.
    NOTICE: node 2 is up; new incarnation number = 937690106.
    NOTICE: node 3 is up; new incarnation number = 937690290.
    NOTICE: Cluster members: 1 2 3
    ...
Run the scshutdown(1M) command to shut down the cluster, then boot the cluster with the boot(1M) command on each node.
(Optional) For a cluster that is running Oracle Parallel Server/Real Application Clusters, shut down all instances of the database.
Refer to the Oracle Parallel Server/Real Application Clusters product documentation for shutdown procedures.
Become superuser on any node in the cluster.
Shut down the cluster to OBP.
From a single node in the cluster, type the following command.
    # scshutdown -g0 -y
Each node is shut down to the ok prompt.
Cluster nodes must have a working connection to the cluster interconnect to attain cluster membership.
Boot each node.
The order in which the nodes are booted does not matter unless you make configuration changes between shutdowns. If you make configuration changes between shutdowns, start the node with the most current configuration first.
    ok boot
Messages appear on the booted nodes' consoles as cluster components are activated.
Verify that the nodes booted without error and are online.
The scstat command reports the nodes' status.
    # scstat -n
If a cluster node's /var file system fills up, Sun Cluster might not be able to restart on that node. If this problem arises, see How to Repair a Full /var File System.
The following example shows the console output when stopping normal cluster operation, bringing down all nodes to the ok prompt, then restarting the cluster. The -g0 option sets the grace period to zero, and the -y option provides an automatic yes response to the confirmation question. Shutdown messages also appear on the consoles of other nodes in the cluster.
    # scshutdown -g0 -y
    May 2 10:08:46 phys-schost-1 cl_runtime: WARNING: CMM monitoring disabled.
    phys-schost-1#
    INIT: New run level: 0
    The system is coming down. Please wait.
    ...
    The system is down.
    syncing file systems... done
    Program terminated
    ok boot
    Rebooting with command: boot
    ...
    Hostname: phys-schost-1
    Booting as part of a cluster
    ...
    NOTICE: Node 1: attempting to join cluster
    ...
    NOTICE: Node 2 (incarnation # 937690106) has become reachable.
    NOTICE: Node 3 (incarnation # 937690290) has become reachable.
    NOTICE: cluster has reached quorum.
    ...
    NOTICE: Cluster members: 1 2 3
    ...
    NOTICE: Node 1: joined cluster
    ...
    The system is coming up. Please wait.
    checking ufs filesystems
    ...
    reservation program successfully exiting
    Print services started.
    volume management starting.
    The system is ready.
    phys-schost-1 console login:
Use the scswitch(1M) command in conjunction with the Solaris shutdown(1M) command to shut down an individual node. Use the scshutdown command only when shutting down an entire cluster.
Task | For Instructions |
---|---|
Stop a cluster node - Use scswitch(1M) and shutdown(1M) | |
Start a node. The node must have a working connection to the cluster interconnect to attain cluster membership. | |
Stop and restart (reboot) a cluster node - Use scswitch and shutdown. The node must have a working connection to the cluster interconnect to attain cluster membership. | |
Boot a node so that the node does not participate in cluster membership - Use scswitch and shutdown, then boot -x | |
Do not use send brk on a cluster console to shut down a cluster node. Using send brk and entering go at the ok prompt to reboot causes a node to panic. This functionality is not supported within a cluster.
If you are running Oracle Parallel Server/Real Application Clusters, shut down all instances of the database.
Refer to the Oracle Parallel Server/Real Application Clusters product documentation for shutdown procedures.
Become superuser on the cluster node to be shut down.
Switch all resource groups, resources, and device groups from the node being shut down to other cluster members.
On the node to be shut down, type the following command.
    # scswitch -S -h node
-S: Evacuates all device services and resource groups from the specified node.
-h node: Specifies the node from which you are switching resource groups and device groups.
Shut down the cluster node to OBP.
On the node to be shut down, type the following command.
    # shutdown -g0 -y -i0
Verify that the cluster node has reached the ok prompt.
If necessary, power off the node.
The following example shows the console output when shutting down node phys-schost-1. The -g0 option sets the grace period to zero, -y provides an automatic yes response to the confirmation question, and -i0 invokes run level 0 (zero). Shutdown messages for this node appear on the consoles of other nodes in the cluster.
    # scswitch -S -h phys-schost-1
    # shutdown -g0 -y -i0
    May 2 10:08:46 phys-schost-1 cl_runtime: WARNING: CMM monitoring disabled.
    phys-schost-1#
    INIT: New run level: 0
    The system is coming down. Please wait.
    Notice: rgmd is being stopped.
    Notice: rpc.pmfd is being stopped.
    Notice: rpc.fed is being stopped.
    umount: /global/.devices/node@1 busy
    umount: /global/phys-schost-1 busy
    The system is down.
    syncing file systems... done
    Program terminated
    ok
See How to Boot a Cluster Node to restart a cluster node that has been shut down.
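The order of the two commands in this procedure matters: resource groups and device groups are evacuated first, and only then is the node shut down. The following is a minimal sketch of that sequence, assuming a hypothetical helper named node_shutdown_cmds that prints the commands for review rather than running them. The run-level argument is an addition for illustration: 0 halts the node to the ok prompt, and 6 reboots the node.

```shell
#!/bin/sh
# Hypothetical helper: print the two commands used to shut down one node,
# in the order in which they must be run.
node_shutdown_cmds() {
    node=$1
    runlevel=${2:-0}    # 0 = halt to the ok prompt, 6 = reboot
    if [ -z "$node" ]; then
        echo "usage: node_shutdown_cmds node [0|6]" >&2
        return 1
    fi
    case $runlevel in
        0|6) ;;
        *) echo "unsupported run level: $runlevel" >&2; return 1 ;;
    esac
    # Step 1: evacuate resource groups and device groups from the node.
    echo "scswitch -S -h $node"
    # Step 2: shut the node down with no grace period and no prompt.
    echo "shutdown -g0 -y -i$runlevel"
}

# Example: the commands used in this chapter for node phys-schost-1.
node_shutdown_cmds phys-schost-1 0
```

Both printed commands are meant to be run as superuser on the node that is being shut down.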
Starting a cluster node can be affected by the quorum configuration. In a two-node cluster, you must have a quorum device configured so that the total quorum count for the cluster is three. You should have one quorum count for each node and one quorum count for the quorum device. In this situation, if the first node is shut down, the second node continues to have quorum and runs as the sole cluster member. For the first node to come back in the cluster as a cluster node, the second node must be up and running. The required cluster quorum count (two) must be present.
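The quorum arithmetic in the preceding paragraph can be checked with a few lines of shell. This is an illustrative sketch only. It assumes that the required quorum count is a simple majority of the total configured quorum count, which matches the two-node example (a total count of three, with two required). The function names required_quorum and has_quorum are hypothetical.

```shell
#!/bin/sh
# Majority of the configured votes: more than half of the total.
required_quorum() {
    total=$1
    echo $(( total / 2 + 1 ))
}

# Succeeds (exit status 0) when the votes present reach the required count.
has_quorum() {
    present=$1
    total=$2
    [ "$present" -ge "$(required_quorum "$total")" ]
}

# Two nodes plus one quorum device: a total quorum count of 3.
echo "required: $(required_quorum 3)"
# One node up and holding the quorum device: 2 votes present.
if has_quorum 2 3; then echo "quorum held"; else echo "no quorum"; fi
```

With one node up and the quorum device counted, two of the three votes are present, so the remaining node keeps quorum. A single vote out of three would not be enough.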
To start a cluster node that has been shut down, boot the node.
    ok boot
Messages are displayed on all node consoles as cluster components are activated.
A cluster node must have a working connection to the cluster interconnect to attain cluster membership.
Verify that the node has booted without error, and is online.
The scstat(1M) command reports the status of a node.
    # scstat -n
If a cluster node's /var file system fills up, Sun Cluster might not be able to restart on that node. If this problem arises, see How to Repair a Full /var File System.
The following example shows the console output when booting node phys-schost-1 into the cluster.
    ok boot
    Rebooting with command: boot
    ...
    Hostname: phys-schost-1
    Booting as part of a cluster
    ...
    NOTICE: Node 1: attempting to join cluster
    ...
    NOTICE: Node 1: joined cluster
    ...
    The system is coming up. Please wait.
    checking ufs filesystems
    ...
    reservation program successfully exiting
    Print services started.
    volume management starting.
    The system is ready.
    phys-schost-1 console login:
If the cluster node is running Oracle Parallel Server/Real Application Clusters, shut down all instances of the database.
Refer to the Oracle Parallel Server/Real Application Clusters product documentation for shutdown procedures.
Become superuser on the cluster node to be shut down.
Shut down the cluster node by using the scswitch and shutdown commands.
Enter these commands on the node to be shut down. The -i 6 option with the shutdown command causes the node to reboot after the node shuts down to the ok prompt.
    # scswitch -S -h node
    # shutdown -g0 -y -i6
Cluster nodes must have a working connection to the cluster interconnect to attain cluster membership.
Verify that the node has booted without error, and is online.
    # scstat -n
The following example shows the console output when rebooting node phys-schost-1. Messages for this node, such as shutdown and startup notification, appear on the consoles of other nodes in the cluster.
    # scswitch -S -h phys-schost-1
    # shutdown -g0 -y -i6
    May 2 10:08:46 phys-schost-1 cl_runtime: WARNING: CMM monitoring disabled.
    phys-schost-1#
    INIT: New run level: 6
    The system is coming down. Please wait.
    System services are now being stopped.
    Notice: rgmd is being stopped.
    Notice: rpc.pmfd is being stopped.
    Notice: rpc.fed is being stopped.
    umount: /global/.devices/node@1 busy
    umount: /global/phys-schost-1 busy
    The system is down.
    syncing file systems... done
    rebooting...
    Resetting ...
    ,,,
    Sun Ultra 1 SBus (UltraSPARC 143MHz), No Keyboard
    OpenBoot 3.11, 128 MB memory installed, Serial #5932401.
    Ethernet address 8:8:20:99:ab:77, Host ID: 8899ab77.
    ...
    Rebooting with command: boot
    ...
    Hostname: phys-schost-1
    Booting as part of a cluster
    ...
    NOTICE: Node 1: attempting to join cluster
    ...
    NOTICE: Node 1: joined cluster
    ...
    The system is coming up. Please wait.
    The system is ready.
    phys-schost-1 console login:
You can boot a node so that the node does not participate in the cluster membership, that is, in non-cluster mode. Non-cluster mode is useful when installing the cluster software or performing certain administrative procedures, such as patching a node.
Become superuser on the cluster node to be started in non-cluster mode.
Shut down the node by using the scswitch and shutdown commands.
    # scswitch -S -h node
    # shutdown -g0 -y -i0
Verify that the node is at the ok prompt.
Boot the node in non-cluster mode by using the boot(1M) command with the -x option.
    ok boot -x
Messages appear on the node's console stating that the node is not part of the cluster.
The following example shows the console output when shutting down node phys-schost-1 then restarting the node in non-cluster mode. The -g0 option sets the grace period to zero, -y provides an automatic yes response to the confirmation question, and -i0 invokes run level 0 (zero). Shutdown messages for this node appear on the consoles of other nodes in the cluster.
    # scswitch -S -h phys-schost-1
    # shutdown -g0 -y -i0
    May 2 10:08:46 phys-schost-1 cl_runtime: WARNING: CMM monitoring disabled.
    phys-schost-1#
    ...
    rg_name = schost-sa-1 ...
    offline node = phys-schost-2 ...
    num of node = 0 ...
    phys-schost-1#
    INIT: New run level: 0
    The system is coming down. Please wait.
    System services are now being stopped.
    Print services stopped.
    syslogd: going down on signal 15
    ...
    The system is down.
    syncing file systems... done
    WARNING: node 1 is being shut down.
    Program terminated
    ok boot -x
    ...
    Not booting as part of cluster
    ...
    The system is ready.
    phys-schost-1 console login:
Both Solaris and Sun Cluster software write error messages to the /var/adm/messages file, which over time can fill the /var file system. If a cluster node's /var file system fills up, Sun Cluster might not be able to restart on that node. Additionally, you might not be able to log in to the node.
If a node reports a full /var file system and continues to run Sun Cluster services, use this procedure to clear the full file system. Refer to “Viewing System Messages” in System Administration Guide: Advanced Administration for more information.
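Because a full /var file system can prevent Sun Cluster from restarting, it can help to check how full /var is before the problem occurs. The following is a minimal sketch and not part of the documented procedure. It assumes df -k style output in which the fifth column holds the capacity percentage. The function names, the 90 percent threshold, and the sample device path are all hypothetical.

```shell
#!/bin/sh
# Read df -k style output on stdin and print the capacity percentage
# (without the % sign) of the last data line.
var_capacity() {
    awk 'NR > 1 { pct = $5 } END { print pct }' | tr -d '%'
}

# Succeeds when usage is at or above the threshold (default 90, an
# arbitrary value chosen for this sketch).
var_is_nearly_full() {
    pct=$1
    limit=${2:-90}
    [ "$pct" -ge "$limit" ]
}

# Hypothetical sample output; on a real node use: df -k /var | var_capacity
sample='Filesystem            kbytes    used   avail capacity  Mounted on
/dev/dsk/c0t0d0s4    1018191  987654   30537    98%    /var'
pct=$(echo "$sample" | var_capacity)
if var_is_nearly_full "$pct"; then
    echo "/var is ${pct}% full"
fi
```

The parsing is kept separate from the df call so that the logic can be tried against saved output without touching a live node.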