Oracle Solaris Cluster System Administration Guide, Oracle Solaris Cluster 4.1
Shutting Down and Booting a Single Node in a Cluster

This section provides instructions for shutting down and booting a global-cluster node or a zone-cluster node.
To shut down a global-cluster node, use the clnode evacuate command with the Oracle Solaris shutdown command. Use the cluster shutdown command only when shutting down an entire global cluster.
To shut down a zone-cluster node, use the clzonecluster halt command from a node of the global cluster; this command can shut down a single zone-cluster node or an entire zone cluster. You can also use the clnode evacuate and shutdown commands to shut down a zone-cluster node.
For more information, see the clnode(1CL), shutdown(1M), and clzonecluster(1CL) man pages.
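For instance, shutting down a single global-cluster node follows this pattern, where the node name phys-schost-2 and the zone-cluster name sczone are illustrative placeholders:

phys-schost-2# clnode evacuate phys-schost-2
phys-schost-2# shutdown -g0 -y -i0

Shutting down the corresponding zone-cluster node is run from a node of the global cluster:

phys-schost# clzonecluster halt -n phys-schost-2 sczone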
In the procedures in this chapter, phys-schost# reflects a global-cluster prompt. The clzonecluster interactive shell prompt is clzc:schost>.
Table 3-2 Task Map: Shutting Down and Booting a Node
How to Shut Down a Node

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.
This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.
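For example, clzc is the documented short form of the clzonecluster command, so the following two commands are equivalent:

phys-schost# clzonecluster status
phys-schost# clzc status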
Caution - Do not use send brk on a cluster console to shut down a node on a global cluster or a zone cluster. The command is not supported within a cluster.
Refer to the Oracle RAC product documentation for shutdown procedures.
Perform all steps in this procedure from a node of the global cluster.
phys-schost# clzonecluster halt -n physical-name zoneclustername
When you specify a particular zone-cluster node, you stop only that node. By default, the halt command stops the zone cluster on all nodes.
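For example, using the sparse-sczone zone cluster shown later in this section, the first command halts the zone cluster only on node schost-4, while the second halts it on every node:

phys-schost# clzonecluster halt -n schost-4 sparse-sczone
phys-schost# clzonecluster halt sparse-sczone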
On the global-cluster node to shut down, type the following command. The clnode evacuate command switches over all resource groups and device groups from the specified node to the next-preferred node. (You can also run clnode evacuate within a zone-cluster node.)
phys-schost# clnode evacuate node
node
Specifies the node from which you are switching resource groups and device groups. Specify the global-cluster node that you want to shut down.
phys-schost# shutdown -g0 -y -i0
Verify that the global-cluster node is showing the ok prompt on a SPARC based system or the Press any key to continue message on the GRUB menu on an x86 based system.
Example 3-7 SPARC: Shutting Down a Global-Cluster Node
The following example shows the console output when node phys-schost-1 is shut down. The -g0 option sets the grace period to zero, and the -y option provides an automatic yes response to the confirmation question. Shutdown messages for this node appear on the consoles of other nodes in the global cluster.
phys-schost# clnode evacuate phys-schost-1
phys-schost# shutdown -g0 -y
Wed Mar 10 13:47:32 phys-schost-1 cl_runtime:
WARNING: CMM monitoring disabled.
phys-schost-1#
INIT: New run level: 0
The system is coming down.  Please wait.
Notice: rgmd is being stopped.
Notice: rpc.pmfd is being stopped.
Notice: rpc.fed is being stopped.
umount: /global/.devices/node@1 busy
umount: /global/phys-schost-1 busy
The system is down.
syncing file systems... done
Program terminated
ok
Example 3-8 x86: Shutting Down a Global-Cluster Node
The following example shows the console output when node phys-schost-1 is shut down. The -g0 option sets the grace period to zero, and the -y option provides an automatic yes response to the confirmation question. Shutdown messages for this node appear on the consoles of other nodes in the global cluster.
phys-schost# clnode evacuate phys-schost-1
phys-schost# shutdown -g0 -y
Shutdown started.    Wed Mar 10 13:47:32 PST 2004

Changing to init state 0 - please wait
Broadcast Message from root (console) on phys-schost-1 Wed Mar 10 13:47:32...
THE SYSTEM phys-schost-1 IS BEING SHUT DOWN NOW ! ! !
Log off now or risk your files being damaged

phys-schost-1#
INIT: New run level: 0
The system is coming down.  Please wait.
System services are now being stopped.
/etc/rc0.d/K05initrgm: Calling clnode evacuate
failfasts disabled on node 1
Print services already stopped.
Mar 10 13:47:44 phys-schost-1 syslogd: going down on signal 15
umount: /global/.devices/node@2 busy
umount: /global/.devices/node@1 busy
The system is down.
syncing file systems... done
WARNING: CMM: Node being shut down.
Type any key to continue
Example 3-9 Shutting Down a Zone-Cluster Node
The following example shows how to use the clzonecluster halt command to shut down a node on a zone cluster called sparse-sczone. (You can also run the clnode evacuate and shutdown commands in a zone-cluster node.)
phys-schost# clzonecluster status

=== Zone Clusters ===

--- Zone Cluster Status ---

Name            Node Name   Zone HostName   Status   Zone Status
----            ---------   -------------   ------   -----------
sparse-sczone   schost-1    sczone-1        Online   Running
                schost-2    sczone-2        Online   Running
                schost-3    sczone-3        Online   Running
                schost-4    sczone-4        Online   Running

phys-schost#
phys-schost# clzonecluster halt -n schost-4 sparse-sczone
Waiting for zone halt commands to complete on all the nodes of the zone cluster "sparse-sczone"...
Sep  5 19:24:00 schost-4 cl_runtime: NOTICE: Membership : Node 3 of cluster 'sparse-sczone' died.
phys-schost#
phys-schost# clzonecluster status

=== Zone Clusters ===

--- Zone Cluster Status ---

Name            Node Name   Zone HostName   Status    Zone Status
----            ---------   -------------   ------    -----------
sparse-sczone   schost-1    sczone-1        Online    Running
                schost-2    sczone-2        Online    Running
                schost-3    sczone-3        Offline   Installed
                schost-4    sczone-4        Online    Running

phys-schost#
See Also
See How to Boot a Node to restart a global-cluster node that was shut down.
How to Boot a Node

If you intend to shut down or reboot other active nodes in the global cluster or zone cluster, wait until the multiuser-server milestone comes online for the node that you are booting.
Otherwise, the node will not be available to take over services from other nodes in the cluster that you shut down or reboot.
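You can check the milestone from the node that is booting by using the Oracle Solaris svcs command; output similar to the following (the timestamp is illustrative) indicates that the milestone is online:

phys-schost# svcs multiuser-server
STATE          STIME    FMRI
online         17:52:55 svc:/milestone/multiuser-server:default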
Note - Starting a node can be affected by the quorum configuration. In a two-node cluster, you must have a quorum device configured so that the total quorum count for the cluster is three. You should have one quorum count for each node and one quorum count for the quorum device. In this situation, if the first node is shut down, the second node continues to have quorum and runs as the sole cluster member. For the first node to come back in the cluster as a cluster node, the second node must be up and running. The required cluster quorum count (two) must be present.
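To review the quorum configuration and current vote counts before you shut down a node, you can run the clquorum status command. The output below is abbreviated and illustrative for a two-node cluster with one quorum device:

phys-schost# clquorum status

--- Quorum Votes Summary ---

            Needed   Present   Possible
            ------   -------   --------
            2        3         3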
If you are running Oracle Solaris Cluster in a guest domain, rebooting the control or I/O domain can have an impact on the running guest domain, including the domain going down. You should rebalance the workload to other nodes and stop the guest domain running Oracle Solaris Cluster before you reboot the control or I/O domain.
When a control or I/O domain is rebooted, heartbeats are not received or sent by the guest domain. This causes split brain and a cluster reconfiguration to occur. Since the control or I/O domain is rebooting, the guest domain cannot access any shared devices. The other cluster nodes will fence this guest domain from the shared devices. When the control or I/O domain finishes its reboot, I/O resumes on the guest domain and any I/O to shared storage causes the guest domain to panic because it has been fenced off the shared disks as part of the cluster reconfiguration. You can mitigate this issue if a guest is employing two I/O domains for redundancy and you reboot the I/O domains one at a time.
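A minimal sketch of that precaution, assuming a guest-domain cluster node named phys-guest-1 (the name is illustrative). Run the following commands on the guest-domain node before you reboot its control or I/O domain:

phys-guest-1# clnode evacuate phys-guest-1
phys-guest-1# shutdown -g0 -y -i0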
The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.
This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.
Note - Nodes must have a working connection to the cluster interconnect to attain cluster membership.
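One way to confirm that the interconnect is working is the clinterconnect status command. The path names below are illustrative and the output is abbreviated:

phys-schost# clinterconnect status

--- Cluster Transport Paths ---

Endpoint1            Endpoint2            Status
---------            ---------            ------
phys-schost-1:net1   phys-schost-2:net1   Path online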
Perform all steps in this procedure from a node of the global cluster.
On SPARC based systems, run the following command.
ok boot
On x86 based systems, run the following commands.
When the GRUB menu is displayed, select the appropriate Oracle Solaris entry and press Enter.
Messages appear on the booted nodes' consoles as cluster components are activated.
If you have a zone cluster, you can specify a node to boot.
phys-schost# clzonecluster boot -n node zoneclustername
Verify that the node has booted without error and is online.

phys-schost# cluster status -t node

For a zone-cluster node, verify its status.

phys-schost# clzonecluster status
A zone-cluster node can be booted into cluster mode only when the global-cluster node that hosts it is booted in cluster mode.
Note - If a node's /var file system fills up, Oracle Solaris Cluster might not be able to restart on that node. If this problem arises, see How to Repair a Full /var File System.
Example 3-10 SPARC: Booting a Global-Cluster Node
The following example shows the console output when node phys-schost-1 is booted into the global cluster.
ok boot
Rebooting with command: boot
...
Hostname: phys-schost-1
Booting as part of a cluster
...
NOTICE: Node phys-schost-1: attempting to join cluster
...
NOTICE: Node phys-schost-1: joined cluster
...
The system is coming up.  Please wait.
checking ufs filesystems
...
reservation program successfully exiting
Print services started.
volume management starting.
The system is ready.
phys-schost-1 console login:
How to Reboot a Node

If you intend to shut down or reboot other active nodes in the global cluster or zone cluster, wait until the multiuser-server milestone comes online for the node that you are rebooting.
Otherwise, the node will not be available to take over services from other nodes in the cluster that you shut down or reboot.
The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.
This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.
Caution - If a method for any resource times out and cannot be killed, the node will be rebooted only if the resource's Failover_mode property is set to HARD. If the Failover_mode property is set to any other value, the node will not be rebooted.
Refer to the Oracle RAC product documentation for shutdown procedures.
Perform all steps in this procedure from a node of the global cluster.
Shut down the zone cluster with the clzonecluster halt command executed on a node of the global cluster. (The clnode evacuate and shutdown commands also work in a zone cluster.)
For a global cluster, type the following commands on the node that you want to shut down. The clnode evacuate command switches over all device groups from the specified node to the next-preferred node. The command also switches all resource groups from global zones on the specified node to the next-preferred global zone on other nodes.
Note - To shut down and reboot a single node, use the shutdown -g0 -y -i6 command. To shut down multiple nodes at the same time, use the shutdown -g0 -y -i0 command to halt the nodes. After all the nodes have halted, use the boot command on all nodes to boot them back into the cluster.
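For example, to reboot two nodes at the same time (node names illustrative), halt both nodes first, and then boot each node from its firmware prompt after all nodes are down:

phys-schost-1# shutdown -g0 -y -i0
phys-schost-2# shutdown -g0 -y -i0

On each SPARC based node, once the ok prompt appears:

ok boot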
On a SPARC based system, run the following commands to reboot a single node.
phys-schost# clnode evacuate node
phys-schost# shutdown -g0 -y -i6
On an x86 based system, run the following commands to reboot a single node.
phys-schost# clnode evacuate node
phys-schost# shutdown -g0 -y -i6
When the GRUB menu is displayed, select the appropriate Oracle Solaris entry and press Enter.
Specify the zone-cluster node to shut down and reboot.
phys-schost# clzonecluster reboot -n node zoneclustername
Note - Nodes must have a working connection to the cluster interconnect to attain cluster membership.
Verify that the node has rebooted without error and is online.

phys-schost# cluster status -t node

For a zone-cluster node, verify its status.

phys-schost# clzonecluster status
Example 3-11 SPARC: Rebooting a Global-Cluster Node
The following example shows the console output when node phys-schost-1 is rebooted. Messages for this node, such as shutdown and startup notification, appear on the consoles of other nodes in the global cluster.
phys-schost# clnode evacuate phys-schost-1
phys-schost# shutdown -g0 -y -i6
Shutdown started.    Wed Mar 10 13:47:32 phys-schost-1 cl_runtime:
WARNING: CMM monitoring disabled.
phys-schost-1#
INIT: New run level: 6
The system is coming down.  Please wait.
System services are now being stopped.
Notice: rgmd is being stopped.
Notice: rpc.pmfd is being stopped.
Notice: rpc.fed is being stopped.
umount: /global/.devices/node@1 busy
umount: /global/phys-schost-1 busy
The system is down.
syncing file systems... done
rebooting...
Resetting ...
...
Sun Ultra 1 SBus (UltraSPARC 143MHz), No Keyboard
OpenBoot 3.11, 128 MB memory installed, Serial #5932401.
Ethernet address 8:8:20:99:ab:77, Host ID: 8899ab77.
...
Rebooting with command: boot
...
Hostname: phys-schost-1
Booting as part of a cluster
...
NOTICE: Node phys-schost-1: attempting to join cluster
...
NOTICE: Node phys-schost-1: joined cluster
...
The system is coming up.  Please wait.
The system is ready.
phys-schost-1 console login:
Example 3-12 Rebooting a Zone-Cluster Node
The following example shows how to reboot a node on a zone cluster.
phys-schost# clzonecluster reboot -n schost-4 sparse-sczone
Waiting for zone reboot commands to complete on all the nodes of the zone cluster "sparse-sczone"...
Sep  5 19:40:59 schost-4 cl_runtime: NOTICE: Membership : Node 3 of cluster 'sparse-sczone' died.
phys-schost# Sep  5 19:41:27 schost-4 cl_runtime: NOTICE: Membership : Node 3 of cluster 'sparse-sczone' joined.
phys-schost#
phys-schost# clzonecluster status

=== Zone Clusters ===

--- Zone Cluster Status ---

Name            Node Name   Zone HostName   Status   Zone Status
----            ---------   -------------   ------   -----------
sparse-sczone   schost-1    sczone-1        Online   Running
                schost-2    sczone-2        Online   Running
                schost-3    sczone-3        Online   Running
                schost-4    sczone-4        Online   Running

phys-schost#
How to Boot a Node in Noncluster Mode

You can boot a global-cluster node in noncluster mode, where the node does not participate in the cluster membership. Noncluster mode is useful when you install the cluster software or perform certain administrative procedures, such as updating a node. A zone-cluster node cannot be in a boot state that is different from the state of the underlying global-cluster node. If the global-cluster node is booted in noncluster mode, the zone-cluster node is automatically in noncluster mode.
The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.
This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.
Perform all steps in this procedure from a node of the global cluster.
The clnode evacuate command switches over all device groups from the specified node to the next-preferred node. The command also switches all resource groups from global zones on the specified node to the next-preferred global zones on other nodes.
phys-schost# clnode evacuate node
phys-schost# shutdown -g0 -y -i0
phys-schost# clzonecluster halt -n node zoneclustername
You can also use the clnode evacuate and shutdown commands within a zone cluster.
On SPARC based systems, run the following command.
ok boot -xs
On x86 based systems, run the following commands.
The GRUB menu appears.
For more information about GRUB based booting, see Booting a System in Booting and Shutting Down Oracle Solaris 11.1 Systems.
The GRUB boot parameters screen appears. Add the -x option to the end of the kernel boot command to specify that the system boot into noncluster mode.

[ Minimal BASH-like line editing is supported. For the first word, TAB
lists possible command completions. Anywhere else TAB lists the possible
completions of a device/filename. ESC at any time exits. ]

grub edit> kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS -x

The screen displays the edited command.
Note - This change to the kernel boot parameter command does not persist over the system boot. The next time you reboot the node, it will boot into cluster mode. To boot into noncluster mode instead, perform these steps again to add the -x option to the kernel boot parameter command.
Example 3-13 SPARC: Booting a Global-Cluster Node in Noncluster Mode
The following example shows the console output when node phys-schost-1 is shut down and restarted in noncluster mode. The -g0 option sets the grace period to zero, the -y option provides an automatic yes response to the confirmation question, and the -i0 option invokes run level 0 (zero). Shutdown messages for this node appear on the consoles of other nodes in the global cluster.
phys-schost# clnode evacuate phys-schost-1
phys-schost# shutdown -g0 -y -i0
Shutdown started.    Wed Mar 10 13:47:32 phys-schost-1 cl_runtime:
WARNING: CMM monitoring disabled.
phys-schost-1#
...
rg_name = schost-sa-1 ...
offline node = phys-schost-2 ...
num of node = 0 ...
phys-schost-1#
INIT: New run level: 0
The system is coming down.  Please wait.
System services are now being stopped.
Print services stopped.
syslogd: going down on signal 15
...
The system is down.
syncing file systems... done
WARNING: node phys-schost-1 is being shut down.
Program terminated

ok boot -x
...
Not booting as part of cluster
...
The system is ready.
phys-schost-1 console login: