Shutting Down and Booting a Single Node in a Cluster

You can shut down a global-cluster node or a zone-cluster node. This section provides instructions for both.

To shut down a global-cluster node, use the clnode evacuate command with the Oracle Solaris shutdown command. Use the cluster shutdown command only when shutting down an entire global cluster.

For a zone-cluster node, use the clzonecluster halt command from a global-cluster node to shut down either a single zone-cluster node or an entire zone cluster. You can also use the clnode evacuate and shutdown commands to shut down a zone-cluster node.

For more information, see the clnode(1CL), shutdown(1M), and clzonecluster(1CL) man pages.
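
The two approaches are summarized in the following sketch. The node name phys-schost-1 and the zone-cluster name sczone are placeholders; substitute your own node and zone-cluster names.

To shut down the global-cluster node phys-schost-1:

phys-schost# clnode evacuate phys-schost-1
phys-schost# shutdown -g0 -y -i0

To shut down only the zone-cluster node that runs on phys-schost-1, from a global-cluster node:

phys-schost# clzonecluster halt -n phys-schost-1 sczone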

In the procedures in this chapter, phys-schost# reflects a global-cluster prompt. The clzonecluster interactive shell prompt is clzc:schost>.

Table 3-2 Task Map: Shutting Down and Booting a Node

Task: Stop a node.
Tool: For a global-cluster node, use the clnode evacuate and shutdown commands. For a zone-cluster node, use the clzonecluster halt command.
Instructions: How to Shut Down a Node

Task: Start a node. The node must have a working connection to the cluster interconnect to attain cluster membership.
Tool: For a global-cluster node, use the boot or b command. For a zone-cluster node, use the clzonecluster boot command.
Instructions: How to Boot a Node

Task: Stop and restart (reboot) a node on a cluster. The node must have a working connection to the cluster interconnect to attain cluster membership.
Tool: For a global-cluster node, use the clnode evacuate and shutdown commands, followed by boot or b. For a zone-cluster node, use the clzonecluster reboot command.
Instructions: How to Reboot a Node

Task: Boot a node so that the node does not participate in cluster membership.
Tool: For a global-cluster node, use the clnode evacuate and shutdown commands, followed by boot -x on SPARC or GRUB menu entry editing on x86. If the underlying global cluster is booted in noncluster mode, the zone-cluster node is automatically in noncluster mode.
Instructions: How to Boot a Node in Noncluster Mode

How to Shut Down a Node

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.
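
For example, clzc is the short form of the clzonecluster command, so the following two commands are equivalent. The zone-cluster name sparse-sczone is only an example.

phys-schost# clzonecluster status sparse-sczone
phys-schost# clzc status sparse-sczone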


Caution

Caution - Do not use send brk on a cluster console to shut down a node on a global cluster or a zone cluster. The command is not supported within a cluster.


  1. If your cluster is running Oracle RAC, shut down all instances of the database on the node that you are shutting down.

    Refer to the Oracle RAC product documentation for shutdown procedures.

  2. Assume a role that provides solaris.cluster.admin RBAC authorization on the cluster node to be shut down.

    Perform all steps in this procedure from a node of the global cluster.

  3. If you want to halt a specific zone cluster member, skip Steps 4 - 6 and execute the following command from a global-cluster node:
    phys-schost# clzonecluster halt -n physical-name zoneclustername

    When you specify a particular zone-cluster node, you stop only that node. By default, the halt command stops the zone cluster on all nodes.

  4. Switch all resource groups, resources, and device groups from the node being shut down to other global cluster members.

    On the global-cluster node to shut down, type the following command. The clnode evacuate command switches over all resource groups and device groups from the specified node to the next-preferred node. (You can also run clnode evacuate within a zone-cluster node.) A short sketch for verifying that the switchover completed follows this procedure.

    phys-schost# clnode evacuate node
    node

    Specifies the node from which you are switching resource groups and device groups.

  5. Shut down the node.

    Specify the global-cluster node you want to shut down.

    phys-schost# shutdown -g0 -y -i0

    Verify that the global-cluster node is showing the ok prompt on a SPARC based system or the Press any key to continue message on the GRUB menu on an x86 based system.

  6. If necessary, power off the node.
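
Before you run the shutdown command in Step 5, you can confirm that the switchover in Step 4 completed. The following is a minimal sketch; run the status commands from any active node of the global cluster.

phys-schost# clresourcegroup status
phys-schost# cldevicegroup status

The output should show that no resource group is still online on, and no device group still has its primary on, the node that you are about to shut down.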

Example 3-7 SPARC: Shutting Down a Global-Cluster Node

The following example shows the console output when node phys-schost-1 is shut down. The -g0 option sets the grace period to zero, the -y option provides an automatic yes response to the confirmation question, and the -i0 option invokes run level 0 (zero). Shutdown messages for this node appear on the consoles of other nodes in the global cluster.

phys-schost# clnode evacuate phys-schost-1
phys-schost# shutdown -g0 -y -i0
Wed Mar 10 13:47:32 phys-schost-1 cl_runtime:
WARNING: CMM monitoring disabled.
phys-schost-1# 
INIT: New run level: 0
The system is coming down.  Please wait.
Notice: rgmd is being stopped.
Notice: rpc.pmfd is being stopped.
Notice: rpc.fed is being stopped.
umount: /global/.devices/node@1 busy
umount: /global/phys-schost-1 busy
The system is down.
syncing file systems... done
Program terminated
ok 

Example 3-8 x86: Shutting Down a Global-Cluster Node

The following example shows the console output when node phys-schost-1 is shut down. The -g0 option sets the grace period to zero, the -y option provides an automatic yes response to the confirmation question, and the -i0 option invokes run level 0 (zero). Shutdown messages for this node appear on the consoles of other nodes in the global cluster.

phys-schost# clnode evacuate phys-schost-1
phys-schost# shutdown -g0 -y -i0
Shutdown started.    Wed Mar 10 13:47:32 PST 2004

Changing to init state 0 - please wait
Broadcast Message from root (console) on phys-schost-1 Wed Mar 10 13:47:32... 
THE SYSTEM phys-schost-1 IS BEING SHUT DOWN NOW ! ! !
Log off now or risk your files being damaged

phys-schost-1#
INIT: New run level: 0
The system is coming down.  Please wait.
System services are now being stopped.
/etc/rc0.d/K05initrgm: Calling clnode evacuate
failfasts disabled on node 1
Print services already stopped.
Mar 10 13:47:44 phys-schost-1 syslogd: going down on signal 15
umount: /global/.devices/node@2 busy
umount: /global/.devices/node@1 busy
The system is down.
syncing file systems... done
WARNING: CMM: Node being shut down.
Type any key to continue 

Example 3-9 Shutting Down a Zone-Cluster Node

The following example shows how to use the clzonecluster halt command to shut down a node on a zone cluster called sparse-sczone. (You can also run the clnode evacuate and shutdown commands in a zone-cluster node.)

phys-schost# clzonecluster status

=== Zone Clusters ===

--- Zone Cluster Status ---

Name            Node Name   Zone HostName   Status   Zone Status
----            ---------   -------------   ------   -----------
sparse-sczone   schost-1    sczone-1        Online   Running
                schost-2    sczone-2        Online   Running
                schost-3    sczone-3        Online   Running
                schost-4    sczone-4        Online   Running

phys-schost#
phys-schost# clzonecluster halt -n schost-4 sparse-sczone
Waiting for zone halt commands to complete on all the nodes of the zone cluster "sparse-sczone"...
Sep  5 19:24:00 schost-4 cl_runtime: NOTICE: Membership : Node 3 of cluster 'sparse-sczone' died.
phys-schost#
phys-schost# clzonecluster status

=== Zone Clusters ===

--- Zone Cluster Status ---

Name            Node Name   Zone HostName   Status    Zone Status
----            ---------   -------------   ------    -----------
sparse-sczone   schost-1    sczone-1        Online    Running
                schost-2    sczone-2        Online    Running
                schost-3    sczone-3        Offline   Installed
                schost-4    sczone-4        Online    Running

phys-schost# 

See Also

See How to Boot a Node to restart a global-cluster node that was shut down.

How to Boot a Node

If you intend to shut down or reboot other active nodes in the global cluster or zone cluster, wait until the multiuser-server milestone comes online for the node you are booting.

Otherwise, the node will not be available to take over services from other nodes in the cluster that you shut down or reboot.
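
You can check the milestone on the node that you booted by using the Oracle Solaris svcs command, as in the following minimal sketch. The node has reached the milestone when the STATE column reports online.

phys-schost# svcs svc:/milestone/multi-user-server:default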


Note - Starting a node can be affected by the quorum configuration. In a two-node cluster, you must have a quorum device configured so that the total quorum count for the cluster is three. You should have one quorum count for each node and one quorum count for the quorum device. In this situation, if the first node is shut down, the second node continues to have quorum and runs as the sole cluster member. For the first node to come back in the cluster as a cluster node, the second node must be up and running. The required cluster quorum count (two) must be present.
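
Before you shut down a node in a two-node cluster, you can verify the quorum configuration and the current vote counts from any active cluster node, as in the following minimal sketch.

phys-schost# clquorum list
phys-schost# clquorum status

With one node and the quorum device still up, two of the three configured votes remain present, so the surviving node keeps quorum.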


If you are running Oracle Solaris Cluster in a guest domain, rebooting the control or I/O domain can have an impact on the running guest domain, including the domain going down. You should rebalance the workload to other nodes and stop the guest domain running Oracle Solaris Cluster before you reboot the control or I/O domain.

When a control or I/O domain is rebooted, heartbeats are not received or sent by the guest domain. This causes a split-brain condition and a cluster reconfiguration to occur. Because the control or I/O domain is rebooting, the guest domain cannot access any shared devices, and the other cluster nodes fence this guest domain from the shared devices. When the control or I/O domain finishes its reboot, I/O resumes on the guest domain, and any I/O to shared storage causes the guest domain to panic because it has been fenced off from the shared disks as part of the cluster reconfiguration. You can mitigate this issue if a guest domain employs two I/O domains for redundancy and you reboot the I/O domains one at a time.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.


Note - Nodes must have a working connection to the cluster interconnect to attain cluster membership.


  1. To start a global-cluster node or zone-cluster node that has been shut down, boot the node.

    Perform all steps in this procedure from a node of the global cluster.

    • On SPARC based systems, run the following command.

      ok boot
    • On x86 based systems, run the following commands.

      When the GRUB menu is displayed, select the appropriate Oracle Solaris entry and press Enter.

      Messages appear on the booted nodes' consoles as cluster components are activated.

    • If you have a zone cluster, you can specify a node to boot.

      phys-schost# clzonecluster boot -n node zoneclustername
  2. Verify that the node booted without error, and is online.
    • Running the cluster status command reports the status of a global-cluster node.
      phys-schost# cluster status -t node
    • Running the clzonecluster status command from a node on the global cluster reports the status of all zone-cluster nodes.
      phys-schost# clzonecluster status

      A zone-cluster node can be booted into cluster mode only when the global-cluster node that hosts it is booted in cluster mode.


      Note - If a node's /var file system fills up, Oracle Solaris Cluster might not be able to restart on that node. If this problem arises, see How to Repair a Full /var File System.


Example 3-10 SPARC: Booting a Global-Cluster Node

The following example shows the console output when node phys-schost-1 is booted into the global cluster.

ok boot
Rebooting with command: boot 
...
Hostname: phys-schost-1
Booting as part of a cluster
...
NOTICE: Node phys-schost-1: attempting to join cluster
...
NOTICE: Node phys-schost-1: joined cluster
...
The system is coming up.  Please wait.
checking ufs filesystems
...
reservation program successfully exiting
Print services started.
volume management starting.
The system is ready.
phys-schost-1 console login:

How to Reboot a Node

If you intend to shut down or reboot other active nodes in the global cluster or zone cluster, wait until the multiuser-server milestone comes online for the node that you are rebooting.

Otherwise, the node will not be available to take over services from other nodes in the cluster that you shut down or reboot.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.


Caution

Caution - If a method for any resource times out and cannot be killed, the node will be rebooted only if the resource's Failover_mode property is set to HARD. If the Failover_mode property is set to any other value, the node will not be rebooted.


  1. If the global-cluster or zone-cluster node is running Oracle RAC, shut down all instances of the database on the node that you are shutting down.

    Refer to the Oracle RAC product documentation for shutdown procedures.

  2. Assume a role that provides solaris.cluster.admin RBAC authorization on the node to shut down.

    Perform all steps in this procedure from a node of the global cluster.

  3. Shut down the global-cluster node by using the clnode evacuate and shutdown commands.

    Shut down the zone cluster with the clzonecluster halt command executed on a node of the global cluster. (The clnode evacuate and shutdown commands also work in a zone cluster.)

    For a global cluster, type the following commands on the node to shut down. The clnode evacuate command switches over all device groups from the specified node to the next-preferred node. The command also switches all resource groups from global zones on the specified node to the next-preferred global zone on other nodes.


    Note - To shut down a single node, use the shutdown -g0 -y -i6 command. To shut down multiple nodes at the same time, use the shutdown -g0 -y -i0 command to halt the nodes. After all the nodes have halted, use the boot command on all nodes to boot them back into the cluster.


    • On a SPARC based system, run the following commands to reboot a single node.

      phys-schost# clnode evacuate node
      phys-schost# shutdown -g0 -y -i6
    • On an x86 based system, run the following commands to reboot a single node.

      phys-schost# clnode evacuate node
      phys-schost# shutdown -g0 -y -i6

      When the GRUB menu is displayed, select the appropriate Oracle Solaris entry and press Enter.

    • Specify the zone-cluster node to shut down and reboot.

      phys-schost# clzonecluster reboot -n node zoneclustername

    Note - Nodes must have a working connection to the cluster interconnect to attain cluster membership.


  4. Verify that the node booted without error and is online.
    • Verify that the global-cluster node is online.
      phys-schost# cluster status -t node
    • Verify that the zone-cluster node is online.
      phys-schost# clzonecluster status

Example 3-11 SPARC: Rebooting a Global-Cluster Node

The following example shows the console output when node phys-schost-1 is rebooted. Messages for this node, such as shutdown and startup notification, appear on the consoles of other nodes in the global cluster.

phys-schost# clnode evacuate phys-schost-1
phys-schost# shutdown -g0 -y -i6
Shutdown started.    Wed Mar 10 13:47:32 phys-schost-1 cl_runtime: 

WARNING: CMM monitoring disabled.
phys-schost-1# 
INIT: New run level: 6
The system is coming down.  Please wait.
System services are now being stopped.
Notice: rgmd is being stopped.
Notice: rpc.pmfd is being stopped.
Notice: rpc.fed is being stopped.
umount: /global/.devices/node@1 busy
umount: /global/phys-schost-1 busy
The system is down.
syncing file systems... done
rebooting...
Resetting ... 
...
Sun Ultra 1 SBus (UltraSPARC 143MHz), No Keyboard
OpenBoot 3.11, 128 MB memory installed, Serial #5932401.
Ethernet address 8:8:20:99:ab:77, Host ID: 8899ab77.
...
Rebooting with command: boot
...
Hostname: phys-schost-1
Booting as part of a cluster
...
NOTICE: Node phys-schost-1: attempting to join cluster
...
NOTICE: Node phys-schost-1: joined cluster
...
The system is coming up.  Please wait.
The system is ready.
phys-schost-1 console login: 

Example 3-12 Rebooting a Zone-Cluster Node

The following example shows how to reboot a node on a zone cluster.

phys-schost# clzonecluster reboot -n schost-4 sparse-sczone
Waiting for zone reboot commands to complete on all the nodes of the zone cluster
   "sparse-sczone"...
Sep  5 19:40:59 schost-4 cl_runtime: NOTICE: Membership : Node 3 of cluster
   'sparse-sczone' died.
phys-schost# Sep  5 19:41:27 schost-4 cl_runtime: NOTICE: Membership : Node 3 of cluster
   'sparse-sczone' joined.

phys-schost#
phys-schost# clzonecluster status

=== Zone Clusters ===

--- Zone Cluster Status ---
Name            Node Name   Zone HostName   Status   Zone Status
----            ---------   -------------   ------   -----------
sparse-sczone   schost-1    sczone-1        Online   Running
                schost-2    sczone-2        Online   Running
                schost-3    sczone-3        Online   Running
                schost-4    sczone-4        Online   Running

phys-schost#

How to Boot a Node in Noncluster Mode

You can boot a global-cluster node in noncluster mode, where the node does not participate in the cluster membership. Noncluster mode is useful when installing the cluster software or performing certain administrative procedures, such as updating a node. A zone-cluster node cannot be in a boot state that is different from the state of the underlying global-cluster node. If the global-cluster node is booted in noncluster mode, the zone-cluster node is automatically in noncluster mode.
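
After the node boots in noncluster mode, you can confirm from another active node of the global cluster that it did not rejoin the cluster, as in the following minimal sketch. The node name phys-schost-1 is only an example.

phys-schost# clnode status phys-schost-1

The node is reported as Offline until you reboot it back into cluster mode.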

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Assume a role that provides solaris.cluster.admin RBAC authorization on the cluster to be started in noncluster mode.

    Perform all steps in this procedure from a node of the global cluster.

  2. Shut down the zone-cluster node or the global-cluster node.

    The clnode evacuate command switches over all device groups from the specified node to the next-preferred node. The command also switches all resource groups from global zones on the specified node to the next-preferred global zones on other nodes.

    • Shut down a specific global cluster node.
      phys-schost# clnode evacuate node
      phys-schost# shutdown -g0 -y
    • Shut down a specific zone-cluster node from a global-cluster node.
      phys-schost# clzonecluster halt -n node zoneclustername

      You can also use the clnode evacuate and shutdown commands within a zone cluster.

  3. Verify that the global-cluster node is showing the ok prompt on a SPARC based system or the Press any key to continue message on a GRUB menu on an x86 based system.
  4. Boot the global-cluster node in noncluster mode.
    • On SPARC based systems, run the following command.

      ok boot -xs
    • On x86 based systems, run the following commands.

    1. In the GRUB menu, use the arrow keys to select the appropriate Oracle Solaris entry and type e to edit its commands.

      For more information about GRUB based booting, see Booting a System in Booting and Shutting Down Oracle Solaris 11.1 Systems.

    2. In the boot parameters screen, use the arrow keys to select the kernel entry and type e to edit the entry.

      The GRUB boot parameters screen appears.

    3. Add -x to the command to specify system boot in noncluster mode.
      [ Minimal BASH-like line editing is supported. For the first word, TAB
      lists possible command completions. Anywhere else TAB lists the possible
      completions of a device/filename. ESC at any time exits. ]
      
      grub edit> kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS -x
    4. Press the Enter key to accept the change and return to the boot parameters screen.

      The screen displays the edited command.

    5. Type b to boot the node into noncluster mode.

      Note - This change to the kernel boot parameter command does not persist over the system boot. The next time you reboot the node, it will boot into cluster mode. To boot into noncluster mode instead, perform these steps again to add the -x option to the kernel boot parameter command.


Example 3-13 SPARC: Booting a Global-Cluster Node in Noncluster Mode

The following example shows the console output when node phys-schost-1 is shut down and restarted in noncluster mode. The -g0 option sets the grace period to zero, the -y option provides an automatic yes response to the confirmation question, and the -i0 option invokes run level 0 (zero). Shutdown messages for this node appear on the consoles of other nodes in the global cluster.

phys-schost# clnode evacuate phys-schost-1
phys-schost# shutdown -g0 -y -i0
Shutdown started.    Wed Mar 10 13:47:32 phys-schost-1 cl_runtime: 

WARNING: CMM monitoring disabled.
phys-schost-1# 
...
rg_name = schost-sa-1 ...
offline node = phys-schost-2 ...
num of node = 0 ...
phys-schost-1# 
INIT: New run level: 0
The system is coming down.  Please wait.
System services are now being stopped.
Print services stopped.
syslogd: going down on signal 15
...
The system is down.
syncing file systems... done
WARNING: node phys-schost-1 is being shut down.
Program terminated

ok boot -x
...
Not booting as part of cluster
...
The system is ready.
phys-schost-1 console login: