Oracle® Solaris Cluster System Administration Guide


Updated: October 2015
 
 

How to Boot a Node

If you intend to shut down or reboot other active nodes in the global cluster or zone cluster, wait until the multiuser-server milestone comes online for the node you are booting.

Otherwise, the node will not be available to take over services from other nodes in the cluster that you shut down or reboot.
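For example, you can check whether the multiuser-server milestone has come online on the node that you are booting by using the Oracle Solaris svcs command. The milestone is online when the reported state is online.

    phys-schost# svcs svc:/milestone/multi-user-server:default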


Note -  Starting a node can be affected by the quorum configuration. In a two-node cluster, you must have a quorum device configured so that the total quorum count for the cluster is three. You should have one quorum count for each node and one quorum count for the quorum device. In this situation, if the first node is shut down, the second node continues to have quorum and runs as the sole cluster member. For the first node to come back in the cluster as a cluster node, the second node must be up and running. The required cluster quorum count (two) must be present.
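You can display the configured quorum devices and their current vote counts by running the clquorum command from a cluster node, for example:

    phys-schost# clquorum status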

If you are running Oracle Solaris Cluster in a guest domain, rebooting the control or I/O domain can have an impact on the running guest domain, including the domain going down. You should rebalance the workload to other nodes and stop the guest domain running Oracle Solaris Cluster before you reboot the control or I/O domain.
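For example, before you stop a guest domain that is running Oracle Solaris Cluster, you can switch its resource groups and device groups to other nodes with the clnode evacuate command. The node name shown here is an example.

    phys-schost# clnode evacuate phys-schost-2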

When a control or I/O domain is rebooted, heartbeats are not received or sent by the guest domain. This causes split brain and a cluster reconfiguration to occur. Because the control or I/O domain is rebooting, the guest domain cannot access any shared devices, and the other cluster nodes fence the guest domain off from the shared devices. When the control or I/O domain finishes rebooting, I/O resumes on the guest domain, and any I/O to shared storage causes the guest domain to panic because it was fenced off from the shared disks as part of the cluster reconfiguration. You can mitigate this issue if the guest domain uses two I/O domains for redundancy and you reboot the I/O domains one at a time.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.
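For example, clzonecluster and clzc are the long and short forms of the same command, so the following two commands are equivalent:

    phys-schost# clzonecluster status
    phys-schost# clzc status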


Note -  Nodes must have a working connection to the cluster interconnect to attain cluster membership.
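You can check the state of the cluster interconnect paths from an active cluster node by running the clinterconnect command, for example:

    phys-schost# clinterconnect status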

You can also boot a zone-cluster node by using the Oracle Solaris Cluster Manager GUI. For GUI log-in instructions, see How to Access Oracle Solaris Cluster Manager.

  1. To start a global-cluster node or zone-cluster node that has been shut down, boot the node.

    Perform all steps in this procedure from a node of the global cluster.

    • On SPARC based systems, run the following command.

      ok boot
    • On x86 based systems, do the following.

      When the GRUB menu is displayed, select the appropriate Oracle Solaris entry and press Enter.

      Messages appear on the booted node's console as cluster components are activated.

    • If you have a zone cluster, you can specify a node to boot.

      phys-schost# clzonecluster boot -n node zoneclustername
  2. Verify that the node booted without error and is online.
    • Running the cluster status command reports the status of a global-cluster node.
      phys-schost# cluster status -t node
    • Running the clzonecluster status command from a node on the global cluster reports the status of all zone-cluster nodes.
      phys-schost# clzonecluster status

      A zone-cluster node can be booted in cluster mode only when the global-cluster node that hosts it is booted in cluster mode. See the example commands after this procedure.


      Note -  If a node's /var file system fills up, Oracle Solaris Cluster might not be able to restart on that node. If this problem arises, see How to Repair a Full /var File System.
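For example, the following commands confirm that the hosting global-cluster node is in cluster mode, then boot a zone-cluster node and check its status. The node name phys-schost-1 and the zone-cluster name sczone are examples.

phys-schost# clnode status
phys-schost# clzonecluster boot -n phys-schost-1 sczone
phys-schost# clzonecluster status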
Example 3-12  SPARC: Booting a Global-Cluster Node

The following example shows the console output when node phys-schost-1 is booted into the global cluster.

ok boot
Rebooting with command: boot
...
Hostname: phys-schost-1
Booting as part of a cluster
...
NOTICE: Node phys-schost-1: attempting to join cluster
...
NOTICE: Node phys-schost-1: joined cluster
...
The system is coming up.  Please wait.
checking ufs filesystems
...
reservation program successfully exiting
Print services started.
volume management starting.
The system is ready.
phys-schost-1 console login:
Example 3-13  x86: Booting a Cluster Node

The following example shows the console output when node phys-schost-1 is booted into the cluster.

                     <<< Current Boot Parameters >>>
Boot path: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/pci8086,341a@7,1/sd@0,0:a
Boot args:

Type    b [file-name] [boot-flags] <ENTER>   to boot with options
or      i <ENTER>                            to enter boot interpreter
or      <ENTER>                              to boot with defaults

<<< timeout in 5 seconds >>>

Select (b)oot or (i)nterpreter: Size: 276915 + 22156 + 150372 Bytes
/platform/i86pc/kernel/unix loaded - 0xac000 bytes used
SunOS Release 5.9 Version on81-feature-patch:08/30/2003 32-bit
Copyright 1983-2003 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
configuring IPv4 interfaces: e1000g2.
Hostname: phys-schost-1
Booting as part of a cluster
NOTICE: CMM: Node phys-schost-1 (nodeid = 1) with votecount = 1 added.
NOTICE: CMM: Node phys-schost-2 (nodeid = 2) with votecount = 1 added.
NOTICE: CMM: Quorum device 1 (/dev/did/rdsk/d1s2) added; votecount = 1, bitmask
of nodes with configured paths = 0x3.
WARNING: CMM: Initialization for quorum device /dev/did/rdsk/d1s2 failed with
error EACCES. Will retry later.
NOTICE: clcomm: Adapter e1000g3 constructed
NOTICE: clcomm: Path phys-schost-1:e1000g3 - phys-schost-2:e1000g3 being constructed
NOTICE: clcomm: Path phys-schost-1:e1000g3 - phys-schost-2:e1000g3 being initiated
NOTICE: clcomm: Path phys-schost-1:e1000g3 - phys-schost-2:e1000g3 online
NOTICE: clcomm: Adapter e1000g0 constructed
NOTICE: clcomm: Path phys-schost-1:e1000g0 - phys-schost-2:e1000g0 being constructed
NOTICE: CMM: Node phys-schost-1: attempting to join cluster.
WARNING: CMM: Reading reservation keys from quorum device /dev/did/rdsk/d1s2
failed with error 2.
NOTICE: CMM: Cluster has reached quorum.
NOTICE: CMM: Node phys-schost-1 (nodeid = 1) is up; new incarnation number =
1068503958.
NOTICE: CMM: Node phys-schost-2 (nodeid = 2) is up; new incarnation number =
1068496374.
NOTICE: CMM: Cluster members: phys-schost-1 phys-schost-2.
NOTICE: CMM: node reconfiguration #3 completed.
NOTICE: CMM: Node phys-schost-1: joined cluster.
NOTICE: clcomm: Path phys-schost-1:e1000g0 - phys-schost-2:e1000g0 being initiated
NOTICE: clcomm: Path phys-schost-1:e1000g0 - phys-schost-2:e1000g0 online
NOTICE: CMM: Retry of initialization for quorum device /dev/did/rdsk/d1s2 was
successful.
WARNING: mod_installdrv: no major number for rsmrdt
ip: joining multicasts failed (18) on clprivnet0 - will use link layer
broadcasts for multicast
The system is coming up.  Please wait.
checking ufs filesystems
/dev/rdsk/c1t0d0s5: is clean.
NIS domain name is dev.eng.mycompany.com
starting rpc services: rpcbind keyserv ypbind done.
Setting netmask of e1000g2 to 192.168.255.0
Setting netmask of e1000g3 to 192.168.255.128
Setting netmask of e1000g0 to 192.168.255.128
Setting netmask of clprivnet0 to 192.168.255.0
Setting default IPv4 interface for multicast: add net 224.0/4: gateway phys-schost-1
syslog service starting.
obtaining access to all attached disks


*****************************************************************************
*
* The X-server can not be started on display :0...
*
*****************************************************************************
volume management starting.
Starting Fault Injection Server...
The system is ready.

phys-schost-1 console login: