This chapter provides the procedures for shutting down and booting a cluster and individual cluster nodes.
For a high-level description of the related procedures in this chapter, see Table 3–1 and Table 3–2.
The Sun Cluster scshutdown(1M) command stops cluster services in an orderly fashion and cleanly shuts down the entire cluster. You might use the scshutdown command when moving a cluster to a new location. You can also use the command to shut down the cluster if an application error has caused data corruption.
Use the scshutdown command instead of the shutdown or halt commands to ensure proper shutdown of the entire cluster. The Solaris shutdown command is used with the scswitch(1M) command to shut down individual nodes. See How to Shut Down a Cluster or Shutting Down and Booting a Single Cluster Node for more information.
The scshutdown command stops all nodes in a cluster by:
Taking all running resource groups offline.
Unmounting all cluster file systems.
Shutting down active device services.
Running init 0 and bringing all nodes to the OpenBoot™ PROM ok prompt on a SPARC based system or to a boot subsystem on an x86 based system. Boot subsystems are described in more detail in “Boot Subsystems” in System Administration Guide: Basic Administration.
If necessary, you can boot a node in non-cluster mode so that the node does not participate in cluster membership. Non-cluster mode is useful when installing cluster software or for performing certain administrative procedures. See How to Boot a Cluster Node in Non-Cluster Mode for more information.
Task | For Instructions |
---|---|
Stop the cluster. Use scshutdown(1M). | |
Start the cluster by booting all nodes. The nodes must have a working connection to the cluster interconnect to attain cluster membership. | |
Reboot the cluster. Use scshutdown. At the ok prompt or the Select (b)oot or (i)nterpreter prompt on the Current Boot Parameters screen, boot each node individually with the boot(1M) or the b command. The nodes must have a working connection to the cluster interconnect to attain cluster membership. | |
Do not use send brk on a cluster console to shut down a cluster node. The command is not supported within a cluster.
SPARC: If your cluster is running Oracle Parallel Server or Real Application Clusters, shut down all instances of the database.
Refer to the Oracle Parallel Server/Real Application Clusters product documentation for shutdown procedures.
Become superuser on any node in the cluster.
Shut down the cluster immediately.
From a single node in the cluster, type the following command.
# scshutdown -g0 -y
Verify that all nodes are showing the ok prompt on a SPARC based system or a Boot Subsystem on an x86 based system.
Do not power off any nodes until all cluster nodes are at the ok prompt on a SPARC based system or in a Boot Subsystem on an x86 based system.
If necessary, power off the nodes.
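The shutdown step above can be wrapped in a small script that prints the command before anything is run. The following is a minimal sketch, not part of Sun Cluster: the function name cluster_shutdown and the DRY_RUN variable are hypothetical, and the grace period is assumed to be given in seconds, as with shutdown(1M). Only the scshutdown options shown in this procedure (-g and -y) are used.

```shell
#!/bin/sh
# Sketch: build the cluster shutdown command and, by default, only print it.
# cluster_shutdown and DRY_RUN are hypothetical names used for illustration.
DRY_RUN=${DRY_RUN:-1}

cluster_shutdown() {
    grace=${1:-0}                       # grace period passed to -g (assumed seconds)
    case $grace in
        *[!0-9]*)
            echo "grace period must be a non-negative integer" >&2
            return 2
            ;;
    esac
    cmd="scshutdown -g$grace -y"
    if [ "$DRY_RUN" = 1 ]; then
        echo "$cmd"                     # show what would be run
    else
        $cmd                            # run as superuser from a single cluster node
    fi
}

# Prints the command used in this procedure.
cluster_shutdown 0
```

Set DRY_RUN=0 only on a cluster node, as superuser, when the cluster really is to be stopped.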
The following example shows the console output on a SPARC based system when stopping normal cluster operation and bringing down all nodes so that the ok prompt is shown. The -g0 option sets the shutdown grace period to zero, and -y provides an automatic yes response to the confirmation question. Shutdown messages also appear on the consoles of the other nodes in the cluster.
# scshutdown -g0 -y
Wed Mar 10 13:47:32 phys-schost-1 cl_runtime: WARNING: CMM monitoring disabled.
phys-schost-1#
INIT: New run level: 0
The system is coming down. Please wait.
System services are now being stopped.
/etc/rc0.d/K05initrgm: Calling scswitch -S (evacuate)
The system is down.
syncing file systems... done
Program terminated
ok
The following example shows the console output on an x86 based system when stopping normal cluster operation and bringing down all nodes. The -g0 option sets the shutdown grace period to zero, and -y provides an automatic yes response to the confirmation question. Shutdown messages also appear on the consoles of the other nodes in the cluster.
# scshutdown -g0 -y
May 2 10:32:57 phys-schost-1 cl_runtime: WARNING: CMM: Monitoring disabled.
root@phys-schost-1#
INIT: New run level: 0
The system is coming down. Please wait.
System services are now being stopped.
/etc/rc0.d/K05initrgm: Calling scswitch -S (evacuate)
failfasts already disabled on node 1
Print services already stopped.
May 2 10:33:13 phys-schost-1 syslogd: going down on signal 15
The system is down.
syncing file systems... done
Type any key to continue
See How to Boot a Cluster to restart a cluster that has been shut down.
To start a cluster whose nodes have been shut down and are at the ok prompt or at the Select (b)oot or (i)nterpreter prompt on the Current Boot Parameters screen, boot each node. See boot(1M).
If you make configuration changes between shutdowns, start the node with the most current configuration first. Except in this situation, the boot order of the nodes does not matter.
SPARC:
ok boot
x86:
<<< Current Boot Parameters >>>
Boot path: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/pci8086,341a@7,1/sd@0,0:a
Boot args:
Type b [file-name] [boot-flags] <ENTER> to boot with options
or i <ENTER> to enter boot interpreter
or <ENTER> to boot with defaults
<<< timeout in 5 seconds >>>
Select (b)oot or (i)nterpreter: b
Messages are displayed on the booted nodes' consoles as cluster components are activated.
Cluster nodes must have a working connection to the cluster interconnect to attain cluster membership.
Verify that the nodes booted without error and are online.
The scstat(1M) command reports the nodes' status.
# scstat -n
If a cluster node's /var file system fills up, Sun Cluster might not be able to restart on that node. If this problem arises, see How to Repair a Full /var File System.
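The verification step can be scripted by filtering the scstat -n output for nodes that are not online. The following sketch assumes that each node is reported on a line of the form "Cluster node: name status" and that a healthy node has the status Online. Check this against the output on your own cluster before relying on it. The helper name offline_nodes is hypothetical.

```shell
#!/bin/sh
# Sketch: list cluster nodes whose reported status is not "Online".
# Assumes node lines look like "Cluster node:  <name>  <status>".
# offline_nodes is a hypothetical helper name.
offline_nodes() {
    awk '$1 == "Cluster" && $2 == "node:" && $4 != "Online" { print $3 }'
}

# On a cluster node, as superuser:
#   scstat -n | offline_nodes
#
# Demonstration with sample text in the assumed format:
printf '%s\n' \
    '  Cluster node:     phys-schost-1       Online' \
    '  Cluster node:     phys-schost-2       Offline' | offline_nodes
```

An empty result means that every node line found was reported as Online. If the output format differs from the assumption, the filter matches nothing, so an empty result alone is not proof that the cluster is healthy.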
The following example shows the console output on a SPARC based system when booting node phys-schost-1 into the cluster. Similar messages appear on the consoles of the other nodes in the cluster.
ok boot
Rebooting with command: boot
...
Hostname: phys-schost-1
Booting as part of a cluster
NOTICE: Node phys-schost-1 with votecount = 1 added.
NOTICE: Node phys-schost-2 with votecount = 1 added.
NOTICE: Node phys-schost-3 with votecount = 1 added.
...
NOTICE: Node phys-schost-1: attempting to join cluster
...
NOTICE: Node phys-schost-2 (incarnation # 937690106) has become reachable.
NOTICE: Node phys-schost-3 (incarnation # 937690290) has become reachable.
NOTICE: cluster has reached quorum.
NOTICE: node phys-schost-1 is up; new incarnation number = 937846227.
NOTICE: node phys-schost-2 is up; new incarnation number = 937690106.
NOTICE: node phys-schost-3 is up; new incarnation number = 937690290.
NOTICE: Cluster members: phys-schost-1 phys-schost-2 phys-schost-3.
...
The following example shows the console output on an x86 based system when booting node phys-schost-1 into the cluster. Similar messages appear on the consoles of the other nodes in the cluster.
ATI RAGE SDRAM BIOS P/N GR-xlint.007-4.330 * BIOS Lan-Console 2.0 Copyright (C) 1999-2001 Intel Corporation MAC ADDR: 00 02 47 31 38 3C AMIBIOS (C)1985-2002 American Megatrends Inc., Copyright 1996-2002 Intel Corporation SCB20.86B.1064.P18.0208191106 SCB2 Production BIOS Version 2.08 BIOS Build 1064 2 X Intel(R) Pentium(R) III CPU family 1400MHz Testing system memory, memory size=2048MB 2048MB Extended Memory Passed 512K L2 Cache SRAM Passed ATAPI CD-ROM SAMSUNG CD-ROM SN-124 Press <F2> to enter SETUP, <F12> Network Adaptec AIC-7899 SCSI BIOS v2.57S4 (c) 2000 Adaptec, Inc. All Rights Reserved. Press <Ctrl><A> for SCSISelect(TM) Utility! Ch B, SCSI ID: 0 SEAGATE ST336605LC 160 SCSI ID: 1 SEAGATE ST336605LC 160 SCSI ID: 6 ESG-SHV SCA HSBP M18 ASYN Ch A, SCSI ID: 2 SUN StorEdge 3310 160 SCSI ID: 3 SUN StorEdge 3310 160 AMIBIOS (C)1985-2002 American Megatrends Inc., Copyright 1996-2002 Intel Corporation SCB20.86B.1064.P18.0208191106 SCB2 Production BIOS Version 2.08 BIOS Build 1064 2 X Intel(R) Pentium(R) III CPU family 1400MHz Testing system memory, memory size=2048MB 2048MB Extended Memory Passed 512K L2 Cache SRAM Passed ATAPI CD-ROM SAMSUNG CD-ROM SN-124 SunOS - Intel Platform Edition Primary Boot Subsystem, vsn 2.0 Current Disk Partition Information Part# Status Type Start Length ================================================ 1 Active X86 BOOT 2428 21852 2 SOLARIS 24280 71662420 3 <unused> 4 <unused> Please select the partition you wish to boot: * * Solaris DCB loading /solaris/boot.bin SunOS Secondary Boot version 3.00 Solaris Intel Platform Edition Booting System Autobooting from bootpath: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/ pci8086,341a@7,1/sd@0,0:a If the system hardware has changed, or to boot from a different device, interrupt the autoboot process by pressing ESC. Press ESCape to interrupt autoboot in 2 seconds. Initializing system Please wait... 
Warning: Resource Conflict - both devices are added NON-ACPI device: ISY0050 Port: 3F0-3F5, 3F7; IRQ: 6; DMA: 2 ACPI device: ISY0050 Port: 3F2-3F3, 3F4-3F5, 3F7; IRQ: 6; DMA: 2 <<< Current Boot Parameters >>> Boot path: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/pci8086,341a@7,1/ sd@0,0:a Boot args: Type b [file-name] [boot-flags] <ENTER> to boot with options or i <ENTER> to enter boot interpreter or <ENTER> to boot with defaults <<< timeout in 5 seconds >>> Select (b)oot or (i)nterpreter: Size: 275683 + 22092 + 150244 Bytes /platform/i86pc/kernel/unix loaded - 0xac000 bytes used SunOS Release 5.9 Version Generic_112234-07 32-bit Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved. Use is subject to license terms. configuring IPv4 interfaces: e1000g2. Hostname: phys-schost-1 Booting as part of a cluster NOTICE: CMM: Node phys-schost-1 (nodeid = 1) with votecount = 1 added. NOTICE: CMM: Node phys-schost-2 (nodeid = 2) with votecount = 1 added. NOTICE: CMM: Quorum device 1 (/dev/did/rdsk/d1s2) added; votecount = 1, bitmask of nodes with configured paths = 0x3. NOTICE: clcomm: Adapter e1000g3 constructed NOTICE: clcomm: Path phys-schost-1:e1000g3 - phys-schost-2:e1000g3 being constructed NOTICE: clcomm: Path phys-schost-1:e1000g3 - phys-schost-2:e1000g3 being initiated NOTICE: clcomm: Path phys-schost-1:e1000g3 - phys-schost-2:e1000g3 online NOTICE: clcomm: Adapter e1000g0 constructed NOTICE: clcomm: Path phys-schost-1:e1000g0 - phys-schost-2:e1000g0 being constructed NOTICE: CMM: Node phys-schost-1: attempting to join cluster. NOTICE: clcomm: Path phys-schost-1:e1000g0 - phys-schost-2:e1000g0 being initiated NOTICE: CMM: Quorum device /dev/did/rdsk/d1s2: owner set to node 1. NOTICE: CMM: Cluster has reached quorum. NOTICE: CMM: Node phys-schost-1 (nodeid = 1) is up; new incarnation number = 1068496374. NOTICE: CMM: Node phys-schost-2 (nodeid = 2) is up; new incarnation number = 1068496374. NOTICE: CMM: Cluster members: phys-schost-1 phys-schost-2. 
NOTICE: CMM: node reconfiguration #1 completed. NOTICE: CMM: Node phys-schost-1: joined cluster. |
Run the scshutdown(1M) command to shut down the cluster, then boot the cluster with the boot(1M) command on each node.
SPARC: If your cluster is running Oracle Parallel Server/Real Application Clusters, shut down all instances of the database.
Refer to the Oracle Parallel Server/Real Application Clusters product documentation for shutdown procedures.
Become superuser on any node in the cluster.
Shut down the cluster.
From a single node in the cluster, type the following command.
# scshutdown -g0 -y
Each node is shut down.
Cluster nodes must have a working connection to the cluster interconnect to attain cluster membership.
Boot each node.
The order in which the nodes are booted does not matter unless you make configuration changes between shutdowns. If you make configuration changes between shutdowns, start the node with the most current configuration first.
SPARC:
ok boot
x86:
<<< Current Boot Parameters >>>
Boot path: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/pci8086,341a@7,1/sd@0,0:a
Boot args:
Type b [file-name] [boot-flags] <ENTER> to boot with options
or i <ENTER> to enter boot interpreter
or <ENTER> to boot with defaults
<<< timeout in 5 seconds >>>
Select (b)oot or (i)nterpreter: b
Verify that the nodes booted without error and are online.
The scstat command reports the nodes' status.
# scstat -n
If a cluster node's /var file system fills up, Sun Cluster might not be able to restart on that node. If this problem arises, see How to Repair a Full /var File System.
The following example shows the console output on a SPARC based system when stopping normal cluster operation, bringing down all nodes to the ok prompt, then restarting the cluster. The -g0 option sets the grace period to zero, and -y provides an automatic yes response to the confirmation question. Shutdown messages also appear on the consoles of other nodes in the cluster.
# scshutdown -g0 -y
Wed Mar 10 13:47:32 phys-schost-1 cl_runtime: WARNING: CMM monitoring disabled.
phys-schost-1#
INIT: New run level: 0
The system is coming down. Please wait.
...
The system is down.
syncing file systems... done
Program terminated
ok boot
Rebooting with command: boot
...
Hostname: phys-schost-1
Booting as part of a cluster
...
NOTICE: Node phys-schost-1: attempting to join cluster
...
NOTICE: Node phys-schost-2 (incarnation # 937690106) has become reachable.
NOTICE: Node phys-schost-3 (incarnation # 937690290) has become reachable.
NOTICE: cluster has reached quorum.
...
NOTICE: Cluster members: phys-schost-1 phys-schost-2 phys-schost-3.
...
NOTICE: Node phys-schost-1: joined cluster
...
The system is coming up. Please wait.
checking ufs filesystems
...
reservation program successfully exiting
Print services started.
volume management starting.
The system is ready.
phys-schost-1 console login:
The following example shows the console output on an x86 based system when stopping normal cluster operation, bringing down all nodes, then restarting the cluster. The -g0 option sets the grace period to zero, and -y provides an automatic yes response to the confirmation question. Shutdown messages also appear on the consoles of other nodes in the cluster.
# scshutdown -g0 -y May 2 10:32:57 phys-schost-1 cl_runtime: WARNING: CMM: Monitoring disabled. root@phys-schost-1# INIT: New run level: 0 The system is coming down. Please wait. System services are now being stopped. /etc/rc0.d/K05initrgm: Calling scswitch -S (evacuate) failfasts already disabled on node 1 Print services already stopped. May 2 10:33:13 phys-schost-1 syslogd: going down on signal 15 The system is down. syncing file systems... done Type any key to continue ATI RAGE SDRAM BIOS P/N GR-xlint.007-4.330 * BIOS Lan-Console 2.0 Copyright (C) 1999-2001 Intel Corporation MAC ADDR: 00 02 47 31 38 3C AMIBIOS (C)1985-2002 American Megatrends Inc., Copyright 1996-2002 Intel Corporation SCB20.86B.1064.P18.0208191106 SCB2 Production BIOS Version 2.08 BIOS Build 1064 2 X Intel(R) Pentium(R) III CPU family 1400MHz Testing system memory, memory size=2048MB 2048MB Extended Memory Passed 512K L2 Cache SRAM Passed ATAPI CD-ROM SAMSUNG CD-ROM SN-124 Press <F2> to enter SETUP, <F12> Network Adaptec AIC-7899 SCSI BIOS v2.57S4 (c) 2000 Adaptec, Inc. All Rights Reserved. Press <Ctrl><A> for SCSISelect(TM) Utility! 
Ch B, SCSI ID: 0 SEAGATE ST336605LC 160 SCSI ID: 1 SEAGATE ST336605LC 160 SCSI ID: 6 ESG-SHV SCA HSBP M18 ASYN Ch A, SCSI ID: 2 SUN StorEdge 3310 160 SCSI ID: 3 SUN StorEdge 3310 160 AMIBIOS (C)1985-2002 American Megatrends Inc., Copyright 1996-2002 Intel Corporation SCB20.86B.1064.P18.0208191106 SCB2 Production BIOS Version 2.08 BIOS Build 1064 2 X Intel(R) Pentium(R) III CPU family 1400MHz Testing system memory, memory size=2048MB 2048MB Extended Memory Passed 512K L2 Cache SRAM Passed ATAPI CD-ROM SAMSUNG CD-ROM SN-124 SunOS - Intel Platform Edition Primary Boot Subsystem, vsn 2.0 Current Disk Partition Information Part# Status Type Start Length ================================================ 1 Active X86 BOOT 2428 21852 2 SOLARIS 24280 71662420 3 <unused> 4 <unused> Please select the partition you wish to boot: * * Solaris DCB loading /solaris/boot.bin SunOS Secondary Boot version 3.00 Solaris Intel Platform Edition Booting System Autobooting from bootpath: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/ pci8086,341a@7,1/sd@0,0:a If the system hardware has changed, or to boot from a different device, interrupt the autoboot process by pressing ESC. Press ESCape to interrupt autoboot in 2 seconds. Initializing system Please wait... Warning: Resource Conflict - both devices are added NON-ACPI device: ISY0050 Port: 3F0-3F5, 3F7; IRQ: 6; DMA: 2 ACPI device: ISY0050 Port: 3F2-3F3, 3F4-3F5, 3F7; IRQ: 6; DMA: 2 <<< Current Boot Parameters >>> Boot path: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/pci8086,341a@7,1/ sd@0,0:a Boot args: Type b [file-name] [boot-flags] <ENTER> to boot with options or i <ENTER> to enter boot interpreter or <ENTER> to boot with defaults <<< timeout in 5 seconds >>> Select (b)oot or (i)nterpreter: b Size: 275683 + 22092 + 150244 Bytes /platform/i86pc/kernel/unix loaded - 0xac000 bytes used SunOS Release 5.9 Version Generic_112234-07 32-bit Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved. Use is subject to license terms. 
configuring IPv4 interfaces: e1000g2. Hostname: phys-schost-1 Booting as part of a cluster NOTICE: CMM: Node phys-schost-1 (nodeid = 1) with votecount = 1 added. NOTICE: CMM: Node phys-schost-2 (nodeid = 2) with votecount = 1 added. NOTICE: CMM: Quorum device 1 (/dev/did/rdsk/d1s2) added; votecount = 1, bitmask of nodes with configured paths = 0x3. NOTICE: clcomm: Adapter e1000g3 constructed NOTICE: clcomm: Path phys-schost-1:e1000g3 - phys-schost-2:e1000g3 being constructed NOTICE: clcomm: Path phys-schost-1:e1000g3 - phys-schost-2:e1000g3 being initiated NOTICE: clcomm: Path phys-schost-1:e1000g3 - phys-schost-2:e1000g3 online NOTICE: clcomm: Adapter e1000g0 constructed NOTICE: clcomm: Path phys-schost-1:e1000g0 - phys-schost-2:e1000g0 being constructed NOTICE: CMM: Node phys-schost-1: attempting to join cluster. NOTICE: clcomm: Path phys-schost-1:e1000g0 - phys-schost-2:e1000g0 being initiated NOTICE: CMM: Quorum device /dev/did/rdsk/d1s2: owner set to node 1. NOTICE: CMM: Cluster has reached quorum. NOTICE: CMM: Node phys-schost-1 (nodeid = 1) is up; new incarnation number = 1068496374. NOTICE: CMM: Node phys-schost-2 (nodeid = 2) is up; new incarnation number = 1068496374. NOTICE: CMM: Cluster members: phys-schost-1 phys-schost-2. NOTICE: CMM: node reconfiguration #1 completed. NOTICE: CMM: Node phys-schost-1: joined cluster. WARNING: mod_installdrv: no major number for rsmrdt ip: joining multicasts failed (18) on clprivnet0 - will use link layer broadcasts for multicast The system is coming up. Please wait. checking ufs filesystems /dev/rdsk/c1t0d0s5: is clean. NOTICE: clcomm: Path phys-schost-1:e1000g0 - phys-schost-2:e1000g0 online NIS domain name is dev.eng.mycompany.com starting rpc services: rpcbind keyserv ypbind done. 
Setting netmask of e1000g2 to 255.255.255.0 Setting netmask of e1000g3 to 255.255.255.128 Setting netmask of e1000g0 to 255.255.255.128 Setting netmask of clprivnet0 to 255.255.255.0 Setting default IPv4 interface for multicast: add net 224.0/4: gateway phys-schost-1 syslog service starting. obtaining access to all attached disks ***************************************************************************** * * The X-server can not be started on display :0... * ***************************************************************************** volume management starting. Starting Fault Injection Server... The system is ready. phys-schost-1 console login: |
Use the scswitch(1M) command in conjunction with the Solaris shutdown(1M) command to shut down an individual node. Use the scshutdown command only when shutting down an entire cluster.
Task | For Instructions |
---|---|
Stop a cluster node. Use scswitch(1M) and shutdown(1M). | |
Start a node. The node must have a working connection to the cluster interconnect to attain cluster membership. | |
Stop and restart (reboot) a cluster node. Use scswitch and shutdown. The node must have a working connection to the cluster interconnect to attain cluster membership. | |
Boot a node so that the node does not participate in cluster membership. Use scswitch and shutdown, then boot -x or b -x. | |
Do not use send brk on a cluster console to shut down a cluster node. The command is not supported within a cluster.
SPARC: If your cluster is running Oracle Parallel Server/Real Application Clusters, shut down all instances of the database.
Refer to the Oracle Parallel Server/Real Application Clusters product documentation for shutdown procedures.
Become superuser on the cluster node to be shut down.
Switch all resource groups, resources, and device groups from the node being shut down to other cluster members.
On the node to be shut down, type the following command.
# scswitch -S -h node
-S: Evacuates all device services and resource groups from the specified node.
-h node: Specifies the node from which you are switching resource groups and device groups.
Shut down the cluster node.
On the node to be shut down, type the following command.
# shutdown -g0 -y -i0
Verify that the cluster node is showing the ok prompt or the Select (b)oot or (i)nterpreter prompt on the Current Boot Parameters screen.
If necessary, power off the node.
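The two commands in this procedure must run in order: evacuate the node first, then shut it down. The following sketch prints both commands for a given node and run level, and runs them only when DRY_RUN is set to 0. The names node_shutdown and DRY_RUN are hypothetical. Run level 0 halts the node as in this procedure, and run level 6 is the reboot variant that is used later in this chapter.

```shell
#!/bin/sh
# Sketch: print (or run) the commands that take one node out of the cluster.
# node_shutdown and DRY_RUN are hypothetical names used for illustration.
DRY_RUN=${DRY_RUN:-1}

node_shutdown() {
    node=$1
    level=${2:-0}                       # 0 = halt, 6 = reboot
    if [ -z "$node" ]; then
        echo "usage: node_shutdown node [0|6]" >&2
        return 2
    fi
    case $level in
        0|6) ;;
        *)
            echo "run level must be 0 or 6" >&2
            return 2
            ;;
    esac
    # Evacuate resource groups and device groups, then shut down the node.
    for cmd in "scswitch -S -h $node" "shutdown -g0 -y -i$level"; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "$cmd"
        else
            $cmd || return 1            # stop if the evacuation fails
        fi
    done
}

node_shutdown phys-schost-1 0
```

Stopping when scswitch fails is a design choice in this sketch, so that a node is not shut down while it still hosts resource groups.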
The following example shows the console output on a SPARC based system when shutting down node phys-schost-1. The -g0 option sets the grace period to zero, -y provides an automatic yes response to the confirmation question, and -i0 invokes run level 0 (zero). Shutdown messages for this node appear on the consoles of other nodes in the cluster.
# scswitch -S -h phys-schost-1
# shutdown -g0 -y -i0
Wed Mar 10 13:47:32 phys-schost-1 cl_runtime: WARNING: CMM monitoring disabled.
phys-schost-1#
INIT: New run level: 0
The system is coming down. Please wait.
Notice: rgmd is being stopped.
Notice: rpc.pmfd is being stopped.
Notice: rpc.fed is being stopped.
umount: /global/.devices/node@1 busy
umount: /global/phys-schost-1 busy
The system is down.
syncing file systems... done
Program terminated
ok
The following example shows the console output on an x86 based system when shutting down node phys-schost-1. The -g0 option sets the grace period to zero, -y provides an automatic yes response to the confirmation question, and -i0 invokes run level 0 (zero). Shutdown messages for this node appear on the consoles of other nodes in the cluster.
# scswitch -S -h phys-schost-1
# shutdown -g0 -y -i0
Shutdown started. Wed Mar 10 13:47:32 PST 2004
Changing to init state 0 - please wait
Broadcast Message from root (console) on phys-schost-1 Wed Mar 10 13:47:32...
THE SYSTEM phys-schost-1 IS BEING SHUT DOWN NOW ! ! !
Log off now or risk your files being damaged
phys-schost-1#
INIT: New run level: 0
The system is coming down. Please wait.
System services are now being stopped.
/etc/rc0.d/K05initrgm: Calling scswitch -S (evacuate)
failfasts disabled on node 1
Print services already stopped.
Mar 10 13:47:44 phys-schost-1 syslogd: going down on signal 15
umount: /global/.devices/node@2 busy
umount: /global/.devices/node@1 busy
The system is down.
syncing file systems... done
WARNING: CMM: Node being shut down.
Type any key to continue
See How to Boot a Cluster Node to restart a cluster node that has been shut down.
Starting a cluster node can be affected by the quorum configuration. In a two-node cluster, you must have a quorum device configured so that the total quorum count for the cluster is three. You should have one quorum count for each node and one quorum count for the quorum device. In this situation, if the first node is shut down, the second node continues to have quorum and runs as the sole cluster member. For the first node to come back in the cluster as a cluster node, the second node must be up and running. The required cluster quorum count (two) must be present.
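The vote counts in the two-node case follow a majority rule: with three configured votes, two must be present. The following sketch works through that arithmetic. It assumes the general rule that quorum requires more than half of the configured votes, and the helper names required_votes and has_quorum are hypothetical.

```shell
#!/bin/sh
# Sketch: majority-vote arithmetic for cluster quorum.
# required_votes and has_quorum are hypothetical helper names.
required_votes() {
    total=$1
    echo $(( total / 2 + 1 ))           # strict majority of the configured votes
}

has_quorum() {
    present=$1
    total=$2
    [ "$present" -ge "$(required_votes "$total")" ]
}

# Two nodes plus one quorum device: 3 votes configured, 2 required.
required_votes 3

# One running node that holds the quorum device contributes 2 votes.
has_quorum 2 3 && echo "quorum held"

# A lone node without the quorum device contributes only 1 vote.
has_quorum 1 3 || echo "no quorum"
```

This shows why the surviving node keeps running as the sole cluster member after the first node is shut down: its own vote plus the quorum device vote meets the required count of two.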
To start a cluster node that has been shut down, boot the node.
SPARC:
ok boot
x86:
<<< Current Boot Parameters >>>
Boot path: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/pci8086,341a@7,1/sd@0,0:a
Boot args:
Type b [file-name] [boot-flags] <ENTER> to boot with options
or i <ENTER> to enter boot interpreter
or <ENTER> to boot with defaults
<<< timeout in 5 seconds >>>
Select (b)oot or (i)nterpreter: b
A cluster node must have a working connection to the cluster interconnect to attain cluster membership.
Verify that the node has booted without error and is online.
The scstat command reports the status of a node.
# scstat -n
If a cluster node's /var file system fills up, Sun Cluster might not be able to restart on that node. If this problem arises, see How to Repair a Full /var File System.
The following example shows the console output on a SPARC based system when booting node phys-schost-1 into the cluster.
ok boot
Rebooting with command: boot
...
Hostname: phys-schost-1
Booting as part of a cluster
...
NOTICE: Node phys-schost-1: attempting to join cluster
...
NOTICE: Node phys-schost-1: joined cluster
...
The system is coming up. Please wait.
checking ufs filesystems
...
reservation program successfully exiting
Print services started.
volume management starting.
The system is ready.
phys-schost-1 console login:
The following example shows the console output on an x86 based system when booting node phys-schost-1 into the cluster.
<<< Current Boot Parameters >>> Boot path: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/pci8086,341a@7,1/sd@0,0:a Boot args: Type b [file-name] [boot-flags] <ENTER> to boot with options or i <ENTER> to enter boot interpreter or <ENTER> to boot with defaults <<< timeout in 5 seconds >>> Select (b)oot or (i)nterpreter: Size: 276915 + 22156 + 150372 Bytes /platform/i86pc/kernel/unix loaded - 0xac000 bytes used SunOS Release 5.9 Version on81-feature-patch:08/30/2003 32-bit Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved. Use is subject to license terms. configuring IPv4 interfaces: e1000g2. Hostname: phys-schost-1 Booting as part of a cluster NOTICE: CMM: Node phys-schost-1 (nodeid = 1) with votecount = 1 added. NOTICE: CMM: Node phys-schost-2 (nodeid = 2) with votecount = 1 added. NOTICE: CMM: Quorum device 1 (/dev/did/rdsk/d1s2) added; votecount = 1, bitmask of nodes with configured paths = 0x3. WARNING: CMM: Initialization for quorum device /dev/did/rdsk/d1s2 failed with error EACCES. Will retry later. NOTICE: clcomm: Adapter e1000g3 constructed NOTICE: clcomm: Path phys-schost-1:e1000g3 - phys-schost-2:e1000g3 being constructed NOTICE: clcomm: Path phys-schost-1:e1000g3 - phys-schost-2:e1000g3 being initiated NOTICE: clcomm: Path phys-schost-1:e1000g3 - phys-schost-2:e1000g3 online NOTICE: clcomm: Adapter e1000g0 constructed NOTICE: clcomm: Path phys-schost-1:e1000g0 - phys-schost-2:e1000g0 being constructed NOTICE: CMM: Node phys-schost-1: attempting to join cluster. WARNING: CMM: Reading reservation keys from quorum device /dev/did/rdsk/d1s2 failed with error 2. NOTICE: CMM: Cluster has reached quorum. NOTICE: CMM: Node phys-schost-1 (nodeid = 1) is up; new incarnation number = 1068503958. NOTICE: CMM: Node phys-schost-2 (nodeid = 2) is up; new incarnation number = 1068496374. NOTICE: CMM: Cluster members: phys-schost-1 phys-schost-2. NOTICE: CMM: node reconfiguration #3 completed. NOTICE: CMM: Node phys-schost-1: joined cluster. 
NOTICE: clcomm: Path phys-schost-1:e1000g0 - phys-schost-2:e1000g0 being initiated NOTICE: clcomm: Path phys-schost-1:e1000g0 - phys-schost-2:e1000g0 online NOTICE: CMM: Retry of initialization for quorum device /dev/did/rdsk/d1s2 was successful. WARNING: mod_installdrv: no major number for rsmrdt ip: joining multicasts failed (18) on clprivnet0 - will use link layer broadcasts for multicast The system is coming up. Please wait. checking ufs filesystems /dev/rdsk/c1t0d0s5: is clean. NIS domain name is dev.eng.mycompany.com starting rpc services: rpcbind keyserv ypbind done. Setting netmask of e1000g2 to 255.255.255.0 Setting netmask of e1000g3 to 255.255.255.128 Setting netmask of e1000g0 to 255.255.255.128 Setting netmask of clprivnet0 to 255.255.255.0 Setting default IPv4 interface for multicast: add net 224.0/4: gateway phys-schost-1 syslog service starting. obtaining access to all attached disks ***************************************************************************** * * The X-server can not be started on display :0... * ***************************************************************************** volume management starting. Starting Fault Injection Server... The system is ready. phys-schost-1 console login: |
SPARC: If the cluster node is running Oracle Parallel Server/Real Application Clusters, shut down all instances of the database.
Refer to the Oracle Parallel Server/Real Application Clusters product documentation for shutdown procedures.
Become superuser on the cluster node to be shut down.
Shut down the cluster node by using the scswitch and shutdown commands.
Enter these commands on the node to be shut down. The -i6 option with the shutdown command causes the node to reboot after the node shuts down.
# scswitch -S -h node
# shutdown -g0 -y -i6
Cluster nodes must have a working connection to the cluster interconnect to attain cluster membership.
Verify that the node has booted without error and is online.
# scstat -n
The following example shows the console output on a SPARC based system when rebooting node phys-schost-1. Messages for this node, such as shutdown and startup notification, appear on the consoles of other nodes in the cluster.
# scswitch -S -h phys-schost-1
# shutdown -g0 -y -i6
Shutdown started. Wed Mar 10 13:47:32 phys-schost-1 cl_runtime: WARNING: CMM monitoring disabled.
phys-schost-1#
INIT: New run level: 6
The system is coming down. Please wait.
System services are now being stopped.
Notice: rgmd is being stopped.
Notice: rpc.pmfd is being stopped.
Notice: rpc.fed is being stopped.
umount: /global/.devices/node@1 busy
umount: /global/phys-schost-1 busy
The system is down.
syncing file systems... done
rebooting...
Resetting ...
,,,
Sun Ultra 1 SBus (UltraSPARC 143MHz), No Keyboard
OpenBoot 3.11, 128 MB memory installed, Serial #5932401.
Ethernet address 8:8:20:99:ab:77, Host ID: 8899ab77.
...
Rebooting with command: boot
...
Hostname: phys-schost-1
Booting as part of a cluster
...
NOTICE: Node phys-schost-1: attempting to join cluster
...
NOTICE: Node phys-schost-1: joined cluster
...
The system is coming up. Please wait.
The system is ready.
phys-schost-1 console login:
The following example shows the console output on an x86 based system when rebooting node phys-schost-1. Messages for this node, such as shutdown and startup notification, appear on the consoles of other nodes in the cluster.
    # scswitch -S -h phys-schost-1
    # shutdown -g0 -y -i6
    Shutdown started. Wed Mar 10 13:47:32 PST 2004
    Changing to init state 6 - please wait
    Broadcast Message from root (console) on phys-schost-1 Wed Mar 10 13:47:32...
    THE SYSTEM phys-schost-1 IS BEING SHUT DOWN NOW ! ! !
    Log off now or risk your files being damaged
    phys-schost-1#
    INIT: New run level: 6
    The system is coming down. Please wait.
    System services are now being stopped.
    /etc/rc0.d/K05initrgm: Calling scswitch -S (evacuate)
    Print services already stopped.
    Mar 10 13:47:44 phys-schost-1 syslogd: going down on signal 15
    umount: /global/.devices/node@2 busy
    umount: /global/.devices/node@1 busy
    The system is down.
    syncing file systems... done
    WARNING: CMM: Node being shut down.
    rebooting...
    ATI RAGE SDRAM BIOS P/N GR-xlint.007-4.330
    * BIOS Lan-Console 2.0
    Copyright (C) 1999-2001 Intel Corporation
    MAC ADDR: 00 02 47 31 38 3C
    AMIBIOS (C)1985-2002 American Megatrends Inc.,
    Copyright 1996-2002 Intel Corporation
    SCB20.86B.1064.P18.0208191106
    SCB2 Production BIOS Version 2.08
    BIOS Build 1064
    2 X Intel(R) Pentium(R) III CPU family 1400MHz
    Testing system memory, memory size=2048MB
    2048MB Extended Memory Passed
    512K L2 Cache SRAM Passed
    ATAPI CD-ROM SAMSUNG CD-ROM SN-124
    Press <F2> to enter SETUP, <F12> Network
    Adaptec AIC-7899 SCSI BIOS v2.57S4
    (c) 2000 Adaptec, Inc. All Rights Reserved.
    Press <Ctrl><A> for SCSISelect(TM) Utility!

    Ch B, SCSI ID: 0 SEAGATE ST336605LC    160
          SCSI ID: 1 SEAGATE ST336605LC    160
          SCSI ID: 6 ESG-SHV SCA HSBP M18  ASYN
    Ch A, SCSI ID: 2 SUN     StorEdge 3310 160
          SCSI ID: 3 SUN     StorEdge 3310 160
    AMIBIOS (C)1985-2002 American Megatrends Inc.,
    Copyright 1996-2002 Intel Corporation
    SCB20.86B.1064.P18.0208191106
    SCB2 Production BIOS Version 2.08
    BIOS Build 1064
    2 X Intel(R) Pentium(R) III CPU family 1400MHz
    Testing system memory, memory size=2048MB
    2048MB Extended Memory Passed
    512K L2 Cache SRAM Passed
    ATAPI CD-ROM SAMSUNG CD-ROM SN-124
    SunOS - Intel Platform Edition Primary Boot Subsystem, vsn 2.0
    Current Disk Partition Information
    Part#   Status   Type       Start    Length
    ================================================
      1     Active   X86 BOOT    2428     21852
      2              SOLARIS    24280  71662420
      3              <unused>
      4              <unused>
    Please select the partition you wish to boot: * *
    Solaris DCB
    loading /solaris/boot.bin
    SunOS Secondary Boot version 3.00
    Solaris Intel Platform Edition Booting System
    Autobooting from bootpath: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/
    pci8086,341a@7,1/sd@0,0:a
    If the system hardware has changed, or to boot from a different
    device, interrupt the autoboot process by pressing ESC.
    Press ESCape to interrupt autoboot in 2 seconds.
    Initializing system
    Please wait...
    Warning: Resource Conflict - both devices are added
    NON-ACPI device: ISY0050
        Port: 3F0-3F5, 3F7; IRQ: 6; DMA: 2
    ACPI device: ISY0050
        Port: 3F2-3F3, 3F4-3F5, 3F7; IRQ: 6; DMA: 2
    <<< Current Boot Parameters >>>
    Boot path: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/pci8086,341a@7,1/
    sd@0,0:a
    Boot args:
    Type  b [file-name] [boot-flags] <ENTER>  to boot with options
    or    i <ENTER>                           to enter boot interpreter
    or    <ENTER>                             to boot with defaults
    <<< timeout in 5 seconds >>>
    Select (b)oot or (i)nterpreter:
    Size: 276915 + 22156 + 150372 Bytes
    /platform/i86pc/kernel/unix loaded - 0xac000 bytes used
    SunOS Release 5.9 Version on81-feature-patch:08/30/2003 32-bit
    Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved.
    Use is subject to license terms.

    configuring IPv4 interfaces: e1000g2.
    Hostname: phys-schost-1
    Booting as part of a cluster
    NOTICE: CMM: Node phys-schost-1 (nodeid = 1) with votecount = 1 added.
    NOTICE: CMM: Node phys-schost-2 (nodeid = 2) with votecount = 1 added.
    NOTICE: CMM: Quorum device 1 (/dev/did/rdsk/d1s2) added; votecount = 1, bitmask of nodes with configured paths = 0x3.
    WARNING: CMM: Initialization for quorum device /dev/did/rdsk/d1s2 failed with error EACCES. Will retry later.
    NOTICE: clcomm: Adapter e1000g3 constructed
    NOTICE: clcomm: Path phys-schost-1:e1000g3 - phys-schost-2:e1000g3 being constructed
    NOTICE: clcomm: Path phys-schost-1:e1000g3 - phys-schost-2:e1000g3 being initiated
    NOTICE: clcomm: Path phys-schost-1:e1000g3 - phys-schost-2:e1000g3 online
    NOTICE: clcomm: Adapter e1000g0 constructed
    NOTICE: clcomm: Path phys-schost-1:e1000g0 - phys-schost-2:e1000g0 being constructed
    NOTICE: CMM: Node phys-schost-1: attempting to join cluster.
    WARNING: CMM: Reading reservation keys from quorum device /dev/did/rdsk/d1s2 failed with error 2.
    NOTICE: CMM: Cluster has reached quorum.
    NOTICE: CMM: Node phys-schost-1 (nodeid = 1) is up; new incarnation number = 1068503958.
    NOTICE: CMM: Node phys-schost-2 (nodeid = 2) is up; new incarnation number = 1068496374.
    NOTICE: CMM: Cluster members: phys-schost-1 phys-schost-2.
    NOTICE: CMM: node reconfiguration #3 completed.
    NOTICE: CMM: Node phys-schost-1: joined cluster.
    NOTICE: clcomm: Path phys-schost-1:e1000g0 - phys-schost-2:e1000g0 being initiated
    NOTICE: clcomm: Path phys-schost-1:e1000g0 - phys-schost-2:e1000g0 online
    NOTICE: CMM: Retry of initialization for quorum device /dev/did/rdsk/d1s2 was successful.
    WARNING: mod_installdrv: no major number for rsmrdt
    ip: joining multicasts failed (18) on clprivnet0 - will use link layer broadcasts for multicast
    The system is coming up. Please wait.
    checking ufs filesystems
    /dev/rdsk/c1t0d0s5: is clean.
    NIS domain name is dev.eng.mycompany.com
    starting rpc services: rpcbind keyserv ypbind done.

    Setting netmask of e1000g2 to 255.255.255.0
    Setting netmask of e1000g3 to 255.255.255.128
    Setting netmask of e1000g0 to 255.255.255.128
    Setting netmask of clprivnet0 to 255.255.255.0
    Setting default IPv4 interface for multicast: add net 224.0/4: gateway phys-schost-1
    syslog service starting.
    obtaining access to all attached disks
    *****************************************************************************
    *
    * The X-server can not be started on display :0...
    *
    *****************************************************************************
    volume management starting.
    Starting Fault Injection Server...
    The system is ready.
    phys-schost-1 console login:
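The join notices in the reboot examples above can also be checked from a saved copy of the console output. The following is a minimal sketch and not part of Sun Cluster: the function name node_joined_cluster and the idea of capturing the console output to a text file are assumptions made for illustration. The pattern matches both forms of the notice shown in the examples.

```shell
#!/bin/sh
# Hypothetical helper (not a Sun Cluster command): succeed if a captured
# console log contains the "joined cluster" notice for the given node.
# Usage: node_joined_cluster NODE LOGFILE
node_joined_cluster() {
    node=$1
    logfile=$2
    # Matches "NOTICE: Node NODE: joined cluster" as well as
    # "NOTICE: CMM: Node NODE: joined cluster." from the examples.
    grep "Node ${node}: joined cluster" "$logfile" >/dev/null 2>&1
}
```

For example, `node_joined_cluster phys-schost-1 console.log && echo joined` prints `joined` only if the notice is present, where console.log is a hypothetical capture file.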
You can boot a node in non-cluster mode, in which the node does not participate in cluster membership. Non-cluster mode is useful when installing the cluster software or performing certain administrative procedures, such as patching a node.
Become superuser on the cluster node to be started in non-cluster mode.
Shut down the node by using the scswitch and shutdown commands.
    # scswitch -S -h node
    # shutdown -g0 -y -i0
Verify that the node is showing the ok prompt or the Select (b)oot or (i)nterpreter prompt on the Current Boot Parameters screen.
Boot the node in non-cluster mode by using the boot(1M) or b command with the -x option.
SPARC:
    ok boot -x
x86:
    <<< Current Boot Parameters >>>
    Boot path: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/pci8086,341a@7,1/
    sd@0,0:a
    Boot args:
    Type  b [file-name] [boot-flags] <ENTER>  to boot with options
    or    i <ENTER>                           to enter boot interpreter
    or    <ENTER>                             to boot with defaults
    <<< timeout in 5 seconds >>>
    Select (b)oot or (i)nterpreter: b -x
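The shutdown step of this procedure can be rehearsed as a dry run before anything is typed on a production node. The sketch below only prints the two commands from the procedure and never executes them. The function name print_node_shutdown_cmds is hypothetical; the command lines themselves are the ones given in the procedure.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print, but do not run, the commands that
# evacuate a node and bring it to run level 0 before a non-cluster boot.
# Usage: print_node_shutdown_cmds NODE
print_node_shutdown_cmds() {
    node=$1
    if [ -z "$node" ]; then
        echo "usage: print_node_shutdown_cmds node" >&2
        return 1
    fi
    # Evacuate the node (scswitch -S) before shutting it down.
    echo "scswitch -S -h $node"
    # Zero grace period, automatic yes to the confirmation, run level 0.
    echo "shutdown -g0 -y -i0"
}
```

Printing the commands instead of running them keeps the sketch safe to try on any host. The boot -x or b -x step is typed at the boot prompt and cannot be scripted from the running system, so it is left out.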
The following example shows the console output on a SPARC based system when shutting down node phys-schost-1 and then restarting the node in non-cluster mode. The -g0 option sets the grace period to zero, -y provides an automatic yes response to the confirmation question, and -i0 invokes run level 0 (zero). Shutdown messages for this node appear on the consoles of other nodes in the cluster.
    # scswitch -S -h phys-schost-1
    # shutdown -g0 -y -i0
    Shutdown started. Wed Mar 10 13:47:32
    phys-schost-1 cl_runtime: WARNING: CMM monitoring disabled.
    phys-schost-1#
    ...
    rg_name = schost-sa-1 ...
    offline node = phys-schost-2 ...
    num of node = 0 ...
    phys-schost-1#
    INIT: New run level: 0
    The system is coming down. Please wait.
    System services are now being stopped.
    Print services stopped.
    syslogd: going down on signal 15
    ...
    The system is down.
    syncing file systems... done
    WARNING: node phys-schost-1 is being shut down.
    Program terminated
    ok boot -x
    ...
    Not booting as part of cluster
    ...
    The system is ready.
    phys-schost-1 console login:
The following example shows the console output on an x86 based system when shutting down node phys-schost-1 and then restarting the node in non-cluster mode. The -g0 option sets the grace period to zero, -y provides an automatic yes response to the confirmation question, and -i0 invokes run level 0 (zero). Shutdown messages for this node appear on the consoles of other nodes in the cluster.
    # scswitch -S -h phys-schost-1
    # shutdown -g0 -y -i0
    Shutdown started. Wed Mar 10 13:47:32 PST 2004
    phys-schost-1#
    INIT: New run level: 0
    The system is coming down. Please wait.
    System services are now being stopped.
    Print services already stopped.
    Mar 10 13:47:44 phys-schost-1 syslogd: going down on signal 15
    ...
    The system is down.
    syncing file systems... done
    WARNING: CMM: Node being shut down.
    Type any key to continue
    ATI RAGE SDRAM BIOS P/N GR-xlint.007-4.330
    * BIOS Lan-Console 2.0
    Copyright (C) 1999-2001 Intel Corporation
    MAC ADDR: 00 02 47 31 38 3C
    AMIBIOS (C)1985-2002 American Megatrends Inc.,
    Copyright 1996-2002 Intel Corporation
    SCB20.86B.1064.P18.0208191106
    SCB2 Production BIOS Version 2.08
    BIOS Build 1064
    2 X Intel(R) Pentium(R) III CPU family 1400MHz
    Testing system memory, memory size=2048MB
    2048MB Extended Memory Passed
    512K L2 Cache SRAM Passed
    ATAPI CD-ROM SAMSUNG CD-ROM SN-124
    Press <F2> to enter SETUP, <F12> Network
    Adaptec AIC-7899 SCSI BIOS v2.57S4
    (c) 2000 Adaptec, Inc. All Rights Reserved.
    Press <Ctrl><A> for SCSISelect(TM) Utility!

    Ch B, SCSI ID: 0 SEAGATE ST336605LC    160
          SCSI ID: 1 SEAGATE ST336605LC    160
          SCSI ID: 6 ESG-SHV SCA HSBP M18  ASYN
    Ch A, SCSI ID: 2 SUN     StorEdge 3310 160
          SCSI ID: 3 SUN     StorEdge 3310 160
    AMIBIOS (C)1985-2002 American Megatrends Inc.,
    Copyright 1996-2002 Intel Corporation
    SCB20.86B.1064.P18.0208191106
    SCB2 Production BIOS Version 2.08
    BIOS Build 1064
    2 X Intel(R) Pentium(R) III CPU family 1400MHz
    Testing system memory, memory size=2048MB
    2048MB Extended Memory Passed
    512K L2 Cache SRAM Passed
    ATAPI CD-ROM SAMSUNG CD-ROM SN-124
    SunOS - Intel Platform Edition Primary Boot Subsystem, vsn 2.0
    Current Disk Partition Information
    Part#   Status   Type       Start    Length
    ================================================
      1     Active   X86 BOOT    2428     21852
      2              SOLARIS    24280  71662420
      3              <unused>
      4              <unused>
    Please select the partition you wish to boot: * *
    Solaris DCB
    loading /solaris/boot.bin
    SunOS Secondary Boot version 3.00
    Solaris Intel Platform Edition Booting System
    Autobooting from bootpath: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/
    pci8086,341a@7,1/sd@0,0:a
    If the system hardware has changed, or to boot from a different
    device, interrupt the autoboot process by pressing ESC.
    Press ESCape to interrupt autoboot in 2 seconds.
    Initializing system
    Please wait...
    Warning: Resource Conflict - both devices are added
    NON-ACPI device: ISY0050
        Port: 3F0-3F5, 3F7; IRQ: 6; DMA: 2
    ACPI device: ISY0050
        Port: 3F2-3F3, 3F4-3F5, 3F7; IRQ: 6; DMA: 2
    <<< Current Boot Parameters >>>
    Boot path: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/pci8086,341a@7,1/
    sd@0,0:a
    Boot args:
    Type  b [file-name] [boot-flags] <ENTER>  to boot with options
    or    i <ENTER>                           to enter boot interpreter
    or    <ENTER>                             to boot with defaults
    <<< timeout in 5 seconds >>>
    Select (b)oot or (i)nterpreter: b -x
    ...
    Not booting as part of cluster
    ...
    The system is ready.
    phys-schost-1 console login:
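The console messages in these examples distinguish the two boot modes: a cluster boot prints "Booting as part of a cluster", and a non-cluster boot prints "Not booting as part of cluster". As a rough sketch, a saved console capture can be classified by looking for these strings. The function name boot_mode is hypothetical, and the sketch assumes the capture file covers a single boot.

```shell
#!/bin/sh
# Hypothetical helper: classify the boot mode recorded in a captured
# console log. Assumes the log covers exactly one boot.
# Usage: boot_mode LOGFILE
boot_mode() {
    logfile=$1
    if grep "Not booting as part of cluster" "$logfile" >/dev/null 2>&1; then
        echo "non-cluster"
    elif grep "Booting as part of a cluster" "$logfile" >/dev/null 2>&1; then
        echo "cluster"
    else
        # Neither message found, or the file could not be read.
        echo "unknown"
    fi
}
```

The non-cluster message is tested first only for readability. The two strings do not overlap, because grep is case sensitive and the non-cluster message uses a lowercase "booting".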
Both Solaris and Sun Cluster software write error messages to the /var/adm/messages file, which over time can fill the /var file system. If a cluster node's /var file system fills up, Sun Cluster might not be able to restart on that node. Additionally, you might not be able to log in to the node.
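Because a full /var file system can prevent a restart or a login, it can help to watch its usage before it fills. The following is a minimal sketch under stated assumptions: it assumes a df command that accepts the POSIX -P option (on some Solaris releases this may mean the XPG4 version of df), and the function names and the 90 percent default threshold are hypothetical.

```shell
#!/bin/sh
# Sketch only: report how full /var is and warn above a threshold.
# The function names and the 90 percent default are hypothetical.

# Read POSIX-format "df -P" output on stdin and print the capacity
# column of the first data line without the percent sign.
usage_pct() {
    awk 'NR == 2 { print $5 }' | tr -d '%'
}

# Print the /var usage. Return 1 if it is at or above the threshold,
# 2 if the usage could not be determined.
# Usage: check_var_usage [THRESHOLD_PERCENT]
check_var_usage() {
    threshold=${1:-90}
    pct=`df -P /var | usage_pct`
    if [ -z "$pct" ]; then
        echo "could not determine /var usage" >&2
        return 2
    fi
    if [ "$pct" -ge "$threshold" ]; then
        echo "WARNING: /var is ${pct}% full"
        return 1
    fi
    echo "/var is ${pct}% full"
}
```

Calling `check_var_usage 85` from a periodic job would flag the file system once it reaches 85 percent. The parsing is kept in its own function so that it can be exercised with sample df output.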
If a node reports a full /var file system and continues to run Sun Cluster services, use this procedure to clear the full file system. Refer to “Viewing System Messages” in System Administration Guide: Advanced Administration for more information.