Sun Cluster System Administration Guide for Solaris OS

Adding and Removing a Cluster Node

The following table lists the tasks to perform when adding a node to an existing cluster. To complete the procedure correctly, these tasks must be performed in the order shown.

Table 7–2 Task Map: Adding a Cluster Node to an Existing Cluster

Task 1: Install the host adapter on the node and verify that the existing cluster interconnects can support the new node.

   For instructions, go to: Sun Cluster 3.0-3.1 Hardware Administration Manual for Solaris OS

Task 2: Add shared storage.

   For instructions, go to: Sun Cluster 3.0-3.1 Hardware Administration Manual for Solaris OS

Task 3: Add the node to the authorized node list.

   - Use scsetup.

   For instructions, go to: How to Add a Node to the Authorized Node List

Task 4: Install and configure the software on the new cluster node.

   - Install the Solaris operating system and Sun Cluster software.

   - Configure the node as part of the cluster.

   For instructions, go to: Chapter 2, Installing and Configuring Sun Cluster Software, in Sun Cluster Software Installation Guide for Solaris OS
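
After the new node is installed and booted into the cluster, you can confirm its membership from any cluster node. The following check is a sketch; the node names are hypothetical, and the output format follows Example 7–12.


# scstat -n
-- Cluster Nodes --
                    Node name           Status
                    ---------           ------
  Cluster node:     phys-schost-1       Online
  Cluster node:     phys-schost-2       Online
  Cluster node:     phys-schost-3       Online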

The following table lists the tasks to perform when removing a node from an existing cluster. To complete the procedure correctly, the tasks must be performed in the order shown.


Caution –

Do not use this procedure if your cluster runs an Oracle Parallel Server (OPS) configuration. At this time, removing a node from an OPS configuration might cause nodes to panic at reboot.


Table 7–3 Task Map: Removing a Cluster Node (5/02)

Task 1: Move all resource groups and disk device groups off the node to be removed.

   - Use scswitch(1M):

   # scswitch -S -h from-node

Task 2: Remove the node from all resource groups.

   - Use scrgadm(1M).

   For instructions, go to: Sun Cluster Data Services Planning and Administration Guide for Solaris OS

Task 3: Remove the node from all disk device groups.

   - Use scconf(1M), metaset(1M), and scsetup(1M).

   Caution: If the number of desired secondaries is configured as 2 or more, decrease it to 1.

   For instructions, go to:
   How to Remove a Node From a Disk Device Group (Solstice DiskSuite/Solaris Volume Manager)
   SPARC: How to Remove a Node From a Disk Device Group (VERITAS Volume Manager)
   SPARC: How to Remove a Node From a Raw Disk Device Group

Task 4: Remove all fully connected quorum devices.

   - Use scsetup.

   Caution: Do not remove the quorum device if you are removing a node from a two-node cluster.

   For instructions, go to: How to Remove a Quorum Device

   Note: Although you must remove the quorum device before you remove the storage device in the next task, you can add the quorum device back immediately afterward.

Task 5: Remove all fully connected storage devices from the node.

   - Use devfsadm(1M) and scdidadm(1M).

   Caution: Do not remove the quorum device if you are removing a node from a two-node cluster.

   For instructions, go to: How to Remove Connectivity Between an Array and a Single Node, in a Cluster With Greater Than Two-Node Connectivity

Task 6: Add the quorum devices back, to only the nodes that are intended to remain in the cluster.

   - Use scconf -a -q globaldev=d[n],node=node1,node=node2

   For instructions, go to: scconf(1M)

Task 7: Place the node being removed into maintenance state.

   - Use scswitch(1M), shutdown(1M), and scconf(1M).

   For instructions, go to: How to Put a Node Into Maintenance State

Task 8: Remove all logical transport connections (transport cables and adapters) from the node being removed.

   - Use scsetup.

   For instructions, go to: How to Remove Cluster Transport Cables, Transport Adapters, and Transport Junctions

Task 9: Remove all quorum devices connected to the node being removed.

   - Use scsetup and scconf(1M).

   For instructions, go to: How to Remove the Last Quorum Device From a Cluster

Task 10: Remove the node from the cluster software configuration.

   - Use scconf(1M).

   For instructions, go to: How to Remove a Node From the Cluster Software Configuration
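
Taken in order, the command-line portions of this task map look like the following sketch. The node phys-schost-2, the remaining nodes phys-schost-1 and phys-schost-3, and the quorum device d4 are hypothetical, and the menu-driven scsetup tasks are omitted; see the referenced procedures for the complete steps.


[Evacuate resource groups and disk device groups from the node:]
# scswitch -S -h phys-schost-2
[After removing the storage devices, add the quorum device back to the remaining nodes:]
# scconf -a -q globaldev=d4,node=phys-schost-1,node=phys-schost-3
[After maintenance state and transport removal, remove the node from the configuration:]
# scconf -r -h node=phys-schost-2
[Verify the removal:]
# scstat -n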

How to Add a Node to the Authorized Node List

Before adding a machine to an existing cluster, be sure the node has all of the necessary hardware correctly installed and configured, including a good physical connection to the private cluster interconnect.

For hardware installation information, refer to the Sun Cluster 3.0-3.1 Hardware Administration Manual for Solaris OS or the hardware documentation that shipped with your server.

This procedure permits a machine to install itself into a cluster by adding its node name to the list of authorized nodes for that cluster.

You must be superuser on a current cluster member to complete this procedure.

Steps
  1. Be sure you have correctly completed all prerequisite hardware installation and configuration tasks listed in the task map for Adding and Removing a Cluster Node.

  2. Type the scsetup command.


    # scsetup
    

    The Main Menu is displayed.

  3. To access the New Nodes Menu, type 7 at the Main Menu.

  4. To modify the authorized list, type 3 at the New Nodes Menu. This option is labeled “Specify the name of a machine which may add itself.”

    Follow the prompts to add the node's name to the cluster. You will be asked for the name of the node to be added.

  5. Verify that the task has been performed successfully.

    The scsetup utility prints a “Command completed successfully” message if it completes the task without error.

  6. Quit the scsetup utility.

  7. Install and configure the software on the new cluster node.

    Use either scinstall or JumpStart to complete the installation and configuration of the new node, as described in the Sun Cluster Software Installation Guide for Solaris OS.

  8. To prevent any new machines from being added to the cluster, type 1 at the New Nodes Menu.

    Follow the scsetup prompts. This option tells the cluster to ignore all requests coming in over the public network from any new machine trying to add itself to the cluster.


Example 7–11 Adding a Cluster Node to the Authorized Node List

The following example shows how to add a node named phys-schost-3 to the authorized node list in an existing cluster.


[Become superuser and execute the scsetup utility.]
# scsetup
Select New nodes>Specify the name of a machine which may add itself.
Answer the questions when prompted.
Verify that the scconf command completed successfully.
 
scconf -a -T node=phys-schost-3
 
    Command completed successfully.
Select Prevent any new machines from being added to the cluster.
Quit the scsetup New Nodes Menu and Main Menu.
[Install the cluster software.]

See Also

For an overall list of tasks for adding a cluster node, see Table 7–2, “Task Map: Adding a Cluster Node to an Existing Cluster.”

To add a node to an existing resource group, see the Sun Cluster Data Services Planning and Administration Guide for Solaris OS.

How to Remove a Node From the Cluster Software Configuration

Perform this procedure to remove a node from the cluster.

Steps
  1. Be sure you have correctly completed all prerequisite tasks listed in the “Removing a Cluster Node” task map in Adding and Removing a Cluster Node.


    Note –

    Be sure you have removed the node from all resource groups, disk device groups, and quorum device configurations and placed it in maintenance state before you continue with this procedure.


  2. Become superuser on a node in the cluster other than the node to remove.

  3. Remove the node from the cluster.


    # scconf -r -h node=node-name
    
  4. Verify the node removal by using scstat(1M).


    # scstat -n
    
  5. Do you intend to uninstall Sun Cluster software from the removed node?

    • If yes, go to How to Uninstall Sun Cluster Software From a Cluster Node.

    • If no, you can now remove the node's hardware connections, as described in the Sun Cluster 3.0-3.1 Hardware Administration Manual for Solaris OS.


Example 7–12 Removing a Node From the Cluster Software Configuration

This example shows how to remove a node (phys-schost-2) from a cluster. All commands are run from another node of the cluster (phys-schost-1).


[Remove the node from the cluster:]
phys-schost-1# scconf -r -h node=phys-schost-2
[Verify node removal:]
phys-schost-1# scstat -n
-- Cluster Nodes --
                    Node name           Status
                    ---------           ------
  Cluster node:     phys-schost-1       Online

See Also

To uninstall Sun Cluster software from the removed node, see How to Uninstall Sun Cluster Software From a Cluster Node.

For hardware procedures, see the Sun Cluster 3.0-3.1 Hardware Administration Manual for Solaris OS.

For an overall list of tasks for removing a cluster node, see Table 7–3.

To add a node to an existing cluster, see How to Add a Node to the Authorized Node List.

How to Remove Connectivity Between an Array and a Single Node, in a Cluster With Greater Than Two-Node Connectivity

Use this procedure to detach a storage array from a single cluster node, in a cluster that has three- or four-node connectivity.

Steps
  1. Back up all database tables, data services, and volumes that are associated with the storage array that you are removing.

  2. Determine the resource groups and device groups that are running on the node to be disconnected.


    # scstat
    
  3. If necessary, move all resource groups and device groups off the node to be disconnected.


    Caution (SPARC only) –

    If your cluster is running Oracle Parallel Server/Real Application Clusters software, shut down the Oracle Parallel Server/Real Application Clusters database instance that is running on the node before you move the groups off the node. For instructions, see the Oracle Database Administration Guide.



    # scswitch -S -h from-node
    
  4. Put the device groups into maintenance state.

    For the procedure on acquiescing I/O activity to Veritas shared disk groups, see your VxVM documentation.

    For the procedure on putting a device group in maintenance state, see Chapter 7, Administering the Cluster.

  5. Remove the node from the device groups.

    • If you use VxVM or raw disk, use the scconf(1M) command to remove the node from the device groups.

    • If you use Solstice DiskSuite, use the metaset command to remove the node from the device groups.
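
    For example, the following sketch removes a hypothetical node phys-schost-2 from a VxVM or raw-disk device group named dg1 and from a Solstice DiskSuite diskset named setA; substitute your own group, diskset, and node names.


    # scconf -r -D name=dg1,nodelist=phys-schost-2
    # metaset -s setA -d -h phys-schost-2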

  6. If the cluster is running HAStorage or HAStoragePlus, remove the node from the resource group's nodelist.


    # scrgadm -c -g resource-group -h nodelist
    

    See the Sun Cluster Data Services Planning and Administration Guide for Solaris OS for more information on changing a resource group's nodelist.


    Note –

    Resource type, resource group, and resource property names are case insensitive when executing scrgadm.
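
    For example, to set a node list that no longer includes the disconnected node (a sketch; the resource group rg1 and the node names are hypothetical):


    # scrgadm -c -g rg1 -y Nodelist=phys-schost-1,phys-schost-3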


  7. If the storage array you are removing is the last storage array that is connected to the node, disconnect the fiber-optic cable between the node and the hub or switch that is connected to this storage array (otherwise, skip this step).

  8. Do you want to remove the host adapter from the node you are disconnecting?

    • If yes, shut down and power off the node.

    • If no, skip to Step 11.

  9. Remove the host adapter from the node.

    For the procedure on removing host adapters, see the documentation that shipped with your node.

  10. Without allowing the node to boot, power on the node.

  11. Boot the node into non-cluster mode.

    • SPARC:


      ok boot -x
      
    • x86:


                            <<< Current Boot Parameters >>>
      Boot path: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/pci8086,341a@7,1/
      sd@0,0:a
      Boot args:
      
      Type    b [file-name] [boot-flags] <ENTER>  to boot with options
      or      i <ENTER>                           to enter boot interpreter
      or      <ENTER>                             to boot with defaults
      
                        <<< timeout in 5 seconds >>>
      Select (b)oot or (i)nterpreter: b -x
      

    Caution (SPARC only) –

    The node must be in non-cluster mode before you remove Oracle Parallel Server/Real Application Clusters software in the next step, or the node panics and potentially causes a loss of data availability.


  12. SPARC: If Oracle Parallel Server/Real Application Clusters software has been installed, remove the Oracle Parallel Server/Real Application Clusters software package from the node that you are disconnecting.


    # pkgrm SUNWscucm 
    

    Caution (SPARC only) –

    If you do not remove the Oracle Parallel Server/Real Application Clusters software from the disconnected node, the node panics when it is reintroduced to the cluster, potentially causing a loss of data availability.


  13. Boot the node into cluster mode.

    • SPARC:


      ok boot
      
    • x86:


                            <<< Current Boot Parameters >>>
      Boot path: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/pci8086,341a@7,1/
      sd@0,0:a
      Boot args:
      
      Type    b [file-name] [boot-flags] <ENTER>  to boot with options
      or      i <ENTER>                           to enter boot interpreter
      or      <ENTER>                             to boot with defaults
      
                        <<< timeout in 5 seconds >>>
      Select (b)oot or (i)nterpreter: b
      
  14. On the node, update the device namespace by updating the /devices and /dev entries.


    # devfsadm -C 
    # scdidadm -C
    
  15. Bring the device groups back online.

    For procedures about bringing a VERITAS shared disk group online, see your VERITAS Volume Manager documentation.

    For the procedure on bringing a device group online, see the procedure on putting a device group into maintenance state.

How to Uninstall Sun Cluster Software From a Cluster Node

Perform this procedure to uninstall Sun Cluster software from a cluster node before you disconnect it from a fully established cluster configuration. You can use this procedure to uninstall software from the last remaining node of a cluster.


Note –

To uninstall Sun Cluster software from a node that has not yet joined the cluster or is still in install mode, do not perform this procedure. Instead, go to “How to Uninstall Sun Cluster Software to Correct Installation Problems” in the Sun Cluster Software Installation Guide for Solaris OS.


Steps
  1. Be sure you have correctly completed all prerequisite tasks listed in the task map for removing a cluster node.

    See Adding and Removing a Cluster Node.


    Note –

    Be sure you have removed the node from all resource groups, device groups, and quorum device configurations, placed it in maintenance state, and removed it from the cluster before you continue with this procedure.


  2. Become superuser on an active cluster member other than the node you will uninstall.

  3. From the active cluster member, add the node you intend to uninstall to the cluster's node authentication list.


    # scconf -a -T node=nodename
    
    -a              Add

    -T              Specifies authentication options

    node=nodename   Specifies the name of the node to add to the authentication list

    Alternatively, you can use the scsetup(1M) utility. See How to Add a Node to the Authorized Node List for procedures.

  4. Become superuser on the node to uninstall.

  5. Reboot the node into non-cluster mode.

    • SPARC:


      # shutdown -g0 -y -i0
      ok boot -x
      
    • x86:


      # shutdown -g0 -y -i0
      ...
                            <<< Current Boot Parameters >>>
      Boot path: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/pci8086,341a@7,1/
      sd@0,0:a
      Boot args:
      
      Type    b [file-name] [boot-flags] <ENTER>  to boot with options
      or      i <ENTER>                           to enter boot interpreter
      or      <ENTER>                             to boot with defaults
      
                        <<< timeout in 5 seconds >>>
      Select (b)oot or (i)nterpreter: b -x
      
  6. In the /etc/vfstab file, remove all globally mounted file system entries except the /global/.devices global mounts.
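
    For reference, a global mount entry in /etc/vfstab looks like the following line (a hypothetical VxVM volume mounted at /global/dg1; your device paths will differ). Remove entries like this one, but leave the /global/.devices entries in place.


    /dev/vx/dsk/dg1/vol01 /dev/vx/rdsk/dg1/vol01 /global/dg1 ufs 2 yes global,logging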

  7. Uninstall Sun Cluster software from the node.

    Run the command from a directory that is not associated with any Sun Cluster packages.


    # cd /
    # scinstall -r
    

    See the scinstall(1M) man page for more information. If scinstall returns error messages, see Troubleshooting a Node Uninstallation.

  8. Disconnect the transport cables and the transport junction, if any, from the other cluster devices.

    1. If the uninstalled node is connected to a storage device that uses a parallel SCSI interface, install a SCSI terminator on the open SCSI connector of the storage device after you disconnect the transport cables.

      If the uninstalled node is connected to a storage device that uses Fibre Channel interfaces, no termination is necessary.

    2. Follow the documentation that shipped with your host adapter and server for disconnection procedures.

How to Correct Error Messages

To correct the error messages described in Troubleshooting a Node Uninstallation, perform this procedure.

Steps
  1. Attempt to rejoin the node to the cluster.


    ok boot
    
  2. Did the node successfully rejoin the cluster?

    • If no, proceed to Step 3.

    • If yes, perform the following steps to remove the node from the remaining disk device groups.

    1. Remove the node from all remaining disk device groups.

      Follow the procedures in How to Remove a Node From All Disk Device Groups.

    2. After you remove the node from all disk device groups, return to How to Uninstall Sun Cluster Software From a Cluster Node and repeat the procedure.

  3. If the node could not rejoin the cluster, rename the node's /etc/cluster/ccr file to any other name you choose, for example, ccr.old.


    # mv /etc/cluster/ccr /etc/cluster/ccr.old
    
  4. Return to How to Uninstall Sun Cluster Software From a Cluster Node and repeat the procedure.

Troubleshooting a Node Uninstallation

This section describes error messages you might receive when you run the scinstall -r command and the corrective actions to take.

Unremoved Cluster File System Entries

The following error messages indicate that the node you removed still has cluster file systems referenced in its vfstab file.


Verifying that no unexpected global mounts remain in /etc/vfstab ... failed
scinstall:  global-mount1 is still configured as a global mount.
scinstall:  /global/dg1 is still configured as a global mount.
 
scinstall:  It is not safe to uninstall with these outstanding errors.
scinstall:  Refer to the documentation for complete uninstall instructions.
scinstall:  Uninstall failed.

To correct this error, return to How to Uninstall Sun Cluster Software From a Cluster Node and repeat the procedure. Ensure that you successfully complete Step 6 in the procedure before you rerun the scinstall -r command.

Unremoved Listing in Disk Device Groups

The following error messages indicate that the node you removed is still listed with a disk device group.


Verifying that no device services still reference this node ... failed
scinstall:  This node is still configured to host device service "service".
scinstall:  This node is still configured to host device service "service2".
scinstall:  This node is still configured to host device service "service3".
scinstall:  This node is still configured to host device service "dg1".
 
scinstall:  It is not safe to uninstall with these outstanding errors.
scinstall:  Refer to the documentation for complete uninstall instructions.
scinstall:  Uninstall failed.

To correct this error, perform the steps in How to Correct Error Messages to remove the node from the remaining disk device groups, and then repeat the uninstall procedure.