Sun Cluster 3.1 System Administration Guide

Adding and Removing a Cluster Node

The following table lists the tasks to perform when adding a node to an existing cluster. To complete the procedure correctly, these tasks must be performed in the order shown.

Table 6–2 Task Map: Adding a Cluster Node to an Existing Cluster

Task: Install the host adapter on the node and verify that the existing cluster interconnects can support the new node.
For instructions: Sun Cluster 3.1 Hardware Administration Manual

Task: Add shared storage.
For instructions: Sun Cluster 3.1 Hardware Administration Manual

Task: Add the node to the authorized node list.
   - Use scsetup.
For instructions: How to Add a Node to the Authorized Node List

Task: Install and configure the software on the new cluster node.
   - Install the Solaris Operating Environment and Sun Cluster software.
   - Configure the node as part of the cluster.
For instructions: “Installing and Configuring Sun Cluster Software” in Sun Cluster 3.1 Software Installation Guide

The following table lists the tasks to perform when removing a node from an existing cluster. To complete the procedure correctly, the tasks must be performed in the order shown.


Caution –

Do not use this procedure if your cluster is running an OPS configuration. At this time, removing a node in an OPS configuration might cause nodes to panic at reboot.


Table 6–3 Task Map: Removing a Cluster Node (5/02)

Task: Move all resource groups and disk device groups off the node to be removed.
   - Use scswitch(1M):

     # scswitch -S -h from-node

Task: Remove the node from all resource groups.
   - Use scrgadm(1M).
For instructions: Sun Cluster 3.1 Data Service Planning and Administration Guide

Task: Remove the node from all disk device groups.
   - Use scconf(1M), metaset(1M), and scsetup(1M).
For instructions: How to Remove a Node From a Disk Device Group (Solstice DiskSuite/Solaris Volume Manager); How to Remove a Node From a Disk Device Group (VERITAS Volume Manager); How to Remove a Node From a Raw Disk Device Group

Task: Remove all quorum devices.
   - Use scsetup.
   Caution: Do not remove the quorum device if you are removing a node from a two-node cluster.
   Note that although you must remove the quorum device before you remove the storage device in the next step, you can add the quorum device back immediately afterward.
For instructions: How to Remove a Quorum Device

Task: Remove the storage device from the node.
   - Use devfsadm(1M) and scdidadm(1M).
For instructions: How to Remove Connectivity Between an Array and a Single Node, in a Cluster With Greater Than Two-Node Connectivity

Task: Add the new quorum device (to only the nodes that are intended to remain in the cluster).
   - Use scconf -a -q globaldev=d[n],node=node1,node=node2
For instructions: scconf(1M)

Task: Place the node being removed into maintenance state.
   - Use scswitch(1M), shutdown(1M), and scconf(1M).
For instructions: How to Put a Node Into Maintenance State

Task: Remove all logical transport connections to the node being removed.
   - Use scsetup.
For instructions: How to Remove Cluster Transport Cables, Transport Adapters, and Transport Junctions

Task: Remove the last quorum device.
For instructions: How to Remove the Last Quorum Device From a Cluster

Task: Remove the node from the cluster software configuration.
   - Use scconf(1M).
For instructions: How to Remove a Node From the Cluster Software Configuration
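In condensed form, the command-line portions of this task map reduce to the following sequence, shown here as a sketch only. It assumes a hypothetical departing node phys-schost-3, surviving nodes phys-schost-1 and phys-schost-2, and hypothetical quorum devices d4 (old) and d5 (new); the interactive scsetup steps and the per-volume-manager device group removals are omitted. Follow the linked procedures for the authoritative steps.


    # scswitch -S -h phys-schost-3                # evacuate all resource and disk device groups
    # scconf -r -q globaldev=d4                   # remove the quorum device (never on a two-node cluster)
    # scconf -a -q globaldev=d5,node=phys-schost-1,node=phys-schost-2   # add the replacement quorum device
    # scconf -c -q node=phys-schost-3,maintstate  # place the shut-down node into maintenance state
    # scconf -r -h node=phys-schost-3             # remove the node from the cluster configuration
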

How to Add a Node to the Authorized Node List

Before adding a machine to an existing cluster, be sure the node has all of the necessary hardware correctly installed and configured, including a good physical connection to the private cluster interconnect.

For hardware installation information, refer to the Sun Cluster 3.1 Hardware Administration Manual or the hardware documentation that shipped with your server.

This procedure permits a machine to install itself into a cluster by adding its node name to the list of authorized nodes for that cluster.

You must be superuser on a current cluster member to complete this procedure.

  1. Be sure you have correctly completed all prerequisite hardware installation and configuration tasks listed in the task map for Adding and Removing a Cluster Node.

  2. Execute the scsetup(1M) utility.


    # scsetup
    

    The Main Menu is displayed.

  3. To access the New Nodes Menu, type 6 at the Main Menu.

  4. To modify the authorized list, type 3 at the New Nodes Menu to select the option, Specify the name of a machine which may add itself.

    Follow the prompts to add the node's name to the cluster. You will be asked for the name of the node to be added.

  5. Verify that the task has been performed successfully.

    The scsetup utility prints a “Command completed successfully” message if it completes the task without error.
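
    You can also verify the change from the command line with scconf(1M). A minimal sketch, assuming that the scconf -p configuration listing reports the authentication settings under headings containing the phrase “new node” (the exact labels can vary by release):


    # scconf -p | grep -i "new node"
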

  6. To prevent any new machines from being added to the cluster, type 1 at the New Nodes Menu.

    Follow the scsetup prompts. This option tells the cluster to ignore all requests coming in over the public network from any new machine trying to add itself to the cluster.

  7. Quit the scsetup utility.

  8. Install and configure the software on the new cluster node.

    Use either scinstall or JumpStart™ to complete the installation and configuration of the new node, as described in the Sun Cluster 3.1 Software Installation Guide.

Example—Adding a Cluster Node to the Authorized Node List

The following example shows how to add a node named phys-schost-3 to the authorized node list in an existing cluster.


[Become superuser and execute the scsetup utility.]
# scsetup
Select New nodes>Specify the name of a machine which may add itself.
Answer the questions when prompted.
Verify that the scconf command completed successfully.
 
scconf -a -T node=phys-schost-3
 
    Command completed successfully.
Select Prevent any new machines from being added to the cluster.
Quit the scsetup New Nodes Menu and Main Menu.
[Install the cluster software.]

Where to Go From Here

For an overall list of tasks for adding a cluster node, see Table 6–2, “Task Map: Adding a Cluster Node to an Existing Cluster.”

To add a node to an existing resource group, see the Sun Cluster 3.1 Data Service Planning and Administration Guide.

How to Remove a Node From the Cluster Software Configuration

Perform this procedure to remove a node from the cluster.

  1. Be sure you have correctly completed all prerequisite tasks listed in the “Removing a Cluster Node” task map in Adding and Removing a Cluster Node.


    Note –

    Be sure you have removed the node from all resource groups, disk device groups, and quorum device configurations and placed it in maintenance state before you continue with this procedure.


  2. Become superuser on a node in the cluster other than the node to remove.

  3. Remove the node from the cluster.


    # scconf -r -h node=node-name
    

  4. Verify the node removal by using scstat(1M).


    # scstat -n
    

  5. Do you intend to uninstall Sun Cluster software from the removed node?

    • If yes, go to How to Uninstall Sun Cluster Software From a Cluster Node.

    • If no, the node removal is complete. For hardware procedures, see the Sun Cluster 3.1 Hardware Administration Manual.

Example—Removing a Node From the Cluster Software Configuration

This example shows how to remove a node (phys-schost-2) from a cluster. All commands are run from another node of the cluster (phys-schost-1).


[Remove the node from the cluster:]
phys-schost-1# scconf -r -h node=phys-schost-2
[Verify node removal:]
phys-schost-1# scstat -n
-- Cluster Nodes --
                    Node name           Status
                    ---------           ------
  Cluster node:     phys-schost-1       Online

Where to Go From Here

To uninstall Sun Cluster software from the removed node, see How to Uninstall Sun Cluster Software From a Cluster Node.

For hardware procedures, see the Sun Cluster 3.1 Hardware Administration Manual.

For an overall list of tasks for removing a cluster node, see Table 6–3, “Task Map: Removing a Cluster Node.”

To add a node to an existing cluster, see How to Add a Node to the Authorized Node List.

How to Remove Connectivity Between an Array and a Single Node, in a Cluster With Greater Than Two-Node Connectivity

Use this procedure to detach a storage array from a single cluster node, in a cluster that has three- or four-node connectivity.

  1. Back up all database tables, data services, and volumes that are associated with the storage array that you are removing.

  2. Determine the resource groups and device groups that are running on the node to be disconnected.


    # scstat
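
    To narrow the output to the components this step cares about, scstat accepts per-component options; a sketch, assuming the -g (resource groups) and -D (device groups) options of the Sun Cluster 3.x scstat(1M) command:

    # scstat -g     # resource group status, including where each group is online
    # scstat -D     # disk device group status and current primaries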
    
  3. If necessary, move all resource groups and device groups off the node to be disconnected.


    Caution –

    If your cluster is running OPS/RAC software, shut down the OPS/RAC database instance that is running on the node before you move the groups off the node. For instructions, see the Oracle Database Administration Guide.



    # scswitch -S -h from-node
    
  4. Put the device groups into maintenance state.

    For the procedure on acquiescing I/O activity to VERITAS shared disk groups, see your VERITAS Volume Manager documentation.

    For the procedure on putting a device group in maintenance state, see the “Administering the Cluster” in Sun Cluster 3.1 System Administration Guide.
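
    For a device group managed by Sun Cluster, maintenance mode can also be entered directly with scswitch(1M); a minimal sketch, assuming a hypothetical device group named dg-schost-1:

    # scswitch -m -D dg-schost-1     # take the device group offline into maintenance state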

  5. Remove the node from the device groups.

    • If you use VERITAS Volume Manager or raw disk, use the scconf(1M) command to remove the node from the device groups.

    • If you use Solstice DiskSuite, use the metaset(1M) command to remove the node from the device groups.
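
    A sketch of both forms, assuming a hypothetical device group (or diskset) named dg-schost-1 and a departing node phys-schost-3:

    # scconf -r -D name=dg-schost-1,nodelist=phys-schost-3   # VxVM or raw disk device group
    # metaset -s dg-schost-1 -d -h phys-schost-3             # Solstice DiskSuite diskset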

  6. If the cluster is running HAStorage or HAStoragePlus, remove the node from the resource group's node list.


    # scrgadm -c -g resource-group -h nodelist
    

    Here nodelist is the complete new list of node names, with the disconnected node omitted. See the Sun Cluster 3.1 Data Service Planning and Administration Guide for more information on changing a resource group's node list.


    Note –

    Resource type, resource group, and resource property names are case insensitive when executing scrgadm.


  7. If the storage array you are removing is the last storage array that is connected to the node, disconnect the fiber-optic cable between the node and the hub or switch that is connected to this storage array. Otherwise, skip this step.

  8. Do you want to remove the host adapter from the node you are disconnecting?

    • If yes, shut down and power off the node.

    • If no, skip to Step 11.

  9. Remove the host adapter from the node.

    For the procedure on removing host adapters, see the documentation that shipped with your node.

  10. Without allowing the node to boot, power on the node.

  11. Boot the node into non-cluster mode.


    ok boot -x 
    

    Caution –

    The node must be in non-cluster mode before you remove the OPS/RAC software in the next step, or the node will panic and potentially cause a loss of data availability.


  12. If OPS/RAC software has been installed, remove the OPS/RAC software package from the node that you are disconnecting.


    # pkgrm SUNWscucm 
    

    Caution –

    If you do not remove the OPS/RAC software from the node you disconnected, the node will panic when it is reintroduced to the cluster and potentially cause a loss of data availability.


  13. Boot the node into cluster mode.


    ok boot
    

  14. On the node, update the device namespace by updating the /devices and /dev entries.


    # devfsadm -C 
    # scdidadm -C
    
  15. Bring the device groups back online.

    For procedures on bringing a VERITAS shared disk group online, see your VERITAS Volume Manager documentation.

    For the procedure on bringing a device group online, see the procedure on putting a device group into maintenance state.
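
    For a device group managed by Sun Cluster, the group can be brought back online by switching it to a primary node with scswitch(1M); a minimal sketch, assuming the hypothetical device group dg-schost-1 and primary node phys-schost-1:

    # scswitch -z -D dg-schost-1 -h phys-schost-1    # bring the device group online on the given node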

How to Uninstall Sun Cluster Software From a Cluster Node

Perform this procedure to uninstall Sun Cluster software from a cluster node before you disconnect it from a fully established cluster configuration. You can use this procedure to uninstall software from the last remaining node of a cluster.


Note –

To uninstall Sun Cluster software from a node that has not yet joined the cluster or that is still in install mode, do not perform this procedure. Instead, go to “How to Uninstall Sun Cluster Software to Correct Installation Problems” in the Sun Cluster 3.1 Software Installation Guide.


  1. Be sure you have correctly completed all prerequisite tasks listed in the task map for removing a cluster node.

    See Adding and Removing a Cluster Node.


    Note –

    Be sure you have removed the node from all resource groups, device groups, and quorum device configurations, placed it in maintenance state, and removed it from the cluster before you continue with this procedure.


  2. Become superuser on an active cluster member other than the node you will uninstall.

  3. From the active cluster member, add the node you intend to uninstall to the cluster's node authentication list.


    # scconf -a -T node=nodename
    
    -a               Add

    -T               Specifies authentication options

    node=nodename    Specifies the name of the node to add to the authentication list

    Alternately, you can use the scsetup(1M) utility. See How to Add a Node to the Authorized Node List for procedures.

  4. Become superuser on the node to uninstall.

  5. Reboot the node into non-cluster mode.


    # shutdown -g0 -y -i0
    ok boot -x
    

  6. In the /etc/vfstab file, remove all globally mounted file system entries except the /global/.devices global mounts.
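
    Global mounts are identified by the global mount option in the last field of the entry. The following vfstab excerpt is a sketch using hypothetical device names; your entries will differ:

    # Remove global data file system entries such as:
    /dev/md/dg1/dsk/d100  /dev/md/dg1/rdsk/d100  /global/dg1              ufs  2  yes  global,logging
    # Keep the /global/.devices entry, for example:
    /dev/dsk/c0t0d0s3     /dev/rdsk/c0t0d0s3     /global/.devices/node@2  ufs  2  no   global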

  7. Uninstall Sun Cluster software from the node.

    Run the command from a directory that is not associated with any Sun Cluster packages.


    # cd /
    # scinstall -r
    

    See the scinstall(1M) man page for more information. If scinstall returns error messages, see Troubleshooting a Node Uninstallation.

  8. Disconnect the transport cables and the transport junction, if any, from the other cluster devices.

    1. If the uninstalled node is connected to a storage device that uses a parallel SCSI interface, install a SCSI terminator to the open SCSI connector of the storage device after you disconnect the transport cables.

      If the uninstalled node is connected to a storage device that uses Fibre Channel interfaces, no termination is necessary.

    2. Follow the documentation that shipped with your host adapter and server for disconnection procedures.

Correcting Error Messages

To correct the error messages in the following sections, perform this procedure.

  1. Attempt to rejoin the node to the cluster.


    ok boot
    

  2. Did the node successfully rejoin the cluster?

    • If no, proceed to Step 3.

    • If yes, perform the following steps to remove the node from the disk device groups.

    1. Remove the node from the remaining disk device group(s).

      Follow the procedures in How to Remove a Node From All Disk Device Groups.

    2. After you remove the node from all disk device groups, return to How to Uninstall Sun Cluster Software From a Cluster Node and repeat the procedure.

  3. If the node could not rejoin the cluster, rename the node's /etc/cluster/ccr file to any other name you choose, for example, ccr.old.


    # mv /etc/cluster/ccr /etc/cluster/ccr.old
    

  4. Return to How to Uninstall Sun Cluster Software From a Cluster Node and repeat the procedure.

Troubleshooting a Node Uninstallation

This section describes error messages you might receive when you run the scinstall -r command and the corrective actions to take.

Unremoved Cluster File System Entries

The following error messages indicate that the node you removed still has cluster file systems referenced in its vfstab file.


Verifying that no unexpected global mounts remain in /etc/vfstab ... failed
scinstall:  global-mount1 is still configured as a global mount.
scinstall:  /global/dg1 is still configured as a global mount.
 
scinstall:  It is not safe to uninstall with these outstanding errors.          
scinstall:  Refer to the documentation for complete uninstall instructions.
scinstall:  Uninstall failed.

To correct this error, return to How to Uninstall Sun Cluster Software From a Cluster Node and repeat the procedure. Ensure that you successfully complete Step 6 in the procedure before you rerun the scinstall -r command.

Unremoved Listing in Disk Device Groups

The following error messages indicate that the node you removed is still listed with a disk device group.


Verifying that no device services still reference this node ... failed
scinstall:  This node is still configured to host device service "service".
scinstall:  This node is still configured to host device service "service2".
scinstall:  This node is still configured to host device service "service3".
scinstall:  This node is still configured to host device service "dg1".
 
scinstall:  It is not safe to uninstall with these outstanding errors.          
scinstall:  Refer to the documentation for complete uninstall instructions.
scinstall:  Uninstall failed.

To correct this error, perform the procedure in Correcting Error Messages.