Sun Cluster 3.1 - 3.2 With Sun StorEdge 3310 or 3320 SCSI RAID Array Manual for Solaris OS

How to Replace a Host Adapter

Use this procedure to replace a failed host adapter in a running cluster. This procedure defines Node A as the node with the failed host adapter that you are replacing.

This procedure provides the long forms of the Sun Cluster commands. Most commands also have short forms. Except for the forms of the command names, the commands are identical. For a list of the commands and their short forms, see Appendix A, Sun Cluster Object-Oriented Commands, in Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS.

Before You Begin

This procedure relies on the following prerequisites and assumptions.

To perform this procedure, become superuser or assume a role that provides solaris.cluster.read and solaris.cluster.modify RBAC (role-based access control) authorization.

  1. Determine the resource groups and device groups that are running on Node A.

    Record this information. You use it in Step 12 and Step 13 of this procedure to return device groups and resource groups to Node A.

    • If you are using Sun Cluster 3.2, use the following commands:


      # clresourcegroup status -n nodename
      # cldevicegroup status -n nodename
      
    • If you are using Sun Cluster 3.1, use the following command:


      # scstat
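
The release-dependent choice in Step 1 can be sketched as a small shell helper. This is an illustration only: `status_commands` is a hypothetical function name, not a Sun Cluster command, and it prints the commands from this step rather than running them.

```shell
#!/bin/sh
# Sketch: print the Step 1 status commands for a given Sun Cluster release.
# status_commands is a hypothetical helper name, not part of Sun Cluster.
# Usage: status_commands RELEASE NODE   (RELEASE is "3.2" or "3.1")
status_commands() {
    release=$1
    node=$2
    case "$release" in
        3.2)
            echo "clresourcegroup status -n $node"
            echo "cldevicegroup status -n $node"
            ;;
        3.1)
            # The procedure uses plain scstat for Sun Cluster 3.1.
            echo "scstat"
            ;;
        *)
            echo "unsupported release: $release" >&2
            return 1
            ;;
    esac
}
```

For example, `status_commands 3.2 NodeA` prints the two status commands for Node A. Run the printed commands as superuser and keep the output somewhere you can read it after Node A is powered off, because Step 12 and Step 13 depend on it.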
      
  2. Record the details of any metadevices that are affected by the failed host adapter.

    Record this information because you use it in Step 11 of this procedure to repair any affected metadevices.

  3. Move all resource groups and device groups off Node A.

    • If you are using Sun Cluster 3.2, use the following command:


      # clnode evacuate NodeA
      
    • If you are using Sun Cluster 3.1, use the following command:


      # scswitch -S -h NodeA
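
Because evacuating a node moves every resource group and device group, it can help to preview the command before running it. The following is a minimal sketch under stated assumptions: `evacuate_node` and the `DRY_RUN` variable are hypothetical names introduced here, and the wrapper only prints the Step 3 command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: preview or run the Step 3 evacuation command.
# evacuate_node and DRY_RUN are hypothetical names, not part of Sun Cluster.
# Usage: evacuate_node RELEASE NODE   (RELEASE is "3.2" or "3.1")
evacuate_node() {
    release=$1
    node=$2
    case "$release" in
        3.2) cmd="clnode evacuate $node" ;;
        3.1) cmd="scswitch -S -h $node" ;;
        *)
            echo "unsupported release: $release" >&2
            return 1
            ;;
    esac
    if [ "${DRY_RUN:-1}" = "0" ]; then
        # Run the command; this needs superuser or an equivalent RBAC role.
        $cmd
    else
        # Default: only show what would be run.
        echo "$cmd"
    fi
}
```

Calling `evacuate_node 3.2 NodeA` prints `clnode evacuate NodeA`. Defaulting to print-only is a deliberate choice in this sketch, so that a typo in the node name is visible before anything is moved.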
      
  4. Shut down Node A.

    For the full procedure for shutting down and powering off a node, see your Sun Cluster system administration documentation.

  5. Power off Node A.

  6. Replace the failed host adapter.

    For the procedure for removing and adding host adapters, see the documentation that shipped with your nodes.

  7. If you need to upgrade the node's host adapter firmware, boot Node A into noncluster mode by adding -x to your boot instruction. Proceed to Step 9.

    For more information about how to boot nodes, see your Sun Cluster system administration documentation.

  8. If you do not need to upgrade the node's host adapter firmware, proceed to Step 10.

  9. Upgrade the host adapter firmware on Node A.

    If you use the Solaris 8, Solaris 9, or Solaris 10 Operating System, Sun Connection Update Manager keeps you informed of the latest versions of patches and features. Using notifications and intelligent needs-based updating, Sun Connection helps improve operational efficiency and ensures that you have the latest software patches for your Sun software.

    You can download the Sun Connection Update Manager product for free by going to http://www.sun.com/download/products.xml?id=4457d96d.

    Additional information for using the Sun patch management tools is provided in Solaris Administration Guide: Basic Administration at http://docs.sun.com. Refer to the version of this manual for the Solaris OS release that you have installed.

    If you must apply a patch when a node is in noncluster mode, you can apply it in a rolling fashion, one node at a time, unless instructions for a patch require that you shut down the entire cluster. Follow the procedures in How to Apply a Rebooting Patch (Node) in Sun Cluster System Administration Guide for Solaris OS to prepare the node and to boot it in noncluster mode. For ease of installation, consider applying all patches at the same time. That is, apply all patches to the node that you place in noncluster mode.

    For a list of patches that affect Sun Cluster, see the Sun Cluster Wiki Patch Klatch.

    For required firmware, see the Sun System Handbook.

  10. Boot Node A into cluster mode.

    For more information about how to boot nodes, see Chapter 3, Shutting Down and Booting a Cluster, in Sun Cluster System Administration Guide for Solaris OS.

  11. Perform any volume management procedures that are necessary to fix any metadevices affected by this procedure, as you identified in Step 2.

    For more information, see your volume manager software documentation.

  12. (Optional) Restore the device groups to Node A.

    Perform the following step for each device group you want to return to the original node.

    • If you are using Sun Cluster 3.2, use the following command:


      # cldevicegroup switch -n NodeA devicegroup1[ devicegroup2 …]
      
      -n NodeA

      The node to which you are restoring device groups.

      devicegroup1[ devicegroup2 …]

      The device group or groups that you are restoring to the node.

    • If you are using Sun Cluster 3.1, use the following command:


      # scswitch -z -D devicegroup -h NodeA
      
  13. (Optional) Restore the resource groups to Node A.

    Perform the following step for each resource group you want to return to the original node.

    • If you are using Sun Cluster 3.2, use the following command:


      # clresourcegroup switch -n NodeA resourcegroup1[ resourcegroup2 …]
      
      -n NodeA

      For failover resource groups, the node to which the groups are returned. For scalable resource groups, the node list to which the groups are returned.

      resourcegroup1[ resourcegroup2 …]

      The resource group or groups that you are returning to the node or nodes.

    • If you are using Sun Cluster 3.1, use the following command:


      # scswitch -z -g resourcegroup -h NodeA
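
Steps 12 and 13 repeat the same switch for each group recorded in Step 1. The sketch below generates those commands from a list of group names. It rests on several assumptions: `restore_commands` is a hypothetical helper, the group names in the example (`dg1`, `dg2`, `rg1`) are placeholders, and the function prints the commands instead of running them. For Sun Cluster 3.2 it emits one command that lists every group. For Sun Cluster 3.1 it emits one command per group, matching the forms shown in this procedure.

```shell
#!/bin/sh
# Sketch: print the Step 12 and Step 13 switch commands for recorded groups.
# restore_commands is a hypothetical helper name, not part of Sun Cluster.
# Usage: restore_commands RELEASE KIND NODE GROUP...
#   RELEASE is "3.2" or "3.1"; KIND is "device" or "resource".
restore_commands() {
    if [ $# -lt 4 ]; then
        echo "usage: restore_commands RELEASE KIND NODE GROUP..." >&2
        return 1
    fi
    release=$1
    kind=$2
    node=$3
    shift 3
    case "$release:$kind" in
        3.2:device)
            # One command can name several device groups.
            echo "cldevicegroup switch -n $node $*"
            ;;
        3.2:resource)
            echo "clresourcegroup switch -n $node $*"
            ;;
        3.1:device)
            for group in "$@"; do
                echo "scswitch -z -D $group -h $node"
            done
            ;;
        3.1:resource)
            for group in "$@"; do
                echo "scswitch -z -g $group -h $node"
            done
            ;;
        *)
            echo "unsupported release or kind: $release $kind" >&2
            return 1
            ;;
    esac
}
```

For example, `restore_commands 3.2 device NodeA dg1 dg2` prints `cldevicegroup switch -n NodeA dg1 dg2`. Generate the device group commands first and the resource group commands second, so that the order matches Step 12 followed by Step 13.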