Sun Cluster 3.1 - 3.2 With Sun StorEdge 3310 or 3320 SCSI RAID Array Manual for Solaris OS

Chapter 2 Maintaining Sun StorEdge 3310 or 3320 SCSI RAID Array

This chapter describes the procedures for maintaining Sun StorEdge 3310 and 3320 RAID arrays in a Sun Cluster environment.

Read each procedure in this chapter in its entirety before you perform any of its steps. If you are not reading an online version of this document, ensure that you have the books listed in the Preface available.

This chapter contains the following procedures.

Maintaining RAID Storage Arrays

This section contains the procedures for maintaining a RAID storage array in a Sun Cluster environment. Table 2–1 lists the maintenance tasks that are cluster-specific. Tasks that are not cluster-specific are referenced in a list following the table.


Note –

When you upgrade firmware on a storage device or on an enclosure, redefine the stripe size of a LUN, or perform other LUN operations, a device ID might change unexpectedly. When you perform a check of the device ID configuration by running the cldevice check or scdidadm -c command, the following error message appears on your console if the device ID changed unexpectedly.


device id for nodename:/dev/rdsk/cXtYdZsN does not match physical 
device's id for ddecimalnumber, device may have been replaced.

To fix device IDs that report this error, run the cldevice repair or scdidadm -R command for each affected device.
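
For example, on Sun Cluster 3.2 you might run the check and then repair the device ID that the error message reports. The device path c1t3d0 is a hypothetical example; substitute the path from the message on your console.

  # cldevice check
  # cldevice repair /dev/rdsk/c1t3d0

On Sun Cluster 3.1, the equivalent commands are the following.

  # scdidadm -c
  # scdidadm -R c1t3d0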


Table 2–1 Tasks: Maintaining a RAID Storage Array

Task: Remove a RAID storage array.
Information: How to Remove a RAID Storage Array

Task: Replace a RAID storage array.
Information: To replace a RAID storage array, remove the old RAID storage array, then add a new RAID storage array to the configuration. See How to Remove a RAID Storage Array and How to Add a RAID Storage Array to an Existing Cluster.

Task: Add a JBOD as an expansion unit.
Information: Follow the same procedure that you use in a noncluster environment. If you plan to use the disk drives in the expansion unit to create a new LUN, see How to Create and Map a LUN. See also the Sun StorEdge 3000 Family Installation, Operation, and Service Manual.

Task: Remove a JBOD expansion unit.
Information: Follow the same procedure that you use in a noncluster environment. If you plan to remove a LUN in conjunction with the disk drive, see How to Unmap and Delete a LUN. See also the Sun StorEdge 3000 Family Installation, Operation, and Service Manual.

Task: Add a disk drive.
Information: Follow the same procedure that you use in a noncluster environment. If you plan to create a new LUN with the disk drive, see How to Create and Map a LUN. See also the Sun StorEdge Family FRU Installation Guide for the Sun StorEdge 3310 SCSI Array and the Sun StorEdge 3510 FC Array.

Task: Remove a disk drive.
Information: Follow the same procedure that you use in a noncluster environment. If you plan to remove a LUN in conjunction with the disk drive, see How to Unmap and Delete a LUN. If your configuration runs RAID level 0, take appropriate action to prepare the volume manager for the affected disk to become inaccessible. See also the Sun StorEdge Family FRU Installation Guide for the Sun StorEdge 3310 SCSI Array and the Sun StorEdge 3510 FC Array.

Task: Replace a failed controller or restore an offline controller.
Information: How to Replace a Controller

Task: Replace a host-to-storage array SCSI cable.
Information: Switch the device group over to the other node before you perform this procedure (see the sketch that follows this table), then follow the same procedure that you use in a noncluster environment. See also the Sun StorEdge 3000 Family Installation, Operation, and Service Manual.

Task: Replace the I/O module.
Information: How to Replace an I/O Module

Task: Replace the terminator module.
Information: How to Replace a Terminator Module
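
Before you replace a host-to-storage array SCSI cable, switch the device group to the other node, as in the following sketch. The device group name oradg and the node name phys-schost-2 are hypothetical examples.

If you are using Sun Cluster 3.2:

  # cldevicegroup switch -n phys-schost-2 oradg

If you are using Sun Cluster 3.1:

  # scswitch -z -D oradg -h phys-schost-2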

Sun StorEdge 3310 and 3320 SCSI RAID FRUs

The following is a list of administrative tasks that require no cluster-specific procedures. See the Sun StorEdge 3000 Family FRU Installation Guide for instructions on replacing the following FRUs.

How to Remove a RAID Storage Array

Use this procedure to remove a RAID storage array from a running cluster.


Caution –

This procedure removes all data that is on the RAID storage array that you remove.


This procedure provides the long forms of the Sun Cluster commands. Most commands also have short forms. Except for the forms of the command names, the commands are identical. For a list of the commands and their short forms, see Appendix A, Sun Cluster Object-Oriented Commands, in Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS.

Before You Begin

This procedure assumes that your nodes are not configured with dynamic reconfiguration functionality. If your nodes are configured for dynamic reconfiguration, see your Sun Cluster Hardware Administration Manual for Solaris OS.

To perform this procedure, become superuser or assume a role that provides solaris.cluster.read and solaris.cluster.modify RBAC (role-based access control) authorization.

  1. If one of the disks in the RAID storage array is configured as a quorum device, relocate that quorum device to another suitable RAID storage array.

    To determine whether any of the disks is configured as a quorum device, use one of the following commands.

    • If you are using Sun Cluster 3.2, use the following command:


      # clquorum show +  
      
    • If you are using Sun Cluster 3.1, use the following command:


      # scstat -q
      

    For procedures about how to add and remove quorum devices, see your Sun Cluster system administration documentation.
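
    For example, assuming the current quorum device is DID device d10 on the storage array that you are removing and that DID device d20 resides on another shared storage array (both names are hypothetical), you might add the new quorum device and then remove the old one.

    • If you are using Sun Cluster 3.2:

      # clquorum add d20
      # clquorum remove d10

    • If you are using Sun Cluster 3.1:

      # scconf -a -q globaldev=d20
      # scconf -r -q globaldev=d10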

  2. If necessary, back up the metadevice or volume.

    For more information, see your Solstice DiskSuite/Solaris Volume Manager or Veritas Volume Manager documentation.
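
    As an illustration only, a UFS file system that resides on a Solaris Volume Manager mirror could be backed up with the ufsdump command. The metadevice d10 and the tape device /dev/rmt/0 are hypothetical examples.

      # ufsdump 0uf /dev/rmt/0 /dev/md/rdsk/d10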

  3. Perform volume management administration to remove the RAID storage array from the configuration.

    If a volume manager manages the LUN, run the appropriate Solstice DiskSuite/Solaris Volume Manager or Veritas Volume Manager commands to remove the LUN from any diskset or disk group. For more information, see your Solstice DiskSuite/Solaris Volume Manager or Veritas Volume Manager documentation. See the following note for the additional Veritas Volume Manager commands that are required; a Solaris Volume Manager sketch follows the Veritas Volume Manager commands below.


    Note –

    LUNs that were managed by Veritas Volume Manager must be completely removed from Veritas Volume Manager control before you can delete the LUNs from the Sun Cluster environment. After you delete the LUN from any disk group, use the following commands on both nodes to remove the LUN from Veritas Volume Manager control.



    # vxdisk offline cNtXdY
    # vxdisk rm cNtXdY
    
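    If Solaris Volume Manager manages the LUN, the following is a minimal sketch of the equivalent cleanup. The diskset name oraset, the metadevice d10, and the DID device d8 are hypothetical; clear any metadevices that use the LUN before you remove the disk from the diskset.

      # metaclear -s oraset d10
      # metaset -s oraset -d /dev/did/rdsk/d8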
  4. Identify the LUNs that you need to remove.


    # cfgadm -al
    
  5. On all nodes, remove references to the LUNs in the RAID storage array that you removed.


    # cfgadm -c unconfigure cN::dsk/cNtXdY
    
  6. Disconnect the SCSI cables from the RAID storage array.

  7. On both nodes, remove the paths to the LUN that you are deleting.


    # devfsadm -C
    
  8. On both nodes, remove all obsolete device IDs (DIDs).

    • If you are using Sun Cluster 3.2, use the following command:


      # cldevice clear 
      
    • If you are using Sun Cluster 3.1, use the following command:


      # scdidadm -C
      
  9. If no other LUN is assigned to the target and LUN ID, remove the LUN entries from the /kernel/drv/sd.conf file (an example of the entry format follows the note below).

    Perform this step on both nodes to prevent extended boot time caused by unassigned LUN entries.


    Note –

    Do not remove default t0d0 entries.

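    LUN entries in the /kernel/drv/sd.conf file have the following form. The target and LUN numbers shown are examples only; remove only the entries for the targets and LUN IDs that you deleted.

      name="sd" class="scsi" target=2 lun=1;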

  10. Power off the RAID storage array. Disconnect the RAID storage array from the AC power source.

    For the procedure about how to power off a storage array, see the Sun StorEdge 3000 Family Installation, Operation, and Service Manual, 3310 SCSI Array.

  11. Remove the RAID storage array.

    For the procedure about how to remove a storage array, see the Sun StorEdge 3000 Family Installation, Operation, and Service Manual, 3310 SCSI Array.

  12. If necessary, remove any unused host adapters from the nodes.

    For the procedure about how to remove a host adapter, see your Sun Cluster system administration documentation and the documentation that shipped with your host adapter and node.

  13. From any node, verify that the configuration is correct.

    • If you are using Sun Cluster 3.2, use the following command:


      # cldevice list -v
      
    • If you are using Sun Cluster 3.1, use the following command:


       # scdidadm -l
      

How to Replace a Controller

If the RAID storage array is configured with dual controllers, see the Sun StorEdge 3000 Family FRU Installation Guide for controller replacement procedures. If the RAID storage array is configured with a single controller, perform the following procedure to ensure high availability.

  1. Detach the submirrors on the RAID storage array that are connected to the controller that you are replacing. Detaching the submirrors stops all I/O activity to the RAID storage array.

    For more information, see your Solstice DiskSuite/Solaris Volume Manager or Veritas Volume Manager documentation.
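
    For example, with Solaris Volume Manager you might detach the submirror, and with Veritas Volume Manager you might detach the corresponding plex. The mirror d10, submirror d11, disk group oradg, and plex vol01-02 are hypothetical names.

      # metadetach d10 d11
      # vxplex -g oradg det vol01-02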

  2. Replace the controller.

    For the procedure about how to replace a controller, see the Sun StorEdge 3000 Family Installation, Operation, and Service Manual, 3310 SCSI Array.

  3. Reattach the submirrors to resynchronize the submirrors.

    For more information, see your Solstice DiskSuite/Solaris Volume Manager or Veritas Volume Manager documentation.
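
    Continuing the hypothetical names from Step 1, you might reattach the submirror or plex so that it resynchronizes.

      # metattach d10 d11
      # vxplex -g oradg att vol01 vol01-02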

How to Replace an I/O Module

Use this procedure to replace a RAID storage array I/O module.


Note –

If your configuration runs RAID level 0, take appropriate action to prepare the volume manager for the affected disk to become inaccessible.


  1. Detach the submirrors on the RAID storage array that are connected to the I/O module that you are replacing. Detach the submirrors to stop all I/O activity to the RAID storage array.

    For more information, see your Solstice DiskSuite/Solaris Volume Manager or Veritas Volume Manager documentation.

  2. Disconnect the SCSI cables from the hosts.

  3. Disconnect the SCSI cables from the I/O module.

  4. Replace the I/O module.

    For the procedure about how to replace the I/O module, see the Sun StorEdge 3000 Family FRU Installation Guide.

  5. Reconnect the SCSI cables to the I/O module.

  6. Reconnect the SCSI cables to the host.

  7. Reattach the submirrors to resynchronize submirrors.

    For more information, see your Solstice DiskSuite/Solaris Volume Manager or Veritas Volume Manager documentation.

How to Replace a Terminator Module

Use this procedure to replace a RAID storage array terminator module.


Note –

If your configuration runs RAID level 0, take appropriate action to prepare the volume manager for the affected disk to become inaccessible.


  1. Detach the submirrors on the RAID storage array that are connected to the terminator module that you are replacing. Detach the submirrors to stop all I/O activity to the RAID storage array.

    For more information, see your Solstice DiskSuite/Solaris Volume Manager or Veritas Volume Manager documentation.

  2. Replace the terminator module.

    For the procedure about how to replace the terminator module, see the Sun StorEdge 3000 Family FRU Installation Guide.

  3. Reattach the submirrors to resynchronize submirrors.

    For more information, see your Solstice DiskSuite/Solaris Volume Manager or Veritas Volume Manager documentation.

How to Replace a Host Adapter

Use this procedure to replace a failed host adapter in a running cluster. This procedure defines Node A as the node with the failed host adapter that you are replacing.

This procedure provides the long forms of the Sun Cluster commands. Most commands also have short forms. Except for the forms of the command names, the commands are identical. For a list of the commands and their short forms, see Appendix A, Sun Cluster Object-Oriented Commands, in Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS.

Before You Begin

This procedure relies on the following prerequisites and assumptions.

To perform this procedure, become superuser or assume a role that provides solaris.cluster.read and solaris.cluster.modify RBAC (role-based access control) authorization.

  1. Determine the resource groups and device groups that are running on Node A.

    Record this information because you use this information in Step 12 and Step 13 of this procedure to return resource groups and device groups to Node A.

    • If you are using Sun Cluster 3.2, use the following commands:


      # clresourcegroup status -n nodename
      # cldevicegroup status -n nodename
      
    • If you are using Sun Cluster 3.1, use the following command:


      # scstat
      
  2. Record the details of any metadevices that are affected by the failed host adapter.

    Record this information because you use it in Step 11 of this procedure to repair any affected metadevices.
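
    With Solaris Volume Manager, one way to capture this information is to save the configuration of the affected diskset to a file. The diskset name oraset and the output file are hypothetical examples.

      # metastat -s oraset -p > /var/tmp/oraset.md.config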

  3. Move all resource groups and device groups off Node A.

    • If you are using Sun Cluster 3.2, use the following command:


      # clnode evacuate NodeA
      
    • If you are using Sun Cluster 3.1, use the following command:


      # scswitch -S -h NodeA
      
  4. Shut down Node A.

    For the full procedure about how to shut down and power off a node, see your Sun Cluster system administration documentation.

  5. Power off Node A.

  6. Replace the failed host adapter.

    For the procedure about how to remove and add host adapters, see the documentation that shipped with your nodes.

  7. If you need to upgrade the node's host adapter firmware, boot Node A into noncluster mode by adding -x to your boot instruction. Proceed to Step 9.

    For more information about how to boot nodes, see your Sun Cluster system administration documentation.
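
    For example, on a SPARC based system you can boot into noncluster mode from the OpenBoot PROM prompt.

      ok boot -x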

  8. If you do not need to upgrade the node's host adapter firmware, proceed to Step 10.

  9. Upgrade the host adapter firmware on Node A.

    If you use the Solaris 8, Solaris 9, or Solaris 10 Operating System, Sun Connection Update Manager keeps you informed of the latest versions of patches and features. Using notifications and intelligent needs-based updating, Sun Connection helps improve operational efficiency and ensures that you have the latest software patches for your Sun software.

    You can download the Sun Connection Update Manager product for free by going to http://www.sun.com/download/products.xml?id=4457d96d.

    Additional information for using the Sun patch management tools is provided in Solaris Administration Guide: Basic Administration at http://docs.sun.com. Refer to the version of this manual for the Solaris OS release that you have installed.

    If you must apply a patch when a node is in noncluster mode, you can apply it in a rolling fashion, one node at a time, unless instructions for a patch require that you shut down the entire cluster. Follow the procedures in How to Apply a Rebooting Patch (Node) in Sun Cluster System Administration Guide for Solaris OS to prepare the node and to boot it in noncluster mode. For ease of installation, consider applying all patches at the same time. That is, apply all patches to the node that you place in noncluster mode.

    For a list of patches that affect Sun Cluster, see the Sun Cluster Wiki Patch Klatch.

    For required firmware, see the Sun System Handbook.
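
    As an illustration, you typically apply a downloaded patch with the patchadd command while the node is in noncluster mode. The patch ID 123456-78 and its download location are placeholders.

      # patchadd /var/tmp/123456-78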

  10. Boot Node A into cluster mode.

    For more information about how to boot nodes, see Chapter 3, Shutting Down and Booting a Cluster, in Sun Cluster System Administration Guide for Solaris OS.

  11. Perform any volume management procedures that are necessary to fix any metadevices affected by this procedure, as you identified in Step 2.

    For more information, see your volume manager software documentation.

  12. (Optional) Restore the device groups to Node A.

    Perform the following step for each device group you want to return to the original node.

    • If you are using Sun Cluster 3.2, use the following command:


      # cldevicegroup switch -n NodeA devicegroup1[ devicegroup2 …]
      
      -n NodeA

      The node to which you are restoring device groups.

      devicegroup1[ devicegroup2 …]

      The device group or groups that you are restoring to the node.

    • If you are using Sun Cluster 3.1, use the following command:


       # scswitch -z -D devicegroup -h NodeA
      
  13. (Optional) Restore the resource groups to Node A.

    Perform the following step for each resource group you want to return to the original node.

    • If you are using Sun Cluster 3.2, use the following command:


      # clresourcegroup switch -n NodeA  resourcegroup1[ resourcegroup2 …]
      
      NodeA

      For failover resource groups, the node to which the groups are returned. For scalable resource groups, the node list to which the groups are returned.

      resourcegroup1[ resourcegroup2 …]

      The resource group or groups that you are returning to the node or nodes.

    • If you are using Sun Cluster 3.1, use the following command:


      # scswitch -z -g resourcegroup -h NodeA