Sun Cluster 3.1 - 3.2 With StorEdge A1000 Array, Netra st A1000 Array, or StorEdge A3500 System Manual

Procedure: How to Replace a Failed Controller or Restore an Offline Controller

This procedure assumes that your cluster is operational. For conceptual information about SCSI reservations and failure fencing, see your Sun Cluster concepts documentation. For a list of Sun Cluster documentation, see Related Documentation.

Before You Begin

This procedure provides the long forms of the Sun Cluster commands. Most commands also have short forms. Except for the forms of the command names, the commands are identical. For a list of the commands and their short forms, see Appendix A, Sun Cluster Object-Oriented Commands, in Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS.

To perform this procedure, become superuser or assume a role that provides solaris.cluster.read and solaris.cluster.modify RBAC authorization.

  1. (StorEdge A1000 Only) If none of the LUNs in the storage array is a quorum device, proceed to Step 3.


    Note –

    Your storage array or storage system might not support LUNs as quorum devices. To determine if this restriction applies to your storage array or storage system, see Restrictions and Requirements.


    To determine whether any LUNs in the storage array are quorum devices, use one of the following commands.

    • If you are using Sun Cluster 3.2, use the following command:


      # clquorum show 
      
    • If you are using Sun Cluster 3.1, use the following command:


      # scstat -q
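
The choice between the two commands in Step 1 could be scripted along the following lines. This is a minimal sketch, not part of the documented procedure. The function name quorum_status_cmd and the idea of passing the release as an argument are assumptions made for illustration. Only the two commands themselves come from the step above.

```shell
# Hypothetical helper: print the quorum status command for a given
# Sun Cluster release ("3.1" or "3.2"). It only prints the command;
# run the printed command as superuser on a cluster node.
quorum_status_cmd() {
    case "$1" in
        3.2*) echo "clquorum show" ;;
        3.1*) echo "scstat -q" ;;
        *)    echo "unsupported release: $1" >&2; return 1 ;;
    esac
}
```

For example, quorum_status_cmd 3.2 prints the clquorum form, and any other release string is rejected with a non-zero exit status.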
      
  2. (StorEdge A1000 Only) If any of the LUNs in the storage array is a quorum device, relocate that quorum device to another suitable storage array.

    For procedures about how to add and remove quorum devices, see your Sun Cluster system administration documentation. For a list of Sun Cluster documentation, see Related Documentation.

  3. (StorEdge A3500 Only) On both nodes, to prevent LUNs from being automatically assigned to the controller that is being brought online, set the System_LunReDistribution parameter in the /etc/raid/rmparams file to false.


    Caution –

    You must set the System_LunReDistribution parameter in the /etc/raid/rmparams file to false so that no LUNs are assigned to the controller being brought online. After you verify in Step 8 that the controller has the correct SCSI reservation state, you can balance LUNs between both controllers.


    For the procedure about how to modify the rmparams file, see the Sun StorEdge RAID Manager Installation and Support Guide.
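
Before you restart the daemon, it can help to confirm what the file currently contains on each node. The sketch below only reads the file and prints any line that mentions the parameter. It makes no assumption about the exact syntax of the entry, which is covered in the guide referenced above. The function name show_lun_redistribution and its optional file argument are illustrative and are not part of RAID Manager.

```shell
# Hypothetical helper: print the System_LunReDistribution entry from an
# rmparams file so you can confirm its value on each node.
# The default path comes from this procedure; the argument allows a
# different file to be checked (for example, a backup copy).
show_lun_redistribution() {
    file="${1:-/etc/raid/rmparams}"
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi
    # grep exits non-zero if the parameter is not present in the file.
    grep 'System_LunReDistribution' "$file"
}
```

Because the helper never writes to the file, it is safe to run at any point in the procedure, including after Step 10 to confirm that the value was reset.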

  4. Restart the RAID Manager daemon.


    # /etc/init.d/amdemon stop
    # /etc/init.d/amdemon start
    
  5. If your controller module is offline, but does not have a failed controller, proceed to Step 7.

  6. If you have a failed controller, replace the failed controller with a new controller. Do not bring the controller online.

    For the procedure about how to replace controllers, see the Sun StorEdge A3500/A3500FC Controller Module Guide. For additional considerations, see the Sun StorEdge RAID Manager Installation and Support Guide.

  7. On one node, use the RAID Manager GUI's Recovery application to bring the controller back online.


    Note –

    You must use the RAID Manager GUI's Recovery application to bring the controller online. Do not use the Redundant Disk Array Controller Utility (rdacutil) because this utility ignores the value of the System_LunReDistribution parameter in the /etc/raid/rmparams file.


    For information about the Recovery application, see the Sun StorEdge RAID Manager User’s Guide. If you have problems with bringing the controller online, see the Sun StorEdge RAID Manager Installation and Support Guide.

  8. On one node that is connected to the storage array or storage system, verify that the controller has the correct SCSI reservation state.

    Use one of the following commands on LUN 0 of the controller you want to bring online.

    In the following commands, devicename is the full UNIX path name of the device, for example, /dev/dsk/c1tXdY.

    • If you are using Sun Cluster 3.2, use the following command:


      # cldevice repair devicename
      
    • If you are using Sun Cluster 3.1, use the following command:


      # scdidadm -R devicename
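
The command selection in Step 8 could be sketched as follows. The function name repair_cmd and its two arguments are assumptions made for illustration. The commands it prints are the ones shown above. The helper does not run anything itself, so you can review the printed command before you run it as superuser.

```shell
# Hypothetical helper: print the repair command from Step 8 for a given
# Sun Cluster release and device path. It does not run the command.
repair_cmd() {
    release="$1"
    devicename="$2"
    # The procedure requires the full UNIX path name of the device,
    # so reject anything that does not start with "/".
    case "$devicename" in
        /*) ;;
        *)  echo "devicename must be a full path, for example /dev/dsk/c1tXdY" >&2
            return 1 ;;
    esac
    case "$release" in
        3.2*) echo "cldevice repair $devicename" ;;
        3.1*) echo "scdidadm -R $devicename" ;;
        *)    echo "unsupported release: $release" >&2; return 1 ;;
    esac
}
```

Pass the path of LUN 0 on the controller that you want to bring online as the second argument.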
      
  9. (StorEdge A3500 Only) Set the controller to active/active mode. Assign LUNs to the controller.

    For more information about controller modes, see the Sun StorEdge RAID Manager Installation and Support Guide and the Sun StorEdge RAID Manager User’s Guide.

  10. (StorEdge A3500 Only) Reset the System_LunReDistribution parameter in the /etc/raid/rmparams file to true.

    For the procedure about how to change the rmparams file, see the Sun StorEdge RAID Manager Installation and Support Guide.

  11. (StorEdge A3500 Only) Restart the RAID Manager daemon.


    # /etc/init.d/amdemon stop
    # /etc/init.d/amdemon start
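
The stop and start pair that appears in Step 4 and Step 11 could be wrapped in a small function so that both restarts are run the same way. This is a sketch under stated assumptions: the function name restart_raid_daemon and the optional script argument are hypothetical, and only the /etc/init.d/amdemon path with its stop and start arguments comes from this procedure.

```shell
# Hypothetical wrapper around the stop/start pair used in Steps 4 and 11.
# The optional argument overrides the init script path, which allows a
# dry run against a stand-in script.
restart_raid_daemon() {
    script="${1:-/etc/init.d/amdemon}"
    if [ ! -x "$script" ]; then
        echo "$script is not executable" >&2
        return 1
    fi
    # Run stop, then start, mirroring the two commands in the procedure.
    # The function returns the exit status of the start command.
    "$script" stop
    "$script" start
}
```

Run the wrapper as superuser on each node where the procedure tells you to restart the daemon.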