Sun Cluster 3.0-3.1 With StorEdge A1000 Array, Netra st A1000 Array, or StorEdge A3500 System Manual

Procedure: How to Replace a Failed Controller or Restore an Offline Controller

This procedure assumes that your cluster is operational. For conceptual information about SCSI reservations and failure fencing, see your Sun Cluster concepts documentation. For a list of Sun Cluster documentation, see Related Documentation.

Steps
  1. (StorEdge A1000 Only) Is one of the LUNs in the storage array a quorum device?


    Note –

    Your storage array or storage system might not support LUNs as quorum devices. To determine if this restriction applies to your storage array or storage system, see Restrictions and Requirements.



    # scstat -q
    
    • If no, proceed to Step 2.

    • If yes, relocate that quorum device to another suitable storage array.

      For procedures about how to add and remove quorum devices, see your Sun Cluster system administration documentation. For a list of Sun Cluster documentation, see Related Documentation.
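
The quorum check in Step 1 can be sketched as a small filter over the `scstat -q` output. This is a minimal sketch, not part of the manual: the function name `list_quorum_devices` is hypothetical, and it assumes that quorum devices appear on lines of the form `Device votes:  <device path>  <present>  <possible>  <status>`. Compare against the actual output on your cluster before relying on it.

```shell
# Sketch: print the device path of each quorum device found in
# `scstat -q`-style output read from standard input.
# Assumption (not from the manual): device lines look like
#   "  Device votes:     /dev/did/rdsk/d4s2  1  1  Online"
list_quorum_devices() {
    # Field 1 is "Device", field 2 is "votes:", field 3 is the device path.
    awk '/Device votes:/ { print $3 }'
}

# Usage on a cluster node (not executed here):
#   scstat -q | list_quorum_devices
```

If the function prints a device that maps to a LUN in the storage array you are servicing, relocate that quorum device before you continue.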

  2. (StorEdge A3500 Only) On both nodes, set the System_LunReDistribution parameter in the /etc/raid/rmparams file to false. This setting prevents LUNs from being automatically assigned to the controller that is being brought online.


    Caution –

    You must set the System_LunReDistribution parameter in the /etc/raid/rmparams file to false so that no LUNs are assigned to the controller being brought online. After you verify in Step 6 that the controller has the correct SCSI reservation state, you can balance LUNs between both controllers.


    For the procedure about how to modify the rmparams file, see the Sun StorEdge RAID Manager Installation and Support Guide.
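
The parameter change in Step 2 can be sketched as follows. This is only an illustration under stated assumptions: the helper name `set_rmparam` is hypothetical, and the sketch assumes that rmparams stores each setting on its own line as `Name=value`. The RAID Manager guide remains the authoritative procedure, so check the file format on your nodes first.

```shell
# Sketch: set one NAME=value line in a parameter file, keeping a backup.
# Assumption (not from the manual): the file uses "Name=value" lines.
# Usage: set_rmparam FILE NAME VALUE
set_rmparam() {
    file=$1
    name=$2
    value=$3
    # Keep a copy of the original so the change can be reverted.
    cp "$file" "$file.bak" || return 1
    # Rewrite only the matching line; all other lines pass through unchanged.
    sed "s/^${name}=.*/${name}=${value}/" "$file.bak" > "$file"
}

# Usage on each node (not executed here):
#   set_rmparam /etc/raid/rmparams System_LunReDistribution false
```

The sketch writes through a backup copy rather than editing in place because older versions of sed do not offer an in-place option.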

  3. Restart the RAID Manager daemon.


    # /etc/init.d/amdemon stop
    # /etc/init.d/amdemon start
    
  4. Do you have a failed controller?

    • If your controller module is offline, but does not have a failed controller, proceed to Step 5.

    • If you have a failed controller, replace the failed controller with a new controller. Do not bring the controller online.

      For the procedure about how to replace controllers, see the Sun StorEdge A3500/A3500FC Controller Module Guide. For additional considerations, see the Sun StorEdge RAID Manager Installation and Support Guide.

  5. On one node, use the RAID Manager GUI's Recovery application to bring the controller online.


    Note –

    You must use the RAID Manager GUI's Recovery application to bring the controller online. Do not use the Redundant Disk Array Controller Utility (rdacutil) because this utility ignores the value of the System_LunReDistribution parameter in the /etc/raid/rmparams file.


    For information about the Recovery application, see the Sun StorEdge RAID Manager User’s Guide. If you have problems with bringing the controller online, see the Sun StorEdge RAID Manager Installation and Support Guide.

  6. On one node that is connected to the storage array or storage system, verify that the controller has the correct SCSI reservation state.

    Run the scdidadm(1M) repair option (-R) on LUN 0 of the controller you want to bring online.


    # scdidadm -R /dev/dsk/cNtXdY
    
  7. (StorEdge A3500 Only) Set the controller to active/active mode. Assign LUNs to the controller.

    For more information about controller modes, see the Sun StorEdge RAID Manager Installation and Support Guide and the Sun StorEdge RAID Manager User’s Guide.

  8. (StorEdge A3500 Only) Reset the System_LunReDistribution parameter in the /etc/raid/rmparams file to true.

    For the procedure about how to change the rmparams file, see the Sun StorEdge RAID Manager Installation and Support Guide.

  9. (StorEdge A3500 Only) Restart the RAID Manager daemon.


    # /etc/init.d/amdemon stop
    # /etc/init.d/amdemon start
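
The daemon restart used in Step 3 and Step 9 can be wrapped in a small helper so that the same sequence runs on every node. This is a minimal sketch: the function name `restart_raid_daemon` is hypothetical, and the optional argument exists only so that the function can be exercised against a stand-in script.

```shell
# Sketch: stop and then start the RAID Manager daemon, as in Steps 3 and 9.
# The script path defaults to the one shown in the procedure.
# Usage: restart_raid_daemon [INIT_SCRIPT]
restart_raid_daemon() {
    script=${1:-/etc/init.d/amdemon}
    # Run both commands unconditionally, in the same order as the manual.
    "$script" stop
    # The function returns the exit status of the start command.
    "$script" start
}

# Usage as superuser on a node (not executed here):
#   restart_raid_daemon
```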