Sun Cluster 3.0 12/01 Hardware Guide

How to Replace a Failed Controller or Restore an Offline Controller

Use this procedure to replace a StorEdge A3500/A3500FC controller, or to restore an offline controller.

For conceptual information on SCSI reservations and failure fencing, see the Sun Cluster 3.0 12/01 Concepts.


Note -

If you are using your StorEdge A3500FC arrays to create a SAN with a Sun StorEdge Network FC Switch-8 or Switch-16 and Sun SAN Version 3.0 release software, the replacement and cabling procedures differ from the following procedure. (StorEdge A3500 arrays are not supported by the Sun SAN 3.0 release at this time.) See "StorEdge A3500FC Array SAN Considerations" for more information.


  1. On both nodes, set the System_LunReDistribution parameter in the /etc/raid/rmparams file to false. This setting prevents LUNs from being automatically assigned to the controller that is being brought online.


    Caution -

    You must set the System_LunReDistribution parameter in the /etc/raid/rmparams file to false so that no LUNs are assigned to the controller being brought online. After you verify in Step 5 that the controller has the correct SCSI reservation state, you can balance LUNs between both controllers.


    For the procedure on modifying the rmparams file, see the Sun StorEdge RAID Manager Installation and Support Guide.

  2. Restart the RAID Manager daemon:


    # /etc/init.d/amdemon stop
    # /etc/init.d/amdemon start
    

  3. Do you have a failed controller?

    • If your controller module is offline but does not contain a failed controller, go to Step 4.

    • If you have a failed controller, replace the failed controller with a new controller, but do not bring the controller online.

      For the procedure on replacing StorEdge A3500/A3500FC controllers, see the Sun StorEdge A3500/A3500FC Controller Module Guide. For additional considerations, see the Sun StorEdge RAID Manager Installation and Support Guide.

  4. On one node, use the RAID Manager GUI's Recovery application to bring the controller online.


    Note -

    You must use the RAID Manager GUI's Recovery application to bring the controller online. Do not use the Redundant Disk Array Controller Utility (rdacutil) because it ignores the value of the System_LunReDistribution parameter in the /etc/raid/rmparams file.


    For information on the Recovery application, see the Sun StorEdge RAID Manager User's Guide. If you have problems bringing the controller online, see the Sun StorEdge RAID Manager Installation and Support Guide.

  5. On one node that is connected to the StorEdge A3500/A3500FC system, verify that the controller has the correct SCSI reservation state.

    Run the scdidadm(1M) repair option (-R) on LUN 0 of the controller you want to bring online:


    # scdidadm -R /dev/dsk/cNtXdY
    

  6. Set the controller to active/active mode and assign LUNs to it.

    For more information on controller modes, see the Sun StorEdge RAID Manager Installation and Support Guide and the Sun StorEdge RAID Manager User's Guide.

  7. Reset the System_LunReDistribution parameter in the /etc/raid/rmparams file to true.

    For the procedure on changing the rmparams file, see the Sun StorEdge RAID Manager Installation and Support Guide.

  8. Restart the RAID Manager daemon:


    # /etc/init.d/amdemon stop
    # /etc/init.d/amdemon start
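
Because Step 1 and Step 7 both depend on the value of the System_LunReDistribution parameter, it can be useful to confirm the current setting on each node before you bring the controller online. The following is a minimal sketch, not part of the RAID Manager software. The function names rmparam_value and lun_redistribution_disabled are hypothetical, and the sketch assumes that the parameter appears in the rmparams file on its own line in NAME=value form with no surrounding white space. Check this assumption against your own file and against the Sun StorEdge RAID Manager Installation and Support Guide.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with RAID Manager):
# print the value of a NAME=value parameter from an rmparams-style file.
# Assumption: the parameter sits on its own line as NAME=value.
rmparam_value() {
    # $1 = parameter name, $2 = path to the file
    grep "^$1=" "$2" | tail -1 | cut -d= -f2
}

# Hypothetical check: succeeds (exit status 0) only when
# System_LunReDistribution is set to false, in any letter case.
# $1 = path to the file (defaults to /etc/raid/rmparams)
lun_redistribution_disabled() {
    value=$(rmparam_value System_LunReDistribution "${1:-/etc/raid/rmparams}" | tr 'A-Z' 'a-z')
    [ "$value" = "false" ]
}
```

For example, after sourcing these functions on a node, `lun_redistribution_disabled && echo "LUN redistribution is disabled"` prints the message only when the parameter is false. The check reads the file only. It does not tell you whether the RAID Manager daemon has been restarted since the file was last edited.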