Sun Cluster 3.0 Hardware Guide

Maintaining a StorEdge A3500 Disk Array

This section provides the procedures for maintaining a StorEdge A3500 disk array. The following table lists these procedures.

Table 7-3 Task Map: Maintaining a StorEdge A3500 Disk Array

Task 

For Instructions, Go To... 

Add a disk drive 

Follow the same procedure used in a non-cluster environment. 

Sun StorEdge D1000 Storage Guide for the hardware procedure. Sun StorEdge RAID Manager User's Guide for the software procedure.

Remove a disk drive 

Follow the same procedure used in a non-cluster environment. Before you physically remove a disk drive from a Redundant Disk Array Controller (rdac) disk group, you need to remove the logical unit number (LUN).

Sun StorEdge D1000 Storage Guide for the hardware procedure

 

"How to Remove a LUN"

Replace a disk drive 

Follow the same procedure used in a non-cluster environment if you have only one failed disk drive. If you have more than one failed disk drive, you might need to back up and restore your data. If disk drive replacement affects any LUN's availability, remove the LUN(s) from volume management control using Solstice DiskSuite or VERITAS Volume Manager. After you replace the disk drive and you can access all LUNs, you can return the LUN(s) to volume management control. 

Sun StorEdge D1000 Storage Guide for the hardware procedure. Sun StorEdge RAID Manager User's Guide for the software procedure.

Add a StorEdge A3500 to an existing cluster

Install host adapters and SCSI cables, then power on the disk array. Use the hardware install procedure to perform an initial installation and configuration of a StorEdge A3500 disk array or to install a StorEdge A3500 disk array to an existing cluster.

"How to Add a StorEdge A3500"

Remove a StorEdge A3500 from an existing cluster

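Migrate data and data services off the disk array and remove its LUNs from volume management control before you disconnect the disk array. This procedure removes all data on the disk array.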

"How to Remove a StorEdge A3500"

Replace a StorEdge A3500 controller and restore an offline StorEdge A3500 controller

This procedure requires that you set the System_LunReDistribution parameter in the /etc/raid/rmparams file to false. You must also verify that the new controller has the correct SCSI reservation state before assigning LUNs.

"How to Replace a Failed StorEdge A3500 Controller or Restore an Offline StorEdge A3500 Controller"

Replace a host adapter 

This procedure requires that you halt the node that contains the failed SCSI host adapter and prepare the SCSI connections for continued operation. 

"How to Replace a Host Adapter"

Replace a failed SCSI cable from the controller to the drive tray 

Follow the same procedure used in a non-cluster environment. 

Sun StorEdge A3500/A3500FC Controller Module Guide

 

Sun StorEdge RAID Manager User's Guide

Replace a StorEdge A3500-to-host SCSI cable 

Follow the same procedure used in a non-cluster environment. 

Sun StorEdge A3500/A3500FC Controller Module Guide

 

Sun StorEdge RAID Manager User's Guide

Replace a controller fan canister 

Follow the same procedure used in a non-cluster environment. 

Sun StorEdge A3500/A3500FC Controller Module Guide

Replace the power supply fan canister  

Follow the same procedure used in a non-cluster environment. 

Sun StorEdge A3500/A3500FC Controller Module Guide

Replace DC power or battery harness 

Shut down the cluster, then follow the same procedure used in a non-cluster environment.

Sun Cluster 3.0 System Administration Guide for procedures on shutting down a cluster

 

Sun StorEdge A3500/A3500FC Controller Module Guide for replacement procedures

Replace the battery unit 

Shut down the cluster, then follow the same procedure used in a non-cluster environment.

Sun Cluster 3.0 System Administration Guide for procedures on shutting down a cluster

 

Sun StorEdge A3500/A3500FC Controller Module Guide for replacement procedures

Replace the controller card cage 

Shut down the cluster, then follow the same procedure used in a non-cluster environment.

Sun Cluster 3.0 System Administration Guide for procedures on shutting down a cluster

 

Sun StorEdge A3500/A3500FC Controller Module Guide for replacement procedures

Replace the entire controller assembly 

Shut down the cluster, then follow the same procedure used in a non-cluster environment.

Sun Cluster 3.0 System Administration Guide for procedures on shutting down a cluster

 

Sun StorEdge A3500/A3500FC Controller Module Guide for replacement procedures

Replace the power supply housing 

Shut down the cluster, then follow the non-cluster procedure for replacing the power supply housing. 

Sun Cluster 3.0 System Administration Guide for procedures on shutting down a cluster

 

Sun StorEdge A3500/A3500FC Controller Module Guide

How to Add a StorEdge A3500

Use this procedure to add a StorEdge A3500 disk array to an existing cluster.

  1. Ensure that each device in the SCSI chain has a unique SCSI address.

    The default SCSI address for host adapters is 7. Reserve SCSI address 7 for one host adapter in the SCSI chain. This procedure refers to the host adapter you choose for SCSI address 7 as the host adapter on the second node. To avoid conflicts, in Step 7 you will change the scsi-initiator-id of the remaining host adapter in the SCSI chain to an available SCSI address. This procedure refers to the host adapter with an available SCSI address as the host adapter on the first node. SCSI address 6 is usually available.

    For more information, see the OpenBoot 3.x Command Reference Manual and the labels inside the storage device.

  2. Shut down and power off the first node.


    # scswitch -S -h nodename
    # shutdown -y -g 0
    

    For more information on shutdown, see Sun Cluster 3.0 System Administration Guide.

  3. Install the host adapters in the node.

    For the procedure on installing host adapters, see the documentation that shipped with your host adapters and nodes.

  4. Connect the differential SCSI cable between the node and the disk array as shown in Figure 7-2.

    Make sure that the entire SCSI bus length to each enclosure is less than 25 m. This measurement includes the cables to both nodes, as well as the bus length internal to each enclosure, node, and host adapter. Refer to the documentation that shipped with the disk array for other restrictions regarding SCSI operation.

    Figure 7-2 Example of a StorEdge A3500 disk array

  5. Power on the first node and the disk arrays.

  6. Find the paths to the SCSI host adapters.


    {0} ok show-disks
    

    Identify the two controllers that will be connected to the disk arrays, and record their paths. Use this information to change the SCSI addresses of these controllers in the nvramrc script. Do not include the /sd directories in the device paths.

  7. Edit the nvramrc script to change the scsi-initiator-id for the host adapters of the first node.

    For a list of nvramrc editor and nvedit keystroke commands, see Appendix B, NVRAMRC Editor and NVEDIT Keystroke Commands.

    The following example sets the scsi-initiator-id to 6. The OpenBoot PROM Monitor prints the line numbers (0:, 1:, and so on).


    Caution -

    Insert exactly one space after the double quote and before scsi-initiator-id.



    {0} ok nvedit 
    0: probe-all
    1: cd /sbus@1f,0/QLGC,isp@3,10000 
    2: 6 encode-int " scsi-initiator-id" property 
    3: device-end 
    4: cd /sbus@1f,0/ 
    5: 6 encode-int " scsi-initiator-id" property 
    6: device-end 
    7: install-console 
    8: banner [Control C] 
    {0} ok

  8. Store the changes.

    The changes you make through the nvedit command are done on a temporary copy of the nvramrc script. You can continue to edit this copy without risk. After you have completed your edits, save the changes. If you are not sure about the changes, discard them.

    • To store the changes, type:


      {0} ok nvstore
      {0} ok 

    • To discard the changes, type:


      {0} ok nvquit
      {0} ok

  9. Verify the contents of the nvramrc script you created in Step 7.

    If the contents of the nvramrc script are incorrect, use the nvedit command to make corrections.


    {0} ok printenv nvramrc
    nvramrc =             probe-all
                          cd /sbus@1f,0/QLGC,isp@3,10000
                          6 encode-int " scsi-initiator-id" property
                          device-end 
                          cd /sbus@1f,0/
                          6 encode-int " scsi-initiator-id" property
                          device-end  
                          install-console
                          banner
    {0} ok

  10. Instruct the OpenBoot PROM Monitor to use the nvramrc script.


    {0} ok setenv use-nvramrc? true
    use-nvramrc? = true
    {0} ok 

  11. Boot the first node, and wait for it to join the cluster.


    {0} ok boot -r
    

    For more information, see Sun Cluster 3.0 System Administration Guide.

  12. On all nodes, verify that the DIDs have been assigned to the StorEdge A3500 LUNs.


    # scdidadm -l
    

  13. Shut down the second node.


    # scswitch -S -h nodename
    # shutdown -y -g 0
    

  14. Power off the second node.

  15. Install the host adapters in the second node.

    For the procedure on installing host adapters, see the documentation that shipped with your nodes.

  16. Connect the disk array to the host adapters using differential SCSI cables as shown in Figure 7-3.

    Figure 7-3 Example of a StorEdge A3500 disk array

  17. Without allowing the node to boot, power on the second node. If necessary, abort the system to continue with OpenBoot PROM Monitor tasks.

  18. Verify that the second node sees the new host adapters and disk drives.


    {0} ok show-disks
    

  19. Verify that the scsi-initiator-id for the host adapters on the second node is set to 7.

    Use the show-disks command to find the paths to the host adapters connected to these enclosures. Select each host adapter's device tree node, and display the node's properties to confirm that the scsi-initiator-id for each host adapter is set to 7.


    {0} ok cd /sbus@1f,0/QLGC,isp@3,10000
    {0} ok .properties
    scsi-initiator-id        00000007 
    ...

  20. Boot the second node, and wait for it to join the cluster.


    {0} ok boot -r
    

  21. On all nodes, verify that the DIDs have been assigned to the StorEdge A3500 LUNs.


    # scdidadm -l
    

  22. Install the RAID Manager.

    For the procedure on installing RAID Manager, see Sun StorEdge RAID Manager Installation and Support Guide.

  23. Install StorEdge A3500 disk array patches.

    For the location of patches and installation instructions, see Sun Cluster 3.0 Release Notes.

  24. One at a time, reboot each node into cluster mode.


    # reboot
    
  25. Upgrade the StorEdge A3500 disk array controller firmware.

    For the StorEdge A3500 disk array controller firmware version number and boot level, see Sun Cluster 3.0 Release Notes. For the procedure on upgrading the StorEdge A3500 controller firmware, see Sun StorEdge RAID Manager User's Guide.
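The nvramrc script in Step 7 repeats the same three lines for each host adapter path, so its shape can be sketched as a small shell function. This is only an illustration of the script layout, not a tool that ships with Sun Cluster: the function name gen_nvramrc is hypothetical, and the output is meant to be compared against what you type in nvedit, not loaded automatically. The example call uses only the first adapter path shown in Step 7; pass every path you recorded in Step 6.

```shell
#!/bin/sh
# Hypothetical helper: print nvramrc script lines that set scsi-initiator-id
# for each host adapter device path given after the ID.
# Usage: gen_nvramrc ID PATH [PATH ...]
gen_nvramrc() {
    id=$1
    shift
    echo 'probe-all'
    for path in "$@"; do
        printf 'cd %s\n' "$path"
        # Exactly one space follows the double quote, before scsi-initiator-id.
        printf '%s encode-int " scsi-initiator-id" property\n' "$id"
        echo 'device-end'
    done
    echo 'install-console'
    echo 'banner'
}

# Example: SCSI address 6 for one adapter on the first node.
gen_nvramrc 6 /sbus@1f,0/QLGC,isp@3,10000
```

Printing the lines this way makes it easy to check the single space inside the quoted property name before you enter the script at the ok prompt.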

Where to Go From Here

To create a LUN from unassigned disk drives, see "How to Create a LUN".
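The DID checks in Step 12 and Step 21 amount to looking for each new LUN's cNtXdY name in the scdidadm -l listing. The sketch below shows one way to do that lookup. It assumes each output line has three whitespace-separated fields (instance number, node:/dev/rdsk/cNtXdY, and DID device path); the sample lines and the function name did_for_disk are hypothetical, so compare them with the output on your own nodes. On Solaris you might need nawk in place of awk for the -v option.

```shell
#!/bin/sh
# Read `scdidadm -l` style output on stdin and print the DID device mapped
# to the given cNtXdY name. The exit status is 1 if no mapping is found.
did_for_disk() {
    awk -v disk="$1" '
        {
            # Field 2 is assumed to look like node:/dev/rdsk/cNtXdY.
            n = split($2, part, "/")
            if (part[n] == disk) { print $3; found = 1 }
        }
        END { exit (found ? 0 : 1) }
    '
}

# Hypothetical sample listing; on a node, pipe in `scdidadm -l` instead.
sample='1        phys-schost-1:/dev/rdsk/c0t0d0 /dev/did/rdsk/d1
2        phys-schost-1:/dev/rdsk/c1t5d0 /dev/did/rdsk/d2'

printf '%s\n' "$sample" | did_for_disk c1t5d0
```

A nonzero exit status for a LUN you expect to see suggests that the node has not yet assigned a DID to it.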

How to Remove a StorEdge A3500

Use this procedure to remove a StorEdge A3500 from an existing cluster.


Caution -

This procedure removes all data on the disk array you remove.


  1. Migrate all Oracle Parallel Server (OPS) tables, data services, and volumes off the StorEdge A3500 disk array.

  2. Stop all I/O activity to the StorEdge A3500.

  3. If a volume manager does not manage any of the logical unit numbers (LUNs) on the StorEdge A3500, proceed to Step 4. Otherwise, run the appropriate Solstice DiskSuite or VERITAS Volume Manager commands to remove the LUN(s) from any diskset or disk group.

    For more information, see your Solstice DiskSuite or VERITAS Volume Manager documentation.

  4. If you are removing the last StorEdge A3500 in your cluster, remove StorEdge A3500 packages.

    For the procedure on removing StorEdge A3500 packages, see the documentation that shipped with your disk array.

  5. One at a time, reboot each node into cluster mode.


    # reboot
    
  6. Disconnect the SCSI cables from the disk array.

  7. On all cluster nodes, remove references to the disk array.


    # devfsadm -C
    # scdidadm -C
    
  8. If needed, remove any host adapters from the nodes.

    For the procedure on removing host adapters, see the documentation that shipped with your nodes.

How to Replace a Failed StorEdge A3500 Controller or Restore an Offline StorEdge A3500 Controller

Use this procedure to replace a failed StorEdge A3500 disk array controller or restore an offline StorEdge A3500 disk array controller.

For conceptual information on SCSI reservations and failure fencing, see Sun Cluster 3.0 Concepts.

  1. On all nodes, to prevent LUNs from being automatically assigned to the new controller, set the System_LunReDistribution parameter in the /etc/raid/rmparams file to false.


    Caution -

    You must set the System_LunReDistribution parameter in the /etc/raid/rmparams file to false so that no LUNs are assigned to the controller being brought online. After you verify in Step 4 that the controller has the correct SCSI reservation state, you can balance LUNs between both controllers.


    For the procedure on modifying the rmparams file, see Sun StorEdge RAID Manager Installation and Support Guide.

  2. If you have a failed controller, replace the failed controller with a new controller, but do not bring the controller online. If you do not have a failed controller, proceed to Step 3.

    For the procedure on replacing StorEdge A3500 controllers, see Sun StorEdge RAID Manager Installation and Support Guide.

  3. On one node, use the RAID Manager 6.x graphical user interface's Recovery application to bring the controller online.


    Caution -

    You must use the RAID Manager 6.x graphical user interface's Recovery application to bring the controller online. Do not use the Redundant Disk Array Controller Utility (rdacutil) because it ignores the value of the System_LunReDistribution parameter in the /etc/raid/rmparams file.


    For information on the Recovery application, see Sun StorEdge RAID Manager User's Guide. For the procedure on replacing StorEdge A3500 controllers, see Sun StorEdge RAID Manager User's Guide. If you have problems bringing the controller online, see Sun StorEdge RAID Manager Installation and Support Guide.

  4. On one node connected to the disk array, verify that the controller has the correct SCSI reservation state.

    Run the scdidadm repair procedure (-R) on LUN 0 of the controller you want to restore.


    # scdidadm -R /dev/dsk/cNtXdY
    
  5. Set the new controller to active/active mode and assign LUNs to the new controller.

    For more information on controller modes, see Sun StorEdge RAID Manager Installation and Support Guide and Sun StorEdge RAID Manager User's Guide.

  6. On all nodes, reset the System_LunReDistribution parameter in the /etc/raid/rmparams file to true.

    For the procedure on changing the rmparams file, see Sun StorEdge RAID Manager Installation and Support Guide.
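Step 1 and Step 6 change the same parameter in opposite directions, which can be sketched as a single shell function. The sketch assumes that rmparams stores the setting on its own line in the form System_LunReDistribution=VALUE and that the value is written in uppercase. Confirm both against your file and the Sun StorEdge RAID Manager Installation and Support Guide before adapting it. The function name set_lun_redistribution and the Other_Parameter line are hypothetical, and the demonstration edits a scratch file, not /etc/raid/rmparams.

```shell
#!/bin/sh
# Rewrite the System_LunReDistribution line in an rmparams-style file.
# Usage: set_lun_redistribution FILE VALUE
set_lun_redistribution() {
    file=$1
    value=$2
    tmp="$file.new"
    # Write to a temporary file first and replace the original only on success.
    sed "s/^System_LunReDistribution=.*/System_LunReDistribution=$value/" "$file" > "$tmp" &&
        mv "$tmp" "$file"
}

# Demonstrate on a scratch file with hypothetical contents.
demo=$(mktemp)
printf '%s\n' 'Other_Parameter=1' 'System_LunReDistribution=TRUE' > "$demo"
set_lun_redistribution "$demo" FALSE
cat "$demo"
```

Keep a backup copy of the real file before you edit it, and make the same change on every node.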