Sun Cluster 3.1 - 3.2 With SCSI JBOD Storage Device Manual for Solaris OS

SPARC: How to Replace a Disk Drive With Oracle Real Application Clusters

You need to replace a disk drive if the disk drive fails or if you want to upgrade to a higher-quality or larger disk.

For conceptual information about quorum, quorum devices, global devices, and device IDs, see the Sun Cluster concepts documentation.

While performing this procedure, ensure that you use the correct controller number. Controller numbers can be different on each node.


Note –

If the following warning message is displayed, ignore the message. Continue with the next step.


vxvm:vxconfigd: WARNING: no transactions on slave
vxvm:vxassist: ERROR:  Operation must be executed on master

Before You Begin

This procedure relies on the following prerequisites and assumptions.

This procedure provides the long forms of the Sun Cluster commands. Most commands also have short forms. Except for the forms of the command names, the commands are identical. For a list of the commands and their short forms, see Appendix A, Sun Cluster Object-Oriented Commands, in Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS.

To perform this procedure, become superuser or assume a role that provides solaris.cluster.read and solaris.cluster.modify RBAC authorization.

  1. On each node, check the status of the failed disk drive.


    # vxdisk list
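
    In the output, a disk that has failed typically shows a status of error on its device record, and its disk media record shows failed. The following lines are an illustrative sketch only; the device name, disk media name, and disk group name are hypothetical.


    DEVICE       TYPE            DISK         GROUP        STATUS
    c4t0d0s2     auto:sliced     -            -            error
    -            -               dg01         oradg        failed was:c4t0d0s2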
    
  2. On each node, identify the failed disk's instance number.

    You use this instance number in Step 12.


    # ls -l /dev/dsk/cWtXdYsZ
    # cat /etc/path_to_inst | grep "device_path"
    

    Note –

    Ensure that you do not use the full ls command output as the device path. Strip the leading ../../devices portion and the trailing minor name. The following example demonstrates how to find the device path and how to identify the sd instance number by using the device path.


    # ls -l /dev/dsk/c4t0d0s2
    lrwxrwxrwx 1 root root 40 Jul 31 12:02 /dev/dsk/c4t0d0s2 ->
    ../../devices/pci@4,2000/scsi@1/sd@0,0:c
    # cat /etc/path_to_inst | grep "/pci@4,2000/scsi@1/sd@0,0"
    "/node@2/pci@4,2000/scsi@1/sd@0,0" 60 "sd"

    If you are using Solaris 10, the node@2 portion of this output is not present. Solaris 10 does not add this prefix for cluster nodes.


  3. On one node, identify the failed disk's disk ID number.

    You will use the disk ID number in Step 15 and Step 16.

    • If you are using Sun Cluster 3.2, use the following command:


      # cldevice show -v cNtXdY
      
    • If you are using Sun Cluster 3.1, use the following command:


      # scdidadm -o diskid -l cNtXdY
      
  4. If the disk drive that you are removing is configured as a quorum device, add a new quorum device that will not be affected by this procedure. Then remove the old quorum device. A sketch of this swap appears at the end of this step.

    To determine whether a quorum device will be affected by this procedure, use one of the following commands.

    • If you are using Sun Cluster 3.2, use the following command:


      # clquorum show +
      
    • If you are using Sun Cluster 3.1, use the following command:


      # scstat -q
      

    For procedures about how to add and remove quorum devices, see the Sun Cluster system administration documentation.
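
    The following is a minimal sketch of such a swap. It assumes that the old quorum device is the DID device d3 and that another shared disk that is not affected by this procedure is available as d20; both DID names are hypothetical.

    • If you are using Sun Cluster 3.2, use commands similar to the following:


      # clquorum add d20
      # clquorum remove d3
      
    • If you are using Sun Cluster 3.1, use commands similar to the following:


      # scconf -a -q globaldev=d20
      # scconf -r -q globaldev=d3
      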

  5. On each node, remove the failed disk from Veritas Volume Manager control.


    # vxdisk offline cXtYdZ
    # vxdisk rm cXtYdZ
    
  6. On each node, verify that you removed the disk entry.


    # vxdisk list
    
  7. Remove the failed disk from the storage array.

    For the procedure about how to remove a disk drive, see your storage documentation. For a list of storage documentation, see Related Documentation.

  8. On each node, unconfigure the failed disk.


    # cfgadm -c unconfigure cX::dsk/cXtYdZ
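
    For example, if the failed disk from the Step 2 example is c4t0d0 on controller c4 (both names are illustrative), the command resembles the following.


    # cfgadm -c unconfigure c4::dsk/c4t0d0
    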
    
  9. On each node, remove the paths to the disk drive that you are removing.


    # devfsadm -C
    
  10. On each node, verify that you removed the disk.


    # cfgadm -al | grep cXtYdZ
    # ls /dev/dsk/cXtYdZ
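
    If the removal succeeded, the cfgadm output contains no entry for the disk, and the ls command reports that the device links no longer exist.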
    
  11. Add the new disk to the storage array.

    For the procedure about how to add a disk drive, see your storage documentation. For a list of storage documentation, see Related Documentation.

  12. On each node, configure the new disk.

    Use the instance number that you identified in Step 2.


    # cfgadm -c configure cX::sd_instance_Number
    # devfsadm
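
    For example, if the new disk is attached to controller c4 and reuses sd instance number 60 from the Step 2 example (both values are illustrative), the commands resemble the following. You can confirm the exact attachment-point name with cfgadm -al.


    # cfgadm -c configure c4::sd60
    # devfsadm
    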
    
  13. Verify that you added the disk.


    # ls /dev/dsk/cXtYdZ
    
  14. On one node, update the device ID numbers.

    Use the device ID number that you identified in Step 3.

    • If you are using Sun Cluster 3.2, use the following command:


      # cldevice repair deviceID
      
    • If you are using Sun Cluster 3.1, use the following command:


      # scdidadm -R deviceID
      

      Note –

      After running scdidadm -R on the first node, each subsequent node that you run the command on might display a warning. Ignore this warning.


  15. Verify that you updated the disk's device ID number.

    Use the device ID number that you identified in Step 3.

    • If you are using Sun Cluster 3.2, use the following command:


      # cldevice show | grep DID_number
      
    • If you are using Sun Cluster 3.1, use the following command:


      # scdidadm -L | grep DID_number
      
  16. Verify that the disk ID number is different from the disk ID number that you identified in Step 3.

  17. On each node, add the new disk to the Veritas Volume Manager database.


    # vxdctl enable
    
  18. Verify that you added the new disk.


    # vxdisk list | grep cXtYdZ
    
  19. Determine the master node.


    # vxdctl -c mode
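
    The output indicates whether the node is the cluster master. The following lines are a sketch only; the node name is hypothetical.


    mode: enabled: cluster active - MASTER
    master: phys-schost-1
    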
    
  20. Perform disk recovery tasks on the master node.

    Depending on your configuration and volume layout, select the appropriate Veritas Volume Manager menu item to recover the failed disk.


    # vxdiskadm
    # vxtask list
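
    In many Veritas Volume Manager releases, the applicable vxdiskadm menu item is "Replace a failed or removed disk"; confirm the exact wording in your installed release. You can then use vxtask list to monitor the resynchronization tasks that the recovery starts.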