How to Replace a Disk Drive With Oracle Real Application Clusters (Sun Cluster 3.0-3.1 With SCSI JBOD Storage Device Manual for Solaris OS)

You need to replace a disk drive if the disk drive fails or when you want to upgrade to a higher-quality or larger disk.

For conceptual information about quorum, quorum devices, global devices, and device IDs, see the Sun Cluster concepts documentation.

While performing this procedure, ensure that you use the correct controller number. Controller numbers can be different on each node.

Note –

If the following warning message is displayed, ignore the message. Continue with the next step.

vxvm:vxconfigd: WARNING: no transactions on slave
vxvm:vxassist: ERROR:  Operation must be executed on master

Before You Begin

This procedure relies on the following prerequisites and assumptions.

Your cluster is operational.
Your system is running Oracle Real Application Clusters.

If your system is not running Oracle Real Application Clusters, see How to Replace a Disk Drive Without Oracle Real Application Clusters.
Your nodes are not configured with dynamic reconfiguration functionality.

If your nodes are configured for dynamic reconfiguration, see the Sun Cluster system administration documentation, and skip steps that instruct you to shut down the node.

Steps

On each node, check the status of the failed disk drive.
# vxdisk list

On each node, identify the failed disk's instance number.

You use this instance number in Step 13.

# ls -l /dev/dsk/cWtXdYsZ
# cat /etc/path_to_inst | grep "device_path"

Note –

Do not use the complete ls command output as the device path. The device path is the portion of the link target that follows ../../devices and precedes the colon. The following example demonstrates how to find the device path and how to identify the sd instance number by using the device path.

# ls -l /dev/dsk/c4t0d0s2
lrwxrwxrwx 1 root root 40 Jul 31 12:02 /dev/dsk/c4t0d0s2 ->
../../devices/pci@4,2000/scsi@1/sd@0,0:cb41d
# cat /etc/path_to_inst | grep "/pci@4,2000/scsi@1/sd@0,0"
"/node@2/pci@4,2000/scsi@1/sd@0,0" 60 "sd"

If you are using Solaris 10, the node@2 portion of this output is not present. Solaris 10 does not add this prefix for cluster nodes.
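
The lookup in the preceding example can be sketched as two small shell functions. This is a minimal sketch, not part of Solaris or Sun Cluster: the function names device_path_from_link and sd_instance are hypothetical, and the script assumes that the link target and the path_to_inst entries have the layout shown in the example. On Solaris, nawk or /usr/xpg4/bin/awk might be needed in place of awk because the script uses the -v option.

```shell
# Hypothetical helpers for Step 2; the function names are illustrative only.

# Reduce a /dev/dsk symlink target to the physical device path by dropping
# everything up to "/devices" and the trailing ":<suffix>" component.
device_path_from_link() {
    printf '%s\n' "$1" | sed -e 's|^.*/devices/|/|' -e 's|:[^:/]*$||'
}

# Print the instance number that a path_to_inst-style file records for a
# device path. The comparison is made against the end of the recorded path,
# so an optional /node@N prefix is tolerated.
#   $1 = physical device path
#   $2 = file to search (assumed default: /etc/path_to_inst)
sd_instance() {
    awk -v p="$1" '{
        path = $1
        gsub(/"/, "", path)                 # strip the surrounding quotes
        n = length(path) - length(p)
        if (n >= 0 && substr(path, n + 1) == p) print $2
    }' "${2:-/etc/path_to_inst}"
}

# Demonstration with the link target from the example above.
device_path_from_link '../../devices/pci@4,2000/scsi@1/sd@0,0:cb41d'
```

With the values from the example, the demonstration prints /pci@4,2000/scsi@1/sd@0,0. Passing that path to sd_instance on the node prints the instance number to use in Step 13.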

On one node, identify the failed disk's device ID number.

You will use the device ID number in Step 15 and Step 16.
# scdidadm -L | grep cXtYdZ

On one node, identify the failed disk's disk ID number.

You will use the disk ID number in Step 17.
# scdidadm -o diskid -l cXtYdZ
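
Picking the device ID number out of the scdidadm -L listing can be sketched as follows. The function name did_number is hypothetical. The sketch assumes that each line of the listing has three columns (the DID instance number, node:/dev/rdsk/cXtYdZ, and the /dev/did/rdsk/dN path). The host names in the sample data are placeholders.

```shell
# Hypothetical helper; the name did_number is illustrative only.
# Reads scdidadm -L style lines on standard input and prints, once, the DID
# instance number (first column) whose device name equals the argument.
did_number() {
    awk -v d="$1" '{
        n = split($2, part, "/")    # $2 is assumed to be node:/dev/rdsk/cXtYdZ
        if (part[n] == d && !seen[$1]++) print $1
    }'
}

# Demonstration with placeholder data in the assumed three-column layout.
did_number c1t1d0 <<'EOF'
1        phys-schost-1:/dev/rdsk/c0t0d0 /dev/did/rdsk/d1
2        phys-schost-1:/dev/rdsk/c1t1d0 /dev/did/rdsk/d2
2        phys-schost-2:/dev/rdsk/c1t1d0 /dev/did/rdsk/d2
EOF
```

On a cluster node, the equivalent call would be scdidadm -L | did_number cXtYdZ. Because controller numbers can differ on each node, the same DID number might be listed under a different cXtYdZ name for another node.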

Determine if the disk drive that you want to replace is a quorum device.
# scstat -q
- If the disk drive is not a quorum device, proceed to the next step.
- If the disk drive that you want to replace is a quorum device, add a new quorum device on a different storage device and remove the old quorum device.
  
  For procedures about how to add and remove quorum devices, see the Sun Cluster system administration documentation.
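
The quorum check can be sketched as a filter over the scstat -q output. The function name is_quorum_device is hypothetical. The sketch assumes that quorum devices appear on lines that begin with "Device votes:" followed by a /dev/did/rdsk/dNsM path. Verify that assumption against the output on your cluster before relying on the result.

```shell
# Hypothetical helper; the name is_quorum_device is illustrative only.
# Reads scstat -q style output on standard input. Exits 0 if the given DID
# name (for example, d3) is listed as a quorum device, and 1 otherwise.
is_quorum_device() {
    awk -v d="$1" '
        $1 == "Device" && $2 == "votes:" {
            n = split($3, part, "/")
            name = part[n]
            sub(/s[0-9]+$/, "", name)   # drop the slice suffix: d3s2 -> d3
            if (name == d) found = 1
        }
        END { exit !found }'
}

# Demonstration with placeholder data in the assumed layout.
if is_quorum_device d3 <<'EOF'
  Node votes:       phys-schost-1       1        1       Online
  Device votes:     /dev/did/rdsk/d3s2  1        1       Online
EOF
then
    echo "d3 is a quorum device"
else
    echo "d3 is not a quorum device"
fi
```

On a cluster node, the equivalent call would be scstat -q | is_quorum_device dN, where dN corresponds to the device ID number from Step 3.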

On each node, remove the failed disk from VERITAS Volume Manager control.
# vxdisk offline cXtYdZ
# vxdisk rm cXtYdZ

On each node, verify that you removed the disk entry.
# vxdisk list

Remove the failed disk from the storage array.

For the procedure about how to remove a disk drive, see your storage documentation. For a list of storage documentation, see Related Documentation.

On each node, unconfigure the failed disk.
# cfgadm -c unconfigure cX::dsk/cXtYdZ

On each node, remove the paths to the disk drive that you are removing.
# devfsadm -C

On each node, verify that you removed the disk.

# cfgadm -al | grep cXtYdZ
# ls /dev/dsk/cXtYdZ
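
The /dev/dsk part of this check can be sketched as a function that succeeds only when no slice or partition links remain for the disk. The function name disk_absent and its optional directory argument are hypothetical. The directory argument exists so that the function can be tried against a scratch directory.

```shell
# Hypothetical helper; the name disk_absent is illustrative only.
# Exits 0 if no entries for the disk remain in the device directory,
# and 1 if at least one entry (including a dangling link) still exists.
#   $1 = disk name, for example c4t0d0
#   $2 = directory to inspect (assumed default: /dev/dsk)
disk_absent() {
    dir=${2:-/dev/dsk}
    # Slice links end in sN; partition links, where present, end in pN.
    for entry in "$dir/$1"[sp]*; do
        if [ -e "$entry" ] || [ -L "$entry" ]; then
            return 1
        fi
    done
    return 0
}

# Demonstration against an empty scratch directory.
scratch=$(mktemp -d)
if disk_absent c4t0d0 "$scratch"; then
    echo "no entries remain for c4t0d0"
fi
rm -rf "$scratch"
```

The test for a symbolic link is included because a link whose target was removed fails the ordinary existence test but still indicates that the path cleanup is incomplete.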

Add the new disk to the storage array.

For the procedure about how to add a disk drive, see your storage documentation. For a list of storage documentation, see Related Documentation.

On each node, configure the new disk.

Use the instance number that you identified in Step 2.
# cfgadm -c configure cX::sd_instance_Number
# devfsadm

Verify that you added the disk.
# ls /dev/dsk/cXtYdZ

On one node, update the device ID numbers.

Use the device ID number that you identified in Step 3.
# scdidadm -R DID_number

Verify that you updated the disk's device ID number.

Use the device ID number that you identified in Step 3.
# scdidadm -L | grep DID_number

Verify that the disk ID number is different from the disk ID number that you identified in Step 4.
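
The comparison can be sketched as follows, assuming that you saved the output of the Step 4 command before you removed the disk. The function name disk_id_changed is hypothetical, and the ID strings in the demonstration are placeholders, not real disk IDs.

```shell
# Hypothetical helper; the name disk_id_changed is illustrative only.
# Exits 0 only if both IDs are non-empty and differ after whitespace is
# removed. An empty ID is treated as a failure, not as a change.
#   $1 = disk ID recorded in Step 4   $2 = disk ID reported now
disk_id_changed() {
    old=$(printf '%s' "$1" | tr -d '[:space:]')
    new=$(printf '%s' "$2" | tr -d '[:space:]')
    [ -n "$old" ] && [ -n "$new" ] && [ "$old" != "$new" ]
}

# Demonstration with placeholder values.
if disk_id_changed 'OLD-ID-PLACEHOLDER' 'NEW-ID-PLACEHOLDER'; then
    echo "disk ID changed"
else
    echo "disk ID unchanged or missing"
fi
```

On a cluster node, the second argument would be the current output of scdidadm -o diskid -l cXtYdZ.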

On each node, add the new disk to the VERITAS Volume Manager database.
# vxdctl enable

Verify that you added the new disk.
# vxdisk list | grep cXtYdZ

Determine the master node.
# vxdctl -c mode
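
Reading the role from the command output can be sketched as follows. The function name vxvm_role is hypothetical. The sketch assumes that the output contains a line that begins with "mode:" and ends with MASTER or SLAVE. If your VERITAS Volume Manager release prints a different format, the function reports unknown.

```shell
# Hypothetical helper; the name vxvm_role is illustrative only.
# Reads vxdctl -c mode style output on standard input and prints
# "master", "slave", or "unknown".
vxvm_role() {
    awk '
        /^mode:/ {
            if ($NF == "MASTER") role = "master"
            else if ($NF == "SLAVE") role = "slave"
        }
        END { print (role == "" ? "unknown" : role) }'
}

# Demonstration with a line in the assumed format.
echo 'mode: enabled: cluster active - MASTER' | vxvm_role
```

On a cluster node, the equivalent call would be vxdctl -c mode | vxvm_role, run on each node until one node reports master.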

Perform disk recovery tasks on the master node.

Depending on your configuration and volume layout, select the appropriate VERITAS Volume Manager menu item to recover the failed disk.
# vxdiskadm
# vxtask list