Sun Cluster 2.2 Cluster Volume Manager Guide

3.10 Replacing a Bad Disk in a SPARCstorage Array Tray

It is possible to replace a SPARCstorage Array disk without halting system operations.

Note -

The following procedure is used to replace a failed disk in a SPARCstorage Array 100 Series. When removable storage media (RSM) disks are used, follow the procedure in the Sun Cluster 2.2 System Administration Guide and the applicable hardware service manual.

Identify all the volumes and corresponding plexes on the disks in the tray that contains the faulty disk:
1. From the physical device address cNtNdN, obtain the controller number and the target number.
  
  For example, if the device address is c3t2d0, the controller number is 3 and the target is 2.
2. Identify devices from a vxdisk list output:
  
  If the target is 0 or 1, identify all devices with physical addresses beginning with cNt0 and cNt1. If the target is 2 or 3, identify all devices with physical addresses beginning with cNt2 and cNt3. If the target is 4 or 5, identify all devices with physical addresses beginning with cNt4 and cNt5.
  
  For example:
  # vxdisk -g diskgroup -q list | egrep c3t2\|c3t3 | nawk '{print $3}'
  Record the volume media name for the faulty disk from the output of the command. This name is the device_media_name value used later in the procedure, when the replacement disk is added to the disk group with vxdg adddisk (Step 8).
3. Identify all plexes on the above devices by using the appropriate version (csh, ksh, or Bourne shell) of the following command:
  PLLIST=`vxprint -ptq -g diskgroup -e '(aslist.sd_dm_name in ("c3t2d0","c3t3d0","c3t3d1")) && (pl_kstate=ENABLED)' | nawk '{print $2}'`
  For csh, the syntax is set PLLIST .... For ksh, the syntax is export PLLIST= .... The Bourne shell requires the command export PLLIST after the variable is set.
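
The target-pairing rule in the sub-steps above can be sketched as a small shell helper. This is only an illustration, not part of the product: the function name tray_pattern is hypothetical, and the sketch assumes a POSIX-style shell such as ksh (it uses $( ) and $(( )), which the legacy Bourne shell does not provide). Given a device address of the form cNtNdN, it prints an egrep pattern covering both targets in the same tray.

```shell
# Hypothetical helper (not a VxVM or Sun Cluster command).
# Prints an egrep pattern matching both targets in the tray that holds
# the given device, using the pairing (0,1), (2,3), (4,5) described above.
tray_pattern() {
    # Split cNtNdN into "controller target"; empty output means no match.
    parts=$(printf '%s\n' "$1" |
        sed -n 's/^c\([0-9][0-9]*\)t\([0-9][0-9]*\)d[0-9][0-9]*$/\1 \2/p')
    if [ -z "$parts" ]; then
        echo "tray_pattern: not a cNtNdN address: $1" >&2
        return 1
    fi
    ctrl=${parts% *}
    tgt=${parts#* }
    # The procedure only describes targets 0 through 5.
    if [ "$tgt" -gt 5 ]; then
        echo "tray_pattern: target $tgt is outside 0-5" >&2
        return 1
    fi
    # Round down to the even target of the pair, then add its partner.
    low=$((tgt - tgt % 2))
    high=$((low + 1))
    echo "c${ctrl}t${low}|c${ctrl}t${high}"
}
```

With this helper defined, the example command could be written as vxdisk -g diskgroup -q list | egrep "$(tray_pattern c3t2d0)" | nawk '{print $3}', which expands to the same c3t2|c3t3 pattern shown in the example.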

After you have set the variable, detach all plexes in the tray:
# vxplex det ${PLLIST}
An alternate command for detaching each plex in a tray is:
# vxplex -g diskgroup -v volume det plex
Note -
The volumes will still be active because the other mirror is still available.
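
Because an empty PLLIST would give vxplex nothing to detach, it can be worth reviewing the plex list before running the command. The following is a minimal dry-run sketch, assuming a POSIX-style shell; the function name print_detach_cmds is hypothetical, and the per-plex form it prints (with -g but without the optional -v volume) is a variation on the two commands above, not a form taken from this guide. It only prints commands and never runs vxplex.

```shell
# Hypothetical dry-run helper: print (do not run) one detach command
# per plex so the list can be reviewed first.
# Usage: print_detach_cmds diskgroup plex [plex ...]
print_detach_cmds() {
    if [ $# -lt 2 ]; then
        echo "print_detach_cmds: need a disk group and at least one plex" >&2
        return 1
    fi
    dg=$1
    shift
    for plex in "$@"; do
        echo "vxplex -g $dg det $plex"
    done
}
```

For example, print_detach_cmds diskgroup ${PLLIST} lists one command per plex, and returns a nonzero status if PLLIST is empty.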

Spin down the disks in the tray:
# ssaadm stop -t tray controller

Replace the faulty disk.

Spin up the drives:
# ssaadm start -t tray controller

Initialize the replacement disk:
# vxdisksetup -i devicename

Scan the current disk configuration again:

Enter the following commands on both nodes in the cluster:
# vxdctl enable
# vxdisk -a online

Add the new disk to the disk group:

# vxdg -g diskgroup -k adddisk device_media_name=device_name

Resynchronize the volumes:
# vxrecover -b -o iosize=192K
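
As a review aid, the commands from the spin-down step onward can be printed in order with the site-specific values filled in. This is a sketch only, assuming a POSIX-style shell: the function name print_replace_plan and its argument order are hypothetical, and it prints the commands from this procedure without executing any of them.

```shell
# Hypothetical helper: print the command sequence from this procedure
# with the given values substituted. Nothing is executed.
print_replace_plan() {
    if [ $# -ne 5 ]; then
        echo "usage: print_replace_plan diskgroup tray controller device_name device_media_name" >&2
        return 1
    fi
    dg=$1 tray=$2 ctrl=$3 dev=$4 dm=$5
    cat <<EOF
ssaadm stop -t $tray $ctrl
# (replace the faulty disk now)
ssaadm start -t $tray $ctrl
vxdisksetup -i $dev
# (run the next two commands on both cluster nodes)
vxdctl enable
vxdisk -a online
vxdg -g $dg -k adddisk $dm=$dev
vxrecover -b -o iosize=192K
EOF
}
```

A call such as print_replace_plan diskgroup 2 c3 c3t2d0 disk01 (all values here are illustrative) prints the nine lines for review before the commands are entered by hand.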