The basic recovery actions required for CVM are very similar to the recovery actions required for SSVM under identical circumstances. However, when running CVM in clustered mode, additional steps are required and certain restrictions apply. The most significant difference is that recovery of shared disk groups must be initiated from the master node.
There are multiple ways to monitor the status of volume manager objects. The most frequently used mechanisms are the vxprint command, the vxva GUI, and the vxnotify command. Most error conditions generate console messages or log messages to the system messages file. CVM automatically recovers from some events (for example, the failure of a cluster node) through the Sun Cluster framework; other events require system administrator intervention. Recovery is done by using one or more of the following utilities: vxva, vxdiskadm, vxreattach, vxrecover, and/or vxvol. Some of these utilities internally call other utilities such as vxdg, vxdisk, vxplex, and so on. To understand what is needed to recover from a particular situation, you must have a solid understanding of the volume manager utilities. For volume recovery and/or disk replacement procedures, refer to the Sun StorEdge Volume Manager 2.6 User's Guide and the Sun StorEdge Volume Manager 2.6 System Administrator's Guide.
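For example, a minimal monitoring sketch (assuming a shared disk group named test; the exact options available in SSVM 2.6 may differ) could combine vxprint for a point-in-time view with vxnotify for ongoing event watching:

# Point-in-time view of all objects in the shared disk group "test"
vxprint -ht -g test

# Print volume manager configuration and failure events as they occur
# (runs until interrupted)
vxnotify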
The following section describes the process of recovering from some of the most commonly encountered situations.
Failure of a disk, controller, or other storage device may make one or more devices inaccessible from one or more nodes. If a device was being accessed at the time of failure, that device is detached from the disk group. The data layout of a mirrored device should be such that no single failure can make all of its mirrors unavailable.
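As an illustration only, such a layout can be achieved by placing the plexes of a mirrored volume on disks attached to different controllers; the disk group, volume, and disk media names below are assumptions chosen to match the examples later in this section:

# Create a 64 MB mirrored volume whose two plexes are placed on disk1
# (attached through controller c4) and disk2 (attached through c5), so
# that no single controller failure can remove both mirrors.
vxassist -g test make test1 64m layout=mirror disk1 disk2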
The first step of recovery is to make the failed device(s) accessible again, which includes:
Replacing failed hardware components (if any)
Executing storage device specific recovery/startup actions (for example, using the Recovery Guru on a Sun StorEdge A3000 or the luxadm command on a Sun StorEdge A5000)
Updating the Solaris device tree (drvconfig/boot -r)
For the exact sequence of steps to perform, refer to the storage-specific administration manual.
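A minimal sketch of the device tree update on Solaris 2.6 (without a reboot) is shown below; treat the exact commands as an assumption and follow the storage-specific manual for the authoritative sequence:

# Rebuild the kernel device tree and regenerate the /dev/dsk and
# /dev/rdsk links (roughly equivalent to a reconfiguration boot, boot -r)
drvconfig
disks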
The volume manager must recognize that a device is accessible again. Usually, this is achieved by running vxdctl enable, after which CVM can perform the recovery actions involving the device. Devices can be reattached using vxreattach, vxdiskadm (option 5), or the vxva GUI. All of these utilities attach the disk using vxdg -k adddisk. Once the disk has been attached, the volumes must be recovered using vxrecover. The exact operations required to recover depend on the kstate (kernel state) and/or the state of the DM, volume, and plex objects. For an explanation of the state values, refer to the Sun StorEdge Volume Manager 2.6 System Administrator's Guide. A brief discussion of recovering from various states follows.
If you notice devices in the NODEVICE state, you must reattach them using vxreattach/vxdiskadm/vxva. vxreattach is convenient to use because it tries to figure out the disk media and device access names. However, if a disk was replaced, you must attach it using vxdg/vxdiskadm/vxva. When using vxva/vxdiskadm, you must specify which disk to use for the disk media. Disks that are in the REMOVED state must be attached by using vxdg/vxdiskadm/vxva.
If the replacement disk is not initialized, you must first initialize it.
A volume enters kstate ENABLED when it is started and becomes DISABLED when it is stopped (or as a result of critical errors that render it unusable). If one or more volumes are not in kstate ENABLED, they can be started by using vxvol/vxrecover/vxva. A volume may not start if no plexes are in the CLEAN or ACTIVE state; in this case, use vxmend to change the state of a selected plex to CLEAN or ACTIVE before starting the volume.
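A sketch of this manual intervention follows, assuming a shared disk group named test, a volume test1 that will not start, and a plex test1-01 known to hold valid data (all names are assumptions):

# Mark the plex that contains good data as CLEAN ...
vxmend -g test fix clean test1-01

# ... then start the volume (vxrecover -g test -s test1 would also work)
vxvol -g test start test1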
A volume may enter the NEEDSYNC state if one or more nodes leave the cluster abruptly. In this case, vxrecover is started by the cluster framework to perform the necessary synchronization. While a volume is being synchronized, it is in the SYNC state; it moves to the ACTIVE state once synchronization completes. If a process performing recovery is killed, the volume may not transition from SYNC to ACTIVE. In this case, it must be recovered using vxvol -f resync.
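A sketch of the manual resynchronization, assuming a shared disk group named test and a volume test1 stuck in the SYNC state (run it on the master node for a shared disk group):

# Force the resynchronization of the volume to be restarted
vxvol -g test -f resync test1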
Plexes that are associated with a volume but detached have a kstate of DISABLED. You can recover these plexes using vxrecover, which in turn calls vxplex att. The following procedure should enable you to recover from most common failures; a consolidated command sketch follows the list.
Rectify the fault condition (hardware and/or software) and make sure the devices are accessible again.
Run vxdctl enable on all nodes of the cluster.
Run vxreattach on the master node.
Run vxreattach on the other nodes that have non-shared disk groups.
Verify (by running vxprint) that the devices have been reattached. (Under certain circumstances, vxreattach may not reattach disks that were removed and/or replaced. These disks must be reattached manually using vxdg/vxdiskadm/vxva.)
Run vxrecover -sb on the master node.
Run vxrecover -g <dg> -sb on any other node that has a non-shared disk group.
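The consolidated sketch below illustrates the sequence; the non-shared disk group name localdg is an assumption, and the shared disk group commands must be run on the master node:

# 1. On every cluster node, make the volume manager rescan the devices
vxdctl enable

# 2. On the master node, reattach devices of the shared disk groups;
#    repeat on any node that imported a non-shared disk group
vxreattach

# 3. Verify that all disks have been reattached
vxprint -ht

# 4. Start and resynchronize volumes in the background
vxrecover -sb               # on the master node (shared disk groups)
vxrecover -g localdg -sb    # on the node that imported a non-shared disk group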
The following examples show some typical recovery situations. You start the recovery process by checking the operating mode of the cluster nodes; recovery must be performed on the master node (for a non-shared disk group, recovery must be performed on the node where the disk group was imported).
Root@capri:/# vxdctl -c mode
mode: enabled: cluster active - SLAVE
Root@palermo:/# vxdctl -c mode
mode: enabled: cluster active - MASTER
To check the available disk groups on both nodes, you can use vxdg list:
Root@capri:/# vxdg list
NAME         STATE            ID
rootdg       enabled          885258939.1025.capri
test         enabled,shared   885331519.1233.palermo
Root@palermo:/# vxdg list
NAME         STATE            ID
rootdg       enabled          885258917.1025.palermo
test         enabled,shared   885331519.1233.palermo
In this case there is one non-shared disk group (rootdg) and one shared disk group (test). The disk group ID of rootdg differs between the two hosts, even though the name is the same. Next, notice the state of the volume manager objects in the vxprint output. For each object, the KSTATE and STATE fields indicate whether or not recovery is warranted.
vxprint -g test
TY NAME          ASSOC       KSTATE    LENGTH   PLOFFS   STATE     TUTIL0   PUTIL0
dg test          test        -         -        -        -         -        -
dm disk1         c4t0d6s2    -         8379057  -        -         -        -
dm disk2         c5t0d6s2    -         8379057  -        -         -        -
v  test1         fsgen       ENABLED   131072   -        ACTIVE    -        -
pl test1-01      test1       ENABLED   132867   -        ACTIVE    -        -
sd c4t0d6s2-01   test1-01    ENABLED   132867   0        -         -        -
pl test1-02      test1       ENABLED   132867   -        ACTIVE    -        -
sd c5t0d6s2-01   test1-02    ENABLED   132867   0        -         -        -
v  test2         fsgen       ENABLED   131072   -        ACTIVE    -        -
pl test2-01      test2       ENABLED   132867   -        ACTIVE    -        -
sd c4t0d6s2-02   test2-01    ENABLED   132867   0        -         -        -
pl test2-02      test2       ENABLED   132867   -        ACTIVE    -        -
sd c5t0d6s2-02   test2-02    ENABLED   132867   0        -         -        -
If device c4t0d6s2 (or all devices under controller c4) becomes unavailable, the device is detached from the disk group. The following example shows how the vxprint output looks after fault injection.
vxprint -g test
TY NAME          ASSOC       KSTATE    LENGTH   PLOFFS   STATE     TUTIL0   PUTIL0
dg test          test        -         -        -        -         -        -
dm disk1         -           -         -        -        NODEVICE  -        -
dm disk2         c5t0d6s2    -         8379057  -        -         -        -
v  test1         fsgen       ENABLED   131072   -        ACTIVE    -        -
pl test1-01      test1       DISABLED  132867   -        NODEVICE  -        -
sd c4t0d6s2-01   test1-01    DISABLED  132867   0        NODEVICE  -        -
pl test1-02      test1       ENABLED   132867   -        ACTIVE    -        -
sd c5t0d6s2-01   test1-02    ENABLED   132867   0        -         -        -
v  test2         fsgen       ENABLED   131072   -        ACTIVE    -        -
pl test2-01      test2       DISABLED  132867   -        NODEVICE  -        -
sd c4t0d6s2-02   test2-01    DISABLED  132867   0        NODEVICE  -        -
pl test2-02      test2       ENABLED   132867   -        ACTIVE    -        -
sd c5t0d6s2-02   test2-02    ENABLED   132867   0        -         -        -
Notice that the state of the DM entry (disk1), as well as of the subdisk and plex using this disk, is NODEVICE. In this case, the device should be reattached when it becomes accessible again. The vxdisk list output shows the state of the disk. If the device state has changed, you should run vxdctl enable before you run vxdisk list.
vxdctl enable
vxdisk list | grep c[45]t0d6s2
c4t0d6s2     sliced    -         -        error    shared
c5t0d6s2     sliced    disk2     test     online   shared
-            -         disk1     test     failed   was:c4t0d6s2
Notice that c5t0d6s2 is online, but c4t0d6s2 is in an error state. If the device was accessible to some nodes but not others, the vxdisk list output might differ between nodes (nodes that can still access the device show it as online). Now you can rectify the fault condition (in this case, palermo lost connectivity to one of the SSAs; the connection was later restored). At this point, running vxreattach is enough to reattach the devices.
Next, you can run vxdctl enable and verify that the devices are now accessible (device state is online). The following examples show the use of the vxdisk and vxdg commands.
vxdctl enable
vxdisk -a online
vxdisk list | grep c[45]t0d6s2
c4t0d6s2     sliced    -         -        error    shared
c5t0d6s2     sliced    disk2     test     online   shared
-            -         disk1     test     failed   was:c4t0d6s2
The preceding listing shows that c4t0d6s2, which is no longer associated with any disk group, was associated with disk group test as DM disk1. You can reattach it with the command vxdg -g test -k adddisk disk1=c4t0d6s2 after you verify that disk1 is still disassociated and that c4t0d6s2 is the right disk (that is, it has not been swapped).
vxprint -d -g test -F "%name %nodarec %diskid"
disk1 on 882484157.1163.palermo
disk2 off 884294145.1517.palermo
The preceding listing shows the DM name to disk ID association. Since the nodarec attribute of disk1 is on, it is still disassociated. The disk with ID 882484157.1163.palermo used to be associated with it. If you did not physically replace or move the disk, this disk ID should correspond to c4t0d6s2. If it was replaced by a newly initialized disk, you may not find a matching disk ID. To verify the disk ID, you can run the command vxdisk -s list.
vxdisk -s list c4t0d6s2 c5t0d6s2
Disk:       c4t0d6s2
type:       sliced
flags:      online ready private autoconfig shared autoimport
diskid:     882484157.1163.palermo
dgname:     test
dgid:       885331519.1233.palermo
clusterid:  italia
Disk:       c5t0d6s2
type:       sliced
flags:      online ready private autoconfig shared autoimport imported
diskid:     884294145.1517.palermo
dgname:     test
dgid:       885331519.1233.palermo
clusterid:  italia
The preceding listing shows that the ID of disk c4t0d6s2 is 882484157.1163.palermo. Verifying the association this way is rather tedious. Fortunately, vxreattach (with the -c option) can show you the disk group and DM with which a disk should be reattached:
vxreattach -c c4t0d6s2
test disk1
You can now associate the disk using the command vxdg -g test -k adddisk disk1=c4t0d6s2. Under most circumstances, running vxreattach takes care of all the preceding steps (running vxdctl enable and reattaching devices with their respective disk groups). However, if a disk was removed administratively using the vxdiskadm command, or was physically replaced, it must be reattached using vxdg, vxdiskadm (option 5), or the vxva GUI.
vxreattach -bv
! vxdg -g 885331519.1233.palermo -k adddisk disk1=c4t0d6s2
You can verify the state change of the DM and plex entries using the vxprint command.
vxprint -g test
TY NAME          ASSOC       KSTATE    LENGTH   PLOFFS   STATE     TUTIL0   PUTIL0
dg test          test        -         -        -        -         -        -
dm disk1         c4t0d6s2    -         8379057  -        -         -        -
dm disk2         c5t0d6s2    -         8379057  -        -         -        -
v  test1         fsgen       ENABLED   131072   -        ACTIVE    -        -
pl test1-01      test1       DISABLED  132867   -        IOFAIL    -        -
sd c4t0d6s2-01   test1-01    ENABLED   132867   0        -         -        -
pl test1-02      test1       ENABLED   132867   -        ACTIVE    -        -
sd c5t0d6s2-01   test1-02    ENABLED   132867   0        -         -        -
v  test2         fsgen       ENABLED   131072   -        ACTIVE    -        -
pl test2-01      test2       DISABLED  132867   -        RECOVER   -        -
sd c4t0d6s2-02   test2-01    ENABLED   132867   0        -         -        -
pl test2-02      test2       ENABLED   132867   -        ACTIVE    -        -
sd c5t0d6s2-02   test2-02    ENABLED   132867   0        -         -        -
You can now recover the volume and plex by running vxrecover in the background (specifying the -rb options to vxreattach would have started this recovery automatically).
vxrecover -g test -vb
job 026404 dg test volume test1: reattach plex test1-01
ps -ef | grep plex
    root 26404 26403  1 13:58:04 ?        0:01 /usr/lib/vxvm/type/fsgen/vxplex -U fsgen -g 885331519.1233.palermo -- att test1
    root 26406   916  0 13:58:10 console  0:00 grep plex
Running in the background, vxrecover started vxplex to attach the plex to the volume (note the STALE state and the ATT value in the TUTIL0 field).
vxprint -g test
TY NAME          ASSOC       KSTATE    LENGTH   PLOFFS   STATE     TUTIL0   PUTIL0
dg test          test        -         -        -        -         -        -
dm disk1         c4t0d6s2    -         8379057  -        -         -        -
dm disk2         c5t0d6s2    -         8379057  -        -         -        -
v  test1         fsgen       ENABLED   131072   -        ACTIVE    ATT1     -
pl test1-01      test1       ENABLED   132867   -        STALE     ATT      -
sd c4t0d6s2-01   test1-01    ENABLED   132867   0        -         -        -
pl test1-02      test1       ENABLED   132867   -        ACTIVE    -        -
sd c5t0d6s2-01   test1-02    ENABLED   132867   0        -         -        -
v  test2         fsgen       ENABLED   131072   -        ACTIVE    -        -
pl test2-01      test2       DISABLED  132867   -        RECOVER   -        -
sd c4t0d6s2-02   test2-01    ENABLED   132867   0        -         -        -
pl test2-02      test2       ENABLED   132867   -        ACTIVE    -        -
sd c5t0d6s2-02   test2-02    ENABLED   132867   0        -         -        -
Root@palermo:/ #
job 026404 done status=0
job 026408 dg test volume test2: reattach plex test2-01
job 026408 done status=0
After the volumes have been recovered, check the state of the devices again (KSTATE should be ENABLED and STATE should be ACTIVE):
vxprint -g test
TY NAME          ASSOC       KSTATE    LENGTH   PLOFFS   STATE     TUTIL0   PUTIL0
dg test          test        -         -        -        -         -        -
dm disk1         c4t0d6s2    -         8379057  -        -         -        -
dm disk2         c5t0d6s2    -         8379057  -        -         -        -
v  test1         fsgen       ENABLED   131072   -        ACTIVE    -        -
pl test1-01      test1       ENABLED   132867   -        ACTIVE    -        -
sd c4t0d6s2-01   test1-01    ENABLED   132867   0        -         -        -
pl test1-02      test1       ENABLED   132867   -        ACTIVE    -        -
sd c5t0d6s2-01   test1-02    ENABLED   132867   0        -         -        -
v  test2         fsgen       ENABLED   131072   -        ACTIVE    -        -
pl test2-01      test2       ENABLED   132867   -        ACTIVE    -        -
sd c4t0d6s2-02   test2-01    ENABLED   132867   0        -         -        -
pl test2-02      test2       ENABLED   132867   -        ACTIVE    -        -
sd c5t0d6s2-02   test2-02    ENABLED   132867   0        -         -        -
Now the recovery is complete. If the fault had occurred on the slave node rather than the master node, the behavior might vary slightly. Following fault injection, the vxprint output is similar to the following listing:
vxprint -g test
TY NAME          ASSOC       KSTATE    LENGTH   PLOFFS   STATE     TUTIL0   PUTIL0
dg test          test        -         -        -        -         -        -
dm disk1         c4t0d6s2    -         8379057  -        -         -        -
dm disk2         c5t0d6s2    -         8379057  -        -         -        -
v  test1         fsgen       ENABLED   131072   -        ACTIVE    -        -
pl test1-01      test1       DETACHED  132867   -        IOFAIL    -        -
sd c4t0d6s2-01   test1-01    ENABLED   132867   0        -         -        -
pl test1-02      test1       ENABLED   132867   -        ACTIVE    -        -
sd c5t0d6s2-01   test1-02    ENABLED   132867   0        -         -        -
v  test2         fsgen       ENABLED   131072   -        ACTIVE    -        -
pl test2-01      test2       ENABLED   132867   -        ACTIVE    -        -
sd c4t0d6s2-02   test2-01    ENABLED   132867   0        -         -        -
pl test2-02      test2       ENABLED   132867   -        ACTIVE    -        -
sd c5t0d6s2-02   test2-02    ENABLED   132867   0        -         -        -
Since the devices are not detached, running vxrecover on the master node after the slave node can access the disk again is sufficient. However, if the disk is removed administratively, it must be added using vxdg/vxdiskadm/vxva (vxreattach does not work) and then recovered by using vxrecover.
vxdg -k rmdisk disk1
vxprint -g test
TY NAME          ASSOC       KSTATE    LENGTH   PLOFFS   STATE     TUTIL0   PUTIL0
dg test          test        -         -        -        -         -        -
dm disk1         -           -         -        -        REMOVED   -        -
dm disk2         c5t0d6s2    -         8379057  -        -         -        -
v  test1         fsgen       ENABLED   131072   -        ACTIVE    -        -
pl test1-01      test1       DISABLED  132867   -        REMOVED   -        -
sd c4t0d6s2-01   test1-01    DISABLED  132867   0        REMOVED   -        -
pl test1-02      test1       ENABLED   132867   -        ACTIVE    -        -
sd c5t0d6s2-01   test1-02    ENABLED   132867   0        -         -        -
v  test2         fsgen       ENABLED   131072   -        ACTIVE    -        -
pl test2-01      test2       DISABLED  132867   -        REMOVED   -        -
sd c4t0d6s2-02   test2-01    DISABLED  132867   0        REMOVED   -        -
pl test2-02      test2       ENABLED   132867   -        ACTIVE    -        -
sd c5t0d6s2-02   test2-02    ENABLED   132867   0        -         -        -
Note that, because the disk was removed administratively, vxreattach does not report it as reattachable. However, you can reattach the disk manually as follows:
vxdg -g test -k adddisk disk1=c4t0d6s2
vxprint -g test
TY NAME          ASSOC       KSTATE    LENGTH   PLOFFS   STATE     TUTIL0   PUTIL0
dg test          test        -         -        -        -         -        -
dm disk1         c4t0d6s2    -         8379057  -        -         -        -
dm disk2         c5t0d6s2    -         8379057  -        -         -        -
v  test1         fsgen       ENABLED   131072   -        ACTIVE    -        -
pl test1-01      test1       DISABLED  132867   -        RECOVER   -        -
sd c4t0d6s2-01   test1-01    ENABLED   132867   0        -         -        -
pl test1-02      test1       ENABLED   132867   -        ACTIVE    -        -
sd c5t0d6s2-01   test1-02    ENABLED   132867   0        -         -        -
v  test2         fsgen       ENABLED   131072   -        ACTIVE    -        -
pl test2-01      test2       DISABLED  132867   -        RECOVER   -        -
sd c4t0d6s2-02   test2-01    ENABLED   132867   0        -         -        -
pl test2-02      test2       ENABLED   132867   -        ACTIVE    -        -
sd c5t0d6s2-02   test2-02    ENABLED   132867   0        -         -        -
vxrecover -v -g test
job 026416 dg test volume test1: reattach plex test1-01
waiting...
job 026416 done status=0
job 026417 dg test volume test2: reattach plex test2-01
waiting...
job 026417 done status=0
The following example shows a recovery (resynchronization) in progress; in this case, the vxrecover processes were started automatically by the cluster framework:
# ps -ef | grep vx
    root 21935     1  1 20:10:31 ?        5:36 vxconfigd
    root 29295     1  0 14:29:11 ?        0:00 /usr/sbin/vxrecover -c -v -s
    root 29349     1  0 14:29:13 ?        0:00 /usr/sbin/vxrecover -c -v -s
    root 29399 29295  0 14:29:14 ?        0:00 /usr/lib/vxvm/type/fsgen/vxvol -U fsgen -g 885331519.1233.palermo -- resync tes
    root 29507 29399  0 14:29:16 ?        0:00 /usr/lib/vxvm/type/fsgen/vxvol -U fsgen -g 885331519.1233.palermo -- resync tes
    root 29508 29349  0 14:29:17 ?        0:00 /usr/lib/vxvm/type/fsgen/vxvol -U fsgen -g 885331519.1233.palermo -- resync tes
    root 29509 29508  0 14:29:17 ?        0:00 /usr/lib/vxvm/type/fsgen/vxvol -U fsgen -g 885331519.1233.palermo -- resync tes
    root 29511   916  0 14:29:21 console  0:00 grep vx
vxprint -g test
TY NAME          ASSOC       KSTATE    LENGTH   PLOFFS   STATE     TUTIL0   PUTIL0
dg test          test        -         -        -        -         -        -
dm disk1         c4t0d6s2    -         8379057  -        -         -        -
dm disk2         c5t0d6s2    -         8379057  -        -         -        -
v  test1         fsgen       ENABLED   131072   -        SYNC      -        -
pl test1-01      test1       ENABLED   132867   -        ACTIVE    -        -
sd c4t0d6s2-01   test1-01    ENABLED   132867   0        -         -        -
pl test1-02      test1       ENABLED   132867   -        ACTIVE    -        -
sd c5t0d6s2-01   test1-02    ENABLED   132867   0        -         -        -
v  test2         fsgen       ENABLED   131072   -        SYNC      -        -
pl test2-01      test2       ENABLED   132867   -        ACTIVE    -        -
sd c4t0d6s2-02   test2-01    ENABLED   132867   0        -         -        -
pl test2-02      test2       ENABLED   132867   -        ACTIVE    -        -
sd c5t0d6s2-02   test2-02    ENABLED   132867   0        -         -        -
Notice that the state of the volumes is now SYNC. Their state will change to ACTIVE after the vxvol resync operations complete.
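If you want to watch for the transition from SYNC to ACTIVE, a simple vxprint-based check can be repeated until the volumes report ACTIVE; the format string below follows the style used earlier in this section and is an assumption for SSVM 2.6:

# Print the name and state of each volume in the disk group
vxprint -g test -v -F "%name %state"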