C H A P T E R  4

Exporting, Importing, and Joining Shadows in a Sun Cluster OE


Overview

The Point-in-Time Copy software allows an independent shadow volume on a multi-ported storage device to be exported to a secondary node within a Sun Cluster Operating Environment (OE) while remaining under Point-in-Time Copy software control. This capability of Exporting, Importing, and Joining a Point-in-Time Copy shadow set enables shadow volume processing by associated applications to be off-loaded to a secondary node within a Sun Cluster OE without affecting the primary node's master volume or its associated applications. Because the Point-in-Time Copy software retains control of the shadow volume while it is imported on a secondary node within a Sun Cluster OE, point-in-time consistency is retained, facilitating fast resynchronization via update processing at a later time.

Prior versions of the Point-in-Time Copy software did not support the Export, Import, and Join (E/I/J) functionality in a Sun Cluster OE. With the current version of the Availability Suite product, the Point-in-Time Copy software supports Export, Import, and Join processing of Point-in-Time Copy shadow volumes for hosts running the Sun Cluster 3.1 10/03 OE.

This chapter describes the proper use, configuration, and control of the Export, Import, and Join shadow volume feature of the Sun StorageTek Availability Suite Point-in-Time Copy software in a Sun Cluster 3.1 10/03 OE.

The master and bitmap volumes of a Point-in-Time Copy set can reside on the primary Sun Cluster node, while the shadow volume and optional bitmap(2) volume are exported to the secondary Sun Cluster node. Once on the secondary node, the shadow volume and bitmap(2) volume can be imported and used for off-host read/write data processing without impacting the performance of the primary node, the master volume, or the Point-in-Time Copy set. Once secondary node processing is complete, the shadow volume and bitmap(2) volume can be moved back to the primary node and joined with the master volume, restoring the Point-in-Time Copy set and its state, as though the shadow volume had never been exported.


Requirements

For the Export, Import, and Join functionality to work correctly, the shadow volume must be on a different global device or volume manager controlled device group than its associated master and bitmap volumes. This allows the shadow volume's device group to be switched between various nodes in a Sun Cluster and to be used as an exportable shadow volume.
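This requirement can be checked mechanically. The sketch below is hypothetical (the md_group helper name is illustrative; the volume paths follow the SVM /dev/md/<diskset>/rdsk/<volume> naming used later in this chapter); it extracts the disk set portion of an SVM volume path and verifies that the master and shadow volumes are in different device groups:

```shell
# Extract the SVM disk set (device group) name from a volume path,
# e.g. /dev/md/oracle/rdsk/d1 -> "oracle".
md_group() {
  basename "$(dirname "$(dirname "$1")")"
}

master=/dev/md/oracle/rdsk/d1
shadow=/dev/md/backup/rdsk/d1

# The exportable shadow must not share the master's device group.
if [ "$(md_group "$master")" = "$(md_group "$shadow")" ]; then
  echo "ERROR: shadow volume shares the master's device group" >&2
else
  echo "OK: shadow volume is in its own device group"
fi
```

The same check applies to a bitmap volume, which must share the master's device group rather than the shadow's.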


Export, Import, and Join Functionality

The Export, Import, and Join functionality of the Point-in-Time Copy software allows a previously configured shadow volume, contained on a dual-ported or Sun StorageTek SAN Foundation (SAN) accessible storage device, to be exported from a configured Point-in-Time Copy volume set. Within the Sun Cluster OE, this shadow volume can reside on a Sun Cluster global device or on a volume controlled by either of the two supported Sun Cluster volume managers, Solaris Volume Manager (SVM) or VERITAS Volume Manager (VxVM).



Note - Sun Cluster DID devices are not supported as a master, shadow, or bitmap volume, due to disk data fencing functionality when Sun Cluster failure events are active. Sun Cluster Global devices that are symmetrical in name to the DID devices are supported.



When a Point-in-Time Copy volume set is initially configured such that the master and bitmap volumes are in one disk group and the shadow volume is in another disk group, the Point-in-Time Copy Export functionality allows for an independent shadow volume (once the Point-in-Time Copy becomes fully independent) to be exported from the Point-in-Time Copy volume set.

Once exported from the Point-in-Time Copy volume set, this shadow volume can be accessed in a read-only manner on any Sun Cluster node without impacting the master volume. If the shadow volume needs to be accessed in a read-write manner, then the Point-in-Time Copy Import feature can be utilized to provide both read and write access to the shadow volume while retaining point-in-time consistency.

Once shadow volume processing is no longer required on the secondary Sun Cluster node, the shadow volume is disabled if it was being used in the imported state. The shadow volume is then switched back to the Sun Cluster node containing the original Point-in-Time Copy volume set's master and bitmap volumes. Using the Point-in-Time Copy Join command, the shadow volume and the secondary bitmap volume are reassociated with the original master and bitmap volumes, restoring the Point-in-Time Copy volume set. Upon completion of these operations, the Point-in-Time Copy volume set exists as though the shadow volume was never exported in the first place.

The Export, Import, and Join functionality retains Point-in-Time Copy information across the entire process of moving the shadow volume from one Sun Cluster node to another and back again. Through a secondary bitmap volume on the secondary Sun Cluster node, used in combination with the Import feature, writes occurring on the secondary node are tracked. This tracking information is then reflected back into the original Point-in-Time Copy set when a Join operation reassociates the shadow volume with its original set. While the shadow volume is exported from the Point-in-Time Copy set, write operations to the master volume are still tracked in the bitmap volume on the primary Sun Cluster node. The Join operation merges the write tracking for both the master and shadow volumes, retaining a consistent Point-in-Time Copy set.
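The merge performed by the Join operation can be pictured as a logical OR of the two tracking bitmaps: a block needs resynchronization if it changed on either the master (primary bitmap) or the imported shadow (secondary bitmap). The sketch below is a conceptual illustration only, not the software's actual on-disk bitmap format; bitmaps are modeled as strings of 0/1 flags, one flag per tracked block:

```shell
# Merge two tracking bitmaps (strings of 0/1 flags) with a logical OR,
# as the Join operation conceptually does with the primary and
# secondary bitmaps. A "1" means the corresponding block changed.
merge_bitmaps() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = length(a); out = ""
    for (i = 1; i <= n; i++)
      out = out ((substr(a, i, 1) == "1" || substr(b, i, 1) == "1") ? "1" : "0")
    print out
  }'
}

# Master-side changes 0101, shadow-side changes 0011: the joined set
# must resynchronize every block marked in either bitmap.
merge_bitmaps 0101 0011
```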

In summary, changes to the Availability Suite's Point-in-Time Copy software, along with new Sun Cluster configuration guidelines for configuring an exportable Point-in-Time Copy volume set, provide high availability (HA) for the Point-in-Time Copy volume set and allow the volume set to retain these HA characteristics while Export, Import, and Join processing is being utilized.


Point-in-Time Copy Set in a Sun Cluster OE

The master volume of an independent Point-in-Time Copy set can be located on a Sun Cluster controlled device: either a raw global device (for example, /dev/global/rdsk/d4s0) or a volume controlled by SVM (for example, /dev/md/AVsuite/rdsk/d5) or VxVM (for example, /dev/vx/rdsk/AVsuite/m1). The shadow volume of this set can be on the same type or a different Sun Cluster controlled device type, as long as it is in its own device group. Furthermore, when the master and shadow volumes are in different Sun Cluster device groups, the Export, Import, and Join functionality, together with the Sun Cluster device group and resource group functionality, allows the shadow volume of a Point-in-Time Copy set to be relocated to different nodes of a Sun Cluster OE.

While exported from the Point-in-Time Copy set, the shadow volume is disabled from the highly available resource group in which the Point-in-Time Copy set is configured. Once the shadow volume is no longer needed as an exported shadow volume, it can be joined with the Point-in-Time Copy set and enabled under the set's highly available resource group.

A new feature of the Point-in-Time Copy software is an automatic implicit Join operation, which applies when an exportable shadow volume is currently in the imported state on a Sun Cluster node. If a Sun Cluster voluntary or involuntary failover event selects the node where the imported shadow volume is currently enabled, the software detects this fact and automatically rejoins the imported shadow volume to the Point-in-Time Copy set. This behavior retains the high availability of the Sun Cluster configured resource group while retaining control of the shadow volume's data.


Point-in-Time Copy Sets

The restriction that all constituent volumes of a Point-in-Time Copy set must be in the same device group does not apply to the shadow volume.

Since an exportable shadow volume must also be an independent shadow volume, the exportable shadow volume must be the same size as (or larger than) the master volume with which it is associated. If the exportable shadow volume is to be used in read/write mode on another node in the Sun Cluster, it is advisable that the master and bitmap volumes be configured in one Sun Cluster device group and that the shadow volume and secondary bitmap volume be configured in a different Sun Cluster device group. Also ensure that the secondary bitmap volume is the same size as (or larger than) the original bitmap volume.
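The two sizing rules above (shadow at least as large as the master, secondary bitmap at least as large as the original bitmap) amount to a simple comparison. The helper below is a hypothetical sketch that, for illustration, reads sizes from regular files standing in for volumes; on a live system the sizes would come from the volume manager or the volume's VTOC:

```shell
# Size in bytes of a regular file standing in for a volume.
vol_size() {
  wc -c < "$1" | tr -d ' '
}

# Succeed only if the candidate volume is at least as large as the
# volume it must cover (shadow >= master, secondary bitmap >= bitmap).
check_min_size() {
  required=$1 candidate=$2
  [ "$(vol_size "$candidate")" -ge "$(vol_size "$required")" ]
}
```

For example, `check_min_size $master $shadow && check_min_size $bitmap $bitmap2` would confirm both rules before the set is enabled.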

When configuring Availability Suite volumes on Sun Cluster global devices (/dev/global/rdsk/dNsN), the device group associated with each global device is the dsk/dN part. Therefore, when configuring a Point-in-Time Copy volume set, the master and associated bitmap volume must be on one global device, and the shadow volume and secondary bitmap should be on another.
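As a sketch of this naming rule, the hypothetical helper below derives the dsk/dN device group from a raw global device path (the global_group name is illustrative, not an Availability Suite or Sun Cluster command):

```shell
# Derive the Sun Cluster device group (the dsk/dN part) from a raw
# global device path, e.g. /dev/global/rdsk/d4s0 -> "dsk/d4".
global_group() {
  echo "$1" | sed -n 's|^/dev/global/rdsk/\(d[0-9][0-9]*\)s[0-9][0-9]*$|dsk/\1|p'
}

global_group /dev/global/rdsk/d4s0
```

Two volumes whose paths yield the same dsk/dN string are in the same device group, so a master on d4 and a shadow on d4 would violate the layout described above.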

Due to the "global nature" of Sun Cluster global devices, it is advisable that the master and shadow volumes be on different global devices, so that off-host processing of the shadow volume avoids initiating I/O over the Sun Cluster private interconnect. Once the shadow volume is exported from the Point-in-Time Copy set, it can be switched to the Sun Cluster node where off-host processing will occur. In doing so, I/O to the shadow volume will not impact the Sun Cluster private interconnect.

Prior to configuring a Point-in-Time Copy set with an exportable shadow volume in a Sun Cluster OE, make sure that the device groups of both the master and bitmap volumes and of the shadow volume are highly available. Failure to do so prevents the Point-in-Time Copy set from being highly available.


Configuring a Point-in-Time Copy Set in a Sun Cluster OE

The steps to create a highly available Point-in-Time Copy volume set are listed in the following section.

There is no required naming convention for the RGM resource groups or resource types, although a planned, well-thought-out naming scheme, spanning the volume manager (if one is used) and the Sun Cluster resource groups and types, will be beneficial later if troubleshooting is required.

The setup creates a Point-in-Time Copy volume set on Sun Cluster nodes node-A and node-B, with the exportable shadow volume available on node-C.


To Configure a Point-in-Time Copy Set in a Sun Cluster OE



Note - This example uses two SVM device groups: "oracle" and "backup", where "oracle" is the master volume's device group and "backup" is the exportable shadow volume's device group.



This example is based on configuring the following Point-in-Time Copy set:


# iiadm -ne ind /dev/md/oracle/rdsk/d1 /dev/md/backup/rdsk/d1 \
/dev/md/oracle/rdsk/d2

Do not invoke the above, or a similar, iiadm command at this point in the following sequence of steps. The device attributes of the constituent volumes of an enabled Point-in-Time Copy set are such that, without Sun Cluster Resource Group Manager (RGM) control, associating a master and bitmap volume with an exportable shadow volume makes all of the associated device groups no longer highly available. This could impact the high availability of the Sun Cluster as it pertains to these volumes.

5. Create a resource group that will contain the HAStoragePlus resource type associated with the Point-in-Time Copy set.

This resource group should specify two or more nodes within the Sun Cluster, or be left blank if all nodes in the Sun Cluster are capable of supporting the Point-in-Time Copy set as a highly available resource.


# scrgadm -a -g Availability_Suite_RG -h node-A,node-B[,node-C,...]

Or, for all nodes in the Sun Cluster, as long as the Availability Suite software has been installed and configured on each:


# scrgadm -a -g Availability_Suite_RG



Note - Additional resource types for other Sun Cluster HA data services or applications may be added to this same resource group at your discretion.



6. Ensure that the SUNW.HAStoragePlus resource type is registered. If it is not, register it.


# scrgadm -p | grep "Res Type name:" | grep HAStoragePlus
# scrgadm -a -t SUNW.HAStoragePlus

7. Add an HAStoragePlus resource type to the previously created resource group.

The HAStoragePlus resource type is used to specify two Sun Cluster device groups: one representing the master-bitmap volume pair, and one for the exportable shadow volume. The ordering of these device groups is important; the exportable shadow volume must be the last one specified.



Note - The HAStoragePlus resource type allows for its GlobalDevicePaths parameter to be either the full device path specification of a Sun Cluster device or the name of a Sun Cluster device group. The first format is used in this example, so that no doubt exists as to which devices are being associated. In doing so, the example will be setting a GlobalDevicePath with both a master and bitmap volume, each of which is in the same device group. As such, one of the device path specifications is redundant, and will be ignored.




# scrgadm -a -g Availability_Suite_RG -j \
Availability_Suite_RES -t SUNW.HAStoragePlus -x \
GlobalDevicePaths=/dev/md/oracle/rdsk/d1,\
/dev/md/oracle/rdsk/d2,/dev/md/backup/rdsk/d1 -x \
AffinityOn=False

The Sun Cluster resource type SUNW.HAStoragePlus supports a configuration option, AffinityOn, whose default value is True. This setting, along with the fact that the GlobalDevicePaths qualifier contains two device groups, one of them holding the exportable shadow volume, implies that if the exportable shadow volume is currently in use on a secondary Sun Cluster node, this resource group will have a strong affinity to move to that node, regardless of the node list specified in step 5 above.

For example, consider a three (or more) node Sun Cluster configuration in which two nodes have the system resources to support an HA enterprise class application like ORACLE® and the third Sun Cluster node is a low-end backup system. If the exportable shadow volume is in use on this third system, the HA application will move to the third Sun Cluster node even if that node lacks the system resources needed to support its execution. This is the justification for setting AffinityOn=False in the example above.

8. Bring the resource group online. Then, verify that the resource group is located on the Sun Cluster node where the Point-in-Time Copy enable command will be invoked.


# scswitch -Z -g Availability_Suite_RG
# scswitch -z -g Availability_Suite_RG -h node-A

9. Enable the Point-in-Time Copy set using the new option -n to enable exportable shadows.


# iiadm -ne ind /dev/md/oracle/rdsk/d1 /dev/md/backup/rdsk/d1 \
/dev/md/oracle/rdsk/d2

10. Validate that the Point-in-Time Copy set is available on this node.


# iiadm -i /dev/md/backup/rdsk/d1
# scstat -g
# scstat -D

11. Switch the resource group from this node to each of the other configured nodes, and validate the set.


# scswitch -z -g Availability_Suite_RG -h node-B
# telnet node-B
<login to root account>
# iiadm -i /dev/md/backup/rdsk/d1
# scstat -g
# scstat -D
# ^D {logout}

12. This Point-in-Time Copy volume set is now highly available and usable as a resource group to which other highly available applications (HA-NFS, HA-ORACLE, and so forth) can be added.

For example:


# scrgadm -a -g Availability_Suite_RG -j nfs_res -t SUNW.nfs
# scswitch -e -j nfs_res

To use the Point-in-Time shadow volume on another node within the Sun Cluster, it must be exported from its associated set and disabled as a device path within its HAStoragePlus resource type.

13. Confirm that the II set is fully independent.

Prior to being exported from a Point-in-Time Copy set, the II set must be fully independent. This is confirmed when the wait command (iiadm -w) returns.


# iiadm -w /dev/md/backup/rdsk/d1

14. Export the II shadow volume from its associated Point-in-Time Copy set.


# iiadm -E /dev/md/backup/rdsk/d1
# iiadm -i /dev/md/backup/rdsk/d1

15. The Point-in-Time exportable shadow volume can be switched to another node in the Sun Cluster.


# scswitch -z -D backup -h node-C

Or the Point-in-Time Copy set can be switched to another node in the Sun Cluster.


# scswitch -z -g Availability_Suite_RG -h node-C

16. Validate the correct behavior.


# telnet node-C
<login to root account>
# iiadm -i /dev/md/backup/rdsk/d1
# scstat -g
# scstat -D

The Point-in-Time Shadow volume is now accessible independently from the Point-in-Time Copy set, off-host, yet the original Point-in-Time set is still active on the other Sun Cluster node.

If the shadow volume will be accessed in read/write mode, a secondary bitmap should be used to import the shadow locally on this node, so that subsequent fast resynchronization operations (iiadm -u) can be performed instead of a full synchronization (iiadm -c).



Note - The exportable shadow MUST be enabled with the -C local tag, so that the system can differentiate between the highly available Point-in-Time Copy set and the locally accessible exportable shadow, each of which has the exact same name.




# iiadm -C local -I /dev/md/backup/rdsk/d1 /dev/md/backup/rdsk/d2
# iiadm -i /dev/md/backup/rdsk/d1



Note - From this node, you will see the imported shadow volume, and the shadow volume's Point-in-Time Copy volume set will appear as suspended on this node and active on node-A (or B).



17. While this imported shadow volume is active on this node, repeat the validation tests of steps 10 and 11 to confirm that the original Point-in-Time Copy set is still highly available.

Remember that the original Point-in-Time Copy set is not configured to be highly available on node-C; an attempt to make it so will fail as long as the shadow volume is imported on this node.

18. When you are done using the imported shadow volume on this node (if you chose to import it), disable the locally accessible imported shadow volume, switch it back to the node where the Point-in-Time Copy volume set is active, and re-enable the resource in the resource group.


# iiadm -C local -d /dev/md/backup/rdsk/d1

19. Take the resource offline and back online, forcing the exportable shadow volume back to the Sun Cluster node where the rest of the Point-in-Time Copy set is enabled.


# scswitch -n -j Availability_Suite_RES
# scswitch -e -j Availability_Suite_RES



Note - From this node, you will still see the shadow volume's Point-in-Time Copy volume set as suspended on this node and active on node-A (or B). The imported shadow volume is no longer listed.




# iiadm -i /dev/md/backup/rdsk/d1
# ^D {logout, back to node-A }

20. Join the shadow volume (with possible changes) back with the original Point-in-Time Copy set.


# iiadm -J /dev/md/backup/rdsk/d1 /dev/md/backup/rdsk/d2

The Point-in-Time Copy set is back in its original state, as though the shadow volume had never been exported.


Point-in-Time Copy Set Considerations in a Sun Cluster OE

Redundancy

To provide high availability to the data contained in a Point-in-Time Copy set when using global devices, it is assumed that the master, shadow, and bitmap volumes are on redundant storage, since there is currently no means to provide host-based data service redundancy on raw global devices. If controller-based redundancy is not available, a Sun Cluster supported volume manager must be used. For performance reasons, it is recommended in all scenarios that bitmap volumes NOT be placed on RAID-5 volumes (either host-based or controller-based), since bitmap I/O processing within a Point-in-Time Copy volume set can be I/O intensive.

Implicit Join Operation

The implicit join operation is a new feature of the Availability Suite software when using the Export, Import, and Join functionality in a Sun Cluster OE. By design, a Point-in-Time Copy set with its shadow volume exported cannot coexist on the same node as the Point-in-Time Copy set containing the imported shadow volume.

From an operational point of view, having the master volume and shadow volume on two different Sun Cluster nodes is the sole reason for using the Export, Import, and Join functionality in a Sun Cluster OE. If both the master and shadow volumes were wanted on the same node, there would be no need to use Export, Import, and Join in the first place.

If a Sun Cluster voluntary or involuntary failover event moves the master and associated bitmap volume to the Sun Cluster node containing the imported shadow volume, the design constraint mentioned earlier would prevent the failover from completing successfully. To address this concern, the Availability Suite software detects this failover condition and performs an implicit join operation, merging the two Point-in-Time Copy sets back into one. This operation has no impact on the master or shadow volume other than the fact that both volumes are now in the same Point-in-Time Copy set on the same node in the Sun Cluster.

Incomplete Export, Import, and Join Sequence

The operational procedures for Export, Import, and Join assume that all three steps will be performed. Due to system availability circumstances outside of the Availability Suite software's control, it may not be possible to perform the Import step immediately after exporting the shadow volume from the set, yet it may still be desirable to join the exported shadow volume back into the Point-in-Time Copy set. In other words, it is sometimes necessary to perform an Export and Join sequence with no Import step.

To perform a join operation there is still the requirement for a secondary bitmap volume, but because the secondary bitmap volume was NOT used during a recent import operation, it contains stale or uninitialized data. Prior to performing the join, you must copy (using the Solaris dd utility) the current bitmap volume over the contents of the secondary bitmap volume, so that the secondary bitmap volume's data is in a known state. Failure to perform this manual initialization step may cause the join operation to fail or, if stale data is used, may cause an inconsistency between what is actually on the shadow volume and the current state as recorded in the bitmap.
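A minimal sketch of this initialization step follows, wrapped in a hypothetical helper function so the copy logic can be exercised on stand-in files; the device paths in the usage comment are those of the example set used earlier in this chapter:

```shell
# Initialize the secondary bitmap from the current bitmap so that its
# contents are in a known state before the join operation.
init_secondary_bitmap() {
  current=$1 secondary=$2
  # Overwrite the stale or uninitialized secondary bitmap with the
  # current bitmap's contents.
  dd if="$current" of="$secondary" bs=128k 2>/dev/null
}

# For the example set in this chapter, the call would be:
#   init_secondary_bitmap /dev/md/oracle/rdsk/d2 /dev/md/backup/rdsk/d2
```

After this copy, the join can proceed as in the procedure above (iiadm -J), with the secondary bitmap reflecting the same tracking state as the current bitmap.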