As described in "Defining the Recovery Point Objective (RPO)," one of the keys to a successful DR solution is the ability to establish system checkpoints that ensure a consistent set of data can be used as a DR baseline.
For VSM environments, a valid DR baseline is where:
All business critical data is secured at the designated DR location.
A secure copy of the metadata (CDS, MSP Catalog, TMC) has been captured.
The metadata copy is guaranteed to be valid when a disaster is declared (real or test).
VTCS provides the capability to create a DR baseline through the following functions:
The DRMONitr utility monitors and ensures that critical DR data reaches its designated recovery location. It allows job stream processing to stall until the data reaches its destination. Once all data is accounted for, the utility ends. The DRMONitr utility can be run as a job step; when that step completes, all monitored data is guaranteed to have been accounted for and secured at the designated DR location.
The DRCHKPT utility ensures that data accessed through the CDS metadata remains valid for a set period. This guarantees that a CDS backup remains valid for that period, so you can restore a VSM system back to a DR baseline. The DRCHKPT utility sets a date/time stamp in the active CDS, which establishes the recovery point from which MVC content can be recovered. From this recovery point onward, data content is guaranteed for some time in the future. Without the DRCHKPT utility, a CDS backup cannot be used to restore back to a DR baseline, because elements in the CDS (such as the VTV position on an MVC) may no longer be valid.
For more information, see ELS Command, Control Statement, and Utility Reference.
Also note the following:
For VMVCs, MVCDRAIN with the EJECT parameter physically deletes the VTVs.
Caution:
If you use the DRCHKPT utility and/or the CONFIG GLOBAL PROTECT parameter to protect CDS backup content for VMVCs, specifying MVCDR EJECT invalidates the CDS backup's VMVC content.
For both VMVCs and MVCs, MVCDRAIN without the EJECT parameter does not delete the VTVs, but updates the CDS record to show no VTVs on the VMVC/MVC.
For more information, see ELS Command, Control Statement, and Utility Reference.
The following examples are discussed:
In this example:
The DRMONitr and DRCHKPT utilities ensure that the DR data has reached its recovery location and that the associated metadata (CDS backup) is available to retrieve the VTV data if necessary.
The local site has a VTSS plus an ACS (ACS 00), and the remote site has just an ACS (ACS 01), as shown in Figure 7-1.
The example is a simple DR strategy where on a daily basis, copies of critical data are secured on the remote site along with the metadata. The remote VTV copies are the designated DR copies.
After the production jobs have completed, a job is scheduled that:
Monitors the remote copies for completion (DRMONitr).
Checkpoints the CDS (DRCHKPT).
Takes a backup of the metadata (CDS, TMC, MSP catalog) and secures it at the remote site. Note that the metadata backups are key to the DR; it is assumed that they are taken to a "well known" location or that their location is recorded at a secure location.
This provides a daily synchronised DR checkpoint. If a DR is declared, the tape environment is restored to the checkpoint and jobs are re-run from this known state.
Figure 7-1 VSM System Recovery Points Example (Local and Remote)
To run this example using the configuration shown in Figure 7-1:
Create the following policy statements.
MGMT NAME(DR) MIGPOL(LOCAL,REMOTE) IMMDELAY(0)
STOR NAME(LOCAL) ACS(00)
STOR NAME(REMOTE) ACS(01)
Note:
For an effective DR environment, you may also want to consider using MIGRSEL and MIGRVTV statements, which you can use to ensure DR copies are secured as early as possible.
To ensure the critical data is secured in the remote location, the following example DRMONitr job step is run.
//MONITOR EXEC PGM=SLUADMIN,PARM='MIXED'
//STEPLIB DD DSN=hlq.SEALINK,DISP=SHR
//SYSIN DD UNIT=SYSDA,SPACE=(TRK,1)
//*
//SYSPRINT DD SYSOUT=*
//SLSPRINT DD SYSOUT=*
//SLSIN DD *
DRMON MGMT(DR) STOR(REMOTE) MAXAGE(24) TIMEOUT(30)
In this example, the DRMONitr utility waits until all VTV copies of management class DR that are less than 24 hours old are delivered to the remote ACS. The utility is set to abort if the run time (or wait time) exceeds 30 minutes.
Once all VTV copies have been delivered to the remote ACS, signaled by a return code of zero, the DRCHKPT utility runs to set the recovery point, as shown in the following example.
//CHKPT EXEC PGM=SLUADMIN,PARM='MIXED'
//STEPLIB DD DSN=hlq.SEALINK,DISP=SHR
//SYSPRINT DD SYSOUT=*
//SLSPRINT DD SYSOUT=*
//SLSIN DD *
DRCHKPT SET
In this example, the DRCHKPT utility sets a time stamp, or recovery point, in the active CDS. From this recovery point onward, MVC copy content is guaranteed for a period in the future (for example, until another DRCHKPT utility is run).
Once the recovery point is set in the active CDS, a CDS backup should be taken immediately as shown in the following example.
//CHKPT EXEC PGM=SLUADMIN,PARM='MIXED'
//STEPLIB DD DSN=hlq.SEALINK,DISP=SHR
//SYSIN DD UNIT=SYSDA,SPACE=(TRK,1)
//*
//SLSCNTL DD DSN=hlq.DBSEPRM,DISP=SHR
//SLSBKUP DD DSN=hlq.DBSEPRM.BKUP,DISP=SHR
//SYSPRINT DD SYSOUT=*
//SLSPRINT DD SYSOUT=*
//SLSIN DD *
BACKUP OPTION(COPY)
After the backup is taken, the MVC content, or metadata, is guaranteed to be valid for some period into the future (until a subsequent recovery point or checkpoint is set).
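The three steps above can also be chained in a single job so that each step runs only if the previous one ended with return code zero. The following is a minimal sketch, not a tested job stream: the JOB statement, the IF statement names (CHKMON, CHKCKP), and the step name BACKUP (used here so that each step name in the job is unique) are illustrative assumptions, and it relies on the standard JCL IF/THEN/ENDIF construct. The utility statements and DD statements are copied from the examples above.

```jcl
//DRBASE   JOB (ACCT),'DR BASELINE',CLASS=A,MSGCLASS=X
//* Hypothetical job card; adjust to site standards.
//*
//* Step 1: wait until the DR copies reach the remote ACS.
//MONITOR  EXEC PGM=SLUADMIN,PARM='MIXED'
//STEPLIB  DD DSN=hlq.SEALINK,DISP=SHR
//SYSIN    DD UNIT=SYSDA,SPACE=(TRK,1)
//SYSPRINT DD SYSOUT=*
//SLSPRINT DD SYSOUT=*
//SLSIN    DD *
DRMON MGMT(DR) STOR(REMOTE) MAXAGE(24) TIMEOUT(30)
//*
//* Step 2: set the recovery point only if MONITOR ended with RC 0.
//CHKMON   IF (MONITOR.RC = 0) THEN
//CHKPT    EXEC PGM=SLUADMIN,PARM='MIXED'
//STEPLIB  DD DSN=hlq.SEALINK,DISP=SHR
//SYSPRINT DD SYSOUT=*
//SLSPRINT DD SYSOUT=*
//SLSIN    DD *
DRCHKPT SET
//*
//* Step 3: back up the CDS only if CHKPT also ended with RC 0.
//CHKCKP   IF (CHKPT.RC = 0) THEN
//BACKUP   EXEC PGM=SLUADMIN,PARM='MIXED'
//STEPLIB  DD DSN=hlq.SEALINK,DISP=SHR
//SYSIN    DD UNIT=SYSDA,SPACE=(TRK,1)
//SLSCNTL  DD DSN=hlq.DBSEPRM,DISP=SHR
//SLSBKUP  DD DSN=hlq.DBSEPRM.BKUP,DISP=SHR
//SYSPRINT DD SYSOUT=*
//SLSPRINT DD SYSOUT=*
//SLSIN    DD *
BACKUP OPTION(COPY)
//         ENDIF
//         ENDIF
```

With this gating, a DRMONitr timeout or a failed checkpoint leaves the previous CDS backup in place as the recovery point rather than producing a backup that does not correspond to a completed checkpoint.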
This completes the procedure. If a DR is declared (the local production site is no longer available), then either:
The MVCs and all other critical data (metadata copies for example) are transported to another facility where a mirror of the production Local site is available.
or
A replica of the production Local site is constructed at the remote location.
The metadata is restored (CDS, TMC, MSP Catalog). On restarting the tape environment everything can proceed (roll forward) from the DR sync point.
In this example, the CDS is backed up every 24 hours. MVC content, or CDS metadata, within the CDS backup must remain valid until the subsequent CDS backup is taken.
This example shows MVC protection set to 28 hours. For more information on the CONFIG GLOBAL PROTECT parameter, see ELS Command, Control Statement, and Utility Reference.
Set CONFIG GLOBAL PROTECT = 28.
Day 1, back up the CDS.
Any MVCs drained/reclaimed after this backup cannot be overwritten for 28 hours.
Day 1 CDS backup is now the recovery point until the next CDS backup.
Day 2, back up the CDS.
Any MVCs drained/reclaimed after this backup cannot be overwritten for 28 hours.
Day 2 CDS backup is now the recovery point until the next CDS backup.
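The choice of 28 hours against a 24-hour backup cycle can be checked with a short worked inequality. The symbols are introduced here only for illustration and do not appear in the product documentation: t is the time of a CDS backup, B = 24 hours is the backup interval, P = 28 hours is the PROTECT value, and d ≥ t is the time at which an MVC is drained or reclaimed after that backup.

```latex
\[
  \underbrace{d + P}_{\text{earliest overwrite}}
  \;\ge\; t + P \;=\; t + 28
  \;>\; t + 24 \;=\; \underbrace{t + B}_{\text{next backup}}
\]
```

Under these assumptions, no MVC drained or reclaimed after a backup can be overwritten before the next backup takes over as the recovery point, and the margin P − B = 4 hours tolerates a next backup that runs up to four hours late.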