8 Creating System Recovery Points in VSM Environments

As described in "Defining the Recovery Point Objective (RPO)," one of the keys to a successful DR solution is the ability to establish system checkpoints that ensure a consistent set of data can be used as a DR baseline.

For VSM environments, a valid DR baseline is one in which:

  • All business critical data is secured at the designated DR location.

  • A secure copy of the metadata (CDS, MVS Catalog, TMC) has been captured.

  • The metadata copy is guaranteed to be valid when a disaster is declared (real or test).

VTCS provides the capability to create a DR baseline through the following functions:

  • The DRMONitr utility monitors critical DR data and ensures that it reaches its designated recovery location. It allows job stream processing to pause until the data reaches its destination. Once all data is accounted for, the utility ends. The DRMONitr utility can be run as a job step. When the job step completes, all monitored data is guaranteed to have been accounted for and secured at the designated DR location.

  • The DRCHKPT utility ensures that data accessed through the CDS metadata remains valid for a set period. This guarantees that a CDS backup remains valid for that period and, therefore, that you can restore a VSM system to a DR baseline. The DRCHKPT utility sets a date/time stamp in the active CDS, which establishes the recovery point from which MVC content can be recovered. From this recovery point onward, data content is guaranteed for a period of time. Without the DRCHKPT utility, a CDS backup cannot be used to restore to a DR baseline because elements in the CDS (such as the VTV position on an MVC) may no longer be valid.

For more information, see ELS Command, Control Statement, and Utility Reference.
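The relationship between a recovery point and a CDS backup can be pictured with a small Python sketch. It is an illustration only, not VTCS logic: the function name `backup_is_guaranteed` and the simplified rule it applies (a backup is covered if it was taken after the most recent checkpoint and no newer checkpoint has been set since) are assumptions made for this example.

```python
from datetime import datetime
from typing import Iterable


def backup_is_guaranteed(backup_taken_at: datetime,
                         checkpoints: Iterable[datetime],
                         now: datetime) -> bool:
    """Simplified model: a CDS backup is covered by a recovery point if it was
    taken at or after the latest checkpoint and no newer checkpoint exists."""
    set_so_far = [c for c in checkpoints if c <= now]
    if not set_so_far:
        # No recovery point has been set, so nothing guarantees the content.
        return False
    latest = max(set_so_far)
    # A checkpoint set after the backup ends the guarantee for that backup.
    return latest <= backup_taken_at <= now
```

In this model, a backup taken five minutes after a checkpoint stays covered until the next checkpoint is set, at which point a new backup is needed.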

Also note the following:

  • For VMVCs, MVCDRAIN with the EJECT parameter physically deletes the VTVs.

    Caution:

    If you use the DRCHKPT utility and/or the CONFIG GLOBAL PROTECT parameter to protect CDS backup content for VMVCs, specifying MVCDR EJECT invalidates the CDS backup's VMVC content.

  • For both VMVCs and MVCs, MVCDRAIN without the EJECT parameter does not delete the VTVs, but updates the CDS record to show no VTVs on the VMVC/MVC.

For more information, see ELS Command, Control Statement, and Utility Reference.

Checkpointing Examples

The following examples are discussed:

  • Example 1: Local MVC Copies and Remote MVC Copies

  • Example 2: Using CONFIG RECLAIM PROTECT

Example 1: Local MVC Copies and Remote MVC Copies

In this example:

  • The DRMONitr and DRCHKPT utilities ensure that the DR data has reached its recovery location and that the associated metadata (CDS backup) is available to retrieve the VTV data if necessary.

  • The local site has a VTSS plus an ACS (ACS 00), and the remote site has just an ACS (ACS 01), as shown in Figure 8-1.

This example is a simple DR strategy in which, on a daily basis, copies of critical data are secured at the remote site along with the metadata. The remote VTV copies are the designated DR copies.

After the production jobs have completed, a job is scheduled that:

  • Monitors the remote copies for completion (DRMONitr).

  • Checkpoints the CDS (DRCHKPT).

  • Takes a backup of the metadata (CDS, TMC, MVS Catalog) and secures it at the remote site. Note that the metadata backups are key to the DR. It is assumed that they are taken to a "well known" location or that their location is recorded in a secure place.

This provides a daily synchronised DR checkpoint. If a DR is declared, the tape environment is restored to the checkpoint and jobs are re-run from this known state.
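The ordering of the three steps matters: the checkpoint should only be set once the remote copies are confirmed, and the backup should only be taken once the checkpoint is set. The following Python sketch shows that gating logic in isolation. The function `run_checkpoint_job` and the stub steps are hypothetical names used for illustration; in a real job stream the same effect would normally come from JCL return-code testing rather than from a script.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], int]]  # (step name, action returning a return code)


def run_checkpoint_job(steps: List[Step]) -> Tuple[List[str], int]:
    """Run steps in order and stop at the first nonzero return code.

    Returns the names of the steps that completed and the final return code.
    """
    completed: List[str] = []
    for name, action in steps:
        rc = action()
        if rc != 0:
            # A failed step means later steps must not run: a checkpoint or
            # backup taken now would not describe a complete set of DR copies.
            return completed, rc
        completed.append(name)
    return completed, 0


# Stub actions standing in for the DRMONitr, DRCHKPT and backup steps.
daily_job: List[Step] = [
    ("MONITOR", lambda: 0),
    ("CHKPT", lambda: 0),
    ("BACKUP", lambda: 0),
]
completed_steps, final_rc = run_checkpoint_job(daily_job)
```

If the monitoring step ends with a nonzero return code, no step is recorded as completed and the previous day's checkpoint and backup remain the recovery point.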

Figure 8-1 VSM System Recovery Points Example (Local and Remote)


To run this example using the configuration shown in Figure 8-1:

  1. Create the following policy statements.

    MGMT NAME(DR)  MIGPOL(LOCAL,REMOTE) IMMDELAY(0)
    STOR NAME(LOCAL) ACS(00)
    STOR NAME(REMOTE) ACS(01)
    

    Note:

    For an effective DR environment, you may also want to consider using MIGRSEL and MIGRVTV statements, which you can use to ensure DR copies are secured as early as possible.

  2. To ensure that the critical data is secured at the remote location, run the following example DRMONitr job step.

    //MONITOR  EXEC PGM=SLUADMIN,PARM='MIXED'
    //STEPLIB  DD DSN=hlq.SEALINK,DISP=SHR
    //SYSIN    DD UNIT=SYSDA,SPACE=(TRK,1)
    //*
    //SYSPRINT DD SYSOUT=*
    //SLSPRINT DD SYSOUT=*
    //SLSIN    DD *
    DRMON MGMT(DR) STOR(REMOTE) MAXAGE(24) TIMEOUT(30)
    

    In this example, the DRMONitr utility waits until all VTV copies that belong to management class DR and are less than 24 hours old have been delivered to the remote ACS. The utility is set to abort if the run time (or wait time) exceeds 30 minutes.

  3. Once all VTV copies have been delivered to the remote ACS, as indicated by a return code of zero, run the DRCHKPT utility to set the recovery point as shown in the following example.

    //CHKPT    EXEC PGM=SLUADMIN,PARM='MIXED'
    //STEPLIB  DD DSN=hlq.SEALINK,DISP=SHR
    //SYSPRINT DD SYSOUT=*
    //SLSPRINT DD SYSOUT=*
    //SLSIN    DD *
    DRCHKPT SET
    

    In this example, the DRCHKPT utility sets a time stamp, or recovery point, in the active CDS. Beginning at this recovery point, MVC copy content is guaranteed for a period in the future (for example, until the DRCHKPT utility is run again).

  4. Once the recovery point is set in the active CDS, a CDS backup should be taken immediately as shown in the following example.

    //BACKUP   EXEC PGM=SLUADMIN,PARM='MIXED'
    //STEPLIB  DD DSN=hlq.SEALINK,DISP=SHR
    //SYSIN    DD UNIT=SYSDA,SPACE=(TRK,1)
    //*
    //SLSCNTL  DD DSN=hlq.DBSEPRM,DISP=SHR
    //SLSBKUP  DD DSN=hlq.DBSEPRM.BKUP,DISP=SHR
    //SYSPRINT DD SYSOUT=*
    //SLSPRINT DD SYSOUT=*
    //SLSIN    DD *
    BACKUP OPTION(COPY)
    

After the backup is taken, the MVC content, or metadata, is guaranteed to remain valid for a period of time (until a subsequent recovery point or checkpoint is set).
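The meaning of the MAXAGE and TIMEOUT values used in step 2 can be illustrated with a short Python sketch. This is not how DRMONitr is implemented. The record layout (`mgmt`, `created`), the function names, the polling interval, and the nonzero return value on timeout are all assumptions chosen for the illustration; only the idea of selecting recent VTVs by management class and giving up after a time limit comes from the description above.

```python
import time
from datetime import datetime, timedelta
from typing import Callable, Dict, List


def select_monitored(vtvs: List[Dict], mgmt_class: str,
                     max_age_hours: int, now: datetime) -> List[Dict]:
    """Keep VTVs of the given management class that are less than max_age_hours old."""
    limit = timedelta(hours=max_age_hours)
    return [v for v in vtvs
            if v["mgmt"] == mgmt_class and now - v["created"] < limit]


def wait_for_copies(pending_count: Callable[[], int],
                    timeout_minutes: float,
                    poll_seconds: float = 60.0,
                    clock: Callable[[], float] = time.monotonic,
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll until no copies are pending (return 0) or the timeout expires (return 8)."""
    deadline = clock() + timeout_minutes * 60
    while True:
        if pending_count() == 0:
            return 0
        if clock() >= deadline:
            # Hypothetical nonzero code, not a documented DRMONitr value.
            return 8
        sleep(poll_seconds)
```

The `clock` and `sleep` parameters exist only so the loop can be exercised with a simulated clock instead of waiting in real time.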

This completes the procedure. If a disaster is declared (the local production site is no longer available), then either:

  • The MVCs and all other critical data (metadata copies for example) are transported to another facility where a mirror of the production Local site is available.

    or

  • A replica of the production Local site is constructed at the remote location.

The metadata (CDS, TMC, MVS Catalog) is then restored. When the tape environment is restarted, processing can proceed (roll forward) from the DR sync point.

Example 2: Using CONFIG RECLAIM PROTECT

In this example, the CDS is backed up every 24 hours. MVC content, or CDS metadata, within the CDS backup must remain valid until the subsequent CDS backup is taken.

This example shows MVC protection set to 28 hours. For more information on the CONFIG RECLAIM PROTECT parameter, see ELS Command, Control Statement, and Utility Reference.

  1. Set CONFIG GLOBAL PROTECT = 28.

  2. Day 1, back up the CDS.

    • Any MVCs drained/reclaimed after this backup cannot be overwritten for 28 hours.

    • Day 1 CDS backup is now the recovery point until the next CDS backup.

  3. Day 2, back up the CDS.

    • Any MVCs drained/reclaimed after this backup cannot be overwritten for 28 hours.

    • Day 2 CDS backup is now the recovery point until the next CDS backup.
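The arithmetic behind the 28-hour value can be checked with a short Python sketch. The function names and the dates are hypothetical; the only inputs taken from this example are the 24-hour backup interval and the 28-hour protection period. The sketch assumes the protection period is counted from the time the MVC is drained or reclaimed.

```python
from datetime import datetime, timedelta

PROTECT_HOURS = 28          # protection period from this example
BACKUP_INTERVAL_HOURS = 24  # one CDS backup per day


def earliest_overwrite(drained_at: datetime,
                       protect_hours: int = PROTECT_HOURS) -> datetime:
    """Earliest time a drained or reclaimed MVC may be overwritten."""
    return drained_at + timedelta(hours=protect_hours)


def protected_until_next_backup(last_backup: datetime,
                                drained_at: datetime,
                                protect_hours: int = PROTECT_HOURS,
                                interval_hours: int = BACKUP_INTERVAL_HOURS) -> bool:
    """True if the MVC stays protected at least until the next scheduled backup."""
    next_backup = last_backup + timedelta(hours=interval_hours)
    return earliest_overwrite(drained_at, protect_hours) >= next_backup


day1_backup = datetime(2024, 1, 1, 2, 0)                 # hypothetical backup time
drained = day1_backup + timedelta(minutes=1)             # drained just after the backup
covered = protected_until_next_backup(day1_backup, drained)
```

Because an MVC can only be drained after the Day 1 backup, a protection period longer than the backup interval always reaches the Day 2 backup. Under these assumptions, the extra four hours leave room for a Day 2 backup that runs late.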