Example Scenarios for Recovering Data

An OKM system's ability to recover from a disaster depends on the structure of the cluster.

OKM can span multiple geographically separated sites to reduce the risk of a disaster destroying the entire cluster. Although it is unlikely that an entire cluster must be re-created, you can recover most of the key data by re-creating the OKM environment from a recent database backup.

When designing an encryption/archive strategy, you should replicate and vault critical data at a recovery site. If a site is lost, this backup data can be transferred to another operational site. Data units and keys associated with tape volumes are known to the KMAs at the sister site, so the encrypted data required to continue business operations remains available. The damaged portion of the cluster can be restored at the same or a different location once site operations resume.

Many companies employ the services of a third-party disaster recovery (DR) site to allow them to restart their business operations as quickly as possible. Periodic unannounced DR tests demonstrate the company's degree of preparedness to recover from a disaster, natural or human-induced.

Replicate from Another Site

Two geographically separate sites, each with two KMAs, allow recovery of a single KMA to occur with no impact on the rest of the cluster, as long as one KMA always remains operational.

The figure below shows a disaster recovery example where restoring business continuity to an entire site could take months. If Site 1 were destroyed, the customer must replace all the destroyed equipment to continue tape operations at Site 1. Completely restoring Site 1 requires you to install and create the new KMAs (this requires a Security Officer and a quorum), join them to the existing cluster, and enroll the tape drives. Site 1 then self-replicates from the surviving KMAs at Site 2.

Figure A-1 Replication from Another Site—No WAN Service Network


The figure below shows a disaster recovery example where restoring business continuity takes only minutes. If the KMAs at Site 1 were destroyed and the infrastructure at Site 2 is still intact, a WAN used as the Service Network that connects the tape drives between the two sites allows the intact KMAs at Site 2 to continue tape operations for both sites. Once the KMAs are replaced at Site 1, they self-replicate from the surviving KMAs at Site 2.

Figure A-2 Replication from Another Site—WAN Service Network


Dedicated Disaster Recovery Site

A dedicated disaster recovery site connected to the cluster using a WAN allows recovery to begin immediately in case of a disaster.

Recovery can begin once the customer enrolls the tape drives in the KMAs and joins them to the OKM cluster. To do this, connect the OKM GUI to the KMAs at the DR site. In a true disaster recovery scenario, these may be the only remaining KMAs from the customer's cluster. Drive enrollment can occur within minutes, and tape production can begin after the drives are configured.

In the example below, the customer has a large environment with multiple sites. Each site uses a pair of KMAs and the infrastructure to support automated tape encryption, and all KMAs share keys in a single cluster. In addition to these sites, the customer maintains and uses equipment at a Disaster Recovery (DR) site that is part of the customer's OKM cluster.

This customer uses a simple backup scheme that consists of daily incremental backups, weekly differential backups, and monthly full backups. The monthly backups are duplicated at the DR site and sent to an off-site storage facility for 90 days. After the 90-day retention period, the tapes are recycled. Because the customer owns the equipment at the DR site, this site is simply an extension of the customer's environment that strictly handles the backup and archive processes.
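The 90-day retention rule can be illustrated with a short date calculation. This is only a sketch: the function names are hypothetical, and it assumes that the retention period starts on the day the tape is sent to off-site storage and that the tape becomes recyclable on the 90th day after that date.

```python
from datetime import date, timedelta

# Retention period stated in the text for monthly full backups.
RETENTION_DAYS = 90


def recycle_date(stored_on: date) -> date:
    """Return the first day a tape may be recycled.

    Assumption: retention is counted from the day the tape goes off-site.
    """
    return stored_on + timedelta(days=RETENTION_DAYS)


def is_recyclable(stored_on: date, today: date) -> bool:
    """Return True once the 90-day retention period has elapsed."""
    return today >= recycle_date(stored_on)


if __name__ == "__main__":
    stored = date(2024, 1, 1)
    print("Stored on:     ", stored.isoformat())
    print("Recyclable on: ", recycle_date(stored).isoformat())
    print("Recyclable on 2024-03-01?", is_recyclable(stored, date(2024, 3, 1)))
```

A site's actual retention policy may count days differently (for example, from the backup date rather than the shipping date), so adjust the starting point to match local practice.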

Figure A-3 Pre-positioned Equipment at a Dedicated Disaster Recovery Site


Shared Resources for Disaster Recovery

Use shared resources for backup and archive to provide a cost-efficient element for disaster recovery.

Companies that specialize in records management, data destruction, and data recovery purchase equipment that several customers can use for backup and archive. The customer can restore backups of their OKM into KMAs provided by the shared resource site. This avoids the need for a wide area network (WAN) link and on-site dedicated KMAs, but it requires additional time to restore the database. Restore operations can take about 20 minutes per 100,000 keys.
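As a rough planning aid, the restore rate quoted above can be turned into an estimate for a given key count. The sketch below assumes that restore time scales linearly with the number of keys, which the text does not state; the function name is hypothetical, and actual times depend on the hardware and the backup.

```python
# Rate quoted in the text: about 20 minutes per 100,000 keys.
MINUTES_PER_100K_KEYS = 20


def estimate_restore_minutes(key_count: int) -> float:
    """Return a rough restore-time estimate in minutes.

    Assumption: restore time grows linearly with the number of keys.
    """
    if key_count < 0:
        raise ValueError("key_count must be non-negative")
    return key_count / 100_000 * MINUTES_PER_100K_KEYS


if __name__ == "__main__":
    for keys in (50_000, 100_000, 750_000):
        minutes = estimate_restore_minutes(keys)
        print(f"{keys:>9,} keys: about {minutes:.0f} minutes")
```

Under this assumption, a database of 750,000 keys would take about 150 minutes to restore, which should be added to the time needed to configure the equipment at the shared resource site.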

At the DR site:

  • The customer selects the appropriate equipment from the DR site inventory. The DR site configures the equipment and infrastructure accordingly.
  • IMPORTANT: The customer must provide the DR site with the three OKM backup files: the Core Security backup file (requires a quorum), the .xml backup file, and the .dat backup file.
  • The customer configures an initial KMA using QuickStart, restores the KMA from the OKM backup files, activates, enables, and switches the drives to encryption-capable, and enrolls the tape drives into the DR site KMA cluster.
  • Once the restore completes, the DR site must switch off encryption on the agents, remove the tape drives from the cluster or reset the drive passphrases, reset the KMAs to factory default, and disconnect the infrastructure and network.

Key Transfer Partners for Disaster Recovery

Key Transfer is also called Key Sharing. Transfers allow keys and associated data units to be securely exchanged between partners or independent clusters, and are required if you want to exchange encrypted media.

Note:

A DR site may also be configured as a Key Transfer Partner.

This process requires each party in the transfer to establish a public/private key pair. Once the initial configuration is complete, the sending party uses Export Keys to generate a transfer file, and the receiving party then uses Import Keys to receive the keys and associated data.

As a practice, using Key Transfer Partners for disaster recovery is not recommended. However, when DR sites create keys during the backup process, a key transfer can incrementally add the DR site's keys to the existing database.

The Key Transfer process requires each user to configure a Transfer Partner for each OKM Cluster: one partner exports keys from their cluster and the other partner imports keys into their cluster. When configuring Key Transfer Partners, administrators must perform tasks in a specific order that requires the security officer, compliance officer, and operator roles.

To configure Key Transfer Partners, see "Transfer Keys Between Clusters".

Figure A-5 Transfer Key Partners
