Solstice Backup 5.1 Disaster Recovery Guide

Chapter 1 Introduction

This chapter contains concepts, procedures, and information that help you prepare for recovering data after a disaster. It is important that you develop a plan for recovering from a disaster where valuable data, a disk, or an entire system has been destroyed.

Different Types of Disasters

Typically, the four types of disasters you might experience are as follows:

Critical data other than the operating system (OS) or the Backup software is damaged or destroyed. This disaster applies to both Backup clients and servers.

In the example shown in Figure 1-1, a Backup client has two disks. The disk containing the operating system and Backup software is still operational, but the second disk, containing critical client data, was destroyed by a disk crash. To recover from this disaster, use the Backup recover program to recover the lost applications and data.

Figure 1-1 Critical Data is Lost on a Secondary Disk

In Figure 1-2, the operating system has been damaged or destroyed. This situation can occur on Backup clients and servers.

In the example, a Backup server has several physical disks. A power outage corrupted the filesystem on Disk 0, which destroyed the operating system. To recover from the disaster, you need to replace the disk, reinstall the operating system, and if necessary, the Backup software. Then use Backup to recover the lost server configuration and any data that was lost when the operating system was destroyed.

Figure 1-2 Disk Containing Operating Software is Damaged

Caution -

In a situation where the operating system was destroyed, you must always reinstall the operating system, reinstall Backup, and then use Backup to recover the remainder of your data. You cannot recover data backed up by Backup without reinstalling the operating system and Backup software first.

In Figure 1-3, the directory on the server that contains the Backup software and the online indexes and configuration files has been damaged or destroyed. The operating system is assumed to be running on a different disk than the Backup software. This situation only applies to Backup servers.

In the example, a single disk on a Backup server contains the Backup software and the index and configuration files. To recover from a disaster of this type, recover the contents of the bootstrap save set.

Figure 1-3 Disk Containing Backup Indexes is Damaged

In the example in Figure 1-4, the Backup server has been destroyed. To recover from this disaster, you need to recover all the data to a new system with the same name. This situation applies only to Backup servers.

Figure 1-4 Backup Server is Destroyed

Preparing for Disaster

Not only do you need to back up important data on a daily basis, you need to develop and test a plan for recovering your data if a disk crashes or you lose critical data. The more time and effort you invest in creating and testing your plan, the better prepared you are if disaster strikes.

Disaster Recovery Requirements

If you have included your Backup server and clients in a scheduled backup, you are well on your way to being prepared for a disaster. With each server backup, Backup creates a special save set called the bootstrap, which is essential for recovering from a disaster.

Along with the bootstrap information, you should also keep accurate records of your network and system configurations, and maintain a safe location for all your original software.

The extent of the damage or data loss determines the number of procedures you need to perform. To accomplish the most comprehensive disaster recovery, you need the following items:

Original operating system media and patches
Original Backup media
Device drivers and media device names
Filesystem configuration
IP addresses and hostnames
Bootstrap information
Enabler and authorization codes (under certain circumstances)

To recover from a disaster where you need to reinstall the operating system, you have two choices. You can perform a complete installation where you reinstall all the operating system files and recreate any special configurations. Or, you can perform a partial reinstall of the operating system and wait to recover the system configuration files with Backup after your system is functional. If you have an autochanger, you can either configure and use the autochanger during the recovery, or use the drive in the autochanger as a stand-alone device.

Important Information

Use the procedures in this section to collect bootstrap and disk configuration information necessary to perform a disaster recovery.

Bootstrap Information

During each scheduled backup of the Backup server, Backup creates a special save set named bootstrap, which is essential for a successful disaster recovery. The bootstrap contains the Backup server file index, media database, and configuration files.

Caution -

Backup does not save the bootstrap information during a manual backup; Backup only saves it during a scheduled save.

Backup prints or saves to a file the most recent bootstrap information, which includes dates, locations, and save set ID numbers. See Example 1-1 for an example of the bootstrap information generated each time Backup performs a scheduled backup. Make sure you store the bootstrap printout or electronic file in a safe place.

The bootstrap information lists the bootstrap save sets backed up during the past month. For example:

Example 1-1 Bootstrap Information

August 20 03:30 1996 Backup bootstrap information Page 1
date    time    level   ssid            file    record  volume
8/19/96 2:29:08 9       1148868949      56      0       mars.005
8/20/96 2:52:25 9       1148868985      77      0       mars.001
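
If you save the bootstrap information to a file, you can pull out the values for the most recent bootstrap with standard tools. The following is a minimal sketch and is not part of Backup: the function name latest_bootstrap and the file name bootstrap.txt are hypothetical, and the sketch assumes that the listing is in chronological order, so that the last entry is the most recent one.

```shell
#!/bin/sh
# latest_bootstrap FILE
# Print the ssid, file, record, and volume of the last entry in a saved
# bootstrap listing. Assumes the entries are in chronological order.
latest_bootstrap() {
    awk '
        # Entry lines start with a date such as 8/20/96; header lines do not.
        $1 ~ /^[0-9]+\/[0-9]+\/[0-9]+$/ { ssid = $4; file = $5; rec = $6; vol = $7 }
        END {
            if (ssid == "") exit 1    # no entry lines were found
            printf "ssid=%s file=%s record=%s volume=%s\n", ssid, file, rec, vol
        }
    ' "$1"
}

# Usage with a hypothetical saved copy of the printout:
#   latest_bootstrap bootstrap.txt
```

For the listing in Example 1-1, the function prints ssid=1148868985 file=77 record=0 volume=mars.001.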

You can also perform scheduled backups of the Backup server indexes by using the savegrp command. Using this command also sends the bootstrap information to a printer or electronic file. For example:

# savegrp -O -c server-name

To use the savegrp -O command, you must be root on the Backup server.

For information about printing or saving bootstrap data to a file, refer to the Solstice Backup 5.1 Administration Guide.

Bootstrap Save Set ID

The most efficient way to recover the bootstrap is to make sure you save the bootstrap information prior to a disaster. However, if you do not have the information, you must scan the most recent backup volume to find the save set ID (ssid) of the most recent bootstrap. Use the scanner -B command because it always finds a valid bootstrap.

How to Find the Bootstrap

After you locate the bootstrap with the most recent date, run the mmrecov command, and supply the save set ID and file number displayed by the scanner command.

Use the following steps to find the most recent save set ID:

Place the most recent media used for scheduled backups in the server device.

At the system prompt, change to the directory where you originally installed Backup, typically, /usr/sbin/nsr.

Use the scanner -B command to locate the most recent bootstrap on the media, for example:
# /usr/sbin/nsr/scanner -B /dev/rmt/0hbn
The scanner -B command displays the latest bootstrap save set information found on the backup volume, as illustrated below:
scanner: scanning 8mm tape jupiter.001 /dev/rmt/0hbn
scanner: Bootstrap 1148869870 of 8/21/96 7:45:15 located on volume jupiter.001, file 88
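
To avoid copying the save set ID and file number by hand, you can filter the scanner output. The following is a minimal sketch under stated assumptions: the function name parse_scanner_bootstrap is hypothetical, and the pattern matches only the message format shown in the example above. Whether scanner writes this message to standard output or standard error is not confirmed here, so the usage line combines both streams.

```shell
#!/bin/sh
# parse_scanner_bootstrap
# Read scanner -B output on standard input and print the save set ID,
# volume name, and file number of the bootstrap it reports.
# The pattern assumes the message format shown in the example above.
parse_scanner_bootstrap() {
    sed -n 's/.*Bootstrap \([0-9][0-9]*\) of .* located on volume \([^,]*\), file \([0-9][0-9]*\).*/ssid=\1 volume=\2 file=\3/p'
}

# Usage (device name taken from the example above):
#   /usr/sbin/nsr/scanner -B /dev/rmt/0hbn 2>&1 | parse_scanner_bootstrap
```

For the example output, the function prints ssid=1148869870 volume=jupiter.001 file=88. These are the values to supply to the mmrecov command.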

Disk Information

An additional precautionary step to help you recover from a loss of critical data is to find out, before a disaster strikes, how each disk on your network is partitioned and formatted, and to print and save this information. If a disk is damaged or destroyed during a disaster, use the disk information to recreate the disk exactly as it was prior to the disk crash. Do the same for each system Backup backs up, unless the systems are consistent in disk and filesystem layout.

Caution -

When you recreate your disk configuration, you need to have partitions large enough to hold all the recovered data. Make the partitions at least as large as they were before the crash.

Use the df command to find out how the Backup server disks are partitioned and mounted. Use the appropriate operating system command to print disk partitioning information. Do the same for any Backup clients that have local hard disks.

For example, the df -k information looks similar to this:

Filesystem            kbytes    used   avail capacity  Mounted on
/dev/dsk/c0t3d0s0     865678  624020  155098    80%    /
/dev/dsk/c0t1d0s6     265807  198729   40498    83%    /usr
/dev/dsk/c0t1d0s4      96103   57468   29025    66%    /var
swap                  107756       8  107748     0%    /tmp
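
One way to keep this information on hand is to capture it to a dated file that you then print or copy to another system. The following is a minimal sketch: the function name save_df_info and the file naming scheme are hypothetical, not a Backup convention.

```shell
#!/bin/sh
# save_df_info DIR
# Write the current "df -k" output to DIR/df.<hostname>.<yyyymmdd>
# and print the name of the file that was written.
save_df_info() {
    out="$1/df.$(uname -n).$(date +%Y%m%d)"
    df -k > "$out" || return 1
    echo "$out"
}

# Usage (the directory is an example only):
#   save_df_info /var/tmp
```

Store the resulting file somewhere other than the disks it describes, so that it survives a disk crash.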

The following dkinfo command examples give you information about how each disk is partitioned for a SunOS system:

% dkinfo sd0a
	SCSI CCS controller at addr f8800000, unit # 24
	1151 cylinders 9 heads 80 sectors/track
	33120 sectors (46 cyls)
	starting cylinder 0
% dkinfo sd0b
	1151 cylinders 9 heads 80 sectors/track
	197280 sectors (274 cyls)
	starting cylinder 46

The prtvtoc command output in Example 1-2 shows how each disk is partitioned on a Solaris system. The device name is the "raw" device corresponding to the device name shown in the output from the df command.

Example 1-2 prtvtoc Command Output

* /dev/dsk/c0t1d0s3 partition map
*
* Dimensions:
*     512 bytes/sector
*      80 sectors/track
*      17 tracks/cylinder
*    1360 sectors/cylinder
*    3500 cylinders
*    1965 accessible cylinders
*
* Flags:
*   1: unmountable
*  10: read-only
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       2      5    00          0   2672400   2672399
       3      6    00          0   1434800   1434799   /export
       4      7    00    1434800    205360   1640159   /var
       5      6    00    1640160    463760   2103919   /opt
       6      4    00    2103920    568480   2672399   /usr

If a disk is damaged, you can restore it and recover the filesystems to their original state, using the hardcopy information from these disk information commands.
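
When you recreate partitions, it helps to have the partition sizes in kilobytes rather than sectors, so that you can confirm each new partition is at least as large as the original. The following is a minimal sketch: the function name vtoc_sizes is hypothetical, and the conversion assumes 512 bytes per sector, as stated in the Dimensions section of Example 1-2.

```shell
#!/bin/sh
# vtoc_sizes
# Read prtvtoc output on standard input and print, for each partition,
# its number, its size in KB, and its mount directory ("-" if none).
# Assumes 512 bytes per sector, so KB = sector count * 512 / 1024.
vtoc_sizes() {
    awk '
        /^\*/ || NF < 6 { next }    # skip comment lines and blank lines
        { printf "%s %d %s\n", $1, $5 * 512 / 1024, (NF >= 7 ? $7 : "-") }
    '
}

# Usage, where raw-device is the raw device name for the disk:
#   prtvtoc raw-device | vtoc_sizes
```

For Example 1-2, partition 3 has 1434800 sectors, so the function reports it as 717400 KB mounted on /export.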

Disaster Recovery Choices

You have several options for how to recover the operating system and for whether to use an autochanger or a stand-alone drive. This section outlines the differences so you can decide which methods best suit your situation.

Restoring the Operating System

You can use one of two methods for recovering the operating system during a disaster recovery: complete or partial. When you perform a complete reinstallation, you install all operating system files and recreate any unique configurations that existed before you lost data or experienced a disk crash. To perform a partial reinstallation, install the minimum number of files and make the minimum number of configurations necessary for creating a fully operational networked system. Then, later, recover the remaining operating system and configuration files using Backup.

Figure 1-5 illustrates the steps for recovering from a disk crash where you lost the operating system, Backup software, and server indexes and configuration files. It also outlines the two choices you have for reinstalling the operating system.

Figure 1-5 Recovering From a Disk Crash With Operating System Loss

Performing a disaster recovery for Backup servers and clients is very similar, except on client systems you do not need to recover the server indexes and configuration files.

Complete Installation

In some cases, it might be faster to perform a complete reinstallation of the operating system, especially if you install the operating system from a CD and have very few special configurations to recreate. Depending on the speed of your backup device and network, it could potentially take longer to recover the remainder of your files and configurations using Backup during the disaster recovery procedure.

If you use a device with a default configuration that is not directly supported by the operating system, you also need to modify the device configuration files during installation:

You might need to modify the /kernel/drv/st.conf file to support a DLT tape drive.
For SunOS systems, modify the /usr/sys/scsi/targets/st_conf.legato.h file.

When you recover the remainder of your data, you can decide whether you want to replace the operating system files you just reinstalled with the operating system files backed up by Backup. If you want to guarantee that you have the same configurations you had prior to the disaster, replace the files and configurations you created during the installation.

Partial Installation

On the other hand, a partial installation might get your Backup server up and running more quickly, so you can concentrate on continuing the disaster recovery. Later, you can recover the remainder of your operating system files using Backup. You will especially save time if you have a large number of clients and devices on the network that need to be configured; otherwise, it takes time to find IP addresses and hostnames and to recreate configurations.

Furthermore, if you wait to recover the remainder of the operating system files with Backup, you will be assured that the server, clients, and devices will be reconfigured exactly as they were prior to the disaster.

If you choose to do a partial install, you need to perform the following tasks:

If necessary, select a domain for the system.
Install the basic operating system files and device driver software.
Make sure the system communicates properly over the network.

After you reinstall the operating system, whether you did a complete or partial installation, run the tar command to verify that the tape drive is functioning properly.
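
The tar check can be sketched as a write followed by a read of one small file. The following is a minimal sketch under stated assumptions: the function name verify_tape_drive is hypothetical, DEVICE must be an absolute path to a rewinding device name (so that the tape is back at the beginning before tar reads it), and the tape in the drive must be a scratch tape, because the test overwrites it.

```shell
#!/bin/sh
# verify_tape_drive DEVICE
# Write one small file to DEVICE with tar, list the archive back, and
# succeed only if the listing shows that file.
# WARNING: this overwrites the medium in DEVICE. Use a scratch tape,
# never a backup volume.
verify_tape_drive() {
    dev=$1
    tmp=$(mktemp -d) || return 1
    echo "tape drive test" > "$tmp/tapetest.txt"
    # Write the test file to the device, then read the archive listing.
    listing=$(cd "$tmp" && tar cf "$dev" tapetest.txt && tar tf "$dev")
    rc=$?
    rm -rf "$tmp"
    [ "$rc" -eq 0 ] && [ "$listing" = "tapetest.txt" ]
}

# Usage, where device-name is the pathname of the tape drive:
#   verify_tape_drive device-name && echo "tape drive OK"
```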

Recovery with Autochangers (Jukeboxes)

This section explains how to use your autochanger during a disaster recovery where you have lost, at a minimum, the Backup server indexes and configuration files. The configuration files reside in the /nsr/res directory.

The configuration files include the nsrjb.res file, which contains autochanger configuration information.

This section assumes that you have lost the Backup server indexes and configuration files on the original server, or you are moving Backup and need to recover the existing index and configuration files to the new server.

For more information, see "System and Backup Software Recovery" and "How to Recover Backup Indexes and Configuration Files".

The programs that recover the indexes and configuration files do not recognize autochangers. Consequently, you need to use the autochanger as if it were a stand-alone drive for that portion of the recovery. Use the autochanger's control panel to mount and unmount the necessary backup volumes.

After recovering the indexes and configuration files, all the original autochanger configuration files are back in place. You can now use the autochanger to recover the remainder of your data.

Caution -

If you did not lose the server indexes and they are over 30 days old, you must reenable the server and autochanger to use the autochanger during a disaster recovery.

The rest of this section describes the issues that might influence your choices for using the autochanger or just the drive located inside the autochanger and how to recover the server's indexes and configuration files.

Autochanger Addition and Configuration

If you choose to recover with an autochanger, review these issues about recovering data prior to restoring the server indexes and configuration files:

If the autochanger has more than one drive, use the first drive for recovery.
You cannot use the full functionality of the autochanger while restoring the server indexes and configuration files. The mmrecov command does not support autochangers; it supports only stand-alone devices.
The robotic device does not locate, load, and mount volumes automatically. You must use the Backup Mount and Unmount buttons and the autochanger control panel to mount and unmount volumes. If you use the autochanger control panel, Backup does not have a record of where the volumes have been moved, so inventory the autochanger contents after you complete the recovery.
When you recover the server indexes and configuration files, you recover the autochanger configuration files as they existed during the last backup, including the inventory of the autochanger. If you moved backup volumes inside the autochanger during the disaster recovery, the location of the volumes probably no longer matches the recovered inventory contents. After the recover operation, inventory the autochanger.

How to Recover with an Autochanger

Use the following procedure to perform a disaster recovery with an autochanger:

If necessary, reinstall the operating system and Backup software.

During installation, use the same pathname for the indexes that you previously used and backed up.

Run the jbconfig command to add and configure the autochanger.

Issue the nsrjb -vHE command.

This command resets the autochanger for operation, ejects backup volumes, reinitializes the element status, and checks each slot for a volume. If the -E option is not supported for your autochanger, use the sjiielm program (for example, /etc/LGTOuscsi/sjiielm) to initialize element status.

If a volume is loaded in the drive, it is removed and placed into a slot. This operation might take a few minutes to complete.

If you receive an error, typically the robotic device is having trouble finding a slot for a volume it has removed from the drive. Try moving some backup volumes around to make room for the volume, or, if possible, remove the volume from the robotic arm and manually place it in a slot.

Locate your bootstrap data, either an electronic file or hardcopy.

With this information, determine which volumes are necessary for recovering the server indexes and configuration files.

Enter the nsrjb -I command to inventory the contents of the autochanger, to help you determine whether the volumes required for recovering the bootstrap are inside the autochanger.

Chances are the volume currently loaded in the drive contains the most current bootstrap.

If you want to speed up this process, issue the command with the -S flag and list only the slots where you think the required backup volumes reside. This saves you from having to inventory the entire autochanger contents. You must list the slots in order (for example, "nsrjb -I -S 1-3"). If you want to inventory slots out of order (for example, 1, 3, and 6), you must issue the nsrjb -I -S command separately for each slot. All the volumes currently loaded in the autochanger are marked with an asterisk because you have not yet recovered the media index.

Load the appropriate volume by entering the following command:
# nsrjb -l -S slot -f device-name
where slot is the slot where the first volume is located and device-name is the pathname of the first drive. You can also use the Backup Mount button.

Enter the mmrecov command.

If the bootstrap spans more than one volume, Backup prompts you to load another backup volume.

Enter the nsrjb -u command to unmount the volume after the indexes have been recovered.
# nsrjb -u -S slot -f device-name
You can also use the Backup Unmount button.

Shut down Backup.

Rename /nsr/res to /nsr/res.orig.

Rename the /nsr/res.R directory to /nsr/res.

When you recover and rename the /nsr/res files, you replace the configuration files you created when you reinstalled and configured the autochanger. This step ensures that you have all your configurations that existed on the last backup, prior to the disaster.

Restart Backup.

After the server indexes and configuration files are recovered, you have a fully functioning autochanger. Inventory the contents of your autochanger, especially if you manually moved volumes as part of the disaster recovery.
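
The inventory step in the procedure above requires a separate nsrjb -I -S command for each slot when the slots are not contiguous. A small loop can issue those commands for you. The following is a minimal sketch: the function name inventory_slots and the DRY_RUN variable are hypothetical conveniences, and only the nsrjb -I -S form given in the procedure is used.

```shell
#!/bin/sh
# inventory_slots SLOT...
# Inventory each listed slot with its own "nsrjb -I -S" command.
# Set DRY_RUN=1 to print the commands instead of running them.
inventory_slots() {
    for slot in "$@"; do
        if [ -n "${DRY_RUN:-}" ]; then
            echo "nsrjb -I -S $slot"
        else
            nsrjb -I -S "$slot" || return 1
        fi
    done
}

# Usage: preview the commands for slots 1, 3, and 6, then run them.
#   DRY_RUN=1 inventory_slots 1 3 6
#   inventory_slots 1 3 6
```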

Recovery with a Stand-Alone Drive

If you choose to recover with a drive in the autochanger, review these issues about recovering data prior to restoring the indexes and configuration files:

If the autochanger has more than one drive, use the first drive for recovery.
You must manually mount the backup volumes required for recovering the server indexes and configuration files.
If you remove backup volumes from the autochanger cartridge used for recovering the Backup indexes and configuration files, put them back in the same slots when you finish.

How to Perform Disaster Recovery With a Stand-Alone Drive

Use the following instructions to perform a disaster recovery using just a drive inside the autochanger for a Backup server:

If necessary, reinstall the operating system and Backup software.

If you need to reinstall the Backup software, use the same pathname for the indexes that you previously used and backed up.

Locate your bootstrap data, either an electronic file or hardcopy.

With this information, determine which volumes are necessary for recovering the server indexes and configuration files.

Manually mount the appropriate volume into the drive.

Enter the mmrecov command.

Shut down Backup.

Rename the original /nsr/res directory to /nsr/res.orig.

Rename the recovered /nsr/res.R directory to /nsr/res.

Restart Backup.

Issue the nsrjb -vHE command.

This command resets the autochanger for operation, ejects backup volumes, reinitializes the element status, and checks each slot for a volume. If a volume is loaded in the drive, it is removed and placed into a slot. This operation might take a few minutes to complete.

Inventory the autochanger contents by using the nsrjb -I command or use the Inventory command in the administrator program.

After you recover the server indexes and configuration files, you should have a fully functioning autochanger.
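
The two rename steps in the procedure above can be sketched as a single guarded operation, so that the second rename runs only if the first one succeeds. The following is a minimal sketch: the function name swap_res_dir and its optional directory argument are hypothetical. Shut down Backup before you run it and restart Backup afterward, as the procedure describes.

```shell
#!/bin/sh
# swap_res_dir [NSR_DIR]
# Keep the freshly installed res directory as res.orig and move the
# recovered res.R directory into place. NSR_DIR defaults to /nsr.
swap_res_dir() {
    nsr=${1:-/nsr}
    # Refuse to run if the recovered directory is missing.
    [ -d "$nsr/res.R" ] || { echo "$nsr/res.R not found" >&2; return 1; }
    # Refuse to overwrite a copy saved by an earlier run.
    if [ -e "$nsr/res.orig" ]; then
        echo "$nsr/res.orig already exists" >&2
        return 1
    fi
    mv "$nsr/res" "$nsr/res.orig" && mv "$nsr/res.R" "$nsr/res"
}

# Usage, with Backup shut down:
#   swap_res_dir
```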
