Solstice Backup 5.1 Disaster Recovery Guide

Chapter 2 Disaster Recovery

Use the information in this chapter to determine which disaster recovery procedures to follow for Backup servers and clients. You should have already read Chapter 1, "Introduction," which explains how to prepare for recovering from a disaster and defines the basic terms, procedures, and concepts used throughout this guide.

This chapter includes procedures for recovering from the following kinds of disasters:

Loss of a drive or partition that contains critical data other than the operating system or Backup software
Loss of the operating system
Loss of a drive or partition that contains Backup software, which typically includes the Backup indexes and configuration files
Loss of an entire Backup server to the extent that you need to recover to a new system

It is difficult to provide step-by-step instructions for performing a disaster recovery for a specific situation, because every situation is unique. The examples in this chapter are designed to give you general principles for recovering from a disaster and to help you understand the procedures.

Requirements

While performing any of the disaster recovery procedures included in this chapter, keep in mind the requirements listed in this section. Fulfill the requirements pertinent to the disaster recovery procedure you are following.

Requirements for Replacing the Hardware

If hardware becomes damaged or destroyed, use the following guidelines to install and configure your new system hardware correctly:

Ensure that the replacement disk is as large as or larger than the original disk.
When replacing the hardware, try to use the same controller, driver, and SCSI ID used prior to the disaster.
Re-create the same size or larger disk partitions on the new disk/system.
Format the disk partitions using the same formats used by the original disk.
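
The size requirements above can be checked before any data is recovered. The following is a minimal sketch, not a Backup feature: check_partition_sizes is a hypothetical helper, and the two-column input format (partition name, then size, both files in the same unit) is an assumption made for illustration. It reports every partition that is missing or smaller on the replacement disk.

```shell
#!/bin/sh
# check_partition_sizes is a hypothetical helper, not a Backup command.
# Both input files use an assumed two-column format:
#   <partition-name> <size>      (sizes in the same unit in both files)
# It prints each partition that is missing or smaller on the new disk
# and returns non-zero if any such partition is found.
check_partition_sizes() {
    awk '
        NR == FNR { want[$1] = $2; next }   # first file: original sizes
        { have[$1] = $2 }                   # second file: new sizes
        END {
            bad = 0
            for (p in want) {
                if (!(p in have)) {
                    print p ": missing on new disk"; bad = 1
                } else if (have[p] + 0 < want[p] + 0) {
                    print p ": " have[p] " < " want[p]; bad = 1
                }
            }
            exit bad
        }
    ' "$1" "$2"
}

# Example with made-up sizes.
orig=$(mktemp)
new=$(mktemp)
printf '%s\n' 's5 1000000' 's7 2000000' > "$orig"
printf '%s\n' 's5 1000000' 's7 1500000' > "$new"
check_partition_sizes "$orig" "$new" || echo "replacement layout is too small"
rm -f "$orig" "$new"
```

With the sample values, the helper flags s7 because the new partition is smaller than the original.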

Requirements for Reinstalling the Operating System

If the operating system is damaged or destroyed, adhere to the following list when you reinstall the operating system:

Reinstall the same version of the operating system.
Use the same computer name, TCP/IP host name, and DNS domain name.
Reinstall any operating system patches that existed before the disaster.
Reinstall the device and SCSI drivers.
Make sure all network protocols are working properly.
After reinstalling the operating system, reboot your system, and log on as root. Make sure no error messages occur when you start up the system and that all the devices are recognized by the operating system.

Requirements for Reinstalling Backup

Fulfill the following requirements to ensure successful reinstallation of Backup. Refer to the Solstice Backup 5.1 Installation and Release Notes for installation instructions.

Reinstall the same version of the Backup software.
Reinstall Backup where it originally resided.
Reinstall any patches that were installed prior to the disaster.
For Backup servers, you will have to run additional procedures to retrieve the Backup server's indexes and configuration files. See "How to Recover Backup Indexes and Configuration Files" for information.
For Backup clients or storage nodes, see "How to Recover Backup Clients and Storage Nodes".

Critical Data Recovery

The following example assumes the disk containing the operating system and Backup software is still operational, but another disk containing critical data has been lost. The example applies to both Backup servers and clients.

If the disk is damaged beyond repair, replace it with a new disk that is the same size as or larger than the original disk. You need a disk large enough to hold all the data you plan to recover.

How to Recover Critical Data

To recover the critical data, follow these steps:

Install the replacement disk.

Make sure the operating system and kernel recognize the new disk.

Use the saved disk partition information to re-create the disk partitions with the same structure as the original disk.

See "Disk Information".

If you did not save the disk information, it should still be available because it is located on the primary disk, which, in this case, is operational. To find the original disk partition information, examine /etc/vfstab for a Solaris system and /etc/fstab for a SunOS system. However, these files do not record partition sizes, so you must estimate how big each partition should be.

Use the output from the disk information command to make a filesystem for each raw partition you plan to recover, then mount the block partition. (Backup does not initialize or create filesystems; it recovers data into existing filesystems.)

Use the appropriate command to format the replacement disk.

For SunOS and Solaris systems, use newfs or mkfs.

Caution -
Make sure the contents of the disk are no longer needed, because newfs and mkfs completely destroy the disk contents.

Run newfs on a SunOS system. For example:
# newfs /dev/rsd1g
...
# mount /dev/sd1g /export
# newfs /dev/rsd1h
# mount /dev/sd1h /home
Run newfs on a Solaris system. For example:
# newfs /dev/rdsk/c0t1d0s5
# mount /dev/dsk/c0t1d0s5
# newfs /dev/rdsk/c0t1d0s7
# mount /dev/dsk/c0t1d0s7

After creating and mounting all the filesystems on the replacement disk, use the Save Set Recover feature in the nwadmin program or the normal recovery procedure in the nwrecover program to recover the files.

For an explanation about the best recovery method for your lost data, see the Solstice Backup 5.1 Administration Guide.
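
The filesystem re-creation steps in this procedure can be rehearsed as a dry run that only prints commands for review. The following is a minimal sketch, assuming you kept a copy of the Solaris /etc/vfstab from before the disaster. The helper name print_rebuild_cmds and the sample file contents are illustrative and are not part of Backup. The sketch relies on the standard vfstab field order (device to mount, device to fsck, mount point, FS type, and so on). On older Solaris releases you might need nawk instead of awk for the -v option.

```shell
#!/bin/sh
# Dry run: print (never execute) the newfs and mount commands for every
# ufs filesystem on one disk, based on a saved copy of /etc/vfstab.
# print_rebuild_cmds is a hypothetical helper name, not a Backup command.
print_rebuild_cmds() {
    vfstab_copy=$1   # path to the saved vfstab copy
    disk=$2          # disk name as it appears in device paths, e.g. c0t1d0
    awk -v disk="$disk" '
        /^#/ || NF < 7 { next }            # skip comments and short lines
        $4 == "ufs" && index($1, "/" disk "s") {
            print "newfs " $2              # field 2: raw device
            print "mount " $1 " " $3       # field 1: block device, field 3: mount point
        }
    ' "$vfstab_copy"
}

# Example with made-up vfstab contents.
sample=$(mktemp)
cat > "$sample" <<'EOF'
#device to mount   device to fsck      mount point  FS type  fsck pass  mount at boot  options
/dev/dsk/c0t0d0s0  /dev/rdsk/c0t0d0s0  /            ufs      1          no             -
/dev/dsk/c0t1d0s5  /dev/rdsk/c0t1d0s5  /export      ufs      2          yes            -
/dev/dsk/c0t1d0s7  /dev/rdsk/c0t1d0s7  /home        ufs      2          yes            -
EOF
print_rebuild_cmds "$sample" c0t1d0
rm -f "$sample"
```

Because the helper only prints, you can check each device name against the replacement disk before running any destructive newfs command by hand.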

System and Backup Software Recovery

In this example, we assume a disk with the operating system and the Backup binaries has been damaged or completely destroyed, so you need to replace the damaged disk and reinstall both the operating system and the Backup software. If the disk was not completely destroyed and the operating system or Backup is still operational, use only those steps in this section that apply to your situation. These steps apply to Backup servers and clients, unless otherwise specified.

Caution -

When you recover the operating system, you must do so in single-user mode from the system console, not from the X window system.

How to Prepare for Operating System Recovery

To prepare for recovering the operating system for either a server or client, follow these instructions:

Replace the damaged disk if necessary.

Make sure the replacement disk is as large as or larger than the original disk.

Use the saved disk partition information to re-create the disk partitions with the same structure as the original.

See "Disk Information".

Use the output from the "Disk Information" section to make a filesystem for each raw partition that you plan to recover, then mount the block partition.

Backup does not initialize or create filesystems; it recovers data into existing filesystems.

Use the appropriate command to format the replacement disk.

For SunOS and Solaris systems, use newfs or mkfs.

Reinstall the operating system in the same location where it originally resided, using the original software and accompanying documentation.

Use the same system name, TCP/IP host name, and DNS domain name used prior to losing the operating system.

You can choose to fully configure the operating system now, or you can install the minimum number of files and make the minimum number of configurations required to create an operational networked system. See "Restoring the Operating System" for more information.

Install and configure the SCSI controller and tape device drivers, if necessary.

Reinstall the Backup software, using the original software and accompanying documentation.

On a Backup client, you only need access to the Backup binaries. You can run Backup from the /usr/sbin/nsr directory or NFS-mount the binaries from another system running Backup. Refer to the Solstice Backup 5.1 Installation and Release Notes for installation instructions. Reinstall any Backup patches you had installed prior to the disaster.

You might have several different releases of Backup software; reinstall the same release that was running prior to the disaster or a later release. The release must be equal to or later than the release used for the backups.

Backup servers only - When you reinstall the Backup server software, Backup automatically rediscovers indexes and configuration files if they are not corrupted.

Backup clients only - The client system is now ready to recover its data from the Backup server.

You can also use the following method to access the Backup binaries for recovery. If another system on the network is like the system being recovered and is running Backup, you can NFS-mount its Backup binaries on the damaged system.

For example:
# mount venus:/usr/etc /mnt
# /mnt/recover -s server -q
recover> add /
recover> force
recover> recover

Install and configure the SCSI controller and tape device drivers.

Reboot the system, and log on as root.

How to Recover the Operating System

First, create and mount all the filesystems on the replacement disk. Then, to recover the operating system, use the Save Set Recover feature in the nwadmin program or the normal recovery procedure in the nwrecover program to recover the necessary data.

Recovering the Backup Software

The following example for recovering the Backup binaries assumes a disk containing the Backup software has been damaged or completely destroyed. This example also assumes that the operating system is installed and operating properly.

The set of instructions you need to follow in this section depends on which system you lost (server, client, or storage node) and the extent of the damage. Refer to the following list of disaster recovery scenarios to determine which set of instructions applies to your situation.

If you are recovering Backup clients and storage nodes, you need to follow the instructions in these sections:
- "How to Prepare for Backup Software Recovery"
- "How to Recover Backup Clients and Storage Nodes"
If you are recovering a Backup server that lost its indexes and configuration files, you need to follow the instructions in these sections:
If you are recovering a Backup server from clone volumes, you need to follow the instructions in these sections:
If you are recovering Backup to a new server, you need to follow the instructions in this section:
- "Recovery to a New Server"

How to Prepare for Backup Software Recovery

Before you can restore Backup configuration files and/or indexes, you must reinstall the Backup software from the original media on the damaged system.

To reinstall the Backup software, follow these instructions:

Replace the damaged disk if necessary. Make sure the replacement disk is as large as or larger than the original disk.

Use the saved disk partition information to re-create the disk partitions with the same structure as the original disk.

See "Disk Information".

Use the output from the disk information command to make a filesystem for each raw partition that you plan to recover, then mount the block partition. (Backup does not initialize or create filesystems; it recovers data into existing filesystems.)

Use the appropriate command to format the replacement disk.

For SunOS and Solaris systems, use newfs or mkfs.

Reinstall the Backup software, using the original software and accompanying documentation.

On a Backup client, you only need access to the Backup binaries. You can run Backup from the /usr/sbin/nsr directory or NFS-mount the binaries from another system running Backup. Refer to the appropriate Solstice Backup 5.1 Installation and Release Notes for detailed instructions. Reinstall any Backup patches you had installed prior to the disaster.

Backup servers only - You do not need to reload the license enablers if the /nsr/res directory (configuration files) still exists. If the /nsr/res directory was destroyed, the license enablers are recovered when you recover the configuration files.

If you had a link to another disk that contains the Backup indexes and configuration files (/nsr) or any other Backup directories located on another disk, re-create it now.

Backup servers only - If you back up to an autochanger and want to use it during the remainder of the disaster recovery, add and configure the autochanger with the jb_config command after installing Backup. See "Recovery with Autochangers (Jukeboxes)" for more information.

You can also use the following method to access the Backup binaries for recovery. If another system on the network is like the system being recovered and is running Backup, you can NFS-mount its Backup binaries on the damaged system.

For example:
# mount venus:/usr/etc /mnt
# /mnt/recover -s server -q
recover> add /
recover> force
recover> recover

If this system is a server, continue with the disaster recovery by restoring the Backup indexes and configuration files. See "How to Recover Backup Indexes and Configuration Files" for instructions.

If this system is a Backup client or storage node, see "How to Recover Backup Clients and Storage Nodes" for instructions.

How to Recover Backup Clients and Storage Nodes

To recover clients and storage nodes, you simply need to reinstall the Backup client and storage node software, and then recover their configuration files, using the nwrecover program.

The Backup clients and storage nodes, similar to Backup servers, each have a /nsr directory that contains special configurations created during the initial installation. During the disaster recovery procedure, you will recover the /nsr directory, which restores the clients and storage nodes to their status prior to the disaster.

To recover a Backup client or storage node, follow these steps:

Log on as root.

Start the nwrecover program.

Click the Recover speedbar button to open the Recover window.

Backup displays the system's directory structure in the Recover window.

Select and mark the Backup directory for recovery.

Click the Start speedbar button to begin the recovery.

Restart nsrexecd.

The Backup client or storage node should be restored to the status it had prior to the disk crash.

How to Recover Backup Indexes and Configuration Files

These steps apply only to Backup servers, because only servers store and maintain the indexes and configuration files. Use the mmrecov command to recover the Backup indexes and configuration files that reside in the /nsr directory.

If the operating system and the Backup software were also destroyed, you must reinstall them prior to recovering the /nsr directory contents. See "System and Backup Software Recovery".

When you use the mmrecov command to recover the /nsr directory, you recover the contents of three important directories:

/nsr/mm (media manager) directory - contains the Backup media index that tracks all the Backup backup volumes and their save sets.
/nsr/index/server-name directory - contains the server index, which lists all the server files that were backed up prior to the disaster.

The server index includes information about the client indexes, for example, where they are located and how to recover them. Later, after you complete the recovery of the server index, you can use the nwrecover program to recover the client indexes.
/nsr/res directory - contains special Backup configuration files.

The nsr.res file includes the list of clients that belong to the server, customized client configurations or selections, and device and registration information. The nsrjb.res file includes the location of the backup volumes in the jukebox and label template information. Unlike the indexes, the contents of this directory cannot be reliably overwritten while Backup is running. Therefore, mmrecov recovers the /nsr/res directory as /nsr/res.R, which you rename later.

Using the mmrecov Command

Use the mmrecov command to recover the Backup server indexes and configuration files in the /nsr directory. Information in this section only applies to Backup servers.

The mmrecov command prompts you for the bootstrap save set identification number (ssid or save set ID). If you followed the recommended procedures to prepare for loss of critical data, you have a copy of the bootstrap file (either hardcopy or an electronic file) that lists the bootstrap ssid and the name of the backup media you need. (Never run the mmrecov command from root (/); you can use any other directory.)

In the following example, ssid 17851237 is the most recent bootstrap backup:

Jun 17  22:21 1997 mars's Backup bootstrap information
date    time       level  ssid       file  record   volume
6/14/92 23:46:13   full   17826163    48       0   mars.1
6/15/92 22:45:15      9   17836325    87       0   mars.2
6/16/92 22:50:34      9   17846505   134       0   mars.2
6/17/92 22:20:25      9   17851237    52       0   mars.3

If you do not have this information, you can still recover the indexes by finding the ssid using the scanner -B command. See "Bootstrap Save Set ID".
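
If the bootstrap report is kept as an electronic file, the values that mmrecov prompts for can be pulled out of it mechanically. The following is a minimal sketch, assuming the report uses the column layout shown in the example (date, time, level, ssid, file, record, volume) and lists the most recent bootstrap last. The helper name latest_bootstrap is hypothetical and is not a Backup command.

```shell
#!/bin/sh
# latest_bootstrap is a hypothetical helper, not a Backup command.
# It prints "ssid file record volume" for the last data row of a saved
# bootstrap report, relying on the report listing the newest entry last.
latest_bootstrap() {
    awk '
        $1 ~ /^[0-9]+\/[0-9]+\/[0-9]+$/ && NF >= 7 {
            row = $4 " " $5 " " $6 " " $7   # ssid, file, record, volume
        }
        END { if (row != "") print row }
    ' "$1"
}

# Example using the rows from the sample report in this section.
report=$(mktemp)
cat > "$report" <<'EOF'
date    time       level  ssid       file  record   volume
6/14/92 23:46:13   full   17826163    48       0   mars.1
6/15/92 22:45:15      9   17836325    87       0   mars.2
6/16/92 22:50:34      9   17846505   134       0   mars.2
6/17/92 22:20:25      9   17851237    52       0   mars.3
EOF
latest_bootstrap "$report"    # prints: 17851237 52 0 mars.3
rm -f "$report"
```

The four values correspond, in order, to the save set ID, starting file number, and starting record number that mmrecov asks for, and to the volume it asks you to insert.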

With the operating system and Backup software in place, recover the indexes and configuration files from the backup media:

Find the bootstrap information.

This information is needed for the following steps.

Mount the backup media that contains the most recent backup named bootstrap in a storage device.

Use the mmrecov command to extract the contents of the bootstrap backup. (Never run the mmrecov command from root (/); you can use any other directory.)

For example:

# mmrecov
mmrecov: Using mars.sun.com as server
NOTICE: mmrecov is used to recover the Backup server's on-line file
and media indexes from media (backup tapes or disks) when either
of the server's on-line file or media index has been lost or
damaged.
Note that this command will OVERWRITE the server's existing on-
line file and media indexes.  mmrecov is not used to recover Backup
clients' on-line indexes; normal recover procedures may be used
for this purpose.  See the mmrecov(8) and nsr_crash(8) man pages
for more details.
Enter the latest bootstrap save set id []: 17851237
Enter starting file number (if known) [0]: 52
Enter starting record number (if known) [0]: 0
Please insert the volume on which save set id 17851237 started into
/disk1/file.tape.  When you have done this, press <RETURN>:
Scanning /disk1/file.tape for save set 17851237; this may take a
while...
scanner: scanning file disk file.tape on /disk1/file.tape
scanner: ssid 17851237: scan complete
scanner: ssid 17851237: 28 KB, 11 file(s)
/nsr/res/nsr.res
/nsr/res/nsr.res: file exists, overwriting
/nsr/res/nsrjb.res
/nsr/res/nsrjb.res: file exists, overwriting
/nsr/res/nsrla.res
/nsr/res/nsrla.res: file exists, overwriting
/nsr/res/
/nsr/mm/
/nsr/index/mars.sun.com/
/nsr/index/
/nsr/
/
#
nsrmmdbasm -r /nsr/mm/mmvolume/
nsrindexasm -r /nsr/index/mars.sun.com/db/
/disk1/file.tape: mount operation in progress
mars.sun.com: 7 records recovered, 0 discarded.
/disk1/file.tape: mounted file disk file.tape
The bootstrap entry in the on-line index for mars.sun.com has been
recovered. The complete index is now being reconstructed from the
various partial indexes which were saved during the normal save
for this server.

If your resource files were lost, they are now recovered in the
'res.R' directory.  Copy or move them to the 'res' directory, after
the index has been reconstructed and you have shut down the
daemons.  Then restart the daemons.
Otherwise, just restart the daemons after the index has been
reconstructed.
nsrindexasm: Pursuing index pieces of /nsr/index/mars.sun.com/db
from mars.sun.com.
Recovering files into their original locations.
nsrindexasm -r ./mars.sun.com/db/
merging with existing mars.sun.com index
mars.sun.com: 753 records recovered, 0 discarded.
Received 1 matching file(s) from NSR server `mars.sun.com'
Recover completion time: Wed Jan 28 08:37:38 1998
The index for `mars.sun.com' is now fully recovered.

You can use Backup commands such as nsrwatch or nwadmin to watch the progress of the server during the recovery of the indexes and configuration files. Open a new window (shell tool) to monitor the recovery so that the mmrecov output is not displayed on top of the nsrwatch output. For example:

mars# nsrwatch
Server: mars.sun.com           Wed Jan 28 08:53:54 1998
Up since: Wed Jan 28 08:35:15 1998 Version: Backup 5.1.Build.63 Eval
Saves: 0 session(s)  Recovers: 1 session(s), 131 KB total
Device             type     volume
Disk1/file.tape    file     file.tape    reading, done
Sessions:
Messages:
Wed 08:35:11 server notice: started
Wed 08:35:22 index notice: completed checking 1 client(s)
Wed 08:36:44 /disk1/file.tape mount operation in progress
Wed 08:36:48 /disk1/file.tape mounted file disk file.tape
Wed 08:37:36 /disk1/file.tape mounted file disk file.tape
Wed 08:37:36 mars.sun.com:/nsr/index/mars.sun.com (1/28/98) starting read from file.tape of 131 KB
Wed 08:37:37 mars.sun.com:/nsr/index/mars.sun.com (1/28/98) done reading 131 KB
Pending:

Recovery from Clone Volumes

For recovery from clone volumes, use the mmrecov command, as described in "Using the mmrecov Command".

Select the bootstrap save set ID that includes the information associated with the cloned save set. The most recent bootstrap is the last save set listed in the bootstrap output.

In the following example, the ssid of the most recent bootstrap is 17851237. The clone of the bootstrap save set resides on mars_c.3. The value for the file location is 6, and the value for the record location is 0.

Jun 17  22:21 1996  mars's Backup bootstrap information Page 1
date    time       level  ssid       file  record   volume 
6/14/96 23:46:13   full   17826163    48       0   mars.1 
6/14/96 23:46:13   full   17826163    12       0   mars_c.1
6/15/96 22:45:15      9   17836325    87       0   mars.2 
6/15/96 22:45:15      9   17836325    24       0   mars_c.2
6/17/96 22:20:25      9   17851237    52       0   mars.3
6/17/96 22:20:25      9   17851237     6       0   mars_c.3

After mmrecov recovers the bootstrap save set, it continues recovering the remainder of the server's client index to complete the recovery. The cloned bootstrap contains information about the original and cloned volumes.

Caution -

To most easily recover data from clone volumes, make sure that all the required clone volumes are mounted in attached devices at the time you run mmrecov. If some of the clone volumes are not online, mmrecov attempts to recover the server's client index from the original volume, not the clone volume.

Based on the preceding example of bootstrap output, the mars_c.1 and mars_c.3 volumes both need to be online. If the mars_c.3 volume is the only one online, mmrecov also requests mars.1.
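
When a bootstrap save set has been cloned, the report contains one row per copy, and each copy has its own file and record location. The following is a minimal sketch that lists every copy of the most recent bootstrap so you can pick the clone row. It assumes the column layout shown in the example and that the newest ssid is listed last. The helper name bootstrap_copies is hypothetical and is not a Backup command.

```shell
#!/bin/sh
# bootstrap_copies is a hypothetical helper, not a Backup command.
# For the most recent ssid in a saved bootstrap report (the last one
# listed), it prints one line per copy: "volume file record".
bootstrap_copies() {
    awk '
        $1 ~ /^[0-9]+\/[0-9]+\/[0-9]+$/ && NF >= 7 {
            last = $4                                       # newest ssid seen so far
            copies[$4] = copies[$4] $7 " " $5 " " $6 "\n"   # volume, file, record
        }
        END { if (last != "") printf "%s", copies[last] }
    ' "$1"
}

# Example using the rows from the clone report in this section.
report=$(mktemp)
cat > "$report" <<'EOF'
date    time       level  ssid       file  record   volume
6/14/96 23:46:13   full   17826163    48       0   mars.1
6/14/96 23:46:13   full   17826163    12       0   mars_c.1
6/15/96 22:45:15      9   17836325    87       0   mars.2
6/15/96 22:45:15      9   17836325    24       0   mars_c.2
6/17/96 22:20:25      9   17851237    52       0   mars.3
6/17/96 22:20:25      9   17851237     6       0   mars_c.3
EOF
bootstrap_copies "$report"
rm -f "$report"
```

For the sample report this prints the original copy (mars.3, file 52, record 0) and the clone copy (mars_c.3, file 6, record 0). To recover from the clone, enter the clone's file and record values at the mmrecov prompts.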

How to Rename the Configuration Files Directory

The information in this section only applies to Backup servers.

Unlike the /nsr/index directory, the /nsr/res directory that contains the configuration files cannot be reliably overwritten while Backup is running. Therefore, mmrecov recovers the /nsr/res directory as /nsr/res.R. To complete the recovery of the configuration files, shut down Backup, rename the recovered /nsr/res.R directory to /nsr/res, and then restart Backup.

When the mmrecov program is complete, it displays this message:

The index for `server_name' is now fully recovered.

To complete the recovery of the Backup configuration files, follow these steps:

Shut down the Backup server using the nsr_shutdown command:
# nsr_shutdown

Save the original /nsr/res directory as /nsr/res.orig, and rename the recovered directory (/nsr/res.R) to /nsr/res.
# cd /nsr
# mv res res.orig
# mv res.R res

Restart Backup.

When it restarts, the server uses the recovered configuration data in the /nsr/res directory.
# nsrd
# nsrexecd

After you verify the Backup configurations are correct, you can remove the res.orig directory.
# rm -r /nsr/res.orig
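
The rename step is easy to get wrong under pressure, so it can help to wrap it in a small guarded function. The following is a minimal sketch, not part of Backup: swap_res is a hypothetical helper that assumes the Backup daemons are already shut down, refuses to overwrite an existing res.orig, and takes the /nsr location as an argument so it can be rehearsed on a scratch directory first.

```shell
#!/bin/sh
# swap_res is a hypothetical helper, not a Backup command. It assumes the
# Backup daemons are already stopped (for example, with nsr_shutdown).
swap_res() {
    nsr_dir=${1:-/nsr}
    # Nothing to do unless mmrecov left a recovered res.R directory.
    [ -d "$nsr_dir/res.R" ] || { echo "no $nsr_dir/res.R to install" >&2; return 1; }
    # Never clobber an earlier saved copy.
    [ -e "$nsr_dir/res.orig" ] && { echo "$nsr_dir/res.orig already exists" >&2; return 1; }
    # Keep the current res (if any) as res.orig, then install res.R as res.
    if [ -d "$nsr_dir/res" ]; then
        mv "$nsr_dir/res" "$nsr_dir/res.orig" || return 1
    fi
    mv "$nsr_dir/res.R" "$nsr_dir/res"
}

# Rehearsal on a scratch directory instead of the real /nsr.
scratch=$(mktemp -d)
mkdir "$scratch/res" "$scratch/res.R"
echo old > "$scratch/res/nsr.res"
echo recovered > "$scratch/res.R/nsr.res"
swap_res "$scratch" && cat "$scratch/res/nsr.res"    # prints: recovered
rm -rf "$scratch"
```

On a real server you would call swap_res with no argument (it defaults to /nsr) after shutting down Backup, and then restart the daemons as described in the steps above.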

How to Complete the Recovery of the Backup Server Data

The information in this section only applies to Backup servers.

After you recover the server's indexes and configuration files, you can use the nwrecover program to recover the remainder of the server data, which includes the client indexes.

To recover the remainder of the Backup data, follow these steps:

Log on as root.

Open the nwrecover program.

Click the Recover speedbar button to open the Recover window.

Backup displays the system's directory structure.

Select and mark the Backup directory for recovery.

Deselect the following directories and files before you recover the remainder of the server data:
- /nsr/index/server-name directory - recovered when you ran the mmrecov command.
- /nsr/res and /nsr/mm directories - recovered when you ran the mmrecov command.
  
  If you recover the /nsr/res directory and you used the autochanger to perform the disaster recovery, you will lose any special configurations you created when you added and configured the autochanger for recovery.

Click the Start speedbar button to begin the recovery.

Restart nsrd and nsrexecd.

After you recover the server data, inventory the autochanger so Backup knows which slots contain which volumes.

The Backup server should be restored to the status it had prior to the disk crash.

Recovery to a New Server

This section describes the case where your original Backup server is beyond repair, so you want to move Backup to a new server. This procedure assumes that you are not updating the operating system or the Backup software.

Caution -

Do not make major changes to the operating system or Backup software at the same time you move to a new server.

If you want to make changes to the operating system or the Backup software, configure the new server exactly like the original, using the same version of the operating system and Backup software. After configuring the new server, make sure the system is operational, perform a couple of successful backups, and then update or upgrade the operating system or the Backup software, one at a time.

To move Backup to a new server, use the same steps for recovering the operating system and Backup software, including the indexes and configuration files. Follow the instructions in these sections:

However, you should be aware of the following requirements for configuring and registering the software:

Use the same hostname for the new Backup server. You must use the same hostname because the server indexes were created under the original Backup server name.
Make sure the original server name is listed as an alias for the server in the Client window of the nwadmin program.
If the new server has a different host ID, you need to reregister the Backup software.

After you move the Backup server to another system, you must recover the resource database (nsr.res file) to ensure that you carry over the same resource and attribute settings to the new Backup server.

If the new server has a different host ID, you have 15 days to reregister the software with Sun. Refer to the Solstice Backup 5.1 Installation and Release Notes.

Sun will send you a Sun Backup Host Transfer Affidavit, which you must complete and return. Once Sun receives the signed affidavit, Sun sends you a new authorization code to enter into the Auth Code field of the Registration window.

After you successfully move your server, check the following:

Verify that the server and all the clients are included in a scheduled backup.
Schedule a full backup or use the savegrp -O command to back up the server and all the clients as soon as possible. (Manual backups do not back up the server or client indexes.)
Use the nwrecover program's Recover window to make sure all the client indexes are browsable and, therefore, recoverable.