CHAPTER 6

Salvaging Damaged Volumes

This chapter describes how to restore data from tapes or magneto-optical disks that are not usable in a SAM-QFS environment. The procedures cover volumes that are partially corrupted, accidentally relabeled, left with a destroyed label, or entirely destroyed. They describe how to recover data both when alternative archive copies are available and when no other copies exist.

Before attempting the procedures in this chapter, determine whether or not the volume can be read by using software other than Sun StorEdge SAM-FS tools. Try reading the volume in multiple drives, or try using the tar(1) command.

This chapter covers the following topics:

Recovering Data From a Tape Volume

Recovering Data From a Magneto-optical Volume


Recovering Data From a Tape Volume

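The procedures for recovering data from a tape volume differ depending on the nature of the damage and whether or not additional archive copies of the volume's files are present on another tape. This section describes how to recover data in the following scenarios:

See Damaged Tape Volume - Other Copies Available.

See Damaged Tape Volume - No Other Copies Available.

See Relabeled Tape Volume - No Other Copies Available.

See Unreadable Tape Label - No Other Copies Available.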

Damaged Tape Volume - Other Copies Available

The Sun StorEdge SAM-FS storage and archive manager allows you to make up to four archive copies of each online file. By default, only one copy is made, but Sun Microsystems recommends that you make at least two copies, preferably to physically different archive media.

When an alternative archive copy is available, the recovery procedure includes a step for rearchiving all archive copies currently stored on the damaged volume before dispensing with the damaged volume. The new archive copies are made from the available alternative archive copy.


To Recycle a Damaged Tape - Other Copies Available

Use this procedure if alternative archive copies exist on volumes that are stored on-site and are available for staging.

1. Export the damaged volume from the tape library, and flag it as unavailable in the historian catalog.

Enter the export(1M) and chmed(1M) commands as shown in the following screen example, specifying the media type (mt) and VSN (vsn) of the damaged volume.


# export mt.vsn
# chmed +U mt.vsn

2. Flag the unavailable volume for recycling.

Use the chmed(1M) command and specify the media type (mt) and the VSN (vsn) of the damaged volume.


# chmed +c mt.vsn

3. Set the -ignore option for the library in the recycler.cmd file.

CODE EXAMPLE 6-1 shows the -ignore option set on the lt20 library:


CODE EXAMPLE 6-1 Example recycler.cmd File with the -ignore Option
# vi /etc/opt/SUNWsamfs/recycler.cmd
logfile = /var/adm/recycler.log
lt20 -hwm 75 -mingain 60 -ignore
:wq

For more information about the -ignore option, see the recycler-cmd(4) man page.

4. Run the sam-recycler(1M) command with the -x option from the command line.

For example:


# sam-recycler -x

When the recycler runs, it does not select any volumes for recycling other than the volume you have marked as unavailable. The recycler identifies all active archive copies on this volume and flags those archive copies for rearchiving. The next time the archiver runs, the archive copies marked for rearchiving will be written to new volumes.

After the archive copies have been written to new volumes, the damaged volume you are recycling is considered to be drained of active archive copies.

5. Dispense with the volume.

After the damaged volume is drained of active archive copies, you can dispense with the volume. How you dispense with it depends on the nature of the damage. Use the following guidelines:

If the tape is either partially corrupt or completely destroyed, it is possible (but not recommended) to reuse the tape VSN after the volume has been exported from the historian catalog.
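The command sequence in Steps 1, 2, and 4 can be sketched as a small helper that prints the commands for a given volume so they can be reviewed before being typed. This is a minimal sketch, not part of the product: the function name print_recycle_commands is hypothetical, and the helper deliberately prints rather than executes, partly because export is also a built-in of Bourne-family shells and would not reach the export(1M) command if run from a script without its full path. Step 3 (the -ignore entry in recycler.cmd) still has to be done by hand before the last command is run.

```shell
#!/bin/sh
# print_recycle_commands MT VSN
# Hypothetical helper: prints, in order, the commands from Steps 1, 2, and 4
# of the procedure above for the volume MT.VSN. Nothing is executed.
print_recycle_commands() {
    if [ $# -ne 2 ]; then
        echo "usage: print_recycle_commands media_type vsn" >&2
        return 1
    fi
    mt=$1
    vsn=$2
    echo "export $mt.$vsn"      # Step 1: export the damaged volume
    echo "chmed +U $mt.$vsn"    # Step 1: flag it as unavailable
    echo "chmed +c $mt.$vsn"    # Step 2: flag it for recycling
    echo "sam-recycler -x"      # Step 4: run after recycler.cmd is edited
}
```

For example, print_recycle_commands lt VSN001 prints the four commands with lt.VSN001 substituted, where lt and VSN001 are placeholder values.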

Damaged Tape Volume - No Other Copies Available

If a tape volume is partially corrupt, it is possible to recover data from the parts of the tape volume that are not corrupt. This process is not an exact science, and it requires some trial and error to recover as much data as possible.

Errors logged in the device log can help you determine the area of a tape that is damaged. The archive_audit(1M) command can be used to generate the position and offset information for all archived files for a specific file system. You can use this position and offset information to help determine which archive copies are written to an area of a tape that is damaged.


To Recover Files From a Damaged Tape - No Other Copies Available

1. Use the archive_audit(1M) command to generate a list of all files with archive copies on the partially corrupt tape volume.

Use the command syntax shown in the following screen example, specifying the file system's mount point, the VSN (vsn) of the volume, and an output file name.


# archive_audit /mount_point | grep vsn > filename

2. Edit the output file from the archive_audit(1M) command in the previous step, deleting the lines for the files in the damaged area, and saving the list of deleted files for inspection in Step 3.

3. Use the list of files with archive copies that cannot be accessed (the ones that are written in the area of the tape determined to be damaged) to determine if any of the files are still on the disk.

Files that are not on disk cannot be recovered. These unrecoverable files can be removed from the file system.

4. Edit and run the stageback.sh script on the archive_audit output file you edited in Step 2.

The stageback.sh script can stage each file from archive_audit output, set it to no-release, and mark the file for rearchiving.

For information about the stageback.sh script, see Disaster Recovery Commands and Tools.

a. Open the /opt/SUNWsamfs/examples/stageback.sh file for editing.


# cd /opt/SUNWsamfs/examples
# vi stageback.sh

b. Find the section that begins with # echo rearch $file.

CODE EXAMPLE 6-2 shows this.


CODE EXAMPLE 6-2 Example stageback.sh File
# echo rearch $file
#
# Edit the following line for the correct media type and VSN
#
# eval /opt/SUNWsamfs/bin/rearch -m media -v VSN $file

c. In the section shown in CODE EXAMPLE 6-2, replace the word media with the media type (mt) and the word VSN with the VSN of the damaged volume, which is the same VSN specified in Step 1.

d. Remove the pound sign from the beginning of the lines in the section shown in Step b.

The file should now look like CODE EXAMPLE 6-3.


CODE EXAMPLE 6-3 Example stageback.sh File - Edited
echo rearch $file
 
# Edit the following line for the correct media type and VSN
 
eval /opt/SUNWsamfs/bin/rearch -m media -v VSN $file

e. Save and quit the file.

f. Run the stageback.sh script.
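Substeps b through d can also be applied without an interactive editor. The following is a minimal sketch that assumes the script contains the two commented lines exactly as shown in CODE EXAMPLE 6-2; the function name enable_rearch is hypothetical. It reads the script on standard input and writes the edited copy to standard output, leaving the original file untouched.

```shell
#!/bin/sh
# enable_rearch MT VSN
# Hypothetical helper: uncomments the "echo rearch" and "eval ... rearch"
# lines of a stageback.sh-style script and substitutes the media type and
# VSN for the placeholder words "media" and "VSN".
enable_rearch() {
    mt=$1
    vsn=$2
    sed -e 's|^# *echo rearch \$file|echo rearch $file|' \
        -e "s|^# *eval \(.*/rearch\) -m media -v VSN |eval \1 -m $mt -v $vsn |"
}
```

A possible use, with lt and VSN001 as placeholder values, is enable_rearch lt VSN001 < stageback.sh > /var/tmp/stageback.sh, after which the copy can be inspected before it is run.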

Relabeled Tape Volume - No Other Copies Available

The Sun StorEdge SAM-FS software cannot read beyond the end-of-data (EOD) mark on a tape. If a tape is accidentally relabeled, the only possibility for recovering any data is to contact the tape manufacturer to determine if they offer a method for reading beyond EOD.

If the tape manufacturer can provide a mechanism for reading beyond EOD, you can recover the data by combining that process with the procedure for recovering files from a tape volume with a label not readable by the Sun StorEdge SAM-FS software. This procedure is described under Unreadable Tape Label - No Other Copies Available.

Unreadable Tape Label - No Other Copies Available

Whenever the Sun StorEdge SAM-FS software receives a request to mount a tape volume into a drive, one of the first actions taken is to verify the tape label written on the tape. If the tape label cannot be read, the Sun StorEdge SAM-FS software cannot use the tape for staging or archiving activities.

The tarback.sh(1M) script is used to recover data from a tape that has a label that cannot be read. The shell script automates the process of recovering data written to a tape, using the star(1M) command to read each archive file written on a specific tape volume. The file data is read back onto disk (into a Sun StorEdge QFS or UFS file system) as data. File data recovered in this manner can then be moved to the appropriate location in the Sun StorEdge QFS file system. It must then be archived as new data.


To Recover Files From a Tape Whose Label Is Unreadable

1. If you are using this process to recover file data from several tapes, disable any currently occurring recycling.

While recycling is in progress, data on the tape volumes may be inaccessible.

2. Use the cp(1) command to copy the tarback.sh file to a working location.

For example, the following command copies the script from the default location /opt/SUNWsamfs/examples/tarback.sh to /var/tarback.sh.


# cp /opt/SUNWsamfs/examples/tarback.sh /var/tarback.sh

3. Enter the samcmd(1M) command with the unavail option to make the tape drive unavailable.

To prevent the tape drive from being used for staging and archiving activities, use the syntax shown in the following screen example. Specify the Equipment Ordinal of the drive, as specified in the mcf(4) file, for eq.


# samcmd unavail eq

4. Edit the working copy of the tarback.sh(1M) script to specify the variables shown in the following table.


TABLE 6-1 Variables to Specify in the tarback.sh(1M) Script

Variable                    Definition

EQ="eq"                     The Equipment Ordinal of the tape drive as defined in the mcf file.

TAPEDRIVE="path"            The raw path to the device that is described by EQ=.

BLOCKSIZE="size"            The block size in 512-byte units. Specify 256 for a block size of 128 kilobytes.

MEDIATYPE="mt"              The two-character media type for this tape as defined in the mcf(4) man page.

VSN_LIST="vsn1 vsn2 ..."    The list of VSNs to be read. There is no limit on the number of VSNs that can be specified. Use a space character to separate the VSNs. This list can be continued onto another line by using a backslash (\) character. For example:

VSN_LIST="vsn1 vsn2 \
vsn3"


5. Execute the tarback.sh(1M) script.
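The variable settings from TABLE 6-1 might look like the following sketch. All values are hypothetical placeholders (the Equipment Ordinal, the raw device path, the media type, and the VSNs must be replaced with those of your own site), and the kb_to_blocksize helper is not part of tarback.sh; it only illustrates the arithmetic behind BLOCKSIZE, which is expressed in 512-byte units.

```shell
#!/bin/sh
# kb_to_blocksize KB
# Hypothetical helper: converts a tape block size in kilobytes to the
# number of 512-byte units expected by BLOCKSIZE.
kb_to_blocksize() {
    echo $(( $1 * 1024 / 512 ))
}

# Placeholder values only; substitute the values for your own drive.
EQ="51"                               # Equipment Ordinal from the mcf file
TAPEDRIVE="/dev/rmt/2cbn"             # raw device path for that drive (assumed)
BLOCKSIZE="$(kb_to_blocksize 128)"    # 128 kilobytes = 256 units of 512 bytes
MEDIATYPE="lt"                        # two-character media type (assumed)
VSN_LIST="VSN001 VSN002 \
VSN003"                               # continued with a backslash
```

Because the backslash-newline inside the double quotes is a line continuation, VSN_LIST ends up as a single space-separated list.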


Recovering Data From a Magneto-optical Volume

The procedures for recovering data from a magneto-optical volume differ depending on the nature of the damage and whether or not additional archive copies of the volume's files are present on another volume. This section describes how to recover data in the following scenarios:

See Damaged Magneto-optical Volume - Copies Available.

See Damaged Magneto-optical Volume - No Other Copies Available.

See Relabeled Magneto-optical Volume - No Other Copies Available.

See Unreadable Label - No Other Copies Available.

Damaged Magneto-optical Volume - Copies Available

Regardless of the nature of the damage to the magneto-optical volume, if an alternative archive copy is available, you should use the good magneto-optical volume as your primary set of archive copies.

The recovery procedure includes a step for rearchiving all archive copies currently stored on the damaged volume before dispensing with the damaged volume. The new archive copies are made from the available alternative archive copy.


To Rearchive Files and Recycle a Damaged Magneto-optical Volume - Copies Available

Use this procedure if readable alternative archive copies exist on volumes that are available on-site for staging.

1. Enter the samexport(1M) command to export the damaged volume from the magneto-optical library.

Use the syntax shown in the following screen example, specifying the media type (mt) and VSN (vsn) of the damaged volume.


# samexport mt.vsn

2. Enter the chmed(1M) command with the +U option to flag the damaged volume as unavailable in the historian catalog.

Use the syntax shown in the following screen example, specifying the media type (mt) and VSN (vsn) of the damaged volume.


# chmed +U mt.vsn

3. Enter the chmed(1M) command with the +c option to flag the unavailable volume for recycling.

Use the syntax shown in the following screen example, specifying the media type (mt) and the VSN (vsn) of the damaged volume.


# chmed +c mt.vsn

4. Edit the recycler.cmd(4) file to set the -ignore option for the library.

The following screen example shows the -ignore option set on the lt20 library.


CODE EXAMPLE 6-4 Example recycler.cmd File with the -ignore Option
# vi /etc/opt/SUNWsamfs/recycler.cmd
logfile = /var/adm/recycler.log
lt20 -hwm 75 -mingain 60 -ignore
:wq

5. Enter the sam-recycler(1M) command with the -x option.


# sam-recycler -x

When the recycler runs, it does not select any volumes for recycling other than the volume you have marked as unavailable. The recycler identifies all active archive copies on this volume and flags those archive copies for rearchiving. The next time the archiver runs, the archive copies marked for rearchiving are written to new volumes.

After the archive copies have been written to new volumes, the damaged volume you are recycling is considered to be drained of active archive copies.

6. Dispense with the volume.

After the damaged volume is drained of active archive copies, you can dispense with the volume. How you dispense with it depends on the nature of the damage. See the following guidelines:

If the magneto-optical volume is either partially corrupt or completely destroyed, it is possible (but not recommended) to reuse the magneto-optical label after the volume has been exported from the historian catalog.

If the magneto-optical volume is completely destroyed and no alternative archive copies exist, there is no chance for recovering any data from this magneto-optical platter.
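Before the recycler is run in Step 5, it can be worth confirming that the library line in recycler.cmd really carries the -ignore option set in Step 4. The following is a minimal sketch of such a check, assuming the file layout shown in CODE EXAMPLE 6-4 (library name first, options after it on the same line). The function name library_ignored is hypothetical.

```shell
#!/bin/sh
# library_ignored LIBRARY [FILE]
# Hypothetical helper: succeeds (exit status 0) only if a line of the
# recycler command file begins with LIBRARY and includes -ignore.
# FILE defaults to the path used in the examples in this chapter.
library_ignored() {
    awk -v lib="$1" '
        $1 == lib {
            for (i = 2; i <= NF; i++)
                if ($i == "-ignore") found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "${2:-/etc/opt/SUNWsamfs/recycler.cmd}"
}
```

For example, library_ignored lt20 && sam-recycler -x runs the recycler only when the lt20 line includes -ignore.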

Damaged Magneto-optical Volume - No Other Copies Available

If a magneto-optical volume is only partially corrupt, it is possible to recover data written to the parts of the magneto-optical volume that are not damaged. This process requires some trial and error to recover as much data as possible.

It is possible to determine the area of a magneto-optical volume that is damaged from errors logged in the device logs. By using file names for files that cannot be retrieved, you can determine the location of the damage using the position and offset data.

The archive_audit(1M) command audits all archive copies for a specific file system. The output of the archive_audit command includes the position and offset information for each archive copy. You can use this position and offset information to help determine which archive copies are written to an area of a damaged magneto-optical disk.


To Recover From a Damaged Magneto-optical Volume - No Other Copies Available

Copies of files that were archived outside the damaged area on a magneto-optical volume may be accessible. You can use the following procedure to recover files in accessible areas of a partially corrupted magneto-optical volume.

1. Use the archive_audit(1M) command to generate a list of all files with archive copies on the partially corrupt magneto-optical volume.

Use the syntax shown in the following screen example, specifying the file system's mount point, the VSN of the damaged volume, and an output file name.


# archive_audit /mount_point | grep vsn > filename

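2. Edit the archive_audit output file and create three separate files with the following contents:

Files with archive copies located before the damaged area

Files with archive copies located within the damaged area

Files with archive copies located after the damaged area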

3. Look for the files with archive copies within the damaged area of the magneto-optical disk to determine if any of the files are still in disk cache.

Files that are not in disk cache cannot be recovered.

4. Remove unrecoverable files from Step 2 from the file system.

5. Edit and run the stageback.sh script using the files created in Step 2 that list files outside the damaged area.

The stageback.sh script stages each file from archive_audit output, sets it to no-release, and marks the file for rearchiving.

For information about the stageback.sh script, see Chapter 1.

a. Open the /opt/SUNWsamfs/examples/stageback.sh file for editing.


# cd /opt/SUNWsamfs/examples
# vi stageback.sh

b. Find the section that begins with # echo rearch $file.


CODE EXAMPLE 6-5 Example stageback.sh File
# echo rearch $file
#
# Edit the following line for the correct media type and VSN
#
# eval /opt/SUNWsamfs/bin/rearch -m media -v VSN $file

c. In the section shown in CODE EXAMPLE 6-5, replace the word media with the media type and the word VSN with the same VSN specified in Step 1.

d. Remove the pound sign from the beginning of the lines in the section shown in Step b.


CODE EXAMPLE 6-6 Example stageback.sh File - Edited
echo rearch $file
 
# Edit the following line for the correct media type and VSN
 
eval /opt/SUNWsamfs/bin/rearch -m media -v VSN $file

e. Save and quit the file.

f. Run the stageback.sh script.
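The sorting of archive_audit output in Step 2 can be sketched as follows. This is an illustration under stated assumptions, not a documented tool: it assumes that one whitespace-separated field of each line holds the archive copy location as hexadecimal position.offset, and the field number must be supplied by the caller after checking the actual output on the system. The function name split_audit and the output file suffixes are hypothetical. Lines are separated into copies located before, within, and after a damaged range of positions.

```shell
#!/bin/sh
# split_audit FIRST LAST FIELD PREFIX
# Hypothetical helper: reads archive_audit-style lines on standard input.
# FIRST and LAST are the first and last damaged positions (hexadecimal),
# FIELD is the number of the field assumed to hold position.offset, and
# PREFIX names the output files PREFIX.before, PREFIX.within, PREFIX.after.
split_audit() {
    awk -v lo="$1" -v hi="$2" -v f="$3" -v prefix="$4" '
        # Convert a hexadecimal string to a number using only POSIX awk.
        function hex2dec(h,    i, n) {
            h = tolower(h)
            n = 0
            for (i = 1; i <= length(h); i++)
                n = n * 16 + index("0123456789abcdef", substr(h, i, 1)) - 1
            return n
        }
        BEGIN { lo = hex2dec(lo); hi = hex2dec(hi) }
        {
            split($f, part, ".")        # part[1] is the position
            pos = hex2dec(part[1])
            if (pos < lo)      out = prefix ".before"
            else if (pos > hi) out = prefix ".after"
            else               out = prefix ".within"
            print > out
        }
    '
}
```

The .before and .after files then serve as input for the stageback.sh step, and the .within file lists the candidates to check against disk cache in Step 3.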

Relabeled Magneto-optical Volume - No Other Copies Available

Unlike tape media, magneto-optical media does not have an EOD marker. When a magneto-optical volume is accidentally relabeled, the Sun StorEdge SAM-FS software cannot access data written previously because of the label date. The software assumes that data is no longer accessible if the label date on the magneto-optical volume is newer than the archive copy date of the files.

Contact Sun Microsystems customer support if a magneto-optical volume is accidentally relabeled. It is sometimes possible to recover some of this data with a special (but unsupported) samst driver that ignores the magneto-optical label date. This driver is not a standard part of the Sun StorEdge SAM-FS product, and it is not released as part of the product. It can only be made available by Sun's customer support.

Unreadable Label - No Other Copies Available

For magneto-optical media, there is no standard Solaris approach for locating and skipping to the various tar(1) files. Contact Sun Microsystems customer support if you need to access files on a magneto-optical volume with an unreadable label.