StorageTek Storage Archive Manager and StorageTek QFS Software File System Recovery Guide
Release 5.4
E42065-02

1 Introduction

This document outlines the steps that you should take to recover files and file systems that have been lost or corrupted due to hardware failure, misconfiguration, human error, or the physical destruction of facilities and equipment. Properly configured SAM-QFS file systems are extremely robust. But the steps that you need to take during recovery, and your probability of success, depend on your degree of preparedness. So this introduction begins with an overview of the recovery process. Then it reviews the data- and file system-protection measures that Oracle recommends. Finally, it outlines the steps open to you based on the preparations that you have made and the resources that you currently have available.

Failure and Recovery Scenarios

The scope of a file-system failure and the nature of the required recovery actions depend on the nature of the underlying problem. For example:

  • If the server host fails, the SAM-QFS software and file-system configurations may be lost, leaving file system data and metadata intact but inaccessible until the configuration information is restored.

    Once the underlying hardware problem has been addressed and the operating system has been restored, the SAM-QFS configurations can be restored by reinstalling the software and installing backup copies of the configuration files and any custom scripts. In this situation, follow the procedures outlined in Chapter 3, "Restoring the SAM-QFS Configuration".

  • If an administrator inadvertently deletes or misconfigures one or more configuration files, library catalogs, scripts, or crontab entries, access to one or more file systems may be lost along with some or all software functionality.

    The SAM-QFS configurations can be restored by reinstalling backup copies of the affected configuration files and/or scripts. Follow the procedures outlined in Chapter 3, "Restoring the SAM-QFS Configuration".

  • If a disk or RAID group that provides the disk cache for the data in a standalone (non-archiving) file system fails, all files in the disk cache are lost.

    Once the hardware problem has been addressed, lost files can be restored from QFS backup copies. See "Recovering Files Using a Recovery Point File".

  • If a disk or RAID group that provides the disk cache for the data in an archiving file system fails, all files in the disk cache are lost.

    Once the hardware problem has been addressed, lost files can be restored from archived copies or SAM-QFS backup files. See "Recovering Files Using a Recovery Point File" and "Recovering Files Using Archiver Log Entries".

  • If the disks that store file system metadata fail, the file system is lost and the data is no longer readily accessible.

    Once the hardware problem has been addressed, metadata can be restored from backup files. If the metadata for an archiving file system was not backed up, it can be reconstructed from backup copies of the archiver log file. See Chapter 5, "Recovering Lost and Damaged Files".

  • If an administrator inadvertently formats the disk partitions that host a SAM-QFS file system or issues the sammkfs command against existing SAM-QFS partitions, all files and metadata are lost.

    Metadata can be restored from backup files or reconstructed from the archiver log of an archiving file system. Data can be restored from archival media or from a backup file. See Chapter 5, "Recovering Lost and Damaged Files".
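
The scenarios above amount to a lookup from the kind of failure to the part of this guide that covers it. The following Python sketch restates that mapping for quick reference. The scenario keys and the function name are hypothetical labels invented for this illustration; only the chapter and section titles come from the text above.

```python
from typing import Dict, List

# Hypothetical scenario labels (not SAM-QFS terminology) mapped to the
# chapters and sections that the list above points to.
RECOVERY_REFERENCES: Dict[str, List[str]] = {
    "server_host_failure": [
        'Chapter 3, "Restoring the SAM-QFS Configuration"',
    ],
    "configuration_deleted_or_misconfigured": [
        'Chapter 3, "Restoring the SAM-QFS Configuration"',
    ],
    "disk_cache_failure_standalone": [
        '"Recovering Files Using a Recovery Point File"',
    ],
    "disk_cache_failure_archiving": [
        '"Recovering Files Using a Recovery Point File"',
        '"Recovering Files Using Archiver Log Entries"',
    ],
    "metadata_disk_failure": [
        'Chapter 5, "Recovering Lost and Damaged Files"',
    ],
    "partitions_formatted_or_sammkfs_rerun": [
        'Chapter 5, "Recovering Lost and Damaged Files"',
    ],
}


def recovery_references(scenario: str) -> List[str]:
    """Return the guide sections to consult for a given failure scenario."""
    if scenario not in RECOVERY_REFERENCES:
        known = ", ".join(sorted(RECOVERY_REFERENCES))
        raise ValueError(f"Unknown scenario {scenario!r}; expected one of: {known}")
    return RECOVERY_REFERENCES[scenario]


if __name__ == "__main__":
    for section in recovery_references("disk_cache_failure_archiving"):
        print(section)
```

Note that only the archiving disk-cache scenario lists two sections, because archived copies give that file system a second recovery path that a standalone file system lacks.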

Recommended Preparations

In the StorageTek Storage Archive Manager and StorageTek QFS Installation and Configuration Guide, Oracle recommends that you include the following configuration, file-system, and data backup steps in your initial configuration:

  • Store critical data in SAM-QFS archiving file systems and archive at least two copies of the file data, at least one of which should be on removable media, such as magnetic tape. If possible, configure disk archives on independent file systems that do not share physical devices with the disk cache of the archiving file system.

  • Store file-system metadata on highly redundant, mirrored storage.

  • Regularly back up SAM-QFS file systems with recovery point files. A recovery point file stores file-system metadata and, optionally, data, so that files or entire file systems can be restored.

    If you have the Storage Archive Manager software installed, you create recovery point files by running the samfsdump command. If you have only the QFS file system software, you use the qfsdump command. Using either command on its own backs up the metadata. Using either command with the -U option backs up data as well as metadata. This option is mainly useful for protecting file systems that are not archived to removable media. You can run the dump commands from the command line or from the SAM-QFS Manager graphical user interface.

  • Configure the host to automatically save SAM-QFS metadata recovery point files, either by using entries in the crontab file or by using the scheduling feature of the SAM-QFS Manager.

  • Configure the host to automatically save SAM-QFS archiver log information by creating entries in the system crontab file.

    For each file that is archived with the Storage Archive Manager software, the archiver log file records the file's name and location (path) within the file system, the name of archive (tar) files that hold copies, the removable media volumes that hold the archive files, and the positions of the archive files on the media.

  • Select a secure storage location for all SAM-QFS recovery information, including backed up configuration files, recovery points, and archiver logs. Select an independent file system that will mount on the SAM-QFS file system host but does not share any physical devices with the archiving file system.

    Do not store disaster recovery resources in the file system that they are meant to protect. Do not use logical devices, such as partitions or LUNs, that reside on physical devices that also host the archiving file system.
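
As an illustration of the recovery point recommendations above, the following Python sketch assembles a crontab entry that runs samfsdump or qfsdump on a schedule, and refuses a dump directory that lies inside the file system being protected. It is a sketch under stated assumptions, not a definitive configuration: the install directory /opt/SUNWsamfs/sbin, the mount point /samqfs1, the dump directory /zfs1/sam_config/dumps, the schedule, and all function names are examples chosen for this illustration. The -f option (which names the dump file) should be checked against the man pages for your release. The path check compares names only; it cannot detect a storage location that shares physical devices with the archiving file system.

```python
from pathlib import PurePosixPath

# Assumed install directory for the dump commands; verify it on your host.
SBIN_DIR = "/opt/SUNWsamfs/sbin"


def is_inside(path: str, parent: str) -> bool:
    """Return True if path is parent itself or lies beneath it (by name only)."""
    p, q = PurePosixPath(path), PurePosixPath(parent)
    return p == q or q in p.parents


def dump_command(archiving: bool, include_data: bool = False) -> str:
    """Pick samfsdump (Storage Archive Manager installed) or qfsdump (QFS only)."""
    name = "samfsdump" if archiving else "qfsdump"
    command = f"{SBIN_DIR}/{name}"
    # -U backs up file data as well as metadata, as described above.
    return command + " -U" if include_data else command


def recovery_point_cron_line(mount_point: str, dump_dir: str,
                             archiving: bool = True, include_data: bool = False,
                             minute: int = 10, hour: int = 2) -> str:
    """Build one crontab line that writes a dated recovery point file."""
    if is_inside(dump_dir, mount_point):
        raise ValueError(
            "Do not store recovery points in the file system they protect: "
            f"{dump_dir} is inside {mount_point}"
        )
    # Percent signs are special in crontab entries, so they are escaped.
    date_stamp = "`/usr/bin/date +\\%y\\%m\\%d`"
    prefix = "samfs" if archiving else "qfs"
    dump_file = f"{dump_dir}/{prefix}-{date_stamp}"
    command = dump_command(archiving, include_data)
    return f"{minute} {hour} * * * ( cd {mount_point}; {command} -f {dump_file} )"


if __name__ == "__main__":
    # Example values only; substitute your own mount point and dump directory.
    print(recovery_point_cron_line("/samqfs1", "/zfs1/sam_config/dumps"))
```

Generating the entry from a function rather than typing it by hand keeps the storage-location check next to the schedule, so a dump directory inside the protected file system is rejected before the entry ever reaches the crontab file.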