StorageTek Storage Archive Manager and StorageTek QFS Software File System Recovery Guide
Release 5.4
E42065-02

2 Stabilizing the Situation

Whenever you are faced with recovering from a significant file-system failure or potential data loss, your first step should be to stabilize the affected systems, minimize the chances of further losses, and preserve diagnostic information where possible. If SAM-QFS hosts remain viable and normal operations continue during an abnormal situation, additional damage can be done inadvertently. This chapter outlines some basic steps that help to avoid this: stopping archiving and recycling processes, preserving unarchived data, and preserving configuration and state information.

Stopping Archiving and Recycling Processes

Whenever you are restoring an archiving file system and have to recover significant numbers of files, your first step should be to stop the archiving and recycling processes, either for the file system or for the entire system. You want to stabilize and isolate the archive until you have assessed the situation and, ideally, restored everything to normal. Otherwise, ongoing archiving and recycling operations can, in some situations, make matters worse by propagating corrupted files and/or by recycling remaining archive copies that contain valid data.

So, whenever possible, take the precautions described below: first stop archiving, then stop recycling.

Once recovery operations are complete, you can reverse the changes below and restore normal file system behavior.

Stop Archiving

  1. Log in to the file-system metadata server as root.

    root@solaris:~# 
    
  2. Open the /etc/opt/SUNWsamfs/archiver.cmd file in a text editor, and scroll down to the first fs (file-system) directive.

    In the example, we use the vi editor:

    root@solaris:~# vi /etc/opt/SUNWsamfs/archiver.cmd
    # Configuration file for SAM-QFS archiving file systems
    #-----------------------------------------------------------------------
    # General Directives
    archivemeta = off
    examine = noscan
    #-----------------------------------------------------------------------
    # Archive Set Assignments 
    fs = samqfs1
    logfile = /var/adm/samqfs1.archive.log
    all .
        1 -norelease 15m
        2 -norelease 15m
    fs = samqfs2
    logfile = /var/adm/samqfs2.archive.log
    all .
    ...
    
  3. To stop archiving on all file systems, add a wait directive to the file immediately before the first fs directive. Save the file, and close the editor.

    In the example, we insert the wait directive just before the directive for the samqfs1 file system, where it will apply to all file systems configured for archiving:

    root@solaris:~# vi /etc/opt/SUNWsamfs/archiver.cmd
    ...
    #-----------------------------------------------------------------------
    # Archive Set Assignments
    wait
    fs = samqfs1
    logfile = /var/adm/samqfs1.archive.log
    all .
        1 -norelease 15m
        2 -norelease 15m
    fs = samqfs2
    ...
    :wq
    root@solaris:~# 
    
  4. If you need to stop archiving on only one file system, add a wait directive to the file immediately after the corresponding fs directive. Save the file, and close the editor.

    In the example, we stop archiving activity on the samqfs1 file system:

    root@solaris:~# vi /etc/opt/SUNWsamfs/archiver.cmd
    ...
    #-----------------------------------------------------------------------
    # Archive Set Assignments
    fs = samqfs1
    wait
    logfile = /var/adm/samqfs1.archive.log
    all .
        1 -norelease 15m
        2 -norelease 15m
    fs = samqfs2
    ...
    :wq
    root@solaris:~# 
    
  5. Next, Stop Recycling.
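
The edits in steps 3 and 4 can also be scripted. The following is a minimal sketch, assuming a POSIX shell and awk. The function names add_global_wait and add_fs_wait are hypothetical helpers, not part of SAM-QFS. Both print the modified text to standard output and leave the input file untouched, so you can review the result before installing it. Neither checks whether a wait directive is already present.

```shell
#!/bin/sh
# Hypothetical helpers (not part of SAM-QFS): print a copy of an
# archiver.cmd file with a wait directive added. The input file is
# never modified; review the output before installing it.

# Insert "wait" immediately before the first fs directive,
# where it applies to all file systems.
add_global_wait() {
    awk '
        !done && /^[ \t]*fs[ \t]*=/ { print "wait"; done = 1 }
        { print }
    ' "$1"
}

# Insert "wait" immediately after the fs directive for one file system.
# $1 is the archiver.cmd file, $2 is the file system name.
add_fs_wait() {
    awk -v fsname="$2" '
        { print }
        /^[ \t]*fs[ \t]*=/ {
            name = $0
            sub(/^[ \t]*fs[ \t]*=[ \t]*/, "", name)   # drop "fs ="
            sub(/[ \t]*$/, "", name)                  # drop trailing blanks
            if (name == fsname) print "wait"
        }
    ' "$1"
}

# Example (the output path is an assumption):
# add_fs_wait /etc/opt/SUNWsamfs/archiver.cmd samqfs1 > /tmp/archiver.cmd.new
```

Writing to a separate file keeps the original archiver.cmd available, which makes it easier to reverse the change once recovery is complete.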

Stop Recycling

  1. Log in to the file-system metadata server as root.

    root@solaris:~# 
    
  2. Open the /etc/opt/SUNWsamfs/recycler.cmd file in a text editor.

    In the example, we use the vi editor:

    root@solaris:~# vi /etc/opt/SUNWsamfs/recycler.cmd
    # Configuration file for SAM-QFS archiving file systems
    #-----------------------------------------------------------------------
    logfile = /var/adm/recycler.log
    no_recycle tp VOL[0-9][2-9][0-9]
    library1 -hwm 95 -mingain 60
    
  3. Add the -ignore parameter to the recycling directive for each media library. Then save the file, and close the editor.

    In the example, we have only one library in the SAM-QFS configuration, library1:

    root@solaris:~# vi /etc/opt/SUNWsamfs/recycler.cmd
    # Configuration file for SAM-QFS archiving file systems
    #-----------------------------------------------------------------------
    logfile = /var/adm/recycler.log
    no_recycle tp VOL[0-9][2-9][0-9]
    library1 -hwm 95 -mingain 60 -ignore
    :wq
    root@solaris:~# 
    
  4. If you are recovering from loss or damage to one or more archiving file systems, Back Up Unarchived Files before proceeding.

  5. If you are recovering from a server problem or from loss or damage to file systems, Save the SAM-QFS Configuration before proceeding.

  6. If you need to restore directories and files, decide whether you need to Save the SAM-QFS Configuration or go directly to Chapter 5, "Recovering Lost and Damaged Files".
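
Step 3 of this procedure can be sketched as a small filter. This is a minimal example, assuming a POSIX shell and awk. The function name add_ignore is hypothetical, and you must supply the library name yourself, because the sketch does not try to work out which lines of recycler.cmd are library directives. It prints the result to standard output and does not modify the input file.

```shell
#!/bin/sh
# Hypothetical helper (not part of SAM-QFS): print a copy of a
# recycler.cmd file with -ignore appended to the directive for one
# library. $1 is the recycler.cmd file, $2 is the library name.
add_ignore() {
    awk -v lib="$2" '
        $1 == lib {
            # Leave the line alone if -ignore is already present.
            if ($0 !~ /(^|[ \t])-ignore([ \t]|$)/) {
                sub(/[ \t]+$/, "")      # trim trailing blanks
                $0 = $0 " -ignore"
            }
        }
        { print }
    ' "$1"
}

# Example (the output path is an assumption):
# add_ignore /etc/opt/SUNWsamfs/recycler.cmd library1 > /tmp/recycler.cmd.new
```

Run the function once per library if the configuration contains more than one.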

Preserving Unarchived Data

Unarchived files may remain in the disk cache of a damaged archiving file system. No copies of these files exist in the archive. So, if we can, we need to back them up to a recovery point file before proceeding. Proceed as follows:

Back Up Unarchived Files

  1. If possible, log in to the file-system metadata server as root.

    root@solaris:~# 
    
  2. Select a safe location where the recovery point will be stored.

    In the example, we create a subdirectory, unarchived/, under the directory that we created for recovery points when we configured file systems (see the StorageTek Storage Archive Manager and StorageTek QFS Installation and Configuration Guide).

    root@solaris:~# mkdir /zfs1/samqfs_recovery/unarchived/
    root@solaris:~# 
    
  3. Change to the file system's root directory.

    In the example, we change to the mount-point directory /samqfs1:

    root@solaris:~# cd /samqfs1
    root@solaris:~# 
    
  4. Back up any unarchived files that remain in the disk cache. Use the command samfsdump -u -f recovery-point, where recovery-point is the path and file name of the output file.

    The -u option causes the samfsdump command to back up any data files that have not been archived. In the example, we save the recovery point file 20140404 to the remote directory /zfs1/samqfs_recovery/unarchived/:

    root@solaris:~# samfsdump -u -f /zfs1/samqfs_recovery/unarchived/20140404
    root@solaris:~# 
    
  5. If you are recovering from a server problem or from loss or damage to file systems, Save the SAM-QFS Configuration before proceeding.

  6. If you need to restore directories and files, decide whether you need to Save the SAM-QFS Configuration or go directly to Chapter 5, "Recovering Lost and Damaged Files".
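
Steps 2 through 4 can be combined in a small wrapper. The following is a minimal sketch, assuming a POSIX shell. The function name backup_unarchived and the RUN variable are hypothetical conveniences, not part of SAM-QFS. Setting RUN=echo gives a dry run that prints the commands instead of executing them. The only SAM-QFS command used is samfsdump -u -f, as shown in the procedure.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of SAM-QFS) around steps 2 through 4.
# Set RUN=echo for a dry run that only prints the commands.
#   $1  mount point of the damaged file system, e.g. /samqfs1
#   $2  directory for the recovery point file
#   $3  recovery point file name (defaults to today's date, YYYYMMDD)
backup_unarchived() {
    mount_point=$1
    recovery_dir=$2
    stamp=${3:-$(date +%Y%m%d)}
    ${RUN:-} mkdir -p "$recovery_dir" &&
    (
        # The subshell keeps the caller's working directory unchanged.
        ${RUN:-} cd "$mount_point" &&
        ${RUN:-} samfsdump -u -f "$recovery_dir/$stamp"
    )
}

# Example:
# RUN=echo
# backup_unarchived /samqfs1 /zfs1/samqfs_recovery/unarchived 20140404
```

A dry run is a cheap way to confirm the paths before running anything on a damaged system.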

Preserving Configuration and State Information

Even when you have safely stored backup copies of all configuration files and scripts needed for restoring the SAM-QFS software and file systems, it pays to preserve the current state of a failed system, if you can. Surviving configuration files and scripts may contain changes that were implemented since the complete configuration was last backed up. This can mean the difference between restoring the system to almost its exact pre-failure state and merely getting close. Log and trace files contain information that helps restore files and clarifies the causes of failures. For this reason, you should preserve whatever remains before you do anything else.

Save the SAM-QFS Configuration

  1. If possible, log in to the file-system metadata server as root.

    root@solaris:~# 
    
  2. If you can run the samexplorer command, create a SAMreport and save it in the directory that holds your backup configuration information. Use the command samexplorer path/hostname.YYYYMMDD.hhmmz.tar.gz, where path is the path to the chosen directory, hostname is the name of the SAM-QFS file system host, and YYYYMMDD.hhmmz is a date and time stamp.

    The default file name is /tmp/SAMreport.hostname.YYYYMMDD.hhmmz.tar.gz. In the example, we already have a directory for saving SAMreports, /zfs1/sam_config/explorer/, so we create the report in this directory:

    root@solaris:~# samexplorer /zfs1/sam_config/explorer/samhost1.20140430.1659MST.tar.gz
         Report name:     /zfs1/sam_config/explorer/samhost1.20140430.1659MST.tar.gz
         Lines per file:  1000
         Output format:   tar.gz (default) Use -u for unarchived/uncompressed.
     
         Please wait.............................................
         Please wait.............................................
         Please wait......................................
     
         The following files should now be ftp'ed to your support provider
         as ftp type binary.
     
         /zfs1/sam_config/explorer/samhost1.20140430.1659MST.tar.gz
    
  3. Copy as many of the SAM-QFS configuration files as you can to another file system. These include the following:

    /etc/opt/SUNWsamfs/
         mcf
         archiver.cmd
         defaults.conf 
         diskvols.conf 
         hosts.family-set-name
         hosts.family-set-name.local
         preview.cmd
         recycler.cmd
         releaser.cmd
         rft.cmd
         samfs.cmd
         stager.cmd
         inquiry.conf
         samremote                  # SAM-Remote server configuration file
         family-set-name            # SAM-Remote client configuration file
         network-attached-library   # Parameters file
         scripts/*                  # Back up all locally modified files
    /var/opt/SUNWsamfs/
    
  4. Back up all surviving library catalog data, including that maintained by the historian, if possible. For each catalog, use the command /opt/SUNWsamfs/sbin/dump_cat -V catalog-file, where catalog-file is the path and name of the catalog file. Redirect the output to dump-file, in a new location.

    We will use the output of the dump_cat command to rebuild the catalogs on a replacement system, using the command build_cat. In the example, we dump the catalog data for library1 to the file library1cat.dump in a directory on the independent NFS-mounted file system zfs1:

    root@solaris:~# dump_cat -V /var/opt/SUNWsamfs/catalog/library1cat > \
    /zfs1/sam_config/20140513/catalogs/library1cat.dump
    
  5. Copy system configuration files that were modified during SAM-QFS installation and configuration. These may include:

    /etc/
         syslog.conf
         system
         vfstab
    /kernel/drv/
         sgen.conf
         samst.conf
         samrd.conf
         sd.conf
         ssd.conf
         st.conf
    /usr/kernel/drv/dst.conf
    
  6. Copy any custom shell scripts and crontab entries that you created as part of the SAM-QFS configuration to the selected subdirectory.

    For example, if you created a crontab entry to manage creation of recovery points, you would save a copy now.

  7. Record the revision level of the currently installed software, including Oracle SAM-QFS, Solaris, and Solaris Cluster (if applicable), and save a copy of the information in a readme file in the chosen subdirectory.

  8. In the chosen subdirectory, save copies of downloaded Oracle SAM-QFS, Solaris, and Solaris Cluster packages so that you can restore the software quickly, should it become necessary.

  9. If you are recovering from the loss of a SAM-QFS server host, go to Chapter 3, "Restoring the SAM-QFS Configuration".

  10. If you need to restore one or more SAM-QFS file systems, go to Chapter 4, "Recovering File Systems".

  11. If you need to restore directories and files, go to Chapter 5, "Recovering Lost and Damaged Files".
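
The file copies in steps 3 and 5 can be sketched as a helper that copies whatever survives and reports what is missing. This is a minimal example, assuming a POSIX shell. The function name save_files and the destination subdirectory in the usage comment are hypothetical. Because a failed system may have lost some files, the function reports a missing file on standard error and continues rather than stopping.

```shell
#!/bin/sh
# Hypothetical helper (not part of SAM-QFS): copy whichever of the
# named files still exist under a source directory to a backup
# directory. Missing files are reported on stderr and skipped.
#   $1       source directory, e.g. /etc/opt/SUNWsamfs
#   $2       destination directory on another file system
#   $3 ...   file or directory names relative to the source directory
save_files() {
    src=$1
    dest=$2
    shift 2
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        if [ -e "$src/$f" ]; then
            # -R copies directories such as scripts; -p keeps
            # modification times and permissions.
            cp -Rp "$src/$f" "$dest/" && echo "saved $src/$f"
        else
            echo "missing $src/$f" >&2
        fi
    done
}

# Example (the destination path is an assumption):
# save_files /etc/opt/SUNWsamfs /zfs1/sam_config/20140513/etc_opt_SUNWsamfs \
#     mcf archiver.cmd defaults.conf diskvols.conf preview.cmd recycler.cmd \
#     releaser.cmd rft.cmd samfs.cmd stager.cmd inquiry.conf scripts
```

The list of missing files on standard error doubles as a record of what will have to be restored from older backups.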