CHAPTER 4
Backing Up Data
This chapter describes the backup and dump processes and provides the information you need to keep your data safe and to prepare for any disaster. For more information on planning for disaster recovery, see Planning for Disaster Recovery.
This chapter contains the following sections:
TABLE 4-1 shows common causes of data loss, with notes and suggestions about how to avoid or respond to each type of loss.
Sun StorEdge QFS file systems are protected from unauthorized access by the UNIX superuser mechanism. You can also restrict administrative actions to an optional administrative group.
File systems can be made unavailable by any of the following:
Rebuild the file system only after verifying that a configuration problem is not the cause of the apparent failure. See the following:
Hardware RAID used for disk storage system management has the following advantages over software RAID:
Use hardware RAID disk storage systems wherever possible. Unmount the file system and use samfsck(1M) to check and fix file system consistency problems. See To Troubleshoot an Inaccessible File System for an example. Also see Recovering From Catastrophic Failure.
Some apparent data losses are actually caused by cabling problems or configuration changes. Be sure to eliminate the fundamental causes for a failure before commencing a data recovery process. Back up anything you change before you change it, if possible.
Caution - Do not reformat a disk, relabel a tape, or make other irreversible changes until you are convinced that the data on the disk or tape is completely unrecoverable.
To Troubleshoot an Inaccessible File System
1. Check cables and terminators.
2. If you cannot read a tape or magneto-optical cartridge, try cleaning the heads in the drive, or try reading the cartridge in a different drive.
3. Check the current state of your hardware configuration against the documented hardware configuration.
Go to Step 4 only when you are certain that a configuration error is not to blame.
4. Unmount the file system, and run samfsck(1M).
5. If you find the file system is still inaccessible, follow the procedures in the other chapters in this manual to restore the file system.
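Step 4 can be sketched as the following command sequence. The mount point /sam1 and the family set name samfs1 are example names only, and samfsck should be run only after the configuration checks in Steps 1 through 3:

```
# umount /sam1
# samfsck -F samfs1
```

In this sketch, the -F option asks samfsck to repair the inconsistencies it finds rather than only report them; run the command without -F first if you want a nondestructive check.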
The following sections provide information about some of the commands and tools you can use to back up your data.
TABLE 4-2 summarizes the commands used most frequently in disaster recovery efforts.
For more information about these commands, see their man(1) pages. Other scripts and helpful sample files are located in /opt/SUNWsamfs/examples or are available from Sun Microsystems.
TABLE 4-3 describes some disaster recovery utilities in the /opt/SUNWsamfs/examples directory and explains their purpose. You must modify all of the listed shell scripts, except for recover.sh(1M), to suit your configuration before using them. See the comments in the files.
Executable shell script that stages all files and directories that were online at the time a samfsdump(1M) command was run. This script requires a log file generated by samfsrestore(1M) to be used as input. Modify the script as instructed in the comments in the script. See also the restore.sh(1M) man page. NOTE: If this script is used in a SAM-QFS shared environment, it must be run on the metadata server, not on one of the clients.
Executable shell script that recovers files from tape, using input from the archiver log file. If used with SAM-Remote clients or server, the recovery must be performed on the server to which the tape library is attached. For more information about this script, see the recover.sh(1M) man page and the comments in the script itself. Also see Using Archiver Logs.
Executable shell script that stages files that have been archived on accessible areas of a partially damaged tape. Modify the script as instructed in the script's comments. For information about how the script is used, see Damaged Tape Volume With No Other Copies Available.
Executable shell script that recovers files from tapes by reading each tar(1) file. Modify the script as instructed in the script's comments. For more information about this script, see the tarback.sh(1M) man page. See also Unreadable Tape Label With No Other Copies Available.
The samexplorer(1M) script (called info.sh in software versions before 4U1) creates a file containing all the configuration information needed for complete reconstruction of a SAM-QFS installation should you ever need to rebuild the system. You can use the crontab(1) command with the -e option to create a cron(1M) job to run the samexplorer script at desired intervals. The script writes the reconfiguration information to /tmp/SAMreport.
Although the /opt/SUNWsamfs/sbin/samexplorer script is not a backup utility, it should be run whenever changes are made to the system's configuration.
Make sure that the SAMreport file is moved from the /tmp directory after creation to a fixed disk that is separate from the configuration files and outside the SAM-QFS environment. For more information about managing the SAMreport file, see the samexplorer(1M) man page.
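As a sketch, a root crontab entry along the following lines runs samexplorer weekly and copies the report off /tmp. The schedule, the destination directory /export/samreports, and the date-stamped file name are assumptions to adapt to your site:

```
0 3 * * 0  /opt/SUNWsamfs/sbin/samexplorer && cp /tmp/SAMreport /export/samreports/SAMreport.`date +\%y\%m\%d`
```

Note that percent signs must be escaped with a backslash inside a crontab command field, and the entry must be on a single line in the crontab file.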
TABLE 4-4 lists the files that should be backed up and the recommended frequency of backups to a location outside the file system environment.
Except where specified otherwise, use whatever backup procedures you choose.
Site-modified versions of file system backup and restoration shell scripts.
See the default scripts listed in Files Requiring Backup.
Site-created shell scripts and cron(1) jobs created for backup and restoration.
See the samexplorer script and SAMreport output file described in The samexplorer Script.
Sun StorEdge QFS metadata and data (see Metadata Used in Disaster Recovery for definitions).
Regularly, at intervals determined by individual site requirements.
Files altered after qfsdump(1M) is run cannot be recovered by qfsrestore(1M), so take dumps frequently. For more information, see Metadata Used in Disaster Recovery.
SAM-QFS metadata (see Metadata Used in Disaster Recovery for definitions).
Regularly, at intervals determined by individual site requirements.
Use the samfsdump(1M) command to back up metadata. Files altered after samfsdump is run cannot be recovered by samfsrestore(1M), so take dumps frequently, or at least save the inode information frequently. For more information, see Backing Up the Metadata in SAM-QFS File Systems.
Regularly, at intervals determined by individual site requirements.
Back up all library catalog files, including the historian file. Library catalogs for each automated library, for each pseudolibrary on Sun SAM-Remote clients, and for the historian (for cartridges that reside outside the automated libraries) are in /var/opt/SUNWsamfs/catalog.
Archiver log files from a SAM-QFS file system where the archiver is being used.
Regularly, at intervals determined by individual site requirements.
Specify a path name and name for an archiver log file in the archiver.cmd file, and back up the archiver log file. See the archiver.cmd(4) man page for instructions on specifying an archiver log file for each file system. Also see Using Archiver Logs.
Configuration files and other similar files modified at your site. Note that these reside outside the SAM-QFS file system.
The following files may be created at your site in the /etc/opt/SUNWsamfs directory:
If using network attached libraries, be sure to back up the configuration files. The exact names of the files are listed in the Equipment Identifier field of the /etc/opt/SUNWsamfs/mcf file on each line that defines a network attached robot. See the mcf(4) man page for more details.
If using Sun SAM-Remote software, be sure to back up the configuration files. The exact names of the files are listed in the Equipment Identifier field of the /etc/opt/SUNWsamfs/mcf file on each line that defines a Sun SAM-Remote client or server. See the mcf(4) man page for more details.
The following files are created by the software installation process. If you have made local modifications, preserve (or back up) these files:
/etc/opt/SUNWsamfs/inquiry.conf
/opt/SUNWsamfs/sbin/ar_notify.sh
/opt/SUNWsamfs/sbin/dev_down.sh
/opt/SUNWsamfs/sbin/recycler.sh
/kernel/drv/samst.conf
The following files are modified as part of the software installation process:
/kernel/drv/sd.conf
/kernel/drv/ssd.conf
/kernel/drv/st.conf
/usr/kernel/drv/dst.conf
Back up these files so that you can restore them if any of them are lost or if the Solaris Operating System (OS) is reinstalled.
The Sun StorEdge QFS and Sun StorEdge SAM software can be reinstalled easily from the release package and patches. Make sure you have a record of the revision level of the currently running software. If the software is on a CD-ROM, store the CD-ROM in a safe place. If you download the software from the Sun Download Center, back up the downloaded packages and patches. Keeping a backup copy saves time if you have to reinstall the software, because you avoid downloading a new copy after losing data.
The Solaris OS can be reinstalled easily from the CD-ROM, but make sure you have a record of all installed patches. This information is captured in the SAMreport file generated by the samexplorer(1M) script; this script is described under The samexplorer Script. This information is also available from the Sun Explorer tool. |
For SAM-QFS file systems, you should have the following in place in anticipation of needing them for disaster recovery:
The effectiveness of any of the SAM-QFS recovery methods relies primarily on frequent archiving.
See Guidelines for Performing Metadata Dumps.
See Metadata Used in Disaster Recovery.
If recent metadata is not available, archiver logs can help you re-create the file system directly from archive media.
See Using Archiver Logs.
In addition, consider the following questions when preparing your site's disaster recovery plan:
See the Sun StorEdge QFS Installation and Upgrade Guide for how to back up Sun StorEdge QFS metadata.
The samfsdump(1M) command with the -u option dumps file data for files that do not have a current archive copy. The dump files are substantially larger with the -u option than without it, and the command takes longer to complete. However, restoring the output of samfsdump with -u returns the file system to the exact state it was in when the dump was taken.
The samfsdump(1M) command without the -u option generates a metadata dump file. A metadata dump file is relatively small, so you should be able to store many more metadata dump files than data dump files. Restoration of the output of samfsdump without the -u option is quicker than with the -u option, because the data is not restored until accessed by a user.
Retain enough data and metadata to ensure that you can restore the file systems according to your site's needs. The appropriate number of dumps to save depends, in part, on how actively the system administrator monitors the dump output. If an administrator is monitoring the system daily to make sure the samfsdump(1M) or qfsdump(1M) dumps are succeeding and that there are enough tapes available, as well as investigating dump errors, then keeping a minimum number of dump files to cover vacations, long weekends, and other absences might be enough.
If your site is using the sam-recycler(1M) command to reclaim space on archive media, it is critical that you make metadata copies after sam-recycler has completed its work. If a metadata dump is created before sam-recycler exits, the information in the metadump about archive copies becomes out of date as soon as sam-recycler runs. Also, some archive copies may be made inaccessible because the sam-recycler command may cause archive media to be relabeled.
Check root's crontab(1) entry to find out if and when the sam-recycler command is being run, and then, if necessary, schedule the creation of metadump files around the sam-recycler execution times. For more about recycling, see the Sun StorEdge SAM-FS Storage and Archive Management Guide.
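To perform this check, list root's crontab and search it for the recycler. The grep pattern below assumes the crontab entry invokes sam-recycler by name:

```
# Run as root: show any scheduled sam-recycler invocations.
crontab -l | grep sam-recycler
```

If the command prints an entry, note its schedule and arrange for metadata dumps to run after sam-recycler has completed.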
Off-site data storage is an essential part of a disaster recovery plan. In the event of a disaster, the only safe data repository might be an off-site vault. Beyond the recommended two copies of all files and metadata that you should be keeping in house as a safeguard against media failure, consider making a third copy on removable media and storing it off site.
Sun SAM-Remote offers you the additional alternative of making archive copies in remote locations on a LAN or WAN. Multiple Sun SAM-Remote servers can be configured as clients to one another in a reciprocal disaster recovery strategy.
If you need to restore all files that were online, you must run the samfsrestore command with the -g option.
The log file generated by the samfsrestore command's -g option contains a list of all files that were on the disk when the samfsdump(1M) command was run. This log file can be used in conjunction with the restore.sh shell script to restore the files on disk to their predisaster state. The restore.sh script takes the log file as input and generates stage requests for files listed in the log. By default, the restore.sh script restores all files listed in the log file.
If your site has thousands of files that need to be staged, consider splitting the log file into manageable chunks and running the restore.sh script against each of those chunks separately to ensure that the staging process does not overwhelm the system. You can also use this approach to ensure that the most critical files are restored first. For more information, see the comments in /opt/SUNWsamfs/examples/restore.sh.
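The splitting step can be sketched with the standard split(1) utility. The stand-in log contents, the chunk size, and the chunk prefix below are illustrative only; on a live system the input is the log written by samfsrestore -g, and each chunk is passed to your site-modified restore.sh:

```shell
# Create a stand-in restore log for illustration; in practice this is
# the log file produced by samfsrestore -g.
printf '%s\n' dir1/filea dir1/fileb dir2/filec dir2/filed > /tmp/restore.log

# Split the log into 2-line chunks (use a much larger size, such as
# 1000 lines, at a real site).
split -l 2 /tmp/restore.log /tmp/restorechunk.

# Each chunk would then be fed to the site-modified restore.sh, e.g.:
#   for c in /tmp/restorechunk.*; do /opt/SUNWsamfs/examples/restore.sh "$c"; done
ls /tmp/restorechunk.*
```

Ordering the input log before splitting it lets you place the chunks containing the most critical files first.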
Note - If the restore.sh script is used in a SAM-QFS shared environment, it must be run on the metadata server, not on one of the clients.
The features of SAM-QFS file systems described in TABLE 4-5 streamline and speed up data restoration and minimize the risk of losing data in the case of an unplanned system outage.
Metadata consists of information about files, directories, access control lists, symbolic links, removable media, segmented files, and the indexes of segmented files. Metadata must be restored before lost data can be retrieved.
With up-to-date metadata, the data can be restored as follows:
In Sun StorEdge QFS file systems, the .inodes file contains all the metadata except for the directory namespace (which consists of the path names to the directories where the files are stored). The .inodes file is located in the root (/) directory of the file system. For a file system to be restored, the .inodes file is needed, along with the additional metadata.
FIGURE 4-1 illustrates some characteristics of the .inodes file. The arrows indicate that the .inodes file points to file contents on disk and to the directory namespace, and that the namespace also points back to the .inodes file. In SAM-QFS file systems where archiving is being done, the .inodes file also points to archived copies.
The .inodes file is not archived. For more about protecting the .inodes file in these types of file systems, see Guidelines for Performing Metadata Dumps and Backing Up the Metadata in SAM-QFS File Systems.
Note - Sun StorEdge QFS has no archiving capability. For information on backing up Sun StorEdge QFS metadata, see the Sun StorEdge QFS Installation and Upgrade Guide.
As indicated in FIGURE 4-1, the namespace (in the form of directories) does not point to the archive media. The directory path names for each archived file are copied into the headers of the tar(1) files on the archive media that contain the files, but the directory path names in the tar file headers may get out of sync with the actual locations of the files on the disk.
One reason the two path names can get out of sync is that the path names in the tar file headers do not show the originating file system. For example, the full directory path name /samfs1/dir1/filea would appear in a tar file header without the leading /samfs1 component that identifies the originating file system.
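This behavior can be illustrated with an ordinary tar(1) archive. The /tmp/samdemo directory below stands in for a file system mount point and is not part of any SAM-QFS configuration:

```shell
# Build a small archive from inside the "mount point" directory.
mkdir -p /tmp/samdemo/dir1
echo data > /tmp/samdemo/dir1/filea
( cd /tmp/samdemo && tar cf /tmp/demo.tar dir1/filea )

# The header records only the relative path, with no mount point component.
tar tf /tmp/demo.tar
```

The listing shows dir1/filea rather than /samfs1/dir1/filea, which is why the headers alone cannot tell you which file system a tape belongs to.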
Another cause of path name inconsistency is illustrated by a scenario in which a file is saved to disk, archived, then later moved, either by use of the mv(1) command or by restoration from a samfsdump(1M) output file using samfsrestore(1M) into an alternate path or file system.
This scenario results in the following:
To prevent this kind of situation, keep the data from each file system on its own unique set of tapes or other archive media, and do not mix data from multiple file systems.
The potential for inconsistency does not interfere with recovery in most cases, because the directory path names in the tar headers are not used when data is being recovered from an archive. The directory path names in the tar headers on the archive media are only used in an unlikely disaster recovery situation in which no metadata is available and the file system must be completely reconstructed with the tar command.
Follow these guidelines when performing metadata dumps:
At any given time, some files need to be archived because they are new, while others need to be rearchived because they are modified or because their archive media is being recycled. TABLE 4-6 defines the terms that apply to files archived onto archive media.
Dumping metadata during a time when files are not being created or modified avoids the dumping of metadata for files that are stale and minimizes the creation of damaged files.
When any stale files exist while metadata and file data are being dumped, the samfsdump command generates a warning message. The following warning message is displayed for any files that do not have an up-to-date archive copy:
Caution - If you see the above message and do not rerun the samfsdump command after the specified file is archived, the file will not be retrievable.
If samfsrestore(1M) is later used to attempt to restore the damaged file, the following message is displayed:
In SAM-QFS file systems, the archiver(1M) command can copy both file data and metadata, other than the .inodes file, to archive media. For example, if you create a SAM-QFS file system with a family-set name of samfs1, you can tell the archiver command to create an archive set also called samfs1. (See the archiver.cmd(4) man page for more information.) You can later retrieve damaged or destroyed file systems, files, and directories as long as the archive media onto which the archive copy was written has not been erased and as long as a recent metadata dump file is available.
The samfsdump(1M) command enables you to back up metadata separately from the file system data. The samfsdump command creates metadata dumps (including the .inodes file) either for a complete file system or of a portion of a file system. A cron(1M) job can be set up to automate the process.
If you dump metadata often enough using samfsdump, the metadata is always available to restore file data from the archives through samfsrestore(1M).
In summary, the samfsdump method to dump metadata has the following advantages:
During a file system restoration, files and directories are assigned new inode numbers based on directory location; only the required number of inodes are assigned. Inodes are assigned as the samfsrestore process restores the directory structure.
File data is defragmented because files that were written in a combination of small disk allocation units (DAUs) and large DAUs are staged back to the disk with appropriately sized DAUs.
If you have multiple SAM-QFS file systems, make sure that you routinely dump the metadata for every file system. Look in /etc/vfstab for all file systems of type samfs.
Be sure to save the dump for each file system in a separate file.
The following procedures describe how to find all the samfs type file systems and to dump metadata using samfsdump(1M):
Note - The examples in these procedures use the names /sam1 for a SAM-QFS file system mount point and /dump_sam1 for the dump file system.
The samfsdump(1M) command's -u option causes unarchived file data to be interspersed with the metadata. Note the following about the use of the -u option:
To Find Sun StorEdge QFS File Systems
Look in the vfstab(4) file to find mount points for all samfs-type file systems.
CODE EXAMPLE 4-1 shows three file systems of type samfs with the file system names samfs1, samfs2, and samfs3. The mount points are /sam1, /sam2, and /sam3.
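On Solaris you can list such entries by filtering /etc/vfstab on its FStype field (the fourth field). The sketch below uses a here-document with sample entries so that it can run anywhere; on a live system, point awk at /etc/vfstab itself, as shown in the first comment:

```shell
# On a live system:  awk '$4 == "samfs" { print $1, $3 }' /etc/vfstab
# Print the file system name and mount point of every samfs entry.
awk '$4 == "samfs" { print $1, $3 }' <<'EOF'
fd       -   /dev/fd   fd     -   no    -
samfs1   -   /sam1     samfs  -   yes   -
samfs2   -   /sam2     samfs  -   yes   -
EOF
```

Each output line pairs a family set name with its mount point, which is exactly the information you need when scheduling a metadata dump per file system.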
To Create a Sun StorEdge SAM-FS Metadata Dump File Manually Using File System Manager
Taking a metadata snapshot through the File System Manager interface is the equivalent of using the samfsdump command from the command line. You can take a metadata snapshot from the File System Manager interface at any time.
1. From the Servers page, click the server on which the file system that you want to administer is located.
The File Systems Summary page is displayed.
2. Select the radio button next to the file system for which you want to schedule a metadata snapshot.
3. From the Operations menu, choose Take Metadata Snapshots.
The Take Metadata Snapshot window is displayed.
4. In the Fully Qualified Snapshot File field, type the path and the name of the snapshot file that you want to create.
See the File System Manager online help for more information on creating metadata snapshots.
Beginning with File System Manager version 2.1, compressed metadata snapshots created by File System Manager can be indexed without being uncompressed. In order to take advantage of this feature, you should select the gzip compression method for any scheduled metadata snapshots.
If you have existing compressed snapshots that are not in the gzip format, you can use the gznew command to convert them to gzip format.
Indexing for metadata snapshots was also improved in version 2.1 of File System Manager. Additional information was added to the index, including information about damaged or online files. To take advantage of this improvement, delete any existing indexes and re-create them.
You can also use File System Manager to specify a retention policy for metadata snapshots. Snapshots can be deleted after a specified number of months or marked for permanent retention.
When restoring from a metadata snapshot, the status of the file at the time the snapshot was taken is provided and you can opt to restore files to the same state. You can also select a replacement strategy for determining which files to keep in case a file of the same name already exists. The following options are available:
To Create a Sun StorEdge SAM-FS Metadata Dump File Manually Using the Command Line
2. Go to the mount point of the samfs-type file system or to the directory that you are dumping.
If necessary, see To Find Sun StorEdge QFS File Systems.
3. Enter the samfsdump(1M) command to create a metadata dump file.
CODE EXAMPLE 4-2 shows a SAM-QFS file system metadata dump file being created on February 14, 2004, in a dumps subdirectory in dump file system /dump_sam1/dumps. The output of the ls(1) command line shows that the date is assigned in yymmdd format as the dump file's name, 040214.
# samfsdump -f /dump_sam1/dumps/`date +\%y\%m\%d`
# ls /dump_sam1/dumps
040214
To Create a Sun StorEdge SAM-FS Metadata Dump File Automatically From File System Manager
Scheduling a metadata snapshot through the File System Manager interface is the equivalent of creating a crontab(1) entry that automates the Sun StorEdge SAM-FS software samfsdump(1M) process.
To schedule a metadata snapshot:
1. From the Servers page, click the server on which the archiving file system that you want to administer is located.
The File Systems Summary page is displayed.
2. Select the radio button next to the archiving file system for which you want to schedule a metadata snapshot.
3. From the Operations menu, choose Schedule Metadata Snapshots.
The Schedule Metadata Snapshots page is displayed.
4. Specify values on the Schedule Metadata Snapshots page.
For detailed instructions on using this page, see the File System Manager online help.
To Create a Sun StorEdge SAM-FS Metadata Dump File Automatically Using cron
2. Enter the crontab(1) command with the -e option to make an entry that dumps the metadata for each file system.
The crontab entry in CODE EXAMPLE 4-3 runs at 10 minutes past 2 a.m. every day and does the following:
Note - Put the crontab entry on a single line. It is shown in multiple lines in the preceding example because it is too wide for the page's format.
If the crontab entry in the previous code example had run on March 20, 2005, the full path name of the dump file would be /dump_sam1/dumps/050320.
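An entry of roughly the following shape produces such a date-named dump file. The mount point /sam1 and dump directory /dump_sam1/dumps are the example names used in this chapter; the rest of the command line is an assumption, so check it against your own crontab:

```
10 2 * * *  ( cd /sam1; /opt/SUNWsamfs/sbin/samfsdump -f /dump_sam1/dumps/`date +\%y\%m\%d` )
```

The entry runs at 10 minutes past 2 a.m. every day; percent signs must be escaped with a backslash inside a crontab command field.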
Archiver logging should be enabled in the archiver.cmd(4) file. Because archiver logs list all the files that have been archived and their locations on cartridges, archiver logs can be used to recover lost files that were archived after the last set of metadata dumps and backup copies were created.
Be aware of the following considerations:
Note - Using archiver logs is much more time consuming than using metadata to retrieve data, so this approach should not be relied upon. Do not use it unless there is no alternative.
Set up and manage the archive logs by performing the described procedures in the following sections:
To Set Up Archiver Logging
Enable archiver logging in the archiver.cmd file, which is in the /etc/opt/SUNWsamfs directory.
The archiver log files are typically written to /var/adm/logfilename. The directory to which you direct the logs to be written should reside on a disk outside the SAM-QFS environment. For more information, see the archiver.cmd(4) man page.
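As a sketch, the logfile directive in /etc/opt/SUNWsamfs/archiver.cmd looks like the following. The path shown is an assumption, and the archiver.cmd(4) man page is the authority on the exact syntax:

```
logfile = /var/adm/sam-archlog
```

Remember that the log's directory should be on a disk outside the SAM-QFS environment, as noted above.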
To Save Archiver Logs
Ensure that archiver log files are cycled regularly by creating a cron(1M) job that moves the current archiver log files to another location.
The screen example below shows how to create a dated copy of an archiver log named /var/adm/archlog every day at 3:15 a.m. The dated copy is stored in /var/archlogs.
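Such a crontab entry might look like the following. The log path /var/adm/archlog, the destination /var/archlogs, and the 3:15 a.m. schedule come from the text above, while the exact command line is an assumption:

```
15 3 * * *  ( mv /var/adm/archlog /var/archlogs/`date +\%y\%m\%d` )
```

Percent signs must be escaped with a backslash inside a crontab command field, and the destination directory must exist before the job first runs.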
Note - If you have multiple archiver logs, create a crontab entry for each one.
Consider writing scripts to create tar(1) files that contain copies of all the relevant disaster recovery files and metadata described in this chapter and to store the copies outside the file system. Depending on your site's policies, put the files into one or more of the locations described in the following list:
For information on removable media files, see the request(1) man page.
This approach ensures that the disaster recovery files and metadata are archived separately from the file system to which they apply. You might also consider archiving multiple backup copies for additional redundancy.
Observe the following precautions:
You can obtain lists of all directories containing removable media files by using the sls(1) command. These listings can be emailed. For more information about obtaining file information, see the sls(1) man page.
Copyright © 2006, Sun Microsystems, Inc. All Rights Reserved.