CHAPTER 3
Archiving
Archiving is the process of copying a file from a Sun StorEdge SAM-FS file system to a volume that resides on a removable media cartridge or on a disk partition of another file system. Using Sun StorEdge SAM-FS archiving capabilities, you can specify that files be archived immediately, specify that files never be archived, and perform other tasks.
Throughout this chapter, the term archive media is used to refer to the various cartridges or disk slices to which archive volumes are written. This chapter describes the archiver's theory of operations, provides general guidelines for developing archive policies for your site, and explains how to implement policies by creating an archiver.cmd file.
This chapter contains the following sections:
The archiver automatically writes Sun StorEdge SAM-FS files to archive media. Operator intervention is not required to archive the files. Files are archived to a volume on the archive media, and each volume is identified by a unique identifier called a volume serial name (VSN). Archive media can contain one or more volumes.
The archiver starts automatically when a Sun StorEdge SAM-FS file system is mounted. You can customize the archiver's operations for your site by inserting archiving directives into the following file:
/etc/opt/SUNWsamfs/archiver.cmd
The archiver.cmd file does not need to be present for archiving to occur. In the absence of this file, the archiver uses the following defaults:
The following sections describe the concept of an archive set and explain the operations performed during the archiving process.
The sam-archiverd daemon schedules the archiving activity. The sam-arfind process assigns files to be archived to archive sets. The sam-arcopy process copies the files to be archived to the selected volumes.
The sam-archiverd daemon is started by sam-fsd when Sun StorEdge SAM-FS activity begins. The sam-archiverd daemon executes the archiver(1M) command to read the archiver.cmd file and builds the tables necessary to control archiving. It starts a sam-arfind process for each mounted file system; if a file system is unmounted, it stops the associated sam-arfind process. The sam-archiverd process then monitors sam-arfind and processes signals from an operator or other processes.
An archive set identifies a group of files to be archived. Archive sets can be defined across any group of file systems. Files in an archive set share common criteria that pertain to the size, ownership, group, or directory location. The archive set controls the destination of the archive copy, how long the copy is kept archived, and how long the software waits before archiving the data. All files in an archive set are copied to the volumes associated with that archive set. A file in the file system can be a member of only one archive set.
As files are created and modified, the archiver copies them to archive media. The archiving process also copies the data necessary for Sun StorEdge SAM-FS file system operations, including directories, symbolic links, the index of segmented files, and archive media information.
Archive files are compatible with the standard UNIX tar(1) format. This ensures data compatibility with the Sun Solaris Operating System (OS) and other UNIX systems. If a complete loss of your Sun StorEdge SAM-FS environment occurs, the tar(1) format allows file recovery using standard UNIX tools and commands.
Archive set names are determined by the administrator and are virtually unlimited, with the following exceptions:
The no_archive archive set is defined by default. Files selected to be in this archive set are never archived. Files in a temporary directory, such as /sam1/tmp, for example, might be included in the no_archive archive set.
The allsets archive set is used to define parameters that apply to all archive sets.
By default, the archiver makes one copy of each archive set, but you can request up to four. An archive set and a copy number become a synonym for a collection of volumes. The archive copies provide duplication of files on separate volumes.
The data in a file must be modified before the file is considered to be a candidate for archiving or rearchiving. A file is not archived if it is only accessed. For example, issuing a touch(1) or an mv(1) command on a file does not cause it to be archived or rearchived.
A file is selected for archiving based on its archive age, which is the period of time that has passed since the file was last modified. The archive age can be defined for each archive copy.
Users can change the default time references on their files to values far in the past or future by using the touch(1) command. This can cause unexpected archiving results, however. To avoid such problems, the archiver adjusts the references so that they are always somewhere between the file creation time and the present time.
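The adjustment described above amounts to clamping a time reference into the range between the file creation time and the present. The following Python sketch illustrates the idea only; the function name and arguments are hypothetical and are not part of the Sun StorEdge SAM-FS software.

```python
import time


def adjust_time_reference(reference, creation_time, now=None):
    """Clamp a time reference (seconds since the epoch) into
    the range [creation_time, now].

    Hypothetical helper for illustration; not an archiver API.
    """
    if now is None:
        now = time.time()
    # A reference in the future is pulled back to "now";
    # a reference before creation is pushed up to the creation time.
    return max(creation_time, min(reference, now))
```

With this sketch, a reference that touch(1) set far in the past is treated as the creation time, and one set in the future is treated as the current time.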
The archive priority is computed from file property characteristics and from file property multipliers associated with the archive set. Essentially, the computation is as follows:
archive-priority = file-property-value x property-multiplier
Most file-property-value numbers are 1 (for true) or 0 (for false). For instance, the value of the property copy 1 is 1 if archive copy 1 is being made. The values of copy 2, copy 3, and copy 4 are then 0. Other properties, such as archive age and file size, can have values other than 0 or 1.
The property-multiplier value is determined from the -priority parameters for the archive set. Various aspects of a file, such as age or size, can be given values to determine the archive request's priority. For more information on the -priority parameter, see the archiver.cmd(4) man page.
The archive-priority and the property-multiplier values are floating-point numbers. The default value for all property multipliers is 0.0. The priority of an archive request is set to the highest file priority found in that archive request.
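The priority computation can be sketched in Python as follows. This is an illustration under stated assumptions, not the archiver's implementation: it assumes that the contributions of the individual properties are summed, and the property names used in the usage note (copy1, age) are hypothetical labels.

```python
def file_priority(property_values, multipliers):
    """Archive priority of one file.

    property_values: mapping of property name -> value (often 0 or 1).
    multipliers: mapping of property name -> property multiplier.
    Assumption: per-property products are summed. A missing
    multiplier falls back to the documented default of 0.0.
    """
    return sum(value * multipliers.get(name, 0.0)
               for name, value in property_values.items())


def request_priority(files, multipliers):
    """An archive request takes the highest priority among its files."""
    return max(file_priority(f, multipliers) for f in files)
```

For example, with multipliers of 10.0 for copy1 and 0.5 for age, a file with copy1 = 1 and age = 4 has a priority of 12.0. With no multipliers set, every file has a priority of 0.0.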
The following sections describe the steps taken by the archiver from the initial file scan to the file copy process.
There is a separate sam-arfind process for each mounted file system. The sam-arfind process monitors each file system to determine the files that need archiving. The file system notifies its sam-arfind process whenever a file is changed in a manner that would affect its archival state. Examples of such changes are file modification, rearchiving, unarchiving, and renaming. When notified, the sam-arfind process examines the file to determine the archive action required.
The sam-arfind process determines the archive set to which the file belongs by using the file properties descriptions. The characteristics used for determining a file's archive set include the following:
If the archive age of the file for one or more copies has been met or exceeded, sam-arfind adds the file to one or more archive requests for the archive set. An archive request is a collection of files that belong to the same archive set. The archive request resides in the following directory:
/var/opt/SUNWsamfs/archiver/file_sys/ArchReq
The files in this directory are binary files, and you can display them by using the showqueue(1M) command.
Separate archive requests are used for files not yet archived and for files being rearchived. This allows scheduling to be controlled independently for these two types of files.
If the archive age of the file for one or more copies has not been met, the directory in which the file resides and the time at which the archive age is reached are added to a scan list. Directories are scanned as the scan list times are reached. Files that have reached their archive age are added to archive requests.
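The choice between an archive request and the scan list can be sketched as follows. This Python fragment is a simplified illustration: it considers a single archive copy, ignores the separation of new and rearchived files, and uses hypothetical field names (path, dir, modified, archive_age).

```python
def classify(files, now):
    """Split files into an archive request and a scan list.

    files: list of dicts with hypothetical keys 'path', 'dir',
           'modified' (epoch seconds), and 'archive_age' (seconds).
    Returns (archive_request, scan_list), where scan_list maps a
    directory to the earliest time at which it should be rescanned.
    """
    archive_request = []
    scan_list = {}
    for f in files:
        due = f["modified"] + f["archive_age"]
        if now >= due:
            # Archive age met or exceeded: queue the file.
            archive_request.append(f["path"])
        else:
            # Not yet due: remember the directory and the due time.
            d = f["dir"]
            scan_list[d] = min(due, scan_list.get(d, due))
    return archive_request, scan_list
```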
If a file is offline, the sam-arfind process selects the volumes to be used as the source for the archive copy. If the file copy is being rearchived, the sam-arfind process selects the volume containing the archive copy that is being rearchived.
If a file is segmented, only those segments that have changed are selected for archival. The index of a segmented file contains no user data, so it is treated as a member of the file system archive set and is archived separately.
There are two methods by which files are marked for archiving: continuous archiving and scanning. With continuous archiving, the archiver works with the file system to determine which files need to be archived. With scanning, the archiver periodically peruses the file systems and selects files for archiving. The following sections describe these methods.
Continuous archiving is the default archiving method (the archiver.cmd file parameter is examine=noscan). With continuous archiving, you can specify scheduling start conditions for an archive set by using the -startage, -startcount, and -startsize parameters. These conditions enable you to optimize archive timeliness versus archive work done. For example:
When any of the scheduling start conditions is reached, the sam-arfind process sends each archive request to the archiver daemon, sam-archiverd, to be scheduled for file copying to archive media.
For more information about archiving parameters, see Global Archiving Directives.
Note - When examine is set to noscan, the following default settings are automatically implemented:
As an alternative to continuous archiving, you can specify examine=scan in the archiver.cmd file to direct sam-arfind to examine files for archival using scanning. Files needing archival are placed into archive requests. The sam-arfind process scans each file system periodically to determine which files need archiving. The first scan is a directory scan, in which sam-arfind descends recursively through the directory tree. Each file is examined, and the archdone file status flag is set if the file does not need archiving. During successive scans, sam-arfind scans the .inodes file. Only inodes with the archdone flag not set are examined.
For information about controlling the setting of the archdone flag, see The setarchdone Directive: Controlling the Setting of the archdone Flag.
When the file system scan is complete, the sam-arfind process sends each archive request to the archiver daemon, sam-archiverd, to be scheduled for file copying to archive media. The sam-arfind process then sleeps for the duration specified by the interval=time directive. At the end of the interval, the sam-arfind process resumes scanning.
When archive requests are received by the sam-archiverd daemon, they are composed. This section describes the composition process.
Because of the capacity of the archive media or of the controls specified in the archiver command file, the files in an archive request might not be archived all at one time. Composing is the process of selecting the files from the archive request to be archived at one time. When the archive copy operation is complete for an archive request, the archive request is recomposed if files remain to be archived.
The sam-archiverd daemon places the files in the archive request according to certain default and site-specific criteria. The default operation is to archive all the files in an archive request to the same archive volumes in the order in which they were found during the file system scan. The site-specific criteria enable you to control the order in which files are archived and how they can be distributed on volumes. These criteria, called archive set parameters, are evaluated in the following order: -reserve, -join, -sort, -rsort (reverse sort), and -drives. For more information on these parameters, see the archiver.cmd(4) man page.
If the archive request belongs to an archive set that has -reserve owner specified, the sam-archiverd daemon orders the files in the archive request according to the file's directory path, user name, or group name. The files belonging to the first owner are selected for archiving. The remaining files are archived later.
If the archive request belongs to an archive set that has -join method specified, the sam-archiverd daemon groups the files together according to the specified join method. If -sort or -rsort method is also specified, the sam-archiverd daemon sorts the files within each group according to the specified sort method. Each group of joined files is then treated like a single file for the remainder of the composing and scheduling processes.
If the archive request belongs to an archive set that has -sort or -rsort method specified, the sam-archiverd daemon sorts the files according to the specified sort method. Depending on the sort method, the sam-archiverd daemon tends to keep files together based on the sort method, age, size, or directory location. By default, archive requests are not sorted, so files are archived in the order in which they are encountered during the file system scan.
The sam-archiverd daemon determines whether the files are online or offline. If both online and offline files are in the archive request, the online files are selected for archiving first.
If the archive request was not required to be joined or sorted by a sort method, the offline files are ordered by the volume on which the archive copies reside. This ensures that all files in each archive set on the same volume are staged at the same time in the order in which they were stored on the media. When more than one archive copy of an offline file is being made, the offline file is not released until all required copies are made. All the files to be staged from the same volume as the first file are selected for archiving.
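The default ordering described above (online files first, then offline files grouped by source volume) can be sketched in Python as follows. The field names online, vsn, and position are hypothetical, and the sketch covers only the case in which no -join, -sort, or -rsort method applies.

```python
def order_for_copy(files):
    """Order files in an archive request for copying.

    files: list of dicts with hypothetical keys 'name', 'online'
           (bool), and, for offline files, 'vsn' and 'position'.
    Online files keep the order in which they were found. Offline
    files are grouped by volume and then ordered by their position
    on that volume, so each volume can be staged in a single pass.
    """
    online = [f for f in files if f["online"]]
    offline = sorted(
        (f for f in files if not f["online"]),
        key=lambda f: (f["vsn"], f["position"]),
    )
    return online + offline
```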
After being composed, the archive requests are entered in the sam-archiverd daemon's scheduling queue, as described in the next section.
The scheduler in the sam-archiverd daemon executes on demand when one of the following conditions exists:
The archive requests in the scheduling queue are ordered by priority. Each time the scheduler runs, all archive requests are examined to determine whether they can be assigned to a sam-arcopy process to have their files copied to archive media.
The following must be true in order for archive requests to be scheduled:
If the archive set has the -drives parameter specified, the sam-archiverd daemon divides the selected files in the archive request among multiple drives. If the number of drives available at this time is fewer than that specified by the -drives parameter, the smaller number is used.
If the total size of files in the archive request is less than the -drivemin value, only one drive is used. The -drivemin value is either the value specified by the -drivemin parameter or the archmax value. The archmax value is specified by the -archmax parameter or the value defined for the media. For more information on the -archmax parameter and the archmax= directive, see the archiver.cmd(4) man page.
If the total size of files in the archive request is more than the -drivemin value, the number of drives used is the total size of the files divided by the -drivemin value. If this number is fewer than the number of drives specified by the -drives parameter, the smaller number is used.
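The drive-count rules above can be summarized in a short Python sketch. It is illustrative only: the function and argument names are hypothetical, at least one drive is assumed to be available, and rounding the division down is an assumption that the text does not state.

```python
def drives_to_use(total_size, drives_param, drivemin, available):
    """Number of drives for one archive request.

    total_size:   total bytes in the archive request
    drives_param: value of the -drives parameter
    drivemin:     effective -drivemin value in bytes
    available:    drives currently available (assumed >= 1)
    """
    # Never use more drives than -drives allows or than are available.
    limit = min(drives_param, available)
    if total_size < drivemin:
        return 1
    # Assumption: the quotient is rounded down.
    wanted = total_size // drivemin
    return max(1, min(wanted, limit))
```

For example, with -drives 4 and a -drivemin value of 100 bytes, a 250-byte request uses two drives, and a 1000-byte request uses four drives, or three if only three are available.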
Drives can take varying amounts of time to archive files. You can use the -drivemax parameter to obtain better drive utilization. The -drivemax parameter requires you to specify the maximum number of bytes to be written to a drive before that drive is rescheduled for more data.
For archiving to occur, there must be at least one volume with enough space to hold at least some of the files in the archive request. If it has enough space, the volume that has most recently been used for the archive set is the one scheduled. This volume must not be in use by the archiver.
If a volume usable for the archive set is busy, another is selected, unless the -fillvsns parameter is specified. In this case, the archive request is not schedulable.
If an archive request is too big for one volume, the files that can fit on the volume are selected to be archived to the volume. If the archive request contains files that are too big to fit on one volume, and volume overflow for the archive request is not selected, the files cannot be archived. An appropriate message for this condition is sent to the log.
You can specify volume overflow for the archive set by using the -ovflmin parameter or for the media by using the ovflmin= directive. For more information about the -ovflmin parameter and the ovflmin= directive, see the archiver.cmd(4) man page. The ovflmin specification determines the file size threshold above which additional volumes or media are assigned for archiving. An ovflmin value specified for the archive set takes precedence over an ovflmin value specified for the media.
If the size of the files is less than the value of ovflmin, the files cannot be archived. An appropriate message for this condition is sent to the log. If the size of the files is more than the value of ovflmin, additional volumes are assigned as required. Volumes are selected in order of decreasing size in order to minimize the number of volumes required. If no usable volumes can be found for the archive request, the archive request waits.
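The overflow decision for a file that does not fit on any single volume can be sketched as follows. This Python fragment is a simplified model under stated assumptions: the names are hypothetical, a file whose size equals ovflmin is treated as eligible for overflow, and a request waits when the combined free space is insufficient.

```python
def plan_overflow(file_size, free_by_vsn, ovflmin_set=None, ovflmin_media=None):
    """Decide how to archive a file that is too big for one volume.

    free_by_vsn: mapping of VSN -> free bytes on that volume.
    Returns (status, vsns), where status is 'cannot_archive',
    'wait', or 'overflow'.
    """
    # The archive set value takes precedence over the media value.
    ovflmin = ovflmin_set if ovflmin_set is not None else ovflmin_media
    if ovflmin is None or file_size < ovflmin:
        return ("cannot_archive", [])
    chosen, remaining = [], file_size
    # Take volumes in order of decreasing size to minimize their number.
    for vsn, free in sorted(free_by_vsn.items(),
                            key=lambda kv: kv[1], reverse=True):
        if remaining <= 0:
            break
        if free <= 0:
            continue
        chosen.append(vsn)
        remaining -= free
    if remaining > 0:
        # Assumption: not enough usable space means the request waits.
        return ("wait", [])
    return ("overflow", chosen)
```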
Certain properties, such as whether the file is online or offline, are used in conjunction with the archive priority to determine the scheduling priority for a particular archive request. For more information on customizing the priority multiplier, see the -priority parameters described on the archiver.cmd(4) man page.
For each archive request, the sam-archiverd daemon computes the scheduling priority by adding the archive priority to multipliers associated with various system resource properties. These properties are associated with the number of seconds for which the archive request has been queued, whether the first volume to be used in the archiving process is loaded into a drive, and so on.
Using the adjusted priorities, the sam-archiverd daemon assigns each ready archive request to be copied, as described in the next section.
When an archive request is ready to be archived, the sam-archiverd daemon marks the archive file (tarball) boundaries so that each archive file's size is less than the specified -archmax value. If a single file is larger than this value, it becomes the only file in an archive file.
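Marking archive file boundaries can be pictured as a greedy grouping of file sizes. The following Python sketch is an illustration only, with hypothetical names; it keeps each archive file smaller than archmax and places any single file that reaches or exceeds archmax in an archive file of its own.

```python
def mark_boundaries(sizes, archmax):
    """Group file sizes into archive files.

    sizes:   file sizes in bytes, in the order they will be copied
    archmax: maximum archive file size in bytes
    Returns a list of archive files, each a list of file sizes.
    """
    archive_files, current, current_size = [], [], 0
    for size in sizes:
        # Close the current archive file if adding this file
        # would reach or exceed archmax.
        if current and current_size + size >= archmax:
            archive_files.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
        # A single oversized file becomes the only file in its archive file.
        if current_size >= archmax:
            archive_files.append(current)
            current, current_size = [], 0
    if current:
        archive_files.append(current)
    return archive_files
```

For example, sizes of 30, 40, 50, 200, and 10 with an archmax of 100 yield four archive files: [30, 40], [50], [200], and [10].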
For each archive request and each drive to be used, the sam-archiverd daemon assigns the archive request to a sam-arcopy process to copy the files to the archive media. The archive information is entered into the inode.
If archive logging is enabled, an archive log entry is created.
For each file that was staged, the disk space is released until all files in the list have been archived.
A variety of errors and file status changes can prevent a file from being successfully copied. Errors can include read errors from the cache disk and write errors to the volumes. Status changes can include modification since selection, a file that is open for writing, or a file that has been removed.
When the sam-arcopy process exits, the sam-archiverd daemon examines the archive request. If any files have not been archived, the archive request is recomposed.
CODE EXAMPLE 3-1 shows sample output from the archiver(1M) -l command.
The sam-arfind and sam-arcopy processes use the syslog facility and archiver.sh to log warnings and informational messages in a log file that contains information about each archived or automatically unarchived file. The log file is a continuous record of archival action. You can use the log file to locate earlier copies of files for traditional backup purposes.
This file is not produced by default. Use the logfile= directive in the archiver.cmd file to specify that a log file be created and to specify the name of the log file. For more information on the log file, see Using Archiver Directives in this chapter and the archiver.cmd(4) man page.
CODE EXAMPLE 3-2 shows sample lines from an archiver log file with definitions for each field.
Reading left to right, the fields in the previous listing have the content shown in TABLE 3-1.
The archiver.cmd file controls the archiver's behavior. By default, the archiver runs whenever sam-fsd is started and a Sun StorEdge SAM-FS file system is mounted. In the absence of an archiver.cmd file, the archiver uses the following defaults:
Using directives located in the archiver command file (archiver.cmd), you can customize the actions of the archiver to meet the archiving requirements of your site.
To Create or Modify an archiver.cmd File and Propagate Your Changes
As an alternative to this method, you can use the File System Manager software to create or modify the archiver.cmd file. For more information, see the File System Manager online help.
1. (Optional) Decide whether you want to edit the actual archiver.cmd file or a temporary archiver.cmd file.
Perform this step if you have an /etc/opt/SUNWsamfs/archiver.cmd file and your system is already archiving files. Consider copying your archiver.cmd file to a temporary location where you can edit and test it before putting it into production.
2. Use vi(1) or another editor to edit the file.
Add the directives you need in order to control archiving at your site. For information on the directives you can include in this file, see Using Archiver Directives and About Disk Archiving.
4. Use the archiver(1M) -lv command to verify the correctness of the file.
Whenever you make changes to the archiver.cmd file, you should check for syntax errors using the archiver(1M) command. Specifying the archiver(1M) command as follows evaluates an archiver.cmd file against the current Sun StorEdge SAM-FS system:
This command produces a list of all options and writes a list of the archiver.cmd file, volumes, file system content, and errors to the standard output file (stdout). Errors prevent the archiver from running.
By default, the archiver(1M) command evaluates the /etc/opt/SUNWsamfs/archiver.cmd file for errors. If you are working with a temporary archiver.cmd file, use the -c option with the archiver(1M) command and supply this temporary file's name.
5. If you encounter errors, correct them in the file and run the archiver(1M) command again to verify your corrections.
You must correct all errors before you move on to the next step. The archiver does not archive any files if it finds errors in the archiver.cmd file.
6. If you are working with a temporary file, move it to /etc/opt/SUNWsamfs/archiver.cmd.
7. Use the samd(1M) config command to propagate the file changes and restart the system.
The archiver.cmd file consists of the following types of directives:
The directives consist of lines of text read from the archiver.cmd file. Each directive line contains one or more fields separated by spaces or tabs. Any text that appears after the pound sign character (#) is treated as a comment and is not examined. You can continue long directives to a second line by ending the first line with a backslash (\).
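The line conventions above (whitespace-separated fields, # comments, and backslash continuation) can be illustrated with a small Python reader. This is a sketch of the stated rules, not the archiver's actual parser; in particular, stripping the comment before checking for a trailing backslash is an assumption.

```python
def parse_directives(text):
    """Split archiver.cmd-style text into lists of fields.

    Comments starting with '#' are dropped, a trailing backslash
    joins a line to the next one, and fields are separated by
    spaces or tabs. Blank lines are ignored.
    """
    directives, pending = [], ""
    for raw in text.splitlines():
        # Drop everything after the pound sign.
        line = raw.split("#", 1)[0].rstrip()
        if line.endswith("\\"):
            # Continuation: hold this part and read the next line.
            pending += line[:-1] + " "
            continue
        fields = (pending + line).split()
        pending = ""
        if fields:
            directives.append(fields)
    # A dangling continuation on the last line is kept as-is.
    if pending.split():
        directives.append(pending.split())
    return directives
```

Given a comment line, the line "examine=noscan # continuous archiving", and a two-line directive joined by a backslash, the function returns one single-field entry and one entry that holds the fields from both physical lines.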
Certain directives in the archiver.cmd file require you to specify a unit of time or a unit in bytes. To specify such a unit, use one of the letters in TABLE 3-2.
CODE EXAMPLE 3-3 shows a sample archiver.cmd file. The comments at the right indicate the various types of directives.
The following sections explain the archiver.cmd directives. They are as follows:
Global directives control the overall archiver operation and enable you to optimize archiver operations for your site's configuration. You can add global directives directly to the archiver.cmd file, or you can specify them using the File System Manager software. For more information on using File System Manager to set global directives, see the File System Manager online help.
Global directives in the archiver.cmd file can be identified either by the equal sign (=) in the second field or by the absence of additional fields.
Global directives must be specified prior to any fs= directives in the archiver.cmd file. The fs= directives are those that pertain to specific file systems. The archiver issues a message if it detects a global directive after an fs= directive.
The archivemeta directive controls whether file system metadata is archived. If files are often moved around and there are frequent changes to the directory structures in a file system, it is a good idea to archive the file system metadata. In contrast, if the directory structures are very stable, you can disable metadata archiving and thereby reduce the actions performed by removable media drives as cartridges are loaded and unloaded. By default, metadata is archived.
This directive has the following format:
For state, specify either on or off. The default is on.
The metadata archiving process depends on whether you are using a Version 1 or a Version 2 superblock, as follows:
The archmax directive specifies the maximum size of an archive file. User files are combined to form the archive file. No more user files are added to the archive file after the target-size value is met. Large user files are written in a single archive file.
To change the defaults, use the following directive:
There are advantages and disadvantages to setting large or small sizes for archive files. For example, if you are archiving to tape and archmax is set to a large size, the tape drive stops and starts less often. However, when writing large archive files, there is the possibility that when an end-of-tape is reached prematurely, a large amount of tape can be wasted. As a rule, archmax should not be set to more than 5 percent of the media capacity.
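The 5 percent guideline is simple arithmetic. The following Python sketch computes the ceiling for a given media capacity; the function name is hypothetical, and the capacity in the usage note is an arbitrary illustration rather than the capacity of any particular media type.

```python
def archmax_ceiling(media_capacity_bytes, percent=5):
    """Largest archmax value, in bytes, that stays within the
    guideline of 5 percent of the media capacity.

    Integer arithmetic avoids floating-point rounding.
    """
    return media_capacity_bytes * percent // 100
```

For a cartridge that holds 40,000,000,000 bytes, the guideline gives a ceiling of 2,000,000,000 bytes.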
The archmax directive can also be set for an individual archive set.
By default, a file being archived is copied to archive media using a memory buffer. You can use the bufsize directive to specify a nondefault buffer size and, optionally, to lock the buffer. These actions can improve performance, and you can experiment with different buffer-size values.
This directive has the following format:
You can specify a buffer size and a lock on an archive set basis by using the -bufsize and -lock archive set copy parameters. For more information, see Archive Set Copy Parameters.
By default, the archiver uses all of the drives in an automated library for archiving. To limit the number of drives used, use the drives directive.
This directive has the following format:
The family set name of the automated library as defined in the mcf file.
Also see the -drivemax, -drivemin, and -drives archive set copy parameters described in Specifying the Number of Drives for an Archive Request: -drivemax, -drivemin, and -drives.
New files and files that have changed are candidates for archiving. The archiver finds such files through one of the following:
This directive has the following format:
For method, specify one of the keywords shown in TABLE 3-6.
The archiver runs periodically to examine the status of all mounted Sun StorEdge SAM-FS file systems. The timing is controlled by the archive interval, which is the time between scan operations on each file system. To change the time, use the interval directive.
The interval directive initiates full scans only when continuous archiving is not set and no startage, startsize, or startcount parameters have been specified. If continuous archiving is set (examine=noscan), the interval directive acts as the default startage value.
This directive has the following format:
For time, specify the amount of time you want between scan operations on a file system. By default, time is interpreted in seconds and has a value of 600, which is 10 minutes. You can specify a different unit of time, such as minutes or hours, as described in TABLE 3-2.
If the archiver receives the samu(1M) utility's :arrun command, it begins scanning all file systems immediately. If the examine=scan directive is also specified in the archiver.cmd file, a scan is performed after :arrun or :arscan is issued.
If the hwm_archive mount option is set for the file system, the archive interval can be shortened automatically. This mount option specifies that the archiver commence its scan when the file system is filling up and the high water mark is crossed. The high=percent mount option sets the high water mark for the file system.
For more information on specifying the archive interval, see the archiver.cmd(4) man page. For more information on setting mount options, see the mount_samfs(1M) man page.
The archiver can produce a log file that contains information about each file that is archived, rearchived, or automatically unarchived. The log file is a continuous record of archival action. To specify a log file, use the logfile directive.
This directive has the following format:
For pathname, specify the absolute path and name of the log file. By default, this file is not produced.
The logfile directive can also be set for an individual file system.
Assume that you want to back up the archiver log file every day by copying the previous day's log file to an alternate location. Be sure to perform the copy operation when the archiver log file is closed, not while it is open for a write operation.
1. Use the mv(1) command to move the archiver log file within a UNIX file system.
This gives any sam-arfind(1M) or sam-arcopy(1M) operations time to finish writing to the archiver log file.
2. Use the mv(1) command to move the previous day's archiver log file to the Sun StorEdge SAM-FS file system.
The notify directive sets the name of the archiver's event notification script file. This directive has the following format:
For filename, specify the name of the file containing the archiver event notification script or the full path to this file.
The default file name is as follows:
/etc/opt/SUNWsamfs/scripts/archiver.sh
The archiver executes this script to process various events in a site-specific manner. The script is called with one of the following keywords for the first argument: emerg, alert, crit, err, warning, notice, info, and debug.
Additional arguments are described in the default script. For more information, see the archiver.sh(1M) man page.
With volume overflow, archived files are allowed to span multiple volumes. Volume overflow is enabled when you use the ovflmin directive in the archiver.cmd file. When a file size exceeds the value of ovflmin directive's minimum-file-size argument, the archiver writes a portion of this file to another available volume of the same type. The portion of the file written to each volume is called a section.
The archiver controls volume overflow through the ovflmin directive. The ovflmin directive specifies the file size threshold that triggers the overflow process. By default, volume overflow is disabled.
This directive has the following format:
The media type. For a list of valid media types, see the mcf(4) man page.
The minimum file size that you want to trigger the volume overflow.
Assume that many files exist with a length that is a significant fraction (such as 25 percent) of an mo media cartridge. These files partially fill the volumes and leave unused space on each volume. To get better packing of the volumes, set ovflmin for mo media to a size slightly smaller than the size of the smallest file. The following directive sets it to 150 megabytes:
Note that enabling volume overflow in this example also causes two volumes to be loaded for archiving and staging the file because each file will overflow onto another volume.
The ovflmin directive can also be set for an individual archive set.
The sls(1) command output lists the archive copy showing each section of the file on each VSN. CODE EXAMPLE 3-4 shows the archiver log file and CODE EXAMPLE 3-5 shows the sls -D command output for a large file named file50 that spans multiple volumes.
CODE EXAMPLE 3-4 shows that file50 spans three volumes with VSNs of DLT000, DLT001, and DLT005. The position on the volume and the size of each section are indicated in the seventh and tenth fields, respectively (7eed4.1 and 477609472 for the first entry), and match the sls -D output shown in CODE EXAMPLE 3-5. For a complete description of the archiver log entry, see the archiver(1M) man page.
CODE EXAMPLE 3-5 shows the sls -D command and output.
Volume overflow files do not generate checksums. For more information on using checksums, see the ssum(1) man page.
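The way a file is divided into sections can be pictured with a minimal Python sketch. This is only an illustrative model, not the archiver's actual volume selection logic: the function name plan_sections, the volume capacities, and the fill-in-order policy are all assumptions made for the example. Only the VSN names are taken from the text above.

```python
def plan_sections(file_size, ovflmin, volumes):
    """Plan which volumes receive which section of a file (simplified model).

    file_size -- size of the file in bytes
    ovflmin   -- overflow threshold in bytes, or None if overflow is disabled
    volumes   -- list of (vsn, free_bytes) pairs, in the order they are tried

    Returns a list of (vsn, section_bytes) pairs, or [] if the file cannot
    be placed under this model.
    """
    overflow_allowed = ovflmin is not None and file_size > ovflmin
    if not overflow_allowed:
        # Without overflow, the whole file must fit on a single volume.
        for vsn, free in volumes:
            if free >= file_size:
                return [(vsn, file_size)]
        return []

    sections = []
    remaining = file_size
    for vsn, free in volumes:
        if remaining == 0:
            break
        if free == 0:
            continue
        section = min(free, remaining)  # write as much as this volume holds
        sections.append((vsn, section))
        remaining -= section
    return sections if remaining == 0 else []


MB = 1024 * 1024
# Hypothetical free space on three volumes.
volumes = [("DLT000", 400 * MB), ("DLT001", 400 * MB), ("DLT005", 400 * MB)]
print(plan_sections(1000 * MB, 150 * MB, volumes))
```

In this model, a file smaller than the threshold is never split, and a file larger than the threshold is written in as many sections as needed.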
The scanlist_squash parameter turns scanlist consolidation on or off. The default setting is off. This parameter can be either global or file-system-specific.
When this option is turned on, the scan list entries for files in two or more subdirectories with the same parent directory that need to be scanned by sam-arfind at a much later time are consolidated. When the scanlist is consolidated, these directories are combined upwards to a common parent, which results in a deep recursive scan over many subdirectories. This can cause a severe performance penalty when archiving on a file system that has a large number of changes to many subdirectories.
The setarchdone parameter is a global directive that controls the setting of the archdone flag when the file is examined by sam-arfind.
This directive has the following format:
When all archive copies for a file have been made, the archdone flag is set for that file to indicate that no further archive action is required. During an inodes scan, the archiver detects whether the archdone flag is set, and if it is set the archiver does not look up the path name for the inode.
During directory scans, the archiver also sets the archdone flag for files that will never be archived. This can be a time-consuming operation and can impact performance when large directories are scanned. The setarchdone directive gives you control over this activity. The default setting for the directive is off if the examine directive is set to scandirs or noscan.
This directive controls the setting of the archdone flag only on files that will never be archived. It does not affect the setting of the archdone flag after archive copies are made.
The wait directive causes the archiver to wait for a start signal from samu(1M) or File System Manager. By default, the archiver begins archiving when started by sam-fsd(1M).
This directive has the following format:
The wait directive can also be set for an individual file system.
After the general directives in the archiver.cmd file, you can use the fs= directive to include directives specific to a particular file system. After an fs= directive is encountered, the archiver assumes that all subsequent directives specify actions to be taken only for individual file systems.
You can specify fs= directives by editing the archiver.cmd file, as described in the following sections, or by using the File System Manager software. See the File System Manager online help for more information.
By default, archiving controls apply to all file systems. However, you can confine some controls to an individual file system. For instance, you can use this directive to specify a different log file for each file system. To specify an individual file system, use the fs directive.
This directive has the following format:
For fsname, specify the file system name as defined in the mcf file.
The general directives and archive set association directives that occur after these directives apply only to the specified file system until another fs= directive is encountered.
Several directives can be specified both as global directives for all file systems and as directives that are specific to one file system. These directives are as follows:
By default, files are archived as part of the archive set named for the file system. However, you can specify archive sets to include files that share similar characteristics. If a file does not match one of the specified archive sets, it is archived as part of the default archive set named for the file system.
You can create archive sets by directly editing the archiver.cmd file, as described in the following sections, or by using the File System Manager software. In File System Manager, an archive policy defines an archive set. For more information, see the File System Manager online help.
The archive set membership directives assign files with similar characteristics to archive sets. The syntax of these directives is patterned after the find(1) command. Each archive set assignment directive has the following format:
archive-set-name path [search-criterion1 search-criterion2 ... ] [file-attribute1 file-attribute2 ... ]
CODE EXAMPLE 3-6 shows typical archive set membership directives.
hmk_files net/home/hmk -user hmk
datafiles xray_group/data -size 1M
system .
You can suppress the archiver by including files in an archive set named no_archive. CODE EXAMPLE 3-7 shows lines that prevent archiving of files in a tmp directory, at any level, and regardless of the directory in which the tmp directory resides within the file system.
fs = samfs1
no_archive tmp
no_archive . -name .*/tmp/
The following sections describe the search_criterion arguments that you can specify.
You can use the -access age characteristic to specify that the age of a file be used to determine archive set membership. When you use this characteristic, files with access times older than age are rearchived to different media. For age, specify an integer followed by one of the suffixes shown in TABLE 3-9.
For example, you can use this directive to specify that files that have not been accessed in a long time be rearchived to less expensive media.
When determining age, the software validates the access and modification times for files to ensure that these times are greater than or equal to the file creation time, and less than or equal to the time at which the file is examined. For files that have been "migrated" into a directory, this validation might not result in the desired behavior. The -nftv (no file time validation) parameter can be used in these situations to prevent the validation of file access and modification times.
You can use the -after date-time characteristic to group newly modified or created files into the same archive set. When you use this characteristic, only files created or modified after the date indicated are included in the archive set.
The format of date-time is YYYY-MM-DD[Thh:mm:ss][Z] (ISO 8601 format). If the time portion is not specified, it is assumed to be 00:00:00. If the Z is present, the time is interpreted as Coordinated Universal Time (UTC); otherwise it is interpreted as local time.
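As a rough illustration of the date-time rules above, the following Python sketch parses a value in the YYYY-MM-DD[Thh:mm:ss][Z] form. The function name parse_after and the sample dates are hypothetical; the sketch only mirrors the rules stated in this section (a missing time means 00:00:00, and a trailing Z means UTC) and is not the archiver's own parser.

```python
import re
from datetime import datetime, timezone

# Date, optional time, optional trailing Z.
_DATE_TIME = re.compile(r"(\d{4}-\d{2}-\d{2})(?:T(\d{2}:\d{2}:\d{2}))?(Z)?")


def parse_after(value):
    """Parse a YYYY-MM-DD[Thh:mm:ss][Z] string.

    Returns a timezone-aware datetime (UTC) when Z is present, otherwise a
    naive datetime that is meant to be read as local time.
    """
    match = _DATE_TIME.fullmatch(value)
    if match is None:
        raise ValueError(f"not in YYYY-MM-DD[Thh:mm:ss][Z] form: {value!r}")
    date_part, time_part, zulu = match.groups()
    stamp = datetime.strptime(
        f"{date_part}T{time_part or '00:00:00'}", "%Y-%m-%dT%H:%M:%S"
    )
    return stamp.replace(tzinfo=timezone.utc) if zulu else stamp


print(parse_after("2005-10-08"))            # time defaults to 00:00:00, local
print(parse_after("2005-10-08T12:15:47Z"))  # interpreted as UTC
```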
The size of a file can be used to determine archive set membership through the -minsize size and -maxsize size characteristics. For size, specify an integer followed by one of the letters shown in TABLE 3-10.
Example. The lines in CODE EXAMPLE 3-8 specify that all files of at least 500 kilobytes, but less than 100 megabytes, belong to the archive set big_files. Files of 100 megabytes or more belong to the archive set huge_files.
big_files . -minsize 500k -maxsize 100M
huge_files . -minsize 100M
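The size boundaries in CODE EXAMPLE 3-8 can be checked with a short Python sketch. It assumes that -minsize is inclusive and -maxsize is exclusive, as the example's description suggests, and that k and M denote multiples of 1024 bytes. The unit table, the function names, and the fallback set name samfs1 are assumptions for illustration; see TABLE 3-10 for the actual suffixes.

```python
# Assumed unit multipliers; only the suffixes used in the example appear here.
UNITS = {"k": 1024, "M": 1024 ** 2}

# (archive set, -minsize, -maxsize) in the order the directives appear.
RULES = [
    ("big_files", "500k", "100M"),
    ("huge_files", "100M", None),
]


def to_bytes(size):
    """Convert a value such as '500k' to bytes."""
    return int(size[:-1]) * UNITS[size[-1]]


def archive_set_for(size_bytes, default="samfs1"):
    """Return the first archive set whose size range contains the file."""
    for name, minsize, maxsize in RULES:
        if size_bytes < to_bytes(minsize):
            continue  # smaller than -minsize
        if maxsize is not None and size_bytes >= to_bytes(maxsize):
            continue  # at or above -maxsize
        return name
    return default  # falls through to the file system's default archive set


print(archive_set_for(2 * 1024 ** 2))    # big_files
print(archive_set_for(100 * 1024 ** 2))  # huge_files
```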
The ownership and group affiliation can be used to determine archive set membership through the -user name and -group name characteristics. In CODE EXAMPLE 3-9, all files belonging to user sysadmin belong to archive set adm_set, and all files with the group name of marketing are in the archive set mktng_set.
adm_set . -user sysadmin
mktng_set . -group marketing
The names of files to be included in an archive set can be specified by regular expressions. The -name regex specification as a search-criterion directive specifies that any complete path matching the regular expression regex is to be a member of the archive set.
The regex argument follows the conventions outlined in the regexp(5) man page. Note that regular expressions do not follow the same conventions as UNIX wildcards.
All files beneath the selected directory (with their specified paths relative to the mount point of the file system) go through pattern matching. This allows you to create patterns in the -name regex field to match both file names and path names.
The following directive restricts files in the archive set images to those files ending with .gif:
The following directive selects files that start with the characters GEO:
You can use regular expressions with the no_archive archive set. The following specification prevents any file ending with .o from being archived:
Assume that your archiver.cmd file contains the lines shown in CODE EXAMPLE 3-10.
# File selections.
fs = samfs1
    1 1s
    2 1s
no_archive share/marketing -name fred\.
With this archiver.cmd file, the archiver does not archive fred.* in the user directories or subdirectories. CODE EXAMPLE 3-11 shows the files not archived if you specify the directives shown in CODE EXAMPLE 3-10.
/sam1/share/marketing/fred.anything
/sam1/share/marketing/first_user/fred.anything
/sam1/share/marketing/first_user/first_user_sub/fred.anything
CODE EXAMPLE 3-12 shows the files that are archived if you specify the directives shown in CODE EXAMPLE 3-10.
/sam1/fred.anything
/sam1/share/fred.anything
/sam1/testdir/fred.anything
/sam1/testdir/share/fred.anything
/sam1/testdir/share/marketing/fred.anything
/sam1/testdir/share/marketing/second_user/fred.anything
In contrast to CODE EXAMPLE 3-10, assume that your archiver.cmd file contains the lines shown in CODE EXAMPLE 3-13.
# File selections.
fs = samfs1
    1 1s
    2 1s
no_archive share/marketing -name ^share/marketing/[^/]*/fred\.
The archiver.cmd file in CODE EXAMPLE 3-13 does not archive fred.* in the user home directories. It does archive fred.* in the user subdirectories and in the directory share/marketing. In this case, a user home directory is anything from share/marketing/ until the next slash character (/). As a result, the following files are not archived:
CODE EXAMPLE 3-14 shows the files that are archived if you specify the directives shown in CODE EXAMPLE 3-13.
/sam1/share/fred.anything
/sam1/share/marketing/fred.anything
/sam1/share/marketing/first_user/first_user_sub/fred.anything
/sam1/fred.anything
/sam1/testdir/fred.anything
/sam1/testdir/share/fred.anything
/sam1/testdir/share/marketing/fred.anything
/sam1/testdir/share/marketing/second_user/fred.anything
/sam1/testdir/share/marketing/second_user/sec_user_sub/fred.any
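The difference between the two no_archive patterns can be reproduced with Python's re module, which behaves the same as regexp(5) for patterns this simple. The sketch below is an approximation under stated assumptions: the helper is_excluded, the mount point /sam1, and the rule that only paths beneath share/marketing (relative to the mount point) are tested are modeled on the description above, not taken from the archiver's implementation.

```python
import re

MOUNT_POINT = "/sam1"
SET_PATH = "share/marketing"  # the path field of the no_archive directive


def is_excluded(full_path, regex):
    """Return True if the file would fall into no_archive in this model."""
    relative = full_path[len(MOUNT_POINT) + 1:]  # path relative to the mount point
    if not relative.startswith(SET_PATH + "/"):
        return False  # not beneath the directive's path
    return re.search(regex, relative) is not None


paths = [
    "/sam1/share/marketing/fred.anything",
    "/sam1/share/marketing/first_user/fred.anything",
    "/sam1/share/marketing/first_user/first_user_sub/fred.anything",
    "/sam1/testdir/share/marketing/fred.anything",
]

broad = r"fred\."                           # pattern from CODE EXAMPLE 3-10
narrow = r"^share/marketing/[^/]*/fred\."   # pattern from CODE EXAMPLE 3-13

for path in paths:
    print(path, is_excluded(path, broad), is_excluded(path, narrow))
```

With the broad pattern, the first three paths are excluded. With the narrow pattern, only the file directly inside a user home directory is excluded.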
You can set the release and stage attributes associated with files within an archive set by using the -release and -stage options, respectively. Both of these settings override stage or release attributes that a user might have set previously.
The -release option has the following format:
The attributes for the -release directive follow the same conventions as the release(1) command and are shown in TABLE 3-11.
Release the file following the completion of the first archive copy.
The -stage option has the following format:
The attributes for the -stage directive follow the same conventions as the stage(1) command and are shown in TABLE 3-12.
The following example shows how you can use file name specifications and file attributes to partially release Macintosh resource directories:
Sometimes the choice of path and other file characteristics for inclusion of a file in an archive set results in ambiguous archive set membership. These situations are resolved in the following manner:
1. The membership definition occurring first in the directive file is chosen.
2. Membership definitions local to a file system are chosen before any globally defined definitions.
3. A membership definition that exactly duplicates a previous definition is noted as an error.
Given these rules, more restrictive membership definitions should be placed earlier in the directive file.
When controlling archiving for a specific file system (using the fs=fsname directive), the archiver evaluates the file-system-specific directives before evaluating the global directives. Thus, files can be assigned to a local archive set (including the no_archive archive set) instead of being assigned to a global archive set. This has implications for global archive set assignments such as no_archive.
In CODE EXAMPLE 3-15, it appears that the administrator did not intend to archive any of the .o files across both file systems. However, because the local archive set assignment allfiles is evaluated before the global archive set assignment no_archive, the .o files in the samfs1 and samfs2 file systems are archived.
no_archive . -name .*\.o$
fs = samfs1
    allfiles .
fs = samfs2
    allfiles .
CODE EXAMPLE 3-16 shows the directives to use to ensure that no .o files are archived in the two file systems.
fs = samfs1
    no_archive . -name .*\.o$
    allfiles .
fs = samfs2
    no_archive . -name .*\.o$
    allfiles .
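The evaluation order described in this section (local definitions first, then global ones, with the first match winning) can be sketched in Python. This is a simplified model and not the archiver's code: the function assign, the rule tuples, and the sample file names are hypothetical, and only the -name criterion is modeled.

```python
import re


def assign(path, fs, local_rules, global_rules):
    """Return the archive set for a file under a first-match-wins model.

    Each rule is (archive_set, regex); a regex of None matches every file.
    File-system-specific rules are tried before global rules.
    """
    for archive_set, regex in local_rules.get(fs, []) + global_rules:
        if regex is None or re.search(regex, path):
            return archive_set
    return fs  # default archive set named for the file system


# CODE EXAMPLE 3-15: no_archive is global, allfiles is local.
global_315 = [("no_archive", r".*\.o$")]
local_315 = {"samfs1": [("allfiles", None)], "samfs2": [("allfiles", None)]}

# CODE EXAMPLE 3-16: no_archive is repeated locally, ahead of allfiles.
local_316 = {
    "samfs1": [("no_archive", r".*\.o$"), ("allfiles", None)],
    "samfs2": [("no_archive", r".*\.o$"), ("allfiles", None)],
}

print(assign("src/main.o", "samfs1", local_315, global_315))  # allfiles
print(assign("src/main.o", "samfs1", local_316, []))          # no_archive
```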
If you do not specify archive copies, the archiver writes a single archive copy for files in the archive set. By default, this copy is made when the archive age of the file is four minutes. If you require more than one archive copy, all copies, including the first, must be specified through archive copy directives.
The archive copy directives begin with a copy-number value of 1, 2, 3, or 4. The digit is followed by one or more arguments that specify archive characteristics for that copy.
Archive copy directives must appear immediately after the archive set assignment directive to which they pertain. Each archive copy directive has the following format:
You can specify archive copy directives by editing the archiver.cmd file, as described here, or by using the File System Manager software. For more information, see the File System Manager online help.
The following sections describe the archive copy directive arguments.
To specify that the disk space for files is to be automatically released after an archive copy is made, use the -release directive after the copy number. This directive has the following format:
In CODE EXAMPLE 3-17, files within the group images are archived when their archive age reaches 10 minutes. After archive copy 1 is made, the disk cache space is released.
ex_set . -group images
    1 -release 10m
You might not want to release disk space until multiple archive copies are completed. The -norelease option prevents the automatic release of disk cache until all copies marked with -norelease are made.
This directive has the following format:
The -norelease directive makes the archive set eligible to be released after all copies have been archived, but the files will not be released until the releaser is invoked and selects them as release candidates.
CODE EXAMPLE 3-18 specifies an archive set named vault_tapes. Two copies are created, but the disk cache associated with this archive set is not released until both copies are made.
vault_tapes
    1 -norelease 10m
    2 -norelease 30d
Using the -norelease directive on a single copy has no effect on automatic releasing because the file cannot be released until it has at least one archive copy.
If you want to make sure that the disk space is released immediately after all copies of an archive set have been archived, you can use the -release and -norelease options together. The combination of -release and -norelease causes the archiver to release the disk space immediately, when all copies having this combination are made, rather than waiting for the releaser to be invoked, as is the case with the -norelease option alone.
You can set the archive age for files by specifying the archive age in the archive copy directive. The archive age can be specified with a suffix character such as h for hours or m for minutes as shown in TABLE 3-2.
In CODE EXAMPLE 3-19, the files in directory data are archived when their archive age reaches one hour.
If you specify more than one archive copy of a file, it is possible to unarchive all but one of the copies automatically. You might want to do this when the files are archived to various media using various archive ages.
CODE EXAMPLE 3-20 shows the directive that specifies the unarchive age. The first copy of the files in the path home/users is archived six minutes after modification. When the files are 10 weeks old, second and third archive copies are made. The first copy is then unarchived.
ex_set home/users
    1 6m 10w
    2 10w
    3 10w
For more ways to control unarchiving, see Controlling Unarchiving.
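To make the timing in the unarchive example concrete, the following Python sketch converts the copy directives into seconds. The suffix table is an assumption that covers only the suffixes appearing in this chapter's examples (see TABLE 3-2 for the authoritative list), and the helper names are hypothetical. The four-minute default comes from the description of the default archive age.

```python
# Assumed suffix meanings: seconds, minutes, hours, days, weeks.
SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}
DEFAULT_ARCHIVE_AGE = 4 * 60  # four minutes, the default archive age


def parse_age(text):
    """Convert an age such as '6m' or '10w' to seconds."""
    return int(text[:-1]) * SECONDS[text[-1]]


def copy_schedule(copy_directives):
    """Map copy number -> (archive age, unarchive age or None), in seconds."""
    schedule = {}
    for line in copy_directives:
        fields = line.split()
        copy_number = int(fields[0])
        # Skip options such as -release; keep only the age values.
        ages = [field for field in fields[1:] if not field.startswith("-")]
        archive_age = parse_age(ages[0]) if ages else DEFAULT_ARCHIVE_AGE
        unarchive_age = parse_age(ages[1]) if len(ages) > 1 else None
        schedule[copy_number] = (archive_age, unarchive_age)
    return schedule


print(copy_schedule(["1 6m 10w", "2 10w", "3 10w"]))
```

For the example above, copy 1 has an archive age of 360 seconds and an unarchive age of 6048000 seconds (10 weeks), which is the same age at which copies 2 and 3 are made.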
If more than one copy of metadata is required, you can place copy definitions in the directive file immediately after an fs= directive.
CODE EXAMPLE 3-21 shows an archiver.cmd file that specifies multiple metadata copies.
fs = samfs7
    1 4h
    2 12h
In this example, one copy of the metadata for the samfs7 file system is made after 4 hours and a second copy is made after 12 hours.
File system metadata includes path names in the file system. For this reason, if you have frequent changes to directories, the new path names cause the creation of new archive copies. This results in frequent loads of the volumes specified for metadata.
The archive set parameters section of the archiver.cmd file begins with the params directive and ends with the endparams directive. CODE EXAMPLE 3-22 shows the format for directives for an archive set.
params
archive-set-name.copy-number[R] [ -param1 -param2 ...]
.
.
.
endparams
You can specify archive set copy parameters by editing the archiver.cmd file, as shown here, or by using the File System Manager software. For more information, see the File System Manager online help.
The pseudo archive set allsets provides a way to set default archive set directives for all archive sets. All allsets directives must precede directives for actual archive set copies. Parameters set for individual archive set copies override parameters set by the allsets directive. For more information on the allsets archive set, see the archiver.cmd(4) man page.
The following subsections describe all archive set processing parameters, with the exception of disk archiving parameters. For information on disk archiving parameters, see About Disk Archiving.
The -archmax directive sets the maximum file size for an archive set. This directive has the following format:
This directive is very similar to the archmax global directive. For information on that directive, and the values to enter for target-size, see The archmax Directive: Controlling the Size of Archive Files.
By default, a file being archived is stored in memory in a buffer before being written to archive media. You can use the -bufsize directive to specify a nondefault buffer size. A nondefault buffer size can improve performance, and you can experiment with various buffer-size values.
This parameter has the following format:
For buffer-size, specify a number from 2 through 32. The default is 4. This value is multiplied by the dev_blksize value for the media type, and the resulting buffer size is used. The dev_blksize value is specified in the defaults.conf file. For more information on this file, see the defaults.conf(4) man page.
For example, this parameter can be specified in the archiver.cmd file in a line such as the following:
The equivalent of this directive on a global basis is bufsize=media buffer-size. For more information on that directive, see The bufsize Directive: Setting the Archiver Buffer Size.
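The buffer size arithmetic described above is a simple multiplication, sketched here in Python. The function name and the sample dev_blksize value of 256 are hypothetical; the result is expressed in whatever units dev_blksize uses in the defaults.conf file.

```python
def archive_buffer_size(dev_blksize, buffer_size=4):
    """Return buffer-size multiplied by dev_blksize.

    buffer_size must be from 2 through 32; the default is 4.
    """
    if not 2 <= buffer_size <= 32:
        raise ValueError("buffer-size must be a number from 2 through 32")
    return buffer_size * dev_blksize


print(archive_buffer_size(256))      # default multiplier of 4
print(archive_buffer_size(256, 16))  # a larger, nondefault multiplier
```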
By default, the archiver uses only one media drive to archive the files of one archive set. When an archive set has many files or large files, it can be advantageous to use more than one drive. In addition, if the drives in your automated library operate at different speeds, use of multiple drives can balance these variations and thereby increase archiving efficiency.
The drive directives have the following formats:
An archive request is evaluated against the parameters that are specified, as follows:
When you use the -drives parameter, multiple drives are used only if data that is more than the value of min_size is to be archived. The number of drives to be used in parallel is the lesser of the following two values:
You can use the -drivemin and -drives parameters if you want to divide an archive request among drives but don't want to have all the drives busy with small archive requests. This might apply to operations that use very large files.
To set these parameters, you need to consider file creation rates, the number of drives, the time it takes to load and unload drives, and drive transfer rates.
For example, suppose that you are splitting an archive set named bigfiles over five drives. Depending on its size, this archive set could be split as shown in TABLE 3-15.
CODE EXAMPLE 3-23 shows the lines to use in the archiver.cmd file to split the archive request over multiple drives.
params
bigfiles.1 -drives 5 -drivemin 10G
endparams
In addition, you might specify the following line in the archiver.cmd file:
When the total size of the files in archive set huge_files.2 is equal to or greater than two times drivemin for the media, two drives are used to archive the files.
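The drive-count rule can be approximated with a few lines of Python. This sketch assumes that the number of drives is the lesser of the -drives value and the request size divided by the drivemin value, with at least one drive always used. That reading is consistent with the two-drive statement above, but the function name and the sample sizes are hypothetical, and the archiver's actual scheduling takes more factors into account.

```python
GB = 1024 ** 3


def drives_to_use(request_bytes, drives, drivemin_bytes):
    """Estimate how many drives an archive request is split across."""
    by_size = request_bytes // drivemin_bytes  # whole multiples of drivemin
    return max(1, min(drives, by_size))


# Modeled on: bigfiles.1 -drives 5 -drivemin 10G
for size_gb in (7, 20, 25, 60):
    print(size_gb, "GB ->", drives_to_use(size_gb * GB, 5, 10 * GB), "drive(s)")
```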
By default, the archiver selects from all volumes assigned to an archive set when it writes archive copies, using a volume with enough space for all the files. This action can result in volumes not being filled to capacity. If -fillvsns is specified, the archiver separates the archive request into smaller groups.
By default, a file being archived is stored in memory in a buffer before being written to archive media. If direct I/O is enabled, you can use the -lock parameter to lock this buffer. This action can improve performance.
This parameter has the following format:
The -lock parameter indicates that the archiver should use locked buffers when making archive copies. If -lock is specified, the archiver sets file locks on the archive buffer in memory for the duration of the sam-arcopy(1M) operation. This avoids paging of the buffer, and can thereby improve performance.
The -lock parameter should be specified only on large systems with large amounts of memory. Insufficient memory can cause an out-of-memory condition.
The -lock parameter is effective only if direct I/O is enabled for the file being archived. By default, -lock is not specified, and the file system sets locks on all direct I/O buffers, including those for archiving. For more information on enabling direct I/O, see the setfa(1) man page, the sam_setfa(3) library routine man page, or the -O forcedirectio option on the mount_samfs(1M) man page.
For example, this parameter can be specified in the archiver.cmd file in a line such as the following:
You can also specify the equivalent of this parameter on a global basis by specifying the lock argument to the bufsize=media buffer-size [lock] directive. For more information on this topic, see The bufsize Directive: Setting the Archiver Buffer Size.
A file is a candidate for being released after one archive copy is made. If the file releases and goes offline before all the archive copies are made, the archiver uses this parameter to determine the method to be used when making the other archive copies. In choosing the method to be used, consider the number of drives available to the Sun StorEdge SAM-FS system and the amount of disk cache available.
This parameter has the following format:
For method, specify one of the keywords shown in TABLE 3-16.
The recycling process enables you to reclaim space on archive volumes that is taken up by expired archive images. By default, no recycling occurs.
If you want to recycle, you can specify directives in both the archiver.cmd file and the recycler.cmd file. For more information on the recycling directives supported in the archiver.cmd file, see Recycling.
The archiver employs associative archiving if you specify the -join path parameter. Associative archiving is useful if you want an entire directory to be archived to one volume and you know that the archive file can physically reside on only one volume. Otherwise, if you want to keep directories together, use either the -sort path or -rsort path parameters to keep the files contiguous. The -rsort parameter specifies a reverse sort.
When the archiver writes an archive file to a volume, it efficiently packs the volume with user files. Subsequently, when accessing files from the same directory, you can experience delays as the stage process moves through a volume to read the next file. To alleviate delays, you can use the -join path parameter to archive files from the same directory paths contiguously within an archive set copy. The process of associative archiving overrides the space efficiency algorithm to archive files from the same directory together.
Associative archiving is useful when the file content does not change but you always want to access a group of files together. For example, you might use associative archiving at a hospital for accessing all of the medical images associated with a patient. For example:
It is also possible to sort files within an archive set copy by age, size, or path. The age and size arguments are mutually exclusive. CODE EXAMPLE 3-24 shows how to sort an archive set using the -sort parameter with the argument age or size.
cardiac.1 -sort path
cardiac.2 -sort age
catscans.3 -sort size
The first line forces the archiver to sort the archive set copy cardiac.1 by path name. The second line forces the archiver to sort the archive set copy cardiac.2 by the age of the file, oldest to youngest. The third line forces the archive set copy catscans.3 to be sorted by the size of the file, smallest to largest. If you wanted a reverse sort, you could specify -rsort in place of -sort.
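The three sort orders can be illustrated with a small Python sketch. The file records, the use of a modification time as the measure of age, and the function name sort_request are assumptions made for the example; the reverse flag stands in for -rsort.

```python
from operator import itemgetter

# Which record field each sort method uses in this model.
SORT_KEYS = {"path": "path", "age": "mtime", "size": "size"}


def sort_request(files, method, reverse=False):
    """Sort file records by path, age (oldest first), or size (smallest first)."""
    return sorted(files, key=itemgetter(SORT_KEYS[method]), reverse=reverse)


# Hypothetical files; a smaller mtime means an older file.
files = [
    {"path": "scans/b.img", "mtime": 300, "size": 10},
    {"path": "scans/a.img", "mtime": 100, "size": 30},
    {"path": "notes/c.txt", "mtime": 200, "size": 20},
]

for method in ("path", "age", "size"):
    print(method, [f["path"] for f in sort_request(files, method)])
```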
Unarchiving is the process by which archive entries for files or directories are deleted. Files are unarchived based on the time since last access. All frequently accessed data can be stored on fast media, such as disk, and all older, infrequently accessed data can be stored on tape. By default, files are never unarchived.
For example, suppose that the archiver.cmd file shown in CODE EXAMPLE 3-25 controls a file that is accessed frequently. This file remains on disk all the time, even if it is older than 60 days. The copy 1 information is removed only if the file is not accessed for 60 days.
If the copy 1 information is removed (because the file was not accessed for 60 days) and someone stages the file from copy 2, it is read from tape. After the file is back online, the archiver makes a new copy 1 on disk and the 60-day access cycle starts all over again. The Sun StorEdge SAM-FS archiver regenerates a new copy 1 if the file is accessed again.
arset1 dir1
    1 10m 60d
    2 10m
    3 10m
vsns
arset1.1 mo OPT00[0-9]
arset1.2 lt DLTA0[0-9]
arset1.3 lt DLTB0[0-9]
Assume that a patient is in the hospital for four weeks. During this time, all of this patient's files are on fast media (copy 1=mo). After four weeks, the patient is released from the hospital. If no data has been accessed for this patient for up to 60 days after the patient is released, the copy 1 entry in the inode is unarchived, and only copy 2 and copy 3 entries are available. The volume can now be recycled in order to make room for more current patients without having to increase the disk library. If the patient comes back to the hospital after six months for follow-up care, the first access of the data is from tape (copy 2). Now the archiver automatically creates a new copy 1 on disk to ensure that the data is back on the fast media during the follow-up, which could take several days or weeks.
By default, the archiver writes a tape mark, an EOF label, and two more tape marks between archive files. When the next archive file is started, the driver backs up to the position after the first tape mark, causing a loss of performance. The -tapenonstop parameter directs the archiver to write only the initial tape mark. In addition, if the -tapenonstop parameter is specified, the archiver enters the archive information at the end of the copy operation.
For more information on the -tapenonstop parameter, see the archiver.cmd(4) man page.
By default, the archiver writes archive set copies to any volume specified by a regular expression as described in the volume associations section of the archiver.cmd file. However, you might sometimes want archive set volumes to contain files from only one archive set. You can reserve volumes to satisfy this data storage requirement.
The -reserve parameter reserves volumes for an archive set. When the -reserve parameter is set and a volume has been assigned to an archive set copy, the volume identifier is not assigned to any other archive set copy, even if a regular expression matches it.
Note - A site that uses reserved volumes is likely to incur more cartridge loads and unloads.
When a volume is selected for use by an archive set, it is assigned a reserved name, which is a unique identifier that ties the archive set to the volume.
The format for the -reserve parameter is as follows:
The value of keyword depends on the form you are using, as follows:
For example, the archiver.cmd file fragment in CODE EXAMPLE 3-26 shows that the line that begins with the allsets archive set name reserves volumes by archive set for all archive sets.
In the archiver.cmd file, you can specify a -reserve parameter for one, two, or all three possible forms. The three forms can be combined and used together in an archive set parameter definition.
For example, with the archiver.cmd file fragment shown in CODE EXAMPLE 3-27, the line that begins with arset.1 creates a reserved name based upon an archive set, a group, and the file system.
params
arset.1 -reserve set -reserve group -reserve fs
endparams
The information regarding reserved volumes is stored in the library catalog. The lines in the library catalog list the media type, the VSN, the reserve information, and the reservation date and time. The reserve information includes the archive set component, path name component, and file system component, separated by slashes (//).
Note - These slashes are not indicative of a path name; they are merely separators for displaying the three components of a reserved name.
As CODE EXAMPLE 3-28 shows, the lines in the library catalog that describe reserved volumes begin with #R characters.
Note - Some lines in CODE EXAMPLE 3-28 have been truncated to fit on the page.
One or more of the reserve information fields can be empty, depending on the options defined in the archiver.cmd file. The date and time indicate when the reservation was made. A reservation line is appended to the file for each volume that is reserved for an archive set during archiving.
The archiver records volume reservations in the library catalog files. A volume is automatically unreserved when it is relabeled because the archive data has been effectively erased.
You can also use the reserve(1M) and unreserve(1M) commands to reserve and unreserve volumes. For more information on these commands, see the reserve(1M) and unreserve(1M) man pages.
You can display the reserve information by using the samu(1M) utility's v display or by using the archiver(1M) or dump_cat(1M) command in one of the formats shown in CODE EXAMPLE 3-29.
archiver -lv
dump_cat -V catalog-name
Example 4: User and Data Files Archived to Optical Media shows a complete archive example using reserved volumes.
The Sun StorEdge SAM-FS file systems offer a configurable priority system for archiving files. Each file is assigned a priority computed from properties of the file and priority multipliers that can be set for each archive set in the archiver.cmd file. Properties include online/offline, age, number of copies made, and size.
By default, the files in an archive request are not sorted, and all property multipliers are zero. This results in files being archived in first-found, first-archived order. You can control the order in which files are archived by setting priorities and sort methods. The following are examples of priorities that you can set:
TABLE 3-20 lists the archive priorities.
For value, specify a floating-point number in the following range:
For more information on priorities, see the archiver(1M) and archiver.cmd(4) man pages.
As the archiver scans a file system, it identifies files to be archived. Files that are recognized as candidates for archiving are placed in a list known as an archive request. At the end of the file system scan, the system schedules the archive request for archiving. The -startage, -startcount, and -startsize archive set parameters control the archiving workload and ensure the timely archival of files. TABLE 3-21 shows the formats for these parameters.
-startage time: The amount of time that can elapse between the first file in a scan being marked for inclusion in an archive request and the start of archiving. For time, specify a time in the format used in Setting the Archive Age. If this variable is not set, the interval directive is used.
-startcount count: The number of files to be included in an archive request. When the number of files in the archive request reaches the value of count, archiving begins. By default, count is not set.
-startsize size: The minimum total size, in bytes, of all files to be archived in an archive request. Archiving work is accumulated, and archiving begins when the total size of the files reaches the value of size. By default, size is not set.
The examine=method and interval=time directives interact with the -startage, -startcount, and -startsize directives. The -startage, -startcount, and -startsize directives balance archive timeliness against the amount of archive work done. These values override the examine=method specification, if any. For more information on the examine directive, see The examine Directive: Controlling Archive Scans. For more information on the interval directive, see The interval Directive: Specifying an Archive Interval.
The -startage, -startcount, and -startsize directives can be specified in an archiver.cmd file for each archive copy. If more than one of these directives is specified, the first condition encountered starts the archive operation. If none of these directives is specified, the archive request is scheduled based on the examine=method directive, as follows:
The archiver.cmd(4) man page has examples that show how to use these directives.
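The rule that the first start condition to be met begins the archive operation can be sketched in Python. This is a simplified model, not the archiver's scheduling code: the function name `should_start` and its arguments are hypothetical, and a value of `None` stands for a parameter that is not set.

```python
def should_start(elapsed_seconds, file_count, total_bytes,
                 startage=None, startcount=None, startsize=None):
    """Return the name of a start condition that is met, or None.

    elapsed_seconds: time since the first file was marked for the request.
    file_count:      number of files accumulated in the archive request.
    total_bytes:     total size of the files accumulated in the request.
    """
    if startage is not None and elapsed_seconds >= startage:
        return "startage"
    if startcount is not None and file_count >= startcount:
        return "startcount"
    if startsize is not None and total_bytes >= startsize:
        return "startsize"
    # No start condition applies; in this model the caller would fall back
    # to scheduling based on the examine=method directive.
    return None
```

For example, with `startage=3600` and `startcount=500`, a request that reaches 500 files after 30 seconds starts on the count condition well before the age limit is reached.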
The VSN associations section of the archiver.cmd file assigns volumes to archive sets. This section starts with a vsns directive and ends with an endvsns directive.
VSN associations can also be configured with the File System Manager software. See the File System Manager online help for more information.
Collections of volumes are assigned to archive sets by directives of the following form:
An association requires at least three fields: archive-set-name and copy-num, media-type, and at least one volume. The archive-set-name and copy-num values are connected by a period (.).
Note - If your Sun StorEdge SAM-FS environment is configured to recycle by archive set, do not assign a VSN to more than one archive set.
The following examples use regular expressions to specify the same VSNs in different ways.
CODE EXAMPLE 3-30 shows two lines of VSN specifications.
vsns
set.1 lt VSN001 VSN002 VSN003 VSN004 VSN005
set.1 lt VSN006 VSN007 VSN008 VSN009 VSN010
endvsns
CODE EXAMPLE 3-31 shows a VSN specification that uses a backslash character (\) to continue a line onto a subsequent line.
vsns
set.1 lt VSN001 VSN002 VSN003 VSN004 VSN005 \
VSN006 VSN007 VSN008 VSN009 VSN010
endvsns
CODE EXAMPLE 3-32 specifies VSNs using a regular expression in a shorthand notation.
When the archiver needs volumes for the archive set, it examines each volume of the selected media type in all automated libraries and manually mounted drives to determine whether the volume satisfies any VSN expression. It selects the first volume that matches an expression and has enough space for the archive copy operation. For example:
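The selection rule described above can also be sketched as a simplified Python model. It is not the manual's own example and not the archiver's implementation. The `Volume` tuple and `select_volume` function are hypothetical, and Python's `re` module with whole-name matching stands in for the VSN expression syntax, which may differ in detail.

```python
import re
from typing import NamedTuple


class Volume(NamedTuple):
    """Hypothetical catalog entry for one volume."""
    vsn: str
    media_type: str
    free_bytes: int


def select_volume(volumes, media_type, expressions, needed_bytes):
    """Return the first volume of the given media type that matches any
    expression and has enough free space, or None if no volume qualifies."""
    patterns = [re.compile(e) for e in expressions]
    for vol in volumes:
        if vol.media_type != media_type:
            continue  # wrong media type for this archive set
        if vol.free_bytes < needed_bytes:
            continue  # not enough room for the archive copy
        # Assumption: an expression must match the whole VSN.
        if any(p.fullmatch(vol.vsn) for p in patterns):
            return vol
    return None
```

In this model a volume that matches an expression but is nearly full is skipped, and the next matching volume with enough space is chosen.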
The VSN pools section of the archiver.cmd file starts with a vsnpools directive and ends either with an endvsnpools directive or with the end of the archiver.cmd file. This section names a collection of volumes.
VSN pools can also be configured with the File System Manager software. See the File System Manager online help for more information.
A VSN pool is a named collection of volumes. VSN pools are useful for defining volumes that can be available to an archive set. As such, VSN pools provide a useful buffer for assigning volumes and reserving volumes to archive sets. You can use VSN pools to define separate groups of volumes by departments within an organization, by users within a group, by data type, and according to other convenient groupings.
If a volume is reserved, it is no longer available to the pool in which it originated. Therefore, the number of volumes within a named pool changes as volumes are used. You can view the VSN pools by issuing the archiver(1M) command in the following format:
The syntax of a VSN pool definition is as follows:
The following example uses four VSN pools: users_pool, data_pool, proj_pool, and scratch_pool. A scratch pool is a set of volumes used when specific volumes in a VSN association are exhausted or when another VSN pool is exhausted. If one of the three specific pools is out of volumes, the archiver selects the scratch pool VSNs. CODE EXAMPLE 3-33 shows an archiver.cmd file that uses four VSN pools.
For more information on VSN associations, see VSN Association Directives.
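The scratch pool behavior can be illustrated with a short Python model. This is a sketch under stated assumptions: pools are represented as plain lists of VSN names rather than VSN expressions, the function `next_volume` is hypothetical, and a reserved volume is simply treated as unavailable to its pool.

```python
def next_volume(pools, reserved, pool_order):
    """Return the first unreserved VSN, trying the pools in the order given.

    pools:      mapping of pool name to a list of VSN names (illustrative).
    reserved:   set of VSNs that are already reserved and so unavailable.
    pool_order: pool names to try, for example a specific pool followed
                by the scratch pool.
    """
    for name in pool_order:
        for vsn in pools.get(name, []):
            if vsn not in reserved:
                return vsn
    return None  # every listed pool is exhausted
```

Listing `scratch_pool` after `proj_pool` in `pool_order` models the fallback: scratch volumes are used only when the project pool has no unreserved volumes left.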
Archiving is the process of copying a file from online disk to archive media. With disk archiving, the archive media are online disks in a file system.
Disk archiving can be implemented so that the files are archived from one Sun StorEdge SAM-FS file system to another file system on the same host computer or to another file system on a different Sun Solaris host. When disk archiving is implemented using two host systems, the systems involved act as a client and a server, with the client system hosting the source files and the server system being the destination system that hosts the archive copies.
The file system to which the archive files are written can be any UNIX file system. However, if disk archive copies are written to a different host, the host must have at least one file system installed on it that is compatible with the Sun StorEdge SAM-FS software.
The archiver treats files archived to disk volumes the same as files archived to volumes in a library. You can still make one, two, three, or four archive copies. If you are making multiple archive copies, one of the archive copies could be written to disk volumes while the others are written to removable media volumes. In addition, if you typically archive to disk volumes in a Sun StorEdge SAM-FS file system, the archive file copies are themselves archived according to the archiver.cmd file rules in that file system.
The following list summarizes some of the similarities and differences between archiving to online disk and archiving to removable media:
Note - You do not need the diskvols.conf configuration file if you are archiving to removable media volumes only.
A diskvols.conf file must be created on the system upon which the source files reside. Depending on where the archive copies are written, this file also contains the following information:
Although there are no restrictions on where disk archive volumes can reside, it is recommended that disk volumes reside on a disk other than the one on which the original files reside. It is also recommended that you make more than one archive copy and write to more than one type of archive media. For example, you might archive copy 1 to disk volumes, copy 2 to tape, and copy 3 to magneto-optical disk.
If you are archiving files to a file system on a server system, the archive files themselves can be archived to removable media cartridges in a library attached to the server.
When archiving to online disk, the archiver recognizes the archiver.cmd directives that define the archive set and configure recycling. It ignores directives that specifically pertain to removable media cartridges. Specifically, the system recognizes the following directives for disk archive sets:
For client-system, specify the host name of the client system that contains the source files.
For more information on directives for disk archiving, see the archiver.cmd(4) man page.
To Enable Disk Archiving
You can enable disk archiving at any time. The procedure in this section assumes that you have archiving in place and you are adding disk archiving to your environment. If you are enabling disk archiving as part of an initial installation, see the Sun StorEdge SAM-FS Installation and Upgrade Guide for information.
1. Make certain that the host to which you want to write your disk archive copies has at least one Sun StorEdge QFS or Sun StorEdge SAM-FS file system installed on it.
2. Become superuser on the host system that contains the files to be archived.
3. Follow the procedures in the Sun StorEdge SAM-FS Installation and Upgrade Guide for enabling disk archiving both on the host that contains the files to be archived and on the host to which the archive copies will be written.
4. On the host that contains the files to be archived, use the samd(1M) config command to propagate the configuration file changes and restart the system.
5. If you are archiving to disk on a different host, follow these steps:
a. Become superuser on the host system to which the archive copies are written.
b. Use the samd(1M) config command to propagate the configuration file changes and restart the destination system.
The following are some examples of disk archiving configurations.
In this example, VSNs identified as disk01, disk02, and disk04 are written to pluto, the host system upon which the original source files reside. VSN disk03 is written to a VSN on server system mars.
CODE EXAMPLE 3-36 shows the diskvols.conf file that resides on client system pluto.
CODE EXAMPLE 3-37 shows the diskvols.conf file on server system mars.
# This is file /etc/opt/SUNWsamfs/diskvols.conf on mars
#
clients
pluto
endclients
CODE EXAMPLE 3-38 shows a fragment of the archiver.cmd file on pluto.
vsns
arset1.2 dk disk01
arset2.2 dk disk02 disk04
arset3.2 dk disk03
endvsns
In this example, file /sam1/testdir0/filea is in the archive set for arset0.1, and the archiver copies the content of this file to the destination path /sam_arch1. CODE EXAMPLE 3-39 shows the diskvols.conf file.
# This is file /etc/opt/SUNWsamfs/diskvols.conf
#
# VSN Name [Host Name:]Path
#
disk01 /sam_arch1
disk02 /sam_arch12/proj_1
CODE EXAMPLE 3-40 shows the archiver.cmd file lines that pertain to disk archiving:
.
vsns
arset0.1 dk disk01
endvsns
.
CODE EXAMPLE 3-41 shows output from the sls(1) command for file filea, which was archived to disk. Note the following in the copy 1 line:
In this example, file /sam2/my_proj/fileb is on client host snickers in archive set arset0.1, and the archiver copies the content of this file to the destination path /sam_arch1 on server host mars.
CODE EXAMPLE 3-42 shows the diskvols.conf file on snickers.
# This is file /etc/opt/SUNWsamfs/diskvols.conf on snickers
#
# VSN Name [Host Name:]Path
#
disk01 mars:/sam_arch1
CODE EXAMPLE 3-43 shows the diskvols.conf file on mars.
# This is file /etc/opt/SUNWsamfs/diskvols.conf on mars
#
clients
snickers
endclients
CODE EXAMPLE 3-44 shows the directives in the archiver.cmd file that relate to this example.
.
vsns
arset0.1 dk disk01
endvsns
.
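The layout of the diskvols.conf files in these examples can be made concrete with a small Python reader. This is an illustrative sketch only: `parse_diskvols` is a hypothetical helper, not part of the Sun StorEdge SAM-FS software, and it assumes that a # character starts a comment and that each volume line has the form shown in the examples (VSN name, then an optional host name and colon, then a path).

```python
def parse_diskvols(text):
    """Parse diskvols.conf-style text.

    Returns (volumes, clients), where volumes maps each VSN name to a
    (host, path) pair (host is None for a local path) and clients lists
    the host names found between the clients and endclients lines.
    """
    volumes = {}
    clients = []
    in_clients = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line == "clients":
            in_clients = True
        elif line == "endclients":
            in_clients = False
        elif in_clients:
            clients.append(line)
        else:
            vsn, resource = line.split(None, 1)
            host, sep, path = resource.partition(":")
            if not sep:  # no "host:" prefix, so the path is local
                host, path = None, resource
            volumes[vsn] = (host, path)
    return volumes, clients
```

Applied to the file on snickers, the line `disk01 mars:/sam_arch1` yields host `mars` and path `/sam_arch1`. Applied to the file on mars, the clients section yields the single client `snickers`.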
The archiver automates storage management operations using the archiver.cmd file. Before writing this file, it is useful to review some general guidelines that can improve the performance of your Sun StorEdge SAM-FS file system and the archiver and that can help ensure that your data is stored in the safest way possible.
Both the archiver and the stager processes can request that media be loaded and unloaded. If the number of requests exceeds the number of drives available for media loads, the excess requests are sent to the preview queue.
Archive and stage requests in the preview queue are those that cannot be immediately satisfied. By default, preview requests are satisfied in first-in-first-out (FIFO) order.
You can assign different priorities to preview requests. You can override the FIFO default by entering directives in the preview command file, which is written to /etc/opt/SUNWsamfs/preview.cmd. For more information about this file and setting priorities for archiving and staging, see Prioritizing Preview Requests.
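The difference between the default FIFO order and a prioritized preview queue can be sketched in Python. This is a conceptual model only. The `PreviewQueue` class and its numeric priorities are hypothetical and do not reflect the preview.cmd directive syntax or the way the software computes priorities.

```python
import heapq
import itertools


class PreviewQueue:
    """Requests waiting for a drive, served by priority and then arrival order."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # arrival counter used to break ties

    def add(self, request, priority=0.0):
        # heapq is a min-heap, so the priority is negated to serve the
        # highest value first; equal priorities fall back to FIFO order.
        heapq.heappush(self._heap, (-priority, next(self._seq), request))

    def next_request(self):
        """Remove and return the next request, or None if the queue is empty."""
        return heapq.heappop(self._heap)[2] if self._heap else None
```

When every request is added with the default priority, the queue behaves as plain FIFO. Giving one request a higher priority lets it move ahead of requests that arrived earlier.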
This section provides some examples of archiving processes in real-world environments.
This example illustrates the action of the archiver when no archiver.cmd file is used in a Sun StorEdge SAM-FS environment with one file system, an optical automated library with two drives, and six cartridges.
CODE EXAMPLE 3-45 shows the output produced by the archiver(1M) -lv command. It shows that the default media selected by the archiver is type mo. Only the mo media are available.
CODE EXAMPLE 3-46 shows output that indicates that the archiver uses two drives. It lists the 12 volumes, storage capacity, and available space.
Note - The archiver(1M) -lv command shows only VSNs with space available.
CODE EXAMPLE 3-47 shows that the archive set samfs includes both metadata and data files. The archiver makes one copy of the files when their archive age reaches the default four minutes (240 seconds).
Archive file selections:
Filesystem samfs Logfile:
samfs Metadata
    copy:1 arch_age:240
samfs1 path:.
    copy:1 arch_age:240
CODE EXAMPLE 3-48 shows the files in the archive sets archived to the volumes in the indicated order.
This example shows how to separate data files into two archive sets separate from the metadata. The environment includes a manually mounted DLT tape drive in addition to an optical automated library. The big files are archived to tape, and the small files are archived to optical cartridges.
CODE EXAMPLE 3-49 shows the content of the archiver.cmd file.
CODE EXAMPLE 3-50 shows the media and drives to be used.
Note - The archiver(1M) -lv command shows only VSNs with space available.
CODE EXAMPLE 3-51 shows the organization of the file system. Files bigger than 512000 bytes (500 kilobytes) are archived after four minutes; all other files are archived after 30 seconds.
CODE EXAMPLE 3-52 shows the division of the archive sets among the removable media.
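The size split used in this example can be checked with a few lines of Python. This is an illustrative sketch: the labels `big` and `small` and the function names are placeholders chosen here, not the archive set names from the archiver.cmd file in this example.

```python
BIG_THRESHOLD_BYTES = 500 * 1024  # 500 kilobytes = 512000 bytes


def classify(size_bytes):
    """Return (label, archive_age_seconds) for a file of the given size.

    Files bigger than the threshold are archived after four minutes;
    all other files are archived after 30 seconds.
    """
    if size_bytes > BIG_THRESHOLD_BYTES:
        return ("big", 240)
    return ("small", 30)


def is_due(size_bytes, age_seconds):
    """True when a file is old enough to be archived under this split."""
    _, archive_age = classify(size_bytes)
    return age_seconds >= archive_age
```

A file of exactly 512000 bytes is not bigger than the threshold, so in this model it falls into the small group and becomes eligible after 30 seconds.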
In this example, user files and project data files are archived to various media. Files from the directory data are segregated by size to optical and tape media. Files assigned to the group ID pict are assigned to another set of volumes. Files in the directories tmp and users/bob are not archived. Archiving is performed at 15-minute intervals, and an archiving record is kept.
CODE EXAMPLE 3-53 shows the output of the archiver(1M) -lv -c command in this example.
In this example, user files and project data files are archived to optical media.
Four VSN pools are defined; three pools are used for user, data, and project, and one is a scratch pool. When proj_pool runs out of media, it relies on scratch_pool to reserve volumes. This example shows how to reserve volumes for each archive set based on the set component, owner component, and file system component. Archiving is performed at 10-minute intervals, and an archiving log is kept.
CODE EXAMPLE 3-54 shows the archiver.cmd file and archiver output.
Copyright © 2006, Sun Microsystems, Inc. All Rights Reserved.