CHAPTER 3

Archiving

Archiving is the process of copying a file from a Sun StorEdge SAM-FS file system to a volume that resides on a removable media cartridge or on a disk partition of another file system. Using Sun StorEdge SAM-FS archiving capabilities, you can specify that files be archived immediately, specify that files never be archived, and perform other tasks.

Throughout this chapter, the term archive media is used to refer to the various cartridges or disk slices to which archive volumes are written. This chapter describes the archiver's theory of operations, provides general guidelines for developing archive policies for your site, and explains how to implement policies by creating an archiver.cmd file.

This chapter contains the following sections:


Archiving Process Overview

The archiver automatically writes Sun StorEdge SAM-FS files to archive media. Operator intervention is not required to archive the files. Files are archived to a volume on the archive media, and each volume is identified by a unique identifier called a volume serial name (VSN). Archive media can contain one or more volumes.

The archiver starts automatically when a Sun StorEdge SAM-FS file system is mounted. You can customize the archiver's operations for your site by inserting archiving directives into the following file:

/etc/opt/SUNWsamfs/archiver.cmd

The archiver.cmd file does not need to be present for archiving to occur. In the absence of this file, the archiver uses the following defaults:

The following sections describe the concept of an archive set and explain the operations performed during the archiving process.

Archiver Daemons

The sam-archiverd daemon schedules the archiving activity. The sam-arfind process assigns files to be archived to archive sets. The sam-arcopy process copies the files to be archived to the selected volumes.

The sam-archiverd daemon is started by sam-fsd when Sun StorEdge SAM-FS activity begins. The sam-archiverd daemon executes the archiver(1M) command to read the archiver.cmd file and builds the tables necessary to control archiving. It starts a sam-arfind process for each mounted file system; if a file system is unmounted, it stops the associated sam-arfind process. The sam-archiverd daemon then monitors sam-arfind and processes signals from an operator or other processes.

Archive Sets

An archive set identifies a group of files to be archived. Archive sets can be defined across any group of file systems. Files in an archive set share common criteria that pertain to the size, ownership, group, or directory location. The archive set controls the destination of the archive copy, how long the copy is kept archived, and how long the software waits before archiving the data. All files in an archive set are copied to the volumes associated with that archive set. A file in the file system can be a member of only one archive set.

As files are created and modified, the archiver copies them to archive media. The archiving process also copies the data necessary for Sun StorEdge SAM-FS file system operations, including directories, symbolic links, the index of segmented files, and archive media information.

Archive files are compatible with the standard UNIX tar(1) format. This ensures data compatibility with the Sun Solaris Operating System (OS) and other UNIX systems. If a complete loss of your Sun StorEdge SAM-FS environment occurs, the tar(1) format allows file recovery using standard UNIX tools and commands.

Archive set names are determined by the administrator and are virtually unlimited, with the following exceptions:

The no_archive archive set is defined by default. Files selected to be in this archive set are never archived. Files in a temporary directory, such as /sam1/tmp, for example, might be included in the no_archive archive set.

The allsets archive set is used to define parameters that apply to all archive sets.

Archiving Operations

By default, the archiver makes one copy of each archive set, but you can request up to four. An archive set and a copy number become a synonym for a collection of volumes. The archive copies provide duplication of files on separate volumes.

The data in a file must be modified before the file is considered to be a candidate for archiving or rearchiving. A file is not archived if it is only accessed. For example, issuing a touch(1) or an mv(1) command on a file does not cause it to be archived or rearchived.



Note - Issuing an mv(1) command alters the file name but not the file data, and this can have ramifications in a disaster recovery situation if you are restoring from tar(1) files. For more information on disaster recovery, see the Sun StorEdge SAM-FS Troubleshooting Guide.



A file is selected for archiving based on its archive age, which is the period of time that has passed since the file was last modified. The archive age can be defined for each archive copy.

Users can change the default time references on their files to values far in the past or future by using the touch(1) command. This can cause unexpected archiving results, however. To avoid such problems, the archiver adjusts the references so that they are always somewhere between the file creation time and the present time.
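The adjustment described above amounts to clamping a time reference into a range. The following Python fragment is only a sketch of that idea, assuming all times are expressed as seconds since the epoch. The function name clamp_time_reference is hypothetical and is not part of the Sun StorEdge SAM-FS software.

```python
def clamp_time_reference(reference, creation_time, now):
    """Return reference limited to the range [creation_time, now].

    Assumption: all three values are seconds since the epoch.
    """
    # A reference set far in the future is pulled back to the present time;
    # a reference set far in the past is pushed forward to the creation time.
    return max(creation_time, min(reference, now))
```

With a creation time of 100 and a present time of 200, a reference of 50 becomes 100, a reference of 500 becomes 200, and a reference of 150 is left unchanged.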

The archive priority is computed from file property characteristics and from file property multipliers associated with the archive set. Essentially, the computation is as follows:

archive-priority = file-property-value x property-multiplier

Most file-property-value numbers are 1 (for true) or 0 (for false). For instance, the value of the property copy 1 is 1 if archive copy 1 is being made. The values of copy 2, copy 3, and copy 4 are then 0. Other properties, such as archive age and file size, can have values other than 0 or 1.

The property-multiplier value is determined from the -priority parameters for the archive set. Various aspects of a file, such as age or size, can be given values to determine the archive request's priority. For more information on the -priority parameter, see the archiver.cmd(4) man page.

The archive-priority and the property-multiplier values are floating-point numbers. The default value for all property multipliers is 0.0. The priority of an archive request is set to the highest file priority found among the files in that archive request.
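The priority computation can be illustrated with a short Python sketch. It assumes that the per-property products are summed to give a file's priority, which the formula above implies but does not state outright. The property names used here (copy1, offline) are hypothetical labels chosen for illustration; the actual -priority parameter names are listed on the archiver.cmd(4) man page.

```python
def file_priority(properties, multipliers):
    """Sum of file-property-value x property-multiplier over all properties.

    Assumption: products are summed. A property with no configured
    multiplier uses the documented default multiplier of 0.0.
    """
    return sum(value * multipliers.get(name, 0.0)
               for name, value in properties.items())


def request_priority(files, multipliers):
    """An archive request takes the highest priority of any file in it."""
    return max(file_priority(props, multipliers) for props in files)
```

For example, with multipliers {"copy1": 10.0, "offline": 5.0}, a file with copy1 = 1 and offline = 1 has priority 15.0, and an archive request containing that file has a priority of at least 15.0. With no multipliers configured, every file has priority 0.0.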

The following sections describe the steps taken by the archiver from the initial file scan to the file copy process.

Step 1: Identifying Files to Archive

There is a separate sam-arfind process for each mounted file system. The sam-arfind process monitors each file system to determine the files that need archiving. The file system notifies its sam-arfind process whenever a file is changed in a manner that would affect its archival state. Examples of such changes are file modification, rearchiving, unarchiving, and renaming. When notified, the sam-arfind process examines the file to determine the archive action required.

The sam-arfind process determines the archive set to which the file belongs by using the file properties descriptions. The characteristics used for determining a file's archive set include the following:

If the archive age of the file for one or more copies has been met or exceeded, sam-arfind adds the file to one or more archive requests for the archive set. An archive request is a collection of files that belong to the same archive set. The archive request resides in the following directory:

/var/opt/SUNWsamfs/archiver/file_sys/ArchReq

The files in this directory are binary files, and you can display them by using the showqueue(1M) command.

Separate archive requests are used for files not yet archived and for files being rearchived. This allows scheduling to be controlled independently for these two types of files.

If the archive age of the file for one or more copies has not been met, the directory in which the file resides and the time at which the archive age is reached is added to a scan list. Directories are scanned as the scan list times are reached. Files that have reached their archive age are added to archive requests.

If a file is offline, the sam-arfind process selects the volumes to be used as the source for the archive copy. If the file copy is being rearchived, the sam-arfind process selects the volume containing the archive copy that is being rearchived.

If a file is segmented, only those segments that have changed are selected for archival. The index of a segmented file contains no user data, so it is treated as a member of the file system archive set and is archived separately.

There are two methods by which files are marked for archiving: continuous archiving and scanning. With continuous archiving, the archiver works with the file system to determine which files need to be archived. With scanning, the archiver periodically peruses the file systems and selects files for archiving. The following sections describe these methods.

Continuous Archiving

Continuous archiving is the default archiving method (the archiver.cmd file parameter is examine=noscan). With continuous archiving, you can specify scheduling start conditions for an archive set by using the -startage, -startcount, and -startsize parameters. These conditions enable you to optimize archive timeliness versus archive work done. For example:

When any of the scheduling start conditions is reached, the sam-arfind process sends each archive request to the archiver daemon, sam-archiverd, to be scheduled for file copying to archive media.

For more information about archiving parameters, see Global Archiving Directives.



Note - When examine is set to noscan, the following default settings are automatically implemented:

- startage: 10 minutes
- startsize: 10 GB
- startcount: 10,000 files
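The way the three start conditions combine can be sketched in Python as follows. This is an illustration only. It assumes that a condition is "reached" when the measured value is greater than or equal to the limit, that the age is measured in seconds from the time the first file was added to the archive request, and that 10 GB means 10 x 1024^3 bytes. The names StartConditions and should_schedule are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StartConditions:
    startage_s: int = 10 * 60          # 10 minutes, in seconds
    startsize_b: int = 10 * 1024 ** 3  # 10 GB (binary gigabytes assumed)
    startcount: int = 10_000           # number of files


def should_schedule(age_s, total_bytes, file_count, cond=StartConditions()):
    """Return True as soon as any one start condition is reached."""
    return (age_s >= cond.startage_s
            or total_bytes >= cond.startsize_b
            or file_count >= cond.startcount)
```

Under these assumptions, an archive request holding a single small file is sent for scheduling once it has waited 10 minutes, while a request that accumulates 10,000 files is sent immediately regardless of its age.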



Scanned Archiving

As an alternative to continuous archiving, you can specify examine=scan in the archiver.cmd file to direct sam-arfind to examine files for archival using scanning. Files needing archival are placed into archive requests. The sam-arfind process scans each file system periodically to determine which files need archiving. The first scan is a directory scan, in which sam-arfind descends recursively through the directory tree. The process examines each file and sets the archdone file status flag if the file does not need archiving. During successive scans, sam-arfind scans the .inodes file. Only inodes with the archdone flag not set are examined.

For information about controlling the setting of the archdone flag, see The setarchdone Directive: Controlling the Setting of the archdone Flag.

When the file system scan is complete, the sam-arfind process sends each archive request to the archiver daemon, sam-archiverd, to be scheduled for file copying to archive media. The sam-arfind process then sleeps for the duration specified by the interval=time directive. At the end of the interval, the sam-arfind process resumes scanning.

Step 2: Composing Archive Requests

When archive requests are received by the sam-archiverd daemon, they are composed. This section describes the composition process.

Because of the capacity of the archive media or of the controls specified in the archiver command file, the files in an archive request might not be archived all at one time. Composing is the process of selecting the files from the archive request to be archived at one time. When the archive copy operation is complete for an archive request, the archive request is recomposed if files remain to be archived.

The sam-archiverd daemon places the files in the archive request according to certain default and site-specific criteria. The default operation is to archive all the files in an archive request to the same archive volumes in the order in which they were found during the file system scan. The site-specific criteria enable you to control the order in which files are archived and how they can be distributed on volumes. These criteria, called archive set parameters, are evaluated in the following order: -reserve, -join, -sort, -rsort (reverse sort), and -drives. For more information on these parameters, see the archiver.cmd(4) man page.

If the archive request belongs to an archive set that has -reserve owner specified, the sam-archiverd daemon orders the files in the archive request according to the file's directory path, user name, or group name. The files belonging to the first owner are selected for archiving. The remaining files are archived later.

If the archive request belongs to an archive set that has -join method specified, the sam-archiverd daemon groups the files together according to the specified join method. If -sort or -rsort method is also specified, the sam-archiverd daemon sorts the files within each group according to the specified sort method. Each group of joined files is then treated like a single file for the remainder of the composing and scheduling processes.

If the archive request belongs to an archive set that has -sort or -rsort method specified, the sam-archiverd daemon sorts the files according to the specified sort method. Depending on the sort method, the sam-archiverd daemon tends to keep files together based on the sort method, age, size, or directory location. By default, archive requests are not sorted, so files are archived in the order in which they are encountered during the file system scan.

The sam-archiverd daemon determines whether the files are online or offline. If both online and offline files are in the archive request, the online files are selected for archiving first.

If the archive request was not required to be joined or sorted by a sort method, the offline files are ordered by the volume on which the archive copies reside. This ensures that all files in each archive set on the same volume are staged at the same time in the order in which they were stored on the media. When more than one archive copy of an offline file is being made, the offline file is not released until all required copies are made. All the files to be staged from the same volume as the first file are selected for archiving.
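The default ordering described in the two preceding paragraphs (online files first, then offline files grouped by source volume in media order) can be sketched in Python. This is an illustration under stated assumptions, not the daemon's actual implementation: the record layout (name, online, volume, position) and the function names are hypothetical, and the sketch applies only when no -join, -sort, or -rsort method is in effect.

```python
def compose_key(f):
    """Sort key: online files first, then offline files by volume and position."""
    if f["online"]:
        # Online files share one key, so the stable sort keeps them
        # in the order in which they were found.
        return (0, "", 0)
    return (1, f["volume"], f["position"])


def order_for_archiving(files):
    """Return the files in the order in which they would be selected."""
    return sorted(files, key=compose_key)
```

Because Python's sort is stable, online files retain their original relative order, and offline files that reside on the same volume end up adjacent, sorted by their position on the media.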



Note - The -join, -sort, and -rsort parameters can have a negative effect on performance during archiving of offline files if the order of the files to be archived does not match the order of the volumes needed for the offline files. Use these parameters only for the first archive copy to be made. Other copies should maintain the order of the first copy if enough archive media space is available when the copies are started.



After being composed, the archive requests are entered in the sam-archiverd daemon's scheduling queue, as described in the next section.

Step 3: Scheduling Archive Requests

The scheduler in the sam-archiverd daemon executes on demand when one of the following conditions exists:

The archive requests in the scheduling queue are ordered by priority. Each time the scheduler runs, all archive requests are examined to determine whether they can be assigned to a sam-arcopy process to have their files copied to archive media.

The following must be true in order for archive requests to be scheduled:

Drives

If the archive set has the -drives parameter specified, the sam-archiverd daemon divides the selected files in the archive request among multiple drives. If the number of drives available at this time is fewer than that specified by the -drives parameter, the smaller number is used.

If the total size of files in the archive request is less than the -drivemin value, only one drive is used. The -drivemin value is either the value specified by the -drivemin parameter or the archmax value. The archmax value is specified by the -archmax parameter or the value defined for the media. For more information on the -archmax parameter and the archmax= directive, see the archiver.cmd(4) man page.

If the total size of files in the archive request is more than the -drivemin value, then the number of drives used is determined by the total size of the files divided by the -drivemin value. If the resulting number of drives is fewer than the number specified by the -drives parameter, the smaller number is used.
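The drive-count rules in the preceding paragraphs can be summarized in a short Python sketch. It is illustrative only: the function name drives_to_use is hypothetical, sizes are assumed to be in bytes, and the division is assumed to round down to a whole number of drives.

```python
def drives_to_use(total_size, drives_param, drives_available, drivemin):
    """Number of drives used for one archive request (sketch)."""
    # Never use more drives than -drives allows or than are available now.
    limit = min(drives_param, drives_available)
    if limit < 1:
        return 0
    # Small requests use a single drive.
    if total_size < drivemin:
        return 1
    # Otherwise, one drive per drivemin bytes (integer division assumed),
    # capped by the limit computed above.
    by_size = max(1, total_size // drivemin)
    return min(limit, by_size)
```

For example, with -drives 4, four drives available, and a -drivemin value of 100, a request totaling 350 uses three drives, a request totaling 50 uses one drive, and a request totaling 1000 uses all four.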

Drives can take varying amounts of time to archive files. You can use the -drivemax parameter to obtain better drive utilization. The -drivemax parameter requires you to specify the maximum number of bytes to be written to a drive before that drive is rescheduled for more data.

Volumes

For archiving to occur, there must be at least one volume with enough space to hold at least some of the files in the archive request. If it has enough space, the volume that has most recently been used for the archive set is the one scheduled. This volume must not be in use by the archiver.

If a volume usable for the archive set is busy, another is selected, unless the -fillvsns parameter is specified. In this case, the archive request is not schedulable.

If an archive request is too big for one volume, the files that can fit on the volume are selected to be archived to the volume. If the archive request contains files that are too big to fit on one volume, and volume overflow for the archive request is not selected, the files cannot be archived. An appropriate message for this condition is sent to the log.

You can specify volume overflow for the archive set by using the -ovflmin parameter or for the media by using the ovflmin= directive. For more information about the -ovflmin parameter and the ovflmin= directive, see the archiver.cmd(4) man page. The ovflmin specification determines the file size threshold above which additional volumes or media are assigned for archiving. An ovflmin value specified for the archive set takes precedence over an ovflmin value specified for the media.

If the size of the files is less than the value of ovflmin, the files cannot be archived. An appropriate message for this condition is sent to the log. If the size of the files is more than the value of ovflmin, additional volumes are assigned as required. Volumes are selected in order of decreasing size in order to minimize the number of volumes required. If no usable volumes can be found for the archive request, the archive request waits.
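The volume overflow decision for a file that does not fit on any single volume can be sketched in Python as follows. This is a simplified illustration: it interprets "decreasing size" as decreasing free space, treats an unset ovflmin as "volume overflow not selected", and returns None both when the file cannot be archived and when the request has to wait. The function name plan_overflow is hypothetical.

```python
def plan_overflow(file_size, free_by_vsn, ovflmin=None):
    """Return the VSNs chosen to hold an overflowing file, or None.

    Precondition (assumed): file_size exceeds the free space of every
    single volume in free_by_vsn.
    """
    # Overflow not selected, or the file is below the overflow threshold.
    if ovflmin is None or file_size < ovflmin:
        return None
    chosen, remaining = [], file_size
    # Take the largest volumes first to minimize the number of volumes.
    for vsn, free in sorted(free_by_vsn.items(),
                            key=lambda item: item[1], reverse=True):
        if remaining <= 0:
            break
        chosen.append(vsn)
        remaining -= free
    # Not enough usable space in total: the archive request waits.
    return chosen if remaining <= 0 else None
```

For example, a file of size 250 with volumes A (100 free), B (200 free), and C (50 free) and an ovflmin value of 150 is assigned to volumes B and A, in that order.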

Certain properties, such as whether the file is online or offline, are used in conjunction with the archive priority to determine the scheduling priority for a particular archive request. For more information on customizing the priority multiplier, see the -priority parameters described on the archiver.cmd(4) man page.

For each archive request, the sam-archiverd daemon computes the scheduling priority by adding the archive priority to multipliers associated with various system resource properties. These properties are associated with the number of seconds for which the archive request has been queued, whether the first volume to be used in the archiving process is loaded into a drive, and so on.

Using the adjusted priorities, the sam-archiverd daemon assigns each ready archive request to be copied, as described in the next section.

Step 4: Archiving the Files in an Archive Request

When an archive request is ready to be archived, the sam-archiverd daemon marks the archive file (tarball) boundaries so that each archive file's size is less than the specified -archmax value. If a single file is larger than this value, it becomes the only file in an archive file.
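The boundary marking described above can be sketched in Python. This illustration works on a list of file sizes only, ignores tar(1) header overhead, and closes an archive file when adding the next user file would push it past the archmax value; those simplifications, and the name mark_boundaries, are assumptions made for the example.

```python
def mark_boundaries(sizes, archmax):
    """Group file sizes into archive files no larger than archmax.

    A single file larger than archmax becomes the only member of its
    archive file.
    """
    groups, current, current_size = [], [], 0
    for size in sizes:
        # Close the current archive file if this file would overfill it.
        if current and current_size + size > archmax:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups
```

With an archmax value of 6, the sizes 3, 3, 3, 10, 2, 2 are grouped as [3, 3], [3], [10], and [2, 2]; the oversized file of size 10 is written in an archive file of its own.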

For each archive request and each drive to be used, the sam-archiverd daemon assigns the archive request to a sam-arcopy process to copy the files to the archive media. The archive information is entered into the inode.

If archive logging is enabled, an archive log entry is created.

For each file that was staged, the disk space is released until all files in the list have been archived.

A variety of errors and file status changes can prevent a file from being successfully copied. Errors can include read errors from the cache disk and write errors to the volumes. Status changes can include modification since selection, a file that is open for writing, or a file that has been removed.

When the sam-arcopy process exits, the sam-archiverd daemon examines the archive request. If any files have not been archived, the archive request is recomposed.

Sample Default Output

CODE EXAMPLE 3-1 shows sample output from the archiver(1M) -l command.


CODE EXAMPLE 3-1 Output From the archiver(1M) -l Command
# archiver
Archive media:
default:mo
media:mo archmax:5000000
media:lt archmax:50000000
Archive devices:
device:mo20 drives_available:1 archive_drives:1
device:lt30 drives_available:1 archive_drives:1
Archive file selections:
Filesystem samfs1:
samfs1  Metadata
    copy:1  arch_age:240
big  path:. minsize:512000
    copy:1  arch_age:240
all  path:
    copy:1  arch_age:30
Archive sets:
all
    copy:1 media:mo
big
    copy:1 media:lt
samfs1
    copy:1 media:mo

Archive Log Files and Event Logging

The sam-arfind and sam-arcopy processes use the syslog facility and archiver.sh to log warnings and informational messages in a log file that contains information about each archived or automatically unarchived file. The log file is a continuous record of archival action. You can use the log file to locate earlier copies of files for traditional backup purposes.

This file is not produced by default. Use the logfile= directive in the archiver.cmd file to specify that a log file be created and to specify the name of the log file. For more information on the log file, see Using Archiver Directives in this chapter and the archiver.cmd(4) man page.

CODE EXAMPLE 3-2 shows sample lines from an archiver log file with definitions for each field.


CODE EXAMPLE 3-2 Archiver Log File Lines
A 2001/03/23 18:42:06 mo 0004A arset0.1 9a089.1329 samfs1 118.51 162514 t0/fdn f 0 56
A 2001/03/23 18:42:10 mo 0004A arset0.1 9aac2.1 samfs1 189.53 1515016 t0/fae f 0 56
A 2001/03/23 18:42:10 mo 0004A arset0.1 9aac2.b92 samfs1 125.53 867101 t0/fai f 0 56
A 2001/03/23 19:13:09 lt SLOT22 arset0.2 798.1 samfs1 71531.14 1841087 t0/fhh f 0 51
A 2001/03/23 19:13:10 lt SLOT22 arset0.2 798.e0e samfs1 71532.12 543390 t0/fhg f 0 51
A 2003/10/23 13:30:24 dk DISK01/d8/d16/f216 arset4.1 810d8.1 qfs2 119571.301 1136048 t1/fileem f 0 0
A 2003/10/23 13:30:25 dk DISK01/d8/d16/f216 arset4.1 810d8.8ad qfs2 119573.295 1849474 t1/fileud f 0 0
A 2003/10/23 13:30:25 dk DISK01/d8/d16/f216 arset4.1 810d8.16cb qfs2 119576.301 644930 t1/fileen f 0 0
A 2003/10/23 13:30:25 dk DISK01/d8/d16/f216 arset4.1 810d8.1bb8 qfs2 119577.301 1322899 t1/fileeo f 0 0

Reading left to right, the fields in the previous listing have the content shown in TABLE 3-1.


TABLE 3-1 Archiver Log File Fields

Field  Example Value  Content
1      A              Archive activity: A for archived, R for rearchived, U for unarchived.
2      2001/03/23     Date of the archive action, in yyyy/mm/dd format.
3      18:42:06       Time of the archive activity, in hh:mm:ss format.
4      mo             Archive media type. For information on media types, see the mcf(4) man page.
5      0004A          VSN. For removable media cartridges, this is the volume serial name. For disk archives, this is the disk volume name and archive tar(1) file path.
6      arset0.1       Archive set and copy number.
7      9a089.1329     Physical position of the start of the archive file on media (tar(1) file) and file offset within the archive file, in hexadecimal format.
8      samfs1         File system name.
9      118.51         Inode number and generation number. The generation number is used in addition to the inode number for uniqueness since inode numbers are reused.
10     162514         Length of the file if the file is written on only one volume. Length of the section if the file is written on multiple volumes.
11     t0/fdn         Path and name of the file relative to the file system's mount point.
12     f              Type of file: d for directory, f for regular file, l for symbolic link, R for removable media file, I for segment index, S for data segment.
13     0              Section of an overflowed file or segment. If the file is an overflowed file, the value is nonzero. For all other file types, the value is 0.
14     56             Equipment ordinal of the drive on which the file was archived.
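The field layout in TABLE 3-1 lends itself to simple scripted processing. The following Python sketch splits one archiver log line into named fields. It is an illustration, not a supported tool: the function name and dictionary keys are hypothetical, and it assumes that fields are separated by white space and that only the file path (field 11) can itself contain spaces.

```python
def parse_log_line(line):
    """Split one archiver log line into the 14 fields of TABLE 3-1."""
    parts = line.split()
    if len(parts) < 14:
        raise ValueError("expected at least 14 fields: %r" % line)
    # The first 10 and last 3 fields are fixed; the rest is the path.
    head, path_parts, tail = parts[:10], parts[10:-3], parts[-3:]
    archive_set, copy = head[5].rsplit(".", 1)
    position, offset = head[6].split(".")
    inode, generation = head[8].split(".")
    return {
        "activity": head[0],            # A, R, or U
        "date": head[1],
        "time": head[2],
        "media": head[3],
        "vsn": head[4],
        "archive_set": archive_set,
        "copy": int(copy),
        "position": int(position, 16),  # hexadecimal in the log
        "offset": int(offset, 16),      # hexadecimal in the log
        "file_system": head[7],
        "inode": int(inode),
        "generation": int(generation),
        "length": int(head[9]),
        "path": " ".join(path_parts),
        "type": tail[0],
        "section": int(tail[1]),
        "equipment": int(tail[2]),
    }
```

Applied to the first line of CODE EXAMPLE 3-2, the sketch yields archive set arset0, copy 1, inode 118, generation 51, length 162514, and path t0/fdn.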



About the archiver.cmd File

The archiver.cmd file controls the archiver's behavior. By default, the archiver runs whenever sam-fsd is started and a Sun StorEdge SAM-FS file system is mounted. In the absence of an archiver.cmd file, the archiver uses the following defaults:

Using directives located in the archiver command file (archiver.cmd), you can customize the actions of the archiver to meet the archiving requirements of your site.


To Create or Modify an archiver.cmd File and Propagate Your Changes

As an alternative to this method, you can use the File System Manager software to create or modify the archiver.cmd file. For more information, see the File System Manager online help.

1. (Optional) Decide whether you want to edit the actual archiver.cmd file or a temporary archiver.cmd file.

Perform this step if you have an /etc/opt/SUNWsamfs/archiver.cmd file and your system is already archiving files. Consider copying your archiver.cmd file to a temporary location where you can edit and test it before putting it into production.

2. Use vi(1) or another editor to edit the file.

Add the directives you need in order to control archiving at your site. For information on the directives you can include in this file, see Using Archiver Directives and About Disk Archiving.

3. Save and close the file.

4. Use the archiver(1M) -lv command to verify the correctness of the file.

Whenever you make changes to the archiver.cmd file, you should check for syntax errors using the archiver(1M) command. Specifying the archiver(1M) command as follows evaluates an archiver.cmd file against the current Sun StorEdge SAM-FS system:


# archiver -lv

This command produces a list of all options and writes a list of the archiver.cmd file, volumes, file system content, and errors to the standard output file (stdout). Errors prevent the archiver from running.

By default, the archiver(1M) command evaluates the /etc/opt/SUNWsamfs/archiver.cmd file for errors. If you are working with a temporary archiver.cmd file, use the -c option with the archiver(1M) command and supply this temporary file's name.

5. If you encounter errors, correct them in the file and run the archiver(1M) command again to verify your corrections.

You must correct all errors before you move on to the next step. The archiver does not archive any files if it finds errors in the archiver.cmd file.

6. If you are working with a temporary file, move it to /etc/opt/SUNWsamfs/archiver.cmd.

7. Use the samd(1M) config command to propagate the file changes and restart the system.


# samd config

The archiver.cmd File

The archiver.cmd file consists of the following types of directives:

The directives consist of lines of text read from the archiver.cmd file. Each directive line contains one or more fields separated by spaces or tabs. Any text that appears after the pound sign character (#) is treated as a comment and is not examined. You can continue long directives to a second line by ending the first line with a backslash (\).
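The line syntax described above (fields separated by spaces or tabs, # comments, and backslash continuation) can be sketched in Python. This is a rough illustration rather than the archiver's actual parser: the function name read_directives is hypothetical, and the sketch assumes that comments are stripped before the continuation backslash is checked.

```python
def read_directives(text):
    """Return a list of directives, each a list of white-space-separated fields."""
    directives, pending = [], ""
    for raw in text.splitlines():
        # Everything after the pound sign is a comment.
        line = raw.split("#", 1)[0].rstrip()
        # A trailing backslash continues the directive on the next line.
        if line.endswith("\\"):
            pending += line[:-1] + " "
            continue
        fields = (pending + line).split()
        pending = ""
        if fields:
            directives.append(fields)
    # A continuation on the last line is kept as its own directive.
    if pending.split():
        directives.append(pending.split())
    return directives
```

Blank lines and comment-only lines produce no directive, and a directive continued across two lines is returned as a single list of fields.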

Certain directives in the archiver.cmd file require you to specify a unit of time or a unit in bytes. To specify such a unit, use one of the letters in TABLE 3-2.


TABLE 3-2 archiver.cmd File Directive Units

Unit Suffix  Description

Time suffixes:
s            Seconds
m            Minutes
h            Hours
d            Days
w            Weeks
y            Years

Size suffixes:
b            Bytes
k            Kilobytes
M            Megabytes
G            Gigabytes
T            Terabytes
P            Petabytes
E            Exabytes
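To show how suffixed values expand, the following Python sketch converts a value such as 30m or 100M to seconds or bytes. It is illustrative only and makes several assumptions that the table does not confirm: size units are powers of 1024, a year is 365 days, values are whole numbers, and a value without a suffix is already in the base unit. Only the exact suffix letters listed in TABLE 3-2 are recognized; any other trailing letter raises ValueError.

```python
# Multipliers are assumptions: 365-day years and binary (1024-based) sizes.
TIME_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400,
              "w": 7 * 86400, "y": 365 * 86400}
SIZE_UNITS = {"b": 1, "k": 1024, "M": 1024 ** 2, "G": 1024 ** 3,
              "T": 1024 ** 4, "P": 1024 ** 5, "E": 1024 ** 6}


def parse_value(text, units):
    """Convert a suffixed value to its base unit (seconds or bytes)."""
    suffix = text[-1]
    if suffix in units:
        return int(text[:-1]) * units[suffix]
    # No recognized suffix: treat the whole string as a plain number.
    return int(text)
```

Under these assumptions, parse_value("30m", TIME_UNITS) gives 1800 seconds and parse_value("100M", SIZE_UNITS) gives 104857600 bytes. Time and size values are parsed with separate tables because the letters m and M mean different things in each.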


Example archiver.cmd File

CODE EXAMPLE 3-3 shows a sample archiver.cmd file. The comments at the right indicate the various types of directives.


CODE EXAMPLE 3-3 Example archiver.cmd File
interval = 30m                    # General directives
logfile = /var/opt/SUNWsamfs/archiver/archiver.log
 
fs = samfs1                       # Archive Set Assignments
no_archive tmp
work work
    1 1h
    2 3h
images images -minsize 100m
    1 1d
    2 1w
samfs1_all  .
    1 1h
    2 1h
fs = samfs2                       # Archive Set Assignments
no_archive tmp
system . -group sysadmin
    1 30m
    2 1h
samfs2_all  .
    1 10m
    2 2h
params                            # Archive Set Directives
allsets -drives 2
images.1 -join path -sort size
endparams
vsns                              # VSN Associations
samfs1.1   mo      optic-2A
samfs1.2   lt      TAPE01
work.1     mo      optic-[3-9][A-Z]
work.2     lt      .*
images.1   lt      TAPE2[0-9]
images.2   lt      TAPE3[0-9]
samfs1_all.1       mo.*
samfs1_all.2       lt.*
samfs2.1   mo      optic-2A
samfs2.2   lt      TAPE01
system.1   mo      optic08a optic08b
system.2   lt      ^TAPE4[0-1]
samfs2_all.1       mo.*
samfs2_all.2       lt.*
endvsns


Using Archiver Directives

The following sections explain the archiver.cmd directives. They are as follows:

Global Archiving Directives

Global directives control the overall archiver operation and enable you to optimize archiver operations for your site's configuration. You can add global directives directly to the archiver.cmd file, or you can specify them using the File System Manager software. For more information on using File System Manager to set global directives, see the File System Manager online help.

Global directives in the archiver.cmd file can be identified either by the equal sign (=) in the second field or by the absence of additional fields.

Global directives must be specified prior to any fs= directives in the archiver.cmd file. The fs= directives are those that pertain to specific file systems. The archiver issues a message if it detects a global directive after an fs= directive.

The archivemeta Directive: Controlling Whether Metadata Is Archived

The archivemeta directive controls whether file system metadata is archived. If files are often moved around and there are frequent changes to the directory structures in a file system, it is a good idea to archive the file system metadata. In contrast, if the directory structures are very stable, you can disable metadata archiving and thereby reduce the actions performed by removable media drives as cartridges are loaded and unloaded. By default, metadata is archived.

This directive has the following format:


archivemeta=state

For state, specify either on or off. The default is on.

The metadata archiving process depends on whether you are using a Version 1 or a Version 2 superblock, as follows:

The archmax Directive: Controlling the Size of Archive Files

The archmax directive specifies the maximum size of an archive file. User files are combined to form the archive file. No more user files are added to the archive file after the target-size value is met. Large user files are written in a single archive file.

To change the defaults, use the following directive:


archmax=media target-size

TABLE 3-3 Arguments for the archmax Directive

Argument     Meaning
media        The media type. For the list of valid media types, see the mcf(4) man page.
target-size  The maximum size of the archive file. This value is media-dependent. By default, archive files written to optical disks are no larger than 5 megabytes. The default maximum archive file size for tapes is 512 megabytes.


 

There are advantages and disadvantages to setting large or small sizes for archive files. For example, if you are archiving to tape and archmax is set to a large size, the tape drive stops and starts less often. However, when writing large archive files, there is the possibility that when an end-of-tape is reached prematurely, a large amount of tape can be wasted. As a rule, archmax should not be set to more than 5 percent of the media capacity.

The archmax directive can also be set for an individual archive set.

The bufsize Directive: Setting the Archiver Buffer Size

By default, a file being archived is copied to archive media using a memory buffer. You can use the bufsize directive to specify a nondefault buffer size and, optionally, to lock the buffer. These actions can improve performance, and you can experiment with different buffer-size values.

This directive has the following format:


bufsize=media buffer-size [lock]

TABLE 3-4 Arguments for the bufsize Directive

Argument       Meaning
media          The media type. For the list of valid media types, see the mcf(4) man page.
buffer-size    A number from 2 through 32. The default is 4. This value is multiplied by the dev_blksize value for the media type, and the resulting buffer size is used. The dev_blksize value is specified in the defaults.conf file. For more information on this file, see the defaults.conf(4) man page.
lock           Indicates whether the archiver should use locked buffers when making archive copies. If lock is specified, the archiver sets file locks on the archive buffer in memory for the duration of the sam-arcopy(1M) operation. This avoids the overhead associated with locking and unlocking the buffer for each I/O request and can thereby result in a reduction in system CPU time. The lock argument should be specified only on large systems with large amounts of memory. Insufficient memory can cause an out-of-memory condition. The lock argument is effective only if direct I/O is enabled for the file being archived. By default, lock is not specified and the file system sets the locks on all direct I/O buffers, including those for archiving. For more information on enabling direct I/O, see the setfa(1) man page, the sam_setfa(3) library routine man page, or the -O forcedirectio option on the mount_samfs(1M) man page.

You can specify a buffer size and a lock on an archive set basis by using the -bufsize and -lock archive set copy parameters. For more information, see Archive Set Copy Parameters.
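The relationship between the buffer-size argument and the resulting buffer can be sketched in Python as follows. The function name and the dev_blksize value of 256 are hypothetical; the sketch treats dev_blksize as an opaque number in whatever unit the defaults.conf file uses and returns the product in that same unit.

```python
def archive_buffer_size(dev_blksize, buffer_size=4):
    """Return buffer-size multiplied by dev_blksize.

    The result is expressed in the same unit as dev_blksize.
    """
    # The directive accepts a number from 2 through 32; the default is 4.
    if not 2 <= buffer_size <= 32:
        raise ValueError("buffer-size must be a number from 2 through 32")
    return buffer_size * dev_blksize


# Hypothetical dev_blksize of 256 with the default and a larger buffer-size.
print(archive_buffer_size(256))
print(archive_buffer_size(256, 16))
```

Because the multiplier applies to the block size of the media type, changing dev_blksize in defaults.conf also changes the effective archiver buffer.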

The drives Directive: Controlling the Number of Drives Used for Archiving

By default, the archiver uses all of the drives in an automated library for archiving. To limit the number of drives used, use the drives directive.

This directive has the following format:


drives=auto-lib count

TABLE 3-5 Arguments for the drives Directive

Argument    Meaning
auto-lib    The family set name of the automated library as defined in the mcf file.
count       The number of drives to be used for archiving activities.

Also see the -drivemax, -drivemin, and -drives archive set copy parameters described in Specifying the Number of Drives for an Archive Request: -drivemax, -drivemin, and -drives.

The examine Directive: Controlling Archive Scans

New files and files that have changed are candidates for archiving. The archiver finds such files through either continuous archiving or scan-based archiving, and the examine directive selects which method is used.

This directive has the following format:


examine=method

For method, specify one of the keywords shown in TABLE 3-6.


TABLE 3-6 Values for the examine Directive's method Argument

method Value    Meaning
noscan          Specifies continuous archiving. After the initial scan, directories are scanned only when the content changes and archiving is required. Directory and inode information is not scanned. This archiving method provides better performance than scan-based archiving, particularly for file systems with more than 1,000,000 files. Default.
scan            Specifies scan-based archiving. The initial file system scan is a directory scan. Subsequent scans are inode scans.
scandirs        Specifies scan-based archiving on directories only. If the archiver finds a directory with the no_archive attribute set, that directory is not scanned. Files that do not change can be placed in such a directory, and this can dramatically reduce the amount of time spent on archiving scans.
scaninodes      Specifies scan-based archiving on inodes only.


The interval Directive: Specifying an Archive Interval

The archiver runs periodically to examine the status of all mounted Sun StorEdge SAM-FS file systems. The timing is controlled by the archive interval, which is the time between scan operations on each file system. To change the time, use the interval directive.

The interval directive initiates full scans only when continuous archiving is not set and no startage, startsize, or startcount parameters have been specified. If continuous archiving is set (examine=noscan), the interval directive acts as the default startage value.

This directive has the following format:


interval=time

For time, specify the amount of time you want between scan operations on a file system. By default, time is interpreted in seconds and has a value of 600, which is 10 minutes. You can specify a different unit of time, such as minutes or hours, as described in TABLE 3-2.

If the archiver receives the samu(1M) utility's :arrun command, it begins scanning all file systems immediately. If the examine=scan directive is also specified in the archiver.cmd file, a scan is performed after :arrun or :arscan is issued.

If the hwm_archive mount option is set for the file system, the archive interval can be shortened automatically. This mount option specifies that the archiver commence its scan when the file system is filling up and the high water mark is crossed. The high=percent mount option sets the high water mark for the file system.

For more information on specifying the archive interval, see the archiver.cmd(4) man page. For more information on setting mount options, see the mount_samfs(1M) man page.

The logfile Directive: Specifying an Archiver Log File

The archiver can produce a log file that contains information about each file that is archived, rearchived, or automatically unarchived. The log file is a continuous record of archival action. To specify a log file, use the logfile directive.

This directive has the following format:


logfile=pathname

For pathname, specify the absolute path and name of the log file. By default, this file is not produced.

The logfile directive can also be set for an individual file system.


To Back Up an Archiver Log File

Assume that you want to back up the archiver log file every day by copying the previous day's log file to an alternate location. Be sure to perform the copy operation when the archiver log file is closed, not while it is open for a write operation.

1. Use the mv(1) command to move the archiver log file within a UNIX file system.

This gives any sam-arfind(1M) or sam-arcopy(1M) operations time to finish writing to the archiver log file.

2. Use the mv(1) command to move the previous day's archiver log file to the Sun StorEdge SAM-FS file system.

The notify Directive: Renaming the Event Notification Script

The notify directive sets the name of the archiver's event notification script file. This directive has the following format:


notify=filename

For filename, specify the name of the file containing the archiver event notification script or the full path to this file.

The default file name is as follows:

/etc/opt/SUNWsamfs/scripts/archiver.sh

The archiver executes this script to process various events in a site-specific manner. The script is called with one of the following keywords for the first argument: emerg, alert, crit, err, warning, notice, info, and debug.

Additional arguments are described in the default script. For more information, see the archiver.sh(1M) man page.

The ovflmin Directive: Controlling Volume Overflow

With volume overflow, archived files are allowed to span multiple volumes. Volume overflow is enabled when you use the ovflmin directive in the archiver.cmd file. When a file size exceeds the value of the ovflmin directive's minimum-file-size argument, the archiver writes a portion of this file to another available volume of the same type. The portion of the file written to each volume is called a section.



Note - Use volume overflow with caution, and only after thoroughly assessing its effect on your site. Disaster recovery and recycling are much more difficult with files that span volumes. For more information, see the Sun StorEdge SAM-FS Troubleshooting Guide and the request(1) man page.



The archiver controls volume overflow through the ovflmin directive. The ovflmin directive specifies the file size threshold that triggers the overflow process. By default, volume overflow is disabled.

This directive has the following format:


ovflmin = media minimum-file-size

TABLE 3-7 Arguments for the ovflmin Directive

Argument             Meaning
media                The media type. For a list of valid media types, see the mcf(4) man page.
minimum-file-size    The minimum file size that you want to trigger the volume overflow.

Assume that many files exist with a length that is a significant fraction (such as 25 percent) of an mo media cartridge. These files partially fill the volumes and leave unused space on each volume. To get better packing of the volumes, set ovflmin for mo media to a size slightly smaller than the size of the smallest file. The following directive sets it to 150 megabytes:


ovflmin=mo 150m

Note that enabling volume overflow in this example also causes two volumes to be loaded for archiving and staging the file because each file will overflow onto another volume.

The ovflmin directive can also be set for an individual archive set.

The sls(1) command output lists the archive copy showing each section of the file on each VSN. CODE EXAMPLE 3-4 shows the archiver log file and CODE EXAMPLE 3-5 shows the sls -D command output for a large file named file50 that spans multiple volumes.


CODE EXAMPLE 3-4 Archiver Log File Example
A  97/01/13  16:03:29  lt  DLT000  big.1  7eed4.1  samfs1  13.7  477609472  00 big/file50  0  0
A  97/01/13  16:03:29  lt  DLT001  big.1  7fb80.0  samfs1  13.7  516407296  01 big/file50  0  1
A  97/01/13  16:03:29  lt  DLT005  big.1  7eb05.0  samfs1  13.7  505983404  02 big/file50  0  2

CODE EXAMPLE 3-4 shows that file50 spans three volumes with VSNs of DLT000, DLT001, and DLT005. The position on the volume and the size of each section are indicated in the seventh and tenth fields, respectively (7eed4.1 and 477609472 for the first entry), and match the sls -D output shown in CODE EXAMPLE 3-5. For a complete description of the archiver log entry, see the archiver(1M) man page.

CODE EXAMPLE 3-5 shows the sls -D command and output.


CODE EXAMPLE 3-5 sls(1) -D Command and Output
# sls -D file50
file50:
  mode: -rw-rw----  links:   1  owner: gmm    group: sam
  length: 1500000172  admin id:      7  inode: 1407.5
  offline;   archdone;  stage -n
  copy1: ---- Jan 13 15:55            lt
    section 0:  477609472     7eed4.1    DLT000
    section 1:  516407296     7fb80.0    DLT001
    section 2:  505983404     7eb05.0    DLT005
  access:      Jan 13 17:08  modification: Jan 10 18:03
  changed:     Jan 10 18:12  attributes:   Jan 13 16:34
  creation:    Jan 10 18:03  residence:    Jan 13 17:08
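
The correspondence between the archiver log entries and the sls -D output can be checked with a short script. The following Python sketch is illustrative only: it splits each log line from CODE EXAMPLE 3-4 on white space, takes the seventh and tenth fields as the position and section size (as described above), and assumes from the sample that the fifth field is the VSN. The function and variable names are hypothetical.

```python
LOG_LINES = """\
A  97/01/13  16:03:29  lt  DLT000  big.1  7eed4.1  samfs1  13.7  477609472  00 big/file50  0  0
A  97/01/13  16:03:29  lt  DLT001  big.1  7fb80.0  samfs1  13.7  516407296  01 big/file50  0  1
A  97/01/13  16:03:29  lt  DLT005  big.1  7eb05.0  samfs1  13.7  505983404  02 big/file50  0  2
""".splitlines()


def parse_section(line):
    """Extract the VSN, position, and section size from one log entry."""
    fields = line.split()
    return {
        "vsn": fields[4],        # fifth field (assumed to be the VSN)
        "position": fields[6],   # seventh field
        "size": int(fields[9]),  # tenth field
    }


sections = [parse_section(line) for line in LOG_LINES]
for section in sections:
    print(section["vsn"], section["position"], section["size"])

# The section sizes should add up to the file length reported by sls -D.
print("total:", sum(section["size"] for section in sections))
```

The three section sizes add up to 1500000172 bytes, which is the length that sls -D reports for file50.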

Volume overflow files do not generate checksums. For more information on using checksums, see the ssum(1) man page.

The scanlist_squash Directive: Controlling Scanlist Consolidation

The scanlist_squash parameter turns scanlist consolidation on or off. The default setting is off. This parameter can be either global or file-system-specific.

When this option is turned on, the scan list entries for files in two or more subdirectories with the same parent directory that need to be scanned by sam-arfind at a much later time are consolidated. When the scanlist is consolidated, these directories are combined upwards to a common parent, which results in a deep recursive scan over many subdirectories. This can cause a severe performance penalty when you archive a file system that has a large number of changes to many subdirectories.

The setarchdone Directive: Controlling the Setting of the archdone Flag

The setarchdone parameter is a global directive that controls the setting of the archdone flag when the file is examined by sam-arfind.

This directive has the following format:


setarchdone=on|off

When all archive copies for a file have been made, the archdone flag is set for that file to indicate that no further archive action is required. During an inode scan, the archiver detects whether the archdone flag is set, and if it is set, the archiver does not look up the path name for the inode.

During directory scans, the archiver also sets the archdone flag for files that will never be archived. This can be a time-consuming operation and can impact performance when large directories are scanned. The setarchdone directive gives you control over this activity. The default setting for the directive is off if the examine directive is set to scandirs or noscan.

This directive controls the setting of the archdone flag only on files that will never be archived. It does not affect the setting of the archdone flag after archive copies are made.

The wait Directive: Delaying Archiver Startup

The wait directive causes the archiver to wait for a start signal from samu(1M) or File System Manager. By default, the archiver begins archiving when started by sam-fsd(1M).

This directive has the following format:


wait

The wait directive can also be set for an individual file system.

File System Directives

After the general directives in the archiver.cmd file, you can use the fs= directive to include directives specific to a particular file system. After an fs= directive is encountered, the archiver assumes that all subsequent directives specify actions to be taken only for individual file systems.

You can specify fs= directives by editing the archiver.cmd file, as described in the following sections, or by using the File System Manager software. See the File System Manager online help for more information.

The fs Directive: Specifying the File System

By default, archiving controls apply to all file systems. However, you can confine some controls to an individual file system. For instance, you can use this directive to specify a different log file for each file system. To specify an individual file system, use the fs directive.

This directive has the following format:


fs=fsname

For fsname, specify the file system name as defined in the mcf file.

The general directives and archive set association directives that occur after this directive apply only to the specified file system until another fs= directive is encountered.

Global and File System Directives

Several directives can be specified both as global directives for all file systems and as directives that are specific to one file system. These directives are as follows:

Archive Set Assignment Directive

By default, files are archived as part of the archive set named for the file system. However, you can specify archive sets to include files that share similar characteristics. If a file does not match one of the specified archive sets, it is archived as part of the default archive set named for the file system.

You can create archive sets by directly editing the archiver.cmd file, as described in the following sections, or by using the File System Manager software. In File System Manager, an archive policy defines an archive set. For more information, see the File System Manager online help.

Assigning Archive Sets

The archive set membership directives assign files with similar characteristics to archive sets. The syntax of these directives is patterned after the find(1) command. Each archive set assignment directive has the following format:


archive-set-name path [search-criterion1 search-criterion2 ... ] [file-attribute1 file-attribute2 ... ]

TABLE 3-8 Arguments for the Archive Set Assignment Directive

Argument                               Meaning
archive-set-name                       A site-defined name for the archive set. Must be the first field in the archive set assignment directive. An archive set name is usually indicative of the characteristics of the files belonging to the archive set. Archive set names are restricted to the letters in the alphabet, numbers, and the underscore character (_). No other special characters or spaces are allowed. The first character in the archive set name must be a letter. To prevent archiving for various files, specify no_archive as the archive-set-name value.
path                                   A path relative to the mount point of the file system. This allows an archive set membership directive to apply to multiple Sun StorEdge SAM-FS file systems. If the path is to include all of the files in a file system, use a period (.) for the path field. A leading slash (/) is not allowed in the path. Files in the directory specified by path, and its subdirectories, are considered for inclusion in this archive set.
search-criterion1 search-criterion2    Zero, one, or more search-criterion arguments can be specified. Search criteria can be specified to restrict the archive set according to file size, file ownership, and other factors. For information on possible search-criterion arguments, see the following sections.
file-attribute1 file-attribute2        Zero, one, or more file-attribute values can be specified. These file attributes are set for files as the sam-arfind process scans a file system during archiving.

CODE EXAMPLE 3-6 shows typical archive set membership directives.


CODE EXAMPLE 3-6 Archive Set Membership Directives
hmk_files    net/home/hmk     -user hmk
datafiles    xray_group/data  -size 1M
system       .

You can suppress the archiver by including files in an archive set named no_archive. CODE EXAMPLE 3-7 shows lines that prevent archiving of files in a tmp directory, at any level, and regardless of the directory in which the tmp directory resides within the file system.


CODE EXAMPLE 3-7 Archiving Directives That Prevent Archiving
fs = samfs1
no_archive tmp
no_archive . -name .*/tmp/

The following sections describe the search-criterion arguments that you can specify.

File Age search-criterion: -access and -nftv

You can use the -access age characteristic to specify that the age of a file be used to determine archive set membership. When you use this characteristic, files with access times older than age are rearchived to different media. For age, specify an integer followed by one of the suffixes shown in TABLE 3-9.


TABLE 3-9 -access age Suffixes

Suffix    Meaning
s         Seconds
m         Minutes
h         Hours
d         Days
w         Weeks
y         Years


For example, you can use this directive to specify that files that have not been accessed in a long time be rearchived to less expensive media.
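
An age value such as 10m or 4w is an integer followed by one of the suffixes listed for -access. The following Python sketch shows one way to convert such a value to seconds. It is illustrative only: the function name is hypothetical, and the sketch assumes a year of 365 days, which might not be how the archiver interprets the y suffix.

```python
SECONDS_PER_UNIT = {
    "s": 1,
    "m": 60,
    "h": 60 * 60,
    "d": 24 * 60 * 60,
    "w": 7 * 24 * 60 * 60,
    "y": 365 * 24 * 60 * 60,  # assumption: a year is 365 days
}


def age_to_seconds(age):
    """Convert an age such as '10m' or '4w' to a number of seconds."""
    value, suffix = age[:-1], age[-1]
    if not value.isdigit() or suffix not in SECONDS_PER_UNIT:
        raise ValueError("age must be an integer followed by s, m, h, d, w, or y")
    return int(value) * SECONDS_PER_UNIT[suffix]


for example in ("30s", "10m", "1h", "10w"):
    print(example, "=", age_to_seconds(example), "seconds")
```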

When determining age, the software validates the access and modification times for files to ensure that these times are greater than or equal to the file creation time, and less than or equal to the time at which the file is examined. For files that have been "migrated" into a directory, this validation might not result in the desired behavior. The -nftv (no file time validation) parameter can be used in these situations to prevent the validation of file access and modification times.

File Age search-criterion: -after

You can use the -after date-time characteristic to group newly modified or created files into the same archive set. When you use this characteristic, only files created or modified after the date indicated are included in the archive set.

The format of date-time is YYYY-MM-DD[Thh:mm:ss][Z] (ISO 8601 format). If the time portion is not specified, it is assumed to be 00:00:00. If the Z is present, the time is interpreted as Coordinated Universal Time (UTC); otherwise it is interpreted as local time.
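
The date-time rules above can be sketched in Python as follows. This is an illustration of the format only, not the archiver's own parser; the function name and the sample dates are hypothetical. A value without Z is returned as a time with no time zone attached, which stands for local time.

```python
import re
from datetime import datetime, timezone

# YYYY-MM-DD, an optional Thh:mm:ss part, and an optional trailing Z.
_DATE_TIME = re.compile(r"^(\d{4}-\d{2}-\d{2})(?:T(\d{2}:\d{2}:\d{2}))?(Z)?$")


def parse_after(value):
    """Parse a -after date-time value into a datetime object."""
    match = _DATE_TIME.match(value)
    if match is None:
        raise ValueError("expected YYYY-MM-DD[Thh:mm:ss][Z]")
    date_part, time_part, utc_flag = match.groups()
    # A missing time portion is taken to be 00:00:00.
    stamp = datetime.strptime(
        f"{date_part}T{time_part or '00:00:00'}", "%Y-%m-%dT%H:%M:%S"
    )
    # A trailing Z marks the time as UTC; otherwise it is left as local time.
    return stamp.replace(tzinfo=timezone.utc) if utc_flag else stamp


print(parse_after("2005-10-08"))
print(parse_after("2005-10-08T12:15:47"))
print(parse_after("2005-10-08T12:15:47Z"))
```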

File Size search-criterion: -minsize and -maxsize

The size of a file can be used to determine archive set membership through the -minsize size and -maxsize size characteristics. For size, specify an integer followed by one of the letters shown in TABLE 3-10.


TABLE 3-10 -minsize and -maxsize size Suffixes

Letter    Meaning
b         Bytes
k         Kilobytes
M         Megabytes
G         Gigabytes
T         Terabytes
P         Petabytes
E         Exabytes


Example. The lines in CODE EXAMPLE 3-8 specify that all files of at least 500 kilobytes, but less than 100 megabytes, belong to the archive set big_files. Files of at least 100 megabytes belong to the archive set huge_files.


CODE EXAMPLE 3-8 Using the -minsize and -maxsize Directive Examples
big_files  .  -minsize 500k  -maxsize 100M
huge_files .  -minsize 100M
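
The size ranges in CODE EXAMPLE 3-8 can be modeled with a few lines of Python. The sketch below is illustrative only. It assumes binary multiples (1 kilobyte equals 1024 bytes), which this chapter does not state, and it treats -minsize as inclusive and -maxsize as exclusive, following the wording of the example. The function names are hypothetical.

```python
# Assumption: each suffix is a power of 1024.
MULTIPLIER = {
    "b": 1,
    "k": 1024,
    "M": 1024 ** 2,
    "G": 1024 ** 3,
    "T": 1024 ** 4,
    "P": 1024 ** 5,
    "E": 1024 ** 6,
}


def size_to_bytes(size):
    """Convert a size such as '500k' or '100M' to bytes."""
    return int(size[:-1]) * MULTIPLIER[size[-1]]


def archive_set_for(file_size):
    """Return the archive set from CODE EXAMPLE 3-8 that selects a file."""
    if size_to_bytes("500k") <= file_size < size_to_bytes("100M"):
        return "big_files"
    if file_size >= size_to_bytes("100M"):
        return "huge_files"
    return None  # neither directive matches; the default archive set applies


for size in (1000, 500 * 1024, 100 * 1024 ** 2):
    print(size, archive_set_for(size))
```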

Owner and Group search-criterion: -user and -group

The ownership and group affiliation can be used to determine archive set membership through the -user name and -group name characteristics. In CODE EXAMPLE 3-9, all files belonging to user sysadmin belong to archive set adm_set, and all files with the group name of marketing are in the archive set mktng_set.


CODE EXAMPLE 3-9 Using the -user and -group Directive Examples
adm_set    .   -user sysadmin
mktng_set  .   -group marketing

File Name search-criterion Using Pattern Matching: -name regex

The names of files to be included in an archive set can be specified by regular expressions. The -name regex specification as a search-criterion directive specifies that any complete path matching the regular expression regex is to be a member of the archive set.

The regex argument follows the conventions outlined in the regexp(5) man page. Note that regular expressions do not follow the same conventions as UNIX wildcards.

All files beneath the selected directory (with their specified paths relative to the mount point of the file system) go through pattern matching. This allows you to create patterns in the -name regex field to match both file names and path names.

Examples

The following directive restricts files in the archive set images to those files ending with .gif:


images  . -name \.gif$

The following directive selects files that start with the characters GEO:


satellite  . -name /GEO

You can use regular expressions with the no_archive archive set. The following specification prevents any file ending with .o from being archived:


no_archive  . -name \.o$
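
You can try patterns such as these before placing them in the archiver.cmd file. The following Python sketch uses Python's re module as a stand-in for the regexp(5) matcher. The two differ in detail, so treat the results as a rough check only; the sample paths and the function name are hypothetical.

```python
import re


def name_matches(regex, relative_path):
    """Return True if the pattern is found anywhere in the path."""
    return re.search(regex, relative_path) is not None


CHECKS = [
    (r"\.gif$", "web/logo.gif"),       # ends with .gif
    (r"\.gif$", "web/logo.gif.bak"),   # .gif is not at the end
    (r"/GEO", "sat/GEO_1998.dat"),     # path component starts with GEO
    (r"/GEO", "sat/oldGEO.dat"),       # GEO is not at the start of a component
    (r"\.o$", "src/main.o"),           # ends with .o
    (r"\.o$", "src/main.obj"),         # .o is not at the end
]

for regex, path in CHECKS:
    print(regex, path, name_matches(regex, path))
```

Because the pattern is applied to the whole relative path, /GEO also selects files inside any directory whose name starts with GEO.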

Assume that your archiver.cmd file contains the lines shown in CODE EXAMPLE 3-10.


CODE EXAMPLE 3-10 Regular Expression Example
# File selections.
fs = samfs1
     1 1s
     2 1s
no_archive share/marketing -name fred\.

With this archiver.cmd file, the archiver does not archive fred.* in the user directories or subdirectories. CODE EXAMPLE 3-11 shows the files not archived if you specify the directives shown in CODE EXAMPLE 3-10.


CODE EXAMPLE 3-11 Files Not Archived (Using Directives Shown in CODE EXAMPLE 3-10 )
/sam1/share/marketing/fred.anything
/sam1/share/marketing/first_user/fred.anything
/sam1/share/marketing/first_user/first_user_sub/fred.anything

CODE EXAMPLE 3-12 shows the files that are archived if you specify the directives shown in CODE EXAMPLE 3-10.


CODE EXAMPLE 3-12 Files Archived (Using Directives Shown in CODE EXAMPLE 3-10 )
/sam1/fred.anything
/sam1/share/fred.anything
/sam1/testdir/fred.anything
/sam1/testdir/share/fred.anything
/sam1/testdir/share/marketing/fred.anything
/sam1/testdir/share/marketing/second_user/fred.anything

In contrast to CODE EXAMPLE 3-10, assume that your archiver.cmd file contains the lines shown in CODE EXAMPLE 3-13.


CODE EXAMPLE 3-13 Example archiver.cmd File
# File selections.
fs = samfs1
     1 1s
     2 1s
no_archive share/marketing -name ^share/marketing/[^/]*/fred\.

The archiver.cmd file in CODE EXAMPLE 3-13 does not archive fred.* in the user home directories. It does archive fred.* in the user subdirectories and in the directory share/marketing. In this case, a user home directory is anything from share/marketing/ until the next slash character (/). As a result, the following file is not archived:


/sam1/share/marketing/first_user/fred.anything

CODE EXAMPLE 3-14 shows the files that are archived if you specify the directives shown in CODE EXAMPLE 3-13.


CODE EXAMPLE 3-14 Files Archived (Using Directives Shown in CODE EXAMPLE 3-13 )
/sam1/share/fred.anything
/sam1/share/marketing/fred.anything
/sam1/share/marketing/first_user/first_user_sub/fred.anything
/sam1/fred.anything
/sam1/testdir/fred.anything
/sam1/testdir/share/fred.anything
/sam1/testdir/share/marketing/fred.anything
/sam1/testdir/share/marketing/second_user/fred.anything
/sam1/testdir/share/marketing/second_user/sec_user_sub/fred.any
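
The difference between the two no_archive directives can be reproduced with a short Python sketch. It is a simplified model, not the archiver's implementation: it assumes that the path field limits the directive to one directory tree and that the -name pattern is tested against the path relative to the mount point. Python's re module stands in for regexp(5), and the function name is hypothetical.

```python
import re

MOUNT_POINT = "/sam1/"  # mount point used in the listings above


def is_excluded(absolute_path, set_path, regex):
    """Return True if a no_archive directive would select the file."""
    if not absolute_path.startswith(MOUNT_POINT):
        return False
    relative = absolute_path[len(MOUNT_POINT):]
    # The path field limits the directive to one directory tree.
    if not relative.startswith(set_path + "/"):
        return False
    # The -name pattern is tested against the path relative to the mount point.
    return re.search(regex, relative) is not None


UNANCHORED = r"fred\."                        # CODE EXAMPLE 3-10
ANCHORED = r"^share/marketing/[^/]*/fred\."   # CODE EXAMPLE 3-13

for path in (
    "/sam1/share/marketing/fred.anything",
    "/sam1/share/marketing/first_user/fred.anything",
    "/sam1/share/marketing/first_user/first_user_sub/fred.anything",
    "/sam1/testdir/share/marketing/fred.anything",
):
    print(
        path,
        is_excluded(path, "share/marketing", UNANCHORED),
        is_excluded(path, "share/marketing", ANCHORED),
    )
```

In this model, the unanchored pattern excludes every fred.* file under share/marketing, while the anchored pattern excludes only the file that sits directly in a user home directory.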

Release and Stage file-attributes: -release and -stage

You can set the release and stage attributes associated with files within an archive set by using the -release and -stage options, respectively. Both of these settings override stage or release attributes that a user might have set previously.

The -release option has the following format:


-release attribute

The attributes for the -release directive follow the same conventions as the release(1) command and are shown in TABLE 3-11.


TABLE 3-11 The -release Directive Attributes

Attribute    Meaning
a            Release the file following the completion of the first archive copy.
d            Reset to default.
n            Never release the file.
p            Partially release the file's disk space.


The -stage option has the following format:


-stage attribute

The attributes for the -stage directive follow the same conventions as the stage(1) command and are shown in TABLE 3-12.


TABLE 3-12 The -stage Directive's Attributes

Attribute    Meaning
a            Stage the files in this archive set associatively.
d            Reset to default.
n            Never stage the files in this archive set.


The following example shows how you can use file name specifications and file attributes to partially release Macintosh resource directories:


MACS  . -name .*/\.rscs/  -release p

Archive Set Membership Conflicts

Sometimes the choice of path and other file characteristics for inclusion of a file in an archive set results in ambiguous archive set membership. These situations are resolved in the following manner:

1. The membership definition occurring first in the directive file is chosen.

2. Membership definitions local to a file system are chosen before any globally defined definitions.

3. A membership definition that exactly duplicates a previous definition is noted as an error.

Given these rules, more restrictive membership definitions should be placed earlier in the directive file.

When controlling archiving for a specific file system (using the fs=fsname directive), the archiver evaluates the file-system-specific directives before evaluating the global directives. Thus, files can be assigned to a local archive set (including the no_archive archive set) instead of being assigned to a global archive set. This has implications for global archive set assignments such as no_archive.

In CODE EXAMPLE 3-15, it appears that the administrator did not intend to archive any of the .o files across both file systems. However, because the local archive set assignment allfiles is evaluated before the global archive set assignment no_archive, the .o files in the samfs1 and samfs2 file systems are archived.


CODE EXAMPLE 3-15 An archiver.cmd File With Possible Membership Conflicts
no_archive   . -name .*\.o$
fs = samfs1
    allfiles  .
fs = samfs2
    allfiles  .

CODE EXAMPLE 3-16 shows the directives to use to ensure that no .o files are archived in the two file systems.


CODE EXAMPLE 3-16 Corrected archiver.cmd File
fs = samfs1
    no_archive   . -name .*\.o$
    allfiles  .
fs = samfs2
    no_archive   . -name .*\.o$
    allfiles  .
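
The evaluation order behind these two examples can be modeled in a few lines of Python. The sketch is a simplified illustration, not the archiver's algorithm: it ignores the path field, treats a directive without a -name criterion as matching every file, and uses Python's re module in place of regexp(5). The function and variable names are hypothetical.

```python
import re


def assign_archive_set(relative_path, fs_name, local_rules, global_rules):
    """Return the archive set chosen for a file under a first-match rule."""
    # File-system-specific definitions are evaluated before global ones.
    candidates = local_rules.get(fs_name, []) + global_rules
    for set_name, regex in candidates:
        if regex is None or re.search(regex, relative_path):
            return set_name
    # No definition matched: use the default set named for the file system.
    return fs_name


NO_O_FILES = ("no_archive", r".*\.o$")
ALL_FILES = ("allfiles", None)

# CODE EXAMPLE 3-15: no_archive is global, allfiles is local.
conflict_local = {"samfs1": [ALL_FILES], "samfs2": [ALL_FILES]}
conflict_global = [NO_O_FILES]

# CODE EXAMPLE 3-16: no_archive is repeated ahead of allfiles in each file system.
fixed_local = {
    "samfs1": [NO_O_FILES, ALL_FILES],
    "samfs2": [NO_O_FILES, ALL_FILES],
}

print(assign_archive_set("src/main.o", "samfs1", conflict_local, conflict_global))
print(assign_archive_set("src/main.o", "samfs1", fixed_local, []))
```

With the first set of rules the .o file lands in allfiles and is archived; with the second it is assigned to no_archive.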

Archive Copy Directives

If you do not specify archive copies, the archiver writes a single archive copy for files in the archive set. By default, this copy is made when the archive age of the file is four minutes. If you require more than one archive copy, all copies, including the first, must be specified through archive copy directives.

The archive copy directives begin with a copy-number value of 1, 2, 3, or 4. The digit is followed by one or more arguments that specify archive characteristics for that copy.

Archive copy directives must appear immediately after the archive set assignment directive to which they pertain. Each archive copy directive has the following format:


copy-number [ -release | -norelease ] [archive-age] [unarchive-age]

You can specify archive copy directives by editing the archiver.cmd file, as described here, or by using the File System Manager software. For more information, see the File System Manager online help.

The following sections describe the archive copy directive arguments.

Releasing Disk Space After Archiving: -release

To specify that the disk space for files is to be automatically released after an archive copy is made, use the -release directive after the copy number. This directive has the following format:


-release

In CODE EXAMPLE 3-17, files within the group images are archived when their archive age reaches 10 minutes. After archive copy 1 is made, the disk cache space is released.


CODE EXAMPLE 3-17 An archiver.cmd File Using the -release Directive
ex_set . -group images
    1 -release 10m

Delaying Disk Space Release: -norelease

You might not want to release disk space until multiple archive copies are completed. The -norelease option prevents the automatic release of disk cache until all copies marked with -norelease are made.

This directive has the following format:


-norelease

The -norelease directive makes the archive set eligible to be released after all copies have been archived, but the files will not be released until the releaser is invoked and selects them as release candidates.

CODE EXAMPLE 3-18 specifies an archive set named vault_tapes. Two copies are created, but the disk cache associated with this archive set is not released until both copies are made.


CODE EXAMPLE 3-18 An archiver.cmd File Using the -norelease Directive
vault_tapes
    1 -norelease 10m
    2 -norelease 30d

Using the -norelease directive on a single copy has no effect on automatic releasing because the file cannot be released until it has at least one archive copy.

Using -release and -norelease Together

If you want to make sure that the disk space is released immediately after all copies of an archive set have been archived, you can use the -release and -norelease options together. The combination of -release and -norelease causes the archiver to release the disk space immediately, when all copies having this combination are made, rather than waiting for the releaser to be invoked, as is the case with the -norelease option alone.

Setting the Archive Age

You can set the archive age for files by specifying the archive age in the archive copy directive. The archive age can be specified with a suffix character such as h for hours or m for minutes as shown in TABLE 3-2.

In CODE EXAMPLE 3-19, the files in directory data are archived when their archive age reaches one hour.


CODE EXAMPLE 3-19 An archiver.cmd File That Specifies the Archive Age
ex_set data
    1 1h

Unarchiving Automatically

If you specify more than one archive copy of a file, it is possible to unarchive all but one of the copies automatically. You might want to do this when the files are archived to various media using various archive ages.

CODE EXAMPLE 3-20 shows the directive that specifies the unarchive age. The first copy of the files in the path home/users is archived six minutes after modification. When the files are 10 weeks old, second and third archive copies are made. The first copy is then unarchived.


CODE EXAMPLE 3-20 An archiver.cmd File that Specifies the Unarchive Age
ex_set home/users
    1 6m 10w
    2 10w
    3 10w

For more ways to control unarchiving, see Controlling Unarchiving.

Specifying More Than One Copy for Metadata

If more than one copy of metadata is required, you can place copy definitions in the directive file immediately after an fs= directive.

CODE EXAMPLE 3-21 shows an archiver.cmd file that specifies multiple metadata copies.


CODE EXAMPLE 3-21 An archiver.cmd File that Specifies Multiple Metadata Copies
fs = samfs7
    1 4h
    2 12h

In this example, one copy of the metadata for the samfs7 file system is made after 4 hours and a second copy is made after 12 hours.

File system metadata includes path names in the file system. For this reason, if you have frequent changes to directories, the new path names cause the creation of new archive copies. This results in frequent loads of the volumes specified for metadata.

Archive Set Copy Parameters

The archive set parameters section of the archiver.cmd file begins with the params directive and ends with the endparams directive. CODE EXAMPLE 3-22 shows the format for directives for an archive set.


CODE EXAMPLE 3-22 Archive Set Copy Parameter Format
params
archive-set-name.copy-number[R] [ -param1 -param2 ...]
.
.
.
endparams

TABLE 3-13 Arguments for the Archive Set Copy Parameters

Argument

Meaning

archive-set-name

A site-defined name for the archive set. Usually indicative of the characteristics of the files belonging to the archive set. Can be allsets. Archive set names are restricted to the letters in the alphabet, numbers, and the underscore character (_). No other special characters or spaces are allowed. The first character in the archive set name must be a letter.

.

A period (.) character. Used to separate archive-set-name from copy-number.

copy-number

An integer that defines the archive copy number. Can be 1, 2, 3, or 4.

R

Specifies that the parameters being defined are for rearchived copies of this archive set. For example, you can use the R and specify VSNs in the -param1 argument to direct rearchived copies to specific volumes.

-param1

-param2

One or more parameters. The following subsections describe the parameters that can be specified between the params and endparams directives.


You can specify archive set copy parameters by editing the archiver.cmd file, as shown here, or by using the File System Manager software. For more information, see the File System Manager online help.

The pseudo archive set allsets provides a way to set default archive set directives for all archive sets. All allsets directives must precede directives for actual archive set copies. Parameters set for individual archive set copies override parameters set by the allsets directive. For more information on the allsets archive set, see the archiver.cmd(4) man page.

The following subsections describe all archive set processing parameters, with the exception of disk archiving parameters. For information on disk archiving parameters, see About Disk Archiving.

Controlling the Size of Archive Files: -archmax

The -archmax directive sets the maximum size of an archive file for an archive set. This directive has the following format:


-archmax target-size

This directive is very similar to the archmax global directive. For information on that directive, and the values to enter for target-size, see The archmax Directive: Controlling the Size of Archive Files.

Setting the Archiver Buffer Size: -bufsize

By default, a file being archived is stored in memory in a buffer before being written to archive media. You can use the -bufsize directive to specify a nondefault buffer size. A nondefault buffer size can improve performance, and you can experiment with various buffer-size values.

This parameter has the following format:


-bufsize=buffer-size

For buffer-size, specify a number from 2 through 32. The default is 4. This value is multiplied by the dev_blksize value for the media type, and the resulting buffer size is used. The dev_blksize value is specified in the defaults.conf file. For more information on this file, see the defaults.conf(4) man page.

For example, this parameter can be specified in the archiver.cmd file in a line such as the following:

myset.1 -bufsize=6
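
The arithmetic described above can be sketched in Python. The function name and the dev_blksize value of 262144 bytes are assumptions chosen for illustration; the real value for a media type comes from the defaults.conf file.

```python
def archive_buffer_bytes(buffer_size: int = 4,
                         dev_blksize_bytes: int = 262144) -> int:
    """Return the buffer size implied by -bufsize (hypothetical helper).

    buffer_size is the -bufsize value (2 through 32, default 4).
    dev_blksize_bytes is an assumed block size for the media type.
    """
    if not 2 <= buffer_size <= 32:
        raise ValueError("buffer-size must be from 2 through 32")
    return buffer_size * dev_blksize_bytes


# myset.1 -bufsize=6 with the assumed block size.
print(archive_buffer_bytes(6))
```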

The equivalent of this directive on a global basis is bufsize=media buffer-size. For more information on that directive, see The bufsize Directive: Setting the Archiver Buffer Size.

Specifying the Number of Drives for an Archive Request: -drivemax, -drivemin, and -drives

By default, the archiver uses only one media drive to archive the files of one archive set. When an archive set has many files or large files, it can be advantageous to use more than one drive. In addition, if the drives in your automated library operate at different speeds, use of multiple drives can balance these variations and thereby increase archiving efficiency.

The drive directives have the following formats:


-drivemax max-size
-drivemin min-size
-drives number

TABLE 3-14 Arguments for the -drivemax, -drivemin, and -drives Directives

Argument

Meaning

max-size

The maximum amount of data to be archived using one drive.

min-size

The minimum amount of data to be archived using one drive. The default is the -archmax target-size value (if specified) or the default value for the media type.

If you specify the -drivemin min-size directive, Sun StorEdge SAM-FS software uses multiple drives only if there is enough work to warrant it. As a guideline, set min-size to be large enough to cause the transfer time to be significantly longer than the cartridge change time (load, position, unload).

number

The number of drives to be used for archiving this archive set. The default is 1.


 

An archive request is evaluated against the parameters that are specified, as follows:

When you use the -drives parameter, multiple drives are used only if data that is more than the value of min-size is to be archived. The number of drives to be used in parallel is the lesser of the following two values:

You can use the -drivemin and -drives parameters if you want to divide an archive request among drives but do not want to have all the drives busy with small archive requests. This might apply to operations that use very large files.

To set these parameters, you need to consider file creation rates, the number of drives, the time it takes to load and unload drives, and drive transfer rates.

For example, suppose that you are splitting an archive set named bigfiles over five drives. Depending on its size, this archive set could be split as shown in TABLE 3-15.


TABLE 3-15 Archive Set Example Split

Archive Set Size

Number of Drives

< 20 gigabytes

1

> 20 gigabytes to < 30 gigabytes

2

> 30 gigabytes to < 40 gigabytes

3

> 40 gigabytes to < 50 gigabytes

4

> 50 gigabytes

5


CODE EXAMPLE 3-23 shows the lines to use in the archiver.cmd file to split the archive request over multiple drives.


CODE EXAMPLE 3-23 Directives Used to Split an Archive Request Over Multiple Drives
params
bigfiles.1 -drives 5 -drivemin 10G
endparams
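
The split in TABLE 3-15 can be reproduced with a short Python sketch. It assumes that the number of drives is the request size divided by min-size (rounded down), capped at the -drives value and never less than one; that rule is inferred from the table rather than taken from the software, and the function name is hypothetical.

```python
GIB = 1024 ** 3  # assumed meaning of the G suffix, for illustration only


def drives_in_parallel(request_bytes: int, drives: int,
                       drivemin_bytes: int) -> int:
    """Estimate how many drives an archive request uses (hypothetical helper)."""
    if drives < 1 or drivemin_bytes <= 0:
        raise ValueError("drives must be >= 1 and drivemin must be > 0")
    # Lesser of the -drives value and request size / min-size, at least 1.
    return max(1, min(drives, request_bytes // drivemin_bytes))


# bigfiles.1 -drives 5 -drivemin 10G
for size_gib in (15, 25, 35, 45, 60):
    print(size_gib, drives_in_parallel(size_gib * GIB, 5, 10 * GIB))
```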

In addition, you might specify the following line in the archiver.cmd file:


huge_files.2 -drives 2

When the total size of the files in archive set huge_files.2 is equal to or greater than two times drivemin for the media, two drives are used to archive the files.

Maximizing Space on a Volume: -fillvsns

By default, the archiver selects from all volumes assigned to an archive set when it writes archive copies, using a volume with enough space for all the files. This action can result in volumes not being filled to capacity. If -fillvsns is specified, the archiver separates the archive request into smaller groups.

Specifying Archive Buffer Locks: -lock

By default, a file being archived is stored in memory in a buffer before being written to archive media. If direct I/O is enabled, you can use the -lock parameter to lock this buffer. This action can improve performance.

This parameter has the following format:


-lock

The -lock parameter indicates that the archiver should use locked buffers when making archive copies. If -lock is specified, the archiver sets file locks on the archive buffer in memory for the duration of the sam-arcopy(1M) operation. This avoids paging of the buffer, and can thereby improve performance.

The -lock parameter should be specified only on large systems with large amounts of memory. Insufficient memory can cause an out-of-memory condition.

The -lock parameter is effective only if direct I/O is enabled for the file being archived. By default, -lock is not specified, and the file system sets locks on all direct I/O buffers, including those for archiving. For more information on enabling direct I/O, see the setfa(1) man page, the sam_setfa(3) library routine man page, or the -o forcedirectio option on the mount_samfs(1M) man page.

For example, this parameter can be specified in the archiver.cmd file in a line such as the following:


yourset.3 -lock

You can also specify the equivalent of this parameter on a global basis by specifying the lock argument to the bufsize=media buffer-size [lock] directive. For more information on this topic, see The bufsize Directive: Setting the Archiver Buffer Size.

Making Archive Copies of Offline Files: -offline_copy

A file is a candidate for being released after one archive copy is made. If the file releases and goes offline before all the archive copies are made, the archiver uses this parameter to determine the method to be used when making the other archive copies. In choosing the method to be used, consider the number of drives available to the Sun StorEdge SAM-FS system and the amount of disk cache available.

This parameter has the following format:


-offline_copy method

For method, specify one of the keywords shown in TABLE 3-16.


TABLE 3-16 Values for the -offline_copy Directive's method Argument

method Value

Meaning

none

Stages each file as needed before copying it to the archive volume. Default.

direct

Copies files directly from the offline volume to the archive volume without using the cache. This method assumes that the source volume and the destination volume are different volumes and that two drives are available. If this method is specified, raise the value of the stage_n_window mount option to a value that is greater than its default of 256 kilobytes. For more information on mount options, see the mount_samfs(1M) man page.

stageahead

Stages one file while archiving another. The system stages the next archive file while writing a file to its destination.

stageall

Stages all files to disk cache before archiving. This method uses only one drive and assumes that room is available on disk cache for all files.


Specifying Recycling

The recycling process enables you to reclaim space on archive volumes that is taken up by expired archive images. By default, no recycling occurs.

If you want to recycle, you can specify directives in both the archiver.cmd file and the recycler.cmd file. For more information on the recycling directives supported in the archiver.cmd file, see Recycling.

Associative Archiving: -join path

The archiver employs associative archiving if you specify the -join path parameter. Associative archiving is useful if you want an entire directory to be archived to one volume and you know that the archive file can physically reside on only one volume. Otherwise, if you want to keep directories together, use either the -sort path or -rsort path parameters to keep the files contiguous. The -rsort parameter specifies a reverse sort.

When the archiver writes an archive file to a volume, it efficiently packs the volume with user files. Subsequently, when accessing files from the same directory, you can experience delays as the stage process moves through a volume to read the next file. To alleviate delays, you can use the -join path parameter to archive files from the same directory paths contiguously within an archive set copy. The process of associative archiving overrides the space efficiency algorithm to archive files from the same directory together.

Associative archiving is useful when the file content does not change but you always want to access a group of files together. For example, a hospital might use associative archiving to access all of the medical images associated with a patient, as in the following line:


patient_images.1 -join path



Note - The -join path parameter writes data files from the same directory to the same archive file. If there are many directories with a few small files, the archiver creates many small archive files. These small, discrete archive files, each with its own tar(1) header, slow the write performance of the system.

Also, because the -join path parameter specifies that all files from the same directory be archived on a single volume, it is possible that a group of files might not fit on any available volume. In this case, the files are not archived until more volumes are assigned to the archive set. It is also possible that the group of files to be archived is so large that it can never fit on a single volume. In such a case, the files are never archived.

For most applications, using either the -sort path or the -rsort path parameter is preferred if the more restrictive operation of -join path is not a requirement.



It is also possible to sort files within an archive set copy by age, size, or path. The age and size arguments are mutually exclusive. CODE EXAMPLE 3-24 shows how to sort an archive set using the -sort parameter with the argument path, age, or size.


CODE EXAMPLE 3-24 Directives for Sorting an Archive Set
cardiac.1 -sort path
cardiac.2 -sort age
catscans.3 -sort size

The first line forces the archiver to sort the archive request for cardiac.1 by path name. The second line forces the archiver to sort the archive set copy cardiac.2 by the age of the file, oldest to youngest. The third line forces the archive set copy catscans.3 to be sorted by the size of the file, smallest to largest. For a reverse sort, specify -rsort in place of -sort.
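
The three orderings can be sketched in Python as follows. The FileEntry type and sort_request function are hypothetical stand-ins for an archive request, not part of the archiver; the sketch only shows which order each -sort argument is described as producing.

```python
from dataclasses import dataclass


@dataclass
class FileEntry:
    path: str
    size: int        # bytes
    modified: float  # seconds since the epoch


# Ascending modification time puts the oldest file first.
SORT_KEYS = {
    "path": lambda f: f.path,
    "age": lambda f: f.modified,
    "size": lambda f: f.size,
}


def sort_request(files, method, reverse=False):
    """Order files as -sort method would; reverse=True mimics -rsort."""
    return sorted(files, key=SORT_KEYS[method], reverse=reverse)


files = [
    FileEntry("b/x", 300, 200.0),
    FileEntry("a/y", 100, 300.0),
    FileEntry("c/z", 200, 100.0),
]
print([f.path for f in sort_request(files, "age")])
```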

Controlling Unarchiving

Unarchiving is the process by which archive entries for files or directories are deleted. Files are unarchived based on the time since last access. All frequently accessed data can be stored on fast media, such as disk, and all older, infrequently accessed data can be stored on tape. By default, files are never unarchived.

For example, suppose that the archiver.cmd file shown in CODE EXAMPLE 3-25 controls a file that is accessed frequently. This file remains on disk all the time, even if it is older than 60 days. The copy 1 information is removed only if the file is not accessed for 60 days.

If the copy 1 information is removed (because the file was not accessed for 60 days) and someone stages the file from copy 2, the file is read from tape. After the file is back online, the archiver makes a new copy 1 on disk, and the 60-day access cycle starts again.


CODE EXAMPLE 3-25 Directives to Control Unarchiving
arset1 dir1
   1    10m    60d
   2    10m
   3    10m
vsns
arset1.1    mo       OPT00[0-9]
arset1.2    lt       DLTA0[0-9]
arset1.3    lt       DLTB0[0-9]

Assume that a patient is in the hospital for four weeks. During this time, all of this patient's files are on fast media (copy 1=mo). After four weeks, the patient is released from the hospital. If no data has been accessed for this patient for 60 days after the patient is released, the copy 1 entry in the inode is unarchived, and only the copy 2 and copy 3 entries are available. The volume can now be recycled to make room for more current patients without having to increase the disk library. If the patient comes back to the hospital after six months for follow-up care, the first access of the data is from tape (copy 2). The archiver then automatically creates a new copy 1 on disk to ensure that the data is back on the fast media during the follow-up, which could take several days or weeks.
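
The timing in this scenario can be sketched in Python. The function name is hypothetical, and the comparison (unarchive once the time since last access reaches the unarchive age) is an assumption about the boundary case; the sketch only illustrates the 60-day rule described above.

```python
DAY = 24 * 60 * 60  # seconds


def copy1_is_unarchived(last_access: int, now: int,
                        unarchive_age: int = 60 * DAY) -> bool:
    """True when the time since last access has reached the unarchive age."""
    return now - last_access >= unarchive_age


# The patient is released, and the files are last accessed, on day 28.
last_access = 28 * DAY
print(copy1_is_unarchived(last_access, now=87 * DAY))
print(copy1_is_unarchived(last_access, now=88 * DAY))
```

Any later access, such as the follow-up visit after six months, resets last_access, so the 60-day count starts again for the new copy 1.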

Controlling How Archive Files Are Written: -tapenonstop

By default, the archiver writes a tape mark, an EOF label, and two more tape marks between archive files. When the next archive file is started, the driver backs up to the position after the first tape mark, causing a loss of performance. The -tapenonstop parameter directs the archiver to write only the initial tape mark. In addition, if the -tapenonstop parameter is specified, the archiver enters the archive information at the end of the copy operation.

For more information on the -tapenonstop parameter, see the archiver.cmd(4) man page.

Reserving Volumes: -reserve

By default, the archiver writes archive set copies to any volume specified by a regular expression as described in the volume associations section of the archiver.cmd file. However, you might sometimes want archive set volumes to contain files from only one archive set. You can reserve volumes to satisfy this data storage requirement.

The -reserve parameter reserves volumes for an archive set. When the -reserve parameter is set and a volume has been assigned to an archive set copy, the volume identifier is not assigned to any other archive set copy, even if a regular expression matches it.



Note - A site that uses reserved volumes is likely to incur more cartridge loads and unloads.



When a volume is selected for use by an archive set, it is assigned a reserved name, which is a unique identifier that ties the archive set to the volume.



Note - The -reserve parameter is intended to reserve a volume for exclusive use by one archive set. Many directories with a few small files cause many small archive files to be written to each reserved volume. These small discrete archive files, each with its own tar(1) header, slow the performance of the system.



The format for the -reserve parameter is as follows:


-reserve keyword

The value of keyword depends on the form you are using, as follows:

For example, in the archiver.cmd file fragment in CODE EXAMPLE 3-26, the line that begins with the allsets archive set name reserves volumes by archive set for all archive sets.


CODE EXAMPLE 3-26 Reserving Volumes by Archive Set
params
allsets -reserve set
endparams

In the archiver.cmd file, you can specify a -reserve parameter for one, two, or all three possible forms. The three forms can be combined and used together in an archive set parameter definition.

For example, with the archiver.cmd file fragment shown in CODE EXAMPLE 3-27, the line that begins with arset.1 creates a reserved name based upon an archive set, a group, and the file system.


CODE EXAMPLE 3-27 An archiver.cmd File With Reserved Volumes
params
arset.1 -reserve set -reserve group -reserve fs
endparams

The information regarding reserved volumes is stored in the library catalog. The lines in the library catalog list the media type, the VSN, the reserve information, and the reservation date and time. The reserve information includes the archive set component, path name component, and file system component, separated by slashes (//).



Note - These slashes are not indicative of a path name; they are merely separators for displaying the three components of a reserved name.



As CODE EXAMPLE 3-28 shows, the lines in the library catalog that describe reserved volumes begin with #R characters.


CODE EXAMPLE 3-28 Library Catalog Showing Reserved Volumes
     6  00071  00071 lt     0xe8fe   12 9971464 1352412 0x6a000000 131072 0x
#      -il-o-b-----  05/24/00 13:50:02  12/31/69 18:00:00  07/13/01 14:03:00
#R lt 00071 arset0.3// 2001/03/19 18:27:31
    10 ST0001 NO_BAR_CODE lt     0x2741    9 9968052 8537448 0x68000000 1310
#      -il-o-------  05/07/00 15:30:29  12/31/69 18:00:00  04/13/01 13:46:54
#R lt ST0001 hgm1.1// 2001/03/20 17:53:06
    16 SLOT22 NO_BAR_CODE lt     0x76ba    6 9972252 9972252 0x68000000 1310
#      -il-o-------  06/06/00 16:03:05  12/31/69 18:00:00  07/12/01 11:02:05
#R lt SLOT22 arset0.2// 2001/03/02 12:11:25



Note - Some lines in CODE EXAMPLE 3-28 have been truncated to fit on the page.



One or more of the reserve information fields can be empty, depending on the options defined in the archiver.cmd file. The date and time indicate when the reservation was made. A reservation line is appended to the file for each volume that is reserved for an archive set during archiving.
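
The layout of a reservation line can be made concrete with a small Python sketch that splits a #R line from CODE EXAMPLE 3-28 into its fields. The Reservation type and parse_reservation function are hypothetical, and the sketch assumes that fields are separated by white space and that no component contains a space or a slash.

```python
from typing import NamedTuple


class Reservation(NamedTuple):
    media_type: str
    vsn: str
    archive_set: str   # archive set component
    path_name: str     # path name component (can be empty)
    file_system: str   # file system component (can be empty)
    reserved_at: str   # reservation date and time


def parse_reservation(line: str) -> Reservation:
    """Split a '#R' catalog line into its fields (hypothetical helper)."""
    fields = line.split()
    if len(fields) != 6 or fields[0] != "#R":
        raise ValueError(f"not a reservation line: {line!r}")
    _, media_type, vsn, reserve_info, date, time = fields
    # The slashes separate the three components of the reserved name.
    archive_set, path_name, file_system = reserve_info.split("/")
    return Reservation(media_type, vsn, archive_set, path_name,
                       file_system, f"{date} {time}")


print(parse_reservation("#R lt 00071 arset0.3// 2001/03/19 18:27:31"))
```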

The archiver records volume reservations in the library catalog files. A volume is automatically unreserved when it is relabeled because the archive data has been effectively erased.

You can also use the reserve(1M) and unreserve(1M) commands to reserve and unreserve volumes. For more information on these commands, see the reserve(1M) and unreserve(1M) man pages.

You can display the reserve information by using the samu(1M) utility's v display or by using the archiver(1M) or dump_cat(1M) command in one of the formats shown in CODE EXAMPLE 3-29.


CODE EXAMPLE 3-29 Commands to Use to Display the Reserve Information
archiver -lv
dump_cat -V catalog-name

Example 4: User and Data Files Archived to Optical Media shows a complete archive example using reserved volumes.

Setting Archive Priorities: -priority

The Sun StorEdge SAM-FS file systems offer a configurable priority system for archiving files. Each file is assigned a priority computed from properties of the file and priority multipliers that can be set for each archive set in the archiver.cmd file. Properties include online/offline, age, number of copies made, and size.

By default, the files in an archive request are not sorted, and all property multipliers are zero. This results in files being archived in first-found, first-archived order. You can control the order in which files are archived by setting priorities and sort methods. The following are examples of priorities that you can set:

TABLE 3-20 lists the archive priorities.


TABLE 3-20 Archive Priorities

Archive Priority

Definition

-priority age value

Archive age property multiplier

-priority archive_immediate value

Archive immediate property multiplier

-priority archive_overflow value

Multiple archive volumes property multiplier

-priority archive_loaded value

Archive volume loaded property multiplier

-priority copies value

Copies made property multiplier

-priority copy1 value

Copy 1 property multiplier

-priority copy2 value

Copy 2 property multiplier

-priority copy3 value

Copy 3 property multiplier

-priority copy4 value

Copy 4 property multiplier

-priority offline value

File offline property multiplier

-priority queuewait value

Queue wait property multiplier

-priority rearchive value

Rearchive property multiplier

-priority reqrelease value

Reqrelease property multiplier

-priority size value

File size property multiplier

-priority stage_loaded value

Stage volume loaded property multiplier

-priority stage_overflow value

Multiple stage volumes property multiplier


For value, specify a floating-point number in the following range:


-3.400000000E+38 ≤ value ≤ 3.402823466E+38

For more information on priorities, see the archiver(1M) and archiver.cmd(4) man pages.
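
As a rough illustration of how multipliers could combine, the following Python sketch assumes that a file's priority is the sum of each property value times its multiplier. That formula, the function name, and the sample property values are assumptions made for this sketch; the man pages describe the actual computation.

```python
from typing import Mapping


def archive_priority(properties: Mapping[str, float],
                     multipliers: Mapping[str, float]) -> float:
    """Sum property values weighted by their multipliers (assumed formula).

    A property with no multiplier uses 0.0, matching the stated default
    that all property multipliers are zero.
    """
    return sum(value * multipliers.get(name, 0.0)
               for name, value in properties.items())


# Hypothetical property values for one file.
properties = {"age": 3600.0, "offline": 1.0, "copy1": 1.0}
multipliers = {"age": 0.5, "offline": 100.0}
print(archive_priority(properties, multipliers))
```

With no multipliers set, every file has the same priority under this assumption, which is consistent with the first-found, first-archived default.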

Scheduling Archiving: -startage, -startcount, and -startsize

As the archiver scans a file system, it identifies files to be archived. Files that are recognized as candidates for archiving are placed in a list known as an archive request. At the end of the file system scan, the system schedules the archive request for archiving. The -startage, -startcount, and -startsize archive set parameters control the archiving workload and ensure the timely archival of files. TABLE 3-21 shows the formats for these parameters.


TABLE 3-21 Formats for the -startage, -startcount, and -startsize Directives

Directive

Meaning

-startage time

The amount of time that can elapse between the first file in a scan being marked for inclusion in an archive request and the start of archiving. For time, specify a time in the format used in Setting the Archive Age. If this variable is not set, the interval directive is used.

-startcount count

The number of files to be included in an archive request. When the number of files in the archive request reaches the value of count, archiving begins. By default, count is not set.

-startsize size

The minimum total size, in bytes, of all files to be archived in an archive request. Archiving work is accumulated, and archiving begins when the total size of the files reaches the value of size. By default, size is not set.


The examine=method directive and the interval=time directive interact with the -startage, -startcount, and -startsize directives. The -startage, -startcount, and -startsize directives optimally balance archive timeliness and archive work done. These values override the examine=method specification, if any. For more information on the examine directive, see The examine Directive: Controlling Archive Scans. For more information on the interval directive, see The interval Directive: Specifying an Archive Interval.

The -startage, -startcount, and -startsize directives can be specified in an archiver.cmd file for each archive copy. If more than one of these directives is specified, the first condition encountered starts the archive operation. If none of these directives is specified, the archive request is scheduled based on the examine=method directive, as follows:

The archiver.cmd(4) man page has examples that show how to use these directives.
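
The rule that the first condition encountered starts the archive operation can be sketched in Python. The function and argument names are hypothetical, and the sketch covers only the case in which at least one of the three parameters is set; scheduling through the examine and interval directives is not modeled.

```python
from typing import Optional


def should_start(waited_seconds: int, file_count: int, total_bytes: int,
                 startage: Optional[int] = None,
                 startcount: Optional[int] = None,
                 startsize: Optional[int] = None) -> bool:
    """True as soon as any configured start condition is met."""
    conditions = [
        startage is not None and waited_seconds >= startage,
        startcount is not None and file_count >= startcount,
        startsize is not None and total_bytes >= startsize,
    ]
    return any(conditions)


# Hypothetical settings: -startage 1h -startcount 500
print(should_start(600, 500, 10_000, startage=3600, startcount=500))
```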

VSN Association Directives

The VSN associations section of the archiver.cmd file assigns volumes to archive sets. This section starts with a vsns directive and ends with an endvsns directive.

VSN associations can also be configured with the File System Manager software. See the File System Manager online help for more information.

Collections of volumes are assigned to archive sets by directives of the following form:


archive-set-name.copy-num media-type vsn-expr ... [ -pool vsn-pool-name ... ]

TABLE 3-22 Arguments for the VSN Association Directive

Argument

Meaning

archive-set-name

A site-defined name for the archive set. Must be the first field in the archive set assignment directive. An archive set name is usually indicative of the characteristics of the files belonging to the archive set. Archive set names are restricted to the letters in the alphabet, numbers, and the underscore character (_). No other special characters or spaces are allowed. The first character in the archive set name must be a letter.

copy-num

A digit (1, 2, 3, or 4) that identifies the archive copy to which the volumes are assigned.

media-type

The media type. For a list of valid media types, see the mcf(4) man page.

vsn-expr

A regular expression. See the regexp(5) man page.

-pool vsn-pool-name

A named collection of VSNs.


 

An association requires at least three fields: archive-set-name and copy-num, media-type, and at least one volume. The archive-set-name and copy-num values are connected by a period (.).



Note - If your Sun StorEdge SAM-FS environment is configured to recycle by archive set, do not assign a VSN to more than one archive set.



The following examples use regular expressions to specify the same VSNs in different ways.

CODE EXAMPLE 3-30 shows two lines of VSN specifications.


CODE EXAMPLE 3-30 VSN Specifications on Multiple Lines
vsns
set.1  lt  VSN001 VSN002 VSN003 VSN004 VSN005
set.1  lt  VSN006 VSN007 VSN008 VSN009 VSN010
endvsns

CODE EXAMPLE 3-31 shows a VSN specification that uses a backslash character (\) to continue a line onto a subsequent line.


CODE EXAMPLE 3-31 VSN Specifications With a Continued Line
vsns
set.1  lt  VSN001 VSN002 VSN003 VSN004 VSN005 \
 VSN006 VSN007 VSN008 VSN009 VSN010
endvsns

CODE EXAMPLE 3-32 specifies VSNs using a regular expression in a shorthand notation.


CODE EXAMPLE 3-32 VSN Specifications With Shorthand Notation
vsns
set.1 lt VSN00[1-9] VSN010
endvsns

When the archiver needs volumes for the archive set, it examines each volume of the selected media type in all automated libraries and manually mounted drives to determine if the volume would satisfy any VSN expression. It selects the first volume that matches an expression that contains enough space for the archive copy operation. For example:



Note - Make sure you assign volumes to the archive set for the metadata when setting up the archiver.cmd file. Each file system has an archive set with the same name as the file system. For more information on preserving metadata, see the samfsdump(1M) man page or see the Sun StorEdge SAM-FS Troubleshooting Guide.



VSN Pools Directives

The VSN pools section of the archiver.cmd file starts with a vsnpools directive and ends either with an endvsnpools directive or with the end of the archiver.cmd file. This section names a collection of volumes.

VSN pools can also be configured with the File System Manager software. See the File System Manager online help for more information.

A VSN pool is a named collection of volumes. VSN pools are useful for defining volumes that can be available to an archive set. As such, VSN pools provide a useful buffer for assigning volumes and reserving volumes to archive sets. You can use VSN pools to define separate groups of volumes by departments within an organization, by users within a group, by data type, and according to other convenient groupings.

If a volume is reserved, it is no longer available to the pool in which it originated. Therefore, the number of volumes within a named pool changes as volumes are used. You can view the VSN pools by issuing the archiver(1M) command in the following format:


# archiver -lv | more

The syntax of a VSN pool definition is as follows:


vsn-pool-name media-type vsn-expr

TABLE 3-23 Arguments for the VSN Pools Directive

Argument

Meaning

vsn-pool-name

The name of the VSN pool.

media-type

The two-character media type. For a list of valid media types, see the mcf(4) man page.

vsn-expr

A regular expression. There can be one or more vsn-expr arguments. See the regcmp(3G) man page.


 

The following example uses four VSN pools: users_pool, data_pool, proj_pool, and scratch_pool. A scratch pool is a set of volumes used when specific volumes in a VSN association are exhausted or when another VSN pool is exhausted. If one of the three specific pools is out of volumes, the archiver selects the scratch pool VSNs. CODE EXAMPLE 3-33 shows an archiver.cmd file that uses four VSN pools.


CODE EXAMPLE 3-33 VSN Pools
vsnpools
users_pool   mo ^MO[0-9][0-9]
data_pool    mo ^DA.*
scratch_pool mo ^SC[5-9][0-9]
proj_pool    mo ^PR.*
endvsnpools
vsns
users.1     mo    -pool users_pool   -pool scratch_pool
data.1      mo    -pool data_pool    -pool scratch_pool
proj.1      mo    -pool proj_pool    -pool scratch_pool
endvsns
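
To see which volumes the pool expressions would pick up, the following Python sketch applies the four patterns to a made-up catalog. It assumes that Python's re module treats these simple expressions the same way the archiver does, and the catalog contents and function names are hypothetical. The fallback from a specific pool to the scratch pool is modeled simply as trying the pools in the order listed.

```python
import re

# Pool definitions copied from the vsnpools section above.
VSN_POOLS = {
    "users_pool": r"^MO[0-9][0-9]",
    "data_pool": r"^DA.*",
    "scratch_pool": r"^SC[5-9][0-9]",
    "proj_pool": r"^PR.*",
}


def pool_members(pool, catalog):
    """Return the catalog VSNs that match the pool's expression."""
    pattern = re.compile(VSN_POOLS[pool])
    return [vsn for vsn in catalog if pattern.search(vsn)]


def candidate_volumes(pools, catalog):
    """Return volumes from the first listed pool that has any available."""
    for pool in pools:
        members = pool_members(pool, catalog)
        if members:
            return members
    return []


catalog = ["MO01", "MO17", "DA0001", "SC50", "SC49", "PR_X1"]  # hypothetical
print(pool_members("users_pool", catalog))
print(candidate_volumes(["proj_pool", "scratch_pool"], ["MO01", "SC50"]))
```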

For more information on VSN associations, see VSN Association Directives.


About Disk Archiving

Archiving is the process of copying a file from online disk to archive media. With disk archiving, the archive medium is online disks in a file system.

Disk archiving can be implemented so that the files are archived from one Sun StorEdge SAM-FS file system to another file system on the same host computer or to another file system on a different Sun Solaris host. When disk archiving is implemented using two host systems, the systems involved act as a client and a server, with the client system hosting the source files and the server system being the destination system that hosts the archive copies.

The file system to which the archive files are written can be any UNIX file system. However, if disk archive copies are written to a different host, the host must have at least one file system installed on it that is compatible with the Sun StorEdge SAM-FS software.

The archiver treats files archived to disk volumes the same as files archived to volumes in a library. You can still make one, two, three, or four archive copies. If you are making multiple archive copies, one of the archive copies could be written to disk volumes while the others are written to removable media volumes. In addition, if you typically archive to disk volumes in a Sun StorEdge SAM-FS file system, the archive file copies are themselves archived according to the archiver.cmd file rules in that file system.

The following list summarizes some of the similarities and differences between archiving to online disk and archiving to removable media:



Note - You do not need the diskvols.conf configuration file if you are archiving to removable media volumes only.



A diskvols.conf file must be created on the system upon which the source files reside. Depending on where the archive copies are written, this file also contains the following information:

Configuration Guidelines

Although there are no restrictions on where disk archive volumes can reside, it is recommended that disk volumes reside on a disk other than the one on which the original files reside. It is also recommended that you make more than one archive copy and write to more than one type of archive media. For example, you might archive copy 1 to disk volumes, copy 2 to tape, and copy 3 to magneto-optical disk.

If you are archiving files to a file system on a server system, the archive files themselves can be archived to removable media cartridges in a library attached to the server.

Directives for Disk Archiving

When archiving to online disk, the archiver recognizes the archiver.cmd directives that define the archive set and configure recycling. It ignores directives that specifically pertain to removable media cartridges. Specifically, the system recognizes the following directives for disk archive sets:

For client-system, specify the host name of the client system that contains the source files.

For more information on directives for disk archiving, see the archiver.cmd(4) man page.


To Enable Disk Archiving

You can enable disk archiving at any time. The procedure in this section assumes that you have archiving in place and you are adding disk archiving to your environment. If you are enabling disk archiving as part of an initial installation, see the Sun StorEdge SAM-FS Installation and Upgrade Guide for information.



Note - In software versions previous to 4U4, disk archiving was enabled in the archiver.cmd file through a -disk_archive parameter in the params section. This parameter is no longer used, so archiver.cmd files created with earlier software versions must be edited in order for archiving to work correctly in versions 4U4 and later. See the archiver.cmd(4) man page for details.



1. Make certain that the host to which you want to write your disk archive copies has at least one Sun StorEdge QFS or Sun StorEdge SAM-FS file system installed on it.

2. Become superuser on the host system that contains the files to be archived.

3. Follow the procedures in the Sun StorEdge SAM-FS Installation and Upgrade Guide for enabling disk archiving both on the host that contains the files to be archived and on the host to which the archive copies will be written.

4. On the host that contains the files to be archived, use the samd(1M) config command to propagate the configuration file changes and restart the system.


# samd config

5. If you are archiving to disk on a different host, follow these steps:

a. Become superuser on the host system to which the archive copies are written.

b. Use the samd(1M) config command to propagate the configuration file changes and restart the destination system.


# samd config

Disk Archiving Examples

The following are some examples of disk archiving configurations.

Example 1

In this example, the VSNs identified as disk01, disk02, and disk04 are written to pluto, the host system upon which the original source files reside. VSN disk03 is written to server system mars.

CODE EXAMPLE 3-36 shows the diskvols.conf file that resides on client system pluto.


CODE EXAMPLE 3-36 The diskvols.conf File on pluto
# This is file /etc/opt/SUNWsamfs/diskvols.conf on pluto
# VSN Name     [Host Name:]Path
#
disk01                     /sam_arch1
disk02                     /sam_arch2/proj_1
disk03                mars:/sam_arch3/proj_3
disk04                     /sam_arch4/proj_4

CODE EXAMPLE 3-37 shows the diskvols.conf file on server system mars.


CODE EXAMPLE 3-37 The diskvols.conf File on mars
# This is file /etc/opt/SUNWsamfs/diskvols.conf on mars
#
clients
pluto
endclients

CODE EXAMPLE 3-38 shows a fragment of the archiver.cmd file on pluto.


CODE EXAMPLE 3-38 The archiver.cmd File on pluto
vsns
arset1.2 dk disk01
arset2.2 dk disk02 disk04
arset3.2 dk disk03
endvsns
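
The diskvols.conf entries in CODE EXAMPLE 3-36 follow the VSN [host:]path layout given in the file's own comment line. As a rough illustration only (this is not part of the Sun StorEdge SAM-FS software, and the helper name parse_diskvols is hypothetical), the following Python sketch splits such lines into a VSN, an optional host, and a path, which makes it easy to see which disk volumes are local to pluto and which are written to mars.

```python
# Illustrative only: a hypothetical helper, not a SAM-FS tool.
# Assumes each non-comment line is "VSN [host:]path" as in CODE EXAMPLE 3-36.

def parse_diskvols(text):
    """Return {vsn: (host_or_None, path)} for the VSN lines in text."""
    volumes = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) != 2:
            continue                          # skip lines such as "clients"
        vsn, target = fields
        host, sep, path = target.partition(":")
        if sep:                               # for example "mars:/sam_arch3/proj_3"
            volumes[vsn] = (host, path)
        else:                                 # for example "/sam_arch1" (local path)
            volumes[vsn] = (None, target)
    return volumes

PLUTO_DISKVOLS = """\
# VSN Name     [Host Name:]Path
disk01                     /sam_arch1
disk02                     /sam_arch2/proj_1
disk03                mars:/sam_arch3/proj_3
disk04                     /sam_arch4/proj_4
"""

volumes = parse_diskvols(PLUTO_DISKVOLS)
for vsn, (host, path) in sorted(volumes.items()):
    where = host if host else "local host"
    print(f"{vsn}: {path} on {where}")
```

Only disk03 carries a host prefix, so it is the only volume in this example that involves the server system.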

Example 2

In this example, file /sam1/testdir0/filea is in archive set arset0.1, and the archiver copies the content of this file to the destination path /sam_arch1. CODE EXAMPLE 3-39 shows the diskvols.conf file.


CODE EXAMPLE 3-39 A diskvols.conf File
# This is file /etc/opt/SUNWsamfs/diskvols.conf
#
# VSN Name   [Host Name:]Path
#
disk01                  /sam_arch1
disk02                  /sam_arch12/proj_1

CODE EXAMPLE 3-40 shows the archiver.cmd file lines that pertain to disk archiving:


CODE EXAMPLE 3-40 Directives in the archiver.cmd File That Pertain to Disk Archiving
.
vsns
arset0.1 dk disk01
endvsns
.

CODE EXAMPLE 3-41 shows output from the sls(1) command for file filea, which was archived to disk. Note the following in the copy 1 line:

Example 3

In this example, file /sam2/my_proj/fileb is on client host snickers in archive set arset0.1, and the archiver copies the content of this file to the destination path /sam_arch1 on server host mars.

CODE EXAMPLE 3-42 shows the diskvols.conf file on snickers.


CODE EXAMPLE 3-42 The diskvols.conf File on snickers
# This is file /etc/opt/SUNWsamfs/diskvols.conf on snickers
#
# VSN Name   [Host Name:]Path
#
disk01        mars:/sam_arch1

CODE EXAMPLE 3-43 shows the diskvols.conf file on mars.


CODE EXAMPLE 3-43 The diskvols.conf File on mars
# This is file /etc/opt/SUNWsamfs/diskvols.conf on mars
#
clients
snickers
endclients

CODE EXAMPLE 3-44 shows the directives in the archiver.cmd file that relate to this example.


CODE EXAMPLE 3-44 Directives in the archiver.cmd File That Pertain to Disk Archiving
.
vsns
arset0.1 dk disk01
endvsns
.
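
In the two-host examples, the server's diskvols.conf file names the client system in a clients section. The following Python sketch is a hypothetical check, not the way the Sun StorEdge SAM-FS software itself reads the file. It assumes that host names appear one per line between clients and endclients, as in CODE EXAMPLE 3-43, and reports whether a given host is listed.

```python
# Illustrative only: a hypothetical check, not SAM-FS code.
# Assumes host names are listed one per line between "clients" and
# "endclients", as in CODE EXAMPLE 3-43.

def listed_clients(text):
    """Return the set of host names found in the clients section."""
    clients = set()
    in_section = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        if line == "clients":
            in_section = True
        elif line == "endclients":
            in_section = False
        elif in_section:
            clients.add(line)
    return clients

MARS_DISKVOLS = """\
# This is file /etc/opt/SUNWsamfs/diskvols.conf on mars
#
clients
snickers
endclients
"""

clients = listed_clients(MARS_DISKVOLS)
print("snickers" in clients)   # the client system in this example
print("pluto" in clients)      # a host that is not listed in this file
```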


Planning Archiving Operations

The archiver automates storage management operations using the archiver.cmd file. Before writing this file, it is useful to review some general guidelines that can improve the performance of your Sun StorEdge SAM-FS file system and the archiver and that can help ensure that your data is stored in the safest way possible.

The Preview Queue

Both the archiver and stager processes can request that media be loaded and unloaded. If the number of requests exceeds the number of drives available for media loads, the excess requests are sent to the preview queue.

Archive and stage requests in the preview queue are those that cannot be immediately satisfied. By default, preview requests are satisfied in first-in-first-out (FIFO) order.

You can assign different priorities to preview requests. You can override the FIFO default by entering directives in the preview command file, which is written to /etc/opt/SUNWsamfs/preview.cmd. For more information about this file and setting priorities for archiving and staging, see Prioritizing Preview Requests.
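
The difference between the default FIFO order and a prioritized order can be pictured with a small Python sketch. This is a conceptual model only: the request names and priority values are made up, and the code reflects neither the preview.cmd syntax nor the actual Sun StorEdge SAM-FS implementation.

```python
# Conceptual sketch only; request names and priority values are invented,
# and this is neither the SAM-FS implementation nor preview.cmd syntax.
import heapq
from collections import deque

requests = ["stage VSN TAPE03", "archive VSN optic10", "stage VSN TAPE01"]

# Default behavior: requests are satisfied first-in-first-out.
fifo = deque(requests)
fifo_order = [fifo.popleft() for _ in range(len(requests))]

# With priorities: the highest priority is served first, and the arrival
# position breaks ties so equal-priority requests stay in FIFO order.
priorities = {
    "stage VSN TAPE03": 0.0,
    "archive VSN optic10": 5.0,
    "stage VSN TAPE01": 0.0,
}
heap = [(-priorities[r], arrival, r) for arrival, r in enumerate(requests)]
heapq.heapify(heap)
priority_order = [heapq.heappop(heap)[2] for _ in range(len(requests))]

print(fifo_order)
print(priority_order)
```

In the prioritized run, the archive request moves to the front, and the two stage requests keep their original relative order.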


Archiver Examples

This section provides some examples of archiving processes in real-world environments.

Example 1: No archiver.cmd File

This example illustrates the action of the archiver when no archiver.cmd file is used in a Sun StorEdge SAM-FS environment with one file system, an optical automated library with two drives, and six cartridges.

CODE EXAMPLE 3-45 shows the output produced by the archiver(1M) -lv command. It shows that the default media selected by the archiver is type mo. Only the mo media are available.


CODE EXAMPLE 3-45 archiver (1M) -lv Output Showing Archive Media
# archiver -lv
Notify file: /etc/opt/SUNWsamfs/scripts/archiver.sh
Archive media:
media:lt archmax: 512.0M Volume overflow not selected
media:mo archmax:   4.8M Volume overflow not selected

CODE EXAMPLE 3-46 shows output that indicates that the archiver uses two drives. It lists the 12 volumes, storage capacity, and available space.



Note - The archiver(1M) -lv command shows only VSNs with space available.




CODE EXAMPLE 3-46 archiver (1M) -lv Output Showing Available VSNs
Archive libraries:
Device:hp30 drives_available:2 archive_drives:2
  Catalog:
  mo.optic00          capacity:  1.2G space: 939.7M  -il-o-------
  mo.optic01          capacity:  1.2G space: 934.2M  -il-o-------
  mo.optic02          capacity:  1.2G space: 781.7M  -il-o-------
  mo.optic03          capacity:  1.2G space:   1.1G  -il-o-------
  mo.optic10          capacity:  1.2G space:  85.5M  -il-o-------
  mo.optic11          capacity:  1.2G space:  0      -il-o-------
  mo.optic12          capacity:  1.2G space: 618.9k  -il-o-------
  mo.optic13          capacity:  1.2G space: 981.3M  -il-o-------
  mo.optic20          capacity:  1.2G space:   1.1G  -il-o-------
  mo.optic21          capacity:  1.2G space:   1.1G  -il-o-------
  mo.optic22          capacity:  1.2G space: 244.9k  -il-o-------
  mo.optic23          capacity:  1.2G space:   1.1G  -il-o-------

CODE EXAMPLE 3-47 shows that the archive set samfs includes both metadata and data files. The archiver makes one copy of the files when their archive age reaches the default four minutes (240 seconds).


CODE EXAMPLE 3-47 archiver (1M) -lv Output Showing Archive File Selections
Archive file selections:
Filesystem samfs  Logfile:
samfs Metadata
    copy:1  arch_age:240
samfs  path:.
    copy:1  arch_age:240

CODE EXAMPLE 3-48 shows the files in the archive sets archived to the volumes in the indicated order.


CODE EXAMPLE 3-48 archiver (1M) -lv Output Showing Archive Sets and Volumes
Archive sets:
allsets
samfs.1
 media: mo (by default)
 Volumes:
   optic00
   optic01
   optic02
   optic03
   optic10
   optic12
   optic13
   optic20
   optic21
   optic22
   optic23
 Total space available:   8.1G
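
The Volumes list in CODE EXAMPLE 3-48 omits optic11, which the catalog reports as having no space, and the total is the sum of the space on the remaining volumes. The following Python sketch reproduces that arithmetic from the catalog figures in CODE EXAMPLE 3-46. The helper parse_size and the 1024-based units are assumptions, and because the catalog prints rounded values, the computed sum only approximates the 8.1G reported by the archiver.

```python
# Illustrative only: parse_size and the 1024-based units are assumptions.
# The figures are the rounded values printed in CODE EXAMPLE 3-46, so the
# sum is approximate.

UNITS = {"k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def parse_size(text):
    """Convert a string such as '939.7M' or '0' to a number of bytes."""
    text = text.strip()
    if text and text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)

catalog = {
    "optic00": "939.7M", "optic01": "934.2M", "optic02": "781.7M",
    "optic03": "1.1G",   "optic10": "85.5M",  "optic11": "0",
    "optic12": "618.9k", "optic13": "981.3M", "optic20": "1.1G",
    "optic21": "1.1G",   "optic22": "244.9k", "optic23": "1.1G",
}

# Keep only the volumes that still have space, as the Volumes list does.
usable = {vsn: parse_size(s) for vsn, s in catalog.items() if parse_size(s) > 0}
total_gb = sum(usable.values()) / UNITS["G"]

print(sorted(usable))
print(f"approximate total: {total_gb:.1f}G")
```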

Example 2: Data Files Archived Separately From Metadata

This example shows how to separate data files into two archive sets separate from the metadata. The environment includes a manually mounted DLT tape drive in addition to an optical automated library. The big files are archived to tape, and the small files are archived to optical cartridges.

CODE EXAMPLE 3-49 shows the content of the archiver.cmd file.


CODE EXAMPLE 3-49 archiver (1M) -lv Output Showing the archiver.cmd File
# archiver -lv -c example2.cmd
Reading archiver command file "example2.cmd"
1: # Example 2 archiver command file
2: # Simple selections based on size
3: 
4: logfile = /var/opt/SUNWsamfs/archiver/log
5: interval = 5m
6: 
7: # File selections.
8: big . -minsize 500k
9: all .
10:    1 30s
11: 
12: vsns
13: samfs.1 mo .*0[0-2]       # Metadata to optic00 - optic02
14: all.1 mo .*0[3-9] .*[1-2][0-9]  # All others for files
15: big.1 lt .*
16: endvsns

CODE EXAMPLE 3-50 shows the media and drives to be used.


CODE EXAMPLE 3-50 archiver (1M) -lv Output Showing Media and Drives
Notify file: /etc/opt/SUNWsamfs/scripts/archiver.sh
Archive media:
media:lt archmax: 512.0M Volume overflow not selected
media:mo archmax:   4.8M Volume overflow not selected
Archive libraries:
Device:hp30 drives_available:0 archive_drives:0
  Catalog:
  mo.optic00        capacity:  1.2G space: 939.7M  -il-o-------
  mo.optic01        capacity:  1.2G space: 934.2M  -il-o-------
  mo.optic02        capacity:  1.2G space: 781.7M  -il-o-------
  mo.optic03        capacity:  1.2G space:   1.1G  -il-o-------
  mo.optic04        capacity:  1.2G space: 983.2M  -il-o-------
  mo.optic10        capacity:  1.2G space:  85.5M  -il-o-------
  mo.optic11        capacity:  1.2G space:   0     -il-o-------
  mo.optic12        capacity:  1.2G space: 618.9k  -il-o-------
  mo.optic13        capacity:  1.2G space: 981.3M  -il-o-------
  mo.optic20        capacity:  1.2G space:   1.1G  -il-o-------
  mo.optic21        capacity:  1.2G space:   1.1G  -il-o-------
  mo.optic22        capacity:  1.2G space: 244.9k  -il-o-------
  mo.optic23        capacity:  1.2G space:   1.1G  -il-o-------
Device:lt40 drives_available:0 archive_drives:0
  Catalog:
  lt.TAPE01         capacity:  9.5G space:  8.5G  -il-o-------
  lt.TAPE02         capacity:  9.5G space:  6.2G  -il-o-------
  lt.TAPE03         capacity:  9.5G space:  3.6G  -il-o-------
  lt.TAPE04         capacity:  9.5G space:  8.5G  -il-o-------
  lt.TAPE05         capacity:  9.5G space:  8.5G  -il-o-------
  lt.TAPE06         capacity:  9.5G space:  7.4G  -il-o-------



Note - The archiver(1M) -lv command shows only VSNs with space available.



CODE EXAMPLE 3-51 shows the organization of the file system. Files bigger than 512000 bytes (500 kilobytes) are archived after four minutes; all other files are archived after 30 seconds.


CODE EXAMPLE 3-51 archiver (1M) -lv Output Showing File System Organization
Archive file selections:
Filesystem samfs  Logfile: /var/opt/SUNWsamfs/archiver/log
samfs Metadata
    copy:1  arch_age:240
big  path:. minsize:502.0k
    copy:1  arch_age:240
all  path:.
    copy:1  arch_age:30
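
The file selections above can be modeled as an ordered list of rules. The following Python sketch assumes that a file is assigned to the first archive set definition it satisfies and that the -minsize limit is inclusive. Both points are assumptions made for illustration, and the rule table and function name are hypothetical rather than anything the archiver exposes.

```python
# Illustrative only: a simplified model of archive set assignment for the
# two data-file sets in this example. The rule table and helper name are
# hypothetical; only "-minsize 500k" and the ages come from the example.

RULES = [
    # (archive set, minimum size in bytes, archive age in seconds)
    ("big", 500 * 1024, 240),   # big . -minsize 500k  (default archive age)
    ("all", 0, 30),             # all .                (copy 1 after 30s)
]

def assign_archive_set(size_bytes):
    """Return (set name, archive age) for the first rule the file satisfies."""
    for name, min_size, arch_age in RULES:
        if size_bytes >= min_size:   # assumption: the limit is inclusive
            return name, arch_age
    raise ValueError("no rule matched")

print(assign_archive_set(2 * 1024 * 1024))   # a 2-megabyte file
print(assign_archive_set(10 * 1024))         # a 10-kilobyte file
```

Because the big rule is listed first, large files never reach the catch-all rule, which is why the order of the definitions in archiver.cmd matters in this model.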

CODE EXAMPLE 3-52 shows the division of the archive sets among the removable media.


CODE EXAMPLE 3-52 archiver (1M) -lv Output Showing Archive Sets and Removable Media
Archive sets:
allsets
all.1
 media: mo
 Volumes:
   optic03
   optic04
   optic10
   optic12
   optic13
   optic20
   optic21
   optic22
   optic23
 Total space available:   6.3G
big.1
 media: lt
 Volumes:
   TAPE01
   TAPE02
   TAPE03
   TAPE04
   TAPE05
   TAPE06
 Total space available:  42.8G
samfs.1
 media: mo
 Volumes:
   optic00
   optic01
   optic02
 Total space available:   2.6G
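
The vsns section in this example selects volumes with regular expressions. The following Python sketch applies the same patterns to the optical volumes that have space available and reproduces the samfs.1 and all.1 volume lists shown above. It uses Python's re module as a stand-in for the archiver's own pattern matching, which is an assumption that appears to hold for patterns this simple.

```python
# Illustrative only: Python's re module stands in for the archiver's own
# regular-expression matching, which is an assumption.
import re

# Optical volumes from CODE EXAMPLE 3-50 that have space available
# (optic11 is omitted because its space is 0).
VOLUMES = [
    "optic00", "optic01", "optic02", "optic03", "optic04",
    "optic10", "optic12", "optic13",
    "optic20", "optic21", "optic22", "optic23",
]

# Patterns copied from the vsns section of example2.cmd.
VSNS = {
    "samfs.1": [r".*0[0-2]"],
    "all.1": [r".*0[3-9]", r".*[1-2][0-9]"],
}

def select(patterns, volumes):
    """Return the volumes whose names match at least one pattern."""
    return [v for v in volumes if any(re.search(p, v) for p in patterns)]

selected = {name: select(patterns, VOLUMES) for name, patterns in VSNS.items()}
for name, vols in selected.items():
    print(name, vols)
```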

Example 3: User and Data Files Archived to Different Media

In this example, user files and project data files are archived to various media. Files from the directory data are segregated by size to optical and tape media. Files assigned to the group ID pict are assigned to another set of volumes. Files in the directories tmp and users/bob are not archived. The archiver interval is set to 30 seconds, and an archiving record is kept.

CODE EXAMPLE 3-53 shows the output of the archiver(1M) -lv -c command in this example.


CODE EXAMPLE 3-53 archiver (1M) -lv -c Command Output
# archiver -lv -c example3.cmd
Reading archiver command file "example3.cmd"
1: # Example 3 archiver command file
2: # Segregation of users and data
3: 
4: interval = 30s
5: logfile = /var/opt/SUNWsamfs/archiver/log
6: 
7: no_archive tmp
8: 
9: fs = samfs
10: no_archive users/bob
11: prod_big data -minsize 50k
12:    1 1m 30d
13:    2 3m
14: prod data
15:    1 1m
16: proj_1 projs/proj_1
17:    1 1m
18:    2 1m
19: joe . -user joe
20:    1 1m
21:    2 1m
22: pict . -group pict
23:    1 1m
24:    2 1m
25: 
26: params
27: prod_big.1 -drives 2
28: prod_big.2 -drives 2
29: endparams
30: 
31: vsns
32: samfs.1 mo optic0[0-1]$
33: joe.1 mo optic01$
34: pict.1 mo optic02$
35: pict.2 mo optic03$
36: proj_1.1 mo optic1[0-1]$
37: proj_1.2 mo optic1[2-3]$
38: prod.1 mo optic2.$
39: joe.2 lt 0[1-2]$
40: prod_big.1 lt 0[3-4]$
41: prod_big.2 lt 0[5-6]$
42: endvsns
Notify file: /etc/opt/SUNWsamfs/scripts/archiver.sh
Archive media:
media:lt archmax: 512.0M Volume overflow not selected
media:mo archmax:   4.8M Volume overflow not selected
Archive libraries:
Device:hp30 drives_available:0 archive_drives:0
  Catalog:
  mo.optic00        capacity:  1.2G space: 939.7M  -il-o-------
  mo.optic01        capacity:  1.2G space: 934.2M  -il-o-------
  mo.optic02        capacity:  1.2G space: 781.7M  -il-o-------
  mo.optic03        capacity:  1.2G space:   1.1G  -il-o-------
  mo.optic04        capacity:  1.2G space: 983.2M  -il-o-------
  mo.optic10        capacity:  1.2G space:  85.5M  -il-o-------
  mo.optic11        capacity:  1.2G space:  0      -il-o-------
  mo.optic12        capacity:  1.2G space: 618.9k  -il-o-------
  mo.optic13        capacity:  1.2G space: 981.3M  -il-o-------
  mo.optic20        capacity:  1.2G space:   1.1G  -il-o-------
  mo.optic21        capacity:  1.2G space:   1.1G  -il-o-------
  mo.optic22        capacity:  1.2G space: 244.9k  -il-o-------
  mo.optic23        capacity:  1.2G space:   1.1G  -il-o-------
Device:lt40 drives_available:0 archive_drives:0
  Catalog:
  lt.TAPE01          capacity:  9.5G space:  8.5G  -il-o-------
  lt.TAPE02          capacity:  9.5G space:  6.2G  -il-o-------
  lt.TAPE03          capacity:  9.5G space:  3.6G  -il-o-------
  lt.TAPE04          capacity:  9.5G space:  8.5G  -il-o-------
  lt.TAPE05          capacity:  9.5G space:  8.5G  -il-o-------
  lt.TAPE06          capacity:  9.5G space:  7.4G  -il-o-------
Archive file selections:
Filesystem samfs  Logfile: /var/opt/SUNWsamfs/archiver/log
samfs Metadata
    copy:1  arch_age:240
no_archive Noarchive path:users/bob
prod_big  path:data minsize:50.2k
    copy:1  arch_age:60 unarch_age:2592000
    copy:2  arch_age:180
prod  path:data
    copy:1  arch_age:60
proj_1  path:projs/proj_1
    copy:1  arch_age:60
    copy:2  arch_age:60
joe  path:. uid:10006
    copy:1  arch_age:60
    copy:2  arch_age:60
pict  path:. gid:8005
    copy:1  arch_age:60
    copy:2  arch_age:60
no_archive Noarchive path:tmp
samfs  path:.
    copy:1  arch_age:240
Archive sets:
allsets
joe.1
 media: mo
 Volumes:
   optic01
 Total space available: 934.2M
joe.2
 media: lt
 Volumes:
   TAPE01
   TAPE02
 Total space available:  14.7G
pict.1
 media: mo
 Volumes:
   optic02
 Total space available: 781.7M
pict.2
 media: mo
 Volumes:
   optic03
 Total space available:   1.1G
prod.1
 media: mo
 Volumes:
   optic20
   optic21
   optic22
   optic23
 Total space available:   3.3G
prod_big.1
 media: lt drives:2
 Volumes:
   TAPE03
   TAPE04
 Total space available:  12.1G
prod_big.2
 media: lt drives:2
 Volumes:
   TAPE05
   TAPE06
 Total space available:  16.0G
proj_1.1
 media: mo
 Volumes:
   optic10
 Total space available:  85.5M
proj_1.2
 media: mo
 Volumes:
   optic12
   optic13
 Total space available: 981.9M
samfs.1
 media: mo
 Volumes:
   optic00
   optic01
 Total space available:   1.8G
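
The ages in example3.cmd are written with unit suffixes (1m, 3m, 30d), whereas the archiver output reports them in seconds (arch_age:60, arch_age:180, unarch_age:2592000). The following Python sketch shows the conversion. The function name is hypothetical, and only the suffixes that appear in this chapter's examples are handled.

```python
# Illustrative only: a hypothetical converter for the age strings used in
# this chapter's examples. Only the s, m, and d suffixes are handled.

SECONDS = {"s": 1, "m": 60, "d": 86400}

def age_to_seconds(age):
    """Convert an age such as '1m' or '30d' to seconds."""
    return int(age[:-1]) * SECONDS[age[-1]]

for age in ("30s", "1m", "3m", "30d"):
    print(age, "->", age_to_seconds(age))
```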

Example 4: User and Data Files Archived to Optical Media

In this example, user files and project data files are archived to optical media.

Four VSN pools are defined; three pools are used for user, data, and project, and one is a scratch pool. When proj_pool runs out of media, it relies on scratch_pool to reserve volumes. This example shows how to reserve volumes for each archive set based on the set component, owner component, and file system component. The archiver interval is set to 30 seconds, files are archived when their archive age reaches 10 minutes, and an archiving log is kept.

CODE EXAMPLE 3-54 shows the archiver.cmd file and archiver output.


CODE EXAMPLE 3-54 archiver.cmd File and Archiver Output
Reading archiver command file "example4.cmd"
1: # Example 4 archiver command file
2: # Using 4 VSN pools
3: 
4: interval = 30s
5: logfile = /var/opt/SUNWsamfs/archiver/log
6: 
7: fs = samfs
8: users users
9:     1 10m
10: 
11: data data
12:    1 10m
13: 
14: proj projects
15:    1 10m
16: 
17: params
18: users.1 -reserve user
19: data.1 -reserve group
20: proj.1 -reserve dir -reserve fs
21: endparams
22: 
23: vsnpools
24: users_pool mo optic0[1-3]$
25: data_pool mo optic1[0-1]$
26: proj_pool mo optic1[2-3]$
27: scratch_pool mo optic2.$
28: endvsnpools
29: 
30: vsns
31: samfs.1 mo optic00
32: users.1 mo -pool users_pool -pool scratch_pool
33: data.1 mo -pool data_pool -pool scratch_pool
34: proj.1 mo -pool proj_pool -pool scratch_pool
35: endvsns
Notify file: /etc/opt/SUNWsamfs/scripts/archiver.sh
Archive media:
media:mo archmax:   4.8M Volume overflow not selected
Archive libraries:
Device:hp30 drives_available:0 archive_drives:0
  Catalog:
  mo.optic00        capacity:  1.2G space: 939.7M  -il-o-------
  mo.optic01        capacity:  1.2G space: 934.2M  -il-o-------
  mo.optic02        capacity:  1.2G space: 781.7M  -il-o-------
  mo.optic03        capacity:  1.2G space:   1.1G  -il-o-------
  mo.optic04        capacity:  1.2G space: 983.2M  -il-o-------
  mo.optic10        capacity:  1.2G space:  85.5M  -il-o-------
  mo.optic11        capacity:  1.2G space:   0     -il-o-------
  mo.optic12        capacity:  1.2G space: 618.9k  -il-o-------
  mo.optic13        capacity:  1.2G space: 981.3M  -il-o-------
  mo.optic20        capacity:  1.2G space:   1.1G  -il-o-------
  mo.optic21        capacity:  1.2G space:   1.1G  -il-o-------
  mo.optic22        capacity:  1.2G space: 244.9k  -il-o-------
  mo.optic23        capacity:  1.2G space:   1.1G  -il-o-------
Archive file selections:
Filesystem samfs  Logfile: /var/opt/SUNWsamfs/archiver/log
samfs Metadata
    copy:1  arch_age:240
users  path:users
    copy:1  arch_age:600
data  path:data
    copy:1  arch_age:600
proj  path:projects
    copy:1  arch_age:600
samfs  path:.
    copy:1  arch_age:240
VSN pools:
data_pool media: mo Volumes:
   optic10
 Total space available:  85.5M
proj_pool media: mo Volumes:
   optic12
   optic13
 Total space available: 981.9M
scratch_pool media: mo Volumes:
   optic20
   optic21
   optic22
   optic23
 Total space available:   3.3G
users_pool media: mo Volumes:
   optic01
   optic02
   optic03
 Total space available:   2.7G
Archive sets:
allsets
data.1
  reserve:/group/
 media: mo
 Volumes:
   optic10
   optic20
   optic21
   optic22
   optic23
 Total space available:   3.4G
proj.1
  reserve:/dir/fs
 media: mo
 Volumes:
   optic12
   optic13
   optic20
   optic21
   optic22
   optic23
 Total space available:   4.2G
samfs.1
 media: mo
 Volumes:
   optic00
 Total space available: 939.7M
users.1
  reserve:/user/
 media: mo
 Volumes:
   optic01
   optic02
   optic03
   optic20
   optic21
   optic22
   optic23
 Total space available:   6.0G
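
The way each archive set draws on its own pool and then on scratch_pool can be sketched in Python. The pool patterns and pool assignments below are copied from example4.cmd; the data structures and function names are hypothetical, and Python's re module is used as a stand-in for the archiver's pattern matching. The result reproduces the Volumes lists shown for data.1, proj.1, and users.1.

```python
# Illustrative only: the structures and helper names are hypothetical, and
# Python's re module stands in for the archiver's pattern matching.
import re

# Optical volumes from the catalog above that have space available
# (optic11 is omitted because its space is 0).
VOLUMES_WITH_SPACE = [
    "optic00", "optic01", "optic02", "optic03", "optic04",
    "optic10", "optic12", "optic13",
    "optic20", "optic21", "optic22", "optic23",
]

# Patterns from the vsnpools section of example4.cmd.
POOLS = {
    "users_pool": r"optic0[1-3]$",
    "data_pool": r"optic1[0-1]$",
    "proj_pool": r"optic1[2-3]$",
    "scratch_pool": r"optic2.$",
}

# Pool assignments from the vsns section, in the order they are listed.
ARCHIVE_SETS = {
    "users.1": ["users_pool", "scratch_pool"],
    "data.1": ["data_pool", "scratch_pool"],
    "proj.1": ["proj_pool", "scratch_pool"],
}

def pool_volumes(pool):
    """Return the volumes with space whose names match the pool pattern."""
    return [v for v in VOLUMES_WITH_SPACE if re.search(POOLS[pool], v)]

def set_volumes(archive_set):
    """Return the volumes for an archive set, own pool first, then scratch."""
    volumes = []
    for pool in ARCHIVE_SETS[archive_set]:
        for v in pool_volumes(pool):
            if v not in volumes:
                volumes.append(v)
    return volumes

for name in ARCHIVE_SETS:
    print(name, set_volumes(name))
```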