sam-archiverd - Oracle HSM file archive daemon
∕opt∕SUNWsamfs∕sbin∕sam-archiverd
SUNWsamfs
The archiver daemon automatically archives Oracle HSM files when an Oracle HSM file system is mounted. It is started by sam-fsd, and it cannot be executed from a command line. Directives for controlling the archiver are read from the archiver commands file, which is ∕etc∕opt∕SUNWsamfs∕archiver.cmd.
This file does not have to be present for the archiver daemon to execute. If the archiver.cmd file is present, however, it must be free of errors. Errors in the archiver.cmd file prevent the archiver daemon from executing. If the archiver.cmd file is not present, all files on the file system are archived to the available removable media according to archiver defaults.
sam-archiverd executes in the directory ∕var∕opt∕SUNWsamfs∕archiver. This is the archiver's working directory. Each sam-arfind daemon executes in a subdirectory named for the file system being archived. Each sam-arcopy daemon executes in a subdirectory named for the archive file (rm0 - rmxx) being archived to.
Archive sets are the mechanism that the archiver uses to direct files in a samfs file system to media during archiving.
All files in the file system are members of one and only one archive set. Characteristics of a file are used to determine archive set membership. All files in an archive set are copied to the media associated with the archive set. The archive set name is simply a synonym for a collection of media volumes.
Files are written to the media in an archive file, which is written in tar format. The combination of the archive set and the tar format results in an operation that is just like using the find(1) command to select files for the tar command.
In addition, the file system metadata (directories, the index of segmented files, and the removable media information) is assigned to an archive set to be copied to media. The archive set name is the name of the file system (see mcf(4)). Symbolic links are considered data files for the purposes of archiving.
Each archive set may have up to four archive copies defined. The copies provide duplication of files on different media. Copies are selected by the archive age of a file.
Files in an archive set are candidates for archival action after a period of time, the archive age, has elapsed. The archive age of a file is computed using a selectable time reference for each file. The default time reference is the file's modification time.
For files in archive sets that specify an unarchive age, the default time reference for the unarchive age is the file's access time. Two other conditions are recognized in this case. If the modification time is later than the access time, the modification time is used. And if an archive copy was unarchived, the file is rearchived only after it is staged from another copy, that is, the file was offline when a read access was made to it.
Because users can change these time references to values far in the past or future, the archiver adjusts the time reference to keep it in the range creation_time <= time_ref <= time_now.
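The range adjustment can be illustrated with a minimal Python sketch. It assumes that all three values are integer timestamps in seconds and that the intended range is creation_time <= time_ref <= time_now; the function name clamp_time_ref is hypothetical and is not part of the archiver.

```python
def clamp_time_ref(time_ref: int, creation_time: int, time_now: int) -> int:
    """Return time_ref limited to the range [creation_time, time_now].

    Hypothetical helper: a reference earlier than the file's creation is
    raised to creation_time, and one in the future is lowered to time_now.
    """
    return max(creation_time, min(time_ref, time_now))


# A reference set far in the past is pulled up to the creation time.
print(clamp_time_ref(time_ref=0, creation_time=1_000, time_now=2_000))
# A reference set in the future is pulled back to the current time.
print(clamp_time_ref(time_ref=9_999, creation_time=1_000, time_now=2_000))
```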
Each file system is examined by an individual sam-arfind. The examination is accomplished by one of four methods. The method is selected by the examine = method directive (see archiver.cmd(4)).
The examination methods are:
Continuous archiving method
sam-arfind scans directories as files and directories are created and changed.
Traditional examination method
When it executes for the first time, sam-arfind scans all directories recursively, examining every file and giving any that do not need archiving the file status archdone. Thereafter, sam-arfind scans the .inodes file.
Directory tree examination method
sam-arfind recursively descends through the directory tree, skipping any directory that has the noarchive attribute set. This allows the system administrator to identify directories that contain only previously archived files and subdirectories. This approach can dramatically reduce the work required when examining a file system.
.inodes file examination method
sam-arfind scans the .inodes file and examines only those inodes for which the file status is not archdone. If a large percentage of the files in the file system already have status archdone, this method is faster than the scandirs method.
During the examination, the archiver determines the archive set to which each file belongs by comparing the file's properties to those required for set membership. If the file's properties make it a member of an archive set and if the archive age of the file has been met or exceeded, the archiver adds the file to an archive request (ArchReq) for that archive set.
An ArchReq defines a batch of files that can be archived together.
The ArchReqs are files stored in individual directories named for the corresponding file system: ∕var∕opt∕SUNWsamfs∕archiver∕file_system_name∕ArchReq. ArchReqs are binary files that can be displayed using the showqueue(1m) command. An ArchReq is removed once the files it specifies have been archived.
For segmented files, the segment, not the entire file, is the archivable unit, so properties such as minimum file size and priorities apply to the segment.
The characteristics used for determining which archive set a file belongs in are:
directory path portion of the file's name
complete file name using a regular expression
user name of the file's owner
group name of the file's owner
minimum file size
maximum file size
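As an illustration only, the membership test can be sketched in Python. The class and function names below are hypothetical, and two behaviours are assumptions rather than documented archiver behaviour: the size limits are treated as inclusive, and the first matching definition wins.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FileInfo:
    path: str    # path relative to the file system mount point
    user: str
    group: str
    size: int    # bytes


@dataclass
class SetCriteria:
    name: str
    directory: str = ""               # directory path portion; "" matches all
    name_regex: Optional[str] = None  # applied to the complete file name
    user: Optional[str] = None
    group: Optional[str] = None
    min_size: Optional[int] = None
    max_size: Optional[int] = None

    def matches(self, f: FileInfo) -> bool:
        if self.directory and not f.path.startswith(self.directory.rstrip("/") + "/"):
            return False
        if self.name_regex is not None and not re.search(self.name_regex, f.path):
            return False
        if self.user is not None and f.user != self.user:
            return False
        if self.group is not None and f.group != self.group:
            return False
        if self.min_size is not None and f.size < self.min_size:
            return False
        if self.max_size is not None and f.size > self.max_size:
            return False
        return True


def select_archive_set(f: FileInfo, sets: List[SetCriteria]) -> Optional[str]:
    """Return the name of the first set whose criteria the file satisfies."""
    for criteria in sets:
        if criteria.matches(f):
            return criteria.name
    return None  # the caller decides what to do with unmatched files
```

Ordering the definitions from most to least specific keeps each file in exactly one set under this first-match assumption.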
If a file is offline, the archiver selects the volume used as the source for the archive copy. If a file copy is being rearchived, the archiver selects the volume that holds the copy.
The archiver gives each file a file archive priority. This archive priority is computed from properties of the file and property multipliers associated with the archive set. The computation is effectively:
ArchivePriority = sum(Pn * Mn)
where:
Pn is the value of a file property.
Mn is the property multiplier.
Most property values are 1 or 0, according to whether the property is TRUE or FALSE. For instance, the value of the property Copy 1 is 1 if archive copy 1 is being made. The values of Copy 2, Copy 3, and Copy 4 are therefore 0. Others, such as Archive age and File size, may have values other than 0 or 1.
The archive priority and the property multipliers are floating point numbers. The default value for all property multipliers is 0.
The file properties used in the priority calculation are:
Archive age
The number of seconds that have elapsed since the file's archive age time reference (time_now - time_ref)
Copy 1
1 (true) if archive copy 1 is being made, 0 (false) otherwise
Copy 2
1 (true) if archive copy 2 is being made, 0 (false) otherwise
Copy 3
1 (true) if archive copy 3 is being made, 0 (false) otherwise
Copy 4
1 (true) if archive copy 4 is being made, 0 (false) otherwise
Copies made
The number of archive copies previously made
File size
The size of the file in bytes
Archive immediate
1 (true) if immediate archiving has been requested for the file, 0 (false) otherwise
Rearchive
1 (true) if an archive copy is being rearchived, 0 (false) otherwise
Required for release
1 (true) if the archive copy is required before the file can be released, 0 (false) otherwise
All the priorities that apply for a file are added together. The priority of the ArchReq is set to the highest file priority in the ArchReq.
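The priority computation can be sketched in Python as follows. This is a minimal illustration, not the archiver's implementation: the function names are hypothetical, the dictionary keys simply reuse the property labels listed above, and a property with no multiplier specified is given the default multiplier of 0.

```python
from typing import Dict, Iterable


def archive_priority(properties: Dict[str, float],
                     multipliers: Dict[str, float]) -> float:
    """ArchivePriority = sum(Pn * Mn); unspecified multipliers default to 0."""
    return sum(value * multipliers.get(name, 0.0)
               for name, value in properties.items())


def archreq_priority(file_priorities: Iterable[float]) -> float:
    """An ArchReq takes the highest priority among its member files."""
    return max(file_priorities)


# Hypothetical file: copy 1 is being made, 600 seconds past its time reference.
props = {"Copy 1": 1, "Archive age": 600, "File size": 4096}
mults = {"Copy 1": 4000, "Archive age": 0.5}
print(archive_priority(props, mults))  # 1*4000 + 600*0.5 + 4096*0 = 4300.0
```

With every multiplier left at the default, every file has priority 0, which is why the property multipliers must be set before priorities influence archiving order.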
When the file system scan is finished, the archiver sends each ArchReq to sam-archiverd.
If the ArchReq requires automatic owner archive sets, the sam-archiverd daemon separates the files in the ArchReq by owner.
Then sam-archiverd sorts the files as specified by the -sort method copy parameter. Sorting helps to keep the files together in the archive files. By default, files are archived in the order encountered during the file system scan.
Next, sam-archiverd divides the files in the ArchReq into online and offline groups. Online files are archived together, and offline files are archived together.
sam-archiverd sets the priority of the ArchReq to the priority of the highest-priority member file.
Finally, sam-archiverd enters the ArchReq into the scheduling queue in priority order.
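The composition sequence above can be sketched in Python. All names here are hypothetical, the ArchReq is assumed to contain at least one file, and placing the online group ahead of the offline group is an assumption made for the illustration; the text states only that each group is archived together.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Entry:
    path: str
    owner: str
    offline: bool
    priority: float


def split_by_owner(entries: List[Entry]) -> List[List[Entry]]:
    """Separate the files by owner, keeping scan order within each owner."""
    by_owner: Dict[str, List[Entry]] = {}
    for e in entries:
        by_owner.setdefault(e.owner, []).append(e)
    return list(by_owner.values())


def compose(entries: List[Entry],
            sort_key: Optional[Callable[[Entry], object]] = None
            ) -> Tuple[float, List[Entry]]:
    """Sort, group into online and offline, and compute the ArchReq priority."""
    # Without a sort method, files stay in the order found by the scan.
    ordered = sorted(entries, key=sort_key) if sort_key else list(entries)
    online = [e for e in ordered if not e.offline]
    offline = [e for e in ordered if e.offline]
    priority = max(e.priority for e in ordered)
    return priority, online + offline


def enqueue(queue: List[Tuple[float, List[Entry]]],
            archreq: Tuple[float, List[Entry]]) -> None:
    """Keep the scheduling queue ordered with the highest priority first."""
    queue.append(archreq)
    queue.sort(key=lambda item: item[0], reverse=True)
```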
When an ArchReq is ready to be scheduled for a sam-arcopy operation, sam-archiverd assigns volumes to the candidate as follows:
If there is enough space for the ArchReq, sam-archiverd assigns the volume that was most recently used for the archive set to which the ArchReq applies.
If an ArchReq is too big for one volume, sam-archiverd archives as many files as will fit on the volume and archives the remainder later.
An ArchReq with a single file that is too large to fit on one volume and is larger than ovflmin will have additional volumes assigned as required. The additional volumes are selected in order of decreasing size. This is to minimize the number of volumes required for the file.
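Two of the volume assignment rules can be sketched in Python. The function names are hypothetical, and the sketch makes several assumptions: "size" is taken to mean the free space on each volume, files are taken in order until the first one that does not fit, and tar overhead is ignored.

```python
from typing import Dict, List, Tuple


def files_that_fit(file_sizes: List[int],
                   free_space: int) -> Tuple[List[int], List[int]]:
    """Split an ArchReq into files archived now and files left for later."""
    used = 0
    for i, size in enumerate(file_sizes):
        if used + size > free_space:
            return file_sizes[:i], file_sizes[i:]
        used += size
    return list(file_sizes), []


def overflow_volumes(file_size: int,
                     free_by_vsn: Dict[str, int],
                     ovflmin: int) -> List[str]:
    """Choose volumes for one file that must span more than one volume.

    Returns an empty list when overflow does not apply: the file fits on
    the largest volume, or the file is not larger than ovflmin.
    """
    largest = max(free_by_vsn.values(), default=0)
    if file_size <= largest or file_size <= ovflmin:
        return []
    chosen, remaining = [], file_size
    # Largest volumes first, so that the fewest volumes are needed.
    for vsn, free in sorted(free_by_vsn.items(), key=lambda kv: kv[1], reverse=True):
        chosen.append(vsn)
        remaining -= free
        if remaining <= 0:
            return chosen
    raise ValueError("not enough total space for the file")
```

Taking the largest volumes first is a greedy choice; it reaches the required capacity with the fewest volumes because no other selection of the same count can hold more.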
For each candidate ArchReq, compute a scheduling priority by adding the archive priority to the following properties and the associated multipliers:
Archive volume loaded
the first volume to be archived to is loaded in a drive
Files offline
the request contains offline files
Multiple archive volumes
the file being archived requires more than one volume
Multiple stage volumes
the file being archived is offline on more than one volume
Queue wait
seconds that the ArchReq has been queued
Stage volume loaded
the first volume that contains offline files is loaded in a drive
Enter each ArchReq into the archive queue in priority order.
Schedule only the number of sam-arcopy operations allowed by the number of drives in the library or by the number of drives allowed by the archive set. When all sam-arcopy operations are busy, wait for an operation to complete.
Repeat the scheduling sequence until all ArchReqs are processed.
If the archive set specifies multiple drives, divide the request for multiple drives.
Step through each ArchReq to mark the archive file boundaries so that each archive file will be less than archmax in size. If a file is larger than archmax, it will be the only file in an archive file.
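The boundary-marking rule can be sketched in Python. This is an illustration under simplifying assumptions: the function name is hypothetical, only file data sizes are counted (tar header overhead is ignored), and a file whose size equals archmax is treated like an oversized file.

```python
from typing import List


def mark_archive_files(file_sizes: List[int], archmax: int) -> List[List[int]]:
    """Group files, in order, into archive files smaller than archmax."""
    archive_files: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for size in file_sizes:
        # Start a new archive file if adding this file would reach archmax.
        if current and current_size + size >= archmax:
            archive_files.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        archive_files.append(current)
    return archive_files


# The 200-byte file exceeds archmax, so it gets an archive file of its own.
print(mark_archive_files([30, 30, 50, 200, 10, 10], archmax=100))
# [[30, 30], [50], [200], [10, 10]]
```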
By default, all archiving priorities are set to zero. You may change the priorities by specifying property multipliers. This allows you to control the order in which files are archived. Here are some examples (see archiver.cmd(4)):
You may cause the files within an archive file to be archived in priority order by using -sort priority.
You may reduce the media loads and unloads with: -priority archive_loaded 1 and -priority stage_loaded 1.
You may cause online files to be archived before offline files with: -priority offline -500.
You may cause the archive copies to be made in order by using: -priority copy1 4000, -priority copy2 3000, -priority copy3 2000, -priority copy4 1000.
The archiver can produce a log file containing information about files archived and unarchived. Here is an example:
A 2000∕06∕02 15:23:41 mo OPT001 samfs1.1 143.1 samfs1 6.6 16384 lost+found d 0 51
A 2000∕06∕02 15:23:41 mo OPT001 samfs1.1 143.22 samfs1 19.3 4096 seg d 0 51
A 2000∕06∕02 15:23:41 mo OPT001 samfs1.1 143.2b samfs1 22.3 922337 rmfile R 0 51
A 2000∕06∕02 15:23:41 mo OPT001 samfs1.1 143.34 samfs1 27.3 11 system l 0 51
A 2000∕06∕02 15:23:41 mo OPT001 samfs1.1 143.35 samfs1 18.5 24 seg∕aa I 0 51
A 2000∕06∕02 15:23:43 ib E00000 all.1 110a.1 samfs1 20.5 14971 myfile f 0 23
A 2000∕06∕02 15:23:44 ib E00000 all.1 110a.20 samfs1 26.3 10485760 seg∕aa∕1 S 0 23
A 2000∕06∕02 15:23:45 ib E00000 all.1 110a.5021 samfs1 25.3 10485760 seg∕aa∕2 S 0 23
A 2000∕06∕02 15:23:45 ib E00000 all.1 110a.a022 samfs1 24.3 184 seg∕aa∕3 S 0 23
A 2003∕10∕23 13:30:24 dk DISK01∕d8∕d16∕f216 arset4.1 810d8.1 qfs2 119571.301 1136048 t1∕fileem f 0 0
A 2003∕10∕23 13:30:25 dk DISK01∕d8∕d16∕f216 arset4.1 810d8.8ad qfs2 119573.295 1849474 t1∕fileud f 0 0
A 2003∕10∕23 13:30:25 dk DISK01∕d8∕d16∕f216 arset4.1 810d8.16cb qfs2 119576.301 644930 t1∕fileen f 0 0
A 2003∕10∕23 13:30:25 dk DISK01∕d8∕d16∕f216 arset4.1 810d8.1bb8 qfs2 119577.301 1322899 t1∕fileeo f 0 0
Each record in the log file contains the following fourteen fields:
Archive Action
The operation being logged. This field has one of the following values:
A (archived)
R (re-archived)
U (unarchived)
Date
The date field value takes the form yyyy∕mm∕dd.
Time
The time field value takes the form hh:mm:ss.
Media
The media field holds one of the two-character media-type codes listed by the mcf(4) man page.
Volume Serial Number (VSN)
The unique volume serial name of a removable media cartridge or the disk volume name and tar file path of a disk archive.
Archive Set and Copy Number
The field value takes the form archive_set_name.copy_number.
Location
The location of the archived copy of the file. The location is expressed as the position on the media where the archive file starts and the offset, in 512-byte blocks, between this starting position and the location of the copy within the enclosing archive file. The position and the offset are separated by a dot (.).
File System
The name of the file system to which the copy belongs.
Inode and Generation
The combination of the inode number and generation number uniquely identifies the inode that stores the attributes and block location(s) of the data file. Inode numbers are re-used, so they do not uniquely identify an inode on their own.
Length
The length of the data written to the volume. If the copy fits on a single volume, this is the length of the file. If the file has been segmented and written to multiple volumes, this is the length of the segment.
Name
The name of the data file.
Type
The type field holds one of the following values:
d (directory)
f (regular file)
l (symbolic link)
R (removable media file)
I (segment index)
S (data segment)
Segment Number
The sequence number of this section of the file, if the file has been segmented and written to multiple media volumes, or 0 (zero), if the entire file resides on this volume.
Equipment Number
In the mcf (Master Configuration File), the number that identifies the device on which the archive copy was made.
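The fourteen fields can be separated with a short Python sketch. The class and function names are hypothetical, and the sketch assumes that fields are separated by whitespace and that only the Name field may itself contain spaces. The position and offset are kept as strings because the sample records suggest, but the text does not state, that they are hexadecimal.

```python
from typing import NamedTuple


class ArchiveLogRecord(NamedTuple):
    action: str
    date: str
    time: str
    media: str
    vsn: str
    archive_set: str
    copy: int
    position: str
    offset: str
    file_system: str
    inode: int
    generation: int
    length: int
    name: str
    file_type: str
    segment: int
    equipment: int


def parse_record(line: str) -> ArchiveLogRecord:
    """Parse one archiver log record into its fields."""
    head = line.split(None, 10)        # ten leading fields plus the remainder
    if len(head) != 11:
        raise ValueError("too few fields: %r" % line)
    tail = head[10].rsplit(None, 3)    # name, then the three trailing fields
    if len(tail) != 4:
        raise ValueError("too few fields: %r" % line)
    (action, date, time, media, vsn,
     set_copy, location, file_system, inode_gen, length) = head[:10]
    name, file_type, segment, equipment = tail
    archive_set, copy = set_copy.rsplit(".", 1)
    position, offset = location.split(".", 1)
    inode, generation = inode_gen.split(".", 1)
    return ArchiveLogRecord(action, date, time, media, vsn, archive_set,
                            int(copy), position, offset, file_system,
                            int(inode), int(generation), int(length),
                            name, file_type, int(segment), int(equipment))


rec = parse_record("A 2000/06/02 15:23:43 ib E00000 all.1 110a.1 "
                   "samfs1 20.5 14971 myfile f 0 23")
print(rec.archive_set, rec.copy, rec.name, rec.length)  # all 1 myfile 14971
```

Splitting ten fields from the left and three from the right leaves whatever remains as the file name, so a name that contains spaces is still recovered whole.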
archiver(1m), archiver.cmd(4), sam-arcopy(1m), sam-arfind(1m)