sam-migrationd - Migrate Oracle HSM tape volumes
∕opt∕SUNWsamfs∕sbin∕sam-migrationd
SUNWsamfs
The Automated Media Migration daemon, sam-migrationd,
moves archive files from one tape volume to another, so
that worn media and older media formats can be replaced
without staging and rearchiving.
The sam-migrationd daemon is started by the Oracle
HSM initialization daemon, sam-fsd. You cannot launch
sam-migrationd from the command line. To control the
migration process, use the following commands:
∕opt∕SUNWsamfs∕sbin∕samcmd migconfig
∕opt∕SUNWsamfs∕sbin∕samcmd migstart
∕opt∕SUNWsamfs∕sbin∕samcmd migidle
∕opt∕SUNWsamfs∕sbin∕samcmd migstop
Automated Media Migration can move archive files using either the StorageTek
Direct Copy method or the Memory Assisted Copy method. In Memory-Assisted Copy
mode, the metadata server mounts source and destination tapes, copies tape blocks
from the source to in-memory buffers and then writes the data blocks from memory
to the destination media, one tape archive (tar) file at a time. In
StorageTek Direct Copy mode, the server mounts the source tape on a supported
drive and mounts the destination tape on a StorageTek T10000D drive, connected
via a Fibre Channel Storage Area Network (SAN). The StorageTek Direct Copy
feature then copies tape blocks directly from the source to the destination,
without using additional host resources (supported source tape drives for the
StorageTek Direct Copy are listed later in this man page). The source and
destination tapes must use the same block size.
The sam-migrationd daemon monitors the Oracle HSM preview table and the
activities of the archiver and stager. If a higher-priority archiver or stager
process requests a volume or drive that is being used for migration, the
sam-migrationd daemon suspends the copy process. The sam-migrationd
daemon runs at the lowest priority in Oracle HSM.
The sam-migrationd daemon splits its work into five phases:
File System Scan
Tape Copy
Tar Header Check
Inode Update
Logging
The sam-migrationd daemon picks its list of migration source volumes
from the migrationd.cmd (4) configuration file and marks these tape
volumes as read-only in the Oracle HSM catalog. The samu v display
flags these volumes R (read-only) and S (migration source volume).
All Oracle HSM file systems must be mounted prior to migration, so that
sam-migrationd can update the corresponding file system inode when it
moves an archival copy to a new media location. If any of the file systems are
not mounted, sam-migrationd cancels the migration.
Then, for each migration source volume, the daemon scans inodes in all Oracle
HSM file systems for file copies that reside on the volume. The daemon makes a
record of these inodes in an inode database created for the volume in the default
directory ∕var∕opt∕SUNWsamfs∕sammig∕db∕xx.yyyyyy,
where xx is one of the two-character media-type identifiers listed in the
mcf (4) man page and yyyyyy is the Volume Serial Number (VSN) of the
volume. For instance, for a StorageTek T10000D (type ti) tape named
TEE150, the daemon would create the inode database in the directory
∕var∕opt∕SUNWsamfs∕sammig∕db∕ti.TEE150. The inode database record is sorted
by the archive copy position and offset pair and used for the subsequent copy,
tar header check, inode update, and logging phases of the migration operation.
The dbdir directive of the migrationd.cmd (4) configuration file
configures the inode database.
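The naming rule for the per-volume inode database directory can be illustrated with a short sketch. The helper name inode_db_dir is hypothetical and not part of Oracle HSM; it only joins the default directory given above with the xx.yyyyyy pattern.

```python
import posixpath

# Default database root as given in the text above.
DEFAULT_DB_ROOT = "/var/opt/SUNWsamfs/sammig/db"


def inode_db_dir(media_type: str, vsn: str, db_root: str = DEFAULT_DB_ROOT) -> str:
    """Return db_root/xx.yyyyyy for a two-character media type and a VSN.

    Hypothetical helper for illustration only.
    """
    if len(media_type) != 2:
        raise ValueError("media type must be a two-character identifier")
    if not vsn:
        raise ValueError("VSN must not be empty")
    return posixpath.join(db_root, f"{media_type}.{vsn}")
```

Passing a different db_root stands in for a location configured with the dbdir directive.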
Once the file systems have been scanned and the database created, the migration daemon moves on to phase 2, the copy process.
If the migration daemon detects any of the following conditions while scanning file systems, it takes appropriate action or cancels migration. The source tape remains in read-only mode in the Oracle HSM catalog after the file system scan.
If no files are found on the migration source tape, the daemon finishes migration for that volume.
If a multiple-volume archive copy is found, the daemon logs a warning to the
file ∕var∕opt∕SUNWsamfs∕sammig∕logfile asking the administrator to
manually rearchive files. The daemon then continues to scan inodes and migrate
files.
If a removable-media file is found, the migration daemon logs an error message
to the file ∕var∕opt∕SUNWsamfs∕sammig∕logfile, and the daemon cancels
the operation. See the request (1) man page for more information on
removable-media files.
If a fatal file system, inode, or database error is detected, the migration
daemon logs an error message to the file ∕var∕opt∕SUNWsamfs∕sammig∕logfile
and cancels the operation.
If the sam-migrationd daemon is stopped by the samcmd migstop command,
it cancels the file system scan and removes the inode database
created so far. The sam-migrationd daemon restarts the file
system scan from scratch when restarted by the samcmd migstart command.
The S flag in the catalog entry is not changed.
If the file system scan cannot be idled by the command samcmd
migidle by the scheduled time configured in the file
∕etc∕opt∕SUNWsamfs∕migrationd.cmd, the migration daemon logs an
error message to the file ∕var∕opt∕SUNWsamfs∕sammig∕logfile.
The sam-migrationd daemon first marks the catalog entry for each
destination volume specified in the ∕etc∕opt∕SUNWsamfs∕migrationd.cmd
file with the D (destination) flag. Then the daemon spawns an instance
of the copy process sam-migcopy for each migration source tape.
The sam-migcopy process operates in either of two modes, Memory-Assisted
Copy or StorageTek Direct Copy. The administrator specifies the desired mode
using the xcopy directive in the ∕etc∕opt∕SUNWsamfs∕migrationd.cmd
file. The directive can take the following forms:
xcopy = on (use StorageTek Direct Copy mode if possible, and Memory
Assisted Copy mode otherwise)
xcopy = only (use StorageTek Direct Copy mode if possible, and return an
error message otherwise)
xcopy = off (use Memory Assisted Copy mode)
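The effect of the three xcopy values can be summarized as a small decision function. This is a sketch of the behavior described above, not the daemon's code; the function name, the direct_copy_supported parameter, and the returned strings are illustrative assumptions.

```python
def choose_copy_mode(xcopy: str, direct_copy_supported: bool) -> str:
    """Map an xcopy setting to a copy mode (illustrative sketch).

    direct_copy_supported is a hypothetical flag meaning that both the
    source and destination drives can take part in StorageTek Direct Copy.
    """
    if xcopy == "off":
        return "memory-assisted"
    if xcopy == "on":
        # Fall back to Memory Assisted Copy when Direct Copy is unavailable.
        return "direct" if direct_copy_supported else "memory-assisted"
    if xcopy == "only":
        if direct_copy_supported:
            return "direct"
        raise RuntimeError("xcopy = only, but StorageTek Direct Copy is not possible")
    raise ValueError(f"unknown xcopy value: {xcopy!r}")
```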
Memory Assisted Copy Mode
In Memory Assisted Copy mode, the copy process allocates buffers on the server
and copies tape archive (tar) files from the source volume to the
destination volume via the buffers. The sam-migcopy process copies all
tar files that contain at least one active archive copy, one tar
file at a time.
The administrator can specify a desired buffer size in the migrationd.cmd
file by using the directive bufsize = media buffer_size,
where media is one of the two-character media-type identifiers listed in
the mcf (4) man page and buffer_size is an integer in the range
2-8192 (for example, bufsize = ti 64). The Oracle
HSM software multiplies the specified buffer_size by the default block size
for the media type media. See defaults.conf (4) and migrationd.cmd (4)
for additional information.
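The buffer-size arithmetic can be sketched as follows. The default block size depends on the media type and on defaults.conf, so it is passed in as a parameter here; the value used in the test is a placeholder, not a documented default.

```python
def effective_buffer_size(buffer_size: int, default_block_size: int) -> int:
    """Return buffer_size multiplied by the media type's default block size.

    buffer_size is the integer from the bufsize directive (2-8192).
    default_block_size is an assumed input; look up the real value for
    your media type before relying on the result.
    """
    if not 2 <= buffer_size <= 8192:
        raise ValueError("buffer_size must be in the range 2-8192")
    if default_block_size <= 0:
        raise ValueError("default_block_size must be positive")
    return buffer_size * default_block_size
```

The result is in the same unit as default_block_size.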
The sam-migcopy process starts by reading the inode records stored in the
database that sam-migrationd created for the current volume during phase
1, scanning of the file system. Then it examines the tar files on the
tape volume.
At the start of each tar file, sam-migcopy constructs a list of the
inodes that have archive copies starting at that position. It compares the list
of inodes on the tape volume with the database of inodes scanned from the file
system. Inodes that appear in the list but not in the database point to inactive
copies and are dropped from the list. If the list is empty when the comparison
finishes, sam-migcopy does not copy the tar file to the new media.
Otherwise, it copies all active and inactive copies found between the start of
the tar file and the end of the last active copy in the file. It updates
the inode database with the status of the copy operation (done, if
successful, or a copy error otherwise).
The sam-migcopy process repeats the above steps until all tar files
have been examined, compared with the inode database, and, if necessary, copied
to new media.
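The comparison between the inodes found at a tar file position and the inode database can be sketched as a filter. Inode identifiers are treated as opaque strings such as "1408.20", the form used in the log examples; the function names are hypothetical.

```python
from typing import Iterable, List, Set


def active_inodes(tape_inodes: Iterable[str], db_inodes: Set[str]) -> List[str]:
    """Keep only the inodes that also appear in the inode database.

    Inodes present on tape but absent from the database point to
    inactive copies and are dropped.
    """
    return [ino for ino in tape_inodes if ino in db_inodes]


def should_copy_tar(tape_inodes: Iterable[str], db_inodes: Set[str]) -> bool:
    """A tar file is copied only if at least one active copy remains."""
    return bool(active_inodes(tape_inodes, db_inodes))
```

For example, with a database containing only "1408.20", a tar file holding "1408.20" and "1363.16" is copied, while a tar file holding only "1363.16" is skipped.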
StorageTek Direct Copy Mode
In StorageTek Direct Copy mode, the sam-migcopy process takes advantage of
the xcopy feature of StorageTek T10000D Fibre Channel drives. When the
destination drive is a T10000D, sam-migcopy can make direct,
drive-to-drive, tape-to-tape copies, without consuming significant server
resources.
The sam-migcopy process can create tape-to-tape copies either by repacking
files or by copying to the EOD (end-of-data) mark on the source
tape. The administrator specifies the desired behavior using the xcopy_eod
directive of the ∕etc∕opt∕SUNWsamfs∕migrationd.cmd (4) file. The directive
can take the following forms:
xcopy_eod = off (repack)
xcopy_eod = on (copy to EOD)
Repacking
When xcopy_eod is set to off, the sam-migcopy process copies
all tar files that contain at least one active archive copy, one tar
file at a time. Since tar files that do not contain active copies are not
copied, the new media volumes are partially repacked.
The sam-migcopy process starts by reading the inode records stored in the
database that sam-migrationd created for the current volume in phase 1,
scanning of the file system. Then it examines the tar files on the tape
volume.
At the start of each tar file, sam-migcopy constructs a list of the
inodes that have archive copies starting at that position. It compares the list
of inodes on the tape volume with the database of inodes scanned from the file
system. Inodes that appear in the list but not in the database point to inactive
copies and are dropped from the list. If one or more active copies remain on the
list when the comparison finishes, sam-migcopy copies the entire tar
file to the new media. It defines the required operation in an extended copy SCSI
command, sends the command to the destination tape drive, waits for the extended
copy command to complete, and updates the inode database with the copy status
(done, if successful, or a copy error otherwise). If the inode list is empty
when the comparison finishes, sam-migcopy does not copy the tar file.
The sam-migcopy process repeats the above steps until all tar files
have been examined, compared with the inode database, and, if necessary, copied
to new media.
Copying to EOD
When xcopy_eod is set to on, the sam-migcopy process starts
copying the tape at the first tar file that contains one or more active
file copies and continues until it encounters the EOD (end-of-data)
on the tape.
The sam-migcopy process identifies its starting point by reading the
inode records stored in the database that sam-migrationd created for the
current volume during phase 1, scanning of the file system. Then it copies all
tar files between its start point and the EOD mark to the new media.
It defines the required operation in an extended copy SCSI command, sends the
command to the destination tape drive, waits for the copy to complete, and
updates the inode database. If no errors are returned, sam-migcopy enters
done for all copies made. If errors are returned, sam-migcopy checks
the number of blocks that were written successfully and marks the successfully
copied files done.
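The choice of starting point in copy-to-EOD mode can be sketched as below. The representation of a tape as an ordered list of (position, has_active_copy) pairs is an assumption made for illustration; the real process works from the inode database and the tape itself.

```python
from typing import List, Tuple


def eod_copy_range(tar_files: List[Tuple[int, bool]]) -> List[int]:
    """Return the tar file positions copied in copy-to-EOD mode.

    tar_files lists (position, has_active_copy) pairs in tape order.
    Copying starts at the first tar file with an active copy and runs
    to the end of the list, which stands in for the EOD mark.
    """
    for index, (_position, has_active_copy) in enumerate(tar_files):
        if has_active_copy:
            return [position for position, _ in tar_files[index:]]
    return []  # no active copies: nothing to copy
```

Note that tar files after the starting point are included even when they hold no active copies, which is the difference from repacking.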
Error Handling
If sam-migcopy detects potential problems while copying tape blocks, it
takes appropriate action or cancels the operation, while flagging the source
tape read-only (R) in the Oracle HSM catalog:
If StorageTek Direct Copy is enabled but the source or destination drives do
not support the StorageTek Direct Copy feature, sam-migcopy automatically
adapts. If the migrationd.cmd (4) file contains the directive xcopy
= on, it uses the Memory Assisted Copy feature instead. If the file
contains the directive xcopy = only, it cancels the migration
operation.
If the source tape returns a read or positioning error, sam-migcopy skips
the position where the error occurred. It logs all inodes that have archive copies
at the affected position in the file ∕var∕opt∕SUNWsamfs∕sammig∕logfile, so
that the affected files can be manually rearchived with the rearch (1m)
command.
In the example below, the log entries show that a read error has occurred on a
StorageTek T10000 (ti) source volume with the volume serial number
TEE152 at position 0xff38. Inodes 1408.20 and 1363.16
show that the first copies of files ∕mig1∕tee152∕ssum1∕s0 and
∕mig1∕tee152∕ssum1∕s1 reside at the damaged location. So the copies need
to be manually recreated by rearchiving the files to the new media:
2015-07-30 19:51:46 Error: 'ti.TEE152' Copy: Source read error
position 0xff38 error 5
2015-07-30 19:51:56 Error: 'ti.TEE152' Copy: ino:1408.20
∕mig1∕tee152∕ssum1∕s0 copy:1 on position:0xff38 has a read error,
copy skipped, need to rearchive.
2015-07-30 19:51:56 Error: 'ti.TEE152' Copy: ino:1363.16
∕mig1∕tee152∕ssum1∕s1 copy:1 on position:0xff38 has a read error,
copy skipped, need to rearchive.
If the source tape returns read or positioning errors twice in a row during a
Memory Assisted Copy operation, sam-migrationd logs the event and cancels
migration.
In the example below, a StorageTek T10000 (ti) source volume with the
volume serial number TEE152 has returned two errors, so migration has
stopped:
2015-07-31 00:13:53 Error: 'ti.TEE152' Copy: Read error limit
count 2 reached.
If the destination tape returns a write error or reports that the media is
full, sam-migcopy stops the copy operation and checks the tar
header against the tar files that have been copied to the destination
volume. If the check fails, sam-migcopy cancels the copy operation and
sam-migrationd cancels migration. If the check passes, sam-migcopy
exits, sam-migrationd assigns the next available destination volume to the
copy operation, and copying resumes.
In the example below, the log shows that migration of StorageTek T10000
(ti) volume TEE152 stopped because destination volume TEE157
was full. The header check passed, so copying continued to a new destination
volume, TEE158:
2015-07-30 19:49:14 Info: 'ti.TEE152' Copy: Server copy started
from position 0x5247.
2015-07-30 19:51:12 Error: 'ti.TEE152' Copy: Destination
'ti.TEE157' media full: No space left on device
2015-07-30 19:51:13 Info: 'ti.TEE152' Copy: Tar header check
started from position 0x5247.
2015-07-30 19:51:45 Info: 'ti.TEE152' Copy: Tar header check
succeeded, 21 inodes checked, 0 tar header error found.
2015-07-30 19:51:45 Info: 'ti.TEE152' Copy: Exited for retry,
pid: 18522, exit status: 3, signal: 0
2015-07-30 19:51:45 Info: 'ti.TEE152' Copy: Restarted,
pid: 18524 destination 'ti.TEE158' source position: 0xfec4
2015-07-30 19:51:45 Info: 'ti.TEE152' Copy: Mode - Memory-Assisted Copy
2015-07-30 19:51:45 Info: 'ti.TEE152' Copy: Server copy restarted
from position 0xfed8.
If a fatal file system, inode or database error is detected, sam-migcopy
logs the error message to the file ∕var∕opt∕SUNWsamfs∕sammig∕logfile, and
sam-migrationd cancels migration.
In the example below, the log shows that migration of StorageTek T10000
(ti) volume TEE152 stopped when a database error occurred:
2015-07-31 17:53:41 Error: 'ti.TEE152' Copy: Update db with new
vsn failed, source position 0x527d, cancel migration.
If the copy operation is stopped by the samcmd migstop command or
idled by either the samcmd migidle command or the time schedule
specified in the file ∕etc∕opt∕SUNWsamfs∕migrationd.cmd, sam-migcopy
suspends the copy operation and exits. The sam-migrationd daemon restarts
the sam-migcopy process when the samcmd migstart command is next
issued or at the next scheduled migration time. The sam-migcopy process
then resumes the copy operation at the point where it was suspended.
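The log entries in the examples above share a common layout: a timestamp, a severity, the quoted media type and VSN, a phase name, and a message. A minimal parser for that layout might look like the sketch below. The layout is inferred from the examples only, and the names LogEntry and parse_entry are hypothetical.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: datetime
    level: str       # "Info" or "Error" in the examples
    media_type: str  # for example "ti"
    vsn: str         # for example "TEE152"
    phase: str       # for example "Copy", "Update inode", "Log"
    message: str


# Pattern inferred from the example entries; not an official format definition.
_ENTRY = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\w+): "
    r"'(\w{2})\.([^']+)' ([^:]+): (.*)$"
)


def parse_entry(text: str) -> Optional[LogEntry]:
    """Parse one log entry; wrapped continuation lines are joined first."""
    match = _ENTRY.match(" ".join(text.split()))
    if match is None:
        return None
    stamp, level, media_type, vsn, phase, message = match.groups()
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return LogEntry(when, level, media_type, vsn, phase, message)
```

Filtering parsed entries for level "Error" and collecting their VSNs is one way to find volumes whose files need manual rearchiving.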
Whenever it successfully copies tape archive (tar) files from source to
destination media, the sam-migcopy process verifies the integrity of the
data by checking the tar headers on the destination volume. It checks the
headers for the first, middle, and last written positions on the destination tape.
If a read error was detected during phase 2, the last good position listed in the
inode database prior to the error and the next good position listed after the
error are also checked.
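The selection of positions for the tar header check can be sketched as follows. Taking the middle element of the sorted list as the "middle" position is an assumption; the text does not say how the middle is chosen. The extra_positions parameter stands in for the positions on either side of a read error.

```python
from typing import Iterable, List


def header_check_positions(
    written_positions: Iterable[int],
    extra_positions: Iterable[int] = (),
) -> List[int]:
    """Pick the first, middle, and last written positions, plus any extras.

    extra_positions is a hypothetical hook for the last good position
    before a read error and the next good position after it.
    """
    ordered = sorted(set(written_positions))
    if not ordered:
        return []
    picks = {ordered[0], ordered[len(ordered) // 2], ordered[-1]}
    picks.update(extra_positions)
    return sorted(picks)
```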
Error Handling
If sam-migcopy detects potential problems while checking tar
headers, it takes appropriate action or cancels the operation, while flagging
the source tape read-only (R) in the Oracle HSM catalog:
If it detects read or positioning errors on the destination tape or fatal file
system, inode, or database errors, sam-migcopy logs an error message to
the file ∕var∕opt∕SUNWsamfs∕sammig∕logfile and cancels the tar
header check. The sam-migrationd daemon then cancels migration.
In the example, the log excerpts show migration of StorageTek T10000 (ti)
volume TEE152 stopping after a read error, a tar header error, and
an inode error:
- read error
2015-07-31 18:58:00 Error: 'ti.TEE152' Copy: Tar header check,
destination 'ti.TEE157' read failed, position 0x8543b: EIO
- tar header error
2015-07-31 20:00:59 Error: 'ti.TEE152' Copy: Invalid tar header,
destination 'ti.TEE157', inode:1337.14 position 0x86ea1.
2015-07-31 20:01:10 Error: 'ti.TEE152' Copy: Tar header check
failed, 104 inodes checked, 1 tar header error found.
- inode error
2015-07-31 20:14:38 Error: 'ti.TEE152' Copy: Tar header check,
error detected to get inodes on source position 0x5681,
cancel migration.
If the tar header check is stopped by the samcmd migstop
command or idled by either the samcmd migidle command or the time
schedule specified in the file ∕etc∕opt∕SUNWsamfs∕migrationd.cmd, the
sam-migcopy process suspends the tar header check and exits. The
sam-migrationd daemon restarts the sam-migcopy process when the
samcmd migstart command is next issued or at the next scheduled
migration time. The sam-migcopy process then resumes the tar header
check at the point where it was suspended.
Once it has moved the active archive copies from the source volume to the
destination, the sam-migrationd daemon updates the file system. It updates
the inode for each affected file with the new media type, volume serial number
(VSN), position, modification time, and copy creation time of its archive copy
(the modification time is the time when the tar file that contains the
archive copy was copied to the destination tape). If a newer copy has been
archived in the interim, sam-migrationd skips the inode update and logs a
message. A second migration run is then required to empty the
source tape.
Error Handling
If sam-migrationd detects potential problems while updating inodes, it
takes appropriate action or cancels the migration operation, while flagging the
source tape read-only (R) in the Oracle HSM catalog:
If it detects fatal file system, inode, or database errors, sam-migrationd
logs error messages, cancels the inode update, and cancels the migration. Any
inodes that were successfully updated up to this point have valid archive copies
on the destination volume.
2015-07-31 23:38:34 Error: 'ti.TEE152' Update inode: Error
detected for inodes on next source position from 0x527d,
stop migration.
2015-07-31 23:38:34 Error: 'ti.TEE152' Update inode: failed,
stop migration.
If the inode update operation is stopped by the samcmd migstop
command, sam-migrationd suspends the operation. It resumes the inode
update operation at the point where it was suspended when the samcmd
migstart command is next issued.
When logging is enabled, the migration daemon creates a log file for each source
volume. Each log file lists the new location of each archived file copy that
migrated from that volume (see migrationd.cmd (4) for details).
Migration logging is enabled when the file
∕etc∕opt∕SUNWsamfs∕migrationd.cmd includes a directive of the form
logdir = path, where path specifies the directory path
where the log file should be created. Logging is not enabled by default.
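The name = value directives mentioned on this page (xcopy, xcopy_eod, bufsize, dbdir, logdir) can be read with a few lines of code. This sketch handles only that simple form and assumes that a # character starts a comment; see migrationd.cmd (4) for the authoritative syntax. The logdir path used in the example is hypothetical.

```python
from typing import List, Tuple


def parse_directives(text: str) -> List[Tuple[str, List[str]]]:
    """Return (name, values) pairs for lines of the form name = value ...

    Assumptions: '#' starts a comment, and lines without '=' belong to
    syntax this sketch does not handle, so they are skipped.
    """
    directives = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, separator, value = line.partition("=")
        if not separator:
            continue
        directives.append((name.strip(), value.split()))
    return directives


example = """
xcopy = on
bufsize = ti 64
logdir = /var/tmp/miglogs   # hypothetical path
"""
settings = parse_directives(example)
```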
Error Handling
If sam-migrationd detects potential problems while creating logs, it takes
appropriate action or cancels the migration operation, while flagging the source
tape read-only (R) in the Oracle HSM catalog:
If it detects database errors, sam-migrationd logs error messages to the
file ∕var∕opt∕SUNWsamfs∕sammig∕logfile, cancels the log operation, and
cancels migration.
2015-08-01 15:21:20 Error: 'ti.TEE152' Log: Error detected for
inodes on source position beyond 0xfec4, stop migration.
2015-08-01 15:21:20 Error: 'ti.TEE152' Log: failed, but
migration completed.
If the log operation is stopped by the samcmd migstop command,
sam-migrationd suspends the operation.
It resumes the log operation at the point where it was suspended when the
samcmd migstart command is next issued.
The sam-migrationd daemon can be in any of three states:
Stop
When the sam-migrationd daemon is in the Stop state, it does not
scan file systems, copy volumes, update inodes, or log events. The sam-fsd
daemon starts the sam-migrationd daemon in the Stop state. The
sam-migrationd daemon remains in the Stop state until the
samcmd migidle or samcmd migstart command is next issued
or until the next scheduled migration time.
Idle
When the sam-migrationd daemon is in the Idle state, it scans file
systems, updates inodes as necessary, and logs its activities, but it does not
copy volumes to new media. The sam-migrationd daemon enters the
Idle state at the end of the scheduled migration time and whenever the
samcmd migidle command is issued. The sam-migrationd daemon
remains in the Idle state until the start of the next scheduled migration
time or until a samcmd migstart or samcmd migstop
command is issued.
Run
When the sam-migrationd daemon is in the Run state, it scans file
systems, copies volumes to new media, updates inodes, and logs its activities.
The sam-migrationd daemon enters the Run state at the start of the
scheduled migration time and whenever the samcmd migstart command is
issued. The sam-migrationd daemon remains in the Run state until the
end of the scheduled migration time or until a samcmd migidle or
samcmd migstop command is issued.
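The three states and their transitions can be modeled as a small state machine. This is a reading of the descriptions above, using the migstart, migidle, and migstop commands listed at the top of this page; the event names schedule_start and schedule_end are hypothetical labels for the scheduled migration window, and transitions the text does not describe leave the state unchanged.

```python
# Commands put the daemon into a fixed state regardless of the current one.
COMMAND_TARGETS = {"migstart": "Run", "migidle": "Idle", "migstop": "Stop"}


def next_state(state: str, event: str) -> str:
    """Return the state that follows `state` after `event` (illustrative)."""
    if state not in ("Stop", "Idle", "Run"):
        raise ValueError(f"unknown state: {state!r}")
    if event in COMMAND_TARGETS:
        return COMMAND_TARGETS[event]
    if event == "schedule_start" and state in ("Stop", "Idle"):
        return "Run"
    if event == "schedule_end" and state == "Run":
        return "Idle"
    return state  # transition not described in the text: stay put
```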
The samu command line interface displays the current migration state on
the x (migration status) and y (migration volume serial
number list) screens. For example:
Status: Stop: Waiting for :migstart
To migrate tape volumes, carry out the following steps:
Make sure that migration jobs are not already running or incomplete.
Start the samu (1m) interface, and check the Migration
activity (x), Migration vsn list (y), and
Robot VSN catalog (v) displays. On the Migration
vsn list display, look for the following in the flags field:
D (destination)
The volume is the destination for an active migration operation.
S (source)
The volume is the source of an active or incomplete migration operation.
M (partially migrated source)
The volume is the source of a partially completed migration operation.
m (migrated source)
The volume is the source of a completed migration operation.
e (error source)
The volume is the source of a failed migration.
The following example shows the catalog display entries for an active migration
operation. Archive copies are migrating from LTO (li) source volume
000040 to LTO destination volume 000044:
Robot VSN catalog by slot : eq 400                      samu
slot access time      count use flags        ty vsn
   0 2015∕10∕14 03:58     4  0% -il---b--D-- li 000044
   3 2015∕10∕14 03:58    25  1% -il---b-RS-- li 000040
When the operation is complete, the entry for LTO volume 000040 looks like
this:
3 2015∕10∕14 03:58 25 1% -il---b-Rm-- li 000040
Configure the migration. Open the ∕etc∕opt∕SUNWsamfs∕migrationd.cmd file
in a text editor, and enter the needed directives and values. Then save the file
and close the editor.
Stop the sam-migrationd daemon.
# samcmd migstop
Activate the migrationd.cmd file.
# samcmd migconfig
In the Migration vsn list (y) display of the samu interface,
check the status of the source volumes.
In the example, the display shows that LTO (li) source (S) volume
000040 has the status sched_wait (scheduled, waiting), and LTO
destination (D) volume 000044 has the status avail (available for use):
Status: Stop: Waiting for :migstart
Vsns:2 src:1 dest:1 maxcopy:0
ord m ty vsn    start time end time status     Inodes done∕tot bytes
  0 S li 000040 none       none     sched_wait 0∕0 0
  0 D li 000044 none       none     avail      0
Start migration.
# samcmd migstart
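When checking many volumes, the Robot VSN catalog rows shown in the procedure above can be picked apart programmatically. The sketch below assumes the eight-column row layout and the twelve-character flags field seen in the examples, with R at offset 8 and the migration flag at offset 9; these offsets are inferred from the sample rows, not taken from a format specification, and the names are hypothetical.

```python
from typing import NamedTuple, Optional

# Flag meanings as listed for the flags field in the procedure above.
MIGRATION_FLAGS = {
    "D": "destination",
    "S": "source",
    "M": "partially migrated source",
    "m": "migrated source",
    "e": "error source",
}


class CatalogEntry(NamedTuple):
    slot: int
    media_type: str
    vsn: str
    read_only: bool
    migration: Optional[str]


def parse_catalog_row(row: str) -> CatalogEntry:
    """Parse one catalog row such as the LTO examples (layout assumed)."""
    fields = row.split()
    if len(fields) != 8:
        raise ValueError("expected 8 whitespace-separated fields")
    slot, _date, _time, _count, _use, flags, media_type, vsn = fields
    if len(flags) != 12:
        raise ValueError("expected a 12-character flags field")
    return CatalogEntry(
        slot=int(slot),
        media_type=media_type,
        vsn=vsn,
        read_only=flags[8] == "R",                 # offset inferred from examples
        migration=MIGRATION_FLAGS.get(flags[9]),   # offset inferred from examples
    )
```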
If you need to stop Oracle HSM (using the samd stop or hastop
commands), if you need to shut down the system, or if you need to stop migration
for any other reason, take the following steps:
If you can, stop the sam-migcopy process gracefully.
The samcmd migidle command gracefully terminates the sam-migcopy
process. The sam-migcopy process stops the copy operation when it finishes
copying the current tape archive (tar) file. If the current file is large,
it may take some time before copying stops.
# samcmd migidle
If you need to terminate the sam-migcopy process immediately, stop the
sam-migrationd daemon. Use the command samcmd migstop.
When the sam-migcopy process is informed that the migration daemon is
stopping, it waits for two seconds before it stops itself, so that it can try to
finish copying the current tar file. If it cannot copy the file in time,
it cancels the copy operation and stops.
# samcmd migstop
Idled or canceled migration jobs resume when Oracle HSM restarts. Each copy
operation restarts at the position following the last tar file copied.
Start the samu (1m) interface, and check the Migration
activity (x), Migration vsn list (y), and
Robot VSN catalog (v) displays. Monitor the file
∕var∕opt∕SUNWsamfs∕sammig∕logfile.
When migration starts, the migration software flags the library catalog entries
of the source volumes R (read-only).
To use xcopy, you must install both destination and source drives
on the same Fibre Channel fabric switch and zone.
Manually loaded tape drives are not supported by sam-migrationd.
The StorageTek Direct Copy mode supports the following source tape drives for use with StorageTek T10000D destination tape drives:
HP LTO2, LTO3, LTO4, LTO5, LTO6
IBM LTO2, LTO3, LTO4, LTO5, LTO6
IBM TS1120
IBM TS1130
IBM TS1140
IBM TS1150
StorageTek 9840C
StorageTek 9840D
StorageTek 9940A
StorageTek 9940B
StorageTek T10000A
StorageTek T10000B
StorageTek T10000C
StorageTek T10000D