sam-migrationd - Migrate Oracle HSM tape volumes
∕opt∕SUNWsamfs∕sbin∕sam-migrationd
SUNWsamfs
The Automated Media Migration daemon, sam-migrationd, moves archive files from one tape volume to another, so that worn media and older media formats can be replaced without staging and rearchiving.
The sam-migrationd daemon is started by the Oracle HSM initialization daemon, sam-fsd. You cannot launch sam-migrationd from the command line. To control the migration process, you use the following commands:
/opt/SUNWsamfs/sbin/samcmd migconfig
/opt/SUNWsamfs/sbin/samcmd migstart
/opt/SUNWsamfs/sbin/samcmd migidle
/opt/SUNWsamfs/sbin/samcmd migstop
Automated Media Migration can move archive files using either the StorageTek Direct Copy method or the Memory Assisted Copy method. In Memory Assisted Copy mode, the metadata server mounts source and destination tapes, copies tape blocks from the source to in-memory buffers, and then writes the data blocks from memory to the destination media, one tape archive (tar) file at a time. In StorageTek Direct Copy mode, the server mounts the source tape on a supported drive and mounts the destination tape on a StorageTek T10000D drive, connected via a Fibre Channel Storage Area Network (SAN). The StorageTek Direct Copy feature then copies tape blocks directly from the source to the destination, without using additional host resources. Supported source tape drives for StorageTek Direct Copy are listed later in this man page. The source and destination tapes must use the same block size.
The sam-migrationd daemon monitors the Oracle HSM preview table and the activities of the archiver and stager. If a higher-priority archiver or stager process requests a volume or drive that is being used for migration, the sam-migrationd daemon suspends the copy process. The sam-migrationd daemon runs at the lowest priority in Oracle HSM.
The sam-migrationd daemon splits its work into five phases:
File System Scan
Tape Copy
Tar Header Check
Inode Update
Logging
The sam-migrationd daemon picks its list of migration source volumes from the migrationd.cmd(4) configuration file and marks these tape volumes as read-only in the Oracle HSM catalog. The samu v display flags these volumes R (read-only) and S (migration source volume).
All Oracle HSM file systems must be mounted prior to migration, so that sam-migrationd can update the corresponding file system inode when it moves an archival copy to a new media location. If any of the file systems are not mounted, sam-migrationd cancels the migration.
Then, for each migration source volume, the daemon scans inodes in all Oracle HSM file systems for file copies that reside on the volume. The daemon makes a record of these inodes in an inode database created for the volume in the default directory /var/opt/SUNWsamfs/sammig/db/xx.yyyyyy, where xx is one of the two-character media-type identifiers listed in the mcf(4) man page and yyyyyy is the Volume Serial Number (VSN) of the volume. For instance, for a StorageTek T10000D (type ti) tape named TEE150, the daemon would create the inode database in the directory /var/opt/SUNWsamfs/sammig/db/ti.TEE150. The inode database records are sorted by the archive copy position and offset pair and used for the subsequent copy, tar header check, inode update, and logging phases of the migration operation. The dbdir directive of the migrationd.cmd(4) configuration file configures the inode database.
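The naming rule and sort order described above can be sketched in Python. This is only an illustration, assuming the default database root named in this page; the function names and the record layout (dictionaries with position and offset keys) are hypothetical and do not reflect how the daemon stores its data.

```python
from pathlib import PurePosixPath

# Default database root named in this man page; the dbdir directive may change it.
DEFAULT_DB_ROOT = PurePosixPath("/var/opt/SUNWsamfs/sammig/db")


def inode_db_dir(media_type, vsn, db_root=DEFAULT_DB_ROOT):
    """Return the xx.yyyyyy database directory for one migration source volume."""
    if len(media_type) != 2:
        raise ValueError("media type must be a two-character identifier, e.g. 'ti'")
    if not vsn:
        raise ValueError("VSN must not be empty")
    return db_root / f"{media_type}.{vsn}"


def sort_records(records):
    """Order hypothetical inode records by their (position, offset) pair."""
    return sorted(records, key=lambda r: (r["position"], r["offset"]))
```

For a T10000D volume named TEE150, inode_db_dir("ti", "TEE150") yields the ti.TEE150 directory under the default root.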
Once the file systems have been scanned and the database created, the migration daemon moves on to phase 2, the copy process.
If the migration daemon detects any of the following conditions while scanning file systems, it takes appropriate action or cancels migration. The source tape remains in read-only mode in the Oracle HSM catalog after the file system scan.
If no files are found on the migration source tape, the daemon finishes migration for that volume.
If a multiple-volume archive copy is found, the daemon logs a warning to the file /var/opt/SUNWsamfs/sammig/logfile asking the administrator to manually rearchive files. The daemon then continues to scan inodes and migrate files.
If a removable-media file is found, the migration daemon logs an error message to the file /var/opt/SUNWsamfs/sammig/logfile, and the daemon cancels the operation. See the request(1) man page for more information on removable-media files.
If a fatal file system, inode, or database error is detected, the migration daemon logs an error message to the file /var/opt/SUNWsamfs/sammig/logfile and cancels the operation.
If the sam-migrationd daemon was stopped by the command samcmd migstop, sam-migrationd cancels the file system scan and removes the inode database created so far. The sam-migrationd daemon restarts the file system scan from scratch when restarted by the samcmd migstart command. The catalog status S is not changed.
If the file system scan cannot be idled by the command samcmd migidle by the scheduled time configured in the file /etc/opt/SUNWsamfs/migrationd.cmd, the migration daemon logs an error message to the file /var/opt/SUNWsamfs/sammig/logfile.
The sam-migrationd daemon first marks the catalog entry for each destination volume specified in the /etc/opt/SUNWsamfs/migrationd.cmd file with the D (destination) flag. Then the daemon spawns an instance of the copy process sam-migcopy for each migration source tape.
The sam-migcopy process operates in either of two modes, Memory Assisted Copy or StorageTek Direct Copy. The administrator specifies the desired mode using the xcopy directive in the /etc/opt/SUNWsamfs/migrationd.cmd file. The directive can take the following forms:
xcopy = on (use StorageTek Direct Copy mode if possible, and Memory Assisted Copy mode otherwise)
xcopy = only (use StorageTek Direct Copy mode if possible, and return an error message otherwise)
xcopy = off (use Memory Assisted Copy mode)
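The effect of the three xcopy forms can be summarized as a small decision function. The Python sketch below is illustrative only: the function name, the returned strings, and the direct_copy_supported argument (standing in for whatever capability check the software performs on the drives) are assumptions, not part of Oracle HSM.

```python
def choose_copy_mode(xcopy, direct_copy_supported):
    """Map an xcopy setting to the copy mode described in this page.

    direct_copy_supported is a hypothetical flag: True when both drives
    can take part in a StorageTek Direct Copy operation.
    """
    setting = xcopy.strip().lower()
    if setting == "off":
        return "memory"
    if setting == "on":
        # Fall back to Memory Assisted Copy when Direct Copy is unavailable.
        return "direct" if direct_copy_supported else "memory"
    if setting == "only":
        if direct_copy_supported:
            return "direct"
        raise RuntimeError("xcopy = only, but StorageTek Direct Copy is not available")
    raise ValueError(f"unknown xcopy setting: {xcopy!r}")
```

With xcopy = on and unsupported drives, the function returns "memory"; with xcopy = only it raises an error instead, mirroring the cancelled migration.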
Memory Assisted Copy Mode
In Memory Assisted Copy mode, the copy process allocates buffers on the server and copies tape archive (tar) files from the source volume to the destination volume via the buffers. The sam-migcopy process copies all tar files that contain at least one active archive copy, one tar file at a time.
The administrator can specify a desired buffer size in the migrationd.cmd file by using the directive bufsize = media buffer_size, where media is one of the two-character media-type identifiers listed on the mcf(4) man page and buffer_size is an integer in the range 2-8192 (for example, bufsize = ti 64). The Oracle HSM software multiplies the specified buffer_size by the default block size for the media type media. See defaults.conf(4) and migrationd.cmd(4) for additional information.
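As a rough illustration of the bufsize arithmetic, the Python sketch below parses a directive of the form shown above and multiplies the value by a block size supplied by the caller. The function names are hypothetical, and the block size is deliberately a parameter: the real defaults are documented in defaults.conf(4), not here.

```python
def parse_bufsize(line):
    """Parse a directive such as 'bufsize = ti 64' into ('ti', 64)."""
    name, sep, value = line.partition("=")
    if sep != "=" or name.strip() != "bufsize":
        raise ValueError(f"not a bufsize directive: {line!r}")
    parts = value.split()
    if len(parts) != 2:
        raise ValueError("expected: bufsize = media buffer_size")
    media, size = parts[0], int(parts[1])
    if not 2 <= size <= 8192:
        raise ValueError("buffer_size must be in the range 2-8192")
    return media, size


def effective_buffer(buffer_size, default_block_size):
    """Buffer size after multiplication, in the same units as default_block_size."""
    return buffer_size * default_block_size
```

For example, parse_bufsize("bufsize = ti 64") gives ("ti", 64); with a purely hypothetical default block size of 2048 units, effective_buffer(64, 2048) is 131072 units.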
The sam-migcopy process starts by reading the inode records stored in the database that sam-migrationd created for the current volume during phase 1, scanning of the file system. Then it examines the tar files on the tape volume.
At the start of each tar file, sam-migcopy constructs a list of the inodes that have archive copies starting at that position. It compares the list of inodes on the tape volume with the database of inodes scanned from the file system. Inodes that appear in the list but not in the database point to inactive copies and are dropped from the list. If the list is empty when the comparison finishes, sam-migcopy does not copy the tar file to the new media. Otherwise, it copies all active and inactive copies found between the start of the tar file and the end of the last active copy in the file. It updates the inode database with the status of the copy operation (done, if successful, or a copy error otherwise).
The sam-migcopy process repeats the above steps until all tar files have been examined, compared with the inode database, and, if necessary, copied to new media.
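The per-tar-file decision described above can be sketched as follows. This Python fragment is a simplified model, not the sam-migcopy implementation: the data shapes (a list of inode identifier and end offset pairs for the tar file, and a set of inode identifiers from the database) are assumptions made for illustration.

```python
def plan_tar_copy(tape_inodes, db_inodes):
    """Decide how much of one tar file to copy in Memory Assisted Copy mode.

    tape_inodes: list of (inode_id, end_offset) for copies found in the tar file.
    db_inodes:   set of inode_ids recorded by the file system scan (active copies).
    Returns None when the tar file holds no active copy, otherwise the end
    offset of the last active copy; everything up to it is copied, including
    any inactive copies that lie in between.
    """
    active = [(ino, end) for ino, end in tape_inodes if ino in db_inodes]
    if not active:
        return None  # list is empty after the comparison: skip this tar file
    return max(end for _, end in active)
```

With copies ending at offsets 100, 250, and 400 and only the first two active, the function returns 250: the trailing inactive copy is not carried to the new media.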
StorageTek Direct Copy Mode
In StorageTek Direct Copy mode, the sam-migcopy process takes advantage of the xcopy feature of StorageTek T10000D Fibre Channel drives. When the destination drive is a T10000D, sam-migcopy can make direct, drive-to-drive, tape-to-tape copies, without consuming significant server resources.
The sam-migcopy process can create tape-to-tape copies either by repacking files or by copying to the EOD (end-of-data) mark on the source tape. The administrator specifies the desired behavior using the xcopy_eod directive of the /etc/opt/SUNWsamfs/migrationd.cmd(4) file. The directive can take the following forms:
xcopy_eod = off (repack)
xcopy_eod = on (copy to EOD)
Repacking
When xcopy_eod is set to off, the sam-migcopy process copies all tar files that contain at least one active archive copy, one tar file at a time. Since tar files that do not contain active copies are not copied, the new media volumes are partially repacked.
The sam-migcopy process starts by reading the inode records stored in the database that sam-migrationd created for the current volume in phase 1, scanning of the file system. Then it examines the tar files on the tape volume.
At the start of each tar file, sam-migcopy constructs a list of the inodes that have archive copies starting at that position. It compares the list of inodes on the tape volume with the database of inodes scanned from the file system. Inodes that appear in the list but not in the database point to inactive copies and are dropped from the list. If one or more active copies remain on the list when the comparison finishes, sam-migcopy copies the entire tar file to the new media. It defines the required operation in an extended copy SCSI command, sends the command to the destination tape drive, waits for the extended copy command to complete, and updates the inode database with the copy status (done, if successful, or a copy error otherwise). If the inode list is empty when the comparison finishes, sam-migcopy does not copy the tar file.
The sam-migcopy process repeats the above steps until all tar files have been examined, compared with the inode database, and, if necessary, copied to new media.
Copying to EOD
When xcopy_eod is set to on, the sam-migcopy process starts copying the tape at the first tar file that contains one or more active file copies and continues until it encounters the EOD (end-of-data) mark on the tape.
The sam-migcopy process identifies its starting point by reading the inode records stored in the database that sam-migrationd created for the current volume during phase 1, scanning of the file system. Then it copies all tar files between its start point and the EOD mark to the new media. It defines the required operation in an extended copy SCSI command, sends the command to the destination tape drive, waits for the copy to complete, and updates the inode database. If no errors are returned, sam-migcopy enters done for all copies made. If errors are returned, sam-migcopy checks the number of blocks that were written successfully and marks the successfully copied files done.
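The difference between the two StorageTek Direct Copy behaviors, repacking (xcopy_eod = off) and copying to EOD (xcopy_eod = on), can be shown with a short Python sketch. The function name and the input shape, a tape-ordered list of (position, has_active_copy) pairs, are hypothetical simplifications.

```python
def select_tar_files(tar_files, xcopy_eod):
    """Return the positions of the tar files that would be copied.

    tar_files: list of (position, has_active_copy) pairs in tape order.
    xcopy_eod: True for copy-to-EOD, False for repacking.
    """
    if xcopy_eod:
        # Start at the first tar file with an active copy, then take
        # everything up to the end-of-data mark, active or not.
        for index, (_, has_active) in enumerate(tar_files):
            if has_active:
                return [pos for pos, _ in tar_files[index:]]
        return []
    # Repacking: copy only tar files that hold at least one active copy.
    return [pos for pos, has_active in tar_files if has_active]
```

Given tar files at 0x10 (inactive), 0x20 (active), 0x30 (inactive), and 0x40 (active), repacking selects 0x20 and 0x40, while copying to EOD selects 0x20, 0x30, and 0x40.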
Error Handling
If sam-migcopy detects potential problems while copying tape blocks, it takes appropriate action or cancels the operation, while flagging the source tape read-only (R) in the Oracle HSM catalog:
If StorageTek Direct Copy is enabled but the source or destination drives do not support the StorageTek Direct Copy feature, sam-migcopy automatically adapts. If the migrationd.cmd(4) file contains the directive xcopy = on, it uses the Memory Assisted Copy feature instead. If the file contains the directive xcopy = only, it cancels the migration operation.
If the source tape returns a read or positioning error, sam-migcopy skips the position where the error occurred. It logs all inodes that have archive copies at the affected position in the file /var/opt/SUNWsamfs/sammig/logfile, so that the affected files can be manually rearchived with the rearch(1m) command.
In the example below, the log entries show that a read error has occurred on a StorageTek T10000 (ti) source volume with the volume serial number TEE152 at position 0xff38. The entries for inodes 1408.20 and 1363.16 show that the first copies of files /mig1/tee152/ssum1/s0 and /mig1/tee152/ssum1/s1 reside at the damaged location, so these copies need to be manually recreated by rearchiving the files to the new media:
2015-07-30 19:51:46 Error: 'ti.TEE152' Copy: Source read error position 0xff38 error 5
2015-07-30 19:51:56 Error: 'ti.TEE152' Copy: ino:1408.20 /mig1/tee152/ssum1/s0 copy:1 on position:0xff38 has a read error, copy skipped, need to rearchive.
2015-07-30 19:51:56 Error: 'ti.TEE152' Copy: ino:1363.16 /mig1/tee152/ssum1/s1 copy:1 on position:0xff38 has a read error, copy skipped, need to rearchive.
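To collect the files that need rearchiving, an administrator could scan the log for entries like those above. The Python sketch below is one possible approach; the regular expression is derived only from the example entries shown in this page, so treat the pattern, the function name, and the returned tuple layout as assumptions that may need adjusting for other log messages.

```python
import re

# Pattern modeled on the example entries above; other message formats are ignored.
REARCHIVE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} Error: '(?P<volume>[^']+)' Copy: "
    r"ino:(?P<inode>\d+\.\d+) (?P<path>.+?) copy:(?P<copy>\d+) "
    r"on position:(?P<position>0x[0-9a-fA-F]+) has a read error"
)


def files_to_rearchive(log_lines):
    """Return (path, copy number, position) for each skipped archive copy."""
    found = []
    for line in log_lines:
        match = REARCHIVE_RE.match(line.strip())
        if match:
            found.append(
                (match["path"], int(match["copy"]), int(match["position"], 16))
            )
    return found


SAMPLE = [
    "2015-07-30 19:51:46 Error: 'ti.TEE152' Copy: Source read error position 0xff38 error 5",
    "2015-07-30 19:51:56 Error: 'ti.TEE152' Copy: ino:1408.20 /mig1/tee152/ssum1/s0 "
    "copy:1 on position:0xff38 has a read error, copy skipped, need to rearchive.",
    "2015-07-30 19:51:56 Error: 'ti.TEE152' Copy: ino:1363.16 /mig1/tee152/ssum1/s1 "
    "copy:1 on position:0xff38 has a read error, copy skipped, need to rearchive.",
]
```

Calling files_to_rearchive(SAMPLE) returns the two affected paths with copy number 1 and position 0xff38; the first entry, which reports the read error itself, is skipped because it names no inode.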
If the source tape returns read or positioning errors twice in a row during a Memory Assisted Copy operation, sam-migrationd logs the event and cancels migration.
In the example below, a StorageTek T10000 (ti) source volume with the volume serial number TEE152 has returned two errors, so migration has stopped:
2015-07-31 00:13:53 Error: 'ti.TEE152' Copy: Read error limit count 2 reached.
If the destination tape returns a write error or reports that the media is full, sam-migcopy stops the copy operation and checks the tar header against the tar files that have been copied to the destination volume. If the check fails, sam-migcopy cancels the copy operation and sam-migrationd cancels migration. If the check passes, sam-migcopy exits, sam-migrationd assigns the next available destination volume to the copy operation, and copying resumes.
In the example below, the log shows that migration of StorageTek T10000 (ti) volume TEE152 stopped because destination volume TEE157 was full. The header check passed, so copying continued to a new destination volume, TEE158:
2015-07-30 19:49:14 Info: 'ti.TEE152' Copy: Server copy started from position 0x5247.
2015-07-30 19:51:12 Error: 'ti.TEE152' Copy: Destination 'ti.TEE157' media full: No space left on device
2015-07-30 19:51:13 Info: 'ti.TEE152' Copy: Tar header check started from position 0x5247.
2015-07-30 19:51:45 Info: 'ti.TEE152' Copy: Tar header check succeeded, 21 inodes checked, 0 tar header error found.
2015-07-30 19:51:45 Info: 'ti.TEE152' Copy: Exited for retry, pid: 18522, exit status: 3, signal: 0
2015-07-30 19:51:45 Info: 'ti.TEE152' Copy: Restarted, pid: 18524 destination 'ti.TEE158' source position: 0xfec4
2015-07-30 19:51:45 Info: 'ti.TEE152' Copy: Mode - Memory-Assisted Copy
2015-07-30 19:51:45 Info: 'ti.TEE152' Copy: Server copy restarted from position 0xfed8.
If a fatal file system, inode, or database error is detected, sam-migcopy logs the error message to the file /var/opt/SUNWsamfs/sammig/logfile, and sam-migrationd cancels migration.
In the example below, the log shows that migration of StorageTek T10000 (ti) volume TEE152 stopped when a database error occurred:
2015-07-31 17:53:41 Error: 'ti.TEE152' Copy: Update db with new vsn failed, source position 0x527d, cancel migration.
If the copy operation is stopped by the samcmd migstop command or idled by either the samcmd migidle command or the time schedule specified in the file /etc/opt/SUNWsamfs/migrationd.cmd, sam-migcopy suspends the copy operation and exits. The sam-migrationd daemon restarts the sam-migcopy process when the samcmd migstart command is next issued or at the next scheduled migration time. The sam-migcopy process then resumes the copy operation at the point where it was suspended.
Whenever it successfully copies tape archive (tar) files from source to destination media, the sam-migcopy process verifies the integrity of the data by checking the tar headers on the destination volume. It checks the headers for the first, middle, and last written positions on the destination tape. If a read error was detected during phase 2, the last good position listed in the inode database prior to the error and the next good position listed after the error are also checked.
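The choice of positions for the tar header check can be approximated with the Python sketch below. It is a simplified model under stated assumptions: all positions are treated as values in a single sorted list of good positions, and the function name and arguments are hypothetical. The real check works against the inode database and the destination tape.

```python
from bisect import bisect_left, bisect_right


def header_check_positions(written, read_errors=()):
    """Pick the positions whose tar headers would be checked.

    written:     sorted list of positions that were copied successfully.
    read_errors: positions at which a read error was detected during the copy.
    """
    if not written:
        return []
    # First, middle, and last written positions are always checked.
    picks = {written[0], written[len(written) // 2], written[-1]}
    for err in read_errors:
        before = bisect_left(written, err)   # index of first position >= err
        after = bisect_right(written, err)   # index of first position > err
        if before > 0:
            picks.add(written[before - 1])   # last good position before the error
        if after < len(written):
            picks.add(written[after])        # next good position after the error
    return sorted(picks)
```

For written positions 10, 20, 30, 40, 50 and a read error at 35, the function selects 10, 30, 40, and 50.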
Error Handling
If sam-migcopy detects potential problems while checking tar headers, it takes appropriate action or cancels the operation, while flagging the source tape read-only (R) in the Oracle HSM catalog:
If it detects read or positioning errors on the destination tape or fatal file system, inode, or database errors, sam-migcopy logs an error message to the file /var/opt/SUNWsamfs/sammig/logfile and cancels the tar header check. The sam-migrationd daemon then cancels migration.
In the example, the log shows that migration of StorageTek T10000 (ti) volume TEE152 stopped after a read error, two tar header errors, and an inode error:
- read error
2015-07-31 18:58:00 Error: 'ti.TEE152' Copy: Tar header check, destination 'ti.TEE157' read failed, position 0x8543b: EIO
- tar header error
2015-07-31 20:00:59 Error: 'ti.TEE152' Copy: Invalid tar header, destination 'ti.TEE157', inode:1337.14 position 0x86ea1.
2015-07-31 20:01:10 Error: 'ti.TEE152' Copy: Tar header check failed, 104 inodes checked, 1 tar header error found.
- inode error
2015-07-31 20:14:38 Error: 'ti.TEE152' Copy: Tar header check, error detected to get inodes on source position 0x5681, cancel migration.
If the tar header check is stopped by the samcmd migstop command or idled by either the samcmd migidle command or the time schedule specified in the file /etc/opt/SUNWsamfs/migrationd.cmd, the sam-migcopy process suspends the tar header check and exits. The sam-migrationd daemon restarts the sam-migcopy process when the samcmd migstart command is next issued or at the next scheduled migration time. The sam-migcopy process then resumes the tar header check at the point where it was suspended.
Once it has moved the active archive copies from the source volume to the destination, the sam-migrationd daemon updates the file system. It updates the inode for each affected file with the new media type, volume serial number (VSN), position, modification time, and copy creation time of its archive copy (the modification time is the time when the tar file that contains the archive copy was copied to the destination tape). If a newer copy has been archived in the interim, sam-migrationd skips the inode update and logs a message. A second migration run is then required to empty the source tape.
Error Handling
If sam-migrationd detects potential problems while updating inodes, it takes appropriate action or cancels the migration operation, while flagging the source tape read-only (R) in the Oracle HSM catalog:
If it detects fatal file system, inode, or database errors, sam-migrationd logs error messages, cancels the inode update, and cancels the migration. Any inodes that were successfully updated up to this point have valid archive copies on the destination volume.
2015-07-31 23:38:34 Error: 'ti.TEE152' Update inode: Error detected for inodes on next source position from 0x527d, stop migration.
2015-07-31 23:38:34 Error: 'ti.TEE152' Update inode: failed, stop migration.
If the inode update operation is stopped by the samcmd migstop command, sam-migrationd suspends the operation. It resumes the inode update operation at the point where it was suspended when the samcmd migstart command is next issued.
When logging is enabled, the migration daemon creates a log file for each source volume. Each log file lists the new location of each archived file copy that migrated from that volume (see migrationd.cmd(4) for details).
Migration logging is enabled when the file /etc/opt/SUNWsamfs/migrationd.cmd includes a directive of the form logdir = path, where path specifies the directory in which the log file should be created. Logging is not enabled by default.
Error Handling
If sam-migrationd detects potential problems while creating logs, it takes appropriate action or cancels the migration operation, while flagging the source tape read-only (R) in the Oracle HSM catalog:
If it detects database errors, sam-migrationd logs error messages to the file /var/opt/SUNWsamfs/sammig/logfile, cancels the log operation, and cancels migration.
2015-08-01 15:21:20 Error: 'ti.TEE152' Log: Error detected for inodes on source position beyond 0xfec4, stop migration.
2015-08-01 15:21:20 Error: 'ti.TEE152' Log: failed, but migration completed.
If the log operation is stopped by the samcmd migstop command, sam-migrationd suspends the operation. It resumes the log operation at the point where it was suspended when the samcmd migstart command is next issued.
The sam-migrationd daemon can be in any of three states:
Stop
When the sam-migrationd daemon is in the Stop state, it does not scan file systems, copy volumes, update inodes, or log events. The sam-fsd daemon starts the sam-migrationd daemon in the Stop state. The sam-migrationd daemon remains in the Stop state until the samcmd migidle or samcmd migstart command is next issued or until the next scheduled migration time.
Idle
When the sam-migrationd daemon is in the Idle state, it scans file systems, updates inodes as necessary, and logs its activities. But it does not copy volumes to new media. The sam-migrationd daemon enters the Idle state at the end of the scheduled migration time and whenever the samcmd migidle command is issued. The sam-migrationd daemon remains in the Idle state until the start of the next scheduled migration time or until a samcmd migstart or samcmd migstop command is issued.
Run
When the sam-migrationd daemon is in the Run state, it scans file systems, copies volumes to new media, updates inodes, and logs its activities. The sam-migrationd daemon enters the Run state at the start of the scheduled migration time and whenever the samcmd migstart command is issued. The sam-migrationd daemon remains in the Run state until the end of the scheduled migration time or until a samcmd migidle or samcmd migstop command is issued.
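The three states and the events that move the daemon between them can be modeled as a small transition table. The Python sketch below uses the samcmd subcommands listed at the top of this page plus two hypothetical event names, schedule_start and schedule_end, for the scheduled migration window. Leaving the state unchanged for any event not covered by the descriptions above is an assumption of the sketch, not documented behavior.

```python
# (current state, event) -> next state, following the state descriptions above.
TRANSITIONS = {
    ("Stop", "migstart"): "Run",
    ("Stop", "migidle"): "Idle",
    ("Stop", "schedule_start"): "Run",
    ("Idle", "migstart"): "Run",
    ("Idle", "migstop"): "Stop",
    ("Idle", "schedule_start"): "Run",
    ("Run", "migidle"): "Idle",
    ("Run", "migstop"): "Stop",
    ("Run", "schedule_end"): "Idle",
}


def next_state(state, event):
    """Return the state after an event; unknown combinations keep the state."""
    return TRANSITIONS.get((state, event), state)


def replay(events, state="Stop"):
    """Apply a sequence of events, starting from the Stop state by default."""
    for event in events:
        state = next_state(state, event)
    return state
```

For example, replay(["migstart", "migidle", "migstop"]) walks from Stop through Run and Idle and ends in Stop.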
The samu command line interface displays the current migration state on the x (migration status) and y (migration volume serial number list) screens. For example:
Status: Stop: Waiting for :migstart
To migrate tape volumes, carry out the following steps:
Make sure that migration jobs are not already running or incomplete.
Start the samu(1m) interface, and check the Migration activity (x), Migration vsn list (y), and Robot VSN catalog (v) displays. On the Migration vsn list display, look for the following in the flags field:
D (destination): The volume is the destination for an active migration operation.
S (source): The volume is the source of an active or incomplete migration operation.
M (partially migrated source): The volume is the source of a partially completed migration operation.
m (migrated source): The volume is the source of a completed migration operation.
e (error source): The volume is the source of a failed migration.
The following example shows the catalog display entries for an active migration operation. Archive copies are migrating from LTO (li) source volume 000040 to LTO destination volume 000044:
Robot VSN catalog by slot : eq 400                        samu
slot access time      count use flags        ty vsn
   0 2015/10/14 03:58     4  0% -il---b--D-- li 000044
   3 2015/10/14 03:58    25  1% -il---b-RS-- li 000040
When the operation is complete, the entry for LTO volume 000040 looks like this:
   3 2015/10/14 03:58    25  1% -il---b-Rm-- li 000040
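A script that watches migration progress could read the migration flag from catalog entries like those above. The Python sketch below is an illustration built only on the examples in this page: it assumes that the last three whitespace-separated fields are flags, media type, and VSN, and that the read-only flag and the migration flag sit at the ninth and tenth characters of the flags field. Those positions are inferred from the examples and are not a documented format.

```python
# Meanings of the migration flags listed in this man page.
MIGRATION_FLAGS = {
    "D": "destination",
    "S": "source",
    "M": "partially migrated source",
    "m": "migrated source",
    "e": "error source",
}


def migration_flag(catalog_line):
    """Return (vsn, migration role or None, read_only) for one catalog entry."""
    fields = catalog_line.split()
    flags, vsn = fields[-3], fields[-1]
    # Assumed layout: index 8 holds R (read-only), index 9 the migration flag.
    read_only = len(flags) > 8 and flags[8] == "R"
    role = MIGRATION_FLAGS.get(flags[9]) if len(flags) > 9 else None
    return vsn, role, read_only
```

Applied to the entry for volume 000040 during migration, the function returns ("000040", "source", True); after completion the role becomes "migrated source".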
Configure the migration. Open the /etc/opt/SUNWsamfs/migrationd.cmd file in a text editor, and enter the needed directives and values. Then save the file and close the editor.
Stop the sam-migrationd daemon.
# samcmd migstop
Activate the migrationd.cmd file.
# samcmd migconfig
In the Migration vsn list (y) display of the samu interface, check the status of the source volumes.
In the example, the display shows that LTO (li) source (S) volume 000040 has the status sched_wait (scheduled, waiting), and LTO destination (D) volume 000044 has the status avail (available for use):
Status: Stop: Waiting for :migstart
Vsns:2 src:1 dest:1 maxcopy:0
ord m ty vsn    start time end time status     Inodes done/tot bytes
  0 S li 000040 none       none     sched_wait 0/0                 0
  0 D li 000044 none       none     avail                          0
Start migration.
# samcmd migstart
If you need to stop Oracle HSM (using the samd stop or hastop commands), if you need to shut down the system, or if you need to stop migration for any other reason, take the following steps:
If you can, stop the sam-migcopy process gracefully.
The samcmd migidle command gracefully terminates the sam-migcopy process. The sam-migcopy process stops the copy operation when it finishes copying the current tape archive (tar) file. If the current file is large, it may take some time before copying stops.
# samcmd migidle
If you need to terminate the sam-migcopy process immediately, stop the sam-migrationd daemon. Use the command samcmd migstop.
When the sam-migcopy process is informed that the migration daemon is stopping, it waits for two seconds before it stops itself, so that it can try to finish copying the current tar file. If it cannot copy the file in time, it cancels the copy operation and stops.
# samcmd migstop
Idled or canceled migration jobs resume when Oracle HSM restarts. Each copy operation restarts at the position following the last tar file copied.
Start the samu(1m) interface, and check the Migration activity (x), Migration vsn list (y), and Robot VSN catalog (v) displays. Monitor the file /var/opt/SUNWsamfs/sammig/logfile.
When migration starts, the migration software flags the library catalog entries of the source volumes R (read-only).
To use xcopy, you must install both destination and source drives on the same Fibre Channel fabric switch and zone.
Manually loaded tape drives are not supported by sam-migrationd.
The StorageTek Direct Copy mode supports the following source tape drives for use with StorageTek T10000D destination tape drives:
HP LTO2, LTO3, LTO4, LTO5, LTO6
IBM LTO2, LTO3, LTO4, LTO5, LTO6
IBM TS1120
IBM TS1130
IBM TS1140
IBM TS1150
StorageTek 9840C
StorageTek 9840D
StorageTek 9940A
StorageTek 9940B
StorageTek T10000A
StorageTek T10000B
StorageTek T10000C
StorageTek T10000D
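The compatibility list above can be turned into a quick pre-check before setting xcopy = only. The Python sketch below simply encodes the list; the function name and the vendor and model strings are illustrative choices. It does not model the other requirements stated in this page, such as matching block sizes or placing the drives on the same Fibre Channel fabric switch and zone.

```python
# Source drives listed in this man page for StorageTek Direct Copy.
LTO_GENERATIONS = {"LTO2", "LTO3", "LTO4", "LTO5", "LTO6"}
DIRECT_COPY_SOURCES = {
    "HP": set(LTO_GENERATIONS),
    "IBM": LTO_GENERATIONS | {"TS1120", "TS1130", "TS1140", "TS1150"},
    "StorageTek": {
        "9840C", "9840D", "9940A", "9940B",
        "T10000A", "T10000B", "T10000C", "T10000D",
    },
}


def direct_copy_possible(src_vendor, src_model, dest_model):
    """True when the drive pair appears in the supported list above."""
    if dest_model != "T10000D":
        return False  # the destination must be a StorageTek T10000D drive
    return src_model in DIRECT_COPY_SOURCES.get(src_vendor, set())
```

For instance, direct_copy_possible("IBM", "TS1140", "T10000D") is True, while an HP TS1140 entry or a T10000C destination yields False.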