8 Migrating to New Storage Media

Types of Migration

Automated Media Migration

Automated Media Migration is ideal when you need to transfer complete volumes of archival data, from tape to tape or from tape to a cloud volume. It is also ideal for adding a cloud-based archive copy to the archive set.

The Automated Media Migration feature introduced in Oracle HSM 6.1 copies full volumes from media mounted on one library drive to media mounted on another, updating the file-system metadata in the process (manually loaded drives are not supported). This minimizes system overhead and administrator workload. Volumes are copied in the background, when drives are not required for archiving or staging. You can specify the number of drives used and the times of day when migration may occur, or you can let Oracle HSM migrate volumes whenever drives are idle. If a drive or volume is needed for an archiving or staging job, the media migration process yields to the higher priority operation. If you have correctly configured StorageTek T10000D tape drives, you can migrate to T10000D media using the StorageTek Direct Copy (xcopy) method. Once a request is made, the drives handle the copying themselves, using no server resources. Otherwise, you can still minimize server load by using the StorageTek Memory Assisted Copy method, in which the file-system server copies volumes from drive to drive via a configurable I/O buffer.

Staging and Rearchiving Process

The staging and rearchiving processes that handle ordinary archive management tasks are ideal when you need to transfer archival data selectively, directory by directory and file by file.

The older staging and rearchiving approach stages archive files from the old media to the disk cache and then rearchives them to the new media, one file at a time. This file-by-file approach can give you more control over how files are grouped and distributed. But it requires more administration. You allocate disk cache and drive resources yourself, so you must plan carefully if you need to minimize interference with normal file-system operations.

Preparing for Migration

Make Sure that File Systems Stay Backed Up

Before you start a media migration, make sure that the recovery mechanisms that normally protect Oracle HSM archive data remain effective during and following the changeover. Catastrophic hardware failures and user errors are neither more nor less likely during a migration operation. So, as always, you need to be sure that you can recover files and/or complete file systems from your existing samfsdump recovery-point files.

During the migration and for some time after, recovery will depend on recovery point files that reference the source tape volumes, rather than the new, destination volumes. If a major hardware failure disables the file system and these old tapes are not available, you will not be able to recover.

So, at a minimum, plan on preserving old tapes until you have created enough new recovery points to restore the current file system from new media. If you need to be able to restore files to particular points in time, you may need to retain the old media longer, if not indefinitely. Ideally, the old volumes should be maintained in a library, where they are readily accessible.
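
For example, you might create a recovery point immediately before starting the migration and again after it completes, so that at least one recovery-point file references the new volumes. The sketch below assumes that the archiving file system is mounted at /hsm/hsmfs1 and that /zfs/recover/hsmfs1/ is the independent, NFS-mounted directory used for recovery resources elsewhere in this chapter (adjust both paths to your own configuration):

    root@solaris:~# cd /hsm/hsmfs1
    root@solaris:~# samfsdump -f /zfs/recover/hsmfs1/hsmfs1-premigration.dmp .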

Provide the Required Media

Make sure that you have prepared the required destination media before you begin migration. If you are migrating volumes or volume copies to the cloud, configure and provision the cloud library. If you are migrating volumes to new tape volumes, make sure that the destination library contains enough media to hold the migrated files. Make sure that all volumes are correctly labeled, as described in "Label a New Tape or Relabel an Existing Tape". Migration will fail if volumes are unlabeled.
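
For example, you might label a fresh destination cartridge before configuring the migration. The following sketch uses the tplabel command described in that section; the volume serial number VOL501 comes from the destination pool used later in this chapter, while library 30 and slot 3 are hypothetical values that you would replace with your own (see the tplabel man page for complete option syntax):

    root@solaris:~# tplabel -new -vsn VOL501 30:3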

Select the Migration Approach that Best Meets Your Needs

  1. Migrate data by staging and re-archiving files if the following considerations apply:

    • You need to selectively migrate groups of archive files, rather than whole volumes, so that relationships between groups of archive files are maintained.

    • File system performance is not an issue.

      You have enough disk space and enough removable media drives to simultaneously handle normal file system activity and the additional migration-related staging and archiving requests (see "Estimate The Resources Available for Staging and Re-Archiving").

      Or users and host applications can accept reduced file system performance during the migration.

  2. Otherwise, select a volume migration method.

    Volume migration is especially advantageous when any of the following considerations apply:

    • You need to migrate data while maintaining normal file system operations.

    • You need to copy old volumes to new tape or cloud media.

    • You need to economize on disk space and/or removable media drives during migration.

    • You need to add a cloud-based copy to an existing archive set.

Select a Volume Migration Method

Oracle HSM can copy volumes in either of two ways: StorageTek Direct Copy or Memory Assisted Copy.

The StorageTek Direct Copy method maximizes performance and minimizes server overhead. When you specify this approach, the file-system server sends a SCSI copy request to the drive, and the T10000D drive copies the source to the destination tape, block-by-block, starting from the first valid archive file. The server plays no further role in the transfer. If the operation fails for any reason, the migration software switches automatically to the Memory Assisted Copy method.

The Memory Assisted Copy method copies valid archive files from the source drive to a configurable I/O buffer on the file-system server. If source and destination block sizes differ, the software automatically makes adjustments, as long as the destination block size is larger. The software then sends tape blocks from the buffer to the destination drive. The direct, tape-to-tape xcopy method can only copy tapes that share a common block size.

Select the method that best suits your requirements:

  1. Plan to use the StorageTek Direct Copy migration method if all of the following apply:

    • You will migrate volumes to Fibre Channel StorageTek T10000D destination drives running current firmware.

    • Source and target tapes share the same block size.

    • Source and destination drives connect via the same storage area network (SAN) switch.

    • You are not using multiple paths to the tape drive through the SAN switch/fabric. Tape multi-path is not supported with StorageTek Direct Copy.

    For more information on drive and firmware requirements, see the release notes, README.txt, in the download ZIP file or on your file-system server at /opt/SUNWsamfs/doc/README.txt.

  2. Otherwise, plan to use the Memory Assisted Copy migration method.

    Use Memory Assisted Copy if any of the following considerations apply:

    • The destination drive is not a Fibre Channel Oracle StorageTek T10000D drive.

    • The source and destination drives are not Fibre Channel drives.

    • The source and destination drives do not run current firmware.

    • The source and destination drives do not connect via the same storage area network (SAN) switch.

    • The source and destination tapes do not share a common block size.

  3. If you will use StorageTek Direct Copy, select a copy mode now.

  4. If you will use Memory Assisted Copy, configure the migrationd.cmd file accordingly.

Select a StorageTek Direct Copy Mode

  1. If you are planning to migrate source volumes that contain relatively few expired files, plan on using the eod (end-of-data) option.

    In this mode, the T10000 drive copies all archive (tar) files found between the first valid file and the end-of-data (EOD) mark on the tape. If some of these files are stale, they are copied to the destination volume with the valid files.

  2. If you are planning to migrate source volumes that contain many expired files, use the repack option.

    In this mode, the T10000 drive copies only archive (tar) files that hold at least one unexpired archive copy.

  3. Now configure the migrationd.cmd file for StorageTek Direct Copy.

Migrating Complete Volumes

You select the StorageTek Direct Copy or Memory Assisted Copy method and configure migration by creating the migrationd.cmd file. Carry out the following tasks:

Create the migrationd.cmd Configuration File

  1. Log in to the Oracle HSM metadata server as root.

    root@solaris:~# 
    
  2. Open the /etc/opt/SUNWsamfs/migrationd.cmd file in a text editor.

    In the example, we open the new file in the vi editor:

    root@solaris:~# vi /etc/opt/SUNWsamfs/migrationd.cmd
    
  3. Define media pools for the source and destination volumes. Define each pool by entering a line of the form vsnpool = poolname library equipment_number media_type VSNlist, where:

    • poolname uniquely identifies the pool.

    • equipment_number is the ordinal number that the mcf file assigns to the library that holds the pool's volumes.

    • media_type is the two-letter code that identifies the kind of media in the pool (see Appendix A, "Glossary of Equipment Types" for details).

    • VSNlist is a space-delimited list made up of VSNs that individually identify tape volumes and/or VSN-based regular expressions that collectively identify groups or ranges of tape or cloud media volumes.

      Cloud media pools are identified by regular expressions of the form ^prefix.*, where prefix is the value of the name parameter in the cloud library's parameters file. During migration, Oracle HSM creates cloud volumes and generates volume serial numbers as needed, using the value of the name parameter as the prefix that starts each VSN.

    In the example, we first need to migrate recent data from old LTO6 (li) tape volumes to replacement LTO7 (li) tape cartridges. So we add a line for a source pool, pool1, that represents the LTO6 volumes that will migrate from library 20. These include volumes that have VSNs in the range VOL000 to VOL299 and two single volumes, VOL300 and VOL304. Then we add a line for a destination pool, pool2, which represents a range of LTO7 volumes in library 30:

    # pool1 contains the source volumes
    vsnpool = pool1 library 20 li ^VOL[0-2][0-9][0-9] VOL300 VOL304
    # pool2 contains the destination volumes
    vsnpool = pool2 library 30 li ^VOL50[0-9]
    

    Then we need to migrate older data from the remaining LTO6 tape volumes to cloud (cl) media for long-term storage. We add a line for a source pool, pool3, that represents the LTO6 volumes in library 20. Then we add a line for a destination pool, pool4, which represents the cloud (cl) media volumes in cloud (cr) library 800 (family set name cl800). To identify the cloud media, we supply a regular expression that specifies any VSNs starting with cl800, the value of the name parameter listed in the cloud library's parameters file, /etc/opt/SUNWsamfs/cl800:

    # pool3 contains the source volumes
    vsnpool = pool3 library 20 li ^VOL6[0-9][0-9]
    # pool4 contains the destination volumes
    vsnpool = pool4 library 800 cl ^cl800.*
    
  4. If you need to move data from source to destination media, enter a line of the form migrate = from sourcepool to destinationpool, where:

    • sourcepool is the media pool that contains the data that will be migrated.

    • destinationpool is the media pool that will receive the migrated data.

    In the example, we migrate data from pool1 to pool2 and from pool3 to pool4.

    # Migrate data from tapes in pool1 to tapes in pool2.
    migrate = from pool1 to pool2
    # Migrate data from tapes in pool3 to tapes in pool4.
    migrate = from pool3 to pool4
    
  5. If you need to add an archive copy by copying data from source to destination media, enter a line of the form migrate = add_copy copy_number from sourcepool to destinationpool, where:

    • copy_number is either a specific archive copy number (in the range [1-4]) or 0, which specifies the next available copy number.

      For example, if the archiver.cmd file already specifies two copies for the archive set, a copy number of 0 is equivalent to copy number 3. If the archiver.cmd file already specifies the maximum number of supported copies, then the migration process returns an error.

    • sourcepool is the media pool that holds the existing archive copies of the files to be copied.

    • destinationpool is the media pool that will receive the additional archive copies (for example, cloud media, or volumes that will be exported from the library and vaulted off-site).

    In the example, we add archive copy number 4 to the archiving configuration by copying previously archived volumes from pool3 to the pool of cloud media, pool4:

    migrate = add_copy 4 from pool3 to pool4
    
  7. If you need to move an archive copy from one media pool to another, enter a line of the form migrate = move_copy copy_number from sourcepool to destinationpool. This may be useful when you want to move an existing copy to a different media type (for example, off-site or cloud media). It can also be useful if a misconfigured archiver has made copies on the wrong media pool.

    • copy_number is either a specific archive copy number (in the range [1-4]) or 0, which specifies the next available copy number.

      For example, if the archiver.cmd file already specifies two copies for the archive set, a copy number of 0 is equivalent to copy number 3. If the archiver.cmd file already specifies the maximum number of supported copies, then the migration process returns an error.

    • sourcepool is the media pool that currently holds the archive copies that you want to move.

    • destinationpool is the media pool that will receive the moved copies.

    In the example, we move copy number 4 from pool3 to pool4:

    migrate = move_copy 4 from pool3 to pool4
    
  7. If you plan to use the StorageTek Memory Assisted Copy method exclusively, enter a line of the form xcopy = off.

    When the xcopy directive is set to off, StorageTek Direct Copy is disabled, and the migration software uses the Memory Assisted Copy method for all copies.

    # Disable StorageTek Direct Copy and use Memory Assisted Copy only
    xcopy = off
    
  8. If you plan to use the StorageTek Direct Copy feature exclusively and do not intend to migrate data when drives that support this feature are unavailable, enter a line of the form xcopy = only.

    When the xcopy directive is set to only, the migration software uses the StorageTek Direct Copy method whenever the source and destination drives support the StorageTek Direct Copy feature and automatically cancels migration otherwise.

    # Enable StorageTek Direct Copy but cancel migration if 
    # the source or destination drive does not support it.
    xcopy = only
    
  9. If you plan to use StorageTek Direct Copy when possible and Memory Assisted Copy otherwise, enter a line of the form xcopy = on.

    When the xcopy directive is set to on, the migration software uses the StorageTek Direct Copy method whenever compatible drives are available. If either the source or destination drive does not support StorageTek Direct Copy, the migration software automatically switches to Memory Assisted Copy.

    # Enable StorageTek Direct Copy but, if the source or destination
    # drive is not compatible with StorageTek Direct Copy,
    # automatically switch to Memory Assisted Copy.
    xcopy = on
    
  10. If you need to migrate tape volumes that contain few expired files using StorageTek Direct Copy, enable end-of-data (eod) mode. Enter a line of the form xcopy_eod = on.

    xcopy = on
    xcopy_eod = on
    
  11. If you need to migrate tape volumes that contain significant numbers of expired files using StorageTek Direct Copy, enable repack mode. Enter a line of the form xcopy_eod = off.

    xcopy = on
    xcopy_eod = off
    
  12. Specify the minimum amount of data that must be copied before xcopy can be interrupted by a higher-priority archiving or staging task. Enter a line of the form xcopy_minsize = amountUnits, where:

    • amount is an integer.

    • Units is k for kilobytes, M for megabytes, G for gigabytes, T for terabytes, P for petabytes, or E for exabytes.

    This value defines a compromise between efficient utilization of T10000 drives and availability of drives for other tasks. Larger values write data to the drives more efficiently. Smaller values increase the availability of drives for archiving and staging. In the example, we set the minimum copy size to 30 gigabytes:

    root@solaris:~# vi /etc/opt/SUNWsamfs/migrationd.cmd
    ...
    xcopy_eod = on
    # xcopy can be interrupted after 30GB copied.
    xcopy_minsize = 30G
    
  13. Define the daily period during which migration jobs are allowed to run. Enter a line of the form run_time = window, where window is one of the following values:

    • always lets the migration daemon migrate data whenever drives and media are not required for archiving or staging. If the migration daemon is using drives or media when they are needed for archiving or staging, the migration daemon yields them.

    • start_time end_time, where start_time and end_time are, respectively, the times when the allowed period begins and ends, expressed as hours and minutes on a 24-hour clock (HHMM).

    You can override this directive at any time by issuing the samcmd migstart, samcmd migidle, or samcmd migstop commands.

    The migration service relinquishes volumes and drives when either are required by the stager or archiver. So, unless you experience problems with archiving or staging (for example, during peak hours), you should accept the default value, always:

    root@solaris:~# vi /etc/opt/SUNWsamfs/migrationd.cmd
    ...
    xcopy_minsize = 30G
    # Run all of the time. Migration daemon will yield VSNs and drives when
    # resources are wanted by the SAM-QFS archiver and stager.
    run_time = always
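
    If archiving or staging contention does become an issue, you can instead restrict migration to a window. For example, a hypothetical overnight window running from 10:00 p.m. to 6:00 a.m. would look like this:

    # Allow migration only between 22:00 and 06:00
    run_time = 2200 0600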
    
  14. Enable logging by specifying a log directory. Enter a line of the form logdir = path, where path is the path and name of the log directory.

    When the directory is defined, the migration daemon logs the destination of each archive file that migrates from each source volume. Each source volume has its own log file, named media_type.VSN where:

    • media_type is a two-letter code that identifies the kind of source media (see Appendix A, "Glossary of Equipment Types" for details).

    • VSN is the unique volume serial number that identifies the source volume.

    So for example, the log file for the source volume with the VSN VOL300 would be named li.VOL300.

    Like the archiver log, these migration logs can be invaluable during disaster recovery (see "Understanding Recovery Points and Archive Logs" and the Oracle Hierarchical Storage Manager and StorageTek QFS Software File System Recovery Guide for details). So always specify a log directory if you can. Select a location that will not be affected by the failure of Oracle HSM software or hardware, such as /var/adm/. In the example, we specify the directory /var/adm/hsm_migration_logs:

    root@solaris:~# vi /etc/opt/SUNWsamfs/migrationd.cmd
    ...
    run_time = always
    # Log directory for the migration logs. 
    logdir = /var/adm/hsm_migration_logs
    
  15. Specify a home directory for migration inode databases. Enter a line of the form dbdir = path, where path is an absolute directory path name.

    An inode database is created for each source volume and maintained for the duration of the migration. One 224-byte database record is created for each archive copy found on the source volume. So you must choose a location that has enough disk space to accommodate the largest number of copies that could fit on the source media. For example, each Oracle StorageTek T10000D volume might hold up to 8,200,104,892 archive copies; at 224 bytes per record, that is roughly 1.8 terabytes (about 1.67 tebibytes) of database space for each T10000D volume that might be migrating at any given time (see the migrationd.cmd (1m) man page for details).

    The default database location is /var/opt/SUNWsamfs/sammig/db. In the example, we specify the default directory:

    root@solaris:~# vi /etc/opt/SUNWsamfs/migrationd.cmd
    ...
    logdir = /var/adm/hsm_migration_logs
    # database home directory
    dbdir = /var/opt/SUNWsamfs/sammig/db
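
    Before starting a large migration, you may also want to confirm that the chosen directory has enough free space. A quick check with a standard Solaris command, assuming the default location:

    root@solaris:~# df -h /var/opt/SUNWsamfs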
    
  16. Set the migration buffer size for the destination device. Enter a line of the form bufsize = media_type blocks, where:

    • media_type is the two-letter code that identifies the kind of media that holds the source (see Appendix A, "Glossary of Equipment Types" for details).

    • blocks is an integer in the range [2-8192] that specifies the number of tape blocks that the buffer should be able to hold. The default is 64.

    In the example, we allocate enough space to hold the default number of Oracle StorageTek T10000 tape blocks:

    root@solaris:~# vi /etc/opt/SUNWsamfs/migrationd.cmd
    ...
    # database home directory
    dbdir = /var/opt/SUNWsamfs/sammig/db
    # allocate buffer space for 64 T10000D tape blocks
    bufsize = ti 64
    
  17. Specify the maximum number of drives that can be used for migration per library. Enter a line of the form max_drives = library-list, where:

    • library-list is a space-separated list of library entries, each of the form library equipment-number device-count.

    • equipment-number is the equipment ordinal number assigned to the library in the mcf file.

    • device-count is the number of drives that can be used in the specified library. By default, device-count is set to the number of drives in the library.

    The migration service relinquishes volumes and drives when either are required by the stager or archiver. So, unless you experience problems with archiving or staging, you should accept the default setting and allow migration to use any drive that is free. In the example, we have found that we do, in fact, need to limit drive usage. So we allocate eight drives to migration in library 20, six in library 30, and two in library 40:

    root@solaris:~# vi /etc/opt/SUNWsamfs/migrationd.cmd
    ...
    dbdir = /var/opt/SUNWsamfs/sammig/db
    # allocate buffer space for 64 T10000D tape blocks
    bufsize = ti 64
    # For migration, use 8 drives in library 20, 6 in 30, and 2 in 40
    max_drives = library 20 8 library 30 6 library 40 2
    
  18. Specify the maximum number of migration-related copy operations that can run at the same time. Enter a line of the form max_copy = processes, where processes is an integer.

    The default is the maximum value, which is equal to the number of configured drives in all of the libraries listed in the mcf file, divided by 2. In the example, we allow up to eight simultaneous copy processes:

    root@solaris:~# vi /etc/opt/SUNWsamfs/migrationd.cmd
    ...
    bufsize = ti 64
    # For migration, use 8 drives in library 20, 6 in 30, and 2 in 40
    max_drives = library 20 8 library 30 6 library 40 2
    # Run up to 8 sam-migcopy processes simultaneously.
    max_copy = 8
    
  19. Specify the maximum number of migration-related tape-scanning operations that can run at the same time. Enter a line of the form max_scan = processes, where processes is an integer.

    To identify archive copies on the migration source VSNs, the sam-migrationd daemon scans all file systems configured in the mcf, reads all inodes from the disk cache, and compares the vsn field in each inode to the Volume Serial Numbers (VSNs) of migration source volumes. The process increases metadata activity in the file system and may thus adversely affect file-system performance.

    Select a value that best balances acceptable file-system performance against speed of migration, or accept the default, 4, for most uses. If you intend to quiesce the file system in order to achieve the fastest migration, set max_scan to 0, so that all source volumes are scanned at once. In the example, we know from experience that we can allow up to eight simultaneous scanning processes without affecting our normal file-system operations:

    root@solaris:~# vi /etc/opt/SUNWsamfs/migrationd.cmd
    ...
    bufsize = ti 64
    # For migration, use 8 drives in library 20, 6 in 30, and 2 in 40
    max_drives = library 20 8 library 30 6 library 40 2
    # Run up to 8 sam-migcopy processes simultaneously.
    max_copy = 8
    # Scan up to 8 VSNs simultaneously.
    max_scan = 8
    
  20. Save the file, and close the editor.

    root@solaris:~# vi /etc/opt/SUNWsamfs/migrationd.cmd
    ...
    max_copy = 8
    # Scan up to 8 VSNs simultaneously.
    max_scan = 8
    :wq
    root@solaris:~# 
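
    Taken together, one consistent combination of the directives entered in the examples above produces a migrationd.cmd file like the following sketch. The pool definitions, equipment numbers, and limits are those used in this procedure; your own file will differ:

    # /etc/opt/SUNWsamfs/migrationd.cmd (example)
    # pool1 and pool3 hold the source volumes; pool2 and pool4 the destinations
    vsnpool = pool1 library 20 li ^VOL[0-2][0-9][0-9] VOL300 VOL304
    vsnpool = pool2 library 30 li ^VOL50[0-9]
    vsnpool = pool3 library 20 li ^VOL6[0-9][0-9]
    vsnpool = pool4 library 800 cl ^cl800.*
    migrate = from pool1 to pool2
    migrate = from pool3 to pool4
    xcopy = on
    xcopy_eod = on
    xcopy_minsize = 30G
    run_time = always
    logdir = /var/adm/hsm_migration_logs
    dbdir = /var/opt/SUNWsamfs/sammig/db
    bufsize = ti 64
    max_drives = library 20 8 library 30 6 library 40 2
    max_copy = 8
    max_scan = 8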
    

Check for Active Migration Jobs

The instructions in this section describe entering commands at the shell prompt using the samcmd command. Note that all of these commands can also be entered from the samu interface, in the form :command, where command is the name of the command.
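
For example, to view the Migration status screen described below, you could either run samcmd x from the shell or start the samu operator utility and enter the same display command at its prompt, as in this brief illustration (:q quits samu):

    root@solaris:~# samu
    :x
    ...
    :q
    root@solaris:~# 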

  1. If you are not currently logged in to the Oracle HSM metadata server as root, log in now.

    root@solaris:~# 
    
  2. Make sure that a previous migration is not currently active or incomplete. First, check current migration status. Use the command samcmd x.

    If another migration copy job is under way, the command lists the source and destination volumes, by media type and VSN, the copy mode, the percentage complete, and the current status of the copy:

    root@solaris:~# samcmd x
    Migration status   samcmd  version HH:MM:SS month day year
    samcmd on hsm61sol
    Status: Stop: Waiting for :migstart
    source    dest    cmod perc status
    li VOL004 li VOL042 -   60% Copy idled
    

    Otherwise, if no other migration copy jobs are running, the command lists no jobs:

    root@solaris:~# samcmd x
    Migration status     samu      ver  time date
    Source Vsns - wait:  0 fsscan: 0 copy: 0 update ino: 0 log: 0 done:  0
    Status: Idle: Waiting for :migstart
    source  dest  cmod  perc  status
    
  3. Next, check the status of any current source (S) and/or destination (D) volumes. Use the command samcmd y.

    In the first example, the job end time for the only source and destination volumes listed is 10/16 12:14. The copy of the source volume is complete. So no jobs are currently running:

    root@solaris:~# samcmd y
    Migration vsn list   samcmd  version HH:MM:SS month day year
    Status:  Run  Vsns:2 src:1 dest:1 maxcopy:2
    ord m ty vsn     start time  end time    status  Inodes done/tot    bytes
      0 S li VOLa01  10/16 12:12 10/16 12:14 complete    35023/35023   550.00M
      0 D li VOLa80  10/16 12:12 10/16 12:14 avail                     550.00M
    

    In the second example, the job end time for the source and destination volumes is none. The source volume is still being copied to the destination volume. So a migration job is still running:

    root@solaris:~# samcmd y
    Migration vsn list   samcmd  version HH:MM:SS month day year
    Status:  Run  Vsns:2 src:1 dest:1 maxcopy:2
    ord m ty vsn     start time  end time  status  Inodes done/tot    bytes
      0 S li VOLa02  10/16 12:12 none      copy            0/35023   164.50M
      0 D li VOLa81  10/16 12:12 none      copy                      148.75M
    
  4. Finally, check the volume listings in the library catalog. Use the command samcmd v. Look for the following flags in the output:

    • R means that the volume is read-only. When migration starts, source volumes are marked read-only.

    • S (source) means that data is still being copied from this volume.

    • D (destination) means that data is still being copied to this volume.

    • m means that the source volume has finished migration.

    • e means that the source volume failed to migrate due to an error.

    In the example, volume VOLa01 migrated to VOLa80 successfully. Volume VOLa02 is still migrating to VOLa81. Volume VOLa03 failed to migrate.

    root@solaris:~# samcmd v
    Robot catalog   samcmd  version HH:MM:SS month day year
    Robot VSN catalog by slot       : eq 800
    slot          access time count use flags         ty vsn
    count 64
       0     2015/06/29 17:00    1  95%  -il---b--Rm-  li VOLa01 
       1     2015/07/02 17:43    2  89%  -il-o-b--RS-  li VOLa02
       2     2015/07/02 18:31    2  89%  -il-o-b--Re-  li VOLa03
       ... 
       51    2015/10/16 15:18    2    82%  -il-o-b-----  li VOLa80 
       52    2015/10/16 15:25    2    84%  -il-o-b---D-  li VOLa81 
    
  5. If jobs are running, wait for them to complete.

  6. Otherwise, once you are sure that no migrations are already underway, migrate volumes.

Migrate Volumes

The instructions in this section describe entering commands at the shell prompt using the samcmd command. Note that all of these commands can also be entered from the samu interface, in the form :command, where command is the name of the command.

  1. If you are not currently logged in to the Oracle HSM metadata server as root, log in now.

    root@solaris:~# 
    
  2. Make sure that the source file system is mounted.

  3. Activate the migrationd.cmd file. Use the command samcmd migconfig.

    If configuration is successful, the command displays the message Configuring migration and refers you to the log file for details:

    root@solaris:~# samcmd migconfig
    samcmd: migconfig: Configuring migration (see /var/opt/SUNWsamfs/sammig/logfile)
    root@solaris:~# 
    

    Otherwise, the command stops with an error. Either you have failed to stop the migration process before issuing the configuration command, or you have stopped migration while a tape volume is still waiting to migrate:

    root@solaris:~# samcmd migconfig
    samcmd: migconfig: Can't configure migration, migration status is not stop, or migration job is pending
    root@solaris:~# 
    
  4. Display the migration configuration. Use the command samcmd y.

    If configuration was successful, the specified volumes are listed, the source volume's status is sched_wait (scheduled, waiting), and the destination volume's status is avail (available). In the example, configuration has succeeded:

    root@solaris:~# samcmd y
    Migration vsn list   samcmd  version HH:MM:SS month day year
    samcmd on hsm61sol
    Status: Stop: Waiting for :migstart  Vsns:2 src:1 dest:1 maxcopy:2
    ord m ty vsn    start time  end time    status  Inodes done/tot      bytes
      0 S li VOL001 none        none        sched_wait        0/0            0
      0 D li VOL501 none        none        avail                            0
    
  5. If configuration was successful, start migration. Use the command samcmd migstart.

    root@solaris:~# samcmd migstart
    samcmd: migstart: State changed to start
    root@solaris:~# 
    
  6. Check the status of migration. Use the commands samcmd x and samcmd y.

    In the example, migration is just starting. The Migration status screen shows that the job status is now Run, 1 copy is underway using s (server) copy mode (cmod), the copy is 0% complete, 0 inodes have been updated, and the source volume is still Loading:

    root@solaris:~# samcmd x
    Migration status    samcmd  version HH:MM:SS month day year
    Source Vsns - wait:  0 fsscan: 0 copy: 1 update ino: 0 log: 0 done:  0
    Status: Run
    source      dest              cmod perc status
    li VOL001 li VOL501 s        0%   Loading li.VOL001
    

    The Migration vsn list screen shows that 2 volumes are currently being processed, 1 source and 1 destination. The status of both volumes is now copy to show that the source is being copied to the destination. At this point, 0 bytes have been copied from the source to the destination, and none of the 35023 inodes have been updated:

    root@solaris:~# samcmd y
    Migration vsn list   samcmd  version HH:MM:SS month day year
    Status: Run  Vsns:2 src:1 dest:1 maxcopy:2
    ord m ty vsn     start time  end time  status  Inodes done/tot    bytes
      0 S li VOL001  10/16 12:12 none      copy            0/35023        0
      0 D li VOL501  10/16 12:12 none      copy                           0
    
  7. If you need to monitor I/O performance when using the StorageTek Direct Copy migration method, do so from the Storage Area Network switch that connects the drives.

    StorageTek Direct Copy bypasses the server host and operating system. So familiar tools like iostat cannot monitor the I/O.

  8. Recheck the status of migration periodically, again using the commands samcmd x and samcmd y.

    In the example, the Migration status screen shows that the copy is now 24% complete, and 560 (0x00000230) tape blocks have been read from the source:

    root@solaris:~# samcmd x
    Migration status    samcmd  version HH:MM:SS month day year
    Source Vsns - wait:  0 fsscan: 0 copy: 1 update  ino: 0 log: 0 done:  0
    Status:  Run
    source    dest        cmod perc status
    li VOL001 li VOL501 s        24% 0x00000230 blocks read
    

    The Migration vsn list screen shows that 164.50 megabytes have been read from the source volume and 148.75 megabytes have been written to the destination volume:

    root@solaris:~# samcmd y
    Migration vsn list   samcmd  version HH:MM:SS month day year
    Status:  Run  Vsns:2 src:1 dest:1 maxcopy:2
    ord m ty vsn     start time  end time  status  Inodes done/tot    bytes
      0 S li VOL001  10/16 12:12 none      copy            0/35023   164.50M
      0 D li VOL501  10/16 12:12 none      copy                      148.75M
    
  9. When the migration completes, check the ending status. Use the commands samcmd x and samcmd y and examine the migration log file:

    In the example, source and destination volumes are no longer listed on the Migration status screen, which now shows that 1 copy is done. Note that the migration status is still Run and will remain so until we enter the migidle or migstop commands:

    root@solaris:~# samcmd x
    Migration status    samcmd  version HH:MM:SS month day year
    Source Vsns - wait:  0 fsscan: 0 copy: 0 update ino: 0 log: 0 done:  1
    Status: Run
    source    dest    cmod perc status
    

    The Migration vsn list screen shows that 550.00 megabytes have been read from the source volume and 550.00 megabytes have been written to the destination volume. All 35023 inodes have been updated to reflect the new locations of the migrated archive copies:

    root@solaris:~# samcmd y
    Migration vsn list   samcmd  version HH:MM:SS month day year
    Status:  Run  Vsns:2 src:1 dest:1 maxcopy:2
    ord m ty vsn     start time  end   time  status  Inodes done/tot    bytes
      0 S li VOL001  10/16 12:12 10/16 12:14 complete    35023/35023   550.00M
      0 D li VOL501  10/16 12:12 10/16 12:14 avail                     550.00M
    

    The migration daemon's log file lists each stage of the migration and concludes with a summary. In the example, we use the Solaris tail command to view the most recent entries:

    root@solaris:~# tail /var/opt/SUNWsamfs/sammig/logfile
    date time Info: Schedule: Create VsnList file.
    date time Info: Schedule: VsnList file created, source: 1, destination: 1.
    date time Info: Schedule: Migration status changed to Start.
    date time Info: 'li.VOL001' Filesystem scan: Started
    date time Info: 'li.VOL001' Filesystem scan: Completed, total copy bytes: 517.2M, inodes: 35023, multi vsn copy: 0, removable-media file: 0, obsolete copy: 0
    date time Info: 'li.VOL001' Copy: Started, pid: 2459 destination 'li.VOL501'
    date time Info: 'li.VOL001' Copy: Mode - server copy
    date time Info: 'li.VOL001' Copy: Server copy started from position 0x4.
    date time Info: 'li.VOL001' Copy: Tar header check started from position 0x4.
    date time Info: 'li.VOL001' Copy: Tar header check succeeded, 5 inodes checked, 0 tar header error found.
    date time Info: 'li.VOL001' Copy: Completed, pid: 2459, exit status: 12, signal: 0
    date time Info: 'li.VOL001' Update inode: Started, source position: 0
    date time Info: 'li.VOL001' Update inode: Completed.
    date time Info: 'li.VOL001' Log: Started, source position: 0
    date time Info: 'li.VOL001' Log: Completed.
    date time Summ: 'li.VOL001'
    date time Summ: 'li.VOL001' =============== Summary ===============
    date time Summ: 'li.VOL001' Status:    Complete
    date time Summ: 'li.VOL001' Copy mode: Server copy
    date time Summ: 'li.VOL001' Start at:  date time
    date time Summ: 'li.VOL001' End at:    date time
    date time Summ: 'li.VOL001' Bytes:     550.00M
    date time Summ: 'li.VOL001' Archive copies:                   35023
    date time Summ: 'li.VOL001' Read error copies:                    0
    date time Summ: 'li.VOL001' Multi vsn copies:                     0
    date time Summ: 'li.VOL001' Removable-Media file:                 0
    date time Summ: 'li.VOL001' ---Dest---   ---Bytes---   ---Copies---
    date time Summ: 'li.VOL001' li VOL501        550.00M          35023
    root@solaris:~# 
    
  10. Finally, make sure that you copy the volume-migration logs to a secure location.

    These logs record the destination volume and starting position for each archive file copy that was migrated off of each source volume. This information is critical when you need to recover files or file systems. So Oracle strongly recommends that you keep backup copies of these files with your recovery point and archiver log files, as described in Chapter 7, "Backing Up the Configuration and File Systems" and in the corresponding chapter of the Oracle Hierarchical Storage Manager and StorageTek QFS Installation and Configuration Guide.

    The migration daemon creates migration log files in the directory that you specified in the migrationd.cmd file. For each volume migrated, it creates a file named media_type.VSN where:

    • media_type is a two-letter code that identifies the kind of source media (see Appendix A, "Glossary of Equipment Types" for details).

    • VSN is the unique volume serial number that identifies the source volume.

    In the example, we copy the volume logs from the specified log directory, /var/adm/hsm_migration_logs/, to the directory where we keep the file-system recovery resources on an NFS-mounted remote file system:

    root@solaris:~# ls /var/adm/hsm_migration_logs/
    li.VOL001  li.VOL002  li.VOL003  li.VOL004  li.VOL005  li.VOL006 ... ti.801 ...
    root@solaris:~# cp /var/adm/hsm_migration_logs/*.* /zfs/recover/hsmfs1/2015mig/
    
  11. If you have moved data off of media without creating an additional archive copy, dispose of the old media according to your requirements.

    See "Disposing of Old Media After Migration".

  12. Stop here. Migration is complete.

Staging Files and Rearchiving to Replacement Media

To migrate archive files from old to new media using the staging and rearchiving method, you need to identify the files that will migrate, stage them to the disk cache, and then write them to the new media, without interfering with normal file-system operations. The following sections cover each stage of the process.

Estimate The Resources Available for Staging and Re-Archiving

The details of the staging and rearchiving process depend largely on two factors: the amount of disk storage available and the number of removable media drives available. During media migration, the Oracle HSM stager loads the old removable volumes into drives that can read the old media format and restores archived files to the disk cache. Then the Oracle HSM archiver rearchives the files to new removable volumes using drives that can write the new media format. So, ideally, you would stage all the files on any given tape volume to disk at once and then immediately archive them to new media.

To do this, you would have to dedicate significant resources for the duration of the migration:

  • disk space equivalent to the capacity of a full tape

  • exclusive use of a drive that reads the old tape format

  • exclusive use of a drive that writes the new format.

The above is not a problem if you can quiesce the file system until migration is complete. But migrating data in a production setting, without unduly interfering with ongoing file system and archive operations, requires some thought. If disk space or tape drives are in short supply, you need to identify the resources that you can reasonably spare for migration and then adjust the migration process. So proceed as follows:

  1. Estimate the amount of disk cache that you can use for migration without impeding normal file-system operations (see the status-check sketch following this list).

  2. Estimate the number of tape drives that you can afford to dedicate to migration.

    If only a limited number of tape drives are available, plan on throttling the staging and archiving processes so that the migration process does not impede normal operations.

  3. Based on your estimates above, decide on staging and archiving parameters. Determine the maximum number of migrating files that the available disk space will hold at any one time and the maximum rate at which files can be moved out of cache and on to new media.

  4. Once you have estimated the resources, plan for the post-migration disposition of the old media.
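
For steps 1 and 2 above, standard Solaris and Oracle HSM status displays are usually sufficient. The following sketch, which assumes that the archiving file system from the examples in this chapter is mounted at /hsm/hsmfs1, checks free disk-cache space with df and checks drive availability with the samcmd s (device status) display:

    root@solaris:~# df -h /hsm/hsmfs1
    root@solaris:~# samcmd s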

Configure an Archiving Process to Use the New Media

Add the new media to the archiver.cmd file and modify the archive copy directives so that one copy is always made using the new media.

  1. Open the /etc/opt/SUNWsamfs/archiver.cmd file in a text editor.

    The archiving policy specifies two copies, both of which are written to the media type that we want to replace. In the example, we open the file in the vi editor. We want to replace the DLT cartridges (type lt):

    root@solaris: vi /etc/opt/SUNWsamfs/archiver.cmd
    # =============================================
    # /etc/opt/SUNWsamfs/archiver.cmd
    # ---------------------------------------------
    ...
    # ---------------------------------------------
    # VSN Directives
    vsns
    allfiles.1 lt .*
    allfiles.2 lt .*
    endvsns
    
  2. In the directives for copy 2, change the specified media type to the identifier for the new media, save the file, and close the text editor.

    In the example, we want to migrate data from the old DLT tapes to new LTO cartridges. So, in copy 2, we change the old media type, lt (DLT), to li (LTO):

    root@solaris: vi /etc/opt/SUNWsamfs/archiver.cmd
    # =============================================
    # /etc/opt/SUNWsamfs/archiver.cmd
    # ---------------------------------------------
    ...
    # ---------------------------------------------
    # VSN Directives
    vsns
    allfiles.1 lt .*
    allfiles.2 li .*
    endvsns
    :wq
    root@solaris:~# 
    
  3. Check the archiver.cmd file for syntax errors. Run the command archiver -lv, and correct errors until no errors are found.

    The archiver -lv command prints the file line by line. If it encounters an error, it stops at the point where the error occurred.

    root@solaris:~# archiver -lv
    Reading '/etc/opt/SUNWsamfs/archiver.cmd'.
    1: # =============================================
    2: # /etc/opt/SUNWsamfs/archiver.cmd
    3: # ---------------------------------------------
    4: # Global Directives
    5: logfile = /var/opt/SUNWsamfs/archiver.log
    6: # ---------------------------------------------
    7: # File System Directives:
    8: fs = samqfsms
    9: all .
    10: 1 5m ...
    root@solaris:~# 
    
  4. Once the modified archiver.cmd file is error-free, load it into the current configuration using the command samd config:

    root@solaris:~# samd config
    Configuring SAM-FS
    root@solaris:~# 
    
  5. Next, migrate data from cartridge to cartridge.

Migrate the Data to the Replacement Media

The staging and archiving method for migrating data uses sfind, the Oracle HSM extension of the GNU find command. The sfind command is used to locate files on a specified tape volume and launch the stage and rearchive commands against all files found.

If you are unfamiliar with the sfind, stage, and/or rearchive commands, you should review their respective man pages now. Then, for each tape cartridge that holds data that must be migrated, proceed as follows:

Migrate Data from One Cartridge to Another

  1. Log in to the file-system host as root.

    root@solaris:~# 
    
  2. Move to the mount-point directory of the file system that holds the files that you are migrating.

    In the example, we are migrating archived copies of files stored in the hsmfs1 file system mounted at /hsm/hsmfs1:

    root@solaris:~# cd /hsm/hsmfs1
    root@solaris:~# 
    
  3. Select a tape volume.

    When migrating data from media type to media type, work with one volume at a time. In the examples below, we work with volume serial number VOL008.

  4. First, search the selected volume for damaged files that cannot be successfully staged. Use the Oracle HSM command sfind . -vsn volume-serial-number -damaged, where volume-serial-number is the alphanumeric string that uniquely identifies the volume in the library.

    In the example, we start the search from the current working directory (.). The -vsn parameter limits the search to files that are found on our current tape, VOL008. The -damaged flag limits the search to files that cannot be successfully staged:

    root@solaris:~# sfind . -vsn VOL008 -damaged 
    
  5. If the sfind search for damaged files returns any results, try to fix the file. Use the command undamage -m media-type -vsn volume-serial-number file, where:

    • media-type is one of the two-character media type codes listed in Appendix A.

    • volume-serial-number is the alphanumeric string that uniquely identifies the volume.

    • file is the path and name of the damaged file.

    Sometimes a transitory I/O error causes a copy to be marked damaged. The Oracle HSM undamage command clears this condition. In the example, the archive file copy /hsm/hsmfs1/data0008/20131025DAT is reported damaged. So we undamage it, and retry the search for damaged files:

    root@solaris:~# sfind . -vsn VOL008 -damaged
    /hsm/hsmfs1/data0008/20131025DAT
    root@solaris:~# undamage -m lt -vsn VOL008 /hsm/hsmfs1/data0008/20131025DAT
    root@solaris:~# sfind . -vsn VOL008 -damaged 
    
  6. If the sfind command again lists the file as damaged, the copy is unusable. See if the archive contains another, undamaged copy of the file. To list the available copies, use the command sls -D file, where file is the path and name of the file. To check the status of any copies found, use the command sfind file -vsn volume-serial-number.

    In the example, the undamage command could not fix the copy. So we use sls to list all copies of the file /hsm/hsmfs1/data0008/20131025DAT:

    root@solaris:~# undamage -m lt -vsn VOL008 /hsm/hsmfs1/data0008/20131025DAT
    root@solaris:~# sfind . -vsn VOL008 -damaged
    /hsm/hsmfs1/data0008/20131025DAT
    root@solaris:~# sls -D /hsm/hsmfs1/data0008/20131025DAT
    20131025DAT:
    mode: -rw-r--r--  links:   1  owner: root      group: other
                length:    319279  admin id:      7  inode: 1407.5
                project: system(0)
                offline;  archdone;  stage -n;
                copy 1: ---- May 21 07:12     1e4b1.1    lt VOL008
                copy 2: ---- May 21 10:29     109c6.1    lt VOL022
    ...
    

    Tape volume VOL022 holds a second copy of the file. So we check the second copy with sfind:

    root@solaris:~# sfind /hsm/hsmfs1/data0008/20131025DAT -vsn VOL022 -damaged
    
  7. If a copy is unusable and one undamaged copy of the file exists, rearchive the file. Then, once the archive holds two good copies, unarchive the damaged copy.

    In the example, copy 1 of file /hsm/hsmfs1/data0008/20131025DAT on volume VOL008 is unusable, but the sfind command did not find damage to copy 2. So we issue the archive command with the -c option to create a valid copy 1 before unarchiving the damaged copy on volume VOL008:

    root@solaris:~# sfind /hsm/hsmfs1/data0008/20131025DAT -vsn VOL022 -damaged
    root@solaris:~# archive -c 1 /hsm/hsmfs1/data0008/20131025DAT
    ...
    root@solaris:~# unarchive -m lt -vsn VOL008 /hsm/hsmfs1/data0008/20131025DAT
    
  8. If no usable copies exist, see if the file is resident in cache. Use the command sfind . -vsn volume-serial-number -online.

    In the example, both copy 1 on volume VOL008 and copy 2 on volume VOL022 are damaged and unusable. So we see if the file is available online, in the disk cache:

    root@solaris:~# undamage -m lt -vsn VOL008 /hsm/hsmfs1/data0008/20131025DAT
    root@solaris:~# sfind . -vsn VOL008 -damaged
    /hsm/hsmfs1/data0008/20131025DAT
    root@solaris:~# undamage -m lt -vsn VOL022 /hsm/hsmfs1/data0008/20131025DAT
    root@solaris:~# sfind /hsm/hsmfs1/data0008/20131025DAT -vsn VOL022 -damaged
    /hsm/hsmfs1/data0008/20131025DAT
    root@solaris:~# sfind /hsm/hsmfs1/data0008/20131025DAT -online
    
  9. If no usable copies exist, but the file is resident in cache, archive the file. Then, once the archive holds two good copies, unarchive the damaged copy.

    In the example, both copy 1 on volume VOL008 and copy 2 on volume VOL022 are unusable, so we issue the archive command to create two valid copies before unarchiving the damaged copy on volume VOL008:

    root@solaris:~# undamage -m lt -vsn VOL008 /hsm/hsmfs1/data0008/20131025DAT
    root@solaris:~# sfind . -vsn VOL008 -damaged
    /hsm/hsmfs1/data0008/20131025DAT
    root@solaris:~# undamage -m lt -vsn VOL022 /hsm/hsmfs1/data0008/20131025DAT
    root@solaris:~# sfind /hsm/hsmfs1/data0008/20131025DAT -vsn VOL022 -damaged
    /hsm/hsmfs1/data0008/20131025DAT
    root@solaris:~# sfind /hsm/hsmfs1/data0008/20131025DAT -online
    /hsm/hsmfs1/data0008/20131025DAT
    root@solaris:~# archive /hsm/hsmfs1/data0008/20131025DAT
    root@solaris:~# unarchive -m lt -vsn VOL008 /hsm/hsmfs1/data0008/20131025DAT
    
  10. If no usable copies exist and if the file is not resident in the disk cache, the data has probably been lost. If the data is critical, consult a specialist data recovery firm for assistance. Otherwise, unarchive the damaged copy.

    In the example, both copy 1 on volume VOL008 and copy 2 on volume VOL022 are unusable. The sfind command could not find the file in the disk cache. The data is not critical. So we unarchive the damaged copy on volume VOL008:

    root@solaris:~# undamage -m lt -vsn VOL008 /hsm/hsmfs1/data0008/20131025DAT
    root@solaris:~# sfind . -vsn VOL008 -damaged
    /hsm/hsmfs1/data0008/20131025DAT
    root@solaris:~# undamage -m lt -vsn VOL022 /hsm/hsmfs1/data0008/20131025DAT
    root@solaris:~# sfind /hsm/hsmfs1/data0008/20131025DAT -vsn VOL022 -damaged
    /hsm/hsmfs1/data0008/20131025DAT
    root@solaris:~# sfind /hsm/hsmfs1/data0008/20131025DAT -online
    root@solaris:~# unarchive -m lt -vsn VOL008 /hsm/hsmfs1/data0008/20131025DAT
    
  11. When the sfind search for damaged files returns no results, stage the files from the current tape to the disk cache. Use the command sfind . -vsn volume-serial-number -offline -exec stage {}\;

    The -vsn parameter limits the search to files that are found on the current tape (always migrate data one tape at a time).

    The -offline parameter further restricts the sfind output to files that are not already resident in cache, so that data is not overwritten.

    The -exec stage {}\; argument takes each path and file name that sfind returns and uses it as the argument to an Oracle HSM stage command. The stage command then restores the specified file to disk cache. The process repeats until all eligible files have been staged.

    In the example, the sfind -vsn VOL008 -damaged command returns no output. So we use sfind to stage all files that are found on VOL008 and are not already in cache:

    root@solaris:~# sfind . -vsn VOL008 -damaged
    root@solaris:~# sfind . -vsn VOL008 -offline -exec stage {}\;
    
  12. Once files have been staged from tape, selectively rearchive them. Use the command sfind . -vsn volume-serial-number -online -exec rearch -r -m media-type {}\; where media-type is the type of media from which you are migrating.

    The -vsn parameter limits the search to files that are also found on the current tape (always migrate data one tape at a time).

    The -online parameter further restricts the sfind output to files that are resident in cache, so that data is not overwritten.

    The -exec rearch -r -m media-type {}\; argument takes each path and file name that sfind returns and uses it as the argument to an Oracle HSM rearch -r -m media-type command. The -r argument runs the process recursively through subdirectories. The -m argument rearchives only files that reside on the source media.

    In the example, the -vsn parameter value is VOL008, and the value of the -m parameter specifies lt, for DLT media:

    root@solaris:~# sfind . -vsn VOL008 -online -exec rearch -r -m lt {}\;
    
  13. Repeat the preceding step until the sfind search finds no more files.

  14. When all files have been rearchived, dispose of the old tape as planned.

  15. Repeat this procedure until data has been migrated from all old media to new media.

Disposing of Old Media After Migration

Once you complete a migration, the old media do not necessarily lose all value. So carefully consider how you should dispose of them.

  • At a minimum, retain the old media until you have accumulated enough new recovery point files to recover any file in the file system using only the new, replacement media.

  • If storage space allows, keep the old media indefinitely. As long as compatible drives remain available, the old media can be an invaluable backup and recovery resource.

  • If library space is at a premium, export the old media and retain them in off-site storage (see the sketch following this list).

  • If the old media can be reused and if you are sure that the data they contain are no longer useful, relabel the old volumes. For example, you could relabel the media for an older Oracle StorageTek T10000C drive and use it with a newer T10000D drive.

  • Otherwise, if neither the data on the old volumes nor the media have any remaining value, export the volumes from the library and dispose of them appropriately.
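
For example, exporting an old volume for off-site storage, or relabeling a reusable cartridge whose data is no longer needed, might look like the sketch below. The library equipment number and slots are hypothetical, VOL304 is one of the source volumes from the earlier examples, and the option forms shown should be verified against the samexport and tplabel man pages before use:

    root@solaris:~# samexport 20:11
    root@solaris:~# tplabel -old VOL304 -vsn VOL304 20:12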