Oracle® Hierarchical Storage Manager and StorageTek QFS Software Installation and Configuration Guide
Release 6.0
E78137-01

1 Deploying Oracle HSM Solutions

Deploying Oracle Hierarchical Storage Manager and StorageTek QFS Software (Oracle HSM) is a straightforward process: you install software packages, edit a few configuration files, run a few commands, and then mount and use the new file system(s). Nonetheless, Oracle HSM offers a wide range of options and tuning parameters. These extra features let you address almost any special need, but unneeded features also complicate deployment and make the resulting solution less satisfactory than it should be.

Accordingly, this document is designed to guide you through an Oracle HSM deployment that closely follows your detailed solution requirements. It starts with the workings, installation, and configuration of basic QFS and Oracle HSM file systems. These file systems will either meet all of your requirements on their own or form the foundation for a more specialized solution. Once the basics are in place, you can branch to procedures for configuring additional features that support particular environments and specialized business needs.

Throughout the planning and deployment process, remember that QFS and Oracle HSM are designed to hide the complexities of performance optimization, data protection, and archiving behind a simple, UNIX file-system interface. Users, applications, and, for the most part, administrators should be able to treat a fully optimized Oracle HSM archiving system that is implemented on a mix of disk arrays and tape libraries as if it were an ordinary UFS file system on a single, local disk. Once installed and configured, Oracle HSM software should automatically manage your data and storage resources in the most efficient and reliable way possible, with minimal human intervention. Over-complex implementations and overzealous micromanagement of file systems and storage resources thus undermine key goals of an Oracle HSM deployment, while often impairing performance, capacity utilization, and data protection.

The remainder of this introduction provides brief, descriptive overviews of QFS File Systems and Oracle HSM Archiving File Systems. Basic familiarity with this information makes it easier to understand the purpose of subsequent configuration steps.

QFS File Systems

QFS file systems let you combine fully optimized, custom storage solutions with standard UNIX interfaces. Internally, they manage physical storage devices to meet exacting, often highly specialized performance requirements. But they present themselves to external users, applications, and the operating system as ordinary UNIX file systems. So you can meet specialized performance and data-protection requirements using a complex range of storage hardware and still ensure straightforward integration with existing applications and business processes.

QFS file systems manage their own physical storage using an integral QFS volume manager. QFS software organizes standard physical storage devices into highly optimized logical devices that remain fully compatible with standard interfaces. The software encapsulates special features and customizations so that they remain hidden from the operating system and applications. To the operating system and applications, the QFS software presents a logical family-set device that processes I/O requests like a single disk, via standard Solaris device drivers. This combination of standards compliance and tunability distinguishes QFS from other UNIX file systems.

The remainder of this section starts with a brief discussion of QFS Defaults and I/O Performance-Tuning Objectives and then describes the core tools that let you control the I/O behavior of the file systems that you create.

QFS Defaults and I/O Performance-Tuning Objectives

Disk I/O (input/output) involves CPU-intensive operating system requests and time-consuming mechanical processes. So I/O performance tuning focuses on minimizing I/O-related system overhead and keeping the mechanical work to the absolute minimum necessary for transferring a given amount of data. This means reducing both the number of separate I/Os per data transfer (and thus the number of operations that the CPU performs) and the amount of seeking (repositioning of read/write heads) during each individual I/O. The basic objectives of I/O tuning are thus as follows:

  • Read and write data in blocks that divide into the average file size evenly.

  • Read and write large blocks of data.

  • Write blocks in units that align on the 512-byte sector boundaries of the underlying media, so that the disk controller does not have to read and modify existing data before writing new data.

  • Queue up small I/Os in cache and write larger, combined I/Os to disk.

Oracle HSM default settings provide the best overall performance for the range of applications and usage patterns typical of most general-purpose file systems. But when necessary, you can adjust the default behavior to better suit the types of I/O that your applications produce. You can specify the size of the minimum contiguous read or write. You can optimize the way in which files are stored on devices. You can choose between file systems optimized for general use and file systems optimized for high performance.

Disk Allocation Units and Logical Device Types

File systems allocate disk storage in blocks of uniform size. This size—the disk allocation unit (DAU)—determines the minimum amount of contiguous space that each I/O operation consumes, regardless of the amount of data written, and the minimum number of I/O operations needed when transferring a file of a given size. If the block size is too large compared to the average size of the files, disk space is wasted. If the block size is too small, each file transfer requires more I/O operations, hurting performance. So I/O performance and storage efficiency are at their highest when file sizes are even multiples of the basic block size.

For this reason, the QFS software supports a range of configurable DAU sizes. When you create a QFS file system, you first determine the average size of the data files that you need to access and store. Then you specify the DAU that divides most evenly into the average file size.

You start by selecting the QFS device type that is best suited to your data. There are three types:

  • md devices

  • mr devices

  • gXXX striped-group devices (where XXX is an integer in the range [0-127]).

When file systems will contain a predominance of small files or a mix of small and large files, md devices are usually the best choice. The md device type uses a flexible, dual-allocation scheme. When writing a file to the device, the file system uses a small DAU of 4 kilobytes for the first eight writes. Then it writes any remaining data using a large, user-selected DAU of 16, 32, or 64 kilobytes. Small files are thus written in suitably small blocks, while larger files are written in larger blocks tailored to their average size.

When file systems will contain a predominance of large and/or uniformly sized files, mr devices may be a better choice. The mr device type uses a DAU that is adjustable in increments of 8 kilobytes within the range [8-65528] kilobytes. Files are written in large, uniform blocks that closely approximate the average file size, thus minimizing read-modify-write overhead and maximizing performance.

Striped groups are aggregates of up to 128 devices that are treated as a single logical device. Like the mr device type, striped groups use a DAU that is adjustable in increments of 8 kilobytes within the range [8-65528] kilobytes. The file system writes data to the members of a striped group in parallel, one DAU per disk. So the aggregate write can be very large. This makes striped groups potentially useful in applications that must handle extremely large data files.
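For illustration only, the following sammkfs commands sketch how a DAU might be selected when a file system is initialized. The family-set names qfsms and qfsma are placeholders, the values shown are examples rather than recommendations, and the family sets must already be defined in the mcf file.

  # md devices: choose the large DAU (16, 32, or 64 kilobytes);
  # the 4-kilobyte small DAU is applied automatically to the first
  # eight writes of each file.
  sammkfs -a 64 qfsms

  # mr devices or striped groups: choose any multiple of 8 kilobytes
  # in the range 8-65528 kilobytes, sized to suit the average file.
  sammkfs -a 256 qfsma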

File Allocation Methods

By default, unshared QFS file systems use striped allocation, and shared file systems use round-robin allocation. But you can change the allocation method when necessary. Each approach has advantages in some situations.

Striped Allocation

When striped allocation has been specified, the file system allocates space in parallel, on all available devices. The file system segments the data file and writes one segment to each device. The amount of data written to each device in a pass is determined by the stripe width, the number of DAUs written per device, so the total written across the family set in each pass is the stripe width times the DAU size times the number of devices. Devices may be md disk devices, mr disk devices, or striped groups.

Striping generally increases performance because the file system reads multiple file segments concurrently rather than sequentially. Multiple I/O operations occur in parallel on separate devices, thus reducing the seek overhead per device.

However, striped allocation can produce significantly more seeking when multiple files are being written at once. Excessive seeking can seriously degrade performance, so you should consider round-robin allocation if you anticipate simultaneous I/O to multiple files.

Round-Robin Allocation

When round-robin allocation has been specified, the file system allocates storage space serially, one file at a time and one device at a time. The file system writes the file to the first device that has space available. If the file is larger than the space remaining on the device, the file system writes the overage to the next device that has space available. For each succeeding file, the file system moves to the next available device and repeats the process. When the last available device has been used, the file system starts over again with the first device. Devices may be md disk devices, mr disk devices, or striped groups.

Round-robin allocation can improve performance when applications perform I/O to multiple files simultaneously. It is also the default for shared QFS file systems (see "Accessing File Systems from Multiple Hosts Using Oracle HSM Software" and the mount_samfs man page for more information on shared file systems).
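The allocation method is selected with the stripe mount option. The following /etc/vfstab sketch assumes hypothetical file systems named hsmfs1 and hsmfs2 and illustrative mount points; stripe=0 selects round-robin allocation, while stripe=1 selects striped allocation with a stripe width of one DAU per device.

  #device     device    mount          FS      fsck   mount    mount
  #to mount   to fsck   point          type    pass   at boot  options
  hsmfs1      -         /hsm/hsmfs1    samfs   -      yes      stripe=1
  hsmfs2      -         /hsm/hsmfs2    samfs   -      yes      stripe=0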

Storage Allocation and Integrated Volume Management

Unlike UNIX file systems that address only one device or portion of a device, QFS file systems do their own volume management. Each file system internally handles the relationships between the devices that provide its physical storage and then presents the storage to the operating system as a single family set. I/O requests are made via standard Solaris device-driver interfaces, as with any UNIX file system.

File System Types

There are two types of QFS file system. Each has its own advantages:

General-Purpose ms File Systems

QFS ms file systems are the simplest to implement and are well suited for most common purposes. They store the file-system metadata with the file data, on the same dual-allocation, md disk devices. This approach simplifies hardware configuration and meets most needs.
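A minimal mcf (master configuration file) entry for an ms file system might look like the following sketch. The family-set name, equipment ordinals, and device paths are placeholders.

  # Equipment            Eq  Eq   Family  Device
  # Identifier           Ord Type Set     State
  qfsms                  100 ms   qfsms   on
  /dev/dsk/c1t3d0s3      101 md   qfsms   on
  /dev/dsk/c1t4d0s5      102 md   qfsms   on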

High-Performance ma File Systems

QFS ma file systems can improve data transfer rates in demanding applications. These file systems store metadata and data separately, on dedicated devices. Metadata is kept on mm devices, while data is kept on a set of md disk devices, mr disk devices, or striped groups. As a result, metadata updates do not contend with user and application I/O, and device configurations do not have to accommodate two different kinds of I/O workload. For example, you can place your metadata on RAID-10 mirrored disks for high redundancy and fast reads and keep the data on a more space-efficient, RAID-5 disk array.
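For comparison, an ma file system lists a dedicated mm metadata device alongside its mr (or striped-group) data devices in the mcf file. The family-set name, equipment ordinals, and device paths below are placeholders.

  # Equipment            Eq  Eq   Family  Device
  # Identifier           Ord Type Set     State
  qfsma                  200 ma   qfsma   on
  /dev/dsk/c2t0d0s0      201 mm   qfsma   on
  /dev/dsk/c2t1d0s0      202 mr   qfsma   on
  /dev/dsk/c2t2d0s0      203 mr   qfsma   on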

Oracle HSM Archiving File Systems

Archiving file systems combine one or more QFS ma- or ms-type file systems with archival storage and Oracle Hierarchical Storage Manager software. The Oracle HSM software copies files from the file system's disk cache to secondary disk storage and/or removable media. It manages the copies as an integral part of the file system. So the file system offers both continuous data protection and the ability to flexibly and efficiently store extremely large files that might otherwise be too expensive to store on disk or solid-state media.

A properly configured Oracle HSM file system provides continuous data protection without separate backup applications. The software automatically copies file data as files are created or changed, as specified in user-defined policies. Up to four copies can be maintained on a mix of disk and tape media, using both local and remote resources. The file-system metadata records the location of the file and all of its copies, and the software provides a range of tools for quickly locating copies. So lost or damaged files can be readily recovered from the archive. Yet backup copies are kept in a standard, POSIX-compliant tar (tape archive) format that lets you recover data even if the Oracle HSM software is not available.

Oracle HSM keeps file-system metadata consistent at all times by dynamically detecting and recovering from I/O errors. So you can bring a file system back up without time-consuming integrity checks, a major consideration when hundreds of thousands of files and petabytes of data are being stored. If the file-system metadata is stored on separate devices and only data-storage disks are involved in a failure, recovery is complete when replacement disks are configured into the file system. When users request files that resided on a failed disk, Oracle HSM automatically stages backup copies from tape to the replacement disk. If metadata is lost as well, an administrator can restore it from a samfsdump backup file using the samfsrestore command. Once the metadata has been restored, files can again be restored from tape as users request them. Since files are restored to disk only as requested, the recovery process makes efficient use of network bandwidth and has minimal impact on normal operations.
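As a rough illustration, the following commands sketch how a recovery point is created with samfsdump and how metadata is later restored with samfsrestore. The mount point and dump-file locations are examples only; see the File System Recovery Guide for the supported procedures.

  # Create a recovery point (metadata dump) for the file system
  # mounted at /hsm/hsmfs1 (example path); store the dump outside
  # the file system being dumped.
  cd /hsm/hsmfs1
  samfsdump -f /dumps/hsmfs1-recovery.dmp .

  # After re-creating a damaged file system, restore its metadata
  # from the recovery point; file data is then staged from archive
  # copies as users request the files.
  cd /hsm/hsmfs1
  samfsrestore -f /dumps/hsmfs1-recovery.dmp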

This ability to simultaneously manage files on high-performance primary disk or solid-state media and on lower-cost, higher-density secondary disk, tape, or optical media makes Oracle HSM file systems ideal for economically storing unusually large and/or little-used files. Very large, sequentially accessed data files, such as satellite images and video files, can be stored exclusively on magnetic tape. When users or applications access a file, the file system automatically stages it back to disk or reads it into memory directly from tape, depending on the chosen file configuration. Records that are retained primarily for historical or compliance purposes can be stored hierarchically, using the media that is best aligned with user access patterns and cost constraints at a given point in the life of the file. Initially, when users still occasionally access a file, you can archive it on lower-cost, secondary disk devices. As demand diminishes, you can maintain copies only on tape or optical media. Yet, when a user needs the data, in response to a legal discovery or regulatory process, for example, the file system can automatically stage the required material to primary disk with minimal delay, much as if it had been there all along.

For legal and regulatory purposes, Oracle HSM file systems can be WORM-enabled. WORM-enabled file systems support default and customizable file-retention periods, data and path immutability, and subdirectory inheritance of WORM settings. Long-term data integrity can be monitored using manual and/or automated media validation.

There are four basic Oracle HSM processes that manage and maintain archiving file systems:

Archiving

The archiving process copies files from a file system to archival media that are reserved for storing copies of active files. Archival media may include removable media volumes, such as magnetic tape cartridges, and/or one or more file systems that reside on magnetic disk or solid-state storage devices. Archival file copies may provide backup redundancy for the active files, long-term retention of inactive files, or some combination of both.

In an Oracle HSM archiving file system, the active, online files, the archival copies, and the associated storage resources form a single, logical resource, the archive set. Every active file in an archiving file system belongs to exactly one archive set. Each archive set may include up to four archival copies of each file plus policies that control the archiving process for that archive set.

The archiving process is managed by a UNIX daemon (service), sam-archiverd. The daemon schedules archiving activities and calls the processes that perform the required tasks: archiver, sam-arfind, and sam-arcopy.

The archiver process reads the archiving policies in an editable configuration file, archiver.cmd, and sets up the remaining archiving processes as specified. Directives in this file control the general behavior of the archiving processes, define the archive sets by file system, and specify the number of copies made and archival media used for each.
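The following archiver.cmd fragment sketches this general layout. The file-system name hsmfs1, the archive-set name allfiles, the log-file path, and the volume serial numbers are all placeholders.

  # Global directives
  logfile = /var/adm/archiver.log

  # Archive set definitions for the file system hsmfs1
  fs = hsmfs1
  allfiles .
      1 -norelease 15m
      2 24h

  # Media assignments: copy 1 to archival disk, copy 2 to LTO tape
  vsns
  allfiles.1 dk DISKVOL1
  allfiles.2 li VOL0[0-9][0-9]
  endvsns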

The sam-archiverd daemon then starts a sam-arfind process for each currently mounted file system. The sam-arfind process scans its assigned file system for new files, modified files, renamed files, and files that are to be re-archived or unarchived. By default, the process scans continuously for changes to files and directories, since this offers the best overall performance. But, if you must maintain compatibility with older, StorageTek Storage Archive Manager implementations, for instance, you can edit the archive set rules in the archiver.cmd file to schedule scanning using one of several methods (see the sam-archiverd man page for details).

Once it identifies candidate files, sam-arfind determines the archive set that defines the archiving policies for each file. The sam-arfind process identifies the archive set by comparing the file's attributes to the selection criteria defined by each archive set. These criteria might include one or more of the following file attributes:

  • the directory path to the file and, optionally, a regular expression that matches one or more of the candidate file names.

  • a specified user name that matches the owner of one or more candidate files.

  • a specified group name that matches the group associated with the file.

  • a specified minimum file size that is less than or equal to the size of the candidate file.

  • a specified maximum file size that is greater than or equal to the size of the candidate file.

Once it has located the correct archive set and the corresponding archiving parameters, sam-arfind checks whether the archive age of the file equals or exceeds the threshold specified by the archive set. The archive age of a file is the number of seconds that have elapsed since the file was created, last modified (the default), or last accessed. If the archive age meets the age criteria specified in the policy, sam-arfind adds the file to the archive request queue for the archive set and assigns it a priority. Priorities are based on rules specified in the archive set and on factors such as the number of archive copies that already exist, the size of the file, any outstanding operator requests, and any other operations that depend on the creation of an archive copy.
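The following archiver.cmd fragment sketches how such criteria combine with per-copy archive ages. The file-system name, paths, user name, sizes, and ages are placeholders, and the corresponding vsns assignments are omitted.

  fs = hsmfs1
  # Files under projects/ that are owned by user lab and are at
  # least 1 megabyte in size
  labdata projects -user lab -minsize 1M
      1 30m
      2 4h
  # Everything else in the file system
  allother .
      1 1h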

Once sam-arfind has identified the files that need archiving, prioritized them, and added them to archive requests for each archive set, it returns the requests to the sam-archiverd daemon. The daemon composes each archive request. It arranges data files into archive files that are sized so that media is efficiently utilized and files are efficiently written to and, later, recalled from removable media. The daemon honors any file-sorting parameters or media restrictions that you set in the archiver.cmd file (see the archiver.cmd man page for details), but be aware that restricting the software's ability to select media freely usually reduces performance and media utilization. Once the archive files have been assembled, sam-archiverd prioritizes the archive requests so that the copy process can transfer the largest number of files in the smallest number of mount operations (see the scheduling section of the sam-archiverd man page for details). Then sam-archiverd schedules copy operations so that, at any given time, they require no more than the maximum number of drives allowed by the archive set policies and/or the robotic library.
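Archive-set processing parameters of this kind are set in a params section of the archiver.cmd file. The following hedged fragment assumes an archive set named allfiles:

  params
  # Sort the files within each archive file by path, for all archive sets
  allsets -sort path
  # Allow the allfiles archive set to use up to two drives
  allfiles -drives 2
  endparams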

Once the archive requests are scheduled, sam-archiverd calls an instance of the sam-arcopy process for each archive request and drive scheduled. The sam-arcopy instances then copy the data files to archive files on archival media, update the archiving file system's metadata to reflect the existence of the new copies, and update the archive logs.

When the sam-arcopy process exits, the sam-archiverd daemon checks the archive request for errors or omissions caused by read errors from the cache disk, write errors to removable media volumes, and open, modified, or deleted files. If any files have not been archived, sam-archiverd recomposes the archive request.

The sam-arfind and sam-arcopy processes can use the syslog facility and archiver.sh to create a continuous record of archiving activity, warnings, and informational messages. The resulting archiver log contains valuable diagnostic and historical information, including a detailed record of the location and disposition of every copy of every file archived. So, during disaster recovery, for example, you can often use an archive log to recover missing data files that would otherwise be irrecoverable (for details, see the Oracle Hierarchical Storage Manager and StorageTek QFS Software File System Recovery Guide in the Customer Documentation Library). File-system administrators enable archiver logging and define log files using the logfile= directive in the archiver.cmd file. For more information about the log file, see the archiver.cmd man page.

Staging

The staging process copies file data from archival storage back into the primary disk cache. When an application tries to access an offline file (a file that is not currently available in primary storage), an archive copy is automatically staged, or copied back to primary disk. The application can then access the file quickly, even before all of the data has been written back to disk, because the read operation follows directly behind the staging operation. If a media error occurs or if a specific media volume is unavailable, the staging process automatically loads the next available archive copy, if any, using the first available device. Staging thus makes archival storage transparent to users and applications. All files appear to be available on disk at all times.

The default staging behavior is ideal for most file systems. However, you can alter the defaults by inserting or modifying directives in a configuration file, /etc/opt/SUNWsamfs/stager.cmd, and you can override these directives on a per-directory or per-file basis from the command line. To access small records from large files, for example, you might choose to access data directly from the archival media without staging the file. Or you might stage a group of related files whenever any one of them is staged, using the associative staging feature. See the stage and stager.cmd man pages for details.
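The following sketch shows both kinds of adjustment. The library family-set name, log-file path, and file names are placeholders.

  # /etc/opt/SUNWsamfs/stager.cmd: limit staging to two drives in
  # library lib1 and log staging activity
  drives = lib1 2
  logfile = /var/adm/stager.log

  # From the command line: read data directly from archive media
  # without staging the file
  stage -n /hsm/hsmfs1/archive/bigrecords.dat

  # Set associative staging on a group of related files
  stage -a /hsm/hsmfs1/project/*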

Releasing

The releasing process frees up primary disk cache space by deleting the online copies of previously archived files that are not currently in use. Once a file has been copied to archival media, such as a disk archive or tape volume, it can be staged when and if an application accesses it. So there is no need to retain it in the disk cache when space is needed for other files. By deleting unneeded copies from the disk cache, releasing ensures that primary cache storage is always available for newly created and actively used files, even if the file system grows without any corresponding increase in primary storage capacity.

Releasing occurs automatically when cache utilization exceeds the high-water mark and continues while utilization remains above the low-water mark, two configurable thresholds that you set when you mount an archiving file system. The high-water mark ensures that enough free space is always available, while the low-water mark ensures that a reasonable number of files remain available in cache and that media mount operations are thus kept to the minimum necessary. Typical values are 80% for the high-water mark and 70% for the low.
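The thresholds are set as mount options, either on the mount command line or in the /etc/vfstab entry for the file system. In the following sketch, the file-system name and mount point are placeholders:

  #device     device    mount          FS      fsck   mount    mount
  #to mount   to fsck   point          type    pass   at boot  options
  hsmfs1      -         /hsm/hsmfs1    samfs   -      yes      high=80,low=70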

The default, water-mark-driven releasing behavior is ideal for most file systems. However, you can alter the defaults by modifying or adding directives in a configuration file, /etc/opt/SUNWsamfs/releaser.cmd, and you can override them on a per-directory or per-file basis from the command line. You can, for example, partially release large, sequentially accessed files, so that applications can start reading the part of the file that is always retained on disk while the remainder stages from archival media. See the release and releaser.cmd man pages for details.
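For example, the release command can change the releasing attributes of individual files. The file names below are illustrative:

  # Retain the beginning of the file on disk when it is released;
  # the remainder stages from archival media when it is read
  release -p /hsm/hsmfs1/video/large-capture.mov

  # Release the file's disk space immediately (the file must
  # already have at least one archive copy)
  release -a /hsm/hsmfs1/video/old-capture.mov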

Recycling

The recycling process frees up space on archival media by deleting archive copies that are no longer in use. As users modify files, the archive copies associated with older versions of the files eventually expire. The recycler identifies the media volumes that hold the largest proportion of expired archive copies. If the expired files are stored on an archival disk volume, the recycler process deletes them. If the files reside on removable media, such as a tape volume, the recycler re-archives any unexpired copies that remain on the target volume to other media. It then calls an editable script, /etc/opt/SUNWsamfs/scripts/recycler.sh, to relabel the recycled volume, export it from the library, or perform some other, user-defined action.

By default, the recycling process does not run automatically. You can configure the Solaris crontab file to run it at a convenient time, or you can run it as needed from the command line using the command /opt/SUNWsamfs/sbin/sam-recycler. To modify default recycling parameters, edit the file /etc/opt/SUNWsamfs/archiver.cmd or create a separate /etc/opt/SUNWsamfs/recycler.cmd file. See the corresponding man pages for details.
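For example, a root crontab entry along the following lines runs the recycler nightly; the time chosen is arbitrary:

  # Run the Oracle HSM recycler every day at 3:30 a.m.
  30 3 * * * /opt/SUNWsamfs/sbin/sam-recycler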