Thus far, this document has discussed managing Oracle Hierarchical Storage Manager and StorageTek QFS Software solutions as ordinary UNIX file systems, where users and applications are regularly creating, modifying, and deleting files. The focus has been on the disk cache, with the archive serving primarily as a highly integrated backup service. In this chapter, we refocus on the archive as a repository and management solution for long-term data preservation. The previously covered management principles and techniques remain relevant. But now the disk cache serves primarily as a means of ingesting files into an archive that does not allow deletion or modification following ingestion.
Exact requirements vary. A repository that retains business or medical records for a legally mandated period may need to discard records periodically. But an archive storing scientific data, historical or genealogical records, or digital music, films, or television programs may need to store content, in effect, forever. For this reason, Oracle HSM supports digital preservation in several ways:
Message digests (checksums) let you detect damage, data corruption, and unauthorized modifications to files, so that you can correct hardware problems and replace unsound files with sound copies stored elsewhere in the archive.
File fixity attributes work with message digests to ensure that only the super user can alter files that have been fixed. Whenever Oracle HSM stages or archives a fixed file, it revalidates a checksum stored with the fixity attribute to prove that the file remains unchanged.
Oracle HSM Write Once Read Many (WORM) file systems let you make files read-only and enforce retention for a specified period. These file systems can be configured so that the super user cannot alter files or file attributes, such as the fixity attribute discussed above.
The chapter starts with a brief review of the basic Oracle HSM data-protection measures that form the foundation of any long-term storage solution. It then explains the tasks that specifically address data preservation: using message digests, enabling file fixity, and working with WORM file systems.
Every preservation solution starts with sound, highly redundant file systems. If you have not already done so, review the implementation chapters of the companion Oracle Hierarchical Storage Manager and StorageTek QFS Installation and Configuration Guide. Protect access to the archive by providing redundant servers, network connections, and storage devices. Protect file data by configuring at least two additional copies of each file, with each stored on independent media. Archiving one copy to disk or solid-state storage devices and two copies to tape media is ideal in most situations. When possible, ensure that tape blocks are correctly written and read by implementing the Oracle HSM Data Integrity Validation (DIV) feature. Protect file-system metadata by regularly generating dump files and by regularly backing up the archiving logs.
Message digests (checksums) let preservationists test archived files for changes that might indicate gradual deterioration, hardware or operator error, or deliberate, unauthorized alterations to the content. A message digest is simply a mathematical summary of a file's contents that has been generated by a one-way cryptographic hash function. Cryptographic hash functions are extremely sensitive to changes in their input data. Even small changes in the input produce large changes in the output. So message digests are ideal for detecting file corruption and unauthorized alterations. Recomputing a file's digest and comparing the resulting value to a stored digest value shows whether the file has changed.
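The sensitivity of digests to small input changes, and the recompute-and-compare check, can be illustrated with a minimal sketch that uses Python's standard hashlib module. The sample byte strings and the helper names are illustrative assumptions and are not part of Oracle HSM.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

def is_unchanged(data: bytes, stored_digest: str) -> bool:
    """Recompute the digest and compare it with a previously stored value."""
    return sha256_hex(data) == stored_digest

# Hypothetical content: the two inputs differ in a single character.
original = b"archived record, version 1"
altered = b"archived record, version 2"

stored = sha256_hex(original)

# Count how many of the 64 hexadecimal characters differ between the digests.
differing = sum(a != b for a, b in zip(stored, sha256_hex(altered)))

print(is_unchanged(original, stored))  # True
print(is_unchanged(altered, stored))   # False
print(f"{differing} of 64 hex characters differ")
```

A one-character change in the input typically alters most characters of the digest, which is why a simple equality test is enough to flag a modified file.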
Oracle Hierarchical Storage Manager file systems can ingest, create, store, and validate message digests using any of the following cryptographic hash functions:
SHA1, the 160-bit member of the Secure Hash Algorithm family of cryptographic functions
The Secure Hash Algorithms are defined in Federal Information Processing Standard (FIPS) Publication 180-4, National Institute of Standards and Technology (2012). Oracle HSM uses SHA1 by default.
SHA256, the 256-bit member of the Secure Hash Algorithm family
SHA384, the 384-bit member of the Secure Hash Algorithm family
SHA512, the 512-bit member of the Secure Hash Algorithm family.
MD5, the 128-bit Message Digest function defined by the Internet Engineering Task Force (IETF) in Request for Comment (RFC) 1321
A proprietary, 128-bit Oracle HSM function that is now mainly useful for backward compatibility with older Storage Archive Manager implementations.
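As a quick cross-check of the digest lengths listed above, the following sketch queries Python's standard hashlib module for each of the standard functions. The proprietary Oracle HSM function is not available in hashlib and is omitted; the algorithm names are spelled as hashlib expects them.

```python
import hashlib

# Standard functions from the list above, using hashlib's spelling.
ALGORITHMS = ["sha1", "sha256", "sha384", "sha512", "md5"]

def digest_bits(name: str) -> int:
    """Return the digest length, in bits, for the named hash function."""
    return hashlib.new(name).digest_size * 8

sizes = {name: digest_bits(name) for name in ALGORITHMS}

for name, bits in sizes.items():
    # Each hexadecimal character encodes 4 bits of the digest.
    print(f"{name}: {bits} bits, {bits // 4} hex characters")
```

The hexadecimal length is useful when checking that a digest supplied by a depositor is plausible for the algorithm that is claimed for it.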
Users can supply an existing digest value when a file is ingested into the repository, or they can have the file system compute one, either immediately or when the file is first archived. Oracle HSM file systems store digest values with the file system metadata, using a special file attribute. Once the attribute is set, the file system recomputes a digest and validates it against the stored value whenever the corresponding file is rearchived and, optionally, whenever the file is staged from archival media to the disk cache.
Note, however, that the Oracle HSM media migration feature copies files to new media without recalculating checksums (for information on media migration see Chapter 8, "Migrating to New Storage Media"). If a file is not copied correctly, there is thus a small risk that the corruption will not be detected until the file is restaged and validated. Using Data Integrity Validation (DIV) minimizes this risk (see the Oracle Hierarchical Storage Manager and StorageTek QFS Installation and Configuration Guide for details).
Before you start using message digests, you should first make sure that the host can handle the required calculations without undue reductions in host performance. You can then carry out the following tasks as needed:
supplying a pre-existing digest and enabling validation when a file is ingested
generating a digest and enabling validation when a file is ingested
generating a digest and enabling validation for each file in a directory
changing message digesting and validation attributes before a file is archived.
If you plan to make significant use of message digests, make sure that the file-system host has enough computing resources for adequate performance. Most modern platforms incorporate dedicated cryptographic hardware that can efficiently perform specialized calculations without consuming central processor cycles. Be sure to take advantage of these capabilities when available.
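One rough way to get a feel for hashing cost on a candidate host is to time a software implementation. The sketch below uses Python's hashlib and time modules; the function name and the 16 MiB sample size are arbitrary assumptions, and the figures it prints reflect the Python runtime's hash library, not the throughput that the file system itself will achieve.

```python
import hashlib
import time

def hash_throughput(algorithm: str, total_mib: int = 16) -> float:
    """Return approximate hashing throughput in MiB/s for one algorithm."""
    chunk = bytes(1024 * 1024)           # 1 MiB of zero bytes
    hasher = hashlib.new(algorithm)
    start = time.perf_counter()
    for _ in range(total_mib):
        hasher.update(chunk)
    hasher.digest()
    elapsed = time.perf_counter() - start
    return total_mib / elapsed

rates = {name: hash_throughput(name) for name in ("sha1", "sha256", "sha512")}

for name, rate in rates.items():
    print(f"{name}: about {rate:.0f} MiB/s")
```

Comparing the relative rates of the algorithms on the same host can help when choosing between, for example, SHA256 and SHA512 for large ingest workloads.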
To check the capabilities of a potential file-system host, proceed as follows:
Log in to the file system host as root:
root@mds1:~#
Make sure that the host operating system is Solaris 11.1 or higher. Use the command uname -v.
Earlier versions of the operating system do not support hardware acceleration of hash functions. In the example, the host operating system is Solaris 11.2:
root@mds1:~# uname -v
11.2
root@mds1:~#
Display the instruction set architecture. At the command prompt, enter the command isainfo -v:
root@mds1:~# isainfo -v
If the Solaris 11 host is an Oracle Sun SPARC T3 or later system, the output of the isainfo -v command should list instruction sets that support the sha512, sha256, sha1, and md5 cryptographic algorithms.
In the example, the SPARC host provides hardware acceleration for the SHA1, SHA2, and MD5 algorithm families:
root@mds1:~# isainfo -v
64-bit sparcv9 applications
crc32c cbcond pause mont mpmul sha512 sha256 sha1 md5 camellia kasumi
des aes ima hpc vis3 fmaf asi_blk_init vis2 vis popc
root@mds1:~#
If the Solaris host is an x86/64 system, it will support SHA-1 hardware acceleration if the output of the isainfo -v command includes the ssse3 (Supplemental Streaming SIMD Extensions 3) instruction set.
In the example, the host x86/64 system includes the ssse3 instruction set:
root@mds1:~# isainfo -v
64-bit amd64 applications
avx xsave pclmulqdq aes sse4.2 sse4.1 ssse3 popcnt tscp ahf cx16 sse3
sse2 sse fxsr mmx cmov amd_sysc cx8 tsc fpu
root@mds1:~#
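The two checks in this procedure can be sketched as a small script that inspects captured command output. The function names are hypothetical, the sample strings are copied from the examples above, and the sketch assumes that uname -v output begins with a numeric major.minor release, as it does in the example.

```python
def release_supports_acceleration(uname_v: str) -> bool:
    """True if the release reported by `uname -v` is Solaris 11.1 or higher."""
    parts = uname_v.strip().split(".")[:2]
    numbers = [int(p) for p in parts]
    while len(numbers) < 2:              # treat "11" as "11.0"
        numbers.append(0)
    return tuple(numbers) >= (11, 1)

def accelerated_hashes(isainfo_v: str) -> set:
    """Return the hash functions that the `isainfo -v` output suggests
    are hardware accelerated, following the rules described above."""
    tokens = set(isainfo_v.split())
    found = {name for name in ("sha512", "sha256", "sha1", "md5") if name in tokens}
    # On x86/64 hosts, the ssse3 instruction set indicates SHA-1 acceleration.
    if "amd64" in tokens and "ssse3" in tokens:
        found.add("sha1")
    return found

SPARC_SAMPLE = (
    "64-bit sparcv9 applications crc32c cbcond pause mont mpmul sha512 sha256 "
    "sha1 md5 camellia kasumi des aes ima hpc vis3 fmaf asi_blk_init vis2 vis popc"
)
X86_SAMPLE = (
    "64-bit amd64 applications avx xsave pclmulqdq aes sse4.2 sse4.1 ssse3 popcnt "
    "tscp ahf cx16 sse3 sse2 sse fxsr mmx cmov amd_sysc cx8 tsc fpu"
)

print(release_supports_acceleration("11.2"))   # True
print(sorted(accelerated_hashes(SPARC_SAMPLE)))
print(sorted(accelerated_hashes(X86_SAMPLE)))
```

In practice the strings would come from running uname -v and isainfo -v on the host; they are hard-coded here so that the sketch runs anywhere.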
When you are archiving files that already have associated message digests, proceed as follows.
Log in to the file system host as root:
root@mds1:~#
At the command prompt, enter the command ssum -a algorithm -h digest -G [-u] filename, where:
-a algorithm identifies the cryptographic hashing function that the file system should use when validating the file against the supplied message digest.
-h digest identifies the message digest that the file system should use to validate the file.
-G specifies immediate validation. The file system sets the hash file attribute to the value of the supplied message digest, independently calculates a message digest for the file, and compares the result to the stored value. If the supplied and calculated digests match, the file system sets the validated attribute for the file. It then sets the generate attribute so that validity is rechecked whenever the file is rearchived.
-u sets the use file attribute (optional). Whenever the file is staged, the file system recalculates the digest and validates the result against the value stored in the hash attribute.
filename is the path and name of the file.
In the example, we supply a SHA256 digest and ask the file system to immediately recalculate and validate the digest value for the file data10 against the supplied value. When we check the file attributes with the command sls -D -h data10, we see that the generate and validated file attributes have been set, the algorithm attribute has been set to SHA-256, and the digest value has been calculated and stored in the hash attribute:
root@mds1:~# ssum -h f03ce01b3828...f7459503007e -a sha256 -G data10
root@mds1:~# sls -D -h data10
data10:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14975  admin id:      0  inode:    90217.1
  project: user.root(1)
  access:      Jul 16 16:14  modification: Jul 16 16:14
  changed:     Jul 16 16:15  attributes:   Jul 16 16:14
  creation:    Jul 16 16:14  residence:    Jul 16 16:14
  checksum: generate validated  algorithm: SHA-256
  hash: f03ce01b3828...f7459503007e
root@mds1:~#
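Before supplying a digest at ingest, it can be useful to confirm independently that the digest received with a file actually matches the file. The following sketch does this with Python's hashlib and then assembles, without running it, the ssum command line described above. The helper names and the temporary sample file are assumptions for illustration; only the ssum options documented in this procedure are used, in the order given in the syntax line.

```python
import hashlib
import os
import tempfile

def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so that large files fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()

def ssum_supply_command(path: str, algorithm: str, digest: str,
                        validate_on_stage: bool = False) -> list:
    """Build (but do not run) an ssum command that supplies a digest."""
    command = ["ssum", "-a", algorithm, "-h", digest, "-G"]
    if validate_on_stage:
        command.append("-u")             # optional: validate whenever staged
    command.append(path)
    return command

# Hypothetical sample file standing in for a file awaiting ingest.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"sample payload")
    sample_path = tmp.name

supplied = hashlib.sha256(b"sample payload").hexdigest()
matches = file_digest(sample_path) == supplied
command = ssum_supply_command(sample_path, "sha256", supplied, validate_on_stage=True)

print(matches)
print(" ".join(command))
os.remove(sample_path)
```

Checking the digest first means that a mismatch can be resolved with the depositor before the file enters the archive.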
When necessary, edit the file as you normally would.
In the example, we have modified a file named data10m since it was last archived. The sls -D -h command shows that the S (stale) flag has been set on both copies, since neither reflects the most recent changes. When we check the SHA-256 digest value for the modified file using the Solaris digest command, we see that the file's hash attribute also stores an out-of-date digest value:
root@mds1:~# sls -D -h data10m
data10m:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14983  admin id:      0  inode:    90307.1
  project: user.root(1)
  copy 1: S----- Jul 17 16:47 dd.1 dk diskarchive f221
  copy 2: S----- Jul 20 11:31 a8d.1 li VOL002
  access:      Jul 20 11:32  modification: Jul 20 11:31
  changed:     Jul 17 16:37  attributes:   Jul 17 16:36
  creation:    Jul 17 16:36  residence:    Jul 17 16:36
  checksum: generate  algorithm: SHA-256
  hash: f03ce01b3828...f7459503007e
root@mds1:~# digest -a sha256 data10m
56c55bb421cc...71ac2ac0b7b0
root@mds1:~#
If necessary, you can change the digest attributes of a modified file prior to rearchiving.
In the example, we change the digest algorithm from SHA256 to SHA1, with immediate effect:
root@mds1:~# ssum -a sha1 -G data10m
root@mds1:~# sls -D -h data10m
data10m:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14983  admin id:      0  inode:    90307.1
  project: user.root(1)
  release -a;
  copy 1: S----- Jul 20 13:00 e0.1 dk diskarchive f224
  copy 2: S----- Jul 20 13:05 a93.1 li VOL002
  access:      Jul 20 16:39  modification: Jul 20 16:39
  changed:     Jul 17 16:37  attributes:   Jul 17 16:36
  creation:    Jul 17 16:36  residence:    Jul 20 16:29
  checksum: generate validated  algorithm: SHA-1
  hash: 92003525f0f8...53e29d0718c8
root@mds1:~#
Otherwise, wait for the file system to archive the modified file and automatically update the digest-related attributes.
When a modified file is archived, the file system recalculates the digest value, stores the new value to the hash attribute, and sets the S (stale) flag on any archived copies of older versions of the file. In the example, we have edited the file data10m without altering the digest attributes. The archiver has created a new copy 1 on disk, as scheduled, and updated the hash attribute. A copy of the unmodified file remains on tape, flagged S (stale), until it is time for the archiver to create copy 2:
root@mds1:~# sls -D -h data10m
data10m:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14983  admin id:      0  inode:    90307.1
  project: user.root(1)
  copy 1: ------ Jul 17 16:47 dd.1 dk diskarchive f221
  copy 2: S----- Jul 20 11:31 a8d.1 li VOL002
  access:      Jul 20 11:32  modification: Jul 20 11:31
  changed:     Jul 17 16:37  attributes:   Jul 17 16:36
  creation:    Jul 17 16:36  residence:    Jul 17 16:36
  checksum: generate  algorithm: SHA-256
  hash: 56c55bb421cc...71ac2ac0b7b0
To generate a digest for a file and enable file validation, proceed as follows:
Log in to the file system host as root:
root@mds1:~#
At the command prompt, enter the command ssum -a algorithm -g|G [-u] filename, where:
-a algorithm specifies the cryptographic hashing function that the file system will use when generating a message digest for the file.
-g sets the generate file attribute for the file. The first time that the file is archived, the file system calculates a message digest. Whenever the file is rearchived, the file system recalculates the digest and validates the result against the stored value.
-G sets the generate and validated file attributes for the file. The file system immediately calculates a message digest and stores the result in the hash attribute. Whenever the file is archived, the file system recalculates the digest and validates the result against the stored value.
-u sets the use file attribute (optional). Whenever the file is staged, the file system recalculates the digest and validates the result against the value stored in the hash attribute.
filename is the path and name of the file.
In the example, we ask the file system to use the SHA256 algorithm to calculate the digest for the file data11 prior to archiving. When we check the file attributes with the command sls -D -h data11, we see that the generate file attribute has been set and the algorithm attribute has been set to SHA-256. The file has not yet been archived, so the digest value has not yet been calculated and stored in the hash attribute:
root@mds1:~# ssum -a sha256 -g data11
root@mds1:~# sls -D -h data11
data11:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14975  admin id:      0  inode:    90218.1
  project: user.root(1)
  access:      Jul 16 16:14  modification: Jul 16 16:14
  changed:     Jul 16 16:22  attributes:   Jul 16 16:14
  creation:    Jul 16 16:14  residence:    Jul 16 16:14
  checksum: generate  algorithm: SHA-256
  hash:
root@mds1:~#
When necessary, edit the file as you normally would.
In the example, we have modified a file named data11m since it was last archived. The sls -D -h command shows that the S (stale) flag has been set on both copies, since neither reflects the most recent changes. When we check the SHA-256 digest value for the modified file using the Solaris digest command, we see that the file's hash attribute also stores an out-of-date digest value:
root@mds1:~# sls -D -h data11m
data11m:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14983  admin id:      0  inode:    90307.1
  project: user.root(1)
  copy 1: S----- Jul 17 16:47 dd.1 dk diskarchive f221
  copy 2: S----- Jul 20 11:31 a8d.1 li VOL002
  access:      Jul 20 11:32  modification: Jul 20 11:31
  changed:     Jul 17 16:37  attributes:   Jul 17 16:36
  creation:    Jul 17 16:36  residence:    Jul 17 16:36
  checksum: generate  algorithm: SHA-256
  hash: f03ce01b3828...f7459503007e
root@mds1:~# digest -a sha256 data11m
56c55bb421cc...71ac2ac0b7b0
root@mds1:~#
If necessary, you can change the digest attributes of a modified file prior to rearchiving.
In the example, we change the digest algorithm from SHA256 to SHA1, with immediate effect:
root@mds1:~# ssum -a sha1 -G data11m
root@mds1:~# sls -D -h data11m
data11m:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14983  admin id:      0  inode:    90307.1
  project: user.root(1)
  release -a;
  copy 1: S----- Jul 20 13:00 e0.1 dk diskarchive f224
  copy 2: S----- Jul 20 13:05 a93.1 li VOL002
  access:      Jul 20 16:39  modification: Jul 20 16:39
  changed:     Jul 17 16:37  attributes:   Jul 17 16:36
  creation:    Jul 17 16:36  residence:    Jul 20 16:29
  checksum: generate validated  algorithm: SHA-1
  hash: 92003525f0f8...53e29d0718c8
root@mds1:~#
Otherwise, wait for the file system to archive the modified file and automatically update the digest-related attributes.
When a modified file is archived, the file system recalculates the digest value, stores the new value to the hash attribute, and sets the S (stale) flag on any archived copies of older versions of the file.
In the example, we have edited the file data11m without altering the digest attributes. The archiver has created a new copy 1 on disk, as scheduled, and updated the hash attribute. A copy of the unmodified file remains on tape, flagged S (stale), until it is time for the archiver to create copy 2:
root@mds1:~# sls -D -h data11m
data11m:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14983  admin id:      0  inode:    90307.1
  project: user.root(1)
  copy 1: ------ Jul 17 16:47 dd.1 dk diskarchive f221
  copy 2: S----- Jul 20 11:31 a8d.1 li VOL002
  access:      Jul 20 11:32  modification: Jul 20 11:31
  changed:     Jul 17 16:37  attributes:   Jul 17 16:36
  creation:    Jul 17 16:36  residence:    Jul 17 16:36
  checksum: generate  algorithm: SHA-256
  hash: 56c55bb421cc...71ac2ac0b7b0
To recursively generate a digest and set the validation attributes for every file in a directory, proceed as follows:
Log in to the file system host as root:
root@mds1:~#
At the command prompt, enter the command ssum -a algorithm -g|G [-u] -r directoryname, where:
-a algorithm specifies the cryptographic hashing function that the file system will use when generating message digests.
-g sets the generate file attribute for each file. The first time that a file is archived, the file system calculates a message digest for the file. Whenever the file is rearchived, the file system recalculates the digest and validates the result against the stored value.
-G sets the generate and validated file attributes for each file. The file system immediately calculates a message digest and stores the result in the hash attribute. Whenever the file is archived, the file system recalculates the digest and validates the result against the stored value.
-u sets the use file attribute (optional). Whenever the file is staged, the file system recalculates the digest and validates the result against the stored value.
-r recursively applies the command to all files in the specified directory.
directoryname is the path and name of the directory.
In the first example, we tell the file system to use the SHA256 algorithm to calculate the digest for the files in the directory datasetA prior to archiving. When we check the file attributes with the command sls -D -h datasetA, we see that, for each file, the generate file attribute has been set and the algorithm attribute has been set to SHA-256. The files have not yet been archived, so the digest values have not yet been calculated and stored in the hash attribute:
root@mds1:~# ssum -a sha256 -g -r datasetA
root@mds1:~# sls -D -h datasetA
datasetA/pdata0:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14983  admin id:      0  inode:    90232.1
  project: user.root(1)
  access:      Jul 16 16:47  modification: Jul 16 16:47
  changed:     Jul 16 16:47  attributes:   Jul 16 16:47
  creation:    Jul 16 16:47  residence:    Jul 16 16:47
  checksum: generate  algorithm: SHA-256
  hash:
...
datasetA/pdata20:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14983  admin id:      0  inode:    90234.1
  project: user.root(1)
  access:      Jul 16 16:47  modification: Jul 16 16:47
  changed:     Jul 16 16:47  attributes:   Jul 16 16:47
  creation:    Jul 16 16:47  residence:    Jul 16 16:47
  checksum: generate  algorithm: SHA-256
  hash:
...
root@mds1:~#
In the second example, we ask the file system to use the SHA256 algorithm to immediately calculate the digest for the files in the directory datasetB prior to archiving. When we check the file attributes with the command sls -D -h datasetB, we see that, for each file, the generate and validated file attributes have been set, the algorithm attribute has been set to SHA-256, and the digest value has been calculated and stored in the hash attribute:
root@mds1:~# ssum -a sha256 -G -r datasetB
root@mds1:~# sls -D -h datasetB
datasetB/qdata0:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14983  admin id:      0  inode:    90232.1
  project: user.root(1)
  access:      Jul 16 16:47  modification: Jul 16 16:47
  changed:     Jul 16 16:47  attributes:   Jul 16 16:47
  creation:    Jul 16 16:47  residence:    Jul 16 16:47
  checksum: generate validated  algorithm: SHA-256
  hash: 4d2800eb82b3...520341edde95
...
datasetB/qdata12:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14983  admin id:      0  inode:    90234.1
  project: user.root(1)
  access:      Jul 16 16:47  modification: Jul 16 16:47
  changed:     Jul 16 16:47  attributes:   Jul 16 16:47
  creation:    Jul 16 16:47  residence:    Jul 16 16:47
  checksum: generate validated  algorithm: SHA-256
  hash: 5b057f1b7b48...88c590d47dec
...
root@mds1:~#
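An independent manifest of digests for a directory tree can be handy for cross-checking the hash values that sls -D -h reports after a recursive ssum. The sketch below builds such a manifest with Python's standard library. It does not call Oracle HSM; the function name, the temporary directory, and the sample file contents are illustrative assumptions.

```python
import hashlib
import os
import tempfile

def directory_digests(root: str, algorithm: str = "sha256") -> dict:
    """Walk root recursively and map each relative file path to its digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            hasher = hashlib.new(algorithm)
            with open(path, "rb") as stream:
                # Read in 1 MiB chunks to keep memory use bounded.
                for chunk in iter(lambda: stream.read(1 << 20), b""):
                    hasher.update(chunk)
            digests[os.path.relpath(path, root)] = hasher.hexdigest()
    return digests

# Hypothetical data set with one nested subdirectory.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    samples = (("qdata0", b"first"), (os.path.join("sub", "qdata12"), b"second"))
    for relative, payload in samples:
        with open(os.path.join(root, relative), "wb") as stream:
            stream.write(payload)
    manifest = directory_digests(root)

for relative in sorted(manifest):
    print(manifest[relative], relative)
```

Keeping the manifest outside the file system gives a second record of the digests that does not depend on the file attributes themselves.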
When required, you can validate a file before it is staged to the disk cache for use. Proceed as follows:
Log in to the file system host as root:
root@mds1:~#
At the command prompt, enter the command ssum -u [-a algorithm [-h digest] -g|G] filename, where:
-u specifies validation prior to staging by setting the use file attribute. When the use attribute is set for a file, the file system will not copy the file from archival media to the disk cache until it has generated a message digest and successfully validated the result against the value stored in the file's hash attribute.
-a algorithm, -h digest, and -g|G are optional parameters that set the required algorithm, hash, and generate attributes on the file if the attributes have not been set previously.
filename is the path and name of the file.
In the example, we have already enabled validation for the file data102. As the command sls -D -h data102 shows, the generate and validated file attributes have been set, the algorithm attribute has been set to SHA-256, and the digest value has been calculated and stored in the hash attribute:
root@mds1:~# ssum -a sha256 -G data102
root@mds1:~# sls -D -h data102
data102:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14979  admin id:      0  inode:    90264.1
  project: user.root(1)
  access:      Jul 16 17:34  modification: Jul 16 17:34
  changed:     Jul 16 17:34  attributes:   Jul 16 17:34
  creation:    Jul 16 17:34  residence:    Jul 16 17:34
  checksum: generate validated  algorithm: SHA-256
  hash: baae932ce1cf...93166a2e36b5
root@mds1:~#
So we can set the use attribute to ensure that the file system validates the file prior to staging. The command sls -D -h data102 shows that the use attribute is now set:
root@mds1:~# ssum -u data102
root@mds1:~# sls -D -h data102
data102:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14979  admin id:      0  inode:    90264.1
  project: user.root(1)
  access:      Jul 16 17:34  modification: Jul 16 17:34
  changed:     Jul 16 17:34  attributes:   Jul 16 17:34
  creation:    Jul 16 17:34  residence:    Jul 16 17:34
  checksum: generate use validated  algorithm: SHA-256
  hash: baae932ce1cf...93166a2e36b5
root@mds1:~#
If a file has not been made immutable and has not yet been archived, you can change its message digesting and validation attributes using the procedure below.
Log in to the file system host as root:
root@mds1:~#
If necessary, change the digesting algorithm. At the command prompt, enter the command ssum -a newalgorithm filename, where:
-a newalgorithm specifies the cryptographic hash function that replaces the previously specified digesting algorithm.
filename is the path and name of the file.
In the example, our preservation policies require the highly collision-resistant SHA256 function. But as the command sls -D -h shows, we inadvertently specified the SHA1 algorithm when we set the digest attributes of the file data319. Since the file has not yet been archived, we can successfully change the algorithm to SHA256:
root@mds1:~# sls -D -h data319
data319:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14983  admin id:      0  inode:    90301.1
  project: user.root(1)
  access:      Jul 17 15:27  modification: Jul 17 15:27
  changed:     Jul 17 15:28  attributes:   Jul 17 15:27
  creation:    Jul 17 15:27  residence:    Jul 17 15:27
  checksum: generate  algorithm: SHA-1
  hash:
root@mds1:~# ssum -a sha256 data319
root@mds1:~# sls -D -h data319
data319:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14983  admin id:      0  inode:    90301.1
  project: user.root(1)
  access:      Jul 17 15:27  modification: Jul 17 15:27
  changed:     Jul 17 15:28  attributes:   Jul 17 15:27
  creation:    Jul 17 15:27  residence:    Jul 17 15:27
  checksum: generate  algorithm: SHA-256
  hash:
root@mds1:~#
If necessary, clear the digest attributes and restore the default file settings. At the command prompt, enter the command ssum -d filename, where:
-d resets the file digest attributes to their default values.
filename is the path and name of the file.
In the example, we did not mean to configure message digesting and validation for the file data44. But, as the command sls -D -h shows, we have inadvertently done so. Since the file has not yet been archived, we can successfully clear generate and use, the attributes that control digest validation during archiving and staging. The data in the validated, algorithm, and hash attributes remains but does not affect the file system's behavior:
root@mds1:~# sls -D -h data44
data44:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14983  admin id:      0  inode:    90292.1
  project: user.root(1)
  access:      Jul 17 14:58  modification: Jul 17 14:57
  changed:     Jul 17 14:58  attributes:   Jul 17 14:57
  creation:    Jul 17 14:57  residence:    Jul 17 14:57
  checksum: generate use validated  algorithm: SHA-256
  hash: 3b4b15f8f69c...bae62c7e7568
root@mds1:~# ssum -d data44
root@mds1:~# sls -D -h data44
data44:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14983  admin id:      0  inode:    90292.1
  project: user.root(1)
  access:      Jul 17 14:58  modification: Jul 17 14:57
  changed:     Jul 17 14:58  attributes:   Jul 17 14:57
  creation:    Jul 17 14:57  residence:    Jul 17 14:57
  checksum: validated  algorithm: SHA-256
  hash: 3b4b15f8f69c...bae62c7e7568
root@mds1:~#
If necessary, reset any required message digesting and validation attributes before the file is archived. At the command prompt, enter the command ssum with the appropriate options and file name.
In the example, we decide to re-enable message digesting on the file data44 and validate digests prior to archiving. But we do not require validation prior to staging. So we restore the generate attribute but not the use attribute:
root@mds1:~# ssum -g data44
root@mds1:~# sls -D -h data44
data44:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14983  admin id:      0  inode:    90292.1
  project: user.root(1)
  access:      Jul 17 14:58  modification: Jul 17 14:57
  changed:     Jul 17 14:58  attributes:   Jul 17 14:57
  creation:    Jul 17 14:57  residence:    Jul 17 14:57
  checksum: generate validated  algorithm: SHA-256
  hash: 3b4b15f8f69c...bae62c7e7568
root@mds1:~#
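The effect of the ssum options on the checksum attributes can be summarized with a deliberately simplified model. The sketch below is not how Oracle HSM is implemented: it tracks only the generate, use, and validated flags, ignores the algorithm and hash attributes, and encodes only the behavior described in this chapter for the -g, -G, -u, and -d options. The function name is hypothetical.

```python
def apply_option(attributes: frozenset, option: str) -> frozenset:
    """Return the checksum flags after applying one ssum option (simplified)."""
    if option == "-g":
        return attributes | {"generate"}
    if option == "-G":
        # Immediate calculation also marks the file as validated.
        return attributes | {"generate", "validated"}
    if option == "-u":
        return attributes | {"use"}
    if option == "-d":
        # Clears the flags that drive validation; validated is left in place.
        return attributes - {"generate", "use"}
    raise ValueError(f"option not covered by this sketch: {option}")

# Replay the data44 example: clear the attributes, then restore generate only.
state = frozenset({"generate", "use", "validated"})
after_clear = apply_option(state, "-d")
after_restore = apply_option(after_clear, "-g")

print(sorted(after_clear))    # ['validated']
print(sorted(after_restore))  # ['generate', 'validated']
```

The two printed states correspond to the checksum fields shown in the sls -D -h listings for data44 before and after the generate attribute is restored.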
Preservation policies frequently require mechanisms that assure file fixity. The archive must both prevent changes and prove that such changes have not occurred. To provide fixity, Oracle HSM archival file systems combine the message digests and digest-related file attributes discussed above with additional attributes that render the file immutable. Once a file has been made immutable, only those with super-user authority can change its status. If you combine immutability with a strict Write Once Read Many (WORM) file system, even super users will be unable to make changes (for details, see "Understanding WORM File Systems").
You can make a file immutable in either of the following situations:
When you need to ensure that a file remains unchanged after ingestion into the archive, proceed as follows.
Log in to the file system host as root:
root@mds1:~#
At the command prompt, enter the command ssum -a algorithm [-h digest] -F filename, where:
-a algorithm identifies the cryptographic hashing function that the file system should use when validating the file against the supplied message digest.
-h digest identifies the message digest that the file system should use to validate the file.
-F specifies immediate validation and immutability, and sets the fixity, generate, validated, and use file attributes. The file system immediately calculates and validates a message digest. When the file is staged or archived, the file system recalculates and revalidates a message digest.
filename is the path and name of the file.
In the example, we supply a SHA256 digest and tell the file system to recalculate the digest, validate the value for the file data20, and make the file immutable. When we check the file attributes with the command sls -D -h data20, we see that the fixity, generate, use, and validated file attributes have been set, the algorithm attribute has been set to SHA-256, and the digest value has been calculated and stored in the hash attribute:
root@mds1:~# ssum -h bfaefde932cf...d450892eda63 -a sha256 -F data20
root@mds1:~# sls -D -h data20
data20:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14979  admin id:      0  inode:    90264.1
  project: user.root(1)
  access:      Jul 16 17:34  modification: Jul 16 17:34
  changed:     Jul 16 17:34  attributes:   Jul 16 17:34
  creation:    Jul 16 17:34  residence:    Jul 16 17:34
  checksum: fixity generate use validated  algorithm: SHA-256
  hash: bfaefde932cf...d450892eda63
root@mds1:~#
When you are archiving files that already have associated message digests and need to ensure that the file remains unchanged after ingestion into the archive, proceed as follows.
Log in to the file system host as root:
root@mds1:~#
At the command prompt, enter the command ssum -a algorithm [-h digest] -F filename, where:
-a algorithm identifies the cryptographic hashing function that was used to generate the digest that is specified in the -h digest parameter.
-F sets the fixity, generate, validated, and use file attributes. The file system immediately calculates and validates a message digest. When the file is staged or archived, the file system recalculates and revalidates a message digest.
filename is the path and name of the file.
In the example, we tell the file system to calculate a SHA256 digest, validate the value for the file data200, and make the file immutable. When we check the file attributes with the command sls -D -h data200, we see that the fixity, generate, validated, and use file attributes have been set, the algorithm attribute has been set to SHA-256, and the digest value has been calculated and stored in the hash attribute:
root@mds1:~# ssum -a sha256 -F data200
root@mds1:~# sls -D -h data200
data200:
  mode: -rw-r--r--  links:   1  owner: root  group: root
  length:     14979  admin id:      0  inode:    90264.1
  project: user.root(1)
  access:      Jul 16 17:34  modification: Jul 16 17:34
  changed:     Jul 16 17:34  attributes:   Jul 16 17:34
  creation:    Jul 16 17:34  residence:    Jul 16 17:34
  checksum: fixity generate use validated  algorithm: SHA-256
  hash: efde93cc12cf...d496602e36dd
root@mds1:~#
To view the message digest and fixity attributes of one or more files, use the Oracle HSM directory listing command, sls. Proceed as follows.
Log in to the file system host as root:
root@mds1:~#
At the command prompt, enter the command sls -D -h filename, where:
-D specifies a detailed display of file attributes.
-h includes the hash (digest) value in the display.
filename identifies one or more files by path and name.
In the example, we see the file digest attributes for the file data02
listed in the checksum
and hash
fields of the display:
root@mds1:~# sls -D -h data02 data02: mode: -rw-r--r-- links: 1 owner: root group: root length: 14975 admin id: 0 inode: 90217.1 project: user.root(1) access: Jul 16 16:14 modification: Jul 16 16:14 changed: Jul 16 16:15 attributes: Jul 16 16:14 creation: Jul 16 16:14 residence: Jul 16 16:14 checksum: generate use validated algorithm: SHA-256 hash: f03ce01b3828...f7459503007e root@mds1:~#
The hash attribute stores the message digest for the file, f03ce01b3828...f7459503007e.
The algorithm attribute shows that the SHA-256 cryptographic hashing function generated the stored message digest.
The generate attribute shows that the file system independently recalculates the message digest and validates it against the stored value whenever the file is archived.
The use attribute shows that the file system independently recalculates the message digest and validates it against the stored value whenever the file is staged.
The validated attribute shows that the independently calculated message digest matched the value stored in the hash attribute when last checked.
The fixity attribute appears if the file has been made immutable.
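If you need to check these attributes for many files, the relevant fields can be picked out of captured sls -D -h output. The sketch below assumes the checksum, algorithm, and hash fields appear as shown in the example above; the function names are hypothetical, and the sample text is abbreviated from that example with the hash left elided as in the source.

```python
import re

# Abbreviated from the sls -D -h example for the file data02.
SAMPLE = """\
data02:
  checksum: generate use validated  algorithm: SHA-256
  hash: f03ce01b3828...f7459503007e
"""


def parse_checksum_fields(sls_text):
    """Extract the digest-related fields from sls -D -h output for one file."""
    fields = {"flags": set(), "algorithm": None, "hash": None}
    checksum = re.search(r"checksum:\s*(.*?)\s*algorithm:\s*(\S+)", sls_text)
    if checksum:
        fields["flags"] = set(checksum.group(1).split())
        fields["algorithm"] = checksum.group(2)
    digest = re.search(r"hash:\s*(\S+)", sls_text)
    if digest:
        fields["hash"] = digest.group(1)
    return fields


def is_immutable(fields):
    """The fixity flag appears only when the file has been made immutable."""
    return "fixity" in fields["flags"]


if __name__ == "__main__":
    parsed = parse_checksum_fields(SAMPLE)
    print(sorted(parsed["flags"]), parsed["algorithm"], is_immutable(parsed))
```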
When legal or archival considerations so require, you can create write-once read-many (WORM) directories and files in any Oracle HSM file system that has been configured to support them. This section focuses on understanding WORM file systems and on specific tasks that you need to perform when working with WORM files and directories, including:
WORM-enabling directories
activating WORM protection for a file
finding and listing WORM files.
For information on enabling WORM support for a file system, see the Oracle Hierarchical Storage Manager and StorageTek QFS Installation and Configuration Guide.
Write Once Read Many (WORM) file systems protect data by letting users make files read-only for the duration of a specified retention period. WORM-enabled Oracle HSM file systems support default and customizable file-retention periods, data and path immutability, and subdirectory inheritance of the WORM setting.
Depending on how your file systems are configured, you use one of two Oracle HSM WORM modes:
standard compliance mode (the default)
emulation mode
In a file system that is mounted under the standard WORM mode, a user WORM-enables directories and starts the read-only retention period for files by executing the command chmod 4000 path_name, where path_name is the path and name of the file or directory. This sets the UNIX setuid (set user ID upon execution) permission. Setting setuid permission on a file that also has execute permission is a security risk, so, in standard WORM mode, only non-executable files can be made read-only.
In a file system that is mounted under the WORM emulation mode, a user WORM-enables directories and starts the read-only retention period for files by executing the command chmod 555 path_name, where path_name is the path and name of a writable file or directory. Since emulation mode does not require setuid permission, executable files can be made read-only and assigned retention periods.
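The two trigger values are ordinary UNIX permission bits, which the following sketch makes explicit using Python's standard stat module. The helper can_trigger_retention is hypothetical: it only encodes the rules stated above (no execute permission in standard mode, a writable file in emulation mode) and is not a description of how the file system itself decides.

```python
import stat

STANDARD_TRIGGER = 0o4000   # chmod 4000: the setuid bit
EMULATION_TRIGGER = 0o555   # chmod 555: read and execute only, no write bits

EXECUTE_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def can_trigger_retention(mode, emulation):
    """Apply the rules from the text to a file's current permission bits."""
    if emulation:
        # Emulation mode starts retention on a writable file.
        return bool(mode & WRITE_BITS)
    # Standard mode refuses files that have execute permission.
    return not (mode & EXECUTE_BITS)


if __name__ == "__main__":
    print(STANDARD_TRIGGER == stat.S_ISUID)
    print(stat.filemode(stat.S_IFREG | EMULATION_TRIGGER))
    print(can_trigger_retention(0o755, emulation=False))
    print(can_trigger_retention(0o755, emulation=True))
```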
Both standard and emulation modes have a strict WORM implementation and a less restrictive, lite implementation. Neither the strict nor the lite implementation allows changes to data or paths once retention has been triggered on a file or directory. Both set the default retention period to 43,200 minutes (30 days). But the lite implementation relaxes some restrictions for root users.
The strict implementations do not let anyone shorten the specified retention period or delete files or directories prior to the end of the retention period. They also do not let anyone use sammkfs to delete volumes that hold currently retained files and directories. The strict implementations are thus well-suited to meeting the most demanding legal, regulatory compliance, and preservation requirements.
The lite implementations let root users shorten retention periods, delete files and directories, and delete volumes using the sammkfs command. This provides a high level of protection against casual data loss while allowing more flexibility when administering file systems and storage resources. But file systems that allow super users this degree of control may not meet some legal and regulatory compliance requirements.
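The default retention period is expressed in minutes, so it can help to convert values to days, hours, and minutes when planning. The following is a small arithmetic sketch; the names DEFAULT_RETENTION_MINUTES and split_minutes are chosen for illustration only.

```python
from datetime import timedelta

DEFAULT_RETENTION_MINUTES = 43_200  # default stated for both implementations


def split_minutes(total_minutes):
    """Break a retention period in minutes into (days, hours, minutes)."""
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return days, hours, minutes


if __name__ == "__main__":
    print(split_minutes(DEFAULT_RETENTION_MINUTES))
    print(timedelta(minutes=DEFAULT_RETENTION_MINUTES) == timedelta(days=30))
```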
You can create both hard and soft links to WORM files. Hard links can be created only to files that reside in a WORM-capable directory, and a hard link has the same WORM characteristics as the original file. A soft link cannot use the WORM features, but soft links to WORM files can be created in any directory in an Oracle HSM file system.
For full information on creating and configuring WORM file systems, see the Oracle Hierarchical Storage Manager and StorageTek QFS Installation and Configuration Guide in the Customer Documentation Library.
When you WORM-enable a directory, you add support for WORM files, but do not otherwise change the characteristics of the directory. Users can continue to create and edit files within a WORM-enabled directory, and WORM-enabled directories that do not contain WORM files can be deleted. To WORM-enable a directory, proceed as follows:
Log in to the file-system server.
user@mds1:~#
See if the directory has already been WORM-enabled. Use the command sls -Dd directory, where directory is the path and name of the directory. Look for the attribute worm-capable in the output of the command. For full information on the command, see the sls man page.
Directories are often already WORM-enabled because, when a user WORM-enables a directory, all current and future child directories inherit the WORM capability. In the first example, we find that our target directory, /hsm/hsmfs1/records/2013, is already WORM-enabled:
user@mds1:~# sls -Dd /hsm/hsmfs1/records/2013/
/hsm/hsmfs1/records/2013:
  mode: drwxr-xr-x  links:   2  owner: root  group: root
  length:      4096  admin id:      0  inode:     1048.1
  project: user.root(1)
  access:      Mar  3 12:15  modification: Mar  3 12:15
  changed:     Mar  3 12:15  attributes:   Mar  3 12:15
  creation:    Mar  3 12:15  residence:    Mar  3 12:15
  worm-capable        retention-period: 0y, 30d, 0h, 0m
But in the second example, we find that our target directory, /hsm/hsmfs1/documents, is not WORM-enabled:
user@mds1:~# sls -Dd /hsm/hsmfs1/documents
/hsm/hsmfs1/documents:
  mode: drwxr-xr-x  links:   2  owner: root  group: root
  length:      4096  admin id:      0  inode:     1049.1
  project: user.root(1)
  access:      Mar  3 12:28  modification: Mar  3 12:28
  changed:     Mar  3 12:28  attributes:   Mar  3 12:28
  creation:    Mar  3 12:28  residence:    Mar  3 12:28
If the directory is not WORM-enabled and if the file system was mounted with the worm_capable or worm_lite mount option, enable WORM support with the Solaris command chmod 4000 directory-name, where directory-name is the path and name of the directory that will hold the WORM files.
The command chmod 4000 sets the setuid (set user ID upon execution) attribute on the directory and enables standard WORM support. In the example, we WORM-enable the directory /hsm/hsmfs1/documents and check the result with sls -Dd. The operation succeeds and the directory is WORM-enabled:
user@mds1:~# chmod 4000 /hsm/hsmfs1/documents
user@mds1:~# sls -Dd /hsm/hsmfs1/documents
/hsm/hsmfs1/documents:
  mode: drwxr-xr-x  links:   2  owner: root  group: root
  length:      4096  admin id:      0  inode:     1049.1
  project: user.root(1)
  access:      Mar  3 12:28  modification: Mar  3 12:28
  changed:     Mar  3 12:28  attributes:   Mar  3 12:28
  creation:    Mar  3 12:28  residence:    Mar  3 12:28
  worm-capable        retention-period: 0y, 30d, 0h, 0m
If the directory is not WORM-enabled and if the file system was mounted with the worm_emul or emul_lite mount option, enable WORM support with the Solaris command chmod 555 directory-name, where directory-name is the path and name of the directory that will hold the WORM files.
The command chmod 555 removes write permissions for the directory and enables WORM-emulation support. In the example, we WORM-enable the directory /hsm/hsmfs1/documents and check the result using the command sls -Dd. The operation succeeds and the directory is WORM-enabled:
user@mds1:~# chmod 555 /hsm/hsmfs1/documents
user@mds1:~# sls -Dd /hsm/hsmfs1/documents
/hsm/hsmfs1/documents:
  mode: drwxr-xr-x  links:   2  owner: root  group: root
  length:      4096  admin id:      0  inode:     1049.1
  project: user.root(1)
  access:      Mar  3 12:28  modification: Mar  3 12:28
  changed:     Mar  3 12:28  attributes:   Mar  3 12:28
  creation:    Mar  3 12:28  residence:    Mar  3 12:28
  worm-capable        retention-period: 0y, 30d, 0h, 0m
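When many directories need checking, the test for the worm-capable attribute can be applied to captured sls -Dd output. The sketch below assumes the attribute and the retention-period field appear as in the examples in this procedure; the function names are hypothetical, and the sample strings are abbreviated from those examples.

```python
import re

# Abbreviated from the sls -Dd examples in this procedure.
ENABLED = """\
/hsm/hsmfs1/records/2013:
  mode: drwxr-xr-x  links:   2  owner: root  group: root
  worm-capable        retention-period: 0y, 30d, 0h, 0m
"""
NOT_ENABLED = """\
/hsm/hsmfs1/documents:
  mode: drwxr-xr-x  links:   2  owner: root  group: root
"""


def is_worm_capable(sls_text):
    """Return True if the worm-capable attribute appears in the output."""
    return re.search(r"(?<![\w-])worm-capable(?![\w-])", sls_text) is not None


def retention_period(sls_text):
    """Return (years, days, hours, minutes), or None if the field is absent."""
    match = re.search(
        r"retention-period:\s*(\d+)y,\s*(\d+)d,\s*(\d+)h,\s*(\d+)m", sls_text
    )
    if match is None:
        return None
    return tuple(int(value) for value in match.groups())


if __name__ == "__main__":
    print(is_worm_capable(ENABLED), retention_period(ENABLED))
    print(is_worm_capable(NOT_ENABLED), retention_period(NOT_ENABLED))
```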
When you activate WORM protection on a file in a WORM-enabled directory, the file system no longer allows modifications to the file data or the path to the data until the retention period expires. So you must use care. To activate WORM protection, proceed as follows:
Log in to the file-system server.
user@mds1:~#
If you need to retain the file for some period other than the default for the file system, specify the required retention time by changing the access time for the file. Use the Solaris command touch -a -t expiration-date path-name, where:
expiration-date is a string of numerals consisting of a four-digit year, a two-digit month, a two-digit day of the month, a two-digit hour of the day, a two-digit minute within the hour, and, optionally, a period followed by a two-digit second within the minute.
path-name is the path and name of the file.
Note that Oracle Solaris UNIX utilities such as touch cannot extend a retention period beyond 10:14 PM on 01/18/2038 (03:14:07 UTC on 01/19/2038, shown here for a time zone five hours behind UTC). These utilities use signed 32-bit numbers to represent time in seconds starting from 01/01/1970. So use a default retention period if you need to retain files beyond this cut-off date.
In the example, we set the retention period for the file to expire in four years, on October 14, 2019 at 11:59 AM:
user@mds1:~# touch -a -t 201910141159 /hsm/hsmfs1/plans/master.odt
If the file system was mounted with the worm_capable or worm_lite mount option, activate WORM protection with the Solaris command chmod 4000 path-name, where path-name is the path and name of the file.
The chmod 4000 command sets the setuid (set user ID upon execution) attribute on the specified file. Setting this attribute on an executable file is insecure. So, if the file system was mounted with the worm_capable or worm_lite mount option, you cannot set WORM protections on files that have UNIX execute permission.
In the example, we activate WORM protection for the file master.odt. We check the result with sls -Dd, and note that the retention attribute is now set to active, and the retention-period is set to four years:
user@mds1:~# chmod 4000 /hsm/hsmfs1/plans/master.odt
user@mds1:~# sls -Dd /hsm/hsmfs1/plans/master.odt
/hsm/hsmfs1/plans/master.odt:
  mode: -r-xr-xr-x  links:   1  owner: root  group: root
  length:       104  admin id:      0  inode:     1051.1
  project: user.root(1)
  access:      Mar  4  2018  modification: Mar  3 13:14
  changed:     Mar  3 13:16  retention-end: Apr  2 14:16 2014
  creation:    Mar  3 13:16  residence:    Mar  3 13:16
  retention:   active        retention-period: 4y, 0d, 0h, 0m
If the file system was mounted with the worm_emul or emul_lite mount option, activate WORM protection with the Solaris command chmod 555 path-name, where path-name is the path and name of the file.
The command chmod 555 removes write permissions for the file and does not set the setuid attribute. So you can WORM-protect executable files, if required. In the example, we activate WORM retention for the file master.odt. We check the result with sls -Dd, and note that the retention attribute is now set to active, and the retention-period is set to four years:
user@mds1:~# chmod 555 /hsm/hsmfs1/plans/master.odt
user@mds1:~# sls -Dd /hsm/hsmfs1/plans/master.odt
/hsm/hsmfs1/plans/master.odt:
  mode: -r-xr-xr-x  links:   1  owner: root  group: root
  length:       104  admin id:      0  inode:     1051.1
  project: user.root(1)
  access:      Mar  4  2018  modification: Mar  3 13:14
  changed:     Mar  3 13:16  retention-end: Apr  2 14:16 2014
  creation:    Mar  3 13:16  residence:    Mar  3 13:16
  retention:   active        retention-period: 4y, 0d, 0h, 0m
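The expiration-date string and the 2038 cut-off described in this procedure can both be worked out ahead of time. The following sketch uses only Python's standard datetime module; the helper names are hypothetical, and the check models the signed 32-bit limit described in the note rather than any Oracle HSM interface.

```python
from datetime import datetime, timedelta, timezone

# Largest value that a signed 32-bit count of seconds since 01/01/1970 can hold.
MAX_32BIT_SECONDS = 2**31 - 1


def touch_timestamp(expires):
    """Format a datetime as the year-month-day-hour-minute string for touch -t."""
    return expires.strftime("%Y%m%d%H%M")


def fits_in_32bit_time(expires):
    """Return True if a timezone-aware datetime falls within the 32-bit limit."""
    if expires.tzinfo is None:
        raise ValueError("expires must be timezone-aware")
    return expires.timestamp() <= MAX_32BIT_SECONDS


# The last representable instant, in UTC and in a zone five hours behind UTC.
LIMIT_UTC = datetime.fromtimestamp(MAX_32BIT_SECONDS, tz=timezone.utc)
LIMIT_UTC_MINUS_5 = LIMIT_UTC.astimezone(timezone(timedelta(hours=-5)))

if __name__ == "__main__":
    print(touch_timestamp(datetime(2019, 10, 14, 11, 59)))
    print(LIMIT_UTC.isoformat(), LIMIT_UTC_MINUS_5.isoformat())
    print(fits_in_32bit_time(datetime(2040, 1, 1, tzinfo=timezone.utc)))
```

A date that fails the check cannot be set with touch, which is the case where the text recommends relying on a default retention period instead.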
To find and list WORM files that meet specified search criteria, use the sfind command. Proceed as follows:
Log in to the file-system server.
user@mds1:~#
To list files that are WORM-protected and being actively retained, use the command sfind starting-directory -ractive, where starting-directory is the path and name for the directory where you want the listing process to start.
user@mds1:~# sfind /hsm/hsmfs1/ -ractive
/hsm/hsmfs1/documents/2013/master-plan.odt
/hsm/hsmfs1/documents/2013/schedule.ods
/hsm/hsmfs1/records/2013/progress/report01.odt
/hsm/hsmfs1/records/2013/progress/report02.odt
/hsm/hsmfs1/records/2013/progress/report03.odt
...
user@mds1:~#
To list WORM-protected files for which the retention period has expired, use the command sfind starting-directory -rover, where starting-directory is the path and name for the directory where you want the listing process to start.
user@mds1:~# sfind /hsm/hsmfs1/ -rover
/hsm/hsmfs1/documents/2007/master-plan.odt
/hsm/hsmfs1/documents/2007/schedule.ods
user@mds1:~#
To list WORM-protected files for which the retention period will expire after a specified date and time, use the command sfind starting-directory -rafter expiration-date, where:
starting-directory is the path and name for the directory where you want the listing process to start.
expiration-date is a string of numerals consisting of a four-digit year, a two-digit month, a two-digit day of the month, a two-digit hour of the day, a two-digit minute within the hour, and, optionally, a two-digit second within the minute.
In the example, we list any files for which the retention period expires after January 1, 2015 at one minute after midnight:
user@mds1:~# sfind /hsm/hsmfs1/ -rafter 201501010001
/hsm/hsmfs1/documents/2013/master-plan.odt
user@mds1:~#
To list WORM-protected files that must remain in the file system for at least a specified amount of time, use the command sfind starting-directory -rremain time-remaining, where:
starting-directory is the location in the directory tree where the search starts.
time-remaining is a string of non-negative integers paired with the following units of time: y for years, d for days, h for hours, m for minutes.
In the example, we find all files under the directory /hsm/hsmfs1/ that will be retained for at least three more years:
user@mds1:~# sfind /hsm/hsmfs1/ -rremain 3y
/hsm/hsmfs1/documents/2013/master-plan.odt
user@mds1:~#
To list WORM-protected files that must remain in the file system for more than a specified amount of time, use the command sfind starting-directory -rlonger time, where:
starting-directory is the location in the directory tree where the search starts.
time is a string of non-negative integers paired with the following units of time: y for years, d for days, h for hours, m for minutes.
In the example, we find all files under the directory /hsm/hsmfs1/ that will be retained for more than three years and ninety days:
user@mds1:~# sfind /hsm/hsmfs1/ -rlonger 3y90d
/hsm/hsmfs1/documents/2013/master-plan.odt
user@mds1:~#
To list WORM-protected files that must remain in the file system permanently, use the command sfind starting-directory -rpermanent.
In the example, we find that no files under the directory /hsm/hsmfs1/ are being retained permanently:
user@mds1:~# sfind /hsm/hsmfs1/ -rpermanent
user@mds1:~#
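Scripts that build sfind command lines may want to validate the date and time arguments first. The following sketch checks the two string formats described above: the numeric expiration date and the unit-suffixed time string such as 3y90d. The function names are hypothetical, and treating an optional seconds field as two extra trailing digits is an assumption based on the description in this section, not on the sfind man page.

```python
import re
from datetime import datetime

_UNIT_RE = re.compile(r"(\d+)([ydhm])")


def parse_time_string(text):
    """Parse a string such as '3y90d' into a dict of years, days, hours, minutes."""
    matches = list(_UNIT_RE.finditer(text))
    # Reject empty input and anything left over between number-unit pairs.
    if not matches or "".join(m.group(0) for m in matches) != text:
        raise ValueError(f"not a valid time string: {text!r}")
    result = {"y": 0, "d": 0, "h": 0, "m": 0}
    for match in matches:
        result[match.group(2)] += int(match.group(1))
    return result


def parse_expiration_date(text):
    """Parse a 12-digit (or, by assumption, 14-digit) expiration-date string."""
    formats = {12: "%Y%m%d%H%M", 14: "%Y%m%d%H%M%S"}
    fmt = formats.get(len(text))
    if fmt is None or not text.isdigit():
        raise ValueError(f"not a valid expiration date: {text!r}")
    return datetime.strptime(text, fmt)


if __name__ == "__main__":
    print(parse_time_string("3y90d"))
    print(parse_expiration_date("201501010001"))
```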