1 Introduction

The Checksum Support and Content Verification program adds further levels of data verification to the Oracle DIVArchive Suite. It provides checksum generation and verification for each file and object managed by DIVArchive. Additional checksum verification is performed at the Oracle Storage Cloud level. See the Storage Cloud documentation for information.

See Appendix A for Oracle DIVArchive options and licensing information.

See the Oracle DIVArchive Supported Environments Guide in the Oracle DIVArchive Core documentation library for information about certain limitations when running in the Linux environment.

Supported Checksum Algorithms

The available checksum algorithms for Oracle DIVArchive are as follows:

Checksum Algorithm MD2

Message Digest 2 (MD2) Algorithm: A cryptographic hash function developed in 1989. The algorithm is optimized for 8-bit computers. Although other algorithms have been proposed since, MD2 remains in use in public key infrastructures as part of certificates generated with MD2 and RSA.

Checksum Algorithm MDC2

Modification Detection Code 2: In cryptography, MDC2 (sometimes called Meyer-Schilling) is a cryptographic hash function with a 128-bit hash value. MDC2 is built from a block cipher and has a proof of security in the ideal-cipher model.

Checksum Algorithm MD5

Message Digest 5 Algorithm: In cryptography, MD5 is a widely used cryptographic hash function with a 128-bit hash value. As an Internet standard (RFC 1321), MD5 has been employed in a wide variety of security applications and is commonly used to check the integrity of files. MD5 is the default DIVArchive Checksum Type.

Checksum Algorithm SHA

Secure Hash Algorithm: One of several cryptographic hash functions published by the National Institute of Standards and Technology as a US Federal Information Processing Standard.

Checksum Algorithm SHA-1

Secure Hash Algorithm-1: A 160-bit hash function which resembles the MD5 algorithm. This was designed by the National Security Agency (NSA) to be part of the Digital Signature Algorithm.

Checksum Algorithm RIPEMD160

RACE Integrity Primitives Evaluation Message Digest: A 160-bit message digest algorithm (cryptographic hash function) developed in Leuven (Belgium) at the COSIC Research Group at the Katholieke Universiteit Leuven, and first published in 1996. It is an improved version of RIPEMD, which was based upon the design principles used in MD4, and is similar in performance to the more popular SHA-1.

Note:

Additional checksum verification is done at the Oracle Storage Cloud level. See the Storage Cloud documentation for information.
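The digest lengths quoted for the algorithms above can be checked with a short sketch. It uses Python's standard hashlib module purely for illustration and is not part of DIVArchive. The helper names (digest_bits, available_algorithms) are hypothetical. MD5 and SHA-1 are always present in hashlib, whereas MD2, MDC2, and RIPEMD160 depend on the local OpenSSL build, so the sketch probes for them instead of assuming they exist.

```python
import hashlib

# Algorithms from the list above that hashlib always provides.
GUARANTEED = ("md5", "sha1")
# Algorithms that exist only if the local OpenSSL build exposes them.
OPTIONAL = ("md2", "mdc2", "ripemd160")


def digest_bits(name):
    """Return the digest length in bits for a hashlib algorithm name."""
    return hashlib.new(name).digest_size * 8


def available_algorithms():
    """Map each usable algorithm name to its digest length in bits."""
    sizes = {name: digest_bits(name) for name in GUARANTEED}
    for name in OPTIONAL:
        try:
            sizes[name] = digest_bits(name)
        except ValueError:
            # Not provided by this Python/OpenSSL build, so skip it.
            pass
    return sizes


sizes = available_algorithms()
```

On any Python installation the resulting mapping reports 128 bits for MD5 and 160 bits for SHA-1. The optional entries appear only where the underlying library supports them.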

Supported Checksum Sources

When an object contains multiple files (components), a checksum is generated and later verified for each component. Three types of checksum are distinguished:

Genuine Checksum (GC)

This checksum is provided through the API in the Archive request, or retrieved by the Oracle DIVArchive Actor from the Source/Destination. It ensures maximum security because it allows DIVArchive to verify all transfers to and within the archive system.

The GC is obtained before the archive starts. It is either passed in the archiveObject API function, or obtained from the Source/Destination by the Oracle DIVArchive Actor using an API provided by the Source/Destination manufacturer.

Archive Checksum (AC)

This checksum is generated during the transfer phase into DIVArchive. It is calculated during the actual transfer, from the data received from the network (for networked sources) or read from the device (for disk type sources). This type of checksum cannot detect corruption that occurs during the transfer from the Source/Destination to the Actor. However, all subsequent corruption can be detected.

The AC is calculated in real time during data transfer, before the data is written to disk or another storage medium within DIVArchive.

Deferred Checksum (DC)

This checksum is generated during the read of an object already stored in the archive system that has no checksum associated with it, either because the previous DIVArchive version did not support checksums or because the option was not activated.

This type of checksum cannot detect corruption that occurred at an earlier stage (for example, during the archive, or during later data movement within a copy or repack process). However, it enables corruption detection in all further data processing.

This checksum is generated during requests on existing objects (for example, Copy, Restore, and so on).
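The relationship between a checksum computed during transfer and one supplied up front can be illustrated with a minimal sketch. It is not DIVArchive code and does not use the DIVArchive API. The function name transfer_with_checksum, the CHUNK_SIZE value, and the use of in-memory streams are all assumptions made for illustration. MD5 is used because the text names it as the default checksum type.

```python
import hashlib
import io

CHUNK_SIZE = 64 * 1024  # illustrative buffer size, not a DIVArchive setting


def transfer_with_checksum(source, target, genuine_checksum=None):
    """Copy source to target, hashing each chunk before it is written.

    Returns the hex digest computed during the transfer. If a checksum
    was supplied before the transfer started, a mismatch raises ValueError.
    """
    digest = hashlib.md5()
    while True:
        chunk = source.read(CHUNK_SIZE)
        if not chunk:
            break
        digest.update(chunk)  # checksum is updated before the write
        target.write(chunk)
    computed = digest.hexdigest()
    if genuine_checksum is not None and computed != genuine_checksum.lower():
        raise ValueError("transfer checksum does not match supplied checksum")
    return computed


# Usage: one component transferred without, then with, a supplied checksum.
payload = b"example component data"
destination = io.BytesIO()
computed = transfer_with_checksum(io.BytesIO(payload), destination)
verified = transfer_with_checksum(io.BytesIO(payload), io.BytesIO(), computed)
```

For an object with several components, the same function would be called once per component, which yields one checksum per file as described above.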

Checksum Verification Workflows and Supported Requests

Checksum verification occurs when transferring data from a Source/Destination or when reading data from a source or a storage medium. The latter occurs during the retrieval of an object from a storage medium during routine functions (Restore, Copy, Repack, and Transcode, but not Oracle DIVArchive Partial File Restore), during a read-back from storage (Verify Following Write feature), or from the source (Verify Following Restore feature). MD5 is the default DIVArchive Checksum Type which applies to the entire system.

There are four major workflows used for Checksum Support Verification. Each of these workflows is described as follows:

Verify Read (VR)

Verify Read is the default workflow. It calculates a real-time checksum for content as it is being read from a storage medium (for example, a disk or a tape - including cache disks configured for VR) inside DIVArchive, and performs checksum verification. The operation is only considered successful after this full read operation is complete and the calculated checksum matches the checksum attached to the stored data.

Verify Write (VW)

Verify Write reads back data that was just written to a storage medium, for example a disk or a tape inside DIVArchive, and performs checksum verification. This read-back data will be used to calculate a real-time checksum and then discarded. The write operation is successful when the full read operation is complete and the calculated checksum matches the checksum of the incoming data.

Verify Following Archive (VFA)

Verify Following Archive re-transfers the data from the source device after an initial archive operation, and then compares it against the Archive Checksum (AC) that was calculated. The archive operation is successful when the second transfer is fully complete and the checksums are identical. This verification mode is not compatible with GC-enabled sources or complex objects.

Verify Following Restore (VFR)

Verify Following Restore re-transfers the data from the source device after an initial restore operation, and then compares it with the checksum attached to the stored data. This restore operation is considered successful when the second transfer is fully complete and the checksums are identical.

WARNING:

This verification mode is not compatible with complex objects.
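The Verify Read and Verify Write workflows can be sketched against an ordinary file. This is an illustration only, under the assumption that a local file stands in for the storage medium. The function names (file_md5, verify_read, write_with_verify) are hypothetical and are not DIVArchive functions. Note that an ordinary read-back may be served from operating system caches, so a real verification of the medium would need more than this.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=64 * 1024):
    """Read a file in full and return its MD5 hex digest."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_read(path, stored_checksum):
    """Verify Read style check: succeeds only if the full read matches."""
    return file_md5(path) == stored_checksum


def write_with_verify(path, data):
    """Verify Write style check: write, read back, compare, discard."""
    incoming = hashlib.md5(data).hexdigest()
    with open(path, "wb") as handle:
        handle.write(data)
        handle.flush()
        os.fsync(handle.fileno())
    if file_md5(path) != incoming:  # read-back data is hashed, not kept
        raise OSError("read-back checksum differs from incoming data")
    return incoming


# Usage: write with read-back verification, then verify two later reads.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "component.bin")
    stored = write_with_verify(path, b"example payload")
    read_ok = verify_read(path, stored)
    with open(path, "ab") as handle:
        handle.write(b"!")  # simulate corruption on the medium
    read_after_damage = verify_read(path, stored)
```

The first read succeeds because the content is unchanged. The second fails because the appended byte changes the calculated checksum.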

Each workflow can be used with one or several requests. The following list describes the DIVArchive requests and the corresponding checksum workflows. Each operation is listed, followed by the checksum workflows it supports.

Notes:

Verify Following Archive and Genuine Checksum are mutually exclusive.

Verify Write does not apply to cache disk writes.

Archive

Verify Read (default), Genuine Checksum, Verify Following Archive, Verify Write

Restore

Verify Read (default), Verify Following Restore

N-Restore

Verify Read (default)

Partial File Restore

Checksums are not supported for Partial File Restore. See Appendix A for Oracle DIVArchive options and licensing information.

Copy

Verify Read (default), Verify Write

Copy As New

Verify Read (default), Verify Write

Associative Copy

Verify Read (default), Verify Write

Verify Tapes

Verify Read (default)

Repack Tapes

Verify Read (default), Verify Write

Export

Export content with checksum (default) - See Appendix A for Oracle DIVArchive options and licensing information.

Import

Import content with checksum (default) - See Appendix A for Oracle DIVArchive options and licensing information.

Transcoding (Archive, Restore, Copy)

Verify Read, Genuine Checksum - both workflows are supported with a change in the object format.
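The request list above can also be read as a lookup table. The sketch below transcribes it into a dictionary and adds a check for the constraints stated in this section. The names SUPPORTED_WORKFLOWS and check_workflows are hypothetical, and nothing here reflects how DIVArchive stores or validates its configuration. Export and Import are omitted because the list gives them a checksum option, not one of the four workflows.

```python
VR, VW, VFA, VFR, GC = "VR", "VW", "VFA", "VFR", "GC"

# Request -> supported workflows, transcribed from the list above.
SUPPORTED_WORKFLOWS = {
    "Archive": {VR, GC, VFA, VW},
    "Restore": {VR, VFR},
    "N-Restore": {VR},
    "Partial File Restore": set(),  # checksums are not supported
    "Copy": {VR, VW},
    "Copy As New": {VR, VW},
    "Associative Copy": {VR, VW},
    "Verify Tapes": {VR},
    "Repack Tapes": {VR, VW},
    "Transcoding": {VR, GC},
}


def check_workflows(request, workflows, complex_object=False):
    """Return a list of problems; an empty list means no rule is violated."""
    problems = []
    unsupported = set(workflows) - SUPPORTED_WORKFLOWS[request]
    if unsupported:
        names = ", ".join(sorted(unsupported))
        problems.append(f"{request} does not support: {names}")
    if VFA in workflows and GC in workflows:
        problems.append("VFA and GC are mutually exclusive")
    if complex_object and set(workflows) & {VFA, VFR}:
        problems.append("VFA and VFR are not compatible with complex objects")
    return problems


copy_result = check_workflows("Copy", {VR, VW})
archive_result = check_workflows("Archive", {VFA, GC})
restore_result = check_workflows("Restore", {VW})
```

Here copy_result is empty, archive_result reports the mutual exclusion, and restore_result reports that Restore does not support Verify Write.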

Checksum Support with Complex Objects

DIVArchive 7.6 supports all checksum workflows for non-complex objects. However, only Verify Write (VW) is supported for complex objects.

Complex object checksums are stored in the Metadata database rather than the Oracle database, and therefore are not displayed in any database queries. The getObjectInfo API call returns a placeholder checksum rather than the real one, and it does not list all files and folders (only a single file representing the entire complex object is shown).

If checksum support is disabled when a complex object is archived, and then subsequently enabled, there will be no checksum comparison during operations on the complex object. In other words, whatever checksum is used when the complex object is archived will be the checksum used throughout the life of the object.

Checksum Support for Symbolic Links

DIVArchive 7.6 has limited support for checksums on symbolic links. A symbolic link carries the checksum value of the empty file that represents the link instead of the checksum of the file referenced by the link.
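A practical consequence can be shown with a short sketch, assuming that the "empty file" is a zero-length file and that the default MD5 checksum type is in use. Under those assumptions every symbolic link carries the same value, whatever its target contains. The variable names and the sample target content are illustrative only.

```python
import hashlib

# MD5 of zero bytes of input: the value an empty placeholder file hashes to.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()

# A hypothetical link target with real content hashes to something else,
# so the link's checksum says nothing about the target's integrity.
target_md5 = hashlib.md5(b"content of the referenced file").hexdigest()

print(EMPTY_MD5)                # d41d8cd98f00b204e9800998ecf8427e
print(EMPTY_MD5 == target_md5)  # False
```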