Metadata means “data about the data”; it describes the data and helps to determine how the data should be interpreted. In addition, metadata can be used to facilitate querying the 5800 system for objects that match a particular set of search criteria.
For the 5800 system, the supported metadata option is in the form of name-value fields stored with each object. The set of possible fields is defined in the metadata schema. Setting up a metadata schema is an important system administration task that is described in the 5800 System Administration Guide, and is analogous to the process of database design that goes into creating a data management application. The metadata schema determines what field names, types, and lengths may be used with the metadata stored with each object. In addition, the layout of fields into tables within the schema, together with the definition of views that speed certain searches, determines which kinds of queries about that metadata will be both possible and effective. As such, the metadata schema should match the characteristics of the expected range of applications that will deal with the stored data. The underlying software is designed to support multiple kinds of metadata to aid in searching. For example, eventually there might be a specialized index to facilitate full-text search within the data objects. This document describes only the API for dealing with the name-value metadata type.
Fields in the schema can be either queryable or non-queryable. The values for non-queryable fields may be retrieved later but may not be used in queries. The 5800 system supports only single-valued fields. Each object can have only a single name-value pair of a given name. There is no built-in support for multiple-valued fields, such as a list of authors of a book in the form of multiple fields named 'author'.
Each data object is associated with a set of name-value pairs at the time the object is stored. Some metadata (system metadata) is assigned by the 5800 system as each object is stored. For example, each object contains an “object creation time” (system.object_ctime) and an OID (system.object_id), both of which are assigned by the system at the time an object is created. Some metadata (the computed metadata) is implicit in the stored data, and is made explicit at the time of the object store. For example, the system exposes the object data length as a metadata field (system.object_size). In addition, the 5800 system computes a Secure Hash Algorithm (SHA1) hash of the stored data as the data is stored and stores the hash as a metadata field (system.object_hash). There is also an associated field (system.object_hash_alg) that specifies which hash algorithm was used in computing system.object_hash. It is currently always set to “sha1”.
Finally, some metadata (the user metadata) is supplied by the customer application in the API call at the time an object is stored. Each store operation is allowed to include a NameValueRecord that indicates a set of name-value pairs to be associated with the data object as metadata. Each name in the name-value record must match a field name from the metadata schema; in addition, the data value supplied for each field must match the type and length for the field as specified in the schema. If the names or values supplied for the user metadata do not match the active schema, then an exception is generated and the object is not stored.
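The name, type, and length checks described above can be sketched on the client side as follows. This is a minimal illustration only: the schema representation, the field names doc.title and doc.pages, the SchemaMismatchError exception, and the validate_record function are all hypothetical and are not part of the 5800 client API, which performs its own validation when the object is stored.

```python
class SchemaMismatchError(Exception):
    """Raised when a record does not fit the schema (hypothetical name)."""


# Illustrative schema: field name -> (expected type, maximum length or None).
# These field names are invented for the example.
SCHEMA = {
    "doc.title": (str, 32),
    "doc.pages": (int, None),
}


def validate_record(record, schema):
    """Raise SchemaMismatchError unless every name and value fits the schema."""
    for name, value in record.items():
        if name not in schema:
            raise SchemaMismatchError(f"unknown field: {name}")
        expected_type, max_length = schema[name]
        if not isinstance(value, expected_type):
            raise SchemaMismatchError(f"wrong type for {name}")
        if max_length is not None and len(value) > max_length:
            raise SchemaMismatchError(f"value too long for {name}")


# A record that fits the schema passes silently.
validate_record({"doc.title": "Quarterly report", "doc.pages": 12}, SCHEMA)
```

Checking a record locally before the store call lets an application report a mismatch without a round trip, but it does not replace the check made by the system against the active schema.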
The metadata associated with an object is immutable. There is no operation to modify the metadata associated with an object after the object has been stored. Instead, the storeMetadata operation can be used to create a completely new object by associating new user metadata with the underlying data and system metadata of an existing object. The storeMetadata operation does not merge the new metadata with the metadata from the original OID; instead, it creates a new metadata record pointing to the same data object. To merge new field values into existing metadata, the customer application must retrieve the existing metadata from the original object, perform the merge into a single NameValueRecord on the client side, and then call storeMetadata to create a new object with the merged metadata.
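The client-side merge step might look like the sketch below. Plain dictionaries stand in for NameValueRecord, and the merge_user_metadata function and the doc.* field names are hypothetical. The sketch also assumes that system fields returned with the existing metadata should be dropped before the merged record is handed to storeMetadata; that behavior is an assumption, not something confirmed here.

```python
def merge_user_metadata(existing, updates):
    """Return a new dict with `updates` applied over the user fields of `existing`.

    `existing` stands in for the record retrieved from the original object.
    Assumption: names starting with "system." are system fields and are not
    resubmitted as user metadata.
    """
    merged = {k: v for k, v in existing.items() if not k.startswith("system.")}
    merged.update(updates)  # new values win over old values of the same name
    return merged


existing = {
    "system.object_id": "oid-1",  # placeholder value
    "doc.title": "Quarterly report",
    "doc.status": "draft",
}
merged = merge_user_metadata(existing, {"doc.status": "final", "doc.owner": "alice"})
print(merged)
```

In a real application, `existing` would come from a retrieveMetadata call on the original OID, and `merged` would be passed to storeMetadata together with that OID.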
When creating a new object using storeMetadata, a new system.object_id and new system.object_ctime are generated, to indicate that a new object has been created. The metadata computed from the object data itself (system.object_size, system.object_hash_alg, and system.object_hash) does not change. Both the storeObject and the storeMetadata operations return a SystemRecord value that includes all of the system-assigned fields.
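To make the distinction between assigned and computed fields concrete, here is a toy in-memory model of the two store operations. It is a sketch of the semantics only: the InMemoryArchive class, its method names, the OID format, and the timestamp representation are hypothetical, and a plain dictionary stands in for SystemRecord.

```python
import hashlib
import itertools
import time


class InMemoryArchive:
    """Toy stand-in for the archive; it models semantics only, not the real API."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._data = {}     # data key -> bytes, shared by objects built on it
        self._objects = {}  # OID -> (data key, system fields, user fields)

    def _create(self, data_key, user_fields):
        data = self._data[data_key]
        system = {
            # Assigned fresh for every new object.
            "system.object_id": f"oid-{next(self._counter)}",
            "system.object_ctime": time.time(),
            # Derived from the data, so identical for objects that share it.
            "system.object_size": len(data),
            "system.object_hash_alg": "sha1",
            "system.object_hash": hashlib.sha1(data).hexdigest(),
        }
        self._objects[system["system.object_id"]] = (data_key, system, dict(user_fields))
        return system

    def store_object(self, data, user_fields):
        data_key = f"data-{next(self._counter)}"
        self._data[data_key] = bytes(data)
        return self._create(data_key, user_fields)

    def store_metadata(self, oid, user_fields):
        # New metadata record pointing at the same data; nothing is merged.
        data_key, _, _ = self._objects[oid]
        return self._create(data_key, user_fields)

    def retrieve_metadata(self, oid):
        _, system, user = self._objects[oid]
        return {**user, **system}


archive = InMemoryArchive()
first = archive.store_object(b"hello", {"doc.status": "draft"})
second = archive.store_metadata(first["system.object_id"], {"doc.owner": "alice"})
```

In this model, `first` and `second` carry different OIDs but the same size and hash, and the metadata retrieved for the second OID contains doc.owner but not doc.status.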
While retrieving the OID is the most common use of the SystemRecord, the other system fields can also be helpful. For example, the customer application might use the system.object_size, system.object_hash_alg, and system.object_hash fields to verify that the data as stored matches the data as present in the customer application. If a hash independently computed on the client matches the hash stored on the 5800 system, then the data store has been validated.
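The verification described above can be sketched as follows. A plain dictionary stands in for the SystemRecord, the verify_store function is hypothetical, and the sketch assumes the stored hash is available as a hexadecimal string; the actual encoding returned by the client library may differ.

```python
import hashlib


def verify_store(local_data, system_record):
    """Compare length and hash from a returned system record with local bytes."""
    if system_record["system.object_size"] != len(local_data):
        return False
    # hashlib.new accepts an algorithm name such as "sha1".
    algorithm = system_record["system.object_hash_alg"]
    digest = hashlib.new(algorithm, local_data).hexdigest()
    # Assumption: the stored hash is hex-encoded; compare case-insensitively.
    return digest == system_record["system.object_hash"].lower()


data = b"example payload"
# Simulated record; in practice this comes back from the store call.
record = {
    "system.object_size": len(data),
    "system.object_hash_alg": "sha1",
    "system.object_hash": hashlib.sha1(data).hexdigest(),
}
print(verify_store(data, record))
```

Reading the algorithm name from the record rather than hard-coding it keeps the check valid if a different hash algorithm is ever reported.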
The metadata values associated with an object can be retrieved using the retrieveMetadata operation. The retrieveMetadata operation takes an OID as input, and returns the entire set of user, system, and system-computed metadata. The retrieved metadata is in the form of a NameValueRecord that contains the value of each field as originally stored. The system fields appear under their field names; for example, the field system.object_ctime contains the object creation time. There is no operation to retrieve just a single field or a subset of fields by supplying a list of field names. The retrieveMetadata operation retrieves the values of both queryable and non-queryable fields.
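Because the full record is always returned, an application that needs only a few fields can filter on the client after retrieval. The sketch below shows one way to do that; the select_fields function and the doc.* field names are hypothetical, and a plain dictionary stands in for the retrieved NameValueRecord.

```python
def select_fields(record, names):
    """Return only the requested fields that are present in a full record."""
    return {name: record[name] for name in names if name in record}


# Simulated result of retrieving all metadata for one object.
full_record = {
    "system.object_id": "oid-1",  # placeholder value
    "system.object_ctime": 0,     # placeholder value
    "doc.title": "Quarterly report",
    "doc.status": "final",
}
subset = select_fields(full_record, ["doc.title", "system.object_ctime", "doc.missing"])
print(subset)
```

Names that are absent from the record are skipped rather than treated as errors; an application that requires every requested field could raise an exception instead.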