CHAPTER 6

Configuring Metadata and Virtual File System Views

This chapter describes how to modify the default schema file to add metadata specific to your applications. This chapter also describes how to modify the default schema file to configure virtual file system views that allow users to browse data objects as though they were stored in a traditional hierarchical file structure.

This chapter contains the following sections:



Note - For instructions on accessing the CLI commands and GUI functions described in this chapter, see Using the Administrative Interfaces.



The Metadata Schema

The metadata schema specifies the metadata attributes that can be stored with objects in the 5800 system. The system comes preconfigured with a default metadata schema, which you can modify to specify metadata appropriate to your applications.

The following sections describe the metadata schema file and its components.

Metadata Schema File

You specify what metadata the data objects in your system include and how that metadata is structured using the schema file. You also configure virtual views using the schema file. A predefined schema file, which contains the minimum set of attributes, is included with the 5800 system. You modify that schema file to add the extended metadata and file system views applicable to your configuration.

Schema File Structure

The schema file for the 5800 system is a standard XML file with the general format shown in CODE EXAMPLE 6-1. See CODE EXAMPLE 6-3 for an example of a schema file.


CODE EXAMPLE 6-1 General Structure of the Schema File
<?xml version="1.0" encoding="UTF-8"?>
<metadataConfig>
          <schema>
               Schema definition
          </schema>
          <fsViews>
               File system views specification
          </fsViews>
          <tables>
               Tables definition
          </tables>
</metadataConfig>

 

Schema File DTD

The Document Type Definition (DTD), which defines the structure of the schema file, is shown in CODE EXAMPLE 6-2.


CODE EXAMPLE 6-2 DTD for Schema File
<?xml version="1.0" encoding="UTF-8"?>
<!--Sun StorageTek 5800 Metadata Configuration Rules.(Sun Microsystems, Inc.)-->
<!ELEMENT metadataConfig (schema, fsViews?, tables?)>
<!ELEMENT schema (namespace*, field*)>
<!ELEMENT namespace (namespace*, field*)>
<!ELEMENT field EMPTY>
<!ELEMENT fsViews (fsView*)>
<!ELEMENT fsView (attribute+)>
<!ELEMENT attribute EMPTY>
<!ELEMENT tables (table*)>
<!ELEMENT table (column+)>
<!ELEMENT column EMPTY>
<!ATTLIST namespace
	name CDATA #REQUIRED
	writable (true | false) "true"
	extensible (true | false) "true">
<!ATTLIST field
	name CDATA #REQUIRED
	type (long | double | string | char | binary | date | time | timestamp | objectid) #REQUIRED
	length CDATA #IMPLIED
	queryable (true | false) "true">
<!ATTLIST fsView
	name CDATA #REQUIRED
	filename CDATA #REQUIRED
	namespace CDATA #IMPLIED
	readonly (true | false) "false"
	filesonlyatleaflevel (true | false) "true"
	fsattrs (true | false) "false">
<!ATTLIST attribute
	name CDATA #REQUIRED>
<!ATTLIST table
	name CDATA #REQUIRED>
<!ATTLIST column
	name CDATA #REQUIRED>

 

Example Schema File

CODE EXAMPLE 6-3 shows an example of a schema file for a system storing MP3 music files.


CODE EXAMPLE 6-3 Example Schema File
<?xml version="1.0" encoding="UTF-8"?>
<!--Example of schema configuration file for a system storing MP3
music files.-->
<metadataConfig>
    <schema>
        <namespace name="mp3" writable="true" extensible="true">
            <field name="artist" type="string" length="128" queryable="true"/>
            <field name="album" type="string" length="128" queryable="true"/>
            <field name="title" type="string" length="128" queryable="true"/>
            <field name="type" type="string" length="128" queryable="true"/>
            <field name="year" type="long" queryable="true"/>
        </namespace>
    </schema>
    <fsViews>
        <fsView name="byArtist" namespace="mp3"
                filename="${title}.${type}" fsattrs="true"
                filesonlyatleaflevel="true">
            <attribute name="artist"/>
            <attribute name="album"/>
        </fsView>
        <fsView name="byAlbum" filename="${mp3.title}.mp3"
                readonly="true" fsattrs="true">
            <attribute name="mp3.album"/>
        </fsView>
    </fsViews>
    <tables>
        <table name="mp3">
            <column name="mp3.artist"/>
            <column name="mp3.album"/>
            <column name="mp3.title"/>
            <column name="mp3.year"/>
        </table>
    </tables>
</metadataConfig>

 

Metadata

Metadata is information that describes a data object. The 5800 system stores metadata about all data objects in a distributed database. Users can issue queries to search the database and find objects based on the metadata assigned to them. The 5800 system allows two types of metadata: system and extended.

System Metadata

The 5800 system automatically assigns system metadata to every data object when it is stored on the 5800 system. System metadata includes a unique identifier for each object, called the Object ID or OID. The application programming interface (API) included with the 5800 system can retrieve the object using this OID. System metadata also includes creation time, data length, and data hash.

Extended Metadata

Extended metadata goes beyond the system metadata to further describe each data object. For example, if the data stored on the 5800 system includes medical records, extended metadata attributes might include patient name, date of visit, doctor name, medical record number, and insurance company. Users can issue queries to retrieve data objects using these attributes. For example, a query could retrieve all records (data objects) for a given doctor and a particular insurance company.

Metadata Types

The 5800 system supports metadata as sets of typed name-value pairs. TABLE 6-1 lists the supported metadata types.


TABLE 6-1 Supported Metadata Types

| Valid Types | Description |
|---|---|
| Long | 64 bits. Maximum value: 9223372036854775807. Minimum value: -9223372036854775808. |
| Double | 64 bits. Maximum value: 1.7976931348623157E308. Minimum positive value: 4.9E-324. |
| String | A string of characters from the Basic Multilingual Plane of Unicode values, excluding the null character (0). Characters from the range of Unicode Surrogates (D800-DFFF) are not supported. Length can be 0 to 4000 Unicode characters. |
| Char | A string of eight-bit characters in the ISO-8859-1 (Latin-1) character set, excluding the null character (0). Length can be 0 to 8000 Latin-1 characters. |
| Binary | A string of bytes from the range 00 to FF. Length can be 0 to 8000 bytes. |
| Date | Corresponds to the JDBC SQL DATE type (Year/Month/Day). |
| Time | Corresponds to the JDBC SQL TIME type with precision 0 (seconds past midnight). |
| Timestamp | Corresponds to the JDBC SQL TIMESTAMP type with precision 3 (absolute Year/Month/Day/Hour/Minute/Second/Millisecond). |
| ObjectID | Binary value that specifies the OID of the data. |


Namespaces

You can group metadata into namespaces, or collections of metadata names, identified by a string. Namespaces are essentially directories of metadata names. Just as directories can include subdirectories, namespaces can include subnamespaces or namespaces within namespaces. You can have as many namespaces as you want in the 5800 system metadata schema. There is also no limit on the number of subnamespaces within a given namespace.

The full name of an attribute is the name of its namespace, followed by a dot, followed by the attribute name. For example, the attribute name yoyodyne.widget.oscillation.overthruster represents an attribute whose name is overthruster, which is grouped within the subnamespace oscillation, which is part of the subnamespace widget, which in turn is part of the namespace yoyodyne.

Writable and Extensible Namespaces

When defining a namespace in the metadata schema, you can define two optional properties, writable and extensible. Both default to true.

If a namespace is writable, you can specify any field in the namespace when you store an object. If a namespace is nonwritable, it is read-only, and you cannot specify any of the fields. The system namespace, for example, is nonwritable (read-only). If a namespace is nonwritable, any subnamespaces you add will also be nonwritable.

By default, namespaces are extensible, which means that you can add attributes or subnamespaces to the namespace. You can change a namespace from extensible to nonextensible, but you cannot change a nonextensible namespace back to extensible.

Reserved Namespaces

The 5800 system reserves a namespace called system for metadata created by the 5800 system itself and a namespace called filesystem to specify how the file system layer presents files. For example, the system namespace includes the creation time for an object and the filesystem namespace includes the object’s user identifier (UID) and group identifier.

TABLE 6-2 lists the namespaces that the 5800 system reserves.


TABLE 6-2 Reserved Namespaces

| Name | Writable | Extensible |
|---|---|---|
| system | false | false |
| filesystem | true | false |


The system Namespace

TABLE 6-3 lists the contents of the reserved system namespace.


TABLE 6-3 system Namespace Contents

| Attribute Name | Definition |
|---|---|
| system.object_id | The object identifier |
| system.object_ctime | Creation time |
| system.object_layoutMapId | Layout map used to store the object |
| system.object_size | Data size |
| system.object_hash | Hash value for the data |
| system.object_hash_alg | Algorithm used to compute the hash (for example, SHA1) |


The filesystem Namespace

TABLE 6-4 lists the contents of the reserved filesystem namespace.


TABLE 6-4 filesystem Namespace Contents

| Attribute Name | Definition |
|---|---|
| filesystem.uid | Owner ID |
| filesystem.gid | Group ID |
| filesystem.mode | File mode (permissions, and so on) |
| filesystem.mtime | Last modification time |
| filesystem.mimetype | MIME type |


Fully Qualified Names

Applications must always use the fully qualified name of the attribute when storing metadata or querying. The fully qualified name consists of all the enclosing namespace names, from the broadest to the narrowest, separated by dots, followed by the attribute name itself, as in namespace.subnamespace.fieldName.

Planning Namespaces

You might want to use the name of your organization or company as the top level namespace, and something like project names as subnamespaces. For example, an organization named Yoyodyne, Inc. might set up their namespaces and subnamespaces as follows:


<namespace name="yoyodyne">
    <namespace name="widget">
        <namespace name="oscillation">
            <attribute name="overthruster"/>
            ...
        </namespace>
    </namespace>
    <namespace name="lectroid">
        <attribute name="type"/>
        ...
    </namespace>
</namespace>
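
The way fully qualified names follow from namespace nesting can be sketched in a few lines of Python. This is only an illustration, not part of the 5800 system software: it parses a small schema fragment with the standard xml.etree.ElementTree module and walks the nested namespace elements. The fragment declares its attributes with field elements, as the schema DTD specifies, and the type and length values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema fragment; the type and length values are assumptions
# chosen only so that each field element is complete.
SCHEMA = """<metadataConfig>
  <schema>
    <namespace name="yoyodyne">
      <namespace name="widget">
        <namespace name="oscillation">
          <field name="overthruster" type="string" length="64"/>
        </namespace>
      </namespace>
      <namespace name="lectroid">
        <field name="type" type="string" length="32"/>
      </namespace>
    </namespace>
  </schema>
</metadataConfig>
"""


def qualified_names(element, prefix=()):
    """Yield the fully qualified name of every field below element."""
    for child in element:
        if child.tag == "namespace":
            # Descend, extending the dotted prefix with this namespace name.
            yield from qualified_names(child, prefix + (child.get("name"),))
        elif child.tag == "field":
            yield ".".join(prefix + (child.get("name"),))


root = ET.fromstring(SCHEMA.encode("utf-8"))
names = list(qualified_names(root.find("schema")))
for name in names:
    print(name)
```

The names printed by this sketch are the ones an application would use when storing metadata or issuing queries.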

Tables and Columns

You partition the metadata schema into tables and specify each metadata field as a column within a particular table. Objects stored in the 5800 system become rows in one or more tables, depending on which metadata fields are associated with that data.

All fields in a query should come from the same table, since queries might fail if they include fields from different tables. The largest supported query string is 8080 bytes. The combined size of all query literals and parameters is also limited to 8080 bytes. If you must use queries that include fields from more than one table, make sure that no more than one query that references multiple tables is running at a time. Refer to the Sun StorageTek 5800 System Client API Reference Guide for complete information about query sizes and limitations.

Table Example

Suppose you specify the columns of a table named reference in the metadata schema as shown in the following example.


<table name="reference">
	<column name="mp3.artist"/>
	<column name="mp3.album"/>
	<column name="mp3.title"/>
	<column name="dates.year"/>
</table>

The reference table you create would have the logical layout shown in TABLE 6-5.


TABLE 6-5 Example Table (reference Table)

| OID | mp3.artist | mp3.album | mp3.title | dates.year |
|---|---|---|---|---|
| Object1 | Benny Goodman | The Very Best of Benny Goodman | St. Louis Blues | 2000 |
| Object2 | Rod Stewart | The Very Best of Rod Stewart | Maggie May | 2001 |
| Object3 | Bing Crosby | Null | I’m Dreaming of a White Christmas | Null |


When an object that has any of the specified metadata attributes (mp3.artist, mp3.album, mp3.title, or dates.year) is stored in the 5800 system, that object's OID is listed as a row in the reference table, and the value of each attribute is listed in the column that corresponds to that attribute. If no value is assigned to an attribute for that object, the corresponding column is null, as shown for Object3.

If the object has other metadata associated with it, that object will also be stored in whichever tables include that other metadata as columns.

The length Attribute for Fields

You specify a length attribute for fields of type string, binary, and char. The length attribute is important because there are limits to the number of bytes that each table row and each index can store. See Planning Tables and Planning Indexes for more information.



Note - The 5800 system emulator supports the same field length as that supported by the 5800 system, within the specified limits.


Trying to store a string, binary, or char value that is longer than the specified field length will result in an error message.

Planning Tables

You should store metadata attributes that occur together in queries in the same table, since queries that include fields from different tables may fail. Pay close attention to which metadata attributes occur together in your data, especially if those attributes are used in queries, and group those fields together into the same table.

Conversely, avoid putting fields into the same table that do not occur together in queries, since doing so wastes space and degrades query performance.

Planning Table Rows

When planning tables, be aware that the maximum number of bytes allowed for any single row in the table is 8080.

You might want to specify as small a length value as possible for each field (column) in the table so that you can fit as many columns as possible in the table and any single row will not exceed the 8080-byte limit.

TABLE 6-6 lists the number of bytes that each element in a column consumes. The total amount of space consumed by all the columns in a table cannot exceed 8080 bytes.


TABLE 6-6 Number of Bytes Used By Each Column in a Table Row Definition

Element

Space Consumed

System overhead

78 bytes per table row.

Column (Field)

Each column (or field) in a table row uses 2 bytes per column for overhead, plus the number of bytes in the field. The number of bytes for each field type is as follows:

  • string - twice the length of the field
  • long - 8 bytes
  • double - 8 bytes
  • timestamp - 8 bytes
  • date - 4 bytes
  • time - 4 bytes
  • char - length of the field
  • binary - length of the field
  • objectID - 30 bytes

For example, a field of type string with length 80 consumes 2 bytes for overhead plus 160 bytes for the field length, for a total of 162 bytes. A field of type date consumes 2 bytes for overhead plus 4 bytes for field length, for a total of 6 bytes.


Planning Table Rows Example

Suppose the fields listed in TABLE 6-7 commonly occur together and will be used together in queries. (Three of these fields are in the namespace mp3 and one is in the namespace dates.)


TABLE 6-7 Examples of Fields to Be Added to a Table

| Field | Type | Length Setting | Bytes Required |
|---|---|---|---|
| mp3.artist | string | 256 | 512 |
| mp3.album | string | 256 | 512 |
| mp3.title | string | 512 | 1024 |
| dates.year | long | NA | 8 |


Include each of these fields as columns in the same table, called, for example, reference. The maximum number of bytes allowed in any row in the table is 8080. When planning the reference table, calculate the total number of bytes used by all the columns combined, to make sure it is less than 8080, as follows:

78 (for system overhead) +
8 (2 per column for column overhead) +
512 (for mp3.artist) +
512 (for mp3.album) +
1024 (for mp3.title) +
8 (for dates.year)

____________________________

= 2142 bytes total

Since 2142 bytes is less than 8080 bytes, the total combined size of all columns is acceptable.
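
The same arithmetic can be scripted when you are planning several tables. The following Python sketch is an illustration only, not a tool shipped with the 5800 system; it encodes the per-field sizes from TABLE 6-6 and reproduces the calculation for the reference table. The function and constant names are hypothetical.

```python
ROW_OVERHEAD = 78      # system overhead per table row (TABLE 6-6)
COLUMN_OVERHEAD = 2    # overhead per column (TABLE 6-6)
MAX_ROW_BYTES = 8080   # maximum number of bytes for any single row

# Sizes of the fixed-width field types, from TABLE 6-6.
FIXED_SIZES = {"long": 8, "double": 8, "timestamp": 8,
               "date": 4, "time": 4, "objectid": 30}


def field_bytes(field_type, length=None):
    """Return the bytes a field consumes, excluding column overhead."""
    if field_type == "string":
        return 2 * length            # twice the length of the field
    if field_type in ("char", "binary"):
        return length                # length of the field
    return FIXED_SIZES[field_type]


def row_bytes(columns):
    """Return the total bytes for a row with the given (type, length) columns."""
    return ROW_OVERHEAD + sum(COLUMN_OVERHEAD + field_bytes(t, n)
                              for t, n in columns)


# The reference table: mp3.artist, mp3.album, mp3.title, dates.year.
reference = [("string", 256), ("string", 256), ("string", 512), ("long", None)]
total = row_bytes(reference)
print(total, "bytes; within limit:", total <= MAX_ROW_BYTES)
```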

Planning Tables Checklist

For best results, consider this information when planning tables:

Indexes

The system creates indexes on metadata fields to allow those fields to be queried more efficiently. You use virtual file system views to specify the content of the indexes the system creates and to maximize query performance.



Note - You can also configure virtual file system views for purposes that have nothing to do with indexes. See Virtual File System Views for information about virtual file system views.


For each virtual file system view you create, the system creates an index of up to 15 fields, as long as those fields all come from the same table.



Note - Each virtual file system view consumes more system resources than in previous versions of the 5800 system. For best performance, create only virtual file system views that are required for an application or to define an index to speed up queries.


Planning Indexes

For virtual file system views that you create in order to specify indexes that improve query performance, follow these guidelines:

Planning Indexes Examples

This section includes two different examples of how you might go about planning for indexes.

Example 1

Suppose you want to have a query on the fields listed in TABLE 6-9.


TABLE 6-9 Example of Fields to Be Added to a Table

| Field | Type | Length Setting | Bytes Required |
|---|---|---|---|
| book.author | string | 50 | 100 |
| book.series | string | 50 | 100 |
| book.title | string | 50 | 100 |
| dates.year | long | NA | 8 |


To maximize query performance, you include each of these fields as columns within the same table, called books. To maximize performance even further, you create a virtual file system view called, for example, bookview, that includes these fields and no others so that an index is created on these fields for querying.

Since all of the fields are from the same table, the system creates an index that includes all of these fields, as long as the total number of bytes required for the index does not exceed 1024. Calculate the number of bytes required for the index as follows:

78 (for system overhead) +
8 (2 per column for column overhead) +
100 (for book.author) +
100 (for book.series) +
100 (for book.title) +
8 (for dates.year)

____________________________

= 394 bytes total

Since 394 is less than 1024, the system indexes all of the fields, allowing them to be queried at maximum performance.

If you calculate that the fields in a query cannot be indexed because they require too much space, you might want to reduce the length specified for each field. Alternatively, you might want to define a virtual file system view with a smaller set of fields. An index of a subset of fields in the query might still help to speed up query performance.
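
The index check in this example can also be expressed as a short Python sketch. It is an illustration under stated assumptions, not the logic the 5800 system itself runs: it applies the limits given in this section (fields from a single table, at most 15 fields, at most 1024 bytes) and treats a set of fields that breaks any of them as not fully indexable, which is a simplification. All names in the sketch are hypothetical.

```python
MAX_INDEX_BYTES = 1024   # index size limit used in Example 1
MAX_INDEX_FIELDS = 15    # maximum number of fields in one index


def index_bytes(field_sizes):
    """78 bytes of overhead, plus 2 bytes of overhead and the size of each field."""
    return 78 + sum(2 + size for size in field_sizes.values())


def can_index(field_sizes, table_of):
    """Return True if all fields fit into a single index under the stated limits."""
    same_table = len({table_of[name] for name in field_sizes}) == 1
    return (same_table
            and len(field_sizes) <= MAX_INDEX_FIELDS
            and index_bytes(field_sizes) <= MAX_INDEX_BYTES)


# Bytes required per field, from TABLE 6-9.
bookview = {"book.author": 100, "book.series": 100,
            "book.title": 100, "dates.year": 8}
tables = {name: "books" for name in bookview}
print(index_bytes(bookview), can_index(bookview, tables))

# With string lengths of 200 (400 bytes each), the index would be too large.
too_wide = {"book.author": 400, "book.series": 400,
            "book.title": 400, "dates.year": 8}
print(index_bytes(too_wide), can_index(too_wide, tables))
```

The second case shows why reducing field lengths, or indexing a smaller set of fields, can bring a view back under the limit.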

Example 2

Suppose your system is configured with the schema file shown in CODE EXAMPLE 6-4.


CODE EXAMPLE 6-4 Example Schema File for Index Planning
<?xml version="1.0" encoding="UTF-8"?>
<!--Example Schema File for Index Planning-->
<metadataConfig>
    <schema>
        <namespace name="MyTube" writable="true" extensible="true">
            <field name="Title" type="string" length="38" queryable="true"/>
            <field name="keywords" type="string" length="120" queryable="true"/>
            <field name="owner" type="string" length="25" queryable="true"/>
            <field name="format" type="long" queryable="false"/>
            <field name="date" type="string" length="12" queryable="true"/>
        </namespace>
    </schema>
    <tables>
        <table name="videos">
            <column name="MyTube.Title"/>
            <column name="MyTube.keywords"/>
            <column name="MyTube.owner"/>
            <column name="MyTube.format"/>
            <column name="MyTube.date"/>
        </table>
    </tables>
</metadataConfig>

 

If you know that users are likely to search on the owner, date, and keywords fields, you could create an index called key_owner_index on those fields using the fsView tag, as shown in CODE EXAMPLE 6-5. (Since keywords is included in the filename property, it is automatically included as an attribute of the fsView and therefore included in the index.)


CODE EXAMPLE 6-5 Using fsView to Create Index on Commonly Searched Fields
<?xml version="1.0" encoding="UTF-8"?>
<!--Using fsView to Create Index on Commonly Searched Fields-->
<metadataConfig>
    <schema>
        <namespace name="MyTube" writable="true" extensible="true">
            <field name="Title" type="string" length="38" queryable="true"/>
            <field name="keywords" type="string" length="120" queryable="true"/>
            <field name="owner" type="string" length="25" queryable="true"/>
            <field name="format" type="long" queryable="false"/>
            <field name="date" type="string" length="12" queryable="true"/>
        </namespace>
    </schema>
    <fsViews>
        <fsView name="key_owner_index" namespace="MyTube"
                filename="${keywords}">
            <attribute name="owner"/>
            <attribute name="date"/>
        </fsView>
    </fsViews>
    <tables>
        <table name="videos">
            <column name="MyTube.Title"/>
            <column name="MyTube.keywords"/>
            <column name="MyTube.owner"/>
            <column name="MyTube.format"/>
            <column name="MyTube.date"/>
        </table>
    </tables>
</metadataConfig>

Users in this example might also commonly search on just the owner and keywords fields, and sometimes on owner, keywords, and Title. The system cannot process queries that do not exactly match an existing index as quickly as queries that do, but if the query fields are almost the same as the fields in the indexes, the performance might still be acceptable.

You should test the queries on your system to see if additional indexes are required to speed query performance.

Excluding Attributes From Indexes and Queries

By setting queryable to false for a field, you can exclude that field from the metadata that is indexed and available for queries. You might want to exclude a field from indexes, for example, if you will only access that field through the retrieveMetadata example application, and never through queries.

Planning Tables and Indexes Checklist

To maximize the performance of queries, keep in mind these considerations when planning tables and indexes:


Virtual File System Views

The 5800 system stores data as discrete objects that users retrieve through queries on object identifiers and/or metadata. The data is not stored in the hierarchical structure typical of file systems, which contain directories, subdirectories, and files.

However, you can set up a virtual view of the data, which presents the data objects in a hierarchical structure that mimics a file system. For example, for a 5800 system that stores MP3 files, you could set up a virtual view with a directory for the artist, a subdirectory for the album, and file names based on the title of the music files.

Users access the file system views of the data using a browser and the Web-based Distributed Authoring and Versioning (WebDAV) protocol.

WebDAV

You access virtual file system views of the data through the Web-based Distributed Authoring and Versioning (WebDAV) protocol, a set of extensions to the HTTP/1.1 protocol that enables you to read, add, and delete files on remote web servers.

WebDAV is not supported for multi-cell configurations.



Note - Virtual file system views are available to browse whenever the Status at a Glance panel in the GUI or the sysstat CLI command shows that the Query Engine status is HAFaultTolerant. See Monitoring the System or Obtaining System Status for more information.


Using WebDAV to Browse Virtual File System Views

To access the virtual file system views using WebDAV, type the following in the browser’s address bar:

http://data-VIP:8080/webdav

where data-VIP is the data VIP address of the 5800 system. See Data IP Address for information about the data VIP address.
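
A script can build the same address. The following Python sketch is a hedged illustration that uses only the standard urllib.request module: it constructs the URL shown above and prepares a PROPFIND request with a Depth header, which is the directory-listing method defined by the WebDAV standard. The sketch does not send the request. The address 192.0.2.10 is a placeholder for your data VIP, the function name is hypothetical, and the exact response returned by the 5800 system is not described here.

```python
import urllib.request


def webdav_listing_request(data_vip, path=""):
    """Prepare (but do not send) a WebDAV PROPFIND request for a view path."""
    # Port 8080 and the /webdav prefix come from the address shown above.
    url = f"http://{data_vip}:8080/webdav/{path}".rstrip("/")
    # Depth: 1 asks for the resource and its immediate children.
    return urllib.request.Request(url, method="PROPFIND",
                                  headers={"Depth": "1"})


# 192.0.2.10 is a placeholder address, not a real data VIP.
root_request = webdav_listing_request("192.0.2.10")
view_request = webdav_listing_request("192.0.2.10", "byArtist")
print(root_request.get_method(), root_request.full_url)
print(view_request.get_method(), view_request.full_url)
```

To send a prepared request, you could pass it to urllib.request.urlopen from a client that can reach the data VIP.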

WebDAV Example

The following example shows a WebDAV screen that might be displayed in a user’s browser. It lists the virtual file system views that are defined on that system.


.
..
byArtist
byAlbum
byYear

Clicking the links on this page enables users to browse objects as though they were arranged in a file system structure.

For example, suppose you have defined a virtual file system view byArtist that includes the subdirectories artist and album (in that order). You have indicated in the virtual file system view definition that the files should be named by track number (tracknum). Clicking byArtist in the browser would yield the list of artists, as follows:


..
Beatles
Madonna
Prince
Rush

Clicking Rush would list the album names, as follows:


.
..
2112
Signals

Clicking Signals would list the album's track numbers, as follows:


.
..
1
2
3
4
5
6
7
8

Clicking the link for 1 would enable users to access the data object on the 5800 system associated with track 1 of the Rush album Signals.



Note - The procedure users follow to add or delete files from the virtual file system view using WebDAV varies by browser. Consult the browser’s documentation or online help for information.


Metadata Attributes and WebDAV Properties

Each file in the 5800 system virtual view appears as a file in the file system exported to WebDAV. The file attributes (stat data) are exported as WebDAV properties. TABLE 6-10 lists the WebDAV property names and corresponding system metadata attributes. These attributes are regular metadata values accessible through API queries.


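TABLE 6-10 WebDAV Property Names and System Metadata Attributes

| Category | WebDAV Property | Metadata Attribute | Description |
|---|---|---|---|
| Predefined properties | DAV:getlastmodified | filesystem.mtime | Last modification time |
| Predefined properties | DAV:getcontentlength | system.object_size | Size of file |
| Predefined properties | DAV:creationdate | system.object_ctime | File creation time |
| Predefined properties | DAV:getcontenttype | filesystem.mimetype | MIME type |
| Predefined properties | DAV:displayname | filename | Name presented to user |
| Properties specific to 5800 storage system | HCFS:mode | filesystem.mode | File mode (permissions, and so on) |
| Properties specific to 5800 storage system | HCFS:uid | filesystem.uid | Owner ID |
| Properties specific to 5800 storage system | HCFS:gid | filesystem.gid | Group ID |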




Note - All timestamps are 64-bit signed offsets, in milliseconds, from the epoch (00:00:00 1/1/1970 Coordinated Universal Time (UTC)). The range is approximately 300 million years.

The file size, uid, and gid are unsigned 64-bit integers, while the creationdate property is returned as an ISO 8601 localized string. The getlastmodified property is a string similar to the output of date(1) (for example, Mon Apr 9 17:57:11 UTC 2007).
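
Client code that reads these timestamps typically needs to convert between millisecond offsets and calendar dates. The following Python sketch shows one way to do that with the standard datetime module. It is an illustration only; the function names are hypothetical, and the sample value corresponds to the date used in the example above.

```python
from datetime import datetime, timedelta, timezone

# The epoch described in the note above: 00:00:00 1/1/1970 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_utc(ms):
    """Convert a signed millisecond offset from the epoch to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


def utc_to_ms(moment):
    """Convert a timezone-aware datetime to milliseconds since the epoch."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


stamp = 1176141431000
print(ms_to_utc(stamp).isoformat())   # 2007-04-09T17:57:11+00:00
print(utc_to_ms(ms_to_utc(stamp)))    # 1176141431000
```

Using timedelta arithmetic rather than floating-point seconds keeps the millisecond value exact in both directions.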


Including Additional File Attributes in Virtual Views

As described in Metadata Attributes and WebDAV Properties, the 5800 system exports a number of file attributes as part of a virtual view. In addition to those attributes that are always exported, you can choose to have the remaining attributes in the filesystem namespace (filesystem.mimetype and filesystem.mtime) exported with the files.

If you choose this option, the WebDAV browser uses the filesystem.mimetype attribute as Content-type in the HTTP header. With Content-type supplied in the HTTP header, when a user clicks on a link to download the file, WebDAV opens the appropriate program. Without Content-type in the HTTP header, the WebDAV browser does not know the file’s type and simply prompts the user to save the file to disk.

If you are using the CLI to configure virtual views, select this option by setting fsattrs to true in the schema file, as shown in the Example Schema File.

If you are using the GUI to configure virtual views, select this option by selecting the Include Extended File System Fields checkbox on the Setup Virtual File Systems panel. See Configuring Virtual File System Views Using the GUI for more information on using the GUI to configure virtual views.



Note - Choosing this option to retrieve the additional file system attributes requires an additional query to the 5800 system and therefore might negatively affect system performance.


Directory Structure in a Virtual File System View

You can use the filesonlyatleaflevel attribute to control which objects are displayed as part of a virtual file system view.

If you keep the filesonlyatleaflevel attribute at its default of true, an object is displayed as part of the virtual file system view only if it has metadata values stored in the 5800 system for all fields specified in the attribute list for the virtual file system view and also in the filename description.

For example, suppose you have set up a virtual view called byArtist as follows:


<fsView name="byArtist" namespace="mp3"
        filename="${title}.${type}" fsattrs="true"
        filesonlyatleaflevel="true">
    <attribute name="artist"/>
    <attribute name="album"/>
</fsView>

In this case, only objects with metadata values for title, type, artist, and album will appear in the virtual file system view. For example, the three objects shown here are stored with metadata values for title, type, artist, and album, and therefore appear at the bottom (or “leaf”) level of the directory in the virtual file system view.


beatles
          abbey_road	
             something.mp3
             because.mp3
             come_together.mp3

An object that has metadata values for title and artist, but not for type or album, simply would not appear in the view.

If you set the filesonlyatleaflevel attribute to false, any object that has metadata values for all fields specified in the filename description, as well as metadata values for a subset of the fields in the attribute list, appears in the virtual file system view, at the upper levels of the directory (not at the “leaf-level”).



Note - To appear in an upper-level directory in the structure, all of the object’s attributes at higher levels must have values and all attributes at lower levels must not have values. All attributes in the virtual file system view must be defined for the object.


For example, in the preceding example, if the filesonlyatleaflevel attribute were set to false, an object with metadata values for title, type, and artist, but not album, would appear in the virtual file system view as the song “Shattered” by the Rolling Stones appears here:


beatles
          abbey_road	
             something.mp3
             because.mp3
             come_together.mp3
rolling_stones	
             shattered.mp3



Note - All attributes in a virtual file system view for which you have specified filesonlyatleaflevel = false must be in the same table. See Tables and Columns for more information about tables.
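
The placement rules described in this section can be summarized in a short Python sketch. It is a simplified model written for illustration, not the 5800 system's implementation: it handles only unqualified names in the filename pattern, the function name is hypothetical, and placing an object with no directory attributes at the top of the view is an assumption.

```python
import re
from string import Template


def view_path(metadata, attributes, filename, files_only_at_leaf_level=True):
    """Return the path at which an object appears in a view, or None if hidden."""
    # Every field referenced in the filename pattern must have a value.
    needed = re.findall(r"\$\{(\w+)\}", filename)
    if any(metadata.get(name) is None for name in needed):
        return None
    values = [metadata.get(name) for name in attributes]
    # Count how many leading directory attributes have values.
    depth = 0
    while depth < len(values) and values[depth] is not None:
        depth += 1
    # Attributes below the first missing one must not have values.
    if any(value is not None for value in values[depth:]):
        return None
    # With the default setting, files appear only at the leaf level.
    if files_only_at_leaf_level and depth < len(values):
        return None
    return "/".join(values[:depth] + [Template(filename).substitute(metadata)])


by_artist = ["artist", "album"]
pattern = "${title}.${type}"
something = {"artist": "beatles", "album": "abbey_road",
             "title": "something", "type": "mp3"}
shattered = {"artist": "rolling_stones", "title": "shattered", "type": "mp3"}

print(view_path(something, by_artist, pattern))
print(view_path(shattered, by_artist, pattern))
print(view_path(shattered, by_artist, pattern, files_only_at_leaf_level=False))
```

In this model, the object for “Shattered” is hidden under the default setting and appears directly under rolling_stones when filesonlyatleaflevel is false, matching the listings above.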


Virtual File System Views in the Schema File

The fsViews section of the schema file determines the virtual file system views that users can browse using WebDAV. See The Metadata Schema for more information about the schema file.

Note the following for fsViews in the schema file:



Note - All attributes in the system namespace are read only. If you include a system attribute in an fsView entry, that entire entry is automatically read only.



Summary of Metadata Schema Elements

TABLE 6-11 summarizes the purpose and meaning of the fields you must specify and plan for when configuring the metadata schema.


TABLE 6-11 Metadata Schema Fields

| Element | Purpose | For More Information |
|---|---|---|
| Metadata attribute | Describes something about an object. For example, in a patient record, the metadata attribute doctor might specify the patient’s doctor’s name. The metadata attribute insurance might specify the patient’s insurance company. | Metadata |
| Namespace | Organizes metadata names into collections of names, similar to directories. | Namespaces |
| Table | Uses rows and columns to group metadata attributes that commonly occur together into a single group. | Tables and Columns |
| Index | Mechanism that enables the system to query metadata fields. Each virtual file system view created becomes an index. You can use virtual file system views to control the content of the indexes the system creates and maximize query performance. | Indexes |
| Virtual File System Views | Allow you to view files using WebDAV in a hierarchical structure that mimics a file system. Each view you create also becomes an index, so even if you do not plan on using WebDAV to browse files, you will want to create views in order to specify indexes to maximize query performance. | Virtual File System Views |



Configuring the Metadata Schema Using the CLI



Note - Before configuring the metadata schema, make sure that the query database is online by issuing the sysstat command and checking that the “Query Engine Status” indicates HAFaultTolerant. See sysstat for more information about the sysstat command.
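
The check described in the preceding note might be scripted as follows. This is a minimal sketch, not a supported tool: the function name query_engine_ready and the ADMIN_HOST variable are hypothetical, and the sketch assumes only that sysstat prints the text Query Engine Status and the status value on the same output line.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if sysstat reports the query engine
# as HAFaultTolerant. ADMIN_HOST is a placeholder for your admin login.
ADMIN_HOST=${ADMIN_HOST:-admin@admin_IP}

query_engine_ready() {
    # Assumption: the label and the status value share one output line.
    ssh "$ADMIN_HOST" sysstat < /dev/null |
        grep 'Query Engine Status' |
        grep -q 'HAFaultTolerant'
}
```

The function returns a zero exit status when the status is HAFaultTolerant, so it can guard a later command, for example: query_engine_ready && echo "Query database is online".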



To Modify the Schema File Using the CLI

1. Create a schema overlay to extend an existing schema.

A schema overlay is an XML file that follows the specification shown in Schema File DTD. It contains only the new namespaces and fields that you want to add.

Optionally, run mdconfig with the -t or --template option to obtain an XML template file that you can use as a starting point for the overlay.

Once a version of the overlay is available, you can validate it through the CLI. Validation confirms that the XML syntax is correct and provides an overview of the operation that will be performed if the overlay is applied.

2. To perform a validation on the overlay.xml file, use the command mdconfig followed by the -p or --parse option.



Note - You can use ssh to log in to the 5800 system and issue a CLI command such as mdconfig in a single step by typing the ssh command and the CLI command on the same line. This method is useful in this procedure and is shown in the following examples. You might have to enter the administrative password for the 5800 system before the command takes effect.


For example, to validate the local overlay.xml file, type the following command from the system on the network where the overlay.xml file is stored:

$ cat overlay.xml | ssh admin@admin_IP mdconfig --parse

Once you are satisfied with the overlay, you must commit it so the 5800 system can execute it.

3. To commit the overlay.xml file, use the command mdconfig followed by the -a or --apply option.

Continuing the previous example, type the following command from the system on the network where the overlay.xml file is stored:

$ cat overlay.xml | ssh admin@admin_IP mdconfig --apply


Note - The --apply option runs a validation before performing the commit operation. If the XML syntax is not correct, the system returns an error.


If the system is under heavy load, you might see the following error message when you issue the mdconfig --apply command:

Timed out waiting for the state machine.

This message indicates that the new schema definition file has been committed to the system, but not all of the tables have been created.

In this case, reduce the load on the system if possible, and use the command mdconfig --retry to finish the table creation:

$ ssh admin@admin_IP mdconfig --retry

When you issue the mdconfig --retry command, the system finishes creating any tables that were not completed during the mdconfig --apply operation. Tables that had already been created are not affected. You might have to issue the mdconfig --retry command several times before all tables are created.
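
The validate, apply, and retry sequence in this procedure might be wrapped in a script along the following lines. This is a sketch under stated assumptions, not a supported tool. The function name apply_overlay and the variables ADMIN_HOST, MAX_RETRIES, and RETRY_DELAY are hypothetical. The sketch assumes that mdconfig signals a failed validation through its exit status, and that mdconfig --retry prints the same timeout message while tables remain incomplete. Only the mdconfig options described in this procedure are used.

```shell
#!/bin/sh
# Hypothetical wrapper around the commands shown in this procedure.
ADMIN_HOST=${ADMIN_HOST:-admin@admin_IP}   # placeholder admin login
MAX_RETRIES=${MAX_RETRIES:-5}              # assumed limit, not a system value
RETRY_DELAY=${RETRY_DELAY:-60}             # assumed pause in seconds
TIMEOUT_MSG='Timed out waiting for the state machine'

apply_overlay() {
    overlay=$1

    # Step 2: validate the overlay
    # (same effect as: cat overlay.xml | ssh ... mdconfig --parse).
    # Assumption: a failed validation yields a nonzero exit status.
    ssh "$ADMIN_HOST" mdconfig --parse < "$overlay" || return 1

    # Step 3: commit the overlay and keep the output for inspection.
    output=$(ssh "$ADMIN_HOST" mdconfig --apply < "$overlay" 2>&1)
    status=$?
    printf '%s\n' "$output"

    # Retry table creation while the timeout message is reported.
    # Assumption: --retry repeats the message if tables are still missing.
    attempt=0
    while printf '%s\n' "$output" | grep -q "$TIMEOUT_MSG"; do
        attempt=$((attempt + 1))
        if [ "$attempt" -gt "$MAX_RETRIES" ]; then
            echo "Tables still incomplete after $MAX_RETRIES retries" >&2
            return 1
        fi
        sleep "$RETRY_DELAY"
        output=$(ssh "$ADMIN_HOST" mdconfig --retry < /dev/null 2>&1)
        status=$?
        printf '%s\n' "$output"
    done
    return "$status"
}
```

After the function is defined, it would be invoked as apply_overlay overlay.xml from the system where the overlay.xml file is stored. Capping the number of retries keeps the script from looping indefinitely on a heavily loaded system; reduce the load first, as described above, rather than raising the limit.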


Configuring the Metadata Schema Using the GUI

This section includes procedures for using the GUI to display the current metadata schema and make changes to the schema.


To Display the Current Metadata Schema

From the navigation panel, choose Configuration > Metadata Schema > View Schema.

The View Schema panel is displayed, listing the namespaces and tables that are configured in the schema.


To Display the Fields in a Namespace

1. From the navigation panel, choose Configuration > Metadata Schema > View Schema.

The View Schema panel is displayed, listing the namespaces and tables that are configured in the schema.

2. In the Namespaces section, select the namespace for which you want to display fields.

The fields are listed in the Fields for Selected Namespace section.


To Display the Fields in a Table

1. From the navigation panel, choose Configuration > Metadata Schema > View Schema.

The View Schema panel is displayed, listing the namespaces and tables that are configured in the schema.

2. In the Tables section, select the table for which you want to display fields.

The fields are listed in the Columns for Selected Table section.


To Change the Metadata Schema



Note - Before making changes to the metadata schema, make sure that the query database is online by checking that the Status At A Glance panel indicates that “Query Engine Status” is HAFaultTolerant.


1. From the navigation panel, choose Configuration > Metadata Schema.

2. Click Set Up Schema.

The Set Up Schema panel is displayed.

3. Create namespaces as described in Creating Namespaces.

For information about namespaces, see Namespaces.

4. Create tables as described in Creating Tables.

For information about planning tables, see Planning Tables.

5. Click Apply.

Creating Namespaces

You cannot delete a namespace from the schema. Once a namespace is created, you can only add fields to the namespace, assuming the namespace is extensible. Therefore, review the following information before creating namespaces and namespace fields:


To Create Namespaces

1. From the navigation panel, choose Configuration > Metadata Schema.

2. Click Set Up Schema.

The Set Up Schema panel is displayed.


3. Click the Add button next to the Namespaces box.

The Add Namespace panel is displayed.

4. Type the namespace name.

5. Choose the parent namespace from the Parent Namespace drop-down menu.



Note - Choosing root as the parent namespace, selecting the Is Extensible check box, and applying your changes causes this namespace to become a parent namespace.


6. Define whether the namespace will be writable and/or extensible by clearing or selecting the appropriate check boxes.


7. Click the Add button next to the Fields box.

Columns are displayed in the Fields box.

8. Specify the following:

9. Click OK.

The Add Namespace panel is closed, and the newly created namespace and its fields are displayed on the Set Up Schema panel.

10. Create tables for the fields in the namespace as described in Creating Tables.

11. Click Apply.

Creating Tables

You cannot delete a table from the schema. Therefore, review the following information before creating tables:


To Create Tables

1. From the navigation panel, go to Configuration > Metadata Schema.

2. Click Set Up Schema.

The Set Up Schema panel is displayed.

3. Create namespaces as described in Creating Namespaces.


4. Click the Add button next to the Tables box.

The Create File System Table panel is displayed.

5. Type the table name.

For information on planning tables, see Planning Tables.

6. Choose the namespace that contains the fields you want to include in the table.

The available fields from the namespace are displayed in the Available Fields box.


7. Select the fields that you want included in the table and click the Move Right button to move the fields to the Selected Fields box.

8. Repeat Steps 6 and 7 for all fields that you want to include in the table.

9. Click OK.

The Create File System Table panel is closed, and the newly created table is displayed on the Set Up Schema panel.

10. Click Apply.


To Add Fields to an Existing Namespace



Note - You can add fields to existing namespaces only if they are extensible.


1. From the navigation panel, go to Configuration > Metadata Schema.

2. Click Set Up Schema.

The Set Up Schema panel is displayed.

3. Make sure the Show New/Modified Namespaces Only check box is not selected, so that all existing namespaces are displayed on the panel.

4. Select the namespace to which you want to add fields.

The namespace fields are displayed in the Fields for Selected Namespace box.


5. Click the Add button next to the Fields for Selected Namespace box.

The Add Namespace Fields panel is displayed.

6. Specify the following for this field:


7. If you want to add another new field, click the Add button and repeat Steps 5 and 6.

8. Click OK.

The panel is closed, and you are returned to the Set Up Schema panel.

9. Click Apply.


Configuring Virtual File System Views Using the GUI

This section includes procedures for displaying the currently configured virtual file system views, creating new views, and browsing the views.


To Display the Current Virtual File System Views

From the navigation panel, choose Configuration > Virtual File Systems > View Virtual File Systems.

The View Virtual File Systems Views panel is displayed, listing the views that are defined in the system.


To Display the Fields in a View

1. From the navigation panel, choose Configuration > Virtual File Systems > View Virtual File Systems.

The View Virtual File Systems Views panel is displayed, listing the views that are defined in the system.

2. In the Views section, select the view for which you want to display fields.

The fields are listed in the Fields for Selected View section.


To Create New Virtual File System Views

1. From the navigation panel, go to Configuration > Virtual File Systems.

2. Click Set Up Virtual File Systems.

The Set Up Virtual File Systems panel is displayed.

3. Type the view name.

4. If you do not want users who browse this view to be able to add or delete objects, select the Read-Only check box.

5. If you want users who browse this view to see only files for which there are attributes at every level in the hierarchy, select the Files Only at Leaf Level check box.

If you want users to see files at higher levels in the hierarchy when there are no attributes at the lower levels, do not select this check box. See Directory Structure in a Virtual File System View for more information.

6. If you want to include the filesystem.mimetype and filesystem.mtime attributes for each file as part of the virtual view, select the Include Extended File System Fields check box.

See Including Additional File Attributes in Virtual Views for more information.


7. In the Available Fields box, select the fields that you want in the view and click the Move Right button to move the fields to the Selected Fields box.

Note - Fields you select will appear in the virtual view as directories and subdirectories, with the first field selected as the top-level directory and successive fields as subdirectories, in the order you select the fields.


8. In the File Naming Convention For View section, choose a field from the Selected Fields drop-down menu and click Add To Pattern.

The fields you select are displayed in the Name Pattern field. This pattern specifies what the names of the objects will be in the virtual view.

9. Click Apply.

For example, you might set up a virtual file system view named Songs, as shown in FIGURE 6-1. Users connecting to the 5800 system using WebDAV would see a virtual file system view that displays song files in a hierarchy with album as the main folder, and artist and title as subfolders.

 

FIGURE 6-1 Virtual File System View Configuration Example


Figure shows Set Up Virtual File System View screen, with file system view configured.


To Preview Virtual File System Views

1. From the navigation panel, choose Configuration > Virtual File Systems.

2. Click Browse Virtual File Systems.

The virtual file systems configured on your system are displayed, as a user accessing the system using WebDAV would see them.