12 Managing a File Store System

This chapter describes how to configure the Oracle WebCenter Content Server file store system and use the File Store Provider.

Note:

Oracle supports the Sun Storage Archive Manager (SAM-QFS) with the WORM option as an alternative to the Content Server standard file store system. For details, see Section 12.5.

Once you have configured Content Server to use SAM-QFS, you cannot revert to using the standard file store system.

12.1 Introduction to the File Store System

With the release of version 11gR1, Content Server implemented a file store system for data management, replacing the traditional file system for storing and organizing content. The FileStoreProvider component exposes the file store functionality in the Content Server interface and allows additional configuration options. For example, you can configure the Content Server instance to use binary large object (BLOB) data types to store content in a database, instead of using a file system. This functionality offers several advantages:

  • Integrates repository management with database management for consistent backup and monitoring processes.

  • Helps overcome limitations associated with directory structure and number of files per directory in a file system approach.

  • Aids in distributing content more easily across systems, for better scaling of Content Server.

  • Allows for different types of storage devices not commonly associated with a file system, for example, content-addressed storage systems and write-once devices required for some business uses.

When the database must manage a very large volume of content, one way to keep it performing well is to use database partitions for your content repository. This requires careful planning. For more information, see the "Partitioned Repository for WebCenter Content using Oracle Database 11g" blog.

Caution:

The FileStoreProvider component is installed, enabled, and upgraded by default during Content Server deployment. It should not be uninstalled or disabled after the default file store is upgraded. For more information on upgrading, see Section 12.2.

If you have an earlier version of Content Server software where you have not yet upgraded the default file store, you can disable the component following the procedure in Section 9.2.2.

12.1.1 Data Management

Content Server manages content by tracking the storage of electronic files and their associated metadata. It provides the ability for users to store and access their checked in files, any associated information, and any associated renditions. This section discusses the data management methods historically used by Content Server and how they are addressed with the FileStoreProvider component.

12.1.1.1 File Management

The first half of data management is storing electronic files checked in to a Content Server repository. With Content Server, file storage has typically been done with a traditional file system, storing electronic files in a hierarchical directory structure that includes vault and weblayout directories. By using the revision information specified by the content type, security group, and account (if used), files and their associated renditions are placed into particular directories within the vault and weblayout directories. For example, the primary and alternate files specified at check in are stored in subdirectories in the vault directory. The specific file location is defined to be the following:

IntradocDir/vault/dDocType/account/dID.dExtension

In this path name, dDocType is the content type chosen by the user on check in, dID is the unique system-generated identification that identifies this revision, and dExtension is the extension of the file checked in. In this hierarchical model, the system uses the dDocType metadata field to distribute the files within the hierarchy established in the vault directory. Similarly, any web rendition is distributed across the hierarchy within the IntradocDir/weblayout/groups/ directory. The web rendition is the file served out of a web server, and in the historical file system storage method, could be the native file, the alternate file, or a web-viewable file generated by Inbound Refinery or some other conversion application.
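The legacy vault layout described above can be sketched in a few lines of Python. This is only an illustration of the path pattern IntradocDir/vault/dDocType/account/dID.dExtension; the function name, its parameters, and the sample values are hypothetical and are not part of Content Server.

```python
from pathlib import PurePosixPath


def legacy_vault_path(intradoc_dir, doc_type, account, doc_id, extension):
    """Build IntradocDir/vault/dDocType/account/dID.dExtension.

    The account directory is included only when an account is used.
    """
    path = PurePosixPath(intradoc_dir) / "vault" / doc_type
    if account:
        path = path / account
    return str(path / f"{doc_id}.{extension}")


# Hypothetical sample values, for illustration only.
print(legacy_vault_path("/u01/cs", "adacct", "finance", 12345, "pdf"))
print(legacy_vault_path("/u01/cs", "adacct", "", 12345, "pdf"))
```

Because every file with the same content type and account lands in the same directory, the sketch also shows why directories can become saturated in this model.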

This straightforward determination of file storage location is helpful to component and feature writers, helping them understand where files are located and how to manipulate them. However, it also has the effect of limiting storage management. Without careful management of the location metadata, directories can become saturated, causing the system to slow down.

12.1.1.2 Metadata Management

The second half of data management is storing metadata associated with an electronic file. With Content Server, metadata management has typically been done using a relational database, primarily involving three database tables. Metadata enables users to catalogue content and provides a means for creating file descriptors to facilitate finding it within the Content Server repository. For users, the retrieval is done by Content Server, and how and where the file is stored can be completely hidden. For component and feature writers, who may need to generate or manipulate files, the metadata provides a robust means of access.

12.1.1.3 File Stores

The traditional file system model historically used by Content Server limits scalability. As data management needs grow, adding extra storage devices to increase storage space is not conducive to easy file sharing through a web-based interface. Complex, nested file structures can slow performance. It can also be difficult to suppress the creation of a duplicate web-viewable file when the native file format is already usable on the web. To handle large systems, for example those with over 100 million content items, Content Server has shifted to using a file store. This offers the advantages of scalability, flexibility, and manageability.

12.1.2 File Store Provider Features

The FileStoreProvider component enables you to define data-driven rules to store and access content managed by Content Server. File Store Provider offers the following features:

  • The ability to relocate files easily

  • The ability to have the web-viewable file be optional

  • The ability to manage and control directory saturation

  • The ability to integrate with third-party storage devices

  • An API to use, extend, and enhance different storage paradigms

With File Store Provider, checked-in content and associated metadata are examined and assigned a storage rule based on criteria established by a system administrator. Criteria can include metadata, profiles, or other considerations. The storage rule determines how vault and web files are stored by Content Server and how they are accessed by a web server.

12.2 About the File Store Provider Upgrade

The FileStoreProvider component is installed, enabled, and upgraded by default for a Content Server instance with no documents in it. The upgrade includes creation of metadata fields with default values for the file store system (DefaultFileStore). If an existing Content Server instance with documents in it and no File Store Provider is upgraded to a newer version of Content Server, the File Store Provider upgrade is not automatically performed.

If you do not want to upgrade File Store Provider from your current settings, prior to upgrade installation you must add the configuration variable FsAutoConfigure=false in the Content Server config.cfg file.

Caution:

If you start the Content Server instance to set the variable in the Additional Configuration Variables field on the General Configuration page, then Content Server will automatically upgrade File Store Provider.

12.2.1 DefaultFileStore Settings

A Content Server instance containing no documents and with the FileStoreProvider component automatically upgraded uses these DefaultFileStore settings. The settings are single lines of code, not wrapped as shown here.

  • Vault Path:

    $#env.VaultDir$$dDocType$/$dDocAccount$/$dispersion$/$dID$$ExtensionSeparator$
    $dExtension$
    
  • Dispersion Rule:

    $dRevClassID[-9:-6:0:b]/$dRevClassID[-6:-3:0:b]
    

    If encoding is not required, specify:

    $dRevClassID[-9:-6:0]/$dRevClassID[-6:-3:0]
    
  • Web-viewable Path:

    $#env.WeblayoutDir$groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/
    $dispersion$/$edisp$/$dDocName$$RenditionSpecifier$$RevisionLabel$
    $ExtensionSeparator$$dWebExtension$
    
  • Web URL File Path:

    $HttpWebRoot$groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/
    $dispersion$/$edisp$/$dDocName$$RenditionSpecifier$$RevisionLabel$
    $ExtensionSeparator$$dWebExtension$
    

The dispersion field is added in Path information for the storage rule and can be edited. The Web-viewable Path and Web URL File Path fields cannot be edited. The dispersion rule is added in web paths at the $dispersion$ location.

The dispersion rule allows you to specify :b for base 64 encoding of that part of the URL. For example, the following dispersion rule encodes both parts in base 64:

$dRevClassID[-9:-6:0:b]/$dRevClassID[-6:-3:0:b]
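The following Python sketch shows one possible reading of the unencoded rule $dRevClassID[-9:-6:0]/$dRevClassID[-6:-3:0]. It assumes that the bracket notation means "take the characters between two offsets counted from the end of the ID, left-padding with the given character when the ID is too short". That interpretation, the function names, and the sample IDs are assumptions made for illustration; the :b encoding flag is deliberately not modeled.

```python
def dispersion_segment(rev_class_id: int, start: int, end: int, pad: str = "0") -> str:
    """Return one directory name for an assumed $dRevClassID[start:end:pad] part.

    start and end are negative offsets counted from the end of the ID string.
    """
    # Left-pad so that the requested slice always exists.
    text = str(rev_class_id).rjust(-start, pad)
    return text[start:end]


def dispersion(rev_class_id: int) -> str:
    """Assumed result of $dRevClassID[-9:-6:0]/$dRevClassID[-6:-3:0]."""
    return "/".join((
        dispersion_segment(rev_class_id, -9, -6),
        dispersion_segment(rev_class_id, -6, -3),
    ))


print(dispersion(1234567))  # 001/234
print(dispersion(1000))     # 000/001
```

Under this reading the last three digits of the ID never appear in the path, so at most 1,000 consecutive revision classes share one directory.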

A Content Server instance that contains documents and does not have the FileStoreProvider component upgraded returns an informational message that the Revisions table is not empty, and that dispersion for the default storage rule is therefore not set for DefaultFileStore.

12.2.2 Empty Storage Rule

A checked-in document can lack an associated storage rule (no xStorageRule field) in two situations: a site used an earlier version of Content Server without the file store system and then upgraded and implemented the FileStoreProvider component, or a site uninstalled the FileStoreProvider component completely and also removed the metadata fields added by File Store Provider. When File Store Provider is implemented after either situation, documents checked in before the implementation have an empty xStorageRule field. To fix this, users must perform an Update to the Content Information for those documents. The documents are then updated to the default value of the xStorageRule field and moved to the location specified by the storage rule. For details on xStorageRule, see Section 12.3.2.2.

12.3 Managing the File Store Provider

A file store for data management is used in Content Server instead of the traditional file system for storing and organizing content. The FileStoreProvider component is installed and enabled by default during Content Server deployment. The FileStoreProvider component automatically upgrades the default file store (DefaultFileStore) to make use of functionality exposed by the component, including modifying the web, vault, and web URL path expressions.

Note:

Partitions are not required to run Content Server, but any attempt to check in content before creating a partition, changing the vault path root, or creating a new, well-formed storage rule will fail. For more information, see Section 12.3.1, including the sections on storage rules and path construction.

Note:

Oracle WebLogic Server does not support configuring its web server for the Content Server instance to add a new virtual directory and alias to point to the weblayout directory for each partition that is created. Partitions can be used for the vault files, and partitions are supported for web files, but the partition root must exist under the default vault and weblayout directories.

Caution:

Resource files should not be edited directly. Proper modification of resource files should be done within the Content Server administration interface or through additional component development. For more information on component development, see Chapter 9.

Three resource tables are used to define and handle file paths. The defaults for the PathMetaData Table and PathConstruction Table cover most scenarios. The StorageRules Table stores the values specified when a storage rule is defined. These three tables are provider-specific, and as such are defined in the provider.hda file of the defaultfilestore/ directory. The defaultfilestore/ directory is located in the IntradocDir/data/providers/ directory. A fourth table, the FileSystemFileStoreAlgorithmFilters Table, requires a component along with Java code to modify.

12.3.1 Understanding File Store Provider Storage Principles

When a content item is checked in to Content Server, it consists of metadata, a primary file selected by the user, and potentially an alternate file. The alternate file may also be selected and checked in by the user, and is presumed to be a web-viewable file. In a file system approach to Content Server, the primary file is stored in the vault directory at the root of the DomainHome/ directory and is called the native file. If an alternate file is checked in, it is also stored in the vault, but is copied to the weblayout directory or passed to a conversion application, such as Inbound Refinery. If no alternate file is checked in, then the native file is copied from the vault directory to the weblayout directory, so that it exists in two places. If no alternate file is checked in and Inbound Refinery is installed, a rendition of the native file could be created and stored in the weblayout directory.

In a file system approach to Content Server, storing content in specified directories defines a path to the content. You can access content from a browser by using a static web URL file path, when you know the content is in a specific location, or using a dynamic Content Server service request, such as GET_FILE, when you do not. With File Store Provider, content may or may not be stored in a file system. Consequently, a new approach to defining paths to the content must be taken.

Depending on how you set up File Store Provider, you may or may not have a static web URL. By using a dynamic Content Server service request, you can access content when you do not know the specific location. With File Store Provider, the static web URL is defined as the web URL file, and the dynamic access is simply called the web URL. Using the File Store Provider interface, you can configure only the static web URL file path. However, you can decide to have the static web URL done as a Content Server service request, essentially making it dynamic.

12.3.1.1 Using Storage Rules on Renditions to Determine Storage Class

When content is checked in, all versions of the content managed by Content Server are considered renditions. These renditions include the native file, web-viewable file, and any other files that may have been rendered by Inbound Refinery or third-party conversion applications.

Renditions are grouped together into a storage class, which determines where and how a rendition is accessed. Storage classes are in turn grouped together into a storage rule, which defines the vault, web, and web URL path expressions for its storage classes. Additionally, a storage rule determines whether a rendition is not stored at all, as in a web-less file store, or whether it is stored on a different device, such as a database rather than a file system.

The following examples illustrate how storage rules can determine where and how different content items can be stored.

Example 1

A storage rule is defined as File system only on the Storage Rule Name dialog and Is Webless File Store is not selected. In this scenario, the system makes a copy of the primary files and places them in the weblayout directory.

This traditional file system storage example typically offers the advantage of faster access time to content when compared with database storage. This advantage diminishes if the file system hierarchy is complex or becomes saturated, or as the quantity of content items increases.

Example 2

A storage rule is defined as File system only on the Storage Rule Name dialog and Is Webless File Store is selected. In this scenario, no copy is made of the primary files and so the native files are the only renditions. Requests for web-viewable files are routed to the native files stored in the vault.

Note:

The web-less option of File Store Provider can specify that no web rendition be created. When this is used in conjunction with Inbound Refinery, a web rendition is always created and stored in either the file system or the database, depending on the storage rule in effect.

This traditional file system storage example, like the previous one, offers the advantage of faster access time to content. It also saves on storage space by not copying a version of the content from the vault directory to the weblayout directory. Instead, it redirects web-viewable access to the content in the vault directory. This is useful if most of the native files checked in are in a web-viewable format, or if Content Server is being used to manage content that is not required to be viewed in a browser.

Example 3

A storage rule is defined as JDBC Storage on the Storage Rule Name dialog and no selection is made from the Renditions choice list. In this scenario, both the vault and web files are stored in the database.

This database storage example offers the advantage of integrating repository management with database management for consistent backup and monitoring processes, and helps overcome limitations associated with directory structure and number of files per directory in a file system approach.

Important:

When necessary, content items stored in a database can be forced onto the file system, for example, during indexing or conversion. The files on the file system are treated as temporary cache and deleted following the parameters specified in the config.cfg file located in the IntradocDir/config/ directory. For more information on the parameters used, see Section 12.3.3.7.

Example 4

A storage rule is defined as JDBC Storage on the Storage Rule Name dialog and Web Files is selected from the Renditions choice list. In this scenario, the vault files are stored in the database and the web files are permanently stored on the file system.

This mixed approach of storing native files in a database but web-viewable files on a file system offers the advantages of database storage in the previous example (integrated backup and monitoring, overcoming file system limitations) for the native files, while providing speedy web access to web-viewable renditions. Like the first example, this advantage can be diminished if the file system structure is overly complex, or the quantity of files is extreme.
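The four examples above can be summarized as a small decision function. The Python sketch below is a hypothetical model, not Content Server code: the StorageRule class and its field names are invented for illustration, and only the storage type names FileStorage and JdbcStorage are taken from the StorageRules table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageRule:
    storage_type: str                       # "FileStorage" or "JdbcStorage"
    is_webless: bool = False                # "Is Webless File Store" selected
    web_files_on_file_system: bool = False  # "Web Files" chosen under Renditions


def rendition_locations(rule: StorageRule) -> dict:
    """Return where the vault and web renditions live for a storage rule."""
    if rule.storage_type == "FileStorage":
        vault = "file system"
        # Web-less: no copy is made; web requests are routed to the vault file.
        web = "served from vault file" if rule.is_webless else "file system"
    elif rule.storage_type == "JdbcStorage":
        vault = "database"
        web = "file system" if rule.web_files_on_file_system else "database"
    else:
        raise ValueError(f"unknown storage type: {rule.storage_type}")
    return {"vault": vault, "web": web}


# Examples 1 to 4, in order.
print(rendition_locations(StorageRule("FileStorage")))
print(rendition_locations(StorageRule("FileStorage", is_webless=True)))
print(rendition_locations(StorageRule("JdbcStorage")))
print(rendition_locations(StorageRule("JdbcStorage", web_files_on_file_system=True)))
```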

12.3.1.2 Understanding Path Construction and URL Parsing

The path to content stored in Content Server is defined in the PathExpression column of the PathConstruction Table. Paths are made up of pieces, with each piece separated by a slash (/). Each piece can be made of a static string or a sequence of dynamic parts. A dynamic part is enclosed in dollar signs ($). A part can be calculated using an algorithm, Idoc Script variable, environment variable, or a metadata lookup, and can have the following interpretations:

  • It can be a field defined in the PathMetaData table. If it is defined in the PathMetaData table, it can be mapped to an algorithm, for example:

    $dDocType$
    
  • If it has the prefix #env., it is an environment variable, for example:

    $#env.VaultDir$ or $#env.WeblayoutDir$
    
  • It can be an Idoc Script variable, such as $HttpWebRoot$. For example, the standard vault location is defined as follows:

    $PartitionRoot$/vault/$dDocType$/$dDocAccount$/$dID$$ExtensionSeparator$
    $dExtension$
    

When parsed, the path expression turns into five pieces, interpreted according to the rules specified in the PathMetaData table, as follows:

  • $PartitionRoot$: mapped to the partitionSelection algorithm and uses the xPartitionId as a lookup into the PartitionList table to determine the partition root

  • /vault/: a string, so no calculation or substitution

  • $dDocType$: according to the PathMetaData table, this is a lookup in the file parameters

  • $dDocAccount$: this is mapped to a documentAccount algorithm which takes dDocAccount and parses it into the standard Content Server account presentation with all the appropriate delimiters

  • $dID$$ExtensionSeparator$$dExtension$: this piece has three parts:

    • $dID$: similar to dDocType, this is defined in the file parameters and is a required field

    • $ExtensionSeparator$: determined by an algorithm and by default it returns '.'

    • $dExtension$: similar to dDocType
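The breakdown above can be reproduced with a short Python sketch that splits a path expression into pieces and parts and then substitutes values. It is a simplified model: a plain dictionary stands in for the algorithms and lookups defined in the PathMetaData table, and the function names and sample values are hypothetical.

```python
import re

# A token is either a dynamic part such as $dID$ or a run of static text.
TOKEN = re.compile(r"\$[^$]+\$|[^$]+")


def parse_expression(expression: str) -> list:
    """Split a path expression into pieces, and each piece into its parts."""
    return [TOKEN.findall(piece) for piece in expression.split("/")]


def resolve_expression(expression: str, values: dict) -> str:
    """Substitute each dynamic part with its value; keep static text as is."""
    pieces = []
    for parts in parse_expression(expression):
        pieces.append("".join(
            values[p[1:-1]] if p.startswith("$") else p for p in parts
        ))
    return "/".join(pieces)


VAULT = "$PartitionRoot$/vault/$dDocType$/$dDocAccount$/$dID$$ExtensionSeparator$$dExtension$"

# Hypothetical values; in Content Server most of these come from algorithms.
sample = {
    "PartitionRoot": "/mnt/part1",
    "dDocType": "adacct",
    "dDocAccount": "finance",
    "dID": "12345",
    "ExtensionSeparator": ".",
    "dExtension": "pdf",
}

print(len(parse_expression(VAULT)))       # 5
print(resolve_expression(VAULT, sample))  # /mnt/part1/vault/adacct/finance/12345.pdf
```

The last piece parses into three parts, which matches the description of $dID$$ExtensionSeparator$$dExtension$ in the list above.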

In the standard configuration for the web-viewable path, the URL contains variables to add the partition root to the web-viewable path, security, dDocType, and dispersion information, as well as the dDocName, rendition, and extension information. FsWeblayoutDir denotes $#env.WeblayoutDir$ by default.

$FsWeblayoutDir$groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/$dispersion$/~edisp/$dDocName$$RenditionSpecifier$$RevisionLabel$$ExtensionSeparator$$dWebExtension$

In the standard configuration for the web URL file path, the URL contains variables to add the partition root to the web-viewable path, security, dDocType, and dispersion information, as well as the dDocName, rendition, and extension information. FsHttpWebRoot denotes $HttpWebRoot$ by default.

$FsHttpWebRoot$groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/$dispersion$/~edisp/$dDocName$$RenditionSpecifier$$RevisionLabel$$ExtensionSeparator$$dWebExtension$

The groups separator indicates to Content Server that the directories that follow are the name of the security group and account to which the content item belongs. Accounts are optional and consequently computed by an algorithm. After the security information is the documents separator, which is immediately followed by the dDocType. Dispersion is optional. The last part of the URL is the dDocName, its rendition and revision information, and its format extension.

Because the URL is expected in this format, Content Server can successfully extract metadata from it. More importantly, it can determine the security information for the content item and derive the access privileges for a particular user.

The parsing guidelines have been expanded to allow for dispersion in the web directory. When $dRevClass$ is encountered, the system processes the dispersion information, then continues with dDocName and dWebExtension as before. This means that the system can now successfully parse URLs of the form:

../groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/$dispersion$/~edisp/$dDocName$$RenditionSpecifier$$RevisionLabel$$ExtensionSeparator$$dWebExtension$
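As a rough illustration of how metadata can be recovered from a URL of this form, the Python sketch below locates the groups and documents separators and reads the segments around them. It is a simplified assumption of the parsing rules, not the Content Server parser: the function name and the sample URLs are hypothetical, account segments are returned as they appear in the URL, and the file name is split only into a stem and an extension (the rendition specifier and revision label are left inside the stem).

```python
def parse_web_url(url: str) -> dict:
    """Extract security and document metadata from a weblayout-style URL."""
    segments = [s for s in url.split("/") if s and s != ".."]
    groups_at = segments.index("groups")
    documents_at = segments.index("documents", groups_at + 1)

    security = segments[groups_at + 1:documents_at]  # group, then optional account
    tail = segments[documents_at + 1:]               # type, dispersion, file name
    if not security or len(tail) < 2:
        raise ValueError(f"unexpected URL layout: {url}")

    stem, sep, extension = tail[-1].rpartition(".")
    if not sep:  # file name without an extension
        stem, extension = tail[-1], ""

    return {
        "dSecurityGroup": security[0],
        "dDocAccount": "/".join(security[1:]),
        "dDocType": tail[0],
        "dispersion": [s for s in tail[1:-1] if s != "~edisp"],
        "fileStem": stem,
        "dWebExtension": extension,
    }


info = parse_web_url("/cs/groups/public/finance/documents/adacct/001/234/~edisp/report_q1.pdf")
print(info["dSecurityGroup"], info["dDocAccount"], info["dDocType"])
```

Because the security group and account always sit between the two separators, they can be read before any other part of the URL is interpreted, which is what makes the privilege check possible.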

12.3.2 About File Store Provider Modifications to Content Server

The FileStoreProvider component makes several modifications to the Content Server database, Content Server metadata fields, and other configuration files, allowing for possible configuration options.

12.3.2.1 Database Options

In some situations, content stored in a database may have to be forced onto a file system. One example would be when Oracle WebCenter Content: Inbound Refinery must have access to a file for conversion. Files forced onto a file system are considered temporary cache. The following configuration values are used to control when the temporarily cached files are to be cleaned up. Note that the system only cleans up files that have an entry in the FileCache Table.

| Variable | Description |
| --- | --- |
| FsCacheThreshold | Specifies the maximum cache size, in megabytes. The default is 100. When the threshold is met, the Content Server instance starts deleting files that are older than the minimum age, as specified by the FsMinimumFileCacheAge parameter. |
| FsCleanUpCacheDuringIndexing | Specifies if the cache will be cleaned during the indexing cycle. The default is false. |
| FsCleanUpCacheIndexingMax | Specifies the number of cache files to delete in each indexing cycle, which limits the load on the cycle. The default is to delete all eligible cache files for the indexing cycle. |
| FsMaximumFileCacheAge | Specifies the maximum age at which files are cached, expressed in days. The default is 365. |
| FsMinimumFileCacheAge | Specifies the minimum age at which cached files can be deleted, expressed in days. The default is 1. This parameter is used in conjunction with the FsCacheThreshold parameter to determine when to delete cached files. |
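The interaction between the cache threshold and the two age limits can be sketched as follows. This Python model is an assumption based only on the descriptions above: it treats files older than the maximum age as always eligible for deletion, and files older than the minimum age as eligible once the total cache size reaches the threshold. The class, function, and parameter names are hypothetical, and the real cleanup logic may differ.

```python
from dataclasses import dataclass


@dataclass
class CachedFile:
    name: str
    size_mb: float
    age_days: float


def eligible_for_cleanup(files, cache_threshold_mb=100, min_age_days=1, max_age_days=365):
    """Return the names of cached files that this model would allow to be deleted.

    Defaults mirror FsCacheThreshold, FsMinimumFileCacheAge, and
    FsMaximumFileCacheAge.
    """
    over_threshold = sum(f.size_mb for f in files) >= cache_threshold_mb
    return [
        f.name
        for f in files
        if f.age_days > max_age_days
        or (over_threshold and f.age_days > min_age_days)
    ]


cache = [
    CachedFile("a.tmp", 60, 0.5),
    CachedFile("b.tmp", 50, 3),
    CachedFile("c.tmp", 1, 400),
]
print(eligible_for_cleanup(cache))                          # ['b.tmp', 'c.tmp']
print(eligible_for_cleanup(cache, cache_threshold_mb=200))  # ['c.tmp']
```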


12.3.2.2 Content Server Metadata Fields

File Store Provider adds several Content Server metadata fields and makes additional options available for use in configuration files.

12.3.2.2.1 Configuring Metadata Fields

File Store Provider adds three metadata fields to the Content Server instance:

  • xPartitionId: This metadata field is used in conjunction with the PartitionList table to determine the root location of the content item files. It is recommended that this field be hidden on the user interface, because the partition selection algorithm provides a value.

  • xWebFlag: This metadata field is used to determine whether a content item has a web-viewable file. Consequently, if the system has content items that have only vault files, then removing this metadata field causes the system to expect the presence of a web-viewable file and may cause harm to the system. The metadata field can be specified by the configuration value WebFlagColumn.

  • xStorageRule: This metadata field is used to track the rule that was used to determine how the file is to be stored. The metadata field may be specified by the configuration value StorageRuleField.

Note:

These metadata fields are added by File Store Provider on startup and if deleted are added again when the Content Server instance restarts. If the metadata fields must be permanently deleted, set the configuration variable FsAddExtraMetaFields=false in the intradoc.cfg file to disable the automatic creation of the fields. The intradoc.cfg file is located in the DomainHome/ucm/cs/bin/ directory.

12.3.2.2.2 Setting the Default Storage Directory

A StorageDir parameter can be set equal to a root directory, which is then used for all partitions where the PartitionRoot column value has not been specified. In this case, the storage directory and the partition name are used to create the PartitionRoot parameter. The StorageDir parameter is set in the intradoc.cfg file, located in the DomainHome/ucm/cs/bin/ directory.
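The fallback from PartitionRoot to StorageDir can be pictured with the Python sketch below. How Content Server combines the storage directory and the partition name is not stated here, so joining them as a subdirectory is an assumption, and the function name and sample paths are hypothetical.

```python
import posixpath


def effective_partition_root(partition_name, partition_root, storage_dir):
    """Return PartitionRoot if set; otherwise derive it from StorageDir.

    Assumption: the derived root is StorageDir/PartitionName.
    """
    if partition_root:
        return partition_root
    if not storage_dir:
        raise ValueError("neither PartitionRoot nor StorageDir is set")
    return posixpath.join(storage_dir, partition_name)


print(effective_partition_root("part1", "", "/u01/storage"))
print(effective_partition_root("part2", "/mnt/fast", "/u01/storage"))
```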

12.3.2.2.3 Standard File Store Variables

In the provider.hda file located in the IntradocDir/data/providers/defaultfilestore/ directory, the following parameters and classes are standard for a file system store:

ProviderType=FileStore
ProviderClass=intradoc.filestore.BaseFileStore
IsPrimaryFileStore=true
# Configuration information specific to a file system store provider.
ProviderConfig=intradoc.filestore.filesystem.FileSystemProviderConfig
EventImplementor=intradoc.filestore.filesystem.FileSystemEventImplementor
DescriptorImplementor=intradoc.filestore.filesystem.FileSystemDescriptorImplementor
AccessImplementor=intradoc.filestore.filesystem.FileSystemAccessImplementor

12.3.3 File Store Provider Resource Tables

12.3.3.1 PartitionList Table

The PartitionList table defines the partitions that are available for the partitionSelection algorithm. The table is defined in the fsconfig.hda file, located in the DomainHome/ucm/cs/data/filestore/config/ directory, and modified using the Add/Edit Partition page in the Content Server interface. The columns of the table are used as follows:

| Column | Description |
| --- | --- |
| PartitionName | Specifies the name of the partition. This name is referenced in the path expression. |
| PartitionRoot | An argument passed into the partitionSelection algorithm. |
| IsActive | Determines if the partition is currently active and accepts new files. |
| CapacityCheckInterval | Specifies the interval in seconds used in determining the available disk space. This may not work on all platforms. |
| SlackBytes | Determines if there is sufficient space on a partition to store content. If the available space is lower than the slack bytes, the partition is deactivated and no longer used for contribution. |
| DuplicationMethods | Specifies how native files are treated when not converted to a web-viewable rendition. copy (default): copies the native file to the web path. link: resolves the web path to the native file in the vault. Both copy and link rely on functionality of the operating system on which the Content Server instance is installed, so not all methods are available on all platforms. |
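The role of the IsActive and SlackBytes columns can be modeled with the short Python sketch below. It is a hypothetical illustration, not the partitionSelection algorithm: the Partition class and function names are invented, and the free space is passed in as a plain number so that the example does not depend on a real disk (on a live system, a value such as shutil.disk_usage(root).free could supply it).

```python
from dataclasses import dataclass


@dataclass
class Partition:
    name: str
    root: str
    is_active: bool
    slack_bytes: int


def refresh_active_flag(partition: Partition, free_bytes: int) -> bool:
    """Deactivate the partition when free space drops below SlackBytes."""
    if partition.is_active and free_bytes < partition.slack_bytes:
        partition.is_active = False
    return partition.is_active


def partitions_accepting_files(partitions, free_space):
    """Return the names of partitions that still accept new files."""
    return [p.name for p in partitions if refresh_active_flag(p, free_space[p.name])]


parts = [
    Partition("part1", "/mnt/part1", True, 1_000),
    Partition("part2", "/mnt/part2", True, 1_000),
    Partition("part3", "/mnt/part3", False, 1_000),
]
free = {"part1": 5_000, "part2": 500, "part3": 9_000}
print(partitions_accepting_files(parts, free))  # ['part1']
```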


12.3.3.2 StorageRules Table

The StorageRules table defines the rules used for storing content items. The rule specifies which path expression to use for which storage class, and how content items are to be stored.

The table is defined in the provider.hda file, located in the DomainHome/ucm/cs/data/providers/defaultfilestore/ directory, and it can be modified using the Storage Rule Name dialog in the Content Server interface. The columns of the table are used as follows:

| Column | Description |
| --- | --- |
| StorageRule | The name of the storage rule. Computed from a dynamic include and stored in the xStorageRule metadata field of a content item. |
| StorageType | Determines the storage implementation. FileStorage: files are stored on the file system. JdbcStorage: files are stored in the database. |
| IsWeblessStore | Specifies whether the system allows web-less files. true: By default, newly created content items do not have a web-viewable file. In certain circumstances it is necessary to insist on a web-viewable file. In such situations, an argument in the calling code can be used to specify that a web-viewable file must be created. Information regarding whether there is a web-viewable file is stored in the xWebFlag metadata field. false: By default, newly created content items do have a web-viewable file. |
| RenditionsOnFileSystem | Used by JdbcStorage to determine if any files are to be stored on the file system instead of the database. |


12.3.3.3 PathMetaData Table

The PathMetaData table defines what metadata is used to determine the location of a file. The metadata may come directly from a content item's metadata, or be calculated using an algorithm. The PathMetaData table is defined in the provider.hda file of the defaultfilestore/ directory. The defaultfilestore/ directory is located in the DomainHome/ucm/cs/data/providers/ directory.

The columns of the table are used as described in the following table.

| Column | Description |
| --- | --- |
| FieldName | Name of the field as it appears in the path expression. |
| GenerationAlgorithm | Specifies the algorithm used to resolve or compute the value for the field. |
| RequiredForStorage | Defines for which storage class the metadata is required. #all: both vault and web-viewable renditions require the metadata. web: only the web-viewable rendition requires the metadata. vault: only the native file rendition requires the metadata. The field is optional for all renditions not specified. Consequently, if this column is empty, then the metadata field is optional for all renditions or storage classes. If an algorithm has been specified, this value is empty. The algorithm uses the value specified in the ArgumentFields column to dictate which fields are required. |
| Arguments | Optional arguments passed into the algorithm specified in the GenerationAlgorithm field. |
| ArgumentFields | A comma-delimited list of fields required by the arguments defined in the Arguments column, and consequently required by the algorithm specified in the GenerationAlgorithm field. |
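The way the RequiredForStorage and ArgumentFields columns determine which metadata a rendition needs can be sketched in Python as follows. The row values below are illustrative only and are not the shipped defaults; the function names are hypothetical, and the storage class names web and vault are taken from the column description.

```python
def is_required(required_for_storage: str, storage_class: str) -> bool:
    """Interpret a RequiredForStorage value for one storage class."""
    value = required_for_storage.strip()
    if not value:  # empty column: the field is optional everywhere
        return False
    return value == "#all" or value == storage_class


def required_fields(rows, storage_class: str) -> set:
    """Collect the metadata fields needed to build a path for a storage class."""
    needed = set()
    for row in rows:
        if row.get("GenerationAlgorithm"):
            # With an algorithm, ArgumentFields dictates what is required.
            fields = row.get("ArgumentFields", "").split(",")
            needed.update(f.strip() for f in fields if f.strip())
        elif is_required(row.get("RequiredForStorage", ""), storage_class):
            needed.add(row["FieldName"])
    return needed


# Illustrative rows, not the default PathMetaData table.
rows = [
    {"FieldName": "dDocType", "GenerationAlgorithm": "",
     "RequiredForStorage": "#all", "ArgumentFields": ""},
    {"FieldName": "dWebExtension", "GenerationAlgorithm": "",
     "RequiredForStorage": "web", "ArgumentFields": ""},
    {"FieldName": "PartitionRoot", "GenerationAlgorithm": "partitionSelection",
     "RequiredForStorage": "", "ArgumentFields": "xPartitionId"},
]

print(sorted(required_fields(rows, "vault")))  # ['dDocType', 'xPartitionId']
print(sorted(required_fields(rows, "web")))    # ['dDocType', 'dWebExtension', 'xPartitionId']
```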


12.3.3.4 PathConstruction Table

The PathConstruction table maps a file to a path. The PathConstruction table is defined in the provider.hda file of the defaultfilestore/ directory. The defaultfilestore/ directory is located in the DomainHome/ucm/cs/data/providers/ directory. For more information, see Section 12.3.1.2.

Caution:

The defaults provided in the PathConstruction table should work for most scenarios. This resource file should not be edited directly. Proper modification should be done through additional component development. For more information on component development, see the chapter about components in Oracle Fusion Middleware Developing with Oracle WebCenter Content.

The columns of the PathConstruction table are defined in the following table.

Column Description

FileStore

Specifies the storage path that is being calculated.

web: Path to the web-viewable file.

vault: Path to the native file.

weburl: Generated by Content Server. Tends to be GET_FILE.

weburl.file: Nicely constructed URL used to access the web-viewable rendition in a browser.

dispersion: Variable for dispersion of content on the file system.

FsWeblayoutDir: Variable for the web-viewable path for the weblayout directory.

FsHttpWebRoot: Variable for the web URL file path for the weblayout directory.

PathExpression

Defines the path.

AutoCreateLimit

Specifies the depth of the directories that may be created.

StorageRule

Specifies to which storage rule this path construction belongs.


12.3.3.5 FileSystemFileStoreAlgorithmFilters Table

The FileSystemFileStoreAlgorithmFilters table is used to map an algorithm name to an implementation of the FilterImplementor interface. The algorithm can be referenced in the PathMetaData Table and is used to calculate the desired path field. The class implementing the algorithm must return the required metadata fields it uses for calculation, when the file parameters object is null. Through the ExecutionContext, the doFilter method is passed in information about the field, content item, and file store provider that initiated the call. In particular, for the file system provider, the algorithm will be passed the following information through the ExecutionContext. Bear in mind that other file store providers may choose to pass in more or possibly different information.

Properties fieldProperties = (Properties)
    context.getCachedObject("FieldProperties");
Parameters data = (Parameters)
    context.getCachedObject("FileParameters");
Map localData = (Map) context.getCachedObject("LocalProperties");
String algorithm = (String) context.getCachedObject("AlgorithmName");

The FileSystemFileStoreAlgorithmFilters table is part of File Store Provider and requires a component along with Java code to modify.

Caution:

The defaults provided in the FileSystemFileStoreAlgorithmFilters table should work for most scenarios. This resource file should not be edited directly. Proper modification should be done with Java code and through additional component development. For more information on component development, see Oracle Fusion Middleware Developing with Oracle WebCenter Content.

12.3.3.6 FileStorage Table

The FileStorage table is added to Content Server when File Store Provider is installed. It is used exclusively by the JdbcStorage storage type, when content is stored in a database. The FileStorage table contains the renditions of content items and uses the dID of the content item and rendition to uniquely identify what renditions belong to which content item.

12.3.3.7 FileCache Table

The FileCache table is added to Content Server when File Store Provider is installed. It is used exclusively by the JdbcStorage storage type to remember which renditions have been placed on a file system. Renditions stored in a database are placed on a file system when required for a specific event, for example indexing or conversion. These files are often temporary and deleted after a specified interval as part of a scheduled event.

12.3.4 Working with the File Store Provider

When the File Store Provider default file store is upgraded, checked-in content and associated metadata are examined and assigned a storage rule based on criteria established by the system administrator. Criteria can include metadata, profiles, or other considerations. The storage rule determines how vault and web files are stored and accessed by Content Server and how they are accessed by a web server. Files can be stored in a database or placed on one or more file systems or storage media. Partitions can be created to help manage storage location, but are not required.

Caution:

The FileStoreProvider component should not be disabled once it has been used with Content Server.

This section covers these topics:

12.3.4.1 Adding or Editing a Partition

You can create partitions to define additional root paths to files managed by Content Server but requiring storage in different locations or on different types of media. You create partitions using the Partition Listing page. When a new partition is created, Content Server modifies the PartitionList resource table in the fsconfig.hda file, located in the IntradocDir/data/filestore/config/ directory.

Note:

Oracle WebLogic Server does not support configuring its web server for the Content Server instance to add a new virtual directory and alias to point to the weblayout directory for each partition that is created. Partitions can be used for the vault files, and partitions are supported for web files.

To add a partition to the Content Server instance:

  1. Log in to the Content Server instance as system administrator.

  2. Choose Administration, then File Store Administration.

  3. If there are no partitions defined, click Add Partition. Otherwise, the Add/Edit Partition page opens.

  4. Enter a partition name. The name must be unique.

  5. Modify the partition root, duplication methods, and any other pertinent parameters.

  6. Ensure that Is Active is enabled.

  7. Click Update.

12.3.4.2 Editing the File Store Provider

You can edit File Store Provider at any time. To edit the provider:

  1. Log in to the Content Server instance as system administrator.

  2. Choose Administration, then Providers.

  3. In the Providers page, click Info in the Action column next to the DefaultFileStore provider.

  4. In the File Store Provider Information page, click Edit.

  5. In the Edit File Store Provider page, make the necessary modifications and click Update to submit the changes.

    Note:

    Do not navigate away from the Edit File Store Provider page before clicking Update to submit the change.

  6. Restart the Content Server instance.

12.3.4.3 Adding or Editing a Storage Rule

You can add multiple storage rules to the file store.

Important:

Storage rules cannot be deleted. Carefully consider each storage rule before you create it.

Caution:

Changing a storage rule after content has been checked in to the Content Server repository may cause Content Server to lose track of the content.

To add or edit storage rules:

  1. Log in to the Content Server instance as a system administrator.

  2. Choose Administration, then Providers.

  3. In the Providers page, click Info in the Action column next to the DefaultFileStore provider.

  4. In the File Store Provider Information page, click Edit.

  5. In the Edit File Store Provider page, click Add new rule, or select the name of the rule to edit from the Storage choice list, and click Edit rule.

  6. In the Storage Rule Name dialog, make the necessary modifications to the storage rule, and click OK.

    Note:

    If there are records associated with the storage rule being edited, then the following fields cannot be modified: FsWeblayoutDir (the weblayout directory) and FsHttpWebRoot (the HttpWebRoot and URL prefix).

  7. In the Edit File Store Provider page, click Update.

    Important:

    If the web root used in the web URL file path defined in the storage rule is something other than the default weblayout directory defined for Content Server, you must add an alias or virtual directory in your web server for the web root used in the storage rule. Otherwise, Content Server does not know where to access the file. For information on adding virtual directories to your web server, see the documentation that came with your web server.

12.4 Sample Implementations of File Store Provider

This section lists the contents of the tables contained in the provider definition file (provider.hda) for each of the examples. The provider.hda file does not need to be edited manually. Proper modification of the provider.hda file should be done within the Content Server interface using the Add/Edit Partition page, or through additional component development. The provided default options for other resource tables, such as PathMetaData Table, PathConstruction Table, and FileSystemFileStoreAlgorithmFilters Table, should have sufficient flexibility for most scenarios.

This section covers these topics:

12.4.1 Example PathMetaData Table Options

In most of the examples, the following PathMetaData Table configuration definitions are used. For clarity, the table has been trimmed of some of its columns that are not pertinent to the examples.

@ResultSet PathMetaData
6
FieldName
GenerationAlgorithm
RequiredForStorage
    <trimmed columns>
dID
#all
dDocName
#all
dDocAccount
documentAccount
dDocType
#all
dExtension
#all
dWebExtension
weburl
dSecurityGroup
#all
dRevisionID
#all
dReleaseState
#all
dStatus
web
xPartitionId
partitionSelection
ExtensionSeparator
extensionSeparator
xWebFlag
RenditionId
#all
RevisionLabel
revisionLabel
RenditionSpecifier
renditionSpecifier
@end

12.4.2 Configuration for Standard File Paths

File Store Provider can be configured to place content on a file system in the standard Content Server locations.

12.4.2.1 Defining the Storage Rule

The first step is to define the storage rule. In this case, the storage rule will be of type FileStorage, because all content is to be stored on the file system.

Example:

@ResultSet StorageRules
4
StorageRule
StorageType
IsWeblessStore
RenditionsOnFileSystem
default
FileStorage
@end

12.4.2.2 Defining the Path Construction

The second step is to define the path construction for each of the storage classes for the rule. In general, the last part of the path should be standard for all usage examples. If not, then Content Server does not work well with hcs* files. However, the root path can be changed without affecting functionality, assuming that changing the web URL file path root is properly acknowledged by the web server as a Content Server web root.

In this configuration, the vault, web, and web URL storage classes need to be defined in the PathConstruction Table. The path expression for the vault has already been discussed in Section 12.3.1.2. $dispersion$ implements dispersion of content on the file system. The caller can provide this dispersion on the storage rule page.

This setup only looks at the web path expression, which differs from the web URL only in its root. In other words, the web path is an absolute path on the file system, while the web URL is a URL served up by a web server.

Example:

@ResultSet PathConstruction
5
FileStore
PathExpression
AutoCreateLimit
IsWritable
StorageRule
vault
$#env.VaultDir$$dDocType$/$dDocAccount$/$dispersion$/$dID$$ExtensionSeparator$
    $dExtension$
6
true
default
weburl
$FsHttpWebRoot$groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/
    $dispersion$/~edisp/$dDocName$$RenditionSpecifier$$RevisionLabel$
    $ExtensionSeparator$$dWebExtension$
3
false
default
web
$FsWeblayoutDir$groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/
    $dispersion$/~edisp/$dDocName$$RenditionSpecifier$$RevisionLabel$
    $ExtensionSeparator$$dWebExtension$
3
true
default
@end
  • The web path construction is defined to be:

    $FsWeblayoutDir$groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/
    $dispersion$/~edisp/$dDocName$$RenditionSpecifier$$RevisionLabel$
    $ExtensionSeparator$$dWebExtension$
    
  • This is parsed into its parts as described in the following table:

    Path Segment Description

    $FsWeblayoutDir$

    Variable for the web-viewable path for the weblayout directory.

    $FsHttpWebRoot$

    Alternate Idoc Script variable for web URL.

    /groups/

    String.

    $dSecurityGroup$

    Used by the PathMetaData table. This is a required field and must consequently be provided by the caller or descriptor creator. It is part of a content item's metadata information.

    $dDocAccount$

    This is mapped to a documentAccount algorithm which takes dDocAccount and parses it into the standard Content Server account presentation with all the appropriate delimiters.

    /documents/

    String.

    $dDocType$

    Used by the PathMetaData table. This is a required field and must consequently be provided by the caller or descriptor creator. It is part of a content item's metadata information.

    $dispersion$

    Implements dispersion of content on the file system.

    ~edisp

    Indicates that dispersion has ended at the point where this marker is placed. Even if dispersion is empty there will be an ~edisp marker.

    $dDocName$

    Used by the PathMetaData table. This is a required field and must consequently be provided by the caller or descriptor creator. It is part of a content item's metadata information.

    $RenditionSpecifier$

    This is provided by the renditionSpecifier, which is only of interest if the system is creating additional renditions such as thumbnails. Otherwise, this returns an empty string.

    $RevisionLabel$

    The revision label is provided by the revisionLabel algorithm which, depending on the status of the content item, adds a '~dRevLabel' to the path.

    $ExtensionSeparator$

    The extensionSeparator algorithm is used here and by default it returns '.'.

    $dWebExtension$

    The dWebExtension is a required field for the web and web URL storage classes and is passed in through the file parameters.
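The way a path expression is resolved from its segments can be sketched in Java as follows. This is a simplified illustration, not the File Store Provider implementation: the class name, the sample values, and the /u01/weblayout/ root are hypothetical, and the sketch assumes that every field (including those normally computed by an algorithm, such as RevisionLabel) has already been resolved to a string.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only; not the File Store Provider implementation. */
final class PathExpressionSketch {

    // The web path expression from the example above, joined onto one line.
    static final String WEB_EXPRESSION =
        "$FsWeblayoutDir$groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/"
        + "$dispersion$/~edisp/$dDocName$$RenditionSpecifier$$RevisionLabel$"
        + "$ExtensionSeparator$$dWebExtension$";

    // A token is any run of non-$ characters enclosed in a pair of $ signs.
    private static final Pattern TOKEN = Pattern.compile("\\$([^$]+)\\$");

    /** Replaces each $name$ token with its value; fails if a value is missing. */
    static String expand(String expression, Map<String, String> values) {
        Matcher m = TOKEN.matcher(expression);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            String name = m.group(1);
            if (!values.containsKey(name)) {
                throw new IllegalArgumentException("No value for path field: " + name);
            }
            out.append(expression, last, m.start()); // literal text before the token
            out.append(values.get(name));
            last = m.end();
        }
        out.append(expression.substring(last));      // literal text after the last token
        return out.toString();
    }

    /** Hypothetical, already-resolved values for one content item. */
    static Map<String, String> sampleValues() {
        Map<String, String> v = new HashMap<>();
        v.put("FsWeblayoutDir", "/u01/weblayout/");
        v.put("dSecurityGroup", "public");
        v.put("dDocAccount", "finance");
        v.put("dDocType", "document");
        v.put("dispersion", "a1");
        v.put("dDocName", "report01");
        v.put("RenditionSpecifier", "");   // empty when no extra rendition is created
        v.put("RevisionLabel", "~2");
        v.put("ExtensionSeparator", ".");
        v.put("dWebExtension", "pdf");
        return v;
    }

    public static void main(String[] args) {
        System.out.println(expand(WEB_EXPRESSION, sampleValues()));
    }
}
```

With these sample values, the expression resolves to /u01/weblayout/groups/public/finance/documents/document/a1/~edisp/report01~2.pdf.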


12.4.3 Configuration for a Webless or Optional Web Store

In this example, the storage rule from the previous example is configured with IsWeblessStore set to true, and consequently the web-viewable file is not created by default. However, if the document is processed through Inbound Refinery or WebForms or any other component that requires a web-viewable file, the web file is created. The location of the files is the same as in the standard configuration. However, because a file might not have a web rendition, the web URL path must be adjusted. Also, note the use of weburl.file. This is used to compute the URL when the web-viewable file actually exists. The metadata field xWebFlag is used to determine how the file is to be served up in the browser.

12.4.3.1 Defining the Storage Rule Example

@ResultSet StorageRules
4
StorageRule
StorageType
IsWeblessStore
RenditionsOnFileSystem
default
FileStorage
true
@end

12.4.3.2 Defining the Path Construction Example

@ResultSet PathConstruction
5
FileStore
PathExpression
AutoCreateLimit
IsWritable
StorageRule
vault
$#env.VaultDir$$dDocType$/$dDocAccount$/$dispersion$/$dID$$ExtensionSeparator$
    $dExtension$
6
true
default
weburl
$HttpCgiPath$?IdcService=GET_FILE&dID=$dRevClassID$
    &dDocName=$dDocName$&allowInterrupt=1&noSaveAs=1&fileName=$dOriginalName$
3
false
default
weburl.file
$FsHttpWebRoot$groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/
    $dispersion$/~edisp/$dDocName$$RenditionSpecifier$$RevisionLabel$
    $ExtensionSeparator$$dWebExtension$
3
false
default
web
$FsWeblayoutDir$groups/$dSecurityGroup$/$dDocAccount$/documents/$dDocType$/
    $dispersion$/~edisp/$dDocName$$RenditionSpecifier$$RevisionLabel$
    $ExtensionSeparator$$dWebExtension$
3
true
default
@end

12.4.4 Configuration for Database Storage

To store files in the database, you need a storage rule that is of type JdbcStorage. By default, all content items belonging to this rule have their files stored in the database. However, even though the files are stored in the database, there is the presumption of an underlying file system and the system may need to temporarily cache a file on the file system. In particular, this may happen for indexing or for some conversions.

Tech Tip:

A rule can be configured to always store renditions belonging to a given storage class on the file system. This is most useful for systems that store vault files in the database, but web files on the file system.

12.4.5 Altered Path Construction and Algorithms

The previous examples have kept the file paths consistent with the standard configuration. For very large implementations, this can result in directory saturation and slow performance. The following examples aid in dispersing files over several storage options.

12.4.5.1 Using Partitioning

File Store Provider makes it easy to use partitions to create a sparser directory structure. By default, the xPartitionId metadata field is used and becomes a part of a content item revision's metadata information. It is recommended that this field be disabled in the Content Server interface so that the partition selection algorithm determines the partition to use. The partition selection algorithm looks at all the active partitions and, as new content enters the system, selects the partitions in order. Each partition has an entry in the PartitionList table and can be declared active. The PartitionRoot is calculated from xPartitionId, where the value is a lookup key into the PartitionList table. If no xPartitionId is specified, the system finds the next available active partition and uses this value for the location calculation. The xPartitionId is then stored as part of the content item's metadata.

To use the partition selection, define the vault storage class in the PathConstruction table as follows:

vault
$PartitionRoot$/$dDocType$/$dDocAccount$/$dRevClassID$$ExtensionSeparator$$dExtension$
6
true

Partitions can be deactivated using the Add/Edit Partition page at any time if a system administrator needs to close a partition to contribution, for example if maintenance is required on the storage device.
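The selection behavior described in this section can be sketched in Java as a simple round-robin over the active partitions. This is an illustration under stated assumptions, not the partitionSelection algorithm itself: the class, its fields, and the rule that an explicitly supplied xPartitionId is looked up regardless of the active flag are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; not the Content Server partitionSelection algorithm. */
final class PartitionSelectionSketch {

    /** One row of a hypothetical in-memory copy of the PartitionList table. */
    static final class Partition {
        final String name;
        final String root;
        final boolean active;

        Partition(String name, String root, boolean active) {
            this.name = name;
            this.root = root;
            this.active = active;
        }
    }

    private final List<Partition> partitions = new ArrayList<>();
    private int next = 0; // index at which the next search starts

    void add(Partition p) {
        partitions.add(p);
    }

    /** Returns the partition for a content item; xPartitionId may be null or empty. */
    Partition select(String xPartitionId) {
        if (xPartitionId != null && !xPartitionId.isEmpty()) {
            // An explicit value is used as a lookup key into the partition list.
            for (Partition p : partitions) {
                if (p.name.equals(xPartitionId)) {
                    return p;
                }
            }
            throw new IllegalArgumentException("Unknown partition: " + xPartitionId);
        }
        // Otherwise take the next active partition, in order, wrapping around.
        for (int i = 0; i < partitions.size(); i++) {
            int index = (next + i) % partitions.size();
            Partition p = partitions.get(index);
            if (p.active) {
                next = (index + 1) % partitions.size();
                return p;
            }
        }
        throw new IllegalStateException("No active partition available");
    }

    public static void main(String[] args) {
        PartitionSelectionSketch s = new PartitionSelectionSketch();
        s.add(new Partition("partition1", "/vault/partition1/", true));
        s.add(new Partition("partition2", "/vault/partition2/", false));
        s.add(new Partition("partition3", "/vault/partition3/", true));
        for (int i = 0; i < 3; i++) {
            System.out.println(s.select(null).name);
        }
    }
}
```

In this sketch, deactivating a partition only removes it from the rotation for new content. Items that already carry its xPartitionId still resolve to the same root.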

12.4.5.2 Adding a Partition to the Weblayout Path

This example shows how to partition both vault and weblayout directories, and also maintain valid web URL file paths.

Add the partition root to the web-viewable path and web URL file path, and edit the variables $FsWeblayoutDir$ and $FsHttpWebRoot$ on the Storage Rule Name dialog.

$FsWeblayoutDir$ represents $PartitionRoot$/weblayout. $FsHttpWebRoot$ represents $HttpWebRoot$/$xPartitionId$/weblayout/.

Define partitionRoot in the Add/Edit Partition page as follows:

Partition Name Partition Root

partition1

$#env.WeblayoutDir$/partition1/

partition2

$#env.WeblayoutDir$/partition2/


In order to keep the web URL file path consistent with the web-viewable path in the weblayout directory, the variable xPartitionId is used so that partition1 or partition2 is correctly replaced when creating the web URL file path.

Ensure that the web-viewable path and the web URL file path evaluate into the same path.

  • $FsWeblayoutDir$ represents $PartitionRoot$/weblayout/. For partition1 this evaluates to $#env.WeblayoutDir$/partition1/weblayout/. For partition2 this evaluates to $#env.WeblayoutDir$/partition2/weblayout/.

  • $FsHttpWebRoot$ represents $HttpWebRoot$/$xPartitionId$/weblayout/. For partition1 this evaluates to $HttpWebRoot$/partition1/weblayout/. For partition2 this evaluates to $HttpWebRoot$/partition2/weblayout/.

If you set up the partitions (partition1 and partition2) to use the partition root of $#env.VaultDir$/partition1 and $#env.VaultDir$/partition2 instead of the $#env.WeblayoutDir$ and $HttpWebRoot$ settings, then the weblayout file will end up stored in the vault directory. It then can be used only for partitioning the vault files.

12.4.5.3 Limiting the Number of Files in a Directory

Another way of dispersing files is to alter the path so that files get partitioned out by the dRevClassID of the content item. In the example below, the directories are limited to 10,000 files plus extra files for additional renditions.

If your path expression contains $dRevClassID[-12:-10:0]$/$dRevClassID[-10:-8:0]$/$dRevClassID[-8:-4:0]$ and dRevClassID is 1234567890, the result is 00/12/3456.

Note the $dRevClassID[-12:-10:0]$ in the path expression. This is interpreted as follows:

  • Get the characters from 12 back from the end of the string up to 10 back from the end of the string.

  • Pad the resulting string to length 2, which is 12 - 10, with 0 characters.
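The interpretation above can be sketched in Java as follows. This is an illustration only, not the Content Server parser: the class and method names are hypothetical, the offsets are passed as positive distances from the end of the string, and the sketch assumes that padding is added on the left.

```java
/** Illustrative only; mimics the [start:end:pad] slice described above. */
final class RevClassIdSlice {

    /**
     * Returns the characters of value lying between startBack and endBack
     * positions from the end, padded to (startBack - endBack) characters.
     */
    static String slice(String value, int startBack, int endBack, char pad) {
        int width = startBack - endBack;
        int len = value.length();
        // Clamp to the start of the string when the value is shorter than the offset.
        int from = Math.max(0, len - startBack);
        int to = Math.max(0, len - endBack);
        StringBuilder sb = new StringBuilder(value.substring(from, to));
        while (sb.length() < width) {
            sb.insert(0, pad); // assumption: pad on the left
        }
        return sb.toString();
    }

    /** Builds the three directory levels used in the example expression. */
    static String dispersionPath(String dRevClassID) {
        return slice(dRevClassID, 12, 10, '0') + "/"
             + slice(dRevClassID, 10, 8, '0') + "/"
             + slice(dRevClassID, 8, 4, '0');
    }

    public static void main(String[] args) {
        System.out.println(dispersionPath("1234567890")); // prints 00/12/3456
    }
}
```

Because the last four digits of dRevClassID are left out of the directory path, each leaf directory receives at most 10,000 distinct dRevClassID values, which matches the limit stated above.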

12.5 Using Sun Storage Archive Manager

This section introduces the Sun Storage Archive Manager (SAM-QFS) product and explains how to configure Content Server to work with SAM-QFS.

12.5.1 About SAM-QFS

The Sun Storage Archive Manager (SAM-QFS) is a hierarchical file storage system that runs on the Oracle Solaris operating system. When configured with the WORM (Write Once Read Many) option, it supports archiving file system data so that users can only read it. The SAM-QFS environment includes a storage and archive manager along with Sun QFS file system software. SAM-QFS can be used by the Content Server machine through an NFS mount, together with additional Content Server configuration.

SAM-QFS comprises two products:

  • Quick File System (QFS) is a kernel-level file system that can be installed on Oracle Solaris and SPARC platforms. This product provides the WORM feature and retention manager feature, which can be used with Content Server.

    Note:

    Weblayout cannot be stored using WORM.

  • Storage Archive Manager (SAM) includes several programs that run in user space to archive files initially filed in QFS, retrieve files on demand, and manage the archive by freeing space and so on.

    • The Archiver program is proactively notified when files change so that they are archived in an event-driven fashion and not by polling the file system. It also manages the archive and creates backup copies as needed.

    • The Releaser program releases the content from primary storage after it validates that all copies have been made by the archiver.

    • The Stager program loads data or stages data to primary storage from an archive copy (which can be on an archive disk or an archive tape) so that the data can be retrieved by users from QFS. This activity can be configured to be done on demand or according to policy.

    • The Recycler program purges deleted files from secondary storage so that the space can be reclaimed and reused.

    Files are stored in the TAR utility format so that the file system metadata is retained along with the actual file. Multiple files can be stored in the same TAR for efficiency.

12.5.2 Considerations for Using SAM-QFS

Consider the following before implementing SAM-QFS:

  • SAM-QFS provides WORM (Write Once Read Many) capability, along with the ability to specify a retention period after which the WORM constraints are lifted.

  • Files can be automatically archived to tape. SAM-QFS provides an integrated, seamless, backup solution with a transparent restore.

  • If recycling of files on archive is scheduled with SAM-QFS, then revisions of documents can be retrieved by maintaining the metadata backup snapshots and mounting them as needed.

  • Content items cannot be deleted. A delete action will fail and generate an error message.

  • Content checked in using a WORM-enabled storage rule cannot be edited in a workflow step that uses the option "User can edit (replace) the current revision". When a workflow step uses this option, the vault rendition of the file is replaced, which is not possible with a read-only file system.

  • The native file path depends on the storage rule vault path field. If the native file path value contains a metadata field as part of its path, the metadata field cannot be updated, because the update action will try to change the file system path (which is not possible in a read-only system). Trying to change the metadata attribute will fail and generate an error message. The recommended setting for a WORM-enabled storage rule is to not have any metadata field in the native file path if the path might be changed in the future.

12.5.3 Installing SAM-QFS

For information on how to install SAM-QFS on an Oracle Solaris system and enable WORM, see the Sun QFS and Sun Storage Archive Manager (SAM-QFS) wiki at https://wikis.oracle.com/display/SAMQFS/Home. The SUNWsamfswm package is not included in the SAM-QFS 5.2 download version, so contact the SAM-QFS group for this package.

12.5.4 Configuring Content Server and SAM-QFS with WORM

To work with SAM-QFS, certain configurations must be set. To enable WORM for the default storage rule, the default rule (DispByContentId) in the Content Server File Store Provider must be modified. Other storage rules also can be modified to enable WORM, if needed.

12.5.4.1 Configuring the Vault Path

To configure the vault path:

  1. In the Content Server instance, choose Administration, then Admin Server, then General Configuration.

  2. In the Additional Configuration Variables field, enter the environment variable IsVaultFileSystemWorm=true

  3. Edit the DomainHome/ucm/cs/bin/intradoc.cfg file to set the VaultDir parameter to the vault file path for the SAM-QFS location.

  4. Starting at the SAM-QFS mount point, apply chmod -R 4000 directoryName to the subdirectories except for the vault/~temp directory. The vault/~temp directory must never be WORM enabled.

    Work from the top-most level down. The WORM trigger can be applied only if the parent directory has the WORM trigger enabled.

  5. Edit the Solaris /etc/vfstab file to specify the default retention period.

12.5.4.2 Configuring the File Store Provider to Enable WORM

To configure the File Store Provider to enable WORM for vault files:

  1. In the Content Server instance, choose Administration, then Providers.

  2. From the DefaultFileStore row in the list of providers, click Info in the Action column.

  3. In the File Store Provider Information page, click Edit.

  4. In the File Store Provider page, in the line for Storage Rules, click Edit Rule.

  5. In the Storage Rule Name page, ensure that File system only is selected.

  6. Check Allow WORM/Retention (SAM-QFS only).

  7. If a retention period needs to be set, check Set default retention period for vault files and enter the number of years and months for retention.

    This option is limited to dates up to 2038 if either of these conditions is true: the SAM-QFS file system is 32-bit, or the operating system where Content Server is running is 32-bit. If a greater retention period is needed, use the SAM-QFS mount option parameter to set the default retention period instead of checking this Content Server option.
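The following Java sketch shows one way to check in advance whether a retention period would cross the 2038 limit. It assumes that the limit corresponds to the largest time value that fits in a signed 32-bit count of seconds since January 1, 1970 (UTC). This guide does not state the cause of the limit, so treat that as an assumption. The class and method names are hypothetical.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

/** Illustrative only; not part of Content Server or SAM-QFS. */
final class RetentionLimitSketch {

    // Largest instant representable as a signed 32-bit count of seconds since 1970.
    static final Instant MAX_32BIT = Instant.ofEpochSecond(Integer.MAX_VALUE);

    /** Returns true if start plus the retention period does not pass the 32-bit limit. */
    static boolean fitsIn32Bit(Instant start, int years, int months) {
        ZonedDateTime end = start.atZone(ZoneOffset.UTC)
                                 .plusYears(years)
                                 .plusMonths(months);
        return !end.toInstant().isAfter(MAX_32BIT);
    }

    public static void main(String[] args) {
        System.out.println(MAX_32BIT); // prints 2038-01-19T03:14:07Z
        System.out.println(fitsIn32Bit(Instant.now(), 7, 0));
    }
}
```

If the check returns false for the intended retention period on a 32-bit system, set the default retention period with the SAM-QFS mount option instead, as described above.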