Creating and Managing Content



The following topics are included:

What is the Content Management Component?

Third-party tools and WLPS

Constructing queries using Java

Differences between content management and document management

Using the document servlet

JSP Tags

Configuring the Content Management component

Configuring the Document EJB deployment descriptor

Configuring the Document Schema EJB deployment descriptor

Configuring the DocumentManager EJB deployment descriptor

Setting up Connection pools

Using the Show Document Servlet

Querying document content

Structuring a query

Using comparison operators to construct queries

Using the BulkLoader to load file-based content

Using Content Management JSP Tags

The Content Management component of WLPS 2.0 provides content and document management capabilities for use in personalization services. The Content Manager works with files or with content managed by third-party vendor tools from Documentum and Interwoven.

What is the Content Management Component?

The Content Management runtime component provides access to content via both tags and EJBs. The Content Management tags allow a JSP developer to receive an enumeration of Content objects by querying the content database directly using a search expression syntax.

WebLogic Personalization Server 2.0 provides several components that allow content personalization for users. Together, these components provide a complete personalization solution. Of these personalization components, the Portal, Rules, User, and Property Set Management elements include edit-time graphical user interfaces (GUIs) that allow developers to customize the elements. Neither the Content Management component nor the Personalization Advisor component has a GUI.

The Content Management component works alongside the other components to deliver personalized content, but doesn't have a GUI-based tool for edit-time customization. The content engine behind the ContentManager may be set up to be the reference implementation, provided out of the box, or Documentum. The Content Management component supports querying that returns content from a content repository using several methods:

Search for content by metadata: Boolean logic searching evaluates content that matches a metadata/operator/value criterion.
Retrieve content by ID: The system allows retrieval of the raw bytes of content data, either in blocks or in its entirety, through the content's known identifier.
Query content metadata by ID: The system, through the known identifier of a content piece, can query the metadata describing the content piece. Several metadata attributes provide information about the content. The query language maps some attribute names onto explicit attributes of the Content or Document objects the query searches. Queries searching for Content objects support the following case-sensitive explicit attribute names:
- identifier: Corresponds to the unique String identifier of the Content (i.e. the getIdentifier method).
- mimeType: Corresponds to the String MIME type of the Content (i.e. the getMimeType method).
Queries searching for Document objects support the following additional case-sensitive explicit attribute names:
- size: Corresponds to the Long size of the document in bytes (i.e. the getSize method). Documents without file bytes will have a size of 0 or less.
- version: Corresponds to the Integer version number of the document (i.e. the getVersion method).
- author: Corresponds to the String identifier of the author of the document (i.e. the getAuthor method).
- creationDate: Corresponds to the Timestamp of when the document was created (i.e. the getTimestamp method).
- modifiedBy: Corresponds to the String identifier of the individual who last modified the document (i.e. the getModifiedBy method).
- modifiedDate: Corresponds to the Timestamp of when the document was last modified (i.e. the getModifiedDate method).
- lockedBy: Corresponds to the String identifier of the individual who has the document locked (i.e. the getLockedBy method).
- description: Corresponds to the String description of the document (i.e. the getDescription method).
- comments: Corresponds to any String comments about the document (i.e. the getComments method).
  Note: All other attribute names in queries are considered implicit metadata properties.
Get content schema by name: The document management system (DMS) contains a set of named schemas that describe a set of non-standard metadata attributes. Each piece of content in the DMS is associated with one of these schemas, and each schema specifies the valid attributes.
Get content schema names: A user can query the system for a list of all schema names a DMS supports.
Note: See Querying document content for more information about queries.

Third-party tools and WLPS

BEA partners with third-party vendors to add flexibility to WebLogic Personalization Server. The Content Management component works with Interwoven's TeamSite/OpenDeploy product and Documentum's 4i product. Both these products provide robust content creation management solutions while the Content Management component of WLPS 2.0 personalizes and serves the content to the end-user.

Constructing queries using Java

To construct queries using Java syntax instead of using the query language supplied with the Content Management component, refer to the API documentation.

Note: Use the constants in TypesHelper when calling Logical.setLogical and Criteria.setComparator.

The ContentManager session bean is the primary interface to the functionality of the Content Management component. Using a ContentManager instance, content is returned based on a Search object with an embedded Expression. An Expression is a Boolean tree of arbitrary depth, with other sub-Expressions as nodes. The Expression interface is abstract; the actual instances are Logical or Criteria objects. As an example, the expression color == 'red' && price > 50 would consist of a Logical with the value "and" that has two Criteria as children.
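The tree structure described above can be illustrated with a small, self-contained sketch. The nested types below (Node, CriteriaNode, LogicalNode) are hypothetical stand-ins written for this illustration; they are not the WLPS Expression, Logical, or Criteria interfaces, and they support only the two operators the example needs. Consult the API documentation for the real signatures.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustration only: hypothetical stand-ins, not the WLPS API. */
final class ExpressionTreeSketch {

    /** Any node of the Boolean tree. */
    interface Node {
        boolean matches(Map<String, Object> metadata);
    }

    /** Leaf node: one attribute/operator/value comparison. */
    static final class CriteriaNode implements Node {
        private final String attribute;
        private final String operator; // only "==" and ">" in this sketch
        private final Object value;

        CriteriaNode(String attribute, String operator, Object value) {
            this.attribute = attribute;
            this.operator = operator;
            this.value = value;
        }

        @Override
        public boolean matches(Map<String, Object> metadata) {
            Object actual = metadata.get(attribute);
            if (actual == null) {
                return false; // a missing attribute never matches
            }
            if ("==".equals(operator)) {
                return actual.equals(value);
            }
            if (">".equals(operator)) {
                return ((Number) actual).doubleValue() > ((Number) value).doubleValue();
            }
            throw new IllegalArgumentException("Unsupported operator: " + operator);
        }
    }

    /** Interior node: combines its children with "and" or "or". */
    static final class LogicalNode implements Node {
        private final boolean isAnd;
        private final List<Node> children;

        LogicalNode(boolean isAnd, Node... children) {
            this.isAnd = isAnd;
            this.children = Arrays.asList(children);
        }

        @Override
        public boolean matches(Map<String, Object> metadata) {
            for (Node child : children) {
                boolean result = child.matches(metadata);
                if (isAnd && !result) {
                    return false; // "and" fails on the first false child
                }
                if (!isAnd && result) {
                    return true; // "or" succeeds on the first true child
                }
            }
            return isAnd;
        }
    }

    /** Builds the tree for: color == 'red' && price > 50 */
    static Node redAndExpensive() {
        return new LogicalNode(true,
                new CriteriaNode("color", "==", "red"),
                new CriteriaNode("price", ">", 50));
    }

    public static void main(String[] args) {
        Map<String, Object> item = new HashMap<>();
        item.put("color", "red");
        item.put("price", 75);
        System.out.println(redAndExpensive().matches(item));
    }
}
```

The point of the sketch is the shape of the tree: one logical node at the root and two criteria leaves beneath it. Deeper queries nest further logical nodes in place of the leaves.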

Differences between content management and document management

Content objects include metadata about the content. Metadata provides a means to query and match content with users by allowing the system to retrieve content based on the metadata that describes the content. In general, some kind of content management system provides services such as retrieval of content and content authoring services including creation, editing, versioning, and workflow.

Documents are a specialized type of Content that provide two methods for retrieval: a metadata-searching mechanism and retrieval of the pure bytes of the document's file. Documents should include additional explicit metadata properties related to the file and its versioning, including its size, name, path, author, and version. A document management system usually provides document-based services for documents that reside in the system's repository.

WebLogic Personalization Server 2.0 provides the entire Content object model; however, it only provides the Document object as a concrete implementation (subclass) of the Content class.

Using the document servlet

The Content Management component includes a servlet capable of outputting the contents of a Document object. This servlet is useful for streaming an image that resides in a content management system, or for streaming a document's contents from a content management system when an HTML link is selected. The servlet supports the following request/URL parameters:

Request Parameter	Required	Description
contentHome	maybe	If the contentHome initialization parameter is not specified, then this is required and will be used as the JNDI name of the DocumentHome. If the contentHome initialization parameter is specified, this is ignored.
contentId	no	The string identifier of the Document to retrieve. If not specified, the servlet looks in the PATH_INFO.
blockSize	no	The size of the data blocks to read. The default is 8K. Use 0 or less to read the entire block of bytes in one operation.

The servlet supports only Documents, not other subclasses of Content. It sets the Content-Type to the Document's MIME type, the Content-Length to the Document's size, and the Content-Disposition so that the correct file name is presented when the file is saved from a browser.
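The effect of the blockSize parameter can be pictured with the following sketch. It is not the servlet's implementation; the class and method names are hypothetical, and the only behavior taken from the table above is the 8K default and the rule that a value of 0 or less means the bytes are handled in a single operation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Illustration only: a hypothetical block-wise copy, not the servlet's code. */
final class BlockCopySketch {
    static final int DEFAULT_BLOCK_SIZE = 8 * 1024; // the documented 8K default

    /** Copies in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out, int blockSize) throws IOException {
        if (blockSize <= 0) {
            // 0 or less: gather everything first, then emit it with one write.
            ByteArrayOutputStream whole = new ByteArrayOutputStream();
            byte[] scratch = new byte[DEFAULT_BLOCK_SIZE];
            int n;
            while ((n = in.read(scratch)) != -1) {
                whole.write(scratch, 0, n);
            }
            out.write(whole.toByteArray());
            return whole.size();
        }
        // Positive block size: read and write one block at a time.
        byte[] block = new byte[blockSize];
        long total = 0;
        int n;
        while ((n = in.read(block)) != -1) {
            out.write(block, 0, n);
            total += n;
        }
        return total;
    }

    /** In-memory convenience wrapper used for demonstration. */
    static byte[] copyBytes(byte[] data, int blockSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            copy(new ByteArrayInputStream(data), out, blockSize);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = new byte[20000];
        System.out.println(copyBytes(data, DEFAULT_BLOCK_SIZE).length);
        System.out.println(copyBytes(data, 0).length);
    }
}
```

Smaller blocks keep memory use bounded for large documents, while reading everything at once trades memory for fewer operations.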

Example 1: Usage in a JSP:

<cm:select contentHome="bea.eDocs.CMgr" max="5" 
sortBy="creationDate ASC, title ASC" query="type = 'News' && 
timeOfDay = 'Evening' && mimeType like 'text/*' " id="newsList" />

<ul>
    <es:foreachinarray array="newsList" id="newsItem"
    type="com.beasys.commerce.axiom.content.Content">
        <li><a href="/showDocServlet/<cm:printproperty
        id="newsItem" name="identifier"
        encode="url"/>?contentHome=bea.eDocs.CMgr"><cm:printproperty
        id="newsItem" name="Title" encode="html" /></a></li>
    </es:foreachinarray>
</ul>

Example 2: Usage in a JSP:

This example searches for image files whose keywords contain birds and displays the images in a bulleted list.

<cm:select contentHome="bea.eDocs.cMgr" max="5" sortBy="name" 
id="list" query="Keywords like '*birds*' && mimeType like 
'image/*'" />
<ul>
    <es:foreachinarray array="list" id="img" 
    type="com.beasys.commerce.axiom.content.Content">
        <li><img src="/showDocServlet?contentId=<cm:printproperty 
        id="img" name="identifier" 
        encode="url"/>&contentHome=bea.eDocs.cMgr"></li>
    </es:foreachinarray>
</ul>
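When the servlet URL is assembled in Java code rather than in a JSP, the same request parameters can be put together as in the sketch below. The class, the method names, and the sample content identifier are hypothetical; only the parameter names contentId, contentHome, and blockSize come from the table earlier in this topic.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Illustration only: a hypothetical helper for building a servlet URL. */
final class ShowDocUrlSketch {

    /** URL-encodes one query-string value as UTF-8. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    /**
     * Builds servletPath?contentId=...[&contentHome=...][&blockSize=...].
     * Pass null for contentHome or blockSize to leave the parameter out.
     */
    static String showDocUrl(String servletPath, String contentId,
                             String contentHome, Integer blockSize) {
        StringBuilder url = new StringBuilder(servletPath);
        url.append("?contentId=").append(encode(contentId));
        if (contentHome != null) {
            url.append("&contentHome=").append(encode(contentHome));
        }
        if (blockSize != null) {
            url.append("&blockSize=").append(blockSize.intValue());
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // "images/blue jay.gif" is a made-up identifier for demonstration.
        System.out.println(showDocUrl("/showDocServlet", "images/blue jay.gif",
                "bea.eDocs.cMgr", null));
    }
}
```

Encoding each value separately keeps characters such as spaces and slashes in an identifier from being misread as part of the URL structure.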

JSP Tags

The Content Management component includes four JSP tags. These tags allow a JSP developer to include non-personalized content in an HTML-based page. Note that none of the tags support or use a body. The tags include:

The <cm:select> tag uses only the search expression query syntax to select content. See the JSP documentation for more information.
The <cm:selectbyid> tag retrieves content using the content's unique identifier. See the JSP documentation for more information.
The <cm:printproperty> tag inlines the value of the specified Content metadata property as a string. See the JSP documentation for more information.
The <cm:printdoc> tag inlines the raw bytes of a Document object into the JSP output stream. See the JSP documentation for more information.

Configuring the Content Management component

The Document EJB, Document Schema EJB, and DocumentManager EJB deployment descriptors handle the configuration for the Content Management component. To use the reference implementation document repository, you need to configure the EJB deployment descriptors and also set up two WLS JDBC connection pools.

Once the deployment descriptor has been written, just build the EJBs as you normally would, then add the resulting jar file to your ejb.deploy entry in the weblogic.properties file.

Configuring the Document EJB deployment descriptor

The logic for loading Document EJBs is handled via a SmartBMP. The Document EJB implementation loads the SmartBMP object from a class name specified in the EJB environment in the EJB's deployment descriptor. The EJB environment variable is SmartBMPClass. The value must be the fully-qualified class name of the SmartBMP to use. This class must be capable of populating a DocumentImpl object and must also have the methods defined in the Content and Document Javadocs.

To use the reference implementation document management system, set SmartBMPClass to com.beasys.commerce.axiom.document.SPIDocumentSmartBMP and specify the following EJB environment variables in the document EJB deployment descriptor:

SmartConnectionPoolClass (required): Specifies the fully-qualified class name of the SmartConnectionPool implementation class. In WebLogic Server, set SmartConnectionPoolClass to com.beasys.commerce.foundation.plugin.weblogic.WeblogicConnectionPool.
SmartBMPUpdate: Set to false.
SmartBMP_URL (required): Specifies the JDBC URL to the document JDBC connection pool (see Setting up Connection pools), which is the URL that the EJB uses to obtain a document connection. This value should correspond to the WebLogic connection pool that uses the document reference implementation JDBC driver.
PropertyCase: This sets how the DocumentImpl modifies incoming property names. If this is lower, all property names are converted to lower case. If this is upper, all property names are converted to upper case. If this is anything else or not specified, property names are not modified. Use lower or upper if the SmartBMP class expects everything in a certain case (e.g. the Documentum SmartBMP expects everything in lower case). For the document reference implementation, do not specify the PropertyCase.

SmartBMP classes for other document management systems may require additional or different EJB environment variables.
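The PropertyCase rule described above amounts to a three-way choice, sketched here in Java. The class and method names are hypothetical and this is not the DocumentImpl code; treating the setting values lower and upper as exact, case-sensitive matches is an assumption of the sketch.

```java
import java.util.Locale;

/** Illustration only: a hypothetical rendering of the PropertyCase rule. */
final class PropertyCaseSketch {

    /**
     * Returns the property name as it would be passed on:
     * lower-cased for "lower", upper-cased for "upper", otherwise unchanged.
     */
    static String applyPropertyCase(String propertyCase, String propertyName) {
        if ("lower".equals(propertyCase)) {
            return propertyName.toLowerCase(Locale.ROOT);
        }
        if ("upper".equals(propertyCase)) {
            return propertyName.toUpperCase(Locale.ROOT);
        }
        // Anything else, including an unspecified (null) setting: no change.
        return propertyName;
    }

    public static void main(String[] args) {
        System.out.println(applyPropertyCase("lower", "CreationDate"));
        System.out.println(applyPropertyCase("upper", "CreationDate"));
        System.out.println(applyPropertyCase(null, "CreationDate"));
    }
}
```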

Configuring the Document Schema EJB deployment descriptor

The logic for loading Document Schema EJBs is handled via a SmartBMP. The Schema EJB implementation loads the SmartBMP object from a class name specified in the EJB environment in the EJB's deployment descriptor. The EJB environment variable is SmartBMPClass. The value must be the fully-qualified class name of the SmartBMP to use. This SmartBMP must be capable of populating a SchemaImpl object with PropertyMetaData objects.

To use the reference implementation document management system, set SmartBMPClass to com.beasys.commerce.axiom.document.SPISchemaSmartBMP and specify the following EJB environment variables in the Document Schema EJB deployment descriptor:

SmartConnectionPoolClass (required): Specifies the fully-qualified class name of the SmartConnectionPool implementation class. In WebLogic Server, set SmartConnectionPoolClass to com.beasys.commerce.foundation.plugin.weblogic.WeblogicConnectionPool.
SmartBMPUpdate: Set to false.
SmartBMP_URL (required): Specifies the JDBC URL to the document JDBC connection pool (see Setting up Connection pools), which is the URL that the EJB uses to obtain a document connection. This value should correspond to the WebLogic connection pool that uses the document reference implementation JDBC driver.
Note: This value should correspond to the value in the Document EJB. See Configuring the Document EJB deployment descriptor for more information.

SmartBMP classes for other document management systems may require additional or different EJB environment variables.

Configuring the DocumentManager EJB deployment descriptor

The DocumentManagerSession EJB simply hides the details of getting to the Document and DocumentSchema EJBs. It understands the following environment variables in its deployment descriptor:

UseDefaultHomeNames: If this is set to true, then the default home names will be used if either ContentHome or SchemaHome is not specified.
ContentHome: This specifies the JNDI home name of the DocumentHome object to use.
SchemaHome: This specifies the JNDI home name of the SchemaHome object to use.

Example deployment descriptor file

The following is a sample ejb-jar.xml deployment descriptor file:

<?xml version="1.0"?>
<!DOCTYPE ejb-jar PUBLIC '-//Sun Microsystems, Inc.//DTD Enterprise JavaBeans 1.1//EN' 'http://java.sun.com/j2ee/dtds/ejb-jar_1_1.dtd'>
<ejb-jar>
  <enterprise-beans>

    <!-- our Document entity bean -->
    <entity>
      <ejb-name>com.beasys.commerce.axiom.document.Document</ejb-name>
      <home>com.beasys.commerce.axiom.document.DocumentHome</home>
      <remote>com.beasys.commerce.axiom.document.Document</remote>
      <ejb-class>com.beasys.commerce.axiom.document.DocumentImpl</ejb-class>
      <persistence-type>Bean</persistence-type>
      <prim-key-class>
          com.beasys.commerce.axiom.document.DocumentPk
      </prim-key-class>
      <reentrant>False</reentrant>
      <env-entry>
        <env-entry-name>SmartConnectionPoolClass</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>
            com.beasys.commerce.foundation.plugin.weblogic.WeblogicConnectionPool
        </env-entry-value>
      </env-entry>
      <env-entry>
        <env-entry-name>SmartBMP_URL</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>jdbc:weblogic:pool:docPool</env-entry-value>
      </env-entry>
      <env-entry>
        <env-entry-name>SmartBMPClass</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>
          com.beasys.commerce.axiom.document.SPIDocumentSmartBMP
        </env-entry-value>
      </env-entry>
      <env-entry>
        <env-entry-name>SmartBMPUpdate</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>false</env-entry-value>
      </env-entry>
    </entity>

    <!-- our Schema entity bean -->
    <entity>
      <ejb-name>com.beasys.commerce.axiom.document.DocumentSchema</ejb-name>
      <home>com.beasys.commerce.foundation.property.SchemaHome</home>
      <remote>com.beasys.commerce.foundation.property.Schema</remote>
      <ejb-class>com.beasys.commerce.foundation.property.SchemaImpl</ejb-class>
      <persistence-type>Bean</persistence-type>
      <prim-key-class>
         com.beasys.commerce.foundation.property.SchemaPk
      </prim-key-class>
      <reentrant>False</reentrant>
      <env-entry>
        <env-entry-name>SmartConnectionPoolClass</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>
            com.beasys.commerce.foundation.plugin.weblogic.WeblogicConnectionPool
        </env-entry-value>
      </env-entry>
      <env-entry>
        <env-entry-name>SmartBMP_URL</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>jdbc:weblogic:pool:docPool</env-entry-value>
      </env-entry>
      <env-entry>
        <env-entry-name>SmartBMPClass</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>
            com.beasys.commerce.axiom.document.SPISchemaSmartBMP
        </env-entry-value>
      </env-entry>
      <env-entry>
        <env-entry-name>SmartBMPUpdate</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>false</env-entry-value>
      </env-entry>
    </entity>

    <!-- The default DocumentManager bean -->
    <session>
      <ejb-name>com.beasys.commerce.axiom.document.DocumentManager</ejb-name>
      <home>com.beasys.commerce.axiom.document.DocumentManagerHome</home>
      <remote>com.beasys.commerce.axiom.document.DocumentManager</remote>
      <ejb-class>
          com.beasys.commerce.axiom.document.DocumentManagerImpl
      </ejb-class>
      <session-type>Stateless</session-type>
      <transaction-type>Container</transaction-type>
      <env-entry>
        <env-entry-name>ContentHome</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>com.beasys.commerce.axiom.document.Document
        </env-entry-value>
      </env-entry>
      <env-entry>
        <env-entry-name>SchemaHome</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>
         com.beasys.commerce.axiom.document.DocumentSchema</env-entry-value>
      </env-entry>
    </session>
  </enterprise-beans>

  <assembly-descriptor>
    <container-transaction>
      <method>
        <ejb-name>com.beasys.commerce.axiom.document.Document</ejb-name>
        <method-intf>Remote</method-intf>
        <method-name>*</method-name>
      </method>

      <method>
        <ejb-name>com.beasys.commerce.axiom.document.DocumentSchema</ejb-name>
        <method-intf>Remote</method-intf>
        <method-name>*</method-name>
      </method>

      <method>
        <ejb-name>com.beasys.commerce.axiom.document.DocumentManager</ejb-name>
        <method-intf>Remote</method-intf>
        <method-name>*</method-name>
      </method>

      <trans-attribute>Supports</trans-attribute>
    </container-transaction>
  </assembly-descriptor>
</ejb-jar>

Setting up Connection pools

For the document reference implementation, set up a specialized WebLogic connection pool with the same name as the Document and Schema EJB's SmartBMP_URL environment variable (see Configuring the Document EJB deployment descriptor).

For example, if the connection pool name is docPool:

The SmartBMP_URL environment variable should be jdbc:weblogic:pool:docPool.
The URL should be jdbc:beasys:docmgmt:com.beasys.commerce.axiom.document.ref.RefDocumentProvider.
The driver should be com.beasys.commerce.axiom.document.jdbc.Driver. It should not be configured to use a test_table, although it can be allowed to shrink. The driver supports the following properties:
- jdbc.url (required): Specifies the JDBC URL of the database. The connection in this pool opens a connection to this JDBC URL. This property probably should refer to another, non-specialized JDBC connection pool, although it can be any JDBC URL.
- jdbc.driver: Specifies a JDBC driver class name to load.
- jdbc.isPooled: If true, then the system assumes the JDBC URL in jdbc.url is a pooling connection URL and connections will open and close as needed. If false, then this connection opens one connection via the jdbc.url and uses that for its lifetime. If the jdbc.url starts with jdbc:weblogic:pool or jdbc:weblogic:jts, then this property automatically becomes true.
- docBase (required): Specifies the document base of the document files. The ids in the database use file paths relative to this directory, and the directory must exist when the connection is created. To operate in a cluster or a multi-server environment, you must either replicate the files on the machines or put the docBase directory on a shared volume.
- schemaXML: Specifies the file or directory where the XML schema (following the doc-schemas.dtd) resides. Either the schemaXML property or the iw.schemaBase property is required, although the schemas under schemaXML take precedence if both are specified. The schemaXML property has the same constraints as the docBase property when used in a cluster.
  Note: If schemaXML is a directory, the connection will recurse under it and load all files ending in .xml (*.xml).
  
  Note: If schemaXML is a file, the connection loads it.
- iw.schemaBase: Specifies the directory in which the Interwoven datacapture.cfg files reside. The connection recurses through this directory, loading all datacapture.cfg files it finds. Either the iw.schemaBase or schemaXML property is required, although you can specify both. The iw.schemaBase property has the same constraints as the docBase property when used in a cluster.

All other properties are passed with jdbc.url when the Driver Manager opens a database connection.
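The rule for jdbc.isPooled given in the list above can be written as a short check. This is a sketch under stated assumptions, not the driver's code: the class and method names are hypothetical, and treating an unspecified jdbc.isPooled as false is an assumption, since the default is not stated here.

```java
import java.util.Properties;

/** Illustration only: a hypothetical rendering of the jdbc.isPooled rule. */
final class PooledUrlSketch {

    /** Returns true when connections should be opened and closed as needed. */
    static boolean isPooled(Properties props) {
        String url = props.getProperty("jdbc.url", "");
        // WebLogic pool and JTS URLs are always treated as pooled.
        if (url.startsWith("jdbc:weblogic:pool") || url.startsWith("jdbc:weblogic:jts")) {
            return true;
        }
        // Otherwise the explicit setting decides (assumed false when absent).
        return Boolean.parseBoolean(props.getProperty("jdbc.isPooled", "false"));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("jdbc.url", "jdbc:weblogic:pool:commercePool");
        props.setProperty("jdbc.isPooled", "false");
        // The URL prefix overrides the explicit false.
        System.out.println(isPooled(props));
    }
}
```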

Example connection pool entry

The following example shows a sample configuration in the weblogic.properties file.

weblogic.jdbc.connectionPool.docPool=\
url=jdbc:beasys:docmgmt:com.beasys.commerce.axiom.document.ref.RefDocumentProvider,\
    driver=com.beasys.commerce.axiom.document.jdbc.Driver,\
    loginDelaySecs=1,\
    initialCapacity=1,\
    maxCapacity=5,\
    capacityIncrement=1,\
    allowShrinking=true,\
    shrinkPeriodMins=15,\
    refreshMinutes=10,\
    props=jdbc.url=jdbc:weblogic:pool:commercePool;\
    jdbc.isPooled=true;\
    docBase=C:/WeblogicCommerce/docBase;\
    schemaXML=C:/WeblogicCommerce/docSchemas;\
    iw.schemaBase=C:/iw-home/templatedata

Using the Show Document Servlet

Before you can use the Show Document Servlet, register it with WebLogic Server. The class name of the servlet is com.beasys.commerce.content.ShowDocServlet. To register it, add a line similar to the following to your weblogic.properties file:

weblogic.httpd.register.showDocServlet=\
    com.beasys.commerce.content.ShowDocServlet

Reference the class in the URL as /showDocServlet. To change the URL reference, change /showDocServlet. For example, to specify the URL as /myapp/doc-shower, enter the following in the weblogic.properties file:

weblogic.httpd.register.myapp/doc-shower=\
    com.beasys.commerce.content.ShowDocServlet

Querying document content

Document content can be queried through any of the following:

JSP tags (see Using Content Management JSP Tags)
ContentHelper (see the API documentation)
ContentManager (see the API documentation)
ContentHome (see the API documentation)

Structuring a query

WLPS 2.0 queries use a syntax similar to the SQL string syntax that supports basic Boolean-type comparison expressions, including nested parenthetical queries. In general, a query consists of a metadata property name, a comparison operator, and a literal value. The basic query uses the following template:

attribute_name comparison_operator literal_value

Note: Consult the API documentation on com.beasys.commerce.util.ExpressionHelper for more information about the query syntax.

Several constraints apply to queries constructed using this syntax:

String literals must be enclosed in single quotes.
- 'WebLogic Server'
- 'football'
Date literals can be created via a simple toDate method that takes one or two String arguments (enclosed in single quotes). If two arguments are supplied, the first is the SimpleDateFormat format string and the second is the date string. If only one argument is supplied, it should be the date string in 'MM/dd/yyyy HH:mm:ss z' format.
- toDate('EE dd MMM yyyy HH:mm:ss z', 'Thu 06 Apr 2000 16:56:00 MDT')
- toDate('02/23/2000 13:57:43 MST')
Use the toProperty method to compare properties whose names include spaces or other special characters. In general, use toProperty when the property name doesn't comply with the Java variable-naming convention that uses alphanumeric characters.
- toProperty ('My Property') = 'Content'
Use \ along with the appropriate character(s) to create an escape sequence that includes special characters in string literals.
- toProperty ('My Property\'s Contents') = 'Content'
The now keyword, used only on the literal value side of the expression, refers to the current date and time.
Boolean literals are either true or false.
Numeric literals consist of the numbers themselves without any text decoration (like quotation marks). The system supports scientific notation (e.g. 1.24e4 and 1.24E-4).
An exclamation mark (!) can be placed at an opening parenthesis to negate an expression.
- !(keywords contains 'football') || (size >= 256)
The Boolean and operator is represented by the literal &&.
- author == 'james' && age < 55
The Boolean or operator is represented by the literal ||.
- creationDate > now || expireDate < now

The following examples illustrate full expressions:

Example 1:

((color='red' && size <=1024) || (keywords contains 'red' && creationDate < now))

Example 2:

creationDate > toDate ('MM/dd/yyyy HH:mm:ss', '2/22/2000 14:51:00') && expireDate <= now && mimeType like 'text/*'
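Because toDate takes a SimpleDateFormat format string, its two forms can be mirrored in plain Java to check a format string before it goes into a query. The sketch below is not the WLPS query parser; the class and method names are hypothetical, and parsing with Locale.US is an assumption. It also includes a small helper that wraps a string in single quotes and escapes embedded quotes with a backslash, as in the toProperty example above; escaping the backslash itself by doubling it is an assumption.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

/** Illustration only: hypothetical helpers mirroring toDate and quoting. */
final class QueryLiteralSketch {
    /** The single-argument format named in the constraints above. */
    static final String DEFAULT_FORMAT = "MM/dd/yyyy HH:mm:ss z";

    /** One-argument form: the date string in the default format. */
    static Date toDate(String dateString) {
        return toDate(DEFAULT_FORMAT, dateString);
    }

    /** Two-argument form: format string first, date string second. */
    static Date toDate(String format, String dateString) {
        try {
            return new SimpleDateFormat(format, Locale.US).parse(dateString);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + dateString, e);
        }
    }

    /** Wraps text in single quotes, escaping backslashes and single quotes. */
    static String quote(String text) {
        return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    public static void main(String[] args) {
        System.out.println(toDate("02/23/2000 13:57:43 MST"));
        System.out.println(toDate("EE dd MMM yyyy HH:mm:ss z", "Thu 06 Apr 2000 16:56:00 MDT"));
        System.out.println("toProperty (" + quote("My Property's Contents") + ") = " + quote("Content"));
    }
}
```

A format string that fails here with an IllegalArgumentException is a format string worth rechecking before it is embedded in a query.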

Using comparison operators to construct queries

To support advanced searching, the system allows construction of nested Boolean queries incorporating comparison operators. The table summarizes the comparison operators available for each metadata type. (See Support for Native Types in the Developer's Guide topic Overview of Personalization Development for more information about the native types supported in WLPS 2.0.)

Operator Type	Characteristics
Boolean (==, !=)	Boolean attributes support an equality check against Boolean.TRUE or Boolean.FALSE.
Numeric (==, !=, >, <, >=, <=)	Numeric attributes support the standard equality, greater than, and less than checks against a java.lang.Number.
Text (==, !=, >, <, >=, <=, like)	Text strings support standard equality checking (case sensitive), plus lexicographical comparison (less than or greater than). In addition, strings can be compared using wildcard pattern matching (i.e. the like operator), similar to the SQL LIKE operator or DOS prompt file matching. In this situation, the wildcards will be * (asterisk) for match any and ? (question mark) for match single. Interval matching (e.g. using [ ]) is not supported. To match * or ? exactly, the quote character will be \ (backslash).
Datetime (==, !=, >, <, >=, <=)	Date/time attributes support standard equality, greater than, and less than checks against a java.sql.Timestamp.
Multi-valued Comparison Operators (contains, containsall)	Multi-valued attributes support a contains operator that takes an object of the attribute's subtype and checks that the attribute's value contains it. Additionally, multi-valued attributes support a containsall operator, which takes another collection of objects of the attribute's subtype and checks that the attribute's value contains all of them. Single-valued operators applied to a multi-valued attribute should cause the operator to be applied over the attribute's collection of values. Any value that matches the operator and operand should return true. For example, if the multi-valued text attribute keywords has the values BEA, Computer, and WebLogic and the operand is BEA, then the < operator returns true (BEA is less than Computer), the > operator returns false (BEA is not greater than any of the values), and the == operator returns true (BEA is equal to BEA).
User Defined Comparison Operators	Currently, no operators can be applied to a user-defined attribute.
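The wildcard rules for the like operator in the table above (* for match any, ? for match single, \ to quote either one) can be expressed by translating a pattern to a regular expression. This is a minimal sketch of those rules, not the WLPS matcher; the class and method names are hypothetical, and the choice to let * match an empty run of characters is an assumption.

```java
import java.util.regex.Pattern;

/** Illustration only: a hypothetical matcher for the like wildcard rules. */
final class LikeMatcherSketch {

    /** Translates a like pattern into an equivalent regular expression. */
    static Pattern toRegex(String likePattern) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < likePattern.length(); i++) {
            char c = likePattern.charAt(i);
            if (c == '\\' && i + 1 < likePattern.length()) {
                // A backslash quotes the next character, so it matches literally.
                i++;
                regex.append(Pattern.quote(String.valueOf(likePattern.charAt(i))));
            } else if (c == '*') {
                regex.append(".*"); // match any run of characters
            } else if (c == '?') {
                regex.append('.'); // match exactly one character
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    /** Returns true when the whole value matches the like pattern. */
    static boolean like(String value, String likePattern) {
        return toRegex(likePattern).matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(like("text/html", "text/*"));
        System.out.println(like("image/gif", "text/*"));
        System.out.println(like("a*b", "a\\*b"));
    }
}
```

Every non-wildcard character is quoted individually, so characters such as [ and ] are matched literally, consistent with interval matching not being supported.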

Note: The search parameters and expression objects support negation of expressions via a bit flag (!).
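The behavior of single-valued operators on a multi-valued attribute, as described in the table above, can be reproduced with the sketch below using the same keywords example. The class and method names are hypothetical, and the comparison direction (the operand compared against each value, so that "BEA is less than Computer" makes < true) is read from the wording of that example and should be treated as an assumption.

```java
import java.util.Arrays;
import java.util.List;

/** Illustration only: a hypothetical "any value matches" evaluation. */
final class MultiValuedSketch {

    /** Returns true if the operator holds between the operand and any value. */
    static boolean matchesAny(List<String> values, String operator, String operand) {
        for (String value : values) {
            int cmp = operand.compareTo(value); // lexicographical, case sensitive
            boolean hit;
            switch (operator) {
                case "==":
                    hit = cmp == 0;
                    break;
                case "<":
                    hit = cmp < 0;
                    break;
                case ">":
                    hit = cmp > 0;
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported operator: " + operator);
            }
            if (hit) {
                return true; // one matching value is enough
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> keywords = Arrays.asList("BEA", "Computer", "WebLogic");
        System.out.println(matchesAny(keywords, "<", "BEA"));
        System.out.println(matchesAny(keywords, ">", "BEA"));
        System.out.println(matchesAny(keywords, "==", "BEA"));
    }
}
```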

Using the BulkLoader to load file-based content

WebLogic Personalization Server 2.0 provides no run-time tools to load metadata information from a content database. However, the server provides a command line utility, the BulkLoader, that descends a directory hierarchy, parses the HTML-style <meta> tags, reverses the metadata content contained within the <meta> tags into schema information, and loads the resulting documents into the reference implementation database.

The BulkLoader is a command-line application that is capable of loading document metadata into the reference implementation database from a directory and file structure. The BulkLoader parses the document base and weblogic.properties and loads all the document metadata so that the Content Management component can search for documents.

Command line usage

The BulkLoader class allows a number of command-line switches:

java com.beasys.commerce.axiom.document.loader.BulkLoader
  [-/+verbose] [-/+recurse] [-/+delete] [-/+metaparse] [-/+cleanup]
  [-/+hidden] [-/+inheritProps]
  [-properties <name>] -conPool <name> [-schema <name>] [+schema]
  [-match <pattern>] [-ignore <pattern>] [-htmlPat <pattern>]
  [-d <dir>] [-mdext <ext>] [--] [files... directories...]

-verbose: emit verbose messages
+verbose: run quietly [default]
-recurse: recurse into directories [default]
+recurse: don't recurse into directories
-delete: remove document from database
+delete: insert documents into database [default]
-metaparse: parse HTML files for <meta> tags [default]
+metaparse: don't parse HTML files for <meta> tags
-cleanup: if specified, this only performs a table cleanup using the -d 
    argument as the document base (i.e. all files will need to be under 
    that directory).
+cleanup: turn off table cleanup (i.e. do a document load) [default]
-hidden: specify to ignore hidden files and directories [default]
+hidden: specify to include hidden files and directories
-inheritProps: specify to have metadata properties be inherited when 
    recursing [default]
+inheritProps: specify to have metadata properties not be inherited 
    when recursing.
-htmlPat <pattern>: Specifies a pattern for determining which files are HTML 
    files for determining whether to do the <meta> tag parse. This can be 
    specified multiple times. If none are specified, '*.htm' and '*.html' 
    are used.
-properties <name>: specifies the location of the weblogic.properties file 
    which should contain the connectionPool definition. Defaults to 
    "weblogic.properties" in the current directory.
-conPool <name>: specifies the connectionPool name from the properties file 
    from which the BulkLoader should get the connection information
-schema <name>: specifies the path to the schema file the BulkLoader will 
    generate (defaults to "document-schema.xml")
+schema: if specified, then no schema file will be created.
-match <pattern>: specifies a file pattern the BulkLoader should include. 
    This can be specified multiple times. If none are specified, all files 
    and directories are included.
-ignore <pattern>: specifies a file pattern the BulkLoader should not include.
    This can be specified multiple times.
-d <dir>: specifies the docBase that non-absolute paths will be relative to. 
    If not specified, "." (current directory) is used.
-mdext <ext>: specifies the file name extension for metadata property files. 
    The value should start with a ".". This defaults to ".md.properties".
--: everything after this is considered a file or directory
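
As an illustration of how the switches combine, the following sketch assembles one possible command line for a verbose load of a single document base. Only the class name and switches come from the usage text above; the pool name docPool, the path /docs, and the helper class itself are hypothetical placeholders. The sketch only builds and prints the command; it does not run the BulkLoader.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: assembles an example BulkLoader command line.
class BulkLoaderCommandSketch {

    /** Builds the argument list for a verbose load rooted at docBase. */
    static List<String> buildCommand(String propertiesFile, String poolName, String docBase) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("com.beasys.commerce.axiom.document.loader.BulkLoader");
        cmd.add("-verbose");          // emit verbose messages
        cmd.add("-properties");       // where to find the connection pool definition
        cmd.add(propertiesFile);
        cmd.add("-conPool");          // the only switch shown without brackets in the usage text
        cmd.add(poolName);
        cmd.add("-d");                // docBase that non-absolute paths are relative to
        cmd.add(docBase);
        cmd.add("--");                // everything after this is a file or directory
        cmd.add(".");                 // relative path, resolved against the docBase
        return cmd;
    }

    public static void main(String[] args) {
        // "docPool" and "/docs" are placeholders, not values defined by the product.
        System.out.println(String.join(" ",
                buildCommand("weblogic.properties", "docPool", "/docs")));
    }
}
```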

How the BulkLoader finds files

The following sequence describes how the BulkLoader locates files.

The BulkLoader starts by looking at the list of files and directories specified from the command line.
- If no files or directories are specified, it uses only the docBase specified by the -d option. It then loops over the list of files and directories.
- If it finds a directory and recursion is turned off (with +recurse), then it does not descend into that directory.
- If it finds a directory and recursion is turned on (the default or with -recurse), then the BulkLoader loops over the files and directories contained within that directory.
  Note: If the file or directory is not an absolute path, then it is assumed to be relative to the docBase specified by the -d option.
To determine if the BulkLoader should process a file or directory, it checks to see if the file is marked as a hidden file.
Note: If it is a hidden file (or directory) and the +hidden option was not specified, then the file or directory is ignored.
If the file or directory does not exist or is not readable by the user executing the BulkLoader, a warning is displayed and the file or directory is ignored.
If the file or directory is a file, then it is loaded.
If the loaded object is a directory and recursion is enabled, then the files and directories under the directory are retrieved by filtering against the -match and -ignore options.
Note: The -match and -ignore options only apply to files and directories not listed on the command line; in other words, they apply only to those found by recursing into a directory. The patterns specified with the -match and -ignore options (and the -htmlPat options, for that matter) should be DOS-style patterns: '*' matches any set of characters, '?' matches any one character. Sets of characters (e.g. [aceg]) are not supported.
If the subfile or directory name matches any of the patterns specified by a -ignore option, the subfile or directory is ignored.
If the subfile or directory is a directory, then it is included.
If the subfile or directory is a file and no -match options were specified, then it is included; if at least one -match option is supplied, then the file name must match at least one of the -match patterns.
Note: Files with an extension matching the extension specified by -mdext (.md.properties by default) are always ignored.
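
The filtering rules for entries found while recursing can be sketched as a single decision function. This is a reading of the steps above, not the BulkLoader source: the class and method names are hypothetical, and the sketch assumes that pattern matching is case sensitive and applies to the file name only.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: mirrors the -match / -ignore / -mdext rules described above.
class BulkLoaderFilterSketch {

    /** DOS-style match: '*' is any run of characters, '?' is exactly one character. */
    static boolean matches(String name, String pattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '*') regex.append(".*");
            else if (c == '?') regex.append('.');
            else regex.append(Pattern.quote(String.valueOf(c)));
        }
        return Pattern.matches(regex.toString(), name);
    }

    static boolean anyMatch(String name, List<String> patterns) {
        for (String p : patterns) {
            if (matches(name, p)) return true;
        }
        return false;
    }

    /** Decides whether an entry found by recursing into a directory is processed. */
    static boolean include(String name, boolean isDirectory,
                           List<String> match, List<String> ignore, String mdext) {
        if (!isDirectory && name.endsWith(mdext)) return false; // metadata files are always ignored
        if (anyMatch(name, ignore)) return false;               // any -ignore hit excludes the entry
        if (isDirectory) return true;                           // directories are not checked against -match
        return match.isEmpty() || anyMatch(name, match);        // files need a -match hit only if -match is given
    }

    public static void main(String[] args) {
        List<String> match = Arrays.asList("*.html");
        List<String> ignore = Arrays.asList("*.bak");
        List<String> none = Collections.emptyList();
        String mdext = ".md.properties";

        System.out.println(include("index.html", false, match, ignore, mdext));           // true
        System.out.println(include("old.bak", false, none, ignore, mdext));               // false
        System.out.println(include("images", true, match, none, mdext));                  // true
        System.out.println(include("logo.gif", false, match, none, mdext));               // false
        System.out.println(include("logo.gif.md.properties", false, none, none, mdext));  // false
    }
}
```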

How the BulkLoader finds metadata properties

As the BulkLoader finds files and directories, it also attempts to load metadata property files. Whenever the BulkLoader encounters a directory that it will process, it looks for a file called dir.<mdext>, where <mdext> is the extension specified by the -mdext option. Therefore, the default file name it looks for is dir.md.properties. If this file exists and is readable by the user, the BulkLoader loads it as a Java-style properties file of name=value properties. If the directory is a subdirectory reached by recursion (that is, +recurse was not specified) and the +inheritProps option is not specified, then the properties from dir.md.properties will be added to the properties from the parent directory. All files in the directory gain these metadata properties.

When the BulkLoader finds a file that is to be included and loaded, it looks for a file whose name is the original file name with the -mdext extension appended. So, by default, if the file is called image.gif, the BulkLoader looks for a file called image.gif.md.properties. If that file exists and is readable, the BulkLoader adds those properties to the properties gathered from the directory (and possibly from parent directories).

Finally, if the file is an HTML file and the +metaparse option was not specified, then the BulkLoader will parse the HTML, looking for <meta> tags. The BulkLoader determines if a file is an HTML file by using the filename patterns specified by the -htmlPat options. If no -htmlPat patterns are specified, then *.htm and *.html are used. The BulkLoader will load any <meta> tags that contain name and content values found anywhere in the file (not just in the HTML head section) into the file's properties.
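
A minimal sketch of this kind of <meta> scan follows. It is not the BulkLoader's parser; the class name is hypothetical, and the regular expressions are a deliberate simplification that assumes quoted attribute values with no > character inside them. It collects every <meta> tag that carries both a name and a content attribute, wherever the tag appears in the file.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: a simplified scan for <meta name=... content=...> tags.
class MetaTagSketch {
    // Matches a whole <meta ...> tag and captures its attribute text.
    private static final Pattern META =
            Pattern.compile("<meta\\b([^>]*)>", Pattern.CASE_INSENSITIVE);
    // Matches attr="value" or attr='value'; unquoted values are not handled.
    private static final Pattern ATTR =
            Pattern.compile("([A-Za-z-]+)\\s*=\\s*(\"([^\"]*)\"|'([^']*)')");

    static Properties parse(String html) {
        Properties props = new Properties();
        Matcher tag = META.matcher(html);
        while (tag.find()) {
            String name = null;
            String content = null;
            Matcher attr = ATTR.matcher(tag.group(1));
            while (attr.find()) {
                String value = attr.group(3) != null ? attr.group(3) : attr.group(4);
                if (attr.group(1).equalsIgnoreCase("name")) name = value;
                else if (attr.group(1).equalsIgnoreCase("content")) content = value;
            }
            // Only tags with both a name and a content value become properties.
            if (name != null && content != null) props.setProperty(name, content);
        }
        return props;
    }

    public static void main(String[] args) {
        String html = "<html><head><meta name=\"author\" content=\"BEA\"></head>"
                + "<body><META CONTENT='news' NAME='category'>"
                + "<meta http-equiv=\"refresh\" content=\"5\"></body></html>";
        Properties props = parse(html);
        System.out.println(props.getProperty("author"));    // BEA
        System.out.println(props.getProperty("category"));  // news
        System.out.println(props.size());                   // 2 (the http-equiv tag has no name)
    }
}
```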

In summary, the BulkLoader gathers metadata for a document from the following sources (in this order):

The parent directories' dir.md.properties file
The file's directory's dir.md.properties file
The file's .md.properties file
If the file is an HTML file, then it uses <meta> tags.

The metadata is gathered in a last-seen-is-used algorithm. Therefore, for example, if a metadata attribute is specified in both the <meta> tags and the directory's dir.md.properties file, the value from the <meta> tags will be used.
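
The last-seen-is-used rule can be sketched by applying the sources in the order listed above, letting each later source overwrite the earlier ones. The class, method names, and sample property values below are hypothetical and only illustrate the ordering.

```java
import java.util.Properties;

// Hypothetical helper: illustrates last-seen-is-used merging of metadata sources.
class MetadataMergeSketch {

    /** Applies the sources in order; a later source overwrites an earlier one. */
    static Properties merge(Properties... sourcesInOrder) {
        Properties merged = new Properties();
        for (Properties source : sourcesInOrder) {
            merged.putAll(source);
        }
        return merged;
    }

    /** Convenience builder: props("k1", "v1", "k2", "v2"). */
    static Properties props(String... keyValues) {
        Properties p = new Properties();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            p.setProperty(keyValues[i], keyValues[i + 1]);
        }
        return p;
    }

    public static void main(String[] args) {
        Properties parentDir = props("author", "Docs Team", "category", "general");
        Properties fileDir   = props("category", "products");
        Properties fileProps = props("title", "Logo");
        Properties metaTags  = props("author", "BEA");

        Properties merged = merge(parentDir, fileDir, fileProps, metaTags);
        System.out.println(merged.getProperty("author"));    // BEA (from the <meta> tags)
        System.out.println(merged.getProperty("category"));  // products (from the file's directory)
        System.out.println(merged.getProperty("title"));     // Logo
    }
}
```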

From there, the id of the document in the database will be the file path, relative to the docBase specified by the -d option. If the file path is not relative to the docBase, then it will be relative to the path from the command line. The file size will be retrieved from the file. The mimeType will be determined by the file's extension. The modifiedDate in the database will become the current time (since that's when the document is being modified in the database).
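
The id rule can be sketched with java.nio.file paths. This is one reading of the paragraph above, not the BulkLoader's code: the class name is hypothetical, a file outside the docBase keeps the path as it was given, and the use of forward slashes in the id is an assumption.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: derives a document id relative to the docBase.
class DocumentIdSketch {

    static String documentId(Path docBase, Path file) {
        Path base = docBase.toAbsolutePath().normalize();
        // Non-absolute paths are taken as relative to the docBase.
        Path target = (file.isAbsolute() ? file : base.resolve(file)).normalize();
        // Under the docBase: id is the relative path. Otherwise: keep the path as given.
        Path id = target.startsWith(base) ? base.relativize(target) : file;
        // Assumption: ids use forward slashes regardless of platform.
        return id.toString().replace(File.separatorChar, '/');
    }

    public static void main(String[] args) {
        Path docBase = Paths.get("/docs");
        System.out.println(documentId(docBase, Paths.get("images/logo.gif")));
        System.out.println(documentId(docBase, Paths.get("/docs/news/a.html")));
        System.out.println(documentId(docBase, Paths.get("/tmp/x.txt")));
    }
}
```

On a Unix-style file system, the three calls print images/logo.gif, news/a.html, and /tmp/x.txt.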

After loading all the documents on the list, if the +schema option is not specified, the BulkLoader will output an XML file containing the schema information and following the doc-schemas DTD. The BulkLoader will output a single schema that contains entries for all the metadata attributes it finds over the entire load.

Cleaning up the database

If the -cleanup option is specified, the BulkLoader will not actually load any documents. Instead, it will attempt to clean up and update the database tables. It will first query the database, looking for any metadata entries that do not have corresponding document entries. For each of those, it will create a document entry. It will then go over each document entry and update the size, modified date, and possibly the mime type (if the mime type is not in the database) based upon the files in the docBase specified with the -d option.

Using Content Management JSP Tags

To use the Content Management JSP tags, ensure that the cm.tld file resides in the WEB-INF directory of your WAR files or in your document root.

Request Parameter	Required	Description
contentHome	maybe	If the contentHome initialization parameter is not specified, then this is required and will be used as the JNDI name of the DocumentHome. If the contentHome initialization parameter is specified, this is ignored.
contentId	no	The string identifier of the Document to retrieve. If not specified, the servlet looks in the PATH_INFO.
blockSize	no	The size of the data blocks to read. The default is 8K. Use 0 or less to read the entire block of bytes in one operation.