BEA WLCS Release 2.0


Creating and Managing Content

 

The following topics are included:

What is the Content Management Component?

Third-party tools and WLPS

Constructing queries using Java

Differences between content management and document management

Using the document servlet

JSP Tags

Configuring the Content Management component

Configuring the Document EJB deployment descriptor

Configuring the Document Schema EJB deployment descriptor

Configuring the DocumentManager EJB deployment descriptor

Setting up Connection pools

Using the Show Document Servlet

Querying document content

Structuring a query

Using comparison operators to construct queries

Using the BulkLoader to load file-based content

Using Content Management JSP Tags

The Content Management component of WLPS 2.0 provides content and document management capabilities for use in personalization services. The Content Manager works with files or with content managed by third-party vendor tools from Documentum and Interwoven.

 


What is the Content Management Component?

The Content Management runtime component provides access to content via both tags and EJBs. The Content Management tags allow a JSP developer to receive an enumeration of Content objects by querying the content database directly using a search expression syntax.

WebLogic Personalization Server 2.0 provides several components that allow content personalization for users. Together, these components provide a complete personalization solution. Of these personalization components, the Portal, Rules, User, and Property Set Management elements include edit-time graphical user interfaces (GUIs) that allow developers to customize the elements. Neither the Content Management component nor the Personalization Advisor component has a GUI.

The Content Management component works alongside the other components to deliver personalized content, but does not have a GUI-based tool for edit-time customization. The content engine behind the ContentManager can be either the reference implementation, which is provided out of the box, or Documentum. The Content Management component supports querying that returns content from a content repository using several methods:

Third-party tools and WLPS

BEA partners with third-party vendors to add flexibility to WebLogic Personalization Server. The Content Management component works with Interwoven's TeamSite/OpenDeploy product and Documentum's 4i product. Both of these products provide robust content creation and management solutions, while the Content Management component of WLPS 2.0 personalizes and serves the content to the end user.

Constructing queries using Java

To construct queries using Java syntax instead of using the query language supplied with the Content Management component, refer to the API documentation.

Note: Use the constants in TypesHelper when calling Logical.setLogical and Criteria.setComparator.

The ContentManager session bean is the primary interface to the functionality of the Content Management component. Using a ContentManager instance, content is returned based on a Search object with an embedded Expression. An Expression is a boolean tree of arbitrary depth, with other sub-Expressions as nodes. The Expression interface is abstract; the actual instances implement either the Logical or the Criteria interface. As an example, the expression color == 'red' && price > 50 would consist of a Logical with the value "and" that has two Criteria as children: one for color == 'red' and one for price > 50.
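To make the tree structure concrete, the following is a minimal, self-contained sketch of a boolean expression tree. It deliberately does not use the WLPS classes: the Expr, Crit, and Logic types and their methods are hypothetical stand-ins invented for this illustration. The real Expression, Logical, and Criteria interfaces, and the TypesHelper constants, are described in the API documentation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these types are NOT the
// WLPS Expression, Logical, and Criteria interfaces.
final class ExpressionTreeSketch {

    // Any node in the boolean tree.
    interface Expr {
        boolean eval(Map<String, Object> metadata);
    }

    // Leaf node: compares one metadata property with a literal value.
    static final class Crit implements Expr {
        private final String name;
        private final String op; // only "==" and ">" are handled in this sketch
        private final Object literal;

        Crit(String name, String op, Object literal) {
            this.name = name;
            this.op = op;
            this.literal = literal;
        }

        @Override
        public boolean eval(Map<String, Object> metadata) {
            Object value = metadata.get(name);
            if (value == null) {
                return false; // a missing property never matches
            }
            if (op.equals("==")) {
                return value.equals(literal);
            }
            if (op.equals(">")) {
                return ((Number) value).doubleValue() > ((Number) literal).doubleValue();
            }
            throw new IllegalArgumentException("Unsupported operator: " + op);
        }
    }

    // Inner node: joins child expressions with a logical "and" or "or".
    static final class Logic implements Expr {
        private final boolean and;
        private final List<Expr> children;

        Logic(boolean and, List<Expr> children) {
            this.and = and;
            this.children = children;
        }

        @Override
        public boolean eval(Map<String, Object> metadata) {
            for (Expr child : children) {
                boolean result = child.eval(metadata);
                if (and && !result) {
                    return false; // "and" fails on the first false child
                }
                if (!and && result) {
                    return true; // "or" succeeds on the first true child
                }
            }
            return and;
        }
    }

    // Builds the tree for: color == 'red' && price > 50
    static Expr redAndExpensive() {
        return new Logic(true, Arrays.<Expr>asList(
                new Crit("color", "==", "red"),
                new Crit("price", ">", 50)));
    }

    public static void main(String[] args) {
        Map<String, Object> metadata = new HashMap<>();
        metadata.put("color", "red");
        metadata.put("price", 75);
        System.out.println(redAndExpensive().eval(metadata)); // prints "true"
    }
}
```

Because a Logic node accepts any Expr as a child, trees of arbitrary depth can be built by nesting Logic nodes inside one another.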

Differences between content management and document management

Content objects include metadata about the content. Metadata provides a means to query and match content with users by allowing the system to retrieve content based on the metadata that describes the content. In general, some kind of content management system provides services such as retrieval of content and content authoring services including creation, editing, versioning, and workflow.

Documents are a specialized type of Content that provide two methods for retrieval: a metadata-searching mechanism and retrieval of the pure bytes of the document's file. Documents should include additional explicit metadata properties related to the file and its versioning, including its size, name, path, author, and version. A document management system usually provides document-based services for documents that reside in the system's repository.

WebLogic Personalization Server 2.0 provides the entire Content object model; however, it only provides the Document object as a concrete implementation (subclass) of the Content class.

Using the document servlet

The Content Management component includes a servlet capable of outputting the contents of a Document object. This servlet is useful for streaming an image that resides in a content management system, or for streaming a document's contents from a content management system when an HTML link is selected. The servlet supports the following Request/URL parameters:

Request Parameter  |  Required  |  Description
contentHome  |  maybe  |  If the contentHome initialization parameter is not specified, then this is required and will be used as the JNDI name of the DocumentHome. If the contentHome initialization parameter is specified, this is ignored.
contentId  |  no  |  The string identifier of the Document to retrieve. If not specified, the servlet looks in the PATH_INFO.
blockSize  |  no  |  The size of the data blocks to read. The default is 8K. Use 0 or less to read the entire block of bytes in one operation.

The servlet only supports Documents, not other subclasses of Content. It sets the Content-Type to the Document's mimetype, the Content-Length to the Document's size, and correctly sets the Content-Disposition, which should present the correct file name when the file is saved from a browser.
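The effect of the blockSize parameter can be pictured with the following sketch. It is not the servlet's source code, and the class and method names (BlockCopySketch, copy) are hypothetical. It only illustrates one way a block size of 8K, or of 0 or less, could translate into a buffered copy from a document's bytes to a response stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration of block-wise streaming; not the ShowDocServlet code.
final class BlockCopySketch {
    static final int DEFAULT_BLOCK_SIZE = 8 * 1024; // the documented 8K default

    // Copies everything from in to out and returns the number of bytes copied.
    // Assumption: a blockSize of 0 or less means "use one buffer large enough
    // for the whole document", sized from the known content length.
    static long copy(InputStream in, OutputStream out, int blockSize, int contentLength)
            throws IOException {
        int bufferSize = blockSize > 0 ? blockSize : Math.max(contentLength, 1);
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] document = new byte[20000]; // stand-in for a document's bytes
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(document), out,
                DEFAULT_BLOCK_SIZE, document.length);
        System.out.println(copied); // prints "20000"
    }
}
```

A smaller block size keeps memory use low for large documents, while a single large read can be faster for small ones.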

Example 1: Usage in a JSP:

<cm:select contentHome="bea.eDocs.CMgr" max="5" 
sortBy="creationDate ASC, title ASC" query="type = 'News' &&
timeOfDay = 'Evening' && mimetype like 'text/*' " id="newsList" />

<ul>
<es:foreachinarray array="newsList" id="newsItem"
type="com.beasys.commerce.axiom.content.Content">
<li><a href="/showDocServlet/<cm:printproperty
id="newsItem" name="identifier" encode="url"/>?contentHome=bea.eDocs.CMgr"><cm:printproperty
id="newsItem" name="Title" encode="html" /></a></li>
</es:foreachinarray>
</ul>

Example 2: Usage in a JSP

This example searches for image files whose keywords contain birds and displays the images in a bulleted list.

<cm:select contentHome="bea.eDocs.cMgr" max="5" sortBy="name"
id="list" query="Keywords like '*birds*' && mimeType like
'image/*'" />
<ul>
<es:foreachinarray array="list" id="img"
type="com.beasys.commerce.axiom.content.Content">
<li><img src="/showDocServlet?contentId=<cm:printproperty
id="img" name="identifier"
encode="url"/>&contentHome=bea.eDocs.cMgr"></li>
</es:foreachinarray>
</ul>

JSP Tags

The Content Management component includes four JSP tags. These tags allow a JSP developer to include non-personalized content in an HTML-based page. Note that none of the tags support or use a body. The tags include:

 


Configuring the Content Management component

The Document EJB, Document Schema EJB, and DocumentManager EJB deployment descriptors handle the configuration for the Content Management component. To use the reference implementation document repository, you need to configure the EJB deployment descriptors and also set up two WLS JDBC connection pools.

Once the deployment descriptor has been written, build the EJBs as you normally would, then add the resulting JAR file to the ejb.deploy entry in the weblogic.properties file.

Configuring the Document EJB deployment descriptor

The logic for loading Document EJBs is handled via a SmartBMP. The Document EJB implementation loads the SmartBMP object from a class name specified in the EJB environment in the EJB's deployment descriptor. The EJB environment variable is SmartBMPClass. The value must be the fully-qualified class name of the SmartBMP to use. This class must be capable of populating a DocumentImpl object and must also have the methods defined in the Content and Document Javadocs.

To use the reference implementation document management system, set SmartBMPClass to com.beasys.commerce.axiom.document.SPIDocumentSmartBMP and specify the following EJB environment variables in the document EJB deployment descriptor:

SmartBMP classes for other document management systems may require additional or different EJB environment variables.

Configuring the Document Schema EJB deployment descriptor

The logic for loading Document Schema EJBs is handled via a SmartBMP. The Schema EJB implementation loads the SmartBMP object from a class name specified in the EJB environment in the EJB's deployment descriptor. The EJB environment variable is SmartBMPClass. The value must be the fully-qualified class name of the SmartBMP to use. This SmartBMP must be capable of populating a SchemaImpl object with PropertyMetaData objects.

To use the reference implementation document management system, set SmartBMPClass to com.beasys.commerce.axiom.document.SPISchemaSmartBMP and specify the following EJB environment variables in the Document Schema EJB deployment descriptor:

SmartBMP classes for other document management systems may require additional or different EJB environment variables.

Configuring the DocumentManager EJB deployment descriptor

The DocumentManagerSession EJB simply hides the details of getting to the Document and DocumentSchema EJBs. It understands the following environment variables in its deployment descriptor:

Example deployment descriptor file

The following is a sample ejb-jar.xml deployment descriptor file:

<?xml version="1.0"?>
<!DOCTYPE ejb-jar PUBLIC '-//Sun Microsystems, Inc.//DTD Enterprise JavaBeans 1.1//EN' 'http://java.sun.com/j2ee/dtds/ejb-jar_1_1.dtd'>
<ejb-jar>
  <enterprise-beans>

    <!-- our Document entity bean -->
    <entity>
      <ejb-name>com.beasys.commerce.axiom.document.Document</ejb-name>
      <home>com.beasys.commerce.axiom.document.DocumentHome</home>
      <remote>com.beasys.commerce.axiom.document.Document</remote>
      <ejb-class>com.beasys.commerce.axiom.document.DocumentImpl</ejb-class>
      <persistence-type>Bean</persistence-type>
      <prim-key-class>
        com.beasys.commerce.axiom.document.DocumentPk
      </prim-key-class>
      <reentrant>False</reentrant>
      <env-entry>
        <env-entry-name>SmartConnectionPoolClass</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>
          com.beasys.commerce.foundation.plugin.weblogic.WeblogicConnectionPool
        </env-entry-value>
      </env-entry>
      <env-entry>
        <env-entry-name>SmartBMP_URL</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>jdbc:weblogic:pool:docPool</env-entry-value>
      </env-entry>
      <env-entry>
        <env-entry-name>SmartBMPClass</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>
          com.beasys.commerce.axiom.document.SPIDocumentSmartBMP
        </env-entry-value>
      </env-entry>
      <env-entry>
        <env-entry-name>SmartBMPUpdate</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>false</env-entry-value>
      </env-entry>
    </entity>

    <!-- our Schema entity bean -->
    <entity>
      <ejb-name>com.beasys.commerce.axiom.document.DocumentSchema</ejb-name>
      <home>com.beasys.commerce.foundation.property.SchemaHome</home>
      <remote>com.beasys.commerce.foundation.property.Schema</remote>
      <ejb-class>com.beasys.commerce.foundation.property.SchemaImpl</ejb-class>
      <persistence-type>Bean</persistence-type>
      <prim-key-class>
        com.beasys.commerce.foundation.property.SchemaPk
      </prim-key-class>
      <reentrant>False</reentrant>
      <env-entry>
        <env-entry-name>SmartConnectionPoolClass</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>
          com.beasys.commerce.foundation.plugin.weblogic.WeblogicConnectionPool
        </env-entry-value>
      </env-entry>
      <env-entry>
        <env-entry-name>SmartBMP_URL</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>jdbc:weblogic:pool:docPool</env-entry-value>
      </env-entry>
      <env-entry>
        <env-entry-name>SmartBMPClass</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>
          com.beasys.commerce.axiom.document.SPISchemaSmartBMP
        </env-entry-value>
      </env-entry>
      <env-entry>
        <env-entry-name>SmartBMPUpdate</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>false</env-entry-value>
      </env-entry>
    </entity>

    <!-- The default DocumentManager bean -->
    <session>
      <ejb-name>com.beasys.commerce.axiom.document.DocumentManager</ejb-name>
      <home>com.beasys.commerce.axiom.document.DocumentManagerHome</home>
      <remote>com.beasys.commerce.axiom.document.DocumentManager</remote>
      <ejb-class>
        com.beasys.commerce.axiom.document.DocumentManagerImpl
      </ejb-class>
      <session-type>Stateless</session-type>
      <transaction-type>Container</transaction-type>
      <env-entry>
        <env-entry-name>ContentHome</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>com.beasys.commerce.axiom.document.Document
        </env-entry-value>
      </env-entry>
      <env-entry>
        <env-entry-name>SchemaHome</env-entry-name>
        <env-entry-type>java.lang.String</env-entry-type>
        <env-entry-value>
          com.beasys.commerce.axiom.document.DocumentSchema</env-entry-value>
      </env-entry>
    </session>
  </enterprise-beans>

  <assembly-descriptor>
    <container-transaction>
      <method>
        <ejb-name>com.beasys.commerce.axiom.document.Document</ejb-name>
        <method-intf>Remote</method-intf>
        <method-name>*</method-name>
      </method>

      <method>
        <ejb-name>com.beasys.commerce.axiom.document.DocumentSchema</ejb-name>
        <method-intf>Remote</method-intf>
        <method-name>*</method-name>
      </method>

      <method>
        <ejb-name>com.beasys.commerce.axiom.document.DocumentManager</ejb-name>
        <method-intf>Remote</method-intf>
        <method-name>*</method-name>
      </method>

      <trans-attribute>Supports</trans-attribute>
    </container-transaction>
  </assembly-descriptor>
</ejb-jar>

Setting up Connection pools

For the document reference implementation, set up a specialized WebLogic connection pool with the same name as the Document and Schema EJB's SmartBMP_URL environment variable (see Configuring the Document EJB deployment descriptor).

For example, if the connection pool name is docPool:

All other properties are passed with jdbc.url when the Driver Manager opens a database connection.

Example connection pool entry

The following example shows a sample configuration in the weblogic.properties file.

weblogic.jdbc.connectionPool.docPool=\
url=jdbc:beasys:docmgmt:com.beasys.commerce.axiom.document.ref.RefDocumentProvider,\
driver=com.beasys.commerce.axiom.document.jdbc.Driver,\
loginDelaySecs=1,\
initialCapacity=1,\
maxCapacity=5,\
capacityIncrement=1,\
allowShrinking=true,\
shrinkPeriodMins=15,\
refreshMinutes=10,\
props=jdbc.url=jdbc:weblogic:pool:commercePool;\
jdbc.isPooled=true;\
docBase=C:/WeblogicCommerce/docBase;\
schemaXML=C:/WeblogicCommerce/docSchemas;\
iw.schemaBase=C:/iw-home/templatedata

Using the Show Document Servlet

Before you can use the Show Document Servlet, you must register it with WebLogic Server. The class name of the servlet is com.beasys.commerce.content.ShowDocServlet. To register it with WebLogic, add a line similar to the following to your weblogic.properties file:

weblogic.httpd.register.showDocServlet=\
com.beasys.commerce.content.ShowDocServlet

With this registration, the servlet is referenced in URLs as /showDocServlet. To use a different URL, change the showDocServlet portion of the property name. For example, to specify the URL as /myapp/doc-shower, enter the following in the weblogic.properties file:

weblogic.httpd.register.myapp/doc-shower=\
com.beasys.commerce.content.ShowDocServlet

Querying document content

Structuring a query

WLPS 2.0 queries use a string syntax similar to SQL that supports basic Boolean-type comparison expressions, including nested parenthetical queries. In general, a query consists of a metadata property name, a comparison operator, and a literal value. The basic query uses the following template:

attribute_name comparison_operator literal_value

Note: Consult the API documentation on com.beasys.commerce.util.ExpressionHelper for more information about the query syntax.

Several constraints apply to queries constructed using this syntax:

The following examples illustrate full expressions:

Example 1:

((color="red" && size <=1024) || (keywords contains "red" && creationDate < now))

Example 2:

creationDate > toDate ('MM/dd/yyyy HH:mm:ss', '2/22/2000 14:51:00') && expireDate <= now && mimetype like 'text/*'

Using comparison operators to construct queries

To support advanced searching, the system allows construction of nested Boolean queries incorporating comparison operators. The following table summarizes the comparison operators available for each metadata type. (See Support for Native Types in the Developer's Guide topic Overview of Personalization Development for more information about the native types supported in WLPS 2.0.)

Operator Type

Characteristics

Boolean (==, !=)

Boolean attributes support an equality check against Boolean.TRUE or Boolean.FALSE.

Numeric (==, !=, >, <, >=, <=)

Numeric attributes support the standard equality, greater than, and less than checks against a java.lang.Number.

Text (==, !=, >, <, >=, <=, like)

Text strings support standard equality checking (case sensitive), plus lexicographical comparison (less than or greater than). In addition, strings can be compared using wildcard pattern matching (the like operator), similar to the SQL LIKE operator or DOS prompt file matching. The wildcards are * (asterisk), which matches any sequence of characters, and ? (question mark), which matches any single character. Interval matching (for example, using [ ]) is not supported. To match * or ? literally, escape it with \ (backslash).

Datetime (==, !=, >, <, >=, <=)

Date/time attributes support standard equality, greater than, and less than checks against a java.sql.Timestamp.

Multi-valued Comparison Operators (contains, containsall)

Multi-valued attributes support a contains operator that takes an object of the attribute's subtype and checks that the attribute's value contains it. Additionally, multi-valued attributes support a containsall operator, which takes another collection of objects of the attribute's subtype and checks that the attribute's value contains all of them.

A single-valued operator applied to a multi-valued attribute should be applied over the attribute's collection of values, returning true if any value matches the operator and operand. For example, if the multi-valued text attribute keywords has the values BEA, Computer, and WebLogic and the operand is BEA, then the < operator returns true (BEA is less than Computer), the > operator returns false (BEA is not greater than any of the values), and the == operator returns true (BEA is equal to BEA).

User Defined Comparison Operators

Currently, no operators can be applied to a user-defined attribute.

Note: The search parameters and expression objects support negation of expressions via a bit flag (!).
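The wildcard rules described for the like operator in the Text row above can be sketched in plain Java by translating a pattern into a regular expression. This is an illustration only, not the WLPS implementation: the class and method names (LikePatternSketch, toRegex, like) are hypothetical, and the treatment of a trailing lone backslash as a literal character is an assumption.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of "like" wildcard matching; not the WLPS code.
final class LikePatternSketch {

    // Translates a like-pattern into an equivalent regular expression:
    //   *  matches any sequence of characters
    //   ?  matches exactly one character
    //   \  makes the next character literal (assumed escape handling)
    static Pattern toRegex(String likePattern) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < likePattern.length(); i++) {
            char c = likePattern.charAt(i);
            if (c == '\\' && i + 1 < likePattern.length()) {
                // Escaped character: consume it and match it literally.
                regex.append(Pattern.quote(String.valueOf(likePattern.charAt(++i))));
            } else if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // Case-sensitive match of the whole value against the pattern.
    static boolean like(String value, String likePattern) {
        return toRegex(likePattern).matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(like("text/html", "text/*")); // prints "true"
        System.out.println(like("image/gif", "text/*")); // prints "false"
        System.out.println(like("a*b", "a\\*b"));        // prints "true"
    }
}
```

Quoting every ordinary character with Pattern.quote keeps regular-expression metacharacters such as "." and "[" in a pattern from being interpreted, which matches the statement that interval matching is not supported.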

Using the BulkLoader to load file-based content

WebLogic Personalization Server 2.0 provides no run-time tools to load metadata information from a content database. However, the server provides a command-line utility, the BulkLoader, that descends a directory hierarchy, parses the HTML-style <meta> tags, derives schema information from the metadata contained within the <meta> tags, and loads the resulting documents into the reference implementation database.

The BulkLoader is a command-line application that is capable of loading document metadata into the reference implementation database from a directory and file structure. The BulkLoader parses the document base and weblogic.properties and loads all the document metadata so that the Content Management component can search for documents.

Command line usage

The BulkLoader class allows a number of command-line switches:

java com.beasys.commerce.axiom.document.loader.BulkLoader
[-/+verbose] [-/+recurse] [-/+delete] [-/+metaparse] [-/+cleanup]
[-/+hidden] [-/+inheritProps]
[-properties <name>] -conPool <name> [-schema <name>] [+schema]
[-match <pattern>] [-ignore <pattern>] [-htmlPat <pattern>]
[-d <dir>] [-mdext <ext>] [--] [files... directories...]

-verbose: emit verbose messages
+verbose: run quietly [default]
-recurse: recurse into directories [default]
+recurse: don't recurse into directories
-delete: remove document from database
+delete: insert documents into database [default]
-metaparse: parse HTML files for <meta> tags [default]
+metaparse: don't parse HTML files for <meta> tags
-cleanup: if specified, this only performs a table cleanup using the -d
argument as the document base (i.e. all files will need to be under
that directory).
+cleanup: turn off table cleanup (i.e. do a document load) [default]
-hidden: specify to ignore hidden files and directories [default]
+hidden: specify to include hidden files and directories
-inheritProps: specify to have metadata properties be inherited when
recursing [default]
+inheritProps: specify to have metadata properties not be inherited
when recursing.
-htmlPat <pattern>: Specifies a pattern for determining which files are HTML
files for determining whether to do the <meta> tag parse. This can be
specified multiple times. If none are specified, '*.htm' and '*.html'
are used.
-properties <name>: specifies the location of the weblogic.properties file
which should contain the connectionPool definition. Defaults to
"weblogic.properties" in the current directory.
-conPool <name>: specifies the connectionPool name from the properties file
from which the BulkLoader should get the connection information
-schema <name>: specifies the path to the schema file the BulkLoader will
generate (defaults to "document-schema.xml")
+schema: if specified, then no schema file will be created.
-match <pattern>: specifies a file pattern the BulkLoader should include.
This can be specified multiple times. If none are specified, all files
and directories are included.
-ignore <pattern>: specifies a file pattern the BulkLoader should not include.
This can be specified multiple times.
-d <dir>: specifies the docBase that non-absolute paths will be relative to.
If not specified, "." (current directory) is used.
-mdext <ext>: specifies the file name extension for metadata property files.
The value should start with a ".". This defaults to ".md.properties".
--: everything after this is considered a file or directory

How the BulkLoader finds files

The following sequence describes how the BulkLoader locates files.

  1. The BulkLoader starts by looking at the list of files and directories specified from the command line.

  2. To determine if the BulkLoader should process a file or directory, it checks to see if the file is marked as a hidden file.

    Note: If it is a hidden file (or directory) and the +hidden option was not specified, then the file or directory is ignored.

  3. If the file or directory does not exist or is not readable by the user executing the BulkLoader, a warning is displayed and the file or directory is ignored.

  4. If the file or directory is a file, then it is loaded.

  5. If it is a directory and recursion is enabled, then the files and directories under the directory are retrieved and filtered against the -match and -ignore options.

    Note: The -match and -ignore options only apply to files and directories not listed on the command line; in other words, they apply only to those found by recursing into a directory. The patterns specified with the -match and -ignore options (and the -htmlPat options, for that matter) should be DOS-style patterns: '*' matches any set of characters, '?' matches any one character. Sets of characters (e.g. [aceg]) are not supported.

  6. If the subfile or directory name matches any of the patterns specified by a -ignore option, the subfile or directory is ignored.

  7. If the subfile or directory is a directory, then it is included.

  8. If the subfile or directory is a file and no -match options were specified, then it will be included; if at least one -match option is supplied, then the file name must match at least one of the -match patterns.

    Note: Files with an extension matching the extension specified by -mdext (.md.properties by default) are always ignored.
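The filtering rules in steps 6 through 8 can be summarized in a short sketch. This is not the BulkLoader source: the class BulkLoaderFilterSketch and its methods are hypothetical, the hidden-file and readability checks from steps 2 and 3 are omitted, and case-sensitive name matching is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of the -match / -ignore / -mdext rules;
// not the BulkLoader implementation.
final class BulkLoaderFilterSketch {
    private final List<String> matchPatterns;  // values of -match options
    private final List<String> ignorePatterns; // values of -ignore options
    private final String mdext;                // value of -mdext

    BulkLoaderFilterSketch(List<String> matchPatterns, List<String> ignorePatterns, String mdext) {
        this.matchPatterns = matchPatterns;
        this.ignorePatterns = ignorePatterns;
        this.mdext = mdext;
    }

    // DOS-style pattern: '*' matches any run of characters, '?' matches one.
    static boolean glob(String pattern, String name) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.matches(regex.toString(), name);
    }

    // Decides whether an entry found while recursing should be processed.
    boolean accept(String name, boolean isDirectory) {
        for (String pattern : ignorePatterns) {
            if (glob(pattern, name)) {
                return false; // step 6: ignored by a -ignore pattern
            }
        }
        if (isDirectory) {
            return true; // step 7: directories are included
        }
        if (name.endsWith(mdext)) {
            return false; // metadata property files are always ignored
        }
        if (matchPatterns.isEmpty()) {
            return true; // step 8: no -match options, so every file is included
        }
        for (String pattern : matchPatterns) {
            if (glob(pattern, name)) {
                return true; // step 8: at least one -match pattern matches
            }
        }
        return false;
    }

    public static void main(String[] args) {
        BulkLoaderFilterSketch filter = new BulkLoaderFilterSketch(
                Arrays.asList("*.html", "*.gif"),
                Collections.singletonList("CVS"),
                ".md.properties");
        System.out.println(filter.accept("index.html", false)); // prints "true"
        System.out.println(filter.accept("notes.txt", false));  // prints "false"
        System.out.println(filter.accept("CVS", true));         // prints "false"
    }
}
```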

How the BulkLoader finds metadata properties

As the BulkLoader is finding files and directories, it will also attempt to load metadata property files. Whenever the BulkLoader encounters a directory that it will process, it looks for a file called dir.<mdext> where <mdext> is the extension specified by the -mdext option. Therefore, the default file name it looks for is dir.md.properties. If this file exists and is readable by the user, the BulkLoader loads it as a Java-style properties file of name=value properties. If the directory is actually a subdirectory entered because +recurse was not specified and the +inheritProps option is not specified, then the properties from dir.md.properties are added to the properties from the parent directory. All files in the directory gain these metadata properties.

When the BulkLoader finds a file which is to be included and loaded, it looks for a file whose name is the original file name with the -mdext extension appended. So, by default, if the file is called image.gif, the BulkLoader looks for a file called image.gif.md.properties. If that file exists and is readable, the BulkLoader adds those properties to the properties inherited from the directory (and possibly from parent directories).

Finally, if the file is an HTML file and the +metaparse option was not specified, then the BulkLoader will parse the HTML, looking for <meta> tags. The BulkLoader determines if a file is an HTML file by using the filename patterns specified by the -htmlPat options. If no -htmlPat patterns are specified, then *.htm and *.html are used. The BulkLoader will load any <meta> tags that contain name and content values found anywhere in the file (not just in the HTML head section) into the file's properties.

In summary, the BulkLoader gathers metadata for a document from the following sources (in this order):

  1. The parent directories' dir.md.properties file

  2. The file's directory's dir.md.properties file

  3. The file's .md.properties file

  4. If the file is an HTML file, then it uses <meta> tags.

The metadata is gathered in a last-seen-is-used algorithm. Therefore, for example, if a metadata attribute is specified in both the <meta> tags and the directory's dir.md.properties file, the value from the <meta> tags will be used.
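The last-seen-is-used rule amounts to merging the property sources in the order listed above, with later sources overwriting earlier ones. The following sketch shows that merge with plain maps. The class and method names are hypothetical, and the property values are made up for the example.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the last-seen-is-used merge; not the BulkLoader code.
final class MetadataMergeSketch {

    // Merges sources in the given order; a later source overwrites any
    // value that an earlier source set for the same attribute name.
    @SafeVarargs
    static Map<String, String> merge(Map<String, String>... sourcesInOrder) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> source : sourcesInOrder) {
            merged.putAll(source);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Made-up example values for each source, in lookup order.
        Map<String, String> parentDir = new HashMap<>();
        parentDir.put("author", "Docs Team");
        parentDir.put("type", "Reference");

        Map<String, String> fileDir = new HashMap<>();
        fileDir.put("type", "News");

        Map<String, String> fileProps = new HashMap<>();
        fileProps.put("keywords", "birds");

        Map<String, String> metaTags = new HashMap<>();
        metaTags.put("type", "Article");

        Map<String, String> merged = merge(parentDir, fileDir, fileProps, metaTags);
        System.out.println(merged.get("type"));   // prints "Article"
        System.out.println(merged.get("author")); // prints "Docs Team"
    }
}
```

In this example the type attribute is defined in three places, and the value from the <meta> tags wins because that source is consulted last.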

From there, the id of the document in the database will be the file path, relative to the docBase specified by the -d option. If the file path is not relative to the docBase, then it will be relative to the path from the command line. The file size will be retrieved from the file. The mimeType will be determined by the file's extension. The modifiedDate in the database will become the current time (since that's when the document is being modified in the database).

After loading all the documents on the list, if the +schema option is not specified, the BulkLoader will output an XML file containing the schema information and following the doc-schemas DTD. The BulkLoader will output a single schema which contains entries for all the metadata attributes it finds over the entire load.

Cleaning up the database

If the -cleanup option is specified, the BulkLoader will not actually load any documents. Instead, it will attempt to clean up and update the database tables. It will first query the database, looking for any metadata entries that do not have corresponding document entries. For each of those, it will create a document entry. It will then go over each document entry and update the size, modified date, and possibly the mime type (if the mime type is not in the database) based upon the files in the docBase specified with the -d option.

Using Content Management JSP Tags

To use the Content Management JSP tags, ensure that the cm.tld file resides in the WEB-INF directory of your WAR files or in your document root.