Oracle® WebCenter Interaction Development Kit 10.3.3.0.0

Plumtree.Remote.Crawler Namespace

Provides classes and interfaces for crawling, indexing and representing documents from other systems in the AquaLogic Interaction Knowledge Directory.

Crawlers are extensible components used to index documents from a specific type of document repository, such as Lotus Notes, Microsoft Exchange, Documentum, or Novell. A crawler accesses content in the external repository and indexes it in the portal, but it imports only links to the documents; the documents themselves are left in their original locations. Portal users can search for and open crawled files through the portal Knowledge Directory. Crawlers can be used to provide access to files on protected backend systems without violating access restrictions.

In ALI version 5.x and above, crawlers are implemented as remote services that use XML over SOAP and HTTP. Using the IDK, you can create remote crawlers that access a wide range of backend systems. The purposes of a crawler are threefold:

  1. Iterate over and catalog a hierarchical data repository.
  2. Retrieve and index metadata about each document in the data repository and include it in the portal Knowledge Directory and search index. (The crawler is still required after the documents are indexed, because it also serves the on-demand retrieval described in item 3.)
  3. Retrieve individual documents on demand through the portal Knowledge Directory, enforcing any user-level access restrictions.
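
The three purposes above can be illustrated with a small, language-neutral sketch. It is written in Python for brevity and does not use the IDK: `Container`, `Document`, `crawl`, and `open_document` are hypothetical stand-ins, not the Plumtree.Remote.Crawler types, and the in-memory tree is an assumption standing in for a real backend repository.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a document in a backend repository."""
    name: str
    content: str
    allowed_users: set = field(default_factory=set)


@dataclass
class Container:
    """Hypothetical stand-in for a repository node (for example, a folder)."""
    name: str
    documents: list = field(default_factory=list)
    children: list = field(default_factory=list)


def crawl(container, path=""):
    """Purposes 1 and 2: walk the hierarchy depth-first and yield one
    metadata record per document. Only a link is recorded, not the content."""
    here = f"{path}/{container.name}"
    for doc in container.documents:
        yield {"link": f"{here}/{doc.name}", "size": len(doc.content)}
    for child in container.children:
        yield from crawl(child, here)


def _find(items, name, link):
    """Return the item with the given name, or raise KeyError for the link."""
    for item in items:
        if item.name == name:
            return item
    raise KeyError(link)


def open_document(root, link, user):
    """Purpose 3: resolve a link on demand and enforce user-level access."""
    parts = link.strip("/").split("/")
    if parts[0] != root.name:
        raise KeyError(link)
    node = root
    for name in parts[1:-1]:
        node = _find(node.children, name, link)
    doc = _find(node.documents, parts[-1], link)
    if user not in doc.allowed_users:
        raise PermissionError(f"{user} may not open {link}")
    return doc.content
```

In this sketch the index holds only links and sizes, so opening a document always goes back to the repository tree, where the access check is applied per user.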

Classes

Class              Description
ACLEntry           Bean-type class representing a security domain and user or group.
ChildContainer     Bean-type class representing the ChildContainer data type. SOAP RPC.
ChildDocument      Bean-type class representing the ChildDocument data type.
ChildRequestHint   A type-safe enumeration of portal child request queries. ChildRequestHints can signal the backend to behave differently.
ContainerMetaData  Stores metadata information about the Container object.
CrawlerConstants   Constants related to crawlers.
CrawlerInfo        A NamedValueMap for storing information about Crawler settings.
CWSLogger          A simple logging implementation for passing string messages back to the portal. The CWS stub will instantiate this logger and pass it through to ContainerProvider.Initialize(). It is up to the developer to keep and use this object.
DocumentFormat     Enumeration for portal DocumentFormat flag. The document format flag tells the backend whether to return the actual requested document or a metadata file more suitable for indexing.
DocumentMetaData   Stores metadata information about a Document object.
TypeNamespace      Enumeration for the portal's Document Type Map namespaces. The document type identifier and namespace help the portal decide how to interpret a document's metadata. Standard namespaces include "file" and "MIME." Custom crawlers may define new namespaces in the OTHER namespace.
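
The role of the DocumentFormat flag described above can be sketched as follows. This is an illustration in Python, not IDK code: the enumeration member names `ORIGINAL` and `INDEXING`, the `fetch` function, and the dictionary layout of a repository record are all assumptions made for the example.

```python
from enum import Enum


class DocumentFormat(Enum):
    """Hypothetical stand-in for the format flag; member names are assumed."""
    ORIGINAL = "original"   # the caller wants the actual requested document
    INDEXING = "indexing"   # the caller wants a rendition suited to indexing


def fetch(record, fmt):
    """Return (content_type, body) for a repository record according to fmt."""
    if fmt is DocumentFormat.INDEXING:
        # Flatten the title and metadata fields into plain text for the indexer.
        lines = [record["title"]]
        lines += [f"{key}: {value}" for key, value in sorted(record["fields"].items())]
        return "text/plain", "\n".join(lines).encode("utf-8")
    # Otherwise hand back the stored document unchanged.
    return record["content_type"], record["body"]
```

A backend that stores binary files might answer an indexing request with such a text rendition while still serving the original bytes when a user opens the document.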

Interfaces

Interface           Description
IContainer          An interface that allows the portal to systematically crawl a remote document repository by querying all documents and child containers (sub-nodes) that a particular container (node) contains and the respective users and groups that can access that particular container (node).
IContainerProvider  An interface that allows the portal to iterate over a backend directory structure.
ICrawlerLog         An instance of this interface will be passed into the service's Initialize calls. To return messages to the portal job log, keep track of this instance and invoke the Log method with messages. The portal currently retrieves messages after IContainerProvider.AttachToContainer but this behavior is subject to change in future versions.
IDocument           An interface that allows the portal to query information about and retrieve documents from a backend repository.
IDocumentProvider   An interface that allows the portal to specify documents for retrieval from a backend repository.
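
The logging pattern described for ICrawlerLog (keep the instance passed in at initialization, then write messages to it) might look like the sketch below. It is written in Python and does not use the IDK: `ListLog`, `FolderProvider`, and the method signatures are hypothetical stand-ins, and the real Initialize and AttachToContainer signatures are not shown on this page.

```python
class ListLog:
    """Hypothetical stand-in for the log object the portal passes in."""

    def __init__(self):
        self.messages = []

    def log(self, message):
        self.messages.append(message)


class FolderProvider:
    """Hypothetical container provider that keeps the log it is given."""

    def __init__(self):
        self._log = None
        self._folders = {}

    def initialize(self, folders, log):
        # Keep a reference to the log; later calls need it to report progress.
        self._folders = folders
        self._log = log
        self._log.log(f"initialized with {len(folders)} folder(s)")

    def attach_to_container(self, path):
        # Record both failures and successes so the job log shows what happened.
        if path not in self._folders:
            self._log.log(f"cannot attach: {path} not found")
            raise KeyError(path)
        self._log.log(f"attached to {path}")
        return self._folders[path]
```

Storing the log on the provider keeps every later call able to report to the same job log without the caller passing it again.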