This chapter provides a general introduction to Oracle WebCenter Enterprise Capture and its administration.
Batches and documents are the primary drivers of work in WebCenter Enterprise Capture. In Capture, documents are scanned or imported and maintained in batches. A batch consists of scanned images or electronic document files (such as PDF or Microsoft Office files) that are organized into documents and assigned metadata (index) values. Each document shares a set of metadata values.
WebCenter Enterprise Capture involves the following main processes: capture, conversion, classification, release, and commit.
Capture refers to the scanning or importing of documents into batches within a Capture workspace (see About Capture Workspaces). Common document input scenarios include:
High volume scanning using a production document imaging scanner
Ad hoc remote scanning or import, such as from a business application
Automated import, such as from an email account
Capture provides these methods of document input:
Using the Capture client, end-users manually scan documents or import electronic document files into batches using client profiles created by workspace managers. See About the Capture Client and Introduction to Client Profiles.
Using settings stored in an import job, the Import Processor automatically imports images and other electronic documents directly into Capture from email, network folders, or list files. See Introduction to Import Processing.
Depending on the business scenario, non-image documents and attachments input into Capture may need to be converted to a different format. For example, an organization might convert PDF expense reports attached to imported email messages to an image format to allow their bar codes to be read. Capture's Document Conversion Processor automatically converts documents/attachments and merges documents/attachments within a batch using settings stored in a document conversion job. See Introduction to Document Conversion.
Classification in Capture refers to separating a batch into its logical documents and assigning each document a document profile, which determines the set of metadata fields and attachment types available to that document. It also refers to assigning a batch status to a batch.
Classification can occur manually or automatically in Capture, in a variety of ways.
Document separation can occur in multiple ways:
Manually, by users in the Capture client. For example, users can select a client profile configured for a specific number of pages per document. They can also insert separator sheets between documents prior to scanning to identify a new document. While visually inspecting a batch, client users can create new documents by splitting larger documents into one or more documents.
Manually, by users during file import in the Capture client.
Automatically, during import by the Import Processor, based on job settings.
Automatically, during bar code recognition by the Recognition Processor. When a batch is sent to the Recognition Processor, the processor automatically performs bar code recognition and document classification. See Introduction to Recognition Processing.
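As a rough illustration of two of the separation approaches listed above (a fixed number of pages per document, and separator sheets), the following Python sketch groups a flat list of pages into documents. It is not Capture code: the function names, the `SEPARATOR` marker, and the choice to discard the separator sheet itself are assumptions made for the example.

```python
from typing import List

SEPARATOR = "SEPARATOR"  # hypothetical marker standing in for a separator sheet


def split_fixed(pages: List[str], pages_per_document: int) -> List[List[str]]:
    """Group pages into documents of a fixed page count (the last may be shorter)."""
    if pages_per_document < 1:
        raise ValueError("pages_per_document must be at least 1")
    return [
        pages[i:i + pages_per_document]
        for i in range(0, len(pages), pages_per_document)
    ]


def split_on_separators(pages: List[str]) -> List[List[str]]:
    """Start a new document at each separator sheet and drop the sheet itself."""
    documents: List[List[str]] = []
    current: List[str] = []
    for page in pages:
        if page == SEPARATOR:
            # Close the document in progress, if it has any pages.
            if current:
                documents.append(current)
            current = []
        else:
            current.append(page)
    if current:
        documents.append(current)
    return documents
```

For example, `split_fixed(["p1", "p2", "p3"], 2)` yields one two-page document and one single-page document.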
In Capture, documents are assigned a set of metadata values based on a document profile, which identifies the metadata fields available to index that type of document. Metadata values can be assigned:
Manually, by users in the metadata pane of the Capture client.
Automatically, during processing by the Import Processor, based on job settings.
Automatically, during processing by the Recognition Processor, based on job settings.
Metadata fields can take a variety of forms in Capture, including choice lists, dependent choice lists, database lookups, auto populated fields, and fields with input masks and display formats. Workspace managers configure these metadata field definitions in the workspace and then use them in client profiles or processor jobs. See Managing Metadata Fields.
In Capture, attachment types can be assigned to a document profile. These attachment types can then be used to classify the attachments of documents that are assigned that document profile. See Managing Attachment Types.
In Capture, a batch status is assigned to a batch by the user or by a Capture processor such as Recognition Processor, Import Processor, or Document Conversion Processor. See Managing Batch Statuses.
Capture uses a lock and release method to ensure that only one user or processor has access to any batch at a time, as described below.
|Client Batch Icon|Description|
|---|---|
|Locked to You|A batch automatically becomes locked to a user when the user creates or opens (expands) it, and stays locked until the user releases or unlocks the batch. When done working with batches, users release them or unlock them. Releasing a batch automatically synchronizes its documents and metadata with the WebCenter Enterprise Capture server and forwards the batch for post-processing, if post-processing is configured in its client profile.|
|Locked to Another User|The batch is locked by another user and is unavailable. You cannot open a batch that is locked by another user.|
|Unlocked|The batch is not currently locked and is available to any user that has access to the batch.|
|Processing|The batch is currently being processed by a batch processor and is unavailable. If a released batch is set for post-processing (commit, recognition, or document conversion), its icon changes to processing.|
|Error|If a batch enters an error state, its lock is released. This allows the batch to be examined and locked by another processor or user. Users can right-click an error icon to view details about the error.|
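The lock and release behavior described above can be modeled in a few lines of Python. This is only a sketch of the rules as stated (one owner at a time, and an error releases the lock); the `Batch` class, its method names, and the status strings are hypothetical and do not correspond to any Capture API.

```python
from dataclasses import dataclass
from typing import Optional


class BatchLockedError(Exception):
    """Raised when a batch is held by a different user or processor."""


@dataclass
class Batch:
    name: str
    locked_by: Optional[str] = None  # user or processor currently holding the batch
    in_error: bool = False

    def lock(self, owner: str) -> None:
        # Only one owner at a time; locking again by the same owner changes nothing.
        if self.locked_by is not None and self.locked_by != owner:
            raise BatchLockedError(f"{self.name} is locked by {self.locked_by}")
        self.locked_by = owner

    def unlock(self, owner: str) -> None:
        # Only the current owner may give up the lock.
        if self.locked_by != owner:
            raise BatchLockedError(f"{self.name} is not locked by {owner}")
        self.locked_by = None

    def mark_error(self) -> None:
        # An error releases the lock so another user or processor can examine the batch.
        self.in_error = True
        self.locked_by = None

    def status_for(self, viewer: str) -> str:
        """Return an illustrative status label from the viewer's point of view."""
        if self.in_error:
            return "Error"
        if self.locked_by is None:
            return "Unlocked"
        if self.locked_by == viewer:
            return "Locked to You"
        return "Locked to Another User"
```

The sketch omits synchronization with the server and forwarding to post-processing, which occur when a user releases a batch.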
Committing a batch takes all of its documents and their metadata, writes them in a selected output format (images only) to a specific location or content repository, and then removes them from the batch. This allows the documents to be located and accessed in the content repository via their metadata or contents. When a batch is committed, some of its documents may not be committed. For example, documents without their required fields populated are skipped. If all documents in a batch are committed, the batch is also deleted from the Capture workspace.
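The partial-commit behavior described above (documents with unpopulated required fields are skipped, and the batch is deleted only when every document is committed) can be sketched as follows. The `Document` class, the function names, and the treatment of blank strings as unpopulated are assumptions for illustration, not the Commit Processor's actual logic.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Document:
    title: str
    metadata: Dict[str, str] = field(default_factory=dict)


def commit_batch(
    documents: List[Document], required_fields: List[str]
) -> Tuple[List[Document], List[Document]]:
    """Return (committed, remaining) documents for one commit pass."""
    committed: List[Document] = []
    remaining: List[Document] = []
    for doc in documents:
        # A document is skipped when any required field is missing or blank.
        if all(doc.metadata.get(name, "").strip() for name in required_fields):
            committed.append(doc)
        else:
            remaining.append(doc)
    return committed, remaining


def batch_is_deleted(remaining: List[Document]) -> bool:
    """The batch is removed from the workspace only when no documents remain."""
    return not remaining
```

Documents left in `remaining` stay in the batch, where their missing values can be supplied before another commit attempt.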
Batches are committed by Capture's Commit Processor using settings selected in an assigned commit profile. Commit profiles can commit to:
WebCenter Content Server
WebCenter Content: Imaging (directly or via Input Agent)
Oracle Documents Cloud Service
Text file (often for consumption by another process or Oracle technology)
You can output image documents to one of the following formats during commit: multiple page TIFF, image only PDF (creates a PDF/A-1a compliant PDF file), and searchable PDF (with an optional full text file that contains text found in documents via Optical Character Recognition (OCR)).
During committing, non-image files that were not converted to image format remain in their original format.
For information about committing, see Introduction to Commit Processing.
The Capture client is the end-user application that a knowledge worker or scan operator uses to create batches from scanned documents or from document files imported from a file folder, and to index documents within batches. The Capture client is installed and launched as a native desktop application that does not require a web browser. The client's main functionality includes:
Scanning and importing documents, using the industry standard TWAIN interface to scan from desktop scanners or other TWAIN-compliant input devices
Reviewing, editing, and indexing documents
Releasing batches so that they can be further processed and corresponding documents can be checked into a content repository or attached to business application records
The Capture client provides a single window whose upper left batch pane is fixed, while its other panes change, depending on the batch pane selection. For example, the document pane shown on the right of Figure 2-1 displays page thumbnails (smaller page representations) and options for editing documents and their pages. The lower left indexing pane shows metadata fields to complete for the selected document. See Getting started with Capture in Using Oracle WebCenter Enterprise Capture.
Figure 2-1 Capture Client Window
A Capture workspace represents a complete capture system, providing a centralized location for metadata, configuration profiles, and batch data for a particular environment. Capture client users create and access batches within the workspace to which they have been granted access. Workspace managers configure and manage workspaces they have been granted access to and control others' access to the workspace.
The Capture workspace provides these benefits:
A separate work area useful for managing document capture for a department, division, or entire organization
Shareable elements for re-use in multiple Capture components
Secure access to workspaces, provided by Capture's user/group restrictions on workspaces
Ability to copy a workspace, for easily adapting its configuration for another environment
Ability to restrict access to batches created within a workspace by client profile
For more information about workspaces, see Introduction to Workspaces and Their Elements.
WebCenter Enterprise Capture provides the following processors, which workspace managers configure for automation in the workspace console:
The Import Processor provides automated bulk importing, from sources such as a file system folder, a delimited list (text) file, or the inbox/folder of an email server account. The import job monitors the source and imports at a specified frequency, such as once a minute, hour, or day. See Introduction to Import Processing.
The Document Conversion Processor automatically converts non-image documents and attachments to a specified format in Capture using Oracle Outside In Technology and/or an external (third party) conversion program. Documents and attachments can also be merged in various ways during conversion. For example, the conversion processor can convert document files such as PDFs or Microsoft Office documents to TIFF image format. See Introduction to Document Conversion.
The Recognition Processor automatically performs bar code recognition, document organization, and automatic indexing. See Introduction to Recognition Processing.
The Commit Processor executes commit profiles to automatically output batch documents to a specified location or content repository, then removes the batches from the workspace. Supported document and attachment output formats include: multiple page TIFF, image only PDF, and searchable PDF. A commit profile specifies how to output the documents and their metadata, and includes metadata field mappings, output format, error handling instructions, and commit driver settings. See Introduction to Commit Processing.
Workspace managers can queue batches to specific batch processors through post-processing options.
If configured, import processing is the initial processing step, responsible for creating and capturing document batches.
If configured, commit processing is the final processing step.
If configured, document conversion and recognition processing are intermediate processes. Because bar code recognition requires image format, document conversion typically precedes recognition.
The following is an example batch flow:
Batches are captured either in the client or through an Import Processor job.
Post-processing in the client profile or import job is set to Document Conversion Processor.
The imported documents in the batch are converted through a document conversion job to images.
Post-processing in the conversion job is set to Recognition Processor.
Bar codes on the converted image documents are recognized by a recognition job.
Post-processing in the recognition job is set to Commit Processor.
When a batch is processed by the Commit Processor, online commit profiles process the documents, committing them to a content repository or network folder.
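The example flow above is effectively a chain in which each step's post-processing setting names the next processor. A minimal Python sketch of that idea follows; the `POST_PROCESSING` mapping and `trace_flow` function are hypothetical names used only to illustrate the routing, and the loop check is an addition for safety rather than documented Capture behavior.

```python
from typing import Dict, List, Optional

# Hypothetical post-processing settings: each step names the processor that runs next.
POST_PROCESSING: Dict[str, Optional[str]] = {
    "Import Processor": "Document Conversion Processor",
    "Document Conversion Processor": "Recognition Processor",
    "Recognition Processor": "Commit Processor",
    "Commit Processor": None,  # commit is the final step
}


def trace_flow(start: str, settings: Dict[str, Optional[str]]) -> List[str]:
    """Follow post-processing settings from a starting step to the end of the flow."""
    path: List[str] = []
    current: Optional[str] = start
    while current is not None:
        if current in path:
            raise ValueError(f"post-processing loop detected at {current}")
        path.append(current)
        # A step with no configured post-processing ends the flow.
        current = settings.get(current)
    return path
```

Calling `trace_flow("Import Processor", POST_PROCESSING)` walks the four processors in the order given in the example flow.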
Capture provides a central configuration console in which workspace managers create and manage workspaces and their elements for use throughout Capture. For example, workspace managers create metadata fields, choice lists, and database lookups in the console, then use them in multiple areas such as client profiles and batch processors. See Introduction to Workspaces and Their Elements.
Figure 2-2 Workspace Console Window
Capture provides the following administrator and user roles, each with a different access level and set of tasks:
System administrators install and configure the Capture system environment, map users or groups configured in a policy store to Capture roles, and monitor Capture processing. They also manage data sources for use in a Capture workspace. It is assumed that system administrators have system administration permissions, including Enterprise Manager and Oracle WebLogic Server access. See Capture System Administration Overview in Administering Oracle WebCenter Enterprise Capture.
Capture workspace managers have read/write access to workspaces they create and ones to which they have been granted access. They can add, edit, copy, and delete workspaces, and configure profiles and processor jobs. Workspace managers can also use certain WLST commands with their assigned workspaces. See Workspace Manager Tasks.
Capture workspace viewers have read-only access to workspaces to which they have been granted access. For example, a workspace viewer might review workspace configuration, client profiles, and processor job settings within the workspace to resolve issues. Workspace viewers cannot make changes to workspaces.
Capture users have access to the Capture client only. They can see and select only those client profiles to which they have been granted access. These end-users create batch-related content within a workspace, including batches, documents, attachments, and pages. See Getting started with Capture in Using Oracle WebCenter Enterprise Capture. Workspace managers are typically assigned both the workspace manager and user roles, which enables them to switch between configuring the workspace and testing configurations in the client.
Developers who write customization scripts for use in Capture components may either be added as Capture workspace managers and granted the workspace manager role or they may provide scripts to workspace managers who import, reference, and test their scripts in Capture components. See Introduction to Developing Scripts with Oracle WebCenter Enterprise Capture in Developing Scripts for Oracle WebCenter Enterprise Capture.
Capture's user login, access, and authentication are integrated with Oracle Platform Security Services (OPSS). See Introduction to Oracle Platform Security Services in Securing Applications with Oracle Platform Security Services. After authentication, users' permissions depend on their assigned Capture roles, which the system administrator assigns in Oracle Enterprise Manager.
Capture provides multiple access points, as described in the following sections:
Access to the console and workspaces functions as follows:
For the console, workspace managers and viewers assigned the Capture workspace manager or viewer role, respectively, can sign in and access the workspace console.
For workspaces, workspace managers automatically have access to workspaces they create. For any other workspace, a manager who already has access must grant other managers and viewers access on the Security tab, as described in Managing Workspace Security.
Access to the client and client profiles functions as follows:
For the client, users or an assigned user group must be granted the Capture user role to sign into and access the client.
For client profiles, workspace managers must grant users and/or user groups access to a selected client profile on the profile's Security train stop, as described in Managing Client Profile Security.
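The two-level check for client access described above (a role to sign in, plus a per-profile grant to a user or group) can be sketched in Python. The class names, the `CaptureUser` role string, and the sample users are illustrative assumptions; in Capture itself, roles are assigned in Oracle Enterprise Manager and grants are made on the profile's Security train stop.

```python
from dataclasses import dataclass, field
from typing import Set

CAPTURE_USER_ROLE = "CaptureUser"  # illustrative role name for this sketch


@dataclass
class User:
    name: str
    roles: Set[str] = field(default_factory=set)   # effective roles, however granted
    groups: Set[str] = field(default_factory=set)


@dataclass
class ClientProfile:
    name: str
    granted_users: Set[str] = field(default_factory=set)
    granted_groups: Set[str] = field(default_factory=set)


def can_sign_in_to_client(user: User) -> bool:
    """Signing in to the client requires the Capture user role."""
    return CAPTURE_USER_ROLE in user.roles


def can_use_profile(user: User, profile: ClientProfile) -> bool:
    """A profile is usable only with the role and a direct or group grant."""
    if not can_sign_in_to_client(user):
        return False
    return user.name in profile.granted_users or bool(user.groups & profile.granted_groups)
```

A user who holds the role but has no grant on a profile can sign in to the client but cannot select that profile.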
The following steps summarize how you configure and manage a workspace environment, using the workspaces pane and workspace tabs.
Get started accessing the Capture workspace console and the Capture client, as described in Getting Started Managing Capture Workspaces.
In the workspaces pane, create and manage workspaces.
Users with the Capture Workspace Manager role have access to workspaces that they create or that other workspace managers grant them access to. See Managing Workspaces.
On the Security tab, manage workspace access for Capture users. See Managing Workspace Security.
On the Capture tab, create and manage client profiles, as described in Managing Client Profiles. In the Capture client, test the client profiles.
On the Capture tab, create and manage Import Processor jobs, as described in Managing Import Processing.
On the Commit tab, create and manage commit profiles. See Introduction to Commit Processing.
On the Advanced tab, import scripts provided by developers. See Managing Capture Scripts.
If needed, use WLST commands to import and export workspaces, or release or export batches. See Performing Advanced Functions.
For information about using scripts in supported Capture components, see Managing Capture Scripts. See also Introduction to Developing Scripts with Oracle WebCenter Enterprise Capture in Developing Scripts for Oracle WebCenter Enterprise Capture.
You can use Oracle WebCenter Enterprise Capture to process virtually any type of document. This guide features a use case in which Oracle WebCenter Enterprise Capture processes a large volume of customer documents, with the workspace manager automating the process as much as possible to meet business needs.
The Customer workspace processes three types of customer documents: correspondence, purchase orders, and customer agreements.
Figure 2-3 illustrates the workspace's main configuration.
Figure 2-3 Customer Workspace
Three document profiles accommodate the main types of documents processed, and include: Correspondence, Purchase Orders, and Customer Agreements.
Correspondence arrives by mail and takes the following path:
Client users scan and index batches of correspondence documents using a Correspondence client profile, then release them.
Documents are output (committed) to a folder using a text file commit profile and picked up by another process.
Purchase orders arrive by email and take the following path:
An Import Processor job checks for new email messages for specified accounts, and imports and indexes the email message as a document as well as attached purchase order documents.
Documents are committed to Oracle WebCenter Content: Imaging for transaction processing.
Customer agreements are scanned using a variety of multi-function devices (MFDs), which send the scanned documents to a network file share. These documents may arrive in either TIFF or PDF format, and take the following path:
An Import Processor job checks the network folder for new files and imports them.
A Document Conversion job converts the PDF documents and PDF attachments to a standard image format, so that all incoming documents and attachments are in image format and can be processed by the Recognition Processor.
A Recognition job reads the images' bar codes, organizes the images into documents, and indexes the documents. Document separation is needed in case multiple agreements were scanned into a single file.
Documents are committed to Oracle WebCenter Content for storage and retrieval.
Figure 2-4 displays metadata fields defined for the Customer workspace, which are then included in document profiles as they apply. For example, the Correspondence document profile includes Correspondence Type metadata fields.
Figure 2-4 Example Metadata Fields and Configuration
The Correspondence Type metadata field uses a user-defined choice list that allows users to select the type of correspondence document they are indexing (a complaint, satisfaction, suggestion, or other document).
The Customer Name metadata field uses a database lookup, in which a search for customer name returns both the full customer name and the unique customer ID.
The Product Family and Product metadata fields use database choice lists that allow users to select product information from a database source. They have a choice list dependency, where the user's product family choice determines the products listed.
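The choice list dependency between Product Family and Product can be pictured as a parent-to-children mapping. The following Python sketch uses an in-memory dictionary in place of the database source; the sample families, the product names, and both function names are invented for illustration.

```python
from typing import Dict, List

# Hypothetical sample data standing in for the database source.
PRODUCTS_BY_FAMILY: Dict[str, List[str]] = {
    "Printers": ["Printer A", "Printer B"],
    "Scanners": ["Scanner X"],
}


def product_choices(family: str) -> List[str]:
    """Return the products to list once a product family has been chosen."""
    # An unknown family produces an empty list rather than an error.
    return list(PRODUCTS_BY_FAMILY.get(family, []))


def is_valid_selection(family: str, product: str) -> bool:
    """Check that the chosen product belongs to the chosen family."""
    return product in PRODUCTS_BY_FAMILY.get(family, [])
```

Here, choosing "Scanners" as the family limits the product list to "Scanner X".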
There are multiple business scenarios in which document conversion and merging play an integral role, particularly when Capture's other automated batch processors are also involved.
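In one such scenario, an Import Processor job imports an email message with an attached PDF expense report and creates a batch that contains two documents: the expense report PDF attachment and the email message (positioned last in the batch).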
After processing the email message, Import Processor forwards the batch to the Document Conversion Processor.
A Document Conversion Processor job converts each batch's PDF and email message to image format. Image format is needed for later bar code recognition. The processor merges the two documents to a single document, so that the email message is included within the PDF document. If the email contains multiple PDF expense reports, the email message should be appended to each expense report document. The Document Conversion Processor forwards the batch to the Recognition Processor.
A Recognition Processor job performs bar code recognition and indexing of each document. Recognition Processor forwards the batch to the Commit Processor.
The Commit Processor commits the batch.
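The merge step in this scenario, in which the email message is appended to each expense report document, can be sketched as follows. The `Doc` class, its `kind` labels, and the requirement of exactly one email per batch are assumptions for the example, not settings of a document conversion job.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    kind: str         # "pdf" or "email" (hypothetical labels)
    pages: List[str]  # page images after conversion


def merge_email_into_reports(batch: List[Doc]) -> List[Doc]:
    """Append the email's pages to every expense report and drop the standalone email."""
    emails = [d for d in batch if d.kind == "email"]
    reports = [d for d in batch if d.kind == "pdf"]
    if len(emails) != 1:
        raise ValueError("expected exactly one email message per batch")
    email = emails[0]
    # Build new documents so the input batch is left unchanged.
    return [Doc("pdf", report.pages + email.pages) for report in reports]
```

With two expense reports and one email in a batch, the result is two documents, each ending with the email's pages.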