6 Creating Import Processor Scripts

This chapter discusses creating Import Processor scripts.

The Import Processor provides many built-in features. If you need to customize the import process beyond them, you can write scripts that respond to the events described in this chapter.

This chapter contains the following sections:

  • Scripting for the Import Processor

  • Import Processor Classes

6.1 Scripting for the Import Processor

You can develop scripts for the Import Processor to perform a wide variety of functions. Some common tasks include:

  • Skipping the importing of certain image files

  • Changing Capture batch properties

  • Skipping the importing of a batch

  • Adding page level metadata values during importing

  • After importing, moving images to a different folder

If an Import Job specifies a script to use during processing, the Import Processor Bean creates an instance of the JDK's ScriptEngine class and initializes it with the script specified in the job. The Import Processor Bean, the Import Manager Bean, and all import sources share this script engine instance.
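The following is a minimal sketch of how a host could load a script once and then call its event functions by name through the JDK's javax.script API. It is not the Import Processor's actual code. The class name ScriptRuntimeSketch, the method fireEvent, and the choice of a JavaScript engine are assumptions made for illustration; note that JDK 15 and later do not bundle a JavaScript engine, so the sketch returns false when none is installed.

```java
import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Hypothetical helper; not part of the Capture API.
class ScriptRuntimeSketch {

    // Evaluates the script source, then invokes the named event function
    // with the given context object. Returns false when no suitable engine
    // is available or when the script does not define the function.
    static boolean fireEvent(String scriptSource, String eventName, Object ctx) {
        // Assumption: the script is JavaScript.
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("javascript");
        if (!(engine instanceof Invocable)) {
            return false; // engine missing (null) or cannot invoke functions
        }
        try {
            engine.eval(scriptSource);
            ((Invocable) engine).invokeFunction(eventName, ctx);
            return true;
        } catch (NoSuchMethodException e) {
            return false; // the script does not handle this event
        } catch (ScriptException e) {
            throw new IllegalStateException("Script failed in " + eventName, e);
        }
    }

    public static void main(String[] args) {
        String script = "function preProcess(ctx) { /* initialization */ }";
        System.out.println("preProcess fired: " + fireEvent(script, "preProcess", new Object()));
    }
}
```

Because the engine is created once and reused, any state a script sets in one event function would remain visible to later events in the same run, which is consistent with the shared engine described above.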

This section defines the following events:

  • Import Processor Events

  • Email Source Events

  • Folder Source Events

  • List File Source Events

6.1.1 Import Processor Events

The following table provides definitions of Import Processor events.

Events Description

public void preProcess(ImportProcessorContext ctx);

This event occurs prior to the pre-processing of the import source. Initialization code can be performed here.

public void process(ImportProcessorContext ctx);

This event signals the start of the import process.

public void postProcess(ImportProcessorContext ctx);

This event occurs after the import source has been processed. Clean-up code can be performed here.

public void preCreateBatch(ImportProcessorContext ctx);

This event occurs immediately before a new batch is started.

public void postCreateBatch(ImportProcessorContext ctx);

This event occurs immediately after a batch is created, but before any documents have been created.

public void preCreateDocument(ImportProcessorContext ctx);

This event occurs prior to a new document being created.

public void postCreateDocument(ImportProcessorContext ctx);

This event occurs after a new document has been created.

public void preImportFile(ImportProcessorContext ctx);

This event occurs prior to a file being imported.

public void postImportFile(ImportProcessorContext ctx);

This event occurs after a file is imported.

public void preRelease(ImportProcessorContext ctx);

This event occurs prior to a batch being released.

public void postRelease(ImportProcessorContext ctx);

This event occurs after a batch has been released.

public void preDatabaseSearch(ImportProcessorContext ctx);

This event occurs prior to a database lookup.

public void processDatabaseSearchResults(ImportProcessorContext ctx);

This event occurs after the database lookup has returned the search results.
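As an illustration of the events above, the following sketch handles preImportFile to skip temporary files. The ImportProcessorContext class shown here is a hypothetical stand-in that models only two of the documented properties as public fields; the real class may expose them through accessor methods. The sketch also assumes that setting cancel during preImportFile skips only the current file.

```java
import java.util.Locale;

// Hypothetical stand-in for the real context class.
class ImportProcessorContext {
    String importSourceFile; // name of the file currently being processed
    boolean cancel;          // set to true to cancel the current operation
}

class SkipTempFilesScript {

    // Same signature as the preImportFile event: cancels the import of
    // files whose names end in .tmp or .bak (case-insensitive).
    public void preImportFile(ImportProcessorContext ctx) {
        String name = ctx.importSourceFile == null
                ? ""
                : ctx.importSourceFile.toLowerCase(Locale.ROOT);
        if (name.endsWith(".tmp") || name.endsWith(".bak")) {
            ctx.cancel = true; // assumed to skip this file only
        }
    }

    public static void main(String[] args) {
        ImportProcessorContext ctx = new ImportProcessorContext();
        ctx.importSourceFile = "scan_0001.TMP";
        new SkipTempFilesScript().preImportFile(ctx);
        System.out.println(ctx.importSourceFile + " skipped: " + ctx.cancel);
    }
}
```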


6.1.2 Email Source Events

The following table provides descriptions of email source events.

Events Description

public void newMessage(ImportProcessorContext ctx, EmailSourceContext emailCtx);

This event occurs when a new email message is about to be processed.

public void newAttachment(ImportProcessorContext ctx, EmailSourceContext emailCtx);

This event occurs when a new email attachment is about to be processed.

public void deleteMessage(ImportProcessorContext ctx, EmailSourceContext emailCtx);

This event occurs in the email message post-processing step when an email message is about to be deleted.

public void moveMessage(ImportProcessorContext ctx, EmailSourceContext emailCtx);

This event occurs in the email message post-processing step when an email message is about to be moved to an email folder.
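The following sketch shows one way a script might use the deleteMessage event to keep messages from a particular account on the server. Both context classes are hypothetical stand-ins with public fields, the account name is invented for the example, and the sketch assumes that setting cancel in this event prevents the deletion.

```java
// Hypothetical stand-ins for the real context classes.
class ImportProcessorContext {
    boolean cancel; // set to true to cancel the current operation
}

class EmailSourceContext {
    String account; // email account currently being processed
}

class KeepAuditMailScript {
    // Hypothetical account whose messages must never be deleted.
    static final String AUDIT_ACCOUNT = "audit@example.com";

    // Same signature as the deleteMessage event.
    public void deleteMessage(ImportProcessorContext ctx, EmailSourceContext emailCtx) {
        if (AUDIT_ACCOUNT.equalsIgnoreCase(emailCtx.account)) {
            ctx.cancel = true; // assumed to stop the pending deletion
        }
    }

    public static void main(String[] args) {
        ImportProcessorContext ctx = new ImportProcessorContext();
        EmailSourceContext emailCtx = new EmailSourceContext();
        emailCtx.account = "Audit@Example.com";
        new KeepAuditMailScript().deleteMessage(ctx, emailCtx);
        System.out.println("deletion cancelled: " + ctx.cancel);
    }
}
```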


6.1.3 Folder Source Events

The following table provides descriptions of folder source events.

Events Description

public void newFolder(ImportProcessorContext ctx, FolderSourceContext folderCtx);

This event occurs when a new folder is about to be processed.

public void deleteDocumentFile(ImportProcessorContext ctx, FolderSourceContext folderCtx);

This event occurs in the folder post-processing step when a file from the folder is about to be deleted.

public void renameDocumentFile(ImportProcessorContext ctx, FolderSourceContext folderCtx);

This event occurs in the folder post-processing step when a file from the folder is about to be renamed.
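A script could use the deleteDocumentFile event to copy each file to an archive location before it is removed. The sketch below is one possible approach, not a documented recipe. The context classes are hypothetical stand-ins, the archive directory is supplied by the caller, and the sketch assumes that folderName and documentFilename together form a resolvable path and that setting cancel stops the deletion.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical stand-ins for the real context classes.
class ImportProcessorContext {
    boolean cancel; // set to true to cancel the current operation
}

class FolderSourceContext {
    String folderName;       // directory currently being processed
    String documentFilename; // file currently being processed
}

class ArchiveBeforeDeleteScript {
    private final Path archiveDir; // hypothetical archive location

    ArchiveBeforeDeleteScript(Path archiveDir) {
        this.archiveDir = archiveDir;
    }

    // Same signature as the deleteDocumentFile event: copies the file to the
    // archive directory, and cancels the deletion if the copy fails.
    public void deleteDocumentFile(ImportProcessorContext ctx, FolderSourceContext folderCtx) {
        Path source = Paths.get(folderCtx.folderName, folderCtx.documentFilename);
        try {
            Files.createDirectories(archiveDir);
            Files.copy(source, archiveDir.resolve(source.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            ctx.cancel = true; // keep the original when it could not be archived
        }
    }
}
```

Cancelling on failure is a deliberately conservative choice: the source file stays in place so that no document is lost if the archive location is unavailable.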


6.1.4 List File Source Events

The following table provides descriptions of list file source events.

Events Description

public void newFolder(ImportProcessorContext ctx, ListFileSourceContext listFileCtx);

This event occurs when a new folder containing list files is about to be processed.

public void newListFile(ImportProcessorContext ctx, ListFileSourceContext listFileCtx);

This event occurs when a new list file is about to be processed.

public void newListFileLine(ImportProcessorContext ctx, ListFileSourceContext listFileCtx);

This event occurs when a new line in the list file is about to be processed.

public void deleteListFile(ImportProcessorContext ctx, ListFileSourceContext listFileCtx);

This event occurs in the list file post-processing step when a list file is about to be deleted.

public void renameListFile(ImportProcessorContext ctx, ListFileSourceContext listFileCtx);

This event occurs in the list file post-processing step when a list file is about to be renamed.
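The newListFileLine event gives a script the chance to inspect each line before it is processed. The following sketch skips lines that do not have the expected number of fields. The context classes are hypothetical stand-ins, and the pipe delimiter, the three-field layout, and the effect of cancel on a single line are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-ins for the real context classes.
class ImportProcessorContext {
    boolean cancel; // set to true to cancel the current operation
}

class ListFileSourceContext {
    String listFileLine; // contents of the line currently being processed
}

class ListFileLineScript {
    static final String DELIMITER = "|"; // assumed field delimiter
    static final int EXPECTED_FIELDS = 3; // assumed: file path, type, key

    // Splits a line on the delimiter; the -1 limit keeps trailing empty fields.
    static List<String> splitLine(String line) {
        return Arrays.asList(line.split(Pattern.quote(DELIMITER), -1));
    }

    // Same signature as the newListFileLine event: skips lines that are
    // missing, have too few fields, or have a blank first field.
    public void newListFileLine(ImportProcessorContext ctx, ListFileSourceContext listFileCtx) {
        String line = listFileCtx.listFileLine;
        if (line == null) {
            ctx.cancel = true;
            return;
        }
        List<String> fields = splitLine(line);
        if (fields.size() < EXPECTED_FIELDS || fields.get(0).trim().isEmpty()) {
            ctx.cancel = true; // assumed to skip this line only
        }
    }

    public static void main(String[] args) {
        System.out.println(splitLine("scan1.tif|Invoice|1001"));
    }
}
```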


6.2 Import Processor Classes

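This section defines the following Import Processor classes:

  • ImportJob

  • ImportProcessorContext Class

  • Capture Core Classes used by Import Processor

  • Email Source Classes

  • Folder Source Classes

  • List File Source Classes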

6.2.1 ImportJob

Import jobs are configured within a Capture Workspace to import batches from import sources such as a file system folder, a delimited list file, or an inbox/folder of an email server. The following table defines the properties for an Import Job.

Property Description

String jobID

A value that uniquely identifies the job in the system. This will be a GUID.

String workspaceID

The identifier of the workspace to which the job belongs.

String jobName

A human-readable name for the job.

String dbSearchID

The identifier of the database search to use when processing the job.

String dbSearchFieldID

The identifier of the database search field to use when processing the job.

Integer imageDownsample

Determines how images are down-sampled during import.

0 - None (retain image format)

1 - Down-sample color to 8 bit grayscale

2 - Down-sample color or grayscale to black and white

Integer jpegQuality

The JPEG quality ratio, from 0 to 99.

String batchPrefix

The batch prefix to use when creating batch names.

String defaultBatchStatusID

The identifier of the batch status to associate with batches created by this job.

Integer defaultPriority

The default priority assigned to batches, ranging from 0 to 10.

String defaultDocumentTypeID

The default document profile for documents created by this job.

Integer searchResultOption

Determines how to handle database lookups that return more than one result.

0 - Use the first record

1 - Ignore results (do not populate fields)

String scriptID

The unique identifier of a script to use for this job.

Integer importFrequency

A value, specified in seconds, that determines how often a job should be polled for work to process. The following values are possible:

0 - Inactive

30 - Every 30 seconds

60 - Every 1 minute

300 - Every 5 minutes

900 - Every 15 minutes

1800 - Every 30 minutes

3600 - Every 1 hour

-1 - Daily (Specify Time)

Integer hour

If the importFrequency is set to Daily, this specifies the hour of the day.

Integer minute

If the importFrequency is set to Daily, this specifies the minute of the hour.

Date lastCheck

The date/time the job was last checked for processing. This will be updated by the Import Job Scheduler after a job is polled for work to process.

Map<String, FieldMappingInfo> fieldMappings

A set of values that map Capture fields to import source metadata fields.

String importSourceClassName

The name of the Java class that provides the implementation of the import source for this job.

ImportSourceConfiguration importSourceConfig

Contains the configuration for the import source defined for this job.

String batchProcessorClassName

The name of the class that will be used to process the batch when it is released. If this value is null, the batch lock will be discarded and the batch will be put in a READY state.

String batchProcessorJobID

A unique identifier for a batch processor job. If this value is null, either the processor does not support jobs or the batch is going to be put in a READY state.

Integer imageFailureAction

This is the action to take if an invalid image is encountered.

0 - Abort the batch

1 - Skip the item

Locale locale

Specifies the locale of the list file source.

String defaultDateFormat

Specifies the default date format of dates in the list file source.

String description

The description of this job.

String encoding

Specifies the file encoding of the list file source.

Boolean isJobOnline

Indicates whether this job should be processed.
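The importFrequency, hour, and minute properties together describe a job's schedule. The following sketch turns them into a readable label, for example for logging from a script. The ImportJob class here is a hypothetical stand-in that models only those three properties as public fields, and the sketch assumes hour uses a 24-hour clock.

```java
// Hypothetical stand-in for the real ImportJob class.
class ImportJob {
    Integer importFrequency; // seconds between polls, 0 = inactive, -1 = daily
    Integer hour;            // used only when importFrequency is -1
    Integer minute;          // used only when importFrequency is -1
}

class ImportFrequencyDescriber {

    // Returns a human-readable description of the job's polling schedule.
    static String describe(ImportJob job) {
        if (job.importFrequency == null || job.importFrequency == 0) {
            return "Inactive";
        }
        int seconds = job.importFrequency;
        if (seconds == -1) {
            return String.format("Daily at %02d:%02d", job.hour, job.minute);
        }
        if (seconds % 3600 == 0) {
            return "Every " + (seconds / 3600) + " hour(s)";
        }
        if (seconds % 60 == 0) {
            return "Every " + (seconds / 60) + " minute(s)";
        }
        return "Every " + seconds + " second(s)";
    }

    public static void main(String[] args) {
        ImportJob job = new ImportJob();
        job.importFrequency = 900;
        System.out.println(describe(job));
    }
}
```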


6.2.2 ImportProcessorContext Class

The ImportProcessorContext class contains properties relevant to the job being processed. An instance of this class is created before processing is started and is passed to an import source at various stages throughout processing.

Property Description

Boolean cancel

When this boolean value is set to True, it will cancel the operation being performed.

Boolean cancelDBSearch

When this boolean value is set to True, it will cancel the database lookup.

DBSearchResults dbSearchResults

This contains the results from a database lookup.

String sourceName

The name of the import source that the current Import Job is configured to use.

Logger logger

This is an instance of the Import Processor's Logger class, which can be used to log information related to processing.

ImportManagerSession importManager

This is the import manager session bean used in this context.

ScriptEngine scriptEngine

The scripting engine used to invoke methods in Import Processor scripts.

ImportJob importJob

This is the current Import Job being processed.

BatchLockEntity ble

This contains the batch lock entity for the batch, after a new batch has been created.

String importSourceFile

This is the name of the file currently being processed.

DocumentEntity documentEntity

The document entity associated with the file currently being processed.

DocumentPageEntity documentPageEntity

This is the document page entity associated with the file currently being processed.

ImportHATokenEntity importHATokenEntity

This is the high availability token associated with the current batch.

Integer lastMultiPageTiffNumber

This contains the current page number of a multi-page TIFF file being processed.

CaptureWorkspaceEntity workspaceEntity

This is the workspace entity associated with the current batch.

WorkspaceManager workspaceManager

This is the workspace manager associated with the current Import Job.


6.2.3 Capture Core Classes used by Import Processor

The following table describes the Capture Core classes used by the Import Processor.

Class Properties Description

oracle.odc.data.DBSearchResults

List<DBSearchResultRow>

A list of rows from the database lookup.

 

List<DBSearchFieldInfo> fieldInfoList

A list of search field info describing the columns used in the database lookup.

oracle.odc.data.DBSearchResultRow

List<String> results

A list of results from the database lookup. Each item in the list represents a column.

oracle.odc.data.DBSearchFieldInfo

String captureIndexDefID

The ID of the Capture metadata field definition.

 

String dbColumnName

The name of the database lookup column.

 

Integer dbColumnType

The database lookup column type.

 

Integer captureFieldType

The Capture metadata field definition type.
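To show how these classes relate, the following sketch pairs each lookup column name with its value for one result row, as a script might do in processDatabaseSearchResults. The three classes are hypothetical stand-ins with public fields. The field name rows is an assumption, because the table above does not name that property, and the sketch assumes that the values in results appear in the same order as the entries in fieldInfoList.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for the real Capture Core classes.
class DBSearchFieldInfo {
    String dbColumnName; // name of the database lookup column
}

class DBSearchResultRow {
    List<String> results = new ArrayList<>(); // one value per column
}

class DBSearchResults {
    List<DBSearchResultRow> rows = new ArrayList<>(); // property name assumed
    List<DBSearchFieldInfo> fieldInfoList = new ArrayList<>();
}

class SearchResultReader {

    // Builds an ordered column-name-to-value map for the given row.
    // Assumes results and fieldInfoList share the same column order.
    static Map<String, String> rowAsMap(DBSearchResults searchResults, int rowIndex) {
        Map<String, String> byColumn = new LinkedHashMap<>();
        List<String> values = searchResults.rows.get(rowIndex).results;
        List<DBSearchFieldInfo> fields = searchResults.fieldInfoList;
        for (int i = 0; i < fields.size() && i < values.size(); i++) {
            byColumn.put(fields.get(i).dbColumnName, values.get(i));
        }
        return byColumn;
    }
}
```

A script that receives more than one row could call rowAsMap for each index and choose a row by comparing a known column, rather than relying on row order.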


6.2.4 Email Source Classes

The following table describes email source classes. See the JavaMail API documentation for the Folder and Message class definitions.

Class Properties Description

oracle.odc.importprocessor.email.EmailSourceContext

String account

The name of the email account currently being processed.

 

Folder folder

The email folder currently being processed.

 

Message message

The email message currently being processed.

 

String attachmentFilename

The file name of the email message attachment currently being processed.


6.2.5 Folder Source Classes

The following table describes folder source classes.

Class Properties Description

oracle.odc.importprocessor.folder.FolderSourceContext

String folderName

The name of the directory currently being processed.

 

String documentFilename

The name of the file currently being processed.

 

String renamedDocumentFilename

If the post-processing step indicates the file should have a prefix added to it or the extension changed, this property indicates the changed file name.


6.2.6 List File Source Classes

The following table describes list file source classes.

Class Properties Description

oracle.odc.importprocessor.listfile.ListFileSourceContext

String folderName

The name of the folder currently being processed.

 

String listFilename

The name of the list file currently being processed.

 

String listFileLine

The contents of the line currently being processed in the list file.

 

String documentFilename

The name of the file currently being processed from the current line in the list file.

 

String renamedListFilename

If the post-processing step indicates the list file should have a prefix added to it or the extension changed, this property indicates the changed list file name.