Configure Import Processor Jobs

About Import Processing

The import processor lets you automate bulk importing of documents (image and non-image) into Oracle Content Management from email, network folders, or list files.

Key Import Processor Job Settings

The import processor's automated import of images and other electronic documents into Content Capture suits scenarios such as documents from multifunction devices, images scanned using third-party software, and documents sent as email attachments.

Depending on the files you want to import, you can create these import processor jobs:

Email Source: For an email source processor job, import processor imports files attached to incoming email messages into Content Capture. It can also import the email body and the entire email message.
Folder Source: For a folder source processor job, import processor monitors an import folder and imports all files it finds with a specified file mask.
List File Source: Import processor monitors an import folder and reads a list (text) file containing records that identify each file to import, zero or more attachment files to import, and optionally, metadata values to be assigned to the file.

Important Points About Import Processing

Unlike the other batch processors, which process queued batches, the import processor polls at a specified frequency (ranging from every 30 seconds to once a day). At each poll, it searches the specified source for files to import and, if it finds any, begins processing them.
You configure settings specific to the selected source (email, folder, or list file) on the Import Source Settings page. For example:
- For an email source, you can specify email accounts you want to be monitored and email messages and/or attachments you want to import.
- For a list file job, you can identify the folder and list files you want to be read.
- For a folder import job, you can identify the folder and file types you want to import.
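
The polling model can be pictured with a short sketch. This is only an illustration of the interval choices listed under Add or Edit an Import Processor Job, not the product's scheduler: the `FREQUENCIES` table, the `next_poll` helper, and the label strings are hypothetical names chosen for this example.

```python
from datetime import datetime, timedelta

# Interval choices named in this document, in seconds (labels are hypothetical).
FREQUENCIES = {
    "30 seconds": 30,
    "1 minute": 60,
    "5 minutes": 300,
    "15 minutes": 900,
    "30 minutes": 1800,
    "1 hour": 3600,
}


def next_poll(now, frequency, daily_time=None):
    """Return the time of the next poll after `now` (illustrative helper)."""
    if frequency == "every day":
        # A daily job runs at the hour and minute given in Time Hr and Min.
        hour, minute = daily_time
        candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        if candidate <= now:
            candidate += timedelta(days=1)  # today's slot has already passed
        return candidate
    return now + timedelta(seconds=FREQUENCIES[frequency])
```

For instance, a job set to every 5 minutes that polled at 12:00 would poll again at 12:05, while a daily job set to 09:30 that is checked at noon would next run at 09:30 the following day.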

Add or Edit an Import Processor Job

Do not make major changes to the procedure elements of an import processor job while it is online. For example, if you modify or remove metadata fields used by the job, errors will occur because the data in the batch no longer matches the job's settings.
When you edit processor job settings, run the client to view imported batches. In the client, you must refresh the batch list to see newly imported batches.

To add or edit an import processor job:

In the procedures pane on the left, select your procedure.
The configuration pages for the selected procedure are displayed on the right.
Open the Capture tab of your procedure.
In the Import Processor Jobs table, click the add icon to create a new job. To edit an existing job, select it and click the edit icon.

You can also copy an import processor job by selecting it, clicking the copy icon, and entering a new name when prompted. Copying a job allows you to quickly duplicate and modify it.

Select settings on the General Settings page.

Enter a name in the Import Job Name field and a prefix in the Batch Prefix field. Imported batches are named using this prefix, followed by a number that increments with each new batch.

In the Import Source field, specify a source for imported files: Email Source, Folder Source, or List File Source.

The source you select determines the settings shown on the Import Source Settings page.

For This Import Source Type	See
Email Source	Configure Email Message and Email Attachment Importing; Set Up Google Mail (OAuth) for Email Import
Folder Source	Configure File Importing from a Folder
List File Source	Configure List File Importing; Import Attachments During List File Importing

For Folder Source or List File Source, in the Import Frequency field, specify the time interval at which the import processor job checks for files to import. You can choose every 30 seconds, every 1, 5, 15, or 30 minutes, every 1 hour, or every day. If you specify every day, specify a time in the Time Hr and Min fields that display.
Complete other settings on the page, such as specifying a default batch status or priority to assign to batches when they are created.

On the Image Settings page, complete settings relating to how imported image files are formatted and validated.
1. Select the Preserve Image Files option to preserve image files and allow the import processor to import images without performing any image processing. Selecting this option automatically disables all the other options on this page. You won't be able to edit preserved image file documents (for example, append pages, delete pages, or move pages).
2. In the Image Down-Sample field, specify how to convert images, by retaining their image format (None), converting color to grayscale (Down-sample color to 8 bit grayscale), or converting either to black and white (Down-sample color or grayscale to black and white).
  Some of the common raster image file formats for which down sampling will be applied are:
  - JPG (.jpg, .jpeg)
  - PNG (.png)
  - TIFF (.tiff, .tif)
  - RAW (.cr2)
  - BMP (.bmp)
3. In the JPEG Image Quality field, specify a value between 0 and 99, where 99 is the highest quality and 85 is the default setting. This field does not apply to black and white images.
4. In the If Image Validation Fails field, specify an action if an image page fails decompression validation:
  - Fail the batch: The entire batch goes into the error state, and it is sent to the Content Capture Client.
  - Isolate the file: Creates a new batch that contains only the failed document. Other documents that didn't fail are processed successfully.
  - Delete the batch: This option is available if you've chosen Email Source as the import source on the General Settings page.
  - Skip the file: This option is available if you've chosen Email Source as the import source on the General Settings page.
5. In the Blank Page Byte Threshold for black and white and the Blank Page Byte Threshold for color or grayscale fields, enter a file size value (in bytes). Any image whose size is less than or equal to the threshold is considered a blank page and therefore deleted.
  
  Note:
  For black and white images (200 x 200 DPI), the recommended value is 1500. At this setting, a blank page and a page with a small amount of text are usually differentiated.
On the Document Profile page, configure settings related to assigning metadata to imported documents. See Configure Metadata Assignment During Import.
On the Import Source Settings page, configure source-specific settings.
- For an email source job, see Configure Email Message and Email Attachment Importing.
- For a folder source job, see Configure File Importing from a Folder.
- For a list file source job, see Configure List File Importing.
On the Post-Processing page, specify what happens after import processing completes. See Configure Post-Processing of an Import Processor Job.
Review settings on the Import Job Summary page and click Submit.

You can now test the import processor job you created. Set the frequency to every 30 seconds, then monitor the folder or email account to view processing activity.

Deactivate or Delete an Import Processor Job

When you delete an import job, the import processor no longer monitors for files at the specified frequency. If import jobs are online, they run at the interval specified in the Import Frequency field on the General Settings page of the job. You can temporarily stop the job from running (take it offline) or change a deactivated job to run again.

To deactivate or delete an import processor job:

In the procedures pane on the left, select your procedure.
The configuration pages for the selected procedure appear on the right.
Open the Capture tab.
In the Import Processor Jobs table, select the job and click the deactivate icon to take it offline first.

You can also deactivate or activate an import processor job by deselecting or selecting the Online field on the General Settings page.
Select the deactivated job and click the delete icon.
When prompted, click Yes to confirm that you want to delete this import processor job.

Configure Blank Page Detection in an Import Processor Job

Users often import image documents that contain blank pages. You can configure Content Capture to automatically detect and delete blank pages from documents. To do so, specify a threshold file size: any image whose size is less than or equal to this threshold is considered a blank page.

To configure blank page detection:

Add or edit an import processor job and then select the Image Settings page.
In the Blank Page Byte Threshold for black and white and the Blank Page Byte Threshold for color or grayscale fields, enter a file size value (in bytes). These fields apply only to imported image files, not to non-image files. If blank images should be preserved, select the Preserve Image Files option instead.
Click Submit to save the import processor job.

You can verify the result of this configuration in the client. Blank images will be post-processed as valid images.
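
The threshold rule can be sketched as follows. This is a minimal illustration of the "less than or equal to" comparison described above, not the product's detection code: the function names and the page tuples are hypothetical, and the color threshold of 8000 in the usage note is an arbitrary value picked for the example (only the black and white value of 1500 comes from this document).

```python
def is_blank_page(size_bytes, threshold_bytes):
    """A page counts as blank when its size is at or below the threshold."""
    return size_bytes <= threshold_bytes


def drop_blank_pages(pages, bw_threshold, color_threshold):
    """Return names of pages kept after blank page detection.

    `pages` is a list of (name, size_bytes, is_black_and_white) tuples,
    a hypothetical structure used only for this sketch.
    """
    kept = []
    for name, size_bytes, is_black_and_white in pages:
        # Black and white pages and color or grayscale pages use separate thresholds.
        threshold = bw_threshold if is_black_and_white else color_threshold
        if not is_blank_page(size_bytes, threshold):
            kept.append(name)
    return kept
```

With a black and white threshold of 1500, a 1500-byte black and white page would be dropped and a 1501-byte page would be kept, for example `drop_blank_pages([("p1", 1500, True), ("p2", 1501, True)], 1500, 8000)` keeps only `p2`.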

Configure Email Message and Email Attachment Importing

The import processor imports files attached to incoming email messages along with email message elements such as subject and body text, into Content Capture. Each imported email message becomes a batch, with email elements such as attachments, message body, or entire email message created as separate documents within the batch.

To configure email import processor job settings:

To add, edit, or copy an import processor job, select Email Source in the Import Source field on the General Settings page.
To configure email import, select the Import Source Settings page.

On the Email Accounts tab, configure the email server the import processor job should connect to.

Note:

If you update the email account's password or switch the type of security authentication, or if the administrator changes the authentication type, then you must reconfigure or re-validate your email import job.
If multi-factor authentication is enabled in your import job, then you must use the Microsoft exchange web service type: OAuth: Authorization code flow plus Exchange Online Keys.
Basic Authentication in Exchange Online is being deprecated for IMAP and EWS protocols. For more information, see Deprecation of Basic authentication in Exchange Online. We recommend switching to OAuth authentication as soon as possible.

Connection Protocol	Available Options	Value
Standard IMAP email server	Standard IMAP email server	Enter a DNS name or IP address. For example, `emailserver.example.com`. This email server must support TLS 1.2 or higher and accept connections via port 993.
Microsoft Exchange Web Services	Exchange Service Type: Basic Authentication. Email account credentials only.	Enter the Microsoft exchange web service URL in the following format: `https://<hostname>/ews/exchange.asmx`. For example, `https://outlook.office365.com/ews/exchange.asmx`.
Microsoft Exchange Web Services	Exchange Service Type: OAuth. Email account credentials plus Exchange Online keys	Microsoft Email Exchange Service URL field: Enter the exchange web service URL to use in the following format: `https://<hostname>/ews/exchange.asmx`. For example, `https://outlook.office365.com/ews/exchange.asmx`. App Client ID, App Client Secret, and App Tenant ID: To get the values for the client ID, client secret, and tenant ID, register a new application using the Azure portal. When you register the application, add the EWS.AccessAsUser.All API permission of Office 365 Exchange Online under Configuration permissions on the API permissions page under Manage. For details, see Register an application with the Microsoft identity platform. EWS.AccessAsUser.All in the App Scope: Enter the required values. Add user accounts and configure other job settings.
Microsoft Exchange Web Services	Exchange Service Type: OAuth: Authorization code flow plus Exchange Online Keys. Note: This option is compatible with multi-factor authentication.	Microsoft Email Exchange Service URL field: Enter the exchange web service URL to use in the following format: `https://<hostname>/ews/exchange.asmx`. For example, `https://outlook.office365.com/ews/exchange.asmx`. App Client ID, App Client Secret, and App Tenant ID: To get the values for the client ID, client secret, and tenant ID, register a new application using the Azure portal. When you register the application, add these permissions under Configuration permissions on the API permissions page under Manage: the User.Read API permission of Microsoft Graph and the EWS.AccessAsUser.All API permission of Office 365 Exchange Online. For details, see Register an application with the Microsoft identity platform. Add user accounts and configure other job settings.
Microsoft Graph API	Exchange Service Type: OAuth. Email account credentials plus Exchange Online keys	App Client ID, App Client Secret, and App Tenant ID: To get the values for the client ID, client secret, and tenant ID, register a new application using the Azure portal. When you register the application, add mail.readwrite API permission of Office 365 Exchange Online under Configuration permissions on the API permissions page under Manage. For details, see Register an application with the Microsoft identity platform. user.read and mail.readwrite in the App Scope: Enter the required values. Add user accounts and configure other job settings.
Microsoft Graph API	Exchange Service Type: OAuth: Authorization code flow plus Exchange Online Keys. Note: This option is compatible with multi-factor authentication.	App Client ID, App Client Secret, and App Tenant ID: To get the values for the client ID, client secret, and tenant ID, register a new application using the Azure portal. When you register the application, add these permissions under Configuration permissions on the API permissions page under Manage: the Mail.Read API permission of Microsoft Graph and the mail.readwrite API permission of Office 365 Exchange Online. For details, see Register an application with the Microsoft identity platform. Add user accounts and configure other job settings.
Google Mail (OAuth)	Google Mail (OAuth)	See Set Up Google Mail (OAuth) for Email Import

Configure the email accounts that the job should check for messages:
1. In the Email Accounts to Process table, click the add icon.
2. In the Add/Edit Email Account dialog, enter the required information to provide the job access to the email account.
  
  Note:
  If you selected OAuth: Authorization code flow plus Exchange Online Keys on the Import Source Settings page, in the Connection Protocol section, a URI is displayed as Redirect URL. You will need to add this URI under the Authentication section (Redirect > URIs) at the time of registering the application.
On the Message Filters tab, specify where and how to search for email messages and/or attachments.
1. In the Folders to Process field, enter one or more folders to search in the specified email accounts. The default value is the server's inbox. To specify multiple folders, separate them with a ; (semi-colon). To specify subfolders, include a path delimiter applicable for the mail server, such as a / (forward slash), as in folder/subfolder.
2. By default, Content Capture processes all emails in the specified folder unless a message filter is applied to the job. Optionally, in the Message Filters table, select the Enabled field for each email element to search, then enter characters to find in the Field Contains field.
  
  For example, to search for emails whose subject or message body contains the word payment, you would select Enabled for both search fields, include payment in each Field Contains entry, and select the Or search operator.
3. In the Search Operator field, select the search operator to use for the specified message filters: And (default) imports a message only if all search criteria match, while Or imports it if any search criterion matches.
On the Processing tab, specify how to process the email messages and their attachments. You can specify which information to include and the priority to assign to batches, based on the email priority.
1. Under Email Message Options, specify whether the message body file should be imported, its import format (text or EML), and whether it should be included when no attachments are present. Also specify whether the entire email message (including attachments) should be imported as an EML file.
2. If you want the attachments of attached emails to be processed to see if they match the masks and should be included as documents in the batch, select the Process attachments of attached emails check box.
  
  Note:
  Emails are scanned and processed for attachments up to 10 levels of nesting.
3. In the Include only attachments matching these mask(s) field, specify attachment files to include based on their file masks. You can enter multiple file masks separated by a comma or semi-colon. For example, you might include all PDF files (*.pdf).
4. In the Exclude attachments matching these mask(s) field, specify attachment files to exclude based on their file masks. You can enter multiple file masks separated by a comma or semi-colon.
5. Optionally, select Always post-process when attachments do not match mask(s) to always post-process emails whose attachments do not match the mask(s) specified in the Include only attachments matching these mask(s) and Exclude attachments matching these mask(s) fields. If this field is enabled and the attachments do not match the mask(s) specified, then the email import is considered unsuccessful and is post-processed according to the settings you specify under the Upon Failed Import field in the Post-Processing tab.
  
  Note:
  The Always post-process when attachments do not match mask(s) field is disabled when both Import message body file and Include when no attachments exist fields are enabled together.
6. Under Document Ordering, specify the order in which the elements (for example, message body and attachments) from an email message are to be ordered as documents in imported batches.
7. Under Include in Batch Note, select message elements (such as Received Date/Time, From Address, To Address, Subject, and Message Body).
8. Under Batch Priority, optionally assign a priority to each new batch based on its email priority (low, normal, or high). For example, enter 8 in the High field to assign high priority emails a batch priority of 8 in Content Capture. Emails not assigned a priority are considered normal priority.
On the Post-Processing tab, specify what happens to email messages after successful or failed import. You can delete messages, move them to a specified folder within the email account, or in the case of failed import, prevent messages from being deleted. For example, if the job is run regularly, you might prevent successfully imported emails from being imported again by moving them to a specified folder.
Complete other import processor job pages as described in Add or Edit an Import Processor Job.

You can now test the email import job. The import processor checks the configured email accounts for messages and searches folders for matching emails. If matching emails are found, the import processor creates a Content Capture batch and a document for each document being imported from the email message. Optionally, the import processor populates metadata fields with email metadata and deletes successfully imported messages, or it moves them to a folder.
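
The And and Or search operators on the Message Filters tab can be illustrated with a small sketch. It mirrors the "payment" example given in the steps above. The `message_matches` function and the dictionary layout are hypothetical, and the case-insensitive comparison is an assumption of this sketch, since this document does not state how the product compares characters.

```python
def message_matches(message, filters, operator="And"):
    """Decide whether a message passes the enabled message filters.

    `message` maps an email element (for example "subject") to its text.
    `filters` maps each enabled element to its Field Contains text.
    """
    if not filters:
        # With no filter applied, every email in the folder is processed.
        return True
    results = [
        needle.lower() in message.get(element, "").lower()  # assumed case-insensitive
        for element, needle in filters.items()
    ]
    # And requires every criterion to match; Or requires at least one.
    return all(results) if operator == "And" else any(results)
```

For a message whose subject is "Payment due" and whose body does not mention payment, filters on both subject and body for "payment" would match with the Or operator and fail with the And operator.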

Periodically, Content Capture fetches emails from the email account you configure on the Email Accounts tab. However, if you want to trigger an email import at a certain time, select an email import job in the Import Processor Jobs table and click the icon. This icon is disabled if your email import job is offline.

Configure List File Importing

With a list file import job, the import processor monitors an import folder for matching list files. It imports the document files, metadata values, and attachments identified in the list file.

Make sure the file import agent is up and running on your computer.

Note:

The list file source follows the standards defined by RFC 4180, so you can use field delimiters in a field value and include new lines and tabs. For instance, if you have a comma-delimited file, you can use a comma in a field value, provided the field is enclosed in quotes ("). Similarly, to set multi-line fields in a list file, the fields must be enclosed in quotes ("). You can map multi-line fields to any ALPHA_NUMERIC field.
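
To see what this quoting looks like in practice, the sketch below reads a small comma-delimited sample with Python's standard `csv` module, which handles this style of quoting. It only illustrates the file format and says nothing about how the import processor itself parses list files. The sample records and the `read_list_file` helper are made up for this example.

```python
import csv
import io


def read_list_file(text, delimiter=","):
    """Parse list file text into records, honoring quoted fields."""
    # newline="" keeps line breaks inside quoted fields intact.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter, quotechar='"')
    return [record for record in reader if record]


# Second field contains the delimiter; third field spans two lines.
sample = (
    'Doc1.TIF,"Acme, Inc.","First line\nSecond line"\r\n'
    'Doc2.TIF,Corp 2,Invoice\r\n'
)
records = read_list_file(sample)
```

Here the first record has exactly three fields: the quoted comma stays inside the second field, and the quoted line break stays inside the third.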

To configure list file import processor job settings:

Generate a list file, naming it using file name characters that are supported by the operating system on which the file import agent is running.

A list file is a text file containing records of delimited data that identify the names of files to be imported and their location. Each record may also include metadata values to assign to the document or to match against a database file. The list file may also contain one or more attachment records to be imported for a document. See Import Attachments During List File Importing.
Add, edit, or copy an import processor job, selecting List File Source in the Import Source field on the General Settings page.
On the General Settings page, complete the Default Locale, Encoding, and Default Date Format fields.

These fields enable the import processor to correctly read list files based on your locale.
On the Document Profile page, map capture metadata fields to list file values, identifying field position in the list file using the Field 1 - Field n metadata attributes. You can also map system level fields, as described in Configure Metadata Assignment During Import.

For example, to map a Customer ID metadata field with the first field in each record in the list file, you would select the Customer ID field in the Metadata Field Mappings table, click Edit, and select Field 1 in the Metadata Attributes field of the Metadata Field Mappings dialog.
Complete the settings on the Import Source Settings page.
1. In the File Mask(s) field, specify the type of files to import by entering an extension. Specify *.* to import all files. Separate multiple masks with a semi-colon (;) character.
2. To monitor and import list files from subfolders within the specified folder, select the Process subfolders field.
3. From the Create a New Batch options, specify whether to create a new batch for each list file or folder imported. When creating a batch per folder, each subfolder processed will create a new batch.
4. The Maximum Files Imported Per Batch field is enabled when you choose the Per list file or Per folder option. For both options, a maximum of 500 files can be imported per batch, which is also the default value. If the number of files is greater than the maximum files per batch value, then multiple batches are created.
  Note:
  - For the Per list file option, suppose the Maximum Files Imported Per Batch field is set to 500 and two list files are waiting: the first with 100 list items and the second with 600. Three batches are created: the first contains 100 documents, the second 500 documents, and the third the remaining 100 documents.
  - For the Per folder option, suppose the Maximum Files Imported Per Batch field is set to 500 and the same folder holds two list files: the first with 100 list items and the second with 600. Two batches are created: the first contains 500 documents (100 from the first list file and the first 400 from the second), and the second contains the remaining 200 documents from the second list file.
  - It's recommended that you modify your existing list file import jobs if you configured any in previous releases and set the Maximum Files Imported Per Batch field value for them as well.
5. In the Field Delimiter field, specify how fields are delimited in the list file. Use a delimiter that will not be used in the list file metadata.
  
  For example, enter | (pipe), , (comma) or ~ (tilde).
6. In the Maximum Fields Per Document field, specify a maximum number of fields in the list file to map to metadata fields.
7. In the Document File Field Position field, enter the list file field position of document file names and locations. For example, enter 1 if the first field in each record in the list file identifies a document file path and name.
  
  Note:
  
  If the specified document file field position does not contain a path to the file to be imported, it is assumed that the file is located in the same folder as the list file being processed.
8. In the List File Post-Processing fields, specify how to change list files after import so they are not imported again if the job is run regularly. In other words, you must change the list file names so that they no longer match the File Mask(s) specified for the job. You can delete them, change their extension, or add a prefix to them. You can also move list files to another location of your choice.
9. In the Document File Post-Processing fields, specify if you want to delete document files and their attachments from their specified location after successful import. You can choose to leave the files as they are, that is, do nothing. You can also move document files to another location of your choice.
Complete other import processor job pages as described in Add or Edit an Import Processor Job.
Test the list file import job.

When the job is activated at the specified frequency, the import processor checks the folder for list files matching the specified file mask(s), imports the document files and their attachments identified in the list file, optionally populates metadata fields with list file data, and deletes or renames the list file.
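
The batch counts in the Maximum Files Imported Per Batch notes can be reproduced with simple arithmetic. The sketch below is a worked version of those two examples, assuming documents are taken in list file order. The function names are hypothetical and the code does not represent the product's batching logic.

```python
def batches_per_list_file(list_file_sizes, max_per_batch=500):
    """Batch sizes when each list file starts a new batch."""
    batches = []
    for remaining in list_file_sizes:
        while remaining > 0:
            take = min(remaining, max_per_batch)
            batches.append(take)
            remaining -= take
    return batches


def batches_per_folder(list_file_sizes, max_per_batch=500):
    """Batch sizes when all list files in one folder share batches."""
    remaining = sum(list_file_sizes)
    batches = []
    while remaining > 0:
        take = min(remaining, max_per_batch)
        batches.append(take)
        remaining -= take
    return batches
```

With list files of 100 and 600 items and a maximum of 500, `batches_per_list_file([100, 600])` gives `[100, 500, 100]` and `batches_per_folder([100, 600])` gives `[500, 200]`, matching the notes.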

Import Attachments During List File Importing

When processing a list file, the import processor imports the document files, metadata values, and attachments identified in the list file. The format to define an attachment within the list file is:

@Attachment[delimiter][Attachment Type][delimiter][Attachment File]

or

@Support[delimiter][Attachment Type][delimiter][Attachment File]

Usage of the @Attachment command is recommended.

When the import processor processes an attachment record, it imports the attachment for the document specified in the previous record. Therefore, the attachment must not be specified as the first record in the list file. Specifying the attachment as the first record will cause an error.

Example 11-1:

Doc1.TIF|Corp 1|Invoice
@Attachment|PO|PO1.TIF
Doc2.TIF|Corp 2|Invoice

In the above example, PO1.TIF is imported as a document attachment for the Doc1.TIF document. Multiple attachment records can be specified for a document.

Example 11-2:

Doc1.TIF|Corp 1|Invoice
@Attachment|PO|PO1.TIF
@Attachment|Contract|Contract1.PDF
@Attachment|Contract|Amendment1.PDF
Doc2.TIF|Corp 2|Invoice

If the attachment file is a multiple page TIFF, each page is imported as a separate batch item and assembled into an attachment.
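
The rule that an attachment record belongs to the document in the previous record can be sketched as a small parser. This is an illustration of the record format only: `parse_records` is a hypothetical helper, it uses a plain split on the delimiter (so it ignores quoted fields), and treating the `@Attachment` and `@Support` keywords as case-sensitive is an assumption of this sketch.

```python
def parse_records(lines, delimiter="|"):
    """Group list file records into documents with their attachments."""
    documents = []
    for line in lines:
        fields = line.split(delimiter)
        if fields[0] in ("@Attachment", "@Support"):
            if not documents:
                # An attachment needs a preceding document record.
                raise ValueError("attachment record cannot be the first record")
            _, attachment_type, attachment_file = fields[:3]
            documents[-1]["attachments"].append((attachment_type, attachment_file))
        else:
            documents.append({"file": fields[0], "fields": fields, "attachments": []})
    return documents


# The records from Example 11-2.
docs = parse_records([
    "Doc1.TIF|Corp 1|Invoice",
    "@Attachment|PO|PO1.TIF",
    "@Attachment|Contract|Contract1.PDF",
    "@Attachment|Contract|Amendment1.PDF",
    "Doc2.TIF|Corp 2|Invoice",
])
```

In this run, `Doc1.TIF` ends up with three attachments and `Doc2.TIF` with none. Passing a list whose first record is an attachment raises `ValueError`, mirroring the error described above.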

Configure File Importing from a Folder

With a folder import job, the import processor monitors an import folder and imports all files it finds with a specified file mask.

Make sure the file import agent is up and running on your computer.

To configure folder import processor job settings:

Add, edit, or copy an import processor job and select Folder Source in the Import Source field on the General Settings page.
Configure settings on the Import Source Settings page.
1. In the File Mask(s) field, specify the type of files to import by entering an extension (*.tif or *.pdf, for example). Specify *.* to import all files. Separate multiple masks with a semi-colon (;) character.
2. If you want import processor to monitor and import files from subfolders within this folder, select the Process subfolders field.
3. In the Create a New Batch field, specify whether to create a new batch with each file imported or with each folder imported. When a batch is created per folder, batches are created for a folder's subfolders as well.
  
  When you choose the Per folder option, the Maximum Files Imported Per Batch is enabled. Enter a number not exceeding 500.
4. In the Ready File field, optionally enter a file name that must exist in the folder (and each subfolder, if applicable) before the folder is processed. This option delays the processing of a folder until the ready file appears. When processing is completed, the ready file is deleted.
5. In the File Processing Order fields, specify the primary and secondary sort type and order in which files in the import folder are processed. Sort type options are None (no sort type), File Name, File Extension, and File Modified Date. Sort order options are Ascending and Descending.
6. In the File Post-Processing fields, specify how to change files after import so they are not imported again if the job is run regularly. For this, you must change the file names so that they no longer match the File Mask(s) specified for the job. You can delete files, change their extension, or add a prefix to them. You can also perform cleanup on processed subfolders by selecting the Delete processed subfolder if empty field.
Complete other import processor job pages.
Test the folder import job to make sure it is activated at the chosen frequency.

The import processor checks the folder for files matching the file mask(s). If it finds matches, it imports the files and creates new batches, populates metadata fields, and deletes or renames the files as you specified.
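
The File Mask(s) and File Processing Order settings can be pictured with the sketch below. It uses Python's `fnmatch` wildcards as a stand-in for the product's mask matching, which this document does not specify in detail. The helper names are hypothetical, the case-insensitive matching is an assumption, and the File Modified Date sort type is left out so the example needs no real files.

```python
import os
from fnmatch import fnmatch


def matches_masks(file_name, masks):
    """True if the file name matches any semi-colon separated mask."""
    patterns = [mask.strip() for mask in masks.split(";") if mask.strip()]
    # Lowercasing both sides makes the match case-insensitive (an assumption).
    return any(fnmatch(file_name.lower(), pattern.lower()) for pattern in patterns)


SORT_KEYS = {
    "File Name": lambda name: name.lower(),
    "File Extension": lambda name: os.path.splitext(name)[1].lower(),
}


def order_files(names, primary, secondary=None):
    """Order names by (sort type, descending) pairs; primary wins ties last."""
    ordered = list(names)
    # Sorting by the secondary key first works because Python's sort is stable.
    for sort_type, descending in filter(None, (secondary, primary)):
        ordered.sort(key=SORT_KEYS[sort_type], reverse=descending)
    return ordered
```

For example, with the masks `*.tif;*.pdf`, a file named `Scan01.TIF` is selected and `notes.txt` is not. Ordering `b.tif`, `a.pdf`, `a.tif`, and `c.pdf` by File Extension ascending and then File Name descending yields `c.pdf`, `a.pdf`, `b.tif`, `a.tif`.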

Configure Metadata Assignment During Import

On the Document Profile page, you can configure how import job values are mapped to Content Capture metadata fields during import processing.

To configure metadata assignment during import:

In an import processor job, select the Document Profile page.
In the Default Document Profile field, specify a document profile to assign to imported documents. The selected profile classifies the document. For example, if users open the batch in the client, this document profile is selected.
In the Metadata Field Mappings table, map metadata fields to the values specific to the selected import source.

In the Metadata Field column, select a capture field to populate, and click Edit. Regardless of the default document profile selected, all metadata fields in the procedure are available for mapping.
Complete settings in the Metadata Field Mappings dialog.
1. Select a metadata value for the import source in the Metadata Attributes field. To populate with a default value, select Default value in this field, then specify the value in the Default Value field.
2. In a folder import job, select from folder, file, or path-related attributes listed in the System-Provided Metadata Fields for a Folder Import Job table in System-Provided Metadata Fields.
3. In a list file import job, select from the list file or actual document listed in the System-Provided Metadata Fields for a List File Import Job table in System-Provided Metadata Fields.
4. In an email import job, select from email message-related attributes listed in the System-Provided Metadata Fields for an Email Import Job table in System-Provided Metadata Fields.
5. In any import job, select from common system attributes listed in the System-Provided Metadata Fields Common Across All Import Jobs table in System-Provided Metadata Fields.
Map other metadata fields in the Metadata Field Mappings table as needed.
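
For a list file job, the effect of these mappings can be sketched as a lookup from each metadata field to either a list file position (Field 1 - Field n) or a default value. The dictionary layout and the `assign_metadata` helper below are hypothetical and exist only to illustrate the idea. The field names and sample values are made up.

```python
def assign_metadata(record_fields, mappings):
    """Build metadata values for one list file record.

    `mappings` maps a metadata field name to a dict with an "attribute"
    key ("Field n" or "Default value") and, for defaults, a "default" key.
    """
    values = {}
    for metadata_field, mapping in mappings.items():
        attribute = mapping["attribute"]
        if attribute == "Default value":
            values[metadata_field] = mapping["default"]
        elif attribute.startswith("Field "):
            position = int(attribute.split()[1])  # "Field 1" is the first field
            values[metadata_field] = record_fields[position - 1]
        else:
            raise ValueError(f"unsupported attribute in this sketch: {attribute}")
    return values


mappings = {
    "Customer ID": {"attribute": "Field 1"},
    "Document Type": {"attribute": "Default value", "default": "Invoice"},
}
```

Calling `assign_metadata(["C-1001", "Corp 1"], mappings)` would assign `C-1001` to Customer ID and the default `Invoice` to Document Type.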

Configure Post-Processing of an Import Processor Job

Post-processing settings let you control what happens after import processing completes a batch.

To configure post-processing for an import processor job:

Add or edit an import processor job and then select the Post-Processing page.
In the Batch Processor field, select the next step--what happens after batches are created and import processing is complete. Selecting None allows the batch to be immediately available to the client.
In the Batch Processor Job field, select the recognition, conversion to TIFF or PDF, asset lookup, XML transformation, taxonomy, OCR, conditional assignment, or external processor job to run. You can make this choice only if you selected a recognition processor, a conversion processor, an asset lookup processor, or an XML transformation processor in the previous step.
Click Submit to save your changes.