Extract Document Information with a Document Understanding Action

You can optionally store and access the invoices, receipts, passports, driver's licenses, and healthcare IDs from which to extract data in Oracle Cloud Infrastructure Object Storage. See Object Storage.

Capabilities

Oracle Cloud Infrastructure Document Understanding is an AI service that enables you to extract text, tables, and other key data from PDF, PNG, and JPEG files through APIs and command line interface tools. Oracle Cloud Infrastructure Document Understanding automates repetitive business processing tasks with prebuilt AI models and customizes document extraction to satisfy your industry-specific needs. Oracle Integration supports using Oracle Cloud Infrastructure Document Understanding in an integration with the document understanding action.

Support is also provided in the mapper for extracting data in multiple languages.

The Body element under Request Wrapper includes subelements for Compartment Id, Language, and Document.

See AI Document Understanding.

Prerequisites

See Prerequisites for information about the prerequisites you must satisfy in the Oracle Cloud Infrastructure Console.

Invoke Oracle Cloud Infrastructure Document Understanding from an Integration

Add an OCI Document Understanding action to an integration in either of the following ways:
- On the side of the canvas, click Actions and drag the OCI Document Understanding action to the appropriate location.
- Click at the location where you want to add the document understanding action, then select OCI Document Understanding.
Enter a name and optional description.
From the Select Categories list, select the task you want to perform.
- Pre-trained Models
- Processor Job

Pre-trained Models

Follow these steps to configure a pretrained model action for use.

From the Action list, select the action to perform, then click Continue.

Action	Description	See...
Analyze document (with confidence score)	Extracts an element from a document and provides a confidence score on its certainty. A confidence score is a numerical value between 0 and 1 indicating how certain the model is about the accuracy of the extracted piece of information, such as a text field, key-value pair, or element within a document. For example, extracting the date from an invoice may yield a confidence score of 0.86, meaning the model is 86% confident that the extracted value is correct. Confidence scores help you identify which elements were extracted with high certainty, and which elements may require human review.	See Step 2 to configure this selection in the wizard.
Analyze document	Extracts information from a document. This action accepts an inline document in base64 format or retrieves the document from object storage, performs the extraction, and returns an immediate response.	See Step 2 to configure this selection in the wizard.
Table extraction	Accepts a file that contains tables and extracts the elements in tabular format. For example, if a receipt in PDF format contains a table that includes the taxes and total amount, the table is identified and the table structure is extracted. See Table Extraction.	See Step 3 to configure this selection in the wizard.
Text extraction	Identifies the plain text in a file and extracts and returns the results in words and lines. With this action, you don't select a specific document type such as an invoice or driver's license. The document you provide is processed as plain text. This action accepts an inline document in base64 format or retrieves the document from object storage and returns the words and lines.	See Step 3 to configure this selection in the wizard.
Document classification	Classifies and returns the type of document. For example, if you send a pay slip, check, or bank statement, it returns pay slip, check, or bank statement, respectively, as the document type. This action accepts an inline document in base64 format or retrieves the document from object storage and returns the document type in an immediate response. See Document Classification.	See Step 3 to configure this selection in the wizard.
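
The confidence score described for the Analyze document (with confidence score) action can drive a simple review rule downstream of the integration. The following is a minimal sketch, not part of the product: the `ExtractedField` class, the field names, the sample values, and the 0.8 threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    """One extracted element. The attribute names are illustrative only."""
    name: str
    value: str
    confidence: float  # a value between 0 and 1


def split_by_confidence(fields, threshold=0.8):
    """Return (accepted, needs_review) lists based on a confidence threshold."""
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


# Hypothetical sample data, not real service output.
fields = [
    ExtractedField("InvoiceDate", "2024-01-15", 0.86),
    ExtractedField("VendorName", "Example Vendor", 0.97),
    ExtractedField("Total", "1204.50", 0.42),
]
accepted, needs_review = split_by_confidence(fields, threshold=0.8)
print([f.name for f in accepted])      # ['InvoiceDate', 'VendorName']
print([f.name for f in needs_review])  # ['Total']
```

The threshold is a business decision. A stricter value sends more elements to human review.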

If you selected the Analyze document or Analyze document (with confidence score) action:

From the Compartment list, select the Oracle Cloud Infrastructure compartment in which Oracle Integration is installed.

Specify the document type and input source as described in the following table, then click Continue.

Element

Description

Document Type

Select the type of document from which to extract information. Key value extraction can identify values for predefined keys in a document.

Invoice: Identifies values for predefined keys in an invoice. For example, if an invoice includes a vendor name, total, and invoice ID, Oracle Cloud Infrastructure Document Understanding can identify these values and return them as a key-value pair.
Receipt: Identifies values for predefined keys in a receipt. For example, if a receipt includes a merchant name, merchant address, or merchant phone number, Oracle Cloud Infrastructure Document Understanding can identify these values and return them as a key-value pair.
Driver license: Identifies values for predefined keys in a US or UK driver's documentation. For example, if a driver ID includes an issue date, region, and expiry date, Oracle Cloud Infrastructure Document Understanding can identify these values and return them as a key-value pair.
Passport: Identifies values for predefined keys in an MRZ-supported passport. For example, if a passport includes nationality and date of issue, Oracle Cloud Infrastructure Document Understanding can identify these values and return them as a key-value pair.
Healthcare Insurance ID: Identifies values for predefined keys in a healthcare insurance ID card and returns these values as a key-value pair.

Your selection is passed into the mapper for you to map.

Input source

Select the source location from which to get the input document.

Inline: Accepts the document through base64 format in the request payload of the mapper.
Object storage: Reads the document from the object storage bucket.
- Input storage bucket: Select the storage bucket from which to read the file.
- Namespace: Select the namespace for the object storage bucket.
After completing configuration in the wizard, you must specify the file name to read from the object storage bucket in the mapper.

If you selected the Table extraction, Text extraction, or Document classification action:

Select the following information, then click Continue.

Element

Description

Compartment Name

Select the Oracle Cloud Infrastructure compartment in which Oracle Integration is installed.

Input source

Select the source location from which to get the input document.

Inline: Accepts the document through base64 format in the request payload of the mapper.
Object storage: Reads the document from the object storage bucket and returns the response based on the selected document type.
- Input storage bucket: Select the storage bucket from which to read the file.
- Namespace: Select the namespace for the object storage bucket.
After completing configuration in the wizard, you must specify the file name to read from the object storage bucket in the mapper.

On the Summary page, click Finish.

Processor Job

Follow these steps to configure a processor job action for use.

Select an action to perform, then click Continue.

Action	Description	See...
Create a processor job	Creates a processor job. This action reads a file from object storage and provides a response with a processor job ID.	Step 2 to configure this selection in the wizard.
Get a processor job status	Gets the status of the processor job. This action takes the processor job ID as a parameter from the Create a processor job action and returns the status of the processor job.	Step 3. The Summary page is displayed and no further wizard configuration is required.
Cancel a processor job	Cancels the processor job submitted through a Create a processor job action.	Step 3. The Summary page is displayed and no further wizard configuration is required.

Select the following information, then click Continue.

Element	Description
Compartment Name	Select the Oracle Cloud Infrastructure compartment in which Oracle Integration is installed.
Output storage bucket	Specify the output storage location for the file.
Namespace	Select the namespace for the object storage bucket name.

On the Summary page, click Finish.
Upon completing configuration in the wizard, you must perform specific tasks in the mapper based on the action and file source you configured. The following sections provide examples of some of the above selections.

Configure the Mapper to Read the Inline Document for the Analyze Document Action

You send the data in base64 format for any document type that you selected in the wizard (for example, invoice, receipt, passport, driver's license, or healthcare insurance ID).

Expand the target Document element.
Right-click Data, and select Create target node.
Click Functions.
Click Design View in the Expression Builder.
In the Functions section, expand Advanced, and drag encodeReferenceToBase64 into the Expression Builder. This step is required for all document types (invoice, receipt, driver's license, passport, and healthcare insurance ID).
```
oraext:encodeReferenceToBase64 ( )
```
Drag Stream Reference from the Sources section into the Expression Builder.
```
oraext:encodeReferenceToBase64 (/nssrcmpr:execute/ns19:streamReference )
```
Exit the mapper.
Add a log action and select fields to log to the activity stream.
The document understanding action is now configured.
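
To illustrate what sending a document "in base64 format" means, here is a minimal sketch in Python. It is not the implementation of `oraext:encodeReferenceToBase64`, which the mapper runs for you. The `encode_document` helper and the sample bytes are assumptions used only to show the encoding itself.

```python
import base64


def encode_document(raw_bytes: bytes) -> str:
    """Return the base64 text form of a document's raw bytes."""
    return base64.b64encode(raw_bytes).decode("ascii")


# Stand-in bytes. A real document would be the content of a PDF, PNG, or JPEG file.
sample = b"%PDF-1.4 sample"
encoded = encode_document(sample)

# Decoding restores the original bytes, so the document survives the round trip.
decoded = base64.b64decode(encoded)
print(decoded == sample)  # True
```

Base64 turns binary content into plain ASCII text, which is why it can travel inside the request payload of the mapper.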

Configure the Mapper to Read the Document from a Storage Bucket for the Analyze Document Action

In the mapper, you must specify the name of the file to read from the object storage bucket. This applies to any document type that you selected in the wizard (for example, invoice, receipt, passport, driver's license, or healthcare insurance ID).

Drag the file name from the Sources section to the Object Name target element.
If you also want to override other values you defined in the wizard, you can specify values for the Compartment Id, Bucket Name, and Namespace Name target elements.
When complete, exit the mapper.
The document understanding action is now configured.
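
The relationship between wizard selections and mapper values can be pictured as defaults plus overrides. The sketch below is only a mental model, assuming that a value mapped in the mapper takes precedence over the wizard value. The `resolve_location` function and the key names (`ObjectName`, `BucketName`, `NamespaceName`) are hypothetical and do not represent the product's internal behavior.

```python
def resolve_location(wizard_values: dict, mapper_values: dict) -> dict:
    """Combine wizard defaults with mapper values; mapped values take precedence."""
    if not mapper_values.get("ObjectName"):
        # The file name is always supplied in the mapper, never in the wizard.
        raise ValueError("Object Name must be mapped for the object storage source")
    resolved = dict(wizard_values)
    # Only non-empty mapper values override what was chosen in the wizard.
    resolved.update({key: value for key, value in mapper_values.items() if value})
    return resolved


# Hypothetical values for illustration.
wizard = {"BucketName": "invoices-in", "NamespaceName": "mytenancy"}
mapped = {"ObjectName": "invoice-001.pdf", "BucketName": "invoices-archive"}
resolved = resolve_location(wizard, mapped)
print(resolved["BucketName"])     # invoices-archive
print(resolved["NamespaceName"])  # mytenancy
```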

Configure the Mapper to Read the Inline Document for the Analyze Document (with confidence score) Action

You send the data in base64 format in the request mapper for any document type that you selected in the wizard (for example, invoice, receipt, passport, driver's license, or healthcare insurance ID).

Expand the target Document element.
Right-click Data, and select Create target node.
Click Functions.
Click Design View in the Expression Builder.
In the Functions section, expand Advanced, and drag encodeReferenceToBase64 into the Expression Builder. This step is required for all document types (invoice, receipt, driver's license, passport, and healthcare insurance ID).
```
oraext:encodeReferenceToBase64 ( )
```
Drag Stream Reference from the Sources section into the Expression Builder.
In the response mapper, expand the source and target elements. For this example, the payload is the contents of a healthcare ID.
Map the source and target elements. Note the confidence elements.

The document understanding action is now configured.

Configure the Mapper to Read the Inline Document for the Document Classification Action

You send the document data in base64 format for the Document classification action.

Expand the target Document element.
Right-click Data, and select Create target node.
Click Functions.
Click Design View in the Expression Builder.
In the Functions section, expand Advanced, and drag encodeReferenceToBase64 into the Expression Builder.
```
oraext:encodeReferenceToBase64 ( )
```

Drag Stream Reference from the Sources section into the Expression Builder.

```
oraext:encodeReferenceToBase64 (/nssrcmpr:execute/nssrcmpr:attachments/ns19:attachment/ns19:attachmentReference )
```

The Sources, Mapping canvas, and Target sections are shown. The source Stream Reference element is mapped to the target Data element. The Expression Builder shows the encodeReferenceToBase64 function configured with the stream reference.

Exit the mapper.
Add a log action and select fields to log to the activity stream.
The document understanding action is now configured.

Configure the Mapper to Read the Document from a Storage Bucket for the Document Classification Action

You must select the file name to read from the object storage bucket in the mapper.

Drag the file name from the Sources section to the Target Object Name element.
If you also want to override other values you defined in the wizard, you can specify values for the Compartment Id, Bucket Name, and Namespace Name target elements.
Exit the mapper.
Add a log action and select fields to log to the activity stream.
The document understanding action is now configured.

Configure the Mapper to Read the Inline Document for the Table Extraction Action

You send the document data in base64 format for the Table extraction action.

Expand the target Document element.
Right-click Data, and select Create target node.
Click Functions.
Click Design View in the Expression Builder.
In the Functions section, expand Advanced, and drag encodeReferenceToBase64 into the Expression Builder.
```
oraext:encodeReferenceToBase64 ( )
```
Drag Stream Reference from the Sources section into the Expression Builder.
In the response mapper, expand the source and target elements.
Map the source and target elements. For this example, the payload is in tabular format.
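
Once the tabular payload reaches a downstream system, it usually needs to be rebuilt into rows and columns. The following sketch shows one way to do that. The payload shape (`rowCount`, `columnCount`, `headerRows`, `bodyRows`, `footerRows`, and cells with `rowIndex`, `columnIndex`, and `text`) is an assumption for illustration. Check the actual elements shown in your response mapper before relying on it.

```python
def table_to_grid(table: dict) -> list:
    """Place each cell's text at its (rowIndex, columnIndex) position."""
    rows, cols = table["rowCount"], table["columnCount"]
    grid = [[""] * cols for _ in range(rows)]
    for section in ("headerRows", "bodyRows", "footerRows"):
        for row in table.get(section, []):
            for cell in row["cells"]:
                grid[cell["rowIndex"]][cell["columnIndex"]] = cell["text"]
    return grid


# Hypothetical payload modeled on a receipt table with taxes and a total.
sample_table = {
    "rowCount": 3,
    "columnCount": 2,
    "headerRows": [{"cells": [
        {"rowIndex": 0, "columnIndex": 0, "text": "Item"},
        {"rowIndex": 0, "columnIndex": 1, "text": "Amount"},
    ]}],
    "bodyRows": [
        {"cells": [
            {"rowIndex": 1, "columnIndex": 0, "text": "Taxes"},
            {"rowIndex": 1, "columnIndex": 1, "text": "4.50"},
        ]},
        {"cells": [
            {"rowIndex": 2, "columnIndex": 0, "text": "Total"},
            {"rowIndex": 2, "columnIndex": 1, "text": "54.50"},
        ]},
    ],
}
grid = table_to_grid(sample_table)
for row in grid:
    print(row)
```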

Configure the Mapper to Read the Inline Document for the Text Extraction Action

You send the document data in base64 format for the Text extraction action.

Expand the target Document element.
Right-click Data, and select Create target node.
Click Functions.
Click Design View in the Expression Builder.
In the Functions section, expand Advanced, and drag encodeReferenceToBase64 into the Expression Builder.
```
oraext:encodeReferenceToBase64 ( )
```
Drag Stream Reference from the Sources section into the Expression Builder.
```
oraext:encodeReferenceToBase64 (/nssrcmpr:execute/ns19:streamReference )
```
Add a log action and select fields to log to the activity stream.
The document understanding action is now configured.
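
The Text extraction action returns its results as words and lines. The sketch below shows how the two could relate in a consumer of that response, assuming each line carries a `text` value and a list of `wordIndexes` pointing into the page's `words` list. Those element names and the sample page are assumptions for illustration, not a statement of the response schema.

```python
def words_in_line(page: dict, line: dict) -> list:
    """Return the text of each word that the given line refers to."""
    return [page["words"][index]["text"] for index in line["wordIndexes"]]


def page_text(page: dict) -> str:
    """Join the extracted lines of a page into one plain-text string."""
    return "\n".join(line["text"] for line in page.get("lines", []))


# Hypothetical page payload.
page = {
    "words": [
        {"text": "Invoice"}, {"text": "1001"},
        {"text": "Total"}, {"text": "54.50"},
    ],
    "lines": [
        {"text": "Invoice 1001", "wordIndexes": [0, 1]},
        {"text": "Total 54.50", "wordIndexes": [2, 3]},
    ],
}
print(words_in_line(page, page["lines"][0]))  # ['Invoice', '1001']
print(page_text(page))
```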

Configure the Mapper for the Create a Processor Job Action

This action returns a processor job ID. The input from the storage bucket is sent through the request mapper. The output is sent to the object storage bucket.

Expand the Target Request Wrapper to specify the target values. Most values, such as Display Name, Processor Config, and Document Type (for example, an invoice), are optional. However, the following elements are mandatory:
1. Feature Type: Specify the document analysis type (for example, TEXT_EXTRACTION or others). For a complete list of available types, see DocumentFeature Reference.
2. Bucket Name (under Input Location): Specify the input object storage bucket from which to get the file.
3. Namespace (under Input Location): Specify the namespace for the input object storage bucket.
4. Prefix (under Output Location): Specify the prefix. The folder in object storage is created with the prefix that you specify. If you also want to override values you defined in the wizard, you can specify values for the Bucket Name and Namespace Name target elements under Output Location.
The response mapper returns a job ID (id element) and a job status (Percent Complete element).
Exit the response mapper.
The document understanding action is now configured.
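
The four mandatory elements listed above can be treated as a checklist before a request is sent. The following sketch checks a request for missing values. The `missing_mandatory` function and the flat key names (`FeatureType`, `InputBucketName`, `InputNamespace`, `OutputPrefix`) are hypothetical stand-ins for the mapper's target elements, used only for illustration.

```python
# Hypothetical flat names for the mandatory target elements.
MANDATORY = ("FeatureType", "InputBucketName", "InputNamespace", "OutputPrefix")


def missing_mandatory(request: dict) -> list:
    """Return the mandatory element names that are absent or empty."""
    return [name for name in MANDATORY if not request.get(name)]


# Optional values such as DisplayName do not affect the check.
request = {
    "FeatureType": "TEXT_EXTRACTION",
    "InputBucketName": "invoices-in",
    "InputNamespace": "mytenancy",
    "DisplayName": "nightly-invoice-run",
}
print(missing_mandatory(request))  # ['OutputPrefix']

request["OutputPrefix"] = "results"
print(missing_mandatory(request))  # []
```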

Configure the Mapper for the Get a Processor Job Status Action

This action takes the job ID returned in the response mapper of the Create a processor job action. This action uses that value to get the status of the processor job. You configure this use case as follows:

Configure an initial OCI Document Understanding action with the Create a processor job action. This action returns the job ID.
Configure a second OCI Document Understanding action with the Get a processor job status action. This action gets the status of the processor job.

Map the Sources Create A Processor Job id element to the Target Processor Job id element.
Exit the mapper.
Add a log action and select fields to log to the activity stream.
The document understanding action is now configured.
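
A processor job runs asynchronously, so a caller typically checks its status more than once. The sketch below shows that polling pattern with a stubbed status function. The state names (`IN_PROGRESS`, `SUCCEEDED`, `FAILED`, `CANCELED`), the `lifecycleState` and `percentComplete` keys, and the `wait_for_job` helper are assumptions for illustration. In an integration, a comparable loop would be built from integration actions rather than Python.

```python
import time

# Assumed terminal states; confirm the actual values in your response mapper.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELED"}


def wait_for_job(get_status, job_id, interval_seconds=0.0, max_polls=10):
    """Poll get_status(job_id) until a terminal state or the poll limit is reached."""
    for _ in range(max_polls):
        status = get_status(job_id)
        if status["lifecycleState"] in TERMINAL_STATES:
            return status
        time.sleep(interval_seconds)
    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} polls")


# Stub that stands in for the Get a processor job status action.
_responses = iter([
    {"lifecycleState": "IN_PROGRESS", "percentComplete": 40},
    {"lifecycleState": "IN_PROGRESS", "percentComplete": 80},
    {"lifecycleState": "SUCCEEDED", "percentComplete": 100},
])


def fake_get_status(job_id):
    return next(_responses)


final = wait_for_job(fake_get_status, "job-123")
print(final["lifecycleState"], final["percentComplete"])  # SUCCEEDED 100
```

The poll limit keeps the loop from running indefinitely if the job never reaches a terminal state.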

Configure the Mapper for the Cancel a Processor Job Action

This action takes the job ID returned in the response mapper of the Create a Processor Job action. This action uses that value to cancel the processor job. You configure this use case as follows:

Configure an initial OCI Document Understanding action with the Create a processor job action. This action returns the job ID.
Configure a second OCI Document Understanding action with the Get a processor job status action. This action gets the status of the processor job.
Configure a third OCI Document Understanding action with the Cancel a processor job action. This action cancels the processor job. The processor job must be in progress to be canceled.

Map the Sources Create A Processor Job id element to the Target Processor Job id element.
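
The three-action sequence above amounts to checking the job status and canceling only when the job is in progress. The following sketch expresses that guard with stubbed functions. The `IN_PROGRESS` state name, the `lifecycleState` key, and the `cancel_if_in_progress` helper are assumptions for illustration.

```python
def cancel_if_in_progress(job_id, get_status, cancel) -> bool:
    """Cancel the job only when its status reports that it is in progress."""
    state = get_status(job_id)["lifecycleState"]
    if state == "IN_PROGRESS":
        cancel(job_id)
        return True
    return False


# Stubs that stand in for the status and cancel actions.
canceled_jobs = []
first = cancel_if_in_progress(
    "job-123", lambda _id: {"lifecycleState": "IN_PROGRESS"}, canceled_jobs.append
)
second = cancel_if_in_progress(
    "job-456", lambda _id: {"lifecycleState": "SUCCEEDED"}, canceled_jobs.append
)
print(first, second)   # True False
print(canceled_jobs)   # ['job-123']
```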

A design-time to runtime use case using the document understanding action is provided. See Extract Content from an Invoice PDF Document with a Document Understanding Action.
