documentCapture.documentToStructure(options)

Note:

The content in this help topic pertains to SuiteScript 2.1.

Method Description

Extracts content from a document.

This method can return the text content, table content, and key-value pairs (fields) from the specified document located in the NetSuite File Cabinet. The content returned depends on the features you specify when you call this method (using the options.features parameter). Use the documentCapture.Feature enum to specify the features you want to extract, such as TEXT_EXTRACTION, TABLE_EXTRACTION, and FIELD_EXTRACTION.

This method extracts content synchronously and supports documents up to five pages in length. If you want to extract content from documents longer than five pages, you must submit an asynchronous extraction task using the N/task module. For an example, see Extract Content from a Document Asynchronously.
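
One way to handle the five-page limit is to attempt the synchronous call and fall back to an asynchronous path when DOCUMENT_TOO_LONG is thrown. The following is a minimal sketch, not part of the N/documentCapture module: extractOrDefer and submitAsync are hypothetical names, and the sketch assumes that the thrown error exposes its error code in the name property, as SuiteScript errors generally do. The module is passed in as an argument so that the helper does not depend on how the module is loaded.

```javascript
// Hypothetical helper (not part of N/documentCapture).
// capture:     the loaded N/documentCapture module
// options:     the options object for documentToStructure
// submitAsync: caller-supplied function that submits an asynchronous
//              extraction task (for example, through N/task)
function extractOrDefer(capture, options, submitAsync) {
    try {
        return { mode: 'sync', result: capture.documentToStructure(options) };
    } catch (e) {
        // Assumption: the error code is available as e.name.
        if (e && e.name === 'DOCUMENT_TOO_LONG') {
            return { mode: 'async', result: submitAsync(options) };
        }
        throw e; // all other errors propagate to the caller
    }
}
```

The caller decides what submitAsync does. This sketch only routes the request and does not create the task itself.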

This method supports PDF, JPG, PNG, and TIFF files. Encrypted files are not supported.
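
A script can screen file names before calling this method to avoid spending a request on a file that is clearly unsupported. The following is a minimal sketch: hasSupportedExtension is a hypothetical helper, and the extension list is an assumption derived from the formats named above. A matching extension does not guarantee that the service accepts the file, and this check cannot detect encrypted files.

```javascript
// Assumed file extensions for the PDF, JPG, PNG, and TIFF formats.
const SUPPORTED_EXTENSIONS = ['pdf', 'jpg', 'jpeg', 'png', 'tif', 'tiff'];

// Hypothetical helper: returns true if the file name ends in one of the
// assumed extensions (case-insensitive).
function hasSupportedExtension(fileName) {
    const dot = fileName.lastIndexOf('.');
    if (dot < 0) {
        return false; // no extension at all
    }
    const extension = fileName.slice(dot + 1).toLowerCase();
    return SUPPORTED_EXTENSIONS.includes(extension);
}
```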

This method consumes usage from the monthly usage pool of free requests provided by NetSuite. Each successful call to this method counts as a request, and the AI Preferences page lets you track your free usage. For more information, see Manage SuiteScript AI Preferences. If you need to submit more requests per month than the free usage pool provides, you can provide Oracle Cloud account credentials for unlimited usage. For more information, see Configure OCI Credentials for AI.

Returns

documentCapture.Document

Supported Script Types

Server scripts

For more information, see SuiteScript 2.x Script Types.

Governance

100

Module

N/documentCapture Module

Since

2025.2

Parameters

Parameter

Type

Required / Optional

Description

Since

options.file

file.File

required

The document file to extract content from.

The specified file must be located in the NetSuite File Cabinet, be in PDF, JPG, PNG, or TIFF format, and be five pages in length or shorter. Encrypted files are not supported. You can load the file using its internal ID or its path in the File Cabinet. For more information, see N/file Module.

2025.2

options.documentType

string

optional

The document type.

Note:

This parameter is required if you specify the FIELD_EXTRACTION feature using the options.features parameter.

Use values from the documentCapture.DocumentType enum to set this parameter. When you specify the type of document, the service can apply pretrained models that are optimized for that type, which can provide more accurate extraction results. If you don't specify a document type when you call this method, the OTHERS document type is used by default.

2025.2

options.features

string[]

optional

The features to extract from the specified document.

Use values from the documentCapture.Feature enum to set this parameter. If you don't specify any features when you call this method, the TEXT_EXTRACTION and TABLE_EXTRACTION features are used by default.

2025.2

options.language

string

optional

The language of the specified document.

Use values from the documentCapture.Language enum to set this parameter. If you don't specify a language when you call this method, the ENG (English) language is used by default.

2025.2

options.ociConfig

Object

optional

Configuration parameters for unlimited usage through the OCI Document Understanding service.

This parameter is required only when accessing the OCI Document Understanding service through an Oracle Cloud account. The credentials you provide here override any OCI credentials that are configured on the AI Preferences page.

2025.2

options.ociConfig.compartmentId

string

optional

Compartment OCID.

For more information, refer to Managing Compartments in the Oracle Cloud Infrastructure Documentation.

2025.2

options.ociConfig.endpointId

string

optional

Endpoint ID.

This value is required only when an OCI dedicated AI cluster (DAC) is used. For more information, refer to Managing Endpoints in the Oracle Cloud Infrastructure Documentation.

2025.2

options.ociConfig.fingerprint

string

optional

Fingerprint of the public key.

Only a NetSuite secret is accepted. For more information about secrets, see Creating Secrets. For more information about public key fingerprints, refer to Required Keys and OCIDs in the Oracle Cloud Infrastructure Documentation.

2025.2

options.ociConfig.privateKey

string

optional

Private key of the OCI user.

Only a NetSuite secret is accepted. For more information about secrets, see Creating Secrets. For more information about private keys, refer to Required Keys and OCIDs in the Oracle Cloud Infrastructure Documentation.

2025.2

options.ociConfig.tenancyId

string

optional

Tenancy OCID.

For more information, refer to Viewing Tenancy Details in the Oracle Cloud Infrastructure Documentation.

2025.2

options.ociConfig.userId

string

optional

User OCID.

For more information, refer to Managing Users in the Oracle Cloud Infrastructure Documentation.

2025.2

options.timeout

number

optional

The timeout period to wait for a response from the service.

By default, the timeout period is 30,000 milliseconds (30 seconds). You can specify a longer timeout period, but you can't specify one that's shorter than 30,000 milliseconds. If you try to specify a shorter timeout period, the default value of 30,000 milliseconds is used instead.

2025.2
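
Several of the parameter rules above can be checked in the script before the request is sent. The following is a minimal sketch under stated assumptions: checkOptions and effectiveTimeout are hypothetical helpers, not part of the module, and they only mirror the documented rules (options.file is required, options.features cannot be an empty array, options.documentType is required with FIELD_EXTRACTION, and timeouts shorter than 30,000 milliseconds fall back to the default). The service remains the authority on what it accepts.

```javascript
const MIN_TIMEOUT_MS = 30000; // documented default and minimum timeout

// Hypothetical pre-flight check. Returns a list of problems (empty if none).
// capture is the loaded N/documentCapture module, passed in so that the
// Feature enum values are not hard-coded here.
function checkOptions(capture, options) {
    const problems = [];
    if (!options.file) {
        problems.push('options.file is required');
    }
    if (Array.isArray(options.features)) {
        if (options.features.length === 0) {
            problems.push('options.features must not be an empty array');
        }
        const wantsFields = options.features.includes(capture.Feature.FIELD_EXTRACTION);
        if (wantsFields && !options.documentType) {
            problems.push('options.documentType is required for FIELD_EXTRACTION');
        }
    }
    return problems;
}

// Returns the timeout that the documented rule would apply for a
// requested value: anything missing or below the minimum becomes 30,000 ms.
function effectiveTimeout(requestedMs) {
    if (typeof requestedMs !== 'number' || !(requestedMs >= MIN_TIMEOUT_MS)) {
        return MIN_TIMEOUT_MS;
    }
    return requestedMs;
}
```

Omitting options.features is not reported as a problem because the method then applies its default features.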

Errors

Error Code

Thrown If

ACCESS_DENIED

The specified OCI credentials (in the options.ociConfig parameter) don't provide access to the OCI Document Understanding service.

DOCUMENT_TOO_LONG

The specified file is longer than five pages. To extract content from documents longer than five pages, submit an asynchronous extraction task using the N/task module. For an example, see Extract Content from a Document Asynchronously.

FEATURES_CANNOT_BE_EMPTY

The list of features to extract is empty (that is, an empty array is specified in the options.features parameter).

FEATURE_1_DOES_NOT_SUPPORT_LANGUAGE_2

A feature is specified (using the options.features parameter) that is not supported in the specified language.

INCOMPATIBLE_DOCUMENT_TYPE_FOR_FEATURE_1

A feature is specified (using the options.features parameter) that is not supported for the specified document type.

INVALID_DOCUMENT_CAPTURE_RESULT

The document capture result provided by the service is invalid.

INVALID_DOCUMENT_TYPE

The specified document type is not included in the documentCapture.DocumentType enum.

INVALID_LANGUAGE

The specified language is not included in the documentCapture.Language enum.

MAXIMUM_PARALLEL_REQUESTS_LIMIT_EXCEEDED

The number of parallel requests to the service is greater than five. A maximum of five parallel requests is supported.

MONTHLY_QUOTA_OF_1_SUITE_SCRIPT_DOCUMENT_CAPTURE_REQUESTS_HAS_BEEN_MET

You've exceeded the number of monthly free requests provided by NetSuite. For more information, see Using OCI Credentials to Obtain Additional Usage.

ONLY_API_SECRET_IS_ACCEPTED

The options.ociConfig.fingerprint parameter, the options.ociConfig.privateKey parameter, or both are not NetSuite secrets.

SSS_MISSING_REQD_ARGUMENT

The required options.file parameter is not specified.

UNRECOGNIZED_OCI_CONFIG_PARAMETERS

The object specified in the options.ociConfig parameter includes unknown properties or values.

UNSUPPORTED_FILE_TYPE

The specified file is not in PDF, JPG, PNG, or TIFF format.
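
When a script catches one of these errors, it can be useful to sort the error by the kind of fix it needs. The following is a minimal sketch: classifyError and the grouping itself are illustrative choices, not part of the module, and the sketch assumes that the error code is available in the name property of the thrown error. The codes are copied from the table above as written, including their numbered placeholders.

```javascript
// Illustrative grouping of the documented error codes.
const ERROR_GROUPS = {
    credentials: [
        'ACCESS_DENIED',
        'ONLY_API_SECRET_IS_ACCEPTED',
        'UNRECOGNIZED_OCI_CONFIG_PARAMETERS'
    ],
    limits: [
        'MAXIMUM_PARALLEL_REQUESTS_LIMIT_EXCEEDED',
        'MONTHLY_QUOTA_OF_1_SUITE_SCRIPT_DOCUMENT_CAPTURE_REQUESTS_HAS_BEEN_MET'
    ],
    file: [
        'DOCUMENT_TOO_LONG',
        'UNSUPPORTED_FILE_TYPE'
    ],
    options: [
        'FEATURES_CANNOT_BE_EMPTY',
        'FEATURE_1_DOES_NOT_SUPPORT_LANGUAGE_2',
        'INCOMPATIBLE_DOCUMENT_TYPE_FOR_FEATURE_1',
        'INVALID_DOCUMENT_TYPE',
        'INVALID_LANGUAGE',
        'SSS_MISSING_REQD_ARGUMENT'
    ]
};

// Hypothetical helper: returns the group name for a caught error,
// or 'other' for codes that are not grouped above.
function classifyError(e) {
    const code = e && e.name; // assumption: the code is in e.name
    for (const [group, codes] of Object.entries(ERROR_GROUPS)) {
        if (codes.includes(code)) {
            return group;
        }
    }
    return 'other';
}
```

INVALID_DOCUMENT_CAPTURE_RESULT is left ungrouped on purpose because it describes a problem with the service response rather than with the request.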

Syntax

Important:

The following code sample shows the syntax for this member. It is not a functional example. For a complete script example, see N/documentCapture Module Script Samples.

// Add additional code
...

const extractedData = documentCapture.documentToStructure({
    file: file.load("SuiteScripts/sample_invoice.pdf"),
    documentType: documentCapture.DocumentType.INVOICE,
    features: [
        documentCapture.Feature.TEXT_EXTRACTION,
        documentCapture.Feature.FIELD_EXTRACTION
    ]
});

...
// Add additional code
