documentCapture.documentToStructure(options)
The content in this help topic pertains to SuiteScript 2.1.
| Method Description | Extracts content from a document. This method can return the text content, table content, and key-value pairs (fields) from the specified document located in the NetSuite File Cabinet. The content returned depends on the features you specify when you call this method. This method extracts content synchronously and supports documents up to five pages in length. If you want to extract content from documents longer than five pages, you must submit an asynchronous extraction task using the N/task module. For an example, see Extract Content from a Document Asynchronously. This method supports PDF, JPG, PNG, and TIFF files. Encrypted files are not supported. This method consumes usage from the monthly usage pool of free requests provided by NetSuite. Each successful call to this method counts as a request, and the AI Preferences page lets you track your free usage. For more information, see Manage SuiteScript AI Preferences. If you need to submit more requests per month than the free usage pool provides, you can provide Oracle Cloud account credentials for unlimited usage. For more information, see Configure OCI Credentials for AI. |
| Returns | |
| Supported Script Types | Server scripts. For more information, see SuiteScript 2.x Script Types. |
| Governance | 100 |
| Module | |
| Since | 2025.2 |
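The format restriction described above can be checked in script code before the method is called, so that an unsupported file is caught early. The following is a minimal sketch, assuming the format can be inferred from the file name. The helper `hasSupportedExtension` is hypothetical and not part of the N/documentCapture module, and treating `.jpeg` and `.tif` as aliases for JPG and TIFF is an assumption. The service still validates the actual file format.

```javascript
// Hypothetical helper: not part of the N/documentCapture module.
// Extensions for the formats documented as supported (PDF, JPG, PNG, TIFF).
// Accepting "jpeg" and "tif" as aliases is an assumption.
const SUPPORTED_EXTENSIONS = ['pdf', 'jpg', 'jpeg', 'png', 'tif', 'tiff'];

function hasSupportedExtension(fileName) {
    const dot = fileName.lastIndexOf('.');
    if (dot < 0 || dot === fileName.length - 1) {
        return false; // no extension present
    }
    const extension = fileName.slice(dot + 1).toLowerCase();
    return SUPPORTED_EXTENSIONS.includes(extension);
}
```

A check like this only looks at the name. It cannot detect encrypted files or documents longer than five pages.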
Parameters
| Parameter | Type | Required / Optional | Description | Since |
|---|---|---|---|---|
| options.file | | required | The document file to extract content from. The specified file must be located in the NetSuite File Cabinet, be in PDF, JPG, PNG, or TIFF format, and be five pages in length or shorter. You can specify the file using its internal ID or the path to the file in the File Cabinet. For more information, see N/file Module. Encrypted files are not supported. | 2025.2 |
| options.documentType | string | optional | The document type. Note: This parameter is required if you specify the Use values from the documentCapture.DocumentType enum to set this parameter. By specifying the type of document, the service can apply pretrained models that are optimized for that type, which can provide more accurate extraction results. If you don't specify a document type when you call this method, the | 2025.2 |
| options.features | string[] | optional | The features to extract from the specified document. Use values from the documentCapture.Feature enum to set this property. If you don't specify any features when you call this method, the | 2025.2 |
| | string | optional | The language of the specified document. Use values from the documentCapture.Language enum to set this property. If you don't specify a language when you call this method, the | 2025.2 |
| | Object | optional | Configuration parameters for unlimited usage through the OCI Document Understanding service. This parameter is required only when accessing the OCI Document Understanding service through an Oracle Cloud account. The credentials you provide here override any OCI credentials that are configured on the AI Preferences page. | 2025.2 |
| | string | optional | Compartment OCID. For more information, refer to Managing Compartments in the Oracle Cloud Infrastructure Documentation. | 2025.2 |
| | string | optional | Endpoint ID. This value is required only when an OCI dedicated AI cluster (DAC) is used. For more information, refer to Managing Endpoints in the Oracle Cloud Infrastructure Documentation. | 2025.2 |
| | string | optional | Fingerprint of the public key. Only a NetSuite secret is accepted. For more information about secrets, see Creating Secrets. For more information about public key fingerprints, refer to Required Keys and OCIDs in the Oracle Cloud Infrastructure Documentation. | 2025.2 |
| | string | optional | Private key of the OCI user. Only a NetSuite secret is accepted. For more information about secrets, see Creating Secrets. For more information about private keys, refer to Required Keys and OCIDs in the Oracle Cloud Infrastructure Documentation. | 2025.2 |
| | string | optional | Tenancy OCID. For more information, refer to Viewing Tenancy Details in the Oracle Cloud Infrastructure Documentation. | 2025.2 |
| | string | optional | User OCID. For more information, refer to Managing Users in the Oracle Cloud Infrastructure Documentation. | 2025.2 |
| | number | optional | The timeout period to wait for a response from the service. By default, the timeout period is 30,000 milliseconds (30 seconds). You can specify a longer timeout period, but you can't specify one that's shorter than 30,000 milliseconds. If you try to specify a shorter timeout period, the default value of 30,000 milliseconds is used instead. | 2025.2 |
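The timeout rule described above can be mirrored in script code so that the value that takes effect is explicit. The following is a minimal sketch of that rule. The helper `effectiveTimeout` and the constant `DEFAULT_TIMEOUT_MS` are hypothetical names used for illustration and are not part of the module.

```javascript
// Hypothetical helper: mirrors the documented timeout rule, it is not a NetSuite API.
const DEFAULT_TIMEOUT_MS = 30000; // documented default, which also acts as the minimum

function effectiveTimeout(requestedMs) {
    // When no timeout is given, the documented default applies.
    if (requestedMs === undefined) {
        return DEFAULT_TIMEOUT_MS;
    }
    // A shorter value is documented as being replaced by the default;
    // a longer value is kept as requested.
    return Math.max(requestedMs, DEFAULT_TIMEOUT_MS);
}
```

For example, a requested timeout of 5,000 milliseconds resolves to 30,000, and a requested timeout of 60,000 is kept unchanged.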
Errors
| Error Code | Thrown If |
|---|---|
| | The specified OCI credentials (in the |
| | The specified file is longer than five pages. To extract content from documents longer than five pages, submit an asynchronous extraction task using the N/task module. For an example, see Extract Content from a Document Asynchronously. |
| | The list of features to extract is empty (that is, an empty array is specified in the |
| | A feature is specified (using the |
| | A feature is specified (using the |
| | The document capture result provided by the service is invalid. |
| | The specified document type is not included in the documentCapture.DocumentType enum. |
| | The specified language is not included in the documentCapture.Language enum. |
| | The number of parallel requests to the service is greater than five. A maximum of five parallel requests are supported. |
| | You've exceeded the number of monthly free requests provided by NetSuite. For more information, see Using OCI Credentials to Obtain Additional Usage. |
| | One (or both) of the |
| | The required |
| | The object specified in the |
| | The specified file is not in PDF, JPG, PNG, or TIFF format. |
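Because several of these conditions depend on the document or on current usage, a script may want to catch the error instead of letting it end execution. The following is a minimal sketch, not a prescribed pattern. The wrapper `tryExtract` is hypothetical, and it relies only on the `name` and `message` properties that SuiteScript error objects expose. No specific error codes are assumed.

```javascript
// Hypothetical wrapper: `extract` is any function that performs the call,
// for example () => documentCapture.documentToStructure(options).
function tryExtract(extract) {
    try {
        return { ok: true, result: extract() };
    } catch (e) {
        // SuiteScript error objects carry the error code in `name`
        // and a description in `message`.
        return { ok: false, code: e.name, message: e.message };
    }
}
```

The caller can then branch on `code`, for example to fall back to an asynchronous task when the document is too long.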
Syntax
The following code sample shows the syntax for this member. It is not a functional example. For a complete script example, see N/documentCapture Module Script Samples.
// Add additional code
...
const extractedData = documentCapture.documentToStructure({
    file: file.load("SuiteScripts/sample_invoice.pdf"),
    documentType: documentCapture.DocumentType.INVOICE,
    features: [
        documentCapture.Feature.TEXT_EXTRACTION,
        documentCapture.Feature.FIELD_EXTRACTION
    ]
});
...
// Add additional code
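The call above can also be wrapped in a function that receives the two modules as arguments, which makes it possible to exercise the logic with stand-in objects outside NetSuite. The following is a sketch under that assumption. The function name `extractInvoice` and its parameter order are hypothetical, and the shape of the returned value is not assumed.

```javascript
// Sketch only: `documentCapture` and `file` are passed in rather than loaded here.
// In a deployed script they would be the N/documentCapture and N/file modules.
function extractInvoice(documentCapture, file, path) {
    // Same options as the syntax sample, with the file path supplied by the caller.
    return documentCapture.documentToStructure({
        file: file.load(path),
        documentType: documentCapture.DocumentType.INVOICE,
        features: [
            documentCapture.Feature.TEXT_EXTRACTION,
            documentCapture.Feature.FIELD_EXTRACTION
        ]
    });
}
```

In a script, this would be called with the loaded modules and a File Cabinet path such as "SuiteScripts/sample_invoice.pdf".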