documentCapture.documentToText(options)
The content in this help topic pertains to SuiteScript 2.1.
| | |
|---|---|
| Method Description | Extracts text content from a PDF file. This method returns a string with the text of the specified PDF file located in the NetSuite File Cabinet. If you want to extract other content from a file, such as tables and fields (key-value pairs), or extract content from a JPG, PNG, or TIFF file, use documentCapture.documentToStructure(options) instead. Encrypted files are not supported. You can use the text returned from this method in calls to N/llm methods for further querying. For example, you can provide the returned text to llm.generateText(options) and ask questions about the data. This method doesn't consume usage from the monthly usage pool of free requests provided by NetSuite (unlike documentCapture.documentToStructure(options), which does consume usage). |
| Returns | string |
| Supported Script Types | Server scripts. For more information, see SuiteScript 2.x Script Types. |
| Governance | 100 |
| Module | N/documentCapture Module |
| Since | 2025.2 |
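The method description mentions passing the extracted text to llm.generateText(options). A minimal sketch of that flow is shown here. The helper name `askAboutPdf` and its `modules` argument are hypothetical, introduced only so the logic can be read apart from a script's `define` wrapper. The sketch also assumes that the response object returned by `llm.generateText` exposes the generated string as `text`.

```javascript
/**
 * Hypothetical helper (not part of N/documentCapture): extracts text from a
 * PDF in the File Cabinet and asks an LLM a question about it.
 * `modules` is assumed to hold the loaded N/file, N/documentCapture, and
 * N/llm modules, for example from
 * define(['N/file', 'N/documentCapture', 'N/llm'], ...).
 */
function askAboutPdf(modules, fileId, question) {
    const { file, documentCapture, llm } = modules;

    // Load the PDF from the File Cabinet by internal ID or path.
    const fileObj = file.load({ id: fileId });

    // documentToText returns the text of the PDF as a plain string.
    const extractedText = documentCapture.documentToText({ file: fileObj });

    // Send the question and the extracted text as a single prompt.
    const response = llm.generateText({
        prompt: question + "\n\n" + extractedText
    });

    // Assumption: the response object carries the generated string in `text`.
    return response.text;
}
```

In a server script, the three modules passed in `modules` would come from the script's own `define` dependencies.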
Parameters
| Parameter | Type | Required / Optional | Description | Since |
|---|---|---|---|---|
| options.file | file.File | required | The PDF file to extract content from. The specified file must be located in the NetSuite File Cabinet and be in PDF format. You can specify the file using its internal ID or the path to the file in the File Cabinet. For more information, see N/file Module. Encrypted files are not supported. | 2025.2 |
| options.timeout | number | optional | The timeout period to wait for a response from the service. By default, the timeout period is 30,000 milliseconds (30 seconds). You can specify a longer timeout period, but you can't specify one that's shorter than 30,000 milliseconds. If you try to specify a shorter timeout period, the default value of 30,000 milliseconds is used instead. | 2025.2 |
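The timeout rule described above can be restated as a small function. This is only an illustration of the documented behavior, not NetSuite's implementation; `effectiveTimeout` and `DEFAULT_TIMEOUT_MS` are hypothetical names.

```javascript
// Default (and minimum) timeout described for the timeout parameter, in milliseconds.
const DEFAULT_TIMEOUT_MS = 30000;

/**
 * Hypothetical helper: returns the timeout the service would be expected to
 * apply for a requested value, following the documented rule.
 */
function effectiveTimeout(requestedMs) {
    // Omitted values fall back to the default.
    if (requestedMs === undefined || requestedMs === null) {
        return DEFAULT_TIMEOUT_MS;
    }
    // Shorter values are replaced by the default; longer values are kept.
    return requestedMs < DEFAULT_TIMEOUT_MS ? DEFAULT_TIMEOUT_MS : requestedMs;
}
```

Under this rule, a request with `timeout: 40000` is honored, while `timeout: 5000` behaves the same as omitting the parameter.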
Errors
| Error Code | Thrown If |
|---|---|
| | The specified file is empty. |
| | The specified file couldn't be parsed. It could be corrupted or invalid. |
| | The specified file is corrupted or contains invalid characters. |
| | The specified file is not a PDF file. |
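Because the method throws when the file is empty, can't be parsed, or is not a PDF, callers may want to wrap the call. The following is a minimal sketch, assuming that thrown errors expose `name` and `message` properties as SuiteScript errors generally do. `tryDocumentToText` is a hypothetical helper, and no specific error codes are assumed.

```javascript
/**
 * Hypothetical wrapper: calls documentToText and reports a failure as a
 * result object instead of letting the error propagate.
 * `documentCapture` is the loaded N/documentCapture module and `fileObj`
 * is a file object loaded with N/file.
 */
function tryDocumentToText(documentCapture, fileObj, timeoutMs) {
    const options = { file: fileObj };
    if (timeoutMs !== undefined) {
        options.timeout = timeoutMs; // optional parameter
    }
    try {
        return { ok: true, text: documentCapture.documentToText(options) };
    } catch (e) {
        // Assumption: the error carries its code in `name` and details in `message`.
        return { ok: false, errorCode: e.name, message: e.message };
    }
}
```

A caller can then branch on `ok` and log `errorCode` and `message` rather than letting the script fail on an unreadable file.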
Syntax
The following code sample shows the syntax for this member. It is not a functional example. For a complete script example, see N/documentCapture Module Script Samples.
// Add additional code
...
// "14" is the unique ID of a PDF file stored in the NetSuite File Cabinet
const fileObj = file.load({
    id: "14"
});
const extractedData = documentCapture.documentToText({
    file: fileObj,
    timeout: 40000
});
...
// Add additional code