VOID
expressions return no value but are used to
perform other work. The
VOID CONVERTTOTEXT
expression extracts document content,
converts it to text, and assigns the text to a record.
This expression is available as part of the optional Document
Conversion Module. Recall that the
RETRIEVE_URL
expression fetches a document's content
and writes the content to a file. The
Endeca.Document.Body
property stores the absolute path
of the file that contains the document's content.
CONVERTTOTEXT
reads the path and converts the content
of the indicated file to text. The text is then assigned to the record as a
property named
Endeca.Document.Text.
If the expression fails, a
warning is logged and the property is not assigned to the record.
The following optional expression nodes modify the behavior of
VOID CONVERTTOTEXT:
TIMEOUT - Specifies the maximum time allowed to convert a document. The default value is 300 seconds.
RESPONSE_TIMEOUT - Specifies the messaging timeout between Forge and the converter process. The default value is 30 seconds.
CONVERT_EMBEDDED - If set to TRUE, specifies that embedded documents are also extracted and converted. If this option is not used, the default is FALSE.
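As a minimal sketch of how the timeout options might be set, the fragment below raises both limits for large documents. It assumes that TIMEOUT and RESPONSE_TIMEOUT are written as EXPRNODE elements with a VALUE attribute holding a plain number of seconds, following the same pattern as CONVERT_EMBEDDED. The values 600 and 60 are hypothetical and chosen only for illustration.

```xml
<!-- Sketch only: assumes timeout values are given as whole seconds in VALUE. -->
<EXPRESSION TYPE="VOID" NAME="CONVERTTOTEXT">
  <!-- Hypothetical value: allow up to 600 seconds per document (default is 300). -->
  <EXPRNODE NAME="TIMEOUT" VALUE="600"/>
  <!-- Hypothetical value: allow 60 seconds between Forge and the converter (default is 30). -->
  <EXPRNODE NAME="RESPONSE_TIMEOUT" VALUE="60"/>
</EXPRESSION>
```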
See the
EXPRESSION
element for DTD and attribute information.
This example converts
Endeca.Document.Body
to text if the property exists.
<EXPRESSION TYPE="VOID" NAME="IF">
  <EXPRESSION TYPE="INTEGER" NAME="PROP_EXISTS">
    <EXPRNODE NAME="PROP_NAME" VALUE="Endeca.Document.Body"/>
  </EXPRESSION>
  <EXPRESSION TYPE="VOID" NAME="CONVERTTOTEXT"/>
</EXPRESSION>
This example shows how to use the
CONVERT_EMBEDDED
option to process embedded documents.
<EXPRESSION TYPE="VOID" NAME="CONVERTTOTEXT">
  <EXPRNODE NAME="CONVERT_EMBEDDED" VALUE="TRUE"/>
</EXPRESSION>
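The two patterns shown in this section can plausibly be combined, so that embedded documents are converted only when a document body is present. The following is a sketch that nests the CONVERT_EMBEDDED node inside the guarded expression; it uses only the elements shown above and has not been checked against a running pipeline.

```xml
<!-- Sketch only: combines the PROP_EXISTS guard with the CONVERT_EMBEDDED option. -->
<EXPRESSION TYPE="VOID" NAME="IF">
  <!-- Condition: the record has an Endeca.Document.Body property. -->
  <EXPRESSION TYPE="INTEGER" NAME="PROP_EXISTS">
    <EXPRNODE NAME="PROP_NAME" VALUE="Endeca.Document.Body"/>
  </EXPRESSION>
  <!-- Action: convert the body, including any embedded documents. -->
  <EXPRESSION TYPE="VOID" NAME="CONVERTTOTEXT">
    <EXPRNODE NAME="CONVERT_EMBEDDED" VALUE="TRUE"/>
  </EXPRESSION>
</EXPRESSION>
```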