In addition to the source text to process, the Text Enrichment component requires the Salience Engine and a properties file. To use the query topics feature, you also need a query topics properties file; to use normalized themes, you also need a normalization.dat file.
The Text Enrichment component requires installation of the Salience Engine on the same machine as Integrator ETL. For details on installing the Salience Engine, see the Oracle Endeca Text Enrichment Installation Guide.
When installing the Salience Engine, write down the path to the Salience Engine data directory. You must specify this path when configuring the Text Enrichment component.
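Before configuring the component, it can help to confirm that the prerequisites described above are actually present on the machine. The following is a minimal sketch in Python, not part of the product: the function name and its parameters are hypothetical, and the paths you pass in are whatever locations you recorded during installation.

```python
from pathlib import Path


def missing_prerequisites(data_dir, properties_file,
                          query_topics_file=None, normalization_file=None):
    """Return a list describing each prerequisite that cannot be found.

    All arguments are paths chosen by the caller (hypothetical names).
    The two optional files are only checked when a path is supplied,
    because they are only needed for query topics and normalized themes.
    """
    missing = []
    if not Path(data_dir).is_dir():
        missing.append(f"Salience Engine data directory: {data_dir}")
    if not Path(properties_file).is_file():
        missing.append(f"properties file: {properties_file}")
    if query_topics_file is not None and not Path(query_topics_file).is_file():
        missing.append(f"query topics properties file: {query_topics_file}")
    if normalization_file is not None and not Path(normalization_file).is_file():
        missing.append(f"normalization file: {normalization_file}")
    return missing
```

An empty list means every path you supplied exists; otherwise each entry names the item to fix before running the graph.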
The source input is the text to be processed by the Salience Engine. You can use any supported input source, including files, database columns, and IAS record store data.
Each sentence in the input text must end with appropriate punctuation (usually a period, but a question mark or exclamation point where appropriate), and sentences must be separated by spaces. If sentences are not terminated and spaced correctly, themes are not extracted correctly.
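To catch badly formatted input before it reaches the Salience Engine, you could screen records with a simple check such as the one sketched below. This is an illustrative heuristic in Python, not something the component provides: the function name is hypothetical, and the spacing test will also flag abbreviations such as "U.S." that legitimately have a letter directly after a period.

```python
import re

# A sentence terminator immediately followed by a letter, e.g. "slow.The"
_MISSING_SPACE = re.compile(r"[.!?][A-Za-z]")


def check_sentence_formatting(text):
    """Return a list of likely formatting problems (empty if none found)."""
    stripped = text.strip()
    if not stripped:
        return ["text is empty"]

    problems = []
    # Ignore closing quotes or brackets when looking at the final character.
    tail = stripped.rstrip("\"')]")
    if not tail or tail[-1] not in ".!?":
        problems.append("text does not end with sentence punctuation")
    if _MISSING_SPACE.search(stripped):
        problems.append("sentence terminator not followed by a space")
    return problems
```

Records that return a non-empty list are candidates for cleanup upstream of the component.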
If the input text is formatted in all upper case, use the setFlattenAllUpperCase property. For details, see Processing text formatted in all caps.
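If you are not sure whether your source text is in all upper case, a quick sample of the input can tell you. The sketch below is an assumption-laden illustration in Python (the function name is hypothetical, and it does not set or read any component property); it reports what share of the sampled records containing letters are entirely upper case.

```python
def share_all_caps(samples):
    """Return the fraction of samples that are entirely upper case.

    Samples with no cased characters (for example, purely numeric
    strings) are ignored so that they do not skew the result.
    """
    cased = [s for s in samples if s.upper() != s.lower()]
    if not cased:
        return 0.0
    # str.isupper() is True when every cased character is upper case.
    return sum(1 for s in cased if s.isupper()) / len(cased)
```

A value close to 1.0 suggests the input is a candidate for the setFlattenAllUpperCase property.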