The Text Enrichment component provides information extraction and summarization capabilities.
Extracted information includes entities (such as people, places, and organizations), quotations, and themes. The Text Enrichment component uses the Salience Engine from Lexalytics. Depending on the version of the Salience Engine that you purchased, the engine can also extract sentiment from documents at the document, entity, and theme levels.
Text Enrichment feature | Resulting information in the output record |
---|---|
Sentiment Analysis | An overall sentiment score for the current document (computed only if the Sentiment Analysis feature has been enabled). |
Named Entities | A list of named entities in the current document (computed only if the Named Entities feature has been enabled). The user specifies which of the supported entity types will be extracted. The output record will have one column per entity type, and each column can have multiple values. Additionally, if the user has enabled Sentiment Analysis, the entities will be added to different groups based on their sentiment scores. The user must specify the ranges for the entity sentiment scores. The output record will then have one column per range, and each column can have multiple values. |
Themes | A list of themes in the document (computed only if the Themes feature has been enabled). All meta-themes are added to the output record (the user has to specify the name of the field/property for meta-themes). For any theme that is not a meta-theme, if the theme score is higher than a user-specified threshold, then: |
Quotations | A list of quotes, with their speakers, in the document (computed only if the Quotations feature has been enabled). The user can specify the maximum length of quotes and the name of the field/property in the output record. |
Document Summary | A shortened version of the input content that best represents the whole content in a limited number of words. |
Although both sources are aimed at a developer audience, they can provide useful information for Integrator users who are implementing the Text Enrichment feature.
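
The sentiment-range grouping described in the Named Entities row can be illustrated with a minimal sketch. This is not the component's configuration syntax or the Salience Engine API; the column names, range boundaries, and sample scores below are hypothetical and chosen only to show how one multi-valued column per range might be filled.

```python
# Hypothetical ranges: (output column name, inclusive lower bound, exclusive upper bound).
# Neither the names nor the boundaries come from the product; they are illustrative.
SENTIMENT_RANGES = [
    ("entities_negative", float("-inf"), -0.2),
    ("entities_neutral", -0.2, 0.2),
    ("entities_positive", 0.2, float("inf")),
]


def group_entities_by_sentiment(entities, ranges=SENTIMENT_RANGES):
    """Return a record with one multi-valued column per sentiment range.

    `entities` is an iterable of (entity_name, sentiment_score) pairs.
    """
    record = {name: [] for name, _, _ in ranges}
    for entity, score in entities:
        for name, low, high in ranges:
            if low <= score < high:
                record[name].append(entity)
                break  # each entity lands in at most one range
    return record


# Made-up entities and scores, for illustration only.
sample = [("Acme Corp", 0.61), ("Boston", 0.05), ("John Smith", -0.40)]
record = group_entities_by_sentiment(sample)
```

Each key in `record` plays the role of one output column, and each list holds the multiple values that column can carry. Using half-open ranges keeps an entity from appearing in two adjacent groups when its score falls exactly on a boundary.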