The exclude list configuration contains a set of terms that are removed from the final list of extracted terms.

Each exclude is compared against the canonical form and all raw forms of a term; if it matches any of them, the term is excluded. This is equivalent to canonicalizing the exclude term.
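The matching rule described above can be sketched in a few lines of Python. This is only an illustration, not the manipulator's implementation: the `Term` class, `is_excluded`, and `apply_excludes` are hypothetical names, and exact, case-sensitive string comparison is an assumption, since the matching details are not specified here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Term:
    """Hypothetical model of an extracted term: one canonical form plus its raw forms."""
    canonical: str
    raw_forms: Tuple[str, ...] = ()


def is_excluded(term: Term, excludes: Set[str]) -> bool:
    """Return True if any exclude equals the canonical form or any raw form.

    Assumption: comparison is an exact, case-sensitive string match.
    """
    forms = {term.canonical, *term.raw_forms}
    return not forms.isdisjoint(excludes)


def apply_excludes(terms: Iterable[Term], excludes: Set[str]) -> List[Term]:
    """Drop every term that matches an exclude, preserving the original order."""
    return [t for t in terms if not is_excluded(t, excludes)]


if __name__ == "__main__":
    extracted = [
        Term("12.1 megapixel", ("12.1 megapixel", "12.1 megapixels")),
        Term("sensor", ("sensor", "sensors")),
    ]
    # The exclude matches a raw form only, yet the whole term is removed.
    for term in apply_excludes(extracted, {"12.1 megapixels"}):
        print(term.canonical)
```

Because an exclude that matches only a raw form still removes the whole term, the result is the same as if the exclude itself had been canonicalized first.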

The exclude list configuration can be passed to the CAS term extraction manipulator by creating a new Record Store of a supported type, for example delimited or JDBC. You must load the data into the Record Store and add the following pass-through information to the manipulator configuration.

The format rules for the excluded terms list are as follows:

The list is processed after all terms have been extracted from the records.

In this brief example of a delimited exclude list file, EXCLUDE and RecordSpec are the headers (multiple headers are allowed). For the Exclude term property name property in the CAS term extraction manipulator, you must pass EXCLUDE.

EXCLUDE,RecordSpec
12.1 megapixel,1
12 MP,2
The 12.1 megapixel sensor,4
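
To check such a file before loading it into a Record Store, the exclude column can be read with Python's standard `csv` module. This is a minimal sketch that runs outside CAS: the function name `read_exclude_terms` and the in-memory sample text (modeled on the example above) are assumptions made for illustration, and the column name defaults to the `EXCLUDE` header.

```python
import csv
import io
from typing import List

# Sample text modeled on the delimited example; a real list would live in a file.
SAMPLE = (
    "EXCLUDE,RecordSpec\n"
    "12.1 megapixel,1\n"
    "12 MP,2\n"
    "The 12.1 megapixel sensor,4\n"
)


def read_exclude_terms(text: str, column: str = "EXCLUDE") -> List[str]:
    """Return the non-empty values of one column from comma-delimited text.

    The first row is treated as the header row; other columns are ignored.
    """
    reader = csv.DictReader(io.StringIO(text))
    return [row[column].strip() for row in reader if row.get(column)]


if __name__ == "__main__":
    for term in read_exclude_terms(SAMPLE):
        print(term)
```

Only the column named by the Exclude term property name property supplies exclude terms in this sketch; the other header, RecordSpec, is read but not used.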

