The minimal term extraction configuration consists of the following required pass-through parameters.
These parameters are listed in the following table, with details in later sections.
Parameter | Configuration Value |
---|---|
TEXT_PROP_NAME | Source property to use as the source text for term extraction. No default. |
RECORD_SPEC_PROP_NAME | Source property that is mapped to the Guided Search record specifier property. No default. |
Use OLT for NounGrouping | Determines whether term extraction uses OLT (true) or legacy code (false) for noun phrase grouping. |
OUTPUT_PROP_NAME | Source property to use as the destination for tagged terms. No default. |
ALL_TERMS_OUTPUT_PROP_NAME | Source property to use as the destination for all terms on a record. No default. |
LANG | Specifies the language ID to use on a global basis. No default. |
State Directory path | Directory path where term extraction state files should be stored. The location should be specific to the Guided Search application. For example: |
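
Because most of these parameters have no default, it can help to check that every one of them is set before starting a crawl. The following is a minimal sketch of such a check, not part of the product: it assumes the configuration has already been loaded into a plain Python dictionary keyed by the parameter names from the table, and the helper name `missing_params` and all values in the usage example are hypothetical.

```python
# Parameter names copied from the table above. Holding the configuration
# in a dict is an assumption made for illustration only.
REQUIRED_PARAMS = (
    "TEXT_PROP_NAME",
    "RECORD_SPEC_PROP_NAME",
    "Use OLT for NounGrouping",
    "OUTPUT_PROP_NAME",
    "ALL_TERMS_OUTPUT_PROP_NAME",
    "LANG",
    "State Directory path",
)


def missing_params(config):
    """Return the required parameters that are absent or blank in config."""
    missing = []
    for name in REQUIRED_PARAMS:
        value = config.get(name)
        if value is None or str(value).strip() == "":
            missing.append(name)
    return missing


# Hypothetical values, used only to exercise the check.
example_config = {
    "TEXT_PROP_NAME": "Text",
    "RECORD_SPEC_PROP_NAME": "RecordId",
    "Use OLT for NounGrouping": "true",
    "OUTPUT_PROP_NAME": "Terms",
    "ALL_TERMS_OUTPUT_PROP_NAME": "AllTerms",
    "LANG": "en",
}

# "State Directory path" is not set in the example, so it is reported.
print(missing_params(example_config))
```

An empty list means that all parameters in the table are present; the check says nothing about whether the values themselves are valid.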
This configuration runs as follows:
This configuration is the most permissive for term extraction. Although most sites prefer to perform some level of corpus and record filtering, this minimal configuration might be useful with small data sets that have a closely related set of noun phrases in their documents.
Note
You must run a full crawl if any of the minimum configuration parameters other than LANG is modified.
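
The rule in this note can be expressed as a comparison between the previous and the new configuration. The sketch below is illustrative only and assumes both configurations are available as Python dictionaries keyed by the parameter names from the table; the function name `needs_full_crawl` and the sample values are hypothetical.

```python
# Minimum configuration parameters, named as in the table.
MINIMUM_PARAMS = (
    "TEXT_PROP_NAME",
    "RECORD_SPEC_PROP_NAME",
    "Use OLT for NounGrouping",
    "OUTPUT_PROP_NAME",
    "ALL_TERMS_OUTPUT_PROP_NAME",
    "LANG",
    "State Directory path",
)

# Per the note, a change to LANG alone does not trigger the requirement.
EXEMPT_PARAMS = {"LANG"}


def needs_full_crawl(old_config, new_config):
    """Return True if any non-exempt minimum parameter differs."""
    return any(
        old_config.get(name) != new_config.get(name)
        for name in MINIMUM_PARAMS
        if name not in EXEMPT_PARAMS
    )


# Hypothetical values, used only to exercise the comparison.
previous = {"TEXT_PROP_NAME": "Text", "LANG": "en"}
lang_only = dict(previous, LANG="fr")
text_changed = dict(previous, TEXT_PROP_NAME="Body")

print(needs_full_crawl(previous, lang_only))     # False
print(needs_full_crawl(previous, text_changed))  # True
```

The sketch covers only the parameters listed in the minimal configuration; other settings may carry their own crawl requirements.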