This section describes the tasks involved in implementing the stemming and thesaurus features.
Overview of stemming and thesaurus
The Oracle Endeca Server supports stemming and thesaurus features that allow keyword search queries to match text containing alternate forms of the query terms or phrases.
About the stemming feature
The stemming feature broadens search results to include word roots and word derivations.
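As a rough illustration of the idea only (not the Oracle Endeca Server implementation), the sketch below expands a query term to every word form that shares its root before matching. The `STEM_ROOTS` table and the function names are hypothetical and exist purely for this example.

```python
# Hypothetical stemming dictionary: maps each word form to its root.
# A real stemming dictionary is far larger and language-specific.
STEM_ROOTS = {
    "car": "car",
    "cars": "car",
    "run": "run",
    "runs": "run",
    "running": "run",
}


def expand_with_stemming(term):
    """Return every known word form that shares a root with the term."""
    lowered = term.lower()
    root = STEM_ROOTS.get(lowered)
    if root is None:
        # Unknown words are matched only as typed.
        return {lowered}
    return {form for form, r in STEM_ROOTS.items() if r == root}


def stem_matches(query_term, text):
    """True if any word in the text is a stemmed form of the query term."""
    forms = expand_with_stemming(query_term)
    return any(word in forms for word in text.lower().split())
```

With this toy table, a search for "running" also matches text that contains "run" or "runs", while a word missing from the table matches only itself.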
About the thesaurus feature
The thesaurus feature allows you to configure rules for matching queries to text containing equivalent words or concepts.
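The following minimal sketch shows one way to picture thesaurus matching, assuming each rule is a simple group of interchangeable words or phrases. The `THESAURUS_GROUPS` data and the function names are hypothetical; they do not reflect the actual Oracle Endeca Server configuration format or API.

```python
# Hypothetical thesaurus rules: each set lists words or phrases that
# should be treated as equivalent for search purposes.
THESAURUS_GROUPS = [
    {"laptop", "notebook computer"},
    {"tv", "television"},
]


def equivalent_phrases(query):
    """Return the query plus every phrase configured as equivalent to it."""
    lowered = query.lower()
    result = {lowered}
    for group in THESAURUS_GROUPS:
        if lowered in group:
            result |= group
    return result


def thesaurus_matches(query, text):
    """True if the text contains the query or any equivalent phrase."""
    lowered_text = text.lower()
    return any(phrase in lowered_text for phrase in equivalent_phrases(query))
```

Here a query for "laptop" would also match text that mentions a "notebook computer", because both belong to the same hypothetical rule.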
Dgraph flags for stemming and thesaurus
Stemming and thesaurus data that has been configured is automatically enabled for use during text indexing and search query processing. No additional Oracle Endeca Server configuration is needed to enable stemming and thesaurus information.
Interactions with other search features
As core features of the Oracle Endeca Server search subsystem, stemming and the thesaurus interact with other search features.
Performance impact of stemming and thesaurus
Stemming and thesaurus equivalences generally add little or no time to data processing and indexing, and introduce little space overhead (beyond the space required to store the raw string forms of the equivalences).