The Oracle Endeca Server assigns each character in indexed text to
one of three categories:
- Alphanumeric characters, including ASCII characters as well as
  non-punctuation characters in ISO-Latin1.
- Non-alphanumeric search characters (configured using the search
  characters feature, as described below).
- Other non-alphanumeric characters (this category is the default for
  all non-alphanumeric characters not explicitly configured as search
  characters).
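
The three-way split above can be illustrated with a small sketch. This is not the Oracle Endeca Server implementation; the function name `classify`, the category constants, and the `search_chars` parameter are hypothetical names chosen for illustration. The sketch also assumes that "alphanumeric" can be approximated by Python's `str.isalnum()` restricted to code points below 256 (the ISO-Latin1 range), which may not match the server's exact character tables.

```python
# Hypothetical category labels, used only in this illustration.
ALPHANUMERIC = "alphanumeric"
SEARCH_CHARACTER = "search character"
OTHER = "other non-alphanumeric"


def classify(ch, search_chars=frozenset()):
    """Return the category of a single character.

    search_chars stands in for the set configured through the search
    characters feature. Assumption: str.isalnum() limited to the
    ISO-Latin1 range approximates the server's alphanumeric category.
    """
    if len(ch) != 1:
        raise ValueError("classify expects exactly one character")
    # Category 1: alphanumeric characters within ISO-Latin1.
    if ord(ch) < 256 and ch.isalnum():
        return ALPHANUMERIC
    # Category 2: non-alphanumeric characters explicitly configured
    # as search characters.
    if ch in search_chars:
        return SEARCH_CHARACTER
    # Category 3: the default for every remaining character. Characters
    # outside ISO-Latin1 also land here, which is a choice of this
    # sketch and not something the passage specifies.
    return OTHER


# Example: with '+' configured as a search character, classify "C++".
categories = [classify(c, search_chars={"+"}) for c in "C++"]
print(categories)
```

With `+` configured as a search character, both plus signs in `C++` fall into the second category; without that configuration they fall into the third, default category.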
During data processing, each word in the source text (that is,
searchable attributes for record search and attribute values for value
search) is indexed according to how characters from each of the three
categories are handled. The handling alternatives are described in
subsequent topics.