Categories of characters in indexed text

The Oracle Endeca Server assigns each character in indexed text to one of three categories.

The categories are:

1. Alphanumeric characters, including ASCII characters as well as non-punctuation characters in ISO-Latin1.
2. Non-alphanumeric search characters (configured using the search characters feature, as described below).
3. Other non-alphanumeric characters (this category is the default for all non-alphanumeric characters not explicitly configured to be in category 2).
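
The three categories can be illustrated with a minimal Python sketch. This is not the Endeca Server implementation: the function name classify and the search_chars parameter are hypothetical, and the alphanumeric test (Python's str.isalnum restricted to code points up to 0xFF) only approximates "non-punctuation characters in ISO-Latin1". Characters outside ISO-Latin1 are not covered by the description above, so the sketch simply lets them fall through to categories 2 or 3.

```python
def classify(ch, search_chars=frozenset()):
    """Return 1, 2, or 3 for a single character (illustrative only).

    1 = alphanumeric, 2 = configured search character, 3 = other.
    `search_chars` is a hypothetical stand-in for the set configured
    through the search characters feature.
    """
    # Assumption: approximate category 1 as "alphanumeric per Python,
    # and within the ISO-Latin1 code point range".
    if ord(ch) <= 0xFF and ch.isalnum():
        return 1
    # Category 2 applies only to characters explicitly configured.
    if ch in search_chars:
        return 2
    # Everything else defaults to category 3.
    return 3


def classify_word(word, search_chars=frozenset()):
    """Classify every character of a word, in order."""
    return [classify(ch, search_chars) for ch in word]
```

For example, classify_word("C++", {"+"}) puts "C" in category 1 and both "+" characters in category 2, while classify_word("AT&T") with no search characters configured puts "&" in category 3.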

During data processing, each word in the source text (that is, searchable attributes for record search and attribute values for value search) is indexed according to how characters from the three categories are handled. The alternatives for handling them are described in subsequent topics.