This topic describes how search query terms are matched against result text
that contains non-alphanumeric characters.
- During query processing, each user query term is
transformed to replace all non-alphanumeric characters that are not marked as
search characters with delimiters (spaces).
- Non-alphanumeric
characters considered to be punctuation (! @ # & ( ) – [ { } ] : ; ', ? /
*) are treated as white space and preserve word order. This means that the
equivalent of a quoted phrase search is generated. For that reason, all search
features that are incompatible with quoted phrase search, such as spelling
correction, stemming, and thesaurus expansion, are not activated. (For details,
see Using Phrase Search.)
- Non-alphanumeric
characters that are considered to be symbols (` ~ $ ^ + = < > “) are also
treated as white space. However, unlike punctuation characters, they do not
preserve word order in a multi-word search.
- Alphabetic characters in the
user query are replaced with lowercase equivalents, to ensure that they match
against case-folded indexed strings.
- Each query term in the
transformed query must exactly match some indexed string from the given source
text for the text to be considered a hit.
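The transformation rules above can be sketched as follows. This is an illustrative approximation, not the engine's actual implementation: the names transform_term, transform_query, PUNCTUATION, SYMBOLS, and search_chars are hypothetical, the plain hyphen is added to the punctuation set on the assumption that the dash in the list above stands for it, and characters that appear in neither list are assumed to act as plain delimiters.

```python
# Character classes taken from the lists above. The plain hyphen "-" is an
# assumption: the source list shows a dash, and the ice-cream example treats
# the hyphen as punctuation.
PUNCTUATION = set("!@#&()-–[{}]:;',?/*")
SYMBOLS = set('`~$^+=<>"“')


def transform_term(term, search_chars=frozenset()):
    """Return (words, ordered) for one user query term.

    words   -- lowercase strings left after delimiter replacement
    ordered -- True if a punctuation character was replaced, meaning the
               words must match as the equivalent of a quoted phrase
    """
    ordered = False
    out = []
    for ch in term:
        if ch.isalnum() or ch in search_chars:
            # Searchable characters are kept and case-folded.
            out.append(ch.lower())
        else:
            # Every other character becomes a delimiter (space).
            out.append(" ")
            if ch in PUNCTUATION:
                ordered = True
    return "".join(out).split(), ordered


def transform_query(query, search_chars=frozenset()):
    """Apply transform_term to each whitespace-separated query term."""
    return [transform_term(t, search_chars) for t in query.split()]
```

Under these assumptions, transform_term("Ice-Cream") yields the words ice and cream with ordered set to True, while transform_term("ice~cream") yields the same words with ordered set to False. Passing search_chars={"+"} keeps a term such as C++ intact as c++.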
As noted above, when user-entered search terms are parsed, all non-alphanumeric
characters that are not marked as search characters are replaced with white
space. The treatment of word order, however, depends on whether the replaced
character is considered a punctuation character or a symbol: the word order
and proximity of the search term are preserved only for punctuation
characters.
For example, a search query for ice-cream replaces the hyphen (a
punctuation character) with white space and returns only records in which
the words ice and cream are adjacent and in that order, such as:
- ice cream
Records with the following text are not returned because their word order or
word proximity does not match the original query term:
- cream ice
- ice in the cream container
However, assuming the match mode is All, a search for ice~cream (the tilde
is a symbol) would also return non-contiguous results, as for the query
[ice AND cream].
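The difference between the two behaviors can be illustrated with a small sketch. It is a simplified model rather than the engine's matching logic: tokenize, phrase_match, and all_match are hypothetical helpers, and the tokenizer handles only ASCII letters and digits.

```python
import re


def tokenize(text):
    """Case-fold text and split it into alphanumeric words (ASCII only)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_match(words, text):
    """True if words occur in text contiguously and in order.

    Models a query such as ice-cream, where punctuation was replaced.
    """
    tokens = tokenize(text)
    n = len(words)
    return any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1))


def all_match(words, text):
    """True if every word occurs somewhere in text, in any order.

    Models a query such as ice~cream with match mode All.
    """
    tokens = set(tokenize(text))
    return all(w in tokens for w in words)


records = ["ice cream", "cream ice", "ice in the cream container"]
phrase_hits = [r for r in records if phrase_match(["ice", "cream"], r)]
all_hits = [r for r in records if all_match(["ice", "cream"], r)]
```

In this model, phrase_hits contains only the record "ice cream", whereas all_hits contains all three records.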