Punctuation
Punctuation is treated specially in searches.
The following rules describe the interpretation
of punctuation characters.
- Quotation marks are always interpreted as operators signifying
a quoted phrase. It is therefore impossible to search for a quotation
mark (there is no escape character, such as a backslash, which would
remove the special significance of the quotation marks).
- All other punctuation loses any special operator significance
inside of quotation marks. (The same holds for all operators, such
as AND.)
- Outside of quotation marks, punctuation either has significance
as an operator, or it is ignored. The following punctuation has special
operator significance outside of quotation marks:
  - Left and right angle brackets (<>) enclose operators, as in
    <NEAR>.
  - Comma (,) is treated as OR
  - Ampersand (&) is treated as AND
  - Vertical bar (|) is treated as OR
  - Plus (+) and minus (-) are interpreted as Internet Style syntax
  - Asterisk (*) is interpreted as a wildcard character
- Punctuation is always split apart from adjoining alpha-numeric
characters. For example, an advanced search for bag-of-words matches documents containing the three tokens bag, of, and words.
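The rules above can be sketched as a small classifier. This is only an illustration of the rules as stated here, not the search engine's actual parser; the function name classify_punctuation and the label strings are assumptions made for the example.

```python
import string

# Operator meanings taken from the list above; the label strings are illustrative.
OPERATOR_PUNCTUATION = {
    "<": "operator delimiter",
    ">": "operator delimiter",
    ",": "OR",
    "&": "AND",
    "|": "OR",
    "+": "Internet Style syntax",
    "-": "Internet Style syntax",
    "*": "wildcard",
}


def classify_punctuation(query):
    """Return (character, interpretation) pairs for each punctuation mark."""
    result = []
    in_quotes = False
    for ch in query:
        if ch == '"':
            # A quotation mark always opens or closes a phrase; it cannot be escaped.
            in_quotes = not in_quotes
            result.append((ch, "phrase delimiter"))
        elif ch in string.punctuation:
            if in_quotes:
                # Inside a quoted phrase, punctuation is never an operator.
                result.append((ch, "no operator significance"))
            else:
                # Outside quotes: either a known operator, or ignored.
                result.append((ch, OPERATOR_PUNCTUATION.get(ch, "ignored")))
    return result


for item in classify_punctuation('cat, "dog & bird" | fish!'):
    print(item)
```

In this sketch the ampersand inside the quoted phrase is reported as having no operator significance, while the comma and vertical bar outside the quotes are both reported as OR.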
Underscore is treated as punctuation. This means you must enclose
a term containing an underscore in quotes to get an exact match (for
example, "HOST_NAME" matches HOST_NAME, but without
the quotes, it also matches HOST NAME).
Punctuation tokenization is symmetrical: the same splitting is applied
to text stored in the index, so breaking a query term such as
bag-of-words into separate tokens does not prevent the search from
matching a document containing the phrase bag-of-words.
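A minimal sketch of symmetrical tokenization follows. It assumes that a token is simply a run of ASCII letters and digits and that a match requires every query token to appear in the document; both are simplifications for illustration, and the names tokenize and contains_all_tokens are hypothetical.

```python
import re


def tokenize(text):
    """Split text into alphanumeric tokens; all punctuation, including '_', separates tokens."""
    return re.findall(r"[A-Za-z0-9]+", text)


def contains_all_tokens(query, document):
    """Apply the same tokenizer to both sides, then check every query token is present."""
    document_tokens = set(tokenize(document))
    return all(token in document_tokens for token in tokenize(query))


print(tokenize("bag-of-words"))
print(tokenize("HOST_NAME"))
print(contains_all_tokens("bag-of-words", "A bag-of-words model ignores order."))
print(contains_all_tokens("HOST_NAME", "Set the HOST NAME first."))
```

Because the query and the indexed text pass through the same tokenizer, the unquoted term HOST_NAME and the text HOST NAME produce identical tokens, which is why quotes are needed for an exact match.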
Note:
- Terms generated by wildcard expansion are not stemmed.
- Wildcard expansion is performed internally by replacing each pattern
with a limited list of terms that match the pattern before actually
executing the query. Very broad wildcard expressions might therefore
return a partial list of results.
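The effect of limited wildcard expansion can be sketched as follows. The function expand_wildcard, the max_terms parameter, and the limit of 3 are assumptions chosen for the example; the actual limit and term ordering used by the search engine are not stated here. Python's fnmatchcase also gives special meaning to ? and [], which goes beyond the asterisk described above.

```python
from fnmatch import fnmatchcase


def expand_wildcard(pattern, vocabulary, max_terms=3):
    """Replace a wildcard pattern with at most max_terms matching index terms."""
    matching = sorted(term for term in vocabulary if fnmatchcase(term, pattern))
    # Terms beyond the limit are dropped, so documents that contain
    # only those terms would not be returned by the query.
    return matching[:max_terms]


vocabulary = ["search", "searched", "searches", "searching", "seat"]
print(expand_wildcard("search*", vocabulary))
```

With the hypothetical limit of 3, the term searching matches the pattern but is cut from the expansion, which illustrates how a very broad wildcard expression can return a partial list of results.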