Oracle ATG Web Commerce Search uses a common regular expression pattern syntax consisting of the following operand types:

Operand    Description
.          Match any character
x          Match character x
\x         Match character x, which might otherwise have special meaning to the syntax, such as +, *, ?, (, ), and /
[set]      Match any character belonging to the given set, where hyphens denote a range
[^set]     Match any character that does not belong to the given set, where hyphens denote a range

The language allows each operand to have an optional operator immediately following it:

Syntax      Description
operand?    Match zero or one of the operand
operand*    Match zero or more of the operand
operand+    Match one or more of the operand

The language allows operands and operators to be grouped by parentheses to form a larger operand, which also can take operators.

To use a regular expression in a query, denote it as shown:

re/regexp/
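As a minimal sketch of building such a term, the Python helpers below escape a literal string and wrap a pattern in the re/.../ form. The names escape_literal and regex_query_term are hypothetical and are not part of the product. The set of escaped characters extends the examples listed above (+, *, ?, (, ), /) with ., [, ], and \, which is an assumption.

```python
# Characters escaped with a backslash. The text lists + * ? ( ) / as examples;
# adding . [ ] and \ here is an assumption.
SPECIAL = set("+*?()/.[]\\")


def escape_literal(text):
    """Prefix every special character with a backslash so it matches itself."""
    return "".join("\\" + c if c in SPECIAL else c for c in text)


def regex_query_term(pattern):
    """Wrap a regexp pattern in the re/regexp/ query notation."""
    return "re/" + pattern + "/"


# A wildcard term for index terms that start with the literal text "a/c".
print(regex_query_term(escape_literal("a/c") + ".*"))  # re/a\/c.*/
```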

For efficiency, Oracle ATG Web Commerce Search requires that the regexp pattern contain at least one non-optional, non-negative operand: either a literal (non-.) character or a [set] operand that is not followed by a * or ? operator.
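The requirement can be illustrated with a small checker. This is a sketch under stated assumptions, not the product's own validation: the function name has_required_operand is hypothetical, an escaped character (\x) is treated as a literal, and a group followed by * or ? is assumed to make everything inside it optional. The text above does not spell out either of those last two cases.

```python
def has_required_operand(pattern):
    """Return True if the pattern has a literal or [set] operand not made optional by * or ?."""

    def scan(i, depth):
        # Walk operands until the end of input or the ")" that closes this group.
        found = False
        while i < len(pattern):
            ch = pattern[i]
            if ch == ")":
                if depth == 0:
                    raise ValueError("unbalanced ')'")
                return found, i + 1
            if ch == "(":
                # Assumption: a group qualifies if something inside it qualifies.
                qualifies, i = scan(i + 1, depth + 1)
            elif ch == "[":
                negated = pattern.startswith("^", i + 1)
                close = pattern.index("]", i + (2 if negated else 1))
                qualifies, i = not negated, close + 1  # [^set] is a negative operand
            elif ch == "\\":
                if i + 1 == len(pattern):
                    raise ValueError("dangling backslash")
                qualifies, i = True, i + 2  # assumption: \x counts as a literal
            elif ch in "?*+":
                raise ValueError("operator without an operand")
            else:
                qualifies, i = ch != ".", i + 1  # "." is not a literal
            # An operator may follow the operand; ? and * make it optional.
            if i < len(pattern) and pattern[i] in "?*+":
                if pattern[i] in "?*":
                    qualifies = False
                i += 1
            found = found or qualifies
        if depth > 0:
            raise ValueError("unbalanced '('")
        return found, i

    return scan(0, 0)[0]


for p in ["book.*", "r[oa]m", ".*", "[^a-z]b?", "(ab)*"]:
    print(p, has_required_operand(p))
```

Under these assumptions, book.* and r[oa]m pass the check, while .*, [^a-z]b?, and (ab)* do not.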

The pattern book.*, for example, is a wildcard term that matches any index term that starts with the sub-string book, such as book, books, booking, and bookshelf. As an example of a set operand, the pattern r[oa]m matches only the index terms rom and ram.

Regular expressions are a form of term expansion, because a single query term is replaced with a disjunction of alternative terms. In this case, however, the expansions do not come from a thesaurus; they are based solely on the regexp pattern and the characters of the terms in the index.
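The expansion of a pattern into a disjunction of index terms can be sketched in Python. The index dictionary below is invented for illustration, and Python's re module stands in for the Search pattern matcher (an assumption; re accepts a superset of the syntax described here). The pattern must match the whole index term, which is why fullmatch is used.

```python
import re

# Hypothetical index dictionary; a real index holds every term in the collection.
INDEX_TERMS = ["book", "books", "booking", "bookshelf", "notebook",
               "ram", "rim", "rom", "roam"]


def expand(pattern, index_terms):
    """Return, in sorted order, the index terms that the whole pattern matches."""
    compiled = re.compile(pattern)
    return sorted(t for t in index_terms if compiled.fullmatch(t))


print(expand("book.*", INDEX_TERMS))  # ['book', 'booking', 'books', 'bookshelf']
print(expand("r[oa]m", INDEX_TERMS))  # ['ram', 'rom']
```

The query term is then treated as if the user had entered the returned terms as alternatives; notebook and roam are not included because the pattern does not match them in full.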

Note: The regexp patterns expand to index terms, not to morphological forms of words, so the results are not always intuitive. For example, .*desk expands to workdesk, but the search could then retrieve all forms of workdesk, including workdesks, which does not itself match the regexp pattern because of the trailing s. This behavior is by design, because it keeps regular expressions consistent with the rest of Search’s query handling and search results. Regular expressions match against the Search dictionary of index terms, not literally against the text of the collection.

Patterns such as s.* can expand to hundreds or thousands of index terms. To prevent slow searching and poor results, Search limits the number of expansions a regular expression can produce. This limit is configured by wildcardMax, described in Appendix B, Search XML Reference. The default is 20.
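A capped expansion can be sketched as follows. The text does not say what Search does when a pattern exceeds the limit, so simply stopping after the first matches is an assumption, as are the function name expand_limited and the generated index terms. Only the default of 20 comes from the text.

```python
import re
from itertools import islice

WILDCARD_MAX = 20  # default value of wildcardMax stated in the text


def expand_limited(pattern, index_terms, limit=WILDCARD_MAX):
    """Expand a pattern to at most `limit` matching index terms.

    Assumption: matching stops once the limit is reached; the real behavior
    of Search at the limit is not described in the text.
    """
    compiled = re.compile(pattern)
    matches = (t for t in index_terms if compiled.fullmatch(t))
    return list(islice(matches, limit))


# Hypothetical index with 1000 terms that all start with "s".
terms = [f"s{n}" for n in range(1000)]
print(len(expand_limited("s.*", terms)))  # 20
```

Because the matches are produced lazily, the scan stops as soon as the limit is reached instead of testing every index term.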