The Web Crawler uses the standard Java java.util.regex package to parse and match regular expressions. The supported regular-expression constructs are therefore the same as those described on the documentation page for the java.util.regex.Pattern class:
http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html
The valid constructs you can use include:
Character classes (simple, negation, range, intersection, subtraction). For example, [^abc] matches any character except a, b, or c, while [a-zA-Z] matches any uppercase or lowercase letter.
Predefined character classes, such as \d for a digit or \s for a whitespace character.
POSIX character classes (US-ASCII only), such as \p{Alpha} for an alphabetic character, \p{Alnum} for an alphanumeric character, and \p{Punct} for punctuation.
Boundary matchers, such as ^ for the beginning of a line, $ for the end of a line, and \b for a word boundary.
For a full list of valid constructs, see the Pattern class documentation page referenced above.
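To try a pattern before putting it in a crawler configuration, you can test it directly against java.util.regex. The following is a minimal sketch, not part of the Web Crawler: the class name RegexConstructsDemo and the helper methods matchesAll and found are hypothetical, and the sample strings are made up. Whether the crawler applies a pattern to the whole input or searches within it is not stated here, so both behaviors are shown. Note that the doubled backslashes are required only because the patterns appear inside Java string literals.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for experimenting with Pattern constructs.
class RegexConstructsDemo {

    // True only if the pattern matches the entire input string.
    static boolean matchesAll(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // True if the pattern matches anywhere within the input string.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Character classes: negation, range, and intersection with subtraction.
        System.out.println(matchesAll("[^abc]", "d"));
        System.out.println(matchesAll("[a-zA-Z]+", "Crawler"));
        System.out.println(matchesAll("[a-z&&[^aeiou]]+", "xyz"));

        // Predefined character classes.
        System.out.println(matchesAll("\\d+\\s\\d+", "12 34"));

        // POSIX character classes (US-ASCII only).
        System.out.println(matchesAll("\\p{Alpha}\\p{Alnum}*\\p{Punct}", "abc1!"));

        // Boundary matchers: \b for word boundaries, ^ and $ as anchors.
        System.out.println(found("\\bcat\\b", "a cat sat"));
        System.out.println(found("\\bcat\\b", "concatenate"));
        System.out.println(found("^http://.*\\.pdf$", "http://example.com/a.pdf"));
    }
}
```

In java.util.regex, ^ and $ anchor to the beginning and end of the whole input by default; they match at individual line boundaries only when the pattern is compiled with Pattern.MULTILINE or contains the embedded flag (?m).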