Regular expressions

Regular expressions are strings that can contain any of the following special wildcard characters:

.

Matches any single character. For example, .he would match both the and she.
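The wildcard can be tried out with a short sketch. It uses Python's re module as a stand-in, which is an assumption: the product's own matcher is not shown on this page, but the dot behaves the same way in both.

```python
import re

# Python's `re` module stands in for the product's matcher (an assumption).
pattern = re.compile(r".he")

# fullmatch requires the whole word to match: any one character, then "he".
dot_matches = [word for word in ["the", "she", "he"] if pattern.fullmatch(word)]
print(dot_matches)  # ['the', 'she']
```

The bare word he is rejected because the dot must consume exactly one character.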

( )

Forces precedence of operators (for example, +) when the default precedence is not desired. Concatenation of characters (for example, the) precedes other operators such as +, so the+A is equivalent to (the)+A. To match theeeeeeA instead, force + to precede concatenation by using parentheses: th(e+)A.
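The two groupings can be compared in a minimal sketch, using Python's re module as a stand-in (an assumption, not the product's own engine). Note that Python binds + more tightly than concatenation, so an unparenthesized the+A would not follow the precedence described here; the parentheses are therefore written out explicitly in both patterns.

```python
import re

# Python's `re` is a stand-in; explicit parentheses make the grouping
# unambiguous regardless of each dialect's default precedence.
group_word = re.compile(r"(the)+A")  # + repeats the whole sequence "the"
group_char = re.compile(r"th(e+)A")  # + repeats only the letter "e"

repeated_word = bool(group_word.fullmatch("thetheA"))    # True
repeated_char = bool(group_char.fullmatch("theeeeeeA"))  # True
word_vs_char = bool(group_word.fullmatch("theeeeeeA"))   # False

print(repeated_word, repeated_char, word_vs_char)
```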

[ ]

Matches any one of the single characters in the brackets. The brackets act as a logical OR over the enclosed characters. For example, t[hr]e would match the in other and tre in trend.
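The bracket example can be reproduced with a short sketch, assuming Python's re module as a stand-in for the product's matcher. The search call looks for the pattern anywhere inside the word, which mirrors matching the in other.

```python
import re

# Python's `re` stands in for the product's matcher (an assumption).
pattern = re.compile(r"t[hr]e")

found_in_other = pattern.search("other").group()  # "the"
found_in_trend = pattern.search("trend").group()  # "tre"
found_in_toe = pattern.search("toe")              # None: "o" is not in the brackets

print(found_in_other, found_in_trend, found_in_toe)
```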

|

Used between two regular expressions. Matches if either regular expression matches.
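As an illustration of alternation, the sketch below uses Python's re module as a stand-in (an assumption). The pattern cat|dog and the sample words are hypothetical and do not come from this page.

```python
import re

# Hypothetical pattern: matches if either "cat" or "dog" matches.
pattern = re.compile(r"cat|dog")

either = [word for word in ["cat", "dog", "cow"] if pattern.fullmatch(word)]
print(either)  # ['cat', 'dog']
```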

\

Matches the special character that follows the backslash. The backslash is the escape character: any special character that follows it is treated as a literal character rather than as an operator. For example, \* would match *.

Some special uses of the escape character are:

  • \n matches a newline character
  • \t matches a tab character
  • \b matches a blank character
  • \w matches \n, \t, \b or \0
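The escapes above can be approximated in a short sketch that uses Python's re module as a stand-in. This is an assumption with a caveat: in Python, \b and \w have different meanings (word boundary and word character), so the sketch spells out this page's \w as an explicit bracket list. The name doc_whitespace is hypothetical.

```python
import re

# Escaped asterisk: matches a literal "*", as described above.
star = re.compile(r"\*")

# Hypothetical stand-in for this page's \w: newline, tab, blank, or NUL.
# Python's own \w and \b mean something else, so they are not used here.
doc_whitespace = re.compile(r"[\n\t \x00]")

literal_star = star.search("a*b").group()
whitespace_hits = [
    ch for ch in ["\n", "\t", " ", "\0", "x"] if doc_whitespace.fullmatch(ch)
]

print(repr(literal_star), whitespace_hits)
```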

*

Matches 0 or more instances of the regular expression. For example, (the)*A would match A, theA, and thetheA.

+

Matches 1 or more instances of the regular expression. For example, (the)+A would match both theA and thetheA, but it would not match A by itself as (the)*A would.
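The difference between * and + can be checked with a minimal sketch, again assuming Python's re module as a stand-in for the product's matcher. Both patterns are fully parenthesized, so the result does not depend on either dialect's default precedence.

```python
import re

# Python's `re` stands in for the product's matcher (an assumption).
zero_or_more = re.compile(r"(the)*A")
one_or_more = re.compile(r"(the)+A")

samples = ["A", "theA", "thetheA"]
star_matches = [s for s in samples if zero_or_more.fullmatch(s)]
plus_matches = [s for s in samples if one_or_more.fullmatch(s)]

print(star_matches)  # ['A', 'theA', 'thetheA']
print(plus_matches)  # ['theA', 'thetheA']
```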

These characters have special meaning between square brackets:

~

As the first character inside the brackets, matches any single character that is not among the characters or ranges listed in the brackets. For example, t[~hr]e would match toe, but not the.
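Negation can be sketched with Python's re module as a stand-in, under one stated assumption: Python writes the negation character as ^ rather than ~, so t[~hr]e from this page becomes t[^hr]e in the sketch.

```python
import re

# Python spells bracket negation as ^; this page's dialect uses ~.
pattern = re.compile(r"t[^hr]e")

negated_matches = [w for w in ["toe", "the", "tre"] if pattern.fullmatch(w)]
print(negated_matches)  # ['toe']
```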

-

Between a pair of characters, matches any single character in the range. For example, [A-E] would match the letters A, B, C, D, and E. Character range pairs can be either of the following:

  • same case alphabetics, in which the first character comes before (or is equal to) the second character in the alphabet
  • digits, in which the first digit is less than or equal to the second digit
    (for example, [4-9] or [0-3]).
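The two kinds of range can be exercised in a short sketch, assuming Python's re module as a stand-in for the product's matcher. Ranges such as [A-E] and [4-9] are written the same way in both.

```python
import re

# Python's `re` stands in for the product's matcher (an assumption).
letters = re.compile(r"[A-E]")  # same-case alphabetic range
digits = re.compile(r"[4-9]")   # digit range

letter_hits = [c for c in "ABCDEFa" if letters.fullmatch(c)]
digit_hits = [c for c in "0123456789" if digits.fullmatch(c)]

print(letter_hits)  # ['A', 'B', 'C', 'D', 'E']
print(digit_hits)   # ['4', '5', '6', '7', '8', '9']
```

The lowercase a is rejected by [A-E], consistent with the same-case rule above.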

You can also use the escape character between brackets.

Usage notes

The order of precedence of the regular expression special characters is as follows: ( ), [ ], concatenated characters (for example, abc), *, +, ~, , |, .

The following examples show some of the implications of the order of precedence: