About Regular Expressions

To work with regular expressions (regex) and with this feature, you need to understand two fundamental ideas:

  • A regular expression is a pattern, written as a string, that is matched against other string values.
  • You can use groupings to store matched values on which you can then operate.
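
As a rough illustration of both ideas, the sketch below uses Python's built-in re module. This is an assumption made for the example only: this document does not name a specific engine, and engines differ in their details. The function name find_range and the sample text are hypothetical.

```python
import re

def find_range(text):
    """Return the two numbers of the first 'N-M' range in text, or None."""
    # The pattern is itself a string; each pair of parentheses stores a value.
    match = re.search(r"([0-9]+)-([0-9]+)", text)
    if match is None:
        return None
    # The stored values can be operated on after the match succeeds.
    low, high = int(match.group(1)), int(match.group(2))
    return low, high

print(find_range("pages 10-25"))    # (10, 25)
print(find_range("no range here"))  # None
```

The pattern matches the substring 10-25, and the two groupings hold 10 and 25 so that they can be converted to numbers afterward.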

To learn more about regex, you can visit the following website, which has information and tutorials to help you get started: http://www.regular-expressions.info/.

Most of the characters you can type on your keyboard are literal, ordinary characters: they represent themselves in the pattern. Some characters have special meaning, however, and they instruct the regex engine (the function that interprets the expression) to treat the pattern in designated ways. The following table outlines these special characters, or metacharacters.

Character Name Description
. dot Matches any single character, including a space. Exactly one character must be present for the dot to match.

Literally a . (dot) when bracketed ([]) or when preceded by a \ (backslash).

* star/asterisk Matches zero or more of the preceding character, bracketed character class, or group in parentheses (0, 1, or any number of times). Used for quantification.

Typically used with a . (dot) in the format .* to match any character, 0 or more times.

Literally an * (asterisk) when bracketed ([]).

+ plus Matches one or more of the preceding character, bracketed character class, or group in parentheses. Used for quantification.

Literally a + (plus sign) when bracketed ([]).

| bar/vertical bar/pipe Matches either the expression to its left or the expression to its right; the bar separates the alternatives. The two sides are not always both tried: the right side is attempted only if the left side does not match. Used for alternation.
{ left brace Begins an interval range, which is ended with a } (right brace); the range specifies how many times the preceding single character or group in parentheses must repeat.

Interval ranges are entered as a minimum and a maximum ({minimum,maximum}), where the character or group must appear at least the minimum number of times and at most the maximum. You can also use these characters to set the exact number of times a character must appear ({number}), or to set a minimum value without a maximum ({minimum,}).
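
The quantifiers described so far can be compared side by side. This is a minimal sketch that assumes Python's re module as the engine (this document does not specify one); the helper name fits is hypothetical, and re.fullmatch is used so that the whole test string has to fit the pattern.

```python
import re

def fits(pattern, text):
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# * lets the preceding character appear zero or more times.
print(fits(r"ab*c", "ac"), fits(r"ab*c", "abbbc"))       # True True
# + requires the preceding character at least once.
print(fits(r"ab+c", "ac"), fits(r"ab+c", "abc"))         # False True
# {2,3} requires two or three occurrences.
print(fits(r"ab{2,3}c", "abc"), fits(r"ab{2,3}c", "abbc"))  # False True
# {2,} sets only a minimum; {2} requires exactly two.
print(fits(r"ab{2,}c", "abbbbc"), fits(r"ab{2}c", "abbbc"))  # True False
```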

? question mark Signifies that the preceding character or group in parentheses is optional; the character or group can appear once or not at all.
^ caret Acts as an anchor to represent the beginning of a string.
$ dollar sign Acts as an anchor to represent the end of a string.
[ left bracket Acts as the start of a bracketed character class, ended with the ] (right bracket). A character class is a list of character options; exactly one of the characters in the bracketed class must appear for a match. A - (dash) between two characters enclosed by brackets designates a range; for example, [a-z] is the range of the twenty-six lowercase letters of the alphabet.

Note that the ] (right bracket) ends a bracketed character class unless it directly follows the opening [ (left bracket) or a ^ (caret) placed immediately after the opening bracket; in those two cases, it is a literal character.
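
The bracket and anchor rules above can be checked with a short sketch. It assumes Python's re module, which happens to follow the same rule for a ] placed first in a class; other engines may behave differently, and the variable names are hypothetical.

```python
import re

# [a-z] is a range; ^ and $ anchor the pattern to the start and end of the string.
lowercase_word = re.compile(r"^[a-z]+$")
all_lower = lowercase_word.search("regex") is not None
has_upper = lowercase_word.search("Regex") is not None
print(all_lower, has_upper)  # True False

# Inside brackets, the dot, asterisk, and plus sign are literal characters.
literals = re.findall(r"[.*+]", "a.b*c+d")
print(literals)  # ['.', '*', '+']

# A ] placed directly after the opening [ is a literal member of the class.
with_bracket = re.findall(r"[]x]", "x]y")
print(with_bracket)  # ['x', ']']

# The same holds when ] directly follows a ^ placed after the opening bracket.
after_caret = re.findall(r"[^]x]", "x]y")
print(after_caret)  # ['y']
```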

( left parenthesis Creates a grouping when used with the ) (right parenthesis). Groupings have two functions:

  • They delimit part of a pattern so that special characters, such as quantifiers, apply to the whole group as if it were a single character.
  • They allow the text matched by the group to be stored and referenced later (so that other operations can be performed on it).
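
Both functions of a grouping can be sketched briefly. The example assumes Python's re module, where \1 and \2 in a replacement string refer to stored groups; the reference syntax in other engines may differ. The function name swap_name and the sample strings are hypothetical.

```python
import re

# Function 1: a group behaves like a single character, so + repeats all of "ab".
group_repeats = re.fullmatch(r"(ab)+", "ababab") is not None
# Without the group, + applies only to the "b", so "ababab" does not fit.
char_repeats = re.fullmatch(r"ab+", "ababab") is not None
print(group_repeats, char_repeats)  # True False

# A group also limits the reach of | to the alternatives inside it.
spellings = [w for w in ("gray", "grey", "groy") if re.fullmatch(r"gr(a|e)y", w)]
print(spellings)  # ['gray', 'grey']

def swap_name(text):
    """Rewrite 'Last, First' as 'First Last' using the two stored groups."""
    # \1 and \2 refer back to the text stored by the first and second group.
    return re.sub(r"([A-Za-z]+), ([A-Za-z]+)", r"\2 \1", text)

print(swap_name("Lovelace, Ada"))  # Ada Lovelace
```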