Regular Expression Syntax

Time Savers
 \w	Match a "word" character (alphanumeric plus "_")
 \W	Match a non-word character
 \s	Match a whitespace character
 \S	Match a non-whitespace character
 \d	Match a digit character
 \D	Match a non-digit character
 .	Match any character (a newline only when the (?s) flag is set)

Pattern Repeat Specifiers
 *	Match 0 or more times (equivalent to {0,})
 +	Match 1 or more times (equivalent to {1,})
 ?	Match 0 or 1 times (equivalent to {0,1})
 {n}	Match exactly n times
 {n,}	Match at least n times
 {n,m}	Match at least n but not more than m times

Any of these can be followed by an optional ? character, which reverses the specifier's greediness:
under the default ungreedy setting it then takes as many characters as possible rather than as few
as possible. To make the whole pattern greedy instead, use the (?-U) flag (see below).
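
The repeat specifiers can be tried out with a short sketch in Python. This is an illustration only: Python's re module is assumed here as a stand-in and is not the engine this page documents. In particular, Python is greedy by default, so there a trailing ? makes a specifier take as few characters as possible, the mirror image of the default described above. The helper name matches and the sample strings are made up for the example.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


# * is {0,}, + is {1,}
star_zero = matches(r"ab*", "a")       # zero b's are allowed
plus_zero = matches(r"ab+", "a")       # at least one b is required

# {n} and {n,m}
exactly_three = matches(r"a{3}", "aaa")
too_many = matches(r"a{2,4}", "aaaaa")  # five a's exceed the upper bound

# Python default is greedy; appending ? flips the specifier to lazy.
greedy = re.search(r"<.+>", "<b>bold</b>").group()
lazy = re.search(r"<.+?>", "<b>bold</b>").group()

print(star_zero, plus_zero, exactly_three, too_many)
print(greedy, lazy)
```

In an engine that is ungreedy by default, the roles of the last two patterns are swapped: the plain specifier is the short match and the ? form is the long one.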

Special Characters
 \	Treat next character as literal, for example \. \\ \$ \^ \|
 \t	Tab
 \f	Form Feed
 \r	Carriage Return
 \n	Line Feed
 \xHH	HH are 2 hex digits specifying the character code
 ^	Match the beginning of the line
 $	Match the end of the line
 \b	Match word boundary
 \B	Match non-word boundary
 \A	Match start of subject (independent of multi-line mode)
 \Z	Match end of subject, or before a newline at the end (independent of multi-line mode)
 \z	Match end of subject (independent of multi-line mode)
 |	Alternation
 ()	Grouping
 []	Character class (match any one of these characters)
 [^]	Negated character class (match any one character that is not listed)
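
A few of the special characters above, sketched in Python. Python's re module is an assumption made for illustration, not the engine this page describes: it needs re.MULTILINE for ^ and $ to match at line breaks, and its \Z behaves differently, so \Z and \z are left out. The sample text is made up.

```python
import re

text = "cat concat\ncatalog cat."

# \b: only the standalone occurrences of "cat" are found.
whole_words = re.findall(r"\bcat\b", text)

# \B: "cat" that does NOT start at a word boundary (the one inside "concat").
inside_word = re.findall(r"\Bcat", text)

# ^ and $ at line starts and ends (multi-line mode), \A at the subject start only.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)
line_ends = re.findall(r"\S+$", text, re.MULTILINE)
subject_start = re.findall(r"\A\w+", text, re.MULTILINE)

# \ makes the dot literal; \x41 is the character with hex code 41 ("A").
literal_dots = re.findall(r"\.", text)
hex_match = re.fullmatch(r"\x41", "A") is not None

print(whole_words, inside_word, line_starts, line_ends, subject_start)
```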

 E-mail Address Validator  ^[\w.'%+-]+@([\w-]+\.)+[A-Za-z]{2,4}$
 UK Postcode Validator     ^[A-Za-z]{1,2}\d{1,2}[A-Za-z]? \d[A-Za-z]{2}$
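
The two validators can be exercised with a small Python sketch. The patterns are copied from the lines above; everything else (the function names, the sample inputs, and the use of Python's re module rather than the engine this page documents) is an assumption for illustration. Note that in Python \w and \d also accept non-ASCII letters and digits unless re.ASCII is passed.

```python
import re

EMAIL = re.compile(r"^[\w.'%+-]+@([\w-]+\.)+[A-Za-z]{2,4}$")
POSTCODE = re.compile(r"^[A-Za-z]{1,2}\d{1,2}[A-Za-z]? \d[A-Za-z]{2}$")


def is_email(text: str) -> bool:
    """True when text satisfies the e-mail pattern above."""
    return EMAIL.search(text) is not None


def is_postcode(text: str) -> bool:
    """True when text satisfies the UK postcode pattern above."""
    return POSTCODE.search(text) is not None


print(is_email("edward@example.com"))      # domain label plus 3-letter ending
print(is_email("edward@example"))          # no dot after the domain label
print(is_email("edward@example.museum"))   # ending longer than {2,4} allows
print(is_postcode("SW1A 2AA"))
print(is_postcode("SW1A2AA"))              # the pattern requires the space
```

The {2,4} bound at the end of the e-mail pattern rejects addresses whose final label is longer than four letters, which is worth keeping in mind before relying on it.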

 [abc] matches a, b or c; [^abc] matches any character except a, b or c
 [0-3] matches 0, 1, 2 or 3; [^0-3] matches any character except 0, 1, 2 or 3
 [03-] matches 0, 3 or - (a trailing hyphen is literal); [^03-] matches any character except 0, 3 or -
 [aeiou]{5} matches a sequence of exactly 5 lowercase vowels
 (cat|dog|kid)nap matches catnap, dognap or kidnap
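
The character class and alternation examples above behave the same way in most engines. A minimal check in Python, assuming its re module as a stand-in for the engine described here, with made-up sample strings:

```python
import re

# A range versus a literal hyphen at the end of the class.
in_range = re.findall(r"[0-3]", "0-3 12 45")
with_hyphen = re.findall(r"[03-]", "0-3 12")
negated = re.findall(r"[^0-3]", "0-3 12")

# A class followed by a repeat specifier.
five_vowels = re.fullmatch(r"[aeiou]{5}", "aeiou") is not None
four_vowels = re.fullmatch(r"[aeiou]{5}", "aeio") is not None

# Grouping with alternation; finditer is used so the whole match is returned
# rather than just the text captured by the group.
naps = [m.group() for m in re.finditer(r"(cat|dog|kid)nap",
                                       "catnap dognap kidnap bednap")]

print(in_range, with_hyphen, negated, five_vowels, four_vowels, naps)
```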

 \w	is equivalent to [A-Za-z0-9_]
 \W	is equivalent to [^A-Za-z0-9_]
 \s	is equivalent to [ \t\n\r\f]
 \S	is equivalent to [^ \t\n\r\f]
 \d	is equivalent to [0-9]
 \D	is equivalent to [^0-9]
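
These equivalences can differ slightly between engines. The sketch below compares each shorthand with its explicit class over the 128 ASCII characters, assuming Python's re module with the re.ASCII flag (not the engine documented here). The helper name differences is hypothetical. In Python, \s also accepts the vertical tab, so it is not identical to the class listed above.

```python
import re


def differences(shorthand: str, explicit: str) -> list:
    """Return the ASCII characters on which the two patterns disagree."""
    result = []
    for code in range(128):
        ch = chr(code)
        short_hit = re.fullmatch(shorthand, ch, re.ASCII) is not None
        explicit_hit = re.fullmatch(explicit, ch) is not None
        if short_hit != explicit_hit:
            result.append(ch)
    return result


word_diff = differences(r"\w", r"[A-Za-z0-9_]")
non_word_diff = differences(r"\W", r"[^A-Za-z0-9_]")
digit_diff = differences(r"\d", r"[0-9]")
space_diff = differences(r"\s", r"[ \t\n\r\f]")

print(word_diff, non_word_diff, digit_diff, space_diff)
```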

By default, text matching is ungreedy: a match returns the minimum number of characters that
satisfy the regular expression. To return the maximum number of matching characters instead,
prefix the expression with (?-U). For example,
when tested against [\w.'%+-]+@([\w-]+\.)+[A-Za-z]{2,4} gives edwardwoodward@japanco.or
when tested against (?-U)[\w.'%+-]+@([\w-]+\.)+[A-Za-z]{2,4} gives
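
The contrast between the ungreedy and greedy forms can be reproduced in Python as a rough sketch. Two things here are assumptions: the subject string is hypothetical (it is not taken from the text above), and Python's re module has no (?U) flag, so the ungreedy default is imitated by writing a ? after every repeat specifier, while the plain pattern plays the role of the (?-U) version.

```python
import re

# Hypothetical subject string, chosen only for illustration.
subject = "edwardwoodward@japanco.org"

# Every specifier made lazy: imitates the ungreedy default described above.
ungreedy_pattern = r"[\w.'%+-]+?@([\w-]+?\.)+?[A-Za-z]{2,4}?"

# Python's normal greedy behaviour: imitates the (?-U) form.
greedy_pattern = r"[\w.'%+-]+@([\w-]+\.)+[A-Za-z]{2,4}"

ungreedy_result = re.search(ungreedy_pattern, subject).group()
greedy_result = re.search(greedy_pattern, subject).group()

print(ungreedy_result)
print(greedy_result)
```

The ungreedy form stops as soon as the final {2,4} has its minimum of two letters, which is why the last letter of the subject is cut off.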

Flags can appear anywhere in a pattern and take effect from that position onwards. They work as follows:

 (?i)	turn on case insensitivity (usually the default)
 (?-i)	turn off case insensitivity
 (?m)	turn on multi-line matching, in which ^ ("start of line") matches immediately after any
	newline and $ ("end of line") matches immediately before any newline in the subject
	string, as well as at the very start and end (default)
 (?-m)	turn off multi-line matching
 (?s)	make the dot character (.) match newlines as well as all other characters
 (?-s)	make the dot character (.) not match newlines, but match all other characters (default)
 (?x)	ignore white space characters in the pattern
 (?-x)	do not ignore white space characters in the pattern (default)
 (?U)	match the minimum number of characters (default)
 (?-U)	match the maximum number of characters
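
Several of these flags can be tried in Python, with caveats. Python's re module is assumed here purely for illustration and differs from the engine described above: none of the flags is on by default, there is no (?U) flag, a flag for the whole pattern must be written at the very start, and switching a flag off is only possible inside a scoped group such as (?-i:...). The sample strings are made up.

```python
import re

# (?i): case-insensitive matching.
case_insensitive = re.search(r"(?i)hello", "HELLO world") is not None

# (?-i:...) switches case insensitivity off for one group only.
scoped_ok = re.fullmatch(r"(?i)ab(?-i:c)", "ABc") is not None
scoped_fail = re.fullmatch(r"(?i)ab(?-i:c)", "ABC") is not None

# (?m): ^ matches at the start of every line.
line_starts = re.findall(r"(?m)^\w+", "one two\nthree four")

# (?s): the dot also matches a newline.
dot_without_s = re.search(r"a.b", "a\nb") is not None
dot_with_s = re.search(r"(?s)a.b", "a\nb") is not None

# (?x): white space inside the pattern is ignored.
verbose = re.fullmatch(r"(?x) \d{3} - \d{4}", "555-1234") is not None

print(case_insensitive, scoped_ok, scoped_fail)
print(line_starts, dot_without_s, dot_with_s, verbose)
```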

