regex

Synopsis

Regular expressions (also referred to as rational expressions) are sequences of characters that specify a search pattern in the text. Such patterns are often used in string-searching algorithms to perform “find” and “find and replace” operations on strings, or to validate inputs.

Anchors

OperatorsDescription
^Start of string, or start of line in multi-line pattern
\AStart of string
$End of string, or end of line in multi-line pattern
\ZEnd of string
\bWord boundary
\BNot word boundary
\<Start of word
\>End of word

Quanti­fiers

OperatorsOccurrenceExampleDescription
?0 or 1{3,5}3, 4 or 5
*0 or more{3}Exactly 3
+1 or more{3,}3 or more

Add a ? to a quantifier to make it ungreedy.

Groups and Ranges

OperatorDescription
.Any character except new line (\n)
(a|b)a or b
(...)Group
(?:...)Passive (non-c­apt­uring) group
[abc]Range (a or b or c)
[^abc]Not (a or b or c)
[a-q]Lower case letter from a to q
[A-Q]Upper case letter from A to Q
[0-7]Digit from 0 to 7
\xGroup/­sub­pattern number “­x”

Ranges are inclusive.

Escape Sequences

OperatorDescription
\Escape following character
\QBegin literal sequence
\EEnd literal sequence

Escaping is a way of treating characters which have a special meaning in regex literally, rather than as special charac­ters.

Common Meta-characters

^[.$
{*(\
+)|?
<>

The escape character is usually \

Special Characters

OperatorDescription
\nNew line
\rCarriage return
\tTab
\vVertical tab
\fForm feed
\xxxOctal character xxx
\xhhHex character hh

Assertions

OperatorDescription
?=Lookahead assertion
?!Negative lookahead
?<=Lookbehind assertion
?!=
or
?<!
Negative lookbehind
?>Once-only SubExp­ression
?()Condition [if then]
?()|Condition [if then else]
?#Comment

String Replacement

OperatorDescription
$nnth non-passive group
$2“xyz” in /^(abc(xyz))$
$1“xyz” in /^(?:abc)(xyz)$/
$`Before matched string
$'After matched string
$+Last matched string
$&Entire matched string

Some regex implementations use \ instead of $.

Pattern Modifiers

OperatorDescription
gGlobal match
i *Case-i­nse­nsitive
m *Multiple lines
s *Treat string as single line
x *Allow comments and whitespace in pattern
e *Evaluate replac­ement
U *Ungreedy pattern

* —> PCRE modifier