

3.1 Fundamental Structure

In regular expressions, the characters ‘.?*+{|()[\^$’ are special characters and have uses described below. All other characters are ordinary characters, and each ordinary character is a regular expression that matches itself.

The period ‘.’ matches any single character. It is unspecified whether ‘.’ matches an encoding error.
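The two rules above can be checked from a shell. This is a minimal sketch, assuming a POSIX shell and a `grep` that accepts `-E`; the sample words and the variable names are made up for illustration.

```shell
# 'c.t' requires exactly one character (any character) between 'c' and 't'.
# "ct" has nothing in between and "coat" has two characters, so neither matches.
dot_matches=$(printf '%s\n' cat cot c.t ct coat | grep -E 'c.t')
printf '%s\n' "$dot_matches"

# Ordinary characters match themselves: 'cat' selects lines containing "cat".
literal_matches=$(printf '%s\n' cat cot concat | grep -E 'cat')
printf '%s\n' "$literal_matches"
```

The first command prints ‘cat’, ‘cot’ and ‘c.t’; the second prints ‘cat’ and ‘concat’.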

A regular expression may be followed by one of several repetition operators; the operators beginning with ‘{’ are called interval expressions.

?

The preceding item is optional and is matched at most once.

*

The preceding item is matched zero or more times.

+

The preceding item is matched one or more times.

{n}

The preceding item is matched exactly n times.

{n,}

The preceding item is matched n or more times.

{,m}

The preceding item is matched at most m times. This is a GNU extension.

{n,m}

The preceding item is matched at least n times, but not more than m times.
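The repetition operators listed above can be compared side by side. The following is a sketch, assuming a POSIX shell with `grep -E` and `paste`; `match_whole` is a hypothetical helper, and `-x` is used so that a line is selected only when the whole line matches. The ‘{,m}’ form is left out because it is a GNU extension and may not be accepted elsewhere.

```shell
# Print, on one line, the sample lines that the pattern in $1 matches in full.
# The samples are 'b' preceded by zero to four 'a' characters.
match_whole() {
  printf '%s\n' b ab aab aaab aaaab | grep -Ex "$1" | paste -s -d ' ' -
}

opt=$(match_whole 'a?b')         # 'a' at most once
star=$(match_whole 'a*b')        # 'a' zero or more times
plus=$(match_whole 'a+b')        # 'a' one or more times
exact=$(match_whole 'a{2}b')     # 'a' exactly twice
atleast=$(match_whole 'a{2,}b')  # 'a' two or more times
range=$(match_whole 'a{1,3}b')   # 'a' one to three times

printf '%s\n' "$opt" "$star" "$plus" "$exact" "$atleast" "$range"
```

In each pattern the operator applies only to the ‘a’ immediately before it, which is why the trailing ‘b’ is always required exactly once.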

The empty regular expression matches the empty string. Two regular expressions may be concatenated; the resulting regular expression matches any string formed by concatenating two substrings that respectively match the concatenated expressions.

Two regular expressions may be joined by the infix operator ‘|’; the resulting regular expression matches any string that matches either of the two alternative expressions.
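As a small illustration of alternation, the sketch below assumes a POSIX shell and `grep -E`; the sample words are invented.

```shell
# 'cat|dog' selects every line that contains "cat" or "dog".
alt_matches=$(printf '%s\n' cat dog bird hotdog | grep -E 'cat|dog')
printf '%s\n' "$alt_matches"
```

This prints ‘cat’, ‘dog’ and ‘hotdog’. The last line is selected because, without anchoring, a match anywhere in the line is enough.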

Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole expression may be enclosed in parentheses to override these precedence rules and form a subexpression. An unmatched ‘)’ matches just itself.
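The precedence rules are easiest to see by comparing a pattern with and without parentheses. This is a sketch under the same assumptions as before (POSIX shell, `grep -E`, `paste`); `words` is a hypothetical helper that supplies sample lines, and `-x` requires the whole line to match.

```shell
# Sample lines used by all four checks below.
words() { printf '%s\n' ab cd abd acd ad abab abb; }

# Concatenation binds tighter than alternation: 'ab|cd' means (ab)|(cd).
no_group=$(words | grep -Ex 'ab|cd' | paste -s -d ' ' -)

# Parentheses restrict the alternation to the middle character.
grouped=$(words | grep -Ex 'a(b|c)d' | paste -s -d ' ' -)

# Repetition binds tighter than concatenation: 'ab+' means a(b+).
rep_tight=$(words | grep -Ex 'ab+' | paste -s -d ' ' -)

# Parentheses make '+' apply to the whole subexpression 'ab'.
rep_group=$(words | grep -Ex '(ab)+' | paste -s -d ' ' -)

printf '%s\n' "$no_group" "$grouped" "$rep_tight" "$rep_group"
```

The four output lines are ‘ab cd’, ‘abd acd’, ‘ab abb’ and ‘ab abab’.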

