Next: , Previous: findutils-default regular expression syntax, Up: Regular Expressions

8.5.2 ‘awk’ regular expression syntax

The character ‘.’ matches any single character except the null character.

indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
matches a ‘+
matches a ‘?’.

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example ‘[z-a]’, are invalid. Within square brackets, ‘\’ can be used to quote the following character. Character classes are not supported, so for example you would need to use ‘[0-9]’ instead of ‘[[:digit:]]’.

GNU extensions are not supported and so ‘\w’, ‘\W’, ‘\<’, ‘\>’, ‘\b’, ‘\B’, ‘\`’, and ‘\'’ match ‘w’, ‘W’, ‘<’, ‘>’, ‘b’, ‘B’, ‘`’, and ‘'’ respectively.

Grouping is performed with parentheses ‘()’. An unmatched ‘)’ matches just itself. A backslash followed by a digit matches that digit.

The alternation operator is ‘|’.

The characters ‘^’ and ‘$’ always represent the beginning and end of a string respectively, except within square brackets. Within brackets, ‘^’ can be used to invert the membership of the character class being specified.

*’, ‘+’ and ‘?’ are special at any point in a regular expression except:

  1. At the beginning of a regular expression
  2. After an open-group, signified by ‘(
  3. After the alternation operator ‘|

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.