

18.2.4 The Backslash Character

The ‘\’ character has one of four different meanings, depending on the context in which you use it and what syntax bits are set (see Syntax Bits). It can: 1) stand for itself, 2) quote the next character, 3) introduce an operator, or 4) do nothing.

  1. It stands for itself inside a list (see List Operators ([] and [^])) if the syntax bit RE_BACKSLASH_ESCAPE_IN_LISTS is not set. For example, ‘[\]’ would match ‘\’.
  2. It quotes (makes ordinary, if it’s special) the next character when you use it either:
    • outside a list,(3) or
    • inside a list and the syntax bit RE_BACKSLASH_ESCAPE_IN_LISTS is set.
  3. It introduces an operator when followed by certain ordinary characters, sometimes only when certain syntax bits are set. See the cases RE_BK_PLUS_QM, RE_NO_BK_BRACES, RE_NO_BK_VBAR, RE_NO_BK_PARENS, RE_NO_BK_REF in Syntax Bits.
  4. In all other cases, Regex ignores ‘\’. For example, ‘\n’ matches ‘n’.
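The four cases above can be checked with a short C sketch. It assumes glibc, whose <regex.h> exposes the GNU interface (re_set_syntax, re_compile_pattern, re_search) when _GNU_SOURCE is defined, and it treats a syntax value of 0 as "no syntax bits set". The helper names backslash_matches and backslash_demo are hypothetical and exist only for this illustration.

```c
#define _GNU_SOURCE            /* assumption: glibc, which exposes the GNU regex API */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: returns 1 if PATTERN, compiled under SYNTAX, matches
   somewhere in TEXT; 0 if it does not; -1 if the pattern fails to compile. */
static int backslash_matches(reg_syntax_t syntax, const char *pattern,
                             const char *text)
{
    struct re_pattern_buffer buf;
    memset(&buf, 0, sizeof buf);          /* let the library allocate the buffer */
    re_set_syntax(syntax);                /* applies to the next compilation */
    if (re_compile_pattern(pattern, strlen(pattern), &buf) != NULL)
        return -1;
    regoff_t len = (regoff_t) strlen(text);
    regoff_t pos = re_search(&buf, text, len, 0, len, NULL);
    regfree(&buf);
    return pos >= 0;
}

/* One assertion group per numbered case in the list above. */
static void backslash_demo(void)
{
    /* 1. Inside a list, bit not set: '\' stands for itself. */
    assert(backslash_matches(0, "[\\]", "\\") == 1);
    assert(backslash_matches(0, "[\\]", "a") == 0);

    /* 2. Outside a list, '\' quotes the next character. */
    assert(backslash_matches(0, "a\\*b", "a*b") == 1);
    assert(backslash_matches(0, "a\\*b", "ab") == 0);

    /* 2. Inside a list with RE_BACKSLASH_ESCAPE_IN_LISTS set, '\' quotes ']'. */
    assert(backslash_matches(RE_BACKSLASH_ESCAPE_IN_LISTS, "[\\]]", "]") == 1);
    assert(backslash_matches(RE_BACKSLASH_ESCAPE_IN_LISTS, "[\\]]", "\\") == 0);

    /* 3. '\(' opens a group unless RE_NO_BK_PARENS is set. */
    assert(backslash_matches(0, "\\(a\\)", "a") == 1);
    assert(backslash_matches(RE_NO_BK_PARENS, "\\(a\\)", "a") == 0);
    assert(backslash_matches(RE_NO_BK_PARENS, "\\(a\\)", "(a)") == 1);

    /* 4. Otherwise '\' is ignored: '\n' matches the letter 'n'. */
    assert(backslash_matches(0, "\\n", "n") == 1);
    assert(backslash_matches(0, "\\n", "\n") == 0);
}
```

Calling backslash_demo() from a program's main function runs every check. Note that each backslash is doubled in the C string literals, so the C source "[\\]" is the regular expression ‘[\]’.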

Footnotes

(3)

Sometimes you don’t have to explicitly quote special characters to make them ordinary. For instance, most characters lose any special meaning inside a list (see List Operators ([] and [^])). In addition, if the syntax bits RE_CONTEXT_INVALID_OPS and RE_CONTEXT_INDEP_OPS aren’t set, then (for historical reasons) the matcher considers special characters ordinary if they are in contexts where the operations they represent make no sense. For example, the match-zero-or-more operator (represented by ‘*’) then matches itself in the regular expression ‘*foo’, because there is no preceding expression on which it can operate. It is poor practice, however, to depend on this behavior; if you want a special character to be ordinary outside a list, it’s better to always quote it.
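As a minimal sketch of the ‘*foo’ case, the following C fragment compiles the pattern with no syntax bits set (so neither RE_CONTEXT_INVALID_OPS nor RE_CONTEXT_INDEP_OPS is in effect) and compares it with the explicitly quoted form. It assumes glibc's GNU regex interface under _GNU_SOURCE; the helper names star_matches and leading_star_demo are hypothetical.

```c
#define _GNU_SOURCE            /* assumption: glibc, which exposes the GNU regex API */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: 1 if PATTERN (compiled under SYNTAX) matches somewhere
   in TEXT, 0 if not, -1 if the pattern does not compile. */
static int star_matches(reg_syntax_t syntax, const char *pattern,
                        const char *text)
{
    struct re_pattern_buffer buf;
    memset(&buf, 0, sizeof buf);
    re_set_syntax(syntax);
    if (re_compile_pattern(pattern, strlen(pattern), &buf) != NULL)
        return -1;
    regoff_t len = (regoff_t) strlen(text);
    regoff_t pos = re_search(&buf, text, len, 0, len, NULL);
    regfree(&buf);
    return pos >= 0;
}

static void leading_star_demo(void)
{
    /* Unquoted leading '*' has nothing to repeat, so it is taken literally. */
    assert(star_matches(0, "*foo", "*foo") == 1);
    assert(star_matches(0, "*foo", "foo") == 0);

    /* The quoted form says the same thing without relying on context. */
    assert(star_matches(0, "\\*foo", "*foo") == 1);
    assert(star_matches(0, "\\*foo", "foo") == 0);
}
```

Both patterns behave the same here, but only the quoted form keeps its meaning if the context-dependent syntax bits are later changed.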

