The rx notation can be extended by defining new symbols and parameterized forms in terms of other rx expressions. This is handy for sharing parts between several regexps, and for making complex ones easier to build and understand by putting them together from smaller pieces.
For example, you could define name to mean (one-or-more letter), and (quoted x) to mean (seq ?' x ?') for any x. These forms could then be used in rx expressions like any other: (rx (quoted name)) would match a nonempty sequence of letters inside single quotes.
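The example above can be sketched as runnable code, assuming Emacs 27.1 or later. The sketch uses rx-let, one of the binding macros described in this section; the function name my-quoted-name-regexp is hypothetical and chosen only for illustration.

```elisp
(require 'rx)

;; `name' and `quoted' are local rx definitions; they are known only
;; while the `rx' forms inside the body are macro-expanded.
(rx-let ((name (one-or-more letter))
         (quoted (x) (seq ?' x ?')))
  (defun my-quoted-name-regexp ()
    "Return a regexp matching letters inside single quotes (illustrative)."
    (rx (quoted name))))

;; The quoted word starts at index 4 of the test string.
(string-match-p (my-quoted-name-regexp) "say 'hello'")   ; ⇒ 4
;; An empty pair of quotes does not match: `name' is nonempty.
(string-match-p (my-quoted-name-regexp) "say ''")        ; ⇒ nil
```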
The Lisp macros below provide different ways of binding names to definitions. Common to all of them are the following rules:
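- Built-in rx forms, like digit and group, cannot be redefined.
- The definitions live in a name space of their own, separate from that of Lisp variables. There is thus no need to attach a suffix like -regexp to names; they cannot collide with anything else.
- Definitions are only ever expanded in calls to rx or rx-to-string, not merely by their presence in definition macros. This means that the order of definitions doesn’t matter, even when they refer to each other, and that syntax errors only show up when they are used, not when they are defined.
- User-defined forms are allowed wherever arbitrary rx expressions are expected; for example, in the body of a zero-or-one form, but not inside any or category forms. They are also allowed inside not and intersection forms.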
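The rule that definitions are expanded only on use, so that their order does not matter, can be illustrated with a small sketch. It assumes Emacs 27.1 or later; all names with the my-pkg- prefix are hypothetical and exist only for this example.

```elisp
(require 'rx)

;; `my-pkg-pair' refers to two names that are not defined yet.
;; This is accepted, because nothing is expanded at definition time.
(rx-define my-pkg-pair (seq my-pkg-key "=" my-pkg-value))
(rx-define my-pkg-key (one-or-more (any "a-z")))
(rx-define my-pkg-value (one-or-more digit))

;; Expansion happens only here, where `rx' is used.
(defun my-pkg-pair-regexp ()
  "Return a regexp matching a whole KEY=NUMBER string (illustrative)."
  (rx bos my-pkg-pair eos))

(string-match-p (my-pkg-pair-regexp) "width=80")     ; ⇒ 0
(string-match-p (my-pkg-pair-regexp) "width=wide")   ; ⇒ nil
```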
Macro: rx-define name [arglist] rx-form
Define name globally in all subsequent calls to rx and rx-to-string. If arglist is absent, then name is defined as a plain symbol to be replaced with rx-form. Example:
     (rx-define haskell-comment (seq "--" (zero-or-more nonl)))
     (rx haskell-comment)
          ⇒ "--.*"
If arglist is present, it must be a list of zero or more argument names, and name is then defined as a parameterized form. When used in an rx expression as (name arg…), each arg will replace the corresponding argument name inside rx-form.
arglist may end in
&rest and one final argument name,
denoting a rest parameter. The rest parameter will expand to all
extra actual argument values not matched by any other parameter in
arglist, spliced into rx-form where it occurs. Example:
     (rx-define moan (x y &rest r) (seq x (one-or-more y) r "!"))
     (rx (moan "MOO" "A" "MEE" "OW"))
          ⇒ "MOOA+MEEOW!"
Since the definition is global, it is recommended to give name a package prefix to avoid name clashes with definitions elsewhere, as is usual when naming non-local variables and functions.
Forms defined this way only perform simple template substitution. For arbitrary computations, use them together with the rx forms eval, regexp or literal. Example:
     (defun n-tuple-rx (n element)
       `(seq "<"
             (group-n 1 ,element)
             ,@(mapcar (lambda (i) `(seq ?, (group-n ,i ,element)))
                       (number-sequence 2 n))
             ">"))
     (rx-define n-tuple (n element)
       (eval (n-tuple-rx n 'element)))
     (rx (n-tuple 3 (+ (in "0-9"))))
          ⇒ "<\\(?1:[0-9]+\\),\\(?2:[0-9]+\\),\\(?3:[0-9]+\\)>"
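As a further sketch of combining a definition with a computed value, the rx form literal can be used inside rx-define so that a string held in a variable is matched verbatim. This assumes Emacs 27.1 or later; the variable my-pkg-comment-start and the other my-pkg- names are hypothetical.

```elisp
(require 'rx)

;; Hypothetical variable holding a comment delimiter chosen at run time.
(defvar my-pkg-comment-start "//"
  "String that starts a line comment (illustrative).")

;; `literal' matches the value of its argument verbatim.  The value
;; is looked up when the regexp is built, not when this definition
;; is made.
(rx-define my-pkg-line-comment
  (seq (literal my-pkg-comment-start) (zero-or-more nonl)))

(defun my-pkg-line-comment-regexp ()
  "Return a regexp for a line comment using the current delimiter."
  (rx my-pkg-line-comment))

(string-match-p (my-pkg-line-comment-regexp) "x = 1 // note")    ; ⇒ 6
;; Rebinding the variable changes the regexp that is produced.
(let ((my-pkg-comment-start "#"))
  (string-match-p (my-pkg-line-comment-regexp) "x = 1 # note"))  ; ⇒ 6
```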
Macro: rx-let (bindings…) body…
Make the rx definitions in bindings available locally for rx macro invocations in body, which is then evaluated.
Each element of bindings has the form (name [arglist] rx-form), where the parts have the same meaning as in rx-define above. Example:
     (rx-let ((comma-separated (item) (seq item (0+ "," item)))
              (number (1+ digit))
              (numbers (comma-separated number)))
       (re-search-forward (rx "(" numbers ")")))
The definitions are only available during the macro-expansion of body, and are thus not present during execution of compiled code.
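A small sketch of what this means in practice, assuming Emacs 27.1 or later. The names ident and my-pkg-runtime-lookup are hypothetical. The rx macro inside the body sees the binding because it is expanded together with the body, whereas rx-to-string does its work at run time, when the binding is gone, and is therefore expected to signal an error.

```elisp
(require 'rx)

(defun my-pkg-runtime-lookup ()
  "Return a list showing where the local name `ident' is known."
  (rx-let ((ident (one-or-more (any "a-z"))))
    (list
     ;; Expanded while the body is macro-expanded: `ident' is known.
     (rx ident)
     ;; Translated at run time: the `rx-let' binding no longer
     ;; exists, so the error branch is expected to be taken.
     (condition-case nil
         (rx-to-string 'ident)
       (error 'unknown-at-run-time)))))
```

Under these assumptions, the first element of the returned list is a regexp string and the second is the symbol unknown-at-run-time.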
rx-let can be used not only inside a function, but also at top
level to include global variable and function definitions that need
to share a common set of
rx forms. Since the names are local
inside body, there is no need for any package prefixes.
     (rx-let ((phone-number (seq (opt ?+) (1+ (any digit ?-)))))
       (defun find-next-phone-number ()
         (re-search-forward (rx phone-number)))
       (defun phone-number-p (string)
         (string-match-p (rx bos phone-number eos) string)))
The scope of the
rx-let bindings is lexical, which means that
they are not visible outside body itself, even in functions
called from body.
Macro: rx-let-eval bindings body…
Evaluate bindings to a list of bindings as in rx-let, and evaluate body with those bindings in effect for calls to rx-to-string.
This macro is similar to
rx-let, except that the bindings
argument is evaluated (and thus needs to be quoted if it is a list
literal), and the definitions are substituted at run time, which is required for rx-to-string to work. Example:
     (rx-let-eval
         '((ponder (x) (seq "Where have all the " x " gone?")))
       (looking-at (rx-to-string
                    '(ponder (or "flowers" "young girls"
                                 "left socks")))))
Another difference from
rx-let is that the bindings are
dynamically scoped, and thus also available in functions called from
body. However, they are not visible inside functions defined in body.
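The dynamic scoping of rx-let-eval can be sketched as follows, assuming Emacs 27.1 or later. Both function names and the local rx name greeting are hypothetical. The helper function does not define greeting itself; it relies on the binding established by its caller.

```elisp
(require 'rx)

(defun my-pkg-build-greeting-regexp ()
  "Build a regexp that relies on a caller-supplied `greeting' definition."
  ;; `greeting' is not defined here; it is found through the dynamic
  ;; binding established by `rx-let-eval' in the caller.
  (rx-to-string '(seq bos greeting (1+ space) (1+ letter))))

(defun my-pkg-greeting-regexp (words)
  "Return a regexp matching one of the strings in WORDS, then a name."
  ;; The bindings argument is evaluated, so it can be computed here.
  (rx-let-eval `((greeting (or ,@words)))
    (my-pkg-build-greeting-regexp)))

(string-match-p (my-pkg-greeting-regexp '("hello" "hi")) "hi there")    ; ⇒ 0
(string-match-p (my-pkg-greeting-regexp '("hello" "hi")) "hey there")   ; ⇒ nil
```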