The Kawa Scheme language: Macros

7.10 Macros

Libraries and top-level programs can define and use new kinds of derived expressions and definitions called syntactic abstractions or macros. A syntactic abstraction is created by binding a keyword to a macro transformer or, simply, transformer.

The transformer determines how a use of the macro (called a macro use) is transcribed into a more primitive form.

Most macro uses have the form:

(keyword datum …)

where keyword is an identifier that uniquely determines the kind of form. This identifier is called the syntactic keyword, or simply keyword. The number of datums and the syntax of each depend on the syntactic abstraction.

Macro uses can also take the form of improper lists, singleton identifiers, or set! forms, where the second subform of the set! is the keyword:

(keyword datum … . datum)
keyword
(set! keyword datum)

The define-syntax, let-syntax and letrec-syntax forms create bindings for keywords, associate them with macro transformers, and control the scope within which they are visible.
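As a minimal sketch of keyword scope, the following binds a keyword with let-syntax so that it is visible only inside the body. The keyword name inc! is hypothetical and chosen only for illustration.

```scheme
;; inc! is a hypothetical keyword, visible only within the let-syntax body.
(let-syntax ((inc! (syntax-rules ()
                     ((_ v) (set! v (+ v 1))))))
  (let ((n 0))
    (inc! n)
    (inc! n)
    n))   ; ⇒ 2
```

Outside the let-syntax form, inc! has no keyword binding.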

The syntax-rules and identifier-syntax forms create transformers via a pattern language. Moreover, the syntax-case form allows creating transformers via arbitrary Scheme code.

Keywords occupy the same name space as variables. That is, within the same scope, an identifier can be bound as a variable or keyword, or neither, but not both, and local bindings of either kind may shadow other bindings of either kind.
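The shared name space can be illustrated with a short sketch in which a local variable binding shadows a keyword of the same name. The name twice is hypothetical.

```scheme
;; twice is a hypothetical keyword bound at top level.
(define-syntax twice
  (syntax-rules ()
    ((_ e) (+ e e))))

(twice 4)                              ; ⇒ 8, uses the keyword binding

;; Inside the let, twice is an ordinary variable that shadows the keyword.
(let ((twice (lambda (n) (* 3 n))))
  (twice 4))                           ; ⇒ 12, uses the variable binding
```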

Macros defined using syntax-rules and identifier-syntax are “hygienic” and “referentially transparent” and thus preserve Scheme’s lexical scoping.

If a macro transformer inserts a binding for an identifier (variable or keyword) not appearing in the macro use, the identifier is in effect renamed throughout its scope to avoid conflicts with other identifiers.
If a macro transformer inserts a free reference to an identifier, the reference refers to the binding that was visible where the transformer was specified, regardless of any local bindings that may surround the use of the macro.
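
Both properties can be seen in a small sketch. The macro my-or is hypothetical; it inserts a binding for t and a free reference to if, and neither interferes with bindings at the point of use.

```scheme
;; my-or is a hypothetical macro used only to illustrate hygiene.
(define-syntax my-or
  (syntax-rules ()
    ((_) #f)
    ((_ e) e)
    ((_ e1 e2 ...)
     (let ((t e1))                 ; inserted binding: renamed, cannot capture
       (if t t (my-or e2 ...)))))) ; inserted free reference to if

;; The user's t is not captured by the t the macro introduces.
(let ((t 5))
  (my-or #f t))    ; ⇒ 5

;; A local binding of if does not affect the if inserted by the macro.
(let ((if list))
  (my-or #f 2))    ; ⇒ 2
```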

Macros defined using the syntax-case facility are also hygienic unless datum->syntax is used.

Kawa supports most of the syntax-case feature.

Syntax definitions are valid wherever definitions are. They have the following form:

Syntax: define-syntax keyword transformer-spec

The keyword is an identifier, and transformer-spec is a function that maps syntax forms to syntax forms, usually an instance of syntax-rules. If the define-syntax occurs at the top level, then the top-level syntactic environment is extended by binding the keyword to the specified transformer, but existing references to any top-level binding for keyword remain unchanged. Otherwise, it is an internal syntax definition, and is local to the body in which it is defined.

(let ((x 1) (y 2))
   (define-syntax swap!
     (syntax-rules ()
       ((swap! a b)
        (let ((tmp a))
          (set! a b)
          (set! b tmp)))))
   (swap! x y)
   (list x y))  ⇒ (2 1)

Macros can expand into definitions in any context that permits them. However, it is an error for a definition to define an identifier whose binding has to be known in order to determine the meaning of the definition itself, or of any preceding definition that belongs to the same group of internal definitions.
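
For instance, a macro can expand into a begin form containing several definitions. In this sketch the name define-both is hypothetical; note that val is evaluated once per definition.

```scheme
;; define-both is a hypothetical macro that expands into two definitions.
(define-syntax define-both
  (syntax-rules ()
    ((_ a b val)
     (begin
       (define a val)
       (define b val)))))

(define-both p q 10)
(+ p q)   ; ⇒ 20
```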

Syntax: define-syntax-case name (literals) (pattern expr) ...

A convenience macro to make it easy to define syntax-case-style macros. Defines a macro with the given name and list of literals. Each pattern has the form of a syntax-rules-style pattern, and it is matched against the macro invocation syntax form. When a match is found, the corresponding expr is evaluated. It must evaluate to a syntax form, which replaces the macro invocation.

(define-syntax-case macro-name (literals)
  (pat1 result1)
  (pat2 result2))

is equivalent to:

(define-syntax macro-name
  (lambda (form)
    (syntax-case form (literals)
      (pat1 result1)
      (pat2 result2))))
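
A concrete use might look like the following sketch. The macro name my-unless is hypothetical; the form relies only on the define-syntax-case syntax described above.

```scheme
;; my-unless is a hypothetical macro: evaluate the body only if test is false.
(define-syntax-case my-unless ()
  ((_ test body1 body2 ...)
   #'(if test #f (begin body1 body2 ...))))

(my-unless (> 1 2) 'first 'ok)   ; ⇒ ok
(my-unless (< 1 2) 'first 'ok)   ; ⇒ #f
```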

Syntax: define-macro (name lambda-list) form ...: This form is deprecated. Functionally equivalent to defmacro.

Syntax: defmacro name lambda-list form ...

This form is deprecated. Instead of

(defmacro (name ...)
  (let ... `(... ,exp ...)))

you should probably do:

(define-syntax-case name ()
  ((_ ...) (let ... #`(... #,exp ...))))

and instead of

(defmacro (name ... var ...) `(... var ...))

you should probably do:

(define-syntax-case name ()
  ((_ ... var ...) #`(... var ...)))

Defines an old-style macro a la Common Lisp, and installs (lambda lambda-list form ...) as the expansion function for name. When the translator sees an application of name, the expansion function is called with the rest of the application as the actual arguments. The resulting object must be a Scheme source form that is further processed (it may be repeatedly macro-expanded).

Procedure: gentemp: Returns a new (interned) symbol each time it is called. The symbol names are implementation-dependent. (This is not directly macro-related, but is often used in conjunction with defmacro to get a fresh unique identifier.)
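
The following sketch shows how gentemp is typically combined with the deprecated defmacro form to avoid accidental capture. The macro name old-swap! is hypothetical, and the example assumes the defmacro syntax given above; new code should prefer define-syntax.

```scheme
;; old-swap! is a hypothetical non-hygienic macro.
;; gentemp supplies a fresh symbol so that the temporary
;; cannot clash with a variable named in the macro use.
(defmacro old-swap! (a b)
  (let ((tmp (gentemp)))
    `(let ((,tmp ,a))
       (set! ,a ,b)
       (set! ,b ,tmp))))

(let ((x 1) (y 2))
  (old-swap! x y)
  (list x y))   ; ⇒ (2 1)
```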

Procedure: expand form

The result of evaluating form is treated as a Scheme expression, syntax-expanded to internal form, and then converted back to (roughly) the equivalent expanded Scheme form.

This can be useful for debugging macros.

To access this function, you must first (require 'syntax-utils).

(require 'syntax-utils)
(expand '(cond ((> x y) 0) (else 1))) ⇒ (if (> x y) 0 1)

7.10.1 Pattern language

A transformer-spec is an expression that evaluates to a transformer procedure, which takes an input form and returns a resulting form. You can do general macro-time compilation with such a procedure, commonly using syntax-case (which is documented in the R6RS library specification). However, when possible it is better to use the simpler pattern language of syntax-rules:

transformer-spec ::=
  (syntax-rules ( tr-literal^* ) syntax-rule^*)
  | (syntax-rules ellipsis ( tr-literal^* ) syntax-rule^*)
  | expression
syntax-rule ::= (list-pattern syntax-template)
tr-literal ::= identifier
ellipsis ::= identifier

An instance of syntax-rules produces a new macro transformer by specifying a sequence of hygienic rewrite rules. A use of a macro whose keyword is associated with a transformer specified by syntax-rules is matched against the patterns contained in the syntax-rules beginning with the leftmost syntax rule. When a match is found, the macro use is transcribed hygienically according to the template. The optional ellipsis specifies a symbol used to indicate repetition; it defaults to ... (3 periods).
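
As a brief sketch of the optional ellipsis, the following uses the identifier ooo in place of the default. Both the macro name my-list and the choice of ooo are hypothetical.

```scheme
;; my-list is a hypothetical macro; ooo replaces ... as the ellipsis
;; within this syntax-rules form only.
(define-syntax my-list
  (syntax-rules ooo ()
    ((_ x ooo) (list x ooo))))

(my-list 1 2 3)   ; ⇒ (1 2 3)
(my-list)         ; ⇒ ()
```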

syntax-pattern ::=
  identifier | constant | list-pattern | vector-pattern
list-pattern ::= ( syntax-pattern^* )
  | ( syntax-pattern syntax-pattern^* . syntax-pattern )
  | ( syntax-pattern^* syntax-pattern ellipsis syntax-pattern^* )
  | ( syntax-pattern^* syntax-pattern ellipsis syntax-pattern^* . syntax-pattern)
vector-pattern ::= #( syntax-pattern^* )
  | #( syntax-pattern^* syntax-pattern ellipsis syntax-pattern^* )

An identifier appearing within a pattern can be an underscore (_), a literal identifier listed in the list of tr-literals, or the ellipsis. All other identifiers appearing within a pattern are pattern variables.

The outermost list-pattern in a syntax-rule must start with an identifier. That identifier is not involved in the matching and is considered neither a pattern variable nor a literal identifier.

Pattern variables match arbitrary input elements and are used to refer to elements of the input in the template. It is an error for the same pattern variable to appear more than once in a syntax-pattern.

Underscores also match arbitrary input elements but are not pattern variables and so cannot be used to refer to those elements. If an underscore appears in the literals list, then that takes precedence and underscores in the pattern match as literals. Multiple underscores can appear in a syntax-pattern.

Identifiers that appear in (tr-literal^*) are interpreted as literal identifiers to be matched against corresponding elements of the input. An element in the input matches a literal identifier if and only if it is an identifier and either both its occurrence in the macro expression and its occurrence in the macro definition have the same lexical binding, or the two identifiers are the same and both have no lexical binding.
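
A small sketch of literal matching follows. The macro my-if and the literals then and otherwise are hypothetical; a use matches only when those exact identifiers appear in the corresponding positions.

```scheme
;; my-if is a hypothetical macro; then and otherwise are literal identifiers,
;; while c, t and e are pattern variables.
(define-syntax my-if
  (syntax-rules (then otherwise)
    ((_ c then t otherwise e) (if c t e))))

(my-if (> 2 1) then 'yes otherwise 'no)   ; ⇒ yes
(my-if (> 1 2) then 'yes otherwise 'no)   ; ⇒ no
```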

A subpattern followed by ellipsis can match zero or more elements of the input, unless ellipsis appears in the literals, in which case it is matched as a literal.

More formally, an input expression E matches a pattern P if and only if:

P is an underscore (_); or
P is a non-literal identifier; or
P is a literal identifier and E is an identifier with the same binding; or
P is a list (P₁ ... P_n) and E is a list of n elements that match P₁ through P_n, respectively; or
P is an improper list (P₁ ... P_n . P_n+1) and E is a list or improper list of n or more elements that match P₁ through P_n, respectively, and whose nth tail matches P_n+1; or
P is of the form (P₁ ... P_k P_e ellipsis P_k+1 ... P_k+l) where E is a proper list of n elements, the first k of which match P₁ through P_k, respectively, whose next n-k-l elements each match P_e, and whose remaining l elements match P_k+1 through P_k+l; or
P is of the form (P₁ ... P_k P_e ellipsis P_k+1 ... P_k+l . P_x) where E is a list or improper list of n elements, the first k of which match P₁ through P_k, whose next n-k-l elements each match P_e, and whose remaining l elements match P_k+1 through P_k+l, and whose nth and final cdr matches P_x; or
P is a vector of the form #(P₁ ... P_n) and E is a vector of n elements that match P₁ through P_n; or
P is of the form #(P₁ ... P_k P_e ellipsis P_k+1 ... P_k+l) where E is a vector of n elements the first k of which match P₁ through P_k, whose next n-k-l elements each match P_e, and whose remaining l elements match P_k+1 through P_k+l; or
P is a constant and E is equal to P in the sense of the equal? procedure.
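
The rule for an ellipsis followed by further subpatterns can be sketched as below. The macro all-but-last is hypothetical: with k = 0 and l = 1, the subpattern e matches the first n - 1 elements and last matches the final one.

```scheme
;; all-but-last is a hypothetical macro illustrating a pattern
;; in which a subpattern follows the ellipsis.
(define-syntax all-but-last
  (syntax-rules ()
    ((_ e ... last) (list e ...))))

(all-but-last 1 2 3 4)   ; ⇒ (1 2 3)
(all-but-last 9)         ; ⇒ ()
```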

It is an error to use a macro keyword, within the scope of its binding, in an expression that does not match any of the patterns.

syntax-template ::= identifier | constant
   | (template-element^*)
   | (template-element template-element^* . syntax-template )
   | ( ellipsis syntax-template)
template-element ::= syntax-template [ellipsis]

When a macro use is transcribed according to the template of the matching syntax-rule, pattern variables that occur in the template are replaced by the elements they match in the input. Pattern variables that occur in subpatterns followed by one or more instances of the identifier ellipsis are allowed only in subtemplates that are followed by as many instances of ellipsis. They are replaced in the output by all of the elements they match in the input, distributed as indicated. It is an error if the output cannot be built up as specified.
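
The ellipsis-depth rule can be illustrated with a sketch in which head occurs under one ellipsis and item under two. The macro name rows is hypothetical.

```scheme
;; rows is a hypothetical macro. In the pattern, head has ellipsis depth 1
;; and item has depth 2, so the template uses them at the same depths.
(define-syntax rows
  (syntax-rules ()
    ((_ (head item ...) ...)
     (list (cons 'head (list item ...)) ...))))

(rows (a 1 2) (b) (c 3))   ; ⇒ ((a 1 2) (b) (c 3))
```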

Identifiers that appear in the template but are not pattern variables or the identifier ellipsis are inserted into the output as literal identifiers. If a literal identifier is inserted as a free identifier then it refers to the binding of that identifier within whose scope the instance of syntax-rules appears. If a literal identifier is inserted as a bound identifier then it is in effect renamed to prevent inadvertent captures of free identifiers.

A template of the form (ellipsis template) is identical to template, except that ellipses within the template have no special meaning. That is, any ellipses contained within template are treated as ordinary identifiers. In particular, the template (ellipsis ellipsis) produces a single ellipsis. This allows syntactic abstractions to expand into code containing ellipses.

(define-syntax be-like-begin
  (syntax-rules ()
    ((be-like-begin name)
     (define-syntax name
       (syntax-rules ()
         ((name expr (... ...))
          (begin expr (... ...))))))))

(be-like-begin sequence)
(sequence 1 2 3 4) ⇒ 4

7.10.2 Identifier predicates

Procedure: identifier? obj

Return #t if obj is an identifier, i.e., a syntax object representing an identifier, and #f otherwise.

The identifier? procedure is often used within a fender to verify that certain subforms of an input form are identifiers, as in the definition of rec, which creates self-contained recursive objects, below.

(define-syntax rec
  (lambda (x)
    (syntax-case x ()
      ((_ x e)
       (identifier? #'x)
       #'(letrec ((x e)) x)))))

(map (rec fact
       (lambda (n)
         (if (= n 0)                 
             1
             (* n (fact (- n 1))))))
     '(1 2 3 4 5))    ⇒ (1 2 6 24 120)
 
(rec 5 (lambda (x) x))  ⇒ exception

The procedures bound-identifier=? and free-identifier=? each take two identifier arguments and return #t if their arguments are equivalent and #f otherwise. These predicates are used to compare identifiers according to their intended use as free references or bound identifiers in a given context.

Procedure: bound-identifier=? id₁ id₂

id₁ and id₂ must be identifiers.

The procedure bound-identifier=? returns #t if a binding for one would capture a reference to the other in the output of the transformer, assuming that the reference appears within the scope of the binding, and #f otherwise.

In general, two identifiers are bound-identifier=? only if both are present in the original program or both are introduced by the same transformer application (perhaps implicitly, see datum->syntax).

The bound-identifier=? procedure can be used for detecting duplicate identifiers in a binding construct or for other preprocessing of a binding construct that requires detecting instances of the bound identifiers.

Procedure: free-identifier=? id₁ id₂

id₁ and id₂ must be identifiers.

The free-identifier=? procedure returns #t if and only if the two identifiers would resolve to the same binding if both were to appear in the output of a transformer outside of any bindings inserted by the transformer. (If neither of two like-named identifiers resolves to a binding, i.e., both are unbound, they are considered to resolve to the same binding.)

Operationally, two identifiers are considered equivalent by free-identifier=? if and only if the topmost matching substitution for each maps to the same binding or the identifiers have the same name and no matching substitution.

The syntax-case and syntax-rules forms internally use free-identifier=? to compare identifiers listed in the literals list against input identifiers.

(let ((fred 17))
  (define-syntax a
    (lambda (x)
      (syntax-case x ()
        ((_ id) #'(b id fred)))))
  (define-syntax b
    (lambda (x)
      (syntax-case x ()
        ((_ id1 id2)
         #`(list
             #,(free-identifier=? #'id1 #'id2)
             #,(bound-identifier=? #'id1 #'id2))))))
  (a fred))
    ⇒ (#t #f)

The following definition of unnamed let uses bound-identifier=? to detect duplicate identifiers.

(define-syntax let
  (lambda (x)
    (define unique-ids?
      (lambda (ls)
        (or (null? ls)
            (and (let notmem? ((x (car ls)) (ls (cdr ls)))
                   (or (null? ls)
                       (and (not (bound-identifier=? x (car ls)))
                            (notmem? x (cdr ls)))))
                 (unique-ids? (cdr ls))))))
    (syntax-case x ()
      ((_ ((i v) ...) e1 e2 ...)
       (unique-ids? #'(i ...))
       #'((lambda (i ...) e1 e2 ...) v ...)))))

The argument #'(i ...) to unique-ids? is guaranteed to be a list by the rules given in the description of syntax above.

With this definition of let:

(let ((a 3) (a 4)) (+ a a))    ⇒ syntax error

However,

(let-syntax
  ((dolet (lambda (x)
            (syntax-case x ()
              ((_ b)
               #'(let ((a 3) (b 4)) (+ a b)))))))
  (dolet a))
⇒ 7

since the identifier a introduced by dolet and the identifier a extracted from the input form are not bound-identifier=?.

Rather than including else in the literals list as before, this version of case explicitly tests for else using free-identifier=?.

(define-syntax case
  (lambda (x)
    (syntax-case x ()
      ((_ e0 ((k ...) e1 e2 ...) ...
          (else-key else-e1 else-e2 ...))
       (and (identifier? #'else-key)
            (free-identifier=? #'else-key #'else))
       #'(let ((t e0))
           (cond
            ((memv t '(k ...)) e1 e2 ...)
            ...
            (else else-e1 else-e2 ...))))
      ((_ e0 ((ka ...) e1a e2a ...)
          ((kb ...) e1b e2b ...) ...)
       #'(let ((t e0))
           (cond
            ((memv t '(ka ...)) e1a e2a ...)
            ((memv t '(kb ...)) e1b e2b ...)
            ...))))))

With either definition of case, else is not recognized as an auxiliary keyword if an enclosing lexical binding for else exists. For example,

(let ((else #f))
  (case 0 (else (write "oops"))))    ⇒ syntax error

since else is bound lexically and is therefore not the same else that appears in the definition of case.

7.10.3 Syntax-object and datum conversions

Procedure: syntax->datum syntax-object

Deprecated procedure: syntax-object->datum syntax-object

Strips all syntactic information from a syntax object and returns the corresponding Scheme datum.

Identifiers stripped in this manner are converted to their symbolic names, which can then be compared with eq?. Thus, a predicate symbolic-identifier=? might be defined as follows.

(define symbolic-identifier=?
  (lambda (x y)
    (eq? (syntax->datum x)
         (syntax->datum y))))

Procedure: datum->syntax template-id datum [srcloc]

Deprecated procedure: datum->syntax-object template-id datum

template-id must be a template identifier and datum should be a datum value.

The datum->syntax procedure returns a syntax-object representation of datum that contains the same contextual information as template-id, with the effect that the syntax object behaves as if it were introduced into the code when template-id was introduced.

If srcloc is specified (and is neither #f nor #!null), it specifies the file position (including line number) for the result. In that case it should be a syntax object representing a list; otherwise it is currently ignored, though future extensions may support other ways of specifying the position.

The datum->syntax procedure allows a transformer to “bend” lexical scoping rules by creating implicit identifiers that behave as if they were present in the input form, thus permitting the definition of macros that introduce visible bindings for or references to identifiers that do not appear explicitly in the input form. For example, the following defines a loop expression that uses this controlled form of identifier capture to bind the variable break to an escape procedure within the loop body. (The derived with-syntax form is like let but binds pattern variables.)

(define-syntax loop
  (lambda (x)
    (syntax-case x ()
      ((k e ...)
       (with-syntax
           ((break (datum->syntax #'k 'break)))
         #'(call-with-current-continuation
             (lambda (break)
               (let f () e ... (f)))))))))

(let ((n 3) (ls '()))
  (loop
    (if (= n 0) (break ls))
    (set! ls (cons 'a ls))
    (set! n (- n 1))))
⇒ (a a a)

Were loop to be defined as:

(define-syntax loop
  (lambda (x)
    (syntax-case x ()
      ((_ e ...)
       #'(call-with-current-continuation
           (lambda (break)
             (let f () e ... (f))))))))

the variable break would not be visible in e ....

The datum argument datum may also represent an arbitrary Scheme form, as demonstrated by the following definition of include.

(define-syntax include
  (lambda (x)
    (define read-file
      (lambda (fn k)
        (let ((p (open-file-input-port fn)))
          (let f ((x (get-datum p)))
            (if (eof-object? x)
                (begin (close-port p) '())
                (cons (datum->syntax k x)
                      (f (get-datum p))))))))
    (syntax-case x ()
      ((k filename)
       (let ((fn (syntax->datum #'filename)))
         (with-syntax (((exp ...)
                        (read-file fn #'k)))
           #'(begin exp ...)))))))

(include "filename") expands into a begin expression containing the forms found in the file named by "filename". For example, if the file flib.ss contains:

(define f (lambda (x) (g (* x x))))

and the file glib.ss contains:

(define g (lambda (x) (+ x x)))

the expression:

(let ()
  (include "flib.ss")
  (include "glib.ss")
  (f 5))

evaluates to 50.

The definition of include uses datum->syntax to convert the objects read from the file into syntax objects in the proper lexical context, so that identifier references and definitions within those expressions are scoped where the include form appears.

Using datum->syntax, it is even possible to break hygiene entirely and write macros in the style of old Lisp macros. The lisp-transformer procedure defined below creates a transformer that converts its input into a datum, calls the programmer’s procedure on this datum, and converts the result back into a syntax object scoped where the original macro use appeared.

(define lisp-transformer
  (lambda (p)
    (lambda (x)
      (syntax-case x ()
        ((kwd . rest)
         (datum->syntax #'kwd
           (p (syntax->datum x))))))))

7.10.4 Signaling errors in macro transformers

Syntax: syntax-error message args^*

The message and args are treated similarly to those of the error procedure. However, the error is reported when the syntax-error form is expanded. This can be used as a syntax-rules template for a pattern that is an invalid use of the macro, which can provide more descriptive error messages. The message should be a string literal, and the args arbitrary (non-evaluated) expressions providing additional information.

(define-syntax simple-let
  (syntax-rules ()
    ((_ (head ... ((x . y) val) . tail)
       body1 body2 ...)
     (syntax-error "expected an identifier but got" (x . y)))
    ((_ ((name val) ...) body1 body2 ...)
     ((lambda (name ...) body1 body2 ...)
      val ...))))

Procedure: report-syntax-error location message

This is a procedure that can be called at macro-expansion time by a syntax transformer function. (In contrast, syntax-error is a syntax form used in the expansion result.) The message is reported as a compile-time error message. The location is used for the source location (file name and line/column numbers): in general it can be a SourceLocator value; most commonly it is a syntax object for a sub-list of the input form that is erroneous. The value returned by report-syntax-error is an instance of ErrorExp, which suppresses further compilation.

(define-syntax if
  (lambda (x)
    (syntax-case x ()
                 ((_ test then)
                  (make-if-exp #'test #'then #!null))
                 ((_ test then else)
                  (make-if-exp #'test #'then #'else))
                 ((_ e1 e2 e3 . rest)
                  (report-syntax-error #'rest
                   "too many expressions for 'if'"))
                 ((_ . rest)
                  (report-syntax-error #'rest
                   "too few expressions for 'if'")))))

In the above example, one could use the source form x for the location, but using #'rest is more accurate. Note that the following is incorrect, because e1 might not be a pair, in which case we don’t have location information for it (due to a Kawa limitation):

    (syntax-case x ()
                 ...
                 ((_ e1)
                  (report-syntax-error
                   #'e1 ;; poor location specifier
                   "too few expressions for 'if'")))))

7.10.5 Convenience forms

Syntax: with-syntax ((pattern expression) …) body

The with-syntax form is used to bind pattern variables, just as let is used to bind variables. This allows a transformer to construct its output in separate pieces, then put the pieces together.

Each pattern is identical in form to a syntax-case pattern. The value of each expression is computed and destructured according to the corresponding pattern, and pattern variables within the pattern are bound as with syntax-case to the corresponding portions of the value within body.
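
As a minimal sketch of this destructuring, the following binds two pattern variables from a single syntax value. The macro name call-swapped and the pattern variables first and second are hypothetical.

```scheme
;; call-swapped is a hypothetical macro that calls f with its
;; two arguments exchanged. The with-syntax pattern (first second)
;; destructures the syntax list #'(b a).
(define-syntax call-swapped
  (lambda (x)
    (syntax-case x ()
      ((_ f a b)
       (with-syntax (((first second) #'(b a)))
         #'(f first second))))))

(call-swapped - 1 10)   ; ⇒ 9
```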

The with-syntax form may be defined in terms of syntax-case as follows.

(define-syntax with-syntax
  (lambda (x)
    (syntax-case x ()
      ((_ ((p e0) ...) e1 e2 ...)
       (syntax (syntax-case (list e0 ...) ()
                 ((p ...) (let () e1 e2 ...))))))))

The following definition of cond demonstrates the use of with-syntax to support transformers that employ recursion internally to construct their output. It handles all cond clause variations and takes care to produce one-armed if expressions where appropriate.

(define-syntax cond
  (lambda (x)
    (syntax-case x ()
      ((_ c1 c2 ...)
       (let f ((c1 #'c1) (c2* #'(c2 ...)))
         (syntax-case c2* ()
           (()
            (syntax-case c1 (else =>)
              ((else e1 e2 ...) #'(begin e1 e2 ...))
              ((e0) #'e0)
              ((e0 => e1)
               #'(let ((t e0)) (if t (e1 t))))
              ((e0 e1 e2 ...)
               #'(if e0 (begin e1 e2 ...)))))
           ((c2 c3 ...)
            (with-syntax ((rest (f #'c2 #'(c3 ...))))
              (syntax-case c1 (=>)
                ((e0) #'(let ((t e0)) (if t t rest)))
                ((e0 => e1)
                 #'(let ((t e0)) (if t (e1 t) rest)))
                ((e0 e1 e2 ...)
                 #'(if e0 
                        (begin e1 e2 ...)
                        rest)))))))))))

Syntax: quasisyntax template

Auxiliary Syntax: unsyntax

Auxiliary Syntax: unsyntax-splicing

The quasisyntax form is similar to syntax, but it allows parts of the quoted text to be evaluated, in a manner similar to the operation of quasiquote.

Within a quasisyntax template, subforms of unsyntax and unsyntax-splicing forms are evaluated, and everything else is treated as ordinary template material, as with syntax.

The value of each unsyntax subform is inserted into the output in place of the unsyntax form, while the value of each unsyntax-splicing subform is spliced into the surrounding list or vector structure. Uses of unsyntax and unsyntax-splicing are valid only within quasisyntax expressions.

A quasisyntax expression may be nested, with each quasisyntax introducing a new level of syntax quotation and each unsyntax or unsyntax-splicing taking away a level of quotation. An expression nested within n quasisyntax expressions must be within n unsyntax or unsyntax-splicing expressions to be evaluated.
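
The difference between inserting and splicing can be sketched as follows. The macro name reversed-list is hypothetical; the example assumes that #'(e ...) yields a list of syntax objects, so that reverse can be applied to it at expansion time.

```scheme
;; reversed-list is a hypothetical macro. The unsyntax-splicing form
;; splices the reversed argument forms into the surrounding (list ...) template.
(define-syntax reversed-list
  (lambda (x)
    (syntax-case x ()
      ((_ e ...)
       (quasisyntax
         (list (unsyntax-splicing (reverse #'(e ...)))))))))

(reversed-list 1 2 3)   ; ⇒ (3 2 1)
```

With unsyntax in place of unsyntax-splicing, the reversed list would be inserted as a single subform instead of being spliced into the call to list.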

As noted in abbreviation, #`template is equivalent to (quasisyntax template), #,template is equivalent to (unsyntax template), and #,@template is equivalent to (unsyntax-splicing template). Note that for backwards compatibility, you should only use #,template inside a literal #`template form.

The quasisyntax keyword can be used in place of with-syntax in many cases. For example, case can be defined using quasisyntax, rather than with-syntax, as follows.

(define-syntax case
  (lambda (x)
    (syntax-case x ()
      ((_ e c1 c2 ...)
       #`(let ((t e))
           #,(let f ((c1 #'c1) (cmore #'(c2 ...)))
               (if (null? cmore)
                   (syntax-case c1 (else)
                     ((else e1 e2 ...)
                      #'(begin e1 e2 ...))
                     (((k ...) e1 e2 ...)
                      #'(if (memv t '(k ...))
                            (begin e1 e2 ...))))
                   (syntax-case c1 ()
                     (((k ...) e1 e2 ...)
                      #`(if (memv t '(k ...))
                            (begin e1 e2 ...)
                            #,(f (car cmore)
                                  (cdr cmore))))))))))))

Note: Any syntax-rules form can be expressed with syntax-case by making the lambda expression and syntax expressions explicit, and syntax-rules may be defined in terms of syntax-case as follows.

(define-syntax syntax-rules
  (lambda (x)
    (syntax-case x ()
      ((_ (lit ...) ((k . p) t) ...)
       (for-all identifier? #'(lit ... k ...))
       #'(lambda (x)
           (syntax-case x (lit ...)
             ((_ . p) #'t) ...))))))