Variables and Patterns

An identifier can name either a type of syntax or a location where a value can be stored. An identifier that names a type of syntax is called a syntactic keyword (informally called a macro), and is said to be bound to a transformer for that syntax. An identifier that names a location is called a variable and is said to be bound to that location. The set of all visible bindings in effect at some point in a program is known as the environment in effect at that point. The value stored in the location to which a variable is bound is called the variable’s value. By abuse of terminology, the variable is sometimes said to name the value or to be bound to the value. This is not quite accurate, but confusion rarely results from this practice.

Certain expression types are used to create new kinds of syntax and to bind syntactic keywords to those new syntaxes, while other expression types create new locations and bind variables to those locations. These expression types are called binding constructs. Those that bind syntactic keywords are discussed in Macros. The most fundamental of the variable binding constructs is the lambda expression, because all other variable binding constructs can be explained in terms of lambda expressions. Other binding constructs include the define family and the let family.
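As a small illustration of how another binding construct can be explained in terms of lambda, the following sketch shows a let expression next to the lambda application it stands for. It uses only standard Scheme; the variable names are arbitrary.

```scheme
;; A let expression creates new locations for x and y
;; and binds the variables to them ...
(let ((x 2) (y 3))
  (* x y))            ; => 6

;; ... which behaves like applying a lambda expression
;; to the initial values.
((lambda (x y)
   (* x y))
 2 3)                 ; => 6
```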

Scheme is a language with block structure. To each place where an identifier is bound in a program there corresponds a region of the program text within which the binding is visible. The region is determined by the particular binding construct that establishes the binding; if the binding is established by a lambda expression, for example, then its region is the entire lambda expression. Every mention of an identifier refers to the binding of the identifier that established the innermost of the regions containing the use.

If there is no binding of the identifier whose region contains the use, then the use refers to the binding for the variable in the global environment, if any; if there is no binding for the identifier, it is said to be unbound.
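The scoping rule can be seen in a short sketch in standard Scheme. The procedure name show-scope is made up for this example.

```scheme
(define x 10)               ; binding of x in the global environment

(define (show-scope)
  (let ((x 20))             ; region of this binding: the body of the outer let
    (let ((x (+ x 1)))      ; the init (+ x 1) lies outside the new binding's
                            ; region, so its x refers to the outer let (20)
      x)))                  ; refers to the innermost binding

(show-scope)                ; => 21
x                           ; => 10, the global binding is unaffected
```

Each use of x resolves to the binding whose region is the innermost one containing that use.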

Patterns

The usual way to bind variables is to match an incoming value against a pattern. The pattern contains variables, each of which is bound to a value derived from the incoming value.

(! [x::double y::double] (some-expression))

In the above example, the pattern [x::double y::double] is matched against the incoming value that results from evaluating (some-expression). That value is required to be a two-element sequence. Then the sub-pattern x::double is matched against element 0 of the sequence, which means it is coerced to a double and then the coerced value is matched against the sub-pattern x (which trivially succeeds). Similarly, y::double is matched against element 1.

The syntax of patterns is a work in progress. (The focus until now has been on designing and implementing how patterns work in general, rather than on the details of the pattern syntax.)

pattern ::= identifier
  | _
  | pattern-literal
  | datum
  | pattern :: type
  | [ lpattern* ]
lpattern ::= pattern
  | @ pattern
  | pattern ...
  | guard
pattern-literal ::=
    boolean | number | character | string
guard ::= #!if expression

This is how the specific patterns work:

identifier

This is the simplest and most common form of pattern. The identifier is bound to a new variable that is initialized to the incoming value.

_

This pattern just discards the incoming value. It is equivalent to a unique otherwise-unused identifier.

pattern-literal

Matches if the value is equal? to the pattern-literal.

datum

Matches if the value is equal? to the quoted datum.

pattern :: type

The incoming value is coerced to a value of the specified type, and then the coerced value is matched against the sub-pattern. Most commonly the sub-pattern is a plain identifier, so the latter match is trivial.

[ lpattern* ]

The incoming value must be a sequence (a list, vector or similar). If each sub-pattern is a plain pattern, then the number of sub-patterns must match the size of the sequence, and each sub-pattern is matched against the corresponding element of the sequence. More generally, each sub-pattern may match zero or more consecutive elements of the incoming sequence.
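The following is a minimal sketch of a sequence pattern made up of plain sub-patterns, written with the ! form used elsewhere in this section. The variable names and the literal sequence are invented for illustration, and the comments describe the behavior expected from the rules above rather than verified output.

```scheme
;; Three plain sub-patterns, so the incoming sequence must have
;; exactly three elements.
(! [a _ c::double] [10 20 30.5])

;; Expected bindings, following the rules above:
;;   a  is bound to element 0 (10)
;;   _  discards element 1 (20)
;;   c  is bound to element 2 (30.5), coerced to double
```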

#!if expression

No incoming value is used. Instead the expression is evaluated. If the result is true, matching succeeds (so far); otherwise the match fails. This form is called a guard.

@ pattern

A splice pattern may match multiple (zero or more) elements of a sequence. The pattern is matched against the resulting sub-sequence.

(! [x @r] [2 3 5 7 11])

This binds x to 2 and r to [3 5 7 11].

pattern ...

A repeat pattern is similar to @pattern in that it matches multiple elements of a sequence. However, each individual element is matched against the pattern, rather than the elements being matched together as a sequence.
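The difference from a splice pattern can be sketched as follows. This example assumes that a variable bound by a repeat pattern may be used in an expression that is itself followed by ... (a repeat expression), which this section does not describe, so treat it as an illustration rather than a definitive usage.

```scheme
;; Splice pattern: r is matched once, against the whole sub-sequence.
(! [first @r] [2 3 5 7])

;; Repeat pattern: x is matched against each element in turn.
(! [x ...] [2 3 5 7])

;; Assumption: a repeat variable can be used in a repeat expression,
;; so the expression before ... is evaluated once per matched element.
(+ (* x x) ...)
```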