MIT/GNU Scheme 11.0.90

MIT/GNU Scheme

This manual documents MIT/GNU Scheme 11.0.90.

Copyright © 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020 Massachusetts Institute of Technology

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License.”


Acknowledgements

While "a cast of thousands" may be an overstatement, it is certainly the case that this document represents the work of many people. First and foremost, thanks go to the authors of the Revised^4 Report on the Algorithmic Language Scheme, from which much of this document is derived. Thanks also to BBN Advanced Computers Inc. for the use of parts of their Butterfly Scheme Reference, and to Margaret O’Connell for translating it from BBN’s text-formatting language to ours.

Special thanks to Richard Stallman, Bob Chassell, and Brian Fox, all of the Free Software Foundation, for creating and maintaining the Texinfo formatting language in which this document is written.

This report describes research done at the Artificial Intelligence Laboratory and the Laboratory for Computer Science, both of the Massachusetts Institute of Technology. Support for this research is provided in part by the Advanced Research Projects Agency of the Department of Defense and by the National Science Foundation.


1 Overview

This manual is a detailed description of the MIT/GNU Scheme runtime system. It is intended to be a reference document for programmers. It does not describe how to run Scheme or how to interact with it — that is the subject of the MIT/GNU Scheme User’s Manual.

This chapter summarizes the semantics of Scheme, briefly describes the MIT/GNU Scheme programming environment, and explains the syntactic and lexical conventions of the language. Subsequent chapters describe special forms, numerous data abstractions, and facilities for input and output.

Throughout this manual, we will make frequent references to standard Scheme, which is the language defined by the document Revised^4 Report on the Algorithmic Language Scheme, by William Clinger, Jonathan Rees, et al., or by IEEE Std. 1178-1990, IEEE Standard for the Scheme Programming Language (in fact, several parts of this document are copied from the Revised Report). MIT/GNU Scheme is an extension of standard Scheme.

These are the significant semantic characteristics of the Scheme language:

Variables are statically scoped

Scheme is a statically scoped programming language, which means that each use of a variable is associated with a lexically apparent binding of that variable. Algol is another statically scoped language.

Types are latent

Scheme has latent types as opposed to manifest types, which means that Scheme associates types with values (or objects) rather than with variables. Other languages with latent types (also referred to as weakly typed or dynamically typed languages) include APL, Snobol, and other dialects of Lisp. Languages with manifest types (sometimes referred to as strongly typed or statically typed languages) include Algol 60, Pascal, and C.

Objects have unlimited extent

All objects created during a Scheme computation, including procedures and continuations, have unlimited extent; no Scheme object is ever destroyed. The system doesn’t run out of memory because the garbage collector reclaims the storage occupied by an object when the object cannot possibly be needed by a future computation. Other languages in which most objects have unlimited extent include APL and other Lisp dialects.

Proper tail recursion

Scheme is properly tail-recursive, which means that iterative computation can occur in constant space, even if the iterative computation is described by a syntactically recursive procedure. With a tail-recursive implementation, you can express iteration using the ordinary procedure-call mechanics; special iteration expressions are provided only for syntactic convenience.

Procedures are objects

Scheme procedures are objects, which means that you can create them dynamically, store them in data structures, return them as the results of other procedures, and so on. Other languages with such procedure objects include Common Lisp and ML.

Continuations are explicit

In most other languages, continuations operate behind the scenes. In Scheme, continuations are objects; you can use continuations for implementing a variety of advanced control constructs, including non-local exits, backtracking, and coroutines.

Arguments are passed by value

Arguments to Scheme procedures are passed by value, which means that Scheme evaluates the argument expressions before the procedure gains control, whether or not the procedure needs the result of the evaluations. ML, C, and APL are three other languages that pass arguments by value. In languages such as SASL and Algol 60, argument expressions are not evaluated unless the values are needed by the procedure.
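Several of the characteristics listed above can be seen in a few lines of code. The following is a minimal sketch and is not part of the original manual; the names count-up, compose2, find-first, note, and first-only are hypothetical and chosen only for illustration.

```scheme
;; Proper tail recursion: the call to LOOP is in tail position,
;; so this iteration runs in constant space.
(define (count-up n)
  (let loop ((i 0) (sum 0))
    (if (> i n)
        sum
        (loop (+ i 1) (+ sum i)))))

;; Procedures are objects: COMPOSE2 creates and returns a procedure.
(define (compose2 f g)
  (lambda (x) (f (g x))))

;; Continuations are explicit: calling RETURN abandons the rest of
;; the FOR-EACH traversal (a non-local exit).
(define (find-first pred items)
  (call-with-current-continuation
    (lambda (return)
      (for-each (lambda (item)
                  (if (pred item) (return item)))
                items)
      #f)))

;; Arguments are passed by value: both operands are evaluated before
;; FIRST-ONLY gains control, although it never uses B.
(define evaluated '())
(define (note tag value)
  (set! evaluated (cons tag evaluated))
  value)
(define (first-only a b) a)
(define first-result (first-only (note 'a 1) (note 'b 2)))

(count-up 10)                           ; ⇒ 55
((compose2 car cdr) '(1 2 3))           ; ⇒ 2
(find-first even? '(1 3 4 5 6))         ; ⇒ 4
(length evaluated)                      ; ⇒ 2
```

The order of the tags recorded in evaluated is deliberately not shown, because the order in which operands are evaluated is unspecified.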

Scheme uses a parenthesized-list Polish notation to describe programs and (other) data. The syntax of Scheme, like that of most Lisp dialects, provides for great expressive power, largely due to its simplicity. An important consequence of this simplicity is the susceptibility of Scheme programs and data to uniform treatment by other Scheme programs. As with other Lisp dialects, the read primitive parses its input; that is, it performs syntactic as well as lexical decomposition of what it reads.


1.1 Notational Conventions

This section details the notational conventions used throughout the rest of this document.


1.1.1 Errors

When this manual uses the phrase “an error will be signalled,” it means that Scheme will call error, which normally halts execution of the program and prints an error message.

When this manual uses the phrase “it is an error,” it means that the specified action is not valid in Scheme, but the system may or may not signal the error. When this manual says that something “must be,” it means that violating the requirement is an error.


1.1.2 Examples

This manual gives many examples showing the evaluation of expressions. The examples have a common format that shows the expression being evaluated on the left-hand side, an “arrow” in the middle, and the value of the expression on the right. For example:

(+ 1 2)          ⇒  3

Sometimes the arrow and value will be moved under the expression, due to lack of space. Occasionally we will not care what the value is, in which case both the arrow and the value are omitted.

If an example shows an evaluation that results in an error, an error message is shown, prefaced by ‘error→’:

(+ 1 'foo)                      error→ Illegal datum

An example that shows printed output marks it with ‘-|’:

(begin (write 'foo) 'bar)
     -| foo
     ⇒ bar

When this manual indicates that the value returned by some expression is unspecified, it means that the expression will evaluate to some object without signalling an error, but that programs should not depend on the value in any way.


1.1.3 Entry Format

Each description of an MIT/GNU Scheme variable, special form, or procedure begins with one or more header lines in this format:

category: template

where category specifies the kind of item (“variable”, “special form”, or “procedure”). The form of template is interpreted depending on category.

Variable

Template consists of the variable’s name.

Parameter

Template consists of the parameter’s name. See Dynamic Binding and Parameters for more information.

Special Form

Template starts with the syntactic keyword of the special form, followed by a description of the special form’s syntax. The description is written using the following conventions.

Named components are italicized in the printed manual, and uppercase in the Info file. “Noise” keywords, such as the else keyword in the cond special form, are set in a fixed width font in the printed manual; in the Info file they are not distinguished. Parentheses indicate themselves.

A horizontal ellipsis (…) describes repeated components. Specifically,

thing …

indicates zero or more occurrences of thing, while

thing thing …

indicates one or more occurrences of thing.

Brackets, [ ], enclose optional components.

Several special forms (e.g. lambda) have an internal component consisting of a series of expressions; usually these expressions are evaluated sequentially under conditions that are specified in the description of the special form. This sequence of expressions is commonly referred to as the body of the special form.

Procedure

Template starts with the name of the variable to which the procedure is bound, followed by a description of the procedure’s arguments. The arguments are described using “lambda list” notation (see Lambda Expressions), except that brackets are used to denote optional arguments, and ellipses are used to denote “rest” arguments.

The names of the procedure’s arguments are italicized in the printed manual, and uppercase in the Info file.

When an argument names a Scheme data type, it indicates that the argument must be that type of data object. For example,

procedure: cdr pair

indicates that the standard Scheme procedure cdr takes one argument, which must be a pair.

Many procedures signal an error when an argument is of the wrong type; usually this error is a condition of type condition-type:wrong-type-argument.

In addition to the standard data-type names (pair, list, boolean, string, etc.), the following names as arguments also imply type restrictions:

Some examples:

procedure: list object …

indicates that the standard Scheme procedure list takes zero or more arguments, each of which may be any Scheme object.

procedure: write-char char [output-port]

indicates that the standard Scheme procedure write-char must be called with a character, char, and may also be called with a character and an output port.


1.2 Scheme Concepts


1.2.1 Variable Bindings

Any identifier that is not a syntactic keyword may be used as a variable (see Identifiers). A variable may name a location where a value can be stored. A variable that does so is said to be bound to the location. The value stored in the location to which a variable is bound is called the variable’s value. (The variable is sometimes said to name the value or to be bound to the value.)

A variable may be bound but still not have a value; such a variable is said to be unassigned. Referencing an unassigned variable is an error. When this error is signalled, it is a condition of type condition-type:unassigned-variable; sometimes the compiler does not generate code to signal the error. Unassigned variables are useful only in combination with side effects (see Assignments).


1.2.2 Environment Concepts

An environment is a set of variable bindings. If an environment has no binding for a variable, that variable is said to be unbound in that environment. Referencing an unbound variable signals a condition of type condition-type:unbound-variable.

A new environment can be created by extending an existing environment with a set of new bindings. Note that “extending an environment” does not modify the environment; rather, it creates a new environment that contains the new bindings and the old ones. The new bindings shadow the old ones; that is, if an environment that contains a binding for x is extended with a new binding for x, then only the new binding is seen when x is looked up in the extended environment. Sometimes we say that the original environment is the parent of the new one, or that the new environment is a child of the old one, or that the new environment inherits the bindings in the old one.
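As an illustration of extension and shadowing, here is a minimal sketch (not from the original manual; the variable names are arbitrary). It uses let, one of the environment-extending expressions.

```scheme
(define x 1)
(define y 10)

;; LET extends the current environment with a new binding for X.
;; Inside the body the new binding shadows the outer one, while Y
;; is inherited from the parent environment.
(define inner
  (let ((x 2))
    (+ x y)))                           ; ⇒ 12

;; The original environment was not modified by the extension.
(define outer (+ x y))                  ; ⇒ 11
```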

Procedure calls extend an environment, as do let, let*, letrec, and do expressions. Internal definitions (see Internal Definitions) also extend an environment. (Actually, all the constructs that extend environments can be expressed in terms of procedure calls, so there is really just one fundamental mechanism for environment extension.) A top-level definition (see Top-Level Definitions) may add a binding to an existing environment.


1.2.3 Initial and Current Environments

MIT/GNU Scheme provides an initial environment that contains all of the variable bindings described in this manual. Most environments are ultimately extensions of this initial environment. In Scheme, the environment in which your programs execute is actually a child (extension) of the environment containing the system’s bindings. Thus, system names are visible to your programs, but your names do not interfere with system programs.

The environment in effect at some point in a program is called the current environment at that point. In particular, every REP loop has a current environment. (REP stands for “read-eval-print”; the REP loop is the Scheme program that reads your input, evaluates it, and prints the result.) The environment of the top-level REP loop (the one you are in when Scheme starts up) starts as user-initial-environment, although it can be changed by the ge procedure. When a new REP loop is created, its environment is determined by the program that creates it.


1.2.4 Static Scoping

Scheme is a statically scoped language with block structure. In this respect, it is like Algol and Pascal, and unlike most other dialects of Lisp except for Common Lisp.

The fact that Scheme is statically scoped (rather than dynamically bound) means that the environment that is extended (and becomes current) when a procedure is called is the environment in which the procedure was created (i.e. in which the procedure’s defining lambda expression was evaluated), not the environment in which the procedure is called. Because all the other Scheme binding expressions can be expressed in terms of procedures, this determines how all bindings behave.

Consider the following definitions, made at the top-level REP loop (in the initial environment):

(define x 1)
(define (f x) (g 2))
(define (g y) (+ x y))
(f 5)                                       ⇒  3 ; not 7

Here f and g are bound to procedures created in the initial environment. Because Scheme is statically scoped, the call to g from f extends the initial environment (the one in which g was created) with a binding of y to 2. In this extended environment, y is 2 and x is 1. (In a dynamically bound Lisp, the call to g would extend the environment in effect during the call to f, in which x is bound to 5 by the call to f, and the answer would be 7.)

Note that with static scoping, you can tell what binding a variable reference refers to just from looking at the text of the program; the referenced binding cannot depend on how the program is used. That is, the nesting of environments (their parent-child relationship) corresponds to the nesting of binding expressions in program text. (Because of this connection to the text of the program, static scoping is also called lexical scoping.) For each place where a variable is bound in a program there is a corresponding region of the program text within which the binding is effective. For example, the region of a binding established by a lambda expression is the entire body of the lambda expression. The documentation of each binding expression explains what the region of the bindings it makes is. A use of a variable (that is, a reference to or assignment of a variable) refers to the innermost binding of that variable whose region contains the variable use. If there is no such region, the use refers to the binding of the variable in the global environment (which is an ancestor of all other environments, and can be thought of as a region in which all your programs are contained).


1.2.5 True and False

In Scheme, the boolean values true and false are denoted by #t and #f. However, any Scheme value can be treated as a boolean for the purpose of a conditional test. This manual uses the word true to refer to any Scheme value that counts as true, and the word false to refer to any Scheme value that counts as false. In conditional tests, all values count as true except for #f, which counts as false (see Conditionals).
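The rule can be checked directly. In this small sketch, truthiness is a hypothetical helper (not a standard procedure) that reports how a value behaves in a conditional test:

```scheme
;; Hypothetical helper: classify a value as it is seen by IF.
(define (truthiness value)
  (if value 'true 'false))

(truthiness #t)                         ; ⇒ true
(truthiness 0)                          ; ⇒ true
(truthiness "")                         ; ⇒ true
(truthiness '())                        ; ⇒ true
(truthiness #f)                         ; ⇒ false
```

Note that zero, the empty string, and the empty list all count as true, which differs from the conventions of some other languages.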


1.2.6 External Representations

An important concept in Scheme is that of the external representation of an object as a sequence of characters. For example, an external representation of the integer 28 is the sequence of characters ‘28’, and an external representation of a list consisting of the integers 8 and 13 is the sequence of characters ‘(8 13)’.

The external representation of an object is not necessarily unique. The integer 28 also has representations ‘#e28.000’ and ‘#x1c’, and the list in the previous paragraph also has the representations ‘( 08 13 )’ and ‘(8 . (13 . ( )))’.

Many objects have standard external representations, but some, such as procedures and circular data structures, do not have standard representations (although particular implementations may define representations for them).

An external representation may be written in a program to obtain the corresponding object (see Quoting).

External representations can also be used for input and output. The procedure read parses external representations, and the procedure write generates them. Together, they provide an elegant and powerful input/output facility.
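The following sketch shows read and write operating on the representations mentioned above. The helpers parse-text and render-object are hypothetical names introduced here; they assume the string-port procedures open-input-string, open-output-string, and get-output-string are available.

```scheme
;; Hypothetical helper: parse one external representation from a string.
(define (parse-text text)
  (read (open-input-string text)))

;; Hypothetical helper: generate an external representation as a string.
(define (render-object object)
  (let ((port (open-output-string)))
    (write object port)
    (get-output-string port)))

;; Different external representations of the same list.
(equal? (parse-text "(8 13)")
        (parse-text "( 08 13 )"))                 ; ⇒ #t
(equal? (parse-text "(8 13)")
        (parse-text "(8 . (13 . ( )))"))          ; ⇒ #t

;; Different external representations of the integer 28.
(= (parse-text "28")
   (parse-text "#e28.000")
   (parse-text "#x1c"))                           ; ⇒ #t

;; WRITE generates one particular representation.
(render-object (parse-text "(8 . (13 . ( )))"))   ; ⇒ "(8 13)"
```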

Note that the sequence of characters ‘(+ 2 6)’ is not an external representation of the integer 8, even though it is an expression that evaluates to the integer 8; rather, it is an external representation of a three-element list, the elements of which are the symbol + and the integers 2 and 6. Scheme’s syntax has the property that any sequence of characters that is an expression is also the external representation of some object. This can lead to confusion, since it may not be obvious out of context whether a given sequence of characters is intended to denote data or program, but it is also a source of power, since it facilitates writing programs such as interpreters and compilers that treat programs as data or data as programs.


1.2.7 Disjointness of Types

Every object satisfies at most one of the following predicates (but see True and False, for an exception):

bit-string?     environment?    port?           symbol?
boolean?        null?           procedure?      vector?
cell?           number?         promise?        weak-pair?
char?           pair?           string?
condition?

1.2.8 Storage Model

This section describes a model that can be used to understand Scheme’s use of storage.

Variables and objects such as pairs, vectors, and strings implicitly denote locations or sequences of locations. A string, for example, denotes as many locations as there are characters in the string. (These locations need not correspond to a full machine word.) A new value may be stored into one of these locations using the string-set! procedure, but the string continues to denote the same locations as before.

An object fetched from a location, by a variable reference or by a procedure such as car, vector-ref, or string-ref, is equivalent in the sense of eqv? to the object last stored in the location before the fetch.
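A short sketch of this model, using a string (the variable names s and alias are arbitrary): storing into a location changes the value fetched from it, but the object continues to denote the same locations.

```scheme
;; MAKE-STRING allocates fresh, mutable locations, one per character.
(define s (make-string 3 #\a))
(define alias s)                        ; ALIAS denotes the same locations

;; Store a new value into the second location.
(string-set! s 1 #\b)

(string-ref s 1)                        ; ⇒ #\b
alias                                   ; ⇒ "aba"
(eq? s alias)                           ; ⇒ #t
```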

Every location is marked to show whether it is in use. No variable or object ever refers to a location that is not in use. Whenever this document speaks of storage being allocated for a variable or object, what is meant is that an appropriate number of locations are chosen from the set of locations that are not in use, and the chosen locations are marked to indicate that they are now in use before the variable or object is made to denote them.

In many systems it is desirable for constants (i.e. the values of literal expressions) to reside in read-only memory. To express this, it is convenient to imagine that every object that denotes locations is associated with a flag telling whether that object is mutable or immutable. The constants and the strings returned by symbol->string are then the immutable objects, while all objects created by other procedures are mutable. It is an error to attempt to store a new value into a location that is denoted by an immutable object. Note that the MIT/GNU Scheme compiler takes advantage of this property to share constants, but that these constants are not immutable. Instead, two constants that are equal? may be eq? in compiled code.


1.3 Lexical Conventions

This section describes Scheme’s lexical conventions.


1.3.1 Whitespace

Whitespace characters are spaces, newlines, tabs, and page breaks. Whitespace is used to improve the readability of your programs and to separate tokens from each other, when necessary. (A token is an indivisible lexical unit such as an identifier or number.) Whitespace is otherwise insignificant. Whitespace may occur between any two tokens, but not within a token. Whitespace may also occur inside a string, where it is significant.


1.3.2 Delimiters

All whitespace characters are delimiters. In addition, the following characters act as delimiters:

(  )  ;  "  '  `  |

Finally, these next characters act as delimiters, despite the fact that Scheme does not define any special meaning for them:

[  ]  {  }

For example, if the value of the variable name is "max":

(list"Hi"name(+ 1 2))                   ⇒  ("Hi" "max" 3)

1.3.3 Identifiers

An identifier is a sequence of one or more non-delimiter characters. Identifiers are used in several ways in Scheme programs:

Scheme accepts most of the identifiers that other programming languages allow. MIT/GNU Scheme allows all of the identifiers that standard Scheme does, plus many more.

MIT/GNU Scheme defines a potential identifier to be a sequence of non-delimiter characters that does not begin with either of the characters ‘#’ or ‘,’. Any such sequence of characters that is not a syntactically valid number (see Numbers) is considered to be a valid identifier. Note that, although it is legal for ‘#’ and ‘,’ to appear in an identifier (other than in the first character position), it is poor programming practice.

Here are some examples of identifiers:

lambda             q
list->vector       soup
+                  V17a
<=?                a34kTMNs
the-word-recursion-has-many-meanings

1.3.4 Uppercase and Lowercase

Scheme doesn’t distinguish uppercase and lowercase forms of a letter except within character and string constants; in other words, Scheme is case-insensitive. For example, ‘Foo’ is the same identifier as ‘FOO’, and ‘#x1AB’ is the same number as ‘#X1ab’. But ‘#\a’ and ‘#\A’ are different characters.


1.3.5 Naming Conventions

A predicate is a procedure that always returns a boolean value (#t or #f). By convention, predicates usually have names that end in ‘?’.

A mutation procedure is a procedure that alters a data structure. By convention, mutation procedures usually have names that end in ‘!’.


1.3.6 Comments

The beginning of a comment is indicated with a semicolon (;). Scheme ignores everything on a line in which a semicolon appears, from the semicolon until the end of the line. The entire comment, including the newline character that terminates it, is treated as whitespace.

An alternative form of comment (sometimes called an extended comment) begins with the characters ‘#|’ and ends with the characters ‘|#’. This alternative form is an MIT/GNU Scheme extension. As with ordinary comments, all of the characters in an extended comment, including the leading ‘#|’ and trailing ‘|#’, are treated as whitespace. Comments of this form may extend over multiple lines, and additionally may be nested (unlike the comments of the programming language C, which have a similar syntax).

;;; This is a comment about the FACT procedure.  Scheme
;;; ignores all of this comment.  The FACT procedure computes
;;; the factorial of a non-negative integer.

#|
This is an extended comment.
Such comments are useful for commenting out code fragments.
|#

(define fact
  (lambda (n)
    (if (= n 0)                      ;This is another comment:
        1                            ;Base case: return 1
        (* n (fact (- n 1))))))

1.3.7 Additional Notations

The following list describes additional notations used in Scheme. See Numbers, for a description of the notations used for numbers.

+ - .

The plus sign, minus sign, and period are used in numbers, and may also occur in an identifier. A delimited period (not occurring within a number or identifier) is used in the notation for pairs and to indicate a “rest” parameter in a formal parameter list (see Lambda Expressions).

( )

Parentheses are used for grouping and to notate lists (see Lists).

"

The double quote delimits strings (see Strings).

\

The backslash is used in the syntax for character constants (see Characters) and as an escape character within string constants (see Strings).

;

The semicolon starts a comment.

'

The single quote indicates literal data; it suppresses evaluation (see Quoting).

`

The backquote indicates almost-constant data (see Quoting).

,

The comma is used in conjunction with the backquote (see Quoting).

,@

A comma followed by an at-sign is used in conjunction with the backquote (see Quoting).

#

The sharp (or pound) sign has different uses, depending on the character that immediately follows it:

#t #f

These character sequences denote the boolean constants (see Booleans).

#\

This character sequence introduces a character constant (see Characters).

#(

This character sequence introduces a vector constant (see Vectors). A close parenthesis, ‘)’, terminates a vector constant.

#e #i #b #o #d #l #s #x

These character sequences are used in the notation for numbers (see Numbers).

#|

This character sequence introduces an extended comment. The comment is terminated by the sequence ‘|#’. This notation is an MIT/GNU Scheme extension.

#!

This character sequence is used to denote a small set of named constants. Currently there are only two of these, #!optional and #!rest, both of which are used in the lambda special form to mark certain parameters as being “optional” or “rest” parameters. This notation is an MIT/GNU Scheme extension.

#*

This character sequence introduces a bit string (see Bit Strings). This notation is an MIT/GNU Scheme extension.

#[

This character sequence is used to denote objects that do not have a readable external representation (see Custom Output). A close bracket, ‘]’, terminates the object’s notation. This notation is an MIT/GNU Scheme extension.

#@

This character sequence is a convenient shorthand used to refer to objects by their hash number (see Custom Output). This notation is an MIT/GNU Scheme extension.

#=
##

These character sequences introduce a notation used to show circular structures in printed output, or to denote them in input. The notation works much like that in Common Lisp, and is an MIT/GNU Scheme extension.


1.4 Expressions

A Scheme expression is a construct that returns a value. An expression may be a literal, a variable reference, a special form, or a procedure call.


1.4.1 Literal Expressions

Literal constants may be written by using an external representation of the data. In general, the external representation must be quoted (see Quoting); but some external representations can be used without quotation.

"abc"                                   ⇒  "abc"
145932                                  ⇒  145932
#t                                      ⇒  #t
#\a                                     ⇒  #\a

The external representations of numeric constants, string constants, character constants, and boolean constants evaluate to the constants themselves. Symbols, pairs, lists, and vectors require quoting.
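For the data types that require quoting, a brief sketch follows (the variable names are arbitrary and used only for illustration):

```scheme
;; Without the quote, SOUP would be a variable reference and
;; (+ 2 6) would be a procedure call.
(define quoted-symbol 'soup)
(define quoted-list   '(+ 2 6))
(define quoted-vector '#(1 2 3))

(symbol? quoted-symbol)                 ; ⇒ #t
(length quoted-list)                    ; ⇒ 3
(car quoted-list)                       ; ⇒ +
(vector-ref quoted-vector 0)            ; ⇒ 1
```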


1.4.2 Variable References

An expression consisting of an identifier (see Identifiers) is a variable reference; the identifier is the name of the variable being referenced. The value of the variable reference is the value stored in the location to which the variable is bound. An error is signalled if the referenced variable is unbound or unassigned.

(define x 28)
x                                       ⇒  28

1.4.3 Special Form Syntax

(keyword component …)

A parenthesized expression that starts with a syntactic keyword is a special form. Each special form has its own syntax, which is described later in the manual.

Note that syntactic keywords and variable bindings share the same namespace. A local variable binding may shadow a syntactic keyword, and a local syntactic-keyword definition may shadow a variable binding.
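To illustrate the shared namespace, the sketch below binds the identifier if as a local variable. It is written only to demonstrate shadowing; doing this in real programs is confusing and best avoided.

```scheme
;; Within this LET, IF is an ordinary variable bound to the LIST
;; procedure, so (if 1 2 3) is a procedure call, not a conditional.
(define shadowed
  (let ((if list))
    (if 1 2 3)))                        ; ⇒ (1 2 3)

;; Outside the LET, IF is the usual syntactic keyword again.
(define unshadowed (if #f 2 3))         ; ⇒ 3
```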

The following list contains all of the syntactic keywords that are defined when MIT/GNU Scheme is initialized:

access                  and                     begin
case                    cond                    cons-stream
declare                 define
define-integrable       define-structure        define-syntax
delay                   do                      er-macro-transformer
fluid-let               if                      lambda
let                     let*                    let*-syntax
let-syntax              letrec                  letrec-syntax
local-declare           named-lambda            non-hygienic-macro-transformer
or                      quasiquote              quote
rsc-macro-transformer   sc-macro-transformer    set!
syntax-rules            the-environment

1.4.4 Procedure Call Syntax

(operator operand …)

A procedure call is written by simply enclosing in parentheses expressions for the procedure to be called (the operator) and the arguments to be passed to it (the operands). The operator and operand expressions are evaluated and the resulting procedure is passed the resulting arguments. See Lambda Expressions, for a more complete description of this.

Another name for the procedure call expression is combination. This word is more specific in that it always refers to the expression; “procedure call” sometimes refers to the process of calling a procedure.

Unlike some other dialects of Lisp, Scheme always evaluates the operator expression and the operand expressions with the same evaluation rules, and the order of evaluation is unspecified.

(+ 3 4)                                 ⇒  7
((if #f = *) 3 4)                       ⇒  12

A number of procedures are available as the values of variables in the initial environment; for example, the addition and multiplication procedures in the above examples are the values of the variables + and *. New procedures are created by evaluating lambda expressions.

If the operator is a syntactic keyword, then the expression is not treated as a procedure call: it is a special form.



2 Special Forms

A special form is an expression that follows special evaluation rules. This chapter describes the basic Scheme special forms.



2.1 Lambda Expressions

extended standard special form: lambda formals expr expr …

A lambda expression evaluates to a procedure. The environment in effect when the lambda expression is evaluated is remembered as part of the procedure; it is called the closing environment. When the procedure is later called with some arguments, the closing environment is extended by binding the variables in the formal parameter list to fresh locations, and the locations are filled with the arguments according to rules about to be given. The new environment created by this process is referred to as the invocation environment.

Once the invocation environment has been constructed, the exprs in the body of the lambda expression are evaluated sequentially in it. This means that the region of the variables bound by the lambda expression is all of the exprs in the body. The result of evaluating the last expr in the body is returned as the result of the procedure call.

Formals, the formal parameter list, is often referred to as a lambda list.

The process of matching up formal parameters with arguments is somewhat involved. There are three types of parameters, and the matching treats each in sequence:

Required

All of the required parameters are matched against the arguments first. If there are fewer arguments than required parameters, an error of type condition-type:wrong-number-of-arguments is signalled; this error is also signalled if there are more arguments than required parameters and there are no further parameters.

Optional

Once the required parameters have all been matched, the optional parameters are matched against the remaining arguments. If there are fewer arguments than optional parameters, the unmatched parameters are bound to special objects called default objects. If there are more arguments than optional parameters, and there are no further parameters, an error of type condition-type:wrong-number-of-arguments is signalled.

The predicate default-object?, which is true only of default objects, can be used to determine which optional parameters were supplied, and which were defaulted.

Rest

Finally, if there is a rest parameter (there can only be one), any remaining arguments are made into a list, and the list is bound to the rest parameter. (If there are no remaining arguments, the rest parameter is bound to the empty list.)

In Scheme, unlike some other Lisp implementations, the list to which a rest parameter is bound is always freshly allocated. It has infinite extent and may be modified without affecting the procedure’s caller.
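
The fresh-allocation guarantee can be observed with a minimal sketch like the one below. The procedure name clobber-first! and the variable original are hypothetical, chosen only for this illustration.

```scheme
;; Hypothetical procedure: destructively modifies its own rest list.
(define (clobber-first! . args)
  (set-car! args 'changed)
  args)

(define original (list 1 2 3))

;; The rest list is a fresh copy, so the caller's list is untouched.
(apply clobber-first! original)         ; ⇒  (changed 2 3)
original                                ; ⇒  (1 2 3)
```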

Specially recognized keywords divide the formal parameters into these three classes. The keywords used here are ‘#!optional’, ‘.’, and ‘#!rest’. Note that only ‘.’ is defined by standard Scheme; the other keywords are MIT/GNU Scheme extensions. ‘#!rest’ has the same meaning as ‘.’ in formals.

The use of these keywords is best explained by means of examples. The following are typical lambda lists, followed by descriptions of which parameters are required, optional, and rest. We will use ‘#!rest’ in these examples, but anywhere it appears ‘.’ could be used instead.

(a b c)

a, b, and c are all required. The procedure must be passed exactly three arguments.

(a b #!optional c)

a and b are required, c is optional. The procedure may be passed either two or three arguments.

(#!optional a b c)

a, b, and c are all optional. The procedure may be passed any number of arguments between zero and three, inclusive.

a
(#!rest a)

These two examples are equivalent. a is a rest parameter. The procedure may be passed any number of arguments. Note: this is the only case in which ‘.’ cannot be used in place of ‘#!rest’.

(a b #!optional c d #!rest e)

a and b are required, c and d are optional, and e is rest. The procedure may be passed two or more arguments.
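
The sketch below combines all three parameter classes in one procedure and uses default-object? to supply a fallback for an omitted optional argument. The procedure greet and its parameter names are hypothetical.

```scheme
;; Hypothetical procedure with one required, one optional,
;; and one rest parameter.
(define (greet name #!optional greeting #!rest extras)
  (list (if (default-object? greeting) "Hello" greeting)
        name
        extras))

(greet "Ada")                           ; ⇒  ("Hello" "Ada" ())
(greet "Ada" "Hi")                      ; ⇒  ("Hi" "Ada" ())
(greet "Ada" "Hi" 1 2)                  ; ⇒  ("Hi" "Ada" (1 2))
```

Calling (greet) with no arguments leaves the required parameter unmatched, so it signals the wrong-number-of-arguments error described above.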

Some examples of lambda expressions:

(lambda (x) (+ x x))            ⇒  #[compound-procedure 53]

((lambda (x) (+ x x)) 4)                ⇒  8

(define reverse-subtract
  (lambda (x y)
    (- y x)))
(reverse-subtract 7 10)                 ⇒  3

(define foo
  (let ((x 4))
    (lambda (y) (+ x y))))
(foo 6)                                 ⇒  10

special form: named-lambda formals expression expression …

The named-lambda special form is similar to lambda, except that the first “required parameter” in formals is not a parameter but the name of the resulting procedure; thus formals must have at least one required parameter. This name has no semantic meaning, but is included in the external representation of the procedure, making it useful for debugging. In MIT/GNU Scheme, lambda is implemented as named-lambda, with a special name that means “unnamed”.

(named-lambda (f x) (+ x x))    ⇒  #[compound-procedure 53 f]
((named-lambda (f x) (+ x x)) 4)        ⇒  8


2.2 Lexical Binding

The binding constructs let, let*, letrec, letrec*, let-values, and let*-values give Scheme block structure, like Algol 60. The syntax of the first four constructs is identical, but they differ in the regions they establish for their variable bindings. In a let expression, the initial values are computed before any of the variables become bound; in a let* expression, the bindings and evaluations are performed sequentially; while in letrec and letrec* expressions, all the bindings are in effect while their initial values are being computed, thus allowing mutually recursive definitions. The let-values and let*-values constructs are analogous to let and let* respectively, but are designed to handle multiple-valued expressions, binding different identifiers to the returned values.
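
The difference in regions can be seen in a minimal side-by-side sketch. The names x, y, and get-x are arbitrary and exist only for this illustration.

```scheme
(define x 'outer)

;; let: the inits see only the enclosing bindings.
(let ((x 'inner) (y x))
  y)                                    ; ⇒  outer

;; let*: each init sees the bindings to its left.
(let* ((x 'inner) (y x))
  y)                                    ; ⇒  inner

;; letrec: the lambda may refer to a variable bound later in the same
;; form, because it is not called until all inits have been evaluated.
(letrec ((get-x (lambda () x))
         (x 'inner))
  (get-x))                              ; ⇒  inner
```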

extended standard special form: let ((variable init) …) expr expr …

The inits are evaluated in the current environment (in some unspecified order), the variables are bound to fresh locations holding the results, the exprs are evaluated sequentially in the extended environment, and the value of the last expr is returned. Each binding of a variable has the exprs as its region.

MIT/GNU Scheme allows any of the inits to be omitted, in which case the corresponding variables are unassigned.

Note that the following are equivalent:

(let ((variable init) …) expr expr …)
((lambda (variable …) expr expr …) init …)

Some examples:

(let ((x 2) (y 3))
  (* x y))                              ⇒  6

(let ((x 2) (y 3))
  (let ((foo (lambda (z) (+ x y z)))
        (x 7))
    (foo 4)))                           ⇒  9

See Iteration, for information on “named let”.

extended standard special form: let* ((variable init) …) expr expr …

let* is similar to let, but the bindings are performed sequentially from left to right, and the region of a binding is that part of the let* expression to the right of the binding. Thus the second binding is done in an environment in which the first binding is visible, and so on.

Note that the following are equivalent:

(let* ((variable1 init1)
       (variable2 init2)
       …
       (variableN initN))
   expr
   expr …)

(let ((variable1 init1))
  (let ((variable2 init2))
    …
      (let ((variableN initN))
        expr
        expr …)
    …))

An example:

(let ((x 2) (y 3))
  (let* ((x 7)
         (z (+ x y)))
    (* z x)))                           ⇒  70

extended standard special form: letrec ((variable init) …) expr expr …

The variables are bound to fresh locations holding unassigned values, the inits are evaluated in the extended environment (in some unspecified order), each variable is assigned to the result of the corresponding init, the exprs are evaluated sequentially in the extended environment, and the value of the last expr is returned. Each binding of a variable has the entire letrec expression as its region, making it possible to define mutually recursive procedures.

MIT/GNU Scheme allows any of the inits to be omitted, in which case the corresponding variables are unassigned.

(letrec ((even?
          (lambda (n)
            (if (zero? n)
                #t
                (odd? (- n 1)))))
         (odd?
          (lambda (n)
            (if (zero? n)
                #f
                (even? (- n 1))))))
  (even? 88))                           ⇒  #t

One restriction on letrec is very important: it shall be possible to evaluate each init without assigning or referring to the value of any variable. If this restriction is violated, then it is an error. The restriction is necessary because Scheme passes arguments by value rather than by name. In the most common uses of letrec, all the inits are lambda or delay expressions and the restriction is satisfied automatically.

extended standard special form: letrec* ((variable init) …) expr expr …

The variables are bound to fresh locations, each variable is assigned in left-to-right order to the result of evaluating the corresponding init (interleaving evaluations and assignments), the exprs are evaluated in the resulting environment, and the values of the last expr are returned. Despite the left-to-right evaluation and assignment order, each binding of a variable has the entire letrec* expression as its region, making it possible to define mutually recursive procedures.

If it is not possible to evaluate each init without assigning or referring to the value of the corresponding variable or the variable of any of the bindings that follow it in bindings, it is an error. Another restriction is that it is an error to invoke the continuation of an init more than once.

;; Returns the arithmetic, geometric, and
;; harmonic means of a nested list of numbers
(define (means ton)
  (letrec*
     ((mean
        (lambda (f g)
          (f (/ (sum g ton) n))))
      (sum
        (lambda (g ton)
          (if (null? ton)
            (+)
            (if (number? ton)
                (g ton)
                (+ (sum g (car ton))
                   (sum g (cdr ton)))))))
      (n (sum (lambda (x) 1) ton)))
    (values (mean values values)
            (mean exp log)
            (mean / /))))

Evaluating (means '(3 (1 4))) returns three values: 8/3, 2.28942848510666 (approximately), and 36/19.

standard special form: let-values ((formals init) …) expr expr …

The inits are evaluated in the current environment (in some unspecified order) as if by invoking call-with-values, and the variables occurring in the formals are bound to fresh locations holding the values returned by the inits, where the formals are matched to the return values in the same way that the formals in a lambda expression are matched to the arguments in a procedure call. Then, the exprs are evaluated in the extended environment, and the values of the last expr are returned. Each binding of a variable has the exprs as its region.

It is an error if the formals do not match the number of values returned by the corresponding init.
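
Because formals are matched the way lambda formals are, they may be a proper list, a dotted list, or a single variable. The following is a minimal sketch of all three shapes, assuming the standard R7RS procedure floor/ is available; the variable names are arbitrary.

```scheme
(let-values (((first . others) (values 1 2 3))   ; dotted: rest collects 2 3
             (all (values 'a 'b))                ; single variable: all values
             ((q r) (floor/ 7 2)))               ; proper list: exactly two
  (list first others all q r))          ; ⇒  (1 (2 3) (a b) 3 1)
```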

(let-values (((root rem) (exact-integer-sqrt 32)))
  (* root rem))         ⇒  35

standard special form: let*-values ((formals init) …) expr expr …

The let*-values construct is similar to let-values, but the inits are evaluated and bindings created sequentially from left to right, with the region of the bindings of each formals including the inits to its right as well as body. Thus the second init is evaluated in an environment in which the first set of bindings is visible and initialized, and so on.

(let ((a 'a) (b 'b) (x 'x) (y 'y))
  (let*-values (((a b) (values x y))
                ((x y) (values a b)))
    (list a b x y)))    ⇒  (x y x y)


2.3 Dynamic Binding

standard special form: parameterize ((parameter value) …) expr expr …

Note that both parameter and value are expressions. It is an error if the value of any parameter expression is not a parameter object.

A parameterize expression is used to change the values of specified parameter objects during the evaluation of the body expressions.

The parameter and value expressions are evaluated in an unspecified order. The body is evaluated in a dynamic environment in which each parameter is bound to the converted value—the result of passing value to the conversion procedure specified when the parameter was created. Then the previous value of parameter is restored without passing it to the conversion procedure. The value of the parameterize expression is the value of the last body expr.

The parameterize special form is standardized by SRFI 39 and by R7RS.

Parameter objects can be used to specify configurable settings for a computation without the need to pass the value to every procedure in the call chain explicitly.

(define radix
  (make-parameter
   10
   (lambda (x)
     (if (and (exact-integer? x) (<= 2 x 16))
         x
         (error "invalid radix")))))

(define (f n) (number->string n (radix)))

(f 12)                                  ⇒ "12"
(parameterize ((radix 2))
  (f 12))                               ⇒ "1100"
(f 12)                                  ⇒ "12"
(radix 16)                              error→ Wrong number of arguments
(parameterize ((radix 0))
  (f 12))                               error→ invalid radix

A dynamic binding temporarily changes the value of a parameter object (see Parameters) for a dynamic extent. The set of all dynamic bindings at a given time is called the dynamic environment. The new values are only accessible to the thread that constructed the dynamic environment, and any threads created within that environment.

The extent of a dynamic binding is defined to be the time period during which calling the parameter returns the new value. Normally this time period begins when the body is entered and ends when it is exited, a contiguous time period. However, Scheme has first-class continuations, by which it is possible to leave the body and reenter it many times. In this situation, the extent is non-contiguous.

When the body is exited by invoking a continuation, the current dynamic environment is unwound until it can be re-wound to the environment captured by the continuation. When the continuation returns, the process is reversed, restoring the original dynamic environment.

The following example shows the interaction between dynamic binding and continuations. Side effects to the binding that occur both inside and outside of the body are preserved, even if continuations are used to jump in and out of the body repeatedly.

(define (complicated-dynamic-parameter)
  (let ((variable (make-settable-parameter 1))
        (inside-continuation))
    (write-line (variable))
    (call-with-current-continuation
     (lambda (outside-continuation)
       (parameterize ((variable 2))
         (write-line (variable))
         (variable 3)
         (call-with-current-continuation
          (lambda (k)
            (set! inside-continuation k)
            (outside-continuation #t)))
         (write-line (variable))
         (set! inside-continuation #f))))
    (write-line (variable))
    (if inside-continuation
        (begin
          (variable 4)
          (inside-continuation #f)))))

Evaluating ‘(complicated-dynamic-parameter)’ writes the following on the console:

1
2
1
3
4

Commentary: the first two values written are the initial binding of variable and its new binding inside parameterize’s body. Immediately after they are written, the binding visible in the body is set to ‘3’, and outside-continuation is invoked, exiting the body. At this point, ‘1’ is written, demonstrating that the original binding of variable is still visible outside the body. Then we set variable to ‘4’ and reenter the body by invoking inside-continuation. At this point, ‘3’ is written, indicating that the binding modified in the body is still the binding visible in the body. Finally, we exit the body normally, and write ‘4’, demonstrating that the binding modified outside of the body was also preserved.

2.3.1 Fluid-Let

The fluid-let special form can change the value of any variable for a dynamic extent, but it is difficult to implement in a multi-processing (SMP) world. It and the cell object type (see Cells) are now deprecated. They are still available and functional in a uni-processing (non-SMP) world, but will signal an error when used in an SMP world. The parameterize special form (see parameterize) should be used instead.

special form: fluid-let ((variable init) …) expression expression …

The inits are evaluated in the current environment (in some unspecified order), the current values of the variables are saved, the results are assigned to the variables, the expressions are evaluated sequentially in the current environment, the variables are restored to their original values, and the value of the last expression is returned.

The syntax of this special form is similar to that of let, but fluid-let temporarily rebinds existing variables. Unlike let, fluid-let creates no new bindings; instead it assigns the value of each init to the binding (determined by the rules of lexical scoping) of its corresponding variable.

MIT/GNU Scheme allows any of the inits to be omitted, in which case the corresponding variables are temporarily unassigned.

An error of type condition-type:unbound-variable is signalled if any of the variables are unbound. However, because fluid-let operates by means of side effects, it is valid for any variable to be unassigned when the form is entered.
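
The following sketch contrasts the deprecated fluid-let style with an equivalent written using parameterize. All names (depth, current-depth, depth-parameter, current-depth*) are hypothetical, and the fluid-let half assumes a non-SMP system, since the form signals an error otherwise.

```scheme
;; Deprecated style: temporarily assign an existing variable.
(define depth 0)
(define (current-depth) depth)

(fluid-let ((depth 1))
  (current-depth))                      ; ⇒  1
(current-depth)                         ; ⇒  0

;; Preferred style: the same idea with a parameter object.
(define depth-parameter (make-parameter 0))
(define (current-depth*) (depth-parameter))

(parameterize ((depth-parameter 1))
  (current-depth*))                     ; ⇒  1
(current-depth*)                        ; ⇒  0
```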



2.4 Definitions

extended standard special form: define variable [expression]
standard special form: define formals expression expression …

Definitions are valid in some but not all contexts where expressions are allowed. Definitions may only occur at the top level of a program and at the beginning of a lambda body (that is, the body of a lambda, let, let*, letrec, letrec*, let-values, let*-values, parameterize, or “procedure define” expression). A definition that occurs at the top level of a program is called a top-level definition, and a definition that occurs at the beginning of a body is called an internal definition.

In the second form of define (called “procedure define”), the component formals is identical to the component of the same name in a named-lambda expression. In fact, these two expressions are equivalent:

(define (name1 name2 …)
  expression
  expression …)

(define name1
  (named-lambda (name1 name2 …)
    expression
    expression …))


2.4.1 Top-Level Definitions

A top-level definition,

(define variable expression)

has essentially the same effect as this assignment expression, if variable is bound:

(set! variable expression)

If variable is not bound, however, define binds variable to a new location in the current environment before performing the assignment (it is an error to perform a set! on an unbound variable). If you omit expression, the variable becomes unassigned; an attempt to reference such a variable is an error.

(define add3
   (lambda (x) (+ x 3)))                ⇒  unspecified
(add3 3)                                ⇒  6

(define first car)                      ⇒  unspecified
(first '(1 2))                          ⇒  1

(define bar)                            ⇒  unspecified
bar                                     error→ Unassigned variable


2.4.2 Internal Definitions

An internal definition is a definition that occurs at the beginning of a body (that is, the body of a lambda, let, let*, letrec, letrec*, let-values, let*-values, parameterize, or “procedure define” expression), rather than at the top level of a program. The variable defined by an internal definition is local to the body. That is, variable is bound rather than assigned, and the region of the binding is the entire body. For example,

(let ((x 5))
  (define foo (lambda (y) (bar x y)))
  (define bar (lambda (a b) (+ (* a b) a)))
  (foo (+ x 3)))                        ⇒  45

A body containing internal definitions can always be converted into a completely equivalent letrec* expression. For example, the let expression in the above example is equivalent to

(let ((x 5))
  (letrec* ((foo (lambda (y) (bar x y)))
            (bar (lambda (a b) (+ (* a b) a))))
    (foo (+ x 3))))


2.5 Assignments

extended standard special form: set! variable [expression]

If expression is specified, evaluates expression and stores the resulting value in the location to which variable is bound. If expression is omitted, variable is altered to be unassigned; a subsequent reference to such a variable is an error. In either case, the value of the set! expression is unspecified.

Variable must be bound either in some region enclosing the set! expression, or at the top level. However, variable is permitted to be unassigned when the set! form is entered.

(define x 2)                            ⇒  unspecified
(+ x 1)                                 ⇒  3
(set! x 4)                              ⇒  unspecified
(+ x 1)                                 ⇒  5

Variable may be an access expression (see Environments). This allows you to assign variables in an arbitrary environment. For example,

(define x (let ((y 0)) (the-environment)))
(define y 'a)
y                                       ⇒  a
(access y x)                            ⇒  0
(set! (access y x) 1)                   ⇒  unspecified
y                                       ⇒  a
(access y x)                            ⇒  1


2.6 Quoting

This section describes the expressions that are used to modify or prevent the evaluation of objects.

standard special form: quote datum

(quote datum) evaluates to datum. Datum may be any external representation of a Scheme object (see External Representations). Use quote to include literal constants in Scheme code.

(quote a)                               ⇒  a
(quote #(a b c))                        ⇒  #(a b c)
(quote (+ 1 2))                         ⇒  (+ 1 2)

(quote datum) may be abbreviated as 'datum. The two notations are equivalent in all respects.

'a                                      ⇒  a
'#(a b c)                               ⇒  #(a b c)
'(+ 1 2)                                ⇒  (+ 1 2)
'(quote a)                              ⇒  (quote a)
''a                                     ⇒  (quote a)

Numeric constants, string constants, character constants, and boolean constants evaluate to themselves, so they don’t need to be quoted.

'"abc"                                  ⇒  "abc"
"abc"                                   ⇒  "abc"
'145932                                 ⇒  145932
145932                                  ⇒  145932
'#t                                     ⇒  #t
#t                                      ⇒  #t
'#\a                                    ⇒  #\a
#\a                                     ⇒  #\a

standard special form: quasiquote template

“Backquote” or “quasiquote” expressions are useful for constructing a list or vector structure when most but not all of the desired structure is known in advance. If no commas appear within the template, the result of evaluating `template is equivalent (in the sense of equal?) to the result of evaluating 'template. If a comma appears within the template, however, the expression following the comma is evaluated (“unquoted”) and its result is inserted into the structure instead of the comma and the expression. If a comma appears followed immediately by an at-sign (@), then the following expression shall evaluate to a list; the opening and closing parentheses of the list are then “stripped away” and the elements of the list are inserted in place of the comma at-sign expression sequence.

`(list ,(+ 1 2) 4)                       ⇒  (list 3 4)

(let ((name 'a)) `(list ,name ',name))   ⇒  (list a 'a)

`(a ,(+ 1 2) ,@(map abs '(4 -5 6)) b)    ⇒  (a 3 4 5 6 b)

`((foo ,(- 10 3)) ,@(cdr '(c)) . ,(car '(cons)))
                                         ⇒  ((foo 7) . cons)

`#(10 5 ,(sqrt 4) ,@(map sqrt '(16 9)) 8)
                                         ⇒  #(10 5 2 4 3 8)

`,(+ 2 3)                                ⇒  5

Quasiquote forms may be nested. Substitutions are made only for unquoted components appearing at the same nesting level as the outermost backquote. The nesting level increases by one inside each successive quasiquotation, and decreases by one inside each unquotation.

`(a `(b ,(+ 1 2) ,(foo ,(+ 1 3) d) e) f)
     ⇒  (a `(b ,(+ 1 2) ,(foo 4 d) e) f)

(let ((name1 'x)
      (name2 'y))
   `(a `(b ,,name1 ,',name2 d) e))
     ⇒  (a `(b ,x ,'y d) e)

The notations `template and (quasiquote template) are identical in all respects. ,expression is identical to (unquote expression) and ,@expression is identical to (unquote-splicing expression).

(quasiquote (list (unquote (+ 1 2)) 4))
     ⇒  (list 3 4)

'(quasiquote (list (unquote (+ 1 2)) 4))
     ⇒  `(list ,(+ 1 2) 4)
     i.e., (quasiquote (list (unquote (+ 1 2)) 4))

Unpredictable behavior can result if any of the symbols quasiquote, unquote, or unquote-splicing appear in a template in ways otherwise than as described above.



2.7 Conditionals

The behavior of the conditional expressions is determined by whether objects are true or false. The conditional expressions count only #f as false. They count everything else, including #t, pairs, symbols, numbers, strings, vectors, and procedures as true (but see True and False).

In the descriptions that follow, we say that an object has “a true value” or “is true” when the conditional expressions treat it as true, and we say that an object has “a false value” or “is false” when the conditional expressions treat it as false.

standard special form: if predicate consequent [alternative]

Predicate, consequent, and alternative are expressions. An if expression is evaluated as follows: first, predicate is evaluated. If it yields a true value, then consequent is evaluated and its value is returned. Otherwise alternative is evaluated and its value is returned. If predicate yields a false value and no alternative is specified, then the result of the expression is unspecified.

An if expression evaluates either consequent or alternative, never both. Programs should not depend on the value of an if expression that has no alternative.

(if (> 3 2) 'yes 'no)                   ⇒  yes
(if (> 2 3) 'yes 'no)                   ⇒  no
(if (> 3 2)
    (- 3 2)
    (+ 3 2))                            ⇒  1

standard special form: cond clause clause …

Each clause has this form:

(predicate expression …)

where predicate is any expression. The last clause may be an else clause, which has the form:

(else expression expression …)

A cond expression does the following:

  1. Evaluates the predicate expressions of successive clauses in order, until one of the predicates evaluates to a true value.
  2. When a predicate evaluates to a true value, cond evaluates the expressions in the associated clause in left to right order, and returns the result of evaluating the last expression in the clause as the result of the entire cond expression.

    If the selected clause contains only the predicate and no expressions, cond returns the value of the predicate as the result.

  3. If all predicates evaluate to false values, and there is no else clause, the result of the conditional expression is unspecified; if there is an else clause, cond evaluates its expressions (left to right) and returns the value of the last one.

(cond ((> 3 2) 'greater)
      ((< 3 2) 'less))                  ⇒  greater

(cond ((> 3 3) 'greater)
      ((< 3 3) 'less)
      (else 'equal))                    ⇒  equal

Normally, programs should not depend on the value of a cond expression that has no else clause. However, some Scheme programmers prefer to write cond expressions in which at least one of the predicates is always true. In this style, the final clause is equivalent to an else clause.

Scheme supports an alternative clause syntax:

(predicate => recipient)

where recipient is an expression. If predicate evaluates to a true value, then recipient is evaluated. Its value must be a procedure of one argument; this procedure is then invoked on the value of the predicate.

(cond ((assv 'b '((a 1) (b 2))) => cadr)
      (else #f))                        ⇒  2

standard special form: case key clause clause …

Key may be any expression. Each clause has this form:

((object …) expression expression …)

No object is evaluated, and all the objects must be distinct. The last clause may be an else clause, which has the form:

(else expression expression …)

A case expression does the following:

  1. Evaluates key and compares the result with each object.
  2. If the result of evaluating key is equivalent (in the sense of eqv?; see Equivalence Predicates) to an object, case evaluates the expressions in the corresponding clause from left to right and returns the result of evaluating the last expression in the clause as the result of the case expression.
  3. If the result of evaluating key is different from every object, and if there’s an else clause, case evaluates its expressions and returns the result of the last one as the result of the case expression. If there’s no else clause, case returns an unspecified result. Programs should not depend on the value of a case expression that has no else clause.

For example,

(case (* 2 3)
   ((2 3 5 7) 'prime)
   ((1 4 6 8 9) 'composite))            ⇒  composite

(case (car '(c d))
   ((a) 'a)
   ((b) 'b))                            ⇒  unspecified

(case (car '(c d))
   ((a e i o u) 'vowel)
   ((w y) 'semivowel)
   (else 'consonant))                   ⇒  consonant

standard special form: and expression …

The expressions are evaluated from left to right, and the value of the first expression that evaluates to a false value is returned. Any remaining expressions are not evaluated. If all the expressions evaluate to true values, the value of the last expression is returned. If there are no expressions then #t is returned.

(and (= 2 2) (> 2 1))                   ⇒  #t
(and (= 2 2) (< 2 1))                   ⇒  #f
(and 1 2 'c '(f g))                     ⇒  (f g)
(and)                                   ⇒  #t

standard special form: or expression …

The expressions are evaluated from left to right, and the value of the first expression that evaluates to a true value is returned. Any remaining expressions are not evaluated. If all expressions evaluate to false values, the value of the last expression is returned. If there are no expressions then #f is returned.

(or (= 2 2) (> 2 1))                    ⇒  #t
(or (= 2 2) (< 2 1))                    ⇒  #t
(or #f #f #f)                           ⇒  #f
(or (memq 'b '(a b c)) (/ 3 0))         ⇒  (b c)


2.8 Sequencing

The begin special form is used to evaluate expressions in a particular order.

standard special form: begin expression expression …

The expressions are evaluated sequentially from left to right, and the value of the last expression is returned. This expression type is used to sequence side effects such as input and output.

(define x 0)
(begin (set! x 5)
       (+ x 1))                 ⇒  6

(begin (display "4 plus 1 equals ")
       (display (+ 4 1)))
                                -|  4 plus 1 equals 5
                                ⇒  unspecified

Often the use of begin is unnecessary, because many special forms already support sequences of expressions (that is, they have an implicit begin). Some of these special forms are:

case
cond
define          ;“procedure define” only
do
lambda
let
let*
letrec
letrec*
let-values
let*-values
named-lambda
parameterize
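
As a small sketch of the difference, the body of a procedure define sequences its expressions implicitly, whereas a branch of if holds a single expression and needs an explicit begin. The names counter, bump!, and bump-if! are illustrative only.

```scheme
(define counter 0)

;; The body of a procedure define is an implicit begin: both
;; expressions run in order and the value of the last is returned.
(define (bump!)
  (set! counter (+ counter 1))
  counter)

;; if has no implicit begin, so an explicit begin is needed to
;; sequence two expressions within one branch.
(define (bump-if! flag)
  (if flag
      (begin (set! counter (+ counter 1))
             counter)
      counter))

(bump!)          ; ⇒  1
(bump-if! #t)    ; ⇒  2
(bump-if! #f)    ; ⇒  2
```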

2.9 Iteration

The iteration expressions are “named let” and do. They are also binding expressions, but are more commonly referred to as iteration expressions. Because Scheme is properly tail-recursive, you don’t need to use these special forms to express iteration; you can simply use appropriately written “recursive” procedure calls.

extended standard special form: let name ((variable init) …) expr expr …

MIT/GNU Scheme permits a variant on the syntax of let called “named let” which provides a more general looping construct than do, and may also be used to express recursions.

Named let has the same syntax and semantics as ordinary let except that name is bound within the exprs to a procedure whose formal arguments are the variables and whose body is the exprs. Thus the execution of the exprs may be repeated by invoking the procedure named by name.

MIT/GNU Scheme allows any of the inits to be omitted, in which case the corresponding variables are unassigned.

Note: the following expressions are equivalent:

(let name ((variable init) …)
  expr
  expr …)

((letrec ((name
           (named-lambda (name variable …)
             expr
             expr …)))
   name)
 init …)

Here is an example:

(let loop
     ((numbers '(3 -2 1 6 -5))
      (nonneg '())
      (neg '()))
  (cond ((null? numbers)
         (list nonneg neg))
        ((>= (car numbers) 0)
         (loop (cdr numbers)
               (cons (car numbers) nonneg)
               neg))
        (else
         (loop (cdr numbers)
               nonneg
               (cons (car numbers) neg)))))

     ⇒  ((6 1 3) (-5 -2))
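
A second, smaller sketch shows the common accumulator pattern. The name factorial is illustrative; the call to loop is in tail position, so the iteration does not grow the stack.

```scheme
;; Each call to loop rebinds i and acc, which is how the loop
;; variables are "updated" without assignment.
(define (factorial n)
  (let loop ((i n) (acc 1))
    (if (= i 0)
        acc
        (loop (- i 1) (* acc i)))))

(factorial 5)    ; ⇒  120
```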
extended standard special form: do ((variable init step) …) (test expression …) command …

do is an iteration construct. It specifies a set of variables to be bound, how they are to be initialized at the start, and how they are to be updated on each iteration. When a termination condition is met, the loop exits with a specified result value.

do expressions are evaluated as follows: The init expressions are evaluated (in some unspecified order), the variables are bound to fresh locations, the results of the init expressions are stored in the bindings of the variables, and then the iteration phase begins.

Each iteration begins by evaluating test; if the result is false, then the command expressions are evaluated in order for effect, the step expressions are evaluated in some unspecified order, the variables are bound to fresh locations, the results of the steps are stored in the bindings of the variables, and the next iteration begins.

If test evaluates to a true value, then the expressions are evaluated from left to right and the value of the last expression is returned as the value of the do expression. If no expressions are present, then the value of the do expression is unspecified in standard Scheme; in MIT/GNU Scheme, the value of test is returned.

The region of the binding of a variable consists of the entire do expression except for the inits. It is an error for a variable to appear more than once in the list of do variables.

A step may be omitted, in which case the effect is the same as if (variable init variable) had been written instead of (variable init).

(do ((vec (make-vector 5))
      (i 0 (+ i 1)))
    ((= i 5) vec)
   (vector-set! vec i i))               ⇒  #(0 1 2 3 4)

(let ((x '(1 3 5 7 9)))
   (do ((x x (cdr x))
        (sum 0 (+ sum (car x))))
       ((null? x) sum)))                ⇒  25
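
For comparison, here is a sketch of the same summation written once with do and once with named let. The names sum-do and sum-loop are hypothetical. Note that in the do version both step expressions see the bindings from the previous iteration, so (car x) refers to the element that (cdr x) is about to skip.

```scheme
;; Summation with do: x and sum are rebound on every iteration.
(define (sum-do lst)
  (do ((x lst (cdr x))
       (sum 0 (+ sum (car x))))
      ((null? x) sum)))

;; The same loop with named let: the steps become the arguments
;; of the recursive call.
(define (sum-loop lst)
  (let loop ((x lst) (sum 0))
    (if (null? x)
        sum
        (loop (cdr x) (+ sum (car x))))))

(sum-do '(1 3 5 7 9))      ; ⇒  25
(sum-loop '(1 3 5 7 9))    ; ⇒  25
```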

2.10 Structure Definitions

This section provides examples and describes the options and syntax of define-structure, an MIT/GNU Scheme macro that is very similar to defstruct in Common Lisp. The differences between them are summarized at the end of this section. For more information, see Steele’s Common Lisp book.

special form: define-structure (name structure-option …) slot-description …

Each slot-description takes one of the following forms:

slot-name
(slot-name default-init [slot-option value]*)

The fields name and slot-name must both be symbols. The field default-init is an expression for the initial value of the slot. It is evaluated each time a new instance is constructed. If it is not specified, the initial content of the slot is undefined. Default values are only useful with a BOA constructor that has an argument list or with a keyword constructor (see below).

Evaluation of a define-structure expression defines a structure descriptor and a set of procedures to manipulate instances of the structure. These instances are represented as records by default (see Records) but may alternately be lists or vectors. The accessors and modifiers are marked with compiler declarations so that calls to them are automatically transformed into appropriate references. Often, no options are required, so a simple call to define-structure looks like:

(define-structure foo a b c)

This defines a type descriptor rtd:foo, a constructor make-foo, a predicate foo?, accessors foo-a, foo-b, and foo-c, and modifiers set-foo-a!, set-foo-b!, and set-foo-c!.

In general, if no options are specified, define-structure defines the following (using the simple call above as an example):

type descriptor

The name of the type descriptor is "rtd:" followed by the name of the structure, e.g. ‘rtd:foo’. The type descriptor satisfies the predicate record-type?.

constructor

The name of the constructor is "make-" followed by the name of the structure, e.g. ‘make-foo’. The number of arguments accepted by the constructor is the same as the number of slots; the arguments are the initial values for the slots, and the order of the arguments matches the order of the slot definitions.

predicate

The name of the predicate is the name of the structure followed by "?", e.g. ‘foo?’. The predicate is a procedure of one argument, which returns #t if its argument is a record of the type defined by this structure definition, and #f otherwise.

accessors

For each slot, an accessor is defined. The name of the accessor is formed by appending the name of the structure, a hyphen, and the name of the slot, e.g. ‘foo-a’. The accessor is a procedure of one argument, which must be a record of the type defined by this structure definition. The accessor extracts the contents of the corresponding slot in that record and returns it.

modifiers

For each slot, a modifier is defined. The name of the modifier is formed by appending "set-", the name of the accessor, and "!", e.g. ‘set-foo-a!’. The modifier is a procedure of two arguments, the first of which must be a record of the type defined by this structure definition, and the second of which may be any object. The modifier modifies the contents of the corresponding slot in that record to be that object, and returns an unspecified value.
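
The following sketch exercises the default procedures described above. The structure name ship and its slots are hypothetical, chosen only for illustration.

```scheme
;; With no options, define-structure defines make-ship, ship?,
;; the accessors ship-x and ship-y, and the modifiers
;; set-ship-x! and set-ship-y!.
(define-structure ship x y)

(define s (make-ship 3 4))   ; arguments follow slot order

(ship? s)                    ; ⇒  #t
(ship? 'not-a-ship)          ; ⇒  #f
(ship-x s)                   ; ⇒  3
(set-ship-y! s 10)
(ship-y s)                   ; ⇒  10
```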

When options are not supplied, (name) may be abbreviated to name. This convention holds equally for structure-options and slot-options. Hence, these are equivalent:

(define-structure foo a b c)
(define-structure (foo) (a) b (c))

as are

(define-structure (foo keyword-constructor) a b c)
(define-structure (foo (keyword-constructor)) a b c)

When specified as option values, false and nil are equivalent to #f, and true and t are equivalent to #t.

Possible slot-options are:

slot option: read-only value

When given a value other than #f, this specifies that no modifier should be created for the slot.

slot option: type type-descriptor

This is accepted but not presently used.

Possible structure-options are:

structure option: predicate [name]

This option controls the definition of a predicate procedure for the structure. If name is not given, the predicate is defined with the default name (see above). If name is #f, the predicate is not defined at all. Otherwise, name must be a symbol, and the predicate is defined with that symbol as its name.

structure option: copier [name]

This option controls the definition of a procedure to copy instances of the structure. This is a procedure of one argument, a structure instance, that makes a newly allocated copy of the structure and returns it. If name is not given, the copier is defined, and the name of the copier is "copy-" followed by the structure name (e.g. ‘copy-foo’). If name is #f, the copier is not defined. Otherwise, name must be a symbol, and the copier is defined with that symbol as its name.
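
As a sketch, assuming a hypothetical structure named account, the copier option with no name defines copy-account. The copy is a distinct object, so modifying it leaves the original unchanged.

```scheme
;; copier given without a name: the copier is called copy-account.
(define-structure (account copier) owner balance)

(define a1 (make-account 'alice 100))
(define a2 (copy-account a1))

(set-account-balance! a2 250)

(account-balance a1)         ; ⇒  100
(account-balance a2)         ; ⇒  250
(eq? a1 a2)                  ; ⇒  #f
```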

structure option: print-procedure expression

Evaluating expression must yield a procedure of two arguments, which is used to print instances of the structure. The procedure is a print method (see Custom Output).

structure option: constructor [name [argument-list]]

This option controls the definition of constructor procedures. These constructor procedures are called “BOA constructors”, for “By Order of Arguments”, because the arguments to the constructor specify the initial contents of the structure’s slots by the order in which they are given. This is as opposed to “keyword constructors”, which specify the initial contents using keywords, and in which the order of arguments is irrelevant.

If name is not given, a constructor is defined with the default name and arguments (see above). If name is #f, no constructor is defined; argument-list may not be specified in this case. Otherwise, name must be a symbol, and a constructor is defined with that symbol as its name. If name is a symbol, argument-list is optionally allowed; if it is omitted, the constructor accepts one argument for each slot in the structure definition, in the same order in which the slots appear in the definition. Otherwise, argument-list must be a lambda list (see Lambda Expressions), and each of the parameters of the lambda list must be the name of a slot in the structure. The arguments accepted by the constructor are defined by this lambda list. Any slot that is not specified by the lambda list is initialized to the default-init as specified above; likewise for any slot specified as an optional parameter when the corresponding argument is not supplied.

If the constructor option is specified, the default constructor is not defined. Additionally, the constructor option may be specified multiple times to define multiple constructors with different names and argument lists.

(define-structure (foo
                   (constructor make-foo (#!optional a b)))
  (a 6 read-only #t)
  (b 9))
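
The definition above can be exercised as in the following sketch, which repeats the definition so that it stands alone. Slots whose optional arguments are not supplied take their default-init values.

```scheme
(define-structure (foo
                   (constructor make-foo (#!optional a b)))
  (a 6 read-only #t)
  (b 9))

(foo-a (make-foo))           ; ⇒  6    (a takes its default)
(foo-b (make-foo 1))         ; ⇒  9    (b takes its default)
(foo-b (make-foo 1 2))       ; ⇒  2

;; Slot a is read-only, so no modifier is created for it;
;; only set-foo-b! is defined.
(define f (make-foo))
(set-foo-b! f 42)
(foo-b f)                    ; ⇒  42
```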
structure option: keyword-constructor [name]

This option controls the definition of keyword constructor procedures. A keyword constructor is a procedure that accepts arguments that are alternating slot names and values. If name is omitted, a keyword constructor is defined, and the name of the constructor is "make-" followed by the name of the structure (e.g. ‘make-foo’). Otherwise, name must be a symbol, and a keyword constructor is defined with this symbol as its name.

If the keyword-constructor option is specified, the default constructor is not defined. Additionally, the keyword-constructor option may be specified multiple times to define multiple keyword constructors; this is usually not done since such constructors would all be equivalent.

(define-structure (foo (keyword-constructor make-bar)) a b)
(foo-a (make-bar 'b 20 'a 19))         ⇒ 19
structure option: type-descriptor name

This option cannot be used with the type or named options.

By default, structures are implemented as records. The name of the structure is defined to hold the type descriptor of the record defined by the structure. The type-descriptor option specifies a different name to hold the type descriptor.

(define-structure foo a b)
foo             ⇒ #[record-type 18]

(define-structure (bar (type-descriptor <bar>)) a b)
bar             error→ Unbound variable: bar
<bar>         ⇒ #[record-type 19]
structure option: conc-name [name]

By default, the prefix for naming accessors and modifiers is the name of the structure followed by a hyphen. The conc-name option can be used to specify an alternative. If name is not given, the prefix is the name of the structure followed by a hyphen (the default). If name is #f, the slot names are used directly, without prefix. Otherwise, name must be a symbol, and that symbol is used as the prefix.

(define-structure (foo (conc-name moby/)) a b)

defines accessors moby/a and moby/b, and modifiers set-moby/a! and set-moby/b!.

(define-structure (foo (conc-name #f)) a b)

defines accessors a and b, and modifiers set-a! and set-b!.

structure option: type representation-type

This option cannot be used with the type-descriptor option.

By default, structures are implemented as records. The type option overrides this default, allowing the programmer to specify that the structure be implemented using another data type. The option value representation-type specifies the alternate data type; it is allowed to be one of the symbols vector or list, and the data type used is the one corresponding to the symbol.

If this option is given, and the named option is not specified, the representation will not be tagged, and neither a predicate nor a type descriptor will be defined; also, the print-procedure option may not be given.

(define-structure (foo (type list)) a b) 
(make-foo 1 2)                          ⇒ (1 2)
structure option: named [expression]

This is valid only in conjunction with the type option and specifies that the structure instances be tagged to make them identifiable as instances of this structure type. This option cannot be used with the type-descriptor option.

In the usual case, where expression is not given, the named option causes a type descriptor and predicate to be defined for the structure (recall that the type option without named suppresses their definition), and also defines a default print method for the structure instances (which can be overridden by the print-procedure option). If the default print method is not wanted then the print-procedure option should be specified as #f. This causes the structure to be printed in its native representation, as a list or vector, which includes the type descriptor. The type descriptor is a unique object, not a record type, that describes the structure instances and is additionally stored in the structure instances to identify them: if the representation type is vector, the type descriptor is stored in the zero-th slot of the vector, and if the representation type is list, it is stored as the first element of the list.

(define-structure (foo (type vector) named) a b c)
(vector-ref (make-foo 1 2 3) 0) ⇒ #[structure-type 52]

If expression is specified, it is an expression that is evaluated to yield a tag object. The expression is evaluated once when the structure definition is evaluated (to specify the print method), and again whenever a predicate or constructor is called. Because of this, expression is normally a variable reference or a constant. The value yielded by expression may be any object at all. That object is stored in the structure instances in the same place that the type descriptor is normally stored, as described above. If expression is specified, no type descriptor is defined, only a predicate.

(define-structure (foo (type vector) (named 'foo)) a b c)
(vector-ref (make-foo 1 2 3) 0) ⇒ foo
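
The same technique applies to the list representation. In this sketch the structure name tagged-pair and its slots are hypothetical; the tag is a quoted symbol, stored as the first element of each instance, and the predicate relies on it to recognize instances.

```scheme
;; Instances are lists whose first element is the symbol tagged-pair.
(define-structure (tagged-pair (type list) (named 'tagged-pair))
  left right)

(define tp (make-tagged-pair 1 2))

(car tp)                     ; ⇒  tagged-pair
(tagged-pair-left tp)        ; ⇒  1
(tagged-pair-right tp)       ; ⇒  2
(tagged-pair? tp)            ; ⇒  #t
(tagged-pair? '(1 2))        ; ⇒  #f
```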
structure option: safe-accessors [boolean]

This option allows the programmer to have some control over the safety of the slot accessors (and modifiers) generated by define-structure. If safe-accessors is not specified, or if boolean is #f, then the accessors are optimized for speed at the expense of safety; when compiled, the accessors will turn into very fast inline sequences, usually one to three machine instructions in length. However, if safe-accessors is specified and boolean is either omitted or #t, then the accessors are optimized for safety, will check the type and structure of their argument, and will be close-coded.

(define-structure (foo safe-accessors) a b c)
structure option: initial-offset offset

This is valid only in conjunction with the type option. Offset must be an exact non-negative integer and specifies the number of slots to leave open at the beginning of the structure instance before the specified slots are allocated. Specifying an offset of zero is equivalent to omitting the initial-offset option.

If the named option is specified, the structure tag appears in the first slot, followed by the “offset” slots, and then the regular slots. Otherwise, the “offset” slots come first, followed by the regular slots.

(define-structure (foo (type vector) (initial-offset 3))
  a b c)
(make-foo 1 2 3)                ⇒ #(() () () 1 2 3)

The essential differences between MIT/GNU Scheme’s define-structure and Common Lisp’s defstruct are:


2.11 Macros

(This section is largely taken from the Revised^4 Report on the Algorithmic Language Scheme. The section on Syntactic Closures is derived from a document written by Chris Hanson. The section on Explicit Renaming is derived from a document written by William Clinger.)

Scheme programs can define and use new derived expression types, called macros. Program-defined expression types have the syntax

(keyword datum …)

where keyword is an identifier that uniquely determines the expression type. This identifier is called the syntactic keyword, or simply keyword, of the macro. The number of the datums, and their syntax, depend on the expression type.

Each instance of a macro is called a use of the macro. The set of rules that specifies how a use of a macro is transcribed into a more primitive expression is called the transformer of the macro.

MIT/GNU Scheme also supports anonymous syntactic keywords. This means that it’s not necessary to bind a macro transformer to a syntactic keyword before it is used. Instead, any macro-transformer expression can appear as the first element of a form, and the form will be expanded by the transformer.

The macro definition facility consists of these parts:

The syntactic keyword of a macro may shadow variable bindings, and local variable bindings may shadow keyword bindings. All macros defined using the pattern language are “hygienic” and “referentially transparent” and thus preserve Scheme’s lexical scoping:


2.11.1 Binding Constructs for Syntactic Keywords

let-syntax, letrec-syntax, let*-syntax and define-syntax are analogous to let, letrec, let* and define, but they bind syntactic keywords to macro transformers instead of binding variables to locations that contain values.

Any argument named transformer-spec must be a macro-transformer expression, which is one of the following:

standard special form: let-syntax bindings expression expression …

Bindings should have the form

((keyword transformer-spec) …)

Each keyword is an identifier, each transformer-spec is a macro-transformer expression, and the body is a sequence of one or more expressions. It is an error for a keyword to appear more than once in the list of keywords being bound.

The expressions are expanded in the syntactic environment obtained by extending the syntactic environment of the let-syntax expression with macros whose keywords are the keywords, bound to the specified transformers. Each binding of a keyword has the expressions as its region.

(let-syntax ((when (syntax-rules ()
                     ((when test stmt1 stmt2 ...)
                      (if test
                          (begin stmt1
                                 stmt2 ...))))))
  (let ((if #t))
    (when if (set! if 'now))
    if))                           ⇒  now

(let ((x 'outer))
  (let-syntax ((m (syntax-rules () ((m) x))))
    (let ((x 'inner))
      (m))))                       ⇒  outer
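
As a further sketch, a macro bound by let-syntax is visible only within the body of the let-syntax form. Here swap! is a hypothetical local macro and swap-demo is an illustrative name. Because the transformer is hygienic, the temporary tmp introduced by the template does not collide with the user's variable of the same name.

```scheme
(define (swap-demo)
  (let-syntax ((swap! (syntax-rules ()
                        ((swap! x y)
                         (let ((tmp x))
                           (set! x y)
                           (set! y tmp))))))
    ;; The user's tmp is distinct from the tmp in the template.
    (let ((tmp 1)
          (other 2))
      (swap! tmp other)
      (list tmp other))))

(swap-demo)                  ; ⇒  (2 1)
```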
standard special form: letrec-syntax bindings expression expression …

The syntax of letrec-syntax is the same as for let-syntax.

The expressions are expanded in the syntactic environment obtained by extending the syntactic environment of the letrec-syntax expression with macros whose keywords are the keywords, bound to the specified transformers. Each binding of a keyword has the bindings as well as the expressions within its region, so the transformers can transcribe expressions into uses of the macros introduced by the letrec-syntax expression.

(letrec-syntax
  ((my-or (syntax-rules ()
            ((my-or) #f)
            ((my-or e) e)
            ((my-or e1 e2 ...)
             (let ((temp e1))
               (if temp
                   temp
                   (my-or e2 ...)))))))
  (let ((x #f)
        (y 7)
        (temp 8)
        (let odd?)
        (if even?))
    (my-or x
           (let temp)
           (if y)
           y)))        ⇒  7
standard special form: let*-syntax bindings expression expression …

The syntax of let*-syntax is the same as for let-syntax.

The expressions are expanded in the syntactic environment obtained by extending the syntactic environment of the let*-syntax expression with macros whose keywords are the keywords, bound to the specified transformers. Each binding of a keyword has the subsequent bindings as well as the expressions within its region. Thus

(let*-syntax
   ((a (syntax-rules …))
    (b (syntax-rules …)))
  …)

is equivalent to

(let-syntax ((a (syntax-rules …)))
  (let-syntax ((b (syntax-rules …)))
    …))
standard special form: define-syntax keyword transformer-spec

Keyword is an identifier, and transformer-spec is a macro-transformer expression. The syntactic environment is extended by binding the keyword to the specified transformer.

The region of the binding introduced by define-syntax is the entire block in which it appears. However, the keyword may only be used after it has been defined.

MIT/GNU Scheme permits define-syntax to appear both at top level and within lambda bodies. The Revised^4 Report permits only top-level uses of define-syntax.
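
The following sketch shows both placements. The names my-unless, safe-div, and nonzero-or are hypothetical and exist only for this illustration.

```scheme
;; A top-level syntax definition.
(define-syntax my-unless
  (syntax-rules ()
    ((my-unless test body1 body2 ...)
     (if test #f (begin body1 body2 ...)))))

;; A syntax definition inside a procedure body; nonzero-or is
;; visible only within safe-div.
(define (safe-div a b)
  (define-syntax nonzero-or
    (syntax-rules ()
      ((nonzero-or x default)
       (if (= x 0) default x))))
  (/ a (nonzero-or b 1)))

(my-unless (= 1 2) 'ran)     ; ⇒  ran
(my-unless (= 1 1) 'ran)     ; ⇒  #f
(safe-div 6 3)               ; ⇒  2
(safe-div 6 0)               ; ⇒  6
```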

When compiling a program, a top-level instance of define-syntax both defines the syntactic keyword and generates code that will redefine the keyword when the program is loaded. This means that the same syntax can be used for defining macros that will be used during compilation and for defining macros to be used at run time.

Although macros may expand into definitions and syntax definitions in any context that permits them, it is an error for a definition or syntax definition to shadow a syntactic keyword whose meaning is needed to determine whether some form in the group of forms that contains the shadowing definition is in fact a definition, or, for internal definitions, is needed to determine the boundary between the group and the expressions that follow the group. For example, the following are errors:

(define define 3)

(begin (define begin list))

(let-syntax
  ((foo (syntax-rules ()
          ((foo (proc args ...) body ...)
           (define proc
             (lambda (args ...)
               body ...))))))
  (let ((x 3))
    (foo (plus x y) (+ x y))
    (define foo x)
    (plus foo x)))

2.11.2 Pattern Language

MIT/GNU Scheme supports a high-level pattern language for specifying macro transformers. This pattern language is defined by the Revised^4 Report and is portable to other conforming Scheme implementations. To use the pattern language, specify a transformer-spec as a syntax-rules form:

standard special form: syntax-rules [ellipsis] literals syntax-rule …

Ellipsis is an identifier; if omitted, it defaults to ‘...’. Literals is a list of identifiers, and each syntax-rule should be of the form

(pattern template)

The pattern in a syntax-rule is a list pattern that begins with the keyword for the macro.

A pattern is either an identifier, a constant, or one of the following

(pattern …)
(pattern pattern … . pattern)
(pattern … pattern ellipsis)

and a template is either an identifier, a constant, or one of the following

(element …)
(element element … . template)

where an element is a template optionally followed by an ellipsis and an ellipsis is the identifier ‘...’ (which cannot be used as an identifier in either a template or a pattern).

An instance of syntax-rules produces a new macro transformer by specifying a sequence of hygienic rewrite rules. A use of a macro whose keyword is associated with a transformer specified by syntax-rules is matched against the patterns contained in the syntax-rules, beginning with the leftmost syntax-rule. When a match is found, the macro use is transcribed hygienically according to the template.

An identifier that appears in the pattern of a syntax-rule is a pattern-variable, unless it is the keyword that begins the pattern, is listed in literals, or is the identifier ‘...’. Pattern variables match arbitrary input elements and are used to refer to elements of the input in the template. It is an error for the same pattern variable to appear more than once in a pattern.

The keyword at the beginning of the pattern in a syntax-rule is not involved in the matching and is not considered a pattern variable or literal identifier.

Identifiers that appear in literals are interpreted as literal identifiers to be matched against corresponding subforms of the input. A subform in the input matches a literal identifier if and only if it is an identifier and either both its occurrence in the macro expression and its occurrence in the macro definition have the same lexical binding, or the two identifiers are equal and both have no lexical binding.

A subpattern followed by ‘...’ can match zero or more elements of the input. It is an error for ‘...’ to appear in literals. Within a pattern the identifier ‘...’ must follow the last element of a nonempty sequence of subpatterns.

More formally, an input form F matches a pattern P if and only if:

It is an error to use a macro keyword, within the scope of its binding, in an expression that does not match any of the patterns.

When a macro use is transcribed according to the template of the matching syntax rule, pattern variables that occur in the template are replaced by the subforms they match in the input. Pattern variables that occur in subpatterns followed by one or more instances of the identifier ‘...’ are allowed only in subtemplates that are followed by as many instances of ‘...’. They are replaced in the output by all of the subforms they match in the input, distributed as indicated. It is an error if the output cannot be built up as specified.

Identifiers that appear in the template but are not pattern variables or the identifier ‘...’ are inserted into the output as literal identifiers. If a literal identifier is inserted as a free identifier then it refers to the binding of that identifier within whose scope the instance of syntax-rules appears. If a literal identifier is inserted as a bound identifier then it is in effect renamed to prevent inadvertent captures of free identifiers.

(let ((=> #f))
  (cond (#t => 'ok)))           ⇒ ok

The macro transformer for cond recognizes => as a local variable, and hence an expression, and not as the top-level identifier =>, which the macro transformer treats as a syntactic keyword. Thus the example expands into

(let ((=> #f))
  (if #t (begin => 'ok)))

instead of

(let ((=> #f))
  (let ((temp #t))
    (if temp 
        ('ok temp))))

which would result in an invalid procedure call.
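
The following sketch illustrates two features of the pattern language described above: a subpattern followed by ‘...’ and a literal identifier. The macro names my-let and for-each-in, and the literal in, are hypothetical and chosen only for this example.

```scheme
;; (name val) ... matches zero or more bindings; in the template,
;; name ... and val ... distribute the matched subforms.
(define-syntax my-let
  (syntax-rules ()
    ((my-let ((name val) ...) body1 body2 ...)
     ((lambda (name ...) body1 body2 ...) val ...))))

;; in is listed in literals, so it must appear literally in a
;; use of the macro and is not a pattern variable.
(define-syntax for-each-in
  (syntax-rules (in)
    ((for-each-in x in lst body1 body2 ...)
     (for-each (lambda (x) body1 body2 ...) lst))))

(my-let ((a 1) (b 2)) (+ a b))     ; ⇒  3

(define total 0)
(for-each-in n in '(1 2 3 4)
  (set! total (+ total n)))
total                              ; ⇒  10
```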


2.11.3 Syntactic Closures

MIT/GNU Scheme’s syntax-transformation engine is an implementation of syntactic closures, a mechanism invented by Alan Bawden and Jonathan Rees. The main features of the syntactic-closures mechanism are its simplicity and its close relationship to the environment models commonly used with Scheme. Using the mechanism to write macro transformers is somewhat cumbersome and can be confusing for the newly initiated, but it is easily mastered.


2.11.3.1 Syntax Terminology

This section defines the concepts and data types used by the syntactic closures facility.


2.11.3.2 Transformer Definition

This section describes the special forms for defining syntactic-closures macro transformers, and the associated procedures for manipulating syntactic closures and syntactic environments.

special form: sc-macro-transformer expression

The expression is expanded in the syntactic environment of the sc-macro-transformer expression, and the expanded expression is evaluated in the transformer environment to yield a macro transformer as described below. This macro transformer is bound to a macro keyword by the special form in which the transformer expression appears (for example, let-syntax).

In the syntactic closures facility, a macro transformer is a procedure that takes two arguments, a form and a syntactic environment, and returns a new form. The first argument, the input form, is the form in which the macro keyword occurred. The second argument, the usage environment, is the syntactic environment in which the input form occurred. The result of the transformer, the output form, is automatically closed in the transformer environment, which is the syntactic environment in which the transformer expression occurred.

For example, here is a definition of a push macro using syntax-rules:

(define-syntax push
  (syntax-rules ()
    ((push item list)
     (set! list (cons item list)))))

Here is an equivalent definition using sc-macro-transformer:

(define-syntax push
  (sc-macro-transformer
   (lambda (exp env)
     (let ((item (make-syntactic-closure env '() (cadr exp)))
           (list (make-syntactic-closure env '() (caddr exp))))
       `(set! ,list (cons ,item ,list))))))

In this example, the identifiers set! and cons are closed in the transformer environment, and thus will not be affected by the meanings of those identifiers in the usage environment env.
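
The following sketch makes that point concrete. It repeats the definition so that it stands alone, then uses push where cons has been rebound locally. The name push-demo is illustrative only.

```scheme
(define-syntax push
  (sc-macro-transformer
   (lambda (exp env)
     (let ((item (make-syntactic-closure env '() (cadr exp)))
           (list (make-syntactic-closure env '() (caddr exp))))
       `(set! ,list (cons ,item ,list))))))

(define (push-demo)
  (let ((stack '())
        (cons list))       ; rebinds cons in the usage environment
    ;; The cons introduced by push is closed in the transformer
    ;; environment, so the local rebinding does not affect it.
    (push 1 stack)
    (push 2 stack)
    stack))

(push-demo)                  ; ⇒  (2 1)
```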

Some macros may be non-hygienic by design. For example, the following defines a loop macro that implicitly binds exit to an escape procedure. The binding of exit is intended to capture free references to exit in the body of the loop, so exit must be left free when the body is closed:

(define-syntax loop
  (sc-macro-transformer
   (lambda (exp env)
     (let ((body (cdr exp)))
       `(call-with-current-continuation
         (lambda (exit)
           (let f ()
             ,@(map (lambda (exp)
                      (make-syntactic-closure env '(exit)
                        exp))
                    body)
             (f))))))))
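
A usage sketch follows, repeating the definition so that it stands alone. The procedure count-to is a hypothetical name. The reference to exit in the loop body is left free by the transformer and is therefore captured by the binding that the macro introduces.

```scheme
(define-syntax loop
  (sc-macro-transformer
   (lambda (exp env)
     (let ((body (cdr exp)))
       `(call-with-current-continuation
         (lambda (exit)
           (let f ()
             ,@(map (lambda (exp)
                      (make-syntactic-closure env '(exit)
                        exp))
                    body)
             (f))))))))

;; Counts upward from 0 and escapes from the loop with the
;; final count once it reaches n.
(define (count-to n)
  (let ((i 0))
    (loop
      (if (= i n) (exit i))
      (set! i (+ i 1)))))

(count-to 5)                 ; ⇒  5
```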
special form: rsc-macro-transformer expression

This form is an alternative way to define a syntactic-closures macro transformer. Its syntax and usage are identical to sc-macro-transformer, except that the roles of the usage environment and transformer environment are reversed. (Hence RSC stands for Reversed Syntactic Closures.) In other words, the procedure specified by expression still accepts two arguments, but its second argument will be the transformer environment rather than the usage environment, and the returned expression is closed in the usage environment rather than the transformer environment.

The advantage of this arrangement is that it allows a simpler definition style in some situations. For example, here is the push macro from above, rewritten in this style:

(define-syntax push
  (rsc-macro-transformer
   (lambda (exp env)
     `(,(make-syntactic-closure env '() 'set!)
       ,(caddr exp)
       (,(make-syntactic-closure env '() 'cons)
        ,(cadr exp)
        ,(caddr exp))))))

In this style only the introduced keywords are closed, while everything else remains open.

Note that rsc-macro-transformer and sc-macro-transformer are easily interchangeable. Here is how to emulate rsc-macro-transformer using sc-macro-transformer. (This technique can be used to effect the opposite emulation as well.)

(define-syntax push
  (sc-macro-transformer
   (lambda (exp usage-env)
     (capture-syntactic-environment
      (lambda (env)
        (make-syntactic-closure usage-env '()
          `(,(make-syntactic-closure env '() 'set!)
            ,(caddr exp)
            (,(make-syntactic-closure env '() 'cons)
             ,(cadr exp)
             ,(caddr exp)))))))))

To assign meanings to the identifiers in a form, use make-syntactic-closure to close the form in a syntactic environment.

procedure: make-syntactic-closure environment free-names form

Environment must be a syntactic environment, free-names must be a list of identifiers, and form must be a form. make-syntactic-closure constructs and returns a syntactic closure of form in environment, which can be used anywhere that form could have been used. All the identifiers used in form, except those explicitly excepted by free-names, obtain their meanings from environment.

Here is an example where free-names is something other than the empty list. It is instructive to compare the use of free-names in this example with its use in the loop example above: the examples are similar except for the source of the identifier being left free.

(define-syntax let1
  (sc-macro-transformer
   (lambda (exp env)
     (let ((id (cadr exp))
           (init (caddr exp))
           (exp (cadddr exp)))
       `((lambda (,id)
           ,(make-syntactic-closure env (list id) exp))
         ,(make-syntactic-closure env '() init))))))

let1 is a simplified version of let that only binds a single identifier, and whose body consists of a single expression. When the body expression is syntactically closed in its original syntactic environment, the identifier that is to be bound by let1 must be left free, so that it can be properly captured by the lambda in the output form.
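
The effect of leaving the bound identifier free can be seen in a short usage sketch. The definition is repeated so that the block is self-contained (with the body variable renamed to body for readability); the sample calls are illustrative.

```scheme
(define-syntax let1
  (sc-macro-transformer
   (lambda (exp env)
     (let ((id (cadr exp))
           (init (caddr exp))
           (body (cadddr exp)))
       `((lambda (,id)
           ;; id is left free so that the lambda above captures it.
           ,(make-syntactic-closure env (list id) body))
         ;; init is fully closed, so it sees only outer bindings.
         ,(make-syntactic-closure env '() init))))))

(let1 x 5 (* x x))                      ; ⇒ 25

;; The init expression refers to the outer x, the body to the new x.
(let ((x 10))
  (let1 x (+ x 1) (* x 2)))             ; ⇒ 22
```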

In most situations, the free-names argument to make-syntactic-closure is the empty list. In those cases, the more succinct close-syntax can be used:

procedure: close-syntax form environment

Environment must be a syntactic environment and form must be a form. Returns a new syntactic closure of form in environment, with no free names. Entirely equivalent to

(make-syntactic-closure environment '() form)
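
A minimal sketch of close-syntax in use follows. The macro name my-unless is hypothetical and chosen to avoid clashing with any existing binding; it closes each user-supplied subform in the usage environment and relies on sc-macro-transformer to close the introduced if and begin in the transformer environment.

```scheme
(define-syntax my-unless
  (sc-macro-transformer
   (lambda (exp env)
     (let ((test (close-syntax (cadr exp) env))
           (body (map (lambda (form) (close-syntax form env))
                      (cddr exp))))
       `(if ,test #f (begin ,@body))))))

(my-unless (> 1 2) 'ran)                ; ⇒ ran

;; A local variable named if in the usage environment is not seen by
;; the if that the macro introduces.
(let ((if list))
  (my-unless #f 'still-ran))            ; ⇒ still-ran
```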

To obtain a syntactic environment other than the usage environment, use capture-syntactic-environment.

procedure: capture-syntactic-environment procedure

capture-syntactic-environment returns a form that will, when transformed, call procedure on the current syntactic environment. Procedure should compute and return a new form to be transformed, in that same syntactic environment, in place of the form.

An example will make this clear. Suppose we wanted to define a simple loop-until keyword equivalent to

(define-syntax loop-until
  (syntax-rules ()
    ((loop-until id init test return step)
     (letrec ((loop
               (lambda (id)
                 (if test return (loop step)))))
       (loop init)))))

The following attempt at defining loop-until has a subtle bug:

(define-syntax loop-until
  (sc-macro-transformer
   (lambda (exp env)
     (let ((id (cadr exp))
           (init (caddr exp))
           (test (cadddr exp))
           (return (cadddr (cdr exp)))
           (step (cadddr (cddr exp)))
           (close
            (lambda (exp free)
              (make-syntactic-closure env free exp))))
       `(letrec ((loop
                  (lambda (,id)
                    (if ,(close test (list id))
                        ,(close return (list id))
                        (loop ,(close step (list id)))))))
          (loop ,(close init '())))))))

This definition appears to take all of the proper precautions to prevent unintended captures. It carefully closes the subexpressions in their original syntactic environment and it leaves the id identifier free in the test, return, and step expressions, so that it will be captured by the binding introduced by the lambda expression. Unfortunately it uses the identifiers if and loop within that lambda expression, so if the user of loop-until just happens to use, say, if for the identifier, it will be inadvertently captured.

The syntactic environment that if and loop want to be exposed to is the one just outside the lambda expression: before the user’s identifier is added to the syntactic environment, but after the identifier loop has been added. capture-syntactic-environment captures exactly that environment as follows:

(define-syntax loop-until
  (sc-macro-transformer
   (lambda (exp env)
     (let ((id (cadr exp))
           (init (caddr exp))
           (test (cadddr exp))
           (return (cadddr (cdr exp)))
           (step (cadddr (cddr exp)))
           (close
            (lambda (exp free)
              (make-syntactic-closure env free exp))))
       `(letrec ((loop
                  ,(capture-syntactic-environment
                    (lambda (env)
                      `(lambda (,id)
                         (,(make-syntactic-closure env '() 'if)
                          ,(close test (list id))
                          ,(close return (list id))
                          (,(make-syntactic-closure env '() 'loop)
                           ,(close step (list id)))))))))
          (loop ,(close init '())))))))

In this case, having captured the desired syntactic environment, it is convenient to construct syntactic closures of the identifiers if and loop and use them in the body of the lambda.

A common use of capture-syntactic-environment is to get the transformer environment of a macro transformer:

(sc-macro-transformer
 (lambda (exp env)
   (capture-syntactic-environment
    (lambda (transformer-env)
      …))))


2.11.3.3 Identifiers

This section describes the procedures that create and manipulate identifiers. The identifier data type extends the syntactic closures facility to be compatible with the high-level syntax-rules facility.

As discussed earlier, an identifier is either a symbol or an alias. An alias is implemented as a syntactic closure whose form is an identifier:

(make-syntactic-closure env '() 'a) ⇒ an alias

Aliases are implemented as syntactic closures because they behave just like syntactic closures most of the time. The difference is that an alias may be bound to a new value (for example by lambda or let-syntax); other syntactic closures may not be used this way. If an alias is bound, then within the scope of that binding it is looked up in the syntactic environment just like any other identifier.

Aliases are used in the implementation of the high-level facility syntax-rules. A macro transformer created by syntax-rules uses a template to generate its output form, substituting subforms of the input form into the template. In a syntactic closures implementation, all of the symbols in the template are replaced by aliases closed in the transformer environment, while the output form itself is closed in the usage environment. This guarantees that the macro transformation is hygienic, without requiring the transformer to know the syntactic roles of the substituted input subforms.

procedure: identifier? object

Returns #t if object is an identifier, otherwise returns #f. Examples:

(identifier? 'a)        ⇒ #t
(identifier? (make-syntactic-closure env '() 'a))
                        ⇒ #t

(identifier? "a")       ⇒ #f
(identifier? #\a)       ⇒ #f
(identifier? 97)        ⇒ #f
(identifier? #f)        ⇒ #f
(identifier? '(a))      ⇒ #f
(identifier? '#(a))     ⇒ #f

The predicate eq? is used to determine if two identifiers are “the same”. Thus eq? can be used to compare identifiers exactly as it would be used to compare symbols. Often, though, it is useful to know whether two identifiers “mean the same thing”. For example, the cond macro uses the symbol else to identify the final clause in the conditional. A macro transformer for cond cannot just look for the symbol else, because the cond form might be the output of another macro transformer that replaced the symbol else with an alias. Instead the transformer must look for an identifier that “means the same thing” in the usage environment as the symbol else means in the transformer environment.

procedure: identifier=? environment1 identifier1 environment2 identifier2

Environment1 and environment2 must be syntactic environments, and identifier1 and identifier2 must be identifiers. identifier=? returns #t if the meaning of identifier1 in environment1 is the same as that of identifier2 in environment2, otherwise it returns #f. Examples:

(let-syntax
    ((foo
      (sc-macro-transformer
       (lambda (form env)
         (capture-syntactic-environment
          (lambda (transformer-env)
            (identifier=? transformer-env 'x env 'x)))))))
  (list (foo)
        (let ((x 3))
          (foo))))
                        ⇒ (#t #f)

(let-syntax ((bar foo))
  (let-syntax
      ((foo
        (sc-macro-transformer
         (lambda (form env)
           (capture-syntactic-environment
            (lambda (transformer-env)
              (identifier=? transformer-env 'foo
                            env (cadr form))))))))
    (list (foo foo)
          (foo bar))))
                        ⇒ (#f #t)

Sometimes it is useful to be able to introduce a new identifier that is guaranteed to be different from any existing identifier, similarly to the way that generate-uninterned-symbol is used.

procedure: make-synthetic-identifier identifier

Creates and returns a new synthetic identifier (alias) that is guaranteed to be different from all existing identifiers. Identifier is any existing identifier, which is used in deriving the name of the new identifier.



2.11.4 Explicit Renaming

Explicit renaming is an alternative facility for defining macro transformers. In the MIT/GNU Scheme implementation, explicit-renaming transformers are implemented as an abstraction layer on top of syntactic closures. An explicit-renaming macro transformer is defined by an instance of the er-macro-transformer keyword:

special form: er-macro-transformer expression

The expression is expanded in the syntactic environment of the er-macro-transformer expression, and the expanded expression is evaluated in the transformer environment to yield a macro transformer as described below. This macro transformer is bound to a macro keyword by the special form in which the transformer expression appears (for example, let-syntax).

In the explicit-renaming facility, a macro transformer is a procedure that takes three arguments, a form, a renaming procedure, and a comparison predicate, and returns a new form. The first argument, the input form, is the form in which the macro keyword occurred.

The second argument to a transformation procedure is a renaming procedure that takes the representation of an identifier as its argument and returns the representation of a fresh identifier that occurs nowhere else in the program. For example, the transformation procedure for a simplified version of the let macro might be written as

(lambda (exp rename compare)
  (let ((vars (map car (cadr exp)))
        (inits (map cadr (cadr exp)))
        (body (cddr exp)))
    `((lambda ,vars ,@body)
      ,@inits)))

This would not be hygienic, however. A hygienic let macro must rename the identifier lambda to protect it from being captured by a local binding. The renaming effectively creates a fresh alias for lambda, one that cannot be captured by any subsequent binding:

(lambda (exp rename compare)
  (let ((vars (map car (cadr exp)))
        (inits (map cadr (cadr exp)))
        (body (cddr exp)))
    `((,(rename 'lambda) ,vars ,@body)
      ,@inits)))

The expression returned by the transformation procedure will be expanded in the syntactic environment obtained from the syntactic environment of the macro application by binding any fresh identifiers generated by the renaming procedure to the denotations of the original identifiers in the syntactic environment in which the macro was defined. This means that a renamed identifier will denote the same thing as the original identifier unless the transformation procedure that renamed the identifier placed an occurrence of it in a binding position.

The renaming procedure acts as a mathematical function in the sense that the identifiers obtained from any two calls with the same argument will be the same in the sense of eqv?. It is an error if the renaming procedure is called after the transformation procedure has returned.
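
To see the renaming at work, the hygienic transformation procedure can be wrapped in er-macro-transformer and bound to a keyword. This is a sketch: the name my-let is hypothetical, and the sample calls are illustrative.

```scheme
(define-syntax my-let
  (er-macro-transformer
   (lambda (exp rename compare)
     (let ((vars (map car (cadr exp)))
           (inits (map cadr (cadr exp)))
           (body (cddr exp)))
       ;; The renamed lambda denotes what lambda meant where my-let
       ;; was defined, regardless of bindings at the use site.
       `((,(rename 'lambda) ,vars ,@body)
         ,@inits)))))

(my-let ((x 1) (y 2)) (+ x y))          ; ⇒ 3

;; Shadowing lambda at the use site does not disturb the expansion.
(let ((lambda list))
  (my-let ((x 1)) x))                   ; ⇒ 1
```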

The third argument to a transformation procedure is a comparison predicate that takes the representations of two identifiers as its arguments and returns true if and only if they denote the same thing in the syntactic environment that will be used to expand the transformed macro application. For example, the transformation procedure for a simplified version of the cond macro can be written as

(lambda (exp rename compare)
  (let ((clauses (cdr exp)))
    (if (null? clauses)
        `(,(rename 'quote) unspecified)
        (let* ((first (car clauses))
               (rest (cdr clauses))
               (test (car first)))
          (cond ((and (identifier? test)
                      (compare test (rename 'else)))
                 `(,(rename 'begin) ,@(cdr first)))
                (else `(,(rename 'if)
                        ,test
                        (,(rename 'begin) ,@(cdr first))
                        (cond ,@rest))))))))

In this example the identifier else is renamed before being passed to the comparison predicate, so the comparison will be true if and only if the test expression is an identifier that denotes the same thing in the syntactic environment of the expression being transformed as else denotes in the syntactic environment in which the cond macro was defined. If else were not renamed before being passed to the comparison predicate, then it would match a local variable that happened to be named else, and the macro would not be hygienic.
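
The behavior of the comparison predicate can be isolated in a very small macro. This is a sketch under the assumption that a transformer may return a boolean constant as its output form; the keyword name else-keyword? is hypothetical.

```scheme
(define-syntax else-keyword?
  (er-macro-transformer
   (lambda (exp rename compare)
     ;; Expands to #t or #f depending on whether the operand denotes
     ;; the same thing as else does where this macro was defined.
     (compare (cadr exp) (rename 'else)))))

(else-keyword? else)                    ; ⇒ #t
(else-keyword? otherwise)               ; ⇒ #f

;; A local variable named else is a different binding, so it does
;; not match.
(let ((else 'shadowed))
  (else-keyword? else))                 ; ⇒ #f
```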

Some macros are non-hygienic by design. For example, the following defines a loop macro that implicitly binds exit to an escape procedure. The binding of exit is intended to capture free references to exit in the body of the loop, so exit is not renamed.

(define-syntax loop
  (er-macro-transformer
   (lambda (x r c)
     (let ((body (cdr x)))
       `(,(r 'call-with-current-continuation)
         (,(r 'lambda) (exit)
          (,(r 'let) ,(r 'f) () ,@body (,(r 'f)))))))))

Suppose a while macro is implemented using loop, with the intent that exit may be used to escape from the while loop. The while macro cannot be written as

(define-syntax while
  (syntax-rules ()
    ((while test body ...)
     (loop (if (not test) (exit #f))
           body ...))))

because the reference to exit that is inserted by the while macro is intended to be captured by the binding of exit that will be inserted by the loop macro. In other words, this while macro is not hygienic. Like loop, it must be written using the er-macro-transformer syntax:

(define-syntax while
  (er-macro-transformer
   (lambda (x r c)
     (let ((test (cadr x))
           (body (cddr x)))
       `(,(r 'loop)
         (,(r 'if) (,(r 'not) ,test) (exit #f))
         ,@body)))))
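
The following sketch puts the two definitions together and uses them. The procedures count-up and first-even are hypothetical examples; the second one calls exit directly in the body to leave the loop early with a value.

```scheme
(define-syntax loop
  (er-macro-transformer
   (lambda (x r c)
     (let ((body (cdr x)))
       `(,(r 'call-with-current-continuation)
         (,(r 'lambda) (exit)
          (,(r 'let) ,(r 'f) () ,@body (,(r 'f)))))))))

(define-syntax while
  (er-macro-transformer
   (lambda (x r c)
     (let ((test (cadr x))
           (body (cddr x)))
       `(,(r 'loop)
         (,(r 'if) (,(r 'not) ,test) (exit #f))
         ,@body)))))

;; Collects 0 .. n-1; the loop ends when the test becomes false.
(define (count-up n)
  (let ((i 0) (acc '()))
    (while (< i n)
      (set! acc (cons i acc))
      (set! i (+ i 1)))
    (reverse acc)))

;; Returns the first even element, or #f if the list has none.
(define (first-even lst)
  (let ((rest lst))
    (while (pair? rest)
      (if (even? (car rest)) (exit (car rest)))
      (set! rest (cdr rest)))))

(count-up 3)                            ; ⇒ (0 1 2)
(first-even '(1 3 4 5))                 ; ⇒ 4
(first-even '(1 3))                     ; ⇒ #f
```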


2.12 SRFI syntax

Several special forms have been introduced to support some of the Scheme Requests for Implementation (SRFI). Note that MIT/GNU Scheme has for some time supported SRFI 23 (error-reporting mechanism) and SRFI 30 (nested multi-line comments), since these SRFIs reflect existing practice rather than introducing new functionality.



2.12.1 cond-expand (SRFI 0)

SRFI 0 is a mechanism for portably determining the availability of SRFI features. The cond-expand special form conditionally expands according to the features available.

standard special form: cond-expand clause clause …

Each clause has the form

(feature-requirement expression …)

where feature-requirement can have one of the following forms:

feature-identifier
(and feature-requirement …)
(or feature-requirement …)
(not feature-requirement)
else

(Note that at most one else clause may be present, and it must always be the last clause.)

The cond-expand special form tests for the existence of features at macro-expansion time. It either expands into the body of one of its clauses or signals an error during syntactic processing. cond-expand expands into the body of the first clause whose feature-requirement is currently satisfied (an else clause, if present, is selected if none of the previous clauses is selected).

A feature-requirement has an obvious interpretation as a logical formula, where the feature-identifier variables have meaning true if the feature corresponding to the feature-identifier, as specified in the SRFI registry, is in effect at the location of the cond-expand form, and false otherwise. A feature-requirement is satisfied if its formula is true under this interpretation.

(cond-expand
  ((and srfi-1 srfi-10)
   (write 1))
  ((or srfi-1 srfi-10)
   (write 2))
  (else))

(cond-expand
  (command-line
   (define (program-name) (car (argv)))))

The second example assumes that command-line is an alias for some feature which gives access to command line arguments. Note that an error will be signaled at macro-expansion time if this feature is not present.

Note that MIT/GNU Scheme allows cond-expand in any context where a special form is allowed. This is an extension of the semantics defined by SRFI 0, which only allows cond-expand at top level.
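
A common pattern is to supply a portable fallback in the else clause, so that no error is signaled when a feature is missing. In this sketch the procedure sum-list is hypothetical; the first clause assumes that the srfi-1 feature implies the fold procedure is available.

```scheme
(cond-expand
  (srfi-1
   ;; SRFI 1 is available: use its fold.
   (define (sum-list lst)
     (fold + 0 lst)))
  (else
   ;; Fallback that needs nothing beyond the core language.
   (define (sum-list lst)
     (let loop ((lst lst) (acc 0))
       (if (null? lst)
           acc
           (loop (cdr lst) (+ (car lst) acc)))))))

(sum-list '(1 2 3))                     ; ⇒ 6
```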



2.12.2 receive (SRFI 8)

SRFI 8 defines a convenient syntax to bind an identifier to each of the values of a multiple-valued expression and then evaluate an expression in the scope of the bindings. As an instance of this pattern, consider the following excerpt from a ‘quicksort’ procedure:

(call-with-values
  (lambda ()
    (partition (precedes pivot) others))
  (lambda (fore aft)
    (append (qsort fore) (cons pivot (qsort aft)))))

Here ‘partition’ is a multiple-valued procedure that takes two arguments, a predicate and a list, and returns two lists, one comprising the list elements that satisfy the predicate, the other those that do not. The purpose of the expression shown is to partition the list ‘others’, sort each of the sublists, and recombine the results into a sorted list.

For our purposes, the important step is the binding of the identifiers ‘fore’ and ‘aft’ to the values returned by ‘partition’. Expressing the construction and use of these bindings with the call-with-values procedure is cumbersome: One must explicitly embed the expression that provides the values for the bindings in a parameterless procedure, and one must explicitly embed the expression to be evaluated in the scope of those bindings in another procedure, writing as its parameters the identifiers that are to be bound to the values received.

These embeddings are boilerplate, exposing the underlying binding mechanism but not revealing anything relevant to the particular program in which it occurs. So the use of a syntactic abstraction that exposes only the interesting parts – the identifiers to be bound, the multiple-valued expression that supplies the values, and the body of the receiving procedure – makes the code more concise and more readable:

(receive (fore aft) (partition (precedes pivot) others)
  (append (qsort fore) (cons pivot (qsort aft))))

The advantages are similar to those of a ‘let’ expression over a procedure call with a ‘lambda’ expression as its operator. In both cases, cleanly separating a “header” in which the bindings are established from a “body” in which they are used makes it easier to follow the code.

special form: receive formals expression body

Formals and body are defined as for ‘lambda’ (see Lambda Expressions). Specifically, formals can have the following forms (the use of ‘#!optional’ and ‘#!rest’ is also allowed in formals but is omitted for brevity):

(ident1 … identN)

The environment in which the ‘receive’ expression is evaluated is extended by binding ident1, …, identN to fresh locations. The expression is evaluated, and its values are stored into those locations. (It is an error if expression does not have exactly N values.)

ident

The environment in which the ‘receive’ expression is evaluated is extended by binding ident to a fresh location. The expression is evaluated, its values are converted into a newly allocated list, and the list is stored in the location bound to ident.

(ident1 … identN . identN+1)

The environment in which the ‘receive’ expression is evaluated is extended by binding ident1, …, identN+1 to fresh locations. The expression is evaluated. Its first N values are stored into the locations bound to ident1 … identN. Any remaining values are converted into a newly allocated list, which is stored into the location bound to identN+1. (It is an error if expression does not have at least N values.)

In any case, the expressions in body are evaluated sequentially in the extended environment. The results of the last expression in the body are the values of the ‘receive’ expression.
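
The three shapes of formals can be illustrated with a few short expressions. This is a sketch; the identifier names are arbitrary, and floor/ is assumed to be available as the standard procedure returning a quotient and a remainder.

```scheme
;; Fixed number of identifiers: exactly two values are expected.
(receive (q r) (floor/ 17 5)
  (list q r))                           ; ⇒ (3 2)

;; A single identifier: all values are collected into a fresh list.
(receive all (values 1 2 3)
  all)                                  ; ⇒ (1 2 3)

;; Dotted formals: the first value is bound, the rest are listed.
(receive (head . tail) (values 1 2 3)
  (list head tail))                     ; ⇒ (1 (2 3))
```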



2.12.3 and-let* (SRFI 2)

SRFI 2 provides a form that combines ‘and’ and ‘let*’ for a logically short-circuiting sequential binding operator.

special form: and-let* (clause …) body

Runs through each of the clauses left-to-right, short-circuiting like ‘and’ in that the first false clause will result in the whole ‘and-let*’ form returning false. If a body is supplied, and all of the clauses evaluate true, then the body is evaluated sequentially as if in a ‘begin’ form, and the value of the ‘and-let*’ expression is the value of the last body form, evaluated in a tail position with respect to the ‘and-let*’ expression. If no body is supplied, the value of the last clause, also evaluated in a tail position with respect to the ‘and-let*’ expression, is used instead.

Each clause should have one of the following forms:

identifier

in which case identifier’s value is tested.

(expression)

in which case the value of expression is tested.

(identifier expression)

in which case expression is evaluated, and, if its value is not false, identifier is bound to that value for the remainder of the clauses and the optional body.

Example:

(and-let* ((list (compute-list))
           ((pair? list))
           (item (car list))
           ((integer? item)))
  (sqrt item))
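
A self-contained variant of this example, with the list passed as an argument instead of coming from compute-list, shows the short-circuiting. The procedure name first-integer-root is hypothetical.

```scheme
(define (first-integer-root lst)
  (and-let* (((pair? lst))              ; (expression) clause: test only
             (item (car lst))           ; (identifier expression) clause
             ((integer? item)))
    (sqrt item)))

(first-integer-root '(16 x))            ; ⇒ 4
(first-integer-root '())                ; ⇒ #f, stops at the first clause
(first-integer-root '(a 16))            ; ⇒ #f, stops at the last clause
```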


2.12.4 define-record-type (SRFI 9)

The ‘define-record-type’ syntax described in SRFI 9 is a slight simplification of one written for Scheme 48 by Jonathan Rees. Unlike many record-defining special forms, it does not create any new identifiers. Instead, the names of the record type, predicate, constructor, and so on are all listed explicitly in the source. This has the following advantages:

extended standard special form: define-record-type type-name (constructor-name field-tag …) predicate-name field-spec …

Type-name, constructor-name, field-tag, and predicate-name are identifiers. Field-spec has one of these two forms:

(field-tag accessor-name)
(field-tag accessor-name modifier-name)

where field-tag, accessor-name, and modifier-name are each identifiers.

define-record-type is generative: each use creates a new record type that is distinct from all existing types, including other record types and Scheme’s predefined types. Record-type definitions may only occur at top-level (there are two possible semantics for “internal” record-type definitions, generative and nongenerative, and no consensus as to which is better).

An instance of define-record-type is equivalent to the following definitions:

Assigning the value of any of these identifiers has no effect on the behavior of any of their original values.

The following

(define-record-type :pare
  (kons x y)
  pare?
  (x kar set-kar!)
  (y kdr))

defines ‘kons’ to be a constructor, ‘kar’ and ‘kdr’ to be accessors, ‘set-kar!’ to be a modifier, and ‘pare?’ to be a predicate for objects of type ‘:pare’.

(pare? (kons 1 2))        ⇒ #t
(pare? (cons 1 2))        ⇒ #f
(kar (kons 1 2))          ⇒ 1
(kdr (kons 1 2))          ⇒ 2
(let ((k (kons 1 2)))
  (set-kar! k 3)
  (kar k))                ⇒ 3


3 Equivalence Predicates

A predicate is a procedure that always returns a boolean value (#t or #f). An equivalence predicate is the computational analogue of a mathematical equivalence relation (it is symmetric, reflexive, and transitive). Of the equivalence predicates described in this section, eq? is the finest or most discriminating, and equal? is the coarsest. eqv? is slightly less discriminating than eq?.

procedure: eqv? obj1 obj2

The eqv? procedure defines a useful equivalence relation on objects. Briefly, it returns #t if obj1 and obj2 should normally be regarded as the same object.

The eqv? procedure returns #t if:

The eqv? procedure returns #f if:

Some examples:

(eqv? 'a 'a)                    ⇒  #t
(eqv? 'a 'b)                    ⇒  #f
(eqv? 2 2)                      ⇒  #t
(eqv? '() '())                  ⇒  #t
(eqv? 100000000 100000000)      ⇒  #t
(eqv? (cons 1 2) (cons 1 2))    ⇒  #f
(eqv? (lambda () 1)
      (lambda () 2))            ⇒  #f
(eqv? #f 'nil)                  ⇒  #f
(let ((p (lambda (x) x)))
  (eqv? p p))                   ⇒  #t

The following examples illustrate cases in which the above rules do not fully specify the behavior of eqv?. All that can be said about such cases is that the value returned by eqv? must be a boolean.

(eqv? "" "")                    ⇒  unspecified
(eqv? '#() '#())                ⇒  unspecified
(eqv? (lambda (x) x)
      (lambda (x) x))           ⇒  unspecified
(eqv? (lambda (x) x)
      (lambda (y) y))           ⇒  unspecified

The next set of examples shows the use of eqv? with procedures that have local state. gen-counter must return a distinct procedure every time, since each procedure has its own internal counter. gen-loser, however, returns equivalent procedures each time, since the local state does not affect the value or side effects of the procedures.

(define gen-counter
  (lambda ()
    (let ((n 0))
      (lambda () (set! n (+ n 1)) n))))
(let ((g (gen-counter)))
  (eqv? g g))                   ⇒  #t
(eqv? (gen-counter) (gen-counter))
                                ⇒  #f

(define gen-loser
  (lambda ()
    (let ((n 0))
      (lambda () (set! n (+ n 1)) 27))))
(let ((g (gen-loser)))
  (eqv? g g))                   ⇒  #t
(eqv? (gen-loser) (gen-loser))
                                ⇒  unspecified

(letrec ((f (lambda () (if (eqv? f g) 'both 'f)))
         (g (lambda () (if (eqv? f g) 'both 'g))))
  (eqv? f g))
                                ⇒  unspecified

(letrec ((f (lambda () (if (eqv? f g) 'f 'both)))
         (g (lambda () (if (eqv? f g) 'g 'both))))
  (eqv? f g))
                                ⇒  #f

Objects of distinct types must never be regarded as the same object.

Since it is an error to modify constant objects (those returned by literal expressions), the implementation may share structure between constants where appropriate. Thus the value of eqv? on constants is sometimes unspecified.

(let ((x '(a)))
  (eqv? x x))                    ⇒  #t
(eqv? '(a) '(a))                 ⇒  unspecified
(eqv? "a" "a")                   ⇒  unspecified
(eqv? '(b) (cdr '(a b)))         ⇒  unspecified

Rationale: The above definition of eqv? allows implementations latitude in their treatment of procedures and literals: implementations are free either to detect or to fail to detect that two procedures or two literals are equivalent to each other, and can decide whether or not to merge representations of equivalent objects by using the same pointer or bit pattern to represent both.

procedure: eq? obj1 obj2

eq? is similar to eqv? except that in some cases it is capable of discerning distinctions finer than those detectable by eqv?.

eq? and eqv? are guaranteed to have the same behavior on symbols, booleans, the empty list, pairs, records, and non-empty strings and vectors. eq?’s behavior on numbers and characters is implementation-dependent, but it will always return either true or false, and will return true only when eqv? would also return true. eq? may also behave differently from eqv? on empty vectors and empty strings.

(eq? 'a 'a)                     ⇒  #t
(eq? '(a) '(a))                 ⇒  unspecified
(eq? (list 'a) (list 'a))       ⇒  #f
(eq? "a" "a")                   ⇒  unspecified
(eq? "" "")                     ⇒  unspecified
(eq? '() '())                   ⇒  #t
(eq? 2 2)                       ⇒  unspecified
(eq? #\A #\A)                   ⇒  unspecified
(eq? car car)                   ⇒  #t
(let ((n (+ 2 3)))
  (eq? n n))                    ⇒  unspecified
(let ((x '(a)))
  (eq? x x))                    ⇒  #t
(let ((x '#()))
  (eq? x x))                    ⇒  #t
(let ((p (lambda (x) x)))
  (eq? p p))                    ⇒  #t

Rationale: It will usually be possible to implement eq? much more efficiently than eqv?, for example, as a simple pointer comparison instead of as some more complicated operation. One reason is that it may not be possible to compute eqv? of two numbers in constant time, whereas eq? implemented as pointer comparison will always finish in constant time. eq? may be used like eqv? in applications using procedures to implement objects with state since it obeys the same constraints as eqv?.

procedure: equal? obj1 obj2

equal? recursively compares the contents of pairs, vectors, and strings, applying eqv? on other objects such as numbers, symbols, and records. A rule of thumb is that objects are generally equal? if they print the same. equal? may fail to terminate if its arguments are circular data structures.

(equal? 'a 'a)                  ⇒  #t
(equal? '(a) '(a))              ⇒  #t
(equal? '(a (b) c)
        '(a (b) c))             ⇒  #t
(equal? "abc" "abc")            ⇒  #t
(equal? 2 2)                    ⇒  #t
(equal? (make-vector 5 'a)
        (make-vector 5 'a))     ⇒  #t
(equal? (lambda (x) x)
        (lambda (y) y))         ⇒  unspecified
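
The difference between the three predicates is easiest to see on freshly allocated structures, where none of the results is unspecified. The following sketch uses two hypothetical variables, a and b, and also contrasts the predicates with numerical equality.

```scheme
(define a (list 1 2 3))
(define b (list 1 2 3))

(eq? a a)                               ; ⇒ #t, the same object
(eq? a b)                               ; ⇒ #f, distinct pairs
(eqv? a b)                              ; ⇒ #f, distinct pairs
(equal? a b)                            ; ⇒ #t, same contents

;; Exactness matters to eqv? and equal?, but not to =.
(eqv? 2 2.0)                            ; ⇒ #f
(equal? 2 2.0)                          ; ⇒ #f
(= 2 2.0)                               ; ⇒ #t
```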


4 Numbers

(This section is largely taken from the Revised^4 Report on the Algorithmic Language Scheme.)

Numerical computation has traditionally been neglected by the Lisp community. Until Common Lisp there was no carefully thought out strategy for organizing numerical computation, and with the exception of the MacLisp system little effort was made to execute numerical code efficiently. This report recognizes the excellent work of the Common Lisp committee and accepts many of their recommendations. In some ways this report simplifies and generalizes their proposals in a manner consistent with the purposes of Scheme.

It is important to distinguish between the mathematical numbers, the Scheme numbers that attempt to model them, the machine representations used to implement the Scheme numbers, and notations used to write numbers. This report uses the types number, complex, real, rational, and integer to refer to both mathematical numbers and Scheme numbers. Machine representations such as fixed point and floating point are referred to by names such as fixnum and flonum.



4.1 Numerical types

Mathematically, numbers may be arranged into a tower of subtypes in which each level is a subset of the level above it:

number
complex
real
rational
integer

For example, 3 is an integer. Therefore 3 is also a rational, a real, and a complex. The same is true of the Scheme numbers that model 3. For Scheme numbers, these types are defined by the predicates number?, complex?, real?, rational?, and integer?.
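
The subset relationship can be checked directly with the predicates. In this sketch the helper tower-levels is hypothetical; it applies the five predicates, from the bottom of the tower to the top, and returns the list of results. The last call assumes that the implementation supports non-real complex numbers.

```scheme
(define (tower-levels x)
  (map (lambda (pred) (pred x))
       (list integer? rational? real? complex? number?)))

(tower-levels 3)                        ; ⇒ (#t #t #t #t #t)
(tower-levels 1/2)                      ; ⇒ (#f #t #t #t #t)
(tower-levels 1.5)                      ; ⇒ (#f #t #t #t #t)
(tower-levels 1+2i)                     ; ⇒ (#f #f #f #t #t)
```

Once a predicate in the list is true, every predicate after it is true as well, which is the tower property stated above.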

There is no simple relationship between a number’s type and its representation inside a computer. Although most implementations of Scheme will offer at least two different representations of 3, these different representations denote the same integer.

Scheme’s numerical operations treat numbers as abstract data, as independent of their representation as possible. Although an implementation of Scheme may use fixnum, flonum, and perhaps other representations for numbers, this should not be apparent to a casual programmer writing simple programs.

It is necessary, however, to distinguish between numbers that are represented exactly and those that may not be. For example, indexes into data structures must be known exactly, as must some polynomial coefficients in a symbolic algebra system. On the other hand, the results of measurements are inherently inexact, and irrational numbers may be approximated by rational and therefore inexact approximations. In order to catch uses of inexact numbers where exact numbers are required, Scheme explicitly distinguishes exact from inexact numbers. This distinction is orthogonal to the dimension of type.


4.2 Exactness

Scheme numbers are either exact or inexact. A number is exact if it was written as an exact constant or was derived from exact numbers using only exact operations. A number is inexact if it was written as an inexact constant, if it was derived using inexact ingredients, or if it was derived using inexact operations. Thus inexactness is a contagious property of a number.
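The contagion rule can be illustrated with a few expressions. This is a minimal sketch; the particular constants are chosen for illustration and do not come from the report.

```scheme
;; Exact constants combined by exact operations stay exact.
(exact? (+ 1/3 2/3))          ; ⇒ #t  (the sum is the exact integer 1)

;; A single inexact ingredient makes the whole result inexact.
(inexact? (+ 1/3 0.5))        ; ⇒ #t
(inexact? (* 2 (+ 1 0.5)))    ; ⇒ #t
```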

If two implementations produce exact results for a computation that did not involve inexact intermediate results, the two ultimate results will be mathematically equivalent. This is generally not true of computations involving inexact numbers since approximate methods such as floating point arithmetic may be used, but it is the duty of each implementation to make the result as close as practical to the mathematically ideal result.

Rational operations such as + should always produce exact results when given exact arguments. If the operation is unable to produce an exact result, then it may either report the violation of an implementation restriction or it may silently coerce its result to an inexact value. See Implementation restrictions.

With the exception of exact, the operations described in this section must generally return inexact results when given any inexact arguments. An operation may, however, return an exact result if it can prove that the value of the result is unaffected by the inexactness of its arguments. For example, multiplication of any number by an exact zero may produce an exact zero result, even if the other argument is inexact.


4.3 Implementation restrictions

Implementations of Scheme are not required to implement the whole tower of subtypes (see Numerical types), but they must implement a coherent subset consistent with both the purposes of the implementation and the spirit of the Scheme language. For example, an implementation in which all numbers are real may still be quite useful. (1)

Implementations may also support only a limited range of numbers of any type, subject to the requirements of this section. The supported range for exact numbers of any type may be different from the supported range for inexact numbers of that type. For example, an implementation that uses flonums to represent all its inexact real numbers may support a practically unbounded range of exact integers and rationals while limiting the range of inexact reals (and therefore the range of inexact integers and rationals) to the dynamic range of the flonum format. Furthermore the gaps between the representable inexact integers and rationals are likely to be very large in such an implementation as the limits of this range are approached.

An implementation of Scheme must support exact integers throughout the range of numbers that may be used for indexes of lists, vectors, and strings or that may result from computing the length of a list, vector, or string. The length, vector-length, and string-length procedures must return an exact integer, and it is an error to use anything but an exact integer as an index. Furthermore any integer constant within the index range, if expressed by an exact integer syntax, will indeed be read as an exact integer, regardless of any implementation restrictions that may apply outside this range. Finally, the procedures listed below will always return an exact integer result provided all their arguments are exact integers and the mathematically expected result is representable as an exact integer within the implementation:

*                gcd                modulo
+                imag-part          numerator
-                exact              quotient
abs              lcm                rationalize
angle            magnitude          real-part
ceiling          make-polar         remainder
denominator      make-rectangular   round
expt             max                truncate
floor            min

Implementations are encouraged, but not required, to support exact integers and exact rationals of practically unlimited size and precision, and to implement the above procedures and the / procedure in such a way that they always return exact results when given exact arguments. If one of these procedures is unable to deliver an exact result when given exact arguments, then it may either report a violation of an implementation restriction or it may silently coerce its result to an inexact number. Such a coercion may cause an error later.

An implementation may use floating point and other approximate representation strategies for inexact numbers. This report recommends, but does not require, that the IEEE 32-bit and 64-bit floating point standards be followed by implementations that use flonum representations, and that implementations using other representations should match or exceed the precision achievable using these floating point standards.

In particular, implementations that use flonum representations must follow these rules: A flonum result must be represented with at least as much precision as is used to express any of the inexact arguments to that operation. It is desirable (but not required) for potentially inexact operations such as sqrt, when applied to exact arguments, to produce exact answers whenever possible (for example the square root of an exact 4 ought to be an exact 2). If, however, an exact number is operated upon so as to produce an inexact result (as by sqrt), and if the result is represented as a flonum, then the most precise flonum format available must be used; but if the result is represented in some other way then the representation must have at least as much precision as the most precise flonum format available.

Although Scheme allows a variety of written notations for numbers, any particular implementation may support only some of them. (2) For example, an implementation in which all numbers are real need not support the rectangular and polar notations for complex numbers. If an implementation encounters an exact numerical constant that it cannot represent as an exact number, then it may either report a violation of an implementation restriction or it may silently represent the constant by an inexact number.


4.4 Syntax of numerical constants

A number may be written in binary, octal, decimal, or hexadecimal by the use of a radix prefix. The radix prefixes are #b (binary), #o (octal), #d (decimal), and #x (hexadecimal). With no radix prefix, a number is assumed to be expressed in decimal.

A numerical constant may be specified to be either exact or inexact by a prefix. The prefixes are #e for exact, and #i for inexact. An exactness prefix may appear before or after any radix prefix that is used. If the written representation of a number has no exactness prefix, the constant may be either inexact or exact. It is inexact if it contains a decimal point, an exponent, or a # character in the place of a digit, otherwise it is exact.
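The radix and exactness prefixes described above can be tried directly at the reader. The constants below are illustrative only; the inexact results assume a flonum representation that holds 0.75 exactly.

```scheme
#b101          ; ⇒ 5     binary
#o17           ; ⇒ 15    octal
#x1F           ; ⇒ 31    hexadecimal
#e1.5          ; ⇒ 3/2   decimal point, but forced exact
#i3/4          ; ⇒ 0.75  ratio notation, but forced inexact
#x#e10         ; ⇒ 16    the two prefixes may appear in either order
#e#x10         ; ⇒ 16

(exact? 15)    ; ⇒ #t    no decimal point or exponent
(exact? 1.5)   ; ⇒ #f    decimal point
(exact? 1e2)   ; ⇒ #f    exponent
```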

In systems with inexact numbers of varying precisions it may be useful to specify the precision of a constant. For this purpose, numerical constants may be written with an exponent marker that indicates the desired precision of the inexact representation. The letters s, f, d, and l specify the use of short, single, double, and long precision, respectively. (When fewer than four internal inexact representations exist, the four size specifications are mapped onto those available. For example, an implementation with two internal representations may map short and single together and long and double together.) In addition, the exponent marker e specifies the default precision for the implementation. The default precision has at least as much precision as double, but implementations may wish to allow this default to be set by the user.

3.14159265358979F0
       Round to single — 3.141593
0.6L0
       Extend to long — .600000000000000

4.5 Numerical operations

See Entry Format, for a summary of the naming conventions used to specify restrictions on the types of arguments to numerical routines. The examples used in this section assume that any numerical constant written using an exact notation is indeed represented as an exact number. Some examples also assume that certain numerical constants written using an inexact notation can be represented without loss of accuracy; the inexact constants were chosen so that this is likely to be true in implementations that use flonums to represent inexact numbers.

procedure: number? object
procedure: complex? object
procedure: real? object
procedure: rational? object
procedure: integer? object

These numerical type predicates can be applied to any kind of argument, including non-numbers. They return #t if the object is of the named type, and otherwise they return #f. In general, if a type predicate is true of a number then all higher type predicates are also true of that number. Consequently, if a type predicate is false of a number, then all lower type predicates are also false of that number. (3)

If z is an inexact complex number, then (real? z) is true if and only if (zero? (imag-part z)) is true. If x is an inexact real number, then (integer? x) is true if and only if (= x (round x)).

(complex? 3+4i)         ⇒  #t
(complex? 3)            ⇒  #t
(real? 3)               ⇒  #t
(real? -2.5+0.0i)       ⇒  #t
(real? #e1e10)          ⇒  #t
(rational? 6/10)        ⇒  #t
(rational? 6/3)         ⇒  #t
(integer? 3+0i)         ⇒  #t
(integer? 3.0)          ⇒  #t
(integer? 8/4)          ⇒  #t

Note: The behavior of these type predicates on inexact numbers is unreliable, since any inaccuracy may affect the result.

procedure: exact? z
procedure: inexact? z

These numerical predicates provide tests for the exactness of a quantity. For any Scheme number, precisely one of these predicates is true.

procedure: exact-integer? object
procedure: exact-nonnegative-integer? object
procedure: exact-rational? object

These procedures test for some very common types of numbers. These tests could be written in terms of simpler predicates, but are more efficient.
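As a rough illustration of what "written in terms of simpler predicates" means, the three tests could be sketched as follows. The sketch- names are hypothetical and exist only for this example; the built-in procedures are the ones to use in practice.

```scheme
;; Hypothetical equivalents built from the standard predicates.
;; integer? and rational? accept any object, so the exact? call
;; is reached only when the argument is already known to be a number.
(define (sketch-exact-integer? object)
  (and (integer? object) (exact? object)))

(define (sketch-exact-rational? object)
  (and (rational? object) (exact? object)))

(define (sketch-exact-nonnegative-integer? object)
  (and (sketch-exact-integer? object)
       (not (negative? object))))

(sketch-exact-integer? 5)                  ; ⇒ #t
(sketch-exact-integer? 5.0)                ; ⇒ #f
(sketch-exact-rational? 1/2)               ; ⇒ #t
(sketch-exact-nonnegative-integer? -5)     ; ⇒ #f
```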

procedure: = z1 z2 z3 …
procedure: < x1 x2 x3 …
procedure: > x1 x2 x3 …
procedure: <= x1 x2 x3 …
procedure: >= x1 x2 x3 …

These procedures return #t if their arguments are (respectively): equal, monotonically increasing, monotonically decreasing, monotonically nondecreasing, or monotonically nonincreasing.

These predicates are transitive. Note that the traditional implementations of these predicates in Lisp-like languages are not transitive.

Note: While it is not an error to compare inexact numbers using these predicates, the results may be unreliable because a small inaccuracy may affect the result; this is especially true of = and zero?. When in doubt, consult a numerical analyst.
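A few illustrative comparisons follow, including one case where a small inaccuracy changes the answer. The flonum result assumes IEEE 754 binary64 arithmetic, and close-enough? is a hypothetical helper, not part of the system.

```scheme
(< 1 2 3)                  ; ⇒ #t  monotonically increasing
(< 1 3 2)                  ; ⇒ #f
(<= 1 1 2)                 ; ⇒ #t  monotonically nondecreasing
(= 1 1.0)                  ; ⇒ #t  exactness does not matter to =

;; Rounding error makes this sum differ from the constant 0.3.
(= (+ 0.1 0.2) 0.3)        ; ⇒ #f

;; One common workaround: compare against a tolerance chosen by the caller.
(define (close-enough? x y tolerance)
  (<= (abs (- x y)) tolerance))

(close-enough? (+ 0.1 0.2) 0.3 1e-12)   ; ⇒ #t
```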

procedure: zero? z
procedure: positive? x
procedure: negative? x
procedure: odd? x
procedure: even? x

These numerical predicates test a number for a particular property, returning #t or #f. See note above regarding inexact numbers.

procedure: max x1 x2 …
procedure: min x1 x2 …

These procedures return the maximum or minimum of their arguments.

(max 3 4)              ⇒  4    ; exact
(max 3.9 4)            ⇒  4.0  ; inexact

Note: If any argument is inexact, then the result will also be inexact (unless the procedure can prove that the inaccuracy is not large enough to affect the result, which is possible only in unusual implementations). If min or max is used to compare numbers of mixed exactness, and the numerical value of the result cannot be represented as an inexact number without loss of accuracy, then the procedure may report a violation of an implementation restriction. (4)

procedure: + z1 …
procedure: * z1 …

These procedures return the sum or product of their arguments.

(+ 3 4)                 ⇒  7
(+ 3)                   ⇒  3
(+)                     ⇒  0
(* 4)                   ⇒  4
(*)                     ⇒  1

procedure: - z1 z2 …
procedure: / z1 z2 …

With two or more arguments, these procedures return the difference or quotient of their arguments, associating to the left. With one argument, however, they return the additive or multiplicative inverse of their argument.

(- 3 4)                 ⇒  -1
(- 3 4 5)               ⇒  -6
(- 3)                   ⇒  -3
(/ 3 4 5)               ⇒  3/20
(/ 3)                   ⇒  1/3

procedure: 1+ z
procedure: -1+ z

(1+ z) is equivalent to (+ z 1); (-1+ z) is equivalent to (- z 1).

procedure: abs x

abs returns the magnitude of its argument.

(abs -7)                ⇒  7

procedure: quotient n1 n2
procedure: remainder n1 n2
procedure: modulo n1 n2

These procedures implement number-theoretic (integer) division: for positive integers n1 and n2, if n3 and n4 are integers such that n1 = n2*n3 + n4 and 0 <= n4 < n2, then

(quotient n1 n2)        ⇒  n3
(remainder n1 n2)       ⇒  n4
(modulo n1 n2)          ⇒  n4

For integers n1 and n2 with n2 not equal to 0,

(= n1
   (+ (* n2 (quotient n1 n2))
      (remainder n1 n2)))
                                    ⇒  #t

provided all numbers involved in that computation are exact.

The value returned by quotient always has the sign of the product of its arguments. remainder and modulo differ on negative arguments — the remainder always has the sign of the dividend, the modulo always has the sign of the divisor:

(modulo 13 4)           ⇒  1
(remainder 13 4)        ⇒  1

(modulo -13 4)          ⇒  3
(remainder -13 4)       ⇒  -1

(modulo 13 -4)          ⇒  -3
(remainder 13 -4)       ⇒  1

(modulo -13 -4)         ⇒  -1
(remainder -13 -4)      ⇒  -1

(remainder -13 -4.0)    ⇒  -1.0  ; inexact

Note that quotient is the same as integer-truncate.
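The sign rules above can be checked mechanically. In this sketch, the two procedure names are hypothetical; the first restates the identity given earlier for quotient and remainder, and the second states the analogous identity for modulo, which pairs with floor division rather than truncating division.

```scheme
;; n1 = n2 * (quotient n1 n2) + (remainder n1 n2)
(define (quotient-remainder-identity? n1 n2)
  (= n1
     (+ (* n2 (quotient n1 n2))
        (remainder n1 n2))))

;; n1 = n2 * floor(n1 / n2) + (modulo n1 n2)
;; With exact arguments, (/ n1 n2) is an exact rational and
;; floor returns an exact integer.
(define (floor-modulo-identity? n1 n2)
  (= n1
     (+ (* n2 (floor (/ n1 n2)))
        (modulo n1 n2))))

(quotient-remainder-identity? -13 4)    ; ⇒ #t
(floor-modulo-identity? -13 4)          ; ⇒ #t
(floor-modulo-identity? 13 -4)          ; ⇒ #t
```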

procedure: integer-floor n1 n2
procedure: integer-ceiling n1 n2
procedure: integer-truncate n1 n2
procedure: integer-round n1 n2

These procedures combine integer division with rounding. For example, the following are equivalent:

(integer-floor n1 n2)
(floor (/ n1 n2))

However, the former is faster and does not produce an intermediate result.

Note that integer-truncate is the same as quotient.

procedure: integer-divide n1 n2
procedure: integer-divide-quotient qr
procedure: integer-divide-remainder qr

integer-divide is equivalent to performing both quotient and remainder at once. The result of integer-divide is an object with two components; the procedures integer-divide-quotient and integer-divide-remainder select those components. These procedures are useful when both the quotient and remainder are needed; often computing both of these numbers simultaneously is much faster than computing them separately.

For example, the following are equivalent:

(lambda (n d)
  (cons (quotient n d)
        (remainder n d)))

(lambda (n d)
  (let ((qr (integer-divide n d)))
    (cons (integer-divide-quotient qr)
          (integer-divide-remainder qr))))

procedure: gcd n1 …
procedure: lcm n1 …

These procedures return the greatest common divisor or least common multiple of their arguments. The result is always non-negative.

(gcd 32 -36)            ⇒  4
(gcd)                   ⇒  0

(lcm 32 -36)            ⇒  288
(lcm 32.0 -36)          ⇒  288.0  ; inexact
(lcm)                   ⇒  1

procedure: numerator q
procedure: denominator q

These procedures return the numerator or denominator of their argument; the result is computed as if the argument was represented as a fraction in lowest terms. The denominator is always positive. The denominator of 0 is defined to be 1.

(numerator (/ 6 4))  ⇒  3
(denominator (/ 6 4))  ⇒  2
(denominator (inexact (/ 6 4))) ⇒ 2.0

procedure: floor x
procedure: ceiling x
procedure: truncate x
procedure: round x

These procedures return integers. floor returns the largest integer not larger than x. ceiling returns the smallest integer not smaller than x. truncate returns the integer closest to x whose absolute value is not larger than the absolute value of x. round returns the closest integer to x, rounding to even when x is halfway between two integers.

Rationale: round rounds to even for consistency with the rounding modes required by the IEEE floating point standard.

Note: If the argument to one of these procedures is inexact, then the result will also be inexact. If an exact value is needed, the result should be passed to the exact procedure (or use one of the procedures below).

(floor -4.3)          ⇒  -5.0
(ceiling -4.3)        ⇒  -4.0
(truncate -4.3)       ⇒  -4.0
(round -4.3)          ⇒  -4.0

(floor 3.5)           ⇒  3.0
(ceiling 3.5)         ⇒  4.0
(truncate 3.5)        ⇒  3.0
(round 3.5)           ⇒  4.0  ; inexact

(round 7/2)           ⇒  4    ; exact
(round 7)             ⇒  7

procedure: floor->exact x
procedure: ceiling->exact x
procedure: truncate->exact x
procedure: round->exact x

These procedures are similar to the preceding procedures except that they always return an exact result. For example, the following are equivalent

(floor->exact x)
(exact (floor x))

except that the former is faster and has fewer range restrictions.
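For instance, assuming flonum arguments (the constants are illustrative):

```scheme
(floor 2.5)               ; ⇒ 2.0  inexact
(exact (floor 2.5))       ; ⇒ 2    exact, in two steps
(floor->exact 2.5)        ; ⇒ 2    exact, in one step

(round->exact 2.5)        ; ⇒ 2    halfway case rounds to even
(round->exact 3.5)        ; ⇒ 4
(ceiling->exact -4.3)     ; ⇒ -4
(truncate->exact -4.3)    ; ⇒ -4
```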

procedure: rationalize x y
procedure: rationalize->exact x y

rationalize returns the simplest rational number differing from x by no more than y. A rational number r1 is simpler than another rational number r2 if r1=p1/q1 and r2=p2/q2 (both in lowest terms) and |p1|<=|p2| and |q1|<=|q2|. Thus 3/5 is simpler than 4/7. Although not all rationals are comparable in this ordering (consider 2/7 and 3/5) any interval contains a rational number that is simpler than every other rational number in that interval (the simpler 2/5 lies between 2/7 and 3/5). Note that 0=0/1 is the simplest rational of all.

(rationalize (exact .3) 1/10)  ⇒ 1/3    ; exact
(rationalize .3 1/10)          ⇒ #i1/3  ; inexact

rationalize->exact is similar to rationalize except that it always returns an exact result.

procedure: simplest-rational x y
procedure: simplest-exact-rational x y

simplest-rational returns the simplest rational number between x and y inclusive; simplest-exact-rational is similar except that it always returns an exact result.

These procedures implement the same functionality as rationalize and rationalize->exact, except that they specify the input range by its endpoints; rationalize specifies the range by its center point and its (half-) width.
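The relationship between the two calling conventions can be sketched with one interval. The interval here, centered at 3/10 with half-width 1/10, is chosen only for illustration; its endpoints are 1/5 and 2/5.

```scheme
;; Center and half-width:
(rationalize 3/10 1/10)             ; ⇒ 1/3

;; The same interval given by its endpoints:
(simplest-rational 1/5 2/5)         ; ⇒ 1/3

;; Inexact center, exact result:
(rationalize->exact .3 1/10)        ; ⇒ 1/3
```

No fraction with denominator 1 or 2 lies between 1/5 and 2/5, so 1/3 is the simplest rational in that interval.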

procedure: exp z
procedure: log z
procedure: sin z
procedure: cos z
procedure: tan z
procedure: asin z
procedure: acos z
procedure: atan z
procedure: atan y x

These procedures compute the usual transcendental functions. log computes the natural logarithm of z (not the base ten logarithm). asin, acos, and atan compute arcsine, arccosine, and arctangent, respectively. The two-argument variant of atan computes (angle (make-rectangular x y)) (see below).

In general, the mathematical functions log, arcsine, arccosine, and arctangent are multiply defined. For nonzero real x, the value of log x is defined to be the one whose imaginary part lies in the range minus pi (exclusive) to pi (inclusive). log 0 is undefined. The value of log z when z is complex is defined according to the formula

log z = log magnitude(z) + i angle(z)

With log defined this way, the values of arcsine, arccosine, and arctangent are according to the following formulae:

asin z = -i log(i z + sqrt(1 - z^2))
acos z = pi/2 - asin z
atan z = (log(1 + i z) - log(1 - i z)) / (2 i)

The above specification follows Common Lisp: the Language, which in turn cites Principal Values and Branch Cuts in Complex APL; refer to these sources for more detailed discussion of branch cuts, boundary conditions, and implementation of these functions. When it is possible these procedures produce a real result from a real argument.
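A short sketch of the two-argument atan and of the branch chosen for log follows. The name sketch-pi is hypothetical and is defined here only for comparison; results are flonums, so they are shown approximately.

```scheme
;; Hypothetical helper: pi computed from atan.
(define sketch-pi (* 4 (atan 1 1)))

;; Two-argument atan takes y first, then x, and uses both signs
;; to pick the quadrant.
(atan 1 1)                          ; ≈ pi/4
(angle (make-rectangular 1 1))      ; same value
(atan -1 -1)                        ; ≈ -3pi/4 (third quadrant)

;; The logarithm of a negative real has imaginary part pi,
;; the inclusive end of the range.
(imag-part (log -1))                ; ≈ pi
```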

procedure: log1p z
procedure: expm1 z

Equivalent to:

log1p z = log(1 + z),
expm1 z = exp(z) - 1.

[Figures: log1p, expm1]

However, for real numbers close to zero, these provide better approximations than (log (+ 1 z)) or (- (exp z) 1).

The forward relative error of this implementation is determined by the system’s math library, usually below 1ulp.
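The difference shows up as soon as the argument is smaller than the spacing of flonums near 1. The sketch below assumes IEEE 754 binary64 flonums; the constant 1e-20 is chosen only for illustration.

```scheme
;; 1 + 1e-20 rounds to 1, so the naive forms lose the input entirely.
(log (+ 1 1e-20))        ; ⇒ 0
(- (exp 1e-20) 1)        ; ⇒ 0

;; The dedicated procedures never form the intermediate value near 1.
(log1p 1e-20)            ; ≈ 1e-20
(expm1 1e-20)            ; ≈ 1e-20
```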

procedure: log1mexp x
procedure: log1pexp x

Equivalent to:

log1mexp x = log (1 - e^x),
log1pexp x = log (1 + e^x).

[Figures: log1mexp, log1pexp]

Like log1p and expm1, these avoid numerical pathologies with the intermediate quantities 1 - e^x and 1 + e^x and inputs to log near 1.

This implementation gives forward relative error bounded by ten times the forward relative error bound of the system math library’s log and exp, which is usually below 1ulp.

Beware that although the forward relative error of the MIT/GNU Scheme implementations of these functions is bounded, these functions are ill-conditioned for large negative inputs:

x f'(x)/f(x) = (+/- x exp(x))/((1 +/- e^x) log(1 +/- e^x)),
  --> x,  for x << 0.

[Figures: condition numbers of log1mexp and log1pexp]

procedure: logistic x
procedure: logit x

Logistic and logit functions. Equivalent to:

logistic x = exp(x)/[1 + exp(x)] = 1/[1 + exp(-x)],
logit p = log [p/(1 - p)].

These functions are inverses of one another. The logit function maps a probability p in [0, 1] into log-odds x in the extended real line, and the logistic function maps back from log-odds to probabilities.
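A small round trip illustrates the inverse relationship. The inputs are illustrative, and results are shown approximately because they are flonums.

```scheme
(logistic 0.)              ; ≈ 0.5  even odds
(logit .5)                 ; ≈ 0    log-odds of even odds
(logit (logistic 2.))      ; ≈ 2    round trip from log-odds
(logistic (logit .25))     ; ≈ 0.25 round trip from probability
```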

procedure: logistic-1/2 x
procedure: logit1/2+ x

Equivalent to:

logistic-1/2 x = logistic(x) - 1/2,
logit1/2+ p = logit(1/2 + p).

[Figures: logistic-1/2, logit1/2+]

Like logistic and logit, these functions are inverses of one another; unlike logistic and logit, their domains and codomains are both centered at zero.

procedure: log-logistic x
procedure: logit-exp x

Equivalent to:

log-logistic x = log(logistic(x)) = log [1/(1 + exp(-x))]
logit-exp x = logit(exp(x)) = log [exp(x)/(1 - exp(x))]

Like logistic and logit, these functions are inverses of one another.

This implementation gives forward relative error bounded by ten times the forward relative error bound of the system math library’s log and exp, which is usually below 1ulp.

procedure: logsumexp list

List must be a list of real numbers x1, x2, …, xn. Returns an approximation to:

log(exp(x1) + exp(x2) + … + exp(xn)).

The approximation avoids intermediate overflow and underflow. To minimize error, the caller should arrange for the numbers to be sorted from least to greatest.
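One well-known way to avoid intermediate overflow is to subtract the maximum before exponentiating. The sketch below shows that technique under the assumption of a nonempty list of finite reals; sketch-logsumexp is a hypothetical name, and this is not claimed to be how the built-in logsumexp is implemented.

```scheme
;; log(sum(exp(xi))) = m + log(sum(exp(xi - m))), where m = max(xi).
;; Every exponent xi - m is <= 0, so exp cannot overflow.
;; Assumes xs is a nonempty list of finite real numbers.
(define (sketch-logsumexp xs)
  (let ((m (apply max xs)))
    (+ m
       (log (apply + (map (lambda (x) (exp (- x m))) xs))))))

;; exp(1000.) alone is far outside the flonum range, yet the
;; shifted form is well behaved:
(sketch-logsumexp (list 1000. 1000.))    ; ≈ 1000 + log 2
```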

Edge cases:

Logsumexp never raises any of the standard IEEE 754-2008 floating-point exceptions other than invalid-operation.

procedure: sqrt z

Returns the principal square root of z. The result will have either positive real part, or zero real part and non-negative imaginary part.

procedure: expt z1 z2

Returns z1 raised to the power z2:

z1^z2 = e^(z2 log z1)

procedure: make-rectangular x1 x2
procedure: make-polar x3 x4
procedure: real-part z
procedure: imag-part z
procedure: magnitude z
procedure: angle z
procedure: conjugate z

Suppose x1, x2, x3, and x4 are real numbers and z is a complex number such that

z = x1 + x2 i = x3 e^(i x4)

Then make-rectangular and make-polar return z, real-part returns x1, imag-part returns x2, magnitude returns x3, and angle returns x4. In the case of angle, whose value is not uniquely determined by the preceding rule, the value returned will be the one in the range minus pi (exclusive) to pi (inclusive).

conjugate returns the complex conjugate of z.
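As an illustration, the selectors can be applied to 3+4i. The name sketch-z is hypothetical and used only in this example.

```scheme
(define sketch-z (make-rectangular 3 4))   ; the number 3+4i

(real-part sketch-z)       ; ⇒ 3
(imag-part sketch-z)       ; ⇒ 4
(magnitude sketch-z)       ; ⇒ 5
(conjugate sketch-z)       ; ⇒ 3-4i

;; A negative real lies on the inclusive end of the angle range.
(angle -1)                 ; ≈ pi
```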

The procedures exact and inexact implement the natural one-to-one correspondence between exact and inexact integers throughout an implementation-dependent range.

procedure: inexact z
procedure: exact->inexact z

inexact returns an inexact representation of z. The value returned is the inexact number that is numerically closest to the argument. For inexact arguments, the result is the same as the argument. For exact complex numbers, the result is a complex number whose real and imaginary parts are the result of applying inexact to the real and imaginary parts of the argument, respectively. If an exact argument has no reasonably close inexact equivalent (in the sense of =), then a violation of an implementation restriction may be reported.

The procedure exact->inexact has been deprecated by R7RS.

procedure: exact z
procedure: inexact->exact z

exact returns an exact representation of z. The value returned is the exact number that is numerically closest to the argument. For exact arguments, the result is the same as the argument. For inexact non-integral real arguments, the implementation may return a rational approximation, or may report an implementation violation. For inexact complex arguments, the result is a complex number whose real and imaginary parts are the result of applying exact to the real and imaginary parts of the argument, respectively. If an inexact argument has no reasonably close exact equivalent (in the sense of =), then a violation of an implementation restriction may be reported.

The procedure inexact->exact has been deprecated by R7RS.
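The following sketch shows both conversions. It assumes inexact reals are IEEE 754 binary64 flonums, which is what determines the exact value of the constant 0.1.

```scheme
(inexact 1/4)        ; ⇒ 0.25
(exact .25)          ; ⇒ 1/4
(exact 3.0)          ; ⇒ 3

;; 0.1 has no finite binary expansion; exact reveals the ratio
;; that the flonum actually stores.
(exact 0.1)          ; ⇒ 3602879701896397/36028797018963968

;; Converting back recovers the original flonum.
(= (inexact (exact 0.1)) 0.1)    ; ⇒ #t
```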

procedure: copysign x1 x2

Returns a real number with the magnitude of x1 and the sign of x2.

(copysign 123 -1)              ⇒ -123
(copysign 0. -1)               ⇒ -0.
(copysign -0. 0.)              ⇒ 0.
(copysign -nan.123 0.)         ⇒ +nan.123

4.6 Numerical input and output

procedure: number->string number [radix]

Radix must be an exact integer, either 2, 8, 10, or 16. If omitted, radix defaults to 10. The procedure number->string takes a number and a radix and returns as a string an external representation of the given number in the given radix such that

(let ((number number)
      (radix radix))
  (eqv? number
        (string->number (number->string number radix)
                        radix)))

is true. It is an error if no possible result makes this expression true.

If number is inexact, the radix is 10, and the above expression can be satisfied by a result that contains a decimal point, then the result contains a decimal point and is expressed using the minimum number of digits (exclusive of exponent and trailing zeroes) needed to make the above expression true; otherwise the format of the result is unspecified.

The result returned by number->string never contains an explicit radix prefix.

Note: The error case can occur only when number is not a complex number or is a complex number with a non-rational real or imaginary part.

Rationale: If number is an inexact number represented using flonums, and the radix is 10, then the above expression is normally satisfied by a result containing a decimal point. The unspecified case allows for infinities, NaNs, and non-flonum representations.
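The round-trip property can be expressed as a small predicate. The name round-trips? is hypothetical, and the sample values are illustrative.

```scheme
(number->string 255)        ; ⇒ "255"
(number->string 10 2)       ; ⇒ "1010"
(number->string 1/3 2)      ; ⇒ "1/11"

;; The defining property from above, packaged as a procedure.
(define (round-trips? number radix)
  (eqv? number
        (string->number (number->string number radix)
                        radix)))

(round-trips? 255 16)       ; ⇒ #t
(round-trips? 0.1 10)       ; ⇒ #t
```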

variable: flonum-parser-fast?

This variable controls the behavior of string->number when parsing inexact numbers. Specifically, it allows the user to trade off accuracy against speed.

When set to its default value, #f, the parser provides maximal accuracy, as required by the Scheme standard. If set to #t, the parser uses faster algorithms that will sometimes introduce small errors in the result. The errors affect a few of the least-significant bits of the result, and consequently can be tolerated by many applications.

variable: flonum-unparser-cutoff

This variable is deprecated; use param:flonum-printer-cutoff instead.

parameter: param:flonum-printer-cutoff

This parameter controls the action of number->string when number is a flonum (and consequently controls all printing of flonums). This parameter may be called with an argument to set its value.

The value of this parameter is normally a list of three items:

rounding-type

One of the following symbols: normal, relative, or absolute. The symbol normal means that the number should be printed with full precision. The symbol relative means that the number should be rounded to a specific number of digits. The symbol absolute means that the number should be rounded so that there are a specific number of digits to the right of the decimal point.

precision

An exact integer. If rounding-type is normal, precision is ignored. If rounding-type is relative, precision must be positive, and it specifies the number of digits to which the printed representation will be rounded. If rounding-type is absolute, the printed representation will be rounded precision digits to the right of the decimal point; if precision is negative, the representation is rounded (- precision) digits to the left of the decimal point.

format-type

One of the symbols: normal, scientific, or engineering. This specifies the format in which the number will be printed.

scientific specifies that the number will be printed using scientific notation: x.xxxeyyy. In other words, the number is printed as a significand between zero inclusive and ten exclusive, and an exponent. engineering is like scientific, except that the exponent is always a multiple of three, and the significand is constrained to be between zero inclusive and 1000 exclusive.

If normal is specified, the number will be printed in positional notation if it is “small enough”, otherwise it is printed in scientific notation. A number is “small enough” when the number of digits that would be printed using positional notation does not exceed the number of digits of precision in the underlying floating-point number representation; IEEE 754-2008 binary64 floating-point numbers have 17 digits of precision.

This three-element list may be abbreviated in two ways. First, the symbol normal may be used, which is equivalent to the list (normal 0 normal). Second, the third element of the list, format-type, may be omitted, in which case it defaults to normal.

The default value for param:flonum-printer-cutoff is normal. If it is bound to a value different from those described here, number->string issues a warning and acts as though the value had been normal.

Some examples of param:flonum-printer-cutoff:

(number->string (* 4 (atan 1 1)))
                                    ⇒ "3.141592653589793"
(parameterize ((param:flonum-printer-cutoff '(relative 5)))
  (number->string (* 4 (atan 1 1))))
                                    ⇒ "3.1416"
(parameterize ((param:flonum-printer-cutoff '(relative 5)))
  (number->string (* 4000 (atan 1 1))))
                                    ⇒ "3141.6"
(parameterize ((param:flonum-printer-cutoff '(relative 5 scientific)))
  (number->string (* 4000 (atan 1 1))))
                                    ⇒ "3.1416e3"
(parameterize ((param:flonum-printer-cutoff '(relative 5 scientific)))
  (number->string (* 40000 (atan 1 1))))
                                    ⇒ "3.1416e4"
(parameterize ((param:flonum-printer-cutoff '(relative 5 engineering)))
  (number->string (* 40000 (atan 1 1))))
                                    ⇒ "31.416e3"
(parameterize ((param:flonum-printer-cutoff '(absolute 5)))
  (number->string (* 4 (atan 1 1))))
                                    ⇒ "3.14159"
(parameterize ((param:flonum-printer-cutoff '(absolute 5)))
  (number->string (* 4000 (atan 1 1))))
                                    ⇒ "3141.59265"
(parameterize ((param:flonum-printer-cutoff '(absolute -4)))
  (number->string (* 4e10 (atan 1 1))))
                                    ⇒ "31415930000."
(parameterize ((param:flonum-printer-cutoff '(absolute -4 scientific)))
  (number->string (* 4e10 (atan 1 1))))
                                    ⇒ "3.141593e10"
(parameterize ((param:flonum-printer-cutoff '(absolute -4 engineering)))
  (number->string (* 4e10 (atan 1 1))))
                                    ⇒ "31.41593e9"
(parameterize ((param:flonum-printer-cutoff '(absolute -5)))
  (number->string (* 4e10 (atan 1 1))))
                                    ⇒ "31415900000."
procedure: string->number string [radix]

Returns a number of the maximally precise representation expressed by the given string. Radix must be an exact integer, either 2, 8, 10, or 16. If supplied, radix is a default radix that may be overridden by an explicit radix prefix in string (e.g. "#o177"). If radix is not supplied, then the default radix is 10. If string is not a syntactically valid notation for a number, then string->number returns #f.

(string->number "100")        ⇒  100
(string->number "100" 16)     ⇒  256
(string->number "1e2")        ⇒  100.0
(string->number "15##")       ⇒  1500.0

Note that a numeric representation using a decimal point or an exponent marker is not recognized unless radix is 10.
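
Because failure is reported by returning #f rather than by signalling an error, a caller can supply a fallback with or. The following is a minimal sketch; parse-number-or is a hypothetical helper name, not part of MIT/GNU Scheme.

```scheme
;; Hypothetical helper: parse STR as a number in RADIX, or return
;; DEFAULT when string->number reports failure by returning #f.
(define (parse-number-or str radix default)
  (or (string->number str radix) default))

(parse-number-or "ff" 16 0)       ; ⇒ 255
(parse-number-or "#o177" 16 0)    ; ⇒ 127  (the #o prefix overrides radix 16)
(parse-number-or "xyz" 10 0)      ; ⇒ 0    (not a valid number)
```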



4.7 Fixnum and Flonum Operations

This section describes numerical operations that are restricted forms of the operations described above. These operations are useful because they compile very efficiently. However, care should be exercised: if used improperly, these operations can return incorrect answers, or even malformed objects that confuse the garbage collector.



4.7.1 Fixnum Operations

A fixnum is an exact integer that is small enough to fit in a machine word. In MIT/GNU Scheme, fixnums are typically 24 or 26 bits, depending on the machine; it is reasonable to assume that fixnums are at least 24 bits. Fixnums are signed; they are encoded using 2’s complement.

All exact integers that are small enough to be encoded as fixnums are always encoded as fixnums — in other words, any exact integer that is not a fixnum is too big to be encoded as such. For this reason, small constants such as 0 or 1 are guaranteed to be fixnums.

procedure: fix:fixnum? object

Returns #t if object is a fixnum; otherwise returns #f.

Here is an expression that determines the largest fixnum:

(let loop ((n 1))
  (if (fix:fixnum? n)
      (loop (* n 2))
      (- n 1)))

A similar expression determines the smallest fixnum.
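
One way to write that similar expression is sketched below. It relies on the 2's complement encoding stated above, under which the smallest fixnum is a negative power of two. The name my-smallest-fixnum is illustrative only.

```scheme
;; Walk downward through -1, -2, -4, ... until N stops being a fixnum.
;; The last fixnum visited is the smallest one, so halving the first
;; non-fixnum recovers it.
(define my-smallest-fixnum
  (let loop ((n -1))
    (if (fix:fixnum? n)
        (loop (* n 2))
        (quotient n 2))))
```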

procedure: fix:= fixnum fixnum
procedure: fix:< fixnum fixnum
procedure: fix:> fixnum fixnum
procedure: fix:<= fixnum fixnum
procedure: fix:>= fixnum fixnum

These are the standard order and equality predicates on fixnums. When compiled, they do not check the types of their arguments.

procedure: fix:zero? fixnum
procedure: fix:positive? fixnum
procedure: fix:negative? fixnum

These procedures compare their argument to zero. When compiled, they do not check the type of their argument. The code produced by the following expressions is identical:

(fix:zero? fixnum)
(fix:= fixnum 0)

Similarly, fix:positive? and fix:negative? produce code identical to equivalent expressions using fix:> and fix:<.

procedure: fix:+ fixnum fixnum
procedure: fix:- fixnum fixnum
procedure: fix:* fixnum fixnum
procedure: fix:quotient fixnum fixnum
procedure: fix:remainder fixnum fixnum
procedure: fix:gcd fixnum fixnum
procedure: fix:1+ fixnum
procedure: fix:-1+ fixnum

These procedures are the standard arithmetic operations on fixnums. When compiled, they do not check the types of their arguments. Furthermore, they do not check to see if the result can be encoded as a fixnum. If the result is too large to be encoded as a fixnum, a malformed object is returned, with potentially disastrous effect on the garbage collector.
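
When it is not known in advance that a result fits in a fixnum, one defensive option is to compute with the generic operation and test the result. The sketch below assumes that trading some speed for safety is acceptable; checked-fix:+ is a hypothetical name, not a built-in procedure.

```scheme
;; Generic + never yields a malformed object: it produces a larger exact
;; integer when the sum does not fit, which fix:fixnum? then detects.
(define (checked-fix:+ a b)
  (let ((sum (+ a b)))
    (if (fix:fixnum? sum)
        sum
        (error "Fixnum overflow:" a b))))

(checked-fix:+ 2 3)               ; ⇒ 5
```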

procedure: fix:divide fixnum fixnum

This procedure is like integer-divide, except that its arguments and its results must be fixnums. It should be used in conjunction with integer-divide-quotient and integer-divide-remainder.

The following are bitwise-logical operations on fixnums.

procedure: fix:not fixnum

This returns the bitwise-logical inverse of its argument. When compiled, it does not check the type of its argument.

(fix:not 0)                             ⇒  -1
(fix:not -1)                            ⇒  0
(fix:not 1)                             ⇒  -2
(fix:not -34)                           ⇒  33
procedure: fix:and fixnum fixnum

This returns the bitwise-logical “and” of its arguments. When compiled, it does not check the types of its arguments.

(fix:and #x43 #x0f)                     ⇒  3
(fix:and #x43 #xf0)                     ⇒  #x40
procedure: fix:andc fixnum fixnum

Returns the bitwise-logical “and” of the first argument with the bitwise-logical inverse of the second argument. When compiled, it does not check the types of its arguments.

(fix:andc #x43 #x0f)                    ⇒  #x40
(fix:andc #x43 #xf0)                    ⇒  3
procedure: fix:or fixnum fixnum

This returns the bitwise-logical “inclusive or” of its arguments. When compiled, it does not check the types of its arguments.

(fix:or #x40 3)                         ⇒ #x43
(fix:or #x41 3)                         ⇒ #x43
procedure: fix:xor fixnum fixnum

This returns the bitwise-logical “exclusive or” of its arguments. When compiled, it does not check the types of its arguments.

(fix:xor #x40 3)                        ⇒ #x43
(fix:xor #x41 3)                        ⇒ #x42
procedure: fix:lsh fixnum1 fixnum2

This procedure returns the result of logically shifting fixnum1 by fixnum2 bits. If fixnum2 is positive, fixnum1 is shifted left; if negative, it is shifted right. When compiled, it does not check the types of its arguments, nor the validity of its result.

(fix:lsh 1 10)                          ⇒  #x400
(fix:lsh #x432 -10)                     ⇒  1
(fix:lsh -1 3)                          ⇒  -8
(fix:lsh -128 -4)                       ⇒  #x3FFFF8


4.7.2 Flonum Operations

A flonum is an inexact real number that is implemented as a floating-point number. In MIT/GNU Scheme, all inexact real numbers are flonums. For this reason, constants such as 0. and 2.3 are guaranteed to be flonums.

MIT/GNU Scheme follows the IEEE 754-2008 floating-point standard, using binary64 arithmetic for flonums. All floating-point values are classified into:

normal

Numbers of the form

r^e (1 + f/r^p)

where r, the radix, is a positive integer, here always 2; p, the precision, is a positive integer, here always 53; e, the exponent, is an integer within a limited range, here always -1022 to 1023 (inclusive); and f, the fractional part of the significand, is a (p-1)-bit unsigned integer.

subnormal

Fixed-point numbers near zero that allow for gradual underflow. Every subnormal number is an integer multiple of the smallest subnormal number. Subnormals were also historically called “denormal”.

zero

There are two distinguished zero values, one with “negative” sign bit and one with “positive” sign bit.

The two zero values are considered numerically equal, but serve to distinguish paths converging to zero along different branch cuts and so some operations yield different results for differently signed zero values.

infinity

There are two distinguished infinity values, negative infinity or -inf.0 and positive infinity or +inf.0, representing overflow on the real line.

NaN

There are 4 r^{p-2} - 2 distinguished not-a-number values, representing invalid operations or uninitialized data, distinguished by their negative/positive sign bit, a quiet/signalling bit, and a (p-2)-digit unsigned integer payload which must not be zero for signalling NaNs.

Arithmetic on quiet NaNs propagates them without raising any floating-point exceptions. In contrast, arithmetic on signalling NaNs raises the floating-point invalid-operation exception. Quiet NaNs are written +nan.123, -nan.0, etc. Signalling NaNs are written +snan.123, -snan.1, etc. The notation +snan.0 and -snan.0 is not allowed: what would be the encoding for them actually means +inf.0 and -inf.0.

procedure: flo:flonum? object

Returns #t if object is a flonum; otherwise returns #f.

procedure: flo:= flonum1 flonum2
procedure: flo:< flonum1 flonum2
procedure: flo:<= flonum1 flonum2
procedure: flo:> flonum1 flonum2
procedure: flo:>= flonum1 flonum2
procedure: flo:<> flonum1 flonum2

These procedures are the standard order and equality predicates on flonums. When compiled, they do not check the types of their arguments. These predicates raise floating-point invalid-operation exceptions on NaN arguments; in other words, they are “ordered comparisons”. When floating-point exception traps are disabled, they return false when any argument is NaN.

Every pair of floating-point numbers — excluding NaN — exhibits ordered trichotomy: they are related either by flo:=, flo:<, or flo:>.

procedure: flo:safe= flonum1 flonum2
procedure: flo:safe< flonum1 flonum2
procedure: flo:safe<= flonum1 flonum2
procedure: flo:safe> flonum1 flonum2
procedure: flo:safe>= flonum1 flonum2
procedure: flo:safe<> flonum1 flonum2
procedure: flo:unordered? flonum1 flonum2

These procedures are the standard order and equality predicates on flonums. When compiled, they do not check the types of their arguments. These predicates do not raise floating-point exceptions, and simply return false on NaN arguments, except flo:unordered? which returns true iff at least one argument is NaN; in other words, they are “unordered comparisons”.

Every pair of floating-point values — including NaN — exhibits unordered tetrachotomy: they are related either by flo:safe=, flo:safe<, flo:safe>, or flo:unordered?.
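
The tetrachotomy can be made concrete with a small dispatcher. This is a sketch only; flonum-relation and the symbols it returns are hypothetical names chosen for illustration.

```scheme
;; Name the one relation that holds between X and Y.  Only the
;; non-raising predicates are used, so NaN arguments are safe.
(define (flonum-relation x y)
  (cond ((flo:safe= x y) 'equal)
        ((flo:safe< x y) 'less)
        ((flo:safe> x y) 'greater)
        (else 'unordered)))       ; here (flo:unordered? x y) is true

(flonum-relation 1. 2.)           ; ⇒ less
(flonum-relation 0. -0.)          ; ⇒ equal
(flonum-relation +nan.0 1.)       ; ⇒ unordered
```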

procedure: flo:zero? flonum
procedure: flo:positive? flonum
procedure: flo:negative? flonum

Each of these procedures compares its argument to zero. When compiled, they do not check the type of their argument. These predicates raise floating-point invalid-operation exceptions on NaN arguments; in other words, they are “ordered comparisons”.

(flo:zero? -0.)                ⇒ #t
(flo:negative? -0.)            ⇒ #f
(flo:negative? -1.)            ⇒ #t

(flo:zero? 0.)                 ⇒ #t
(flo:positive? 0.)             ⇒ #f
(flo:positive? 1.)             ⇒ #t

(flo:zero? +nan.123)           ⇒ #f  ; (raises invalid-operation)
procedure: flo:normal? flonum
procedure: flo:subnormal? flonum
procedure: flo:safe-zero? flonum
procedure: flo:infinite? flonum
procedure: flo:nan? flonum

Floating-point classification predicates. For any flonum, exactly one of these predicates returns true. These predicates never raise floating-point exceptions.

(flo:normal? 1.23)             ⇒ #t
(flo:subnormal? 4e-324)        ⇒ #t
(flo:safe-zero? -0.)           ⇒ #t
(flo:infinite? +inf.0)         ⇒ #t
(flo:nan? -nan.123)            ⇒ #t
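
The claim that exactly one predicate holds can be checked directly. The sketch below assumes the predicates can be passed around as ordinary procedures; classification-count is a hypothetical helper.

```scheme
;; Count how many of the five classification predicates accept X.
(define (classification-count x)
  (let ((predicates
         (list flo:normal? flo:subnormal? flo:safe-zero?
               flo:infinite? flo:nan?)))
    (length (filter (lambda (p) (p x)) predicates))))

(map classification-count (list 1.23 4e-324 -0. +inf.0 +nan.0))
                                  ; ⇒ (1 1 1 1 1)
```
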
procedure: flo:finite? flonum

Equivalent to:

(or (flo:safe-zero? flonum)
    (flo:subnormal? flonum)
    (flo:normal? flonum))
; or
(and (not (flo:infinite? flonum))
     (not (flo:nan? flonum)))

True for normal, subnormal, and zero floating-point values; false for infinity and NaN.

procedure: flo:classify flonum

Returns a symbol representing the classification of the flonum, one of normal, subnormal, zero, infinity, or nan.

procedure: flo:sign-negative? flonum

Returns true if the sign bit of flonum is negative, and false otherwise. Never raises a floating-point exception.

(flo:sign-negative? +0.)       ⇒ #f
(flo:sign-negative? -0.)       ⇒ #t
(flo:sign-negative? -1.)       ⇒ #t
(flo:sign-negative? +inf.0)    ⇒ #f
(flo:sign-negative? +nan.123)  ⇒ #f

(flo:negative? -0.)            ⇒ #f
(flo:negative? +nan.123)       ⇒ #f  ; (raises invalid-operation)
procedure: flo:+ flonum1 flonum2
procedure: flo:- flonum1 flonum2
procedure: flo:* flonum1 flonum2
procedure: flo:/ flonum1 flonum2

These procedures are the standard arithmetic operations on flonums. When compiled, they do not check the types of their arguments.

procedure: flo:*+ flonum1 flonum2 flonum3
procedure: flo:fma flonum1 flonum2 flonum3
procedure: flo:fast-fma?

Fused multiply-add: (flo:*+ u v a) computes uv+a correctly rounded, with no intermediate overflow or underflow arising from uv. In contrast, (flo:+ (flo:* u v) a) may have two rounding errors, and can overflow or underflow if uv is too large or too small even if uv + a is normal. Flo:fma is an alias for flo:*+ with the more familiar name used in other languages like C.

Flo:fast-fma? returns true if the implementation of fused multiply-add is supported by fast hardware, and false if it is emulated using Dekker’s double-precision algorithm in software.

(flo:+ (flo:* 1.2e100 2e208) -1.4e308)
                               ⇒ +inf.0  ; (raises overflow)
(flo:*+ 1.2e100 2e208  -1.4e308)
                               ⇒ 1e308
procedure: flo:negate flonum

This procedure returns the negation of its argument. When compiled, it does not check the type of its argument.

This is not equivalent to (flo:- 0. flonum):

(flo:negate 1.2)               ⇒ -1.2
(flo:negate -nan.123)          ⇒ +nan.123
(flo:negate +inf.0)            ⇒ -inf.0
(flo:negate 0.)                ⇒ -0.
(flo:negate -0.)               ⇒ 0.

(flo:- 0. 1.2)                 ⇒ -1.2
(flo:- 0. -nan.123)            ⇒ -nan.123
(flo:- 0. +inf.0)              ⇒ -inf.0
(flo:- 0. 0.)                  ⇒ 0.
(flo:- 0. -0.)                 ⇒ 0.
procedure: flo:abs flonum
procedure: flo:exp flonum
procedure: flo:log flonum
procedure: flo:sin flonum
procedure: flo:cos flonum
procedure: flo:tan flonum
procedure: flo:asin flonum
procedure: flo:acos flonum
procedure: flo:atan flonum
procedure: flo:sinh flonum
procedure: flo:cosh flonum
procedure: flo:tanh flonum
procedure: flo:asinh flonum
procedure: flo:acosh flonum
procedure: flo:atanh flonum
procedure: flo:sqrt flonum
procedure: flo:cbrt flonum
procedure: flo:expt flonum1 flonum2
procedure: flo:erf flonum
procedure: flo:erfc flonum
procedure: flo:hypot flonum1 flonum2
procedure: flo:j0 flonum
procedure: flo:j1 flonum
procedure: flo:jn flonum
procedure: flo:y0 flonum
procedure: flo:y1 flonum
procedure: flo:yn flonum
procedure: flo:gamma flonum
procedure: flo:lgamma flonum
procedure: flo:floor flonum
procedure: flo:ceiling flonum
procedure: flo:truncate flonum
procedure: flo:round flonum
procedure: flo:floor->exact flonum
procedure: flo:ceiling->exact flonum
procedure: flo:truncate->exact flonum
procedure: flo:round->exact flonum

These procedures are flonum versions of the corresponding procedures. When compiled, they do not check the types of their arguments.

procedure: flo:expm1 flonum
procedure: flo:log1p flonum

Flonum versions of expm1 and log1p with restricted domains: flo:expm1 is defined only on inputs bounded below log(2) in magnitude, and flo:log1p is defined only on inputs bounded below 1 - sqrt(1/2) in magnitude. Callers must use (- (flo:exp x) 1) or (flo:log (+ 1 x)) outside these ranges.
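
A caller that cannot guarantee the input range might dispatch on the magnitude of the argument. The following is a minimal sketch of that dispatch. The names expm1-any and log1p-any are hypothetical, and NaN inputs are not handled specially.

```scheme
;; Use flo:expm1 only where it is defined, i.e. |x| < log(2).
(define (expm1-any x)
  (if (flo:< (flo:abs x) (flo:log 2.))
      (flo:expm1 x)
      (flo:- (flo:exp x) 1.)))

;; Use flo:log1p only where it is defined, i.e. |x| < 1 - sqrt(1/2).
(define (log1p-any x)
  (if (flo:< (flo:abs x) (flo:- 1. (flo:sqrt .5)))
      (flo:log1p x)
      (flo:log (flo:+ 1. x))))

(expm1-any 0.)                    ; ⇒ 0.
(log1p-any 0.)                    ; ⇒ 0.
```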

procedure: flo:atan2 flonum1 flonum2

This is the flonum version of atan with two arguments. When compiled, it does not check the types of its arguments.

procedure: flo:signed-lgamma x

Returns two values,

m = log(|Gamma(x)|)

and

s = sign(Gamma(x)),

respectively a flonum and an exact integer either -1 or 1, so that

Gamma(x) = s * e^m.
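
The two values can be received with call-with-values and recombined as in the identity above. This is a sketch for illustration; gamma-from-signed-lgamma is a hypothetical name, and for large x the exponential overflows even though m itself is representable.

```scheme
(define (gamma-from-signed-lgamma x)
  (call-with-values (lambda () (flo:signed-lgamma x))
    (lambda (m s)
      ;; s is an exact -1 or 1; convert it before flonum multiplication.
      (flo:* (exact->inexact s) (flo:exp m)))))

(gamma-from-signed-lgamma 5.)     ; approximately 24.
(gamma-from-signed-lgamma -.5)    ; approximately -3.5449
```
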
procedure: flo:min x1 x2
procedure: flo:max x1 x2

Returns the min or max of two floating-point numbers. If either argument is NaN, raises the floating-point invalid-operation exception and returns the other one if it is not NaN, or the first argument if they are both NaN.

procedure: flo:min-mag x1 x2
procedure: flo:max-mag x1 x2

Returns the argument that has the smallest or largest magnitude, as in minNumMag or maxNumMag of IEEE 754-2008. If either argument is NaN, raises the floating-point invalid-operation exception and returns the other one if it is not NaN, or the first argument if they are both NaN.

procedure: flo:ldexp x1 x2
procedure: flo:scalbn x1 x2

Flo:ldexp scales by a power of two; flo:scalbn scales by a power of the floating-point radix.

ldexp x e := x * 2^e,
scalbn x e := x * r^e.

In MIT/GNU Scheme, these procedures are the same; they are both provided to make it clearer which operation is meant.

procedure: flo:logb x

For nonzero finite x, returns floor(log(x)/log(r)) as an exact integer, where r is the floating-point radix.

For all other inputs, raises invalid-operation and returns #f.
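
Together, flo:logb and flo:ldexp can split a number into a significand and an exponent. The sketch below assumes that x is a nonzero finite flonum and that flo:ldexp accepts an exact integer as its second argument; flonum-decompose is a hypothetical helper.

```scheme
;; Return (significand . exponent) with x = significand * 2^exponent
;; and 1 <= |significand| < 2.
(define (flonum-decompose x)
  (let ((e (flo:logb x)))
    (cons (flo:ldexp x (- e)) e)))

(flonum-decompose 10.)            ; ⇒ (1.25 . 3)
(flonum-decompose .375)           ; ⇒ (1.5 . -2)
```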

procedure: flo:nextafter x1 x2

Returns the next floating-point number after x1 in the direction of x2.

(flo:nextafter 0. -1.)         ⇒ -4.9406564584124654e-324
procedure: flo:copysign x1 x2

Returns a floating-point number with the magnitude of x1 and the sign of x2.

(flo:copysign 123. 456.)       ⇒ 123.
(flo:copysign +inf.0 -1.)      ⇒ -inf.0
(flo:copysign 0. -1.)          ⇒ -0.
(flo:copysign -0. 0.)          ⇒ 0.
(flo:copysign -nan.123 0.)     ⇒ +nan.123
constant: flo:radix
constant: flo:radix.
constant: flo:precision

Floating-point system parameters. Flo:radix is the floating-point radix as an integer, and flo:precision is the floating-point precision as an integer; flo:radix. is the floating-point radix as a flonum.

constant: flo:error-bound
constant: flo:log-error-bound
constant: flo:ulp-of-one
constant: flo:log-ulp-of-one

Flo:error-bound, sometimes called the machine epsilon, is the maximum relative error of rounding to nearest:

max |x - fl(x)|/|x| = 1/(2 r^(p-1)),

where r is the floating-point radix and p is the floating-point precision.

Flo:ulp-of-one is the distance from 1 to the next larger floating-point number, and is equal to 1/r^{p-1}.

Flo:error-bound is half flo:ulp-of-one.

Flo:log-error-bound is the logarithm of flo:error-bound, and flo:log-ulp-of-one is the logarithm of flo:ulp-of-one.

procedure: flo:ulp flonum

Returns the distance from flonum to the next floating-point number larger in magnitude with the same sign. For zero, this returns the smallest subnormal. For infinities, this returns positive infinity. For NaN, this returns the same NaN.

(flo:ulp 1.)                    ⇒ 2.220446049250313e-16
(= (flo:ulp 1.) flo:ulp-of-one) ⇒ #t
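
One possible use of flo:ulp is a closeness test measured in units in the last place. This is a sketch under the assumption that both arguments are finite and not NaN; within-ulps? is a hypothetical name.

```scheme
;; True if X and Y differ by at most N ulps, measured at whichever
;; argument has the larger ulp.  N is an exact nonnegative integer.
(define (within-ulps? x y n)
  (flo:<= (flo:abs (flo:- x y))
          (flo:* (exact->inexact n)
                 (flo:max (flo:ulp x) (flo:ulp y)))))

(within-ulps? 1. (flo:nextafter 1. 2.) 1)    ; ⇒ #t
(within-ulps? 1. 1.000001 4)                 ; ⇒ #f
```
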
constant: flo:normal-exponent-max
constant: flo:normal-exponent-min
constant: flo:subnormal-exponent-min

Largest and smallest integer exponents of the radix in normal floating-point numbers, and the smallest integer exponent of the radix in subnormal floating-point numbers.

constant: flo:largest-positive-normal
constant: flo:smallest-positive-normal
constant: flo:smallest-positive-subnormal

The largest positive normal, smallest positive normal, and smallest positive subnormal floating-point numbers, respectively.

constant: flo:greatest-normal-exponent-base-e
constant: flo:greatest-normal-exponent-base-2
constant: flo:greatest-normal-exponent-base-10
constant: flo:least-normal-exponent-base-e
constant: flo:least-normal-exponent-base-2
constant: flo:least-normal-exponent-base-10
constant: flo:least-subnormal-exponent-base-e
constant: flo:least-subnormal-exponent-base-2
constant: flo:least-subnormal-exponent-base-10

Least and greatest exponents of normal and subnormal floating-point numbers, as floating-point numbers. For example, flo:greatest-normal-exponent-base-2 is the greatest floating-point number such that (expt 2. flo:greatest-normal-exponent-base-2) does not overflow and is a normal floating-point number.

procedure: flo:total< x1 x2
procedure: flo:total-mag< x1 x2
procedure: flo:total-order x1 x2
procedure: flo:total-order-mag x1 x2

These procedures implement the IEEE 754-2008 total ordering on floating-point values and their magnitudes. Here the “magnitude” of a floating-point value is a floating-point value with positive sign bit and everything else the same; e.g., +nan.123 is the “magnitude” of -nan.123 and 0.0 is the “magnitude” of -0.0.

The total ordering has little to no numerical meaning and should be used only when an arbitrary choice of total ordering is required for some non-numerical reason.
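
One such non-numerical reason is putting a list of flonums that may contain NaNs or signed zeros into a deterministic order. The sketch below assumes that flo:total< is a strict "less than" under the total ordering, with negative zero before positive zero and positive NaN after positive infinity; sort-flonums is a hypothetical helper.

```scheme
(define (sort-flonums flonums)
  (sort flonums flo:total<))

(sort-flonums (list 1. +nan.0 0. -inf.0 -0.))
                                  ; ⇒ (-inf.0 -0. 0. 1. +nan.0)
```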

procedure: flo:make-nan negative? quiet? payload
procedure: flo:nan-quiet? nan
procedure: flo:nan-payload nan

Flo:make-nan creates a NaN given the sign bit, quiet bit, and payload. Negative? and quiet? must be booleans, and payload must be an unsigned (p-2)-bit integer, where p is the floating-point precision. If quiet? is false, payload must be nonzero.

(flo:sign-negative? (flo:make-nan negative? quiet? payload))
                               ⇒ negative?
(flo:nan-quiet? (flo:make-nan negative? quiet? payload))
                               ⇒ quiet?
(flo:nan-payload (flo:make-nan negative? quiet? payload))
                               ⇒ payload

(flo:make-nan #t #f 42)        ⇒ -snan.42
(flo:sign-negative? +nan.123)  ⇒ #f
(flo:nan-quiet? +nan.123)      ⇒ #t
(flo:nan-payload +nan.123)     ⇒ 123


4.7.3 Floating-Point Environment

The IEEE 754-2008 computation model includes a persistent rounding mode, exception flags, and exception-handling modes. In MIT/GNU Scheme, the floating-point environment is per-thread. However, because saving and restoring the floating-point environment is expensive, it is maintained only for those threads that have touched the floating-point environment explicitly, for example by calling a procedure that is documented as entering a per-thread environment, such as flo:clear-exceptions! or flo:set-rounding-mode!.

The default environment is as in IEEE 754-2008: no exceptions are trapped, and rounding is to nearest with ties broken to even. The set of exception flags in the default environment is indeterminate — callers must enter a per-thread environment, e.g. by calling flo:clear-exceptions!, before acting on the exception flags. Like the default environment, a per-thread environment initially has no exceptions trapped and rounds to nearest with ties to even.

A floating-point environment descriptor is a machine-dependent object representing the IEEE 754-2008 floating-point rounding mode, exception flags, and exception-handling mode. Users should not inspect a floating-point environment descriptor other than to use it with the procedures here; its representation may vary from system to system.

procedure: flo:default-environment

Returns a descriptor for the default environment, with no exceptions trapped and round-to-nearest/ties-to-even.

procedure: flo:environment
procedure: flo:set-environment! floenv
procedure: flo:update-environment! floenv

Flo:environment returns a descriptor for the current floating-point environment. Flo:set-environment! replaces the current floating-point environment by floenv. Flo:update-environment! does likewise, but re-raises any exceptions that were already raised in the current floating-point environment, which may cause a trap if floenv also traps them.

Flo:update-environment! is usually used together with flo:defer-exception-traps! to defer potentially trapping on exceptions in a large intermediate computation until the end.

procedure: flo:preserving-environment thunk

Saves the current floating-point environment if any and calls thunk. On exit from thunk, including non-local exit, saves thunk’s floating-point environment and restores the original floating-point environment as if with flo:set-environment!. On re-entry into thunk, restores thunk’s floating-point environment.

Note: Flo:preserving-environment does not enter a per-thread environment. If the current thread is in the default environment, the exception flags are indeterminate, and remain so inside flo:preserving-environment. Callers interested in using the exception flags should start inside flo:preserving-environment by clearing them with flo:clear-exceptions!.
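
The recommended pattern of clearing the flags first and testing them afterwards can be packaged as a helper. This is a minimal sketch; call-noting-exceptions is a hypothetical name, and the set of names reported depends on which exceptions the machine supports.

```scheme
;; Call THUNK with all exception flags cleared, and return a pair of
;; its value and the list of names of the exceptions it raised.
(define (call-noting-exceptions thunk)
  (flo:preserving-environment
   (lambda ()
     (flo:clear-exceptions! (flo:supported-exceptions))
     (let ((value (thunk)))
       (cons value
             (flo:exceptions->names
              (flo:test-exceptions (flo:supported-exceptions))))))))

;; identity-procedure keeps the division from being folded at compile time.
(call-noting-exceptions
 (lambda () (flo:/ (identity-procedure 1.) 0.)))
```

On a machine that supports the divide-by-zero flag, the list in the result would be expected to contain divide-by-zero.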



4.7.4 Floating-Point Exceptions

In IEEE 754-2008, floating-point operations such as arithmetic may raise exceptions. This sets a flag in the floating-point environment that is maintained until it is cleared. Many machines can also be configured to trap on exceptions, which in Scheme leads to signalling a condition. (Not all CPUs support trapping exceptions — for example, most ARMv8 CPUs do not.) In the default environment, no exceptions are trapped.

Floating-point exceptions and sets of floating-point exceptions are represented by small integers, whose interpretation is machine-dependent — for example, the invalid-operation exception may be represented differently on PowerPC and AMD x86-64 CPUs. The number for a floating-point exception is the same as the number for a set of exceptions containing only that one; the bitwise-AND of two sets is their intersection, the bitwise-IOR is their union, etc. The procedures flo:exceptions->names and flo:names->exceptions convert between machine-dependent integer representations and machine-independent lists of human-readable symbols.

The following exceptions are recognized by MIT/GNU Scheme:

inexact-result

Raised when the result of a floating-point computation is not a floating-point number and therefore must be rounded.

The inexact-result exception is never trappable in MIT/GNU Scheme.

underflow

Raised when the result of a floating-point computation is too small in magnitude to be represented by a normal floating-point number, and is therefore rounded to a subnormal or zero.

overflow

Raised when the result of a floating-point computation is too large in magnitude to be represented by a floating-point number, and is therefore rounded to infinity.

divide-by-zero

Raised on division of a nonzero finite real number by a zero real number, on logarithm of zero, or on any other operation that, like division by a divisor approaching zero, has an unbounded limit at a point.

invalid-operation

Raised when the input to a floating-point computation is nonsensical, such as division of zero by zero, or real logarithm of a negative number. The result of an invalid-operation is a NaN. Also raised when the input to a floating-point operation is a signalling NaN, but not for a quiet NaN.

subnormal-operand

Raised when an operand in a floating-point operation is subnormal.

(This is not a standard IEEE 754-2008 exception. It is supported by Intel CPUs.)

procedure: flo:supported-exceptions

Returns the set of exceptions that are supported on the current machine.

procedure: flo:exception:divide-by-zero
procedure: flo:exception:inexact-result
procedure: flo:exception:invalid-operation
procedure: flo:exception:overflow
procedure: flo:exception:subnormal-operand
procedure: flo:exception:underflow

Returns the specified floating-point exception number. On machines that do not support a particular exception, the corresponding procedure simply returns 0.

procedure: flo:exceptions->names excepts
procedure: flo:names->exceptions list

These procedures convert between a machine-dependent small integer representation of a set of exceptions, and a representation of a set of exceptions by a list of human-readable symbols naming them.

(flo:preserving-environment
 (lambda ()
   (flo:clear-exceptions! (flo:supported-exceptions))
   (flo:/ (identity-procedure 1.) 0.)
   (flo:exceptions->names
    (flo:test-exceptions (flo:supported-exceptions)))))
                               ⇒ (divide-by-zero)
procedure: flo:test-exceptions excepts

Returns the set of exceptions in excepts that are currently raised.

In the default environment, the result is indeterminate, and may be affected by floating-point operations in other threads.

procedure: flo:clear-exceptions! excepts
procedure: flo:raise-exceptions! excepts

Clears or raises the exceptions in excepts, entering a per-thread environment. Other exceptions are unaffected.

procedure: flo:save-exception-flags
procedure: flo:restore-exception-flags! exceptflags
procedure: flo:test-exception-flags exceptflags excepts

Flo:save-exception-flags returns a machine-dependent representation of the currently trapped and raised exceptions. Flo:restore-exception-flags! restores it, entering a per-thread environment. Flo:test-exception-flags returns the set of exceptions in excepts that are raised in exceptflags.

Exceptflags is not the same as a set of exceptions. It is opaque and machine-dependent and should not be used except with flo:restore-exception-flags! and flo:test-exception-flags.

Bug: Flo:test-exception-flags is unimplemented.

procedure: flo:have-trap-enable/disable?

Returns true if trapping floating-point exceptions is supported on this machine.

procedure: flo:default-trapped-exceptions

Returns the set of exceptions that are trapped in the default floating-point environment. Equivalent to (flo:names->exceptions '()), or simply 0, since by default, no exceptions are trapped.

procedure: flo:trapped-exceptions

Returns the set of exceptions that are currently trapped.

procedure: flo:trap-exceptions! excepts
procedure: flo:untrap-exceptions! excepts
procedure: flo:set-trapped-exceptions! excepts

Flo:trap-exceptions! requests that any exceptions in the set excepts be trapped, in addition to all of the ones that are currently trapped. Flo:untrap-exceptions! requests that any exceptions in the set excepts not be trapped. Flo:set-trapped-exceptions! replaces the set of trapped exceptions altogether by excepts. All three procedures enter a per-thread environment.

(define (flo:trap-exceptions! excepts)
  (flo:set-trapped-exceptions!
   (fix:or (flo:trapped-exceptions) excepts)))

(define (flo:untrap-exceptions! excepts)
  (flo:set-trapped-exceptions!
   (fix:andc (flo:trapped-exceptions) excepts)))

(define (flo:set-trapped-exceptions! excepts)
  (flo:trap-exceptions! excepts)
  (flo:untrap-exceptions!
   (fix:andc (flo:supported-exceptions) excepts)))
procedure: flo:with-exceptions-trapped excepts thunk
procedure: flo:with-exceptions-untrapped excepts thunk
procedure: flo:with-trapped-exceptions excepts thunk

Dynamic-extent analogues of flo:trap-exceptions!, flo:untrap-exceptions!, and flo:set-trapped-exceptions!. These call thunk with their respective changes to the set of trapped exceptions in a per-thread environment, and restore the environment on return or non-local exit.
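
The following sketch shows one way a trapped exception might be turned into an ordinary return value. It assumes that a trapped exception is signalled as a condition that with-exception-handler can intercept; divide-or-trapped is a hypothetical helper, and its behaviour on machines where flo:have-trap-enable/disable? returns false is not specified here.

```scheme
;; Return the quotient of X and Y, or the symbol TRAPPED if the
;; division signalled a condition while divide-by-zero was trapped.
(define (divide-or-trapped x y)
  (call-with-current-continuation
   (lambda (k)
     (with-exception-handler
      (lambda (condition) (k 'trapped))
      (lambda ()
        (flo:with-exceptions-trapped (flo:exception:divide-by-zero)
          (lambda () (flo:/ x y))))))))

(divide-or-trapped 6. 3.)         ; ⇒ 2.
(divide-or-trapped 1. 0.)         ; trapped, where traps are supported
```

The non-local exit through k leaves the dynamic extent of flo:with-exceptions-trapped, so the previous set of trapped exceptions is restored.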

procedure: flo:defer-exception-traps!

Saves the current floating-point environment, clears all raised exceptions, disables all exception traps, and returns a descriptor for the saved floating-point environment.

Flo:defer-exception-traps! is typically used together with flo:update-environment!, to trap any exceptions that the caller had wanted trapped only after a long intermediate computation. This pattern is captured in flo:deferring-exception-traps.

procedure: flo:deferring-exception-traps thunk

Calls thunk, but defers trapping on any exceptions it raises until it returns. Equivalent to:

(flo:preserving-environment
 (lambda ()
   (let ((environment (flo:defer-exception-traps!)))
     (begin0 (thunk)
       (flo:update-environment! environment)))))
procedure: flo:ignoring-exception-traps thunk

Calls thunk with all exceptions untrapped and unraised. Equivalent to:

(flo:preserving-environment
 (lambda ()
   (flo:defer-exception-traps!)
   (thunk)))


4.7.5 Floating-Point Rounding Mode

IEEE 754-2008 supports four rounding modes, which determine the answer given by a floating-point computation when the exact result lies between two floating-point numbers but is not a floating-point number itself:

to-nearest

Round to the nearest floating-point number. If there are two equidistant ones, choose the one whose least significant digit is even. Also known as “round-to-nearest/ties-to-even”.

toward-zero

Round to the floating-point number closest to zero.

downward

Round to the greatest floating-point number below.

upward

Round to the least floating-point number above.

Warning: Not all procedures in MIT/GNU Scheme respect the rounding mode. Only the basic arithmetic operations — +, -, *, /, and sqrt — will reliably respect it. The main purpose of changing the rounding mode is to diagnose numerical instability by injecting small perturbations throughout the computation.

Bug: It would be nice if we had “round-to-odd”, where any inexact result is rounded to the nearest odd floating-point number, for implementing “doubled”-precision algorithms. But we don’t. Sorry.

procedure: flo:default-rounding-mode

Returns a symbol for the default rounding mode, which is always to-nearest.

procedure: flo:rounding-modes

Returns a list of the supported rounding modes as symbols.

procedure: flo:rounding-mode
procedure: flo:set-rounding-mode! mode

Gets or sets the current rounding mode as a symbol, entering a per-thread environment.

procedure: flo:with-rounding-mode mode thunk

Call thunk in a per-thread environment with the rounding mode set to mode. On return, the floating-point environment, including rounding mode, is restored to what it was before.

Non-local exit from and re-entrance to thunk behaves as if the call is surrounded by flo:preserving-environment (see Floating-Point Environment).
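
As a rough illustration of the perturbation technique mentioned in the warning above, the following sketch evaluates the same quotient under two directed rounding modes. The helper name quotient-rounded is hypothetical, and the comparison assumes a platform on which the rounding mode can actually be changed.

```scheme
;; Hypothetical helper: divide X by Y with the rounding mode set to MODE.
;; The operands are passed as arguments so the division happens at run
;; time, inside the dynamic extent of flo:with-rounding-mode.
(define (quotient-rounded mode x y)
  (flo:with-rounding-mode mode
    (lambda () (/ x y))))

(define rounded-down (quotient-rounded 'downward 1. 3.))
(define rounded-up   (quotient-rounded 'upward 1. 3.))

;; 1/3 is not representable, so the two results should differ.
(< rounded-down rounded-up)
```

A result whose value changes substantially between rounding modes is a hint that the computation is numerically unstable.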


4.8 Random Number Generation

MIT/GNU Scheme provides a facility for random number generation. The current implementation uses the ChaCha stream cipher, reseeding itself at each request so that past outputs cannot be distinguished from uniform random even if the state of memory is compromised in the future.

The interface described here is a mixture of the Common Lisp and SRFI 27 systems.

procedure: random m [state]

The argument m must be either an exact positive integer, or an inexact positive real.

If state is given and not #f, it must be a random-state object; otherwise, it defaults to the default-random-source. This object is used to maintain the state of the pseudorandom number generator and is altered as a side effect of the random procedure.

Use of the default random state requires synchronization between threads, so it is better for multithreaded programs to use explicit states.

(random 1.0)    ⇒ .32744744667719056
(random 1.0)    ⇒ .01668326768172354
(random 10)     ⇒ 3
(random 10)     ⇒ 8
(random 100)    ⇒ 38
(random 100)    ⇒ 63

procedure: flo:random-unit-closed state
procedure: flo:random-unit-open state

State must be a random-state object. Flo:random-unit-closed returns a flonum in the closed interval [0,1] with uniform distribution. In practical terms, the result is in the half-closed interval (0,1] because the probability of returning 0 is 2^{-1075}, far below the standard probability 2^{-128} that means “never” in cryptographic engineering terms.

Flo:random-unit-open is like flo:random-unit-closed, but it explicitly rejects 0.0 and 1.0 as outputs, so that the result is a floating-point number in the open interval (0,1). (flo:random-unit-open) is equivalent to (random 1.), except that it is faster.

Callers should generally use flo:random-unit-closed, because for the uniform distribution on the interval [0,1] of real numbers, the probability of a real number that is rounded to the floating-point 1.0 is the small but nonnegligible 2^{-54}, and arithmetic downstream should be prepared to handle results that are rounded to 1.0 much more readily than results that are rounded to 0.0 — in other words, a requirement to use flo:random-unit-open is evidence of bad numerics downstream.
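
To make the advice above concrete, here is a minimal sketch of downstream arithmetic that tolerates an input of 1.0. The helper random-exponential is hypothetical; it samples by inversion, taking the negated logarithm of a uniform variate.

```scheme
(define state (make-random-state #t))

;; Hypothetical helper: exponential variate with rate 1, computed as
;; -log(U) for U drawn from [0,1].  U = 1.0 yields zero, which is a
;; perfectly good sample, so there is no need to reject it.
(define (random-exponential state)
  (- (log (flo:random-unit-closed state))))

(random-exponential state)   ; a nonnegative real
```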

procedure: flo:random-unit state

Deprecated alias for flo:random-unit-open.

procedure: random-bytevector n [state]

Returns a bytevector of n bytes drawn independently uniformly at random from state.

procedure: random-bytevector! bytevector [start end state]

Replaces the bytes in bytevector from start to end by bytes drawn independently uniformly at random from state.

The next three definitions concern random-state objects. In addition to these definitions, it is important to know that random-state objects are specifically designed so that they can be saved to disk using the fasdump procedure, and later restored using the fasload procedure. This allows a particular random-state object to be saved in order to replay a particular pseudorandom sequence.

variable: *random-state*

This variable is deprecated; pass an explicit state instead.

procedure: make-random-state [state]

This procedure returns a new random-state object, suitable for use as the state argument to random. If state is not given or #f, make-random-state returns a copy of default-random-source. If state is a random-state object, a copy of that object is returned. If state is #t, then a new random-state object is returned that has been “randomly” initialized by some means (such as by a time-of-day clock).
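
The following sketch shows one way to replay a pseudorandom sequence by copying a state before drawing from it. The helper draw-sample is hypothetical, and the final comparison assumes that a copied state reproduces the outputs of the original, as the description above suggests.

```scheme
(define state-a (make-random-state #t))       ; freshly initialized state
(define state-b (make-random-state state-a))  ; copy of state-a

;; Hypothetical helper: draw COUNT integers below 100 from STATE.
(define (draw-sample state count)
  (let loop ((i 0) (acc '()))
    (if (< i count)
        (loop (+ i 1) (cons (random 100 state) acc))
        (reverse acc))))

;; Both states start from the same point, so the samples should match.
(equal? (draw-sample state-a 5) (draw-sample state-b 5))
```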

procedure: random-state? object

Returns #t if object is a random-state object, otherwise returns #f.

procedure: export-random-state state
procedure: import-random-state state

Export-random-state returns an external representation of a random state — an object that can be safely read and written with read and write, consisting only of nested lists, vectors, symbols, and small exact integers. Import-random-state creates a random state from its external representation.
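
As a sketch of how the external representation might be used, the example below writes a state to a string, reads it back, and checks that the restored state continues with the same output as the original. The variable names are illustrative only.

```scheme
(define original (make-random-state #t))

;; Serialize the state with write before drawing anything from it.
(define text
  (call-with-output-string
    (lambda (port)
      (write (export-random-state original) port))))

(define first-draw (random 1000000 original))

;; Rebuild a state from the text and draw from the copy.
(define restored
  (import-random-state (read (open-input-string text))))

(= first-draw (random 1000000 restored))
```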

In the MIT/GNU Scheme implementation of the SRFI 27 API, a “random source” happens to be the same as a random state, but users should not rely on this.

procedure: make-random-source

[SRFI 27] Returns a random source. Every random source created by make-random-source returns the same sequence of outputs unless modified by random-source-state-set!, random-source-randomize!, or random-source-pseudo-randomize!.

procedure: random-source? object

[SRFI 27] Returns #t if object is a random source, otherwise returns #f.

constant: default-random-source

[SRFI 27] The default random source, used by the various random procedures if no explicit state is specified and *random-state* is false.

procedure: random-source-state-ref source
procedure: random-source-state-set! source exported-state

[SRFI 27] Random-source-state-ref returns an external representation of a random source — an object that can be safely read and written with read and write, consisting only of nested lists, vectors, symbols, and small exact integers. Random-source-state-set! replaces the innards of source by the source represented by exported-state from random-source-state-ref.

procedure: random-source-randomize! source

[SRFI 27] Loads entropy from the environment into source so that its subsequent outputs are nondeterministic.

Warning: Most implementations of SRFI 27 do not make subsequent outputs unpredictable with cryptography, so don’t rely on this.

procedure: random-source-pseudo-randomize! source i j

[SRFI 27] The arguments i and j must be exact nonnegative integers below 2^{128}. This procedure sets source to generate one of 2^{256} distinct possible streams of output, so that if i and j are chosen uniformly at random, it is hard to distinguish the outputs of the source from uniform random.

Warning: Most implementations of SRFI 27 do not make it hard to distinguish the outputs of the source from uniform random even if the indices i and j are uniform random, so don’t rely on this.

procedure: random-integer n

[SRFI 27] Returns an exact nonnegative integer below n chosen uniformly at random.

Equivalent to:

((random-source-make-integers default-random-source) n)

procedure: random-real

[SRFI 27] Returns an inexact real in the open interval (0, 1) with uniform distribution.

Equivalent to:

((random-source-make-reals default-random-source))

procedure: random-source-make-integers source

[SRFI 27] Returns a procedure of one argument, n, that deterministically draws from source an exact nonnegative integer below n with uniform distribution.

procedure: random-source-make-reals source [unit]

[SRFI 27] Returns a procedure of zero arguments that deterministically draws from source an inexact real in the interval (0,1) with uniform distribution. If unit is specified, the results are instead uniform random integral multiples of unit in (0,1) and of the same exactness as unit.
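
A minimal sketch of the SRFI 27 interface in use follows. The indices 3 and 7 are arbitrary, and roll-die is a hypothetical helper introduced only for illustration.

```scheme
(define source (make-random-source))
(random-source-pseudo-randomize! source 3 7)   ; select one output stream

(define next-integer (random-source-make-integers source))
(define next-real    (random-source-make-reals source))

;; Hypothetical helper: roll a six-sided die.
(define (roll-die)
  (+ 1 (next-integer 6)))

(roll-die)    ; an exact integer from 1 to 6
(next-real)   ; an inexact real strictly between 0 and 1
```

Giving each thread or subsystem its own source avoids contention on default-random-source.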


5 Characters

Characters are objects that represent printed characters such as letters and digits. MIT/GNU Scheme supports the full Unicode character repertoire.

Characters are written using the notation #\character or #\character-name or #\xhex-scalar-value.

The following standard character names are supported:

#\alarm                 ; U+0007
#\backspace             ; U+0008
#\delete                ; U+007F
#\escape                ; U+001B
#\newline               ; the linefeed character, U+000A
#\null                  ; the null character, U+0000
#\return                ; the return character, U+000D
#\space                 ; the preferred way to write a space, U+0020
#\tab                   ; the tab character, U+0009

Here are some additional examples:

#\a                     ; lowercase letter
#\A                     ; uppercase letter
#\(                     ; left parenthesis
#\                      ; the space character

Case is significant in #\character, and in #\character-name, but not in #\xhex-scalar-value. If character in #\character is alphabetic, then any character immediately following character cannot be one that can appear in an identifier. This rule resolves the ambiguous case where, for example, the sequence of characters ‘#\space’ could be taken to be either a representation of the space character or a representation of the character ‘#\s’ followed by a representation of the symbol ‘pace’.

Characters written in the #\ notation are self-evaluating. That is, they do not have to be quoted in programs.

Some of the procedures that operate on characters ignore the difference between upper case and lower case. The procedures that ignore case have ‘-ci’ (for “case insensitive”) embedded in their names.

MIT/GNU Scheme allows a character name to include one or more bucky bit prefixes to indicate that the character includes one or more of the keyboard shift keys Control, Meta, Super, or Hyper (note that the Control bucky bit prefix is not the same as the ASCII control key). The bucky bit prefixes and their meanings are as follows (case is not significant):

Key             Bucky bit prefix        Bucky bit
---             ----------------        ---------

Meta            M- or Meta-                 1
Control         C- or Control-              2
Super           S- or Super-                4
Hyper           H- or Hyper-                8

For example,

#\c-a                   ; Control-a
#\meta-b                ; Meta-b
#\c-s-m-h-A             ; Control-Meta-Super-Hyper-A

procedure: char->name char

Returns a string corresponding to the printed representation of char. This is the character, character-name, or xhex-scalar-value component of the external representation, combined with the appropriate bucky bit prefixes.

(char->name #\a)                        ⇒  "a"
(char->name #\space)                    ⇒  "space"
(char->name #\c-a)                      ⇒  "C-a"
(char->name #\control-a)                ⇒  "C-a"

procedure: name->char string

Converts a string that names a character into the character specified. If string does not name any character, name->char signals an error.

(name->char "a")                        ⇒  #\a
(name->char "space")                    ⇒  #\space
(name->char "SPACE")                    ⇒  #\space
(name->char "c-a")                      ⇒  #\C-a
(name->char "control-a")                ⇒  #\C-a

standard procedure: char? object

Returns #t if object is a character, otherwise returns #f.

standard procedure: char=? char1 char2 char3 …
standard procedure: char<? char1 char2 char3 …
standard procedure: char>? char1 char2 char3 …
standard procedure: char<=? char1 char2 char3 …
standard procedure: char>=? char1 char2 char3 …

These procedures return #t if the results of passing their arguments to char->integer are respectively equal, monotonically increasing, monotonically decreasing, monotonically non-decreasing, or monotonically non-increasing.

These predicates are transitive.

char library procedure: char-ci=? char1 char2 char3 …
char library procedure: char-ci<? char1 char2 char3 …
char library procedure: char-ci>? char1 char2 char3 …
char library procedure: char-ci<=? char1 char2 char3 …
char library procedure: char-ci>=? char1 char2 char3 …

These procedures are similar to char=? et cetera, but they treat upper case and lower case letters as the same. For example, (char-ci=? #\A #\a) returns #t.

Specifically, these procedures behave as if char-foldcase were applied to their arguments before they were compared.

char library procedure: char-alphabetic? char
char library procedure: char-numeric? char
char library procedure: char-whitespace? char
char library procedure: char-upper-case? char
char library procedure: char-lower-case? char

These procedures return #t if their arguments are alphabetic, numeric, whitespace, upper case, or lower case characters respectively, otherwise they return #f.

Specifically, they return #t when applied to characters with the Unicode properties Alphabetic, Numeric_Decimal, White_Space, Uppercase, or Lowercase respectively, and #f when applied to any other Unicode characters. Note that many Unicode characters are alphabetic but neither upper nor lower case.
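
For illustration, a few sample calls are shown below, including a character that is alphabetic but has no case. The variable alef is introduced only for this example.

```scheme
(char-alphabetic? #\a)        ; ⇒ #t
(char-numeric? #\7)           ; ⇒ #t
(char-whitespace? #\space)    ; ⇒ #t
(char-upper-case? #\A)        ; ⇒ #t
(char-lower-case? #\A)        ; ⇒ #f

;; HEBREW LETTER ALEF, U+05D0: alphabetic, but neither upper nor lower case.
(define alef (integer->char #x5D0))
(char-alphabetic? alef)       ; ⇒ #t
(char-upper-case? alef)       ; ⇒ #f
(char-lower-case? alef)       ; ⇒ #f
```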

procedure: char-alphanumeric? char

Returns #t if char is either alphabetic or numeric, otherwise it returns #f.

char library procedure: digit-value char

This procedure returns the numeric value (0 to 9) of its argument if it is a numeric digit (that is, if char-numeric? returns #t), or #f on any other character.

(digit-value #\3) ⇒ 3
(digit-value #\x0664) ⇒ 4
(digit-value #\x0AE6) ⇒ 0
(digit-value #\x0EA6) ⇒ #f

standard procedure: char->integer char
standard procedure: integer->char n

Given a Unicode character, char->integer returns an exact integer between 0 and #xD7FF or between #xE000 and #x10FFFF which is equal to the Unicode scalar value of that character. Given a non-Unicode character, it returns an exact integer greater than #x10FFFF.

Given an exact integer that is the value returned by a character when char->integer is applied to it, integer->char returns that character.

Implementation note: MIT/GNU Scheme allows any Unicode code point, not just scalar values.

Implementation note: If the argument to char->integer or integer->char is a constant, the MIT/GNU Scheme compiler will constant-fold the call, replacing it with the corresponding result. This is a very useful way to denote unusual character constants or ASCII codes.
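
A short sketch of the conversions in both directions:

```scheme
(char->integer #\A)                         ; ⇒ 65
(char->integer #\x3BB)                      ; ⇒ 955
(char=? (integer->char 955) #\x3BB)         ; ⇒ #t
(integer->char (char->integer #\newline))   ; ⇒ #\newline
```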

char library procedure: char-upcase char
char library procedure: char-downcase char
char library procedure: char-foldcase char

The char-upcase procedure, given an argument that is the lowercase part of a Unicode casing pair, returns the uppercase member of the pair. Note that language-sensitive casing pairs are not used. If the argument is not the lowercase member of such a pair, it is returned.

The char-downcase procedure, given an argument that is the uppercase part of a Unicode casing pair, returns the lowercase member of the pair. Note that language-sensitive casing pairs are not used. If the argument is not the uppercase member of such a pair, it is returned.

The char-foldcase procedure applies the Unicode simple case-folding algorithm to its argument and returns the result. Note that language-sensitive folding is not used. See UAX #44 (part of the Unicode Standard) for details.

Note that many Unicode lowercase characters do not have uppercase equivalents.
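
The sketch below shows the simple per-character mappings, plus one lowercase letter without a single-character uppercase equivalent. The variable sharp-s is introduced only for this example.

```scheme
(char-upcase #\a)      ; ⇒ #\A
(char-downcase #\A)    ; ⇒ #\a
(char-foldcase #\A)    ; ⇒ #\a
(char-upcase #\7)      ; ⇒ #\7   (not part of a casing pair)

;; LATIN SMALL LETTER SHARP S, U+00DF, is lowercase but has no
;; single-character uppercase form, so it is returned unchanged.
(define sharp-s (integer->char #xDF))
(char=? (char-upcase sharp-s) sharp-s)   ; ⇒ #t
```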

procedure: char->digit char [radix]

If char is a character representing a digit in the given radix, returns the corresponding integer value. If radix is specified (which must be an exact integer between 2 and 36 inclusive), the conversion is done in that base, otherwise it is done in base 10. If char doesn’t represent a digit in base radix, char->digit returns #f.

Note that this procedure is insensitive to the alphabetic case of char.

(char->digit #\8)                       ⇒  8
(char->digit #\e 16)                    ⇒  14
(char->digit #\e)                       ⇒  #f

procedure: digit->char digit [radix]

Returns a character that represents digit in the radix given by radix. The radix argument, if given, must be an exact integer between 2 and 36 (inclusive); it defaults to 10. The digit argument must be an exact non-negative integer strictly less than radix.

(digit->char 8)                         ⇒  #\8
(digit->char 14 16)                     ⇒  #\E
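
As a sketch of how char->digit can be used, the hypothetical helper digits->integer below converts a string of digits to an integer, returning #f as soon as it meets a character that is not a digit in the given radix.

```scheme
;; Hypothetical helper: parse STRING as an unsigned integer in RADIX.
;; Returns #f if any character is not a digit in that radix.
(define (digits->integer string radix)
  (let loop ((chars (string->list string)) (value 0))
    (if (null? chars)
        value
        (let ((digit (char->digit (car chars) radix)))
          (and digit
               (loop (cdr chars) (+ (* value radix) digit)))))))

(digits->integer "ff" 16)              ; ⇒ 255
(digits->integer "FF" 16)              ; ⇒ 255
(digits->integer "1z" 10)              ; ⇒ #f
(digit->char (char->digit #\7 8) 8)    ; ⇒ #\7
```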

5.1 Character implementation

An MIT/GNU Scheme character consists of a code part and a bucky bits part. The code part is a Unicode code point, while the bucky bits are an additional set of bits representing shift keys available on some keyboards.

There are 4 bucky bits, named control, meta, super, and hyper. On GNU/Linux systems running a graphical desktop, the control bit corresponds to the CTRL key; the meta bit corresponds to the ALT key; and the super bit corresponds to the “windows” key. On macOS, these are the CONTROL, OPTION, and COMMAND keys respectively.

Characters with bucky bits are not used much outside of graphical user interfaces (e.g. Edwin). They cannot be stored in strings or character sets, and aren’t read or written by textual I/O ports.

procedure: make-char code bucky-bits

Builds a character from code and bucky-bits. The value of code must be a Unicode code point; the value of bucky-bits must be an exact non-negative integer strictly less than 16. If 0 is specified for bucky-bits, make-char produces an ordinary character; otherwise, the appropriate bits are set as follows:

1               meta
2               control
4               super
8               hyper

For example,

(make-char 97 0)                        ⇒  #\a
(make-char 97 1)                        ⇒  #\M-a
(make-char 97 2)                        ⇒  #\C-a
(make-char 97 3)                        ⇒  #\C-M-a

procedure: char-code char

Returns the Unicode code point of char. Note that if char has no bucky bits set, then this is the same value returned by char->integer.

For example,

(char-code #\a)                         ⇒  97
(char-code #\c-a)                       ⇒  97

procedure: char-bits char

Returns the exact integer representation of char’s bucky bits. For example,

(char-bits #\a)                         ⇒  0
(char-bits #\m-a)                       ⇒  1
(char-bits #\c-a)                       ⇒  2
(char-bits #\c-m-a)                     ⇒  3

constant: char-code-limit

This constant is the strict upper limit on a character’s code value. It is #x110000 unless some future version of Unicode increases the range of code points.

constant: char-bits-limit

This constant is the strict upper limit on a character’s bucky-bits value. It is currently #x10 and unlikely to change in the future.

procedure: bitless-char? object

Returns #t if object is a character with no bucky bits set, otherwise it returns #f.

procedure: char->bitless-char char

Returns char with any bucky bits removed. The result is guaranteed to satisfy bitless-char?.
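
The following sketch takes a character with bucky bits apart and puts it back together. The variable key is introduced only for this example.

```scheme
(define key #\c-m-a)

(char-code key)                    ; ⇒ 97
(char-bits key)                    ; ⇒ 3
(bitless-char? key)                ; ⇒ #f
(char->bitless-char key)           ; ⇒ #\a

;; Rebuilding from the two parts gives back the same character.
(eqv? key (make-char (char-code key) (char-bits key)))   ; ⇒ #t
```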

procedure: char-predicate char

Returns a procedure of one argument that returns #t if its argument is a character char=? to char, otherwise it returns #f.

procedure: char-ci-predicate char

Returns a procedure of one argument that returns #t if its argument is a character char-ci=? to char, otherwise it returns #f.
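
These predicates are convenient as arguments to higher-order procedures. A small sketch, with comma? as an illustrative name:

```scheme
(define comma? (char-predicate #\,))

(comma? #\,)    ; ⇒ #t
(comma? #\;)    ; ⇒ #f

;; Count the occurrences of x in "Xerox", ignoring case.
(length (filter (char-ci-predicate #\x)
                (string->list "Xerox")))    ; ⇒ 2
```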


5.2 Unicode

MIT/GNU Scheme implements the full Unicode character repertoire, defining predicates for Unicode characters and their associated integer values. A Unicode code point is an exact non-negative integer strictly less than #x110000. A Unicode scalar value is a Unicode code point that doesn’t fall between #xD800 inclusive and #xE000 exclusive; in other words, any Unicode code point except for the surrogate code points.

procedure: unicode-code-point? object

Returns #t if object is a Unicode code point, otherwise it returns #f.

procedure: unicode-scalar-value? object

Returns #t if object is a Unicode scalar value, otherwise it returns #f.

procedure: unicode-char? object

Returns #t if object is any character corresponding to a Unicode code point, except for those with general category other:surrogate or other:not-assigned.
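
The distinction between code points and scalar values shows up at the surrogate range and at the upper limit, as this sketch illustrates:

```scheme
(unicode-code-point? #x10FFFF)      ; ⇒ #t
(unicode-code-point? #x110000)      ; ⇒ #f   (past the last code point)
(unicode-code-point? #xD800)        ; ⇒ #t   (a surrogate is a code point)
(unicode-scalar-value? #xD800)      ; ⇒ #f   (but not a scalar value)
(unicode-scalar-value? #xE000)      ; ⇒ #t
(unicode-char? #\a)                 ; ⇒ #t
```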

procedure: char-general-category char
procedure: code-point-general-category code-point

Returns the Unicode general category of char (or code-point) as a descriptive symbol:

Category        Symbol
--------        ------

Lu              letter:uppercase
Ll              letter:lowercase
Lt              letter:titlecase
Lm              letter:modifier
Lo              letter:other
Mn              mark:nonspacing
Mc              mark:spacing-combining
Me              mark:enclosing
Nd              number:decimal-digit
Nl              number:letter
No              number:other
Pc              punctuation:connector
Pd              punctuation:dash
Ps              punctuation:open
Pe              punctuation:close
Pi              punctuation:initial-quote
Pf              punctuation:final-quote
Po              punctuation:other
Sm              symbol:math
Sc              symbol:currency
Sk              symbol:modifier
So              symbol:other
Zs              separator:space
Zl              separator:line
Zp              separator:paragraph
Cc              other:control
Cf              other:format
Cs              other:surrogate
Co              other:private-use
Cn              other:not-assigned
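
A few sample lookups, given as a sketch of how the symbols in the table are returned:

```scheme
(char-general-category #\A)             ; ⇒ letter:uppercase
(char-general-category #\a)             ; ⇒ letter:lowercase
(char-general-category #\7)             ; ⇒ number:decimal-digit
(char-general-category #\space)         ; ⇒ separator:space
(code-point-general-category #xD800)    ; ⇒ other:surrogate
```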

5.3 Character Sets

MIT/GNU Scheme’s character-set abstraction is used to represent groups of characters, such as the letters or digits. A character set may contain any character. Alternatively, a character set can be treated as a set of code points.

Implementation note: MIT/GNU Scheme allows any “bitless” character to be stored in a character set; operations that accept characters automatically strip their bucky bits.

procedure: char-set? object

Returns #t if object is a character set, otherwise it returns #f.

procedure: char-in-set? char char-set

Returns #t if char is in char-set, otherwise it returns #f.

procedure: code-point-in-set? code-point char-set

Returns #t if code-point is in char-set, otherwise it returns #f.

procedure: char-set-predicate char-set

Returns a procedure of one argument that returns #t if its argument is a character in char-set, otherwise it returns #f.

procedure: compute-char-set predicate

Calls predicate once on each Unicode code point, and returns a character set containing exactly the code points for which predicate returns a true value.

The next procedures represent a character set as a code-point list, which is a list of code-point range elements. A code-point range is either a Unicode code point, or a pair (start . end) that specifies a contiguous range of code points. Both start and end must be exact nonnegative integers less than or equal to #x110000, and start must be less than or equal to end. The range specifies all of the code points greater than or equal to start and strictly less than end.

procedure: char-set element …
procedure: char-set* elements

Returns a new character set consisting of the characters specified by elements. The procedure char-set takes these elements as multiple arguments, while char-set* takes them as a single list-valued argument; in all other respects these procedures are identical.

An element can take several forms, each of which specifies one or more characters to include in the resulting character set: a character includes itself; a string includes all of the characters it contains; a character set includes its members; or a code-point range includes the corresponding characters.

In addition, an element may be a symbol from the following table, which represents the characters as shown:

Name            Unicode character specification
----            -------------------------------

alphabetic      Alphabetic = True
alphanumeric    Alphabetic = True | Numeric_Type = Decimal
cased           Cased = True
lower-case      Lowercase = True
numeric         Numeric_Type = Decimal
unicode         General_Category != (Cs | Cn)
upper-case      Uppercase = True
whitespace      White_Space = True

procedure: char-set->code-points char-set

Returns a code-point list specifying the contents of char-set. The returned list consists of numerically sorted, disjoint, and non-abutting code-point ranges.
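
The sketch below builds character sets from each kind of element described above: a string, a code-point range, a character, other character sets, and a symbol. The variable names are illustrative only.

```scheme
(define vowels (char-set "aeiou"))       ; from a string
(define digits (char-set '(48 . 58)))    ; U+0030 up to, not including, U+003A
(define mixed  (char-set #\_ vowels digits 'whitespace))

(char-in-set? #\e vowels)                ; ⇒ #t
(code-point-in-set? 57 digits)           ; ⇒ #t   (the character 9)
(code-point-in-set? 58 digits)           ; ⇒ #f   (end of range is exclusive)
(char-in-set? #\space mixed)             ; ⇒ #t
(char-in-set? #\z mixed)                 ; ⇒ #f

;; char-set* takes the same elements as a single list.
(char-set=? mixed
            (char-set* (list #\_ vowels digits 'whitespace)))   ; ⇒ #t

(char-set->code-points digits)           ; a code-point list describing digits
```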

procedure: char-set=? char-set-1 char-set-2

Returns #t if char-set-1 and char-set-2 contain exactly the same characters, otherwise it returns #f.

procedure: char-set-invert char-set

Returns a character set that’s the inverse of char-set. That is, the returned character set contains exactly those characters that aren’t in char-set.

procedure: char-set-union char-set …
procedure: char-set-intersection char-set …
procedure: char-set-difference char-set-1 char-set …

These procedures compute the respective set union, set intersection, and set difference of their arguments.

procedure: char-set-union* char-sets
procedure: char-set-intersection* char-sets

These procedures correspond to char-set-union and char-set-intersection but take a single argument that’s a list of character sets rather than multiple character-set arguments.
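
A brief sketch of the set operations on two small, overlapping sets (abc and bcd are illustrative names):

```scheme
(define abc (char-set "abc"))
(define bcd (char-set "bcd"))

(char-set=? (char-set-union abc bcd) (char-set "abcd"))         ; ⇒ #t
(char-set=? (char-set-intersection abc bcd) (char-set "bc"))    ; ⇒ #t
(char-set=? (char-set-difference abc bcd) (char-set "a"))       ; ⇒ #t
(char-set=? (char-set-union* (list abc bcd))
            (char-set-union abc bcd))                           ; ⇒ #t

(char-in-set? #\a (char-set-invert abc))                        ; ⇒ #f
(char-in-set? #\z (char-set-invert abc))                        ; ⇒ #t
```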

constant: char-set:alphabetic
constant: char-set:numeric
constant: char-set:whitespace
constant: char-set:upper-case
constant: char-set:lower-case
constant: char-set:alphanumeric

These constants are the character sets corresponding to char-alphabetic?, char-numeric?, char-whitespace?, char-upper-case?, char-lower-case?, and char-alphanumeric? respectively.

procedure: 8-bit-char-set? char-set

Returns #t if char-set contains only 8-bit code points (i.e., ISO 8859-1 characters), otherwise it returns #f.


6 Strings

Strings are sequences of characters. Strings are written as sequences of characters enclosed within quotation marks ("). Within a string literal, various escape sequences represent characters other than themselves. Escape sequences always start with a backslash (\):

\a : alarm, U+0007
\b : backspace, U+0008
\t : character tabulation, U+0009
\n : linefeed, U+000A
\r : return, U+000D
\" : double quote, U+0022
\\ : backslash, U+005C
\| : vertical line, U+007C
\intraline-whitespace* line-ending intraline-whitespace*
     : nothing
\xhex-scalar-value;
     : specified character (note the terminating semi-colon).

The result is unspecified if any other character in a string occurs after a backslash.

Except for a line ending, any character outside of an escape sequence stands for itself in the string literal. A line ending which is preceded by \intraline-whitespace expands to nothing (along with any trailing intraline whitespace), and can be used to indent strings for improved legibility. Any other line ending has the same effect as inserting a \n character into the string.

Examples:

"The word \"recursion\" has many meanings."
"Another example:\ntwo lines of text"
"Here's text \
   containing just one line"
"\x03B1; is named GREEK SMALL LETTER ALPHA."

The length of a string is the number of characters that it contains. This number is an exact, non-negative integer that is fixed when the string is created. The valid indexes of a string are the exact non-negative integers less than the length of the string. The first character of a string has index 0, the second has index 1, and so on.

Some of the procedures that operate on strings ignore the difference between upper and lower case. The names of the versions that ignore case end with ‘-ci’ (for “case insensitive”).

Implementations may forbid certain characters from appearing in strings. However, with the exception of #\null, ASCII characters must not be forbidden. For example, an implementation might support the entire Unicode repertoire, but only allow characters U+0001 to U+00FF (the Latin-1 repertoire without #\null) in strings.

Implementation note: MIT/GNU Scheme allows any “bitless” character to be stored in a string. In effect this means any character with a Unicode code point, including surrogates. String operations that accept characters automatically strip their bucky bits.

It is an error to pass such a forbidden character to make-string, string, string-set!, or string-fill!, as part of the list passed to list->string, or as part of the vector passed to vector->string, or in UTF-8 encoded form within a bytevector passed to utf8->string. It is also an error for a procedure passed to string-map to return a forbidden character, or for read-string to attempt to read one.

MIT/GNU Scheme supports both mutable and immutable strings. Procedures that mutate strings, in particular string-set! and string-fill!, will signal an error if given an immutable string. Nearly all procedures that return strings return immutable strings; notable exceptions are make-string and string-copy, which always return mutable strings, and string-builder which gives the programmer the ability to choose mutable or immutable results.

standard procedure: string? obj

Returns #t if obj is a string, otherwise returns #f.

standard procedure: make-string k [char]

The make-string procedure returns a newly allocated mutable string of length k. If char is given, then all the characters of the string are initialized to char, otherwise the contents of the string are unspecified.

extended standard procedure: string object …
procedure: string* objects

Returns an immutable string whose characters are the concatenation of the characters from the given objects. Each object is converted to characters as if passed to the display procedure.

This is an MIT/GNU Scheme extension to the standard string that accepts only characters as arguments.

The procedure string* is identical to string but takes a single argument that’s a list of objects, rather than multiple object arguments.
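
For example, the extended string procedure can concatenate a mix of strings, characters, symbols, and numbers, each rendered as display would print it:

```scheme
(string "abc" #\d 'ef 42)        ; ⇒ "abcdef42"
(string* (list "x = " 3))        ; ⇒ "x = 3"
```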

standard procedure: string-length string

Returns the number of characters in the given string.

standard procedure: string-ref string k

It is an error if k is not a valid index of string.

The string-ref procedure returns character k of string using zero-origin indexing. There is no requirement for this procedure to execute in constant time.
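
A short sketch of zero-origin indexing, where the last valid index is one less than the length (word is an illustrative name):

```scheme
(define word "scheme")

(string-length word)                            ; ⇒ 6
(string-ref word 0)                             ; ⇒ #\s
(string-ref word (- (string-length word) 1))    ; ⇒ #\e
```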

standard procedure: string-set! string k char

It is an error if string is not a mutable string or if k is not a valid index of string.

The string-set! procedure stores char in element k of string. There is no requirement for this procedure to execute in constant time.

(define (f) (make-string 3 #\*))
(define (g) "***")
(string-set! (f) 0 #\?)  ⇒  unspecified
(string-set! (g) 0 #\?)  ⇒  error
(string-set! (symbol->string 'immutable) 0 #\?)  ⇒  error

standard procedure: string=? string1 string2 string …

Returns #t if all the strings are the same length and contain exactly the same characters in the same positions, otherwise returns #f.

char library procedure: string-ci=? string1 string2 string …

Returns #t if, after case-folding, all the strings are the same length and contain the same characters in the same positions, otherwise returns #f. Specifically, these procedures behave as if string-foldcase were applied to their arguments before comparing them.

standard procedure: string<? string1 string2 string …
char library procedure: string-ci<? string1 string2 string …
standard procedure: string>? string1 string2 string …
char library procedure: string-ci>? string1 string2 string …
standard procedure: string<=? string1 string2 string …
char library procedure: string-ci<=? string1 string2 string …
standard procedure: string>=? string1 string2 string …
char library procedure: string-ci>=? string1 string2 string …

These procedures return #t if their arguments are (respectively): monotonically increasing, monotonically decreasing, monotonically non-decreasing, or monotonically non-increasing.

These predicates are required to be transitive.

These procedures compare strings in an implementation-defined way. One approach is to make them the lexicographic extensions to strings of the corresponding orderings on characters. In that case, string<? would be the lexicographic ordering on strings induced by the ordering char<? on characters, and if the two strings differ in length but are the same up to the length of the shorter string, the shorter string would be considered to be lexicographically less than the longer string. However, it is also permitted to use the natural ordering imposed by the implementation’s internal representation of strings, or a more complex locale-specific ordering.

In all cases, a pair of strings must satisfy exactly one of string<?, string=?, and string>?, and must satisfy string<=? if and only if they do not satisfy string>? and string>=? if and only if they do not satisfy string<?.

The ‘-ci’ procedures behave as if they applied string-foldcase to their arguments before invoking the corresponding procedures without ‘-ci’.
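
A few sample comparisons are sketched below. The ordering results assume the lexicographic ordering induced by char<?, which is one of the approaches described above.

```scheme
(string=? "abc" "abc" "abc")                ; ⇒ #t
(string=? "abc" "abd")                      ; ⇒ #f
(string-ci=? "Scheme" "SCHEME" "scheme")    ; ⇒ #t
(string<? "apple" "banana" "cherry")        ; ⇒ #t
(string<=? "a" "a" "b")                     ; ⇒ #t
(string-ci<? "apple" "BANANA")              ; ⇒ #t
```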

procedure: string-compare string1 string2 if-eq if-lt if-gt
procedure: string-compare-ci string1 string2 if-eq if-lt if-gt

If-eq, if-lt, and if-gt are procedures of no arguments (thunks). The two strings are compared: if they are equal, if-eq is applied; if string1 is less than string2, if-lt is applied; otherwise string1 is greater than string2 and if-gt is applied. The value of the procedure is the value of the thunk that is applied.

string-compare distinguishes uppercase and lowercase letters;
string-compare-ci does not.

(define (cheer) (display "Hooray!"))
(define (boo)   (display "Boo-hiss!"))
(string-compare "a" "b"  (lambda () 'ignore)  cheer  boo)
        -|  Hooray!
        ⇒  unspecified
char library procedure: string-upcase string
char library procedure: string-downcase string
procedure: string-titlecase string
char library procedure: string-foldcase string

These procedures apply the Unicode full string uppercasing, lowercasing, titlecasing, and case-folding algorithms to their arguments and return the result. In certain cases, the result differs in length from the argument. If the result is equal to the argument in the sense of string=?, the argument may be returned. Note that language-sensitive mappings and foldings are not used.

The Unicode Standard prescribes special treatment of the Greek letter Σ, whose normal lower-case form is σ but which becomes ς at the end of a word. See UAX #44 (part of the Unicode Standard) for details. However, implementations of string-downcase are not required to provide this behavior, and may choose to change Σ to σ in all cases.
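
As a brief illustration of the case-conversion procedures, the sketch below applies them to ASCII text and then to a string containing U+00DF (LATIN SMALL LETTER SHARP S), a character whose full uppercase mapping is longer than the original. The name sharp-s is chosen only for this example, and no result is shown for the last expression because it depends on the full-mapping behavior described above.

```scheme
(string-upcase "hello world")       ; ⇒ "HELLO WORLD"
(string-downcase "Hello World")     ; ⇒ "hello world"
(string-foldcase "Hello World")     ; ⇒ "hello world"

;; A six-character string whose fifth character is U+00DF.
(define sharp-s
  (list->string (list #\s #\t #\r #\a #\xdf #\e)))

(string-length sharp-s)             ; ⇒ 6

;; The uppercased result may be longer than the argument.
(string-length (string-upcase sharp-s))
```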

procedure: string-upper-case? string
procedure: string-lower-case? string

These procedures return #t if all the letters in string are upper case or lower case, respectively; otherwise they return #f. If string contains no letters, the procedures return #f.

(map string-upper-case? '(""    "A"    "art"  "Art"  "ART"))
                       ⇒ (#f    #t     #f     #f     #t)
standard procedure: substring string [start [end]]

Returns an immutable copy of the part of the given string between start and end.

procedure: string-slice string [start [end]]

Returns a slice of string, restricted to the range of characters specified by start and end. The returned slice will be mutable if string is mutable, or immutable if string is immutable.

A slice is a kind of string that provides a view into another string. The slice behaves like any other string, but changes to a mutable slice are reflected in the original string and vice versa.

(define foo (string-copy "abcde"))
foo ⇒ "abcde"

(define bar (string-slice foo 1 4))
bar ⇒ "bcd"

(string-set! foo 2 #\z)
foo ⇒ "abzde"
bar ⇒ "bzd"

(string-set! bar 1 #\y)
bar ⇒ "byd"
foo ⇒ "abyde"
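
The difference between a copy and a slice can be seen by taking both from the same mutable string and then modifying the original. This is a small sketch; the names original, copy, and view are chosen only for the example.

```scheme
(define original (string-copy "abcde"))
(define copy (substring original 1 4))      ; independent copy
(define view (string-slice original 1 4))   ; shares storage with original

(string-set! original 2 #\z)

original    ; ⇒ "abzde"
copy        ; ⇒ "bcd"   (unaffected by the change)
view        ; ⇒ "bzd"   (reflects the change)
```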
standard procedure: string-append string …
procedure: string-append* strings

Returns an immutable string whose characters are the concatenation of the characters in the given strings.

The non-standard procedure string-append* is identical to string-append but takes a single argument that’s a list of strings, rather than multiple string arguments.
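
For instance, when the strings are already collected in a list, string-append* can be called on that list directly. A minimal sketch, with parts as an illustrative name:

```scheme
(define parts (list "foo" "bar" "baz"))

(string-append "foo" "bar" "baz")   ; ⇒ "foobarbaz"
(string-append* parts)              ; ⇒ "foobarbaz"
```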

standard procedure: string->list string [start [end]]
standard procedure: list->string list

It is an error if any element of list is not a character.

The string->list procedure returns a newly allocated list of the characters of string between start and end. list->string returns an immutable string formed from the elements in the list list. In both procedures, order is preserved. string->list and list->string are inverses so far as equal? is concerned.
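
A short sketch of the two conversions, including the optional start and end arguments and a round trip through a list:

```scheme
(string->list "hello")                           ; ⇒ (#\h #\e #\l #\l #\o)
(string->list "hello" 1 3)                       ; ⇒ (#\e #\l)
(list->string (list #\a #\b #\c))                ; ⇒ "abc"

;; Going through a list makes list operations available on strings.
(list->string (reverse (string->list "hello")))  ; ⇒ "olleh"
```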

standard procedure: string-copy string [start [end]]

Returns a newly allocated mutable copy of the part of the given string between start and end.

standard procedure: string-copy! to at from [start [end]]

It is an error if to is not a mutable string or if at is less than zero or greater than the length of to. It is also an error if (- (string-length to) at) is less than (- end start).

Copies the characters of string from between start and end to string to, starting at at. The order in which characters are copied is unspecified, except that if the source and destination overlap, copying takes place as if the source is first copied into a temporary string and then into the destination. This can be achieved without allocating storage by making sure to copy in the correct direction in such circumstances.

(define a "12345")
(define b (string-copy "abcde"))
(string-copy! b 1 a 0 2) ⇒ 3
b ⇒ "a12de"

Implementation note: in MIT/GNU Scheme string-copy! returns the value (+ at (- end start)).
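
The overlap rule can be illustrated by copying part of a string onto itself. In this sketch the source range 0 to 4 overlaps the destination, which starts at index 2; the result is the same as if the source had first been copied to a temporary string.

```scheme
(define s (string-copy "abcdef"))

;; Copy "abcd" (indices 0 to 4 of s) into s starting at index 2.
(string-copy! s 2 s 0 4)

s   ; ⇒ "ababcd"
```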

standard procedure: string-fill! string fill [start [end]]

It is an error if string is not a mutable string or if fill is not a character.

The string-fill! procedure stores fill in the elements of string between start and end.
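
For example, the following sketch fills only the middle of a mutable string; the name line is chosen for the illustration.

```scheme
(define line (make-string 5 #\-))

line                          ; ⇒ "-----"
(string-fill! line #\* 1 4)   ; fill indices 1, 2, and 3
line                          ; ⇒ "-***-"
```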

The next two procedures treat a given string as a sequence of grapheme clusters, a concept defined by the Unicode standard in UAX #29:

It is important to recognize that what the user thinks of as a “character”—a basic unit of a writing system for a language—may not be just a single Unicode code point. Instead, that basic unit may be made up of multiple Unicode code points. To avoid ambiguity with the computer use of the term character, this is called a user-perceived character. For example, “G” + acute-accent is a user-perceived character: users think of it as a single character, yet is actually represented by two Unicode code points. These user-perceived characters are approximated by what is called a grapheme cluster, which can be determined programmatically.

procedure: grapheme-cluster-length string

This procedure returns the number of grapheme clusters in string.

For ASCII strings, this is identical to string-length.

procedure: grapheme-cluster-slice string start end

This procedure slices string at the grapheme-cluster boundaries specified by the start and end indices. These indices are grapheme-cluster indices, not normal string indices.

For ASCII strings, this is identical to string-slice.
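
The following sketch shows the difference between code-point length and grapheme-cluster length, using the "G" plus acute accent example from the quotation above. The string is built with list->string so that the two code points are stored as given; the name g-acute is chosen only for this example.

```scheme
;; "G" followed by U+0301 (COMBINING ACUTE ACCENT): two code points
;; that a reader perceives as one character.
(define g-acute (list->string (list #\G #\x301)))

(string-length g-acute)             ; ⇒ 2
(grapheme-cluster-length g-acute)   ; ⇒ 1

;; For ASCII strings the two lengths agree.
(grapheme-cluster-length "hello")   ; ⇒ 5
```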

procedure: string-word-breaks string

This procedure returns a list of word break indices for string, ordered from smallest index to largest. Word breaks are defined by the Unicode standard in UAX #29, and generally coincide with what we think of as the boundaries of words in written text.

MIT/GNU Scheme supports the Unicode canonical normalization forms NFC (Normalization Form C) and NFD (Normalization Form D). The reason for these forms is that there can be multiple different Unicode sequences for a given text; these sequences are semantically identical and should be treated equivalently for all purposes. If two such sequences are normalized to the same form, the resulting normalized sequences will be identical.

By default, most procedures that return strings return them in NFC. Notable exceptions are list->string, vector->string, and the utfX->string procedures, which do no normalization, and of course string->nfd.

Generally speaking, NFC is preferred for most purposes, as it is the minimal-length sequence for the variants. Consult the Unicode standard for the details and for information about why one normalization form is preferable for a specific purpose.

procedure: string-in-nfc? string
procedure: string-in-nfd? string

These procedures return #t if string is in Unicode Normalization Form C or D respectively. Otherwise they return #f.

Note that if string consists only of code points strictly less than #xC0, then string-in-nfd? returns #t. If string consists only of code points strictly less than #x300, then string-in-nfc? returns #t. Consequently both of these procedures will return #t for an ASCII string argument.

procedure: string->nfc string
procedure: string->nfd string

These procedures convert string into Unicode Normalization Form C or D respectively. If string is already in the correct form, they return string itself, or an immutable copy if string is mutable.
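
As a sketch of how the two forms relate, the example below starts from "e" followed by U+0301 (COMBINING ACUTE ACCENT), which is the decomposed spelling of "é". It is built with list->string, which performs no normalization; the names decomposed and composed are chosen for the illustration.

```scheme
(define decomposed (list->string (list #\e #\x301)))

(string-in-nfd? decomposed)              ; ⇒ #t
(string-in-nfc? decomposed)              ; ⇒ #f

(define composed (string->nfc decomposed))

(string-length composed)                 ; ⇒ 1
(string-in-nfc? composed)                ; ⇒ #t
(string-length (string->nfd composed))   ; ⇒ 2

;; ASCII strings are already in both forms.
(string-in-nfc? "plain ASCII")           ; ⇒ #t
(string-in-nfd? "plain ASCII")           ; ⇒ #t
```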

standard procedure: string-map proc string string …

It is an error if proc does not accept as many arguments as there are strings and return a single character.

The string-map procedure applies proc element-wise to the elements of the strings and returns an immutable string of the results, in order. If more than one string is given and not all strings have the same length, string-map terminates when the shortest string runs out. The dynamic order in which proc is applied to the elements of the strings is unspecified. If multiple returns occur from string-map, the values returned by earlier returns are not mutated.

(string-map char-foldcase "AbdEgH")  ⇒  "abdegh"

(string-map
 (lambda (c)
   (integer->char (+ 1 (char->integer c))))
 "HAL")                 ⇒  "IBM"

(string-map
 (lambda (c k)
   ((if (eqv? k #\u) char-upcase char-downcase) c))
 "studlycaps xxx"
 "ululululul")          ⇒  "StUdLyCaPs"
standard procedure: string-for-each proc string string …

It is an error if proc does not accept as many arguments as there are strings.

The arguments to string-for-each are like the arguments to string-map, but string-for-each calls proc for its side effects rather than for its values. Unlike string-map, string-for-each is guaranteed to call proc on the elements of the strings in order from the first element(s) to the last, and the value returned by string-for-each is unspecified. If more than one string is given and not all strings have the same length, string-for-each terminates when the shortest string runs out. It is an error for proc to mutate any of the strings.

(let ((v '()))
  (string-for-each
   (lambda (c) (set! v (cons (char->integer c) v)))
   "abcde")
  v)                    ⇒  (101 100 99 98 97)
procedure: string-count proc string string …

It is an error if proc does not accept as many arguments as there are strings.

The string-count procedure applies proc element-wise to the elements of the strings and returns a count of the number of true values it returns. If more than one string is given and not all strings have the same length, string-count terminates when the shortest string runs out. The dynamic order in which proc is applied to the elements of the strings is unspecified.

procedure: string-any proc string string …

It is an error if proc does not accept as many arguments as there are strings.

The string-any procedure applies proc element-wise to the elements of the strings and returns #t if proc returns a true value for any of them. If proc never returns a true value, string-any returns #f.

If more than one string is given and not all strings have the same length, string-any terminates when the shortest string runs out. The dynamic order in which proc is applied to the elements of the strings is unspecified.

procedure: string-every proc string string …

It is an error if proc does not accept as many arguments as there are strings.

The string-every procedure applies proc element-wise to the elements of the strings and returns #f if proc returns a false value for any of them. If proc never returns a false value, string-every returns #t.

If more than one string is given and not all strings have the same length, string-every terminates when the shortest string runs out. The dynamic order in which proc is applied to the elements of the strings is unspecified.
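
A few sample calls to string-any and string-every, written as a sketch. The last call passes two strings of different lengths, so only the first three positions are tested.

```scheme
(string-any char-numeric? "abc1")         ; ⇒ #t
(string-any char-numeric? "abcd")         ; ⇒ #f
(string-every char-alphabetic? "abcd")    ; ⇒ #t
(string-every char-alphabetic? "abc1")    ; ⇒ #f

;; Two strings: proc receives one character from each.
(string-every char=? "abc" "abcdef")      ; ⇒ #t
```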

procedure: string-null? string

Returns #t if string has zero length; otherwise returns #f.

(string-null? "")       ⇒  #t
(string-null? "Hi")     ⇒  #f
procedure: string-hash string [modulus]
procedure: string-hash-ci string [modulus]

These SRFI 69 procedures return an exact non-negative integer that can be used for storing the specified string in a hash table. Equal strings (in the sense of string=? and string-ci=? respectively) return equal (=) hash codes, and non-equal but similar strings are usually mapped to distinct hash codes.
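
The properties stated above can be checked as follows. This is only a sketch: the actual hash values are implementation-dependent, so the example compares them rather than showing them. The last line passes the optional modulus argument, which restricts the result to be less than that value.

```scheme
;; Equal strings have equal hash codes, even when they are
;; distinct objects.
(= (string-hash "hello")
   (string-hash (string-copy "hello")))    ; ⇒ #t

;; string-hash-ci ignores case differences.
(= (string-hash-ci "Hello")
   (string-hash-ci "hELLO"))               ; ⇒ #t

;; With a modulus, the result is less than the modulus.
(< (string-hash "hello" 101) 101)          ; ⇒ #t
```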

If the optional argument modulus is specified, it must be an exact positive integer, and the result of the hash computation is restricted to be less than that value. This is equivalent to calling modulo on the result, but may be faster.

procedure: string-head string end

Equivalent to (substring string 0 end).

procedure: string-tail string start

Equivalent to (substring string start).

procedure: string-builder [buffer-length]

This procedure returns a string builder that can be used to incrementally collect characters and later convert that collection to a string. This is similar to a string output port, but is less general and significantly faster.

The optional buffer-length argument, if given, must be an exact positive integer. It controls the size of the internal buffers that are used to accumulate characters. Larger values make the builder somewhat faster but use more space. The default value of this argument is 16.

The returned string builder is a procedure that accepts zero or one arguments as follows:

The “result” arguments control the form of the returned string. The arguments immutable and mutable are straightforward, specifying the mutability of the returned string. For these arguments, the returned string contains exactly the same characters, in the same order, as were appended to the builder.

However, calling with the argument nfc, or with no arguments, returns an immutable string in Unicode Normalization Form C, exactly as if string->nfc were called on one of the other two result strings.
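
A minimal sketch of typical use follows. It assumes that calling the builder with a single character argument appends that character to the collection; the final call with no arguments returns the accumulated string as described above. The name builder is chosen for the illustration.

```scheme
(define builder (string-builder))

;; Assumption: a character argument is appended to the builder.
(for-each builder (string->list "abc"))
(builder #\d)

;; No arguments: return the collected characters as an
;; immutable string in NFC.
(builder)   ; ⇒ "abcd"
```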

procedure: string-joiner infix prefix suffix
procedure: string-joiner* infix prefix suffix

The arguments to these procedures are keyword arguments; that is, each argument is a symbol of the same name followed by its value. The order of the arguments doesn’t matter, but each argument may appear only once.

These procedures return a joiner procedure that takes multiple strings and joins them together into an immutable string. The joiner returned by string-joiner accepts these strings as multiple string arguments, while string-joiner* accepts the strings as a single list-valued argument.

The joiner produces a result by adding prefix before, suffix after, and infix between each input string, then concatenating everything together into a single string. Each of the prefix, suffix, and infix arguments is optional and defaults to an empty string, so normally at least one is specified.

Some examples:

((string-joiner) "a" "b" "c")
  ⇒  "abc"

((string-joiner 'infix " ") "a" "b" "c")
  ⇒  "a b c"

((string-joiner 'infix ", ") "a" "b" "c")
  ⇒  "a, b, c"

((string-joiner* 'infix ", " 'prefix "<" 'suffix ">")
 '("a" "b" "c"))
  ⇒  "<a>, <b>, <c>"
procedure: string-splitter delimiter allow-runs? copier copy?

This procedure’s arguments are keyword arguments; that is, each argument is a symbol of the same name followed by its value. The order of the arguments doesn’t matter, but each argument may appear only once.

This procedure returns a splitter procedure that splits a given string into parts, returning a list of the parts. This is done by identifying delimiter characters and breaking the string at those delimiters. The splitting process is controlled by the arguments:

Some examples:

((string-splitter) "a b c")
  ⇒  ("a" "b" "c")

((string-splitter) "a\tb\tc")
  ⇒  ("a" "b" "c")

((string-splitter 'delimiter #\space) "a\tb\tc")
  ⇒  ("a\tb\tc")

((string-splitter) " a  b  c ")
  ⇒  ("a" "b" "c")

((string-splitter 'allow-runs? #f) " a  b  c ")
  ⇒  ("" "a" "" "b" "" "c" "")
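
A splitter and a joiner can be combined to take a delimited string apart and put it back together. The sketch below does this for comma-separated fields; it sets allow-runs? to #f so that empty fields are preserved, as in the last example above. The names split-fields and join-fields are chosen for the illustration.

```scheme
(define split-fields
  (string-splitter 'delimiter #\, 'allow-runs? #f))

(define join-fields
  (string-joiner* 'infix ","))

(split-fields "a,,b")                  ; ⇒ ("a" "" "b")
(join-fields (split-fields "a,,b"))    ; ⇒ "a,,b"
```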
procedure: string-padder where fill-with clip?

This procedure’s arguments are keyword arguments; that is, each argument is a symbol of the same name followed by its value. The order of the arguments doesn’t matter, but each argument may appear only once.

This procedure returns a padder procedure that takes a string and a grapheme-cluster length as its arguments and returns a new string that has been padded to that length. The padder adds grapheme clusters to the string until it has the specified length. If the string’s grapheme-cluster length is greater than the given length, the string may, depending on the arguments, be reduced to the specified length.

The padding process is controlled by the arguments:

Some examples:

((string-padder) "abc def" 10)
  ⇒  "   abc def"

((string-padder 'where 'trailing) "abc def" 10)
  ⇒  "abc def   "

((string-padder 'fill-with "X") "abc def" 10)
  ⇒  "XXXabc def"

((string-padder) "abc def" 5)
  ⇒  "c def"

((string-padder 'where 'trailing) "abc def" 5)
  ⇒  "abc d"

((string-padder 'clip? #f) "abc def" 5)
  ⇒  "abc def"
obsolete procedure: string-pad-left string k [char]
obsolete procedure: string-pad-right string k [char]

These procedures are deprecated; use string-padder, which is more flexible, instead.

These procedures return an immutable string created by padding string out to length k, using char. If char is not given, it defaults to #\space. If k is less than the length of string, the resulting string is a truncated form of string. string-pad-left adds padding characters or truncates from the beginning of the string (lowest indices), while string-pad-right does so at the end of the string (highest indices).

(string-pad-left "hello" 4)             ⇒  "ello"
(string-pad-left "hello" 8)             ⇒  "   hello"
(string-pad-left "hello" 8 #\*)         ⇒  "***hello"
(string-pad-right "hello" 4)            ⇒  "hell"
(string-pad-right "hello" 8)            ⇒  "hello   "
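
The examples above can be rewritten with string-padder along the following lines. This is a sketch based on the string-padder examples earlier in this section; the names pad-left, pad-left-stars, and pad-right are chosen for the illustration.

```scheme
(define pad-left (string-padder 'where 'leading))
(define pad-left-stars (string-padder 'where 'leading 'fill-with "*"))
(define pad-right (string-padder 'where 'trailing))

(pad-left "hello" 4)          ; ⇒ "ello"
(pad-left "hello" 8)          ; ⇒ "   hello"
(pad-left-stars "hello" 8)    ; ⇒ "***hello"
(pad-right "hello" 4)         ; ⇒ "hell"
(pad-right "hello" 8)         ; ⇒ "hello   "
```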
procedure: string-trimmer where to-trim copier copy?

This procedure’s arguments are keyword arguments; that is, each argument is a symbol of the same name followed by its value. The order of the arguments doesn’t matter, but each argument may appear only once.

This procedure returns a trimmer procedure that takes a string as its argument and trims that string, returning the trimmed result. The trimming process is controlled by the arguments:

Some examples:

((string-trimmer 'where 'leading) "    ABC   DEF    ")
  ⇒  "ABC   DEF    "

((string-trimmer 'where 'trailing) "    ABC   DEF    ")
  ⇒  "    ABC   DEF"

((string-trimmer 'where 'both) "    ABC   DEF    ")
  ⇒  "ABC   DEF"

((string-trimmer) "    ABC   DEF    ")
  ⇒  "ABC   DEF"

((string-trimmer 'to-trim char-numeric? 'where 'leading)
 "21 East 21st Street #3")
  ⇒  " East 21st Street #3"

((string-trimmer 'to-trim char-numeric? 'where 'trailing)
 "21 East 21st Street #3")
  ⇒  "21 East 21st Street #"

((string-trimmer 'to-trim char-numeric?)
 "21 East 21st Street #3")
  ⇒  " East 21st Street #"
obsolete procedure: string-trim string [char-set]
obsolete procedure: string-trim-left string [char-set]
obsolete procedure: string-trim-right string [char-set]

These procedures are deprecated; use string-trimmer, which is more flexible, instead.

Returns an immutable string created by removing all characters that are not in char-set from: (string-trim) both ends of string; (string-trim-left) the beginning of string; or (string-trim-right) the end of string. Char-set defaults to char-set:not-whitespace.

(string-trim "  in the end  ")          ⇒  "in the end"
(string-trim "              ")          ⇒  ""
(string-trim "100th" char-set:numeric)  ⇒  "100"
(string-trim-left "-.-+-=-" (char-set #\+))
                                        ⇒  "+-=-"
(string-trim "but (+ x y) is" (char-set #\( #\)))
                                        ⇒  "(+ x y)"
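
When converting such calls to string-trimmer, note that the sense of the argument is reversed: char-set names the characters to keep, whereas to-trim matches the characters to remove. A sketch of two conversions, with trim-whitespace and trim-non-digits as illustrative names:

```scheme
;; Equivalent to (string-trim string): trims whitespace at both ends.
(define trim-whitespace (string-trimmer))

;; Equivalent to (string-trim string char-set:numeric): trims
;; everything that is not a digit from both ends.
(define trim-non-digits
  (string-trimmer 'to-trim
                  (lambda (char) (not (char-numeric? char)))))

(trim-whitespace "  in the end  ")    ; ⇒ "in the end"
(trim-non-digits "100th")             ; ⇒ "100"
```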
procedure: string-replace string char1 char2

Returns an immutable string containing the same characters as string except that all instances of char1 have been replaced by char2.


6.1 Searching and Matching Strings

This section describes procedures for searching a string, either for a character or a substring, and matching two strings to one another.

procedure: string-search-forward pattern string [start [end]]

The arguments pattern and string must satisfy string-in-nfc?.

Searches string for the leftmost occurrence of the substring pattern. If successful, the index of the first character of the matched substring is returned; otherwise, #f is returned.

(string-search-forward "rat" "pirate")
    ⇒ 2
(string-search-forward "rat" "pirate rating")
    ⇒ 2
(string-search-forward "rat" "pirate rating" 4 13)
    ⇒ 7
(string-search-forward "rat" "pirate rating" 9 13)
    ⇒ #f
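
The returned index is typically used to take the string apart around the match. The sketch below defines split-at-first, a hypothetical helper that is not part of MIT/GNU Scheme, which splits a string at the first occurrence of a pattern. It passes the start index explicitly.

```scheme
;; Hypothetical helper: returns a pair of the text before and after
;; the leftmost occurrence of pattern, or #f if there is none.
(define (split-at-first pattern string)
  (let ((index (string-search-forward pattern string 0)))
    (and index
         (cons (string-head string index)
               (string-tail string
                            (+ index (string-length pattern)))))))

(split-at-first "=" "key=value")     ; ⇒ ("key" . "value")
(split-at-first "=" "no equals")     ; ⇒ #f
```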
procedure: string-search-backward pattern string [start [end]]

The arguments pattern and string must satisfy string-in-nfc?.

Searches string for the rightmost occurrence of the substring pattern. If successful, the index to the right of the last character of the matched substring is returned; otherwise, #f is returned.

(string-search-backward "rat" "pirate")
    ⇒ 5
(string-search-backward "rat" "pirate rating")
    ⇒ 10
(string-search-backward "rat" "pirate rating" 1 8)
    ⇒ 5
(string-search-backward "rat" "pirate rating" 9 13)
    ⇒ #f
procedure: string-search-all pattern string [start [end]]

The arguments pattern and string must satisfy string-in-nfc?.

Searches string to find all occurrences of the substring pattern. Returns a list of the occurrences; each element of the list is an index pointing to the first character of an occurrence.

(string-search-all "rat" "pirate")
    ⇒ (2)
(string-search-all "rat" "pirate rating")
    ⇒ (2 7)
(string-search-all "rat" "pirate rating" 4 13)
    ⇒ (7)
(string-search-all "rat" "pirate rating" 9 13)
    ⇒ ()
procedure: substring? pattern string

Searches string to see if it contains the substring pattern. Returns #t if pattern is a substring of string, otherwise returns #f.

(substring? "rat" "pirate")             ⇒  #t
(substring? "rat" "outrage")            ⇒  #f
(substring? "" any-string)              ⇒  #t
(if (substring? "moon" text)
    (process-lunar text)
    'no-moon)
procedure: string-find-first-index proc string string …
procedure: string-find-last-index proc string string …

Each string must satisfy string-in-nfc?, and proc must accept as many arguments as there are strings.

These procedures apply proc element-wise to the elements of the strings and return the first or last index for which proc returns a true value. If there is no such index, then #f is returned.

If more than one string is given and not all strings have the same length, then only the indexes of the shortest string are tested.
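
Some examples, given as a sketch. The last one passes two strings and a two-argument procedure to find the first position at which the strings differ.

```scheme
(string-find-first-index char-numeric? "abc123")     ; ⇒ 3
(string-find-last-index char-numeric? "abc123x")     ; ⇒ 5
(string-find-first-index char-numeric? "abc")        ; ⇒ #f

(string-find-first-index
 (lambda (c1 c2) (not (char=? c1 c2)))
 "mirror"
 "micro")                                            ; ⇒ 2
```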

procedure: string-find-next-char string char [start [end]]
procedure: string-find-next-char-ci string char [start [end]]
procedure: string-find-next-char-in-set string char-set [start [end]]

The argument string must satisfy string-in-nfc?.

These procedures search string for a matching character, starting from start and moving forwards to end. If there is a matching character, the procedures stop the search and return the index of that character. If there is no matching character, the procedures return #f.

The procedures differ only in how they match characters: string-find-next-char matches a character that is char=? to char; string-find-next-char-ci matches a character that is char-ci=? to char; and string-find-next-char-in-set matches a character that’s a member of char-set.

(string-find-next-char "Adam" #\A)           ⇒  0 
(string-find-next-char "Adam" #\A 1 4)       ⇒  #f
(string-find-next-char-ci "Adam" #\A 1 4)    ⇒  2 
(string-find-next-char-in-set my-string char-set:alphabetic)
    ⇒  start position of the first word in my-string
; Can be used as a predicate:
(if (string-find-next-char-in-set my-string
                                  (char-set #\( #\) ))
    'contains-parentheses
    'no-parentheses)
procedure: string-find-previous-char string char [start [end]]
procedure: string-find-previous-char-ci string char [start [end]]
procedure: string-find-previous-char-in-set string char-set [start [end]]

The argument string must satisfy string-in-nfc?.

These procedures search string for a matching character, starting from end and moving backwards to start. If there is a matching character, the procedures stop the search and return the index of that character. If there is no matching character, the procedures return #f.

The procedures differ only in how they match characters: string-find-previous-char matches a character that is char=? to char; string-find-previous-char-ci matches a character that is char-ci=? to char; and string-find-previous-char-in-set matches a character that’s a member of char-set.
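
For example, a backward search can locate the last separator in a path-like string. This is a sketch; last-component is a hypothetical helper defined only for the illustration.

```scheme
(string-find-previous-char "path/to/file" #\/)    ; ⇒ 7
(string-find-previous-char "path/to/file" #\:)    ; ⇒ #f

;; Hypothetical helper: the text after the last slash, or the
;; whole string if there is no slash.
(define (last-component string)
  (let ((index (string-find-previous-char string #\/)))
    (if index
        (string-tail string (+ index 1))
        string)))

(last-component "path/to/file")                   ; ⇒ "file"
(last-component "file")                           ; ⇒ "file"
```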

procedure: string-match-forward string1 string2

The arguments string1 and string2 must satisfy string-in-nfc?.

Compares the two strings, starting from the beginning, and returns the number of characters that are the same. If the two strings start differently, returns 0.

(string-match-forward "mirror" "micro") ⇒  2  ; matches "mi"
(string-match-forward "a" "b")          ⇒  0  ; no match
procedure: string-match-backward string1 string2

The arguments string1 and string2 must satisfy string-in-nfc?.

Compares the two strings, starting from the end and matching toward the front, returning the number of characters that are the same. If the two strings end differently, returns 0.

(string-match-backward "bulbous" "fractious")
                                        ⇒  3  ; matches "ous"
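
The counts returned by the two matching procedures can be used to extract the shared text. In this sketch, common-prefix and common-suffix are hypothetical helpers defined only for the illustration.

```scheme
;; Hypothetical helper: the longest prefix shared by s1 and s2.
(define (common-prefix s1 s2)
  (string-head s1 (string-match-forward s1 s2)))

;; Hypothetical helper: the longest suffix shared by s1 and s2.
(define (common-suffix s1 s2)
  (string-tail s1
               (- (string-length s1)
                  (string-match-backward s1 s2))))

(common-prefix "mirror" "micro")          ; ⇒ "mi"
(common-suffix "bulbous" "fractious")     ; ⇒ "ous"
(common-prefix "a" "b")                   ; ⇒ ""
```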
procedure: string-prefix? string1 string2
procedure: string-prefix-ci? string1 string2

These procedures return #t if the first string forms the prefix of the second; otherwise they return #f. The -ci procedures don’t distinguish uppercase and lowercase letters.

(string-prefix? "abc" "abcdef")         ⇒  #t
(string-prefix? "" any-string)          ⇒  #t
procedure: string-suffix? string1 string2
procedure: string-suffix-ci? string1 string2

These procedures return #t if the first string forms the suffix of the second; otherwise they return #f. The -ci procedures don’t distinguish uppercase and lowercase letters.

(string-suffix? "ous" "bulbous")        ⇒  #t
(string-suffix? "" any-string)          ⇒  #t
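
These predicates are often paired with string-head or string-tail to remove the matched part. A sketch, in which strip-prefix and strip-suffix are hypothetical helpers:

```scheme
;; Hypothetical helper: removes prefix from string if present.
(define (strip-prefix prefix string)
  (if (string-prefix? prefix string)
      (string-tail string (string-length prefix))
      string))

;; Hypothetical helper: removes suffix from string if present.
(define (strip-suffix suffix string)
  (if (string-suffix? suffix string)
      (string-head string
                   (- (string-length string)
                      (string-length suffix)))
      string))

(strip-prefix "abc" "abcdef")         ; ⇒ "def"
(strip-prefix "xyz" "abcdef")         ; ⇒ "abcdef"
(strip-suffix "ous" "bulbous")        ; ⇒ "bulb"
(string-prefix-ci? "ABC" "abcdef")    ; ⇒ #t
```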

6.2 Regular Expressions

MIT/GNU Scheme provides support for matching and searching strings against regular expressions. This is considerably more flexible than ordinary string matching and searching, but potentially much slower. On the other hand it is less powerful than the mechanism described in Parser Language.

Traditional regular expressions are defined with string patterns in which characters like ‘[’ and ‘*’ have special meanings. Unfortunately, the syntax of these patterns is not only baroque but also comes in many different and mutually-incompatible varieties. As a consequence we have chosen to specify regular expressions using an s-expression syntax, which we call a regular s-expression, abbreviated as regsexp.

Previous releases of MIT/GNU Scheme provided a regular-expression implementation nearly identical to that of GNU Emacs version 18. This implementation supported only 8-bit strings, which made it unsuitable for use with Unicode strings. This implementation still exists but is deprecated and will be removed in a future release.


6.2.1 Regular S-Expressions

A regular s-expression is either a character or a string, which matches itself, or one of the following forms.

Examples in this section use the following definitions for brevity:

(define (try-match pattern string)
  (regsexp-match-string (compile-regsexp pattern) string))

(define (try-search pattern string)
  (regsexp-search-string-forward (compile-regsexp pattern) string))

These forms match one or more characters literally:

regsexp: char-ci char

Matches char without considering case.

regsexp: string-ci string

Matches string without considering case.
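
A sketch of the two case-insensitive forms follows. It repeats the try-match definition given at the start of this section so that it can be evaluated on its own; the results are written in the same (start end) form as the other examples here.

```scheme
(define (try-match pattern string)
  (regsexp-match-string (compile-regsexp pattern) string))

(try-match '(char-ci #\a) "A")                    ; ⇒ (0 1)
(try-match '(char-ci #\a) "b")                    ; ⇒ #f
(try-match '(string-ci "hello") "HELLO world")    ; ⇒ (0 5)
(try-match '(string-ci "hello") "help")           ; ⇒ #f
```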

regsexp: any-char

Matches one character other than #\newline.

(try-match '(any-char) "") ⇒ #f
(try-match '(any-char) "a") ⇒ (0 1)
(try-match '(any-char) "\n") ⇒ #f
(try-search '(any-char) "") ⇒ #f
(try-search '(any-char) "ab") ⇒ (0 1)
(try-search '(any-char) "\na") ⇒ (1 2)
regsexp: char-in datum …
regsexp: char-not-in datum …

Matches one character in (not in) the character set specified by (char-set datum …).

(try-match '(seq "a" (char-in "ab") "c") "abc") ⇒ (0 3)
(try-match '(seq "a" (char-not-in "ab") "c") "abc") ⇒ #f
(try-match '(seq "a" (char-not-in "ab") "c") "adc") ⇒ (0 3)
(try-match '(seq "a" (+ (char-in numeric)) "c") "a019c") ⇒ (0 5)

These forms match no characters, but only at specific locations in the input string:

regsexp: line-start
regsexp: line-end

Matches no characters at the start (end) of a line.

(try-match '(seq (line-start)
                 (* (any-char))
                 (line-end))
           "abc") ⇒ (0 3)
(try-match '(seq (line-start)
                 (* (any-char))
                 (line-end))
           "ab\nc") ⇒ (0 2)
(try-search '(seq (line-start)
                  (* (char-in alphabetic))
                  (line-end))
            "1abc") ⇒ #f
(try-search '(seq (line-start)
                  (* (char-in alphabetic))
                  (line-end))
            "1\nabc") ⇒ (2 5)
regsexp: string-start
regsexp: string-end

Matches no characters at the start (end) of the string.

(try-match '(seq (string-start)
                 (* (any-char))
                 (string-end))
           "abc") ⇒ (0 3)
(try-match '(seq (string-start)
                 (* (any-char))
                 (string-end))
           "ab\nc") ⇒ #f
(try-search '(seq (string-start)
                  (* (char-in alphabetic))
                  (string-end))
            "1abc") ⇒ #f
(try-search '(seq (string-start)
                  (* (char-in alphabetic))
                  (string-end))
            "1\nabc") ⇒ #f

These forms match repetitions of a given regsexp. Most of them come in two forms, one of which is greedy and the other shy. The greedy form matches as many repetitions as it can, then uses failure backtracking to reduce the number of repetitions one at a time. The shy form matches the minimum number of repetitions, then uses failure backtracking to increase the number of repetitions one at a time. The shy form is similar to the greedy form except that a ? is added at the end of the form’s keyword.

regsexp: ? regsexp
regsexp: ?? regsexp

Matches regsexp zero or one time.

(try-search '(seq (char-in alphabetic)
                  (? (char-in numeric)))
            "a") ⇒ (0 1)
(try-search '(seq (char-in alphabetic)
                  (?? (char-in numeric)))
            "a") ⇒ (0 1)
(try-search '(seq (char-in alphabetic)
                  (? (char-in numeric)))
            "a1") ⇒ (0 2)
(try-search '(seq (char-in alphabetic)
                  (?? (char-in numeric)))
            "a1") ⇒ (0 1)
(try-search '(seq (char-in alphabetic)
                  (? (char-in numeric)))
            "1a2") ⇒ (1 3)
(try-search '(seq (char-in alphabetic)
                  (?? (char-in numeric)))
            "1a2") ⇒ (1 2)
regsexp: * regsexp
regsexp: *? regsexp

Matches regsexp zero or more times.

(try-match '(seq (char-in alphabetic)
                 (* (char-in numeric))
                 (any-char))
           "aa") ⇒ (0 2)
(try-match '(seq (char-in alphabetic)
                 (*? (char-in numeric))
                 (any-char))
           "aa") ⇒ (0 2)
(try-match '(seq (char-in alphabetic)
                 (* (char-in numeric))
                 (any-char))
           "a123a") ⇒ (0 5)
(try-match '(seq (char-in alphabetic)
                 (*? (char-in numeric))
                 (any-char))
           "a123a") ⇒ (0 2)
regsexp: + regsexp
regsexp: +? regsexp

Matches regsexp one or more times.

(try-match '(seq (char-in alphabetic)
                 (+ (char-in numeric))
                 (any-char))
           "aa") ⇒ #f
(try-match '(seq (char-in alphabetic)
                 (+? (char-in numeric))
                 (any-char))
           "aa") ⇒ #f
(try-match '(seq (char-in alphabetic)
                 (+ (char-in numeric))
                 (any-char))
           "a123a") ⇒ (0 5)
(try-match '(seq (char-in alphabetic)
                 (+? (char-in numeric))
                 (any-char))
           "a123a") ⇒ (0 3)
regsexp: ** n m regsexp
regsexp: **? n m regsexp

The n argument must be an exact nonnegative integer. The m argument must be either an exact integer greater than or equal to n, or else #f.

Matches regsexp at least n times and at most m times; if m is #f then there is no upper limit.

(try-match '(seq (char-in alphabetic)
                 (** 0 2 (char-in numeric))
                 (any-char))
           "aa") ⇒ (0 2)
(try-match '(seq (char-in alphabetic)
                 (**? 0 2 (char-in numeric))
                 (any-char))
           "aa") ⇒ (0 2)
(try-match '(seq (char-in alphabetic)
                 (** 0 2 (char-in numeric))
                 (any-char))
           "a123a") ⇒ (0 4)
(try-match '(seq (char-in alphabetic)
                 (**? 0 2 (char-in numeric))
                 (any-char))
           "a123a") ⇒ (0 2)
regsexp: ** n regsexp

This is an abbreviation for (** n n regsexp). This matcher is neither greedy nor shy since it matches a fixed number of repetitions.

These forms implement alternatives and sequencing:

regsexp: alt regsexp …

Matches one of the regsexp arguments, trying each in order from left to right.

(try-match '(alt #\a (char-in numeric)) "a") ⇒ (0 1)
(try-match '(alt #\a (char-in numeric)) "b") ⇒ #f
(try-match '(alt #\a (char-in numeric)) "1") ⇒ (0 1)
regsexp: seq regsexp …

Matches the first regsexp, then continues the match with the next regsexp, and so on until all of the arguments are matched.

(try-match '(seq #\a #\b) "a") ⇒ #f
(try-match '(seq #\a #\b) "aa") ⇒ #f
(try-match '(seq #\a #\b) "ab") ⇒ (0 2)

These forms implement named registers, which store matched segments of the input string:

regsexp: group key regsexp

The key argument must be a fixnum, a character, or a symbol.

Matches regsexp. If the match succeeds, the matched segment is stored in the register named key.

(try-match '(seq (group a (any-char))
                 (group b (any-char))
                 (any-char))
           "radar") ⇒ (0 3 (a . "r") (b . "a"))
regsexp: group-ref key

The key argument must be a fixnum, a character, or a symbol.

Matches the characters stored in the register named key. It is an error if that register has not been initialized with a corresponding group expression.

(try-match '(seq (group a (any-char))
                 (group b (any-char))
                 (any-char)
                 (group-ref b)
                 (group-ref a))
           "radar") ⇒ (0 5 (a . "r") (b . "a"))


6.2.2 Regsexp Procedures

The regular s-expression implementation has two parts, like many other regular-expression implementations: a compiler that translates the pattern into an efficient form, and one or more procedures that use that pattern to match or search inputs.

procedure: compile-regsexp regsexp

Compiles regsexp by translating it into a procedure that implements the specified matcher.

The match and search procedures each return a list when they are successful, and #f when they fail. The returned list is of the form (s e register …), where s is the index at which the match starts, e is the index at which the match ends, and each register is a pair (key . contents) where key is the register’s name and contents is the contents of that register as a string.

In order to get reliable results, the string arguments to these procedures must be in Unicode Normalization Form C. The string implementation keeps most strings in this form by default; in other cases the caller must convert the string using string->nfc.

procedure: regsexp-match-string crse string [start [end]]

The crse argument must be a value returned by compile-regsexp. The string argument must satisfy string-in-nfc?.

Matches string against crse and returns the result.

procedure: regsexp-search-string-forward crse string [start [end]]

The crse argument must be a value returned by compile-regsexp. The string argument must satisfy string-in-nfc?.

Searches string from left to right for a match against crse and returns the result.
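
As a minimal sketch of how the compiler and the match and search procedures fit together, the following compiles a pattern once and reuses it. The names word-crse and result, and the register keys letter and digits, are illustrative only; the values in the comments are what the descriptions above imply, assuming that matching starts at the beginning of the string as the earlier examples suggest.

```scheme
;; Compile the pattern once, then reuse the compiled form.
(define word-crse
  (compile-regsexp
   '(seq (group letter (char-in alphabetic))
         (group digits (+ (char-in numeric))))))

;; A successful match returns (s e register ...).
(define result (regsexp-match-string word-crse "x42"))
(car result)                            ; start index: 0
(cadr result)                           ; end index: 3

;; Registers are (key . contents) pairs, so assq can look them up
;; without relying on the order in which they are listed.
(cdr (assq 'digits (cddr result)))      ; "42"

;; Searching scans left to right for a position where the pattern matches.
(regsexp-search-string-forward word-crse "--x42--")
```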



7 Lists

A pair (sometimes called a dotted pair) is a data structure with two fields called the car and cdr fields (for historical reasons). Pairs are created by the procedure cons. The car and cdr fields are accessed by the procedures car and cdr. The car and cdr fields are assigned by the procedures set-car! and set-cdr!.

Pairs are used primarily to represent lists. A list can be defined recursively as either the empty list or a pair whose cdr is a list. More precisely, the set of lists is defined as the smallest set X such that the empty list is in X, and if list is in X, then any pair whose cdr field contains list is also in X.

The objects in the car fields of successive pairs of a list are the elements of the list. For example, a two-element list is a pair whose car is the first element and whose cdr is a pair whose car is the second element and whose cdr is the empty list. The length of a list is the number of elements, which is the same as the number of pairs. The empty list is a special object of its own type (it is not a pair); it has no elements and its length is zero. [5]

The most general notation (external representation) for Scheme pairs is the “dotted” notation (c1 . c2) where c1 is the value of the car field and c2 is the value of the cdr field. For example, (4 . 5) is a pair whose car is 4 and whose cdr is 5. Note that (4 . 5) is the external representation of a pair, not an expression that evaluates to a pair.

A more streamlined notation can be used for lists: the elements of the list are simply enclosed in parentheses and separated by spaces. The empty list is written (). For example, the following are equivalent notations for a list of symbols:

(a b c d e)
(a . (b . (c . (d . (e . ())))))
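
A quick check, offered as an illustration rather than as part of the original text, confirms that the reader treats the two notations as the same structure, and that the same structure can be built explicitly with cons:

```scheme
;; Both notations read as the same five-element list.
(equal? '(a b c d e)
        '(a . (b . (c . (d . (e . ()))))))   ; ⇒ #t

;; The same shape, built pair by pair.
(equal? (list 'a 'b)
        (cons 'a (cons 'b '())))             ; ⇒ #t
```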

Whether a given pair is a list depends upon what is stored in the cdr field. When the set-cdr! procedure is used, an object can be a list one moment and not the next:

(define x (list 'a 'b 'c))
(define y x)
y                                       ⇒ (a b c)
(list? y)                               ⇒ #t
(set-cdr! x 4)                          ⇒ unspecified
x                                       ⇒ (a . 4)
(eqv? x y)                              ⇒ #t
y                                       ⇒ (a . 4)
(list? y)                               ⇒ #f
(set-cdr! x x)                          ⇒ unspecified
(list? y)                               ⇒ #f

A chain of pairs that doesn’t end in the empty list is called an improper list. Note that an improper list is not a list. The list and dotted notations can be combined to represent improper lists, as the following equivalent notations show:

(a b c . d)
(a . (b . (c . d)))

Within literal expressions and representations of objects read by the read procedure, the forms 'datum, `datum, ,datum, and ,@datum denote two-element lists whose first elements are the symbols quote, quasiquote, unquote, and unquote-splicing, respectively. The second element in each case is datum. This convention is supported so that arbitrary Scheme programs may be represented as lists. Among other things, this permits the use of the read procedure to parse Scheme programs.
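
The convention can be observed directly by quoting one of these forms and inspecting the resulting list. This is a small illustrative sketch; the symbols a, x, y, and z are arbitrary.

```scheme
;; ''a reads as (quote (quote a)); evaluating it yields the list (quote a).
(car ''a)                                ; ⇒ quote
(equal? ''a '(quote a))                  ; ⇒ #t

;; The backquote, comma, and comma-at abbreviations expand the same way.
(equal? '`(x ,y ,@z)
        '(quasiquote (x (unquote y) (unquote-splicing z))))   ; ⇒ #t
```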



7.1 Pairs

This section describes the simple operations that are available for constructing and manipulating arbitrary graphs constructed from pairs.

standard procedure: pair? object

Returns #t if object is a pair; otherwise returns #f.

(pair? '(a . b))                        ⇒ #t
(pair? '(a b c))                        ⇒ #t
(pair? '())                             ⇒ #f
(pair? '#(a b))                         ⇒ #f
standard procedure: cons obj1 obj2

Returns a newly allocated pair whose car is obj1 and whose cdr is obj2. The pair is guaranteed to be different (in the sense of eqv?) from every previously existing object.

(cons 'a '())                           ⇒ (a)
(cons '(a) '(b c d))                    ⇒ ((a) b c d)
(cons "a" '(b c))                       ⇒ ("a" b c)
(cons 'a 3)                             ⇒ (a . 3)
(cons '(a b) 'c)                        ⇒ ((a b) . c)
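
The guarantee that each call to cons returns a distinct object can be illustrated as follows; p1 and p2 are names chosen for this sketch.

```scheme
(define p1 (cons 1 2))
(define p2 (cons 1 2))

(equal? p1 p2)                          ; ⇒ #t  same contents
(eqv? p1 p2)                            ; ⇒ #f  two separate pairs
(eqv? p1 p1)                            ; ⇒ #t  a pair is identical to itself
```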
SRFI 1 procedure: xcons obj1 obj2

Returns a newly allocated pair whose car is obj2 and whose cdr is obj1.

(xcons '(b c) 'a)                       ⇒ (a b c)
standard procedure: car pair

Returns the contents of the car field of pair. Note that it is an error to take the car of the empty list.

(car '(a b c))                          ⇒ a
(car '((a) b c d))                      ⇒ (a)
(car '(1 . 2))                          ⇒ 1
(car '())                               error→ Illegal datum
standard procedure: cdr pair

Returns the contents of the cdr field of pair. Note that it is an error to take the cdr of the empty list.

(cdr '((a) b c d))                      ⇒ (b c d)
(cdr '(1 . 2))                          ⇒ 2
(cdr '())                               error→ Illegal datum
SRFI 1 procedure: car+cdr pair

The fundamental pair deconstructor:

(lambda (p) (values (car p) (cdr p)))
(receive (a b) (car+cdr (cons 1 2))
  (write-line a)
  (write-line b))
-| 1
-| 2
standard procedure: set-car! pair object

Stores object in the car field of pair. The value returned by set-car! is unspecified.

(define (f) (list 'not-a-constant-list))
(define (g) '(constant-list))
(set-car! (f) 3)                        ⇒ unspecified
(set-car! (g) 3)                        error→ Illegal datum
standard procedure: set-cdr! pair object

Stores object in the cdr field of pair. The value returned by set-cdr! is unspecified.
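
A short illustrative example, using a freshly allocated list (not a constant) so that mutation is safe; the name p is arbitrary.

```scheme
(define p (list 'a 'b 'c))

;; Replace everything after the first element.
(set-cdr! p (list 'x 'y))
p                                       ; ⇒ (a x y)

;; Truncate the list after its second element.
(set-cdr! (cdr p) '())
p                                       ; ⇒ (a x)
```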

standard procedure: caar pair
standard procedure: cadr pair
standard procedure: cdar pair
standard procedure: cddr pair
standard procedure: caaar pair
standard procedure: caadr pair
standard procedure: cadar pair
standard procedure: caddr pair
standard procedure: cdaar pair
standard procedure: cdadr pair
standard procedure: cddar pair
standard procedure: cdddr pair
standard procedure: caaaar pair
standard procedure: caaadr pair
standard procedure: caadar pair
standard procedure: caaddr pair
standard procedure: cadaar pair
standard procedure: cadadr pair
standard procedure: caddar pair
standard procedure: cadddr pair
standard procedure: cdaaar pair
standard procedure: cdaadr pair
standard procedure: cdadar pair
standard procedure: cdaddr pair
standard procedure: cddaar pair
standard procedure: cddadr pair
standard procedure: cdddar pair
standard procedure: cddddr pair

These procedures are compositions of car and cdr; for example, caddr could be defined by

(define caddr (lambda (x) (car (cdr (cdr x)))))
procedure: general-car-cdr object path

This procedure is a generalization of car and cdr. Path encodes a particular sequence of car and cdr operations, which general-car-cdr executes on object. Path is an exact non-negative integer that encodes the operations in a bitwise fashion: a zero bit represents a cdr operation, and a one bit represents a car. The bits are executed LSB to MSB, and the most significant one bit, rather than being interpreted as an operation, signals the end of the sequence. [6]

For example, the following are equivalent:

(general-car-cdr object #b1011)
(cdr (car (car object)))

Here is a partial table of path/operation equivalents:

#b10    cdr
#b11    car
#b100   cddr
#b101   cdar
#b110   cadr
#b111   caar
#b1000  cdddr
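
The encoding can be made concrete with a small portable decoder. This is only a sketch of the bit-walking scheme described above, not the system's implementation, and the name path-car-cdr is hypothetical.

```scheme
;; Hypothetical re-implementation of the path decoding, for illustration.
;; Consume bits from least to most significant; stop when only the
;; terminating one bit (the value 1) remains.
(define (path-car-cdr object path)
  (let loop ((object object) (path path))
    (if (<= path 1)
        object
        (loop (if (odd? path)
                  (car object)          ; one bit: take the car
                  (cdr object))         ; zero bit: take the cdr
              (quotient path 2)))))

(path-car-cdr '(a b c) #b110)                     ; ⇒ b, same as cadr
(path-car-cdr '(((1 . 2) . 3) . 4) #b1011)        ; ⇒ 2, same as (cdr (car (car x)))
```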
SRFI 1 procedure: tree-copy tree

This copies an arbitrary tree constructed from pairs, copying both the car and cdr elements of every pair. This could have been defined by

(define (tree-copy tree)
  (let loop ((tree tree))
    (if (pair? tree)
        (cons (loop (car tree)) (loop (cdr tree)))
        tree)))


7.2 Construction of Lists

standard procedure: list object …

Returns a list of its arguments.

(list 'a (+ 3 4) 'c)                    ⇒ (a 7 c)
(list)                                  ⇒ ()

These expressions are equivalent:

(list obj1 obj2 … objN)
(cons obj1 (cons obj2 … (cons objN '()) …))
SRFI 1 procedure: make-list n [fill]

Returns an n-element list, whose elements are all the value fill. If the fill argument is not given, the elements of the list may be arbitrary values.

(make-list 4 'c)                        ⇒ (c c c c)
SRFI 1 procedure: cons* object object …

cons* is similar to list, except that cons* conses together the last two arguments rather than consing the last argument with the empty list. If the last argument is not a list the result is an improper list. If the last argument is a list, the result is a list consisting of the initial arguments and all of the items in the final argument. If there is only one argument, the result is the argument.

(cons* 'a 'b 'c)                        ⇒ (a b . c)
(cons* 'a 'b '(c d))                    ⇒ (a b c d)
(cons* 'a)                              ⇒ a

These expressions are equivalent:

(cons* obj1 obj2 … objN-1 objN)
(cons obj1 (cons obj2 … (cons objN-1 objN) …))
SRFI 1 procedure: list-tabulate k init-proc
obsolete procedure: make-initialized-list k init-proc

Returns a k-element list. Element i of the list, where 0 <= i < k, is produced by (init-proc i). No guarantee is made about the dynamic order in which init-proc is applied to these indices.

(list-tabulate 4 values) => (0 1 2 3)
SRFI 1 procedure: list-copy list

Returns a newly allocated copy of list. This copies each of the pairs comprising list. This could have been defined by

(define (list-copy list)
  (if (null? list)
      '()
      (cons (car list)
            (list-copy (cdr list)))))
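
The difference between list-copy and the tree-copy procedure described under Pairs shows up when the elements are themselves pairs. The following sketch uses illustrative names (original, shallow, deep):

```scheme
(define original (list (list 1 2) 3))
(define shallow (list-copy original))
(define deep (tree-copy original))

(equal? original shallow)               ; ⇒ #t
(eq? original shallow)                  ; ⇒ #f  the top-level pairs are new

;; list-copy copies only the spine, so the elements are shared ...
(eq? (car original) (car shallow))      ; ⇒ #t
;; ... whereas tree-copy also copies pairs found in the car fields.
(eq? (car original) (car deep))         ; ⇒ #f
```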
SRFI 1 procedure: iota count [start [step]]

Returns a list containing the elements

(start start+step … start+(count-1)*step)

Count must be an exact non-negative integer, while start and step can be any numbers. The start and step parameters default to 0 and 1, respectively.

(iota 5) ⇒ (0 1 2 3 4)
(iota 5 0 -0.1) ⇒ (0 -0.1 -0.2 -0.3 -0.4)
standard procedure: vector->list vector [start [end]]
obsolete procedure: subvector->list vector start end

Returns a newly allocated list of the elements of vector between start inclusive and end exclusive. The inverse of vector->list is list->vector.

(vector->list '#(dah dah didah))        ⇒ (dah dah didah)


7.3 Selecting List Components

standard procedure: list? object
SRFI 1 procedure: proper-list? object

Returns #t if object is a proper list, otherwise returns #f. By definition, all proper lists have finite length and are terminated by the empty list. If object is a circular list, returns #f.

Any object satisfying this predicate will also satisfy exactly one of pair? or null?.

(list? (list 'a 'b 'c))                 ⇒ #t
(list? (cons* 'a 'b 'c))                ⇒ #f
(list? (circular-list 'a 'b 'c))        ⇒ #f
SRFI 1 procedure: circular-list? object

Returns #t if object is a circular list, otherwise returns #f.

(circular-list? (list 'a 'b 'c))        ⇒ #f
(circular-list? (cons* 'a 'b 'c))       ⇒ #f
(circular-list? (circular-list 'a 'b 'c)) ⇒ #t
SRFI 1 procedure: dotted-list? object

Returns #t if object is an improper list, otherwise returns #f.

(dotted-list? (list 'a 'b 'c))          ⇒ #f
(dotted-list? (cons* 'a 'b 'c))         ⇒ #t
(dotted-list? (circular-list 'a 'b 'c)) ⇒ #f
standard procedure: length list

Returns the length of list. Signals an error if list isn’t a proper list.

(length (list 'a 'b 'c))                ⇒ 3
(length (cons* 'a 'b 'c))               error→
(length (circular-list 'a 'b 'c))       error→
SRFI 1 procedure: length+ clist

Clist must be a proper, dotted, or circular list. If clist is a circular list, returns #f, otherwise returns the number of pairs comprising the list (which is the same as the length for a proper list).

(length+ (list 'a 'b 'c))               ⇒ 3
(length+ (cons* 'a 'b 'c))              ⇒ 2
(length+ (circular-list 'a 'b 'c))      ⇒ #f
standard procedure: null? object

Returns #t if object is the empty list; otherwise returns #f.

(null? '())                             ⇒ #t
(null? (list 'a 'b 'c))                 ⇒ #f
(null? (cons* 'a 'b 'c))                ⇒ #f
(null? (circular-list 'a 'b 'c))        ⇒ #f
SRFI 1 procedure: null-list? list

List is a proper or circular list. This procedure returns #t if the argument is the empty list (), and #f if the argument is a pair. It is an error to pass this procedure any other value. This procedure is recommended as the termination condition for list-processing procedures that are not defined on dotted lists.

standard procedure: list-ref list k

Returns the kth element of list, using zero-origin indexing. The valid indexes of a list are the exact non-negative integers less than the length of the list. The first element of a list has index 0, the second has index 1, and so on.

(list-ref '(a b c d) 2)                 ⇒ c
(list-ref '(a b c d)
          (exact (round 1.8)))
     ⇒ c

(list-ref list k) is equivalent to (car (drop list k)).

SRFI 1 procedure: first list
SRFI 1 procedure: second list
SRFI 1 procedure: third list
SRFI 1 procedure: fourth list
SRFI 1 procedure: fifth list
SRFI 1 procedure: sixth list
SRFI 1 procedure: seventh list
SRFI 1 procedure: eighth list
SRFI 1 procedure: ninth list
SRFI 1 procedure: tenth list

Returns the specified element of list. It is an error if list is not long enough to contain the specified element (for example, if the argument to seventh is a list that contains only six elements).



7.4 Cutting and Pasting Lists

SRFI 1 procedure: take x i
SRFI 1 procedure: drop x i

take returns the first i elements of list x. drop returns all but the first i elements of list x.

(take '(a b c d e)  2) => (a b)
(drop '(a b c d e)  2) => (c d e)

x may be any value—a proper, circular, or dotted list:

(take '(1 2 3 . d) 2) => (1 2)
(drop '(1 2 3 . d) 2) => (3 . d)
(take '(1 2 3 . d) 3) => (1 2 3)
(drop '(1 2 3 . d) 3) => d

For a legal i, take and drop partition the list in a manner which can be inverted with append:

(append (take x i) (drop x i)) = x

drop is exactly equivalent to performing i cdr operations on x; the returned value shares a common tail with x. If the argument is a list of non-zero length, take is guaranteed to return a freshly-allocated list, even in the case where the entire list is taken, e.g. (take lis (length lis)).
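
The sharing behavior can be checked with eq?. This is an illustrative sketch; x is a freshly allocated five-element list.

```scheme
(define x (list 'a 'b 'c 'd 'e))

;; drop returns a tail of x itself, not a copy.
(eq? (drop x 2) (cddr x))                   ; ⇒ #t

;; take and drop partition the list, and append inverts the partition.
(equal? (append (take x 2) (drop x 2)) x)   ; ⇒ #t

;; take copies, even when the whole list is taken.
(eq? (take x (length x)) x)                 ; ⇒ #f
```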

obsolete procedure: list-head x i
standard procedure: list-tail x i

Equivalent to take and drop, respectively. list-head is deprecated and should not be used. list-tail is defined by R7RS.

procedure: sublist list start end

Start and end must be exact integers satisfying

0 <= start <= end <= (length list)

sublist returns a newly allocated list formed from the elements of list beginning at index start (inclusive) and ending at end (exclusive).
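
For illustration, here is sublist alongside an equivalent formulation in terms of take and drop. The equivalence follows from the definitions given in this section; the name lst is arbitrary.

```scheme
(define lst (list 'a 'b 'c 'd 'e))

(sublist lst 1 3)                       ; ⇒ (b c)
(sublist lst 2 2)                       ; ⇒ ()

;; Skip START elements, then keep END - START of them.
(equal? (sublist lst 1 3)
        (take (drop lst 1) (- 3 1)))    ; ⇒ #t
```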

standard procedure: append list …

Returns a list consisting of the elements of the first list followed by the elements of the other lists.

(append '(x) '(y))                      ⇒ (x y)
(append '(a) '(b c d))                  ⇒ (a b c d)
(append '(a (b)) '((c)))                ⇒ (a (b) (c))
(append)                                ⇒ ()

The resulting list is always newly allocated, except that it shares structure with the last list argument. The last argument may actually be any object; an improper list results if the last argument is not a proper list.

(append '(a b) '(c . d))                ⇒ (a b c . d)
(append '() 'a)                         ⇒ a
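
The structure sharing with the last argument can be observed with eq?. In this sketch, tail and joined are illustrative names.

```scheme
(define tail (list 'c 'd))
(define joined (append (list 'a 'b) tail))

joined                                  ; ⇒ (a b c d)
;; The last argument is not copied: it is the tail of the result.
(eq? (cddr joined) tail)                ; ⇒ #t
```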
SRFI 1 procedure: append! list …

Returns a list that is the argument lists concatenated together. The arguments are changed rather than copied. (Compare this with append, which copies arguments rather than destroying them.) For example:

(define x (list 'a 'b 'c))
(define y (list 'd 'e 'f))
(define z (list 'g 'h))
(append! x y z)                         ⇒ (a b c d e f g h)
x                                       ⇒ (a b c d e f g h)
y                                       ⇒ (d e f g h)
z                                       ⇒ (g h)
SRFI 1 procedure: last pair
SRFI 1 procedure: last-pair pair

last returns the last element of the non-empty, finite list pair. last-pair returns the last pair in the non-empty, finite list pair.

(last '(a b c)) => c
(last-pair '(a b c)) => (c)
obsolete procedure: except-last-pair list
obsolete procedure: except-last-pair! list

These procedures are deprecated. Instead use drop-right or drop-right!, respectively, with a second argument of 1.
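
A sketch of the suggested replacement, using the SRFI 1 procedure drop-right; items is an illustrative name.

```scheme
(define items (list 'a 'b 'c))

;; All but the last element, as a new list.
(drop-right items 1)                    ; ⇒ (a b)
;; The argument itself is left unchanged.
items                                   ; ⇒ (a b c)
```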



7.5 Filtering Lists

SRFI 1 procedure: filter predicate list

Returns a newly allocated copy of list containing only the elements satisfying predicate. Predicate must be a procedure of one argument.

(filter odd? '(1 2 3 4 5)) ⇒ (1 3 5)
SRFI 1 procedure: remove predicate list

Like filter, except that the returned list contains only those elements not satisfying predicate.

(remove odd? '(1 2 3 4 5)) ⇒ (2 4)
SRFI 1 procedure: partition predicate list

Partitions the elements of list with predicate, and returns two values: the list of in-elements and the list of out-elements. The list is not disordered—elements occur in the result lists in the same order as they occur in the argument list. The dynamic order in which the various applications of predicate are made is not specified. One of the returned lists may share a common tail with the argument list.

(partition symbol? '(one 2 3 four five 6)) => 
    (one four five)
    (2 3 6)
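
Because partition returns two values, the caller needs a multiple-value receiver. The following sketch uses the standard call-with-values; the parameter names evens and odds are illustrative.

```scheme
(call-with-values
    (lambda () (partition even? '(1 2 3 4 5 6)))
  (lambda (evens odds)
    ;; Package both result lists so they can be inspected together.
    (list evens odds)))                 ; ⇒ ((2 4 6) (1 3 5))
```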
SRFI 1 procedure: filter! predicate list
SRFI 1 procedure: remove! predicate list
SRFI 1 procedure: partition! predicate list

Linear-update variants of filter, remove and partition. These procedures are allowed, but not required, to alter the cons cells in the argument list to construct the result lists.

SRFI 1 procedure: delete x list [compare]
SRFI 1 procedure: delete! x list [compare]

delete uses the comparison procedure compare, which defaults to equal?, to find all elements of list that are equal to x, and deletes them from list. The dynamic order in which the various applications of compare are made is not specified.

The list is not disordered—elements that appear in the result list occur in the same order as they occur in the argument list. The result may share a common tail with the argument list.

Note that fully general element deletion can be performed with the remove and remove! procedures, e.g.:

;; Delete all the even elements from LIS:
(remove even? lis)

The comparison procedure is used in this way: (compare x ei). That is, x is always the first argument, and a list element is always the second argument. The comparison procedure will be used to compare each element of list exactly once; the order in which it is applied to the various ei is not specified. Thus, one can reliably remove all the numbers greater than five from a list with (delete 5 list <).

delete! is the linear-update variant of delete. It is allowed, but not required, to alter the cons cells in its argument list to construct the result.
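
The following illustrative calls show the default comparison and the argument order of a custom one:

```scheme
;; Default comparison is equal?.
(delete 'b (list 'a 'b 'c 'b))          ; ⇒ (a c)

;; compare is called as (< 5 element), so elements greater than 5 go.
(delete 5 (list 1 7 5 3 9) <)           ; ⇒ (1 5 3)
```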

procedure: delq x list
procedure: delq! x list
procedure: delv x list
procedure: delv! x list

Equivalent to (delete x list eq?), (delete! x list eq?), (delete x list eqv?), and (delete! x list eqv?), respectively.

procedure: delete-member-procedure deletor predicate

Returns a deletion procedure similar to delv or delete!. Deletor should be one of the procedures list-deletor or list-deletor!. Predicate must be an equivalence predicate. The returned procedure accepts exactly two arguments: first, an object to be deleted, and second, a list of objects from which it is to be deleted. If deletor is list-deletor, the procedure returns a newly allocated copy of the given list in which all entries equal to the given object have been removed. If deletor is list-deletor!, the procedure returns a list consisting of the top-level elements of the given list with all entries equal to the given object removed; the given list is destructively modified to produce the result. In either case predicate is used to compare the given object to the elements of the given list.

Here are some examples that demonstrate how delete-member-procedure could have been used to implement delv and delete!:

(define delv
  (delete-member-procedure list-deletor eqv?))
(define delete!
  (delete-member-procedure list-deletor! equal?))
procedure: list-deletor predicate
procedure: list-deletor! predicate

These procedures each return a procedure that deletes elements from lists. Predicate must be a procedure of one argument. The returned procedure accepts exactly one argument, which must be a proper list, and applies predicate to each of the elements of the argument, deleting those for which it is true.

The procedure returned by list-deletor deletes elements non-destructively, by returning a newly allocated copy of the argument with the appropriate elements removed. The procedure returned by list-deletor! performs a destructive deletion.



7.6 Searching Lists

SRFI 1 procedure: find predicate list

Returns the first element in list for which predicate is true; returns #f if it doesn’t find such an element. Predicate must be a procedure of one argument.

(find even? '(3 1 4 1 5 9)) => 4

Note that find has an ambiguity in its lookup semantics: if find returns #f, you cannot tell (in general) if it found a #f element that satisfied predicate, or if it did not find any element at all. In many situations, this ambiguity cannot arise: either the list being searched is known not to contain any #f elements, or the list is guaranteed to have an element satisfying predicate. However, in cases where this ambiguity can arise, you should use find-tail instead of find; find-tail has no such ambiguity:

(cond ((find-tail pred lis)
        => (lambda (pair) …)) ; Handle (CAR PAIR)
      (else …)) ; Search failed.
SRFI 1 procedure: find-tail predicate list

Returns the first pair of list whose car satisfies predicate; returns #f if there’s no such pair. find-tail can be viewed as a general-predicate variant of memv.
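
The ambiguity of find, and the way find-tail avoids it, can be seen by searching for a false element. This is an illustrative sketch using not as the predicate.

```scheme
;; find cannot distinguish "found #f" from "found nothing".
(find not (list 1 #f 2))                ; ⇒ #f  (the element #f was found)
(find not (list 1 2 3))                 ; ⇒ #f  (no element satisfied not)

;; find-tail returns the pair, so the two cases differ.
(find-tail not (list 1 #f 2))           ; ⇒ (#f 2)
(find-tail not (list 1 2 3))            ; ⇒ #f
```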

standard procedure: memq object list
standard procedure: memv object list
standard procedure: member object list [compare]

These procedures return the first pair of list whose car is object; the returned pair is always one from which list is composed. If object does not occur in list, #f (n.b.: not the empty list) is returned. memq uses eq? to compare object with the elements of list, while memv uses eqv? and member uses compare, or equal? if compare is not supplied. [7]

(memq 'a '(a b c))                      ⇒ (a b c)
(memq 'b '(a b c))                      ⇒ (b c)
(memq 'a '(b c d))                      ⇒ #f
(memq (list 'a) '(b (a) c))             ⇒ #f
(member (list 'a) '(b (a) c))           ⇒ ((a) c)
(memq 101 '(100 101 102))               ⇒ unspecified
(memv 101 '(100 101 102))               ⇒ (101 102)
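
Two further illustrative cases: member with an explicit compare argument, and a check that the result is a pair of the original list, not a copy. The name lst is arbitrary.

```scheme
;; With = as the comparison, 2.0 matches the exact integer 2.
(member 2.0 (list 1 2 3) =)             ; ⇒ (2 3)

(define lst (list 'a 'b 'c))
;; The returned pair is part of lst itself.
(eq? (memq 'c lst) (cddr lst))          ; ⇒ #t
```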
procedure: member-procedure predicate

Returns a procedure similar to memq, except that predicate, which must be an equivalence predicate, is used instead of eq?. This could be used to define memv as follows:

(define memv (member-procedure eqv?))


7.7 Mapping of Lists

standard procedure: map procedure list list …

Procedure must be a procedure taking as many arguments as there are lists. If more than one list is given, then they must all be the same length. map applies procedure element-wise to the elements of the lists and returns a list of the results, in order from left to right. The dynamic order in which procedure is applied to the elements of the lists is unspecified; use for-each to sequence side effects.

(map cadr '((a b) (d e) (g h)))           ⇒ (b e h)
(map (lambda (n) (expt n n)) '(1 2 3 4))  ⇒ (1 4 27 256)
(map + '(1 2 3) '(4 5 6))                 ⇒ (5 7 9)
(let ((count 0))
  (map (lambda (ignored)
         (set! count (+ count 1))
         count)
       '(a b c)))                         ⇒ unspecified
obsolete procedure: map* knil proc list1 list2 …

Deprecated, use fold-right instead. Equivalent to

(fold-right (lambda (e1 e2 … acc)
              (cons* (proc e1)
                     (proc e2)
                     …
                     acc))
            knil
            list1
            list2
            …)
SRFI 1 procedure: append-map procedure list list …

Similar to map except that the results of applying procedure to the elements of lists are concatenated together by append rather than by cons. The following are equivalent, except that the former is more efficient:

(append-map procedure list1 list2 …)
(apply append (map procedure list1 list2 …))
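
A concrete illustration, with a procedure that returns a two-element list for each input element:

```scheme
(append-map (lambda (x) (list x (* x x)))
            '(1 2 3))                   ; ⇒ (1 1 2 4 3 9)

;; The same result, spelled out with map and append.
(apply append
       (map (lambda (x) (list x (* x x)))
            '(1 2 3)))                  ; ⇒ (1 1 2 4 3 9)
```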
obsolete procedure: append-map* knil proc list1 list2 …

Deprecated, use fold-right instead. Equivalent to

(fold-right (lambda (e1 e2 … acc)
              (append (proc e1)
                      (proc e2)
                      …
                      acc))
            knil
            list1
            list2
            …)
SRFI 1 procedure: append-map! proc list list …

Similar to map except that the results of applying proc to the elements of lists are concatenated together by append! rather than by cons. The following are equivalent, except that the former is more efficient:

(append-map! proc list list …)
(apply append! (map proc list list …))
obsolete procedure: append-map*! knil proc list1 list2 …

Deprecated, use fold-right instead. Equivalent to

(fold-right (lambda (e1 e2 … acc)
              (append! (proc e1)
                       (proc e2)
                       …
                       acc))
            knil
            list1
            list2
            …)
standard procedure: for-each procedure list list …

The arguments to for-each are like the arguments to map, but for-each calls procedure for its side effects rather than for its values. Unlike map, for-each is guaranteed to call procedure on the elements of the lists in order from the first element to the last, and the value returned by for-each is unspecified.

(let ((v (make-vector 5)))
  (for-each (lambda (i)
              (vector-set! v i (* i i)))
            '(0 1 2 3 4))
  v)                            ⇒ #(0 1 4 9 16)
SRFI 1 procedure: any predicate list list …

Applies predicate across the lists, returning true if predicate returns true on any application.

If there are n list arguments list1 … listn, then predicate must be a procedure taking n arguments and returning a boolean result.

any applies predicate to the first elements of the list parameters. If this application returns a true value, any immediately returns that value. Otherwise, it iterates, applying predicate to the second elements of the list parameters, then the third, and so forth. The iteration stops when a true value is produced or one of the lists runs out of values; in the latter case, any returns #f. The application of predicate to the last element of the lists is a tail call.

Note the difference between find and any: find returns the element that satisfied the predicate; any returns the true value that the predicate produced.

Like every, any’s name does not end with a question mark—this is to indicate that it does not return a simple boolean (#t or #f), but a general value.

(any integer? '(a 3 b 2.7))   => #t
(any integer? '(a 3.1 b 2.7)) => #f
(any < '(3 1 4 1 5)
       '(2 7 1 8 2)) => #t
SRFI 1 procedure: every predicate list list …

Applies predicate across the lists, returning true if predicate returns true on every application.

If there are n list arguments list1 … listn, then predicate must be a procedure taking n arguments and returning a boolean result.

every applies predicate to the first elements of the list parameters. If this application returns false, every immediately returns false. Otherwise, it iterates, applying predicate to the second elements of the list parameters, then the third, and so forth. The iteration stops when a false value is produced or one of the lists runs out of values. In the latter case, every returns the true value produced by its final application of predicate. The application of predicate to the last element of the lists is a tail call.

If one of the lists has no elements, every simply returns #t.

Like any, every’s name does not end with a question mark—this is to indicate that it does not return a simple boolean (#t or #f), but a general value.
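
The following illustrative calls exercise the cases described above, including the general (non-boolean) return value and the empty-list case:

```scheme
(every integer? '(1 2 3))               ; ⇒ #t
(every integer? '(1 2.5 3))             ; ⇒ #f

;; The value of the final application is returned, not just #t.
(every (lambda (x) (and (even? x) (* x x)))
       '(2 4 6))                        ; ⇒ 36

;; With several lists, predicate receives one element from each.
(every < '(1 2 3) '(2 3 4))             ; ⇒ #t

;; No applications at all: the result is #t.
(every even? '())                       ; ⇒ #t
```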


7.8 Folding of Lists

SRFI 1 procedure: fold kons knil clist1 clist2

The fundamental list iterator.

First, consider the single list-parameter case. If clist1 = (e1 e2 … en), then this procedure returns

(kons en … (kons e2 (kons e1 knil)) …)

That is, it obeys the (tail) recursion

(fold kons knil lis) = (fold kons (kons (car lis) knil) (cdr lis))
(fold kons knil '()) = knil

Examples:

(fold + 0 lis)                  ; Add up the elements of LIS.

(fold cons '() lis)             ; Reverse LIS.

(fold cons tail rev-head)       ; See APPEND-REVERSE.

;; How many symbols in LIS?
(fold (lambda (x count) (if (symbol? x) (+ count 1) count))
      0
      lis)

;; Length of the longest string in LIS:
(fold (lambda (s max-len) (max max-len (string-length s)))
      0
      lis)

If n list arguments are provided, then the kons procedure must take n+1 parameters: one element from each list, and the "seed" or fold state, which is initially knil. The fold operation terminates when the shortest list runs out of values:

(fold cons* '() '(a b c) '(1 2 3 4 5)) => (c 3 b 2 a 1)

At least one of the list arguments must be finite.
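
As an illustration of the multi-list case, the following sketch computes a dot product with fold. The name dot-product is hypothetical and is not part of the runtime system; the kons procedure takes one element from each list plus the accumulated sum.

```scheme
;; Hypothetical helper: sum of pairwise products of two lists.
(define (dot-product xs ys)
  (fold (lambda (x y sum) (+ sum (* x y)))
        0
        xs
        ys))

(dot-product '(1 2 3) '(4 5 6))   ; ⇒ 32
;; The fold stops when the shorter list runs out.
(dot-product '(1 2 3) '(4 5))     ; ⇒ 14
```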

SRFI 1 procedure: fold-right kons knil clist1 clist2

The fundamental list recursion operator.

First, consider the single list-parameter case. If clist1 = (e1 e2 … en), then this procedure returns

(kons e1 (kons e2 … (kons en knil)))

That is, it obeys the recursion

(fold-right kons knil lis) = (kons (car lis) (fold-right kons knil (cdr lis)))
(fold-right kons knil '()) = knil

Examples:

(fold-right cons '() lis)               ; Copy LIS.

;; Keep only the even numbers of LIS.
(fold-right (lambda (x l) (if (even? x) (cons x l) l)) '() lis)

If n list arguments are provided, then the kons function must take n+1 parameters: one element from each list, and the "seed" or fold state, which is initially knil. The fold operation terminates when the shortest list runs out of values:

(fold-right cons* '() '(a b c) '(1 2 3 4 5)) => (a 1 b 2 c 3)

At least one of the list arguments must be finite.
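
Using list as the kons procedure makes the nesting order of the two folds visible. This is an illustrative sketch; the names left and right are made up for the example.

```scheme
;; fold consumes the list from the left, so the first element
;; ends up innermost.
(define left  (fold list 'z '(a b c)))         ; ⇒ (c (b (a z)))

;; fold-right combines from the right, so the last element
;; ends up innermost.
(define right (fold-right list 'z '(a b c)))   ; ⇒ (a (b (c z)))
```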

obsolete procedure: fold-left proc knil list

Deprecated; use fold instead. Equivalent to

(fold (lambda (elt acc) (proc acc elt)) knil list)
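
The following sketch shows the argument order in practice, assuming fold-left is still available in the running system. fold-left passes the accumulator first and the element second, which is the reverse of what fold does.

```scheme
(define via-fold-left
  (fold-left list '() '(1 2 3)))                 ; ⇒ (((() 1) 2) 3)

;; The same result with fold: swap the arguments before calling list.
(define via-fold
  (fold (lambda (elt acc) (list acc elt))
        '()
        '(1 2 3)))                               ; ⇒ (((() 1) 2) 3)
```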
SRFI 1 procedure: reduce f ridentity list

reduce is a variant of fold.

ridentity should be a "right identity" of the procedure f—that is, for any value x acceptable to f,

(f x ridentity) = x

reduce has the following definition:

If list = (), return ridentity;
Otherwise, return (fold f (car list) (cdr list)).

...in other words, we compute (fold f ridentity list).

Note that ridentity is used only in the empty-list case. You typically use reduce when applying f is expensive and you’d like to avoid the extra application incurred when fold applies f to the head of list and the identity value, redundantly producing the same value passed in to f. For example, if f involves searching a file directory or performing a database query, this can be significant. In general, however, fold is useful in many contexts where reduce is not (consider the examples given in the fold definition—only one of the five folds uses a function with a right identity. The other four may not be performed with reduce).

;; Take the max of a list of non-negative integers.
(reduce max 0 nums) ; i.e., (apply max 0 nums)
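
The saving described above can be observed by counting applications of f. In this sketch, counting-max and calls are hypothetical names introduced only for the illustration.

```scheme
(define total (reduce + 0 '(1 2 3 4)))   ; ⇒ 10
(define empty (reduce + 0 '()))          ; ⇒ 0, the ridentity

;; Count how often the combining procedure is applied.
(define calls 0)
(define (counting-max a b)
  (set! calls (+ calls 1))
  (max a b))

;; Three elements need only two applications, because the first
;; element serves as the initial seed.
(define largest (reduce counting-max 0 '(3 9 2)))   ; ⇒ 9
```

After the last call, calls is 2; the corresponding fold with seed 0 would have applied counting-max three times.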
SRFI 1 procedure: reduce-right f ridentity list

reduce-right is the fold-right variant of reduce. It obeys the following definition:

(reduce-right f ridentity '()) = ridentity
(reduce-right f ridentity '(e1)) = (f e1 ridentity) = e1
(reduce-right f ridentity '(e1 e2 …)) =
    (f e1 (reduce-right f ridentity '(e2 …)))

...in other words, we compute (fold-right f ridentity list).

;; Append a bunch of lists together.
;; I.e., (apply append list-of-lists)
(reduce-right append '() list-of-lists)
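
Using list as f shows how reduce-right groups its applications. This is a sketch with made-up variable names.

```scheme
;; The last element is used as the initial value, so it is not wrapped.
(define nested (reduce-right list '() '(1 2 3 4)))   ; ⇒ (1 (2 (3 4)))

;; A one-element list is returned without applying f at all.
(define single (reduce-right list '() '(1)))         ; ⇒ 1

(define joined
  (reduce-right append '() '((a b) (c) (d e))))      ; ⇒ (a b c d e)
```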
obsolete procedure: reduce-left f ridentity list

Deprecated; use reduce instead. Equivalent to

(reduce (lambda (elt acc) (f acc elt)) ridentity list)

7.9 Miscellaneous List Operations

SRFI 1 procedure: circular-list object …
procedure: make-circular-list k [element]

circular-list returns a circular list containing the given objects. make-circular-list returns a circular list of length k; if element is given, the returned list is filled with it, otherwise the elements are unspecified.

circular-list is like list except that the returned list is circular. circular-list could have been defined like this:

(define (circular-list . objects)
  (append! objects objects))

circular-list is compatible with SRFI 1, but extended so that it can be called with no arguments.
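
A short sketch of working with a circular list follows. It assumes list-ref and list-head only walk as many pairs as requested, so they terminate on a circular list; operations that need to reach the end of the list should be avoided. The name ring is made up for the example.

```scheme
(define ring (circular-list 'a 'b 'c))

;; Indexing wraps around: positions 0..4 are a b c a b.
(list-ref ring 4)      ; ⇒ b

;; Taking a finite prefix yields an ordinary list.
(list-head ring 7)     ; ⇒ (a b c a b c a)
```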

standard procedure: reverse list

Returns a newly allocated list consisting of the top-level elements of list in reverse order.

(reverse '(a b c))                  ⇒ (c b a)
(reverse '(a (b c) d (e (f))))      ⇒ ((e (f)) d (b c) a)
SRFI 1 procedure: reverse! list

Returns a list consisting of the top-level elements of list in reverse order. reverse! is like reverse, except that it destructively modifies list. Because the result may not be eqv? to list, it is desirable to do something like (set! x (reverse! x)).
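
The recommended idiom looks like this in practice. The list is built with list rather than a quoted constant because reverse! modifies its argument.

```scheme
(define x (list 1 2 3))

;; Always use the returned value; the old binding may now point
;; into the middle of the reversed structure.
(set! x (reverse! x))
x    ; ⇒ (3 2 1)
```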

procedure: sort sequence procedure
procedure: merge-sort sequence procedure
procedure: quick-sort sequence procedure

Sequence must be either a list or a vector. Procedure must be a procedure of two arguments that defines a total ordering on the elements of sequence. In other words, if x and y are two distinct elements of sequence, then it must be the case that

(and (procedure x y)
     (procedure y x))
     ⇒ #f

If sequence is a list (vector), sort returns a newly allocated list (vector) whose elements are those of sequence, except that they are rearranged to be sorted in the order defined by procedure. So, for example, if the elements of sequence are numbers, and procedure is <, then the resulting elements are sorted in monotonically nondecreasing order. Likewise, if procedure is >, the resulting elements are sorted in monotonically nonincreasing order. To be precise, if x and y are any two adjacent elements in the result, where x precedes y, it is the case that

(procedure y x)
     ⇒ #f

Two sorting algorithms are implemented: merge-sort and quick-sort. The procedure sort is an alias for merge-sort.

See also the definition of sort!.
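
The following sketch shows sort on a list, on a vector, and with a custom ordering. The sample data and variable names are invented for the example.

```scheme
(define ascending  (sort '(3 1 2) <))           ; ⇒ (1 2 3)
(define descending (sort (vector 3 1 2) >))     ; ⇒ #(3 2 1)

;; Order association pairs by their numeric cdr.
(define by-age
  (sort '(("eve" . 31) ("bob" . 25) ("amy" . 40))
        (lambda (a b) (< (cdr a) (cdr b)))))
;; ⇒ (("bob" . 25) ("eve" . 31) ("amy" . 40))
```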


8 Vectors

Vectors are heterogeneous structures whose elements are indexed by exact non-negative integers. A vector typically occupies less space than a list of the same length, and the average time required to access a randomly chosen element is typically less for the vector than for the list.

The length of a vector is the number of elements that it contains. This number is an exact non-negative integer that is fixed when the vector is created. The valid indexes of a vector are the exact non-negative integers less than the length of the vector. The first element in a vector is indexed by zero, and the last element is indexed by one less than the length of the vector.

Vectors are written using the notation #(object …). For example, a vector of length 3 containing the number zero in element 0, the list (2 2 2 2) in element 1, and the string "Anna" in element 2 can be written as

#(0 (2 2 2 2) "Anna")

Note that this is the external representation of a vector, not an expression evaluating to a vector. Like list constants, vector constants must be quoted:

'#(0 (2 2 2 2) "Anna")          ⇒  #(0 (2 2 2 2) "Anna")

A number of the vector procedures operate on subvectors. A subvector is a segment of a vector that is specified by two exact non-negative integers, start and end. Start is the index of the first element that is included in the subvector, and end is one greater than the index of the last element that is included in the subvector. Thus if start and end are the same, they refer to a null subvector, and if start is zero and end is the length of the vector, they refer to the entire vector. The valid indexes of a subvector are the exact integers between start inclusive and end exclusive.


8.1 Construction of Vectors

procedure: make-vector k [object]

Returns a newly allocated vector of k elements. If object is specified, make-vector initializes each element of the vector to object. Otherwise the initial elements of the result are unspecified.

procedure: vector object …

Returns a newly allocated vector whose elements are the given arguments. vector is analogous to list.

(vector 'a 'b 'c)                       ⇒  #(a b c)
procedure: vector-copy vector

Returns a newly allocated vector that is a copy of vector.
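
Because the copy is newly allocated, modifying it leaves the original untouched, as this small sketch shows (the names original and copy are made up):

```scheme
(define original (vector 1 2 3))
(define copy (vector-copy original))

(vector-set! copy 0 'changed)

original    ; ⇒ #(1 2 3)
copy        ; ⇒ #(changed 2 3)
```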

procedure: list->vector list

Returns a newly allocated vector initialized to the elements of list. The inverse of list->vector is vector->list.

(list->vector '(dididit dah))           ⇒  #(dididit dah)
standard procedure: string->vector string [start [end]]
standard procedure: vector->string vector [start [end]]

It is an error if any element of vector is not a character.

The vector->string procedure returns a newly allocated string of the objects contained in the elements of vector between start and end. The string->vector procedure returns a newly created vector initialized to the elements of the string string between start and end.

In both procedures, order is preserved.

(string->vector "ABC")                  ⇒  #(#\A #\B #\C)
(vector->string #(#\1 #\2 #\3))         ⇒  "123"
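
The optional start and end arguments select a range, with start inclusive and end exclusive. A brief sketch with made-up variable names:

```scheme
;; Characters at indices 1 and 2 only.
(define middle (string->vector "ABCDE" 1 3))              ; ⇒ #(#\B #\C)

;; Omitting end means "through the last element".
(define tail (vector->string (vector #\a #\b #\c) 1))     ; ⇒ "bc"
```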
procedure: make-initialized-vector k initialization

Similar to make-vector, except that the elements of the result are determined by calling the procedure initialization on the indices. For example:

(make-initialized-vector 5 (lambda (x) (* x x)))
     ⇒  #(0 1 4 9 16)
procedure: vector-grow vector k

K must be greater than or equal to the length of vector. Returns a newly allocated vector of length k. The first (vector-length vector) elements of the result are initialized from the corresponding elements of vector. The remaining elements of the result are unspecified.
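
Since the extra elements are unspecified, it is usually wise to fill them before use. The sketch below does so with subvector-fill!; the names small and big are invented for the example.

```scheme
(define small (vector 'a 'b))
(define big (vector-grow small 4))

;; Give the two new slots a defined value.
(subvector-fill! big 2 4 #f)

big      ; ⇒ #(a b #f #f)
small    ; ⇒ #(a b), the original is unchanged
```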

procedure: vector-map procedure vector

Procedure must be a procedure of one argument. vector-map applies procedure element-wise to the elements of vector and returns a newly allocated vector of the results, in order from left to right. The dynamic order in which procedure is applied to the elements of vector is unspecified.

(vector-map cadr '#((a b) (d e) (g h)))     ⇒  #(b e h)
(vector-map (lambda (n) (expt n n)) '#(1 2 3 4))
                                            ⇒  #(1 4 27 256)
(vector-map + '#(5 7 9))                    ⇒  #(5 7 9)

8.2 Selecting Vector Components

procedure: vector? object

Returns #t if object is a vector; otherwise returns #f.

procedure: vector-length vector

Returns the number of elements in vector.

procedure: vector-ref vector k

Returns the contents of element k of vector. K must be a valid index of vector.

(vector-ref '#(1 1 2 3 5 8 13 21) 5)    ⇒  8
procedure: vector-set! vector k object

Stores object in element k of vector and returns an unspecified value. K must be a valid index of vector.

(let ((vec (vector 0 '(2 2 2 2) "Anna")))
  (vector-set! vec 1 '("Sue" "Sue"))
  vec)
     ⇒  #(0 ("Sue" "Sue") "Anna")
procedure: vector-first vector
procedure: vector-second vector
procedure: vector-third vector
procedure: vector-fourth vector
procedure: vector-fifth vector
procedure: vector-sixth vector
procedure: vector-seventh vector
procedure: vector-eighth vector

These procedures access the first several elements of vector in the obvious way. It is an error if the implicit index of one of these procedures is not a valid index of vector.

procedure: vector-binary-search vector key<? unwrap-key key

Searches vector for an element with a key matching key, returning the element if one is found or #f if none. The search operation takes time proportional to the logarithm of the length of vector. Unwrap-key must be a procedure that maps each element of vector to a key. Key<? must be a procedure that implements a total ordering on the keys of the elements.

(define (translate number)
  (vector-binary-search '#((1 . i)
                           (2 . ii)
                           (3 . iii)
                           (6 . vi))
                        < car number))
(translate 2)  ⇒  (2 . ii)
(translate 4)  ⇒  #f

8.3 Cutting Vectors

procedure: subvector vector start end

Returns a newly allocated vector that contains the elements of vector between index start (inclusive) and end (exclusive).

procedure: vector-head vector end

Equivalent to

(subvector vector 0 end)
procedure: vector-tail vector start

Equivalent to

(subvector vector start (vector-length vector))
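
The three cutting procedures can be compared on one sample vector. This is an illustrative sketch; the name letters is made up.

```scheme
(define letters (vector 'a 'b 'c 'd 'e))

(subvector letters 1 3)     ; ⇒ #(b c)    indices 1 and 2
(vector-head letters 2)     ; ⇒ #(a b)    indices 0 and 1
(vector-tail letters 3)     ; ⇒ #(d e)    indices 3 and 4
```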

8.4 Modifying Vectors

procedure: vector-fill! vector object
procedure: subvector-fill! vector start end object

Stores object in every element of the vector (subvector) and returns an unspecified value.

procedure: subvector-move-left! vector1 start1 end1 vector2 start2
procedure: subvector-move-right! vector1 start1 end1 vector2 start2

Destructively copies the elements of vector1, starting with index start1 (inclusive) and ending with end1 (exclusive), into vector2 starting at index start2 (inclusive). Vector1, start1, and end1 must specify a valid subvector, and start2 must be a valid index for vector2. The length of the source subvector must not exceed the length of vector2 minus the index start2.

The elements are copied as follows (note that this is only important when vector1 and vector2 are eqv?):

subvector-move-left!

The copy starts at the left end and moves toward the right (from smaller indices to larger). Thus if vector1 and vector2 are the same, this procedure moves the elements toward the left inside the vector.

subvector-move-right!

The copy starts at the right end and moves toward the left (from larger indices to smaller). Thus if vector1 and vector2 are the same, this procedure moves the elements toward the right inside the vector.
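
The choice of direction matters when source and destination overlap in the same vector. The following sketch shifts elements by one position each way; the names v and w are made up for the example.

```scheme
;; Shift elements 1..4 one position to the left.
(define v (vector 'a 'b 'c 'd 'e))
(subvector-move-left! v 1 5 v 0)
v    ; ⇒ #(b c d e e)

;; Shift elements 0..3 one position to the right.
(define w (vector 'a 'b 'c 'd 'e))
(subvector-move-right! w 0 4 w 1)
w    ; ⇒ #(a a b c d)
```

In each case the element that was not overwritten (the last one in v, the first one in w) keeps its old value.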

procedure: sort! vector procedure
procedure: merge-sort! vector procedure
procedure: quick-sort! vector procedure

Procedure must be a procedure of two arguments that defines a total ordering on the elements of vector. The elements of vector are rearranged so that they are sorted in the order defined by procedure. The elements are rearranged in place, that is, vector is destructively modified so that its elements are in the new order.

sort! returns vector as its value.

Two sorting algorithms are implemented: merge-sort! and quick-sort!. The procedure sort! is an alias for merge-sort!.

See also the definition of sort.
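
A minimal sketch of in-place sorting follows. The vector is created with vector rather than written as a constant, since it is modified destructively; the name scores is made up.

```scheme
(define scores (vector 5 3 8 1))

(sort! scores <)

scores    ; ⇒ #(1 3 5 8)
```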


9 Bit Strings

A bit string is a sequence of bits. Bit strings can be used to represent sets or to manipulate binary data. The elements of a bit string are numbered from zero up to the number of bits in the string less one, in right-to-left order (the rightmost bit is numbered zero). When you convert from a bit string to an integer, the zero-th bit is associated with the zero-th power of two, the first bit is associated with the first power, and so on.

Bit strings are encoded very densely in memory. Each bit occupies exactly one bit of storage, and the overhead for the entire bit string is bounded by a small constant. However, accessing a bit in a bit string is slow compared to accessing an element of a vector or character string. If performance is of overriding concern, it is better to use character strings to store sets of boolean values even though they occupy more space.

The length of a bit string is the number of bits that it contains. This number is an exact non-negative integer that is fixed when the bit string is created. The valid indexes of a bit string are the exact non-negative integers less than the length of the bit string.

Bit strings may contain zero or more bits. They are not limited by the length of a machine word. In the printed representation of a bit string, the contents of the bit string are preceded by ‘#*’. The contents are printed starting with the most significant bit (highest index).

Note that the external representation of bit strings uses a bit ordering that is the reverse of the representation for bit strings in Common Lisp. It is likely that MIT/GNU Scheme’s representation will be changed in the future, to be compatible with Common Lisp. For the time being this representation should be considered a convenience for viewing bit strings rather than a means of entering them as data.

#*11111
#*1010
#*00000000
#*

All of the bit-string procedures are MIT/GNU Scheme extensions.


9.1 Construction of Bit Strings

procedure: make-bit-string k initialization

Returns a newly allocated bit string of length k. If initialization is #f, the bit string is filled with 0 bits; otherwise, the bit string is filled with 1 bits.

(make-bit-string 7 #f)                  ⇒  #*0000000
procedure: bit-string-allocate k

Returns a newly allocated bit string of length k, but does not initialize it.

procedure: bit-string-copy bit-string

Returns a newly allocated copy of bit-string.


9.2 Selecting Bit String Components

procedure: bit-string? object

Returns #t if object is a bit string; otherwise returns #f.

procedure: bit-string-length bit-string

Returns the length of bit-string.

procedure: bit-string-ref bit-string k

Returns #t if the kth bit is 1; otherwise returns #f. K must be a valid index of bit-string.

procedure: bit-string-set! bit-string k

Sets the kth bit in bit-string to 1 and returns an unspecified value. K must be a valid index of bit-string.

procedure: bit-string-clear! bit-string k

Sets the kth bit in bit-string to 0 and returns an unspecified value. K must be a valid index of bit-string.
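
As a sketch of using a bit string to represent a small set of integers, assuming the procedures behave as documented in this chapter (the name members is made up):

```scheme
;; An empty set over the universe 0..7.
(define members (make-bit-string 8 #f))

(bit-string-set! members 1)      ; add 1
(bit-string-set! members 5)      ; add 5
(bit-string-set! members 6)      ; add 6
(bit-string-clear! members 6)    ; remove 6 again

(bit-string-ref members 5)       ; ⇒ #t
(bit-string-ref members 6)       ; ⇒ #f

;; Bit k contributes 2^k, so bits 1 and 5 give 2 + 32.
(bit-string->unsigned-integer members)   ; ⇒ 34
```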

procedure: bit-substring-find-next-set-bit bit-string start end

Returns the index of the first occurrence of a set bit in the substring of bit-string from start (inclusive) to end (exclusive). If none of the bits in the substring are set, #f is returned. The index returned is relative to the whole bit string, not to the substring.

The following procedure uses bit-substring-find-next-set-bit to find all the set bits and display their indexes:

(define (scan-bitstring bs)
  (let ((end (bit-string-length bs)))
    (let loop ((start 0))
      (let ((next
             (bit-substring-find-next-set-bit bs start end)))
        (if next
            (begin
              (write-line next)
              (if (< next end)
                  (loop (+ next 1)))))))))

9.3 Cutting and Pasting Bit Strings

procedure: bit-string-append bit-string-1 bit-string-2

Appends the two bit string arguments, returning a newly allocated bit string as its result. In the result, the bits copied from bit-string-1 are less significant (smaller indices) than those copied from bit-string-2.

procedure: bit-substring bit-string start end

Returns a newly allocated bit string whose bits are copied from bit-string, starting at index start (inclusive) and ending at end (exclusive).


9.4 Bitwise Operations on Bit Strings

procedure: bit-string-zero? bit-string

Returns #t if bit-string contains only 0 bits; otherwise returns #f.

procedure: bit-string=? bit-string-1 bit-string-2

Compares the two bit string arguments and returns #t if they are the same length and contain the same bits; otherwise returns #f.

procedure: bit-string-not bit-string

Returns a newly allocated bit string that is the bitwise-logical negation of bit-string.

procedure: bit-string-movec! target-bit-string bit-string

The destructive version of bit-string-not. The arguments target-bit-string and bit-string must be bit strings of the same length. The bitwise-logical negation of bit-string is computed and the result placed in target-bit-string. The value of this procedure is unspecified.

procedure: bit-string-and bit-string-1 bit-string-2

Returns a newly allocated bit string that is the bitwise-logical “and” of the arguments. The arguments must be bit strings of identical length.

procedure: bit-string-andc bit-string-1 bit-string-2

Returns a newly allocated bit string that is the bitwise-logical “and” of bit-string-1 with the bitwise-logical negation of bit-string-2. The arguments must be bit strings of identical length.

procedure: bit-string-or bit-string-1 bit-string-2

Returns a newly allocated bit string that is the bitwise-logical “inclusive or” of the arguments. The arguments must be bit strings of identical length.

procedure: bit-string-xor bit-string-1 bit-string-2

Returns a newly allocated bit string that is the bitwise-logical “exclusive or” of the arguments. The arguments must be bit strings of identical length.

procedure: bit-string-and! target-bit-string bit-string
procedure: bit-string-or! target-bit-string bit-string
procedure: bit-string-xor! target-bit-string bit-string
procedure: bit-string-andc! target-bit-string bit-string

These are destructive versions of the above operations. The arguments target-bit-string and bit-string must be bit strings of the same length. Each of these procedures performs the corresponding bitwise-logical operation on its arguments, places the result into target-bit-string, and returns an unspecified result.
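
The bitwise operations can be checked by converting to and from integers. This sketch assumes the conversion procedures described in the section on integer conversions; the names a and b are made up.

```scheme
(define a (unsigned-integer->bit-string 4 12))   ; bits 1100
(define b (unsigned-integer->bit-string 4 10))   ; bits 1010

(bit-string->unsigned-integer (bit-string-and  a b))   ; ⇒ 8   (1000)
(bit-string->unsigned-integer (bit-string-or   a b))   ; ⇒ 14  (1110)
(bit-string->unsigned-integer (bit-string-xor  a b))   ; ⇒ 6   (0110)
(bit-string->unsigned-integer (bit-string-andc a b))   ; ⇒ 4   (0100)

;; The destructive variant stores its result in the first argument.
(bit-string-or! a b)
(bit-string->unsigned-integer a)                       ; ⇒ 14
```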


9.5 Modification of Bit Strings

procedure: bit-string-fill! bit-string initialization

Fills bit-string with zeroes if initialization is #f; otherwise fills bit-string with ones. Returns an unspecified value.

procedure: bit-string-move! target-bit-string bit-string

Moves the contents of bit-string into target-bit-string. Both arguments must be bit strings of the same length. The results of the operation are undefined if the arguments are the same bit string.

procedure: bit-substring-move-right! bit-string-1 start1 end1 bit-string-2 start2

Destructively copies the bits of bit-string-1, starting at index start1 (inclusive) and ending at end1 (exclusive), into bit-string-2 starting at index start2 (inclusive). Start1 and end1 must be valid substring indices for bit-string-1, and start2 must be a valid index for bit-string-2. The length of the source substring must not exceed the length of bit-string-2 minus the index start2.

The bits are copied starting from the MSB and working towards the LSB; the direction of copying only matters when bit-string-1 and bit-string-2 are eqv?.


9.6 Integer Conversions of Bit Strings

procedure: unsigned-integer->bit-string length integer

Both length and integer must be exact non-negative integers. Converts integer into a newly allocated bit string of length bits. Signals an error of type condition-type:bad-range-argument if integer is too large to be represented in length bits.

procedure: signed-integer->bit-string length integer

Length must be an exact non-negative integer, and integer may be any exact integer. Converts integer into a newly allocated bit string of length bits, using two’s complement encoding for negative numbers. Signals an error of type condition-type:bad-range-argument if integer is too large to be represented in length bits.

procedure: bit-string->unsigned-integer bit-string
procedure: bit-string->signed-integer bit-string

Converts bit-string into an exact integer. bit-string->signed-integer regards bit-string as a two’s complement representation of a signed integer, and produces an integer of like sign and absolute value. bit-string->unsigned-integer regards bit-string as an unsigned quantity and converts to an integer accordingly.
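
A round trip through the conversions illustrates the difference between the signed and unsigned readings of the same bits. This is a sketch assuming the behavior documented above; the names five and minus-three are made up.

```scheme
(define five (unsigned-integer->bit-string 8 5))
(bit-string->unsigned-integer five)            ; ⇒ 5

;; -3 in 8-bit two's complement has the same bits as 256 - 3.
(define minus-three (signed-integer->bit-string 8 -3))
(bit-string->signed-integer minus-three)       ; ⇒ -3
(bit-string->unsigned-integer minus-three)     ; ⇒ 253
```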


10 Miscellaneous Datatypes


10.1 Booleans

The boolean objects are true and false. The boolean constant true is written as ‘#t’, and the boolean constant false is written as ‘#f’.

The primary use for boolean objects is in the conditional expressions if, cond, and, and or; the behavior of these expressions is determined by whether objects are true or false. These expressions count only #f as false. They count everything else, including #t, pairs, symbols, numbers, strings, vectors, and procedures as true (but see True and False).
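
The rule that only #f counts as false can be illustrated with a small helper. The name truthy? is hypothetical and is not part of the runtime system.

```scheme
;; Hypothetical helper: reduce any object to #t or #f the way
;; a conditional would see it.
(define (truthy? object)
  (if object #t #f))

(truthy? #t)      ; ⇒ #t
(truthy? 0)       ; ⇒ #t   zero is not false in Scheme
(truthy? "")      ; ⇒ #t   neither is the empty string
(truthy? 'nil)    ; ⇒ #t   nor the symbol nil
(truthy? #f)      ; ⇒ #f
```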

Programmers accustomed to other dialects of Lisp should note that Scheme distinguishes #f and the empty list from the symbol nil. Similarly, #t is distinguished from the symbol t. In fact, the boolean objects (and the empty list) are not symbols at all.

Boolean constants evaluate to themselves, so you don’t need to quote them.

#t                                      ⇒  #t
#f                                      ⇒  #f
'#f                                     ⇒  #f
t                                       error→ Unbound variable
variable: false
variable: true

These variables are bound to the objects #f and #t respectively. The compiler, given the usual-integrations declaration, replaces references to these variables with their respective values.

Note that the symbol true is not equivalent to #t, and the symbol false is not equivalent to #f.

standard procedure: boolean? object

Returns #t if object is either #t or #f; otherwise returns #f.

(boolean? #f)                           ⇒  #t
(boolean? 0)                            ⇒  #f
standard procedure: not object
procedure: false? object

These procedures return #t if object is false; otherwise they return #f. In other words they invert boolean values. These two procedures have identical semantics; their names are different to give different connotations to the test.

(not #t)                                ⇒  #f
(not 3)                                 ⇒  #f
(not (list 3))                          ⇒  #f
(not #f)                                ⇒  #t
extended standard procedure: boolean=? boolean1 boolean2 boolean3 …

Returns #t if and only if the boolean arguments are either all true or all false.

Implementation note: The standard requires this procedure’s arguments to satisfy boolean?, but MIT/GNU Scheme allows any object to be an argument.

procedure: boolean/and object …

This procedure returns #t if none of its arguments are #f. Otherwise it returns #f.

procedure: boolean/or object …

This procedure returns #f if all of its arguments are #f. Otherwise it returns #t.
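
Unlike the special forms and and or, which return one of their operands, these procedures always return a boolean. A brief sketch, assuming the behavior described above:

```scheme
;; The special form returns the first true operand itself.
(or #f 3)                 ; ⇒ 3
;; The procedure reports only whether some argument was true.
(boolean/or #f 3)         ; ⇒ #t
(boolean/or #f #f)        ; ⇒ #f

(and 1 2)                 ; ⇒ 2
(boolean/and 1 2)         ; ⇒ #t
(boolean/and 1 #f 2)      ; ⇒ #f
```

Because boolean/and and boolean/or are ordinary procedures, all of their arguments are evaluated before the call, and they can be passed to procedures such as apply.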


10.2 Symbols

MIT/GNU Scheme provides two types of symbols: interned and uninterned. Interned symbols are far more common than uninterned symbols, and there are more ways to create them. Interned symbols have an external representation that is recognized by the procedure read; uninterned symbols do not.8

Interned symbols have an extremely useful property: any two interned symbols whose names are the same, in the sense of string=?, are the same object (i.e. they are eq? to one another). The term interned refers to the process of interning by which this is accomplished. Uninterned symbols do not share this property.

The names of interned symbols are not distinguished by their alphabetic case. Because of this, MIT/GNU Scheme converts all alphabetic characters in the name of an interned symbol to a specific case (lower case) when the symbol is created. When the name of an interned symbol is referenced (using symbol->string) or written (using write) it appears in this case. It is a bad idea to depend on the name being lower case. In fact, it is preferable to take this one step further: don’t depend on the name of a symbol being in a uniform case.

The rules for writing an interned symbol are the same as the rules for writing an identifier (see Identifiers). Any interned symbol that has been returned as part of a literal expression, or read using the read procedure and subsequently written out using the write procedure, will read back in as the identical symbol (in the sense of eq?).

Usually it is also true that reading in an interned symbol that was previously written out produces the same symbol. The exceptions are symbols created by the procedures string->symbol and intern; these procedures can create symbols for which this write/read invariance may not hold because the symbols’ names contain special characters or letters in the non-standard case.9

The external representation for uninterned symbols is special, to distinguish them from interned symbols and prevent them from being recognized by the read procedure:

(string->uninterned-symbol "foo")
     ⇒  #[uninterned-symbol 30 foo]

In this section, the procedures that return symbols as values will either always return interned symbols, or always return uninterned symbols. The procedures that accept symbols as arguments will always accept either interned or uninterned symbols, and do not distinguish the two.

procedure: symbol? object

Returns #t if object is a symbol, otherwise returns #f.

(symbol? 'foo)                                  ⇒  #t
(symbol? (car '(a b)))                          ⇒  #t
(symbol? "bar")                                 ⇒  #f
procedure: symbol->string symbol

Returns the name of symbol as a string. If symbol was returned by string->symbol, the value of this procedure will be identical (in the sense of string=?) to the string that was passed to string->symbol. It is an error to apply mutation procedures such as string-set! to strings returned by this procedure.

(symbol->string 'flying-fish)           ⇒  "flying-fish"
(symbol->string 'Martin)                ⇒  "martin"
(symbol->string (string->symbol "Malvina"))
                                        ⇒  "Malvina"

Note that two distinct uninterned symbols can have the same name.
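
The following sketch contrasts interned and uninterned symbols that share a name. The variable names a and b are made up for the example.

```scheme
(define a (string->uninterned-symbol "foo"))
(define b (string->uninterned-symbol "foo"))

;; The names are equal as strings ...
(string=? (symbol->string a) (symbol->string b))    ; ⇒ #t
;; ... but the symbols are distinct objects,
(eq? a b)                                           ; ⇒ #f
;; and neither is the interned symbol foo.
(eq? a 'foo)                                        ; ⇒ #f

;; Interned symbols with the same name are always the same object.
(eq? (string->symbol "foo") 'foo)                   ; ⇒ #t
```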

procedure: intern string

Returns the interned symbol whose name is string. Converts string to the standard alphabetic case before generating the symbol. This is the preferred way to create interned symbols, as it guarantees the following independent of which case the implementation uses for symbols’ names:

(eq? 'bitBlt (intern "bitBlt")) ⇒     #t

The user should take care that string obeys the rules for identifiers (see Identifiers), otherwise the resulting symbol cannot be read as itself.

procedure: intern-soft string

Returns the interned symbol whose name is string. Converts string to the standard alphabetic case before generating the symbol. If no such interned symbol exists, returns #f.

This is exactly like intern, except that it will not create an interned symbol, but only returns symbols that already exist.

procedure: string->symbol string

Returns the interned symbol whose name is string. Although you can use this procedure to create symbols with names containing special characters or letters in the non-standard case, it’s usually a bad idea to create such symbols because they cannot be read as themselves. See symbol->string.

(eq? 'mISSISSIppi 'mississippi)         ⇒  #t
(string->symbol "mISSISSIppi")
     ⇒  the symbol with the name "mISSISSIppi"
(eq? 'bitBlt (string->symbol "bitBlt")) ⇒  #f
(eq? 'JollyWog
      (string->symbol
        (symbol->string 'JollyWog)))    ⇒  #t
(string=? "K. Harper, M.D."
           (symbol->string
             (string->symbol
               "K. Harper, M.D.")))     ⇒  #t
procedure: string->uninterned-symbol string

Returns a newly allocated uninterned symbol whose name is string. It is unimportant what case or characters are used in string.

Note: this is the fastest way to make a symbol.

procedure: generate-uninterned-symbol [object]

Returns a newly allocated uninterned symbol that is guaranteed to be different from any other object. The symbol’s name consists of a prefix string followed by the (exact non-negative integer) value of an internal counter. The counter is initially zero, and is incremented after each call to this procedure.

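The optional argument object is used to control how the symbol is generated. As the examples below show, a symbol argument supplies the prefix in place of the default, and an exact non-negative integer argument sets the internal counter to that integer before the symbol is generated: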

(generate-uninterned-symbol)
     ⇒  #[uninterned-symbol 31 G0]
(generate-uninterned-symbol)
     ⇒  #[uninterned-symbol 32 G1]
(generate-uninterned-symbol 'this)
     ⇒  #[uninterned-symbol 33 this2]
(generate-uninterned-symbol)
     ⇒  #[uninterned-symbol 34 G3]
(generate-uninterned-symbol 100)
     ⇒  #[uninterned-symbol 35 G100]
(generate-uninterned-symbol)
     ⇒  #[uninterned-symbol 36 G101]

procedure: symbol-append symbol …

Returns the interned symbol whose name is formed by concatenating the names of the given symbols. This procedure preserves the case of the names of its arguments, so if one or more of the arguments’ names has non-standard case, the result will also have non-standard case.

(symbol-append 'foo- 'bar)              ⇒  foo-bar
;; the arguments may be uninterned:
(symbol-append 'foo- (string->uninterned-symbol "baz"))
                                        ⇒  foo-baz
;; the result has the same case as the arguments:
(symbol-append 'foo- (string->symbol "BAZ"))    ⇒  foo-BAZ

procedure: symbol-hash symbol

Returns a hash number for symbol, which is computed by calling string-hash on symbol’s name. The hash number is an exact non-negative integer.

procedure: symbol-hash-mod symbol modulus

Modulus must be an exact positive integer. Equivalent to

(modulo (symbol-hash symbol) modulus)

This procedure is provided for convenience in constructing hash tables. However, it is normally preferable to use make-strong-eq-hash-table to build hash tables keyed by symbols, because eq? hash tables are much faster.

procedure: symbol<? symbol1 symbol2

This procedure computes a total order on symbols. It is equivalent to

(string<? (symbol->string symbol1)
          (symbol->string symbol2))
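
Because symbol<? is a total order, it can serve as the comparison argument to sort. A small sketch, with arbitrary lowercase symbols chosen for illustration:

```scheme
(sort '(pear apple fig) symbol<?)       ; ⇒  (apple fig pear)
(symbol<? 'apple 'fig)                  ; ⇒  #t
(symbol<? 'fig 'fig)                    ; ⇒  #f
```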

10.3 Parameters

Parameters are objects that can be bound to new values for the duration of a dynamic extent. See Dynamic Binding.

procedure: make-parameter init [converter]
procedure: make-unsettable-parameter init [converter]

Returns a newly allocated parameter object, which is a procedure that accepts zero arguments and returns the value associated with the parameter object. Initially this value is the value of (converter init), or of init if the conversion procedure converter is not specified. The associated value can be temporarily changed using the parameterize special form (see parameterize).

The make-parameter procedure is standardized by SRFI 39 and by R7RS, while make-unsettable-parameter is an MIT/GNU Scheme extension.
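
The following sketch shows a parameter with a converter that validates its value. The names radix and show are hypothetical and exist only for this example:

```scheme
;; RADIX is a hypothetical parameter; its converter rejects invalid values.
(define radix
  (make-parameter 10
    (lambda (x)
      (if (and (exact-integer? x) (<= 2 x 36))
          x
          (error "Invalid radix:" x)))))

;; SHOW reads the parameter by calling it with no arguments.
(define (show n)
  (number->string n (radix)))

(show 12)                               ; ⇒  "12"
(parameterize ((radix 2))
  (show 12))                            ; ⇒  "1100"
(show 12)                               ; ⇒  "12"
```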

procedure: make-settable-parameter init [converter]

This procedure is like make-parameter, except that the returned parameter object may also be assigned by passing it an argument. Note that an assignment to a settable parameter affects only the extent of its current binding.

make-settable-parameter is an MIT/GNU Scheme extension.
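
A short sketch of assignment, assuming a hypothetical parameter named level. The assignment made inside parameterize is visible only within that binding:

```scheme
(define level (make-settable-parameter 0))   ; LEVEL is a hypothetical name

(level)                                 ; ⇒  0
(level 1)                               ; assigns the outermost binding
(level)                                 ; ⇒  1

(parameterize ((level 5))
  (level 6)                             ; assigns only the inner binding
  (level))                              ; ⇒  6

(level)                                 ; ⇒  1
```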

procedure: parameterize* bindings thunk

Bindings should be an alist associating parameter objects with new values. Returns the value of thunk while the parameters are dynamically bound to the values.

Note that the parameterize special form expands into a call to this procedure. parameterize* is an MIT/GNU Scheme extension.
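
For illustration, here is the same binding written both ways. The parameter depth is a hypothetical name; the alist passed to parameterize* pairs each parameter object with its new value:

```scheme
(define depth (make-parameter 0))       ; DEPTH is a hypothetical name

(parameterize* (list (cons depth 3))
  (lambda () (depth)))                  ; ⇒  3

;; The same binding written with the special form:
(parameterize ((depth 3))
  (depth))                              ; ⇒  3

(depth)                                 ; ⇒  0
```

The procedural form is convenient when the set of parameters to bind is computed at run time.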

10.3.1 Cells

A cell object is very similar to a parameter but is not implemented in multi-processing worlds and thus is deprecated. Parameters should be used instead.

procedure: cell? object

Returns #t if object is a cell; otherwise returns #f.

procedure: make-cell object

Returns a newly allocated cell whose contents is object.

procedure: cell-contents cell

Returns the current contents of cell.

procedure: set-cell-contents! cell object

Alters the contents of cell to be object. Returns an unspecified value.

procedure: bind-cell-contents! cell object thunk

Alters the contents of cell to be object, calls thunk with no arguments, then restores the original contents of cell and returns the value returned by thunk. This is completely equivalent to dynamic binding of a variable, including the behavior when continuations are used (see Dynamic Binding).


10.4 Records

MIT/GNU Scheme provides a record abstraction, which is a simple and flexible mechanism for building structures with named components. Records can be defined and accessed using the procedures defined in this section. A less flexible but more concise way to manipulate records is to use the define-structure special form (see Structure Definitions).

procedure: make-record-type type-name field-names

Returns a record-type descriptor, a value representing a new data type, disjoint from all others. The type-name argument must be a string, but is only used for debugging purposes (such as the printed representation of a record of the new type). The field-names argument is a list of symbols naming the fields of a record of the new type. It is an error if the list contains any duplicates. It is unspecified how record-type descriptors are represented.

procedure: record-constructor record-type [field-names]

Returns a procedure for constructing new members of the type represented by record-type. The returned procedure accepts exactly as many arguments as there are symbols in the given list, field-names; these are used, in order, as the initial values of those fields in a new record, which is returned by the constructor procedure. The values of any fields not named in the list of field-names are unspecified. The field-names argument defaults to the list of field-names in the call to make-record-type that created the type represented by record-type; if the field-names argument is provided, it is an error if it contains any duplicates or any symbols not in the default list.

procedure: record-keyword-constructor record-type

Returns a procedure for constructing new members of the type represented by record-type. The returned procedure accepts arguments in a keyword list, which is an alternating sequence of names and values. In other words, the number of arguments must be a multiple of two, and every other argument, starting with the first argument, must be a symbol that is one of the field names for record-type.

The returned procedure may be called with a keyword list that contains multiple instances of the same keyword. In this case, the leftmost instance is used and the other instances are ignored. This allows keyword lists to be accumulated using cons or cons*, and new bindings added to the front of the list override old bindings at the end.
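
The sketch below illustrates both points, using a hypothetical two-field record type. The second constructor call prepends an override to an existing keyword list, so the leftmost x binding wins:

```scheme
;; POINT-TYPE and its fields are hypothetical names for this example.
(define point-type (make-record-type "point" '(x y)))
(define make-point (record-keyword-constructor point-type))
(define point-x (record-accessor point-type 'x))
(define point-y (record-accessor point-type 'y))

;; Keywords may appear in any order.
(define p (make-point 'y 2 'x 1))
(point-x p)                             ; ⇒  1
(point-y p)                             ; ⇒  2

;; A binding added to the front overrides the one further back.
(define defaults '(x 0 y 0))
(define q (apply make-point (cons* 'x 7 defaults)))
(point-x q)                             ; ⇒  7
(point-y q)                             ; ⇒  0
```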

procedure: record-predicate record-type

Returns a procedure for testing membership in the type represented by record-type. The returned procedure accepts exactly one argument and returns #t if the argument is a member of the indicated record type; it returns #f otherwise.

procedure: record-accessor record-type field-name

Returns a procedure for reading the value of a particular field of a member of the type represented by record-type. The returned procedure accepts exactly one argument which must be a record of the appropriate type; it returns the current value of the field named by the symbol field-name in that record. The symbol field-name must be a member of the list of field names in the call to make-record-type that created the type represented by record-type.

procedure: record-modifier record-type field-name

Returns a procedure for writing the value of a particular field of a member of the type represented by record-type. The returned procedure accepts exactly two arguments: first, a record of the appropriate type, and second, an arbitrary Scheme value; it modifies the field named by the symbol field-name in that record to contain the given value. The returned value of the modifier procedure is unspecified. The symbol field-name must be a member of the list of field names in the call to make-record-type that created the type represented by record-type.
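
Taken together, the procedures described so far can be used as in the following sketch. The record type and all of the names bound here are hypothetical:

```scheme
(define point-type (make-record-type "point" '(x y)))
(define make-point (record-constructor point-type '(x y)))
(define point? (record-predicate point-type))
(define point-x (record-accessor point-type 'x))
(define set-point-x! (record-modifier point-type 'x))

(define p (make-point 3 4))

(point? p)                              ; ⇒  #t
(point? (vector 3 4))                   ; ⇒  #f
(point-x p)                             ; ⇒  3
(set-point-x! p 10)
(point-x p)                             ; ⇒  10
```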

procedure: record? object

Returns #t if object is a record of any type and #f otherwise. Note that record? may be true of any Scheme value; of course, if it returns #t for some particular value, then record-type-descriptor is applicable to that value and returns an appropriate descriptor.

procedure: record-type-descriptor record

Returns the record-type descriptor representing the type of record. That is, for example, if the returned descriptor were passed to record-predicate, the resulting predicate would return #t when passed record. Note that it is not necessarily the case that the returned descriptor is the one that was passed to record-constructor in the call that created the constructor procedure that created record.

procedure: record-type? object

Returns #t if object is a record-type descriptor; otherwise returns #f.

procedure: record-type-name record-type

Returns the type name associated with the type represented by record-type. The returned value is eqv? to the type-name argument given in the call to make-record-type that created the type represented by record-type.

procedure: record-type-field-names record-type

Returns a list of the symbols naming the fields in members of the type represented by record-type. The returned value is equal? to the field-names argument given in the call to make-record-type that created the type represented by record-type.(10)


10.5 Promises

special form: delay expression

The delay construct is used together with the procedure force to implement lazy evaluation or call by need. (delay expression) returns an object called a promise which at some point in the future may be asked (by the force procedure) to evaluate expression and deliver the resulting value.

procedure: force promise

Forces the value of promise. If no value has been computed for the promise, then a value is computed and returned. The value of the promise is cached (or “memoized”) so that if it is forced a second time, the previously computed value is returned without any recomputation.

(force (delay (+ 1 2)))                 ⇒  3

(let ((p (delay (+ 1 2))))
  (list (force p) (force p)))           ⇒  (3 3)

(define head car)

(define tail
  (lambda (stream)
    (force (cdr stream))))

(define a-stream
  (letrec ((next
            (lambda (n)
              (cons n (delay (next (+ n 1)))))))
    (next 0)))

(head (tail (tail a-stream)))           ⇒  2

procedure: promise? object

Returns #t if object is a promise; otherwise returns #f.

procedure: promise-forced? promise

Returns #t if promise has been forced and its value cached; otherwise returns #f.
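
A minimal sketch of these predicates, observing a promise before and after it is forced:

```scheme
(define p (delay (* 6 7)))

(promise? p)                            ; ⇒  #t
(promise? 42)                           ; ⇒  #f
(promise-forced? p)                     ; ⇒  #f
(force p)                               ; ⇒  42
(promise-forced? p)                     ; ⇒  #t
```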

procedure: promise-value promise

If promise has been forced and its value cached, this procedure returns the cached value. Otherwise, an error is signalled.

force and delay are mainly intended for programs written in functional style. The following examples should not be considered to illustrate good programming style, but they illustrate the property that the value of a promise is computed at most once.

(define count 0)

(define p
  (delay
   (begin
     (set! count (+ count 1))
     (* x 3))))

(define x 5)

count                                   ⇒  0
p                                       ⇒  #[promise 54]
(force p)                               ⇒  15
p                                       ⇒  #[promise 54]
count                                   ⇒  1
(force p)                               ⇒  15
count                                   ⇒  1

Here is a possible implementation of delay and force. We define the expression

(delay expression)

to have the same meaning as the procedure call

(make-promise (lambda () expression))

where make-promise is defined as follows:

(define make-promise
  (lambda (proc)
    (let ((already-run? #f)
          (result #f))
      (lambda ()
        (cond ((not already-run?)
               (set! result (proc))
               (set! already-run? #t)))
        result))))

Promises are implemented here as procedures of no arguments, and force simply calls its argument.

(define force
  (lambda (promise)
    (promise)))

Some implementations support various extensions to this semantics of delay and force; none of them is currently supported in MIT/GNU Scheme.


10.6 Streams

In addition to promises, MIT/GNU Scheme supports a higher-level abstraction called streams. Streams are similar to lists, except that the tail of a stream is not computed until it is referred to. This allows streams to be used to represent infinitely long lists.

procedure: stream object …

Returns a newly allocated stream whose elements are the arguments. Note that the expression (stream) returns the empty stream, or end-of-stream marker.

procedure: list->stream list

Returns a newly allocated stream whose elements are the elements of list. Equivalent to (apply stream list).

procedure: stream->list stream

Returns a newly allocated list whose elements are the elements of stream. If stream has infinite length this procedure will not terminate. This could have been defined by

(define (stream->list stream)
  (if (stream-null? stream)
      '()
      (cons (stream-car stream)
            (stream->list (stream-cdr stream)))))

special form: cons-stream object expression

Returns a newly allocated stream pair. Equivalent to (cons object (delay expression)).

procedure: stream-pair? object

Returns #t if object is a pair whose cdr contains a promise. Otherwise returns #f. This could have been defined by

(define (stream-pair? object)
  (and (pair? object)
       (promise? (cdr object))))

procedure: stream-car stream
procedure: stream-first stream

Returns the first element in stream. stream-car is equivalent to car. stream-first is a synonym for stream-car.

procedure: stream-cdr stream
procedure: stream-rest stream

Returns the first tail of stream. Equivalent to (force (cdr stream)). stream-rest is a synonym for stream-cdr.

procedure: stream-null? stream

Returns #t if stream is the end-of-stream marker; otherwise returns #f. This is equivalent to null?, but should be used whenever testing for the end of a stream.

procedure: stream-length stream

Returns the number of elements in stream. If stream has an infinite number of elements this procedure will not terminate. Note that this procedure forces all of the promises that comprise stream.
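
The basic constructors, selectors, and predicates can be exercised on a small finite stream, as in this sketch:

```scheme
(define s (stream 'a 'b 'c))

(stream-pair? s)                        ; ⇒  #t
(stream-car s)                          ; ⇒  a
(stream-car (stream-cdr s))             ; ⇒  b
(stream-length s)                       ; ⇒  3
(stream->list s)                        ; ⇒  (a b c)
(stream-null? (stream))                 ; ⇒  #t
(stream->list (list->stream '(1 2 3)))  ; ⇒  (1 2 3)
```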

procedure: stream-ref stream k

Returns the element of stream that is indexed by k; that is, the kth element. K must be an exact non-negative integer strictly less than the length of stream.

procedure: stream-head stream k

Returns the first k elements of stream as a list. K must be an exact non-negative integer strictly less than the length of stream.

procedure: stream-tail stream k

Returns the tail of stream that is indexed by k; that is, the kth tail. This is equivalent to performing stream-cdr k times. K must be an exact non-negative integer strictly less than the length of stream.
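
Indexing is what makes infinite streams usable, since stream-length and stream->list would not terminate on them. In the following sketch, integers-from and naturals are hypothetical names; only as much of the stream is computed as the index requires:

```scheme
;; INTEGERS-FROM is a hypothetical helper that builds an infinite stream.
(define (integers-from n)
  (cons-stream n (integers-from (+ n 1))))

(define naturals (integers-from 0))

(stream-ref naturals 5)                 ; ⇒  5
(stream-car (stream-tail naturals 100)) ; ⇒  100
```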

procedure: stream-map procedure stream stream …

Returns a newly allocated stream, each element being the result of invoking procedure with the corresponding elements of the streams as its arguments.
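
For example, the sketch below maps over two finite streams and then over an infinite one. The helper integers-from and the names naturals and squares are hypothetical:

```scheme
(stream->list
 (stream-map + (stream 1 2 3) (stream 10 20 30)))
                                        ; ⇒  (11 22 33)

;; INTEGERS-FROM is a hypothetical helper that builds an infinite stream.
(define (integers-from n)
  (cons-stream n (integers-from (+ n 1))))

(define naturals (integers-from 0))
(define squares (stream-map * naturals naturals))

(stream-ref squares 4)                  ; ⇒  16
```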


10.7 Weak References

Weak references are a mechanism for building data structures that point at objects without protecting them from garbage collection. An example of such a data structure might be an entry in a lookup table that should be removed if the rest of the program does not reference its key. Such an entry must still point at its key to carry out comparisons, but should not in itself prevent its key from being garbage collected.

A weak reference is a reference that points at an object without preventing it from being garbage collected. The term strong reference is used to distinguish normal references from weak ones. If there is no path of strong references to some object, the garbage collector will reclaim that object and mark any weak references to it to indicate that it has been reclaimed.

If there is a path of strong references from an object A to an object B, A is said to hold B strongly. If there is a path of references from an object A to an object B, but every such path traverses at least one weak reference, A is said to hold B weakly.

MIT Scheme provides two mechanisms for using weak references. Weak pairs are like normal pairs, except that their car slot is a weak reference (but the cdr is still strong). The heavier-weight ephemerons additionally arrange that the ephemeron does not count as holding the object in its key field strongly even if the object in its datum field does.

Warning: Working with weak references is subtle and requires careful analysis; most programs should avoid working with them directly. The most common use cases for weak references ought to be served by hash tables (see Hash Tables), which can employ various flavors of weak entry types, 1d tables (see 1D Tables), which hold their keys weakly, and the association table (see The Association Table), which also holds its keys weakly.


10.7.1 Weak Pairs

The car of a weak pair holds its pointer weakly, while the cdr holds its pointer strongly. If the object in the car of a weak pair is not held strongly by any other data structure, it will be garbage-collected, and the original value replaced with a unique reclaimed object.

Note: weak pairs can be defeated by cross references among their slots. Consider a weak pair P holding an object A in its car and an object D in its cdr. P points to A weakly and to D strongly. If D holds A strongly, however, then P ends up holding A strongly after all. If avoiding this is worth a heavier-weight structure, see Ephemerons.

Note: weak pairs are not pairs; that is, they do not satisfy the predicate pair?.

procedure: weak-pair? object

Returns #t if object is a weak pair; otherwise returns #f.

procedure: weak-cons car cdr

Allocates and returns a new weak pair, with components car and cdr. The car component is held weakly.

procedure: gc-reclaimed-object? object

Returns #t if object is the reclaimed object, and #f otherwise.

procedure: gc-reclaimed-object

Returns the reclaimed object.

obsolete procedure: weak-pair/car? weak-pair

This predicate returns #f if the car of weak-pair has been garbage-collected; otherwise returns #t. In other words, it is true if weak-pair has a valid car component.

This is equivalent to

(not (gc-reclaimed-object? (weak-car weak-pair)))

This predicate has been deprecated; instead use gc-reclaimed-object?. Please note that the previously recommended way to use weak-pair/car? will no longer work, so any code using it should be rewritten.

procedure: weak-car weak-pair

Returns the car component of weak-pair. If the car component has been garbage-collected, this operation returns the reclaimed object.

procedure: weak-set-car! weak-pair object

Sets the car component of weak-pair to object and returns an unspecified result.

procedure: weak-cdr weak-pair

Returns the cdr component of weak-pair.

procedure: weak-set-cdr! weak-pair object

Sets the cdr component of weak-pair to object and returns an unspecified result.
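
The following sketch shows one way to read a weak pair safely: fetch the car once, then test the fetched value with gc-reclaimed-object?. The helper weak-car-or and the variable names are hypothetical; here the key cannot be reclaimed, because the top-level definition holds it strongly:

```scheme
(define key (list 'some 'object))       ; strongly held by this definition
(define wp (weak-cons key 'payload))

(weak-pair? wp)                         ; ⇒  #t
(pair? wp)                              ; ⇒  #f
(weak-cdr wp)                           ; ⇒  payload

;; WEAK-CAR-OR is a hypothetical helper, not part of the runtime.
;; It returns DEFAULT if the car of WP has been reclaimed.
(define (weak-car-or wp default)
  (let ((object (weak-car wp)))
    (if (gc-reclaimed-object? object)
        default
        object)))

(eq? (weak-car-or wp #f) key)           ; ⇒  #t
```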


10.7.2 Ephemerons

An ephemeron is an object with two weakly referenced components called its key and datum. The garbage collector drops an ephemeron’s references to both key and datum, rendering the ephemeron broken, if and only if the garbage collector can prove that there are no strong references to the key. In other words, an ephemeron is broken when nobody else cares about its key. In particular, the datum holding a reference to the key will not in itself prevent the ephemeron from becoming broken; in contrast, see Weak Pairs. Once broken, ephemerons never cease to be broken; setting the key or datum of a broken ephemeron with set-ephemeron-key! or set-ephemeron-datum! has no effect. Note that an ephemeron’s reference to its datum may be dropped even if the datum is still reachable; all that matters is whether the key is reachable.

Ephemerons are considerably heavier-weight than weak pairs, because garbage-collecting ephemerons is more complicated than garbage-collecting weak pairs. Each ephemeron needs five words of storage, rather than the two words needed by a weak pair. However, while the garbage collector spends more time on ephemerons than on other objects, the amount of time it spends on ephemerons scales linearly with the number of live ephemerons, which is how its running time scales with the total number of live objects anyway.

procedure: ephemeron? object

Returns #t if object is an ephemeron; otherwise returns #f.

procedure: make-ephemeron key datum

Allocates and returns a new ephemeron, with components key and datum.

procedure: ephemeron-broken? ephemeron

Returns #t if the garbage collector has dropped ephemeron’s references to its key and datum; otherwise returns #f.

procedure: ephemeron-key ephemeron
procedure: ephemeron-datum ephemeron

These return the key or datum component, respectively, of ephemeron. If ephemeron has been broken, these operations return #f, but they can also return #f if that is the value that was stored in the key or datum component.

procedure: set-ephemeron-key! ephemeron object
procedure: set-ephemeron-datum! ephemeron object

These set the key or datum component, respectively, of ephemeron to object and return an unspecified result. If ephemeron is broken, neither of these operations has any effect.

Like weak-pair/car?, ephemeron-broken? must be used with care. If (ephemeron-broken? ephemeron) yields false, it guarantees only that prior evaluations of (ephemeron-key ephemeron) or (ephemeron-datum ephemeron) yielded the key or datum that was stored in the ephemeron, but it makes no guarantees about subsequent calls to ephemeron-key or ephemeron-datum: the garbage collector may run and break the ephemeron immediately after ephemeron-broken? returns. Thus, the correct idiom to fetch an ephemeron’s key and datum and use them if the ephemeron is not broken is

(let ((key (ephemeron-key ephemeron))
      (datum (ephemeron-datum ephemeron)))
  (if (ephemeron-broken? ephemeron)
      … broken case …
      … code using key and datum …))
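
One way to package this idiom is a small helper that takes the two cases as procedures. This is only a sketch: call-with-ephemeron-contents is a hypothetical name, not part of the runtime. In the usage shown, the key stays live because a top-level definition holds it strongly:

```scheme
;; CALL-WITH-EPHEMERON-CONTENTS is a hypothetical helper.
;; It fetches both components first and only then tests for breakage.
(define (call-with-ephemeron-contents ephemeron if-live if-broken)
  (let ((key (ephemeron-key ephemeron))
        (datum (ephemeron-datum ephemeron)))
    (if (ephemeron-broken? ephemeron)
        (if-broken)
        (if-live key datum))))

(define key (list 'k))                  ; strongly held by this definition
(define e (make-ephemeron key "datum"))

(call-with-ephemeron-contents e
  (lambda (k d) d)
  (lambda () 'broken))                  ; ⇒  "datum"
```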

10.7.3 Reference barriers

The garbage collector may break an ephemeron if it can prove that the key is not strongly reachable. To ensure that it does not do so before a certain point in a program, the program can invoke a reference barrier on the key by calling the reference-barrier procedure, which guarantees that even if the program does not use the key, it will be considered strongly reachable until after reference-barrier returns.

procedure: reference-barrier object

Guarantee that object is strongly reachable until after reference-barrier returns.
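
As a sketch of the intended use, the hypothetical procedure below creates a key that nothing else refers to, forces a collection, and then reads the datum. It assumes that gc-flip performs a garbage collection. The call to reference-barrier after the read keeps the key strongly reachable up to that point, so the ephemeron cannot be broken before the datum is fetched:

```scheme
;; DATUM-AFTER-GC is a hypothetical procedure for this example.
(define (datum-after-gc)
  (let* ((key (list 'fresh-key))
         (e (make-ephemeron key 'datum)))
    (gc-flip)                           ; assumed to run the collector
    (let ((datum (ephemeron-datum e)))
      ;; KEY has no other use after this point; the barrier keeps it
      ;; strongly reachable until the datum has been fetched.
      (reference-barrier key)
      datum)))

(datum-after-gc)                        ; ⇒  datum
```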


11 Associations

MIT/GNU Scheme provides several mechanisms for associating objects with one another. Each of these mechanisms creates a link between one or more objects, called keys, and some other object, called a datum. Beyond this common idea, however, each mechanism has different properties that make it appropriate in different situations; the sections that follow describe them in turn.


11.1 Association Lists

An association list, or alist, is a data structure used very frequently in Scheme. An alist is a list of pairs, each of which is called an association. The car of an association is called the key.

An advantage of the alist representation is that an alist can be incrementally augmented simply by adding new entries to the front. Moreover, because the searching procedures assv et al. search the alist in order, new entries can “shadow” old entries. If an alist is viewed as a mapping from keys to data, then the mapping can be not only augmented but also altered in a non-destructive manner by adding new entries to the front of the alist.(11)
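
A small sketch of shadowing, with hypothetical keys and data. The extended alist shares its tail with the original, which is left unchanged:

```scheme
(define base '((color . red) (size . 10)))
(define extended (cons '(color . blue) base))

(cdr (assq 'color extended))            ; ⇒  blue
(cdr (assq 'size extended))             ; ⇒  10
(cdr (assq 'color base))                ; ⇒  red
(eq? (cdr extended) base)               ; ⇒  #t
```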

procedure: alist? object

Returns #t if object is an association list (including the empty list); otherwise returns #f. Any object satisfying this predicate also satisfies list?.

procedure: assq object alist
procedure: assv object alist
procedure: assoc object alist

These procedures find the first pair in alist whose car field is object, and return that pair; the returned pair is always an element of alist, not one of the pairs from which alist is composed. If no pair in alist has object as its car, #f (n.b.: not the empty list) is returned. assq uses eq? to compare object with the car fields of the pairs in alist, while assv uses eqv? and assoc uses equal?.(12)

(define e '((a 1) (b 2) (c 3)))
(assq 'a e)                             ⇒  (a 1)
(assq 'b e)                             ⇒  (b 2)
(assq 'd e)                             ⇒  #f
(assq (list 'a) '(((a)) ((b)) ((c))))   ⇒  #f
(assoc (list 'a) '(((a)) ((b)) ((c))))  ⇒  ((a))
(assq 5 '((2 3) (5 7) (11 13)))         ⇒  unspecified
(assv 5 '((2 3) (5 7) (11 13)))         ⇒  (5 7)

procedure: association-procedure predicate selector

Returns an association procedure that is similar to assv, except that selector (a procedure of one argument) is used to select the key from the association, and predicate (an equivalence predicate) is used to compare the key to the given item. This can be used to make association lists whose elements are, say, vectors instead of pairs (also see Searching Lists).

For example, here is how assv could be implemented:

(define assv (association-procedure eqv? car))
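
The selector need not be car. As a sketch of the vector case mentioned above, the hypothetical procedure vector-assv treats element 0 of each vector as the key:

```scheme
;; VECTOR-ASSV and ENTRIES are hypothetical names for this example.
(define vector-assv
  (association-procedure eqv?
                         (lambda (entry) (vector-ref entry 0))))

(define entries
  (list (vector 1 'one) (vector 2 'two) (vector 3 'three)))

(vector-assv 2 entries)                 ; ⇒  #(2 two)
(vector-assv 9 entries)                 ; ⇒  #f
```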

Another example is a “reverse association” procedure:

(define rassv (association-procedure eqv? cdr))

procedure: del-assq object alist
procedure: del-assv object alist
procedure: del-assoc object alist

These procedures return a newly allocated copy of alist in which all associations with keys equal to object have been removed. Note that while the returned copy is a newly allocated list, the association pairs that are the elements of the list are shared with alist, not copied. del-assq uses eq? to compare object with the keys, while del-assv uses eqv? and del-assoc uses equal?.

(define a
  '((butcher . "231 e22nd St.")
    (baker . "515 w23rd St.")
    (hardware . "988 Lexington Ave.")))

(del-assq 'baker a)
     ⇒
     ((butcher . "231 e22nd St.")
      (hardware . "988 Lexington Ave."))

procedure: del-assq! object alist
procedure: del-assv! object alist
procedure: del-assoc! object alist

These procedures remove from alist all associations with keys equal to object. They return the resulting list. del-assq! uses eq? to compare object with the keys, while del-assv! uses eqv? and del-assoc! uses equal?. These procedures are like del-assq, del-assv, and del-assoc, respectively, except that they destructively modify alist.
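
The difference between the two families can be sketched as follows, using a hypothetical alist with a repeated key. With the destructive version, use the returned list rather than relying on the old variable:

```scheme
(define a (list (cons 'x 1) (cons 'y 2) (cons 'x 3)))

;; Non-destructive: A is untouched, and the surviving pair is shared.
(define b (del-assq 'x a))
b                                       ; ⇒  ((y . 2))
(length a)                              ; ⇒  3
(eq? (car b) (cadr a))                  ; ⇒  #t

;; Destructive: A may be modified, so keep only the result.
(define c (del-assq! 'x a))
c                                       ; ⇒  ((y . 2))
```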

procedure: delete-association-procedure deletor predicate selector

This returns a deletion procedure similar to del-assv or del-assq!. The predicate and selector arguments are the same as those for association-procedure, while the deletor argument should be either the procedure list-deletor (for non-destructive deletions), or the procedure list-deletor! (for destructive deletions).

For example, here is a possible implementation of del-assv:

(define del-assv 
  (delete-association-procedure list-deletor eqv? car))

procedure: alist-copy alist

Returns a newly allocated copy of alist. This is similar to list-copy except that the “association” pairs, i.e. the elements of the list alist, are also copied. alist-copy could have been implemented like this:

(define (alist-copy alist)
  (if (null? alist)
      '()
      (cons (cons (car (car alist)) (cdr (car alist)))
            (alist-copy (cdr alist)))))
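
The practical consequence is that modifying an association in the copy does not affect the original, whereas a copy made with list-copy shares its association pairs. A sketch with hypothetical data:

```scheme
(define original (list (cons 'a 1) (cons 'b 2)))
(define copy (alist-copy original))

(set-cdr! (assq 'a copy) 99)
(cdr (assq 'a copy))                    ; ⇒  99
(cdr (assq 'a original))                ; ⇒  1

;; LIST-COPY copies only the spine, so the pairs are shared.
(define shallow (list-copy original))
(eq? (car shallow) (car original))      ; ⇒  #t
(eq? (car copy) (car original))         ; ⇒  #f
```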

11.2 1D Tables

1D tables (“one-dimensional” tables) are similar to association lists. In a 1D table, unlike an association list, the keys of the table are held weakly: if a key is garbage-collected, its associated value in the table is removed. 1D tables compare their keys for equality using eq?.

1D tables can often be used as a higher-performance alternative to the two-dimensional association table (see The Association Table). If one of the keys being associated is a compound object such as a vector, a 1D table can be stored in one of the vector’s slots. Under these circumstances, accessing items in a 1D table will be comparable in performance to using a property list in a conventional Lisp.

procedure: make-1d-table

Returns a newly allocated empty 1D table.

procedure: 1d-table? object

Returns #t if object is a 1D table; otherwise returns #f. Any object that satisfies this predicate also satisfies list?.

procedure: 1d-table/put! 1d-table key datum

Creates an association between key and datum in 1d-table. Returns an unspecified value.

procedure: 1d-table/remove! 1d-table key

Removes any association for key in 1d-table and returns an unspecified value.

procedure: 1d-table/get 1d-table key default

Returns the datum associated with key in 1d-table. If there is no association for key, default is returned.

procedure: 1d-table/lookup 1d-table key if-found if-not-found

If-found must be a procedure of one argument, and if-not-found must be a procedure of no arguments. If 1d-table contains an association for key, if-found is invoked on the datum of the association. Otherwise, if-not-found is invoked with no arguments. In either case, the result of the invoked procedure is returned as the result of 1d-table/lookup.

procedure: 1d-table/alist 1d-table

Returns a newly allocated association list that contains the same information as 1d-table.
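
The sketch below exercises these operations. The variable names are hypothetical. The key is bound at top level so that it stays strongly held, and a second, structurally equal list is not found because keys are compared with eq?:

```scheme
(define table (make-1d-table))
(define k (list 'key))                  ; strongly held by this definition

(1d-table/put! table k 'value)
(1d-table/get table k 'missing)         ; ⇒  value
(1d-table/get table (list 'key) 'missing)
                                        ; ⇒  missing
(1d-table/lookup table k
  (lambda (datum) (list 'found datum))
  (lambda () 'none))                    ; ⇒  (found value)

(1d-table/remove! table k)
(1d-table/get table k 'missing)         ; ⇒  missing
```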


11.3 The Association Table

MIT/GNU Scheme provides a generalization of the property-list mechanism found in most other implementations of Lisp: a global two-dimensional association table. This table is indexed by two keys, called x-key and y-key in the following procedure descriptions. These keys and the datum associated with them can be arbitrary objects. eq? is used to discriminate keys.

Think of the association table as a matrix: a single datum can be accessed using both keys, a column using x-key only, and a row using y-key only.

procedure: 2d-put! x-key y-key datum

Makes an entry in the association table that associates datum with x-key and y-key. Returns an unspecified result.

procedure: 2d-remove! x-key y-key

If the association table has an entry for x-key and y-key, it is removed. Returns an unspecified result.

procedure: 2d-get x-key y-key

Returns the datum associated with x-key and y-key. Returns #f if no such association exists.
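
A minimal sketch of the three operations above. Because the table is global, the keys are given deliberately unusual, hypothetical names to avoid colliding with other entries:

```scheme
(2d-put! 'doc-example-fruit 'doc-example-color 'red)

(2d-get 'doc-example-fruit 'doc-example-color)    ; ⇒  red
(2d-get 'doc-example-fruit 'doc-example-weight)   ; ⇒  #f

(2d-remove! 'doc-example-fruit 'doc-example-color)
(2d-get 'doc-example-fruit 'doc-example-color)    ; ⇒  #f
```

Since 2d-get returns #f when there is no association, a stored datum of #f cannot be distinguished from a missing entry through this procedure.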

procedure: 2d-get-alist-x x-key

Returns an association list of all entries in the association table that are associated with x-key. The result is a list of (y-key . datum) pairs. Returns the empty list if no entries for x-key exist.

(2d-put! 'foo 'bar 5)
(2d-put! 'foo 'baz 6)
(2d-get-alist-x 'foo)           ⇒  ((baz . 6) (bar . 5))

procedure: 2d-get-alist-y y-key

Returns an association list of all entries in the association table that are associated with y-key. The result is a list of (x-key . datum) pairs. Returns the empty list if no entries for y-key exist.

(2d-put! 'bar 'foo 5)
(2d-put! 'baz 'foo 6)
(2d-get-alist-y 'foo)           ⇒  ((baz . 6) (bar . 5))

11.4 Hash Tables

Hash tables are a fast, powerful mechanism for storing large numbers of associations. MIT/GNU Scheme’s hash tables feature automatic resizing, customizable growth parameters, customizable hash procedures, and many options for weak references to keys or data.

The average times for the insertion, deletion, and lookup operations on a hash table are bounded by a constant. The space required by the table is proportional to the number of associations in the table; the constant of proportionality is described below (see Resizing of Hash Tables).

The hash table interface described below is a superset of SRFI 69: “Basic hash tables”. The reason for supporting the extra functionality is that SRFI 69 fails to specify certain optimization-enabling exceptions to its semantics, forcing a correct implementation to pay the non-negligible performance cost of completely safe behavior (see footnote 13). The MIT/GNU Scheme native hash table interface, in contrast, specifies the minor exceptions it needs, and is therefore implemented more efficiently.

We do not describe the SRFI 69-compliant interface here, as that would be redundant with the SRFI document.



11.4.1 Construction of Hash Tables

The next few procedures are hash-table constructors. All hash table constructors are procedures that accept one optional argument, initial-size, and return a newly allocated hash table. If initial-size is given, it must be an exact non-negative integer or #f. The meaning of initial-size is discussed below (see Resizing of Hash Tables).

Hash tables are normally characterized by two things: the equivalence predicate that is used to compare keys, and how the table allows its keys and data to be reclaimed by the garbage collector. If a table prevents its keys and data from being reclaimed by the garbage collector, it is said to hold its keys and data strongly; other arrangements are possible, where a table may hold keys or data weakly or ephemerally (see Weak References).

procedure: make-strong-eq-hash-table [initial-size]
obsolete procedure: make-symbol-hash-table [initial-size]

Returns a newly allocated hash table that accepts arbitrary objects as keys, and compares those keys with eq?. The keys and data are held strongly. These are the fastest of the standard hash tables.

procedure: make-key-weak-eq-hash-table [initial-size]
obsolete procedure: make-weak-eq-hash-table [initial-size]
obsolete procedure: make-eq-hash-table [initial-size]

Returns a newly allocated hash table that accepts arbitrary objects as keys, and compares those keys with eq?. The keys are held weakly and the data are held strongly. Note that if a datum holds a key strongly, the table will effectively hold that key strongly.

procedure: make-datum-weak-eq-hash-table [initial-size]

Returns a newly allocated hash table that accepts arbitrary objects as keys, and compares those keys with eq?. The keys are held strongly and the data are held weakly. Note that if a key holds a datum strongly, the table will effectively hold that datum strongly.

procedure: make-key-ephemeral-eq-hash-table [initial-size]

Returns a newly allocated hash table that accepts arbitrary objects as keys, and compares those keys with eq?. The keys are held weakly, even if some of the data should hold some of the keys strongly.

procedure: make-strong-eqv-hash-table [initial-size]

Returns a newly allocated hash table that accepts arbitrary objects as keys, and compares those keys with eqv?. The keys and data are held strongly. These hash tables are a little slower than those made by make-strong-eq-hash-table.

procedure: make-key-weak-eqv-hash-table [initial-size]
obsolete procedure: make-weak-eqv-hash-table [initial-size]
obsolete procedure: make-eqv-hash-table [initial-size]
obsolete procedure: make-object-hash-table [initial-size]

Returns a newly allocated hash table that accepts arbitrary objects as keys, and compares those keys with eqv?. The keys are held weakly, except that booleans, characters, numbers, and interned symbols are held strongly. The data are held strongly. Note that if a datum holds a key strongly, the table will effectively hold that key strongly.

procedure: make-datum-weak-eqv-hash-table [initial-size]

Returns a newly allocated hash table that accepts arbitrary objects as keys, and compares those keys with eqv?. The keys are held strongly and the data are held weakly. Note that if a key holds a datum strongly, the table will effectively hold that datum strongly.

procedure: make-key-ephemeral-eqv-hash-table [initial-size]

Returns a newly allocated hash table that accepts arbitrary objects as keys, and compares those keys with eqv?. The keys are held weakly, except that booleans, characters, numbers, and interned symbols are held strongly. The keys are effectively held weakly even if some of the data should hold some of the keys strongly.

procedure: make-equal-hash-table [initial-size]

Returns a newly allocated hash table that accepts arbitrary objects as keys, and compares those keys with equal?. The keys and data are held strongly. These hash tables are quite a bit slower than those made by make-strong-eq-hash-table.

procedure: make-string-hash-table [initial-size]

Returns a newly allocated hash table that accepts character strings as keys, and compares them with string=?. The keys and data are held strongly.
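The practical difference between these constructors is the equivalence predicate used on keys. The following sketch, with illustrative table names, stores the same kind of key in three tables and looks it up with a freshly allocated string or list that is equal to, but not the same object as, the original key:

```scheme
;; string=? comparison: any string with the same characters matches.
(define by-string (make-string-hash-table))
(hash-table-set! by-string "apple" 1)
(hash-table-ref/default by-string (string-append "app" "le") 0)   ; ⇒ 1

;; eq? comparison: only the very same object matches.
(define by-eq (make-strong-eq-hash-table))
(hash-table-set! by-eq "apple" 1)
(hash-table-ref/default by-eq (string-append "app" "le") 0)       ; ⇒ 0

;; equal? comparison: structurally equal keys match.
(define by-equal (make-equal-hash-table))
(hash-table-set! by-equal (list 1 2) 'pair)
(hash-table-ref/default by-equal (list 1 2) #f)                   ; ⇒ pair
```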

All of the above are highly optimized table implementations. Next are some general constructors that allow for more flexible table definitions.

procedure: make-hash-table comparator arg …
procedure: make-hash-table [key=? [hash-function arg …]]
procedure: alist->hash-table alist comparator arg …
procedure: alist->hash-table alist [key=? [hash-function arg …]]

These are the standard constructors for making hash tables. The behavior of each depends on its arguments: if the first argument is a comparator, it behaves like a SRFI 125 procedure; otherwise it behaves like a SRFI 69 procedure.

For SRFI 125 behavior, comparator must satisfy comparator-hashable?. The remaining args are optional, and may include the following symbols:

weak-keys

Specifies that the table will have weak keys.

weak-values

Specifies that the table will have weak values.

ephemeral-keys

Specifies that the table will have ephemeral keys.

ephemeral-values

Specifies that the table will have ephemeral values.

The symbols weak-keys and weak-values can be specified together or separately, likewise for ephemeral-keys and ephemeral-values. But weak and ephemeral symbols can’t be mixed. If none of these symbols are present, then the keys and values are strongly held.

Additionally, args may contain an exact non-negative integer, which specifies an initial size for the table; otherwise a default size is used.

For SRFI 69 behavior the key=? argument specifies how keys are compared and defaults to equal?. The hash-function argument specifies the hash function to use. If hash-function is not specified, it defaults to a standard value that depends on key=?; an error is signaled if there’s no standard value. The arg arguments are allowed but are implementation dependent; do not provide them.

The procedure alist->hash-table creates a new hash table, as with make-hash-table, and then fills it with the contents of alist.
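The two calling conventions might be used as in the sketch below. The table names and keys are illustrative. The comparator in the last form is built with make-equal-comparator, which this example assumes is available as the SRFI 128 constructor; any comparator satisfying comparator-hashable? could be substituted.

```scheme
;; SRFI 69 style: the first argument is an equivalence predicate.
(define ages (make-hash-table eqv?))
(hash-table-set! ages 'alice 30)
(hash-table-ref/default ages 'alice #f)      ; ⇒ 30

;; alist->hash-table builds a table and fills it in one step.
(define colors (alist->hash-table '((red . 1) (green . 2)) eq?))
(hash-table-ref/default colors 'green #f)    ; ⇒ 2

;; SRFI 125 style: the first argument is a comparator (assumed to come
;; from SRFI 128), followed by an optional symbol and an initial size.
(define key (list 1 2))                      ; keep the key reachable
(define cache (make-hash-table (make-equal-comparator) 'weak-keys 64))
(hash-table-set! cache key 'pair)
(hash-table-ref/default cache (list 1 2) #f) ; ⇒ pair
```

The key is bound to a variable because a weakly held key with no other references may be reclaimed by the garbage collector.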

The remaining constructors use hash-table types to encapsulate the hashing parameters.

obsolete procedure: make-hash-table* type [initial-size]

Constructs a new hash table using the hashing parameters in type.

procedure: hash-table-constructor comparator arg …
obsolete procedure: hash-table-constructor type

Returns a procedure that, when called, constructs a new hash table using the specified parameters. The returned procedure accepts an optional initial-size.

If its first argument is a comparator, it uses the comparator and args as in make-hash-table, except that any initial size specified in args can be overridden by the initial-size argument to the returned procedure.

If its first argument is a type, it returns a procedure that, when called, constructs a new hash table using the hashing parameters in type. This is equivalent to

(lambda (#!optional initial-size)
  (make-hash-table* type initial-size))

The next two procedures are used to create hash-table types. The procedures are equivalent in power; they differ only in how the types are described.

obsolete procedure: make-hash-table-type hash-function key=? rehash-after-gc? entry-type

This procedure accepts four arguments and returns a hash-table type, which can be used to make hash tables of that type. The key=? argument is an equivalence predicate for the keys of the hash table. The hash-function argument is a procedure that computes a hash number. Specifically, hash-function accepts two arguments, a key and an exact positive integer (the modulus), and returns an exact non-negative integer that is less than the modulus.

The argument rehash-after-gc?, if true, says that the values returned by hash-function might change after a garbage collection. If so, the hash-table implementation arranges for the table to be rehashed when necessary. (See Address Hashing, for information about hash procedures that have this property.) Otherwise, it is assumed that hash-function always returns the same value for the same arguments.

The argument entry-type determines the strength with which the hash table will hold its keys and values. It must be one of the entry-type variables described below, which all start with hash-table-entry-type:.

obsolete procedure: make-hash-table-type* key=? hash-function rehash-after-gc? entry-type

This procedure’s arguments, except for key=?, are keyword arguments; that is, each argument is a symbol of the same name followed by its value. Aside from how they are passed, the arguments have the same meaning as those for make-hash-table-type. Note that all of the keyword arguments are optional, while key=? is required.

The argument entry-type specifies the name of an entry type. It must be a symbol corresponding to one of the entry-type variables described below. The name of an entry type is the symbol composed of the suffix of the corresponding variable; for example the type hash-table-entry-type:key-weak has the name key-weak.

The default values for the keyword arguments are as follows. The arguments hash-function and rehash-after-gc? default to standard values that depend on key=?; an error is signaled if key=? has no standard values. The argument entry-type defaults to strong.

obsolete variable: hash-table-entry-type:strong

The entry type for hash tables that hold both keys and data strongly.

obsolete variable: hash-table-entry-type:key-weak

An entry type for hash tables that hold keys weakly and data strongly. An entry of this type is a weak pair (see Weak Pairs) whose weak (car) slot holds the key of the entry and whose strong (cdr) slot holds the datum of the entry. If a key of such a hash table is garbage collected, the corresponding entry will be removed. Note that if some datum holds some key strongly, the table will effectively hold that key strongly.

obsolete variable: hash-table-entry-type:datum-weak

An entry type for hash tables that hold keys strongly and data weakly. An entry of this type is a weak pair (see Weak Pairs) whose weak (car) slot holds the datum of the entry and whose strong (cdr) slot holds the key of the entry. If a datum of such a hash table is garbage collected, all corresponding entries will be removed. Note that if some key holds some datum strongly, the table will effectively hold that datum strongly.

obsolete variable: hash-table-entry-type:key&datum-weak
obsolete variable: hash-table-entry-type:key/datum-weak

The entry type for hash tables that hold both keys and data weakly. An entry of this type is a weak list, holding both the key and the datum in the weak (car) slot of weak pairs (see Weak Pairs). If either a key or datum of such a hash table is garbage collected, all corresponding entries will be removed.

obsolete variable: hash-table-entry-type:key-ephemeral

An entry type for hash tables that hold data ephemerally, keyed by the keys. An entry of this type is an ephemeron (see Ephemerons) whose key is the key of the entry and whose datum is the datum of the entry. If a key of such a hash table is garbage collected, the corresponding entry will be removed. Note that the table holds all its keys weakly even if some data should hold some keys strongly.

obsolete variable: hash-table-entry-type:datum-ephemeral

An entry type for hash tables that hold keys ephemerally, keyed by the data. An entry of this type is an ephemeron (see Ephemerons) whose key is the datum of the entry and whose datum is the key of the entry. If a datum of such a hash table is garbage collected, all corresponding entries will be removed. Note that the table holds all its data weakly even if some keys should hold some data strongly.

obsolete variable: hash-table-entry-type:key&datum-ephemeral

The entry type for hash tables that hold both keys and data ephemerally keyed on each other. An entry of this type is a pair of ephemerons (see Ephemerons), one holding the datum keyed by the key and the other holding the key keyed by the datum. If both the key and the datum of any entry of such a hash table are garbage collected, the entry will be removed. The table holds all its keys and data weakly itself, but will prevent any key or datum from being garbage collected if there are strong references to its datum or key, respectively.

Some examples showing how some standard hash-table constructors could have been defined:

(define make-weak-eq-hash-table
  (hash-table-constructor
    (make-hash-table-type eq-hash eq? #t
      hash-table-entry-type:key-weak)))

(define make-equal-hash-table
  (hash-table-constructor
    (make-hash-table-type equal-hash equal? #t
      hash-table-entry-type:strong)))

(define make-string-hash-table
  (hash-table-constructor
    (make-hash-table-type string-hash string=? #f
      hash-table-entry-type:strong)))

The following procedures are provided only for backward compatibility. They should be considered deprecated and should not be used in new programs.

obsolete procedure: hash-table/constructor hash-function key=? rehash-after-gc? entry-type

This procedure is deprecated. Instead use the equivalent

(hash-table-constructor
  (make-hash-table-type hash-function key=? rehash-after-gc?
                        entry-type))

obsolete procedure: strong-hash-table/constructor hash-function key=? [rehash-after-gc?]

Like hash-table/constructor but always uses hash-table-entry-type:strong. If rehash-after-gc? is omitted, it defaults to #f.

obsolete procedure: weak-hash-table/constructor hash-function key=? [rehash-after-gc?]

Like hash-table/constructor but always uses hash-table-entry-type:key-weak. If rehash-after-gc? is omitted, it defaults to #f.



11.4.2 Basic Hash Table Operations

The procedures described in this section are the basic operations on hash tables. They provide the functionality most often needed by programmers. Subsequent sections describe other operations that provide additional functionality needed by some applications.

procedure: hash-table? object

Returns #t if object is a hash table, otherwise returns #f.

procedure: hash-table-set! hash-table key datum
obsolete procedure: hash-table/put! hash-table key datum

Associates datum with key in hash-table and returns an unspecified result.

The average time required by this operation is bounded by a constant.

procedure: hash-table-ref hash-table key [get-default]

Returns the datum associated with key in hash-table. If there is no association for key, and get-default is provided, it is called with no arguments and the value it yields is returned; if get-default is not provided, an error is signaled.

The average time required by this operation is bounded by a constant.
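For illustration, a short sketch of storing and retrieving associations; the table name and keys are arbitrary:

```scheme
(define table (make-equal-hash-table))
(hash-table-set! table 'x 10)
(hash-table-set! table 'x 11)     ; a second set! replaces the datum
(hash-table-ref table 'x)         ; ⇒ 11

;; For a missing key, get-default is called with no arguments.
(hash-table-ref table 'y (lambda () 'missing))   ; ⇒ missing

;; (hash-table-ref table 'y) with no get-default would signal an error.
```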

procedure: hash-table-ref/default hash-table key default
obsolete procedure: hash-table/get hash-table key default

Equivalent to

(hash-table-ref hash-table key (lambda () default))

procedure: hash-table-delete! hash-table key
obsolete procedure: hash-table/remove! hash-table key

If hash-table has an association for key, removes it. Returns an unspecified result.

The average time required by this operation is bounded by a constant.

procedure: hash-table-clear! hash-table
obsolete procedure: hash-table/clear! hash-table

Removes all associations in hash-table and returns an unspecified result.

The average and worst-case times required by this operation are bounded by a constant.

procedure: hash-table-size hash-table
obsolete procedure: hash-table/count hash-table

Returns the number of associations in hash-table as an exact non-negative integer. If hash-table does not hold its keys and data strongly, this is a conservative upper bound that may count some associations whose keys or data have recently been reclaimed by the garbage collector.

The average and worst-case times required by this operation are bounded by a constant.

procedure: hash-table->alist hash-table

Returns the contents of hash-table as a newly allocated alist. Each element of the alist is a pair (key . datum) where key is one of the keys of hash-table, and datum is its associated datum.

The average and worst-case times required by this operation are linear in the number of associations in the table.

procedure: hash-table-keys hash-table
obsolete procedure: hash-table/key-list hash-table

Returns a newly allocated list of the keys in hash-table.

The average and worst-case times required by this operation are linear in the number of associations in the table.

procedure: hash-table-values hash-table
obsolete procedure: hash-table/datum-list hash-table

Returns a newly allocated list of the data in hash-table. Each element of the list corresponds to one of the associations in hash-table; if the table contains multiple associations with the same datum, so will this list.

The average and worst-case times required by this operation are linear in the number of associations in the table.
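The descriptions of these three procedures do not promise any particular ordering of the returned lists, so the sketch below sorts the results before showing them. The table squares is an illustrative name.

```scheme
(define squares (make-strong-eqv-hash-table))
(for-each (lambda (n) (hash-table-set! squares n (* n n)))
          '(1 2 3))

(sort (hash-table-keys squares) <)      ; ⇒ (1 2 3)
(sort (hash-table-values squares) <)    ; ⇒ (1 4 9)
(sort (hash-table->alist squares)
      (lambda (a b) (< (car a) (car b))))
                                        ; ⇒ ((1 . 1) (2 . 4) (3 . 9))
```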

procedure: hash-table-walk hash-table procedure
obsolete procedure: hash-table/for-each hash-table procedure

Procedure must be a procedure of two arguments. Invokes procedure once for each association in hash-table, passing the association’s key and datum as arguments, in that order. Returns an unspecified result. Procedure must not modify hash-table, with one exception: it is permitted to call hash-table-delete! to remove the association being processed.
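A sketch of both uses of hash-table-walk described above: accumulating over all associations, and deleting the association currently being processed. The table stock and its contents are illustrative.

```scheme
(define stock (make-strong-eqv-hash-table))
(hash-table-set! stock 'apples 3)
(hash-table-set! stock 'pears 0)
(hash-table-set! stock 'plums 7)

;; Sum every datum in the table.
(define total 0)
(hash-table-walk stock
  (lambda (key datum)
    (set! total (+ total datum))))
total                                   ; ⇒ 10

;; Remove entries whose datum is zero.  Deleting the association being
;; processed is the one modification the walk permits.
(hash-table-walk stock
  (lambda (key datum)
    (if (= datum 0)
        (hash-table-delete! stock key))))
(hash-table-size stock)                 ; ⇒ 2
```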

The following procedure is useful when there is no sensible default value for hash-table-ref and the caller must choose between different actions depending on whether there is a datum associated with the key.

obsolete procedure: hash-table/lookup hash-table key if-found if-not-found

If-found must be a procedure of one argument, and if-not-found must be a procedure of no arguments. If hash-table contains an association for key, if-found is invoked on the datum of the association. Otherwise, if-not-found is invoked with no arguments. In either case, the result yielded by the invoked procedure is returned as the result of hash-table/lookup (hash-table/lookup reduces into the invoked procedure, i.e. calls it tail-recursively).

The average time required by this operation is bounded by a constant.

procedure: hash-table-update! hash-table key procedure [get-default]

Procedure must be a procedure of one argument and get-default, if supplied, must be a procedure of zero arguments. Applies procedure to the datum associated with key in hash-table or to the value of calling get-default if there is no association for key, associates the result with key, and returns an unspecified value. If get-default is not supplied and there’s no association for key, an error is signaled.

The average time required by this operation is bounded by a constant.

procedure: hash-table-update!/default hash-table key procedure default
obsolete procedure: hash-table/modify! hash-table key default procedure

Equivalent to

(hash-table-update! hash-table key procedure (lambda () default))

procedure: hash-table-intern! hash-table key get-default
obsolete procedure: hash-table/intern! hash-table key get-default

Get-default must be a procedure of zero arguments. Ensures that hash-table has an association for key and returns the associated datum. If hash-table did not have a datum associated with key, get-default is called and its value is used to create a new association for key.

The average time required by this operation is bounded by a constant.
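Two common idioms built on these procedures are counting occurrences and memoizing a computation. The sketch below uses illustrative names (counts, count-word!, cache, square-of, calls) that are not part of the system.

```scheme
;; Counting: start from 0 when the word has not been seen before.
(define counts (make-equal-hash-table))
(define (count-word! word)
  (hash-table-update!/default counts word
                              (lambda (n) (+ n 1))
                              0))
(for-each count-word! '("a" "b" "a"))
(hash-table-ref/default counts "a" 0)   ; ⇒ 2

;; Memoizing: get-default runs only when the key is absent.
(define calls 0)
(define cache (make-equal-hash-table))
(define (square-of n)
  (hash-table-intern! cache n
    (lambda ()
      (set! calls (+ calls 1))
      (* n n))))
(square-of 12)                          ; ⇒ 144, computed
(square-of 12)                          ; ⇒ 144, found in the table
calls                                   ; ⇒ 1
```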

The following procedure is sometimes useful in conjunction with weak and ephemeral hash tables. Normally it is not needed, because such hash tables clean themselves automatically as they are used.

procedure: hash-table-clean! hash-table
obsolete procedure: hash-table/clean! hash-table

If hash-table is a type of hash table that holds its keys or data weakly or ephemerally, this procedure recovers any space that was being used to record associations for objects that have been reclaimed by the garbage collector. Otherwise, this procedure does nothing. In either case, it returns an unspecified result.



11.4.3 Resizing of Hash Tables

Normally, hash tables automatically resize themselves according to need. Because of this, the programmer need not be concerned with management of the table’s size. However, some limited control over the table’s size is provided, which will be discussed below. This discussion involves two concepts, usable size and physical size, which we will now define.

The usable size of a hash table is the number of associations that the table can hold at a given time. If the number of associations in the table exceeds the usable size, the table will automatically grow, increasing the usable size to a new value that is sufficient to hold the associations.

The physical size is an abstract measure of a hash table that specifies how much space is allocated to hold the associations of the table. The physical size is always greater than or equal to the usable size. The physical size is not interesting in itself; it is interesting only for its effect on the performance of the hash table. While the average performance of a hash-table lookup is bounded by a constant, the worst-case performance is not. For a table containing a given number of associations, increasing the physical size of the table decreases the probability that worse-than-average performance will occur.

The physical size of a hash table is statistically related to the number of associations. However, it is possible to place bounds on the physical size, and from this to estimate the amount of space used by the table:

(define (hash-table-space-bounds count rehash-size rehash-threshold)
  (let ((tf (/ 1 rehash-threshold)))
    (values (if (exact-integer? rehash-size)
                (- (* count (+ 4 tf))
                   (* tf (+ rehash-size rehash-size)))
                (* count (+ 4 (/ tf (* rehash-size rehash-size)))))
            (* count (+ 4 tf)))))

This formula shows that, for a “normal” rehash size (that is, not an exact integer), the amount of space used by the hash table is proportional to the number of associations in the table. The constant of proportionality varies statistically, with the low bound being

(+ 4 (/ (/ 1 rehash-threshold) (* rehash-size rehash-size)))

and the high bound being

(+ 4 (/ 1 rehash-threshold))

which, for the default values of these parameters, are 4.25 and 5, respectively. Reducing the rehash size tightens these bounds but increases the amount of time spent resizing, so the rehash size gives some control over the time-space tradeoff of the table.
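The two bounds can be checked by direct evaluation. In this sketch the helper names low-constant and high-constant are introduced only for illustration; the expressions are the ones given above, evaluated at the default rehash size of 2.0 and rehash threshold of 1.

```scheme
;; Low bound on the constant of proportionality.
(define (low-constant rehash-size rehash-threshold)
  (+ 4 (/ (/ 1 rehash-threshold) (* rehash-size rehash-size))))

;; High bound on the constant of proportionality.
(define (high-constant rehash-threshold)
  (+ 4 (/ 1 rehash-threshold)))

(low-constant 2. 1)     ; ⇒ 4.25
(high-constant 1)       ; ⇒ 5

;; A smaller rehash size raises the low bound toward the high bound:
(low-constant 1.5 1)    ; 4 + 1/2.25, about 4.44
```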

The programmer can control the size of a hash table by means of three parameters:

If the programmer knows that the table will initially contain a specific number of items, initial-size can be given when the table is created. If initial-size is an exact non-negative integer, it specifies the initial usable size of the hash table; the table will not change size until the number of items in the table exceeds initial-size, after which automatic resizing is enabled and initial-size no longer has any effect. Otherwise, if initial-size is not given or is #f, the table is initialized to an unspecified size and automatic resizing is immediately enabled.

The rehash size specifies how much to increase the usable size of the hash table when it becomes full. It is either an exact positive integer, or a real number greater than one. If it is an integer, the new size is the sum of the old size and the rehash size. Otherwise, it is a real number, and the new size is the product of the old size and the rehash size. Increasing the rehash size decreases the average cost of an insertion, but increases the average amount of space used by the table. The rehash size of a table may be altered dynamically by the application in order to optimize the resizing of the table; for example, if the table will grow quickly for a known period and afterwards will not change size, performance might be improved by using a large rehash size during the growth phase and a small one during the static phase. The default rehash size of a newly constructed hash table is 2.0.

Warning: The use of an exact positive integer for a rehash size is almost always undesirable; this option is provided solely for compatibility with the Common Lisp hash-table mechanism. The reason for this has to do with the time penalty for resizing the hash table. The time needed to resize a hash table is proportional to the number of associations in the table. This resizing cost is amortized across the insertions required to fill the table to the point where it needs to grow again. If the table grows by an amount proportional to the number of associations, then the cost of resizing and the increase in size are both proportional to the number of associations, so the amortized cost of an insertion operation is still bounded by a constant. However, if the table grows by a constant amount, this is not true: the amortized cost of an insertion is not bounded by a constant. Thus, using a constant rehash size means that the average cost of an insertion increases proportionally to the number of associations in the hash table.

The rehash threshold is a real number, between zero exclusive and one inclusive, that specifies the ratio between a hash table’s usable size and its physical size. Decreasing the rehash threshold decreases the probability of worse-than-average insertion, deletion, and lookup times, but increases the physical size of the table for a given usable size. The default rehash threshold of a newly constructed hash table is 1.
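The warning above about integer rehash sizes can be made concrete with a small simulation. This is a simplified model, not the actual implementation: it assumes that each resize moves every association currently in the table, and that a resize happens exactly when an insertion finds the table full. The procedure resize-cost and its parameters are hypothetical names.

```scheme
;; Count how many associations are moved by resizes while inserting
;; N items into a table whose usable size starts at INITIAL and is
;; enlarged by the procedure GROW whenever the table is full.
(define (resize-cost n initial grow)
  (let loop ((count 0) (size initial) (moved 0))
    (cond ((= count n) moved)
          ((= count size)
           ;; Table is full: resize, moving all COUNT associations.
           (loop count (grow size) (+ moved count)))
          (else
           (loop (+ count 1) size moved)))))

;; Constant growth (integer rehash size of 10): quadratic total work.
(resize-cost 1000 10 (lambda (size) (+ size 10)))   ; ⇒ 49500

;; Proportional growth (rehash size of 2): linear total work.
(resize-cost 1000 10 (lambda (size) (* size 2)))    ; ⇒ 1270
```

Under this model, a thousand insertions cost about 50 moves each with constant growth, but just over one move each with doubling.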

procedure: hash-table-grow-size hash-table
obsolete procedure: hash-table/size hash-table

Returns the usable size of hash-table as an exact positive integer. This is the maximum number of associations that hash-table can hold before it will grow.

procedure: hash-table-shrink-size hash-table

Returns the minimum number of associations that hash-table can hold before it will shrink.

procedure: hash-table-rehash-size hash-table
obsolete procedure: hash-table/rehash-size hash-table

Returns the rehash size of hash-table.

procedure: set-hash-table-rehash-size! hash-table x
obsolete procedure: set-hash-table/rehash-size! hash-table x

X must be either an exact positive integer, or a real number that is greater than one. Sets the rehash size of hash-table to x and returns an unspecified result. This operation adjusts the “shrink threshold” of the table; the table might shrink if the number of associations is less than the new threshold.

procedure: hash-table-rehash-threshold hash-table
obsolete procedure: hash-table/rehash-threshold hash-table

Returns the rehash threshold of hash-table.

procedure: set-hash-table-rehash-threshold! hash-table x
obsolete procedure: set-hash-table/rehash-threshold! hash-table x

X must be a real number between zero exclusive and one inclusive. Sets the rehash threshold of hash-table to x and returns an unspecified result. This operation does not change the usable size of the table, but it usually changes the physical size of the table, which causes the table to be rehashed.
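A sketch of inspecting and adjusting these parameters on a single table, following the growth-phase scenario described earlier. The table name and the particular values are illustrative, and the exact sizes reported may vary.

```scheme
;; Ask for room for 100 associations before the first resize.
(define table (make-strong-eqv-hash-table 100))
(hash-table-grow-size table)        ; usable size of the table

;; Use a larger rehash size while the table is growing quickly...
(set-hash-table-rehash-size! table 3.)
(hash-table-rehash-size table)      ; ⇒ 3.

;; ...and trade space for fewer collisions with a lower threshold.
(set-hash-table-rehash-threshold! table .5)
(hash-table-rehash-threshold table) ; ⇒ .5
```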



11.4.4 Address Hashing

The procedures described in this section may be used to make very efficient key-hashing procedures for arbitrary objects. All of these procedures are based on address hashing, which uses the address of an object as its hash number. The great advantage of address hashing is that converting an arbitrary object to a hash number is extremely fast and takes the same amount of time for any object.

The disadvantage of address hashing is that the garbage collector changes the addresses of most objects. The hash-table implementation compensates for this disadvantage by automatically rehashing tables that use address hashing when garbage collections occur. Thus, in order to use these procedures for key hashing, it is necessary to tell the hash-table implementation (by means of the rehash-after-gc? argument to the hash-table type constructors) that the hash numbers computed by your key-hashing procedure must be recomputed after a garbage collection.

procedure: eq-hash object
procedure: eqv-hash object
procedure: equal-hash object

These procedures return a hash number for object. The result is always a non-negative integer, and in the case of eq-hash, a non-negative fixnum. Two objects that are equivalent according to eq?, eqv?, or equal?, respectively, will produce the same hash number when passed as arguments to these procedures, provided that the garbage collector does not run during or between the two calls.
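As an illustration of the guarantee above, two distinct lists with equal contents are not eq?, yet equal-hash gives them the same hash number. The names a and b are arbitrary.

```scheme
(define a (list 1 2 3))
(define b (list 1 2 3))

(eq? a b)                             ; ⇒ #f, distinct objects
(equal? a b)                          ; ⇒ #t
(= (equal-hash a) (equal-hash b))     ; ⇒ #t

;; eq-hash is only meaningful for the same object, and only until
;; the next garbage collection moves it.
(eq-hash a)                           ; some non-negative fixnum
```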

procedure: hash-by-identity key [modulus]

This SRFI 69 procedure returns the same value as eq-hash, optionally limited by modulus.

procedure: hash key [modulus]

This SRFI 69 procedure returns the same value as equal-hash, optionally limited by modulus.

obsolete procedure: hash-by-eqv key [modulus]

This procedure returns the same value as eqv-hash, optionally limited by modulus.

obsolete procedure: eq-hash-mod object modulus

This procedure is the key-hashing procedure used by make-strong-eq-hash-table.

obsolete procedure: eqv-hash-mod object modulus

This procedure is the key-hashing procedure used by make-strong-eqv-hash-table.

obsolete procedure: equal-hash-mod object modulus

This procedure is the key-hashing procedure used by make-equal-hash-table.



11.5 Object Hashing

The MIT/GNU Scheme object-hashing facility provides a mechanism for generating a unique hash number for an arbitrary object. This hash number, unlike an object’s address, is unchanged by garbage collection. The object-hashing facility is used in the generation of the written representation for many objects (see Custom Output), but it can be used for anything that needs a stable identifier for an arbitrary object.

All of these procedures accept an optional argument called hasher, which contains the object-integer associations. If given, this argument must be an object hasher as constructed by make-object-hasher (see below). If not given, a default hasher is used.

procedure: hash-object object [hasher]
obsolete procedure: hash object [hasher]
obsolete procedure: object-hash object [hasher]

hash-object associates an exact non-negative integer with object and returns that integer. If hash-object was previously called with object as its argument, the integer returned is the same as was returned by the previous call. hash-object guarantees that distinct objects (in the sense of eqv?) are associated with distinct integers.

procedure: unhash-object k [hasher]
obsolete procedure: unhash k [hasher]
obsolete procedure: object-unhash k [hasher]

unhash-object takes an exact non-negative integer k and returns the object associated with that integer. If there is no object associated with k, or if the object previously associated with k has been reclaimed by the garbage collector, an error of type condition-type:bad-range-argument is signaled. In other words, if hash-object previously returned k for some object, and that object has not been reclaimed, it is the value of the call to unhash-object.

An object that is passed to hash-object as an argument is not protected from being reclaimed by the garbage collector. If all other references to that object are eliminated, the object will be reclaimed. Subsequently calling unhash-object with the hash number of the (now reclaimed) object will signal an error.

(define x (cons 0 0))           ⇒  unspecified
(hash-object x)                 ⇒  77
(eqv? (hash-object x)
      (hash-object x))          ⇒  #t
(define x 0)                    ⇒  unspecified
(gc-flip)                       ;force a garbage collection
(unhash-object 77)              error→

procedure: object-hashed? object [hasher]

This predicate is true iff object has an associated hash number.

procedure: valid-object-hash? k [hasher]
obsolete procedure: valid-hash-number? k [hasher]

This predicate is true iff k is the hash number associated with some object.

Finally, this procedure makes new object hashers:

procedure: make-object-hasher
obsolete procedure: hash-table/make

This procedure creates and returns a new, empty object hasher that is suitable for use as the optional hasher argument to the above procedures. The returned hasher contains no associations.
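As a minimal sketch of how a private hasher might be used with the procedures above (the names my-hasher, thing and k are illustrative only, not part of the facility):

```scheme
;; A private object hasher, separate from the default one.
(define my-hasher (make-object-hasher))
(define thing (list 'a 'b 'c))

;; The first call assigns a number; later calls return the same number.
(define k (hash-object thing my-hasher))

(eqv? k (hash-object thing my-hasher))      ; ⇒ #t
(eq? thing (unhash-object k my-hasher))     ; ⇒ #t
(object-hashed? thing my-hasher)            ; ⇒ #t
(valid-object-hash? k my-hasher)            ; ⇒ #t
```

Because thing is still referenced by a top-level definition, it cannot be reclaimed, so the lookup by number is safe here.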


11.6 Red-Black Trees

Balanced binary trees are a useful data structure for maintaining large sets of associations whose keys are ordered. While most applications involving large association sets should use hash tables, some applications can benefit from the use of binary trees. Binary trees have two advantages over hash tables:

MIT/GNU Scheme provides an implementation of red-black trees. The red-black tree-balancing algorithm provides generally good performance because it doesn’t try to keep the tree very closely balanced. At any given node in the tree, one side of the node can be twice as high as the other in the worst case. With typical data the tree will remain fairly well balanced anyway.

A red-black tree takes space that is proportional to the number of associations in the tree. For the current implementation, the constant of proportionality is eight words per association.

Red-black trees hold their keys strongly. In other words, if a red-black tree contains an association for a given key, that key cannot be reclaimed by the garbage collector.

procedure: make-rb-tree key=? key<?

This procedure creates and returns a newly allocated red-black tree. The tree contains no associations. Key=? and key<? are predicates that compare two keys and determine whether they are equal to or less than one another, respectively. For any two keys, at most one of these predicates is true.

procedure: rb-tree? object

Returns #t if object is a red-black tree, otherwise returns #f.

procedure: rb-tree/insert! rb-tree key datum

Associates datum with key in rb-tree and returns an unspecified value. If rb-tree already has an association for key, that association is replaced. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in rb-tree.

procedure: rb-tree/lookup rb-tree key default

Returns the datum associated with key in rb-tree. If rb-tree doesn’t contain an association for key, default is returned. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in rb-tree.

procedure: rb-tree/delete! rb-tree key

If rb-tree contains an association for key, removes it. Returns an unspecified value. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in rb-tree.

procedure: rb-tree->alist rb-tree

Returns the contents of rb-tree as a newly allocated alist. Each element of the alist is a pair (key . datum) where key is one of the keys of rb-tree, and datum is its associated datum. The alist is sorted by key according to the key<? argument used to construct rb-tree. The time required by this operation is proportional to the number of associations in the tree.
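The following sketch exercises the operations described so far. It assumes exact integer keys, so that = and < can serve as key=? and key<?; the name tree is illustrative.

```scheme
;; Integer keys: = and < act as key=? and key<?.
(define tree (make-rb-tree = <))

(rb-tree/insert! tree 3 'three)
(rb-tree/insert! tree 1 'one)
(rb-tree/insert! tree 2 'two)
(rb-tree/insert! tree 3 'trois)        ; replaces the datum for key 3

(rb-tree/lookup tree 3 #f)             ; ⇒ trois
(rb-tree/lookup tree 9 'missing)       ; ⇒ missing

(rb-tree/delete! tree 2)
(rb-tree->alist tree)                  ; ⇒ ((1 . one) (3 . trois))
```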

procedure: rb-tree/key-list rb-tree

Returns a newly allocated list of the keys in rb-tree. The list is sorted by key according to the key<? argument used to construct rb-tree. The time required by this operation is proportional to the number of associations in the tree.

procedure: rb-tree/datum-list rb-tree

Returns a newly allocated list of the datums in rb-tree. Each element of the list corresponds to one of the associations in rb-tree, so if the tree contains multiple associations with the same datum, so will this list. The list is sorted by the keys of the associations, even though they do not appear in the result. The time required by this operation is proportional to the number of associations in the tree.

This procedure is equivalent to:

(lambda (rb-tree) (map cdr (rb-tree->alist rb-tree)))

procedure: rb-tree/equal? rb-tree-1 rb-tree-2 datum=?

Compares rb-tree-1 and rb-tree-2 for equality, returning #t iff they are equal and #f otherwise. The trees must have been constructed with the same equality and order predicates (same in the sense of eq?). The keys of the trees are compared using the key=? predicate used to build the trees, while the datums of the trees are compared using the equivalence predicate datum=?. The worst-case time required by this operation is proportional to the number of associations in the tree.

procedure: rb-tree/empty? rb-tree

Returns #t iff rb-tree contains no associations. Otherwise returns #f.

procedure: rb-tree/size rb-tree

Returns the number of associations in rb-tree, an exact non-negative integer. The average and worst-case times required by this operation are proportional to the number of associations in the tree.

procedure: rb-tree/height rb-tree

Returns the height of rb-tree, an exact non-negative integer. This is the length of the longest path from a leaf of the tree to the root. The average and worst-case times required by this operation are proportional to the number of associations in the tree.

The returned value satisfies the following:

(lambda (rb-tree)
  (let ((size (rb-tree/size rb-tree))
        (lg (lambda (x) (/ (log x) (log 2)))))
    (<= (lg size)
        (rb-tree/height rb-tree)
        (* 2 (lg (+ size 1))))))

procedure: rb-tree/copy rb-tree

Returns a newly allocated copy of rb-tree. The copy is identical to rb-tree in all respects, except that changes to rb-tree do not affect the copy, and vice versa. The time required by this operation is proportional to the number of associations in the tree.

procedure: alist->rb-tree alist key=? key<?

Returns a newly allocated red-black tree that contains the same associations as alist. This procedure is equivalent to:

(lambda (alist key=? key<?)
  (let ((tree (make-rb-tree key=? key<?)))
    (for-each (lambda (association)
                (rb-tree/insert! tree
                                 (car association)
                                 (cdr association)))
              alist)
    tree))

The following operations provide access to the smallest and largest members in a red-black tree. They are useful for implementing priority queues.

procedure: rb-tree/min rb-tree default

Returns the smallest key in rb-tree, or default if the tree is empty.

procedure: rb-tree/min-datum rb-tree default

Returns the datum associated with the smallest key in rb-tree, or default if the tree is empty.

procedure: rb-tree/min-pair rb-tree

Finds the smallest key in rb-tree and returns a pair containing that key and its associated datum. If the tree is empty, returns #f.

procedure: rb-tree/max rb-tree default

Returns the largest key in rb-tree, or default if the tree is empty.

procedure: rb-tree/max-datum rb-tree default

Returns the datum associated with the largest key in rb-tree, or default if the tree is empty.

procedure: rb-tree/max-pair rb-tree

Finds the largest key in rb-tree and returns a pair containing that key and its associated datum. If the tree is empty, returns #f.

procedure: rb-tree/delete-min! rb-tree default
procedure: rb-tree/delete-min-datum! rb-tree default
procedure: rb-tree/delete-min-pair! rb-tree
procedure: rb-tree/delete-max! rb-tree default
procedure: rb-tree/delete-max-datum! rb-tree default
procedure: rb-tree/delete-max-pair! rb-tree

These operations are exactly like the accessors above, in that they return information associated with the smallest or largest key, except that they simultaneously delete that key.
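As a rough sketch of the priority-queue use mentioned above, a red-black tree keyed by numeric priority can be drained smallest-first. The names queue, enqueue! and dequeue! are hypothetical helpers, and the sketch assumes priorities are distinct, since inserting an existing key replaces its datum.

```scheme
;; A priority queue keyed by exact integer priority (smaller = sooner).
(define queue (make-rb-tree = <))

(define (enqueue! priority task)
  (rb-tree/insert! queue priority task))

;; Removes the association with the smallest key and returns its datum,
;; or #f when the queue is empty.
(define (dequeue!)
  (rb-tree/delete-min-datum! queue #f))

(enqueue! 5 'write-report)
(enqueue! 1 'fix-build)
(enqueue! 3 'review-patch)

(define first-task (dequeue!))         ; ⇒ fix-build
(define second-task (dequeue!))        ; ⇒ review-patch
(rb-tree/size queue)                   ; ⇒ 1
```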


11.7 Weight-Balanced Trees

Balanced binary trees are a useful data structure for maintaining large sets of ordered objects or sets of associations whose keys are ordered. MIT/GNU Scheme has a comprehensive implementation of weight-balanced binary trees which has several advantages over the other data structures for large aggregates:

These features make weight-balanced trees suitable for a wide range of applications, especially those that require large numbers of sets or discrete maps. Applications that have a few global databases and/or concentrate on element-level operations like insertion and lookup are probably better off using hash tables or red-black trees.

The size of a tree is the number of associations that it contains. Weight-balanced binary trees are balanced to keep the sizes of the subtrees of each node within a constant factor of each other. This ensures logarithmic times for single-path operations (like lookup and insertion). A weight-balanced tree takes space that is proportional to the number of associations in the tree. For the current implementation, the constant of proportionality is six words per association.

Weight-balanced trees can be used as an implementation for either discrete sets or discrete maps (associations). Sets are implemented by ignoring the datum that is associated with the key. Under this scheme if an association exists in the tree this indicates that the key of the association is a member of the set. Typically a value such as (), #t or #f is associated with the key.

Many operations can be viewed as computing a result that, depending on whether the tree arguments are thought of as sets or maps, is known by two different names. An example is wt-tree/member?: when the tree argument is regarded as a set, it computes the set membership operation; when the tree is regarded as a discrete map, it is the predicate testing whether the map is defined at a given element. Most names in this package have been chosen based on interpreting the trees as sets, hence the name wt-tree/member? rather than wt-tree/defined-at?.


11.7.1 Construction of Weight-Balanced Trees

Binary trees require there to be a total order on the keys used to arrange the elements in the tree. Weight-balanced trees are organized by types, where the type is an object encapsulating the ordering relation. Creating a tree is a two-stage process. First a tree type must be created from the predicate that gives the ordering. The tree type is then used for making trees, either empty or singleton trees or trees from other aggregate structures like association lists. Once created, a tree ‘knows’ its type and the type is used to test compatibility between trees in operations taking two trees. Usually a small number of tree types are created at the beginning of a program and used many times throughout the program’s execution.

procedure: make-wt-tree-type key<?

This procedure creates and returns a new tree type based on the ordering predicate key<?. Key<? must be a total ordering, having the property that for all key values a, b and c:

(key<? a a)                         ⇒ #f
(and (key<? a b) (key<? b a))       ⇒ #f
(if (and (key<? a b) (key<? b c))
    (key<? a c)
    #t)                             ⇒ #t

Two key values are assumed to be equal if neither is less than the other by key<?.

Each call to make-wt-tree-type returns a distinct value, and trees are only compatible if their tree types are eq?. A consequence is that trees that are intended to be used in binary-tree operations must all be created with a tree type originating from the same call to make-wt-tree-type.

variable: number-wt-type

A standard tree type for trees with numeric keys. Number-wt-type could have been defined by

(define number-wt-type (make-wt-tree-type <))

variable: string-wt-type

A standard tree type for trees with string keys. String-wt-type could have been defined by

(define string-wt-type (make-wt-tree-type string<?))

procedure: make-wt-tree wt-tree-type

This procedure creates and returns a newly allocated weight-balanced tree. The tree is empty, i.e. it contains no associations. Wt-tree-type is a weight-balanced tree type obtained by calling make-wt-tree-type; the returned tree has this type.

procedure: singleton-wt-tree wt-tree-type key datum

This procedure creates and returns a newly allocated weight-balanced tree. The tree contains a single association, that of datum with key. Wt-tree-type is a weight-balanced tree type obtained by calling make-wt-tree-type; the returned tree has this type.

procedure: alist->wt-tree tree-type alist

Returns a newly allocated weight-balanced tree that contains the same associations as alist. This procedure is equivalent to:

(lambda (type alist)
  (let ((tree (make-wt-tree type)))
    (for-each (lambda (association)
                (wt-tree/add! tree
                              (car association)
                              (cdr association)))
              alist)
    tree))
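To illustrate how the ordering predicate alone determines key equality, here is a small sketch of a custom tree type. The names ci-string-wt-type and ages are illustrative, and the sketch assumes the weight-balanced tree procedures are available in the running image.

```scheme
;; A tree type whose keys are strings compared without regard to case.
;; Two keys are equal when neither is string-ci<? the other.
(define ci-string-wt-type (make-wt-tree-type string-ci<?))

(define ages
  (alist->wt-tree ci-string-wt-type
                  '(("Alice" . 31) ("bob" . 27) ("ALICE" . 32))))

;; "Alice" and "ALICE" are the same key under this ordering, and the
;; later association overrides the earlier one.
(wt-tree/size ages)                    ; ⇒ 2
(wt-tree/lookup ages "alice" #f)       ; ⇒ 32
(wt-tree/member? "BOB" ages)           ; ⇒ #t
```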

11.7.2 Basic Operations on Weight-Balanced Trees

This section describes the basic tree operations on weight-balanced trees. These operations are the usual tree operations for insertion, deletion and lookup, some predicates and a procedure for determining the number of associations in a tree.

procedure: wt-tree? object

Returns #t if object is a weight-balanced tree, otherwise returns #f.

procedure: wt-tree/empty? wt-tree

Returns #t if wt-tree contains no associations, otherwise returns #f.

procedure: wt-tree/size wt-tree

Returns the number of associations in wt-tree, an exact non-negative integer. This operation takes constant time.

procedure: wt-tree/add wt-tree key datum

Returns a new tree containing all the associations in wt-tree and the association of datum with key. If wt-tree already had an association for key, the new association overrides the old. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in wt-tree.

procedure: wt-tree/add! wt-tree key datum

Associates datum with key in wt-tree and returns an unspecified value. If wt-tree already has an association for key, that association is replaced. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in wt-tree.

procedure: wt-tree/member? key wt-tree

Returns #t if wt-tree contains an association for key, otherwise returns #f. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in wt-tree.

procedure: wt-tree/lookup wt-tree key default

Returns the datum associated with key in wt-tree. If wt-tree doesn’t contain an association for key, default is returned. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in wt-tree.

procedure: wt-tree/delete wt-tree key

Returns a new tree containing all the associations in wt-tree, except that if wt-tree contains an association for key, it is removed from the result. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in wt-tree.

procedure: wt-tree/delete! wt-tree key

If wt-tree contains an association for key the association is removed. Returns an unspecified value. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in wt-tree.
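The sketch below contrasts the functional operations (wt-tree/add, wt-tree/delete), which return new trees, with the destructive ones (wt-tree/add!, wt-tree/delete!), which modify their argument. The names t0 through t3 are illustrative.

```scheme
(define t0 (make-wt-tree number-wt-type))

;; Functional updates: each call returns a new tree.
(define t1 (wt-tree/add t0 1 'one))
(define t2 (wt-tree/add t1 2 'two))
(define t3 (wt-tree/delete t2 1))

(wt-tree/size t0)                      ; ⇒ 0  (t0 is unchanged)
(wt-tree/size t2)                      ; ⇒ 2
(wt-tree/member? 1 t2)                 ; ⇒ #t
(wt-tree/member? 1 t3)                 ; ⇒ #f

;; Destructive update: t0 itself is modified.
(wt-tree/add! t0 7 'seven)
(wt-tree/lookup t0 7 #f)               ; ⇒ seven
```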


11.7.3 Advanced Operations on Weight-Balanced Trees

In the following the size of a tree is the number of associations that the tree contains, and a smaller tree contains fewer associations.

procedure: wt-tree/split< wt-tree bound

Returns a new tree containing all and only the associations in wt-tree that have a key that is less than bound in the ordering relation of the tree type of wt-tree. The average and worst-case times required by this operation are proportional to the logarithm of the size of wt-tree.

procedure: wt-tree/split> wt-tree bound

Returns a new tree containing all and only the associations in wt-tree that have a key that is greater than bound in the ordering relation of the tree type of wt-tree. The average and worst-case times required by this operation are proportional to the logarithm of the size of wt-tree.
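A short sketch of the two split operations follows. Note that an association whose key equals bound appears in neither result. The helper tree->alist is hypothetical and is defined here only to display results.

```scheme
(define squares
  (alist->wt-tree number-wt-type
                  '((1 . 1) (2 . 4) (3 . 9) (4 . 16) (5 . 25))))

;; Hypothetical helper: a sorted alist of the tree's associations.
(define (tree->alist tree)
  (wt-tree/fold (lambda (key datum acc) (cons (cons key datum) acc))
                '()
                tree))

(tree->alist (wt-tree/split< squares 3))   ; ⇒ ((1 . 1) (2 . 4))
(tree->alist (wt-tree/split> squares 3))   ; ⇒ ((4 . 16) (5 . 25))
(wt-tree/size squares)                     ; ⇒ 5  (the original is unchanged)
```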

procedure: wt-tree/union wt-tree-1 wt-tree-2

Returns a new tree containing all the associations from both trees. This operation is asymmetric: when both trees have an association for the same key, the returned tree associates the datum from wt-tree-2 with the key. Thus if the trees are viewed as discrete maps then wt-tree/union computes the map override of wt-tree-1 by wt-tree-2. If the trees are viewed as sets the result is the set union of the arguments. The worst-case time required by this operation is proportional to the sum of the sizes of both trees. If the minimum key of one tree is greater than the maximum key of the other tree then the worst-case time required is proportional to the logarithm of the size of the larger tree.

procedure: wt-tree/intersection wt-tree-1 wt-tree-2

Returns a new tree containing all and only those associations from wt-tree-1 that have keys appearing as the key of an association in wt-tree-2. Thus the associated data in the result are those from wt-tree-1. If the trees are being used as sets the result is the set intersection of the arguments. As a discrete map operation, wt-tree/intersection computes the domain restriction of wt-tree-1 to (the domain of) wt-tree-2. The worst-case time required by this operation is proportional to the sum of the sizes of the trees.

procedure: wt-tree/difference wt-tree-1 wt-tree-2

Returns a new tree containing all and only those associations from wt-tree-1 that have keys that do not appear as the key of an association in wt-tree-2. If the trees are viewed as sets the result is the asymmetric set difference of the arguments. As a discrete map operation, it computes the domain restriction of wt-tree-1 to the complement of (the domain of) wt-tree-2. The worst-case time required by this operation is proportional to the sum of the sizes of the trees.
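The set and map readings of these three operations can be sketched as follows. The helpers list->set and set->list and the variables a, b, m1 and m2 are hypothetical; sets are represented by associating #t with every key.

```scheme
;; Hypothetical helpers for viewing trees as sets of numbers.
(define (list->set keys)
  (alist->wt-tree number-wt-type
                  (map (lambda (k) (cons k #t)) keys)))

(define (set->list set)
  (wt-tree/fold (lambda (key datum acc) (cons key acc)) '() set))

(define a (list->set '(1 2 3 4)))
(define b (list->set '(3 4 5)))

(set->list (wt-tree/union a b))            ; ⇒ (1 2 3 4 5)
(set->list (wt-tree/intersection a b))     ; ⇒ (3 4)
(set->list (wt-tree/difference a b))       ; ⇒ (1 2)

;; Viewed as maps, union is an override: the second tree wins.
(define m1 (alist->wt-tree number-wt-type '((1 . old) (2 . old))))
(define m2 (alist->wt-tree number-wt-type '((2 . new) (3 . new))))
(wt-tree/lookup (wt-tree/union m1 m2) 2 #f)   ; ⇒ new
```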

procedure: wt-tree/subset? wt-tree-1 wt-tree-2

Returns #t iff the key of each association in wt-tree-1 is the key of some association in wt-tree-2, otherwise returns #f. Viewed as a set operation, wt-tree/subset? is the improper subset predicate. A proper subset predicate can be constructed:

(define (proper-subset? s1 s2)
  (and (wt-tree/subset? s1 s2)
       (< (wt-tree/size s1) (wt-tree/size s2))))

As a discrete map operation, wt-tree/subset? is the subset test on the domain(s) of the map(s). In the worst case, the time required by this operation is proportional to the size of wt-tree-1.

procedure: wt-tree/set-equal? wt-tree-1 wt-tree-2

Returns #t iff for every association in wt-tree-1 there is an association in wt-tree-2 that has the same key, and vice versa.

Viewing the arguments as sets, wt-tree/set-equal? is the set equality predicate. As a map operation it determines if two maps are defined on the same domain.

This procedure is equivalent to

(lambda (wt-tree-1 wt-tree-2)
  (and (wt-tree/subset? wt-tree-1 wt-tree-2)
       (wt-tree/subset? wt-tree-2 wt-tree-1)))

In the worst case the time required by this operation is proportional to the size of the smaller tree.

procedure: wt-tree/fold combiner initial wt-tree

This procedure reduces wt-tree by combining all the associations, using a reverse in-order traversal, so the associations are visited in reverse order. Combiner is a procedure of three arguments: a key, a datum and the accumulated result so far. Provided combiner takes time bounded by a constant, wt-tree/fold takes time proportional to the size of wt-tree.

A sorted association list can be derived simply:

(wt-tree/fold (lambda (key datum list)
                (cons (cons key datum) list))
              '()
              wt-tree)

The data in the associations can be summed like this:

(wt-tree/fold (lambda (key datum sum) (+ sum datum))
              0
              wt-tree)

procedure: wt-tree/for-each action wt-tree

This procedure traverses wt-tree in order, applying action to each association. The associations are processed in increasing order of their keys. Action is a procedure of two arguments that takes the key and datum respectively of the association. Provided action takes time bounded by a constant, wt-tree/for-each takes time proportional to the size of wt-tree. The example prints the tree:

(wt-tree/for-each (lambda (key value)
                    (display (list key value)))
                  wt-tree)

procedure: wt-tree/union-merge wt-tree-1 wt-tree-2 merge

Returns a new tree containing all the associations from both trees. If both trees have an association for the same key, the datum associated with that key in the result tree is computed by applying the procedure merge to the key, the value from wt-tree-1 and the value from wt-tree-2. Merge is of the form

(lambda (key datum-1 datum-2) …)

If some key occurs only in one tree, that association will appear in the result tree without being processed by merge, so for this operation to make sense, either merge must have both a right and left identity that correspond to the association being absent in one of the trees, or some guarantee must be made, for example, all the keys in one tree are known to occur in the other.

These are all reasonable procedures for merge

(lambda (key val1 val2) (+ val1 val2))
(lambda (key val1 val2) (append val1 val2))
(lambda (key val1 val2) (wt-tree/union val1 val2))

However, a procedure like

(lambda (key val1 val2) (- val1 val2))

would result in a subtraction of the data for all associations with keys occurring in both trees, but associations with keys occurring in only the second tree would be copied, not negated, as is presumably the intent. It is up to the programmer to ensure that this never happens.

This procedure has the same time behavior as wt-tree/union but with a slightly worse constant factor. Indeed, wt-tree/union might have been defined like this:

(define (wt-tree/union tree1 tree2)
  (wt-tree/union-merge tree1 tree2
                       (lambda (key val1 val2) val2)))

The merge procedure takes the key as a parameter in case the data are not independent of the key.
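As a sketch of a merge procedure for which an absent association behaves like the identity, the following combines two tables of counts by addition. The tree names monday, tuesday and total are illustrative.

```scheme
(define monday
  (alist->wt-tree string-wt-type '(("apples" . 3) ("pears" . 1))))
(define tuesday
  (alist->wt-tree string-wt-type '(("apples" . 2) ("plums" . 4))))

;; Keys present in both trees have their counts added; keys present in
;; only one tree are copied unchanged, as if the missing count were 0.
(define total
  (wt-tree/union-merge monday tuesday
                       (lambda (key n1 n2) (+ n1 n2))))

(wt-tree/lookup total "apples" 0)      ; ⇒ 5
(wt-tree/lookup total "pears" 0)       ; ⇒ 1
(wt-tree/lookup total "plums" 0)       ; ⇒ 4
```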


11.7.4 Indexing Operations on Weight-Balanced Trees

Weight-balanced trees support operations that view the tree as a sorted sequence of associations. Elements of the sequence can be accessed by position, and the position of an element in the sequence can be determined, both in logarithmic time.

procedure: wt-tree/index wt-tree index
procedure: wt-tree/index-datum wt-tree index
procedure: wt-tree/index-pair wt-tree index

Returns the 0-based indexth association of wt-tree in the sorted sequence under the tree’s ordering relation on the keys. wt-tree/index returns the indexth key, wt-tree/index-datum returns the datum associated with the indexth key and wt-tree/index-pair returns a new pair (key . datum) which is the cons of the indexth key and its datum. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in the tree.

These operations signal a condition of type condition-type:bad-range-argument if index<0 or if index is greater than or equal to the number of associations in the tree. If the tree is empty, they signal an anonymous error.

Indexing can be used to find the median and maximum keys in the tree as follows:

median:   (wt-tree/index wt-tree
                         (quotient (wt-tree/size wt-tree)
                                   2))
maximum:  (wt-tree/index wt-tree
                         (- (wt-tree/size wt-tree)
                            1))

procedure: wt-tree/rank wt-tree key

Determines the 0-based position of key in the sorted sequence of the keys under the tree’s ordering relation, or #f if the tree has no association for key. This procedure returns either an exact non-negative integer or #f. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in the tree.
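The indexing and rank operations are inverses of one another on keys that are present, as this small sketch suggests. The tree name scores is illustrative.

```scheme
(define scores
  (alist->wt-tree number-wt-type
                  '((40 . d) (10 . a) (30 . c) (20 . b))))

;; Positions are 0-based and follow the key ordering: 10 20 30 40.
(wt-tree/index scores 0)               ; ⇒ 10
(wt-tree/index-datum scores 2)         ; ⇒ c
(wt-tree/index-pair scores 3)          ; ⇒ (40 . d)

(wt-tree/rank scores 30)               ; ⇒ 2
(wt-tree/rank scores 25)               ; ⇒ #f  (no such key)

;; Looking up the key at a key's own rank returns that key.
(wt-tree/index scores (wt-tree/rank scores 20))   ; ⇒ 20
```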

procedure: wt-tree/min wt-tree
procedure: wt-tree/min-datum wt-tree
procedure: wt-tree/min-pair wt-tree

Returns the association of wt-tree that has the least key under the tree’s ordering relation. wt-tree/min returns the least key, wt-tree/min-datum returns the datum associated with the least key and wt-tree/min-pair returns a new pair (key . datum) which is the cons of the minimum key and its datum. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in the tree.

These operations signal an error if the tree is empty. They could have been written

(define (wt-tree/min tree)
  (wt-tree/index tree 0))
(define (wt-tree/min-datum tree)
  (wt-tree/index-datum tree 0))
(define (wt-tree/min-pair tree)
  (wt-tree/index-pair tree 0))

procedure: wt-tree/delete-min wt-tree

Returns a new tree containing all of the associations in wt-tree except the association with the least key under the wt-tree’s ordering relation. An error is signalled if the tree is empty. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in the tree. This operation is equivalent to

(wt-tree/delete wt-tree (wt-tree/min wt-tree))

procedure: wt-tree/delete-min! wt-tree

Removes the association with the least key under the wt-tree’s ordering relation. An error is signalled if the tree is empty. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in the tree. This operation is equivalent to

(wt-tree/delete! wt-tree (wt-tree/min wt-tree))
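Combining the minimum accessors with wt-tree/delete-min gives a simple way to consume a tree in key order without modifying it. In this sketch, drain is a hypothetical helper, and the emptiness check guards against the error signalled on an empty tree.

```scheme
;; Hypothetical helper: returns the associations of tree in increasing
;; key order, leaving tree itself untouched.
(define (drain tree)
  (if (wt-tree/empty? tree)
      '()
      (cons (wt-tree/min-pair tree)
            (drain (wt-tree/delete-min tree)))))

(define letters
  (alist->wt-tree number-wt-type '((3 . c) (1 . a) (2 . b))))

(drain letters)                        ; ⇒ ((1 . a) (2 . b) (3 . c))
(wt-tree/size letters)                 ; ⇒ 3
```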

11.8 Associative Maps

Starting with version 11.1, MIT/GNU Scheme provides an abstract associative-map interface that can be backed by any kind of association mechanism. The interface is similar to that of SRFI 125. Associative maps can be mutable or immutable, depending on the backing implementation. As of this writing we support the following implementations: alists, weak alists, red-black trees, hash tables, and tries, all of which are mutable.


11.8.1 Amap constructors

All associative-map constructors take a comparator argument followed by a list of additional specification arguments. In all cases, the comparator is used to compare keys in the map, while the additional arguments specify the implementation and/or features of the map being created.

procedure: make-amap comparator arg …

Creates a new mutable associative map as specified by comparator and args.

procedure: alist->amap alist comparator arg …

Creates a new associative map as specified by comparator and args, and which contains the associations in alist.

procedure: amap-unfold stop? mapper successor seed comparator arg …

Creates a new associative map as specified by comparator and args. The associations in the resulting map are generated using the additional arguments.

The stop? argument is a unary predicate that takes a state value and returns #t if generation is complete, otherwise #f. The mapper argument is a unary procedure that takes a state value and returns two values: a key and a value. The successor argument is a unary procedure that takes a state value and returns a new state value. And seed is the initial state value.

The process by which the generator adds associations to the map is this:

(let loop ((state seed))
  (if (stop? state)
      result
      (let-values (((key value) (mapper state)))
        (amap-set! result key value)
        (loop (successor state)))))
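As a hedged sketch of the generation process, the following builds a map from the integers 1 through 5 to their squares. It assumes that the SRFI 128 procedure make-equal-comparator is available and that an implementation is chosen automatically when no additional arguments are given; neither assumption is stated in this section. The name squares is illustrative.

```scheme
;; Assumption: make-equal-comparator (SRFI 128) is available, and the
;; constructor selects a suitable implementation when given no extra args.
(define squares
  (amap-unfold (lambda (n) (> n 5))              ; stop?
               (lambda (n) (values n (* n n)))   ; mapper: key and value
               (lambda (n) (+ n 1))              ; successor
               1                                 ; seed
               (make-equal-comparator)))

(amap-ref/default squares 4 #f)        ; ⇒ 16
(amap-contains? squares 6)             ; ⇒ #f
```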

The arguments passed to a constructor can be divided into categories:

This is a complex set of possible arguments. In order to help explore what arguments can be used, and in what combinations, we provide some utility procedures:

procedure: amap-implementation-names

Returns a list of the supported implementation names.

procedure: amap-implementation-supported-args name

Returns a list of the arguments supported by the implementation specified by name. This list can include the procedure exact-nonnegative-integer? if the implementation supports an initial size.

procedure: amap-implementation-supports-args? name args

Returns #t if the implementation specified by name supports args, otherwise returns #f.

An implementation may support a limited set of comparators. For example, a hash table requires a comparator that satisfies comparator-hashable?, while a binary tree requires one satisfying comparator-ordered?.

procedure: amap-implementation-supports-comparator? name comparator

Returns #t if the implementation specified by name supports comparator, otherwise returns #f.
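One possible way to explore the available implementations with these utilities is sketched below. The helper describe-implementations is hypothetical, and the use of make-equal-comparator (SRFI 128) is an assumption; the result depends on the installed release, so no output is shown.

```scheme
;; Assumption: make-equal-comparator (SRFI 128) is available.
(define cmp (make-equal-comparator))

;; Hypothetical helper: for each implementation name, report whether it
;; accepts cmp and which constructor arguments it supports.
(define (describe-implementations)
  (map (lambda (name)
         (list name
               (amap-implementation-supports-comparator? name cmp)
               (amap-implementation-supported-args name)))
       (amap-implementation-names)))

(for-each (lambda (entry)
            (write entry)
            (newline))
          (describe-implementations))
```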


11.8.2 Amap predicates

procedure: amap? object

True iff object is an associative map.

procedure: amap-contains? amap key

True iff amap contains an association for key.

procedure: amap-empty? amap

True iff amap has no associations.

procedure: amap=? value-comparator amap1 amap2

True iff the associations in amap1 are the same as the associations in amap2. This means that amap1 and amap2 have the same keys (in the sense of their shared equality predicate), and that for each key they have the same value (in the sense of value-comparator).

procedure: amap-mutable? amap

True iff amap is mutable.


11.8.3 Amap accessors

procedure: amap-ref amap key [fail [succeed]]

Looks up the value associated with key in amap, invokes the procedure succeed on it, and returns its result; if succeed is not provided, then the value itself is returned. If key is not contained in amap and fail is supplied, then fail is invoked on no arguments and its result is returned. Otherwise an error is signaled.

procedure: amap-ref/default amap key default

Semantically equivalent to, but may be more efficient than, the following code:

(amap-ref amap key (lambda () default))

procedure: amap-comparator amap

Returns the comparator that was used to create amap.

procedure: amap-args amap

Returns the extra arguments that were used to create amap.

procedure: amap-implementation-name amap

Returns the name of amap’s backing implementation.


11.8.4 Amap mutators

procedure: amap-set! amap [key value] …

Repeatedly mutates amap, creating new associations in it by processing the arguments from left to right. The args alternate between keys and values. Whenever there is a previous association for a key, it is deleted. It is an error if a key does not satisfy the type check procedure of the comparator of amap. Likewise, it is an error if a key is not a valid argument to the equality predicate of amap. Returns an unspecified value.

procedure: amap-delete! amap key …

Deletes any association to each key in amap and returns the number of keys that had associations.

procedure: amap-intern! amap key fail

Effectively invokes amap-ref with the given arguments and returns what it returns. If key was not found in amap, its value is set to the result of calling fail.

procedure: amap-update! amap key updater [fail [succeed]]

Semantically equivalent to, but may be more efficient than, the following code:

(amap-set! amap key (updater (amap-ref amap key fail succeed)))

procedure: amap-update!/default amap key updater default

Semantically equivalent to, but may be more efficient than, the following code:

(amap-set! amap key (updater (amap-ref/default amap key default)))

procedure: amap-pop! amap

Chooses an arbitrary association from amap and removes it, returning the key and value as two values. Signals an error if amap is empty.
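
The mutators above can be combined in small helpers. The following is only a sketch: tally! and drain! are hypothetical names, and amap is assumed to be an existing mutable associative map. The first helper counts occurrences of keys with amap-update!/default; the second empties the map with amap-pop!, collecting the removed associations as pairs.

(define (tally! amap keys)
  ;; Add one to the count stored under each key, starting from 0.
  (for-each (lambda (key)
              (amap-update!/default amap key (lambda (n) (+ n 1)) 0))
            keys))

(define (drain! amap)
  ;; amap-pop! returns two values, so receive them with call-with-values.
  (let loop ((acc '()))
    (if (amap-empty? amap)
        acc
        (call-with-values (lambda () (amap-pop! amap))
          (lambda (key value)
            (loop (cons (cons key value) acc)))))))

Because amap-pop! chooses an arbitrary association, the order of the pairs returned by drain! is unspecified.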

procedure: amap-clear! amap

Deletes all the associations from amap.

procedure: amap-clean! amap

If amap has weak or ephemeral associations, cleans up any storage for associations whose key and/or value has been reclaimed by the garbage collector. Otherwise does nothing.

This procedure does not have any visible effect, since the associations it cleans up are ignored by all other parts of the interface. Additionally, most implementations clean up their storage incrementally as they are used. But this procedure provides for edge cases where the reclaimed storage might matter.



11.8.5 Amap mapping and folding

procedure: amap-map procedure amap [comparator arg …]

Returns a newly allocated associative map as if by (make-amap comparator arg …). Calls procedure for every association in amap with the value of the association. The key of the association and the result of invoking procedure are entered into the new map.

If comparator recognizes multiple keys in amap as equivalent, any one of such associations is taken.

procedure: amap-for-each procedure amap

Calls procedure for every association in amap with two arguments: the key of the association and the value of the association. The value returned by procedure is discarded. Returns an unspecified value.

procedure: amap-map! procedure amap

Calls procedure for every association in amap with two arguments: the key of the association and the value of the association. The value returned by procedure is used to update the value of the association. Returns an unspecified value.

procedure: amap-map->list procedure amap

Calls procedure for every association in amap with two arguments: the key of the association and the value of the association. The values returned by the invocations of procedure are accumulated into a list, which is returned.

procedure: amap-fold kons knil amap

Calls kons for every association in amap with three arguments: the key of the association, the value of the association, and an accumulated value. For the first invocation of kons the accumulated value is knil; for each subsequent invocation it is the value returned by the previous invocation. The value returned by amap-fold is the return value of the last invocation of kons.
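
For instance, a fold can sum the values of a map. This is a minimal sketch: amap-total is a hypothetical name, and amap is assumed to be an existing associative map whose values are numbers.

(define (amap-total amap)
  ;; kons receives the key, the value, and the running total.
  (amap-fold (lambda (key value total)
               (+ value total))
             0
             amap))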

procedure: amap-prune! predicate amap

Calls predicate for every association in amap with two arguments, the key and the value of the association, and removes all associations from amap for which predicate returns true. Returns an unspecified value.



11.8.6 Amap contents

procedure: amap-size amap

Returns the number of associations in amap as an exact integer.

procedure: amap-keys amap

Returns a newly allocated list of all the keys in amap.

procedure: amap-values amap

Returns a newly allocated list of all the values in amap.

procedure: amap-entries amap

Returns two values, a newly allocated list of all the keys in amap and a newly allocated list of all the values in amap in the corresponding order.

procedure: amap-find procedure amap fail

For each association of amap, invokes procedure on its key and value. If procedure returns true, then amap-find returns what procedure returns. If all the calls to procedure return #f, returns the result of invoking the thunk fail.

procedure: amap-count predicate amap

For each association of amap, invokes predicate on its key and value. Returns the number of calls to predicate that returned true.
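
The two search procedures might be used as follows. The helper names are hypothetical, and amap is assumed to be an existing associative map whose values are numbers. Note that amap-find returns whatever procedure returns, so the procedure here returns the key rather than #t.

(define (count-positive amap)
  (amap-count (lambda (key value) (positive? value))
              amap))

(define (some-negative-key amap)
  ;; Returns a key whose value is negative, or #f if there is none.
  (amap-find (lambda (key value)
               (and (negative? value) key))
             amap
             (lambda () #f)))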



11.8.7 Amap copying and conversion

procedure: amap-copy amap [mutable?]

Returns a newly allocated associative map with the same properties and associations as amap. If mutable? is given and is true, the new associative map is mutable. Otherwise it is immutable provided that the implementation supports immutable maps.

procedure: amap-empty-copy amap

Returns a newly allocated mutable associative map with the same properties as amap, but with no associations.

procedure: amap->alist amap

Returns an alist with the same associations as amap in an unspecified order.



11.8.8 Amaps as sets

procedure: amap-union! amap1 amap2

Adds the associations of amap2 to amap1 and returns amap1. If a key appears in both maps, the value already in amap1 is kept.

procedure: amap-intersection! amap1 amap2

Deletes the associations from amap1 whose keys don’t also appear in amap2 and returns amap1.

procedure: amap-difference! amap1 amap2

Deletes the associations of amap1 whose keys are also present in amap2 and returns amap1.

procedure: amap-xor! amap1 amap2

Deletes the associations of amap1 whose keys are also present in amap2, and then adds the associations of amap2 whose keys are not present in amap1 to amap1. Returns amap1.



12 Procedures

Procedures are created by evaluating lambda expressions (see Lambda Expressions); the lambda may either be explicit or may be implicit as in a “procedure define” (see Definitions). Also there are special built-in procedures, called primitive procedures, such as car; these procedures are not written in Scheme but in the language used to implement the Scheme system. MIT/GNU Scheme also provides application hooks, which support the construction of data structures that act like procedures.

In MIT/GNU Scheme, the written representation of a procedure tells you the type of the procedure (compiled, interpreted, or primitive):

pp
     ⇒  #[compiled-procedure 56 ("pp" #x2) #x10 #x307578]
(lambda (x) x)
     ⇒  #[compound-procedure 57]
(define (foo x) x)
foo
     ⇒  #[compound-procedure 58 foo]
car
     ⇒  #[primitive-procedure car]
(call-with-current-continuation (lambda (x) x))
     ⇒  #[continuation 59]

Note that interpreted procedures are called “compound” procedures (strictly speaking, compiled procedures are also compound procedures). The written representation makes this distinction for historical reasons, and may eventually change.



12.1 Procedure Operations

procedure: apply procedure object object …

Calls procedure with the elements of the following list as arguments:

(cons* object object …)

The initial objects may be any objects, but the last object (there must be at least one object) must be a list.

(apply + (list 3 4 5 6))                ⇒  18
(apply + 3 4 '(5 6))                    ⇒  18

(define compose
  (lambda (f g)
    (lambda args
      (f (apply g args)))))
((compose sqrt *) 12 75)                ⇒  30

procedure: procedure? object

Returns #t if object is a procedure; otherwise returns #f. If #t is returned, exactly one of the following predicates is satisfied by object: compiled-procedure?, compound-procedure?, or primitive-procedure?.

procedure: compiled-procedure? object

Returns #t if object is a compiled procedure; otherwise returns #f.

procedure: compound-procedure? object

Returns #t if object is a compound (i.e. interpreted) procedure; otherwise returns #f.

procedure: primitive-procedure? object

Returns #t if object is a primitive procedure; otherwise returns #f.

procedure: procedure-environment procedure

Returns the closing environment of procedure. Signals an error if procedure is a primitive procedure, or if procedure is a compiled procedure for which the debugging information is unavailable.



12.2 Arity

Each procedure has an arity, which is the minimum and (optionally) maximum number of arguments that it will accept. MIT/GNU Scheme provides an abstraction that represents arity, and tests for the apparent arity of a procedure.

Arity objects come in two forms: the simple form, an exact non-negative integer, represents a fixed number of arguments. The general form is a pair whose car represents the minimum number of arguments and whose cdr is the maximum number of arguments, or #f if there is no maximum.

procedure: make-procedure-arity min [max [simple-ok?]]

Returns an arity object made from min and max. Min must be an exact non-negative integer. Max must be an exact non-negative integer at least as large as min. Alternatively, max may be omitted or given as ‘#f’, which represents an arity with no upper bound.

If simple-ok? is true, the returned arity is in the simple form (an exact non-negative integer) when possible, and otherwise is always in the general form. Simple-ok? defaults to ‘#f’.

procedure: procedure-arity? object

Returns ‘#t’ if object is an arity object, and ‘#f’ otherwise.

procedure: procedure-arity-min arity
procedure: procedure-arity-max arity

Return the lower and upper bounds of arity, respectively.
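
A short sketch of building and inspecting arity objects, based only on the descriptions above; the variable name a is arbitrary.

(define a (make-procedure-arity 1 3))
(procedure-arity? a)                    ; ⇒ #t
(procedure-arity-min a)                 ; ⇒ 1
(procedure-arity-max a)                 ; ⇒ 3

;; An arity with no upper bound has a maximum of #f.
(procedure-arity-max (make-procedure-arity 2 #f))
                                        ; ⇒ #f

;; With simple-ok? true, a fixed arity comes back in the simple form.
(make-procedure-arity 2 2 #t)           ; ⇒ 2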

The following procedures test for the apparent arity of a procedure. The results of the test may be less restrictive than the effect of calling the procedure. In other words, these procedures may indicate that the procedure will accept a given number of arguments, but if you call the procedure it may signal a condition-type:wrong-number-of-arguments error. For example, here is a procedure that appears to accept any number of arguments, but when called will signal an error if the number of arguments is not one:

(lambda arguments (apply car arguments))

procedure: procedure-arity procedure

Returns the arity that procedure accepts. The result may be in either simple or general form.

(procedure-arity (lambda () 3))         ⇒  (0 . 0)
(procedure-arity (lambda (x) x))        ⇒  (1 . 1)
(procedure-arity car)                   ⇒  (1 . 1)
(procedure-arity (lambda x x))          ⇒  (0 . #f)
(procedure-arity (lambda (x . y) x))    ⇒  (1 . #f)
(procedure-arity (lambda (x #!optional y) x))
                                        ⇒  (1 . 2)

procedure: procedure-arity-valid? procedure arity

Returns ‘#t’ if procedure accepts arity, and ‘#f’ otherwise.

procedure: procedure-of-arity? object arity

Returns ‘#t’ if object is a procedure that accepts arity, and ‘#f’ otherwise. Equivalent to:

(and (procedure? object)
     (procedure-arity-valid? object arity))

procedure: guarantee-procedure-of-arity object arity caller

Signals an error if object is not a procedure accepting arity. Caller is a symbol that is printed as part of the error message and is intended to be the name of the procedure where the error occurs.

procedure: thunk? object

Returns ‘#t’ if object is a procedure that accepts zero arguments, and ‘#f’ otherwise. Equivalent to:

(procedure-of-arity? object 0)
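
The arity tests in this section can be tried on a few simple procedures. This is a sketch; sloppy-car is a hypothetical name for the procedure discussed earlier, whose apparent arity is wider than what it really accepts.

(procedure-arity-valid? (lambda (x #!optional y) x) 2)
                                        ; ⇒ #t
(procedure-arity-valid? (lambda (x #!optional y) x) 3)
                                        ; ⇒ #f
(procedure-of-arity? 'not-a-procedure 1)
                                        ; ⇒ #f

(thunk? (lambda () 'ok))                ; ⇒ #t
(thunk? (lambda args args))             ; ⇒ #t
(thunk? (lambda (x) x))                 ; ⇒ #f

;; The test reports only the apparent arity: calling
;; (sloppy-car 1 2 3) would still signal an error.
(define sloppy-car
  (lambda arguments (apply car arguments)))
(procedure-arity-valid? sloppy-car 3)   ; ⇒ #t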


12.3 Primitive Procedures

procedure: make-primitive-procedure name [arity]

Name must be a symbol. Arity must be an exact non-negative integer, -1, #f, or #t; if not supplied it defaults to #f. Returns the primitive procedure called name. May perform further actions depending on arity:

#f

If the primitive procedure is not implemented, signals an error.

#t

If the primitive procedure is not implemented, returns #f.

integer

If the primitive procedure is implemented, signals an error if its arity is not equal to arity. If the primitive procedure is not implemented, returns an unimplemented primitive procedure object that accepts arity arguments. An arity of -1 means it accepts any number of arguments.

procedure: primitive-procedure-name primitive-procedure

Returns the name of primitive-procedure, a symbol.

(primitive-procedure-name car)          ⇒  car

procedure: implemented-primitive-procedure? primitive-procedure

Returns #t if primitive-procedure is implemented; otherwise returns #f. Useful because the code that implements a particular primitive procedure is not necessarily linked into the executable Scheme program.



12.4 Continuations

procedure: call-with-current-continuation procedure

Procedure must be a procedure of one argument. Packages up the current continuation (see below) as an escape procedure and passes it as an argument to procedure. The escape procedure is a Scheme procedure of one argument that, if it is later passed a value, will ignore whatever continuation is in effect at that later time and will give the value instead to the continuation that was in effect when the escape procedure was created. The escape procedure created by call-with-current-continuation has unlimited extent just like any other procedure in Scheme. It may be stored in variables or data structures and may be called as many times as desired.

The following examples show only the most common uses of this procedure. If all real programs were as simple as these examples, there would be no need for a procedure with the power of call-with-current-continuation.

(call-with-current-continuation
  (lambda (exit)
    (for-each (lambda (x)
                (if (negative? x)
                    (exit x)))
              '(54 0 37 -3 245 19))
    #t))                                ⇒  -3

(define list-length
  (lambda (obj)
    (call-with-current-continuation
      (lambda (return)
        (letrec ((r
                  (lambda (obj)
                    (cond ((null? obj) 0)
                          ((pair? obj) (+ (r (cdr obj)) 1))
                          (else (return #f))))))
          (r obj))))))
(list-length '(1 2 3 4))                ⇒  4
(list-length '(a b . c))                ⇒  #f

A common use of call-with-current-continuation is for structured, non-local exits from loops or procedure bodies, but in fact call-with-current-continuation is quite useful for implementing a wide variety of advanced control structures.

Whenever a Scheme expression is evaluated, a continuation exists that wants the result of the expression. The continuation represents an entire (default) future for the computation. If the expression is evaluated at top level, for example, the continuation will take the result, print it on the screen, prompt for the next input, evaluate it, and so on forever. Most of the time the continuation includes actions specified by user code, as in a continuation that will take the result, multiply it by the value stored in a local variable, add seven, and give the answer to the top-level continuation to be printed. Normally these ubiquitous continuations are hidden behind the scenes and programmers don’t think much about them. On the rare occasions that you may need to deal explicitly with continuations, call-with-current-continuation lets you do so by creating a procedure that acts just like the current continuation.

procedure: continuation? object

Returns #t if object is a continuation; otherwise returns #f.

procedure: within-continuation continuation thunk

Thunk must be a procedure of no arguments. Conceptually, within-continuation invokes continuation on the result of invoking thunk, but thunk is executed in the dynamic state of continuation. In other words, the “current” continuation is abandoned before thunk is invoked.
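
A small sketch of the behavior described above: the continuation k expects a value to add 1 to, and within-continuation delivers the result of the thunk to it.

(+ 1 (call-with-current-continuation
       (lambda (k)
         (within-continuation k
           (lambda () 10)))))           ; ⇒ 11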

procedure: dynamic-wind before thunk after

Calls thunk without arguments, returning the result(s) of this call. Before and after are called, also without arguments, as required by the following rules. Note that in the absence of calls to continuations captured using call-with-current-continuation the three arguments are called once each, in order. Before is called whenever execution enters the dynamic extent of the call to thunk and after is called whenever it exits that dynamic extent. The dynamic extent of a procedure call is the period between when the call is initiated and when it returns. In Scheme, because of call-with-current-continuation, the dynamic extent of a call may not be a single, connected time period. It is defined as follows:

If a second call to dynamic-wind occurs within the dynamic extent of the call to thunk and then a continuation is invoked in such a way that the afters from these two invocations of dynamic-wind are both to be called, then the after associated with the second (inner) call to dynamic-wind is called first.

If a second call to dynamic-wind occurs within the dynamic extent of the call to thunk and then a continuation is invoked in such a way that the befores from these two invocations of dynamic-wind are both to be called, then the before associated with the first (outer) call to dynamic-wind is called first.

If invoking a continuation requires calling the before from one call to dynamic-wind and the after from another, then the after is called first.

The effect of using a captured continuation to enter or exit the dynamic extent of a call to before or after is undefined.

(let ((path '())
      (c #f))
  (let ((add (lambda (s)
               (set! path (cons s path)))))
    (dynamic-wind
      (lambda () (add 'connect))
      (lambda ()
        (add (call-with-current-continuation
               (lambda (c0)
                 (set! c c0)
                 'talk1))))
      (lambda () (add 'disconnect)))
    (if (< (length path) 4)
        (c 'talk2)
        (reverse path))))

⇒ (connect talk1 disconnect connect talk2 disconnect)
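
The ordering rule for nested calls can be seen in a second, smaller sketch. An escape from the inner thunk leaves both dynamic extents, so the inner after runs before the outer after. The names trace and note are illustrative only.

(let ((trace '()))
  (let ((note (lambda (x)
                (set! trace (cons x trace)))))
    (call-with-current-continuation
      (lambda (k)
        (dynamic-wind
          (lambda () (note 'outer-in))
          (lambda ()
            (dynamic-wind
              (lambda () (note 'inner-in))
              (lambda () (k 'escaped))
              (lambda () (note 'inner-out))))
          (lambda () (note 'outer-out)))))
    (reverse trace)))

; ⇒ (outer-in inner-in inner-out outer-out)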

The following two procedures support multiple values.

procedure: call-with-values thunk procedure

Thunk must be a procedure of no arguments, and procedure must be a procedure. Thunk is invoked with a continuation that expects to receive multiple values; specifically, the continuation expects to receive the same number of values that procedure accepts as arguments. Thunk must return multiple values using the values procedure. Then procedure is called with the multiple values as its arguments. The result yielded by procedure is returned as the result of call-with-values.

procedure: values object …

Returns multiple values. The continuation in effect when this procedure is called must be a multiple-value continuation that was created by call-with-values. Furthermore it must accept as many values as there are objects.
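
For example, a thunk can return two values that the receiving procedure takes as two arguments:

(call-with-values
  (lambda () (values 4 5))
  (lambda (a b) (* a b)))               ; ⇒ 20

(call-with-values
  (lambda () (values (quotient 17 5) (remainder 17 5)))
  list)                                 ; ⇒ (3 2)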



12.5 Application Hooks

Application hooks are objects that can be applied like procedures. Each application hook has two parts: a procedure that specifies what to do when the application hook is applied, and an arbitrary object, called extra. Often the procedure uses the extra object to determine what to do.

There are two kinds of application hooks, which differ in what arguments are passed to the procedure. When an apply hook is applied, the procedure is passed exactly the same arguments that were passed to the apply hook. When an entity is applied, the entity itself is passed as the first argument, followed by the other arguments that were passed to the entity.

Both apply hooks and entities satisfy the predicate procedure?. Each satisfies either compiled-procedure?, compound-procedure?, or primitive-procedure?, depending on its procedure component. An apply hook is considered to accept the same number of arguments as its procedure, while an entity is considered to accept one less argument than its procedure.

procedure: make-apply-hook procedure object

Returns a newly allocated apply hook with a procedure component of procedure and an extra component of object.

procedure: apply-hook? object

Returns #t if object is an apply hook; otherwise returns #f.

procedure: apply-hook-procedure apply-hook

Returns the procedure component of apply-hook.

procedure: set-apply-hook-procedure! apply-hook procedure

Changes the procedure component of apply-hook to be procedure. Returns an unspecified value.

procedure: apply-hook-extra apply-hook

Returns the extra component of apply-hook.

procedure: set-apply-hook-extra! apply-hook object

Changes the extra component of apply-hook to be object. Returns an unspecified value.

procedure: make-entity procedure object

Returns a newly allocated entity with a procedure component of procedure and an extra component of object.

procedure: entity? object

Returns #t if object is an entity; otherwise returns #f.

procedure: entity-procedure entity

Returns the procedure component of entity.

procedure: set-entity-procedure! entity procedure

Changes the procedure component of entity to be procedure. Returns an unspecified value.

procedure: entity-extra entity

Returns the extra component of entity.

procedure: set-entity-extra! entity object

Changes the extra component of entity to be object. Returns an unspecified value.
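
The difference between the two kinds of application hooks can be sketched as follows. The names doubler and adder are hypothetical. The apply hook's procedure receives only the arguments of the call, while the entity's procedure also receives the entity itself and so can read its own extra component.

(define doubler
  (make-apply-hook (lambda (x) (* 2 x)) 'my-extra))
(doubler 21)                            ; ⇒ 42
(apply-hook-extra doubler)              ; ⇒ my-extra

(define adder
  (make-entity (lambda (self n)
                 (+ n (entity-extra self)))
               10))
(adder 5)                               ; ⇒ 15
(set-entity-extra! adder 100)
(adder 5)                               ; ⇒ 105
(procedure? adder)                      ; ⇒ #t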



13 Environments



13.1 Environment Operations

Environments are first-class objects in MIT/GNU Scheme. An environment consists of some bindings and possibly a parent environment, from which other bindings are inherited. The operations in this section reveal the frame-like structure of environments by permitting you to examine the bindings of a particular environment separately from those of its parent.

There are several types of bindings that can occur in an environment. The most common is the simple variable binding, which associates a value (any Scheme object) with an identifier (a symbol). A variable binding can also be unassigned, which means that it has no value. An unassigned variable is bound, in that it will shadow other bindings of the same name in ancestor environments, but a reference to that variable will signal an error of type condition-type:unassigned-variable. An unassigned variable can be assigned (using set! or environment-assign!) to give it a value.

In addition to variable bindings, an environment can also have keyword bindings. A keyword binding associates a syntactic keyword (usually a macro transformer) with an identifier. Keyword bindings are special in that they are considered “bound”, but ordinary variable references don’t work on them. So an attempt to reference or assign a keyword binding results in an error of type condition-type:macro-binding. However, keyword bindings can be redefined using define or environment-define.

procedure: environment? object

Returns #t if object is an environment; otherwise returns #f.

procedure: environment-has-parent? environment

Returns #t if environment has a parent environment; otherwise returns #f.

procedure: environment-parent environment

Returns the parent environment of environment. It is an error if environment has no parent.

procedure: environment-bound-names environment

Returns a newly allocated list of the names (symbols) that are bound by environment. This does not include the names that are bound by the parent environment of environment. It does include names that are unassigned or keywords in environment.

procedure: environment-macro-names environment

Returns a newly allocated list of the names (symbols) that are bound to syntactic keywords in environment.

procedure: environment-bindings environment

Returns a newly allocated list of the bindings of environment; does not include the bindings of the parent environment. Each element of this list takes one of two forms: (symbol) indicates that symbol is bound but unassigned, while (symbol object) indicates that symbol is bound, and its value is object.

procedure: environment-reference-type environment symbol

Returns a symbol describing the reference type of symbol in environment or one of its ancestor environments. The result is one of the following:

normal

means symbol is a variable binding with a normal value.

unassigned

means symbol is a variable binding with no value.

macro

means symbol is a keyword binding.

unbound

means symbol has no associated binding.

procedure: environment-bound? environment symbol

Returns #t if symbol is bound in environment or one of its ancestor environments; otherwise returns #f. This is equivalent to

(not (eq? 'unbound
          (environment-reference-type environment symbol)))

procedure: environment-assigned? environment symbol

Returns #t if symbol is bound in environment or one of its ancestor environments, and has a normal value. Returns #f if it is bound but unassigned. Signals an error if it is unbound or is bound to a keyword.

procedure: environment-lookup environment symbol

Symbol must be bound to a normal value in environment or one of its ancestor environments. Returns the value to which it is bound. Signals an error if unbound, unassigned, or a keyword.

procedure: environment-lookup-macro environment symbol

If symbol is a keyword binding in environment or one of its ancestor environments, returns the value of the binding. Otherwise, returns #f. Does not signal any errors other than argument-type errors.

procedure: environment-assignable? environment symbol

Symbol must be bound in environment or one of its ancestor environments. Returns #t if the binding may be modified by side effect.

procedure: environment-assign! environment symbol object

Symbol must be bound in environment or one of its ancestor environments, and must be assignable. Modifies the binding to have object as its value, and returns an unspecified result.

procedure: environment-definable? environment symbol

Returns #t if symbol is definable in environment, and #f otherwise. At present, this is false for environments generated by application of compiled procedures, and true for all other environments.

procedure: environment-define environment symbol object

Defines symbol to be bound to object in environment, and returns an unspecified value. Signals an error if symbol isn’t definable in environment.

procedure: environment-define-macro environment symbol transformer

Defines symbol to be a keyword bound to transformer in environment, and returns an unspecified value. Signals an error if symbol isn’t definable in environment. The type of transformer is defined by the syntax engine and is not checked by this procedure. If the type is incorrect this will subsequently signal an error during syntax expansion.

procedure: eval expression environment

Evaluates expression, a list-structure representation (sometimes called s-expression representation) of a Scheme expression, in environment. You rarely need eval in ordinary programs; it is useful mostly for evaluating expressions that have been created “on the fly” by a program. eval is relatively expensive because it must convert expression to an internal form before it is executed.

(define foo (list '+ 1 2))
(eval foo (the-environment))            ⇒  3


13.2 Environment Variables

The user-initial-environment is where the top-level read-eval-print (REP) loop evaluates expressions and binds definitions. It is a child of system-global-environment, which is where all of the Scheme system definitions are bound. All of the bindings in system-global-environment are available when the current environment is user-initial-environment. However, any new bindings that you create in the REP loop (with define forms or by loading files containing define forms) occur in user-initial-environment.

variable: system-global-environment

The variable system-global-environment is bound to the distinguished environment that’s the ancestor of most other environments (except for those created by make-root-top-level-environment). It is the parent environment of user-initial-environment. Primitives, system procedures, and most syntactic keywords are bound (and sometimes closed) in this environment.

variable: user-initial-environment

The variable user-initial-environment is bound to the default environment in which typed expressions are evaluated by the top-level REP loop.

Although all bindings in system-global-environment are visible to the REP loop, definitions that are typed at, or loaded by, the REP loop occur in the user-initial-environment. This is partly a safety measure: if you enter a definition that happens to have the same name as a critical system procedure, your definition will be visible only to the procedures you define in the user-initial-environment; the MIT/GNU Scheme system procedures, which are defined in system-global-environment, will continue to see the original definition.
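
The relationship between the two environments can be checked with the operations from the previous section. This is a sketch; my-helper is a hypothetical name assumed not to be defined anywhere else.

(environment-has-parent? user-initial-environment)
                                        ; ⇒ #t
(environment-has-parent? system-global-environment)
                                        ; ⇒ #f

;; System bindings are visible from user-initial-environment.
(environment-bound? user-initial-environment 'car)
                                        ; ⇒ #t

;; A new definition lands in user-initial-environment only.
(environment-define user-initial-environment 'my-helper 42)
(environment-bound? user-initial-environment 'my-helper)
                                        ; ⇒ #t
(environment-bound? system-global-environment 'my-helper)
                                        ; ⇒ #f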



13.3 REPL Environment

procedure: nearest-repl/environment

Returns the current REP loop environment (i.e. the current environment of the closest enclosing REP loop). When Scheme first starts up, this is the same as user-initial-environment.

procedure: ge environment

Changes the current REP loop environment to environment. Environment can be either an environment or a procedure object. If it’s a procedure, the environment in which that procedure was closed is the new environment.



13.4 Top-level Environments

The operations in this section manipulate top-level environments, as opposed to environments created by the application of procedures. For historical reasons, top-level environments are referred to as interpreter environments.

special form: the-environment

Returns the current environment. This form may only be evaluated in a top-level environment. An error is signaled if it appears elsewhere.

procedure: top-level-environment? object
procedure: interpreter-environment? object

Returns #t if object is a top-level environment; otherwise returns #f.

interpreter-environment? is an alias for top-level-environment?.

procedure: extend-top-level-environment environment [names [values]]
procedure: make-top-level-environment [names [values]]
procedure: make-root-top-level-environment [names [values]]

Returns a newly allocated top-level environment. extend-top-level-environment creates an environment that has parent environment, make-top-level-environment creates an environment that has parent system-global-environment, and make-root-top-level-environment creates an environment that has no parent.

The optional arguments names and values are used to specify initial bindings in the new environment. If specified, names must be a list of symbols, and values must be a list of objects. If only names is specified, each name in names will be bound in the environment, but unassigned. If names and values are both specified, they must be the same length, and each name in names will be bound to the corresponding value in values. If neither names nor values is specified, the environment will have no initial bindings.
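
A brief sketch of the three ways of supplying initial bindings, using the environment operations described earlier in this chapter. The names env, env2, and child are arbitrary.

;; Names and values: each name is bound to the corresponding value.
(define env (make-top-level-environment '(x y) '(1 2)))
(environment-lookup env 'x)             ; ⇒ 1
(eval '(+ x y) env)                     ; ⇒ 3

;; Names only: the names are bound but unassigned.
(define env2 (make-top-level-environment '(a)))
(environment-reference-type env2 'a)    ; ⇒ unassigned
(environment-assign! env2 'a 5)
(environment-lookup env2 'a)            ; ⇒ 5

;; A child environment inherits the bindings of its parent.
(define child (extend-top-level-environment env '(z) '(10)))
(eval '(* y z) child)                   ; ⇒ 20
(environment-bound? child 'x)           ; ⇒ #t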

procedure: link-variables environment1 symbol1 environment2 symbol2

Defines symbol1 in environment1 to have the same binding as symbol2 in environment2, and returns an unspecified value. Prior to the call, symbol2 must be bound in environment2, but the type of binding is irrelevant; it may be a normal binding, an unassigned binding, or a keyword binding. Signals an error if symbol1 isn’t definable in environment1, or if symbol2 is unbound in environment2.

By “the same binding”, we mean that the value cell is shared between the two environments. If a value is assigned to symbol1 in environment1, a subsequent reference to symbol2 in environment2 will see that value, and vice versa.

procedure: unbind-variable environment symbol

If symbol is bound in environment or one of its ancestor environments, removes the binding, so that subsequent accesses to that symbol behave as if the binding never existed. Returns #t if there was a binding prior to the call, and #f if there wasn’t.



14 Input/Output

This chapter describes the procedures that are used for input and output (I/O). The chapter first describes ports and how they are manipulated, then describes the I/O operations. Finally, some low-level procedures are described that permit the implementation of custom ports and high-performance I/O.



14.1 Ports

Ports represent input and output devices. To Scheme, an input port is a Scheme object that can deliver data upon command, while an output port is a Scheme object that can accept data. Whether the input and output port types are disjoint is implementation-dependent. (In MIT/GNU Scheme, there are input ports, output ports, and input/output ports.)

Different port types operate on different data. Scheme implementations are required to support textual ports and binary ports, but may also provide other port types.

A textual port supports reading or writing of individual characters from or to a backing store containing characters using read-char and write-char below, and it supports operations defined in terms of characters, such as read and write.

A binary port supports reading or writing of individual bytes from or to a backing store containing bytes using read-u8 and write-u8 below, as well as operations defined in terms of bytes. Whether the textual and binary port types are disjoint is implementation-dependent. (In MIT/GNU Scheme, textual ports and binary ports are distinct.)

Ports can be used to access files, devices, and similar things on the host system on which the Scheme program is running.

standard procedure: call-with-port port procedure

It is an error if procedure does not accept one argument.

The call-with-port procedure calls procedure with port as an argument. If procedure returns, then the port is closed automatically and the values yielded by procedure are returned. If procedure does not return, then the port must not be closed automatically unless it is possible to prove that the port will never again be used for a read or write operation.

Rationale: Because Scheme’s escape procedures have unlimited extent, it is possible to escape from the current continuation but later to resume it. If implementations were permitted to close the port on any escape from the current continuation, then it would be impossible to write portable code using both call-with-current-continuation and call-with-port.
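A small illustration of call-with-port, using a string port so that no file is needed. The variable name first-line is illustrative.

```scheme
;; The port is passed to the procedure; the value the procedure
;; returns becomes the value of the call-with-port expression.
(define first-line
  (call-with-port (open-input-string "alpha\nbeta\n")
    (lambda (port)
      (read-line port))))

first-line   ; ⇒ "alpha"
```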

procedure: call-with-truncated-output-port limit output-port procedure

The limit argument must be a nonnegative integer. It is an error if procedure does not accept one argument.

This procedure uses a continuation to escape from procedure if it tries to write more than limit characters.

It calls procedure with a special output port as an argument. Up to limit characters may be written to that output port, and those characters are transparently written through to output-port.

If the number of characters written to that port exceeds limit, then the escape continuation is invoked and #t is returned. Otherwise, procedure returns normally and #f is returned.

Note that if procedure writes exactly limit characters, then the escape continuation is not invoked, and #f is returned.

In no case does call-with-truncated-output-port close output-port.
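The following sketch shows the truncation flag with a string port as the destination. The names out and truncated? are illustrative. The exact characters that reach out when the limit is exceeded are not asserted here, only that there are no more than limit of them.

```scheme
(define out (open-output-string))

;; The procedure tries to write 12 characters through a port
;; that accepts at most 5.
(define truncated?
  (call-with-truncated-output-port 5 out
    (lambda (port)
      (write-string "hello, world" port))))

truncated?                                ; ⇒ #t
(<= (string-length (get-output-string out)) 5)   ; ⇒ #t
```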

standard procedure: input-port? object
standard procedure: output-port? object
procedure: i/o-port? object
standard procedure: textual-port? object
standard procedure: binary-port? object
standard procedure: port? object

These procedures return #t if object is an input port, output port, input/output port, textual port, binary port, or any kind of port, respectively. Otherwise they return #f.
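For instance, applying the predicates to a string port and a bytevector port (both described later in this chapter) gives the results below, assuming, as stated above, that textual and binary ports are distinct in MIT/GNU Scheme. The names text-port and byte-port are illustrative.

```scheme
(define text-port (open-input-string "abc"))
(define byte-port (open-input-bytevector (bytevector 1 2 3)))

(list (port? text-port)
      (input-port? text-port)
      (textual-port? text-port)
      (binary-port? text-port))
;; ⇒ (#t #t #t #f)

(list (port? byte-port)
      (input-port? byte-port)
      (textual-port? byte-port)
      (binary-port? byte-port))
;; ⇒ (#t #t #f #t)

(port? "not a port")   ; ⇒ #f
```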

standard procedure: input-port-open? port
standard procedure: output-port-open? port

Returns #t if port is still open and capable of performing input or output, respectively, and #f otherwise.

standard parameter: current-input-port [input-port]
standard parameter: current-output-port [output-port]
standard parameter: current-error-port [output-port]

Returns the current default input port, output port, or error port (an output port), respectively. These procedures are parameter objects, which can be overridden with parameterize. The initial bindings for these are implementation-defined textual ports.
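As a sketch of overriding one of these parameters, the helper below redirects the current output port to a string port for the duration of a thunk. The name capture-output is hypothetical and is not part of MIT/GNU Scheme.

```scheme
;; Run THUNK with current-output-port rebound to a string port,
;; then return everything that was written.
(define (capture-output thunk)
  (let ((port (open-output-string)))
    (parameterize ((current-output-port port))
      (thunk))
    (get-output-string port)))

(capture-output
  (lambda ()
    (display "hello")
    (write-char #\!)))
;; ⇒ "hello!"
```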

parameter: notification-output-port [output-port]

Returns an output port suitable for generating “notifications”, that is, messages to the user that supply interesting information about the execution of a program. For example, the load procedure writes messages to this port informing the user that a file is being loaded.

This procedure is a parameter object, which can be overridden with parameterize.

parameter: trace-output-port [output-port]

Returns an output port suitable for generating “tracing” information about a program’s execution. The output generated by the trace procedure is sent to this port.

This procedure is a parameter object, which can be overridden with parameterize.

parameter: interaction-i/o-port [i/o-port]

Returns an I/O port suitable for querying or prompting the user. The standard prompting procedures use this port by default (see Prompting).

This procedure is a parameter object, which can be overridden with parameterize.

standard procedure: close-port port
standard procedure: close-input-port port
standard procedure: close-output-port port

Closes the resource associated with port, rendering the port incapable of delivering or accepting data. It is an error to apply the last two procedures to a port which is not an input or output port, respectively. Scheme implementations may provide ports which are simultaneously input and output ports, such as sockets; the close-input-port and close-output-port procedures can then be used to close the input and output sides of the port independently.

These routines have no effect if the port has already been closed.

obsolete procedure: set-current-input-port! input-port
obsolete procedure: set-current-output-port! output-port
obsolete procedure: set-notification-output-port! output-port
obsolete procedure: set-trace-output-port! output-port
obsolete procedure: set-interaction-i/o-port! i/o-port

These procedures are deprecated; instead call the corresponding parameters with an argument.

obsolete procedure: with-input-from-port input-port thunk
obsolete procedure: with-output-to-port output-port thunk
obsolete procedure: with-notification-output-port output-port thunk
obsolete procedure: with-trace-output-port output-port thunk
obsolete procedure: with-interaction-i/o-port i/o-port thunk

These procedures are deprecated; instead use parameterize on the corresponding parameters.

variable: console-i/o-port

console-i/o-port is an I/O port that communicates with the “console”. Under unix, the console is the controlling terminal of the Scheme process. Under Windows, the console is the window that is created when Scheme starts up.

This variable is rarely used; instead programs should use one of the standard ports defined above. This variable should not be modified.



14.2 File Ports

Before Scheme can access a file for reading or writing, it is necessary to open a port to the file. This section describes procedures used to open ports to files. Such ports are closed (like any other port) by close-port. File ports are automatically closed if and when they are reclaimed by the garbage collector.

Before opening a file for input or output, by whatever method, the filename argument is converted to canonical form by calling the procedure merge-pathnames with filename as its sole argument. Thus, filename can be either a string or a pathname, and it is merged with the current pathname defaults to produce the pathname that is then opened.

standard procedure: call-with-input-file filename procedure
standard procedure: call-with-output-file filename procedure

It is an error if procedure does not accept one argument.

These procedures open the named file for input or output, as if by open-input-file or open-output-file, to obtain a textual port. The port and procedure are then passed to a procedure equivalent to call-with-port.
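A minimal round trip through a file might look like the following. The file name port-demo.txt is a placeholder; it is merged with the current pathname defaults as described above, so the file is created relative to the working directory.

```scheme
(define demo-file "port-demo.txt")   ; hypothetical file name

;; Write one line; the port is closed when the procedure returns.
(call-with-output-file demo-file
  (lambda (port)
    (write-string "first line" port)
    (newline port)))

;; Read the line back.
(call-with-input-file demo-file
  (lambda (port)
    (read-line port)))
;; ⇒ "first line"
```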

procedure: call-with-binary-input-file filename procedure
procedure: call-with-binary-output-file filename procedure

It is an error if procedure does not accept one argument.

These procedures open the named file for input or output, as if by open-binary-input-file or open-binary-output-file, to obtain a binary port. The port and procedure are then passed to a procedure equivalent to call-with-port.

standard procedure: with-input-from-file filename thunk
standard procedure: with-output-to-file filename thunk

The file named by filename is opened for input or output as if by open-input-file or open-output-file, and the new port is made to be the value returned by current-input-port or current-output-port (as used by (read), (write obj), and so forth). The thunk is then called with no arguments. When the thunk returns, the port is closed and the previous default is restored. It is an error if thunk does not accept zero arguments. Both procedures return the values yielded by thunk. If an escape procedure is used to escape from the continuation of these procedures, they behave exactly as if the current input or output port had been bound dynamically with parameterize.

obsolete procedure: with-input-from-binary-file filename thunk
obsolete procedure: with-output-to-binary-file filename thunk

These procedures are deprecated; instead use parameterize along with call-with-binary-input-file or call-with-binary-output-file.

procedure: open-input-file filename
procedure: open-binary-input-file filename

Takes a filename for an existing file and returns a textual input port or binary input port that is capable of delivering data from the file. If the file does not exist or cannot be opened, an error that satisfies file-error? is signaled.
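One way to handle the error is with the standard guard form, as in this sketch. The helper name open-input-file-or-false and the file name are hypothetical.

```scheme
;; Return an input port for FILENAME, or #f if opening it
;; signals an error that satisfies file-error?.
(define (open-input-file-or-false filename)
  (guard (condition
          ((file-error? condition) #f))
    (open-input-file filename)))

(open-input-file-or-false "this-file-should-not-exist.tmp")
;; ⇒ #f, provided no such file exists
```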

standard procedure: open-output-file filename [append?]
standard procedure: open-binary-output-file filename [append?]

Takes a filename naming an output file to be created and returns a textual output port or binary output port that is capable of writing data to a new file by that name. If a file with the given name already exists, the effect is unspecified. (In that case, MIT/GNU Scheme overwrites an existing file.) If the file cannot be opened, an error that satisfies file-error? is signalled.

The optional argument append? is an MIT/GNU Scheme extension. If append? is given and not #f, the file is opened in append mode. In this mode, the contents of the file are not overwritten; instead any characters written to the file are appended to the end of the existing contents. If the file does not exist, append mode creates the file and writes to it in the normal way.
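The effect of append? can be sketched as follows. The file name append-demo.log is a placeholder.

```scheme
(define log-file "append-demo.log")   ; hypothetical file name

;; Create (or overwrite) the file with one line.
(call-with-output-file log-file
  (lambda (port)
    (write-string "one" port)
    (newline port)))

;; Reopen in append mode and add a second line.
(let ((port (open-output-file log-file #t)))
  (write-string "two" port)
  (newline port)
  (close-port port))

;; Both lines are present.
(call-with-input-file log-file
  (lambda (port)
    (let* ((first (read-line port))
           (second (read-line port)))
      (list first second))))
;; ⇒ ("one" "two")
```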

procedure: open-i/o-file filename
procedure: open-binary-i/o-file filename

Takes a filename referring to an existing file and returns an I/O port that is capable of both reading from and writing to the file. If the file cannot be opened, an error that satisfies file-error? is signalled.

These procedures are often used to open special files. For example, under unix they can be used to open terminal device files, PTY device files, and named pipes.

procedure: close-all-open-files

This procedure closes all file ports that are open at the time that it is called, and returns an unspecified value.



14.3 String Ports

This section describes textual input ports that read their input from given strings, and textual output ports that accumulate their output and return it as a string.

standard procedure: open-input-string string [start [end]]

Takes a string and returns a textual input port that delivers characters from the string. If the string is modified, the effect is unspecified.

The optional arguments start and end may be used to specify that the string port delivers characters from a substring of string; if not given, start defaults to 0 and end defaults to (string-length string).
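For example, restricting the port to a substring (the variable name word-port is illustrative):

```scheme
;; Deliver only the characters at indices 6 through 10.
(define word-port (open-input-string "hello world" 6 11))

(read-line word-port)                 ; ⇒ "world"
(eof-object? (read-char word-port))   ; ⇒ #t
```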

standard procedure: open-output-string

Returns a textual output port that will accumulate characters for retrieval by get-output-string.

standard procedure: get-output-string port

It is an error if port was not created with open-output-string.

Returns a string consisting of the characters that have been output to the port so far in the order they were output. If the result string is modified, the effect is unspecified.

(parameterize ((current-output-port (open-output-string)))
  (display "piece")
  (display " by piece ")
  (display "by piece.")
  (newline)
  (get-output-string (current-output-port)))

    ⇒ "piece by piece by piece.\n"

procedure: call-with-output-string procedure

The procedure is called with one argument, a textual output port. The values yielded by procedure are ignored. When procedure returns, call-with-output-string returns the port’s accumulated output as a string. If the result string is modified, the effect is unspecified.

This procedure could have been defined as follows:

(define (call-with-output-string procedure)
  (let ((port (open-output-string)))
    (procedure port)
    (get-output-string port)))

procedure: call-with-truncated-output-string limit procedure

Similar to call-with-output-string, except that the output is limited to at most limit characters. The returned value is a pair; the car of the pair is #t if procedure attempted to write more than limit characters, and #f otherwise. The cdr of the pair is a newly allocated string containing the accumulated output.

This procedure could have been defined as follows:

(define (call-with-truncated-output-string limit procedure)
  (let ((port (open-output-string)))
    (let ((truncated?
           (call-with-truncated-output-port limit port
                                            procedure)))
      (cons truncated? (get-output-string port)))))

This procedure is helpful for displaying circular lists, as shown in this example:

(define inf (list 'inf))
(call-with-truncated-output-string 40
  (lambda (port)
    (write inf port)))                  ⇒  (#f . "(inf)")
(set-cdr! inf inf)
(call-with-truncated-output-string 40
  (lambda (port)
    (write inf port)))
        ⇒  (#t . "(inf inf inf inf inf inf inf inf inf inf")

procedure: write-to-string object [limit]

Writes object to a string output port, and returns the resulting string.

If limit is supplied and not #f, then this procedure is equivalent to the following and returns a pair instead of just a string:

(call-with-truncated-output-string limit
  (lambda (port)
    (write object port)))

obsolete procedure: with-input-from-string string thunk
obsolete procedure: with-output-to-string thunk
obsolete procedure: with-output-to-truncated-string limit thunk

These procedures are deprecated; instead use open-input-string, call-with-output-string, or call-with-truncated-output-string along with parameterize.
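The suggested replacements might be combined as in the following sketch, which uses only procedures documented in this chapter.

```scheme
;; In place of (with-output-to-string thunk):
(call-with-output-string
  (lambda (port)
    (parameterize ((current-output-port port))
      (display "captured"))))
;; ⇒ "captured"

;; In place of (with-input-from-string string thunk):
(parameterize ((current-input-port (open-input-string "42 rest")))
  (read))
;; ⇒ 42
```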



14.4 Bytevector Ports

This section describes binary input ports that read their input from given bytevectors, and binary output ports that accumulate their output and return it as a bytevector.

standard procedure: open-input-bytevector bytevector [start [end]]

Takes a bytevector and returns a binary input port that delivers bytes from the bytevector. If the bytevector is modified, the effect is unspecified.

The optional arguments start and end may be used to specify that the bytevector port delivers bytes from a portion of bytevector; if not given, start defaults to 0 and end defaults to (bytevector-length bytevector).

standard procedure: open-output-bytevector

Returns a binary output port that will accumulate bytes for retrieval by get-output-bytevector.

standard procedure: get-output-bytevector port

It is an error if port was not created with open-output-bytevector.

Returns a bytevector consisting of the bytes that have been output to the port so far in the order they were output. If the result bytevector is modified, the effect is unspecified.

procedure: call-with-output-bytevector procedure

The procedure is called with one argument, a binary output port. The values yielded by procedure are ignored. When procedure returns, call-with-output-bytevector returns the port’s accumulated output as a newly allocated bytevector.

This procedure could have been defined as follows:

(define (call-with-output-bytevector procedure)
  (let ((port (open-output-bytevector)))
    (procedure port)
    (get-output-bytevector port)))
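
A short sketch of bytevector ports in both directions follows. The names bytes and in are illustrative.

```scheme
;; Accumulate three bytes into a bytevector.
(define bytes
  (call-with-output-bytevector
    (lambda (port)
      (write-u8 1 port)
      (write-u8 2 port)
      (write-u8 3 port))))

(equal? bytes (bytevector 1 2 3))   ; ⇒ #t

;; Read the bytes back, starting at index 1.
(define in (open-input-bytevector bytes 1))
(read-u8 in)                  ; ⇒ 2
(read-u8 in)                  ; ⇒ 3
(eof-object? (read-u8 in))    ; ⇒ #t
```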


14.5 Input Procedures

This section describes the procedures that read input. Input procedures can read either from the current input port or from a given port. Remember that to read from a file, you must first open a port to the file.

Input ports can be divided into two types, called interactive and non-interactive. Interactive input ports are ports that read input from a source that is time-dependent; for example, a port that reads input from a terminal or from another program. Non-interactive input ports read input from a time-independent source, such as an ordinary file or a character string.

In this section, all optional arguments called port default to the current input port.

standard procedure: read [port]

The read procedure converts external representations of Scheme objects into the objects themselves. It returns the next object parsable from the given textual input port, updating port to point to the first character past the end of the external representation of the object.

Implementations may support extended syntax to represent record types or other types that do not have datum representations.

If an end of file is encountered in the input before any characters are found that can begin an object, then an end-of-file object is returned. The port remains open, and further attempts to read will also return an end-of-file object. If an end of file is encountered after the beginning of an object’s external representation, but the external representation is incomplete and therefore not parsable, an error that satisfies read-error? is signaled.
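For example, successive calls to read on a string port return one datum at a time and then an end-of-file object. The variable name datum-port is illustrative.

```scheme
(define datum-port (open-input-string "(1 2 3) foo \"bar\""))

(read datum-port)                 ; ⇒ (1 2 3)
(read datum-port)                 ; ⇒ foo
(read datum-port)                 ; ⇒ "bar"
(eof-object? (read datum-port))   ; ⇒ #t
(eof-object? (read datum-port))   ; ⇒ #t, the port stays at end of file
```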

standard procedure: read-char [port]

Returns the next character available from the textual input port, updating port to point to the following character. If no more characters are available, an end-of-file object is returned.

In MIT/GNU Scheme, if port is an interactive input port and no characters are immediately available, read-char will hang waiting for input, even if the port is in non-blocking mode.

procedure: read-char-no-hang [port]

This procedure behaves exactly like read-char except when port is an interactive port in non-blocking mode, and there are no characters immediately available. In that case this procedure returns #f without blocking.

procedure: unread-char char [port]

The given char must be the most-recently read character from the textual input port. This procedure “unreads” the character, updating port as if the character had never been read.

Note that this only works with characters returned by read-char or read-char-no-hang.

standard procedure: peek-char [port]

Returns the next character available from the textual input port, without updating port to point to the following character. If no more characters are available, an end-of-file object is returned.

Note: The value returned by a call to peek-char is the same as the value that would have been returned by a call to read-char on the same port. The only difference is that the very next call to read-char or peek-char on that port will return the value returned by the preceding call to peek-char. In particular, a call to peek-char on an interactive port will hang waiting for input whenever a call to read-char would have hung.
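The interplay of peek-char, read-char, and unread-char can be sketched with a two-character string port. The variable name char-port is illustrative.

```scheme
(define char-port (open-input-string "ab"))

(peek-char char-port)                 ; ⇒ #\a, port not advanced
(read-char char-port)                 ; ⇒ #\a
(unread-char #\a char-port)           ; push back the character just read
(read-char char-port)                 ; ⇒ #\a again
(read-char char-port)                 ; ⇒ #\b
(eof-object? (peek-char char-port))   ; ⇒ #t
```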

standard procedure: read-line [port]

Returns the next line of text available from the textual input port, updating the port to point to the following character. If an end of line is read, a string containing all of the text up to (but not including) the end of line is returned, and the port is updated to point just past the end of line. If an end of file is encountered before any end of line is read, but some characters have been read, a string containing those characters is returned. If an end of file is encountered before any characters are read, an end-of-file object is returned. For the purpose of this procedure, an end of line consists of either a linefeed character, a carriage return character, or a sequence of a carriage return character followed by a linefeed character. Implementations may also recognize other end of line characters or sequences.

In MIT/GNU Scheme, if port is an interactive input port and no characters are immediately available, read-line will hang waiting for input, even if the port is in non-blocking mode.
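A common pattern is to call read-line until it returns an end-of-file object. The helper name read-all-lines is hypothetical.

```scheme
;; Collect every line from PORT into a list of strings.
(define (read-all-lines port)
  (let loop ((lines '()))
    (let ((line (read-line port)))
      (if (eof-object? line)
          (reverse lines)
          (loop (cons line lines))))))

;; The final line has no terminator but is still returned.
(read-all-lines (open-input-string "one\ntwo\nthree"))
;; ⇒ ("one" "two" "three")
```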

standard procedure: eof-object? object

Returns #t if object is an end-of-file object, otherwise returns #f. The precise set of end-of-file objects will vary among implementations, but in any case no end-of-file object will ever be an object that can be read in using read.

standard procedure: eof-object

Returns an end-of-file object, not necessarily unique.

standard procedure: char-ready? [port]

Returns #t if a character is ready on the textual input port and returns #f otherwise. If char-ready? returns #t then the next read-char operation on the given port is guaranteed not to hang. If the port is at end of file then char-ready? returns #t.

Rationale: The char-ready? procedure exists to make it possible for a program to accept characters from interactive ports without getting stuck waiting for input. Any input editors associated with such ports must ensure that characters whose existence has been asserted by char-ready? cannot be removed from the input. If char-ready? were to return #f at end of file, a port at end of file would be indistinguishable from an interactive port that has no ready characters.

standard procedure: read-string k [port]

Reads the next k characters, or as many as are available before the end of file, from the textual input port into a newly allocated string in left-to-right order and returns the string. If no characters are available before the end of file, an end-of-file object is returned.

Note: MIT/GNU Scheme previously defined this procedure differently, and this alternate usage is deprecated; please use read-delimited-string instead. For now, read-string will redirect to read-delimited-string as needed, but this redirection will be eliminated in a future release.
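With an integer first argument, read-string behaves as in this sketch. The variable name chunk-port is illustrative.

```scheme
(define chunk-port (open-input-string "abcdef"))

(read-string 4 chunk-port)                 ; ⇒ "abcd"
(read-string 4 chunk-port)                 ; ⇒ "ef", fewer than k remain
(eof-object? (read-string 4 chunk-port))   ; ⇒ #t
```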

procedure: read-string! string [port [start [end]]]

Reads the next end - start characters, or as many as are available before the end of file, from the textual input port into string in left-to-right order beginning at the start position. If end is not supplied, reads until the end of string has been reached. If start is not supplied, reads beginning at position 0. Returns the number of characters read. If no characters are available, an end-of-file object is returned.

In MIT/GNU Scheme, if port is an interactive port in non-blocking mode and no characters are immediately available, #f is returned without any modification of string.

However, if one or more characters are immediately available, the region is filled using the available characters. The procedure then returns the number of characters filled in, without waiting for further characters, even if the number of filled characters is less than the size of the region.
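A sketch of filling part of an existing string follows. The names buffer and count are illustrative; a non-interactive string port is used, so the non-blocking cases do not arise.

```scheme
(define buffer (make-string 5 #\-))

;; Fill positions 1 through 3 of BUFFER from the port.
(define count
  (read-string! buffer (open-input-string "xyz") 1 4))

count    ; ⇒ 3
buffer   ; ⇒ "-xyz-"
```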

obsolete procedure: read-substring! string start end [port]

This procedure is deprecated; use read-string! instead.

standard procedure: read-u8 [port]

Returns the next byte available from the binary input port, updating the port to point to the following byte. If no more bytes are available, an end-of-file object is returned.

In MIT/GNU Scheme, if port is an interactive input port in non-blocking mode and no bytes are immediately available, read-u8 will return #f.

standard procedure: peek-u8 [port]

Returns the next byte available from the binary input port, but without updating the port to point to the following byte. If no more bytes are available, an end-of-file object is returned.

In MIT/GNU Scheme, if port is an interactive input port in non-blocking mode and no bytes are immediately available, peek-u8 will return #f.

standard procedure: u8-ready? [port]

Returns #t if a byte is ready on the binary input port and returns #f otherwise. If u8-ready? returns #t then the next read-u8 operation on the given port is guaranteed not to hang. If the port is at end of file then u8-ready? returns #t.

standard procedure: read-bytevector k [port]

Reads the next k bytes, or as many as are available before the end of file, from the binary input port into a newly allocated bytevector in left-to-right order and returns the bytevector. If no bytes are available before the end of file, an end-of-file object is returned.

In MIT/GNU Scheme, if port is an interactive input port in non-blocking mode and no bytes are immediately available, read-bytevector will return #f.

However, if one or more bytes are immediately available, they are read and returned as a bytevector, without waiting for further bytes, even if the number of bytes is less than k.

standard procedure: read-bytevector! bytevector [port [start [end]]]

Reads the next end - start bytes, or as many as are available before the end of file, from the binary input port into bytevector in left-to-right order beginning at the start position. If end is not supplied, reads until the end of bytevector has been reached. If start is not supplied, reads beginning at position 0. Returns the number of bytes read. If no bytes are available, an end-of-file object is returned.

In MIT/GNU Scheme, if port is an interactive input port in non-blocking mode and no bytes are immediately available, read-bytevector! will return #f.

However, if one or more bytes are immediately available, the region is filled using the available bytes. The procedure then returns the number of bytes filled in, without waiting for further bytes, even if the number of filled bytes is less than the size of the region.

procedure: read-delimited-string char-set [port]

Reads characters from port until it finds a terminating character that is a member of char-set (see Character Sets) or encounters end of file. The port is updated to point to the terminating character, or to end of file if no terminating character was found. read-delimited-string returns the characters, up to but excluding the terminating character, as a newly allocated string.

This procedure ignores the blocking mode of the port, blocking unconditionally until it sees either a delimiter or end of file. If end of file is encountered before any characters are read, an end-of-file object is returned.

On many input ports, this operation is significantly faster than the following equivalent code using peek-char and read-char:

(define (read-delimited-string char-set port)
  (let ((char (peek-char port)))
    (if (eof-object? char)
        char
        (list->string
         (let loop ((char char))
           (if (or (eof-object? char)
                   (char-in-set? char char-set))
               '()
               (begin
                 (read-char port)
                 (cons char
                       (loop (peek-char port))))))))))
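
As a usage sketch, the following splits a key from a value at an equals sign. Note that the delimiter is left in the port and must be consumed separately. The variable name pair-port is illustrative.

```scheme
(define pair-port (open-input-string "key=value"))
(define delimiters (char-set #\=))

(read-delimited-string delimiters pair-port)   ; ⇒ "key"
(read-char pair-port)                          ; ⇒ #\=, the delimiter itself
(read-delimited-string delimiters pair-port)   ; ⇒ "value", ended by end of file
(eof-object?
 (read-delimited-string delimiters pair-port)) ; ⇒ #t
```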

14.5.1 Reader Controls

The following parameters control the behavior of the read procedure.

parameter: param:reader-radix

This parameter defines the radix used by the reader when it parses numbers. This is similar to passing a radix argument to string->number. The value of the parameter must be one of 2, 8, 10, or 16; an error is signaled if the parameter is bound to any other value.

Note that much of the number syntax is invalid for radixes other than 10. The reader detects cases where such invalid syntax is used and signals an error. However, problems can still occur when param:reader-radix is bound to 16, because syntax that normally denotes symbols can now denote numbers (e.g. abc). Because of this, it is usually undesirable to bind this parameter to anything other than the default.

The default value of this parameter is 10.
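The following sketch illustrates the ambiguity mentioned above: the same text reads as a symbol under the default radix and, per the description, as a number when the radix is 16.

```scheme
;; Default radix: ff is a symbol.
(read (open-input-string "ff"))          ; ⇒ ff

;; Radix 16: the same characters denote a number.
(parameterize ((param:reader-radix 16))
  (read (open-input-string "ff")))       ; ⇒ 255
```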

parameter: param:reader-fold-case?

This parameter controls whether the parser folds the case of symbols, character names, and certain other syntax. If it is bound to its default value of #t, symbols read by the parser are case-folded prior to being interned. Otherwise, symbols are interned without folding.

At present, it is a bad idea to use this feature, as it doesn’t really make Scheme case-sensitive, and therefore can break features of the Scheme runtime that depend on case-folded symbols. Instead, use the #!fold-case or #!no-fold-case markers in your code.

obsolete variable: *parser-radix*
obsolete variable: *parser-canonicalize-symbols?*

These variables are deprecated; instead use the corresponding parameter objects.



14.6 Output Procedures

Output ports may or may not support buffering of output, in which output characters are collected together in a buffer and then sent to the output device all at once. (Most of the output ports implemented by the runtime system support buffering.) Sending all of the characters in the buffer to the output device is called flushing the buffer. In general, output procedures do not flush the buffer of an output port unless the buffer is full.

However, the standard output procedures described in this section perform what is called discretionary flushing of the buffer. Discretionary output flushing works as follows. After a procedure performs its output (writing characters to the output buffer), it checks to see if the port implements an operation called discretionary-flush-output. If so, then that operation is invoked to flush the buffer. At present, only the console port defines discretionary-flush-output; this is used to guarantee that output to the console appears immediately after it is written, without requiring calls to flush-output-port.

In this section, all optional arguments called port default to the current output port.

Note: MIT/GNU Scheme doesn’t support datum labels, so any behavior in write, write-shared, or write-simple that depends on datum labels is not implemented. At present all three of these procedures are equivalent. This will be remedied in a future release.

standard procedure: write object [port]

Writes a representation of object to the given textual output port. Strings that appear in the written representation are enclosed in quotation marks, and within those strings backslash and quotation mark characters are escaped by backslashes. Symbols that contain non-ASCII characters are escaped with vertical lines. Character objects are written using the #\ notation.

If object contains cycles which would cause an infinite loop using the normal written representation, then at least the objects that form part of the cycle must be represented using datum labels. Datum labels must not be used if there are no cycles.

Implementations may support extended syntax to represent record types or other types that do not have datum representations.

The write procedure returns an unspecified value.

In MIT/GNU Scheme, write performs discretionary output flushing.

standard procedure: write-shared object [port]

The write-shared procedure is the same as write, except that shared structure must be represented using datum labels for all pairs and vectors that appear more than once in the output.

standard procedure: write-simple object [port]

The write-simple procedure is the same as write, except that shared structure is never represented using datum labels. This can cause write-simple not to terminate if object contains circular structure.

standard procedure: display object [port]

Writes a representation of object to the given textual output port. Strings that appear in the written representation are output as if by write-string instead of by write. Symbols are not escaped. Character objects appear in the representation as if written by write-char instead of by write. The display representation of other objects is unspecified. However, display must not loop forever on self-referencing pairs, vectors, or records. Thus if the normal write representation is used, datum labels are needed to represent cycles as in write.

Implementations may support extended syntax to represent record types or other types that do not have datum representations.

The display procedure returns an unspecified value.

Rationale: The write procedure is intended for producing machine-readable output and display for producing human-readable output.
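As a small illustration of the difference, the following sketch captures what each procedure emits for a string and for a character. The helper printed-by is a hypothetical name introduced here; it relies on call-with-output-string to collect the output.

```scheme
;; Hypothetical helper: capture what a printer procedure emits.
(define (printed-by printer object)
  (call-with-output-string
    (lambda (port)
      (printer object port))))

(printed-by write "say \"hi\"")         ; ⇒ "\"say \\\"hi\\\"\""
(printed-by display "say \"hi\"")       ; ⇒ "say \"hi\""
(printed-by write #\a)                  ; ⇒ "#\\a"
(printed-by display #\a)                ; ⇒ "a"
```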

standard procedure: newline [port]

Writes an end of line to textual output port. Exactly how this is done differs from one operating system to another. Returns an unspecified value.

standard procedure: write-char char [port]

Writes the character char (not an external representation of the character) to the given textual output port and returns an unspecified value.

standard procedure: write-string string [port [start [end]]]

Writes the characters of string from start to end in left-to-right order to the textual output port.
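A brief sketch of the optional start and end arguments, assuming the usual convention that start is inclusive, end is exclusive, and end defaults to the length of the string. The variable name reordered is illustrative only.

```scheme
(define reordered
  (call-with-output-string
    (lambda (port)
      (write-string "hello world" port 6)      ; index 6 through the end
      (write-char #\space port)
      (write-string "hello world" port 0 5)))) ; indices 0 through 4

reordered                               ; ⇒ "world hello"
```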

obsolete procedure: write-substring string start end [port]

This procedure is deprecated; use write-string instead.

standard procedure: write-u8 byte [port]

Writes the byte to the given binary output port and returns an unspecified value.

In MIT/GNU Scheme, if port is an interactive output port in non-blocking mode and writing a byte would block, write-u8 immediately returns #f without writing anything. Otherwise byte is written and 1 is returned.

standard procedure: write-bytevector bytevector [port [start [end]]]

Writes the bytes of bytevector from start to end in left-to-right order to the binary output port.

In MIT/GNU Scheme, if port is an interactive output port in non-blocking mode, write-bytevector writes as many bytes as it can without blocking and returns the number of bytes written; if no bytes can be written without blocking, it returns #f without writing anything. Otherwise write-bytevector returns the number of bytes actually written, which may be less than the number requested if it is unable to write all the bytes (for example, when writing to a file and the file system is full).
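The two binary output procedures can be tried together on a bytevector port. This sketch assumes the standard open-output-bytevector and get-output-bytevector procedures are available; the variable name collected is illustrative.

```scheme
(define collected
  (let ((port (open-output-bytevector)))
    (write-u8 65 port)
    ;; Write only the bytes at indices 1 and 2 of the bytevector.
    (write-bytevector (bytevector 1 2 3 4) port 1 3)
    (get-output-bytevector port)))

collected                               ; ⇒ #u8(65 2 3)
```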

standard procedure: flush-output-port [port]

Flushes any buffered output from the buffer of port to the underlying file or device and returns an unspecified value.

obsolete procedure: flush-output [port]

This procedure is deprecated; use flush-output-port instead.

procedure: fresh-line [port]

Most output ports are able to tell whether or not they are at the beginning of a line of output. If port is such a port, this procedure writes an end-of-line to the port only if the port is not already at the beginning of a line. If port is not such a port, this procedure is identical to newline. In either case, fresh-line performs discretionary output flushing and returns an unspecified value.

procedure: write-line object [port]

Like write, except that it writes an end-of-line to port after writing object’s representation. This procedure performs discretionary output flushing and returns an unspecified value.

procedure: beep [port]

Performs a “beep” operation on port, performs discretionary output flushing, and returns an unspecified value. On the console port, this usually causes the console bell to beep, but more sophisticated interactive ports may take other actions, such as flashing the screen. On most output ports, e.g. file and string output ports, this does nothing.

procedure: clear [port]

“Clears the screen” of port, performs discretionary output flushing, and returns an unspecified value. On a terminal or window, this has a well-defined effect. On other output ports, e.g. file and string output ports, this does nothing.

procedure: pp object [port [as-code?]]

pp prints object in a visually appealing and structurally revealing manner on port. If object is a procedure, pp attempts to print the source text. If the optional argument as-code? is true, pp prints lists as Scheme code, providing appropriate indentation; by default this argument is false. pp performs discretionary output flushing and returns an unspecified value.

The following parameters may be used with parameterize to change the behavior of the write and display procedures.

parameter: param:printer-radix

This parameter specifies the default radix used to print numbers. Its value must be one of the exact integers 2, 8, 10, or 16; the default is 10. For values other than 10, numbers are prefixed to indicate their radix.
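A minimal sketch of printing in another radix. Because the printed number carries a radix prefix, reading the text back should give the original value. The variable name hex-text is illustrative.

```scheme
(define hex-text
  (parameterize ((param:printer-radix 16))
    (write-to-string 255)))

;; The radix prefix lets the text be read back as the same number.
(string->number hex-text)               ; ⇒ 255
```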

parameter: param:printer-list-breadth-limit

This parameter specifies a limit on the length of the printed representation of a list or vector; for example, if the limit is 4, only the first four elements of any list are printed, followed by ellipses to indicate any additional elements. The value of this parameter must be an exact non-negative integer, or #f meaning no limit; the default is #f.

(parameterize ((param:printer-list-breadth-limit 4))
  (write-to-string '(a b c d)))
                                ⇒ "(a b c d)"
(parameterize ((param:printer-list-breadth-limit 4))
  (write-to-string '(a b c d e)))
                                ⇒ "(a b c d ...)"

parameter: param:printer-list-depth-limit

This parameter specifies a limit on the nesting of lists and vectors in the printed representation. If lists (or vectors) are more deeply nested than the limit, the part of the representation that exceeds the limit is replaced by ellipses. The value of this parameter must be an exact non-negative integer, or #f meaning no limit; the default is #f.

(parameterize ((param:printer-list-depth-limit 4))
  (write-to-string '((((a))) b c d)))
                                ⇒ "((((a))) b c d)"
(parameterize ((param:printer-list-depth-limit 4))
  (write-to-string '(((((a)))) b c d)))
                                ⇒ "((((...))) b c d)"

parameter: param:printer-string-length-limit

This parameter specifies a limit on the length of the printed representation of strings. If a string’s length exceeds this limit, the part of the printed representation for the characters exceeding the limit is replaced by ellipses. The value of this parameter must be an exact non-negative integer, or #f meaning no limit; the default is #f.

(parameterize ((param:printer-string-length-limit 4))
  (write-to-string "abcd"))
                                ⇒ "\"abcd\""
(parameterize ((param:printer-string-length-limit 4))
  (write-to-string "abcde"))
                                ⇒ "\"abcd...\""

parameter: param:print-with-maximum-readability?

This parameter, which takes a boolean value, tells the printer to use a special printed representation for objects that normally print in a form that cannot be recognized by read. These objects are printed using the representation #@n, where n is the result of calling hash on the object to be printed. The reader recognizes this syntax, calling unhash on n to get back the original object. Note that this printed representation can only be recognized by the Scheme program in which it was generated, because these hash numbers are different for each invocation of Scheme.

obsolete variable: *unparser-radix*
obsolete variable: *unparser-list-breadth-limit*
obsolete variable: *unparser-list-depth-limit*
obsolete variable: *unparser-string-length-limit*
obsolete variable: *unparse-with-maximum-readability?*

These variables are deprecated; instead use the corresponding parameter objects.



14.7 Blocking Mode

An interactive port is always in one of two modes: blocking or non-blocking. This mode is independent of the terminal mode: each can be changed independently of the other. Furthermore, if it is an interactive I/O port, there are separate blocking modes for input and for output.

If an input port is in blocking mode, attempting to read from it when no input is available will cause Scheme to “block”, i.e. suspend itself, until input is available. If an input port is in non-blocking mode, attempting to read from it when no input is available will cause the reading procedure to return immediately, indicating the lack of input in some way (exactly how this situation is indicated is separately specified for each procedure or operation).

An output port in blocking mode will block if the output device is not ready to accept output. In non-blocking mode it will return immediately after performing as much output as the device will allow (again, each procedure or operation reports this situation in its own way).

Interactive ports are initially in blocking mode; this can be changed at any time with the procedures defined in this section.

These procedures represent blocking mode by the symbol blocking, and non-blocking mode by the symbol nonblocking. An argument called mode must be one of these symbols. A port argument to any of these procedures may be any port, even if that port does not support blocking mode; in that case, the port is not modified in any way.

procedure: input-port-blocking-mode input-port
procedure: output-port-blocking-mode output-port

Returns the blocking mode of input-port or output-port. Returns #f if the given port doesn’t support blocking mode.

procedure: set-input-port-blocking-mode! input-port mode
procedure: set-output-port-blocking-mode! output-port mode

Changes the blocking mode of input-port or output-port to be mode and returns an unspecified value.

procedure: with-input-port-blocking-mode input-port mode thunk
procedure: with-output-port-blocking-mode output-port mode thunk

Thunk must be a procedure of no arguments.

Binds the blocking mode of input-port or output-port to be mode, and calls thunk. When thunk returns, the original blocking mode is restored and the values yielded by thunk are returned.
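As a sketch of how the binding form might be used, the following helper attempts a read with the port temporarily in non-blocking mode. The name poll-char is hypothetical. On a port that does not support blocking mode, such as a string port, the mode is left alone and the read proceeds normally.

```scheme
;; Hypothetical helper: read one character with port temporarily in
;; non-blocking mode, so the read returns at once when no input is
;; available instead of suspending Scheme.
(define (poll-char port)
  (with-input-port-blocking-mode port 'nonblocking
    (lambda ()
      (read-char port))))

;; A string port does not support blocking mode.
(input-port-blocking-mode (open-input-string "x"))  ; ⇒ #f
(poll-char (open-input-string "x"))                 ; ⇒ #\x
```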

obsolete procedure: port/input-blocking-mode input-port
obsolete procedure: port/set-input-blocking-mode input-port mode
obsolete procedure: port/with-input-blocking-mode input-port mode thunk
obsolete procedure: port/output-blocking-mode output-port
obsolete procedure: port/set-output-blocking-mode output-port mode
obsolete procedure: port/with-output-blocking-mode output-port mode thunk

These procedures are deprecated; instead use the corresponding procedures above.



14.8 Terminal Mode

A port that reads from or writes to a terminal has a terminal mode; this is either cooked or raw. This mode is independent of the blocking mode: each can be changed independently of the other. Furthermore, a terminal I/O port has independent terminal modes both for input and for output.

A terminal port in cooked mode provides some standard processing to make the terminal easy to communicate with. For example, under Unix, cooked mode on input reads from the terminal a line at a time and provides editing within the line, while cooked mode on output might translate linefeeds to carriage-return/linefeed pairs. In general, the precise meaning of cooked mode is operating-system dependent, and furthermore might be customizable by means of operating-system utilities. The basic idea is that cooked mode does whatever is necessary to make the terminal handle all of the usual user-interface conventions for the operating system, while keeping the program’s interaction with the port as normal as possible.

A terminal port in raw mode disables all of that processing. In raw mode, characters are directly read from and written to the device without any translation or interpretation by the operating system. On input, characters are available as soon as they are typed, and are not echoed on the terminal by the operating system. In general, programs that put ports in raw mode have to know the details of interacting with the terminal. In particular, raw mode is used for writing programs such as text editors.

Terminal ports are initially in cooked mode; this can be changed at any time with the procedures defined in this section.

These procedures represent cooked mode by the symbol cooked, and raw mode by the symbol raw. An argument called mode must be one of these symbols. A port argument to any of these procedures may be any port, even if that port does not support terminal mode; in that case, the port is not modified in any way.

procedure: input-port-terminal-mode input-port
procedure: output-port-terminal-mode output-port

Returns the terminal mode of input-port or output-port. Returns #f if the given port is not a terminal port.

procedure: set-input-port-terminal-mode! input-port mode
procedure: set-output-port-terminal-mode! output-port mode

Changes the terminal mode of input-port or output-port to be mode and returns an unspecified value.

procedure: with-input-port-terminal-mode input-port mode thunk
procedure: with-output-port-terminal-mode output-port mode thunk

Thunk must be a procedure of no arguments.

Binds the terminal mode of input-port or output-port to be mode, and calls thunk. When thunk returns, the original terminal mode is restored and the values yielded by thunk are returned.
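A sketch of a single-keystroke reader built on the binding form. The name read-key is hypothetical. On a port that is not a terminal, the terminal mode is left alone and the helper degrades to an ordinary read-char.

```scheme
;; Hypothetical helper: read one character with port temporarily in
;; raw mode, so that on a terminal it is delivered as soon as it is
;; typed.
(define (read-key port)
  (with-input-port-terminal-mode port 'raw
    (lambda ()
      (read-char port))))

;; A string port is not a terminal port.
(input-port-terminal-mode (open-input-string ""))   ; ⇒ #f
```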

obsolete procedure: port/input-terminal-mode input-port
obsolete procedure: port/set-input-terminal-mode input-port mode
obsolete procedure: port/with-input-terminal-mode input-port mode thunk
obsolete procedure: port/output-terminal-mode output-port
obsolete procedure: port/set-output-terminal-mode output-port mode
obsolete procedure: port/with-output-terminal-mode output-port mode thunk

These procedures are deprecated; instead use the corresponding procedures above.



14.9 Format

The procedure format is very useful for producing nicely formatted text, producing good-looking messages, and so on. MIT/GNU Scheme’s implementation of format is similar to that of Common Lisp, except that Common Lisp defines many more directives.14

format is a run-time-loadable option. To use it, execute

(load-option 'format)

once before calling it.

procedure: format destination control-string argument …

Writes the characters of control-string to destination, except that a tilde (~) introduces a format directive. The character after the tilde, possibly preceded by prefix parameters and modifiers, specifies what kind of formatting is desired. Most directives use one or more arguments to create their output; the typical directive puts the next argument into the output, formatted in some special way. It is an error if no argument remains for a directive requiring an argument, but it is not an error if one or more arguments remain unprocessed by a directive.

The output is sent to destination. If destination is #f, a string is created that contains the output; this string is returned as the value of the call to format. In all other cases format returns an unspecified value. If destination is #t, the output is sent to the current output port. Otherwise, destination must be an output port, and the output is sent there.

This procedure performs discretionary output flushing (see Output Procedures).

A format directive consists of a tilde (~), optional prefix parameters separated by commas, optional colon (:) and at-sign (@) modifiers, and a single character indicating what kind of directive this is. The alphabetic case of the directive character is ignored. The prefix parameters are generally integers, notated as optionally signed decimal numbers. If both the colon and at-sign modifiers are given, they may appear in either order.

In place of a prefix parameter to a directive, you can put the letter ‘V’ (or ‘v’), which takes an argument for use as a parameter to the directive. Normally this should be an exact integer. This feature allows variable-width fields and the like. You can also use the character ‘#’ in place of a parameter; it represents the number of arguments remaining to be processed.

It is an error to give a format directive more parameters than it is described here as accepting. It is also an error to give colon or at-sign modifiers to a directive in a combination not specifically described here as being meaningful.

~A

The next argument, which may be any object, is printed as if by display. ~mincolA inserts spaces on the right, if necessary, to make the width at least mincol columns. The @ modifier causes the spaces to be inserted on the left rather than the right.

~S

The next argument, which may be any object, is printed as if by write. ~mincolS inserts spaces on the right, if necessary, to make the width at least mincol columns. The @ modifier causes the spaces to be inserted on the left rather than the right.

~%

This outputs a #\newline character. ~n% outputs n newlines. No argument is used. Simply putting a newline in control-string would work, but ~% is often used because it makes the control string look nicer in the middle of a program.

~~

This outputs a tilde. ~n~ outputs n tildes.

~newline

Tilde immediately followed by a newline ignores the newline and any following non-newline whitespace characters. With an @, the newline is left in place, but any following whitespace is ignored. This directive is typically used when control-string is too long to fit nicely into one line of the program:

(define (type-clash-error procedure arg spec actual)
  (format
   #t
   "~%Procedure ~S~%requires its ~A argument ~
    to be of type ~S,~%but it was called with ~
    an argument of type ~S.~%"
   procedure arg spec actual))
(type-clash-error 'vector-ref
                  "first"
                  'integer
                  'vector)

prints

Procedure vector-ref
requires its first argument to be of type integer,
but it was called with an argument of type vector.

Note that in this example newlines appear in the output only as specified by the ~% directives; the actual newline characters in the control string are suppressed because each is preceded by a tilde.
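A few smaller calls may help to see the directives in isolation. This sketch passes #f as destination so that each call returns its output as a string; the expected results follow from the directive descriptions above.

```scheme
(load-option 'format)

(format #f "~A and ~S" "text" "text")   ; ⇒ "text and \"text\""
(format #f "[~6A]" 'abc)                ; ⇒ "[abc   ]"
(format #f "[~6@A]" 'abc)               ; ⇒ "[   abc]"
(format #f "one~%two")                  ; ⇒ "one\ntwo"
```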



14.10 Custom Output

MIT/GNU Scheme provides hooks for giving particular objects special written representations. There are no restrictions on the written representations.

procedure: define-print-method predicate print-method

Defines the print method for objects satisfying predicate to be print-method. The predicate argument must be a unary procedure that returns true for the objects to print specially, and print-method must be a binary procedure that accepts one of those objects and a textual output port.

Although print-method can print the object in any way, we strongly recommend using one of the following special printers.

procedure: standard-print-method name [get-parts]

The name argument may be a unary procedure, a string, or a symbol; if it is a procedure it is called with the object to be printed as its argument and should return a string or a symbol. The get-parts argument, if provided, must be a unary procedure that is called with the object to be printed and must return a list of objects. If get-parts is not provided, it defaults to a procedure that returns an empty list.

The output generated by this method is in a standard format:

#[<name> <hash> <part>…]

where <name> is the string or symbol from name as printed by display, <hash> is a unique nonnegative integer generated by calling hash-object on the object, and the <part>s are the result of calling get-parts as printed by write and separated by spaces.

One significant advantage of print methods generated by standard-print-method is that the parts returned by get-parts are examined when searching for circular structure (as by write) or shared structure (as by write-shared). In effect the printer sees one of these objects as a compound object containing those parts.
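A minimal sketch of a print method for a record type. The record type point and its accessors are hypothetical names introduced for illustration, and the hash number in the output varies from one object and session to another.

```scheme
;; Hypothetical record type used only for this illustration.
(define-record-type point
    (make-point x y)
    point?
  (x point-x)
  (y point-y))

;; The parts list exposes the fields to the printer.
(define-print-method point?
  (standard-print-method 'point
    (lambda (p)
      (list (point-x p) (point-y p)))))

(define point-text (write-to-string (make-point 1 2)))
;; point-text has the form "#[point 12 1 2]", where 12 stands for
;; whatever hash number the object receives.
```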

procedure: bracketed-print-method name printer

The name argument may be a unary procedure, a string, or a symbol; if it is a procedure it is called with the object to be printed as its argument and should return a string or a symbol. The printer argument must be a binary procedure, which is called with the object to print and a textual output port as its arguments.

Similar to standard-print-method, this procedure prints an object

#[<name> <hash><output>]

where <name> is the string or symbol from name as printed by display, <hash> is a unique nonnegative integer generated by calling hash-object on the object, and <output> is the text written by printer.

This procedure has the benefit of printing objects using the standard bracketed form, but because its output is unstructured, it cannot be examined for sharing or circularity. Generally speaking, it’s preferable to use standard-print-method instead.
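For comparison, a sketch of the same idea with free-form output. The record type temperature is a hypothetical name. Since the output follows the hash number directly, the printer in this sketch writes its own separating space.

```scheme
;; Hypothetical record type used only for this illustration.
(define-record-type temperature
    (make-temperature degrees)
    temperature?
  (degrees temperature-degrees))

(define-print-method temperature?
  (bracketed-print-method 'temperature
    (lambda (temp port)
      ;; <output> follows <hash> directly, so emit the space here.
      (write-string " " port)
      (write (temperature-degrees temp) port)
      (write-string " degrees" port))))

(define temperature-text (write-to-string (make-temperature 21)))
;; temperature-text has the form "#[temperature 13 21 degrees]",
;; where 13 stands for whatever hash number the object receives.
```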

The following are deprecated procedures that have been replaced by the above.

obsolete procedure: set-record-type-unparser-method! record-type unparser-method

This procedure is deprecated; instead use

(define-print-method (record-predicate record-type)
  unparser-method)

provided that unparser-method is really a print method.

obsolete procedure: unparser/set-tagged-vector-method! tag unparser-method
obsolete procedure: unparser/set-tagged-pair-method! tag unparser-method

These procedures are deprecated. There is no direct replacement for them.

These were primarily used by define-structure, which now generates new-style print methods. If you have other uses of these, it should be possible to translate them to use define-print-method with hand-written predicates.

obsolete procedure: standard-unparser-method name procedure

This procedure is deprecated; it is currently an alias for bracketed-print-method.

obsolete procedure: with-current-unparser-state unparser-state procedure

This procedure is deprecated, with no direct replacement. In general just use procedure without wrapping it.



14.11 Prompting

This section describes procedures that prompt the user for input. Why should the programmer use these procedures when it is possible to do prompting using ordinary input and output procedures? One reason is that the prompting procedures are more succinct. However, a second and better reason is that the prompting procedures can be separately customized for each user interface, providing more natural interaction. The interfaces for Edwin and for GNU Emacs have already been customized in this fashion; because Edwin and Emacs are very similar editors, their customizations provide very similar behavior.

Each of these procedures accepts an optional argument called port, which if given must be an I/O port. If not given, this port defaults to the value of (interaction-i/o-port); this is initially the console I/O port.

procedure: prompt-for-command-expression prompt [port]

Prompts the user for an expression that is to be executed as a command. This is the procedure called by the REP loop to read the user’s expressions.

If prompt is a string, it is used verbatim as the prompt string. Otherwise, it must be a pair whose car is the symbol ‘standard’ and whose cdr is a string; in this case the prompt string is formed by prepending to the string the current REP loop “level number” and a space. Also, a space is appended to the string, unless it already ends in a space or is an empty string.

The default behavior of this procedure is to print a fresh line, a newline, and the prompt string; flush the output buffer; then read an object and return it.

Under Edwin and Emacs, before the object is read, the interaction buffer is put into a mode that allows expressions to be edited and submitted for input using specific editor commands. The first expression that is submitted is returned as the value of this procedure.

procedure: prompt-for-command-char prompt [port]

Prompts the user for a single character that is to be executed as a command; the returned character is guaranteed to satisfy char-graphic?. If at all possible, the character is read from the user interface using a mode that reads the character as a single keystroke; in other words, it should not be necessary for the user to follow the character with a carriage return or something similar.

This is the procedure called by debug and where to read the user’s commands.

If prompt is a string, it is used verbatim as the prompt string. Otherwise, it must be a pair whose car is standard and whose cdr is a string; in this case the prompt string is formed by prepending to the string the current REP loop “level number” and a space. Also, a space is appended to the string, unless it already ends in a space or is an empty string.

The default behavior of this procedure is to print a fresh line, a newline, and the prompt string; flush the output buffer; read a character in raw mode, echo that character, and return it.

Under Edwin and Emacs, instead of reading a character, the interaction buffer is put into a mode in which graphic characters submit themselves as input. After this mode change, the first such character submitted is returned as the value of this procedure.

procedure: prompt-for-expression prompt [port]

Prompts the user for an expression.

The prompt string is formed by appending a colon and a space to prompt, unless prompt already ends in a space or is the null string.

The default behavior of this procedure is to print a fresh line, a newline, and the prompt string; flush the output buffer; then read an object and return it.

Under Edwin and Emacs, the expression is read in the minibuffer.

procedure: prompt-for-evaluated-expression prompt [environment [port]]

Prompts the user for an evaluated expression. Calls prompt-for-expression to read an expression, then evaluates the expression using environment; if environment is not given, the REP loop environment is used.

procedure: prompt-for-confirmation prompt [port]

Prompts the user for confirmation. The result yielded by this procedure is a boolean.

The prompt string is formed by appending the string " (y or n)? " to prompt, unless prompt already ends in a space or is the null string.

The default behavior of this procedure is to print a fresh line, a newline, and the prompt string; flush the output buffer; then read a character in raw mode. If the character is #\y, #\Y, or #\space, the procedure returns #t; if the character is #\n, #\N, or #\rubout, the procedure returns #f. Otherwise the prompt is repeated.

Under Edwin or Emacs, the confirmation is read in the minibuffer.
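A short sketch of guarding an action with a confirmation prompt. The helper name run-if-confirmed and the symbol cancelled are hypothetical. The call itself is interactive, so it is shown commented out.

```scheme
;; Hypothetical helper: call thunk only if the user confirms.
(define (run-if-confirmed question thunk)
  (if (prompt-for-confirmation question)
      (thunk)
      'cancelled))

;; Interactive use. The prompt string would be formed by appending
;; " (y or n)? " to the question.
;; (run-if-confirmed "Overwrite file" (lambda () 'overwritten))
```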



14.12 Textual Port Primitives

This section describes the low-level operations that can be used to build and manipulate textual I/O ports. The purpose of these operations is to allow programmers to construct new kinds of textual I/O ports.

The mechanisms described in this section are exclusively for textual ports; binary ports can’t be customized. In this section, any reference to a “port” that isn’t modified by “textual” or “binary” is assumed to be a textual port.

The abstract model of a textual I/O port, as implemented here, is a combination of a set of named operations and a state. The state is an arbitrary object, the meaning of which is determined by the operations. The operations are defined by a mapping from names to procedures.

The set of named operations is represented by an object called a textual port type. A port type is constructed from a set of named operations, and is subsequently used to construct a port. The port type completely specifies the behavior of the port. Port types also support a simple form of inheritance, allowing you to create new ports that are similar to existing ports.

The port operations are divided into two classes:

Standard operations

There is a specific set of standard operations for input ports, and a different set for output ports. Applications can assume that the standard input operations are implemented for all input ports, and likewise the standard output operations are implemented for all output ports.

Custom operations

Some ports support additional operations. For example, ports that implement output to terminals (or windows) may define an operation named y-size that returns the height of the terminal in characters. Because only some ports will implement these operations, programs that use custom operations must test each port for their existence, and be prepared to deal with ports that do not implement them.



14.12.1 Textual Port Types

The procedures in this section provide means for constructing port types with standard and custom operations, and accessing their operations.

procedure: make-textual-port-type operations port-type

Creates and returns a new port type. Operations must be a list; each element is a list of two elements, the name of the operation (a symbol) and the procedure that implements it. Port-type is either #f or a port type; if it is a port type, any operations implemented by port-type but not specified in operations will be implemented by the resulting port type.

Operations need not contain definitions for all of the standard operations; the procedure will provide defaults for any standard operations that are not defined. At a minimum, the following operations must be defined: for input ports, read-char and peek-char; for output ports, either write-char or write-substring. I/O ports must supply the minimum operations for both input and output.

If an operation in operations is defined to be #f, then the corresponding operation in port-type is not inherited.

If read-char is defined in operations, then any standard input operations defined in port-type are ignored. Likewise, if write-char or write-substring is defined in operations, then any standard output operations defined in port-type are ignored. This feature allows overriding the standard operations without having to enumerate them.
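The following is a rough sketch of a minimal output port type, supplying only write-char and leaving the remaining standard operations to the defaults. Several things here are assumptions rather than documented facts: the names upcasing-port-type and make-upcasing-port are hypothetical, the operation is assumed to take the port and the character in that order, and it is assumed that returning 1 (one character written) is an acceptable result for the operation. The port itself is built with make-textual-port, and its state is the port that receives the converted text.

```scheme
;; Hypothetical port type: upcase each character and forward it to
;; another textual output port, which is stored as the port's state.
(define upcasing-port-type
  (make-textual-port-type
   `((write-char
      ,(lambda (port char)
         (write-char (char-upcase char) (textual-port-state port))
         ;; Assumption: report that one character was written.
         1)))
   #f))

(define (make-upcasing-port sink)
  (make-textual-port upcasing-port-type sink))

;; Intended use: text written through the port arrives upcased.
(define shouted
  (call-with-output-string
    (lambda (sink)
      (write-string "hello" (make-upcasing-port sink)))))
```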

procedure: textual-port-type? object
procedure: textual-input-port-type? object
procedure: textual-output-port-type? object
procedure: textual-i/o-port-type? object

These predicates return #t if object is a port type, input-port type, output-port type, or I/O-port type, respectively. Otherwise, they return #f.

obsolete procedure: make-port-type operations port-type
obsolete procedure: port-type? object
obsolete procedure: input-port-type? object
obsolete procedure: output-port-type? object
obsolete procedure: i/o-port-type? object

These procedures are deprecated; use the procedures defined above.

obsolete procedure: port-type/operations port-type
obsolete procedure: port-type/operation-names port-type
obsolete procedure: port-type/operation port-type symbol

These procedures are deprecated and will be removed in the near future. There are no replacements planned.



14.12.2 Constructors and Accessors for Textual Ports

The procedures in this section provide means for constructing ports, accessing the type of a port, and manipulating the state of a port.

procedure: make-textual-port port-type state

Returns a new port with type port-type and the given state. The port will be an input, output, or I/O port according to port-type.

procedure: textual-port-type textual-port

Returns the port type of textual-port.

procedure: textual-port-state textual-port

Returns the state component of textual-port.

procedure: set-textual-port-state! textual-port object

Changes the state component of textual-port to be object. Returns an unspecified value.

procedure: textual-port-operation textual-port symbol

Returns the operation named symbol for textual-port. If textual-port has no such operation, returns #f.

procedure: textual-port-operation-names textual-port

Returns a newly allocated list whose elements are the names of the operations implemented by textual-port.
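As a small illustration, the sketch below looks up operations by name on a string input port. It assumes that a port returned by open-input-string is a textual port that implements read-char; the helper port-supports? and the operation name no-such-operation are made up for this example.

```scheme
;; Hypothetical helper: does PORT implement the operation named NAME?
(define (port-supports? port name)
  (if (textual-port-operation port name) #t #f))

;; Assumption: string input ports are textual ports that implement read-char.
(define sample-port (open-input-string "hi"))

(port-supports? sample-port 'read-char)

;; An operation name invented for this example; no port is expected to have it.
(port-supports? sample-port 'no-such-operation)

;; The result is a list of symbols whose contents depend on the port type.
(textual-port-operation-names sample-port)
```

Testing for an operation before invoking it lets a program degrade gracefully on ports that lack a custom operation.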

obsolete procedure: make-port port-type state
obsolete procedure: port/type textual-port
obsolete procedure: port/state textual-port
obsolete procedure: set-port/state! textual-port object
obsolete procedure: port/operation textual-port symbol
obsolete procedure: port/operation-names port

These procedures are deprecated; use the procedures defined above.


14.12.3 Textual Input Port Operations

This section describes the standard operations on textual input ports. Following that, some useful custom operations are described.

operation on textual input port: read-char port

Removes the next character available from port and returns it. If port has no more characters and will never have any (e.g. at the end of an input file), this operation returns an end-of-file object. If port has no more characters but will eventually have some more (e.g. a terminal where nothing has been typed recently), and it is in non-blocking mode, #f is returned; otherwise the operation hangs until input is available.

operation on textual input port: peek-char port

Reads the next character available from port and returns it. The character is not removed from port, and a subsequent attempt to read from the port will get that character again. In other respects this operation behaves like read-char.

operation on textual input port: char-ready? port k

char-ready? returns #t if at least one character is available to be read from port. If no characters are available, the operation waits up to k milliseconds before returning #f, returning immediately if any characters become available while it is waiting.

operation on textual input port: read-string port char-set
operation on textual input port: discard-chars port char-set

These operations are like read-char, except that they read or discard multiple characters at once. All characters up to, but excluding, the first character that is a member of char-set (or end of file) are read from port. read-string returns these characters as a newly allocated string, while discard-chars discards them and returns an unspecified value. These operations hang until sufficient input is available, even if port is in non-blocking mode. If end of file is encountered before any input characters, read-string returns an end-of-file object.

operation on textual input port: read-substring port string start end

Reads characters from port into the substring defined by string, start, and end until either the substring has been filled or there are no more characters available. Returns the number of characters written to the substring.

If port is an interactive port, and at least one character is immediately available, the available characters are written to the substring and this operation returns immediately. If no characters are available, and port is in blocking mode, the operation blocks until at least one character is available. Otherwise, the operation returns #f immediately.

This is an extremely fast way to read characters from a port.

procedure: input-port/read-char textual-input-port
procedure: input-port/peek-char textual-input-port
procedure: input-port/char-ready? textual-input-port k
procedure: input-port/read-string textual-input-port char-set
procedure: input-port/discard-chars textual-input-port char-set
procedure: input-port/read-substring textual-input-port string start end

Each of these procedures invokes the respective operation on textual-input-port. For example, the following are equivalent:

(input-port/read-char textual-input-port)
((textual-port-operation textual-input-port 'read-char)
 textual-input-port)
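The difference between peeking and reading can be sketched as follows. This is only an illustration: it assumes that open-input-string returns a textual input port, and the helper peek-then-read is hypothetical.

```scheme
;; Hypothetical helper: peek at the next character, then read it.
;; LET* guarantees that the peek happens before the read.
(define (peek-then-read port)
  (let* ((peeked (input-port/peek-char port))   ; does not consume the character
         (taken (input-port/read-char port)))   ; consumes the same character
    (list peeked taken)))

;; Assumption: a string port behaves like any other textual input port here.
(define demo-port (open-input-string "ab"))

(peek-then-read demo-port)         ; both elements are #\a
(input-port/read-char demo-port)   ; #\b
(input-port/read-char demo-port)   ; an end-of-file object
```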

The following custom operations are implemented for input ports to files, and will also work with some other kinds of input ports:

operation on textual input port: eof? port

Returns #t if port is known to be at end of file, otherwise it returns #f.

operation on textual input port: chars-remaining port

Returns an estimate of the number of characters remaining to be read from port. This is useful only when port is a file port in binary mode; in other cases, it returns #f.

operation on textual input port: buffered-input-chars port

Returns the number of unread characters that are stored in port’s buffer. This will always be less than or equal to the buffer’s size.

operation on textual input port: input-buffer-size port

Returns the maximum number of characters that port’s buffer can hold.

operation on textual input port: set-input-buffer-size port size

Resizes port’s buffer so that it can hold at most size characters. Characters in the buffer are discarded. Size must be an exact non-negative integer.


14.12.4 Textual Output Port Operations

This section describes the standard operations on textual output ports. Following that, some useful custom operations are described.

operation on textual output port: write-char port char

Writes char to port and returns an unspecified value.

operation on textual output port: write-substring port string start end

Writes the substring specified by string, start, and end to port and returns an unspecified value. This is equivalent to writing the characters of the substring to port one at a time, but it is implemented much more efficiently.

operation on textual output port: fresh-line port

Most output ports are able to tell whether or not they are at the beginning of a line of output. If port is such a port, an end-of-line is written to the port only if the port is not already at the beginning of a line. If port is not such a port, an end-of-line is unconditionally written to the port. Returns an unspecified value.

operation on textual output port: flush-output port

If port is buffered, this causes its buffer to be written out. Otherwise it has no effect. Returns an unspecified value.

operation on textual output port: discretionary-flush-output port

Normally, this operation does nothing. However, ports that support discretionary output flushing implement this operation identically to flush-output.

procedure: output-port/write-char textual-output-port char
procedure: output-port/write-substring textual-output-port string start end
procedure: output-port/fresh-line textual-output-port
procedure: output-port/flush-output textual-output-port
procedure: output-port/discretionary-flush-output textual-output-port

Each of these procedures invokes the respective operation on textual-output-port. For example, the following are equivalent:

(output-port/write-char textual-output-port char)
((textual-port-operation textual-output-port 'write-char)
 textual-output-port char)

procedure: output-port/write-string textual-output-port string

Writes string to textual-output-port. Equivalent to

(output-port/write-substring textual-output-port
                             string
                             0
                             (string-length string))
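A minimal sketch of how these procedures combine, assuming that call-with-output-string is available to capture the output in a string. The helper two-lines is hypothetical. Whether the second fresh-line adds anything depends on whether the port can tell that it is at the start of a line.

```scheme
;; Hypothetical helper: write FIRST and SECOND on separate lines and
;; return everything that was written as a string.
(define (two-lines first second)
  (call-with-output-string
    (lambda (port)
      (output-port/write-string port first)
      ;; Ends the current line.
      (output-port/fresh-line port)
      ;; Adds nothing if the port tracks line starts; otherwise it
      ;; writes another end-of-line.
      (output-port/fresh-line port)
      (output-port/write-string port second))))

(two-lines "abc" "def")
```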

The following custom operations are generally useful.

operation on textual output port: buffered-output-chars port

Returns the number of unwritten characters that are stored in port’s buffer. This will always be less than or equal to the buffer’s size.

operation on textual output port: output-buffer-size port

Returns the maximum number of characters that port’s buffer can hold.

operation on textual output port: set-output-buffer-size port size

Resizes port’s buffer so that it can hold at most size characters. Characters in the buffer are discarded. Size must be an exact non-negative integer.

operation on textual output port: x-size port

Returns an exact positive integer that is the width of port in characters. If port has no natural width, e.g. if it is a file port, #f is returned.

operation on textual output port: y-size port

Returns an exact positive integer that is the height of port in characters. If port has no natural height, e.g. if it is a file port, #f is returned.

procedure: output-port/x-size textual-output-port

This procedure invokes the custom operation whose name is the symbol x-size, if it exists. If the x-size operation is defined and returns a value other than #f, that value is returned as the result of this procedure. Otherwise, output-port/x-size returns a default value (currently 80).

output-port/x-size is useful for programs that tailor their output to the width of the display (a fairly common practice). If the output device is not a display, such programs normally want some reasonable default width to work with, and this procedure provides exactly that.
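For instance, a program might draw a horizontal rule that spans the port. The sketch below is one way to do that; the helper write-rule is hypothetical, and the output is captured with call-with-output-string only to keep the example self-contained.

```scheme
;; Hypothetical helper: write a row of dashes as wide as PORT, then
;; end the line.
(define (write-rule port)
  (output-port/write-string port
                            (make-string (output-port/x-size port) #\-))
  (output-port/fresh-line port))

;; A string port has no natural width, so the default width is
;; expected to apply here.
(call-with-output-string write-rule)
```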

procedure: output-port/y-size textual-output-port

This procedure invokes the custom operation whose name is the symbol y-size, if it exists. If the y-size operation is defined, the value it returns is returned as the result of this procedure; otherwise, #f is returned.


14.13 Parser Buffers

The parser buffer mechanism facilitates construction of parsers for complex grammars. It does this by providing an input stream with unbounded buffering and backtracking. The amount of buffering is under program control. The stream can backtrack to any position in the buffer.

The mechanism defines two data types: the parser buffer and the parser-buffer pointer. A parser buffer is like an input port with buffering and backtracking. A parser-buffer pointer is a pointer into the stream of characters provided by a parser buffer.

Note that all of the procedures defined here consider a parser buffer to contain a stream of Unicode characters.

There are several constructors for parser buffers:

procedure: textual-input-port->parser-buffer textual-input-port
obsolete procedure: input-port->parser-buffer textual-input-port

Returns a parser buffer that buffers characters read from textual-input-port.

procedure: substring->parser-buffer string start end

Returns a parser buffer that buffers the characters in the argument substring. This is equivalent to creating a string input port and calling textual-input-port->parser-buffer, but it runs faster and uses less memory.

procedure: string->parser-buffer string

Like substring->parser-buffer but buffers the entire string.

procedure: source->parser-buffer source

Returns a parser buffer that buffers the characters returned by calling source. Source is a procedure of three arguments: a string, a start index, and an end index (in other words, a substring specifier). Each time source is called, it writes some characters in the substring, and returns the number of characters written. When there are no more characters available, it returns zero. It must not return zero in any other circumstance.
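The following sketch shows one possible source procedure, which feeds the characters of an in-memory string to the buffer. The name string-source is hypothetical, and the code assumes that the string passed to the source is mutable and that the buffer never requests an empty range.

```scheme
;; Hypothetical helper: return a source procedure that supplies the
;; characters of TEXT, as many per call as fit in the requested range.
(define (string-source text)
  (let ((position 0))                   ; next unread index in TEXT
    (lambda (target start end)
      (let loop ((i start) (count 0))
        (if (and (< i end) (< position (string-length text)))
            (begin
              (string-set! target i (string-ref text position))
              (set! position (+ position 1))
              (loop (+ i 1) (+ count 1)))
            ;; COUNT is zero only when TEXT is exhausted, assuming
            ;; START is less than END.
            count)))))

(define streamed (source->parser-buffer (string-source "xy")))
(read-parser-buffer-char streamed)      ; #\x
```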

Parser buffers and parser-buffer pointers may be distinguished from other objects:

procedure: parser-buffer? object

Returns #t if object is a parser buffer, otherwise returns #f.

procedure: parser-buffer-pointer? object

Returns #t if object is a parser-buffer pointer, otherwise returns #f.

Characters can be read from a parser buffer much as they can be read from an input port. The parser buffer maintains an internal pointer indicating its current position in the input stream. Additionally, the buffer remembers all characters that were previously read, and can look at characters arbitrarily far ahead in the stream. It is this buffering capability that facilitates complex matching and backtracking.

procedure: read-parser-buffer-char buffer

Returns the next character in buffer, advancing the internal pointer past that character. If there are no more characters available, returns #f and leaves the internal pointer unchanged.

procedure: peek-parser-buffer-char buffer

Returns the next character in buffer, or #f if no characters are available. Leaves the internal pointer unchanged.

procedure: parser-buffer-ref buffer index

Returns a character in buffer. Index is a non-negative integer specifying the character to be returned. If index is zero, returns the next available character; if it is one, returns the character after that, and so on. If index specifies a position after the last character in buffer, returns #f. Leaves the internal pointer unchanged.
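A short sketch of these three procedures applied to a small buffer; the comments restate the behavior described above.

```scheme
(define letters (string->parser-buffer "abc"))

(peek-parser-buffer-char letters)   ; #\a; the internal pointer stays put
(read-parser-buffer-char letters)   ; #\a; the internal pointer advances
(parser-buffer-ref letters 0)       ; #\b, the next available character
(parser-buffer-ref letters 1)       ; #\c
(parser-buffer-ref letters 2)       ; #f, past the last character
```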

The internal pointer of a parser buffer can be read or written:

procedure: get-parser-buffer-pointer buffer

Returns a parser-buffer pointer object corresponding to the internal pointer of buffer.

procedure: set-parser-buffer-pointer! buffer pointer

Sets the internal pointer of buffer to the position specified by pointer. Pointer must have been returned from a previous call of get-parser-buffer-pointer on buffer. Additionally, if some of buffer’s characters have been discarded by discard-parser-buffer-head!, pointer must be outside the range that was discarded.

procedure: get-parser-buffer-tail buffer pointer

Returns a newly allocated string consisting of all of the characters in buffer that fall between pointer and buffer’s internal pointer. Pointer must have been returned from a previous call of get-parser-buffer-pointer on buffer. Additionally, if some of buffer’s characters have been discarded by discard-parser-buffer-head!, pointer must be outside the range that was discarded.

procedure: discard-parser-buffer-head! buffer

Discards all characters in buffer that have already been read; in other words, all characters prior to the internal pointer. After this operation has completed, it is no longer possible to move the internal pointer backwards past the current position by calling set-parser-buffer-pointer!.
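Saving and restoring the internal pointer is what makes lookahead possible. As a sketch, the hypothetical helper below scans a run of alphabetic characters, extracts it with get-parser-buffer-tail, and then rewinds so that the buffer appears untouched to the caller.

```scheme
;; Hypothetical helper: return the alphabetic word at the current
;; position without consuming it.
(define (peek-word buffer)
  (let ((start (get-parser-buffer-pointer buffer)))
    (let loop ()
      (let ((char (peek-parser-buffer-char buffer)))
        (if (and char (char-alphabetic? char))
            (begin
              (read-parser-buffer-char buffer)
              (loop))
            ;; Extract the text scanned so far, then rewind.
            (let ((word (get-parser-buffer-tail buffer start)))
              (set-parser-buffer-pointer! buffer start)
              word))))))

(define words (string->parser-buffer "let x"))
(peek-word words)                  ; "let"
(read-parser-buffer-char words)    ; #\l, because peek-word rewound the buffer
```

Calling discard-parser-buffer-head! between the save and the restore would invalidate the saved pointer, so a helper like this should not discard while it holds one.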

The next rather large set of procedures does conditional matching against the contents of a parser buffer. All matching is performed relative to the buffer’s internal pointer, so the first character to be matched against is the next character that would be returned by peek-parser-buffer-char. The returned value is always #t for a successful match, and #f otherwise. For procedures whose names do not end in ‘-no-advance’, a successful match also moves the internal pointer of the buffer forward to the end of the matched text; otherwise the internal pointer is unchanged.

procedure: match-parser-buffer-char buffer char
procedure: match-parser-buffer-char-ci buffer char
procedure: match-parser-buffer-not-char buffer char
procedure: match-parser-buffer-not-char-ci buffer char
procedure: match-parser-buffer-char-no-advance buffer char
procedure: match-parser-buffer-char-ci-no-advance buffer char
procedure: match-parser-buffer-not-char-no-advance buffer char
procedure: match-parser-buffer-not-char-ci-no-advance buffer char

Each of these procedures compares a single character in buffer to char. The basic comparison match-parser-buffer-char compares the character to char using char=?. The procedures whose names contain the ‘-ci’ modifier do case-insensitive comparison (i.e. they use char-ci=?). The procedures whose names contain the ‘not-’ modifier are successful if the character doesn’t match char.

procedure: match-parser-buffer-char-in-set buffer char-set
procedure: match-parser-buffer-char-in-set-no-advance buffer char-set

These procedures compare the next character in buffer against char-set using char-in-set?.

procedure: match-parser-buffer-string buffer string
procedure: match-parser-buffer-string-ci buffer string
procedure: match-parser-buffer-string-no-advance buffer string
procedure: match-parser-buffer-string-ci-no-advance buffer string

These procedures match string against buffer’s contents. The ‘-ci’ procedures do case-insensitive matching.

procedure: match-parser-buffer-substring buffer string start end
procedure: match-parser-buffer-substring-ci buffer string start end
procedure: match-parser-buffer-substring-no-advance buffer string start end
procedure: match-parser-buffer-substring-ci-no-advance buffer string start end

These procedures match the specified substring against buffer’s contents. The ‘-ci’ procedures do case-insensitive matching.
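The following sketch walks through several of these matchers on one buffer. The comments describe the results expected from the rules stated above.

```scheme
(define tag (string->parser-buffer "Scheme!"))

(match-parser-buffer-char-no-advance tag #\S)  ; #t; pointer unchanged
(match-parser-buffer-char tag #\s)             ; #f; case differs, pointer unchanged
(match-parser-buffer-char-ci tag #\s)          ; #t; pointer is now after "S"
(match-parser-buffer-string tag "cheme")       ; #t; pointer is now after "Scheme"
(match-parser-buffer-not-char tag #\?)         ; #t; "!" is not "?", pointer advances
(peek-parser-buffer-char tag)                  ; #f; nothing is left
```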

The remaining procedures provide information that can be used to identify locations in a parser buffer’s stream.

procedure: parser-buffer-position-string pointer

Returns a string describing the location of pointer in terms of its character and line indexes. The resulting string is meant to be presented to an end user in order to direct their attention to a feature in the input stream. In this string, the indexes are presented as one-based numbers.

Pointer may alternatively be a parser buffer, in which case it is equivalent to having specified the buffer’s internal pointer.

procedure: parser-buffer-pointer-index pointer
procedure: parser-buffer-pointer-line pointer

Returns the character or line index, respectively, of pointer. Both indexes are zero-based.
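As a sketch, the pointers captured below show how the indexes change as characters are read. The example assumes that reading a newline character advances the line index. It compares pointers rather than relying on absolute values, and it does not depend on the exact wording of the position string.

```scheme
(define text (string->parser-buffer "ab\ncd"))

(define before (get-parser-buffer-pointer text))
(read-parser-buffer-char text)                    ; #\a
(read-parser-buffer-char text)                    ; #\b
(define middle (get-parser-buffer-pointer text))
(read-parser-buffer-char text)                    ; #\newline
(define after (get-parser-buffer-pointer text))

;; Two characters were read between BEFORE and MIDDLE.
(- (parser-buffer-pointer-index middle)
   (parser-buffer-pointer-index before))

;; One newline was read between BEFORE and AFTER.
(- (parser-buffer-pointer-line after)
   (parser-buffer-pointer-line before))

;; A human-readable description, suitable for an error message.
(parser-buffer-position-string after)
```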


14.14 Parser Language

Although it is possible to write parsers using the parser-buffer abstraction (see Parser Buffers), it is tedious. The problem is that the abstraction isn’t closely matched to the way that people think about syntactic structures. In this section, we introduce a higher-level mechanism that greatly simplifies the implementation of a parser.

The parser language described here allows the programmer to write BNF-like specifications that are translated into efficient Scheme code at compile time. The language is declarative, but it can be freely mixed with Scheme code; this allows the parsing of grammars that aren’t conveniently described in the language.

The language also provides backtracking. For example, this expression matches any sequence of alphanumeric characters followed by a single alphabetic character:

(*matcher
 (seq (* (char-set char-set:alphanumeric))
      (char-set char-set:alphabetic)))

This works as follows. The matcher matches alphanumeric characters in the input stream until it finds a non-alphanumeric character. It then tries to match an alphabetic character, which fails because that character is not alphanumeric. At this point, if it matched at least one alphanumeric character, it backtracks: the last matched alphanumeric is “unmatched”, and it again attempts to match an alphabetic character. The backtracking can be arbitrarily deep; the matcher will continue to back up until it finds a way to match the remainder of the expression.
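To try the matcher, it can be bound to a name and applied to a parser buffer. The sketch below assumes that the parser language is provided by the *parser option and that a *matcher form evaluates to a procedure that accepts a parser buffer and returns a boolean. The name match-word-ending-in-letter is hypothetical. Because *matcher is a macro, the load-option form must be evaluated before the definition is processed, as happens at the REPL or when the file is loaded form by form.

```scheme
;; Assumption: the parser language is an option that must be loaded.
(load-option '*parser)

;; Hypothetical name for the matcher shown above.
(define match-word-ending-in-letter
  (*matcher
   (seq (* (char-set char-set:alphanumeric))
        (char-set char-set:alphabetic))))

;; The repetition first consumes "abc1", then backs up until an
;; alphabetic character can follow it.
(match-word-ending-in-letter (string->parser-buffer "abc1 "))

;; No amount of backing up helps here: there is no alphabetic character.
(match-word-ending-in-letter (string->parser-buffer "123 "))
```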

So far, this sounds a lot like regular-expression matching (see Regular Expressions). However, there are some important differences.

Here is an example that shows off several of the features of the parser language. The example is a parser for XML start tags:

(*parser
 (with-pointer p
   (seq "<"
        parse-name
        parse-attribute-list
        (alt (match ">")
             (match "/>")
             (sexp
              (lambda (b)
                (error
                 (string-append
                  "Unterminated start tag at "
                  (parser-buffer-position-string p)))))))))

This shows that the basic description of a start tag is very similar to its BNF. Non-terminal symbols parse-name and parse-attribute-list do most of the work, and the noise strings "<" and ">" are the syntactic markers delimiting the form. There are two alternate endings for start tags, an