This manual documents Guile version 2.0.5.
Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005, 2009, 2010, 2011, 2012 Free Software Foundation.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License.”
This manual describes how to use Guile, GNU's Ubiquitous Intelligent Language for Extensions. It relates particularly to Guile version 2.0.5.
Like Guile itself, the Guile reference manual is a living entity, cared for by many people over a long period of time. As such, it is hard to identify individuals of whom to say “yes, this person, she wrote the manual.”
Still, among the many contributions, some caretakers stand out. First among them is Neil Jerram, who has been working on this document for ten years now. Neil's attention both to detail and to the big picture has made a real difference in the understanding of a generation of Guile hackers.
Next we should note Marius Vollmer's effect on this document. Marius maintained Guile during a period in which Guile's API was clarified—put to the fire, so to speak—and he had the good sense to effect the same change on the manual.
Martin Grabmueller made substantial contributions throughout the manual in preparation for the Guile 1.6 release, including filling out a lot of the documentation of Scheme data types, control mechanisms and procedures. In addition, he wrote the documentation for Guile's SRFI modules and modules associated with the Guile REPL.
Ludovic Courtès and Andy Wingo, the Guile maintainers at the time of this writing (late 2010), have also made their dent in the manual, writing documentation for new modules and subsystems in Guile 2.0. They are also responsible for ensuring that the existing text retains its relevance as Guile evolves. See Reporting Bugs, for more information on reporting problems in this manual.
The content for the first versions of this manual incorporated and was inspired by documents from Aubrey Jaffer, author of the SCM system on which Guile was based, and from Tom Lord, Guile's first maintainer. Although most of this text has been rewritten, all of it was important, and some of the structure remains.
The manual for the first versions of Guile was largely written, edited, and compiled by Mark Galassi and Jim Blandy. In particular, Jim wrote the original tutorial on Guile's data representation and the C API for accessing Guile objects.
Significant portions were also contributed by Thien-Thi Nguyen, Kevin Ryde, Mikael Djurfeldt, Christian Lynbech, Julian Graham, Gary Houston, Tim Pierce, and a few dozen more. You, reader, are most welcome to join their esteemed ranks. Visit Guile's web site at http://www.gnu.org/software/guile/ to find out how to get involved.
Guile is Free Software. Guile is copyrighted, not public domain, and there are restrictions on its distribution or redistribution, but these restrictions are designed to permit everything a cooperating person would want to do.
C code linking to the Guile library is subject to the terms of that library. Basically such code may be published on any terms, provided users can re-link against a new or modified version of Guile.
C code linking to the Guile readline module is subject to the terms of that module. Basically such code must be published on Free terms.
Scheme level code written to be run by Guile (but not derived from Guile itself) is not restricted in any way, and may be published on any terms. We encourage authors to publish on Free terms.
You must be aware there is no warranty whatsoever for Guile. This is described in full in the licenses.
Guile is an implementation of the Scheme programming language. Scheme (http://schemers.org/) is an elegant and conceptually simple dialect of Lisp, originated by Guy Steele and Gerald Sussman, and since evolved by the series of reports known as RnRS (the Revised^n Reports on Scheme).
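Unlike, for example, Python or Perl, Scheme has no benevolent dictator. There are many Scheme implementations, with different characteristics and with communities and academic activities around them, and the language develops as a result of the interplay between these. Guile's particular characteristics are that it is easy to combine with other code written in C, it has a historical and continuing connection with the GNU Project, it emphasizes interactive and incremental programming, and it supports several languages, not just Scheme.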
The next few sections explain what we mean by these points. The sections after that cover how you can obtain and install Guile, and the typographical conventions that we use in this manual.
Guile implements Scheme as described in the Revised^5 Report on the Algorithmic Language Scheme (usually known as R5RS), providing clean and general data and control structures. Guile goes beyond the rather austere language presented in R5RS, extending it with a module system, full access to POSIX system calls, networking support, multiple threads, dynamic linking, a foreign function call interface, powerful string processing, and many other features needed for programming in the real world.
The Scheme community has recently agreed and published R6RS, the latest installment in the RnRS series. R6RS significantly expands the core Scheme language, and standardises many non-core functions that implementations—including Guile—have previously done in different ways. Guile has been updated to incorporate some of the features of R6RS, and to adjust some existing features to conform to the R6RS specification, but it is by no means a complete R6RS implementation. See R6RS Support.
Between R5RS and R6RS, the SRFI process (http://srfi.schemers.org/) standardised interfaces for many practical needs, such as multithreaded programming and multidimensional arrays. Guile supports many SRFIs, as documented in detail in SRFI Support.
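As a small sketch of what SRFI support looks like in practice, the following loads SRFI-1 (the list library) through Guile's module system and uses two of its procedures. The variable names and list contents are made up for illustration.

```scheme
;; Load SRFI-1, the list library, as an ordinary Guile module.
(use-modules (srfi srfi-1))

;; `fold' combines the elements of a list with a procedure and a seed.
(define total (fold + 0 '(1 2 3 4)))        ; ⇒ 10

;; `filter' keeps the elements that satisfy a predicate;
;; `iota' builds the list (0 1 2 ... 9).
(define evens (filter even? (iota 10)))     ; ⇒ (0 2 4 6 8)
```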
In summary, so far as relationship to the Scheme standards is concerned, Guile is an R5RS implementation with many extensions, some of which conform to SRFIs or to the relevant parts of R6RS.
Like a shell, Guile can run interactively—reading expressions from the user, evaluating them, and displaying the results—or as a script interpreter, reading and executing Scheme code from a file. Guile also provides an object library, libguile, that allows other applications to easily incorporate a complete Scheme interpreter. An application can then use Guile as an extension language, a clean and powerful configuration language, or as multi-purpose “glue”, connecting primitives provided by the application. It is easy to call Scheme code from C code and vice versa, giving the application designer full control of how and when to invoke the interpreter. Applications can add new functions, data types, control structures, and even syntax to Guile, creating a domain-specific language tailored to the task at hand, but based on a robust language design.
This kind of combination is helped by four aspects of Guile's design and history. First is that Guile has always been targeted as an extension language. Hence its C API has always been of great importance, and has been developed accordingly. Second and third are rather technical points: Guile uses conservative garbage collection, and it implements the Scheme concept of continuations by copying and reinstating the C stack. Their practical consequence is that most existing C code can be glued into Guile as is, without needing modifications to cope with strange Scheme execution flows. Last is the module system, which helps extensions to coexist without stepping on each other's toes.
Guile's module system allows one to break up a large program into manageable sections with well-defined interfaces between them. Modules may contain a mixture of interpreted and compiled code; Guile can use either static or dynamic linking to incorporate compiled code. Modules also encourage developers to package up useful collections of routines for general distribution; as of this writing, one can find Emacs interfaces, database access routines, compilers, GUI toolkit interfaces, and HTTP client functions, among others.
Guile was conceived by the GNU Project following the fantastic success of Emacs Lisp as an extension language within Emacs. Just as Emacs Lisp allowed complete and unanticipated applications to be written within the Emacs environment, the idea was that Guile should do the same for other GNU Project applications. This remains true today.
The idea of extensibility is closely related to the GNU project's primary goal, that of promoting software freedom. Software freedom means that people receiving a software package can modify or enhance it to their own desires, including in ways that may not have occurred at all to the software's original developers. For programs written in a compiled language like C, this freedom covers modifying and rebuilding the C code; but if the program also provides an extension language, that is usually a much friendlier and lower-barrier-of-entry way for the user to start making their own changes.
Guile is now used by GNU project applications such as AutoGen, Lilypond, Denemo, Mailutils, TeXmacs and Gnucash, and we hope that there will be many more in future.
Non-free software has no interest in its users being able to see how it works. They are supposed to just accept it, or to report problems and hope that the source code owners will choose to work on them.
Free software aims to work reliably just as much as non-free software does, but it should also empower its users by making its workings available. This is useful for many reasons, including education, auditing and enhancements, as well as for debugging problems.
The ideal free software system achieves this by making it easy for interested
users to see the source code for a feature that they are using, and to follow
through that source code step-by-step, as it runs. In Emacs, good examples of
this are the source code hyperlinks in the help system, and edebug.
Then, for bonus points and maximising the ability for the user to experiment
quickly with code changes, the system should allow parts of the source code to
be modified and reloaded into the running program, to take immediate effect.
Guile is designed for this kind of interactive programming, and this distinguishes it from many Scheme implementations that instead prioritise running a fixed Scheme program as fast as possible—because there are tradeoffs between performance and the ability to modify parts of an already running program. There are faster Schemes than Guile, but Guile is a GNU project and so prioritises the GNU vision of programming freedom and experimentation.
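A minimal sketch of this style of working, assuming the definitions are entered at top level (for example at the REPL); the procedure names greeting and welcome are hypothetical. Redefining one procedure changes the behaviour of its callers without restarting anything.

```scheme
(define (greeting name)
  (string-append "Hello, " name))

;; `welcome' looks up `greeting' each time it is called.
(define (welcome name)
  (string-append (greeting name) "!"))

(define before (welcome "world"))   ; ⇒ "Hello, world!"

;; Redefine `greeting' while the program keeps running.
(define (greeting name)
  (string-append "Good day, " name))

(define after (welcome "world"))    ; ⇒ "Good day, world!"
```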
Since the 2.0 release, Guile's architecture supports compiling any language to its core virtual machine bytecode, and Scheme is just one of the supported languages. Other supported languages are Emacs Lisp, ECMAScript (commonly known as JavaScript) and Brainfuck, and work is under discussion for Lua, Ruby and Python.
This means that users can program applications which use Guile in the language of their choice, rather than having the tastes of the application's author imposed on them.
Guile can be obtained from the main GNU archive site ftp://ftp.gnu.org or any of its mirrors. The file will be named guile-version.tar.gz. The current version is 2.0.5, so the file you should grab is:
ftp://ftp.gnu.org/gnu/guile/guile-2.0.5.tar.gz
To unbundle Guile use the instruction
zcat guile-2.0.5.tar.gz | tar xvf -
which will create a directory called guile-2.0.5 with all the sources. You can look at the file INSTALL for detailed instructions on how to build and install Guile, but you should be able to just do
cd guile-2.0.5
./configure
make
make install
This will install the Guile executable guile, the Guile library libguile and various associated header files and support libraries. It will also install the Guile reference manual.
Since this manual frequently refers to the Scheme “standard”, also known as R5RS, or the “Revised^5 Report on the Algorithmic Language Scheme”, we have included the report in the Guile distribution; see Introduction. This will also be installed in your info directory.
The rest of this manual is organised into chapters. Among them, one covers how to run the guile program from the command-line and how to write scripts in Scheme. It also introduces the extensions that Guile offers beyond standard Scheme.
We use some conventions in this manual.
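For some procedures, notably type predicates, we use “iff” to mean “if and only if”. The construct is usually something like “Return val iff condition”, where val is usually “#t” or “non-#f”. This typically means that val is returned if condition holds, and that ‘#f’ is returned otherwise. To clarify: val will only be returned when condition is true.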
The symbol ‘⇒’ is used to tell which value is returned by an evaluation:
(+ 1 2)
⇒ 3
Some procedures produce some output besides returning a value. This is denoted by the symbol ‘-|’.
(begin (display 1) (newline) 'hooray)
-| 1
⇒ hooray
As you can see, this code prints ‘1’ (denoted by
‘-|’), and returns hooray (denoted by
‘⇒’).
This chapter presents a quick tour of all the ways that Guile can be used. There are additional examples in the examples/ directory in the Guile source distribution. It also explains how best to report any problems that you find.
The following examples assume that Guile has been installed in
/usr/local/.
In its simplest form, Guile acts as an interactive interpreter for the
Scheme programming language, reading and evaluating Scheme expressions
the user enters from the terminal. Here is a sample interaction between
Guile and a user; the user's input appears after the $ and
scheme@(guile-user)> prompts:
$ guile
scheme@(guile-user)> (+ 1 2 3) ; add some numbers
$1 = 6
scheme@(guile-user)> (define (factorial n) ; define a function
(if (zero? n) 1 (* n (factorial (- n 1)))))
scheme@(guile-user)> (factorial 20)
$2 = 2432902008176640000
scheme@(guile-user)> (getpwnam "root") ; look in /etc/passwd
$3 = #("root" "x" 0 0 "root" "/root" "/bin/bash")
scheme@(guile-user)> C-d
$
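The vector-like object returned by getpwnam does not have to be picked apart by position. As a sketch, assuming the system has a "root" account, Guile's passwd: accessors can be used on it; the variable name root-entry is made up for this example, and the directory and shell values depend on the system.

```scheme
;; Look up the password database entry for "root".
(define root-entry (getpwnam "root"))

(passwd:name root-entry)    ; the login name, "root"
(passwd:uid root-entry)     ; the numeric user ID
(passwd:dir root-entry)     ; the home directory
(passwd:shell root-entry)   ; the login shell
```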
Like AWK, Perl, or any shell, Guile can interpret script files. A Guile script is simply a file of Scheme code with some extra information at the beginning which tells the operating system how to invoke Guile, and then tells Guile how to handle the Scheme code.
Here is a trivial Guile script. See Guile Scripting, for more details.
#!/usr/local/bin/guile -s
!#
(display "Hello, world!")
(newline)
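Scripts usually want their command-line arguments as well. The following variant is a sketch under the same installation assumption (/usr/local/bin/guile); the helper name show-arguments is hypothetical. It relies on the command-line procedure, which returns the script name followed by the arguments.

```scheme
#!/usr/local/bin/guile -s
!#
;; Print each element of a list of strings on its own line.
(define (show-arguments args)
  (for-each (lambda (arg)
              (display arg)
              (newline))
            args))

;; Skip the first element, which is the name of the script itself.
(show-arguments (cdr (command-line)))
```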
The Guile interpreter is available as an object library, to be linked into applications using Scheme as a configuration or extension language.
Here is simple-guile.c, source code for a program that will
produce a complete Guile interpreter. In addition to all usual
functions provided by Guile, it will also offer the function
my-hostname.
#include <stdlib.h>
#include <libguile.h>

static SCM
my_hostname (void)
{
  char *s = getenv ("HOSTNAME");
  if (s == NULL)
    return SCM_BOOL_F;
  else
    return scm_from_locale_string (s);
}

static void
inner_main (void *data, int argc, char **argv)
{
  scm_c_define_gsubr ("my-hostname", 0, 0, 0, my_hostname);
  scm_shell (argc, argv);
}

int
main (int argc, char **argv)
{
  scm_boot_guile (argc, argv, inner_main, 0);
  return 0; /* never reached */
}
When Guile is correctly installed on your system, the above program can be compiled and linked like this:
$ gcc -o simple-guile simple-guile.c \
`pkg-config --cflags --libs guile-2.0`
When it is run, it behaves just like the guile program except
that you can also call the new my-hostname function.
$ ./simple-guile
scheme@(guile-user)> (+ 1 2 3)
$1 = 6
scheme@(guile-user)> (my-hostname)
"burns"
You can link Guile into your program and make Scheme available to the users of your program. You can also link your library into Guile and make its functionality available to all users of Guile.
A library that is linked into Guile is called an extension, but it really just is an ordinary object library.
The following example shows how to write a simple extension for Guile
that makes the j0 function available to Scheme code.
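#include <math.h>
#include <libguile.h>

/* Wrap the C library's j0: convert the Scheme number to a double,
   call j0, and convert the result back to a Scheme real.  */
SCM
j0_wrapper (SCM x)
{
  return scm_from_double (j0 (scm_to_double (x)));
}

/* Called by load-extension to register the Scheme procedure j0.  */
void
init_bessel (void)
{
  scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
}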
This C source file needs to be compiled into a shared library. Here is how to do it on GNU/Linux:
gcc `pkg-config --cflags guile-2.0` \
-shared -o libguile-bessel.so -fPIC bessel.c
For creating shared libraries portably, we recommend the use of GNU Libtool (see Introduction).
A shared library can be loaded into a running Guile process with the
function load-extension. The j0 function is then immediately
available:
$ guile
scheme@(guile-user)> (load-extension "./libguile-bessel" "init_bessel")
scheme@(guile-user)> (j0 2)
$1 = 0.223890779141236
For more on how to install your extension, see Installing Site Packages.
Guile has support for dividing a program into modules. By using modules, you can group related code together and manage the composition of complete programs from largely independent parts.
For more details on the module system beyond this introductory material, see Modules.
Guile comes with a lot of useful modules, for example for string processing or command line parsing. Additionally, there exist many Guile modules written by other Guile hackers, but which have to be installed manually.
Here is a sample interactive session that shows how to use the
(ice-9 popen) module which provides the means for communicating
with other processes over pipes together with the (ice-9
rdelim) module that provides the function read-line.
$ guile
scheme@(guile-user)> (use-modules (ice-9 popen))
scheme@(guile-user)> (use-modules (ice-9 rdelim))
scheme@(guile-user)> (define p (open-input-pipe "ls -l"))
scheme@(guile-user)> (read-line p)
$1 = "total 30"
scheme@(guile-user)> (read-line p)
$2 = "drwxr-sr-x 2 mgrabmue mgrabmue 1024 Mar 29 19:57 CVS"
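The same two modules can be combined into a small helper that collects all of a command's output and then closes the pipe. This is only a sketch: the name command-lines is hypothetical, and it assumes a POSIX shell with an echo command is available.

```scheme
(use-modules (ice-9 popen) (ice-9 rdelim))

;; Run COMMAND in a subprocess and return its output as a list of lines.
(define (command-lines command)
  (let ((port (open-input-pipe command)))
    (let loop ((line (read-line port))
               (lines '()))
      (if (eof-object? line)
          (begin
            (close-pipe port)          ; wait for the subprocess to finish
            (reverse lines))
          (loop (read-line port) (cons line lines))))))

(command-lines "echo hello")           ; ⇒ ("hello")
```

Closing the port with close-pipe rather than leaving it open lets Guile reap the child process once its output has been read.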
You can create new modules using the syntactic form
define-module. All definitions following this form until the
next define-module are placed into the new module.
One module is usually placed into one file, and that file is installed in a location where Guile can automatically find it. The following session shows a simple example.
$ cat /usr/local/share/guile/site/foo/bar.scm
(define-module (foo bar)
#:export (frob))
(define (frob x) (* 2 x))
$ guile
scheme@(guile-user)> (use-modules (foo bar))
scheme@(guile-user)> (frob 12)
$1 = 24
For more on how to install your module, see Installing Site Packages.
In addition to Scheme code you can also put things that are defined in C into a module.
You do this by writing a small Scheme file that defines the module and
calls load-extension directly in the body of the module.
$ cat /usr/local/share/guile/site/math/bessel.scm
(define-module (math bessel)
#:export (j0))
(load-extension "libguile-bessel" "init_bessel")
$ file /usr/local/lib/guile/2.0/extensions/libguile-bessel.so
... ELF 32-bit LSB shared object ...
$ guile
scheme@(guile-user)> (use-modules (math bessel))
scheme@(guile-user)> (j0 2)
$1 = 0.223890779141236
See Modules and Extensions, for more information.
Any problems with the installation should be reported to bug-guile@gnu.org.
If you find a bug in Guile, please report it to the Guile developers, so they can fix it. They may also be able to suggest workarounds when it is not possible for you to apply the bug-fix or install a new version of Guile yourself.
Before sending in bug reports, please check with the following list that you really have found a bug.
Before reporting the bug, check whether any programs you have loaded
into Guile, including your .guile file, set any variables that
may affect the functioning of Guile. Also, see whether the problem
happens in a freshly started Guile without loading your .guile
file (start Guile with the -q switch to prevent loading the init
file). If the problem does not occur then, you must report the
precise contents of any programs that you must load into Guile in order
to cause the problem to occur.
When you write a bug report, please make sure to include as much of the information described below as possible. If you can't figure out some of the items, it is not a problem, but the more information we get, the more likely we can diagnose and fix the bug.
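The version number of Guile. You can get this information by calling (version) from within Guile.
Your machine type, as determined by the config.guess shell script. If you have a Guile checkout, this file is located in build-aux; otherwise you can fetch the latest version from http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess;hb=HEAD.
$ build-aux/config.guess
x86_64-unknown-linux-gnu
If you installed Guile from a binary package, the version of that package. On systems that use RPM, use rpm -qa | grep guile. On systems that use DPKG, dpkg -l | grep guile.
If you built Guile yourself, the build configuration that you used:
$ ./config.status --config
'--enable-error-on-warning' '--disable-deprecated'...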
If you have a Scheme program that produces the bug, please include it in the bug report. If your program is too big to include, please try to reduce your code to a minimal test case.
If you can reproduce your problem at the REPL, that is best. Give a transcript of the expressions you typed at the REPL.
If the manifestation of the bug is a Guile error message, it is
important to report the precise text of the error message, and a
backtrace showing how the Scheme program arrived at the error. This can
be done using the ,backtrace command in Guile's debugger.
If your bug causes Guile to crash, additional information from a
low-level debugger such as GDB might be helpful. If you have built Guile
yourself, you can run Guile under GDB via the
meta/gdb-uninstalled-guile script. Instead of invoking Guile as
usual, invoke the wrapper script, type run to start the process,
then backtrace when the crash comes. Include that backtrace in
your report.
In this chapter, we introduce the basic concepts that underpin the elegance and power of the Scheme language.
Readers who already possess a background knowledge of Scheme may happily skip this chapter. For the reader who is new to the language, however, the following discussions on data, procedures, expressions and closure are designed to provide a minimum level of Scheme understanding that is more or less assumed by the chapters that follow.
The style of this introductory material aims about halfway between the terse precision of R5RS and the discursiveness of existing Scheme tutorials. For pointers to useful Scheme resources on the web, please see Further Reading.
This section discusses the representation of data types and values, what it means for Scheme to be a latently typed language, and the role of variables. We conclude by introducing the Scheme syntaxes for defining a new variable, and for changing the value of an existing variable.
The term latent typing is used to describe a computer language, such as Scheme, for which you cannot, in general, simply look at a program's source code and determine what type of data will be associated with a particular variable, or with the result of a particular expression.
Sometimes, of course, you can tell from the code what the type of
an expression will be. If you have a line in your program that sets the
variable x to the numeric value 1, you can be certain that,
immediately after that line has executed (and in the absence of multiple
threads), x has the numeric value 1. Or if you write a procedure
that is designed to concatenate two strings, it is likely that the rest
of your application will always invoke this procedure with two string
parameters, and quite probable that the procedure would go wrong in some
way if it was ever invoked with parameters that were not both strings.
Nevertheless, the point is that there is nothing in Scheme which
requires the procedure parameters always to be strings, or x
always to hold a numeric value, and there is no way of declaring in your
program that such constraints should always be obeyed. In the same
vein, there is no way to declare the expected type of a procedure's
return value.
Instead, the types of variables and expressions are only known – in general – at run time. If you need to check at some point that a value has the expected type, Scheme provides run time procedures that you can invoke to do so. But equally, it can be perfectly valid for two separate invocations of the same procedure to specify arguments with different types, and to return values with different types.
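The run time checks mentioned here are ordinary predicates such as number?, string? and procedure?. The following sketch uses them in a hypothetical procedure, describe, that accepts an argument of any type and reports what it was given.

```scheme
;; Return a short description of VALUE based on run time type checks.
(define (describe value)
  (cond ((number? value) "a number")
        ((string? value) "a string")
        ((procedure? value) "a procedure")
        ((pair? value) "a pair")
        (else "something else")))

(describe 42)            ; ⇒ "a number"
(describe "forty-two")   ; ⇒ "a string"
(describe car)           ; ⇒ "a procedure"
```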
The next subsection explains what this means in practice, for the ways that Scheme programs use data types, values and variables.
Scheme provides many data types that you can use to represent your data. Primitive types include characters, strings, numbers and procedures. Compound types, which allow a group of primitive and compound values to be stored together, include lists, pairs, vectors and multi-dimensional arrays. In addition, Guile allows applications to define their own data types, with the same status as the built-in standard Scheme types.
As a Scheme program runs, values of all types pop in and out of existence. Sometimes values are stored in variables, but more commonly they pass seamlessly from being the result of one computation to being one of the parameters for the next.
Consider an example. A string value is created because the interpreter reads in a literal string from your program's source code. Then a numeric value is created as the result of calculating the length of the string. A second numeric value is created by doubling the calculated length. Finally the program creates a list with two elements – the doubled length and the original string itself – and stores this list in a program variable.
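The sequence just described might be written as follows. This is only a sketch; the literal string and the names text, len, doubled and summary are chosen for illustration.

```scheme
(define summary
  (let* ((text "hello")               ; string value read from the source code
         (len (string-length text))   ; first numeric value: 5
         (doubled (* 2 len)))         ; second numeric value: 10
    ;; A list of two elements, stored in the variable `summary'.
    (list doubled text)))

summary                               ; ⇒ (10 "hello")
```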
All of the values involved here – in fact, all values in Scheme – carry their type with them. In other words, every value “knows,” at runtime, what kind of value it is. A number, a string, a list, whatever.
A variable, on the other hand, has no fixed type. A variable –
x, say – is simply the name of a location – a box – in which
you can store any kind of Scheme value. So the same variable in a
program may hold a number at one moment, a list of procedures the next,
and later a pair of strings. The “type” of a variable – insofar as
the idea is meaningful at all – is simply the type of whatever value
the variable happens to be storing at a particular moment.
To define a new variable, you use Scheme's define syntax like
this:
(define variable-name value)
This makes a new variable called variable-name and stores value in it as the variable's initial value. For example:
;; Make a variable `x' with initial numeric value 1.
(define x 1)
;; Make a variable `organization' with an initial string value.
(define organization "Free Software Foundation")
(In Scheme, a semicolon marks the beginning of a comment that continues
until the end of the line. So the lines beginning ;; are
comments.)
Changing the value of an already existing variable is very similar,
except that define is replaced by the Scheme syntax set!,
like this:
(set! variable-name new-value)
Remember that variables do not have fixed types, so new-value may have a completely different type from whatever was previously stored in the location named by variable-name. Both of the following examples are therefore correct.
;; Change the value of `x' to 5.
(set! x 5)
;; Change the value of `organization' to the FSF's street number.
(set! organization 545)
In these examples, value and new-value are literal numeric
or string values. In general, however, value and new-value
can be any Scheme expression. Even though we have not yet covered the
forms that Scheme expressions can take (see About Expressions), you
can probably guess what the following set! example does...
(set! x (+ x 1))
(Note: this is not a complete description of define and
set!, because we need to introduce some other aspects of Scheme
before the missing pieces can be filled in. If, however, you are
already familiar with the structure of Scheme, you may like to read
about those missing pieces immediately by jumping ahead to the following
references.
A form of the define syntax that can be used when defining new procedures.
A form of the set! syntax that helps with changing a single value in the depths
of a compound data structure.
Using define other
than at top level in a Scheme program, including a discussion of when it
works to use define rather than set! to change the value
of an existing variable.)
This section introduces the basics of using and creating Scheme
procedures. It discusses the representation of procedures as just
another kind of Scheme value, and shows how procedure invocation
expressions are constructed. We then explain how lambda is used
to create new procedures, and conclude by presenting the various
shorthand forms of define that can be used instead of writing an
explicit lambda expression.
One of the great simplifications of Scheme is that a procedure is just
another type of value, and that procedure values can be passed around
and stored in variables in exactly the same way as, for example, strings
and lists. When we talk about a built-in standard Scheme procedure such
as open-input-file, what we actually mean is that there is a
pre-defined top level variable called open-input-file, whose
value is a procedure that implements what R5RS says that
open-input-file should do.
Note that this is quite different from many dialects of Lisp — including Emacs Lisp — in which a program can use the same name with two quite separate meanings: one meaning identifies a Lisp function, while the other meaning identifies a Lisp variable, whose value need have nothing to do with the function that is associated with the first meaning. In these dialects, functions and variables are said to live in different namespaces.
In Scheme, on the other hand, all names belong to a single unified namespace, and the variables that these names identify can hold any kind of Scheme value, including procedure values.
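A short sketch of what the single namespace allows: a procedure can be stored in a variable, replaced by another procedure, and passed as an argument, just like any other value. The variable names here are hypothetical.

```scheme
;; Store the procedure held by `+' in a new variable and call it.
(define operation +)
(define first-result (operation 2 3))    ; ⇒ 5

;; The same variable can later hold a different procedure.
(set! operation *)
(define second-result (operation 2 3))   ; ⇒ 6

;; Procedures can also be passed as arguments to other procedures.
(define lengths (map string-length '("a" "bc" "def")))   ; ⇒ (1 2 3)
```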
One consequence of the “procedures as values” idea is that, if you don't happen to like the standard name for a Scheme procedure, you can change it.
For example, call-with-current-continuation is a very important
standard Scheme procedure, but it also has a very long name! So, many
programmers use the following definition to assign the same procedure
value to the more convenient name call/cc.
(define call/cc call-with-current-continuation)
Let's understand exactly how this works. The definition creates a new
variable call/cc, and then sets its value to the value of the
variable call-with-current-continuation; the latter value is a
procedure that implements the behaviour that R5RS specifies under the
name “call-with-current-continuation”. So call/cc ends up
holding this value as well.
Now that call/cc holds the required procedure value, you could
choose to use call-with-current-continuation for a completely
different purpose, or just change its value so that you will get an
error if you accidentally use call-with-current-continuation as a
procedure in your program rather than call/cc. For example:
(set! call-with-current-continuation "Not a procedure any more!")
Or you could just leave call-with-current-continuation as it was.
It's perfectly fine for more than one variable to hold the same
procedure value.
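The same behaviour can be tried safely with a procedure of your own rather than a standard one. In this sketch, double and twice are hypothetical names; after the original name is given a different value, the second variable still holds the procedure.

```scheme
(define (double n) (* 2 n))

;; `twice' now holds the same procedure value as `double'.
(define twice double)
(define same? (eq? twice double))        ; ⇒ #t

;; Changing what `double' holds does not affect `twice'.
(set! double "Not a procedure any more!")
(define answer (twice 21))               ; ⇒ 42
```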
A procedure invocation in Scheme is written like this:
(procedure [arg1 [arg2 ...]])
In this expression, procedure can be any Scheme expression whose value is a procedure. Most commonly, however, procedure is simply the name of a variable whose value is a procedure.
For example, string-append is a standard Scheme procedure whose
behaviour is to concatenate together all the arguments, which are
expected to be strings, that it is given. So the expression
(string-append "/home" "/" "andrew")
is a procedure invocation whose result is the string value
"/home/andrew".
Similarly, string-length is a standard Scheme procedure that
returns the length of a single string argument, so
(string-length "abc")
is a procedure invocation whose result is the numeric value 3.
Each of the parameters in a procedure invocation can itself be any Scheme expression. Since a procedure invocation is itself a type of expression, we can put these two examples together to get
(string-length (string-append "/home" "/" "andrew"))
— a procedure invocation whose result is the numeric value 12.
(You may be wondering what happens if the two examples are combined the other way round. If we do this, we can make a procedure invocation expression that is syntactically correct:
(string-append "/home" (string-length "abc"))
but when this expression is executed, it will cause an error, because
the result of (string-length "abc") is a numeric value, and
string-append is not designed to accept a numeric value as one of
its arguments.)
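One way to make the second combination work, sketched here as an illustration, is to convert the number to a string first with the standard procedure number->string. The variable name combined is hypothetical.

```scheme
;; string-length yields the number 3; number->string turns it into
;; the string "3", which string-append can accept.
(define combined
  (string-append "/home" (number->string (string-length "abc"))))
;; combined ⇒ "/home3"
```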
Scheme has lots of standard procedures, and Guile provides all of these via predefined top level variables. All of these standard procedures are documented in the later chapters of this reference manual.
Before very long, though, you will want to create new procedures that
encapsulate aspects of your own applications' functionality. To do
this, you can use the famous lambda syntax.
For example, the value of the following Scheme expression
(lambda (name address) expression ...)
is a newly created procedure that takes two arguments:
name and address. The behaviour of the
new procedure is determined by the sequence of expressions in the
body of the procedure definition. (Typically, these
expressions would use the arguments in some way, or else there
wouldn't be any point in giving them to the procedure.) When invoked,
the new procedure returns a value that is the value of the last
expression in the procedure body.
To make things more concrete, let's suppose that the two arguments are both strings, and that the purpose of this procedure is to form a combined string that includes these arguments. Then the full lambda expression might look like this:
(lambda (name address)
  (string-append "Name=" name ":Address=" address))
We noted in the previous subsection that the procedure part of a procedure invocation expression can be any Scheme expression whose value is a procedure. But that's exactly what a lambda expression is! So we can use a lambda expression directly in a procedure invocation, like this:
((lambda (name address)
   (string-append "Name=" name ":Address=" address))
 "FSF"
 "Cambridge")
This is a valid procedure invocation expression, and its result is the string:
"Name=FSF:Address=Cambridge"
It is more common, though, to store the procedure value in a variable —
(define make-combined-string
  (lambda (name address)
    (string-append "Name=" name ":Address=" address)))
— and then to use the variable name in the procedure invocation:
(make-combined-string "FSF" "Cambridge")
This has exactly the same result.
It's important to note that procedures created using lambda have
exactly the same status as the standard built in Scheme procedures, and
can be invoked, passed around, and stored in variables in exactly the
same ways.
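A short sketch may help here: a procedure created with lambda can be handed to the standard procedure map in exactly the same way as a built-in procedure such as string-length. The names square, squares and lengths are hypothetical.

```scheme
;; A lambda procedure stored in a variable...
(define square (lambda (x) (* x x)))

;; ...is passed to map just like a built-in procedure would be.
(define squares (map square '(1 2 3)))              ; ⇒ (1 4 9)
(define lengths (map string-length '("a" "bcd")))   ; ⇒ (1 3)
```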
Since it is so common in Scheme programs to want to create a procedure
and then store it in a variable, there is an alternative form of the
define syntax that allows you to do just that.
A define expression of the form
(define (name [arg1 [arg2 ...]])
  expression ...)
is exactly equivalent to the longer form
(define name
  (lambda ([arg1 [arg2 ...]])
    expression ...))
So, for example, the definition of make-combined-string in the
previous subsection could equally be written:
(define (make-combined-string name address)
  (string-append "Name=" name ":Address=" address))
This kind of procedure definition creates a procedure that requires
exactly the expected number of arguments. There are two further forms
of the lambda expression, which create a procedure that can
accept a variable number of arguments:
(lambda (arg1 ... . args) expression ...)
(lambda args expression ...)
The corresponding forms of the alternative define syntax are:
(define (name arg1 ... . args) expression ...)
(define (name . args) expression ...)
For details on how these forms work, see Lambda.
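As a rough illustration of the variable-argument forms, the sketch below defines one procedure with a required argument plus a rest list, and one that accepts any number of arguments. The procedure names sum-at-least-one and count-args are hypothetical.

```scheme
;; x is required; any further arguments are collected in the list more.
(define (sum-at-least-one x . more)
  (apply + x more))

;; All arguments, possibly none, are collected in the list args.
(define count-args
  (lambda args (length args)))

(define total (sum-at-least-one 1 2 3))   ; ⇒ 6
(define none  (count-args))               ; ⇒ 0
(define two   (count-args 'a 'b))         ; ⇒ 2
```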
(It could be argued that the alternative define forms are rather
confusing, especially for newcomers to the Scheme language, as they hide
both the role of lambda and the fact that procedures are values
that are stored in variables in the same way as any other kind of value.
On the other hand, they are very convenient, and they are also a good
example of another of Scheme's powerful features: the ability to specify
arbitrary syntactic transformations at run time, which can be applied to
subsequently read input.)
So far, we have met expressions that do things, such as the
define expressions that create and initialize new variables, and
we have also talked about expressions that have values, for
example the value of the procedure invocation expression:
(string-append "/home" "/" "andrew")
but we haven't yet been precise about what causes an expression like this procedure invocation to be reduced to its “value”, or how the processing of such expressions relates to the execution of a Scheme program as a whole.
This section clarifies what we mean by an expression's value, by introducing the idea of evaluation. It discusses the side effects that evaluation can have, explains how each of the various types of Scheme expression is evaluated, and describes the behaviour and use of the Guile REPL as a mechanism for exploring evaluation. The section concludes with a very brief summary of Scheme's common syntactic expressions.
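In Scheme, the process of executing an expression is known as evaluation. Evaluation has two kinds of result: the value of the evaluated expression, and the side effects of the evaluation, which consist of any effects of evaluating the expression that are not represented by the value.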
Of the expressions that we have met so far, define and
set! expressions have side effects — the creation or
modification of a variable — but no value; lambda expressions
have values — the newly constructed procedures — but no side
effects; and procedure invocation expressions, in general, have either
values, or side effects, or both.
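The distinction can be sketched with a procedure whose invocation has both a value and a side effect. The names counter and bump! are hypothetical and serve only as an illustration.

```scheme
(define counter 0)                ; side effect: creates the variable counter

(define (bump!)
  (set! counter (+ counter 1))    ; side effect: modifies counter
  counter)                        ; value: the updated count

(define first-result (bump!))     ; ⇒ 1
(define second-result (bump!))    ; ⇒ 2
```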
It is tempting to try to define more intuitively what we mean by “value” and “side effects”, and what the difference between them is. In general, though, this is extremely difficult. It is also unnecessary; instead, we can quite happily define the behaviour of a Scheme program by specifying how Scheme executes a program as a whole, and then by describing the value and side effects of evaluation for each type of expression individually.
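So, some definitions...
A Scheme program consists of a sequence of expressions.
A Scheme interpreter executes the program by evaluating these expressions in order, one by one.
An expression can be
a piece of literal data, such as a number 2.3 or a string "Hello world!"
a variable name
a procedure invocation expression
one of Scheme's special syntactic expressions.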
The following subsections describe how each of these types of expression is evaluated.
When a literal data expression is evaluated, the value of the expression is simply the value that the expression describes. The evaluation of a literal data expression has no side effects.
So, for example,
"abc" is the string value "abc"
3+4i is the complex number 3 + 4i
#(1 2 3) is a three-element vector containing the numeric values 1, 2 and 3.
For any data type which can be expressed literally like this, the syntax of the literal data expression for that data type — in other words, what you need to write in your code to indicate a literal value of that type — is known as the data type's read syntax. This manual specifies the read syntax for each such data type in the section that describes that data type.
Some data types do not have a read syntax. Procedures, for example,
cannot be expressed as literal data; they must be created using a
lambda expression (see Creating a Procedure) or implicitly
using the shorthand form of define (see Lambda Alternatives).
When an expression that consists simply of a variable name is evaluated, the value of the expression is the value of the named variable. The evaluation of a variable reference expression has no side effects.
So, after
(define key "Paul Evans")
the value of the expression key is the string value "Paul
Evans". If key is then modified by
(set! key 3.74)
the value of the expression key is the numeric value 3.74.
If there is no variable with the specified name, evaluation of the variable reference expression signals an error.
This is where evaluation starts getting interesting! As already noted, a procedure invocation expression has the form
(procedure [arg1 [arg2 ...]])
where procedure must be an expression whose value, when evaluated, is a procedure.
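The evaluation of a procedure invocation expression like this proceeds by
evaluating individually the expressions procedure, arg1, arg2, and so on
calling the procedure that is the value of the procedure expression with the list of values obtained from the evaluations of arg1, arg2 etc. as its parameters.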
For a procedure defined in Scheme, “calling the procedure with the list of values as its parameters” means binding the values to the procedure's formal parameters and then evaluating the sequence of expressions that make up the body of the procedure definition. The value of the procedure invocation expression is the value of the last evaluated expression in the procedure body. The side effects of calling the procedure are the combination of the side effects of the sequence of evaluations of expressions in the procedure body.
For a built-in procedure, the value and side-effects of calling the procedure are best described by that procedure's documentation.
Note that the complete side effects of evaluating a procedure invocation expression consist not only of the side effects of the procedure call, but also of any side effects of the preceding evaluation of the expressions procedure, arg1, arg2, and so on.
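To illustrate that the side effects of evaluating the argument expressions are part of the whole invocation, here is a small sketch. The helper note and the variable notes are hypothetical; the order in which the two arguments are evaluated is deliberately not relied upon.

```scheme
(define notes '())

;; Records x in notes (a side effect) and returns x (its value).
(define (note x)
  (set! notes (cons x notes))
  x)

;; Both argument expressions are evaluated before + is called.
(define total (+ (note 1) (note 2)))
;; total ⇒ 3, and notes now holds both 1 and 2 (in an unspecified order)
```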
To illustrate this, let's look again at the procedure invocation expression:
(string-length (string-append "/home" "/" "andrew"))
In the outermost expression, procedure is string-length and
arg1 is (string-append "/home" "/" "andrew").
Evaluation of string-length, which is a variable, gives a
procedure value that implements the expected behaviour for
“string-length”.
Evaluation of (string-append "/home" "/" "andrew"), which is
another procedure invocation expression, means evaluating each of
string-append, which gives a procedure value that implements the
expected behaviour for “string-append”
"/home", which gives the string value "/home"
"/", which gives the string value "/"
"andrew", which gives the string value "andrew"
and then invoking the procedure value with this list of string values as
its arguments. The resulting value is a single string value that is the
concatenation of all the arguments, namely "/home/andrew".
In the evaluation of the outermost expression, the interpreter can now invoke the procedure value obtained from procedure with the value obtained from arg1 as its argument. The resulting value is a numeric value that is the length of the argument string, which is 12.
When a procedure invocation expression is evaluated, the procedure and all the argument expressions must be evaluated before the procedure can be invoked. Special syntactic expressions are special because they are able to manipulate their arguments in an unevaluated form, and can choose whether to evaluate any or all of the argument expressions.
Why is this needed? Consider a program fragment that asks the user whether or not to delete a file, and then deletes the file if the user answers yes.
(if (string=? (read-answer "Should I delete this file?")
              "yes")
    (delete-file file))
If the outermost (if ...) expression here were a procedure
invocation expression, the expression (delete-file file), whose
side effect is to actually delete a file, would already have been
evaluated before the if procedure even got invoked! Clearly this
is no use — the whole point of an if expression is that the
consequent expression is only evaluated if the condition of the
if expression is “true”.
Therefore if must be special syntax, not a procedure. Other
special syntaxes that we have already met are define, set!
and lambda. define and set! are syntax because
they need to know the variable name that is given as the first
argument in a define or set! expression, not that
variable's value. lambda is syntax because it does not
immediately evaluate the expressions that define the procedure body;
instead it creates a procedure object that incorporates these
expressions so that they can be evaluated in the future, when that
procedure is invoked.
The rules for evaluating each special syntactic expression are specified individually for each special syntax. For a summary of standard special syntax, see Syntax Summary.
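The argument about if can be checked with a small experiment. The sketch below wraps if in an ordinary procedure, hypothetically named my-if, and uses a harmless stand-in, pretend-delete!, instead of really deleting a file.

```scheme
(define deleted? #f)

;; Stand-in for delete-file: only records that it was called.
(define (pretend-delete!)
  (set! deleted? #t)
  'deleted)

;; An ordinary procedure: all of its arguments are evaluated before
;; the procedure body runs.
(define (my-if condition consequent alternative)
  (if condition consequent alternative))

(define result (my-if #f (pretend-delete!) 'kept))
;; result ⇒ kept, yet deleted? ⇒ #t, because the "consequent"
;; argument was evaluated even though the condition was false.
```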
Scheme is “properly tail recursive”, meaning that tail calls or recursions from certain contexts do not consume stack space or other resources and can therefore be used on arbitrarily large data or for an arbitrarily long calculation. Consider for example,
(define (foo n)
  (display n)
  (newline)
  (foo (1+ n)))
(foo 1)
-|
1
2
3
...
foo prints numbers infinitely, starting from the given n.
It's implemented by printing n then recursing to itself to print
n+1 and so on. This recursion is a tail call because it's the
last thing done, and in Scheme such tail calls can be made without
limit.
Or consider a case where a value is returned, a version of the SRFI-1
last function (see SRFI-1 Selectors) returning the last
element of a list,
(define (my-last lst)
  (if (null? (cdr lst))
      (car lst)
      (my-last (cdr lst))))
(my-last '(1 2 3)) ⇒ 3
If the list has more than one element, my-last applies itself
to the cdr. This recursion is a tail call: there's no code
after it, and the return value is the return value from that call. In
Scheme this can be used on an arbitrarily long list argument.
A proper tail call is only available from certain contexts, namely the following special form positions,
and — last expression
begin — last expression
case — last expression in each clause
cond — last expression in each clause, and the call to a
=> procedure is a tail call
do — last result expression
if — “true” and “false” leg expressions
lambda — last expression in body
let, let*, letrec, let-syntax,
letrec-syntax — last expression in body
or — last expression
The following core functions make tail calls,
apply — tail call to given procedure
call-with-current-continuation — tail call to the procedure
receiving the new continuation
call-with-values — tail call to the values-receiving
procedure
eval — tail call to evaluate the form
string-any, string-every — tail call to predicate on
the last character (if that point is reached)
The above are just core functions and special forms. Tail calls in other modules are described with the relevant documentation, for example SRFI-1
any and every (see SRFI-1 Searching).
It will be noted there are a lot of places which could potentially be
tail calls, for instance the last call in a for-each, but only
those explicitly described are guaranteed.
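As a further sketch of a tail call in the if position listed above, the following sums the integers from 1 to n using an accumulator. The procedures sum-from and sum-to are hypothetical names. The recursive call is the last thing done in the "false" leg of the if, so it can be repeated a million times without the stack growing.

```scheme
;; acc carries the running total, so nothing remains to be done
;; after the recursive call returns: it is a tail call.
(define (sum-from i n acc)
  (if (> i n)
      acc
      (sum-from (+ i 1) n (+ acc i))))

(define (sum-to n)
  (sum-from 1 n 0))

(define big-sum (sum-to 1000000))   ; ⇒ 500000500000
```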
If you start Guile without specifying a particular program for it to execute, Guile enters its standard Read Evaluate Print Loop — or REPL for short. In this mode, Guile repeatedly reads in the next Scheme expression that the user types, evaluates it, and prints the resulting value.
The REPL is a useful mechanism for exploring the evaluation behaviour
described in the previous subsection. If you type string-append,
for example, the REPL replies #<primitive-procedure
string-append>, illustrating the relationship between the variable
string-append and the procedure value stored in that variable.
In this manual, the notation ⇒ is used to mean “evaluates to”. Wherever you see an example of the form
expression
⇒
result
feel free to try it out yourself by typing expression into the REPL and checking that it gives the expected result.
This subsection lists the most commonly used Scheme syntactic expressions, simply so that you will recognize common special syntax when you see it. For a full description of each of these syntaxes, follow the appropriate reference.
lambda (see Lambda) is used to construct procedure objects.
define (see Top Level) is used to create a new variable and
set its initial value.
set! (see Top Level) is used to modify an existing variable's
value.
let, let* and letrec (see Local Bindings)
create an inner lexical environment for the evaluation of a sequence of
expressions, in which a specified set of local variables is bound to the
values of a corresponding set of expressions. For an introduction to
environments, see About Closure.
begin (see begin) executes a sequence of expressions in order
and returns the value of the last expression. Note that this is not the
same as a procedure which returns its last argument, because the
evaluation of a procedure invocation expression does not guarantee to
evaluate the arguments in order.
if and cond (see Conditionals) provide conditional
evaluation of argument expressions depending on whether one or more
conditions evaluate to “true” or “false”.
case (see Conditionals) provides conditional evaluation of
argument expressions depending on whether a variable has one of a
specified group of values.
and (see and or) executes a sequence of expressions in order
until either there are no expressions left, or one of them evaluates to
“false”.
or (see and or) executes a sequence of expressions in order
until either there are no expressions left, or one of them evaluates to
“true”.
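A few of these syntaxes can be tried out with the small sketch below. The variable names are hypothetical, and the values shown follow from the descriptions above.

```scheme
(define and-all   (and 1 2 3))         ; ⇒ 3, no expression was false
(define and-stop  (and 1 #f 3))        ; ⇒ #f, stops at the false expression
(define or-first  (or #f 2 3))         ; ⇒ 2, stops at the first true value
(define begin-val (begin 'ignored 'last))   ; ⇒ last

(define comparison
  (cond ((> 1 2) 'greater)
        ((< 1 2) 'less)
        (else 'equal)))                ; ⇒ less
```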
The concept of closure is the idea that a lambda expression “captures” the variable bindings that are in lexical scope at the point where the lambda expression occurs. The procedure created by the lambda expression can refer to and mutate the captured bindings, and the values of those bindings persist between procedure calls.
This section explains and explores the various parts of this idea in more detail.
We said earlier that a variable name in a Scheme program is associated with a location in which any kind of Scheme value may be stored. (Incidentally, the term “vcell” is often used in Lisp and Scheme circles as an alternative to “location”.) Thus part of what we mean when we talk about “creating a variable” is in fact establishing an association between a name, or identifier, that is used by the Scheme program code, and the variable location to which that name refers. Although the value that is stored in that location may change, the location to which a given name refers is always the same.
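We can illustrate this by breaking down the operation of the
define syntax into three parts: define
creates a new location
establishes an association between that location and the name specified as the first argument of the define expression
stores in that location the value obtained by evaluating the second argument of the define expression.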
A collection of associations between names and locations is called an
environment. When you create a top level variable in a program
using define, the name-location association for that variable is
added to the “top level” environment. The “top level” environment
also includes name-location associations for all the procedures that are
supplied by standard Scheme.
It is also possible to create environments other than the top level one, and to create variable bindings, or name-location associations, in those environments. This ability is a key ingredient in the concept of closure; the next subsection shows how it is done.
We have seen how to create top level variables using the define
syntax (see Definition). It is often useful to create variables
that are more limited in their scope, typically as part of a procedure
body. In Scheme, this is done using the let syntax, or one of
its modified forms let* and letrec. These syntaxes are
described in full later in the manual (see Local Bindings). Here
our purpose is to illustrate their use just enough that we can see how
local variables work.
For example, the following code uses a local variable s to
simplify the computation of the area of a triangle given the lengths of
its three sides.
(define a 5.3)
(define b 4.7)
(define c 2.8)
(define area
  (let ((s (/ (+ a b c) 2)))
    (sqrt (* s (- s a) (- s b) (- s c)))))
The effect of the let expression is to create a new environment
and, within this environment, an association between the name s
and a new location whose initial value is obtained by evaluating
(/ (+ a b c) 2). The expressions in the body of the let,
namely (sqrt (* s (- s a) (- s b) (- s c))), are then evaluated
in the context of the new environment, and the value of the last
expression evaluated becomes the value of the whole let
expression, and therefore the value of the variable area.
In the example of the previous subsection, we glossed over an important
point. The body of the let expression in that example refers not
only to the local variable s, but also to the top level variables
a, b, c and sqrt. (sqrt is the
standard Scheme procedure for calculating a square root.) If the body
of the let expression is evaluated in the context of the
local let environment, how does the evaluation get at the
values of these top level variables?
The answer is that the local environment created by a let
expression automatically has a reference to its containing environment
— in this case the top level environment — and that the Scheme
interpreter automatically looks for a variable binding in the containing
environment if it doesn't find one in the local environment. More
generally, every environment except for the top level one has a
reference to its containing environment, and the interpreter keeps
searching back up the chain of environments — from most local to top
level — until it either finds a variable binding for the required
identifier or exhausts the chain.
This description also determines what happens when there is more than
one variable binding with the same name. Suppose, continuing the
example of the previous subsection, that there was also a pre-existing
top level variable s created by the expression:
(define s "Some beans, my lord!")
Then both the top level environment and the local let environment
would contain bindings for the name s. When evaluating code
within the let body, the interpreter looks first in the local
let environment, and so finds the binding for s created by
the let syntax. Even though this environment has a reference to
the top level environment, which also has a binding for s, the
interpreter doesn't get as far as looking there. When evaluating code
outside the let body, the interpreter looks up variable names in
the top level environment, so the name s refers to the top level
variable.
Within the let body, the binding for s in the local
environment is said to shadow the binding for s in the top
level environment.
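Shadowing can be observed directly with a short sketch. The variable inner-value is a hypothetical name, and the local value 42 is chosen arbitrarily.

```scheme
(define s "Some beans, my lord!")

;; Inside the let body, s refers to the local binding.
(define inner-value
  (let ((s 42))
    s))

;; inner-value ⇒ 42
;; s           ⇒ "Some beans, my lord!"  (the top level binding is untouched)
```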
The rules that we have just been describing are the details of how Scheme implements “lexical scoping”. This subsection takes a brief diversion to explain what lexical scope means in general and to present an example of non-lexical scoping.
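“Lexical scope” in general is the idea that
an identifier at a particular place in a program always refers to the same variable location, where “always” means “every time that the containing expression is executed”, and that
the variable location to which it refers can be determined by static examination of the source code context in which that identifier appears, without having to consider the flow of execution through the program as a whole.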
In practice, lexical scoping is the norm for most programming languages, and probably corresponds to what you would intuitively consider to be “normal”. You may even be wondering how the situation could possibly — and usefully — be otherwise. To demonstrate that another kind of scoping is possible, therefore, and to compare it against lexical scoping, the following subsection presents an example of non-lexical scoping and examines in detail how its behavior differs from the corresponding lexically scoped code.
To demonstrate that non-lexical scoping does exist and can be useful, we present the following example from Emacs Lisp, which is a “dynamically scoped” language.
(defvar currency-abbreviation "USD")
(defun currency-string (units hundredths)
  (concat currency-abbreviation
          (number-to-string units)
          "."
          (number-to-string hundredths)))
(defun french-currency-string (units hundredths)
  (let ((currency-abbreviation "FRF"))
    (currency-string units hundredths)))
The question to focus on here is: what does the identifier
currency-abbreviation refer to in the currency-string
function? The answer, in Emacs Lisp, is that all variable bindings go
onto a single stack, and that currency-abbreviation refers to the
topmost binding from that stack which has the name
“currency-abbreviation”. The binding that is created by the
defvar form, to the value "USD", is only relevant if none
of the code that calls currency-string rebinds the name
“currency-abbreviation” in the meanwhile.
The second function french-currency-string works precisely by
taking advantage of this behaviour. It creates a new binding for the
name “currency-abbreviation” which overrides the one established by
the defvar form.
;; Note! This is Emacs Lisp evaluation, not Scheme!
(french-currency-string 33 44)
⇒
"FRF33.44"
Now let's look at the corresponding, lexically scoped Scheme code:
(define currency-abbreviation "USD")
(define (currency-string units hundredths)
  (string-append currency-abbreviation
                 (number->string units)
                 "."
                 (number->string hundredths)))
(define (french-currency-string units hundredths)
  (let ((currency-abbreviation "FRF"))
    (currency-string units hundredths)))
According to the rules of lexical scoping, the
currency-abbreviation in currency-string refers to the
variable location in the innermost environment at that point in the code
which has a binding for currency-abbreviation, which is the
variable location in the top level environment created by the preceding
(define currency-abbreviation ...) expression.
In Scheme, therefore, the french-currency-string procedure does
not work as intended. The variable binding that it creates for
“currency-abbreviation” is purely local to the code that forms the
body of the let expression. Since this code doesn't directly use
the name “currency-abbreviation” at all, the binding is pointless.
(french-currency-string 33 44)
⇒
"USD33.44"
This raises the question of how the Emacs Lisp behaviour can be
implemented in Scheme. In general, this is a design question whose
answer depends upon the problem that is being addressed. In this case,
the best answer may be that currency-string should be
redesigned so that it can take an optional third argument. This third
argument, if supplied, is interpreted as a currency abbreviation that
overrides the default.
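One possible shape for such a redesign, offered only as a sketch, uses a rest argument to make the third argument optional. The parameter name maybe-abbreviation is hypothetical, and other ways of expressing optional arguments would work equally well.

```scheme
(define currency-abbreviation "USD")

;; maybe-abbreviation is the empty list when no third argument is
;; given; otherwise its first element overrides the default.
(define (currency-string units hundredths . maybe-abbreviation)
  (string-append (if (null? maybe-abbreviation)
                     currency-abbreviation
                     (car maybe-abbreviation))
                 (number->string units)
                 "."
                 (number->string hundredths)))

(define (french-currency-string units hundredths)
  (currency-string units hundredths "FRF"))

(define default-result (currency-string 33 44))          ; ⇒ "USD33.44"
(define french-result  (french-currency-string 33 44))   ; ⇒ "FRF33.44"
```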
It is possible to change french-currency-string so that it mostly
works without changing currency-string, but the fix is inelegant,
and susceptible to interrupts that could leave the
currency-abbreviation variable in the wrong state:
(define (french-currency-string units hundredths)
  (set! currency-abbreviation "FRF")
  (let ((result (currency-string units hundredths)))
    (set! currency-abbreviation "USD")
    result))
The key point here is that the code does not create any local binding
for the identifier currency-abbreviation, so all occurrences of
this identifier refer to the top level variable.
Consider a let expression that doesn't contain any
lambdas:
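(let ((s (/ (+ a b c) 2)))
  (sqrt (* s (- s a) (- s b) (- s c))))
When the Scheme interpreter evaluates this, it
creates a new environment with a reference to the environment that was current when it encountered the let
creates a variable binding for s in the new environment, with
value given by (/ (+ a b c) 2)
evaluates the expression in the body of the let in the context of
the new local environment, and remembers the value V
forgets the local environment
continues evaluating the expression that contained the let, using
the value V as the value of the let expression, in the
context of the containing environment.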
After the let expression has been evaluated, the local
environment that was created is simply forgotten, and there is no longer
any way to access the binding that was created in this environment. If
the same code is evaluated again, it will follow the same steps again,
creating a second new local environment that has no connection with the
first, and then forgetting this one as well.
If the let body contains a lambda expression, however, the
local environment is not forgotten. Instead, it becomes
associated with the procedure that is created by the lambda
expression, and is reinstated every time that that procedure is called.
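In detail, this works as follows.
When the Scheme interpreter evaluates a lambda expression, to
create a procedure object, it stores the current environment as part of
the procedure definition.
Then, whenever that procedure is called, the interpreter reinstates the environment that is stored in the procedure definition and evaluates the procedure body within the context of that environment.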
The result is that the procedure body is always evaluated in the context of the environment that was current when the procedure was created.
This is what is meant by closure. The next few subsections present examples that explore the usefulness of this concept.
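Before the longer examples, a minimal sketch may help. The procedure make-adder is a hypothetical name; each call to it creates a new environment with a binding for n, and the returned procedure keeps that environment.

```scheme
;; The lambda captures the environment in which n is bound.
(define (make-adder n)
  (lambda (x) (+ x n)))

(define add5  (make-adder 5))
(define add10 (make-adder 10))

(define r1 (add5 10))    ; ⇒ 15, evaluated where n is 5
(define r2 (add10 10))   ; ⇒ 20, evaluated where n is 10
```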
This example uses closure to create a procedure with a variable binding that is private to the procedure, like a local variable, but whose value persists between procedure calls.
(define (make-serial-number-generator)
  (let ((current-serial-number 0))
    (lambda ()
      (set! current-serial-number (+ current-serial-number 1))
      current-serial-number)))
(define entry-sn-generator (make-serial-number-generator))
(entry-sn-generator)
⇒
1
(entry-sn-generator)
⇒
2
When make-serial-number-generator is called, it creates a local
environment with a binding for current-serial-number whose
initial value is 0, then, within this environment, creates a procedure.
The local environment is stored within the created procedure object and
so persists for the lifetime of the created procedure.
Every time the created procedure is invoked, it increments the value of
the current-serial-number binding in the captured environment and
then returns the current value.
Note that make-serial-number-generator can be called again to
create a second serial number generator that is independent of the
first. Every new invocation of make-serial-number-generator
creates a new local let environment and returns a new procedure
object with an association to this environment.
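The independence of separately created generators can be checked with the following sketch, which repeats the definition so that it can be run on its own. The names gen-a and gen-b and the result variables are hypothetical.

```scheme
(define (make-serial-number-generator)
  (let ((current-serial-number 0))
    (lambda ()
      (set! current-serial-number (+ current-serial-number 1))
      current-serial-number)))

(define gen-a (make-serial-number-generator))
(define gen-b (make-serial-number-generator))

(define a1 (gen-a))   ; ⇒ 1
(define a2 (gen-a))   ; ⇒ 2
(define b1 (gen-b))   ; ⇒ 1, gen-b has its own captured binding
```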
This example uses closure to create two procedures, get-balance
and deposit, that both refer to the same captured local
environment so that they can both access the balance variable
binding inside that environment. The value of this variable binding
persists between calls to either procedure.
Note that the captured balance variable binding is private to
these two procedures: it is not directly accessible to any other code.
It can only be accessed indirectly via get-balance or
deposit, as illustrated by the withdraw procedure.
(define get-balance #f)
(define deposit #f)
(let ((balance 0))
  (set! get-balance
        (lambda ()
          balance))
  (set! deposit
        (lambda (amount)
          (set! balance (+ balance amount))
          balance)))
(define (withdraw amount)
  (deposit (- amount)))
(get-balance)
⇒
0
(deposit 50)
⇒
50
(withdraw 75)
⇒
-25
An important detail here is that the get-balance and
deposit variables must be set up by defining them at top
level with define and then assigning their values with set! inside the let body.
Using define within the let body would not work: this
would create variable bindings within the local let environment
that would not be accessible at top level.
A frequently used programming model for library code is to allow an application to register a callback function for the library to call when some particular event occurs. It is often useful for the application to make several such registrations using the same callback function, for example if several similar library events can be handled using the same application code, but the need then arises to distinguish the callback function calls that are associated with one callback registration from those that are associated with different callback registrations.
In languages without the ability to create functions dynamically, this
problem is usually solved by passing a user_data parameter on the
registration call, and including the value of this parameter as one of
the parameters on the callback function. Here is an example of
declarations using this solution in C:
typedef void (event_handler_t) (int event_type,
                                void *user_data);
void register_callback (int event_type,
                        event_handler_t *handler,
                        void *user_data);
In Scheme, closure can be used to achieve the same functionality without
requiring the library code to store a user-data for each callback
registration.
;; In the library:
(define (register-callback event-type handler-proc)
  ...)
;; In the application:
(define (make-handler event-type user-data)
  (lambda ()
    ...
    <code referencing event-type and user-data>
    ...))
(register-callback event-type
                   (make-handler event-type ...))
As far as the library is concerned, handler-proc is a procedure
with no arguments, and all the library has to do is call it when the
appropriate event occurs. From the application's point of view, though,
the handler procedure has used closure to capture an environment that
includes all the context that the handler code needs —
event-type and user-data — to handle the event
correctly.
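To make the outline above concrete, here is one runnable sketch. Everything in it beyond register-callback and make-handler is an assumption made for illustration: the handlers list, the fire-event procedure that stands in for the library noticing an event, and the seen list that the handlers write to.

```scheme
;; "Library" side (hypothetical implementation): remember each handler
;; with its event type, and call matching handlers with no arguments.
(define handlers '())

(define (register-callback event-type handler-proc)
  (set! handlers (cons (cons event-type handler-proc) handlers)))

(define (fire-event event-type)
  (for-each (lambda (entry)
              (if (eq? (car entry) event-type)
                  ((cdr entry))))
            handlers))

;; "Application" side: each handler captures its own user-data.
(define seen '())

(define (make-handler event-type user-data)
  (lambda ()
    (set! seen (cons (list event-type user-data) seen))))

(register-callback 'click (make-handler 'click "button A"))
(register-callback 'click (make-handler 'click "button B"))
(register-callback 'key   (make-handler 'key "keyboard"))

(fire-event 'click)
;; seen now holds two entries, one for each click registration, each
;; carrying the user-data captured when its handler was created.
```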
Closure is the capture of an environment, containing persistent variable bindings, within the definition of a procedure or a set of related procedures. This is rather similar to the idea in some object oriented languages of encapsulating a set of related data variables inside an “object”, together with a set of “methods” that operate on the encapsulated data. The following example shows how closure can be used to emulate the ideas of objects, methods and encapsulation in Scheme.
(define (make-account)
  (let ((balance 0))
    (define (get-balance)
      balance)
    (define (deposit amount)
      (set! balance (+ balance amount))
      balance)
    (define (withdraw amount)
      (deposit (- amount)))
    (lambda args
      (apply
        (case (car args)
          ((get-balance) get-balance)
          ((deposit) deposit)
          ((withdraw) withdraw)
          (else (error "Invalid method!")))
        (cdr args)))))
Each call to make-account creates and returns a new procedure,
created by the expression in the example code that begins “(lambda
args”.
(define my-account (make-account))
my-account
⇒
#<procedure args>
This procedure acts as an account object with methods
get-balance, deposit and withdraw. To apply one of
the methods to the account, you call the procedure with a symbol
indicating the required method as the first parameter, followed by any
other parameters that are required by that method.
(my-account 'get-balance)
⇒
0
(my-account 'withdraw 5)
⇒
-5
(my-account 'deposit 396)
⇒
391
(my-account 'get-balance)
⇒
391
Note how, in this example, both the current balance and the helper
procedures get-balance, deposit and withdraw, used
to implement the guts of the account object's methods, are all stored in
variable bindings within the private local environment captured by the
lambda expression that creates the account object procedure.
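The same point, that every call to the constructor captures a fresh private environment, can be seen in a smaller sketch. The make-counter procedure below is a hypothetical example, not part of Guile; it plays the role that make-account plays above, with a single captured variable.

```scheme
;; Hypothetical example: each call to make-counter creates a fresh
;; binding for count, captured by the returned procedure.
(define (make-counter)
  (let ((count 0))
    (lambda ()
      (set! count (+ count 1))
      count)))

(define counter-a (make-counter))
(define counter-b (make-counter))

(counter-a)   ; ⇒ 1
(counter-a)   ; ⇒ 2
(counter-b)   ; ⇒ 1, because counter-b has its own count
```

Nothing outside the two closures can reach either count binding, which is the encapsulation property described above.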
Guile's core language is Scheme, and a lot can be achieved simply by using Guile to write and run Scheme programs — as opposed to having to dive into C code. In this part of the manual, we explain how to use Guile in this mode, and describe the tools that Guile provides to help you with script writing, debugging, and packaging your programs for distribution.
For detailed reference information on the variables, functions, and so on that make up Guile's application programming interface (API), see API Reference.
Guile's core language is Scheme, which is specified and described in the series of reports known as RnRS. RnRS is shorthand for the Revised^n Report on the Algorithmic Language Scheme. Guile complies fully with R5RS (see Introduction), and implements some aspects of R6RS.
Guile also has many extensions that go beyond these reports. Some of the areas where Guile extends R5RS are:
Many features of Guile depend on and can be changed by information that the user provides either before or when Guile is started. Below is a description of what information to provide and how to provide it.
Here we describe Guile's command-line processing in detail. Guile processes its arguments from left to right, recognizing the switches described below. For examples, see Scripting Examples.
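-s script arg...
Load script as a Guile script. The arguments arg... following script become the script's arguments; the command-line function returns
a list of strings of the form (script arg...).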
It is possible to name a file using a leading hyphen, for example, -myfile.scm. In this case, the file name must be preceded by -s to tell Guile that a (script) file is being named.
Scripts are read and evaluated as Scheme source code just as the
load function would. After loading script, Guile exits.
-c expr arg...
Evaluate expr as Scheme code, and then exit. The command-line function returns a list of strings of
the form (guile arg...), where guile is the
path of the Guile executable.
-- arg...
Run interactively. The command-line function returns a list of strings of the form
(guile arg...), where guile is the path of the
Guile executable.
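-L directory
Add directory to the front of Guile's module load path.
-x extension
Add extension to Guile's load extensions (see %load-extensions). The specified extensions
are tried in the order given on the command line, and before the default
load extensions. Extensions added here are not in effect during
execution of the user's .guile file.
-l file
Load Scheme source code from file.
-e function
Make function the entry point of the script: after loading the script, Guile applies function to the list provided by the command-line function.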
A -e switch can appear anywhere in the argument list, but Guile always invokes the function as the last action it performs. This is weird, but because of the way script invocation works under POSIX, the -s option must always come last in the list.
The function is most often a simple symbol that names a function
that is defined in the script. It can also be of the form (@
module-name symbol), and in that case, the symbol is
looked up in the module named module-name.
For compatibility with some versions of Guile 1.4, you can also use the
form (symbol ...) (that is, a list of only symbols that doesn't
start with @), which is equivalent to (@ (symbol ...)
main), or (symbol ...) symbol (that is, a list of only symbols
followed by a symbol), which is equivalent to (@ (symbol ...)
symbol). We recommend using the equivalent forms directly since they
correspond to the (@ ...) read syntax that can be used in
normal code. See Using Guile Modules and Scripting Examples.
-ds
This switch is necessary because, although the POSIX script invocation
mechanism effectively requires the -s option to appear last, the
programmer may well want to run the script before other actions
requested on the command line. For examples, see Scripting Examples.
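\
Read more command-line arguments, starting from the second line of the script file. See The Meta Switch.
--use-srfi=list
Here list is a comma-separated list of numbers, each naming a SRFI module to load. The feature identifiers of the loaded SRFIs are recognized by cond-expand when this option is used.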
Here is an example that loads the modules SRFI-8 ('receive') and SRFI-13 ('string library') before the GUILE interpreter is started:
guile --use-srfi=8,13
--debug
By default, the debugging VM engine is only used when entering an
interactive session. When executing a script with -s or
-c, the normal, faster VM is used by default.
--no-debug
Note that, despite the name, Guile running with --no-debug
does support the usual debugging facilities, such as printing a
detailed backtrace upon error. The only difference with
--debug is lack of support for VM hooks and the facilities that
build upon it (see above).
-q
Do not load the initialization file, .guile.
--listen[=p]
Listen on a local port for REPL clients while Guile runs. If p is not given, the default is local port 37146. If you look at it upside down, it almost spells “Guile”. If you have netcat installed, you should be able to nc localhost 37146 and get a Guile prompt. Alternatively, you can fire up Emacs and connect to the process; see Using Guile in Emacs for more details.
Note that opening a port allows anyone who can connect to that port—in the TCP case, any local user—to do anything Guile can do, as the user that the Guile process is running as. Do not use --listen on multi-user machines. Of course, if you do not pass --listen to Guile, no port will be opened.
That said, --listen is great for interactive debugging and
development.
--auto-compile
--fresh-auto-compile
--no-auto-compile
-h, --help
-v, --version
The environment is a feature of the operating system; it consists of a collection of variables with names and values. Each variable is called an environment variable (or, sometimes, a “shell variable”); environment variable names are case-sensitive, and it is conventional to use upper-case letters only. The values are all text strings, even those that are written as numerals. (Note that here we are referring to names and values that are defined in the operating system shell from which Guile is invoked. This is not the same as a Scheme environment that is defined within a running instance of Guile. For a description of Scheme environments, see About Environments.)
How to set environment variables before starting Guile depends on the operating system and, especially, the shell that you are using. For example, here is how to tell Guile to provide detailed warning messages about deprecated features by setting GUILE_WARN_DEPRECATED using Bash:
$ export GUILE_WARN_DEPRECATED="detailed"
$ guile
Or, detailed warnings can be turned on for a single invocation using:
$ env GUILE_WARN_DEPRECATED="detailed" guile
If you wish to retrieve or change the value of the shell environment variables that affect the run-time behavior of Guile from within a running instance of Guile, see Runtime Environment.
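As a small sketch of what reading and changing an environment variable looks like from inside a running Guile, the following uses the getenv, setenv and unsetenv procedures. GUILE_EXAMPLE_SETTING is a made-up variable name chosen so that the example does not disturb any variable Guile actually reads.

```scheme
;; GUILE_EXAMPLE_SETTING is a hypothetical name used only here.
(setenv "GUILE_EXAMPLE_SETTING" "detailed")

;; getenv returns the value as a string.
(define setting (getenv "GUILE_EXAMPLE_SETTING"))

;; After removing the variable, getenv returns #f.
(unsetenv "GUILE_EXAMPLE_SETTING")
(define after-unset (getenv "GUILE_EXAMPLE_SETTING"))
```

Here setting is "detailed" and after-unset is #f. Changes made this way are visible to later getenv calls in the same process; the sketch does not show whether Guile itself consults a given variable again after startup.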
Here are the environment variables that affect the run-time behavior of Guile:
If a compiled (.go) file corresponding to a .scm file is not found or is not newer than the .scm file, the .scm file will be compiled on the fly, and the resulting .go file stored away. An advisory note will be printed on the console.
Compiled files will be stored in the directory $XDG_CACHE_HOME/guile/ccache, where XDG_CACHE_HOME defaults to the directory $HOME/.cache. This directory will be created if it does not already exist.
Note that this mechanism depends on the timestamp of the .go file being newer than that of the .scm file; if the .scm or .go files are moved after installation, care should be taken to preserve their original timestamps.
Set GUILE_AUTO_COMPILE to zero (0), to prevent Scheme files from being compiled automatically. Set this variable to “fresh” to tell Guile to compile Scheme files whether they are newer than the compiled files or not.
See Compilation.
GUILE_LOAD_COMPILED_PATH
A colon-separated list of directories that Guile adds to the front of %load-compiled-path.
Here is an example using the Bash shell that adds the current directory,
., and the relative directory ../my-library to
%load-compiled-path:
$ export GUILE_LOAD_COMPILED_PATH=".:../my-library"
$ guile -c '(display %load-compiled-path) (newline)'
(. ../my-library /usr/local/lib/guile/2.0/ccache)
GUILE_LOAD_PATH
A colon-separated list of directories that Guile adds to the front of %load-path.
Here is an example using the Bash shell that adds the current directory
and the parent of the current directory to %load-path:
$ env GUILE_LOAD_PATH=".:.." \
guile -c '(display %load-path) (newline)'
(. .. /usr/local/share/guile/2.0 \
/usr/local/share/guile/site/2.0 \
/usr/local/share/guile/site /usr/local/share/guile)
(Note: The line breaks, above, are for documentation purposes only, and
not required in the actual example.)
Users may now install Guile in non-standard directories and run `/path/to/bin/guile', without having also to set LTDL_LIBRARY_PATH to include `/path/to/lib'.
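The GUILE_LOAD_PATH and GUILE_LOAD_COMPILED_PATH examples above both show a colon-separated value ending up in front of a default list of directories. The following is only a model of that effect, written with string-split; prepend-path is a hypothetical helper and is not how Guile itself builds its paths.

```scheme
;; Model only: split a colon-separated VALUE and put the resulting
;; directories in front of DEFAULT-PATH.  VALUE may be #f, as
;; returned by getenv for an unset variable.
(define (prepend-path value default-path)
  (if value
      (append (string-split value #\:) default-path)
      default-path))

(prepend-path ".:.." '("/usr/local/share/guile/2.0"))
;; ⇒ ("." ".." "/usr/local/share/guile/2.0")

(prepend-path #f '("/usr/local/share/guile/2.0"))
;; ⇒ ("/usr/local/share/guile/2.0")
```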
Like AWK, Perl, or any shell, Guile can interpret script files. A Guile script is simply a file of Scheme code with some extra information at the beginning which tells the operating system how to invoke Guile, and then tells Guile how to handle the Scheme code.
The first line of a Guile script must tell the operating system to use Guile to evaluate the script, and then tell Guile how to go about doing that. Here is the simplest case:
The first two characters of the file must be ‘#!’. The operating system interprets this to mean that the rest of the line is the name of an executable that can interpret the script. Guile, however, interprets these characters as the beginning of a multi-line comment, terminated by the characters ‘!#’ on a line by themselves. (This is an extension to the syntax described in R5RS, added to support shell scripts.)
If the script needs a coding declaration, one such as coding: utf-8 should appear in a comment
somewhere in the first five lines of the file; see Character Encoding of Source Files.
Guile reads the program, evaluating expressions in the order that they appear. Upon reaching the end of the file, Guile exits.
Guile's command-line switches allow the programmer to describe reasonably complicated actions in scripts. Unfortunately, the POSIX script invocation mechanism only allows one argument to appear on the ‘#!’ line after the path to the Guile executable, and imposes arbitrary limits on that argument's length. Suppose you wrote a script starting like this:
#!/usr/local/bin/guile -e main -s
!#
(define (main args)
  (for-each (lambda (arg) (display arg) (display " "))
            (cdr args))
  (newline))
The intended meaning is clear: load the file, and then call main
on the command-line arguments. However, the system will treat
everything after the Guile path as a single argument — the string
"-e main -s" — which is not what we want.
As a workaround, the meta switch \ allows the Guile programmer to
specify an arbitrary number of options without patching the kernel. If
the first argument to Guile is \, Guile will open the script file
whose name follows the \, parse arguments starting from the
file's second line (according to rules described below), and substitute
them for the \ switch.
Working in concert with the meta switch, Guile treats the characters ‘#!’ as the beginning of a comment which extends through the next line containing only the characters ‘!#’. This sort of comment may appear anywhere in a Guile program, but it is most useful at the top of a file, meshing magically with the POSIX script invocation mechanism.
Thus, consider a script named /u/jimb/ekko which starts like this:
#!/usr/local/bin/guile \
-e main -s
!#
(define (main args)
  (for-each (lambda (arg) (display arg) (display " "))
            (cdr args))
  (newline))
Suppose a user invokes this script as follows:
$ /u/jimb/ekko a b c
Here's what happens:
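The operating system recognizes the ‘#!’ token at the top of the file, and rewrites the command line to:
/usr/local/bin/guile \ /u/jimb/ekko a b c
This is the usual behavior, prescribed by POSIX.
When Guile sees the first two arguments, \ /u/jimb/ekko, it opens
/u/jimb/ekko, parses the three arguments -e, main,
and -s from it, and substitutes them for the \ switch.
Thus, Guile's command line now reads:
/usr/local/bin/guile -e main -s /u/jimb/ekko a b c
Guile then processes these switches: it loads /u/jimb/ekko as a file of Scheme code, and then calls main with the list
("/u/jimb/ekko" "a" "b" "c").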
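The substitution step in this walkthrough can be modeled in a few lines of Scheme. This is a sketch for illustration only: expand-meta-switch is a hypothetical name, the arguments parsed from the script's second line are passed in as a ready-made list, and Guile performs the real step internally before any Scheme code runs.

```scheme
;; Model only.  ARGV is the command line as a list of strings;
;; SECOND-LINE-ARGS stands for the arguments parsed from the second
;; line of the script.  If the first argument is the meta switch \,
;; replace it with SECOND-LINE-ARGS and keep everything else.
(define (expand-meta-switch argv second-line-args)
  (if (and (pair? (cdr argv))
           (string=? (cadr argv) "\\"))
      (append (list (car argv)) second-line-args (cddr argv))
      argv))

(expand-meta-switch
 '("/usr/local/bin/guile" "\\" "/u/jimb/ekko" "a" "b" "c")
 '("-e" "main" "-s"))
;; ⇒ ("/usr/local/bin/guile" "-e" "main" "-s" "/u/jimb/ekko" "a" "b" "c")
```

The result matches the rewritten command line shown in the walkthrough.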
When Guile sees the meta switch \, it parses command-line
arguments from the script file according to the following rules:
Each space character terminates an argument, so two spaces in a row introduce an empty argument, "".
The backslash character is the escape character. The escape sequences \n and
\t are also supported. These produce argument constituents; the
two-character combination \n doesn't act like a terminating
newline. The escape sequence \NNN for exactly three octal
digits reads as the character whose ASCII code is NNN. As above,
characters produced this way are argument constituents. Backslash
followed by other characters is not allowed.
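As a rough model of the simplest part of these rules, and assuming that each space character terminates an argument, splitting the line on spaces reproduces the three arguments from the earlier ekko example. The helper name split-meta-line is hypothetical, and the sketch deliberately ignores backslash escapes and the terminating newline.

```scheme
;; Rough model only: split on single spaces, with no escape handling.
(define (split-meta-line line)
  (string-split line #\space))

(split-meta-line "-e main -s")
;; ⇒ ("-e" "main" "-s")

;; Two spaces in a row produce an empty argument between them.
(split-meta-line "-e  main")
;; ⇒ ("-e" "" "main")
```

The second call shows why a stray double space on the second line of a script can matter: it yields an empty-string argument rather than being collapsed.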
The ability to accept and handle command line arguments is very important when writing Guile scripts to solve particular problems, such as extracting information from text files or interfacing with existing command line applications. This chapter describes how Guile makes command line arguments available to a Guile script, and the utilities that Guile provides to help with the processing of command line arguments.
When a Guile script is invoked, Guile makes the command line arguments
accessible via the procedure command-line, which returns the
arguments as a list of strings.
For example, if the script
#! /usr/local/bin/guile -s
!#
(write (command-line))
(newline)
is saved in a file cmdline-test.scm and invoked using the command
line ./cmdline-test.scm bar.txt -o foo -frumple grob, the output
is
("./cmdline-test.scm" "bar.txt" "-o" "foo" "-frumple" "grob")
If the script invocation includes a -e option, specifying a
procedure to call after loading the script, Guile will call that
procedure with (command-line) as its argument. So a script that
uses -e doesn't need to refer explicitly to command-line
in its code. For example, the script above would have identical
behaviour if it was written instead like this:
#! /usr/local/bin/guile \
-e main -s
!#
(define (main args)
  (write args)
  (newline))
(Note the use of the meta switch \ so that the script invocation
can include more than one Guile option: See The Meta Switch.)
These scripts use the #! POSIX convention so that they can be
executed using their own file names directly, as in the example command
line ./cmdline-test.scm bar.txt -o foo -frumple grob. But they
can also be executed by typing out the implied Guile command line in
full, as in:
$ guile -s ./cmdline-test.scm bar.txt -o foo -frumple grob
or
$ guile -e main -s ./cmdline-test2.scm bar.txt -o foo -frumple grob
Even when a script is invoked using this longer form, the arguments that
the script receives are the same as if it had been invoked using the
short form. Guile ensures that the (command-line) or -e
arguments are independent of how the script is invoked, by stripping off
the arguments that Guile itself processes.
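Because the list has the same shape however the script is started, an entry point can be exercised directly with a hand-written list. The sketch below assumes a hypothetical main that only summarizes its arguments; the script name and arguments are the ones used in the example command line above.

```scheme
;; Hypothetical entry point: ARGS has the same shape as the list
;; returned by (command-line), with the script name first.
(define (main args)
  (let ((script (car args))
        (script-args (cdr args)))
    (string-append script " received "
                   (number->string (length script-args))
                   " argument(s)")))

(main '("./cmdline-test.scm" "bar.txt" "-o" "foo" "-frumple" "grob"))
;; ⇒ "./cmdline-test.scm received 5 argument(s)"
```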
A script is free to parse and handle its command line arguments in any
way that it chooses. Where the set of possible options and arguments is
complex, however, it can get tricky to extract all the options, check
the validity of given arguments, and so on. This task can be greatly
simplified by taking advantage of the module (ice-9 getopt-long),
which is distributed with Guile; see getopt-long.
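A minimal sketch of what that looks like, using the getopt-long and option-ref procedures from (ice-9 getopt-long). The option names output and verbose, the default "a.out", and the helper parse are all invented for this illustration.

```scheme
(use-modules (ice-9 getopt-long))

;; Hypothetical option specification: -o/--output takes a value,
;; -v/--verbose is a flag.
(define option-spec
  '((output  (single-char #\o) (value #t))
    (verbose (single-char #\v) (value #f))))

;; ARGS has the shape of (command-line): program name first.
(define (parse args)
  (let ((options (getopt-long args option-spec)))
    (list (option-ref options 'output "a.out")   ; value or default
          (option-ref options 'verbose #f)       ; #t if the flag is given
          (option-ref options '() '()))))        ; non-option arguments

(parse '("prog" "-o" "foo" "-v" "bar.txt"))
;; ⇒ ("foo" #t ("bar.txt"))
```

In a script, parse would typically be called with the list that main receives, or with (command-line).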
To start with, here are some examples of invoking Guile directly:
guile -- a b c
Run Guile interactively; (command-line) will return ("/usr/local/bin/guile" "a" "b" "c").
guile -s /u/jimb/ex2 a b c
Load the file /u/jimb/ex2; (command-line) will return ("/u/jimb/ex2" "a" "b" "c").
guile -c '(write %load-path) (newline)'
Write the value of the variable %load-path, print a newline,
and exit.
guile -e main -s /u/jimb/ex4 foo
Load the file /u/jimb/ex4, and then call the function main, passing it the list ("/u/jimb/ex4" "foo").
guile -l first -ds -l last -s script
Load the files first, script, and last, in that order. The -ds switch says when to process the -s
switch. For a more motivated example, see the scripts below.
Here is a very simple Guile script:
#!/usr/local/bin/guile -s
!#
(display "Hello, world!")
(newline)
The first line marks the file as a Guile script. When the user invokes
it, the system runs /usr/local/bin/guile to interpret the script,
passing -s, the script's filename, and any arguments given to the
script as command-line arguments. When Guile sees -s
script, it loads script. Thus, running this program
produces the output:
Hello, world!
Here is a script which prints the factorial of its argument:
#!/usr/local/bin/guile -s
!#
(define (fact n)
  (if (zero? n) 1
      (* n (fact (- n 1)))))
(display (fact (string->number (cadr (command-line)))))
(newline)
In action:
$ ./fact 5
120
$
However, suppose we want to use the definition of fact in this
file from another script. We can't simply load the script file,
and then use fact's definition, because the script will try to
compute and display a factorial when we load it. To avoid this problem,
we might write the script this way:
#!/usr/local/bin/guile \
-e main -s
!#
(define (fact n)
  (if (zero? n) 1
      (* n (fact (- n 1)))))
(define (main args)
  (display (fact (string->number (cadr args))))
  (newline))
This version packages the actions the script should perform in a
function, main. This allows us to load the file purely for its
definitions, without any extraneous computation taking place. Then we
use the meta switch \ and the entry point switch -e to
tell Guile to call main after loading the script.
$ ./fact 50
30414093201713378043612608166064768844377641568960512000000000000
Suppose that we now want to write a script which computes the
choose function: given a set of m distinct objects,
(choose n m) is the number of distinct subsets
containing n objects each. It's easy to write choose given
fact, so we might write the script this way:
#!/usr/local/bin/guile \
-l fact -e main -s
!#
(define (choose n m)
  (/ (fact m) (* (fact (- m n)) (fact n))))
(define (main args)
  (let ((n (string->number (cadr args)))
        (m (string->number (caddr args))))
    (display (choose n m))
    (newline)))
The command-line arguments here tell Guile to first load the file
fact, and then run the script, with main as the entry
point. In other words, the choose script can use definitions
made in the fact script. Here are some sample runs:
$ ./choose 0 4
1
$ ./choose 1 4
4
$ ./choose 2 4
6
$ ./choose 3 4
4
$ ./choose 4 4
1
$ ./choose 50 100
100891344545564193334812497256
When you start up Guile by typing just guile, without a
-c argument or the name of a script to execute, you get an
interactive interpreter where you can enter Scheme expressions, and
Guile will evaluate them and print the results for you. Here are some
simple examples.
scheme@(guile-user)> (+ 3 4 5)
$1 = 12
scheme@(guile-user)> (display "Hello world!\n")
Hello world!
scheme@(guile-user)> (values 'a 'b)
$2 = a
$3 = b
This mode of use is called a REPL, which is short for “Read-Eval-Print Loop”, because the Guile interpreter first reads the expression that you have typed, then evaluates it, and then prints the result.
The prompt shows you what language and module you are in. In this case, the
current language is scheme, and the current module is
(guile-user). See Other Languages, for more information on Guile's
support for languages other than Scheme.
When run interactively, Guile will load a local initialization file from ~/.guile. This file should contain Scheme expressions for evaluation.
This facility lets the user customize their interactive Guile environment, pulling in extra modules or parameterizing the REPL implementation.
To run Guile without loading the init file, use the -q
command-line option.
To make it easier for you to repeat and vary previously entered expressions, or to edit the expression that you're typing in, Guile can use the GNU Readline library. This is not enabled by default for licensing reasons, but all you need to activate Readline is the following pair of lines.
scheme@(guile-user)> (use-modules (ice-9 readline))
scheme@(guile-user)> (activate-readline)
It's a good idea to put these two lines (without the
scheme@(guile-user)> prompts) in your .guile file.
See Init File, for more on .guile.
Just as Readline helps you to reuse a previous input line, value
history allows you to use the result of a previous evaluation in
a new expression. When value history is enabled, each evaluation result
is automatically assigned to the next in the sequence of variables
$1, $2, .... You can then use these variables in
subsequent expressions.
scheme@(guile-user)> (iota 10)
$1 = (0 1 2 3 4 5 6 7 8 9)
scheme@(guile-user)> (apply * (cdr $1))
$2 = 362880
scheme@(guile-user)> (sqrt $2)
$3 = 602.3952191045344
scheme@(guile-user)> (cons $2 $1)
$4 = (362880 0 1 2 3 4 5 6 7 8 9)
Value history is enabled by default, because Guile's REPL imports the
(ice-9 history) module. Value history may be turned off or on within the
REPL, using the options interface:
scheme@(guile-user)> ,option value-history #f
scheme@(guile-user)> 'foo
foo
scheme@(guile-user)> ,option value-history #t
scheme@(guile-user)> 'bar
$5 = bar
Note that previously recorded values are still accessible, even if value history
is off. In rare cases, these references to past computations can cause Guile to
use too much memory. One may clear these values, possibly enabling garbage
collection, via the clear-value-history! procedure, described below.
The programmatic interface to value history is in a module:
(use-modules (ice-9 history))
The clear-value-history! procedure clears the value history. If the stored values are not captured by some other data structure or closure, they may then be reclaimed by the garbage collector.
The REPL exists to read expressions, evaluate them, and then print their results. But sometimes one wants to tell the REPL to evaluate an expression in a different way, or to do something else altogether. A user can affect the way the REPL works with a REPL command.
The previous section had an example of a command, in the form of
,option.
scheme@(guile-user)> ,option value-history #t
Commands are distinguished from expressions by their initial comma (‘,’). Since a comma cannot begin an expression in most languages, it is an effective indicator to the REPL that the following text forms a command, not an expression.
REPL commands are convenient because they are always there. Even if the
current module doesn't have a binding for pretty-print, one can
always ,pretty-print.
The following sections document the various commands, grouped together
by functionality. Many of the commands have abbreviations; see the
online help (,help) for more information.
When Guile starts interactively, it notifies the user that help can be
had by typing ‘,help’. Indeed, help is a command, and a
particularly useful one, as it allows the user to discover the rest of
the commands.
,help [all | group | [-c] command]
Show help.
With one argument, tries to look up the argument as a group name, giving help on that group if successful. Otherwise tries to look up the argument as a command, giving help on the command.
If there is a command whose name is also a group name, use the ‘-c command’ form to give help on the command instead of the group.
Without any argument, a list of help commands and command groups is displayed.
Gives information about Guile.
With one argument, tries to show a particular piece of information; currently supported topics are `warranty' (or `w'), `copying' (or `c'), and `version' (or `v').
Without any argument, a list of topics is displayed.
Evaluate an expression, or alternatively, execute another meta-command in the context of a module. For example, ‘,in (foo bar) ,binding’ will show the bindings in the module
(foo bar).
These debugging commands are only available within a recursive REPL; they do not work at the top level.
Print a backtrace.
Print a backtrace of all stack frames, or of the innermost count frames. If count is negative, the last count frames will be shown.
Select a calling stack frame.
Select and print stack frames that called this one. An argument says how many frames up to go.
Select a called stack frame.
Select and print stack frames called by this one. An argument says how many frames down to go.
Show a frame.
Show the selected frame. With an argument, select a frame by index, then show it.
Show error message.
Display the message associated with the error that started the current debugging REPL.
Show the VM registers associated with the current frame.
See Stack Layout, for more information on VM stack frames.
Sets the number of display columns in the output of
,backtrace and ,locals to cols. If cols is not given, the width of the terminal is used.
The next 3 commands work at any REPL.
Set a tracepoint on the given procedure. This will cause all calls to the procedure to print out a tracing message. See Tracing Traps, for more information.
The rest of the commands in this subsection all apply only when the stack is continuable — in other words when it makes sense for the program that the stack comes from to continue running. Usually this means that the program stopped because of a trap or a breakpoint.
Tell the debugged program to step to the next source location in the same frame. (See Traps for the details of how this works.)
Tell the program being debugged to continue running until the completion of the current stack frame, and at that time to print the result and reenter the REPL.
Current REPL options include:
compile-options
interp
prompt
A customized REPL prompt. #f by default, indicating the default
prompt.
value-history
Whether value history is on or not.
on-error
What to do when an error happens. By default, debug, meaning to
enter the debugger. Other values include backtrace, to show a
backtrace without entering the debugger, or report, to simply
show a short error printout.
Default values for REPL options may be set using
repl-default-option-set! from (system repl common):
Set the default value of a REPL option. This function is particularly useful in a user's init file. See Init File.
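As a sketch of how this might look in an init file, the lines below set defaults for two of the options listed above. The particular values are only examples, and the use of a plain string for prompt is an assumption about the accepted value types rather than something stated in this section.

```scheme
;; Intended for ~/.guile; the values chosen here are only examples.
(use-modules (system repl common))

;; Assumption: the prompt option accepts a string.
(repl-default-option-set! 'prompt "guile> ")

;; Show a backtrace on error instead of entering the debugger.
(repl-default-option-set! 'on-error 'backtrace)
```

Defaults set this way apply to REPLs started afterwards; a running REPL can still be adjusted with the ,option command.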
When code being evaluated from the REPL hits an error, Guile enters a new prompt, allowing you to inspect the context of the error.
scheme@(guile-user)> (map string-append '("a" "b") '("c" #\d))
ERROR: In procedure string-append:
ERROR: Wrong type (expecting string): #\d
Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
scheme@(guile-user) [1]>
The new prompt runs inside the old one, in the dynamic context of the error. It is a recursive REPL, augmented with a reified representation of the stack, ready for debugging.
,backtrace (abbreviated ,bt) displays the Scheme call
stack at the point where the error occurred:
scheme@(guile-user) [1]> ,bt
1 (map #<procedure string-append _> ("a" "b") ("c" #\d))
0 (string-append "b" #\d)
In the above example, the backtrace doesn't have much source
information, as map and string-append are both
primitives. But in the general case, the space on the left of the
backtrace indicates the line and column in which a given procedure calls
another.
You can exit a recursive REPL in the same way that you exit any REPL: via ‘(quit)’, ‘,quit’ (abbreviated ‘,q’), or C-d, among other options.
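A recursive debugging REPL exposes a number of other meta-commands that inspect the state of the computation at the time of the error. These commands allow you to display a backtrace, move up and down the call stack, and examine the values of variables in the context of each stack frame.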
See Debug Commands, for documentation of the individual commands. This section aims to give more of a walkthrough of a typical debugging session.
First, we're going to need a good error. Let's try to macroexpand the
expression (unquote foo), outside of a quasiquote form,
and see how the macroexpander reports this error.
scheme@(guile-user)> (macroexpand '(unquote foo))
ERROR: In procedure macroexpand:
ERROR: unquote: expression not valid outside of quasiquote in (unquote foo)
Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
scheme@(guile-user) [1]>
The backtrace command, which can also be invoked as bt,
displays the call stack (aka backtrace) at the point where the debugger
was entered:
scheme@(guile-user) [1]> ,bt
In ice-9/psyntax.scm:
1130:21 3 (chi-top (unquote foo) () ((top)) e (eval) (hygiene #))
1071:30 2 (syntax-type (unquote foo) () ((top)) #f #f (# #) #f)
1368:28 1 (chi-macro #<procedure de9360 at ice-9/psyntax.scm...> ...)
In unknown file:
0 (scm-error syntax-error macroexpand "~a: ~a in ~a" # #f)
A call stack consists of a sequence of stack frames, with each frame describing one procedure which is waiting to do something with the values returned by another. Here we see that there are four frames on the stack.
Note that macroexpand is not on the stack – it must have made a
tail call to chi-top, as indeed we would find if we searched
ice-9/psyntax.scm for its definition.
When you enter the debugger, the innermost frame is selected, which
means that the commands for getting information about the “current”
frame, or for evaluating expressions in the context of the current
frame, will do so by default with respect to the innermost frame. To
select a different frame, so that these operations will apply to it
instead, use the up, down and frame commands like
this:
scheme@(guile-user) [1]> ,up
In ice-9/psyntax.scm:
1368:28 1 (chi-macro #<procedure de9360 at ice-9/psyntax.scm...> ...)
scheme@(guile-user) [1]> ,frame 3
In ice-9/psyntax.scm:
1130:21 3 (chi-top (unquote foo) () ((top)) e (eval) (hygiene #))
scheme@(guile-user) [1]> ,down
In ice-9/psyntax.scm:
1071:30 2 (syntax-type (unquote foo) () ((top)) #f #f (# #) #f)
Perhaps we're interested in what's going on in frame 2, so we take a look at its local variables:
scheme@(guile-user) [1]> ,locals
Local variables:
$1 = e = (unquote foo)
$2 = r = ()
$3 = w = ((top))
$4 = s = #f
$5 = rib = #f
$6 = mod = (hygiene guile-user)
$7 = for-car? = #f
$8 = first = unquote
$9 = ftype = macro
$10 = fval = #<procedure de9360 at ice-9/psyntax.scm:2817:2 (x)>
$11 = fe = unquote
$12 = fw = ((top))
$13 = fs = #f
$14 = fmod = (hygiene guile-user)
All of the values are accessible by their value-history names
($n):
scheme@(guile-user) [1]> $10
$15 = #<procedure de9360 at ice-9/psyntax.scm:2817:2 (x)>
We can even invoke the procedure at the REPL directly:
scheme@(guile-user) [1]> ($10 'not-going-to-work)
ERROR: In procedure macroexpand:
ERROR: source expression failed to match any pattern in not-going-to-work
Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
Well at this point we've caused an error within an error. Let's just quit back to the top level:
scheme@(guile-user) [2]> ,q
scheme@(guile-user) [1]> ,q
scheme@(guile-user)>
Finally, as a word to the wise: hackers close their REPL prompts with C-d.
Any text editor can edit Scheme, but some are better than others. Emacs is the best, of course, and not just because it is a fine text editor. Emacs has good support for Scheme out of the box, with sensible indentation rules, parenthesis-matching, syntax highlighting, and even a set of keybindings for structural editing, allowing navigation, cut-and-paste, and transposition operations that work on balanced S-expressions.
As good as it is, though, two things will vastly improve your experience with Emacs and Guile.
The first is Taylor Campbell's Paredit. You should not code in any dialect of Lisp without Paredit. (They say that unopinionated writing is boring—hence this tone—but it's the truth, regardless.) Paredit is the bee's knees.
The second is José Antonio Ortega Ruiz's Geiser. Geiser complements Emacs'
scheme-mode with tight integration to running Guile processes via
a comint-mode REPL buffer.
Of course there are keybindings to switch to the REPL, and a good REPL environment, but Geiser goes beyond that, providing:
See Geiser's web page at http://www.nongnu.org/geiser/, for more information.
Guile also comes with a growing number of command-line utilities: a
compiler, a disassembler, some module inspectors, and in the future, a
system to install Guile packages from the internet. These tools may be
invoked using the guild program.
$ guild compile -o foo.go foo.scm
wrote `foo.go'
This program used to be called guile-tools up to
Guile version 2.0.1, and for backward
compatibility it may still be called as such. However, we changed the
name to guild, not only because it is pleasantly shorter and
easier to read, but also because this tool will serve to bind Guile
wizards together, by allowing hackers to share code with each other
using a CPAN-like system.
See Compilation, for more on guild compile.
A complete list of guild scripts can be had by invoking guild
list, or simply guild.
At some point, you will probably want to share your code with other people. To do so effectively, it is important to follow a set of common conventions, to make it easy for the user to install and use your package.
The first thing to do is to install your Scheme files where Guile can find them. When Guile goes to find a Scheme file, it will search a load path to find the file: first in Guile's own path, then in paths for site packages. A site package is any Scheme code that is installed and not part of Guile itself. See Load Paths, for more on load paths.
There are several site paths, for historical reasons, but the one that
should generally be used can be obtained by invoking the
%site-dir procedure. See Build Config. If Guile
2.0 is installed on your system in /usr/,
then (%site-dir) will be
/usr/share/guile/site/2.0. Scheme files
should be installed there.
If you do not install compiled .go files, Guile will compile your
modules and programs when they are first used, and cache them in the
user's home directory. See Compilation, for more on
auto-compilation. However, it is better to compile the files before
they are installed, and to just copy the files to a place where Guile can
find them.
As with Scheme files, Guile searches a path to find compiled .go
files, the %load-compiled-path. By default, this path has two
entries: a path for Guile's files, and a path for site packages. You
should install your .go files into the latter. Currently there
is no procedure to get at this path, which is probably a bug. As in the
previous example, if Guile 2.0 is installed on
your system in /usr/, then the place to put compiled files for
site packages will be
/usr/lib/guile/2.0/site-ccache.
Note that a .go file will only be loaded in preference to a
.scm file if it is newer. For that reason, you should install
your Scheme files first, and your compiled files second. See Load
Paths, for more on the loading process.
Finally, although this section is only about Scheme, sometimes you need
to install C extensions too. Shared libraries should be installed in
the extensions dir. This value can be had from the build config
(see Build Config). Again, if Guile 2.0 is
installed on your system in /usr/, then the extensions dir will
be /usr/lib/guile/2.0/extensions.
This part of the manual explains the general concepts that you need to understand when interfacing to Guile from C. You will learn about how the latent typing of Scheme is embedded into the static typing of C, how the garbage collection of Guile is made available to C code, and how continuations influence the control flow in a C program.
This knowledge should make it straightforward to add new functions to Guile that can be called from Scheme. Adding new data types is also possible and is done by defining smobs.
The Programming Overview section of this part contains general musings and guidelines about programming with Guile. It explores different ways to design a program around Guile, or how to embed Guile into existing programs.
For a pedagogical yet detailed explanation of how the data representation of Guile is implemented, see Data Representation. You don't need to know the details given there to use Guile from C, but they are useful when you want to modify Guile itself or when you are just curious about how it is all done.
For detailed reference information on the variables, functions, etc. that make up Guile's application programming interface (API), see API Reference.
Guile provides strong API and ABI stability guarantees during stable series, so that if a user writes a program against Guile version 2.0.3, it will be compatible with some future version 2.0.7. We say in this case that 2.0 is the effective version, composed of the major and minor versions, in this case 2 and 0.
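As a small illustration of the idea, the following sketch checks whether two full version strings share an effective version. The helper names are hypothetical and are not part of libguile; the code only assumes version strings of the form MAJOR.MINOR.MICRO.

```c
#include <stdio.h>

/* Hypothetical helper, not part of libguile: read MAJOR and MINOR from
   a string of the form "MAJOR.MINOR[.MICRO]".  Returns 1 on success.  */
static int
parse_effective_version (const char *version, int *major, int *minor)
{
  return sscanf (version, "%d.%d", major, minor) == 2;
}

/* Return 1 if A and B have the same effective version (same major and
   minor numbers), 0 otherwise or if either string is malformed.  */
int
same_effective_version (const char *a, const char *b)
{
  int a_major, a_minor, b_major, b_minor;

  if (!parse_effective_version (a, &a_major, &a_minor)
      || !parse_effective_version (b, &b_major, &b_minor))
    return 0;
  return a_major == b_major && a_minor == b_minor;
}
```

Under this sketch, "2.0.3" and "2.0.7" compare as the same effective version, while "2.0.5" and "2.2.0" do not.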
Users may install multiple effective versions of Guile, with each version's headers, libraries, and Scheme files under their own directories. This provides the necessary stability guarantee for users, while also allowing Guile developers to evolve the language and its implementation.
However, parallel installability does have a down-side, in that users
need to know which version of Guile to ask for, when they build against
Guile. Guile solves this problem by installing a file to be read by the
pkg-config utility, a tool to query installed packages by name.
Guile encodes the version into its pkg-config name, so that users can
ask for guile-2.0 or guile-2.2, as appropriate.
For effective version 2.0, for example, you would
invoke pkg-config --cflags --libs guile-2.0
to get the compilation and linking flags necessary to link to version
2.0 of Guile. You would typically run
pkg-config during the configuration phase of your program and use
the obtained information in the Makefile.
Guile's pkg-config file,
guile-2.0.pc, defines additional useful
variables:
sitedir
extensiondir
See the pkg-config man page, for more information, or its web
site, http://pkg-config.freedesktop.org/.
See Autoconf Support, for more on checking for Guile from within a
configure.ac file.
This section covers the mechanics of linking your program with Guile on a typical POSIX system.
The header file <libguile.h> provides declarations for all of
Guile's functions and constants. You should #include it at the
head of any C source file that uses identifiers described in this
manual. Once you've compiled your source files, you need to link them
against the Guile object code library, libguile.
As noted in the previous section, <libguile.h> is not in the
default search path for headers. The following command lines give
respectively the C compilation and link flags needed to build programs
using Guile 2.0:
pkg-config guile-2.0 --cflags
pkg-config guile-2.0 --libs
To initialize Guile, you can use one of several functions. The first,
scm_with_guile, is the most portable way to initialize Guile. It
will initialize Guile when necessary and then call a function that you
can specify. Multiple threads can call scm_with_guile
concurrently and it can also be called more than once in a given thread.
The global state of Guile will survive from one call of
scm_with_guile to the next. Your function is called from within
scm_with_guile since the garbage collector of Guile needs to know
where the stack of each thread is.
A second function, scm_init_guile, initializes Guile for the
current thread. When it returns, you can use the Guile API in the
current thread. This function employs some non-portable magic to learn
about stack bounds and might thus not be available on all platforms.
One common way to use Guile is to write a set of C functions which
perform some useful task, make them callable from Scheme, and then link
the program with Guile. This yields a Scheme interpreter just like
guile, but augmented with extra functions for some specific
application — a special-purpose scripting language.
In this situation, the application should probably process its
command-line arguments in the same manner as the stock Guile
interpreter. To make that straightforward, Guile provides the
scm_boot_guile and scm_shell functions.
For more about these functions, see Initialization.
Here is simple-guile.c, source code for a main and an
inner_main function that will produce a complete Guile
interpreter.
/* simple-guile.c --- how to start up the Guile
   interpreter from C code.  */

/* Get declarations for all the scm_ functions.  */
#include <libguile.h>

static void
inner_main (void *closure, int argc, char **argv)
{
  /* module initializations would go here */
  scm_shell (argc, argv);
}

int
main (int argc, char **argv)
{
  scm_boot_guile (argc, argv, inner_main, 0);
  return 0; /* never reached */
}
The main function calls scm_boot_guile to initialize
Guile, passing it inner_main. Once scm_boot_guile is
ready, it invokes inner_main, which calls scm_shell to
process the command-line arguments in the usual way.
Here is a Makefile which you can use to compile the above program. It
uses pkg-config to learn about the necessary compiler and
linker flags.
# Use GCC, if you have it installed.
CC=gcc
# Tell the C compiler where to find <libguile.h>
CFLAGS=`pkg-config --cflags guile-2.0`
# Tell the linker what libraries to use and where to find them.
LIBS=`pkg-config --libs guile-2.0`
simple-guile: simple-guile.o
	${CC} simple-guile.o ${LIBS} -o simple-guile
simple-guile.o: simple-guile.c
	${CC} -c ${CFLAGS} simple-guile.c
If you are using the GNU Autoconf package to make your application more
portable, Autoconf will settle many of the details in the Makefile above
automatically, making it much simpler and more portable; we recommend
using Autoconf with Guile. Here is a configure.ac file for
simple-guile that uses the standard PKG_CHECK_MODULES
macro to check for Guile. Autoconf will process this file into a
configure script. We recommend invoking Autoconf via the
autoreconf utility.
AC_INIT(simple-guile.c)
# Find a C compiler.
AC_PROG_CC
# Check for Guile
PKG_CHECK_MODULES([GUILE], [guile-2.0])
# Generate a Makefile, based on the results.
AC_OUTPUT(Makefile)
Run autoreconf -vif to generate configure.
Here is a Makefile.in template, from which the configure
script produces a Makefile customized for the host system:
# The configure script fills in these values.
CC=@CC@
CFLAGS=@GUILE_CFLAGS@
LIBS=@GUILE_LIBS@
simple-guile: simple-guile.o
	${CC} simple-guile.o ${LIBS} -o simple-guile
simple-guile.o: simple-guile.c
	${CC} -c ${CFLAGS} simple-guile.c
The developer should use Autoconf to generate the configure script from the configure.ac template, and distribute configure with the application. Here's how a user might go about building the application:
$ ls
Makefile.in configure* configure.ac simple-guile.c
$ ./configure
checking for gcc... ccache gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether ccache gcc accepts -g... yes
checking for ccache gcc option to accept ISO C89... none needed
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for GUILE... yes
configure: creating ./config.status
config.status: creating Makefile
$ make
[...]
$ ./simple-guile
guile> (+ 1 2 3)
6
guile> (getpwnam "jimb")
#("jimb" "83Z7d75W2tyJQ" 4008 10 "Jim Blandy" "/u/jimb"
"/usr/local/bin/bash")
guile> (exit)
$
The previous section has briefly explained how to write programs that
make use of an embedded Guile interpreter. But sometimes, all you
want to do is make new primitive procedures and data types available
to the Scheme programmer. Writing a new version of guile is
inconvenient in this case and it would in fact make the life of the
users of your new features needlessly hard.
For example, suppose that there is a program guile-db that is a
version of Guile with additional features for accessing a database.
People who want to write Scheme programs that use these features would
have to use guile-db instead of the usual guile program.
Now suppose that there is also a program guile-gtk that extends
Guile with access to the popular Gtk+ toolkit for graphical user
interfaces. People who want to write GUIs in Scheme would have to use
guile-gtk. Now, what happens when you want to write a Scheme
application that uses a GUI to let the user access a database? You
would have to write a third program that incorporates both the
database stuff and the GUI stuff. This might not be easy (because
guile-gtk might be a quite obscure program, say) and taking this
example further makes it easy to see that this approach cannot work in
practice.
It would have been much better if both the database features and the GUI
features had been provided as libraries that can just be linked with
guile. Guile makes it easy to do just this, and we encourage you
to make your extensions to Guile available as libraries whenever
possible.
You write the new primitive procedures and data types in the normal fashion, and link them into a shared library instead of into a stand-alone program. The shared library can then be loaded dynamically by Guile.
This section explains how to make the Bessel functions of the C library
available to Scheme. First we need to write the appropriate glue code
to convert the arguments and return values of the functions from Scheme
to C and back. Additionally, we need a function that will add them to
the set of Guile primitives. Because this is just an example, we will
only implement this for the j0 function.
Consider the following file bessel.c.
#include <math.h>
#include <libguile.h>

SCM
j0_wrapper (SCM x)
{
  return scm_from_double (j0 (scm_to_double (x)));
}

void
init_bessel ()
{
  scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
}
This C source file needs to be compiled into a shared library. Here is how to do it on GNU/Linux:
gcc `pkg-config --cflags guile-2.0` \
-shared -o libguile-bessel.so -fPIC bessel.c
For creating shared libraries portably, we recommend the use of GNU Libtool (see Introduction).
A shared library can be loaded into a running Guile process with the
function load-extension. In addition to the name of the
library to load, this function also expects the name of a function from
that library that will be called to initialize it. For our example,
we are going to call the function init_bessel which will make
j0_wrapper available to Scheme programs with the name
j0. Note that we do not specify a filename extension such as
.so when invoking load-extension. The right extension for
the host platform will be provided automatically.
(load-extension "libguile-bessel" "init_bessel")
(j0 2)
⇒ 0.223890779141236
For this to work, load-extension must be able to find
libguile-bessel, of course. It will look in the places that
are usual for your operating system, and it will additionally look
into the directories listed in the LTDL_LIBRARY_PATH
environment variable.
To see how these Guile extensions via shared libraries relate to the module system, see Putting Extensions into Modules.
When you want to embed the Guile Scheme interpreter into your program or library, you need to link it against the libguile library (see Linking Programs With Guile). Once you have done this, your C code has access to a number of data types and functions that can be used to invoke the interpreter, or make new functions that you have written in C available to be called from Scheme code, among other things.
Scheme is different from C in a number of significant ways, and Guile tries to make the advantages of Scheme available to C as well. Thus, in addition to a Scheme interpreter, libguile also offers dynamic types, garbage collection, continuations, arithmetic on arbitrary sized numbers, and other things.
The two fundamental concepts are dynamic types and garbage collection. You need to understand how libguile offers them to C programs in order to use the rest of libguile. Also, the more general control flow of Scheme caused by continuations needs to be dealt with.
Asynchronous signal handlers and multi-threading are already familiar to C programmers, but there are of course a few additional rules when using them together with libguile.
Scheme is a dynamically-typed language; this means that the system cannot, in general, determine the type of a given expression at compile time. Types only become apparent at run time. Variables do not have fixed types; a variable may hold a pair at one point, an integer at the next, and a thousand-element vector later. Instead, values, not variables, have fixed types.
In order to implement standard Scheme functions like pair? and
string? and provide garbage collection, the representation of
every value must contain enough information to accurately determine its
type at run time. Often, Scheme systems also use this information to
determine whether a program has attempted to apply an operation to an
inappropriately typed value (such as taking the car of a string).
Because variables, pairs, and vectors may hold values of any type, Scheme implementations use a uniform representation for values — a single type large enough to hold either a complete value or a pointer to a complete value, along with the necessary typing information.
In Guile, this uniform representation of all Scheme values is the C type
SCM. This is an opaque type and its size is typically equivalent
to that of a pointer to void. Thus, SCM values can be
passed around efficiently and they take up reasonably little storage on
their own.
The most important rule is: You never access a SCM value
directly; you only pass it to functions or macros defined in libguile.
As an obvious example, although a SCM variable can contain
integers, you can of course not compute the sum of two SCM values
by adding them with the C + operator. You must use the libguile
function scm_sum.
Less obvious and therefore more important to keep in mind is that you
also cannot directly test SCM values for trueness. In Scheme,
the value #f is considered false and of course a SCM
variable can represent that value. But there is no guarantee that the
SCM representation of #f looks false to C code as well.
You need to use scm_is_true or scm_is_false to test a
SCM value for trueness or falseness, respectively.
You also cannot directly compare two SCM values to find out
whether they are identical (that is, whether they are eq? in
Scheme terms). You need to use scm_is_eq for this.
The one exception is that you can directly assign a SCM value to
a SCM variable by using the C = operator.
The following (contrived) example shows how to do it right. It implements a function of two arguments (a and flag) that returns a+1 if flag is true, else it returns a unchanged.
SCM
my_incrementing_function (SCM a, SCM flag)
{
  SCM result;

  if (scm_is_true (flag))
    result = scm_sum (a, scm_from_int (1));
  else
    result = a;

  return result;
}
Often, you need to convert between SCM values and appropriate C
values. For example, we needed to convert the integer 1 to its
SCM representation in order to add it to a. Libguile
provides many functions to do these conversions, both from C to
SCM and from SCM to C.
The conversion functions follow a common naming pattern: those that make
a SCM value from a C value have names of the form
scm_from_type (...) and those that convert a SCM
value to a C value use the form scm_to_type (...).
However, it is best to avoid converting values when you can. When you
must combine C values and SCM values in a computation, it is
often better to convert the C values to SCM values and do the
computation by using libguile functions than to do it the other way around
(converting SCM to C and doing the computation some other way).
As a simple example, consider this version of
my_incrementing_function from above:
SCM
my_other_incrementing_function (SCM a, SCM flag)
{
  int result;

  if (scm_is_true (flag))
    result = scm_to_int (a) + 1;
  else
    result = scm_to_int (a);

  return scm_from_int (result);
}
This version is much less general than the original one: it will only
work for values of a that can fit into an int. The original
function will work for all values that Guile can represent and that
scm_sum can understand, including integers bigger than long
long, floating point numbers, complex numbers, and new numerical types
that have been added to Guile by third-party libraries.
Also, computing with SCM is not necessarily inefficient. Small
integers will be encoded directly in the SCM value, for example,
and do not need any additional memory on the heap. See Data Representation to find out the details.
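To make the idea of encoding small integers directly in a word concrete, here is a toy tagging scheme in plain C. It is only a sketch: the type toy_value, the tag choice, and all function names are assumptions made for illustration, and this is not Guile's actual SCM layout (which is described in Data Representation).

```c
#include <stdint.h>

/* Toy uniform value type: a single machine word.  NOT Guile's SCM.  */
typedef uintptr_t toy_value;

/* Low bit set: the remaining bits hold a small integer directly.
   Low bit clear: the word is a pointer to a heap object (assumed to be
   at least 2-byte aligned, so its low bit is always zero).  */
#define TOY_INT_TAG ((uintptr_t) 1)

static int
toy_is_small_int (toy_value v)
{
  return (v & TOY_INT_TAG) != 0;
}

static toy_value
toy_from_small_int (intptr_t n)
{
  /* Shift left to make room for the tag, then set the tag bit.  The
     value becomes 2*n + 1; no heap allocation is involved.  */
  return ((uintptr_t) n << 1) | TOY_INT_TAG;
}

static intptr_t
toy_to_small_int (toy_value v)
{
  /* Undo the encoding: (2*n + 1 - 1) / 2 == n, also for negative n.  */
  return ((intptr_t) v - 1) / 2;
}

static toy_value
toy_from_pointer (void *p)
{
  return (uintptr_t) p;
}

static void *
toy_to_pointer (toy_value v)
{
  return (void *) v;
}
```

The cost of such a scheme is one bit of integer range; the benefit is that small integers can be created, passed around, and tested for their type without touching the heap.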
Some special SCM values are available to C code without needing
to convert them from C values:
Scheme value    C representation
#f              SCM_BOOL_F
#t              SCM_BOOL_T
()              SCM_EOL
In addition to SCM, Guile also defines the related type
scm_t_bits. This is an unsigned integral type of sufficient
size to hold all information that is directly contained in a
SCM value. The scm_t_bits type is used internally by
Guile to do all the bit twiddling explained in Data Representation, but
you will encounter it occasionally in low-level user code as well.
As explained above, the SCM type can represent all Scheme values.
Some values fit entirely into a SCM value (such as small
integers), but other values require additional storage in the heap (such
as strings and vectors). This additional storage is managed
automatically by Guile. You don't need to explicitly deallocate it
when a SCM value is no longer used.
Two things must be guaranteed so that Guile is able to manage the storage automatically: it must know about all blocks of memory that have ever been allocated for Scheme values, and it must know about all Scheme values that are still being used. Given this knowledge, Guile can periodically free all blocks that have been allocated but are not used by any active Scheme values. This activity is called garbage collection.
It is easy for Guile to remember all blocks of memory that it has allocated for use by Scheme values, but you need to help it with finding all Scheme values that are in use by C code.
You do this when writing a SMOB mark function, for example
(see Garbage Collecting Smobs). By calling this function, the
garbage collector learns about all references that your SMOB has to
other SCM values.
Other references to SCM objects, such as global variables of type
SCM or other random data structures in the heap that contain
fields of type SCM, can be made visible to the garbage collector
by calling the functions scm_gc_protect_object or
scm_permanent_object. You normally use these functions for long
lived objects such as a hash table that is stored in a global variable.
For temporary references in local variables or function arguments, using
these functions would be too expensive.
These references are handled differently: Local variables (and function
arguments) of type SCM are automatically visible to the garbage
collector. This works because the collector scans the stack for
potential references to SCM objects and considers all referenced
objects to be alive. The scanning considers each and every word of the
stack, regardless of what it is actually used for, and then decides
whether it could possibly be a reference to a SCM object. Thus,
the scanning is guaranteed to find all actual references, but it might
also find words that only accidentally look like references. These
`false positives' might keep SCM objects alive that would
otherwise be considered dead. While this might waste memory, keeping an
object around longer than it strictly needs to is harmless. This is why
this technique is called “conservative garbage collection”. In
practice, the wasted memory seems to be no problem.
The stack of every thread is scanned in this way and the registers of the CPU and all other memory locations where local variables or function parameters might show up are included in this scan as well.
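The scanning described above can be sketched in a few lines of plain C. This is a toy model under stated assumptions, not Guile's collector: an array of words stands in for a thread's stack, a small table of blocks stands in for the heap, and the names toy_block and toy_conservative_scan are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record describing one allocated block.  */
struct toy_block
{
  uintptr_t start;   /* address of the first byte */
  size_t size;       /* size in bytes */
  int marked;        /* set to 1 when the block is considered alive */
};

/* Examine every word in WORDS, regardless of what it is used for, and
   mark each block that the word could possibly point into.  Returns
   the number of blocks marked.  */
size_t
toy_conservative_scan (const uintptr_t *words, size_t n_words,
                       struct toy_block *blocks, size_t n_blocks)
{
  size_t marked = 0;

  for (size_t b = 0; b < n_blocks; b++)
    blocks[b].marked = 0;

  for (size_t w = 0; w < n_words; w++)
    for (size_t b = 0; b < n_blocks; b++)
      {
        uintptr_t word = words[w];

        /* The word "looks like a reference" if it falls inside the
           block, even if it is really just an integer.  */
        if (!blocks[b].marked
            && word >= blocks[b].start
            && word - blocks[b].start < blocks[b].size)
          {
            blocks[b].marked = 1;
            marked++;
          }
      }

  return marked;
}
```

A word holding the plain integer 0x1008 keeps a block at 0x1000 alive in this model even though it was never meant as a pointer. That is exactly the kind of false positive the text describes: it may waste memory, but it never frees an object that is still in use.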
The consequence of the conservative scanning is that you can just
declare local variables and function parameters of type SCM and
be sure that the garbage collector will not free the corresponding
objects.
However, a local variable or function parameter is only protected as
long as it is really on the stack (or in some register). As an
optimization, the C compiler might reuse its location for some other
value and the SCM object would no longer be protected. Normally,
this leads to exactly the right behavior: the compiler will only
overwrite a reference when it is no longer needed and thus the object
becomes unprotected precisely when the reference disappears, just as
wanted.
There are situations, however, where a SCM object needs to be
around longer than its reference from a local variable or function
parameter. This happens, for example, when you retrieve some pointer
from a smob and work with that pointer directly. The reference to the
SCM smob object might be dead after the pointer has been
retrieved, but the pointer itself (and the memory pointed to) is still
in use and thus the smob object must be protected. The compiler does
not know about this connection and might overwrite the SCM
reference too early.
To get around this problem, you can use scm_remember_upto_here_1
and its cousins. These keep the compiler from overwriting the
reference. For a typical example of its use, see Remembering During Operations.
Scheme has a more general view of program flow than C, both locally and non-locally.
Controlling the local flow of control involves things like gotos, loops, calling functions and returning from them. Non-local control flow refers to situations where the program jumps across one or more levels of function activations without using the normal call or return operations.
The primitive means of C for local control flow is the goto
statement, together with if. Loops done with for,
while or do could in principle be rewritten with just
goto and if. In Scheme, the primitive means for local
control flow is the function call (together with if).
Thus, the repetition of some computation in a loop is ultimately
implemented by a function that calls itself, that is, by recursion.
This approach is theoretically very powerful since it is easier to reason formally about recursion than about gotos. In C, using recursion exclusively would not be practical, though, since it would eat up the stack very quickly. In Scheme, however, it is practical: function calls that appear in a tail position do not use any additional stack space (see Tail Calls).
A function call is in a tail position when it is the last thing the
calling function does. The value returned by the called function is
immediately returned from the calling function. In the following
example, the call to bar-1 is in a tail position, while the
call to bar-2 is not. (The call to 1- in foo-2
is in a tail position, though.)
(define (foo-1 x)
  (bar-1 (1- x)))

(define (foo-2 x)
  (1- (bar-2 x)))
Thus, when you take care to recurse only in tail positions, the recursion will only use constant stack space and will be as good as a loop constructed from gotos.
Scheme offers a few syntactic abstractions (do and named
let) that make writing loops slightly easier.
But only Scheme functions can call other functions in a tail position: C functions cannot. This matters when you have, say, two functions that call each other recursively to form a common loop. The following (unrealistic) example shows how one might go about determining whether a non-negative integer n is even or odd.
(define (my-even? n)
  (cond ((zero? n) #t)
        (else (my-odd? (1- n)))))

(define (my-odd? n)
  (cond ((zero? n) #f)
        (else (my-even? (1- n)))))
Because the calls to my-even? and my-odd? are in tail
positions, these two procedures can be applied to arbitrarily large
integers without overflowing the stack. (They will still take a lot
of time, of course.)
However, if one or both of the two procedures were rewritten in C, the C version could no longer call its companion in a tail position (since C does not have this concept). You might need to take this consideration into account when deciding which parts of your program to write in Scheme and which in C.
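If both procedures do end up in C, one possible workaround is to fuse the mutually recursive pair into a single loop that tracks which procedure it is currently "inside". The sketch below uses plain unsigned long arguments rather than SCM values, and the names my_even_p and my_odd_p are hypothetical.

```c
/* Loop-based rendering of my-even? and my-odd?.  Each iteration
   corresponds to one tail call between the two Scheme procedures, so
   the stack does not grow with N.  */
int
my_even_p (unsigned long n)
{
  /* 1 while we are conceptually in my-even?, 0 while in my-odd?.  */
  int in_even = 1;

  while (n != 0)
    {
      in_even = !in_even;   /* "call" the companion procedure */
      n--;                  /* with argument (1- n) */
    }

  /* At zero, my-even? answers #t and my-odd? answers #f.  */
  return in_even;
}

int
my_odd_p (unsigned long n)
{
  return !my_even_p (n);
}
```

This works only because both procedures are under your control and can be merged; it does not help when a C function needs to tail-call an arbitrary Scheme procedure.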
In addition to calling functions and returning from them, a Scheme program can also exit non-locally from a function so that the control flow returns directly to an outer level. This means that some functions might not return at all.
Moreover, a Scheme program can not only jump to some outer level of control; it can also jump back into the middle of a function that has already exited. This might cause some functions to return more than once.
In general, these non-local jumps are done by invoking
continuations that have previously been captured using
call-with-current-continuation. Guile also offers a slightly
restricted set of functions, catch and throw, that can
only be used for non-local exits. This restriction makes them more
efficient. Error reporting (with the function error) is
implemented by invoking throw, for example. The functions
catch and throw belong to the topic of exceptions.
Since Scheme functions can call C functions and vice versa, C code can
experience the more general control flow of Scheme as well. It is
possible that a C function will not return at all, or will return more
than once. While C does offer setjmp and longjmp for
non-local exits, it is still an unusual thing for C code. In
contrast, non-local exits are very common in Scheme, mostly to report
errors.
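For readers who have not used them, here is a minimal sketch of a non-local exit in plain C with setjmp and longjmp. It is only an analogy for what catch and throw do in Scheme; all names beginning with toy_ are hypothetical, and none of this is libguile API.

```c
#include <setjmp.h>

static jmp_buf toy_handler;      /* where toy_throw jumps to */
static int toy_error_code = 0;   /* set just before jumping */

/* Exit non-locally to the handler.  This function never returns.  */
static void
toy_throw (int code)
{
  toy_error_code = code;
  longjmp (toy_handler, 1);
}

static int
toy_checked_divide (int a, int b)
{
  if (b == 0)
    toy_throw (1);   /* the return statement below is skipped */
  return a / b;
}

/* On success, store the quotient in *RESULT and return 0.  If
   toy_checked_divide exits non-locally, leave *RESULT untouched and
   return the error code.  */
int
toy_catch_divide (int a, int b, int *result)
{
  if (setjmp (toy_handler) != 0)
    return toy_error_code;   /* control arrived here via longjmp */

  *result = toy_checked_divide (a, b);
  return 0;
}
```

Note that toy_checked_divide simply does not return when b is zero: control resumes inside toy_catch_divide instead. Any libguile function that signals an error can behave this way from the point of view of its C caller.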
You need to be prepared for the non-local jumps in the control flow
whenever you use a function from libguile: it is best to assume
that any libguile function might signal an error or run a pending
signal handler (which in turn can do arbitrary things).
It is often necessary to take cleanup actions when the control leaves a
function non-locally. Also, when the control returns non-locally, some
setup actions might be called for. For example, the Scheme function
with-output-to-port needs to modify the global state so that
current-output-port returns the port passed to
with-output-to-port. The global output port needs to be reset to
its previous value when with-output-to-port returns normally or
when it is exited non-locally. Likewise, the port needs to be set again
when control enters non-locally.
Scheme code can use the dynamic-wind function to arrange for
the setting and resetting of the global state. C code can use the
corresponding scm_internal_dynamic_wind function, or a
scm_dynwind_begin/scm_dynwind_end pair together with
suitable 'dynwind actions' (see Dynamic Wind).
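The exit half of this pattern can be imitated in plain C, which may help to show what the dynwind facilities have to arrange. The following is a toy sketch built on setjmp and longjmp, not libguile's implementation: every name starting with wind_ is hypothetical, an int stands in for a port, and re-entering an extent (which Scheme continuations allow) is not modelled at all.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Innermost active handler, or NULL when none is installed.  */
static jmp_buf *wind_handler = NULL;

/* Stand-in for the global state behind current-output-port.  */
static int wind_current_port = 0;

/* Exit non-locally to the innermost handler.  Never returns.  */
static void
wind_throw (void)
{
  if (wind_handler == NULL)
    abort ();
  longjmp (*wind_handler, 1);
}

/* Run BODY (DATA) with wind_current_port set to PORT.  The previous
   port is restored whether BODY returns normally or leaves through
   wind_throw.  */
void
wind_with_port (int port, void (*body) (void *), void *data)
{
  jmp_buf here;
  jmp_buf *volatile outer = wind_handler;
  volatile int saved_port = wind_current_port;

  if (setjmp (here) == 0)
    {
      wind_handler = &here;
      wind_current_port = port;          /* setup action */
      body (data);
      wind_handler = outer;
      wind_current_port = saved_port;    /* cleanup, normal return */
    }
  else
    {
      wind_handler = outer;
      wind_current_port = saved_port;    /* cleanup, non-local exit */
      wind_throw ();                     /* keep unwinding outward */
    }
}

/* Run BODY (DATA); return 1 if it exited through wind_throw, else 0.  */
int
wind_catch (void (*body) (void *), void *data)
{
  jmp_buf here;
  jmp_buf *volatile outer = wind_handler;

  if (setjmp (here) == 0)
    {
      wind_handler = &here;
      body (data);
      wind_handler = outer;
      return 0;
    }
  wind_handler = outer;
  return 1;
}

/* Example body: record the port seen inside the extent, then fail.  */
static void
wind_record_then_fail (void *data)
{
  *(int *) data = wind_current_port;
  wind_throw ();
}

/* Example body: redirect to port 7 around the failing body.  */
void
wind_redirected_failure (void *data)
{
  wind_with_port (7, wind_record_then_fail, data);
}
```

Calling wind_catch (wind_redirected_failure, &seen) reports a non-local exit, leaves 7 in seen, and leaves wind_current_port at its previous value. In real code you would not hand-roll this; the libguile facilities named above exist so that the bookkeeping is done consistently with Guile's own non-local exits.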
Instead of coping with non-local control flow, you can also prevent it
by erecting a continuation barrier (see Continuation Barriers). The function scm_c_with_continuation_barrier, for
example, is guaranteed to return exactly once.
You cannot call libguile functions from handlers for POSIX signals, but
you can register Scheme handlers for POSIX signals such as
SIGINT. These handlers do not run during the actual signal
delivery. Instead, they are run when the program (more precisely, the
thread that the handler has been registered for) reaches the next
safe point.
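The general technique of deferring work from signal delivery to a later safe point can be sketched in plain C as follows. This is an illustration of the idea only, not how libguile implements it; the toy_ names are hypothetical, and a simple counter stands in for running a Scheme handler.

```c
#include <signal.h>

/* Set by the real signal handler; the only thing it touches.  */
static volatile sig_atomic_t toy_pending = 0;

/* Updated only at safe points, never during signal delivery.  */
static int toy_handled_count = 0;

/* The C-level handler merely records that the signal arrived.  */
static void
toy_note_signal (int signum)
{
  (void) signum;
  toy_pending = 1;
}

/* Install the C-level handler for SIGINT.  Returns 0 on success.  */
int
toy_install (void)
{
  return signal (SIGINT, toy_note_signal) == SIG_ERR ? -1 : 0;
}

/* A safe point: if a signal is pending, run the deferred handler now,
   in ordinary program context where arbitrary work is allowed.  */
void
toy_safe_point (void)
{
  if (toy_pending)
    {
      toy_pending = 0;
      toy_handled_count++;   /* stands in for the Scheme handler */
    }
}
```

After toy_install, delivering SIGINT (for instance with raise) changes only the pending flag; the deferred work happens at the next call to toy_safe_point, and a second call with nothing pending does nothing.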
The libguile functions themselves have many such safe points.
Consequently, you must be prepared for arbitrary actions anytime you
call a libguile function. For example, even scm_cons can contain
a safe point and when a signal handler is pending for your thread,
calling scm_cons will run this handler and anything might happen,
including a non-local exit although scm_cons would not ordinarily
do such a thing on its own.
If you do not want to allow the running of asynchronous signal handlers,
you can block them temporarily with scm_dynwind_block_asyncs, for
example. See System asyncs.
Since signal handling in Guile relies on safe points, you need to make sure that your functions offer enough of them. Normally, calling libguile functions in the normal course of action is all that is needed. But when a thread might spend a long time in a code section that calls no libguile function, it is good to include explicit safe points. This can allow the user to interrupt your code with <C-c>, for example.
You can do this with the macro SCM_TICK. This macro is
syntactically a statement. That is, you could use it like this:
while (1)
  {
    SCM_TICK;
    do_some_work ();
  }
Frequent execution of a safe point is even more important in multi-threaded programs; see Multi-Threading.
Guile can be used in multi-threaded programs just as well as in single-threaded ones.
Each thread that wants to use functions from libguile must put itself into guile mode and must then follow a few rules. If it doesn't want to honor these rules in certain situations, a thread can temporarily leave guile mode (but can no longer use libguile functions during that time, of course).
Threads enter guile mode by calling scm_with_guile,
scm_boot_guile, or scm_init_guile. As explained in the
reference documentation for these functions, Guile will then learn about
the stack bounds of the thread and can protect the SCM values
that are stored in local variables. When a thread puts itself into
guile mode for the first time, it gets a Scheme representation and is
listed by all-threads, for example.
Threads in guile mode can block (e.g., do blocking I/O) without causing any
problems (2); temporarily
leaving guile mode with scm_without_guile before blocking slightly
improves GC performance, though. For some common blocking operations, Guile
provides convenience functions. For example, if you want to lock a pthread
mutex while in guile mode, you might want to use scm_pthread_mutex_lock
which is just like pthread_mutex_lock except that it leaves guile mode
while blocking.
All libguile functions are (intended to be) robust in the face of multiple threads using them concurrently. This means that there is no risk of the internal data structures of libguile becoming corrupted in such a way that the process crashes.
A program might still produce nonsensical results, though. Taking hashtables as an example, Guile guarantees that you can use them from multiple threads concurrently and a hashtable will always remain a valid hashtable and Guile will not crash when you access it. It does not guarantee, however, that inserting into it concurrently from two threads will give useful results: only one insertion might actually happen, none might happen, or the table might in general be modified in a totally arbitrary manner. (It will still be a valid hashtable, but not the one that you might have expected.) Guile might also signal an error when it detects a harmful race condition.
Thus, you need to put in additional synchronizations when multiple threads want to use a single hashtable, or any other mutable Scheme object.
When writing C code for use with libguile, you should try to make it robust as well. An example that converts a list into a vector will help to illustrate. Here is a correct version:
SCM
my_list_to_vector (SCM list)
{
  SCM vector = scm_make_vector (scm_length (list), SCM_UNDEFINED);
  size_t len, i;

  len = SCM_SIMPLE_VECTOR_LENGTH (vector);
  i = 0;
  while (i < len && scm_is_pair (list))
    {
      SCM_SIMPLE_VECTOR_SET (vector, i, SCM_CAR (list));
      list = SCM_CDR (list);
      i++;
    }

  return vector;
}
The first thing to note is that storing into a SCM location
concurrently from multiple threads is guaranteed to be robust: you don't
know which value wins but it will in any case be a valid SCM
value.
But there is no guarantee that the list referenced by list is not
modified in another thread while the loop iterates over it. Thus, while
copying its elements into the vector, the list might get longer or
shorter. For this reason, the loop must check both that it doesn't
overrun the vector (SCM_SIMPLE_VECTOR_SET does no range-checking)
and that it doesn't overrun the list (SCM_CAR and SCM_CDR
likewise do no type checking).
It is safe to use SCM_CAR and SCM_CDR on the local
variable list once it is known that the variable contains a pair.
The contents of the pair might change spontaneously, but it will always
stay a valid pair (and a local variable will of course not spontaneously
point to a different Scheme object).
Likewise, a simple vector such as the one returned by
scm_make_vector is guaranteed to always stay the same length so
that it is safe to only use SCM_SIMPLE_VECTOR_LENGTH once and store the
result. (In the example, vector is safe anyway since it is a
fresh object that no other thread can possibly know about until it is
returned from my_list_to_vector.)
Of course the behavior of my_list_to_vector is suboptimal when
list does indeed get asynchronously lengthened or shortened in
another thread. But it is robust: it will always return a valid vector.
That vector might be shorter than expected, or its last elements might
be unspecified, but it is a valid vector and if a program wants to rule
out these cases, it must avoid modifying the list asynchronously.
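The double bound check can be illustrated without libguile. The sketch below is a plain C analogue with hypothetical names (struct node, copy_list_to_array): it copies a singly linked list into a fixed-size array and stops as soon as either the array is full or the list ends, so a list whose length no longer matches the size of the array can never cause an overrun. It does not model concurrency itself; it only shows the shape of the loop.

```c
#include <stddef.h>

/* Hypothetical plain C analogue of my_list_to_vector; not libguile code.  */
struct node
{
  int value;
  struct node *next;
};

/* Copy at most LEN values from LIST into OUT.  The loop checks both the
   array bound and the end of the list, so it is safe even when the list
   is longer or shorter than LEN.  Returns the number of slots written;
   the remaining slots of OUT are left untouched.  */
size_t
copy_list_to_array (const struct node *list, int *out, size_t len)
{
  size_t i = 0;

  while (i < len && list != NULL)
    {
      out[i] = list->value;
      list = list->next;
      i++;
    }

  return i;
}
```

As with my_list_to_vector, a list that is shorter than expected leaves the last slots of the result unset, and a list that is longer is silently truncated.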
Here is another version that is also correct:
SCM
my_pedantic_list_to_vector (SCM list)
{
  SCM vector = scm_make_vector (scm_length (list), SCM_UNDEFINED);
  size_t len, i;

  len = SCM_SIMPLE_VECTOR_LENGTH (vector);
  i = 0;
  while (i < len)
    {
      SCM_SIMPLE_VECTOR_SET (vector, i, scm_car (list));
      list = scm_cdr (list);
      i++;
    }

  return vector;
}
This version uses the type-checking and thread-robust functions
scm_car and scm_cdr instead of the faster, but less robust
macros SCM_CAR and SCM_CDR. When the list is shortened
(that is, when list holds a non-pair), scm_car will throw
an error. This might be preferable to just returning a half-initialized
vector.
The API for accessing vectors and arrays of various kinds from C takes a slightly different approach to thread-robustness. In order to get at the raw memory that stores the elements of an array, you need to reserve that array as long as you need the raw memory. During the time an array is reserved, its elements can still spontaneously change their values, but the memory itself and other things like the size of the array are guaranteed to stay fixed. Any operation that would change these parameters of an array that is currently reserved will signal an error. In order to avoid these errors, a program should of course put suitable synchronization mechanisms in place. As you can see, Guile itself is again only concerned about robustness, not about correctness: without proper synchronization, your program will likely not be correct, but the worst consequence is an error message.
Real thread-safeness often requires that a critical section of code is executed in a certain restricted manner. A common requirement is that the code section is not entered a second time when it is already being executed. Locking a mutex while in that section ensures that no other thread will start executing it, blocking asyncs ensures that no asynchronous code enters the section again from the current thread, and the error checking of Guile mutexes guarantees that an error is signalled when the current thread accidentally reenters the critical section via recursive function calls.
Guile provides two mechanisms to support critical sections as outlined
above. You can either use the macros
SCM_CRITICAL_SECTION_START and SCM_CRITICAL_SECTION_END
for very simple sections; or use a dynwind context together with a
call to scm_dynwind_critical_section.
The macros only work reliably for critical sections that are guaranteed to not cause a non-local exit. They also do not detect an accidental reentry by the current thread. Thus, you should probably only use them to delimit critical sections that do not contain calls to libguile functions or to other external functions that might do complicated things.
The function scm_dynwind_critical_section, on the other hand,
will correctly deal with non-local exits because it requires a dynwind
context. Also, by using a separate mutex for each critical section,
it can detect accidental reentries.
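The reentry detection mentioned above can be modeled in a few lines of plain C. This is a single-threaded sketch with hypothetical names (struct section_guard, section_enter, section_leave, guarded_work), not the libguile implementation: it covers only the detection of an accidental reentry by the current thread, and says nothing about other threads, asyncs or non-local exits.

```c
/* Hypothetical single-threaded model of reentry detection; not the
   libguile implementation.  */
struct section_guard
{
  int busy;   /* non-zero while the section is being executed */
};

/* Try to enter the section.  Returns 0 on success, or -1 when the
   section is already being executed (an accidental reentry).  */
int
section_enter (struct section_guard *guard)
{
  if (guard->busy)
    return -1;
  guard->busy = 1;
  return 0;
}

/* Leave the section so that it can be entered again.  */
void
section_leave (struct section_guard *guard)
{
  guard->busy = 0;
}

/* A function that wrongly calls itself inside its critical section
   when DEPTH is positive.  Returns 0 if the whole call chain ran
   without a reentry, -1 otherwise.  */
int
guarded_work (struct section_guard *guard, int depth)
{
  int status = 0;

  if (section_enter (guard) != 0)
    return -1;
  if (depth > 0)
    status = guarded_work (guard, depth - 1);
  section_leave (guard);
  return status;
}
```

Using one guard per critical section, as this model does, is what makes the detection possible: a single shared guard could not tell a legitimate nested section from a reentry.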
Smobs are Guile's mechanism for adding new primitive types to the system. The term “smob” was coined by Aubrey Jaffer, who says it comes from “small object”, referring to the fact that they are quite limited in size: they can hold just one pointer to a larger memory block plus 16 extra bits.
To define a new smob type, the programmer provides Guile with some
essential information about the type — how to print it, how to
garbage collect it, and so on — and Guile allocates a fresh type tag
for it. The programmer can then use scm_c_define_gsubr to make
a set of C functions visible to Scheme code that create and operate on
these objects.
(You can find a complete version of the example code used in this
section in the Guile distribution, in doc/example-smob. That
directory includes a makefile and a suitable main function, so
you can build a complete interactive Guile shell, extended with the
datatypes described here.)
To define a new type, the programmer must write four functions to manage instances of the type:
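mark
Guile applies this function to each instance of the new type that it encounters during garbage collection. The function is responsible for telling the collector about any other SCM values that the object
has stored. The default smob mark function does nothing.
See Garbage Collecting Smobs, for more details.
free
Guile applies this function to each instance of the new type that is to be deallocated. The function should release all resources held by the object. The default free function frees the smob data (if the size passed to scm_make_smob_type is non-zero)
using scm_gc_free. See Garbage Collecting Smobs, for more
details.
This function operates while the heap is in an inconsistent state and
must therefore be careful. See Smobs, for details about what this
function is allowed to do.
print
Guile applies this function to each instance of the new type to print the value, as for display or write. The default print
function prints #<NAME ADDRESS> where NAME is the first
argument passed to scm_make_smob_type.
equalp
If Scheme code asks the equal? function to compare two instances
of the same smob type, Guile calls this function. It should return
SCM_BOOL_T if a and b should be considered
equal?, or SCM_BOOL_F otherwise. If equalp is
NULL, equal? will assume that two instances of this type are
never equal? unless they are eq?.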
To actually register the new smob type, call scm_make_smob_type.
It returns a value of type scm_t_bits which identifies the new
smob type.
The four special functions described above are registered by calling
one of scm_set_smob_mark, scm_set_smob_free,
scm_set_smob_print, or scm_set_smob_equalp, as
appropriate. Each function is intended to be used at most once per
type, and the call should be placed immediately following the call to
scm_make_smob_type.
There can be at most 256 different smob types in the system. Instead of registering a huge number of smob types (for example, one for each relevant C struct in your application), it is sometimes better to register just one and implement a second layer of type dispatching on top of it. This second layer might use the 16 extra bits to extend its type, for example.
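A second layer of dispatching might be organized around a table of functions indexed by a subtype number. The sketch below is a plain C model, not libguile code, and every name in it (struct object, SUBTYPE_TRIANGLE, corner_count and so on) is hypothetical. In real smob code the subtype would be kept in the 16 extra bits of the smob and the data would be its immediate word.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a second dispatch layer; not libguile code.  */
struct object
{
  uint16_t subtype;   /* plays the role of the 16 extra bits */
  void *data;         /* plays the role of the immediate word */
};

enum { SUBTYPE_TRIANGLE, SUBTYPE_SQUARE, SUBTYPE_COUNT };

typedef int (*corner_count_fn) (const struct object *);

int
triangle_corners (const struct object *o)
{
  (void) o;
  return 3;
}

int
square_corners (const struct object *o)
{
  (void) o;
  return 4;
}

/* One entry per subtype, indexed by the subtype number.  */
const corner_count_fn corner_count_table[SUBTYPE_COUNT] = {
  triangle_corners,
  square_corners
};

/* Second-layer dispatch: pick the implementation from the subtype.
   Returns -1 for an unknown subtype.  */
int
corner_count (const struct object *o)
{
  if (o->subtype >= SUBTYPE_COUNT)
    return -1;
  return corner_count_table[o->subtype] (o);
}
```

With this arrangement a single registered smob type can stand for many application structs, at the cost of one extra table lookup per operation.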
Here is how one might declare and register a new type representing eight-bit gray-scale images:
#include <libguile.h>

struct image {
  int width, height;
  char *pixels;

  /* The name of this image */
  SCM name;

  /* A function to call when this image is
     modified, e.g., to update the screen,
     or SCM_BOOL_F if no action necessary */
  SCM update_func;
};

static scm_t_bits image_tag;

void
init_image_type (void)
{
  image_tag = scm_make_smob_type ("image", sizeof (struct image));
  scm_set_smob_mark (image_tag, mark_image);
  scm_set_smob_free (image_tag, free_image);
  scm_set_smob_print (image_tag, print_image);
}
Normally, smobs can have one immediate word of data. This word
stores either a pointer to an additional memory block that holds the
real data, or it might hold the data itself when it fits. The word is
large enough for a SCM value, a pointer to void, or an
integer that fits into a size_t or ssize_t.
You can also create smobs that have two or three immediate words, and when these words suffice to store all data, it is more efficient to use these super-sized smobs instead of using a normal smob plus a memory block. See Double Smobs, for their discussion.
Guile provides functions for managing memory which are often helpful when implementing smobs. See Memory Blocks.
To retrieve the immediate word of a smob, you use the macro
SCM_SMOB_DATA. It can be set with SCM_SET_SMOB_DATA.
The 16 extra bits can be accessed with SCM_SMOB_FLAGS and
SCM_SET_SMOB_FLAGS.
The two macros SCM_SMOB_DATA and SCM_SET_SMOB_DATA treat
the immediate word as if it were of type scm_t_bits, which is
an unsigned integer type large enough to hold a pointer to
void. Thus you can use these macros to store arbitrary
pointers in the smob word.
When you want to store a SCM value directly in the immediate
word of a smob, you should use the macros SCM_SMOB_OBJECT and
SCM_SET_SMOB_OBJECT to access it.
Creating a smob instance can be tricky when it consists of multiple steps that allocate resources and might fail. It is recommended that you go about creating a smob in the following way:
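1. Allocate the memory block for holding the data with scm_gc_malloc.
2. Initialize it to a valid state without calling any functions that might cause a non-local exit. For example, initialize pointers to NULL. Also, do not store SCM values in it that must be protected.
Initialize these fields with SCM_BOOL_F.
A valid state is one that can be safely acted upon by the mark and free functions of your smob type.
3. Create the smob using SCM_NEWSMOB, passing it the initialized
memory block. (This step will always succeed.)
4. Complete the initialization of the memory block by, for example, allocating additional resources and making it point to them.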
This procedure ensures that the smob is in a valid state as soon as it
exists, that all resources that are allocated for the smob are
properly associated with it so that they can be properly freed, and
that no SCM values that need to be protected are stored in it
while the smob does not yet completely exist and thus can not protect
them.
Continuing the example from above, if the global variable
image_tag contains a tag returned by scm_make_smob_type,
here is how we could construct a smob whose immediate word contains a
pointer to a freshly allocated struct image:
SCM
make_image (SCM name, SCM s_width, SCM s_height)
{
  SCM smob;
  struct image *image;
  int width = scm_to_int (s_width);
  int height = scm_to_int (s_height);

  /* Step 1: Allocate the memory block.  */
  image = (struct image *)
    scm_gc_malloc (sizeof (struct image), "image");

  /* Step 2: Initialize it with straight code.  */
  image->width = width;
  image->height = height;
  image->pixels = NULL;
  image->name = SCM_BOOL_F;
  image->update_func = SCM_BOOL_F;

  /* Step 3: Create the smob.  */
  SCM_NEWSMOB (smob, image_tag, image);

  /* Step 4: Finish the initialization.  */
  image->name = name;
  image->pixels =
    scm_gc_malloc (width * height, "image pixels");

  return smob;
}
Let us look at what might happen when make_image is called.
The conversions of s_width and s_height to ints might
fail and signal an error, thus causing a non-local exit. This is not a
problem since no resources have been allocated yet that would have to be
freed.
The allocation of image in step 1 might fail, but this is likewise no problem.
Step 2 can not exit non-locally. At the end of it, the image
struct is in a valid state for the mark_image and
free_image functions (see below).
Step 3 can not exit non-locally either. This is guaranteed by Guile. After it, smob contains a valid smob that is properly initialized and protected, and in turn can properly protect the Scheme values in its image struct.
But before the smob is completely created, SCM_NEWSMOB might
cause the garbage collector to run. During this garbage collection, the
SCM values in the image struct would be invisible to Guile.
It only gets to know about them via the mark_image function, but
that function can not yet do its job since the smob has not been created
yet. Thus, it is important to not store SCM values in the
image struct until after the smob has been created.
Step 4, finally, might fail and cause a non-local exit. In that case,
the complete creation of the smob has not been successful, but it does
nevertheless exist in a valid state. It will eventually be freed by
the garbage collector, and all the resources that have been allocated
for it will be correctly freed by free_image.
Functions that operate on smobs should check that the passed
SCM value indeed is a suitable smob before accessing its data.
They can do this with scm_assert_smob_type.
For example, here is a simple function that operates on an image smob, and checks the type of its argument.
SCM
clear_image (SCM image_smob)
{
  int area;
  struct image *image;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  area = image->width * image->height;
  memset (image->pixels, 0, area);

  /* Invoke the image's update function.  */
  if (scm_is_true (image->update_func))
    scm_call_0 (image->update_func);

  scm_remember_upto_here_1 (image_smob);

  return SCM_UNSPECIFIED;
}
See Remembering During Operations for an explanation of the call
to scm_remember_upto_here_1.
Once a smob has been released to the tender mercies of the Scheme system, it must be prepared to survive garbage collection. Guile calls the mark and free functions of the smob to manage this.
As described in more detail elsewhere (see Conservative GC), every object in the Scheme system has a mark bit, which the garbage collector uses to tell live objects from dead ones. When collection starts, every object's mark bit is clear. The collector traces pointers through the heap, starting from objects known to be live, and sets the mark bit on each object it encounters. When it can find no more unmarked objects, the collector walks all objects, live and dead, frees those whose mark bits are still clear, and clears the mark bit on the others.
The two main portions of the collection are called the mark phase, during which the collector marks live objects, and the sweep phase, during which the collector frees all unmarked objects.
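The two phases can be made concrete with a toy collector. The sketch below is a plain C model with hypothetical names (struct obj, heap, new_obj, mark, sweep); it is far simpler than Guile's collector and is meant only to show how the mark bit is used. Note that mark sets the bit before it follows any pointer, so circular structures do not cause endless recursion.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy mark-and-sweep collector; a hypothetical model, not Guile's GC.  */
#define MAX_REFS 2

struct obj
{
  int marked;                  /* the mark bit */
  struct obj *refs[MAX_REFS];  /* outgoing pointers, NULL if unused */
  struct obj *next_in_heap;    /* links every allocated object */
};

struct obj *heap = NULL;       /* all objects, live and dead */

/* Allocate an object with a clear mark bit and add it to the heap.  */
struct obj *
new_obj (void)
{
  struct obj *o = calloc (1, sizeof *o);

  if (o == NULL)
    abort ();
  o->next_in_heap = heap;
  heap = o;
  return o;
}

/* Mark phase: set the mark bit on O and on everything reachable
   from it.  The bit is set before recursing.  */
void
mark (struct obj *o)
{
  int i;

  if (o == NULL || o->marked)
    return;
  o->marked = 1;
  for (i = 0; i < MAX_REFS; i++)
    mark (o->refs[i]);
}

/* Sweep phase: free unmarked objects and clear the mark bit on the
   others.  Returns the number of objects freed.  */
size_t
sweep (void)
{
  size_t freed = 0;
  struct obj **link = &heap;

  while (*link != NULL)
    {
      struct obj *o = *link;

      if (o->marked)
        {
          o->marked = 0;
          link = &o->next_in_heap;
        }
      else
        {
          *link = o->next_in_heap;
          free (o);
          freed++;
        }
    }

  return freed;
}
```

A collection in this model consists of calling mark on every root and then calling sweep once; any object that no root leads to is freed, even if it points at live objects itself.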
The mark bit of a smob lives in a special memory region. When the collector encounters a smob, it sets the smob's mark bit, and uses the smob's type tag to find the appropriate mark function for that smob. It then calls this mark function, passing it the smob as its only argument.
The mark function is responsible for marking any other Scheme
objects the smob refers to. If it does not do so, the objects' mark
bits will still be clear when the collector begins to sweep, and the
collector will free them. If this occurs, it will probably break, or at
least confuse, any code operating on the smob; the smob's SCM
values will have become dangling references.
To mark an arbitrary Scheme object, the mark function calls
scm_gc_mark.
Thus, here is how we might write mark_image:
SCM
mark_image (SCM image_smob)
{
  /* Mark the image's name and update function.  */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  scm_gc_mark (image->update_func);

  return SCM_BOOL_F;
}
Note that, even though the image's update_func could be an
arbitrarily complex structure (representing a procedure and any values
enclosed in its environment), scm_gc_mark will recurse as
necessary to mark all its components. Because scm_gc_mark sets
an object's mark bit before it recurses, it is not confused by
circular structures.
As an optimization, the collector will mark whatever value is returned by the mark function; this helps limit depth of recursion during the mark phase. Thus, the code above should really be written as:
SCM
mark_image (SCM image_smob)
{
  /* Mark the image's name and update function.  */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  return image->update_func;
}
Finally, when the collector encounters an unmarked smob during the sweep phase, it uses the smob's tag to find the appropriate free function for the smob. It then calls that function, passing it the smob as its only argument.
The free function must release any resources used by the smob.
However, it must not free objects managed by the collector; the
collector will take care of them. For historical reasons, the return
type of the free function should be size_t, an unsigned
integral type; the free function should always return zero.
Here is how we might write the free_image function for the image
smob type:
size_t
free_image (SCM image_smob)
{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_free (image->pixels,
               image->width * image->height,
               "image pixels");
  scm_gc_free (image, sizeof (struct image), "image");

  return 0;
}
During the sweep phase, the garbage collector will clear the mark bits on all live objects. The code which implements a smob need not do this itself.
There is no way for smob code to be notified when collection is complete.
It is usually a good idea to minimize the amount of processing done during garbage collection; keep the mark and free functions very simple. Since collections occur at unpredictable times, it is easy for any unusual activity to interfere with normal code.
It is often useful to define very simple smob types: smobs which have no data to mark, other than the cell itself, or smobs whose immediate data word is simply an ordinary Scheme object, to be marked recursively. Guile provides help for these common cases, which you can use in place of a hand-written mark function if your smob's structure is simple enough.
If the smob refers to no other Scheme objects, then no action is necessary; the garbage collector has already marked the smob cell itself. In that case, you can use zero as your mark function.
If the smob refers to exactly one other Scheme object via its first
immediate word, you can use scm_markcdr as its mark function.
Its definition is simply:
SCM
scm_markcdr (SCM obj)
{
  return SCM_SMOB_OBJECT (obj);
}
It's important that a smob is visible to the garbage collector whenever its contents are being accessed. Otherwise it could be freed while code is still using it.
For example, consider a procedure to convert image data to a list of pixel values.
SCM
image_to_list (SCM image_smob)
{
  struct image *image;
  SCM lst;
  int i;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  lst = SCM_EOL;
  for (i = image->width * image->height - 1; i >= 0; i--)
    lst = scm_cons (scm_from_char (image->pixels[i]), lst);

  scm_remember_upto_here_1 (image_smob);
  return lst;
}
In the loop, only the image pointer is used and the C compiler
has no reason to keep the image_smob value anywhere. If
scm_cons results in a garbage collection, image_smob might
not be on the stack or anywhere else and could be freed, leaving the
loop accessing freed data. The use of scm_remember_upto_here_1
prevents this, by creating a reference to image_smob after all
data accesses.
There's no need to do the same for lst, since that's the return
value and the compiler will certainly keep it in a register or
somewhere throughout the routine.
The clear_image example previously shown (see Type checking)
also used scm_remember_upto_here_1 for this reason.
It's only in quite rare circumstances that a missing
scm_remember_upto_here_1 will bite, but when it happens the
consequences are serious. Fortunately the rule is simple: whenever
calling a Guile library function or doing something that might, ensure
that the SCM of a smob is referenced past all accesses to its
insides. Do this by adding an scm_remember_upto_here_1 if
there are no other references.
In a multi-threaded program, the rule is the same. As far as a given thread is concerned, a garbage collection still only occurs within a Guile library function, not at an arbitrary time. (Guile waits for all threads to reach one of its library functions, and holds them there while the collector runs.)
Smobs are called smobs because they are small: they normally have only
room for one void* or SCM value plus 16 bits. The
reason for this is that smobs are directly implemented by using the
low-level, two-word cells of Guile that are also used to implement
pairs, for example. (See Data Representation for the
details.) One word of the two-word cells is used for
SCM_SMOB_DATA (or SCM_SMOB_OBJECT), the other contains
the 16-bit type tag and the 16 extra bits.
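Sharing one word between a type tag and extra bits is ordinary bit packing. The sketch below shows the idea in plain C under an assumed layout, with the tag in the low 16 bits and the extra bits directly above it. The layout and the names (cell_word, make_word, word_tag, word_flags, set_word_flags) are hypothetical; Guile's real cell layout may differ, and smob code should use SCM_SMOB_FLAGS and SCM_SET_SMOB_FLAGS rather than manipulate the word itself.

```c
#include <stdint.h>

/* Hypothetical layout: the low 16 bits hold the type tag, the next
   16 bits hold the extra flags.  Guile's real cell layout may differ.  */
typedef uint32_t cell_word;

/* Build a word from a type tag and 16 extra bits.  */
cell_word
make_word (uint16_t tag, uint16_t flags)
{
  return (cell_word) tag | ((cell_word) flags << 16);
}

/* Extract the type tag.  */
uint16_t
word_tag (cell_word w)
{
  return (uint16_t) (w & 0xFFFFu);
}

/* Extract the 16 extra bits.  */
uint16_t
word_flags (cell_word w)
{
  return (uint16_t) (w >> 16);
}

/* Return W with its extra bits replaced by FLAGS; the tag is kept.  */
cell_word
set_word_flags (cell_word w, uint16_t flags)
{
  return (w & 0xFFFFu) | ((cell_word) flags << 16);
}
```

Because the tag and the extra bits occupy disjoint parts of the word, changing the extra bits never disturbs the type of the object.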
In addition to the fundamental two-word cells, Guile also has
four-word cells, which are appropriately called double cells.
You can use them for double smobs and get two more immediate
words of type scm_t_bits.
A double smob is created with SCM_NEWSMOB2 or
SCM_NEWSMOB3 instead of SCM_NEWSMOB. Its immediate
words can be retrieved as scm_t_bits with
SCM_SMOB_DATA_2 and SCM_SMOB_DATA_3 in addition to
SCM_SMOB_DATA. Unsurprisingly, the words can be set to
scm_t_bits values with SCM_SET_SMOB_DATA_2 and
SCM_SET_SMOB_DATA_3.
Of course there are also SCM_SMOB_OBJECT_2,
SCM_SMOB_OBJECT_3, SCM_SET_SMOB_OBJECT_2, and
SCM_SET_SMOB_OBJECT_3.
Here is the complete text of the implementation of the image datatype, as presented in the sections above. We also provide a definition for the smob's print function, and make some objects and functions static, to clarify exactly what the surrounding code is using.
As mentioned above, you can find this code in the Guile distribution, in
doc/example-smob. That directory includes a makefile and a
suitable main function, so you can build a complete interactive
Guile shell, extended with the datatypes described here.
/* file "image-type.c" */

#include <stdlib.h>
#include <string.h>
#include <libguile.h>

static scm_t_bits image_tag;

struct image {
  int width, height;
  char *pixels;

  /* The name of this image */
  SCM name;

  /* A function to call when this image is
     modified, e.g., to update the screen,
     or SCM_BOOL_F if no action necessary */
  SCM update_func;
};

static SCM
make_image (SCM name, SCM s_width, SCM s_height)
{
  SCM smob;
  struct image *image;
  int width = scm_to_int (s_width);
  int height = scm_to_int (s_height);

  /* Step 1: Allocate the memory block.  */
  image = (struct image *)
    scm_gc_malloc (sizeof (struct image), "image");

  /* Step 2: Initialize it with straight code.  */
  image->width = width;
  image->height = height;
  image->pixels = NULL;
  image->name = SCM_BOOL_F;
  image->update_func = SCM_BOOL_F;

  /* Step 3: Create the smob.  */
  SCM_NEWSMOB (smob, image_tag, image);

  /* Step 4: Finish the initialization.  */
  image->name = name;
  image->pixels =
    scm_gc_malloc (width * height, "image pixels");

  return smob;
}

SCM
clear_image (SCM image_smob)
{
  int area;
  struct image *image;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  area = image->width * image->height;
  memset (image->pixels, 0, area);

  /* Invoke the image's update function.  */
  if (scm_is_true (image->update_func))
    scm_call_0 (image->update_func);

  scm_remember_upto_here_1 (image_smob);

  return SCM_UNSPECIFIED;
}

static SCM
mark_image (SCM image_smob)
{
  /* Mark the image's name and update function.  */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  return image->update_func;
}

static size_t
free_image (SCM image_smob)
{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_free (image->pixels,
               image->width * image->height,
               "image pixels");
  scm_gc_free (image, sizeof (struct image), "image");

  return 0;
}

static int
print_image (SCM image_smob, SCM port, scm_print_state *pstate)
{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_puts ("#<image ", port);
  scm_display (image->name, port);
  scm_puts (">", port);

  /* non-zero means success */
  return 1;
}

void
init_image_type (void)
{
  image_tag = scm_make_smob_type ("image", sizeof (struct image));
  scm_set_smob_mark (image_tag, mark_image);
  scm_set_smob_free (image_tag, free_image);
  scm_set_smob_print (image_tag, print_image);

  scm_c_define_gsubr ("clear-image", 1, 0, 0, clear_image);
  scm_c_define_gsubr ("make-image", 3, 0, 0, make_image);
}
Here is a sample build and interaction with the code from the example-smob directory, on the author's machine:
zwingli:example-smob$ make CC=gcc
gcc `pkg-config --cflags guile-2.0` -c image-type.c -o image-type.o
gcc `pkg-config --cflags guile-2.0` -c myguile.c -o myguile.o
gcc image-type.o myguile.o `pkg-config --libs guile-2.0` -o myguile
zwingli:example-smob$ ./myguile
guile> make-image
#<primitive-procedure make-image>
guile> (define i (make-image "Whistler's Mother" 100 100))
guile> i
#<image Whistler's Mother>
guile> (clear-image i)
guile> (clear-image 4)
ERROR: In procedure clear-image in expression (clear-image 4):
ERROR: Wrong type (expecting image): 4
ABORT: (wrong-type-arg)
Type "(backtrace)" to get more information.
guile>
When writing C code for use with Guile, you typically define a set of
C functions, and then make some of them visible to the Scheme world by
calling scm_c_define_gsubr or related functions. If you have
many functions to publish, it can sometimes be annoying to keep the
list of calls to scm_c_define_gsubr in sync with the list of
function definitions.
Guile provides the guile-snarf program to manage this problem.
Using this tool, you can keep all the information needed to define the
function alongside the function definition itself; guile-snarf
will extract this information from your source code, and automatically
generate a file of calls to scm_c_define_gsubr which you can
#include into an initialization function.
The snarfing mechanism works for many kinds of initialization actions,
not just for collecting calls to scm_c_define_gsubr. For a
full list of what can be done, see Snarfing Macros.
The guile-snarf program is invoked like this:
guile-snarf [-o outfile] [cpp-args ...]
This command will extract initialization actions to outfile.
When no outfile has been specified or when outfile is
-, standard output will be used. The C preprocessor is called
with cpp-args (which usually include an input file) and the
output is filtered to extract the initialization actions.
If there are errors during processing, outfile is deleted and the program exits with non-zero status.
During snarfing, the pre-processor macro SCM_MAGIC_SNARFER is
defined. You could use this to avoid including snarfer output files
that don't yet exist by writing code like this:
#ifndef SCM_MAGIC_SNARFER
#include "foo.x"
#endif
Here is how you might define the Scheme function clear-image,
implemented by the C function clear_image:
#include <libguile.h>

SCM_DEFINE (clear_image, "clear-image", 1, 0, 0,
            (SCM image_smob),
            "Clear the image.")
{
  /* C code to clear the image in image_smob... */
}

void
init_image_type ()
{
#include "image-type.x"
}
The SCM_DEFINE declaration says that the C function
clear_image implements a Scheme function called
clear-image, which takes one required argument (of type
SCM and named image_smob), no optional arguments, and no
rest argument. The string "Clear the image." provides a short
help text for the function; it is called a docstring.
The SCM_DEFINE macro also defines a static array of characters
initialized to the Scheme name of the function. In this case,
s_clear_image is set to the C string "clear-image". You might
want to use this symbol when generating error messages.
Assuming the text above lives in a file named image-type.c, you will need to execute the following command to prepare this file for compilation:
guile-snarf -o image-type.x image-type.c
This scans image-type.c for SCM_DEFINE
declarations, and writes to image-type.x the output:
scm_c_define_gsubr ("clear-image", 1, 0, 0, (SCM (*)() ) clear_image);
When compiled normally, SCM_DEFINE is a macro which expands to
the function header for clear_image.
Note that the output file name matches the #include from the
input file. Also, you still need to provide all the same information
you would if you were using scm_c_define_gsubr yourself, but you
can place the information near the function definition itself, so it is
less likely to become incorrect or out-of-date.
If you have many files that guile-snarf must process, you should
consider using a fragment like the following in your Makefile:
snarfcppopts = $(DEFS) $(INCLUDES) $(CPPFLAGS) $(CFLAGS)
.SUFFIXES: .x
.c.x:
	guile-snarf -o $@ $< $(snarfcppopts)
This tells make to run guile-snarf to produce each needed
.x file from the corresponding .c file.
The program guile-snarf passes its command-line arguments
directly to the C preprocessor, which it uses to extract the
information it needs from the source code. This means you can pass
normal compilation flags to guile-snarf to define preprocessor
symbols, add header file directories, and so on.
Guile is designed as an extension language interpreter that is straightforward to integrate with applications written in C (and C++). The big win here for the application developer is that Guile integration, as the Guile web page says, “lowers your project's hacktivation energy.” Lowering the hacktivation energy means that you, as the application developer, and your users, reap the benefits that flow from being able to extend the application in a high level extension language rather than in plain old C.
In abstract terms, it's difficult to explain what this really means and what the integration process involves, so instead let's begin by jumping straight into an example of how you might integrate Guile into an existing program, and what you could expect to gain by so doing. With that example under our belts, we'll then return to a more general analysis of the arguments involved and the range of programming options available.
Dia is a free software program for drawing schematic diagrams like flow charts and floor plans (http://www.gnome.org/projects/dia/). This section conducts the thought experiment of adding Guile to Dia. In so doing, it aims to illustrate several of the steps and considerations involved in adding Guile to applications in general.
First off, you should understand why you want to add Guile to Dia at all, and that means forming a picture of what Dia does and how it does it. So, what are the constituents of the Dia application?
(In other words, a textbook example of the model - view - controller paradigm.)
Next question: how will Dia benefit once the Guile integration is complete? Several (positive!) answers are possible here, and the choice is obviously up to the application developers. Still, one answer is that the main benefit will be the ability to manipulate Dia's application domain objects from Scheme.
Suppose that Dia made a set of procedures available in Scheme, representing the most basic operations on objects such as shapes, connectors, and so on. Using Scheme, the application user could then write code that builds upon these basic operations to create more complex procedures. For example, given basic procedures to enumerate the objects on a page, to determine whether an object is a square, and to change the fill pattern of a single shape, the user can write a Scheme procedure to change the fill pattern of all squares on the current page:
(define (change-squares'-fill-pattern new-pattern)
  (for-each-shape current-page
    (lambda (shape)
      (if (square? shape)
          (change-fill-pattern shape new-pattern)))))
Assuming this objective, four steps are needed to achieve it.
First, you need a way of representing your application-specific objects
— such as shape in the previous example — when they are
passed into the Scheme world. Unless your objects are so simple that
they map naturally into builtin Scheme data types like numbers and
strings, you will probably want to use Guile's SMOB interface to
create a new Scheme data type for your objects.
Second, you need to write code for the basic operations like
for-each-shape and square? such that they access and
manipulate your existing data structures correctly, and then make these
operations available as primitives on the Scheme level.
Third, you need to provide some mechanism within the Dia application that a user can hook into to cause arbitrary Scheme code to be evaluated.
Finally, you need to restructure your top-level application C code a little so that it initializes the Guile interpreter correctly and declares your SMOBs and primitives to the Scheme world.
The following subsections expand on these four points in turn.
For all but the most trivial applications, you will probably want to allow some representation of your domain objects to exist on the Scheme level. This is where the idea of SMOBs comes in, and with it issues of lifetime management and garbage collection.
To get more concrete about this, let's look again at the example we gave earlier of how application users can use Guile to build higher-level functions from the primitives that Dia itself provides.
(define (change-squares'-fill-pattern new-pattern)
  (for-each-shape current-page
    (lambda (shape)
      (if (square? shape)
          (change-fill-pattern shape new-pattern)))))
Consider what is stored here in the variable shape. For each
shape on the current page, the for-each-shape primitive calls
(lambda (shape) ...) with an argument representing that
shape. The question is: how is that argument represented on the Scheme
level? The issues are as follows.
Whatever the representation is, it has to be decodable again by the C
code for the square? and change-fill-pattern primitives. In
other words, a primitive like square? has somehow to be able to
turn the value that it receives back into something that points to the
underlying C structure describing a shape.
What happens if Scheme code stores a
shape in a global variable, but then that shape is deleted (in a
way that the Scheme code is not aware of), and later on some other
Scheme code uses that global variable again in a call to, say,
square??
When the shape argument passes
transiently in and out of the Scheme world, it would be quite wrong to
delete the underlying C shape just because the Scheme code has
finished evaluation. How do we avoid this happening?
One resolution of these issues is for the Scheme-level representation of
a shape to be a new, Scheme-specific C structure wrapped up as a SMOB.
The SMOB is what is passed into and out of Scheme code, and the
Scheme-specific C structure inside the SMOB points to Dia's underlying C
structure so that the code for primitives like square? can get at
it.
To cope with an underlying shape being deleted while Scheme code is still holding onto a Scheme shape value, the underlying C structure should have a new field that points to the Scheme-specific SMOB. When a shape is deleted, the relevant code chains through to the Scheme-specific structure and sets its pointer back to the underlying structure to NULL. Thus the SMOB value for the shape continues to exist, but any primitive code that tries to use it will detect that the underlying shape has been deleted because the underlying structure pointer is NULL.
So, to summarize the steps involved in this resolution of the problem
(and assuming that the underlying C structure for a shape is
struct dia_shape):
Define a new Scheme-specific structure that points to the underlying C
structure:
struct dia_guile_shape
{
  struct dia_shape * c_shape;   /* NULL => deleted */
};
Add a field to struct dia_shape that points to its struct
dia_guile_shape, if it has one:
struct dia_shape
{
  ...
  struct dia_guile_shape * guile_shape;
};
This lets C code set guile_shape->c_shape to NULL when the
underlying shape is deleted.
Wrap struct dia_guile_shape as a SMOB type.
In primitive code that receives a shape SMOB, check the
c_shape field when decoding it, to find out whether the
underlying C shape is still there.
As far as memory management is concerned, the SMOB values and their Scheme-specific structures are under the control of the garbage collector, whereas the underlying C structures are explicitly managed in exactly the same way that Dia managed them before we thought of adding Guile.
When the garbage collector decides to free a shape SMOB value, it calls
the SMOB free function that was specified when defining the shape
SMOB type. To maintain the correctness of the guile_shape field
in the underlying C structure, this function should chain through to the
underlying C structure (if it still exists) and set its
guile_shape field to NULL.
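As a rough illustration of the pointer bookkeeping described above, here is a minimal sketch in plain C. It deliberately leaves out the Guile API: malloc and free stand in for the garbage collector, and the function names dia_guile_shape_wrap, dia_shape_delete and dia_guile_shape_free, as well as the type field, are hypothetical names chosen for this example rather than anything defined by Dia or Guile.

```c
#include <assert.h>
#include <stdlib.h>

struct dia_guile_shape;

/* Stand-in for Dia's own shape structure.  Only the guile_shape
   field matters here; the type field is illustrative. */
struct dia_shape
{
  int type;
  struct dia_guile_shape *guile_shape;  /* NULL => no Scheme wrapper */
};

/* The Scheme-specific structure that a SMOB would carry. */
struct dia_guile_shape
{
  struct dia_shape *c_shape;            /* NULL => deleted */
};

/* Hypothetical helper: return the Scheme-side structure for SHAPE,
   creating it on first use and linking the two structures together.
   Returns NULL if allocation fails. */
static struct dia_guile_shape *
dia_guile_shape_wrap (struct dia_shape *shape)
{
  assert (shape != NULL);
  if (shape->guile_shape == NULL)
    {
      struct dia_guile_shape *gs = malloc (sizeof *gs);
      if (gs == NULL)
        return NULL;
      gs->c_shape = shape;
      shape->guile_shape = gs;
    }
  return shape->guile_shape;
}

/* Hypothetical helper: what Dia's own deletion code would do.  The
   wrapper survives, but it no longer points at a live shape. */
static void
dia_shape_delete (struct dia_shape *shape)
{
  if (shape->guile_shape != NULL)
    shape->guile_shape->c_shape = NULL;
  free (shape);
}

/* Hypothetical helper: the work a SMOB free function would do.  The
   underlying shape, if it still exists, forgets its wrapper. */
static void
dia_guile_shape_free (struct dia_guile_shape *gs)
{
  if (gs->c_shape != NULL)
    gs->c_shape->guile_shape = NULL;
  free (gs);
}
```

Whichever side goes away first clears the pointer held by the other side, so neither structure is ever left pointing at freed memory. In a real integration, the body of dia_guile_shape_free would belong in the free function registered for the shape SMOB type.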
For full documentation on defining and using SMOB types, see Defining New Types (Smobs).
Once the details of object representation are decided, writing the primitive function code that you need is usually straightforward.
A primitive is simply a C function whose arguments and return value are
all of type SCM, and whose body does whatever you want it to do.
As an example, here is a possible implementation of the square?
primitive:
static SCM square_p (SCM shape)
{
  struct dia_guile_shape * guile_shape;
  /* Check that arg is really a shape SMOB. */
  scm_assert_smob_type (shape_tag, shape);
  /* Access Scheme-specific shape structure.  SCM_SMOB_DATA yields
     a scm_t_bits value, so cast it back to the pointer type. */
  guile_shape = (struct dia_guile_shape *) SCM_SMOB_DATA (shape);
  /* Find out if underlying shape exists and is a
     square; return answer as a Scheme boolean. */
  return scm_from_bool (guile_shape->c_shape &&
                        (guile_shape->c_shape->type == DIA_SQUARE));
}
Notice how easy it is to chain through from the SCM shape
parameter that square_p receives — which is a SMOB — to the
Scheme-specific structure inside the SMOB, and thence to the underlying
C structure for the shape.
In this code, scm_assert_smob_type, SCM_SMOB_DATA, and
scm_from_bool are from the standard Guile API. We assume that
shape_tag was given to us when we made the shape SMOB type, using
scm_make_smob_type. The call to scm_assert_smob_type
ensures that shape is indeed a shape. This is needed to guard
against Scheme code using the square? procedure incorrectly, as
in (square? "hello"); Scheme's latent typing means that usage
errors like this must be caught at run time.
Having written the C code for your primitives, you need to make them
available as Scheme procedures by calling the scm_c_define_gsubr
function. scm_c_define_gsubr (see Primitive Procedures) takes arguments that
specify the Scheme-level name for the primitive and how many required,
optional and rest arguments it can accept. The square? primitive
always requires exactly one argument, so the call to make it available
in Scheme reads like this:
scm_c_define_gsubr ("square?", 1, 0, 0, square_p);
For where to put this call, see the subsection after next on the structure of Guile-enabled code (see Dia Structure).
To make the Guile integration useful, you have to design some kind of hook into your application that application users can use to cause their Scheme code to be evaluated.
Technically, this is straightforward; you just have to decide on a mechanism that is appropriate for your application. Think of Emacs, for example: when you type <ESC> :, you get a prompt where you can type in any Elisp code, which Emacs will then evaluate. Or, again like Emacs, you could provide a mechanism (such as an init file) to allow Scheme code to be associated with a particular key sequence, and evaluate the code when that key sequence is entered.
In either case, once you have the Scheme code that you want to evaluate
as a null-terminated string, you can tell Guile to evaluate it by
calling the scm_c_eval_string function.
Let's assume that the pre-Guile Dia code looks structurally like this:
main ()
    initialization, setup, and main loop
When you add Guile to a program, one (rather technical) requirement is
that Guile's garbage collector needs to know where the bottom of the C
stack is. The easiest way to ensure this is to use
scm_boot_guile like this:
main ()
    scm_boot_guile (argc, argv, inner_main, NULL)
inner_main ()
    declare SMOB types and export primitives using scm_c_define_gsubr
    run the code that used to be in main
In other words, you move the guts of what was previously in your
main function into a new function called inner_main, and
then add a scm_boot_guile call, with inner_main as a
parameter, to the end of main.
Assuming that you are using SMOBs and have written primitive code as
described in the preceding subsections, you also need to insert calls to
declare your new SMOBs and export the primitives to Scheme. These
declarations must happen inside the dynamic scope of the
scm_boot_guile call, but also before any code is run that
could possibly use them — the beginning of inner_main is an
ideal place for this.
The steps described so far implement an initial Guile integration that already gives a lot of additional power to Dia application users. But there are further steps that you could take, and it's interesting to consider a few of these.
In general, you could progressively move more of Dia's source code from C into Scheme. This might make the code more maintainable and extensible, and it could open the door to new programming paradigms that are tricky to effect in C but straightforward in Scheme.
A specific example of this is that you could use the guile-gtk package, which provides Scheme-level procedures for most of the Gtk+ library, to move the code that lays out and displays Dia objects from C to Scheme.
As you follow this path, it naturally becomes less useful to maintain a distinction between Dia's original non-Guile-related source code, and its later code implementing SMOBs and primitives for the Scheme world.
For example, suppose that the original source code had a
dia_change_fill_pattern function:
void dia_change_fill_pattern (struct dia_shape * shape,
                              struct dia_pattern * pattern)
{
  /* real pattern change work */
}
During initial Guile integration, you add a change_fill_pattern
primitive for Scheme purposes, which accesses the underlying structures
from its SMOB values and uses dia_change_fill_pattern to do the
real work:
SCM change_fill_pattern (SCM shape, SCM pattern)
{
  struct dia_shape * d_shape;
  struct dia_pattern * d_pattern;
  ...
  dia_change_fill_pattern (d_shape, d_pattern);
  return SCM_UNSPECIFIED;
}
At this point, it makes sense to keep dia_change_fill_pattern and
change_fill_pattern separate, because
dia_change_fill_pattern can also be called without going through
Scheme at all, say because the user clicks a button which causes a
C-registered Gtk+ callback to be called.
But, if the code for creating buttons and registering their callbacks is
moved into Scheme (using guile-gtk), it may become true that
dia_change_fill_pattern can no longer be called other than
through Scheme. In which case, it makes sense to abolish it and move
its contents directly into change_fill_pattern, like this:
SCM change_fill_pattern (SCM shape, SCM pattern)
{
  struct dia_shape * d_shape;
  struct dia_pattern * d_pattern;
  ...
  /* real pattern change work */
  return SCM_UNSPECIFIED;
}
So further Guile integration progressively reduces the amount of functional C code that you have to maintain over the long term.
A similar argument applies to data representation. In the discussion of SMOBs earlier, issues arose because of the different memory management and lifetime models that normally apply to data structures in C and in Scheme. However, with further Guile integration, you can resolve this issue in a more radical way by allowing all your data structures to be under the control of the garbage collector, and kept alive by references from the Scheme world. Instead of maintaining an array or linked list of shapes in C, you would instead maintain a list in Scheme.
Rather like the coalescing of dia_change_fill_pattern and
change_fill_pattern, the practical upshot of such a change is
that you would no longer have to keep the dia_shape and
dia_guile_shape structures separate, and so wouldn't need to
worry about the pointers between them. Instead, you could change the
SMOB definition to wrap the dia_shape structure directly, and
send dia_guile_shape off to the scrap yard. Cut out the middle
man!
Finally, we come to the holy grail of Guile's free software / extension language approach. Once you have a Scheme representation for interesting Dia data types like shapes, and a handy bunch of primitives for manipulating them, it suddenly becomes clear that you have a bundle of functionality that could have far-ranging use beyond Dia itself. In other words, the data types and primitives could now become a library, and Dia becomes just one of the many possible applications using that library — albeit, at this early stage, a rather important one!
In this model, Guile becomes just the glue that binds everything together. Imagine an application that usefully combined functionality from Dia, Gnumeric and GnuCash — it's tricky right now, because no such application yet exists; but it'll happen some day ...
Underlying Guile's value proposition is the assumption that programming in a high level language, specifically Guile's implementation of Scheme, is necessarily better in some way than programming in C. What do we mean by this claim, and how can we be so sure?
One class of advantages applies not only to Scheme, but more generally to any interpretable, high level, scripting language, such as Emacs Lisp, Python, Ruby, or TeX's macro language. Common features of all such languages, when compared to C, are that:
In the case of Scheme, particular features that make programming easier — and more fun! — are its powerful mechanisms for abstracting parts of programs (closures — see About Closure) and for iteration (see while do).
The evidence in support of this argument is empirical: the huge amount of code that has been written in extension languages for applications that support this mechanism. Most notable are extensions written in Emacs Lisp for GNU Emacs, in TeX's macro language for TeX, and in Script-Fu for the Gimp, but there is increasingly now a significant code eco-system for Guile-based applications as well, such as Lilypond and GnuCash. It is close to inconceivable that similar amounts of functionality could have been added to these applications just by writing new code in their base implementation languages.
As an example of what this means in practice, imagine writing a testbed for an application that is tested by submitting various requests (via a C interface) and validating the output received. Suppose further that the application keeps an idea of its current state, and that the “correct” output for a given request may depend on the current application state. A complete “white box”3 test plan for this application would aim to submit all possible requests in each distinguishable state, and validate the output for all request/state combinations.
To write all this test code in C would be very tedious. Suppose instead that the testbed code adds a single new C function, to submit an arbitrary request and return the response, and then uses Guile to export this function as a Scheme procedure. The rest of the testbed can then be written in Scheme, and so benefits from all the advantages of programming in Scheme that were described in the previous section.
(In this particular example, there is an additional benefit of writing most of the testbed in Scheme. A common problem for white box testing is that mistakes and mistaken assumptions in the application under test can easily be reproduced in the testbed code. It is more difficult to copy mistakes like this when the testbed is written in a different language from the application.)
The preceding arguments and example point to a model of Guile programming that is applicable in many cases. According to this model, Guile programming involves a balance between C and Scheme programming, with the aim being to extract the greatest possible Scheme level benefit from the least amount of C level work.
The C level work required in this model usually consists of packaging and exporting functions and application objects such that they can be seen and manipulated on the Scheme level. To help with this, Guile's C language interface includes utility features that aim to make this kind of integration very easy for the application developer. These features are documented later in this part of the manual: see REFFIXME.
This model, though, is really just one of a range of possible programming options. If all of the functionality that you need is available from Scheme, you could choose instead to write your whole application in Scheme (or one of the other high level languages that Guile supports through translation), and simply use Guile as an interpreter for Scheme. (In the future, we hope that Guile will also be able to compile Scheme code, so lessening the performance gap between C and Scheme code.) Or, at the other end of the C–Scheme scale, you could write the majority of your application in C, and only call out to Guile occasionally for specific actions such as reading a configuration file or executing a user-specified extension. The choices boil down to two basic questions:
These are of course design questions, and the right design for any given application will always depend upon the particular requirements that you are trying to meet. In the context of Guile, however, there are some generally applicable considerations that can help you when designing your answers.
Suppose, for the sake of argument, that you would prefer to write your whole application in Scheme. Then the API available to you consists of:
A module in the last category can either be a pure Scheme module — in
other words a collection of utility procedures coded in Scheme — or a
module that provides a Scheme interface to an extension library coded in
C — in other words a nice package where someone else has done the work
of wrapping up some useful C code for you. The set of available modules
is growing quickly and already includes such useful examples as
(gtk gtk), which makes Gtk+ drawing functions available in
Scheme, and (database postgres), which provides SQL access to a
Postgres database.
Given the growing collection of pre-existing modules, it is quite feasible that your application could be implemented by combining a selection of these modules together with new application code written in Scheme.
If this approach is not enough, because the functionality that your application needs is not already available in this form, and it is impossible to write the new functionality in Scheme, you will need to write some C code. If the required function is already available in C (e.g. in a library), all you need is a little glue to connect it to the world of Guile. If not, you need both to write the basic code and to plumb it into Guile.
In either case, two general considerations are important. Firstly, what is the interface by which the functionality is presented to the Scheme world? Does the interface consist only of function calls (for example, a simple drawing interface), or does it need to include objects of some kind that can be passed between C and Scheme and manipulated by both worlds? Secondly, how do the lifetime and memory management of objects in the C code relate to the garbage-collection-governed approach used for Scheme objects? In the case where the basic C code is not already written, most of the difficulties of memory management can be avoided by using Guile's C interface features from the start.
For the full documentation on writing C code for Guile and connecting existing C code to the Guile world, see REFFIXME.
So far we have considered what Guile programming means for an application developer. But what if you are instead using an existing Guile-based application, and want to know what your options are for programming and extending this application?
The answer to this question varies from one application to another, because the options available depend inevitably on whether the application developer has provided any hooks for you to hang your own code on and, if there are such hooks, what they allow you to do.4 For example...
In the last two cases, what you can do is, by definition, restricted by the application, and you should refer to the application's own manual to find out your options.
The most well known example of the first case is Emacs, with its extension language Emacs Lisp: as well as being a text editor, Emacs supports the loading and execution of arbitrary Emacs Lisp code. The result of such openness has been dramatic: Emacs now benefits from user-contributed Emacs Lisp libraries that extend the basic editing function to do everything from reading news to psychoanalysis and playing adventure games. The only limitation is that extensions are restricted to the functionality provided by Emacs's built-in set of primitive operations. For example, you can interact and display data by manipulating the contents of an Emacs buffer, but you can't pop-up and draw a window with a layout that is totally different to the Emacs standard.
This situation with a Guile application that supports the loading of arbitrary user code is similar, except perhaps even more so, because Guile also supports the loading of extension libraries written in C. This last point enables user code to add new primitive operations to Guile, and so to bypass the limitation present in Emacs Lisp.
At this point, the distinction between an application developer and an application user becomes rather blurred. Instead of seeing yourself as a user extending an application, you could equally well say that you are developing a new application of your own using some of the primitive functionality provided by the original application. As such, all the discussions of the preceding sections of this chapter are relevant to how you can proceed with developing your extension.
Autoconf, a part of the GNU build system, makes it easy for users to build your package. This section documents Guile's Autoconf support.
As explained in the GNU Autoconf Manual, any package needs configuration at build-time (see Introduction). If your package uses Guile (or uses a package that in turn uses Guile), you probably need to know what specific Guile features are available and details about them.
The way to do this is to write feature tests and arrange for their execution
by the configure script, typically by adding the tests to
configure.ac, and running autoconf to create configure.
Users of your package then run configure in the normal way.
Macros are a way to make common feature tests easy to express. Autoconf provides a wide range of macros (see Existing Tests), and Guile installation provides Guile-specific tests in the areas of: program detection, compilation flags reporting, and Scheme module checks.
As mentioned earlier in this chapter, Guile supports parallel
installation, and uses pkg-config to let the user choose which
version of Guile they are interested in. pkg-config has its own
set of Autoconf macros that are probably installed on almost every
development system. The most useful of these macros is
PKG_CHECK_MODULES.
PKG_CHECK_MODULES([GUILE], [guile-2.0])
This example looks for Guile and sets the GUILE_CFLAGS and
GUILE_LIBS variables accordingly, or prints an error and exits if
Guile was not found.
Guile comes with additional Autoconf macros providing more information,
installed as prefix/share/aclocal/guile.m4. Their names
all begin with GUILE_.
The GUILE_PROGS macro looks for the programs guile, guile-config and guile-tools, and sets the variables GUILE, GUILE_CONFIG and GUILE_TOOLS to their paths, respectively. If either of the first two is not found, it signals an error.
The variables are marked for substitution, as by AC_SUBST.
The GUILE_FLAGS macro runs the guile-config script, installed with Guile, to find out where Guile's header files and libraries are installed. It sets four variables: GUILE_CFLAGS, GUILE_LDFLAGS, GUILE_LIBS, and GUILE_LTLIBS.
GUILE_CFLAGS: flags to pass to a C or C++ compiler to build code that uses Guile header files. This is almost always just one or more -I flags.
GUILE_LDFLAGS: flags to pass to the compiler to link a program against Guile. This includes -lguile for the Guile library itself, any libraries that Guile itself requires (like -lqthreads), and so on. It may also include one or more -L flags to tell the compiler where to find the libraries. But it does not include flags that influence the program's runtime search path for libraries, and will therefore lead to a program that fails to start, unless all necessary libraries are installed in a standard location such as /usr/lib.
GUILE_LIBS and GUILE_LTLIBS: flags to pass to the compiler or to libtool, respectively, to link a program against Guile. These include flags that augment the program's runtime search path for libraries, so that shared libraries will be found at the location where they were during linking, even in non-standard locations. GUILE_LIBS is to be used when linking the program directly with the compiler, whereas GUILE_LTLIBS is to be used when linking the program is done through libtool.
The variables are marked for substitution, as by AC_SUBST.
The GUILE_SITE_DIR macro looks for Guile's "site" directory, usually something like PREFIX/share/guile/site, and sets the variable GUILE_SITE to the path. Note that the variable name is different from the macro name.
The variable is marked for substitution, as by AC_SUBST.
GUILE_CHECK (var, check): var is a shell variable name to be set to the return value. check is a Guile Scheme expression, evaluated with "$GUILE -c", and returning either 0 or non-#f to indicate the check passed. Non-0 number or #f indicates failure. Avoid using the character "#" since that confuses autoconf.
GUILE_MODULE_CHECK (var, module, featuretest, description): var is a shell variable name to be set to "yes" or "no". module is a list of symbols, like: (ice-9 common-list). featuretest is an expression acceptable to GUILE_CHECK, q.v. description is a present-tense verb phrase (passed to AC_MSG_CHECKING).
GUILE_MODULE_AVAILABLE (var, module): var is a shell variable name to be set to "yes" or "no". module is a list of symbols, like: (ice-9 common-list).
GUILE_MODULE_REQUIRED (symlist): symlist is a list of symbols, WITHOUT surrounding parens, like: ice-9 common-list.
GUILE_MODULE_EXPORTS (var, module, modvar): var is a shell variable to be set to "yes" or "no". module is a list of symbols, like: (ice-9 common-list). modvar is the Guile Scheme variable to check.
GUILE_MODULE_REQUIRED_EXPORT (module, modvar): module is a list of symbols, like: (ice-9 common-list). modvar is the Guile Scheme variable to check.
Using the autoconf macros is straightforward: Add the macro "calls" (actually
instantiations) to configure.ac, run aclocal, and finally,
run autoconf. If your system doesn't have guile.m4 installed, place
the desired macro definitions (AC_DEFUN forms) in acinclude.m4,
and aclocal will do the right thing.
Some of the macros can be used inside normal shell constructs: if foo ;
then GUILE_BAZ ; fi, but this is not guaranteed. It's probably a good idea
to instantiate macros at top-level.
We now include two examples, one simple and one complicated.
The first example is for a package that uses libguile, and thus needs to
know how to compile and link against it. So we use
PKG_CHECK_MODULES to set the vars GUILE_CFLAGS and
GUILE_LIBS, which are automatically substituted in the Makefile.
In configure.ac:
PKG_CHECK_MODULES([GUILE], [guile-2.0])
In Makefile.in:
GUILE_CFLAGS = @GUILE_CFLAGS@
GUILE_LIBS = @GUILE_LIBS@
myprog.o: myprog.c
	$(CC) -c -o $@ $(GUILE_CFLAGS) $<
myprog: myprog.o
	$(CC) -o $@ $< $(GUILE_LIBS)
The second example is for a package of Guile Scheme modules that uses an
external program and other Guile Scheme modules (some might call this a "pure
scheme" package). So we use the GUILE_SITE_DIR macro, a regular
AC_PATH_PROG macro, and the GUILE_MODULE_AVAILABLE macro.
In configure.ac:
GUILE_SITE_DIR
probably_wont_work=""
# pgtype pgtable
GUILE_MODULE_AVAILABLE(have_guile_pg, (database postgres))
test $have_guile_pg = no &&
    probably_wont_work="(my pgtype) (my pgtable) $probably_wont_work"
# gpgutils
AC_PATH_PROG(GNUPG,gpg)
test x"$GNUPG" = x &&
    probably_wont_work="(my gpgutils) $probably_wont_work"
if test ! "$probably_wont_work" = "" ; then
    p=" ***"
    echo
    echo "$p"
    echo "$p NOTE:"
    echo "$p The following modules probably won't work:"
    echo "$p $probably_wont_work"
    echo "$p They can be installed anyway, and will work if their"
    echo "$p dependencies are installed later. Please see README."
    echo "$p"
    echo
fi
In Makefile.in:
instdir = @GUILE_SITE@/my
install:
	$(INSTALL) my/*.scm $(instdir)
Guile provides an application programming interface (API) to developers in two core languages: Scheme and C. This part of the manual contains reference documentation for all of the functionality that is available through both Scheme and C interfaces.
Guile's application programming interface (API) makes functionality available that an application developer can use in either C or Scheme programming. The interface consists of elements that may be macros, functions or variables in C, and procedures, variables, syntax or other types of object in Scheme.
Many elements are available to both Scheme and C, in a form that is
appropriate. For example, the assq Scheme procedure is also
available as scm_assq to C code. These elements are documented
only once, addressing both the Scheme and C aspects of them.
The Scheme name of an element is related to its C name in a regular way. Also, a C function takes its parameters in a systematic way.
Normally, the name of a C function can be derived given its Scheme name, using some simple textual transformations:
Replace - (hyphen) with _ (underscore).
Replace ? (question mark) with _p.
Replace ! (exclamation point) with _x.
Replace -> with _to_.
Replace <= (less than or equal) with _leq.
Replace >= (greater than or equal) with _geq.
Replace < (less than) with _less.
Replace > (greater than) with _gr.
Prefix with scm_.
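The transformations listed above can be sketched as a small C helper. This is only an illustration of the rules as stated: scheme_name_to_c_name is a hypothetical function written for this example, not part of libguile, and since the rules hold only "normally", the result is not guaranteed to name an existing C function for every Scheme procedure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of libguile): write the C name
   derived from SCHEME_NAME into OUT, which holds OUT_SIZE bytes.
   Returns 0 on success, or -1 if OUT is too small. */
static int
scheme_name_to_c_name (const char *scheme_name, char *out, size_t out_size)
{
  /* Two-character patterns come first, so that "->", "<=" and ">="
     are matched before the single characters "-", "<" and ">". */
  static const struct { const char *from; const char *to; } rules[] = {
    { "->", "_to_" }, { "<=", "_leq" }, { ">=", "_geq" },
    { "-", "_" }, { "?", "_p" }, { "!", "_x" },
    { "<", "_less" }, { ">", "_gr" },
  };
  const size_t n_rules = sizeof rules / sizeof rules[0];
  size_t used = 0;

  assert (scheme_name != NULL && out != NULL);

  /* Start with the "scm_" prefix (4 characters plus the final NUL). */
  if (out_size < sizeof "scm_")
    return -1;
  memcpy (out, "scm_", 4);
  used = 4;

  while (*scheme_name != '\0')
    {
      const char *piece = scheme_name;  /* default: copy one character */
      size_t piece_len = 1;
      size_t consumed = 1;
      size_t i;

      for (i = 0; i < n_rules; i++)
        {
          size_t from_len = strlen (rules[i].from);
          if (strncmp (scheme_name, rules[i].from, from_len) == 0)
            {
              piece = rules[i].to;
              piece_len = strlen (piece);
              consumed = from_len;
              break;
            }
        }

      /* Keep one byte in reserve for the terminating NUL. */
      if (used + piece_len + 1 > out_size)
        return -1;
      memcpy (out + used, piece, piece_len);
      used += piece_len;
      scheme_name += consumed;
    }

  out[used] = '\0';
  return 0;
}
```

Applied to assq, the helper produces scm_assq, which matches the example given earlier in this section; applied to square?, it produces scm_square_p.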
A C function always takes a fixed number of arguments of type
SCM, even when the corresponding Scheme function takes a
variable number.
For some Scheme functions, the last few arguments are optional; the
corresponding C function must always be invoked with all optional
arguments specified. To get the same effect as leaving an argument
unspecified, pass SCM_UNDEFINED as its value. You cannot do
this for an argument in the middle; when one argument is
SCM_UNDEFINED, all the ones following it must be
SCM_UNDEFINED as well.
Some Scheme functions take an arbitrary number of rest arguments; the corresponding C function must be invoked with a list of all these arguments. This list is always the last argument of the C function.
These two variants can also be combined.
The type of the return value of a C function that corresponds to a
Scheme function is always SCM. In the descriptions below,
types are therefore often omitted for both the return value and the
arguments.
From time to time functions and other features of Guile become obsolete. Guile's deprecation mechanism can help you cope with this.
When you use a feature that is deprecated, you will likely get a warning
message at run-time. Also, if you have a new enough toolchain, using a
deprecated function from libguile will cause a link-time warning.
The primary source for information about just what interfaces are deprecated in a given release is the file NEWS. That file also documents what you should use instead of the obsoleted things.
The file README contains instructions on how to control the inclusion or removal of the deprecated features from the public API of Guile, and how to control the deprecation warning messages.
The idea behind this mechanism is that normally all deprecated interfaces are available, but you get feedback when compiling and running code that uses them, so that you can migrate to the newer APIs at your leisure.
Guile represents all Scheme values with the single C type SCM.
For an introduction to this topic, see Dynamic Types.
SCM is the user level abstract C type that is used to represent all of Guile's Scheme objects, no matter what the Scheme object type is. No C operation except assignment is guaranteed to work with variables of type SCM, so you should only use macros and functions to work with SCM values. Values are converted between C data types and the SCM type with utility functions and macros.
scm_t_bits is an unsigned integral data type that is guaranteed to be large enough to hold all information that is required to represent any Scheme object. While this data type is mostly used to implement Guile's internals, the use of this type is also necessary to write certain kinds of extensions to Guile.
Transforms the SCM value x into its representation as an integral type. Only after applying SCM_UNPACK is it possible to access the bits and contents of the SCM value.
Takes a valid integral representation of a Scheme object and transforms it into its representation as a SCM value.
Each thread that wants to use functions from the Guile API needs to
put itself into guile mode with either scm_with_guile or
scm_init_guile. The global state of Guile is initialized
automatically when the first thread enters guile mode.
When a thread wants to block outside of a Guile API function, it
should leave guile mode temporarily with scm_without_guile
(see Blocking).
Threads that are created by call-with-new-thread or
scm_spawn_thread start out in guile mode so you don't need to
initialize them.
Call func, passing it data and return what func returns. While func is running, the current thread is in guile mode and can thus use the Guile API.
When scm_with_guile is called from guile mode, the thread remains in guile mode when scm_with_guile returns.
Otherwise, it puts the current thread into guile mode and, if needed, gives it a Scheme representation that is contained in the list returned by all-threads, for example. This Scheme representation is not removed when scm_with_guile returns so that a given thread is always represented by the same Scheme value during its lifetime, if at all.
When this is the first thread that enters guile mode, the global state of Guile is initialized before calling func.
The function func is called via scm_with_continuation_barrier; thus, scm_with_guile returns exactly once.
When scm_with_guile returns, the thread is no longer in guile mode (except when scm_with_guile was called from guile mode, see above). Thus, only func can store SCM variables on the stack and be sure that they are protected from the garbage collector. See scm_init_guile for another approach at initializing Guile that does not have this restriction.
It is OK to call scm_with_guile while a thread has temporarily left guile mode via scm_without_guile. It will then simply temporarily enter guile mode again.
Arrange things so that all of the code in the current thread executes as if from within a call to scm_with_guile. That is, all functions called by the current thread can assume that SCM values on their stack frames are protected from the garbage collector (except when the thread has explicitly left guile mode, of course).
When scm_init_guile is called from a thread that already has been in guile mode once, nothing happens. This behavior matters when you call scm_init_guile while the thread has only temporarily left guile mode: in that case the thread will not be in guile mode after scm_init_guile returns. Thus, you should not use scm_init_guile in such a scenario.
When an uncaught throw happens in a thread that has been put into guile mode via scm_init_guile, a short message is printed to the current error port and the thread is exited via scm_pthread_exit (NULL). No restrictions are placed on continuations.
The function scm_init_guile might not be available on all platforms since it requires some stack-bounds-finding magic that might not have been ported to all platforms that Guile runs on. Thus, if you can, it is better to use scm_with_guile or its variation scm_boot_guile instead of this function.
Enter guile mode as with scm_with_guile and call main_func, passing it data, argc, and argv as indicated. When main_func returns, scm_boot_guile calls exit (0); scm_boot_guile never returns. If you want some other exit value, have main_func call exit itself. If you don't want to exit at all, use scm_with_guile instead of scm_boot_guile.
The function scm_boot_guile arranges for the Scheme command-line function to return the strings given by argc and argv. If main_func modifies argc or argv, it should call scm_set_program_arguments with the final list, so Scheme code will know which arguments have been processed (see Runtime Environment).
Process command-line arguments in the manner of the guile executable. This includes loading the normal Guile initialization files, interacting with the user or running any scripts or expressions specified by -s or -e options, and then exiting. See Invoking Guile, for more details.
Since this function does not return, you must do all application-specific initialization before calling this function.
The following macros do two different things: when compiled normally,
they expand in one way; when processed during snarfing, they cause the
guile-snarf program to pick up some initialization code
(see Function Snarfing).
The descriptions below use the term `normally' to refer to the case
when the code is compiled normally, and `while snarfing' when the code
is processed by guile-snarf.
Normally, SCM_SNARF_INIT expands to nothing; while snarfing, it causes code to be included in the initialization action file, followed by a semicolon.
This is the fundamental macro for snarfing initialization actions. The more specialized macros below use it internally.
Normally, this macro expands into
static const char s_c_name[] = scheme_name;
SCM c_name arglist
While snarfing, it causes
scm_c_define_gsubr (s_c_name, req, opt, var, c_name);
to be added to the initialization actions. Thus, you can use it to declare a C function named c_name that will be made available to Scheme with the name scheme_name.
Note that the arglist argument must have parentheses around it.
Normally, these macros expand into
static SCM c_name
or
SCM c_name
respectively. While snarfing, they both expand into the initialization code
c_name = scm_permanent_object (scm_from_locale_symbol (scheme_name));
Thus, you can use them to declare a static or global variable of type SCM that will be initialized to the symbol named scheme_name.
Normally, these macros expand into
static SCM c_name
or
SCM c_name
respectively. While snarfing, they both expand into the initialization code
c_name = scm_permanent_object (scm_c_make_keyword (scheme_name));
Thus, you can use them to declare a static or global variable of type SCM that will be initialized to the keyword named scheme_name.
These macros are equivalent to SCM_VARIABLE_INIT and SCM_GLOBAL_VARIABLE_INIT, respectively, with a value of SCM_BOOL_F.
Normally, these macros expand into
static SCM c_name
or
SCM c_name
respectively. While snarfing, they both expand into the initialization code
c_name = scm_permanent_object (scm_c_define (scheme_name, value));
Thus, you can use them to declare a static or global C variable of type SCM that will be initialized to the object representing the Scheme variable named scheme_name in the current module. The variable will be defined when it doesn't already exist. It is always set to value.
This chapter describes those of Guile's simple data types which are primarily used for their role as items of generic data. By simple we mean data types that are not primarily used as containers to hold other data — i.e. pairs, lists, vectors and so on. For the documentation of such compound data types, see Compound Data Types.
The two boolean values are #t for true and #f for false.
Boolean values are returned by predicate procedures, such as the general
equality predicates eq?, eqv? and equal?
(see Equality) and numerical and string comparison operators like
string=? (see String Comparison) and <=
(see Comparison).
(<= 3 8)
⇒ #t
(<= 3 -3)
⇒ #f
(equal? "house" "houses")
⇒ #f
(eq? #f #f)
⇒ #t
In test condition contexts like if and cond
(see Conditionals), where a group of subexpressions will be
evaluated only if a condition expression evaluates to “true”,
“true” means any value at all except #f.
(if #t "yes" "no")
⇒ "yes"
(if 0 "yes" "no")
⇒ "yes"
(if #f "yes" "no")
⇒ "no"
A result of this asymmetry is that typical Scheme source code more often
uses #f explicitly than #t: #f is necessary to
represent an if or cond false value, whereas #t is
not necessary to represent an if or cond true value.
It is important to note that #f is not equivalent to any
other Scheme value. In particular, #f is not the same as the
number 0 (like in C and C++), and not the same as the “empty list”
(like in some Lisp dialects).
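The point can be checked directly at the REPL. The following short sketch uses only standard procedures and shows that 0 and the empty list are distinct from #f and therefore count as true in a test.

```scheme
;; #f is a value of its own: it is neither the number 0 nor the empty list.
(eq? #f 0)             ; ⇒ #f
(eq? #f '())           ; ⇒ #f

;; Consequently both 0 and the empty list count as "true" in a test.
(if 0 "yes" "no")      ; ⇒ "yes"
(if '() "yes" "no")    ; ⇒ "yes"

;; Only #f itself selects the alternative branch.
(if (eq? 1 2) "yes" "no")   ; ⇒ "no"
```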
In C, the two Scheme boolean values are available as the two constants
SCM_BOOL_T for #t and SCM_BOOL_F for #f.
Care must be taken with the false value SCM_BOOL_F: it is not
false when used in C conditionals. In order to test for it, use
scm_is_false or scm_is_true.
Return #t if obj is either #t or #f, else return #f.
Return 1 if val is SCM_BOOL_T, return 0 when val is SCM_BOOL_F, else signal a `wrong type' error.
You should probably use scm_is_true instead of this function when you just want to test a SCM value for trueness.
Guile supports a rich “tower” of numerical types — integer, rational, real and complex — and provides an extensive set of mathematical and scientific functions for operating on numerical data. This section of the manual documents those types and functions.
You may also find it illuminating to read R5RS's presentation of numbers in Scheme, which is particularly clear and accessible: see Numbers.
Scheme's numerical “tower” consists of the following categories of numbers: integers, rationals, real numbers and complex numbers.
It is called a tower because each category “sits on” the one that follows it, in the sense that every integer is also a rational, every rational is also real, and every real number is also a complex number (but with zero imaginary part).
In addition to the classification into integers, rationals, reals and
complex numbers, Scheme also distinguishes between whether a number is
represented exactly or not. For example, the result of
2*sin(pi/4) is exactly 2^(1/2), but Guile
can represent neither pi/4 nor 2^(1/2) exactly.
Instead, it stores an inexact approximation, using the C type
double.
Guile can represent exact rationals of any magnitude, inexact
rationals that fit into a C double, and inexact complex numbers
with double real and imaginary parts.
The number? predicate may be applied to any Scheme value to
discover whether the value is any of the supported numerical types.
Return #t if obj is any kind of number, else #f.
For example:
(number? 3)
⇒ #t
(number? "hello there!")
⇒ #f
(define pi 3.141592654)
(number? pi)
⇒ #t
The next few subsections document each of Guile's numerical data types in detail.
Integers are whole numbers, that is numbers with no fractional part, such as 2, 83, and −3789.
Integers in Guile can be arbitrarily big, as shown by the following example.
(define (factorial n)
  (let loop ((n n) (product 1))
    (if (= n 0)
        product
        (loop (- n 1) (* product n)))))
(factorial 3)
⇒ 6
(factorial 20)
⇒ 2432902008176640000
(- (factorial 45))
⇒ -119622220865480194561963161495657715064383733760000000000
Readers whose background is in programming languages where integers are limited by the need to fit into just 4 or 8 bytes of memory may find this surprising, or suspect that Guile's representation of integers is inefficient. In fact, Guile achieves a near optimal balance of convenience and efficiency by using the host computer's native representation of integers where possible, and a more general representation where the required number does not fit in the native form. Conversion between these two representations is automatic and completely invisible to the Scheme level programmer.
C has a host of different integer types, and Guile offers a host of
functions to convert between them and the SCM representation.
For example, a C int can be handled with scm_to_int and
scm_from_int. Guile also defines a few C integer types of its
own, to help with differences between systems.
C integer types that are not covered can be handled with the generic
scm_to_signed_integer and scm_from_signed_integer for
signed types, or with scm_to_unsigned_integer and
scm_from_unsigned_integer for unsigned types.
Scheme integers can be exact and inexact. For example, a number
written as 3.0 with an explicit decimal-point is inexact, but
it is also an integer. The functions integer? and
scm_is_integer report true for such a number, but the functions
scm_is_signed_integer and scm_is_unsigned_integer only
allow exact integers and thus report false. Likewise, the conversion
functions like scm_to_signed_integer only accept exact
integers.
The motivation for this behavior is that the inexactness of a number
should not be lost silently. If you want to allow inexact integers,
you can explicitly insert a call to inexact->exact or to its C
equivalent scm_inexact_to_exact. (Only inexact integers will
be converted by this call into exact integers; inexact non-integers
will become exact fractions.)
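On the Scheme side, the difference between an inexact integer and its exact counterpart can be seen with a few calls. This is only a small sketch using the procedures named above, plus exact->inexact for the opposite direction.

```scheme
;; 3.0 is an integer, but an inexact one.
(integer? 3.0)          ; ⇒ #t
(exact? 3.0)            ; ⇒ #f

;; The conversion has to be requested explicitly.
(inexact->exact 3.0)    ; ⇒ 3
(exact? (inexact->exact 3.0))   ; ⇒ #t

;; An inexact non-integer becomes an exact fraction.
(inexact->exact 0.25)   ; ⇒ 1/4
(exact->inexact 1/4)    ; ⇒ 0.25
```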
Return #t if x is an exact or inexact integer number, else #f.
(integer? 487) ⇒ #t
(integer? 3.0) ⇒ #t
(integer? -3.4) ⇒ #f
(integer? +inf.0) ⇒ #f
The C types are equivalent to the corresponding ISO C types but are defined on all platforms, with the exception of scm_t_int64 and scm_t_uint64, which are only defined when a 64-bit type is available. For example, scm_t_int8 is equivalent to int8_t.
You can regard these definitions as a stop-gap measure until all platforms provide these types. If you know that all the platforms that you are interested in already provide these types, it is better to use them directly instead of the types provided by Guile.
Return 1 when x represents an exact integer that is between min and max, inclusive.
These functions can be used to check whether a SCM value will fit into a given range, such as the range of a given C integer type. If you just want to convert a SCM value to a given C integer type, use one of the conversion functions directly.
When x represents an exact integer that is between min and max inclusive, return that integer. Else signal an error, either a `wrong-type' error when x is not an exact integer, or an `out-of-range' error when it doesn't fit the given range.
Return the SCM value that represents the integer x. This function will always succeed and will always return an exact number.
When x represents an exact integer that fits into the indicated C type, return that integer. Else signal an error, either a `wrong-type' error when x is not an exact integer, or an `out-of-range' error when it doesn't fit the given range.
The functions scm_to_long_long, scm_to_ulong_long, scm_to_int64, and scm_to_uint64 are only available when the corresponding types are.
Return the SCM value that represents the integer x. These functions will always succeed and will always return an exact number.
Assign val to the multiple precision integer rop. val must be an exact integer, otherwise an error will be signalled. rop must have been initialized with mpz_init before this function is called. When rop is no longer needed the occupied space must be freed with mpz_clear. See Initializing Integers, for details.
Mathematically, the real numbers are the set of numbers that describe all possible points along a continuous, infinite, one-dimensional line. The rational numbers are the set of all numbers that can be written as fractions p/q, where p and q are integers. All rational numbers are also real, but there are real numbers that are not rational, for example the square root of 2, and pi.
Guile can represent both exact and inexact rational numbers, but it
cannot represent precise finite irrational numbers. Exact rationals are
represented by storing the numerator and denominator as two exact
integers. Inexact rationals are stored as floating point numbers using
the C type double.
Exact rationals are written as a fraction of integers. There must be no whitespace around the slash:
1/2
-22/7
Even though the actual encoding of inexact rationals is in binary, it may be helpful to think of it as a decimal number with a limited number of significant figures and a decimal point somewhere, since this corresponds to the standard notation for non-whole numbers. For example:
0.34
-0.00000142857931198
-5648394822220000000000.0
4.0
The limited precision of Guile's encoding means that any finite “real”
number in Guile can be written in a rational form, by multiplying and
then dividing by sufficient powers of 10 (or in fact, 2). For example,
‘-0.00000142857931198’ is the same as −142857931198 divided
by 100000000000000000. In Guile's current incarnation, therefore, the
rational? and real? predicates are equivalent for finite
numbers.
Dividing by an exact zero leads to an error message, as one might expect. However, dividing by an inexact zero does not produce an error. Instead, the result of the division is either plus or minus infinity, depending on the sign of the divided number and the sign of the zero divisor (some platforms support signed zeroes ‘-0.0’ and ‘+0.0’; ‘0.0’ is the same as ‘+0.0’).
Dividing zero by an inexact zero yields a NaN (`not a number')
value, although NaN values are actually considered numbers by Scheme.
Attempts to compare a NaN value with any number (including
itself) using =, <, >, <= or >=
always return #f. Although a NaN value is not
= to itself, it is both eqv? and equal? to itself
and other NaN values. However, the preferred way to test for
them is by using nan?.
The real NaN values and infinities are written ‘+nan.0’,
‘+inf.0’ and ‘-inf.0’. This syntax is also recognized by
read as an extension to the usual Scheme syntax. These special
values are considered by Scheme to be inexact real numbers but not
rational. Note that non-real complex numbers may also contain
infinities or NaN values in their real or imaginary parts. To
test a real number to see if it is infinite, a NaN value, or
neither, use inf?, nan?, or finite?, respectively.
Every real number in Scheme belongs to precisely one of those three
classes.
On platforms that follow IEEE 754 for their floating point
arithmetic, the ‘+inf.0’, ‘-inf.0’, and ‘+nan.0’ values
are implemented using the corresponding IEEE 754 values.
They behave in arithmetic operations as IEEE 754 describes;
for example, (= +nan.0 +nan.0) ⇒ #f.
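The behaviour described in the preceding paragraphs can be summarized in a short session. This sketch assumes a platform with IEEE 754 floating point; the variable names pos-inf, neg-inf and not-a-number are chosen here for illustration only.

```scheme
;; Division by an inexact zero yields an infinity or a NaN, not an error.
(define pos-inf (/ 1 0.0))          ; +inf.0
(define neg-inf (/ -1 0.0))         ; -inf.0
(define not-a-number (/ 0.0 0.0))   ; +nan.0

;; Every real number is exactly one of: infinite, NaN, or finite.
(inf? pos-inf)                      ; ⇒ #t
(nan? not-a-number)                 ; ⇒ #t
(finite? 1.5)                       ; ⇒ #t

;; A NaN is never = to anything, itself included, but it is eqv? to itself.
(= not-a-number not-a-number)       ; ⇒ #f
(eqv? not-a-number not-a-number)    ; ⇒ #t

;; Infinities are real but not rational.
(real? pos-inf)                     ; ⇒ #t
(rational? pos-inf)                 ; ⇒ #f
```

As the text recommends, nan? is the preferred test; comparing with = can never detect a NaN.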
Return #t if obj is a real number, else #f. Note that the sets of integer and rational values form subsets of the set of real numbers, so the predicate will also be fulfilled if obj is an integer number or a rational number.
Return #t if x is a rational number, #f otherwise. Note that the set of integer values forms a subset of the set of rational numbers, i.e. the predicate will also be fulfilled if x is an integer number.
Returns the simplest rational number differing from x by no more than eps.
As required by R5RS, rationalize only returns an exact result when both its arguments are exact. Thus, you might need to use inexact->exact on the arguments.
(rationalize (inexact->exact 1.2) 1/100) ⇒ 6/5
Return #t if the real number x is ‘+inf.0’ or ‘-inf.0’. Otherwise return #f.
Return #t if the real number x is ‘+nan.0’, or #f otherwise.
Return #t if the real number x is neither infinite nor a NaN, #f otherwise.
Return the numerator of the rational number x.
Return the denominator of the rational number x.
Equivalent to scm_is_true (scm_real_p (val)) and scm_is_true (scm_rational_p (val)), respectively.
Returns the number closest to val that is representable as a double. Returns infinity for a val that is too large in magnitude. The argument val must be a real number.
Return the SCM value that represents val. The returned value is inexact according to the predicate inexact?, but it will be exactly equal to val.
Complex numbers are the set of numbers that describe all possible points in a two-dimensional space. The two coordinates of a particular point in this space are known as the real and imaginary parts of the complex number that describes that point.
In Guile, complex numbers are written in rectangular form as the sum of
their real and imaginary parts, using the symbol i to indicate
the imaginary part.
3+4i
⇒ 3.0+4.0i
(* 3-8i 2.3+0.3i)
⇒ 9.3-17.5i
Polar form can also be used, with an ‘@’ between magnitude and angle:
1@3.141592 ⇒ -1.0 (approx)
-1@1.57079 ⇒ 0.0-1.0i (approx)
Guile represents a complex number as a pair of inexact reals, so the real and imaginary parts of a complex number have the same properties of inexactness and limited precision as single inexact real numbers.
Note that each part of a complex number may contain any inexact real value, including the special values ‘+nan.0’, ‘+inf.0’ and ‘-inf.0’, as well as either of the signed zeroes ‘0.0’ or ‘-0.0’.
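As a small sketch of how the parts of a complex number can be inspected from Scheme, the following uses the standard procedures make-rectangular, real-part, imag-part and magnitude. These Scheme names are assumed here, since the headings of the corresponding definitions are not shown in this section.

```scheme
;; Build 3+4i from its rectangular coordinates.
(define z (make-rectangular 3 4))

;; Both parts are stored as inexact reals.
(real-part z)       ; ⇒ 3.0
(imag-part z)       ; ⇒ 4.0
(magnitude z)       ; ⇒ 5.0

;; z is complex but, having a non-zero imaginary part, not real.
(complex? z)        ; ⇒ #t
(real? z)           ; ⇒ #f
```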
Return #t if x is a complex number, #f otherwise. Note that the sets of real, rational and integer values form subsets of the set of complex numbers, i.e. the predicate will also be fulfilled if x is a real, rational or integer number.
R5RS requires that, with few exceptions, a calculation involving inexact
numbers always produces an inexact result. To meet this requirement,
Guile distinguishes between an exact integer value such as ‘5’ and
the corresponding inexact integer value which, to the limited precision
available, has no fractional part, and is printed as ‘5.0’. Guile
will only convert the latter value to the former when forced to do so by
an invocation of the inexact->exact procedure.
The only exception to the above requirement is when the values of the
inexact numbers do not affect the result. For example (expt n 0)
is ‘1’ for any value of n, therefore (expt 5.0 0) is
permitted to return an exact ‘1’.
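The way inexactness propagates through a calculation can be seen in a few expressions. This is an illustrative sketch using only standard arithmetic; the result of (expt 5.0 0) is left out on purpose, because the text only says what an implementation is permitted to return.

```scheme
;; Exact operands give an exact result.
(+ 1 2)                       ; ⇒ 3
(* 1/3 3)                     ; ⇒ 1

;; One inexact operand makes the whole result inexact.
(+ 1 2.0)                     ; ⇒ 3.0
(exact? (* 2 0.5))            ; ⇒ #f, although the value is 1.0

;; Only an explicit conversion turns 1.0 back into the exact 1.
(inexact->exact (* 2 0.5))    ; ⇒ 1
```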
Return #t if the number z is exact, #f otherwise.
(exact? 2) ⇒ #t
(exact? 0.5) ⇒ #f
(exact? (/ 2)) ⇒ #t
Return a 1 if the number z is exact, and 0 otherwise. This is equivalent to scm_is_true (scm_exact_p (z)).
An alternate approach to testing the exactness of a number is to use scm_is_signed_integer or scm_is_unsigned_integer.
Return #t if the number z is inexact, #f else.
Return a 1 if the number z is inexact, and 0 otherwise. This is equivalent to scm_is_true (scm_inexact_p (z)).
Return an exact number that is numerically closest to z, when there is one. For inexact rationals, Guile returns the exact rational that is numerically equal to the inexact rational. Inexact complex numbers with a non-zero imaginary part can not be made exact.
(inexact->exact 0.5) ⇒ 1/2
The following happens because 12/10 is not exactly representable as a double (on most platforms). However, when reading a decimal number that has been marked exact with the “#e” prefix, Guile is able to represent it correctly.
(inexact->exact 1.2) ⇒ 5404319552844595/4503599627370496
#e1.2 ⇒ 6/5
Convert the number z to its inexact representation.
The read syntax for integers is a string of digits, optionally preceded by a minus or plus character, a code indicating the base in which the integer is encoded, and a code indicating whether the number is exact or inexact. The supported base codes are:
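#b, #B: the integer is written in binary (base 2)
#o, #O: the integer is written in octal (base 8)
#d, #D: the integer is written in decimal (base 10)
#x, #X: the integer is written in hexadecimal (base 16)
If the base code is omitted, the integer is assumed to be decimal. The following examples show how these base codes are used.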
-13
⇒ -13
#d-13
⇒ -13
#x-13
⇒ -19
#b+1101
⇒ 13
#o377
⇒ 255
The codes for indicating exactness (which can, incidentally, be applied to all numerical values) are:
#e, #E: the number is exact
#i, #I: the number is inexact
If the exactness indicator is omitted, the number is exact unless it contains a radix point. Since Guile cannot represent exact complex numbers, an error is signalled when asking for them.
(exact? 1.2)
⇒ #f
(exact? #e1.2)
⇒ #t
(exact? #e+1i)
ERROR: Wrong type argument
Guile also understands the syntax ‘+inf.0’ and ‘-inf.0’ for plus and minus infinity, respectively. The value must be written exactly as shown, that is, they always must have a sign and exactly one zero digit after the decimal point. It also understands ‘+nan.0’ and ‘-nan.0’ for the special `not-a-number' value. The sign is ignored for `not-a-number' and the value is always printed as ‘+nan.0’.
Return #t if n is an odd number, #f otherwise.
Return #t if n is an even number, #f otherwise.
Return the quotient or remainder from n divided by d. The quotient is rounded towards zero, and the remainder will have the same sign as n. In all cases quotient and remainder satisfy n = q*d + r.
(remainder 13 4) ⇒ 1
(remainder -13 4) ⇒ -1
See also truncate-quotient, truncate-remainder and related operations in Arithmetic.
Return the remainder from n divided by d, with the same sign as d.
(modulo 13 4) ⇒ 1
(modulo -13 4) ⇒ 3
(modulo 13 -4) ⇒ -3
(modulo -13 -4) ⇒ -1
See also floor-quotient, floor-remainder and related operations in Arithmetic.
Return the greatest common divisor of all arguments. If called without arguments, 0 is returned.
The C function scm_gcd always takes two arguments, while the Scheme function can take an arbitrary number.
Return the least common multiple of the arguments. If called without arguments, 1 is returned.
The C function scm_lcm always takes two arguments, while the Scheme function can take an arbitrary number.
Return n raised to the integer exponent k, modulo m.
(modulo-expt 2 3 5) ⇒ 3
Return two exact non-negative integers s and r such that k = s^2 + r and s^2 <= k < (s + 1)^2. An error is raised if k is not an exact non-negative integer.
(exact-integer-sqrt 10) ⇒ 3 and 1
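The notation "3 and 1" stands for two return values, which a caller has to receive explicitly. One way to do so, sketched here with the standard call-with-values, is to collect them into a list; isqrt-parts is a hypothetical helper name used only for this illustration.

```scheme
;; Hypothetical helper: return the two values of exact-integer-sqrt
;; as a list (s r), where k = s^2 + r.
(define (isqrt-parts k)
  (call-with-values
      (lambda () (exact-integer-sqrt k))
    (lambda (s r) (list s r))))

(isqrt-parts 10)    ; ⇒ (3 1)   since 10 = 3^2 + 1
(isqrt-parts 16)    ; ⇒ (4 0)   since 16 = 4^2 + 0
```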
The C comparison functions below always take two arguments, while the
Scheme functions can take an arbitrary number. Also keep in mind that
the C functions return one of the Scheme boolean values
SCM_BOOL_T or SCM_BOOL_F which are both true as far as C
is concerned. Thus, always write scm_is_true (scm_num_eq_p (x,
y)) when testing the two Scheme numbers x and y for
equality, for example.
Return #t if all parameters are numerically equal.
Return #t if the list of parameters is monotonically increasing.
Return #t if the list of parameters is monotonically decreasing.
Return #t if the list of parameters is monotonically non-decreasing.
Return #t if the list of parameters is monotonically non-increasing.
Return #t if z is an exact or inexact number equal to zero.
Return #t if x is an exact or inexact number greater than zero.
Return #t if x is an exact or inexact number less than zero.
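For orientation, here is how the Scheme comparison procedures behave with more than two arguments. The mapping of the descriptions above to =, <, >, <=, >=, zero?, positive? and negative? follows standard Scheme and is assumed here, because the definition headings are not shown.

```scheme
;; = compares numerically, regardless of exactness.
(= 1 1.0 1)        ; ⇒ #t

;; < requires each argument to be smaller than the next one ...
(< 1 2 3)          ; ⇒ #t
(< 1 2 2)          ; ⇒ #f
;; ... whereas <= also accepts equal neighbours.
(<= 1 2 2)         ; ⇒ #t
(> 3 2 1)          ; ⇒ #t
(>= 3 3 1)         ; ⇒ #t

;; Sign tests accept exact and inexact numbers alike.
(zero? 0.0)        ; ⇒ #t
(positive? 1/2)    ; ⇒ #t
(negative? -3)     ; ⇒ #t
```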
The following procedures read and write numbers according to their
external representation as defined by R5RS (see R5RS Lexical Structure). See the (ice-9 i18n) module, for locale-dependent number parsing.
Return a string holding the external representation of the number n in the given radix. If n is inexact, a radix of 10 will be used.
Return a number of the maximally precise representation expressed by the given string. radix must be an exact integer, either 2, 8, 10, or 16. If supplied, radix is a default radix that may be overridden by an explicit radix prefix in string (e.g. "#o177"). If radix is not supplied, then the default radix is 10. If string is not a syntactically valid notation for a number, then string->number returns #f.
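A few round trips may make the role of the radix argument clearer. This sketch uses number->string and string->number as described above; the sample inputs are arbitrary.

```scheme
;; Writing an exact integer in another radix.
(number->string 255 16)        ; ⇒ "ff"
(number->string 255 2)         ; ⇒ "11111111"

;; Reading it back: the radix argument is only a default ...
(string->number "ff" 16)       ; ⇒ 255
;; ... which an explicit prefix in the string overrides.
(string->number "#o177" 16)    ; ⇒ 127

;; Without a radix argument, base 10 is assumed.
(string->number "1/2")         ; ⇒ 1/2

;; Invalid notation yields #f rather than an error.
(string->number "hello")       ; ⇒ #f
```

Because failure is reported as #f, the result can be tested directly in an if or cond before it is used as a number.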
As per string->number above, but taking a C string, as pointer and length. The string characters should be in the current locale encoding (locale in the name refers only to that, there's no locale-dependent parsing).
Return a complex number constructed of the given real-part and imaginary-part parts.
Return the real part of the number z.
Return the imaginary part of the number z.
Return the magnitude of the number z. This is the same as abs for real arguments, but also allows complex numbers.
Like scm_make_rectangular or scm_make_polar, respectively, but these functions take doubles as their arguments.
Returns the real or imaginary part of z as a double.
Returns the magnitude or angle of z as a double.
The C arithmetic functions below always take two arguments, while the
Scheme functions can take an arbitrary number. When you need to
invoke them with just one argument, for example to compute the
equivalent of (- x), pass SCM_UNDEFINED as the second
one: scm_difference (x, SCM_UNDEFINED).
Return the sum of all parameter values. Return 0 if called without any parameters.
If called with one argument z1, -z1 is returned. Otherwise the sum of all but the first argument is subtracted from the first argument.
Return the product of all arguments. If called without arguments, 1 is returned.
Divide the first argument by the product of the remaining arguments. If called with one argument z1, 1/z1 is returned.
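The descriptions above correspond to the Scheme procedures +, -, * and / (the names are assumed from standard Scheme, as the definition headings are not shown). A brief sketch of the zero-, one- and many-argument cases:

```scheme
;; No arguments: the identity element of the operation.
(+)              ; ⇒ 0
(*)              ; ⇒ 1

;; One argument: negation and reciprocal.
(- 5)            ; ⇒ -5
(/ 2)            ; ⇒ 1/2

;; Several arguments.
(+ 1 2 3)        ; ⇒ 6
(- 10 1 2 3)     ; ⇒ 4, i.e. 10 - (1 + 2 + 3)
(/ 12 2 3)       ; ⇒ 2, i.e. 12 / (2 * 3)
```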
Return the absolute value of x.
x must be a number with zero imaginary part. To calculate the magnitude of a complex number, use magnitude instead.
Return the maximum of all parameter values.
Return the minimum of all parameter values.
Round the inexact number x towards zero.
Round the inexact number x to the nearest integer. When exactly halfway between two integers, round to the even one.
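The difference between rounding towards zero and rounding to nearest, including the round-to-even rule for ties, shows up in a handful of cases. The Scheme names truncate and round are assumed here from standard Scheme, since the definition headings are not shown.

```scheme
;; truncate drops the fractional part, moving towards zero.
(truncate 2.7)     ; ⇒ 2.0
(truncate -2.7)    ; ⇒ -2.0

;; round picks the nearest integer; exact ties go to the even neighbour.
(round 2.5)        ; ⇒ 2.0
(round 3.5)        ; ⇒ 4.0

;; An exact argument gives an exact result.
(round 7/2)        ; ⇒ 4

;; For an inexact argument the result stays inexact until it is converted.
(inexact->exact (round 2.5))   ; ⇒ 2
```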
Like scm_truncate_number or scm_round_number, respectively, but these functions take and return double values.
These procedures accept two real numbers x and y, where the divisor y must be non-zero.
euclidean-quotient returns the integer q and euclidean-remainder returns the real number r such that x = q*y + r and 0 <= r < |y|. euclidean/ returns both q and r, and is more efficient than computing each separately. Note that when y > 0, euclidean-quotient returns floor(x/y), otherwise it returns ceiling(x/y).
Note that these operators are equivalent to the R6RS operators div, mod, and div-and-mod.
(euclidean-quotient 123 10) ⇒ 12
(euclidean-remainder 123 10) ⇒ 3
(euclidean/ 123 10) ⇒ 12 and 3
(euclidean/ 123 -10) ⇒ -12 and 3
(euclidean/ -123 10) ⇒ -13 and 7
(euclidean/ -123 -10) ⇒ 13 and 7
(euclidean/ -123.2 -63.5) ⇒ 2.0 and 3.8
(euclidean/ 16/3 -10/7) ⇒ -3 and 22/21
These procedures accept two real numbers x and y, where the divisor y must be non-zero.
floor-quotient returns the integer q and floor-remainder returns the real number r such that q = floor(x/y) and x = q*y + r. floor/ returns both q and r, and is more efficient than computing each separately. Note that r, if non-zero, will have the same sign as y.
When x and y are integers, floor-remainder is equivalent to the R5RS integer-only operator modulo.
(floor-quotient 123 10) ⇒ 12
(floor-remainder 123 10) ⇒ 3
(floor/ 123 10) ⇒ 12 and 3
(floor/ 123 -10) ⇒ -13 and -7
(floor/ -123 10) ⇒ -13 and 7
(floor/ -123 -10) ⇒ 12 and -3
(floor/ -123.2 -63.5) ⇒ 1.0 and -59.7
(floor/ 16/3 -10/7) ⇒ -4 and -8/21
These procedures accept two real numbers x and y, where the divisor y must be non-zero.
ceiling-quotient returns the integer q and ceiling-remainder returns the real number r such that q = ceiling(x/y) and x = q*y + r. ceiling/ returns both q and r, and is more efficient than computing each separately. Note that r, if non-zero, will have the opposite sign of y.
(ceiling-quotient 123 10) ⇒ 13
(ceiling-remainder 123 10) ⇒ -7
(ceiling/ 123 10) ⇒ 13 and -7
(ceiling/ 123 -10) ⇒ -12 and 3
(ceiling/ -123 10) ⇒ -12 and -3
(ceiling/ -123 -10) ⇒ 13 and 7
(ceiling/ -123.2 -63.5) ⇒ 2.0 and 3.8
(ceiling/ 16/3 -10/7) ⇒ -3 and 22/21
These procedures accept two real numbers x and y, where the divisor y must be non-zero.
truncate-quotient returns the integer q and truncate-remainder returns the real number r such that q is x/y rounded toward zero, and x = q*y + r. truncate/ returns both q and r, and is more efficient than computing each separately. Note that r, if non-zero, will have the same sign as x.
When x and y are integers, these operators are equivalent to the R5RS integer-only operators quotient and remainder.
(truncate-quotient 123 10) ⇒ 12
(truncate-remainder 123 10) ⇒ 3
(truncate/ 123 10) ⇒ 12 and 3
(truncate/ 123 -10) ⇒ -12 and 3
(truncate/ -123 10) ⇒ -12 and -3
(truncate/ -123 -10) ⇒ 12 and -3
(truncate/ -123.2 -63.5) ⇒ 1.0 and -59.7
(truncate/ 16/3 -10/7) ⇒ -3 and 22/21
These procedures accept two real numbers x and y, where the divisor y must be non-zero.
centered-quotient returns the integer q and centered-remainder returns the real number r such that x = q*y + r and -|y/2| <= r < |y/2|. centered/ returns both q and r, and is more efficient than computing each separately.
Note that centered-quotient returns x/y rounded to the nearest integer. When x/y lies exactly half-way between two integers, the tie is broken according to the sign of y. If y > 0, ties are rounded toward positive infinity, otherwise they are rounded toward negative infinity. This is a consequence of the requirement that -|y/2| <= r < |y/2|.
Note that these operators are equivalent to the R6RS operators div0, mod0, and div0-and-mod0.
(centered-quotient 123 10) ⇒ 12
(centered-remainder 123 10) ⇒ 3
(centered/ 123 10) ⇒ 12 and 3
(centered/ 123 -10) ⇒ -12 and 3
(centered/ -123 10) ⇒ -12 and -3
(centered/ -123 -10) ⇒ 12 and -3
(centered/ 125 10) ⇒ 13 and -5
(centered/ 127 10) ⇒ 13 and -3
(centered/ 135 10) ⇒ 14 and -5
(centered/ -123.2 -63.5) ⇒ 2.0 and 3.8
(centered/ 16/3 -10/7) ⇒ -4 and -8/21
These procedures accept two real numbers x and y, where the divisor y must be non-zero.
round-quotient returns the integer q and round-remainder returns the real number r such that x = q*y + r and q is x/y rounded to the nearest integer, with ties going to the nearest even integer. round/ returns both q and r, and is more efficient than computing each separately.
Note that round/ and centered/ are almost equivalent, but their behavior differs when x/y lies exactly half-way between two integers. In this case, round/ chooses the nearest even integer, whereas centered/ chooses in such a way to satisfy the constraint -|y/2| <= r < |y/2|, which is stronger than the corresponding constraint for round/, -|y/2| <= r <= |y/2|. In particular, when x and y are integers, the number of possible remainders returned by centered/ is |y|, whereas the number of possible remainders returned by round/ is |y|+1 when y is even.
(round-quotient 123 10) ⇒ 12
(round-remainder 123 10) ⇒ 3
(round/ 123 10) ⇒ 12 and 3
(round/ 123 -10) ⇒ -12 and 3
(round/ -123 10) ⇒ -12 and -3
(round/ -123 -10) ⇒ 12 and -3
(round/ 125 10) ⇒ 12 and 5
(round/ 127 10) ⇒ 13 and -3
(round/ 135 10) ⇒ 14 and -5
(round/ -123.2 -63.5) ⇒ 2.0 and 3.8
(round/ 16/3 -10/7) ⇒ -4 and -8/21
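To compare the six division families on the same operands, the following sketch collects the quotient and remainder from each procedure into one list. The helper name division-table is made up for this illustration; only the division procedures themselves come from Guile.

```scheme
;; division-table is a hypothetical helper, not part of Guile.
;; It calls each two-valued division procedure and records (name q r).
(define (division-table x y)
  (map (lambda (name div)
         (call-with-values
             (lambda () (div x y))
           (lambda (q r) (list name q r))))
       '(euclidean floor ceiling truncate centered round)
       (list euclidean/ floor/ ceiling/ truncate/ centered/ round/)))

(division-table -123 10)
;; ⇒ ((euclidean -13 7) (floor -13 7) (ceiling -12 -3)
;;    (truncate -12 -3) (centered -12 -3) (round -12 -3))

;; A tie (12.5) separates centered/ from round/.
(division-table 125 10)
;; ⇒ ((euclidean 12 5) (floor 12 5) (ceiling 13 -5)
;;    (truncate 12 5) (centered 13 -5) (round 12 5))
```

In every row the identity x = q*y + r holds; the families differ only in which integer q is chosen.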
The following procedures accept any kind of number as arguments, including complex numbers.
Return the square root of z. Of the two possible roots (positive and negative), the one with a positive real part is returned, or if that's zero then a positive imaginary part. Thus,
(sqrt 9.0) ⇒ 3.0
(sqrt -9.0) ⇒ 0.0+3.0i
(sqrt 1.0+1.0i) ⇒ 1.09868411346781+0.455089860562227i
(sqrt -1.0-1.0i) ⇒ 0.455089860562227-1.09868411346781i
Return e to the power of z, where e is the base of natural logarithms (2.71828...).
For the following bitwise functions, negative numbers are treated as infinite precision twos-complements. For instance -6 is bits ...111010, with infinitely many ones on the left. It can be seen that adding 6 (binary 110) to such a bit pattern gives all zeros.
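One way to make the infinite-precision view concrete is to mask off the low bits of a negative number with logand. The 8-bit mask used here is an arbitrary choice for the illustration, and low-bits is a name invented for it.

```scheme
;; Keep only the low 8 bits of -6; the infinitely many ones further
;; left are masked away.
(define low-bits (logand -6 #xFF))

(number->string low-bits 2)      ; ⇒ "11111010"

;; Adding 6 clears those 8 bits; the carry leaves the masked window.
(logand (+ low-bits 6) #xFF)     ; ⇒ 0
```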
Return the bitwise and of the integer arguments.
(logand) ⇒ -1
(logand 7) ⇒ 7
(logand #b111 #b011 #b001) ⇒ 1
Return the bitwise or of the integer arguments.
(logior) ⇒ 0
(logior 7) ⇒ 7
(logior #b000 #b001 #b011) ⇒ 3
Return the bitwise xor of the integer arguments. A bit is set in the result if it is set in an odd number of arguments.
(logxor) ⇒ 0
(logxor 7) ⇒ 7
(logxor #b000 #b001 #b011) ⇒ 2
(logxor #b000 #b001 #b011 #b011) ⇒ 1
Return the integer which is the ones-complement of the integer argument, i.e. each 0 bit is changed to 1 and each 1 bit to 0.
(number->string (lognot #b10000000) 2) ⇒ "-10000001"
(number->string (lognot #b0) 2) ⇒ "-1"
Test whether j and k have any 1 bits in common. This is equivalent to (not (zero? (logand j k))), but without actually calculating the logand, just testing for non-zero.
(logtest #b0100 #b1011) ⇒ #f
(logtest #b0100 #b0111) ⇒ #t
Test whether bit number index in j is set. index starts from 0 for the least significant bit.
(logbit? 0 #b1101) ⇒ #t
(logbit? 1 #b1101) ⇒ #f
(logbit? 2 #b1101) ⇒ #t
(logbit? 3 #b1101) ⇒ #t
(logbit? 4 #b1101) ⇒ #f
Return n shifted left by cnt bits, or shifted right if cnt is negative. This is an “arithmetic” shift.
This is effectively a multiplication by 2^cnt, and when cnt is negative it's a division, rounded towards negative infinity. (Note that this is not the same rounding as quotient does.)
With n viewed as an infinite precision twos complement, ash means a left shift introducing zero bits, or a right shift dropping bits.
(number->string (ash #b1 3) 2) ⇒ "1000"
(number->string (ash #b1010 -1) 2) ⇒ "101"
;; -23 is bits ...11101001, -6 is bits ...111010
(ash -23 -2) ⇒ -6
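The difference in rounding between a right shift and quotient only shows up for negative operands. A short comparison, using only procedures documented in this manual:

```scheme
(ash -23 -2)             ; ⇒ -6, rounds toward negative infinity
(floor-quotient -23 4)   ; ⇒ -6, the same rounding as the shift
(quotient -23 4)         ; ⇒ -5, rounds toward zero

(ash 5 3)                ; ⇒ 40, the same as (* 5 (expt 2 3))
```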
Return the number of bits in integer n. If n is positive, the 1-bits in its binary representation are counted. If negative, the 0-bits in its two's-complement binary representation are counted. If zero, 0 is returned.
(logcount #b10101010) ⇒ 4
(logcount 0) ⇒ 0
(logcount -2) ⇒ 1
Return the number of bits necessary to represent n.
For positive n this is how many bits to the most significant one bit. For negative n it's how many bits to the most significant zero bit in twos complement form.
(integer-length #b10101010) ⇒ 8
(integer-length #b1111) ⇒ 4
(integer-length 0) ⇒ 0
(integer-length -1) ⇒ 0
(integer-length -256) ⇒ 8
(integer-length -257) ⇒ 9
Return n raised to the power k. k must be an exact integer, n can be any number.
Negative k is supported, and results in 1/n^abs(k) in the usual way. n^0 is 1, as usual, and that includes 0^0, which is also 1.
(integer-expt 2 5) ⇒ 32
(integer-expt -3 3) ⇒ -27
(integer-expt 5 -3) ⇒ 1/125
(integer-expt 0 0) ⇒ 1
Return the integer composed of the start (inclusive) through end (exclusive) bits of n. The bit at position start becomes the 0-th bit in the result.
(number->string (bit-extract #b1101101010 0 4) 2) ⇒ "1010"
(number->string (bit-extract #b1101101010 4 9) 2) ⇒ "10110"
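The bit extraction described above can be expressed with ash and logand. The following is only a sketch of an equivalent computation, assuming a non-negative n and start <= end; my-bit-extract is a hypothetical name, and this is not how Guile itself implements bit-extract.

```scheme
;; my-bit-extract is a hypothetical stand-in used for illustration.
(define (my-bit-extract n start end)
  (logand (ash n (- start))                 ; drop the bits below start
          (- (ash 1 (- end start)) 1)))     ; mask of (end - start) ones

(number->string (my-bit-extract #b1101101010 0 4) 2)  ; ⇒ "1010"
(number->string (my-bit-extract #b1101101010 4 9) 2)  ; ⇒ "10110"
(= (my-bit-extract #b1101101010 4 9)
   (bit-extract #b1101101010 4 9))                    ; ⇒ #t
```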
Pseudo-random numbers are generated from a random state object, which
can be created with seed->random-state or
datum->random-state. An external representation (i.e. one
which can be written with write and read with read) of a
random state object can be obtained via
random-state->datum. The state parameter to the
various functions below is optional; it defaults to the state object
in the *random-state* variable.
Return a copy of the random state state.
Return a number in [0, n).
Accepts a positive integer or real n and returns a number of the same type between zero (inclusive) and n (exclusive). The values returned have a uniform distribution.
Return an inexact real in an exponential distribution with mean 1. For an exponential distribution with mean u use (* u (random:exp)).
Fills vect with inexact real random numbers the sum of whose squares is equal to 1.0. Thinking of vect as coordinates in space of dimension n = (vector-length vect), the coordinates are uniformly distributed over the surface of the unit n-sphere.
Return an inexact real in a normal distribution. The distribution used has mean 0 and standard deviation 1. For a normal distribution with mean m and standard deviation d use (+ m (* d (random:normal))).
Fills vect with inexact real random numbers that are independent and standard normally distributed (i.e., with mean 0 and variance 1).
Fills vect with inexact real random numbers the sum of whose squares is less than 1.0. Thinking of vect as coordinates in space of dimension n = (vector-length vect), the coordinates are uniformly distributed within the unit n-sphere.
Return a uniformly distributed inexact real random number in [0,1).
Return a new random state using seed.
Return a new random state from datum, which should have been obtained by
random-state->datum.
Return a datum representation of state that may be written out and read back with the Scheme reader.
Construct a new random state seeded from a platform-specific source of entropy, appropriate for use in non-security-critical applications. Currently /dev/urandom is tried first, or else the seed is based on the time, date, process ID, an address from a freshly allocated heap cell, an address from the local stack frame, and a high-resolution timer if available.
The global random state used by the above functions when the state parameter is not given.
Note that the initial value of *random-state* is the same every
time Guile starts up. Therefore, if you don't pass a state
parameter to the above procedures, and you don't set
*random-state* to (seed->random-state your-seed), where
your-seed is something that isn't the same every time,
you'll get the same sequence of “random” numbers on every run.
For example, unless the relevant source code has changed, (map
random (cdr (iota 19))), if the first use of random numbers since
Guile started up, will always give:
(map random (cdr (iota 19)))
⇒
(0 1 1 2 2 2 1 2 6 7 10 0 5 3 12 5 5 12)
To seed the random state in a sensible way for non-security-critical applications, do this during initialization of your program:
(set! *random-state* (random-state-from-platform))
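When reproducibility is wanted rather than avoided, an explicit state object can be passed around instead of relying on *random-state*. The sketch below assumes an arbitrary seed of 42; draw is a hypothetical helper written for this example.

```scheme
;; draw is a hypothetical helper: collect count numbers below 100,
;; in order, from the given random state.
(define (draw count state)
  (let loop ((i 0) (acc '()))
    (if (= i count)
        (reverse acc)
        (loop (+ i 1) (cons (random 100 state) acc)))))

(define state-a (seed->random-state 42))
(define state-b (copy-random-state state-a))
(define saved   (random-state->datum state-a))

(define first-run  (draw 5 state-a))
(define second-run (draw 5 state-b))
(define third-run  (draw 5 (datum->random-state saved)))

;; All three runs start from the same state, so they agree.
(and (equal? first-run second-run)
     (equal? first-run third-run))   ; ⇒ #t
```

The datum obtained from random-state->datum could equally be written to a file with write and read back later to resume the same sequence.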
In Scheme, there is a data type to describe a single character.
Defining what exactly a character is can be more complicated than it seems. Guile follows the advice of R6RS and uses The Unicode Standard to help define what a character is. So, for Guile, a character is anything in the Unicode Character Database.
The Unicode Character Database is basically a table of characters
indexed using integers called 'code points'. Valid code points are in
the ranges 0 to #xD7FF inclusive or #xE000 to
#x10FFFF inclusive, which is about 1.1 million code points.
Any code point that has been assigned to a character or that has otherwise been given a meaning by Unicode is called a 'designated code point'. Most of the designated code points, about 200,000 of them, indicate characters, accents or other combining marks that modify other characters, symbols, whitespace, and control characters. Some are not characters but indicators that suggest how to format or display neighboring characters.
If a code point is not a designated code point – if it has not been assigned to a character by The Unicode Standard – it is a 'reserved code point', meaning that it is reserved for future use. Most of the code points, about 800,000, are 'reserved code points'.
By convention, a Unicode code point is written as “U+XXXX” where “XXXX” is a hexadecimal number. Please note that this convenient notation is not valid code. Guile does not interpret “U+XXXX” as a character.
In Scheme, a character literal is written as #\name where
name is the name of the character that you want. Printable
characters have their usual single character name; for example,
#\a is a lower case a.
Some of the code points are 'combining characters' that are not meant
to be printed by themselves but are instead meant to modify the
appearance of the previous character. For combining characters, an
alternate form of the character literal is #\ followed by
U+25CC (a small, dotted circle), followed by the combining character.
This allows the combining character to be drawn on the circle, not on
the backslash of #\.
Many of the non-printing characters, such as whitespace characters and control characters, also have names.
The most commonly used non-printing characters have long character names, described in the table below.
Character Name | Codepoint
#\nul | U+0000
#\alarm | U+0007
#\backspace | U+0008
#\tab | U+0009
#\linefeed | U+000A
#\newline | U+000A
#\vtab | U+000B
#\page | U+000C
#\return | U+000D
#\esc | U+001B
#\space | U+0020
#\delete | U+007F
There are also short names for all of the “C0 control characters” (those with code points below 32). The following table lists the short name for each character.
0 = #\nul | 1 = #\soh | 2 = #\stx | 3 = #\etx
4 = #\eot | 5 = #\enq | 6 = #\ack | 7 = #\bel
8 = #\bs | 9 = #\ht | 10 = #\lf | 11 = #\vt
12 = #\ff | 13 = #\cr | 14 = #\so | 15 = #\si
16 = #\dle | 17 = #\dc1 | 18 = #\dc2 | 19 = #\dc3
20 = #\dc4 | 21 = #\nak | 22 = #\syn | 23 = #\etb
24 = #\can | 25 = #\em | 26 = #\sub | 27 = #\esc
28 = #\fs | 29 = #\gs | 30 = #\rs | 31 = #\us
32 = #\sp
The short name for the “delete” character (code point U+007F) is
#\del.
There are also a few alternative names left over for compatibility with previous versions of Guile.
Alternate | Standard
#\nl | #\newline
#\np | #\page
#\null | #\nul
Characters may also be written using their code point values. They can
be written as an octal number, such as #\10 for
#\bs or #\177 for #\del.
If one prefers hex to octal, there is an additional syntax for character
escapes: #\xHHHH – the letter 'x' followed by a hexadecimal
number of one to eight digits.
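The different spellings of a character literal all denote the same object, which can be checked with char=? and char->integer. This is a small sketch assuming the default reader settings; the particular characters are arbitrary.

```scheme
(char->integer #\a)          ; ⇒ 97
(char=? #\x41 #\A)           ; ⇒ #t, hexadecimal 41 is code point 65
(char=? #\101 #\A)           ; ⇒ #t, octal 101 is also code point 65
(char=? #\nl #\newline)      ; ⇒ #t, alternate and standard names
(char->integer #\del)        ; ⇒ 127
(char->integer #\x3BB)       ; ⇒ 955, U+03BB written as a hex literal
```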
Fundamentally, the character comparison operations below are numeric comparisons of the character's code points.
Return #t iff the code point of x is equal to the code point of y, else #f.
Return #t iff the code point of x is less than the code point of y, else #f.
Return #t iff the code point of x is less than or equal to the code point of y, else #f.
Return #t iff the code point of x is greater than the code point of y, else #f.
Return #t iff the code point of x is greater than or equal to the code point of y, else #f.
Case-insensitive character comparisons use Unicode case folding. In case folding comparisons, if a character is lowercase and has an uppercase form that can be expressed as a single character, it is converted to uppercase before comparison. All other characters undergo no conversion before the comparison occurs. This includes the German sharp S (Eszett), which is not uppercased before comparison because its uppercase form has two characters. Unicode case folding is language independent: it uses rules that are generally true, but it cannot cover all cases for all languages.
Return #t iff the case-folded code point of x is the same as the case-folded code point of y, else #f.
Return #t iff the case-folded code point of x is less than the case-folded code point of y, else #f.
Return #t iff the case-folded code point of x is less than or equal to the case-folded code point of y, else #f.
Return #t iff the case-folded code point of x is greater than the case-folded code point of y, else #f.
Return #t iff the case-folded code point of x is greater than or equal to the case-folded code point of y, else #f.
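A few comparisons illustrate how the plain and case-folding predicates can disagree. The characters are chosen arbitrarily for the example.

```scheme
(char<? #\A #\a)       ; ⇒ #t, code point 65 is less than 97
(char=? #\A #\a)       ; ⇒ #f, the code points differ
(char-ci=? #\A #\a)    ; ⇒ #t, both fold to the same character

(char<? #\a #\B)       ; ⇒ #f, 97 is not less than 66
(char-ci<? #\a #\B)    ; ⇒ #t, after folding the letters compare as a < b
```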
Return #t iff chr is alphabetic, else #f.
Return #t iff chr is numeric, else #f.
Return #t iff chr is whitespace, else #f.
Return #t iff chr is uppercase, else #f.
Return #t iff chr is lowercase, else #f.
Return #t iff chr is either uppercase or lowercase, else #f.
Return a symbol giving the two-letter name of the Unicode general category assigned to chr or #f if no named category is assigned. The following table provides a list of category names along with their meanings.
Lu | Uppercase letter | Pf | Final quote punctuation
Ll | Lowercase letter | Po | Other punctuation
Lt | Titlecase letter | Sm | Math symbol
Lm | Modifier letter | Sc | Currency symbol
Lo | Other letter | Sk | Modifier symbol
Mn | Non-spacing mark | So | Other symbol
Mc | Combining spacing mark | Zs | Space separator
Me | Enclosing mark | Zl | Line separator
Nd | Decimal digit number | Zp | Paragraph separator
Nl | Letter number | Cc | Control
No | Other number | Cf | Format
Pc | Connector punctuation | Cs | Surrogate
Pd | Dash punctuation | Co | Private use
Ps | Open punctuation | Cn | Unassigned
Pe | Close punctuation
Pi | Initial quote punctuation
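As a quick illustration of the table, the general category of a few ASCII characters can be looked up with char-general-category. The sample characters are an arbitrary choice.

```scheme
(map char-general-category
     (list #\A #\a #\7 #\space #\! #\+))
;; ⇒ (Lu Ll Nd Zs Po Sm)
```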
Return the code point of chr.
Return the character that has code point n. The integer n must be a valid code point. Valid code points are in the ranges 0 to #xD7FF inclusive or #xE000 to #x10FFFF inclusive.
Return the uppercase character version of chr.
Return the lowercase character version of chr.
Return the titlecase character version of chr if one exists; otherwise return the uppercase version.
For most characters these will be the same, but the Unicode Standard includes certain digraph compatibility characters, such as U+01F3 “dz”, for which the uppercase and titlecase characters are different (U+01F1 “DZ” and U+01F2 “Dz” in this case, respectively).
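The digraph case can be checked directly by comparing code points. The variable name dz is introduced only for this sketch.

```scheme
(define dz (integer->char #x01F3))              ; small letter dz

(= (char->integer (char-upcase dz)) #x01F1)     ; ⇒ #t, uppercase DZ
(= (char->integer (char-titlecase dz)) #x01F2)  ; ⇒ #t, titlecase Dz

;; For an ordinary letter the two conversions agree.
(char=? (char-upcase #\a) (char-titlecase #\a)) ; ⇒ #t
```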
These C functions take an integer representation of a Unicode codepoint and return the codepoint corresponding to its uppercase, lowercase, and titlecase forms respectively. The type scm_t_wchar is a signed, 32-bit integer.
The features described in this section correspond directly to SRFI-14.
The data type charset implements sets of characters (see Characters). Because the internal representation of character sets is not visible to the user, a lot of procedures for handling them are provided.
Character sets can be created, extended, tested for the membership of a character, and compared to other character sets.
Use these procedures for testing whether an object is a character set,
or whether several character sets are equal or subsets of each other.
char-set-hash can be used for calculating a hash value, for example
for use in fast lookup procedures.
Return #t if obj is a character set, #f otherwise.
Return #t if all given character sets are equal.
Return #t if every character set csi is a subset of character set csi+1.
Compute a hash value for the character set cs. If bound is given and non-zero, it restricts the returned value to the range 0 ... bound - 1.
Character set cursors are a means for iterating over the members of a
character set. After creating a character set cursor with
char-set-cursor, a cursor can be dereferenced with
char-set-ref and advanced to the next member with
char-set-cursor-next. Whether a cursor has moved past the last
element of the set can be checked with end-of-char-set?.
Additionally, mapping and (un-)folding procedures for character sets are provided.
Return a cursor into the character set cs.
Return the character at the current cursor position cursor in the character set cs. It is an error to pass a cursor for which end-of-char-set? returns true.
Advance the character set cursor cursor to the next character in the character set cs. It is an error if the cursor given satisfies end-of-char-set?.
Return #t if cursor has reached the end of a character set, #f otherwise.
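The four cursor procedures combine into a simple loop. The sketch below collects the members of a set into a list; char-set-members is a hypothetical name, and in practice char-set->list does the same job.

```scheme
;; char-set-members is a hypothetical helper written for illustration.
(define (char-set-members cs)
  (let loop ((cursor (char-set-cursor cs))
             (acc '()))
    (if (end-of-char-set? cursor)
        (reverse acc)
        (loop (char-set-cursor-next cs cursor)
              (cons (char-set-ref cs cursor) acc)))))

(define members (char-set-members (char-set #\a #\b #\c)))
(length members)                                        ; ⇒ 3
(char-set= (list->char-set members) (char-set #\a #\b #\c)) ; ⇒ #t
```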
Fold the procedure kons over the character set cs, initializing it with knil.
This is a fundamental constructor for character sets.
- g is used to generate a series of “seed” values from the initial seed: seed, (g seed), (g^2 seed), (g^3 seed), ...
- p tells us when to stop – when it returns true when applied to one of the seed values.
- f maps each seed value to a character. These characters are added to the base character set base_cs to form the result; base_cs defaults to the empty set.
This is a fundamental constructor for character sets.
- g is used to generate a series of “seed” values from the initial seed: seed, (g seed), (g^2 seed), (g^3 seed), ...
- p tells us when to stop – when it returns true when applied to one of the seed values.
- f maps each seed value to a character. These characters are added to the base character set base_cs to form the result; base_cs defaults to the empty set.
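As an illustration of how p, f and g cooperate, the following sketch builds the set of ASCII decimal digits from integer seeds 0 through 9. It assumes the SRFI-14 argument order (p f g seed, then an optional base set); the variable names are made up for the example.

```scheme
(define digits
  (char-set-unfold
   (lambda (n) (> n 9))                     ; p: stop once the seed passes 9
   (lambda (n) (integer->char (+ n 48)))    ; f: seed 0..9 to #\0..#\9
   (lambda (n) (+ n 1))                     ; g: next seed
   0))                                      ; initial seed

(char-set= digits (string->char-set "0123456789"))   ; ⇒ #t

;; With a base set, the generated characters are added to it.
(define digits+x
  (char-set-unfold (lambda (n) (> n 9))
                   (lambda (n) (integer->char (+ n 48)))
                   (lambda (n) (+ n 1))
                   0
                   (char-set #\x)))
(char-set-size digits+x)                             ; ⇒ 11
```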
Apply proc to every character in the character set cs. The return value is not specified.
Map the procedure proc over every character in cs. proc must be a character -> character procedure.
New character sets are produced with these procedures.
Return a newly allocated character set containing all characters in cs.
Return a character set containing all given characters.
Convert the character list list to a character set. If the character set base_cs is given, the characters in this set are also included in the result.
Convert the character list list to a character set. The characters are added to base_cs and base_cs is returned.
Convert the string str to a character set. If the character set base_cs is given, the characters in this set are also included in the result.
Convert the string str to a character set. The characters from the string are added to base_cs, and base_cs is returned.
Return a character set containing every character from cs so that it satisfies pred. If provided, the characters from base_cs are added to the result.
Return a character set containing every character from cs so that it satisfies pred. The characters are added to base_cs and base_cs is returned.
Return a character set containing all characters whose character codes lie in the half-open range [lower,upper).
If error is a true value, an error is signalled if the specified range contains characters which are not contained in the implemented character range. If error is #f, these characters are silently left out of the resulting character set.
The characters in base_cs are added to the result, if given.
Return a character set containing all characters whose character codes lie in the half-open range [lower,upper).
If error is a true value, an error is signalled if the specified range contains characters which are not contained in the implemented character range. If error is #f, these characters are silently left out of the resulting character set.
The characters are added to base_cs and base_cs is returned.
Coerces x into a char-set. x may be a string, character or char-set. A string is converted to the set of its constituent characters; a character is converted to a singleton set; a char-set is returned as-is.
Access the elements and other information of a character set with these procedures.
Returns an association list containing debugging information for cs. The association list has the following entries.
The return value of this function cannot be relied upon to be consistent between versions of Guile and should not be used in code.
char-set- The char-set itself
len- The number of groups of contiguous code points the char-set contains
ranges- A list of lists where each sublist is a range of code points and their associated characters
Return the number of elements in character set cs.
Return the number of elements in the character set cs which satisfy the predicate pred.
Return a list containing the elements of the character set cs.
Return a string containing the elements of the character set cs. The order in which the characters are placed in the string is not defined.
Return #t iff the character ch is contained in the character set cs.
Return a true value if every character in the character set cs satisfies the predicate pred.
Return a true value if any character in the character set cs satisfies the predicate pred.
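The constructors and query procedures can be combined as in the following sketch. The sets vowels and lower are invented for the example; the range 97 to 123 covers the ASCII letters a through z because the upper bound is exclusive.

```scheme
(define vowels (string->char-set "aeiou"))
(define lower  (ucs-range->char-set 97 123))   ; #\a up to and including #\z

(char-set-size lower)                          ; ⇒ 26
(char-set-contains? lower #\q)                 ; ⇒ #t
(char-set-count (lambda (c) (char-set-contains? vowels c))
                lower)                         ; ⇒ 5
(char-set-every char-lower-case? lower)        ; ⇒ #t
(char-set-any char-upper-case? lower)          ; ⇒ #f
```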
Character sets can be manipulated with the common set algebra operations, such as union, complement, and intersection. All of these procedures provide side-effecting variants, which modify their character set argument(s).
Add all character arguments to the first argument, which must be a character set.
Delete all character arguments from the first argument, which must be a character set.
Add all character arguments to the first argument, which must be a character set.
Delete all character arguments from the first argument, which must be a character set.
Return the complement of the character set cs.
Note that the complement of a character set is likely to contain many
reserved code points (code points that are not associated with
characters). It may be helpful to modify the output of
char-set-complement by computing its intersection with the set
of designated code points, char-set:designated.
Return the union of all argument character sets.
Return the intersection of all argument character sets.
Return the difference of all argument character sets.
Return the exclusive-or of all argument character sets.
Return the difference and the intersection of all argument character sets.
Return the complement of the character set cs.
Return the union of all argument character sets.
Return the intersection of all argument character sets.
Return the difference of all argument character sets.
Return the exclusive-or of all argument character sets.
Return the difference and the intersection of all argument character sets.
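The non-destructive set operations can be tried on two small overlapping sets. The names abc, bcd and not-abc are made up for this sketch; the last definition follows the suggestion above of intersecting a complement with char-set:designated.

```scheme
(define abc (string->char-set "abc"))
(define bcd (string->char-set "bcd"))

(char-set= (char-set-union abc bcd)        (string->char-set "abcd")) ; ⇒ #t
(char-set= (char-set-intersection abc bcd) (string->char-set "bc"))   ; ⇒ #t
(char-set= (char-set-difference abc bcd)   (char-set #\a))            ; ⇒ #t
(char-set= (char-set-xor abc bcd)          (string->char-set "ad"))   ; ⇒ #t

;; Complement, restricted to code points that Unicode has designated.
(define not-abc
  (char-set-intersection (char-set-complement abc) char-set:designated))
(char-set-contains? not-abc #\a)   ; ⇒ #f
(char-set-contains? not-abc #\z)   ; ⇒ #t
```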
In order to make the use of the character set data type and procedures useful, several predefined character set variables exist.
These character sets are locale independent and are not recomputed
upon a setlocale call. They contain characters from the whole
range of Unicode code points. For instance, char-set:letter
contains about 94,000 characters.
All lower-case characters.
All upper-case characters.
All single characters that function as if they were an upper-case letter followed by a lower-case letter.
All letters. This includes char-set:lower-case, char-set:upper-case, char-set:title-case, and many letters that have no case at all. For example, Chinese and Japanese characters typically have no concept of case.
The union of char-set:letter and char-set:digit.
All characters which would put ink on the paper.
The union of char-set:graphic and char-set:whitespace.
All whitespace characters.
All horizontal whitespace characters, which notably includes #\space and #\tab.
The ISO control characters are the C0 control characters (U+0000 to U+001F), delete (U+007F), and the C1 control characters (U+0080 to U+009F).
All punctuation characters, such as the characters
!"#%&'()*,-./:;?@[\\]_{}
All symbol characters, such as the characters
$+<=>^`|~.
The hexadecimal digits
0123456789abcdefABCDEF.
This character set contains all designated code points. This includes all the code points to which Unicode has assigned a character or other meaning.
This character set contains all possible code points. This includes both designated and reserved code points.
Strings are fixed-length sequences of characters. They can be created by calling constructor procedures, but they can also be entered as literals at the REPL or in Scheme source files.
Strings always carry the information about how many characters they are composed of with them, so there is no special end-of-string character, like in C. That means that Scheme strings can contain any character, even the ‘#\nul’ character ‘\0’.
To use strings efficiently, you need to know a bit about how Guile implements them. In Guile, a string consists of two parts, a head and the actual memory where the characters are stored. When a string (or a substring of it) is copied, only a new head gets created; the memory is usually not copied. The two heads start out pointing to the same memory.
When one of these two strings is modified, as with string-set!,
their common memory does get copied so that each string has its own
memory and modifying one does not accidentally modify the other as well.
Thus, Guile's strings are `copy on write'; the actual copying of their
memory is delayed until one string is written to.
This implementation makes functions like substring very
efficient in the common case that no modifications are done to the
involved strings.
If you do know that your strings are getting modified right away, you
can use substring/copy instead of substring. This
function performs the copy immediately at the time of creation. This
is more efficient, especially in a multi-threaded program. Also,
substring/copy can avoid the problem that a short substring
holds on to the memory of a very large original string that could
otherwise be recycled.
If you want to avoid the copy altogether, so that modifications of one
string show up in the other, you can use substring/shared. The
strings created by this procedure are called mutation sharing
substrings since the substring and the original string share
modifications to each other.
If you want to prevent modifications, use substring/read-only.
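The difference between a copy-on-write substring and a mutation-sharing substring becomes visible once the original string is modified. This is a minimal sketch; the variable names are invented, and string-copy is used so that the string being modified is not a read-only literal.

```scheme
(define original (string-copy "hello world"))
(define cow      (substring original 0 5))          ; copy on write
(define shared   (substring/shared original 0 5))   ; mutation sharing

(string-set! original 0 #\J)

original   ; ⇒ "Jello world"
cow        ; ⇒ "hello", its memory was separated on the write
shared     ; ⇒ "Jello", it still follows the original
```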
Guile provides all procedures of SRFI-13 and a few more.
The read syntax for strings is an arbitrarily long sequence of
characters enclosed in double quotes (").
Backslash is an escape character and can be used to insert the following
special characters. \" and \\ are R5RS standard, the
next seven are R6RS standard — notice they follow C syntax — and the
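remaining four are Guile extensions.
\\ - Backslash character.
\" - Double quote character (an unescaped " is otherwise the end of the string).
\a - Bell character (ASCII 7).
\f - Formfeed character (ASCII 12).
\n - Newline character (ASCII 10).
\r - Carriage return character (ASCII 13).
\t - Tab character (ASCII 9).
\v - Vertical tab character (ASCII 11).
\b - Backspace character (ASCII 8).
\0 - NUL character (ASCII 0).
\ followed by newline (ASCII 10) - Nothing. This way, if \ is the last character in a line, the
string will continue with the first character from the next line,
without a line break.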
If the hungry-eol-escapes reader option is enabled, which is not
the case by default, leading whitespace on the next line is discarded.
"foo\
  bar"
⇒ "foo  bar"
(read-enable 'hungry-eol-escapes)
"foo\
  bar"
⇒ "foobar"
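\xHH - Character code given by two hexadecimal digits. For example \x7f for an ASCII DEL (127).
\uHHHH - Character code given by four hexadecimal digits. For example \u0100 for a capital A with macron (U+0100).
\UHHHHHH - Character code given by six hexadecimal digits. For example \U010402.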
The following are examples of string literals:
"foo"
"bar plonk"
"Hello World"
"\"Hi\", he said."
The three escape sequences \xHH, \uHHHH and \UHHHHHH were
chosen to not break compatibility with code written for previous versions of
Guile. The R6RS specification suggests a different, incompatible syntax for hex
escapes: \xHHHH; – a character code followed by one to eight hexadecimal
digits terminated with a semicolon. If this escape format is desired instead,
it can be enabled with the reader option r6rs-hex-escapes.
(read-enable 'r6rs-hex-escapes)
For more on reader options, see Scheme Read.
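The effect of the escape sequences can be inspected with ordinary string procedures. The sketch below assumes the default reader options, that is, with r6rs-hex-escapes not enabled.

```scheme
(string-length "a\tb")                    ; ⇒ 3, the escape is a single tab
(char=? (string-ref "a\tb" 1) #\tab)      ; ⇒ #t
(string=? "\x41\x42" "AB")                ; ⇒ #t, two-digit hex escapes
(char->integer (string-ref "\u0100" 0))   ; ⇒ 256, i.e. U+0100
(string-length "\"Hi\", he said.")        ; ⇒ 14, the quotes count once each
```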
The following procedures can be used to check whether a given string fulfills some specified property.
Return #t if obj is a string, else #f.
Return #t if str's length is zero, and #f otherwise.
(string-null? "") ⇒ #t
y ⇒ "foo"
(string-null? y) ⇒ #f
Check if char_pred is true for any character in string s.
char_pred can be a character to check for any equal to that, or a character set (see Character Sets) to check for any in that set, or a predicate procedure to call.
For a procedure, calls (char_pred c) are made successively on the characters from start to end. If char_pred returns true (i.e. non-#f), string-any stops and that return value is the return from string-any. The call on the last character (i.e. at end-1), if that point is reached, is a tail call.
If there are no characters in s (i.e. start equals end) then the return is #f.
Check if char_pred is true for every character in string s.
char_pred can be a character to check for every character equal to that, or a character set (see Character Sets) to check for every character being in that set, or a predicate procedure to call.
For a procedure, calls (char_pred c) are made successively on the characters from start to end. If char_pred returns #f, string-every stops and returns #f. The call on the last character (i.e. at end-1), if that point is reached, is a tail call and the return from that call is the return from string-every.
If there are no characters in s (i.e. start equals end) then the return is #t.
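The three kinds of char_pred argument (a character, a character set, or a procedure) can be tried as follows. The sample strings are arbitrary.

```scheme
(string-any char-numeric? "abc1")              ; ⇒ #t
(string-any #\z "abc")                         ; ⇒ #f
(string-every char-set:hex-digit "DEADbeef")   ; ⇒ #t
(string-every char-alphabetic? "abc1")         ; ⇒ #f

;; The empty string: nothing satisfies string-any, nothing fails string-every.
(string-any char-alphabetic? "")               ; ⇒ #f
(string-every char-alphabetic? "")             ; ⇒ #t

;; With a procedure, string-any returns the procedure's own true value.
(string-any (lambda (c) (and (char-numeric? c) c)) "abc1")   ; ⇒ #\1
```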
The string constructor procedures create new string objects, possibly initializing them with some specified character data. See also String Selection, for ways to create strings from existing strings.
Return a newly allocated string made from the given character arguments.
(string #\x #\y #\z) ⇒ "xyz"
(string) ⇒ ""
Return a newly allocated string made from a list of characters.
(list->string '(#\a #\b #\c)) ⇒ "abc"
Return a newly allocated string made from a list of characters, in reverse order.
(reverse-list->string '(#\a #\B #\c)) ⇒ "cBa"
Return a newly allocated string of length k. If chr is given, then all elements of the string are initialized to chr, otherwise the contents of the string are unspecified.
Like scm_make_string, but expects the length as a size_t.
proc is an integer->char procedure. Construct a string of size len by applying proc to each index to produce the corresponding string element. The order in which proc is applied to the indices is not specified.
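As a brief sketch of the two constructors just described (make-string with an explicit fill character, and string-tabulate with an index-to-character procedure), assuming ASCII letters for the illustration:

```scheme
;; A string of three identical characters.
(make-string 3 #\*)                        ; ⇒ "***"

;; Build "abcde" by mapping each index 0..4 to a character.
(string-tabulate
 (lambda (i) (integer->char (+ i (char->integer #\a))))
 5)                                        ; ⇒ "abcde"
```

Because the order in which proc is applied is unspecified, proc should not rely on side effects between calls.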
Append the strings in the string list ls, using the string delim as a delimiter between the elements of ls. grammar is a symbol which specifies how the delimiter is placed between the strings, and defaults to the symbol infix.
- infix: Insert the separator between list elements. An empty list will produce an empty string.
- strict-infix: Like infix, but will raise an error if given the empty list.
- suffix: Insert the separator after every list element.
- prefix: Insert the separator before each list element.
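The grammar symbols can be compared side by side. This is a small sketch with arbitrary sample data:

```scheme
(string-join '("foo" "bar" "baz") ":")            ; ⇒ "foo:bar:baz"
(string-join '("foo" "bar" "baz") ":" 'infix)     ; ⇒ "foo:bar:baz"
(string-join '("foo" "bar" "baz") ":" 'suffix)    ; ⇒ "foo:bar:baz:"
(string-join '("foo" "bar" "baz") ":" 'prefix)    ; ⇒ ":foo:bar:baz"

;; With the default infix grammar, an empty list gives an empty string.
(string-join '() ":")                             ; ⇒ ""
```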
When processing strings, it is often convenient to first convert them
into a list representation by using the procedure string->list,
work with the resulting list, and then convert it back into a string.
These procedures are useful for similar tasks.
Convert the string str into a list of characters.
Split the string str into a list of substrings delimited by appearances of the character chr. Note that an empty substring between separator characters will result in an empty string in the result list.
(string-split "root:x:0:0:root:/root:/bin/bash" #\:) ⇒ ("root" "x" "0" "0" "root" "/root" "/bin/bash")
(string-split "::" #\:) ⇒ ("" "" "")
(string-split "" #\:) ⇒ ("")
Portions of strings can be extracted by these procedures.
string-ref delivers individual characters whereas
substring can be used to extract substrings from longer strings.
Return the number of characters in string.
Return the number of characters in str as a size_t.
Return character k of str using zero-origin indexing. k must be a valid index of str.
Return character k of str using zero-origin indexing. k must be a valid index of str.
Return a copy of the given string str.
The returned string shares storage with str initially, but it is copied as soon as one of the two strings is modified.
Return a new string formed from the characters of str beginning with index start (inclusive) and ending with index end (exclusive). str must be a string, start and end must be exact integers satisfying:
0 <= start <= end <= (string-length str).
The returned string shares storage with str initially, but it is copied as soon as one of the two strings is modified.
Like substring, but the strings continue to share their storage even if they are modified. Thus, modifications to str show up in the new string, and vice versa.
Like substring, but the storage for the new string is copied immediately.
Like substring, but the resulting string cannot be modified.
Like scm_substring, etc. but the bounds are given as a size_t.
Return the n first characters of s.
Return all but the first n characters of s.
Return the n last characters of s.
Return all but the last n characters of s.
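The four procedures above (string-take, string-drop, string-take-right and string-drop-right in SRFI-13) can be illustrated on one sample string:

```scheme
(string-take "hello world" 5)          ; ⇒ "hello"
(string-drop "hello world" 6)          ; ⇒ "world"
(string-take-right "hello world" 5)    ; ⇒ "world"
(string-drop-right "hello world" 6)    ; ⇒ "hello"
```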
Take characters start to end from the string s and either pad with char or truncate them to give len characters.
string-pad pads or truncates on the left, so for example
(string-pad "x" 3) ⇒ "  x"
(string-pad "abcde" 3) ⇒ "cde"
string-pad-right pads or truncates on the right, so for example
(string-pad-right "x" 3) ⇒ "x  "
(string-pad-right "abcde" 3) ⇒ "abc"
Trim occurrences of char_pred from the ends of s.
string-trim trims char_pred characters from the left (start) of the string, string-trim-right trims them from the right (end) of the string, string-trim-both trims from both ends.
char_pred can be a character, a character set, or a predicate procedure to call on each character. If char_pred is not given the default is whitespace as per char-set:whitespace (see Standard Character Sets).
(string-trim " x ") ⇒ "x "
(string-trim-right "banana" #\a) ⇒ "banan"
(string-trim-both ".,xy:;" char-set:punctuation) ⇒ "xy"
(string-trim-both "xyzzy" (lambda (c) (or (eqv? c #\x) (eqv? c #\y)))) ⇒ "zz"
These procedures are for modifying strings in-place. This means that the result of the operation is not a new string; instead, the original string's memory representation is modified.
Store chr in element k of str and return an unspecified value. k must be a valid index of str.
Like scm_string_set_x, but the index is given as a size_t.
Stores chr in every element of the given str and returns an unspecified value.
Change every character in str between start and end to fill.
(define y (string-copy "abcdefg"))
(substring-fill! y 1 3 #\r)
y ⇒ "arrdefg"
Copy the substring of str1 bounded by start1 and end1 into str2 beginning at position start2. str1 and str2 can be the same string.
Copy the sequence of characters from index range [start, end) in string s to string target, beginning at index tstart. The characters are copied left-to-right or right-to-left as needed – the copy is guaranteed to work, even if target and s are the same string. It is an error if the copy operation runs off the end of the target string.
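A short sketch of the in-place copying procedures follows. The variable names `target` and `s` are illustrative only, and the strings are created with make-string and string-copy so that they are mutable.

```scheme
;; string-copy!: copy all of "abc" into target, starting at index 2.
(define target (make-string 6 #\-))
(string-copy! target 2 "abc")
target                                 ; ⇒ "--abc-"

;; substring-move!: copy characters [0, 3) of s onto s itself at index 3.
(define s (string-copy "abcdef"))
(substring-move! s 0 3 s 3)
s                                      ; ⇒ "abcabc"

;; string-fill!: overwrite every element.
(string-fill! s #\z)
s                                      ; ⇒ "zzzzzz"
```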
The procedures in this section are similar to the character ordering predicates (see Characters), but are defined on character sequences.
The first set is specified in R5RS and has names that end in ?.
The second set is specified in SRFI-13 and its names have no ending ?.
The predicates ending in -ci ignore the character case
when comparing strings. For now, case-insensitive comparison is done
using the R5RS rules, where every lower-case character that has a
single character upper-case form is converted to uppercase before
comparison. See the (ice-9 i18n) module, for locale-dependent string comparison.
Lexicographic equality predicate; return #t if the two strings are the same length and contain the same characters in the same positions, otherwise return #f.
The procedure string-ci=? treats upper and lower case letters as though they were the same character, but string=? treats upper and lower case as distinct characters.
Lexicographic ordering predicate; return #t if s1 is lexicographically less than s2.
Lexicographic ordering predicate; return #t if s1 is lexicographically less than or equal to s2.
Lexicographic ordering predicate; return #t if s1 is lexicographically greater than s2.
Lexicographic ordering predicate; return #t if s1 is lexicographically greater than or equal to s2.
Case-insensitive string equality predicate; return #t if the two strings are the same length and their component characters match (ignoring case) at each position; otherwise return #f.
Case-insensitive lexicographic ordering predicate; return #t if s1 is lexicographically less than s2 regardless of case.
Case-insensitive lexicographic ordering predicate; return #t if s1 is lexicographically less than or equal to s2 regardless of case.
Case-insensitive lexicographic ordering predicate; return #t if s1 is lexicographically greater than s2 regardless of case.
Case-insensitive lexicographic ordering predicate; return #t if s1 is lexicographically greater than or equal to s2 regardless of case.
Apply proc_lt, proc_eq, proc_gt to the mismatch index, depending upon whether s1 is less than, equal to, or greater than s2. The mismatch index is the largest index i such that for every 0 <= j < i, s1[j] = s2[j] – that is, i is the first position that does not match.
Apply proc_lt, proc_eq, proc_gt to the mismatch index, depending upon whether s1 is less than, equal to, or greater than s2. The mismatch index is the largest index i such that for every 0 <= j < i, s1[j] = s2[j] – that is, i is the first position where the lowercased letters do not match.
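To make the mismatch index concrete, here is a minimal sketch using the SRFI-13 names string-compare and string-compare-ci. The three lambdas simply tag the index so that it is visible which one was called.

```scheme
;; "abcd" and "abxy" first differ at index 2, and #\c sorts before #\x,
;; so proc_lt is applied to 2.
(string-compare "abcd" "abxy"
                (lambda (i) (list 'less i))
                (lambda (i) (list 'equal i))
                (lambda (i) (list 'greater i)))      ; ⇒ (less 2)

;; Ignoring case, "ABCD" and "abca" first differ at index 3,
;; where d sorts after a, so proc_gt is applied to 3.
(string-compare-ci "ABCD" "abca"
                   (lambda (i) (list 'less i))
                   (lambda (i) (list 'equal i))
                   (lambda (i) (list 'greater i)))   ; ⇒ (greater 3)
```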
Return #f if s1 and s2 are not equal, a true value otherwise.
Return #f if s1 and s2 are equal, a true value otherwise.
Return #f if s1 is greater than or equal to s2, a true value otherwise.
Return #f if s1 is less than or equal to s2, a true value otherwise.
Return #f if s1 is greater than s2, a true value otherwise.
Return #f if s1 is less than s2, a true value otherwise.
Return #f if s1 and s2 are not equal, a true value otherwise. The character comparison is done case-insensitively.
Return #f if s1 and s2 are equal, a true value otherwise. The character comparison is done case-insensitively.
Return #f if s1 is greater than or equal to s2, a true value otherwise. The character comparison is done case-insensitively.
Return #f if s1 is less than or equal to s2, a true value otherwise. The character comparison is done case-insensitively.
Return #f if s1 is greater than s2, a true value otherwise. The character comparison is done case-insensitively.
Return #f if s1 is less than s2, a true value otherwise. The character comparison is done case-insensitively.
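Since the SRFI-13 predicates promise only "a true value" rather than #t, a sketch of their use is best written so that the result is consumed as a condition. The sample strings below are arbitrary.

```scheme
;; Failing comparisons return #f.
(string= "abc" "abd")                         ; ⇒ #f
(string<> "abc" "abc")                        ; ⇒ #f
(string> "apple" "banana")                    ; ⇒ #f

;; Successful comparisons return some true value; use it as a condition.
(if (string< "apple" "banana") 'yes 'no)      ; ⇒ yes
(if (string-ci= "Guile" "GUILE") 'yes 'no)    ; ⇒ yes
```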
Compute a hash value for S. The optional argument bound is a non-negative exact integer specifying the range of the hash function. A positive value restricts the return value to the range [0,bound).
Compute a hash value for S. The optional argument bound is a non-negative exact integer specifying the range of the hash function. A positive value restricts the return value to the range [0,bound).
Because the same visual appearance of an abstract Unicode character can
be obtained via multiple sequences of Unicode characters, even the
case-insensitive string comparison functions described above may return
#f when presented with strings containing different
representations of the same character. For example, the Unicode
character “LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE” can be
represented with a single character (U+1E69) or by the character “LATIN
SMALL LETTER S” (U+0073) followed by the combining marks “COMBINING
DOT BELOW” (U+0323) and “COMBINING DOT ABOVE” (U+0307).
For this reason, it is often desirable to ensure that the strings to be compared are using a mutually consistent representation for every character. The Unicode standard defines two methods of normalizing the contents of strings: Decomposition, which breaks composite characters into a set of constituent characters with an ordering defined by the Unicode Standard; and composition, which performs the converse.
There are two decomposition operations. “Canonical decomposition” produces character sequences that share the same visual appearance as the original characters, while “compatibility decomposition” produces ones whose visual appearances may differ from the originals but which represent the same abstract character.
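These operations are encapsulated in the following set of normalization forms:
- NFD: Characters are decomposed to their canonical forms.
- NFKD: Characters are decomposed to their compatibility forms.
- NFC: Characters are decomposed to their canonical forms, then composed.
- NFKC: Characters are decomposed to their compatibility forms, then composed.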
The functions below put their arguments into one of the forms described above.
Return the NFD normalized form of s.
Return the NFKD normalized form of s.
Return the NFC normalized form of s.
Return the NFKC normalized form of s.
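The U+1E69 example from the discussion above can be sketched with the procedures string-normalize-nfd and string-normalize-nfc. The names `composed` and `decomposed` are illustrative, and integer->char is used to avoid depending on how the source file is encoded.

```scheme
;; One precomposed character versus base letter plus two combining marks.
(define composed (string (integer->char #x1E69)))
(define decomposed
  (string #\s (integer->char #x0323) (integer->char #x0307)))

;; Without normalization the two representations compare as different.
(string=? composed decomposed)                          ; ⇒ #f

;; After normalizing both to the same form they compare equal.
(string=? (string-normalize-nfc composed)
          (string-normalize-nfc decomposed))            ; ⇒ #t

;; NFD splits the precomposed character into three code points.
(string-length (string-normalize-nfd composed))         ; ⇒ 3
```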
Search through the string s from left to right, returning the index of the first occurrence of a character which
- equals char_pred, if it is a character,
- satisfies the predicate char_pred, if it is a procedure,
- is in the set char_pred, if it is a character set.
Return #f if no match is found.
Search through the string s from right to left, returning the index of the last occurrence of a character which
- equals char_pred, if it is a character,
- satisfies the predicate char_pred, if it is a procedure,
- is in the set char_pred, if it is a character set.
Return #f if no match is found.
Return the length of the longest common prefix of the two strings.
Return the length of the longest common prefix of the two strings, ignoring character case.
Return the length of the longest common suffix of the two strings.
Return the length of the longest common suffix of the two strings, ignoring character case.
Is s1 a prefix of s2?
Is s1 a prefix of s2, ignoring character case?
Is s1 a suffix of s2?
Is s1 a suffix of s2, ignoring character case?
Search through the string s from right to left, returning the index of the last occurrence of a character which
- equals char_pred, if it is a character,
- satisfies the predicate char_pred, if it is a procedure,
- is in the set char_pred, if it is a character set.
Return #f if no match is found.
Search through the string s from left to right, returning the index of the first occurrence of a character which
- does not equal char_pred, if it is a character,
- does not satisfy the predicate char_pred, if it is a procedure,
- is not in the set char_pred, if it is a character set.
Search through the string s from right to left, returning the index of the last occurrence of a character which
- does not equal char_pred, if it is a character,
- does not satisfy the predicate char_pred, if it is a procedure,
- is not in the set char_pred, if it is a character set.
Return the number of characters in the string s which
- equal char_pred, if it is a character,
- satisfy the predicate char_pred, if it is a procedure,
- are in the set char_pred, if it is a character set.
Does string s1 contain string s2? Return the index in s1 where s2 occurs as a substring, or false. The optional start/end indices restrict the operation to the indicated substrings.
Does string s1 contain string s2? Return the index in s1 where s2 occurs as a substring, or false. The optional start/end indices restrict the operation to the indicated substrings. Character comparison is done case-insensitively.
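The searching procedures in this section can be compared on a few sample strings. This is a sketch using the SRFI-13 names; all indices are zero-based.

```scheme
(string-index "hello world" #\o)                  ; ⇒ 4
(string-rindex "hello world" #\o)                 ; ⇒ 7
(string-index "hello" #\z)                        ; ⇒ #f

;; Skip over leading or trailing whitespace.
(string-skip "   abc" char-whitespace?)           ; ⇒ 3
(string-skip-right "abc   " char-whitespace?)     ; ⇒ 2

(string-count "banana" #\a)                       ; ⇒ 3

(string-prefix-length "guile" "guide")            ; ⇒ 3
(string-prefix? "gu" "guile")                     ; ⇒ #t
(string-suffix? "le" "guile")                     ; ⇒ #t

(string-contains "hello world" "wor")             ; ⇒ 6
(string-contains-ci "Hello World" "WORLD")        ; ⇒ 6
(string-contains "hello world" "xyz")             ; ⇒ #f
```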
These are procedures for mapping strings to their upper- or lower-case equivalents, respectively, or for capitalizing strings.
They use the basic case mapping rules for Unicode characters. No special language or context rules are considered. The resulting strings are guaranteed to be the same length as the input strings.
See the (ice-9 i18n) module, for locale-dependent case conversions.
Upcase every character in str.
Destructively upcase every character in str.
(string-upcase! y) ⇒ "ARRDEFG"
y ⇒ "ARRDEFG"
Downcase every character in str.
Destructively downcase every character in str.
y ⇒ "ARRDEFG"
(string-downcase! y) ⇒ "arrdefg"
y ⇒ "arrdefg"
Return a freshly allocated string with the characters in str, where the first character of every word is capitalized.
Upcase the first character of every word in str destructively and return str.
y ⇒ "hello world"
(string-capitalize! y) ⇒ "Hello World"
y ⇒ "Hello World"
Titlecase every first character in a word in str.
Destructively titlecase every first character in a word in str.
Reverse the string str. The optional arguments start and end delimit the region of str to operate on.
Reverse the string str in-place. The optional arguments start and end delimit the region of str to operate on. The return value is unspecified.
Return a newly allocated string whose characters form the concatenation of the given strings, args.
(let ((h "hello ")) (string-append h "world")) ⇒ "hello world"
Like string-append, but the result may share memory with the argument strings.
Append the elements of ls (which must be strings) together into a single string. Guaranteed to return a freshly allocated string.
Without optional arguments, this procedure is equivalent to
(string-concatenate (reverse ls))
If the optional argument final_string is specified, it is consed onto the beginning of ls before performing the list-reverse and string-concatenate operations. If end is given, only the characters of final_string up to index end are used.
Guaranteed to return a freshly allocated string.
Like string-concatenate, but the result may share memory with the strings in the list ls.
Like string-concatenate-reverse, but the result may share memory with the strings in the ls arguments.
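The relationship between string-concatenate and string-concatenate-reverse, including the optional final_string argument, can be sketched like this:

```scheme
(string-concatenate '("a" "b" "c"))               ; ⇒ "abc"

;; The list is reversed before concatenation.
(string-concatenate-reverse '("c" "b" "a"))       ; ⇒ "abc"

;; final_string is consed onto the front of the list first,
;; so it ends up last in the result.
(string-concatenate-reverse '("b" "a") "c")       ; ⇒ "abc"
```

The reversed variant suits loops that accumulate string fragments by consing them onto a list, since the fragments then come out in their original order.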
proc is a char->char procedure; it is mapped over s. The order in which the procedure is applied to the string elements is not specified.
proc is a char->char procedure; it is mapped over s. The order in which the procedure is applied to the string elements is not specified. The string s is modified in-place; the return value is not specified.
proc is mapped over s in left-to-right order. The return value is not specified.
Call (proc i) for each index i in s, from left to right.
For example, to change characters to alternately upper and lower case,
(define str (string-copy "studly"))
(string-for-each-index
 (lambda (i)
   (string-set! str i
                ((if (even? i) char-upcase char-downcase)
                 (string-ref str i))))
 str)
str ⇒ "StUdLy"
Fold kons over the characters of s, with knil as the terminating element, from left to right. kons must expect two arguments: The actual character and the last result of kons' application.
Fold kons over the characters of s, with knil as the terminating element, from right to left. kons must expect two arguments: The actual character and the last result of kons' application.
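The direction of each fold becomes visible when kons is cons and knil is the empty list. A minimal sketch, together with string-map and a counting fold:

```scheme
;; Left to right: the last character ends up at the head of the list.
(string-fold cons '() "abc")              ; ⇒ (#\c #\b #\a)

;; Right to left: the list comes out in string order.
(string-fold-right cons '() "abc")        ; ⇒ (#\a #\b #\c)

;; kons receives the character first and the accumulated result second.
(string-fold (lambda (c count)
               (if (char=? c #\a) (+ count 1) count))
             0
             "banana")                    ; ⇒ 3

(string-map char-upcase "abc")            ; ⇒ "ABC"
```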
- g is used to generate a series of seed values from the initial seed: seed, (g seed), (g^2 seed), (g^3 seed), ...
- p tells us when to stop – when it returns true when applied to one of these seed values.
- f maps each seed value to the corresponding character in the result string. These chars are assembled into the string in a left-to-right order.
- base is the optional initial/leftmost portion of the constructed string; it defaults to the empty string.
- make_final is applied to the terminal seed value (on which p returns true) to produce the final/rightmost portion of the constructed string. The default is nothing extra.
- g is used to generate a series of seed values from the initial seed: seed, (g seed), (g^2 seed), (g^3 seed), ...
- p tells us when to stop – when it returns true when applied to one of these seed values.
- f maps each seed value to the corresponding character in the result string. These chars are assembled into the string in a right-to-left order.
- base is the optional initial/rightmost portion of the constructed string; it defaults to the empty string.
- make_final is applied to the terminal seed value (on which p returns true) to produce the final/leftmost portion of the constructed string. It defaults to (lambda (x) "").
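As a sketch of how p, f, g and the seed cooperate, the following uses a list of characters as the seed, so that p is null?, f is car and g is cdr. The argument order (p f g seed, then the optional base) follows SRFI-13.

```scheme
;; Consume the list from the front; characters are appended left to right.
(string-unfold null? car cdr '(#\a #\b #\c))            ; ⇒ "abc"

;; The same seed, but characters are assembled right to left.
(string-unfold-right null? car cdr '(#\a #\b #\c))      ; ⇒ "cba"

;; base supplies the leftmost portion of the result.
(string-unfold null? car cdr '(#\b #\c) "a")            ; ⇒ "abc"
```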
This is the extended substring procedure that implements replicated copying of a substring of some string.
s is a string, start and end are optional arguments that demarcate a substring of s, defaulting to 0 and the length of s. Replicate this substring up and down index space, in both the positive and negative directions.
xsubstring returns the substring of this string beginning at index from, and ending at to, which defaults to from + (end - start).
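The idea of replicating a string across index space is easier to see with concrete indices. The following sketch uses the sample values given in SRFI-13:

```scheme
;; Start at index 2 and take one full length: a left rotation.
(xsubstring "abcdef" 2)        ; ⇒ "cdefab"

;; A negative from index reaches into the copy to the left: a right rotation.
(xsubstring "abcdef" -2)       ; ⇒ "efabcd"

;; A to index beyond the string length repeats the string.
(xsubstring "abc" 0 7)         ; ⇒ "abcabca"
```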
Exactly the same as xsubstring, but the extracted text is written into the string target starting at index tstart. The operation is not defined if (eq? target s) or these arguments share storage – you cannot copy a string on top of itself.
Return the string s1, but with the characters start1 ... end1 replaced by the characters start2 ... end2 from s2.
Split the string s into a list of substrings, where each substring is a maximal non-empty contiguous sequence of characters from the character set token_set, which defaults to char-set:graphic. If start or end indices are provided, they restrict string-tokenize to operating on the indicated substring of s.
Filter the string s, retaining only those characters which satisfy char_pred.
If char_pred is a procedure, it is applied to each character as a predicate; if it is a character, it is tested for equality; and if it is a character set, it is tested for membership.
Delete characters satisfying char_pred from s.
If char_pred is a procedure, it is applied to each character as a predicate; if it is a character, it is tested for equality; and if it is a character set, it is tested for membership.
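A short sketch of string-tokenize, string-filter and string-delete follows. It assumes the SRFI-13 argument order for the last two, with char_pred first and the string second.

```scheme
;; Default token set char-set:graphic: tokens are runs of non-blank characters.
(string-tokenize "hello,  world foo")           ; ⇒ ("hello," "world" "foo")

;; An explicit token set: keep only runs of digits.
(string-tokenize "a1b22c" char-set:digit)       ; ⇒ ("1" "22")

;; Keep, or drop, the characters matching char_pred.
(string-filter char-alphabetic? "a1b2c3")       ; ⇒ "abc"
(string-delete char-numeric? "a1b2c3")          ; ⇒ "abc"
(string-delete #\a "banana")                    ; ⇒ "bnn"
```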
When creating a Scheme string from a C string or when converting a Scheme string to a C string, the concept of character encoding becomes important.
In C, a string is just a sequence of bytes, and the character encoding describes the relation between these bytes and the actual characters that make up the string. For Scheme strings, character encoding is not an issue (most of the time), since in Scheme you never get to see the bytes, only the characters.
Converting to C and converting from C each have their own challenges.
When converting from C to Scheme, it is important that the sequence of bytes in the C string be valid with respect to its encoding. ASCII strings, for example, can't have any bytes greater than 127. An ASCII byte greater than 127 is considered ill-formed and cannot be converted into a Scheme character.
Problems can occur in the reverse operation as well. Not all character encodings can hold all possible Scheme characters. Some encodings, like ASCII for example, can only describe a small subset of all possible characters. So, when converting to C, one must first decide what to do with Scheme characters that can't be represented in the C string.
Converting a Scheme string to a C string will often allocate fresh
memory to hold the result. You must take care that this memory is
properly freed eventually. In many cases, this can be achieved by
using scm_dynwind_free inside an appropriate dynwind context
(see Dynamic Wind).
Creates a new Scheme string that has the same contents as str when interpreted in the character encoding of the current locale.
For scm_from_locale_string, str must be null-terminated.
For scm_from_locale_stringn, len specifies the length of str in bytes, and str does not need to be null-terminated. If len is (size_t)-1, then str does need to be null-terminated and the real length will be found with strlen.
If the C string is ill-formed, an error will be raised.
Note that these functions should not be used to convert C string constants, because there is no guarantee that the current locale will match that of the source code. To convert C string constants, use scm_from_latin1_string, scm_from_utf8_string or scm_from_utf32_string.
Like scm_from_locale_string and scm_from_locale_stringn, respectively, but also frees str with free eventually. Thus, you can use this function when you would free str anyway immediately after creating the Scheme string. In certain cases, Guile can then use str directly as its internal representation.
Returns a C string with the same contents as str in the character encoding of the current locale. The C string must be freed with free eventually, maybe by using scm_dynwind_free (see Dynamic Wind).
For scm_to_locale_string, the returned string is null-terminated and an error is signalled when str contains #\nul characters.
For scm_to_locale_stringn and lenp not NULL, str might contain #\nul characters and the length of the returned string in bytes is stored in *lenp. The returned string will not be null-terminated in this case. If lenp is NULL, scm_to_locale_stringn behaves like scm_to_locale_string.
If a character in str cannot be represented in the character encoding of the current locale, the default port conversion strategy is used. See Ports, for more on conversion strategies.
If the conversion strategy is error, an error will be raised. If it is substitute, a replacement character, such as a question mark, will be inserted in its place. If it is escape, a hex escape will be inserted in its place.
Puts str as a C string in the current locale encoding into the memory pointed to by buf. The buffer at buf has room for max_len bytes and scm_to_locale_stringbuf will never store more than that. No terminating '\0' will be stored.
The return value of scm_to_locale_stringbuf is the number of bytes that are needed for all of str, regardless of whether buf was large enough to hold them. Thus, when the return value is larger than max_len, only max_len bytes have been stored and you probably need to try again with a larger buffer.
For most situations, string conversion should occur using the current
locale, such as with the functions above. But there may be cases where
one wants to convert strings from a character encoding other than the
locale's character encoding. For these cases, the lower-level functions
scm_to_stringn and scm_from_stringn are provided. These
functions should seldom be necessary if one is properly using locales.
This is an enumerated type that can take one of three values: SCM_FAILED_CONVERSION_ERROR, SCM_FAILED_CONVERSION_QUESTION_MARK, and SCM_FAILED_CONVERSION_ESCAPE_SEQUENCE. They are used to indicate a strategy for handling characters that cannot be converted to or from a given character encoding. SCM_FAILED_CONVERSION_ERROR indicates that a conversion should throw an error if some characters cannot be converted. SCM_FAILED_CONVERSION_QUESTION_MARK indicates that a conversion should replace unconvertible characters with the question mark character. And, SCM_FAILED_CONVERSION_ESCAPE_SEQUENCE requests that a conversion should replace an unconvertible character with an escape sequence.
While all three strategies apply when converting Scheme strings to C, only SCM_FAILED_CONVERSION_ERROR and SCM_FAILED_CONVERSION_QUESTION_MARK can be used when converting C strings to Scheme.
This function returns a newly allocated C string from the Guile string str. The length of the returned string in bytes will be returned in lenp. The character encoding of the C string is passed as the ASCII, null-terminated C string encoding. The handler parameter gives a strategy for dealing with characters that cannot be converted into encoding.
If lenp is NULL, this function will return a null-terminated C string. It will throw an error if the string contains a null character.
This function returns a Scheme string from the C string str. The length in bytes of the C string is input as len. The encoding of the C string is passed as the ASCII, null-terminated C string encoding. The handler parameter suggests a strategy for dealing with unconvertible characters.
The following conversion functions are provided as a convenience for the most commonly used encodings.
Return a Scheme string from the null-terminated C string str, which is ISO-8859-1-, UTF-8-, or UTF-32-encoded. These functions should be used to convert hard-coded C string constants into Scheme strings.
Return a Scheme string from C string str, which is ISO-8859-1-, UTF-8-, or UTF-32-encoded, of length len. len is the number of bytes pointed to by str for scm_from_latin1_stringn and scm_from_utf8_stringn; it is the number of elements (code points) in str in the case of scm_from_utf32_stringn.
Return a newly allocated, ISO-8859-1-, UTF-8-, or UTF-32-encoded C string from Scheme string str. An error is thrown when str cannot be converted to the specified encoding. If lenp is NULL, the returned C string will be null terminated, and an error will be thrown if the C string would otherwise contain null characters. If lenp is not NULL, the string is not null terminated, and the length of the returned string is returned in lenp. The length returned is the number of bytes for scm_to_latin1_stringn and scm_to_utf8_stringn; it is the number of elements (code points) for scm_to_utf32_stringn.
Guile stores each string in memory as a contiguous array of Unicode code points along with an associated set of attributes. If all of the code points of a string have an integer value between 0 and 255 inclusive, the code point array is stored as one byte per code point: it is stored as an ISO-8859-1 (aka Latin-1) string. If any of the code points of the string has an integer value greater than 255, the code point array is stored as four bytes per code point: it is stored as a UTF-32 string.
Conversion between the one-byte-per-code-point and four-bytes-per-code-point representations happens automatically as necessary.
No API is provided to set the internal representation of strings; however, there is a pair of procedures available to query it. These are debugging procedures. Using them in production code is discouraged, since the details of Guile's internal representation of strings may change from release to release.
Return the number of bytes used to encode a Unicode code point in string str. The result is one or four.
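The two storage widths can be observed with string-bytes-per-char. This sketch assumes the representation described above for this Guile version; as noted, it is a debugging aid and the results may differ in other releases.

```scheme
;; All code points are at most 255: one byte per code point.
(string-bytes-per-char "abc")                           ; ⇒ 1

;; U+03BB (GREEK SMALL LETTER LAMDA) exceeds 255: four bytes per code point.
(string-bytes-per-char (string (integer->char #x3BB)))  ; ⇒ 4
```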
Returns an association list containing debugging information for str. The association list has the following entries.
- string: The string itself.
- start: The start index of the string into its stringbuf.
- length: The length of the string.
- shared: If this string is a substring, it returns its parent string. Otherwise, it returns #f.
- read-only: #t if the string is read-only.
- stringbuf-chars: A new string containing this string's stringbuf's characters.
- stringbuf-length: The number of characters in this stringbuf.
- stringbuf-shared: #t if this stringbuf is shared.
- stringbuf-wide: #t if this stringbuf's characters are stored in a 32-bit buffer, or #f if they are stored in an 8-bit buffer.
A bytevector is a raw bit string. The (rnrs bytevectors)
module provides the programming interface specified by the
Revised^6 Report on the Algorithmic Language Scheme (R6RS). It contains procedures to manipulate bytevectors and
interpret their contents in a number of ways: bytevector contents can be
accessed as signed or unsigned integers of various sizes and endianness,
as IEEE-754 floating point numbers, or as strings. It is a useful tool
to encode and decode binary data.
The R6RS (Section 4.3.4) specifies an external representation for
bytevectors, whereby the octets (integers in the range 0–255) contained
in the bytevector are represented as a list prefixed by #vu8:
#vu8(1 53 204)
denotes a 3-byte bytevector containing the octets 1, 53, and 204. Like string literals, booleans, etc., bytevectors are “self-quoting”, i.e., they do not need to be quoted:
#vu8(1 53 204)
⇒ #vu8(1 53 204)
Bytevectors can be used with the binary input/output primitives of the R6RS (see R6RS I/O Ports).
Some of the following procedures take an endianness parameter. The endianness is defined as the order of bytes in multi-byte numbers: numbers encoded in big endian have their most significant bytes written first, whereas numbers encoded in little endian have their least significant bytes first.
Little-endian is the native endianness of the IA32 architecture and
its derivatives, while big-endian is native to SPARC and PowerPC,
among others. The native-endianness procedure returns the
native endianness of the machine it runs on.
Return a value denoting the native endianness of the host machine.
Return an object denoting the endianness specified by symbol. If symbol is neither big nor little then an error is raised at expand-time.
The objects denoting big- and little-endianness, respectively.
Bytevectors can be created, copied, and analyzed with the following procedures and C functions.
Return a new bytevector of len bytes. Optionally, if fill is given, fill it with fill; fill must be in the range [-128,255].
Return true if obj is a bytevector.
Return the length in bytes of bytevector bv.
Likewise, return the length in bytes of bytevector bv.
Return true if bv1 equals bv2, i.e., if they have the same length and contents.
Fill bytevector bv with fill, a byte.
Copy len bytes from source into target, starting reading from source-start (a positive index within source) and start writing at target-start. It is permitted for the source and target regions to overlap.
Return a newly allocated copy of bv.
Return the byte at index in bytevector bv.
Set the byte at index in bv to value.
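The basic bytevector operations can be sketched as follows, using the R6RS names from (rnrs bytevectors). The variable names `bv` and `copy` are illustrative.

```scheme
(use-modules (rnrs bytevectors))

;; Four bytes, all initialized to 0.
(define bv (make-bytevector 4 0))
(bytevector-length bv)                 ; ⇒ 4

;; The same byte read back as unsigned and as signed.
(bytevector-u8-set! bv 0 255)
(bytevector-u8-ref bv 0)               ; ⇒ 255
(bytevector-s8-ref bv 0)               ; ⇒ -1

;; A copy is equal until one of the two is modified.
(define copy (bytevector-copy bv))
(bytevector=? bv copy)                 ; ⇒ #t
(bytevector-fill! copy 7)
(bytevector=? bv copy)                 ; ⇒ #f

;; Copy 2 bytes from index 0 of copy to index 2 of bv.
(bytevector-copy! copy 0 bv 2 2)
(bytevector->u8-list bv)               ; ⇒ (255 0 7 7)
```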
Low-level C macros are available. They do not perform any type-checking; as such they should be used with care.
Return a pointer to the contents of bytevector bv.
The contents of a bytevector can be interpreted as a sequence of integers of any given size, sign, and endianness.
(let ((bv (make-bytevector 4)))
  (bytevector-u8-set! bv 0 #x12)
  (bytevector-u8-set! bv 1 #x34)
  (bytevector-u8-set! bv 2 #x56)
  (bytevector-u8-set! bv 3 #x78)
  (map (lambda (number)
         (number->string number 16))
       (list (bytevector-u8-ref bv 0)
             (bytevector-u16-ref bv 0 (endianness big))
             (bytevector-u32-ref bv 0 (endianness little)))))
⇒ ("12" "1234" "78563412")
The most generic procedures to interpret bytevector contents as integers are described below.
Return the size-byte long unsigned integer at index index in bv, decoded according to endianness.
Return the size-byte long signed integer at index index in bv, decoded according to endianness.
Set the size-byte long unsigned integer at index to value, encoded according to endianness.
Set the size-byte long signed integer at index to value, encoded according to endianness.
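The generic accessors take an arbitrary size in bytes, which the fixed-width procedures cannot express. A minimal sketch with a 3-byte integer, using the R6RS argument order (bv, index, then value for the setters, then endianness and size); `bv` is an illustrative name:

```scheme
(use-modules (rnrs bytevectors))

(define bv (u8-list->bytevector '(#x12 #x34 #x56)))

;; The same three bytes decoded with each endianness, shown in hexadecimal.
(number->string (bytevector-uint-ref bv 0 (endianness big) 3) 16)     ; ⇒ "123456"
(number->string (bytevector-uint-ref bv 0 (endianness little) 3) 16)  ; ⇒ "563412"

;; Signed decoding: #xFFFE as a 2-byte two's complement integer.
(bytevector-sint-ref (u8-list->bytevector '(#xFF #xFE))
                     0 (endianness big) 2)                            ; ⇒ -2

;; Encoding: store 1 as a 3-byte little-endian integer.
(bytevector-uint-set! bv 0 1 (endianness little) 3)
(bytevector->u8-list bv)                                              ; ⇒ (1 0 0)
```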
The following procedures are similar to the ones above, but specialized to a given integer size:
Return the unsigned (respectively signed) n-bit integer (where n is 8, 16, 32 or 64) from bv at index, decoded according to endianness.
Store value as an unsigned (respectively signed) n-bit integer (where n is 8, 16, 32 or 64) in bv at index, encoded according to endianness.
Finally, a variant specialized for the host's endianness is available
for each of these functions (with the exception of the u8
accessors, for obvious reasons):
Return the unsigned (respectively signed) n-bit integer (where n is 8, 16, 32 or 64) from bv at index, decoded according to the host's native endianness.
Store value as an unsigned (respectively signed) n-bit integer (where n is 8, 16, 32 or 64) in bv at index, encoded according to the host's native endianness.
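The fixed-size and native-endianness accessors can be compared as follows. This is only a sketch; it relies on native-endianness, which (rnrs bytevectors) exports, so that the last check holds on any host.

```scheme
(use-modules (rnrs bytevectors))

(define bv (u8-list->bytevector '(#xff #xfe #x00 #x01)))

(bytevector-u16-ref bv 0 (endianness big))     ; ⇒ 65534
(bytevector-s16-ref bv 0 (endianness big))     ; ⇒ -2
(bytevector-u16-ref bv 2 (endianness little))  ; ⇒ 256

;; The native variants behave like the explicit ones called with
;; (native-endianness), whatever the host's byte order is.
(= (bytevector-u32-native-ref bv 0)
   (bytevector-u32-ref bv 0 (native-endianness)))  ; ⇒ #t
```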
Bytevector contents can readily be converted to/from lists of signed or unsigned integers:
(bytevector->sint-list (u8-list->bytevector (make-list 4 255))
(endianness little) 2)
⇒ (-1 -1)
Return a newly allocated list of unsigned 8-bit integers from the contents of bv.
Return a newly allocated bytevector consisting of the unsigned 8-bit integers listed in lst.
Return a list of unsigned integers of size bytes representing the contents of bv, decoded according to endianness.
Return a list of signed integers of size bytes representing the contents of bv, decoded according to endianness.
Return a new bytevector containing the unsigned integers listed in lst and encoded on size bytes according to endianness.
Return a new bytevector containing the signed integers listed in lst and encoded on size bytes according to endianness.
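A brief round-trip sketch of the list conversions, assuming (rnrs bytevectors) is loaded. Note that the endianness argument comes before the size argument in these procedures.

```scheme
(use-modules (rnrs bytevectors))

;; Encode two 16-bit big-endian integers: 1 is #x0001, 515 is #x0203.
(define bv (uint-list->bytevector '(1 515) (endianness big) 2))
(bytevector->u8-list bv)                          ; ⇒ (0 1 2 3)

;; Decoding with the same parameters gives the original list back;
;; decoding with the other byte order gives different numbers.
(bytevector->uint-list bv (endianness big) 2)     ; ⇒ (1 515)
(bytevector->uint-list bv (endianness little) 2)  ; ⇒ (256 770)

;; Signed integers are encoded in two's complement.
(bytevector->u8-list
 (sint-list->bytevector '(-2) (endianness big) 2)) ; ⇒ (255 254)
```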
Bytevector contents can also be accessed as IEEE-754 single- or double-precision floating point numbers (respectively 32 and 64-bit long) using the procedures described here.
Return the IEEE-754 single-precision floating point number from bv at index according to endianness.
Store real number value in bv at index according to endianness.
Specialized procedures are also available:
Return the IEEE-754 single-precision floating point number from bv at index according to the host's native endianness.
Store real number value in bv at index according to the host's native endianness.
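The floating point accessors can be exercised as in the sketch below, assuming (rnrs bytevectors) is loaded. The last check illustrates that storing a number in single precision may lose precision compared with Guile's own inexact numbers, which are doubles.

```scheme
(use-modules (rnrs bytevectors))

(define bv (make-bytevector 8 0))

;; 1.0 in IEEE-754 single precision is #x3F800000.
(bytevector-ieee-single-set! bv 0 1.0 (endianness big))
(bytevector->u8-list bv)                               ; ⇒ (63 128 0 0 0 0 0 0)
(bytevector-ieee-single-ref bv 0 (endianness big))     ; ⇒ 1.0

;; A double round-trips exactly.
(bytevector-ieee-double-set! bv 0 0.1 (endianness little))
(= 0.1 (bytevector-ieee-double-ref bv 0 (endianness little)))  ; ⇒ #t

;; The same value stored in single precision does not.
(bytevector-ieee-single-native-set! bv 0 0.1)
(= 0.1 (bytevector-ieee-single-native-ref bv 0))       ; ⇒ #f
```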
Bytevector contents can also be interpreted as Unicode strings encoded in one of the most commonly available encoding formats.
(utf8->string (u8-list->bytevector '(99 97 102 101)))
⇒ "cafe"
(string->utf8 "café") ;; SMALL LATIN LETTER E WITH ACUTE ACCENT
⇒ #vu8(99 97 102 195 169)
Return a newly allocated bytevector that contains the UTF-8, UTF-16, or UTF-32 (aka. UCS-4) encoding of str. For UTF-16 and UTF-32, endianness should be the symbol big or little; when omitted, it defaults to big endian.
Return a newly allocated string that contains the UTF-8-, UTF-16-, or UTF-32-decoded contents of bytevector utf. For UTF-16 and UTF-32, endianness should be the symbol big or little; when omitted, it defaults to big endian.
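The effect of the optional endianness argument can be seen in a short sketch, assuming (rnrs bytevectors) is loaded. The string cafe is built with integer->char so that the example does not depend on the encoding of the source file.

```scheme
(use-modules (rnrs bytevectors))

;; UTF-16 defaults to big endian; pass (endianness little) to override.
(string->utf16 "hi")                      ; ⇒ #vu8(0 104 0 105)
(string->utf16 "hi" (endianness little))  ; ⇒ #vu8(104 0 105 0)
(string->utf32 "A")                       ; ⇒ #vu8(0 0 0 65)

;; "caf" followed by U+00E9, LATIN SMALL LETTER E WITH ACUTE.
(define cafe (string-append "caf" (string (integer->char #xe9))))
(bytevector-length (string->utf8 cafe))   ; ⇒ 5

;; Decoding must use the same byte order as encoding.
(string=? cafe
          (utf16->string (string->utf16 cafe (endianness little))
                         (endianness little)))  ; ⇒ #t
```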
As an extension to the R6RS, Guile allows bytevectors to be manipulated with the generalized vector procedures (see Generalized Vectors). This also allows bytevectors to be accessed using the generic array procedures (see Array Procedures). When using these APIs, bytes are accessed one at a time as 8-bit unsigned integers:
(define bv #vu8(0 1 2 3))
(generalized-vector? bv)
⇒ #t
(generalized-vector-ref bv 2)
⇒ 2
(generalized-vector-set! bv 2 77)
(array-ref bv 2)
⇒ 77
(array-type bv)
⇒ vu8
Bytevectors may also be accessed with the SRFI-4 API. See SRFI-4 and Bytevectors, for more information.
Symbols in Scheme are widely used in three ways: as items of discrete data, as lookup keys for alists and hash tables, and to denote variable references.
A symbol is similar to a string in that it is defined by a sequence of characters. The sequence of characters is known as the symbol's name. In the usual case — that is, where the symbol's name doesn't include any characters that could be confused with other elements of Scheme syntax — a symbol is written in a Scheme program by writing the sequence of characters that make up the name, without any quotation marks or other special syntax. For example, the symbol whose name is “multiply-by-2” is written, simply:
multiply-by-2
Notice how this differs from a string with contents “multiply-by-2”, which is written with double quotation marks, like this:
"multiply-by-2"
Looking beyond how they are written, symbols are different from strings in two important respects.
The first important difference is uniqueness. If the same-looking string is read twice from two different places in a program, the result is two different string objects whose contents just happen to be the same. If, on the other hand, the same-looking symbol is read twice from two different places in a program, the result is the same symbol object both times.
Given two read symbols, you can use eq? to test whether they are
the same (that is, have the same name). eq? is the most
efficient comparison operator in Scheme, and comparing two symbols like
this is as fast as comparing, for example, two numbers. Given two
strings, on the other hand, you must use equal? or
string=?, which are much slower comparison operators, to
determine whether the strings have the same contents.
(define sym1 (quote hello))
(define sym2 (quote hello))
(eq? sym1 sym2) ⇒ #t
(define str1 "hello")
(define str2 "hello")
(eq? str1 str2) ⇒ #f
(equal? str1 str2) ⇒ #t
The second important difference is that symbols, unlike strings, are not
self-evaluating. This is why we need the (quote ...)s in the
example above: (quote hello) evaluates to the symbol named
"hello" itself, whereas an unquoted hello is read as the
symbol named "hello" and evaluated as a variable reference ... about
which more below (see Symbol Variables).
Numbers and symbols are similar to the extent that they both lend
themselves to eq? comparison. But symbols are more descriptive
than numbers, because a symbol's name can be used directly to describe
the concept for which that symbol stands.
For example, imagine that you need to represent some colours in a computer program. Using numbers, you would have to choose arbitrarily some mapping between numbers and colours, and then take care to use that mapping consistently:
;; 1=red, 2=green, 3=purple
(if (eq? (colour-of car) 1)
...)
You can make the mapping more explicit and the code more readable by defining constants:
(define red 1)
(define green 2)
(define purple 3)
(if (eq? (colour-of car) red)
...)
But the simplest and clearest approach is not to use numbers at all, but symbols whose names specify the colours that they refer to:
(if (eq? (colour-of car) 'red)
...)
The descriptive advantages of symbols over numbers increase as the set of concepts that you want to describe grows. Suppose that a car object can have other properties as well, such as whether it has or uses automatic or manual transmission, leaded or unleaded fuel, and power steering (or not).
Then a car's combined property set could be naturally represented and manipulated as a list of symbols:
(properties-of car1)
⇒
(red manual unleaded power-steering)
(if (memq 'power-steering (properties-of car1))
(display "Unfit people can drive this car.\n")
(display "You'll need strong arms to drive this car!\n"))
-|
Unfit people can drive this car.
Remember, the fundamental property of symbols that we are relying on
here is that an occurrence of 'red in one part of a program is an
indistinguishable symbol from an occurrence of 'red in
another part of a program; this means that symbols can usefully be
compared using eq?. At the same time, symbols have naturally
descriptive names. This combination of efficiency and descriptive power
makes them ideal for use as discrete data.
Given their efficiency and descriptive power, it is natural to use symbols as the keys in an association list or hash table.
To illustrate this, consider a more structured representation of the car properties example from the preceding subsection. Rather than mixing all the properties up together in a flat list, we could use an association list like this:
(define car1-properties '((colour . red)
(transmission . manual)
(fuel . unleaded)
(steering . power-assisted)))
Notice how this structure is more explicit and extensible than the flat
list. For example it makes clear that manual refers to the
transmission rather than, say, the windows or the locking of the car.
It also allows further properties to use the same symbols among their
possible values without becoming ambiguous:
(define car1-properties '((colour . red)
(transmission . manual)
(fuel . unleaded)
(steering . power-assisted)
(seat-colour . red)
(locking . manual)))
With a representation like this, it is easy to use the efficient
assq-XXX family of procedures (see Association Lists) to
extract or change individual pieces of information:
(assq-ref car1-properties 'fuel) ⇒ unleaded
(assq-ref car1-properties 'transmission) ⇒ manual
(assq-set! car1-properties 'seat-colour 'black)
⇒
((colour . red)
(transmission . manual)
(fuel . unleaded)
(steering . power-assisted)
(seat-colour . black)
(locking . manual))
Hash tables also have keys, and exactly the same arguments apply to the
use of symbols in hash tables as in association lists. The hash value
that Guile uses to decide where to add a symbol-keyed entry to a hash
table can be obtained by calling the symbol-hash procedure:
Return a hash value for symbol.
See Hash Tables for information about hash tables in general, and for why you might choose to use a hash table rather than an association list.
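As a small sketch of the same idea with a hash table, the car properties could be stored under symbol keys using the hashq procedures, which compare keys with eq?. The table name car1-table is only illustrative.

```scheme
(define car1-table (make-hash-table))

(hashq-set! car1-table 'colour 'red)
(hashq-set! car1-table 'transmission 'manual)

(hashq-ref car1-table 'colour)          ; ⇒ red
(hashq-ref car1-table 'fuel)            ; ⇒ #f (no such key)
(hashq-ref car1-table 'fuel 'unknown)   ; ⇒ unknown (explicit default)

;; symbol-hash returns an integer, and equal names give equal hashes
;; because equal names denote the same symbol object.
(integer? (symbol-hash 'colour))                  ; ⇒ #t
(= (symbol-hash 'colour)
   (symbol-hash (string->symbol "colour")))       ; ⇒ #t
```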
When an unquoted symbol in a Scheme program is evaluated, it is interpreted as a variable reference, and the result of the evaluation is the appropriate variable's value.
For example, when the expression (string-length "abcd") is read
and evaluated, the sequence of characters string-length is read
as the symbol whose name is "string-length". This symbol is associated
with a variable whose value is the procedure that implements string
length calculation. Therefore evaluation of the string-length
symbol results in that procedure.
The details of the connection between an unquoted symbol and the variable to which it refers are explained elsewhere. See Binding Constructs, for how associations between symbols and variables are created, and Modules, for how those associations are affected by Guile's module system.
Given any Scheme value, you can determine whether it is a symbol using
the symbol? primitive:
Return #t if obj is a symbol, otherwise return #f.
Once you know that you have a symbol, you can obtain its name as a
string by calling symbol->string. Note that Guile differs by
default from R5RS on the details of symbol->string as regards
case-sensitivity:
Return the name of symbol s as a string. By default, Guile reads symbols case-sensitively, so the string returned will have the same case variation as the sequence of characters that caused s to be created.
If Guile is set to read symbols case-insensitively (as specified by R5RS), and s comes into being as part of a literal expression (see Literal expressions) or by a call to the read or string-ci->symbol procedures, Guile converts any alphabetic characters in the symbol's name to lower case before creating the symbol object, so the string returned here will be in lower case.
If s was created by string->symbol, the case of characters in the string returned will be the same as that in the string that was passed to string->symbol, regardless of Guile's case-sensitivity setting at the time s was created.
It is an error to apply mutation procedures like string-set! to strings returned by this procedure.
Most symbols are created by writing them literally in code. However it is also possible to create symbols programmatically using the following procedures:
Return a newly allocated symbol made from the given character arguments.
(symbol #\x #\y #\z) ⇒ xyz
Return a newly allocated symbol made from a list of characters.
(list->symbol '(#\a #\b #\c)) ⇒ abc
Return a newly allocated symbol whose characters form the concatenation of the given symbols, args.
(let ((h 'hello)) (symbol-append h 'world)) ⇒ helloworld
Return the symbol whose name is string. This procedure can create symbols with names containing special characters or letters in the non-standard case, but it is usually a bad idea to create such symbols because in some implementations of Scheme they cannot be read as themselves.
Return the symbol whose name is str. If Guile is currently reading symbols case-insensitively, str is converted to lowercase before the returned symbol is looked up or created.
The following examples illustrate Guile's detailed behaviour as regards the case-sensitivity of symbols:
(read-enable 'case-insensitive) ; R5RS compliant behaviour
(symbol->string 'flying-fish) ⇒ "flying-fish"
(symbol->string 'Martin) ⇒ "martin"
(symbol->string
(string->symbol "Malvina")) ⇒ "Malvina"
(eq? 'mISSISSIppi 'mississippi) ⇒ #t
(string->symbol "mISSISSIppi") ⇒ mISSISSIppi
(eq? 'bitBlt (string->symbol "bitBlt")) ⇒ #f
(eq? 'LolliPop
(string->symbol (symbol->string 'LolliPop))) ⇒ #t
(string=? "K. Harper, M.D."
(symbol->string
(string->symbol "K. Harper, M.D."))) ⇒ #t
(read-disable 'case-insensitive) ; Guile default behaviour
(symbol->string 'flying-fish) ⇒ "flying-fish"
(symbol->string 'Martin) ⇒ "Martin"
(symbol->string
(string->symbol "Malvina")) ⇒ "Malvina"
(eq? 'mISSISSIppi 'mississippi) ⇒ #f
(string->symbol "mISSISSIppi") ⇒ mISSISSIppi
(eq? 'bitBlt (string->symbol "bitBlt")) ⇒ #t
(eq? 'LolliPop
(string->symbol (symbol->string 'LolliPop))) ⇒ #t
(string=? "K. Harper, M.D."
(symbol->string
(string->symbol "K. Harper, M.D."))) ⇒ #t
From C, there are lower level functions that construct a Scheme symbol from a C string in the current locale encoding.
When you want to do more from C, you should convert between symbols
and strings using scm_symbol_to_string and
scm_string_to_symbol and work with the strings.
Construct and return a Scheme symbol whose name is specified by the null-terminated C string name. These are appropriate when the C string is hard-coded in the source code.
Construct and return a Scheme symbol whose name is specified by name. For scm_from_locale_symbol, name must be null terminated; for scm_from_locale_symboln the length of name is specified explicitly by len.
Note that these functions should not be used when name is a C string constant, because there is no guarantee that the current locale will match that of the source code. In such cases, use scm_from_latin1_symbol or scm_from_utf8_symbol.
Like scm_from_locale_symbol and scm_from_locale_symboln, respectively, but also frees str with free eventually. Thus, you can use this function when you would free str anyway immediately after creating the Scheme symbol. In certain cases, Guile can then use str directly as its internal representation.
The size of a symbol can also be obtained from C:
Finally, some applications, especially those that generate new Scheme
code dynamically, need to generate symbols for use in the generated
code. The gensym primitive meets this need:
Create a new symbol with a name constructed from a prefix and a counter value. The string prefix can be specified as an optional argument. Default prefix is ‘ g’. The counter is increased by 1 at each call. There is no provision for resetting the counter.
The symbols generated by gensym are likely to be unique,
since their names begin with a space and it is only otherwise possible
to generate such symbols if a programmer goes out of their way to do
so. Uniqueness can be guaranteed by instead using uninterned symbols
(see Symbol Uninterned), though they can't be usefully written out
and read back in.
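A short sketch of gensym in use follows. The exact names it produces depend on the counter, so the example only checks properties that do not depend on them; the prefix "tmp" is an arbitrary choice.

```scheme
(define g1 (gensym))
(define g2 (gensym))
(define t1 (gensym "tmp"))

(symbol? g1)                                ; ⇒ #t
(eq? g1 g2)                                 ; ⇒ #f, each call gives a new name
(string-prefix? "tmp" (symbol->string t1))  ; ⇒ #t

;; Symbols made by gensym are ordinary interned symbols.
(symbol-interned? g1)                       ; ⇒ #t
```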
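In traditional Lisp dialects, symbols are often understood as having three kinds of value at once: a variable value, which is used when the symbol appears in code in a variable reference context; a function value, which is used when the symbol appears in code in a function name position (that is, as the first element in an unquoted list); and a property list value, which is used when the symbol is given as the first argument to Lisp's put or get functions.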
Although Scheme (as one of its simplifications with respect to Lisp) does away with the distinction between variable and function namespaces, Guile currently retains some elements of the traditional structure in case they turn out to be useful when implementing translators for other languages, in particular Emacs Lisp.
Specifically, Guile symbols have two extra slots, one for a symbol's property list, and one for its “function value.” The following procedures are provided to access these slots.
Return the contents of symbol's function slot.
Set the contents of symbol's function slot to value.
Return the property list currently associated with symbol.
Set symbol's property list to value.
From sym's property list, return the value for property prop. The assumption is that sym's property list is an association list whose keys are distinguished from each other using equal?; prop should be one of the keys in that list. If the property list has no entry for prop, symbol-property returns #f.
In sym's property list, set the value for property prop to val, or add a new entry for prop, with value val, if none already exists. For the structure of the property list, see symbol-property.
From sym's property list, remove the entry for property prop, if there is one. For the structure of the property list, see symbol-property.
Support for these extra slots may be removed in a future release, and it is probably better to avoid using them. For a more modern and Schemely approach to properties, see Object Properties.
The read syntax for a symbol is a sequence of letters, digits, and
extended alphabetic characters, beginning with a character that
cannot begin a number. In addition, the special cases of +,
-, and ... are read as symbols even though numbers can
begin with +, - or ..
Extended alphabetic characters may be used within identifiers as if they were letters. The set of extended alphabetic characters is:
! $ % & * + - . / : < = > ? @ ^ _ ~
In addition to the standard read syntax defined above (which is taken from R5RS (see Formal syntax)), Guile provides an extended symbol read syntax that allows the inclusion of unusual characters such as space characters, newlines and parentheses. If (for whatever reason) you need to write a symbol containing characters not mentioned above, you can do so as follows.
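Begin the symbol with the characters #{, write the characters of the symbol, and finish the symbol with the characters }#.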
Here are a few examples of this form of read syntax. The first symbol needs to use extended syntax because it contains a space character, the second because it contains a line break, and the last because it looks like a number.
#{foo bar}#
#{what
ever}#
#{4242}#
Although Guile provides this extended read syntax for symbols, widespread usage of it is discouraged because it is not portable and not very readable.
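To illustrate, a symbol written in the extended syntax is the same object as one created from the corresponding string. This is a small sketch using only procedures described in this chapter.

```scheme
;; The name of the symbol is exactly the text between #{ and }#.
(symbol->string '#{foo bar}#)                  ; ⇒ "foo bar"

;; Reading the extended syntax and calling string->symbol agree.
(eq? '#{foo bar}# (string->symbol "foo bar"))  ; ⇒ #t

;; A name that looks like a number still gives a symbol.
(symbol? '#{4242}#)                            ; ⇒ #t
(number? 4242)                                 ; ⇒ #t

;; The extended syntax may also be used for ordinary names.
(eq? 'plain '#{plain}#)                        ; ⇒ #t
```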
What makes symbols useful is that they are automatically kept unique. There are no two symbols that are distinct objects but have the same name. But of course, there is no rule without exception. In addition to the normal symbols that have been discussed up to now, you can also create special uninterned symbols that behave slightly differently.
To understand what is different about them and why they might be useful, we look at how normal symbols are actually kept unique.
Whenever Guile wants to find the symbol with a specific name, for
example during read or when executing string->symbol, it
first looks into a table of all existing symbols to find out whether a
symbol with the given name already exists. When this is the case, Guile
just returns that symbol. When not, a new symbol with the name is
created and entered into the table so that it can be found later.
Sometimes you might want to create a symbol that is guaranteed `fresh', i.e. a symbol that did not exist previously. You might also want to somehow guarantee that no one else will ever unintentionally stumble across your symbol in the future. These properties of a symbol are often needed when generating code during macro expansion. When introducing new temporary variables, you want to guarantee that they don't conflict with variables in other people's code.
The simplest way to arrange for this is to create a new symbol but not enter it into the global table of all symbols. That way, no one will ever get access to your symbol by chance. Symbols that are not in the table are called uninterned. Of course, symbols that are in the table are called interned.
You create new uninterned symbols with the function make-symbol.
You can test whether a symbol is interned or not with
symbol-interned?.
Uninterned symbols break the rule that the name of a symbol uniquely
identifies the symbol object. Because of this, they cannot be written
out and read back in like interned symbols. Currently, Guile has no
support for reading uninterned symbols. Note that the function
gensym does not return uninterned symbols for this reason.
Return a new uninterned symbol with the name name. The returned symbol is guaranteed to be unique and future calls to string->symbol will not return it.
Return #t if symbol is interned, otherwise return #f.
For example:
(define foo-1 (string->symbol "foo"))
(define foo-2 (string->symbol "foo"))
(define foo-3 (make-symbol "foo"))
(define foo-4 (make-symbol "foo"))
(eq? foo-1 foo-2)
⇒ #t
; Two interned symbols with the same name are the same object,
(eq? foo-1 foo-3)
⇒ #f
; but a call to make-symbol with the same name returns a
; distinct object.
(eq? foo-3 foo-4)
⇒ #f
; A call to make-symbol always returns a new object, even for
; the same name.
foo-3
⇒ #<uninterned-symbol foo 8085290>
; Uninterned symbols print differently from interned symbols,
(symbol? foo-3)
⇒ #t
; but they are still symbols,
(symbol-interned? foo-3)
⇒ #f
; just not interned.
Keywords are self-evaluating objects with a convenient read syntax that makes them easy to type.
Guile's keyword support conforms to R5RS, and adds a (switchable) read
syntax extension to permit keywords to begin with : as well as
#:, or to end with :.
Keywords are useful in contexts where a program or procedure wants to be able to accept a large number of optional arguments without making its interface unmanageable.
To illustrate this, consider a hypothetical make-window
procedure, which creates a new window on the screen for drawing into
using some graphical toolkit. There are many parameters that the caller
might like to specify, but which could also be sensibly defaulted, for
example the color depth, background color, width and height.
If make-window did not use keywords, the caller would have to
pass in a value for each possible argument, remembering the correct
argument order and using a special value to indicate the default value
for that argument:
(make-window 'default ;; Color depth
'default ;; Background color
800 ;; Width
100 ;; Height
...) ;; More make-window arguments
With keywords, on the other hand, defaulted arguments are omitted, and non-default arguments are clearly tagged by the appropriate keyword. As a result, the invocation becomes much clearer:
(make-window #:width 800 #:height 100)
On the other hand, for a simpler procedure with few arguments, the use
of keywords would be a hindrance rather than a help. The primitive
procedure cons, for example, would not be improved if it had to
be invoked as
(cons #:car x #:cdr y)
So the decision whether to use keywords or not is purely pragmatic: use them if they will clarify the procedure invocation at point of call.
If a procedure wants to support keywords, it should take a rest argument and then use whatever means is convenient to extract keywords and their corresponding arguments from the contents of that rest argument.
The following example illustrates the principle: the code for
make-window uses a helper procedure called
get-keyword-value to extract individual keyword arguments from
the rest argument.
(define (get-keyword-value args keyword default)
(let ((kv (memq keyword args)))
(if (and kv (>= (length kv) 2))
(cadr kv)
default)))
(define (make-window . args)
(let ((depth (get-keyword-value args #:depth screen-depth))
(bg (get-keyword-value args #:bg "white"))
(width (get-keyword-value args #:width 800))
(height (get-keyword-value args #:height 100))
...)
...))
But you don't need to write get-keyword-value. The (ice-9
optargs) module provides a set of powerful macros that you can use to
implement keyword-supporting procedures like this:
(use-modules (ice-9 optargs))
(define (make-window . args)
(let-keywords args #f ((depth screen-depth)
(bg "white")
(width 800)
(height 100))
...))
Or, even more economically, like this:
(use-modules (ice-9 optargs))
(define* (make-window #:key (depth screen-depth)
(bg "white")
(width 800)
(height 100))
...)
For further details on let-keywords, define* and other
facilities provided by the (ice-9 optargs) module, see
Optional Arguments.
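The define* form above elides the procedure body. A self-contained variant, given here only as a sketch, returns its arguments as a list so that the defaulting can be observed; the value 24 for screen-depth and the list-returning body are assumptions made for the sake of the example.

```scheme
(use-modules (ice-9 optargs))

;; Hypothetical default colour depth, standing in for a real toolkit query.
(define screen-depth 24)

;; Instead of creating a window, return the effective parameters.
(define* (make-window #:key (depth screen-depth)
                            (bg "white")
                            (width 800)
                            (height 100))
  (list depth bg width height))

(make-window)                           ; ⇒ (24 "white" 800 100)
(make-window #:width 640)               ; ⇒ (24 "white" 640 100)
(make-window #:height 50 #:bg "black")  ; ⇒ (24 "black" 800 50)
```

Keyword arguments may be given in any order, and each omitted one takes its default.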
Guile, by default, only recognizes a keyword syntax that is compatible
with R5RS. A token of the form #:NAME, where NAME has the
same syntax as a Scheme symbol (see Symbol Read Syntax), is the
external representation of the keyword named NAME. Keyword
objects print using this syntax as well, so values containing keyword
objects can be read back into Guile. When used in an expression,
keywords are self-quoting objects.
If the keyword read option is set to 'prefix, Guile also
recognizes the alternative read syntax :NAME. Otherwise, tokens
of the form :NAME are read as symbols, as required by R5RS.
If the keyword read option is set to 'postfix, Guile
recognizes the SRFI-88 read syntax NAME: (see SRFI-88).
Otherwise, tokens of this form are read as symbols.
To enable and disable the alternative non-R5RS keyword syntax, you use
the read-set! procedure documented in Scheme Read. Note that
the prefix and postfix syntax are mutually exclusive.
(read-set! keywords 'prefix)
#:type
⇒
#:type
:type
⇒
#:type
(read-set! keywords 'postfix)
type:
⇒
#:type
:type
⇒
:type
(read-set! keywords #f)
#:type
⇒
#:type
:type
-|
ERROR: In expression :type:
ERROR: Unbound variable: :type
ABORT: (unbound-variable)
Return #t if the argument obj is a keyword, else #f.
Return the symbol with the same name as keyword.
Return the keyword with the same name as symbol.
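The three Scheme procedures above can be tried as follows. This sketch uses only the default #: read syntax.

```scheme
(keyword? #:type)                     ; ⇒ #t
(keyword? 'type)                      ; ⇒ #f, a symbol is not a keyword

(keyword->symbol #:type)              ; ⇒ type
(symbol->keyword 'type)               ; ⇒ #:type

;; Converting back and forth gives the same keyword object.
(eq? #:type (symbol->keyword (keyword->symbol #:type)))  ; ⇒ #t
```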
Equivalent to scm_symbol_to_keyword (scm_from_locale_symbol (name)) and scm_symbol_to_keyword (scm_from_locale_symboln (name, len)), respectively.
Note that these functions should not be used when name is a C string constant, because there is no guarantee that the current locale will match that of the source code. In such cases, use scm_from_latin1_keyword or scm_from_utf8_keyword.
Equivalent to scm_symbol_to_keyword (scm_from_latin1_symbol (name)) and scm_symbol_to_keyword (scm_from_utf8_symbol (name)), respectively.
Procedures and macros are documented in their own sections: see Procedures and Macros.
Variable objects are documented as part of the description of Guile's module system: see Variables.
Asyncs, dynamic roots and fluids are described in the section on scheduling: see Scheduling.
Hooks are documented in the section on general utility functions: see Hooks.
Ports are described in the section on I/O: see Input and Output.
Regular expressions are described in their own section: see Regular Expressions.
This chapter describes Guile's compound data types. By compound we mean that the primary purpose of these data types is to act as containers for other kinds of data (including other compound objects). For instance, a (non-uniform) vector with length 5 is a container that can hold five arbitrary Scheme objects.
The various kinds of container object differ from each other in how their memory is allocated, how they are indexed, and how particular values can be looked up within them.
Pairs are used to combine two Scheme objects into one compound object. Hence the name: A pair stores a pair of objects.
The data type pair is extremely important in Scheme, just like in any other Lisp dialect. The reason is that pairs are not only used to make two values available as one object, but that pairs are used for constructing lists of values. Because lists are so important in Scheme, they are described in a section of their own (see Lists).
Pairs can be entered literally in source code or at the REPL, in the
so-called dotted list syntax. This syntax consists of an opening
parenthesis, the first element of the pair, a dot, the second element
and a closing parenthesis. The following example shows how a pair
consisting of the two numbers 1 and 2, and a pair containing the symbols
foo and bar can be entered. It is very important to write
the whitespace before and after the dot, because otherwise the Scheme
parser would not be able to figure out where to split the tokens.
(1 . 2)
(foo . bar)
But beware, if you want to try out these examples, you have to quote the expressions. More information about quotation is available in the section Expression Syntax. The correct way to try these examples is as follows.
'(1 . 2)
⇒
(1 . 2)
'(foo . bar)
⇒
(foo . bar)
A new pair is made by calling the procedure cons with two
arguments. Then the argument values are stored into a newly allocated
pair, and the pair is returned. The name cons stands for
"construct". Use the procedure pair? to test whether a
given Scheme object is a pair or not.
Return a newly allocated pair whose car is x and whose cdr is y. The pair is guaranteed to be different (in the sense of eq?) from every previously existing object.
Return #t if x is a pair; otherwise return #f.
The two parts of a pair are traditionally called car and
cdr. They can be retrieved with procedures of the same name
(car and cdr), and can be modified with the procedures
set-car! and set-cdr!. Since a very common operation in
Scheme programs is to access the car of a car of a pair, or the car of
the cdr of a pair, etc., the procedures called caar,
cadr and so on are also predefined.
Return the car or the cdr of pair, respectively.
These two macros are the fastest way to access the car or cdr of a pair; they can be thought of as compiling into a single memory reference.
These macros do no checking at all. The argument pair must be a valid pair.
These procedures are compositions of car and cdr, where for example caddr could be defined by
(define caddr (lambda (x) (car (cdr (cdr x)))))
cadr, caddr and cadddr pick out the second, third or fourth elements of a list, respectively. SRFI-1 provides the same under the names second, third and fourth (see SRFI-1 Selectors).
Stores value in the car field of pair. The value returned by set-car! is unspecified.
Stores value in the cdr field of pair. The value returned by set-cdr! is unspecified.
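The pair accessors and mutators can be tried on freshly allocated pairs, as in this small sketch. The pairs are built with cons and list rather than written as quoted constants, since the example modifies them.

```scheme
(define p (cons 1 2))

(pair? p)          ; ⇒ #t
(car p)            ; ⇒ 1
(cdr p)            ; ⇒ 2

;; Mutate both fields of the pair in place.
(set-car! p 'one)
(set-cdr! p 'two)
p                  ; ⇒ (one . two)

;; Compositions of car and cdr on a list.
(define lst (list 'a 'b 'c 'd))
(cadr lst)                    ; ⇒ b
(caddr lst)                   ; ⇒ c
(cddr lst)                    ; ⇒ (c d)
(caar (list (list 1 2) 3))    ; ⇒ 1
```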
A very important data type in Scheme, as well as in all other Lisp dialects, is the data type list.
This is the short definition of what a list is: either the empty list (), or a pair which has a list in its cdr.
The syntax for lists is an opening parenthesis, then all the elements of the list (separated by whitespace) and finally a closing parenthesis.
(1 2 3)            ; a list of the numbers 1, 2 and 3
("foo" bar 3.1415) ; a string, a symbol and a real number
()                 ; the empty list
The last example needs a bit more explanation. A list with no elements, called the empty list, is special in some ways. It is used for terminating lists by storing it into the cdr of the last pair that makes up a list. An example will clear that up:
(car '(1))
⇒
1
(cdr '(1))
⇒
()
This example also shows that lists have to be quoted when written (see Expression Syntax), because they would otherwise be mistakenly taken as procedure applications (see Simple Invocation).
Often it is useful to test whether a given Scheme object is a list or not. List-processing procedures could use this information to test whether their input is valid, or they could do different things depending on the datatype of their arguments.
The predicate null? is often used in list-processing code to
tell whether a given list has run out of elements. That is, a loop
somehow deals with the elements of a list until the list satisfies
null?. Then, the algorithm terminates.
Return #t iff x is the empty list, else #f.
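The looping pattern described above can be sketched with a named let. The procedure name sum-list is hypothetical and not part of Guile.

```scheme
;; Add up the elements of a list, stopping when the remaining
;; list satisfies null?.
(define (sum-list lst)
  (let loop ((rest lst) (total 0))
    (if (null? rest)
        total
        (loop (cdr rest) (+ total (car rest))))))

(sum-list '(1 2 3 4))   ; ⇒ 10
(sum-list '())          ; ⇒ 0

;; list? accepts only proper lists; a dotted pair is not one.
(list? '(1 2))          ; ⇒ #t
(list? '(1 . 2))        ; ⇒ #f
(null? '())             ; ⇒ #t
```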
This section describes the procedures for constructing new lists.
list simply returns a list where the elements are the arguments,
cons* is similar, but the last argument is stored in the cdr of
the last pair of the list.
SCM_UNDEFINED)Return a new list containing elements elem1 to elemN.
scm_list_ntakes a variable number of arguments, terminated by the specialSCM_UNDEFINED. That finalSCM_UNDEFINEDis not included in the list. None of elem1 to elemN can themselves beSCM_UNDEFINED, orscm_list_nwill terminate at that point.
Like
list, but the last arg provides the tail of the constructed list, returning(consarg1(consarg2(cons ...argn))). Requires at least one argument. If given one argument, that argument is returned as result. This function is calledlist*in some other Schemes and in Common LISP.
Return a (newly-created) copy of lst.
Create a list containing n elements, where each element is initialized to init. init defaults to the empty list () if not given.
Note that list-copy only makes a copy of the pairs which make up
the spine of the lists. The list elements are not copied, which means
that modifying the elements of the new list also modifies the elements
of the old list. On the other hand, applying procedures like
set-cdr! or delv! to the new list will not alter the old
list. If you also need to copy the list elements (making a deep copy),
use the procedure copy-tree (see Copying).
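The note above can be illustrated with a small sketch. The variable names original and copy are chosen for this example only.

```scheme
(define original (list (list 1 2) (list 3 4)))
(define copy (list-copy original))

;; The elements are shared, so changing one through the copy
;; is visible through the original as well.
(set-car! (car copy) 'x)
original              ; ⇒ ((x 2) (3 4))

;; The spine is not shared, so truncating the copy
;; leaves the original at its full length.
(set-cdr! copy '())
copy                  ; ⇒ ((x 2))
original              ; ⇒ ((x 2) (3 4))
```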
These procedures are used to get some information about a list, or to retrieve one or more elements of a list.
Return the number of elements in list lst.
Return the last pair in lst, signalling an error if lst is circular.
Return the kth element from list.
Return the "tail" of lst beginning with its kth element. The first element of the list is considered to be element 0.
list-tail and list-cdr-ref are identical. It may help to think of list-cdr-ref as accessing the kth cdr of the list, or returning the results of cdring k times down lst.
Copy the first k elements from lst into a new list, and return it.
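A brief sketch of the selectors described above, assuming they are named length, list-ref, list-tail, list-head and last-pair as in Guile's core. The variable name letters is illustrative only.

```scheme
(define letters (list 'a 'b 'c 'd 'e))

(length letters)        ; ⇒ 5
(list-ref letters 2)    ; ⇒ c, indices start at 0
(list-tail letters 2)   ; ⇒ (c d e)
(list-head letters 2)   ; ⇒ (a b), a newly allocated list
(last-pair letters)     ; ⇒ (e)

;; list-tail does not copy: it returns the pair reached by
;; taking the cdr k times.
(eq? (list-tail letters 2) (cddr letters))   ; ⇒ #t
```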
append and append! are used to concatenate two or more
lists in order to form a new list. reverse and reverse!
return lists with the same elements as their arguments, but in reverse
order. The procedure variants with an ! directly modify the
pairs which form the list, whereas the other procedures create new
pairs. This is why you should be careful when using the side-effecting
variants.
Return a list comprising all the elements of lists lst1 to lstN.
(append '(x) '(y)) ⇒ (x y)
(append '(a) '(b c d)) ⇒ (a b c d)
(append '(a (b)) '((c))) ⇒ (a (b) (c))
The last argument lstN may actually be any object; an improper list results if the last argument is not a proper list.
(append '(a b) '(c . d)) ⇒ (a b c . d)
(append '() 'a) ⇒ a
append doesn't modify the given lists, but the return may share structure with the final lstN. append! modifies the given lists to form its return.
For scm_append and scm_append_x, lstlst is a list of the list operands lst1 ... lstN. That lstlst itself is not modified or used in the return.
Return a list comprising the elements of lst, in reverse order.
reverse constructs a new list, reverse! modifies lst in constructing its return.
For reverse!, the optional newtail is appended to the result. newtail isn't reversed, it simply becomes the list tail. For scm_reverse_x, the newtail parameter is mandatory, but can be SCM_EOL if no further tail is required.
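The following sketch contrasts the copying and the side-effecting variants. The lists are built with list rather than written as quoted constants, since the ! variants modify their arguments. Variable names are illustrative only.

```scheme
(define front (list 'a 'b))
(define joined (append front (list 'c 'd)))
joined                 ; ⇒ (a b c d)
front                  ; ⇒ (a b), append left its argument alone

(define numbers (list 1 2 3))
(reverse numbers)      ; ⇒ (3 2 1)
numbers                ; ⇒ (1 2 3), reverse built new pairs

;; reverse! reuses the pairs of its argument; the optional newtail
;; is attached unreversed at the end of the result.
(define reversed (reverse! (list 1 2 3) (list 'x 'y)))
reversed               ; ⇒ (3 2 1 x y)
```

After a call to reverse! or append!, only the returned value should be used; the variables that held the original lists may now refer to some interior pair.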
The following procedures modify an existing list, either by changing elements of the list, or by changing the list structure itself.
Set the kth element of list to val.
Set the kth cdr of list to val.
Return a newly-created copy of lst with elements eq? to item removed. This procedure mirrors memq: delq compares elements of lst against item with eq?.
Return a newly-created copy of lst with elements eqv? to item removed. This procedure mirrors memv: delv compares elements of lst against item with eqv?.
Return a newly-created copy of lst with elements equal? to item removed. This procedure mirrors member: delete compares elements of lst against item with equal?.
See also SRFI-1 which has an extended delete (SRFI-1 Deleting), and also an lset-difference which can delete multiple items in one call (SRFI-1 Set Operations).
These procedures are destructive versions of delq, delv and delete: they modify the pointers in the existing lst rather than creating a new list. Caveat evaluator: Like other destructive list functions, these functions cannot modify the binding of lst, and so cannot be used to delete the first element of lst destructively.
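The caveat above is worth a concrete sketch. Deleting an interior element is visible through the original variable, but deleting the first element is not, so the usual idiom is to store the return value back with set!. Variable names are illustrative only.

```scheme
;; Deleting an interior element changes the existing pairs.
(define inner (list 'a 'b 'c))
(delq! 'b inner)
inner                          ; ⇒ (a c)

;; Deleting the first element cannot change what the variable refers to.
(define first (list 'a 'b 'c))
(define result (delq! 'a first))
result                         ; ⇒ (b c)
first                          ; ⇒ (a b c), still the old first pair

;; Storing the return value handles both cases.
(set! first (delq! 'a first))
first                          ; ⇒ (b c)
```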
Like delq!, but only deletes the first occurrence of item from lst. Tests for equality using eq?. See also delv1! and delete1!.
Like delv!, but only deletes the first occurrence of item from lst. Tests for equality using eqv?. See also delq1! and delete1!.
Like delete!, but only deletes the first occurrence of item from lst. Tests for equality using equal?. See also delq1! and delv1!.
Return a list containing all elements from lst which satisfy the predicate pred. The elements in the result list have the same order as in lst. The order in which pred is applied to the list elements is not specified.
filter does not change lst, but the result may share a tail with it. filter! may modify lst to construct its return.
The following procedures search lists for particular elements. They use
different comparison predicates for comparing list elements with the
object to be searched. When they fail, they return #f, otherwise
they return the sublist whose car is equal to the search object, where
equality depends on the equality predicate used.
Return the first sublist of lst whose car is eq? to x where the sublists of lst are the non-empty lists returned by (list-tail lst k) for k less than the length of lst. If x does not occur in lst, then #f (not the empty list) is returned.
Return the first sublist of lst whose car is eqv? to x where the sublists of lst are the non-empty lists returned by (list-tail lst k) for k less than the length of lst. If x does not occur in lst, then #f (not the empty list) is returned.
Return the first sublist of lst whose car is equal? to x where the sublists of lst are the non-empty lists returned by (list-tail lst k) for k less than the length of lst. If x does not occur in lst, then #f (not the empty list) is returned.
See also SRFI-1 which has an extended member function (SRFI-1 Searching).
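A short sketch of the three searching procedures, assuming they are named memq, memv and member as the text above indicates. Because a successful search returns a sublist rather than #t, the result can be used both as a truth value and as the remainder of the list.

```scheme
(memq 'c '(a b c d))                 ; ⇒ (c d)
(memq 'z '(a b c d))                 ; ⇒ #f
(memv 2.0 '(1.0 2.0 3.0))            ; ⇒ (2.0 3.0)
(member "b" '("a" "b" "c"))          ; ⇒ ("b" "c")

;; A freshly built list is equal? to the stored element,
;; but it is not the same object, so memq does not find it.
(member (list 1) '((0) (1) (2)))     ; ⇒ ((1) (2))
(memq   (list 1) '((0) (1) (2)))     ; ⇒ #f

(define status (if (memq 'c '(a b c)) 'found 'missing))
status                               ; ⇒ found
```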
List processing is very convenient in Scheme because the process of iterating over the elements of a list can be highly abstracted. The procedures in this section are the most basic iterating procedures for lists. They take a procedure and one or more lists as arguments, and apply the procedure to each element of the list. They differ in their return value.
Apply proc to each element of the list arg1 (if only two arguments are given), or to the corresponding elements of the argument lists (if more than two arguments are given). The result(s) of the procedure applications are saved and returned in a list. For map, the order of procedure applications is not specified; map-in-order applies the procedure from left to right to the list elements.
Like map, but the procedure is always applied from left to right, and the result(s) of the procedure applications are thrown away. The return value is not specified.
See also SRFI-1 which extends these functions to take lists of unequal lengths (SRFI-1 Fold and Map).
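The following sketch shows the two styles of iteration, assuming the second procedure described above is for-each. map collects results, while for-each is called only for its side effects. The variable total is illustrative only.

```scheme
;; One list: proc receives one element at a time.
(map (lambda (x) (* x x)) '(1 2 3))     ; ⇒ (1 4 9)

;; Several lists of equal length: proc receives corresponding elements.
(map + '(1 2 3) '(10 20 30))            ; ⇒ (11 22 33)

;; for-each discards the results, so it is used for side effects.
(define total 0)
(for-each (lambda (x) (set! total (+ total x)))
          '(1 2 3))
total                                   ; ⇒ 6
```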
Vectors are sequences of Scheme objects. Unlike lists, the length of a vector, once the vector is created, cannot be changed. The advantage of vectors over lists is that the time required to access one element of a vector given its position (synonymous with index), a zero-origin number, is constant, whereas lists have an access time linear in the position of the accessed element in the list.
Vectors can contain any kind of Scheme object; it is even possible to have different types of objects in the same vector. For vectors containing vectors, you may wish to use arrays, instead. Note, too, that vectors are the special case of one dimensional non-uniform arrays and that most array procedures operate happily on vectors (see Arrays).
Vectors can literally be entered in source code, just like strings,
characters or some of the other data types. The read syntax for vectors
is as follows: A sharp sign (#), followed by an opening
parenthesis, all elements of the vector in their respective read syntax,
and finally a closing parenthesis. The following are examples of the
read syntax for vectors, where the first vector contains only numbers
and the second contains three different object types: a string, a symbol and a
number in hexadecimal notation.
#(1 2 3)
#("Hello" foo #xdeadbeef)
Like lists, vectors have to be quoted:
'#(a b c) ⇒ #(a b c)
Instead of creating a vector implicitly by using the read syntax just
described, you can create a vector dynamically by calling one of the
vector and list->vector primitives with the list of Scheme
values that you want to place into a vector. The size of the vector
thus created is determined implicitly by the number of arguments given.
Return a newly allocated vector composed of the given arguments. Analogous to list.
(vector 'a 'b 'c) ⇒ #(a b c)
The inverse operation is vector->list:
Return a newly allocated list composed of the elements of v.
(vector->list '#(dah dah didah)) ⇒ (dah dah didah)
(list->vector '(dididit dah)) ⇒ #(dididit dah)
To allocate a vector with an explicitly specified size, use
make-vector. With this primitive you can also specify an initial
value for the vector elements (the same value for all elements, that
is):
Return a newly allocated vector of len elements. If a second argument is given, then each position is initialized to fill. Otherwise the initial contents of each position is unspecified.
Like scm_make_vector, but the length is given as a size_t.
To check whether an arbitrary Scheme value is a vector, use the
vector? primitive:
Return #t if obj is a vector, otherwise return #f.
Return non-zero when obj is a vector, otherwise return
zero.
vector-length and vector-ref return information about a
given vector, respectively its size and the elements that are contained
in the vector.
Return the number of elements in vector as an exact integer.
Return the number of elements in vector as a
size_t.
Return the contents of position k of vector. k must be a valid index of vector.
(vector-ref '#(1 1 2 3 5 8 13 21) 5) ⇒ 8
(vector-ref '#(1 1 2 3 5 8 13 21)
    (let ((i (round (* 2 (acos -1)))))
      (if (inexact? i)
          (inexact->exact i)
          i))) ⇒ 13
Return the contents of position k (a
size_t) of vector.
A vector created by one of the dynamic vector constructor procedures (see Vector Creation) can be modified using the following procedures.
NOTE: According to R5RS, it is an error to use any of these procedures on a literally read vector, because such vectors should be considered as constants. Currently, however, Guile does not detect this error.
Store obj in position k of vector. k must be a valid index of vector. The value returned by ‘vector-set!’ is unspecified.
(let ((vec (vector 0 '(2 2 2 2) "Anna")))
  (vector-set! vec 1 '("Sue" "Sue"))
  vec) ⇒ #(0 ("Sue" "Sue") "Anna")
Store obj in position k (a
size_t) of v.
Store fill in every position of vector. The value returned by vector-fill! is unspecified.
Copy elements from vec1, positions start1 to end1, to vec2 starting at position start2. start1 and start2 are inclusive indices; end1 is exclusive.
vector-move-left! copies elements in leftmost order. Therefore, in the case where vec1 and vec2 refer to the same vector, vector-move-left! is usually appropriate when start1 is greater than start2.
Copy elements from vec1, positions start1 to end1, to vec2 starting at position start2. start1 and start2 are inclusive indices; end1 is exclusive.
vector-move-right! copies elements in rightmost order. Therefore, in the case where vec1 and vec2 refer to the same vector, vector-move-right! is usually appropriate when start1 is less than start2.
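The choice between the two procedures matters only when the source and destination ranges overlap within one vector. The sketch below assumes the argument order vec1 start1 end1 vec2 start2 given above; the variable names are illustrative only.

```scheme
;; Shift elements towards the front: start1 (1) is greater than start2 (0),
;; so copying in leftmost order never overwrites an element before it is read.
(define v (vector 'a 'b 'c 'd 'e))
(vector-move-left! v 1 4 v 0)
v                               ; ⇒ #(b c d d e)

;; Shift elements towards the back: start1 (0) is less than start2 (1),
;; so copying in rightmost order is the safe direction.
(define w (vector 'a 'b 'c 'd 'e))
(vector-move-right! w 0 3 w 1)
w                               ; ⇒ #(a a b c e)
```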
A vector can be read and modified from C with the functions
scm_c_vector_ref and scm_c_vector_set_x, for example. In
addition to these functions, there are two more ways to access vectors
from C that might be more efficient in certain situations: you can
restrict yourself to simple vectors and then use the very fast
simple vector macros; or you can use the very general framework
for accessing all kinds of arrays (see Accessing Arrays from C),
which is more verbose, but can deal efficiently with all kinds of
vectors (and arrays). For vectors, you can use the
scm_vector_elements and scm_vector_writable_elements
functions as shortcuts.
Return non-zero if obj is a simple vector, else return zero. A simple vector is a vector that can be used with the SCM_SIMPLE_* macros below.
The following functions are guaranteed to return simple vectors: scm_make_vector, scm_c_make_vector, scm_vector, scm_list_to_vector.
Evaluates to the length of the simple vector vec. No type checking is done.
Evaluates to the element at position idx in the simple vector vec. No type or range checking is done.
Sets the element at position idx in the simple vector vec to val. No type or range checking is done.
Acquire a handle for the vector vec and return a pointer to the elements of it. This pointer can only be used to read the elements of vec. When vec is not a vector, an error is signaled. The handle must eventually be released with scm_array_handle_release.
The variables pointed to by lenp and incp are filled with the number of elements of the vector and the increment (number of elements) between successive elements, respectively. Successive elements of vec need not be contiguous in their underlying “root vector” returned here; hence the increment is not necessarily equal to 1 and may well be negative too (see Shared Arrays).
The following example shows the typical way to use this function. It creates a list of all elements of vec (in reverse order).
scm_t_array_handle handle;
size_t i, len;
ssize_t inc;
const SCM *elt;
SCM list;

elt = scm_vector_elements (vec, &handle, &len, &inc);
list = SCM_EOL;
for (i = 0; i < len; i++, elt += inc)
  list = scm_cons (*elt, list);
scm_array_handle_release (&handle);
Like scm_vector_elements but the pointer can be used to modify the vector.
The following example shows the typical way to use this function. It fills a vector with #t.
scm_t_array_handle handle;
size_t i, len;
ssize_t inc;
SCM *elt;

elt = scm_vector_writable_elements (vec, &handle, &len, &inc);
for (i = 0; i < len; i++, elt += inc)
  *elt = SCM_BOOL_T;
scm_array_handle_release (&handle);
A uniform numeric vector is a vector whose elements are all of a single numeric type. Guile offers uniform numeric vectors for signed and unsigned 8-bit, 16-bit, 32-bit, and 64-bit integers, two sizes of floating point values, and complex floating-point numbers of these two sizes. See SRFI-4, for more information.
For many purposes, bytevectors work just as well as uniform vectors, and have the advantage that they integrate well with binary input and output. See Bytevectors, for more information on bytevectors.
Bit vectors are zero-origin, one-dimensional arrays of booleans. They
are displayed as a sequence of 0s and 1s prefixed by
#*, e.g.,
(make-bitvector 8 #f) ⇒
#*00000000
Bit vectors are also generalized vectors (see Generalized Vectors), and can thus be used with the array procedures (see Arrays). Bit vectors are the special case of one dimensional bit arrays.
Return #t when obj is a bitvector, else return #f.
Create a new bitvector of length len and optionally initialize all elements to fill.
Like scm_make_bitvector, but the length is given as a size_t.
Create a new bitvector with the arguments as elements.
Return the length of the bitvector vec.
Like scm_bitvector_length, but the length is returned as a size_t.
Return the element at index idx of the bitvector vec.
Return the element at index idx of the bitvector vec.
Set the element at index idx of the bitvector vec when val is true, else clear it.
Set the element at index idx of the bitvector vec when val is true, else clear it.
Set all elements of the bitvector vec when val is true, else clear them.
Return a new bitvector initialized with the elements of list.
Return a new list initialized with the elements of the bitvector vec.
Return a count of how many entries in bitvector are equal to bool. For example,
(bit-count #f #*000111000) ⇒ 6
Return the index of the first occurrence of bool in bitvector, starting from start. If there is no bool entry between start and the end of bitvector, then return #f. For example,
(bit-position #t #*000101 0) ⇒ 3
(bit-position #f #*0001111 3) ⇒ #f
Modify bitvector by replacing each element with its negation.
Set entries of bitvector to bool, with uvec selecting the entries to change. The return value is unspecified.
If uvec is a bit vector, then those entries where it has #t are the ones in bitvector which are set to bool. uvec and bitvector must be the same length. When bool is #t it's like uvec is OR'ed into bitvector. Or when bool is #f it can be seen as an ANDNOT.
(define bv #*01000010)
(bit-set*! bv #*10010001 #t)
bv
⇒ #*11010011
If uvec is a uniform vector of unsigned long integers, then they're indexes into bitvector which are set to bool.
(define bv #*01000010)
(bit-set*! bv #u(5 2 7) #t)
bv
⇒ #*01100111
Return a count of how many entries in bitvector are equal to bool, with uvec selecting the entries to consider.
uvec is interpreted in the same way as for bit-set*! above. Namely, if uvec is a bit vector then entries which have #t there are considered in bitvector. Or if uvec is a uniform vector of unsigned long integers then it's the indexes in bitvector to consider.
For example,
(bit-count* #*01110111 #*11001101 #t) ⇒ 3
(bit-count* #*01110111 #u(7 0 4) #f) ⇒ 2
Like scm_vector_elements (see Vector Accessing from C), but for bitvectors. The variable pointed to by offp is set to the value returned by scm_array_handle_bit_elements_offset. See scm_array_handle_bit_elements for how to use the returned pointer and the offset.
Like scm_bitvector_elements, but the pointer is good for reading and writing.
Guile has a number of data types that are generally vector-like: strings, uniform numeric vectors, bytevectors, bitvectors, and of course ordinary vectors of arbitrary Scheme values. These types are disjoint: a Scheme value belongs to at most one of the five types listed above.
If you want to gloss over this distinction and want to treat all five types with common code, you can use the procedures in this section. They work with the generalized vector type, which is the union of the five vector-like types.
Return #t if obj is a vector, bytevector, string, bitvector, or uniform numeric vector.
Return the length of the generalized vector v.
Return the element at index idx of the generalized vector v.
Set the element at index idx of the generalized vector v to val.
Return a new list whose elements are the elements of the generalized vector v.
Return 1 if obj is a vector, string, bitvector, or uniform numeric vector; else return 0.
Return the length of the generalized vector v.
Return the element at index idx of the generalized vector v.
Set the element at index idx of the generalized vector v to val.
Like scm_array_get_handle but an error is signalled when v is not of rank one. You can use scm_array_handle_ref and scm_array_handle_set to read and write the elements of v, or you can use functions like scm_array_handle_<foo>_elements to deal with specific types of vectors.
Arrays are a collection of cells organized into an arbitrary number of dimensions. Each cell can be accessed in constant time by supplying an index for each dimension.
In the current implementation, an array uses a generalized vector for
the actual storage of its elements. Any kind of generalized vector
will do, so you can have arrays of uniform numeric values, arrays of
characters, arrays of bits, and of course, arrays of arbitrary Scheme
values. For example, arrays with an underlying c64vector might
be nice for digital signal processing, while arrays made from a
u8vector might be used to hold gray-scale images.
The number of dimensions of an array is called its rank. Thus, a matrix is an array of rank 2, while a vector has rank 1. When accessing an array element, you have to specify one exact integer for each dimension. These integers are called the indices of the element. An array specifies the allowed range of indices for each dimension via an inclusive lower and upper bound. These bounds can well be negative, but the upper bound must be greater than or equal to the lower bound minus one. When all lower bounds of an array are zero, it is called a zero-origin array.
Arrays can be of rank 0, which could be interpreted as a scalar. Thus, a zero-rank array can store exactly one object and the list of indices of this element is the empty list.
Arrays contain zero elements when one of their dimensions has a zero length. These empty arrays maintain information about their shape: a matrix with zero columns and 3 rows is different from a matrix with 3 columns and zero rows, which again is different from a vector of length zero.
Generalized vectors, such as strings, uniform numeric vectors, bytevectors, bit vectors and ordinary vectors, are the special case of one dimensional arrays.
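The notions of rank, bounds and zero-rank arrays can be explored interactively. The sketch below relies on array-rank, a Guile procedure that this section does not otherwise describe, and assumes that make-array called with no bounds yields a rank 0 array; the variable names are illustrative only.

```scheme
(define matrix (make-array 0 2 3))        ; 2 rows, 3 columns, zero-origin
(array-rank matrix)                       ; ⇒ 2
(array-rank (vector 1 2 3))               ; ⇒ 1, a vector is a rank 1 array

;; Bounds may be negative: indices -1, 0 and 1 are valid here.
(define centred (make-array 'x '(-1 1)))
(array-ref centred -1)                    ; ⇒ x

;; A rank 0 array holds one object and is indexed by no indices at all.
(define scalar (make-array 42))
(array-rank scalar)                       ; ⇒ 0
(array-ref scalar)                        ; ⇒ 42
```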
An array is displayed as # followed by its rank, followed by a
tag that describes the underlying vector, optionally followed by
information about its shape, and finally followed by the cells,
organized into dimensions using parentheses.
In more words, the array tag is of the form
#<rank><vectag><@lower><:len><@lower><:len>...
where <rank> is a positive integer in decimal giving the rank of
the array. It is omitted when the rank is 1 and the array is non-shared
and has zero-origin (see below). For shared arrays and for a non-zero
origin, the rank is always printed even when it is 1 to distinguish
them from ordinary vectors.
The <vectag> part is the tag for a uniform numeric vector, like
u8, s16, etc, b for bitvectors, or a for
strings. It is empty for ordinary vectors.
The <@lower> part is a ‘@’ character followed by a signed
integer in decimal giving the lower bound of a dimension. There is one
<@lower> for each dimension. When all lower bounds are zero,
all <@lower> parts are omitted.
The <:len> part is a ‘:’ character followed by an unsigned
integer in decimal giving the length of a dimension. Like for the lower
bounds, there is one <:len> for each dimension, and the
<:len> part always follows the <@lower> part for a
dimension. Lengths are only then printed when they can't be deduced
from the nested lists of elements of the array literal, which can happen
when at least one length is zero.
As a special case, an array of rank 0 is printed as
#0<vectag>(<scalar>), where <scalar> is the result of
printing the single element of the array.
Thus,
#(1 2 3)
#@2(1 2 3)
#2((1 2 3) (4 5 6))
#u32(0 1 2)
#2u32@2@3((1 2) (2 3))
#2()
#2:0:2()
#0(12)
In addition, bytevectors are also arrays, but use a different syntax (see Bytevectors):
#vu8(1 2 3)
When an array is created, the range of each dimension must be specified, e.g., to create a 2x3 array with a zero-based index:
(make-array 'ho 2 3) ⇒ #2((ho ho ho) (ho ho ho))
The range of each dimension can also be given explicitly, e.g., another way to create the same array:
(make-array 'ho '(0 1) '(0 2)) ⇒ #2((ho ho ho) (ho ho ho))
The following procedures can be used with arrays (or vectors). An argument shown as idx... means one parameter for each dimension in the array. An idxlist argument means a list of such values, one for each dimension.
Return #t if the obj is an array, and #f if not.
The second argument to scm_array_p is there for historical reasons, but it is not used. You should always pass SCM_UNDEFINED as its value.
Return #t if the obj is an array of type type, and #f if not.
Return 1 if the obj is an array of type type, and 0 if not.
Equivalent to (make-typed-array #t fill bound ...).
Create and return an array that has as many dimensions as there are bounds and (maybe) fill it with fill.
The underlying storage vector is created according to type, which must be a symbol whose name is the `vectag' of the array as explained above, or #t for ordinary, non-specialized arrays.
For example, using the symbol f64 for type will create an array that uses a f64vector for storing its elements, and a will use a string.
When fill is not the special unspecified value, the new array is filled with fill. Otherwise, the initial contents of the array is unspecified. The special unspecified value is stored in the variable *unspecified* so that for example (make-typed-array 'u32 *unspecified* 4) creates an uninitialized u32 vector of length 4.
Each bound may be a positive non-zero integer N, in which case the index for that dimension can range from 0 through N-1; or an explicit index range specifier in the form (LOWER UPPER), where both lower and upper are integers, possibly less than zero, and possibly the same number (however, lower cannot be greater than upper).
Return an array of the type indicated by type with elements the same as those of list.
The argument dimspec determines the number of dimensions of the array and their lower bounds. When dimspec is an exact integer, it gives the number of dimensions directly and all lower bounds are zero. When it is a list of exact integers, then each element is the lower index bound of a dimension, and there will be as many dimensions as elements in the list.
Return the type of array. This is the `vectag' used for printing array (or #t for ordinary arrays) and can be used with make-typed-array to create an array of the same kind as array.
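A sketch of building arrays from lists and querying their type, assuming the procedures described above are named list->typed-array and array-type. The two forms of dimspec are shown: an integer giving the rank, and a list giving one lower bound per dimension. Variable names are illustrative only.

```scheme
;; dimspec 2: a rank 2 array with zero lower bounds, stored as u8 values.
(define m (list->typed-array 'u8 2 '((1 2 3) (4 5 6))))
(array-type m)                  ; ⇒ u8
(array-ref m 1 2)               ; ⇒ 6
(array-dimensions m)            ; ⇒ (2 3)

;; dimspec (1): a rank 1 ordinary array whose first index is 1.
(define v (list->typed-array #t '(1) '(a b c)))
(array-ref v 1)                 ; ⇒ a
(array-shape v)                 ; ⇒ ((1 3))

;; array-type can be passed back to make-typed-array.
(define same-kind (make-typed-array (array-type m) 0 2 3))
(array-type same-kind)          ; ⇒ u8
```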
Return the element at (idx ...) in array.
(define a (make-array 999 '(1 2) '(3 4)))
(array-ref a 2 4) ⇒ 999
Return #t if the given index would be acceptable to array-ref.
(define a (make-array #f '(1 2) '(3 4)))
(array-in-bounds? a 2 3) ⇒ #t
(array-in-bounds? a 0 0) ⇒ #f
Set the element at (idx ...) in array to obj. The return value is unspecified.
(define a (make-array #f '(0 1) '(0 1)))
(array-set! a #t 1 1)
a ⇒ #2((#f #f) (#f #t))
Return a list of the bounds for each dimension of array.
array-shape gives (lower upper) for each dimension. array-dimensions instead returns just upper+1 for dimensions with a 0 lower bound. Both are suitable as input to make-array.
For example,
(define a (make-array 'foo '(-1 3) 5))
(array-shape a) ⇒ ((-1 3) (0 4))
(array-dimensions a) ⇒ ((-1 3) 5)
Return a list consisting of all the elements, in order, of array.
Copy every element from vector or array src to the corresponding element of dst. dst must have the same rank as src, and be at least as large in each dimension. The return value is unspecified.
Store fill in every element of array. The value returned is unspecified.
Return #t if all arguments are arrays with the same shape, the same type, and have corresponding elements which are either equal? or array-equal?. This function differs from equal? (see Equality) in that all arguments must be arrays.
Set each element of the dst array to values obtained from calls to proc. The value returned is unspecified.
Each call is (proc elem1 ... elemN), where each elem is from the corresponding src array, at the dst index. array-map-in-order! makes the calls in row-major order, array-map! makes them in an unspecified order.
The src arrays must have the same number of dimensions as dst, and must have a range for each dimension which covers the range in dst. This ensures all dst indices are valid in each src.
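As a minimal sketch, the following adds two matrices element by element, assuming the argument order dst proc src1 ... srcN for array-map!. The arrays are built with list->typed-array so that none of them is a literal constant; the variable names are illustrative only.

```scheme
(define a (list->typed-array #t 2 '((1 2) (3 4))))
(define b (list->typed-array #t 2 '((10 20) (30 40))))

;; dst must already exist; its previous contents are overwritten.
(define sum (make-array 0 2 2))
(array-map! sum + a b)

(array->list sum)     ; ⇒ ((11 22) (33 44))
```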
Apply proc to each tuple of elements of src1 ... srcN, in row-major order. The value returned is unspecified.
Set each element of the dst array to values returned by calls to proc. The value returned is unspecified.
Each call is (proc i1 ... iN), where i1...iN is the destination index, one parameter for each dimension. The order in which the calls are made is unspecified.
For example, to create a 4x4 matrix representing a cyclic group,
    / 0 1 2 3 \
    | 1 2 3 0 |
    | 2 3 0 1 |
    \ 3 0 1 2 /
(define a (make-array #f 4 4))
(array-index-map! a (lambda (i j)
                      (modulo (+ i j) 4)))
Attempt to read all elements of ura, in lexicographic order, as binary objects from port-or-fdes. If an end of file is encountered, the objects up to that point are put into ura (starting at the beginning) and the remainder of the array is unchanged.
The optional arguments start and end allow a specified region of a vector (or linearized array) to be read, leaving the remainder of the vector unchanged.
uniform-array-read! returns the number of objects read. port-or-fdes may be omitted, in which case it defaults to the value returned by (current-input-port).
Writes all elements of ura as binary objects to port-or-fdes.
The optional arguments start and end allow a specified region of a vector (or linearized array) to be written.
The number of objects actually written is returned. port-or-fdes may be omitted, in which case it defaults to the value returned by
(current-output-port).
Return a new array which shares the storage of oldarray. Changes made through either affect the same underlying storage. The bound... arguments are the shape of the new array, the same as make-array (see Array Procedures).
mapfunc translates coordinates from the new array to the oldarray. It's called as (mapfunc newidx1 ...) with one parameter for each dimension of the new array, and should return a list of indices for oldarray, one for each dimension of oldarray.
mapfunc must be affine linear, meaning that each oldarray index must be formed by adding integer multiples (possibly negative) of some or all of newidx1 etc, plus a possible integer offset. The multiples and offset must be the same in each call.
One good use for a shared array is to restrict the range of some dimensions, so as to apply say array-for-each or array-fill! to only part of an array. The plain list function can be used for mapfunc in this case, making no changes to the index values. For example,
(make-shared-array #2((a b c) (d e f) (g h i)) list 3 2)
⇒ #2((a b) (d e) (g h))
The new array can have fewer dimensions than oldarray, for example to take a column from an array.
(make-shared-array #2((a b c) (d e f) (g h i))
                   (lambda (i) (list i 2))
                   '(0 2))
⇒ #1(c f i)
A diagonal can be taken by using the single new array index for both row and column in the old array. For example,
(make-shared-array #2((a b c) (d e f) (g h i))
                   (lambda (i) (list i i))
                   '(0 2))
⇒ #1(a e i)
Dimensions can be increased by for instance considering portions of a one dimensional array as rows in a two dimensional array. (array-contents below can do the opposite, flattening an array.)
(make-shared-array #1(a b c d e f g h i j k l)
                   (lambda (i j) (list (+ (* i 3) j)))
                   4 3)
⇒ #2((a b c) (d e f) (g h i) (j k l))
By negating an index the order that elements appear can be reversed. The following just reverses the column order,
(make-shared-array #2((a b c) (d e f) (g h i))
                   (lambda (i j) (list i (- 2 j)))
                   3 3)
⇒ #2((c b a) (f e d) (i h g))
A fixed offset on indexes allows for instance a change from a 0 based to a 1 based array,
(define x #2((a b c) (d e f) (g h i)))
(define y (make-shared-array x
                             (lambda (i j) (list (1- i) (1- j)))
                             '(1 3) '(1 3)))
(array-ref x 0 0) ⇒ a
(array-ref y 1 1) ⇒ a
A multiple on an index allows every Nth element of an array to be taken. The following is every third element,
(make-shared-array #1(a b c d e f g h i j k l)
                   (lambda (i) (list (* i 3)))
                   4)
⇒ #1(a d g j)
The above examples can be combined to make weird and wonderful selections from an array, but it's important to note that because mapfunc must be affine linear, arbitrary permutations are not possible.
In the current implementation, mapfunc is not called for every access to the new array but only on some sample points to establish a base and stride for new array indices in oldarray data. A few sample points are enough because mapfunc is linear.
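That a shared array really does share storage, rather than holding a copy, can be seen by writing through one array and reading through the other. The sketch below builds the old array with make-array so that it is safe to modify; the variable names are illustrative only.

```scheme
(define x (make-array 0 3 3))

;; diag is a rank 1 view of the diagonal of x.
(define diag (make-shared-array x (lambda (i) (list i i)) 3))

;; Writing through the view changes x ...
(array-fill! diag 1)
(array->list x)        ; ⇒ ((1 0 0) (0 1 0) (0 0 1))

;; ... and writing to x is visible through the view.
(array-set! x 'z 1 1)
(array-ref diag 1)     ; ⇒ z
```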
For each dimension, return the distance between elements in the root vector.
Return the root vector index of the first element in the array.
Return the root vector of a shared array.
If array may be unrolled into a one dimensional shared array without changing the order of its elements (last subscript changing fastest), then array-contents returns that shared array, otherwise it returns #f. All arrays made by make-array and make-typed-array may be unrolled, some arrays made by make-shared-array may not be.
If the optional argument strict is provided, a shared array will be returned only if its elements are stored contiguously in memory.
Return an array sharing contents with array, but with dimensions arranged in a different order. There must be one dim argument for each dimension of array. dim1, dim2, ... should be integers between 0 and the rank of the array to be returned. Each integer in that range must appear at least once in the argument list.
The values of dim1, dim2, ... correspond to dimensions in the array to be returned, and their positions in the argument list to dimensions of array. Several dims may have the same value, in which case the returned array will have smaller rank than array.
(transpose-array '#2((a b) (c d)) 1 0) ⇒ #2((a c) (b d))
(transpose-array '#2((a b) (c d)) 0 0) ⇒ #1(a d)
(transpose-array '#3(((a b c) (d e f)) ((1 2 3) (4 5 6))) 1 1 0)
⇒ #2((a 4) (b 5) (c 6))
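Because the transposed array shares its contents with the original, a write through one is visible through the other. A small sketch of this, with m and mt as illustrative names:

```scheme
(define m (make-array 0 2 3))        ; 2 rows, 3 columns, filled with 0
(define mt (transpose-array m 1 0))  ; shares contents with m

(array-dimensions mt)                ; ⇒ (3 2)
(array-set! mt 'x 2 1)               ; write through the transposed array
(array-ref m 1 2)                    ; ⇒ x
```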
For interworking with external C code, Guile provides an API to allow C code to access the elements of a Scheme array. In particular, for uniform numeric arrays, the API exposes the underlying uniform data as a C array of numbers of the relevant type.
While pointers to the elements of an array are in use, the array itself must be protected so that the pointer remains valid. Such a protected array is said to be reserved. A reserved array can be read but modifications to it that would cause the pointer to its elements to become invalid are prevented. When you attempt such a modification, an error is signalled.
(This is similar to locking the array while it is in use, but without the danger of a deadlock. In a multi-threaded program, you will need additional synchronization to avoid modifying reserved arrays.)
You must take care to always unreserve an array after reserving it, even in the presence of non-local exits. If a non-local exit can happen between these two calls, you should install a dynwind context that releases the array when it is left (see Dynamic Wind).
In addition, array reserving and unreserving must be properly paired. For instance, when reserving two or more arrays in a certain order, you need to unreserve them in the opposite order.
Once you have reserved an array and have retrieved the pointer to its elements, you must figure out the layout of the elements in memory. Guile allows slices to be taken out of arrays without actually making a copy, such as making an alias for the diagonal of a matrix that can be treated as a vector. Arrays that result from such an operation are not stored contiguously in memory and when working with their elements directly, you need to take this into account.
The layout of array elements in memory can be defined via a mapping function that computes a scalar position from a vector of indices. The scalar position then is the offset of the element with the given indices from the start of the storage block of the array.
In Guile, this mapping function is restricted to be affine: all
mapping functions of Guile arrays can be written as p = b +
c[0]*i[0] + c[1]*i[1] + ... + c[n-1]*i[n-1] where i[k] is the
kth index and n is the rank of the array. For
example, a matrix of size 3x3 would have b == 0, c[0] ==
3 and c[1] == 1. When you transpose this matrix (with
transpose-array, say), you will get an array whose mapping
function has b == 0, c[0] == 1 and c[1] == 3.
The function scm_array_handle_dims gives you (indirect) access to
the coefficients c[k].
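The affine mapping can be worked through by hand. The sketch below is plain Scheme arithmetic, not a Guile API: affine-position is a hypothetical helper that evaluates p = b + c[0]*i[0] + ... for a list of coefficients and a list of indices.

```scheme
;; Hypothetical helper, not part of Guile: evaluate the affine mapping
;; p = base + coeffs[0]*indices[0] + ... + coeffs[n-1]*indices[n-1].
(define (affine-position base coeffs indices)
  (apply + base (map * coeffs indices)))

;; A 3x3 matrix has b == 0, c[0] == 3 and c[1] == 1.
(affine-position 0 '(3 1) '(1 2))   ; ⇒ 5

;; Its transpose has c[0] == 1 and c[1] == 3; element (2 1) of the
;; transpose is element (1 2) of the original, at the same position.
(affine-position 0 '(1 3) '(2 1))   ; ⇒ 5
```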
Note that there are no functions for accessing the elements of a character array yet. Once the string implementation of Guile has been changed to use Unicode, we will provide them.
This is a structure type that holds all information necessary to manage the reservation of arrays as explained above. Structures of this type must be allocated on the stack and must only be accessed by the functions listed below.
Reserve array, which must be an array, and prepare handle to be used with the functions below. You must eventually call
scm_array_handle_release on handle, and do this in a properly nested fashion, as explained above. The structure pointed to by handle does not need to be initialized before calling this function.
End the array reservation represented by handle. After a call to this function, handle might be used for another reservation.
Return the rank of the array represented by handle.
This structure type holds information about the layout of one dimension of an array. It includes the following fields:
ssize_t lbnd
ssize_t ubnd
    The lower and upper bounds (both inclusive) of the permissible index range for the given dimension. Both values can be negative, but lbnd is always less than or equal to ubnd.
ssize_t inc
    The distance from one element of this dimension to the next. Note, too, that this can be negative.
Return a pointer to a C vector of information about the dimensions of the array represented by handle. This pointer is valid as long as the array remains reserved. As explained above, the scm_t_array_dim structures returned by this function can be used to calculate the position of an element in the storage block of the array from its indices.
This position can then be used as an index into the C array pointer returned by the various scm_array_handle_<foo>_elements functions, or with scm_array_handle_ref and scm_array_handle_set.
Here is how one can compute the position pos of an element given its indices in the vector indices:
ssize_t indices[RANK];
scm_t_array_dim *dims;
ssize_t pos;
size_t i;

pos = 0;
for (i = 0; i < RANK; i++)
  {
    if (indices[i] < dims[i].lbnd || indices[i] > dims[i].ubnd)
      out_of_range ();
    pos += (indices[i] - dims[i].lbnd) * dims[i].inc;
  }
Compute the position corresponding to indices, a list of indices. The position is computed as described above for
scm_array_handle_dims. The number of the indices and their range is checked and an appropriate error is signalled for invalid indices.
Return the element at position pos in the storage block of the array represented by handle. Any kind of array is acceptable. No range checking is done on pos.
Set the element at position pos in the storage block of the array represented by handle to val. Any kind of array is acceptable. No range checking is done on pos. An error is signalled when the array can not store val.
Return a pointer to the elements of an ordinary array of general Scheme values (i.e., a non-uniform array) for reading. This pointer is valid as long as the array remains reserved.
Like
scm_array_handle_elements, but the pointer is good for reading and writing.
Return a pointer to the elements of a uniform numeric array for reading. This pointer is valid as long as the array remains reserved. The size of each element is given by
scm_array_handle_uniform_element_size.
Like
scm_array_handle_uniform_elements, but the pointer is good for reading and writing.
Return the size of one element of the uniform numeric array represented by handle.
Return a pointer to the elements of a uniform numeric array of the indicated kind for reading. This pointer is valid as long as the array remains reserved.
The pointers for
c32 and c64 uniform numeric arrays point to pairs of floating point numbers. The even index holds the real part, the odd index the imaginary part of the complex number.
Like
scm_array_handle_<kind>_elements, but the pointer is good for reading and writing.
Return a pointer to the words that store the bits of the represented array, which must be a bit array.
Unlike other arrays, bit arrays have an additional offset that must be figured into index calculations. That offset is returned by scm_array_handle_bit_elements_offset.
To find a certain bit you first need to calculate its position as explained above for scm_array_handle_dims and then add the offset. This gives the absolute position of the bit, which is always a non-negative integer.
Each word of the bit array storage block contains exactly 32 bits, with the least significant bit in that word having the lowest absolute position number. The next word contains the next 32 bits.
Thus, the following code can be used to access a bit whose position according to scm_array_handle_dims is given in pos:
SCM bit_array;
scm_t_array_handle handle;
scm_t_uint32 *bits;
ssize_t pos;
size_t abs_pos;
size_t word_pos, mask;

scm_array_get_handle (bit_array, &handle);
bits = scm_array_handle_bit_elements (&handle);

pos = ...
abs_pos = pos + scm_array_handle_bit_elements_offset (&handle);
word_pos = abs_pos / 32;
mask = 1L << (abs_pos % 32);

if (bits[word_pos] & mask)
  /* bit is set. */

scm_array_handle_release (&handle);
Like
scm_array_handle_bit_elements but the pointer is good for reading and writing. You must take care not to modify bits outside of the allowed index range of the array, even for contiguous arrays.
The (ice-9 vlist) module provides an implementation of the VList
data structure designed by Phil Bagwell in 2002. VLists are immutable lists,
which can contain any Scheme object. They improve on standard Scheme linked
lists in several areas:
The idea behind VLists is to store vlist elements in increasingly large
contiguous blocks (implemented as vectors here). These blocks are linked to one
another using a pointer to the next block and an offset within that block. The
sizes of these blocks form a geometric series with ratio
block-growth-factor (2 by default).
The VList structure also serves as the basis for the VList-based hash lists or “vhashes”, an immutable dictionary type (see VHashes).
However, the current implementation in (ice-9 vlist) has several
noteworthy shortcomings:
vlist-cons mutates part of its internal structure, which makes
it non-thread-safe. This could be fixed, but it would slow down
vlist-cons.
vlist-cons always allocates at least as much memory as cons.
Again, Phil Bagwell describes how to fix it, but that would require tuning the
garbage collector in a way that may not be generally beneficial.
vlist-cons is a Scheme procedure compiled to bytecode, so it cannot
compete with the straightforward C implementation of cons, nor with the
fact that the VM has a special cons instruction.
We hope to address these in the future.
The programming interface exported by (ice-9 vlist) is defined below.
Most of it is the same as SRFI-1 with an added vlist- prefix to function
names.
The empty VList. Note that it's possible to create an empty VList not
eq? to vlist-null; thus, callers should always use vlist-null? when testing whether a VList is empty.
Return a new vlist with item as its head and vlist as its tail.
A fluid that defines the growth factor of VList blocks, 2 by default.
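As a brief illustration of construction and of the advice about vlist-null?, the sketch below builds a two-element VList and then drops both elements. The name vl is illustrative; whether the resulting empty VList is eq? to vlist-null is deliberately not relied upon.

```scheme
(use-modules (ice-9 vlist))

;; Build the VList (a b) by consing onto the empty VList.
(define vl (vlist-cons 'a (vlist-cons 'b vlist-null)))

(vlist-null? vlist-null)          ; ⇒ #t
(vlist-null? vl)                  ; ⇒ #f

;; Dropping every element gives an empty VList; test it with
;; vlist-null? rather than comparing against vlist-null with eq?.
(vlist-null? (vlist-drop vl 2))   ; ⇒ #t
```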
The functions below provide the usual set of higher-level list operations.
Fold over vlist, calling proc for each element, as for SRFI-1
fold and fold-right (see fold).
Return the element at index index in vlist. This is typically a constant-time operation.
Return the length of vlist. This is typically logarithmic in the number of elements in vlist.
Return a new vlist whose contents are those of vlist in reverse order.
Map proc over the elements of vlist and return a new vlist.
Call proc on each element of vlist. The result is unspecified.
Return a new vlist that does not contain the count first elements of vlist. This is typically a constant-time operation.
Return a new vlist that contains only the count first elements of vlist.
Return a new vlist containing all the elements from vlist that satisfy pred.
Return a new vlist corresponding to vlist without the elements equal? to x.
Return a new vlist, as for SRFI-1
unfold and unfold-right (see unfold).
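The operations above can be tried together as follows. This is a sketch rather than a definitive reference: it assumes the usual vlist- prefixed names exported by (ice-9 vlist), including the conversion procedures list->vlist and vlist->list, which are used here only to make the results easy to read.

```scheme
(use-modules (ice-9 vlist))

(define numbers (list->vlist '(1 2 3 4 5)))

(vlist-ref numbers 2)                                   ; ⇒ 3
(vlist-length numbers)                                  ; ⇒ 5
(vlist-fold + 0 numbers)                                ; ⇒ 15
(vlist->list (vlist-map (lambda (n) (* n n)) numbers))  ; ⇒ (1 4 9 16 25)
(vlist->list (vlist-filter odd? numbers))               ; ⇒ (1 3 5)
(vlist->list (vlist-reverse numbers))                   ; ⇒ (5 4 3 2 1)
(vlist->list (vlist-take numbers 2))                    ; ⇒ (1 2)
(vlist->list (vlist-drop numbers 2))                    ; ⇒ (3 4 5)
(vlist->list (vlist-delete 3 numbers))                  ; ⇒ (1 2 4 5)
```

None of these calls modifies numbers; each returns a new VList or a plain value.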
A record type is a first class object representing a user-defined data type. A record is an instance of a record type.
Return
#t if obj is a record of any type and #f otherwise.
Note that record? may be true of any Scheme value; there is no promise that records are disjoint with other Scheme types.
Create and return a new record-type descriptor.
type-name is a string naming the type. Currently it's only used in the printed representation of records, and in diagnostics. field-names is a list of symbols naming the fields of a record of the type. Duplicates are not allowed among these symbols.
(make-record-type "employee" '(name age salary))
The optional print argument is a function used by display, write, etc, for printing a record of the new type. It's called as (print record port) and should look at record and write to port.
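A print function might look like the following sketch. The point type, its fields and the output format are invented for illustration; record-constructor and record-accessor are the standard record procedures.

```scheme
;; Hypothetical "point" record type with a custom print function.
(define point-rtd
  (make-record-type "point" '(x y)
                    (lambda (record port)
                      (display "#<point " port)
                      (display ((record-accessor point-rtd 'x) record) port)
                      (display ", " port)
                      (display ((record-accessor point-rtd 'y) record) port)
                      (display ">" port))))

(define make-point (record-constructor point-rtd))

;; Capture what display writes for a point.
(define printed
  (call-with-output-string
    (lambda (port) (display (make-point 1 2) port))))
printed   ; ⇒ "#<point 1, 2>"
```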
Return a procedure for constructing new members of the type represented by rtd. The returned procedure accepts exactly as many arguments as there are symbols in the given list, field-names; these are used, in order, as the initial values of those fields in a new record, which is returned by the constructor procedure. The values of any fields not named in that list are unspecified. The field-names argument defaults to the list of field names in the call to
make-record-type that created the type represented by rtd; if the field-names argument is provided, it is an error if it contains any duplicates or any symbols not in the default list.
Return a procedure for testing membership in the type represented by rtd. The returned procedure accepts exactly one argument and returns a true value if the argument is a member of the indicated record type; it returns a false value otherwise.
Return a procedure for reading the value of a particular field of a member of the type represented by rtd. The returned procedure accepts exactly one argument which must be a record of the appropriate type; it returns the current value of the field named by the symbol field-name in that record. The symbol field-name must be a member of the list of field-names in the call to
make-record-type that created the type represented by rtd.
Return a procedure for writing the value of a particular field of a member of the type represented by rtd. The returned procedure accepts exactly two arguments: first, a record of the appropriate type, and second, an arbitrary Scheme value; it modifies the field named by the symbol field-name in that record to contain the given value. The returned value of the modifier procedure is unspecified. The symbol field-name must be a member of the list of field-names in the call to
make-record-type that created the type represented by rtd.
Return a record-type descriptor representing the type of the given record. That is, for example, if the returned descriptor were passed to
record-predicate, the resulting predicate would return a true value when passed the given record. Note that it is not necessarily the case that the returned descriptor is the one that was passed to record-constructor in the call that created the constructor procedure that created the given record.
Return the type-name associated with the type represented by rtd. The returned value is
eqv? to the type-name argument given in the call to make-record-type that created the type represented by rtd.
Return a list of the symbols naming the fields in members of the type represented by rtd. The returned value is
equal? to the field-names argument given in the call to make-record-type that created the type represented by rtd.
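Putting the record procedures together, a typical use could look like this sketch. The employee type follows the make-record-type example above; the names make-employee, employee?, employee-age and set-employee-age! are chosen here for illustration.

```scheme
(define employee-rtd (make-record-type "employee" '(name age salary)))

(define make-employee     (record-constructor employee-rtd))
(define employee?         (record-predicate employee-rtd))
(define employee-age      (record-accessor employee-rtd 'age))
(define set-employee-age! (record-modifier employee-rtd 'age))

(define e (make-employee "Ada" 36 5000))

(record? e)                          ; ⇒ #t
(employee? e)                        ; ⇒ #t
(employee? 42)                       ; ⇒ #f
(employee-age e)                     ; ⇒ 36
(set-employee-age! e 37)
(employee-age e)                     ; ⇒ 37
(record-type-fields employee-rtd)    ; ⇒ (name age salary)

;; A predicate made from the record's own descriptor accepts the record.
((record-predicate (record-type-descriptor e)) e)   ; ⇒ #t
```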
A structure is a first class data type which holds Scheme values
or C words in fields numbered 0 upwards. A vtable represents a
structure type, giving field types and permissions, and an optional
print function for write etc.
Structures are lower level than records (see Records) but have
some extra features. The vtable system allows sets of types to be
constructed, with class data. The uninterpreted words can
inter-operate with C code, allowing arbitrary pointers or other values
to be stored alongside usual Scheme SCM values.
A vtable is a structure type, specifying its layout, and other information. A vtable is actually itself a structure, but there's no need to worry about that initially (see Vtable Contents.)
Create a new vtable.
fields is a string describing the fields in the structures to be created. Each field is represented by two characters, a type letter and a permissions letter, for example "pw". The types are as follows.
p – a Scheme value. “p” stands for “protected” meaning it's protected against garbage collection.
u – an arbitrary word of data (an scm_t_bits). At the Scheme level it's read and written as an unsigned integer. “u” stands for “uninterpreted” (it's not treated as a Scheme value), or “unprotected” (it's not marked during GC), or “unsigned long” (its size), or all of these things.
s – a self-reference. Such a field holds the SCM value of the structure itself (a circular reference). This can be useful in C code where you might have a pointer to the data array, and want to get the Scheme SCM handle for the structure. In Scheme code it has no use.
The second letter for each field is a permission code,
w – writable, the field can be read and written.
r – read-only, the field can be read but not written.
o – opaque, the field can be neither read nor written at the Scheme level. This can be used for fields which should only be used from C code.
W, R, O – a tail array, with permissions for the array fields as per w, r, o.
A tail array is further fields at the end of a structure. The last field in the layout string might be for instance ‘pW’ to have a tail of writable Scheme-valued fields. The ‘pW’ field itself holds the tail size, and the tail fields come after it.
Here are some examples.
(make-vtable "pw")      ;; one writable field
(make-vtable "prpw")    ;; one read-only and one writable
(make-vtable "pwuwuw")  ;; one scheme and two uninterpreted
(make-vtable "prpW")    ;; one fixed then a tail array
The optional print argument is a function called by display and write (etc) to give a printed representation of a structure created from this vtable. It's called as (print struct port) and should look at struct and write to port. The default print merely gives a form like ‘#<struct ADDR:ADDR>’ with a pair of machine addresses.
The following print function for example shows the two fields of its structure.
(make-vtable "prpw"
             (lambda (struct port)
               (display "#<" port)
               (display (struct-ref struct 0) port)
               (display " and " port)
               (display (struct-ref struct 1) port)
               (display ">" port)))
This section describes the basic procedures for working with
structures. make-struct creates a structure, and
struct-ref and struct-set! read and write fields.
Create a new structure, with layout per the given vtable (see Vtables).
tail-size is the size of the tail array if vtable specifies a tail array. tail-size should be 0 when vtable doesn't specify a tail array.
The optional init... arguments are initial values for the fields of the structure (and the tail array). This is the only way to put values in read-only fields. If there are fewer init arguments than fields then the defaults are #f for a Scheme field (type p) or 0 for an uninterpreted field (type u).
Type s self-reference fields, permission o opaque fields, and the count field of a tail array are all ignored for the init arguments, i.e. an argument is not consumed by such a field. An s is always set to the structure itself, an o is always set to #f or 0 (with the intention that C code will do something to it later), and the tail count is always the given tail-size.
For example,
(define v (make-vtable "prpwpw"))
(define s (make-struct v 0 123 "abc" 456))
(struct-ref s 0) ⇒ 123
(struct-ref s 1) ⇒ "abc"

(define v (make-vtable "prpW"))
(define s (make-struct v 6 "fixed field" 'x 'y))
(struct-ref s 0) ⇒ "fixed field"
(struct-ref s 1) ⇒ 6    ;; tail size
(struct-ref s 2) ⇒ x    ;; tail array ...
(struct-ref s 3) ⇒ y
(struct-ref s 4) ⇒ #f
Return
#t if obj is a structure, or #f if not.
Return the contents of field number n in struct. The first field is number 0.
An error is thrown if n is out of range, or if the field cannot be read because it's
o opaque.
Set field number n in struct to value. The first field is number 0.
An error is thrown if n is out of range, or if the field cannot be written because it's
r read-only or o opaque.
Return the vtable used by struct.
This can be used to examine the layout of an unknown structure, see Vtable Contents.
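The basic procedures can be exercised as in the sketch below, which uses one read-only and one writable field. The names v, s and write-attempt are illustrative; the attempt to write the read-only field is wrapped in catch so that the error it signals is turned into a symbol.

```scheme
(define v (make-vtable "prpw"))              ; field 0 read-only, field 1 writable
(define s (make-struct v 0 'fixed 'initial)) ; no tail array, so tail-size is 0

(struct? s)                 ; ⇒ #t
(struct? 'not-a-struct)     ; ⇒ #f
(struct-ref s 1)            ; ⇒ initial
(struct-set! s 1 'changed)
(struct-ref s 1)            ; ⇒ changed
(eq? (struct-vtable s) v)   ; ⇒ #t

;; Field 0 is read-only, so struct-set! signals an error; catch it here.
(define write-attempt
  (catch #t
    (lambda () (struct-set! s 0 'oops) 'written)
    (lambda (key . args) 'refused)))
write-attempt               ; ⇒ refused
(struct-ref s 0)            ; ⇒ fixed
```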
A vtable is itself a structure, with particular fields that hold information about the structures to be created. These include the fields of those structures, and the print function for them. The variables below allow access to those fields.
Return
#t if obj is a vtable structure.
Note that because vtables are simply structures with a particular layout, struct-vtable? can potentially return true on an application structure which merely happens to look like a vtable.
The field number of the layout specification in a vtable. The layout specification is a symbol like pwpw formed from the fields string passed to make-vtable, or created by make-struct-layout (see Vtable Vtables).
(define v (make-vtable "pwpw"))
(struct-ref v vtable-index-layout) ⇒ pwpw
This field is read-only, since the layout of structures using a vtable cannot be changed.
A self-reference to the vtable, i.e. a type s field. This is used by C code within Guile and has no use at the Scheme level.
The field number of the printer function. This field contains #f if the default print function should be used.
(define (my-print-func struct port) ...)
(define v (make-vtable "pwpw" my-print-func))
(struct-ref v vtable-index-printer) ⇒ my-print-func
This field is writable, allowing the print function to be changed dynamically.
Get or set the name of vtable. name is a symbol and is used in the default print function when printing structures created from vtable.
(define v (make-vtable "pw"))
(set-struct-vtable-name! v 'my-name)
(define s (make-struct v 0))
(display s) -| #<my-name b7ab3ae0:b7ab3730>
Return the tag of the given vtable.
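Since the printer field is writable, a print function can be installed after the vtable has been created. The following sketch assumes that behaviour as described above; the printed text "#<pair-like>" and the names v, s and printed are invented for illustration.

```scheme
(define v (make-vtable "pwpw"))
(define s (make-struct v 0 'left 'right))

(struct-vtable? v)                    ; ⇒ #t
(struct-ref v vtable-index-layout)    ; ⇒ pwpw

;; Install a print function on the existing vtable.
(struct-set! v vtable-index-printer
             (lambda (struct port)
               (display "#<pair-like>" port)))

;; Structures already created from v are printed with the new function.
(define printed
  (call-with-output-string
    (lambda (port) (display s port))))
printed                               ; ⇒ "#<pair-like>"
```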
As noted above, a vtable is a structure and that structure is itself
described by a vtable. Such a “vtable of a vtable” can be created
with make-vtable-vtable below. This can be used to build sets
of related vtables, possibly with extra application fields.
This second level of vtable can be a little confusing. The ball example below is a typical use, adding a “class data” field to the vtables, from which instance structures are created. The current implementation of Guile's own records (see Records) does something similar, a record type descriptor is a vtable with room to hold the field names of the records to be created from it.
Create a “vtable-vtable” which can be used to create vtables. This vtable-vtable is also a vtable, and is self-describing, meaning its vtable is itself. The following is a simple usage.
(define vt-vt (make-vtable-vtable "" 0))
(define vt (make-struct vt-vt 0
                        (make-struct-layout "pwpw")))
(define s (make-struct vt 0 123 456))
(struct-ref s 0) ⇒ 123
make-struct is used to create a vtable from the vtable-vtable. The first initializer is a layout object (field vtable-index-layout), usually obtained from make-struct-layout (below). An optional second initializer is a printer function (field vtable-index-printer), used as described under make-vtable (see Vtables).
user-fields is a layout string giving extra fields to have in the vtables. A vtable starts with some base fields as per Vtable Contents, and user-fields is appended. The user-fields start at field number vtable-offset-user (below), and exist in both the vtable-vtable and in the vtables created from it. Such fields provide space for “class data”. For example,
(define vt-of-vt (make-vtable-vtable "pw" 0))
(define vt (make-struct vt-of-vt 0))
(struct-set! vt vtable-offset-user "my class data")
tail-size is the size of the tail array in the vtable-vtable itself, if user-fields specifies a tail array. This should be 0 if nothing extra is required or the format has no tail array. The tail array field such as ‘pW’ holds the tail array size, as usual, and is followed by the extra space.
(define vt-vt (make-vtable-vtable "pW" 20))
(define my-vt-tail-start (1+ vtable-offset-user))
(struct-set! vt-vt (+ 3 my-vt-tail-start) "data in tail")
The optional print argument is used by display and write (etc) to print the vtable-vtable and any vtables created from it. It's called as (print vtable port) and should look at vtable and write to port. The default is the usual structure print function, which just gives machine addresses.
Return a structure layout symbol, from a fields string. fields is as described under make-vtable (see Vtables). An invalid fields string is an error.
(make-struct-layout "prpW") ⇒ prpW
(make-struct-layout "blah") ⇒ ERROR
The first field in a vtable which is available for application use. Such fields only exist when specified by user-fields in
make-vtable-vtable above.
Here's an extended vtable-vtable example, creating classes of “balls”. Each class has a “colour”, which is fixed. Instances of those classes are created, and each such ball has an “owner”, which can be changed.
(define ball-root (make-vtable-vtable "pr" 0))

(define (make-ball-type ball-color)
  (make-struct ball-root 0
               (make-struct-layout "pw")
               (lambda (ball port)
                 (format port "#<a ~A ball owned by ~A>"
                         (color ball)
                         (owner ball)))
               ball-color))

(define (color ball)
  (struct-ref (struct-vtable ball) vtable-offset-user))

(define (owner ball)
  (struct-ref ball 0))

(define red (make-ball-type 'red))
(define green (make-ball-type 'green))

(define (make-ball type owner) (make-struct type 0 owner))

(define ball (make-ball green 'Nisse))
ball ⇒ #<a green ball owned by Nisse>
A dictionary object is a data structure used to index
information in a user-defined way. In standard Scheme, the main
aggregate data types are lists and vectors. Lists are not really
indexed at all, and vectors are indexed only by number
(e.g. (vector-ref foo 5)). Often you will find it useful
to index your data on some other type; for example, in a library
catalog you might want to look up a book by the name of its
author. Dictionaries are used to help you organize information in
such a way.
An association list (or alist for short) is a list of
key-value pairs. Each pair represents a single quantity or
object; the car of the pair is a key which is used to
identify the object, and the cdr is the object's value.
A hash table also permits you to index objects with arbitrary keys, but in a way that makes looking up any one object extremely fast. A well-designed hash system makes hash table lookups almost as fast as conventional array or vector references.
Alists are popular among Lisp programmers because they use only the language's primitive operations (lists, car, cdr and the equality primitives). No changes to the language core are necessary. Therefore, with Scheme's built-in list manipulation facilities, it is very convenient to handle data stored in an association list. Also, alists are highly portable and can be easily implemented on even the most minimal Lisp systems.
However, alists are inefficient, especially for storing large quantities of data. Because we want Guile to be useful for large software systems as well as small ones, Guile provides a rich set of tools for using either association lists or hash tables.
An association list is a conventional data structure that is often used
to implement simple key-value databases. It consists of a list of
entries in which each entry is a pair. The key of each entry is
the car of the pair and the value of each entry is the
cdr.
ASSOCIATION LIST ::= '( (KEY1 . VALUE1)
(KEY2 . VALUE2)
(KEY3 . VALUE3)
...
)
Association lists are also known, for short, as alists.
The structure of an association list is just one example of the infinite
number of possible structures that can be built using pairs and lists.
As such, the keys and values in an association list can be manipulated
using the general list structure procedures cons, car,
cdr, set-car!, set-cdr! and so on. However,
because association lists are so useful, Guile also provides specific
procedures for manipulating them.
All of Guile's dedicated association list procedures, apart from
acons, come in three flavours, depending on the level of equality
that is required to decide whether an existing key in the association
list is the same as the key that the procedure call uses to identify the
required entry.
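Procedures with assq in their name use eq? to determine key
equality.
Procedures with assv in their name use eqv? to determine
key equality.
Procedures with assoc in their name use equal? to
determine key equality.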
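The difference between the flavours shows up as soon as keys are strings. In this sketch the two keys are built at run time so that they are guaranteed to be distinct objects with the same contents; key-1, key-2 and addresses are illustrative names.

```scheme
;; Two distinct string objects with identical contents.
(define key-1 (string #\m #\a #\r #\y))
(define key-2 (string #\m #\a #\r #\y))

(define addresses (list (cons key-1 "34 Elm Road")))

(assq key-1 addresses)    ; ⇒ ("mary" . "34 Elm Road"), the very same object
(assq key-2 addresses)    ; ⇒ #f, a different object under eq?
(assv key-2 addresses)    ; ⇒ #f, eqv? also distinguishes the two strings
(assoc key-2 addresses)   ; ⇒ ("mary" . "34 Elm Road"), equal? compares contents
```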
acons is an exception because it is used to build association
lists which do not require their entries' keys to be unique.
acons adds a new entry to an association list and returns the
combined association list. The combined alist is formed by consing the
new entry onto the head of the alist specified in the acons
procedure call. So the specified alist is not modified, but its
contents become shared with the tail of the combined alist that
acons returns.
In the most common usage of acons, a variable holding the
original association list is updated with the combined alist:
(set! address-list (acons name address address-list))
In such cases, it doesn't matter that the old and new values of
address-list share some of their contents, since the old value is
usually no longer independently accessible.
Note that acons adds the specified new entry regardless of
whether the alist may already contain entries with keys that are, in
some sense, the same as that of the new entry. Thus acons is
ideal for building alists where there is no concept of key uniqueness.
(set! task-list (acons 3 "pay gas bill" '()))
task-list
⇒
((3 . "pay gas bill"))
(set! task-list (acons 3 "tidy bedroom" task-list))
task-list
⇒
((3 . "tidy bedroom") (3 . "pay gas bill"))
assq-set!, assv-set! and assoc-set! are used to add
or replace an entry in an association list where there is a
concept of key uniqueness. If the specified association list already
contains an entry whose key is the same as that specified in the
procedure call, the existing entry is replaced by the new one.
Otherwise, the new entry is consed onto the head of the old association
list to create the combined alist. In all cases, these procedures
return the combined alist.
assq-set! and friends may destructively modify the
structure of the old association list in such a way that an existing
variable is correctly updated without having to set! it to the
value returned:
address-list
⇒
(("mary" . "34 Elm Road") ("james" . "16 Bow Street"))
(assoc-set! address-list "james" "1a London Road")
⇒
(("mary" . "34 Elm Road") ("james" . "1a London Road"))
address-list
⇒
(("mary" . "34 Elm Road") ("james" . "1a London Road"))
Or they may not:
(assoc-set! address-list "bob" "11 Newington Avenue")
⇒
(("bob" . "11 Newington Avenue") ("mary" . "34 Elm Road")
("james" . "1a London Road"))
address-list
⇒
(("mary" . "34 Elm Road") ("james" . "1a London Road"))
The only safe way to update an association list variable when adding or
replacing an entry like this is to set! the variable to the
returned value:
(set! address-list
(assoc-set! address-list "bob" "11 Newington Avenue"))
address-list
⇒
(("bob" . "11 Newington Avenue") ("mary" . "34 Elm Road")
("james" . "1a London Road"))
Because of this slight inconvenience, you may find it more convenient to use hash tables to store dictionary data. If your application will not be modifying the contents of an alist very often, this may not make much difference to you.
If you need to keep the old value of an association list in a form
independent from the list that results from modification by
acons, assq-set!, assv-set! or assoc-set!,
use list-copy to copy the old association list before modifying
it.
Add a new key-value pair to alist. A new pair is created whose car is key and whose cdr is value, and the pair is consed onto alist, and the new list is returned. This function is not destructive; alist is not modified.
Reassociate key in alist with value: find any existing alist entry for key and associate it with the new value. If alist does not contain an entry for key, add a new one. Return the (possibly new) alist.
These functions do not attempt to verify the structure of alist, and so may cause unusual results if passed an object that is not an association list.
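A short, self-contained sketch of both procedures follows. It uses symbol keys with assq-set! and always stores the returned alist back into the variable, as recommended above; base, extended and inventory are illustrative names.

```scheme
;; acons is not destructive: the old alist becomes the tail of the new one.
(define base (list (cons 'a 1)))
(define extended (acons 'b 2 base))
base                        ; ⇒ ((a . 1))
extended                    ; ⇒ ((b . 2) (a . 1))
(eq? (cdr extended) base)   ; ⇒ #t, the contents are shared

;; assq-set! adds or replaces; keep the returned alist.
(define inventory '())
(set! inventory (assq-set! inventory 'apples 3))
(set! inventory (assq-set! inventory 'pears 5))
(set! inventory (assq-set! inventory 'apples 4))   ; replaces the existing entry
inventory                   ; ⇒ ((pears . 5) (apples . 4))
```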
assq, assv and assoc find the entry in an alist
for a given key, and return the (key . value) pair.
assq-ref, assv-ref and assoc-ref do a similar
lookup, but return just the value.
Return the first entry in alist with the given key. The return is the pair
(KEY . VALUE) from alist. If there's no matching entry the return is #f.
assq compares keys with eq?, assv uses eqv? and assoc uses equal?. See also SRFI-1 which has an extended assoc (see SRFI-1 Association Lists).
Return the value from the first entry in alist with the given key, or
#f if there's no such entry.
assq-ref compares keys with eq?, assv-ref uses eqv? and assoc-ref uses equal?.
Notice these functions have the key argument last, like other -ref functions, but this is opposite to what assq etc above use.
When the return is #f it can be either key not found, or an entry which happens to have value #f in the cdr. Use assq etc above if you need to differentiate these cases.
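The ambiguity of a #f result, and the differing argument order, can be seen in this small sketch; settings is an illustrative alist.

```scheme
(define settings (list (cons 'verbose #f) (cons 'depth 3)))

;; The -ref procedures take the alist first and the key last.
(assq-ref settings 'depth)     ; ⇒ 3
(assq-ref settings 'verbose)   ; ⇒ #f, the stored value
(assq-ref settings 'missing)   ; ⇒ #f, no such entry

;; assq takes the key first and returns the whole pair, so the two
;; cases above can be told apart.
(assq 'verbose settings)       ; ⇒ (verbose . #f)
(assq 'missing settings)       ; ⇒ #f
```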
To remove the element from an association list whose key matches a
specified key, use assq-remove!, assv-remove! or
assoc-remove! (depending, as usual, on the level of equality
required between the key that you specify and the keys in the
association list).
As with assq-set! and friends, the specified alist may or may not
be modified destructively, and the only safe way to update a variable
containing the alist is to set! it to the value that
assq-remove! and friends return.
address-list
⇒
(("bob" . "11 Newington Avenue") ("mary" . "34 Elm Road")
("james" . "1a London Road"))
(set! address-list (assoc-remove! address-list "mary"))
address-list
⇒
(("bob" . "11 Newington Avenue") ("james" . "1a London Road"))
Note that, when assq/v/oc-remove! is used to modify an
association list that has been constructed only using the corresponding
assq/v/oc-set!, there can be at most one matching entry in the
alist, so the question of multiple entries being removed in one go does
not arise. If assq/v/oc-remove! is applied to an association
list that has been constructed using acons, or an
assq/v/oc-set! with a different level of equality, or any mixture
of these, it removes only the first matching entry from the alist, even
if the alist might contain further matching entries. For example:
(define address-list '())
(set! address-list (assq-set! address-list "mary" "11 Elm Street"))
(set! address-list (assq-set! address-list "mary" "57 Pine Drive"))
address-list
⇒
(("mary" . "57 Pine Drive") ("mary" . "11 Elm Street"))
(set! address-list (assoc-remove! address-list "mary"))
address-list
⇒
(("mary" . "11 Elm Street"))
In this example, the two instances of the string "mary" are not the same
when compared using eq?, so the two assq-set! calls add
two distinct entries to address-list. When compared using
equal?, both "mary"s in address-list are the same as the
"mary" in the assoc-remove! call, but assoc-remove! stops
after removing the first matching entry that it finds, and so one of the
"mary" entries is left in place.
Delete the first entry in alist associated with key, and return the resulting alist.
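When every matching entry should go rather than only the first, one option outside the procedures described in this section is alist-delete from SRFI-1. The sketch below assumes the (srfi srfi-1) module is available; the address data is made up for illustration:

```scheme
(use-modules (srfi srfi-1))

(define address-list
  (list (cons "mary" "57 Pine Drive")
        (cons "bob" "11 Newington Avenue")
        (cons "mary" "11 Elm Street")))

;; alist-delete compares keys with equal? by default and drops every
;; matching entry, returning a new list.
(alist-delete "mary" address-list)
;; ⇒ (("bob" . "11 Newington Avenue"))

;; The original list still has all three entries.
(length address-list)   ; ⇒ 3
```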
sloppy-assq, sloppy-assv and sloppy-assoc behave
like the corresponding non-sloppy- procedures, except that they
return #f when the specified association list is not well-formed,
where the non-sloppy- versions would signal an error.
Specifically, there are two conditions for which the non-sloppy-
procedures signal an error, which the sloppy- procedures handle
instead by returning #f. Firstly, if the specified alist as a
whole is not a proper list:
(assoc "mary" '((1 . 2) ("key" . "door") . "open sesame"))
⇒
ERROR: In procedure assoc in expression (assoc "mary" (quote #)):
ERROR: Wrong type argument in position 2 (expecting
association list): ((1 . 2) ("key" . "door") . "open sesame")
(sloppy-assoc "mary" '((1 . 2) ("key" . "door") . "open sesame"))
⇒
#f
Secondly, if one of the entries in the specified alist is not a pair:
(assoc 2 '((1 . 1) 2 (3 . 9)))
⇒
ERROR: In procedure assoc in expression (assoc 2 (quote #)):
ERROR: Wrong type argument in position 2 (expecting
association list): ((1 . 1) 2 (3 . 9))
(sloppy-assoc 2 '((1 . 1) 2 (3 . 9)))
⇒
#f
Unless you are explicitly working with badly formed association lists,
it is much safer to use the non-sloppy- procedures, because they
help to highlight coding and data errors that the sloppy-
versions would silently cover up.
Behaves like assq but does not do any error checking. Recommended only for use in Guile internals.
Behaves like assv but does not do any error checking. Recommended only for use in Guile internals.
Behaves like assoc but does not do any error checking. Recommended only for use in Guile internals.
Here is a longer example of how alists may be used in practice.
(define capitals '(("New York" . "Albany")
("Oregon" . "Salem")
("Florida" . "Miami")))
;; What's the capital of Oregon?
(assoc "Oregon" capitals) ⇒ ("Oregon" . "Salem")
(assoc-ref capitals "Oregon") ⇒ "Salem"
;; We left out South Dakota.
(set! capitals
(assoc-set! capitals "South Dakota" "Pierre"))
capitals
⇒ (("South Dakota" . "Pierre")
("New York" . "Albany")
("Oregon" . "Salem")
("Florida" . "Miami"))
;; And we got Florida wrong.
(set! capitals
(assoc-set! capitals "Florida" "Tallahassee"))
capitals
⇒ (("South Dakota" . "Pierre")
("New York" . "Albany")
("Oregon" . "Salem")
("Florida" . "Tallahassee"))
;; After Oregon secedes, we can remove it.
(set! capitals
(assoc-remove! capitals "Oregon"))
capitals
⇒ (("South Dakota" . "Pierre")
("New York" . "Albany")
("Florida" . "Tallahassee"))
The (ice-9 vlist) module provides an implementation of VList-based
hash lists (see VLists). VList-based hash lists, or vhashes, are an
immutable dictionary type similar to association lists that maps keys to
values. However, unlike association lists, accessing a value given its
key is typically a constant-time operation.
The VHash programming interface of (ice-9 vlist) is mostly the same as
that of association lists found in SRFI-1, with procedure names prefixed by
vhash- instead of alist- (see SRFI-1 Association Lists).
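Because vhashes are immutable, adding an association yields a new vhash and leaves the old one as it was. A minimal sketch (the names v1 and v2 are illustrative):

```scheme
(use-modules (ice-9 vlist))

(define v1 (vhash-cons "one" 1 vlist-null))
(define v2 (vhash-cons "two" 2 v1))   ; built on top of v1

(vhash-assoc "two" v2)   ; ⇒ ("two" . 2)
(vhash-assoc "one" v2)   ; ⇒ ("one" . 1)
(vhash-assoc "two" v1)   ; ⇒ #f, v1 was not modified
```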
In addition, vhashes can be manipulated using VList operations:
(vlist-head (vhash-consq 'a 1 vlist-null))
⇒ (a . 1)
(define vh1 (vhash-consq 'b 2 (vhash-consq 'a 1 vlist-null)))
(define vh2 (vhash-consq 'c 3 (vlist-tail vh1)))
(vhash-assq 'a vh2)
⇒ (a . 1)
(vhash-assq 'b vh2)
⇒ #f
(vhash-assq 'c vh2)
⇒ (c . 3)
(vlist->list vh2)
⇒ ((c . 3) (a . 1))
However, keep in mind that procedures that construct new VLists
(vlist-map, vlist-filter, etc.) return raw VLists, not vhashes:
(define vh (alist->vhash '((a . 1) (b . 2) (c . 3)) hashq))
(vhash-assq 'a vh)
⇒ (a . 1)
(define vl
;; This will create a raw vlist.
(vlist-filter (lambda (key+value) (odd? (cdr key+value))) vh))
(vhash-assq 'a vl)
⇒ ERROR: Wrong type argument in position 2
(vlist->list vl)
⇒ ((a . 1) (c . 3))
Return a new hash list based on vhash where key is associated with value, using hash-proc to compute the hash of key. vhash must be either vlist-null or a vhash returned by a previous call to vhash-cons. hash-proc defaults to hash (see hash procedure). With vhash-consq, the hashq hash function is used; with vhash-consv the hashv hash function is used.
All vhash-cons calls made to construct a vhash should use the same hash-proc. Otherwise, the result is undefined.
Return the first key/value pair from vhash whose key is equal to key according to the equal? equality predicate (which defaults to equal?), and using hash-proc (which defaults to hash) to compute the hash of key. The second form uses eq? as the equality predicate and hashq as the hash function; the last form uses eqv? and hashv.
Note that it is important to consistently use the same hash function for hash-proc as was passed to vhash-cons. Otherwise, the result is unpredictable.
Remove all associations from vhash with key, comparing keys with equal? (which defaults to equal?), and computing the hash of key using hash-proc (which defaults to hash). The second form uses eq? as the equality predicate and hashq as the hash function; the last one uses eqv? and hashv.
Again the choice of hash-proc must be consistent with previous calls to vhash-cons.
Fold over the key/value elements of vhash in the given direction, with each call to proc having the form (proc key value result), where result is the result of the previous call to proc and init the value of result for the first call to proc.
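As a sketch of the (proc key value result) calling convention, the following sums the values of a small vhash with vhash-fold. The data and the name prices are made up for illustration; a sum is used so that the result does not depend on traversal order:

```scheme
(use-modules (ice-9 vlist))

(define prices
  (alist->vhash '(("tea" . 3) ("milk" . 2) ("jam" . 5))))

;; proc receives the key, the value, and the accumulated result.
(vhash-fold (lambda (key value total) (+ value total))
            0
            prices)
;; ⇒ 10
```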
Fold over all the values associated with key in vhash, with each call to proc having the form (proc value result), where result is the result of the previous call to proc and init the value of result for the first call to proc.
Keys in vhash are hashed using hash and compared using equal?. The second form uses eq? as the equality predicate and hashq as the hash function; the third one uses eqv? and hashv.
Example:
(define vh (alist->vhash '((a . 1) (a . 2) (z . 0) (a . 3))))
(vhash-fold* cons '() 'a vh)
⇒ (3 2 1)
(vhash-fold* cons '() 'z vh)
⇒ (0)
Return the vhash corresponding to alist, an association list, using hash-proc to compute key hashes. When omitted, hash-proc defaults to hash.
Hash tables are dictionaries which offer similar functionality to association lists: they provide a mapping from keys to values. The difference is that association lists need time linear in the number of elements when searching for entries, whereas hash tables can normally search in constant time. The drawback is that hash tables require a little bit more memory, and that you cannot use the normal list procedures (see Lists) for working with them.
Guile provides two types of hash tables. One is an abstract data type that can only be manipulated with the functions in this section. The other type is concrete: it uses a normal vector with alists as elements. The advantage of the abstract hash tables is that they will be automatically resized when they become too full or too empty.
For demonstration purposes, this section gives a few usage examples of some hash table procedures, together with some explanation of what they do.
First we start by creating a new hash table with 31 slots, and populate it with two key/value pairs.
(define h (make-hash-table 31))
;; This is an opaque object
h
⇒
#<hash-table 0/31>
;; We can also use a vector of alists.
(define h (make-vector 7 '()))
h
⇒
#(() () () () () () ())
;; Inserting into a hash table can be done with hashq-set!
(hashq-set! h 'foo "bar")
⇒
"bar"
(hashq-set! h 'braz "zonk")
⇒
"zonk"
;; Or with hashq-create-handle!
(hashq-create-handle! h 'frob #f)
⇒
(frob . #f)
;; The vector now contains three elements in the alists and the frob
;; entry is at index (hashq 'frob 7).
h
⇒
#(((braz . "zonk")) ((foo . "bar")) () () () () ((frob . #f)))
(hashq 'frob 7)
⇒
6
You can get the value for a given key with the procedure
hashq-ref, but the problem with this procedure is that you
cannot reliably determine whether a key exists in the table. The
reason is that the procedure returns #f if the key is not in
the table, but it will return the same value if the key is in the
table and just happens to have the value #f, as you can see in
the following examples.
(hashq-ref h 'foo)
⇒
"bar"
(hashq-ref h 'frob)
⇒
#f
(hashq-ref h 'not-there)
⇒
#f
Better is to use the procedure hashq-get-handle, which makes a
distinction between the two cases. Just like assq, this
procedure returns a key/value-pair on success, and #f if the
key is not found.
(hashq-get-handle h 'foo)
⇒
(foo . "bar")
(hashq-get-handle h 'not-there)
⇒
#f
There is no procedure for calculating the number of key/value-pairs in
a hash table, but hash-fold can be used for doing exactly that.
(hash-fold (lambda (key value seed) (+ 1 seed)) 0 h)
⇒
3
Like the association list functions, the hash table functions come in
several varieties, according to the equality test used for the keys.
Plain hash- functions use equal?, hashq-
functions use eq?, hashv- functions use eqv?, and
the hashx- functions use an application-supplied test.
A single make-hash-table creates a hash table suitable for use
with any set of functions, but it's imperative that just one set is
then used consistently, or results will be unpredictable.
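A minimal sketch of sticking to one family of procedures: string keys need equal? comparison, so the plain hash- procedures are used for every operation on this table. The table name stock and its contents are illustrative:

```scheme
(define stock (make-hash-table))

;; String keys are compared with equal?, so use hash-set!, hash-ref
;; and friends throughout, never the hashq- or hashv- variants.
(hash-set! stock "apple" 12)

;; A freshly built string is equal? to the stored key, so it is found.
(hash-ref stock (string-append "app" "le"))   ; ⇒ 12
(hash-get-handle stock "apple")               ; ⇒ ("apple" . 12)

(hash-remove! stock "apple")
(hash-ref stock "apple")                      ; ⇒ #f
```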
Hash tables are implemented as a vector indexed by a hash value formed
from the key, with an association list of key/value pairs for each
bucket in case distinct keys hash together. Direct access to the
pairs in those lists is provided by the -handle- functions.
The abstract kind of hash table hides the vector in an opaque object
that represents the hash table, while for the concrete kind the vector
is the hash table.
When the number of table entries in an abstract hash table goes above a threshold, the vector is made larger and the entries are rehashed, to prevent the bucket lists from becoming too long and slowing down accesses. When the number of entries goes below a threshold, the vector is shrunk to save space.
An abstract hash table is created with make-hash-table. To
create a vector that is suitable as a hash table, use
(make-vector size '()), for example.
For the hashx- “extended” routines, an application supplies a
hash function producing an integer index like hashq etc
below, and an assoc alist search function like assq etc
(see Retrieving Alist Entries). Here's an example of such
functions implementing case-insensitive hashing of string keys:
(use-modules (srfi srfi-1)
(srfi srfi-13))
(define (my-hash str size)
(remainder (string-hash-ci str) size))
(define (my-assoc str alist)
(find (lambda (pair) (string-ci=? str (car pair))) alist))
(define my-table (make-hash-table))
(hashx-set! my-hash my-assoc my-table "foo" 123)
(hashx-ref my-hash my-assoc my-table "FOO")
⇒ 123
In a hashx- hash function the aim is to spread keys
across the vector, so bucket lists don't become long. But the actual
values are arbitrary as long as they're in the range 0 to
size-1. Helpful functions for forming a hash value, in
addition to hashq etc below, include symbol-hash
(see Symbol Keys), string-hash and string-hash-ci
(see String Comparison), and char-set-hash
(see Character Set Predicates/Comparison).
Create a new abstract hash table object, with an optional minimum vector size.
When size is given, the table vector will still grow and shrink automatically, as described above, but with size as a minimum. If an application knows roughly how many entries the table will hold then it can use size to avoid rehashing when initial entries are added.
Return #t if obj is an abstract hash table object.
Remove all items from table (without triggering a resize).
Look up key in the given hash table, and return the associated value. If key is not found, return dflt, or #f if dflt is not given.
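The dflt argument is convenient for tallies, where a missing key should read as zero. The following is a sketch only; counts and count-word! are hypothetical names:

```scheme
(define counts (make-hash-table))

;; Hypothetical helper: bump the tally for word, treating a missing
;; key as zero via the dflt argument of hash-ref.
(define (count-word! word)
  (hash-set! counts word (+ 1 (hash-ref counts word 0))))

(for-each count-word! '("spam" "eggs" "spam"))

(hash-ref counts "spam")      ; ⇒ 2
(hash-ref counts "eggs")      ; ⇒ 1
(hash-ref counts "bacon" 0)   ; ⇒ 0
(hash-ref counts "bacon")     ; ⇒ #f
```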
Associate val with key in the given hash table. If key is already present then its associated value is changed. If it's not present then a new entry is created.
Remove any association for key in the given hash table. If key is not in table then nothing is done.
Return a hash value for key. This is a number in the range 0 to size-1, which is suitable for use in a hash table of the given size.
Note that hashq and hashv may use internal addresses of objects, so if an object is garbage collected and re-created it can have a different hash value, even when the two are notionally eq?. For instance with symbols,
(hashq 'something 123) ⇒ 19
(gc)
(hashq 'something 123) ⇒ 62
In normal use this is not a problem, since an object entered into a hash table won't be garbage collected until removed. It's only if hashing calculations are somehow separated from normal references that its lifetime needs to be considered.
Return the (key . value) pair for key in the given hash table, or #f if key is not in table.
Return the (key . value) pair for key in the given hash table. If key is not in table then create an entry for it with init as the value, and return that pair.
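Since a handle is the actual pair stored in the table, one possible use is to update an entry in place with a single lookup. This is a sketch under that assumption; hits and hit! are hypothetical names:

```scheme
(define hits (make-hash-table))

;; Hypothetical helper: fetch (or create with value 0) the handle for
;; key, then increment the value stored in its cdr.
(define (hit! key)
  (let ((handle (hashq-create-handle! hits key 0)))
    (set-cdr! handle (+ 1 (cdr handle)))
    (cdr handle)))

(hit! 'index)             ; ⇒ 1
(hit! 'index)             ; ⇒ 2
(hashq-ref hits 'index)   ; ⇒ 2
```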
Apply proc to the entries in the given hash table. Each call is (proc key value). hash-map->list returns a list of the results from these calls, hash-for-each discards the results and returns an unspecified value.
Calls are made over the table entries in an unspecified order, and for hash-map->list the order of the values in the returned list is unspecified. Results will be unpredictable if table is modified while iterating.
For example the following returns a new alist comprising all the entries from mytable, in no particular order.
(hash-map->list cons mytable)
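Because the order of the returned list is unspecified, code that needs a stable order has to impose one itself. A sketch, assuming symbol keys and using the core sort procedure (the table ages and the helper key<? are illustrative):

```scheme
(define ages (make-hash-table))
(hashq-set! ages 'carol 41)
(hashq-set! ages 'alice 30)
(hashq-set! ages 'bob 25)

;; Hypothetical helper: order (key . value) pairs by the key's name.
(define (key<? a b)
  (string<? (symbol->string (car a))
            (symbol->string (car b))))

(sort (hash-map->list cons ages) key<?)
;; ⇒ ((alice . 30) (bob . 25) (carol . 41))
```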
Apply proc to the entries in the given hash table. Each call is (proc handle), where handle is a (key . value) pair. Return an unspecified value.
hash-for-each-handle differs from hash-for-each only in the argument list of proc.
Accumulate a result by applying proc to the elements of the given hash table. Each call is (proc key value prior-result), where key and value are from the table and prior-result is the return from the previous proc call. For the first call, prior-result is the given init value.
Calls are made over the table entries in an unspecified order. Results will be unpredictable if table is modified while hash-fold is running.
For example, the following returns a count of how many keys in mytable are strings.
(hash-fold (lambda (key value prior)
             (if (string? key) (1+ prior) prior))
           0 mytable)
This chapter contains reference information related to defining and working with smobs. See Defining New Types (Smobs) for a tutorial-like introduction to smobs.
This function adds a new smob type, named name, with instance size size, to the system. The return value is a tag that is used in creating instances of the type.
If size is 0, the default free function will do nothing.
If size is not 0, the default free function will deallocate the memory block pointed to by SCM_SMOB_DATA with scm_gc_free. The WHAT parameter in the call to scm_gc_free will be NAME.
Default values are provided for the mark, free, print, and equalp functions, as described in Defining New Types (Smobs). If you want to customize any of these functions, the call to scm_make_smob_type should be immediately followed by calls to one or several of scm_set_smob_mark, scm_set_smob_free, scm_set_smob_print, and/or scm_set_smob_equalp.
This function sets the smob freeing procedure (sometimes referred to as a finalizer) for the smob type specified by the tag tc. tc is the tag returned by scm_make_smob_type.
The free procedure must deallocate all resources that are directly associated with the smob instance obj. It must assume that all SCM values that it references have already been freed and are thus invalid.
It must also not call any libguile function or macro except scm_gc_free, SCM_SMOB_FLAGS, SCM_SMOB_DATA, SCM_SMOB_DATA_2, and SCM_SMOB_DATA_3.
The free procedure must return 0.
Note that defining a freeing procedure is not necessary if the resources associated with obj consist only of memory allocated with scm_gc_malloc or scm_gc_malloc_pointerless because this memory is automatically reclaimed by the garbage collector when it is no longer needed (see scm_gc_malloc).
This function sets the smob marking procedure for the smob type specified by the tag tc. tc is the tag returned by scm_make_smob_type.
Defining a marking procedure may sometimes be unnecessary because large parts of the process' memory (with the exception of scm_gc_malloc_pointerless regions, and malloc- or scm_malloc-allocated memory) are scanned for live pointers.
The mark procedure must cause scm_gc_mark to be called for every SCM value that is directly referenced by the smob instance obj. One of these SCM values can be returned from the procedure and Guile will call scm_gc_mark for it. This can be used to avoid deep recursions for smob instances that form a list.
It must not call any libguile function or macro except scm_gc_mark, SCM_SMOB_FLAGS, SCM_SMOB_DATA, SCM_SMOB_DATA_2, and SCM_SMOB_DATA_3.
This function sets the smob printing procedure for the smob type specified by the tag tc. tc is the tag returned by scm_make_smob_type.
The print procedure should output a textual representation of the smob instance obj to port, using information in pstate.
The textual representation should be of the form #<name ...>. This ensures that read will not interpret it as some other Scheme value.
It is often best to ignore pstate and just print to port with scm_display, scm_write, scm_simple_format, and scm_puts.
This function sets the smob equality-testing predicate for the smob type specified by the tag tc. tc is the tag returned by scm_make_smob_type.
The equalp procedure should return SCM_BOOL_T when obj1 is equal? to obj2. Else it should return SCM_BOOL_F. Both obj1 and obj2 are instances of the smob type tc.
When val is a smob of the type indicated by tag, do nothing. Else, signal an error.
Return true iff exp is a smob instance of the type indicated by tag. The expression exp can be evaluated more than once, so it shouldn't contain any side effects.
Make value contain a smob instance of the type with tag tag and smob data data, data2, and data3, as appropriate.
The tag is what has been returned by scm_make_smob_type. The initial values data, data2, and data3 are of type scm_t_bits; when you want to use them for SCM values, these values need to be converted to a scm_t_bits first by using SCM_UNPACK.
The flags of the smob instance start out as zero.
Since it is often the case (e.g., in smob constructors) that you will create a smob instance and return it, there is also a slightly specialized macro for this situation:
This macro expands to a block of code that creates a smob instance of the type with tag tag and smob data data, data2, and data3, as with SCM_NEWSMOB, etc., and causes the surrounding function to return that SCM value. It should be the last piece of code in a block.
Return the 16 extra bits of the smob obj. No meaning is predefined for these bits; you can use them freely.
Set the 16 extra bits of the smob obj to flags. No meaning is predefined for these bits; you can use them freely.
Return the first (second, third) immediate word of the smob obj as a scm_t_bits value. When the word contains a SCM value, use SCM_SMOB_OBJECT (etc.) instead.
Set the first (second, third) immediate word of the smob obj to val. When the word should be set to a SCM value, use SCM_SMOB_SET_OBJECT (etc.) instead.
Return the first (second, third) immediate word of the smob obj as a SCM value. When the word contains a scm_t_bits value, use SCM_SMOB_DATA (etc.) instead.
Set the first (second, third) immediate word of the smob obj to val. When the word should be set to a scm_t_bits value, use SCM_SMOB_SET_DATA (etc.) instead.
Return a pointer to the first (second, third) immediate word of the smob obj. Note that this is a pointer to SCM. If you need to work with scm_t_bits values, use SCM_PACK and SCM_UNPACK, as appropriate.
Mark the references in the smob x, assuming that x's first data word contains an ordinary Scheme object, and x refers to no other objects. This function simply returns x's first data word.
A lambda expression evaluates to a procedure. The environment
which is in effect when a lambda expression is evaluated is
enclosed in the newly created procedure; this is referred to as a
closure (see About Closure).
When a procedure created by lambda is called with some actual
arguments, the environment enclosed in the procedure is extended by
binding the variables named in the formal argument list to new locations
and storing the actual arguments into these locations. Then the body of
the lambda expression is evaluated sequentially. The result of
the last expression in the procedure body is then the result of the
procedure invocation.
The following examples will show how procedures can be created using
lambda, and what you can do with these procedures.
(lambda (x) (+ x x)) ⇒ a procedure
((lambda (x) (+ x x)) 4) ⇒ 8
The fact that the environment in effect when creating a procedure is enclosed in the procedure is shown with this example:
(define add4
(let ((x 4))
(lambda (y) (+ x y))))
(add4 6) ⇒ 10
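A further sketch of the same idea, using a hypothetical make-counter procedure: each call evaluates the inner lambda in a fresh environment, so every counter it returns keeps its own private count.

```scheme
;; Hypothetical example: each call to make-counter creates a new
;; binding for count, which the returned procedure encloses.
(define (make-counter)
  (let ((count 0))
    (lambda ()
      (set! count (+ count 1))
      count)))

(define c1 (make-counter))
(define c2 (make-counter))

(c1)   ; ⇒ 1
(c1)   ; ⇒ 2
(c2)   ; ⇒ 1, c2 has its own count
```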
formals should be a formal argument list as described in the following table.
(variable1 ...)
    The procedure takes a fixed number of arguments; when the procedure is called, the arguments will be stored into the newly created location for the formal variables.
variable
    The procedure takes any number of arguments; when the procedure is called, the sequence of actual arguments will be converted into a list and stored into the newly created location for the formal variable.
(variable1 ... variablen . variablen+1)
    If a space-delimited period precedes the last variable, then the procedure takes n or more arguments where n is the number of formal arguments before the period. There must be at least one argument before the period. The first n actual arguments will be stored into the newly allocated locations for the first n formal arguments and the sequence of the remaining actual arguments is converted into a list and then stored into the location for the last formal argument. If there are exactly n actual arguments, the empty list is stored into the location of the last formal argument.
The list in variable or variablen+1 is always newly created and the procedure can modify it if desired. This is the case even when the procedure is invoked via apply; the required part of the list argument there will be copied (see Procedures for On the Fly Evaluation).
body is a sequence of Scheme expressions which are evaluated in order when the procedure is invoked.
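The three shapes of formals described above can be sketched side by side. The procedure names are illustrative only:

```scheme
;; Fixed number of arguments.
(define fixed (lambda (a b) (list a b)))

;; A single variable: all actual arguments are collected into a list.
(define any (lambda args args))

;; One required argument, with the remainder collected into rest.
(define at-least-one (lambda (first . rest) (list first rest)))

(fixed 1 2)              ; ⇒ (1 2)
(any)                    ; ⇒ ()
(any 1 2 3)              ; ⇒ (1 2 3)
(at-least-one 1)         ; ⇒ (1 ())
(at-least-one 1 2 3)     ; ⇒ (1 (2 3))
```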
Procedures written in C can be registered for use from Scheme,
provided they take only arguments of type SCM and return
SCM values. scm_c_define_gsubr is likely to be the most
useful mechanism, combining the process of registration
(scm_c_make_gsubr) and definition (scm_define).
Register a C procedure fcn as a “subr”, a primitive subroutine that can be called from Scheme. It will be associated with the given name but no environment binding will be created. The arguments req, opt and rst specify the number of required, optional and “rest” arguments respectively. The total number of these arguments should match the actual number of arguments to fcn, but may not exceed 10. The number of rest arguments should be 0 or 1.
scm_c_make_gsubr returns a value of type SCM which is a “handle” for the procedure.
Register a C procedure fcn, as for scm_c_make_gsubr above, and additionally create a top-level Scheme binding for the procedure in the “current environment” using scm_define.
scm_c_define_gsubr returns a handle for the procedure in the same way as scm_c_make_gsubr, which is usually not further required.
The evaluation strategy given in Lambda describes how procedures are interpreted. Interpretation operates directly on expanded Scheme source code, recursively calling the evaluator to obtain the value of nested expressions.
Most procedures are compiled, however. This means that Guile has done some pre-computation on the procedure, to determine what it will need to do each time the procedure runs. Compiled procedures run faster than interpreted procedures.
Loading files is the normal way that compiled procedures come into being. If Guile sees that a file is uncompiled, or that its compiled file is out of date, it will attempt to compile the file when it is loaded, and save the result to disk. Procedures can be compiled at runtime as well. See Read/Load/Eval/Compile, for more information on runtime compilation.
Compiled procedures, also known as programs, respond to all procedures that operate on procedures. In addition, there are a few more accessors for low-level details on programs.
Most people won't need to use the routines described in this section, but it's good to have them documented. You'll have to include the appropriate module first, though:
(use-modules (system vm program))
Returns #t iff obj is a compiled procedure.
Returns the object code associated with this program. See Bytecode and Objcode, for more information.
Returns the “object table” associated with this program, as a vector. See VM Programs, for more information.
Returns the module that was current when this program was created. Can return #f if the compiler could determine that this information was unnecessary.
Returns the set of free variables that this program captures in its closure, as a vector. If a closure is code with data, you can get the code from program-objcode, and the data via program-free-variables.
Some of the values captured are actually in variable “boxes”. See Variables and the VM, for more information.
Users must not modify the returned value unless they think they're really clever.
Return the metadata thunk of program, or #f if it has no metadata.
When called, a metadata thunk returns a list of the following form: (bindings sources arities . properties). The format of each of these elements is discussed below.
Bindings annotations for programs, along with their accessors.
Bindings declare names and liveness extents for block-local variables. The best way to see what these are is to play around with them at a REPL. See VM Concepts, for more information.
Note that bindings information is stored in a program as part of its metadata thunk, so including it in the generated object code does not impose a runtime performance penalty.
Source location annotations for programs, along with their accessors.
Source location information propagates through the compiler and ends up being serialized to the program's metadata. This information is keyed by the offset of the instruction pointer within the object code of the program. Specifically, it is keyed on the ip just following an instruction, so that backtraces can find the source location of a call that is in progress.
Accessors for a representation of the “arity” of a program.
The normal case is that a procedure has one arity. For example, (lambda (x) x) takes one required argument, and that's it. One could access that number of required arguments via (arity:nreq (program-arities (lambda (x) x))). Similarly, arity:nopt gets the number of optional arguments, and arity:rest? returns a true value if the procedure has a rest arg.
arity:kw returns a list of (kw . idx) pairs, if the procedure has keyword arguments. The idx refers to the idxth local variable; see Variables and the VM, for more information. Finally arity:allow-other-keys? returns a true value if other keys are allowed. See Optional Arguments, for more information.
So what about arity:start and arity:end, then? They return the range of bytes in the program's bytecode for which a given arity is valid. You see, a procedure can actually have more than one arity. The question, “what is a procedure's arity” only really makes sense at certain points in the program, delimited by these arity:start and arity:end values.
Scheme procedures, as defined in R5RS, can either handle a fixed number of actual arguments, or a fixed number of actual arguments followed by arbitrarily many additional arguments. Writing procedures of variable arity can be useful, but unfortunately, the syntactic means for handling argument lists of varying length are a bit inconvenient. It is possible to give names to the fixed number of arguments, but the remaining (optional) arguments can only be referenced as a list of values (see Lambda).
For this reason, Guile provides an extension to lambda,
lambda*, which allows the user to define procedures with
optional and keyword arguments. In addition, Guile's virtual machine
has low-level support for optional and keyword argument dispatch.
Calls to procedures with optional and keyword arguments can be made
cheaply, without allocating a rest list.
lambda* is like lambda, except with some extensions to
allow optional and keyword arguments.
Create a procedure which takes optional and/or keyword arguments specified with #:optional and #:key. For example,
(lambda* (a b #:optional c d . e) '())
is a procedure with fixed arguments a and b, optional arguments c and d, and rest argument e. If the optional arguments are omitted in a call, the variables for them are bound to #f.
Likewise, define* is syntactic sugar for defining procedures using lambda*.
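To make the binding of omitted optional arguments concrete, here is a sketch using define* with the same argument list as the lambda* example. The procedure name show is hypothetical, and its body simply returns the bound values:

```scheme
;; Hypothetical procedure: returns its arguments as a list so the
;; bindings can be inspected.
(define* (show a b #:optional c d . e)
  (list a b c d e))

(show 1 2)             ; ⇒ (1 2 #f #f ())
(show 1 2 3)           ; ⇒ (1 2 3 #f ())
(show 1 2 3 4 5 6)     ; ⇒ (1 2 3 4 (5 6))
```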
lambda* can also make procedures with keyword arguments. For example, a procedure defined like this:
(define* (sir-yes-sir #:key action how-high)
  (list action how-high))
can be called as (sir-yes-sir #:action 'jump), (sir-yes-sir #:how-high 13), (sir-yes-sir #:action 'lay-down #:how-high 0), or just (sir-yes-sir). Whichever arguments are given as keywords are bound to values (and those not given are #f).
Optional and keyword arguments can also have default values to take when not present in a call, by giving a two-element list of variable name and expression. For example in
(define* (frob foo #:optional (bar 42) #:key (baz 73))
  (list foo bar baz))
foo is a fixed argument, bar is an optional argument with default value 42, and baz is a keyword argument with default value 73. Default value expressions are not evaluated unless they are needed, and until the procedure is called.
Normally it's an error if a call has keywords other than those specified by #:key, but adding #:allow-other-keys to the definition (after the keyword argument declarations) will ignore unknown keywords.
If a call has a keyword given twice, the last value is used. For example,
(define* (flips #:key (heads 0) (tails 0))
  (display (list heads tails)))
(flips #:heads 37 #:tails 42 #:heads 99)
-| (99 42)
#:rest is a synonym for the dotted syntax rest argument. The argument lists (a . b) and (a #:rest b) are equivalent in all respects. This is provided for more similarity to DSSSL, MIT-Scheme and Kawa among others, as well as for refugees from other Lisp dialects.
When #:key is used together with a rest argument, the keyword parameters in a call all remain in the rest list. This is the same as Common Lisp. For example,
((lambda* (#:key (x 0) #:allow-other-keys #:rest r)
   (display r))
 #:x 123 #:y 456)
-| (#:x 123 #:y 456)
#:optional and #:key establish their bindings successively, from left to right. This means default expressions can refer back to prior parameters, for example
(lambda* (start #:optional (end (+ 10 start)))
  (do ((i start (1+ i)))
      ((> i end))
    (display i)))
The exception to this left-to-right scoping rule is the rest argument. If there is a rest argument, it is bound after the optional arguments, but before the keyword arguments.
Before Guile 2.0, lambda* and define* were implemented
using macros that processed rest list arguments. This was not optimal,
as calling procedures with optional arguments had to allocate rest
lists at every procedure invocation. Guile 2.0 improved this
situation by bringing optional and keyword arguments into Guile's
core.
However there are occasions in which you have a list and want to parse
it for optional or keyword arguments. Guile's (ice-9 optargs)
provides some macros to help with that task.
The syntactic forms let-optional and let-optional* are for
destructuring rest argument lists and giving names to the various list
elements. let-optional binds all variables simultaneously, while
let-optional* binds them sequentially, consistent with let
and let* (see Local Bindings).
These two macros give you an optional argument interface that is very Schemey and introduces no fancy syntax. They are compatible with the scsh macros of the same name, but are slightly extended. Each binding may be of one of the forms var or (var default-value). rest-arg should be the rest argument of the procedure these are used from. The items in rest-arg are sequentially bound to the variable names given. When rest-arg runs out, the remaining vars are bound either to the default values or to #f if no default value was specified. rest-arg remains bound to whatever may have been left of rest-arg.
After binding the variables, the expressions expr ... are evaluated in order.
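As a minimal sketch of the behavior described above, the hypothetical procedure range-spec below destructures its rest list with let-optional*. Because let-optional* binds sequentially, the default for end may refer to start. The procedure name and its defaults are assumptions made for illustration.

```scheme
(use-modules (ice-9 optargs))

;; range-spec is a hypothetical helper; it returns (start end step).
(define (range-spec . rest)
  (let-optional* rest ((start 0)
                       (end (+ start 10))  ; sees the binding of start
                       (step 1))
    (list start end step)))

(range-spec)        ; ⇒ (0 10 1)
(range-spec 5)      ; ⇒ (5 15 1)
(range-spec 5 8 2)  ; ⇒ (5 8 2)
```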
Similarly, let-keywords and let-keywords* extract values
from keyword style argument lists, binding local variables to those
values or to defaults.
args is evaluated and should give a list of the form (#:keyword1 value1 #:keyword2 value2 ...). The bindings are variables and default expressions, with the variables to be set (by name) from the keyword values. The body forms are then evaluated and the last is the result. An example will make the syntax clearest,
(define args '(#:xyzzy "hello" #:foo "world"))
(let-keywords args #t
  ((foo "default for foo")
   (bar (string-append "default" "for" "bar")))
  (display foo)
  (display ", ")
  (display bar))
-| world, defaultforbar
The binding for foo comes from the #:foo keyword in args. But the binding for bar is the default in the let-keywords, since there's no #:bar in the args.
allow-other-keys? is evaluated and controls whether unknown keywords are allowed in the args list. When true other keys are ignored (such as #:xyzzy in the example), when #f an error is thrown for anything unknown.
(ice-9 optargs) also provides some more define* sugar,
which is not so useful with modern Guile coding, but still supported:
define*-public is the lambda* version of
define-public; defmacro* and defmacro*-public
exist for defining macros with the improved argument list handling
possibilities. The -public versions not only define the
procedures/macros, but also export them from the current module.
These are just like defmacro and defmacro-public except that they take lambda*-style extended parameter lists, where #:optional, #:key, #:allow-other-keys and #:rest are allowed with the usual semantics. Here is an example of a macro with an optional argument:
(defmacro* transmogrify (a #:optional b)
  (a 1))
R5RS's rest arguments are indeed useful and very general, but they
often aren't the most appropriate or efficient means to get the job
done. For example, lambda* is a much better solution to the
optional argument problem than lambda with rest arguments.
Likewise, case-lambda works well for when you want one
procedure to do double duty (or triple, or ...), without the penalty
of consing a rest list.
For example:
(define (make-accum n)
(case-lambda
(() n)
((m) (set! n (+ n m)) n)))
(define a (make-accum 20))
(a) ⇒ 20
(a 10) ⇒ 30
(a) ⇒ 30
The value returned by a case-lambda form is a procedure which
matches the number of actual arguments against the formals in the
various clauses, in order. The first matching clause is selected, the
corresponding values from the actual parameter list are bound to the
variable names in the clauses and the body of the clause is evaluated.
If no clause matches, an error is signalled.
The syntax of the case-lambda form is defined in the following
EBNF grammar. Formals means a formal argument list just like
with lambda (see Lambda).
<case-lambda>
--> (case-lambda <case-lambda-clause>*)
<case-lambda-clause>
--> (<formals> <definition-or-command>*)
<formals>
--> (<identifier>*)
| (<identifier>* . <identifier>)
| <identifier>
Rest lists can be useful with case-lambda:
(define plus
(case-lambda
(() 0)
((a) a)
((a b) (+ a b))
((a b . rest) (apply plus (+ a b) rest))))
(plus 1 2 3) ⇒ 6
Also, for completeness, Guile defines case-lambda* as well,
which is like case-lambda, except with lambda* clauses.
A case-lambda* clause matches if the arguments fill the
required arguments, but are not too many for the optional and/or rest
arguments.
Keyword arguments are possible with case-lambda*, but they do
not contribute to the “matching” behavior. That is to say,
case-lambda* matches only on required, optional, and rest
arguments, and on the predicate; keyword arguments may be present but
do not contribute to the “success” of a match. In fact a bad keyword
argument list may cause an error to be raised.
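The matching rule can be illustrated with a small sketch. The procedure describe is hypothetical; each clause reports which clause was selected, so the effect of required, optional and rest arguments on matching is visible.

```scheme
;; describe is a hypothetical procedure used only to show clause selection.
(define describe
  (case-lambda*
    ((x) (list 'one x))
    ((x y #:optional (z 0)) (list 'two-or-three x y z))
    ((x . rest) (list 'many x rest))))

(describe 1)        ; ⇒ (one 1)
(describe 1 2)      ; ⇒ (two-or-three 1 2 0)
(describe 1 2 3)    ; ⇒ (two-or-three 1 2 3)
(describe 1 2 3 4)  ; ⇒ (many 1 (2 3 4))
```

Four arguments are too many for the second clause, so matching falls through to the clause with the rest argument.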
As a functional programming language, Scheme allows the definition of higher-order functions, i.e., functions that take functions as arguments and/or return functions. Utilities to derive procedures from other procedures are provided and described below.
Return a procedure that accepts any number of arguments and returns value.
(procedure? (const 3)) ⇒ #t
((const 'hello)) ⇒ hello
((const 'hello) 'world) ⇒ hello
Return a procedure with the same arity as proc that returns the not of proc's result.
(procedure? (negate number?)) ⇒ #t
((negate odd?) 2) ⇒ #t
((negate real?) 'dream) ⇒ #t
((negate string-prefix?) "GNU" "GNU Guile") ⇒ #f
(filter (negate number?) '(a 2 "b")) ⇒ (a "b")
Compose proc with the procedures in rest, such that the last one in rest is applied first and proc last, and return the resulting procedure. The given procedures must have compatible arity.
(procedure? (compose 1+ 1-)) ⇒ #t
((compose sqrt 1+ 1+) 2) ⇒ 2.0
((compose 1+ sqrt) 3) ⇒ 2.73205080756888
(eq? (compose 1+) 1+) ⇒ #t
((compose zip unzip2) '((1 2) (a b))) ⇒ ((1 2) (a b))
In addition to the information that is strictly necessary to run, procedures may have other associated information. For example, the name of a procedure is information not for the procedure, but about the procedure. This meta-information can be accessed via the procedure properties interface.
The first group of procedures in this meta-interface are predicates to
test whether a Scheme object is a procedure, or a special procedure,
respectively. procedure? is the most general predicate: it
returns #t for any kind of procedure. closure? does not
return #t for primitive procedures, and thunk? only
returns #t for procedures which do not accept any arguments.
Return #t if obj is a procedure.
Procedure properties are general properties associated with procedures. These can be the name of a procedure or other relevant information, such as debug hints.
Return the name of the procedure proc.
Return the source of the procedure proc. Returns #f if the source code is not available.
Return the properties associated with proc, as an association list.
Return the property of proc with name key.
Set proc's property list to alist.
In proc's property list, set the property named key to value.
Documentation for a procedure can be accessed with the procedure
procedure-documentation.
Return the documentation string associated with
proc. By convention, if a procedure contains more than one expression and the first expression is a string constant, that string is assumed to contain documentation for that procedure.
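A short sketch of the property interface follows. The procedure square and the property key reviewed are made up for this example; only the accessor names come from the descriptions above.

```scheme
;; square and the 'reviewed key are illustrative assumptions.
(define (square x)
  "Return the square of X."
  (* x x))

(procedure-name square)           ; ⇒ square
(procedure-documentation square)  ; ⇒ "Return the square of X."

;; Attach an arbitrary property and read it back.
(set-procedure-property! square 'reviewed #t)
(procedure-property square 'reviewed)  ; ⇒ #t
```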
A procedure with setter is a special kind of procedure which normally behaves like any accessor procedure, that is a procedure which accesses a data structure. The difference is that this kind of procedure has a so-called setter attached, which is a procedure for storing something into a data structure.
Procedures with setters are treated specially when the procedure appears
in the special form set!. How it works is best shown
by example.
Suppose we have a procedure called foo-ref, which accepts two
arguments, a value of type foo and an integer. The procedure
returns the value stored at the given index in the foo object.
Let f be a variable containing such a foo data
structure.
(foo-ref f 0) ⇒ bar
(foo-ref f 1) ⇒ braz
Also suppose that a corresponding setter procedure called
foo-set! does exist.
(foo-set! f 0 'bla)
(foo-ref f 0) ⇒ bla
Now we could create a new procedure called foo, which is a
procedure with setter, by calling make-procedure-with-setter with
the accessor and setter procedures foo-ref and foo-set!.
Let us call this new procedure foo.
(define foo (make-procedure-with-setter foo-ref foo-set!))
foo can from now on be used to either read from the data
structure stored in f, or to write into the structure.
(set! (foo f 0) 'dum)
(foo f 0) ⇒ dum
Create a new procedure which behaves like procedure, but with the associated setter setter.
Return #t if obj is a procedure with an associated setter procedure.
Return the procedure of proc, which must be an applicable struct.
Return the setter of proc, which must be either a procedure with setter or an operator struct.
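Since foo-ref and foo-set! above are placeholders, here is a runnable sketch of the same idea built on the standard vector-ref and vector-set!. The names vref and v are assumptions chosen for the example.

```scheme
;; vref pairs an accessor with its setter; the name is illustrative.
(define vref (make-procedure-with-setter vector-ref vector-set!))
(define v (vector 'a 'b 'c))

(vref v 1)                  ; ⇒ b
(set! (vref v 1) 'z)        ; goes through the attached setter
(vref v 1)                  ; ⇒ z

(procedure-with-setter? vref)        ; ⇒ #t
(procedure-with-setter? vector-ref)  ; ⇒ #f

;; The setter can also be fetched and applied directly.
((setter vref) v 0 'y)
(vref v 0)                  ; ⇒ y
```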
You can define an inlinable procedure by using
define-inlinable instead of define. An inlinable
procedure behaves the same as a regular procedure, but direct calls will
result in the procedure body being inlined into the caller.
Bear in mind that starting from version 2.0.3, Guile has a partial evaluator that can inline the body of inner procedures when deemed appropriate:
scheme@(guile-user)> ,optimize (define (foo x)
(define (bar) (+ x 3))
(* (bar) 2))
$1 = (define foo
(lambda (#{x 94}#) (* (+ #{x 94}# 3) 2)))
The partial evaluator does not inline top-level bindings, though, so
this is a situation where you may find it interesting to use
define-inlinable.
Procedures defined with define-inlinable are always
inlined, at all direct call sites. This eliminates function call
overhead at the expense of an increase in code size. Additionally, the
caller will not transparently use the new definition if the inline
procedure is redefined. It is not possible to trace an inlined
procedure or install a breakpoint in it (see Traps). For these
reasons, you should not make a procedure inlinable unless it
demonstrably improves performance in a crucial way.
In general, only small procedures should be considered for inlining, as making large procedures inlinable will probably result in an increase in code size. Additionally, the elimination of the call overhead rarely matters for large procedures.
Define name as a procedure with parameters parameters and body body.
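A minimal sketch of define-inlinable in use is shown below. The procedures square and sum-of-squares are hypothetical; the point is that an inlinable procedure is called like any other and can still be passed around as a value.

```scheme
;; square is small, so it is a plausible candidate for inlining.
(define-inlinable (square x)
  (* x x))

;; Direct calls such as these are the ones that get inlined.
(define (sum-of-squares a b)
  (+ (square a) (square b)))

(sum-of-squares 3 4)     ; ⇒ 25
(map square '(1 2 3))    ; ⇒ (1 4 9)
```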
At its best, programming in Lisp is an iterative process of building up a language appropriate to the problem at hand, and then solving the problem in that language. Defining new procedures is part of that, but Lisp also allows the user to extend its syntax, with its famous macros.
Macros are syntactic extensions which cause the expression that they appear in to be transformed in some way before being evaluated. In expressions that are intended for macro transformation, the identifier that names the relevant macro must appear as the first element, like this:
(macro-name macro-args ...)
Macro expansion is a separate phase of evaluation, run before code is interpreted or compiled. A macro is a program that runs on programs, translating an embedded language into core Scheme.
A macro is a binding between a keyword and a syntax transformer. Since it's
difficult to discuss define-syntax without discussing the format of
transformers, consider the following example macro definition:
(define-syntax when
(syntax-rules ()
((when condition exp ...)
(if condition
(begin exp ...)))))
(when #t
(display "hey ho\n")
(display "let's go\n"))
-| hey ho
-| let's go
In this example, the when binding is bound with define-syntax.
Syntax transformers are discussed in more depth in Syntax Rules and
Syntax Case.
Bind keyword to the syntax transformer obtained by evaluating transformer.
After a macro has been defined, further instances of keyword in Scheme source code will invoke the syntax transformer defined by transformer.
One can also establish local syntactic bindings with let-syntax.
Bind keyword... to transformer... while expanding exp....
A let-syntax binding only exists at expansion-time.
(let-syntax ((unless
              (syntax-rules ()
                ((unless condition exp ...)
                 (if (not condition)
                     (begin exp ...))))))
  (unless #t
    (primitive-exit 1))
  "rock rock rock")
⇒ "rock rock rock"
A define-syntax form is valid anywhere a definition may appear: at the
top-level, or locally. Just as a local define expands out to an instance
of letrec, a local define-syntax expands out to
letrec-syntax.
Bind keyword... to transformer... while expanding exp....
In the spirit of letrec versus let, an expansion produced by transformer may reference a keyword bound by the same letrec-syntax.
(letrec-syntax ((my-or
                 (syntax-rules ()
                   ((my-or)
                    #t)
                   ((my-or exp)
                    exp)
                   ((my-or exp rest ...)
                    (let ((t exp))
                      (if t
                          t
                          (my-or rest ...)))))))
  (my-or #f "rockaway beach"))
⇒ "rockaway beach"
syntax-rules macros are simple, pattern-driven syntax transformers, with
a beauty worthy of Scheme.
Create a syntax transformer that will rewrite an expression using the rules embodied in the pattern and template clauses.
A syntax-rules macro consists of three parts: the literals (if any), the
patterns, and as many templates as there are patterns.
When the syntax expander sees the invocation of a syntax-rules macro, it
matches the expression against the patterns, in order, and rewrites the
expression using the template from the first matching pattern. If no pattern
matches, a syntax error is signalled.
We have already seen some examples of patterns in the previous section:
(unless condition exp ...), (my-or exp), and so on. A pattern is
structured like the expression that it is to match. It can have nested structure
as well, like (let ((var val) ...) exp exp* ...). Broadly speaking,
patterns are made of lists, improper lists, vectors, identifiers, and datums.
Users can match a sequence of patterns using the ellipsis (...).
Identifiers in a pattern are called literals if they are present in the
syntax-rules literals list, and pattern variables otherwise. When
building up the macro output, the expander replaces instances of a pattern
variable in the template with the matched subexpression.
(define-syntax kwote
(syntax-rules ()
((kwote exp)
(quote exp))))
(kwote (foo . bar))
⇒ (foo . bar)
An improper list of patterns matches as rest arguments do:
(define-syntax let1
(syntax-rules ()
((_ (var val) . exps)
(let ((var val)) . exps))))
However this definition of let1 probably isn't what you want, as the tail
pattern exps will match non-lists, like (let1 (foo 'bar) . baz). So
often instead of using improper lists as patterns, ellipsized patterns are
better. Instances of an ellipsized pattern variable in the template must also be followed by an
ellipsis.
(define-syntax let1
(syntax-rules ()
((_ (var val) exp ...)
(let ((var val)) exp ...))))
This let1 probably still doesn't do what we want, because the body
matches sequences of zero expressions, like (let1 (foo 'bar)). In this
case we need to assert we have at least one body expression. A common idiom for
this is to name the ellipsized pattern variable with an asterisk:
(define-syntax let1
(syntax-rules ()
((_ (var val) exp exp* ...)
(let ((var val)) exp exp* ...))))
A vector of patterns matches a vector whose contents match the patterns, including ellipsizing and tail patterns.
(define-syntax letv
(syntax-rules ()
((_ #((var val) ...) exp exp* ...)
(let ((var val) ...) exp exp* ...))))
(letv #((foo 'bar)) foo)
⇒ bar
Literals are used to match specific datums in an expression, like the use of
=> and else in cond expressions.
(define-syntax cond1
(syntax-rules (=> else)
((cond1 test => fun)
(let ((exp test))
(if exp (fun exp) #f)))
;; the else clause comes first; otherwise the test pattern would also match else
((cond1 else exp exp* ...)
(begin exp exp* ...))
((cond1 test exp exp* ...)
(if test (begin exp exp* ...)))))
(define (square x) (* x x))
(cond1 10 => square)
⇒ 100
(let ((=> #t))
(cond1 10 => square))
⇒ #<procedure square (x)>
A literal matches an input expression if the input expression is an identifier with the same name as the literal, and both are unbound.
If a pattern is not a list, vector, or an identifier, it matches as a literal,
with equal?.
(define-syntax define-matcher-macro
(syntax-rules ()
((_ name lit)
(define-syntax name
(syntax-rules ()
((_ lit) #t)
((_ else) #f))))))
(define-matcher-macro is-literal-foo? "foo")
(is-literal-foo? "foo")
⇒ #t
(is-literal-foo? "bar")
⇒ #f
(let ((foo "foo"))
(is-literal-foo? foo))
⇒ #f
The last example indicates that matching happens at expansion-time, not at run-time.
Syntax-rules macros are always used as (macro . args), and
the macro will always be a symbol. Correspondingly, a syntax-rules
pattern must be a list (proper or improper), and the first pattern in that list
must be an identifier. Incidentally it can be any identifier – it doesn't have
to actually be the name of the macro. Thus the following three are equivalent:
(define-syntax when
(syntax-rules ()
((when c e ...)
(if c (begin e ...)))))
(define-syntax when
(syntax-rules ()
((_ c e ...)
(if c (begin e ...)))))
(define-syntax when
(syntax-rules ()
((something-else-entirely c e ...)
(if c (begin e ...)))))
For clarity, use one of the first two variants. Also note that since the pattern
variable will always match the macro itself (e.g., cond1), it is actually
left unbound in the template.
syntax-rules macros have a magical property: they preserve referential
transparency. When you read a macro definition, any free bindings in that macro
are resolved relative to the macro definition; and when you read a macro
instantiation, all free bindings in that expression are resolved relative to the
expression.
This property is sometimes known as hygiene, and it does aid in code cleanliness. In your macro definitions, you can feel free to introduce temporary variables, without worrying about inadvertently introducing bindings into the macro expansion.
Consider the definition of my-or from the previous section:
(define-syntax my-or
  (syntax-rules ()
    ((my-or)
     #t)
    ((my-or exp)
     exp)
    ((my-or exp rest ...)
     (let ((t exp))
       (if t
           t
           (my-or rest ...))))))
A naive expansion of (let ((t #t)) (my-or #f t)) would yield:
(let ((t #t))
(let ((t #f))
(if t t t)))
⇒ #f
Which clearly is not what we want. Somehow the t in the definition is
distinct from the t at the site of use; and it is indeed this distinction
that is maintained by the syntax expander, when expanding hygienic macros.
This discussion is mostly relevant in the context of traditional Lisp macros (see Defmacros), which do not preserve referential transparency. Hygiene adds to the expressive power of Scheme.
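Hygiene can be observed directly with a small sketch. The macro swap! below is a hypothetical example, not something defined by Guile; it introduces a temporary named tmp, and the use site deliberately has its own variable called tmp.

```scheme
;; swap! is an illustrative macro; its internal tmp is renamed by the expander.
(define-syntax swap!
  (syntax-rules ()
    ((_ a b)
     (let ((tmp a))
       (set! a b)
       (set! b tmp)))))

;; The user's tmp does not collide with the macro's tmp.
(let ((tmp 1) (other 2))
  (swap! tmp other)
  (list tmp other))   ; ⇒ (2 1)
```

With a non-hygienic expansion, the macro's tmp would capture the user's tmp and the swap would silently fail.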
One often ends up writing simple one-clause syntax-rules macros.
There is a convenient shorthand for this idiom, in the form of
define-syntax-rule.
Define keyword as a new
syntax-rulesmacro with one clause.
Cast into this form, our when example is significantly shorter:
(define-syntax-rule (when c e ...)
(if c (begin e ...)))
For a formal definition of syntax-rules and its pattern language, see
Macros.
syntax-rules macros are simple and clean, but they do have limitations.
They do not lend themselves to expressive error messages: patterns either match
or they don't. Their ability to generate code is limited to template-driven
expansion; often one needs to define a number of helper macros to get real work
done. Sometimes one wants to introduce a binding into the lexical context of the
generated code; this is impossible with syntax-rules. Relatedly, they
cannot programmatically generate identifiers.
The solution to all of these problems is to use syntax-case if you need
its features. But if for some reason you're stuck with syntax-rules, you
might enjoy Joe Marshall's
syntax-rules Primer for the Merely Eccentric.
syntax-case macros are procedural syntax transformers, with a power
worthy of Scheme.
Match the syntax object syntax against the given patterns, in order. If a pattern matches, return the result of evaluating the associated exp.
Compare the following definitions of when:
(define-syntax when
(syntax-rules ()
((_ test e e* ...)
(if test (begin e e* ...)))))
(define-syntax when
(lambda (x)
(syntax-case x ()
((_ test e e* ...)
#'(if test (begin e e* ...))))))
Clearly, the syntax-case definition is similar to its syntax-rules
counterpart, and equally clearly there are some differences. The
syntax-case definition is wrapped in a lambda, a function of one
argument; that argument is passed to the syntax-case invocation; and the
“return value” of the macro has a #' prefix.
All of these differences stem from the fact that syntax-case does not
define a syntax transformer itself – instead, syntax-case expressions
provide a way to destructure a syntax object, and to rebuild syntax
objects as output.
So the lambda wrapper is simply a leaky implementation detail, that
syntax transformers are just functions that transform syntax to syntax. This
should not be surprising, given that we have already described macros as
“programs that write programs”. syntax-case is simply a way to take
apart and put together program text, and to be a valid syntax transformer it
needs to be wrapped in a procedure.
Unlike traditional Lisp macros (see Defmacros), syntax-case macros
transform syntax objects, not raw Scheme forms. Recall the naive expansion of
my-or given in the previous section:
(let ((t #t))
(my-or #f t))
;; naive expansion:
(let ((t #t))
(let ((t #f))
(if t t t)))
Raw Scheme forms simply don't have enough information to distinguish the first
two t instances in (if t t t) from the third t. So instead
of representing identifiers as symbols, the syntax expander represents
identifiers as annotated syntax objects, attaching such information to those
syntax objects as is needed to maintain referential transparency.
Syntax objects are typically created internally to the process of expansion, but it is possible to create them outside of syntax expansion:
(syntax (foo bar baz))
⇒ #<some representation of that syntax>
However it is more common, and useful, to create syntax objects when building
output from a syntax-case expression.
(define-syntax add1
(lambda (x)
(syntax-case x ()
((_ exp)
(syntax (+ exp 1))))))
It is not strictly necessary for a syntax-case expression to return a
syntax object, because syntax-case expressions can be used in helper
functions, or otherwise used outside of syntax expansion itself. However a
syntax transformer procedure must return a syntax object, so most uses of
syntax-case do end up returning syntax objects.
Here in this case, the form that built the return value was (syntax (+ exp
1)). The interesting thing about this is that within a syntax
expression, any appearance of a pattern variable is substituted into the
resulting syntax object, carrying with it all relevant metadata from the source
expression, such as lexical identity and source location.
Indeed, a pattern variable may only be referenced from inside a syntax
form. The syntax expander would raise an error when defining add1 if it
found exp referenced outside a syntax form.
Since syntax appears frequently in macro-heavy code, it has a special
reader macro: #'. #'foo is transformed by the reader into
(syntax foo), just as 'foo is transformed into (quote foo).
The pattern language used by syntax-case is conveniently the same
language used by syntax-rules. Given this, Guile actually defines
syntax-rules in terms of syntax-case:
(define-syntax syntax-rules
(lambda (x)
(syntax-case x ()
((_ (k ...) ((keyword . pattern) template) ...)
#'(lambda (x)
(syntax-case x (k ...)
((dummy . pattern) #'template)
...))))))
And that's that.
The examples we have shown thus far could just as well have been expressed with
syntax-rules, and have just shown that syntax-case is more
verbose, which is true. But there is a difference: syntax-case creates
procedural macros, giving the full power of Scheme to the macro expander.
This has many practical applications.
A common desire is to be able to match a form only if it is an identifier. This
is impossible with syntax-rules, given the datum matching forms. But with
syntax-case it is easy:
;; relying on previous add1 definition
(define-syntax add1!
(lambda (x)
(syntax-case x ()
((_ var) (identifier? #'var)
#'(set! var (add1 var))))))
(define foo 0)
(add1! foo)
foo ⇒ 1
(add1! "not-an-identifier") ⇒ error
With syntax-rules, the error for (add1! "not-an-identifier") would
be something like “invalid set!”. With syntax-case, it will say
something like “invalid add1!”, because we attach the guard
clause to the pattern: (identifier? #'var). This becomes more important
with more complicated macros. It is necessary to use identifier?, because
to the expander, an identifier is more than a bare symbol.
Note that even in the guard clause, we reference the var pattern variable
within a syntax form, via #'var.
Another common desire is to introduce bindings into the lexical context of the
output expression. One example would be in the so-called “anaphoric macros”,
like aif. Anaphoric macros bind some expression to a well-known
identifier, often it, within their bodies. For example, in (aif
(foo) (bar it)), it would be bound to the result of (foo).
To begin with, we should mention a solution that doesn't work:
;; doesn't work
(define-syntax aif
(lambda (x)
(syntax-case x ()
((_ test then else)
#'(let ((it test))
(if it then else))))))
The reason that this doesn't work is that, by default, the expander will
preserve referential transparency; the then and else expressions
won't have access to the binding of it.
But they can, if we explicitly introduce a binding via datum->syntax.
Create a syntax object that wraps datum, within the lexical context corresponding to the syntax object for-syntax.
For completeness, we should mention that it is possible to strip the metadata from a syntax object, returning a raw Scheme datum:
Strip the metadata from syntax-object, returning its contents as a raw Scheme datum.
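The two conversions can be tried out as follows. This is only a sketch: the macro static-sum is hypothetical, and it assumes its operands are literal numbers so that they can be added at expansion time.

```scheme
;; Round trips between data and syntax objects.
(syntax->datum #'(foo bar baz))                  ; ⇒ (foo bar baz)
(syntax->datum (datum->syntax #'here '(+ 1 2)))  ; ⇒ (+ 1 2)

;; static-sum is an illustrative macro: it strips its operands to raw
;; numbers, adds them during expansion, and wraps the result as syntax.
(define-syntax static-sum
  (lambda (x)
    (syntax-case x ()
      ((_ n ...)
       (datum->syntax x (apply + (syntax->datum #'(n ...))))))))

(static-sum 1 2 3)   ; ⇒ 6
```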
In this case we want to introduce it in the context of the whole
expression, so we can create a syntax object as (datum->syntax x 'it),
where x is the whole expression, as passed to the transformer procedure.
Here's another solution that doesn't work:
;; doesn't work either
(define-syntax aif
(lambda (x)
(syntax-case x ()
((_ test then else)
(let ((it (datum->syntax x 'it)))
#'(let ((it test))
(if it then else)))))))
The reason that this one doesn't work is that there are really two
environments at work here – the environment of pattern variables, as
bound by syntax-case, and the environment of lexical variables,
as bound by normal Scheme. The outer let form establishes a binding in
the environment of lexical variables, but the inner let form is inside a
syntax form, where only pattern variables will be substituted. Here we
need to introduce a piece of the lexical environment into the pattern
variable environment, and we can do so using syntax-case itself:
;; works, but is obtuse
(define-syntax aif
(lambda (x)
(syntax-case x ()
((_ test then else)
;; invoking syntax-case on the generated
;; syntax object to expose it to `syntax'
(syntax-case (datum->syntax x 'it) ()
(it
#'(let ((it test))
(if it then else))))))))
(aif (getuid) (display it) (display "none")) (newline)
-| 500
However there are easier ways to write this. with-syntax is often
convenient:
Bind patterns pat from their corresponding values val, within the lexical context of exp....
;; better
(define-syntax aif
  (lambda (x)
    (syntax-case x ()
      ((_ test then else)
       (with-syntax ((it (datum->syntax x 'it)))
         #'(let ((it test))
             (if it then else)))))))
As you might imagine, with-syntax is defined in terms of
syntax-case. But even that might be off-putting to you if you are an old
Lisp macro hacker, used to building macro output with quasiquote. The
issue is that with-syntax creates a separation between the point of
definition of a value and its point of substitution.
So for cases in which a quasiquote style makes more sense,
syntax-case also defines quasisyntax, and the related
unsyntax and unsyntax-splicing, abbreviated by the reader as
#`, #,, and #,@, respectively.
For example, to define a macro that inserts a compile-time timestamp into a source file, one may write:
(define-syntax display-compile-timestamp
(lambda (x)
(syntax-case x ()
((_)
#`(begin
(display "The compile timestamp was: ")
(display #,(current-time))
(newline))))))
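The splicing form can be sketched in the same style. The macro reversed-list is hypothetical; it uses #,@ to splice a list of syntax objects, computed with ordinary Scheme code, into a quasisyntax template.

```scheme
;; reversed-list is an illustrative macro. #'(e ...) yields a list of
;; syntax objects, which reverse can process like any other list.
(define-syntax reversed-list
  (lambda (x)
    (syntax-case x ()
      ((_ e ...)
       #`(list #,@(reverse #'(e ...)))))))

(reversed-list 1 2 3)   ; ⇒ (3 2 1)
```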
Readers interested in further information on syntax-case macros should
see R. Kent Dybvig's excellent The Scheme Programming Language, either
edition 3 or 4, in the chapter on syntax. Dybvig was the primary author of the
syntax-case system. The book itself is available online at
http://scheme.com/tspl4/.
As noted in the previous section, Guile's syntax expander operates on syntax objects. Procedural macros consume and produce syntax objects. This section describes some of the auxiliary helpers that procedural macros can use to compare, generate, and query objects of this data type.
Return #t iff the syntax objects a and b refer to the same lexically-bound identifier.
Return #t iff the syntax objects a and b refer to the same free identifier.
Return a list of temporary identifiers as long as ls is long.
Return the source properties that correspond to the syntax object x. See Source Properties, for more information.
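As one possible use of these helpers, the sketch below builds a parallel assignment macro with generate-temporaries. The macro name pset! is an assumption made for the example; each value is first bound to a fresh temporary, so that all values are computed before any variable is assigned.

```scheme
;; pset! is an illustrative macro, not part of Guile.
(define-syntax pset!
  (lambda (x)
    (syntax-case x ()
      ((_ (var val) ...)
       ;; one fresh identifier per variable
       (with-syntax (((tmp ...) (generate-temporaries #'(var ...))))
         #'(let ((tmp val) ...)
             (set! var tmp) ...))))))

(let ((a 1) (b 2))
  (pset! (a b) (b a))
  (list a b))               ; ⇒ (2 1)

(free-identifier=? #'car #'car)   ; ⇒ #t
```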
Guile also offers some more experimental interfaces in a separate module. As was the case with the Large Hadron Collider, it is unclear to our senior macrologists whether adding these interfaces will result in awesomeness or in the destruction of Guile via the creation of a singularity. We will preserve their functionality through the 2.0 series, but we reserve the right to modify them in a future stable series, to a more than usual degree.
(use-modules (system syntax))
Return the name of the module whose source contains the identifier id.
Resolve the identifer id, a syntax object, within the current lexical environment, and return two values, the binding type and a binding value. The binding type is a symbol, which may be one of the following:
lexical- A lexically-bound variable. The value is a unique token (in the sense of eq?) identifying this binding.
macro- A syntax transformer, either local or global. The value is the transformer procedure.
pattern-variable- A pattern variable, bound via syntax-case. The value is an opaque object, internal to the expander.
displaced-lexical- A lexical variable that has gone out of scope. This can happen if a badly-written procedural macro saves a syntax object, then attempts to introduce it in a context in which it is unbound. The value is #f.
global- A global binding. The value is a pair, whose head is the symbol, and whose tail is the name of the module in which to resolve the symbol.
other- Some other binding, like lambda or other core bindings. The value is #f.
This is a very low-level procedure, with limited uses. One case in which it is useful is to build abstractions that associate auxiliary information with macros:
(define aux-property (make-object-property))
(define-syntax-rule (with-aux aux value)
  (let ((trans value))
    (set! (aux-property trans) aux)
    trans))
(define-syntax retrieve-aux
  (lambda (x)
    (syntax-case x ()
      ((x id)
       (call-with-values (lambda () (syntax-local-binding #'id))
         (lambda (type val)
           (with-syntax ((aux (datum->syntax #'here
                                             (and (eq? type 'macro)
                                                  (aux-property val)))))
             #''aux)))))))
(define-syntax foo
  (with-aux 'bar
    (syntax-rules () ((_) 'foo))))
(foo)
⇒ foo
(retrieve-aux foo)
⇒ bar
syntax-local-binding must be called within the dynamic extent of a syntax transformer; to call it otherwise will signal an error.
Return a list of identifiers that were visible lexically when the identifier id was created, in order from outermost to innermost.
This procedure is intended to be used in specialized procedural macros, to provide a macro with the set of bound identifiers that the macro can reference.
As a technical implementation detail, the identifiers returned by syntax-locally-bound-identifiers will be anti-marked, like the syntax object that is given as input to a macro. This is to signal to the macro expander that these bindings were present in the original source, and do not need to be hygienically renamed, as would be the case with other introduced identifiers. See the discussion of hygiene in section 12.1 of the R6RS, for more information on marks.
(define (local-lexicals id)
  (filter (lambda (x)
            (eq? (syntax-local-binding x) 'lexical))
          (syntax-locally-bound-identifiers id)))
(define-syntax lexicals
  (lambda (x)
    (syntax-case x ()
      ((lexicals) #'(lexicals lexicals))
      ((lexicals scope)
       (with-syntax (((id ...) (local-lexicals #'scope)))
         #'(list (cons 'id id) ...))))))
(let* ((x 10) (x 20)) (lexicals))
⇒ ((x . 10) (x . 20))
The traditional way to define macros in Lisp is very similar to procedure
definitions. The key differences are that the macro definition body should
return a list that describes the transformed expression, and that the definition
is marked as a macro definition (rather than a procedure definition) by the use
of a different definition keyword: in Lisp, defmacro rather than
defun, and in Scheme, define-macro rather than define.
Guile supports this style of macro definition using both defmacro
and define-macro. The only difference between them is how the
macro name and arguments are grouped together in the definition:
(defmacro name (args ...) body ...)
is the same as
(define-macro (name args ...) body ...)
The difference is analogous to the corresponding difference between
Lisp's defun and Scheme's define.
Having read the previous section on syntax-case, it's probably clear that
Guile actually implements defmacros in terms of syntax-case, applying the
transformer on the expression between invocations of syntax->datum and
datum->syntax. This realization leads us to the problem with defmacros,
that they do not preserve referential transparency. One can be careful to not
introduce bindings into expanded code, via liberal use of gensym, but
there is no getting around the lack of referential transparency for free
bindings in the macro itself.
Even a macro as simple as our when from before is difficult to get right:
(define-macro (when cond exp . rest)
  `(if ,cond
       (begin ,exp . ,rest)))
(when #f (display "Launching missiles!\n"))
⇒ #f
(let ((if list))
  (when #f (display "Launching missiles!\n")))
-| Launching missiles!
⇒ (#f #<unspecified>)
Guile's perspective is that defmacros have had a good run, but that modern
macros should be written with syntax-rules or syntax-case. There
are still many uses of defmacros within Guile itself, but we will be phasing
them out over time. Of course we won't take away defmacro or
define-macro themselves, as there is lots of code out there that uses
them.
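For comparison, here is a minimal sketch of the same macro written with syntax-rules. The name my-when is hypothetical, used only to avoid clashing with any existing when binding. Because the if in the template refers to the binding visible where the macro was defined, rebinding if at the use site does not change the expansion.

```scheme
;; Hygienic counterpart of the defmacro example (my-when is a
;; hypothetical name).
(define-syntax my-when
  (syntax-rules ()
    ((_ test exp rest ...)
     (if test (begin exp rest ...)))))

;; Capture anything printed while a local variable named `if' is in
;; scope.  The body is not evaluated, so nothing is printed.
(define output
  (with-output-to-string
    (lambda ()
      (let ((if list))
        (my-when #f (display "Launching missiles!\n"))))))
```

Here output is the empty string, whereas the define-macro version shown earlier would have printed the message.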
When the syntax expander sees a form in which the first element is a macro, the whole form gets passed to the macro's syntax transformer. One may visualize this as:
(define-syntax foo foo-transformer)
(foo arg...)
;; expands via
(foo-transformer #'(foo arg...))
If, on the other hand, a macro is referenced in some other part of a form, the syntax transformer is invoked with only the macro reference, not the whole form.
(define-syntax foo foo-transformer)
foo
;; expands via
(foo-transformer #'foo)
This allows bare identifier references to be replaced programmatically via a
macro. syntax-rules provides some syntax to effect this transformation
more easily.
Returns a macro transformer that will replace occurrences of the macro with exp.
For example, if you are importing external code written in terms of fx+,
the fixnum addition operator, but Guile doesn't have fx+, you may use the
following to replace fx+ with +:
(define-syntax fx+ (identifier-syntax +))
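A brief usage sketch, assuming only the fx+ definition above: since the transformer replaces every occurrence of the keyword, fx+ works both in operator position and as a bare reference passed to another procedure. The variable names are illustrative.

```scheme
(define-syntax fx+ (identifier-syntax +))

;; Operator position: expands to (+ 1 2).
(define sum (fx+ 1 2))                              ; 3

;; Operand position: the bare reference expands to +.
(define pairwise (map fx+ '(1 2 3) '(10 20 30)))    ; (11 22 33)
```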
There is also special support for recognizing identifiers on the
left-hand side of a set! expression, as in the following:
(define-syntax foo foo-transformer)
(set! foo val)
;; expands via
(foo-transformer #'(set! foo val))
;; iff foo-transformer is a "variable transformer"
As the example notes, the transformer procedure must be explicitly marked as being a “variable transformer”, as most macros aren't written to discriminate on the form in the operator position.
Mark the transformer procedure as being a “variable transformer”. In practice this means that, when bound to a syntactic keyword, it may detect references to that keyword on the left-hand-side of a set!.
(define bar 10)
(define-syntax bar-alias
  (make-variable-transformer
   (lambda (x)
     (syntax-case x (set!)
       ((set! var val) #'(set! bar val))
       ((var arg ...) #'(bar arg ...))
       (var (identifier? #'var) #'bar)))))
bar-alias ⇒ 10
(set! bar-alias 20)
bar ⇒ 20
(set! bar 30)
bar-alias ⇒ 30
There is an extension to identifier-syntax which allows it to handle the
set! case as well:
Create a variable transformer. The first clause is used for references to the variable in operator or operand position, and the second for appearances of the variable on the left-hand-side of an assignment.
For example, the previous bar-alias example could be expressed more succinctly like this:
(define-syntax bar-alias
  (identifier-syntax
    (var bar)
    ((set! var val) (set! bar val))))
As before, the templates in identifier-syntax forms do not need wrapping in #' syntax forms.
Syntax parameters are a
mechanism for rebinding a macro definition within the dynamic extent of
a macro expansion. This provides a convenient solution to one of the
most common types of unhygienic macro: those that introduce an unhygienic
binding each time the macro is used. Examples include a lambda
form with a return keyword, or class macros that introduce a
special self binding.
With syntax parameters, instead of introducing the binding unhygienically each time, we instead create one binding for the keyword, which we can then adjust later when we want the keyword to have a different meaning. As no new bindings are introduced, hygiene is preserved. This is similar to the dynamic binding mechanisms we have at run-time (see parameters), except that the dynamic binding only occurs during macro expansion. The code after macro expansion remains lexically scoped.
Binds keyword to the value obtained by evaluating transformer. The transformer provides the default expansion for the syntax parameter, and in the absence of syntax-parameterize, is functionally equivalent to define-syntax. Usually, you will just want to have the transformer throw a syntax error indicating that the keyword is supposed to be used in conjunction with another macro, for example:
(define-syntax-parameter return
  (lambda (stx)
    (syntax-violation 'return "return used outside of a lambda^" stx)))
Adjusts keyword ... to use the values obtained by evaluating their transformer ..., in the expansion of the exp ... forms. Each keyword must be bound to a syntax-parameter.
syntax-parameterize differs from let-syntax, in that the binding is not shadowed, but adjusted, and so uses of the keyword in the expansion of exp ... use the new transformers. This is somewhat similar to how parameterize adjusts the values of regular parameters, rather than creating new bindings.
(define-syntax lambda^
  (syntax-rules ()
    [(lambda^ argument-list body body* ...)
     (lambda argument-list
       (call-with-current-continuation
        (lambda (escape)
          ;; In the body we adjust the 'return' keyword so that calls
          ;; to 'return' are replaced with calls to the escape
          ;; continuation.
          (syntax-parameterize ([return (syntax-rules ()
                                          [(return vals (... ...))
                                           (escape vals (... ...))])])
            body body* ...))))]))
;; Now we can write functions that return early.  Here, 'product' will
;; return immediately if it sees any 0 element.
(define product
  (lambda^ (list)
    (fold (lambda (n o)
            (if (zero? n)
                (return 0)
                (* n o)))
          1
          list)))
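The same technique can be sketched for another common case, an anaphoric conditional. The names aif and it below are hypothetical and not part of Guile; the sketch only assumes define-syntax-parameter, syntax-parameterize and identifier-syntax as described above.

```scheme
;; `it' is an error by default; aif adjusts it within its branches.
(define-syntax-parameter it
  (lambda (stx)
    (syntax-violation 'it "it used outside of aif" stx)))

(define-syntax aif
  (syntax-rules ()
    ((_ test then else)
     (let ((t test))
       (if t
           ;; Within `then', a reference to `it' expands to the
           ;; hygienically introduced variable t.
           (syntax-parameterize ((it (identifier-syntax t)))
             then)
           else)))))

(define found   (aif (assq 'b '((a . 1) (b . 2))) (cdr it) 'none))  ; 2
(define missing (aif (assq 'z '((a . 1) (b . 2))) (cdr it) 'none))  ; none
```

No new binding named it is introduced at each use of aif; the single it keyword is adjusted for the duration of the expansion, which is what keeps the macro hygienic.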
As syntax-case macros have the whole power of Scheme available to them,
they present a problem regarding time: when a macro runs, what parts of the
program are available for the macro to use?
The default answer to this question is that when you import a module (via
define-module or use-modules), that module will be loaded up at
expansion-time, as well as at run-time. Additionally, top-level syntactic
definitions within one compilation unit made by define-syntax are also
evaluated at expansion time, in the order that they appear in the compilation
unit (file).
But if a syntactic definition needs to call out to a normal procedure at expansion-time, it might well need special declarations to indicate that the procedure should be made available at expansion-time.
For example, the following code will work at a REPL, but not in a file:
;; incorrect
(use-modules (srfi srfi-19))
(define (date) (date->string (current-date)))
(define-syntax %date (identifier-syntax (date)))
(define *compilation-date* %date)
It works at a REPL because the expressions are evaluated one-by-one, in order, but if placed in a file, the expressions are expanded one-by-one, but not evaluated until the compiled file is loaded.
The fix is to use eval-when.
;; correct: using eval-when
(use-modules (srfi srfi-19))
(eval-when (compile load eval)
(define (date) (date->string (current-date))))
(define-syntax %date (identifier-syntax (date)))
(define *compilation-date* %date)
Evaluate exp... under the given conditions. Valid conditions include eval, load, and compile. If you need to use eval-when, use it with all three conditions, as in the above example. Other uses of eval-when may void your warranty or poison your cat.
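The same pattern applies to procedural macros that call an ordinary helper while expanding. In the sketch below, double-datum and double-const are hypothetical names; the helper is wrapped in eval-when so that it exists when the transformer runs, whether the code is typed at a REPL or compiled from a file.

```scheme
;; Helper procedure needed at expansion time.
(eval-when (compile load eval)
  (define (double-datum n) (* 2 n)))

;; The transformer calls the helper on the literal it was given and
;; expands to the resulting constant.
(define-syntax double-const
  (lambda (x)
    (syntax-case x ()
      ((k n)
       (datum->syntax #'k (double-datum (syntax->datum #'n)))))))

(define answer (double-const 21))   ; 42
```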
Construct a syntax transformer object. This is part of Guile's low-level support for syntax-case.
Return #t iff obj is a syntax transformer.
Note that it's a bit difficult to actually get a macro as a first-class object; simply naming it (like case) will produce a syntax error. But it is possible to get these objects using module-ref:
(macro? (module-ref (current-module) 'case))
⇒ #t
Return the type that was given when m was constructed, via make-syntax-transformer.
Return the binding of the macro m.
Return the transformer of the macro m. This will return a procedure, for which one may ask the docstring. That's the whole reason this section is documented. Actually a part of the result of macro-binding.
This chapter contains information about procedures which are not cleanly tied to a specific data type. Because of their wide range of applications, they are collected in a utility chapter.
There are three kinds of core equality predicates in Scheme, described
below. The same kinds of comparisons arise in other functions, like
memq and friends (see List Searching).
For all three tests, objects of different types are never equal. So
for instance a list and a vector are not equal?, even if their
contents are the same. Exact and inexact numbers are considered
different types too, and are hence not equal even if their values are
the same.
eq? tests just for the same object (essentially a pointer
comparison). This is fast, and can be used when searching for a
particular object, or when working with symbols or keywords (which are
always unique objects).
eqv? extends eq? to look at the value of numbers and
characters. It can for instance be used somewhat like =
(see Comparison) but without an error if one operand isn't a
number.
equal? goes further: it looks (recursively) into the contents
of lists, vectors, etc. This is good for instance on lists that have
been read or calculated in various places and are the same, just not
made up of the same pairs. Such lists look the same (when printed),
and equal? will consider them the same.
Return #t if x and y are the same object, except for numbers and characters. For example,
(define x (vector 1 2 3))
(define y (vector 1 2 3))
(eq? x x) ⇒ #t
(eq? x y) ⇒ #f
Numbers and characters are not equal to any other object, but the problem is they're not necessarily eq? to themselves either. This is even so when the number comes directly from a variable,
(let ((n (+ 2 3)))
  (eq? n n)) ⇒ *unspecified*
Generally eqv? below should be used when comparing numbers or characters. = (see Comparison) or char=? (see Characters) can be used too.
It's worth noting that end-of-list (), #t, #f, a symbol of a given name, and a keyword of a given name, are unique objects. There's just one of each, so for instance no matter how () arises in a program, it's the same object and can be compared with eq?,
(define x (cdr '(123)))
(define y (cdr '(456)))
(eq? x y) ⇒ #t
(define x (string->symbol "foo"))
(eq? x 'foo) ⇒ #t
Return 1 when x and y are equal in the sense of eq?, otherwise return 0.
The == operator should not be used on SCM values: an SCM is a C type which cannot necessarily be compared using == (see The SCM Type).
Return #t if x and y are the same object, or for characters and numbers the same value.
On objects except characters and numbers, eqv? is the same as eq? above, it's true if x and y are the same object.
If x and y are numbers or characters, eqv? compares their type and value. An exact number is not eqv? to an inexact number (even if their value is the same).
(eqv? 3 (+ 1 2)) ⇒ #t
(eqv? 1 1.0) ⇒ #f
Return #t if x and y are the same type, and their contents or value are equal.
For a pair, string, vector, array or structure, equal? compares the contents, and does so using the same equal? recursively, so a deep structure can be traversed.
(equal? (list 1 2 3) (list 1 2 3)) ⇒ #t
(equal? (list 1 2 3) (vector 1 2 3)) ⇒ #f
For other objects, equal? compares as per eqv? above, which means characters and numbers are compared by type and value (and like eqv?, exact and inexact numbers are not equal?, even if their value is the same).
(equal? 3 (+ 1 2)) ⇒ #t
(equal? 1 1.0) ⇒ #f
Hash tables are currently only compared as per eq?, so two different tables are not equal?, even if their contents are the same.
equal? does not support circular data structures; it may go into an infinite loop if asked to compare two circular lists or similar.
New application-defined object types (see Defining New Types (Smobs)) have an equalp handler which is called by equal?. This lets an application traverse the contents or control what is considered equal? for two objects of such a type. If there's no such handler, the default is to just compare as per eq?.
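The three predicates can be put side by side in a short sketch. The variable names are illustrative, and the lists are built with list and vector at run time so that the two structures are guaranteed to be distinct objects.

```scheme
(define a (list 1 2 (vector 3 4)))
(define b (list 1 2 (vector 3 4)))

(define results
  (list (eq? a a)        ; #t: the very same object
        (eq? a b)        ; #f: distinct pairs
        (eqv? a b)       ; #f: eqv? only adds numbers and characters
        (equal? a b)     ; #t: same type and contents, recursively
        (eqv? 2 2.0)     ; #f: exact versus inexact
        (= 2 2.0)        ; #t: numeric comparison ignores exactness
        (eqv? #\a #\a)   ; #t: characters compare by value
        (eq? 'foo (string->symbol "foo"))))  ; #t: symbols are unique
```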
It's often useful to associate a piece of additional information with a Scheme object even though that object does not have a dedicated slot available in which the additional information could be stored. Object properties allow you to do just that.
Guile's representation of an object property is a procedure-with-setter
(see Procedures with Setters) that can be used with the generalized
form of set! to set and retrieve that property for any
Scheme object. So, setting a property looks like this:
(set! (my-property obj1) value-for-obj1)
(set! (my-property obj2) value-for-obj2)
And retrieving values of the same property looks like this:
(my-property obj1)
⇒ value-for-obj1
(my-property obj2)
⇒ value-for-obj2
To create an object property in the first place, use the
make-object-property procedure:
(define my-property (make-object-property))
Create and return an object property. An object property is a procedure-with-setter that can be called in two ways.
(set! (property obj) val) sets obj's property to val.
(property obj) returns the current setting of obj's property.
A single object property created by make-object-property can
associate distinct property values with all Scheme values that are
distinguishable by eq? (including, for example, integers).
Internally, object properties are implemented using a weak key hash table. This means that, as long as a Scheme value with property values is protected from garbage collection, its property values are also protected. When the Scheme value is collected, its entry in the property table is removed and so the (ex-) property values are no longer protected by the table.
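A small sketch of an object property in use follows. The names color, box-a and box-b are hypothetical. The two lists are equal? but not eq?, so they carry independent property values, and an object with no value set yields #f.

```scheme
(define color (make-object-property))

(define box-a (list 'box))
(define box-b (list 'box))     ; equal? to box-a, but a distinct object

(set! (color box-a) 'red)

(define color-a (color box-a)) ; red
(define color-b (color box-b)) ; #f: nothing was set for box-b
```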
Guile also implements a more traditional Lispy interface to properties, in which each object has a list of key-value pairs associated with it. Properties in that list are keyed by symbols. This is a legacy interface; you should use weak hash tables or object properties instead.
Return obj's property list.
Set obj's property list to alist.
Return the property of obj with name key.
In obj's property list, set the property named key to value.
Sorting is very important in computer programs. Therefore, Guile comes
with several sorting procedures built-in. As always, procedures with
names ending in ! are side-effecting, that means that they may
modify their parameters in order to produce their results.
The first group of procedures can be used to merge two lists (which must be already sorted on their own) and produce sorted lists containing all elements of the input lists.
Merge two already sorted lists into one. Given two lists alist and blist, such that (sorted? alist less?) and (sorted? blist less?), return a new list in which the elements of alist and blist have been stably interleaved so that (sorted? (merge alist blist less?) less?). Note: this does _not_ accept vectors.
Takes two lists alist and blist such that (sorted? alist less?) and (sorted? blist less?) and returns a new list in which the elements of alist and blist have been stably interleaved so that (sorted? (merge alist blist less?) less?). This is the destructive variant of merge. Note: this does _not_ accept vectors.
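A minimal sketch of merging follows; the variable names and the by-car predicate are illustrative. The second call shows the stable interleaving: when neither element is less than the other, the element from the first list comes first.

```scheme
(define merged (merge '(1 3 5) '(2 4 6) <))        ; (1 2 3 4 5 6)

;; Compare pairs by their car only.
(define (by-car x y) (< (car x) (car y)))

(define tied (merge '((1 . a)) '((1 . b)) by-car)) ; ((1 . a) (1 . b))
```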
The following procedures can operate on sequences which are either
vectors or lists. According to the given arguments, they return sorted
vectors or lists, respectively. The first of the following procedures
determines whether a sequence is already sorted, the others sort a given
sequence. The variants with names starting with stable- are
special in that they maintain a special property of the input sequences:
If two or more elements are the same according to the comparison
predicate, they are left in the same order as they appeared in the
input.
Return #t iff items is a list or a vector such that for all 1 <= i <= m, the predicate less returns true when applied to all elements i - 1 and i.
Sort the sequence items, which may be a list or a vector. less is used for comparing the sequence elements. This is not a stable sort.
Sort the sequence items, which may be a list or a vector. less is used for comparing the sequence elements. The sorting is destructive, that means that the input sequence is modified to produce the sorted result. This is not a stable sort.
Sort the sequence items, which may be a list or a vector. less is used for comparing the sequence elements. This is a stable sort.
Sort the sequence items, which may be a list or a vector. less is used for comparing the sequence elements. The sorting is destructive, that means that the input sequence is modified to produce the sorted result. This is a stable sort.
The procedures in the last group only accept lists or vectors as input, as their names indicate.
Sort the list items, using less for comparing the list elements. This is a stable sort.
Sort the list items, using less for comparing the list elements. The sorting is destructive, that means that the input list is modified to produce the sorted result. This is a stable sort.
Sort the vector vec, using less for comparing the vector elements. startpos (inclusively) and endpos (exclusively) delimit the range of the vector which gets sorted. The return value is not specified.
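The following sketch exercises several of these procedures; all variable names are illustrative. Note that the result of the destructive sort! is taken from its return value rather than from the original variable.

```scheme
(define (by-car x y) (< (car x) (car y)))

(define sorted-list   (sort '(3 1 2) <))           ; (1 2 3)
(define sorted-vector (sort (vector 3 1 2) <))     ; #(1 2 3)
(define already?      (sorted? '(1 2 3) <))        ; #t
(define not-yet?      (sorted? '(2 1 3) <))        ; #f

;; Elements with equal keys keep their input order.
(define stable
  (stable-sort '((1 . a) (0 . b) (1 . c) (0 . d)) by-car))
;; stable is ((0 . b) (0 . d) (1 . a) (1 . c))

;; Destructive variant: always use the value it returns.
(define in-place (sort! (list 3 1 2) <))           ; (1 2 3)
```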
The procedures for copying lists (see Lists) only produce a flat
copy of the input list, and currently Guile does not even contain
procedures for copying vectors. copy-tree can be used for this
application, as it does not only copy the spine of a list, but also
copies any pairs in the cars of the input lists.
Recursively copy the data tree that is bound to obj, and return the new data structure. copy-tree recurses down the contents of both pairs and vectors (since both cons cells and vector cells may point to arbitrary objects), and stops recursing when it hits any other object.
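The difference between a flat copy and copy-tree can be seen in a short sketch. The variable names are illustrative, and list-copy stands in for the flat list-copying procedures mentioned above.

```scheme
(define original (list (list 1 2) (vector 3 (list 4))))

(define shallow (list-copy original))   ; copies only the spine
(define deep    (copy-tree original))   ; copies nested pairs and vectors

(define shares-sublist? (eq? (car original) (car shallow)))   ; #t
(define fresh-sublist?  (not (eq? (car original) (car deep)))) ; #t
(define fresh-vector?   (not (eq? (cadr original) (cadr deep)))) ; #t
(define same-contents?  (equal? original deep))               ; #t
```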
When debugging Scheme programs, but also for providing a human-friendly interface, a procedure for converting any Scheme object into string format is very useful. Conversion from/to strings can of course be done with specialized procedures when the data type of the object to convert is known, but with this procedure, it is often more comfortable.
object->string converts an object by using a print procedure for
writing to a string port, and then returning the resulting string.
Converting an object back from the string is only possible if the object
type has a read syntax and the read syntax is preserved by the printing
procedure.
Return a Scheme string obtained by printing obj. The printing function can be specified by the optional second argument printer (default: write).
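A brief sketch of both printers and of a round trip through read follows; the variable names are illustrative. The round trip works here because lists, numbers, strings and characters all have a read syntax that write preserves.

```scheme
(define written   (object->string '(1 "two" #\3)))
;; written is the string: (1 "two" #\3)

(define displayed (object->string "hi" display))
;; displayed is the string: hi

;; Reading the written form back yields an equal? object.
(define round-trip (call-with-input-string written read))
```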
A hook is a list of procedures to be called at well defined points in time. Typically, an application provides a hook h and promises its users that it will call all of the procedures in h at a defined point in the application's processing. By adding its own procedure to h, an application user can tap into or even influence the progress of the application.
Guile itself provides several such hooks for debugging and customization purposes: these are listed in a subsection below.
When an application first creates a hook, it needs to know how many arguments will be passed to the hook's procedures when the hook is run. The chosen number of arguments (which may be none) is declared when the hook is created, and all the procedures that are added to that hook must be capable of accepting that number of arguments.
A hook is created using make-hook. A procedure can be added to
or removed from a hook using add-hook! or remove-hook!,
and all of a hook's procedures can be removed together using
reset-hook!. When an application wants to run a hook, it does so
using run-hook.
Hook usage is shown by some examples in this section. First, we will define a hook of arity 2 — that is, the procedures stored in the hook will have to accept two arguments.
(define hook (make-hook 2))
hook
⇒ #<hook 2 40286c90>
Now we are ready to add some procedures to the newly created hook with
add-hook!. In the following example, two procedures are added,
which print different messages and do different things with their
arguments.
(add-hook! hook (lambda (x y)
(display "Foo: ")
(display (+ x y))
(newline)))
(add-hook! hook (lambda (x y)
(display "Bar: ")
(display (* x y))
(newline)))
Once the procedures have been added, we can invoke the hook using
run-hook.
(run-hook hook 3 4)
-| Bar: 12
-| Foo: 7
Note that the procedures are called in the reverse of the order with
which they were added. This is because the default behaviour of
add-hook! is to add its procedure to the front of the
hook's procedure list. You can force add-hook! to add its
procedure to the end of the list instead by providing a third
#t argument on the second call to add-hook!.
(add-hook! hook (lambda (x y)
(display "Foo: ")
(display (+ x y))
(newline)))
(add-hook! hook (lambda (x y)
(display "Bar: ")
(display (* x y))
(newline))
#t) ; <- Change here!
(run-hook hook 3 4)
-| Foo: 7
-| Bar: 12
When you create a hook with make-hook, you must specify the arity
of the procedures which can be added to the hook. If the arity is not
given explicitly as an argument to make-hook, it defaults to
zero. All procedures of a given hook must have the same arity, and when
the procedures are invoked using run-hook, the number of
arguments passed must match the arity specified at hook creation time.
The order in which procedures are added to a hook matters. If the third
parameter to add-hook! is omitted or is equal to #f, the
procedure is added in front of the procedures which might already be on
that hook, otherwise the procedure is added at the end. The procedures
are always called from the front to the end of the list when they are
invoked via run-hook.
The ordering of the list of procedures returned by hook->list
matches the order in which those procedures would be called if the hook
was run using run-hook.
Note that the C functions in the following entries are for handling Scheme-level hooks in C. There are also C-level hooks which have their own interface (see C Hooks).
Create a hook for storing procedures of arity n_args. n_args defaults to zero. The returned value is a hook object to be used with the other hook procedures.
Return #t if hook is an empty hook, #f otherwise.
Add the procedure proc to the hook hook. The procedure is added to the end if append_p is true, otherwise it is added to the front. The return value of this procedure is not specified.
Remove the procedure proc from the hook hook. The return value of this procedure is not specified.
Remove all procedures from the hook hook. The return value of this procedure is not specified.
Convert the procedure list of hook to a list.
Apply all procedures from the hook hook to the arguments args. The order of the procedure application is first to last. The return value of this procedure is not specified.
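The procedures above can be combined as in the following sketch. The names calls, note-a and note-b are hypothetical; the hook procedures record their invocations in a list instead of printing, which makes the call order easy to inspect.

```scheme
(define calls '())
(define (note-a x) (set! calls (cons (list 'a x) calls)))
(define (note-b x) (set! calls (cons (list 'b x) calls)))

(define h (make-hook 1))                 ; procedures take one argument
(define empty-before? (hook-empty? h))   ; #t

(add-hook! h note-a)                     ; added to the front
(add-hook! h note-b #t)                  ; appended to the end
(run-hook h 42)
(define call-order (reverse calls))      ; ((a 42) (b 42))

(remove-hook! h note-a)
(define remaining (hook->list h))        ; a list holding only note-b

(reset-hook! h)
(define empty-after? (hook-empty? h))    ; #t
```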
If, in C code, you are certain that you have a hook object and well
formed argument list for that hook, you can also use
scm_c_run_hook, which is identical to scm_run_hook but
does no type checking.
The same as scm_run_hook but without any type checking to confirm that hook is actually a hook object and that args is a well-formed list matching the arity of the hook.
For C code, SCM_HOOKP is a faster alternative to
scm_hook_p:
Here is an example of how to handle Scheme-level hooks from C code using the above functions.
if (scm_is_true (scm_hook_p (obj)))
  /* handle Scheme-level hook using C functions */
  scm_reset_hook_x (obj);
else
  /* do something else (obj is not a hook) */
The hooks already described are intended to be populated by Scheme-level procedures. In addition to this, the Guile library provides an independent set of interfaces for the creation and manipulation of hooks that are designed to be populated by functions implemented in C.
The original motivation here was to provide a kind of hook that could safely be invoked at various points during garbage collection. Scheme-level hooks are unsuitable for this purpose as running them could itself require memory allocation, which would then invoke garbage collection recursively ... However, it is also the case that these hooks are easier to work with than the Scheme-level ones if you only want to register C functions with them. So if that is mainly what your code needs to do, you may prefer to use this interface.
To create a C hook, you should allocate storage for a structure of type
scm_t_c_hook and then initialize it using scm_c_hook_init.
Data type for a C hook. The internals of this type should be treated as opaque.
Enumeration of possible hook types, which are:
SCM_C_HOOK_NORMAL- Type of hook for which all the registered functions will always be called.
SCM_C_HOOK_OR- Type of hook for which the sequence of registered functions will be called only until one of them returns C true (a non-NULL pointer).
SCM_C_HOOK_AND- Type of hook for which the sequence of registered functions will be called only until one of them returns C false (a NULL pointer).
Initialize the C hook at memory pointed to by hook. type should be one of the values of the scm_t_c_hook_type enumeration, and controls how the hook functions will be called. hook_data is a closure parameter that will be passed to all registered hook functions when they are called.
To add or remove a C function from a C hook, use scm_c_hook_add
or scm_c_hook_remove. A hook function must expect three
void * parameters which are, respectively:
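1. The hook closure data that was specified when the hook was initialized by scm_c_hook_init.
2. The function closure data that was specified when the function was registered with the hook by scm_c_hook_add.
3. The call closure data specified by the scm_c_hook_run call that runs the hook.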
Function type for a C hook function: takes three void * parameters and returns a void * result.
Add function func, with function closure data func_data, to the C hook hook. The new function is appended to the hook's list of functions if appendp is non-zero, otherwise prepended.
Remove function func, with function closure data func_data, from the C hook hook. scm_c_hook_remove checks both func and func_data so as to allow for the same func being registered multiple times with different closure data.
Finally, to invoke a C hook, call the scm_c_hook_run function
specifying the hook and the call closure data for this run:
Run the C hook hook with call closure data data. Subject to the variations for hook types SCM_C_HOOK_OR and SCM_C_HOOK_AND, scm_c_hook_run calls hook's registered functions in turn, passing them the hook's closure data, each function's closure data, and the call closure data.
scm_c_hook_run's return value is the return value of the last function to be called.
Whenever Guile performs a garbage collection, it calls the following hooks in the order shown.
C hook called at the very start of a garbage collection, after setting scm_gc_running_p to 1, but before entering the GC critical section.
If garbage collection is blocked because scm_block_gc is non-zero, GC exits early soon after calling this hook, and no further hooks will be called.
C hook called before beginning the mark phase of garbage collection, after the GC thread has entered a critical section.
C hook called before beginning the sweep phase of garbage collection. This is the same as at the end of the mark phase, since nothing else happens between marking and sweeping.
C hook called after the end of the sweep phase of garbage collection, but while the GC thread is still inside its critical section.
C hook called at the very end of a garbage collection, after the GC thread has left its critical section.
Scheme hook with arity 0. This hook is run asynchronously (see Asyncs) soon after the GC has completed and any other events that were deferred during garbage collection have been processed. (Also accessible from C with the name scm_after_gc_hook.)
All the C hooks listed here have type SCM_C_HOOK_NORMAL, are
initialized with hook closure data NULL, and are invoked by
scm_c_hook_run with call closure data NULL.
The Scheme hook after-gc-hook is particularly useful in
conjunction with guardians (see Guardians). Typically, if you are
using a guardian, you want to call the guardian after garbage collection
to see if any of the objects added to the guardian have been collected.
By adding a thunk that performs this call to after-gc-hook, you
can ensure that your guardian is tested after every garbage collection
cycle.
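A minimal sketch of this arrangement follows. The names guardian, reclaimed and drain-guardian are hypothetical. When and whether a given object is collected depends on the garbage collector, so the sketch only counts whatever the guardian hands back.

```scheme
(define guardian (make-guardian))
(define reclaimed 0)

;; Thunk for after-gc-hook: a guardian called with no arguments
;; returns one collectable object, or #f when none is pending.
(define (drain-guardian)
  (let loop ((obj (guardian)))
    (if obj
        (begin
          (set! reclaimed (+ reclaimed 1))
          (loop (guardian))))))

(add-hook! after-gc-hook drain-guardian)

;; Register an object; once it becomes unreachable and a collection
;; has run, drain-guardian may see it.
(guardian (make-vector 1000 0))
```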
Scheme supports the definition of variables in different contexts. Variables can be defined at the top level, so that they are visible in the entire program, and variables can be defined locally to procedures and expressions. This is important for modularity and data abstraction.
At the top level of a program (i.e., not nested within any other expression), a definition of the form
(define a value)
defines a variable called a and sets it to the value value.
If the variable already exists in the current module, because it has
already been created by a previous define expression with the
same name, its value is simply changed to the new value. In this
case, then, the above form is completely equivalent to
(set! a value)
This equivalence means that define can be used interchangeably
with set! to change the value of variables at the top level of
the REPL or a Scheme source file. It is useful during interactive
development when reloading a Scheme file that you have modified, because
it allows the define expressions in that file to work as expected
both the first time that the file is loaded and on subsequent occasions.
Note, though, that define and set! are not always
equivalent. For example, a set! is not allowed if the named
variable does not already exist, and the two expressions can behave
differently in the case where there are imported variables visible from
another module.
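The difference can be sketched as follows. The variable names are hypothetical, and the example assumes that never-defined-before is not bound anywhere when the code runs.

```scheme
(define greeting "hello")      ; creates the variable
(define greeting "goodbye")    ; at top level, behaves like set!
(set! greeting "hi again")     ; fine: greeting already exists

;; set! on a name that was never defined signals an error, whereas
;; define would simply have created the variable.
(define set!-failed?
  (catch #t
    (lambda () (set! never-defined-before 1) #f)
    (lambda args #t)))
```

Here greeting ends up as "hi again" and set!-failed? is #t.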
Create a top level variable named name with value value. If the named variable already exists, just change its value. The return value of a define expression is unspecified.
The C API equivalents of define are scm_define and
scm_c_define, which differ from each other in whether the
variable name is specified as a SCM symbol or as a
null-terminated C string.
C equivalents of
define, with variable name specified either by sym, a symbol, or by name, a null-terminated C string. Both variants return the new or preexisting variable object.
define (when it occurs at top level), scm_define and
scm_c_define all create or set the value of a variable in the top
level environment of the current module. If there was not already a
variable with the specified name belonging to the current module, but a
similarly named variable from another module was visible through having
been imported, the newly created variable in the current module will
shadow the imported variable, such that the imported variable is no
longer visible.
Attention: Scheme definitions inside local binding constructs (see Local Bindings) act differently (see Internal Definitions).
Many people end up in a development style of adding and changing
definitions at runtime, building out their program without restarting
it. (You can do this using reload-module, the reload REPL
command, the load procedure, or even just pasting code into a
REPL.) If you are one of these people, you will find that sometimes
there are some variables that you don't want to redefine all the
time. For these, use define-once.
Create a top level variable named name with value value, but only if name is not already bound in the current module.
Old Lispers probably know define-once under its Lisp name,
defvar.
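A small sketch of the effect, using a hypothetical request-count variable that accumulates state while the program runs:

```scheme
(define-once request-count 0)             ; first load: creates the variable
(set! request-count (+ request-count 5))  ; state accumulated at run time
(define-once request-count 0)             ; "reload": binding is left alone
request-count                             ; ⇒ 5
```

Had the second form been a plain define, request-count would have been reset to 0.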
As opposed to definitions at the top level, which create bindings that are visible to all code in a module, it is also possible to define variables which are only visible in a well-defined part of the program. Normally, this part of a program will be a procedure or a subexpression of a procedure.
With the constructs for local binding (let, let*,
letrec, and letrec*), the Scheme language has a block
structure like most other programming languages since the days of
Algol 60. Readers familiar with languages like C or Java should
already be used to this concept, but the family of let
expressions has a few properties which are well worth knowing.
The most basic local binding construct is let.
bindings has the form
((variable1 init1) ...)
that is, zero or more two-element lists of a variable and an arbitrary expression each. All variable names must be distinct.
A let expression is evaluated as follows.
- All init expressions are evaluated.
- New storage is allocated for the variables.
- The values of the init expressions are stored into the variables.
- The expressions in body are evaluated in order, and the value of the last expression is returned as the value of the let expression.
The init expressions are not allowed to refer to any of the variables.
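The evaluation order has a visible consequence when an outer variable shares a name with a new one. In this sketch (all names are illustrative), the x inside the second init is the outer x, because no new binding exists yet while the inits are being evaluated.

```scheme
(define x 10)

;; Both inits are evaluated before any new binding is made, so the x
;; in (+ x 1) is the outer x (10), not the new x (1).
(define let-result
  (let ((x 1)
        (y (+ x 1)))
    (list x y)))
;; let-result ⇒ (1 11)
```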
The other binding constructs are variations on the same theme: making new values, binding them to variables, and executing a body in that new, extended lexical context.
Similar to let, but the variable bindings are performed sequentially; that is, each init expression is allowed to use the variables defined to its left in the binding list.
A let* expression can always be expressed with nested let expressions.
(let* ((a 1) (b a))
  b)
==
(let ((a 1))
  (let ((b a))
    b))
Similar to let, but lambda expressions created in any of the inits may refer to the variables being bound. That is, procedures created in the init expressions can recursively refer to the defined variables.
(letrec ((even? (lambda (n)
                  (if (zero? n)
                      #t
                      (odd? (- n 1)))))
         (odd? (lambda (n)
                 (if (zero? n)
                     #f
                     (even? (- n 1))))))
  (even? 88))
⇒
#t
Note that while the init expressions may refer to the new variables, they may not access their values. For example, making the even? function above creates a closure (see About Closure) referencing the odd? variable. But odd? can't be called until after execution has entered the body.
Similar to letrec, except the init expressions are bound to their variables in order.
letrec* thus relaxes the letrec restriction, in that later init expressions may refer to the values of previously bound variables.
(letrec ((a 42)
         (b (+ a 10)))
  (* a b))
⇒ ;; Error: unbound variable: a
(letrec* ((a 42)
          (b (+ a 10)))
  (* a b))
⇒ 2184
There is also an alternative form of the let form, which is used
for expressing iteration. Because of the use as a looping construct,
this form (the named let) is documented in the section about
iteration (see Iteration).
A define form which appears inside the body of a lambda,
let, let*, letrec, letrec* or equivalent
expression is called an internal definition. An internal
definition differs from a top level definition (see Top Level),
because the definition is only visible inside the complete body of the
enclosing form. Let us examine the following example.
(let ((frumble "froz"))
(define banana (lambda () (apple 'peach)))
(define apple (lambda (x) x))
(banana))
⇒
peach
Here the enclosing form is a let, so the defines in the
let-body are internal definitions. Because the scope of the
internal definitions is the complete body of the
let-expression, the lambda-expression which gets bound to
the variable banana may refer to the variable apple, even
though its definition appears lexically after the definition of
banana. This is because a sequence of internal definitions acts
as if it were a letrec* expression.
(let ()
(define a 1)
(define b 2)
(+ a b))
is equivalent to
(let ()
(letrec* ((a 1) (b 2))
(+ a b)))
Internal definitions are only allowed at the beginning of the body of an enclosing expression. They may not be mixed with other expressions.
Another noteworthy difference to top level definitions is that within
one group of internal definitions all variable names must be distinct.
That means where on the top level a second define for a given variable
acts like a set!, an exception is thrown for internal definitions
with duplicate bindings.
As a historical note, it used to be that internal bindings were expanded
in terms of letrec, not letrec*. This was the situation
for the R5RS report and before. However with the R6RS, it was recognized
that sequential definition was a more intuitive expansion, as in the
following case:
(let ()
(define a 1)
(define b (+ a a))
(+ a b))
Guile decided to follow the R6RS in this regard, and now expands
internal definitions using letrec*.
Guile provides a procedure for checking whether a symbol is bound in the top level environment.
Return #t if sym is defined in the module module or the current module when module is not specified; otherwise return #f.
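Assuming the procedure described here is defined?, which takes the symbol as its argument, a quick check might look like this. The variable names are hypothetical, and the second one is assumed not to be bound anywhere.

```scheme
(define answer 42)

;; The argument is a symbol, so it must be quoted.
(define answer-bound?  (defined? 'answer))                     ; ⇒ #t
(define missing-bound? (defined? 'no-such-variable-anywhere))  ; ⇒ #f
```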
See Control Flow for a discussion of how the more general control flow of Scheme affects C code.
As an expression, the begin syntax is used to evaluate a sequence
of sub-expressions in order. Consider the conditional expression below:
(if (> x 0)
(begin (display "greater") (newline)))
If the test is true, we want to display “greater” to the current
output port, then display a newline. We use begin to form a
compound expression out of this sequence of sub-expressions.
The expression(s) are evaluated in left-to-right order and the value of the last expression is returned as the value of the
begin-expression. This expression type is used when the expressions before the last one are evaluated for their side effects.
The begin syntax has another role in definition context
(see Internal Definitions). A begin form in a definition
context splices its subforms into its place. For example,
consider the following procedure:
(define (make-seal)
(define-sealant seal open)
(values seal open))
Let us assume the existence of a define-sealant macro that
expands out to some definitions wrapped in a begin, like so:
(define (make-seal)
(begin
(define seal-tag
(list 'seal))
(define (seal x)
(cons seal-tag x))
(define (sealed? x)
(and (pair? x) (eq? (car x) seal-tag)))
(define (open x)
(if (sealed? x)
(cdr x)
(error "Expected a sealed value:" x))))
(values seal open))
Here, because the begin is in definition context, its subforms
are spliced into the place of the begin. This allows the
definitions created by the macro to be visible to the following
expression, the values form.
It is a fine point, but splicing and sequencing are different. It can make sense to splice zero forms, because it can make sense to have zero internal definitions before the expressions in a procedure or lexical binding form. However it does not make sense to have a sequence of zero expressions, because in that case it would not be clear what the value of the sequence would be, because in a sequence of zero expressions, there can be no last value. Sequencing zero expressions is an error.
It would be more elegant in some ways to eliminate splicing from the
Scheme language, and without macros (see Macros), that would be a
good idea. But it is useful to be able to write macros that expand out
to multiple definitions, as in define-sealant above, so Scheme
abuses the begin form for these two tasks.
Guile provides three syntactic constructs for conditional evaluation.
if is the normal if-then-else expression (with an optional else
branch), cond is a conditional expression with multiple branches
and case branches if an expression has one of a set of constant
values.
All arguments may be arbitrary expressions. First, test is evaluated. If it returns a true value, the expression consequent is evaluated and alternate is ignored. If test evaluates to #f, alternate is evaluated instead. The values of the evaluated branch (consequent or alternate) are returned as the values of the if expression.
When alternate is omitted and the test evaluates to #f, the value of the expression is not specified.
When you go to write an if without an alternate (a one-armed
if), part of what you are expressing is that you don't care
about the return value (or values) of the expression. As such, you are
more interested in the effect of evaluating the consequent
expression. (By convention, we use the word statement to refer to
an expression that is evaluated for effect, not for value).
In such a case, it is considered more clear to express these intentions
with these special forms, when and unless. As an added
bonus, these forms accept multiple statements to evaluate, which are
implicitly wrapped in a begin.
The actual definitions of these forms are in many ways their most clear documentation:
(define-syntax-rule (when test stmt stmt* ...)
  (if test (begin stmt stmt* ...)))
(define-syntax-rule (unless test stmt stmt* ...)
  (if (not test) (begin stmt stmt* ...)))
That is to say, when evaluates its consequent statements in order if test is true. unless is the opposite: it evaluates the statements if test is false.
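A short usage sketch follows. The procedures note! and check-temperature and the events list are hypothetical helpers that record which statements ran.

```scheme
(define events '())
(define (note! msg) (set! events (cons msg events)))

(define (check-temperature t)
  ;; Several statements may follow the test; no explicit begin needed.
  (when (> t 100)
    (note! 'too-hot)
    (note! 'shutting-down))
  (unless (> t 0)
    (note! 'freezing)))

(check-temperature 120)
(check-temperature -5)
(reverse events)   ; ⇒ (too-hot shutting-down freezing)
```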
Each cond-clause must look like this:
(test expression ...)
where test and expression are arbitrary expressions, or like this
(test => expression)
where expression must evaluate to a procedure.
The tests of the clauses are evaluated in order and as soon as one of them evaluates to a true value, the corresponding expressions are evaluated in order and the last value is returned as the value of the cond-expression. For the => clause type, expression is evaluated and the resulting procedure is applied to the value of test. The result of this procedure application is then the result of the cond-expression.
One additional cond-clause is available as an extension to standard Scheme:
(test guard => expression)
where guard and expression must evaluate to procedures. For this clause type, test may return multiple values, and cond ignores its boolean state; instead, cond evaluates guard and applies the resulting procedure to the value(s) of test, as if guard were the consumer argument of call-with-values. If and only if the result of that procedure call is a true value, it evaluates expression and applies the resulting procedure to the value(s) of test, in the same manner as the guard was called.
The test of the last clause may be the symbol else. Then, if none of the preceding tests is true, the expressions following the else are evaluated to produce the result of the cond-expression.
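The => clause is convenient when the test itself produces the value to work on. In this sketch, inventory and stock-status are hypothetical names; assq returns the matching pair (a true value) or #f.

```scheme
(define inventory '((apples . 3) (pears . 0)))

(define (stock-status item)
  (cond ((assq item inventory)
         ;; The pair returned by assq is passed to this procedure.
         => (lambda (entry)
              (if (zero? (cdr entry)) 'sold-out 'in-stock)))
        (else 'unknown)))

(stock-status 'apples)   ; ⇒ in-stock
(stock-status 'pears)    ; ⇒ sold-out
(stock-status 'kiwis)    ; ⇒ unknown
```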
key may be any expression, the clauses must have the form
((datum1 ...) expr1 expr2 ...)
and the last clause may have the form
(else expr1 expr2 ...)
All datums must be distinct. First, key is evaluated. The result of this evaluation is compared against all datum values using eqv?. When this comparison succeeds, the expression(s) following the datum are evaluated from left to right, returning the value of the last expression as the result of the case expression.
If the key matches no datum and there is an else-clause, the expressions following the else are evaluated. If there is no such clause, the result of the expression is unspecified.
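For instance, a hypothetical classify-char procedure can dispatch on a character, since characters compare correctly under eqv?:

```scheme
(define (classify-char c)
  (case c
    ((#\a #\e #\i #\o #\u) 'vowel)
    ((#\space #\newline)   'whitespace)
    (else                  'other)))

(classify-char #\e)       ; ⇒ vowel
(classify-char #\space)   ; ⇒ whitespace
(classify-char #\z)       ; ⇒ other
```

Because the comparison uses eqv?, case is not suitable for matching strings or lists.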
and and or evaluate all their arguments in order, similar
to begin, but evaluation stops as soon as one of the expressions
evaluates to false or true, respectively.
Evaluate the exprs from left to right and stop evaluation as soon as one expression evaluates to #f; the remaining expressions are not evaluated. The value of the last evaluated expression is returned. If no expression evaluates to #f, the value of the last expression is returned.
If used without expressions, #t is returned.
Evaluate the exprs from left to right and stop evaluation as soon as one expression evaluates to a true value (that is, a value different from #f); the remaining expressions are not evaluated. The value of the last evaluated expression is returned. If all expressions evaluate to #f, #f is returned.
If used without expressions, #f is returned.
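Because both forms return the last value they evaluated, they are often used for guarded access and for defaults. The helpers safe-car and lookup below are hypothetical; assq-ref is Guile's association-list accessor, which returns #f when the key is absent.

```scheme
;; and as a guard: car is only reached when x is a pair.
(define (safe-car x)
  (and (pair? x) (car x)))

;; or as a default: the fallback is used when the lookup yields #f.
(define (lookup key alist default)
  (or (assq-ref alist key) default))

(safe-car '(1 2))               ; ⇒ 1
(safe-car '())                  ; ⇒ #f
(lookup 'b '((a . 1)) 'none)    ; ⇒ none
(and 1 2 3)                     ; ⇒ 3
(or #f 2 3)                     ; ⇒ 2
```

Note that the or idiom cannot distinguish a missing key from a key whose stored value is #f.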
Scheme has only a few iteration mechanisms, mainly because iteration in
Scheme programs is normally expressed using recursion. Nevertheless,
R5RS defines a construct for programming loops, called do. In
addition, Guile has an explicit looping syntax called while.
Bind variables and evaluate body until test is true. The return value is the last expr after test, if given. A simple example will illustrate the basic form,
(do ((i 1 (1+ i)))
    ((> i 4))
  (display i))
-| 1234
Or with two variables and a final return value,
(do ((i 1 (1+ i))
     (p 3 (* 3 p)))
    ((> i 4)
     p)
  (format #t "3**~s is ~s\n" i p))
-|
3**1 is 3
3**2 is 9
3**3 is 27
3**4 is 81
⇒
243
The variable bindings are established like a let, in that the expressions are all evaluated and then all bindings made. When iterating, the optional step expressions are evaluated with the previous bindings in scope, then new bindings all made.
The test expression is a termination condition. Looping stops when the test is true. It's evaluated before running the body each time, so if it's true the first time then body is not run at all.
The optional exprs after the test are evaluated at the end of looping, with the final variable bindings available. The last expr gives the return value, or if there are no exprs the return value is unspecified.
Each iteration establishes bindings to fresh locations for the variables, like a new let for each iteration. This is done for variables without step expressions too. The following illustrates this, showing how a new i is captured by the lambda in each iteration (see The Concept of Closure).
(define lst '())
(do ((i 1 (1+ i)))
    ((> i 4))
  (set! lst (cons (lambda () i) lst)))
(map (lambda (proc) (proc)) lst)
⇒
(4 3 2 1)
Run a loop executing the body forms while cond is true. cond is tested at the start of each iteration, so if it's #f the first time then body is not executed at all.
Within while, two extra bindings are provided; they can be used from both cond and body.
— Scheme Procedure: break break-arg ...
Break out of the while form.
— Scheme Procedure: continue
Abandon the current iteration, go back to the start and test cond again, etc.
If the loop terminates normally, by the cond evaluating to #f, then the while expression as a whole evaluates to #f. If it terminates by a call to break with some number of arguments, those arguments are returned from the while expression, as multiple values. Otherwise if it terminates by a call to break with no arguments, then the return value is #t.
(while #f (error "not reached")) ⇒ #f
(while #t (break)) ⇒ #t
(while #t (break 1 2 3)) ⇒ 1 2 3
Each while form gets its own break and continue procedures, operating on that while. This means when loops are nested the outer break can be used to escape all the way out. For example,
(while (test1)
  (let ((outer-break break))
    (while (test2)
      (if (something)
          (outer-break #f))
      ...)))
Note that each break and continue procedure can only be used within the dynamic extent of its while. Outside the while their behaviour is unspecified.
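A small sketch that uses both extra bindings in one loop. The variables i, sum and while-result are illustrative only.

```scheme
;; Sum the odd numbers below 10, stopping early once the sum exceeds 20.
(define i 0)
(define sum 0)

(define while-result
  (while (< i 10)
    (set! i (+ i 1))
    (if (even? i) (continue))        ; skip even numbers
    (set! sum (+ sum i))
    (if (> sum 20) (break sum))))    ; leave the loop, returning sum

while-result   ; ⇒ 25  (1 + 3 + 5 + 7 + 9)
```

Had the loop run to completion without calling break, while-result would have been #f.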
Another very common way of expressing iteration in Scheme programs is the use of the so-called named let.
Named let is a variant of let which creates a procedure and calls
it in one step. Because of the newly created procedure, named let is
more powerful than do: it can be used for iteration, but also
for arbitrary recursion.
For the definition of bindings see the documentation about let (see Local Bindings).
Named let works as follows:
- A new procedure which accepts as many arguments as are in bindings is created and bound locally (using let) to variable. The new procedure's formal argument names are the names of the variables.
- The body expressions are inserted into the newly created procedure.
- The procedure is called with the init expressions as the actual arguments.
The next example implements a loop which iterates (by recursion) 1000 times.
(let lp ((x 1000))
  (if (positive? x)
      (lp (- x 1))
      x))
⇒
0
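To illustrate both uses of named let, here is a sketch with two hypothetical procedures: sum-list loops with an accumulator (a tail call, so it behaves like iteration), while flatten calls the local procedure in non-tail position, which a do loop could not express directly.

```scheme
;; Iteration: the recursive call is in tail position.
(define (sum-list lst)
  (let loop ((rest lst) (total 0))
    (if (null? rest)
        total
        (loop (cdr rest) (+ total (car rest))))))

;; General recursion: walk is called twice per pair, not in tail position.
(define (flatten tree)
  (let walk ((t tree))
    (cond ((null? t) '())
          ((pair? t) (append (walk (car t)) (walk (cdr t))))
          (else (list t)))))

(sum-list '(1 2 3 4))          ; ⇒ 10
(flatten '(1 (2 (3 4)) 5))     ; ⇒ (1 2 3 4 5)
```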
Prompts are control-flow barriers between different parts of a program. In the same way that a user sees a shell prompt (e.g., the Bash prompt) as a barrier between the operating system and her programs, Scheme prompts allow the Scheme programmer to treat parts of programs as if they were running in different operating systems.
We use this roundabout explanation because, unless you're a functional programming junkie, you probably haven't heard the term, “delimited, composable continuation”. That's OK; it's a relatively recent topic, but a very useful one to know about.
Guile's primitive delimited control operators are
call-with-prompt and abort-to-prompt.
Set up a prompt, and call thunk within that prompt.
During the dynamic extent of the call to thunk, a prompt named tag will be present in the dynamic context, such that if a user calls abort-to-prompt (see below) with that tag, control rewinds back to the prompt, and the handler is run.
handler must be a procedure. The first argument to handler will be the state of the computation begun when thunk was called, and ending with the call to abort-to-prompt. The remaining arguments to handler are those passed to abort-to-prompt.
Make a new prompt tag. Currently prompt tags are generated symbols. This may change in some future Guile version.
Return the default prompt tag. Having a distinguished default prompt tag allows some useful prompt and abort idioms, discussed in the next section.
Unwind the dynamic and control context to the nearest prompt named tag, also passing the given values.
C programmers may recognize call-with-prompt and abort-to-prompt
as a fancy kind of setjmp and longjmp, respectively. Prompts are
indeed quite useful as non-local escape mechanisms. Guile's catch and
throw are implemented in terms of prompts. Prompts are more convenient
than longjmp, in that one has the opportunity to pass multiple values to
the jump target.
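As a sketch of this escape idiom, the hypothetical procedure below leaves a for-each early and hands two values to the handler. It uses a fresh tag from make-prompt-tag so that it cannot interfere with other prompts.

```scheme
(define (find-first-negative lst)
  (let ((tag (make-prompt-tag)))
    (call-with-prompt tag
      ;; thunk: scan the list, aborting on the first negative element
      (lambda ()
        (for-each (lambda (x)
                    (if (negative? x)
                        (abort-to-prompt tag x 'found)))
                  lst)
        #f)
      ;; handler: k is the captured continuation (unused here); the
      ;; remaining arguments are the values passed to abort-to-prompt
      (lambda (k value status)
        (list status value)))))

(find-first-negative '(3 1 -4 -1))   ; ⇒ (found -4)
(find-first-negative '(1 2))         ; ⇒ #f
```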
Also unlike longjmp, the prompt handler is given the full state of the
process that was aborted, as the first argument to the prompt's handler. That
state is the continuation of the computation wrapped by the prompt. It is
a delimited continuation, because it is not the whole continuation of the
program; rather, just the computation initiated by the call to
call-with-prompt.
The continuation is a procedure, and may be reinstated simply by invoking it, with any number of values. Here's where things get interesting, and complicated as well. Besides being described as delimited, continuations reified by prompts are also composable, because invoking a prompt-saved continuation composes that continuation with the current one.
Imagine you have saved a continuation via call-with-prompt:
(define cont
(call-with-prompt
;; tag
'foo
;; thunk
(lambda ()
(+ 34 (abort-to-prompt 'foo)))
;; handler
(lambda (k) k)))
The resulting continuation is the addition of 34. It's as if you had written:
(define cont
(lambda (x)
(+ 34 x)))
So, if we call cont with one numeric value, we get that number,
incremented by 34:
(cont 8)
⇒ 42
(* 2 (cont 8))
⇒ 84
The last example illustrates what we mean when we say, "composes with the
current continuation". We mean that there is a current continuation – some
remaining things to compute, like (lambda (x) (* x 2)) – and that
calling the saved continuation doesn't wipe out the current continuation, it
composes the saved continuation with the current one.
We're belaboring the point here because traditional Scheme continuations, as discussed in the next section, aren't composable, and are actually less expressive than continuations captured by prompts. But there's a place for them both.
Before moving on, we should mention that if the handler of a prompt is a
lambda expression, and the first argument isn't referenced, an abort to
that prompt will not cause a continuation to be reified. This can be an
important efficiency consideration to keep in mind.
There is a whole zoo of delimited control operators, and as it does not seem to be a bounded set, Guile implements support for them in a separate module:
(use-modules (ice-9 control))
Firstly, we have a helpful abbreviation for the call-with-prompt
operator.
Evaluate expr in a prompt, optionally specifying a tag and a handler. If no tag is given, the default prompt tag is used.
If no handler is given, a default handler is installed. The default handler accepts a procedure of one argument, which will be called on the captured continuation, within a prompt.
Sometimes it's easier just to show code, as in this case:
(define (default-prompt-handler k proc)
  (% (default-prompt-tag)
     (proc k)
     default-prompt-handler))
The % symbol is chosen because it looks like a prompt.
Likewise there is an abbreviation for abort-to-prompt, which
assumes the default prompt tag:
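The following sketch assumes that the abbreviation in question is the abort procedure exported by (ice-9 control), and that % accepts the two-argument form (% expr handler). The handlers shown are illustrative.

```scheme
(use-modules (ice-9 control))

;; The handler receives the captured continuation k and the value
;; passed to abort. Here k is discarded, so the prompt yields 40.
(define discarded
  (% (+ 1 (abort 20))
     (lambda (k v) (* 2 v))))

;; Here the handler resumes the computation: (k 40) finishes the
;; pending (+ 1 ...), so the prompt yields 41.
(define resumed
  (% (+ 1 (abort 20))
     (lambda (k v) (k (* 2 v)))))
```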
As mentioned before, (ice-9 control) also provides other
delimited control operators. This section is a bit technical, and
first-time users of delimited continuations should probably come back to
it after some practice with %.
Still here? So, when one implements a delimited control operator like
call-with-prompt, one needs to make two decisions. Firstly, does
the handler run within or outside the prompt? Having the handler run
within the prompt allows an abort inside the handler to return to the
same prompt handler, which is often useful. However it prevents tail
calls from the handler, so it is less general.
Similarly, does invoking a captured continuation reinstate a prompt? Again we have the tradeoff of convenience versus proper tail calls.
These decisions are captured in the Felleisen F operator. If
neither the continuations nor the handlers implicitly add a prompt, the
operator is known as –F–. This is the case for Guile's
call-with-prompt and abort-to-prompt.
If both continuation and handler implicitly add prompts, then the
operator is +F+. shift and reset are such
operators.
Establish a prompt, and evaluate body... within that prompt.
The prompt handler is designed to work with
shift, described below.
Abort to the nearest reset, and evaluate body... in a context in which the captured continuation is bound to cont.
As mentioned above, both the body... expression and invocations of cont implicitly establish a prompt.
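Two small sketches may help build intuition, assuming shift and reset are imported from (ice-9 control). In the first, k stands for "add 1 to the argument" and is applied twice; in the second, k is never called, so the pending multiplication is dropped.

```scheme
(use-modules (ice-9 control))

;; k behaves like (lambda (x) (+ 1 x)); applying it twice to 10 gives 12.
(define twice
  (reset (+ 1 (shift k (k (k 10))))))

;; The continuation (* 2 []) is captured but never invoked, so the
;; value of the shift body, 5, becomes the value of the reset.
(define dropped
  (reset (* 2 (shift k 5))))
```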
Interested readers are invited to explore Oleg Kiselyov's wonderful web site at http://okmij.org/ftp/, for more information on these operators.
A “continuation” is the code that will execute when a given function or expression returns. For example, consider
(define (foo)
(display "hello\n")
(display (bar)) (newline)
(exit))
The continuation from the call to bar comprises a
display of the value returned, a newline and an
exit. This can be expressed as a function of one argument.
(lambda (r)
(display r) (newline)
(exit))
In Scheme, continuations are represented as special procedures just like this. The special property is that when a continuation is called it abandons the current program location and jumps directly to that represented by the continuation.
A continuation is like a dynamic label, capturing at run-time a point in program execution, including all the nested calls that have led to it (or rather the code that will execute when those calls return).
Continuations are created with the following functions.
Capture the current continuation and call (proc cont) with it. The return value is the value returned by proc, or when (cont value) is later invoked, the return is the value passed.
Normally cont should be called with one argument, but when the location resumed is expecting multiple values (see Multiple Values) then they should be passed as multiple arguments, for instance (cont x y z).
cont may only be used from the same side of a continuation barrier as it was created (see Continuation Barriers), and in a multi-threaded program only from the thread in which it was created.
The call to proc is not part of the continuation captured, it runs only when the continuation is created. Often a program will want to store cont somewhere for later use; this can be done in proc.
The call in the name call-with-current-continuation refers to the way a call to proc gives the newly created continuation. It's not related to the way a call is used later to invoke that continuation.
call/cc is an alias for call-with-current-continuation. This is in common use since the latter is rather long.
Here is a simple example,
(define kont #f)
(format #t "the return is ~a\n"
(call/cc (lambda (k)
(set! kont k)
1)))
⇒ the return is 1
(kont 2)
⇒ the return is 2
call/cc captures a continuation in which the value returned is
going to be displayed by format. The lambda stores this
in kont and gives an initial return 1 which is
displayed. The later invocation of kont resumes the captured
point, but this time returning 2, which is displayed.
When Guile is run interactively, a call to format like this has
an implicit return back to the read-eval-print loop. call/cc
captures that like any other return, which is why interactively
kont will come back to read more input.
C programmers may note that
call/cc is like setjmp in
the way it records at runtime a point in program execution. A call to
a continuation is like a longjmp in that it abandons the
present location and goes to the recorded one. Like longjmp,
the value passed to the continuation is the value returned by
call/cc on resuming there. However longjmp can only go
up the program stack, but the continuation mechanism can go anywhere.
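The longjmp-like use, escaping upwards out of a nested call, can be sketched with a hypothetical first-even procedure that leaves for-each as soon as a match is found:

```scheme
(define (first-even lst)
  (call/cc
    (lambda (return)
      ;; Calling return abandons the for-each and makes its argument
      ;; the value of the whole call/cc expression.
      (for-each (lambda (x)
                  (if (even? x) (return x)))
                lst)
      #f)))

(first-even '(1 3 4 5 6))   ; ⇒ 4
(first-even '(1 3 5))       ; ⇒ #f
```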
When a continuation is invoked, call/cc and subsequent code
effectively “returns” a second time. It can be confusing to imagine
a function returning more times than it was called. It may help
instead to think of it being stealthily re-entered and then program
flow going on as normal.
dynamic-wind (see Dynamic Wind) can be used to ensure setup
and cleanup code is run when a program locus is resumed or abandoned
through the continuation mechanism.
Continuations are a powerful mechanism, and can be used to implement almost any sort of control structure, such as loops, coroutines, or exception handlers.
However the implementation of continuations in Guile is not as efficient as one might hope, because Guile is designed to cooperate with programs written in other languages, such as C, which do not know about continuations. Basically continuations are captured by a block copy of the stack, and resumed by copying back.
For this reason, continuations captured by call/cc should be used only
when there is no other simple way to achieve the desired result, or when the
elegance of the continuation mechanism outweighs the need for performance.
Escapes upwards from loops or nested functions are generally best handled with prompts (see Prompts). Coroutines can be efficiently implemented with cooperating threads (a thread holds a full program stack but doesn't copy it around the way continuations do).
Scheme allows a procedure to return more than one value to its caller. This is quite different to other languages which only allow single-value returns. Returning multiple values is different from returning a list (or pair or vector) of values to the caller, because conceptually not one compound object is returned, but several distinct values.
The primitive procedures for handling multiple values are values
and call-with-values. values is used for returning
multiple values from a procedure. This is done by placing a call to
values with zero or more arguments in tail position in a
procedure body. call-with-values combines a procedure returning
multiple values with a procedure which accepts these values as
parameters.
Delivers all of its arguments to its continuation. Except for continuations created by the call-with-values procedure, all continuations take exactly one value. The effect of passing no value or more than one value to continuations that were not created by call-with-values is unspecified.
For scm_values, args is a list of arguments and the return is a multiple-values object which the caller can return. In the current implementation that object shares structure with args, so args should not be modified subsequently.
Returns the value at the position specified by idx in values. Note that values will ordinarily be a multiple-values object, but it need not be. Any other object represents a single value (itself), and is handled appropriately.
Calls its producer argument with no values and a continuation that, when passed some values, calls the consumer procedure with those values as arguments. The continuation for the call to consumer is the continuation of the call to call-with-values.
(call-with-values (lambda () (values 4 5))
                  (lambda (a b) b))
⇒ 5
(call-with-values * -)
⇒ -1
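A slightly larger sketch shows a user-defined producer. The procedure div-and-mod is hypothetical; it returns two values from tail position, and the consumer collects them into a list.

```scheme
;; Producer: returns two values, the quotient and the remainder.
(define (div-and-mod n d)
  (values (quotient n d) (remainder n d)))

(define q-and-r
  (call-with-values
    (lambda () (div-and-mod 17 5))
    (lambda (q r) (list q r))))

q-and-r   ; ⇒ (3 2)
```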
In addition to the fundamental procedures described above, Guile has a
module which exports a syntax called receive, which is much
more convenient. This is in the (ice-9 receive) module and is the
same as specified by SRFI-8 (see SRFI-8).
(use-modules (ice-9 receive))
Evaluate the expression expr, and bind the result values (zero or more) to the formal arguments in formals. formals is a list of symbols, like the argument list in a lambda (see Lambda). After binding the variables, the expressions in body ... are evaluated in order; the return value is the result from the last expression.
For example, getting results from partition in SRFI-1 (see SRFI-1),
(receive (odds evens)
    (partition odd? '(7 4 2 8 3))
  (display odds)
  (display " and ")
  (display evens))
-| (7 3) and (4 2 8)
A common requirement in applications is to want to jump non-locally from the depths of a computation back to, say, the application's main processing loop. Usually, the place that is the target of the jump is somewhere in the calling stack of procedures that called the procedure that wants to jump back. For example, typical logic for a key press driven application might look something like this:
main-loop:
  read the next key press and call dispatch-key
dispatch-key:
  look up the key in a keymap and call an appropriate procedure,
  say find-file
find-file:
  interactively read the required file name, then call
  find-specified-file
find-specified-file:
  check whether file exists; if not, jump back to main-loop
  ...
The jump back to main-loop could be achieved by returning through
the stack one procedure at a time, using the return value of each
procedure to indicate the error condition, but Guile (like most modern
programming languages) provides an additional mechanism called
exception handling that can be used to implement such jumps much
more conveniently.
There are several variations on the terminology for dealing with non-local jumps. It is useful to be aware of them, and to realize that they all refer to the same basic mechanism.
Where signal and signalling are used, special care is needed to avoid the risk of confusion with POSIX signals.
This manual prefers to speak of throwing and catching exceptions, since this terminology matches the corresponding Guile primitives.
catch is used to set up a target for a possible non-local jump.
The arguments of a catch expression are a key, which
restricts the set of exceptions to which this catch applies, a
thunk that specifies the code to execute and one or two handler
procedures that say what to do if an exception is thrown while executing
the code. If the execution thunk executes normally, which means
without throwing any exceptions, the handler procedures are not called
at all.
When an exception is thrown using the throw function, the first
argument of the throw is a symbol that indicates the type of the
exception. For example, Guile throws an exception using the symbol
numerical-overflow to indicate numerical overflow errors such as
division by zero:
(/ 1 0)
⇒
ABORT: (numerical-overflow)
The key argument in a catch expression corresponds to this
symbol. key may be a specific symbol, such as
numerical-overflow, in which case the catch applies
specifically to exceptions of that type; or it may be #t, which
means that the catch applies to all exceptions, irrespective of
their type.
The second argument of a catch expression should be a thunk
(i.e. a procedure that accepts no arguments) that specifies the normal
case code. The catch is active for the execution of this thunk,
including any code called directly or indirectly by the thunk's body.
Evaluation of the catch expression activates the catch and then
calls this thunk.
The third argument of a catch expression is a handler procedure.
If an exception is thrown, this procedure is called with exactly the
arguments specified by the throw. Therefore, the handler
procedure must be designed to accept a number of arguments that
corresponds to the number of arguments in all throw expressions
that can be caught by this catch.
The fourth, optional argument of a catch expression is another
handler procedure, called the pre-unwind handler. It differs from
the third argument in that if an exception is thrown, it is called,
before the third argument handler, in exactly the dynamic context
of the throw expression that threw the exception. This means
that it is useful for capturing or displaying the stack at the point of
the throw, or for examining other aspects of the dynamic context,
such as fluid values, before the context is unwound back to that of the
prevailing catch.
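The three-argument form described above can be sketched as follows. The procedure name safe-div and the symbol undefined are assumptions made for this example only.

```scheme
;; Hypothetical helper: divide a by b, but return the symbol
;; 'undefined instead of propagating a division-by-zero exception.
(define (safe-div a b)
  (catch 'numerical-overflow
    (lambda () (/ a b))        ; thunk: the normal-case code
    (lambda (key . args)       ; handler: called only if a throw matches
      'undefined)))

(define ok  (safe-div 6 3))    ; thunk returns normally
(define bad (safe-div 1 0))    ; thunk throws numerical-overflow
```

In this sketch ok is 2, because the thunk returns normally and the handler is never called, while bad is the symbol undefined.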
Invoke thunk in the dynamic context of handler for exceptions matching key. If thunk throws to the symbol key, then handler is invoked this way:
(handler key args ...)
key is a symbol or #t.
thunk takes no arguments. If thunk returns normally, that is the return value of catch.
Handler is invoked outside the scope of its own catch. If handler again throws to the same key, a new handler from further up the call chain is invoked.
If the key is #t, then a throw to any symbol will match this call to catch.
If a pre-unwind-handler is given and thunk throws an exception that matches key, Guile calls the pre-unwind-handler before unwinding the dynamic state and invoking the main handler. pre-unwind-handler should be a procedure with the same signature as handler, that is (lambda (key . args)). It is typically used to save the stack at the point where the exception occurred, but can also query other parts of the dynamic state at that point, such as fluid values.
A pre-unwind-handler can exit either normally or non-locally. If it exits normally, Guile unwinds the stack and dynamic context and then calls the normal (third argument) handler. If it exits non-locally, that exit determines the continuation.
If a handler procedure needs to match a variety of throw
expressions with varying numbers of arguments, you should write it like
this:
(lambda (key . args)
...)
The key argument is guaranteed always to be present, because a
throw without a key is not valid. The number and
interpretation of the args varies from one type of exception to
another, but should be specified by the documentation for each exception
type.
Note that, once the normal (post-unwind) handler procedure is invoked, the catch that led to the handler procedure being called is no longer active. Therefore, if the handler procedure itself throws an exception, that exception can only be caught by another active catch higher up the call stack, if there is one.
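The ordering of the two handlers, and the fact that a catch is no longer active inside its own normal handler, can be sketched as follows. The key my-error and the helper note! are hypothetical names used only for illustration.

```scheme
(define events '())
(define (note! x) (set! events (cons x events)))

(define result
  (catch #t                              ; outer catch: matches any key
    (lambda ()
      (catch 'my-error
        (lambda () (throw 'my-error 1 2))
        ;; Normal handler: runs after unwinding.  The inner catch is no
        ;; longer active here, so this rethrow reaches the outer catch.
        (lambda (key . args)
          (note! 'handler)
          (apply throw key args))
        ;; Pre-unwind handler: runs first, in the context of the throw.
        (lambda (key . args)
          (note! 'pre-unwind))))
    (lambda (key . args)
      (note! 'outer)
      (list key args))))
```

After this runs, (reverse events) is (pre-unwind handler outer) and result is (my-error (1 2)).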
The above scm_catch_with_pre_unwind_handler and scm_catch take Scheme procedures as body and handler arguments. scm_c_catch and scm_internal_catch are equivalents taking C functions.
body is called as body (body_data) with a catch on exceptions of the given tag type. If an exception is caught, pre_unwind_handler and handler are called as handler (handler_data, key, args). key and args are the SCM key and argument list from the throw.
body and handler should have the following prototypes. scm_t_catch_body and scm_t_catch_handler are pointer typedefs for these.
SCM body (void *data);
SCM handler (void *data, SCM key, SCM args);
The body_data and handler_data parameters are passed to the respective calls so an application can communicate extra information to those functions.
If the data consists of an SCM object, care should be taken that it isn't garbage collected while still required. If the SCM is a local C variable, one way to protect it is to pass a pointer to that variable as the data parameter, since the C compiler will then know the value must be held on the stack. Another way is to use scm_remember_upto_here_1 (see Remembering During Operations).
It's sometimes useful to be able to intercept an exception that is being
thrown before the stack is unwound. This could be to clean up some
related state, to print a backtrace, or to pass information about the
exception to a debugger, for example. The with-throw-handler
procedure provides a way to do this.
Add handler to the dynamic context as a throw handler for key key, then invoke thunk.
This behaves exactly like catch, except that it does not unwind the stack before invoking handler. If the handler procedure returns normally, Guile rethrows the same exception again to the next innermost catch or throw handler. handler may exit non-locally, of course, via an explicit throw or via invoking a continuation.
Typically handler is used to display a backtrace of the stack at
the point where the corresponding throw occurred, or to save off
this information for possible display later.
Not unwinding the stack means that throwing an exception that is handled
via a throw handler is equivalent to calling the throw handler handler
inline instead of each throw, and then omitting the surrounding
with-throw-handler. In other words,
(with-throw-handler 'key
  (lambda () ... (throw 'key args ...) ...)
  handler)
is mostly equivalent to
((lambda () ... (handler 'key args ...) ...))
In particular, the dynamic context when handler is invoked is that
of the site where throw is called. The examples are not quite
equivalent, because the body of a with-throw-handler is not in
tail position with respect to the with-throw-handler, and if
handler exits normally, Guile arranges to rethrow the error, but
hopefully the intention is clear. (For an introduction to what is meant
by dynamic context, see Dynamic Wind.)
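A minimal sketch of a throw handler that records an exception and then lets it continue to an enclosing catch is shown here. The key my-error and the variable seen are hypothetical names chosen for the example.

```scheme
(define seen '())

(define result
  (catch 'my-error
    (lambda ()
      (with-throw-handler 'my-error
        (lambda () (throw 'my-error 42))
        ;; Runs in the dynamic context of the throw.  Returning normally
        ;; lets the exception continue to the enclosing catch.
        (lambda (key . args)
          (set! seen (cons (cons key args) seen)))))
    (lambda (key . args)
      (list 'caught key args))))
```

Here seen ends up as ((my-error 42)) and result is (caught my-error (42)): the throw handler observed the exception, and the surrounding catch still handled it.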
The above scm_with_throw_handler takes Scheme procedures as body (thunk) and handler arguments. scm_c_with_throw_handler is an equivalent taking C functions. See scm_c_catch (see Catch) for a description of the parameters; the behaviour, however, follows with-throw-handler.
If thunk throws an exception, Guile handles that exception by
invoking the innermost catch or throw handler whose key matches
that of the exception. When the innermost thing is a throw handler,
Guile calls the specified handler procedure using (apply
handler key args). The handler procedure may either return
normally or exit non-locally. If it returns normally, Guile passes the
exception on to the next innermost catch or throw handler. If it
exits non-locally, that exit determines the continuation.
The behaviour of a throw handler is very similar to that of a
catch expression's optional pre-unwind handler. In particular, a
throw handler's handler procedure is invoked in the exact dynamic
context of the throw expression, just as a pre-unwind handler is.
with-throw-handler may be seen as a half-catch: it does
everything that a catch would do until the point where
catch would start unwinding the stack and dynamic context, but
then it rethrows to the next innermost catch or throw handler
instead.
Note also that since the dynamic context is not unwound, if a
with-throw-handler handler throws to a key that does not match
the with-throw-handler expression's key, the new throw may
be handled by a catch or throw handler that is closer to
the throw than the first with-throw-handler.
Here is an example to illustrate this behavior:
(catch 'a
  (lambda ()
    (with-throw-handler 'b
      (lambda ()
        (catch 'a
          (lambda ()
            (throw 'b))
          inner-handler))
      (lambda (key . args)
        (throw 'a))))
  outer-handler)
This code will call inner-handler and then continue with the
continuation of the inner catch.
The throw primitive is used to throw an exception. One argument,
the key, is mandatory, and must be a symbol; it indicates the type
of exception that is being thrown. Following the key,
throw accepts any number of additional arguments, whose meaning
depends on the exception type. The documentation for each possible type
of exception should specify the additional arguments that are expected
for that kind of exception.
Invoke the catch form matching key, passing args to the handler.
key is a symbol. It will match catches of the same symbol or of #t.
If there is no handler at all, Guile prints an error and then exits.
When an exception is thrown, it will be caught by the innermost
catch or throw handler that applies to the type of the thrown
exception; in other words, whose key is either #t or the
same symbol as that used in the throw expression. Once Guile has
identified the appropriate catch or throw handler, it handles the
exception by applying the relevant handler procedure(s) to the arguments
of the throw.
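The selection of the innermost applicable catch can be sketched as follows. The procedure classify and the keys parse-error and io-error are hypothetical; they are not exception types defined by Guile.

```scheme
;; Hypothetical helper: run thunk under two nested catches and report
;; which one, if any, handled a throw.
(define (classify thunk)
  (catch #t                          ; outermost: matches every key
    (lambda ()
      (catch 'parse-error            ; innermost: matches only 'parse-error
        thunk
        (lambda (key . args) (list 'inner key args))))
    (lambda (key . args) (list 'outer key args))))

(define a (classify (lambda () (throw 'parse-error "bad token" 7))))
(define b (classify (lambda () (throw 'io-error "disk full"))))
(define c (classify (lambda () 'no-throw)))
```

In this sketch a is (inner parse-error ("bad token" 7)), b is (outer io-error ("disk full")) because the inner key does not match, and c is the symbol no-throw because no handler is called at all.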
If there is no appropriate catch or throw handler for a thrown
exception, Guile prints an error to the current error port indicating an
uncaught exception, and then exits. In practice, it is quite difficult
to observe this behaviour, because Guile when used interactively
installs a top level catch handler that will catch all exceptions
and print an appropriate error message without exiting. For
example, this is what happens if you try to throw an unhandled exception
in the standard Guile REPL; note that Guile's command loop continues
after the error message:
guile> (throw 'badex)
<unnamed port>:3:1: In procedure gsubr-apply ...
<unnamed port>:3:1: unhandled-exception: badex
ABORT: (misc-error)
guile>
The default uncaught exception behaviour can be observed by evaluating a
throw expression from the shell command line:
$ guile -c "(begin (throw 'badex) (display \"here\\n\"))"
guile: uncaught throw to badex: ()
$
That Guile exits immediately following the uncaught exception
is shown by the absence of any output from the display
expression, because Guile never gets to the point of evaluating that
expression.
It is traditional in Scheme to implement exception systems using
call-with-current-continuation. Continuations
(see Continuations) are such a powerful concept that any other
control mechanism — including catch and throw — can be
implemented in terms of them.
Guile does not implement catch and throw like this,
though. Why not? Because Guile is specifically designed to be easy to
integrate with applications written in C. In a mixed Scheme/C
environment, the concept of continuation must logically include
“what happens next” in the C parts of the application as well as the
Scheme parts, and it turns out that the only reasonable way of
implementing continuations like this is to save and restore the complete
C stack.
So Guile's implementation of call-with-current-continuation is a
stack copying one. This allows it to interact well with ordinary C
code, but means that creating and calling a continuation is slowed down
by the time that it takes to copy the C stack.
The more targeted mechanism provided by catch and throw
does not need to save and restore the C stack because the throw
always jumps to a location higher up the stack of the code that executes
the throw. Therefore Guile implements the catch and
throw primitives independently of
call-with-current-continuation, in a way that takes advantage of
this upwards only nature of exceptions.
Guile provides a set of convenience procedures for signaling error conditions that are implemented on top of the exception primitives just described.
Raise an error with key misc-error and a message constructed by displaying msg and writing args.
Raise an error with key key. subr can be a string naming the procedure associated with the error, or #f. message is the error message string, possibly containing ~S and ~A escapes. When an error is reported, these are replaced by formatting the corresponding members of args: ~A (was %s in older versions of Guile) formats using display and ~S (was %S) formats using write. data is a list or #f depending on key: if key is system-error then it should be a list containing the Unix errno value; if key is signal then it should be a list containing the Unix signal number; if key is out-of-range or wrong-type-arg, it is a list containing the bad value; otherwise it will usually be #f.
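The following sketch shows how a handler might receive the arguments supplied to scm-error and expand the message itself, and that error throws with the key misc-error. The procedure name "my-proc" and the message texts are made up for the example.

```scheme
;; Throw with scm-error and rebuild the message in the handler.
(define report
  (catch 'out-of-range
    (lambda ()
      (scm-error 'out-of-range "my-proc"     ; "my-proc" is a made-up name
                 "Value out of range: ~S" (list 99) (list 99)))
    (lambda (key subr message args data)
      (list key subr (apply simple-format #f message args) data))))

;; error throws with the key misc-error.
(define error-key
  (catch #t
    (lambda () (error "Something went wrong:" 'details))
    (lambda (key . rest) key)))
```

Here report is (out-of-range "my-proc" "Value out of range: 99" (99)) and error-key is misc-error.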
Return the Unix error message corresponding to err, an integer errno value.
When setlocale has been called (see Locales), the message is in the language and charset of LC_MESSAGES. (This is done by the C library.)
Returns the result of evaluating its argument; however if an exception occurs then #f is returned instead.
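A short sketch of false-if-exception follows; the variable names are arbitrary.

```scheme
;; No exception: the value of the expression is returned.
(define good (false-if-exception (string->number "12")))

;; car of the empty list throws, so #f is returned instead.
(define bad (false-if-exception (car '())))
```

Here good is 12 and bad is #f. Note that a result of #f is ambiguous when the expression itself can legitimately return #f.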
For Scheme code, the fundamental procedure to react to non-local entry
and exits of dynamic contexts is dynamic-wind. C code could
use scm_internal_dynamic_wind, but since C does not allow the
convenient construction of anonymous procedures that close over
lexical variables, this will be, well, inconvenient.
Therefore, Guile offers the functions scm_dynwind_begin and
scm_dynwind_end to delimit a dynamic extent. Within this
dynamic extent, which is called a dynwind context, you can
perform various dynwind actions that control what happens when
the dynwind context is entered or left. For example, you can register
a cleanup routine with scm_dynwind_unwind_handler that is
executed when the context is left. There are several other more
specialized dynwind actions as well, for example to temporarily block
the execution of asyncs or to temporarily change the current output
port. They are described elsewhere in this manual.
Here is an example that shows how to prevent memory leaks.
/* Suppose there is a function called FOO in some library that you
   would like to make available to Scheme code (or to C code that
   follows the Scheme conventions).
   FOO takes two C strings and returns a new string.  When an error has
   occurred in FOO, it returns NULL.
*/
char *foo (char *s1, char *s2);
/* SCM_FOO interfaces the C function FOO to the Scheme way of life.
   It takes care to free up all temporary strings in the case of
   non-local exits.
*/
SCM
scm_foo (SCM s1, SCM s2)
{
  char *c_s1, *c_s2, *c_res;
  scm_dynwind_begin (0);
  c_s1 = scm_to_locale_string (s1);
  /* Call 'free (c_s1)' when the dynwind context is left.
  */
  scm_dynwind_unwind_handler (free, c_s1, SCM_F_WIND_EXPLICITLY);
  c_s2 = scm_to_locale_string (s2);
  /* Same as above, but more concisely.
  */
  scm_dynwind_free (c_s2);
  c_res = foo (c_s1, c_s2);
  if (c_res == NULL)
    scm_memory_error ("foo");
  scm_dynwind_end ();
  /* Hand the result string over to Guile. */
  return scm_take_locale_string (c_res);
}
All three arguments must be 0-argument procedures. in_guard is called, then thunk, then out_guard.
If, any time during the execution of thunk, the dynamic extent of the dynamic-wind expression is escaped non-locally, out_guard is called. If the dynamic extent of the dynamic-wind is re-entered, in_guard is called. Thus in_guard and out_guard may be called any number of times.
(define x 'normal-binding)
⇒ x
(define a-cont
  (call-with-current-continuation
   (lambda (escape)
     (let ((old-x x))
       (dynamic-wind
           ;; in-guard:
           ;;
           (lambda () (set! x 'special-binding))
           ;; thunk
           ;;
           (lambda () (display x) (newline)
                      (call-with-current-continuation escape)
                      (display x) (newline)
                      x)
           ;; out-guard:
           ;;
           (lambda () (set! x old-x)))))))
;; Prints:
special-binding
;; Evaluates to:
⇒ a-cont
x
⇒ normal-binding
(a-cont #f)
;; Prints:
special-binding
;; Evaluates to:
⇒ a-cont  ;; the value of the (define a-cont...)
x
⇒ normal-binding
a-cont
⇒ special-binding
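The guards also run when the dynamic extent is left by a throw rather than by invoking a continuation. The following is a minimal sketch of that case; the key bail and the helper note! are hypothetical names used only here.

```scheme
(define events '())
(define (note! x) (set! events (cons x events)))

(catch 'bail
  (lambda ()
    (dynamic-wind
      (lambda () (note! 'in))              ; in-guard
      (lambda ()
        (note! 'body)
        (throw 'bail)                      ; non-local exit
        (note! 'unreached))
      (lambda () (note! 'out))))           ; out-guard
  (lambda (key . args)
    (note! 'handler)))

(define order (reverse events))
```

Here order is (in body out handler): the out-guard runs while the stack is unwound, before the catch handler is called.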
This is an enumeration of several flags that modify the behavior of scm_dynwind_begin. The flags are listed in the following table.
SCM_F_DYNWIND_REWINDABLE: The dynamic context is rewindable. This means that it can be reentered non-locally (via the invocation of a continuation). The default is that a dynwind context cannot be reentered non-locally.
The function scm_dynwind_begin starts a new dynamic context and makes it the `current' one.
The flags argument determines the default behavior of the context. Normally, use 0. This will result in a context that cannot be reentered with a captured continuation. When you are prepared to handle reentries, include SCM_F_DYNWIND_REWINDABLE in flags.
Being prepared for reentry means that the effects of unwind handlers can be undone on reentry. In the example above, we want to prevent a memory leak on non-local exit and thus register an unwind handler that frees the memory. But once the memory is freed, we cannot get it back on reentry. Thus reentry cannot be allowed.
The consequence is that continuations become less useful when non-reentrant contexts are captured, but you don't need to worry about that too much.
The context is ended either implicitly when a non-local exit happens, or explicitly with scm_dynwind_end. You must make sure that a dynwind context is indeed ended properly. If you fail to call scm_dynwind_end for each scm_dynwind_begin, the behavior is undefined.
End the current dynamic context explicitly and make the previous one current.
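This is an enumeration of several flags that modify the behavior of scm_dynwind_unwind_handler and scm_dynwind_rewind_handler. The flags are listed in the following table.
SCM_F_WIND_EXPLICITLY: The registered action is also carried out when the dynwind context is entered or left locally.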
Arranges for func to be called with data as its argument when the current context ends implicitly. If flags contains SCM_F_WIND_EXPLICITLY, func is also called when the context ends explicitly with scm_dynwind_end.
The function scm_dynwind_unwind_handler_with_scm takes care that data is protected from garbage collection.
Arrange for func to be called with data as its argument when the current context is restarted by rewinding the stack. When flags contains SCM_F_WIND_EXPLICITLY, func is called immediately as well.
The function scm_dynwind_rewind_handler_with_scm takes care that data is protected from garbage collection.
Arrange for mem to be freed automatically whenever the current context is exited, whether normally or non-locally. scm_dynwind_free (mem) is an equivalent shorthand for scm_dynwind_unwind_handler (free, mem, SCM_F_WIND_EXPLICITLY).
Error handling is based on catch and throw. Errors are
always thrown with a key and four arguments:
subr: the name of the procedure from which the error is thrown, or
#f.
message: a string (possibly language and system dependent) describing the
error. The tokens ~A and ~S can be
embedded within the message: they will be replaced with members of the
args list when the message is printed. ~A indicates an
argument printed using display, while ~S indicates an
argument printed using write. message can also be
#f, to allow it to be derived from the key by the error
handler (may be useful if the key is to be thrown from both C and
Scheme).
args: a list of arguments to be used to expand ~A and
~S tokens in message. Can also be #f if no
arguments are required.
rest: a list of any additional objects required, e.g., when the key is
'system-error, this contains the C errno value. Can also
be #f if no additional objects are required.
In addition to catch and throw, the following Scheme
facilities are available:
Display an error message to the output port port. frame is the frame in which the error occurred, subr is the name of the procedure in which the error occurred and message is the actual error message, which may contain formatting instructions. These will format the arguments in the list args accordingly. rest is currently ignored.
The following are the error keys defined by libguile and the situations in which they are used:
error-signal: thrown after receiving an unhandled fatal signal
such as SIGSEGV, SIGBUS, SIGFPE etc. The rest argument in the throw
contains the coded signal number (at present this is not the same as the
usual Unix signal number).
system-error: thrown after the operating system indicates an
error condition. The rest argument in the throw contains the
errno value.
numerical-overflow: numerical overflow.
out-of-range: the arguments to a procedure do not fall within the
accepted domain.
wrong-type-arg: an argument to a procedure has the wrong type.
wrong-number-of-args: a procedure was called with the wrong number
of arguments.
memory-allocation-error: memory allocation error.
stack-overflow: stack overflow error.
regular-expression-syntax: errors generated by the regular
expression library.
misc-error: other errors.
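Several of these keys can be observed from Scheme with a catch on #t. The helper error-key-of below is a hypothetical name introduced for this sketch; it returns the key of whatever exception a thunk throws, or the symbol no-error.

```scheme
;; Hypothetical helper: report the key thrown by THUNK, if any.
(define (error-key-of thunk)
  (catch #t
    (lambda () (thunk) 'no-error)
    (lambda (key . args) key)))

(define k1 (error-key-of (lambda () (/ 1 0))))                       ; division by zero
(define k2 (error-key-of (lambda () (car 5))))                       ; not a pair
(define k3 (error-key-of (lambda () (vector-ref (vector 'a 'b) 10)))) ; bad index
(define k4 (error-key-of (lambda () (error "custom failure"))))      ; generic error
(define k5 (error-key-of (lambda () (+ 1 2))))                       ; no throw
```

In this sketch k1 through k5 are numerical-overflow, wrong-type-arg, out-of-range, misc-error and no-error respectively.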
In the following C functions, SUBR and MESSAGE parameters
can be NULL to give the effect of #f described above.
Throw an error, as per scm-error (see Error Reporting).
Throw an error with key system-error and supply errno in the rest argument. For scm_syserror the message is generated using strerror.
Care should be taken that any code in between the failing operation and the call to these routines doesn't change errno.
Throw an error with the various keys described above.
C Function: void scm_misc_error (const char *subr, const char *message, SCM args)
In scm_wrong_num_args, proc should be a Scheme symbol which is the name of the procedure incorrectly invoked. The other routines take the name of the invoked procedure as a C string.
In scm_wrong_type_arg_msg, expected is a C string describing the type of argument that was expected.
In scm_misc_error, message is the error message string, possibly containing simple-format escapes (see Writing), and the corresponding arguments in the args list.
Every function visible at the Scheme level should aggressively check the types of its arguments, to avoid misinterpreting a value, and perhaps causing a segmentation fault. Guile provides some macros to make this easier.
If test is zero, signal a “wrong type argument” error, attributed to the subroutine named subr, operating on the value obj, which is the position'th argument of subr.
In SCM_ASSERT_TYPE, expected is a C string describing the type of argument that was expected.
One of the above values can be used for position to indicate the number of the argument of subr which is being checked. Alternatively, a positive integer can be used, which makes it possible to check arguments after the seventh. However, for parameter numbers up to seven it is preferable to use SCM_ARGN instead of the corresponding raw number, since it will make the code easier to understand.
Passing a value of zero or SCM_ARGn for position leaves it unspecified which argument's type is incorrect. Again, SCM_ARGn should be preferred over a raw zero constant.
The non-local flow of control caused by continuations might sometimes
not be wanted. You can use with-continuation-barrier to erect
fences that continuations can not pass.
Call proc and return its result. Do not allow the invocation of continuations that would leave or enter the dynamic extent of the call to with-continuation-barrier. Such an attempt causes an error to be signaled.
Throws (such as errors) that are not caught from within proc are caught by with-continuation-barrier. In that case, a short message is printed to the current error port and #f is returned.
Thus, with-continuation-barrier returns exactly once.
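Both outcomes described above can be sketched as follows. The key oops is a hypothetical name; the exact wording of the message printed to the current error port is not shown because it may vary.

```scheme
;; proc returns normally: its result is returned.
(define normal
  (with-continuation-barrier (lambda () (+ 1 2))))

;; proc throws and nothing inside catches it: a message goes to the
;; current error port and #f is returned.
(define failed
  (with-continuation-barrier (lambda () (throw 'oops 'details))))
```

Here normal is 3 and failed is #f.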
Like scm_with_continuation_barrier but call func on data. When an error is caught, NULL is returned.
Sequential input/output in Scheme is represented by operations on a port. This chapter explains the operations that Guile provides for working with ports.
Ports are created by opening, for instance open-file for a file
(see File Ports). Characters can be read from an input port and
written to an output port, or both on an input/output port. A port
can be closed (see Closing) when no longer required, after which
any attempt to read or write is an error.
The formal definition of a port is very generic: an input port is simply “an object which can deliver characters on demand,” and an output port is “an object which can accept characters.” Because this definition is so loose, it is easy to write functions that simulate ports in software. Soft ports and string ports are two interesting and powerful examples of this technique. (see Soft Ports, and String Ports.)
Ports are garbage collected in the usual way (see Memory Management), and will be closed at that time if not already closed. In this case any errors occurring in the close will not be reported. Usually a program will want to explicitly close so as to be sure all its operations have been successful. Of course if a program has abandoned something due to an error or other condition then closing problems are probably not of interest.
It is strongly recommended that file ports be closed explicitly when no longer required. Most systems have limits on how many files can be open, both on a per-process and a system-wide basis. A program that uses many files should take care not to hit those limits. The same applies to similar system resources such as pipes and sockets.
Note that automatic garbage collection is triggered only by memory
consumption, not by file or other resource usage, so a program cannot
rely on that to keep it away from system limits. An explicit call to
gc can of course be relied on to pick up unreferenced ports.
If program flow makes it hard to be certain when to close then this
may be an acceptable way to control resource usage.
All file access uses the “LFS” large file support functions when available, so files bigger than 2 Gbytes (2^31 bytes) can be read and written on a 32-bit system.
Each port has an associated character encoding that controls how bytes read from the port are converted to characters and strings, and how characters and strings written to the port are converted to bytes. When ports are created, they inherit their character encoding from the current locale, but this can be changed after the port is created.
Currently, the ports only work with non-modal encodings. Most encodings are non-modal, meaning that the conversion of bytes to a string doesn't depend on its context: the same byte sequence will always return the same string. A couple of modal encodings are in common use, like ISO-2022-JP and ISO-2022-KR, and they are not yet supported.
Each port also has an associated conversion strategy: what to do when a Guile character can't be converted to the port's encoded character representation for output. There are three possible strategies: to raise an error, to replace the character with a hex escape, or to replace the character with a substitute character.
Return #t if x is an input port, otherwise return #f. Any object satisfying this predicate also satisfies port?.
Return #t if x is an output port, otherwise return #f. Any object satisfying this predicate also satisfies port?.
Return a boolean indicating whether x is a port. Equivalent to (or (input-port? x) (output-port? x)).
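The three predicates can be tried on a string port (see String Ports), as in this small sketch; the variable names are arbitrary.

```scheme
(define in-port (open-input-string "hello"))

(define checks
  (list (input-port? in-port)     ; an input string port
        (output-port? in-port)    ; it cannot be written to
        (port? in-port)
        (port? "hello")))         ; a string is not a port
```

Here checks is (#t #f #t #f).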
Sets the character encoding that will be used to interpret all port I/O. enc is a string containing the name of an encoding. Valid encoding names are those defined by IANA.
A fluid containing #f or the name of the encoding to be used by default for newly created ports (see Fluids and Dynamic States). The value #f is equivalent to "ISO-8859-1".
New ports are created with the encoding appropriate for the current locale if setlocale has been called or the value specified by this fluid otherwise.
Returns, as a string, the character encoding that port uses to interpret its input and output. The value #f is equivalent to "ISO-8859-1".
Sets the behavior of the interpreter when outputting a character that is not representable in the port's current encoding. sym can be either 'error, 'substitute, or 'escape. If it is 'error, an error will be thrown when a nonconvertible character is encountered. If it is 'substitute, then nonconvertible characters will be replaced with approximate characters, or with question marks if no approximately correct character is available. If it is 'escape, it will appear as a hex escape when output.
If port is an open port, the conversion error behavior is set for that port. If it is #f, it is set as the default behavior for any future ports that get created in this thread.
Returns the behavior of the port when outputting a character that is not representable in the port's current encoding. It returns the symbol error if unrepresentable characters should cause exceptions, substitute if the port should try to replace unrepresentable characters with question marks or approximate characters, or escape if unrepresentable characters should be converted to string escapes.
If port is #f, then the current default behavior will be returned. New ports will have this default behavior when they are created.
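As a minimal sketch, the encoding and conversion strategy of a single port can be set and read back like this. A string port is used only to keep the example self-contained.

```scheme
(define p (open-output-string))

(set-port-encoding! p "UTF-8")
(set-port-conversion-strategy! p 'substitute)

(define enc      (port-encoding p))
(define strategy (port-conversion-strategy p))
```

Here enc is "UTF-8" and strategy is the symbol substitute. Passing #f instead of a port to the two conversion-strategy procedures affects or reports the default used for ports created later.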
[Generic procedures for reading from ports.]
These procedures pertain to reading characters and strings from ports. To read general S-expressions from ports, See Scheme Read.
Return #t if x is an end-of-file object; otherwise return #f.
Return #t if a character is ready on input port and return #f otherwise. If char-ready? returns #t then the next read-char operation on port is guaranteed not to hang. If port is a file port at end of file then char-ready? returns #t.
char-ready? exists to make it possible for a program to accept characters from interactive ports without getting stuck waiting for input. Any input editors associated with such ports must make sure that characters whose existence has been asserted by char-ready? cannot be rubbed out. If char-ready? were to return #f at end of file, a port at end of file would be indistinguishable from an interactive port that has no ready characters.
Return the next character available from port, updating port to point to the following character. If no more characters are available, the end-of-file object is returned.
When port's data cannot be decoded according to its character encoding, a decoding-error is raised and port points past the erroneous byte sequence.
Read up to size bytes from port and store them in buffer. The return value is the number of bytes actually read, which can be less than size if end-of-file has been reached.
Note that this function does not update port-line and port-column below.
Return the next character available from port, without updating port to point to the following character. If no more characters are available, the end-of-file object is returned.
The value returned by a call to peek-char is the same as the value that would have been returned by a call to read-char on the same port. The only difference is that the very next call to read-char or peek-char on that port will return the value returned by the preceding call to peek-char. In particular, a call to peek-char on an interactive port will hang waiting for input whenever a call to read-char would have hung.
As with read-char, a decoding-error may be raised if port's data cannot be decoded. However, unlike with read-char, port still points at the beginning of the erroneous byte sequence when the error is raised.
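As a small sketch of the difference between the two procedures, the following uses a string port (an assumption made only to keep the example self-contained); the variable names are illustrative.

```scheme
(define p (open-input-string "xy"))

(define peeked (peek-char p))  ; looks at #\x without consuming it
(define c1 (read-char p))      ; returns the same #\x and advances
(define c2 (read-char p))      ; now returns #\y

;; (list peeked c1 c2) ⇒ (#\x #\x #\y)
```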
Place char in port so that it will be read by the next read operation. If called multiple times, the unread characters will be read again in last-in first-out order. If port is not supplied, the current input port is used.
Place the string str in port so that its characters will be read from left-to-right as the next characters from port during subsequent read operations. If called multiple times, the unread characters will be read again in last-in first-out order. If port is not supplied, the current-input-port is used.
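The last-in first-out behaviour can be illustrated with a short sketch. It assumes the two procedures described above are unread-char and unread-string, and it uses a hypothetical helper, read-chars, to read a fixed number of characters.

```scheme
;; Hypothetical helper: read N characters from PORT into a string.
(define (read-chars port n)
  (let loop ((i 0) (acc '()))
    (if (= i n)
        (list->string (reverse acc))
        (loop (+ i 1) (cons (read-char port) acc)))))

(define p (open-input-string "d"))
(unread-char #\c p)      ; pushed first, so read after "ab"
(unread-string "ab" p)   ; pushed last, so read first, left to right

(define result (read-chars p 4))
;; result ⇒ "abcd"
```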
This procedure clears a port's input buffers, similar to the way that force-output clears the output buffer. The contents of the buffers are returned as a single string, e.g.,
(define p (open-input-file ...))
(drain-input p) => empty string, nothing buffered yet.
(unread-char (read-char p) p)
(drain-input p) => initial chars from p, up to the buffer size.
Draining the buffers may be useful for cleanly finishing buffered I/O so that the file descriptor can be used directly for further input.
Return the current column number or line number of port. If the number is unknown, the result is #f. Otherwise, the result is a 0-origin integer - i.e. the first character of the first line is line 0, column 0. (However, when you display a file position, for example in an error message, we recommend you add 1 to get 1-origin integers. This is because lines and column numbers traditionally start with 1, and that is what non-programmers will find most natural.)
Set the current column or line number of port.
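A brief sketch of the 0-origin convention, and of the recommended conversion to 1-origin numbers for display, might look like this. The string port and the variable names are assumptions for illustration only.

```scheme
(define p (open-input-string "ab\ncd"))

;; Consume "ab\n" so the port sits at the start of the second line.
(read-char p)
(read-char p)
(read-char p)

(define line (port-line p))      ; 0-origin: second line is 1
(define column (port-column p))  ; 0-origin: first column is 0

;; Add 1 to each when showing the position to a user.
(define shown (simple-format #f "~A:~A" (+ line 1) (+ column 1)))
;; shown ⇒ "2:1"
```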
[Generic procedures for writing to ports.]
These procedures are for writing characters and strings to ports. For more information on writing arbitrary Scheme objects to ports, see Scheme Write.
Return the print state of the port port. If port has no associated print state, #f is returned.
Send a newline to port. If port is omitted, send to the current output port.
Create a new port which behaves like port, but with an included print state pstate. pstate is optional. If pstate isn't supplied and port already has a print state, the old print state is reused.
Write message to destination, defaulting to the current output port. message can contain ~A (was %s) and ~S (was %S) escapes. When printed, the escapes are replaced with corresponding members of ARGS: ~A formats using display and ~S formats using write. If destination is #t, then use the current output port; if destination is #f, then return a string containing the formatted text. Does not add a trailing newline.
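The difference between the two escapes can be seen in a one-line sketch, assuming the procedure described above is simple-format and passing #f as the destination so that a string is returned.

```scheme
;; ~A prints like display (no quotes); ~S prints like write (quoted).
(define text (simple-format #f "~A vs ~S" "hi" "hi"))
;; text ⇒ "hi vs \"hi\""
```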
Send character chr to port.
Write size bytes at buffer to port.
Note that this function does not update port-line and port-column (see Reading).
Flush the specified output port, or the current output port if port is omitted. The current output buffer contents are passed to the underlying port implementation (e.g., in the case of fports, the data will be written to the file and the output buffer will be cleared.) It has no effect on an unbuffered port.
The return value is unspecified.
Equivalent to calling force-output on all open output ports. The return value is unspecified.
Close the specified port object. Return #t if it successfully closes a port or #f if it was already closed. An exception may be raised if an error occurs, for example when flushing buffered output. See also close, for a procedure which can close file descriptors.
Close the specified input or output port. An exception may be raised if an error occurs while closing. If port is already closed, nothing is done. The return value is unspecified.
See also close, for a procedure which can close file descriptors.
Return #t if port is closed or #f if it is open.
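A minimal sketch of closing a port and checking its state follows. It assumes the procedures described above are close-port and port-closed?, and uses a string port purely for convenience.

```scheme
(define p (open-input-string "data"))

(define before (port-closed? p))  ; the port is still open
(close-port p)
(define after (port-closed? p))   ; the port is now closed

;; before ⇒ #f, after ⇒ #t
```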
Sets the current position of fd/port to the integer offset, which is interpreted according to the value of whence.
One of the following variables should be supplied for whence:
SEEK_SET - Seek from the beginning of the file.
SEEK_CUR - Seek from the current position.
SEEK_END - Seek from the end of the file.
If fd/port is a file descriptor, the underlying system call is lseek. port may be a string port.
The value returned is the new position in the file. This means that the current position of a port can be obtained using:
(seek port 0 SEEK_CUR)
Return an integer representing the current position of fd/port, measured from the beginning. Equivalent to:
(seek port 0 SEEK_CUR)
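Since a string port may be used with seek, positioning can be sketched without touching the file system. The example below is illustrative only; it assumes an ASCII-only string so that character and byte offsets coincide.

```scheme
(define p (open-input-string "hello world"))

;; Jump to offset 6, measured from the beginning of the port.
(seek p 6 SEEK_SET)
(define c (read-char p))   ; the character at offset 6
;; c ⇒ #\w

;; ftell reports the position reached after the read.
(define pos (ftell p))
```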
Truncate file to length bytes. file can be a filename string, a port object, or an integer file descriptor. The return value is unspecified.
For a port or file descriptor length can be omitted, in which case the file is truncated at the current position (per ftell above).
On most systems a file can be extended by giving a length greater than the current size, but this is not mandatory in the POSIX standard.
The delimited-I/O module can be accessed with:
(use-modules (ice-9 rdelim))
It can be used to read or write lines of text, or read text delimited by
a specified set of characters. It's similar to the (scsh rdelim)
module from guile-scsh, but does not use multiple values or character
sets and has an extra procedure write-line.
Return a line of text from port if specified, otherwise from the value returned by (current-input-port). Under Unix, a line of text is terminated by the first end-of-line character or by end-of-file.
If handle-delim is specified, it should be one of the following symbols:
trim - Discard the terminating delimiter. This is the default, but it will be impossible to tell whether the read terminated with a delimiter or end-of-file.
concat - Append the terminating delimiter (if any) to the returned string.
peek - Push the terminating delimiter (if any) back on to the port.
split - Return a pair containing the string read from the port and the terminating delimiter or end-of-file object.
Like read-char, this procedure can throw to decoding-error (see read-char).
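The effect of handle-delim is easiest to see with split, which reports how each line ended. The following sketch uses a string port whose last line has no trailing newline; the variable names are illustrative.

```scheme
(use-modules (ice-9 rdelim))

(define p (open-input-string "one\ntwo"))

(define l1 (read-line p 'split))  ; ("one" . #\newline)
(define l2 (read-line p 'split))  ; ("two" . end-of-file object)
(define l3 (read-line p))         ; end-of-file object: nothing left
```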
Read a line of text into the supplied string buf and return the number of characters added to buf. If buf is filled, then #f is returned. Read from port if specified, otherwise from the value returned by (current-input-port).
Read text until one of the characters in the string delims is found or end-of-file is reached. Read from port if supplied, otherwise from the value returned by (current-input-port). handle-delim takes the same values as described for read-line.
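A short sketch of delimiter-based reading follows. It assumes the procedure described above is read-delimited from (ice-9 rdelim); the input string and variable names are made up for the example.

```scheme
(use-modules (ice-9 rdelim))

(define p (open-input-string "key=value;rest"))

(define key (read-delimited "=" p))          ; delimiter discarded (trim)
(define val (read-delimited ";" p 'concat))  ; delimiter kept on the string
(define tail (read-delimited ";" p))         ; stopped by end-of-file

;; key ⇒ "key", val ⇒ "value;", tail ⇒ "rest"
```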
Read text into the supplied string buf.
If a delimiter was found, return the number of characters written, except if handle-delim is split, in which case the return value is a pair, as noted above.
As a special case, if port was already at end-of-stream, the EOF object is returned. Also, if no characters were written because the buffer was full, #f is returned.
It's something of a wacky interface, to be honest.
Display obj and a newline character to port. If port is not specified, (current-output-port) is used. This function is equivalent to:
(display obj [port])
(newline [port])
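As a quick illustration of write-line, the sketch below captures its output in a string port so the result can be inspected.

```scheme
(use-modules (ice-9 rdelim))

(define out
  (call-with-output-string
    (lambda (port)
      (write-line "first" port)   ; displayed, so no quotes
      (write-line 42 port))))
;; out ⇒ "first\n42\n"
```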
Some of the aforementioned I/O functions rely on the following C primitives. These will mainly be of interest to people hacking Guile internals.
Read characters from port into str until one of the characters in the delims string is encountered. If gobble is true, discard the delimiter character; otherwise, leave it in the input stream for the next read. If port is not specified, use the value of (current-input-port). If start or end are specified, store data only into the substring of str bounded by start and end (which default to the beginning and end of the string, respectively).
Return a pair consisting of the delimiter that terminated the string and the number of characters read. If reading stopped at the end of file, the delimiter returned is the eof-object; if the string was filled without encountering a delimiter, this value is #f.
Read a newline-terminated line from port, allocating storage as necessary. The newline terminator (if any) is removed from the string, and a pair consisting of the line and its delimiter is returned. The delimiter may be either a newline or the eof-object; if %read-line is called at the end of file, it returns the pair (#<eof> . #<eof>).
The Block-string-I/O module can be accessed with:
(use-modules (ice-9 rw))
It currently contains procedures that help to implement the
(scsh rw) module in guile-scsh.
Read characters from a port or file descriptor into a string str. A port must have an underlying file descriptor — a so-called fport. This procedure is scsh-compatible and can efficiently read large strings. It will:
- attempt to fill the entire string, unless the start and/or end arguments are supplied. i.e., start defaults to 0 and end defaults to (string-length str)
- use the current input port if port_or_fdes is not supplied.
- return fewer than the requested number of characters in some cases, e.g., on end of file, if interrupted by a signal, or if not all the characters are immediately available.
- wait indefinitely for some input if no characters are currently available, unless the port is in non-blocking mode.
- read characters from the port's input buffers if available, instead of from the underlying file descriptor.
- return #f if end-of-file is encountered before reading any characters, otherwise return the number of characters read.
- return 0 if the port is in non-blocking mode and no characters are immediately available.
- return 0 if the request is for 0 bytes, with no end-of-file check.
Write characters from a string str to a port or file descriptor. A port must have an underlying file descriptor — a so-called fport. This procedure is scsh-compatible and can efficiently write large strings. It will:
- attempt to write the entire string, unless the start and/or end arguments are supplied. i.e., start defaults to 0 and end defaults to (string-length str)
- use the current output port if port_or_fdes is not supplied.
- in the case of a buffered port, store the characters in the port's output buffer, if all will fit. If they will not fit then any existing buffered characters will be flushed before attempting to write the new characters directly to the underlying file descriptor. If the port is in non-blocking mode and buffered characters can not be flushed immediately, then an EAGAIN system-error exception will be raised. (Note: scsh does not support the use of non-blocking buffered ports.)
- write fewer than the requested number of characters in some cases, e.g., if interrupted by a signal or if not all of the output can be accepted immediately.
- wait indefinitely for at least one character from str to be accepted by the port, unless the port is in non-blocking mode.
- return the number of characters accepted by the port.
- return 0 if the port is in non-blocking mode and can not accept at least one character from str immediately.
- return 0 immediately if the request size is 0 bytes.
Return the current input port. This is the default port used by many input procedures.
Initially this is the standard input in Unix and C terminology. When the standard input is a tty the port is unbuffered, otherwise it's fully buffered.
Unbuffered input is good if an application runs an interactive subprocess, since any type-ahead input won't go into Guile's buffer and be unavailable to the subprocess.
Note that Guile buffering is completely separate from the tty “line discipline”. In the usual cooked mode on a tty Guile only sees a line of input once the user presses <Return>.
Return the current output port. This is the default port used by many output procedures.
Initially this is the standard output in Unix and C terminology. When the standard output is a tty this port is unbuffered, otherwise it's fully buffered.
Unbuffered output to a tty is good for ensuring progress output or a prompt is seen. But an application which always prints whole lines could change to line buffered, or an application with a lot of output could go fully buffered and perhaps make explicit force-output calls (see Writing) at selected points.
Return the port to which errors and warnings should be sent.
Initially this is the standard error in Unix and C terminology. When the standard error is a tty this port is unbuffered, otherwise it's fully buffered.
Change the ports returned by current-input-port, current-output-port and current-error-port, respectively, so that they use the supplied port for input or output.
These functions must be used inside a pair of calls to scm_dynwind_begin and scm_dynwind_end (see Dynamic Wind). During the dynwind context, the indicated port is set to port.
More precisely, the current port is swapped with a `backup' value whenever the dynwind context is entered or left. The backup value is initialized with the port argument.
[Types of port; how to make them.]
The following procedures are used to open file ports.
See also open, for an interface
to the Unix open system call.
Most systems have limits on how many files can be open, so it's strongly recommended that file ports be closed explicitly when no longer required (see Ports).
Open the file whose name is filename, and return a port representing that file. The attributes of the port are determined by the mode string. The way in which this is interpreted is similar to C stdio. The first character must be one of the following:
- ‘r’
- Open an existing file for input.
- ‘w’
- Open a file for output, creating it if it doesn't already exist or removing its contents if it does.
- ‘a’
- Open a file for output, creating it if it doesn't already exist. All writes to the port will go to the end of the file. The "append mode" can be turned off while the port is in use (see fcntl).
The following additional characters can be appended:
- ‘+’
- Open the port for both input and output. E.g., r+: open an existing file for both input and output.
- ‘0’
- Create an "unbuffered" port. In this case input and output operations are passed directly to the underlying port implementation without additional buffering. This is likely to slow down I/O operations. The buffering mode can be changed while a port is in use (see setvbuf).
- ‘l’
- Add line-buffering to the port. The port output buffer will be automatically flushed whenever a newline character is written.
- ‘b’
- Use binary mode, ensuring that each byte in the file will be read as one Scheme character.
To provide this property, the file will be opened with the 8-bit character encoding "ISO-8859-1", ignoring any coding declaration or port encoding. See Ports, for more information on port encodings.
Note that while it is possible to read and write binary data as characters or strings, it is usually better to treat bytes as octets, and byte sequences as bytevectors. See R6RS Binary Input, and R6RS Binary Output, for more.
This option had another historical meaning, for DOS compatibility: in the default (textual) mode, DOS reads a CR-LF sequence as one LF byte. The b flag prevents this from happening, adding O_BINARY to the underlying open call. Still, the flag is generally useful because of its port encoding ramifications.
If a file cannot be opened with the access requested, open-file throws an exception.
When the file is opened, this procedure will scan for a coding declaration (see Character Encoding of Source Files). If a coding declaration is found, it will be used to interpret the file. Otherwise, the port's encoding will be used. To suppress this behavior, open the file in binary mode and then set the port encoding explicitly using set-port-encoding!.
In theory we could create read/write ports which were buffered in one direction only. However this isn't included in the current interfaces.
Open filename for input. Equivalent to
(open-file filename "r")
Open filename for output. Equivalent to
(open-file filename "w")
Open filename for input or output, and call (proc port) with the resulting port. Return the value returned by proc. filename is opened as per open-input-file or open-output-file respectively, and an error is signaled if it cannot be opened.
When proc returns, the port is closed. If proc does not return (e.g. if it throws an error), then the port might not be closed automatically, though it will be garbage collected in the usual way if not otherwise referenced.
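A round trip through a file can be sketched as below. The sketch assumes the procedures described above are call-with-output-file and call-with-input-file; the path under /tmp is hypothetical and assumes a writable temporary directory.

```scheme
(use-modules (ice-9 rdelim))

;; Hypothetical scratch file; any writable path would do.
(define path
  (string-append "/tmp/guile-ports-example-"
                 (number->string (getpid)) ".txt"))

;; The port is closed when the procedure returns.
(call-with-output-file path
  (lambda (port)
    (display "hello" port)
    (newline port)))

(define line
  (call-with-input-file path
    (lambda (port)
      (read-line port))))
;; line ⇒ "hello"

(delete-file path)
```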
Open filename and call (thunk) with the new port setup as respectively the current-input-port, current-output-port, or current-error-port. Return the value returned by thunk. filename is opened as per open-input-file or open-output-file respectively, and an error is signaled if it cannot be opened.
When thunk returns, the port is closed and the previous setting of the respective current port is restored.
The current port setting is managed with dynamic-wind, so the previous value is restored no matter how thunk exits (e.g. an exception), and if thunk is re-entered (via a captured continuation) then it's set again to the filename port.
The port is closed when thunk returns normally, but not when exited via an exception or new continuation. This ensures it's still ready for use if thunk is re-entered by a captured continuation. Of course the port is always garbage collected and closed in the usual way when no longer referenced anywhere.
Return the port modes associated with the open port port. These will not necessarily be identical to the modes used when the port was opened, since modes such as "append" which are used only during port creation are not retained.
Return the filename associated with port, or #f if no filename is associated with the port.
port must be open; port-filename cannot be used once the port is closed.
Change the filename associated with port, using the current input port if none is specified. Note that this does not change the port's source of data, but only the value that is returned by port-filename and reported in diagnostic output.
Determine whether obj is a port that is related to a file.
The following allow string ports to be opened by analogy to R4RS file port facilities:
With string ports, the port-encoding is treated differently than other types of ports. When string ports are created, they do not inherit a character encoding from the current locale. They are given a default locale that allows them to handle all valid string characters. Typically one should not modify a string port's character encoding away from its default.
Calls the one-argument procedure proc with a newly created output port. When the function returns, the string composed of the characters written into the port is returned. proc should not close the port.
Note that which characters can be written to a string port depends on the port's encoding. The default encoding of string ports is specified by the %default-port-encoding fluid (see %default-port-encoding). For instance, it is an error to write Greek letter alpha to an ISO-8859-1-encoded string port since this character cannot be represented with ISO-8859-1:
(define alpha (integer->char #x03b1)) ; GREEK SMALL LETTER ALPHA
(with-fluids ((%default-port-encoding "ISO-8859-1"))
  (call-with-output-string
    (lambda (p)
      (display alpha p))))
⇒
Throw to key `encoding-error'
Changing the string port's encoding to a Unicode-capable encoding such as UTF-8 solves the problem.
Calls the one-argument procedure proc with a newly created input port from which string's contents may be read. The value yielded by the proc is returned.
Calls the zero-argument procedure thunk with the current output port set temporarily to a new string port. It returns a string composed of the characters written to the current output.
See call-with-output-string above for character encoding considerations.
Calls the zero-argument procedure thunk with the current input port set temporarily to a string port opened on the specified string. The value yielded by thunk is returned.
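The two thunk-based procedures can be sketched together. The sketch assumes they are with-output-to-string and with-input-from-string; inside each thunk, display, write and read use the temporarily rebound current ports.

```scheme
;; Everything written to the current output port is captured.
(define captured
  (with-output-to-string
    (lambda ()
      (display "sum: ")
      (write (+ 1 2)))))
;; captured ⇒ "sum: 3"

;; read takes its input from the supplied string.
(define parsed
  (with-input-from-string "(1 2 3)"
    (lambda ()
      (read))))
;; parsed ⇒ (1 2 3)
```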
Take a string and return an input port that delivers characters from the string. The port can be closed by close-input-port, though its storage will be reclaimed by the garbage collector if it becomes inaccessible.
Return an output port that will accumulate characters for retrieval by get-output-string. The port can be closed by the procedure close-output-port, though its storage will be reclaimed by the garbage collector if it becomes inaccessible.
Given an output port created by open-output-string, return a string consisting of the characters that have been output to the port so far.
get-output-string must be used before closing port; once closed the string cannot be obtained.
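A minimal sketch of accumulating output and retrieving it before the port is closed follows; the variable names are illustrative.

```scheme
(define port (open-output-string))

(display "abc" port)
(define partial (get-output-string port))  ; output so far
;; partial ⇒ "abc"

(write 'def port)
(define full (get-output-string port))     ; earlier output is retained
;; full ⇒ "abcdef"

;; Retrieve the string first, then close the port.
(close-output-port port)
```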
A string port can be used in many procedures which accept a port but which are not dependent on implementation details of fports. E.g., seeking and truncating will work on a string port, but trying to extract the file descriptor number will fail.
A soft-port is a port based on a vector of procedures capable of accepting or delivering characters. It allows emulation of I/O ports.
Return a port capable of receiving or delivering characters as specified by the modes string (see open-file). pv must be a vector of length 5 or 6. Its components are as follows:
- procedure accepting one character for output
- procedure accepting a string for output
- thunk for flushing output
- thunk for getting one character
- thunk for closing port (not by garbage collection)
- (if present and not #f) thunk for computing the number of characters that can be read from the port without blocking.
For an output-only port only elements 0, 1, 2, and 4 need be procedures. For an input-only port only elements 3 and 4 need be procedures. Thunks 2 and 4 can instead be #f if there is no useful operation for them to perform.
If thunk 3 returns #f or an eof-object (see eof-object?) it indicates that the port has reached end-of-file. For example:
(define stdout (current-output-port))
(define p (make-soft-port
           (vector
            (lambda (c) (write c stdout))
            (lambda (s) (display s stdout))
            (lambda () (display "." stdout))
            (lambda () (char-upcase (read-char)))
            (lambda () (display "@" stdout)))
           "rw"))
(write p p) ⇒ #<input-output: soft 8081e20>
This kind of port causes any data to be discarded when written to, and always returns the end-of-file object when read from.
Create and return a new void port. A void port acts like /dev/null. The mode argument specifies the input/output modes for this port: see the documentation for open-file in File Ports.
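The behaviour of a void port can be sketched as follows. This assumes the procedure described above is %make-void-port; the variable names are illustrative.

```scheme
;; Assumption: the constructor documented here is %make-void-port.
(define sink (%make-void-port "rw"))

(display "this text is discarded" sink)  ; writes vanish
(define c (read-char sink))              ; reads see end-of-file at once
;; (eof-object? c) ⇒ #t
```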
The I/O port API of the Revised Report^6 on the Algorithmic Language Scheme (R6RS) is provided by the (rnrs io ports)
module. It provides features, such as binary I/O and Unicode
string I/O, that complement or refine Guile's historical port API
presented above (see Input and Output). Note that R6RS ports are not
disjoint from Guile's native ports, so Guile-specific procedures will
work on ports created using the R6RS API, and vice versa.
The text in this section is taken from the R6RS standard libraries document, with only minor adaptions for inclusion in this manual. The Guile developers offer their thanks to the R6RS editors for having provided the report's text under permissive conditions making this possible.
Note: The implementation of this R6RS API is not complete yet.
A subset of the (rnrs io ports) module is provided by the
(ice-9 binary-ports) module. It contains binary input/output
procedures and does not rely on R6RS support.
Some of the procedures described in this chapter accept a file name as an argument. Valid values for such a file name include strings that name a file using the native notation of file system paths on an implementation's underlying operating system, and may include implementation-dependent values as well.
A filename parameter name means that the corresponding argument must be a file name.
When opening a file, the various procedures in this library accept a
file-options object that encapsulates flags to specify how the
file is to be opened. A file-options object is an enum-set
(see rnrs enums) over the symbols constituting valid file options.
A file-options parameter name means that the corresponding argument must be a file-options object.
Each file-options-symbol must be a symbol.
The file-options syntax returns a file-options object that encapsulates the specified options.
When supplied to an operation that opens a file for output, the file-options object returned by (file-options) specifies that the file is created if it does not exist and an exception with condition type &i/o-file-already-exists is raised if it does exist. The following standard options can be included to modify the default behavior.
no-create - If the file does not already exist, it is not created; instead, an exception with condition type &i/o-file-does-not-exist is raised. If the file already exists, the exception with condition type &i/o-file-already-exists is not raised and the file is truncated to zero length.
no-fail - If the file already exists, the exception with condition type &i/o-file-already-exists is not raised, even if no-create is not included, and the file is truncated to zero length.
no-truncate - If the file already exists and the exception with condition type &i/o-file-already-exists has been inhibited by inclusion of no-create or no-fail, the file is not truncated, but the port's current position is still set to the beginning of the file.
These options have no effect when a file is opened only for input. Symbols other than those listed above may be used as file-options-symbols; they have implementation-specific meaning, if any.
Note: Only the name of file-options-symbol is significant.
Each port has an associated buffer mode. For an output port, the
buffer mode defines when an output operation flushes the buffer
associated with the output port. For an input port, the buffer mode
defines how much data will be read to satisfy read operations. The
possible buffer modes are the symbols none for no buffering,
line for flushing upon line endings and reading up to line
endings, or other implementation-dependent behavior,
and block for arbitrary buffering. This section uses
the parameter name buffer-mode for arguments that must be
buffer-mode symbols.
If two ports are connected to the same mutable source, both ports are unbuffered, and reading a byte or character from that shared source via one of the two ports would change the bytes or characters seen via the other port, a lookahead operation on one port will render the peeked byte or character inaccessible via the other port, while a subsequent read operation on the peeked port will see the peeked byte or character even though the port is otherwise unbuffered.
In other words, the semantics of buffering is defined in terms of side effects on shared mutable sources, and a lookahead operation has the same side effect on the shared source as a read operation.
buffer-mode-symbol must be a symbol whose name is one of none, line, and block. The result is the corresponding symbol, and specifies the associated buffer mode.
Note: Only the name of buffer-mode-symbol is significant.
Returns #t if the argument is a valid buffer-mode symbol, and returns #f otherwise.
Several different Unicode encoding schemes describe standard ways to encode characters and strings as byte sequences and to decode those sequences. Within this document, a codec is an immutable Scheme object that represents a Unicode or similar encoding scheme.
An end-of-line style is a symbol that, if it is not none,
describes how a textual port transcodes representations of line endings.
A transcoder is an immutable Scheme object that combines a codec with an end-of-line style and a method for handling decoding errors. Each transcoder represents some specific bidirectional (but not necessarily lossless), possibly stateful translation between byte sequences and Unicode characters and strings. Every transcoder can operate in the input direction (bytes to characters) or in the output direction (characters to bytes). A transcoder parameter name means that the corresponding argument must be a transcoder.
A binary port is a port that supports binary I/O, does not have an associated transcoder and does not support textual I/O. A textual port is a port that supports textual I/O, and does not support binary I/O. A textual port may or may not have an associated transcoder.
These are predefined codecs for the ISO 8859-1, UTF-8, and UTF-16 encoding schemes.
A call to any of these procedures returns a value that is equal in the sense of eqv? to the result of any other call to the same procedure.
eol-style-symbol should be a symbol whose name is one of lf, cr, crlf, nel, crnel, ls, and none.
The form evaluates to the corresponding symbol. If the name of eol-style-symbol is not one of these symbols, the effect and result are implementation-dependent; in particular, the result may be an eol-style symbol acceptable as an eol-style argument to make-transcoder. Otherwise, an exception is raised.
All eol-style symbols except none describe a specific line-ending encoding:
lf - linefeed
cr - carriage return
crlf - carriage return, linefeed
nel - next line
crnel - carriage return, next line
ls - line separator
For a textual port with a transcoder, and whose transcoder has an eol-style symbol none, no conversion occurs. For a textual input port, any eol-style symbol other than none means that all of the above line-ending encodings are recognized and are translated into a single linefeed. For a textual output port, none and lf are equivalent. Linefeed characters are encoded according to the specified eol-style symbol, and all other characters that participate in possible line endings are encoded as is.
Note: Only the name of eol-style-symbol is significant.
Returns the default end-of-line style of the underlying platform, e.g., lf on Unix and crlf on Windows.
This condition type could be defined by
(define-condition-type &i/o-decoding &i/o-port
  make-i/o-decoding-error i/o-decoding-error?)
An exception with this type is raised when one of the operations for textual input from a port encounters a sequence of bytes that cannot be translated into a character or string by the input direction of the port's transcoder.
When such an exception is raised, the port's position is past the invalid encoding.
This condition type could be defined by
(define-condition-type &i/o-encoding &i/o-port
  make-i/o-encoding-error i/o-encoding-error?
  (char i/o-encoding-error-char))
An exception with this type is raised when one of the operations for textual output to a port encounters a character that cannot be translated into bytes by the output direction of the port's transcoder. Char is the character that could not be encoded.
error-handling-mode-symbol should be a symbol whose name is one of ignore, raise, and replace. The form evaluates to the corresponding symbol. If error-handling-mode-symbol is not one of these identifiers, effect and result are implementation-dependent: The result may be an error-handling-mode symbol acceptable as a handling-mode argument to make-transcoder. If it is not acceptable as a handling-mode argument to make-transcoder, an exception is raised.
Note: Only the name of error-handling-mode-symbol is significant.
The error-handling mode of a transcoder specifies the behavior of textual I/O operations in the presence of encoding or decoding errors.
If a textual input operation encounters an invalid or incomplete character encoding, and the error-handling mode is ignore, an appropriate number of bytes of the invalid encoding are ignored and decoding continues with the following bytes.
If the error-handling mode is replace, the replacement character U+FFFD is injected into the data stream, an appropriate number of bytes are ignored, and decoding continues with the following bytes.
If the error-handling mode is raise, an exception with condition type &i/o-decoding is raised.
If a textual output operation encounters a character it cannot encode, and the error-handling mode is ignore, the character is ignored and encoding continues with the next character. If the error-handling mode is replace, a codec-specific replacement character is emitted by the transcoder, and encoding continues with the next character. The replacement character is U+FFFD for transcoders whose codec is one of the Unicode encodings, but is the ? character for the Latin-1 encoding. If the error-handling mode is raise, an exception with condition type &i/o-encoding is raised.
codec must be a codec; eol-style, if present, an eol-style symbol; and handling-mode, if present, an error-handling-mode symbol.
eol-style may be omitted, in which case it defaults to the native end-of-line style of the underlying platform. Handling-mode may be omitted, in which case it defaults to
replace. The result is a transcoder with the behavior specified by its arguments.
Returns an implementation-dependent transcoder that represents a possibly locale-dependent “native” transcoding.
These are accessors for transcoder objects; when applied to a transcoder returned by
make-transcoder, they return the codec, eol-style, and handling-mode arguments, respectively.
Returns the string that results from transcoding the bytevector according to the input direction of the transcoder.
Returns the bytevector that results from transcoding the string according to the output direction of the transcoder.
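As a minimal sketch of how the transcoder procedures fit together, the following builds a transcoder, inspects it with an accessor, and round-trips an ASCII string through bytevector form. The variable names are illustrative only, and the sketch assumes that make-transcoder, the accessors, and both conversion procedures are exported by (rnrs io ports) in your Guile version.

```scheme
(use-modules (rnrs io ports) (rnrs bytevectors))

;; A UTF-8 transcoder with an explicit eol-style and error-handling mode.
(define utf8-transcoder
  (make-transcoder (utf-8-codec) (eol-style lf) (error-handling-mode raise)))

;; The accessor returns the handling-mode argument given above.
(define mode (transcoder-error-handling-mode utf8-transcoder))

;; Output direction: string to bytes.
(define encoded (string->bytevector "hello" utf8-transcoder))

;; Input direction: bytes back to a string.
(define decoded (bytevector->string encoded utf8-transcoder))
```

With an ASCII-only string such as this one, encoded should hold the five octets 104 101 108 108 111, and decoded should be equal to the original string.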
R5RS' eof-object? procedure is provided by the (rnrs io
ports) module:
Return true if obj is the end-of-file (EOF) object.
In addition, the following procedure is provided:
Return the end-of-file (EOF) object.
(eof-object? (eof-object)) ⇒ #t
The procedures listed below operate on any kind of R6RS I/O port.
Returns the transcoder associated with port if port is textual and has an associated transcoder, and returns #f if port is binary or does not have an associated transcoder.
Return #t if port is a binary port, suitable for binary data input/output.
Note that internally Guile does not differentiate between binary and textual ports, unlike the R6RS. Thus, this procedure returns true when port does not have an associated encoding, i.e., when (port-encoding port) is #f (see port-encoding). This is the case for ports returned by R6RS procedures such as open-bytevector-input-port and make-custom-binary-output-port.
However, Guile currently does not prevent use of textual I/O procedures such as display or read-char with binary ports. Doing so “upgrades” the port from binary to textual, under the ISO-8859-1 encoding. Likewise, Guile does not prevent use of set-port-encoding! on a binary port, which also turns it into a “textual” port.
Always return #t, as all ports can be used for textual I/O in Guile.
The transcoded-port procedure returns a new textual port with the specified transcoder. Otherwise the new textual port's state is largely the same as that of binary-port. If binary-port is an input port, the new textual port will be an input port and will transcode the bytes that have not yet been read from binary-port. If binary-port is an output port, the new textual port will be an output port and will transcode output characters into bytes that are written to the byte sink represented by binary-port.
As a side effect, however, transcoded-port closes binary-port in a special way that allows the new textual port to continue to use the byte source or sink represented by binary-port, even though binary-port itself is closed and cannot be used by the input and output operations described in this chapter.
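The following sketch illustrates the input case: a binary port over UTF-8 bytes is wrapped in a textual port, and the text is then read back as a string. The names binary-source and textual-source are hypothetical; the sketch assumes transcoded-port and get-string-all are available from (rnrs io ports).

```scheme
(use-modules (rnrs io ports) (rnrs bytevectors))

;; A binary input port over the UTF-8 bytes of "hello".
(define binary-source
  (open-bytevector-input-port (string->utf8 "hello")))

;; Wrap it in a textual port; binary-source must not be used afterwards.
(define textual-source
  (transcoded-port binary-source (make-transcoder (utf-8-codec))))

;; Read everything that remains as a string.
(define text (get-string-all textual-source))
```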
If port supports it (see below), return the offset (an integer) indicating where the next octet will be read from/written to in port. If port does not support this operation, an error condition is raised.
This is similar to Guile's seek procedure with the SEEK_CUR argument (see Random Access).
If port supports it (see below), set the position where the next octet will be read from/written to port to offset (an integer). If port does not support this operation, an error condition is raised.
This is similar to Guile's seek procedure with the SEEK_SET argument (see Random Access).
Return #t if port supports set-port-position!.
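A short sketch of the position procedures on a bytevector input port follows. It assumes that such ports support both port-position and set-port-position! in your Guile version; the variable names are illustrative.

```scheme
(use-modules (rnrs io ports) (rnrs bytevectors))

(define seekable (open-bytevector-input-port #vu8(10 20 30 40)))

;; Reading one octet advances the position by one.
(define first-octet (get-u8 seekable))
(define pos-after-read (port-position seekable))

;; Jump directly to the last octet and read it.
(set-port-position! seekable 3)
(define last-octet (get-u8 seekable))
```

Here pos-after-read should be 1 and last-octet should be 40. For ports of unknown origin, checking has-port-position? and has-set-port-position!? first avoids the error condition described above.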
Call proc, passing it port and closing port upon exit of proc. Return the return values of proc.
Returns #t if the argument is an input port (or a combined input and output port), and returns #f otherwise.
Returns #t if the lookahead-u8 procedure (if input-port is a binary port) or the lookahead-char procedure (if input-port is a textual port) would return the end-of-file object, and #f otherwise. The operation may block indefinitely if no data is available but the port cannot be determined to be at end of file.
Maybe-transcoder must be either a transcoder or #f.
The open-file-input-port procedure returns an input port for the named file. The file-options and maybe-transcoder arguments are optional.
The file-options argument, which may determine various aspects of the returned port (see R6RS File Options), defaults to the value of (file-options).
The buffer-mode argument, if supplied, must be one of the symbols that name a buffer mode. The buffer-mode argument defaults to block.
If maybe-transcoder is a transcoder, it becomes the transcoder associated with the returned port.
If maybe-transcoder is #f or absent, the port will be a binary port and will support the port-position and set-port-position! operations. Otherwise the port will be a textual port, and whether it supports the port-position and set-port-position! operations is implementation-dependent (and possibly transcoder-dependent).
Returns a fresh binary input port connected to standard input. Whether the port supports the port-position and set-port-position! operations is implementation-dependent.
This returns a default textual port for input. Normally, this default port is associated with standard input, but can be dynamically re-assigned using the with-input-from-file procedure from the io simple (6) library (see rnrs io simple). The port may or may not have an associated transcoder; if it does, the transcoder is implementation-dependent.
R6RS binary input ports can be created with the procedures described below.
Return an input port whose contents are drawn from bytevector bv (see Bytevectors).
The transcoder argument is currently not supported.
Return a new custom binary input port named id (a string) whose input is drained by invoking read! and passing it a bytevector, an index where bytes should be written, and the number of bytes to read. The read! procedure must return an integer indicating the number of bytes read, or 0 to indicate the end-of-file.
Optionally, if get-position is not #f, it must be a thunk that will be called when port-position is invoked on the custom binary port and should return an integer indicating the position within the underlying data stream; if get-position is #f, the returned port does not support port-position.
Likewise, if set-position! is not #f, it should be a one-argument procedure. When set-port-position! is invoked on the custom binary input port, set-position! is passed an integer indicating the position of the next byte to read.
Finally, if close is not #f, it must be a thunk. It is invoked when the custom binary input port is closed.
Using a custom binary input port, the open-bytevector-input-port procedure could be implemented as follows:
(define (open-bytevector-input-port source)
  (define position 0)
  (define length (bytevector-length source))

  (define (read! bv start count)
    (let ((count (min count (- length position))))
      (bytevector-copy! source position
                        bv start count)
      (set! position (+ position count))
      count))

  (define (get-position) position)

  (define (set-position! new-position)
    (set! position new-position))

  ;; The final #f means that no close procedure is needed.
  (make-custom-binary-input-port "the port" read!
                                 get-position set-position!
                                 #f))

(read (open-bytevector-input-port (string->utf8 "hello")))
⇒ hello
Binary input is achieved using the procedures below:
Return an octet read from port, a binary input port, blocking as necessary, or the end-of-file object.
Like get-u8 but does not update port's position to point past the octet.
Read count octets from port, blocking as necessary and return a bytevector containing the octets read. If fewer bytes are available, a bytevector smaller than count is returned.
Read count bytes from port and store them in bv starting at index start. Return either the number of bytes actually read or the end-of-file object.
Read from port, blocking as necessary, until data are available or an end-of-file is reached. Return either a new bytevector containing the data read or the end-of-file object.
Read from port, blocking as necessary, until the end-of-file is reached. Return either a new bytevector containing the data read or the end-of-file object (if no data were available).
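The binary input procedures above can be tried out on a bytevector input port. This is a small sketch with illustrative variable names, not a complete tour of the API.

```scheme
(use-modules (rnrs io ports) (rnrs bytevectors))

(define in (open-bytevector-input-port #vu8(1 2 3 4 5)))

(define peeked  (lookahead-u8 in))         ; looks at 1, position unchanged
(define octet   (get-u8 in))               ; reads 1, position advances
(define chunk   (get-bytevector-n in 2))   ; reads the next two octets
(define remains (get-bytevector-all in))   ; reads up to end-of-file
(define at-end  (get-u8 in))               ; nothing left: end-of-file object
```

After these calls, peeked and octet are both 1, chunk is #vu8(2 3), remains is #vu8(4 5), and at-end satisfies eof-object?.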
Reads from textual-input-port, blocking as necessary, until a complete character is available from textual-input-port, or until an end of file is reached.
If a complete character is available before the next end of file, get-char returns that character and updates the input port to point past the character. If an end of file is reached before any character is read, get-char returns the end-of-file object.
The lookahead-char procedure is like get-char, but it does not update textual-input-port to point past the character.
Count must be an exact, non-negative integer object, representing the number of characters to be read.
The get-string-n procedure reads from textual-input-port, blocking as necessary, until count characters are available, or until an end of file is reached.
If count characters are available before end of file, get-string-n returns a string consisting of those count characters. If fewer characters are available before an end of file, but one or more characters can be read, get-string-n returns a string containing those characters. In either case, the input port is updated to point just past the characters read. If no characters can be read before an end of file, the end-of-file object is returned.
Start and count must be exact, non-negative integer objects, with count representing the number of characters to be read. String must be a string with at least start + count characters.
The get-string-n! procedure reads from textual-input-port in the same manner as get-string-n. If count characters are available before an end of file, they are written into string starting at index start, and count is returned. If fewer characters are available before an end of file, but one or more can be read, those characters are written into string starting at index start and the number of characters actually read is returned as an exact integer object. If no characters can be read before an end of file, the end-of-file object is returned.
Reads from textual-input-port until an end of file, decoding characters in the same manner as get-string-n and get-string-n!.
If characters are available before the end of file, a string containing all the characters decoded from that data is returned. If no character precedes the end of file, the end-of-file object is returned.
Reads from textual-input-port up to and including the linefeed character or end of file, decoding characters in the same manner as get-string-n and get-string-n!.
If a linefeed character is read, a string containing all of the text up to (but not including) the linefeed character is returned, and the port is updated to point just past the linefeed character. If an end of file is encountered before any linefeed character is read, but some characters have been read and decoded as characters, a string containing those characters is returned. If an end of file is encountered before any characters are read, the end-of-file object is returned.
Note: The end-of-line style, if not none, will cause all line endings to be read as linefeed characters. See R6RS Transcoders.
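The textual input procedures described so far can be combined on a string port, as in the sketch below. It uses Guile's core open-input-string to obtain a textual port; the variable names are illustrative.

```scheme
(use-modules (rnrs io ports))

(define text-in (open-input-string "first line\nsecond"))

(define next-char (lookahead-char text-in))   ; peeks at #\f
(define line      (get-line text-in))         ; text up to the linefeed
(define three     (get-string-n text-in 3))   ; the next three characters
(define tail      (get-string-all text-in))   ; everything that remains
(define finished  (get-char text-in))         ; end-of-file object
```

Here line is "first line" (the linefeed itself is consumed but not returned), three is "sec", and tail is "ond".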
Reads an external representation from textual-input-port and returns the datum it represents. The get-datum procedure returns the next datum that can be parsed from the given textual-input-port, updating textual-input-port to point exactly past the end of the external representation of the object.
Any interlexeme space (comment or whitespace, see Scheme Syntax) in the input is first skipped. If an end of file occurs after the interlexeme space, the end-of-file object (see R6RS End-of-File) is returned.
If a character inconsistent with an external representation is encountered in the input, an exception with condition types &lexical and &i/o-read is raised. Also, if the end of file is encountered after the beginning of an external representation, but the external representation is incomplete and therefore cannot be parsed, an exception with condition types &lexical and &i/o-read is raised.
Returns #t if the argument is an output port (or a combined input and output port), #f otherwise.
Flushes any buffered output from the buffer of output-port to the underlying file, device, or object. The flush-output-port procedure returns unspecified values.
maybe-transcoder must be either a transcoder or #f.
The open-file-output-port procedure returns an output port for the named file.
The file-options argument, which may determine various aspects of the returned port (see R6RS File Options), defaults to the value of (file-options).
The buffer-mode argument, if supplied, must be one of the symbols that name a buffer mode. The buffer-mode argument defaults to block.
If maybe-transcoder is a transcoder, it becomes the transcoder associated with the port.
If maybe-transcoder is #f or absent, the port will be a binary port and will support the port-position and set-port-position! operations. Otherwise the port will be a textual port, and whether it supports the port-position and set-port-position! operations is implementation-dependent (and possibly transcoder-dependent).
Returns a fresh binary output port connected to the standard output or standard error respectively. Whether the port supports the port-position and set-port-position! operations is implementation-dependent.
These return default textual ports for regular output and error output. Normally, these default ports are associated with standard output and standard error, respectively. The return value of current-output-port can be dynamically re-assigned using the with-output-to-file procedure from the io simple (6) library (see rnrs io simple). A port returned by one of these procedures may or may not have an associated transcoder; if it does, the transcoder is implementation-dependent.
Binary output ports can be created with the procedures below.
Return two values: a binary output port and a procedure. The latter should be called with zero arguments to obtain a bytevector containing the data accumulated by the port, as illustrated below.
(call-with-values
  (lambda ()
    (open-bytevector-output-port))
  (lambda (port get-bytevector)
    (display "hello" port)
    (get-bytevector)))
⇒ #vu8(104 101 108 108 111)
The transcoder argument is currently not supported.
Return a new custom binary output port named id (a string) whose output is sunk by invoking write! and passing it a bytevector, an index where bytes should be read from this bytevector, and the number of bytes to be “written”. The write! procedure must return an integer indicating the number of bytes actually written; when it is passed 0 as the number of bytes to write, it should behave as though an end-of-file was sent to the byte sink.
The other arguments are as for make-custom-binary-input-port (see make-custom-binary-input-port).
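As a rough sketch of a custom binary output port, the following sink simply accumulates every octet it receives in a list. The names collected, write! and sink-port are hypothetical, and #f is passed for the get-position, set-position! and close arguments, which this sink does not need.

```scheme
(use-modules (rnrs io ports) (rnrs bytevectors))

;; Hypothetical sink: accumulate written octets in a list, newest first.
(define collected '())

(define (write! bv start count)
  ;; Copy COUNT octets of BV, starting at START, onto the list.
  (let loop ((i 0))
    (if (< i count)
        (begin
          (set! collected
                (cons (bytevector-u8-ref bv (+ start i)) collected))
          (loop (+ i 1)))))
  ;; Report that every octet was consumed (0 when COUNT is 0).
  count)

(define sink-port
  (make-custom-binary-output-port "collector" write! #f #f #f))

(put-bytevector sink-port #vu8(1 2 3))
(put-u8 sink-port 4)
(close-port sink-port)

;; Octets in the order they were written.
(define written (reverse collected))
```

Closing the port before inspecting the list ensures that any output buffered by the port has reached write!.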
Writing to a binary output port can be done using the following procedures:
Write octet, an integer in the 0–255 range, to port, a binary output port.
Write the contents of bv to port, optionally starting at index start and limiting to count octets.
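The two binary output procedures can be exercised against a bytevector output port, as in this sketch. The optional start and count arguments of put-bytevector select a sub-range of the source bytevector; the variable names are illustrative.

```scheme
(use-modules (rnrs io ports) (rnrs bytevectors))

(define output
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytes)
      (put-u8 port 255)                            ; a single octet
      (put-bytevector port #vu8(10 20 30 40) 1 2)  ; only 20 and 30
      (get-bytes))))
```

The resulting bytevector is #vu8(255 20 30).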
start and count must be non-negative exact integer objects. string must have a length of at least start + count. start defaults to 0. count defaults to (string-length string) - start. The put-string procedure writes the count characters of string starting at index start to the port. The put-string procedure returns an unspecified value.
datum should be a datum value. The put-datum procedure writes an external representation of datum to textual-output-port. The specific external representation is implementation-dependent. However, whenever possible, an implementation should produce a representation for which get-datum, when reading the representation, will return an object equal (in the sense of equal?) to datum.
Note: Not all datums may allow producing an external representation for which get-datum will produce an object that is equal to the original. Specifically, NaNs contained in datum may make this impossible.
Note: The put-datum procedure merely writes the external representation, but no trailing delimiter. If put-datum is used to write several subsequent external representations to an output port, care should be taken to delimit them properly so they can be read back in by subsequent calls to get-datum.
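The delimiter caveat can be illustrated with a small round trip through a string port. This sketch uses Guile's core call-with-output-string and open-input-string to obtain textual ports, and writes a space between the two datums so that get-datum can tell them apart; all variable names are illustrative.

```scheme
(use-modules (rnrs io ports))

;; Write two datums, separated by an explicit delimiter.
(define serialized
  (call-with-output-string
    (lambda (port)
      (put-datum port '(1 "two" three))
      (put-string port " ")
      (put-datum port 42))))

;; Read them back one at a time.
(define datum-in (open-input-string serialized))
(define first-datum  (get-datum datum-in))
(define second-datum (get-datum datum-in))
(define no-more      (get-datum datum-in))   ; end-of-file object
```

Both datums read back equal? to the originals, and the third get-datum call returns the end-of-file object.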
This section describes how to use Scheme ports from C.
There are two main data structures. A port type object (ptob) is of
type scm_ptob_descriptor. A port instance is of type
scm_port. Given an SCM variable which points to a port,
the corresponding C port object can be obtained using the
SCM_PTAB_ENTRY macro. The ptob can be obtained by using
SCM_PTOBNUM to give an index into the scm_ptobs
global array.
An input port always has a read buffer and an output port always has a
write buffer. However the size of these buffers is not guaranteed to be
more than one byte (e.g., the shortbuf field in scm_port
which is used when no other buffer is allocated). The way in which the
buffers are allocated depends on the implementation of the ptob. For
example in the case of an fport, buffers may be allocated with malloc
when the port is created, but in the case of an strport the underlying
string is used as the buffer.
The rw_random flag
Special treatment is required for ports which can be seeked at random.
Before various operations, such as seeking the port or changing from
input to output on a bidirectional port or vice versa, the port
implementation must be given a chance to update its state. The write
buffer is updated by calling the flush ptob procedure and the
input buffer is updated by calling the end_input ptob procedure.
In the case of an fport, flush causes buffered output to be
written to the file descriptor, while end_input causes the
descriptor position to be adjusted to account for buffered input which
was never read.
The special treatment must be performed if the rw_random flag in
the port is non-zero.
The rw_active variable
The rw_active variable in the port is only used if
rw_random is set. It's defined as an enum with the following
values:
SCM_PORT_READ
SCM_PORT_WRITE
SCM_PORT_NEITHER
To read from a port, it's possible to either call existing libguile
procedures such as scm_getc and scm_read_line or to read
data from the read buffer directly. Reading from the buffer involves
the following steps:
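1. Flush output on the port, if rw_active is SCM_PORT_WRITE.
2. Fill the read buffer, if it is empty, using scm_fill_input.
3. Read the data from the buffer, and set rw_active to SCM_PORT_READ if rw_random is set.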
To write data to a port, calling scm_lfwrite should be sufficient for
most purposes. This takes care of the following steps:
1. End input on the port, if rw_active is SCM_PORT_READ.
2. Pass the data to the port implementation using the write ptob
procedure. The advantage of using the ptob write instead of
manipulating the write buffer directly is that it allows the data to be
written in one operation even if the port is using the single-byte
shortbuf.
3. Set rw_active to SCM_PORT_WRITE if rw_random
is set.
This section describes how to implement a new port type in C.
As described in the previous section, a port type object (ptob) is
a structure of type scm_ptob_descriptor. A ptob is created by
calling scm_make_port_type.
Return a new port type object. The name, fill_input and write parameters are initial values for those port type fields, as described below. The other fields are initialized with default values and can be changed later.
All of the elements of the ptob, apart from name, are procedures
which collectively implement the port behaviour. Creating a new port
type mostly involves writing these procedures.
name
scm_ptob_descriptor which is not
a procedure. Set via the first argument to scm_make_port_type.
mark
SCM components. Set using
free
print
write is called on the port object, to print a
port description. E.g., for an fport it may produce something like:
#<input: /etc/passwd 3>. Set using
The first argument port is the object being printed, the second argument dest_port is where its description should go.
equalp
close
write
scm_make_port_type.
flush
rw_active to SCM_PORT_NEITHER.
Set using
end_input
rw_active to SCM_PORT_NEITHER.
Set using
fill_input
scm_make_port_type.
input_waiting
rw_active is SCM_PORT_NEITHER.
Set using
seek
rw_active when it's
called. It can reset the buffers first if desired by using something
like:
if (pt->rw_active == SCM_PORT_READ)
  scm_end_input (port);
else if (pt->rw_active == SCM_PORT_WRITE)
  ptob->flush (port);
However note that this will have the side effect of discarding any data
in the unread-char buffer, in addition to any side effects from the
end_input and flush ptob procedures. This is undesirable
when seek is called to measure the current position of the port, i.e.,
(seek p 0 SEEK_CUR). The libguile fport and string port
implementations take care to avoid this problem.
The procedure is set using
truncate
rw_active is SCM_PORT_NEITHER.
Set using
A regular expression (or regexp) is a pattern that describes a whole class of strings. A full description of regular expressions and their syntax is beyond the scope of this manual; an introduction can be found in the Emacs manual (see Syntax of Regular Expressions), or in many general Unix reference books.
If your system does not include a POSIX regular expression library,
and you have not linked Guile with a third-party regexp library such
as Rx, these functions will not be available. You can tell whether
your Guile installation includes regular expression support by
checking whether (provided? 'regex) returns true.
The following regexp and string matching features are provided by the
(ice-9 regex) module. Before using the described functions,
you should load this module by executing (use-modules (ice-9
regex)).
By default, Guile supports POSIX extended regular expressions. That means that the characters ‘(’, ‘)’, ‘+’ and ‘?’ are special, and must be escaped if you wish to match the literal characters.
This regular expression interface was modeled after that implemented by SCSH, the Scheme Shell. It is intended to be upwardly compatible with SCSH regular expressions.
Zero bytes (#\nul) cannot be used in regex patterns or input
strings, since the underlying C functions treat that as the end of
string. If there's a zero byte an error is thrown.
Patterns and input strings are treated as being in the locale
character set if setlocale has been called (see Locales),
and in a multibyte locale this includes treating multi-byte sequences
as a single character. (Guile strings are currently merely bytes,
though this may change in the future; see Conversion to/from C.)
Compile the string pattern into a regular expression and compare it with str. The optional numeric argument start specifies the position of str at which to begin matching.
string-match returns a match structure which describes what, if anything, was matched by the regular expression. See Match Structures. If str does not match pattern at all, string-match returns #f.
Two examples of a match follow. In the first example, the pattern matches the four digits in the match string. In the second, the pattern matches nothing.
(string-match "[0-9][0-9][0-9][0-9]" "blah2002")
⇒ #("blah2002" (4 . 8))
(string-match "[A-Za-z]" "123456")
⇒ #f
Each time string-match is called, it must compile its
pattern argument into a regular expression structure. This
operation is expensive, which makes string-match inefficient if
the same regular expression is used several times (for example, in a
loop). For better performance, you can compile a regular expression in
advance and then match strings against the compiled regexp.
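As a minimal sketch of this approach, the pattern below is compiled once with make-regexp and then applied to several strings with regexp-exec (both are described below). The names year-rx, inputs and years are illustrative.

```scheme
(use-modules (ice-9 regex))

;; Compile the pattern once, outside the loop.
(define year-rx (make-regexp "[0-9][0-9][0-9][0-9]"))

(define inputs '("blah2002" "no digits" "1999 was a year"))

;; Reuse the compiled regexp for every input string.
(define years
  (map (lambda (str)
         (let ((m (regexp-exec year-rx str)))
           (and m (match:substring m))))
       inputs))
```

The result is ("2002" #f "1999"): the middle string contains no four-digit run, so regexp-exec returns #f for it.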
Compile the regular expression described by pat, and return the compiled regexp structure. If pat does not describe a legal regular expression, make-regexp throws a regular-expression-syntax error.
The flag arguments change the behavior of the compiled regular expression. The following values may be supplied:
— Variable: regexp/newline
If a newline appears in the target string, then permit the ‘^’ and ‘$’ operators to match immediately after or immediately before the newline, respectively. Also, the ‘.’ and ‘[^...]’ operators will never match a newline character. The intent of this flag is to treat the target string as a buffer containing many lines of text, and the regular expression as a pattern that may match a single one of those lines.
— Variable: regexp/basic
Compile a basic (“obsolete”) regexp instead of the extended (“modern”) regexps that are the default. Basic regexps do not consider ‘|’, ‘+’ or ‘?’ to be special characters, and require the ‘{...}’ and ‘(...)’ metacharacters to be backslash-escaped (see Backslash Escapes). There are several other differences between basic and extended regular expressions, but these are the most significant.
Match the compiled regular expression rx against str. If the optional integer start argument is provided, begin matching from that position in the string. Return a match structure describing the results of the match, or #f if no match could be found.
The flags argument changes the matching behavior. The following flag values may be supplied; use logior (see Bitwise Operations) to combine them.
;; Regexp to match uppercase letters
(define r (make-regexp "[A-Z]*"))
;; Regexp to match letters, ignoring case
(define ri (make-regexp "[A-Z]*" regexp/icase))
;; Search for bob using regexp r
(match:substring (regexp-exec r "bob"))
⇒ "" ; empty match: no uppercase letters at the start
;; Search for bob using regexp ri
(match:substring (regexp-exec ri "Bob"))
⇒ "Bob" ; matched case insensitive
Return #t if obj is a compiled regular expression, or #f otherwise.
Return a list of match structures which are the non-overlapping matches of regexp in str. regexp can be either a pattern string or a compiled regexp. The flags argument is as per regexp-exec above.
(map match:substring (list-matches "[a-z]+" "abc 42 def 78"))
⇒ ("abc" "def")
Apply proc to the non-overlapping matches of regexp in str, to build a result. regexp can be either a pattern string or a compiled regexp. The flags argument is as per regexp-exec above.
proc is called as (proc match prev) where match is a match structure and prev is the previous return from proc. For the first call prev is the given init parameter. fold-matches returns the final value from proc.
For example, to count matches:
(fold-matches "[a-z][0-9]" "abc x1 def y2" 0
              (lambda (match count)
                (1+ count)))
⇒ 2
Regular expressions are commonly used to find patterns in one string and replace them with the contents of another string. The following functions are convenient ways to do this.
Write to port selected parts of the match structure match. Or if port is #f then form a string from those parts and return that.
Each item specifies a part to be written, and may be one of the following,
- A string. String arguments are written out verbatim.
- An integer. The submatch with that number is written (match:substring). Zero is the entire match.
- The symbol ‘pre’. The portion of the matched string preceding the regexp match is written (match:prefix).
- The symbol ‘post’. The portion of the matched string following the regexp match is written (match:suffix).
For example, changing a match and retaining the text before and after,
(regexp-substitute #f (string-match "[0-9]+" "number 25 is good")
                   'pre "37" 'post)
⇒ "number 37 is good"
Or matching a yyyymmdd format date such as ‘20020828’ and re-ordering and hyphenating the fields.
(define date-regex
  "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])")
(define s "Date 20020429 12am.")
(regexp-substitute #f (string-match date-regex s)
                   'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
⇒ "Date 04-29-2002 12am. (20020429)"
Write to port selected parts of matches of regexp in target. If port is #f then form a string from those parts and return that. regexp can be a string or a compiled regex.
This is similar to regexp-substitute, but allows global substitutions on target. Each item behaves as per regexp-substitute, with the following differences,
- A function. Called as (item match) with the match structure for the regexp match, it should return a string to be written to port.
- The symbol ‘post’. This doesn't output anything, but instead causes regexp-substitute/global to recurse on the unmatched portion of target.
This must be supplied to perform a global search and replace on target; without it regexp-substitute/global returns after a single match and output.
For example, to collapse runs of tabs and spaces to a single hyphen each,
(regexp-substitute/global #f "[ \t]+" "this is the text"
                          'pre "-" 'post)
⇒ "this-is-the-text"
Or using a function to reverse the letters in each word,
(regexp-substitute/global #f "[a-z]+" "to do and not-do"
                          'pre (lambda (m) (string-reverse (match:substring m))) 'post)
⇒ "ot od dna ton-od"
Without the post symbol, just one regexp match is made. For example the following is the date example from regexp-substitute above, without the need for the separate string-match call.
(define date-regex
  "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])")
(define s "Date 20020429 12am.")
(regexp-substitute/global #f date-regex s
                          'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
⇒ "Date 04-29-2002 12am. (20020429)"
A match structure is the object returned by string-match and
regexp-exec. It describes which portion of a string, if any,
matched the given regular expression. Match structures include: a
reference to the string that was checked for matches; the starting and
ending positions of the regexp match; and, if the regexp included any
parenthesized subexpressions, the starting and ending positions of each
submatch.
In each of the regexp match functions described below, the match
argument must be a match structure returned by a previous call to
string-match or regexp-exec. Most of these functions
return some information about the original target string that was
matched against a regular expression; we will call that string
target for easy reference.
Return #t if obj is a match structure returned by a previous call to regexp-exec, or #f otherwise.
Return the portion of target matched by subexpression number n. Submatch 0 (the default) represents the entire regexp match. If the regular expression as a whole matched, but the subexpression number n did not match, return
#f.
(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
(match:substring s)
⇒ "2002"
;; match starting at offset 6 in the string
(match:substring
(string-match "[0-9][0-9][0-9][0-9]" "blah987654" 6))
⇒ "7654"
In the following example, the result is 4, since the match starts at character index 4:
(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
(match:start s)
⇒ 4
In the following example, the result is 8, since the match runs between characters 4 and 8 (i.e. the “2002”).
(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
(match:end s)
⇒ 8
Return the unmatched portion of target preceding the regexp match.
(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
(match:prefix s)
⇒ "blah"
Return the unmatched portion of target following the regexp match.
(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
(match:suffix s)
⇒ "foo"
Return the number of parenthesized subexpressions from match. Note that the entire regular expression match itself counts as a subexpression, and failed submatches are included in the count.
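The sketch below shows how a failed submatch appears through the match accessors. The pattern has two parenthesized groups, the second of which is optional and does not take part in the match; the variable names are illustrative.

```scheme
(use-modules (ice-9 regex))

;; Group 1 matches the digits; the optional group 2 matches nothing here.
(define m (string-match "([0-9]+)(x)?" "ab12cd"))

(define group-count (match:count m))        ; whole match plus two groups
(define digits      (match:substring m 1))  ; the digits matched by group 1
(define missing     (match:substring m 2))  ; group 2 did not match
(define digit-start (match:start m 1))
(define digit-end   (match:end m 1))
```

Here group-count is 3, digits is "12", missing is #f, and the first group spans indices 2 to 4.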
(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
(match:string s)
⇒ "blah2002foo"
Sometimes you will want a regexp to match characters like ‘*’ or ‘$’ exactly. For example, to check whether a particular string represents a menu entry from an Info node, it would be useful to match it against a regexp like ‘^* [^:]*::’. However, this won't work; because the asterisk is a metacharacter, it won't match the ‘*’ at the beginning of the string. In this case, we want to make the first asterisk un-magic.
You can do this by preceding the metacharacter with a backslash character ‘\’. (This is also called quoting the metacharacter, and is known as a backslash escape.) When Guile sees a backslash in a regular expression, it considers the following glyph to be an ordinary character, no matter what special meaning it would ordinarily have. Therefore, we can make the above example work by changing the regexp to ‘^\* [^:]*::’. The ‘\*’ sequence tells the regular expression engine to match only a single asterisk in the target string.
Since the backslash is itself a metacharacter, you may force a regexp to match a backslash in the target string by preceding the backslash with itself. For example, to find variable references in a TeX program, you might want to find occurrences of the string ‘\let\’ followed by any number of alphabetic characters. The regular expression ‘\\let\\[A-Za-z]*’ would do this: the double backslashes in the regexp each match a single backslash in the target string.
Quote each special character found in str with a backslash, and return the resulting string.
Very important: Using backslash escapes in Guile source code (as in Emacs Lisp or C) can be tricky, because the backslash character has special meaning for the Guile reader. For example, if Guile encounters the character sequence ‘\n’ in the middle of a string while processing Scheme code, it replaces those characters with a newline character. Similarly, the character sequence ‘\t’ is replaced by a horizontal tab. Several of these escape sequences are processed by the Guile reader before your code is executed. Unrecognized escape sequences are ignored: if the characters ‘\*’ appear in a string, they will be translated to the single character ‘*’.
This translation is obviously undesirable for regular expressions, since we want to be able to include backslashes in a string in order to escape regexp metacharacters. Therefore, to make sure that a backslash is preserved in a string in your Guile program, you must use two consecutive backslashes:
(define Info-menu-entry-pattern (make-regexp "^\\* [^:]*"))
The string in this example is preprocessed by the Guile reader before
any code is executed. The resulting argument to make-regexp is
the string ‘^\* [^:]*’, which is what we really want.
This also means that in order to write a regular expression that matches a single backslash character, the regular expression string in the source code must include four backslashes. Each consecutive pair of backslashes gets translated by the Guile reader to a single backslash, and the resulting double-backslash is interpreted by the regexp engine as matching a single backslash character. Hence:
(define tex-variable-pattern (make-regexp "\\\\let\\\\[A-Za-z]*"))
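The two levels of backslash processing can be observed directly at the REPL. The sketch below uses made-up target strings; it first shows what the reader leaves in the string, then what the regexp engine matches, and finally uses regexp-quote to avoid writing the escapes by hand.

```scheme
(use-modules (ice-9 regex))

;; The reader turns the four source characters \\* into the two
;; characters \* before make-regexp or string-match ever sees them.
(string-length "\\*")                        ; ⇒ 2

;; The regexp ^\* [^:]* matches an Info menu entry up to the colon.
(define menu (string-match "^\\* [^:]*" "* Menu entry:: description"))
(match:substring menu)                       ; ⇒ "* Menu entry"

;; Four backslashes in the source match one backslash in the target.
(define tex (string-match "\\\\let\\\\[A-Za-z]*" "\\let\\foo = 1"))
(match:substring tex)                        ; ⇒ "\\let\\foo", i.e. \let\foo

;; regexp-quote inserts the regexp-level backslashes for us.
(define quoted (string-match (regexp-quote "a*b") "xa*by"))
(match:substring quoted)                     ; ⇒ "a*b"
```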
The reason for the unwieldiness of this syntax is historical. Both regular expression pattern matchers and Unix string processing systems have traditionally used backslashes with the special meanings described above. The POSIX regular expression specification and ANSI C standard both require these semantics. Attempting to abandon either convention would cause other kinds of compatibility problems, possibly more severe ones. Therefore, without extending the Scheme reader to support strings with different quoting conventions (an ungainly and confusing extension when implemented in other languages), we must adhere to this cumbersome escape syntax.
The (system base lalr) module provides the
lalr-scm LALR(1) parser generator by Dominique Boucher. lalr-scm uses the same algorithm as GNU
Bison (see Introduction to Bison). Parsers are defined using the
lalr-parser macro.
Generate an LALR(1) syntax analyzer. tokens is a list of symbols representing the terminal symbols of the grammar. rules are the grammar production rules.
Each rule has the form (non-terminal (rhs ...) : action ...), where non-terminal is the name of the rule, rhs are the right-hand sides, i.e., the production rule, and action is a semantic action associated with the rule.
The generated parser is a two-argument procedure that takes a tokenizer and a syntax error procedure. The tokenizer should be a thunk that returns lexical tokens as produced by make-lexical-token. The syntax error procedure may be called with at least an error message (a string), and optionally the lexical token that caused the error.
Please refer to the lalr-scm documentation for details.
This chapter describes Guile functions that are concerned with reading, loading, evaluating, and compiling Scheme code at run time.
An expression to be evaluated takes one of the following forms.
(define x 123)
x ⇒ 123
(proc args...)
The order in which proc and the arguments are evaluated is unspecified, so be careful when using expressions with side effects.
(max 1 2 3) ⇒ 3
(define (get-some-proc) min)
((get-some-proc) 1 2 3) ⇒ 1
The same sort of parenthesised form is used for a macro invocation,
but in that case the arguments are not evaluated. See the
descriptions of macros for more on this (see Macros, and
see Syntax Rules).
123 ⇒ 123
99.9 ⇒ 99.9
"hello" ⇒ "hello"
#\z ⇒ #\z
#t ⇒ #t
Note that an application must not attempt to modify literal strings,
since they may be in read-only memory.
(quote data)
'data
The apostrophe syntax ' is simply a shorthand for a quote form.
For example,
'x ⇒ x
'(1 2 3) ⇒ (1 2 3)
'#(1 (2 3) 4) ⇒ #(1 (2 3) 4)
(quote x) ⇒ x
(quote (1 2 3)) ⇒ (1 2 3)
(quote #(1 (2 3) 4)) ⇒ #(1 (2 3) 4)
Note that an application must not attempt to modify literal lists or
vectors obtained from a quote form, since they may be in
read-only memory.
(quasiquote data)
`data
Quasi-quotation is like quote, but selected
sub-expressions are evaluated. This is a convenient way to construct
a list or vector structure most of which is constant, but at certain
points should have expressions substituted.
The same effect can always be had with suitable list,
cons or vector calls, but quasi-quoting is often easier.
(unquote expr)
,expr
Within the quasiquoted data, unquote or , indicates
an expression to be evaluated and inserted. The comma syntax ,
is simply a shorthand for an unquote form. For example,
`(1 2 ,(* 9 9) 3 4) ⇒ (1 2 81 3 4)
`(1 (unquote (+ 1 1)) 3) ⇒ (1 2 3)
`#(1 ,(/ 12 2)) ⇒ #(1 6)
(unquote-splicing expr)
,@expr
Within the quasiquoted data, unquote-splicing or
,@ indicates an expression to be evaluated and the elements of
the returned list inserted. expr must evaluate to a list. The
“comma-at” syntax ,@ is simply a shorthand for an
unquote-splicing form.
(define x '(2 3))
`(1 ,@x 4) ⇒ (1 2 3 4)
`(1 (unquote-splicing (map 1+ x))) ⇒ (1 3 4)
`#(9 ,@x 9) ⇒ #(9 2 3 9)
Notice ,@ differs from plain , in the way one level of
nesting is stripped. For ,@ the elements of a returned list
are inserted, whereas with , it would be the list itself
inserted.
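The difference between the two forms is easiest to see side by side. This short sketch reuses the list x from the examples above and also shows equivalent list and append calls.

```scheme
(define x '(2 3))

;; With , the list itself becomes one element of the result.
`(1 ,x 4)                        ; ⇒ (1 (2 3) 4)
(list 1 x 4)                     ; ⇒ (1 (2 3) 4)

;; With ,@ the elements of the list are spliced in.
`(1 ,@x 4)                       ; ⇒ (1 2 3 4)
(append (list 1) x (list 4))     ; ⇒ (1 2 3 4)
```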
Comments in Scheme source files are written by starting them with a
semicolon character (;). The comment then reaches up to the end
of the line. Comments can begin at any column, and they may be inserted
on the same line as Scheme code.
; Comment
;; Comment too
(define x 1) ; Comment after expression
(let ((y 1))
;; Display something.
(display y)
;;; Comment at left margin.
(display (+ y 1)))
It is common to use a single semicolon for comments following expressions on a line, to use two semicolons for comments which are indented like code, and three semicolons for comments which start at column 0, even if they are inside an indented code block. This convention is used when indenting code in Emacs' Scheme mode.
In addition to the standard line comments defined by R5RS, Guile has
another comment type for multiline comments, called block
comments. This type of comment begins with the character sequence
#! and ends with the characters !#, which must appear on a
line of their own. These comments are compatible with the block
comments in the Scheme Shell scsh (see The Scheme shell (scsh)). The characters #! were chosen because they are the
magic characters used in shell scripts for indicating that the name of
the program for executing the script follows on the same line.
Thus a Guile script often starts like this.
#! /usr/local/bin/guile -s
!#
More details on Guile scripting can be found in the scripting section (see Guile Scripting).
Similarly, Guile (starting from version 2.0) supports nested block comments as specified by R6RS and SRFI-30:
(+ #| this is a #| nested |# block comment |# 2)
⇒ 3
For backward compatibility, this syntax can be overridden with
read-hash-extend (see read-hash-extend).
There is one special case where the contents of a comment can actually
affect the interpretation of code. When a character encoding
declaration, such as coding: utf-8 appears in one of the first
few lines of a source file, it indicates to Guile's default reader
that this source code file is not ASCII. For details see Character Encoding of Source Files.
Scheme as defined in R5RS is not case sensitive when reading symbols. Guile, by contrast, is case sensitive by default, so the identifiers
guile-whuzzy
Guile-Whuzzy
are the same in R5RS Scheme, but are different in Guile.
It is possible to turn off case sensitivity in Guile by setting the
reader option case-insensitive. For more information on reader
options, see Scheme Read.
(read-enable 'case-insensitive)
Note that this is seldom a problem, because Scheme programmers tend not to use uppercase letters in their identifiers anyway.
Install the procedure proc for reading expressions starting with the character sequence # and chr. proc will be called with two arguments: the character chr and the port to read further data from. The object returned will be the return value of read. Passing #f for proc will remove a previous setting.
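As an illustration of the procedure described above, the following sketch installs a reader extension for a made-up syntax, #$datum, which reads as the list (dollar datum). The character $ and the symbol dollar are arbitrary choices for this example, not anything Guile defines.

```scheme
;; Hypothetical syntax: #$datum reads as (dollar datum).
(read-hash-extend #\$
  (lambda (chr port)
    ;; chr is #\$ here; the port is positioned just after "#$".
    (list 'dollar (read port))))

(define parsed (call-with-input-string "#$foo" read))
parsed                          ; ⇒ (dollar foo)

;; Remove the extension again.
(read-hash-extend #\$ #f)
```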
Read an s-expression from the input port port, or from the current input port if port is not specified. Any whitespace before the next token is discarded.
The behaviour of Guile's Scheme reader can be modified by manipulating its read options.
Display the current settings of the read options. If setting is omitted, only a short form of the current read options is printed. Otherwise if setting is the symbol
help, a complete options description is displayed.
The set of available options, and their default values, may be had by
invoking read-options at the prompt.
scheme@(guile-user)> (read-options)
(square-brackets keywords #f positions)
scheme@(guile-user)> (read-options 'help)
copy no Copy source code expressions.
positions yes Record positions of source code expressions.
case-insensitive no Convert symbols to lower case.
keywords #f Style of keyword recognition: #f, 'prefix or 'postfix.
r6rs-hex-escapes no Use R6RS variable-length character and string hex escapes.
square-brackets yes Treat `[' and `]' as parentheses, for R6RS compatibility.
hungry-eol-escapes no In strings, consume leading whitespace after an
escaped end-of-line.
The boolean options may be toggled with read-enable and
read-disable. The non-boolean keywords option must be set
using read-set!.
Modify the read options.
read-enable should be used with boolean options and switches them on; read-disable switches them off.
read-set! can be used to set an option to a specific value. Due to historical oddities, it is a macro that expects an unquoted option name.
For example, to make read fold all symbols to their lower case
(perhaps for compatibility with older Scheme code), you can enter:
(read-enable 'case-insensitive)
For more information on the effect of the r6rs-hex-escapes and
hungry-eol-escapes options, see String Syntax.
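The effect of these options can be observed by reading from a string port. The sketch below toggles case-insensitive and sets keywords, restoring the defaults afterwards; the input strings are arbitrary.

```scheme
;; Boolean option: fold symbols to lower case while it is enabled.
(read-enable 'case-insensitive)
(define folded (call-with-input-string "Hello-World" read))
(read-disable 'case-insensitive)
(define unfolded (call-with-input-string "Hello-World" read))

folded                          ; ⇒ hello-world
unfolded                        ; ⇒ Hello-World

;; Non-boolean option: read-set! takes the option name unquoted.
(read-set! keywords 'prefix)
(define kw (call-with-input-string ":foo" read))
(read-set! keywords #f)

(keyword? kw)                   ; ⇒ #t
```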
Any Scheme value may be written to a port. Not all values may be read back in (see Scheme Read), however.
Send a representation of obj to port or to the current output port if not given.
The output is designed to be machine readable, and can be read back with read (see Scheme Read). Strings are printed in double quotes, with escapes if necessary, and characters are printed in ‘#\’ notation.
Send a representation of obj to port or to the current output port if not given.
The output is designed for human readability. It differs from write in that strings are printed without double quotes and escapes, and characters are printed as per write-char, not in ‘#\’ form.
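To compare the two procedures, their output can be captured in strings. The helper names written and displayed below are invented for this sketch.

```scheme
;; Hypothetical helpers that capture output in a string.
(define (written obj)
  (call-with-output-string (lambda (port) (write obj port))))
(define (displayed obj)
  (call-with-output-string (lambda (port) (display obj port))))

(displayed "a \"quoted\" word")   ; the text: a "quoted" word
(written "a \"quoted\" word")     ; the text: "a \"quoted\" word"
(displayed #\z)                   ; the text: z
(written #\z)                     ; the text: #\z

;; Output from write can be read back in.
(define datum '(1 "two" #\3))
(equal? datum (call-with-input-string (written datum) read))   ; ⇒ #t
```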
As was the case with the Scheme reader, there are a few options that affect the behavior of the Scheme printer.
Display the current settings of the print options. If setting is omitted, only a short form of the current print options is printed. Otherwise if setting is the symbol
help, a complete options description is displayed.
The set of available options, and their default values, may be had by
invoking print-options at the prompt.
scheme@(guile-user)> (print-options)
(quote-keywordish-symbols reader highlight-suffix "}" highlight-prefix "{")
scheme@(guile-user)> (print-options 'help)
highlight-prefix { The string to print before highlighted values.
highlight-suffix } The string to print after highlighted values.
quote-keywordish-symbols reader How to print symbols that have a colon
as their first or last character. The
value '#f' does not quote the colons;
'#t' quotes them; 'reader' quotes them
when the reader option 'keywords' is
not '#f'.
escape-newlines yes Render newlines as \n when printing
using `write'.
These options may be modified with the print-set! syntax.
Modify the print options. Due to historical oddities,
print-set! is a macro that expects an unquoted option name.
Scheme has the lovely property that its expressions may be represented
as data. The eval procedure takes a Scheme datum and evaluates
it as code.
Evaluate exp, a list representing a Scheme expression, in the top-level environment specified by module. While exp is evaluated (using primitive-eval), module is made the current module. The current module is reset to its previous value when eval returns. Example:
(eval '(+ 1 2) (interaction-environment))
Return a specifier for the environment that contains implementation-defined bindings, typically a superset of those listed in the report. The intent is that this procedure will return the environment in which the implementation would evaluate expressions dynamically typed by the user.
See Environments, for other environments.
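Because the expression is ordinary data, it can be built with list operations before being handed to eval. The sketch below shows this, and also passes a module obtained from resolve-module as the environment.

```scheme
;; Build the expression (+ 1 2) as a list, then evaluate it.
(define expr (list '+ 1 2))
(eval expr (interaction-environment))                        ; ⇒ 3

;; Any module can serve as the environment.
(eval '(string-append "a" "b") (resolve-module '(guile)))    ; ⇒ "ab"

;; primitive-eval evaluates in the current module.
(primitive-eval '(* 6 7))                                    ; ⇒ 42
```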
One does not always receive code as Scheme data, of course, and this is
especially the case for Guile's other language implementations
(see Other Languages). For the case in which all you have is a
string, we have eval-string. There is a legacy version of this
procedure in the default environment, but you really want the one from
(ice-9 eval-string), so load it up:
(use-modules (ice-9 eval-string))
Parse string according to the current language, normally Scheme. Evaluate or compile the expressions it contains, in order, returning the value of the last expression.
If the module keyword argument is set, save a module excursion (see Module System Reflection) and set the current module to module before evaluation.
The file, line, and column keyword arguments can be used to indicate that the source string begins at a particular source location.
Finally, lang is a language, defaulting to the current language, and the expression is compiled if compile? is true or there is no evaluator for the given language.
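A brief sketch of eval-string with and without its keyword arguments follows. The strings being evaluated are arbitrary examples.

```scheme
(use-modules (ice-9 eval-string))

;; Expressions are evaluated in order; the last value is returned.
(eval-string "(+ 1 2) (* 3 4)")                       ; ⇒ 12

;; Evaluate inside a specific module.
(eval-string "(string-upcase \"abc\")"
             #:module (resolve-module '(guile)))      ; ⇒ "ABC"

;; Ask for compilation instead of interpretation.
(eval-string "(+ 1 2)" #:compile? #t)                 ; ⇒ 3
```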
These C bindings call eval-string from (ice-9 eval-string), evaluating within module or the current module.
Like scm_eval_string, but taking a C string in locale encoding instead of an SCM.
Call proc with arguments arg1 ... argN plus the elements of the arglst list.
scm_apply takes parameters corresponding to a Scheme level (lambda (proc arg . rest) ...). So arg and all but the last element of the rest list make up arg1...argN and the last element of rest is the arglst list. Or if rest is the empty list SCM_EOL then there's no arg1...argN and arg is the arglst.
arglst is not modified, but the rest list passed to scm_apply is modified.
Call proc with the given arguments.
Call proc with any number of arguments. The argument list must be terminated by SCM_UNDEFINED. For example:
scm_call (scm_c_public_ref ("guile", "+"), scm_from_int (1), scm_from_int (2), SCM_UNDEFINED);
Call proc with the array of arguments argv, as a SCM*. The length of the arguments should be passed in nargs, as a size_t.
lst should be a list (arg1 ... argN arglst), with arglst being a list. This function returns a list comprising arg1 to argN plus the elements of arglst. lst is modified to form the return. arglst is not modified, though the return does share structure with it.
This operation collects up the arguments from a list which is apply style parameters.
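At the Scheme level, the argument conventions described above look as follows. This is a small sketch; note that apply:nconc2last is given a freshly allocated list, since it modifies its argument.

```scheme
;; Only an argument list.
(apply + '(1 2 3))                           ; ⇒ 6

;; Leading arguments followed by a final argument list.
(apply + 1 2 '(3 4))                         ; ⇒ 10

;; Flatten apply-style parameters into a plain argument list.
(apply:nconc2last (list 1 2 (list 3 4)))     ; ⇒ (1 2 3 4)
```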
Evaluate exp in the top-level environment specified by the current module.
The eval procedure directly interprets the S-expression
representation of Scheme. An alternate strategy for evaluation is to
determine ahead of time what computations will be necessary to
evaluate the expression, and then use that recipe to produce the
desired results. This is known as compilation.
While it is possible to compile simple Scheme expressions such as
(+ 2 2) or even "Hello world!", compilation is most
interesting in the context of procedures. Compiling a lambda expression
produces a compiled procedure, which is just like a normal procedure
except typically much faster, because it can bypass the generic
interpreter.
Functions from system modules in a Guile installation are normally compiled already, so they load and run quickly.
Note that well-written Scheme programs will not typically call the
procedures in this section, for the same reason that it is often bad
taste to use eval. By default, Guile automatically compiles any
files it encounters that have not been compiled yet (see --auto-compile). The compiler can also be invoked
explicitly from the shell as guild compile foo.scm.
(Why are calls to eval and compile usually in bad taste?
Because they are limited, in that they can only really make sense for
top-level expressions. Also, most needs for “compile-time”
computation are fulfilled by macros and closures. Of course one good
counterexample is the REPL itself, or any code that reads expressions
from a port.)
Automatic compilation generally works transparently, without any need
for user intervention. However Guile does not yet do proper dependency
tracking, so that if file a.scm uses macros from
b.scm, and b.scm changes, a.scm
would not be automatically recompiled. To forcibly invalidate the
auto-compilation cache, pass the --fresh-auto-compile option to
Guile, or set the GUILE_AUTO_COMPILE environment variable to
fresh (instead of to 0 or 1).
For more information on the compiler itself, see Compiling to the Virtual Machine. For information on the virtual machine, see A Virtual Machine for Guile.
The command-line interface to Guile's compiler is the guild compile command:
Compile file, a source file, and store bytecode in the compilation cache or in the file specified by the -o option. The following options are available:
- -L dir
- --load-path=dir
- Add dir to the front of the module load path.
- -o ofile
- --output=ofile
- Write output bytecode to ofile. By convention, bytecode file names end in .go. When -o is omitted, the output file name is as for compile-file (see below).
- -W warning
- --warn=warning
- Emit warnings of type warning; use --warn=help for a list of available warnings and their description. Currently recognized warnings include unused-variable, unused-toplevel, unbound-variable, arity-mismatch, and format.
- -f lang
- --from=lang
- Use lang as the source language of file. If this option is omitted, scheme is assumed.
- -t lang
- --to=lang
- Use lang as the target language of file. If this option is omitted, objcode is assumed.
- -T target
- --target=target
- Produce bytecode for target instead of %host-type (see %host-type). Target must be a valid GNU triplet, such as armv5tel-unknown-linux-gnueabi (see Specifying Target Triplets).
Each file is assumed to be UTF-8-encoded, unless it contains a coding declaration as recognized by file-encoding (see Character Encoding of Source Files).
The compiler can also be invoked directly by Scheme code using the procedures below:
Compile the expression exp in the environment env. If exp is a procedure, the result will be a compiled procedure; otherwise compile is mostly equivalent to eval.
For a discussion of languages and compiler options, see Compiling to the Virtual Machine.
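A minimal sketch of calling compile on expressions follows. It assumes the procedure is taken from the (system base compile) module and leaves all keyword arguments at their defaults; the name square is made up.

```scheme
(use-modules (system base compile))

;; For a simple expression the result is the same as with eval.
(compile '(+ 2 2))                                 ; ⇒ 4

;; Compiling a lambda expression yields a compiled procedure.
(define square (compile '(lambda (x) (* x x))))
(procedure? square)                                ; ⇒ #t
(square 5)                                         ; ⇒ 25
```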
Compile the file named file.
Output will be written to output-file. If you do not supply an output file name, output is written to a file in the cache directory, as computed by (compiled-file-name file).
from and to specify the source and target languages. See Compiling to the Virtual Machine, for more information on these options, and on env and opts.
As with guild compile, file is assumed to be UTF-8-encoded unless it contains a coding declaration.
Compute a cached location for a compiled version of a Scheme file named file.
This file will usually be below the $HOME/.cache/guile/ccache directory, depending on the value of the XDG_CACHE_HOME environment variable. The intention is that compiled-file-name provides a fallback location for caching auto-compiled files. If you want to place a compiled file in the %load-compiled-path, you should pass the output-file option to compile-file explicitly.
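The following sketch ties compile-file, compiled-file-name and load-compiled together. The file names under /tmp and the variable compiled-answer are hypothetical, chosen only for this example, and the code assumes /tmp is writable.

```scheme
(use-modules (system base compile))

;; Hypothetical file names for this example.
(define source "/tmp/guile-compile-example.scm")
(define output "/tmp/guile-compile-example.go")

;; Write a one-line source file.
(call-with-output-file source
  (lambda (port)
    (write '(define compiled-answer (* 6 7)) port)))

;; Where the bytecode would go if no output file were given.
(compiled-file-name source)

;; Compile to an explicit location, then load the bytecode.
(compile-file source #:output-file output)
(load-compiled output)

(module-ref (current-module) 'compiled-answer)     ; ⇒ 42
```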
This variable contains the options passed to the compile-file procedure when auto-compiling source files. By default, it enables useful compilation warnings. It can be customized from ~/.guile.
Load filename and evaluate its contents in the top-level environment.
reader if provided should be either #f, or a procedure with the signature (lambda (port) ...) which reads the next expression from port. If reader is #f or absent, Guile's built-in read procedure is used (see Scheme Read).
The reader argument takes effect by setting the value of the current-reader fluid (see below) before loading the file, and restoring its previous value when loading is complete. The Scheme code inside filename can itself change the current reader procedure on the fly by setting the current-reader fluid.
If the variable %load-hook is defined, it should be bound to a procedure that will be called before any code is loaded. See documentation for %load-hook later in this section.
Load the compiled file named filename.
Compiling a source file (see Read/Load/Eval/Compile) and then calling load-compiled on the resulting file is equivalent to calling load on the source file.
Load the file named filename and evaluate its contents in the top-level environment. filename must either be a full pathname or be a pathname relative to the current directory. If the variable %load-hook is defined, it should be bound to a procedure that will be called before any code is loaded. See the documentation for %load-hook later in this section.
Like scm_primitive_load, but taking a C string instead of an SCM.
current-reader holds the read procedure that is currently being used by the above loading procedures to read expressions (from the file that they are loading). current-reader is a fluid, so it has an independent value in each dynamic root and should be read and set using fluid-ref and fluid-set! (see Fluids and Dynamic States).
Changing current-reader is typically useful to introduce local syntactic changes, such that code following the fluid-set! call is read using the newly installed reader. The current-reader change should take place at evaluation time when the code is evaluated, or at compilation time when the code is compiled:
(eval-when (compile eval)
  (fluid-set! current-reader my-own-reader))
The eval-when form above ensures that the current-reader change occurs at the right time.
A procedure to be called (%load-hook filename) whenever a file is loaded, or #f for no such call. %load-hook is used by all of the loading functions (load and primitive-load, and load-from-path and primitive-load-path documented in the next section).
For example an application can set this to show what's loaded,
(set! %load-hook (lambda (filename)
                   (format #t "Loading ~a ...\n" filename)))
(load-from-path "foo.scm")
-| Loading /usr/local/share/guile/site/foo.scm ...
Return the current-load-port. The load port is used internally by
primitive-load.
The procedures in the previous section look for Scheme code at specific locations in the file system. Guile also has some procedures to search the load path for code.
List of directories which should be searched for Scheme modules and libraries. %load-path is initialized when Guile starts up to (list (%site-dir) (%library-dir) (%package-data-dir)), prepended with the contents of the GUILE_LOAD_PATH environment variable, if it is set. See Build Config, for more on %site-dir and related procedures.
Similar to
load, but searches for filename in the load paths. Preferentially loads a compiled version of the file, if it is available and up-to-date.
A user can extend the load path by calling add-to-load-path.
For example, a script might include this form to add the directory that it is in to the load path:
(add-to-load-path (dirname (current-filename)))
It's better to use add-to-load-path than to modify
%load-path directly, because add-to-load-path takes care
of modifying the path both at compile-time and at run-time.
Search %load-path for the file named filename and load it into the top-level environment. If filename is a relative pathname and is not found in the list of search paths, an error is signalled. Preferentially loads a compiled version of the file, if it is available and up-to-date.
By default or if exception-on-not-found is true, an exception is raised if filename is not found. If exception-on-not-found is #f and filename is not found, no exception is raised and #f is returned. For compatibility with Guile 1.8 and earlier, the C function takes only one argument, which can be either a string (the file name) or an argument list.
Search %load-path for the file named filename, which must be readable by the current user. If filename is found in the list of paths to search or is an absolute pathname, return its full pathname. Otherwise, return #f. Filenames may have any of the optional extensions in the %load-extensions list; %search-load-path will try each extension automatically.
A list of default file extensions for files containing Scheme code. %search-load-path tries each of these extensions when looking for a file to load. By default, %load-extensions is bound to the list ("" ".scm").
As mentioned above, when Guile searches the %load-path for a
source file, it will also search the %load-compiled-path for a
corresponding compiled file. If the compiled file is as new or newer
than the source file, it will be loaded instead of the source file,
using load-compiled.
Like
%load-path, but for compiled files. By default, this path has two entries: one for compiled files from Guile itself, and one for site packages.
When primitive-load-path searches the %load-compiled-path
for a corresponding compiled file for a relative path it does so by
appending .go to the relative path. For example, searching for
ice-9/popen could find
/usr/lib/guile/2.0/ccache/ice-9/popen.go, and use it instead of
/usr/share/guile/2.0/ice-9/popen.scm.
If primitive-load-path does not find a corresponding .go
file in the %load-compiled-path, or the .go file is out of
date, it will search for a corresponding auto-compiled file in the
fallback path, possibly creating one if one does not exist.
See Installing Site Packages, for more on how to correctly install site packages. See Modules and the File System, for more on the relationship between load paths and modules. See Compilation, for more on the fallback path and auto-compilation.
Finally, there are a couple of helper procedures for general path manipulation.
Parse path, which is expected to be a colon-separated string, into a list and return the resulting list with tail appended. If path is
#f, tail is returned.
Search path for a directory containing a file named filename. The file must be readable, and not a directory. If we find one, return its full filename; otherwise, return #f. If filename is absolute, return it unchanged. If given, extensions is a list of strings; for each directory in path, we search for filename concatenated with each extension. If require-exts? is true, require that the returned file name have one of the given extensions; if require-exts? is not given, it defaults to #f.
For compatibility with Guile 1.8 and earlier, the C function takes only three arguments.
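The two helpers can be tried out as follows. The directory names are made up, and the result of the search-path call on %load-path depends on the local installation, so no result is shown for it.

```scheme
;; Split a colon-separated string, appending an optional tail.
(parse-path "/usr/lib:/opt/lib")             ; ⇒ ("/usr/lib" "/opt/lib")
(parse-path "/usr/lib:/opt/lib" '("/tail"))  ; ⇒ ("/usr/lib" "/opt/lib" "/tail")
(parse-path #f '("/tail"))                   ; ⇒ ("/tail")

;; Look for ice-9/popen or ice-9/popen.scm along the load path.
;; The result is a file name or #f, depending on the installation.
(search-path %load-path "ice-9/popen" '("" ".scm"))

;; A file that cannot be found yields #f.
(search-path '("/nonexistent-example-dir") "no-such-file")   ; ⇒ #f
```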
Scheme source code files are usually encoded in ASCII, but the
built-in reader can interpret other character encodings. The
procedure primitive-load, and by extension the functions that
call it, such as load, first scan the top 500 characters of the
file for a coding declaration.
A coding declaration has the form coding: XXXXXX, where
XXXXXX is the name of a character encoding in which the source
code file has been encoded. The coding declaration must appear in a
Scheme comment. It can either be a semicolon-initiated comment or a block
#! comment.
The name of the character encoding in the coding declaration is
typically lower case and contains only letters, numbers, and hyphens,
as recognized by set-port-encoding! (see set-port-encoding!). Common examples of character encoding
names are utf-8 and iso-8859-1,
as defined by IANA. Thus, the coding declaration is mostly compatible with Emacs.
However, there are some differences in encoding names recognized by
Emacs and encoding names defined by IANA, the latter being essentially a
subset of the former. For instance, latin-1 is a valid encoding
name for Emacs, but it's not according to the IANA standard, which Guile
follows; instead, you should use iso-8859-1, which is both
understood by Emacs and registered by IANA. (IANA writes it uppercase,
Emacs wants it lowercase, and Guile is case insensitive.)
For source code, only a subset of all possible character encodings can
be interpreted by the built-in source code reader. Only those
character encodings in which ASCII text appears unmodified can be
used. This includes UTF-8 and ISO-8859-1 through
ISO-8859-15. The multi-byte character encodings UTF-16
and UTF-32 may not be used because they are not compatible with
ASCII.
There might be a scenario in which one would want to read non-ASCII
code from a port, such as with the function read, instead of
with load. If the port's character encoding is the same as the
encoding of the code to be read by the port, no other special
handling is necessary. The port will automatically do the character
encoding conversion. The functions setlocale and
set-port-encoding! are used to set port encodings
(see Ports).
If a port must read code of unknown character encoding, this can be
accomplished in three steps. First, the character encoding of the
port should be set to ISO-8859-1 using set-port-encoding!.
Then, the procedure file-encoding, described below, is used to
scan for a coding declaration when reading from the port. As a side
effect, it rewinds the port after its scan is complete. After that,
the port's character encoding should be set to the encoding returned
by file-encoding, if any, again by using
set-port-encoding!. Then the code can be read as normal.
Scan the port for an Emacs-like character coding declaration near the top of the contents of a port with random-accessible contents (see how Emacs recognizes file encoding). The coding declaration is of the form coding: XXXXX and must appear in a Scheme comment. Return a string containing the character encoding of the file if a declaration was found, or #f otherwise. The port is rewound.
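The three steps for reading code of unknown encoding can be sketched as follows. The file name under /tmp and its contents are hypothetical, created here only so that the example is self-contained.

```scheme
;; Hypothetical source file carrying a coding declaration.
(define file "/tmp/guile-coding-example.scm")
(call-with-output-file file
  (lambda (port)
    (display ";; coding: iso-8859-1\n(define greeting \"hello\")\n" port)))

(define port (open-input-file file))

;; Step 1: start from ISO-8859-1 so that any byte can be read.
(set-port-encoding! port "ISO-8859-1")

;; Step 2: scan for a declaration; the port is rewound afterwards.
(define declared (file-encoding port))

;; Step 3: switch to the declared encoding, if there was one.
(when declared
  (set-port-encoding! port declared))

;; Now the code can be read as normal.
(define form (read port))
(close-port port)
form                            ; ⇒ (define greeting "hello")
```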
Promises are a convenient way to defer a calculation until its result is actually needed, and to run such a calculation only once.
Return a promise object which holds the given expr expression, ready to be evaluated by a later
force.
Return the value obtained from evaluating the expr in the given promise p. If p has previously been forced then its expr is not evaluated again, instead the value obtained at that time is simply returned.
During a force, an expr can call force again on its own promise, resulting in a recursive evaluation of that expr. The first evaluation to return gives the value for the promise. Higher evaluations run to completion in the normal way, but their results are ignored; force always returns the first value.
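That a promise runs its expression at most once can be checked with a counter. This is a small sketch; the names count and p are arbitrary.

```scheme
(define count 0)

;; Nothing is evaluated when the promise is created.
(define p (delay (begin
                   (set! count (+ count 1))
                   (* 6 7))))
count                           ; ⇒ 0

;; The first force runs the expression; the second reuses the value.
(force p)                       ; ⇒ 42
(force p)                       ; ⇒ 42
count                           ; ⇒ 1
```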
Guile includes a facility to capture a lexical environment, and later evaluate a new expression within that environment. This code is implemented in a module.
(use-modules (ice-9 local-eval))
Captures and returns a lexical environment for use with local-eval or local-compile.
Evaluate or compile the expression exp in the lexical environment env.
Here is a simple example, illustrating that it is the variable that gets captured, not just its value at one point in time.
(define e (let ((x 100)) (the-environment)))
(define fetch-x (local-eval '(lambda () x) e))
(fetch-x)
⇒ 100
(local-eval '(set! x 42) e)
(fetch-x)
⇒ 42
While exp is evaluated within the lexical environment of
(the-environment), it has the dynamic environment of the call to
local-eval.
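A sketch of the distinction, using a fluid to stand in for the dynamic environment. The names who, env and result are invented for this example, and the expected value follows from the statement above rather than from any further guarantee.

```scheme
(use-modules (ice-9 local-eval))

;; A fluid gives us a piece of dynamic state to observe.
(define who (make-fluid))
(fluid-set! who 'toplevel)

;; Capture the lexical environment while the fluid is bound to
;; capture-time.  Only the lexical binding of x is captured.
(define env
  (with-fluids ((who 'capture-time))
    (let ((x 10))
      (the-environment))))

;; Evaluate within the captured environment while the fluid is bound
;; to call-time.  x comes from the captured lexical environment; the
;; fluid value comes from the dynamic environment of this call.
(define result
  (with-fluids ((who 'call-time))
    (local-eval '(list x (fluid-ref who)) env)))

(write result)   ; should print (10 call-time)
(newline)
```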
local-eval and local-compile can only evaluate
expressions, not definitions.
(local-eval '(define foo 42)
(let ((x 100)) (the-environment)))
⇒ syntax error: definition in expression context
Note that the current implementation of (the-environment) only
captures “normal” lexical bindings, and pattern variables bound by
syntax-case. It does not currently capture local syntax
transformers bound by let-syntax, letrec-syntax or
non-top-level define-syntax forms. Any attempt to reference such
captured syntactic keywords via local-eval or
local-compile produces an error.
This section has discussed various means of linking Scheme code
together: fundamentally, loading up files at run-time using load
and load-compiled. Guile provides another option to compose
parts of programs together at expansion-time instead of at run-time.
Open file-name, at expansion-time, and read the Scheme forms that it contains, splicing them into the location of the include, within a begin.
For C programmers: if load in Scheme is like
dlopen in C, then include is like the C
preprocessor's #include. When you use include, it is as
if the contents of the included file were typed in place of the
include form.
Because the code is included at compile-time, it is available to the
macroexpander. Syntax definitions in the included file are available to
later code in the form in which the include appears, without the
need for eval-when. (See Eval When.)
For the same reason, compiling a form that uses include results
in one compilation unit, composed of multiple files. Loading the
compiled file is one stat operation for the compilation unit,
instead of 2*n in the case of load (once for each
loaded source file, and once for each corresponding compiled file, in the
best case).
Unlike load, include also works within nested lexical
contexts. It so happens that the optimizer works best within a lexical
context, because all of the uses of bindings in a lexical context are
visible, so composing files by including them within a (let ()
...) can sometimes lead to important speed improvements.
On the other hand, include does have all the disadvantages of
early binding: once the code with the include is compiled, no
change to the included file is reflected in the future behavior of the
including form.
Also, the particular form of include, which requires an absolute
path, or a path relative to the current directory at compile-time, is
not very amenable to compiling the source in one place, but then
installing the source to another place. For this reason, Guile provides
another form, include-from-path, which looks for the source file
to include within a load path.
Like include, but instead of expecting file-name to be an absolute file name, it is expected to be a relative path to search in the %load-path.
include-from-path is more useful when you want to install all of
the source files for a package (as you should!). It makes it possible
to evaluate an installed file from source, instead of relying on the
.go file being up to date.
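The following sketch shows include splicing definitions into a nested lexical context. The file name and the procedure helper-square are made up for the illustration. Because include needs the file to exist at expansion time, this self-contained demo builds the including form as data and expands it with eval; in ordinary code you would simply write the include form in place.

```scheme
;; Write a small file to include.  mkstemp! replaces the XXXXXX in the
;; string with a unique suffix and returns an open port.
(define helper-file (string-copy "/tmp/guile-include-XXXXXX"))
(let ((port (mkstemp! helper-file)))
  (write '(define (helper-square n) (* n n)) port)
  (newline port)
  (close-port port))

;; The included definition is local to the (let () ...) body, as if it
;; had been typed there instead of the include form.
(define result
  (eval `(let ()
           (include ,helper-file)
           (helper-square 7))
        (interaction-environment)))

(delete-file helper-file)
(display result)   ; prints 49
(newline)
```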
Guile uses a garbage collector to manage most of its objects. While the garbage collector is designed to be mostly invisible, you sometimes need to interact with it explicitly.
See Garbage Collection for a general discussion of how garbage collection relates to using Guile from C.
Scans all SCM objects and reclaims for further use those that are no longer accessible. You normally don't need to call this function explicitly. It is called automatically when appropriate.
Protects obj from being freed by the garbage collector, when it otherwise might be. When you are done with the object, call scm_gc_unprotect_object on the object. Calls to scm_gc_protect_object/scm_gc_unprotect_object can be nested, and the object remains protected until it has been unprotected as many times as it was protected. It is an error to unprotect an object more times than it has been protected. Returns the SCM object it was passed.
Note that storing obj in a C global variable has the same effect [14].
Unprotects an object from the garbage collector which was protected by scm_gc_protect_object. Returns the SCM object it was passed.
Similar to scm_gc_protect_object in that it causes the collector to always mark the object, except that it should not be nested (only call scm_permanent_object on an object once), and it has no corresponding unpermanent function. Once an object is declared permanent, it will never be freed. Returns the SCM object it was passed.
Create a reference to the given object or objects, so they're certain to be present on the stack or in a register and hence will not be freed by the garbage collector before this point.
Note that these functions can only be applied to ordinary C local variables (i.e., “automatics”). Objects held in global or static variables, in a malloced block, or the like cannot be protected with this mechanism.
Return an association list of statistics about Guile's current use of storage.
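From Scheme, the statistics can be inspected after an explicit collection, as in this sketch. The set of keys in the returned list depends on the Guile version, so the example prints whatever is present rather than assuming particular names.

```scheme
;; Force a collection, then list the collector's statistics.
(gc)

(define stats (gc-stats))

;; Each entry is a (name . value) pair.
(for-each (lambda (entry)
            (display (car entry))
            (display ": ")
            (display (cdr entry))
            (newline))
          stats)
```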
Return an alist of statistics of the current live objects.
Mark the object x, and recurse on any objects x refers to. If x's mark bit is already set, return immediately. This function must only be called during the mark-phase of garbage collection, typically from a smob mark function.
In C programs, dynamic management of memory blocks is normally done with the functions malloc, realloc, and free. Guile has additional functions for dynamic memory allocation that are integrated into the garbage collector and the error reporting system.
Memory blocks that are associated with Scheme objects (for example a
smob) should be allocated with scm_gc_malloc or
scm_gc_malloc_pointerless. These two functions will either
return a valid pointer or signal an error. Memory blocks allocated this
way can be freed with scm_gc_free; however, this is not strictly
needed: memory allocated with scm_gc_malloc or
scm_gc_malloc_pointerless is automatically reclaimed when the
garbage collector no longer sees any live reference to it [15].
Memory allocated with scm_gc_malloc is scanned for live pointers.
This means that if scm_gc_malloc-allocated memory contains a
pointer to some other part of the memory, the garbage collector notices
it and prevents it from being reclaimed [16]. Conversely, memory
allocated with scm_gc_malloc_pointerless is assumed to be
“pointer-less” and is not scanned.
For memory that is not associated with a Scheme object, you can use
scm_malloc instead of malloc. Like
scm_gc_malloc, it will either return a valid pointer or signal
an error. However, it will not assume that the new memory block can
be freed by a garbage collection. The memory must be explicitly freed
with free.
There are also scm_gc_realloc and scm_realloc, to be used
in place of realloc when appropriate, and scm_gc_calloc
and scm_calloc, to be used in place of calloc when
appropriate.
The function scm_dynwind_free can be useful when memory should be
freed with libc's free when leaving a dynwind context
(see Dynamic Wind).
Allocate size bytes of memory and return a pointer to it. When size is 0, return NULL. When not enough memory is available, signal an error. This function runs the GC to free up some memory when it deems it appropriate.
The memory is allocated by the libc malloc function and can be freed with free. There is no scm_free function to go with scm_malloc to make it easier to pass memory back and forth between different modules.
The function scm_calloc is similar to scm_malloc, but initializes the block of memory to zero as well.
These functions will (indirectly) call scm_gc_register_allocation.
Change the size of the memory block at mem to new_size and return its new location. When new_size is 0, this is the same as calling free on mem and NULL is returned. When mem is NULL, this function behaves like scm_malloc and allocates a new block of size new_size.
When not enough memory is available, signal an error. This function runs the GC to free up some memory when it deems it appropriate.
This function will call scm_gc_register_allocation.
Allocate size bytes of automatically-managed memory. The memory is automatically freed when no longer referenced from any live memory block.
Memory allocated with scm_gc_malloc or scm_gc_calloc is scanned for pointers. Memory allocated by scm_gc_malloc_pointerless is not scanned.
The scm_gc_realloc call preserves the “pointerlessness” of the memory area pointed to by mem. Note that you need to pass the old size of a reallocated memory block as well. See below for a motivation.
Explicitly free the memory block pointed to by mem, which was previously allocated by one of the above scm_gc functions.
Note that you need to explicitly pass the size parameter. This is done since it should normally be easy to provide this parameter (for memory that is associated with GC controlled objects) and help keep the memory management overhead very low. However, in Guile 2.x, size is always ignored.
Informs the garbage collector that size bytes have been allocated, which the collector would otherwise not have known about.
In general, Scheme will decide to collect garbage only after some amount of memory has been allocated. Calling this function will make the Scheme garbage collector know about more allocation, and thus run more often (as appropriate).
It is especially important to call this function when large unmanaged allocations, like images, may be freed by small Scheme allocations, like SMOBs.
Equivalent to scm_dynwind_unwind_handler (free, mem, SCM_F_WIND_EXPLICITLY). That is, the memory block at mem will be freed (using free from the C library) when the current dynwind is left.
Return an alist ((what . n) ...) describing the number of malloced objects. what is the second argument to scm_gc_malloc, n is the number of objects of that type currently allocated.
This function is only available if the GUILE_DEBUG_MALLOC preprocessor macro was defined when Guile was compiled.
Version 1.6 of Guile and earlier did not have the functions from the
previous section. In their place, it had the functions
scm_must_malloc, scm_must_realloc and
scm_must_free. This section explains why we want you to stop
using them, and how to do this.
The functions scm_must_malloc and scm_must_realloc
behaved like scm_gc_malloc and scm_gc_realloc do now,
respectively. They would inform the GC about the newly allocated
memory via the internal equivalent of
scm_gc_register_allocation. However,
scm_must_free did not unregister the memory it was about to
free. The usual way to unregister memory was to return its size from
a smob free function.
This disconnectedness of the actual freeing of memory and reporting
this to the GC proved to be bad in practice. It was easy to make
mistakes and report the wrong size because allocating and freeing was
not done with symmetric code, and because it is cumbersome to compute
the total size of nested data structures that were freed with multiple
calls to scm_must_free. Additionally, there was no equivalent
to scm_malloc, and it was tempting to just use
scm_must_malloc and never to tell the GC that the memory has
been freed.
The effect was that the internal statistics kept by the GC drifted out of sync with reality and could even overflow in long running programs. When this happened, the result was a dramatic increase in (senseless) GC activity which would effectively stop the program dead.
The functions scm_done_malloc and scm_done_free were
introduced to help restore balance to the force, but existing bugs did
not magically disappear, of course.
Therefore we decided to force everybody to review their code by deprecating the existing functions and introducing new ones in their place that are hopefully easier to use correctly.
For every use of scm_must_malloc you need to decide whether to
use scm_malloc or scm_gc_malloc in its place. When the
memory block is not part of a smob or some other Scheme object whose
lifetime is ultimately managed by the garbage collector, use
scm_malloc and free. When it is part of a smob, use
scm_gc_malloc and change the smob free function to use
scm_gc_free instead of scm_must_free or free and
make it return zero.
The important thing is to always pair scm_malloc with
free; and to always pair scm_gc_malloc with
scm_gc_free.
The same reasoning applies to scm_must_realloc and
scm_realloc versus scm_gc_realloc.
[FIXME: This chapter is based on Mikael Djurfeldt's answer to a question by Michael Livshin. Any mistakes are not theirs, of course. ]
Weak references let you attach bookkeeping information to data so that the additional information automatically disappears when the original data is no longer in use and gets garbage collected. In a weak key hash, the hash entry for that key disappears as soon as the key is no longer referenced from anywhere else. For weak value hashes, the same happens as soon as the value is no longer in use. Entries in a doubly weak hash disappear when either the key or the value is no longer used anywhere else.
Object properties offer the same kind of functionality as weak key hashes in many situations (see Object Properties).
Here's an example (a little bit strained perhaps, but one of the examples is actually used in Guile):
Assume that you're implementing a debugging system where you want to associate information about filename and position of source code expressions with the expressions themselves.
Hash tables can be used for that, but if you use ordinary hash tables it will be impossible for the Scheme interpreter to "forget" old source when, for example, a file is reloaded.
To implement the mapping from source code expressions to positional information, it is necessary to use weak-key tables, since we don't want the expressions to be remembered just because they are in our table.
To implement a mapping from source file line numbers to source code expressions, you would use a weak-value table.
To implement a mapping from source code expressions to the procedures they constitute, a doubly-weak table has to be used.
Return a weak hash table with size buckets. As with any hash table, choosing a good size for the table requires some caution.
You can modify weak hash tables in exactly the same way you would modify regular hash tables (see Hash Tables).
Return #t if obj is the specified weak hash table. Note that a doubly weak hash table is neither a weak key nor a weak value hash table.
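A sketch of the debugging scenario described above, assuming the constructors and predicates are named make-weak-key-hash-table, make-doubly-weak-hash-table, weak-key-hash-table? and doubly-weak-hash-table?. The table size 31 and the recorded file position are arbitrary values chosen for the example.

```scheme
;; Map source expressions to (file . line) pairs without keeping the
;; expressions alive just because they are in the table.
(define source-info (make-weak-key-hash-table 31))

(define expr (list '+ 1 2))
(hashq-set! source-info expr '("demo.scm" . 12))

;; While expr is still referenced, its entry can be looked up with
;; the ordinary hash table procedures.
(define found (hashq-ref source-info expr))

;; A doubly weak table is a distinct kind of table.
(define doubly (make-doubly-weak-hash-table 31))

(define checks
  (list (weak-key-hash-table? source-info)   ; #t
        (weak-key-hash-table? doubly)        ; #f
        (doubly-weak-hash-table? doubly)))   ; #t

(write found) (newline)    ; prints ("demo.scm" . 12)
(write checks) (newline)   ; prints (#t #f #t)
```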
Weak vectors are mainly useful in Guile's implementation of weak hash tables.
Return a weak vector with size elements. If the optional argument fill is given, all entries in the vector will be set to fill. The default value for fill is the empty list.
Construct a weak vector from a list: weak-vector uses the list of its arguments while list->weak-vector uses its only argument l (a list) to construct a weak vector the same way list->vector would.
Return #t if obj is a weak vector. Note that all weak hashes are also weak vectors.
Guardians provide a way to be notified about objects that would otherwise be collected as garbage. Guarding them prevents the objects from being collected and cleanup actions can be performed on them, for example.
See R. Kent Dybvig, Carl Bruggeman, and David Eby (1993) "Guardians in a Generation-Based Garbage Collector". ACM SIGPLAN Conference on Programming Language Design and Implementation, June 1993.
Create a new guardian. A guardian protects a set of objects from garbage collection, allowing a program to apply cleanup or other actions.
make-guardian returns a procedure representing the guardian. Calling the guardian procedure with an argument adds the argument to the guardian's set of protected objects. Calling the guardian procedure without an argument returns one of the protected objects which are ready for garbage collection, or #f if no such object is available. Objects which are returned in this way are removed from the guardian.
You can put a single object into a guardian more than once and you can put a single object into more than one guardian. The object will then be returned multiple times by the guardian procedures.
An object is eligible to be returned from a guardian when it is no longer referenced from outside any guardian.
There is no guarantee about the order in which objects are returned from a guardian. If you want to impose an order on finalization actions, for example, you can do that by keeping objects alive in some global data structure until they are no longer needed for finalizing other objects.
Being an element in a weak vector, a key in a hash table with weak keys, or a value in a hash table with weak values does not prevent an object from being returned by a guardian. But as long as an object can be returned from a guardian it will not be removed from such a weak vector or hash table. In other words, a weak link does not prevent an object from being considered collectable, but being inside a guardian prevents a weak link from being broken.
A key in a weak key hash table can be thought of as having a strong reference to its associated value as long as the key is accessible. Consequently, when the key is only accessible from within a guardian, the reference from the key to the value is also considered to be coming from within a guardian. Thus, if there is no other reference to the value, it is eligible to be returned from a guardian.
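A minimal sketch of the guardian protocol follows. Whether the temporary object is actually returned after a single collection depends on the collector, so the example makes no claim about what the loop prints.

```scheme
(define guardian (make-guardian))

;; Nothing has been registered yet, so there is nothing to return.
(define empty-result (guardian))   ; #f

;; This object stays referenced from a top-level variable, so it is
;; not eligible to be returned.
(define kept (list 'still 'in 'use))
(guardian kept)

;; This object has no other reference and may become eligible.
(guardian (list 'temporary 'data))

(gc)

;; Drain the guardian: each call returns one eligible object, or #f
;; when none is available.
(let loop ((obj (guardian)))
  (when obj
    (display "ready for cleanup: ")
    (write obj)
    (newline)
    (loop (guardian))))
```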
When programs become large, naming conflicts can occur when a function or global variable defined in one file has the same name as a function or global variable in another file. Even just a similarity between function names can cause hard-to-find bugs, since a programmer might type the wrong function name.
The approach used to tackle this problem is called information encapsulation, which consists of packaging functional units into a given name space that is clearly separated from other name spaces. The language features that allow this are usually called the module system because programs are broken up into modules that are compiled separately (or loaded separately in an interpreter).
Older languages, like C, have limited support for name space
manipulation and protection. In C a variable or function is public by
default, and can be made local to a module with the static
keyword. But you cannot reference public variables and functions from
another module with different names.
More advanced module systems have become a common feature in recently designed languages: ML, Python, Perl, and Modula 3 all allow the renaming of objects from a foreign module, so they will not clutter the global name space. In addition, Guile offers variables as first-class objects. They can be used for interacting with the module system.
A Guile module can be thought of as a collection of named procedures, variables and macros. More precisely, it is a set of bindings of symbols (names) to Scheme objects.
Within a module, all bindings are visible. Certain bindings can be declared public, in which case they are added to the module's so-called export list; this set of public bindings is called the module's public interface (see Creating Guile Modules).
A client module uses a providing module's bindings by either accessing the providing module's public interface, or by building a custom interface (and then accessing that). In a custom interface, the client module can select which bindings to access and can also algorithmically rename bindings. In contrast, when using the providing module's public interface, the entire export list is available without renaming (see Using Guile Modules).
All Guile modules have a unique module name, for example
(ice-9 popen) or (srfi srfi-11). Module names are lists
of one or more symbols.
When Guile goes to use an interface from a module, for example
(ice-9 popen), Guile first looks to see if it has loaded
(ice-9 popen) for any reason. If the module has not been loaded
yet, Guile searches a load path for a file that might define it,
and loads that file.
The following subsections go into more detail on using, creating, installing, and otherwise manipulating modules and the module system.
To use a Guile module is to access either its public interface or a
custom interface (see General Information about Modules). Both
types of access are handled by the syntactic form use-modules,
which accepts one or more interface specifications and, upon evaluation,
arranges for those interfaces to be available to the current module.
This process may include locating and loading code for a given module if
that code has not yet been loaded, following %load-path
(see Modules and the File System).
An interface specification has one of two forms. The first variation is simply to name the module, in which case its public interface is the one accessed. For example:
(use-modules (ice-9 popen))
Here, the interface specification is (ice-9 popen), and the
result is that the current module now has access to open-pipe,
close-pipe, open-input-pipe, and so on (see Included Guile Modules).
Note in the previous example that if the current module had already
defined open-pipe, that definition would be overwritten by the
definition in (ice-9 popen). For this reason (and others), there
is a second variation of interface specification that not only names a
module to be accessed, but also selects bindings from it and renames
them to suit the current module's needs. For example:
(use-modules ((ice-9 popen)
#:select ((open-pipe . pipe-open) close-pipe)
#:renamer (symbol-prefix-proc 'unixy:)))
Here, the interface specification is more complex than before, and the result is that a custom interface with only two bindings is created and subsequently accessed by the current module. The mapping of old to new names is as follows:
(ice-9 popen) sees: current module sees:
open-pipe unixy:pipe-open
close-pipe unixy:close-pipe
This example also shows how to use the convenience procedure
symbol-prefix-proc.
You can also directly refer to bindings in a module by using the
@ syntax. For example, instead of using the
use-modules statement from above and writing
unixy:pipe-open to refer to the open-pipe binding from
(ice-9 popen), you could also write (@ (ice-9 popen)
open-pipe). Thus an alternative to the complete use-modules
statement would be
(define unixy:pipe-open (@ (ice-9 popen) open-pipe))
(define unixy:close-pipe (@ (ice-9 popen) close-pipe))
There is also @@, which can be used like @, but does
not check whether the variable that is being accessed is actually
exported. Thus, @@ can be thought of as the impolite version
of @ and should only be used as a last resort or for
debugging, for example.
Note that just as with a use-modules statement, any module that
has not yet been loaded will be loaded when referenced by a
@ or @@ form.
You can also use the @ and @@ syntaxes as the target
of a set! when the binding refers to a variable.
Return a procedure that prefixes its arg (a symbol) with prefix-sym.
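For instance, the renamer used in the earlier use-modules example can be tried on its own. The name add-unixy is invented for this sketch.

```scheme
;; Build a renaming procedure and apply it to a few symbols.
(define add-unixy (symbol-prefix-proc 'unixy:))

(define renamed (add-unixy 'open-pipe))
(define renamed-list (map add-unixy '(close-pipe open-input-pipe)))

(write renamed) (newline)        ; prints unixy:open-pipe
(write renamed-list) (newline)   ; prints (unixy:close-pipe unixy:open-input-pipe)
```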
Resolve each interface specification spec into an interface and arrange for these to be accessible by the current module. The return value is unspecified.
spec can be a list of symbols, in which case it names a module whose public interface is found and used.
spec can also be of the form:
(MODULE-NAME [#:select SELECTION] [#:renamer RENAMER])
in which case a custom interface is newly created and used. module-name is a list of symbols, as above; selection is a list of selection-specs; and renamer is a procedure that takes a symbol and returns its new name. A selection-spec is either a symbol or a pair of symbols (ORIG . SEEN), where orig is the name in the used module and seen is the name in the using module. Note that seen is also passed through renamer.
The #:select and #:renamer clauses are optional. If both are omitted, the returned interface has no bindings. If the #:select clause is omitted, renamer operates on the used module's public interface.
In addition to the above, spec can also include a #:version clause, of the form:
#:version VERSION-SPEC
where version-spec is an R6RS-compatible version reference. An error will be signaled in the case in which a module with the same name has already been loaded, if that module specifies a version and that version is not compatible with version-spec. See R6RS Version References, for more on version references.
If the module name is not resolvable, use-modules will signal an error.
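The following sketch builds a custom interface on (srfi srfi-1) rather than (ice-9 popen), so that it can be run without starting any processes. The names reduce-with and the l: prefix are arbitrary choices for the example.

```scheme
;; Select two bindings, rename one of them, and prefix both.
(use-modules ((srfi srfi-1)
              #:select ((fold . reduce-with) delete-duplicates)
              #:renamer (symbol-prefix-proc 'l:)))

;; fold is seen as l:reduce-with: the pair renames it, then the
;; renamer adds the prefix.
(define total (l:reduce-with + 0 '(1 2 3 4)))

;; delete-duplicates is seen as l:delete-duplicates.
(define unique (l:delete-duplicates '(a b a c b)))

(write total) (newline)    ; prints 10
(write unique) (newline)   ; prints (a b c)
```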
Refer to the binding named binding-name in module module-name. The binding must have been exported by the module.
Refer to the binding named binding-name in module module-name. The binding need not have been exported by the module. This syntax is only intended for debugging purposes or as a last resort.
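As a brief sketch, the @ form can be used wherever an expression is expected, without importing anything into the current module. The procedure name sum-list is made up for the example.

```scheme
;; Refer to fold from (srfi srfi-1) directly; the module is loaded on
;; first reference if necessary.
(define (sum-list lst)
  ((@ (srfi srfi-1) fold) + 0 lst))

(define six (sum-list '(1 2 3)))

(write six)   ; prints 6
(newline)
```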
When you want to create your own modules, you have to take the following steps:
- Add a define-module form at the beginning.
- Export the bindings that belong to the public interface, using define-public or export (both documented below).
module-name is a list of one or more symbols.
(define-module (ice-9 popen))
define-module makes this module available to Guile programs under the given module-name.
The options are keyword/value pairs which specify more about the defined module. The recognized options and their meanings are shown in the following table.
#:use-module interface-specification
  Equivalent to a (use-modules interface-specification) (see Using Guile Modules).
#:autoload module symbol-list
  Load module when any of symbol-list are accessed. For example,
  (define-module (my mod)
    #:autoload (srfi srfi-1) (partition delete-duplicates))
  ...
  (if something
      (set! foo (delete-duplicates ...)))
  When a module is autoloaded, all its bindings become available. symbol-list is just those that will first trigger the load.
  An autoload is a good way to put off loading a big module until it's really needed, for instance for faster startup or if it will only be needed in certain circumstances.
  @ can do a similar thing (see Using Guile Modules), but in that case an @ form must be written every time a binding from the module is used.
#:export list
  Export all identifiers in list which must be a list of symbols or pairs of symbols. This is equivalent to (export list) in the module body.
#:re-export list
  Re-export all identifiers in list which must be a list of symbols or pairs of symbols. The symbols in list must be imported by the current module from other modules. This is equivalent to re-export below.
#:replace list
  Export all identifiers in list (a list of symbols or pairs of symbols) and mark them as replacing bindings. In the module user's name space, this will have the effect of replacing any binding with the same name that is not also “replacing”. Normally a replacement results in an “override” warning message; #:replace avoids that.
  In general, a module that exports a binding for which the (guile) module already has a definition should use #:replace instead of #:export. #:replace, in a sense, lets Guile know that the module purposefully replaces a core binding. It is important to note, however, that this binding replacement is confined to the name space of the module user. In other words, the value of the core binding in question remains unchanged for other modules.
  Note that although it is often a good idea for the replaced binding to remain compatible with a binding in (guile), to avoid surprising the user, sometimes the bindings will be incompatible. For example, SRFI-19 exports its own version of current-time (see SRFI-19 Time) which is not compatible with the core current-time function (see Time). Guile assumes that a user importing a module knows what she is doing, and uses #:replace for this binding rather than #:export.
  A #:replace clause is equivalent to (export! list) in the module body.
  The #:duplicates option (see below) provides fine-grained control over duplicate binding handling on the module-user side.
#:version list
  Specify a version for the module in the form of list, a list of zero or more exact, nonnegative integers. The corresponding #:version option in the use-modules form allows callers to restrict the value of this option in various ways.
#:duplicates list
  Tell Guile to handle duplicate bindings for the bindings imported by the current module according to the policy defined by list, a list of symbols. list must contain symbols representing a duplicate binding handling policy chosen among the following:
  check
    Raises an error when a binding is imported from more than one place.
  warn
    Issue a warning when a binding is imported from more than one place and leave the responsibility of actually handling the duplication to the next duplicate binding handler.
  replace
    When a new binding is imported that has the same name as a previously imported binding, then do the following:
    - If the old binding was said to be replacing (via the #:replace option above) and the new binding is not replacing, then keep the old binding.
    - If the old binding was not said to be replacing and the new binding is replacing, then replace the old binding with the new one.
    - If neither the old nor the new binding is replacing, then keep the old one.
  warn-override-core
    Issue a warning when a core binding is being overwritten and actually override the core binding with the new one.
  first
    In case of duplicate bindings, the first imported binding is always the one which is kept.
  last
    In case of duplicate bindings, the last imported binding is always the one which is kept.
  noop
    In case of duplicate bindings, leave the responsibility to the next duplicate handler.
  If list contains more than one symbol, then the duplicate binding handlers which appear first will be used first when resolving a duplicate binding situation. As mentioned above, some resolution policies may explicitly leave the responsibility of handling the duplication to the next handler in list.
  If GOOPS has been loaded before the #:duplicates clause is processed, there are additional strategies available for dealing with generic functions. See Merging Generics, for more information.
  The default duplicate binding resolution policy is given by the default-duplicate-binding-handler procedure, and is
  (replace warn-override-core warn last)
#:pure
  Create a pure module, that is a module which does not contain any of the standard procedure bindings except for the syntax forms. This is useful if you want to create safe modules, that is modules which do not know anything about dangerous procedures.
Add all variables (which must be symbols or pairs of symbols) to the list of exported bindings of the current module. If variable is a pair, its car gives the name of the variable as seen by the current module and its cdr specifies a name for the binding in the current module's public interface.
Add all variables (which must be symbols or pairs of symbols) to the list of re-exported bindings of the current module. Pairs of symbols are handled as in export. Re-exported bindings must be imported by the current module from some other module.
Like export, but marking the exported variables as replacing. Using a module with replacing bindings will cause any existing bindings to be replaced without issuing any warnings. See the discussion of #:replace above.
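Putting these pieces together, here is a sketch of a providing module and a client. The module names (demo shapes) and (demo client), and all bindings in them, are hypothetical. Normally each module would live in its own file; both are placed in one file here only to keep the example self-contained, which works because the second define-module finds (demo shapes) already defined.

```scheme
;; The providing module.
(define-module (demo shapes)
  #:use-module (srfi srfi-1)
  ;; Export total-area under its own name, and internal-version under
  ;; the public name shapes-version.
  #:export (total-area
            (internal-version . shapes-version)))

(define internal-version "0.1")

;; Not exported: visible only inside (demo shapes).
(define (rect-area width height)
  (* width height))

;; Each rectangle is a (width . height) pair.
(define (total-area rects)
  (fold + 0 (map (lambda (r) (rect-area (car r) (cdr r))) rects)))

;; define-public defines and exports in one step.
(define-public (square-area side)
  (rect-area side side))

;; The client module uses the public interface.
(define-module (demo client)
  #:use-module (demo shapes))

(define areas
  (list (total-area '((2 . 3) (4 . 5)))
        (square-area 3)
        shapes-version))

(write areas)   ; prints (26 9 "0.1")
(newline)
```

After this code runs, the current module is (demo client), and rect-area is not visible in it.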
Typical programs only use a small subset of modules installed on a Guile system. In order to keep startup time down, Guile only loads modules when a program uses them, on demand.
When a program evaluates (use-modules (ice-9 popen)), and the
module is not loaded, Guile searches for a conventionally-named file
in the load path.
In this case, loading (ice-9 popen) will eventually cause Guile
to run (primitive-load-path "ice-9/popen").
primitive-load-path will search for a file ice-9/popen in
the %load-path (see Load Paths). For each directory in
%load-path, Guile will try to find the file name, concatenated
with the extensions from %load-extensions. By default, this will
cause Guile to stat ice-9/popen.scm, and then
ice-9/popen. See Load Paths, for more on
primitive-load-path.
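The search can be observed without loading anything, as in this sketch. It assumes that the %search-load-path procedure is available and that the Scheme sources are installed; if they are not, the result is #f.

```scheme
;; The extensions tried for each directory in %load-path.
(write %load-extensions)
(newline)

;; Look for ice-9/popen in %load-path, trying each extension in turn.
;; The result is an absolute file name, or #f if nothing was found.
(define popen-source (%search-load-path "ice-9/popen"))

(write popen-source)
(newline)
```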
If a corresponding compiled .go file is found in the
%load-compiled-path or in the fallback path, and is as fresh as
the source file, it will be loaded instead of the source file. If no
compiled file is found, Guile may try to compile the source file and
cache away the resulting .go file. See Compilation, for more
on compilation.
Once Guile finds a suitable source or compiled file, the file will be loaded. If, after loading the file, the module under consideration is still not defined, Guile will signal an error.
For more information on where and how to install Scheme modules, see Installing Site Packages.
Guile's module system includes support for locating modules based on
a declared version specifier of the same form as the one described in
R6RS (see R6RS Library Form). By using the
#:version keyword in a define-module form, a module may
specify a version as a list of zero or more exact, nonnegative integers.
This version can then be used to locate the module during the module
search process. Client modules and callers of the use-modules
function may specify constraints on the versions of target modules by
providing a version reference, which has one of the following
forms:
(sub-version-reference ...)
(and version-reference ...)
(or version-reference ...)
(not version-reference)
in which sub-version-reference is in turn one of:
(sub-version)
(>= sub-version)
(<= sub-version)
(and sub-version-reference ...)
(or sub-version-reference ...)
(not sub-version-reference)
in which sub-version is an exact, nonnegative integer as above. A version reference matches a declared module version if each element of the version reference matches a corresponding element of the module version, according to the following rules:
- The and sub-form matches a version or version element if every element in the tail of the sub-form matches the specified version or version element.
- The or sub-form matches a version or version element if any element in the tail of the sub-form matches the specified version or version element.
- The not sub-form matches a version or version element if the tail of the sub-form does not match the version or version element.
- The >= sub-form matches a version element if the element is greater than or equal to the sub-version in the tail of the sub-form.
- The <= sub-form matches a version element if the element is less than or equal to the sub-version in the tail of the sub-form.
For example, a module declared as:
(define-module (mylib mymodule) #:version (1 2 0))
would be successfully loaded by any of the following use-modules
expressions:
(use-modules ((mylib mymodule) #:version (1 2 (>= 0))))
(use-modules ((mylib mymodule) #:version (or (1 2 0) (1 2 1))))
(use-modules ((mylib mymodule) #:version ((and (>= 1) (not 2)) 2 0)))
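The matching rules can be made concrete with a small model written in Scheme. This is only an illustrative sketch, not Guile's own implementation: the procedure names are hypothetical, bare integers are treated as sub-versions (as in the examples above), and a version reference is assumed to match when each of its elements matches the corresponding leading element of the declared version.

```scheme
(use-modules (srfi srfi-1))   ; for `every' and `any'

;; Does the sub-version reference REF match the version element ELT?
(define (sub-version-matches? ref elt)
  (cond ((integer? ref) (= ref elt))
        ((not (pair? ref)) #f)
        (else
         (case (car ref)
           ((>=)  (>= elt (cadr ref)))
           ((<=)  (<= elt (cadr ref)))
           ((and) (every (lambda (r) (sub-version-matches? r elt)) (cdr ref)))
           ((or)  (any (lambda (r) (sub-version-matches? r elt)) (cdr ref)))
           ((not) (not (sub-version-matches? (cadr ref) elt)))
           (else #f)))))

;; Does the version reference REF match the declared VERSION (a list
;; of exact, nonnegative integers)?
(define (version-matches? ref version)
  (cond ((null? ref) #t)
        ((memq (car ref) '(and or not))
         (case (car ref)
           ((and) (every (lambda (r) (version-matches? r version)) (cdr ref)))
           ((or)  (any (lambda (r) (version-matches? r version)) (cdr ref)))
           (else  (not (version-matches? (cadr ref) version)))))
        (else
         ;; Element-wise comparison; the reference may not be longer
         ;; than the declared version (an assumption of this sketch).
         (let loop ((refs ref) (elts version))
           (cond ((null? refs) #t)
                 ((null? elts) #f)
                 ((sub-version-matches? (car refs) (car elts))
                  (loop (cdr refs) (cdr elts)))
                 (else #f))))))

;; The three references from the example, checked against (1 2 0).
(display (version-matches? '(1 2 (>= 0)) '(1 2 0)))                  ; #t
(newline)
(display (version-matches? '(or (1 2 0) (1 2 1)) '(1 2 0)))          ; #t
(newline)
(display (version-matches? '((and (>= 1) (not 2)) 2 0) '(1 2 0)))    ; #t
(newline)
```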
In addition to the API described in the previous sections, you also
have the option to create modules using the portable library form
described in R6RS (see R6RS Library Form), and to import
libraries created in this format by other programmers. Guile's R6RS
library implementation takes advantage of the flexibility built into the
module system by expanding the R6RS library form into a corresponding
Guile define-module form that specifies equivalent import and
export requirements and includes the same body expressions. The library
expression:
(library (mylib (1 2))
(import (otherlib (3)))
(export mybinding))
is equivalent to the module definition:
(define-module (mylib)
#:version (1 2)
#:use-module ((otherlib) #:version (3))
#:export (mybinding))
Central to the mechanics of R6RS libraries is the concept of import
and export levels, which control the visibility of bindings at
various phases of a library's lifecycle — macros necessary to
expand forms in the library's body need to be available at expand
time; variables used in the body of a procedure exported by the
library must be available at runtime. R6RS specifies the optional
for sub-form of an import set specification (see below)
as a mechanism by which a library author can indicate that a
particular library import should take place at a particular phase
with respect to the lifecycle of the importing library.
Guile's library implementation uses a technique called
implicit phasing (first described by Abdulaziz Ghuloum and R.
Kent Dybvig), which allows the expander and compiler to automatically
determine the necessary visibility of a binding imported from another
library. As such, the for sub-form described below is ignored by
Guile (but may be required by Schemes in which phasing is explicit).
Defines a new library with the specified name, exports, and imports, and evaluates the specified body expressions in this library's environment.
The library name is a non-empty list of identifiers, optionally ending with a version specification of the form described above (see Creating Guile Modules).
Each export-spec is the name of a variable defined or imported by the library, or must take the form (rename (internal-name external-name) ...), where the identifier internal-name names a variable defined or imported by the library and external-name is the name by which the variable is seen by importing libraries.
Each import-spec must be either an import set (see below) or must be of the form (for import-set import-level ...), where each import-level is one of:
run
expand
(meta level)
where level is an integer. Note that since Guile does not require explicit phase specification, any import-sets found inside of for sub-forms will be “unwrapped” during expansion and processed as if they had been specified directly.
Import sets in turn take one of the following forms:
library-reference
(library library-reference)
(only import-set identifier ...)
(except import-set identifier ...)
(prefix import-set identifier)
(rename import-set (internal-identifier external-identifier) ...)
where library-reference is a non-empty list of identifiers ending with an optional version reference (see R6RS Version References), and the other sub-forms have the following semantics, defined recursively on nested import-sets:
- The library sub-form is used to specify libraries for import whose names begin with the identifier “library.”
- The only sub-form imports only the specified identifiers from the given import-set.
- The except sub-form imports all of the bindings exported by import-set except for those that appear in the specified list of identifiers.
- The prefix sub-form imports all of the bindings exported by import-set, first prefixing them with the specified identifier.
- The rename sub-form imports all of the identifiers exported by import-set. The binding for each internal-identifier among these identifiers is made visible to the importing library as the corresponding external-identifier; all other bindings are imported using the names provided by import-set.
Note that because Guile translates R6RS libraries into module definitions, an import specification may be used to declare a dependency on a native Guile module, although doing so may make your libraries less portable to other Schemes.
Import into the current environment the libraries specified by the given import specifications, where each import-spec takes the same form as in the library form described above.
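As a brief sketch of import sets in use, the following imports one binding from (rnrs lists) with only, and one binding from (rnrs sorting) under a prefix. The prefix r6: is an arbitrary choice made for this example.

```scheme
;; Import just `fold-left' from the R6RS lists library.
(import (only (rnrs lists) fold-left))

;; Import `list-sort' from the R6RS sorting library, visible here
;; under the name `r6:list-sort'.
(import (prefix (only (rnrs sorting) list-sort) r6:))

(define total (fold-left + 0 '(1 2 3)))
(define sorted (r6:list-sort < '(3 1 2)))

(display total)    ; prints 6
(newline)
(display sorted)   ; prints (1 2 3)
(newline)
```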
Each module has its own hash table, sometimes known as an obarray, that maps the names defined in that module to their corresponding variable objects.
A variable is a box-like object that can hold any Scheme value. It is
said to be undefined if its box holds a special Scheme value that
denotes undefined-ness (which is different from all other Scheme values,
including for example #f); otherwise the variable is
defined.
On its own, a variable object is anonymous. A variable is said to be bound when it is associated with a name in some way, usually a symbol in a module obarray. When this happens, the name is said to be bound to the variable, in that module.
(That's the theory, anyway. In practice, defined-ness and bound-ness sometimes get confused, because Lisp and Scheme implementations have often conflated — or deliberately drawn no distinction between — a name that is unbound and a name that is bound to a variable whose value is undefined. We will try to be clear about the difference and explain any confusion where it is unavoidable.)
Variables do not have a read syntax. Most commonly they are created and
bound implicitly by define expressions: a top-level define
expression of the form
(define name value)
creates a variable with initial value value and binds it to the
name name in the current module. But they can also be created
dynamically by calling one of the constructor procedures
make-variable and make-undefined-variable.
Return a variable that is initially unbound.
Return a variable initialized to value init.
Return #t iff var is bound to a value. Throws an error if var is not a variable object.
Dereference var and return its value. var must be a variable object; see make-variable and make-undefined-variable.
Set the value of the variable var to val. var must be a variable object, val can be any value. Return an unspecified value.
Unset the value of the variable var, leaving var unbound.
Return #t iff obj is a variable object, else return #f.
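The procedures above can be exercised on an anonymous variable, without binding it to any name. A short sketch:

```scheme
;; An anonymous variable with no value yet.
(define v (make-undefined-variable))

(define bound-at-first? (variable-bound? v))   ; #f

;; Give it a value, then read the value back.
(variable-set! v 42)
(define value-after-set (variable-ref v))      ; 42

;; Remove the value again.
(variable-unset! v)
(define bound-at-end? (variable-bound? v))     ; #f

;; A variable that starts out with a value.
(define w (make-variable 'hello))

(display (list bound-at-first? value-after-set bound-at-end?
               (variable-ref w) (variable? w) (variable? 'hello)))
(newline)
```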
The previous sections have described a declarative view of the module system. You can also work with it programmatically by accessing and modifying various parts of the Scheme objects that Guile uses to implement the module system.
At any time, there is a current module. This module is the one
where a top-level define and similar syntax will add new
bindings. You can find other module objects with resolve-module,
for example.
These module objects can be used as the second argument to eval.
Return the current module object.
Set the current module to module and return the previous current module.
Call thunk within a dynamic-wind such that the module that is current at invocation time is restored when thunk's dynamic extent is left (see Dynamic Wind).
More precisely, if thunk escapes non-locally, the current module (at the time of escape) is saved, and the original current module (at the time thunk's dynamic extent was last entered) is restored. If thunk's dynamic extent is re-entered, then the current module is saved, and the previously saved inner module is set current again.
Find the module named name and return it. When it has not already been defined and autoload is true, try to auto-load it. When it can't be found that way either, create an empty module if ensure is true, otherwise return
#f. If version is true, ensure that the resulting module is compatible with the given version reference (see R6RS Version References). The name is a list of symbols.
Find the module named name as with resolve-module and return its interface. The interface of a module is also a module object, but it contains only the exported bindings.
Add interface to the front of the use-list of module. Both arguments should be module objects, and interface should very likely be a module returned by
resolve-interface.
Revisit the source file that corresponds to module. Raises an error if no source file is associated with the given module.
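The following sketch puts several of these procedures together. It uses (ice-9 popen) only because that module ships with Guile; any other loadable module would do.

```scheme
;; Remember which module is current before we start.
(define original (current-module))

;; The module object itself, and its public interface.
(define popen-module (resolve-module '(ice-9 popen)))
(define popen-interface (resolve-interface '(ice-9 popen)))

;; A module object can be passed as the second argument to `eval'.
(define product (eval '(* 6 7) (resolve-module '(guile))))

;; Temporarily switch the current module; it is restored on exit.
(define inner-name
  (save-module-excursion
   (lambda ()
     (set-current-module popen-module)
     (module-name (current-module)))))

(display product)      ; prints 42
(newline)
(display inner-name)   ; prints (ice-9 popen)
(newline)
```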
As mentioned in the previous section, modules contain a mapping between identifiers (as symbols) and storage locations (as variables). Guile defines a number of procedures to allow access to this mapping. If you are programming in C, see Accessing Modules from C.
Return the variable bound to name (a symbol) in module, or #f if name is unbound.
Define a new binding between name (a symbol) and var (a variable) in module.
Look up the value bound to name in module. Like module-variable, but also does a variable-ref on the resulting variable, raising an error if name is unbound.
Locally bind name to value in module. If name was already locally bound in module, i.e., defined locally and not by an imported module, the value stored in the existing variable will be updated. Otherwise, a new variable will be added to the module, via
module-add!.
Update the binding of name in module to value, raising an error if name is not already bound in module.
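A short sketch of these accessors working on a scratch module follows. The module name (example scratch) is hypothetical; passing #f as the autoload argument of resolve-module makes it create an empty module rather than search the load path.

```scheme
;; Create (or find) an empty module without trying to auto-load it.
(define scratch (resolve-module '(example scratch) #f))

;; Bind `answer' locally, then update the existing binding.
(module-define! scratch 'answer 41)
(module-set! scratch 'answer 42)

;; Add a binding from an explicit variable object.
(module-add! scratch 'greeting (make-variable "hello"))

;; `module-variable' returns the variable; `module-ref' its value.
(define answer-var (module-variable scratch 'answer))
(define answer (module-ref scratch 'answer))
(define greeting (module-ref scratch 'greeting))

;; Unbound names yield #f from `module-variable'.
(define missing (module-variable scratch 'no-such-name))

(display (list (variable-ref answer-var) answer greeting missing))
(newline)
```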
There are many other reflective procedures available in the default environment. If you find yourself using one of them, please contact the Guile developers so that we can commit to stability for that interface.
The last sections have described how modules are used in Scheme code, which is the recommended way of creating and accessing modules. You can also work with modules from C, but it is more cumbersome.
The following procedures are available.
Call func and make module the current module during the call. The argument data is passed to func. The return value of scm_c_call_with_current_module is the return value of func.
Find the variable bound to the symbol name in the public interface of the module named module_name.
module_name should be a list of symbols, when represented as a Scheme object, or a space-separated string, in the const char * case. See scm_c_define_module below, for more examples.
Signals an error if no module was found with the given name. If name is not bound in the module, just returns #f.
Like
scm_public_variable, but looks in the internals of the module named module_name instead of the public interface. Logically, these procedures should only be called on modules you write.
Like scm_public_variable or scm_private_variable, but if the name is not bound in the module, signals an error. Returns a variable, always.
SCM
my_eval_string (SCM str)
{
  static SCM eval_string_var = SCM_BOOL_F;

  if (scm_is_false (eval_string_var))
    eval_string_var = scm_c_public_lookup ("ice-9 eval-string", "eval-string");

  return scm_call_1 (scm_variable_ref (eval_string_var), str);
}
Like scm_public_lookup or scm_private_lookup, but additionally dereferences the variable. If the variable object is unbound, signals an error. Returns the value bound to name in module.
In addition, there are a number of other lookup-related procedures. We
suggest that you use the scm_public_ and scm_private_
family of procedures instead, if possible.
Return the variable bound to the symbol indicated by name in the current module. If there is no such binding or the symbol is not bound to a variable, signal an error.
Like scm_c_lookup and scm_lookup, but the specified module is used instead of the current one.
Like scm_module_lookup, but if the binding does not exist, just returns #f instead of raising an error.
To define a value, use scm_define:
Bind the symbol indicated by name to a variable in the current module and set that variable to val. When name is already bound to a variable, use that. Else create a new variable.
Like scm_c_define, but the symbol is specified directly.
Like scm_c_define and scm_define, but the specified module is used instead of the current one.
Find the symbol that is bound to variable in module. When no such binding is found, return #f.
Define a new module named name and make it current while init is called, passing it data. Return the module.
The parameter name is a string with the symbols that make up the module name, separated by spaces. For example, ‘"foo bar"’ names the module ‘(foo bar)’.
When there already exists a module named name, it is used unchanged, otherwise, an empty module is created.
Find the module named name and return it. When it has not already been defined, try to auto-load it. When it can't be found that way either, create an empty module. The name is interpreted as for scm_c_define_module.
Add the module named name to the uses list of the current module, as with (use-modules name). The name is interpreted as for scm_c_define_module.
Add the bindings designated by name, ... to the public interface of the current module. The list of names is terminated by
NULL.
Some modules are included in the Guile distribution; here are references to the entries in this manual which describe them in more detail:
readline interactive command line editing (see Readline Support).
receive (see Multiple Values).
syntax-rules macro system (see Syntax Rules).
and-let* (see SRFI-2).
receive (see SRFI-8).
define-record-type (see SRFI-9).
#,() (see SRFI-10).
let-values and let*-values
(see SRFI-11).
case-lambda procedures of variable arity (see SRFI-16).
rec convenient recursive expressions (see SRFI-31)
Aubrey Jaffer, mostly to support his portable Scheme library SLIB, implemented a provide/require mechanism for many Scheme implementations. Library files in SLIB provide a feature, and when user programs require that feature, the library file is loaded in.
For example, the file random.scm in the SLIB package contains the line
(provide 'random)
so to use its procedures, a user would type
(require 'random)
and they would magically become available, but still have the same names! So this method is nice, but not as good as a full-featured module system.
When SLIB is used with Guile, provide and require can be used to access its facilities.
Scheme, as defined in R5RS, does not have a full module system. However it does define the concept of a top-level environment. Such an environment maps identifiers (symbols) to Scheme objects such as procedures and lists (see About Closure). In other words, it implements a set of bindings.
Environments in R5RS can be passed as the second argument to
eval (see Fly Evaluation). Three procedures are defined to
return environments: scheme-report-environment,
null-environment and interaction-environment (see Fly Evaluation).
In addition, in Guile any module can be used as an R5RS environment,
i.e., passed as the second argument to eval.
Note: the following two procedures are available only when the
(ice-9 r5rs) module is loaded:
(use-modules (ice-9 r5rs))
version must be the exact integer `5', corresponding to revision 5 of the Scheme report (the Revised^5 Report on Scheme).
scheme-report-environment returns a specifier for an environment that is empty except for all bindings defined in the report that are either required or both optional and supported by the implementation.
null-environment returns a specifier for an environment that is empty except for the (syntactic) bindings for all syntactic keywords defined in the report that are either required or both optional and supported by the implementation.
Currently Guile does not support values of version for other revisions of the report.
The effect of assigning (through the use of eval) a variable bound in a scheme-report-environment (for example car) is unspecified. Currently the environments specified by scheme-report-environment are not immutable in Guile.
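As a small sketch, the environments described here can be handed to eval in the same way as a module object. The expressions evaluated are arbitrary examples.

```scheme
(use-modules (ice-9 r5rs))

;; Evaluate in the environment of R5RS bindings.
(define in-report-env
  (eval '(* 6 7) (scheme-report-environment 5)))

;; Evaluate in the environment the REPL would use.
(define in-interaction-env
  (eval '(+ 1 2) (interaction-environment)))

;; In Guile, any module also serves as an environment.
(define in-module
  (eval '(list 1 2) (resolve-module '(guile))))

(display (list in-report-env in-interaction-env in-module))
(newline)
```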
The more one hacks in Scheme, the more one realizes that there are actually two computational worlds: one which is warm and alive, that land of parentheses, and one cold and dead, the land of C and its ilk.
But yet we as programmers live in both worlds, and Guile itself is half implemented in C. So it is that Guile's living half pays respect to its dead counterpart, via a spectrum of interfaces to C ranging from dynamic loading of Scheme primitives to dynamic binding of stock C library procedures.
Most modern Unices have something called shared libraries. This ordinarily means that they have the capability to share the executable image of a library between several running programs to save memory and disk space. But generally, shared libraries give a lot of additional flexibility compared to the traditional static libraries. In fact, calling them `dynamic' libraries is as correct as calling them `shared'.
Shared libraries really give you a lot of flexibility in addition to the memory and disk space savings. When you link a program against a shared library, that library is not closely incorporated into the final executable. Instead, the executable of your program only contains enough information to find the needed shared libraries when the program is actually run. Only then, when the program is starting, is the final step of the linking process performed. This means that you need not recompile all programs when you install a new, only slightly modified version of a shared library. The programs will pick up the changes automatically the next time they are run.
Now, when all the necessary machinery is there to perform part of the linking at run-time, why not take the next step and allow the programmer to explicitly take advantage of it from within his program? Of course, many operating systems that support shared libraries do just that, and chances are that Guile will allow you to access this feature from within your Scheme programs. As you might have guessed already, this feature is called dynamic linking.
We titled this section “foreign libraries” because although the name “foreign” doesn't leak into the API, the world of C really is foreign to Scheme – and that estrangement extends to components of foreign libraries as well, as we see in future sections.
Find the shared library denoted by library (a string) and link it into the running Guile application. When everything works out, return a Scheme object suitable for representing the linked object file. Otherwise an error is thrown. How object files are searched is system dependent.
Normally, library is just the name of some shared library file that will be searched for in the places where shared libraries usually reside, such as in /usr/lib and /usr/local/lib.
library should not contain an extension such as .so. The correct file name extension for the host operating system is provided automatically, according to libltdl's rules (see lt_dlopenext).
When library is omitted, a global symbol handle is returned. This handle provides access to the symbols available to the program at run-time, including those exported by the program itself and the shared libraries already loaded.
Return #t if obj is a dynamic library handle, or #f otherwise.
Unlink the indicated object file from the application. The argument dobj must have been obtained by a call to dynamic-link. After dynamic-unlink has been called on dobj, its content is no longer accessible.
(define libgl-obj (dynamic-link "libGL"))
libgl-obj
⇒ #<dynamic-object "libGL">
(dynamic-unlink libgl-obj)
libgl-obj
⇒ #<dynamic-object "libGL" (unlinked)>
As you can see, after calling dynamic-unlink on a dynamically
linked library, it is marked as ‘(unlinked)’ and you are no longer
able to use it with dynamic-call, etc. Whether the library is
really removed from your program is system-dependent and will generally
not happen when some other parts of your program still use it.
When dynamic linking is disabled or not supported on your system, the above functions throw errors, but they are still available.
The most natural thing to do with a dynamic library is to grovel around
in it for a function pointer: a foreign function.
dynamic-func exists for that purpose.
Return a “handle” for the func name in the shared object referred to by dobj. The handle can be passed to dynamic-call to actually call the function.
Regardless of whether your C compiler prepends an underscore ‘_’ to the global names in a program, you should not include this underscore in name since it will be added automatically when necessary.
Guile has static support for calling functions with no arguments,
dynamic-call.
Call the C function indicated by func and dobj. The function is passed no arguments and its return value is ignored. When func is something returned by dynamic-func, call that function and ignore dobj. When func is a string, look it up in dobj; this is equivalent to
(dynamic-call (dynamic-func func dobj) #f)
Interrupts are deferred while the C function is executing (with SCM_DEFER_INTS/SCM_ALLOW_INTS).
dynamic-call is not very powerful. It is mostly intended to be
used for calling specially written initialization functions that will
then add new primitives to Guile. For example, we do not expect that you
will dynamically link libX11 with dynamic-link and then
construct a beautiful graphical user interface just by using
dynamic-call. Instead, the usual way would be to write a special
Guile-to-X11 glue library that has intimate knowledge about both Guile
and X11 and does whatever is necessary to make them inter-operate
smoothly. This glue library could then be dynamically linked into a
vanilla Guile interpreter and activated by calling its initialization
function. That function would add all the new types and primitives to
the Guile interpreter that it has to offer.
(There is actually another, better option: simply to create a libX11 wrapper in Scheme via the dynamic FFI. See Dynamic FFI, for more information.)
Given some set of C extensions to Guile, the next logical step is to integrate these glue libraries into the module system of Guile so that you can load new primitives into a running system just as you can load new Scheme code.
Load and initialize the extension designated by LIB and INIT. When there is no pre-registered function for LIB/INIT, this is equivalent to
(dynamic-call INIT (dynamic-link LIB))
When there is a pre-registered function, that function is called instead.
Normally, there is no pre-registered function. This option exists only for situations where dynamic linking is unavailable or unwanted. In that case, you would statically link your program with the desired library, and register its init function right after Guile has been initialized.
As for dynamic-link, lib should not contain any suffix such as .so (see dynamic-link). It should also not contain any directory components. Libraries that implement Guile Extensions should be put into the normal locations for shared libraries. We recommend to use the naming convention libguile-bla-blum for an extension related to a module (bla blum).
The normal way for an extension to be used is to write a small Scheme file that defines a module, and to load the extension into this module. When the module is auto-loaded, the extension is loaded as well. For example,
(define-module (bla blum))
(load-extension "libguile-bla-blum" "bla_init_blum")
The most interesting application of dynamically linked libraries is probably to use them for providing compiled code modules to Scheme programs. As much fun as programming in Scheme is, every now and then comes the need to write some low-level C stuff to make Scheme even more fun.
Not only can you put these new primitives into their own module (see the previous section), you can even put them into a shared library that is only then linked to your running Guile image when it is actually needed.
An example will hopefully make everything clear. Suppose we want to
make the Bessel functions of the C library available to Scheme in the
module ‘(math bessel)’. First we need to write the appropriate
glue code to convert the arguments and return values of the functions
from Scheme to C and back. Additionally, we need a function that will
add them to the set of Guile primitives. Because this is just an
example, we will only implement this for the j0 function.
#include <math.h>
#include <libguile.h>
SCM
j0_wrapper (SCM x)
{
  return scm_from_double (j0 (scm_to_double (x)));
}
void
init_math_bessel ()
{
scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
}
We can already try to bring this into action by manually calling the low
level functions for performing dynamic linking. The C source file needs
to be compiled into a shared library. Here is how to do it on
GNU/Linux; please refer to the libtool documentation for how to
create dynamically linkable libraries portably.
gcc -shared -o libbessel.so -fPIC bessel.c
Now fire up Guile:
(define bessel-lib (dynamic-link "./libbessel.so"))
(dynamic-call "init_math_bessel" bessel-lib)
(j0 2)
⇒ 0.223890779141236
The filename ./libbessel.so should be pointing to the shared
library produced with the gcc command above, of course. The
second line of the Guile interaction will call the
init_math_bessel function which in turn will register the C
function j0_wrapper with the Guile interpreter under the name
j0. This function becomes immediately available and we can call
it from Scheme.
Fun, isn't it? But we are only half way there. This is what
apropos has to say about j0:
(apropos "j0")
-| (guile-user): j0 #<primitive-procedure j0>
As you can see, j0 ended up in (guile-user), the module that was
current at the REPL when init_math_bessel ran, rather than in a
module of its own. In general, a primitive is put into whatever
module is the current module at the time scm_c_define_gsubr is called.
A compiled module should have a specially named module init
function. Guile knows about this special name and will call that
function automatically after having linked in the shared library. For
our example, we replace init_math_bessel with the following code in
bessel.c:
void
init_math_bessel (void *unused)
{
scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
scm_c_export ("j0", NULL);
}
void
scm_init_math_bessel_module ()
{
scm_c_define_module ("math bessel", init_math_bessel, NULL);
}
The general pattern for the name of a module init function is: ‘scm_init_’, followed by the name of the module where the individual hierarchical components are concatenated with underscores, followed by ‘_module’.
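The naming pattern can be written down as a small Scheme helper. The procedure name below is hypothetical and is not part of Guile; the sketch also assumes that every component of the module name is already a valid C identifier.

```scheme
;; Build the conventional init function name for MODULE-NAME, a list
;; of symbols such as (math bessel).
(define (module-init-function-name module-name)
  (string-append "scm_init_"
                 (string-join (map symbol->string module-name) "_")
                 "_module"))

(display (module-init-function-name '(math bessel)))
(newline)
;; prints scm_init_math_bessel_module
```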
After libbessel.so has been rebuilt, we need to place the shared library into the right place.
Once the module has been correctly installed, it should be possible to use it like this:
guile> (load-extension "./libbessel.so" "scm_init_math_bessel_module")
guile> (use-modules (math bessel))
guile> (j0 2)
0.223890779141236
guile> (apropos "j0")
-| (math bessel): j0 #<primitive-procedure j0>
That's it!
The new primitives that you add to Guile with scm_c_define_gsubr
(see Primitive Procedures) or with any of the other mechanisms are
placed into the module that is current when the
scm_c_define_gsubr is executed. Extensions loaded from the REPL,
for example, will be placed into the (guile-user) module, if the
REPL module was not changed.
To define C primitives within a specific module, the simplest way is:
(define-module (foo bar))
(load-extension "foobar-c-code" "foo_bar_init")
When loaded with (use-modules (foo bar)), the
load-extension call looks for the foobar-c-code.so (etc)
object file in Guile's extensiondir, which is usually a
subdirectory of the libdir. For example, if your libdir is
/usr/lib, the extensiondir for the Guile 2.0.x
series will be /usr/lib/guile/2.0/.
The extension path includes the major and minor version of Guile (the “effective version”), because Guile guarantees compatibility within a given effective version. This allows you to install different versions of the same extension for different versions of Guile.
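The effective version and the directories chosen when Guile was built can be queried at run time. The sketch below prints them; the values depend on the installation, so none are assumed here, and a key missing from %guile-build-info simply prints as #f.

```scheme
;; Major and minor version, e.g. "2.0" for the 2.0.x series.
(define ev (effective-version))
(display ev)
(newline)

;; Directories recorded when this Guile was configured.
(display (assq-ref %guile-build-info 'libdir))
(newline)
(display (assq-ref %guile-build-info 'extensiondir))
(newline)
```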
If the extension is not found in the extensiondir, Guile will
also search the standard system locations, such as /usr/lib or
/usr/local/lib. It is preferable, however, to keep your extension
out of the system library path, to prevent unintended interference with
other dynamically-linked C libraries.
If someone installs your module to a non-standard location then the object file won't be found. You can address this by inserting the install location in the foo/bar.scm file. This is convenient for the user and also guarantees the intended object is read, even if stray older or newer versions are in the loader's path.
The usual way to specify an install location is with a prefix
at the configure stage, for instance ‘./configure prefix=/opt’
results in library files such as /opt/lib/foobar-c-code.so.
When using Autoconf (see Introduction), the library location is in a libdir
variable. Its value is intended to be expanded by make, and
can be substituted into a source file like foo.scm.in
(define-module (foo bar))
(load-extension "XXextensiondirXX/foobar-c-code" "foo_bar_init")
with the following in a Makefile, using sed (see Introduction A Stream Editor),
foo.scm: foo.scm.in
sed 's|XXextensiondirXX|$(libdir)/guile/2.0|' <foo.scm.in >foo.scm
The actual pattern XXextensiondirXX is arbitrary, it's only something
which doesn't otherwise occur. If several modules need the value, it
can be easier to create one foo/config.scm with a define of the
extensiondir location, and use that as required.
(define-module (foo config))
(define-public foo-config-extensiondir "XXextensiondirXX")
Such a file might have other locations too, for instance a data
directory for auxiliary files, or localedir if the module has
its own gettext message catalogue
(see Internationalization).
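As a rough sketch of how a module might consume such a configuration value: the directory string below stands in for whatever sed would substitute, and extension-file is a hypothetical helper, not part of Guile.

```scheme
;; Stand-in for the value that sed substitutes into foo/config.scm
;; (an assumption for illustration only).
(define foo-config-extensiondir "/opt/lib/guile/2.0")

;; Hypothetical helper: join the configured directory and the
;; extension's base name.
(define (extension-file name)
  (string-append foo-config-extensiondir "/" name))

(extension-file "foobar-c-code")
;; ⇒ "/opt/lib/guile/2.0/foobar-c-code"

;; A real foo/bar.scm would then call:
;; (load-extension (extension-file "foobar-c-code") "foo_bar_init")
```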
Note that all of the above requires the Scheme code to be
found in %load-path (see Load Paths). Presently it's left up
to the system administrator or each user to augment that path when
installing Guile modules in non-default locations. But having reached
the Scheme code, that code should take care of hitting any of its own
private files etc.
The previous sections have shown how Guile can be extended at runtime by loading compiled C extensions. This approach is all well and good, but wouldn't it be nice if we didn't have to write any C at all? This section takes up the problem of accessing C values from Scheme, and the next discusses C functions.
The first impedance mismatch that one sees between C and Scheme is that in C, the storage locations (variables) are typed, but in Scheme types are associated with values, not variables. See Values and Variables.
So when describing a C function or a C structure so that it can be accessed from Scheme, the data types of the parameters or fields must be passed explicitly.
These “C type values” may be constructed using the constants and
procedures from the (system foreign) module, which may be loaded
like this:
(use-modules (system foreign))
(system foreign) exports a number of values expressing the basic
C types:
These values represent the C numeric types of the specified sizes and signednesses.
In addition there are some convenience bindings for indicating types of platform-dependent size:
Values exported by the (system foreign) module, representing C numeric types. For example, long may be equal? to int64 on a 64-bit platform.
The void type. It can be used as the first argument to pointer->procedure to wrap a C function that returns nothing.
In addition, the symbol * is used by convention to denote pointer
types. Procedures detailed in the following sections, such as
pointer->procedure, accept it as a type descriptor.
Pointers to variables in the current address space may be looked up
dynamically using dynamic-pointer.
Return a “wrapped pointer” for the symbol name in the shared object referred to by dobj. The returned pointer points to a C object.
Regardless of whether your C compiler prepends an underscore ‘_’ to the global names in a program, you should not include this underscore in name since it will be added automatically when necessary.
For example, currently Guile has a variable, scm_numptob, as part
of its API. It is declared as a C long. So, to create a handle
pointing to that foreign value, we do:
(use-modules (system foreign))
(define numptob (dynamic-pointer "scm_numptob" (dynamic-link)))
numptob
⇒ #<pointer 0x7fb35b1b4688>
(The next section discusses ways to dereference pointers.)
A value returned by dynamic-pointer is a Scheme wrapper for a C
pointer.
Return the numerical value of pointer.
(pointer-address numptob) ⇒ 139984413364296 ; YMMV
Return a foreign pointer object pointing to address. If finalizer is passed, it should be a pointer to a one-argument C function that will be called when the pointer object becomes unreachable.
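A small sketch of these primitives, using an arbitrary address (#x1000) chosen purely for illustration; the pointer is never dereferenced here.

```scheme
(use-modules (system foreign))

;; Wrap an arbitrary address. This only builds the Scheme wrapper;
;; nothing is read from or written to that address.
(define p (make-pointer #x1000))

(pointer? p)                   ; ⇒ #t
(pointer-address p)            ; ⇒ 4096
(null-pointer? p)              ; ⇒ #f
(null-pointer? %null-pointer)  ; ⇒ #t
```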
For the purpose of passing SCM values directly to foreign functions, and allowing them to return SCM values, Guile also supports some unsafe casting operators.
Return a foreign pointer object with the object-address of scm.
Unsafely cast pointer to a Scheme object. Cross your fingers!
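The two casts are expected to be inverses of each other as long as the object stays alive. The following sketch keeps obj reachable through a top-level binding, which is the assumption that makes the round trip safe.

```scheme
(use-modules (system foreign))

;; Keep OBJ reachable for the whole round trip.
(define obj (list 1 2 3))

;; Cast to a raw pointer and back again.
(define ptr (scm->pointer obj))
(eq? obj (pointer->scm ptr))   ; ⇒ #t
```

Casting a pointer whose object has already been collected, or one that never referred to a Scheme object, is undefined behaviour.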
Wrapped pointers are untyped, so they are essentially equivalent to C
void pointers. As in C, the memory region pointed to by a
pointer can be accessed at the byte level. This is achieved using
bytevectors (see Bytevectors). The (rnrs bytevector)
module contains procedures that can be used to convert byte sequences to
Scheme objects such as strings, floating point numbers, or integers.
Return a bytevector aliasing the len bytes pointed to by pointer.
The user may specify an alternate default interpretation for the memory by passing the uvec_type argument, to indicate that the memory is an array of elements of that type. uvec_type should be something that uniform-vector-element-type would return, like f32 or s16.
When offset is passed, it specifies the offset in bytes relative to pointer of the memory region aliased by the returned bytevector.
Mutating the returned bytevector mutates the memory pointed to by pointer, so buckle your seatbelts.
Return a pointer aliasing the memory pointed to by bv, or offset bytes after bv when offset is passed.
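To illustrate the aliasing, here is a minimal sketch that wraps a bytevector in a pointer, views the same memory through a second bytevector, and mutates it. The variable names are illustrative only.

```scheme
(use-modules (system foreign) (rnrs bytevectors))

(define bv (u8-list->bytevector '(1 2 3 4)))
(define ptr (bytevector->pointer bv))

;; ALIAS shares storage with BV; no bytes are copied.
(define alias (pointer->bytevector ptr 4))
(bytevector-u8-set! alias 0 42)
(bytevector->u8-list bv)       ; ⇒ (42 2 3 4)

;; The optional offset selects a sub-region: 2 bytes starting 2 bytes in.
(bytevector->u8-list (pointer->bytevector ptr 2 2))   ; ⇒ (3 4)
```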
In addition to these primitives, convenience procedures are available:
Assuming pointer points to a memory region that holds a pointer, return this pointer.
Return a foreign pointer to a nul-terminated copy of string in the given encoding, defaulting to the current locale encoding. The C string is freed when the returned foreign pointer becomes unreachable.
This is the Scheme equivalent of scm_to_stringn.
Return the string representing the C string pointed to by pointer. If length is omitted or -1, the string is assumed to be nul-terminated. Otherwise length is the number of bytes in memory pointed to by pointer. The C string is assumed to be in the given encoding, defaulting to the current locale encoding.
This is the Scheme equivalent of scm_from_stringn.
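A short sketch of the string conversions, restricted to ASCII text so that the result does not depend on the locale encoding:

```scheme
(use-modules (system foreign))

;; Keep the pointer in a variable: the C copy is freed once the
;; pointer object becomes unreachable.
(define c-str (string->pointer "hello"))

(pointer->string c-str)     ; ⇒ "hello"  (reads up to the nul byte)
(pointer->string c-str 3)   ; ⇒ "hel"    (reads exactly 3 bytes)
```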
Most object-oriented C libraries use pointers to specific data
structures to identify objects. It is useful in such cases to reify the
different pointer types as disjoint Scheme types. The
define-wrapped-pointer-type macro simplifies this.
Define helper procedures to wrap pointer objects into Scheme objects with a disjoint type. Specifically, this macro defines:
- pred, a predicate for the new Scheme type;
- wrap, a procedure that takes a pointer object and returns an object that satisfies pred;
- unwrap, which does the reverse.
wrap preserves pointer identity, for two pointer objects p1 and p2 that are equal?, (eq? (wrap p1) (wrap p2)) ⇒ #t.
Finally, print should name a user-defined procedure to print such objects. The procedure is passed the wrapped object and a port to write to.
For example, assume we are wrapping a C library that defines a type, bottle_t, and functions that can be passed bottle_t * pointers to manipulate them. We could write:
(define-wrapped-pointer-type bottle
  bottle?
  wrap-bottle unwrap-bottle
  (lambda (b p)
    (format p "#<bottle of ~a ~x>"
            (bottle-contents b)
            (pointer-address (unwrap-bottle b)))))
(define grab-bottle
  ;; Wrapper for `bottle_t *grab (void)'.
  (let ((grab (pointer->procedure '*
                                  (dynamic-func "grab_bottle" libbottle)
                                  '())))
    (lambda ()
      "Return a new bottle."
      (wrap-bottle (grab)))))
(define bottle-contents
  ;; Wrapper for `const char *bottle_contents (bottle_t *)'.
  (let ((contents (pointer->procedure '*
                                      (dynamic-func "bottle_contents"
                                                    libbottle)
                                      '(*))))
    (lambda (b)
      "Return the contents of B."
      (pointer->string (contents (unwrap-bottle b))))))
(write (grab-bottle))
⇒ #<bottle of Château Haut-Brion 803d36>
In this example, grab-bottle is guaranteed to return a genuine bottle object satisfying bottle?. Likewise, bottle-contents errors out when its argument is not a genuine bottle object.
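The bottle example needs a C library to run. As a self-contained sketch of the same macro, the following wraps a pointer made from an arbitrary address; the handle type and the address #x2000 are made up for illustration, and the printer uses only the ~a directive.

```scheme
(use-modules (system foreign))

;; A hypothetical disjoint type for opaque handles.
(define-wrapped-pointer-type handle
  handle?
  wrap-handle unwrap-handle
  (lambda (h port)
    (format port "#<handle ~a>"
            (number->string (pointer-address (unwrap-handle h)) 16))))

(define raw (make-pointer #x2000))
(define h (wrap-handle raw))

(handle? h)                           ; ⇒ #t
(handle? raw)                         ; ⇒ #f, a bare pointer is a different type
(pointer-address (unwrap-handle h))   ; ⇒ 8192
```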
Going back to the scm_numptob example above, here is how we can
read its value as a C long integer:
(use-modules (rnrs bytevectors))
(bytevector-uint-ref (pointer->bytevector numptob (sizeof long))
0 (native-endianness)
(sizeof long))
⇒ 8
If we wanted to corrupt Guile's internal state, we could set
scm_numptob to another value; but we shouldn't, because that
variable is not meant to be set. Indeed this point applies more widely:
the C API is a dangerous place to be. Not only might setting a value
crash your program, simply accessing the data pointed to by a dangling
pointer or similar can prove equally disastrous.
Finally, one last note on foreign values before moving on to actually calling foreign functions. Sometimes you need to deal with C structs, which requires interpreting each element of the struct according to its type, offset, and alignment. Guile has some primitives to support this.
Return the size of type, in bytes.
type should be a valid C type, like int. Alternately type may be the symbol *, in which case the size of a pointer is returned. type may also be a list of types, in which case the size of a struct with ABI-conventional packing is returned.
Return the alignment of type, in bytes.
type should be a valid C type, like int. Alternately type may be the symbol *, in which case the alignment of a pointer is returned. type may also be a list of types, in which case the alignment of a struct with ABI-conventional packing is returned.
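As an illustration, sizeof and alignof can be queried for scalar types, for pointers, and for a struct layout. The struct sizes noted in the comments are what one would typically see on x86-64 and are not guaranteed on other ABIs.

```scheme
(use-modules (system foreign))

(sizeof uint8)    ; ⇒ 1
(sizeof int64)    ; ⇒ 8
(sizeof '*)       ; size of a pointer, e.g. 8 on a 64-bit platform

;; Layout of struct { int64_t a; uint8_t b; }.
(define s (list int64 uint8))
(sizeof s)        ; typically 16: 9 bytes of data plus tail padding
(alignof s)       ; typically 8: the strictest member alignment
```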
Guile also provides some convenience methods to pack and unpack foreign pointers wrapping C structs.
Create a foreign pointer to a C struct containing vals with types types.
vals and types should be lists of the same length.
Parse a foreign pointer to a C struct, returning a list of values.
types should be a list of C types.
For example, to create and parse the equivalent of a struct {
int64_t a; uint8_t b; }:
(parse-c-struct (make-c-struct (list int64 uint8)
(list 300 43))
(list int64 uint8))
⇒ (300 43)
As yet, Guile only has convenience routines to support
conventionally-packed structs. But given the bytevector->pointer
and pointer->bytevector routines, one can create and parse
tightly packed structs and unions by hand. See the code for
(system foreign) for details.
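One possible way to lay out a tightly packed struct by hand is to write each field at an explicit byte offset in a bytevector. The sketch below assumes a 9-byte packed layout of an int64_t followed by a uint8_t; the offsets are chosen by hand, not computed by Guile.

```scheme
(use-modules (system foreign) (rnrs bytevectors))

;; Packed layout (no padding): int64_t a at offset 0, uint8_t b at offset 8.
(define packed (make-bytevector 9 0))
(bytevector-s64-native-set! packed 0 300)
(bytevector-u8-set! packed 8 43)

;; A pointer suitable for passing to a foreign function.
(define ptr (bytevector->pointer packed))

;; Parsing goes the other way: view the memory and read each field.
(define view (pointer->bytevector ptr 9))
(list (bytevector-s64-native-ref view 0)
      (bytevector-u8-ref view 8))          ; ⇒ (300 43)
```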
Of course, the land of C is not all nouns and no verbs: there are functions too, and Guile allows you to call them.
Make a foreign function.
Given the foreign void pointer func_ptr, its argument and return types arg_types and return_type, return a procedure that will pass arguments to the foreign function and return appropriate values.
arg_types should be a list of foreign types. return_type should be a foreign type. See Foreign Types, for more information on foreign types.
Here is a better definition of (math bessel):
(define-module (math bessel)
#:use-module (system foreign)
#:export (j0))
(define libm (dynamic-link "libm"))
(define j0
(pointer->procedure double
(dynamic-func "j0" libm)
(list double)))
That's it! No C at all.
Numeric arguments and return values from foreign functions are
represented as Scheme values. For example, j0 in the above
example takes a Scheme number as its argument, and returns a Scheme
number.
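The same pattern works for integer types. As a sketch, the following wraps the C library's abs, assuming the C library's symbols are visible through the global handle returned by (dynamic-link), as is usual on GNU/Linux. The name c-abs is chosen here to avoid shadowing Scheme's own abs.

```scheme
(use-modules (system foreign))

;; int abs (int);
(define c-abs
  (pointer->procedure int
                      (dynamic-func "abs" (dynamic-link))
                      (list int)))

(c-abs -7)   ; ⇒ 7
```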
Pointers may be passed to and returned from foreign functions as well.
In that case the type of the argument or return value should be the
symbol *, indicating a pointer. For example, the following
code makes memcpy available to Scheme:
(define memcpy
(let ((this (dynamic-link)))
(pointer->procedure '*
(dynamic-func "memcpy" this)
(list '* '* size_t))))
To invoke memcpy, one must pass it foreign pointers:
(use-modules (rnrs bytevectors))
(define src-bits
(u8-list->bytevector '(0 1 2 3 4 5 6 7)))
(define src
(bytevector->pointer src-bits))
(define dest
(bytevector->pointer (make-bytevector 16 0)))
(memcpy dest src (bytevector-length src-bits))
(bytevector->u8-list (pointer->bytevector dest 16))
⇒ (0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0)
One may also pass structs as values, passing structs as foreign pointers. See Foreign Structs, for more information on how to express struct types and struct values.
“Out” arguments are passed as foreign pointers. The memory pointed to by the foreign pointer is mutated in place.
;; struct timeval {
;; time_t tv_sec; /* seconds */
;; suseconds_t tv_usec; /* microseconds */
;; };
;; assuming fields are of type "long"
(define gettimeofday
(let ((f (pointer->procedure
int
(dynamic-func "gettimeofday" (dynamic-link))
(list '* '*)))
(tv-type (list long long)))
(lambda ()
(let* ((timeval (make-c-struct tv-type (list 0 0)))
(ret (f timeval %null-pointer)))
(if (zero? ret)
(apply values (parse-c-struct timeval tv-type))
(error "gettimeofday returned an error" ret))))))
(gettimeofday)
⇒ 1270587589
⇒ 499553
As you can see, this interface to foreign functions is at a very low, somewhat dangerous level.
The FFI can also work in the opposite direction: making Scheme procedures callable from C. This makes it possible to use Scheme procedures as “callbacks” expected by C functions.
Return a pointer to a C function of type return-type taking arguments of types arg-types (a list) and behaving as a proxy to procedure proc. Thus proc's arity, supported argument types, and return type should match return-type and arg-types.
As an example, here's how the C library's qsort array sorting
function can be made accessible to Scheme (see qsort):
(define qsort!
(let ((qsort (pointer->procedure void
(dynamic-func "qsort"
(dynamic-link))
(list '* size_t size_t '*))))
(lambda (bv compare)
;; Sort bytevector BV in-place according to comparison
;; procedure COMPARE.
(let ((ptr (procedure->pointer int
(lambda (x y)
;; X and Y are pointers so,
;; for convenience, dereference
;; them before calling COMPARE.
(compare (dereference-uint8* x)
(dereference-uint8* y)))
(list '* '*))))
(qsort (bytevector->pointer bv)
(bytevector-length bv) 1 ;; we're sorting bytes
ptr)))))
(define (dereference-uint8* ptr)
;; Helper function: dereference the byte pointed to by PTR.
(let ((b (pointer->bytevector ptr 1)))
(bytevector-u8-ref b 0)))
(define bv
;; An unsorted array of bytes.
(u8-list->bytevector '(7 1 127 3 5 4 77 2 9 0)))
;; Sort BV.
(qsort! bv (lambda (x y) (- x y)))
;; Let's see what the sorted array looks like:
(bytevector->u8-list bv)
⇒ (0 1 2 3 4 5 7 9 77 127)
And voilà!
Note that procedure->pointer is not supported (and not defined)
on a few exotic architectures. Thus, user code may need to check
(defined? 'procedure->pointer). Nevertheless, it is available on
many architectures, including (as of libffi 3.0.9) x86, ia64, SPARC,
PowerPC, ARM, and MIPS, to name a few.
Arbiters are synchronization objects that threads can use to control access to a shared resource. An arbiter can be locked to indicate a resource is in use, and unlocked when done.
An arbiter is like a light-weight mutex (see Mutexes and Condition Variables). It uses less memory and may be faster, but there's no way for a thread to block waiting on an arbiter; it can only test and get the status returned.
Return an object of type arbiter and name name. Its state is initially unlocked. Arbiters are a way to achieve process synchronization.
If arb is unlocked, then lock it and return #t. If arb is already locked, then do nothing and return #f.
If arb is locked, then unlock it and return #t. If arb is already unlocked, then do nothing and return #f.
Typical usage is for the thread which locked an arbiter to later release it, but that's not required; any thread can release it.
Asyncs are a means of deferring the execution of Scheme code until it is safe to do so.
Guile provides two kinds of asyncs that share the basic concept but are otherwise quite different: system asyncs and user asyncs. System asyncs are integrated into the core of Guile and are executed automatically when the system is in a state to allow the execution of Scheme code. For example, it is not possible to execute Scheme code in a POSIX signal handler, but such a signal handler can queue a system async to be executed in the near future, when it is safe to do so.
System asyncs can also be queued for threads other than the current one. This way, you can cause threads to asynchronously execute arbitrary code.
User asyncs offer a convenient means of queuing procedures for future execution and triggering this execution. They will not be executed automatically.
To cause the future asynchronous execution of a procedure in a given
thread, use system-async-mark.
Automatic invocation of system asyncs can be temporarily disabled by
calling call-with-blocked-asyncs. This function works by
temporarily increasing the async blocking level of the current
thread while a given procedure is running. The blocking level starts
out at zero, and whenever a safe point is reached, a blocking level
greater than zero will prevent the execution of queued asyncs.
Analogously, the procedure call-with-unblocked-asyncs will
temporarily decrease the blocking level of the current thread. You
can use it when you want to disable asyncs by default and only allow
them temporarily.
In addition to the C versions of call-with-blocked-asyncs and
call-with-unblocked-asyncs, C code can use
scm_dynwind_block_asyncs and scm_dynwind_unblock_asyncs
inside a dynamic context (see Dynamic Wind) to block or
unblock system asyncs temporarily.
Mark proc (a procedure with zero arguments) for future execution in thread. When proc has already been marked for thread but has not been executed yet, this call has no effect. When thread is omitted, the thread that called system-async-mark is used.
This procedure is not safe to be called from signal handlers. Use scm_sigaction or scm_sigaction_for_thread to install signal handlers.
Call proc and block the execution of system asyncs by one level for the current thread while it is running. Return the value returned by proc. For the first two variants, call proc with no arguments; for the third, call it with data.
The same but with a C function proc instead of a Scheme thunk.
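A minimal sketch of blocking in action: an async queued while asyncs are blocked cannot run until the blocking level drops back to zero, so the body observes only its own effect. The events list and note! helper are illustrative names.

```scheme
(define events '())
(define (note! tag) (set! events (cons tag events)))

(define seen-while-blocked
  (call-with-blocked-asyncs
   (lambda ()
     ;; Queue an async for the current thread; it stays pending
     ;; for as long as asyncs are blocked.
     (system-async-mark (lambda () (note! 'async)))
     (note! 'body)
     events)))

seen-while-blocked   ; ⇒ (body)
```

Once call-with-blocked-asyncs returns, the pending async becomes eligible to run at the next safe point.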
Call proc and unblock the execution of system asyncs by one level for the current thread while it is running. Return the value returned by proc. For the first two variants, call proc with no arguments; for the third, call it with data.
The same but with a C function proc instead of a Scheme thunk.
During the current dynwind context, increase the blocking of asyncs by one level. This function must be used inside a pair of calls to scm_dynwind_begin and scm_dynwind_end (see Dynamic Wind).
During the current dynwind context, decrease the blocking of asyncs by one level. This function must be used inside a pair of calls to scm_dynwind_begin and scm_dynwind_end (see Dynamic Wind).
A user async is a pair of a thunk (a parameterless procedure) and a
mark. Setting the mark on a user async will cause the thunk to be
executed when the user async is passed to run-asyncs. Setting
the mark more than once is satisfied by one execution of the thunk.
User asyncs are created with async. They are marked with
async-mark.
Create a new user async for the procedure thunk.
Mark the user async a for future execution.
Execute all thunks from the marked asyncs of the list list_of_a.
Guile supports POSIX threads, unless it was configured with
--without-threads or the host lacks POSIX thread support. When
thread support is available, the threads feature is provided
(see provided?).
The procedures below manipulate Guile threads, which are wrappers around the system's POSIX threads. For application-level parallelism, using higher-level constructs, such as futures, is recommended (see Futures).
Return the thread that called this function.
Call thunk in a new thread and with a new dynamic state, returning the new thread. The procedure thunk is called via with-continuation-barrier.
When handler is specified, then thunk is called from within a catch with tag #t that has handler as its handler. This catch is established inside the continuation barrier.
Once thunk or handler returns, the return value is made the exit value of the thread and the thread is terminated.
Call body in a new thread, passing it body_data, returning the new thread. The function body is called via scm_c_with_continuation_barrier.
When handler is non-NULL, body is called via scm_internal_catch with tag SCM_BOOL_T that has handler and handler_data as the handler and its data. This catch is established inside the continuation barrier.
Once body or handler returns, the return value is made the exit value of the thread and the thread is terminated.
Return #t iff obj is a thread; otherwise, return #f.
Wait for thread to terminate and return its exit value. Threads that have not been created with call-with-new-thread or scm_spawn_thread have an exit value of #f. When timeout is given, it specifies a point in time where the waiting should be aborted. It can be either an integer as returned by current-time or a pair as returned by gettimeofday. When the waiting is aborted, timeoutval is returned (if it is specified; #f is returned otherwise).
Return #t iff thread has exited.
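For instance, a thread's exit value can be collected with join-thread. This is a minimal sketch; loading (ice-9 threads) is harmless here and keeps the example portable to builds where the thread procedures live in that module.

```scheme
(use-modules (ice-9 threads))

;; The thunk's return value becomes the thread's exit value.
(define worker (call-with-new-thread (lambda () (* 6 7))))

(thread? worker)          ; ⇒ #t
(join-thread worker)      ; ⇒ 42
(thread-exited? worker)   ; ⇒ #t, since join-thread has returned
```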
If one or more threads are waiting to execute, calling yield forces an immediate context switch to one of them. Otherwise, yield has no effect.
Asynchronously notify thread to exit. Immediately after receiving this notification, thread will call its cleanup handler (if one has been set) and then terminate, aborting any evaluation that is in progress.
Because Guile threads are isomorphic with POSIX threads, thread will not receive its cancellation signal until it reaches a cancellation point. See your operating system's POSIX threading documentation for more information on cancellation points; note that in Guile, unlike native POSIX threads, a thread can receive a cancellation notification while attempting to lock a mutex.
Set proc as the cleanup handler for the thread thread. proc, which must be a thunk, will be called when thread exits, either normally or by being canceled. Thread cleanup handlers can be used to perform useful tasks like releasing resources, such as locked mutexes, when thread exit cannot be predicted.
The return value of proc will be set as the exit value of thread.
To remove a cleanup handler, pass #f for proc.
Return the cleanup handler currently installed for the thread thread. If no cleanup handler is currently installed, thread-cleanup returns #f.
Higher level thread procedures are available by loading the
(ice-9 threads) module. These provide standardized
thread creation.
Apply proc to args in a new thread formed by call-with-new-thread using a default error handler that displays the error to the current error port. The args... expressions are evaluated in the new thread.
Evaluate forms first and rest in a new thread formed by call-with-new-thread using a default error handler that displays the error to the current error port.
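A brief sketch of both forms, joining each thread to retrieve its result:

```scheme
(use-modules (ice-9 threads))

;; begin-thread evaluates its body forms in a new thread.
(define t1 (begin-thread (+ 1 2)))

;; make-thread applies a procedure to arguments in a new thread.
(define t2 (make-thread + 1 2 3))

(join-thread t1)   ; ⇒ 3
(join-thread t2)   ; ⇒ 6
```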
A mutex is a thread synchronization object that threads can use to control access to a shared resource. A mutex can be locked to indicate a resource is in use, and other threads can then block on the mutex to wait for the resource (or can just test and do something else if not available). “Mutex” is short for “mutual exclusion”.
There are two types of mutexes in Guile, “standard” and
“recursive”. They're created by make-mutex and
make-recursive-mutex respectively; the operation functions are
then common to both.
Note that for both types of mutex there's no protection against a “deadly embrace”. For instance if one thread has locked mutex A and is waiting on mutex B, but another thread owns B and is waiting on A, then an endless wait will occur (in the current implementation). Acquiring requisite mutexes in a fixed order (like always A before B) in all threads is one way to avoid such problems.
Return a new mutex. It is initially unlocked. If flags is specified, it must be a list of symbols specifying configuration flags for the newly-created mutex. The supported flags are:
- unchecked-unlock: Unless this flag is present, a call to unlock-mutex on the returned mutex when it is already unlocked will cause an error to be signalled.
- allow-external-unlock: Allow the returned mutex to be unlocked by the calling thread even if it was originally locked by a different thread.
- recursive: The returned mutex will be recursive.
Return #t iff obj is a mutex; otherwise, return #f.
Create a new recursive mutex. It is initially unlocked. Calling this function is equivalent to calling make-mutex and specifying the recursive flag.
Lock mutex. If the mutex is already locked, then block and return only when mutex has been acquired.
When timeout is given, it specifies a point in time where the waiting should be aborted. It can be either an integer as returned by current-time or a pair as returned by gettimeofday. When the waiting is aborted, #f is returned.
When owner is given, it specifies an owner for mutex other than the calling thread. owner may also be #f, indicating that the mutex should be locked but left unowned.
For standard mutexes (make-mutex), an error is signalled if the thread has itself already locked mutex.
For a recursive mutex (make-recursive-mutex), if the thread has itself already locked mutex, then a further lock-mutex call increments the lock count. An additional unlock-mutex will be required to finally release.
If mutex was locked by a thread that exited before unlocking it, the next attempt to lock mutex will succeed, but abandoned-mutex-error will be signalled.
When a system async (see System asyncs) is activated for a thread blocked in lock-mutex, the wait is interrupted and the async is executed. When the async returns, the wait resumes.
Arrange for mutex to be locked whenever the current dynwind context is entered and to be unlocked when it is exited.
Try to lock mutex as per lock-mutex. If mutex can be acquired immediately then this is done and the return is #t. If mutex is locked by some other thread then nothing is done and the return is #f.
Unlock mutex. An error is signalled if mutex is not locked and was not created with the unchecked-unlock flag set, or if mutex is locked by a thread other than the calling thread and was not created with the allow-external-unlock flag set.
If condvar is given, it specifies a condition variable upon which the calling thread will wait to be signalled before returning. (This behavior is very similar to that of wait-condition-variable, except that the mutex is left in an unlocked state when the function returns.)
When timeout is also given, it specifies a point in time where the waiting should be aborted. It can be either an integer as returned by current-time or a pair as returned by gettimeofday. When the waiting is aborted, #f is returned. Otherwise the function returns #t.
Return the current owner of mutex, in the form of a thread or #f (indicating no owner). Note that a mutex may be unowned but still locked.
Return the current lock level of mutex. If mutex is currently unlocked, this value will be 0; otherwise, it will be the number of times mutex has been recursively locked by its current owner.
Return #t if mutex is locked, regardless of ownership; otherwise, return #f.
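The inspection procedures can be exercised on a recursive mutex from a single thread. In this sketch every lock is matched by an unlock, so the mutex ends up free again.

```scheme
(use-modules (ice-9 threads))

(define m (make-recursive-mutex))

(lock-mutex m)
(lock-mutex m)                          ; a second lock by the same thread
(mutex-level m)                         ; ⇒ 2
(eq? (mutex-owner m) (current-thread))  ; ⇒ #t

(unlock-mutex m)
(unlock-mutex m)                        ; one unlock per lock
(mutex-locked? m)                       ; ⇒ #f

(try-mutex m)                           ; ⇒ #t, acquired without blocking
(unlock-mutex m)
```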
Return a new condition variable.
Return #t iff obj is a condition variable; otherwise, return #f.
Wait until condvar has been signalled. While waiting, mutex is atomically unlocked (as with unlock-mutex) and is locked again when this function returns. When time is given, it specifies a point in time where the waiting should be aborted. It can be either an integer as returned by current-time or a pair as returned by gettimeofday. When the waiting is aborted, #f is returned. When the condition variable has in fact been signalled, #t is returned. The mutex is re-locked in any case before wait-condition-variable returns.
When a system async is activated for a thread that is blocked in a call to wait-condition-variable, the waiting is interrupted, the mutex is locked, and the async is executed. When the async returns, the mutex is unlocked again and the waiting is resumed. When the thread blocks while re-acquiring the mutex, execution of asyncs is blocked.
Wake up one thread that is waiting for condvar.
Wake up all threads that are waiting for condvar.
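A common pattern, sketched here under the usual assumption that shared state is only touched while holding the mutex, pairs a condition variable with a predicate that the waiter re-checks after every wake-up. The names ready?, consumer and got-it are illustrative.

```scheme
(use-modules (ice-9 threads))

(define m (make-mutex))
(define cv (make-condition-variable))
(define ready? #f)

(define consumer
  (call-with-new-thread
   (lambda ()
     (with-mutex m
       ;; Re-check the predicate after every wake-up; if the producer
       ;; already ran, the wait is skipped entirely.
       (let loop ()
         (if (not ready?)
             (begin
               (wait-condition-variable cv m)
               (loop))))
       'got-it))))

;; Producer: publish the state change while holding the mutex.
(with-mutex m
  (set! ready? #t)
  (signal-condition-variable cv))

(join-thread consumer)   ; ⇒ got-it
```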
The following are higher level operations on mutexes. These are available from
(use-modules (ice-9 threads))
Lock mutex, evaluate the body forms, then unlock mutex. The return value is the return from the last body form.
The lock, body and unlock form the branches of a dynamic-wind (see Dynamic Wind), so mutex is automatically unlocked if an error or new continuation exits body, and is re-locked if body is re-entered by a captured continuation.
Evaluate the body forms, with a mutex locked so only one thread can execute that code at any one time. The return value is the return from the last body form.
Each monitor form has its own private mutex and the locking and evaluation is as per with-mutex above. A standard mutex (make-mutex) is used, which means body must not recursively re-enter the monitor form.
The term “monitor” comes from operating system theory, where it means a particular bit of code managing access to some resource and which only ever executes on behalf of one process at any one time.
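As a sketch of with-mutex protecting shared state, four threads below each increment a counter 1000 times. The thread count and iteration count are arbitrary.

```scheme
(use-modules (ice-9 threads))

(define counter 0)
(define counter-mutex (make-mutex))

(define (bump! times)
  (let loop ((i 0))
    (if (< i times)
        (begin
          ;; The read-modify-write happens with the mutex held.
          (with-mutex counter-mutex
            (set! counter (+ counter 1)))
          (loop (+ i 1))))))

(define workers
  (map (lambda (id)
         (call-with-new-thread (lambda () (bump! 1000))))
       '(0 1 2 3)))

(for-each join-thread workers)
counter   ; ⇒ 4000
```

Without the mutex, concurrent increments could interleave and updates might be lost.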
Up to Guile version 1.8, a thread blocked in guile mode would prevent
the garbage collector from running. Thus threads had to explicitly
leave guile mode with scm_without_guile () before making a
potentially blocking call such as a mutex lock, a select ()
system call, etc. The following functions could be used to temporarily
leave guile mode or to perform some common blocking operations in a
supported way.
Starting from Guile 2.0, blocked threads no longer hinder garbage collection. Thus, the functions below are not needed anymore. They can still be used to inform the GC that a thread is about to block, giving it a (small) optimization opportunity for “stop the world” garbage collections, should they occur while the thread is blocked.
Leave guile mode, call func on data, enter guile mode and return the result of calling func.
While a thread has left guile mode, it must not call any libguile functions except scm_with_guile or scm_without_guile and must not use any libguile macros. Also, local variables of type SCM that are allocated while not in guile mode are not protected from the garbage collector.
When used from non-guile mode, calling scm_without_guile is still allowed: it simply calls func. In that way, you can leave guile mode without having to know whether the current thread is in guile mode or not.
Like pthread_mutex_lock, but leaves guile mode while waiting for the mutex.
Like pthread_cond_wait and pthread_cond_timedwait, but leaves guile mode while waiting for the condition variable.
Like select but leaves guile mode while waiting. Also, the delivery of a system async causes this function to be interrupted with error code EINTR.
Like sleep, but leaves guile mode while sleeping. Also, the delivery of a system async causes this function to be interrupted.
Like usleep, but leaves guile mode while sleeping. Also, the delivery of a system async causes this function to be interrupted.
These two macros can be used to delimit a critical section. Syntactically, they are both statements and need to be followed immediately by a semicolon.
Executing SCM_CRITICAL_SECTION_START will lock a recursive mutex and block the execution of system asyncs. Executing SCM_CRITICAL_SECTION_END will unblock the execution of system asyncs and unlock the mutex. Thus, the code that executes between these two macros can only be executed in one thread at any one time and no system asyncs will run. However, because the mutex is a recursive one, the code might still be reentered by the same thread. You must either allow for this or avoid it, both by careful coding.
On the other hand, critical sections delimited with these macros can be nested since the mutex is recursive.
You must make sure that for each SCM_CRITICAL_SECTION_START, the corresponding SCM_CRITICAL_SECTION_END is always executed. This means that no non-local exit (such as a signalled error) might happen, for example.
Call scm_dynwind_lock_mutex on mutex and call scm_dynwind_block_asyncs. When mutex is false, a recursive mutex provided by Guile is used instead.
The effect of a call to scm_dynwind_critical_section is that the current dynwind context (see Dynamic Wind) turns into a critical section. Because of the locked mutex, no second thread can enter it concurrently and because of the blocked asyncs, no system async can reenter it from the current thread.
When the current thread reenters the critical section anyway, the kind of mutex determines what happens: When mutex is recursive, the reentry is allowed. When it is a normal mutex, an error is signalled.
A fluid is an object that can store one value per dynamic state. Each thread has a current dynamic state, and when accessing a fluid, this current dynamic state is used to provide the actual value. In this way, fluids can be used for thread local storage, but they are in fact more flexible: dynamic states are objects of their own and can be made current for more than one thread at the same time, or only be made current temporarily, for example.
Fluids can also be used to simulate the desirable effects of
dynamically scoped variables. Dynamically scoped variables are useful
when you want to set a variable to a value during some dynamic extent
in the execution of your program and have it revert to its
original value when the control flow is outside of this dynamic
extent. See the description of with-fluids below for details.
New fluids are created with make-fluid and fluid? is
used for testing whether an object is actually a fluid. The values
stored in a fluid can be accessed with fluid-ref and
fluid-set!.
Return a newly created fluid, whose initial value is dflt, or #f if dflt is not given. Fluids are objects that can hold one value per dynamic state. That is, modifications to this value are only visible to code that executes with the same dynamic state as the modifying code. When a new dynamic state is constructed, it inherits the values from its parent. Because each thread normally executes with its own dynamic state, you can use fluids for thread local storage.
Return a new fluid that is initially unbound (instead of being implicitly bound to some definite value).
Return #t iff obj is a fluid; otherwise, return #f.
Return the value associated with fluid in the current dynamic root. If fluid has not been set, then return its default value. Calling fluid-ref on an unbound fluid produces a runtime error.
Set the value associated with fluid in the current dynamic root.
Disassociate the given fluid from any value, making it unbound.
Return #t iff the given fluid is bound to a value, otherwise #f.
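The basic fluid procedures described above can be tried together. The following is a minimal sketch; the names depth and scratch are made up for illustration, and the values shown assume that no other code has touched these fluids.

```scheme
;; A fluid with an explicit initial value.
(define depth (make-fluid 0))

(fluid? depth)              ; ⇒ #t
(fluid-ref depth)           ; ⇒ 0
(fluid-set! depth 1)        ; change the value in the current dynamic state
(fluid-ref depth)           ; ⇒ 1

;; An unbound fluid has no value until one is set.
(define scratch (make-unbound-fluid))

(fluid-bound? scratch)      ; ⇒ #f
(fluid-set! scratch 'ready)
(fluid-bound? scratch)      ; ⇒ #t
(fluid-unset! scratch)      ; back to the unbound state
(fluid-bound? scratch)      ; ⇒ #f
```

Calling fluid-ref on scratch at the end of this sketch would signal an error, since the fluid is unbound again.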
with-fluids* temporarily changes the values of one or more fluids,
so that the given procedure and each procedure called by it access the
given values. After the procedure returns, the old values are restored.
Set fluid to value temporarily, and call thunk. thunk must be a procedure with no argument.
Set fluids to values temporarily, and call thunk. fluids must be a list of fluids and values must be a list of the same length holding the values to be applied. Each substitution is done in the order given. thunk must be a procedure with no argument. It is called inside a dynamic-wind and the fluids are set/restored when control enters or leaves the established dynamic extent.
Execute body... while each fluid is set to the corresponding value. Both fluid and value are evaluated and fluid must yield a fluid. body... is executed inside a dynamic-wind and the fluids are set/restored when control enters or leaves the established dynamic extent.
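As an illustration of the forms above, the following sketch rebinds a fluid temporarily and checks that the old value comes back, including after a non-local exit. The names indent and show-indent and the key oops are hypothetical.

```scheme
(define indent (make-fluid 0))
(define (show-indent) (fluid-ref indent))

;; with-fluids: special form, restores the old value afterwards.
(define inside (with-fluids ((indent 4)) (show-indent)))   ; ⇒ 4
(define after  (show-indent))                              ; ⇒ 0

;; with-fluid* and with-fluids*: procedural variants taking a thunk.
(define single (with-fluid* indent 6 show-indent))                  ; ⇒ 6
(define listed (with-fluids* (list indent) (list 8) show-indent))   ; ⇒ 8

;; The value is also restored when the body exits non-locally.
(define after-throw
  (catch 'oops
    (lambda ()
      (with-fluids ((indent 2))
        (throw 'oops)))
    (lambda (key . args)
      (show-indent))))                                     ; ⇒ 0
```

The handler passed to catch runs after the dynamic extent of with-fluids has been left, which is why it sees the original value.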
The function scm_c_with_fluids is like scm_with_fluids except that it takes a C function to call instead of a Scheme thunk.
The function scm_c_with_fluid is similar but only allows one fluid to be set instead of a list.
This function must be used inside a pair of calls to scm_dynwind_begin and scm_dynwind_end (see Dynamic Wind). During the dynwind context, the fluid fluid is set to val.
More precisely, the value of the fluid is swapped with a `backup' value whenever the dynwind context is entered or left. The backup value is initialized with the val argument.
Return a copy of the dynamic state object parent or of the current dynamic state when parent is omitted.
Return #t if obj is a dynamic state object; return #f otherwise.
Return non-zero if obj is a dynamic state object; return zero otherwise.
Return the current dynamic state object.
Set the current dynamic state object to state and return the previous current dynamic state object.
Call proc while state is the current dynamic state object.
Set the current dynamic state to state for the current dynwind context.
Like scm_with_dynamic_state, but call func with data.
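On the Scheme side, a dynamic state can be copied and made current for the duration of a call. The sketch below assumes the Scheme procedures make-dynamic-state, dynamic-state? and with-dynamic-state that correspond to the descriptions above; counter and state are illustrative names.

```scheme
(define counter (make-fluid 0))

;; Copy the current dynamic state; the copy starts with counter at 0.
(define state (make-dynamic-state))

(dynamic-state? state)                 ; ⇒ #t

;; Changes made while state is current do not leak into the
;; dynamic state that was current before the call.
(define seen-inside
  (with-dynamic-state state
    (lambda ()
      (fluid-set! counter 10)
      (fluid-ref counter))))           ; ⇒ 10

(define seen-outside (fluid-ref counter))   ; ⇒ 0
```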
A parameter object is a procedure. Calling it with no arguments returns its value. Calling it with one argument sets the value.
(define my-param (make-parameter 123))
(my-param) ⇒ 123
(my-param 456)
(my-param) ⇒ 456
The parameterize special form establishes new locations for
parameters, those new locations having effect within the dynamic scope
of the parameterize body. Leaving restores the previous
locations. Re-entering (through a saved continuation) will again use
the new locations.
(parameterize ((my-param 789))
(my-param)) ⇒ 789
(my-param) ⇒ 456
Parameters are like dynamically bound variables in other Lisp dialects. They allow an application to establish parameter settings (as the name suggests) just for the execution of a particular bit of code, restoring when done. Examples of such parameters might be case-sensitivity for a search, or a prompt for user input.
Global variables are not as good as parameter objects for this sort of
thing. Changes to them are visible to all threads, but in Guile
parameter object locations are per-thread, thereby truly limiting the
effect of parameterize to just its dynamic execution.
Passing arguments to functions is thread-safe, but that soon becomes tedious when there's more than a few or when they need to pass down through several layers of calls before reaching the point they should affect. And introducing a new setting to existing code is often easier with a parameter object than adding arguments.
Return a new parameter object, with initial value init.
If a converter is given, then a call (converter val) is made for each value set, and its return value is the value stored. Such a call is made for the init initial value too.
A converter allows values to be validated, or put into a canonical form. For example,
(define my-param (make-parameter 123
                   (lambda (val)
                     (if (not (number? val))
                         (error "must be a number"))
                     (inexact->exact val))))
(my-param 0.75)
(my-param) ⇒ 3/4
Establish a new dynamic scope with the given params bound to new locations and set to the given values. body is evaluated in that environment, the result is the return from the last form in body.
Each param is an expression which is evaluated to get the parameter object. Often this will just be the name of a variable holding the object, but it can be anything that evaluates to a parameter.
The param expressions and value expressions are all evaluated before establishing the new dynamic bindings, and they're evaluated in an unspecified order.
For example,
(define prompt (make-parameter "Type something: "))
(define (get-input)
  (display (prompt))
  ...)
(parameterize ((prompt "Type a number: "))
  (get-input)
  ...)
Parameter objects are implemented using fluids (see Fluids and Dynamic States), so each dynamic state has its own parameter
locations. That includes the separate locations when outside any
parameterize form. When a parameter is created it gets a
separate initial location in each dynamic state, all initialized to the
given init value.
As alluded to above, because each thread usually has a separate dynamic state, each thread has its own locations behind parameter objects, and changes in one thread are not visible to any other. When a new dynamic state or thread is created, the values of parameters in the originating context are copied into new locations.
Guile's parameters conform to SRFI-39 (see SRFI-39).
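The per-thread behaviour of parameter locations can be observed with a short experiment. This is only a sketch: it assumes a Guile built with thread support, uses call-with-new-thread and join-thread from (ice-9 threads), and imports (srfi srfi-39) in case parameters are not already in the default environment. The name verbosity is hypothetical.

```scheme
(use-modules (ice-9 threads)
             (srfi srfi-39))

(define verbosity (make-parameter 1))

;; The new thread starts with a copy of the creator's value,
;; then changes only its own location.
(define seen-in-thread
  (join-thread
   (call-with-new-thread
    (lambda ()
      (verbosity 5)
      (verbosity)))))

seen-in-thread     ; ⇒ 5
(verbosity)        ; ⇒ 1, the originating thread is unaffected
```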
The (ice-9 futures) module provides futures, a construct
for fine-grain parallelism. A future is a wrapper around an expression
whose computation may occur in parallel with the code of the calling
thread, and possibly in parallel with other futures. Like promises,
futures are essentially proxies that can be queried to obtain the value
of the enclosed expression:
(touch (future (+ 2 3)))
⇒ 5
However, unlike promises, the expression associated with a future may be evaluated on another CPU core, should one be available. This supports fine-grain parallelism, because even relatively small computations can be embedded in futures. Consider this sequential code:
(define (find-prime lst1 lst2)
(or (find prime? lst1)
(find prime? lst2)))
The two arms of or are potentially computation-intensive. They
are independent of one another, yet, they are evaluated sequentially
when the first one returns #f. Using futures, one could rewrite
it like this:
(define (find-prime lst1 lst2)
(let ((f (future (find prime? lst2))))
(or (find prime? lst1)
(touch f))))
This preserves the semantics of find-prime. On a multi-core
machine, though, the computation of (find prime? lst2) may be
done in parallel with that of the other find call, which can
reduce the execution time of find-prime.
Note that futures are intended for the evaluation of purely functional expressions. Expressions that have side-effects or rely on I/O may require additional care, such as explicit synchronization (see Mutexes and Condition Variables).
Guile's futures are implemented on top of POSIX threads
(see Threads). Internally, a fixed-size pool of threads is used to
evaluate futures, such that offloading the evaluation of an expression
to another thread doesn't incur thread creation costs. By default, the
pool contains one thread per available CPU core, minus one, to account
for the main thread. The number of available CPU cores is determined
using current-processor-count (see Processes).
Return a future for expression exp. This is equivalent to:
(make-future (lambda () exp))
Return a future for thunk, a zero-argument procedure.
This procedure returns immediately. Execution of thunk may begin in parallel with the calling thread's computations, if idle CPU cores are available, or it may start when touch is invoked on the returned future.
If the execution of thunk throws an exception, that exception will be re-thrown when touch is invoked on the returned future.
Return the result of the expression embedded in future f.
If the result was already computed in parallel, touch returns instantaneously. Otherwise, it waits for the computation to complete, if it already started, or initiates it.
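The following sketch puts future, make-future and touch together. The helper sum-to and the key my-error are made up for this example; whether the future actually runs on another core depends on the machine.

```scheme
(use-modules (ice-9 futures))

;; A purely functional helper: 1 + 2 + ... + n.
(define (sum-to n)
  (let loop ((i 1) (acc 0))
    (if (> i n)
        acc
        (loop (+ i 1) (+ acc i)))))

;; Start one computation as a future, do the other in the
;; calling thread, then combine the two results.
(define total
  (let ((f (future (sum-to 1000))))
    (+ (sum-to 100) (touch f))))      ; 5050 + 500500 ⇒ 505550

;; An exception thrown inside the thunk surfaces at touch.
(define outcome
  (catch 'my-error
    (lambda ()
      (touch (make-future (lambda () (throw 'my-error)))))
    (lambda (key . args) 'caught)))   ; ⇒ caught
```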
The functions described in this section are available from
(use-modules (ice-9 threads))
They provide high-level parallel constructs. The following functions are implemented in terms of futures (see Futures). Thus they are relatively cheap as they re-use existing threads, and portable, since they automatically use one thread per available CPU core.
Evaluate each expr expression in parallel, each in its own thread. Return the results as a set of N multiple values (see Multiple Values).
Evaluate each expr in parallel, each in its own thread, then bind the results to the corresponding var variables and evaluate body.
letpar is like let (see Local Bindings), but all the expressions for the bindings are evaluated in parallel.
Call proc on the elements of the given lists. par-map returns a list comprising the return values from proc. par-for-each returns an unspecified value, but waits for all calls to complete.
The proc calls are (proc elem1 ... elemN), where each elem is from the corresponding lst. Each lst must be the same length. The calls are potentially made in parallel, depending on the number of CPU cores available.
These functions are like map and for-each (see List Mapping), but make their proc calls in parallel.
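A short sketch of the forms just described, using only side-effect-free expressions so that the order of evaluation does not matter. The variable names are illustrative.

```scheme
(use-modules (ice-9 threads))

;; par-map: like map, results are returned in list order.
(define products
  (par-map * '(1 2 3) '(4 5 6)))            ; ⇒ (4 10 18)

;; letpar: like let, with the binding expressions evaluated in parallel.
(define pair
  (letpar ((a (+ 1 2))
           (b (* 3 4)))
    (list a b)))                            ; ⇒ (3 12)

;; parallel: returns one value per expression.
(define both
  (call-with-values
      (lambda () (parallel (+ 1 2) (* 3 4)))
    list))                                  ; ⇒ (3 12)
```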
Unlike those above, the functions described below take a number of
threads as an argument. This makes them inherently non-portable since
the specified number of threads may differ from the number of available
CPU cores as returned by current-processor-count
(see Processes). In addition, these functions create the specified
number of threads when they are called and terminate them upon
completion, which makes them quite expensive.
Therefore, they should be avoided.
Call proc on the elements of the given lists, in the same way as par-map and par-for-each above, but use no more than n threads at any one time. The order in which calls are initiated within that thread limit is unspecified.
These functions are good for controlling resource consumption if proc calls might be costly, or if there are many to be made. On a dual-CPU system, for instance, n=4 might be enough to keep the CPUs utilized, and not consume too much memory.
Apply pproc to the elements of the given lists, and apply sproc to each result returned by pproc. The final return value is unspecified, but all calls will have been completed before returning.
The calls made are (sproc (pproc elem1 ... elemN)), where each elem is from the corresponding lst. Each lst must have the same number of elements.
The pproc calls are made in parallel, in separate threads. No more than n threads are used at any one time. The order in which pproc calls are initiated within that limit is unspecified.
The sproc calls are made serially, in list element order, one at a time. pproc calls on later elements may execute in parallel with the sproc calls. Exactly which thread makes each sproc call is unspecified.
This function is designed for individual calculations that can be done in parallel, but with results needing to be handled serially, for instance to write them to a file. The n limit on threads controls system resource usage when there are many calculations or when they might be costly.
It will be seen that n-for-each-par-map is like a combination of n-par-map and for-each,
(for-each sproc (n-par-map n pproc lst1 ... lstN))
But the actual implementation is more efficient since each sproc call, in turn, can be initiated once the relevant pproc call has completed; it doesn't need to wait for all to finish.
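To make the division of labour between pproc and sproc concrete, here is a small sketch in which the parallel step squares numbers and the serial step collects them. The names collected and squares are hypothetical, and the thread limit of 2 is arbitrary.

```scheme
(use-modules (ice-9 threads))

(define collected '())

;; pproc (squaring) may run in up to 2 threads at a time;
;; sproc (collecting) runs serially, in list element order.
(n-for-each-par-map 2
                    (lambda (result)
                      (set! collected (cons result collected)))
                    (lambda (x) (* x x))
                    '(1 2 3 4))

(reverse collected)                                  ; ⇒ (1 4 9 16)

;; n-par-map alone returns the results as a list.
(define squares
  (n-par-map 2 (lambda (x) (* x x)) '(1 2 3 4)))     ; ⇒ (1 4 9 16)
```

Because the sproc calls are serialized, the set! in the collecting step needs no additional locking in this sketch.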
Why is my Guile different from your Guile? There are three kinds of possible variation:
Guile provides “introspective” variables and procedures to query all of these possible variations at runtime. For runtime options, it also provides procedures to change the settings of options and to obtain documentation on what the options mean.
The following procedures and variables provide information about how Guile was configured, built and installed on your system.
Return a string describing Guile's full version number, effective version number, major, minor or micro version number, respectively. The effective-version function returns the version name that should remain unchanged during a stable series. Currently that means that it omits the micro version. The effective version should be used for items like the versioned share directory name, i.e., /usr/share/guile/2.0/.
(version) ⇒ "2.0.4"
(effective-version) ⇒ "2.0"
(major-version) ⇒ "2"
(minor-version) ⇒ "0"
(micro-version) ⇒ "4"
Return the name of the directory under which Guile Scheme files in general are stored. On Unix-like systems, this is usually /usr/local/share/guile or /usr/share/guile.
Return the name of the directory where the Guile Scheme files that belong to the core Guile installation (as opposed to files from a 3rd party package) are installed. On Unix-like systems this is usually /usr/local/share/guile/GUILE_EFFECTIVE_VERSION or /usr/share/guile/GUILE_EFFECTIVE_VERSION;
for example /usr/local/share/guile/2.0.
Return the name of the directory where Guile Scheme files specific to your site should be installed. On Unix-like systems, this is usually /usr/local/share/guile/site or /usr/share/guile/site.
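The version and directory procedures can be combined, for instance to compute a versioned directory name. The sketch below assumes the directory procedures are named %package-data-dir, %library-dir and %site-dir; the variable versioned-share-dir is made up, and the actual paths depend on how Guile was installed.

```scheme
;; Build the versioned share directory name from its parts.
(define versioned-share-dir
  (string-append (%package-data-dir) "/" (effective-version)))

;; Within a stable series the effective version is "MAJOR.MINOR".
(define effective-matches?
  (string=? (effective-version)
            (string-append (major-version) "." (minor-version))))

(display (version))          (newline)
(display versioned-share-dir) (newline)
(display (%library-dir))     (newline)   ; usually the same directory
(display (%site-dir))        (newline)
```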
Alist of information collected during the building of a particular Guile. Entries can be grouped into one of several categories: directories, env vars, and versioning info.
Briefly, here are the keys in %guile-build-info, by group:
- directories: srcdir, top_srcdir, prefix, exec_prefix, bindir, sbindir, libexecdir, datadir, sysconfdir, sharedstatedir, localstatedir, libdir, infodir, mandir, includedir, pkgdatadir, pkglibdir, pkgincludedir
- env vars: LIBS
- versioning info: guileversion, libguileinterface, buildstamp
Values are all strings. The value for LIBS is typically found also as a part of pkg-config --libs guile-2.0 output. The value for guileversion has form X.Y.Z, and should be the same as returned by (version). The value for libguileinterface is libtool compatible and has form CURRENT:REVISION:AGE (see Library interface versions). The value for buildstamp is the output of the command ‘date -u +'%Y-%m-%d %T'’ (UTC).
In the source, %guile-build-info is initialized from libguile/libpath.h, which is completely generated, so deleting this file before a build guarantees up-to-date values for that build.
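Since %guile-build-info is an alist, individual entries can be looked up with assq-ref. A minimal sketch, assuming the keys are symbols as listed above; build-info is a hypothetical helper.

```scheme
;; Look up one build-time setting by key.
(define (build-info key)
  (assq-ref %guile-build-info key))

(define prefix-dir (build-info 'prefix))   ; e.g. "/usr/local"

(display prefix-dir)                  (newline)
(display (build-info 'guileversion))  (newline)   ; should match (version)
(display (build-info 'buildstamp))    (newline)
```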
The canonical host type (GNU triplet) of the host Guile was configured for, e.g., "x86_64-unknown-linux-gnu" (see Canonicalizing).
Guile has a Scheme level variable *features* that keeps track to
some extent of the features that are available in a running Guile.
*features* is a list of symbols, for example threads, each
of which describes a feature of the running Guile process.
You shouldn't modify the *features* variable directly using
set!. Instead, see the procedures that are provided for this
purpose in the following subsection.
To check whether a particular feature is available, use the
provided? procedure:
Return #t if the specified feature is available, otherwise #f.
To advertise a feature from your own Scheme code, you can use the
provide procedure:
Add feature to the list of available features in this Guile process.
For C code, the equivalent function takes its feature name as a
char * argument for convenience:
Add a symbol with name str to the list of available features in this Guile process.
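For instance, Scheme code can advertise a feature and later test for it. The feature name my-app-extras below is invented for the example.

```scheme
;; Before it is provided, the feature is not listed.
(define before (provided? 'my-app-extras))   ; ⇒ #f

(provide 'my-app-extras)

(define after (provided? 'my-app-extras))    ; ⇒ #t

;; Built-in features are queried the same way; the answer
;; depends on how this Guile was built.
(provided? 'threads)
```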
In general, a particular feature may be available for one of two reasons. Either because the Guile library was configured and compiled with that feature enabled — i.e. the feature is built into the library on your system. Or because some C or Scheme code that was dynamically loaded by Guile has added that feature to the list.
In the first category, here are the features that the current version of Guile may define (depending on how it is built), and what they mean.
- array
- array-for-each: array-for-each and other array mapping procedures (see Arrays).
- char-ready?: the char-ready? function is available (see Reading).
- complex
- current-time: times, get-internal-run-time and so on (see Time).
- debug-extensions
- delay
- EIDs: geteuid and getegid really return effective user and group IDs (see Processes).
- inexact
- i/o-extensions: ftell, redirect-port, dup->fdes, dup2, fileno, isatty?, fdopen, primitive-move->fdes and fdes->ports (see Ports and File Descriptors).
- net-db: scm_gethost, scm_getnet, scm_getproto, scm_getserv, scm_sethost, scm_setnet, scm_setproto, scm_setserv, and their `byXXX' variants (see Network Databases).
- posix: pipe, getgroups, kill, execl and so on (see POSIX).
- random: random, copy-random-state, random-uniform and so on (see Random).
- reckless
- regex: make-regexp, regexp-exec and friends (see Regexp Functions).
- socket: socket, bind, connect and so on (see Network Sockets and Communication).
- sort
- system: the system function is available (see Processes).
- threads
- values: values and call-with-values (see Multiple Values).
Available features in the second category depend, by definition, on what additional code your Guile process has loaded in. The following table lists features that you might encounter for this reason.
- defmacro: the defmacro macro is available (see Macros).
- describe: the (oop goops describe) module has been loaded, which provides a procedure for describing the contents of GOOPS instances.
- readline
- record: make-record-type and friends (see Records).
Although these tables may seem exhaustive, it is probably unwise in
practice to rely on them, as the correspondences between feature symbols
and available procedures/behaviour are not strictly defined. If you are
writing code that needs to check for the existence of some procedure, it
is probably safer to do so directly using the defined? procedure
than to test for the corresponding feature using provided?.
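A direct check with defined? might look like the following sketch. The helper describe-capability and the symbol no-such-procedure-xyz are hypothetical; defined? is applied to a symbol and looks it up in the current module.

```scheme
;; Report whether a variable with the given name is bound.
(define (describe-capability sym)
  (if (defined? sym) 'available 'missing))

(describe-capability 'car)                     ; ⇒ available
(describe-capability 'no-such-procedure-xyz)   ; ⇒ missing

;; The answer for optional procedures depends on the build.
(describe-capability 'make-regexp)
```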
There are a number of runtime options available for parameterizing
built-in procedures, like read, and built-in behavior, like what
happens on an uncaught error.
For more information on reader options, see Scheme Read.
For more information on print options, see Scheme Write.
Finally, for more information on debugger options, see Debug Options.
Here is an example of a session in which some read and debug option handling procedures are used. In this example, the user
- notices that abc and aBc are not the same
- examines the read-options, and sees that case-insensitive is set to “no”
- enables case-insensitive
- quits the recursive prompt
- verifies that aBc and abc are now the same
scheme@(guile-user)> (define abc "hello")
scheme@(guile-user)> abc
$1 = "hello"
scheme@(guile-user)> aBc
<unknown-location>: warning: possibly unbound variable `aBc'
ERROR: In procedure module-lookup:
ERROR: Unbound variable: aBc
Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
scheme@(guile-user) [1]> (read-options 'help)
copy no Copy source code expressions.
positions yes Record positions of source code expressions.
case-insensitive no Convert symbols to lower case.
keywords #f Style of keyword recognition: #f, 'prefix or 'postfix.
r6rs-hex-escapes no Use R6RS variable-length character and string hex escapes.
square-brackets yes Treat `[' and `]' as parentheses, for R6RS compatibility.
hungry-eol-escapes no In strings, consume leading whitespace after an
escaped end-of-line.
scheme@(guile-user) [1]> (read-enable 'case-insensitive)
$2 = (square-brackets keywords #f case-insensitive positions)
scheme@(guile-user) [1]> ,q
scheme@(guile-user)> aBc
$3 = "hello"
In addition to Scheme, a user may write a Guile program in an increasing number of other languages. Currently supported languages include Emacs Lisp and ECMAScript.
Guile is still fundamentally a Scheme, but it tries to support a wide variety of language building-blocks, so that other languages can be implemented on top of Guile. This allows users to write or extend applications in languages other than Scheme, too. This section describes the languages that have been implemented.
(For details on how to implement a language, see Compiling to the Virtual Machine.)
There are currently only two ways to access other languages from within
Guile: at the REPL, and programmatically, via compile,
read-and-compile, and compile-file.
The REPL is Guile's command prompt (see Using Guile Interactively).
The REPL has a concept of the “current language”, which defaults to
Scheme. The user may change that language, via the meta-command
,language.
For example, the following meta-command enables Emacs Lisp input:
scheme@(guile-user)> ,language elisp
Happy hacking with Emacs Lisp! To switch back, type `,L scheme'.
elisp@(guile-user)> (eq 1 2)
$1 = #nil
Each language has its short name: for example, elisp, for Elisp.
The same short name may be used to compile source code programmatically,
via compile:
elisp@(guile-user)> ,L scheme
Happy hacking with Guile Scheme! To switch back, type `,L elisp'.
scheme@(guile-user)> (compile '(eq 1 2) #:from 'elisp)
$2 = #nil
Granted, as the input to compile is a datum, this works best for
Lispy languages, which have a straightforward datum representation.
Other languages that need more parsing are better dealt with as strings.
The easiest way to deal with a syntax-heavy language is with files, via
compile-file and friends. However it is possible to invoke a
language's reader on a port, and then compile the resulting expression
(which is a datum at that point). For more information,
see Compilation.
For more details on introspecting aspects of different languages, see Compiler Tower.
Emacs Lisp (Elisp) is a dynamically-scoped Lisp dialect used in the Emacs editor. See Overview, for more information on Emacs Lisp.
We hope that eventually Guile's implementation of Elisp will be good enough to replace Emacs' own implementation of Elisp. For that reason, we have thought long and hard about how to support the various features of Elisp in a performant and compatible manner.
Readers familiar with Emacs Lisp might be curious about how exactly these various Elisp features are supported in Guile. The rest of this section focuses on addressing these concerns of the Elisp elect.
nil in Elisp is an amalgam of Scheme's #f and '().
It is false, and it is the end-of-list; thus it is a boolean, and a list
as well.
Guile has chosen to support nil as a separate value, distinct
from #f and '(). This allows existing Scheme and Elisp
code to maintain their current semantics. nil, which in Elisp
would just be written and read as nil, in Scheme has the external
representation #nil.
This decision to have nil as a low-level distinct value
facilitates interoperability between the two languages. Guile has chosen
to have Scheme deal with nil as follows:
(boolean? #nil) ⇒ #t
(not #nil) ⇒ #t
(null? #nil) ⇒ #t
And in C, one has:
scm_is_bool (SCM_ELISP_NIL) ⇒ 1
scm_is_false (SCM_ELISP_NIL) ⇒ 1
scm_is_null (SCM_ELISP_NIL) ⇒ 1
In this way, a version of fold written in Scheme can correctly
fold a function written in Elisp (or in fact any other language) over a
nil-terminated list, as Elisp makes. The converse holds as well; a
version of fold written in Elisp can fold over a
'()-terminated list, as made by Scheme.
On a low level, the bit representations for #f, #t,
nil, and '() are made in such a way that they differ by
only one bit, and so a test for, for example, #f-or-nil
may be made very efficiently. See libguile/boolean.h, for more
information.
Since Scheme's equal? must be transitive, and '()
is not equal? to #f, to Scheme nil is not
equal? to #f or '().
(eq? #f '()) ⇒ #f
(eq? #nil '()) ⇒ #f
(eq? #nil #f) ⇒ #f
(eqv? #f '()) ⇒ #f
(eqv? #nil '()) ⇒ #f
(eqv? #nil #f) ⇒ #f
(equal? #f '()) ⇒ #f
(equal? #nil '()) ⇒ #f
(equal? #nil #f) ⇒ #f
However, in Elisp, '(), #f, and nil are all
equal (though not eq).
(defvar f (make-scheme-false))
(defvar eol (make-scheme-null))
(eq f eol) ⇒ nil
(eq nil eol) ⇒ nil
(eq nil f) ⇒ nil
(equal f eol) ⇒ t
(equal nil eol) ⇒ t
(equal nil f) ⇒ t
These choices facilitate interoperability between Elisp and Scheme code, but they are not perfect. Some code that is correct standard Scheme is not correct in the presence of a second false and null value. For example:
(define (truthiness x)
(if (eq? x #f)
#f
#t))
This code seems to be meant to test a value for truth, but now that
there are two false values, #f and nil, it is no longer
correct.
Similarly, there is the loop:
(define (my-length l)
(let lp ((l l) (len 0))
(if (eq? l '())
len
(lp (cdr l) (1+ len)))))
Here, my-length will raise an error if l is a
nil-terminated list.
Both of these examples are correct standard Scheme, but, depending on
what they really want to do, they are not correct Guile Scheme.
Correctly written, they would test the properties of falsehood or
nullity, not the individual members of that set. That is to say, they
should use not or null? to test for falsehood or nullity,
not eq? or memv or the like.
Fortunately, using not and null? is in good style, so all
well-written standard Scheme programs are correct, in Guile Scheme.
Here are correct versions of the above examples:
(define (truthiness* x)
(if (not x)
#f
#t))
;; or: (define (t* x) (not (not x)))
;; or: (define (t** x) x)
(define (my-length* l)
(let lp ((l l) (len 0))
(if (null? l)
len
(lp (cdr l) (1+ len)))))
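To see the difference in practice, the corrected length procedure can be applied to both a '()-terminated and a nil-terminated list. In this sketch the nil-terminated list is built by hand with #nil, standing in for a list that Elisp code would produce; the list names are illustrative.

```scheme
(define (my-length* l)
  (let lp ((l l) (len 0))
    (if (null? l)                 ; true for both '() and #nil
        len
        (lp (cdr l) (1+ len)))))

(define scheme-list (list 1 2 3))
(define elisp-style-list (cons 1 (cons 2 (cons 3 #nil))))

(my-length* scheme-list)          ; ⇒ 3
(my-length* elisp-style-list)     ; ⇒ 3

;; The eq?-based test is what fails to recognize the terminator.
(eq? #nil '())                    ; ⇒ #f
(null? #nil)                      ; ⇒ #t
```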
This problem has a mirror-image case in Elisp:
(defun my-falsep (x)
(if (eq x nil)
t
nil))
Guile can warn when compiling code that has equality comparisons with
#f, '(), or nil. See Compilation, for details.
In contrast to Scheme, which uses “lexical scoping”, Emacs Lisp scopes its variables dynamically. Guile supports dynamic scoping with its “fluids” facility. See Fluids and Dynamic States, for more information.
Buffer-local and mode-local variables should be mentioned here, along with buckybits on characters, Emacs primitive data types, the Lisp-2-ness of Elisp, and other things. Contributions to the documentation are most welcome!
ECMAScript was not the first non-Schemey language implemented by Guile, but it was the first implemented for Guile's bytecode compiler. The goal was to support ECMAScript version 3.1, a relatively small language, but the implementor was completely irresponsible and got distracted by other things before finishing the standard library, and even some bits of the syntax. So, ECMAScript does deserve a mention in the manual, but it doesn't deserve an endorsement until its implementation is completed, perhaps by some more responsible hacker.
In the meantime, the charitable user might investigate such invocations
as ,L ecmascript and cat test-suite/tests/ecmascript.test.
Guile provides internationalization
support for Scheme programs in two ways. First, procedures to
manipulate text and data in a way that conforms to particular cultural
conventions (i.e., in a “locale-dependent” way) are provided in the
(ice-9 i18n) module. Second, Guile allows the use of GNU
gettext to translate program message strings.
In order to make use of the functions described hereafter, the
(ice-9 i18n) module must be imported in the usual way:
(use-modules (ice-9 i18n))
The (ice-9 i18n) module provides procedures to manipulate text
and other data in a way that conforms to the cultural conventions
chosen by the user. Each region of the world or language has its own
customs to, for instance, represent real numbers, classify characters,
collate text, etc. All these aspects comprise the so-called
“cultural conventions” of that region or language.
Computer systems typically refer to a set of cultural conventions as a
locale. For each particular aspect that comprises those cultural
conventions, a locale category is defined. For instance, the
way characters are classified is defined by the LC_CTYPE
category, while the language in which program messages are issued to
the user is defined by the LC_MESSAGES category
(see General Locale Information for details).
The procedures provided by this module allow the development of
programs that adapt automatically to any locale setting. As we will
see later, many of these procedures can optionally take a locale
object argument. This additional argument defines the locale
settings that must be followed by the invoked procedure. When it is
omitted, then the current locale settings of the process are followed
(see setlocale).
The following procedures allow the manipulation of such locale objects.
Return a reference to a data structure representing a set of locale datasets. locale-name should be a string denoting a particular locale (e.g., "aa_DJ") and category-list should be either a list of locale categories or a single category as used with setlocale (see setlocale). Optionally, if base-locale is passed, it should be a locale object denoting settings for categories not listed in category-list.
The following invocation creates a locale object that combines the use of Swedish for messages and character classification with the default settings for the other categories (i.e., the settings of the default C locale which usually represents conventions in use in the USA):
(make-locale (list LC_MESSAGES LC_CTYPE) "sv_SE")
The following example combines the use of Esperanto messages and conventions with monetary conventions from Croatia:
(make-locale LC_MONETARY "hr_HR" (make-locale LC_ALL "eo_EO"))
A system-error exception (see Handling Errors) is raised by make-locale when locale-name does not match any of the locales compiled on the system. Note that on non-GNU systems, this error may be raised later, when the locale object is actually used.
Return true if obj is a locale object.
This variable is bound to a locale object denoting the current process locale as installed using setlocale () (see Locales). It may be used like any other locale object, including as a third argument to make-locale, for instance.
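Because make-locale can raise system-error for a locale that is not installed, programs may want to guard the call. The following sketch uses a hypothetical helper try-make-locale that returns #f on failure; the "C" locale is used because it should be available everywhere, whereas "sv_SE" may or may not be installed.

```scheme
(use-modules (ice-9 i18n))

;; Return a locale object for NAME, or #f if it cannot be created.
(define (try-make-locale name)
  (catch 'system-error
    (lambda () (make-locale LC_ALL name))
    (lambda args #f)))

(define c-locale (try-make-locale "C"))
(locale? c-locale)                       ; ⇒ #t

;; Result depends on the locales compiled on the system.
(define swedish (try-make-locale "sv_SE"))
(if swedish
    (display "Swedish locale available\n")
    (display "Swedish locale not installed\n"))
```

As noted above, on non-GNU systems the error may only appear when the locale object is used, so a guard of this kind is not a complete check there.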
The following procedures provide support for text collation, i.e., locale-dependent string and character sorting.
Compare strings s1 and s2 in a locale-dependent way. If locale is provided, it should be a locale object (as returned by make-locale) and will be used to perform the comparison; otherwise, the current system locale is used. For the -ci variants, the comparison is made in a case-insensitive way.
Compare strings s1 and s2 in a case-insensitive and locale-dependent way. If locale is provided, it should be a locale object (as returned by make-locale) and will be used to perform the comparison; otherwise, the current system locale is used.
Compare characters c1 and c2 according to either locale (a locale object as returned by make-locale) or the current locale. For the -ci variants, the comparison is made in a case-insensitive way.
Return true if character c1 is equal to c2, in a case insensitive way according to locale or to the current locale.
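As a minimal sketch of how the collation procedures can be used, the following sorts a list of strings with string-locale<? and compares two strings with string-locale-ci=?. The helper names sort-for-current-locale and same-word? are hypothetical, and the resulting order depends on the locale in effect, so no particular output is shown.

```scheme
(use-modules (ice-9 i18n))

;; Hypothetical helper: sort STRINGS using the collation order of the
;; current locale.  `sort' returns a new list.
(define (sort-for-current-locale strings)
  (sort strings string-locale<?))

;; Hypothetical helper: compare two strings, ignoring case, according
;; to the current locale.
(define (same-word? a b)
  (string-locale-ci=? a b))

(sort-for-current-locale (list "pear" "Apple" "banana"))
(same-word? "hello" "HELLO")
```

To sort according to a specific locale rather than the current one, pass a locale object as the third argument, for instance (string-locale<? a b (make-locale LC_COLLATE "sv_SE")), assuming that locale is installed on the system.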
The procedures below provide support for “character case mapping”, i.e., to convert characters or strings to their upper-case or lower-case equivalent. Note that SRFI-13 provides procedures that look similar (see Alphabetic Case Mapping). However, the SRFI-13 procedures are locale-independent. Therefore, they do not take into account specificities of the customs in use in a particular language or region of the world. For instance, while most languages using the Latin alphabet map lower-case letter “i” to upper-case letter “I”, Turkish maps lower-case “i” to “Latin capital letter I with dot above”. The following procedures allow programmers to provide idiomatic character mapping.
Return the lowercase character that corresponds to chr according to either locale or the current locale.
Return the uppercase character that corresponds to chr according to either locale or the current locale.
Return the titlecase character that corresponds to chr according to either locale or the current locale.
Return a new string that is the uppercase version of str according to either locale or the current locale.
Return a new string that is the lowercase version of str according to either locale or the current locale.
Return a new string that is the titlecase version of str according to either locale or the current locale.
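The following sketch shows one way to apply a locale-specific case mapping while coping with locales that may not be installed. The helper upcase-in is hypothetical; it returns #f when the requested locale cannot be created or used, since make-locale raises a system-error exception in that case.

```scheme
(use-modules (ice-9 i18n))

;; Hypothetical helper: upcase STR according to the locale named
;; LOCALE-NAME, or return #f if that locale is unavailable.
(define (upcase-in locale-name str)
  (catch 'system-error
    (lambda ()
      (string-locale-upcase str (make-locale LC_ALL locale-name)))
    (lambda args #f)))

;; With no locale argument, the current locale is used.
(string-locale-upcase "guile")

;; The result depends on whether a Turkish locale is installed.
(upcase-in "tr_TR" "i")
```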
The following procedures allow programs to read and write numbers
written according to a particular locale. As an example, in English,
“ten thousand and a half” is usually written 10,000.5 while
in French it is written 10 000,5. These procedures allow such
differences to be taken into account.
Convert string str into an integer according to either locale (a locale object as returned by make-locale) or the current process locale. If base is specified, then it determines the base of the integer being read (e.g., 16 for a hexadecimal number, 10 for a decimal number); by default, decimal numbers are read. Return two values (see Multiple Values): an integer (on success) or #f, and the number of characters read from str (0 on failure).
This function is based on the C library's strtol function (see strtol).
Convert string str into an inexact number according to either locale (a locale object as returned by make-locale) or the current process locale. Return two values (see Multiple Values): an inexact number (on success) or #f, and the number of characters read from str (0 on failure).
This function is based on the C library's strtod function (see strtod).
Convert number (an inexact) into a string according to the cultural conventions of either locale (a locale object) or the current locale. Optionally, fraction-digits may be bound to an integer specifying the number of fractional digits to be displayed.
Convert amount (an inexact denoting a monetary amount) into a string according to the cultural conventions of either locale (a locale object) or the current locale. If intl? is true, then the international monetary format for the given locale is used (see international and locale monetary formats).
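Because the parsing procedures return two values, callers need call-with-values (or an equivalent) to receive them. The sketch below wraps locale-string->integer in a hypothetical helper, parse-integer, that returns a pair of the value and the number of characters consumed, or #f on failure.

```scheme
(use-modules (ice-9 i18n))

;; Hypothetical helper: parse STR as an integer in BASE using the
;; current locale.  Return (VALUE . COUNT) on success, #f on failure.
(define (parse-integer str base)
  (call-with-values
      (lambda () (locale-string->integer str base))
    (lambda (value count)
      (and value (cons value count)))))

(parse-integer "ff" 16)    ; value 255, 2 characters read
(parse-integer "zz" 10)    ; no digits could be read
```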
It is sometimes useful to obtain very specific information about a
locale such as the word it uses for days or months, its format for
representing floating-point figures, etc. The (ice-9 i18n)
module provides support for this in a way that is similar to the libc
functions nl_langinfo () and localeconv ()
(see accessing locale information from C). The available functions
are listed below.
Return the name of the encoding (a string whose interpretation is system-dependent) of either locale or the current locale.
The following functions deal with dates and times.
Return the word (a string) used in either locale or the current locale to name the day (or month) denoted by day (or month), an integer between 1 and 7 (or 1 and 12). The -short variants provide an abbreviation instead of a full name.
Return a (potentially empty) string that is used to denote ante meridiem (or post meridiem) hours in 12-hour format.
These procedures return format strings suitable to strftime (see Time) that may be used to display (part of) a date/time according to certain constraints and to the conventions of either locale or the current locale (see the nl_langinfo () items).
These functions return, respectively, the era and the year of the relevant era used in locale or the current locale. Most locales do not define this value. In this case, the empty string is returned. An example of a locale that does define this value is the Japanese one.
The following procedures give information about number representation.
These functions return a string denoting the representation of the decimal point or that of the thousand separator (respectively) for either locale or the current locale.
Return a (potentially circular) list of integers denoting how digits of the integer part of a number are to be grouped, starting at the decimal point and going to the left. The list contains integers indicating the size of the successive groups, from right to left. If the list is non-circular, then no grouping occurs for digits beyond the last group.
For instance, if the returned list is a circular list that contains only 3 and the thousand separator is "," (as is the case with English locales), then the number 12345678 should be printed 12,345,678.
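The grouping rule described above can be sketched in plain Scheme. The procedure group-digits below is a hypothetical illustration, not part of (ice-9 i18n); it takes the digits of the integer part as a string, a grouping list of the kind described above (possibly circular), and a separator string.

```scheme
;; Hypothetical helper: insert SEPARATOR into DIGITS (a string holding
;; the integer part of a number) according to GROUPING, a possibly
;; circular list of group sizes counted from the right.
(define (group-digits digits grouping separator)
  (let loop ((rest digits) (grouping grouping) (groups '()))
    (if (or (null? grouping)
            (<= (string-length rest) (car grouping)))
        ;; Either the grouping list is exhausted or the remaining
        ;; digits fit in one group: they form the leftmost group.
        (string-join (cons rest groups) separator)
        (let ((split (- (string-length rest) (car grouping))))
          (loop (substring rest 0 split)
                (cdr grouping)
                (cons (substring rest split) groups))))))

;; A circular list that contains only 3.
(define threes (let ((l (list 3))) (set-cdr! l l) l))

(group-digits "12345678" threes ",")    ; ⇒ "12,345,678"
(group-digits "12345678" (list 3) ",")  ; ⇒ "12345,678"
```

The second call shows the non-circular case: once the list is exhausted, the remaining digits are left ungrouped.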
The following procedures deal with the representation of monetary amounts. Some of them take an additional intl? argument (a boolean) that tells whether the international or local monetary conventions for the given locale are to be used.
These are the monetary counterparts of the above procedures: they return the same kind of information, but for monetary amounts.
Return the currency symbol (a string) of either locale or the current locale.
The following example illustrates the difference between the local and international monetary formats:
(define us (make-locale LC_MONETARY "en_US"))
(locale-currency-symbol #f us)
⇒ "-$"
(locale-currency-symbol #t us)
⇒ "USD "
Return the number of fractional digits to be used when printing monetary amounts according to either locale or the current locale. If the locale does not specify it, then #f is returned.
These procedures return a boolean indicating whether the currency symbol should precede a positive/negative number, and whether a whitespace should be inserted between the currency symbol and a positive/negative amount.
Return a string denoting the positive (respectively negative) sign that should be used when printing a monetary amount.
These functions return a symbol telling where a sign of a positive/negative monetary amount is to appear when printing it. The possible values are:
parenthesize - The currency symbol and quantity should be surrounded by parentheses.
sign-before - Print the sign string before the quantity and currency symbol.
sign-after - Print the sign string after the quantity and currency symbol.
sign-before-currency-symbol - Print the sign string right before the currency symbol.
sign-after-currency-symbol - Print the sign string right after the currency symbol.
unspecified - Unspecified. We recommend you print the sign after the currency symbol.
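To make the meaning of these symbols concrete, here is a sketch of how a program might assemble a monetary string from its parts. The procedure place-sign and its parameters are hypothetical; in particular, symbol-precedes? stands for the boolean that tells whether the currency symbol goes before the quantity, and spacing between the symbol and the quantity is ignored for brevity.

```scheme
;; Hypothetical helper: combine SIGN, SYMBOL and QUANTITY (all strings)
;; according to POSITION, one of the symbols listed above.
(define (place-sign position sign symbol quantity symbol-precedes?)
  (let ((amount (if symbol-precedes?
                    (string-append symbol quantity)
                    (string-append quantity symbol))))
    (case position
      ((parenthesize) (string-append "(" amount ")"))
      ((sign-before) (string-append sign amount))
      ((sign-after) (string-append amount sign))
      ((sign-before-currency-symbol)
       (if symbol-precedes?
           (string-append sign symbol quantity)
           (string-append quantity sign symbol)))
      ;; `sign-after-currency-symbol', and also `unspecified', for
      ;; which the text above recommends the same placement.
      (else
       (if symbol-precedes?
           (string-append symbol sign quantity)
           (string-append quantity symbol sign))))))

(place-sign 'parenthesize "-" "$" "5.00" #t)                ; ⇒ "($5.00)"
(place-sign 'sign-after-currency-symbol "-" "$" "5.00" #t)  ; ⇒ "$-5.00"
```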
Finally, the two following procedures may be helpful when programming user interfaces:
Return a string that can be used as a regular expression to recognize a positive (respectively, negative) response to a yes/no question. For the C locale, the default values are typically "^[yY]" and "^[nN]", respectively.
Here is an example:
(use-modules (ice-9 i18n) (ice-9 rdelim) (ice-9 regex))
(format #t "Does Guile rock?~%")
(let lp ((answer (read-line)))
  (cond ((string-match (locale-yes-regexp) answer)
         (format #t "High fives!~%"))
        ((string-match (locale-no-regexp) answer)
         (format #t "How about now? Does it rock yet?~%")
         (lp (read-line)))
        (else
         (format #t "What do you mean?~%")
         (lp (read-line)))))
For an internationalized yes/no string output, gettext should be used (see Gettext Support).
Example uses of some of these functions are the implementation of the
number->locale-string and monetary-amount->locale-string
procedures (see Number Input and Output), as well as that of the
SRFI-19 date and time conversion to/from strings (see SRFI-19).
Guile provides an interface to GNU gettext for translating
message strings (see Introduction).
Messages are collected in domains, so different libraries and programs maintain different message catalogues. The domain parameter in the functions below is a string (it becomes part of the message catalog filename).
When gettext is not available, or if Guile was configured
with ‘--without-nls’, dummy functions doing no translation are
provided. When gettext support is available in Guile, the
i18n feature is provided (see Feature Tracking).
Return the translation of msg in domain. domain is optional and defaults to the domain set through textdomain below. category is optional and defaults to LC_MESSAGES (see Locales).
Normal usage is for msg to be a literal string. xgettext can extract those from the source to form a message catalogue ready for translators (see Invoking the xgettext Program).
(display (gettext "You are in a maze of twisty passages."))
_ is a commonly used shorthand; an application can make that an alias for gettext. Or a library can make a definition that uses its specific domain (so an application can change the default without affecting the library).
(define (_ msg) (gettext msg "mylibrary"))
(display (_ "File not found."))
_ is also a good place to perhaps strip disambiguating extra text from the message string, as for instance in How to use gettext in GUI programs.
Return the translation of msg/msgplural in domain, with a plural form chosen appropriately for the number n. domain is optional and defaults to the domain set through textdomain below. category is optional and defaults to LC_MESSAGES (see Locales).
msg is the singular form, and msgplural the plural. When no translation is available, msg is used if n = 1, or msgplural otherwise. When translated, the message catalogue can have a different rule, and can have more than two possible forms.
As per gettext above, normal usage is for msg and msgplural to be literal strings, since xgettext can extract them from the source to build a message catalogue. For example,
(define (done n)
  (format #t (ngettext "~a file processed\n"
                       "~a files processed\n" n)
          n))
(done 1) -| 1 file processed
(done 3) -| 3 files processed
It's important to use ngettext rather than plain gettext for plurals, since the rules for singular and plural forms in English are not the same in other languages. Only ngettext will allow translators to give correct forms (see Additional functions for plural forms).
Get or set the default gettext domain. When called with no parameter the current domain is returned. When called with a parameter, domain is set as the current domain, and that new value returned. For example,
(textdomain "myprog") ⇒ "myprog"
Get or set the directory under which to find message files for domain. When called without a directory the current setting is returned. When called with a directory, directory is set for domain and that new setting returned. For example,
(bindtextdomain "myprog" "/my/tree/share/locale") ⇒ "/my/tree/share/locale"
When using Autoconf/Automake, an application should arrange for the configured localedir to get into the program (by substituting, or by generating a config file) and set that for its domain. This ensures the catalogue can be found even when installed in a non-standard location.
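Putting the pieces together, a program's start-up code might look like the following sketch. The domain name "myprog" and the directory are the illustrative values used above, not real installation paths; the catch around setlocale is an assumption made here so that the sketch still runs when the environment names a locale that is not installed.

```scheme
;; Adopt the user's locale from the environment; ignore failure.
(catch 'system-error
  (lambda () (setlocale LC_ALL ""))
  (lambda args #f))

;; Tell gettext where the catalogues live and which domain to use.
(bindtextdomain "myprog" "/my/tree/share/locale")
(textdomain "myprog")

;; Shorthand bound to this program's domain.
(define (_ msg) (gettext msg "myprog"))

(display (_ "File not found."))
(newline)
```

When no catalogue is found for the domain, gettext returns its argument unchanged, so the program keeps working with the untranslated messages.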
Get or set the text encoding to be used by gettext for messages from domain. encoding is a string, the name of a coding system, for instance "8859_1". (On a Unix/POSIX system the iconv program can list all available encodings.)
When called without an encoding the current setting is returned, or #f if none yet set. When called with an encoding, it is set for domain and that new setting returned. For example,
(bind-textdomain-codeset "myprog") ⇒ #f
(bind-textdomain-codeset "myprog" "latin-9") ⇒ "latin-9"
The encoding requested can be different from that of the translated data file; messages will be recoded as necessary. But note that when there is no translation, gettext returns its msg unchanged, i.e., without any recoding. For that reason source message strings are best as plain ASCII.
Currently Guile has no understanding of multi-byte characters, and string functions won't recognise character boundaries in multi-byte strings. An application will at least be able to pass such strings through to some output though. Perhaps this will change in the future.
In order to understand Guile's debugging facilities, you first need to understand a little about how Guile represents the Scheme control stack. With that in place we explain the low level trap calls that the virtual machine can be configured to make, and the trap and breakpoint infrastructure that builds on top of those calls.
The idea of the Scheme stack is central to a lot of debugging. The Scheme stack is a reified representation of the pending function returns in an expression's continuation. As Guile implements function calls using a stack, this reification takes the form of a number of nested stack frames, each of which corresponds to the application of a procedure to a set of arguments.
A Scheme stack always exists implicitly, and can be summoned into
concrete existence as a first-class Scheme value by the
make-stack call, so that an introspective Scheme program – such
as a debugger – can present it in some way and allow the user to query
its details. The first thing to understand, therefore, is how Guile's
function call convention creates the stack.
Broadly speaking, Guile represents all control flow on a stack. Calling a function involves pushing an empty frame on the stack, then evaluating the procedure and its arguments, then fixing up the new frame so that it points to the old one. Frames on the stack are thus linked together. A tail call is the same, except it reuses the existing frame instead of pushing on a new one.
In this way, the only frames that are on the stack are “active” frames, frames which need to do some work before the computation is complete. On the other hand, a function that has tail-called another function will not be on the stack, as it has no work left to do.
Therefore, when an error occurs in a running program, or the program hits a breakpoint, or in fact at any point that the programmer chooses, its state at that point can be represented by a stack of all the procedure applications that are logically in progress at that time, each of which is known as a frame. The programmer can learn more about the program's state at that point by inspecting the stack and its frames.
A Scheme program can use the make-stack primitive anywhere in its
code, with first arg #t, to construct a Scheme value that
describes the Scheme stack at that point.
(make-stack #t)
⇒
#<stack 25205a0>
Use start-stack to limit the stack extent captured by future
make-stack calls.
Create a new stack. If obj is #t, the current evaluation stack is used for creating the stack frames, otherwise the frames are taken from obj (which must be a continuation or a frame object).
args should be a list containing any combination of integer, procedure, prompt tag and #t values.
These values specify various ways of cutting away uninteresting stack frames from the top and bottom of the stack that make-stack returns. They come in pairs like this: (inner_cut_1 outer_cut_1 inner_cut_2 outer_cut_2 ...).
Each inner_cut_N can be #t, an integer, a prompt tag, or a procedure. #t means to cut away all frames up to but excluding the first user module frame. An integer means to cut away exactly that number of frames. A prompt tag means to cut away all frames that are inside a prompt with the given tag. A procedure means to cut away all frames up to but excluding the application frame whose procedure matches the specified one.
Each outer_cut_N can be an integer, a prompt tag, or a procedure. An integer means to cut away that number of frames. A prompt tag means to cut away all frames that are outside a prompt with the given tag. A procedure means to cut away frames down to but excluding the application frame whose procedure matches the specified one.
If the outer_cut_N of the last pair is missing, it is taken as 0.
Evaluate exp on a new calling stack with identity id. If exp is interrupted during evaluation, backtraces will not display frames farther back than exp's top-level form. This macro is a way of artificially limiting backtraces and stack procedures, largely as a convenience to the user.
Return the identifier given to stack by
start-stack.
Return the length of stack.
Return the index'th frame from stack.
Display a backtrace to the output port port. stack is the stack to take the backtrace from, first specifies where in the stack to start and depth how many frames to display. first and depth can be
#f, which means that default values will be used. If highlights is given it should be a list; the elements of this list will be highlighted wherever they appear in the backtrace.
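As a small sketch of how these procedures fit together, the following captures the current stack with make-stack, reports its length, and renders a backtrace into a string with display-backtrace. The procedure name describe-current-stack is hypothetical, and the exact text produced depends on where it is called from.

```scheme
;; Hypothetical helper: return a string describing the stack at the
;; point of the call.
(define (describe-current-stack)
  (let ((stack (make-stack #t)))
    (call-with-output-string
      (lambda (port)
        (format port "~a frames captured~%" (stack-length stack))
        ;; FIRST and DEPTH are omitted, so default values are used.
        (display-backtrace stack port)))))

(display (describe-current-stack))
```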
Return the previous frame of frame, or #f if frame is the first frame in its stack.
Return the procedure for frame, or #f if no procedure is associated with frame.
Return the arguments of frame.
Accessors for the three VM registers associated with this frame: the frame pointer (fp), instruction pointer (ip), and stack pointer (sp), respectively. See VM Concepts, for more information.
Accessors for the three saved VM registers in a frame: the previous frame pointer, the single-value return address, and the multiple-value return address. See Stack Layout, for more information.
Accessors for the temporary values corresponding to frame's procedure application. The first local is the first argument given to the procedure. After the arguments, there are the local variables, and after that temporary values. See Stack Layout, for more information.
Display a procedure application frame to the output port port. indent specifies the indentation of the output.
Additionally, the (system vm frame) module defines a number of
higher-level introspective procedures, for example to retrieve the names
of local variables, and the source location that corresponds to a
frame. See its source code for more details.
As Guile reads in Scheme code from file or from standard input, it remembers the file name, line number and column number where each expression begins. These pieces of information are known as the source properties of the expression. Syntax expanders and the compiler propagate these source properties to compiled procedures, so that, if an error occurs when evaluating the transformed expression, Guile's debugger can point back to the file and location where the expression originated.
The way that source properties are stored means that Guile can only
associate source properties with parenthesized expressions, and not, for
example, with individual symbols, numbers or strings. The difference
can be seen by typing (xxx) and xxx at the Guile prompt
(where the variable xxx has not been defined):
scheme@(guile-user)> (xxx)
<unnamed port>:4:1: In procedure module-lookup:
<unnamed port>:4:1: Unbound variable: xxx
scheme@(guile-user)> xxx
ERROR: In procedure module-lookup:
ERROR: Unbound variable: xxx
In the latter case, no source properties were stored, so the error doesn't have any source information.
The recording of source properties is controlled by the read option named “positions” (see Scheme Read). This option is switched on by default.
The following procedures can be used to access and set the source properties of read expressions.
Install the association list alist as the source property list for obj.
Set the source property of object obj, which is specified by key to datum. Normally, the key will be a symbol.
Return the source property association list of obj.
Return the property specified by key from obj's source properties.
If the positions reader option is enabled, each parenthesized
expression will have values set for the filename, line and
column properties.
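For illustration, the sketch below reads an expression from a string port and then inspects and extends its source properties with the procedures just described. The file name "example.scm" is made up for the example; which properties the reader records for a string port may vary, so only the property set explicitly is relied upon.

```scheme
;; Read a parenthesized expression; the reader may attach source
;; properties such as `line' and `column' to it.
(define expr (call-with-input-string "(display \"hello\")" read))

;; The whole association list, then a single property.
(source-properties expr)
(source-property expr 'line)

;; Attach a property by hand and read it back.
(set-source-property! expr 'filename "example.scm")
(source-property expr 'filename)    ; ⇒ "example.scm"
```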
Source properties are also associated with syntax objects. Procedural
macros can get at the source location of their input using the
syntax-source accessor. See Syntax Transformer Helpers, for
more.
Guile also defines a couple of convenience macros built on
syntax-source:
Expands to the source properties corresponding to the location of the (current-source-location) form.
Expands to the current filename: the filename that the (current-filename) form appears in. Expands to #f if this information is unavailable.
If you're stuck with defmacros (see Defmacros), and want to preserve source information, the following helper function might be useful to you:
Create and return a new pair whose car and cdr are x and y. Any source properties associated with xorig are also associated with the new pair.
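A short sketch of this helper in isolation: the new pair is built from unrelated parts, yet carries the source properties of the original expression. The names orig and rewritten are illustrative only.

```scheme
;; An expression as the reader would deliver it to a macro.
(define orig (call-with-input-string "(my-when ok (go))" read))

;; Build a replacement form headed by `if', keeping ORIG's source
;; properties on the new outermost pair.
(define rewritten
  (cons-source orig 'if (list (cadr orig) (caddr orig))))

rewritten                                  ; ⇒ (if ok (go))
(equal? (source-properties rewritten)
        (source-properties orig))          ; ⇒ #t
```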
For better or for worse, all programs have bugs, and dealing with bugs is part of programming. This section deals with that class of bugs that causes an exception to be raised – from your own code, from within a library, or from Guile itself.
A common requirement is to be able to show as much useful context as
possible when a Scheme program hits an error. The most immediate
information about an error is the kind of error that it is – such as
“division by zero” – and any parameters that the code which signalled
the error chose explicitly to provide. This information originates with
the error or throw call (or their C code equivalents, if
the error is detected by C code) that signals the error, and is passed
automatically to the handler procedure of the innermost applicable
catch or with-throw-handler expression.
Therefore, to catch errors that occur within a chunk of Scheme code, and
to intercept basic information about those errors, you need to execute
that code inside the dynamic context of a catch or
with-throw-handler expression, or the equivalent in C. In Scheme,
this means you need something like this:
(catch #t
       (lambda ()
         ;; Execute the code in which
         ;; you want to catch errors here.
         ...)
       (lambda (key . parameters)
         ;; Put the code which you want
         ;; to handle an error here.
         ...))
The catch here can also be with-throw-handler; see
Throw Handlers for information on when you might want to use
with-throw-handler instead of catch.
For example, to print out a message and return #f when an error occurs, you might use:
(define (catch-all thunk)
  (catch #t
    thunk
    (lambda (key . parameters)
      (format (current-error-port)
              "Uncaught throw to '~a: ~a\n" key parameters)
      #f)))
(catch-all
 (lambda () (error "Not a vegetable: tomato")))
-| Uncaught throw to 'misc-error: (#f ~A (Not a vegetable: tomato) #f)
⇒ #f
The #t means that the catch is applicable to all kinds of error.
If you want to restrict your catch to just one kind of error, you can
put the symbol for that kind of error instead of #t. The
equivalent to this in C would be something like this:
SCM my_body_proc (void *body_data)
{
  /* Execute the code in which
     you want to catch errors here. */
  ...
}
SCM my_handler_proc (void *handler_data,
                     SCM key,
                     SCM parameters)
{
  /* Put the code which you want
     to handle an error here. */
  ...
}
{
  ...
  scm_c_catch (SCM_BOOL_T,
               my_body_proc, body_data,
               my_handler_proc, handler_data,
               NULL, NULL);
  ...
}
Again, as with the Scheme version, scm_c_catch could be replaced
by scm_c_with_throw_handler, and SCM_BOOL_T could instead
be the symbol for a particular kind of error.
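Returning to Scheme, here is a minimal sketch of a catch restricted to one kind of error, as described above. The helper safe-div is hypothetical; numerical-overflow is the key that Guile throws when an exact number is divided by exact zero. Any other kind of error propagates past this catch untouched.

```scheme
;; Hypothetical helper: divide A by B, returning #f instead of
;; signalling an error when the division overflows.
(define (safe-div a b)
  (catch 'numerical-overflow
    (lambda () (/ a b))
    (lambda (key . parameters) #f)))

(safe-div 10 2)    ; ⇒ 5
(safe-div 10 0)    ; ⇒ #f
```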
The other interesting information about an error is the full Scheme stack at the point where the error occurred; in other words what innermost expression was being evaluated, what was the expression that called that one, and so on. If you want to write your code so that it captures and can display this information as well, there are a couple of important things to understand.
Firstly, the stack at the point of the error needs to be explicitly
captured by a make-stack call (or the C equivalent
scm_make_stack). The Guile library does not do this
“automatically” for you, so you will need to write code with a
make-stack or scm_make_stack call yourself. (We emphasise
this point because some people are misled by the fact that the Guile
interactive REPL code does capture and display the stack
automatically. But the Guile interactive REPL is itself a Scheme
program
running on top of the Guile library, and which uses catch and
make-stack in the way we are about to describe to capture the
stack when an error occurs.)
And secondly, in order to capture the stack effectively at the point
where the error occurred, the make-stack call must be made before
Guile unwinds the stack back to the location of the prevailing catch
expression. This means that the make-stack call must be made
within the handler of a with-throw-handler expression, or the
optional "pre-unwind" handler of a catch. (For the full story of
how these alternatives differ from each other, see Exceptions. The
main difference is that catch terminates the error, whereas
with-throw-handler only intercepts it temporarily and then allows
it to continue propagating up to the next innermost handler.)
So, here are some examples of how to do all this in Scheme and in C. For the purpose of these examples we assume that the captured stack should be stored in a variable, so that it can be displayed or arbitrarily processed later on. In Scheme:
(let ((captured-stack #f))
  (catch #t
         (lambda ()
           ;; Execute the code in which
           ;; you want to catch errors here.
           ...)
         (lambda (key . parameters)
           ;; Put the code which you want
           ;; to handle an error after the
           ;; stack has been unwound here.
           ...)
         (lambda (key . parameters)
           ;; Capture the stack here:
           (set! captured-stack (make-stack #t))))
  ...
  (if captured-stack
      (begin
        ;; Display or process the captured stack.
        ...))
  ...)
And in C:
SCM my_body_proc (void *body_data)
{
  /* Execute the code in which
     you want to catch errors here. */
  ...
}
SCM my_handler_proc (void *handler_data,
                     SCM key,
                     SCM parameters)
{
  /* Put the code which you want
     to handle an error after the
     stack has been unwound here. */
  ...
}
SCM my_preunwind_proc (void *handler_data,
                       SCM key,
                       SCM parameters)
{
  /* Capture the stack here: */
  *(SCM *)handler_data = scm_make_stack (SCM_BOOL_T, SCM_EOL);
  /* The return value of a pre-unwind handler is ignored. */
  return SCM_UNSPECIFIED;
}
{
  SCM captured_stack = SCM_BOOL_F;
  ...
  scm_c_catch (SCM_BOOL_T,
               my_body_proc, body_data,
               my_handler_proc, handler_data,
               my_preunwind_proc, &captured_stack);
  ...
  if (scm_is_true (captured_stack))
    {
      /* Display or process the captured stack. */
      ...
    }
  ...
}
Once you have a captured stack, you can interrogate and display its
details in any way that you want, using the stack-... and
frame-... API described in Stacks and
Frames.
If you want to print out a backtrace in the same format that the Guile
REPL does, you can use the display-backtrace procedure to do so.
You can also use display-application to display an individual
frame in the Guile REPL format.
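The two steps, capturing the stack before the unwind and displaying it afterwards, can be combined as in the following sketch. The procedure call-with-backtrace-string is hypothetical; it returns the thunk's value when no error occurs, and otherwise a string holding the backtrace of the captured stack.

```scheme
;; Hypothetical helper: call THUNK; on error, return the backtrace at
;; the point of the error as a string.
(define (call-with-backtrace-string thunk)
  (let ((captured-stack #f))
    (catch #t
      thunk
      ;; Runs after the stack has been unwound.
      (lambda (key . parameters)
        (and captured-stack
             (call-with-output-string
               (lambda (port)
                 (display-backtrace captured-stack port)))))
      ;; Pre-unwind handler: the erroring frames are still on the stack.
      (lambda (key . parameters)
        (set! captured-stack (make-stack #t))))))

(display
 (call-with-backtrace-string
  (lambda () (error "Not a vegetable: tomato"))))
```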
Instead of saving a stack away and waiting for the catch to
return, you can handle errors directly, from within the pre-unwind
handler.
For example, to show a backtrace when an error is thrown, you might want to use a procedure like this:
(define (with-backtrace thunk)
  (with-throw-handler #t
                      thunk
                      (lambda args (backtrace))))
(with-backtrace (lambda () (error "Not a vegetable: tomato")))
Since we used with-throw-handler here, we didn't actually catch
the error. See Throw Handlers, for more information. However, we did
print out a context at the time of the error, using the built-in
procedure, backtrace.
Display a backtrace of the current stack to the current output port. If highlights is given it should be a list; the elements of this list will be highlighted wherever they appear in the backtrace.
The Guile REPL code (in system/repl/repl.scm and related files)
uses a catch with a pre-unwind handler to capture the stack when
an error occurs in an expression that was typed into the REPL, and debug
that stack interactively in the context of the error.
These procedures are available for use by user programs, in the
(system repl error-handling) module.
(use-modules (system repl error-handling))
Call a thunk in a context in which errors are handled.
There are four keyword arguments:
- on-error: Specifies what to do before the stack is unwound. Valid options are debug (the default), which will enter a debugger; pass, in which case nothing is done, and the exception is rethrown; or a procedure, which will be the pre-unwind handler.
- post-error: Specifies what to do after the stack is unwound. Valid options are catch (the default), which will silently catch errors, returning the unspecified value; report, which prints out a description of the error (via display-error), and then returns the unspecified value; or a procedure, which will be the catch handler.
- trap-handler: Specifies a trap handler: what to do when a breakpoint is hit. Valid options are debug, which will enter the debugger; pass, which does nothing; or disabled, which disables traps entirely. See Traps, for more information.
- pass-keys: A set of keys to ignore, as a list.
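As a hedged sketch of the keyword arguments above, the following call does nothing before the unwind (pass) and supplies a procedure as the post-unwind handler, so that the error key becomes the result instead of the debugger being entered. The helper name error-key-of is hypothetical.

```scheme
(use-modules (system repl error-handling))

;; Hypothetical helper: run THUNK and, if it signals an error, return
;; the error's key instead of entering the debugger.
(define (error-key-of thunk)
  (call-with-error-handling thunk
    #:on-error 'pass
    #:post-error (lambda (key . args) key)))

(error-key-of (lambda () (error "Not a vegetable: tomato")))
;; ⇒ misc-error
```

Passing procedures for both handlers keeps the example non-interactive; with the defaults, an error in a program run at a terminal would enter the debugger.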
The behavior of the backtrace procedure and of the default error
handler can be parameterized via the debug options.
Display the current settings of the debug options. If setting is omitted, only a short form of the current debug options is printed. Otherwise if setting is the symbol help, a complete options description is displayed.
The set of available options, and their default values, may be had by
invoking debug-options at the prompt.
scheme@(guile-user)> (debug-options 'help)
backwards       no      Display backtrace in anti-chronological order.
width           79      Maximal width of backtrace.
depth           20      Maximal length of printed backtrace.
backtrace       yes     Show backtrace on error.
stack           1048576 Stack size limit (measured in words;
                        0 = no check).
show-file-name  #t      Show file names and line numbers in backtraces
                        when not `#f'.  A value of `base' displays only
                        base names, while `#t' displays full names.
warn-deprecated no      Warn when deprecated features are used.
The boolean options may be toggled with debug-enable and
debug-disable. Non-boolean options must be set
using debug-set!.
Modify the debug options. debug-enable should be used with boolean options and switches them on, debug-disable switches them off.
debug-set! can be used to set an option to a specific value. Due to historical oddities, it is a macro that expects an unquoted option name.
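For instance, the following sketch toggles one boolean option and sets one non-boolean option from the table above. Note the difference in quoting: debug-enable and debug-disable are procedures and take quoted symbols, whereas debug-set! is a macro and takes the option name unquoted.

```scheme
;; Print backtraces in anti-chronological order, then switch back.
(debug-enable 'backwards)
(debug-disable 'backwards)

;; Allow wider backtrace lines.
(debug-set! width 100)
```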
Stack overflow errors are caused by a computation trying to use more
stack space than has been enabled by the stack option. There are
actually two kinds of stack that can overflow, the C stack and the
Scheme stack.
Scheme stack overflows can occur if Scheme procedures recurse too deeply. An example would be the following recursive loop:
scheme@(guile-user)> (let lp () (+ 1 (lp)))
<unnamed port>:8:17: In procedure vm-run:
<unnamed port>:8:17: VM: Stack overflow
The default stack size should allow for about 10000 frames or so, so one usually doesn't hit this level of recursion. Unfortunately there is no way currently to make a VM with a bigger stack. If you are in this unfortunate situation, please file a bug, and in the meantime, rewrite your code to be tail-recursive (see Tail Calls).
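To illustrate the suggested rewrite, compare the two hypothetical procedures below. The first leaves an addition pending around every recursive call, so each iteration needs its own frame; the second carries the running total in an accumulator, so the recursive call is a tail call and reuses the existing frame.

```scheme
;; Not tail-recursive: stack use grows with N, so a large N may
;; overflow the stack.
(define (sum-to/deep n)
  (if (= n 0)
      0
      (+ n (sum-to/deep (- n 1)))))

;; Tail-recursive: runs in constant stack space.
(define (sum-to n)
  (let lp ((i n) (acc 0))
    (if (= i 0)
        acc
        (lp (- i 1) (+ acc i)))))

(sum-to 100000)    ; ⇒ 5000050000
```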
The other limit you might hit would be C stack overflows. If you call a primitive procedure which then calls a Scheme procedure in a loop, you will consume C stack space. Guile tries to detect excessive consumption of C stack space, throwing an error when you have hit 80% of the process' available stack (as allocated by the operating system), or 160 kilowords in the absence of a strict limit.
For example, looping through call-with-vm, a primitive that calls
a thunk, gives us the following:
scheme@(guile-user)> (use-modules (system vm vm))
scheme@(guile-user)> (debug-set! stack 10000)
scheme@(guile-user)> (let lp () (call-with-vm (the-vm) lp))
ERROR: In procedure call-with-vm:
ERROR: Stack overflow
If you get an error like this, you can either try rewriting your code to
use less stack space, or increase the maximum stack size. To increase
the maximum stack size, use debug-set!, for example:
(debug-set! stack 200000)
But of course it's better to have your code operate without so much resource consumption, avoiding loops through C trampolines.
Guile's virtual machine can be configured to call out at key points to arbitrary user-specified procedures.
In principle, these hooks allow Scheme code to implement any model it chooses for examining the evaluation stack as program execution proceeds, and for suspending execution to be resumed later.
VM hooks are very low-level, though, and so Guile also has a library of higher-level traps on top of the VM hooks. A trap is an execution condition that, when fulfilled, will fire a handler. For example, Guile defines a trap that fires when control reaches a certain source location.
Finally, Guile also defines a third level of abstractions: per-thread trap states. A trap state exists to give names to traps, and to hold on to the set of traps so that they can be enabled, disabled, or removed. The trap state infrastructure defines the most useful abstractions for most cases. For example, Guile's REPL uses trap state functions to set breakpoints and tracepoints.
The following subsections describe all this in detail, for both the user wanting to use traps, and the developer interested in understanding how the interface hangs together.
Everything that runs in Guile runs on its virtual machine, a C program that defines a number of operations that Scheme programs can perform.
Note that there are multiple VM “engines” for Guile. Only some of them have support for hooks compiled in. Normally the deal is that you get hooks if you are running interactively, and otherwise they are disabled, as they do have some overhead (about 10 or 20 percent).
To ensure that you are running with hooks, pass --debug to Guile
when running your program, or otherwise use the call-with-vm and
set-vm-engine! procedures to ensure that you are running in a VM
with the debug engine.
To digress, Guile's VM has 6 different hooks (see Hooks) that can be fired at different times, which may be accessed with the following procedures.
All hooks are called with one argument, the frame in question. See Frames. Since these hooks may be fired very frequently, Guile does a terrible thing: it allocates the frames on the C stack instead of the garbage-collected heap.
The upshot here is that the frames are only valid within the dynamic extent of the call to the hook. If a hook procedure keeps a reference to the frame outside the extent of the hook, bad things will happen.
The interface to hooks is provided by the (system vm vm) module:
(use-modules (system vm vm))
The result of calling the-vm is usually passed as the vm
argument to all of these procedures.
The hook that will be fired before an instruction is retired (and executed).
The hook that will be fired after preparing a new frame. Fires just before applying a procedure in a non-tail context, just before the corresponding apply-hook.
The hook that will be fired before returning from a frame.
This hook is a bit trickier than the rest, in that there is a particular interpretation of the values on the stack. Specifically, the top value on the stack is the number of values being returned, and the next n values are the actual values being returned, with the last value highest on the stack.
The hook that will be fired before a procedure is applied. The frame's procedure will have already been set to the new procedure.
Note that procedure application is somewhat orthogonal to continuation pushes and pops. A non-tail call to a procedure will result first in a firing of the push-continuation hook, then this application hook, whereas a tail call will run without having fired a push-continuation hook.
The hook that will be called after aborting to a prompt. See Prompts. The stack will be in the same state as for
vm-pop-continuation-hook.
The hook that will be called after restoring an undelimited continuation. Unfortunately it's not currently possible to introspect on the values that were given to the continuation.
These hooks do impose a performance penalty, if they are on. Obviously,
the vm-next-hook has quite an impact, performance-wise. Therefore
Guile exposes a single, heavy-handed knob to turn hooks on or off, the
VM trace level. If the trace level is positive, hooks run;
otherwise they don't.
For convenience, when the VM fires a hook, it does so with the trace level temporarily set to 0. That way the hooks don't fire while you're handling a hook. The trace level is restored to whatever it was once the hook procedure finishes.
Retrieve the “trace level” of the VM. If positive, the trace hooks associated with vm will be run. The initial trace level is 0.
See A Virtual Machine for Guile, for more information on Guile's virtual machine.
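The following is a minimal sketch of installing a hook by hand, assuming the Guile 2.0 interface of (system vm vm) in which vm-apply-hook and set-vm-trace-level! take the VM as an argument, and assuming a VM engine with hooks compiled in. The counter and the procedure count-application are hypothetical names used only for illustration.

```scheme
(use-modules (system vm vm))

(define apply-count 0)

;; A hook procedure receives the frame as its only argument. The frame is
;; only valid during this call, so nothing here keeps a reference to it.
(define (count-application frame)
  (set! apply-count (+ apply-count 1)))

(define vm (the-vm))
(define old-level (vm-trace-level vm))

(add-hook! (vm-apply-hook vm) count-application)

(dynamic-wind
  (lambda () (set-vm-trace-level! vm 1))          ; positive level: hooks run
  (lambda () (map 1+ '(1 2 3)))
  (lambda () (set-vm-trace-level! vm old-level))) ; restore the earlier level

(remove-hook! (vm-apply-hook vm) count-application)
(format #t "procedure applications observed: ~a~%" apply-count)
```

The number printed depends on the engine in use and on how the traced code is compiled, so no particular count should be relied upon.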
The capabilities provided by hooks are great, but hooks alone rarely correspond to what users want to do.
For example, if a user wants to break when and if control reaches a certain source location, how do you do it? If you install a “next” hook, you get unacceptable overhead for the execution of the entire program. It would be possible to install an “apply” hook, then if the procedure encompasses those source locations, install a “next” hook, but already you're talking about one concept that might be implemented by a varying number of lower-level concepts.
It's best to be clear about things and define one abstraction for all such conditions: the trap.
Considering the myriad capabilities offered by the hooks though, there is only a minimum of functionality shared by all traps. Guile's current take is to reduce this to the absolute minimum, and have the only standard interface of a trap be “turn yourself on” or “turn yourself off”.
This interface sounds a bit strange, but it is useful to procedurally compose higher-level traps from lower-level building blocks. For example, Guile defines a trap that calls one handler when control enters a procedure, and another when control leaves the procedure. Given that trap, one can define a trap that adds to the next-hook only when within a given procedure. Building further, one can define a trap that fires when control reaches particular instructions within a procedure.
Or of course you can stop at any of these intermediate levels. For example, one might only be interested in calls to a given procedure. But the point is that a simple enable/disable interface is all the commonality that exists between the various kinds of traps, and furthermore that such an interface serves to allow “higher-level” traps to be composed from more primitive ones.
Specifically, a trap, in Guile, is a procedure. When a trap is created, by convention the trap is enabled; therefore, the procedure that is the trap will, when called, disable the trap, and return a procedure that will enable the trap, and so on.
Trap procedures take one optional argument: the current frame. (A trap may want to add to different sets of hooks depending on the frame that is current at enable-time.)
If this all sounds very complicated, it's because it is. Some of it is essential, but probably most of it is not. The advantage of using this minimal interface is that composability is more lexically apparent than when, for example, using a stateful interface based on GOOPS. But perhaps this reflects the cognitive limitations of the programmer who made the current interface more than anything else.
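The calling convention can be sketched without touching the VM at all. In the toy below, make-toy-trap, enable! and disable! are hypothetical names: the two thunks stand in for whatever a real trap would add to or remove from the VM hooks.

```scheme
;; Build a trap from two thunks. The returned procedure is the trap in its
;; enabled state; calling it disables the trap and returns a procedure that
;; re-enables it, and so on. The optional argument is the current frame.
(define (make-toy-trap enable! disable!)
  (define (enabled-trap . frame)
    (disable!)
    disabled-trap)
  (define (disabled-trap . frame)
    (enable!)
    enabled-trap)
  ;; By convention, a freshly created trap is enabled.
  (enable!)
  enabled-trap)

(define state 'off)
(define trap
  (make-toy-trap (lambda () (set! state 'on))
                 (lambda () (set! state 'off))))
(define state-after-create state)     ; on

(define re-enable (trap))             ; disables the trap
(define state-after-disable state)    ; off

(define disable-again (re-enable))    ; enables it again
(define state-after-enable state)     ; on
```

Because each call returns the procedure for the opposite transition, a higher-level trap can hold on to the returned procedures of its components and flip them together.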
To summarize the last sections, traps are enabled or disabled, and when they are enabled, they add to various VM hooks.
Note, however, that traps do not increase the VM trace level. So if you create a trap, it will be enabled, but unless something else increases the VM's trace level (see VM Hooks), the trap will not fire. It turns out that getting the VM trace level right is tricky without a global view of what traps are enabled. See Trap States, for Guile's answer to this problem.
Traps are created by calling procedures. Most of these procedures share a set of common keyword arguments, so rather than document them separately, we discuss them all together here:
- #:vm
- #:closure? (defaults to #f)
- #:current-frame (defaults to #f)
To have access to these procedures, you'll need to have imported the
(system vm traps) module:
(use-modules (system vm traps))
A trap that calls handler when proc is applied.
A trap that calls enter-handler when control enters proc, and exit-handler when control leaves proc.
Control can enter a procedure via:
- A procedure call.
- A return to a procedure's frame on the stack.
- A continuation returning directly to an application of this procedure.
Control can leave a procedure via:
- A normal return from the procedure.
- An application of another procedure.
- An invocation of a continuation.
- An abort.
A trap that calls next-handler for every instruction executed in proc, and exit-handler when execution leaves proc.
A trap that calls handler when execution enters a range of instructions in proc. range is a simple list of pairs,
((start . end) ...). The start addresses are inclusive, and end addresses are exclusive.
A trap that fires when control reaches a given source location. The user-line parameter is one-indexed, as a user counts lines, instead of zero-indexed, as Guile counts lines.
A trap that fires when control leaves the given frame. frame should be a live frame in the current continuation. return-handler will be called on a normal return, and abort-handler on a nonlocal exit.
A more traditional dynamic-wind trap, which fires enter-handler when control enters proc, return-handler on a normal return, and abort-handler on a nonlocal exit.
Note that rewinds are not handled, so there is no rewind handler.
A trap that calls apply-handler every time a procedure is applied, and return-handler for returns, but only during the dynamic extent of an application of proc.
A trap that calls next-handler for all retired instructions within the dynamic extent of a call to proc.
A trap that calls apply-handler whenever proc is applied, and return-handler when it returns, but with an additional argument, the call depth.
That is to say, the handlers will get two arguments: the frame in question, and the call depth (a non-negative integer).
A trap that calls frame-pred at every instruction, and if frame-pred returns a true value, calls handler on the frame.
The (system vm trace) module defines a number of traps for
tracing of procedure applications. When a procedure is traced, it
means that every call to that procedure is reported to the user during a
program run. The idea is that you can mark a collection of procedures
for tracing, and Guile will subsequently print out a line of the form
| | (procedure args ...)
whenever a marked procedure is about to be applied to its arguments. This can help a programmer determine whether a function is being called at the wrong time or with the wrong set of arguments.
In addition, the indentation of the output is useful for demonstrating how the traced applications are or are not tail recursive with respect to each other. Thus, a trace of a non-tail recursive factorial implementation looks like this:
scheme@(guile-user)> (define (fact1 n)
                       (if (zero? n) 1
                           (* n (fact1 (1- n)))))
scheme@(guile-user)> ,trace (fact1 4)
trace: (fact1 4)
trace: | (fact1 3)
trace: | | (fact1 2)
trace: | | | (fact1 1)
trace: | | | | (fact1 0)
trace: | | | | 1
trace: | | | 1
trace: | | 2
trace: | 6
trace: 24
While a typical tail recursive implementation would look more like this:
scheme@(guile-user)> (define (facti acc n)
                       (if (zero? n) acc
                           (facti (* n acc) (1- n))))
scheme@(guile-user)> (define (fact2 n) (facti 1 n))
scheme@(guile-user)> ,trace (fact2 4)
trace: (fact2 4)
trace: (facti 1 4)
trace: (facti 4 3)
trace: (facti 12 2)
trace: (facti 24 1)
trace: (facti 24 0)
trace: 24
The low-level traps below (see Low-Level Traps) share some common options:
- #:width
- #:vm
- #:prefix (defaults to "trace: ")
To have access to these procedures, you'll need to have imported the
(system vm trace) module:
(use-modules (system vm trace))
Print a trace at applications of and returns from proc.
Print a trace at all applications and returns within the dynamic extent of calls to proc.
Print a trace at all instructions executed in the dynamic extent of calls to proc.
In addition, Guile defines a procedure to call a thunk, tracing all procedure calls and returns within the thunk.
Call thunk, tracing all execution within its dynamic extent.
If calls? is true, Guile will print a brief report at each procedure call and return, as given above.
If instructions? is true, Guile will also print a message each time an instruction is executed. This is a lot of output, but it is sometimes useful when doing low-level optimization.
Note that because this procedure manipulates the VM trace level directly, it doesn't compose well with traps at the REPL.
See Profile Commands, for more information on tracing at the REPL.
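Outside the REPL, the same kind of trace can be requested programmatically. This sketch assumes that the thunk-tracing procedure described above is exported from (system vm trace) as call-with-trace and accepts #:calls? and #:instructions? keywords, as in Guile 2.0. It reuses the fact1 definition from the earlier transcript.

```scheme
(use-modules (system vm trace))

(define (fact1 n)
  (if (zero? n) 1
      (* n (fact1 (1- n)))))

;; Trace procedure calls and returns, but not individual instructions,
;; for everything that runs inside the thunk.
(call-with-trace (lambda () (fact1 4))
                 #:calls? #t
                 #:instructions? #f)
```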
When multiple traps are present in a system, we begin to have a bookkeeping problem. How are they named? How does one disable, enable, or delete them?
Guile's answer to this is to keep an implicit per-thread trap state. The trap state object is not exposed to the user; rather, API that works on trap states fetches the current trap state from the dynamic environment.
Traps are identified by integers. A trap can be enabled, disabled, or removed, and can have an associated user-visible name.
These procedures have their own module:
(use-modules (system vm trap-state))
Add a trap to the current trap state, associating the given name with it. Returns a fresh trap identifier (an integer).
Note that usually the more specific functions detailed in High-Level Traps are used in preference to this one.
List the current set of traps, both enabled and disabled. Returns a list of integers.
Returns the name associated with trap idx, or #f if there is no such trap.
Returns #t if trap idx is present and enabled, or #f otherwise.
The low-level trap API allows one to make traps that call procedures, and the trap state API allows one to keep track of what traps are there. But neither of these APIs directly helps you when you want to set a breakpoint, because it's unclear what to do when the trap fires. Do you enter a debugger, or mail a summary of the situation to your great-aunt, or what?
So for the common case in which you just want to install breakpoints, and then have them all result in calls to one parameterizable procedure, we have the high-level trap interface.
Perhaps we should have started this section with this interface, as it's clearly the one most people should use. But as its capabilities and limitations proceed from the lower layers, we felt that the character-building exercise of building a mental model might be helpful.
These procedures share a module with trap states:
(use-modules (system vm trap-state))
Call thunk in a dynamic context in which handler is the current trap handler.
Additionally, during the execution of thunk, the VM trace level (see VM Hooks) is set to the number of enabled traps. This ensures that traps will in fact fire.
handler may be #f, in which case VM hooks are not enabled as they otherwise would be, as there is nothing to handle the traps.
The trace-level-setting behavior of with-default-trap-handler is
one of its more useful aspects, but if you are willing to forgo that,
and just want to install a global trap handler, there's a function for
that too:
Trap handlers are called when traps installed by procedures from this module fire. The current “consumer” of this API is Guile's REPL, but one might easily imagine other trap handlers being used to integrate with other debugging tools.
Install a trap that will fire when proc is called.
This is a breakpoint.
Install a trap that will print a tracing message when proc is called. See Tracing Traps, for more information.
This is a tracepoint.
Install a trap that will fire when control reaches the given source location. user-line is one-indexed, as users count lines, instead of zero-indexed, as Guile counts lines.
This is a source breakpoint.
Install a trap that will call handler when frame finishes executing. The trap will be removed from the trap state after firing, or on nonlocal exit.
This is a finish trap, used to implement the “finish” REPL command.
Install a trap that will call handler after stepping to a different source line or instruction. The trap will be removed from the trap state after firing, or on nonlocal exit.
If instruction? is false (the default), the trap will fire when control reaches a new source line. Otherwise it will fire when control reaches a new instruction.
Additionally, if into? is false (not the default), the trap will only fire for frames at or prior to the given frame. If into? is true (the default), the trap may step into nested procedure invocations.
This is a stepping trap, used to implement the “step”, “next”, “step-instruction”, and “next-instruction” REPL commands.
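Putting the pieces together, a breakpoint with a custom handler might be sketched as follows. This assumes the Guile 2.0 names add-trap-at-procedure-call! and delete-trap! from (system vm trap-state), and a VM engine with hooks enabled. The procedures square and report-trap are hypothetical; the handler takes a rest argument so that it does not depend on exactly which extra values are passed along with the frame.

```scheme
(use-modules (system vm trap-state))

(define (square x) (* x x))

;; Hypothetical trap handler: print whatever accompanies the frame.
;; The frame itself is not stored anywhere.
(define (report-trap frame . info)
  (format #t "trap fired: ~s~%" info))

;; Install the breakpoint; the result is the trap's integer identifier.
(define idx (add-trap-at-procedure-call! square))

;; Run some code with report-trap as the current trap handler. The VM trace
;; level is raised for the duration of the thunk so the trap can fire.
(with-default-trap-handler report-trap
  (lambda () (square 4)))

;; Remove the trap again when it is no longer needed.
(delete-trap! idx)
```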
When writing a test suite for a program or library, it is desirable to know what
part of the code is covered by the test suite. The (system vm
coverage) module provides tools to gather code coverage data and to present
them, as detailed below.
Run thunk, a zero-argument procedure, using vm; instrument vm to collect code coverage data. Return code coverage data and the values returned by thunk.
Return #t if obj is a coverage data object as returned by with-code-coverage.
Traverse code coverage information data, as obtained with
with-code-coverage, and write coverage information to port in the .info format used by LCOV. The report will include all of modules (or, by default, all the currently loaded modules) even if their code was not executed.
The generated data can be fed to LCOV's genhtml command to produce an HTML report, which aids coverage data visualization.
Here's an example use:
(use-modules (system vm coverage)
             (system vm vm))
(call-with-values (lambda ()
                    (with-code-coverage (the-vm)
                      (lambda ()
                        (do-something-tricky))))
  (lambda (data result)
    (let ((port (open-output-file "lcov.info")))
      (coverage-data->lcov data port)
      (close port))))
In addition, the module provides low-level procedures that would make it possible to write other user interfaces to the coverage data.
Return the list of “instrumented” source files, i.e., source files whose code was loaded at the time data was collected.
Return a list of line number/execution count pairs for file, or #f if file is not among the files covered by data. This includes lines with zero count.
Return the number of instrumented and the number of executed source lines in file according to data.
Return the number of times proc's code was executed, according to data, or #f if proc was not executed. When proc is a closure, the number of times its code was executed is returned, not the number of times this code associated with this particular closure was executed.
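As a sketch of a different front end built on these low-level procedures, the following prints a one-line summary per source file. It assumes the Guile 2.0 export names instrumented-source-files and instrumented/executed-lines, with the latter returning its two counts as multiple values. The thunk being measured is a trivial placeholder.

```scheme
(use-modules (system vm coverage)
             (system vm vm))

(call-with-values
    (lambda ()
      (with-code-coverage (the-vm)
        (lambda ()
          (map 1+ '(1 2 3)))))            ; placeholder workload
  (lambda (data result)
    (for-each
     (lambda (file)
       (call-with-values
           (lambda () (instrumented/executed-lines data file))
         (lambda (instrumented executed)
           (format #t "~a: ~a of ~a lines executed~%"
                   file executed instrumented))))
     (instrumented-source-files data))))
```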
SLIB is a portable library of Scheme packages which can be used with Guile and other Scheme implementations. SLIB is not included in the Guile distribution, but can be installed separately (see SLIB installation). It is available from http://people.csail.mit.edu/jaffer/SLIB.html.
After SLIB is installed, the following Scheme expression must be executed before the SLIB facilities can be used:
(use-modules (ice-9 slib))
require can then be used in the usual way (see Require). For example,
(use-modules (ice-9 slib))
(require 'primes)
(prime? 13)
⇒ #t
A few Guile core functions are overridden by the SLIB setups; for
example the SLIB version of delete-file returns a boolean
indicating success or failure, whereas the Guile core version throws
an error for failure. In general (and as might be expected) when SLIB
is loaded it's the SLIB specifications that are followed.
The following procedure works, e.g., with SLIB version 3a3 (see SLIB installation):
Unpack SLIB and install it using make install from its directory.
By default, this will install SLIB in /usr/local/lib/slib/.
Running make install-info installs its documentation, by default
under /usr/local/info/.
Define the SCHEME_LIBRARY_PATH environment variable:
$ SCHEME_LIBRARY_PATH=/usr/local/lib/slib/
$ export SCHEME_LIBRARY_PATH
Alternatively, you can create a symlink in the Guile directory to SLIB, e.g.:
ln -s /usr/local/lib/slib /usr/local/share/guile/2.0/slib
# guile
guile> (use-modules (ice-9 slib))
guile> (require 'new-catalog)
guile> (quit)
The catalog data should now be in /usr/local/share/guile/2.0/slibcat.
If instead you get an error such as:
Unbound variable: scheme-implementation-type
then a solution is to get a newer version of Guile,
or to modify ice-9/slib.scm to use define-public for the
offending variables.
Jacal is a symbolic math package written in Scheme by Aubrey Jaffer. It is usually installed as an extra package in SLIB.
You can use Guile's interface to SLIB to invoke Jacal:
(use-modules (ice-9 slib))
(slib:load "math")
(math)
For complete documentation on Jacal, please read the Jacal manual. If it has been installed on line, you can look at Jacal. Otherwise you can find it on the web at http://www-swiss.ai.mit.edu/~jaffer/JACAL.html
These interfaces provide access to operating system facilities. They provide a simple wrapping around the underlying C interfaces to make usage from Scheme more convenient. They are also used to implement the Guile port of scsh (see The Scheme shell (scsh)).
Generally there is a single procedure for each corresponding Unix
facility. There are some exceptions, such as procedures implemented for
speed and convenience in Scheme with no primitive Unix equivalent,
e.g. copy-file.
The interfaces are intended as far as possible to be portable across different versions of Unix. In some cases procedures which can't be implemented on particular systems may become no-ops, or perform limited actions. In other cases they may throw errors.
General naming conventions are as follows:
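- Procedures which destructively modify Scheme data have exclamation marks appended, e.g., recv!.
- Predicates (returning only #t or #f) have question marks appended, e.g., access?.
- Some names are changed to avoid conflict with dynamically linked libraries, e.g., primitive-fork.
- Unix preprocessor names such as EPERM or R_OK are converted to Scheme variables of the same name (underscores are not replaced with hyphens).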
Unexpected conditions are generally handled by raising exceptions.
There are a few procedures which return a special value if they don't
succeed, e.g., getenv returns #f if the requested
string is not found in the environment. These cases are noted in
the documentation.
For ways to deal with exceptions, see Exceptions.
Errors which the C library would report by returning a null pointer or
through some other means are reported by raising a system-error
exception with scm-error (see Error Reporting). The
data parameter is a list containing the Unix errno value
(an integer). For example,
(define (my-handler key func fmt fmtargs data)
  (display key) (newline)
  (display func) (newline)
  (apply format #t fmt fmtargs) (newline)
  (display data) (newline))
(catch 'system-error
  (lambda () (dup2 -123 -456))
  my-handler)
-|
system-error
dup2
Bad file descriptor
(9)
Return the errno value from a list which is the arguments to an exception handler. If the exception is not a system-error, then the return is #f. For example,
(catch 'system-error
  (lambda ()
    (mkdir "/this-ought-to-fail-if-I'm-not-root"))
  (lambda stuff
    (let ((errno (system-error-errno stuff)))
      (cond
       ((= errno EACCES)
        (display "You're not allowed to do that."))
       ((= errno EEXIST)
        (display "Already exists."))
       (#t
        (display (strerror errno))))
      (newline))))
Conventions generally follow those of scsh, The Scheme shell (scsh).
File ports are implemented using low-level operating system I/O facilities, with optional buffering to improve efficiency; see File Ports.
Note that some procedures (e.g., recv!) will accept ports as
arguments, but will actually operate directly on the file descriptor
underlying the port. Any port buffering is ignored, including the
buffer which implements peek-char and unread-char.
The force-output and drain-input procedures can be used
to clear the buffers.
Each open file port has an associated operating system file descriptor. File descriptors are generally not useful in Scheme programs; however they may be needed when interfacing with foreign code and the Unix environment.
A file descriptor can be extracted from a port and a new port can be created from a file descriptor. However a file descriptor is just an integer and the garbage collector doesn't recognize it as a reference to the port. If all other references to the port were dropped, then it's likely that the garbage collector would free the port, with the side-effect of closing the file descriptor prematurely.
To assist the programmer in avoiding this problem, each port has an associated revealed count which can be used to keep track of how many times the underlying file descriptor has been stored in other places. If a port's revealed count is greater than zero, the file descriptor will not be closed when the port is garbage collected. A programmer can therefore ensure that the revealed count will be greater than zero if the file descriptor is needed elsewhere.
For the simple case where a file descriptor is “imported” once to become a port, it does not matter if the file descriptor is closed when the port is garbage collected. There is no need to maintain a revealed count. Likewise when “exporting” a file descriptor to the external environment, setting the revealed count is not required provided the port is kept open (i.e., is pointed to by a live Scheme binding) while the file descriptor is in use.
To correspond with traditional Unix behaviour, three file descriptors
(0, 1, and 2) are automatically imported when a program starts up and
assigned to the initial values of the current/standard input, output,
and error ports, respectively. The revealed count for each is
initially set to one, so that dropping references to one of these
ports will not result in its garbage collection: it could be retrieved
with fdopen or fdes->ports.
Return the revealed count for port.
Sets the revealed count for a port to rcount. The return value is unspecified.
Return the integer file descriptor underlying port. Does not change its revealed count.
Returns the integer file descriptor underlying port. As a side effect the revealed count of port is incremented.
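The effect on the revealed count can be observed directly. This sketch assumes the procedures described above are named port-revealed, set-port-revealed!, fileno and port->fdes, and that /dev/null exists on the system.

```scheme
(define port (open-input-file "/dev/null"))

;; fileno reads the descriptor without touching the revealed count.
(define fd-quiet (fileno port))
(define revealed-before (port-revealed port))   ; 0 for a freshly opened port

;; port->fdes returns the same descriptor but records that it has escaped.
(define fd (port->fdes port))
(define revealed-after (port-revealed port))    ; 1

;; Once the descriptor is no longer stored elsewhere, reset the count so
;; that garbage collection of the port may close the descriptor again.
(set-port-revealed! port 0)
(close-port port)
```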
Return a new port based on the file descriptor fdes. Modes are given by the string modes. The revealed count of the port is initialized to zero. The modes string is the same as that accepted by open-file (see open-file).
Return a list of existing ports which have fdes as an underlying file descriptor, without changing their revealed counts.
Returns an existing input port which has fdes as its underlying file descriptor, if one exists, and increments its revealed count. Otherwise, returns a new input port with a revealed count of 1.
Returns an existing output port which has fdes as its underlying file descriptor, if one exists, and increments its revealed count. Otherwise, returns a new output port with a revealed count of 1.
Moves the underlying file descriptor for port to the integer value fdes without changing the revealed count of port. Any other ports already using this descriptor will be automatically shifted to new descriptors and their revealed counts reset to zero. The return value is #f if the file descriptor already had the required value or #t if it was moved.
Moves the underlying file descriptor for port to the integer value fdes and sets its revealed count to one. Any other ports already using this descriptor will be automatically shifted to new descriptors and their revealed counts reset to zero. The return value is unspecified.
Copies any unwritten data for the specified output file descriptor to disk. If port/fd is a port, its buffer is flushed before the underlying file descriptor is fsync'd. The return value is unspecified.
Open the file named by path for reading and/or writing. flags is an integer specifying how the file should be opened. mode is an integer specifying the permission bits of the file, if it needs to be created, before the umask (see Processes) is applied. The default is 666 in octal, written #o666 in Scheme (Unix itself has no default).
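flags can be constructed by combining variables using logior. Basic flags are:
— Variable: O_RDONLY
Open the file read-only.
— Variable: O_WRONLY
Open the file write-only.
— Variable: O_RDWR
Open the file read/write.
— Variable: O_APPEND
Append to the file instead of truncating.
— Variable: O_CREAT
Create the file if it does not already exist.
See File Status Flags, for additional flags.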
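For instance, a log file might be opened for appending and created on demand as sketched below. The flag variables O_WRONLY, O_CREAT and O_APPEND mirror the C constants of the same names; the path and the permission bits #o644 are arbitrary choices for illustration.

```scheme
(define log-path "/tmp/guile-open-example.log")

;; Combine the flags with logior; the mode only matters if the file is
;; created, and is still subject to the umask.
(define log-port
  (open log-path (logior O_WRONLY O_CREAT O_APPEND) #o644))

(display "one more line\n" log-port)
(close-port log-port)
```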
Similar to open but return a file descriptor instead of a port.
Similar to close-port (see close-port), but also works on file descriptors. A side effect of closing a file descriptor is that any ports using that file descriptor are moved to a different file descriptor and have their revealed counts set to zero.
A simple wrapper for the close system call. Close file descriptor fd, which must be an integer. Unlike close, the file descriptor will be closed even if a port is using it. The return value is unspecified.
Place char in port so that it will be read by the next read operation on that port. If called multiple times, the unread characters will be read again in “last-in, first-out” order (i.e. a stack). If port is not supplied, the current input port is used.
Place the string str in port so that its characters will be read in subsequent read operations. If called multiple times, the unread characters will be read again in last-in first-out order. If port is not supplied, the current-input-port is used.
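The ordering rules are easiest to see on a string port. In this sketch, read-two is a hypothetical helper that reads two characters strictly in sequence, and the two procedures described above are assumed to be named unread-char and unread-string.

```scheme
;; Read two characters one after the other and return them as a list.
(define (read-two port)
  (let* ((a (read-char port))
         (b (read-char port)))
    (list a b)))

(define port (open-input-string "cd"))

;; Two separate unread-char calls: the character pushed last is read first.
(unread-char #\a port)
(unread-char #\b port)
(define first-two (read-two port))    ; (#\b #\a)

;; A single unread-string: its characters come back in their original order.
(unread-string "xy" port)
(define next-two (read-two port))     ; (#\x #\y)

;; Only then does reading continue with the port's own contents.
(define remaining (read-two port))    ; (#\c #\d)
```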
Return a newly created pipe: a pair of ports which are linked together on the local machine. The CAR is the input port and the CDR is the output port. Data written (and flushed) to the output port can be read from the input port. Pipes are commonly used for communication with a newly forked child process. The need to flush the output port can be avoided by making it unbuffered using setvbuf.
— Variable: PIPE_BUF
A write of up to PIPE_BUF many bytes to a pipe is atomic, meaning when done it goes into the pipe instantaneously and as a contiguous block (see Atomicity of Pipe I/O).
Note that the output port is likely to block if too much data has been written but not yet read from the input port. Typically the capacity is PIPE_BUF bytes.
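A single-process round trip shows the two ends and the need to flush. This is only a sketch: real uses normally involve a child process, and the message is kept far below PIPE_BUF so the write cannot block. read-line comes from the (ice-9 rdelim) module.

```scheme
(use-modules (ice-9 rdelim))

(define ends (pipe))
(define in (car ends))    ; read from this end
(define out (cdr ends))   ; write to this end

(display "hello through the pipe\n" out)
(force-output out)        ; without this the data may sit in out's buffer

(define line (read-line in))

(close-port out)
(close-port in)
```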
The next group of procedures perform a dup2
system call, if newfd (an
integer) is supplied, otherwise a dup. The file descriptor to be
duplicated can be supplied as an integer or contained in a port. The
type of value returned varies depending on which procedure is used.
All procedures also have the side effect when performing dup2 that any
ports using newfd are moved to a different file descriptor and have
their revealed counts set to zero.
Return a new integer file descriptor referring to the open file designated by fd_or_port, which must be either an open file port or a file descriptor.
Returns a new input port using the new file descriptor.
Returns a new output port using the new file descriptor.
Returns a new port if port/fd is a port, with the same mode as the supplied port, otherwise returns an integer file descriptor.
Returns a new port using the new file descriptor. mode supplies a mode string for the port (see open-file).
Returns a new port which is opened on a duplicate of the file descriptor underlying port, with mode string modes as for open-file. The two ports will share a file position and file status flags.
Unexpected behaviour can result if both ports are subsequently used and the original and/or duplicate ports are buffered. The mode string can include 0 to obtain an unbuffered duplicate port.
This procedure is equivalent to (dup->port port modes).
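The sketch below duplicates the write end of a pipe in two ways, assuming the procedures in this group are named duplicate-port, dup->fdes and close-fdes. The duplicate port is opened with mode "w0" so that it is unbuffered and its output reaches the pipe immediately.

```scheme
(define ends (pipe))
(define in (car ends))
(define out (cdr ends))

;; A second, unbuffered port on a duplicate of out's file descriptor.
(define out2 (duplicate-port out "w0"))
(write-char #\x out2)

;; Both descriptors refer to the same pipe, so the byte arrives at in.
(define got (read-char in))

;; dup->fdes gives a bare integer descriptor instead of a port.
(define fd-copy (dup->fdes out))
(close-fdes fd-copy)

(close-port out2)
(close-port out)
(close-port in)
```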
This procedure takes two ports and duplicates the underlying file descriptor from old-port into new-port. The current file descriptor in new-port will be closed. After the redirection the two ports will share a file position and file status flags.
The return value is unspecified.
Unexpected behaviour can result if both ports are subsequently used and the original and/or duplicate ports are buffered.
This procedure does not have any side effects on other ports or revealed counts.
A simple wrapper for the dup2 system call. Copies the file descriptor oldfd to descriptor number newfd, replacing the previous meaning of newfd. Both oldfd and newfd must be integers. Unlike for dup->fdes or primitive-move->fdes, no attempt is made to move away ports which are using newfd. The return value is unspecified.
Return the port modes associated with the open port port. These will not necessarily be identical to the modes used when the port was opened, since modes such as “append” which are used only during port creation are not retained.
Apply proc to each port in the Guile port table (FIXME: what is the Guile port table?) in turn. The return value is unspecified. More specifically, proc is applied exactly once to every port that exists in the system at the time port-for-each is invoked. Changes to the port table while port-for-each is running have no effect as far as port-for-each is concerned.
The C function scm_port_for_each takes a Scheme procedure encoded as a SCM value, while scm_c_port_for_each takes a pointer to a C function and passes along an arbitrary data cookie.
Apply cmd on port/fd, either a port or file descriptor. The value argument is used by the SET commands described below; it is an integer value.
Values for cmd are:
— Variable: F_GETFD
— Variable: F_SETFD
Get or set flags associated with the file descriptor. The only flag is the following,
— Variable: FD_CLOEXEC
“Close on exec”, meaning the file descriptor will be closed on an exec call (a successful such call). For example, to set that flag,
(fcntl port F_SETFD FD_CLOEXEC)
Or better, set it but leave any other possible future flags unchanged,
(fcntl port F_SETFD (logior FD_CLOEXEC (fcntl port F_GETFD)))
— Variable: F_GETFL
— Variable: F_SETFL
Get or set flags associated with the open file. These flags are O_RDONLY etc. described under open above.
A common use is to set O_NONBLOCK on a network socket. The following sets that flag, and leaves other flags unchanged.
(fcntl sock F_SETFL (logior O_NONBLOCK (fcntl sock F_GETFL)))
Apply or remove an advisory lock on an open file. operation specifies the action to be done:
— Variable: LOCK_SH
Shared lock. More than one process may hold a shared lock for a given file at a given time.
— Variable: LOCK_EX
Exclusive lock. Only one process may hold an exclusive lock for a given file at a given time.
— Variable: LOCK_NB
Don't block when locking. This is combined with one of the other operations using logior (see Bitwise Operations). If flock would block, an EWOULDBLOCK error is thrown (see Conventions).
The return value is not specified. file may be an open file descriptor or an open file descriptor port.
Note that flock does not lock files across NFS.
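A minimal sketch of a non-blocking lock attempt, assuming a system where flock is available. The helper name try-exclusive-lock and the lock file path are hypothetical, chosen only for illustration; LOCK_UN (release the lock) and system-error-errno are standard Guile bindings.

```scheme
;; Try to take an exclusive advisory lock without blocking.
;; Returns #t on success, #f if the lock is held elsewhere.
(define (try-exclusive-lock port)
  (catch 'system-error
    (lambda ()
      (flock port (logior LOCK_EX LOCK_NB))
      #t)
    (lambda args
      ;; Only EWOULDBLOCK means "already locked"; rethrow anything else.
      (if (= (system-error-errno args) EWOULDBLOCK)
          #f
          (apply throw args)))))

;; Hypothetical lock file, created if it does not exist.
(define lock-port (open-file "/tmp/guile-flock-demo.lock" "w"))
(define locked? (try-exclusive-lock lock-port))
(if locked?
    (begin
      ;; ... critical section would go here ...
      (flock lock-port LOCK_UN)))
(close-port lock-port)
```

Catching only EWOULDBLOCK keeps genuine failures (for example a bad file descriptor) visible to the caller.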
This procedure has a variety of uses: waiting for the ability to provide input, accept output, or the existence of exceptional conditions on a collection of ports or file descriptors, or waiting for a timeout to occur. It also returns if interrupted by a signal.
reads, writes and excepts can be lists or vectors, with each member a port or a file descriptor. The value returned is a list of three corresponding lists or vectors containing only the members which meet the specified requirement. The ability of port buffers to provide input or accept output is taken into account. Ordering of the input lists or vectors is not preserved.
The optional arguments secs and usecs specify the timeout. Either secs can be specified alone, as either an integer or a real number, or both secs and usecs can be specified as integers, in which case usecs is an additional timeout expressed in microseconds. If secs is omitted or is #f then select will wait for as long as it takes for one of the other conditions to be satisfied.
The scsh version of select differs as follows: Only vectors are accepted for the first three arguments. The usecs argument is not supported. Multiple values are returned instead of a list. Duplicates in the input vectors appear only once in output. An additional select! interface is provided.
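As a rough illustration of the list interface described above, the following sketch polls the read end of a pipe before and after data is written. The variable names are arbitrary; pipe, force-output and select are the standard Guile procedures.

```scheme
;; (car p) is the read end of the pipe, (cdr p) the write end.
(define p (pipe))

;; Nothing has been written yet, so a zero-second poll should report
;; no readable ports: the first element of the result is empty.
(define before (select (list (car p)) '() '() 0))

;; Write one character and flush it, since the write end is buffered.
(write-char #\x (cdr p))
(force-output (cdr p))

;; Wait up to 1.5 seconds (1 second plus 500000 microseconds).
(define after (select (list (car p)) '() '() 1 500000))

;; The first element of AFTER now lists the readable port.
(define readable (car after))

(close-port (cdr p))
(close-port (car p))
```

The result is always a list of three lists (or vectors), in the order reads, writes, excepts, so (car after) is the subset of the supplied read ports that are ready.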
These procedures allow querying and setting file system attributes (such as owner, permissions, sizes and types of files); deleting, copying, renaming and linking files; creating and removing directories and querying their contents; syncing the file system and creating special files.
Test accessibility of a file under the real UID and GID of the calling process. The return is #t if path exists and the permissions requested by how are all allowed, or #f if not.
how is an integer which is one of the following values, or a bitwise-OR (logior) of multiple values.
— Variable: F_OK
Test for existence of the file. This is implied by each of the other tests, so there's no need to combine it with them.
It's important to note that access? does not simply indicate what will happen on attempting to read or write a file. In normal circumstances it does, but in a set-UID or set-GID program it doesn't because access? tests the real ID, whereas an open or execute attempt uses the effective ID.
A program which will never run set-UID/GID can ignore the difference between real and effective IDs, but for maximum generality, especially in library functions, it's best not to use access? to predict the result of an open or execute, instead simply attempt that and catch any exception.
The main use for access? is to let a set-UID/GID program determine what the invoking user would have been allowed to do, without the greater (or perhaps lesser) privileges afforded by the effective ID. For more on this, see Testing File Access.
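The "attempt it and catch the exception" advice can be sketched as follows. The helper name open-input-file-or-false and the example path are hypothetical; the sketch assumes that a failed open raises a system-error exception, as Guile's file procedures normally do.

```scheme
;; Open PATH for reading, or return #f if the operating system
;; refuses (missing file, permission denied, and so on).
(define (open-input-file-or-false path)
  (catch 'system-error
    (lambda () (open-input-file path))
    (lambda args #f)))

;; A path assumed not to exist on the machine running the example.
(define missing (open-input-file-or-false "/nonexistent/guile-demo-file"))

;; access? is still useful for asking about the real user's rights.
(define root-exists? (access? "/" F_OK))
```

Unlike a prior access? check, this approach has no window between the test and the open in which the file could change.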
Return an object containing various information about the file determined by obj. obj can be a string containing a file name or a port or integer file descriptor which is open on a file (in which case fstat is used as the underlying system call).
The object returned by stat can be passed as a single parameter to the following procedures, all of which return integers:
— Scheme Procedure: stat:ino st
The file serial number, which distinguishes this file from all other files on the same device.
— Scheme Procedure: stat:mode st
The mode of the file. This is an integer which incorporates file type information and file permission bits. See also stat:type and stat:perms below.
— Scheme Procedure: stat:rdev st
Device ID; this entry is defined only for character or block special files. On some systems this field is not available at all, in which case stat:rdev returns #f.
— Scheme Procedure: stat:ctime st
The last modification time for the attributes of the file, in seconds.
— Scheme Procedure: stat:atimensec st
— Scheme Procedure: stat:mtimensec st
— Scheme Procedure: stat:ctimensec st
The fractional part of a file's access, modification, or attribute modification time, in nanoseconds. Nanosecond timestamps are only available on some operating systems and file systems. If Guile cannot retrieve nanosecond-level timestamps for a file, these fields will be set to 0.
— Scheme Procedure: stat:blksize st
The optimal block size for reading or writing the file, in bytes. On some systems this field is not available, in which case stat:blksize returns a sensible suggested block size.
— Scheme Procedure: stat:blocks st
The amount of disk space that the file occupies measured in units of 512 byte blocks. On some systems this field is not available, in which case stat:blocks returns #f.
In addition, the following procedures return the information from stat:mode in a more convenient form:
Similar to stat, but does not follow symbolic links, i.e., it will return information about a symbolic link itself, not the file it points to. path must be a string.
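The difference between stat and lstat can be illustrated with two small predicates. Both helper names are hypothetical; stat:type is the standard accessor that returns the file type as a symbol such as directory or symlink.

```scheme
;; #t if PATH, after following symbolic links, is a directory.
(define (directory? path)
  (eq? (stat:type (stat path)) 'directory))

;; #t if PATH itself is a symbolic link (lstat does not follow it).
(define (symlink? path)
  (eq? (stat:type (lstat path)) 'symlink))

;; Permission bits of the root directory as an octal string.
(define root-perms (number->string (stat:perms (stat "/")) 8))
```

Both predicates raise a system-error if path does not exist, so callers that cannot assume existence would wrap them in catch.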
Return the value of the symbolic link named by path (a string), i.e., the file that the link points to.
Change the ownership and group of the file referred to by object to the integer values owner and group. object can be a string containing a file name or, if the platform supports fchown (see File Owner), a port or integer file descriptor which is open on the file. The return value is unspecified.
If object is a symbolic link, either the ownership of the link or the ownership of the referenced file will be changed depending on the operating system (lchown is unsupported at present). If owner or group is specified as -1, then that ID is not changed.
Changes the permissions of the file referred to by obj. obj can be a string containing a file name or a port or integer file descriptor which is open on a file (in which case fchmod is used as the underlying system call). mode specifies the new permissions as an integer, usually written in octal, e.g., (chmod "foo" #o755). The return value is unspecified.
utime sets the access and modification times for the file named by path. If actime or modtime is not supplied, then the current time is used. actime and modtime must be integer time values as returned by the current-time procedure.
The optional actimens and modtimens are nanoseconds to add to actime and modtime. Nanosecond precision is only supported on some combinations of file systems and operating systems.
(utime "foo" (- (current-time) 3600))
will set the access time to one hour in the past and the modification time to the current time.
Deletes (or “unlinks”) the file whose path is specified by str.
Copy the file specified by oldfile to newfile. The return value is unspecified.
Renames the file specified by oldname to newname. The return value is unspecified.
Creates a new name newpath in the file system for the file named by oldpath. If oldpath is a symbolic link, the link may or may not be followed depending on the system.
Create a symbolic link named newpath with the value (i.e., pointing to) oldpath. The return value is unspecified.
Create a new directory named by path. If mode is omitted then the permissions of the directory file are set using the current umask (see Processes). Otherwise they are set to the integer value specified with mode, usually written in octal. The return value is unspecified.
Remove the existing directory named by path. The directory must be empty for this to succeed. The return value is unspecified.
Open the directory specified by dirname and return a directory stream.
Before using this and the procedures below, make sure to see the higher-level procedures for directory traversal that are available (see File Tree Walk).
Return a boolean indicating whether object is a directory stream as returned by opendir.
Return (as a string) the next directory entry from the directory stream stream. If there is no remaining entry to be read then the end of file object is returned.
Reset the directory port stream so that the next call to readdir will return the first directory entry.
Close the directory stream stream. The return value is unspecified.
Here is an example showing how to display all the entries in a directory:
(define dir (opendir "/usr/lib"))
(do ((entry (readdir dir) (readdir dir)))
    ((eof-object? entry))
  (display entry)(newline))
(closedir dir)
Flush the operating system disk buffers. The return value is unspecified.
Creates a new special file, such as a file corresponding to a device. path specifies the name of the file. type should be one of the following symbols: ‘regular’, ‘directory’, ‘symlink’, ‘block-special’, ‘char-special’, ‘fifo’, or ‘socket’. perms (an integer) specifies the file permissions. dev (an integer) specifies which device the special file refers to. Its exact interpretation depends on the kind of special file being created.
E.g.,
(mknod "/dev/fd0" 'block-special #o660 (+ (* 2 256) 2))
The return value is unspecified.
Return an auto-generated name of a temporary file, a file which doesn't already exist. The name includes a path; it's usually in /tmp but that's system dependent.
Care must be taken when using tmpnam. In between choosing the name and creating the file another program might use that name, or an attacker might even make it a symlink pointing at something important, causing you to overwrite that.
The safe way is to create the file using open with O_EXCL to avoid any overwriting. A loop can try again with another name if the file exists (error EEXIST). mkstemp! below does that.
Create a new unique file in the file system and return a new buffered port open for reading and writing to the file.
tmpl is a string specifying where the file should be created: it must end with ‘XXXXXX’ and those ‘X’s will be changed in the string to return the name of the file. (port-filename on the port also gives the name.)
POSIX doesn't specify the permissions mode of the file; on GNU and most systems it's #o600. An application can use chmod to relax that if desired. For example #o666 less umask, which is usual for ordinary file creation,
(let ((port (mkstemp! (string-copy "/tmp/myfile-XXXXXX"))))
  (chmod port (logand #o666 (lognot (umask))))
  ...)
Return an input/output port to a unique temporary file named using the path prefix P_tmpdir defined in stdio.h. The file is automatically deleted when the port is closed or the program terminates.
Return the directory name component of the file name filename. If filename does not contain a directory component, . is returned.
Return the base name of the file name filename. The base name is the file name without any directory components. If suffix is provided, and is equal to the end of basename, it is removed also.
(basename "/tmp/test.xml" ".xml") ⇒ "test"
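As a small sketch, dirname and basename can be combined to split a file name into its parts. The helper name split-file-name is hypothetical.

```scheme
;; Return a list of the directory, the full base name, and the base
;; name with the suffix EXT removed.
(define (split-file-name path ext)
  (list (dirname path)
        (basename path)
        (basename path ext)))

(define parts (split-file-name "/tmp/test.xml" ".xml"))
;; parts ⇒ ("/tmp" "test.xml" "test")
```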
The facilities in this section provide an interface to the user and group database. They should be used with care since they are not reentrant.
The following functions accept an object representing user information and return a selected component:
Initializes a stream used by getpwent to read from the user database. The next use of getpwent will return the first entry. The return value is unspecified.
Read the next entry in the user database stream. The return is a passwd user object as above, or #f when no more entries.
If called with a true argument, initialize or reset the password data stream. Otherwise, close the stream. The setpwent and endpwent procedures are implemented on top of this.
Look up an entry in the user database. obj can be an integer, a string, or omitted, giving the behaviour of getpwuid, getpwnam or getpwent respectively.
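The stream interface above can be sketched as a loop that collects every user name. The helper name all-user-names is hypothetical; passwd:name is assumed to be one of the accessor procedures for passwd objects. Because these procedures are not reentrant, the sketch should not run concurrently with other code that walks the user database.

```scheme
;; Walk the user database from the start and return the list of
;; user names in database order.
(define (all-user-names)
  (setpwent)                                  ; rewind to the first entry
  (let loop ((entry (getpwent)) (names '()))
    (if entry
        (loop (getpwent) (cons (passwd:name entry) names))
        (begin
          (endpwent)                          ; close the stream
          (reverse names)))))

(define user-names (all-user-names))
```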
The following functions accept an object representing group information and return a selected component:
Initializes a stream used by getgrent to read from the group database. The next use of getgrent will return the first entry. The return value is unspecified.
Return the next entry in the group database, using the stream set by setgrent.
If called with a true argument, initialize or reset the group data stream. Otherwise, close the stream. The setgrent and endgrent procedures are implemented on top of this.
Look up an entry in the group database. obj can be an integer, a string, or omitted, giving the behaviour of getgrgid, getgrnam or getgrent respectively.
In addition to the accessor procedures for the user database, the following shortcut procedure is also available.
Return a string containing the name of the user logged in on the controlling terminal of the process, or #f if this information cannot be obtained.
Return the number of seconds since 1970-01-01 00:00:00 UTC, excluding leap seconds.
Return a pair containing the number of seconds and microseconds since 1970-01-01 00:00:00 UTC, excluding leap seconds. Note: whether true microsecond resolution is available depends on the operating system.
The following procedures either accept an object representing a broken down time and return a selected component, or accept an object representing a broken down time and a value and set the component to the value. The numbers in parentheses give the usual range.
Year (70-), the year minus 1900.
Day of the week (0-6) with Sunday represented as 0.
Day of the year (0-364, 365 in leap years).
Daylight saving indicator (0 for “no”, greater than 0 for “yes”, less than 0 for “unknown”).
Time zone offset in seconds west of UTC (-46800 to 43200). For example on East coast USA (zone ‘EST+5’) this would be 18000 (i.e., 5*60*60) in winter, or 14400 (i.e., 4*60*60) during daylight savings.
Note tm:gmtoff is not the same as tm_gmtoff in the C tm structure. tm_gmtoff is seconds east and hence the negative of the value here.
Time zone label (a string), not necessarily unique.
Return an object representing the broken down components of time, an integer like the one returned by current-time. The time zone for the calculation is optionally specified by zone (a string), otherwise the TZ environment variable or the system default is used.
Return an object representing the broken down components of time, an integer like the one returned by current-time. The values are calculated for UTC.
For a broken down time object sbd-time, return a pair the car of which is an integer time like current-time, and the cdr of which is a new broken down time with normalized fields.
zone is a timezone string, or the default is the TZ environment variable or the system default (see Specifying the Time Zone with TZ). sbd-time is taken to be in that zone.
The following fields of sbd-time are used: tm:year, tm:mon, tm:mday, tm:hour, tm:min, tm:sec, tm:isdst. The values can be outside their usual ranges. For example tm:hour normally goes up to 23, but a value of, say, 33 would mean 9 the following day.
tm:isdst in sbd-time says whether the time given is with daylight savings or not. This is ignored if zone doesn't have any daylight savings adjustment amount.
The broken down time in the return normalizes the values of sbd-time by bringing them into their usual ranges, and using the actual daylight savings rule for that time in zone (which may differ from what sbd-time had). The easiest way to think of this is that sbd-time plus zone converts to the integer UTC time, then a localtime is applied to get the normal presentation of that time, in zone.
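The normalization of out-of-range fields can be sketched with the hour 33 case mentioned above. The sketch assumes that the zone string "UTC" is understood by the system; set-tm:hour is the setter counterpart of tm:hour.

```scheme
;; Start from the epoch, 1970-01-01 00:00:00 UTC.
(define bd (gmtime 0))

;; Push the hour out of range: 33 means 09:00 on the following day.
(set-tm:hour bd 33)

;; Normalize, interpreting the fields in UTC.
(define result (mktime bd "UTC"))

(define seconds (car result))          ; 33 * 3600 = 118800
(define normalized (cdr result))       ; broken-down time, fields in range
(define new-day (tm:mday normalized))  ; 2
(define new-hour (tm:hour normalized)) ; 9
```

Using a fixed zone keeps the arithmetic independent of the local TZ setting.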
Initialize the timezone from the TZ environment variable or the system default. It's not usually necessary to call this procedure since it's done automatically by other procedures that depend on the timezone.
Return a string which is broken-down time structure tm formatted according to the given format string.
format contains field specifications introduced by a ‘%’ character. See Formatting Calendar Time, or ‘man 3 strftime’, for the available formatting.
(strftime "%c" (localtime (current-time))) ⇒ "Mon Mar 11 20:17:43 2002"
If setlocale has been called (see Locales), month and day names are from the current locale and in the locale character set.
Performs the reverse action to strftime, parsing string according to the specification supplied in template. The interpretation of month and day names is dependent on the current locale. The value returned is a pair. The CAR has an object with time components in the form returned by localtime or gmtime, but the time zone components are not usefully set. The CDR reports the number of characters from string which were used for the conversion.
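A short sketch of parsing a date with strptime and reading back the components. The template and input string are arbitrary examples; recall that tm:year is the year minus 1900 and tm:mon counts from 0.

```scheme
;; Parse an ISO-style date.  PARSED is a pair (broken-down-time . count).
(define parsed (strptime "%Y-%m-%d" "2002-03-11"))
(define date-tm (car parsed))

;; Convert the fields back to calendar numbers.
(define year  (+ 1900 (tm:year date-tm)))   ; 2002
(define month (+ 1 (tm:mon date-tm)))       ; 3
(define day   (tm:mday date-tm))            ; 11
(define consumed (cdr parsed))              ; 10 characters were used

;; Formatting a known UTC time with strftime.
(define epoch-string (strftime "%Y-%m-%d %H:%M:%S" (gmtime 0)))
;; epoch-string ⇒ "1970-01-01 00:00:00"
```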
The value of this variable is the number of time units per second reported by the following procedures.
Return an object with information about real and processor time. The following procedures accept such an object as an argument and return a selected component:
— Scheme Procedure: tms:clock tms
The current real time, expressed as time units relative to an arbitrary base.
— Scheme Procedure: tms:stime tms
The CPU time units used by the system on behalf of the calling process.
Return the number of time units since the interpreter was started.
Return the number of time units of processor time used by the interpreter. Both system and user time are included but subprocesses are not.
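The time-unit procedures above can be combined into a simple wall-clock timer. This is a sketch only: the helper name elapsed-seconds is hypothetical, and it assumes the standard bindings get-internal-real-time and internal-time-units-per-second.

```scheme
;; Call THUNK and return the elapsed real time in seconds
;; as an inexact number.
(define (elapsed-seconds thunk)
  (let ((start (get-internal-real-time)))
    (thunk)
    (exact->inexact
     (/ (- (get-internal-real-time) start)
        internal-time-units-per-second))))

;; Time a small busy loop.
(define took
  (elapsed-seconds
   (lambda ()
     (let loop ((i 0))
       (if (< i 100000) (loop (+ i 1)))))))
```

Dividing by internal-time-units-per-second keeps the result meaningful regardless of the unit size on a given platform.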
Get the command line arguments passed to Guile, or set new arguments.
The arguments are a list of strings, the first of which is the invoked program name. This is just "guile" (or the executable path) when run interactively, or it's the script name when running a script with -s (see Invoking Guile).
guile -L /my/extra/dir -s foo.scm abc def
(program-arguments) ⇒ ("foo.scm" "abc" "def")
set-program-arguments allows a library module or similar to modify the arguments, for example to strip options it recognises, leaving the rest for the mainline.
The argument list is held in a fluid, which means it's separate for each thread. Neither the list nor the strings within it are copied at any point and normally should not be mutated.
The two names program-arguments and command-line are an historical accident; they both do exactly the same thing. The name scm_set_program_arguments_scm has an extra _scm on the end to avoid clashing with the C function below.
Set the list of command line arguments for program-arguments and command-line above.
argv is an array of null-terminated strings, as in a C main function. argc is the number of strings in argv, or if it's negative then a NULL in argv marks its end.
first is an extra string put at the start of the arguments, or NULL for no such extra. This is a convenient way to pass the program name after advancing argv to strip option arguments. E.g.,
{
  char *progname = argv[0];
  for (argv++; argv[0] != NULL && argv[0][0] == '-'; argv++)
    {
      /* munch option ... */
    }
  /* remaining args for scheme level use */
  scm_set_program_arguments (-1, argv, progname);
}
This sort of thing is often done at startup under scm_boot_guile with options handled at the C level removed. The given strings are all copied, so the C data is not accessed again once scm_set_program_arguments returns.
Looks up the string name in the current environment. The return value is #f unless a string of the form NAME=VALUE is found, in which case the string VALUE is returned.
Modifies the environment of the current process, which is also the default environment inherited by child processes.
If value is #f, then name is removed from the environment. Otherwise, the string name=value is added to the environment, replacing any existing string with name matching name.
The return value is unspecified.
Remove variable name from the environment. The name can not contain a ‘=’ character.
If env is omitted, return the current environment (in the Unix sense) as a list of strings. Otherwise set the current environment, which is also the default environment for child processes, to the supplied list of strings. Each member of env should be of the form NAME=VALUE and values of NAME should not be duplicated. If env is supplied then the return value is unspecified.
Modifies the environment of the current process, which is also the default environment inherited by child processes.
If string is of the form NAME=VALUE then it will be written directly into the environment, replacing any existing environment string with name matching NAME. If string does not contain an equal sign, then any existing string with name matching string will be removed.
The return value is unspecified.
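The environment procedures described above can be exercised together as follows. The variable name GUILE_ENV_DEMO is a made-up example, assumed not to be set already.

```scheme
;; Set a variable and read it back.
(setenv "GUILE_ENV_DEMO" "hello")
(define v1 (getenv "GUILE_ENV_DEMO"))     ; "hello"

;; putenv takes a single NAME=VALUE string and replaces the old value.
(putenv "GUILE_ENV_DEMO=world")
(define v2 (getenv "GUILE_ENV_DEMO"))     ; "world"

;; Remove it again; getenv then reports #f.
(unsetenv "GUILE_ENV_DEMO")
(define v3 (getenv "GUILE_ENV_DEMO"))     ; #f
```

Changes made this way are visible to child processes started afterwards, since they inherit the current environment by default.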
Change the current working directory to path. The return value is unspecified.
Return the name of the current working directory.
If mode is omitted, returns an integer representing the current file creation mask. Otherwise the file creation mask is set to mode and the previous value is returned. See Assigning File Permissions, for more on how to use umasks.
E.g., (umask #o022) sets the mask to octal 22/decimal 18.
Change the root directory to that specified in path. This directory will be used for path names beginning with /. The root directory is inherited by all children of the current process. Only the superuser may change the root directory.
Return an integer representing the current process ID.
Return a vector of integers representing the current supplementary group IDs.
Return an integer representing the process ID of the parent process.
Return an integer representing the current real user ID.
Return an integer representing the current real group ID.
Return an integer representing the current effective user ID. If the system does not support effective IDs, then the real ID is returned. (provided? 'EIDs) reports whether the system supports effective IDs.
Return an integer representing the current effective group ID. If the system does not support effective IDs, then the real ID is returned. (provided? 'EIDs) reports whether the system supports effective IDs.
Set the current set of supplementary group IDs to the integers in the given vector vec. The return value is unspecified.
Generally only the superuser can set the process group IDs (see Setting the Group IDs).
Sets both the real and effective user IDs to the integer id, provided the process has appropriate privileges. The return value is unspecified.
Sets both the real and effective group IDs to the integer id, provided the process has appropriate privileges. The return value is unspecified.
Sets the effective user ID to the integer id, provided the process has appropriate privileges. If effective IDs are not supported, the real ID is set instead; (provided? 'EIDs) reports whether the system supports effective IDs. The return value is unspecified.
Sets the effective group ID to the integer id, provided the process has appropriate privileges. If effective IDs are not supported, the real ID is set instead; (provided? 'EIDs) reports whether the system supports effective IDs. The return value is unspecified.
Return an integer representing the current process group ID. This is the POSIX definition, not BSD.
Move the process pid into the process group pgid. pid and pgid must be integers: either can be zero to indicate the ID of the current process. Fails on systems that do not support job control. The return value is unspecified.
Creates a new session. The current process becomes the session leader and is put in a new process group. The process will be detached from its controlling terminal if it has one. The return value is an integer representing the new process group ID.
Returns the session ID of process pid. (The session ID of a process is the process group ID of its session leader.)
This procedure collects status information from a child process which has terminated or (optionally) stopped. Normally it will suspend the calling process until this can be done. If more than one child process is eligible then one will be chosen by the operating system.
The value of pid determines the behaviour:
- pid greater than 0
- Request status information from the specified child process.
- pid equal to -1 or WAIT_ANY
- Request status information for any child process.
- pid equal to 0 or WAIT_MYPGRP
- Request status information for any child process in the current process group.
- pid less than -1
- Request status information for any child process whose process group ID is −pid.
The options argument, if supplied, should be the bitwise OR of the values of zero or more of the following variables:
— Variable: WUNTRACED
Report status information for stopped processes as well as terminated processes.
The return value is a pair containing:
- The process ID of the child process, or 0 if WNOHANG was specified and no process was collected.
- The integer status value.
The following three functions can be used to decode the process status code returned by waitpid.
Return the exit status value, as would be set if a process ended normally through a call to exit or _exit, if any, otherwise #f.
Return the signal number which terminated the process, if any, otherwise #f.
Return the signal number which stopped the process, if any, otherwise #f.
Execute cmd using the operating system's “command processor”. Under Unix this is usually the default shell sh. The value returned is cmd's exit status as returned by waitpid, which can be interpreted using the functions above.
If system is called without arguments, return a boolean indicating whether the command processor is available.
Execute the command indicated by args. The first element must be a string indicating the command to be executed, and the remaining items must be strings representing each of the arguments to that command.
This function returns the exit status of the command as provided by waitpid. This value can be handled with status:exit-val and the related functions.
system* is similar to system, but accepts only one string per-argument, and performs no shell interpretation. The command is executed using fork and execlp. Accordingly this function may be safer than system in situations where shell interpretation is not required.
Example: (system* "echo" "foo" "bar")
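A brief sketch of decoding the value returned by system*, assuming a Unix-like system with sh on the PATH. The command shown is only an example that exits with a known code.

```scheme
;; Run "sh -c 'exit 3'" without any extra shell interpretation by
;; Guile: each argument is passed to the program exactly as given.
(define raw-status (system* "sh" "-c" "exit 3"))

;; The raw value is a waitpid-style status; decode it.
(define exit-code (status:exit-val raw-status))    ; 3
(define killed-by (status:term-sig raw-status))    ; #f, it exited normally
```

Because no shell parses the argument strings, something like (system* "echo" "$HOME") would pass the literal text $HOME to echo rather than expanding the variable.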
Terminate the current process without unwinding the Scheme stack. The exit status is status if supplied, otherwise zero.
primitive-exit uses the C exit function and hence runs usual C level cleanups (flush output streams, call atexit functions, etc., see Normal Termination).
primitive-_exit is the _exit system call (see Termination Internals). This terminates the program immediately, with neither Scheme-level nor C-level cleanups.
The typical use for primitive-_exit is from a child process created with primitive-fork. For example in a Gdk program the child process inherits the X server connection and a C-level atexit cleanup which will close that connection. But closing in the child would upset the protocol in the parent, so primitive-_exit should be used to exit without that.
Executes the file named by path as a new process image. The remaining arguments are supplied to the process; from a C program they are accessible as the argv argument to main. Conventionally the first arg is the same as path. All arguments must be strings.
If arg is missing, path is executed with a null argument list, which may have system-dependent side-effects.
This procedure is currently implemented using the execv system call, but we call it execl because of its Scheme calling interface.
Similar to execl, however if filename does not contain a slash then the file to execute will be located by searching the directories listed in the PATH environment variable.
This procedure is currently implemented using the execvp system call, but we call it execlp because of its Scheme calling interface.
Similar to execl, but the environment of the new process is specified by env, which must be a list of strings as returned by the environ procedure.
This procedure is currently implemented using the execve system call, but we call it execle because of its Scheme calling interface.
Creates a new “child” process by duplicating the current “parent” process. In the child the return value is 0. In the parent the return value is the integer process ID of the child.
This procedure has been renamed from fork to avoid a naming conflict with the scsh fork.
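The pieces described in this section (primitive-fork, primitive-_exit, waitpid and status:exit-val) fit together roughly as in the following sketch. The helper name call-in-child is hypothetical, and the thunk is assumed to return a small integer suitable as an exit code.

```scheme
;; Run THUNK in a child process and return the child's exit status.
(define (call-in-child thunk)
  (let ((pid (primitive-fork)))
    (if (zero? pid)
        ;; Child: exit immediately with the thunk's result, skipping
        ;; cleanups inherited from the parent.  Any exception in the
        ;; thunk becomes exit code 1.
        (primitive-_exit (catch #t thunk (lambda args 1)))
        ;; Parent: wait for this specific child and decode its status.
        (status:exit-val (cdr (waitpid pid))))))

(define child-result (call-in-child (lambda () 7)))
;; child-result ⇒ 7
```

Using primitive-_exit rather than primitive-exit in the child avoids running C-level atexit handlers that belong to the parent.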
Increment the priority of the current process by incr. A higher priority value means that the process runs less often. The return value is unspecified.
Set the scheduling priority of the process, process group or user, as indicated by which and who. which is one of the variables PRIO_PROCESS, PRIO_PGRP or PRIO_USER, and who is interpreted relative to which (a process identifier for PRIO_PROCESS, process group identifier for PRIO_PGRP, and a user identifier for PRIO_USER). A zero value of who denotes the current process, process group, or user. prio is a value in the range [−20,20]. The default priority is 0; lower priorities (in numerical terms) cause more favorable scheduling. Sets the priority of all of the specified processes. Only the super-user may lower priorities. The return value is not specified.
Return the scheduling priority of the process, process group or user, as indicated by which and who. which is one of the variables PRIO_PROCESS, PRIO_PGRP or PRIO_USER, and who should be interpreted depending on which (a process identifier for PRIO_PROCESS, process group identifier for PRIO_PGRP, and a user identifier for PRIO_USER). A zero value of who denotes the current process, process group, or user. Return the highest priority (lowest numerical value) of any of the specified processes.
Return a bitvector representing the CPU affinity mask for process pid. Each CPU the process has affinity with has its corresponding bit set in the returned bitvector. The number of bits set is a good estimate of how many CPUs Guile can use without stepping on other processes' toes.
Currently this procedure is only defined on GNU variants (see sched_getaffinity).
Install the CPU affinity mask mask, a bitvector, for the process or thread with ID pid. The return value is unspecified.
Currently this procedure is only defined on GNU variants (see sched_setaffinity).
Return the total number of processors of the machine, which is guaranteed to be at least 1. A “processor” here is a thread execution unit, which can be either:
- an execution core in a (possibly multi-core) chip, in a (possibly multi-chip) module, in a single computer, or
- a thread execution unit inside a core in the case of hyper-threaded CPUs.
Which of the two definitions is used is unspecified.
Like total-processor-count, but return the number of processors available to the current process. See setaffinity and getaffinity for more information.
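The two counts can be compared directly, for instance when deciding how many worker threads to start. This is a minimal sketch; the variable names total and available are illustrative only.

```scheme
;; Number of processors in the machine (at least 1).
(define total (total-processor-count))

;; Number of processors the current process may actually use.
(define available (current-processor-count))

(format #t "~a of ~a processors available~%" available total)
```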
The following procedures raise, handle and wait for signals.
Scheme code signal handlers are run via a system async (see System asyncs), so they're called in the handler's thread at the next safe opportunity. Generally this is after any currently executing primitive procedure finishes (which could be a long time for primitives that wait for an external event).
Sends a signal to the specified process or group of processes.
pid specifies the processes to which the signal is sent:
- pid greater than 0
- The process whose identifier is pid.
- pid equal to 0
- All processes in the current process group.
- pid less than -1
- The process group whose identifier is -pid
- pid equal to -1
- If the process is privileged, all processes except for some special system processes. Otherwise, all processes with the current effective user ID.
sig should be specified using a variable corresponding to the Unix symbolic name, e.g., SIGHUP or SIGINT.
A full list of signals on the GNU system may be found in Standard Signals.
Sends a specified signal sig to the current process, where sig is as described for the kill procedure.
Install or report the signal handler for a specified signal.
signum is the signal number, which can be specified using the value of variables such as SIGINT.
If handler is omitted, sigaction returns a pair: the CAR is the current signal handler, which will be either an integer with the value SIG_DFL (default action) or SIG_IGN (ignore), or the Scheme procedure which handles the signal, or #f if a non-Scheme procedure handles the signal. The CDR contains the current sigaction flags for the handler.
If handler is provided, it is installed as the new handler for signum. handler can be a Scheme procedure taking one argument, or the value of SIG_DFL (default action) or SIG_IGN (ignore), or #f to restore whatever signal handler was installed before sigaction was first used. When a Scheme procedure has been specified, that procedure will run in the given thread. When no thread has been given, the thread that made this call to sigaction is used.
flags is a logior (see Bitwise Operations) of the following (where provided by the system), or 0 for none.
— Variable: SA_NOCLDSTOP
By default, SIGCHLD is signalled when a child process stops (i.e. receives SIGSTOP), and when a child process terminates. With the SA_NOCLDSTOP flag, SIGCHLD is only signalled for termination, not stopping.
SA_NOCLDSTOP has no effect on signals other than SIGCHLD.
— Variable: SA_RESTART
If a signal occurs while in a system call, deliver the signal then restart the system call (as opposed to returning an EINTR error from that call).
The return value is a pair with information about the old handler as described above.
This interface does not provide access to the “signal blocking” facility. Maybe this is not needed, since the thread support may provide solutions to the problem of consistent access to data structures.
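The following sketch shows one way to install a handler, trigger it with raise, and then restore the previous handler. It assumes SIGUSR1 is available on the system; the names received and old are illustrative. Because handlers run via a system async, the code polls briefly instead of assuming the handler has run as soon as raise returns.

```scheme
(define received #f)                 ; illustrative flag set by the handler

;; Install a handler.  The returned pair describes the previous
;; handler (CAR) and its flags (CDR).
(define old
  (sigaction SIGUSR1
             (lambda (signum) (set! received signum))))

;; Send the signal to the current process.
(raise SIGUSR1)

;; The handler runs at the next safe opportunity, so wait for it
;; in short steps, giving up after about one second.
(let loop ((tries 0))
  (if (and (not received) (< tries 100))
      (begin
        (usleep 10000)
        (loop (+ tries 1)))))

;; Reinstall whatever handler and flags were in place before.
(sigaction SIGUSR1 (car old) (cdr old))
```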
Return all signal handlers to the values they had before any call to sigaction was made. The return value is unspecified.
Set a timer to raise a SIGALRM signal after the specified number of seconds (an integer). It's advisable to install a signal handler for SIGALRM beforehand, since the default action is to terminate the process.
The return value indicates the time remaining for the previous alarm, if any. The new value replaces the previous alarm. If there was no previous alarm, the return value is zero.
Pause the current process (thread?) until a signal arrives whose action is to either terminate the current process or invoke a handler procedure. The return value is unspecified.
Wait the given period secs seconds or usecs microseconds (both integers). If a signal arrives the wait stops and the return value is the time remaining, in seconds or microseconds respectively. If the period elapses with no signal the return is zero.
On most systems the process scheduler is not microsecond accurate and the actual period slept by usleep might be rounded to a system clock tick boundary, which might be 10 milliseconds for instance.
See scm_std_sleep and scm_std_usleep for equivalents at the C level (see Blocking).
Get or set the periods programmed in certain system timers. These timers have a current interval value which counts down and on reaching zero raises a signal. An optional periodic value can be set to restart from there each time, for periodic operation. which_timer is one of the following values:
— Variable: ITIMER_REAL
A real-time timer, counting down elapsed real time. At zero it raises SIGALRM. This is like alarm above, but with a higher resolution period.
— Variable: ITIMER_VIRTUAL
A virtual-time timer, counting down while the current process is actually using CPU. At zero it raises SIGVTALRM.
— Variable: ITIMER_PROF
A profiling timer, counting down while the process is running (like ITIMER_VIRTUAL) and also while system calls are running on the process's behalf. At zero it raises a SIGPROF.
This timer is intended for profiling where a program is spending its time (by looking where it is when the timer goes off).
getitimer returns the current timer value and its programmed restart value, as a list containing two pairs. Each pair is a time in seconds and microseconds: ((interval_secs . interval_usecs) (periodic_secs . periodic_usecs)).
setitimer sets the timer values similarly, in seconds and microseconds (which must be integers). The periodic value can be zero to have the timer run down just once. The return value is the timer's previous setting, in the same form as getitimer returns.
(setitimer ITIMER_REAL
           5 500000     ;; first SIGALRM in 5.5 seconds time
           2 0)         ;; then repeat every 2 seconds
Although the timers are programmed in microseconds, the actual accuracy might not be that high.
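To make the shape of the returned list concrete, the sketch below reads the real-time timer without changing it and takes the result apart. The names setting, interval and periodic are illustrative, and ITIMER_REAL is assumed to be provided by the system.

```scheme
;; Read the real-time timer; this does not modify it.
(define setting (getitimer ITIMER_REAL))

;; The result is a list of two (seconds . microseconds) pairs.
(define interval (car setting))      ; first pair in the list
(define periodic (cadr setting))     ; second pair in the list

(format #t "interval: ~a s ~a us; periodic: ~a s ~a us~%"
        (car interval) (cdr interval)
        (car periodic) (cdr periodic))
```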
Return #t if port is using a serial non–file device, otherwise #f.
Return a string with the name of the serial terminal device underlying port.
Return a string containing the file name of the controlling terminal for the current process.
Return the process group ID of the foreground process group associated with the terminal open on the file descriptor underlying port.
If there is no foreground process group, the return value is a number greater than 1 that does not match the process group ID of any existing process group. This can happen if all of the processes in the job that was formerly the foreground job have terminated, and no other job has yet been moved into the foreground.
Set the foreground process group ID for the terminal used by the file descriptor underlying port to the integer pgid. The calling process must be a member of the same session as pgid and must have the same controlling terminal. The return value is unspecified.
The following procedures are similar to the popen and
pclose system routines. The code is in a separate “popen”
module:
(use-modules (ice-9 popen))
Execute a command in a subprocess, with a pipe to it or from it, or with pipes in both directions.
open-pipe runs the shell command using ‘/bin/sh -c’. open-pipe* executes prog directly, with the optional args arguments (all strings).
mode should be one of the following values. OPEN_READ is an input pipe, i.e. to read from the subprocess. OPEN_WRITE is an output pipe, i.e. to write to it.
For an input pipe, the child's standard output is the pipe and standard input is inherited from current-input-port. For an output pipe, the child's standard input is the pipe and standard output is inherited from current-output-port. In all cases the child's standard error is inherited from current-error-port (see Default Ports).
If those current-X-ports are not files of some kind, and hence don't have file descriptors for the child, then /dev/null is used instead.
Care should be taken with OPEN_BOTH: a deadlock will occur if both parent and child are writing, and waiting until the write completes before doing any reading. Each direction has PIPE_BUF bytes of buffering (see Ports and File Descriptors), which will be enough for small writes, but not for, say, putting a big file through a filter.
Equivalent to open-pipe with mode OPEN_READ.
(let* ((port (open-input-pipe "date --utc"))
       (str  (read-line port)))
  (close-pipe port)
  str)
⇒ "Mon Mar 11 20:10:44 UTC 2002"
Equivalent to open-pipe with mode OPEN_WRITE.
(let ((port (open-output-pipe "lpr")))
  (display "Something for the line printer.\n" port)
  (if (not (eqv? 0 (status:exit-val (close-pipe port))))
      (error "Cannot print")))
Close a pipe created by open-pipe, wait for the process to terminate, and return the wait status code. The status is as per waitpid and can be decoded with status:exit-val etc. (see Processes).
waitpid WAIT_ANY should not be used when pipes are open, since
it can reap a pipe's child process, causing an error from a subsequent
close-pipe.
close-port (see Closing) can close a pipe, but it doesn't
reap the child process.
The garbage collector will close a pipe no longer in use, and reap the
child process with waitpid. If the child hasn't yet terminated
the garbage collector doesn't block, but instead checks again in the
next GC.
Many systems have per-user and system-wide limits on the number of processes, and a system-wide limit on the number of pipes, so pipes should be closed explicitly when no longer needed, rather than letting the garbage collector pick them up at some later time.
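The sketch below combines open-pipe* with an explicit close-pipe, as recommended above. It assumes an echo program can be found on the search path; the name result is illustrative.

```scheme
(use-modules (ice-9 popen) (ice-9 rdelim))

;; Run "echo hello" directly (no shell), read one line of its
;; output, then close the pipe explicitly so that the child is
;; reaped right away instead of waiting for the garbage collector.
(define result
  (let* ((port   (open-pipe* OPEN_READ "echo" "hello"))
         (line   (read-line port))
         (status (close-pipe port)))
    (cons line (status:exit-val status))))
```

Here result is a pair of the line read and the exit value of the child.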
This section describes procedures which convert internet addresses between numeric and string formats.
An IPv4 Internet address is a 4-byte value, represented in Guile as an integer in host byte order, so that say “0.0.0.1” is 1, or “1.0.0.0” is 16777216.
Some underlying C functions use network byte order for addresses; Guile converts as necessary so that at the Scheme level it's host byte order everywhere.
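The integer form is simply the four bytes of the dotted-decimal notation read as one base-256 number. The helper ipv4-octets->integer below is hypothetical, written only to show the arithmetic; it is not part of Guile.

```scheme
;; Hypothetical helper: combine the four bytes a.b.c.d of an IPv4
;; address into the integer Guile uses at the Scheme level.
(define (ipv4-octets->integer a b c d)
  (+ (* a (expt 256 3))
     (* b (expt 256 2))
     (* c 256)
     d))

(ipv4-octets->integer 0 0 0 1)      ; 1
(ipv4-octets->integer 1 0 0 0)      ; 16777216
(ipv4-octets->integer 127 0 0 1)    ; 2130706433
```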
For a server, this can be used with bind (see Network Sockets and Communication) to allow connections from any interface on the machine.
The address of the local host using the loopback device, ie. ‘127.0.0.1’.
This function is deprecated in favor of inet-pton.
Convert an IPv4 Internet address from printable string (dotted decimal notation) to an integer. E.g.,
(inet-aton "127.0.0.1") ⇒ 2130706433
This function is deprecated in favor of inet-ntop.
Convert an IPv4 Internet address to a printable (dotted decimal notation) string. E.g.,
(inet-ntoa 2130706433) ⇒ "127.0.0.1"
Return the network number part of the given IPv4 Internet address. E.g.,
(inet-netof 2130706433) ⇒ 127
Return the local-address-within-network part of the given IPv4 Internet address, using the obsolete class A/B/C system. E.g.,
(inet-lnaof 2130706433) ⇒ 1
Make an IPv4 Internet address by combining the network number net with the local-address-within-network number lna. E.g.,
(inet-makeaddr 127 1) ⇒ 2130706433
An IPv6 Internet address is a 16-byte value, represented in Guile as an integer in host byte order, so that say “::1” is 1.
Convert a network address from an integer to a printable string. family can be AF_INET or AF_INET6. E.g.,
(inet-ntop AF_INET 2130706433) ⇒ "127.0.0.1"
(inet-ntop AF_INET6 (- (expt 2 128) 1))
⇒ "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff"
Convert a string containing a printable network address to an integer address. family can be AF_INET or AF_INET6. E.g.,
(inet-pton AF_INET "127.0.0.1") ⇒ 2130706433
(inet-pton AF_INET6 "::1") ⇒ 1
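Since inet-pton and inet-ntop are inverses for a given family, a round trip is a quick way to check that an address string is in canonical form. A small sketch, with illustrative variable names:

```scheme
;; IPv6 loopback: string to integer and back again.
(define loopback6 (inet-pton AF_INET6 "::1"))
(define loopback6-text (inet-ntop AF_INET6 loopback6))

;; The same round trip for an IPv4 address.
(define v4-text
  (inet-ntop AF_INET (inet-pton AF_INET "192.168.0.1")))
```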
This section describes procedures which query various network databases. Care should be taken when using the database routines since they are not reentrant.
getaddrinfo
The getaddrinfo procedure maps host and service names to socket addresses
and associated information in a protocol-independent way.
Return a list of addrinfo structures containing a socket address and associated information for host name and/or service to be used in creating a socket with which to address the specified service.
(let* ((ai (car (getaddrinfo "www.gnu.org" "http")))
       (s  (socket (addrinfo:fam ai) (addrinfo:socktype ai)
                   (addrinfo:protocol ai))))
  (connect s (addrinfo:addr ai))
  s)
When service is omitted or is #f, return network-level addresses for name. When name is #f, service must be provided and service locations local to the caller are returned.
Additional hints can be provided. When specified, hint_flags should be a bitwise-or of zero or more constants among the following:
- AI_PASSIVE
- Socket address is intended for bind.
- AI_CANONNAME
- Request for canonical host name, available via addrinfo:canonname. This makes sense mainly when DNS lookups are involved.
- AI_NUMERICHOST
- Specifies that name is a numeric host address string (e.g., "127.0.0.1"), meaning that name resolution will not be used.
- AI_NUMERICSERV
- Likewise, specifies that service is a numeric port string (e.g., "80").
- AI_ADDRCONFIG
- Return only addresses configured on the local system. It is highly recommended to provide this flag when the returned socket addresses are to be used to make connections; otherwise, some of the returned addresses could be unreachable or use a protocol that is not supported.
- AI_V4MAPPED
- When looking up IPv6 addresses, return mapped IPv4 addresses if there is no IPv6 address available at all.
- AI_ALL
- If this flag is set along with AI_V4MAPPED when looking up IPv6 addresses, return all IPv6 addresses as well as all IPv4 addresses, the latter mapped to IPv6 format.
When given, hint_family should specify the requested address family, e.g., AF_INET6. Similarly, hint_socktype should specify the requested socket type (e.g., SOCK_DGRAM), and hint_protocol should specify the requested protocol (its value is interpreted as in calls to socket).
On error, an exception with key getaddrinfo-error is thrown, with an error code (an integer) as its argument:
(catch 'getaddrinfo-error
  (lambda ()
    (getaddrinfo "www.gnu.org" "gopher"))
  (lambda (key errcode)
    (cond ((= errcode EAI_SERVICE)
           (display "doesn't know about Gopher!\n"))
          ((= errcode EAI_NONAME)
           (display "www.gnu.org not found\n"))
          (else
           (format #t "something wrong: ~a\n"
                   (gai-strerror errcode))))))
Error codes are:
- EAI_AGAIN
- The name or service could not be resolved at this time. Future attempts may succeed.
- EAI_BADFLAGS
- hint_flags contains an invalid value.
- EAI_FAIL
- A non-recoverable error occurred when attempting to resolve the name.
- EAI_FAMILY
- hint_family was not recognized.
- EAI_NONAME
- Either name does not resolve for the supplied parameters, or neither name nor service were supplied.
- EAI_NODATA
- This non-POSIX error code can be returned on GNU systems when a request was actually made but returned no data, meaning that no address is associated with name. Error handling code should be prepared to handle it when it is defined.
- EAI_SERVICE
- service was not recognized for the specified socket type.
- EAI_SOCKTYPE
- hint_socktype was not recognized.
- EAI_SYSTEM
- A system error occurred; the error code can be found in errno.
Users are encouraged to read the POSIX specification for more details.
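The hint arguments can also be used to parse numeric strings without any name resolution. The sketch below assumes AI_NUMERICHOST and AI_NUMERICSERV are both defined on the system; the names ai and sa are illustrative.

```scheme
;; Both strings are numeric, so no resolver lookup is needed.
;; The fourth argument restricts the results to IPv4.
(define ai
  (car (getaddrinfo "127.0.0.1" "80"
                    (logior AI_NUMERICHOST AI_NUMERICSERV)
                    AF_INET)))

;; The socket address carried by the first result.
(define sa (addrinfo:addr ai))

(sockaddr:addr sa)    ; the IPv4 address as an integer
(sockaddr:port sa)    ; the port number as an integer
```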
The following procedures take an addrinfo object as returned by
getaddrinfo:
Return flags for ai as a bitwise or of AI_ values (see above).
Return the socket address associated with ai as a sockaddr object (see Network Socket Address).
Return a string for the canonical name associated with ai if the AI_CANONNAME flag was supplied.
A host object is a structure that represents what is known about a network host, and is the usual way of representing a system's network identity inside software.
The following functions accept a host object and return a selected component:
The host address type, one of the AF constants, such as AF_INET or AF_INET6.
The list of network addresses associated with host. For AF_INET these are integer IPv4 addresses (see Network Address Conversion).
The following procedures can be used to search the host database. However,
getaddrinfo should be preferred over them since it's more generic and
thread-safe.
Look up a host by name or address, returning a host object. The gethost procedure will accept either a string name or an integer address; if given no arguments, it behaves like gethostent (see below). If a name or address is supplied but the address cannot be found, an error will be thrown to one of the keys: host-not-found, try-again, no-recovery or no-data, corresponding to the equivalent h_errno values. Unusual conditions may result in errors thrown to the system-error or misc-error keys.
(gethost "www.gnu.org")
⇒ #("www.gnu.org" () 2 4 (3353880842))
(gethostbyname "www.emacs.org")
⇒ #("emacs.org" ("www.emacs.org") 2 4 (1073448978))
The following procedures may be used to step through the host database from beginning to end.
Initialize an internal stream from which host objects may be read. This procedure must be called before any calls to gethostent, and may also be called afterward to reset the host entry stream. If stayopen is supplied and is not #f, the database is not closed by subsequent gethostbyname or gethostbyaddr calls, possibly giving an efficiency gain.
Return the next host object from the host database, or #f if there are no more hosts to be found (or an error has been encountered). This procedure may not be used before sethostent has been called.
Close the stream used by gethostent. The return value is unspecified.
If stayopen is omitted, this is equivalent to endhostent. Otherwise it is equivalent to sethostent stayopen.
The following functions accept an object representing a network and return a selected component:
The type of the network number. Currently, this returns only
AF_INET.
The following procedures are used to search the network database:
Look up a network by name or net number in the network database. The net-name argument must be a string, and the net-number argument must be an integer. getnet will accept either type of argument, behaving like getnetent (see below) if no arguments are given.
The following procedures may be used to step through the network database from beginning to end.
Initialize an internal stream from which network objects may be read. This procedure must be called before any calls to getnetent, and may also be called afterward to reset the net entry stream. If stayopen is supplied and is not #f, the database is not closed by subsequent getnetbyname or getnetbyaddr calls, possibly giving an efficiency gain.
If stayopen is omitted, this is equivalent to endnetent. Otherwise it is equivalent to setnetent stayopen.
The following functions accept an object representing a protocol and return a selected component:
The following procedures are used to search the protocol database:
Look up a network protocol by name or by number. getprotobyname takes a string argument, and getprotobynumber takes an integer argument. getproto will accept either type, behaving like getprotoent (see below) if no arguments are supplied.
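For example, a protocol number suitable for the proto argument of socket can be looked up by name. This sketch assumes the system's protocol database has an entry named "tcp" and uses the accessors protoent:name and protoent:proto; the variable tcp is illustrative.

```scheme
;; Look up the protocol object for TCP by name.
(define tcp (getprotobyname "tcp"))

(protoent:name tcp)     ; the official protocol name, a string
(protoent:proto tcp)    ; the protocol number, an integer
```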
The following procedures may be used to step through the protocol database from beginning to end.
Initialize an internal stream from which protocol objects may be read. This procedure must be called before any calls to getprotoent, and may also be called afterward to reset the protocol entry stream. If stayopen is supplied and is not #f, the database is not closed by subsequent getprotobyname or getprotobynumber calls, possibly giving an efficiency gain.
Close the stream used by getprotoent. The return value is unspecified.
If stayopen is omitted, this is equivalent to endprotoent. Otherwise it is equivalent to setprotoent stayopen.
The following functions accept an object representing a service and return a selected component:
The protocol used by the service. A service may be listed many times in the database under different protocol names.
The following procedures are used to search the service database:
Look up a network service by name or by service number, and return a network service object. The protocol argument specifies the name of the desired protocol; if the protocol found in the network service database does not match this name, a system error is signalled.
The getserv procedure will take either a service name or number as its first argument; if given no arguments, it behaves like getservent (see below).
(getserv "imap" "tcp")
⇒ #("imap2" ("imap") 143 "tcp")
(getservbyport 88 "udp")
⇒ #("kerberos" ("kerberos5" "krb5") 88 "udp")
The following procedures may be used to step through the service database from beginning to end.
Initialize an internal stream from which service objects may be read. This procedure must be called before any calls to getservent, and may also be called afterward to reset the service entry stream. If stayopen is supplied and is not #f, the database is not closed by subsequent getservbyname or getservbyport calls, possibly giving an efficiency gain.
Close the stream used by getservent. The return value is unspecified.
If stayopen is omitted, this is equivalent to endservent. Otherwise it is equivalent to setservent stayopen.
A socket address object identifies a socket endpoint for
communication. In the case of AF_INET for instance, the socket
address object comprises the host address (or interface on the host)
and a port number which specifies a particular open socket in a
running client or server process. A socket address object can be
created with,
Return a new socket address object. The first argument is the address family, one of the AF constants, then the arguments vary according to the family.
For AF_INET the arguments are an IPv4 network address number (see Network Address Conversion), and a port number.
For AF_INET6 the arguments are an IPv6 network address number and a port number. Optional flowinfo and scopeid arguments may be given (both integers, default 0).
For AF_UNIX the argument is a filename (a string).
The C function scm_make_socket_address takes the family and address arguments directly, then arglist is a list of further arguments, being the port for IPv4, port and optional flowinfo and scopeid for IPv6, or the empty list SCM_EOL for Unix domain.
The following functions access the fields of a socket address object,
Return the address family from socket address object sa. This is one of the AF constants (e.g. AF_INET).
For an AF_INET or AF_INET6 socket address object sa, return the network address number.
For an AF_INET or AF_INET6 socket address object sa, return the port number.
For an AF_INET6 socket address object sa, return the flowinfo value.
For an AF_INET6 socket address object sa, return the scope ID value.
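A short sketch of constructing an AF_INET socket address and reading its fields back with sockaddr:fam, sockaddr:addr and sockaddr:port. The port number 8080 and the name sa are arbitrary choices for illustration.

```scheme
;; An IPv4 socket address for port 8080 on the loopback interface.
(define sa (make-socket-address AF_INET INADDR_LOOPBACK 8080))

(sockaddr:fam sa)     ; the address family, here AF_INET
(sockaddr:addr sa)    ; the host address as an integer
(sockaddr:port sa)    ; the port number
```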
The functions below convert to and from the C struct sockaddr
(see Address Formats).
That structure is a generic type, an application can cast to or from
struct sockaddr_in, struct sockaddr_in6 or struct
sockaddr_un according to the address family.
In a struct sockaddr taken or returned, the byte ordering in
the fields follows the C conventions (see Byte Order Conversion). This means
network byte order for AF_INET host address
(sin_addr.s_addr) and port number (sin_port), and
AF_INET6 port number (sin6_port). But at the Scheme
level these values are taken or returned in host byte order, so the
port is an ordinary integer, and the host address likewise is an
ordinary integer (as described in Network Address Conversion).
Return a newly-malloced struct sockaddr created from arguments like those taken by scm_make_socket_address above.
The size (in bytes) of the struct sockaddr returned is stored into *outsize. An application must call free to release the returned structure when no longer required.
Return a Scheme socket address object from the C address structure. address_size is the size in bytes of address.
Return a newly-malloced struct sockaddr from a Scheme level socket address object.
The size (in bytes) of the struct sockaddr returned is stored into *outsize. An application must call free to release the returned structure when no longer required.
Socket ports can be created using socket and socketpair.
The ports are initially unbuffered, to make reading and writing to the
same port more reliable. A buffer can be added to the port using
setvbuf; see Ports and File Descriptors.
Most systems have limits on how many files and sockets can be open, so it's strongly recommended that socket ports be closed explicitly when no longer required (see Ports).
Some of the underlying C functions take values in network byte order, but the convention in Guile is that at the Scheme level everything is ordinary host byte order and conversions are made automatically where necessary.
Return a new socket port of the type specified by family, style and proto. All three parameters are integers. The possible values for family are as follows, where supported by the system,
— Variable: PF_UNIX
— Variable: PF_INET
— Variable: PF_INET6
The possible values for style are as follows, again where supported by the system,
— Variable: SOCK_STREAM
— Variable: SOCK_DGRAM
— Variable: SOCK_RAW
— Variable: SOCK_RDM
— Variable: SOCK_SEQPACKET
proto can be obtained from a protocol name using getprotobyname (see Network Databases). A value of zero means the default protocol, which is usually right.
A socket cannot be used for communication until it has been connected somewhere, usually with either connect or accept below.
Return a pair, the car and cdr of which are two unnamed socket ports connected to each other. The connection is full-duplex, so data can be transferred in either direction between the two.
family, style and proto are as per socket above. But many systems only support socket pairs in the PF_UNIX family. Zero is likely to be the only meaningful value for proto.
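The sketch below creates a connected pair in the PF_UNIX family, writes a line to one end and reads it from the other. It relies on socket ports being initially unbuffered, as stated above; the names a, b and line are illustrative.

```scheme
(use-modules (ice-9 rdelim))

(define line
  (let* ((pair (socketpair PF_UNIX SOCK_STREAM 0))
         (a    (car pair))
         (b    (cdr pair)))
    ;; Write on one end ...
    (display "ping\n" a)
    ;; ... and read the same line from the other end.
    (let ((text (read-line b)))
      ;; Close both ports explicitly when finished.
      (close-port a)
      (close-port b)
      text)))
```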
Get or set an option on socket port sock. getsockopt returns the current value. setsockopt sets a value and the return is unspecified.
level is an integer specifying a protocol layer, either SOL_SOCKET for socket level options, or a protocol number from the IPPROTO constants or getprotoent (see Network Databases).
optname is an integer specifying an option within the protocol layer.
For SOL_SOCKET level the following optnames are defined (when provided by the system). For their meaning see Socket-Level Options, or man 7 socket.
— Variable: SO_DEBUG
— Variable: SO_REUSEADDR
— Variable: SO_STYLE
— Variable: SO_TYPE
— Variable: SO_ERROR
— Variable: SO_DONTROUTE
— Variable: SO_BROADCAST
— Variable: SO_SNDBUF
— Variable: SO_RCVBUF
— Variable: SO_KEEPALIVE
— Variable: SO_OOBINLINE
— Variable: SO_NO_CHECK
— Variable: SO_PRIORITY
The value taken or returned is an integer.
— Variable: SO_LINGER
The value taken or returned is a pair of integers (ENABLE . TIMEOUT). On old systems without timeout support (i.e. without struct linger), only ENABLE has an effect but the value in Guile is always a pair.
For IP level (IPPROTO_IP) the following optnames are defined (when provided by the system). See man ip for what they mean.
— Variable: IP_MULTICAST_TTL
This sets the default TTL for multicast traffic. This defaults to 1 and should be increased to allow traffic to pass beyond the local network.
— Variable: IP_ADD_MEMBERSHIP
— Variable: IP_DROP_MEMBERSHIP
These can be used only with setsockopt, not getsockopt. value is a pair (MULTIADDR . INTERFACEADDR) of integer IPv4 addresses (see Network Address Conversion). MULTIADDR is a multicast address to be added to or dropped from the interface INTERFACEADDR. INTERFACEADDR can be INADDR_ANY to have the system select the interface. INTERFACEADDR can also be an interface index number, on systems supporting that.
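As an illustration of an integer-valued SOL_SOCKET option, the following sketch enables SO_REUSEADDR on a fresh socket and reads the setting back. It assumes the system provides that option; the names sock and reuse are illustrative.

```scheme
(define reuse
  (let ((sock (socket PF_INET SOCK_STREAM 0)))
    ;; Enable the option with an integer value.
    (setsockopt sock SOL_SOCKET SO_REUSEADDR 1)
    ;; Read it back; a non-zero integer means it is enabled.
    (let ((value (getsockopt sock SOL_SOCKET SO_REUSEADDR)))
      (close-port sock)
      value)))
```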
Sockets can be closed simply by using close-port. The shutdown procedure allows reception or transmission on a connection to be shut down individually, according to the parameter how:
- 0
- Stop receiving data for this socket. If further data arrives, reject it.
- 1
- Stop trying to transmit data from this socket. Discard any data waiting to be sent. Stop looking for acknowledgement of data already sent; don't retransmit it if it is lost.
- 2
- Stop both reception and transmission.
The return value is unspecified.
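A common use of shutdown is to signal end of data to the peer while keeping the socket open for reading. The sketch below uses a PF_UNIX socket pair for convenience, which is an assumption of this example; the names a, b and seen are illustrative.

```scheme
(use-modules (ice-9 rdelim))

(define seen
  (let* ((pair (socketpair PF_UNIX SOCK_STREAM 0))
         (a    (car pair))
         (b    (cdr pair)))
    (display "last words" a)
    ;; how = 1: stop transmitting from a.
    (shutdown a 1)
    (let* ((first  (read-line b))    ; text written before the shutdown
           (second (read-line b)))   ; nothing more can arrive
      (close-port a)
      (close-port b)
      (list first (eof-object? second)))))
```

The reader receives the text that was already written and then reaches end of file, even though neither port has been closed yet.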
Initiate a connection on socket port sock to a given address. The destination is either a socket address object, or arguments the same as make-socket-address would take to make such an object (see Network Socket Address). The return value is unspecified.
(connect sock AF_INET INADDR_LOOPBACK 23)
(connect sock (make-socket-address AF_INET INADDR_LOOPBACK 23))
Bind socket port sock to the given address. The address is either a socket address object, or arguments the same as make-socket-address would take to make such an object (see Network Socket Address). The return value is unspecified.
Generally a socket is only explicitly bound to a particular address when making a server, i.e. to listen on a particular port. For an outgoing connection the system will assign a local address automatically, if not already bound.
(bind sock AF_INET INADDR_ANY 12345)
(bind sock (make-socket-address AF_INET INADDR_ANY 12345))
Enable sock to accept connection requests. backlog is an integer specifying the maximum length of the queue for pending connections. If the queue fills, new clients will fail to connect until the server calls accept to accept a connection from the queue.
The return value is unspecified.
Accept a connection from socket port sock which has been enabled for listening with listen above. If there are no incoming connections in the queue, wait until one is available (unless O_NONBLOCK has been set on the socket, see fcntl).
The return value is a pair. The car is a new socket port, connected and ready to communicate. The cdr is a socket address object (see Network Socket Address) which is where the remote connection is from (like getpeername below).
All communication takes place using the new socket returned. The given sock remains bound and listening, and accept may be called on it again to get another incoming connection when desired.
Return a socket address object which is where sock is bound locally. sock may have obtained its local address from bind (above), or if a connect is done with an otherwise unbound socket (which is usual) then the system will have assigned an address.
Note that on many systems the address of a socket in the AF_UNIX namespace cannot be read.
Return a socket address object which is where sock is connected to, i.e. the remote endpoint.
Note that on many systems the address of a socket in the AF_UNIX namespace cannot be read.
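The procedures above fit together as follows. This sketch runs a server and a client in a single thread over the loopback interface, which works because the pending connection waits in the listen queue until accept is called. It also uses send and recv!, described below. Binding to port 0 so that the system picks a free port is an assumption about the underlying system, and all variable names are illustrative.

```scheme
(use-modules (rnrs bytevectors))

(define reply
  (let ((server (socket PF_INET SOCK_STREAM 0))
        (client (socket PF_INET SOCK_STREAM 0)))
    ;; Port 0 asks the system to choose a free port.
    (bind server AF_INET INADDR_LOOPBACK 0)
    (listen server 1)
    ;; getsockname reveals which port was actually assigned.
    (connect client AF_INET INADDR_LOOPBACK
             (sockaddr:port (getsockname server)))
    (let ((conn (car (accept server)))    ; new connected socket port
          (buf  (make-bytevector 16 0)))
      (send client (string->utf8 "hello"))
      (let* ((n   (recv! conn buf))       ; number of bytes read
             (out (make-bytevector n)))
        ;; Keep only the bytes that were actually received.
        (bytevector-copy! buf 0 out 0 n)
        (close-port conn)
        (close-port client)
        (close-port server)
        (utf8->string out)))))
```

A real server would normally call accept in a loop and handle each connection separately; a stream socket may also deliver a message in several pieces, which this sketch does not handle.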
Receive data from a socket port. sock must already be bound to the address from which data is to be received. buf is a bytevector into which the data will be written. The size of buf limits the amount of data which can be received: in the case of packet protocols, if a packet larger than this limit is encountered then some data will be irrevocably lost.
The optional flags argument is a value or bitwise OR of MSG_OOB, MSG_PEEK, MSG_DONTROUTE etc.
The value returned is the number of bytes read from the socket.
Note that the data is read directly from the socket file descriptor: any unread buffered port data is ignored.
Transmit bytevector message on socket port sock. sock must already be bound to a destination address. The value returned is the number of bytes transmitted; it's possible for this to be less than the length of message if the socket is set to be non-blocking. The optional flags argument is a value or bitwise OR of MSG_OOB, MSG_PEEK, MSG_DONTROUTE etc.
Note that the data is written directly to the socket file descriptor: any unflushed buffered port data is ignored.
Receive data from socket port sock, returning the originating address as well as the data. This function is usually for datagram sockets, but can be used on stream-oriented sockets too.
The data received is stored in bytevector buf, using either the whole bytevector or just the region between the optional start and end positions. The size of buf limits the amount of data that can be received. For datagram protocols if a packet larger than this is received then excess bytes are irrevocably lost.
The return value is a pair. The car is the number of bytes read. The cdr is a socket address object (see Network Socket Address) which is where the data came from, or #f if the origin is unknown.
The optional flags argument is a value or bitwise OR (logior) of MSG_OOB, MSG_PEEK, MSG_DONTROUTE etc.
Data is read directly from the socket file descriptor; any buffered port data is ignored.
On a GNU/Linux system recvfrom! is not multi-threading: all threads stop while a recvfrom! call is in progress. An application may need to use select, O_NONBLOCK or MSG_DONTWAIT to avoid this.
Transmit bytevector message as a datagram on socket port sock. The destination is specified either as a socket address object, or as arguments the same as would be taken by make-socket-address to create such an object (see Network Socket Address).
The destination address may be followed by an optional flags argument which is a logior (see Bitwise Operations) of MSG_OOB, MSG_PEEK, MSG_DONTROUTE etc.
The value returned is the number of bytes transmitted; it's possible for this to be less than the length of message if the socket is set to be non-blocking. Note that the data is written directly to the socket file descriptor: any unflushed buffered port data is ignored.
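As an illustration of how sendto and recvfrom! fit together, here is a minimal sketch that sends one datagram to a socket bound on the loopback interface and reads it back. The helper name udp-loopback-roundtrip and the 64-byte buffer size are assumptions made for this example; it also assumes that loopback UDP is available and that the datagram is not dropped.

```scheme
(use-modules (rnrs bytevectors))

;; Hypothetical helper: send TEXT as one UDP datagram over loopback
;; and return a list of (received-text byte-count sender-family).
(define (udp-loopback-roundtrip text)
  (let ((rx (socket PF_INET SOCK_DGRAM 0))
        (tx (socket PF_INET SOCK_DGRAM 0))
        (buf (make-bytevector 64 0)))   ; longer datagrams are truncated
    ;; Port 0 asks the kernel to pick a free port for the receiver.
    (bind rx AF_INET INADDR_LOOPBACK 0)
    (let ((port (sockaddr:port (getsockname rx))))
      (sendto tx (string->utf8 text) AF_INET INADDR_LOOPBACK port)
      (let* ((result (recvfrom! rx buf))
             (count (car result))       ; number of bytes read
             (sender (cdr result))      ; socket address object, or #f
             (data (make-bytevector count)))
        ;; Copy only the bytes that were actually received.
        (bytevector-copy! buf 0 data 0 count)
        (close tx)
        (close rx)
        (list (utf8->string data)
              count
              (and sender (sockaddr:fam sender)))))))

(udp-loopback-roundtrip "ping")
```

Note that the count in the car of the pair, not the size of buf, says how much of the buffer holds received data.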
The following functions can be used to convert short and long integers between “host” and “network” order. Although the procedures above do this automatically for addresses, the conversion will still need to be done when sending or receiving encoded integer data from the network.
Convert a 16 bit quantity from host to network byte ordering. value is packed into 2 bytes, which are then converted and returned as a new integer.
Convert a 16 bit quantity from network to host byte ordering. value is packed into 2 bytes, which are then converted and returned as a new integer.
Convert a 32 bit quantity from host to network byte ordering. value is packed into 4 bytes, which are then converted and returned as a new integer.
Convert a 32 bit quantity from network to host byte ordering. value is packed into 4 bytes, which are then converted and returned as a new integer.
These procedures are inconvenient to use at present, but consider:
(define write-network-long
  (lambda (value port)
    (let ((v (make-uniform-vector 1 1 0)))
      (uniform-vector-set! v 0 (htonl value))
      (uniform-vector-write v port))))

(define read-network-long
  (lambda (port)
    (let ((v (make-uniform-vector 1 1 0)))
      (uniform-vector-read! v port)
      (ntohl (uniform-vector-ref v 0)))))
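An alternative sketch, assuming the (rnrs bytevectors) module is available, packs a 32-bit integer into a bytevector in network (big-endian) order explicitly. The helper names network-long->bytevector and bytevector->network-long are made up for this example and are not part of Guile.

```scheme
(use-modules (rnrs bytevectors))

;; Hypothetical helper: encode VALUE as 4 bytes in network byte order.
(define (network-long->bytevector value)
  (let ((bv (make-bytevector 4 0)))
    (bytevector-u32-set! bv 0 value (endianness big))
    bv))

;; Hypothetical helper: decode 4 network-order bytes back to an integer.
(define (bytevector->network-long bv)
  (bytevector-u32-ref bv 0 (endianness big)))

(bytevector->u8-list (network-long->bytevector #x01020304))
;; ⇒ (1 2 3 4)

(bytevector->network-long (network-long->bytevector 2904))
;; ⇒ 2904

;; htonl and ntohl are inverses of each other on any host.
(= (ntohl (htonl 2904)) 2904)
;; ⇒ #t
```

Stating the endianness at the point of encoding avoids depending on the host byte order, and the resulting bytevector can be handed directly to procedures such as send.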
The following give examples of how to use network sockets.
The following example demonstrates an Internet socket client. It connects to the HTTP daemon running on the local machine and returns the contents of the root index URL.
(let ((s (socket PF_INET SOCK_STREAM 0)))
  (connect s AF_INET (inet-pton AF_INET "127.0.0.1") 80)
  (display "GET / HTTP/1.0\r\n\r\n" s)

  (do ((line (read-line s) (read-line s)))
      ((eof-object? line))
    (display line)
    (newline)))
The following example shows a simple Internet server which listens on port 2904 for incoming connections and sends a greeting back to the client.
(let ((s (socket PF_INET SOCK_STREAM 0)))
  (setsockopt s SOL_SOCKET SO_REUSEADDR 1)
  ;; Specific address?
  ;; (bind s AF_INET (inet-pton AF_INET "127.0.0.1") 2904)
  (bind s AF_INET INADDR_ANY 2904)
  (listen s 5)

  (simple-format #t "Listening for clients in pid: ~S" (getpid))
  (newline)

  (while #t
    (let* ((client-connection (accept s))
           (client-details (cdr client-connection))
           (client (car client-connection)))
      (simple-format #t "Got new client connection: ~S"
                     client-details)
      (newline)
      (simple-format #t "Client address: ~S"
                     (gethostbyaddr
                      (sockaddr:addr client-details)))
      (newline)
      ;; Send back the greeting to the client port
      (display "Hello client\r\n" client)
      (close client))))
This section lists the various procedures Guile provides for accessing information about the system it runs on.
Return an object with some information about the computer system the program is running on.
The following procedures accept an object as returned by uname and return a selected component (all of which are strings).
— Scheme Procedure: utsname:release un
The current release level of the operating system implementation.
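A short sketch of how the uname object might be inspected. It uses utsname:release as documented here, plus utsname:sysname and utsname:machine, which are the companion accessors for the system name and hardware type. The printed values depend entirely on the machine running the code.

```scheme
;; Print a one-line summary of the running system.
(let ((un (uname)))
  (simple-format #t "~A ~A on ~A\n"
                 (utsname:sysname un)    ; operating system name
                 (utsname:release un)    ; release level
                 (utsname:machine un)))  ; hardware type
```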
Set the host name of the current processor to name. May only be used by the superuser. The return value is not specified.
Get or set the current locale, used for various internationalizations. Locales are strings, such as ‘sv_SE’.
If locale is given then the locale for the given category is set and the new value returned. If locale is not given then the current value is returned. category should be one of the following values (see Categories of Activities that Locales Affect):
— Variable: LC_ALL
— Variable: LC_COLLATE
— Variable: LC_CTYPE
— Variable: LC_MESSAGES
— Variable: LC_MONETARY
— Variable: LC_NUMERIC
— Variable: LC_TIME
A common usage is ‘(setlocale LC_ALL "")’, which initializes all categories based on standard environment variables (LANG etc). For full details on categories and locale names see Locales and Internationalization.
Note that setlocale affects locale settings for the whole process. See locale objects and make-locale, for a thread-safe alternative.
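The get and set forms can be sketched as follows. The wrapper try-setlocale is a hypothetical helper, written on the assumption that setlocale signals a system-error when the C library rejects the requested locale (for instance when the environment names a locale that is not installed).

```scheme
;; Hypothetical helper: set LOCALE for CATEGORY, returning the new
;; value, or #f if the C library rejects the request.
(define (try-setlocale category locale)
  (catch 'system-error
    (lambda () (setlocale category locale))
    (lambda args #f)))

;; Initialize all categories from the environment, tolerating failure.
(try-setlocale LC_ALL "")

;; Set a single category explicitly; the new value is returned.
(setlocale LC_NUMERIC "C")
;; ⇒ "C"

;; With no locale argument, the current value is returned unchanged.
(setlocale LC_NUMERIC)
;; ⇒ "C"
```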
Please note that the procedures in this section are not suited for strong encryption, they are only interfaces to the well-known and common system library functions of the same name. They are just as good (or bad) as the underlying functions, so you should refer to your system documentation before using them (see Encrypting Passwords).
Encrypt key, with the addition of salt (both strings), using the crypt C library call.
Although getpass is not an encryption procedure per se, it
appears here because it is often used in combination with crypt:
Display prompt to the standard error output and read a password from /dev/tty. If this file is not accessible, it reads from standard input. The password may be up to 127 characters in length. Additional characters and the terminating newline character are discarded. While reading the password, echoing and the generation of signals by special characters is disabled.
It has always been possible to connect computers together and share information between them, but the rise of the World-Wide Web over the last couple of decades has made it much easier to do so. The result is a richly connected network of computation, in which Guile forms a part.
By “the web”, we mean the HTTP protocol as handled by servers, clients, proxies, caches, and the various kinds of messages and message components that can be sent and received by that protocol, notably HTML.
On one level, the web is text in motion: the protocols themselves are textual (though the payload may be binary), and it's possible to create a socket and speak text to the web. But such an approach is obviously primitive. This section details the higher-level data types and operations provided by Guile: URIs, HTTP request and response records, and a conventional web server implementation.
The material in this section is arranged in ascending order, in which later concepts build on previous ones. If you prefer to start with the highest-level perspective, see Web Examples, and work your way back.
It is a truth universally acknowledged, that a program with good use of data types, will be free from many common bugs. Unfortunately, the common practice in web programming seems to ignore this maxim. This subsection makes the case for expressive data types in web programming.
By “expressive data types”, we mean that the data types say something about how a program solves a problem. For example, if we choose to represent dates using SRFI 19 date records (see SRFI-19), this indicates that there is a part of the program that will always have valid dates. Error handling for a number of basic cases, like invalid dates, occurs on the boundary in which we produce a SRFI 19 date record from other types, like strings.
With regards to the web, data types are helpful in the two broad phases of HTTP messages: parsing and generation.
Consider a server, which has to parse a request, and produce a response. Guile will parse the request into an HTTP request object (see Requests), with each header parsed into an appropriate Scheme data type. This transition from an incoming stream of characters to typed data is a state change in a program—the strings might parse, or they might not, and something has to happen if they do not. (Guile throws an error in this case.) But after you have the parsed request, “client” code (code built on top of the Guile web framework) will not have to check for syntactic validity. The types already make this information manifest.
This state change on the parsing boundary makes programs more robust, as they themselves are freed from the need to do a number of common error checks, and they can use normal Scheme procedures to handle a request instead of ad-hoc string parsers.
The need for types on the response generation side (in a server) is more subtle, though not less important. Consider the example of a POST handler, which prints out the text that a user submits from a form. Such a handler might include a procedure like this:
;; First, a helper procedure
(define (para . contents)
(string-append "<p>" (string-concatenate contents) "</p>"))
;; Now the meat of our simple web application
(define (you-said text)
(para "You said: " text))
(display (you-said "Hi!"))
-| <p>You said: Hi!</p>
This is a perfectly valid implementation, provided that the incoming text does not contain the special HTML characters ‘<’, ‘>’, or ‘&’. But this provision of a restricted character set is not reflected anywhere in the program itself: we must assume that the programmer understands this, and performs the check elsewhere.
Unfortunately, the short history of the practice of programming does not bear out this assumption. A cross-site scripting (XSS) vulnerability is just such a common error in which unfiltered user input is allowed into the output. A user could submit a crafted comment to your web site which results in visitors running malicious Javascript, within the security context of your domain:
(display (you-said "<script src=\"http://bad.com/nasty.js\" />"))
-| <p>You said: <script src="http://bad.com/nasty.js" /></p>
The fundamental problem here is that both user data and the program template are represented using strings. This identity means that types can't help the programmer to make a distinction between these two, so they get confused.
There are a number of possible solutions, but perhaps the best is to treat HTML not as strings, but as native s-expressions: as SXML. The basic idea is that HTML is either text, represented by a string, or an element, represented as a tagged list. So ‘foo’ becomes ‘"foo"’, and ‘<b>foo</b>’ becomes ‘(b "foo")’. Attributes, if present, go in a tagged list headed by ‘@’, like ‘(img (@ (src "http://example.com/foo.png")))’. See sxml simple, for more information.
The good thing about SXML is that HTML elements cannot be confused with
text. Let's make a new definition of para:
(define (para . contents)
`(p ,@contents))
(use-modules (sxml simple))
(sxml->xml (you-said "Hi!"))
-| <p>You said: Hi!</p>
(sxml->xml (you-said "<i>Rats, foiled again!</i>"))
-| <p>You said: &lt;i&gt;Rats, foiled again!&lt;/i&gt;</p>
So we see in the second example that HTML elements cannot be unwittingly
introduced into the output. However it is now perfectly acceptable to
pass SXML to you-said; in fact, that is the big advantage of SXML
over everything-as-a-string.
(sxml->xml (you-said (you-said "<Hi!>")))
-| <p>You said: <p>You said: &lt;Hi!&gt;</p></p>
The SXML types allow procedures to compose. The types make manifest which parts are HTML elements, and which are text. So you needn't worry about escaping user input; the type transition back to a string handles that for you. XSS vulnerabilities are a thing of the past.
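To make the type distinction concrete, here is a small self-contained sketch that serializes to a string so the escaping can be inspected. The helper sxml->xml-string is an assumption of this example, not part of (sxml simple); para and you-said are repeated from the text so the block runs on its own.

```scheme
(use-modules (sxml simple))

(define (para . contents)
  `(p ,@contents))

(define (you-said text)
  (para "You said: " text))

;; Hypothetical helper: capture the output of sxml->xml in a string.
(define (sxml->xml-string tree)
  (with-output-to-string
    (lambda () (sxml->xml tree))))

;; A string is always text, so markup characters are escaped.
(sxml->xml-string (you-said "<script>alert(1)</script>"))
;; ⇒ "<p>You said: &lt;script&gt;alert(1)&lt;/script&gt;</p>"

;; A tagged list is always an element, so it is emitted as markup.
(sxml->xml-string (you-said '(b "Hi!")))
;; ⇒ "<p>You said: <b>Hi!</b></p>"
```

The caller decides whether something is markup by choosing the type of the value, rather than by remembering to call an escaping routine.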
Well. That's all very nice and opinionated and such, but how do I use the thing? Read on!
Guile provides a standard data type for Universal Resource Identifiers (URIs), as defined in RFC 3986.
The generic URI syntax is as follows:
URI := scheme ":" ["//" [userinfo "@"] host [":" port]] path \
[ "?" query ] [ "#" fragment ]
For example, in the URI, <http://www.gnu.org/help/>, the
scheme is http, the host is www.gnu.org, the path is
/help/, and there is no userinfo, port, query, or fragment. All URIs
have a scheme and a path (though the path might be empty). Some URIs
have a host, and some of those have ports and userinfo. Any URI might
have a query part or a fragment.
Userinfo is something of an abstraction, as some legacy URI schemes
allowed userinfo of the form username:passwd. But
since passwords do not belong in URIs, the RFC does not want to condone
this practice, so it calls anything before the @ sign
userinfo.
Properly speaking, a fragment is not part of a URI. For example, when a
web browser follows a link to <http://example.com/#foo>, it
sends a request for <http://example.com/>, then looks in the
resulting page for the fragment identified by foo. A
fragment identifies a part of a resource, not the resource itself. But
it is useful to have a fragment field in the URI record itself, so we
hope you will forgive the inconsistency.
(use-modules (web uri))
The following procedures can be found in the (web uri)
module. Load it into your Guile, using a form like the above, to have
access to them.
— Scheme Procedure: build-uri scheme [#:userinfo=#f] [#:host=#f] [#:port=#f] [#:path=""] [#:query=#f] [#:fragment=#f] [#:validate?=#t]
Construct a URI object. scheme should be a symbol, and the rest of the fields are either strings or #f. If validate? is true, also run some consistency checks to make sure that the constructed URI is valid.
A predicate and field accessors for the URI record type. The URI scheme will be a symbol, and the rest either strings or #f if not present.
Parse string into a URI object. Return #f if the string could not be parsed.
Serialize uri to a string. If the URI has a port that is the default port for its scheme, the port is not included in the serialization.
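The following sketch exercises the constructor, parser, accessors and serializer described above. It uses build-uri, string->uri, uri->string and the uri-scheme, uri-host, uri-port, uri-path, uri-query and uri-fragment accessors from (web uri). The example URLs are arbitrary.

```scheme
(use-modules (web uri))

;; Parse a string and pick the record apart.
(define u (string->uri "http://www.gnu.org/help/?q=guile#top"))

(uri-scheme u)    ; ⇒ http
(uri-host u)      ; ⇒ "www.gnu.org"
(uri-port u)      ; ⇒ #f
(uri-path u)      ; ⇒ "/help/"
(uri-query u)     ; ⇒ "q=guile"
(uri-fragment u)  ; ⇒ "top"

;; Build a URI from its parts and serialize it again.
(uri->string (build-uri 'http #:host "www.gnu.org" #:path "/help/"))
;; ⇒ "http://www.gnu.org/help/"

;; Unparseable input yields #f rather than an error.
(string->uri "not a uri")
;; ⇒ #f
```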
Declare a default port for the given URI scheme.
— Scheme Procedure: uri-decode str [#:encoding="utf-8"]
Percent-decode the given str, according to encoding, which should be the name of a character encoding.
Note that this function should not generally be applied to a full URI string. For paths, use split-and-decode-uri-path instead. For query strings, split the query on & and = boundaries, and decode the components separately.
Note also that percent-encoded strings encode bytes, not characters. There is no guarantee that a given byte sequence is a valid string encoding. Therefore this routine may signal an error if the decoded bytes are not valid for the given encoding. Pass #f for encoding if you want decoded bytes as a bytevector directly. See set-port-encoding!, for more information on character encodings.
Returns a string of the decoded characters, or a bytevector if encoding was #f.
Fixme: clarify return type. indicate default values. type of unescaped-chars.
— Scheme Procedure: uri-encode str [#:encoding="utf-8"] [#:unescaped-chars]
Percent-encode any character not in the character set, unescaped-chars.
The default character set includes alphanumerics from ASCII, as well as the special characters ‘-’, ‘.’, ‘_’, and ‘~’. Any other character will be percent-encoded, by writing out the character to a bytevector within the given encoding, then encoding each byte as %HH, where HH is the hexadecimal representation of the byte.
Split path into its components, and decode each component, removing empty components.
For example, "/foo/bar%20baz/" decodes to the two-element list, ("foo" "bar baz").
URI-encode each element of parts, which should be a list of strings, and join the parts together with / as a delimiter.
For example, the list ("scrambled eggs" "biscuits&gravy") encodes as "scrambled%20eggs/biscuits%26gravy".
The initial motivation for including web functionality in Guile, rather than rely on an external package, was to establish a standard base on which people can share code. To that end, we continue the focus on data types by providing a number of low-level parsers and unparsers for elements of the HTTP protocol.
If you want to skip the low-level details for now and move on to web pages, see Web Client, and see Web Server. Otherwise, load the HTTP module, and read on.
(use-modules (web http))
The focus of the (web http) module is to parse and unparse
standard HTTP headers, representing them to Guile as native data
structures. For example, a Date: header will be represented as a
SRFI-19 date record (see SRFI-19), rather than as a string.
Guile tries to follow RFCs fairly strictly—the road to perdition being paved with compatibility hacks—though some allowances are made for not-too-divergent texts.
Header names are represented as lower-case symbols.
For example:
(string->header "Content-Length")
⇒ content-length
(header->string 'content-length)
⇒ "Content-Length"
(string->header "FOO")
⇒ foo
(header->string 'foo)
⇒ "Foo"
Guile keeps a registry of known headers, their string names, and some parsing and serialization procedures. If a header is unknown, its string name is simply its symbol name in title-case.
Return #t iff sym is a known header, with associated parsers and serialization procedures.
Return the value parser for headers named sym. The result is a procedure that takes one argument, a string, and returns the parsed value. If the header isn't known to Guile, a default parser is returned that passes through the string unchanged.
Return a predicate which returns #t if the given value is valid for headers named sym. The default validator for unknown headers is string?.
Return a procedure that writes values for headers named sym to a port. The resulting procedure takes two arguments: a value and a port. The default writer is display.
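A brief sketch of querying the header registry, using the known-header?, header-parser, header-validator and header-writer procedures from (web http). The header name x-made-up is an arbitrary example of a header Guile does not know about.

```scheme
(use-modules (web http))

(known-header? 'content-length)   ; ⇒ #t
(known-header? 'x-made-up)        ; ⇒ #f

;; The parser turns the wire string into a Scheme value.
((header-parser 'content-length) "300")
;; ⇒ 300

;; Unknown headers get a pass-through parser.
((header-parser 'x-made-up) "anything")
;; ⇒ "anything"

;; The validator accepts parsed values, not wire strings.
((header-validator 'content-length) 300)     ; true
((header-validator 'content-length) "300")   ; ⇒ #f

;; The writer prints a value to a port.
(with-output-to-string
  (lambda ()
    ((header-writer 'content-length) 300 (current-output-port))))
;; ⇒ "300"
```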
For more on the set of headers that Guile knows about out of the box,
see HTTP Headers. To add your own, use the declare-header!
procedure:
— Scheme Procedure: declare-header! name parser validator writer [#:multiple?=#f]
Declare a parser, validator, and writer for a given header.
For example, let's say you are running a web server behind some sort of
proxy, and your proxy adds an X-Client-Address header, indicating
the IPv4 address of the original client. You would like for the HTTP
request record to parse out this header to a Scheme value, instead of
leaving it as a string. You could register this header with Guile's
HTTP stack like this:
(declare-header! "X-Client-Address"
  (lambda (str)
    (inet-aton str))
  (lambda (ip)
    (and (integer? ip) (exact? ip) (<= 0 ip #xffffffff)))
  (lambda (ip port)
    (display (inet-ntoa ip) port)))
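Once a header is declared, the generic header procedures pick it up. The sketch below repeats the declaration so that it is self-contained, but swaps in inet-pton and inet-ntop (with AF_INET) for inet-aton and inet-ntoa. That substitution is a choice made for this example, not a requirement.

```scheme
(use-modules (web http))

(declare-header! "X-Client-Address"
  ;; parser: dotted-quad string → integer
  (lambda (str)
    (inet-pton AF_INET str))
  ;; validator: an exact integer that fits in 32 bits
  (lambda (ip)
    (and (integer? ip) (exact? ip) (<= 0 ip #xffffffff)))
  ;; writer: integer → dotted-quad string
  (lambda (ip port)
    (display (inet-ntop AF_INET ip) port)))

(parse-header 'x-client-address "127.0.0.1")
;; ⇒ 2130706433

(valid-header? 'x-client-address 2130706433)   ; true
(valid-header? 'x-client-address "127.0.0.1")  ; ⇒ #f

(with-output-to-string
  (lambda ()
    (write-header 'x-client-address 2130706433 (current-output-port))))
;; ⇒ "X-Client-Address: 127.0.0.1\r\n"
```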
Return a true value iff val is a valid Scheme value for the header with name sym.
Now that we have a generic interface for reading and writing headers, we do just that.
Read one HTTP header from port. Return two values: the header name and the parsed Scheme value. May raise an exception if the header was known but the value was invalid.
Returns the end-of-file object for both values if the end of the message body was reached (i.e., a blank line).
Parse val, a string, with the parser for the header named name. Returns the parsed value.
Write the given header name and value to port, using the writer from header-writer.
Read the headers of an HTTP message from port, returning the headers as an ordered alist.
Write the given header alist to port. Doesn't write the final ‘\r\n’, as the user might want to add another header.
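Reading and writing a block of headers can be sketched with string ports, which stand in here for a network connection. The sample header text is invented for the example.

```scheme
(use-modules (web http))

;; Read headers up to the blank line that terminates them.
(define headers
  (call-with-input-string
      "Content-Length: 300\r\nHost: gnu.org\r\n\r\n"
    read-headers))

headers
;; ⇒ ((content-length . 300) (host "gnu.org" . #f))

;; Write them back out; the terminating blank line is not written.
(call-with-output-string
  (lambda (port)
    (write-headers headers port)))
;; ⇒ "Content-Length: 300\r\nHost: gnu.org\r\n"
```

The host entry prints as (host "gnu.org" . #f) because its value is itself the pair ("gnu.org" . #f).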
The (web http) module also has some utility procedures to read
and write request and response lines.
Parse an HTTP method from str. The result is an upper-case symbol, like GET.
Parse an HTTP version from str, returning it as a major-minor pair. For example, HTTP/1.1 parses as the pair of integers, (1 . 1).
Parse a URI from an HTTP request line. Note that URIs in requests do not have to have a scheme or host name. The result is a URI object.
Read the first line of an HTTP request from port, returning three values: the method, the URI, and the version.
Write the first line of an HTTP request to port.
Read the first line of an HTTP response from port, returning three values: the HTTP version, the response code, and the "reason phrase".
Write the first line of an HTTP response to port.
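A sketch of the request-line and response-line utilities, again using string ports in place of sockets. It relies on parse-http-method, parse-http-version, read-request-line and write-response-line from (web http). The request text is invented for the example.

```scheme
(use-modules (web http) (web uri))

(parse-http-method "GET")         ; ⇒ GET
(parse-http-version "HTTP/1.1")   ; ⇒ (1 . 1)

;; read-request-line returns three values: method, URI and version.
(call-with-values
    (lambda ()
      (call-with-input-string "GET /index.html HTTP/1.1\r\n"
        read-request-line))
  (lambda (method uri version)
    (list method (uri-path uri) version)))
;; ⇒ (GET "/index.html" (1 . 1))

;; Write a status line for a response.
(call-with-output-string
  (lambda (port)
    (write-response-line '(1 . 1) 200 "OK" port)))
;; ⇒ "HTTP/1.1 200 OK\r\n"
```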
In addition to defining the infrastructure to parse headers, the
(web http) module defines specific parsers and unparsers for all
headers defined in the HTTP/1.1 standard.
For example, if you receive a header named ‘Accept-Language’ with a value ‘en, es;q=0.8’, Guile parses it as a quality list (defined below):
(parse-header 'accept-language "en, es;q=0.8")
⇒ ((1000 . "en") (800 . "es"))
The format of the value for ‘Accept-Language’ headers is defined below, along with all other headers defined in the HTTP standard. (If the header were unknown, the value would have been returned as a string.)
For brevity, the header definitions below are given in the form, Type name, indicating that values for the header name will be of the given Type. Since Guile internally treats header names in lower case, in this document we give types title-cased names. A short description of each header's purpose and an example follow.
For full details on the meanings of all of these headers, see the HTTP 1.1 standard, RFC 2616.
Here we define the types that are used below, when defining headers.
A list whose elements are keys or key-value pairs. Keys are parsed to symbols. Values are strings by default. Non-string values are the exception, and are mentioned explicitly below, as appropriate.
An exact integer between 0 and 1000. Qualities are used to express preference, given multiple options. An option with a quality of 870, for example, is preferred over an option with quality 500.
(Qualities are written out over the wire as numbers between 0.0 and 1.0, but since the standard only allows three digits after the decimal, it's equivalent to integers between 0 and 1000, so that's what Guile uses.)
A quality list: a list of pairs, the car of which is a quality, and the cdr a string. Used to express a list of options, along with their qualities.
An entity tag, represented as a pair. The car of the pair is an opaque string, and the cdr is #t if the entity tag is a “strong” entity tag, and #f otherwise.
General HTTP headers may be present in any HTTP message.
A key-value list of cache-control directives. See RFC 2616, for more details.
If present, parameters to max-age, max-stale, min-fresh, and s-maxage are all parsed as non-negative integers.
If present, parameters to private and no-cache are parsed as lists of header names, as symbols.
(parse-header 'cache-control "no-cache,no-store")
⇒ (no-cache no-store)
(parse-header 'cache-control "no-cache=\"Authorization,Date\",no-store")
⇒ ((no-cache . (authorization date)) no-store)
(parse-header 'cache-control "no-cache=\"Authorization,Date\",max-age=10")
⇒ ((no-cache . (authorization date)) (max-age . 10))
A list of header names that apply only to this HTTP connection, as symbols. Additionally, the symbol ‘close’ may be present, to indicate that the server should close the connection after responding to the request.
(parse-header 'connection "close") ⇒ (close)
The date that a given HTTP message was originated.
(parse-header 'date "Tue, 15 Nov 1994 08:12:31 GMT") ⇒ #<date ...>
A key-value list of implementation-specific directives.
(parse-header 'pragma "no-cache, broccoli=tasty") ⇒ (no-cache (broccoli . "tasty"))
A list of header names which will appear after the message body, instead of with the message headers.
(parse-header 'trailer "ETag") ⇒ (etag)
A list of transfer codings, expressed as key-value lists. The only transfer coding defined by the specification is chunked.
(parse-header 'transfer-encoding "chunked")
⇒ ((chunked))
A list of strings, indicating additional protocols that a server could use in response to a request.
(parse-header 'upgrade "WebSocket") ⇒ ("WebSocket")
FIXME: parse out more fully?
A list of strings, indicating the protocol versions and hosts of intermediate servers and proxies. There may be multiple via headers in one message.
(parse-header 'via "1.0 venus, 1.1 mars")
⇒ ("1.0 venus" "1.1 mars")
A list of warnings given by a server or intermediate proxy. Each warning is itself a list of four elements: a code, as an exact integer between 0 and 1000, a host as a string, the warning text as a string, and either #f or a SRFI-19 date.
There may be multiple warning headers in one message.
(parse-header 'warning "123 foo \"core breach imminent\"")
⇒ ((123 "foo" "core breach imminent" #f))
Entity headers may be present in any HTTP message, and refer to the resource referenced in the HTTP request or response.
A list of allowed methods on a given resource, as symbols.
(parse-header 'allow "GET, HEAD") ⇒ (GET HEAD)
A list of content codings, as symbols.
(parse-header 'content-encoding "gzip") ⇒ (gzip)
The languages that a resource is in, as strings.
(parse-header 'content-language "en") ⇒ ("en")
The number of bytes in a resource, as an exact, non-negative integer.
(parse-header 'content-length "300") ⇒ 300
The canonical URI for a resource, in the case that it is also accessible from a different URI.
(parse-header 'content-location "http://example.com/foo") ⇒ #<<uri> ...>
The MD5 digest of a resource.
(parse-header 'content-md5 "ffaea1a79810785575e29e2bd45e2fa5") ⇒ "ffaea1a79810785575e29e2bd45e2fa5"
A range specification, as a list of three elements: the symbol bytes, either the symbol * or a pair of integers, indicating the byte range, and either * or an integer, for the instance length. Used to indicate that a response only includes part of a resource.
(parse-header 'content-range "bytes 10-20/*")
⇒ (bytes (10 . 20) *)
The MIME type of a resource, as a symbol, along with any parameters.
(parse-header 'content-type "text/plain")
⇒ (text/plain)
(parse-header 'content-type "text/plain;charset=utf-8")
⇒ (text/plain (charset . "utf-8"))
Note that the charset parameter is something of a misnomer, and the HTTP specification admits this. It specifies the encoding of the characters, not the character set.
The date/time after which the resource given in a response is considered stale.
(parse-header 'expires "Tue, 15 Nov 1994 08:12:31 GMT") ⇒ #<date ...>
The date/time on which the resource given in a response was last modified.
(parse-header 'last-modified "Tue, 15 Nov 1994 08:12:31 GMT") ⇒ #<date ...>
Request headers may only appear in an HTTP request, not in a response.
A list of preferred media types for a response. Each element of the list is itself a list, in the same format as content-type.
(parse-header 'accept "text/html,text/plain;charset=utf-8")
⇒ ((text/html) (text/plain (charset . "utf-8")))
Preference is expressed with quality values:
(parse-header 'accept "text/html;q=0.8,text/plain;q=0.6")
⇒ ((text/html (q . 800)) (text/plain (q . 600)))
A quality list of acceptable charsets. Note again that what HTTP calls a “charset” is what Guile calls a “character encoding”.
(parse-header 'accept-charset "iso-8859-5, unicode-1-1;q=0.8") ⇒ ((1000 . "iso-8859-5") (800 . "unicode-1-1"))
A quality list of acceptable content codings.
(parse-header 'accept-encoding "gzip,identity;q=0.8") ⇒ ((1000 . "gzip") (800 . "identity"))
A quality list of acceptable languages.
(parse-header 'accept-language "cn,en;q=0.75") ⇒ ((1000 . "cn") (750 . "en"))
Authorization credentials. The car of the pair indicates the authentication scheme, like basic. For basic authentication, the cdr of the pair will be the base64-encoded ‘user:pass’ string. For other authentication schemes, like digest, the cdr will be a key-value list of credentials.
(parse-header 'authorization "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==")
⇒ (basic . "QWxhZGRpbjpvcGVuIHNlc2FtZQ==")
A list of expectations that a client has of a server. The expectations are key-value lists.
(parse-header 'expect "100-continue") ⇒ ((100-continue))
The email address of a user making an HTTP request.
(parse-header 'from "bob@example.com") ⇒ "bob@example.com"
The host for the resource being requested, as a hostname-port pair. If no port is given, the port is #f.
(parse-header 'host "gnu.org:80")
⇒ ("gnu.org" . 80)
(parse-header 'host "gnu.org")
⇒ ("gnu.org" . #f)
A set of etags, indicating that the request should proceed if and only if the etag of the resource is in that set. Either the symbol *, indicating any etag, or a list of entity tags.
(parse-header 'if-match "*")
⇒ *
(parse-header 'if-match "asdfadf")
⇒ (("asdfadf" . #t))
(parse-header 'if-match "W/\"asdfadf\"")
⇒ (("asdfadf" . #f))
Indicates that a response should proceed if and only if the resource has been modified since the given date.
(parse-header 'if-modified-since "Tue, 15 Nov 1994 08:12:31 GMT") ⇒ #<date ...>
A set of etags, indicating that the request should proceed if and only if the etag of the resource is not in the set. Either the symbol *, indicating any etag, or a list of entity tags.
(parse-header 'if-none-match "*")
⇒ *
Indicates that the range request should proceed if and only if the resource matches a modification date or an etag. Either an entity tag, or a SRFI-19 date.
(parse-header 'if-range "\"original-etag\"") ⇒ ("original-etag" . #t)
Indicates that a response should proceed if and only if the resource has not been modified since the given date.
(parse-header 'if-unmodified-since "Tue, 15 Nov 1994 08:12:31 GMT") ⇒ #<date ...>
The maximum number of proxy or gateway hops that a request should be subject to.
(parse-header 'max-forwards "10") ⇒ 10
Authorization credentials for a proxy connection. See the documentation for authorization above for more information on the format.
(parse-header 'proxy-authorization "Digest foo=bar,baz=qux")
⇒ (digest (foo . "bar") (baz . "qux"))
A range request, indicating that the client wants only part of a resource. The car of the pair is the symbol bytes, and the cdr is a list of pairs. Each element of the cdr indicates a range; the car is the first byte position and the cdr is the last byte position, as integers, or #f if not given.
(parse-header 'range "bytes=10-30,50-")
⇒ (bytes (10 . 30) (50 . #f))
The URI of the resource that referred the user to this resource. The name of the header is a misspelling, but we are stuck with it.
(parse-header 'referer "http://www.gnu.org/") ⇒ #<uri ...>
A list of transfer codings, expressed as key-value lists. A common transfer coding is trailers.
(parse-header 'te "trailers")
⇒ ((trailers))
A string indicating the user agent making the request. The specification defines a structured format for this header, but it is widely disregarded, so Guile does not attempt to parse strictly.
(parse-header 'user-agent "Mozilla/5.0") ⇒ "Mozilla/5.0"
A list of range units that the server supports, as symbols.
(parse-header 'accept-ranges "bytes") ⇒ (bytes)
The entity-tag of the resource.
(parse-header 'etag "\"foo\"") ⇒ ("foo" . #t)
A URI on which a request may be completed. Used in combination with a redirecting status code to perform client-side redirection.
(parse-header 'location "http://example.com/other") ⇒ #<uri ...>
A list of challenges to a proxy, indicating the need for authentication.
(parse-header 'proxy-authenticate "Basic realm=\"foo\"") ⇒ ((basic (realm . "foo")))
Used in combination with a server-busy status code, like 503, to indicate that a client should retry later. Either a number of seconds, or a date.
(parse-header 'retry-after "60") ⇒ 60
A string identifying the server.
(parse-header 'server "My first web server") ⇒ "My first web server"
A set of request headers that were used in computing this response. Used to indicate that server-side content negotiation was performed, for example in response to the accept-language header. Can also be the symbol *, indicating that all headers were considered.
(parse-header 'vary "Accept-Language, Accept")
⇒ (accept-language accept)
A list of challenges to a user, indicating the need for authentication.
(parse-header 'www-authenticate "Basic realm=\"foo\"") ⇒ ((basic (realm . "foo")))
(use-modules (web request))
The request module contains a data type for HTTP requests.
HTTP requests consist of two parts: the request proper, consisting of a request line and a set of headers, and (optionally) a body. The body might have a binary content-type, and even in the textual case its length is specified in bytes, not characters.
Therefore, HTTP is a fundamentally binary protocol. However the request line and headers are specified to be in a subset of ASCII, so they can be treated as text, provided that the port's encoding is set to an ASCII-compatible one-byte-per-character encoding. ISO-8859-1 (latin-1) is just such an encoding, and happens to be very efficient for Guile.
So what Guile does when reading requests from the wire, or writing them out, is to set the port's encoding to latin-1, and to treat the request headers as text.
The request body is another issue. For binary data, the data is probably in a bytevector, so we use the R6RS binary output procedures to write out the binary payload. Textual data usually has to be written out to some character encoding, usually UTF-8, and then the resulting bytevector is written out to the port.
In summary, Guile reads and writes HTTP over latin-1 sockets, without any loss of generality.
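As a minimal sketch of the textual case described above, the following encodes a string as UTF-8 and writes the resulting bytevector with the R6RS binary output procedures. The helper name text->body and the in-memory port are assumptions made for illustration; a real server would write to the client's port instead.

```scheme
(use-modules (rnrs bytevectors)
             (rnrs io ports))

;; Hypothetical helper: encode a textual body as UTF-8 bytes.
(define (text->body text)
  (string->utf8 text))

;; The word "cafe" with an accented final letter, built with
;; integer->char so that the example does not depend on the encoding
;; of the source file.
(define text (string-append "caf" (string (integer->char 233))))
(define body (text->body text))

;; Lengths are counted in bytes on the wire, not in characters.
(define char-count (string-length text))
(define byte-count (bytevector-length body))

;; Write the payload to a binary port and collect what was written.
(define written
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytes)
      (put-bytevector port body)
      (get-bytes))))
```

Here char-count is 4 while byte-count is 5, which is why body lengths are always specified in bytes.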
A predicate and field accessors for the request type. The fields are as follows:
method - The HTTP method, for example, GET.
uri - The URI as a URI record.
version - The HTTP version pair, like (1 . 1).
headers - The request headers, as an alist of parsed values.
meta - An arbitrary alist of other data, for example information returned in the sockaddr from accept (see Network Sockets and Communication).
port - The port on which to read or write a request body, if any.
Read an HTTP request from port, optionally attaching the given metadata, meta.
As a side effect, sets the encoding on port to ISO-8859-1 (latin-1), so that reading one character reads one byte. See the discussion of character sets above, for more information.
Note that the body is not part of the request. Once you have read a request, you may read the body separately, and likewise for writing requests.
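The following sketch reads a request from a string port instead of a socket, which is a convenient way to experiment at the REPL. The request text is made up for illustration and carries no body.

```scheme
(use-modules (web request)
             (web uri))

;; A request as it might arrive on the wire (hypothetical content).
(define wire-text
  "GET /hacker/ HTTP/1.1\r\nHost: localhost:8080\r\n\r\n")

;; read-request parses the request line and the headers only.
(define req
  (call-with-input-string wire-text read-request))

(define method (request-method req))        ; the symbol GET
(define version (request-version req))      ; the pair (1 . 1)
(define path (uri-path (request-uri req)))  ; "/hacker/"
(define host (request-host req))            ; ("localhost" . 8080)
```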
Construct an HTTP request object. If validate-headers? is true, the headers are each run through their respective validators.
Write the given HTTP request to port.
Return a new request, whose request-port will continue writing on port, perhaps using some transfer encoding.
Read the request body from r, as a bytevector. Return #f if there was no request body.
Write body, a bytevector, to the port corresponding to the HTTP request r.
The various headers that are typically associated with HTTP requests may be accessed with these dedicated accessors. See HTTP Headers, for more information on the format of parsed headers.
Return the given request header, or default if none was present.
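As a small sketch of the accessors, the following builds a request by hand and reads some headers back. The URI and header values are invented for the example.

```scheme
(use-modules (web request)
             (web uri))

;; Headers are given in parsed form: the host header is a pair of
;; host name and port (or #f), and the user agent is a plain string.
(define req
  (build-request (string->uri "http://example.com/index.html")
                 #:headers '((host . ("example.com" . #f))
                             (user-agent . "sketch/0.1"))))

(define host (request-host req))               ; ("example.com" . #f)
(define agent (request-user-agent req))        ; "sketch/0.1"

;; No content-length header was given, so the default is returned.
(define length (request-content-length req 0)) ; 0
```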
A helper routine to determine the absolute URI of a request, using the host header and the default host and port.
(use-modules (web response))
As with requests (see Requests), Guile offers a data type for HTTP responses. Again, the body is represented separately from the response.
A predicate and field accessors for the response type. The fields are as follows:
version - The HTTP version pair, like (1 . 1).
code - The HTTP response code, like 200.
reason-phrase - The reason phrase, or the standard reason phrase for the response's code.
headers - The response headers, as an alist of parsed values.
port - The port on which to read or write a response body, if any.
Read an HTTP response from port.
As a side effect, sets the encoding on port to ISO-8859-1 (latin-1), so that reading one character reads one byte. See the discussion of character sets in Requests, for more information.
Construct an HTTP response object. If validate-headers? is true, the headers are each run through their respective validators.
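For instance, a response can be built and inspected like this. This is only a sketch; the variable names are chosen for the example.

```scheme
(use-modules (web response))

(define not-found-response
  (build-response
   #:code 404
   #:headers '((content-type . (text/plain (charset . "utf-8"))))))

(define code (response-code not-found-response))         ; 404
;; No reason phrase was given, so the standard one is used.
(define phrase (response-reason-phrase not-found-response))
(define type (response-content-type not-found-response))
(define version (response-version not-found-response))   ; (1 . 1)
```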
Adapt the given response to a different HTTP version. Return a new HTTP response.
The idea is that many applications might just build a response for the default HTTP version, and this method could handle a number of programmatic transformations to respond to older HTTP versions (0.9 and 1.0). But currently this function is a bit heavy-handed, just updating the version field.
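A short sketch of the current behaviour, in which only the version field differs between the two responses:

```scheme
(use-modules (web response))

;; Built for the default HTTP version.
(define modern (build-response #:code 200))

;; A new response for an HTTP/1.0 client; MODERN is left untouched.
(define legacy (adapt-response-version modern '(1 . 0)))

(define modern-version (response-version modern))  ; (1 . 1)
(define legacy-version (response-version legacy))  ; (1 . 0)
```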
Write the given HTTP response to port.
Return a new response, whose response-port will continue writing on port, perhaps using some transfer encoding.
Read the response body from r, as a bytevector. Return #f if there was no response body.
Write body, a bytevector, to the port corresponding to the HTTP response r.
As with requests, the various headers that are typically associated with HTTP responses may be accessed with these dedicated accessors. See HTTP Headers, for more information on the format of parsed headers.
Return the given response header, or default if none was present.
(web client) provides a simple, synchronous HTTP client, built on
the lower-level HTTP, request, and response modules.
Connect to the server corresponding to uri and ask for the resource, using the GET method. If you already have a port open, pass it as port. The port will be closed at the end of the request unless keep-alive? is true. Any extra headers in the alist extra-headers will be added to the request.
If decode-body? is true, as is the default, the body of the response will be decoded to string, if it is a textual content-type. Otherwise it will be returned as a bytevector.
http-get is useful for making one-off requests to web sites. If
you are writing a web spider or some other client that needs to handle a
number of requests in parallel, it's better to build an event-driven URL
fetcher, similar in structure to the web server (see Web Server).
Another option, good but not as performant, would be to use threads, possibly via par-map or futures.
More helper procedures for the other common HTTP verbs would be a good addition to this module. Send your code to guile-user@gnu.org.
(web server) is a generic web server interface, along with a main
loop implementation for web servers controlled by Guile.
(use-modules (web server))
The lowest layer is the <server-impl> object, which defines a set
of hooks to open a server, read a request from a client, write a
response to a client, and close a server. These hooks – open,
read, write, and close, respectively – are bound
together in a <server-impl> object. Procedures in this module take a
<server-impl> object, if needed.
A <server-impl> may also be looked up by name. If you pass the
http symbol to run-server, Guile looks for a variable
named http in the (web server http) module, which should
be bound to a <server-impl> object. Such a binding is made by
instantiation of the define-server-impl syntax. In this way the
run-server loop can automatically load other backends if available.
The life cycle of a server goes as follows:
1. The open hook is called, to open the server. open takes 0 or
more arguments, depending on the backend, and returns an opaque
server socket object, or signals an error.
2. The read hook is called, to read a request from a new client.
The read hook takes one argument, the server socket. It should
return three values: an opaque client socket, the request, and the
request body. The request should be a <request> object, from
(web request). The body should be a string or a bytevector, or
#f if there is no body.
If the read failed, the read hook may return #f for the client
socket, request, and body.
3. A user-provided handler procedure is called, with the request and
body as its arguments. The handler should return two values: the
response, as a <response> record from (web response), and the
response body as a bytevector, or #f if not present.
The response and response body are run through sanitize-response,
documented below. This allows the handler writer to take some
convenient shortcuts: for example, instead of a <response>, the
handler can simply return an alist of headers, in which case a default
response object is constructed with those headers. Instead of a
bytevector for the body, the handler can return a string, which will be
serialized into an appropriate encoding; or it can return a procedure,
which will be called on a port to write out the data. See the
sanitize-response documentation, for more.
4. The write hook is called with three arguments: the client
socket, the response, and the body. The write hook returns no
values.
5. If the user interrupts the loop, the close hook is called on
the server socket.
A user may define a server implementation with the following form:
Make a <server-impl> object with the hooks open, read, write, and close, and bind it to the symbol name in the current module.
Look up a server implementation. If impl is a server implementation already, it is returned directly. If it is a symbol, the binding named impl in the (web server impl) module is looked up. Otherwise an error is signaled.
Currently a server implementation is a somewhat opaque type, useful only for passing to other procedures in this module, like read-client.
The (web server) module defines a number of routines that use
<server-impl> objects to implement parts of a web server. Given
that we don't expose the accessors for the various fields of a
<server-impl>, indeed these routines are the only procedures with
any access to the impl objects.
Open a server for the given implementation. Return one value, the new server object. The implementation's open procedure is applied to open-params, which should be a list.
Read a new client from server, by applying the implementation's read procedure to the server. If successful, return three values: an object corresponding to the client, a request object, and the request body. If any exception occurs, return #f for all three values.
Handle a given request, returning the response and body.
The response and response body are produced by calling the given handler with request and body as arguments.
The elements of state are also passed to handler as arguments, and may be returned as additional values. The new state, collected from the handler's return values, is then returned as a list. The idea is that a server loop receives a handler from the user, along with whatever state values the user is interested in, allowing the user's handler to explicitly manage its state.
"Sanitize" the given response and body, making them appropriate for the given request.
As a convenience to web handler authors, response may be given as an alist of headers, in which case it is used to construct a default response. Ensures that the response version corresponds to the request version. If body is a string, encodes the string to a bytevector, in an encoding appropriate for response. Adds a content-length and content-type header, as necessary.
If body is a procedure, it is called with a port as an argument, and the output collected as a bytevector. In the future we might try to instead use a compressing, chunk-encoded port, and call this procedure later, in the write-client procedure. Authors are advised not to rely on the procedure being called at any particular time.
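The shortcuts described above can be tried directly at the REPL, without running a server. In this sketch the request is built by hand and its URI is an arbitrary choice; the two return values are collected into a list for inspection.

```scheme
(use-modules (web server)
             (web request)
             (web response)
             (web uri)
             (rnrs bytevectors))

;; A stand-in for a request read from a client.
(define req (build-request (string->uri "http://localhost:8080/")))

;; Pass an alist of headers and a string body, as a handler might.
(define result
  (call-with-values
      (lambda ()
        (sanitize-response req '((content-type . (text/plain))) "Hello"))
    list))

(define response (car result))   ; now a proper response object
(define body (cadr result))      ; now a bytevector

(define length (response-content-length response))  ; 5
(define text (utf8->string body))                   ; "Hello"
```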
Write an HTTP response and body to client. If the server and client support persistent connections, it is the implementation's responsibility to keep track of the client thereafter, presumably by attaching it to the server argument somehow.
Release resources allocated by a previous invocation of open-server.
Given the procedures above, it is a small matter to make a web server:
Read one request from server, call handler on the request and body, and write the response to the client. Return the new state produced by the handler procedure.
Run Guile's built-in web server.
handler should be a procedure that takes two or more arguments, the HTTP request and request body, and returns two or more values, the response and response body.
For examples, skip ahead to the next section, Web Examples.
The response and body will be run through sanitize-response before sending back to the client.
Additional arguments to handler are taken from state. Additional return values are accumulated into a new state, which will be used for subsequent requests. In this way a handler can explicitly manage its state.
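As a sketch of explicit state management, here is a handler that threads one extra value, a visit counter, through successive requests. The handler name and the message text are invented for the example, and the handler is called by hand here rather than by a running server.

```scheme
(use-modules (web request)
             (web uri))

;; Takes the request, the body, and one state value; returns the
;; response headers, the body, and the new state value.
(define (counting-handler request body count)
  (values '((content-type . (text/plain)))
          (string-append "Visits so far: " (number->string count) "\n")
          (+ count 1)))

;; To serve it, the initial state follows the open-params:
;;   (run-server counting-handler 'http '() 0)

;; Calling the handler directly shows the three return values.
(define req (build-request (string->uri "http://localhost:8080/")))
(define result
  (call-with-values (lambda () (counting-handler req #f 0)) list))

(define reply (cadr result))       ; "Visits so far: 0\n"
(define new-count (caddr result))  ; 1
```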
The default web server implementation is http, which binds to a
socket, listening for requests on that port.
The default HTTP implementation. We document it as a function with keyword arguments, because that is precisely the way that it is – all of the open-params to run-server get passed to the implementation's open function.
;; The defaults: localhost:8080
(run-server handler)
;; Same thing
(run-server handler 'http '())
;; On a different port
(run-server handler 'http '(#:port 8081))
;; IPv6
(run-server handler 'http '(#:family AF_INET6 #:port 8081))
;; Custom socket
(run-server handler 'http `(#:socket ,(sudo-make-me-a-socket)))
Well, enough about the tedious internals. Let's make a web application!
The first program we have to write, of course, is “Hello, World!”. This means that we have to implement a web handler that does what we want.
Now we define a handler, a function of two arguments and two return values:
(define (handler request request-body)
  (values response response-body))
In this first example, we take advantage of a short-cut, returning an alist of headers instead of a proper response object. The response body is our payload:
(define (hello-world-handler request request-body)
  (values '((content-type . (text/plain)))
          "Hello World!"))
Now let's test it, by running a server with this handler. Load up the web server module if you haven't yet done so, and run a server with this handler:
(use-modules (web server))
(run-server hello-world-handler)
By default, the web server listens for requests on
localhost:8080. Visit that address in your web browser to
test. If you see the string, Hello World!, sweet!
The Hello World program above is a general greeter, responding to all URIs. To make a more exclusive greeter, we need to inspect the request object, and conditionally produce different results. So let's load up the request, response, and URI modules, and do just that.
(use-modules (web server)) ; you probably did this already
(use-modules (web request)
             (web response)
             (web uri))
(define (request-path-components request)
  (split-and-decode-uri-path (uri-path (request-uri request))))
(define (hello-hacker-handler request body)
  (if (equal? (request-path-components request)
              '("hacker"))
      (values '((content-type . (text/plain)))
              "Hello hacker!")
      (not-found request)))
(run-server hello-hacker-handler)
Here we see that we have defined a helper to return the components of
the URI path as a list of strings, and used that to check for a request
to /hacker/. Then the success case is just as before – visit
http://localhost:8080/hacker/ in your browser to check.
You should always match against URI path components as decoded by
split-and-decode-uri-path. The above example will work for
/hacker/, //hacker///, and /h%61ck%65r.
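This claim is easy to check at the REPL. The sketch below decodes the three paths mentioned above; the variable names are chosen for the example.

```scheme
(use-modules (web uri))

;; Three spellings of the same resource.
(define paths '("/hacker/" "//hacker///" "/h%61ck%65r"))

;; Empty components are dropped and percent-escapes are decoded, so
;; each path yields the same list of components.
(define decoded (map split-and-decode-uri-path paths))
;; decoded ⇒ (("hacker") ("hacker") ("hacker"))
```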
But we forgot to define not-found! If you are pasting these
examples into a REPL, accessing any other URI in your web browser will
drop your Guile console into the debugger:
<unnamed port>:38:7: In procedure module-lookup:
<unnamed port>:38:7: Unbound variable: not-found
Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
scheme@(guile-user) [1]>
So let's define the function, right there in the debugger. As you probably know, we'll want to return a 404 response.
;; Paste this in your REPL
(define (not-found request)
  (values (build-response #:code 404)
          (string-append "Resource not found: "
                         (uri->string (request-uri request)))))
;; Now paste this to let the web server keep going:
,continue
Now if you access http://localhost:8080/foo/, you get this error
message. (Note that some popular web browsers won't show
server-generated 404 messages, showing their own instead, unless the 404
message body is long enough.)
The web handler interface is a common baseline that all kinds of Guile web applications can use. You will usually want to build something on top of it, however, especially when producing HTML. Here is a simple example that builds up HTML output using SXML (see sxml simple).
First, load up the modules:
(use-modules (web server)
             (web request)
             (web response)
             (sxml simple))
Now we define a simple templating function that takes a list of HTML body elements, as SXML, and puts them in our super template:
(define (templatize title body)
  `(html (head (title ,title))
         (body ,@body)))
For example, the simplest Hello HTML can be produced like this:
(sxml->xml (templatize "Hello!" '((b "Hi!"))))
-|
<html><head><title>Hello!</title></head><body><b>Hi!</b></body></html>
Much better to work with Scheme data types than to work with HTML as strings. Now we define a little response helper:
(define* (respond #:optional body #:key
                  (status 200)
                  (title "Hello hello!")
                  (doctype "<!DOCTYPE html>\n")
                  (content-type-params '((charset . "utf-8")))
                  (content-type 'text/html)
                  (extra-headers '())
                  (sxml (and body (templatize title body))))
  (values (build-response
           #:code status
           #:headers `((content-type
                        . (,content-type ,@content-type-params))
                       ,@extra-headers))
          (lambda (port)
            (if sxml
                (begin
                  (if doctype (display doctype port))
                  (sxml->xml sxml port))))))
Here we see the power of keyword arguments with default initializers. By
the time the arguments are fully parsed, the sxml local variable
will hold the templated SXML, ready for sending out to the client.
Also, instead of returning the body as a string, respond gives a
procedure, which will be called by the web server to write out the
response to the client.
Now, a simple example using this responder, which lays out the incoming headers in an HTML table.
(define (debug-page request body)
  (respond
   `((h1 "hello world!")
     (table
      (tr (th "header") (th "value"))
      ,@(map (lambda (pair)
               `(tr (td (tt ,(with-output-to-string
                               (lambda () (display (car pair))))))
                    (td (tt ,(with-output-to-string
                               (lambda ()
                                 (write (cdr pair))))))))
             (request-headers request))))))
(run-server debug-page)
Now if you visit any local address in your web browser, you actually see some HTML, finally.
Well, this is about as far as Guile's built-in web support goes, for now. There are many ways to make a web application, but hopefully by standardizing the most fundamental data types, users will be able to choose the approach that suits them best, while also being able to switch between implementations of the server. This is a relatively new part of Guile, so if you have feedback, let us know, and we can take it into account. Happy hacking on the web!
The (ice-9 getopt-long) module exports two procedures:
getopt-long and option-ref.
getopt-long takes a list of strings — the command line
arguments — an option specification, and some optional keyword
parameters. It parses the command line arguments according to the
option specification and keyword parameters, and returns a data
structure that encapsulates the results of the parsing.
option-ref then takes the parsed data structure and a specific
option's name, and returns information about that option in particular.
To make these procedures available to your Guile script, include the
expression (use-modules (ice-9 getopt-long)) somewhere near the
top, before the first usage of getopt-long or option-ref.
This section illustrates how getopt-long is used by presenting
and dissecting a simple example. The first thing that we need is an
option specification that tells getopt-long how to parse
the command line. This specification is an association list with the
long option name as the key. Here is how such a specification might
look:
(define option-spec
  '((version (single-char #\v) (value #f))
    (help    (single-char #\h) (value #f))))
This alist tells getopt-long that it should accept two long
options, called version and help, and that these options
can also be selected by the single-letter abbreviations v and
h, respectively. The (value #f) clauses indicate that
neither of the options accepts a value.
With this specification we can use getopt-long to parse a given
command line:
(define options (getopt-long (command-line) option-spec))
After this call, options contains the parsed command line and is
ready to be examined by option-ref. option-ref is called
like this:
(option-ref options 'help #f)
It expects the parsed command line, a symbol indicating the option to
examine, and a default value. The default value is returned if the
option was not present in the command line, or if the option was present
but without a value; otherwise the value from the command line is
returned. Usually option-ref is called once for each possible
option that a script supports.
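To experiment at the REPL, an explicit argument list can stand in for (command-line). In this sketch the program name "prog" is a placeholder; only -v is given.

```scheme
(use-modules (ice-9 getopt-long))

(define option-spec
  '((version (single-char #\v) (value #f))
    (help    (single-char #\h) (value #f))))

;; The first element plays the role of the program name.
(define options (getopt-long '("prog" "-v") option-spec))

(define version-wanted (option-ref options 'version #f))  ; #t
(define help-wanted (option-ref options 'help #f))        ; #f
```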
The following example shows a main program which puts all this together to parse its command line and figure out what the user wanted.
(define (main args)
  (let* ((option-spec '((version (single-char #\v) (value #f))
                        (help    (single-char #\h) (value #f))))
         (options (getopt-long args option-spec))
         (help-wanted (option-ref options 'help #f))
         (version-wanted (option-ref options 'version #f)))
    (if (or version-wanted help-wanted)
        (begin
          (if version-wanted
              (display "getopt-long-example version 0.3\n"))
          (if help-wanted
              (display "\
getopt-long-example [options]
  -v, --version    Display version
  -h, --help       Display this help
")))
        (begin
          (display "Hello, World!") (newline)))))
An option specification is an association list (see Association Lists) with one list element for each supported option. The key of each list element is a symbol that names the option, while the value is a list of option properties:
OPTION-SPEC ::= '( (OPT-NAME1 (PROP-NAME PROP-VALUE) ...)
                   (OPT-NAME2 (PROP-NAME PROP-VALUE) ...)
                   (OPT-NAME3 (PROP-NAME PROP-VALUE) ...)
                   ...
                 )
Each opt-name specifies the long option name for that option. For
example, a list element with opt-name background specifies
an option that can be specified on the command line using the long
option --background. Further information about the option —
whether it takes a value, whether it is required to be present in the
command line, and so on — is specified by the option properties.
In the example of the preceding section, we already saw that a long
option name can have an equivalent short option character. The
equivalent short option character can be set for an option by specifying
a single-char property in that option's property list. For
example, a list element like '(output (single-char #\o) ...)
specifies an option with long name --output that can also be
specified by the equivalent short name -o.
The value property specifies whether an option requires or
accepts a value. If the value property is set to #t, the
option requires a value: getopt-long will signal an error if the
option name is present without a corresponding value. If set to
#f, the option does not take a value; in this case, a non-option
word that follows the option name in the command line will be treated as
a non-option argument. If set to the symbol optional, the option
accepts a value but does not require one: a non-option word that follows
the option name in the command line will be interpreted as that option's
value. If the option name for an option with '(value optional)
is immediately followed in the command line by another option
name, the value for the first option is implicitly #t.
The required? property indicates whether an option is required to
be present in the command line. If the required? property is
set to #t, getopt-long will signal an error if the option
is not specified.
Finally, the predicate property can be used to constrain the
possible values of an option. If used, the predicate property
should be set to a procedure that takes one argument — the proposed
option value as a string — and returns either #t or #f
according as the proposed value is or is not acceptable. If the
predicate procedure returns #f, getopt-long will signal an
error.
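Because the predicate must be a procedure object rather than a symbol, the specification is typically written with quasiquote. The following sketch assumes a hypothetical --count option whose value must look like a number; the predicate name is invented for the example.

```scheme
(use-modules (ice-9 getopt-long))

;; Hypothetical predicate: accept only strings that parse as numbers.
(define (numeric-string? s)
  (and (string->number s) #t))

;; Unquoting inserts the procedure itself into the specification.
(define spec
  `((count (single-char #\n) (value #t)
           (predicate ,numeric-string?))))

(define options (getopt-long '("prog" "--count=3") spec))

;; Option values are always strings; convert after parsing.
(define count (string->number (option-ref options 'count "1")))  ; 3
```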
By default, options do not have single-character equivalents, are not
required, and do not take values. Where the list element for an option
includes a value property but no predicate property, the
option values are unconstrained.
In order for getopt-long to correctly parse a command line, that
command line must conform to a standard set of rules for how command
line options are specified. This section explains what those rules
are.
getopt-long splits a given command line into several pieces. All
elements of the argument list are classified as either options or
normal arguments. Options consist of two dashes and an option name
(so-called long options), or of one dash followed by a single
letter (short options).
Options can behave as switches, when they are given without a value, or they can be used to pass a value to the program. The value for an option may be specified using an equals sign, or else is simply the next word in the command line, so the following two invocations are equivalent:
$ ./foo.scm --output=bar.txt
$ ./foo.scm --output bar.txt
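The same equivalence can be seen from Scheme. This sketch assumes an output option with short name o, which is not part of the earlier example specification.

```scheme
(use-modules (ice-9 getopt-long))

;; Assumed specification: --output (or -o) requires a value.
(define spec '((output (single-char #\o) (value #t))))

(define with-equals
  (option-ref (getopt-long '("foo.scm" "--output=bar.txt") spec)
              'output #f))
(define with-next-word
  (option-ref (getopt-long '("foo.scm" "--output" "bar.txt") spec)
              'output #f))
(define with-short
  (option-ref (getopt-long '("foo.scm" "-o" "bar.txt") spec)
              'output #f))
;; All three are the string "bar.txt".
```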
Short options can be used instead of their long equivalents and can be grouped together after a single dash. For example, the following commands are equivalent.
$ ./foo.scm --version --help
$ ./foo.scm -v --help
$ ./foo.scm -vh
If an option requires a value, it can only be grouped together with other short options if it is the last option in the group; the value is the next argument. So, for example, with the following option specification —
((apples    (single-char #\a))
 (blimps    (single-char #\b) (value #t))
 (catalexis (single-char #\c) (value #t)))
— the following command lines would all be acceptable:
$ ./foo.scm -a -b bang -c couth
$ ./foo.scm -ab bang -c couth
$ ./foo.scm -ac couth -b bang
But the next command line is an error, because -b is not the last
option in its combination, and because a group of short options cannot
include two options that both require values:
$ ./foo.scm -abc couth bang
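As a sketch, here is the second of the acceptable command lines above, parsed from Scheme with the same option specification.

```scheme
(use-modules (ice-9 getopt-long))

(define spec
  '((apples    (single-char #\a))
    (blimps    (single-char #\b) (value #t))
    (catalexis (single-char #\c) (value #t))))

;; Equivalent to: ./foo.scm -ab bang -c couth
(define options
  (getopt-long '("foo.scm" "-ab" "bang" "-c" "couth") spec))

(define apples (option-ref options 'apples #f))        ; #t
(define blimps (option-ref options 'blimps #f))        ; "bang"
(define catalexis (option-ref options 'catalexis #f))  ; "couth"
```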
If an option's value is optional, getopt-long decides whether the
option has a value by looking at what follows it in the argument list.
If the next element is a string, and it does not appear to be an option
itself, then that string is the option's value.
If the option -- appears in the argument list, argument parsing
stops there and subsequent arguments are returned as ordinary arguments,
even if they resemble options. So, with the command line
$ ./foo.scm --apples "Granny Smith" -- --blimp Goodyear
getopt-long will recognize the --apples option as having
the value "Granny Smith", but will not treat --blimp as an
option. The strings --blimp and Goodyear will be returned
as ordinary argument strings.
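The same command line can be parsed from Scheme as follows. The option specification is an assumption made for this sketch; the non-option arguments are retrieved with the special key '().

```scheme
(use-modules (ice-9 getopt-long))

;; Assumed specification for this example.
(define spec
  '((apples (single-char #\a) (value #t))
    (blimp  (single-char #\b) (value #t))))

(define options
  (getopt-long '("foo.scm" "--apples" "Granny Smith"
                 "--" "--blimp" "Goodyear")
               spec))

(define apples (option-ref options 'apples #f))  ; "Granny Smith"
(define blimp (option-ref options 'blimp #f))    ; #f, not parsed
(define others (option-ref options '() '()))     ; ("--blimp" "Goodyear")
```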
getopt-long
Parse the command line given in args (which must be a list of strings) according to the option specification grammar.
The grammar argument is expected to be a list of this form:
((option (property value) ...) ...)
where each option is a symbol denoting the long option, but without the two leading dashes (e.g. version if the option is called --version).
For each option, there may be a list of arbitrarily many property/value pairs. The order of the pairs is not important, but every property may only appear once in the property list. The following table lists the possible properties:
(single-char char) - Accept -char as a single-character equivalent to --option. This is how to specify traditional Unix-style flags.
(required? bool) - If bool is true, the option is required. getopt-long will raise an error if it is not found in args.
(value bool) - If bool is #t, the option accepts a value; if it is #f, it does not; and if it is the symbol optional, the option may appear in args with or without a value.
(predicate func) - If the option accepts a value (i.e. you specified (value #t) for this option), then getopt-long will apply func to the value, and throw an exception if it returns #f. func should be a procedure which accepts a string and returns a boolean value; you may need to use quasiquotes to get it into grammar.
The #:stop-at-first-non-option keyword, if specified with any true value, tells getopt-long to stop when it gets to the first non-option in the command line. That is, at the first word which is neither an option itself, nor the value of an option. Everything in the command line from that word onwards will be returned as non-option arguments.
getopt-long's args parameter is expected to be a list of
strings like the one returned by command-line, with the first
element being the name of the command. Therefore getopt-long
ignores the first element in args and starts argument
interpretation with the second element.
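getopt-long signals an error if any of the following conditions
holds.
The option grammar has an invalid syntax.
One of the options in the argument list was not specified by the grammar.
A required option is omitted.
An option which requires an argument did not get one.
An option that doesn't accept an argument does get one (this can only happen using the long option --opt=value syntax).
An option predicate fails.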
#:stop-at-first-non-option is useful for command line invocations
like guild [--help | --version] [script [script-options]]
and cvs [general-options] command [command-options], where there
are options at two levels: some generic and understood by the outer
command, and some that are specific to the particular script or command
being invoked. To use getopt-long in such cases, you would call
it twice: firstly with #:stop-at-first-non-option #t, so as to
parse any generic options and identify the wanted script or sub-command;
secondly, and after trimming off the initial generic command words, with
a script- or sub-command-specific option grammar, so as to process those
specific options.
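The two-pass approach might be sketched as follows. The command name tool, its build sub-command, and both option grammars are hypothetical, chosen only to illustrate the technique.

```scheme
(use-modules (ice-9 getopt-long))

;; Hypothetical invocation: tool [--verbose] command [command-options]
(define args '("tool" "--verbose" "build" "-o" "out.txt"))

;; First pass: generic options only, stopping at the sub-command.
(define outer
  (getopt-long args
               '((verbose (single-char #\v) (value #f)))
               #:stop-at-first-non-option #t))

(define verbose? (option-ref outer 'verbose #f))
(define sub-args (option-ref outer '() '()))

;; Second pass: SUB-ARGS already starts with the sub-command name,
;; which getopt-long skips just as it skips a program name.
(define inner
  (getopt-long sub-args
               '((output (single-char #\o) (value #t)))))

(define output (option-ref inner 'output #f))
```

With these arguments, sub-args is ("build" "-o" "out.txt"), so the generic words have already been trimmed off before the second call.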
option-ref
Search options for a command line option named key and return its value, if found. If the option has no value, but was given, return #t. If the option was not given, return default. options must be the result of a call to getopt-long.
option-ref always succeeds, either by returning the requested
option value from the command line, or the default value.
The special key '() can be used to get a list of all
non-option arguments.
SRFI is an acronym for Scheme Request For Implementation. The SRFI documents define a lot of syntactic and procedure extensions to standard Scheme as defined in R5RS.
Guile has support for a number of SRFIs. This chapter gives an overview of the available SRFIs and some usage hints. For complete documentation, design rationales and further examples, we advise you to get the relevant SRFI documents from the SRFI home page http://srfi.schemers.org/.
SRFI support in Guile is currently implemented partly in the core library, and partly as add-on modules. That means that some SRFIs are automatically available when the interpreter is started, whereas the other SRFIs require you to use the appropriate support module explicitly.
There are several reasons for this inconsistency. First, the feature
checking syntactic form cond-expand (see SRFI-0) must be
available immediately, because it must be there when the user wants to
check for the Scheme implementation, that is, before she can know that
it is safe to use use-modules to load SRFI support modules. The
second reason is that some features defined in SRFIs had been
implemented in Guile before the developers started to add SRFI
implementations as modules (for example SRFI-6 (see SRFI-6)). In
the future, it is possible that SRFIs in the core library might be
factored out into separate modules, requiring explicit module loading
when they are needed. So you should be prepared to have to use
use-modules someday in the future to access SRFI-6 bindings. If
you want, you can do that already. We have included the module
(srfi srfi-6) in the distribution, which currently does nothing,
but ensures that you can write future-safe code.
Generally, support for a specific SRFI is made available by using
modules named (srfi srfi-number), where number is the
number of the SRFI needed. Another possibility is to use the command
line option --use-srfi, which will load the necessary modules
automatically (see Invoking Guile).
This SRFI lets a portable Scheme program test for the presence of certain features, and adapt itself by using different blocks of code, or fail if the necessary features are not available. There's no module to load; this is in the Guile core.
A program designed only for Guile will generally not need this mechanism; such a program can of course directly use the various documented parts of Guile.
Expand to the body of the first clause whose feature specification is satisfied. It is an error if no feature is satisfied.
Features are symbols such as srfi-1, and a feature specification can use and, or and not forms to test combinations. The last clause can be an else, to be used if no other passes.
For example, define a private version of alist-cons if SRFI-1 is not available.
(cond-expand (srfi-1
              )
             (else
              (define (alist-cons key val alist)
                (cons (cons key val) alist))))
Or demand a certain set of SRFIs (list operations, string ports, receive and string operations), failing if they're not available.
(cond-expand ((and srfi-1 srfi-6 srfi-8 srfi-13)
              ))
The Guile core has the following features,
guile
guile-2 ;; starting from Guile 2.x
r5rs
srfi-0
srfi-4
srfi-6
srfi-13
srfi-14
Other SRFI feature symbols are defined once their code has been loaded
with use-modules, since only then are their bindings available.
The ‘--use-srfi’ command line option (see Invoking Guile) is
a good way to load SRFIs to satisfy cond-expand when running a
portable program.
Testing the guile feature allows a program to adapt itself to
the Guile module system, but still run on other Scheme systems. For
example the following demands SRFI-8 (receive), but also knows
how to load it with the Guile mechanism.
(cond-expand (srfi-8
)
(guile
(use-modules (srfi srfi-8))))
Likewise, testing the guile-2 feature allows code to be portable
between Guile 2.0 and previous versions of Guile. For instance, it
makes it possible to write code that accounts for Guile 2.0's compiler,
yet be correctly interpreted on 1.8 and earlier versions:
(cond-expand (guile-2 (eval-when (compile)
;; This must be evaluated at compile time.
(fluid-set! current-reader my-reader)))
(guile
;; Earlier versions of Guile do not have a
;; separate compilation phase.
(fluid-set! current-reader my-reader)))
It should be noted that cond-expand is separate from the
*features* mechanism (see Feature Tracking); feature
symbols in one are unrelated to those in the other.
The list library defined in SRFI-1 contains many useful list processing procedures for constructing, examining, destructuring and manipulating lists and pairs.
Since SRFI-1 also defines some procedures which are already contained in R5RS and thus are supported by the Guile core library, some list and pair procedures which appear in the SRFI-1 document may not appear in this section. So when looking for a particular list/pair processing procedure, you should also have a look at the sections Lists and Pairs.
New lists can be constructed by calling one of the following procedures.
Like cons, but with interchanged arguments. Useful mostly when passed to higher-order procedures.
Return an n-element list, where each list element is produced by applying the procedure init-proc to the corresponding list index. The order in which init-proc is applied to the indices is not specified.
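The two constructors described above are xcons and list-tabulate in SRFI-1. As a small sketch, assuming (srfi srfi-1) is loaded (the variable names swapped and squares are only for illustration):

```scheme
(use-modules (srfi srfi-1))

;; xcons takes its arguments in the opposite order to cons:
;; the first argument becomes the cdr, the second the car.
(define swapped (xcons '(b c) 'a))          ; same as (cons 'a '(b c))

;; list-tabulate applies the procedure to each index 0 .. n-1.
(define squares (list-tabulate 5 (lambda (i) (* i i))))
```

Here swapped is (a b c) and squares is (0 1 4 9 16). Since the order of the calls is unspecified, the procedure passed to list-tabulate should not rely on side effects.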
Return a new list containing the elements of the list lst.
This function differs from the core list-copy (see List Constructors) in accepting improper lists too. And if lst is not a pair at all then it's treated as the final tail of an improper list and simply returned.
Return a circular list containing the given arguments elt1 elt2 ....
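For illustration, a circular list can be built and then sampled with take, which stops after a fixed number of elements. This is a sketch assuming (srfi srfi-1) is loaded; ring and first-seven are names chosen here.

```scheme
(use-modules (srfi srfi-1))

;; A three-element cycle: the cdr of the last pair points back to the start.
(define ring (circular-list 'a 'b 'c))

;; Taking more elements than the cycle length simply goes around again.
(define first-seven (take ring 7))

;; circular-list? confirms the structure has no end.
(define ring? (circular-list? ring))
```

Avoid printing ring itself or passing it to procedures such as length, which expect a finite list.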
Return a list containing count numbers, starting from start and adding step each time. The default start is 0, the default step is 1. For example,
(iota 6) ⇒ (0 1 2 3 4 5)
(iota 4 2.5 -2) ⇒ (2.5 0.5 -1.5 -3.5)
This function takes its name from the corresponding primitive in the APL language.
The procedures in this section test specific properties of lists.
Return #t if obj is a proper list, or #f otherwise. This is the same as the core list? (see List Predicates).
A proper list is a list which ends with the empty list () in the usual way. The empty list () itself is a proper list too.
(proper-list? '(1 2 3)) ⇒ #t
(proper-list? '()) ⇒ #t
Return #t if obj is a circular list, or #f otherwise.
A circular list is a list where at some point the cdr refers back to a previous pair in the list (either the start or some later point), so that following the cdrs takes you around in a circle, with no end.
(define x (list 1 2 3 4))
(set-cdr! (last-pair x) (cddr x))
x ⇒ (1 2 3 4 3 4 3 4 ...)
(circular-list? x) ⇒ #t
Return #t if obj is a dotted list, or #f otherwise.
A dotted list is a list where the cdr of the last pair is not the empty list (). Any non-pair obj is also considered a dotted list, with length zero.
(dotted-list? '(1 2 . 3)) ⇒ #t
(dotted-list? 99) ⇒ #t
It will be noted that any Scheme object passes exactly one of the
above three tests proper-list?, circular-list? and
dotted-list?. Non-lists are dotted-list?, finite lists
are either proper-list? or dotted-list?, and infinite
lists are circular-list?.
Return #t if lst is the empty list (), #f otherwise. If something other than a proper or circular list is passed as lst, an error is signalled. This procedure is recommended for checking for the end of a list in contexts where dotted lists are not allowed.
Return #t if obj is not a pair, #f otherwise. This is shorthand notation for (not (pair? obj)) and is supposed to be used for end-of-list checking in contexts where dotted lists are allowed.
Return #t if all argument lists are equal, #f otherwise. List equality is determined by testing whether all lists have the same length and the corresponding elements are equal in the sense of the equality predicate elt=. If no or only one list is given, #t is returned.
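The three predicates just described are null-list?, not-pair? and list= in SRFI-1. The following sketch shows the two end-of-list tests in loops and a few list= calls; sum-list and count-pairs are hypothetical helpers written for this example.

```scheme
(use-modules (srfi srfi-1))

;; Sum a proper list; null-list? is the end test where dotted lists
;; are not expected.
(define (sum-list lst)
  (let loop ((rest lst) (total 0))
    (if (null-list? rest)
        total
        (loop (cdr rest) (+ total (car rest))))))

;; Count the pairs in a possibly dotted list; not-pair? is the end test
;; where dotted lists are allowed.
(define (count-pairs lst)
  (let loop ((rest lst) (n 0))
    (if (not-pair? rest)
        n
        (loop (cdr rest) (+ n 1)))))

(define total   (sum-list '(1 2 3)))                     ; 6
(define pairs   (count-pairs '(1 2 . 3)))                ; 2
(define same?   (list= eqv? '(1 2 3) '(1 2 3) '(1 2 3))) ; #t
(define differ? (list= eqv? '(1 2 3) '(1 2)))            ; #f
```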
These are synonyms for car, cadr, caddr, ....
Return a list containing the first i elements of lst.
take! may modify the structure of the argument list lst in order to produce the result.
Return a list containing the last i elements of lst. The returned list shares a common tail with lst.
Return a list containing all but the last i elements of lst.
drop-right always returns a new list, even when i is zero. drop-right! may modify the structure of the argument list lst in order to produce the result.
Return two values, a list containing the first i elements of the list lst and a list containing the remaining elements.
split-at! may modify the structure of the argument list lst in order to produce the result.
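A short sketch of the selectors above (take, take-right, drop-right and split-at in SRFI-1), assuming (srfi srfi-1) is loaded. The variable names are illustrative, and call-with-values is used only to collect the two values from split-at into a list.

```scheme
(use-modules (srfi srfi-1))

(define lst (list 'a 'b 'c 'd 'e))

(define head (take lst 2))            ; first two elements
(define tail (take-right lst 2))      ; last two elements
(define init (drop-right lst 2))      ; all but the last two

;; split-at returns two values; gather them into one list to inspect them.
(define halves
  (call-with-values (lambda () (split-at lst 2)) list))

;; take-right returns part of lst itself rather than a copy.
(define shares-tail? (eq? tail (cdddr lst)))
```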
Return the length of the argument list lst. When lst is a circular list, #f is returned.
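This procedure is length+ in SRFI-1. A minimal sketch of the two cases, with illustrative variable names:

```scheme
(use-modules (srfi srfi-1))

(define finite-length  (length+ '(1 2 3)))             ; 3
(define endless-length (length+ (circular-list 1 2)))  ; #f, no end to count to
```

The #f result makes length+ usable as a guard before calling procedures that would loop forever on a circular list.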
Construct a list by appending all lists in list-of-lists.
concatenate! may modify the structure of the given lists in order to produce the result.
concatenate is the same as (apply append list-of-lists). It exists because some Scheme implementations have a limit on the number of arguments a function takes, which the apply might exceed. In Guile there is no such limit.
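As a quick sketch of the equivalence stated above (the names parts, joined and via-apply are illustrative):

```scheme
(use-modules (srfi srfi-1))

(define parts '((1 2) (3) () (4 5)))

(define joined    (concatenate parts))
(define via-apply (apply append parts))   ; the same result in Guile
```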
Reverse rev-head, append tail to it, and return the result. This is equivalent to (append (reverse rev-head) tail), but its implementation is more efficient.
(append-reverse '(1 2 3) '(4 5 6)) ⇒ (3 2 1 4 5 6)
append-reverse! may modify rev-head in order to produce the result.
Return a list as long as the shortest of the argument lists, where each element is a list. The first list contains the first elements of the argument lists, the second list contains the second elements, and so on.
unzip1 takes a list of lists, and returns a list containing the first elements of each list. unzip2 returns two lists, the first containing the first elements of each list and the second containing the second elements of each list, and so on.
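The following sketch illustrates zip and the unzip family from SRFI-1. The variable names are illustrative, and call-with-values is used only to collect the two values returned by unzip2.

```scheme
(use-modules (srfi srfi-1))

;; The result is as long as the shortest argument, here two elements.
(define zipped (zip '(1 2 3) '(a b c) '(x y)))

;; unzip1 returns a single list of first elements.
(define firsts (unzip1 '((1 a) (2 b) (3 c))))

;; unzip2 returns two values: the first elements and the second elements.
(define columns
  (call-with-values (lambda () (unzip2 '((1 a) (2 b) (3 c)))) list))
```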
Return a count of the number of times pred returns true when called on elements from the given lists.
pred is called with N parameters (pred elem1 ... elemN), each element being from the corresponding lst1 ... lstN. The first call is with the first element of each list, the second with the second element from each, and so on.
Counting stops when the end of the shortest list is reached. At least one list must be non-circular.
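A brief sketch of count with one list and with two lists of different lengths; evens and ascending are names chosen for this example.

```scheme
(use-modules (srfi srfi-1))

;; One list: how many elements are even?
(define evens (count even? '(1 2 3 4 5 6)))

;; Two lists: pred receives one element from each. The calls are
;; (< 1 2), (< 2 2) and (< 3 4); the extra 9 is never examined.
(define ascending (count < '(1 2 3) '(2 2 4 9)))
```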
Apply proc to the elements of lst1 ... lstN to build a result, and return that result.
Each proc call is (proc elem1 ... elemN previous), where elem1 is from lst1, through elemN from lstN. previous is the return from the previous call to proc, or the given init for the first call. If any list is empty, just init is returned.
fold works through the list elements from first to last. The following shows a list reversal and the calls it makes,
(fold cons '() '(1 2 3))
(cons 1 '())
(cons 2 '(1))
(cons 3 '(2 1))
⇒ (3 2 1)
fold-right works through the list elements from last to first, ie. from the right. So for example the following finds the longest string, and the last among equal longest,
(fold-right (lambda (str prev)
              (if (> (string-length str) (string-length prev))
                  str
                  prev))
            ""
            '("x" "abc" "xyz" "jk"))
⇒ "xyz"
If lst1 through lstN have different lengths, fold stops when the end of the shortest is reached; fold-right commences at the last element of the shortest. Ie. elements past the length of the shortest are ignored in the other lsts. At least one lst must be non-circular.
fold should be preferred over fold-right if the order of processing doesn't matter, or can be arranged either way, since fold is a little more efficient.
The way fold builds a result from iterating is quite general; it can do more than other iterations like, say, map or filter. The following for example removes adjacent duplicate elements from a list,
(define (delete-adjacent-duplicates lst)
  (fold-right (lambda (elem ret)
                (if (equal? elem (first ret))
                    ret
                    (cons elem ret)))
              (list (last lst))
              lst))
(delete-adjacent-duplicates '(1 2 3 3 4 4 4 5))
⇒ (1 2 3 4 5)
Clearly the same sort of thing can be done with a for-each and a variable in which to build the result, but a self-contained proc can be re-used in multiple contexts, where a for-each would have to be written out each time.
The same as fold and fold-right, but apply proc to the pairs of the lists instead of the list elements.
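To make the difference from fold concrete, here is a sketch using pair-fold and pair-fold-right with cons as proc, so that each successive sublist (pair) shows up in the result. The variable names are illustrative.

```scheme
(use-modules (srfi srfi-1))

;; proc sees the pairs (1 2 3), (2 3), (3) in that order,
;; consing each onto the accumulated result.
(define tails-reversed (pair-fold cons '() '(1 2 3)))

;; pair-fold-right visits the same pairs from the right.
(define tails (pair-fold-right cons '() '(a b c)))
```

Here tails-reversed is ((3) (2 3) (1 2 3)) and tails is ((a b c) (b c) (c)).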
reduce is a variant of fold, where the first call to proc is on two elements from lst, rather than one element and a given initial value.
If lst is empty, reduce returns default (this is the only use for default). If lst has just one element then that's the return value. Otherwise proc is called on the elements of lst.
Each proc call is (proc elem previous), where elem is from lst (the second and subsequent elements of lst), and previous is the return from the previous call to proc. The first element of lst is the previous for the first call to proc.
For example, the following adds a list of numbers, the calls made to + are shown. (Of course + accepts multiple arguments and can add a list directly, with apply.)
(reduce + 0 '(5 6 7)) ⇒ 18
(+ 6 5) ⇒ 11
(+ 7 11) ⇒ 18
reduce can be used instead of fold where the init value is an “identity”, meaning a value which under proc doesn't change the result, in this case 0 is an identity since (+ 5 0) is just 5. reduce avoids that unnecessary call.
reduce-right is a similar variation on fold-right, working from the end (ie. the right) of lst. The last element of lst is the previous for the first call to proc, and the elem values go from the second last.
reduce should be preferred over reduce-right if the order of processing doesn't matter, or can be arranged either way, since reduce is a little more efficient.
unfold is defined as follows:
(unfold p f g seed) =
   (if (p seed)
       (tail-gen seed)
       (cons (f seed)
             (unfold p f g (g seed))))
- p
- Determines when to stop unfolding.
- f
- Maps each seed value to the corresponding list element.
- g
- Maps each seed value to next seed value.
- seed
- The state value for the unfold.
- tail-gen
- Creates the tail of the list; defaults to (lambda (x) '()).
g produces a series of seed values, which are mapped to list elements by f. These elements are put into a list in left-to-right order, and p tells when to stop unfolding.
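A sketch of unfold in use, assuming (srfi srfi-1) is loaded. The first call builds the squares of 1 through 10; the second passes an explicit tail-gen to show that it is applied to the seed on which p finally succeeds. The variable names are illustrative.

```scheme
(use-modules (srfi srfi-1))

;; p stops when the seed exceeds 10, f squares the seed,
;; g steps the seed by one, and the initial seed is 1.
(define squares
  (unfold (lambda (x) (> x 10))
          (lambda (x) (* x x))
          (lambda (x) (+ x 1))
          1))

;; With a tail-gen: the final seed (4) is handed to it to make the tail.
(define with-tail
  (unfold (lambda (x) (> x 3))
          (lambda (x) x)
          (lambda (x) (+ x 1))
          1
          (lambda (x) (list 'stopped-at x))))
```

Here squares is (1 4 9 16 25 36 49 64 81 100) and with-tail is (1 2 3 stopped-at 4).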
Construct a list with the following loop.
(let lp ((seed seed) (lis tail))
  (if (p seed)
      lis
      (lp (g seed)
          (cons (f seed) lis))))
- p
- Determines when to stop unfolding.
- f
- Maps each seed value to the corresponding list element.
- g
- Maps each seed value to next seed value.
- seed
- The state value for the unfold.
- tail
- The initial tail of the result list, as used in the loop above; defaults to '().
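The loop above builds its result from right to left, so the element for the initial seed ends up last. A sketch of unfold-right, assuming (srfi srfi-1) is loaded, with illustrative variable names:

```scheme
(use-modules (srfi srfi-1))

;; Seeds run 10, 9, ..., 1; each square is consed onto the front,
;; so the square of 1 comes first in the result.
(define squares
  (unfold-right zero?
                (lambda (x) (* x x))
                (lambda (x) (- x 1))
                10))

;; The optional last argument is a list used as the initial tail.
(define with-tail
  (unfold-right zero?
                (lambda (x) x)
                (lambda (x) (- x 1))
                3
                '(done)))
```

Here squares is (1 4 9 16 25 36 49 64 81 100) and with-tail is (1 2 3 done).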
Map the procedure f over the list(s) lst1, lst2, ... and return a list containing the results of the procedure applications. This procedure is extended with respect to R5RS, because the argument lists may have different lengths. The result list will have the same length as the shortest argument list. The order in which f will be applied to the list element(s) is not specified.
Apply the procedure f to each pair of corresponding elements of the list(s) lst1, lst2, .... The return value is not specified. This procedure is extended with respect to R5RS, because the argument lists may have different lengths. The shortest argument list determines the number of times f is called. f will be applied to the list elements in left-to-right order.
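The following sketch shows the unequal-length behaviour of the SRFI-1 map and for-each. The names sums and seen are illustrative; seen is built with set! only to record what for-each visited.

```scheme
(use-modules (srfi srfi-1))

;; The shorter list has two elements, so only two sums are produced.
(define sums (map + '(1 2 3) '(10 20)))

;; for-each runs left to right and also stops with the shortest list.
(define seen '())
(for-each (lambda (a b)
            (set! seen (cons (list a b) seen)))
          '(1 2 3)
          '(x y))
```

After this, sums is (11 22) and seen is ((2 y) (1 x)), the most recent call first.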
Equivalent to
(apply append (map f clist1 clist2 ...))
and
(apply append! (map f clist1 clist2 ...))
Map f over the elements of the lists, just as in the map function. However, the results of the applications are appended together to make the final result. append-map uses append to append the results together; append-map! uses append!.
The dynamic order in which the various applications of f are made is not specified.
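A minimal sketch of append-map, where each call returns a list and the lists are spliced together; the variable names are illustrative.

```scheme
(use-modules (srfi srfi-1))

;; Each element contributes two entries: itself and its square.
(define flattened
  (append-map (lambda (x) (list x (* x x))) '(1 2 3)))

;; The same result written out with apply, append and map.
(define long-form
  (apply append (map (lambda (x) (list x (* x x))) '(1 2 3))))
```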
Linear-update variant of map: map! is allowed, but not required, to alter the cons cells of lst1 to construct the result list.
The dynamic order in which the various applications of f are made is not specified. In the n-ary case, lst2, lst3, ... must have at least as many elements as lst1.
Like for-each, but applies the procedure f to the pairs from which the argument lists are constructed, instead of the list elements. The return value is not specified.
Like map, but only results from the applications of f which are true are saved in the result list.
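The last procedure above is filter-map in SRFI-1. A small sketch, with an illustrative variable name: the procedure returns #f to drop an element, or the value to keep.

```scheme
(use-modules (srfi srfi-1))

;; Keep the squares of the even numbers; odd numbers yield #f and vanish.
(define even-squares
  (filter-map (lambda (x) (and (even? x) (* x x)))
              '(1 2 3 4)))
```

This does in one pass what would otherwise need a filter followed by a map.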
Filtering means to collect all elements from a list which satisfy a specific condition. Partitioning a list means to make two groups of list elements, one which contains the elements satisfying a condition, and the other for the elements which don't.
The filter and filter! functions are implemented in the
Guile core (see List Modification).
Split lst into those elements which do and don't satisfy the predicate pred.
The return is two values (see Multiple Values), the first being a list of all elements from lst which satisfy pred, the second a list of those which do not.
The elements in the result lists are in the same order as in lst but the order in which the calls (pred elem) are made on the list elements is unspecified.
partition does not change lst, but one of the returned lists may share a tail with it. partition! may modify lst to construct its return.
Return a list containing all elements from lst which do not satisfy the predicate pred. The elements in the result list have the same order as in lst. The order in which pred is applied to the list elements is not specified.
remove! is allowed, but not required, to modify the structure of the input list.
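A sketch of partition and remove side by side. The variable names are illustrative, and call-with-values is used only to gather the two values from partition into a list.

```scheme
(use-modules (srfi srfi-1))

(define numbers '(1 2 3 4 5))

;; partition returns two values: the elements that satisfy pred,
;; then the elements that do not.
(define parts
  (call-with-values (lambda () (partition even? numbers)) list))

;; remove is the complement of filter: it drops what satisfies pred.
(define odds (remove even? numbers))
```

Here parts is ((2 4) (1 3 5)) and odds is (1 3 5).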
The procedures for searching elements in lists either accept a predicate or a comparison object for determining which elements are to be searched.
Return the first element of lst which satisfies the predicate pred, or #f if no such element is found.
Return the first pair of lst whose car satisfies the predicate pred, or #f if no such pair is found.
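These two procedures are find and find-tail in SRFI-1. The sketch below contrasts what each returns; the variable names are illustrative.

```scheme
(use-modules (srfi srfi-1))

(define numbers '(1 3 4 5 6))

(define first-even      (find even? numbers))       ; the element itself
(define from-first-even (find-tail even? numbers))  ; the sublist starting there
(define missing         (find even? '(1 3 5)))      ; #f when nothing matches
```

Since find returns #f on failure, it cannot distinguish "not found" from finding an element that is itself #f; find-tail avoids that ambiguity because a successful result is always a pair.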
Return the longest initial prefix of lst whose elements all satisfy the predicate pred.
take-while! is allowed, but not required, to modify the input list while producing the result.
Drop the longest initial prefix of lst whose elements all satisfy the predicate pred.
span splits the list lst into the longest initial prefix whose elements all satisfy the predicate pred, and the remaining tail. break inverts the sense of the predicate.
span! and break! are allowed, but not required, to modify the structure of the input list lst in order to produce the result.
Note that the name break conflicts with the break binding established by while (see while do). Applications wanting to use break from within a while loop will need to make a new define under a different name.
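A sketch of the prefix procedures above, assuming (srfi srfi-1) is loaded and that break is called outside any while loop. The variable names are illustrative; call-with-values gathers the two values from span and break.

```scheme
(use-modules (srfi srfi-1))

(define prefix  (take-while even? '(2 4 5 6)))   ; the leading evens
(define dropped (drop-while even? '(2 4 5 6)))   ; what remains after them

;; span returns both parts at once.
(define span-parts
  (call-with-values (lambda () (span even? '(2 4 5 6))) list))

;; break splits at the first element that does satisfy pred.
(define break-parts
  (call-with-values (lambda () (break even? '(1 3 4 5))) list))
```

Note that the 6 in the first list stays in the tail: only the initial run of matching elements is taken.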
Test whether any set of elements from lst1 ... lstN satisfies pred. If so the return value is the return from the successful pred call, or if not the return is #f.
Each pred call is (pred elem1 ... elemN) taking an element from each lst. The calls are made successively for the first, second, etc elements of the lists, stopping when pred returns non-#f, or when the end of the shortest list is reached.
The pred call on the last set of elements (ie. when the end of the shortest list has been reached), if that point is reached, is a tail call.
Test whether every set of elements from lst1 ... lstN satisfies pred. If so the return value is the return from the final pred call, or if not the return is #f.
Each pred call is (pred elem1 ... elemN) taking an element from each lst. The calls are made successively for the first, second, etc elements of the lists, stopping if pred returns #f, or when the end of any of the lists is reached.
The pred call on the last set of elements (ie. when the end of the shortest list has been reached) is a tail call.
If one of lst1 ... lstN is empty then no calls to pred are made, and the return is #t.
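The following sketch shows that any and every return the value of the pred call rather than just a boolean. The variable names are illustrative.

```scheme
(use-modules (srfi srfi-1))

(define any-even (any even? '(1 3 4)))     ; #t, from (even? 4)
(define no-even  (any even? '(1 3 5)))     ; #f, nothing passed

;; The return is whatever pred returned on success, here 30 for x = 3.
(define first-big
  (any (lambda (x) (and (> x 2) (* x 10))) '(1 3 4)))

;; every returns the value of the final call, here (+ 2 20).
(define last-sum (every + '(1 2) '(10 20)))

;; With an empty list no calls are made and the result is #t.
(define vacuous (every even? '()))
```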
Return the index of the first set of elements, one from each of lst1...lstN, which satisfies pred.
pred is called as (pred elem1 ... elemN). Searching stops when the end of the shortest lst is reached. The return index starts from 0 for the first set of elements. If no set of elements passes then the return is #f.
(list-index odd? '(2 4 6 9)) ⇒ 3
(list-index = '(1 2 3) '(3 1 2)) ⇒ #f
Return the first sublist of lst whose car is equal to x. If x does not appear in lst, return #f.
Equality is determined by equal?, or by the equality predicate = if given. = is called (= x elem), ie. with the given x first, so for example to find the first element greater than 5,
(member 5 '(3 5 1 7 2 9) <) ⇒ (7 2 9)
This version of member extends the core member (see List Searching) by accepting an equality predicate.
Return a list containing the elements of lst but with those equal to x deleted. The returned elements will be in the same order as they were in lst.
Equality is determined by the = predicate, or equal? if not given. An equality call is made just once for each element, but the order in which the calls are made on the elements is unspecified.
The equality calls are always (= x elem), ie. the given x is first. This means for instance elements greater than 5 can be deleted with (delete 5 lst <).
delete does not modify lst, but the return might share a common tail with lst. delete! may modify the structure of lst to construct its return.
These functions extend the core delete and delete! (see List Modification) in accepting an equality predicate. See also lset-difference (see SRFI-1 Set Operations) for deleting multiple elements from a list.
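A short sketch of delete with and without an explicit predicate; the variable names are illustrative.

```scheme
(use-modules (srfi srfi-1))

;; Default equal? comparison: every 3 is removed.
(define without-threes (delete 3 '(1 3 2 3)))

;; With < as the predicate the calls are (< 5 elem),
;; so elements greater than 5 are removed.
(define at-most-five (delete 5 '(3 7 1 9) <))
```

Here without-threes is (1 2) and at-most-five is (3 1).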
Return a list containing the elements of lst but without duplicates.
When elements are equal, only the first in lst is retained. Equal elements can be anywhere in lst, they don't have to be adjacent. The returned list will have the retained elements in the same order as they were in lst.
Equality is determined by the = predicate, or equal? if not given. Calls (= x y) are made with element x being before y in lst. A call is made at most once for each combination, but the sequence of the calls across the elements is unspecified.
delete-duplicates does not modify lst, but the return might share a common tail with lst. delete-duplicates! may modify the structure of lst to construct its return.
In the worst case, this is an O(N^2) algorithm because it must check each element against all those preceding it. For long lists it is more efficient to sort and then compare only adjacent elements.
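As a sketch, the following shows that the first occurrence of each element is the one kept, both with the default equal? and with a custom predicate. The variable names are illustrative.

```scheme
(use-modules (srfi srfi-1))

;; Duplicates need not be adjacent; the first a and first b survive.
(define unique-symbols (delete-duplicates '(a b a c b)))

;; Case-insensitive comparison keeps "a" and "B", the first of each kind.
(define unique-strings
  (delete-duplicates '("a" "B" "A" "b") string-ci=?))
```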
Association lists are described in detail in section Association Lists. The present section only documents the additional procedures for dealing with association lists defined by SRFI-1.
Return the pair from alist which matches key. This extends the core assoc (see Retrieving Alist Entries) by taking an optional = comparison procedure.
The default comparison is equal?. If an = parameter is given it's called (= key alistcar), i.e. the given target key is the first argument, and a car from alist is second.
For example a case-insensitive string lookup,
(assoc "yy" '(("XX" . 1) ("YY" . 2)) string-ci=?) ⇒ ("YY" . 2)
Cons a new association key and datum onto alist and return the result. This is equivalent to
(cons (cons key datum) alist)
acons (see Adding or Setting Alist Entries) in the Guile core does the same thing.
Return a newly allocated copy of alist, meaning that the spine of the list as well as the pairs are copied.
Return a list containing the elements of alist but with those elements whose keys are equal to key deleted. The returned elements will be in the same order as they were in alist.
Equality is determined by the = predicate, or equal? if not given. The order in which elements are tested is unspecified, but each equality call is made (= key alistkey), i.e. the given key parameter is first and the key from alist second. This means for instance all associations with a key greater than 5 can be removed with (alist-delete 5 alist <).
alist-delete does not modify alist, but the return might share a common tail with alist. alist-delete! may modify the list structure of alist to construct its return.
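The association list procedures above (alist-cons, alist-copy and alist-delete) can be sketched as follows, assuming (srfi srfi-1) is loaded. The names base, extended, without-a and copied are illustrative.

```scheme
(use-modules (srfi srfi-1))

(define base (list (cons 'a 1) (cons 'b 2) (cons 'a 3)))

;; alist-cons adds a new entry at the front without touching base.
(define extended (alist-cons 'c 4 base))

;; alist-delete removes every entry whose key matches, not just the first.
(define without-a (alist-delete 'a base))

;; alist-copy duplicates the association pairs as well as the spine,
;; so the copy is equal? to base but shares no pairs with it.
(define copied (alist-copy base))
(define fresh-pairs? (not (eq? (car copied) (car base))))
```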
Lists can be used to represent sets of objects. The procedures in this section operate on such lists as sets.
Note that lists are not an efficient way to implement large sets. The procedures here typically take time proportional to m×n when operating on lists of m and n elements. Other data structures like trees, bitsets (see Bit Vectors) or hash tables (see Hash Tables) are faster.
All these procedures take an equality predicate as the first argument.
This predicate is used for testing the objects in the list sets for
sameness. This predicate must be consistent with eq?
(see Equality) in the sense that if two list elements are
eq? then they must also be equal under the predicate. This
simply means a given object must be equal to itself.
Return #t if each list is a subset of the one following it. Ie. list1 is a subset of list2, list2 a subset of list3, etc, for as many lists as given. If only one list or no lists are given then the return is #t.
A list x is a subset of y if each element of x is equal to some element in y. Elements are compared using the given = procedure, called as (= xelem yelem).
(lset<= eq?) ⇒ #t
(lset<= eqv? '(1 2 3) '(1)) ⇒ #f
(lset<= eqv? '(1 3 2) '(4 3 1 2)) ⇒ #t
Return #t if all argument lists are set-equal. list1 is compared to list2, list2 to list3, etc, for as many lists as given. If only one list or no lists are given then the return is #t.
Two lists x and y are set-equal if each element of x is equal to some element of y and conversely each element of y is equal to some element of x. The order of the elements in the lists doesn't matter. Element equality is determined with the given = procedure, called as (= xelem yelem), but exactly which calls are made is unspecified.
(lset= eq?) ⇒ #t
(lset= eqv? '(1 2 3) '(3 2 1)) ⇒ #t
(lset= string-ci=? '("a" "A" "b") '("B" "b" "a")) ⇒ #t
Add to list any of the given elems not already in the list. elems are consed onto the start of list (so the return shares a common tail with list), but the order they're added is unspecified.
The given = procedure is used for comparing elements, called as (= listelem elem), ie. the second argument is one of the given elem parameters.
(lset-adjoin eqv? '(1 2 3) 4 1 5) ⇒ (5 4 1 2 3)
Return the union of the argument list sets. The result is built by taking the union of list1 and list2, then the union of that with list3, etc, for as many lists as given. For one list argument that list itself is the result, for no list arguments the result is the empty list.
The union of two lists x and y is formed as follows. If x is empty then the result is y. Otherwise start with x as the result and consider each y element (from first to last). A y element not equal to something already in the result is consed onto the result.
The given = procedure is used for comparing elements, called as (= relem yelem). The first argument is from the result accumulated so far, and the second is from the list being union-ed in. But exactly which calls are made is otherwise unspecified.
Notice that duplicate elements in list1 (or the first non-empty list) are preserved, but that repeated elements in subsequent lists are only added once.
(lset-union eqv?) ⇒ ()
(lset-union eqv? '(1 2 3)) ⇒ (1 2 3)
(lset-union eqv? '(1 2 1 3) '(2 4 5) '(5)) ⇒ (5 4 1 2 1 3)
lset-union doesn't change the given lists but the result may share a tail with the first non-empty list. lset-union! can modify all of the given lists to form the result.
Return the intersection of list1 with the other argument lists, meaning those elements of list1 which are also in all of list2 etc. For one list argument, just that list is returned.
The test for an element of list1 to be in the return is simply that it's equal to some element in each of list2 etc. Notice this means an element appearing twice in list1 but only once in each of list2 etc will go into the return twice. The return has its elements in the same order as they were in list1.
The given = procedure is used for comparing elements, called as (= elem1 elemN). The first argument is from list1 and the second is from one of the subsequent lists. But exactly which calls are made and in what order is unspecified.
(lset-intersection eqv? '(x y)) ⇒ (x y)
(lset-intersection eqv? '(1 2 3) '(4 3 2)) ⇒ (2 3)
(lset-intersection eqv? '(1 1 2 2) '(1 2) '(2 1) '(2)) ⇒ (2 2)
The return from lset-intersection may share a tail with list1. lset-intersection! may modify list1 to form its result.
Return list1 with any elements in list2, list3 etc removed (ie. subtracted). For one list argument, just that list is returned.
The given = procedure is used for comparing elements, called as (= elem1 elemN). The first argument is from list1 and the second from one of the subsequent lists. But exactly which calls are made and in what order is unspecified.
(lset-difference eqv? '(x y)) ⇒ (x y)
(lset-difference eqv? '(1 2 3) '(3 1)) ⇒ (2)
(lset-difference eqv? '(1 2 3) '(3) '(2)) ⇒ (1)
The return from lset-difference may share a tail with list1. lset-difference! may modify list1 to form its result.
Return two values (see Multiple Values), the difference and intersection of the argument lists as per lset-difference and lset-intersection above.
For two list arguments this partitions list1 into those elements of list1 which are not in list2 (the difference) and those which are in list2 (the intersection). (But for more than two arguments there can be elements of list1 which are neither part of the difference nor the intersection.)
One of the return values from lset-diff+intersection may share a tail with list1. lset-diff+intersection! may modify list1 to form its results.
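A sketch of lset-diff+intersection for the two-list case, assuming (srfi srfi-1) is loaded. The variable names are illustrative, and call-with-values simply binds the two returned values into a list so they can be inspected.

```scheme
(use-modules (srfi srfi-1))

;; The first value holds the elements of list1 absent from list2,
;; the second holds the elements of list1 also present in list2.
(define both
  (call-with-values
      (lambda () (lset-diff+intersection eqv? '(1 2 3 4) '(2 4)))
    list))

(define difference   (car both))    ; contains 1 and 3
(define intersection (cadr both))   ; contains 2 and 4
```

This matches calling lset-difference and lset-intersection separately on the same arguments, but does the work in one traversal of list1.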
Return an XOR of the argument lists. For two lists this means those elements which are in exactly one of the lists. For more than two lists it means those elements which appear in an odd number of the lists.
To be precise, the XOR of two lists x and y is formed by taking those elements of x not equal to any element of y, plus those elements of y not equal to any element of x. Equality is determined with the given = procedure, called as (= e1 e2). One argument is from x and the other from y, but which way around is unspecified. Exactly which calls are made is also unspecified, as is the order of the elements in the result.
(lset-xor eqv? '(x y)) ⇒ (x y)
(lset-xor eqv? '(1 2 3) '(4 3 2)) ⇒ (4 1)
The return from lset-xor may share a tail with one of the list arguments. lset-xor! may modify list1 to form its result.
The following syntax can be obtained with
(use-modules (srfi srfi-2))
A combination of and and let*.
Each clause is evaluated in turn, and if #f is obtained then evaluation stops and #f is returned. If all are non-#f then body is evaluated and the last form gives the return value, or if body is empty then the result is #t. Each clause should be one of the following,
- (symbol expr)
- Evaluate expr, check for #f, and bind it to symbol. Like let*, that binding is available to subsequent clauses.
- (expr)
- Evaluate expr and check for #f.
- symbol
- Get the value bound to symbol and check for #f.
Notice that (expr) has an “extra” pair of parentheses, for instance ((eq? x y)). One way to remember this is to imagine the symbol in (symbol expr) is omitted.
and-let* is good for calculations where a #f value means termination, but where a non-#f value is going to be needed in subsequent expressions.
The following illustrates this, it returns text between brackets ‘[...]’ in a string, or #f if there are no such brackets (ie. either string-index gives #f).
(define (extract-brackets str)
  (and-let* ((start (string-index str #\[))
             (end   (string-index str #\] start)))
    (substring str (1+ start) end)))
The following shows plain variables and expressions tested too. diagnostic-levels is taken to be an alist associating a diagnostic type with a level. str is printed only if the type is known and its level is high enough.
(define (show-diagnostic type str)
  (and-let* (want-diagnostics
             (level (assq-ref diagnostic-levels type))
             ((>= level current-diagnostic-level)))
    (display str)))
The advantage of and-let* is that an extended sequence of expressions and tests doesn't require lots of nesting as would arise from separate and and let*, or from cond with =>.
SRFI-4 provides an interface to uniform numeric vectors: vectors whose elements are all of a single numeric type. Guile offers uniform numeric vectors for signed and unsigned 8-bit, 16-bit, 32-bit, and 64-bit integers, two sizes of floating point values, and, as an extension to SRFI-4, complex floating-point numbers of these two sizes.
The standard SRFI-4 procedures and data types may be included via loading the appropriate module:
(use-modules (srfi srfi-4))
This module is currently a part of the default Guile environment, but it is a good practice to explicitly import the module. In the future, using SRFI-4 procedures without importing the SRFI-4 module will cause a deprecation message to be printed. (Of course, one may call the C functions at any time. Would that C had modules!)
Uniform numeric vectors can be useful since they consume less memory than the non-uniform, general vectors. Also, since the types they can store correspond directly to C types, it is easier to work with them efficiently on a low level. Consider image processing as an example, where you want to apply a filter to some image. While you could store the pixels of an image in a general vector and write a general convolution function, things are much more efficient with uniform vectors: the convolution function knows that all pixels are unsigned 8-bit values (say), and can use a very tight inner loop.
This is implemented in Scheme by having the compiler notice calls to the SRFI-4 accessors, and inline them to appropriate compiled code. From C you have access to the raw array; functions for efficiently working with uniform numeric vectors from C are listed at the end of this section.
Uniform numeric vectors are the special case of one dimensional uniform numeric arrays.
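There are 10 standard kinds of uniform numeric vectors, and they all have their own complement of constructors, accessors, and so on. Procedures that operate on a specific kind of uniform numeric vector have a “tag” in their name, indicating the element type.
- u8
- unsigned 8-bit integers
- s8
- signed 8-bit integers
- u16
- unsigned 16-bit integers
- s16
- signed 16-bit integers
- u32
- unsigned 32-bit integers
- s32
- signed 32-bit integers
- u64
- unsigned 64-bit integers
- s64
- signed 64-bit integers
- f32
- the C type float
- f64
- the C type double
In addition, Guile supports uniform arrays of complex numbers, with the nonstandard tags:
- c32
- complex numbers in rectangular form with the real and imaginary part being a float
- c64
- complex numbers in rectangular form with the real and imaginary part being a double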
The external representation (ie. read syntax) for these vectors is similar to normal Scheme vectors, but with an additional tag from the tables above indicating the vector's type. For example,
#u16(1 2 3)
#f64(3.1415 2.71)
Note that the read syntax for floating-point here conflicts with
#f for false. In Standard Scheme one can write (1 #f3)
for a three element list (1 #f 3), but for Guile (1 #f3)
is invalid. (1 #f 3) is almost certainly what one should write
anyway to make the intention clear, so this is rarely a problem.
Note that the c32 and c64 functions are only available from
(srfi srfi-4 gnu).
Return #t if obj is a homogeneous numeric vector of the indicated type.
Return a newly allocated homogeneous numeric vector holding n elements of the indicated type. If value is given, the vector is initialized with that value, otherwise the contents are unspecified.
Return a newly allocated homogeneous numeric vector of the indicated type, holding the given parameter values. The vector length is the number of parameters given.
Return the number of elements in vec.
Return the element at index i in vec. The first element in vec is index 0.
Set the element at index i in vec to value. The first element in vec is index 0. The return value is unspecified.
Return a newly allocated list holding all elements of vec.
Return a newly allocated homogeneous numeric vector of the indicated type, initialized with the elements of the list lst.
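As a sketch of the Scheme-level procedures described above, the following uses the u8, u16 and f64 variants; the same pattern applies to the other tags. The variable names are illustrative only.

```scheme
(use-modules (srfi srfi-4))

;; Four unsigned bytes, all initialized to 0.
(define pixels (make-u8vector 4 0))
(u8vector-set! pixels 0 255)
(u8vector-set! pixels 3 128)

(define pixel-list (u8vector->list pixels))

;; Build a vector of doubles from a list and query it.
(define samples (list->f64vector '(1.5 2.5)))
(define sample-count (f64vector-length samples))
(define second-sample (f64vector-ref samples 1))

;; The read syntax yields the same kind of object as the constructor.
(define from-literal (u16vector-ref #u16(1 2 3) 1))
(define is-u16? (u16vector? (u16vector 1 2 3)))
```

Passing an initial value to make-u8vector is advisable, since the contents are otherwise unspecified.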
Return a new uniform numeric vector of the indicated type and length that uses the memory pointed to by data to store its elements. This memory will eventually be freed with free. The argument len specifies the number of elements in data, not its size in bytes.
The c32 and c64 variants take a pointer to a C array of floats or doubles. The real parts of the complex numbers are at even indices in that array; the corresponding imaginary parts are at the following odd index.
Like scm_vector_elements (see Vector Accessing from C), but returns a pointer to the elements of a uniform numeric vector of the indicated kind.
Like scm_vector_writable_elements (see Vector Accessing from C), but returns a pointer to the elements of a uniform numeric vector of the indicated kind.
Guile also provides procedures that operate on all types of uniform numeric
vectors. In what is probably a bug, these procedures are currently available in
the default environment as well; however prudent hackers will make sure to
import (srfi srfi-4 gnu) before using these.
Return non-zero when uvec is a uniform numeric vector, zero otherwise.
Return the number of elements of uvec as a size_t.
Return #t if obj is a homogeneous numeric vector of the indicated type.
Return the number of elements in vec.
Return the element at index i in vec. The first element in vec is index 0.
Set the element at index i in vec to value. The first element in vec is index 0. The return value is unspecified.
Return a newly allocated list holding all elements of vec.
Like scm_vector_elements (see Vector Accessing from C), but returns a pointer to the elements of a uniform numeric vector.
Like scm_vector_writable_elements (see Vector Accessing from C), but returns a pointer to the elements of a uniform numeric vector.
Unless you really need the limited generality of these functions, it is best to use the type-specific functions, or the generalized vector accessors.
Guile implements SRFI-4 vectors using bytevectors (see Bytevectors). Often when you have a numeric vector, you end up wanting to write its bytes somewhere, or have access to the underlying bytes, or read in bytes from somewhere else. Bytevectors are very good at this sort of thing. But the SRFI-4 APIs are nicer to use when doing number-crunching, because they are addressed by element and not by byte.
So as a compromise, Guile allows all bytevector functions to operate on numeric vectors. They address the underlying bytes in the native endianness, as one would expect.
Following the same reasoning, that it's just bytes underneath, Guile also allows
uniform vectors of a given type to be accessed as if they were of any type. One
can fill a u32vector, and access its elements with
u8vector-ref. One can use f64vector-ref on bytevectors. It's
all the same to Guile.
In this way, uniform numeric vectors may be written to and read from input/output ports using the procedures that operate on bytevectors.
See Bytevectors, for more information.
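The interplay between the two APIs can be sketched as follows. This example assumes the (rnrs bytevectors) module for the bytevector procedures; the variable name is illustrative. Only native-endian accessors are used, so the results do not depend on the byte order of the host.

```scheme
(use-modules (srfi srfi-4) (rnrs bytevectors))

(define words (u32vector 1 2 3))

;; A u32vector is also a bytevector: three elements of 4 bytes each.
(bytevector? words)                        ; ⇒ #t
(bytevector-length words)                  ; ⇒ 12

;; Read the second element through the bytevector API.
;; The index here is a byte offset, not an element index.
(bytevector-u32-native-ref words 4)        ; ⇒ 2

;; Writes made through the bytevector API are seen by SRFI-4 accessors.
(bytevector-u32-native-set! words 8 99)
(u32vector-ref words 2)                    ; ⇒ 99
```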
Guile defines some useful extensions to SRFI-4, which are not available in the default Guile environment. They may be imported by loading the extensions module:
(use-modules (srfi srfi-4 gnu))
Return a (maybe newly allocated) uniform numeric vector of the indicated type, initialized with the elements of obj, which must be a list, a vector, or a uniform vector. When obj is already a suitable uniform numeric vector, it is returned unchanged.
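A small sketch of the conversion procedure described above, using the u8 tag (any->u8vector) as a representative; the variable name v is illustrative.

```scheme
(use-modules (srfi srfi-4) (srfi srfi-4 gnu))

;; Lists and ordinary vectors are copied into a new u8vector.
(u8vector->list (any->u8vector '(1 2 3)))          ; ⇒ (1 2 3)
(u8vector->list (any->u8vector (vector 4 5 6)))    ; ⇒ (4 5 6)

;; An object that is already a u8vector is returned as-is.
(define v (u8vector 7 8 9))
(eq? v (any->u8vector v))                          ; ⇒ #t
```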
SRFI-6 defines the procedures open-input-string,
open-output-string and get-output-string. These
procedures are included in the Guile core, so using this module does not
make any difference at the moment. But it is possible that support for
SRFI-6 will be factored out of the core library in the future, so using
this module does not hurt, after all.
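For readers unfamiliar with string ports, here is a minimal sketch of the three procedures in use. The port names out and in are illustrative; no module import is needed since, as noted above, the procedures are in the Guile core.

```scheme
;; Collect output in a string.
(define out (open-output-string))
(write 'hello out)
(display ", " out)
(write "world" out)
(get-output-string out)          ; ⇒ "hello, \"world\""

;; Read Scheme data from a string.
(define in (open-input-string "(1 2 3) foo"))
(read in)                        ; ⇒ (1 2 3)
(read in)                        ; ⇒ foo
```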
receive is a syntax for making the handling of multiple-value
procedures easier. It is documented in Multiple Values.
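As a quick sketch of typical usage, the example below binds the values returned by a hypothetical helper, min+max, which is defined here only for illustration.

```scheme
(use-modules (srfi srfi-8))

;; Hypothetical helper returning two values.
(define (min+max lst)
  (values (apply min lst) (apply max lst)))

(receive (lo hi) (min+max '(3 1 4 1 5))
  (list lo hi))                            ; ⇒ (1 5)

;; A dotted formals list collects the remaining values into a list.
(receive (head . rest) (values 1 2 3)
  (list head rest))                        ; ⇒ (1 (2 3))
```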
This SRFI is a syntax for defining new record types and creating predicate, constructor, and field getter and setter functions. In Guile this is simply an alternate interface to the core record functionality (see Records). It can be used with,
(use-modules (srfi srfi-9))
Create a new record type, and make various defines for using it. This syntax can only occur at the top-level, not nested within some other form.
type is bound to the record type, which is as per the return from the core make-record-type. type also provides the name for the record, as per record-type-name.
constructor is bound to a function to be called as (constructor fieldval ...) to create a new record of this type. The arguments are initial values for the fields, one argument for each field, in the order they appear in the define-record-type form.
The fieldnames provide the names for the record fields, as per the core record-type-fields etc, and are referred to in the subsequent accessor/modifier forms.
predicate is bound to a function to be called as (predicate obj). It returns #t or #f according to whether obj is a record of this type.
Each accessor is bound to a function to be called (accessor record) to retrieve the respective field from a record. Similarly each modifier is bound to a function to be called (modifier record val) to set the respective field in a record.
An example will illustrate typical usage,
(define-record-type employee-type
  (make-employee name age salary)
  employee?
  (name    get-employee-name)
  (age     get-employee-age    set-employee-age)
  (salary  get-employee-salary set-employee-salary))
This creates a new employee data type, with name, age and salary fields. Accessor functions are created for each field, but no modifier function for the name (the intention in this example being that it's established only when an employee object is created). These can all then be used as for example,
employee-type ⇒ #<record-type employee-type>
(define fred (make-employee "Fred" 45 20000.00))
(employee? fred) ⇒ #t
(get-employee-age fred) ⇒ 45
(set-employee-salary fred 25000.00) ;; pay rise
The functions created by define-record-type are ordinary
top-level defines. They can be redefined or set! as
desired, exported from a module, etc.
The SRFI-9 specification explicitly disallows record definitions in a
non-toplevel context, such as inside a lambda body or inside a
let block. However, Guile's implementation does not enforce that
restriction.
You may use set-record-type-printer! to customize the default printing
behavior of records. This is a Guile extension and is not part of SRFI-9. It
is located in the (srfi srfi-9 gnu) module.
Where type corresponds to the first argument of
define-record-type, and thunk is a procedure accepting two arguments, the record to print, and an output port.
This example prints the employee's name in brackets, for instance [Fred].
(set-record-type-printer! employee-type
  (lambda (record port)
    (write-char #\[ port)
    (display (get-employee-name record) port)
    (write-char #\] port)))
This SRFI implements a reader extension #,() called hash-comma.
It allows the reader to give new kinds of objects, for use both in
data and as constants or literals in source code. This feature is
available with
(use-modules (srfi srfi-10))
The new read syntax is of the form
#,(tag arg...)
where tag is a symbol and the args are objects taken as parameters. tags are registered with the following procedure.
Register proc as the constructor for a hash-comma read syntax starting with symbol tag, i.e. #,(tag arg...). proc is called with the given arguments (proc arg...) and the object it returns is the result of the read.
For example, a syntax giving a list of N copies of an object.
(define-reader-ctor 'repeat
  (lambda (obj reps)
    (make-list reps obj)))
(display '#,(repeat 99 3))
-| (99 99 99)
Notice the quote ' when the #,( ) is used. The
repeat handler returns a list and the program must quote to use
it literally, the same as any other list. Ie.
(display '#,(repeat 99 3))
⇒
(display '(99 99 99))
When a handler returns an object which is self-evaluating, like a number or a string, then there's no need for quoting, just as there's no need when giving those directly as literals. For example an addition,
(define-reader-ctor 'sum
  (lambda (x y)
    (+ x y)))
(display #,(sum 123 456)) -| 579
A typical use for #,() is to get a read syntax for objects
which don't otherwise have one. For example, the following allows a
hash table to be given literally, with tags and values, ready for fast
lookup.
(define-reader-ctor 'hash
  (lambda elems
    (let ((table (make-hash-table)))
      (for-each (lambda (elem)
                  (apply hash-set! table elem))
                elems)
      table)))
(define (animal->family animal)
  (hash-ref '#,(hash ("tiger" "cat")
                     ("lion"  "cat")
                     ("wolf"  "dog"))
            animal))
(animal->family "lion") ⇒ "cat"
Or for example the following is a syntax for a compiled regular expression (see Regular Expressions).
(use-modules (ice-9 regex))
(define-reader-ctor 'regexp make-regexp)
(define (extract-angs str)
  (let ((match (regexp-exec '#,(regexp "<([A-Z0-9]+)>") str)))
    (and match
         (match:substring match 1))))
(extract-angs "foo <BAR> quux") ⇒ "BAR"
#,() is somewhat similar to define-macro
(see Macros) in that handler code is run to produce a result, but
#,() operates at the read stage, so it can appear in data for
read (see Scheme Read), not just in code to be executed.
Because #,() is handled at read-time it has no direct access
to variables etc. A symbol in the arguments is just a symbol, not a
variable reference. The arguments are essentially constants, though
the handler procedure can use them in any complicated way it might
want.
Once (srfi srfi-10) has loaded, #,() is available
globally, there's no need to use (srfi srfi-10) in later
modules. Similarly the tags registered are global and can be used
anywhere once registered.
There's no attempt to record what previous #,() forms have
been seen, if two identical forms occur then two calls are made to the
handler procedure. The handler might like to maintain a cache or
similar to avoid making copies of large objects, depending on expected
usage.
In code the best uses of #,() are generally when there's a
lot of objects of a particular kind as literals or constants. If
there's just a few then some local variables and initializers are
fine, but that becomes tedious and error prone when there's a lot, and
the anonymous and compact syntax of #,() is much better.
This module implements the binding forms for multiple values
let-values and let*-values. These forms are similar to
let and let* (see Local Bindings), but they support
binding of the values returned by multiple-valued expressions.
Write (use-modules (srfi srfi-11)) to make the bindings
available.
(let-values (((x y) (values 1 2))
             ((z f) (values 3 4)))
  (+ x y z f))
⇒
10
let-values performs all bindings simultaneously, which means that
no expression in the binding clauses may refer to variables bound in the
same clause list. let*-values, on the other hand, performs the
bindings sequentially, just like let* does for single-valued
expressions.
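The difference between the two forms can be sketched as follows; the variable names are illustrative.

```scheme
(use-modules (srfi srfi-11))

;; let*-values: the second clause may refer to a and b from the first.
(let*-values (((a b) (values 1 2))
              ((sum) (+ a b)))
  (list a b sum))                          ; ⇒ (1 2 3)

;; let-values: all init expressions are evaluated in the outer scope,
;; so (* a 2) sees the outer a (10), not the a bound by the first clause.
(let ((a 10))
  (let-values (((a b) (values 1 2))
               ((c) (values (* a 2))))
    (list a b c)))                         ; ⇒ (1 2 20)
```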
The SRFI-13 procedures are always available; see Strings.
The SRFI-14 data type and procedures are always available; see Character Sets.
SRFI-16 defines a variable-arity lambda form,
case-lambda. This form is available in the default Guile
environment. See Case-lambda, for more information.
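A minimal sketch of case-lambda, using a hypothetical procedure named area that dispatches on the number of arguments it receives:

```scheme
(define area
  (case-lambda
    ((side) (* side side))                 ; one argument: a square
    ((width height) (* width height))      ; two arguments: a rectangle
    ((width height . more)                 ; three or more: multiply them all
     (apply * width height more))))

(area 3)          ; ⇒ 9
(area 3 4)        ; ⇒ 12
(area 2 3 4)      ; ⇒ 24
```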
This SRFI implements a generalized set!, allowing some
“referencing” functions to be used as the target location of a
set!. This feature is available from
(use-modules (srfi srfi-17))
For example vector-ref is extended so that
(set! (vector-ref vec idx) new-value)
is equivalent to
(vector-set! vec idx new-value)
The idea is that a vector-ref expression identifies a location,
which may be either fetched or stored. The same form is used for the
location in both cases, encouraging visual clarity. This is similar
to the idea of an “lvalue” in C.
The mechanism for this kind of set! is in the Guile core
(see Procedures with Setters). This module adds definitions of
the following functions as procedures with setters, allowing them to
be targets of a set!,
car, cdr, caar, cadr, cdar, cddr, caaar, caadr, cadar, caddr, cdaar, cdadr, cddar, cdddr, caaaar, caaadr, caadar, caaddr, cadaar, cadadr, caddar, cadddr, cdaaar, cdaadr, cdadar, cdaddr, cddaar, cddadr, cdddar, cddddr
string-ref, vector-ref
The SRFI specifies setter (see Procedures with Setters) as
a procedure with setter, allowing the setter for a procedure to be
changed, eg. (set! (setter foo) my-new-setter-handler).
Currently Guile does not implement this, a setter can only be
specified on creation (getter-with-setter below).
The same as the Guile core make-procedure-with-setter (see Procedures with Setters).
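The following sketch shows the generalized set! on a few of the procedures listed above, and then getter-with-setter applied to a pair of hypothetical procedures. The names lst, vec, store and counter are illustrative only.

```scheme
(use-modules (srfi srfi-17))

(define lst (list 1 2 3))
(set! (car lst) 'a)              ; same effect as (set-car! lst 'a)
(set! (caddr lst) 'c)
lst                              ; ⇒ (a 2 c)

(define vec (vector 1 2 3))
(set! (vector-ref vec 1) 'x)     ; same effect as (vector-set! vec 1 'x)
vec                              ; ⇒ #(1 x 3)

;; getter-with-setter ties a setter to a getter when the getter is created.
(define store (vector 0))
(define counter
  (getter-with-setter
   (lambda () (vector-ref store 0))
   (lambda (value) (vector-set! store 0 value))))

(set! (counter) 42)
(counter)                        ; ⇒ 42
```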
This is an implementation of the SRFI-18 threading and synchronization library. The functions and variables described here are provided by
(use-modules (srfi srfi-18))
As a general rule, the data types and functions in this SRFI-18
implementation are compatible with the types and functions in Guile's
core threading code. For example, mutexes created with the SRFI-18
make-mutex function can be passed to the built-in Guile
function lock-mutex (see Mutexes and Condition Variables),
and mutexes created with the built-in Guile function make-mutex
can be passed to the SRFI-18 function mutex-lock!. Cases in
which this does not hold true are noted in the following sections.
Threads created by SRFI-18 differ in two ways from threads created by
Guile's built-in thread functions. First, a thread created by SRFI-18
make-thread begins in a blocked state and will not start
execution until thread-start! is called on it. Second, SRFI-18
threads are constructed with a top-level exception handler that
captures any exceptions that are thrown on thread exit. In all other
regards, SRFI-18 threads are identical to normal Guile threads.
Returns the thread that called this function. This is the same procedure as the same-named built-in procedure current-thread (see Threads).
Returns #t if obj is a thread, #f otherwise. This is the same procedure as the same-named built-in procedure thread? (see Threads).
Call thunk in a new thread and with a new dynamic state, returning the new thread and optionally assigning it the object name name, which may be any Scheme object.
Note that the name make-thread conflicts with the (ice-9 threads) function make-thread. Applications wanting to use both of these functions will need to refer to them by different names.
Returns the name assigned to thread at the time of its creation, or #f if it was not given a name.
Get or set the “object-specific” property of thread. In Guile's implementation of SRFI-18, this value is stored as an object property, and will be #f if not set.
Unblocks thread and allows it to begin execution if it has not done so already.
If one or more threads are waiting to execute, calling thread-yield! forces an immediate context switch to one of them. Otherwise, thread-yield! has no effect. thread-yield! behaves identically to the Guile built-in function yield.
The current thread waits until the point specified by the time object timeout is reached (see SRFI-18 Time). This blocks the thread only if timeout represents a point in the future. It is an error for timeout to be #f.
Causes an abnormal termination of thread. If thread is not already terminated, all mutexes owned by thread become unlocked/abandoned. If thread is the current thread, thread-terminate! does not return. Otherwise thread-terminate! returns an unspecified value; the termination of thread will occur before thread-terminate! returns. Subsequent attempts to join on thread will cause a “terminated thread exception” to be raised.
thread-terminate! is compatible with the thread cancellation procedures in the core threads API (see Threads) in that if a cleanup handler has been installed for the target thread, it will be called before the thread exits and its return value (or exception, if any) will be stored for later retrieval via a call to thread-join!.
Wait for thread to terminate and return its exit value. When a time value timeout is given, it specifies a point in time where the waiting should be aborted. When the waiting is aborted, timeoutval is returned if it is specified; otherwise, a join-timeout-exception exception is raised (see SRFI-18 Exceptions). Exceptions may also be raised if the thread was terminated by a call to thread-terminate! (terminated-thread-exception will be raised) or if the thread exited by raising an exception that was handled by the top-level exception handler (uncaught-exception will be raised; the original exception can be retrieved using uncaught-exception-reason).
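The thread life cycle described above (create, start, join) can be sketched as follows. This assumes a Guile built with thread support; the names worker and answer, and the thread name answer-worker, are illustrative.

```scheme
(use-modules (srfi srfi-18))

;; The new thread is created in a blocked state.
(define worker
  (make-thread (lambda () (* 6 7)) 'answer-worker))

(thread-name worker)             ; ⇒ answer-worker

;; Nothing runs until the thread is explicitly started.
(thread-start! worker)

;; Block until the thread finishes and collect its exit value.
(define answer (thread-join! worker))
answer                           ; ⇒ 42
```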
The behavior of Guile's built-in mutexes is parameterized via a set of
flags passed to the make-mutex procedure in the core
(see Mutexes and Condition Variables). To satisfy the requirements
for mutexes specified by SRFI-18, the make-mutex procedure
described below sets the following flags:
recursive: the mutex can be locked recursively
unchecked-unlock: attempts to unlock a mutex that is already
unlocked will not raise an exception
allow-external-unlock: the mutex can be unlocked by any thread,
not just the thread that locked it originally
Returns a new mutex, optionally assigning it the object name name, which may be any Scheme object. The returned mutex will be created with the configuration described above. Note that the name make-mutex conflicts with Guile core function make-mutex. Applications wanting to use both of these functions will need to refer to them by different names.
Returns the name assigned to mutex at the time of its creation, or #f if it was not given a name.
Get or set the “object-specific” property of mutex. In Guile's implementation of SRFI-18, this value is stored as an object property, and will be #f if not set.
Returns information about the state of mutex. Possible values are:
- thread T: the mutex is in the locked/owned state and thread T is the owner of the mutex
- symbol not-owned: the mutex is in the locked/not-owned state
- symbol abandoned: the mutex is in the unlocked/abandoned state
- symbol not-abandoned: the mutex is in the unlocked/not-abandoned state
Lock mutex, optionally specifying a time object timeout after which to abort the lock attempt and a thread thread giving a new owner for mutex different than the current thread. This procedure has the same behavior as the lock-mutex procedure in the core library.
Unlock mutex, optionally specifying a condition variable condition-variable on which to wait, either indefinitely or, optionally, until the time object timeout has passed, to be signalled. This procedure has the same behavior as the unlock-mutex procedure in the core library.
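A short sketch of how mutex-state tracks locking and unlocking from a single thread. The mutex name guard is illustrative, and a Guile built with thread support is assumed.

```scheme
(use-modules (srfi srfi-18))

(define guard (make-mutex 'guard))
(mutex-name guard)                            ; ⇒ guard
(mutex-state guard)                           ; ⇒ not-abandoned

;; Once locked, the state is the owning thread.
(mutex-lock! guard)
(eq? (mutex-state guard) (current-thread))    ; ⇒ #t

;; After unlocking, the mutex is unlocked/not-abandoned again.
(mutex-unlock! guard)
(mutex-state guard)                           ; ⇒ not-abandoned
```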
SRFI-18 does not specify a “wait” function for condition variables.
Waiting on a condition variable can be simulated using the SRFI-18
mutex-unlock! function described in the previous section, or
Guile's built-in wait-condition-variable procedure can be used.
Returns #t if obj is a condition variable, #f otherwise. This is the same procedure as the same-named built-in procedure (see condition-variable?).
Returns a new condition variable, optionally assigning it the object name name, which may be any Scheme object. This procedure replaces a procedure of the same name in the core library.
Returns the name assigned to condition-variable at the time of its creation, or #f if it was not given a name.
Get or set the “object-specific” property of condition-variable. In Guile's implementation of SRFI-18, this value is stored as an object property, and will be #f if not set.
Wake up one thread that is waiting for condition-variable, in the case of condition-variable-signal!, or all threads waiting for it, in the case of condition-variable-broadcast!. The behavior of these procedures is equivalent to that of the procedures signal-condition-variable and broadcast-condition-variable in the core library.
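The following is one possible way to simulate a “wait” with mutex-unlock!, as mentioned at the start of this section. It is a sketch only: the flag ready?, the names lock, ready, waiter and result, and the overall structure are assumptions made for illustration. The waiter re-locks the mutex and re-checks the flag after every wake-up, since mutex-unlock! returns with the mutex unlocked.

```scheme
(use-modules (srfi srfi-18))

(define lock (make-mutex))
(define ready (make-condition-variable 'ready))
(define ready? #f)

(define waiter
  (make-thread
   (lambda ()
     (mutex-lock! lock)
     (let loop ()
       (unless ready?
         (mutex-unlock! lock ready)   ; unlock and wait for a signal
         (mutex-lock! lock)           ; re-acquire before re-checking
         (loop)))
     (mutex-unlock! lock)
     'done)))

(thread-start! waiter)

;; Publish the state change while holding the mutex, then signal.
(mutex-lock! lock)
(set! ready? #t)
(condition-variable-signal! ready)
(mutex-unlock! lock)

(define result (thread-join! waiter))
result                                ; ⇒ done
```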
The SRFI-18 time functions manipulate time in two formats: a “time object” type that represents an absolute point in time in some implementation-specific way; and the number of seconds since some unspecified “epoch”. In Guile's implementation, the epoch is the Unix epoch, 00:00:00 UTC, January 1, 1970.
Return the current time as a time object. This procedure replaces the procedure of the same name in the core library, which returns the current time in seconds since the epoch.
Convert between time objects and numerical values representing the number of seconds since the epoch. When converting from a time object to seconds, the return value is the number of seconds between time and the epoch. When converting from seconds to a time object, the return value is a time object that represents a time seconds seconds after the epoch.
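The conversions can be sketched as follows; a typical use is building an absolute timeout for thread-join! or mutex-lock!. The names now and deadline are illustrative.

```scheme
(use-modules (srfi srfi-18))

(define now (current-time))
(time? now)                                         ; ⇒ #t

;; A time object two seconds in the future, usable as a timeout.
(define deadline (seconds->time (+ (time->seconds now) 2)))
(> (time->seconds deadline) (time->seconds now))    ; ⇒ #t

;; Converting a whole number of seconds there and back.
(= (time->seconds (seconds->time 10)) 10)           ; ⇒ #t
```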
SRFI-18 exceptions are identical to the exceptions provided by Guile's implementation of SRFI-34. The behavior of exception handlers invoked to handle exceptions thrown from SRFI-18 functions, however, differs from the conventional behavior of SRFI-34 in that the continuation of the handler is the same as that of the call to the function. Handlers are called in a tail-recursive manner; the exceptions do not “bubble up”.
Installs handler as the current exception handler and calls the procedure thunk with no arguments, returning its value as the value of the exception. handler must be a procedure that accepts a single argument. The current exception handler at the time this procedure is called will be restored after the call returns.
Raise obj as an exception. This is the same procedure as the same-named procedure defined in SRFI 34.
Returns #t if obj is an exception raised as the result of performing a timed join on a thread that does not exit within the specified timeout, #f otherwise.
Returns #t if obj is an exception raised as the result of attempting to lock a mutex that has been abandoned by its owner thread, #f otherwise.
Returns #t if obj is an exception raised as the result of joining on a thread that exited as the result of a call to thread-terminate!.
uncaught-exception? returns #t if obj is an exception thrown as the result of joining a thread that exited by raising an exception that was handled by the top-level exception handler installed by make-thread. When this occurs, the original exception is preserved as part of the exception thrown by thread-join! and can be accessed by calling uncaught-exception-reason on that exception. Note that because this exception-preservation mechanism is a side-effect of make-thread, joining on threads that exited as described above but were created by other means will not raise this uncaught-exception error.
This is an implementation of the SRFI-19 time/date library. The functions and variables described here are provided by
(use-modules (srfi srfi-19))
Caution: The current code in this module incorrectly extends the Gregorian calendar leap year rule back prior to the introduction of those reforms in 1582 (or the appropriate year in various countries). The Julian calendar was used prior to 1582, and there were 10 days skipped for the reform, but the code doesn't implement that.
This will be fixed some time. Until then calculations for 1583 onwards are correct, but prior to that any day/month/year and day of the week calculations are wrong.
This module implements time and date representations and calculations, in various time systems, including universal time (UTC) and atomic time (TAI).
For those not familiar with these time systems, TAI is based on a fixed length second derived from oscillations of certain atoms. UTC differs from TAI by an integral number of seconds, which is increased or decreased at announced times to keep UTC aligned to a mean solar day (the orbit and rotation of the earth are not quite constant).
So far, only increases in the TAI <-> UTC difference have been needed. Such an increase is a “leap second”, an extra second of TAI introduced at the end of a UTC day. When working entirely within UTC this is never seen, every day simply has 86400 seconds. But when converting from TAI to a UTC date, an extra 23:59:60 is present, where normally a day would end at 23:59:59. Effectively the UTC second from 23:59:59 to 00:00:00 has taken two TAI seconds.
In the current implementation, the system clock is assumed to be UTC, and a table of leap seconds in the code converts to TAI. See comments in srfi-19.scm for how to update this table.
Also, for those not familiar with the terminology, a Julian Day is a real number which is a count of days and fraction of a day, in UTC, starting from -4713-01-01T12:00:00Z, ie. midday Monday 1 Jan 4713 B.C. A Modified Julian Day is the same, but starting from 1858-11-17T00:00:00Z, ie. midnight 17 November 1858 UTC. That time is Julian Day 2400000.5.
A time object has type, seconds and nanoseconds fields representing a point in time starting from some epoch. This is an arbitrary point in time, not just a time of day. Although times are represented in nanoseconds, the actual resolution may be lower.
The following variables hold the possible time types. For instance
(current-time time-process) would give the current CPU process
time.
Monotonic time, meaning a monotonically increasing time starting from an unspecified epoch.
Note that in the current implementation time-monotonic is the same as time-tai, and unfortunately is therefore affected by adjustments to the system clock. Perhaps this will change in the future.
CPU time spent in the current process, starting from when the process began.
Create a time object with the given type, seconds and nanoseconds.
Get or set the type, seconds or nanoseconds fields of a time object.
set-time-type! merely changes the field, it doesn't convert the time value. For conversions, see SRFI-19 Time/Date conversions.
Return the current time of the given type. The default type is time-utc.
Note that the name current-time conflicts with the Guile core current-time function (see Time) as well as the SRFI-18 current-time function (see SRFI-18 Time). Applications wanting to use more than one of these functions will need to refer to them by different names.
Return the resolution, in nanoseconds, of the given time type. The default type is time-utc.
Return #t or #f according to the respective relation between time objects t1 and t2. t1 and t2 must be the same time type.
Return a time object of type time-duration representing the period between t1 and t2. t1 and t2 must be the same time type.
time-difference returns a new time object, time-difference! may modify t1 to form its return.
Return a time object which is time with the given duration added or subtracted. duration must be a time object of type time-duration.
add-duration and subtract-duration return a new time object. add-duration! and subtract-duration! may modify the given time to form their return.
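A brief sketch of time arithmetic with these procedures. Note that, per SRFI-19, make-time takes its arguments in the order type, nanoseconds, seconds. The names t1, t2 and delta are illustrative.

```scheme
(use-modules (srfi srfi-19))

(define t1 (make-time time-utc 0 100))                  ; 100 s after the epoch
(define delta (make-time time-duration 500000000 2))    ; 2.5 s

;; add-duration returns a fresh time object; t1 is left unchanged.
(define t2 (add-duration t1 delta))
(time-second t2)                                        ; ⇒ 102
(time-nanosecond t2)                                    ; ⇒ 500000000
(time-second t1)                                        ; ⇒ 100

(time<? t1 t2)                                          ; ⇒ #t

;; The difference of two time-utc objects is a time-duration.
(define gap (time-difference t2 t1))
(eq? (time-type gap) time-duration)                     ; ⇒ #t
(time-second gap)                                       ; ⇒ 2
```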
A date object represents a date in the Gregorian calendar and a time of day on that date in some timezone.
The fields are year, month, day, hour, minute, second, nanoseconds and timezone. A date object is immutable, its fields can be read but they cannot be modified once the object is created.
Create a new date object.
Seconds, 0 to 59, or 60 for a leap second. 60 is never seen when working entirely within UTC, it's only when converting to or from TAI.
Year, eg. 2003. Dates B.C. are negative, eg. -46 is 46 B.C. There is no year 0, year -1 is followed by year 1.
Week of the year, ignoring a first partial week. dstartw is the day of the week which is taken to start a week, 0 for Sunday, 1 for Monday, etc.
Return a date object representing the current date/time, in UTC offset by tz-offset. tz-offset is seconds east of Greenwich and defaults to the local timezone.
Convert between dates, times and days of the respective types. For instance time-tai->time-utc accepts a time object of type time-tai and returns an object of type time-utc.
The ! variants may modify their time argument to form their return. The plain functions create a new object.
For conversions to dates, tz-offset is seconds east of Greenwich. The default is the local timezone, at the given time, as provided by the system, using localtime (see Time).
On 32-bit systems, localtime is limited to a 32-bit time_t, so a default tz-offset is only available for times between Dec 1901 and Jan 2038. For prior dates an application might like to use the value in 1902, though some locations have zone changes prior to that. For future dates an application might like to assume today's rules extend indefinitely. But for correct daylight savings transitions it will be necessary to take an offset for the same day and time but a year in range and which has the same starting weekday and same leap/non-leap (to support rules like last Sunday in October).
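A few of these conversions are sketched below. The example passes a tz-offset of 0 explicitly so that the results do not depend on the local timezone. Per SRFI-19, make-date takes nanosecond, second, minute, hour, day, month, year and zone offset, in that order; the names y2k and back are illustrative.

```scheme
(use-modules (srfi srfi-19))

;; Midnight on 1 January 2000, zone offset 0 (UTC).
(define y2k (make-date 0 0 0 0 1 1 2000 0))
(time-second (date->time-utc y2k))                  ; ⇒ 946684800

;; The Modified Julian Day count starts at 1858-11-17T00:00:00Z.
(= (date->modified-julian-day
    (make-date 0 0 0 0 17 11 1858 0))
   0)                                               ; ⇒ #t

;; Convert back to a date, again with an explicit tz-offset of 0.
(define back (time-utc->date (date->time-utc y2k) 0))
(date-year back)                                    ; ⇒ 2000
(date-hour back)                                    ; ⇒ 0
```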
Convert a date to a string under the control of a format. format should be a string containing ‘~’ escapes, which will be expanded as per the following conversion table. The default format is ‘~c’, a locale-dependent date and time.
Many of these conversion characters are the same as POSIX strftime (see Time), but there are some extras and some variations.
~~  literal ~
~a  locale abbreviated weekday, eg. ‘Sun’
~A  locale full weekday, eg. ‘Sunday’
~b  locale abbreviated month, eg. ‘Jan’
~B  locale full month, eg. ‘January’
~c  locale date and time, eg. ‘Fri Jul 14 20:28:42-0400 2000’
~d  day of month, zero padded, ‘01’ to ‘31’
~e  day of month, blank padded, ‘ 1’ to ‘31’
~f  seconds and fractional seconds, with locale decimal point, eg. ‘5.2’
~h  same as ~b
~H  hour, 24-hour clock, zero padded, ‘00’ to ‘23’
~I  hour, 12-hour clock, zero padded, ‘01’ to ‘12’
~j  day of year, zero padded, ‘001’ to ‘366’
~k  hour, 24-hour clock, blank padded, ‘ 0’ to ‘23’
~l  hour, 12-hour clock, blank padded, ‘ 1’ to ‘12’
~m  month, zero padded, ‘01’ to ‘12’
~M  minute, zero padded, ‘00’ to ‘59’
~n  newline
~N  nanosecond, zero padded, ‘000000000’ to ‘999999999’
~p  locale AM or PM
~r  time, 12 hour clock, ‘~I:~M:~S ~p’
~s  number of full seconds since “the epoch” in UTC
~S  second, zero padded ‘00’ to ‘60’ (usual limit is 59, 60 is a leap second)
~t  horizontal tab character
~T  time, 24 hour clock, ‘~H:~M:~S’
~U  week of year, Sunday first day of week, ‘00’ to ‘52’
~V  week of year, Monday first day of week, ‘01’ to ‘53’
~w  day of week, 0 for Sunday, ‘0’ to ‘6’
~W  week of year, Monday first day of week, ‘00’ to ‘52’
~y  year, two digits, ‘00’ to ‘99’
~Y  year, full, eg. ‘2003’
~z  time zone, RFC-822 style
~Z  time zone symbol (not currently implemented)
~1  ISO-8601 date, ‘~Y-~m-~d’
~2  ISO-8601 time+zone, ‘~k:~M:~S~z’
~3  ISO-8601 time, ‘~k:~M:~S’
~4  ISO-8601 date/time+zone, ‘~Y-~m-~dT~k:~M:~S~z’
~5  ISO-8601 date/time, ‘~Y-~m-~dT~k:~M:~S’
Conversions ‘~D’, ‘~x’ and ‘~X’ are not currently described here, since the specification and reference implementation differ.
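A small sketch of date->string using a handful of the locale-independent conversions from the table above. The date is built with make-date (arguments: nanosecond, second, minute, hour, day, month, year, zone offset); the name d is illustrative.

```scheme
(use-modules (srfi srfi-19))

;; 2003-01-02 03:04:05 with a zone offset of 0.
(define d (make-date 0 5 4 3 2 1 2003 0))

(date->string d "~Y-~m-~d ~H:~M:~S")     ; ⇒ "2003-01-02 03:04:05"
(date->string d "~1")                    ; ⇒ "2003-01-02"
(date->string d "day ~j of ~Y")          ; ⇒ "day 002 of 2003"
(date->string d "100~~")                 ; ⇒ "100~"
```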
Conversion