This reference manual documents Guile, GNU's Ubiquitous Intelligent Language for Extensions. This is edition 1.1 corresponding to Guile 1.8.7.
Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005 Free Software Foundation.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, with the Front-Cover Texts being “A GNU Manual,” and with the Back-Cover Text “You are free to copy and modify this GNU Manual.” A copy of the license is included in the section entitled “GNU Free Documentation License”.
This reference manual documents Guile, GNU's Ubiquitous Intelligent Language for Extensions. It describes how to use Guile in many useful and interesting ways.
This is edition 1.1 of the reference manual, and corresponds to Guile version 1.8.7.
The manual is divided into five chapters.
guile program from the command-line and how to write scripts in Scheme. It also gives an introduction to the basic ideas of Scheme itself and to the various extensions that Guile offers beyond standard Scheme.
We use some conventions in this manual.
#t” or “non-#f”. This typically means that
val is returned if condition holds, and that `#f' is
returned otherwise. To clarify: val will only be
returned when condition is true.
The symbol `=>' is used to tell which value is returned by an evaluation:
(+ 1 2)
=> 3
Some procedures produce some output besides returning a value. This is denoted by the symbol `-|'.
(begin (display 1) (newline) 'hooray)
-| 1
=> hooray
As you can see, this code prints `1' (denoted by
`-|'), and returns hooray (denoted by
`=>'). Do not confuse the two.
The Guile reference and tutorial manuals were written and edited largely by Mark Galassi and Jim Blandy. In particular, Jim wrote the original tutorial on Guile's data representation and the C API for accessing Guile objects.
Significant portions were contributed by Gary Houston (contributions to POSIX system calls and networking, expect, I/O internals and extensions, slib installation, error handling) and Tim Pierce (sections on script interpreter triggers, alists, function tracing).
Tom Lord contributed a great deal of material with early Guile snapshots; although most of this text has been rewritten, all of it was important, and some of the structure remains.
Aubrey Jaffer wrote the SCM Scheme implementation and manual upon which the Guile program and manual are based. Some portions of the SCM and SLIB manuals have been included here verbatim.
Since Guile 1.4, Neil Jerram has been maintaining and improving the reference manual. Among other contributions, he wrote the Basic Ideas chapter, developed the tools for keeping the manual in sync with snarfed libguile docstrings, and reorganized the structure so as to accommodate docstrings for all Guile's primitives.
Martin Grabmueller has made substantial contributions throughout the reference manual in preparation for the Guile 1.6 release, including filling out a lot of the documentation of Scheme data types, control mechanisms and procedures. In addition, he wrote the documentation for Guile's SRFI modules and modules associated with the Guile REPL.
Guile is Free Software. Guile is copyrighted, not public domain, and there are restrictions on its distribution or redistribution, but these restrictions are designed to permit everything a cooperating person would want to do.
C code linking to the Guile library is subject to terms of that library. Basically such code may be published on any terms, provided users can re-link against a new or modified version of Guile.
C code linking to the Guile readline module is subject to the terms of that module. Basically such code must be published on Free terms.
Scheme level code written to be run by Guile (but not derived from Guile itself) is not restricted in any way, and may be published on any terms. We encourage authors to publish on Free terms.
You must be aware there is no warranty whatsoever for Guile. This is described in full in the licenses.
Guile is an interpreter for the Scheme programming language, packaged for use in a wide variety of environments. Guile implements Scheme as described in the Revised^5 Report on the Algorithmic Language Scheme (usually known as R5RS), providing clean and general data and control structures. Guile goes beyond the rather austere language presented in R5RS, extending it with a module system, full access to POSIX system calls, networking support, multiple threads, dynamic linking, a foreign function call interface, powerful string processing, and many other features needed for programming in the real world.
Like a shell, Guile can run interactively, reading expressions from the user, evaluating them, and displaying the results, or as a script interpreter, reading and executing Scheme code from a file. However, Guile is also packaged as an object library, allowing other applications to easily incorporate a complete Scheme interpreter. An application can then use Guile as an extension language, a clean and powerful configuration language, or as multi-purpose “glue”, connecting primitives provided by the application. It is easy to call Scheme code from C code and vice versa, giving the application designer full control of how and when to invoke the interpreter. Applications can add new functions, data types, control structures, and even syntax to Guile, creating a domain-specific language tailored to the task at hand, but based on a robust language design.
Guile's module system allows one to break up a large program into manageable sections with well-defined interfaces between them. Modules may contain a mixture of interpreted and compiled code; Guile can use either static or dynamic linking to incorporate compiled code. Modules also encourage developers to package up useful collections of routines for general distribution; as of this writing, one can find Emacs interfaces, database access routines, compilers, GUI toolkit interfaces, and HTTP client functions, among others.
In the future, we hope to expand Guile to support other languages like Tcl and Perl by translating them to Scheme code. This means that users can program applications which use Guile in the language of their choice, rather than having the tastes of the application's author imposed on them.
Guile can be obtained from the main GNU archive site ftp://ftp.gnu.org or any of its mirrors. The file will be named guile-version.tar.gz. The current version is 1.8.7, so the file you should grab is:
ftp://ftp.gnu.org/pub/gnu/guile-1.8.7.tar.gz
To unpack Guile, use the command
zcat guile-1.8.7.tar.gz | tar xvf -
which will create a directory called guile-1.8.7 with all the sources. You can look at the file INSTALL for detailed instructions on how to build and install Guile, but you should be able to just do
cd guile-1.8.7
./configure
make
make install
This will install the Guile executable guile, the Guile library -lguile and various associated header files and support libraries. It will also install the Guile tutorial and reference manual.
Since this manual frequently refers to the Scheme “standard”, also known as R5RS, or the “Revised^5 Report on the Algorithmic Language Scheme”, we have included the report in the Guile distribution; see Introduction. This will also be installed in your info directory.
This chapter presents a quick tour of all the ways that Guile can be used. There are additional examples in the examples/ directory in the Guile source distribution.
The following examples assume that Guile has been installed in
/usr/local/.
In its simplest form, Guile acts as an interactive interpreter for the
Scheme programming language, reading and evaluating Scheme expressions
the user enters from the terminal. Here is a sample interaction between
Guile and a user; the user's input appears after the $ and
guile> prompts:
$ guile
guile> (+ 1 2 3) ; add some numbers
6
guile> (define (factorial n)              ; define a function
         (if (zero? n) 1 (* n (factorial (- n 1)))))
guile> (factorial 20)
2432902008176640000
guile> (getpwnam "jimb")                  ; find my entry in /etc/passwd
#("jimb" ".0krIpK2VqNbU" 4008 10 "Jim Blandy" "/u/jimb"
  "/usr/local/bin/bash")
guile> C-d
$
Like AWK, Perl, or any shell, Guile can interpret script files. A Guile script is simply a file of Scheme code with some extra information at the beginning which tells the operating system how to invoke Guile, and then tells Guile how to handle the Scheme code.
Here is a trivial Guile script; for more details, see Guile Scripting.
#!/usr/local/bin/guile -s
!#
(display "Hello, world!")
(newline)
The Guile interpreter is available as an object library, to be linked into applications using Scheme as a configuration or extension language.
Here is simple-guile.c, source code for a program that will
produce a complete Guile interpreter. In addition to all usual
functions provided by Guile, it will also offer the function
my-hostname.
#include <stdlib.h>
#include <libguile.h>

static SCM
my_hostname (void)
{
  char *s = getenv ("HOSTNAME");
  if (s == NULL)
    return SCM_BOOL_F;
  else
    return scm_from_locale_string (s);
}

static void
inner_main (void *data, int argc, char **argv)
{
  scm_c_define_gsubr ("my-hostname", 0, 0, 0, my_hostname);
  scm_shell (argc, argv);
}

int
main (int argc, char **argv)
{
  scm_boot_guile (argc, argv, inner_main, 0);
  return 0; /* never reached */
}
When Guile is correctly installed on your system, the above program can be compiled and linked like this:
$ gcc -o simple-guile simple-guile.c -lguile
When it is run, it behaves just like the guile program except
that you can also call the new my-hostname function.
$ ./simple-guile
guile> (+ 1 2 3)
6
guile> (my-hostname)
"burns"
You can link Guile into your program and make Scheme available to the users of your program. You can also link your library into Guile and make its functionality available to all users of Guile.
A library that is linked into Guile is called an extension, but it is really just an ordinary object library.
The following example shows how to write a simple extension for Guile
that makes the j0 function available to Scheme code.
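#include <math.h>
#include <libguile.h>

SCM
j0_wrapper (SCM x)
{
  /* Convert the Scheme number to a C double, apply j0, and convert back.  */
  return scm_from_double (j0 (scm_to_double (x)));
}

void
init_bessel (void)
{
  /* Register j0_wrapper as the Scheme procedure `j0' (one required argument).  */
  scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
}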
This C source file needs to be compiled into a shared library. Here is how to do it on GNU/Linux:
gcc -shared -o libguile-bessel.so -fPIC bessel.c
For creating shared libraries portably, we recommend the use of GNU Libtool (see Introduction).
A shared library can be loaded into a running Guile process with the
function load-extension. The j0 function is then immediately
available:
$ guile
guile> (load-extension "./libguile-bessel" "init_bessel")
guile> (j0 2)
0.223890779141236
Guile has support for dividing a program into modules. By using modules, you can group related code together and manage the composition of complete programs from largely independent parts.
(Although the module system implementation is in flux, feel free to use it anyway. Guile will provide reasonable backwards compatibility.)
Details on the module system beyond this introductory material can be found in Modules.
Guile comes with a lot of useful modules, for example for string processing or command line parsing. Additionally, many Guile modules written by other Guile hackers are available, but these have to be installed manually.
Here is a sample interactive session that shows how to use the
(ice-9 popen) module, which provides the means for communicating
with other processes over pipes, together with the (ice-9
rdelim) module, which provides the function read-line.
$ guile
guile> (use-modules (ice-9 popen))
guile> (use-modules (ice-9 rdelim))
guile> (define p (open-input-pipe "ls -l"))
guile> (read-line p)
"total 30"
guile> (read-line p)
"drwxr-sr-x 2 mgrabmue mgrabmue 1024 Mar 29 19:57 CVS"
You can create new modules using the syntactic form
define-module. All definitions following this form until the
next define-module are placed into the new module.
One module is usually placed into one file, and that file is installed in a location where Guile can automatically find it. The following session shows a simple example.
$ cat /usr/local/share/guile/foo/bar.scm
(define-module (foo bar))
(export frob)
(define (frob x) (* 2 x))
$ guile
guile> (use-modules (foo bar))
guile> (frob 12)
24
In addition to Scheme code you can also put things that are defined in C into a module.
You do this by writing a small Scheme file that defines the module and
calls load-extension directly in the body of the module.
$ cat /usr/local/share/guile/math/bessel.scm
(define-module (math bessel))
(export j0)
(load-extension "libguile-bessel" "init_bessel")
$ file /usr/local/lib/libguile-bessel.so
... ELF 32-bit LSB shared object ...
$ guile
guile> (use-modules (math bessel))
guile> (j0 2)
0.223890779141236
There is also a way to manipulate the module system from C but only Scheme files can be autoloaded. Thus, we recommend that you define your modules in Scheme.
From time to time functions and other features of Guile become obsolete. Guile has some mechanisms in place that can help you cope with this.
Guile has two levels of obsoleteness: things can be deprecated, meaning that their use is considered harmful and should be avoided, even in old code; or they can be merely discouraged, meaning that they are fine in and of themselves, but that there are better alternatives that should be used in new code.
When you use a feature that is deprecated, you will likely get a warning message at run-time. Also, deprecated features are not ready for production use: they might be very slow. When something is merely discouraged, it performs normally and you won't get any messages at run-time.
The primary source for information about just what things are discouraged or deprecated in a given release is the file NEWS. That file also documents what you should use instead of the obsoleted things.
The file README contains instructions on how to control the inclusion or removal of the deprecated and/or discouraged features from the public API of Guile, and how to control the warning messages for deprecated features.
The idea behind those mechanisms is that normally all deprecated and discouraged features are available, but that you can omit them on purpose to check whether your code still relies on them.
Any problems with the installation should be reported to bug-guile@gnu.org.
Whenever you have found a bug in Guile you are encouraged to report it to the Guile developers, so they can fix it. They may also be able to suggest workarounds when it is not possible for you to apply the bug-fix or install a new version of Guile yourself.
Before sending in bug reports, please check with the following list that you really have found a bug.
When you write a bug report, please make sure to include as much of the information described below as you can. If you can't figure out some of the items, it is not a problem, but the more information we get, the more likely we can diagnose and fix the bug.
You can get the version number by invoking the command
$ guile --version
Guile 1.4.1
Copyright (c) 1995, 1996, 1997, 2000, 2006 Free Software Foundation
Guile may be distributed under the terms of the GNU General Public License;
certain other uses are permitted as well. For details, see the file
`COPYING', which is included in the Guile distribution.
There is no warranty, to the extent permitted by law.
$ uname -a
Linux tortoise 2.2.17 #1 Thu Dec 21 17:29:05 CET 2000 i586 unknown
guile-config info.
Be precise about these changes. A description in English is not enough—send a context diff for them.
Adding files of your own, or porting to another machine, is a modification of the source.
If you can tell us a way to cause the problem without loading any source files, please do so. This makes it much easier to debug. If you do need files, make sure you arrange for us to see their exact contents.
Of course, if the bug is that Guile gets a fatal signal, then one can't miss it. But if the bug is incorrect results, the maintainer might fail to notice what is wrong. Why leave it to chance?
If the manifestation of the bug is a Guile error message, it is important to report the precise text of the error message, and a backtrace showing how the Scheme program arrived at the error.
This can be done using the procedure backtrace in the REPL.
-q
switch to prevent loading the init file). If the problem does
not occur then, you must report the precise contents of any
programs that you must load into Guile in order to cause the problem to
occur.
The line numbers in the development sources might not match those in your sources. It would take extra work for the maintainers to determine what code is in your version at a given line number, and we could not be certain.
gdb guile or gdb .libs/guile (if using GNU Libtool).
However, you need to think when you collect the additional information if you want it to show what causes the bug.
For example, many people send just a backtrace, but that is not very useful by itself. A simple backtrace with arguments often conveys little about what is happening inside Guile, because most of the arguments listed in the backtrace are pointers to Scheme objects. The numeric values of these pointers have no significance whatever; all that matters is the contents of the objects they point to (and most of the contents are themselves pointers).
Guile's core language is Scheme, and an awful lot can be achieved simply by using Guile to write and run Scheme programs. In this part of the manual, we explain how to use Guile in this mode, and describe the tools that Guile provides to help you with script writing, debugging and packaging your programs for distribution.
For readers who are not yet familiar with the Scheme language, this part includes a chapter that presents the basic concepts of the language, and gives references to freely available Scheme tutorial material on the web.
For detailed reference information on the variables, functions etc. that make up Guile's application programming interface (API), see API Reference.
In this chapter, we introduce the basic concepts that underpin the elegance and power of the Scheme language.
Readers who already possess a background knowledge of Scheme may happily skip this chapter. For the reader who is new to the language, however, the following discussions on data, procedures, expressions and closure are designed to provide a minimum level of Scheme understanding that is more or less assumed by the reference chapters that follow.
The style of this introductory material aims about halfway between the terse precision of R5RS and the discursive randomness of a Scheme tutorial.
This section discusses the representation of data types and values, what it means for Scheme to be a latently typed language, and the role of variables. We conclude by introducing the Scheme syntaxes for defining a new variable, and for changing the value of an existing variable.
The term latent typing is used to describe a computer language, such as Scheme, for which you cannot, in general, simply look at a program's source code and determine what type of data will be associated with a particular variable, or with the result of a particular expression.
Sometimes, of course, you can tell from the code what the type of
an expression will be. If you have a line in your program that sets the
variable x to the numeric value 1, you can be certain that,
immediately after that line has executed (and in the absence of multiple
threads), x has the numeric value 1. Or if you write a procedure
that is designed to concatenate two strings, it is likely that the rest
of your application will always invoke this procedure with two string
parameters, and quite probable that the procedure would go wrong in some
way if it was ever invoked with parameters that were not both strings.
Nevertheless, the point is that there is nothing in Scheme which
requires the procedure parameters always to be strings, or x
always to hold a numeric value, and there is no way of declaring in your
program that such constraints should always be obeyed. In the same
vein, there is no way to declare the expected type of a procedure's
return value.
Instead, the types of variables and expressions are only known – in general – at run time. If you need to check at some point that a value has the expected type, Scheme provides run time procedures that you can invoke to do so. But equally, it can be perfectly valid for two separate invocations of the same procedure to specify arguments with different types, and to return values with different types.
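As a minimal sketch of such run time checks, the following uses the standard R5RS type predicates number? and string?. The procedure name describe is hypothetical and chosen only for this illustration; it is not part of Guile.

```scheme
;; `describe' is a hypothetical helper, not a Guile primitive.
;; It accepts a value of any type and inspects that type at run time.
(define (describe value)
  (cond ((number? value) "a number")
        ((string? value) "a string")
        (else "something else")))

(describe 42)        ; => "a number"
(describe "forty")   ; => "a string"
(describe '(4 2))    ; => "something else"
```

The same procedure is invoked with three differently typed arguments, and nothing in its definition declares which types are acceptable.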
The next subsection explains what this means in practice, for the ways that Scheme programs use data types, values and variables.
Scheme provides many data types that you can use to represent your data. Primitive types include characters, strings, numbers and procedures. Compound types, which allow a group of primitive and compound values to be stored together, include lists, pairs, vectors and multi-dimensional arrays. In addition, Guile allows applications to define their own data types, with the same status as the built-in standard Scheme types.
As a Scheme program runs, values of all types pop in and out of existence. Sometimes values are stored in variables, but more commonly they pass seamlessly from being the result of one computation to being one of the parameters for the next.
Consider an example. A string value is created because the interpreter reads in a literal string from your program's source code. Then a numeric value is created as the result of calculating the length of the string. A second numeric value is created by doubling the calculated length. Finally the program creates a list with two elements – the doubled length and the original string itself – and stores this list in a program variable.
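The example just described might be sketched as follows. The names summary and text, and the literal string "hello", are assumptions made for this illustration.

```scheme
;; `summary' and `text' are names invented for this sketch.
(define summary
  (let ((text "hello"))                  ; a string value read from the source
    (list (* 2 (string-length text))     ; length 5, doubled to give 10
          text)))                        ; the original string itself

summary   ; => (10 "hello")
```

Only the final list is stored in a program variable; the two intermediate numeric values pass directly from one computation to the next.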
All of the values involved here – in fact, all values in Scheme – carry their type with them. In other words, every value “knows,” at runtime, what kind of value it is. A number, a string, a list, whatever.
A variable, on the other hand, has no fixed type. A variable –
x, say – is simply the name of a location – a box – in which
you can store any kind of Scheme value. So the same variable in a
program may hold a number at one moment, a list of procedures the next,
and later a pair of strings. The “type” of a variable – insofar as
the idea is meaningful at all – is simply the type of whatever value
the variable happens to be storing at a particular moment.
To define a new variable, you use Scheme's define syntax like
this:
(define variable-name value)
This makes a new variable called variable-name and stores value in it as the variable's initial value. For example:
;; Make a variable `x' with initial numeric value 1.
(define x 1)
;; Make a variable `organization' with an initial string value.
(define organization "Free Software Foundation")
(In Scheme, a semicolon marks the beginning of a comment that continues
until the end of the line. So the lines beginning ;; are
comments.)
Changing the value of an already existing variable is very similar,
except that define is replaced by the Scheme syntax set!,
like this:
(set! variable-name new-value)
Remember that variables do not have fixed types, so new-value may have a completely different type from whatever was previously stored in the location named by variable-name. Both of the following examples are therefore correct.
;; Change the value of `x' to 5.
(set! x 5)
;; Change the value of `organization' to the FSF's street number.
(set! organization 545)
In these examples, value and new-value are literal numeric
or string values. In general, however, value and new-value
can be any Scheme expression. Even though we have not yet covered the
forms that Scheme expressions can take (see About Expressions), you
can probably guess what the following set! example does...
(set! x (+ x 1))
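Putting the earlier definition and assignments of x together gives the following short sketch, which can be typed at the Guile prompt to confirm the guess.

```scheme
(define x 1)        ; x holds the numeric value 1
(set! x 5)          ; x now holds 5
(set! x (+ x 1))    ; (+ x 1) is evaluated first, giving 6, which is then stored
x                   ; => 6
```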
(Note: this is not a complete description of define and
set!, because we need to introduce some other aspects of Scheme
before the missing pieces can be filled in. If, however, you are
already familiar with the structure of Scheme, you may like to read
about those missing pieces immediately by jumping ahead to the following
references.
define syntax that can be used when defining new procedures.
set! syntax that helps with changing a single value in the depths
of a compound data structure.)
define other
than at top level in a Scheme program, including a discussion of when it
works to use define rather than set! to change the value
of an existing variable.
This section introduces the basics of using and creating Scheme
procedures. It discusses the representation of procedures as just
another kind of Scheme value, and shows how procedure invocation
expressions are constructed. We then explain how lambda is used
to create new procedures, and conclude by presenting the various
shorthand forms of define that can be used instead of writing an
explicit lambda expression.
One of the great simplifications of Scheme is that a procedure is just
another type of value, and that procedure values can be passed around
and stored in variables in exactly the same way as, for example, strings
and lists. When we talk about a built-in standard Scheme procedure such
as open-input-file, what we actually mean is that there is a
pre-defined top level variable called open-input-file, whose
value is a procedure that implements what R5RS says that
open-input-file should do.
Note that this is quite different from many dialects of Lisp — including Emacs Lisp — in which a program can use the same name with two quite separate meanings: one meaning identifies a Lisp function, while the other meaning identifies a Lisp variable, whose value need have nothing to do with the function that is associated with the first meaning. In these dialects, functions and variables are said to live in different namespaces.
In Scheme, on the other hand, all names belong to a single unified namespace, and the variables that these names identify can hold any kind of Scheme value, including procedure values.
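A small sketch may make the single namespace more concrete. The variable name plus is hypothetical; +, map and string-length are standard R5RS procedures.

```scheme
;; `plus' is a name invented for this sketch.
(define plus +)                          ; store the procedure value of `+' in a new variable
(plus 1 2)                               ; => 3
(procedure? plus)                        ; => #t

;; The value of the variable `string-length' is passed to `map'
;; just like any other argument.
(map string-length '("a" "bc" "def"))    ; => (1 2 3)
```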
One consequence of the “procedures as values” idea is that, if you don't happen to like the standard name for a Scheme procedure, you can change it.
For example, call-with-current-continuation is a very important
standard Scheme procedure, but it also has a very long name! So, many
programmers use the following definition to assign the same procedure
value to the more convenient name call/cc.
(define call/cc call-with-current-continuation)
Let's understand exactly how this works. The definition creates a new
variable call/cc, and then sets its value to the value of the
variable call-with-current-continuation; the latter value is a
procedure that implements the behaviour that R5RS specifies under the
name “call-with-current-continuation”. So call/cc ends up
holding this value as well.
Now that call/cc holds the required procedure value, you could
choose to use call-with-current-continuation for a completely
different purpose, or just change its value so that you will get an
error if you accidentally use call-with-current-continuation as a
procedure in your program rather than call/cc. For example:
(set! call-with-current-continuation "Not a procedure any more!")
Or you could just leave call-with-current-continuation as it was.
It's perfectly fine for more than one variable to hold the same
procedure value.
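As a brief check of this, assuming call-with-current-continuation has been left unchanged, the following sketch shows that both variables hold the same procedure value and can be used interchangeably. The invocation with procedure? as the argument is the example given in R5RS.

```scheme
(define call/cc call-with-current-continuation)

;; Both variables hold the very same procedure value.
(eq? call/cc call-with-current-continuation)   ; => #t

;; Either name can therefore be used to invoke it.
(call-with-current-continuation procedure?)    ; => #t
(call/cc procedure?)                           ; => #t
```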
A procedure invocation in Scheme is written like this:
(procedure [arg1 [arg2 ...]])
In this expression, procedure can be any Scheme expression whose value is a procedure. Most commonly, however, procedure is simply the name of a variable whose value is a procedure.
For example, string-append is a standard Scheme procedure whose
behaviour is to concatenate together all the arguments, which are
expected to be strings, that it is given. So the expression
(string-append "/home" "/" "andrew")
is a procedure invocation whose result is the string value
"/home/andrew".
Similarly, string-length is a standard Scheme procedure that
returns the length of a single string argument, so
(string-length "abc")
is a procedure invocation whose result is the numeric value 3.
Each of the parameters in a procedure invocation can itself be any Scheme expression. Since a procedure invocation is itself a type of expression, we can put these two examples together to get
(string-length (string-append "/home" "/" "andrew"))
— a procedure invocation whose result is the numeric value 12.
(You may be wondering what happens if the two examples are combined the other way round. If we do this, we can make a procedure invocation expression that is syntactically correct:
(string-append "/home" (string-length "abc"))
but when this expression is executed, it will cause an error, because
the result of (string-length "abc") is a numeric value, and
string-append is not designed to accept a numeric value as one of
its arguments.)
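If the intention were to append the length as text (an assumption about what such a program might want), one possible repair is to convert the number with the standard procedure number->string first:

```scheme
;; Convert the numeric result to a string before appending it.
(string-append "/home" (number->string (string-length "abc")))   ; => "/home3"
```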
Scheme has lots of standard procedures, and Guile provides all of these via predefined top level variables. All of these standard procedures are documented in the later chapters of this reference manual.
Before very long, though, you will want to create new procedures that
encapsulate aspects of your own applications' functionality. To do
this, you can use the famous lambda syntax.
For example, the value of the following Scheme expression
(lambda (name address) expression ...)
is a newly created procedure that takes two arguments:
name and address. The behaviour of the
new procedure is determined by the sequence of expressions in the
body of the procedure definition. (Typically, these
expressions would use the arguments in some way, or else there
wouldn't be any point in giving them to the procedure.) When invoked,
the new procedure returns a value that is the value of the last
expression in the procedure body.
To make things more concrete, let's suppose that the two arguments are both strings, and that the purpose of this procedure is to form a combined string that includes these arguments. Then the full lambda expression might look like this:
(lambda (name address)
  (string-append "Name=" name ":Address=" address))
We noted in the previous subsection that the procedure part of a procedure invocation expression can be any Scheme expression whose value is a procedure. But that's exactly what a lambda expression is! So we can use a lambda expression directly in a procedure invocation, like this:
((lambda (name address)
   (string-append "Name=" name ":Address=" address))
 "FSF"
 "Cambridge")
This is a valid procedure invocation expression, and its result is the
string "Name=FSF:Address=Cambridge".
It is more common, though, to store the procedure value in a variable:
(define make-combined-string
  (lambda (name address)
    (string-append "Name=" name ":Address=" address)))
and then to use the variable name in the procedure invocation:
(make-combined-string "FSF" "Cambridge")
which has exactly the same result.
It's important to note that procedures created using lambda have
exactly the same status as the standard built-in Scheme procedures, and
can be invoked, passed around, and stored in variables in exactly the
same ways.
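The following sketch illustrates this equal status by handing lambda-created procedures to the standard procedures map and apply. The squaring lambda is an example invented here; make-combined-string is repeated so that the sketch is self-contained.

```scheme
(define make-combined-string
  (lambda (name address)
    (string-append "Name=" name ":Address=" address)))

;; An anonymous procedure passed directly as an argument to `map'.
(map (lambda (n) (* n n)) '(1 2 3))                 ; => (1 4 9)

;; A stored procedure passed to `apply', which invokes it
;; with the elements of the list as its arguments.
(apply make-combined-string '("FSF" "Cambridge"))   ; => "Name=FSF:Address=Cambridge"
```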
Since it is so common in Scheme programs to want to create a procedure
and then store it in a variable, there is an alternative form of the
define syntax that allows you to do just that.
A define expression of the form
(define (name [arg1 [arg2 ...]])
  expression ...)
is exactly equivalent to the longer form
(define name
  (lambda ([arg1 [arg2 ...]])
    expression ...))
So, for example, the definition of make-combined-string in the
previous subsection could equally be written:
(define (make-combined-string name address)
  (string-append "Name=" name ":Address=" address))
This kind of procedure definition creates a procedure that requires
exactly the expected number of arguments. There are two further forms
of the lambda expression, which create a procedure that can
accept a variable number of arguments:
(lambda (arg1 ... . args) expression ...)
(lambda args expression ...)
The corresponding forms of the alternative define syntax are:
(define (name arg1 ... . args) expression ...)
(define (name . args) expression ...)
For details on how these forms work, see Lambda.
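As a brief illustration of the variable-argument forms, here is a minimal sketch. The procedure names sum-at-least-one and count-args are made up for this example and are not part of Guile.

```scheme
;; At least one argument is required; any further arguments are
;; collected into the list `rest'.
(define (sum-at-least-one first . rest)
  (apply + first rest))

;; Any number of arguments, all collected into the list `args'.
(define (count-args . args)
  (length args))

(sum-at-least-one 1)        ; => 1
(sum-at-least-one 1 2 3)    ; => 6
(count-args)                ; => 0
(count-args 'a 'b 'c)       ; => 3
```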
(It could be argued that the alternative define forms are rather
confusing, especially for newcomers to the Scheme language, as they hide
both the role of lambda and the fact that procedures are values
that are stored in variables in the same way as any other kind of value.
On the other hand, they are very convenient, and they are also a good
example of another of Scheme's powerful features: the ability to specify
arbitrary syntactic transformations at run time, which can be applied to
subsequently read input.)
So far, we have met expressions that do things, such as the
define expressions that create and initialize new variables, and
we have also talked about expressions that have values, for
example the value of the procedure invocation expression:
(string-append "/home" "/" "andrew")
but we haven't yet been precise about what causes an expression like this procedure invocation to be reduced to its “value”, or how the processing of such expressions relates to the execution of a Scheme program as a whole.
This section clarifies what we mean by an expression's value, by introducing the idea of evaluation. It discusses the side effects that evaluation can have, explains how each of the various types of Scheme expression is evaluated, and describes the behaviour and use of the Guile REPL as a mechanism for exploring evaluation. The section concludes with a very brief summary of Scheme's common syntactic expressions.
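In Scheme, the process of executing an expression is known as evaluation. Evaluation has two kinds of result: the value of the evaluated expression, and the side effects of the evaluation, which consist of any effects of evaluating the expression that are not represented by the value.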
Of the expressions that we have met so far, define and
set! expressions have side effects — the creation or
modification of a variable — but no value; lambda expressions
have values — the newly constructed procedures — but no side
effects; and procedure invocation expressions, in general, have either
values, or side effects, or both.
It is tempting to try to define more intuitively what we mean by “value” and “side effects”, and what the difference between them is. In general, though, this is extremely difficult. It is also unnecessary; instead, we can quite happily define the behaviour of a Scheme program by specifying how Scheme executes a program as a whole, and then by describing the value and side effects of evaluation for each type of expression individually.
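So, some definitions...
A Scheme program consists of a sequence of expressions.
A Scheme interpreter executes the program by evaluating these expressions in order, one by one.
An expression can be
a piece of literal data, such as a number 2.3 or a string "Hello world!"
a variable name
a procedure invocation expression
one of Scheme's special syntactic expressions.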
The following subsections describe how each of these types of expression is evaluated.
When a literal data expression is evaluated, the value of the expression is simply the value that the expression describes. The evaluation of a literal data expression has no side effects.
So, for example,
the value of the expression "abc" is the string value "abc"
the value of the expression 3+4i is the complex number 3 + 4i
the value of the expression #(1 2 3) is a three-element vector containing the numeric values 1, 2 and 3.
For any data type which can be expressed literally like this, the syntax of the literal data expression for that data type — in other words, what you need to write in your code to indicate a literal value of that type — is known as the data type's read syntax. This manual specifies the read syntax for each such data type in the section that describes that data type.
Some data types do not have a read syntax. Procedures, for example,
cannot be expressed as literal data; they must be created using a
lambda expression (see Creating a Procedure) or implicitly
using the shorthand form of define (see Lambda Alternatives).
When an expression that consists simply of a variable name is evaluated, the value of the expression is the value of the named variable. The evaluation of a variable reference expression has no side effects.
So, after
(define key "Paul Evans")
the value of the expression key is the string value "Paul
Evans". If key is then modified by
(set! key 3.74)
the value of the expression key is the numeric value 3.74.
If there is no variable with the specified name, evaluation of the variable reference expression signals an error.
This is where evaluation starts getting interesting! As already noted, a procedure invocation expression has the form
(procedure [arg1 [arg2 ...]])
where procedure must be an expression whose value, when evaluated, is a procedure.
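The evaluation of a procedure invocation expression like this proceeds by first evaluating individually the expressions procedure, arg1, arg2, and so on, and then calling the procedure that is the value of the procedure expression, with the list of values obtained from evaluating arg1, arg2, and so on as its parameters.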
For a procedure defined in Scheme, “calling the procedure with the list of values as its parameters” means binding the values to the procedure's formal parameters and then evaluating the sequence of expressions that make up the body of the procedure definition. The value of the procedure invocation expression is the value of the last evaluated expression in the procedure body. The side effects of calling the procedure are the combination of the side effects of the sequence of evaluations of expressions in the procedure body.
For a built-in procedure, the value and side-effects of calling the procedure are best described by that procedure's documentation.
Note that the complete side effects of evaluating a procedure invocation expression consist not only of the side effects of the procedure call, but also of any side effects of the preceding evaluation of the expressions procedure, arg1, arg2, and so on.
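The point about side effects of argument evaluation can be sketched as follows. The variable trace and the procedure note! are hypothetical helpers invented for this example. Each argument expression records a tag as a side effect before the addition is performed.

```scheme
;; A list that records which argument expressions have been evaluated.
(define trace '())

;; Record `tag' as a side effect, then return `value'.
(define (note! tag value)
  (set! trace (cons tag trace))
  value)

;; Both argument expressions are evaluated (each with a side effect)
;; before + is called.  The order of the two evaluations is unspecified.
(define total (+ (note! 'first 1) (note! 'second 2)))

total            ; => 3
(length trace)   ; => 2
```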
To illustrate this, let's look again at the procedure invocation expression:
(string-length (string-append "/home" "/" "andrew"))
In the outermost expression, procedure is string-length and
arg1 is (string-append "/home" "/" "andrew").
Evaluation of string-length, which is a variable, gives a
procedure value that implements the expected behaviour for
“string-length”.
Evaluation of (string-append "/home" "/" "andrew"), which is
another procedure invocation expression, means evaluating each of
string-append, which gives a procedure value that implements the
expected behaviour for “string-append”
"/home", which gives the string value "/home"
"/", which gives the string value "/"
"andrew", which gives the string value "andrew"
and then invoking the procedure value with this list of string values as
its arguments. The resulting value is a single string value that is the
concatenation of all the arguments, namely "/home/andrew".
In the evaluation of the outermost expression, the interpreter can now invoke the procedure value obtained from procedure with the value obtained from arg1 as its argument. The resulting value is a numeric value that is the length of the argument string, which is 12.
When a procedure invocation expression is evaluated, the procedure and all the argument expressions must be evaluated before the procedure can be invoked. Special syntactic expressions are special because they are able to manipulate their arguments in an unevaluated form, and can choose whether to evaluate any or all of the argument expressions.
Why is this needed? Consider a program fragment that asks the user whether or not to delete a file, and then deletes the file if the user answers yes.
(if (string=? (read-answer "Should I delete this file?")
              "yes")
    (delete-file file))
If the outermost (if ...) expression here were a procedure
invocation expression, the expression (delete-file file), whose
side effect is to actually delete a file, would already have been
evaluated before the if procedure even got invoked! Clearly this
is no use — the whole point of an if expression is that the
consequent expression is only evaluated if the condition of the
if expression is “true”.
Therefore if must be special syntax, not a procedure. Other
special syntaxes that we have already met are define, set!
and lambda. define and set! are syntax because
they need to know the variable name that is given as the first
argument in a define or set! expression, not that
variable's value. lambda is syntax because it does not
immediately evaluate the expressions that define the procedure body;
instead it creates a procedure object that incorporates these
expressions so that they can be evaluated in the future, when that
procedure is invoked.
The rules for evaluating each special syntactic expression are specified individually for each special syntax. For a summary of standard special syntax, see Syntax Summary.
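To see why if cannot be an ordinary procedure, consider this minimal sketch. The names evaluated, note! and my-if are hypothetical and exist only for this illustration. Because my-if is a procedure, both of its branch arguments are evaluated before it is called, whereas the real if evaluates only the selected branch.

```scheme
;; Records which branch expressions were actually evaluated.
(define evaluated '())

(define (note! tag)
  (set! evaluated (cons tag evaluated))
  tag)

;; A procedure that tries to behave like `if'.
(define (my-if condition consequent alternative)
  (if condition consequent alternative))

(define procedure-result (my-if #t (note! 'yes) (note! 'no)))
(define procedure-count (length evaluated))   ; both branches ran

(set! evaluated '())
(define syntax-result (if #t (note! 'yes) (note! 'no)))
(define syntax-count (length evaluated))      ; only one branch ran

procedure-result   ; => yes
procedure-count    ; => 2
syntax-result      ; => yes
syntax-count       ; => 1
```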
Scheme is “properly tail recursive”, meaning that tail calls or recursions from certain contexts do not consume stack space or other resources and can therefore be used on arbitrarily large data or for an arbitrarily long calculation. Consider for example,
(define (foo n)
  (display n)
  (newline)
  (foo (1+ n)))
(foo 1)
-|
1
2
3
...
foo prints numbers infinitely, starting from the given n.
It's implemented by printing n then recursing to itself to print
n+1 and so on. This recursion is a tail call: it's the
last thing done, and in Scheme such tail calls can be made without
limit.
Or consider a case where a value is returned, a version of the SRFI-1
last function (see SRFI-1 Selectors) returning the last
element of a list,
(define (my-last lst)
  (if (null? (cdr lst))
      (car lst)
      (my-last (cdr lst))))
(my-last '(1 2 3)) => 3
If the list has more than one element, my-last applies itself
to the cdr. This recursion is a tail call: there's no code
after it, and the return value is the return value from that call. In
Scheme this can be used on an arbitrarily long list argument.
A proper tail call is only available from certain contexts, namely the following special form positions,
and — last expression
begin — last expression
case — last expression in each clause
cond — last expression in each clause, and the call to a
=> procedure is a tail call
do — last result expression
if — “true” and “false” leg expressions
lambda — last expression in body
let, let*, letrec, let-syntax,
letrec-syntax — last expression in body
or — last expression
The following core functions make tail calls,
apply — tail call to given procedure
call-with-current-continuation — tail call to the procedure
receiving the new continuation
call-with-values — tail call to the values-receiving
procedure
eval — tail call to evaluate the form
string-any, string-every — tail call to predicate on
the last character (if that point is reached)
The above are just core functions and special forms. Tail calls in other modules are described with the relevant documentation, for example SRFI-1
any and every (see SRFI-1 Searching).
It will be noted there are a lot of places which could potentially be
tail calls, for instance the last call in a for-each, but only
those explicitly described are guaranteed.
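The difference between a tail call and a non-tail call can be sketched with two ways of summing a list. The procedure names below are invented for this example. In the first version the recursive call is an argument to +, so work remains after it returns and it is not a tail call. In the second, the recursive call is the last expression in an if leg, so it is a proper tail call.

```scheme
;; Not a tail call: the addition happens after the recursive call
;; returns, so each level of recursion must be remembered.
(define (sum-list-non-tail lst)
  (if (null? lst)
      0
      (+ (car lst) (sum-list-non-tail (cdr lst)))))

;; Tail call: the running total is passed along as an argument, and
;; the recursive call is the last thing done in the "false" leg.
(define (sum-list-helper lst total)
  (if (null? lst)
      total
      (sum-list-helper (cdr lst) (+ total (car lst)))))

(define (sum-list lst)
  (sum-list-helper lst 0))

(sum-list-non-tail '(1 2 3 4))   ; => 10
(sum-list '(1 2 3 4))            ; => 10
```

Both versions compute the same value; only the second can be relied upon to handle an arbitrarily long list without consuming stack space.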
If you start Guile without specifying a particular program for it to execute, Guile enters its standard Read Evaluate Print Loop — or REPL for short. In this mode, Guile repeatedly reads in the next Scheme expression that the user types, evaluates it, and prints the resulting value.
The REPL is a useful mechanism for exploring the evaluation behaviour
described in the previous subsection. If you type string-append,
for example, the REPL replies #<primitive-procedure
string-append>, illustrating the relationship between the variable
string-append and the procedure value stored in that variable.
In this manual, the notation => is used to mean “evaluates to”. Wherever you see an example of the form
expression
=>
result
feel free to try it out yourself by typing expression into the REPL and checking that it gives the expected result.
This subsection lists the most commonly used Scheme syntactic expressions, simply so that you will recognize common special syntax when you see it. For a full description of each of these syntaxes, follow the appropriate reference.
lambda (see Lambda) is used to construct procedure objects.
define (see Top Level) is used to create a new variable and
set its initial value.
set! (see Top Level) is used to modify an existing variable's
value.
let, let* and letrec (see Local Bindings)
create an inner lexical environment for the evaluation of a sequence of
expressions, in which a specified set of local variables is bound to the
values of a corresponding set of expressions. For an introduction to
environments, see About Closure.
begin (see begin) executes a sequence of expressions in order
and returns the value of the last expression. Note that this is not the
same as a procedure which returns its last argument, because the
evaluation of a procedure invocation expression does not guarantee to
evaluate the arguments in order.
if and cond (see if cond case) provide conditional
evaluation of argument expressions depending on whether one or more
conditions evaluate to “true” or “false”.
case (see if cond case) provides conditional evaluation of
argument expressions depending on whether the value of a key expression
is one of a specified group of values.
and (see and or) executes a sequence of expressions in order
until either there are no expressions left, or one of them evaluates to
“false”.
or (see and or) executes a sequence of expressions in order
until either there are no expressions left, or one of them evaluates to
“true”.
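A few small evaluations may help to fix the behaviour of begin, and and or in mind. This is only an illustrative sketch; the variable names are made up for the example. Note that and and or stop as soon as the result is decided, so the calls to error below are never evaluated.

```scheme
;; begin returns the value of its last expression.
(define begin-value (begin 'first 'second 'done))        ; => done

;; and returns the first false value, or else the last value.
(define and-all-true (and 1 2 3))                        ; => 3
(define and-stops (and 1 #f (error "never evaluated")))  ; => #f

;; or returns the first true value, or else #f.
(define or-first-true (or #f 2 (error "never evaluated"))) ; => 2
(define or-all-false (or #f #f))                         ; => #f
```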
The concept of closure is the idea that a lambda expression “captures” the variable bindings that are in lexical scope at the point where the lambda expression occurs. The procedure created by the lambda expression can refer to and mutate the captured bindings, and the values of those bindings persist between procedure calls.
This section explains and explores the various parts of this idea in more detail.
We said earlier that a variable name in a Scheme program is associated with a location in which any kind of Scheme value may be stored. (Incidentally, the term “vcell” is often used in Lisp and Scheme circles as an alternative to “location”.) Thus part of what we mean when we talk about “creating a variable” is in fact establishing an association between a name, or identifier, that is used by the Scheme program code, and the variable location to which that name refers. Although the value that is stored in that location may change, the location to which a given name refers is always the same.
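We can illustrate this by breaking down the operation of the
define syntax into three parts: define
creates a new location
establishes an association between that location and the name specified as the first argument of the define expression
stores in that location the value obtained by evaluating the second argument of the define expression.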
A collection of associations between names and locations is called an
environment. When you create a top level variable in a program
using define, the name-location association for that variable is
added to the “top level” environment. The “top level” environment
also includes name-location associations for all the procedures that are
supplied by standard Scheme.
It is also possible to create environments other than the top level one, and to create variable bindings, or name-location associations, in those environments. This ability is a key ingredient in the concept of closure; the next subsection shows how it is done.
We have seen how to create top level variables using the define
syntax (see Definition). It is often useful to create variables
that are more limited in their scope, typically as part of a procedure
body. In Scheme, this is done using the let syntax, or one of
its modified forms let* and letrec. These syntaxes are
described in full later in the manual (see Local Bindings). Here
our purpose is to illustrate their use just enough that we can see how
local variables work.
For example, the following code uses a local variable s to
simplify the computation of the area of a triangle given the lengths of
its three sides.
(define a 5.3)
(define b 4.7)
(define c 2.8)
(define area
  (let ((s (/ (+ a b c) 2)))
    (sqrt (* s (- s a) (- s b) (- s c)))))
The effect of the let expression is to create a new environment
and, within this environment, an association between the name s
and a new location whose initial value is obtained by evaluating
(/ (+ a b c) 2). The expressions in the body of the let,
namely (sqrt (* s (- s a) (- s b) (- s c))), are then evaluated
in the context of the new environment, and the value of the last
expression evaluated becomes the value of the whole let
expression, and therefore the value of the variable area.
In the example of the previous subsection, we glossed over an important
point. The body of the let expression in that example refers not
only to the local variable s, but also to the top level variables
a, b, c and sqrt. (sqrt is the
standard Scheme procedure for calculating a square root.) If the body
of the let expression is evaluated in the context of the
local let environment, how does the evaluation get at the
values of these top level variables?
The answer is that the local environment created by a let
expression automatically has a reference to its containing environment
— in this case the top level environment — and that the Scheme
interpreter automatically looks for a variable binding in the containing
environment if it doesn't find one in the local environment. More
generally, every environment except for the top level one has a
reference to its containing environment, and the interpreter keeps
searching back up the chain of environments — from most local to top
level — until it either finds a variable binding for the required
identifier or exhausts the chain.
This description also determines what happens when there is more than
one variable binding with the same name. Suppose, continuing the
example of the previous subsection, that there was also a pre-existing
top level variable s created by the expression:
(define s "Some beans, my lord!")
Then both the top level environment and the local let environment
would contain bindings for the name s. When evaluating code
within the let body, the interpreter looks first in the local
let environment, and so finds the binding for s created by
the let syntax. Even though this environment has a reference to
the top level environment, which also has a binding for s, the
interpreter doesn't get as far as looking there. When evaluating code
outside the let body, the interpreter looks up variable names in
the top level environment, so the name s refers to the top level
variable.
Within the let body, the binding for s in the local
environment is said to shadow the binding for s in the top
level environment.
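Shadowing can be observed directly with a small sketch. The variable names inner-value and outer-value are invented for this example. The let creates a local binding for s that hides the top level one inside its body, while the top level binding is left untouched.

```scheme
(define s "Some beans, my lord!")

;; Inside the let body, s refers to the local binding.
(define inner-value
  (let ((s "local"))
    (string-append "inside: " s)))

;; Outside the let body, s still refers to the top level variable.
(define outer-value s)

inner-value   ; => "inside: local"
outer-value   ; => "Some beans, my lord!"
```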
The rules that we have just been describing are the details of how Scheme implements “lexical scoping”. This subsection takes a brief diversion to explain what lexical scope means in general and to present an example of non-lexical scoping.
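“Lexical scope” in general is the idea that
an identifier at a particular place in a program always refers to the same variable location, where “always” means “every time that the containing expression is executed”, and that
the variable location to which it refers can be determined by static examination of the source code context in which that identifier appears, without having to consider the flow of execution through the program as a whole.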
In practice, lexical scoping is the norm for most programming languages, and probably corresponds to what you would intuitively consider to be “normal”. You may even be wondering how the situation could possibly — and usefully — be otherwise. To demonstrate that another kind of scoping is possible, therefore, and to compare it against lexical scoping, the following subsection presents an example of non-lexical scoping and examines in detail how its behavior differs from the corresponding lexically scoped code.
To demonstrate that non-lexical scoping does exist and can be useful, we present the following example from Emacs Lisp, which is a “dynamically scoped” language.
(defvar currency-abbreviation "USD")
(defun currency-string (units hundredths)
(concat currency-abbreviation
(number-to-string units)
"."
(number-to-string hundredths)))
(defun french-currency-string (units hundredths)
(let ((currency-abbreviation "FRF"))
(currency-string units hundredths)))
The question to focus on here is: what does the identifier
currency-abbreviation refer to in the currency-string
function? The answer, in Emacs Lisp, is that all variable bindings go
onto a single stack, and that currency-abbreviation refers to the
topmost binding from that stack which has the name
“currency-abbreviation”. The binding that is created by the
defvar form, to the value "USD", is only relevant if none
of the code that calls currency-string rebinds the name
“currency-abbreviation” in the meanwhile.
The second function french-currency-string works precisely by
taking advantage of this behaviour. It creates a new binding for the
name “currency-abbreviation” which overrides the one established by
the defvar form.
;; Note! This is Emacs Lisp evaluation, not Scheme!
(french-currency-string 33 44)
=>
"FRF33.44"
Now let's look at the corresponding, lexically scoped Scheme code:
(define currency-abbreviation "USD")
(define (currency-string units hundredths)
  (string-append currency-abbreviation
                 (number->string units)
                 "."
                 (number->string hundredths)))
(define (french-currency-string units hundredths)
  (let ((currency-abbreviation "FRF"))
    (currency-string units hundredths)))
According to the rules of lexical scoping, the
currency-abbreviation in currency-string refers to the
variable location in the innermost environment at that point in the code
which has a binding for currency-abbreviation, which is the
variable location in the top level environment created by the preceding
(define currency-abbreviation ...) expression.
In Scheme, therefore, the french-currency-string procedure does
not work as intended. The variable binding that it creates for
“currency-abbreviation” is purely local to the code that forms the
body of the let expression. Since this code doesn't directly use
the name “currency-abbreviation” at all, the binding is pointless.
(french-currency-string 33 44)
=>
"USD33.44"
This raises the question of how the Emacs Lisp behaviour can be
implemented in Scheme. In general, this is a design question whose
answer depends upon the problem that is being addressed. In this case,
the best answer may be that currency-string should be
redesigned so that it can take an optional third argument. This third
argument, if supplied, is interpreted as a currency abbreviation that
overrides the default.
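One possible shape for such a redesign is sketched below. It uses the variable-argument define form to make the third argument optional; the parameter name maybe-abbreviation is an assumption of this example, and other designs are equally possible.

```scheme
(define currency-abbreviation "USD")

;; `maybe-abbreviation' is the list of any arguments after the first
;; two.  If it is empty, fall back to the top level default.
(define (currency-string units hundredths . maybe-abbreviation)
  (string-append (if (null? maybe-abbreviation)
                     currency-abbreviation
                     (car maybe-abbreviation))
                 (number->string units)
                 "."
                 (number->string hundredths)))

(define (french-currency-string units hundredths)
  (currency-string units hundredths "FRF"))

(currency-string 33 44)          ; => "USD33.44"
(french-currency-string 33 44)   ; => "FRF33.44"
```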
It is possible to change french-currency-string so that it mostly
works without changing currency-string, but the fix is inelegant,
and susceptible to interrupts that could leave the
currency-abbreviation variable in the wrong state:
(define (french-currency-string units hundredths)
  (set! currency-abbreviation "FRF")
  (let ((result (currency-string units hundredths)))
    (set! currency-abbreviation "USD")
    result))
The key point here is that the code does not create any local binding
for the identifier currency-abbreviation, so all occurrences of
this identifier refer to the top level variable.
Consider a let expression that doesn't contain any
lambdas:
(let ((s (/ (+ a b c) 2)))
  (sqrt (* s (- s a) (- s b) (- s c))))
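When the Scheme interpreter evaluates this, it
creates a new environment with a reference to the environment that was current when it encountered the let
creates a variable binding for s in the new environment, with value given by (/ (+ a b c) 2)
evaluates the expression in the body of the let in the context of the new local environment, and remembers the value V
forgets the local environment
continues evaluating the expression that contained the let, using the value V as the value of the let expression, in the context of the containing environment.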
After the let expression has been evaluated, the local
environment that was created is simply forgotten, and there is no longer
any way to access the binding that was created in this environment. If
the same code is evaluated again, it will follow the same steps again,
creating a second new local environment that has no connection with the
first, and then forgetting this one as well.
If the let body contains a lambda expression, however, the
local environment is not forgotten. Instead, it becomes
associated with the procedure that is created by the lambda
expression, and is reinstated every time that that procedure is called.
In detail, this works as follows.
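When the Scheme interpreter evaluates a lambda expression, to create a procedure object, it stores the current environment as part of the procedure definition.
Then, whenever that procedure is called, the interpreter reinstates the environment that is stored in the procedure definition and evaluates the procedure body within the context of that environment.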
The result is that the procedure body is always evaluated in the context of the environment that was current when the procedure was created.
This is what is meant by closure. The next few subsections present examples that explore the usefulness of this concept.
This example uses closure to create a procedure with a variable binding that is private to the procedure, like a local variable, but whose value persists between procedure calls.
(define (make-serial-number-generator)
  (let ((current-serial-number 0))
    (lambda ()
      (set! current-serial-number (+ current-serial-number 1))
      current-serial-number)))
(define entry-sn-generator (make-serial-number-generator))
(entry-sn-generator)
=>
1
(entry-sn-generator)
=>
2
When make-serial-number-generator is called, it creates a local
environment with a binding for current-serial-number whose
initial value is 0, then, within this environment, creates a procedure.
The local environment is stored within the created procedure object and
so persists for the lifetime of the created procedure.
Every time the created procedure is invoked, it increments the value of
the current-serial-number binding in the captured environment and
then returns the current value.
Note that make-serial-number-generator can be called again to
create a second serial number generator that is independent of the
first. Every new invocation of make-serial-number-generator
creates a new local let environment and returns a new procedure
object with an association to this environment.
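The independence of separately created generators can be checked with a short sketch. The names gen-a and gen-b and the variables holding the results are made up for this example; the generator definition is repeated so that the example stands on its own.

```scheme
(define (make-serial-number-generator)
  (let ((current-serial-number 0))
    (lambda ()
      (set! current-serial-number (+ current-serial-number 1))
      current-serial-number)))

;; Each call creates a fresh environment with its own counter.
(define gen-a (make-serial-number-generator))
(define gen-b (make-serial-number-generator))

(define a1 (gen-a))   ; => 1
(define a2 (gen-a))   ; => 2
(define b1 (gen-b))   ; => 1, gen-b has its own counter
(define a3 (gen-a))   ; => 3, unaffected by the call to gen-b
```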
This example uses closure to create two procedures, get-balance
and deposit, that both refer to the same captured local
environment so that they can both access the balance variable
binding inside that environment. The value of this variable binding
persists between calls to either procedure.
Note that the captured balance variable binding is private to
these two procedures: it is not directly accessible to any other code.
It can only be accessed indirectly via get-balance or
deposit, as illustrated by the withdraw procedure.
(define get-balance #f)
(define deposit #f)
(let ((balance 0))
  (set! get-balance
        (lambda ()
          balance))
  (set! deposit
        (lambda (amount)
          (set! balance (+ balance amount))
          balance)))
(define (withdraw amount)
  (deposit (- amount)))
(get-balance)
=>
0
(deposit 50)
=>
50
(withdraw 75)
=>
-25
An important detail here is that the get-balance and
deposit variables must be set up by defining them at top
level and then using set! to assign their values inside the let body.
Using define within the let body would not work: this
would create variable bindings within the local let environment
that would not be accessible at top level.
A frequently used programming model for library code is to allow an application to register a callback function for the library to call when some particular event occurs. It is often useful for the application to make several such registrations using the same callback function, for example if several similar library events can be handled using the same application code, but the need then arises to distinguish the callback function calls that are associated with one callback registration from those that are associated with different callback registrations.
In languages without the ability to create functions dynamically, this
problem is usually solved by passing a user_data parameter on the
registration call, and including the value of this parameter as one of
the parameters on the callback function. Here is an example of
declarations using this solution in C:
typedef void (event_handler_t) (int event_type,
                                void *user_data);
void register_callback (int event_type,
                        event_handler_t *handler,
                        void *user_data);
In Scheme, closure can be used to achieve the same functionality without
requiring the library code to store a user-data for each callback
registration.
;; In the library:
(define (register-callback event-type handler-proc)
  ...)
;; In the application:
(define (make-handler event-type user-data)
  (lambda ()
    ...
    <code referencing event-type and user-data>
    ...))
(register-callback event-type
                   (make-handler event-type ...))
As far as the library is concerned, handler-proc is a procedure
with no arguments, and all the library has to do is call it when the
appropriate event occurs. From the application's point of view, though,
the handler procedure has used closure to capture an environment that
includes all the context that the handler code needs —
event-type and user-data — to handle the event
correctly.
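To make the outline above concrete, here is one way it might be filled in. Everything beyond the names register-callback and make-handler is an assumption of this sketch: the list registered-handlers, the procedure signal-event, the list received, and the event types and user data are all invented for illustration, and a real library would do this differently.

```scheme
;; Hypothetical library side: remember each handler with its event type.
(define registered-handlers '())

(define (register-callback event-type handler-proc)
  (set! registered-handlers
        (cons (cons event-type handler-proc) registered-handlers)))

;; Call every handler registered for `event-type', with no arguments.
(define (signal-event event-type)
  (for-each (lambda (entry)
              (if (eq? (car entry) event-type)
                  ((cdr entry))))
            registered-handlers))

;; Hypothetical application side: each handler captures its own
;; event-type and user-data, and records them when it is called.
(define received '())

(define (make-handler event-type user-data)
  (lambda ()
    (set! received (cons (list event-type user-data) received))))

(register-callback 'click (make-handler 'click "button-1"))
(register-callback 'click (make-handler 'click "button-2"))
(register-callback 'key (make-handler 'key "text-field"))

(signal-event 'click)
(length received)   ; => 2, one record per 'click registration
```

The same make-handler code serves all three registrations, yet each call is distinguishable by the user-data that its closure captured.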
Closure is the capture of an environment, containing persistent variable bindings, within the definition of a procedure or a set of related procedures. This is rather similar to the idea in some object oriented languages of encapsulating a set of related data variables inside an “object”, together with a set of “methods” that operate on the encapsulated data. The following example shows how closure can be used to emulate the ideas of objects, methods and encapsulation in Scheme.
(define (make-account)
  (let ((balance 0))
    (define (get-balance)
      balance)
    (define (deposit amount)
      (set! balance (+ balance amount))
      balance)
    (define (withdraw amount)
      (deposit (- amount)))
    (lambda args
      (apply
        (case (car args)
          ((get-balance) get-balance)
          ((deposit) deposit)
          ((withdraw) withdraw)
          (else (error "Invalid method!")))
        (cdr args)))))
Each call to make-account creates and returns a new procedure,
created by the expression in the example code that begins “(lambda
args”.
(define my-account (make-account))
my-account
=>
#<procedure args>
This procedure acts as an account object with methods
get-balance, deposit and withdraw. To apply one of
the methods to the account, you call the procedure with a symbol
indicating the required method as the first parameter, followed by any
other parameters that are required by that method.
(my-account 'get-balance)
=>
0
(my-account 'withdraw 5)
=>
-5
(my-account 'deposit 396)
=>
391
(my-account 'get-balance)
=>
391
Note how, in this example, both the current balance and the helper
procedures get-balance, deposit and withdraw, used
to implement the guts of the account object's methods, are all stored in
variable bindings within the private local environment captured by the
lambda expression that creates the account object procedure.
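Because each call to make-account creates a new local environment, separate account objects do not share a balance. The following sketch repeats the definition so that it stands alone; the account names savings and current are invented for this example.

```scheme
(define (make-account)
  (let ((balance 0))
    (define (get-balance)
      balance)
    (define (deposit amount)
      (set! balance (+ balance amount))
      balance)
    (define (withdraw amount)
      (deposit (- amount)))
    (lambda args
      (apply
        (case (car args)
          ((get-balance) get-balance)
          ((deposit) deposit)
          ((withdraw) withdraw)
          (else (error "Invalid method!")))
        (cdr args)))))

;; Two accounts, each with its own captured balance.
(define savings (make-account))
(define current (make-account))

(savings 'deposit 100)    ; => 100
(current 'deposit 20)     ; => 20
(savings 'withdraw 30)    ; => 70
(current 'get-balance)    ; => 20, unaffected by the savings account
```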
Guile's core language is Scheme, which is specified and described in the series of reports known as RnRS. RnRS is shorthand for the Revised^n Report on the Algorithmic Language Scheme. The current latest revision of RnRS is version 5 (see R5RS), and Guile 1.4 is fully compliant with the Scheme specification in this revision.
But Guile, like most Scheme implementations, also goes beyond R5RS in many ways, because R5RS does not give specifications (or even recommendations) regarding many issues that are important in practical programming. Some of the areas where Guile extends R5RS are:
Like AWK, Perl, or any shell, Guile can interpret script files. A Guile script is simply a file of Scheme code with some extra information at the beginning which tells the operating system how to invoke Guile, and then tells Guile how to handle the Scheme code.
The first line of a Guile script must tell the operating system to use Guile to evaluate the script, and then tell Guile how to go about doing that. Here is the simplest case:
The first two characters of the file must be `#!'. The operating system interprets this to mean that the rest of the line is the name of an executable that can interpret the script. Guile, however, interprets these characters as the beginning of a multi-line comment, terminated by the characters `!#' on a line by themselves. (This is an extension to the syntax described in R5RS, added to support shell scripts.)
Guile reads the program, evaluating expressions in the order that they appear. Upon reaching the end of the file, Guile exits.
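A minimal script along these lines might look like the following sketch. The path /usr/local/bin/guile is an assumption and depends on where Guile is installed on your system; the procedure name greeting is made up for this example.

```scheme
#!/usr/local/bin/guile -s
!#
;; The two lines above are a comment as far as Guile is concerned.
;; Everything from here on is ordinary Scheme code.
(define (greeting)
  "Hello, world!")

(display (greeting))
(newline)
```

After the file has been made executable, running it prints the greeting, and Guile exits when it reaches the end of the file.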
Here we describe Guile's command-line processing in detail. Guile processes its arguments from left to right, recognizing the switches described below. For examples, see Scripting Examples.
-s script arg...
    Read and evaluate Scheme source code from the file script, as the
    load function would. After loading script, exit. Any
    command-line arguments arg... following script become the
    script's arguments; the command-line function returns a list of
    strings of the form (script arg...).
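-c expr arg...
    Evaluate expr as Scheme code, and then exit. Any command-line
    arguments arg... following expr become command-line arguments; the
    command-line function returns a list of strings of the form
    (guile arg...), where guile is the path of the
    Guile executable.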
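-- arg...
    Run interactively, prompting the user for expressions and evaluating
    them. Any command-line arguments arg... following the --
    become command-line arguments for the interactive session; the
    command-line function returns a list of strings of the form
    (guile arg...), where guile is the path of the
    Guile executable.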
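-L directory
    Add directory to the front of Guile's module load path.
-l file
    Load Scheme source code from file, and continue processing the
    command line.
-e function
    Make function the entry point of the script. After loading the
    script file (with -s) or evaluating the expression (with
    -c), apply function to a list containing the program name
    and the command-line arguments, that is, the list provided by the
    command-line function.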
A -e switch can appear anywhere in the argument list, but Guile
always invokes the function as the last action it performs.
This is weird, but because of the way script invocation works under
POSIX, the -s option must always come last in the list.
The function is most often a simple symbol that names a function
that is defined in the script. It can also be of the form (@
module-name symbol) and in that case, the symbol is
looked up in the module named module-name.
For compatibility with some versions of Guile 1.4, you can also use the
form (symbol ...) (that is, a list of only symbols that doesn't
start with @), which is equivalent to (@ (symbol ...)
main), or (symbol ...) symbol (that is, a list of only symbols
followed by a symbol), which is equivalent to (@ (symbol ...)
symbol). We recommend using the equivalent forms directly since they
correspond to the (@ ...) read syntax that can be used in
normal code. See Using Guile Modules.
See Scripting Examples.
-ds
    Treat a final -s option as if it occurred at this point in the
    command line; load the script here.
This switch is necessary because, although the POSIX script invocation
mechanism effectively requires the -s option to appear last, the
programmer may well want to run the script before other actions
requested on the command line. For examples, see Scripting Examples.
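\
    The meta switch, used to work around limitations in #! scripts.
    See The Meta Switch, for details.
--emacs
    Assume Guile is running as an inferior process of Emacs, and use a
    special protocol to communicate with Emacs's Guile interaction mode.
    This switch sets the global variable use-emacs-interface to #t.
    This switch is still experimental.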
--use-srfi=list
    The option --use-srfi expects a comma-separated list of numbers,
    each representing a SRFI number to be loaded into the interpreter
    before it starts evaluating a script file or the REPL. Additionally,
    the feature identifier for the loaded SRFIs is recognized by
    `cond-expand' when using this option.
guile --use-srfi=8,13
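As a rough illustration of what this option saves you, the sketch below uses receive from SRFI 8. It loads the SRFI module explicitly, so it also runs when Guile was started without --use-srfi; with guile --use-srfi=8 the use-modules line should be unnecessary. The procedures quotient-and-remainder and describe-division are made-up helpers for this example, not part of Guile.

```scheme
;; Explicit equivalent of starting Guile with --use-srfi=8.
(use-modules (srfi srfi-8))

;; Hypothetical helper: returns two values, the quotient and the remainder.
(define (quotient-and-remainder n d)
  (values (quotient n d) (remainder n d)))

;; receive (from SRFI 8) binds the two returned values to q and r.
(define (describe-division n d)
  (receive (q r) (quotient-and-remainder n d)
    (list q r)))
```

Calling (describe-division 17 5) collects the two values into one list.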
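--debug
    Start with the debugging evaluator and enable backtraces. Using the debugging evaluator gives better error messages but slows down execution. By default, the debugging evaluator is only used when entering an interactive session. When executing a script with -s or -c, the normal, faster evaluator is used by default.
--no-debug
    Do not use the debugging evaluator, even when entering an interactive session.
-h, --help
    Display help on invoking Guile, and then exit.
-v, --version
    Display the current version of Guile, and then exit.
Guile's command-line switches allow the programmer to describe reasonably complicated actions in scripts. Unfortunately, the POSIX script invocation mechanism only allows one argument to appear on the `#!' line after the path to the Guile executable, and imposes arbitrary limits on that argument's length. Suppose you wrote a script starting like this: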
#!/usr/local/bin/guile -e main -s
!#
(define (main args)
  (map (lambda (arg) (display arg) (display " "))
       (cdr args))
  (newline))
The intended meaning is clear: load the file, and then call main
on the command-line arguments. However, the system will treat
everything after the Guile path as a single argument — the string
"-e main -s" — which is not what we want.
As a workaround, the meta switch \ allows the Guile programmer to
specify an arbitrary number of options without patching the kernel. If
the first argument to Guile is \, Guile will open the script file
whose name follows the \, parse arguments starting from the
file's second line (according to rules described below), and substitute
them for the \ switch.
Working in concert with the meta switch, Guile treats the characters `#!' as the beginning of a comment which extends through the next line containing only the characters `!#'. This sort of comment may appear anywhere in a Guile program, but it is most useful at the top of a file, meshing magically with the POSIX script invocation mechanism.
Thus, consider a script named /u/jimb/ekko which starts like this:
#!/usr/local/bin/guile \
-e main -s
!#
(define (main args)
  (map (lambda (arg) (display arg) (display " "))
       (cdr args))
  (newline))
Suppose a user invokes this script as follows:
$ /u/jimb/ekko a b c
Here's what happens:
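* The operating system recognizes the `#!' token at the top of the file,
  and rewrites the command line to:
  /usr/local/bin/guile \ /u/jimb/ekko a b c
  This is the usual behavior, prescribed by POSIX.
* When Guile sees the first two arguments, \ /u/jimb/ekko, it opens
  /u/jimb/ekko, parses the three arguments -e, main,
  and -s from it, and substitutes them for the \ switch.
  Thus, Guile's command line now reads:
  /usr/local/bin/guile -e main -s /u/jimb/ekko a b c
* Guile then processes these switches: it loads /u/jimb/ekko as a
  file of Scheme code (treating the first three lines as a comment), and
  then performs the application
  (main '("/u/jimb/ekko" "a" "b" "c")).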
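When Guile sees the meta switch \, it parses command-line
arguments from the script file according to the following rules:
* Each space character terminates an argument. This means that two
  spaces in a row introduce an argument "".
* The tab character is not permitted (unless you quote it with the
  backslash character, as described below), to avoid confusion.
* The newline character terminates the sequence of arguments.
* The backslash character is the escape character. It escapes
  backslash, space, tab, and newline. The ANSI C escape sequences like
  \n and \t are also supported. These produce argument constituents; the
  two-character combination \n doesn't act like a terminating
  newline. The escape sequence \NNN for exactly three octal
  digits reads as the character whose ASCII code is NNN. As above,
  characters produced this way are argument constituents. Backslash
  followed by other characters is not allowed.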
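The rules above can be approximated in a few lines of Scheme. This is only a sketch of the described behaviour, not Guile's actual implementation: parse-meta-args and octal-digit? are hypothetical names, the input is assumed to be the text of the script's second line, and the sketch assumes that an unescaped tab is an error, that a backslash may quote backslash, space, tab and newline, and that an empty argument left pending at the end of the line is dropped.

```scheme
;; Hypothetical helper: is C one of the characters 0 through 7?
(define (octal-digit? c)
  (and (char>=? c #\0) (char<=? c #\7)))

;; Sketch: split LINE into a list of argument strings.
(define (parse-meta-args line)
  (let loop ((chars (string->list line))
             (current '())    ; characters of the argument in progress, reversed
             (args '()))      ; completed arguments, reversed
    (define (finish)
      (cons (list->string (reverse current)) args))
    (cond
     ;; End of input or an unescaped newline ends the argument list.
     ((or (null? chars) (char=? (car chars) #\newline))
      (reverse (if (null? current) args (finish))))
     ;; An unescaped space terminates the current argument.
     ((char=? (car chars) #\space)
      (loop (cdr chars) '() (finish)))
     ;; Assumption: an unescaped tab is rejected.
     ((char=? (car chars) #\tab)
      (error "unescaped tab is not permitted"))
     ((char=? (car chars) #\\)
      (if (null? (cdr chars))
          (error "backslash at end of input"))
      (let ((next (cadr chars)))
        (cond
         ;; \n and \t produce constituents, not terminators.
         ((char=? next #\n)
          (loop (cddr chars) (cons #\newline current) args))
         ((char=? next #\t)
          (loop (cddr chars) (cons #\tab current) args))
         ;; Assumption: backslash quotes these four characters.
         ((memv next (list #\\ #\space #\tab #\newline))
          (loop (cddr chars) (cons next current) args))
         ;; \NNN with exactly three octal digits.
         ((and (>= (length chars) 4)
               (octal-digit? next)
               (octal-digit? (caddr chars))
               (octal-digit? (cadddr chars)))
          (let ((code (string->number
                       (string next (caddr chars) (cadddr chars)) 8)))
            (loop (cddddr chars) (cons (integer->char code) current) args)))
         (else
          (error "unsupported escape after backslash:" next)))))
     (else
      (loop (cdr chars) (cons (car chars) current) args)))))
```

Applied to the second line of the ekko script, (parse-meta-args "-e main -s") yields the three arguments that Guile substitutes for the \ switch.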
The ability to accept and handle command line arguments is very important when writing Guile scripts to solve particular problems, such as extracting information from text files or interfacing with existing command line applications. This chapter describes how Guile makes command line arguments available to a Guile script, and the utilities that Guile provides to help with the processing of command line arguments.
When a Guile script is invoked, Guile makes the command line arguments
accessible via the procedure command-line, which returns the
arguments as a list of strings.
For example, if the script
#! /usr/local/bin/guile -s
!#
(write (command-line))
(newline)
is saved in a file cmdline-test.scm and invoked using the command
line ./cmdline-test.scm bar.txt -o foo -frumple grob, the output
is
("./cmdline-test.scm" "bar.txt" "-o" "foo" "-frumple" "grob")
If the script invocation includes a -e option, specifying a
procedure to call after loading the script, Guile will call that
procedure with (command-line) as its argument. So a script that
uses -e doesn't need to refer explicitly to command-line
in its code. For example, the script above would have identical
behaviour if it was written instead like this:
#! /usr/local/bin/guile \
-e main -s
!#
(define (main args)
  (write args)
  (newline))
(Note the use of the meta switch \ so that the script invocation
can include more than one Guile option; see The Meta Switch.)
These scripts use the #! POSIX convention so that they can be
executed using their own file names directly, as in the example command
line ./cmdline-test.scm bar.txt -o foo -frumple grob. But they
can also be executed by typing out the implied Guile command line in
full, as in:
$ guile -s ./cmdline-test.scm bar.txt -o foo -frumple grob
or
$ guile -e main -s ./cmdline-test2.scm bar.txt -o foo -frumple grob
Even when a script is invoked using this longer form, the arguments that
the script receives are the same as if it had been invoked using the
short form. Guile ensures that the (command-line) or -e
arguments are independent of how the script is invoked, by stripping off
the arguments that Guile itself processes.
A script is free to parse and handle its command line arguments in any
way that it chooses. Where the set of possible options and arguments is
complex, however, it can get tricky to extract all the options, check
the validity of given arguments, and so on. This task can be greatly
simplified by taking advantage of the module (ice-9 getopt-long),
which is distributed with Guile. See getopt-long.
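As a minimal sketch of what that looks like, the following uses getopt-long and option-ref from (ice-9 getopt-long). The option set (output and verbose), the default file name "out.txt" and the procedure summarize are invented for illustration; a real script would define whatever options it needs.

```scheme
(use-modules (ice-9 getopt-long))

;; Hypothetical option set: -o/--output takes a value, -v/--verbose does not.
(define option-spec
  '((output  (single-char #\o) (value #t))
    (verbose (single-char #\v) (value #f))))

;; ARGS has the same shape as (command-line): program name first.
;; Returns a list of the output name, the verbose flag and the
;; remaining non-option arguments.
(define (summarize args)
  (let ((options (getopt-long args option-spec)))
    (list (option-ref options 'output "out.txt")
          (option-ref options 'verbose #f)
          (option-ref options '() '()))))

;; Entry point suitable for use with the -e switch.
(define (main args)
  (write (summarize args))
  (newline))
```

Because main receives the same list that (command-line) returns, it can be used as the entry point named by -e main.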
To start with, here are some examples of invoking Guile directly:
guile -- a b c
    Run Guile interactively; (command-line) will return
    ("/usr/local/bin/guile" "a" "b" "c").
guile -s /u/jimb/ex2 a b c
    Load the file /u/jimb/ex2; (command-line) will return
    ("/u/jimb/ex2" "a" "b" "c").
guile -c '(write %load-path) (newline)'
    Write the value of the variable %load-path, print a newline,
    and exit.
guile -e main -s /u/jimb/ex4 foo
    Load the file /u/jimb/ex4, and then call the function
    main, passing it the list ("/u/jimb/ex4" "foo").
guile -l first -ds -l last -s script
    Load the files first, script, and last, in that order. The
    -ds switch says when to process the -s
    switch. For a more motivated example, see the scripts below.
Here is a very simple Guile script:
#!/usr/local/bin/guile -s
!#
(display "Hello, world!")
(newline)
The first line marks the file as a Guile script. When the user invokes
it, the system runs /usr/local/bin/guile to interpret the script,
passing -s, the script's filename, and any arguments given to the
script as command-line arguments. When Guile sees -s
script, it loads script. Thus, running this program
produces the output:
Hello, world!
Here is a script which prints the factorial of its argument:
#!/usr/local/bin/guile -s
!#
(define (fact n)
  (if (zero? n) 1
      (* n (fact (- n 1)))))
(display (fact (string->number (cadr (command-line)))))
(newline)
In action:
$ fact 5
120
$
However, suppose we want to use the definition of fact in this
file from another script. We can't simply load the script file,
and then use fact's definition, because the script will try to
compute and display a factorial when we load it. To avoid this problem,
we might write the script this way:
#!/usr/local/bin/guile \
-e main -s
!#
(define (fact n)
  (if (zero? n) 1
      (* n (fact (- n 1)))))
(define (main args)
  (display (fact (string->number (cadr args))))
  (newline))
This version packages the actions the script should perform in a
function, main. This allows us to load the file purely for its
definitions, without any extraneous computation taking place. We then
use the meta switch \ and the entry point switch -e to
tell Guile to call main after loading the script.
$ fact 50
30414093201713378043612608166064768844377641568960512000000000000
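Both versions of the script assume that exactly one argument is given and that it is a non-negative integer; otherwise the script stops with an error. One possible guard is sketched below. The name parse-count, the usage message, and the choice to return #f for bad input are all assumptions made for this example.

```scheme
;; Hypothetical helper: return the count named by ARGS, a list of the
;; form ("fact" "5"), or #f if it is missing or not a non-negative
;; exact integer.
(define (parse-count args)
  (and (= (length args) 2)
       (let ((n (string->number (cadr args))))
         (and n (exact? n) (integer? n) (>= n 0) n))))

(define (fact n)
  (if (zero? n) 1
      (* n (fact (- n 1)))))

;; Same entry point as before, but it checks the argument first.
(define (main args)
  (let ((n (parse-count args)))
    (if n
        (display (fact n))
        (display "usage: fact N"))
    (newline)))
```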
Suppose that we now want to write a script which computes the
choose function: given a set of m distinct objects,
(choose n m) is the number of distinct subsets
containing n objects each. It's easy to write choose given
fact, so we might write the script this way:
#!/usr/local/bin/guile \
-l fact -e main -s
!#
(define (choose n m)
  (/ (fact m) (* (fact (- m n)) (fact n))))
(define (main args)
  (let ((n (string->number (cadr args)))
        (m (string->number (caddr args))))
    (display (choose n m))
    (newline)))
The command-line arguments here tell Guile to first load the file
fact, and then run the script, with main as the entry
point. In other words, the choose script can use definitions
made in the fact script. Here are some sample runs:
$ choose 0 4
1
$ choose 1 4
4
$ choose 2 4
6
$ choose 3 4
4
$ choose 4 4
1
$ choose 50 100
100891344545564193334812497256
When you start up Guile by typing just guile, without a
-c argument or the name of a script to execute, you get an
interactive interpreter where you can enter Scheme expressions, and
Guile will evaluate them and print the results for you. Here are some
simple examples.
guile> (+ 3 4 5)
12
guile> (display "Hello world!\n")
Hello world!
guile> (values 'a 'b)
a
b
This mode of use is called a REPL, which is short for “Read-Eval-Print Loop”, because the Guile interpreter first reads the expression that you have typed, then evaluates it, and then prints the result.
To make it easier for you to repeat and vary previously entered expressions, or to edit the expression that you're typing in, Guile can use the GNU Readline library. This is not enabled by default because of licensing reasons, but all you need to activate Readline is the following pair of lines.
guile> (use-modules (ice-9 readline))
guile> (activate-readline)
It's a good idea to put these two lines (without the “guile>” prompts) in your .guile file. Guile reads this file when it starts up interactively, so anything in this file has the same effect as if you type it in by hand at the “guile>” prompt.
Just as Readline helps you to reuse a previous input line, value
history allows you to use the result of a previous evaluation
in a new expression. When value history is enabled, each evaluation
result is automatically assigned to the next in the sequence of
variables $1, $2, ..., and you can then use these
variables in subsequent expressions.
guile> (iota 10)
$1 = (0 1 2 3 4 5 6 7 8 9)
guile> (apply * (cdr $1))
$2 = 362880
guile> (sqrt $2)
$3 = 602.3952191045344
guile> (cons $2 $1)
$4 = (362880 0 1 2 3 4 5 6 7 8 9)
To enable value history, type (use-modules (ice-9 history)) at
the Guile prompt, or add this to your .guile file. (It is not
enabled by default, to avoid the possibility of conflicting with some
other use you may have for the variables $1, $2,
..., and also because it prevents the stored evaluation results
from being garbage collected, which some people may not want.)
When code being evaluated from the REPL hits an error, Guile remembers the execution context where the error occurred and can give you three levels of information about what the error was and exactly where it occurred.
By default, Guile displays only the first level, which is the most immediate information about where and why the error occurred, for example:
(make-string (* 4 (+ 3 #\s)) #\space)
-|
standard input:2:19: In procedure + in expression (+ 3 #\s):
standard input:2:19: Wrong type argument: #\s
ABORT: (wrong-type-arg)
Type "(backtrace)" to get more information
or "(debug)" to enter the debugger.
However, as the message above says, you can obtain more information
about the context of the error by typing (backtrace) or
(debug).
(backtrace) displays the Scheme call stack at the point where the
error occurred:
(backtrace)
-|
Backtrace:
In standard input:
2: 0* [make-string ...
2: 1* [* 4 ...
2: 2* [+ 3 #\s]
Type "(debug-enable 'backtrace)" if you would like a backtrace
automatically if an error occurs in the future.
In a more complex scenario than this one, this can be extremely useful
for understanding where and why the error occurred. You can make Guile
show the backtrace automatically by adding (debug-enable
'backtrace) to your .guile.
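Putting together the settings mentioned so far, a .guile file might look like the sketch below. Every form in it is taken from the text above; whether you want all three features is a matter of taste, and the Readline lines assume that Guile's Readline support is installed.

```scheme
;; Sample .guile, read when Guile starts up interactively.

;; Command-line editing and input history via GNU Readline.
(use-modules (ice-9 readline))
(activate-readline)

;; Value history: results are bound to $1, $2, ...
(use-modules (ice-9 history))

;; Show a backtrace automatically whenever an error occurs.
(debug-enable 'backtrace)
```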
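(debug) takes you into Guile's interactive debugger, which
provides commands that allow you to
* display the Scheme call stack at the point where the error occurred
  (the backtrace command; see Display Backtrace)
* move up and down the call stack, to see in detail the expression being
  evaluated, or the procedure being applied, in each frame (the
  up, down, frame, position, info args
  and info frame commands; see Frame Selection and
  Frame Information)
* examine the values of variables and expressions in the context of each
  frame (the evaluate command; see Frame Evaluation).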
The interactive debugger is documented further in the following section.
Guile's interactive debugger is a command line application that accepts commands from you for examining the stack and, if stopped at a trap, for continuing program execution in various ways. Unlike in the normal Guile REPL, commands are typed mostly without parentheses.
When you first enter the debugger, it introduces itself with a message like this:
This is the Guile debugger -- for help, type `help'.
There are 3 frames on the stack.
Frame 2 at standard input:36:19
[+ 3 #\s]
debug>
“debug>” is the debugger's prompt, and a reminder that you are not in
the normal Guile REPL. In case you find yourself in the debugger by
mistake, the quit command will return you to the REPL.
The other available commands are described in the following subsections.
The backtrace command, which can also be invoked as bt or
where, displays the call stack (aka backtrace) at the point where
the debugger was entered:
debug> bt
In standard input:
36: 0* [make-string ...
36: 1* [* 4 ...
36: 2* [+ 3 #\s]
Debugger Command: backtrace [count]
    Print backtrace of all stack frames, or of the innermost count frames. With a negative argument, print the outermost -count frames. If the number of frames isn't explicitly given, the debug option depth determines the maximum number of frames printed.
The format of the displayed backtrace is the same as for the
display-backtrace procedure (see Examining the Stack).
A call stack consists of a sequence of stack frames, with each frame describing one level of the nested evaluations and applications that the program was executing when it hit a breakpoint or an error. Frames are numbered such that frame 0 is the outermost — i.e. the operation on the call stack that began least recently — and frame N-1 the innermost (where N is the total number of frames on the stack).
When you enter the debugger, the innermost frame is selected, which
means that the commands for getting information about the “current”
frame, or for evaluating expressions in the context of the current
frame, will do so by default with respect to the innermost frame. To
select a different frame, so that these operations will apply to it
instead, use the up, down and frame commands like
this:
debug> up
Frame 1 at standard input:36:14
[* 4 ...
debug> frame 0
Frame 0 at standard input:36:1
[make-string ...
debug> down
Frame 1 at standard input:36:14
[* 4 ...
Debugger Command: up [n]
    Move n frames up the stack. For positive n, this advances toward the outermost frame, to lower frame numbers, to frames that have existed longer. n defaults to one.
Debugger Command: down [n]
    Move n frames down the stack. For positive n, this advances toward the innermost frame, to higher frame numbers, to frames that were created more recently. n defaults to one.
Debugger Command: frame [n]
    Select and print a stack frame. With no argument, print the selected stack frame. (See also “info frame”.) An argument specifies the frame to select; it must be a stack-frame number.
The following commands return detailed information about the currently selected frame.
Debugger Command: info frame
    Display a verbose description of the selected frame. The information that this command provides is equivalent to what can be deduced from the one line summary for the frame that appears in a backtrace, but is presented and explained more clearly.
Debugger Command: info args
    Display the argument variables of the current stack frame. Arguments can also be seen in the backtrace, but are presented more clearly by this command.
Debugger Command: position
    Display the name of the source file that the current expression comes from, and the line and column number of the expression's opening parenthesis within that file. This information is only available when the positions read option is enabled (see Reader options).
The evaluate command is most useful for querying the value of a
variable, either global or local, in the environment of the selected
stack frame, but it can be used more generally to evaluate any
expression.
Debugger Command: evaluate expression
    Evaluate an expression in the environment of the selected stack frame. The expression must start on the same line as the command; however, it may be continued over multiple lines.
The commands in this subsection all apply only when the stack is continuable — in other words when it makes sense for the program that the stack comes from to continue running. Usually this means that the program stopped because of a trap or a breakpoint.
Debugger Command: step [n]
    Tell the debugged program to do n more steps from its current position. One step means executing until the next frame entry or exit of any kind. n defaults to 1.
Debugger Command: next [n]
    Tell the debugged program to do n more steps from its current position, but only counting frame entries and exits where the corresponding source code comes from the same file as the current stack frame. (See Step Traps for the details of how this works.) If the current stack frame has no source code, the effect of this command is the same as of step. n defaults to 1.
Debugger Command: finish
    Tell the program being debugged to continue running until the completion of the current stack frame, and at that time to print the result and reenter the command line debugger.
Debugger Command: continue
    Tell the program being debugged to continue running. (In fact this is the same as the quit command, because it exits the debugger command loop and so allows whatever code it was that invoked the debugger to continue.)
There are several options for working on Guile Scheme code in Emacs.
The simplest is to use Emacs's standard scheme-mode for
editing code, and to run the interpreter when you need it by typing
“guile” at the prompt of a *shell* buffer, but there are
Emacs libraries available which add various bells and whistles to
this. The following diagram shows these libraries and how they relate
to each other, with the arrows indicating “builds on” or
“extends”. For example, the Quack library builds on cmuscheme,
which in turn builds on the standard scheme mode.
                scheme
                  ^
                  |
            .-----+-----.
            |           |
        cmuscheme    xscheme
            ^
            |
      .-----+-----.
      |           |
    Quack        GDS
scheme, written by Bill Rozas and Dave Love, is Emacs's standard mode for Scheme code files. It provides Scheme-sensitive syntax highlighting, parenthesis matching, indentation and so on.
cmuscheme, written by Olin Shivers, provides a comint-based Scheme
interaction buffer, so that you can run an interpreter more directly
than with the *shell* buffer approach by typing M-x
run-scheme. It also extends scheme-mode so that there are key
presses for sending selected bits of code from a Scheme buffer to this
interpreter. This means that when you are writing some code and want to
check what an expression evaluates to, you can easily select that code
and send it to the interpreter for evaluation, then switch to the
interpreter to see what the result is. cmuscheme is included in the
standard Emacs distribution.
Quack, written by Neil Van Dyke, adds a number of incremental
improvements to the scheme/cmuscheme combination: convenient menu
entries for looking up Scheme-related references (such as the SRFIs);
enhanced indentation rules that are customized for particular Scheme
interpreters, including Guile; an enhanced version of the
run-scheme command that knows the names of the common Scheme
interpreters and remembers which one you used last time; and so on.
Quack is available from http://www.neilvandyke.org/quack.
GDS, written by Neil Jerram, also builds on the scheme/cmuscheme
combination, but with a change to the way that Scheme code fragments
are sent to the interpreter for evaluation. cmuscheme and Quack send
code fragments to the interpreter's standard input, on the assumption
that the interpreter is expecting to read Scheme expressions there,
and then monitor the interpreter's standard output to infer what the
result of the evaluation is. GDS doesn't use standard input and
output like this. Instead, it sets up a socket connection between the
Scheme interpreter and Emacs, and sends and receives messages using a
simple protocol through this socket. The messages include requests to
evaluate Scheme code, and responses conveying the results of an
evaluation, thus providing similar function to cmuscheme or Quack.
They also include requests for stack exploration and debugging, which
go beyond what cmuscheme or Quack can do. The price of this extra
power, however, is that GDS is Guile-specific. GDS requires the
Scheme interpreter to run some GDS-specific library code; currently
this code is written as a Guile module and uses features that are
specific to Guile. GDS is now included in the Guile distribution; for
previous Guile releases (1.8.4 and earlier) it can be obtained as part
of the guile-debugging package from
http://www.ossau.uklinux.net/guile.
Finally, xscheme is similar to cmuscheme — in that it starts up a Scheme interaction process and sends commands to that process's standard input — and to GDS — in that it has support beyond cmuscheme or Quack for exploring the Scheme stack when an error has occurred — but is implemented specifically for MIT/GNU Scheme. Hence it isn't really relevant to Guile work in Emacs, except as a reference for useful features that could be implemented in one of the other libraries mentioned here.
In summary, the best current choice for working on Guile code in Emacs is either Quack or GDS, depending on which of these libraries' features you find most important. For more information on Quack, please see the website referenced above. GDS is documented further in the rest of this section.
GDS aims to allow you to work on Guile Scheme code in the same kind of way that Emacs allows you to work on Emacs Lisp code: providing easy access to help, evaluating arbitrary fragments of code, a nice debugging interface, and so on. The thinking behind the GDS library is that you will usually be doing one of two things.
The presentation makes it very easy to move up and down the stack, showing whenever possible the source code for each frame in another Emacs buffer. It also provides convenient keystrokes for telling Guile what to do next; for example, you can select a stack frame and tell Guile to run until that frame completes, at which point GDS will display the frame's return value.
GDS can provide these facilities for any number of Guile Scheme programs (which we often refer to as “clients”) at once, and these programs can be started either independently of GDS, including outside Emacs, or specifically by GDS.
Communication between each Guile client program and GDS uses a TCP socket, which means that it is orthogonal to any other interfaces that the client program has. In particular GDS does not interfere with a program's standard input and output.
In order to understand the following documentation fully it will help to have a picture in mind of how GDS works, so we briefly describe that here. GDS consists of three components.
The following diagram shows how these components are connected to each other.
+----------------+
| Program #1     |
|                |
| +------------+ |
| | GDS Client |-_
| +------------+ |-_                       +-------------------+
+----------------+  -_TCP                  | Emacs             |
                      -_                   |                   |
                        -_+------------+   | +---------------+ |
                         _| GDS Server |-----| GDS Interface | |
+----------------+     _- +------------+   | +---------------+ |
| Program #2     |   _-                    +-------------------+
|                | _- TCP
| +------------+ _-
| | GDS Client |-|
| +------------+ |
+----------------+
The data exchanged between client and server components, and between server and interface, is a sequence of sexps (parenthesised expressions) that are designed so as to be directly readable by both Scheme and Emacs Lisp. The use of a TCP connection means that the server and Emacs interface can theoretically be on a different computer from the client programs, but in practice there are currently two problems with this. Firstly the GDS API doesn't provide any way of specifying a non-local server to connect to, and secondly there is no security or authentication mechanism in the GDS protocol. These are issues that should be addressed in the future.
To enable the use of GDS in your own Emacs sessions, simply add
(require 'gds)
somewhere in your .emacs file. This will cause Emacs to load the GDS Emacs Lisp code when starting up, and to start the inferior GDS server process so that it is ready and waiting for any Guile programs that want to use GDS.
(If GDS's Scheme code is not installed in one of the locations in Guile's load path, you may find that the server process fails to start. When this happens you will see an error message from Emacs:
error in process filter: Wrong type argument: listp, Backtrace:
and the gds-debug buffer will contain a Scheme backtrace ending
with the message:
no code for module (ice-9 gds-server)
The solution for this is to customize the Emacs variable
gds-scheme-directory so that it specifies where the GDS Scheme
code is installed. Then either restart Emacs or type M-x
gds-run-debug-server to try starting the GDS server process again.)
For evaluations, help and completion from Scheme code buffers that you
are working on, this is all you need. The first time you do any of
these things, GDS will automatically start a new Guile client program as
an Emacs subprocess. This Guile program does nothing but wait for and
act on instructions from GDS, and we refer to it as a utility
Guile client. Over time this utility client will accumulate the code
that you ask it to evaluate, and you can also tell it to load complete
files or modules by sending it load or use-modules
expressions.
When you want to use GDS to work on an independent Guile application, you need to add something to that application's Scheme code to cause it to connect to and interact with GDS at the right times. The following subsections describe the ways of doing this.
One option is to use GDS to catch and display any exceptions that
are thrown by the application's code. If you already have a
lazy-catch or with-throw-handler around the area of code
that you want to monitor, you just need to add the following to the
handler code:
(gds-debug-trap (throw->trap-context key args))
where key and args are the first and rest arguments that
Guile passes to the handler. (In other words, they assume the handler
signature (lambda (key . args) ...).) With Guile 1.8 or
later, you can also do this with a catch, by adding this same
code to the catch's pre-unwind handler.
If you don't already have any of these, insert a whole
with-throw-handler expression (or lazy-catch if your Guile
is pre-1.8) around the code of interest like this:
(with-throw-handler #t
  (lambda ()
    ;; Protected code here.
    )
  (lambda (key . args)
    (gds-debug-trap (throw->trap-context key args))))
Either way, you will need to use the (ice-9 gds-client) and
(ice-9 debugging traps) modules.
Two special cases of this are the lazy-catch that the Guile REPL code
uses to catch exceptions in user code, and the lazy-catch inside the
stack-catch utility procedure that is provided by the
(ice-9 stack-catch) module. Both of these use a handler called
lazy-handler-dispatch (defined in boot-9.scm), which you
can hook into such that it calls GDS to display the stack when an
exception occurs. To do this, use the on-lazy-handler-dispatch
procedure as follows.
(use-modules (ice-9 gds-client)
(ice-9 debugging traps))
(on-lazy-handler-dispatch gds-debug-trap)
After this the program will use GDS to display the stack whenever it
hits an exception that is protected by a lazy-catch using
lazy-handler-dispatch.
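To see the handler signature in isolation, here is a small sketch that runs without GDS. It wraps with-throw-handler in a catch: the throw handler records the key and arguments at the point of the throw, and the outer catch then unwinds and returns a fallback value. The names run-protected, observed and my-error are invented for this example; in a GDS setup the recording step is where (gds-debug-trap (throw->trap-context key args)) would go.

```scheme
;; What the throw handler saw, as a list (key arg ...), or #f.
(define observed #f)

;; Hypothetical wrapper: run THUNK, record any throw, then recover.
(define (run-protected thunk)
  (catch #t
    (lambda ()
      (with-throw-handler #t
        thunk
        ;; Same signature as in the text: key first, then the rest.
        ;; This runs before the stack is unwound.
        (lambda (key . args)
          (set! observed (cons key args)))))
    ;; The outer catch unwinds the stack and supplies a fallback value.
    (lambda (key . args)
      'recovered)))

(define result
  (run-protected (lambda () (throw 'my-error "bad input" 42))))
```

After this runs, result holds the fallback value from the outer catch, while observed holds the key and arguments that the throw handler received.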
In addition to setting an exception handler as described above, a Guile program can in principle set itself up to accept new instructions from GDS at any time, not just when it has stopped at an exception. This would allow the GDS user to evaluate code in the context of the running program, without having to wait for the program to stop first.
(use-modules (ice-9 gds-client))
(gds-accept-input #t)
gds-accept-input causes the calling program to loop processing
instructions from GDS, until GDS sends the continue instruction.
This blocks the thread that calls it, however, so it will normally be
more practical for the program to set up a dedicated GDS thread and call
gds-accept-input from that thread.
For select-driven applications, an alternative approach would be
for the GDS client code to provide an API which allowed the application
to
* include the file descriptors or ports used by GDS in its own
  select call
* call into the GDS client code whenever select indicated data
  available for reading on those descriptors/ports.
This approach is not yet implemented, though.
The “utility” Guile client mentioned above is a simple combination of the mechanisms that we have just described. In fact the code for the utility Guile client is essentially just this:
(use-modules (ice-9 gds-client))
(named-module-use! '(guile-user) '(ice-9 session))
(gds-accept-input #f)
The named-module-use! line ensures that the client can process
help and apropos expressions, to implement lookups in
Guile's online help. The #f parameter to
gds-accept-input means that the continue instruction
will not cause the instruction loop to exit, which makes sense here
because the utility client has nothing to do except to process GDS
instructions.
The utility client does not use on-lazy-handler-dispatch at its
top level, because it has its own mechanism for catching and reporting
exceptions in the code that it is asked to evaluate. This mechanism
summarizes the exception and gives the user a button they can click to
see the full stack, so the end result is very similar to what
on-lazy-handler-dispatch provides. Deep inside
gds-accept-input, in the part that handles evaluating
expressions from Emacs, the GDS client code uses
throw->trap-context and gds-debug-trap to implement
this.
The following subsections describe the facilities and key sequences that
GDS provides for working on code in scheme-mode buffers.
The following keystrokes provide fast and convenient access to Guile's built in help, and to completion with respect to the set of defined and accessible symbols.
* Get Guile help for a particular symbol, with the same results as if
  you had typed (help SYMBOL) into the Guile REPL
  (gds-help-symbol). The symbol to query defaults to the word at
  or before the cursor but can also be entered or edited in the
  minibuffer. The available help is popped up in a temporary Emacs
  window.
* List all accessible Guile symbols matching a given regular
  expression, with the same results as if you had typed
  (apropos REGEXP) into
  the Guile REPL (gds-apropos). The regexp to query defaults to
  the word at or before the cursor but can also be entered or edited in
  the minibuffer. The list of matching symbols is popped up in a
  temporary Emacs window.
* Try to complete the symbol at the cursor position
  (gds-complete-symbol). If there are any extra
  characters that can be definitively added to the symbol at point, they
  are inserted. Otherwise, if there are any completions available, they
  are popped up in a temporary Emacs window, where one of them can be
  selected using either <RET> or the mouse.
The following keystrokes and commands provide various ways of sending code to a Guile client process for evaluation.
gds-eval-defun).
gds-eval-last-sexp). This is designed so that it is easy to
evaluate an expression that you have just finished typing.
gds-eval-expression).
gds-eval-region). Note that GDS does not check whether the
region contains a balanced expression, or try to expand the region so
that it does; it uses the region exactly as it is.
If you type C-u before one of these commands, GDS will
immediately pop up a Scheme stack buffer, showing the requested
evaluation, so that you can single step through it. (This is achieved
by setting a <source-trap> trap at the start of the requested
evaluation; see Source Traps for more on how those work.) The
Scheme stack display, and the options for continuing through the code,
are described in the next two sections.
When you specify gds-debug-trap as the behaviour for a trap and
the Guile program concerned hits that trap, GDS displays the stack and
the relevant Scheme source code in Emacs, allowing you to explore the
state of the program and then decide what to do next. The same
applies if the program calls (on-lazy-handler-dispatch
gds-debug-trap) and then throws an exception that passes through
lazy-handler-dispatch, except that in this case you can only
explore; it isn't possible to continue normal execution after an
exception.
The following commands are available in the stack buffer for exploring the state of the program.
gds-up). GDS displays stack frames with the innermost at the
top, so moving “up” means selecting a more “inner” frame.
gds-down). GDS displays stack frames with the innermost at the
top, so moving “down” means selecting a more “outer” frame.
gds-select-stack-frame). This
is useful after clicking somewhere in the stack trace with the mouse.
Selecting a frame means that GDS will display the source code
corresponding to that frame in the adjacent window, and that
subsequent frame-sensitive commands, such as gds-evaluate (see
below) and gds-step-over (see Continuing Execution), will
refer to that frame.
gds-evaluate). The result is displayed in
the echo area.
gds-frame-info). This includes what type of frame it is, the
associated expression, and the frame's source location, if any.
gds-frame-args).
gds-proc-source). The source code (where
available) is displayed in the echo area.
S (gds-proc-source) is useful when the procedure being
called was created by an anonymous (lambda ...) expression.
Such procedures appear in the stack trace as <procedure #f
(...)>, which doesn't give you much clue as to what will happen
next. S will show you the procedure's code, which is usually
enough for you to identify it.
If it makes sense to continue execution from the stack which is being displayed, GDS provides the following further commands in the stack buffer.
gds-go). It may of course
stop again if it hits another trap, or another occurrence of the same
trap.
The multiple keystrokes reflect that you can think of this as “going”,
“continuing” or “quitting” (in the sense of quitting the GDS
display).
gds-step-file).
In other words, you can hit <SPC> repeatedly to step through the code in a given file, automatically stepping over any evaluations or procedure calls that use code from other files (or from no file).
If the selected stack frame has no source, the effect of this command is
the same as that of i, described next.
gds-step-into). i therefore steps
through code at the most detailed level possible.
gds-step-over).
Note that the program may stop before then if it hits another trap; in
this case the trap telling it to stop when the marked frame completes
remains in place and so will still fire at the appropriate point.
The first time that you use one of GDS's evaluation, help or completion commands from a given Scheme mode buffer, GDS will ask which Guile client program you want to use for the operation, or if you want to start up a new “utility” client. After that GDS considers the buffer to be “associated” with the selected client, and so sends all further requests to that client, but you can override this by explicitly associating the buffer with a different client, or by removing the default association.
When a buffer is associated with a client program, the buffer's modeline shows whether the client is currently able to accept instruction from GDS. This is done by adding one of the following suffixes to the “Scheme” major mode indicator:
Create a file, testgds.scm say, for experimenting with GDS and Scheme code, and type this into it:
(use-modules (ice-9 debugging traps)
             (ice-9 gds-client)
             (ice-9 debugging example-fns))
(install-trap (make <procedure-trap>
                #:behaviour gds-debug-trap
                #:procedure fact1))
Now select all of this code and type C-c C-r to send the selected region to Guile for evaluation. GDS will ask you which Guile process to use; unless you know that you already have another Guile application running and connected to GDS, choose the “Start a new Guile” option, which starts one of the “utility” processes described in GDS Getting Started.
The results of the evaluation pop up in a window like this:
(use-modules (ice-9 debugging traps)\n ...
;;; Evaluating subexpression 1 in current module (guile-user)
=> no (or unspecified) value
;;; Evaluating subexpression 2 in current module (guile-user)
=> no (or unspecified) value
--:** *Guile Evaluation* (Scheme:ready)--All------------
This tells you that the evaluation was successful but that the return
values were unspecified. Its effect was to load a module of example
functions and set a trap on one of these functions, fact1, that
calculates the factorial of its argument.
If you now call fact1, you can see the trap and GDS's stack
display in action. To do this add
(fact1 4)
to your testgds.scm buffer and type C-x C-e (which evaluates the expression that ends just before the cursor). The result should be that a GDS stack window like the following appears:
Calling procedure:
=> s [fact1 4]
s [primitive-eval (fact1 4)]
--:** PID 28729 (Guile-Debug)--All------------
This stack tells you that Guile is about to call the fact1
procedure, with argument 4, and you can step through this call in
detail by pressing i once and then <SPC>
(see Continuing Execution).
(i is needed as the first keystroke rather than <SPC>,
because the aim here is to step through code in the (ice-9
debugging example-fns) module, whose source file is
.../ice-9/debugging/example-fns.scm, but the initial
(fact1 4) call comes from the Guile session, whose “source
file” Guile presents as standard input. If you start by
pressing <SPC> instead of i, the effect is that the
program runs until it hits the first recursive call (fact1 (- n
1)), where it stops because of the trap on fact1 firing again.
At this point, the source file is
.../ice-9/debugging/example-fns.scm, because the recursive
(fact1 (- n 1)) call comes from code in that file, so further
pressing of <SPC> successfully single-steps through this
file.)
This part of the manual explains the general concepts that you need to understand when interfacing to Guile from C. You will learn about how the latent typing of Scheme is embedded into the static typing of C, how the garbage collection of Guile is made available to C code, and how continuations influence the control flow in a C program.
This knowledge should make it straightforward to add new functions to Guile that can be called from Scheme. Adding new data types is also possible and is done by defining smobs.
The Programming Overview section of this part contains general musings and guidelines about programming with Guile. It explores different ways to design a program around Guile, or how to embed Guile into existing programs.
There is also a pedagogical yet detailed explanation of how the data representation of Guile is implemented; see Data Representation. You don't need to know the details given there to use Guile from C, but they are useful when you want to modify Guile itself or when you are just curious about how it is all done.
For detailed reference information on the variables, functions etc. that make up Guile's application programming interface (API), see API Reference.
This section covers the mechanics of linking your program with Guile on a typical POSIX system.
The header file <libguile.h> provides declarations for all of
Guile's functions and constants. You should #include it at the
head of any C source file that uses identifiers described in this
manual. Once you've compiled your source files, you need to link them
against the Guile object code library, libguile.
On most systems, you should not need to tell the compiler and linker
explicitly where they can find libguile.h and libguile.
When Guile has been installed in a peculiar way, or when you are on a
peculiar system, things might not be so easy and you might need to pass
additional -I or -L options to the compiler. Guile
provides the utility program guile-config to help you find the
right values for these options. You would typically run
guile-config during the configuration phase of your program and
use the obtained information in the Makefile.
To initialize Guile, you can use one of several functions. The first,
scm_with_guile, is the most portable way to initialize Guile. It
will initialize Guile when necessary and then call a function that you
can specify. Multiple threads can call scm_with_guile
concurrently and it can also be called more than once in a given thread.
The global state of Guile will survive from one call of
scm_with_guile to the next. Your function is called from within
scm_with_guile since the garbage collector of Guile needs to know
where the stack of each thread is.
A second function, scm_init_guile, initializes Guile for the
current thread. When it returns, you can use the Guile API in the
current thread. This function employs some non-portable magic to learn
about stack bounds and might thus not be available on all platforms.
One common way to use Guile is to write a set of C functions which
perform some useful task, make them callable from Scheme, and then link
the program with Guile. This yields a Scheme interpreter just like
guile, but augmented with extra functions for some specific
application — a special-purpose scripting language.
In this situation, the application should probably process its
command-line arguments in the same manner as the stock Guile
interpreter. To make that straightforward, Guile provides the
scm_boot_guile and scm_shell functions.
Here is simple-guile.c, source code for a main and an
inner_main function that will produce a complete Guile
interpreter.
/* simple-guile.c --- how to start up the Guile
   interpreter from C code.  */

/* Get declarations for all the scm_ functions.  */
#include <libguile.h>

static void
inner_main (void *closure, int argc, char **argv)
{
  /* module initializations would go here */
  scm_shell (argc, argv);
}

int
main (int argc, char **argv)
{
  scm_boot_guile (argc, argv, inner_main, 0);
  return 0; /* never reached */
}
The main function calls scm_boot_guile to initialize
Guile, passing it inner_main. Once scm_boot_guile is
ready, it invokes inner_main, which calls scm_shell to
process the command-line arguments in the usual way.
Here is a Makefile which you can use to compile the above program. It
uses guile-config to learn about the necessary compiler and
linker flags.
# Use GCC, if you have it installed.
CC=gcc

# Tell the C compiler where to find <libguile.h>
CFLAGS=`guile-config compile`

# Tell the linker what libraries to use and where to find them.
LIBS=`guile-config link`

simple-guile: simple-guile.o
	${CC} simple-guile.o ${LIBS} -o simple-guile

simple-guile.o: simple-guile.c
	${CC} -c ${CFLAGS} simple-guile.c
If you are using the GNU Autoconf package to make your application more
portable, Autoconf will settle many of the details in the Makefile above
automatically, making it much simpler and more portable; we recommend
using Autoconf with Guile. Guile also provides the GUILE_FLAGS
macro for autoconf that performs all necessary checks. Here is a
configure.in file for simple-guile that uses this macro.
Autoconf can use this file as a template to generate a configure
script. In order for Autoconf to find the GUILE_FLAGS macro, you
will need to run aclocal first (see Invoking aclocal).
AC_INIT(simple-guile.c)
# Find a C compiler.
AC_PROG_CC
# Check for Guile
GUILE_FLAGS
# Generate a Makefile, based on the results.
AC_OUTPUT(Makefile)
Here is a Makefile.in template, from which the configure
script produces a Makefile customized for the host system:
# The configure script fills in these values.
CC=@CC@
CFLAGS=@GUILE_CFLAGS@
LIBS=@GUILE_LDFLAGS@

simple-guile: simple-guile.o
	${CC} simple-guile.o ${LIBS} -o simple-guile

simple-guile.o: simple-guile.c
	${CC} -c ${CFLAGS} simple-guile.c
The developer should use Autoconf to generate the configure script from the configure.in template, and distribute configure with the application. Here's how a user might go about building the application:
$ ls
Makefile.in configure* configure.in simple-guile.c
$ ./configure
creating cache ./config.cache
checking for gcc... (cached) gcc
checking whether the C compiler (gcc ) works... yes
checking whether the C compiler (gcc ) is a cross-compiler... no
checking whether we are using GNU C... (cached) yes
checking whether gcc accepts -g... (cached) yes
checking for Guile... yes
creating ./config.status
creating Makefile
$ make
gcc -c -I/usr/local/include simple-guile.c
gcc simple-guile.o -L/usr/local/lib -lguile -lqthreads -lpthread -lm -o simple-guile
$ ./simple-guile
guile> (+ 1 2 3)
6
guile> (getpwnam "jimb")
#("jimb" "83Z7d75W2tyJQ" 4008 10 "Jim Blandy" "/u/jimb"
"/usr/local/bin/bash")
guile> (exit)
$
The previous section has briefly explained how to write programs that
make use of an embedded Guile interpreter. But sometimes, all you
want to do is make new primitive procedures and data types available
to the Scheme programmer. Writing a new version of guile is
inconvenient in this case and it would in fact make the life of the
users of your new features needlessly hard.
For example, suppose that there is a program guile-db that is a
version of Guile with additional features for accessing a database.
People who want to write Scheme programs that use these features would
have to use guile-db instead of the usual guile program.
Now suppose that there is also a program guile-gtk that extends
Guile with access to the popular Gtk+ toolkit for graphical user
interfaces. People who want to write GUIs in Scheme would have to use
guile-gtk. Now, what happens when you want to write a Scheme
application that uses a GUI to let the user access a database? You
would have to write a third program that incorporates both the
database stuff and the GUI stuff. This might not be easy (because
guile-gtk might be a quite obscure program, say) and taking this
example further makes it easy to see that this approach cannot work in
practice.
It would have been much better if both the database features and the GUI
features had been provided as libraries that can just be linked with
guile. Guile makes it easy to do just this, and we encourage you
to make your extensions to Guile available as libraries whenever
possible.
You write the new primitive procedures and data types in the normal fashion, and link them into a shared library instead of into a stand-alone program. The shared library can then be loaded dynamically by Guile.
This section explains how to make the Bessel functions of the C library
available to Scheme. First we need to write the appropriate glue code
to convert the arguments and return values of the functions from Scheme
to C and back. Additionally, we need a function that will add them to
the set of Guile primitives. Because this is just an example, we will
only implement this for the j0 function.
Consider the following file bessel.c.
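#include <math.h>
#include <libguile.h>

/* Convert the Scheme argument to a C double, apply j0, and
   convert the result back to a Scheme value.  */
SCM
j0_wrapper (SCM x)
{
  return scm_from_double (j0 (scm_to_double (x)));
}

/* Register j0_wrapper as the Scheme primitive "j0", taking one
   required argument.  */
void
init_bessel (void)
{
  scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
}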
This C source file needs to be compiled into a shared library. Here is how to do it on GNU/Linux:
gcc -shared -o libguile-bessel.so -fPIC bessel.c
For creating shared libraries portably, we recommend the use of GNU Libtool (see Introduction).
A shared library can be loaded into a running Guile process with the
function load-extension. In addition to the name of the
library to load, this function also expects the name of a function from
that library that will be called to initialize it. For our example,
we are going to call the function init_bessel which will make
j0_wrapper available to Scheme programs with the name
j0. Note that we do not specify a filename extension such as
.so when invoking load-extension. The right extension for
the host platform will be provided automatically.
(load-extension "libguile-bessel" "init_bessel")
(j0 2)
=> 0.223890779141236
For this to work, load-extension must be able to find
libguile-bessel, of course. It will look in the places that
are usual for your operating system, and it will additionally look
into the directories listed in the LTDL_LIBRARY_PATH
environment variable.
To see how these Guile extensions via shared libraries relate to the module system, see Putting Extensions into Modules.
When you want to embed the Guile Scheme interpreter into your program or library, you need to link it against the libguile library (see Linking Programs With Guile). Once you have done this, your C code has access to a number of data types and functions that can be used to invoke the interpreter, or make new functions that you have written in C available to be called from Scheme code, among other things.
Scheme is different from C in a number of significant ways, and Guile tries to make the advantages of Scheme available to C as well. Thus, in addition to a Scheme interpreter, libguile also offers dynamic types, garbage collection, continuations, arithmetic on arbitrary sized numbers, and other things.
The two fundamental concepts are dynamic types and garbage collection. You need to understand how libguile offers them to C programs in order to use the rest of libguile. Also, the more general control flow of Scheme caused by continuations needs to be dealt with.
Asynchronous signal handlers and multi-threading are already familiar to C programmers, but there are of course a few additional rules when using them together with libguile.
Scheme is a dynamically-typed language; this means that the system cannot, in general, determine the type of a given expression at compile time. Types only become apparent at run time. Variables do not have fixed types; a variable may hold a pair at one point, an integer at the next, and a thousand-element vector later. Instead, values, not variables, have fixed types.
In order to implement standard Scheme functions like pair? and
string? and provide garbage collection, the representation of
every value must contain enough information to accurately determine its
type at run time. Often, Scheme systems also use this information to
determine whether a program has attempted to apply an operation to an
inappropriately typed value (such as taking the car of a string).
Because variables, pairs, and vectors may hold values of any type, Scheme implementations use a uniform representation for values — a single type large enough to hold either a complete value or a pointer to a complete value, along with the necessary typing information.
In Guile, this uniform representation of all Scheme values is the C type
SCM. This is an opaque type and its size is typically equivalent
to that of a pointer to void. Thus, SCM values can be
passed around efficiently and they take up reasonably little storage on
their own.
The most important rule is: You never access a SCM value
directly; you only pass it to functions or macros defined in libguile.
As an obvious example, although a SCM variable can contain
integers, you can of course not compute the sum of two SCM values
by adding them with the C + operator. You must use the libguile
function scm_sum.
Less obvious and therefore more important to keep in mind is that you
also cannot directly test SCM values for trueness. In Scheme,
the value #f is considered false and of course a SCM
variable can represent that value. But there is no guarantee that the
SCM representation of #f looks false to C code as well.
You need to use scm_is_true or scm_is_false to test a
SCM value for trueness or falseness, respectively.
You also cannot directly compare two SCM values to find out
whether they are identical (that is, whether they are eq? in
Scheme terms). You need to use scm_is_eq for this.
The one exception is that you can directly assign a SCM value to
a SCM variable by using the C = operator.
The following (contrived) example shows how to do it right. It implements a function of two arguments (a and flag) that returns a+1 if flag is true, else it returns a unchanged.
SCM
my_incrementing_function (SCM a, SCM flag)
{
  SCM result;

  if (scm_is_true (flag))
    result = scm_sum (a, scm_from_int (1));
  else
    result = a;

  return result;
}
Often, you need to convert between SCM values and appropriate C
values. For example, we needed to convert the integer 1 to its
SCM representation in order to add it to a. Libguile
provides many functions to do these conversions, both from C to
SCM and from SCM to C.
The conversion functions follow a common naming pattern: those that make
a SCM value from a C value have names of the form
scm_from_type (...) and those that convert a SCM
value to a C value use the form scm_to_type (...).
However, it is best to avoid converting values when you can. When you
must combine C values and SCM values in a computation, it is
often better to convert the C values to SCM values and do the
computation by using libguile functions than to do it the other way around
(converting SCM to C and doing the computation some other way).
As a simple example, consider this version of
my_incrementing_function from above:
SCM
my_other_incrementing_function (SCM a, SCM flag)
{
  int result;

  if (scm_is_true (flag))
    result = scm_to_int (a) + 1;
  else
    result = scm_to_int (a);

  return scm_from_int (result);
}
This version is much less general than the original one: it will only
work for values of a that fit into an int. The original
function will work for all values that Guile can represent and that
scm_sum can understand, including integers bigger than long
long, floating point numbers, complex numbers, and new numerical types
that have been added to Guile by third-party libraries.
Also, computing with SCM is not necessarily inefficient. Small
integers will be encoded directly in the SCM value, for example,
and do not need any additional memory on the heap. See Data Representation to find out the details.
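The idea that a small integer can live directly inside a pointer-sized word can be sketched in plain C. The following is a toy model only, not Guile's actual encoding (Data Representation describes that): the type toy_value, the choice of the lowest bit as a tag, and all toy_ names are assumptions made up for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy representation, not Guile's real layout: a word
   whose lowest bit is 1 holds a small integer in the remaining
   bits; a word whose lowest bit is 0 would hold an aligned pointer
   to a heap object.  */
typedef uintptr_t toy_value;

/* Return true if V carries the small-integer tag.  */
bool
toy_is_small_int (toy_value v)
{
  return (v & 1) == 1;
}

/* Encode N: shift left to make room for the tag bit, then set it.  */
toy_value
toy_from_int (intptr_t n)
{
  return ((uintptr_t) n << 1) | 1;
}

/* Decode V: remove the tag bit and undo the shift.  Dividing by 2
   keeps the sign without relying on how >> treats negative values.  */
intptr_t
toy_to_int (toy_value v)
{
  return ((intptr_t) v - 1) / 2;
}
```

Even in this toy model, adding two encoded values with the C + operator gives a word that no longer carries the integer tag, which is one reason why values of such a type should only be handled through the functions provided for them.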
Some special SCM values are available to C code without needing
to convert them from C values:
Scheme value | C representation
#f           | SCM_BOOL_F
#t           | SCM_BOOL_T
()           | SCM_EOL
In addition to SCM, Guile also defines the related type
scm_t_bits. This is an unsigned integral type of sufficient
size to hold all information that is directly contained in a
SCM value. The scm_t_bits type is used internally by
Guile to do all the bit twiddling explained in Data Representation, but you will encounter it occasionally in low-level
user code as well.
As explained above, the SCM type can represent all Scheme values.
Some values fit entirely into a SCM value (such as small
integers), but other values require additional storage in the heap (such
as strings and vectors). This additional storage is managed
automatically by Guile. You don't need to explicitly deallocate it
when a SCM value is no longer used.
Two things must be guaranteed so that Guile is able to manage the storage automatically: it must know about all blocks of memory that have ever been allocated for Scheme values, and it must know about all Scheme values that are still being used. Given this knowledge, Guile can periodically free all blocks that have been allocated but are not used by any active Scheme values. This activity is called garbage collection.
It is easy for Guile to remember all blocks of memory that it has allocated for use by Scheme values, but you need to help it with finding all Scheme values that are in use by C code.
You do this when writing a SMOB mark function, for example
(see Garbage Collecting Smobs). By calling this function, the
garbage collector learns about all references that your SMOB has to
other SCM values.
Other references to SCM objects, such as global variables of type
SCM or other random data structures in the heap that contain
fields of type SCM, can be made visible to the garbage collector
by calling the functions scm_gc_protect_object or
scm_permanent_object. You normally use these functions for long
lived objects such as a hash table that is stored in a global variable.
For temporary references in local variables or function arguments, using
these functions would be too expensive.
These references are handled differently: Local variables (and function
arguments) of type SCM are automatically visible to the garbage
collector. This works because the collector scans the stack for
potential references to SCM objects and considers all referenced
objects to be alive. The scanning considers each and every word of the
stack, regardless of what it is actually used for, and then decides
whether it could possibly be a reference to a SCM object. Thus,
the scanning is guaranteed to find all actual references, but it might
also find words that only accidentally look like references. These
`false positives' might keep SCM objects alive that would
otherwise be considered dead. While this might waste memory, keeping an
object around longer than it strictly needs to is harmless. This is why
this technique is called “conservative garbage collection”. In
practice, the wasted memory seems to be no problem.
The stack of every thread is scanned in this way and the registers of the CPU and all other memory locations where local variables or function parameters might show up are included in this scan as well.
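The decision "could this word possibly be a reference?" can be illustrated with a deliberately simplified C sketch. It is not Guile's collector: the fixed-size toy_heap array, the address-range test, and all toy_ names are assumptions chosen to show why an ordinary integer that happens to equal a heap address is treated like a reference.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy heap: objects live in this array, so any word
   whose value is the address of one of its cells "looks like" a
   reference.  A real collector uses more refined checks.  */
#define TOY_HEAP_CELLS 64
static uintptr_t toy_heap[TOY_HEAP_CELLS];

/* Return the address of cell INDEX as a plain integer.  */
uintptr_t
toy_cell_address (size_t index)
{
  return (uintptr_t) &toy_heap[index];
}

/* Return nonzero if WORD could be the address of a toy heap cell.
   The function cannot tell whether WORD was meant as a pointer.  */
int
toy_looks_like_reference (uintptr_t word)
{
  uintptr_t start = (uintptr_t) &toy_heap[0];
  uintptr_t end = (uintptr_t) &toy_heap[TOY_HEAP_CELLS];

  return word >= start && word < end
         && (word - start) % sizeof (uintptr_t) == 0;
}

/* Examine COUNT words starting at WORDS, the way a collector
   examines every word of a stack, and count the possible
   references among them.  */
size_t
toy_count_possible_references (const uintptr_t *words, size_t count)
{
  size_t hits = 0;

  for (size_t i = 0; i < count; i++)
    if (toy_looks_like_reference (words[i]))
      hits++;
  return hits;
}
```

Because the check looks only at the bit pattern, it never misses a real reference, and it counts any accidental look-alike as one too; that is the trade-off the text above describes.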
The consequence of the conservative scanning is that you can just
declare local variables and function parameters of type SCM and
be sure that the garbage collector will not free the corresponding
objects.
However, a local variable or function parameter is only protected as
long as it is really on the stack (or in some register). As an
optimization, the C compiler might reuse its location for some other
value and the SCM object would no longer be protected. Normally,
this leads to exactly the right behavior: the compiler will only
overwrite a reference when it is no longer needed and thus the object
becomes unprotected precisely when the reference disappears, just as
wanted.
There are situations, however, where a SCM object needs to be
around longer than its reference from a local variable or function
parameter. This happens, for example, when you retrieve some pointer
from a smob and work with that pointer directly. The reference to the
SCM smob object might be dead after the pointer has been
retrieved, but the pointer itself (and the memory pointed to) is still
in use and thus the smob object must be protected. The compiler does
not know about this connection and might overwrite the SCM
reference too early.
To get around this problem, you can use scm_remember_upto_here_1
and its cousins. It will keep the compiler from overwriting the
reference. For a typical example of its use, see Remembering During Operations.
Scheme has a more general view of program flow than C, both locally and non-locally.
Controlling the local flow of control involves things like gotos, loops, calling functions and returning from them. Non-local control flow refers to situations where the program jumps across one or more levels of function activations without using the normal call or return operations.
The primitive means of C for local control flow is the goto
statement, together with if. Loops done with for,
while or do could in principle be rewritten with just
goto and if. In Scheme, the primitive means for local
control flow is the function call (together with if).
Thus, the repetition of some computation in a loop is ultimately
implemented by a function that calls itself, that is, by recursion.
This approach is theoretically very powerful since it is easier to reason formally about recursion than about gotos. In C, using recursion exclusively would not be practical, though, since it would eat up the stack very quickly. In Scheme, however, it is practical: function calls that appear in a tail position do not use any additional stack space (see Tail Calls).
A function call is in a tail position when it is the last thing the
calling function does. The value returned by the called function is
immediately returned from the calling function. In the following
example, the call to bar-1 is in a tail position, while the
call to bar-2 is not. (The call to 1- in foo-2
is in a tail position, though.)
(define (foo-1 x)
  (bar-1 (1- x)))
(define (foo-2 x)
  (1- (bar-2 x)))
Thus, when you take care to recurse only in tail positions, the recursion will only use constant stack space and will be as good as a loop constructed from gotos.
Scheme offers a few syntactic abstractions (do and named
let) that make writing loops slightly easier.
But only Scheme functions can call other functions in a tail position: C functions cannot. This matters when you have, say, two functions that call each other recursively to form a common loop. The following (unrealistic) example shows how one might go about determining whether a non-negative integer n is even or odd.
(define (my-even? n)
  (cond ((zero? n) #t)
        (else (my-odd? (1- n)))))
(define (my-odd? n)
  (cond ((zero? n) #f)
        (else (my-even? (1- n)))))
Because the calls to my-even? and my-odd? are in tail
positions, these two procedures can be applied to arbitrarily large
integers without overflowing the stack. (They will still take a lot
of time, of course.)
However, if one or both of the two procedures were rewritten in C, the C version could no longer call its companion in a tail position (since C does not have this concept). You might need to take this consideration into account when deciding which parts of your program to write in Scheme and which in C.
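If such mutually recursive procedures do have to live in C, one general workaround is a trampoline: each step records which function should run next instead of calling it, and a driver loop performs the calls, so the C stack does not grow. The sketch below applies this to the even/odd example. It is a generic C technique shown for illustration, not a facility of libguile, and every toy_ name in it is hypothetical.

```c
#include <stdbool.h>

/* What a step wants to happen next.  */
enum toy_next { TOY_CALL_EVEN, TOY_CALL_ODD, TOY_DONE };

struct toy_state
{
  enum toy_next next;   /* which step function to run next */
  unsigned long n;      /* its argument */
  bool result;          /* valid once next == TOY_DONE */
};

/* One step of my-even?: finish, or request my-odd? on n - 1.  */
void
toy_even_step (struct toy_state *s)
{
  if (s->n == 0)
    {
      s->result = true;
      s->next = TOY_DONE;
    }
  else
    {
      s->n -= 1;
      s->next = TOY_CALL_ODD;
    }
}

/* One step of my-odd?: finish, or request my-even? on n - 1.  */
void
toy_odd_step (struct toy_state *s)
{
  if (s->n == 0)
    {
      s->result = false;
      s->next = TOY_DONE;
    }
  else
    {
      s->n -= 1;
      s->next = TOY_CALL_EVEN;
    }
}

/* Driver loop: runs the requested steps one after another, so the
   stack depth stays constant however large N is.  */
bool
toy_is_even (unsigned long n)
{
  struct toy_state s = { TOY_CALL_EVEN, n, false };

  while (s.next != TOY_DONE)
    {
      if (s.next == TOY_CALL_EVEN)
        toy_even_step (&s);
      else
        toy_odd_step (&s);
    }
  return s.result;
}
```

The price is that the control flow is no longer expressed as ordinary function calls, which is exactly the convenience that tail calls give you in Scheme.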
In addition to calling functions and returning from them, a Scheme program can also exit non-locally from a function so that the control flow returns directly to an outer level. This means that some functions might not return at all.
Moreover, a Scheme program can not only jump to some outer level of control; it can also jump back into the middle of a function that has already exited. This might cause some functions to return more than once.
In general, these non-local jumps are done by invoking
continuations that have previously been captured using
call-with-current-continuation. Guile also offers a slightly
restricted set of functions, catch and throw, that can
only be used for non-local exits. This restriction makes them more
efficient. Error reporting (with the function error) is
implemented by invoking throw, for example. The functions
catch and throw belong to the topic of exceptions.
Since Scheme functions can call C functions and vice versa, C code can
experience the more general control flow of Scheme as well. It is
possible that a C function will not return at all, or will return more
than once. While C does offer setjmp and longjmp for
non-local exits, it is still an unusual thing for C code. In
contrast, non-local exits are very common in Scheme, mostly to report
errors.
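For readers less familiar with non-local exits in C, the following is a minimal sketch using only the standard setjmp and longjmp. It does not use libguile; the function names checked_half and half_or_error are hypothetical and exist only for this illustration.

```c
#include <setjmp.h>

static jmp_buf error_exit;

/* Hypothetical worker: exits non-locally when given a negative value,
   so in that case the call never returns to its caller.  */
static int
checked_half (int value)
{
  if (value < 0)
    longjmp (error_exit, 1);
  return value / 2;
}

/* Return half of VALUE, or -1 when checked_half bailed out.  */
int
half_or_error (int value)
{
  if (setjmp (error_exit) != 0)
    return -1;                  /* control arrives here via longjmp */
  return checked_half (value);
}
```

From the point of view of half_or_error, the call to checked_half either returns normally or never returns at all, which is the situation C code has to expect from libguile functions.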
You need to be prepared for the non-local jumps in the control flow
whenever you use a function from libguile: it is best to assume
that any libguile function might signal an error or run a pending
signal handler (which in turn can do arbitrary things).
It is often necessary to take cleanup actions when the control leaves a
function non-locally. Also, when the control returns non-locally, some
setup actions might be called for. For example, the Scheme function
with-output-to-port needs to modify the global state so that
current-output-port returns the port passed to
with-output-to-port. The global output port needs to be reset to
its previous value when with-output-to-port returns normally or
when it is exited non-locally. Likewise, the port needs to be set again
when control enters non-locally.
Scheme code can use the dynamic-wind function to arrange for
the setting and resetting of the global state. C code can use the
corresponding scm_internal_dynamic_wind function, or a
scm_dynwind_begin/scm_dynwind_end pair together with
suitable 'dynwind actions' (see Dynamic Wind).
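The idea of arranging cleanup for both normal and non-local exits can be modelled in plain C. The sketch below is only a toy model and does not use or reproduce the libguile dynwind API: push_cleanup, unwind_to and with_flag_set are hypothetical names, and only the resetting on exit is modelled, not the setting again on non-local re-entry.

```c
#include <setjmp.h>
#include <stddef.h>

#define MAX_CLEANUPS 8

typedef void (*cleanup_fn) (void *data);

static struct { cleanup_fn fn; void *data; } cleanups[MAX_CLEANUPS];
static size_t n_cleanups;
static jmp_buf escape;

/* Register FN to run when the current context is left.  */
static void
push_cleanup (cleanup_fn fn, void *data)
{
  if (n_cleanups < MAX_CLEANUPS)
    {
      cleanups[n_cleanups].fn = fn;
      cleanups[n_cleanups].data = data;
      n_cleanups++;
    }
}

/* Run and discard every handler registered above position MARK,
   most recent first.  */
static void
unwind_to (size_t mark)
{
  while (n_cleanups > mark)
    {
      n_cleanups--;
      cleanups[n_cleanups].fn (cleanups[n_cleanups].data);
    }
}

static void
reset_flag (void *data)
{
  *(int *) data = 0;
}

/* Set *FLAG for the duration of the body.  When FAIL is non-zero the
   body exits non-locally; the flag is reset either way.  Returns 0
   after a normal exit and 1 after a non-local exit.  */
int
with_flag_set (int *flag, int fail)
{
  size_t mark = n_cleanups;

  if (setjmp (escape) != 0)
    {
      unwind_to (mark);         /* non-local exit: still clean up */
      return 1;
    }
  *flag = 1;
  push_cleanup (reset_flag, flag);
  if (fail)
    longjmp (escape, 1);
  unwind_to (mark);             /* normal exit */
  return 0;
}
```

In both cases the caller observes the flag back at zero, which mirrors how the global output port is restored regardless of how control leaves with-output-to-port.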
Instead of coping with non-local control flow, you can also prevent it
by erecting a continuation barrier (see Continuation Barriers). The function scm_c_with_continuation_barrier, for
example, is guaranteed to return exactly once.
You can not call libguile functions from handlers for POSIX signals, but
you can register Scheme handlers for POSIX signals such as
SIGINT. These handlers do not run during the actual signal
delivery. Instead, they are run when the program (more precisely, the
thread that the handler has been registered for) reaches the next
safe point.
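The deferral described above can be sketched in plain C. This is a simplified model, not Guile's implementation: the POSIX-level handler merely records that the signal arrived, and a hypothetical safe_point function runs the deferred work later. All names in the sketch are assumptions made for illustration.

```c
#include <signal.h>

static volatile sig_atomic_t interrupt_pending;
static int handled_count;

/* The handler run at signal delivery only records the event.  */
static void
note_interrupt (int signum)
{
  (void) signum;
  interrupt_pending = 1;
}

/* Hypothetical safe point: run the deferred handler if one is pending.
   Returns 1 when a handler ran, 0 otherwise.  */
int
safe_point (void)
{
  if (!interrupt_pending)
    return 0;
  interrupt_pending = 0;
  handled_count++;              /* stands in for the Scheme handler */
  return 1;
}

/* Install the deferring handler for SIGINT.  Returns 0 on success.  */
int
install_deferred_sigint (void)
{
  return signal (SIGINT, note_interrupt) == SIG_ERR ? -1 : 0;
}
```

Between the delivery of the signal and the next call to safe_point, nothing visible happens; the deferred handler runs only at a point where the program is prepared for it.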
The libguile functions themselves have many such safe points.
Consequently, you must be prepared for arbitrary actions any time you
call a libguile function. For example, even scm_cons can contain
a safe point. When a signal handler is pending for your thread,
calling scm_cons will run this handler, and anything might happen,
including a non-local exit, although scm_cons would not ordinarily
do such a thing on its own.
If you do not want to allow the running of asynchronous signal handlers,
you can block them temporarily with scm_dynwind_block_asyncs, for
example. See System asyncs.
Since signal handling in Guile relies on safe points, you need to make sure that your functions offer enough of them. Normally, calling libguile functions in the normal course of action is all that is needed. But when a thread might spend a long time in a code section that calls no libguile function, it is good to include explicit safe points. This can allow the user to interrupt your code with <C-c>, for example.
You can do this with the macro SCM_TICK. This macro is
syntactically a statement. That is, you could use it like this:
while (1)
  {
    SCM_TICK;
    do_some_work ();
  }
Frequent execution of a safe point is even more important in multi-threaded programs (see Multi-Threading).
Guile can be used in multi-threaded programs just as well as in single-threaded ones.
Each thread that wants to use functions from libguile must put itself into guile mode and must then follow a few rules. If it doesn't want to honor these rules in certain situations, a thread can temporarily leave guile mode (but can no longer use libguile functions during that time, of course).
Threads enter guile mode by calling scm_with_guile,
scm_boot_guile, or scm_init_guile. As explained in the
reference documentation for these functions, Guile will then learn about
the stack bounds of the thread and can protect the SCM values
that are stored in local variables. When a thread puts itself into
guile mode for the first time, it gets a Scheme representation and is
listed by all-threads, for example.
While in guile mode, a thread promises to reach a safe point
reasonably frequently (see Asynchronous Signals). In addition to
running signal handlers, these points are also potential rendezvous
points of all guile mode threads where Guile can orchestrate global
things like garbage collection. Consequently, when a thread in guile
mode blocks and no longer reaches safe points, it might cause
all other guile mode threads to block as well. To prevent this from
happening, a guile mode thread should either only block in libguile
functions (which know how to do it right), or should temporarily leave
guile mode with scm_without_guile.
For some common blocking operations, Guile provides convenience
functions. For example, if you want to lock a pthread mutex while in
guile mode, you might want to use scm_pthread_mutex_lock which is
just like pthread_mutex_lock except that it leaves guile mode
while blocking.
All libguile functions are (intended to be) robust in the face of multiple threads using them concurrently. This means that there is no risk of the internal data structures of libguile becoming corrupted in such a way that the process crashes.
A program might still produce nonsensical results, though. Taking hashtables as an example, Guile guarantees that you can use them from multiple threads concurrently: a hashtable will always remain a valid hashtable, and Guile will not crash when you access it. It does not guarantee, however, that inserting into it concurrently from two threads will give useful results: only one insertion might actually happen, none might happen, or the table might in general be modified in a totally arbitrary manner. (It will still be a valid hashtable, but not the one that you might have expected.) Guile might also signal an error when it detects a harmful race condition.
Thus, you need to put in additional synchronizations when multiple threads want to use a single hashtable, or any other mutable Scheme object.
When writing C code for use with libguile, you should try to make it robust as well. An example that converts a list into a vector will help to illustrate. Here is a correct version:
SCM
my_list_to_vector (SCM list)
{
  SCM vector = scm_make_vector (scm_length (list), SCM_UNDEFINED);
  size_t len, i;

  len = SCM_SIMPLE_VECTOR_LENGTH (vector);
  i = 0;
  while (i < len && scm_is_pair (list))
    {
      SCM_SIMPLE_VECTOR_SET (vector, i, SCM_CAR (list));
      list = SCM_CDR (list);
      i++;
    }

  return vector;
}
The first thing to note is that storing into a SCM location
concurrently from multiple threads is guaranteed to be robust: you don't
know which value wins but it will in any case be a valid SCM
value.
But there is no guarantee that the list referenced by list is not
modified in another thread while the loop iterates over it. Thus, while
copying its elements into the vector, the list might get longer or
shorter. For this reason, the loop must check both that it doesn't
overrun the vector (SCM_SIMPLE_VECTOR_SET does no range-checking)
and that it doesn't overrun the list (SCM_CAR and SCM_CDR
likewise do no type checking).
It is safe to use SCM_CAR and SCM_CDR on the local
variable list once it is known that the variable contains a pair.
The contents of the pair might change spontaneously, but it will always
stay a valid pair (and a local variable will of course not spontaneously
point to a different Scheme object).
Likewise, a simple vector such as the one returned by
scm_make_vector is guaranteed to always stay the same length so
that it is safe to only use SCM_SIMPLE_VECTOR_LENGTH once and store the
result. (In the example, vector is safe anyway since it is a
fresh object that no other thread can possibly know about until it is
returned from my_list_to_vector.)
Of course the behavior of my_list_to_vector is suboptimal when
list does indeed get asynchronously lengthened or shortened in
another thread. But it is robust: it will always return a valid vector.
That vector might be shorter than expected, or its last elements might
be unspecified, but it is a valid vector and if a program wants to rule
out these cases, it must avoid modifying the list asynchronously.
Here is another version that is also correct:
SCM
my_pedantic_list_to_vector (SCM list)
{
  SCM vector = scm_make_vector (scm_length (list), SCM_UNDEFINED);
  size_t len, i;

  len = SCM_SIMPLE_VECTOR_LENGTH (vector);
  i = 0;
  while (i < len)
    {
      SCM_SIMPLE_VECTOR_SET (vector, i, scm_car (list));
      list = scm_cdr (list);
      i++;
    }

  return vector;
}
This version uses the type-checking and thread-robust functions
scm_car and scm_cdr instead of the faster, but less robust
macros SCM_CAR and SCM_CDR. When the list is shortened
(that is, when list holds a non-pair), scm_car will throw
an error. This might be preferable to just returning a half-initialized
vector.
The API for accessing vectors and arrays of various kinds from C takes a slightly different approach to thread-robustness. In order to get at the raw memory that stores the elements of an array, you need to reserve that array as long as you need the raw memory. During the time an array is reserved, its elements can still spontaneously change their values, but the memory itself and other things like the size of the array are guaranteed to stay fixed. Any operation that would change these parameters of an array that is currently reserved will signal an error. In order to avoid these errors, a program should of course put suitable synchronization mechanisms in place. As you can see, Guile itself is again only concerned about robustness, not about correctness: without proper synchronization, your program will likely not be correct, but the worst consequence is an error message.
Real thread safety often requires that a critical section of code be executed in a certain restricted manner. A common requirement is that the code section is not entered a second time when it is already being executed. Locking a mutex while in that section ensures that no other thread will start executing it, blocking asyncs ensures that no asynchronous code enters the section again from the current thread, and the error checking of Guile mutexes guarantees that an error is signalled when the current thread accidentally reenters the critical section via recursive function calls.
Guile provides two mechanisms to support critical sections as outlined
above. You can either use the macros
SCM_CRITICAL_SECTION_START and SCM_CRITICAL_SECTION_END
for very simple sections; or use a dynwind context together with a
call to scm_dynwind_critical_section.
The macros only work reliably for critical sections that are guaranteed to not cause a non-local exit. They also do not detect an accidental reentry by the current thread. Thus, you should probably only use them to delimit critical sections that do not contain calls to libguile functions or to other external functions that might do complicated things.
The function scm_dynwind_critical_section, on the other hand,
will correctly deal with non-local exits because it requires a dynwind
context. Also, by using a separate mutex for each critical section,
it can detect accidental reentries.
Smobs are Guile's mechanism for adding new primitive types to the system. The term “smob” was coined by Aubrey Jaffer, who says it comes from “small object”, referring to the fact that they are quite limited in size: they can hold just one pointer to a larger memory block plus 16 extra bits.
To define a new smob type, the programmer provides Guile with some
essential information about the type — how to print it, how to
garbage collect it, and so on — and Guile allocates a fresh type tag
for it. The programmer can then use scm_c_define_gsubr to make
a set of C functions visible to Scheme code that create and operate on
these objects.
(You can find a complete version of the example code used in this
section in the Guile distribution, in doc/example-smob. That
directory includes a makefile and a suitable main function, so
you can build a complete interactive Guile shell, extended with the
datatypes described here.)
To define a new type, the programmer must write four functions to manage instances of the type:
mark
    Tells the garbage collector about any other SCM values that the object
    has stored. The default smob mark function does nothing.
    See Garbage Collecting Smobs, for more details.
free
    Releases the resources held by the object. The default free function
    frees the smob data (if the size passed to scm_make_smob_type is non-zero)
    using scm_gc_free. See Garbage Collecting Smobs, for more
    details.
    This function operates while the heap is in an inconsistent state and
    must therefore be careful. See Smobs, for details about what this
    function is allowed to do.
print
    Prints the object, as for display or write. The default print
    function prints #<NAME ADDRESS> where NAME is the first
    argument passed to scm_make_smob_type. For more information on
    printing, see Port Data.
equalp
    When Scheme code uses the equal? function to compare two instances
    of the same smob type, Guile calls this function with the two
    instances, a and b. It should return
    SCM_BOOL_T if a and b should be considered
    equal?, or SCM_BOOL_F otherwise. If equalp is
    NULL, equal? will assume that two instances of this type are
    never equal? unless they are eq?.
To actually register the new smob type, call scm_make_smob_type.
It returns a value of type scm_t_bits which identifies the new
smob type.
The four special functions described above are registered by calling
one of scm_set_smob_mark, scm_set_smob_free,
scm_set_smob_print, or scm_set_smob_equalp, as
appropriate. Each function is intended to be used at most once per
type, and the call should be placed immediately following the call to
scm_make_smob_type.
There can be at most 256 different smob types in the system. Instead of registering a huge number of smob types (for example, one for each relevant C struct in your application), it is sometimes better to register just one and implement a second layer of type dispatching on top of it. This second layer might use the 16 extra bits to extend its type, for example.
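A second layer of dispatching might look roughly like the following sketch. It is a plain C model rather than libguile code: struct app_object merely stands in for a smob with 16 flag bits and one data word, and every name in it (app_subtype, app_object, app_coordinate_count and so on) is hypothetical.

```c
#include <stddef.h>

/* Hypothetical subtypes multiplexed onto a single registered smob type.
   In real code the value would live in the smob's 16 extra bits.  */
enum app_subtype { APP_POINT, APP_RECT, APP_SUBTYPE_COUNT };

/* Stand-in for a smob: 16 flag bits plus one data word.  */
struct app_object
{
  unsigned short flags;         /* models the 16 extra bits */
  void *data;                   /* models the immediate word */
};

typedef size_t (*count_fn) (const struct app_object *obj);

static size_t
point_coordinates (const struct app_object *obj)
{
  (void) obj;
  return 2;
}

static size_t
rect_coordinates (const struct app_object *obj)
{
  (void) obj;
  return 4;
}

/* One entry per subtype, indexed by the subtype number.  */
static const count_fn count_table[APP_SUBTYPE_COUNT] = {
  point_coordinates,            /* APP_POINT */
  rect_coordinates              /* APP_RECT */
};

/* Dispatch on the subtype; unknown subtypes yield 0.  */
size_t
app_coordinate_count (const struct app_object *obj)
{
  if (obj->flags >= APP_SUBTYPE_COUNT)
    return 0;
  return count_table[obj->flags] (obj);
}
```

The range check before the table lookup matters because the flag bits can hold any 16-bit value, not only the subtypes the application has defined.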
Here is how one might declare and register a new type representing eight-bit gray-scale images:
#include <libguile.h>

struct image {
  int width, height;
  char *pixels;

  /* The name of this image */
  SCM name;

  /* A function to call when this image is
     modified, e.g., to update the screen,
     or SCM_BOOL_F if no action necessary */
  SCM update_func;
};

static scm_t_bits image_tag;

void
init_image_type (void)
{
  image_tag = scm_make_smob_type ("image", sizeof (struct image));
  scm_set_smob_mark (image_tag, mark_image);
  scm_set_smob_free (image_tag, free_image);
  scm_set_smob_print (image_tag, print_image);
}
Normally, smobs can have one immediate word of data. This word
stores either a pointer to an additional memory block that holds the
real data, or it might hold the data itself when it fits. The word is
large enough for a SCM value, a pointer to void, or an
integer that fits into a size_t or ssize_t.
You can also create smobs that have two or three immediate words, and when these words suffice to store all data, it is more efficient to use these super-sized smobs instead of using a normal smob plus a memory block. See Double Smobs, for their discussion.
Guile provides functions for managing memory which are often helpful when implementing smobs. See Memory Blocks.
To retrieve the immediate word of a smob, you use the macro
SCM_SMOB_DATA. It can be set with SCM_SET_SMOB_DATA.
The 16 extra bits can be accessed with SCM_SMOB_FLAGS and
SCM_SET_SMOB_FLAGS.
The two macros SCM_SMOB_DATA and SCM_SET_SMOB_DATA treat
the immediate word as if it were of type scm_t_bits, which is
an unsigned integer type large enough to hold a pointer to
void. Thus you can use these macros to store arbitrary
pointers in the smob word.
When you want to store a SCM value directly in the immediate
word of a smob, you should use the macros SCM_SMOB_OBJECT and
SCM_SET_SMOB_OBJECT to access it.
Creating a smob instance can be tricky when it consists of multiple steps that allocate resources and might fail. It is recommended that you go about creating a smob in the following way:
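1. Allocate the memory block for holding the data with scm_gc_malloc.
2. Initialize it to a valid state without calling any function that might cause a non-local exit. Do not yet store SCM values in it that must be protected. Initialize these fields with SCM_BOOL_F.
   A valid state is one that can be safely acted upon by the mark and free functions of your smob type.
3. Create the smob using SCM_NEWSMOB, passing it the initialized
   memory block. (This step will always succeed.)
4. Finish the initialization, for example by storing the SCM values and allocating additional resources.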
This procedure ensures that the smob is in a valid state as soon as it
exists, that all resources that are allocated for the smob are
properly associated with it so that they can be properly freed, and
that no SCM values that need to be protected are stored in it
while the smob does not yet completely exist and thus cannot protect
them.
Continuing the example from above, if the global variable
image_tag contains a tag returned by scm_make_smob_type,
here is how we could construct a smob whose immediate word contains a
pointer to a freshly allocated struct image:
SCM
make_image (SCM name, SCM s_width, SCM s_height)
{
  SCM smob;
  struct image *image;
  int width = scm_to_int (s_width);
  int height = scm_to_int (s_height);

  /* Step 1: Allocate the memory block.
   */
  image = (struct image *) scm_gc_malloc (sizeof (struct image), "image");

  /* Step 2: Initialize it with straight code.
   */
  image->width = width;
  image->height = height;
  image->pixels = NULL;
  image->name = SCM_BOOL_F;
  image->update_func = SCM_BOOL_F;

  /* Step 3: Create the smob.
   */
  SCM_NEWSMOB (smob, image_tag, image);

  /* Step 4: Finish the initialization.
   */
  image->name = name;
  image->pixels = scm_gc_malloc (width * height, "image pixels");

  return smob;
}
Let us look at what might happen when make_image is called.
The conversions of s_width and s_height to ints might
fail and signal an error, thus causing a non-local exit. This is not a
problem since no resources have been allocated yet that would have to be
freed.
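The example does not validate the converted dimensions. Because this happens before any resource is allocated, it would also be a convenient place to reject negative sizes or a width and height whose product overflows. The helper below is a hedged sketch of such a check; image_area is a hypothetical name and is not part of the original example or of libguile.

```c
#include <limits.h>

/* Hypothetical helper: store width * height in *AREA and return 1, or
   return 0 (leaving *AREA untouched) when either dimension is negative
   or the product does not fit in an int.  */
int
image_area (int width, int height, int *area)
{
  if (width < 0 || height < 0)
    return 0;
  if (height != 0 && width > INT_MAX / height)
    return 0;                   /* width * height would overflow */
  *area = width * height;
  return 1;
}
```

A caller could signal an error when the helper returns 0; since nothing has been allocated at that point, a non-local exit there is harmless.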
The allocation of image in step 1 might fail, but this is likewise no problem.
Step 2 can not exit non-locally. At the end of it, the image
struct is in a valid state for the mark_image and
free_image functions (see below).
Step 3 can not exit non-locally either. This is guaranteed by Guile. After it, smob contains a valid smob that is properly initialized and protected, and in turn can properly protect the Scheme values in its image struct.
But before the smob is completely created, SCM_NEWSMOB might
cause the garbage collector to run. During this garbage collection, the
SCM values in the image struct would be invisible to Guile.
It only gets to know about them via the mark_image function, but
that function can not yet do its job since the smob has not been created
yet. Thus, it is important to not store SCM values in the
image struct until after the smob has been created.
Step 4, finally, might fail and cause a non-local exit. In that case,
the complete creation of the smob has not been successful, but it does
nevertheless exist in a valid state. It will eventually be freed by
the garbage collector, and all the resources that have been allocated
for it will be correctly freed by free_image.
Functions that operate on smobs should check that the passed
SCM value indeed is a suitable smob before accessing its data.
They can do this with scm_assert_smob_type.
For example, here is a simple function that operates on an image smob, and checks the type of its argument.
SCM
clear_image (SCM image_smob)
{
  int area;
  struct image *image;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  area = image->width * image->height;
  memset (image->pixels, 0, area);

  /* Invoke the image's update function.
   */
  if (scm_is_true (image->update_func))
    scm_call_0 (image->update_func);

  scm_remember_upto_here_1 (image_smob);

  return SCM_UNSPECIFIED;
}
See Remembering During Operations for an explanation of the call
to scm_remember_upto_here_1.
Once a smob has been released to the tender mercies of the Scheme system, it must be prepared to survive garbage collection. Guile calls the mark and free functions of the smob to manage this.
As described in more detail elsewhere (see Conservative GC), every object in the Scheme system has a mark bit, which the garbage collector uses to tell live objects from dead ones. When collection starts, every object's mark bit is clear. The collector traces pointers through the heap, starting from objects known to be live, and sets the mark bit on each object it encounters. When it can find no more unmarked objects, the collector walks all objects, live and dead, frees those whose mark bits are still clear, and clears the mark bit on the others.
The two main portions of the collection are called the mark phase, during which the collector marks live objects, and the sweep phase, during which the collector frees all unmarked objects.
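The two phases can be modelled with a toy heap in plain C. This is only an illustration of the general mark-and-sweep idea, not Guile's collector: the heap is a fixed array, each object holds at most one reference, and all names (toy_object, toy_heap, toy_collect) are hypothetical.

```c
#include <stddef.h>

#define HEAP_SIZE 4

/* Toy object: a mark bit, an "allocated" flag and at most one
   reference to another object.  */
struct toy_object
{
  int marked;
  int in_use;
  struct toy_object *ref;
};

struct toy_object toy_heap[HEAP_SIZE];

/* Mark phase: set the mark bit before following the reference, so
   that circular structures terminate.  */
static void
toy_mark (struct toy_object *obj)
{
  while (obj != NULL && !obj->marked)
    {
      obj->marked = 1;
      obj = obj->ref;
    }
}

/* Sweep phase: free unmarked objects and clear the mark bit on the
   survivors.  Returns the number of objects freed.  */
static size_t
toy_sweep (void)
{
  size_t i, freed = 0;

  for (i = 0; i < HEAP_SIZE; i++)
    {
      if (!toy_heap[i].in_use)
        continue;
      if (toy_heap[i].marked)
        toy_heap[i].marked = 0;
      else
        {
          toy_heap[i].in_use = 0;
          toy_heap[i].ref = NULL;
          freed++;
        }
    }
  return freed;
}

/* Collect with a single root; returns the number of objects freed.  */
size_t
toy_collect (struct toy_object *root)
{
  toy_mark (root);
  return toy_sweep ();
}
```

After a collection every surviving object has its mark bit clear again, so the next collection starts from the same state.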
The mark bit of a smob lives in a special memory region. When the collector encounters a smob, it sets the smob's mark bit, and uses the smob's type tag to find the appropriate mark function for that smob. It then calls this mark function, passing it the smob as its only argument.
The mark function is responsible for marking any other Scheme
objects the smob refers to. If it does not do so, the objects' mark
bits will still be clear when the collector begins to sweep, and the
collector will free them. If this occurs, it will probably break, or at
least confuse, any code operating on the smob; the smob's SCM
values will have become dangling references.
To mark an arbitrary Scheme object, the mark function calls
scm_gc_mark.
Thus, here is how we might write mark_image:
SCM
mark_image (SCM image_smob)
{
  /* Mark the image's name and update function.  */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  scm_gc_mark (image->update_func);

  return SCM_BOOL_F;
}
Note that, even though the image's update_func could be an
arbitrarily complex structure (representing a procedure and any values
enclosed in its environment), scm_gc_mark will recurse as
necessary to mark all its components. Because scm_gc_mark sets
an object's mark bit before it recurses, it is not confused by
circular structures.
As an optimization, the collector will mark whatever value is returned by the mark function; this helps limit depth of recursion during the mark phase. Thus, the code above should really be written as:
SCM
mark_image (SCM image_smob)
{
  /* Mark the image's name and update function.  */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  return image->update_func;
}
Finally, when the collector encounters an unmarked smob during the sweep phase, it uses the smob's tag to find the appropriate free function for the smob. It then calls that function, passing it the smob as its only argument.
The free function must release any resources used by the smob.
However, it must not free objects managed by the collector; the
collector will take care of them. For historical reasons, the return
type of the free function should be size_t, an unsigned
integral type; the free function should always return zero.
Here is how we might write the free_image function for the image
smob type:
size_t
free_image (SCM image_smob)
{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_free (image->pixels, image->width * image->height, "image pixels");
  scm_gc_free (image, sizeof (struct image), "image");

  return 0;
}
During the sweep phase, the garbage collector will clear the mark bits on all live objects. The code which implements a smob need not do this itself.
There is no way for smob code to be notified when collection is complete.
It is usually a good idea to minimize the amount of processing done during garbage collection; keep the mark and free functions very simple. Since collections occur at unpredictable times, it is easy for any unusual activity to interfere with normal code.
It is often useful to define very simple smob types: smobs which have no data to mark, other than the cell itself, or smobs whose immediate data word is simply an ordinary Scheme object, to be marked recursively. Guile handles these common cases for you, so you need not write your own mark function if your smob's structure is simple enough.
If the smob refers to no other Scheme objects, then no action is necessary; the garbage collector has already marked the smob cell itself. In that case, you can use zero as your mark function.
If the smob refers to exactly one other Scheme object via its first
immediate word, you can use scm_markcdr as its mark function.
Its definition is simply:
SCM
scm_markcdr (SCM obj)
{
  return SCM_SMOB_OBJECT (obj);
}
It's important that a smob is visible to the garbage collector whenever its contents are being accessed. Otherwise it could be freed while code is still using it.
For example, consider a procedure to convert image data to a list of pixel values.
SCM
image_to_list (SCM image_smob)
{
  struct image *image;
  SCM lst;
  int i;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  lst = SCM_EOL;
  for (i = image->width * image->height - 1; i >= 0; i--)
    lst = scm_cons (scm_from_char (image->pixels[i]), lst);

  scm_remember_upto_here_1 (image_smob);
  return lst;
}
In the loop, only the image pointer is used and the C compiler
has no reason to keep the image_smob value anywhere. If
scm_cons results in a garbage collection, image_smob might
not be on the stack or anywhere else and could be freed, leaving the
loop accessing freed data. The use of scm_remember_upto_here_1
prevents this, by creating a reference to image_smob after all
data accesses.
There's no need to do the same for lst, since that's the return
value and the compiler will certainly keep it in a register or
somewhere throughout the routine.
The clear_image example previously shown (see Type checking)
also used scm_remember_upto_here_1 for this reason.
It's only in quite rare circumstances that a missing
scm_remember_upto_here_1 will bite, but when it happens the
consequences are serious. Fortunately the rule is simple: whenever
calling a Guile library function or doing something that might, ensure
that the SCM of a smob is referenced past all accesses to its
insides. Do this by adding an scm_remember_upto_here_1 if
there are no other references.
In a multi-threaded program, the rule is the same. As far as a given thread is concerned, a garbage collection still only occurs within a Guile library function, not at an arbitrary time. (Guile waits for all threads to reach one of its library functions, and holds them there while the collector runs.)
Smobs are called smobs because they are small: they normally have only
room for one void* or SCM value plus 16 bits. The
reason for this is that smobs are directly implemented by using the
low-level, two-word cells of Guile that are also used to implement
pairs, for example. (See Data Representation for the details.)
One word of the two-word cells is used for SCM_SMOB_DATA (or
SCM_SMOB_OBJECT), the other contains the 16-bit type tag and
the 16 extra bits.
In addition to the fundamental two-word cells, Guile also has
four-word cells, which are appropriately called double cells.
You can use them for double smobs and get two more immediate
words of type scm_t_bits.
A double smob is created with SCM_NEWSMOB2 or
SCM_NEWSMOB3 instead of SCM_NEWSMOB. Its immediate
words can be retrieved as scm_t_bits with
SCM_SMOB_DATA_2 and SCM_SMOB_DATA_3 in addition to
SCM_SMOB_DATA. Unsurprisingly, the words can be set to
scm_t_bits values with SCM_SET_SMOB_DATA_2 and
SCM_SET_SMOB_DATA_3.
Of course there are also SCM_SMOB_OBJECT_2,
SCM_SMOB_OBJECT_3, SCM_SET_SMOB_OBJECT_2, and
SCM_SET_SMOB_OBJECT_3.
Here is the complete text of the implementation of the image datatype, as presented in the sections above. We also provide a definition for the smob's print function, and make some objects and functions static, to clarify exactly what the surrounding code is using.
As mentioned above, you can find this code in the Guile distribution, in
doc/example-smob. That directory includes a makefile and a
suitable main function, so you can build a complete interactive
Guile shell, extended with the datatypes described here.
/* file "image-type.c" */

#include <stdlib.h>
#include <string.h>             /* for memset */
#include <libguile.h>

static scm_t_bits image_tag;

struct image {
  int width, height;
  char *pixels;

  /* The name of this image */
  SCM name;

  /* A function to call when this image is
     modified, e.g., to update the screen,
     or SCM_BOOL_F if no action necessary */
  SCM update_func;
};

static SCM
make_image (SCM name, SCM s_width, SCM s_height)
{
  SCM smob;
  struct image *image;
  int width = scm_to_int (s_width);
  int height = scm_to_int (s_height);

  /* Step 1: Allocate the memory block.
   */
  image = (struct image *) scm_gc_malloc (sizeof (struct image), "image");

  /* Step 2: Initialize it with straight code.
   */
  image->width = width;
  image->height = height;
  image->pixels = NULL;
  image->name = SCM_BOOL_F;
  image->update_func = SCM_BOOL_F;

  /* Step 3: Create the smob.
   */
  SCM_NEWSMOB (smob, image_tag, image);

  /* Step 4: Finish the initialization.
   */
  image->name = name;
  image->pixels = scm_gc_malloc (width * height, "image pixels");

  return smob;
}

SCM
clear_image (SCM image_smob)
{
  int area;
  struct image *image;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  area = image->width * image->height;
  memset (image->pixels, 0, area);

  /* Invoke the image's update function.
   */
  if (scm_is_true (image->update_func))
    scm_call_0 (image->update_func);

  scm_remember_upto_here_1 (image_smob);

  return SCM_UNSPECIFIED;
}

static SCM
mark_image (SCM image_smob)
{
  /* Mark the image's name and update function.  */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  return image->update_func;
}

static size_t
free_image (SCM image_smob)
{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_free (image->pixels, image->width * image->height, "image pixels");
  scm_gc_free (image, sizeof (struct image), "image");

  return 0;
}

static int
print_image (SCM image_smob, SCM port, scm_print_state *pstate)
{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_puts ("#<image ", port);
  scm_display (image->name, port);
  scm_puts (">", port);

  /* non-zero means success */
  return 1;
}

void
init_image_type (void)
{
  image_tag = scm_make_smob_type ("image", sizeof (struct image));
  scm_set_smob_mark (image_tag, mark_image);
  scm_set_smob_free (image_tag, free_image);
  scm_set_smob_print (image_tag, print_image);

  scm_c_define_gsubr ("clear-image", 1, 0, 0, clear_image);
  scm_c_define_gsubr ("make-image", 3, 0, 0, make_image);
}
Here is a sample build and interaction with the code from the example-smob directory, on the author's machine:
zwingli:example-smob$ make CC=gcc
gcc `guile-config compile` -c image-type.c -o image-type.o
gcc `guile-config compile` -c myguile.c -o myguile.o
gcc image-type.o myguile.o `guile-config link` -o myguile
zwingli:example-smob$ ./myguile
guile> make-image
#<primitive-procedure make-image>
guile> (define i (make-image "Whistler's Mother" 100 100))
guile> i
#<image Whistler's Mother>
guile> (clear-image i)
guile> (clear-image 4)
ERROR: In procedure clear-image in expression (clear-image 4):
ERROR: Wrong type (expecting image): 4
ABORT: (wrong-type-arg)
Type "(backtrace)" to get more information.
guile>
When writing C code for use with Guile, you typically define a set of
C functions, and then make some of them visible to the Scheme world by
calling scm_c_define_gsubr or related functions. If you have
many functions to publish, it can sometimes be annoying to keep the
list of calls to scm_c_define_gsubr in sync with the list of
function definitions.
Guile provides the guile-snarf program to manage this problem.
Using this tool, you can keep all the information needed to define the
function alongside the function definition itself; guile-snarf
will extract this information from your source code, and automatically
generate a file of calls to scm_c_define_gsubr which you can
#include into an initialization function.
The snarfing mechanism works for many kinds of initialization actions,
not just for collecting calls to scm_c_define_gsubr. For a
full list of what can be done, see Snarfing Macros.
The guile-snarf program is invoked like this:
guile-snarf [-o outfile] [cpp-args ...]
This command will extract initialization actions to outfile.
When no outfile has been specified or when outfile is
-, standard output will be used. The C preprocessor is called
with cpp-args (which usually include an input file) and the
output is filtered to extract the initialization actions.
If there are errors during processing, outfile is deleted and the program exits with non-zero status.
During snarfing, the pre-processor macro SCM_MAGIC_SNARFER is
defined. You could use this to avoid including snarfer output files
that don't yet exist by writing code like this:
#ifndef SCM_MAGIC_SNARFER
#include "foo.x"
#endif
Here is how you might define the Scheme function clear-image,
implemented by the C function clear_image:
#include <libguile.h>
SCM_DEFINE (clear_image, "clear-image", 1, 0, 0,
(SCM image_smob),
"Clear the image.")
{
/* C code to clear the image in image_smob... */
}
void
init_image_type ()
{
#include "image-type.x"
}
The SCM_DEFINE declaration says that the C function
clear_image implements a Scheme function called
clear-image, which takes one required argument (of type
SCM and named image_smob), no optional arguments, and no
rest argument. The string "Clear the image." provides a short
help text for the function; this text is called a docstring.
For historical reasons, the SCM_DEFINE macro also defines a
static array of characters named s_clear_image, initialized to
the string "clear-image". You shouldn't use this array, but you might
need to be aware that it exists.
Assuming the text above lives in a file named image-type.c, you will need to execute the following command to prepare this file for compilation:
guile-snarf -o image-type.x image-type.c
This scans image-type.c for SCM_DEFINE
declarations, and writes to image-type.x the output:
scm_c_define_gsubr ("clear-image", 1, 0, 0, (SCM (*)() ) clear_image);
When compiled normally, SCM_DEFINE is a macro which expands to
the function header for clear_image.
Note that the output file name matches the #include from the
input file. Also, you still need to provide all the same information
you would if you were using scm_c_define_gsubr yourself, but you
can place the information near the function definition itself, so it is
less likely to become incorrect or out-of-date.
If you have many files that guile-snarf must process, you should
consider using a fragment like the following in your Makefile:
snarfcppopts = $(DEFS) $(INCLUDES) $(CPPFLAGS) $(CFLAGS)
.SUFFIXES: .x
.c.x:
guile-snarf -o $@ $< $(snarfcppopts)
This tells make to run guile-snarf to produce each needed
.x file from the corresponding .c file.
The program guile-snarf passes its command-line arguments
directly to the C preprocessor, which it uses to extract the
information it needs from the source code. This means you can pass
normal compilation flags to guile-snarf to define preprocessor
symbols, add header file directories, and so on.
Guile is designed as an extension language interpreter that is straightforward to integrate with applications written in C (and C++). The big win here for the application developer is that Guile integration, as the Guile web page says, “lowers your project's hacktivation energy.” Lowering the hacktivation energy means that you, as the application developer, and your users, reap the benefits that flow from being able to extend the application in a high level extension language rather than in plain old C.
In abstract terms, it's difficult to explain what this really means and what the integration process involves, so instead let's begin by jumping straight into an example of how you might integrate Guile into an existing program, and what you could expect to gain by so doing. With that example under our belts, we'll then return to a more general analysis of the arguments involved and the range of programming options available.
Dia is a free software program for drawing schematic diagrams like flow charts and floor plans (http://www.gnome.org/projects/dia/). This section conducts the thought experiment of adding Guile to Dia. In so doing, it aims to illustrate several of the steps and considerations involved in adding Guile to applications in general.
First off, you should understand why you want to add Guile to Dia at all, and that means forming a picture of what Dia does and how it does it. So, what are the constituents of the Dia application?
(In other words, a textbook example of the model-view-controller paradigm.)
Next question: how will Dia benefit once the Guile integration is complete? Several (positive!) answers are possible here, and the choice is obviously up to the application developers. Still, one answer is that the main benefit will be the ability to manipulate Dia's application domain objects from Scheme.
Suppose that Dia made a set of procedures available in Scheme, representing the most basic operations on objects such as shapes, connectors, and so on. Using Scheme, the application user could then write code that builds upon these basic operations to create more complex procedures. For example, given basic procedures to enumerate the objects on a page, to determine whether an object is a square, and to change the fill pattern of a single shape, the user can write a Scheme procedure to change the fill pattern of all squares on the current page:
(define (change-squares'-fill-pattern new-pattern)
(for-each-shape current-page
(lambda (shape)
(if (square? shape)
(change-fill-pattern shape new-pattern)))))
Assuming this objective, four steps are needed to achieve it.
First, you need a way of representing your application-specific objects
— such as shape in the previous example — when they are
passed into the Scheme world. Unless your objects are so simple that
they map naturally into builtin Scheme data types like numbers and
strings, you will probably want to use Guile's SMOB interface to
create a new Scheme data type for your objects.
Second, you need to write code for the basic operations like
for-each-shape and square? such that they access and
manipulate your existing data structures correctly, and then make these
operations available as primitives on the Scheme level.
Third, you need to provide some mechanism within the Dia application that a user can hook into to cause arbitrary Scheme code to be evaluated.
Finally, you need to restructure your top-level application C code a little so that it initializes the Guile interpreter correctly and declares your SMOBs and primitives to the Scheme world.
The following subsections expand on these four points in turn.
For all but the most trivial applications, you will probably want to allow some representation of your domain objects to exist on the Scheme level. This is where the idea of SMOBs comes in, and with it issues of lifetime management and garbage collection.
To get more concrete about this, let's look again at the example we gave earlier of how application users can use Guile to build higher-level functions from the primitives that Dia itself provides.
(define (change-squares'-fill-pattern new-pattern)
(for-each-shape current-page
(lambda (shape)
(if (square? shape)
(change-fill-pattern shape new-pattern)))))
Consider what is stored here in the variable shape. For each
shape on the current page, the for-each-shape primitive calls
(lambda (shape) ...) with an argument representing that
shape. Question is: how is that argument represented on the Scheme
level? The issues are as follows.
square? and change-fill-pattern primitives. In
other words, a primitive like square? has somehow to be able to
turn the value that it receives back into something that points to the
underlying C structure describing a shape.
shape in a global variable, but then that shape is deleted (in a
way that the Scheme code is not aware of), and later on some other
Scheme code uses that global variable again in a call to, say,
square??
shape argument passes
transiently in and out of the Scheme world, it would be quite wrong to
delete the underlying C shape just because the Scheme code has
finished evaluation. How do we avoid this happening?
One resolution of these issues is for the Scheme-level representation of
a shape to be a new, Scheme-specific C structure wrapped up as a SMOB.
The SMOB is what is passed into and out of Scheme code, and the
Scheme-specific C structure inside the SMOB points to Dia's underlying C
structure so that the code for primitives like square? can get at
it.
To cope with an underlying shape being deleted while Scheme code is still holding onto a Scheme shape value, the underlying C structure should have a new field that points to the Scheme-specific SMOB. When a shape is deleted, the relevant code chains through to the Scheme-specific structure and sets its pointer back to the underlying structure to NULL. Thus the SMOB value for the shape continues to exist, but any primitive code that tries to use it will detect that the underlying shape has been deleted because the underlying structure pointer is NULL.
So, to summarize the steps involved in this resolution of the problem
(and assuming that the underlying C structure for a shape is
struct dia_shape):
struct dia_guile_shape
{
struct dia_shape * c_shape; /* NULL => deleted */
};
struct dia_shape that points to its struct
dia_guile_shape if it has one —
struct dia_shape
{
...
struct dia_guile_shape * guile_shape;
};
— so that C code can set guile_shape->c_shape to NULL when the
underlying shape is deleted.
struct dia_guile_shape as a SMOB type.
c_shape field when decoding it, to find out whether the
underlying C shape is still there.
As far as memory management is concerned, the SMOB values and their Scheme-specific structures are under the control of the garbage collector, whereas the underlying C structures are explicitly managed in exactly the same way that Dia managed them before we thought of adding Guile.
When the garbage collector decides to free a shape SMOB value, it calls
the SMOB free function that was specified when defining the shape
SMOB type. To maintain the correctness of the guile_shape field
in the underlying C structure, this function should chain through to the
underlying C structure (if it still exists) and set its
guile_shape field to NULL.
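The pointer bookkeeping described above can be sketched in plain C, without Guile, to show how the two structures keep each other consistent. This is only an illustration under stated assumptions: the enum dia_shape_type, the DIA_CIRCLE value and all of the helper functions are hypothetical names invented for the example, and dia_guile_shape_free merely stands in for the SMOB free function that the garbage collector would call in a real integration.

```c
#include <stdlib.h>

/* Hypothetical shape kinds; only DIA_SQUARE appears in the manual. */
enum dia_shape_type { DIA_SQUARE, DIA_CIRCLE };

struct dia_guile_shape;

/* Dia's own structure, extended with a back-pointer to its wrapper. */
struct dia_shape
{
  enum dia_shape_type type;
  struct dia_guile_shape *guile_shape;  /* NULL => no Scheme wrapper */
};

/* Scheme-specific structure; in real code this is the SMOB's data. */
struct dia_guile_shape
{
  struct dia_shape *c_shape;            /* NULL => deleted */
};

/* Hypothetical constructor for an underlying shape. */
static struct dia_shape *
dia_shape_new (enum dia_shape_type type)
{
  struct dia_shape *shape = malloc (sizeof *shape);
  if (shape != NULL)
    {
      shape->type = type;
      shape->guile_shape = NULL;
    }
  return shape;
}

/* Create the wrapper for a shape, or reuse the one it already has. */
static struct dia_guile_shape *
dia_guile_shape_wrap (struct dia_shape *shape)
{
  if (shape->guile_shape == NULL)
    {
      struct dia_guile_shape *g = malloc (sizeof *g);
      if (g == NULL)
        return NULL;
      g->c_shape = shape;
      shape->guile_shape = g;
    }
  return shape->guile_shape;
}

/* Called by Dia when a shape is deleted: the wrapper survives,
   but is marked as no longer having an underlying shape. */
static void
dia_shape_delete (struct dia_shape *shape)
{
  if (shape->guile_shape != NULL)
    shape->guile_shape->c_shape = NULL;
  free (shape);
}

/* Stand-in for the SMOB free function: clear the back-pointer
   in the underlying shape, if that shape still exists. */
static void
dia_guile_shape_free (struct dia_guile_shape *g)
{
  if (g->c_shape != NULL)
    g->c_shape->guile_shape = NULL;
  free (g);
}

/* The core test of a square? primitive, minus the SCM decoding. */
static int
dia_guile_shape_is_square (const struct dia_guile_shape *g)
{
  return g->c_shape != NULL && g->c_shape->type == DIA_SQUARE;
}
```

Whichever side goes away first, the survivor is left holding NULL rather than a dangling pointer, which is why primitives only need to check c_shape before using it.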
For full documentation on defining and using SMOB types, see Defining New Types (Smobs).
Once the details of object representation are decided, writing the primitive function code that you need is usually straightforward.
A primitive is simply a C function whose arguments and return value are
all of type SCM, and whose body does whatever you want it to do.
As an example, here is a possible implementation of the square?
primitive:
#define FUNC_NAME "square?"
static SCM square_p (SCM shape)
{
struct dia_guile_shape * guile_shape;
/* Check that arg is really a shape SMOB. */
SCM_VALIDATE_SHAPE (SCM_ARG1, shape);
/* Access Scheme-specific shape structure. */
guile_shape = (struct dia_guile_shape *) SCM_SMOB_DATA (shape);
/* Find out if underlying shape exists and is a
square; return answer as a Scheme boolean. */
return scm_from_bool (guile_shape->c_shape &&
(guile_shape->c_shape->type == DIA_SQUARE));
}
#undef FUNC_NAME
Notice how easy it is to chain through from the SCM shape
parameter that square_p receives — which is a SMOB — to the
Scheme-specific structure inside the SMOB, and thence to the underlying
C structure for the shape.
In this code, SCM_SMOB_DATA and scm_from_bool are from
the standard Guile API. SCM_VALIDATE_SHAPE is a macro that you
should define as part of your SMOB definition: it checks that the
passed parameter is of the expected type. This is needed to guard
against Scheme code using the square? procedure incorrectly, as
in (square? "hello"); Scheme's latent typing means that usage
errors like this must be caught at run time.
Having written the C code for your primitives, you need to make them
available as Scheme procedures by calling the scm_c_define_gsubr
function. scm_c_define_gsubr (see Primitive Procedures) takes arguments that
specify the Scheme-level name for the primitive and how many required,
optional and rest arguments it can accept. The square? primitive
always requires exactly one argument, so the call to make it available
in Scheme reads like this:
scm_c_define_gsubr ("square?", 1, 0, 0, square_p);
For where to put this call, see the subsection after next on the structure of Guile-enabled code (see Dia Structure).
To make the Guile integration useful, you have to design some kind of hook into your application that application users can use to cause their Scheme code to be evaluated.
Technically, this is straightforward; you just have to decide on a mechanism that is appropriate for your application. Think of Emacs, for example: when you type <ESC> :, you get a prompt where you can type in any Elisp code, which Emacs will then evaluate. Or, again like Emacs, you could provide a mechanism (such as an init file) to allow Scheme code to be associated with a particular key sequence, and evaluate the code when that key sequence is entered.
In either case, once you have the Scheme code that you want to evaluate,
as a null terminated string, you can tell Guile to evaluate it by
calling the scm_c_eval_string function.
Let's assume that the pre-Guile Dia code looks structurally like this:
main ()
When you add Guile to a program, one (rather technical) requirement is
that Guile's garbage collector needs to know where the bottom of the C
stack is. The easiest way to ensure this is to use
scm_boot_guile like this:
main ()
scm_boot_guile (argc, argv, inner_main, NULL)
inner_main ()
scm_c_define_gsubr
In other words, you move the guts of what was previously in your
main function into a new function called inner_main, and
then add a scm_boot_guile call, with inner_main as a
parameter, to the end of main.
Assuming that you are using SMOBs and have written primitive code as
described in the preceding subsections, you also need to insert calls to
declare your new SMOBs and export the primitives to Scheme. These
declarations must happen inside the dynamic scope of the
scm_boot_guile call, but also before any code is run that
could possibly use them — the beginning of inner_main is an
ideal place for this.
The steps described so far implement an initial Guile integration that already gives a lot of additional power to Dia application users. But there are further steps that you could take, and it's interesting to consider a few of these.
In general, you could progressively move more of Dia's source code from C into Scheme. This might make the code more maintainable and extensible, and it could open the door to new programming paradigms that are tricky to effect in C but straightforward in Scheme.
A specific example of this is that you could use the guile-gtk package, which provides Scheme-level procedures for most of the Gtk+ library, to move the code that lays out and displays Dia objects from C to Scheme.
As you follow this path, it naturally becomes less useful to maintain a distinction between Dia's original non-Guile-related source code, and its later code implementing SMOBs and primitives for the Scheme world.
For example, suppose that the original source code had a
dia_change_fill_pattern function:
void dia_change_fill_pattern (struct dia_shape * shape,
struct dia_pattern * pattern)
{
/* real pattern change work */
}
During initial Guile integration, you add a change_fill_pattern
primitive for Scheme purposes, which accesses the underlying structures
from its SMOB values and uses dia_change_fill_pattern to do the
real work:
SCM change_fill_pattern (SCM shape, SCM pattern)
{
struct dia_shape * d_shape;
struct dia_pattern * d_pattern;
...
dia_change_fill_pattern (d_shape, d_pattern);
return SCM_UNSPECIFIED;
}
At this point, it makes sense to keep dia_change_fill_pattern and
change_fill_pattern separate, because
dia_change_fill_pattern can also be called without going through
Scheme at all, say because the user clicks a button which causes a
C-registered Gtk+ callback to be called.
But, if the code for creating buttons and registering their callbacks is
moved into Scheme (using guile-gtk), it may become true that
dia_change_fill_pattern can no longer be called other than
through Scheme. In which case, it makes sense to abolish it and move
its contents directly into change_fill_pattern, like this:
SCM change_fill_pattern (SCM shape, SCM pattern)
{
struct dia_shape * d_shape;
struct dia_pattern * d_pattern;
...
/* real pattern change work */
return SCM_UNSPECIFIED;
}
So further Guile integration progressively reduces the amount of functional C code that you have to maintain over the long term.
A similar argument applies to data representation. In the discussion of SMOBs earlier, issues arose because of the different memory management and lifetime models that normally apply to data structures in C and in Scheme. However, with further Guile integration, you can resolve this issue in a more radical way by allowing all your data structures to be under the control of the garbage collector, and kept alive by references from the Scheme world. Instead of maintaining an array or linked list of shapes in C, you would instead maintain a list in Scheme.
Rather like the coalescing of dia_change_fill_pattern and
change_fill_pattern, the practical upshot of such a change is
that you would no longer have to keep the dia_shape and
dia_guile_shape structures separate, and so wouldn't need to
worry about the pointers between them. Instead, you could change the
SMOB definition to wrap the dia_shape structure directly, and
send dia_guile_shape off to the scrap yard. Cut out the middle
man!
Finally, we come to the holy grail of Guile's free software / extension language approach. Once you have a Scheme representation for interesting Dia data types like shapes, and a handy bunch of primitives for manipulating them, it suddenly becomes clear that you have a bundle of functionality that could have far-ranging use beyond Dia itself. In other words, the data types and primitives could now become a library, and Dia becomes just one of the many possible applications using that library — albeit, at this early stage, a rather important one!
In this model, Guile becomes just the glue that binds everything together. Imagine an application that usefully combined functionality from Dia, Gnumeric and GnuCash — it's tricky right now, because no such application yet exists; but it'll happen some day ...
Underlying Guile's value proposition is the assumption that programming in a high level language, specifically Guile's implementation of Scheme, is necessarily better in some way than programming in C. What do we mean by this claim, and how can we be so sure?
One class of advantages applies not only to Scheme, but more generally to any interpretable, high level, scripting language, such as Emacs Lisp, Python, Ruby, or TeX's macro language. Common features of all such languages, when compared to C, are that:
In the case of Scheme, particular features that make programming easier — and more fun! — are its powerful mechanisms for abstracting parts of programs (closures — see About Closure) and for iteration (see while do).
The evidence in support of this argument is empirical: the huge amount of code that has been written in extension languages for applications that support this mechanism. Most notable are extensions written in Emacs Lisp for GNU Emacs, in TeX's macro language for TeX, and in Script-Fu for the Gimp, but there is increasingly now a significant code eco-system for Guile-based applications as well, such as Lilypond and GnuCash. It is close to inconceivable that similar amounts of functionality could have been added to these applications just by writing new code in their base implementation languages.
As an example of what this means in practice, imagine writing a testbed for an application that is tested by submitting various requests (via a C interface) and validating the output received. Suppose further that the application keeps an idea of its current state, and that the “correct” output for a given request may depend on the current application state. A complete “white box” test plan for this application would aim to submit all possible requests in each distinguishable state, and validate the output for all request/state combinations.
To write all this test code in C would be very tedious. Suppose instead that the testbed code adds a single new C function, to submit an arbitrary request and return the response, and then uses Guile to export this function as a Scheme procedure. The rest of the testbed can then be written in Scheme, and so benefits from all the advantages of programming in Scheme that were described in the previous section.
(In this particular example, there is an additional benefit of writing most of the testbed in Scheme. A common problem for white box testing is that mistakes and mistaken assumptions in the application under test can easily be reproduced in the testbed code. It is more difficult to copy mistakes like this when the testbed is written in a different language from the application.)
The preceding arguments and example point to a model of Guile programming that is applicable in many cases. According to this model, Guile programming involves a balance between C and Scheme programming, with the aim being to extract the greatest possible Scheme level benefit from the least amount of C level work.
The C level work required in this model usually consists of packaging and exporting functions and application objects such that they can be seen and manipulated on the Scheme level. To help with this, Guile's C language interface includes utility features that aim to make this kind of integration very easy for the application developer. These features are documented later in this part of the manual: see REFFIXME.
This model, though, is really just one of a range of possible programming options. If all of the functionality that you need is available from Scheme, you could choose instead to write your whole application in Scheme (or one of the other high level languages that Guile supports through translation), and simply use Guile as an interpreter for Scheme. (In the future, we hope that Guile will also be able to compile Scheme code, so lessening the performance gap between C and Scheme code.) Or, at the other end of the C–Scheme scale, you could write the majority of your application in C, and only call out to Guile occasionally for specific actions such as reading a configuration file or executing a user-specified extension. The choices boil down to two basic questions:
These are of course design questions, and the right design for any given application will always depend upon the particular requirements that you are trying to meet. In the context of Guile, however, there are some generally applicable considerations that can help you when designing your answers.
Suppose, for the sake of argument, that you would prefer to write your whole application in Scheme. Then the API available to you consists of:
A module in the last category can either be a pure Scheme module — in
other words a collection of utility procedures coded in Scheme — or a
module that provides a Scheme interface to an extension library coded in
C — in other words a nice package where someone else has done the work
of wrapping up some useful C code for you. The set of available modules
is growing quickly and already includes such useful examples as
(gtk gtk), which makes Gtk+ drawing functions available in
Scheme, and (database postgres), which provides SQL access to a
Postgres database.
Given the growing collection of pre-existing modules, it is quite feasible that your application could be implemented by combining a selection of these modules together with new application code written in Scheme.
If this approach is not enough, because the functionality that your application needs is not already available in this form, and it is impossible to write the new functionality in Scheme, you will need to write some C code. If the required function is already available in C (e.g. in a library), all you need is a little glue to connect it to the world of Guile. If not, you need both to write the basic code and to plumb it into Guile.
In either case, two general considerations are important. Firstly, what is the interface by which the functionality is presented to the Scheme world? Does the interface consist only of function calls (for example, a simple drawing interface), or does it need to include objects of some kind that can be passed between C and Scheme and manipulated by both worlds? Secondly, how do the lifetime and memory management of objects in the C code relate to the garbage-collected approach used for Scheme objects? In the case where the basic C code is not already written, most of the difficulties of memory management can be avoided by using Guile's C interface features from the start.
For the full documentation on writing C code for Guile and connecting existing C code to the Guile world, see REFFIXME.
So far we have considered what Guile programming means for an application developer. But what if you are instead using an existing Guile-based application, and want to know what your options are for programming and extending this application?
The answer to this question varies from one application to another, because the options available depend inevitably on whether the application developer has provided any hooks for you to hang your own code on and, if there are such hooks, what they allow you to do. For example...
In the last two cases, what you can do is, by definition, restricted by the application, and you should refer to the application's own manual to find out your options.
The most well known example of the first case is Emacs, with its extension language Emacs Lisp: as well as being a text editor, Emacs supports the loading and execution of arbitrary Emacs Lisp code. The result of such openness has been dramatic: Emacs now benefits from user-contributed Emacs Lisp libraries that extend the basic editing function to do everything from reading news to psychoanalysis and playing adventure games. The only limitation is that extensions are restricted to the functionality provided by Emacs's built-in set of primitive operations. For example, you can interact and display data by manipulating the contents of an Emacs buffer, but you can't pop-up and draw a window with a layout that is totally different to the Emacs standard.
The situation with a Guile application that supports the loading of arbitrary user code is similar, except perhaps even more so, because Guile also supports the loading of extension libraries written in C. This last point enables user code to add new primitive operations to Guile, and so to bypass the limitation present in Emacs Lisp.
At this point, the distinction between an application developer and an application user becomes rather blurred. Instead of seeing yourself as a user extending an application, you could equally well say that you are developing a new application of your own using some of the primitive functionality provided by the original application. As such, all the discussions of the preceding sections of this chapter are relevant to how you can proceed with developing your extension.
Guile provides an application programming interface (API) to developers in two core languages: Scheme and C. This part of the manual contains reference documentation for all of the functionality that is available through both Scheme and C interfaces.
Guile's application programming interface (API) makes functionality available that an application developer can use in either C or Scheme programming. The interface consists of elements that may be macros, functions or variables in C, and procedures, variables, syntax or other types of object in Scheme.
Many elements are available to both Scheme and C, in a form that is
appropriate. For example, the assq Scheme procedure is also
available as scm_assq to C code. These elements are documented
only once, addressing both the Scheme and C aspects of them.
The Scheme name of an element is related to its C name in a regular way. Also, a C function takes its parameters in a systematic way.
Normally, the name of a C function can be derived given its Scheme name, using some simple textual transformations:
Replace - (hyphen) with _ (underscore).
Replace ? (question mark) with _p.
Replace ! (exclamation point) with _x.
Replace -> with _to_.
Replace <= (less than or equal) with _leq.
Replace >= (greater than or equal) with _geq.
Replace < (less than) with _less.
Replace > (greater than) with _gr.
Prefix with scm_.
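As a rough illustration, the transformations listed above can be written down as a small C routine. The function scheme_name_to_c_name is a hypothetical helper invented for this sketch and is not part of the Guile API; it simply applies the rules mechanically, trying the two-character sequences first so that -> is not treated as - followed by >.

```c
#include <string.h>

/* Hypothetical helper, not part of Guile: derive the conventional C
   name for a Scheme procedure name.  Writes at most OUTSIZE bytes,
   including the terminating NUL, into OUT.  Returns 0 on success, or
   -1 (leaving OUT as the empty string) if the result does not fit. */
static int
scheme_name_to_c_name (const char *scheme, char *out, size_t outsize)
{
  /* Two-character patterns come first so that "->", "<=" and ">="
     are matched before "-", "<" and ">". */
  static const struct { const char *from; const char *to; } rules[] = {
    { "->", "_to_" },
    { "<=", "_leq" },
    { ">=", "_geq" },
    { "-",  "_" },
    { "?",  "_p" },
    { "!",  "_x" },
    { "<",  "_less" },
    { ">",  "_gr" },
  };
  const size_t nrules = sizeof rules / sizeof rules[0];
  const char *prefix = "scm_";
  size_t used = strlen (prefix);

  if (outsize < used + 1)
    {
      if (outsize > 0)
        out[0] = '\0';
      return -1;
    }
  memcpy (out, prefix, used);

  while (*scheme != '\0')
    {
      const char *piece = scheme;   /* default: copy one character */
      size_t piece_len = 1;
      size_t consumed = 1;
      size_t i;

      for (i = 0; i < nrules; i++)
        {
          size_t n = strlen (rules[i].from);
          if (strncmp (scheme, rules[i].from, n) == 0)
            {
              piece = rules[i].to;
              piece_len = strlen (piece);
              consumed = n;
              break;
            }
        }

      if (used + piece_len + 1 > outsize)
        {
          out[0] = '\0';
          return -1;
        }
      memcpy (out + used, piece, piece_len);
      used += piece_len;
      scheme += consumed;
    }

  out[used] = '\0';
  return 0;
}
```

For instance, string->symbol, set-car! and char<=? come out as scm_string_to_symbol, scm_set_car_x and scm_char_leq_p. Since the rules only hold normally, the result should be treated as a first guess to check against the documentation of the function in question.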
A C function always takes a fixed number of arguments of type
SCM, even when the corresponding Scheme function takes a
variable number.
For some Scheme functions, one or more trailing arguments are optional; the
corresponding C function must always be invoked with all optional
arguments specified. To get the same effect as leaving an argument
unspecified, pass SCM_UNDEFINED as its value. You cannot do
this for an argument in the middle; when one argument is
SCM_UNDEFINED, all the ones following it must be
SCM_UNDEFINED as well.
Some Scheme functions take an arbitrary number of rest arguments; the corresponding C function must be invoked with a list of all these arguments. This list is always the last argument of the C function.
These two variants can also be combined.
The type of the return value of a C function that corresponds to a
Scheme function is always SCM. In the descriptions below,
types are therefore often omitted both for the return value and for the
arguments.
Guile represents all Scheme values with the single C type SCM.
For an introduction to this topic, see Dynamic Types.
SCM is the user level abstract C type that is used to represent all of Guile's Scheme objects, no matter what the Scheme object type is. No C operation except assignment is guaranteed to work with variables of type SCM, so you should only use macros and functions to work with SCM values. Values are converted between C data types and the SCM type with utility functions and macros.
scm_t_bits is an unsigned integral data type that is guaranteed to be large enough to hold all information that is required to represent any Scheme object. While this data type is mostly used to implement Guile's internals, the use of this type is also necessary to write certain kinds of extensions to Guile.
Transforms the SCM value x into its representation as an integral type. Only after applying SCM_UNPACK is it possible to access the bits and contents of the SCM value.
Takes a valid integral representation of a Scheme object and transforms it into its representation as a SCM value.
Each thread that wants to use functions from the Guile API needs to
put itself into guile mode with either scm_with_guile or
scm_init_guile. The global state of Guile is initialized
automatically when the first thread enters guile mode.
When a thread wants to block outside of a Guile API function, it
should leave guile mode temporarily with scm_without_guile
(see Blocking).
Threads that are created by call-with-new-thread or
scm_spawn_thread start out in guile mode so you don't need to
initialize them.
Call func, passing it data and return what func returns. While func is running, the current thread is in guile mode and can thus use the Guile API.
When scm_with_guile is called from guile mode, the thread remains in guile mode when scm_with_guile returns.
Otherwise, it puts the current thread into guile mode and, if needed, gives it a Scheme representation that is contained in the list returned by all-threads, for example. This Scheme representation is not removed when scm_with_guile returns so that a given thread is always represented by the same Scheme value during its lifetime, if at all.
When this is the first thread that enters guile mode, the global state of Guile is initialized before calling func.
The function func is called via scm_with_continuation_barrier; thus, scm_with_guile returns exactly once.
When scm_with_guile returns, the thread is no longer in guile mode (except when scm_with_guile was called from guile mode, see above). Thus, only func can store SCM variables on the stack and be sure that they are protected from the garbage collector. See scm_init_guile for another approach at initializing Guile that does not have this restriction.
It is OK to call scm_with_guile while a thread has temporarily left guile mode via scm_without_guile. It will then simply temporarily enter guile mode again.
Arrange things so that all of the code in the current thread executes as if from within a call to scm_with_guile. That is, all functions called by the current thread can assume that SCM values on their stack frames are protected from the garbage collector (except when the thread has explicitly left guile mode, of course).
When scm_init_guile is called from a thread that already has been in guile mode once, nothing happens. This behavior matters when you call scm_init_guile while the thread has only temporarily left guile mode: in that case the thread will not be in guile mode after scm_init_guile returns. Thus, you should not use scm_init_guile in such a scenario.
When an uncaught throw happens in a thread that has been put into guile mode via scm_init_guile, a short message is printed to the current error port and the thread is exited via scm_pthread_exit (NULL). No restrictions are placed on continuations.
The function scm_init_guile might not be available on all platforms since it requires some stack-bounds-finding magic that might not have been ported to all platforms that Guile runs on. Thus, if you can, it is better to use scm_with_guile or its variation scm_boot_guile instead of this function.
Enter guile mode as with scm_with_guile and call main_func, passing it data, argc, and argv as indicated. When main_func returns, scm_boot_guile calls exit (0); scm_boot_guile never returns. If you want some other exit value, have main_func call exit itself. If you don't want to exit at all, use scm_with_guile instead of scm_boot_guile.
The function scm_boot_guile arranges for the Scheme command-line function to return the strings given by argc and argv. If main_func modifies argc or argv, it should call scm_set_program_arguments with the final list, so Scheme code will know which arguments have been processed (see Runtime Environment).
Process command-line arguments in the manner of the guile executable. This includes loading the normal Guile initialization files, interacting with the user or running any scripts or expressions specified by -s or -e options, and then exiting. See Invoking Guile, for more details.
Since this function does not return, you must do all application-specific initialization before calling this function.
The following macros do two different things: when compiled normally,
they expand in one way; when processed during snarfing, they cause the
guile-snarf program to pick up some initialization code
(see Function Snarfing).
The descriptions below use the term `normally' to refer to the case
when the code is compiled normally, and `while snarfing' when the code
is processed by guile-snarf.
Normally, SCM_SNARF_INIT expands to nothing; while snarfing, it causes code to be included in the initialization action file, followed by a semicolon.
This is the fundamental macro for snarfing initialization actions. The more specialized macros below use it internally.
Normally, this macro expands into
static const char s_c_name[] = scheme_name;
SCM c_name arglist
While snarfing, it causes
scm_c_define_gsubr (s_c_name, req, opt, var, c_name);
to be added to the initialization actions. Thus, you can use it to declare a C function named c_name that will be made available to Scheme with the name scheme_name.
Note that the arglist argument must have parentheses around it.
Normally, these macros expand into
static SCM c_name
or
SCM c_name
respectively. While snarfing, they both expand into the initialization code
c_name = scm_permanent_object (scm_from_locale_symbol (scheme_name));
Thus, you can use them to declare a static or global variable of type SCM that will be initialized to the symbol named scheme_name.
Normally, these macros expand into
static SCM c_name
or
SCM c_name
respectively. While snarfing, they both expand into the initialization code
c_name = scm_permanent_object (scm_c_make_keyword (scheme_name));
Thus, you can use them to declare a static or global variable of type SCM that will be initialized to the keyword named scheme_name.
These macros are equivalent to SCM_VARIABLE_INIT and SCM_GLOBAL_VARIABLE_INIT, respectively, with a value of SCM_BOOL_F.
Normally, these macros expand into
static SCM c_name
or
SCM c_name
respectively. While snarfing, they both expand into the initialization code
c_name = scm_permanent_object (scm_c_define (scheme_name, value));
Thus, you can use them to declare a static or global C variable of type SCM that will be initialized to the object representing the Scheme variable named scheme_name in the current module. The variable will be defined when it doesn't already exist. It is always set to value.
This chapter describes those of Guile's simple data types which are primarily used for their role as items of generic data. By simple we mean data types that are not primarily used as containers to hold other data — i.e. pairs, lists, vectors and so on. For the documentation of such compound data types, see Compound Data Types.
The two boolean values are #t for true and #f for false.
Boolean values are returned by predicate procedures, such as the general
equality predicates eq?, eqv? and equal?
(see Equality) and numerical and string comparison operators like
string=? (see String Comparison) and <=
(see Comparison).
(<= 3 8)
=> #t
(<= 3 -3)
=> #f
(equal? "house" "houses")
=> #f
(eq? #f #f)
=> #t
In test condition contexts like if and cond (see if cond case), where a group of subexpressions will be evaluated only if a
condition expression evaluates to “true”, “true” means any
value at all except #f.
(if #t "yes" "no")
=> "yes"
(if 0 "yes" "no")
=> "yes"
(if #f "yes" "no")
=> "no"
A result of this asymmetry is that typical Scheme source code more often
uses #f explicitly than #t: #f is necessary to
represent an if or cond false value, whereas #t is
not necessary to represent an if or cond true value.
It is important to note that #f is not equivalent to any
other Scheme value. In particular, #f is not the same as the
number 0 (like in C and C++), and not the same as the “empty list”
(like in some Lisp dialects).
In C, the two Scheme boolean values are available as the two constants
SCM_BOOL_T for #t and SCM_BOOL_F for #f.
Care must be taken with the false value SCM_BOOL_F: it is not
false when used in C conditionals. In order to test for it, use
scm_is_false or scm_is_true.
Return #t if obj is either #t or #f, else return #f.
Return 1 if val is SCM_BOOL_T, return 0 when val is SCM_BOOL_F, else signal a `wrong type' error.
You should probably use scm_is_true instead of this function when you just want to test a SCM value for trueness.
Guile supports a rich “tower” of numerical types — integer, rational, real and complex — and provides an extensive set of mathematical and scientific functions for operating on numerical data. This section of the manual documents those types and functions.
You may also find it illuminating to read R5RS's presentation of numbers in Scheme, which is particularly clear and accessible: see Numbers.
Scheme's numerical “tower” consists of the following categories of numbers: integers, rationals, real numbers, and complex numbers.
It is called a tower because each category “sits on” the one that follows it, in the sense that every integer is also a rational, every rational is also real, and every real number is also a complex number (but with zero imaginary part).
In addition to the classification into integers, rationals, reals and
complex numbers, Scheme also distinguishes between whether a number is
represented exactly or not. For example, the result of
2*sin(pi/4) is exactly 2^(1/2), but Guile
can represent neither pi/4 nor 2^(1/2) exactly.
Instead, it stores an inexact approximation, using the C type
double.
Guile can represent exact rationals of any magnitude, inexact
rationals that fit into a C double, and inexact complex numbers
with double real and imaginary parts.
The number? predicate may be applied to any Scheme value to
discover whether the value is any of the supported numerical types.
Return #t if obj is any kind of number, else #f.
For example:
(number? 3)
=> #t
(number? "hello there!")
=> #f
(define pi 3.141592654)
(number? pi)
=> #t
The next few subsections document each of Guile's numerical data types in detail.
Integers are whole numbers, that is numbers with no fractional part, such as 2, 83, and −3789.
Integers in Guile can be arbitrarily big, as shown by the following example.
(define (factorial n)
  (let loop ((n n) (product 1))
    (if (= n 0)
        product
        (loop (- n 1) (* product n)))))
(factorial 3)
=> 6
(factorial 20)
=> 2432902008176640000
(- (factorial 45))
=> -119622220865480194561963161495657715064383733760000000000
Readers whose background is in programming languages where integers are limited by the need to fit into just 4 or 8 bytes of memory may find this surprising, or suspect that Guile's representation of integers is inefficient. In fact, Guile achieves a near optimal balance of convenience and efficiency by using the host computer's native representation of integers where possible, and a more general representation where the required number does not fit in the native form. Conversion between these two representations is automatic and completely invisible to the Scheme level programmer.
The infinities `+inf.0' and `-inf.0' are considered to be inexact integers. They are explained in detail in the next section, together with reals and rationals.
C has a host of different integer types, and Guile offers a host of
functions to convert between them and the SCM representation.
For example, a C int can be handled with scm_to_int and
scm_from_int. Guile also defines a few C integer types of its
own, to help with differences between systems.
C integer types that are not covered can be handled with the generic
scm_to_signed_integer and scm_from_signed_integer for
signed types, or with scm_to_unsigned_integer and
scm_from_unsigned_integer for unsigned types.
Scheme integers can be exact and inexact. For example, a number
written as 3.0 with an explicit decimal-point is inexact, but
it is also an integer. The functions integer? and
scm_is_integer report true for such a number, but the functions
scm_is_signed_integer and scm_is_unsigned_integer only
allow exact integers and thus report false. Likewise, the conversion
functions like scm_to_signed_integer only accept exact
integers.
The motivation for this behavior is that the inexactness of a number
should not be lost silently. If you want to allow inexact integers,
you can explicitly insert a call to inexact->exact or to its C
equivalent scm_inexact_to_exact. (Only inexact integers will
be converted by this call into exact integers; inexact non-integers
will become exact fractions.)
Return #t if x is an exact or inexact integer number, else #f.
(integer? 487)
=> #t
(integer? 3.0)
=> #t
(integer? -3.4)
=> #f
(integer? +inf.0)
=> #t
The C types are equivalent to the corresponding ISO C types but are defined on all platforms, with the exception of scm_t_int64 and scm_t_uint64, which are only defined when a 64-bit type is available. For example, scm_t_int8 is equivalent to int8_t.
You can regard these definitions as a stop-gap measure until all platforms provide these types. If you know that all the platforms that you are interested in already provide these types, it is better to use them directly instead of the types provided by Guile.
Return 1 when x represents an exact integer that is between min and max, inclusive.
These functions can be used to check whether a SCM value will fit into a given range, such as the range of a given C integer type. If you just want to convert a SCM value to a given C integer type, use one of the conversion functions directly.
When x represents an exact integer that is between min and max inclusive, return that integer. Else signal an error, either a `wrong-type' error when x is not an exact integer, or an `out-of-range' error when it doesn't fit the given range.
Return the SCM value that represents the integer x. This function will always succeed and will always return an exact number.
When x represents an exact integer that fits into the indicated C type, return that integer. Else signal an error, either a `wrong-type' error when x is not an exact integer, or an `out-of-range' error when it doesn't fit the given range.
The functions scm_to_long_long, scm_to_ulong_long, scm_to_int64, and scm_to_uint64 are only available when the corresponding types are.
Return the SCM value that represents the integer x. These functions will always succeed and will always return an exact number.
Assign val to the multiple precision integer rop. val must be an exact integer, otherwise an error will be signalled. rop must have been initialized with mpz_init before this function is called. When rop is no longer needed the occupied space must be freed with mpz_clear. See Initializing Integers, for details.
Mathematically, the real numbers are the set of numbers that describe all possible points along a continuous, infinite, one-dimensional line. The rational numbers are the set of all numbers that can be written as fractions p/q, where p and q are integers. All rational numbers are also real, but there are real numbers that are not rational, for example the square root of 2, and pi.
Guile can represent both exact and inexact rational numbers, but it
can not represent irrational numbers. Exact rationals are represented
by storing the numerator and denominator as two exact integers.
Inexact rationals are stored as floating point numbers using the C
type double.
Exact rationals are written as a fraction of integers. There must be no whitespace around the slash:
1/2
-22/7
Even though the actual encoding of inexact rationals is in binary, it may be helpful to think of it as a decimal number with a limited number of significant figures and a decimal point somewhere, since this corresponds to the standard notation for non-whole numbers. For example:
0.34
-0.00000142857931198
-5648394822220000000000.0
4.0
The limited precision of Guile's encoding means that any “real” number
in Guile can be written in a rational form, by multiplying and then dividing
by sufficient powers of 10 (or in fact, 2). For example,
`-0.00000142857931198' is the same as −142857931198 divided by
100000000000000000. In Guile's current incarnation, therefore, the
rational? and real? predicates are equivalent.
Dividing by an exact zero leads to an error message, as one might expect. However, dividing by an inexact zero does not produce an error. Instead, the result of the division is either plus or minus infinity, depending on the sign of the divided number.
The infinities are written `+inf.0' and `-inf.0',
respectively. This syntax is also recognized by read as an
extension to the usual Scheme syntax.
Dividing zero by zero yields something that is not a number at all: `+nan.0'. This is the special `not a number' value.
On platforms that follow IEEE 754 for their floating point
arithmetic, the `+inf.0', `-inf.0', and `+nan.0' values
are implemented using the corresponding IEEE 754 values.
They behave in arithmetic operations as IEEE 754 describes;
for example, (= +nan.0 +nan.0) => #f.
The infinities are inexact integers and are considered to be both even
and odd. While `+nan.0' is not = to itself, it is
eqv? to itself.
To test for the special values, use the functions inf? and
nan?.
Return #t if obj is a real number, else #f. Note that the sets of integer and rational values form subsets of the set of real numbers, so the predicate will also be fulfilled if obj is an integer number or a rational number.
Return #t if x is a rational number, #f otherwise. Note that the set of integer values forms a subset of the set of rational numbers, i.e. the predicate will also be fulfilled if x is an integer number.
Since Guile can not represent irrational numbers, every number satisfying real? also satisfies rational? in Guile.
Returns the simplest rational number differing from x by no more than eps.
As required by R5RS, rationalize only returns an exact result when both its arguments are exact. Thus, you might need to use inexact->exact on the arguments.
(rationalize (inexact->exact 1.2) 1/100)
=> 6/5
Return #t if x is either `+inf.0' or `-inf.0', #f otherwise.
Return the numerator of the rational number x.
Return the denominator of the rational number x.
Equivalent to scm_is_true (scm_real_p (val)) and scm_is_true (scm_rational_p (val)), respectively.
Returns the number closest to val that is representable as a double. Returns infinity for a val that is too large in magnitude. The argument val must be a real number.
Return the SCM value that represents val. The returned value is inexact according to the predicate inexact?, but it will be exactly equal to val.
Complex numbers are the set of numbers that describe all possible points in a two-dimensional space. The two coordinates of a particular point in this space are known as the real and imaginary parts of the complex number that describes that point.
In Guile, complex numbers are written in rectangular form as the sum of
their real and imaginary parts, using the symbol i to indicate
the imaginary part.
3+4i
=> 3.0+4.0i
(* 3-8i 2.3+0.3i)
=> 9.3-17.5i
Polar form can also be used, with an `@' between magnitude and angle,
1@3.141592 => -1.0 (approx)
-1@1.57079 => 0.0-1.0i (approx)
Guile represents a complex number with a non-zero imaginary part as a pair of inexact rationals, so the real and imaginary parts of a complex number have the same properties of inexactness and limited precision as single inexact rational numbers. Guile can not represent exact complex numbers with non-zero imaginary parts.
Return #t if x is a complex number, #f otherwise. Note that the sets of real, rational and integer values form subsets of the set of complex numbers, i.e. the predicate will also be fulfilled if x is a real, rational or integer number.
R5RS requires that a calculation involving inexact numbers always
produces an inexact result. To meet this requirement, Guile
distinguishes between an exact integer value such as `5' and the
corresponding inexact real value which, to the limited precision
available, has no fractional part, and is printed as `5.0'. Guile
will only convert the latter value to the former when forced to do so by
an invocation of the inexact->exact procedure.
Return #t if the number z is exact, #f otherwise.
(exact? 2)
=> #t
(exact? 0.5)
=> #f
(exact? (/ 2))
=> #t
Return #t if the number z is inexact, #f else.
Return an exact number that is numerically closest to z, when there is one. For inexact rationals, Guile returns the exact rational that is numerically equal to the inexact rational. Inexact complex numbers with a non-zero imaginary part can not be made exact.
(inexact->exact 0.5)
=> 1/2
The following happens because 12/10 is not exactly representable as a double (on most platforms). However, when reading a decimal number that has been marked exact with the “#e” prefix, Guile is able to represent it correctly.
(inexact->exact 1.2)
=> 5404319552844595/4503599627370496
#e1.2
=> 6/5
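The large fraction above is simply the binary value that the double nearest to 1.2 holds. As a rough illustration outside Guile, the following C sketch recovers that fraction by doubling the value until no fractional part remains. The names struct fraction and double_to_fraction are hypothetical, and the sketch assumes IEEE 754 doubles and a finite argument of moderate magnitude.

```c
/* Sketch only: recover the exact fraction held by a double.
   Assumes IEEE 754 doubles and a finite x of moderate magnitude, so
   that numerator and denominator both fit in a long long.
   struct fraction and double_to_fraction are hypothetical names,
   not part of the Guile API. */
struct fraction {
    long long num;
    long long den;
};

struct fraction double_to_fraction(double x)
{
    struct fraction f;
    long long den = 1;

    /* Doubling only changes the binary exponent, so it is exact.
       Stop as soon as x has no fractional part left. */
    while ((double)(long long)x != x) {
        x *= 2.0;
        den *= 2;
    }
    f.num = (long long)x;
    f.den = den;
    return f;
}
```

Under those assumptions, double_to_fraction (1.2) yields the same numerator and denominator as the inexact->exact example, and double_to_fraction (0.5) yields 1 and 2.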
Convert the number z to its inexact representation.
The read syntax for integers is a string of digits, optionally preceded by a minus or plus character, a code indicating the base in which the integer is encoded, and a code indicating whether the number is exact or inexact. The supported base codes are:
#b, #B | the integer is written in binary (base 2)
#o, #O | the integer is written in octal (base 8)
#d, #D | the integer is written in decimal (base 10)
#x, #X | the integer is written in hexadecimal (base 16)
If the base code is omitted, the integer is assumed to be decimal. The following examples show how these base codes are used.
-13
=> -13
#d-13
=> -13
#x-13
=> -19
#b+1101
=> 13
#o377
=> 255
The codes for indicating exactness (which can, incidentally, be applied to all numerical values) are:
#e, #E | the number is exact
#i, #I | the number is inexact
If the exactness indicator is omitted, the number is exact unless it contains a radix point. Since Guile can not represent exact complex numbers, an error is signalled when asking for them.
(exact? 1.2)
=> #f
(exact? #e1.2)
=> #t
(exact? #e+1i)
ERROR: Wrong type argument
Guile also understands the syntax `+inf.0' and `-inf.0' for plus and minus infinity, respectively. The value must be written exactly as shown, that is, they always must have a sign and exactly one zero digit after the decimal point. It also understands `+nan.0' and `-nan.0' for the special `not-a-number' value. The sign is ignored for `not-a-number' and the value is always printed as `+nan.0'.
Return #t if n is an odd number, #f otherwise.
Return #t if n is an even number, #f otherwise.
Return the quotient or remainder from n divided by d. The quotient is rounded towards zero, and the remainder will have the same sign as n. In all cases quotient and remainder satisfy n = q*d + r.
(remainder 13 4)
=> 1
(remainder -13 4)
=> -1
Return the remainder from n divided by d, with the same sign as d.
(modulo 13 4)
=> 1
(modulo -13 4)
=> 3
(modulo 13 -4)
=> -3
(modulo -13 -4)
=> -1
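The difference between remainder and modulo is only in the sign of the result. The following plain C sketch mirrors the three operations on long values; the function names are hypothetical helpers chosen for illustration, and the sketch relies on C99 integer division, which truncates towards zero.

```c
/* Sketch only: hypothetical helpers, not part of the Guile API.
   C99 integer division truncates toward zero, like quotient, and %
   yields a result with the sign of the dividend, like remainder. */
long scheme_quotient(long n, long d)
{
    return n / d;
}

long scheme_remainder(long n, long d)
{
    return n % d;
}

/* modulo takes the sign of the divisor: when the truncating remainder
   is nonzero and its sign differs from d, shift it by one multiple of d. */
long scheme_modulo(long n, long d)
{
    long r = n % d;

    if (r != 0 && ((r < 0) != (d < 0)))
        r += d;
    return r;
}
```

For example, scheme_remainder (-13, 4) gives -1 while scheme_modulo (-13, 4) gives 3, matching the Scheme results shown above.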
Return the greatest common divisor of all arguments. If called without arguments, 0 is returned.
The C function scm_gcd always takes two arguments, while the Scheme function can take an arbitrary number.
Return the least common multiple of the arguments. If called without arguments, 1 is returned.
The C function scm_lcm always takes two arguments, while the Scheme function can take an arbitrary number.
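The values returned for an empty argument list (0 for gcd, 1 for lcm) are the identity elements of the two-argument operations, which is what makes folding a two-argument function over any number of arguments work. The following plain C sketch shows the idea on long values; gcd2, lcm2, gcd_all and lcm_all are hypothetical names, and overflow is not checked.

```c
#include <stddef.h>

/* Sketch only: hypothetical helpers, not part of the Guile API.
   Euclid's algorithm on absolute values; gcd2 (0, x) is |x|. */
long gcd2(long a, long b)
{
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) {
        long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* lcm2 (1, x) is |x|; the result is 0 if either argument is 0. */
long lcm2(long a, long b)
{
    long m;

    if (a == 0 || b == 0)
        return 0;
    m = (a / gcd2(a, b)) * b;   /* divide first to keep values small */
    return m < 0 ? -m : m;
}

/* Fold the two-argument functions over n values, starting from the
   identity element, so that n == 0 gives 0 and 1 respectively. */
long gcd_all(const long *v, size_t n)
{
    long g = 0;
    size_t i;

    for (i = 0; i < n; i++)
        g = gcd2(g, v[i]);
    return g;
}

long lcm_all(const long *v, size_t n)
{
    long l = 1;
    size_t i;

    for (i = 0; i < n; i++)
        l = lcm2(l, v[i]);
    return l;
}
```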
Return n raised to the integer exponent k, modulo m.
(modulo-expt 2 3 5) => 3
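A common way to compute such a result without ever forming the full power is square-and-multiply, reducing modulo m after every step. The C sketch below illustrates that technique; it is not a description of how Guile implements modulo-expt. The name mod_expt is hypothetical, and the sketch assumes a non-negative exponent, a positive modulus, and a modulus small enough that the square of m - 1 fits in an unsigned long long.

```c
/* Sketch only: square-and-multiply modular exponentiation.
   mod_expt is a hypothetical helper, not part of the Guile API.
   Assumes m > 0 and (m - 1) * (m - 1) fits in unsigned long long. */
unsigned long long mod_expt(unsigned long long n,
                            unsigned long long k,
                            unsigned long long m)
{
    unsigned long long result = 1 % m;   /* 1, or 0 when m is 1 */

    n %= m;
    while (k > 0) {
        if (k & 1)                       /* low bit set: multiply in */
            result = (result * n) % m;
        n = (n * n) % m;                 /* square for the next bit */
        k >>= 1;
    }
    return result;
}
```

With these assumptions, mod_expt (2, 3, 5) gives 3, in agreement with the Scheme example.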
The C comparison functions below always take two arguments, while the
Scheme functions can take an arbitrary number. Also keep in mind that
the C functions return one of the Scheme boolean values
SCM_BOOL_T or SCM_BOOL_F which are both true as far as C
is concerned. Thus, always write scm_is_true (scm_num_eq_p (x,
y)) when testing the two Scheme numbers x and y for
equality, for example.
Return #t if all parameters are numerically equal.
Return #t if the list of parameters is monotonically increasing.
Return #t if the list of parameters is monotonically decreasing.
Return #t if the list of parameters is monotonically non-decreasing.
Return #t if the list of parameters is monotonically non-increasing.
Return #t if z is an exact or inexact number equal to zero.
Return #t if x is an exact or inexact number greater than zero.
Return #t if x is an exact or inexact number less than zero.
Return a string holding the external representation of the number n in the given radix. If n is inexact, a radix of 10 will be used.
Return a number of the maximally precise representation expressed by the given string. radix must be an exact integer, either 2, 8, 10, or 16. If supplied, radix is a default radix that may be overridden by an explicit radix prefix in string (e.g. "#o177"). If radix is not supplied, then the default radix is 10. If string is not a syntactically valid notation for a number, then string->number returns #f.
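For comparison, plain C offers a loosely similar facility for integers in the standard strtol function, which also takes a radix. The sketch below wraps it so that a string that is not entirely a valid integer is reported as a failure, much as string->number returns #f. The names parse_in_radix and value_in_radix are hypothetical; the sketch handles only integers, does not understand Scheme radix or exactness prefixes, and does not check for overflow.

```c
#include <stdlib.h>

/* Sketch only: hypothetical helpers, not part of the Guile API.
   Returns 1 when all of s is a valid integer in the given radix and
   0 otherwise.  On success the value is stored through out, which
   may be NULL when only the validity check is wanted. */
int parse_in_radix(const char *s, int radix, long *out)
{
    char *end;
    long value = strtol(s, &end, radix);

    if (end == s || *end != '\0')   /* no digits, or trailing junk */
        return 0;
    if (out != NULL)
        *out = value;
    return 1;
}

/* Convenience wrapper: the parsed value, or 0 when s is not valid. */
long value_in_radix(const char *s, int radix)
{
    long v = 0;

    parse_in_radix(s, radix, &v);
    return v;
}
```

Here value_in_radix ("-13", 16) is -19 and value_in_radix ("377", 8) is 255, the same values that the #x and #o read syntax produces.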
As per string->number above, but taking a C string, as pointer and length. The string characters should be in the current locale encoding (locale in the name refers only to that, there's no locale-dependent parsing).
Return a complex number constructed of the given real and imaginary parts.
Return the real part of the number z.
Return the imaginary part of the number z.
Return the magnitude of the number z. This is the same as abs for real arguments, but also allows complex numbers.
Like scm_make_rectangular or scm_make_polar, respectively, but these functions take doubles as their arguments.
Returns the real or imaginary part of z as a double.
Returns the magnitude or angle of z as a double.
The C arithmetic functions below always take two arguments, while the
Scheme functions can take an arbitrary number. When you need to
invoke them with just one argument, for example to compute the
equivalent of (- x), pass SCM_UNDEFINED as the second
one: scm_difference (x, SCM_UNDEFINED).
Return the sum of all parameter values. Return 0 if called without any parameters.
If called with one argument z1, -z1 is returned. Otherwise the sum of all but the first argument is subtracted from the first argument.
Return the product of all arguments. If called without arguments, 1 is returned.
Divide the first argument by the product of the remaining arguments. If called with one argument z1, 1/z1 is returned.
Return the absolute value of x.
x must be a number with zero imaginary part. To calculate the magnitude of a complex number, use magnitude instead.
Return the maximum of all parameter values.
Return the minimum of all parameter values.
Round the inexact number x towards zero.
Round the inexact number x to the nearest integer. When exactly halfway between two integers, round to the even one.
Like scm_truncate_number or scm_round_number, respectively, but these functions take and return double values.
The following procedures accept any kind of number as arguments, including complex numbers.
Return the square root of z. Of the two possible roots (positive and negative), the one with a positive real part is returned, or if that's zero then a positive imaginary part. Thus,
(sqrt 9.0)
=> 3.0
(sqrt -9.0)
=> 0.0+3.0i
(sqrt 1.0+1.0i)
=> 1.09868411346781+0.455089860562227i
(sqrt -1.0-1.0i)
=> 0.455089860562227-1.09868411346781i
Return e to the power of z, where e is the base of natural logarithms (2.71828...).
Many of Guile's numeric procedures which accept any kind of numbers as arguments, including complex numbers, are implemented as Scheme procedures that use the following real number-based primitives. These primitives signal an error if they are called with complex arguments.
Return x raised to the power of y. This procedure does not accept complex arguments.
Return the arc tangent of the two arguments x and y. This is similar to calculating the arc tangent of x / y, except that the signs of both arguments are used to determine the quadrant of the result. This procedure does not accept complex arguments.
Return e to the power of x, where e is the base of natural logarithms (2.71828...).
C functions for the above are provided by the standard mathematics
library. Naturally these expect and return double arguments
(see Mathematics).
Scheme Procedure | C Function
$abs | fabs
$sqrt | sqrt
$sin | sin
$cos | cos
$tan | tan
$asin | asin
$acos | acos
$atan | atan
$atan2 | atan2
$exp | exp
$expt | pow
$log | log
$sinh | sinh
$cosh | cosh
$tanh | tanh
$asinh | asinh
$acosh | acosh
$atanh | atanh
asinh, acosh and atanh are C99 standard but might
not be available on older systems. Guile provides the following
equivalents (on all systems).
Return the hyperbolic arcsine, arccosine or arctangent of x respectively.
For the following bitwise functions, negative numbers are treated as infinite precision twos-complements. For instance -6 is bits ...111010, with infinitely many ones on the left. It can be seen that adding 6 (binary 110) to such a bit pattern gives all zeros.
Return the bitwise and of the integer arguments.
(logand)
=> -1
(logand 7)
=> 7
(logand #b111 #b011 #b001)
=> 1
Return the bitwise or of the integer arguments.
(logior)
=> 0
(logior 7)
=> 7
(logior #b000 #b001 #b011)
=> 3
Return the bitwise xor of the integer arguments. A bit is set in the result if it is set in an odd number of arguments.
(logxor)
=> 0
(logxor 7)
=> 7
(logxor #b000 #b001 #b011)
=> 2
(logxor #b000 #b001 #b011 #b011)
=> 1
Return the integer which is the ones-complement of the integer argument, i.e. each 0 bit is changed to 1 and each 1 bit to 0.
(number->string (lognot #b10000000) 2)
=> "-10000001"
(number->string (lognot #b0) 2)
=> "-1"
Test whether j and k have any 1 bits in common. This is equivalent to (not (zero? (logand j k))), but without actually calculating the logand, just testing for non-zero.
(logtest #b0100 #b1011)
=> #f
(logtest #b0100 #b0111)
=> #t
Test whether bit number index in j is set. index starts from 0 for the least significant bit.
(logbit? 0 #b1101)
=> #t
(logbit? 1 #b1101)
=> #f
(logbit? 2 #b1101)
=> #t
(logbit? 3 #b1101)
=> #t
(logbit? 4 #b1101)
=> #f
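On fixed-width C integers the same two tests reduce to a mask and a comparison. The sketch below uses the hypothetical names logtest and logbit on long values; unlike the Scheme procedures it is limited to the width of a long, and it assumes an index that is non-negative and smaller than the number of value bits.

```c
/* Sketch only: hypothetical helpers, not part of the Guile API.
   Nonzero when j and k share at least one 1 bit. */
int logtest(long j, long k)
{
    return (j & k) != 0;
}

/* Nonzero when bit number index of j is set; bit 0 is the least
   significant.  Assumes 0 <= index < number of value bits in a long. */
int logbit(int index, long j)
{
    return (j & (1L << index)) != 0;
}
```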
Return n shifted left by cnt bits, or shifted right if cnt is negative. This is an “arithmetic” shift.
This is effectively a multiplication by 2^cnt, and when cnt is negative it's a division, rounded towards negative infinity. (Note that this is not the same rounding as quotient does.)
With n viewed as an infinite precision twos complement, ash means a left shift introducing zero bits, or a right shift dropping bits.
(number->string (ash #b1 3) 2)
=> "1000"
(number->string (ash #b1010 -1) 2)
=> "101"
;; -23 is bits ...11101001, -6 is bits ...111010
(ash -23 -2)
=> -6
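The rounding difference from quotient is easiest to see with a negative argument. The C sketch below expresses the shift as a multiplication or a floor division rather than with the C shift operators, because right-shifting a negative signed value is implementation-defined in C. The name ash is reused here for a hypothetical helper on long values, and overflow on left shifts is not checked.

```c
/* Sketch only: hypothetical helper, not part of the Guile API.
   Arithmetic shift of n by cnt bits, expressed as multiplication or
   floor division by 2^|cnt|.  Assumes |cnt| is less than the number
   of value bits in a long and that the result does not overflow. */
long ash(long n, int cnt)
{
    long d, q;

    if (cnt >= 0)
        return n * (1L << cnt);

    d = 1L << -cnt;
    q = n / d;                  /* C division truncates toward zero */
    if (n < 0 && n % d != 0)
        q -= 1;                 /* step down to reach the floor */
    return q;
}
```

So ash (-23, -2) is -6, whereas the truncating division -23 / 4 is -5.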
Return the number of bits in integer n. If n is positive, the 1-bits in its binary representation are counted. If negative, the 0-bits in its two's-complement binary representation are counted. If zero, 0 is returned.
(logcount #b10101010)
=> 4
(logcount 0)
=> 0
(logcount -2)
=> 1
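Counting the 0 bits of a negative number is the same as counting the 1 bits of its complement, which is non-negative. A minimal C sketch of that idea follows; the name logcount is reused for a hypothetical helper limited to long values, and a twos-complement representation is assumed.

```c
/* Sketch only: hypothetical helper, not part of the Guile API.
   Counts 1 bits for n >= 0 and 0 bits for n < 0, assuming a
   twos-complement long so that ~n equals -n - 1. */
int logcount(long n)
{
    unsigned long bits = (unsigned long)(n < 0 ? ~n : n);
    int count = 0;

    while (bits != 0) {
        bits &= bits - 1;       /* clear the lowest set bit */
        count++;
    }
    return count;
}
```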
Return the number of bits necessary to represent n.
For positive n this is how many bits to the most significant one bit. For negative n it's how many bits to the most significant zero bit in twos complement form.
(integer-length #b10101010)
=> 8
(integer-length #b1111)
=> 4
(integer-length 0)
=> 0
(integer-length -1)
=> 0
(integer-length -256)
=> 8
(integer-length -257)
=> 9
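The rule for negative numbers can again be reduced to the non-negative case by complementing first, since the most significant zero bit of n is the most significant one bit of its complement. The following C sketch shows this for long values; integer_length is a hypothetical helper name, and a twos-complement representation is assumed.

```c
/* Sketch only: hypothetical helper, not part of the Guile API.
   Position of the most significant 1 bit of n (for n >= 0) or of the
   most significant 0 bit (for n < 0), counting from 1; 0 if none.
   Assumes a twos-complement long so that ~n equals -n - 1. */
int integer_length(long n)
{
    unsigned long bits = (unsigned long)(n < 0 ? ~n : n);
    int length = 0;

    while (bits != 0) {
        bits >>= 1;
        length++;
    }
    return length;
}
```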
Return n raised to the power k. k must be an exact integer, n can be any number.
Negative k is supported, and results in 1/n^abs(k) in the usual way. n^0 is 1, as usual, and that includes 0^0, which is also 1.
(integer-expt 2 5)
=> 32
(integer-expt -3 3)
=> -27
(integer-expt 5 -3)
=> 1/125
(integer-expt 0 0)
=> 1
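One standard way to raise a number to an integer power is repeated squaring, which needs only about log2(k) multiplications. The C sketch below illustrates the technique on doubles; it is not a description of Guile's implementation, and unlike integer-expt it cannot return exact rationals such as 1/125. The C function name integer_expt is a hypothetical helper.

```c
/* Sketch only: exponentiation by repeated squaring on doubles.
   integer_expt here is a hypothetical C helper, not Guile's
   procedure; results are inexact doubles, never exact rationals. */
double integer_expt(double n, long k)
{
    /* Magnitude of k, computed in unsigned arithmetic so that the
       most negative long does not overflow. */
    unsigned long e = (k < 0) ? 0UL - (unsigned long)k : (unsigned long)k;
    double result = 1.0;        /* n^0 is 1, including 0^0 */

    while (e > 0) {
        if (e & 1UL)
            result *= n;
        n *= n;
        e >>= 1;
    }
    return (k < 0) ? 1.0 / result : result;
}
```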
Return the integer composed of the start (inclusive) through end (exclusive) bits of n. The startth bit becomes the 0-th bit in the result.
(number->string (bit-extract #b1101101010 0 4) 2)
=> "1010"
(number->string (bit-extract #b1101101010 4 9) 2)
=> "10110"
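Extracting a bit field amounts to shifting the wanted bits down to position 0 and masking off everything above the field. A small C sketch on unsigned long values follows; bit_extract is a hypothetical helper name, and the sketch assumes 0 <= start <= end with a field narrower than an unsigned long.

```c
/* Sketch only: hypothetical helper, not part of the Guile API.
   Returns bits start (inclusive) through end (exclusive) of n, with
   bit start moved to bit 0.  Assumes 0 <= start <= end and that
   end - start is less than the width of unsigned long. */
unsigned long bit_extract(unsigned long n, int start, int end)
{
    unsigned long width = (unsigned long)(end - start);
    unsigned long mask = (1UL << width) - 1UL;

    return (n >> start) & mask;
}
```

With n = 874, which is #b1101101010, bit_extract (n, 0, 4) is 10 (binary 1010) and bit_extract (n, 4, 9) is 22 (binary 10110).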
Pseudo-random numbers are generated from a random state object, which
can be created with seed->random-state. The state
parameter to the various functions below is optional; it defaults to
the state object in the *random-state* variable.
Return a copy of the random state state.
Return a number in [0, n).
Accepts a positive integer or real n and returns a number of the same type between zero (inclusive) and n (exclusive). The values returned have a uniform distribution.
Return an inexact real in an exponential distribution with mean 1. For an exponential distribution with mean u use (* u (random:exp)).
Fills vect with inexact real random numbers the sum of whose squares is equal to 1.0. Thinking of vect as coordinates in space of dimension n = (vector-length vect), the coordinates are uniformly distributed over the surface of the unit n-sphere.
Return an inexact real in a normal distribution. The distribution used has mean 0 and standard deviation 1. For a normal distribution with mean m and standard deviation d use (+ m (* d (random:normal))).
Fills vect with inexact real random numbers that are independent and standard normally distributed (i.e., with mean 0 and variance 1).
Fills vect with inexact real random numbers the sum of whose squares is less than 1.0. Thinking of vect as coordinates in space of dimension n = (vector-length vect), the coordinates are uniformly distributed within the unit n-sphere.
Return a uniformly distributed inexact real random number in [0,1).
Return a new random state using seed.
The global random state used by the above functions when the state parameter is not given.
Note that the initial value of *random-state* is the same every
time Guile starts up. Therefore, if you don't pass a state
parameter to the above procedures, and you don't set
*random-state* to (seed->random-state your-seed), where
your-seed is something that isn't the same every time,
you'll get the same sequence of “random” numbers on every run.
For example, unless the relevant source code has changed, (map
random (cdr (iota 19))), if the first use of random numbers since
Guile started up, will always give:
(map random (cdr (iota 19)))
=>
(0 1 1 2 2 2 1 2 6 7 10 0 5 3 12 5 5 12)
To use the time of day as the random seed, you can use code like this:
(let ((time (gettimeofday)))
  (set! *random-state*
        (seed->random-state (+ (car time)
                               (cdr time)))))
And then (depending on the time of day, of course):
(map random (cdr (iota 19)))
=>
(0 0 1 0 2 4 5 4 5 5 9 3 10 1 8 3 14 17)
For security applications, such as password generation, you should use more bits of seed. Otherwise an open source password generator could be attacked by guessing the seed... but that's a subject for another manual.
In Scheme, a character literal is written as #\name where
name is the name of the character that you want. Printable
characters have their usual single character name; for example,
#\a is a lower case a.
Most of the “control characters” (those below codepoint 32) in the
ASCII character set, as well as the space, may be referred
to by longer names: for example, #\tab, #\esc,
#\stx, and so on. The following table describes the
ASCII names for each character.
 0 = #\nul |  1 = #\soh |  2 = #\stx |  3 = #\etx
 4 = #\eot |  5 = #\enq |  6 = #\ack |  7 = #\bel
 8 = #\bs  |  9 = #\ht  | 10 = #\nl  | 11 = #\vt
12 = #\np  | 13 = #\cr  | 14 = #\so  | 15 = #\si
16 = #\dle | 17 = #\dc1 | 18 = #\dc2 | 19 = #\dc3
20 = #\dc4 | 21 = #\nak | 22 = #\syn | 23 = #\etb
24 = #\can | 25 = #\em  | 26 = #\sub | 27 = #\esc
28 = #\fs  | 29 = #\gs  | 30 = #\rs  | 31 = #\us
32 = #\sp
The “delete” character (octal 177) may be referred to with the name
#\del.
Several characters have more than one name:
Alias       | Original
#\space     | #\sp
#\newline   | #\nl
#\tab       | #\ht
#\backspace | #\bs
#\return    | #\cr
#\page      | #\np
#\null      | #\nul
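A few of the names from the tables above, checked against their codepoints. This is a small sketch using only the long alias names and #\nul.

```scheme
(char->integer #\space)             ; => 32
(char->integer #\newline)           ; => 10
(char->integer #\tab)               ; => 9
(char=? #\nul (integer->char 0))    ; => #t
```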
Return #t iff x is less than or equal to y in the ASCII sequence, else #f.
Return #t iff x is greater than or equal to y in the ASCII sequence, else #f.
Return #t iff x is the same character as y ignoring case, else #f.
Return #t iff x is less than y in the ASCII sequence ignoring case, else #f.
Return #t iff x is less than or equal to y in the ASCII sequence ignoring case, else #f.
Return #t iff x is greater than y in the ASCII sequence ignoring case, else #f.
Return #t iff x is greater than or equal to y in the ASCII sequence ignoring case, else #f.
Return #t iff chr is alphabetic, else #f.
Return #t iff chr is numeric, else #f.
Return #t iff chr is whitespace, else #f.
Return #t iff chr is uppercase, else #f.
Return #t iff chr is lowercase, else #f.
Return #t iff chr is either uppercase or lowercase, else #f.
Return the number corresponding to ordinal position of chr in the ASCII sequence.
Return the character at position n in the ASCII sequence.
Return the uppercase character version of chr.
Return the lowercase character version of chr.
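The descriptions above lost their procedure headers. Assuming they correspond to the standard R5RS character procedures, a few representative calls look like this:

```scheme
(char->integer #\A)           ; => 65
(integer->char 97)            ; => #\a
(char-upcase #\a)             ; => #\A
(char-downcase #\A)           ; => #\a
(char-ci=? #\a #\A)           ; => #t
(char<=? #\a #\b)             ; => #t
(char-alphabetic? #\7)        ; => #f
(char-numeric? #\7)           ; => #t
(char-whitespace? #\space)    ; => #t
```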
The features described in this section correspond directly to SRFI-14.
The data type charset implements sets of characters (see Characters). Because the internal representation of character sets is not visible to the user, a lot of procedures for handling them are provided.
Character sets can be created, extended, tested for the membership of a character, and compared to other character sets.
The Guile implementation of character sets currently deals only with 8-bit characters. In the future, when Guile gets support for international character sets, this will change, but the functions provided here will still be able to cope efficiently with very large character sets.
Use these procedures for testing whether an object is a character set,
or whether several character sets are equal or subsets of each other.
char-set-hash can be used for calculating a hash value, maybe for
usage in fast lookup procedures.
Return #t if obj is a character set, #f otherwise.
Return #t if all given character sets are equal.
Return #t if every character set csi is a subset of character set csi+1.
Compute a hash value for the character set cs. If bound is given and non-zero, it restricts the returned value to the range 0 ... bound - 1.
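A short sketch of these predicates, assuming the SRFI-14 names char-set?, char-set= and char-set<= for the procedures described above:

```scheme
(define vowels (string->char-set "aeiou"))

(char-set? vowels)                                   ; => #t
(char-set? "aeiou")                                  ; => #f
(char-set= vowels (char-set #\u #\o #\i #\e #\a))    ; => #t
(char-set<= vowels char-set:lower-case)              ; => #t
(char-set<= char-set:lower-case vowels)              ; => #f
```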
Character set cursors are a means for iterating over the members of a
character set. After creating a character set cursor with
char-set-cursor, a cursor can be dereferenced with
char-set-ref and advanced to the next member with
char-set-cursor-next. Whether a cursor has moved past the last
element of the set can be checked with end-of-char-set?.
Additionally, mapping and (un-)folding procedures for character sets are provided.
Return a cursor into the character set cs.
Return the character at the current cursor position cursor in the character set cs. It is an error to pass a cursor for which end-of-char-set? returns true.
Advance the character set cursor cursor to the next character in the character set cs. It is an error if the cursor given satisfies end-of-char-set?.
Return #t if cursor has reached the end of a character set, #f otherwise.
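The cursor protocol can be sketched as a loop. The helper name char-set-members is made up for this example; the argument order (character set first, cursor second) follows SRFI-14.

```scheme
;; Hypothetical helper: collect the members of a character set
;; by walking a cursor until end-of-char-set? is true.
(define (char-set-members cs)
  (let loop ((cursor (char-set-cursor cs))
             (acc '()))
    (if (end-of-char-set? cursor)
        acc
        (loop (char-set-cursor-next cs cursor)
              (cons (char-set-ref cs cursor) acc)))))

;; The order of traversal is not specified, so only the count is shown.
(length (char-set-members (string->char-set "abc")))   ; => 3
```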
Fold the procedure kons over the character set cs, initializing it with knil.
This is a fundamental constructor for character sets.
- g is used to generate a series of “seed” values from the initial seed: seed, (g seed), (g^2 seed), (g^3 seed), ...
- p tells us when to stop – when it returns true when applied to one of the seed values.
- f maps each seed value to a character. These characters are added to the base character set base_cs to form the result; base_cs defaults to the empty set.
This is a fundamental constructor for character sets.
- g is used to generate a series of “seed” values from the initial seed: seed, (g seed), (g^2 seed), (g^3 seed), ...
- p tells us when to stop – when it returns true when applied to one of the seed values.
- f maps each seed value to a character. These characters are added to the base character set base_cs to form the result; base_cs defaults to the empty set.
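As a sketch of folding and unfolding, the following builds the set of ASCII digits from the seed 0 and then counts its members. It assumes Guile's argument order for char-set-unfold (p, f, g, seed) and that kons receives the character first and the accumulated value second.

```scheme
;; Seeds run 0, 1, ..., 9; p stops the generation at 10.
(define digits
  (char-set-unfold (lambda (n) (> n 9))                   ; p: stop test
                   (lambda (n)                            ; f: seed -> character
                     (integer->char (+ n (char->integer #\0))))
                   (lambda (n) (+ n 1))                   ; g: next seed
                   0))

(char-set= digits (string->char-set "0123456789"))        ; => #t

;; Count the members by folding over the set.
(char-set-fold (lambda (c count) (+ count 1)) 0 digits)   ; => 10
```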
Apply proc to every character in the character set cs. The return value is not specified.
Map the procedure proc over every character in cs. proc must be a character -> character procedure.
New character sets are produced with these procedures.
Return a newly allocated character set containing all characters in cs.
Return a character set containing all given characters.
Convert the character list list to a character set. If the character set base_cs is given, the characters in this set are also included in the result.
Convert the character list list to a character set. The characters are added to base_cs and base_cs is returned.
Convert the string str to a character set. If the character set base_cs is given, the characters in this set are also included in the result.
Convert the string str to a character set. The characters from the string are added to base_cs, and base_cs is returned.
Return a character set containing every character from cs that satisfies pred. If provided, the characters from base_cs are added to the result.
Return a character set containing every character from cs that satisfies pred. The characters are added to base_cs and base_cs is returned.
Return a character set containing all characters whose character codes lie in the half-open range [lower,upper).
If error is a true value, an error is signalled if the specified range contains characters which are not contained in the implemented character range. If error is #f, these characters are silently left out of the resulting character set.
The characters in base_cs are added to the result, if given.
Return a character set containing all characters whose character codes lie in the half-open range [lower,upper).
If error is a true value, an error is signalled if the specified range contains characters which are not contained in the implemented character range. If error is #f, these characters are silently left out of the resulting character set.
The characters are added to base_cs and base_cs is returned.
Coerces x into a char-set. x may be a string, character or char-set. A string is converted to the set of its constituent characters; a character is converted to a singleton set; a char-set is returned as-is.
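A brief sketch of several constructors, assuming the SRFI-14 procedure names (list->char-set, ucs-range->char-set, char-set-filter, ->char-set). Results are compared with char-set= because the printed form of a character set is not specified here.

```scheme
(define abc (list->char-set '(#\a #\b #\c)))

;; Codes 97 to 101 inclusive: the upper bound 102 is excluded.
(define a-to-e (ucs-range->char-set 97 102))

;; Keep only the vowels found in a-to-e.
(define vowels-in-range
  (char-set-filter (lambda (c) (memv c '(#\a #\e #\i #\o #\u)))
                   a-to-e))

(char-set= abc (char-set #\a #\b #\c))            ; => #t
(char-set= a-to-e (string->char-set "abcde"))     ; => #t
(char-set= vowels-in-range (char-set #\a #\e))    ; => #t
(char-set= (->char-set "abc") abc)                ; => #t
(char-set= (->char-set #\a) (char-set #\a))       ; => #t
```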
Access the elements and other information of a character set with these procedures.
Return the number of elements in character set cs.
Return the number of elements in the character set cs which satisfy the predicate pred.
Return a list containing the elements of the character set cs.
Return a string containing the elements of the character set cs. The order in which the characters are placed in the string is not defined.
Return #t iff the character ch is contained in the character set cs.
Return a true value if every character in the character set cs satisfies the predicate pred.
Return a true value if any character in the character set cs satisfies the predicate pred.
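The query procedures can be tried on a small set. This sketch assumes the SRFI-14 names and argument orders (predicate first for char-set-count, char-set-every and char-set-any; set first for char-set-contains?).

```scheme
;; Distinct characters: h e l o space w r d
(define cs (string->char-set "hello world"))

(char-set-size cs)                       ; => 8
(char-set-count char-alphabetic? cs)     ; => 7
(char-set-contains? cs #\w)              ; => #t
(char-set-contains? cs #\z)              ; => #f
(char-set-every char-alphabetic? cs)     ; => #f  (the space is not alphabetic)
(char-set-any char-whitespace? cs)       ; => #t
(length (char-set->list cs))             ; => 8
```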
Character sets can be manipulated with the common set algebra operations, such as union, complement and intersection. All of these procedures provide side-effecting variants, which modify their character set argument(s).
Add all character arguments to the first argument, which must be a character set.
Delete all character arguments from the first argument, which must be a character set.
Add all character arguments to the first argument, which must be a character set.
Delete all character arguments from the first argument, which must be a character set.
Return the complement of the character set cs.
Return the union of all argument character sets.
Return the intersection of all argument character sets.
Return the difference of all argument character sets.
Return the exclusive-or of all argument character sets.
Return the difference and the intersection of all argument character sets.
Return the complement of the character set cs.
Return the union of all argument character sets.
Return the intersection of all argument character sets.
Return the difference of all argument character sets.
Return the exclusive-or of all argument character sets.
Return the difference and the intersection of all argument character sets.
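A sketch of the non-destructive set algebra procedures, assuming their SRFI-14 names. Each result is checked with char-set= against an explicitly written set.

```scheme
(define abc (char-set #\a #\b #\c))
(define bcd (char-set #\b #\c #\d))

(char-set= (char-set-union abc bcd) (string->char-set "abcd"))         ; => #t
(char-set= (char-set-intersection abc bcd) (string->char-set "bc"))    ; => #t
(char-set= (char-set-difference abc bcd) (char-set #\a))               ; => #t
(char-set= (char-set-xor abc bcd) (string->char-set "ad"))             ; => #t
(char-set= (char-set-adjoin abc #\z) (string->char-set "abcz"))        ; => #t
(char-set= (char-set-delete abc #\a) (string->char-set "bc"))          ; => #t
(char-set-contains? (char-set-complement abc) #\a)                     ; => #f
(char-set-contains? (char-set-complement abc) #\z)                     ; => #t
```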
In order to make the use of the character set data type and procedures useful, several predefined character set variables exist.
Currently, the contents of these character sets are recomputed upon a
successful setlocale call (see Locales) in order to reflect
the characters available in the current locale's codeset. For
instance, char-set:letter contains 52 characters under an ASCII
locale (e.g., the default C locale) and 117 characters under an
ISO-8859-1 (“Latin-1”) locale.
All lower-case characters.
All upper-case characters.
This is empty, because ASCII has no titlecase characters.
All letters, e.g. the union of char-set:lower-case and char-set:upper-case.
The union of char-set:letter and char-set:digit.
All characters which would put ink on the paper.
The union of char-set:graphic and char-set:whitespace.
All whitespace characters.
All horizontal whitespace characters, that is #\space and #\tab.
The ISO control characters with the codes 0–31 and 127.
The characters
!"#%&'()*,-./:;?@[\]_{}
The hexadecimal digits 0123456789abcdefABCDEF.
This character set contains all possible characters.
Strings are fixed-length sequences of characters. They can be created by calling constructor procedures, but they can also be entered literally at the REPL or in Scheme source files.
Strings always carry the information about how many characters they are composed of with them, so there is no special end-of-string character, like in C. That means that Scheme strings can contain any character, even the `#\nul' character `\0'.
To use strings efficiently, you need to know a bit about how Guile implements them. In Guile, a string consists of two parts, a head and the actual memory where the characters are stored. When a string (or a substring of it) is copied, only a new head gets created; the memory is usually not copied. The two heads start out pointing to the same memory.
When one of these two strings is modified, as with string-set!,
their common memory does get copied so that each string has its own
memory and modifying one does not accidentally modify the other as well.
Thus, Guile's strings are `copy on write'; the actual copying of their
memory is delayed until one string is written to.
This implementation makes functions like substring very
efficient in the common case that no modifications are done to the
involved strings.
If you do know that your strings are getting modified right away, you
can use substring/copy instead of substring. This
function performs the copy immediately at the time of creation. This
is more efficient, especially in a multi-threaded program. Also,
substring/copy can avoid the problem that a short substring
holds on to the memory of a very large original string that could
otherwise be recycled.
If you want to avoid the copy altogether, so that modifications of one
string show up in the other, you can use substring/shared. The
strings created by this procedure are called mutation sharing
substrings since the substring and the original string share
modifications to each other.
If you want to prevent modifications, use substring/read-only.
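The difference between the copy-on-write and the mutation-sharing variants can be observed directly. This is a minimal sketch; string-copy is used so that the original string is certain to be mutable.

```scheme
(define s (string-copy "hello world"))

(define cow    (substring s 0 5))          ; copy-on-write substring
(define shared (substring/shared s 0 5))   ; mutation-sharing substring

(string-set! s 0 #\J)

s        ; => "Jello world"
cow      ; => "hello"   (unaffected: the memory was copied on write)
shared   ; => "Jello"   (sees the modification made through s)
```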
Guile provides all procedures of SRFI-13 and a few more.
The read syntax for strings is an arbitrarily long sequence of
characters enclosed in double quotes (").
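Backslash is an escape character and can be used to insert the
following special characters. \" and \\ are R5RS
standard; the rest are Guile extensions. Note that they follow C string
syntax.
\\   Backslash character.
\"   Double quote character (an unescaped " is otherwise the end
of the string).
\0   NUL character (ASCII 0).
\a   Bell character (ASCII 7).
\f   Formfeed character (ASCII 12).
\n   Newline character (ASCII 10).
\r   Carriage return character (ASCII 13).
\t   Tab character (ASCII 9).
\v   Vertical tab character (ASCII 11).
\xHH Character code given by two hexadecimal digits. For example \x7f for an ASCII DEL (127).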
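Each escape sequence stands for a single character in the resulting string, which a quick length check makes visible:

```scheme
(string-length "a\tb")                   ; => 3
(char=? (string-ref "a\tb" 1) #\tab)     ; => #t
(string-length "\"Hi\"")                 ; => 4
(string-length "back\\slash")            ; => 10
```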
The following are examples of string literals:
"foo"
"bar plonk"
"Hello World"
"\"Hi\", he said."
The following procedures can be used to check whether a given string fulfills some specified property.
Return #t if obj is a string, else #f.
Return #t if str's length is zero, and #f otherwise.
(string-null? "") => #t
y => "foo"
(string-null? y) => #f
Check if char_pred is true for any character in string s.
char_pred can be a character to check for any equal to that, or a character set (see Character Sets) to check for any in that set, or a predicate procedure to call.
For a procedure, calls (char_pred c) are made successively on the characters from start to end. If char_pred returns true (i.e. non-#f), string-any stops and that return value is the return from string-any. The call on the last character (i.e. at end-1), if that point is reached, is a tail call.
If there are no characters in s (i.e. start equals end) then the return is #f.
Check if char_pred is true for every character in string s.
char_pred can be a character to check for every character equal to that, or a character set (see Character Sets) to check for every character being in that set, or a predicate procedure to call.
For a procedure, calls (char_pred c) are made successively on the characters from start to end. If char_pred returns #f, string-every stops and returns #f. The call on the last character (i.e. at end-1), if that point is reached, is a tail call and the return from that call is the return from string-every.
If there are no characters in s (i.e. start equals end) then the return is #t.
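A few calls illustrating the rules above, including the empty-string cases and the fact that string-any passes through the predicate's own return value:

```scheme
(string-any char-numeric? "abc123")             ; => #t
(string-any #\z "abc123")                       ; => #f
(string-every char-alphabetic? "abc123")        ; => #f
(string-every char-alphabetic? "abc123" 0 3)    ; => #t
(string-any char-numeric? "")                   ; => #f
(string-every char-numeric? "")                 ; => #t

;; The predicate's true return value becomes the result.
(string-any (lambda (c) (and (char-numeric? c) c)) "abc123")   ; => #\1
```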
The string constructor procedures create new string objects, possibly initializing them with some specified character data. See also String Selection, for ways to create strings from existing strings.
Return a newly allocated string made from the given character arguments.
(string #\x #\y #\z) => "xyz"
(string) => ""
Return a newly allocated string made from a list of characters.
(list->string '(#\a #\b #\c)) => "abc"
Return a newly allocated string made from a list of characters, in reverse order.
(reverse-list->string '(#\a #\B #\c)) => "cBa"
Return a newly allocated string of length k. If chr is given, then all elements of the string are initialized to chr, otherwise the contents of the string are unspecified.
Like scm_make_string, but expects the length as a size_t.
proc is an integer->char procedure. Construct a string of size len by applying proc to each index to produce the corresponding string element. The order in which proc is applied to the indices is not specified.
Append the strings in the string list ls, using the string delim as a delimiter between the elements of ls. grammar is a symbol which specifies how the delimiter is placed between the strings, and defaults to the symbol infix.
infix - Insert the separator between list elements. An empty list will produce an empty string.
strict-infix - Like infix, but will raise an error if given the empty list.
suffix - Insert the separator after every list element.
prefix - Insert the separator before each list element.
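The effect of the grammar symbols can be sketched as follows, assuming the procedure described here is string-join with the argument order list, delimiter, grammar:

```scheme
(string-join '("usr" "local" "bin") "/")            ; => "usr/local/bin"
(string-join '("usr" "local" "bin") "/" 'infix)     ; => "usr/local/bin"
(string-join '("usr" "local" "bin") "/" 'prefix)    ; => "/usr/local/bin"
(string-join '("usr" "local" "bin") "/" 'suffix)    ; => "usr/local/bin/"
(string-join '() "/")                               ; => ""
```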
When processing strings, it is often convenient to first convert them
into a list representation by using the procedure string->list,
work with the resulting list, and then convert it back into a string.
These procedures are useful for similar tasks.
Convert the string str into a list of characters.
Split the string str into a list of the substrings delimited by appearances of the character chr. Note that an empty substring between separator characters will result in an empty string in the result list.
(string-split "root:x:0:0:root:/root:/bin/bash" #\:)
=> ("root" "x" "0" "0" "root" "/root" "/bin/bash")
(string-split "::" #\:)
=> ("" "" "")
(string-split "" #\:)
=> ("")
Portions of strings can be extracted by these procedures.
string-ref delivers individual characters whereas
substring can be used to extract substrings from longer strings.
Return the number of characters in string.
Return the number of characters in str as a size_t.
Return character k of str using zero-origin indexing. k must be a valid index of str.
Return character k of str using zero-origin indexing. k must be a valid index of str.
Return a copy of the given string str.
The returned string shares storage with str initially, but it is copied as soon as one of the two strings is modified.
Return a new string formed from the characters of str beginning with index start (inclusive) and ending with index end (exclusive). str must be a string, start and end must be exact integers satisfying:
0 <= start <= end <= (string-length str).
The returned string shares storage with str initially, but it is copied as soon as one of the two strings is modified.
Like substring, but the strings continue to share their storage even if they are modified. Thus, modifications to str show up in the new string, and vice versa.
Like substring, but the storage for the new string is copied immediately.
Like substring, but the resulting string cannot be modified.
Like scm_substring, etc. but the bounds are given as a size_t.
Return the n first characters of s.
Return all but the first n characters of s.
Return the n last characters of s.
Return all but the last n characters of s.
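Assuming these four descriptions belong to the SRFI-13 procedures string-take, string-drop, string-take-right and string-drop-right, they behave as follows:

```scheme
(string-take "Pete Szilagyi" 6)         ; => "Pete S"
(string-drop "Pete Szilagyi" 6)         ; => "zilagyi"
(string-take-right "Beta rules" 5)      ; => "rules"
(string-drop-right "Beta rules" 5)      ; => "Beta "
```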
Take characters start to end from the string s and either pad with char or truncate them to give len characters.
string-pad pads or truncates on the left, so for example
(string-pad "x" 3) => "  x"
(string-pad "abcde" 3) => "cde"
string-pad-right pads or truncates on the right, so for example
(string-pad-right "x" 3) => "x  "
(string-pad-right "abcde" 3) => "abc"
Trim occurrences of char_pred from the ends of s.
string-trim trims char_pred characters from the left (start) of the string, string-trim-right trims them from the right (end) of the string, string-trim-both trims from both ends.
char_pred can be a character, a character set, or a predicate procedure to call on each character. If char_pred is not given the default is whitespace as per char-set:whitespace (see Standard Character Sets).
(string-trim " x ") => "x "
(string-trim-right "banana" #\a) => "banan"
(string-trim-both ".,xy:;" char-set:punctuation) => "xy"
(string-trim-both "xyzzy" (lambda (c) (or (eqv? c #\x) (eqv? c #\y)))) => "zz"
These procedures are for modifying strings in-place. This means that the result of the operation is not a new string; instead, the original string's memory representation is modified.
Store chr in element k of str and return an unspecified value. k must be a valid index of str.
Like scm_string_set_x, but the index is given as a size_t.
Stores chr in every element of the given str and returns an unspecified value.
Change every character in str between start and end to fill.
(define y "abcdefg")
(substring-fill! y 1 3 #\r)
y => "arrdefg"
Copy the substring of str1 bounded by start1 and end1 into str2 beginning at position start2. str1 and str2 can be the same string.
Copy the sequence of characters from index range [start, end) in string s to string target, beginning at index tstart. The characters are copied left-to-right or right-to-left as needed – the copy is guaranteed to work, even if target and s are the same string. It is an error if the copy operation runs off the end of the target string.
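A sketch of the copy operation just described, assuming it is the SRFI-13 procedure string-copy! with arguments target, tstart, s, start, end. The second call shows the overlapping case with the same string as source and target.

```scheme
(define target (string-copy "0123456789"))
(string-copy! target 2 "abcdef" 1 4)    ; copy "bcd" to index 2
target                                   ; => "01bcd56789"

(define s (string-copy "abcdef"))
(string-copy! s 2 s 0 4)                 ; overlapping: copy "abcd" to index 2
s                                        ; => "ababcd"
```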
The procedures in this section are similar to the character ordering predicates (see Characters), but are defined on character sequences.
The first set is specified in R5RS and has names that end in ?.
The second set is specified in SRFI-13 and the names have no ending
?. The predicates ending in -ci ignore the character case
when comparing strings.
Lexicographic equality predicate; return #t if the two strings are the same length and contain the same characters in the same positions, otherwise return #f.
The procedure string-ci=? treats upper and lower case letters as though they were the same character, but string=? treats upper and lower case as distinct characters.
Lexicographic ordering predicate; return #t if s1 is lexicographically less than s2.
Lexicographic ordering predicate; return #t if s1 is lexicographically less than or equal to s2.
Lexicographic ordering predicate; return #t if s1 is lexicographically greater than s2.
Lexicographic ordering predicate; return #t if s1 is lexicographically greater than or equal to s2.
Case-insensitive string equality predicate; return #t if the two strings are the same length and their component characters match (ignoring case) at each position; otherwise return #f.
Case-insensitive lexicographic ordering predicate; return #t if s1 is lexicographically less than s2 regardless of case.
Case-insensitive lexicographic ordering predicate; return #t if s1 is lexicographically less than or equal to s2 regardless of case.
Case-insensitive lexicographic ordering predicate; return #t if s1 is lexicographically greater than s2 regardless of case.
Case-insensitive lexicographic ordering predicate; return #t if s1 is lexicographically greater than or equal to s2 regardless of case.
Apply proc_lt, proc_eq, proc_gt to the mismatch index, depending upon whether s1 is less than, equal to, or greater than s2. The mismatch index is the largest index i such that for every 0 <= j < i, s1[j] = s2[j] – that is, i is the first position that does not match.
Apply proc_lt, proc_eq, proc_gt to the mismatch index, depending upon whether s1 is less than, equal to, or greater than s2. The mismatch index is the largest index i such that for every 0 <= j < i, s1[j] = s2[j] – that is, i is the first position that does not match. The character comparison is done case-insensitively.
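A sketch of the three-way comparison, assuming the procedure described is the SRFI-13 string-compare taking s1, s2 and then the three procedures. Each procedure here simply tags the mismatch index it receives.

```scheme
;; Hypothetical helper that reports which branch was taken
;; together with the mismatch index.
(define (compare-report s1 s2)
  (string-compare s1 s2
                  (lambda (i) (list 'less i))
                  (lambda (i) (list 'equal i))
                  (lambda (i) (list 'greater i))))

;; "ap" matches, then #\p < #\r at index 2.
(compare-report "apple" "apricot")    ; => (less 2)
(compare-report "apricot" "apple")    ; => (greater 2)
```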
Return #f if s1 and s2 are not equal, a true value otherwise.
Return #f if s1 and s2 are equal, a true value otherwise.
Return #f if s1 is greater than or equal to s2, a true value otherwise.
Return #f if s1 is less than or equal to s2, a true value otherwise.
Return #f if s1 is greater than s2, a true value otherwise.
Return #f if s1 is less than s2, a true value otherwise.
Return #f if s1 and s2 are not equal, a true value otherwise. The character comparison is done case-insensitively.
Return #f if s1 and s2 are equal, a true value otherwise. The character comparison is done case-insensitively.
Return #f if s1 is greater than or equal to s2, a true value otherwise. The character comparison is done case-insensitively.
Return #f if s1 is less than or equal to s2, a true value otherwise. The character comparison is done case-insensitively.
Return #f if s1 is greater than s2, a true value otherwise. The character comparison is done case-insensitively.
Return #f if s1 is less than s2, a true value otherwise. The character comparison is done case-insensitively.
Compute a hash value for s. The optional argument bound is a non-negative exact integer specifying the range of the hash function. A positive value restricts the return value to the range [0,bound).
Compute a hash value for s. The optional argument bound is a non-negative exact integer specifying the range of the hash function. A positive value restricts the return value to the range [0,bound).
Search through the string s from left to right, returning the index of the first occurrence of a character which
- equals char_pred, if it is a character,
- satisfies the predicate char_pred, if it is a procedure,
- is in the set char_pred, if it is a character set.
Search through the string s from right to left, returning the index of the last occurrence of a character which
- equals char_pred, if it is a character,
- satisfies the predicate char_pred, if it is a procedure,
- is in the set char_pred, if it is a character set.
Return the length of the longest common prefix of the two strings.
Return the length of the longest common prefix of the two strings, ignoring character case.
Return the length of the longest common suffix of the two strings.
Return the length of the longest common suffix of the two strings, ignoring character case.
Is s1 a prefix of s2?
Is s1 a prefix of s2, ignoring character case?
Is s1 a suffix of s2?
Is s1 a suffix of s2, ignoring character case?
Search through the string s from right to left, returning the index of the last occurrence of a character which
- equals char_pred, if it is a character,
- satisfies the predicate char_pred, if it is a procedure,
- is in the set char_pred, if it is a character set.
Search through the string s from left to right, returning the index of the first occurrence of a character which
- does not equal char_pred, if it is a character,
- does not satisfy the predicate char_pred, if it is a procedure,
- is not in the set char_pred, if it is a character set.
Search through the string s from right to left, returning the index of the last occurrence of a character which
- does not equal char_pred, if it is a character,
- does not satisfy the predicate char_pred, if it is a procedure,
- is not in the set char_pred, if it is a character set.
Return the number of characters in the string s, each of which
- equals char_pred, if it is a character,
- satisfies the predicate char_pred, if it is a procedure,
- is in the set char_pred, if it is a character set.
Does string s1 contain string s2? Return the index in s1 where s2 occurs as a substring, or false. The optional start/end indices restrict the operation to the indicated substrings.
Does string s1 contain string s2? Return the index in s1 where s2 occurs as a substring, or false. The optional start/end indices restrict the operation to the indicated substrings. Character comparison is done case-insensitively.
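The searching procedures can be sketched on small inputs. The names used (string-index, string-rindex, string-skip, string-count, string-prefix?, string-suffix?, string-prefix-length, string-contains) are the SRFI-13 names assumed to match the descriptions above, with the string as the first argument.

```scheme
;; Indices:  h0 e1 l2 l3 o4 _5 w6 o7 r8 l9 d10
(string-index "hello world" #\o)                 ; => 4
(string-rindex "hello world" #\o)                ; => 7
(string-index "hello world" char-whitespace?)    ; => 5
(string-skip "   indented" #\space)              ; => 3
(string-count "banana" #\a)                      ; => 3
(string-prefix? "he" "hello")                    ; => #t
(string-suffix? "lo" "hello")                    ; => #t
(string-prefix-length "prefix" "preface")        ; => 4
(string-contains "hello world" "o w")            ; => 4
(string-contains "hello world" "xyz")            ; => #f
```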
These are procedures for mapping strings to their upper- or lower-case equivalents, respectively, or for capitalizing strings.
Upcase every character in str.
Destructively upcase every character in str.
(string-upcase! y) => "ARRDEFG"
y => "ARRDEFG"
Downcase every character in str.
Destructively downcase every character in str.
y => "ARRDEFG"
(string-downcase! y) => "arrdefg"
y => "arrdefg"
Return a freshly allocated string with the characters in str, where the first character of every word is capitalized.
Upcase the first character of every word in str destructively and return str.
y => "hello world"
(string-capitalize! y) => "Hello World"
y => "Hello World"
Titlecase every first character in a word in str.
Destructively titlecase every first character in a word in str.
Reverse the string str. The optional arguments start and end delimit the region of str to operate on.
Reverse the string str in-place. The optional arguments start and end delimit the region of str to operate on. The return value is unspecified.
Return a newly allocated string whose characters form the concatenation of the given strings, args.
(let ((h "hello ")) (string-append h "world")) => "hello world"
Like string-append, but the result may share memory with the argument strings.
Append the elements of ls (which must be strings) together into a single string. Guaranteed to return a freshly allocated string.
Without optional arguments, this procedure is equivalent to
(string-concatenate (reverse ls))
If the optional argument final_string is specified, it is consed onto the beginning of ls before performing the list-reverse and string-concatenate operations. If end is given, only the characters of final_string up to index end are used.
Guaranteed to return a freshly allocated string.
Like string-concatenate, but the result may share memory with the strings in the list ls.
Like string-concatenate-reverse, but the result may share memory with the strings in the ls arguments.
proc is a char->char procedure that is mapped over s. The order in which the procedure is applied to the string elements is not specified.
proc is a char->char procedure that is mapped over s. The order in which the procedure is applied to the string elements is not specified. The string s is modified in-place; the return value is not specified.
proc is mapped over s in left-to-right order. The return value is not specified.
Call (proc i) for each index i in s, from left to right.
For example, to change characters to alternately upper and lower case,
(define str (string-copy "studly"))
(string-for-each-index
 (lambda (i)
   (string-set! str i
                ((if (even? i) char-upcase char-downcase)
                 (string-ref str i))))
 str)
str => "StUdLy"
Fold kons over the characters of s, with knil as the terminating element, from left to right. kons must expect two arguments: The actual character and the last result of kons' application.
Fold kons over the characters of s, with knil as the terminating element, from right to left. kons must expect two arguments: The actual character and the last result of kons' application.
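The two fold directions are easiest to see with cons as kons, which receives the current character first and the accumulated result second. The procedure names string-fold and string-fold-right are the SRFI-13 names assumed for the descriptions above.

```scheme
(string-fold cons '() "abc")          ; => (#\c #\b #\a)
(string-fold-right cons '() "abc")    ; => (#\a #\b #\c)

;; Count occurrences of #\a by folding from left to right.
(string-fold (lambda (c n) (if (char=? c #\a) (+ n 1) n))
             0
             "banana")                ; => 3
```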
- g is used to generate a series of seed values from the initial seed: seed, (g seed), (g^2 seed), (g^3 seed), ...
- p tells us when to stop – when it returns true when applied to one of these seed values.
- f maps each seed value to the corresponding character in the result string. These chars are assembled into the string in a left-to-right order.
- base is the optional initial/leftmost portion of the constructed string; it defaults to the empty string.
- make_final is applied to the terminal seed value (on which p returns true) to produce the final/rightmost portion of the constructed string. The default is nothing extra.
- g is used to generate a series of seed values from the initial seed: seed, (g seed), (g^2 seed), (g^3 seed), ...
- p tells us when to stop – when it returns true when applied to one of these seed values.
- f maps each seed value to the corresponding character in the result string. These chars are assembled into the string in a right-to-left order.
- base is the optional initial/rightmost portion of the constructed string; it defaults to the empty string.
- make_final is applied to the terminal seed value (on which p returns true) to produce the final/leftmost portion of the constructed string. It defaults to
(lambda (x) "").
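A sketch of both unfold directions, assuming the SRFI-13 names string-unfold and string-unfold-right with the argument order p, f, g, seed, followed by the optional base and make_final.

```scheme
;; Seeds run 0, 1, ..., 9; each is mapped to its digit character.
(string-unfold (lambda (n) (> n 9))                        ; p: stop test
               (lambda (n)                                 ; f: seed -> character
                 (integer->char (+ n (char->integer #\0))))
               (lambda (n) (+ n 1))                        ; g: next seed
               0)                                          ; => "0123456789"

;; With a list as the seed, unfolding walks the list.
(string-unfold null? car cdr '(#\a #\b #\c))               ; => "abc"
(string-unfold-right null? car cdr '(#\a #\b #\c))         ; => "cba"

;; base supplies the leftmost part, make_final the rightmost part.
(string-unfold null? car cdr '(#\b #\c) "a" (lambda (x) "!"))   ; => "abc!"
```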
This is the extended substring procedure that implements replicated copying of a substring of some string.
s is a string, start and end are optional arguments that demarcate a substring of s, defaulting to 0 and the length of s. Replicate this substring up and down index space, in both the positive and negative directions.
xsubstring returns the substring of this string beginning at index from, and ending at to, which defaults to from + (end - start).
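The replication is easiest to picture as the string repeated endlessly in both directions, with from and to selecting a window. The following calls are the standard SRFI-13 examples:

```scheme
(xsubstring "abcdef" 2)      ; => "cdefab"   (a left rotation by 2)
(xsubstring "abcdef" -2)     ; => "efabcd"   (a right rotation by 2)
(xsubstring "abc" 0 7)       ; => "abcabca"  (replication up to length 7)
```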
Exactly the same as xsubstring, but the extracted text is written into the string target starting at index tstart. The operation is not defined if (eq? target s) or these arguments share storage; you cannot copy a string on top of itself.
Return the string s1, but with the characters start1 ... end1 replaced by the characters start2 ... end2 from s2.
Split the string s into a list of substrings, where each substring is a maximal non-empty contiguous sequence of characters from the character set token_set, which defaults to char-set:graphic. If start or end indices are provided, they restrict string-tokenize to operating on the indicated substring of s.
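A short sketch of string-tokenize with the default token set, with an explicit character set, and with start and end indices. The input strings and variable names are made up for the illustration.

```scheme
(use-modules (srfi srfi-13) (srfi srfi-14))

;; Default token set: runs of graphic (non-whitespace) characters.
(define words (string-tokenize "the quick  brown fox"))
;; words is ("the" "quick" "brown" "fox")

;; Only digits count as token characters here.
(define numbers (string-tokenize "a1b22c333" char-set:digit))
;; numbers is ("1" "22" "333")

;; start = 4 and end = 9 restrict tokenizing to the substring "quick".
(define part (string-tokenize "the quick brown fox" char-set:graphic 4 9))
;; part is ("quick")
```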
Filter the string s, retaining only those characters which satisfy char_pred.
If char_pred is a procedure, it is applied to each character as a predicate; if it is a character, it is tested for equality; and if it is a character set, it is tested for membership.
Delete characters satisfying char_pred from s.
If char_pred is a procedure, it is applied to each character as a predicate; if it is a character, it is tested for equality; and if it is a character set, it is tested for membership.
When creating a Scheme string from a C string or when converting a Scheme string to a C string, the concept of character encoding becomes important.
In C, a string is just a sequence of bytes, and the character encoding describes the relation between these bytes and the actual characters that make up the string. For Scheme strings, character encoding is not an issue (most of the time), since in Scheme you never get to see the bytes, only the characters.
Well, ideally, anyway. Right now, Guile simply equates Scheme characters and bytes, ignoring the possibility of multi-byte encodings completely. This will change in the future, where Guile will use Unicode codepoints as its characters and UTF-8 or some other encoding as its internal encoding. When you exclusively use the functions listed in this section, you are `future-proof'.
Converting a Scheme string to a C string will often allocate fresh
memory to hold the result. You must take care that this memory is
properly freed eventually. In many cases, this can be achieved by
using scm_dynwind_free inside an appropriate dynwind context
(see Dynamic Wind).
Creates a new Scheme string that has the same contents as str when interpreted in the current locale character encoding.
For scm_from_locale_string, str must be null-terminated.
For scm_from_locale_stringn, len specifies the length of str in bytes, and str does not need to be null-terminated. If len is (size_t)-1, then str does need to be null-terminated and the real length will be found with strlen.
Like scm_from_locale_string and scm_from_locale_stringn, respectively, but also frees str with free eventually. Thus, you can use this function when you would free str anyway immediately after creating the Scheme string. In certain cases, Guile can then use str directly as its internal representation.
Returns a C string in the current locale encoding with the same contents as str. The C string must be freed with free eventually, maybe by using scm_dynwind_free (see Dynamic Wind).
For scm_to_locale_string, the returned string is null-terminated and an error is signalled when str contains #\nul characters.
For scm_to_locale_stringn and lenp not NULL, str might contain #\nul characters and the length of the returned string in bytes is stored in *lenp. The returned string will not be null-terminated in this case. If lenp is NULL, scm_to_locale_stringn behaves like scm_to_locale_string.
Puts str as a C string in the current locale encoding into the memory pointed to by buf. The buffer at buf has room for max_len bytes and scm_to_locale_stringbuf will never store more than that. No terminating '\0' will be stored.
The return value of scm_to_locale_stringbuf is the number of bytes that are needed for all of str, regardless of whether buf was large enough to hold them. Thus, when the return value is larger than max_len, only max_len bytes have been stored and you probably need to try again with a larger buffer.
A regular expression (or regexp) is a pattern that describes a whole class of strings. A full description of regular expressions and their syntax is beyond the scope of this manual; an introduction can be found in the Emacs manual (see Syntax of Regular Expressions), or in many general Unix reference books.
If your system does not include a POSIX regular expression library,
and you have not linked Guile with a third-party regexp library such
as Rx, these functions will not be available. You can tell whether
your Guile installation includes regular expression support by
checking whether (provided? 'regex) returns true.
The following regexp and string matching features are provided by the
(ice-9 regex) module. Before using the described functions,
you should load this module by executing (use-modules (ice-9
regex)).
By default, Guile supports POSIX extended regular expressions. That means that the characters `(', `)', `+' and `?' are special, and must be escaped if you wish to match the literal characters.
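The following sketch checks for regexp support and then contrasts an unescaped parenthesis (a grouping operator) with an escaped one (a literal character). The variable names are illustrative, and the matching calls assume that regexp support is present.

```scheme
(use-modules (ice-9 regex))

;; #t when this Guile was built with regular expression support.
(define have-regex? (provided? 'regex))

;; Unescaped parentheses group; they do not match literal parentheses.
(define grouped (match:substring (string-match "(x)" "f(x)")))      ; "x"

;; Escaped parentheses match the literal characters. The backslashes
;; are doubled because the Scheme reader consumes one level of them.
(define literal (match:substring (string-match "\\(x\\)" "f(x)")))  ; "(x)"
```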
This regular expression interface was modeled after that implemented by SCSH, the Scheme Shell. It is intended to be upwardly compatible with SCSH regular expressions.
Zero bytes (#\nul) cannot be used in regex patterns or input
strings, since the underlying C functions treat that as the end of
string. If there's a zero byte an error is thrown.
Patterns and input strings are treated as being in the locale
character set if setlocale has been called (see Locales),
and in a multibyte locale this includes treating multi-byte sequences
as a single character. (Guile strings are currently merely bytes,
though this may change in the future; see Conversion to/from C.)
Compile the string pattern into a regular expression and compare it with str. The optional numeric argument start specifies the position of str at which to begin matching.
string-match returns a match structure which describes what, if anything, was matched by the regular expression. See Match Structures. If str does not match pattern at all, string-match returns #f.
Two examples of a match follow. In the first example, the pattern matches the four digits in the match string. In the second, the pattern matches nothing.
(string-match "[0-9][0-9][0-9][0-9]" "blah2002")
=> #("blah2002" (4 . 8))
(string-match "[A-Za-z]" "123456")
=> #f
Each time string-match is called, it must compile its
pattern argument into a regular expression structure. This
operation is expensive, which makes string-match inefficient if
the same regular expression is used several times (for example, in a
loop). For better performance, you can compile a regular expression in
advance and then match strings against the compiled regexp.
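A minimal sketch of that advice: the regexp is compiled once with make-regexp and reused for every string via regexp-exec. The helper first-year and the sample strings are hypothetical names invented for this example.

```scheme
(use-modules (ice-9 regex))

;; Compile once, outside the loop.
(define four-digits (make-regexp "[0-9][0-9][0-9][0-9]"))

;; Hypothetical helper: return the first four-digit run in str, or #f.
(define (first-year str)
  (let ((m (regexp-exec four-digits str)))
    (and m (match:substring m))))

(define years (map first-year '("blah2002" "no digits" "in 1999")))
;; years is ("2002" #f "1999")
```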
Compile the regular expression described by pat, and return the compiled regexp structure. If pat does not describe a legal regular expression, make-regexp throws a regular-expression-syntax error.
The flag arguments change the behavior of the compiled regular expression. The following values may be supplied:
— Variable: regexp/newline
If a newline appears in the target string, then permit the `^' and `$' operators to match immediately after or immediately before the newline, respectively. Also, the `.' and `[^...]' operators will never match a newline character. The intent of this flag is to treat the target string as a buffer containing many lines of text, and the regular expression as a pattern that may match a single one of those lines.
— Variable: regexp/basic
Compile a basic (“obsolete”) regexp instead of the extended (“modern”) regexps that are the default. Basic regexps do not consider `|', `+' or `?' to be special characters, and require the `{...}' and `(...)' metacharacters to be backslash-escaped (see Backslash Escapes). There are several other differences between basic and extended regular expressions, but these are the most significant.
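To illustrate the effect of regexp/newline, the sketch below searches a three-line string for a line in the middle. The string text and the variable names are assumptions made for the example.

```scheme
(use-modules (ice-9 regex))

(define text "first\nsecond\nthird")

;; Without regexp/newline, `^' anchors only at the start of the string,
;; so a line in the middle is not found.
(define plain (regexp-exec (make-regexp "^second$") text))        ; #f

;; With regexp/newline, `^' and `$' also match around each newline.
(define by-line
  (match:substring
   (regexp-exec (make-regexp "^second$" regexp/newline) text)))   ; "second"

;; Several flags can be passed to make-regexp at once.
(define any-case
  (match:substring
   (regexp-exec (make-regexp "^SECOND$" regexp/newline regexp/icase)
                text)))                                            ; "second"
```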
Match the compiled regular expression rx against str. If the optional integer start argument is provided, begin matching from that position in the string. Return a match structure describing the results of the match, or #f if no match could be found.
The flags argument changes the matching behavior. The following flag values may be supplied, use logior (see Bitwise Operations) to combine them,
;; Regexp to match uppercase letters
(define r (make-regexp "[A-Z]*"))
;; Regexp to match letters, ignoring case
(define ri (make-regexp "[A-Z]*" regexp/icase))
;; Search for bob using regexp r
(match:substring (regexp-exec r "bob"))
=> "" ; matches only the empty string
;; Search for bob using regexp ri
(match:substring (regexp-exec ri "Bob"))
=> "Bob" ; matched case insensitive
Return #t if obj is a compiled regular expression, or #f otherwise.
Return a list of match structures which are the non-overlapping matches of regexp in str. regexp can be either a pattern string or a compiled regexp. The flags argument is as per regexp-exec above.
(map match:substring (list-matches "[a-z]+" "abc 42 def 78"))
=> ("abc" "def")
Apply proc to the non-overlapping matches of regexp in str, to build a result. regexp can be either a pattern string or a compiled regexp. The flags argument is as per regexp-exec above.
proc is called as (proc match prev) where match is a match structure and prev is the previous return from proc. For the first call prev is the given init parameter. fold-matches returns the final value from proc.
For example to count matches,
(fold-matches "[a-z][0-9]" "abc x1 def y2" 0
  (lambda (match count)
    (1+ count)))
=> 2
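Two further sketches of how the accumulator flows through proc: one sums the numbers found in a string, the other collects the matched text. The variable names are illustrative.

```scheme
(use-modules (ice-9 regex))

;; Sum every run of digits in the string.
(define total
  (fold-matches "[0-9]+" "abc 42 def 78" 0
                (lambda (match sum)
                  (+ sum (string->number (match:substring match))))))
;; total is 120

;; Consing onto the accumulator yields the matches in reverse order.
(define reversed
  (fold-matches "[a-z]+" "abc 42 def 78" '()
                (lambda (match acc)
                  (cons (match:substring match) acc))))
;; reversed is ("def" "abc")
```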
Regular expressions are commonly used to find patterns in one string and replace them with the contents of another string. The following functions are convenient ways to do this.
Write to port selected parts of the match structure match. Or if port is #f then form a string from those parts and return that.
Each item specifies a part to be written, and may be one of the following,
- A string. String arguments are written out verbatim.
- An integer. The submatch with that number is written (match:substring). Zero is the entire match.
- The symbol `pre'. The portion of the matched string preceding the regexp match is written (match:prefix).
- The symbol `post'. The portion of the matched string following the regexp match is written (match:suffix).
For example, changing a match and retaining the text before and after,
(regexp-substitute #f (string-match "[0-9]+" "number 25 is good")
  'pre "37" 'post)
=> "number 37 is good"
Or matching a yyyymmdd format date such as `20020828' and re-ordering and hyphenating the fields.
(define date-regex "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])")
(define s "Date 20020429 12am.")
(regexp-substitute #f (string-match date-regex s)
  'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
=> "Date 04-29-2002 12am. (20020429)"
Write to port selected parts of matches of regexp in target. If port is #f then form a string from those parts and return that. regexp can be a string or a compiled regex.
This is similar to regexp-substitute, but allows global substitutions on target. Each item behaves as per regexp-substitute, with the following differences,
- A function. Called as (item match) with the match structure for the regexp match, it should return a string to be written to port.
- The symbol `post'. This doesn't output anything, but instead causes regexp-substitute/global to recurse on the unmatched portion of target.
This must be supplied to perform a global search and replace on target; without it regexp-substitute/global returns after a single match and output.
For example, to collapse runs of tabs and spaces to a single hyphen each,
(regexp-substitute/global #f "[ \t]+" "this is the text"
  'pre "-" 'post)
=> "this-is-the-text"
Or using a function to reverse the letters in each word,
(regexp-substitute/global #f "[a-z]+" "to do and not-do"
  'pre (lambda (m) (string-reverse (match:substring m))) 'post)
=> "ot od dna ton-od"
Without the post symbol, just one regexp match is made. For example the following is the date example from regexp-substitute above, without the need for the separate string-match call.
(define date-regex "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])")
(define s "Date 20020429 12am.")
(regexp-substitute/global #f date-regex s
  'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
=> "Date 04-29-2002 12am. (20020429)"
A match structure is the object returned by string-match and
regexp-exec. It describes which portion of a string, if any,
matched the given regular expression. Match structures include: a
reference to the string that was checked for matches; the starting and
ending positions of the regexp match; and, if the regexp included any
parenthesized subexpressions, the starting and ending positions of each
submatch.
In each of the regexp match functions described below, the match
argument must be a match structure returned by a previous call to
string-match or regexp-exec. Most of these functions
return some information about the original target string that was
matched against a regular expression; we will call that string
target for easy reference.
Return #t if obj is a match structure returned by a previous call to regexp-exec, or #f otherwise.
Return the portion of target matched by subexpression number n. Submatch 0 (the default) represents the entire regexp match. If the regular expression as a whole matched, but the subexpression number n did not match, return #f.
(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
(match:substring s)
=> "2002"
;; match starting at offset 6 in the string
(match:substring
(string-match "[0-9][0-9][0-9][0-9]" "blah987654" 6))
=> "7654"
In the following example, the result is 4, since the match starts at character index 4:
(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
(match:start s)
=> 4
In the following example, the result is 8, since the match runs between characters 4 and 8 (i.e. the “2002”).
(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
(match:end s)
=> 8
Return the unmatched portion of target preceding the regexp match.
(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
(match:prefix s)
=> "blah"
Return the unmatched portion of target following the regexp match.
(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
(match:suffix s)
=> "foo"
Return the number of parenthesized subexpressions from match. Note that the entire regular expression match itself counts as a subexpression, and failed submatches are included in the count.
(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
(match:string s)
=> "blah2002foo"
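The accessors above can also be applied to individual subexpressions. The following sketch uses a pattern with two parenthesized groups; the pattern, the target strings and the variable names are made up for the illustration.

```scheme
(use-modules (ice-9 regex))

(define m (string-match "([0-9]+)-([0-9]+)" "range 10-25 ok"))

(define whole   (match:substring m))     ; "10-25"
(define low     (match:substring m 1))   ; "10"
(define high    (match:substring m 2))   ; "25"
(define high-at (match:start m 2))       ; 9
(define stop    (match:end m))           ; 11
(define before  (match:prefix m))        ; "range "
(define after   (match:suffix m))        ; " ok"
(define groups  (match:count m))         ; 3, including the whole match

;; A subexpression that took no part in the match yields #f.
(define alt    (string-match "(a)|(b)" "b"))
(define unused (match:substring alt 1))  ; #f
(define used   (match:substring alt 2))  ; "b"
```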
Sometimes you will want a regexp to match characters like `*' or `$' exactly. For example, to check whether a particular string represents a menu entry from an Info node, it would be useful to match it against a regexp like `^* [^:]*::'. However, this won't work; because the asterisk is a metacharacter, it won't match the `*' at the beginning of the string. In this case, we want to make the first asterisk un-magic.
You can do this by preceding the metacharacter with a backslash character `\'. (This is also called quoting the metacharacter, and is known as a backslash escape.) When Guile sees a backslash in a regular expression, it considers the following glyph to be an ordinary character, no matter what special meaning it would ordinarily have. Therefore, we can make the above example work by changing the regexp to `^\* [^:]*::'. The `\*' sequence tells the regular expression engine to match only a single asterisk in the target string.
Since the backslash is itself a metacharacter, you may force a regexp to match a backslash in the target string by preceding the backslash with itself. For example, to find variable references in a TeX program, you might want to find occurrences of the string `\let\' followed by any number of alphabetic characters. The regular expression `\\let\\[A-Za-z]*' would do this: the double backslashes in the regexp each match a single backslash in the target string.
Quote each special character found in str with a backslash, and return the resulting string.
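A small sketch of why regexp-quote is useful when the text to search for comes from data rather than from a hand-written pattern. The exact quoted string is not shown because it may vary between Guile versions; only the matching behaviour is relied upon here.

```scheme
(use-modules (ice-9 regex))

;; Unquoted, `+' and `?' act as operators, so the pattern does not
;; match the literal text.
(define unquoted (string-match "1+1=2?" "is 1+1=2? yes"))          ; #f

;; Quoted, every character of the pattern is matched literally.
(define quoted
  (match:substring
   (string-match (regexp-quote "1+1=2?") "is 1+1=2? yes")))        ; "1+1=2?"
```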
Very important: Using backslash escapes in Guile source code (as in Emacs Lisp or C) can be tricky, because the backslash character has special meaning for the Guile reader. For example, if Guile encounters the character sequence `\n' in the middle of a string while processing Scheme code, it replaces those characters with a newline character. Similarly, the character sequence `\t' is replaced by a horizontal tab. Several of these escape sequences are processed by the Guile reader before your code is executed. Unrecognized escape sequences are ignored: if the characters `\*' appear in a string, they will be translated to the single character `*'.
This translation is obviously undesirable for regular expressions, since we want to be able to include backslashes in a string in order to escape regexp metacharacters. Therefore, to make sure that a backslash is preserved in a string in your Guile program, you must use two consecutive backslashes:
(define Info-menu-entry-pattern (make-regexp "^\\* [^:]*"))
The string in this example is preprocessed by the Guile reader before
any code is executed. The resulting argument to make-regexp is
the string `^\* [^:]*', which is what we really want.
This also means that in order to write a regular expression that matches a single backslash character, the regular expression string in the source code must include four backslashes. Each consecutive pair of backslashes gets translated by the Guile reader to a single backslash, and the resulting double-backslash is interpreted by the regexp engine as matching a single backslash character. Hence:
(define tex-variable-pattern (make-regexp "\\\\let\\\\[A-Za-z]*"))
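The two levels of escaping can be checked directly. In this sketch the variable names and the sample targets are illustrative; the string lengths show what remains after the reader has processed the backslashes.

```scheme
(use-modules (ice-9 regex))

;; The reader turns each "\\" into one backslash, so this string holds
;; the nine characters ^\* [^:]*
(define pattern-length (string-length "^\\* [^:]*"))              ; 9

;; The escaped asterisk matches a literal `*'.
(define entry
  (match:substring (string-match "^\\* [^:]*::" "* Strings::")))
;; entry is "* Strings::"

;; Four backslashes in source become two in the string, and the regexp
;; engine reads those two as one literal backslash. The target "a\\b"
;; holds the three characters a, \ and b.
(define backslash-at (match:start (string-match "\\\\" "a\\b")))  ; 1
```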
The reason for the unwieldiness of this syntax is historical. Both regular expression pattern matchers and Unix string processing systems have traditionally used backslashes with the special meanings described above. The POSIX regular expression specification and ANSI C standard both require these semantics. Attempting to abandon either convention would cause other kinds of compatibility problems, possibly more severe ones. Therefore, without extending the Scheme reader to support strings with different quoting conventions (an ungainly and confusing extension when implemented in other languages), we must adhere to this cumbersome escape syntax.
Symbols in Scheme are widely used in three ways: as items of discrete data, as lookup keys for alists and hash tables, and to denote variable references.
A symbol is similar to a string in that it is defined by a sequence of characters. The sequence of characters is known as the symbol's name. In the usual case — that is, where the symbol's name doesn't include any characters that could be confused with other elements of Scheme syntax — a symbol is written in a Scheme program by writing the sequence of characters that make up the name, without any quotation marks or other special syntax. For example, the symbol whose name is “multiply-by-2” is written, simply:
multiply-by-2
Notice how this differs from a string with contents “multiply-by-2”, which is written with double quotation marks, like this:
"multiply-by-2"
Looking beyond how they are written, symbols are different from strings in two important respects.
The first important difference is uniqueness. If the same-looking string is read twice from two different places in a program, the result is two different string objects whose contents just happen to be the same. If, on the other hand, the same-looking symbol is read twice from two different places in a program, the result is the same symbol object both times.
Given two read symbols, you can use eq? to test whether they are
the same (that is, have the same name). eq? is the most
efficient comparison operator in Scheme, and comparing two symbols like
this is as fast as comparing, for example, two numbers. Given two
strings, on the other hand, you must use equal? or
string=?, which are much slower comparison operators, to
determine whether the strings have the same contents.
(define sym1 (quote hello))
(define sym2 (quote hello))
(eq? sym1 sym2) => #t
(define str1 "hello")
(define str2 "hello")
(eq? str1 str2) => #f
(equal? str1 str2) => #t
The second important difference is that symbols, unlike strings, are not
self-evaluating. This is why we need the (quote ...)s in the
example above: (quote hello) evaluates to the symbol named
"hello" itself, whereas an unquoted hello is read as the
symbol named "hello" and evaluated as a variable reference ... about
which more below (see Symbol Variables).
Numbers and symbols are similar to the extent that they both lend
themselves to eq? comparison. But symbols are more descriptive
than numbers, because a symbol's name can be used directly to describe
the concept for which that symbol stands.
For example, imagine that you need to represent some colours in a computer program. Using numbers, you would have to choose arbitrarily some mapping between numbers and colours, and then take care to use that mapping consistently:
;; 1=red, 2=green, 3=purple
(if (eq? (colour-of car) 1)
...)
You can make the mapping more explicit and the code more readable by defining constants:
(define red 1)
(define green 2)
(define purple 3)
(if (eq? (colour-of car) red)
...)
But the simplest and clearest approach is not to use numbers at all, but symbols whose names specify the colours that they refer to:
(if (eq? (colour-of car) 'red)
...)
The descriptive advantages of symbols over numbers increase as the set of concepts that you want to describe grows. Suppose that a car object can have other properties as well, such as whether it has or uses:
Then a car's combined property set could be naturally represented and manipulated as a list of symbols:
(properties-of car1)
=>
(red manual unleaded power-steering)
(if (memq 'power-steering (properties-of car1))
(display "Unfit people can drive this car.\n")
(display "You'll need strong arms to drive this car!\n"))
-|
Unfit people can drive this car.
Remember, the fundamental property of symbols that we are relying on
here is that an occurrence of 'red in one part of a program is an
indistinguishable symbol from an occurrence of 'red in
another part of a program; this means that symbols can usefully be
compared using eq?. At the same time, symbols have naturally
descriptive names. This combination of efficiency and descriptive power
makes them ideal for use as discrete data.
Given their efficiency and descriptive power, it is natural to use symbols as the keys in an association list or hash table.
To illustrate this, consider a more structured representation of the car properties example from the preceding subsection. Rather than mixing all the properties up together in a flat list, we could use an association list like this:
(define car1-properties '((colour . red)
(transmission . manual)
(fuel . unleaded)
(steering . power-assisted)))
Notice how this structure is more explicit and extensible than the flat
list. For example it makes clear that manual refers to the
transmission rather than, say, the windows or the locking of the car.
It also allows further properties to use the same symbols among their
possible values without becoming ambiguous:
(define car1-properties '((colour . red)
(transmission . manual)
(fuel . unleaded)
(steering . power-assisted)
(seat-colour . red)
(locking . manual)))
With a representation like this, it is easy to use the efficient
assq-XXX family of procedures (see Association Lists) to
extract or change individual pieces of information:
(assq-ref car1-properties 'fuel) => unleaded
(assq-ref car1-properties 'transmission) => manual
(assq-set! car1-properties 'seat-colour 'black)
=>
((colour . red)
(transmission . manual)
(fuel . unleaded)
(steering . power-assisted)
(seat-colour . black)
(locking . manual))
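A self-contained sketch of the same operations follows. It assumes a small alist named props, built with list and cons so that the structure is freshly allocated before it is modified.

```scheme
;; Freshly allocated alist (not a quoted literal) so it is safe to modify.
(define props
  (list (cons 'colour 'red)
        (cons 'transmission 'manual)))

;; assq-set! may return a new list head when it adds a key, so the
;; result is stored back into the variable.
(set! props (assq-set! props 'colour 'blue))     ; existing key updated
(set! props (assq-set! props 'fuel 'unleaded))   ; new key added

(define colour  (assq-ref props 'colour))    ; blue
(define fuel    (assq-ref props 'fuel))      ; unleaded
(define sunroof (assq-ref props 'sunroof))   ; #f, no such key
```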
Hash tables also have keys, and exactly the same arguments apply to the
use of symbols in hash tables as in association lists. The hash value
that Guile uses to decide where to add a symbol-keyed entry to a hash
table can be obtained by calling the symbol-hash procedure:
Return a hash value for symbol.
See Hash Tables for information about hash tables in general, and for why you might choose to use a hash table rather than an association list.
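As a brief sketch, symbol keys work with the eq?-based hash table procedures. The table name car-table and its entries are invented for the example.

```scheme
(define car-table (make-hash-table))

;; hashq-set! and hashq-ref compare keys with eq?, which suits symbols.
(hashq-set! car-table 'colour 'red)
(hashq-set! car-table 'fuel 'unleaded)

(define colour  (hashq-ref car-table 'colour))       ; red
(define sunroof (hashq-ref car-table 'sunroof))      ; #f, no such key

;; Two occurrences of a symbol are the same object and hash identically.
(define same-hash?
  (= (symbol-hash 'colour) (symbol-hash 'colour)))   ; #t
```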
When an unquoted symbol in a Scheme program is evaluated, it is interpreted as a variable reference, and the result of the evaluation is the appropriate variable's value.
For example, when the expression (string-length "abcd") is read
and evaluated, the sequence of characters string-length is read
as the symbol whose name is "string-length". This symbol is associated
with a variable whose value is the procedure that implements string
length calculation. Therefore evaluation of the string-length
symbol results in that procedure.
The details of the connection between an unquoted symbol and the variable to which it refers are explained elsewhere. See Binding Constructs, for how associations between symbols and variables are created, and Modules, for how those associations are affected by Guile's module system.
Given any Scheme value, you can determine whether it is a symbol using
the symbol? primitive:
Return #t if obj is a symbol, otherwise return #f.
Once you know that you have a symbol, you can obtain its name as a
string by calling symbol->string. Note that Guile differs by
default from R5RS on the details of symbol->string as regards
case-sensitivity:
Return the name of symbol s as a string. By default, Guile reads symbols case-sensitively, so the string returned will have the same case variation as the sequence of characters that caused s to be created.
If Guile is set to read symbols case-insensitively (as specified by R5RS), and s comes into being as part of a literal expression (see Literal expressions) or by a call to the read or string-ci->symbol procedures, Guile converts any alphabetic characters in the symbol's name to lower case before creating the symbol object, so the string returned here will be in lower case.
If s was created by string->symbol, the case of characters in the string returned will be the same as that in the string that was passed to string->symbol, regardless of Guile's case-sensitivity setting at the time s was created.
It is an error to apply mutation procedures like string-set! to strings returned by this procedure.
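When a modifiable version of a symbol's name is needed, one approach is to copy the string first. This is a minimal sketch; the variable names are illustrative.

```scheme
(define name (symbol->string 'flying-fish))   ; "flying-fish"

;; The returned string must not be mutated, so work on a copy.
(define shout (string-copy name))
(string-set! shout 0 #\F)
;; shout is now "Flying-fish"; name is unchanged.
```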
Most symbols are created by writing them literally in code. However it
is also possible to create symbols programmatically using the following
string->symbol and string-ci->symbol procedures:
Return the symbol whose name is string. This procedure can create symbols with names containing special characters or letters in the non-standard case, but it is usually a bad idea to create such symbols because in some implementations of Scheme they cannot be read as themselves.
Return the symbol whose name is str. If Guile is currently reading symbols case-insensitively, str is converted to lowercase before the returned symbol is looked up or created.
The following examples illustrate Guile's detailed behaviour as regards the case-sensitivity of symbols:
(read-enable 'case-insensitive) ; R5RS compliant behaviour
(symbol->string 'flying-fish) => "flying-fish"
(symbol->string 'Martin) => "martin"
(symbol->string
(string->symbol "Malvina")) => "Malvina"
(eq? 'mISSISSIppi 'mississippi) => #t
(string->symbol "mISSISSIppi") => mISSISSIppi
(eq? 'bitBlt (string->symbol "bitBlt")) => #f
(eq? 'LolliPop
(string->symbol (symbol->string 'LolliPop))) => #t
(string=? "K. Harper, M.D."
(symbol->string
(string->symbol "K. Harper, M.D."))) => #t
(read-disable 'case-insensitive) ; Guile default behaviour
(symbol->string 'flying-fish) => "flying-fish"
(symbol->string 'Martin) => "Martin"
(symbol->string
(string->symbol "Malvina")) => "Malvina"
(eq? 'mISSISSIppi 'mississippi) => #f
(string->symbol "mISSISSIppi") => mISSISSIppi
(eq? 'bitBlt (string->symbol "bitBlt")) => #t
(eq? 'LolliPop
(string->symbol (symbol->string 'LolliPop))) => #t
(string=? "K. Harper, M.D."
(symbol->string
(string->symbol "K. Harper, M.D."))) => #t
From C, there are lower level functions that construct a Scheme symbol from a C string in the current locale encoding.
When you want to do more from C, you should convert between symbols
and strings using scm_symbol_to_string and
scm_string_to_symbol and work with the strings.
Construct and return a Scheme symbol whose name is specified by name. For scm_from_locale_symbol, name must be null terminated; for scm_from_locale_symboln the length of name is specified explicitly by len.
Like scm_from_locale_symbol and scm_from_locale_symboln, respectively, but also frees str with free eventually. Thus, you can use this function when you would free str anyway immediately after creating the Scheme symbol. In certain cases, Guile can then use str directly as its internal representation.
The size of a symbol can also be obtained from C:
Finally, some applications, especially those that generate new Scheme
code dynamically, need to generate symbols for use in the generated
code. The gensym primitive meets this need:
Create a new symbol with a name constructed from a prefix and a counter value. The string prefix can be specified as an optional argument; the default prefix is ` g'. The counter is increased by 1 at each call. There is no provision for resetting the counter.
The symbols generated by gensym are likely to be unique,
since their names begin with a space and it is only otherwise possible
to generate such symbols if a programmer goes out of their way to do
so. Uniqueness can be guaranteed by instead using uninterned symbols
(see Symbol Uninterned), though they can't be usefully written out
and read back in.
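A short sketch of gensym with and without a prefix. The exact generated names depend on the counter, so only properties that hold for any counter value are noted; the prefix "tmp" is an arbitrary choice.

```scheme
(define t1 (gensym))
(define t2 (gensym))
(define t3 (gensym "tmp"))

;; Each call yields a different symbol.
(define distinct? (not (eq? t1 t2)))                          ; #t

;; The optional prefix starts the generated name.
(define prefixed?
  (string-prefix? "tmp" (symbol->string t3)))                 ; #t
```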
In traditional Lisp dialects, symbols are often understood as having three kinds of value at once:
put or get functions.
Although Scheme (as one of its simplifications with respect to Lisp) does away with the distinction between variable and function namespaces, Guile currently retains some elements of the traditional structure in case they turn out to be useful when implementing translators for other languages, in particular Emacs Lisp.
Specifically, Guile symbols have two extra slots: one for a symbol's property list, and one for its “function value.” The following procedures are provided to access these slots.
Return the contents of symbol's function slot.
Set the contents of symbol's function slot to value.
Return the property list currently associated with symbol.
Set symbol's property list to value.
From sym's property list, return the value for property prop. The assumption is that sym's property list is an association list whose keys are distinguished from each other using equal?; prop should be one of the keys in that list. If the property list has no entry for prop, symbol-property returns #f.
In sym's property list, set the value for property prop to val, or add a new entry for prop, with value val, if none already exists. For the structure of the property list, see symbol-property.
From sym's property list, remove the entry for property prop, if there is one. For the structure of the property list, see symbol-property.
Support for these extra slots may be removed in a future release, and it is probably better to avoid using them. For a more modern and Schemely approach to properties, see Object Properties.
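For completeness, the property procedures can be exercised as in the sketch below, bearing in mind the advice above to prefer object properties in new code. The symbol apple and its properties are invented for the illustration.

```scheme
(set-symbol-property! 'apple 'colour 'green)

(define colour (symbol-property 'apple 'colour))    ; green
(define weight (symbol-property 'apple 'weight))    ; #f, never set

(symbol-property-remove! 'apple 'colour)
(define removed (symbol-property 'apple 'colour))   ; #f again
```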
The read syntax for a symbol is a sequence of letters, digits, and
extended alphabetic characters, beginning with a character that
cannot begin a number. In addition, the special cases of +,
-, and ... are read as symbols even though numbers can
begin with +, - or ..
Extended alphabetic characters may be used within identifiers as if they were letters. The set of extended alphabetic characters is:
! $ % & * + - . / : < = > ? @ ^ _ ~
In addition to the standard read syntax defined above (which is taken from R5RS (see Formal syntax)), Guile provides an extended symbol read syntax that allows the inclusion of unusual characters such as space characters, newlines and parentheses. If (for whatever reason) you need to write a symbol containing characters not mentioned above, you can do so as follows.
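- Begin the symbol with the characters #{,
- write the characters of the symbol and
- finish the symbol with the characters }#.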
Here are a few examples of this form of read syntax. The first symbol needs to use extended syntax because it contains a space character, the second because it contains a line break, and the last because it looks like a number.
#{foo bar}#
#{what
ever}#
#{4242}#
Although Guile provides this extended read syntax for symbols, widespread usage of it is discouraged because it is not portable and not very readable.
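The relationship between the extended read syntax and ordinary symbol names can be checked at the REPL. This is a small sketch reusing the example symbols from above; it only assumes the standard procedures string->symbol, symbol->string and symbol?.

```scheme
;; A symbol written with #{...}# is the same object that
;; string->symbol returns for the same name.
(eq? '#{foo bar}# (string->symbol "foo bar"))   ; => #t
(symbol->string '#{foo bar}#)                   ; => "foo bar"

;; Without the extended syntax, the token 4242 reads as a number.
(symbol? '#{4242}#)   ; => #t
(symbol? '4242)       ; => #f
```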
What makes symbols useful is that they are automatically kept unique. There are no two symbols that are distinct objects but have the same name. But of course, there is no rule without exception. In addition to the normal symbols that have been discussed up to now, you can also create special uninterned symbols that behave slightly differently.
To understand what is different about them and why they might be useful, we look at how normal symbols are actually kept unique.
Whenever Guile wants to find the symbol with a specific name, for
example during read or when executing string->symbol, it
first looks into a table of all existing symbols to find out whether a
symbol with the given name already exists. When this is the case, Guile
just returns that symbol. When not, a new symbol with the name is
created and entered into the table so that it can be found later.
Sometimes you might want to create a symbol that is guaranteed `fresh', i.e. a symbol that did not exist previously. You might also want to somehow guarantee that no one else will ever unintentionally stumble across your symbol in the future. These properties of a symbol are often needed when generating code during macro expansion. When introducing new temporary variables, you want to guarantee that they don't conflict with variables in other people's code.
The simplest way to arrange for this is to create a new symbol but not enter it into the global table of all symbols. That way, no one will ever get access to your symbol by chance. Symbols that are not in the table are called uninterned. Of course, symbols that are in the table are called interned.
You create new uninterned symbols with the function make-symbol.
You can test whether a symbol is interned or not with
symbol-interned?.
Uninterned symbols break the rule that the name of a symbol uniquely
identifies the symbol object. Because of this, they cannot be written
out and read back in like interned symbols. Currently, Guile has no
support for reading uninterned symbols. Note that the function
gensym does not return uninterned symbols for this reason.
Return a new uninterned symbol with the name name. The returned symbol is guaranteed to be unique and future calls to
string->symbol will not return it.
Return #t if symbol is interned, otherwise return #f.
For example:
(define foo-1 (string->symbol "foo"))
(define foo-2 (string->symbol "foo"))
(define foo-3 (make-symbol "foo"))
(define foo-4 (make-symbol "foo"))
(eq? foo-1 foo-2)
=> #t
; Two interned symbols with the same name are the same object,
(eq? foo-1 foo-3)
=> #f
; but a call to make-symbol with the same name returns a
; distinct object.
(eq? foo-3 foo-4)
=> #f
; A call to make-symbol always returns a new object, even for
; the same name.
foo-3
=> #<uninterned-symbol foo 8085290>
; Uninterned symbols print differently from interned symbols,
(symbol? foo-3)
=> #t
; but they are still symbols,
(symbol-interned? foo-3)
=> #f
; just not interned.
Keywords are self-evaluating objects with a convenient read syntax that makes them easy to type.
Guile's keyword support conforms to R5RS, and adds a (switchable) read
syntax extension to permit keywords to begin with : as well as
#:, or to end with :.
Keywords are useful in contexts where a program or procedure wants to be able to accept a large number of optional arguments without making its interface unmanageable.
To illustrate this, consider a hypothetical make-window
procedure, which creates a new window on the screen for drawing into
using some graphical toolkit. There are many parameters that the caller
might like to specify, but which could also be sensibly defaulted, for
example the color depth, background color, width and height of the window.
If make-window did not use keywords, the caller would have to
pass in a value for each possible argument, remembering the correct
argument order and using a special value to indicate the default value
for that argument:
(make-window 'default              ;; Color depth
             'default              ;; Background color
             800                   ;; Width
             100                   ;; Height
             ...)                  ;; More make-window arguments
With keywords, on the other hand, defaulted arguments are omitted, and non-default arguments are clearly tagged by the appropriate keyword. As a result, the invocation becomes much clearer:
(make-window #:width 800 #:height 100)
On the other hand, for a simpler procedure with few arguments, the use
of keywords would be a hindrance rather than a help. The primitive
procedure cons, for example, would not be improved if it had to
be invoked as
(cons #:car x #:cdr y)
So the decision whether to use keywords or not is purely pragmatic: use them if they will clarify the procedure invocation at point of call.
If a procedure wants to support keywords, it should take a rest argument and then use whatever means is convenient to extract keywords and their corresponding arguments from the contents of that rest argument.
The following example illustrates the principle: the code for
make-window uses a helper procedure called
get-keyword-value to extract individual keyword arguments from
the rest argument.
(define (get-keyword-value args keyword default)
  (let ((kv (memq keyword args)))
    (if (and kv (>= (length kv) 2))
        (cadr kv)
        default)))
(define (make-window . args)
  (let ((depth  (get-keyword-value args #:depth  screen-depth))
        (bg     (get-keyword-value args #:bg     "white"))
        (width  (get-keyword-value args #:width  800))
        (height (get-keyword-value args #:height 100))
        ...)
    ...))
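To see the helper at work on its own, the sketch below repeats the definition of get-keyword-value and applies it to a sample argument list. The variable sample-args and its contents are invented for illustration.

```scheme
(define (get-keyword-value args keyword default)
  (let ((kv (memq keyword args)))
    (if (and kv (>= (length kv) 2))
        (cadr kv)
        default)))

;; A rest argument as make-window might receive it (hypothetical data).
(define sample-args (list #:width 640 #:bg "black"))

(get-keyword-value sample-args #:width 800)    ; => 640
(get-keyword-value sample-args #:height 100)   ; => 100, the default
(get-keyword-value sample-args #:bg "white")   ; => "black"
```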
But you don't need to write get-keyword-value. The (ice-9
optargs) module provides a set of powerful macros that you can use to
implement keyword-supporting procedures like this:
(use-modules (ice-9 optargs))
(define (make-window . args)
  (let-keywords args #f ((depth  screen-depth)
                         (bg     "white")
                         (width  800)
                         (height 100))
    ...))
Or, even more economically, like this:
(use-modules (ice-9 optargs))
(define* (make-window #:key (depth  screen-depth)
                            (bg     "white")
                            (width  800)
                            (height 100))
  ...)
For further details on let-keywords, define* and other
facilities provided by the (ice-9 optargs) module, see
Optional Arguments.
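Because the bodies of the make-window examples are elided, here is a small complete sketch of the define* style. The procedure window-geometry is a hypothetical stand-in that simply returns its settings as a list, so that the effect of omitted and supplied keywords is visible.

```scheme
(use-modules (ice-9 optargs))

;; window-geometry is a made-up example procedure, not part of Guile.
(define* (window-geometry #:key (width 800) (height 100) (bg "white"))
  (list width height bg))

(window-geometry)                             ; => (800 100 "white")
(window-geometry #:height 600 #:bg "black")   ; => (800 600 "black")
```

Keyword arguments may be given in any order, and any that are omitted take the default written in the definition.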
Guile, by default, only recognizes a keyword syntax that is compatible
with R5RS. A token of the form #:NAME, where NAME has the
same syntax as a Scheme symbol (see Symbol Read Syntax), is the
external representation of the keyword named NAME. Keyword
objects print using this syntax as well, so values containing keyword
objects can be read back into Guile. When used in an expression,
keywords are self-quoting objects.
If the keyword read option is set to 'prefix, Guile also
recognizes the alternative read syntax :NAME. Otherwise, tokens
of the form :NAME are read as symbols, as required by R5RS.
If the keyword read option is set to 'postfix, Guile
recognizes the SRFI-88 read syntax NAME: (see SRFI-88).
Otherwise, tokens of this form are read as symbols.
To enable and disable the alternative non-R5RS keyword syntax, you use
the read-set! procedure documented in User level options interfaces and Reader options. Note that the prefix and
postfix syntax are mutually exclusive.
(read-set! keywords 'prefix)
#:type
=>
#:type
:type
=>
#:type
(read-set! keywords 'postfix)
type:
=>
#:type
:type
=>
:type
(read-set! keywords #f)
#:type
=>
#:type
:type
-|
ERROR: In expression :type:
ERROR: Unbound variable: :type
ABORT: (unbound-variable)
Return #t if the argument obj is a keyword, else #f.
Return the symbol with the same name as keyword.
Return the keyword with the same name as symbol.
Equivalent to scm_symbol_to_keyword (scm_from_locale_symbol (str)) and scm_symbol_to_keyword (scm_from_locale_symboln (str, len)), respectively.
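On the Scheme side, the keyword primitives described above can be tried as follows. This is a short sketch using the default #: read syntax and the procedures keyword?, keyword->symbol and symbol->keyword.

```scheme
(keyword? #:type)           ; => #t
(keyword? 'type)            ; => #f

(keyword->symbol #:type)    ; => type
(symbol->keyword 'type)     ; => #:type

;; Converting to a symbol and back yields the same keyword object.
(eq? #:type (symbol->keyword (keyword->symbol #:type)))   ; => #t
```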
Procedures and macros are documented in their own chapter: see Procedures and Macros.
Variable objects are documented as part of the description of Guile's module system: see Variables.
Asyncs, dynamic roots and fluids are described in the chapter on scheduling: see Scheduling.
Hooks are documented in the chapter on general utility functions: see Hooks.
Ports are described in the chapter on I/O: see Input and Output.
This chapter describes Guile's compound data types. By compound we mean that the primary purpose of these data types is to act as containers for other kinds of data (including other compound objects). For instance, a (non-uniform) vector with length 5 is a container that can hold five arbitrary Scheme objects.
The various kinds of container object differ from each other in how their memory is allocated, how they are indexed, and how particular values can be looked up within them.
Pairs are used to combine two Scheme objects into one compound object. Hence the name: A pair stores a pair of objects.
The data type pair is extremely important in Scheme, just like in any other Lisp dialect. The reason is that pairs are not only used to make two values available as one object, but that pairs are used for constructing lists of values. Because lists are so important in Scheme, they are described in a section of their own (see Lists).
Pairs can be entered literally in source code or at the REPL, in the
so-called dotted list syntax. This syntax consists of an opening
parenthesis, the first element of the pair, a dot, the second element
and a closing parenthesis. The following example shows how a pair
consisting of the two numbers 1 and 2, and a pair containing the symbols
foo and bar can be entered. It is very important to write
the whitespace before and after the dot, because otherwise the Scheme
parser would not be able to figure out where to split the tokens.
(1 . 2)
(foo . bar)
But beware, if you want to try out these examples, you have to quote the expressions. More information about quotation is available in the section Expression Syntax. The correct way to try these examples is as follows.
'(1 . 2)
=>
(1 . 2)
'(foo . bar)
=>
(foo . bar)
A new pair is made by calling the procedure cons with two
arguments. Then the argument values are stored into a newly allocated
pair, and the pair is returned. The name cons stands for
"construct". Use the procedure pair? to test whether a
given Scheme object is a pair or not.
Return a newly allocated pair whose car is x and whose cdr is y. The pair is guaranteed to be different (in the sense of
eq?) from every previously existing object.
Return #t if x is a pair; otherwise return #f.
The two parts of a pair are traditionally called car and
cdr. They can be retrieved with procedures of the same name
(car and cdr), and can be modified with the procedures
set-car! and set-cdr!. Since a very common operation in
Scheme programs is to access the car of a car of a pair, or the car of
the cdr of a pair, etc., the procedures called caar,
cadr and so on are also predefined.
Return the car or the cdr of pair, respectively.
These two macros are the fastest way to access the car or cdr of a pair; they can be thought of as compiling into a single memory reference.
These macros do no checking at all. The argument pair must be a valid pair.
These procedures are compositions of car and cdr, where for example caddr could be defined by
(define caddr (lambda (x) (car (cdr (cdr x)))))
cadr, caddr and cadddr pick out the second, third or fourth elements of a list, respectively. SRFI-1 provides the same under the names second, third and fourth (see SRFI-1 Selectors).
Stores value in the car field of pair. The value returned by
set-car! is unspecified.
Stores value in the cdr field of pair. The value returned by
set-cdr! is unspecified.
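The pair procedures described above fit together as in the following sketch. The variable name p is arbitrary.

```scheme
(define p (cons 1 2))
(pair? p)          ; => #t
(car p)            ; => 1
(cdr p)            ; => 2

;; Both fields of a pair can be changed after it has been created.
(set-car! p 'one)
(set-cdr! p 'two)
p                  ; => (one . two)

;; cadr is the car of the cdr, i.e. the second element of a list.
(cadr '(a b c))    ; => b
(cddr '(a b c))    ; => (c)
```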
A very important data type in Scheme, as well as in all other Lisp dialects, is the data type list.
This is the short definition of what a list is: either the empty list (), or a pair which has a list in its cdr.
The syntax for lists is an opening parenthesis, then all the elements of the list (separated by whitespace) and finally a closing parenthesis.
(1 2 3)            ; a list of the numbers 1, 2 and 3
("foo" bar 3.1415) ; a string, a symbol and a real number
()                 ; the empty list
The last example needs a bit more explanation. A list with no elements, called the empty list, is special in some ways. It is used for terminating lists by storing it into the cdr of the last pair that makes up a list. An example will clear that up:
(car '(1))
=>
1
(cdr '(1))
=>
()
This example also shows that lists have to be quoted when written (see Expression Syntax), because they would otherwise be mistakenly taken as procedure applications (see Simple Invocation).
Often it is useful to test whether a given Scheme object is a list or not. List-processing procedures could use this information to test whether their input is valid, or they could do different things depending on the datatype of their arguments.
The predicate null? is often used in list-processing code to
tell whether a given list has run out of elements. That is, a loop
somehow deals with the elements of a list until the list satisfies
null?. Then, the algorithm terminates.
Return #t iff x is the empty list, else #f.
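The loop pattern described above, walking down a list until null? holds, might look like the following sketch. The procedure name sum-list is invented for this example.

```scheme
;; Add up the elements of a list of numbers.
(define (sum-list lst)
  (let loop ((rest lst) (total 0))
    (if (null? rest)
        total                                    ; list exhausted: done
        (loop (cdr rest) (+ total (car rest))))))

(sum-list '(1 2 3 4))   ; => 10
(sum-list '())          ; => 0
```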
This section describes the procedures for constructing new lists.
list simply returns a list whose elements are the arguments;
cons* is similar, but the last argument is stored in the cdr of
the last pair of the list.
Return a new list containing elements elem1 to elemN.
scm_list_n takes a variable number of arguments, terminated by the special SCM_UNDEFINED. That final SCM_UNDEFINED is not included in the list. None of elem1 to elemN can themselves be SCM_UNDEFINED, or scm_list_n will terminate at that point.
Like list, but the last arg provides the tail of the constructed list, returning (cons arg1 (cons arg2 (cons ... argn))). Requires at least one argument. If given one argument, that argument is returned as result. This function is called list* in some other Schemes and in Common LISP.
Return a (newly-created) copy of lst.
Create a list of n elements, where each element is initialized to init. init defaults to the empty list () if not given.
Note that list-copy only makes a copy of the pairs which make up
the spine of the lists. The list elements are not copied, which means
that modifying the elements of the new list also modifies the elements
of the old list. On the other hand, applying procedures like
set-cdr! or delv! to the new list will not alter the old
list. If you also need to copy the list elements (making a deep copy),
use the procedure copy-tree (see Copying).
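The constructors and the shallow-copy behaviour of list-copy can be observed in a short session. This is only a sketch; the variable names original and copy are arbitrary.

```scheme
(list 1 2 3)          ; => (1 2 3)
(cons* 1 2 '(3 4))    ; => (1 2 3 4)
(cons* 1 2 3)         ; => (1 2 . 3)
(make-list 3 'x)      ; => (x x x)

;; list-copy copies only the spine; the elements themselves are shared.
(define original (list (list 1 2) 'b))
(define copy (list-copy original))

(set-car! (car copy) 'changed)   ; mutate an element shared by both lists
original                         ; => ((changed 2) b)

(set-cdr! copy '())              ; change only the spine of the copy
(length original)                ; => 2
(length copy)                    ; => 1
```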
These procedures are used to get some information about a list, or to retrieve one or more elements of a list.
Return the number of elements in list lst.
Return the last pair in lst, signalling an error if lst is circular.
Return the kth element from list.
Return the "tail" of lst beginning with its kth element. The first element of the list is considered to be element 0.
list-tail and list-cdr-ref are identical. It may help to think of list-cdr-ref as accessing the kth cdr of the list, or returning the results of cdring k times down lst.
Copy the first k elements from lst into a new list, and return it.
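A few sample calls illustrate the selectors in this section. This is a sketch that assumes the procedures carry their usual Guile names: length, last-pair, list-ref, list-tail and list-head.

```scheme
(define lst '(a b c d))

(length lst)         ; => 4
(last-pair lst)      ; => (d)
(list-ref lst 2)     ; => c, indices start at 0
(list-tail lst 2)    ; => (c d)
(list-head lst 2)    ; => (a b)
```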
append and append! are used to concatenate two or more
lists in order to form a new list. reverse and reverse!
return lists with the same elements as their arguments, but in reverse
order. The procedure variants with an ! directly modify the
pairs which form the list, whereas the other procedures create new
pairs. This is why you should be careful when using the side-effecting
variants.
Return a list comprising all the elements of lists lst1 to lstN.
(append '(x) '(y)) => (x y)
(append '(a) '(b c d)) => (a b c d)
(append '(a (b)) '((c))) => (a (b) (c))
The last argument lstN may actually be any object; an improper list results if the last argument is not a proper list.
(append '(a b) '(c . d)) => (a b c . d)
(append '() 'a) => a
append doesn't modify the given lists, but the return may share structure with the final lstN. append! modifies the given lists to form its return.
For scm_append and scm_append_x, lstlst is a list of the list operands lst1 ... lstN. That lstlst itself is not modified or used in the return.
Return a list comprising the elements of lst, in reverse order.
reverse constructs a new list; reverse! modifies lst in constructing its return.
For reverse!, the optional newtail is appended to the result. newtail isn't reversed; it simply becomes the list tail. For scm_reverse_x, the newtail parameter is mandatory, but can be SCM_EOL if no further tail is required.
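Since the ! variants reuse the pairs of their arguments, a cautious way to use them is to rely only on the returned list. The sketch below follows that habit; the variable names are arbitrary.

```scheme
(define lst (list 1 2 3))
;; reverse! may reuse the pairs of lst, so rebind the variable to the
;; returned list rather than relying on the old value.
(set! lst (reverse! lst))
lst                   ; => (3 2 1)

(define head (list 'a 'b))
(define tail (list 'c 'd))

(append head tail)    ; => (a b c d)
head                  ; => (a b), append leaves its arguments alone

;; With append!, again keep only the returned list.
(set! head (append! head tail))
head                  ; => (a b c d)
```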
The following procedures modify an existing list, either by changing elements of the list, or by changing the list structure itself.
Set the kth element of list to val.
Set the kth cdr of list to val.
Return a newly-created copy of lst with elements eq? to item removed. This procedure mirrors memq: delq compares elements of lst against item with eq?.
Return a newly-created copy of lst with elements eqv? to item removed. This procedure mirrors memv: delv compares elements of lst against item with eqv?.
Return a newly-created copy of lst with elements equal? to item removed. This procedure mirrors member: delete compares elements of lst against item with equal?.
See also SRFI-1 which has an extended delete (SRFI-1 Deleting), and also an lset-difference which can delete multiple items in one call (SRFI-1 Set Operations).
These procedures are destructive versions of delq, delv and delete: they modify the pointers in the existing lst rather than creating a new list. Caveat evaluator: Like other destructive list functions, these functions cannot modify the binding of lst, and so cannot be used to delete the first element of lst destructively.
Like delq!, but only deletes the first occurrence of item from lst. Tests for equality using eq?. See also delv1! and delete1!.
Like delv!, but only deletes the first occurrence of item from lst. Tests for equality using eqv?. See also delq1! and delete1!.
Like delete!, but only deletes the first occurrence of item from lst. Tests for equality using equal?. See also delq1! and delv1!.
Return a list containing all elements from lst which satisfy the predicate pred. The elements in the result list have the same order as in lst. The order in which pred is applied to the list elements is not specified.
filter does not change lst, but the result may share a tail with it. filter! may modify lst to construct its return.
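The following sketch shows the deleting and filtering procedures side by side, including the usual idiom for the destructive variants: because they cannot change the binding of a variable, the result is assigned back with set!.

```scheme
(delq 'a '(a b a c))              ; => (b c)
(delete "x" (list "x" "y" "x"))   ; => ("y")
(delq1! 'a (list 'a 'b 'a))       ; => (b a), only the first a is removed
(filter odd? '(1 2 3 4 5))        ; => (1 3 5)

;; delq! cannot rebind lst, so use its return value.
(define lst (list 'a 'b 'c))
(set! lst (delq! 'a lst))
lst                               ; => (b c)
```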
The following procedures search lists for particular elements. They use
different comparison predicates for comparing list elements with the
object to be searched. When they fail, they return #f, otherwise
they return the sublist whose car is equal to the search object, where
equality depends on the equality predicate used.
Return the first sublist of lst whose car is eq? to x where the sublists of lst are the non-empty lists returned by (list-tail lst k) for k less than the length of lst. If x does not occur in lst, then #f (not the empty list) is returned.
Return the first sublist of lst whose car is eqv? to x where the sublists of lst are the non-empty lists returned by (list-tail lst k) for k less than the length of lst. If x does not occur in lst, then #f (not the empty list) is returned.
Return the first sublist of lst whose car is equal? to x where the sublists of lst are the non-empty lists returned by (list-tail lst k) for k less than the length of lst. If x does not occur in lst, then #f (not the empty list) is returned.
See also SRFI-1 which has an extended member function (SRFI-1 Searching).
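The three search procedures differ only in the equality predicate they apply, as this short sketch shows. Because a successful search returns a non-#f sublist, the result can be used directly as a condition.

```scheme
(memq 'c '(a b c d))             ; => (c d)
(memq 'z '(a b c d))             ; => #f
(memv 101 '(100 101 102))        ; => (101 102)
(member "b" '("a" "b" "c"))      ; => ("b" "c")
(member '(1) '((0) (1) (2)))     ; => ((1) (2))

;; Using the result as a truth value:
(if (memq 'b '(a b c)) 'found 'missing)   ; => found
```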
List processing is very convenient in Scheme because the process of iterating over the elements of a list can be highly abstracted. The procedures in this section are the most basic iterating procedures for lists. They take a procedure and one or more lists as arguments, and apply the procedure to each element of the list. They differ in their return value.
Apply proc to each element of the list arg1 (if only two arguments are given), or to the corresponding elements of the argument lists (if more than two arguments are given). The result(s) of the procedure applications are saved and returned in a list. For map, the order of procedure applications is not specified; map-in-order applies the procedure from left to right to the list elements.
Like map, but the procedure is always applied from left to right, and the result(s) of the procedure applications are thrown away. The return value is not specified.
See also SRFI-1 which extends these functions to take lists of unequal lengths (SRFI-1 Fold and Map).
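As a brief sketch of the difference between the two, map collects the results while for-each is called only for its side effects. The variable total is made up for the example.

```scheme
(map (lambda (x) (* x x)) '(1 2 3))   ; => (1 4 9)
(map + '(1 2 3) '(10 20 30))          ; => (11 22 33)

;; for-each discards the results, so it is used for effects only.
(define total 0)
(for-each (lambda (x) (set! total (+ total x)))
          '(1 2 3))
total                                 ; => 6
```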
Vectors are sequences of Scheme objects. Unlike lists, the length of a vector, once the vector is created, cannot be changed. The advantage of vectors over lists is that the time required to access one element of a vector given its position (synonymous with index), a zero-origin number, is constant, whereas lists have an access time linear in the position of the accessed element in the list.
Vectors can contain any kind of Scheme object; it is even possible to have different types of objects in the same vector. For vectors containing vectors, you may wish to use arrays, instead. Note, too, that vectors are the special case of one dimensional non-uniform arrays and that most array procedures operate happily on vectors (see Arrays).
Vectors can literally be entered in source code, just like strings,
characters or some of the other data types. The read syntax for vectors
is as follows: A sharp sign (#), followed by an opening
parenthesis, all elements of the vector in their respective read syntax,
and finally a closing parenthesis. The following are examples of the
read syntax for vectors, where the first vector only contains numbers
and the second three different object types: a string, a symbol and a
number in hexadecimal notation.
#(1 2 3)
#("Hello" foo #xdeadbeef)
Like lists, vectors have to be quoted:
'#(a b c) => #(a b c)
Instead of creating a vector implicitly by using the read syntax just
described, you can create a vector dynamically by calling one of the
vector and list->vector primitives with the list of Scheme
values that you want to place into a vector. The size of the vector
thus created is determined implicitly by the number of arguments given.
Return a newly allocated vector composed of the given arguments. Analogous to list.
(vector 'a 'b 'c) => #(a b c)
The inverse operation is vector->list:
Return a newly allocated list composed of the elements of v.
(vector->list '#(dah dah didah)) => (dah dah didah)
(list->vector '(dididit dah)) => #(dididit dah)
To allocate a vector with an explicitly specified size, use
make-vector. With this primitive you can also specify an initial
value for the vector elements (the same value for all elements, that
is):
Return a newly allocated vector of len elements. If a second argument is given, then each position is initialized to fill. Otherwise the initial contents of each position are unspecified.
Like scm_make_vector, but the length is given as a size_t.
To check whether an arbitrary Scheme value is a vector, use the
vector? primitive:
Return #t if obj is a vector, otherwise return #f.
Return non-zero when obj is a vector, otherwise return
zero.
vector-length and vector-ref return information about a
given vector, respectively its size and the elements that are contained
in the vector.
Return the number of elements in vector as an exact integer.
Return the number of elements in vector as a
size_t.
Return the contents of position k of vector. k must be a valid index of vector.
(vector-ref '#(1 1 2 3 5 8 13 21) 5) => 8
(vector-ref '#(1 1 2 3 5 8 13 21)
            (let ((i (round (* 2 (acos -1)))))
              (if (inexact? i)
                  (inexact->exact i)
                  i)))
=> 13
Return the contents of position k (a
size_t) of vector.
A vector created by one of the dynamic vector constructor procedures (see Vector Creation) can be modified using the following procedures.
NOTE: According to R5RS, it is an error to use any of these procedures on a literally read vector, because such vectors should be considered as constants. Currently, however, Guile does not detect this error.
Store obj in position k of vector. k must be a valid index of vector. The value returned by `vector-set!' is unspecified.
(let ((vec (vector 0 '(2 2 2 2) "Anna")))
  (vector-set! vec 1 '("Sue" "Sue"))
  vec)
=> #(0 ("Sue" "Sue") "Anna")
Store obj in position k (a
size_t) of v.
Store fill in every position of vector. The value returned by vector-fill! is unspecified.
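Putting the creation, inspection and modification procedures together gives a session like the following sketch. The vector is created with make-vector, so it is not a literal constant and may be modified.

```scheme
(define v (make-vector 3 0))
(vector? v)            ; => #t
(vector-length v)      ; => 3

(vector-set! v 0 'a)
v                      ; => #(a 0 0)

(vector-fill! v 'z)
v                      ; => #(z z z)
(vector->list v)       ; => (z z z)
```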
Copy elements from vec1, positions start1 to end1, to vec2 starting at position start2. start1 and start2 are inclusive indices; end1 is exclusive.
vector-move-left! copies elements in leftmost order. Therefore, in the case where vec1 and vec2 refer to the same vector, vector-move-left! is usually appropriate when start1 is greater than start2.
Copy elements from vec1, positions start1 to end1, to vec2 starting at position start2. start1 and start2 are inclusive indices; end1 is exclusive.
vector-move-right! copies elements in rightmost order. Therefore, in the case where vec1 and vec2 refer to the same vector, vector-move-right! is usually appropriate when start1 is less than start2.
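The choice between the two procedures matters when source and destination overlap within one vector. The sketch below shifts a region in each direction, assuming the argument order vec1 start1 end1 vec2 start2 described above; the vectors v and w are arbitrary sample data.

```scheme
;; Shift towards the front: start1 (2) is greater than start2 (0),
;; so copying in leftmost order never overwrites unread elements.
(define v (vector 0 1 2 3 4 5))
(vector-move-left! v 2 5 v 0)    ; copy positions 2..4 to positions 0..2
v                                ; => #(2 3 4 3 4 5)

;; Shift towards the back: start1 (0) is less than start2 (2),
;; so copying in rightmost order is the safe choice.
(define w (vector 0 1 2 3 4 5))
(vector-move-right! w 0 3 w 2)   ; copy positions 0..2 to positions 2..4
w                                ; => #(0 1 0 1 2 5)
```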
A vector can be read and modified from C with the functions
scm_c_vector_ref and scm_c_vector_set_x, for example. In
addition to these functions, there are two more ways to access vectors
from C that might be more efficient in certain situations: you can
restrict yourself to simple vectors and then use the very fast
simple vector macros; or you can use the very general framework
for accessing all kinds of arrays (see Accessing Arrays from C),
which is more verbose, but can deal efficiently with all kinds of
vectors (and arrays). For vectors, you can use the
scm_vector_elements and scm_vector_writable_elements
functions as shortcuts.
Return non-zero if obj is a simple vector, else return zero. A simple vector is a vector that can be used with the SCM_SIMPLE_* macros below.
The following functions are guaranteed to return simple vectors: scm_make_vector, scm_c_make_vector, scm_vector, scm_list_to_vector.
Evaluates to the length of the simple vector vec. No type checking is done.
Evaluates to the element at position idx in the simple vector vec. No type or range checking is done.
Sets the element at position idx in the simple vector vec to val. No type or range checking is done.
Acquire a handle for the vector vec and return a pointer to the elements of it. This pointer can only be used to read the elements of vec. When vec is not a vector, an error is signaled. The handle must eventually be released with scm_array_handle_release.
The variables pointed to by lenp and incp are filled with the number of elements of the vector and the increment (number of elements) between successive elements, respectively. Successive elements of vec need not be contiguous in their underlying “root vector” returned here; hence the increment is not necessarily equal to 1 and may well be negative too (see Shared Arrays).
The following example shows the typical way to use this function. It creates a list of all elements of vec (in reverse order).
scm_t_array_handle handle;
size_t i, len;
ssize_t inc;
const SCM *elt;
SCM list;
elt = scm_vector_elements (vec, &handle, &len, &inc);
list = SCM_EOL;
for (i = 0; i < len; i++, elt += inc)
  list = scm_cons (*elt, list);
scm_array_handle_release (&handle);
Like scm_vector_elements but the pointer can be used to modify the vector.
The following example shows the typical way to use this function. It fills a vector with #t.
scm_t_array_handle handle;
size_t i, len;
ssize_t inc;
SCM *elt;
elt = scm_vector_writable_elements (vec, &handle, &len, &inc);
for (i = 0; i < len; i++, elt += inc)
  *elt = SCM_BOOL_T;
scm_array_handle_release (&handle);
A uniform numeric vector is a vector whose elements are all of a single numeric type. Guile offers uniform numeric vectors for signed and unsigned 8-bit, 16-bit, 32-bit, and 64-bit integers, two sizes of floating point values, and complex floating-point numbers of these two sizes.
Strings could be regarded as uniform vectors of characters (see Strings). Likewise, bit vectors could be regarded as uniform vectors of bits (see Bit Vectors). Both are sufficiently different from uniform numeric vectors that the procedures described here do not apply to these two data types. However, both strings and bit vectors are generalized vectors (see Generalized Vectors) and arrays (see Arrays).
Uniform numeric vectors are the special case of one dimensional uniform numeric arrays.
Uniform numeric vectors can be useful since they consume less memory than the non-uniform, general vectors. Also, since the types they can store correspond directly to C types, it is easier to work with them efficiently on a low level. Consider image processing as an example, where you want to apply a filter to some image. While you could store the pixels of an image in a general vector and write a general convolution function, things are much more efficient with uniform vectors: the convolution function knows that all pixels are unsigned 8-bit values (say), and can use a very tight inner loop.
That is, when it is written in C. Functions for efficiently working with uniform numeric vectors from C are listed at the end of this section.
Procedures similar to the vector procedures (see Vectors) are
provided for handling these uniform vectors, but they are distinct
datatypes and the two cannot be inter-mixed. If you want to work
primarily with uniform numeric vectors, but want to offer support for
general vectors as a convenience, you can use one of the
scm_any_to_* functions. They will coerce lists and vectors to
the given type of uniform vector. Alternatively, you can write two
versions of your code: one that is fast and works only with uniform
numeric vectors, and one that works with any kind of vector but is
slower.
One set of the procedures listed below is a generic one: it works with all types of uniform numeric vectors. In addition to that, there is a set of procedures for each type that only works with that type. Unless you really need the generality of the first set, it is best to use the more specific functions. They might not be that much faster, but their use can serve as a kind of declaration and makes it easier to optimize later on.
The generic set of procedures uses uniform in its names, the
specific ones use the tag from the following table.
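u8   unsigned 8-bit integers
s8   signed 8-bit integers
u16  unsigned 16-bit integers
s16  signed 16-bit integers
u32  unsigned 32-bit integers
s32  signed 32-bit integers
u64  unsigned 64-bit integers
s64  signed 64-bit integers
f32  the C type float
f64  the C type double
c32  complex numbers whose real and imaginary parts are each a C float
c64  complex numbers whose real and imaginary parts are each a C double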
The external representation (i.e. read syntax) for these vectors is similar to normal Scheme vectors, but with an additional tag from the table above indicating the vector's type. For example,
#u16(1 2 3)
#f64(3.1415 2.71)
Note that the read syntax for floating-point here conflicts with
#f for false. In Standard Scheme one can write (1 #f3)
for a three element list (1 #f 3), but for Guile (1 #f3)
is invalid. (1 #f 3) is almost certainly what one should write
anyway to make the intention clear, so this is rarely a problem.
Return #t if obj is a homogeneous numeric vector of the indicated type.
Return a newly allocated homogeneous numeric vector holding n elements of the indicated type. If value is given, the vector is initialized with that value, otherwise the contents are unspecified.
Return a newly allocated homogeneous numeric vector of the indicated type, holding the given parameter values. The vector length is the number of parameters given.
Return the number of elements in vec.
Return the element at index i in vec. The first element in vec is index 0.
Set the element at index i in vec to value. The first element in vec is index 0. The return value is unspecified.
Return a newly allocated list holding all elements of vec.
Return a newly allocated homogeneous numeric vector of the indicated type, initialized with the elements of the list lst.
Return a (maybe newly allocated) uniform numeric vector of the indicated type, initialized with the elements of obj, which must be a list, a vector, or a uniform vector. When obj is already a suitable uniform numeric vector, it is returned unchanged.
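The per-type procedures follow the SRFI-4 naming pattern, with the tag from the table above substituted into each name. The sketch below uses the u8, s16 and f64 variants. Loading (srfi srfi-4) explicitly is an assumption made for portability between Guile versions, and the variable names are invented.

```scheme
(use-modules (srfi srfi-4))

(define bytes (make-u8vector 4 0))
(u8vector? bytes)               ; => #t
(u8vector-length bytes)         ; => 4
(u8vector-set! bytes 1 255)
(u8vector-ref bytes 1)          ; => 255
(u8vector->list bytes)          ; => (0 255 0 0)

(define samples (list->s16vector '(-3 0 7)))
(s16vector-ref samples 0)       ; => -3

(f64vector->list (f64vector 1.5 2.5))   ; => (1.5 2.5)
```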
Return non-zero when uvec is a uniform numeric vector, zero otherwise.
Return a new uniform numeric vector of the indicated type and length that uses the memory pointed to by data to store its elements. This memory will eventually be freed with free. The argument len specifies the number of elements in data, not its size in bytes.
The c32 and c64 variants take a pointer to a C array of floats or doubles. The real parts of the complex numbers are at even indices in that array, the corresponding imaginary parts are at the following odd index.
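The interleaved layout described above can be sketched in plain C. This is a minimal illustration that does not call libguile: the helper names complex_real and complex_imag are hypothetical, the float array merely stands in for storage of the c32 kind, and the sketch assumes that len counts complex elements rather than individual floats.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers (not part of libguile): read element k of an
   interleaved complex array.  The real part sits at even index 2*k,
   the imaginary part at the following odd index 2*k + 1.  */
static float
complex_real (const float *data, size_t len, size_t k)
{
  assert (k < len);   /* assumption: len counts complex elements */
  return data[2 * k];
}

static float
complex_imag (const float *data, size_t len, size_t k)
{
  assert (k < len);
  return data[2 * k + 1];
}
```

With the array { 1.0, 2.0, 3.0, 4.0 } holding two complex numbers, element 0 is 1+2i and element 1 is 3+4i.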
Return the number of elements of uvec as a size_t.
Like scm_vector_elements (see Vector Accessing from C), but returns a pointer to the elements of a uniform numeric vector of the indicated kind.
Like scm_vector_writable_elements (see Vector Accessing from C), but returns a pointer to the elements of a uniform numeric vector of the indicated kind.
Fill the elements of uvec by reading raw bytes from port-or-fdes, using host byte order.
The optional arguments start (inclusive) and end (exclusive) allow a specified region to be read, leaving the remainder of the vector unchanged.
When port-or-fdes is a port, an attempt is made to read all specified elements of uvec, potentially blocking while waiting for more input or end-of-file. When port-or-fdes is an integer, a single call to read(2) is made.
An error is signalled when the last element has only been partially filled before reaching end-of-file or in the single call to read(2).
uniform-vector-read! returns the number of elements read.
port-or-fdes may be omitted, in which case it defaults to the value returned by (current-input-port).
Write the elements of uvec as raw bytes to port-or-fdes, in the host byte order.
The optional arguments start (inclusive) and end (exclusive) allow a specified region to be written.
When port-or-fdes is a port, an attempt is made to write all specified elements of uvec, potentially blocking while waiting for more room. When port-or-fdes is an integer, a single call to write(2) is made.
An error is signalled when the last element has only been partially written in the single call to write(2).
The number of objects actually written is returned. port-or-fdes may be omitted, in which case it defaults to the value returned by (current-output-port).
Bit vectors are zero-origin, one-dimensional arrays of booleans. They are displayed as a sequence of 0s and 1s prefixed by #*, e.g.,
(make-bitvector 8 #f) => #*00000000
Bit vectors are also generalized vectors (see Generalized Vectors) and can thus be used with the array procedures (see Arrays). Bit vectors are the special case of one-dimensional bit arrays.
Return #t when obj is a bitvector, else return #f.
Create a new bitvector of length len and optionally initialize all elements to fill.
Like scm_make_bitvector, but the length is given as a size_t.
Create a new bitvector with the arguments as elements.
Return the length of the bitvector vec.
Like scm_bitvector_length, but the length is returned as a size_t.
Return the element at index idx of the bitvector vec.
Return the element at index idx of the bitvector vec.
Set the element at index idx of the bitvector vec when val is true, else clear it.
Set the element at index idx of the bitvector vec when val is true, else clear it.
Set all elements of the bitvector vec when val is true, else clear them.
Return a new bitvector initialized with the elements of list.
Return a new list initialized with the elements of the bitvector vec.
Return a count of how many entries in bitvector are equal to bool. For example,
(bit-count #f #*000111000) => 6
Return the index of the first occurrence of bool in bitvector, starting from start. If there is no bool entry between start and the end of bitvector, then return #f. For example,
(bit-position #t #*000101 0) => 3
(bit-position #f #*0001111 3) => #f
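The counting and searching behaviour described above can be modelled in a few lines of C. This is only a sketch under stated assumptions: it is not libguile API, the names model_bit_count and model_bit_position are hypothetical, a bit vector is represented as an array of ints holding 0 or 1, and -1 stands in for the #f result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not libguile API: a bit vector is an array of
   ints holding 0 or 1, index 0 first, as in the printed #*... form.  */

/* Count the entries equal to value, in the manner of bit-count.  */
static size_t
model_bit_count (const int *bits, size_t len, int value)
{
  size_t i, count = 0;
  assert (bits != NULL || len == 0);
  for (i = 0; i < len; i++)
    if (bits[i] == value)
      count++;
  return count;
}

/* Return the index of the first entry equal to value at or after
   start, or -1 where bit-position would return #f.  */
static long
model_bit_position (const int *bits, size_t len, int value, size_t start)
{
  size_t i;
  assert (bits != NULL || len == 0);
  for (i = start; i < len; i++)
    if (bits[i] == value)
      return (long) i;
  return -1;
}
```

Applied to the bit patterns of the examples above, the model gives 6 for the count, 3 for the first search, and -1 for the second.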
Modify bitvector by replacing each element with its negation.
Set entries of bitvector to bool, with uvec selecting the entries to change. The return value is unspecified.
If uvec is a bit vector, then those entries where it has #t are the ones in bitvector which are set to bool. uvec and bitvector must be the same length. When bool is #t it's like uvec is OR'ed into bitvector. Or when bool is #f it can be seen as an ANDNOT.
(define bv #*01000010)
(bit-set*! bv #*10010001 #t)
bv
=> #*11010011
If uvec is a uniform vector of unsigned long integers, then they're indexes into bitvector which are set to bool.
(define bv #*01000010)
(bit-set*! bv #u(5 2 7) #t)
bv
=> #*01100111
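The two ways of selecting entries can be modelled in plain C as follows. This is a hedged sketch, not libguile code: the function names model_bit_set_mask and model_bit_set_indexes are hypothetical, and bit vectors are represented as arrays of ints holding 0 or 1 with index 0 first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not libguile API.  */

/* Selector given as a bit vector of the same length: every entry
   where sel[i] is 1 is set to value.  */
static void
model_bit_set_mask (int *bits, const int *sel, size_t len, int value)
{
  size_t i;
  assert (bits != NULL && sel != NULL);
  for (i = 0; i < len; i++)
    if (sel[i])
      bits[i] = value;   /* acts as OR when value is 1, ANDNOT when 0 */
}

/* Selector given as a list of n indexes into bits.  */
static void
model_bit_set_indexes (int *bits, size_t len,
                       const unsigned long *idx, size_t n, int value)
{
  size_t k;
  for (k = 0; k < n; k++)
    {
      assert (idx[k] < len);
      bits[idx[k]] = value;
    }
}
```

Starting from 01000010, the mask 10010001 with value 1 yields 11010011, and the index list 5, 2, 7 with value 1 yields 01100111, matching the two examples above.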
Return a count of how many entries in bitvector are equal to bool, with uvec selecting the entries to consider.
uvec is interpreted in the same way as for bit-set*! above. Namely, if uvec is a bit vector then entries which have #t there are considered in bitvector. Or if uvec is a uniform vector of unsigned long integers then it's the indexes in bitvector to consider.
For example,
(bit-count* #*01110111 #*11001101 #t) => 3
(bit-count* #*01110111 #u(7 0 4) #f) => 2
Like scm_vector_elements (see Vector Accessing from C), but for bitvectors. The variable pointed to by offp is set to the value returned by scm_array_handle_bit_elements_offset. See scm_array_handle_bit_elements for how to use the returned pointer and the offset.
Like scm_bitvector_elements, but the pointer is good for reading and writing.
Guile has a number of data types that are generally vector-like: strings, uniform numeric vectors, bitvectors, and of course ordinary vectors of arbitrary Scheme values. These types are disjoint: a Scheme value belongs to at most one of the four types listed above.
If you want to gloss over this distinction and want to treat all four types with common code, you can use the procedures in this section. They work with the generalized vector type, which is the union of the four vector-like types.
Return #t if obj is a vector, string, bitvector, or uniform numeric vector.
Return the length of the generalized vector v.
Return the element at index idx of the generalized vector v.
Set the element at index idx of the generalized vector v to val.
Return a new list whose elements are the elements of the generalized vector v.
Return 1 if obj is a vector, string, bitvector, or uniform numeric vector; else return 0.
Return the length of the generalized vector v.
Return the element at index idx of the generalized vector v.
Set the element at index idx of the generalized vector v to val.
Like scm_array_get_handle, but an error is signalled when v is not of rank one. You can use scm_array_handle_ref and scm_array_handle_set to read and write the elements of v, or you can use functions like scm_array_handle_<foo>_elements to deal with specific types of vectors.
Arrays are a collection of cells organized into an arbitrary number of dimensions. Each cell can be accessed in constant time by supplying an index for each dimension.
In the current implementation, an array uses a generalized vector for
the actual storage of its elements. Any kind of generalized vector
will do, so you can have arrays of uniform numeric values, arrays of
characters, arrays of bits, and of course, arrays of arbitrary Scheme
values. For example, arrays with an underlying c64vector might
be nice for digital signal processing, while arrays made from a
u8vector might be used to hold gray-scale images.
The number of dimensions of an array is called its rank. Thus, a matrix is an array of rank 2, while a vector has rank 1. When accessing an array element, you have to specify one exact integer for each dimension. These integers are called the indices of the element. An array specifies the allowed range of indices for each dimension via an inclusive lower and upper bound. These bounds can well be negative, but the upper bound must be greater than or equal to the lower bound minus one. When all lower bounds of an array are zero, it is called a zero-origin array.
Arrays can be of rank 0, which could be interpreted as a scalar. Thus, a zero-rank array can store exactly one object and the list of indices of this element is the empty list.
Arrays contain zero elements when one of their dimensions has a zero length. These empty arrays maintain information about their shape: a matrix with zero columns and 3 rows is different from a matrix with 3 columns and zero rows, which again is different from a vector of length zero.
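The relation between inclusive bounds and dimension length can be made concrete with a small C model. This is an illustrative sketch only: struct range and both functions are hypothetical names, not part of libguile.

```c
#include <assert.h>

/* Hypothetical model of the inclusive index range of one dimension.  */
struct range { long lower; long upper; };

/* Length of the dimension.  The case upper == lower - 1 is allowed
   and describes an empty dimension of length zero.  */
static long
range_length (struct range r)
{
  assert (r.upper >= r.lower - 1);
  return r.upper - r.lower + 1;
}

/* Nonzero when idx is a permissible index for the dimension.  */
static int
range_contains (struct range r, long idx)
{
  return idx >= r.lower && idx <= r.upper;
}
```

A dimension with bounds -1 and 3 has length 5, while bounds 0 and -1 give length zero, so no index is permissible in that dimension.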
Generalized vectors, such as strings, uniform numeric vectors, bit vectors and ordinary vectors, are the special case of one dimensional arrays.
An array is displayed as # followed by its rank, followed by a
tag that describes the underlying vector, optionally followed by
information about its shape, and finally followed by the cells,
organized into dimensions using parentheses.
In more detail, the array tag is of the form
#<rank><vectag><@lower><:len><@lower><:len>...
where <rank> is a positive integer in decimal giving the rank of
the array. It is omitted when the rank is 1 and the array is non-shared
and has zero-origin (see below). For shared arrays and for a non-zero
origin, the rank is always printed even when it is 1 to distinguish
them from ordinary vectors.
The <vectag> part is the tag for a uniform numeric vector, like
u8, s16, etc, b for bitvectors, or a for
strings. It is empty for ordinary vectors.
The <@lower> part is a `@' character followed by a signed
integer in decimal giving the lower bound of a dimension. There is one
<@lower> for each dimension. When all lower bounds are zero,
all <@lower> parts are omitted.
The <:len> part is a `:' character followed by an unsigned
integer in decimal giving the length of a dimension. Like for the lower
bounds, there is one <:len> for each dimension, and the
<:len> part always follows the <@lower> part for a
dimension. Lengths are printed only when they can't be deduced
from the nested lists of elements of the array literal, which can happen
when at least one length is zero.
As a special case, an array of rank 0 is printed as
#0<vectag>(<scalar>), where <scalar> is the result of
printing the single element of the array.
Thus,
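#(1 2 3) – an ordinary array of rank 1 with lower bound 0, i.e., a regular vector.
#@2(1 2 3) – an ordinary array of rank 1 with lower bound 2.
#2((1 2 3) (4 5 6)) – an ordinary array of rank 2, a 2x3 matrix with index ranges 0..1 and 0..2.
#u32(0 1 2) – a uniform u32 array of rank 1.
#2u32@2@3((1 2) (2 3)) – a uniform u32 array of rank 2 with index ranges 2..3 and 3..4.
#2() – an array of rank 2 in which both dimensions have length zero.
#2:0:2() – an array of rank 2 in which the first dimension has length zero and the second has length 2.
#0(12) – an array of rank 0 whose single element is 12.
When an array is created, the range of each dimension must be specified, e.g., to create a 2x3 array with a zero-based index: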
(make-array 'ho 2 3) => #2((ho ho ho) (ho ho ho))
The range of each dimension can also be given explicitly, e.g., another way to create the same array:
(make-array 'ho '(0 1) '(0 2)) => #2((ho ho ho) (ho ho ho))
The following procedures can be used with arrays (or vectors). An argument shown as idx... means one parameter for each dimension in the array. An idxlist argument means a list of such values, one for each dimension.
Return #t if obj is an array, and #f if not.
The second argument to scm_array_p is there for historical reasons, but it is not used. You should always pass SCM_UNDEFINED as its value.
Return #t if obj is an array of type type, and #f if not.
Return 1 if obj is an array of type type, and 0 if not.
Equivalent to (make-typed-array #t fill bound ...).
Create and return an array that has as many dimensions as there are bounds and (maybe) fill it with fill.
The underlying storage vector is created according to type, which must be a symbol whose name is the `vectag' of the array as explained above, or #t for ordinary, non-specialized arrays.
For example, using the symbol f64 for type will create an array that uses an f64vector for storing its elements, and the symbol a will use a string.
When fill is not the special unspecified value, the new array is filled with fill. Otherwise, the initial contents of the array are unspecified. The special unspecified value is stored in the variable *unspecified* so that for example (make-typed-array 'u32 *unspecified* 4) creates an uninitialized u32 vector of length 4.
Each bound may be a positive non-zero integer N, in which case the index for that dimension can range from 0 through N-1; or an explicit index range specifier in the form (LOWER UPPER), where both lower and upper are integers, possibly less than zero, and possibly the same number (however, lower cannot be greater than upper).
Return an array of the type indicated by type with elements the same as those of list.
The argument dimspec determines the number of dimensions of the array and their lower bounds. When dimspec is an exact integer, it gives the number of dimensions directly and all lower bounds are zero. When it is a list of exact integers, then each element is the lower index bound of a dimension, and there will be as many dimensions as elements in the list.
Return the type of array. This is the `vectag' used for printing array (or #t for ordinary arrays) and can be used with make-typed-array to create an array of the same kind as array.
Return the element at (idx ...) in array.
(define a (make-array 999 '(1 2) '(3 4)))
(array-ref a 2 4) => 999
Return #t if the given index would be acceptable to array-ref.
(define a (make-array #f '(1 2) '(3 4)))
(array-in-bounds? a 2 3) => #t
(array-in-bounds? a 0 0) => #f
Set the element at (idx ...) in array to obj. The return value is unspecified.
(define a (make-array #f '(0 1) '(0 1)))
(array-set! a #t 1 1)
a => #2((#f #f) (#f #t))
dim1, dim2 ... should be nonnegative integers less than the rank of array.
enclose-array returns an array resembling an array of shared arrays. The dimensions of each shared array are the same as the dimth dimensions of the original array, the dimensions of the outer array are the same as those of the original array that did not match a dim.
An enclosed array is not a general Scheme array. Its elements may not be set using array-set!. Two references to the same element of an enclosed array will be equal? but will not in general be eq?. The value returned by array-prototype when given an enclosed array is unspecified.
For example,
(enclose-array '#3(((a b c) (d e f)) ((1 2 3) (4 5 6))) 1)
=> #<enclosed-array (#1(a d) #1(b e) #1(c f)) (#1(1 4) #1(2 5) #1(3 6))>
(enclose-array '#3(((a b c) (d e f)) ((1 2 3) (4 5 6))) 1 0)
=> #<enclosed-array #2((a 1) (d 4)) #2((b 2) (e 5)) #2((c 3) (f 6))>
Return a list of the bounds for each dimension of array.
array-shape gives (lower upper) for each dimension. array-dimensions instead returns just upper+1 for dimensions with a 0 lower bound. Both are suitable as input to make-array.
For example,
(define a (make-array 'foo '(-1 3) 5))
(array-shape a) => ((-1 3) (0 4))
(array-dimensions a) => ((-1 3) 5)
Return a list consisting of all the elements, in order, of array.
Copy every element from vector or array src to the corresponding element of dst. dst must have the same rank as src, and be at least as large in each dimension. The return value is unspecified.
Store fill in every element of array. The value returned is unspecified.
Return #t if all arguments are arrays with the same shape, the same type, and have corresponding elements which are either equal? or array-equal?. This function differs from equal? (see Equality) in that a one dimensional shared array may be array-equal? but not equal? to a vector or uniform vector.
Set each element of the dst array to values obtained from calls to proc. The value returned is unspecified.
Each call is (proc elem1...elemN), where each elem is from the corresponding src array, at the dst index. array-map-in-order! makes the calls in row-major order, array-map! makes them in an unspecified order.
The src arrays must have the same number of dimensions as dst, and must have a range for each dimension which covers the range in dst. This ensures all dst indices are valid in each src.
Apply proc to each tuple of elements of src1 ... srcN, in row-major order. The value returned is unspecified.
Set each element of the dst array to values returned by calls to proc. The value returned is unspecified.
Each call is (proc i1...iN), where i1...iN is the destination index, one parameter for each dimension. The order in which the calls are made is unspecified.
For example, to create a 4x4 matrix representing a cyclic group,
/ 0 1 2 3 \
| 1 2 3 0 |
| 2 3 0 1 |
\ 3 0 1 2 /
(define a (make-array #f 4 4))
(array-index-map! a (lambda (i j) (modulo (+ i j) 4)))
Attempt to read all elements of ura, in lexicographic order, as binary objects from port-or-fdes. If an end of file is encountered, the objects up to that point are put into ura (starting at the beginning) and the remainder of the array is unchanged.
The optional arguments start and end allow a specified region of a vector (or linearized array) to be read, leaving the remainder of the vector unchanged.
uniform-array-read! returns the number of objects read. port-or-fdes may be omitted, in which case it defaults to the value returned by (current-input-port).
Write all elements of ura as binary objects to port-or-fdes.
The optional arguments start and end allow a specified region of a vector (or linearized array) to be written.
The number of objects actually written is returned. port-or-fdes may be omitted, in which case it defaults to the value returned by (current-output-port).
Return a new array which shares the storage of oldarray. Changes made through either affect the same underlying storage. The bound... arguments are the shape of the new array, the same as make-array (see Array Procedures).
mapfunc translates coordinates from the new array to the oldarray. It's called as (mapfunc newidx1 ...) with one parameter for each dimension of the new array, and should return a list of indices for oldarray, one for each dimension of oldarray.
mapfunc must be affine linear, meaning that each oldarray index must be formed by adding integer multiples (possibly negative) of some or all of newidx1 etc, plus a possible integer offset. The multiples and offset must be the same in each call.
One good use for a shared array is to restrict the range of some dimensions, so as to apply say array-for-each or array-fill! to only part of an array. The plain list function can be used for mapfunc in this case, making no changes to the index values. For example,
(make-shared-array #2((a b c) (d e f) (g h i)) list 3 2)
=> #2((a b) (d e) (g h))
The new array can have fewer dimensions than oldarray, for example to take a column from an array.
(make-shared-array #2((a b c) (d e f) (g h i)) (lambda (i) (list i 2)) '(0 2))
=> #1(c f i)
A diagonal can be taken by using the single new array index for both row and column in the old array. For example,
(make-shared-array #2((a b c) (d e f) (g h i)) (lambda (i) (list i i)) '(0 2))
=> #1(a e i)
Dimensions can be increased by for instance considering portions of a one dimensional array as rows in a two dimensional array. (array-contents below can do the opposite, flattening an array.)
(make-shared-array #1(a b c d e f g h i j k l) (lambda (i j) (list (+ (* i 3) j))) 4 3)
=> #2((a b c) (d e f) (g h i) (j k l))
By negating an index the order that elements appear can be reversed. The following just reverses the column order,
(make-shared-array #2((a b c) (d e f) (g h i)) (lambda (i j) (list i (- 2 j))) 3 3)
=> #2((c b a) (f e d) (i h g))
A fixed offset on indexes allows for instance a change from a 0 based to a 1 based array,
(define x #2((a b c) (d e f) (g h i)))
(define y (make-shared-array x (lambda (i j) (list (1- i) (1- j))) '(1 3) '(1 3)))
(array-ref x 0 0) => a
(array-ref y 1 1) => a
A multiple on an index allows every Nth element of an array to be taken. The following is every third element,
(make-shared-array #1(a b c d e f g h i j k l) (lambda (i) (list (* i 3))) 4)
=> #1(a d g j)
The above examples can be combined to make weird and wonderful selections from an array, but it's important to note that because mapfunc must be affine linear, arbitrary permutations are not possible.
In the current implementation, mapfunc is not called for every access to the new array but only on some sample points to establish a base and stride for new array indices in oldarray data. A few sample points are enough because mapfunc is linear.
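The idea of recovering a base and strides from a few sample calls can be sketched in C. This is a simplified model and not Guile's actual implementation: every name below is hypothetical, both arrays are assumed to be zero-origin, and the sample points chosen here (the origin and one unit step per dimension) are an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RANK 8

/* Hypothetical types for this sketch; none of this is libguile API.  */
typedef void (*mapfunc_t) (const long *newidx, long *oldidx);

/* Position of oldidx in the old array's storage, for a zero-origin
   old array whose per-dimension increments are old_inc[].  */
static long
old_position (const long *oldidx, const long *old_inc, size_t old_rank)
{
  long pos = 0;
  size_t k;
  for (k = 0; k < old_rank; k++)
    pos += oldidx[k] * old_inc[k];
  return pos;
}

/* Sample mapfunc at the origin and at one unit step along each new
   dimension.  Because mapfunc is affine, these samples determine a
   base and one stride per new dimension.  */
static void
derive_base_and_strides (mapfunc_t mapfunc, size_t new_rank,
                         const long *old_inc, size_t old_rank,
                         long *base, long *strides)
{
  long newidx[MAX_RANK] = { 0 };
  long oldidx[MAX_RANK];
  size_t d;

  assert (new_rank <= MAX_RANK && old_rank <= MAX_RANK);

  mapfunc (newidx, oldidx);
  *base = old_position (oldidx, old_inc, old_rank);

  for (d = 0; d < new_rank; d++)
    {
      newidx[d] = 1;
      mapfunc (newidx, oldidx);
      strides[d] = old_position (oldidx, old_inc, old_rank) - *base;
      newidx[d] = 0;
    }
}

/* Example mapping: new index i selects column 2 of the old matrix,
   like (lambda (i) (list i 2)).  */
static void
column_two (const long *newidx, long *oldidx)
{
  oldidx[0] = newidx[0];
  oldidx[1] = 2;
}
```

For a 3x3 row-major matrix with increments 3 and 1, column_two yields base 2 and stride 3, so the new array's elements sit at storage positions 2, 5 and 8.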
For each dimension, return the distance between elements in the root vector.
Return the root vector index of the first element in the array.
Return the root vector of a shared array.
If array may be unrolled into a one dimensional shared array without changing the order of its elements (last subscript changing fastest), then array-contents returns that shared array, otherwise it returns #f. All arrays made by make-array and make-typed-array may be unrolled, some arrays made by make-shared-array may not be.
If the optional argument strict is provided, a shared array will be returned only if its elements are stored contiguously in memory.
Return an array sharing contents with array, but with dimensions arranged in a different order. There must be one dim argument for each dimension of array. dim1, dim2, ... should be integers between 0 and the rank of the array to be returned. Each integer in that range must appear at least once in the argument list.
The values of dim1, dim2, ... correspond to dimensions in the array to be returned, and their positions in the argument list to dimensions of array. Several dims may have the same value, in which case the returned array will have smaller rank than array.
(transpose-array '#2((a b) (c d)) 1 0) => #2((a c) (b d))
(transpose-array '#2((a b) (c d)) 0 0) => #1(a d)
(transpose-array '#3(((a b c) (d e f)) ((1 2 3) (4 5 6))) 1 1 0) => #2((a 4) (b 5) (c 6))
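One way to picture the effect of the dim arguments is in terms of storage increments. The following C sketch is a model under stated assumptions rather than Guile's implementation: transposed_increments is a hypothetical name, old_inc holds the storage increment of each dimension of array, and dims holds the dim arguments in order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not libguile API.  old_inc[k] is the storage
   increment of old dimension k, and dims[k] names the dimension of
   the returned array that it maps to.  Old dimensions that share a
   new dimension have their increments added, which is what selects
   a diagonal.  */
static void
transposed_increments (const long *old_inc, const size_t *dims,
                       size_t old_rank, long *new_inc, size_t new_rank)
{
  size_t k;
  for (k = 0; k < new_rank; k++)
    new_inc[k] = 0;
  for (k = 0; k < old_rank; k++)
    {
      assert (dims[k] < new_rank);
      new_inc[dims[k]] += old_inc[k];
    }
}
```

For a 2x2 row-major matrix with increments 2 and 1, the arguments 1 0 swap the increments to 1 and 2, while the arguments 0 0 give a single increment of 3, which steps from a directly to d as in the second example above.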
Arrays, especially uniform numeric arrays, are useful to efficiently represent large amounts of rectangularly organized information, such as matrices, images, or generally blobs of binary data. It is desirable to access these blobs in a C-like manner so that they can be handed to external C code such as linear algebra libraries or image processing routines.
While pointers to the elements of an array are in use, the array itself must be protected so that the pointer remains valid. Such a protected array is said to be reserved. A reserved array can be read but modifications to it that would cause the pointer to its elements to become invalid are prevented. When you attempt such a modification, an error is signalled.
(This is similar to locking the array while it is in use, but without the danger of a deadlock. In a multi-threaded program, you will need additional synchronization to avoid modifying reserved arrays.)
You must take care to always unreserve an array after reserving it, even in the presence of non-local exits. If a non-local exit can happen between these two calls, you should install a dynwind context that releases the array when it is left (see Dynamic Wind).
In addition, array reserving and unreserving must be properly paired. For instance, when reserving two or more arrays in a certain order, you need to unreserve them in the opposite order.
Once you have reserved an array and have retrieved the pointer to its elements, you must figure out the layout of the elements in memory. Guile allows slices to be taken out of arrays without actually making a copy, such as making an alias for the diagonal of a matrix that can be treated as a vector. Arrays that result from such an operation are not stored contiguously in memory and when working with their elements directly, you need to take this into account.
The layout of array elements in memory can be defined via a mapping function that computes a scalar position from a vector of indices. The scalar position then is the offset of the element with the given indices from the start of the storage block of the array.
In Guile, this mapping function is restricted to be affine: all
mapping functions of Guile arrays can be written as p = b +
c[0]*i[0] + c[1]*i[1] + ... + c[n-1]*i[n-1] where i[k] is the
kth index and n is the rank of the array. For
example, a matrix of size 3x3 would have b == 0, c[0] ==
3 and c[1] == 1. When you transpose this matrix (with
transpose-array, say), you will get an array whose mapping
function has b == 0, c[0] == 1 and c[1] == 3.
The function scm_array_handle_dims gives you (indirect) access to
the coefficients c[k].
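The affine mapping can be evaluated with a short loop. The C sketch below is illustrative only; affine_position is a hypothetical helper, not a libguile function, and it takes the offset b and the coefficients c[k] directly as arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not libguile API: evaluate
   p = b + c[0]*i[0] + ... + c[n-1]*i[n-1].  */
static long
affine_position (long b, const long *c, const long *i, size_t n)
{
  long p = b;
  size_t k;
  assert (c != NULL && i != NULL);
  for (k = 0; k < n; k++)
    p += c[k] * i[k];
  return p;
}
```

With the 3x3 matrix above (b == 0, c[0] == 3, c[1] == 1), indices (1, 2) map to position 5. With the transposed coefficients (c[0] == 1, c[1] == 3), indices (2, 1) map to the same position 5, which shows that both arrays address the same storage cell.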
Note that there are no functions for accessing the elements of a character array yet. Once the string implementation of Guile has been changed to use Unicode, we will provide them.
This is a structure type that holds all information necessary to manage the reservation of arrays as explained above. Structures of this type must be allocated on the stack and must only be accessed by the functions listed below.
Reserve array, which must be an array, and prepare handle to be used with the functions below. You must eventually call scm_array_handle_release on handle, and do this in a properly nested fashion, as explained above. The structure pointed to by handle does not need to be initialized before calling this function.
End the array reservation represented by handle. After a call to this function, handle might be used for another reservation.
Return the rank of the array represented by handle.
This structure type holds information about the layout of one dimension of an array. It includes the following fields:
ssize_t lbnd
ssize_t ubnd
The lower and upper bounds (both inclusive) of the permissible index range for the given dimension. Both values can be negative, but lbnd is always less than or equal to ubnd.
ssize_t inc
The distance from one element of this dimension to the next. Note, too, that this can be negative.
Return a pointer to a C vector of information about the dimensions of the array represented by handle. This pointer is valid as long as the array remains reserved. As explained above, the scm_t_array_dim structures returned by this function can be used to calculate the position of an element in the storage block of the array from its indices.
This position can then be used as an index into the C array pointer returned by the various scm_array_handle_<foo>_elements functions, or with scm_array_handle_ref and scm_array_handle_set.
Here is how one can compute the position pos of an element given its indices in the vector indices:
ssize_t indices[RANK];
scm_t_array_dim *dims;
ssize_t pos;
size_t i;

pos = 0;
for (i = 0; i < RANK; i++)
  {
    if (indices[i] < dims[i].lbnd || indices[i] > dims[i].ubnd)
      out_of_range ();
    pos += (indices[i] - dims[i].lbnd) * dims[i].inc;
  }
Compute the position corresponding to indices, a list of indices. The position is computed as described above for scm_array_handle_dims. The number of the indices and their range is checked and an appropriate error is signalled for invalid indices.
Return the element at position pos in the storage block of the array represented by handle. Any kind of array is acceptable. No range checking is done on pos.
Set the element at position pos in the storage block of the array represented by handle to val. Any kind of array is acceptable. No range checking is done on pos. An error is signalled when the array can not store val.
Return a pointer to the elements of an ordinary array of general Scheme values (i.e., a non-uniform array) for reading. This pointer is valid as long as the array remains reserved.
Like scm_array_handle_elements, but the pointer is good for reading and writing.
Return a pointer to the elements of a uniform numeric array for reading. This pointer is valid as long as the array remains reserved. The size of each element is given by scm_array_handle_uniform_element_size.
Like scm_array_handle_uniform_elements, but the pointer is good for reading and writing.
Return the size of one element of the uniform numeric array represented by handle.
Return a pointer to the elements of a uniform numeric array of the indicated kind for reading. This pointer is valid as long as the array remains reserved.
The pointers for c32 and c64 uniform numeric arrays point to pairs of floating point numbers. The even index holds the real part, the odd index the imaginary part of the complex number.
Like scm_array_handle_<kind>_elements, but the pointer is good for reading and writing.
Return a pointer to the words that store the bits of the represented array, which must be a bit array.
Unlike other arrays, bit arrays have an additional offset that must be figured into index calculations. That offset is returned by scm_array_handle_bit_elements_offset.
To find a certain bit you first need to calculate its position as explained above for scm_array_handle_dims and then add the offset. This gives the absolute position of the bit, which is always a non-negative integer.
Each word of the bit array storage block contains exactly 32 bits, with the least significant bit in that word having the lowest absolute position number. The next word contains the next 32 bits.
Thus, the following code can be used to access a bit whose position according to scm_array_handle_dims is given in pos:
SCM bit_array;
scm_t_array_handle handle;
const scm_t_uint32 *bits;
ssize_t pos;
size_t abs_pos;
size_t word_pos, mask;

/* The array itself is passed, not its address. */
scm_array_get_handle (bit_array, &handle);
bits = scm_array_handle_bit_elements (&handle);

pos = ...
abs_pos = pos + scm_array_handle_bit_elements_offset (&handle);
word_pos = abs_pos / 32;
mask = 1L << (abs_pos % 32);

if (bits[word_pos] & mask)
  {
    /* bit is set. */
  }

scm_array_handle_release (&handle);
Like scm_array_handle_bit_elements, but the pointer is good for reading and writing. You must take care not to modify bits outside of the allowed index range of the array, even for contiguous arrays.
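The word and mask arithmetic for reading and writing a single bit can be isolated in two small helpers. This is a self-contained sketch that does not use libguile: bit_is_set and bit_assign are hypothetical names, uint32_t stands in for the 32-bit storage words, and abs_pos is assumed to be an absolute bit position that already includes the offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, not libguile API.  bits points to 32-bit
   storage words; the least significant bit of each word has the
   lowest absolute position.  */
static int
bit_is_set (const uint32_t *bits, size_t abs_pos)
{
  assert (bits != NULL);
  return (int) ((bits[abs_pos / 32] >> (abs_pos % 32)) & 1u);
}

/* Set the bit at abs_pos when value is nonzero, else clear it.
   Only that one bit of the containing word is changed.  */
static void
bit_assign (uint32_t *bits, size_t abs_pos, int value)
{
  uint32_t mask = (uint32_t) 1 << (abs_pos % 32);
  assert (bits != NULL);
  if (value)
    bits[abs_pos / 32] |= mask;
  else
    bits[abs_pos / 32] &= ~mask;
}
```

Because each update touches exactly one bit through a mask, neighbouring bits in the same word are left alone, which is the property needed to avoid modifying bits outside the allowed index range.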
A record type is a first class object representing a user-defined data type. A record is an instance of a record type.
Return #t if obj is a record of any type and #f otherwise.
Note that record? may be true of any Scheme value; there is no promise that records are disjoint with other Scheme types.
Create and return a new record-type descriptor.
type-name is a string naming the type. Currently it's only used in the printed representation of records, and in diagnostics. field-names is a list of symbols naming the fields of a record of the type. Duplicates are not allowed among these symbols.
(make-record-type "employee" '(name age salary))
The optional print argument is a function used by display, write, etc, for printing a record of the new type. It's called as (print record port) and should look at record and write to port.
Return a procedure for constructing new members of the type represented by rtd. The returned procedure accepts exactly as many arguments as there are symbols in the given list, field-names; these are used, in order, as the initial values of those fields in a new record, which is returned by the constructor procedure. The values of any fields not named in that list are unspecified. The field-names argument defaults to the list of field names in the call to make-record-type that created the type represented by rtd; if the field-names argument is provided, it is an error if it contains any duplicates or any symbols not in the default list.
Return a procedure for testing membership in the type represented by rtd. The returned procedure accepts exactly one argument and returns a true value if the argument is a member of the indicated record type; it returns a false value otherwise.
Return a procedure for reading the value of a particular field of a member of the type represented by rtd. The returned procedure accepts exactly one argument which must be a record of the appropriate type; it returns the current value of the field named by the symbol field-name in that record. The symbol field-name must be a member of the list of field-names in the call to make-record-type that created the type represented by rtd.
Return a procedure for writing the value of a particular field of a member of the type represented by rtd. The returned procedure accepts exactly two arguments: first, a record of the appropriate type, and second, an arbitrary Scheme value; it modifies the field named by the symbol field-name in that record to contain the given value. The returned value of the modifier procedure is unspecified. The symbol field-name must be a member of the list of field-names in the call to make-record-type that created the type represented by rtd.
Return a record-type descriptor representing the type of the given record. That is, for example, if the returned descriptor were passed to record-predicate, the resulting predicate would return a true value when passed the given record. Note that it is not necessarily the case that the returned descriptor is the one that was passed to record-constructor in the call that created the constructor procedure that created the given record.
Return the type-name associated with the type represented by rtd. The returned value is
eqv? to the type-name argument given in the call to make-record-type that created the type represented by rtd.
Return a list of the symbols naming the fields in members of the type represented by rtd. The returned value is
equal? to the field-names argument given in the call to make-record-type that created the type represented by rtd.
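As a minimal sketch of how the procedures above fit together, the following defines a hypothetical point record type with fields x and y and derives a constructor, predicate, accessors and a modifier from its descriptor. The type and every point- name are illustrative assumptions, not part of Guile.

```scheme
;; Hypothetical record type, used only for illustration.
(define point-rtd (make-record-type 'point '(x y)))

;; Derive the usual procedures from the record-type descriptor.
(define make-point   (record-constructor point-rtd))  ; takes x, then y
(define point?       (record-predicate point-rtd))
(define point-x      (record-accessor point-rtd 'x))
(define point-y      (record-accessor point-rtd 'y))
(define set-point-x! (record-modifier point-rtd 'x))

(define p (make-point 3 4))
(set-point-x! p 10)

(point? p)                      ; true
(point? 'not-a-point)           ; false
(point-x p)                     ; => 10
(point-y p)                     ; => 4
(record-type-fields point-rtd)  ; => (x y)
```

Passing no field-names list to record-constructor keeps the argument order identical to the one given to make-record-type, which is usually the least surprising choice.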
A structure is a first class data type which holds Scheme values
or C words in fields numbered 0 upwards. A vtable represents a
structure type, giving field types and permissions, and an optional
print function for write etc.
Structures are lower level than records (see Records) but have
some extra features. The vtable system allows sets of types to be
constructed, with class data. The uninterpreted words can
inter-operate with C code, allowing arbitrary pointers or other values
to be stored alongside the usual Scheme SCM values.
A vtable is a structure type, specifying its layout and other information. A vtable is actually itself a structure, but there's no need to worry about that initially (see Vtable Contents).
Create a new vtable.
fields is a string describing the fields in the structures to be created. Each field is represented by two characters, a type letter and a permissions letter, for example "pw". The types are as follows.
p – a Scheme value. “p” stands for “protected” meaning it's protected against garbage collection.
u – an arbitrary word of data (an scm_t_bits). At the Scheme level it's read and written as an unsigned integer. “u” stands for “uninterpreted” (it's not treated as a Scheme value), or “unprotected” (it's not marked during GC), or “unsigned long” (its size), or all of these things.
s – a self-reference. Such a field holds the SCM value of the structure itself (a circular reference). This can be useful in C code where you might have a pointer to the data array, and want to get the Scheme SCM handle for the structure. In Scheme code it has no use.
The second letter for each field is a permission code,
w – writable, the field can be read and written.
r – read-only, the field can be read but not written.
o – opaque, the field can be neither read nor written at the Scheme level. This can be used for fields which should only be used from C code.
W, R, O – a tail array, with permissions for the array fields as per w, r, o.
A tail array consists of further fields at the end of a structure. The last field in the layout string might be for instance `pW' to have a tail of writable Scheme-valued fields. The `pW' field itself holds the tail size, and the tail fields come after it.
Here are some examples.
(make-vtable "pw")      ;; one writable field
(make-vtable "prpw")    ;; one read-only and one writable
(make-vtable "pwuwuw")  ;; one scheme and two uninterpreted
(make-vtable "prpW")    ;; one fixed then a tail array
The optional print argument is a function called by display and write (etc) to give a printed representation of a structure created from this vtable. It's called as (print struct port) and should look at struct and write to port. The default print merely gives a form like `#<struct ADDR:ADDR>' with a pair of machine addresses.
The following print function for example shows the two fields of its structure.
(make-vtable "prpw"
             (lambda (struct port)
               (display "#<" port)
               (display (struct-ref struct 0) port)
               (display " and " port)
               (display (struct-ref struct 1) port)
               (display ">" port)))
This section describes the basic procedures for working with
structures. make-struct creates a structure, and
struct-ref and struct-set! read and write fields.
Create a new structure, with layout per the given vtable (see Vtables).
tail-size is the size of the tail array if vtable specifies a tail array. tail-size should be 0 when vtable doesn't specify a tail array.
The optional init... arguments are initial values for the fields of the structure (and the tail array). This is the only way to put values in read-only fields. If there are fewer init arguments than fields then the defaults are #f for a Scheme field (type p) or 0 for an uninterpreted field (type u).
Type s self-reference fields, permission o opaque fields, and the count field of a tail array are all ignored for the init arguments, i.e. an argument is not consumed by such a field. An s is always set to the structure itself, an o is always set to #f or 0 (with the intention that C code will do something to it later), and the tail count is always the given tail-size.
For example,
(define v (make-vtable "prpwpw"))
(define s (make-struct v 0 123 "abc" 456))
(struct-ref s 0) => 123
(struct-ref s 1) => "abc"

(define v (make-vtable "prpW"))
(define s (make-struct v 6 "fixed field" 'x 'y))
(struct-ref s 0) => "fixed field"
(struct-ref s 1) => 6    ;; tail size
(struct-ref s 2) => x    ;; tail array ...
(struct-ref s 3) => y
(struct-ref s 4) => #f
Return #t if obj is a structure, or #f if not.
Return the contents of field number n in struct. The first field is number 0.
An error is thrown if n is out of range, or if the field cannot be read because it's o opaque.
Set field number n in struct to value. The first field is number 0.
An error is thrown if n is out of range, or if the field cannot be written because it's r read-only or o opaque.
Return the vtable used by struct.
This can be used to examine the layout of an unknown structure, see Vtable Contents.
A vtable is itself a structure, with particular fields that hold information about the structures to be created. These include the fields of those structures, and the print function for them. The variables below allow access to those fields.
Return #t if obj is a vtable structure.
Note that because vtables are simply structures with a particular layout, struct-vtable? can potentially return true on an application structure which merely happens to look like a vtable.
The field number of the layout specification in a vtable. The layout specification is a symbol like pwpw formed from the fields string passed to make-vtable, or created by make-struct-layout (see Vtable Vtables).
(define v (make-vtable "pwpw"))
(struct-ref v vtable-index-layout) => pwpw
This field is read-only, since the layout of structures using a vtable cannot be changed.
A self-reference to the vtable, i.e. a type s field. This is used by C code within Guile and has no use at the Scheme level.
The field number of the printer function. This field contains #f if the default print function should be used.
(define (my-print-func struct port)
  ...)
(define v (make-vtable "pwpw" my-print-func))
(struct-ref v vtable-index-printer) => my-print-func
This field is writable, allowing the print function to be changed dynamically.
Get or set the name of vtable. name is a symbol and is used in the default print function when printing structures created from vtable.
(define v (make-vtable "pw"))
(set-struct-vtable-name! v 'my-name)
(define s (make-struct v 0))
(display s) -| #<my-name b7ab3ae0:b7ab3730>
Return the tag of the given vtable.
As noted above, a vtable is a structure and that structure is itself
described by a vtable. Such a “vtable of a vtable” can be created
with make-vtable-vtable below. This can be used to build sets
of related vtables, possibly with extra application fields.
This second level of vtable can be a little confusing. The ball example below is a typical use, adding a “class data” field to the vtables, from which instance structures are created. The current implementation of Guile's own records (see Records) does something similar: a record type descriptor is a vtable with room to hold the field names of the records to be created from it.
Create a “vtable-vtable” which can be used to create vtables. This vtable-vtable is also a vtable, and is self-describing, meaning its vtable is itself. The following is a simple usage.
(define vt-vt (make-vtable-vtable "" 0))
(define vt (make-struct vt-vt 0
                        (make-struct-layout "pwpw")))
(define s (make-struct vt 0 123 456))
(struct-ref s 0) => 123
make-struct is used to create a vtable from the vtable-vtable. The first initializer is a layout object (field vtable-index-layout), usually obtained from make-struct-layout (below). An optional second initializer is a printer function (field vtable-index-printer), used as described under make-vtable (see Vtables).
user-fields is a layout string giving extra fields to have in the vtables. A vtable starts with some base fields as per Vtable Contents, and user-fields is appended. The user-fields start at field number vtable-offset-user (below), and exist in both the vtable-vtable and in the vtables created from it. Such fields provide space for “class data”. For example,
(define vt-of-vt (make-vtable-vtable "pw" 0))
(define vt (make-struct vt-of-vt 0))
(struct-set! vt vtable-offset-user "my class data")
tail-size is the size of the tail array in the vtable-vtable itself, if user-fields specifies a tail array. This should be 0 if nothing extra is required or the format has no tail array. The tail array field such as `pW' holds the tail array size, as usual, and is followed by the extra space.
(define vt-vt (make-vtable-vtable "pW" 20))
(define my-vt-tail-start (1+ vtable-offset-user))
(struct-set! vt-vt (+ 3 my-vt-tail-start) "data in tail")
The optional print argument is used by display and write (etc) to print the vtable-vtable and any vtables created from it. It's called as (print vtable port) and should look at vtable and write to port. The default is the usual structure print function, which just gives machine addresses.
Return a structure layout symbol, from a fields string. fields is as described under make-vtable (see Vtables). An invalid fields string is an error.
(make-struct-layout "prpW") => prpW
(make-struct-layout "blah") => ERROR
The first field in a vtable which is available for application use. Such fields only exist when specified by user-fields in make-vtable-vtable above.
Here's an extended vtable-vtable example, creating classes of “balls”. Each class has a “colour”, which is fixed. Instances of those classes are created, and each such ball has an “owner”, which can be changed.
(define ball-root (make-vtable-vtable "pr" 0))
(define (make-ball-type ball-color)
  (make-struct ball-root 0
               (make-struct-layout "pw")
               (lambda (ball port)
                 (format port "#<a ~A ball owned by ~A>"
                         (color ball)
                         (owner ball)))
               ball-color))
(define (color ball) (struct-ref (struct-vtable ball) vtable-offset-user))
(define (owner ball) (struct-ref ball 0))
(define red (make-ball-type 'red))
(define green (make-ball-type 'green))
(define (make-ball type owner) (make-struct type 0 owner))
(define ball (make-ball green 'Nisse))
ball => #<a green ball owned by Nisse>
A dictionary object is a data structure used to index
information in a user-defined way. In standard Scheme, the main
aggregate data types are lists and vectors. Lists are not really
indexed at all, and vectors are indexed only by number
(e.g. (vector-ref foo 5)). Often you will find it useful
to index your data on some other type; for example, in a library
catalog you might want to look up a book by the name of its
author. Dictionaries are used to help you organize information in
such a way.
An association list (or alist for short) is a list of
key-value pairs. Each pair represents a single quantity or
object; the car of the pair is a key which is used to
identify the object, and the cdr is the object's value.
A hash table also permits you to index objects with arbitrary keys, but in a way that makes looking up any one object extremely fast. A well-designed hash system makes hash table lookups almost as fast as conventional array or vector references.
Alists are popular among Lisp programmers because they use only the language's primitive operations (lists, car, cdr and the equality primitives). No changes to the language core are necessary. Therefore, with Scheme's built-in list manipulation facilities, it is very convenient to handle data stored in an association list. Also, alists are highly portable and can be easily implemented on even the most minimal Lisp systems.
However, alists are inefficient, especially for storing large quantities of data. Because we want Guile to be useful for large software systems as well as small ones, Guile provides a rich set of tools for using either association lists or hash tables.
An association list is a conventional data structure that is often used
to implement simple key-value databases. It consists of a list of
entries in which each entry is a pair. The key of each entry is
the car of the pair and the value of each entry is the
cdr.
ASSOCIATION LIST ::= '( (KEY1 . VALUE1)
(KEY2 . VALUE2)
(KEY3 . VALUE3)
...
)
Association lists are also known, for short, as alists.
The structure of an association list is just one example of the infinite
number of possible structures that can be built using pairs and lists.
As such, the keys and values in an association list can be manipulated
using the general list structure procedures cons, car,
cdr, set-car!, set-cdr! and so on. However,
because association lists are so useful, Guile also provides specific
procedures for manipulating them.
All of Guile's dedicated association list procedures, apart from
acons, come in three flavours, depending on the level of equality
that is required to decide whether an existing key in the association
list is the same as the key that the procedure call uses to identify the
required entry.
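Procedures with assq in their name use eq? to determine key equality.
Procedures with assv in their name use eqv? to determine key equality.
Procedures with assoc in their name use equal? to determine key equality.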
acons is an exception because it is used to build association
lists which do not require their entries' keys to be unique.
acons adds a new entry to an association list and returns the
combined association list. The combined alist is formed by consing the
new entry onto the head of the alist specified in the acons
procedure call. So the specified alist is not modified, but its
contents become shared with the tail of the combined alist that
acons returns.
In the most common usage of acons, a variable holding the
original association list is updated with the combined alist:
(set! address-list (acons name address address-list))
In such cases, it doesn't matter that the old and new values of
address-list share some of their contents, since the old value is
usually no longer independently accessible.
Note that acons adds the specified new entry regardless of
whether the alist may already contain entries with keys that are, in
some sense, the same as that of the new entry. Thus acons is
ideal for building alists where there is no concept of key uniqueness.
(set! task-list (acons 3 "pay gas bill" '()))
task-list
=>
((3 . "pay gas bill"))
(set! task-list (acons 3 "tidy bedroom" task-list))
task-list
=>
((3 . "tidy bedroom") (3 . "pay gas bill"))
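The sharing described above can be observed directly. This is a small sketch with made-up variable names (base and extended): the alist returned by acons has the original alist as its tail, and the original is left untouched.

```scheme
(define base (acons 'b 2 '()))        ; ((b . 2))
(define extended (acons 'a 1 base))   ; ((a . 1) (b . 2))

base                       ; => ((b . 2)), unchanged by the second acons
(eq? (cdr extended) base)  ; => #t, the tail is shared, not copied
```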
assq-set!, assv-set! and assoc-set! are used to add
or replace an entry in an association list where there is a
concept of key uniqueness. If the specified association list already
contains an entry whose key is the same as that specified in the
procedure call, the existing entry is replaced by the new one.
Otherwise, the new entry is consed onto the head of the old association
list to create the combined alist. In all cases, these procedures
return the combined alist.
assq-set! and friends may destructively modify the
structure of the old association list in such a way that an existing
variable is correctly updated without having to set! it to the
value returned:
address-list
=>
(("mary" . "34 Elm Road") ("james" . "16 Bow Street"))
(assoc-set! address-list "james" "1a London Road")
=>
(("mary" . "34 Elm Road") ("james" . "1a London Road"))
address-list
=>
(("mary" . "34 Elm Road") ("james" . "1a London Road"))
Or they may not:
(assoc-set! address-list "bob" "11 Newington Avenue")
=>
(("bob" . "11 Newington Avenue") ("mary" . "34 Elm Road")
("james" . "1a London Road"))
address-list
=>
(("mary" . "34 Elm Road") ("james" . "1a London Road"))
The only safe way to update an association list variable when adding or
replacing an entry like this is to set! the variable to the
returned value:
(set! address-list
(assoc-set! address-list "bob" "11 Newington Avenue"))
address-list
=>
(("bob" . "11 Newington Avenue") ("mary" . "34 Elm Road")
("james" . "1a London Road"))
Because of this slight inconvenience, you may find it more convenient to use hash tables to store dictionary data. If your application will not be modifying the contents of an alist very often, this may not make much difference to you.
If you need to keep the old value of an association list in a form
independent from the list that results from modification by
acons, assq-set!, assv-set! or assoc-set!,
use list-copy to copy the old association list before modifying
it.
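One caveat is worth sketching. list-copy duplicates only the top-level list, so the (key . value) pairs themselves are still shared with the copy. When a replacement is carried out by changing the cdr of an existing entry, as Guile's assq-set! family does in practice, the change is visible through a list-copy snapshot too. The helper copy-alist-entries below is a hypothetical name introduced for this illustration; it copies each entry pair as well.

```scheme
;; Hypothetical helper: copy the list and each (key . value) pair.
(define (copy-alist-entries alist)
  (map (lambda (entry) (cons (car entry) (cdr entry))) alist))

(define address-list (list (cons "mary" "34 Elm Road")))
(define spine-copy (list-copy address-list))
(define entry-copy (copy-alist-entries address-list))

(set! address-list (assoc-set! address-list "mary" "57 Pine Drive"))

(assoc-ref address-list "mary")  ; => "57 Pine Drive"
(assoc-ref spine-copy "mary")    ; => "57 Pine Drive" (entry pair was shared)
(assoc-ref entry-copy "mary")    ; => "34 Elm Road"
```

A list-copy snapshot is still sufficient when the only later changes are additions with acons or removals.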
Add a new key-value pair to alist. A new pair is created whose car is key and whose cdr is value, and the pair is consed onto alist, and the new list is returned. This function is not destructive; alist is not modified.
Reassociate key in alist with value: find any existing alist entry for key and associate it with the new value. If alist does not contain an entry for key, add a new one. Return the (possibly new) alist.
These functions do not attempt to verify the structure of alist, and so may cause unusual results if passed an object that is not an association list.
assq, assv and assoc find the entry in an alist
for a given key, and return the (key . value) pair.
assq-ref, assv-ref and assoc-ref do a similar
lookup, but return just the value.
Return the first entry in alist with the given key. The return is the pair (KEY . VALUE) from alist. If there's no matching entry the return is #f.
assq compares keys with eq?, assv uses eqv? and assoc uses equal?. See also SRFI-1 which has an extended assoc (SRFI-1 Association Lists).
Return the value from the first entry in alist with the given key, or #f if there's no such entry.
assq-ref compares keys with eq?, assv-ref uses eqv? and assoc-ref uses equal?.
Notice these functions have the key argument last, like other -ref functions, but this is opposite to what assq etc above use.
When the return is #f it can be either key not found, or an entry which happens to have value #f in the cdr. Use assq etc above if you need to differentiate these cases.
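The ambiguity can be illustrated with a short sketch. The alist settings and the helper alist-has-key? are hypothetical names used only here.

```scheme
(define settings (list (cons 'verbose #f) (cons 'depth 3)))

;; assq-ref cannot tell a stored #f from a missing key.
(assq-ref settings 'verbose)   ; => #f
(assq-ref settings 'missing)   ; => #f

;; assq returns the whole entry, so the two cases differ.
(assq 'verbose settings)       ; => (verbose . #f)
(assq 'missing settings)       ; => #f

;; Hypothetical helper built on assq.
(define (alist-has-key? alist key)
  (pair? (assq key alist)))

(alist-has-key? settings 'verbose)  ; => #t
(alist-has-key? settings 'missing)  ; => #f
```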
To remove the element from an association list whose key matches a
specified key, use assq-remove!, assv-remove! or
assoc-remove! (depending, as usual, on the level of equality
required between the key that you specify and the keys in the
association list).
As with assq-set! and friends, the specified alist may or may not
be modified destructively, and the only safe way to update a variable
containing the alist is to set! it to the value that
assq-remove! and friends return.
address-list
=>
(("bob" . "11 Newington Avenue") ("mary" . "34 Elm Road")
("james" . "1a London Road"))
(set! address-list (assoc-remove! address-list "mary"))
address-list
=>
(("bob" . "11 Newington Avenue") ("james" . "1a London Road"))
Note that, when assq/v/oc-remove! is used to modify an
association list that has been constructed only using the corresponding
assq/v/oc-set!, there can be at most one matching entry in the
alist, so the question of multiple entries being removed in one go does
not arise. If assq/v/oc-remove! is applied to an association
list that has been constructed using acons, or an
assq/v/oc-set! with a different level of equality, or any mixture
of these, it removes only the first matching entry from the alist, even
if the alist might contain further matching entries. For example:
(define address-list '())
(set! address-list (assq-set! address-list "mary" "11 Elm Street"))
(set! address-list (assq-set! address-list "mary" "57 Pine Drive"))
address-list
=>
(("mary" . "57 Pine Drive") ("mary" . "11 Elm Street"))
(set! address-list (assoc-remove! address-list "mary"))
address-list
=>
(("mary" . "11 Elm Street"))
In this example, the two instances of the string "mary" are not the same
when compared using eq?, so the two assq-set! calls add
two distinct entries to address-list. When compared using
equal?, both "mary"s in address-list are the same as the
"mary" in the assoc-remove! call, but assoc-remove! stops
after removing the first matching entry that it finds, and so one of the
"mary" entries is left in place.
Delete the first entry in alist associated with key, and return the resulting alist.
sloppy-assq, sloppy-assv and sloppy-assoc behave
like the corresponding non-sloppy- procedures, except that they
return #f when the specified association list is not well-formed,
where the non-sloppy- versions would signal an error.
Specifically, there are two conditions for which the non-sloppy-
procedures signal an error, which the sloppy- procedures handle
instead by returning #f. Firstly, if the specified alist as a
whole is not a proper list:
(assoc "mary" '((1 . 2) ("key" . "door") . "open sesame"))
=>
ERROR: In procedure assoc in expression (assoc "mary" (quote #)):
ERROR: Wrong type argument in position 2 (expecting association list): ((1 . 2) ("key" . "door") . "open sesame")
(sloppy-assoc "mary" '((1 . 2) ("key" . "door") . "open sesame"))
=>
#f
Secondly, if one of the entries in the specified alist is not a pair:
(assoc 2 '((1 . 1) 2 (3 . 9)))
=>
ERROR: In procedure assoc in expression (assoc 2 (quote #)):
ERROR: Wrong type argument in position 2 (expecting association list): ((1 . 1) 2 (3 . 9))
(sloppy-assoc 2 '((1 . 1) 2 (3 . 9)))
=>
#f
Unless you are explicitly working with badly formed association lists,
it is much safer to use the non-sloppy- procedures, because they
help to highlight coding and data errors that the sloppy-
versions would silently cover up.
Behaves like assq but does not do any error checking. Recommended only for use in Guile internals.
Behaves like assv but does not do any error checking. Recommended only for use in Guile internals.
Behaves like assoc but does not do any error checking. Recommended only for use in Guile internals.
Here is a longer example of how alists may be used in practice.
(define capitals '(("New York" . "Albany")
("Oregon" . "Salem")
("Florida" . "Miami")))
;; What's the capital of Oregon?
(assoc "Oregon" capitals) => ("Oregon" . "Salem")
(assoc-ref capitals "Oregon") => "Salem"
;; We left out South Dakota.
(set! capitals
(assoc-set! capitals "South Dakota" "Pierre"))
capitals
=> (("South Dakota" . "Pierre")
("New York" . "Albany")
("Oregon" . "Salem")
("Florida" . "Miami"))
;; And we got Florida wrong.
(set! capitals
(assoc-set! capitals "Florida" "Tallahassee"))
capitals
=> (("South Dakota" . "Pierre")
("New York" . "Albany")
("Oregon" . "Salem")
("Florida" . "Tallahassee"))
;; After Oregon secedes, we can remove it.
(set! capitals
(assoc-remove! capitals "Oregon"))
capitals
=> (("South Dakota" . "Pierre")
("New York" . "Albany")
("Florida" . "Tallahassee"))
Hash tables are dictionaries which offer functionality similar to association lists: they provide a mapping from keys to values. The difference is that association lists need time linear in the number of elements when searching for entries, whereas hash tables can normally search in constant time. The drawback is that hash tables require a little more memory, and that you cannot use the normal list procedures (see Lists) for working with them.
Guile provides two types of hash tables. One is an abstract data type that can only be manipulated with the functions in this section. The other type is concrete: it uses a normal vector with alists as elements. The advantage of the abstract hash tables is that they will be automatically resized when they become too full or too empty.
For demonstration purposes, this section gives a few usage examples of some hash table procedures, together with some explanation of what they do.
First we start by creating a new hash table with 31 slots, and populate it with two key/value pairs.
(define h (make-hash-table 31))
;; This is an opaque object
h
=>
#<hash-table 0/31>
;; We can also use a vector of alists.
(define h (make-vector 7 '()))
h
=>
#(() () () () () () ())
;; Inserting into a hash table can be done with hashq-set!
(hashq-set! h 'foo "bar")
=>
"bar"
(hashq-set! h 'braz "zonk")
=>
"zonk"
;; Or with hashq-create-handle!
(hashq-create-handle! h 'frob #f)
=>
(frob . #f)
;; The vector now contains three elements in the alists and the frob
;; entry is at index (hashq 'frob 7).
h
=>
#(() () () () ((frob . #f) (braz . "zonk")) () ((foo . "bar")))
(hashq 'frob 7)
=>
4
You can get the value for a given key with the procedure
hashq-ref, but the problem with this procedure is that you
cannot reliably determine whether a key exists in the table. The
reason is that the procedure returns #f if the key is not in
the table, but it will return the same value if the key is in the
table and just happens to have the value #f, as you can see in
the following examples.
(hashq-ref h 'foo)
=>
"bar"
(hashq-ref h 'frob)
=>
#f
(hashq-ref h 'not-there)
=>
#f
Better is to use the procedure hashq-get-handle, which makes a
distinction between the two cases. Just like assq, this
procedure returns a key/value-pair on success, and #f if the
key is not found.
(hashq-get-handle h 'foo)
=>
(foo . "bar")
(hashq-get-handle h 'not-there)
=>
#f
There is no procedure for calculating the number of key/value-pairs in
a hash table, but hash-fold can be used for doing exactly that.
(hash-fold (lambda (key value seed) (+ 1 seed)) 0 h)
=>
3
Like the association list functions, the hash table functions come in
several varieties, according to the equality test used for the keys.
Plain hash- functions use equal?, hashq-
functions use eq?, hashv- functions use eqv?, and
the hashx- functions use an application supplied test.
A single make-hash-table creates a hash table suitable for use
with any set of functions, but it's imperative that just one set is
then used consistently, or results will be unpredictable.
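The effect of the equality test can be sketched with string keys. The table and variable names below are illustrative only. string-copy is used to guarantee two distinct string objects with the same contents.

```scheme
(define by-identity (make-hash-table))  ; used only with hashq- functions
(define by-content  (make-hash-table))  ; used only with hash- functions

(define key (string-copy "apple"))      ; a fresh string object
(hashq-set! by-identity key 1)
(hash-set!  by-content  key 1)

;; A second string with the same characters is equal? but not eq?.
(define same-text (string-copy "apple"))

(hashq-ref by-identity key)        ; => 1
(hashq-ref by-identity same-text)  ; => #f
(hash-ref  by-content  same-text)  ; => 1
```

Each table is accessed with a single family of functions throughout, in line with the rule above.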
Hash tables are implemented as a vector indexed by a hash value formed
from the key, with an association list of key/value pairs for each
bucket in case distinct keys hash together. Direct access to the
pairs in those lists is provided by the -handle- functions.
The abstract kind of hash table hides the vector in an opaque object
that represents the hash table, while for the concrete kind the vector
is the hash table.
When the number of table entries in an abstract hash table goes above a threshold, the vector is made larger and the entries are rehashed, to prevent the bucket lists from becoming too long and slowing down accesses. When the number of entries goes below a threshold, the vector is shrunk to save space.
An abstract hash table is created with make-hash-table. To
create a vector that is suitable as a hash table, use
(make-vector size '()), for example.
For the hashx- “extended” routines, an application supplies a
hash function producing an integer index like hashq etc
below, and an assoc alist search function like assq etc
(see Retrieving Alist Entries). Here's an example of such
functions implementing case-insensitive hashing of string keys,
(use-modules (srfi srfi-1)
             (srfi srfi-13))
(define (my-hash str size)
  (remainder (string-hash-ci str) size))
(define (my-assoc str alist)
  (find (lambda (pair) (string-ci=? str (car pair))) alist))
(define my-table (make-hash-table))
(hashx-set! my-hash my-assoc my-table "foo" 123)
(hashx-ref my-hash my-assoc my-table "FOO")
=> 123
In a hashx- hash function the aim is to spread keys
across the vector, so bucket lists don't become long. But the actual
values are arbitrary as long as they're in the range 0 to
size-1. Helpful functions for forming a hash value, in
addition to hashq etc below, include symbol-hash
(see Symbol Keys), string-hash and string-hash-ci
(see String Comparison), and char-set-hash
(see Character Set Predicates/Comparison).
Create a new abstract hash table object, with an optional minimum vector size.
When size is given, the table vector will still grow and shrink automatically, as described above, but with size as a minimum. If an application knows roughly how many entries the table will hold then it can use size to avoid rehashing when initial entries are added.
Return #t if obj is an abstract hash table object.
Remove all items from table (without triggering a resize).
Look up key in the given hash table, and return the associated value. If key is not found, return dflt, or #f if dflt is not given.
Associate val with key in the given hash table. If key is already present then its associated value is changed. If it's not present then a new entry is created.
Remove any association for key in the given hash table. If key is not in table then nothing is done.
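A brief sketch of these three operations used together follows, tallying words in an abstract hash table. The table counts and the helper count-word! are hypothetical names for this illustration. The dflt argument of hash-ref supplies the starting tally for a word not seen before.

```scheme
(define counts (make-hash-table))

;; Hypothetical helper: bump the tally for word, starting from 0.
(define (count-word! word)
  (hash-set! counts word (+ 1 (hash-ref counts word 0))))

(for-each count-word! '("tea" "milk" "tea"))

(hash-ref counts "tea")          ; => 2
(hash-ref counts "sugar")        ; => #f, no default given
(hash-ref counts "sugar" 0)      ; => 0
(hash-remove! counts "milk")
(hash-ref counts "milk" 'gone)   ; => gone
```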
Return a hash value for key. This is a number in the range 0 to size-1, which is suitable for use in a hash table of the given size.
Note that hashq and hashv may use internal addresses of objects, so if an object is garbage collected and re-created it can have a different hash value, even when the two are notionally eq?. For instance with symbols,
(hashq 'something 123) => 19
(gc)
(hashq 'something 123) => 62
In normal use this is not a problem, since an object entered into a hash table won't be garbage collected until removed. It's only if hashing calculations are somehow separated from normal references that its lifetime needs to be considered.
Return the (key . value) pair for key in the given hash table, or #f if key is not in table.
Return the (key . value) pair for key in the given hash table. If key is not in table then create an entry for it with init as the value, and return that pair.
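Because a handle is the actual pair stored in the table, changing its cdr updates the entry in place. The following sketch, using a hypothetical table named stats, keeps a counter this way.

```scheme
(define stats (make-hash-table))

;; Creates (hits . 0) on first use and returns the existing pair afterwards.
(define handle (hashq-create-handle! stats 'hits 0))
(set-cdr! handle (+ 1 (cdr handle)))

(hashq-ref stats 'hits)                       ; => 1
(eq? handle (hashq-get-handle stats 'hits))   ; => #t, same pair
(cdr (hashq-create-handle! stats 'hits 99))   ; => 1, init is ignored
```

Only the cdr of a handle should be changed. Altering the car would leave the entry filed under the hash of its old key.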
Apply proc to the entries in the given hash table. Each call is (proc key value). hash-map->list returns a list of the results from these calls, hash-for-each discards the results and returns an unspecified value.
Calls are made over the table entries in an unspecified order, and for hash-map->list the order of the values in the returned list is unspecified. Results will be unpredictable if table is modified while iterating.
For example the following returns a new alist comprising all the entries from mytable, in no particular order.
(hash-map->list cons mytable)
Apply proc to the entries in the given hash table. Each call is (proc handle), where handle is a (key . value) pair. Return an unspecified value.
hash-for-each-handle differs from hash-for-each only in the argument list of proc.
Accumulate a result by applying proc to the elements of the given hash table. Each call is (proc key value prior-result), where key and value are from the table and prior-result is the return from the previous proc call. For the first call, prior-result is the given init value.
Calls are made over the table entries in an unspecified order. Results will be unpredictable if table is modified while hash-fold is running.
For example, the following returns a count of how many keys in mytable are strings.
(hash-fold (lambda (key value prior)
             (if (string? key) (1+ prior) prior))
           0 mytable)
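Since the iteration order is unspecified, code that needs a stable order has to impose one itself. As a sketch, with a hypothetical table ages, the entries can be collected with hash-map->list and then sorted by key. An order-independent reduction such as a sum can use hash-fold directly.

```scheme
(define ages (make-hash-table))
(hash-set! ages "mary" 34)
(hash-set! ages "bob" 27)
(hash-set! ages "james" 41)

;; Collect the entries as an alist, then sort by key for a stable order.
(define sorted-entries
  (sort (hash-map->list cons ages)
        (lambda (a b) (string<? (car a) (car b)))))

sorted-entries   ; => (("bob" . 27) ("james" . 41) ("mary" . 34))

;; Sum the values; the result does not depend on traversal order.
(define total-age
  (hash-fold (lambda (key value total) (+ value total)) 0 ages))

total-age        ; => 102
```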
This chapter contains reference information related to defining and working with smobs. See Defining New Types (Smobs) for a tutorial-like introduction to smobs.
This function adds a new smob type, named name, with instance size size, to the system. The return value is a tag that is used in creating instances of the type.
If size is 0, the default free function will do nothing.
If size is not 0, the default free function will deallocate the memory block pointed to by SCM_SMOB_DATA with scm_gc_free. The WHAT parameter in the call to scm_gc_free will be NAME.
Default values are provided for the mark, free, print, and equalp functions, as described in Defining New Types (Smobs). If you want to customize any of these functions, the call to scm_make_smob_type should be immediately followed by calls to one or several of scm_set_smob_mark, scm_set_smob_free, scm_set_smob_print, and/or scm_set_smob_equalp.
This function sets the smob marking procedure for the smob type specified by the tag tc. tc is the tag returned by scm_make_smob_type.
The mark procedure must cause scm_gc_mark to be called for every SCM value that is directly referenced by the smob instance obj. One of these SCM values can be returned from the procedure and Guile will call scm_gc_mark for it. This can be used to avoid deep recursions for smob instances that form a list.
It must not call any libguile function or macro except scm_gc_mark, SCM_SMOB_FLAGS, SCM_SMOB_DATA, SCM_SMOB_DATA_2, and SCM_SMOB_DATA_3.
This function sets the smob freeing procedure for the smob type specified by the tag tc. tc is the tag returned by scm_make_smob_type.
The free procedure must deallocate all resources that are directly associated with the smob instance obj. It must assume that all SCM values that it references have already been freed and are thus invalid.
It must also not call any libguile function or macro except scm_gc_free, SCM_SMOB_FLAGS, SCM_SMOB_DATA, SCM_SMOB_DATA_2, and SCM_SMOB_DATA_3.
The free procedure must return 0.
This function sets the smob printing procedure for the smob type specified by the tag tc. tc is the tag returned by scm_make_smob_type.
The print procedure should output a textual representation of the smob instance obj to port, using information in pstate.
The textual representation should be of the form #<name ...>. This ensures that read will not interpret it as some other Scheme value.
It is often best to ignore pstate and just print to port with scm_display, scm_write, scm_simple_format, and scm_puts.
This function sets the smob equality-testing predicate for the smob type specified by the tag tc. tc is the tag returned by scm_make_smob_type.
The equalp procedure should return SCM_BOOL_T when obj1 is equal? to obj2. Otherwise it should return SCM_BOOL_F. Both obj1 and obj2 are instances of the smob type tc.
When val is a smob of the type indicated by tag, do nothing. Else, signal an error.
Return true iff exp is a smob instance of the type indicated by tag. The expression exp can be evaluated more than once, so it shouldn't contain any side effects.
Make value contain a smob instance of the type with tag tag and smob data data, data2, and data3, as appropriate.
The tag is what has been returned by scm_make_smob_type. The initial values data, data2, and data3 are of type scm_t_bits; when you want to use them for SCM values, these values need to be converted to a scm_t_bits first by using SCM_UNPACK.
The flags of the smob instance start out as zero.
Since it is often the case (e.g., in smob constructors) that you will create a smob instance and return it, there is also a slightly specialized macro for this situation:
This macro expands to a block of code that creates a smob instance of the type with tag tag and smob data data, data2, and data3, as with SCM_NEWSMOB, etc., and causes the surrounding function to return that SCM value. It should be the last piece of code in a block.
Return the 16 extra bits of the smob obj. No meaning is predefined for these bits; you can use them freely.
Set the 16 extra bits of the smob obj to flags. No meaning is predefined for these bits; you can use them freely.
Return the first (second, third) immediate word of the smob obj as a scm_t_bits value. When the word contains a SCM value, use SCM_SMOB_OBJECT (etc.) instead.
Set the first (second, third) immediate word of the smob obj to val. When the word should be set to a SCM value, use SCM_SMOB_SET_OBJECT (etc.) instead.
Return the first (second, third) immediate word of the smob obj as a SCM value. When the word contains a scm_t_bits value, use SCM_SMOB_DATA (etc.) instead.
Set the first (second, third) immediate word of the smob obj to val. When the word should be set to a scm_t_bits value, use SCM_SMOB_SET_DATA (etc.) instead.
Return a pointer to the first (second, third) immediate word of the smob obj. Note that this is a pointer to SCM. If you need to work with scm_t_bits values, use SCM_PACK and SCM_UNPACK, as appropriate.
Mark the references in the smob x, assuming that x's first data word contains an ordinary Scheme object, and x refers to no other objects. This function simply returns x's first data word.
A lambda expression evaluates to a procedure. The environment
which is in effect when a lambda expression is evaluated is
enclosed in the newly created procedure; this is referred to as a
closure (see About Closure).
When a procedure created by lambda is called with some actual
arguments, the environment enclosed in the procedure is extended by
binding the variables named in the formal argument list to new locations
and storing the actual arguments into these locations. Then the body of
the lambda expression is evaluated sequentially. The result of
the last expression in the procedure body is then the result of the
procedure invocation.
The following examples will show how procedures can be created using
lambda, and what you can do with these procedures.
(lambda (x) (+ x x)) => a procedure
((lambda (x) (+ x x)) 4) => 8
The fact that the environment in effect when creating a procedure is enclosed in the procedure is shown with this example:
(define add4
(let ((x 4))
(lambda (y) (+ x y))))
(add4 6) => 10
formals should be a formal argument list as described in the following table.
- (variable1 ...)
- The procedure takes a fixed number of arguments; when the procedure is called, the arguments will be stored into the newly created locations for the formal variables.
- variable
- The procedure takes any number of arguments; when the procedure is called, the sequence of actual arguments will be converted into a list and stored into the newly created location for the formal variable.
- (variable1 ... variablen . variablen+1)
- If a space-delimited period precedes the last variable, then the procedure takes n or more arguments, where n is the number of formal arguments before the period. There must be at least one argument before the period. The first n actual arguments will be stored into the newly allocated locations for the first n formal arguments, and the sequence of the remaining actual arguments is converted into a list and then stored into the location for the last formal argument. If there are exactly n actual arguments, the empty list is stored into the location of the last formal argument.
The list in variable or variablen+1 is always newly created and the procedure can modify it if desired. This is the case even when the procedure is invoked via apply; the required part of the list argument there will be copied (see Procedures for On the Fly Evaluation).
body is a sequence of Scheme expressions which are evaluated in order when the procedure is invoked.
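The three shapes of formals can be compared side by side in a short sketch. The procedure names fixed, collect and head+rest are chosen only for this illustration.

```scheme
;; Fixed arity: exactly two arguments.
(define fixed (lambda (a b) (list a b)))
(fixed 1 2)            ; => (1 2)

;; A bare variable: all arguments arrive as one fresh list.
(define collect (lambda args args))
(collect)              ; => ()
(collect 1 2 3)        ; => (1 2 3)

;; Dotted list: two required arguments, the remainder as a list.
(define head+rest (lambda (a b . rest) (list a b rest)))
(head+rest 1 2)        ; => (1 2 ())
(head+rest 1 2 3 4)    ; => (1 2 (3 4))
```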
Procedures written in C can be registered for use from Scheme,
provided they take only arguments of type SCM and return
SCM values. scm_c_define_gsubr is likely to be the most
useful mechanism, combining the process of registration
(scm_c_make_gsubr) and definition (scm_define).
Register a C procedure fcn as a “subr”, a primitive subroutine that can be called from Scheme. It will be associated with the given name but no environment binding will be created. The arguments req, opt and rst specify the number of required, optional and “rest” arguments respectively. The total number of these arguments should match the actual number of arguments to fcn. The number of rest arguments should be 0 or 1.
scm_c_make_gsubr returns a value of type SCM which is a “handle” for the procedure.
Register a C procedure fcn, as for scm_c_make_gsubr above, and additionally create a top-level Scheme binding for the procedure in the “current environment” using scm_define. scm_c_define_gsubr returns a handle for the procedure in the same way as scm_c_make_gsubr, which is usually not further required.
scm_c_make_gsubr and scm_c_define_gsubr automatically
use scm_c_make_subr and also scm_makcclo if necessary.
It is advisable to use the gsubr variants since they provide a
slightly higher-level abstraction of the Guile implementation.
Scheme procedures, as defined in R5RS, can either handle a fixed number of actual arguments, or a fixed number of actual arguments followed by arbitrarily many additional arguments. Writing procedures of variable arity can be useful, but unfortunately, the syntactic means for handling argument lists of varying length are a bit inconvenient. It is possible to give names to the fixed arguments, but the remaining (optional) arguments can only be referenced as a list of values (see Lambda).
Guile comes with the module (ice-9 optargs), which makes using
optional arguments much more convenient. In addition, this module
provides syntax for handling keywords in argument lists
(see Keywords).
Before using any of the procedures or macros defined in this section,
you have to load the module (ice-9 optargs) with the statement:
(use-modules (ice-9 optargs))
The syntax let-optional and let-optional* are for
destructuring rest argument lists and giving names to the various list
elements. let-optional binds all variables simultaneously, while
let-optional* binds them sequentially, consistent with let
and let* (see Local Bindings).
These two macros give you an optional argument interface that is very Schemey and introduces no fancy syntax. They are compatible with the scsh macros of the same name, but are slightly extended. Each binding may be of one of the forms var or (var default-value). rest-arg should be the rest-argument of the procedures these are used from. The items in rest-arg are sequentially bound to the variable names given. When rest-arg runs out, the remaining vars are bound either to the default values or to #f if no default value was specified. rest-arg remains bound to whatever may have been left of rest-arg.
After binding the variables, the expressions expr ... are evaluated in order.
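A small sketch may help to show how these macros are typically used inside a procedure with a rest argument. The procedures greet and range-of are hypothetical and exist only for this illustration.

```scheme
(use-modules (ice-9 optargs))

;; rest collects whatever follows the required argument name.
(define (greet name . rest)
  (let-optional rest ((greeting "Hello") (punct "!"))
    (string-append greeting ", " name punct)))

(greet "World")             ; => "Hello, World!"
(greet "World" "Hi")        ; => "Hi, World!"
(greet "World" "Hi" "?")    ; => "Hi, World?"

;; With let-optional*, a later default may refer to an earlier variable.
(define (range-of start . rest)
  (let-optional* rest ((end (+ start 10)) (span (- end start)))
    (list start end span)))

(range-of 5)                ; => (5 15 10)
(range-of 5 8)              ; => (5 8 3)
```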
let-keywords and let-keywords* extract values from
keyword style argument lists, binding local variables to those values
or to defaults.
args is evaluated and should give a list of the form (#:keyword1 value1 #:keyword2 value2 ...). The bindings are variables and default expressions, with the variables to be set (by name) from the keyword values. The body forms are then evaluated and the last is the result. An example will make the syntax clearest,
(define args '(#:xyzzy "hello" #:foo "world"))
(let-keywords args #t
              ((foo "default for foo")
               (bar (string-append "default" "for" "bar")))
  (display foo)
  (display ", ")
  (display bar))
-| world, defaultforbar
The binding for foo comes from the #:foo keyword in args. But the binding for bar is the default in the let-keywords, since there's no #:bar in the args.
allow-other-keys? is evaluated and controls whether unknown keywords are allowed in the args list. When true, other keys are ignored (such as #:xyzzy in the example); when #f, an error is thrown for anything unknown.
let-keywords is like let (see Local Bindings) in that all bindings are made at once; the default expressions are evaluated (if needed) outside the scope of the let-keywords.
let-keywords* is like let*: each binding is made successively, and the default expressions see the bindings previously made. This is the style used by lambda* keywords (see lambda* Reference). For example,
(define args '(#:foo 3))
(let-keywords* args #f
               ((foo 99)
                (bar (+ foo 6)))
  (display bar))
-| 9
The expression for each default is only evaluated if it's needed, i.e. if the keyword doesn't appear in args. So one way to make a keyword mandatory is to throw an error of some sort as the default.
(define args '(#:start 7 #:finish 13))
(let-keywords* args #t
               ((start 0)
                (stop (error "missing #:stop argument")))
  ...)
=> ERROR: missing #:stop argument
When using optional and keyword argument lists, lambda for
creating a procedure then let-optional or let-keywords
is a bit lengthy. lambda* combines the features of those
macros into a single convenient syntax.
Create a procedure which takes optional and/or keyword arguments specified with #:optional and #:key. For example,
(lambda* (a b #:optional c d . e) '())
is a procedure with fixed arguments a and b, optional arguments c and d, and rest argument e. If the optional arguments are omitted in a call, the variables for them are bound to #f.
lambda* can also take keyword arguments. For example, a procedure defined like this:
(lambda* (#:key xyzzy larch) '())
can be called with any of the argument lists (#:xyzzy 11), (#:larch 13), (#:larch 42 #:xyzzy 19), (). Whichever arguments are given as keywords are bound to values (and those not given are #f).
Optional and keyword arguments can also have default values to take when not present in a call, by giving a two-element list of variable name and expression. For example in
(lambda* (foo #:optional (bar 42) #:key (baz 73))
  (list foo bar baz))
foo is a fixed argument, bar is an optional argument with default value 42, and baz is a keyword argument with default value 73. Default value expressions are not evaluated unless they are needed, and until the procedure is called.
Normally it's an error if a call has keywords other than those specified by #:key, but adding #:allow-other-keys to the definition (after the keyword argument declarations) will ignore unknown keywords.
If a call has a keyword given twice, the last value is used. For example,
((lambda* (#:key (heads 0) (tails 0))
   (display (list heads tails)))
 #:heads 37 #:tails 42 #:heads 99)
-| (99 42)
#:rest is a synonym for the dotted syntax rest argument. The argument lists (a . b) and (a #:rest b) are equivalent in all respects. This is provided for more similarity to DSSSL, MIT-Scheme and Kawa among others, as well as for refugees from other Lisp dialects.
When #:key is used together with a rest argument, the keyword parameters in a call all remain in the rest list. This is the same as Common Lisp. For example,
((lambda* (#:key (x 0) #:allow-other-keys #:rest r)
   (display r))
 #:x 123 #:y 456)
-| (#:x 123 #:y 456)
#:optional and #:key establish their bindings successively, from left to right, as per let-optional* and let-keywords*. This means default expressions can refer back to prior parameters, for example
(lambda* (start #:optional (end (+ 10 start)))
  (do ((i start (1+ i)))
      ((> i end))
    (display i)))
Just like define has a shorthand notation for defining procedures
(see Lambda Alternatives), define* is provided as an
abbreviation of the combination of define and lambda*.
define*-public is the lambda* version of
define-public; defmacro* and defmacro*-public exist
for defining macros with the improved argument list handling
possibilities. The -public versions not only define the
procedures/macros, but also export them from the current module.
define* and define*-public support optional arguments with a similar syntax to lambda*. They also support arbitrary-depth currying, just like Guile's define. Some examples:
(define* (x y #:optional a (z 3) #:key w . u)
  (display (list y z u)))
defines a procedure x with a fixed argument y, an optional argument a, another optional argument z with default value 3, a keyword argument w, and a rest argument u.
(define*-public ((foo #:optional bar) #:optional baz) '())
This illustrates currying. A procedure foo is defined, which, when called with an optional argument bar, returns a procedure that takes an optional argument baz.
Of course, define*[-public] also supports #:rest and #:allow-other-keys in the same way as lambda*.
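The definitions above do not show any calls, so the following sketch defines two procedures with define* and applies them. The names make-point and log-call are invented for the example, and the optional argument is always supplied before any keyword to keep the calls unambiguous.

```scheme
(use-modules (ice-9 optargs))

;; One required, one optional and one keyword argument.
(define* (make-point x #:optional (y 0) #:key (label "p"))
  (list x y label))

(make-point 1)                  ; => (1 0 "p")
(make-point 1 2)                ; => (1 2 "p")
(make-point 1 2 #:label "q")    ; => (1 2 "q")

;; #:key together with #:rest: the keywords stay in the rest list.
(define* (log-call name #:key (level 'info) #:rest all)
  (list name level all))

(log-call 'start #:level 'debug)   ; => (start debug (#:level debug))
```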
These are just like defmacro and defmacro-public except that they take lambda*-style extended parameter lists, where #:optional, #:key, #:allow-other-keys and #:rest are allowed with the usual semantics. Here is an example of a macro with an optional argument:
(defmacro* transmorgify (a #:optional b) (a 1))
Procedures always have attached the environment in which they were created and information about how to apply them to actual arguments. In addition to that, properties and meta-information can be stored with procedures. The procedures in this section can be used to test whether a given procedure satisfies a condition; and to access and set a procedure's property.
The first group of procedures are predicates to test whether a Scheme
object is a procedure, or a special procedure, respectively.
procedure? is the most general predicate; it returns #t
for any kind of procedure. closure? does not return #t
for primitive procedures, and thunk? only returns #t for
procedures which do not accept any arguments.
Return #t if obj is a procedure.
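A brief sketch of how procedure? and thunk? behave on a few sample values; it uses only core bindings.

```scheme
(procedure? car)               ; => #t   a primitive procedure
(procedure? (lambda (x) x))    ; => #t   a closure
(procedure? 'car)              ; => #f   a symbol, not a procedure

(thunk? (lambda () 'done))     ; => #t   accepts no arguments
(thunk? (lambda (x) x))        ; => #f   requires one argument
(thunk? car)                   ; => #f
```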
Procedure properties are general properties to be attached to procedures. These can be the name of a procedure or other relevant information, such as debug hints.
Return the name of the procedure proc.
Return the source of the procedure proc.
Return the environment of the procedure proc.
Return obj's property list.
Return the property of obj with name key.
Set obj's property list to alist.
In obj's property list, set the property named key to value.
Documentation for a procedure can be accessed with the procedure
procedure-documentation.
Return the documentation string associated with
proc. By convention, if a procedure contains more than one expression and the first expression is a string constant, that string is assumed to contain documentation for that procedure.
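The following sketch shows the documentation-string convention together with the name and property accessors, using Guile's procedure-name, procedure-documentation, procedure-property and set-procedure-property!. The procedure square and the property key unit are invented for the illustration.

```scheme
;; The body has two expressions; the first is a string constant,
;; so it is taken as the documentation string.
(define (square x)
  "Return x multiplied by itself."
  (* x x))

(procedure-name square)             ; => square
(procedure-documentation square)    ; => "Return x multiplied by itself."

;; Attach an arbitrary property and read it back.
(set-procedure-property! square 'unit 'metres)
(procedure-property square 'unit)   ; => metres
```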
A procedure with setter is a special kind of procedure which normally behaves like any accessor procedure, that is a procedure which accesses a data structure. The difference is that this kind of procedure has a so-called setter attached, which is a procedure for storing something into a data structure.
Procedures with setters are treated specially when the procedure appears
in the special form set!. How it works is best shown
by example.
Suppose we have a procedure called foo-ref, which accepts two
arguments, a value of type foo and an integer. The procedure
returns the value stored at the given index in the foo object.
Let f be a variable containing such a foo data
structure.
(foo-ref f 0) => bar
(foo-ref f 1) => braz
Also suppose that a corresponding setter procedure called
foo-set! does exist.
(foo-set! f 0 'bla)
(foo-ref f 0) => bla
Now we could create a new procedure, which is a
procedure with setter, by calling make-procedure-with-setter with
the accessor and setter procedures foo-ref and foo-set!.
Let us call this new procedure foo.
(define foo (make-procedure-with-setter foo-ref foo-set!))
foo can from now on be used to either read from the data
structure stored in f, or to write into the structure.
(set! (foo f 0) 'dum)
(foo f 0) => dum
Create a new procedure which behaves like procedure, but with the associated setter setter.
Return #t if obj is a procedure with an associated setter procedure.
Return the procedure of proc, which must be either a procedure with setter, or an operator struct.
Return the setter of proc, which must be either a procedure with setter or an operator struct.
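Since foo-ref and foo-set! above are only supposed to exist, here is a runnable sketch of the same idea built on vector-ref and vector-set!. The names vref and v are chosen for the example, and it assumes the generalized form of set! is available, as in the example above.

```scheme
;; Pair an accessor with its matching mutator.
(define vref (make-procedure-with-setter vector-ref vector-set!))
(define v (vector 'a 'b 'c))

(vref v 1)                          ; => b
(set! (vref v 1) 'z)                ; calls (vector-set! v 1 'z)
(vref v 1)                          ; => z

(procedure-with-setter? vref)       ; => #t
(procedure-with-setter? vector-ref) ; => #f
(eq? (procedure vref) vector-ref)   ; => #t
(eq? (setter vref) vector-set!)     ; => #t
```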
Macros are objects which cause the expression that they appear in to be transformed in some way before being evaluated. In expressions that are intended for macro transformation, the identifier that names the relevant macro must appear as the first element, like this:
(macro-name macro-args ...)
In Lisp-like languages, the traditional way to define macros is very
similar to procedure definitions. The key differences are that the
macro definition body should return a list that describes the
transformed expression, and that the definition is marked as a macro
definition (rather than a procedure definition) by the use of a
different definition keyword: in Lisp, defmacro rather than
defun, and in Scheme, define-macro rather than
define.
Guile supports this style of macro definition using both defmacro
and define-macro. The only difference between them is how the
macro name and arguments are grouped together in the definition:
(defmacro name (args ...) body ...)
is the same as
(define-macro (name args ...) body ...)
The difference is analogous to the corresponding difference between
Lisp's defun and Scheme's define.
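To make the equivalence concrete, the sketch below defines the same small macro in both styles. The macro names my-unless-1 and my-unless-2 are hypothetical.

```scheme
;; defmacro style: name and argument list are separate.
(defmacro my-unless-1 (test . body)
  `(if ,test #f (begin ,@body)))

;; define-macro style: name and arguments are grouped together.
(define-macro (my-unless-2 test . body)
  `(if ,test #f (begin ,@body)))

(my-unless-1 (> 1 2) 'ran)    ; => ran
(my-unless-2 (> 1 2) 'ran)    ; => ran
(my-unless-2 (< 1 2) 'ran)    ; => #f
```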
false-if-exception, from the boot-9.scm file in the Guile
distribution, is a good example of macro definition using
defmacro:
(defmacro false-if-exception (expr)
`(catch #t
(lambda () ,expr)
(lambda args #f)))
The effect of this definition is that expressions beginning with the
identifier false-if-exception are automatically transformed into
a catch expression following the macro definition specification.
For example:
(false-if-exception (open-input-file "may-not-exist"))
==
(catch #t
(lambda () (open-input-file "may-not-exist"))
(lambda args #f))
syntax-rules System
R5RS defines an alternative system for macro and syntax transformations
using the keywords define-syntax, let-syntax,
letrec-syntax and syntax-rules.
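The main difference between the R5RS system and the traditional macros of the previous section is how the transformation is specified. In R5RS, rather than permitting a macro definition to return an arbitrary expression, the transformation is specified in a pattern language that
- does not require complicated quoting and extraction of components of the source expression using caddr etc.
- is designed such that the bindings associated with identifiers in the transformed expression are well defined, and such that it is impossible for the transformed expression to construct new identifiers.
The last point is commonly referred to as being hygienic: the R5RS
syntax-rules system provides hygienic macros.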
For example, the R5RS pattern language for the false-if-exception
example of the previous section looks like this:
(syntax-rules ()
((_ expr)
(catch #t
(lambda () expr)
(lambda args #f))))
In Guile, the syntax-rules system is provided by the (ice-9
syncase) module. To make these facilities available in your code,
include the expression (use-syntax (ice-9 syncase)) (see Using Guile Modules) before the first usage of define-syntax etc. If
you are writing a Scheme module, you can alternatively include the form
#:use-syntax (ice-9 syncase) in your define-module
declaration (see Creating Guile Modules).
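Putting the pieces together, a complete definition might look like the following sketch. The macro is named my-false-if-exception here, a made-up name, so that it does not shadow Guile's own false-if-exception.

```scheme
;; Needed in Guile 1.8 as described above. Later Guile versions provide
;; define-syntax in the core, in which case this line can be dropped.
(use-syntax (ice-9 syncase))

(define-syntax my-false-if-exception
  (syntax-rules ()
    ((_ expr)
     (catch #t
       (lambda () expr)
       (lambda args #f)))))

(my-false-if-exception (+ 1 2))      ; => 3
(my-false-if-exception (car '()))    ; => #f
```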
syntax-rules Pattern Language
define-syntax: The gist is
(define-syntax <keyword> <transformer-spec>)
makes the <keyword> into a macro so that
(<keyword> ...)
expands at _compile_ or _read_ time (i.e. before any evaluation begins) into some expression that is given by the <transformer-spec>.
syntax-case System
Internally, Guile uses three different flavors of macros. The three flavors are called acro (or syntax), macro and mmacro.
Given the expression
(foo ...)
with foo being some flavor of macro, one of the following things
will happen when the expression is evaluated.
- If foo has been defined to be an acro, the procedure used
in the acro definition of foo is passed the whole expression and
the current lexical environment, and whatever that procedure returns is
the value of evaluating the expression. You can think of this as a
procedure that receives its argument as an unevaluated expression.
- If foo has been defined to be a macro, the procedure used
in the macro definition of foo is passed the whole expression and
the current lexical environment, and whatever that procedure returns is
evaluated again. That is, the procedure should return a valid Scheme
expression.
- If foo has been defined to be a mmacro, the procedure
used in the mmacro definition of foo is passed the whole expression
and the current lexical environment, and whatever that procedure returns
replaces the original expression. Evaluation then starts over from the
new expression that has just been returned.
The key difference between a macro and a mmacro is that the expression returned by a mmacro procedure is remembered (or memoized) so that the expansion does not need to be done again next time the containing code is evaluated.
The primitives procedure->syntax, procedure->macro and
procedure->memoizing-macro are used to construct acros, macros
and mmacros respectively. However, if you do not have a very special
reason to use one of these primitives, you should avoid them: they are
very specific to Guile's current implementation and therefore likely to
change. Use defmacro, define-macro (see Macros) or
define-syntax (see Syntax Rules) instead. (In low level
terms, defmacro, define-macro and define-syntax are
all implemented as mmacros.)
Return a macro which, when a symbol defined to this value appears as the first symbol in an expression, returns the result of applying code to the expression and the environment.
Return a macro which, when a symbol defined to this value appears as the first symbol in an expression, evaluates the result of applying code to the expression and the environment. For example:
(define trace
  (procedure->macro
    (lambda (x env)
      `(set! ,(cadr x) (tracef ,(cadr x) ',(cadr x))))))
(trace foo)
==
(set! foo (tracef foo 'foo))
Return a macro which, when a symbol defined to this value appears as the first symbol in an expression, evaluates the result of applying code to the expression and the environment. procedure->memoizing-macro is the same as procedure->macro, except that the expression returned by code replaces the original macro expression in the memoized form of the containing code.
In the following primitives, acro flavor macros are referred to as syntax transformers.
Return #t if obj is a regular macro, a memoizing macro or a syntax transformer.
Return one of the symbols syntax, macro or macro!, depending on whether m is a syntax transformer, a regular macro, or a memoizing macro, respectively. If m is not a macro, #f is returned.
Return the transformer of the macro m.
Create and return a new pair whose car and cdr are x and y. Any source properties associated with xorig are also associated with the new pair.
This chapter contains information about procedures which are not cleanly tied to a specific data type. Because of their wide range of applications, they are collected in a utility chapter.
There are three kinds of core equality predicates in Scheme, described
below. The same kinds of comparisons arise in other functions, like
memq and friends (see List Searching).
For all three tests, objects of different types are never equal. So
for instance a list and a vector are not equal?, even if their
contents are the same. Exact and inexact numbers are considered
different types too, and are hence not equal even if their values are
the same.
eq? tests just for the same object (essentially a pointer
comparison). This is fast, and can be used when searching for a
particular object, or when working with symbols or keywords (which are
always unique objects).
eqv? extends eq? to look at the value of numbers and
characters. It can for instance be used somewhat like =
(see Comparison) but without an error if one operand isn't a
number.
equal? goes further; it looks (recursively) into the contents
of lists, vectors, etc. This is good for instance on lists that have
been read or calculated in various places and are the same, just not
made up of the same pairs. Such lists look the same (when printed),
and equal? will consider them the same.
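The differences between the three predicates can be seen by applying each of them to the same data, as in this short sketch. The variables a and b are introduced only for the example.

```scheme
(define a (list 1 2 3))
(define b (list 1 2 3))

(eq? a a)               ; => #t   the same object
(eq? a b)               ; => #f   different pairs
(eqv? a b)              ; => #f   eqv? does not look inside lists
(equal? a b)            ; => #t   same contents

(eqv? 2.5 2.5)          ; => #t   same type and value
(eqv? 2 2.0)            ; => #f   exact versus inexact
(eq? 'foo 'foo)         ; => #t   symbols are unique objects
(equal? "abc" "abc")    ; => #t   strings compared by contents
```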
Return #t if x and y are the same object, except for numbers and characters. For example,
(define x (vector 1 2 3))
(define y (vector 1 2 3))
(eq? x x) => #t
(eq? x y) => #f
Numbers and characters are not equal to any other object, but the problem is they're not necessarily eq? to themselves either. This is even so when the number comes directly from a variable,
(let ((n (+ 2 3)))
  (eq? n n)) => *unspecified*
Generally eqv? below should be used when comparing numbers or characters. = (see Comparison) or char=? (see Characters) can be used too.
It's worth noting that end-of-list (), #t, #f, a symbol of a given name, and a keyword of a given name, are unique objects. There's just one of each, so for instance no matter how () arises in a program, it's the same object and can be compared with eq?,
(define x (cdr '(123)))
(define y (cdr '(456)))
(eq? x y) => #t
(define x (string->symbol "foo"))
(eq? x 'foo) => #t
Return 1 when x and y are equal in the sense of eq?, otherwise return 0.
The == operator should not be used on SCM values; an SCM is a C type which cannot necessarily be compared using == (see The SCM Type).
Return #t if x and y are the same object, or for characters and numbers the same value.
On objects except characters and numbers, eqv? is the same as eq? above; it's true if x and y are the same object.
If x and y are numbers or characters, eqv? compares their type and value. An exact number is not eqv? to an inexact number (even if their value is the same).
(eqv? 3 (+ 1 2)) => #t
(eqv? 1 1.0) => #f
Return #t if x and y are the same type, and their contents or value are equal.
For a pair, string, vector, array or structure, equal? compares the contents, and does so using the same equal? recursively, so a deep structure can be traversed.
(equal? (list 1 2 3) (list 1 2 3)) => #t
(equal? (list 1 2 3) (vector 1 2 3)) => #f
For other objects, equal? compares as per eqv? above, which means characters and numbers are compared by type and value (and like eqv?, exact and inexact numbers are not equal?, even if their value is the same).
(equal? 3 (+ 1 2)) => #t
(equal? 1 1.0) => #f
Hash tables are currently only compared as per eq?, so two different tables are not equal?, even if their contents are the same.
equal? does not support circular data structures; it may go into an infinite loop if asked to compare two circular lists or similar.
New application-defined object types (see Defining New Types (Smobs)) have an equalp handler which is called by equal?. This lets an application traverse the contents or control what is considered equal? for two objects of such a type. If there's no such handler, the default is to just compare as per eq?.
It's often useful to associate a piece of additional information with a Scheme object even though that object does not have a dedicated slot available in which the additional information could be stored. Object properties allow you to do just that.
Guile's representation of an object property is a procedure-with-setter
(see Procedures with Setters) that can be used with the generalized
form of set! to set and retrieve that property for any
Scheme object. So, setting a property looks like this:
(set! (my-property obj1) value-for-obj1)
(set! (my-property obj2) value-for-obj2)
And retrieving values of the same property looks like this:
(my-property obj1)
=>
value-for-obj1
(my-property obj2)
=>
value-for-obj2
To create an object property in the first place, use the
make-object-property procedure:
(define my-property (make-object-property))
Create and return an object property. An object property is a procedure-with-setter that can be called in two ways. (set! (property obj) val) sets obj's property to val. (property obj) returns the current setting of obj's property.
A single object property created by make-object-property can
associate distinct property values with all Scheme values that are
distinguishable by eq? (including, for example, integers).
Internally, object properties are implemented using a weak key hash table. This means that, as long as a Scheme value with property values is protected from garbage collection, its property values are also protected. When the Scheme value is collected, its entry in the property table is removed and so the (ex-) property values are no longer protected by the table.
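A complete round trip with an object property might look like the following sketch. The property color and the objects apple and pear are hypothetical; the #f result for an object with no value set is what we would expect from the description of not-found-proc below, stated here as an assumption.

```scheme
(define color (make-object-property))

;; Two distinct objects, distinguishable by eq?.
(define apple (list 'apple))
(define pear (list 'pear))

(set! (color apple) 'red)
(color apple)    ; => red
(color pear)     ; => #f   no value has been set for pear

;; Setting again replaces the previous value.
(set! (color apple) 'green)
(color apple)    ; => green
```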
Create a property token that can be used with primitive-property-ref and primitive-property-set!. See primitive-property-ref for the significance of not-found-proc.
Return the property prop of obj.
When no value has yet been associated with prop and obj, the not-found-proc from prop is used. A call (not-found-proc prop obj) is made and the result set as the property value. If not-found-proc is #f then #f is the property value.
Set the property prop of obj to val.
Remove any value associated with prop and obj.
Traditionally, Lisp systems provide a different object property
interface to that provided by make-object-property, in which the
object property that is being set or retrieved is indicated by a symbol.
Guile includes this older kind of interface as well, but it may well be
removed in a future release, as it is less powerful than
make-object-property and so increases the size of the Guile
library for no benefit. (And it is trivial to write a compatibility
layer in Scheme.)
Return obj's property list.
Set obj's property list to alist.
Return the property of obj with name key.
In obj's property list, set the property named key to value.
Sorting is very important in computer programs. Therefore, Guile comes
with several sorting procedures built-in. As always, procedures with
names ending in ! are side-effecting, that means that they may
modify their parameters in order to produce their results.
The first group of procedures can be used to merge two lists (which must be already sorted on their own) and produce sorted lists containing all elements of the input lists.
Merge two already sorted lists into one. Given two lists alist and blist, such that (sorted? alist less?) and (sorted? blist less?), return a new list in which the elements of alist and blist have been stably interleaved so that (sorted? (merge alist blist less?) less?). Note: this does _not_ accept vectors.
Takes two lists alist and blist such that (sorted? alist less?) and (sorted? blist less?) and returns a new list in which the elements of alist and blist have been stably interleaved so that (sorted? (merge alist blist less?) less?). This is the destructive variant of merge. Note: this does _not_ accept vectors.
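As a short illustration of merging, the sketch below combines two lists that are each already sorted with <. The variable names are hypothetical.

```scheme
;; Both inputs must already be sorted by the same predicate.
(define odds  (list 1 3 5))
(define evens (list 2 4 6))

(define merged (merge odds evens <))        ; (1 2 3 4 5 6)

;; merge! may reuse the pairs of its arguments, so pass lists
;; that are not needed afterwards and use only the return value.
(define merged! (merge! (list 1 4) (list 2 3) <))  ; (1 2 3 4)
```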
The following procedures can operate on sequences which are either
vectors or lists. According to the given arguments, they return sorted
vectors or lists, respectively. The first of the following procedures
determines whether a sequence is already sorted; the others sort a given
sequence. The variants with names starting with stable- are
special in that they preserve a property of the input sequences:
if two or more elements are the same according to the comparison
predicate, they are left in the same order as they appeared in the
input.
Return #t iff items is a list or a vector such that, for each element x and the next element y of items, (less y x) returns #f.
Sort the sequence items, which may be a list or a vector. less is used for comparing the sequence elements. This is not a stable sort.
Sort the sequence items, which may be a list or a vector. less is used for comparing the sequence elements. The sorting is destructive, that means that the input sequence is modified to produce the sorted result. This is not a stable sort.
Sort the sequence items, which may be a list or a vector. less is used for comparing the sequence elements. This is a stable sort.
Sort the sequence items, which may be a list or a vector. less is used for comparing the sequence elements. The sorting is destructive, that means that the input sequence is modified to produce the sorted result. This is a stable sort.
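The difference between these procedures can be seen in a small sketch. The data and the helper car<? are invented for illustration.

```scheme
(define scores (list 40 10 30 20))

(define was-sorted (sorted? scores <))      ; #f
(define ascending  (sort scores <))         ; (10 20 30 40)

;; A stable sort keeps elements that compare as equal in their
;; original relative order.  Here only the car is compared.
(define (car<? x y) (< (car x) (car y)))
(define records (list '(1 . a) '(0 . b) '(1 . c) '(0 . d)))
(define by-key (stable-sort records car<?))
;; by-key is ((0 . b) (0 . d) (1 . a) (1 . c))

;; The destructive variants may modify their argument, so always
;; use the value they return.
(define sorted-vec (sort! (vector 3 1 2) <))  ; #(1 2 3)
```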
The procedures in the last group only accept lists or vectors as input, as their names indicate.
Sort the list items, using less for comparing the list elements. This is a stable sort.
Sort the list items, using less for comparing the list elements. The sorting is destructive, that means that the input list is modified to produce the sorted result. This is a stable sort.
Sort the vector vec, using less for comparing the vector elements. startpos (inclusive) and endpos (exclusive) delimit the range of the vector which gets sorted. The return value is not specified.
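A brief sketch of the type-specific procedures follows, assuming the argument order less, startpos, endpos for restricted-vector-sort! as described above. The variable names are hypothetical.

```scheme
;; sort-list accepts only lists.
(define fruit (sort-list (list "pear" "fig" "apple") string<?))
;; fruit is ("apple" "fig" "pear")

;; restricted-vector-sort! sorts part of a vector in place.
;; Here only the elements at indices 1, 2 and 3 are sorted.
(define v (vector 5 4 3 2 1))
(restricted-vector-sort! v < 1 4)
;; v is now #(5 2 3 4 1)
```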
The procedures for copying lists (see Lists) only produce a flat
copy of the input list, and currently Guile does not even contain
procedures for copying vectors. copy-tree can be used for these
purposes, as it copies not only the spine of a list, but also
any pairs in the cars of the input list.
Recursively copy the data tree that is bound to obj, and return the new data structure.
copy-tree recurses down the contents of both pairs and vectors (since both cons cells and vector cells may point to arbitrary objects), and stops recursing when it hits any other object.
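The following sketch contrasts a flat copy made with list-copy with a deep copy made with copy-tree. The variable names are chosen only for this example.

```scheme
(define original (list (list 1 2) (vector 3 4)))

(define flat (list-copy original))   ; new spine, shared elements
(define deep (copy-tree original))   ; nested pairs and vectors copied too

(define flat-shares-car (eq? (car original) (car flat)))    ; #t
(define deep-shares-car (eq? (car original) (car deep)))    ; #f
(define deep-shares-vec (eq? (cadr original) (cadr deep)))  ; #f

;; Changing the deep copy leaves the original untouched.
(set-car! (car deep) 'changed)
(define original-first (car original))   ; (1 2)
```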
When debugging Scheme programs, but also for providing a human-friendly interface, a procedure for converting any Scheme object into string format is very useful. Conversion from/to strings can of course be done with specialized procedures when the data type of the object to convert is known, but this procedure is often more convenient.
object->string converts an object by using a print procedure for
writing to a string port, and then returning the resulting string.
Converting an object back from the string is only possible if the object
type has a read syntax and the read syntax is preserved by the printing
procedure.
Return a Scheme string obtained by printing obj. The printing function can be specified by the optional second argument printer (default: write).
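As a small sketch of the effect of the printer argument, the example below converts the same data with write (the default) and with display, and then reads the written form back. The variable names are illustrative only.

```scheme
(define data (list 1 'two "three"))

(define written   (object->string data))       ; "(1 two \"three\")"
(define displayed (object->string "hi" display)) ; "hi"
(define quoted    (object->string "hi"))       ; "\"hi\""

;; Output produced with write can be read back, because lists,
;; numbers, symbols and strings all have a read syntax.
(define round-trip (call-with-input-string written read))
```

Here round-trip is equal? to data, whereas output produced with display would lose the string quotes and could not be read back faithfully.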
A hook is a list of procedures to be called at well defined points in time. Typically, an application provides a hook h and promises its users that it will call all of the procedures in h at a defined point in the application's processing. By adding its own procedure to h, an application user can tap into or even influence the progress of the application.
Guile itself provides several such hooks for debugging and customization purposes: these are listed in a subsection below.
When an application first creates a hook, it needs to know how many arguments will be passed to the hook's procedures when the hook is run. The chosen number of arguments (which may be none) is declared when the hook is created, and all the procedures that are added to that hook must be capable of accepting that number of arguments.
A hook is created using make-hook. A procedure can be added to
or removed from a hook using add-hook! or remove-hook!,
and all of a hook's procedures can be removed together using
reset-hook!. When an application wants to run a hook, it does so
using run-hook.
Hook usage is shown by some examples in this section. First, we will define a hook of arity 2 — that is, the procedures stored in the hook will have to accept two arguments.
(define hook (make-hook 2))
hook
=> #<hook 2 40286c90>
Now we are ready to add some procedures to the newly created hook with
add-hook!. In the following example, two procedures are added,
which print different messages and do different things with their
arguments.
(add-hook! hook (lambda (x y)
                  (display "Foo: ")
                  (display (+ x y))
                  (newline)))
(add-hook! hook (lambda (x y)
                  (display "Bar: ")
                  (display (* x y))
                  (newline)))
Once the procedures have been added, we can invoke the hook using
run-hook.
(run-hook hook 3 4)
-| Bar: 12
-| Foo: 7
Note that the procedures are called in the reverse of the order in
which they were added. This is because the default behaviour of
add-hook! is to add its procedure to the front of the
hook's procedure list. You can force add-hook! to add its
procedure to the end of the list instead by providing a third
#t argument on the second call to add-hook!.
(add-hook! hook (lambda (x y)
                  (display "Foo: ")
                  (display (+ x y))
                  (newline)))
(add-hook! hook (lambda (x y)
                  (display "Bar: ")
                  (display (* x y))
                  (newline))
           #t)             ; <- Change here!
(run-hook hook 3 4)
-| Foo: 7
-| Bar: 12
When you create a hook with make-hook, you must specify the arity
of the procedures which can be added to the hook. If the arity is not
given explicitly as an argument to make-hook, it defaults to
zero. All procedures of a given hook must have the same arity, and when
the procedures are invoked using run-hook, the number of
arguments passed must match the arity specified at hook creation time.
The order in which procedures are added to a hook matters. If the third
parameter to add-hook! is omitted or is equal to #f, the
procedure is added in front of the procedures which might already be on
that hook, otherwise the procedure is added at the end. The procedures
are always called from the front to the end of the list when they are
invoked via run-hook.
The ordering of the list of procedures returned by hook->list
matches the order in which those procedures would be called if the hook
was run using run-hook.
Note that the C functions in the following entries are for handling Scheme-level hooks in C. There are also C-level hooks which have their own interface (see C Hooks).
Create a hook for storing procedures of arity n_args. n_args defaults to zero. The returned value is a hook object to be used with the other hook procedures.
Return #t if hook is an empty hook, #f otherwise.
Add the procedure proc to the hook hook. The procedure is added to the end if append_p is true, otherwise it is added to the front. The return value of this procedure is not specified.
Remove the procedure proc from the hook hook. The return value of this procedure is not specified.
Remove all procedures from the hook hook. The return value of this procedure is not specified.
Convert the procedure list of hook to a list.
Apply all procedures from the hook hook to the arguments args. The order of the procedure application is first to last. The return value of this procedure is not specified.
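The remaining hook procedures can be exercised together in a short sketch. The hook log-hook and the procedure remember! are hypothetical names used only here.

```scheme
(define log-hook (make-hook 1))
(define seen '())
(define (remember! x) (set! seen (cons x seen)))

(define empty-at-start (hook-empty? log-hook))     ; #t

(add-hook! log-hook remember!)
(run-hook log-hook 'first)                         ; calls remember!
(define procs (hook->list log-hook))               ; list holding remember!

(remove-hook! log-hook remember!)
(run-hook log-hook 'second)                        ; nothing left to call
(define empty-at-end (hook-empty? log-hook))       ; #t
;; seen is now (first)
```

reset-hook! could be used in place of remove-hook! here, since it removes every procedure from the hook at once.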
If, in C code, you are certain that you have a hook object and well
formed argument list for that hook, you can also use
scm_c_run_hook, which is identical to scm_run_hook but
does no type checking.
The same as scm_run_hook but without any type checking to confirm that hook is actually a hook object and that args is a well-formed list matching the arity of the hook.
For C code, SCM_HOOKP is a faster alternative to
scm_hook_p:
Here is an example of how to handle Scheme-level hooks from C code using the above functions.
if (scm_is_true (scm_hook_p (obj)))
  /* handle Scheme-level hook using C functions */
  scm_reset_hook_x (obj);
else
  /* do something else (obj is not a hook) */
  ;
The hooks already described are intended to be populated by Scheme-level procedures. In addition to this, the Guile library provides an independent set of interfaces for the creation and manipulation of hooks that are designed to be populated by functions implemented in C.
The original motivation here was to provide a kind of hook that could safely be invoked at various points during garbage collection. Scheme-level hooks are unsuitable for this purpose as running them could itself require memory allocation, which would then invoke garbage collection recursively ... However, it is also the case that these hooks are easier to work with than the Scheme-level ones if you only want to register C functions with them. So if that is mainly what your code needs to do, you may prefer to use this interface.
To create a C hook, you should allocate storage for a structure of type
scm_t_c_hook and then initialize it using scm_c_hook_init.
Data type for a C hook. The internals of this type should be treated as opaque.
Enumeration of possible hook types, which are:
- SCM_C_HOOK_NORMAL: Type of hook for which all the registered functions will always be called.
- SCM_C_HOOK_OR: Type of hook for which the sequence of registered functions will be called only until one of them returns C true (a non-NULL pointer).
- SCM_C_HOOK_AND: Type of hook for which the sequence of registered functions will be called only until one of them returns C false (a NULL pointer).
Initialize the C hook at memory pointed to by hook. type should be one of the values of the scm_t_c_hook_type enumeration, and controls how the hook functions will be called. hook_data is a closure parameter that will be passed to all registered hook functions when they are called.
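To add or remove a C function from a C hook, use scm_c_hook_add
or scm_c_hook_remove. A hook function must expect three
void * parameters which are, respectively:
- The hook's closure data, as specified when the hook was initialized by scm_c_hook_init.
- The function's closure data, as specified when the function was registered with the hook by scm_c_hook_add.
- The call closure data, as specified in the scm_c_hook_run call that runs the hook.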
Function type for a C hook function: takes three void * parameters and returns a void * result.
Add function func, with function closure data func_data, to the C hook hook. The new function is appended to the hook's list of functions if appendp is non-zero, otherwise prepended.
Remove function func, with function closure data func_data, from the C hook hook.
scm_c_hook_remove checks both func and func_data so as to allow for the same func being registered multiple times with different closure data.
Finally, to invoke a C hook, call the scm_c_hook_run function
specifying the hook and the call closure data for this run:
Run the C hook hook with call closure data data. Subject to the variations for hook types SCM_C_HOOK_OR and SCM_C_HOOK_AND, scm_c_hook_run calls hook's registered functions in turn, passing them the hook's closure data, each function's closure data, and the call closure data.
scm_c_hook_run's return value is the return value of the last function to be called.
Whenever Guile performs a garbage collection, it calls the following hooks in the order shown.
C hook called at the very start of a garbage collection, after setting scm_gc_running_p to 1, but before entering the GC critical section.
If garbage collection is blocked because scm_block_gc is non-zero, GC exits early soon after calling this hook, and no further hooks will be called.
C hook called before beginning the mark phase of garbage collection, after the GC thread has entered a critical section.
C hook called before beginning the sweep phase of garbage collection. This is the same as at the end of the mark phase, since nothing else happens between marking and sweeping.
C hook called after the end of the sweep phase of garbage collection, but while the GC thread is still inside its critical section.
C hook called at the very end of a garbage collection, after the GC thread has left its critical section.
Scheme hook with arity 0. This hook is run asynchronously (see Asyncs) soon after the GC has completed and any other events that were deferred during garbage collection have been processed. (Also accessible from C with the name scm_after_gc_hook.)
All the C hooks listed here have type SCM_C_HOOK_NORMAL, are
initialized with hook closure data NULL, and are invoked by
scm_c_hook_run with call closure data NULL.
The Scheme hook after-gc-hook is particularly useful in
conjunction with guardians (see Guardians). Typically, if you are
using a guardian, you want to call the guardian after garbage collection
to see if any of the objects added to the guardian have been collected.
By adding a thunk that performs this call to after-gc-hook, you
can ensure that your guardian is tested after every garbage collection
cycle.
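The pattern just described might be sketched as follows. The names guardian, reclaimed-count and drain-guardian are hypothetical, and counting the reclaimed objects stands in for whatever cleanup a real program would perform.

```scheme
(define guardian (make-guardian))
(define reclaimed-count 0)

;; Calling the guardian with no arguments returns one object that
;; has become unreachable, or #f when there are none left.
(define (drain-guardian)
  (let loop ((obj (guardian)))
    (if obj
        (begin
          (set! reclaimed-count (+ reclaimed-count 1))
          (loop (guardian))))))

;; after-gc-hook has arity 0, so a thunk is what it expects.
(add-hook! after-gc-hook drain-guardian)

;; Register an object with the guardian.
(guardian (list 'some 'data))
```

When exactly the guarded object is handed back depends on when the garbage collector runs, so reclaimed-count should not be expected to change at any particular moment.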
Scheme supports the definition of variables in different contexts. Variables can be defined at the top level, so that they are visible in the entire program, and variables can be defined locally to procedures and expressions. This is important for modularity and data abstraction.
On the top level of a program (i.e. when not inside the body of a
procedure definition or a let, let* or letrec
expression), a definition of the form
(define a value)
defines a variable called a and sets it to the value value.
If the variable already exists, because it has already been created by a
previous define expression with the same name, its value is
simply changed to the new value. In this case, then, the above
form is completely equivalent to
(set! a value)
This equivalence means that define can be used interchangeably
with set! to change the value of variables at the top level of
the REPL or a Scheme source file. It is useful during interactive
development when reloading a Scheme file that you have modified, because
it allows the define expressions in that file to work as expected
both the first time that the file is loaded and on subsequent occasions.
Note, though, that define and set! are not always
equivalent. For example, a set! is not allowed if the named
variable does not already exist, and the two expressions can behave
differently in the case where there are imported variables visible from
another module.
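The following sketch shows the equivalence and one of its limits. The variable names are invented, and eval is used for the failing set! only so that the error can be caught at run time.

```scheme
(define greeting "hello")         ; creates the variable
(define greeting "hello again")   ; at top level, acts like set!
(set! greeting "goodbye")         ; fine: the variable exists

;; set! on a name that has never been defined signals an error.
(define set!-failed
  (catch #t
    (lambda ()
      (eval '(set! no-such-variable-in-this-sketch 1) (current-module))
      #f)
    (lambda args #t)))            ; #t
```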
Create a top level variable named name with value value. If the named variable already exists, just change its value. The return value of a define expression is unspecified.
The C API equivalents of define are scm_define and
scm_c_define, which differ from each other in whether the
variable name is specified as a SCM symbol or as a
null-terminated C string.
C equivalents of define, with variable name specified either by sym, a symbol, or by name, a null-terminated C string. Both variants return the new or preexisting variable object.
define (when it occurs at top level), scm_define and
scm_c_define all create or set the value of a variable in the top
level environment of the current module. If there was not already a
variable with the specified name belonging to the current module, but a
similarly named variable from another module was visible through having
been imported, the newly created variable in the current module will
shadow the imported variable, such that the imported variable is no
longer visible.
Attention: Scheme definitions inside local binding constructs (see Local Bindings) act differently (see Internal Definitions).
As opposed to definitions at the top level, which are visible in the whole program (or current module, when Guile modules are used), it is also possible to define variables which are only visible in a well-defined part of the program. Normally, this part of a program will be a procedure or a subexpression of a procedure.
With the constructs for local binding (let, let* and
letrec), the Scheme language has a block structure like most
other programming languages since the days of Algol 60. Readers
familiar with languages like C or Java should already be used to this
concept, but the family of let expressions has a few properties
which are well worth knowing.
The first local binding construct is let. The other constructs
let* and letrec are specialized versions for usage where
using plain let is a bit inconvenient.
bindings has the form ((variable1 init1) ...), that is, zero or more two-element lists of a variable and an arbitrary expression each. All variable names must be distinct.
A let expression is evaluated as follows.
- All init expressions are evaluated.
- New storage is allocated for the variables.
- The values of the init expressions are stored into the variables.
- The expressions in body are evaluated in order, and the value of the last expression is returned as the value of the let expression.
- The storage for the variables is freed.
The init expressions are not allowed to refer to any of the variables.
Similar to let, but the variable bindings are performed sequentially; that is, each init expression is allowed to use the variables defined to its left in the binding list.
A let* expression can always be expressed with nested let expressions.
(let* ((a 1) (b a))
  b)
==
(let ((a 1))
  (let ((b a))
    b))
Similar to let, but it is possible to refer to the variables from lambda expressions created in any of the inits. That is, procedures created in the init expressions can recursively refer to the defined variables.
(letrec ((even?
          (lambda (n)
            (if (zero? n)
                #t
                (odd? (- n 1)))))
         (odd?
          (lambda (n)
            (if (zero? n)
                #f
                (even? (- n 1))))))
  (even? 88))
=> #t
There is also an alternative form of let, which is used
for expressing iteration. Because of its use as a looping construct,
this form (the named let) is documented in the section about
iteration (see Iteration).
A define form which appears inside the body of a lambda,
let, let*, letrec or equivalent expression is
called an internal definition. An internal definition differs
from a top level definition (see Top Level), because the definition
is only visible inside the complete body of the enclosing form. Let us
examine the following example.
(let ((frumble "froz"))
  (define banana (lambda () (apple 'peach)))
  (define apple (lambda (x) x))
  (banana))
=> peach
Here the enclosing form is a let, so the defines in the
let-body are internal definitions. Because the scope of the
internal definitions is the complete body of the
let-expression, the lambda-expression which gets bound
to the variable banana may refer to the variable apple,
even though its definition appears lexically after the definition
of banana. This is because a sequence of internal definitions
acts as if it were a letrec expression.
(let ()
  (define a 1)
  (define b 2)
  (+ a b))
is equivalent to
(let ()
  (letrec ((a 1) (b 2))
    (+ a b)))
Another noteworthy difference to top level definitions is that within
one group of internal definitions all variable names must be distinct.
That means where on the top level a second define for a given variable
acts like a set!, an exception is thrown for internal definitions
with duplicate bindings.
Guile provides a procedure for checking whether a symbol is bound in the top level environment.
Return #t if sym is defined in the lexical environment env. When env is not specified, look in the top-level environment as defined by the current module.
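A minimal sketch of checking for a top level binding follows. The symbol names are hypothetical, and the second one is assumed not to be defined anywhere.

```scheme
(define answer 42)

(define answer-bound  (defined? 'answer))                      ; #t
(define missing-bound (defined? 'no-such-name-in-this-sketch)) ; #f
```

Note that the argument is a quoted symbol: defined? looks the name up and does not evaluate the variable itself.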
See Control Flow for a discussion of how the more general control flow of Scheme affects C code.