Guile Reference Manual


The Guile Reference Manual

This manual documents Guile version 2.0.11.

Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005, 2009, 2010, 2011, 2012, 2013, 2014 Free Software Foundation.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License.”





Preface

This manual describes how to use Guile, GNU’s Ubiquitous Intelligent Language for Extensions. It relates particularly to Guile version 2.0.11.



Contributors to this Manual

Like Guile itself, the Guile reference manual is a living entity, cared for by many people over a long period of time. As such, it is hard to identify individuals of whom to say “yes, this person, she wrote the manual.”

Still, among the many contributions, some caretakers stand out. First among them is Neil Jerram, who has been working on this document for ten years now. Neil’s attention both to detail and to the big picture has made a real difference in the understanding of a generation of Guile hackers.

Next we should note Marius Vollmer’s effect on this document. Marius maintained Guile during a period in which Guile’s API was clarified—put to the fire, so to speak—and he had the good sense to effect the same change on the manual.

Martin Grabmueller made substantial contributions throughout the manual in preparation for the Guile 1.6 release, including filling out a lot of the documentation of Scheme data types, control mechanisms and procedures. In addition, he wrote the documentation for Guile’s SRFI modules and modules associated with the Guile REPL.

Ludovic Courtès and Andy Wingo, the Guile maintainers at the time of this writing (late 2010), have also made their dent in the manual, writing documentation for new modules and subsystems in Guile 2.0. They are also responsible for ensuring that the existing text retains its relevance as Guile evolves. See Reporting Bugs, for more information on reporting problems in this manual.

The content for the first versions of this manual incorporated and was inspired by documents from Aubrey Jaffer, author of the SCM system on which Guile was based, and from Tom Lord, Guile’s first maintainer. Although most of this text has been rewritten, all of it was important, and some of the structure remains.

The manual for the first versions of Guile was largely written, edited, and compiled by Mark Galassi and Jim Blandy. In particular, Jim wrote the original tutorial on Guile’s data representation and the C API for accessing Guile objects.

Significant portions were also contributed by Thien-Thi Nguyen, Kevin Ryde, Mikael Djurfeldt, Christian Lynbech, Julian Graham, Gary Houston, Tim Pierce, and a few dozen more. You, reader, are most welcome to join their esteemed ranks. Visit Guile’s web site at http://www.gnu.org/software/guile/ to find out how to get involved.



The Guile License

Guile is Free Software. Guile is copyrighted, not public domain, and there are restrictions on its distribution or redistribution, but these restrictions are designed to permit everything a cooperating person would want to do.

C code linking to the Guile library is subject to the terms of that library. Basically such code may be published on any terms, provided users can re-link against a new or modified version of Guile.

C code linking to the Guile readline module is subject to the terms of that module. Basically such code must be published on Free terms.

Scheme level code written to be run by Guile (but not derived from Guile itself) is not restricted in any way, and may be published on any terms. We encourage authors to publish on Free terms.

You must be aware there is no warranty whatsoever for Guile. This is described in full in the licenses.



1 Introduction

Guile is an implementation of the Scheme programming language. Scheme (http://schemers.org/) is an elegant and conceptually simple dialect of Lisp, originated by Guy Steele and Gerald Sussman, and since evolved by the series of reports known as RnRS (the Revised^n Reports on Scheme).

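Unlike, for example, Python or Perl, Scheme has no benevolent dictator. There are many Scheme implementations, with different characteristics and with communities and academic activities around them, and the language develops as a result of the interplay between these. Guile’s particular characteristics are that it is easy to combine with code written in C, it has a historical and continuing connection with the GNU Project, it emphasises interactive and incremental programming, and it supports several languages, not just Scheme.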

The next few sections explain what we mean by these points. The sections after that cover how you can obtain and install Guile, and the typographical conventions that we use in this manual.



1.1 Guile and Scheme

Guile implements Scheme as described in the Revised^5 Report on the Algorithmic Language Scheme (usually known as R5RS), providing clean and general data and control structures. Guile goes beyond the rather austere language presented in R5RS, extending it with a module system, full access to POSIX system calls, networking support, multiple threads, dynamic linking, a foreign function call interface, powerful string processing, and many other features needed for programming in the real world.

The Scheme community has recently agreed and published R6RS, the latest installment in the RnRS series. R6RS significantly expands the core Scheme language, and standardises many non-core functions that implementations—including Guile—have previously done in different ways. Guile has been updated to incorporate some of the features of R6RS, and to adjust some existing features to conform to the R6RS specification, but it is by no means a complete R6RS implementation. See R6RS Support.

Between R5RS and R6RS, the SRFI process (http://srfi.schemers.org/) standardised interfaces for many practical needs, such as multithreaded programming and multidimensional arrays. Guile supports many SRFIs, as documented in detail in SRFI Support.
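As a small illustration of SRFI support, the following sketch loads SRFI-1 (the list library) with use-modules and calls two of its procedures. The variable names numbers and total, and the sample data, are chosen only for this example.

```scheme
;; Load SRFI-1, the list-processing library.
(use-modules (srfi srfi-1))

;; `iota' builds a list of consecutive integers; `fold' combines
;; the elements of a list using a procedure and an initial value.
(define numbers (iota 5 1))          ; the list (1 2 3 4 5)
(define total (fold + 0 numbers))    ; the sum of that list

(display total)                      ; -| 15
(newline)
```

Other SRFIs are loaded the same way, by naming the corresponding (srfi srfi-N) module.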

In summary, so far as relationship to the Scheme standards is concerned, Guile is an R5RS implementation with many extensions, some of which conform to SRFIs or to the relevant parts of R6RS.



1.2 Combining with C Code

Like a shell, Guile can run interactively—reading expressions from the user, evaluating them, and displaying the results—or as a script interpreter, reading and executing Scheme code from a file. Guile also provides an object library, libguile, that allows other applications to easily incorporate a complete Scheme interpreter. An application can then use Guile as an extension language, a clean and powerful configuration language, or as multi-purpose “glue”, connecting primitives provided by the application. It is easy to call Scheme code from C code and vice versa, giving the application designer full control of how and when to invoke the interpreter. Applications can add new functions, data types, control structures, and even syntax to Guile, creating a domain-specific language tailored to the task at hand, but based on a robust language design.

This kind of combination is helped by four aspects of Guile’s design and history. First is that Guile has always been targeted as an extension language. Hence its C API has always been of great importance, and has been developed accordingly. Second and third are rather technical points (that Guile uses conservative garbage collection, and that it implements the Scheme concept of continuations by copying and reinstating the C stack) whose practical consequence is that most existing C code can be glued into Guile as is, without needing modifications to cope with strange Scheme execution flows. Last is the module system, which helps extensions to coexist without stepping on each other’s toes.

Guile’s module system allows one to break up a large program into manageable sections with well-defined interfaces between them. Modules may contain a mixture of interpreted and compiled code; Guile can use either static or dynamic linking to incorporate compiled code. Modules also encourage developers to package up useful collections of routines for general distribution; as of this writing, one can find Emacs interfaces, database access routines, compilers, GUI toolkit interfaces, and HTTP client functions, among others.



1.3 Guile and the GNU Project

Guile was conceived by the GNU Project following the fantastic success of Emacs Lisp as an extension language within Emacs. Just as Emacs Lisp allowed complete and unanticipated applications to be written within the Emacs environment, the idea was that Guile should do the same for other GNU Project applications. This remains true today.

The idea of extensibility is closely related to the GNU project’s primary goal, that of promoting software freedom. Software freedom means that people receiving a software package can modify or enhance it to their own desires, including in ways that may not have occurred at all to the software’s original developers. For programs written in a compiled language like C, this freedom covers modifying and rebuilding the C code; but if the program also provides an extension language, that is usually a much friendlier and lower-barrier-of-entry way for the user to start making their own changes.

Guile is now used by GNU project applications such as AutoGen, LilyPond, Denemo, Mailutils, TeXmacs and GnuCash, and we hope that there will be many more in future.



1.4 Interactive Programming

Non-free software has no interest in its users being able to see how it works. They are supposed to just accept it, or to report problems and hope that the source code owners will choose to work on them.

Free software aims to work reliably just as much as non-free software does, but it should also empower its users by making its workings available. This is useful for many reasons, including education, auditing and enhancements, as well as for debugging problems.

The ideal free software system achieves this by making it easy for interested users to see the source code for a feature that they are using, and to follow through that source code step-by-step, as it runs. In Emacs, good examples of this are the source code hyperlinks in the help system, and edebug. Then, for bonus points and maximising the ability for the user to experiment quickly with code changes, the system should allow parts of the source code to be modified and reloaded into the running program, to take immediate effect.

Guile is designed for this kind of interactive programming, and this distinguishes it from many Scheme implementations that instead prioritise running a fixed Scheme program as fast as possible—because there are tradeoffs between performance and the ability to modify parts of an already running program. There are faster Schemes than Guile, but Guile is a GNU project and so prioritises the GNU vision of programming freedom and experimentation.



1.5 Supporting Multiple Languages

Since the 2.0 release, Guile’s architecture supports compiling any language to its core virtual machine bytecode, and Scheme is just one of the supported languages. Other supported languages are Emacs Lisp, ECMAScript (commonly known as JavaScript) and Brainfuck, and work is under discussion for Lua, Ruby and Python.

This means that users can program applications which use Guile in the language of their choice, rather than having the tastes of the application’s author imposed on them.



1.6 Obtaining and Installing Guile

Guile can be obtained from the main GNU archive site ftp://ftp.gnu.org or any of its mirrors. The file will be named guile-version.tar.gz. The current version is 2.0.11, so the file you should grab is:

ftp://ftp.gnu.org/gnu/guile/guile-2.0.11.tar.gz

To unpack Guile, use the command

zcat guile-2.0.11.tar.gz | tar xvf -

which will create a directory called guile-2.0.11 with all the sources. You can look at the file INSTALL for detailed instructions on how to build and install Guile, but you should be able to just do

cd guile-2.0.11
./configure
make
make install

This will install the Guile executable guile, the Guile library libguile and various associated header files and support libraries. It will also install the Guile reference manual.

Since this manual frequently refers to the Scheme “standard”, also known as R5RS, or the “Revised^5 Report on the Algorithmic Language Scheme”, we have included the report in the Guile distribution; see Introduction in Revised(5) Report on the Algorithmic Language Scheme. This will also be installed in your info directory.



1.7 Organisation of this Manual

The rest of this manual is organised into the following chapters.

Chapter 2: Hello Guile!

A whirlwind tour shows how Guile can be used interactively and as a script interpreter, how to link Guile into your own applications, and how to write modules of interpreted and compiled code for use with Guile. Everything introduced here is documented again and in full by the later parts of the manual.

Chapter 3: Hello Scheme!

For readers new to Scheme, this chapter provides an introduction to the basic ideas of the Scheme language. This material would apply to any Scheme implementation and so does not make reference to anything Guile-specific.

Chapter 4: Programming in Scheme

Provides an overview of programming in Scheme with Guile. It covers how to invoke the guile program from the command-line and how to write scripts in Scheme. It also introduces the extensions that Guile offers beyond standard Scheme.

Chapter 5: Programming in C

Provides an overview of how to use Guile in a C program. It discusses the fundamental concepts that you need to understand to access the features of Guile, such as dynamic types and the garbage collector. It explains in a tutorial-like manner how to define new data types and functions for use by Scheme programs.

Chapter 6: Guile API Reference

This part of the manual documents the Guile API in functionality-based groups with the Scheme and C interfaces presented side by side.

Chapter 7: Guile Modules

Describes some important modules, distributed as part of the Guile distribution, that extend the functionality provided by the Guile Scheme core.

Chapter 8: GOOPS

Describes GOOPS, an object oriented extension to Guile that provides classes, multiple inheritance and generic functions.



1.8 Typographical Conventions

In examples, procedure descriptions and all other places where the evaluation of Scheme expressions is shown, we use some notation for denoting the output and evaluation results of expressions.

The symbol ‘⇒’ is used to tell which value is returned by an evaluation:

(+ 1 2)
⇒ 3

Some procedures produce some output besides returning a value. This is denoted by the symbol ‘-|’.

(begin (display 1) (newline) 'hooray)
-| 1
⇒ hooray

As you can see, this code prints ‘1’ (denoted by ‘-|’), and returns hooray (denoted by ‘⇒’).



2 Hello Guile!

This chapter presents a quick tour of all the ways that Guile can be used. There are additional examples in the examples/ directory in the Guile source distribution. It also explains how best to report any problems that you find.

The following examples assume that Guile has been installed in /usr/local/.



2.1 Running Guile Interactively

In its simplest form, Guile acts as an interactive interpreter for the Scheme programming language, reading and evaluating Scheme expressions the user enters from the terminal. Here is a sample interaction between Guile and a user; the user’s input appears after the $ and scheme@(guile-user)> prompts:

$ guile
scheme@(guile-user)> (+ 1 2 3)                ; add some numbers
$1 = 6
scheme@(guile-user)> (define (factorial n)    ; define a function
                       (if (zero? n) 1 (* n (factorial (- n 1)))))
scheme@(guile-user)> (factorial 20)
$2 = 2432902008176640000
scheme@(guile-user)> (getpwnam "root")        ; look in /etc/passwd
$3 = #("root" "x" 0 0 "root" "/root" "/bin/bash")
scheme@(guile-user)> C-d
$


2.2 Running Guile Scripts

Like AWK, Perl, or any shell, Guile can interpret script files. A Guile script is simply a file of Scheme code with some extra information at the beginning which tells the operating system how to invoke Guile, and then tells Guile how to handle the Scheme code.

Here is a trivial Guile script. See Guile Scripting, for more details.

#!/usr/local/bin/guile -s
!#
(display "Hello, world!")
(newline)


2.3 Linking Guile into Programs

The Guile interpreter is available as an object library, to be linked into applications using Scheme as a configuration or extension language.

Here is simple-guile.c, source code for a program that will produce a complete Guile interpreter. In addition to all usual functions provided by Guile, it will also offer the function my-hostname.

#include <stdlib.h>
#include <libguile.h>

static SCM
my_hostname (void)
{
  char *s = getenv ("HOSTNAME");
  if (s == NULL)
    return SCM_BOOL_F;
  else
    return scm_from_locale_string (s);
}

static void
inner_main (void *data, int argc, char **argv)
{
  scm_c_define_gsubr ("my-hostname", 0, 0, 0, my_hostname);
  scm_shell (argc, argv);
}

int
main (int argc, char **argv)
{
  scm_boot_guile (argc, argv, inner_main, 0);
  return 0; /* never reached */
}

When Guile is correctly installed on your system, the above program can be compiled and linked like this:

$ gcc -o simple-guile simple-guile.c \
    `pkg-config --cflags --libs guile-2.0`

When it is run, it behaves just like the guile program except that you can also call the new my-hostname function.

$ ./simple-guile
scheme@(guile-user)> (+ 1 2 3)
$1 = 6
scheme@(guile-user)> (my-hostname)
"burns"


2.4 Writing Guile Extensions

You can link Guile into your program and make Scheme available to the users of your program. You can also link your library into Guile and make its functionality available to all users of Guile.

A library that is linked into Guile is called an extension, but it really just is an ordinary object library.

The following example shows how to write a simple extension for Guile that makes the j0 function available to Scheme code.

#include <math.h>
#include <libguile.h>

SCM
j0_wrapper (SCM x)
{
  return scm_from_double (j0 (scm_to_double (x)));
}

void
init_bessel (void)
{
  scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
}

This C source file needs to be compiled into a shared library. Here is how to do it on GNU/Linux:

gcc `pkg-config --cflags guile-2.0` \
  -shared -o libguile-bessel.so -fPIC bessel.c

For creating shared libraries portably, we recommend the use of GNU Libtool (see Introduction in GNU Libtool).

A shared library can be loaded into a running Guile process with the function load-extension. The j0 function is then immediately available:

$ guile
scheme@(guile-user)> (load-extension "./libguile-bessel" "init_bessel")
scheme@(guile-user)> (j0 2)
$1 = 0.223890779141236

For more on how to install your extension, see Installing Site Packages.



2.5 Using the Guile Module System

Guile has support for dividing a program into modules. By using modules, you can group related code together and manage the composition of complete programs from largely independent parts.

For more details on the module system beyond this introductory material, see Modules.



2.5.1 Using Modules

Guile comes with a lot of useful modules, for example for string processing or command line parsing. Additionally, there exist many Guile modules written by other Guile hackers, but which have to be installed manually.

Here is a sample interactive session that shows how to use the (ice-9 popen) module which provides the means for communicating with other processes over pipes together with the (ice-9 rdelim) module that provides the function read-line.

$ guile
scheme@(guile-user)> (use-modules (ice-9 popen))
scheme@(guile-user)> (use-modules (ice-9 rdelim))
scheme@(guile-user)> (define p (open-input-pipe "ls -l"))
scheme@(guile-user)> (read-line p)
$1 = "total 30"
scheme@(guile-user)> (read-line p)
$2 = "drwxr-sr-x    2 mgrabmue mgrabmue     1024 Mar 29 19:57 CVS"


2.5.2 Writing new Modules

You can create new modules using the syntactic form define-module. All definitions following this form until the next define-module are placed into the new module.

One module is usually placed into one file, and that file is installed in a location where Guile can automatically find it. The following session shows a simple example.

$ cat /usr/local/share/guile/site/foo/bar.scm

(define-module (foo bar)
  #:export (frob))

(define (frob x) (* 2 x))

$ guile
scheme@(guile-user)> (use-modules (foo bar))
scheme@(guile-user)> (frob 12)
$1 = 24

For more on how to install your module, see Installing Site Packages.



2.5.3 Putting Extensions into Modules

In addition to Scheme code you can also put things that are defined in C into a module.

You do this by writing a small Scheme file that defines the module and calls load-extension directly in the body of the module.

$ cat /usr/local/share/guile/site/math/bessel.scm

(define-module (math bessel)
  #:export (j0))

(load-extension "libguile-bessel" "init_bessel")

$ file /usr/local/lib/guile/2.0/extensions/libguile-bessel.so
… ELF 32-bit LSB shared object …
$ guile
scheme@(guile-user)> (use-modules (math bessel))
scheme@(guile-user)> (j0 2)
$1 = 0.223890779141236

See Modules and Extensions, for more information.



2.6 Reporting Bugs

Any problems with the installation should be reported to bug-guile@gnu.org.

If you find a bug in Guile, please report it to the Guile developers, so they can fix it. They may also be able to suggest workarounds when it is not possible for you to apply the bug-fix or install a new version of Guile yourself.

Before sending in bug reports, please check with the following list that you really have found a bug.

Before reporting the bug, check whether any programs you have loaded into Guile, including your .guile file, set any variables that may affect the functioning of Guile. Also, see whether the problem happens in a freshly started Guile without loading your .guile file (start Guile with the -q switch to prevent loading the init file). If the problem does not occur then, you must report the precise contents of any programs that you must load into Guile in order to cause the problem to occur.

When you write a bug report, please make sure to include as much of the information described below as you can. If you can’t figure out some of the items, it is not a problem, but the more information we get, the more likely we can diagnose and fix the bug.

If your bug causes Guile to crash, additional information from a low-level debugger such as GDB might be helpful. If you have built Guile yourself, you can run Guile under GDB via the meta/gdb-uninstalled-guile script. Instead of invoking Guile as usual, invoke the wrapper script, type run to start the process, then backtrace when the crash comes. Include that backtrace in your report.



3 Hello Scheme!

In this chapter, we introduce the basic concepts that underpin the elegance and power of the Scheme language.

Readers who already possess a background knowledge of Scheme may happily skip this chapter. For the reader who is new to the language, however, the following discussions on data, procedures, expressions and closure are designed to provide a minimum level of Scheme understanding that is more or less assumed by the chapters that follow.

The style of this introductory material aims about halfway between the terse precision of R5RS and the discursiveness of existing Scheme tutorials. For pointers to useful Scheme resources on the web, please see Further Reading.



3.1 Data Types, Values and Variables

This section discusses the representation of data types and values, what it means for Scheme to be a latently typed language, and the role of variables. We conclude by introducing the Scheme syntaxes for defining a new variable, and for changing the value of an existing variable.



3.1.1 Latent Typing

The term latent typing is used to describe a computer language, such as Scheme, for which you cannot, in general, simply look at a program’s source code and determine what type of data will be associated with a particular variable, or with the result of a particular expression.

Sometimes, of course, you can tell from the code what the type of an expression will be. If you have a line in your program that sets the variable x to the numeric value 1, you can be certain that, immediately after that line has executed (and in the absence of multiple threads), x has the numeric value 1. Or if you write a procedure that is designed to concatenate two strings, it is likely that the rest of your application will always invoke this procedure with two string parameters, and quite probable that the procedure would go wrong in some way if it was ever invoked with parameters that were not both strings.

Nevertheless, the point is that there is nothing in Scheme which requires the procedure parameters always to be strings, or x always to hold a numeric value, and there is no way of declaring in your program that such constraints should always be obeyed. In the same vein, there is no way to declare the expected type of a procedure’s return value.

Instead, the types of variables and expressions are only known – in general – at run time. If you need to check at some point that a value has the expected type, Scheme provides run time procedures that you can invoke to do so. But equally, it can be perfectly valid for two separate invocations of the same procedure to specify arguments with different types, and to return values with different types.
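As a minimal sketch of such run time checks, the procedure below uses the standard predicates number?, string? and procedure? to find out what kind of value it was given. The name describe and the strings it returns are hypothetical and serve only this example.

```scheme
;; A procedure that accepts arguments of any type and checks,
;; at run time, which kind of value it actually received.
(define (describe value)
  (cond ((number? value) "a number")
        ((string? value) "a string")
        ((procedure? value) "a procedure")
        (else "something else")))

(describe 42)        ; ⇒ "a number"
(describe "forty")   ; ⇒ "a string"
(describe car)       ; ⇒ "a procedure"
(describe '(1 2))    ; ⇒ "something else"
```

Each call above passes an argument of a different type to the same procedure, and nothing in the definition of describe declares which types are acceptable.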

The next subsection explains what this means in practice, for the ways that Scheme programs use data types, values and variables.



3.1.2 Values and Variables

Scheme provides many data types that you can use to represent your data. Primitive types include characters, strings, numbers and procedures. Compound types, which allow a group of primitive and compound values to be stored together, include lists, pairs, vectors and multi-dimensional arrays. In addition, Guile allows applications to define their own data types, with the same status as the built-in standard Scheme types.

As a Scheme program runs, values of all types pop in and out of existence. Sometimes values are stored in variables, but more commonly they pass seamlessly from being the result of one computation to being one of the parameters for the next.

Consider an example. A string value is created because the interpreter reads in a literal string from your program’s source code. Then a numeric value is created as the result of calculating the length of the string. A second numeric value is created by doubling the calculated length. Finally the program creates a list with two elements – the doubled length and the original string itself – and stores this list in a program variable.
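The sequence just described could be written as follows. This is only a sketch: the literal "hello" and the names text, len, doubled and result are assumptions made for the example.

```scheme
(define result
  (let* ((text "hello")                ; a string value, read from the source
         (len (string-length text))    ; a numeric value: 5
         (doubled (* 2 len)))          ; a second numeric value: 10
    (list doubled text)))              ; a list with two elements

result                                 ; ⇒ (10 "hello")
```

Only the final list is stored in a program variable at top level; the intermediate values exist just long enough to be passed to the next computation.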

All of the values involved here – in fact, all values in Scheme – carry their type with them. In other words, every value “knows,” at runtime, what kind of value it is. A number, a string, a list, whatever.

A variable, on the other hand, has no fixed type. A variable – x, say – is simply the name of a location – a box – in which you can store any kind of Scheme value. So the same variable in a program may hold a number at one moment, a list of procedures the next, and later a pair of strings. The “type” of a variable – insofar as the idea is meaningful at all – is simply the type of whatever value the variable happens to be storing at a particular moment.
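The following sketch shows one variable holding values of three different types in turn. It uses the define and set! forms that are described in Defining and Setting Variables; the variable name box is hypothetical.

```scheme
(define box 42)                    ; box holds a number
(number? box)                      ; ⇒ #t

(set! box (list car cdr))          ; now it holds a list of procedures
(list? box)                        ; ⇒ #t

(set! box (cons "left" "right"))   ; and now a pair of strings
(pair? box)                        ; ⇒ #t
(number? box)                      ; ⇒ #f
```

In each case the predicate is answering a question about the value currently stored in box, not about box itself.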



3.1.3 Defining and Setting Variables

To define a new variable, you use Scheme’s define syntax like this:

(define variable-name value)

This makes a new variable called variable-name and stores value in it as the variable’s initial value. For example:

;; Make a variable `x' with initial numeric value 1.
(define x 1)

;; Make a variable `organization' with an initial string value.
(define organization "Free Software Foundation")

(In Scheme, a semicolon marks the beginning of a comment that continues until the end of the line. So the lines beginning ;; are comments.)

Changing the value of an already existing variable is very similar, except that define is replaced by the Scheme syntax set!, like this:

(set! variable-name new-value)

Remember that variables do not have fixed types, so new-value may have a completely different type from whatever was previously stored in the location named by variable-name. Both of the following examples are therefore correct.

;; Change the value of `x' to 5.
(set! x 5)

;; Change the value of `organization' to the FSF's street number.
(set! organization 545)

In these examples, value and new-value are literal numeric or string values. In general, however, value and new-value can be any Scheme expression. Even though we have not yet covered the forms that Scheme expressions can take (see About Expressions), you can probably guess what the following set! example does…

(set! x (+ x 1))

(Note: this is not a complete description of define and set!, because we need to introduce some other aspects of Scheme before the missing pieces can be filled in. If, however, you are already familiar with the structure of Scheme, you may like to read about those missing pieces immediately by jumping ahead to the following references.



3.2 The Representation and Use of Procedures

This section introduces the basics of using and creating Scheme procedures. It discusses the representation of procedures as just another kind of Scheme value, and shows how procedure invocation expressions are constructed. We then explain how lambda is used to create new procedures, and conclude by presenting the various shorthand forms of define that can be used instead of writing an explicit lambda expression.



3.2.1 Procedures as Values

One of the great simplifications of Scheme is that a procedure is just another type of value, and that procedure values can be passed around and stored in variables in exactly the same way as, for example, strings and lists. When we talk about a built-in standard Scheme procedure such as open-input-file, what we actually mean is that there is a pre-defined top level variable called open-input-file, whose value is a procedure that implements what R5RS says that open-input-file should do.

Note that this is quite different from many dialects of Lisp — including Emacs Lisp — in which a program can use the same name with two quite separate meanings: one meaning identifies a Lisp function, while the other meaning identifies a Lisp variable, whose value need have nothing to do with the function that is associated with the first meaning. In these dialects, functions and variables are said to live in different namespaces.

In Scheme, on the other hand, all names belong to a single unified namespace, and the variables that these names identify can hold any kind of Scheme value, including procedure values.

One consequence of the “procedures as values” idea is that, if you don’t happen to like the standard name for a Scheme procedure, you can change it.

For example, call-with-current-continuation is a very important standard Scheme procedure, but it also has a very long name! So, many programmers use the following definition to assign the same procedure value to the more convenient name call/cc.

(define call/cc call-with-current-continuation)

Let’s understand exactly how this works. The definition creates a new variable call/cc, and then sets its value to the value of the variable call-with-current-continuation; the latter value is a procedure that implements the behaviour that R5RS specifies under the name “call-with-current-continuation”. So call/cc ends up holding this value as well.

Now that call/cc holds the required procedure value, you could choose to use call-with-current-continuation for a completely different purpose, or just change its value so that you will get an error if you accidentally use call-with-current-continuation as a procedure in your program rather than call/cc. For example:

(set! call-with-current-continuation "Not a procedure any more!")

Or you could just leave call-with-current-continuation as it was. It’s perfectly fine for more than one variable to hold the same procedure value.
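As a small sketch of the points above, the following shows a procedure value being given a second name and being stored in a list like any other value. The names my-append and operations are hypothetical, chosen only for this illustration.

```scheme
;; Hypothetical names for illustration: my-append, operations.

;; Give the standard procedure string-append a second name.
(define my-append string-append)

;; Both variables now hold the very same procedure value.
(eq? my-append string-append)          ; ⇒ #t
(my-append "foo" "bar")                ; ⇒ "foobar"

;; Procedure values can be stored in data structures too.
(define operations (list + *))
((car operations) 2 3)                 ; ⇒ 5
((cadr operations) 2 3)                ; ⇒ 6
```

Neither name is special: removing my-append later would not affect string-append, because both variables merely refer to the same procedure value.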



3.2.2 Simple Procedure Invocation

A procedure invocation in Scheme is written like this:

(procedure [arg1 [arg2 …]])

In this expression, procedure can be any Scheme expression whose value is a procedure. Most commonly, however, procedure is simply the name of a variable whose value is a procedure.

For example, string-append is a standard Scheme procedure that concatenates all of the arguments it is given, which are expected to be strings. So the expression

(string-append "/home" "/" "andrew")

is a procedure invocation whose result is the string value "/home/andrew".

Similarly, string-length is a standard Scheme procedure that returns the length of a single string argument, so

(string-length "abc")

is a procedure invocation whose result is the numeric value 3.

Each of the arguments in a procedure invocation can itself be any Scheme expression. Since a procedure invocation is itself a type of expression, we can put these two examples together to get

(string-length (string-append "/home" "/" "andrew"))

— a procedure invocation whose result is the numeric value 12.

(You may be wondering what happens if the two examples are combined the other way round. If we do this, we can make a procedure invocation expression that is syntactically correct:

(string-append "/home" (string-length "abc"))

but when this expression is executed, it will cause an error, because the result of (string-length "abc") is a numeric value, and string-append is not designed to accept a numeric value as one of its arguments.)
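The following sketch illustrates two of the points above. First, the procedure position can hold any expression whose value is a procedure. Second, the failing combination can be repaired by converting the number to a string with the standard procedure number->string. The variable names chosen-result and repaired are hypothetical, and the repair shown is only one possibility.

```scheme
;; Hypothetical names for illustration: chosen-result, repaired.

;; The procedure position is itself an expression.  Here an `if'
;; expression chooses between the procedures + and *.
(define chosen-result ((if (< 1 2) + *) 2 3))
chosen-result                          ; ⇒ 5

;; string-append only accepts strings, so convert the length first.
(define repaired
  (string-append "/home" (number->string (string-length "abc"))))
repaired                               ; ⇒ "/home3"
```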



3.2.3 Creating and Using a New Procedure

Scheme has lots of standard procedures, and Guile provides all of these via predefined top level variables. All of these standard procedures are documented in the later chapters of this reference manual.

Before very long, though, you will want to create new procedures that encapsulate aspects of your own applications’ functionality. To do this, you can use the famous lambda syntax.

For example, the value of the following Scheme expression

(lambda (name address) expression …)

is a newly created procedure that takes two arguments: name and address. The behaviour of the new procedure is determined by the sequence of expressions in the body of the procedure definition. (Typically, these expressions would use the arguments in some way, or else there wouldn’t be any point in giving them to the procedure.) When invoked, the new procedure returns a value that is the value of the last expression in the procedure body.

To make things more concrete, let’s suppose that the two arguments are both strings, and that the purpose of this procedure is to form a combined string that includes these arguments. Then the full lambda expression might look like this:

(lambda (name address)
  (string-append "Name=" name ":Address=" address))

We noted in the previous subsection that the procedure part of a procedure invocation expression can be any Scheme expression whose value is a procedure. But that’s exactly what a lambda expression is! So we can use a lambda expression directly in a procedure invocation, like this:

((lambda (name address)
   (string-append "Name=" name ":Address=" address))
 "FSF"
 "Cambridge") 

This is a valid procedure invocation expression, and its result is the string:

"Name=FSF:Address=Cambridge"

It is more common, though, to store the procedure value in a variable —

(define make-combined-string
  (lambda (name address)
    (string-append "Name=" name ":Address=" address)))

— and then to use the variable name in the procedure invocation:

(make-combined-string "FSF" "Cambridge") 

This has exactly the same result.

It’s important to note that procedures created using lambda have exactly the same status as the standard built in Scheme procedures, and can be invoked, passed around, and stored in variables in exactly the same ways.
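To illustrate that procedures created with lambda have the same status as built-in ones, this sketch passes both kinds to the standard procedure map. The name square is hypothetical.

```scheme
;; A procedure created with lambda and stored in a variable
;; (square is a hypothetical name for this illustration).
(define square (lambda (x) (* x x)))

;; map accepts a user-defined procedure...
(map square '(1 2 3))                      ; ⇒ (1 4 9)

;; ...an anonymous lambda expression...
(map (lambda (x) (+ x 1)) '(1 2 3))        ; ⇒ (2 3 4)

;; ...or a standard built-in procedure, all in the same way.
(map string-length '("a" "bc" "def"))      ; ⇒ (1 2 3)
```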



3.2.4 Lambda Alternatives

Since it is so common in Scheme programs to want to create a procedure and then store it in a variable, there is an alternative form of the define syntax that allows you to do just that.

A define expression of the form

(define (name [arg1 [arg2 …]])
  expression …)

is exactly equivalent to the longer form

(define name
  (lambda ([arg1 [arg2 …]])
    expression …))

So, for example, the definition of make-combined-string in the previous subsection could equally be written:

(define (make-combined-string name address)
  (string-append "Name=" name ":Address=" address))

This kind of procedure definition creates a procedure that requires exactly the expected number of arguments. There are two further forms of the lambda expression, which create a procedure that can accept a variable number of arguments:

(lambda (arg1 … . args) expression …)

(lambda args expression …)

The corresponding forms of the alternative define syntax are:

(define (name arg1 … . args) expression …)

(define (name . args) expression …)

For details on how these forms work, see Lambda.
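As a rough illustration of the variable-argument forms, the sketch below defines two hypothetical procedures, count-args and sum-at-least-one. In both, the name after the dot is bound to a list holding whatever extra arguments were supplied.

```scheme
;; Hypothetical procedures for illustration.

;; All arguments are collected into the list args.
(define (count-args . args)
  (length args))

(count-args)                 ; ⇒ 0
(count-args 'a 'b 'c)        ; ⇒ 3

;; One required argument, followed by any number of extra ones,
;; which are collected into the list others.
(define (sum-at-least-one x . others)
  (apply + x others))

(sum-at-least-one 10)        ; ⇒ 10
(sum-at-least-one 10 20 30)  ; ⇒ 60
```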

Prior to Guile 2.0, Guile provided an extension to define syntax that allowed you to nest the previous extension up to an arbitrary depth. This extension is no longer provided by default; it has instead been moved to Curried Definitions.

(It could be argued that the alternative define forms are rather confusing, especially for newcomers to the Scheme language, as they hide both the role of lambda and the fact that procedures are values that are stored in variables in the same way as any other kind of value. On the other hand, they are very convenient, and they are also a good example of another of Scheme’s powerful features: the ability to specify arbitrary syntactic transformations at run time, which can be applied to subsequently read input.)



3.3 Expressions and Evaluation

So far, we have met expressions that do things, such as the define expressions that create and initialize new variables, and we have also talked about expressions that have values, for example the value of the procedure invocation expression:

(string-append "/home" "/" "andrew")

but we haven’t yet been precise about what causes an expression like this procedure invocation to be reduced to its “value”, or how the processing of such expressions relates to the execution of a Scheme program as a whole.

This section clarifies what we mean by an expression’s value, by introducing the idea of evaluation. It discusses the side effects that evaluation can have, explains how each of the various types of Scheme expression is evaluated, and describes the behaviour and use of the Guile REPL as a mechanism for exploring evaluation. The section concludes with a very brief summary of Scheme’s common syntactic expressions.



3.3.1 Evaluating Expressions and Executing Programs

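In Scheme, the process of executing an expression is known as evaluation. Evaluation has two kinds of result: the value of the evaluated expression, and the side effects of the evaluation, which consist of any effects of evaluating the expression that are not represented by the value.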

Of the expressions that we have met so far, define and set! expressions have side effects — the creation or modification of a variable — but no value; lambda expressions have values — the newly constructed procedures — but no side effects; and procedure invocation expressions, in general, have either values, or side effects, or both.

It is tempting to try to define more intuitively what we mean by “value” and “side effects”, and what the difference between them is. In general, though, this is extremely difficult. It is also unnecessary; instead, we can quite happily define the behaviour of a Scheme program by specifying how Scheme executes a program as a whole, and then by describing the value and side effects of evaluation for each type of expression individually.

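So, some definitions. A Scheme program consists of a sequence of expressions, and a Scheme interpreter executes the program by evaluating these expressions in order, one by one. An expression can be a piece of literal data (such as the number 2.3 or the string "Hello world!"), a variable name, a procedure invocation expression, or one of the special syntactic expressions.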

The following subsections describe how each of these types of expression is evaluated.



3.3.1.1 Evaluating Literal Data

When a literal data expression is evaluated, the value of the expression is simply the value that the expression describes. The evaluation of a literal data expression has no side effects.

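So, for example, the value of the expression "abc" is the string value "abc", and the value of the expression 2.3 is the numeric value 2.3.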

For any data type which can be expressed literally like this, the syntax of the literal data expression for that data type — in other words, what you need to write in your code to indicate a literal value of that type — is known as the data type’s read syntax. This manual specifies the read syntax for each such data type in the section that describes that data type.

Some data types do not have a read syntax. Procedures, for example, cannot be expressed as literal data; they must be created using a lambda expression (see Creating a Procedure) or implicitly using the shorthand form of define (see Lambda Alternatives).



3.3.1.2 Evaluating a Variable Reference

When an expression that consists simply of a variable name is evaluated, the value of the expression is the value of the named variable. The evaluation of a variable reference expression has no side effects.

So, after

(define key "Paul Evans")

the value of the expression key is the string value "Paul Evans". If key is then modified by

(set! key 3.74)

the value of the expression key is the numeric value 3.74.

If there is no variable with the specified name, evaluation of the variable reference expression signals an error.



3.3.1.3 Evaluating a Procedure Invocation Expression

This is where evaluation starts getting interesting! As already noted, a procedure invocation expression has the form

(procedure [arg1 [arg2 …]])

where procedure must be an expression whose value, when evaluated, is a procedure.

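The evaluation of a procedure invocation expression like this proceeds by first evaluating individually the expressions procedure, arg1, arg2, and so on, and then calling the procedure that is the value of the procedure expression with the list of values obtained from the evaluations of arg1, arg2 etc. as its parameters.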

For a procedure defined in Scheme, “calling the procedure with the list of values as its parameters” means binding the values to the procedure’s formal parameters and then evaluating the sequence of expressions that make up the body of the procedure definition. The value of the procedure invocation expression is the value of the last evaluated expression in the procedure body. The side effects of calling the procedure are the combination of the side effects of the sequence of evaluations of expressions in the procedure body.

For a built-in procedure, the value and side-effects of calling the procedure are best described by that procedure’s documentation.

Note that the complete side effects of evaluating a procedure invocation expression consist not only of the side effects of the procedure call, but also of any side effects of the preceding evaluation of the expressions procedure, arg1, arg2, and so on.

To illustrate this, let’s look again at the procedure invocation expression:

(string-length (string-append "/home" "/" "andrew"))

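In the outermost expression, procedure is string-length and arg1 is (string-append "/home" "/" "andrew"). Evaluating procedure yields the procedure value stored in the variable string-length. Evaluating arg1, which is itself a procedure invocation expression, follows the same steps recursively and yields the string value "/home/andrew".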

In the evaluation of the outermost expression, the interpreter can now invoke the procedure value obtained from procedure with the value obtained from arg1 as its arguments. The resulting value is a numeric value that is the length of the argument string, which is 12.
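The remark above about the complete side effects of a procedure invocation can be sketched as follows. The helper counted and the variables call-count and total are hypothetical; counted returns its argument unchanged, and as a side effect records that it was called.

```scheme
;; Hypothetical helper: returns its argument unchanged, but counts
;; how many times it has been called (a side effect).
(define call-count 0)

(define (counted x)
  (set! call-count (+ call-count 1))
  x)

;; Both argument expressions are evaluated before + is called, so
;; the side effects of both evaluations have happened by the time
;; the sum is returned.
(define total (+ (counted 1) (counted 2)))

total         ; ⇒ 3
call-count    ; ⇒ 2
```

The procedure + itself has no side effects here; the change to call-count comes entirely from evaluating the argument expressions.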



3.3.1.4 Evaluating Special Syntactic Expressions

When a procedure invocation expression is evaluated, the procedure and all the argument expressions must be evaluated before the procedure can be invoked. Special syntactic expressions are special because they are able to manipulate their arguments in an unevaluated form, and can choose whether to evaluate any or all of the argument expressions.

Why is this needed? Consider a program fragment that asks the user whether or not to delete a file, and then deletes the file if the user answers yes.

(if (string=? (read-answer "Should I delete this file?")
              "yes")
    (delete-file file))

If the outermost (if …) expression here were a procedure invocation expression, the expression (delete-file file), whose side effect is to actually delete a file, would already have been evaluated before the if procedure even got invoked! Clearly this is no use: the whole point of an if expression is that the consequent expression is only evaluated if the condition of the if expression is “true”.

Therefore if must be special syntax, not a procedure. Other special syntaxes that we have already met are define, set! and lambda. define and set! are syntax because they need to know the variable name that is given as the first argument in a define or set! expression, not that variable’s value. lambda is syntax because it does not immediately evaluate the expressions that define the procedure body; instead it creates a procedure object that incorporates these expressions so that they can be evaluated in the future, when that procedure is invoked.

The rules for evaluating each special syntactic expression are specified individually for each special syntax. For a summary of standard special syntax, see Syntax Summary.
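To make the argument about if concrete, here is a minimal sketch that compares the real if syntax with an ordinary procedure that tries to imitate it. The names evaluated, note! and my-if are hypothetical; note! records a tag each time a branch expression is evaluated, standing in for a real side effect such as deleting a file.

```scheme
;; Hypothetical names for illustration: evaluated, note!, my-if.

;; Records which branch expressions actually get evaluated.
(define evaluated '())

(define (note! tag value)
  (set! evaluated (cons tag evaluated))
  value)

;; An attempt to write `if' as an ordinary procedure.
(define (my-if condition consequent alternative)
  (if condition consequent alternative))

;; With the procedure, both branch expressions are evaluated
;; before my-if is even called.
(define procedure-result (my-if #t (note! 'then 1) (note! 'else 2)))
(define evaluated-by-procedure (length evaluated))    ; 2

;; With the real special syntax, only the chosen branch is evaluated.
(set! evaluated '())
(define syntax-result (if #t (note! 'then 1) (note! 'else 2)))
evaluated                                             ; ⇒ (then)
```

Both forms produce the value 1, but only the special syntax avoids the unwanted side effect of the branch that was not chosen.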



3.3.2 Tail calls

Scheme is “properly tail recursive”, meaning that tail calls or recursions from certain contexts do not consume stack space or other resources and can therefore be used on arbitrarily large data or for an arbitrarily long calculation. Consider for example,

(define (foo n)
  (display n)
  (newline)
  (foo (1+ n)))

(foo 1)
-|
1
2
3
…

foo prints numbers indefinitely, starting from the given n. It is implemented by printing n and then recursing to itself to print n+1, and so on. This recursion is a tail call, because it is the last thing done, and in Scheme such tail calls can be made without limit.

Or consider a case where a value is returned, a version of the SRFI-1 last function (see SRFI-1 Selectors) returning the last element of a list,

(define (my-last lst)
  (if (null? (cdr lst))
      (car lst)
      (my-last (cdr lst))))

(my-last '(1 2 3)) ⇒ 3      

If the list has more than one element, my-last applies itself to the cdr. This recursion is a tail call: there is no code after it, and the return value is the return value from that call. In Scheme this can be used on an arbitrarily long list argument.
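A further sketch in the same spirit is an accumulating loop written as a tail call. The procedure sum-to is hypothetical: it adds the integers from n down to 1 onto an accumulator, and because the recursive call is the last thing done, it should run in constant stack space however large n is.

```scheme
;; Hypothetical procedure: adds the integers from n down to 1 onto acc.
;; The recursive call is the last thing done, so it is a tail call.
(define (sum-to n acc)
  (if (= n 0)
      acc
      (sum-to (- n 1) (+ acc n))))

(sum-to 10 0)          ; ⇒ 55
(sum-to 1000000 0)     ; ⇒ 500000500000
```

Had the procedure instead been written as (+ n (sum-to (- n 1))), the addition would still be pending after each recursive call returned, so the recursion would not be a tail call.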


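A proper tail call is only available from certain contexts, namely the following special form positions: the last expression of and, begin and or; the last expression in each clause of case and cond; the last result expression of do; the “true” and “false” leg expressions of if; and the last expression in the body of lambda, let, let*, letrec, let-syntax and letrec-syntax.

The core functions that make tail calls include apply (to the given procedure), call-with-current-continuation (to the procedure receiving the new continuation), call-with-values (to the values-receiving procedure) and eval (to evaluate the form).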


The above are just core functions and special forms. Tail calls in other modules are described with the relevant documentation, for example SRFI-1 any and every (see SRFI-1 Searching).

It will be noted there are a lot of places which could potentially be tail calls, for instance the last call in a for-each, but only those explicitly described are guaranteed.



3.3.3 Using the Guile REPL

If you start Guile without specifying a particular program for it to execute, Guile enters its standard Read Evaluate Print Loop — or REPL for short. In this mode, Guile repeatedly reads in the next Scheme expression that the user types, evaluates it, and prints the resulting value.

The REPL is a useful mechanism for exploring the evaluation behaviour described in the previous subsection. If you type string-append, for example, the REPL replies #<primitive-procedure string-append>, illustrating the relationship between the variable string-append and the procedure value stored in that variable.

In this manual, the notation ⇒ is used to mean “evaluates to”. Wherever you see an example of the form

expression ⇒ result

feel free to try it out yourself by typing expression into the REPL and checking that it gives the expected result.



3.3.4 Summary of Common Syntax

This subsection lists the most commonly used Scheme syntactic expressions, simply so that you will recognize common special syntax when you see it. For a full description of each of these syntaxes, follow the appropriate reference.

lambda (see Lambda) is used to construct procedure objects.

define (see Top Level) is used to create a new variable and set its initial value.

set! (see Top Level) is used to modify an existing variable’s value.

let, let* and letrec (see Local Bindings) create an inner lexical environment for the evaluation of a sequence of expressions, in which a specified set of local variables is bound to the values of a corresponding set of expressions. For an introduction to environments, see About Closure.

begin (see begin) executes a sequence of expressions in order and returns the value of the last expression. Note that this is not the same as a procedure which returns its last argument, because the evaluation of a procedure invocation expression does not guarantee to evaluate the arguments in order.

if and cond (see Conditionals) provide conditional evaluation of argument expressions depending on whether one or more conditions evaluate to “true” or “false”.

case (see Conditionals) provides conditional evaluation of argument expressions depending on whether a variable has one of a specified group of values.

and (see and or) executes a sequence of expressions in order until either there are no expressions left, or one of them evaluates to “false”.

or (see and or) executes a sequence of expressions in order until either there are no expressions left, or one of them evaluates to “true”.
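For orientation only, the following sketch exercises several of the syntaxes listed above in one place. The procedure describe and its size categories are hypothetical, invented for this illustration.

```scheme
;; begin returns the value of its last expression.
(begin 1 2 3)                       ; ⇒ 3

;; and stops at the first false value; or stops at the first true one.
(and 1 2 #f 3)                      ; ⇒ #f
(or #f 'found 'ignored)             ; ⇒ found

;; let* binds variables in sequence; cond and case select a branch.
;; (describe is a hypothetical procedure.)
(define (describe n)
  (let* ((doubled (* 2 n))
         (size (cond ((< doubled 10) 'small)
                     ((< doubled 100) 'medium)
                     (else 'large))))
    (case size
      ((small medium) (list n size))
      (else (list n 'too-big)))))

(describe 3)                        ; ⇒ (3 small)
(describe 70)                       ; ⇒ (70 too-big)
```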



3.4 The Concept of Closure

The concept of closure is the idea that a lambda expression “captures” the variable bindings that are in lexical scope at the point where the lambda expression occurs. The procedure created by the lambda expression can refer to and mutate the captured bindings, and the values of those bindings persist between procedure calls.

This section explains and explores the various parts of this idea in more detail.



3.4.1 Names, Locations, Values and Environments

We said earlier that a variable name in a Scheme program is associated with a location in which any kind of Scheme value may be stored. (Incidentally, the term “vcell” is often used in Lisp and Scheme circles as an alternative to “location”.) Thus part of what we mean when we talk about “creating a variable” is in fact establishing an association between a name, or identifier, that is used by the Scheme program code, and the variable location to which that name refers. Although the value that is stored in that location may change, the location to which a given name refers is always the same.

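We can illustrate this by breaking down the operation of the define syntax into three parts: define creates a new location, establishes an association between that location and the name specified as the first argument of the define expression, and stores in that location the value obtained by evaluating the second argument of the define expression.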

A collection of associations between names and locations is called an environment. When you create a top level variable in a program using define, the name-location association for that variable is added to the “top level” environment. The “top level” environment also includes name-location associations for all the procedures that are supplied by standard Scheme.

It is also possible to create environments other than the top level one, and to create variable bindings, or name-location associations, in those environments. This ability is a key ingredient in the concept of closure; the next subsection shows how it is done.



3.4.2 Local Variables and Environments

We have seen how to create top level variables using the define syntax (see Definition). It is often useful to create variables that are more limited in their scope, typically as part of a procedure body. In Scheme, this is done using the let syntax, or one of its modified forms let* and letrec. These syntaxes are described in full later in the manual (see Local Bindings). Here our purpose is to illustrate their use just enough that we can see how local variables work.

For example, the following code uses a local variable s to simplify the computation of the area of a triangle given the lengths of its three sides.

(define a 5.3)
(define b 4.7)
(define c 2.8)

(define area
  (let ((s (/ (+ a b c) 2)))
    (sqrt (* s (- s a) (- s b) (- s c)))))

The effect of the let expression is to create a new environment and, within this environment, an association between the name s and a new location whose initial value is obtained by evaluating (/ (+ a b c) 2). The expressions in the body of the let, namely (sqrt (* s (- s a) (- s b) (- s c))), are then evaluated in the context of the new environment, and the value of the last expression evaluated becomes the value of the whole let expression, and therefore the value of the variable area.
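The same computation can be sketched as a procedure, so that the sides are passed as arguments rather than taken from top level variables. The name triangle-area is hypothetical. A 3-4-5 right triangle is used as a check because its area is easy to verify by hand: s is 6, and the square root of 6 × 3 × 2 × 1 = 36 is 6.

```scheme
;; Hypothetical procedure wrapping the same computation.
(define (triangle-area a b c)
  (let ((s (/ (+ a b c) 2)))        ; s exists only inside this let
    (sqrt (* s (- s a) (- s b) (- s c)))))

;; For a 3-4-5 right triangle, s is 6 and the area is numerically 6.
(triangle-area 3 4 5)
```

Outside the let body the name s has no binding created by this code, so a reference to s at top level would refer to whatever top level variable of that name exists, if any.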



3.4.3 Environment Chaining

In the example of the previous subsection, we glossed over an important point. The body of the let expression in that example refers not only to the local variable s, but also to the top level variables a, b, c and sqrt. (sqrt is the standard Scheme procedure for calculating a square root.) If the body of the let expression is evaluated in the context of the local let environment, how does the evaluation get at the values of these top level variables?

The answer is that the local environment created by a let expression automatically has a reference to its containing environment — in this case the top level environment — and that the Scheme interpreter automatically looks for a variable binding in the containing environment if it doesn’t find one in the local environment. More generally, every environment except for the top level one has a reference to its containing environment, and the interpreter keeps searching back up the chain of environments — from most local to top level — until it either finds a variable binding for the required identifier or exhausts the chain.

This description also determines what happens when there is more than one variable binding with the same name. Suppose, continuing the example of the previous subsection, that there was also a pre-existing top level variable s created by the expression:

(define s "Some beans, my lord!")

Then both the top level environment and the local let environment would contain bindings for the name s. When evaluating code within the let body, the interpreter looks first in the local let environment, and so finds the binding for s created by the let syntax. Even though this environment has a reference to the top level environment, which also has a binding for s, the interpreter doesn’t get as far as looking there. When evaluating code outside the let body, the interpreter looks up variable names in the top level environment, so the name s refers to the top level variable.

Within the let body, the binding for s in the local environment is said to shadow the binding for s in the top level environment.
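Shadowing can be observed directly with a short sketch. The variable inner-value and the local value 42 are hypothetical, chosen only to make the two bindings easy to tell apart.

```scheme
(define s "Some beans, my lord!")

;; Inside the let body, s refers to the local binding.
(define inner-value
  (let ((s 42))
    (+ s 1)))

inner-value     ; ⇒ 43

;; Outside the let body, s still refers to the top level variable,
;; which the let did not modify.
s               ; ⇒ "Some beans, my lord!"
```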



3.4.4 Lexical Scope

The rules that we have just been describing are the details of how Scheme implements “lexical scoping”. This subsection takes a brief diversion to explain what lexical scope means in general and to present an example of non-lexical scoping.

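“Lexical scope” in general is the idea that an identifier at a particular place in a program always refers to the same variable location (where “always” means “every time that the containing expression is executed”), and that the variable location to which it refers can be determined by static examination of the source code context in which that identifier appears, without having to consider the flow of execution through the program as a whole.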

In practice, lexical scoping is the norm for most programming languages, and probably corresponds to what you would intuitively consider to be “normal”. You may even be wondering how the situation could possibly — and usefully — be otherwise. To demonstrate that another kind of scoping is possible, therefore, and to compare it against lexical scoping, the following subsection presents an example of non-lexical scoping and examines in detail how its behavior differs from the corresponding lexically scoped code.



3.4.4.1 An Example of Non-Lexical Scoping

To demonstrate that non-lexical scoping does exist and can be useful, we present the following example from Emacs Lisp, which is a “dynamically scoped” language.

(defvar currency-abbreviation "USD")

(defun currency-string (units hundredths)
  (concat currency-abbreviation
          (number-to-string units)
          "."
          (number-to-string hundredths)))

(defun french-currency-string (units hundredths)
  (let ((currency-abbreviation "FRF"))
    (currency-string units hundredths)))

The question to focus on here is: what does the identifier currency-abbreviation refer to in the currency-string function? The answer, in Emacs Lisp, is that all variable bindings go onto a single stack, and that currency-abbreviation refers to the topmost binding from that stack which has the name “currency-abbreviation”. The binding that is created by the defvar form, to the value "USD", is only relevant if none of the code that calls currency-string rebinds the name “currency-abbreviation” in the meanwhile.

The second function french-currency-string works precisely by taking advantage of this behaviour. It creates a new binding for the name “currency-abbreviation” which overrides the one established by the defvar form.

;; Note!  This is Emacs Lisp evaluation, not Scheme!
(french-currency-string 33 44)
⇒
"FRF33.44"

Now let’s look at the corresponding, lexically scoped Scheme code:

(define currency-abbreviation "USD")

(define (currency-string units hundredths)
  (string-append currency-abbreviation
                 (number->string units)
                 "."
                 (number->string hundredths)))

(define (french-currency-string units hundredths)
  (let ((currency-abbreviation "FRF"))
    (currency-string units hundredths)))

According to the rules of lexical scoping, the currency-abbreviation in currency-string refers to the variable location in the innermost environment at that point in the code which has a binding for currency-abbreviation, which is the variable location in the top level environment created by the preceding (define currency-abbreviation …) expression.

In Scheme, therefore, the french-currency-string procedure does not work as intended. The variable binding that it creates for “currency-abbreviation” is purely local to the code that forms the body of the let expression. Since this code doesn’t directly use the name “currency-abbreviation” at all, the binding is pointless.

(french-currency-string 33 44)
⇒
"USD33.44"

This raises the question of how the Emacs Lisp behaviour can be implemented in Scheme. In general, this is a design question whose answer depends upon the problem that is being addressed. In this case, the best answer may be that currency-string should be redesigned so that it can take an optional third argument. This third argument, if supplied, is interpreted as a currency abbreviation that overrides the default.

It is possible to change french-currency-string so that it mostly works without changing currency-string, but the fix is inelegant, and susceptible to interrupts that could leave the currency-abbreviation variable in the wrong state:

(define (french-currency-string units hundredths)
  (set! currency-abbreviation "FRF")
  (let ((result (currency-string units hundredths)))
    (set! currency-abbreviation "USD")
    result))

The key point here is that the code does not create any local binding for the identifier currency-abbreviation, so all occurrences of this identifier refer to the top level variable.
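The redesign suggested earlier, in which currency-string takes an optional third argument, might be sketched as follows. Receiving the optional argument through a rest argument named maybe-abbreviation is an assumption of this sketch; other ways of declaring optional arguments could serve equally well.

```scheme
(define currency-abbreviation "USD")

;; The optional third argument is received through a rest argument
;; (an assumption of this sketch; maybe-abbreviation is a
;; hypothetical name).
(define (currency-string units hundredths . maybe-abbreviation)
  (string-append (if (null? maybe-abbreviation)
                     currency-abbreviation
                     (car maybe-abbreviation))
                 (number->string units)
                 "."
                 (number->string hundredths)))

(define (french-currency-string units hundredths)
  (currency-string units hundredths "FRF"))

(currency-string 33 44)            ; ⇒ "USD33.44"
(french-currency-string 33 44)     ; ⇒ "FRF33.44"
```

Unlike the set!-based version, this never modifies the top level variable, so there is no window in which currency-abbreviation holds the wrong value.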



3.4.5 Closure

Consider a let expression that doesn’t contain any lambdas:

(let ((s (/ (+ a b c) 2)))
  (sqrt (* s (- s a) (- s b) (- s c))))

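When the Scheme interpreter evaluates this, it creates a new environment with a reference to the environment that was current when it encountered the let; creates a variable binding for s in the new environment, with value given by (/ (+ a b c) 2); evaluates the expression in the body of the let in the context of the new local environment, and remembers the resulting value; forgets the local environment; and continues evaluating the expression that contained the let, using the remembered value as the value of the let expression, in the context of the containing environment.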

After the let expression has been evaluated, the local environment that was created is simply forgotten, and there is no longer any way to access the binding that was created in this environment. If the same code is evaluated again, it will follow the same steps again, creating a second new local environment that has no connection with the first, and then forgetting this one as well.

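If the let body contains a lambda expression, however, the local environment is not forgotten. Instead, it becomes associated with the procedure that is created by the lambda expression, and is reinstated every time that that procedure is called. In detail, this works as follows. When the Scheme interpreter evaluates a lambda expression to create a procedure object, it stores the current environment as part of the procedure definition. Then, whenever that procedure is called, the interpreter reinstates the environment that is stored in the procedure definition and evaluates the procedure body within the context of that environment.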

The result is that the procedure body is always evaluated in the context of the environment that was current when the procedure was created.

This is what is meant by closure. The next few subsections present examples that explore the usefulness of this concept.
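Before the longer examples, here is a minimal sketch of the idea. The names make-adder, add-five and add-ten are hypothetical. Each call to make-adder creates a new environment binding n, and the lambda expression evaluated inside it produces a procedure that remembers that environment.

```scheme
;; Hypothetical procedure: returns a new procedure that adds n
;; to its argument.  The returned procedure captures the binding
;; of n that was current when the lambda expression was evaluated.
(define (make-adder n)
  (lambda (x) (+ x n)))

(define add-five (make-adder 5))
(define add-ten (make-adder 10))

(add-five 1)     ; ⇒ 6
(add-ten 1)      ; ⇒ 11
```

The two procedures share the same body but were created in different environments, which is why they give different results for the same argument.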



3.4.6 Example 1: A Serial Number Generator

This example uses closure to create a procedure with a variable binding that is private to the procedure, like a local variable, but whose value persists between procedure calls.

(define (make-serial-number-generator)
  (let ((current-serial-number 0))
    (lambda ()
      (set! current-serial-number (+ current-serial-number 1))
      current-serial-number)))

(define entry-sn-generator (make-serial-number-generator))

(entry-sn-generator)
⇒
1

(entry-sn-generator)
⇒
2

When make-serial-number-generator is called, it creates a local environment with a binding for current-serial-number whose initial value is 0, then, within this environment, creates a procedure. The local environment is stored within the created procedure object and so persists for the lifetime of the created procedure.

Every time the created procedure is invoked, it increments the value of the current-serial-number binding in the captured environment and then returns the current value.

Note that make-serial-number-generator can be called again to create a second serial number generator that is independent of the first. Every new invocation of make-serial-number-generator creates a new local let environment and returns a new procedure object with an association to this environment.
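The independence of separately created generators can be checked with a short sketch. The generator names order-sn-generator and invoice-sn-generator, and the variables holding their results, are hypothetical.

```scheme
(define (make-serial-number-generator)
  (let ((current-serial-number 0))
    (lambda ()
      (set! current-serial-number (+ current-serial-number 1))
      current-serial-number)))

;; Two generators, each with its own captured environment.
(define order-sn-generator (make-serial-number-generator))
(define invoice-sn-generator (make-serial-number-generator))

(define order-1 (order-sn-generator))       ; 1
(define order-2 (order-sn-generator))       ; 2
(define invoice-1 (invoice-sn-generator))   ; 1, unaffected by the calls above

(list order-1 order-2 invoice-1)            ; ⇒ (1 2 1)
```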



3.4.7 Example 2: A Shared Persistent Variable

This example uses closure to create two procedures, get-balance and deposit, that both refer to the same captured local environment so that they can both access the balance variable binding inside that environment. The value of this variable binding persists between calls to either procedure.

Note that the captured balance variable binding is private to these two procedures: it is not directly accessible to any other code. It can only be accessed indirectly via get-balance or deposit, as illustrated by the withdraw procedure.

(define get-balance #f)
(define deposit #f)

(let ((balance 0))
  (set! get-balance
        (lambda ()
          balance))
  (set! deposit
        (lambda (amount)
          (set! balance (+ balance amount))
          balance)))

(define (withdraw amount)
  (deposit (- amount)))

(get-balance)
⇒
0

(deposit 50)
⇒
50

(withdraw 75)
⇒
-25

An important detail here is that the get-balance and deposit variables must be set up by defining them at top level with define and then assigning their values with set! inside the let body. Using define within the let body would not work: this would create variable bindings within the local let environment that would not be accessible at top level.
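
The following sketch illustrates why define inside the let body is not enough. The name peek-balance is hypothetical, and defined? is the Guile procedure that tests whether a symbol is bound in the current module.

```scheme
;; peek-balance is defined inside the let body, so its binding is
;; local to that body.
(define result
  (let ((balance 0))
    (define (peek-balance)
      balance)
    (peek-balance)))

result
;; ⇒ 0

;; At top level there is no binding for peek-balance.
(defined? 'peek-balance)
;; ⇒ #f
```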



3.4.8 Example 3: The Callback Closure Problem

A frequently used programming model for library code is to allow an application to register a callback function for the library to call when some particular event occurs. It is often useful for the application to make several such registrations using the same callback function, for example if several similar library events can be handled using the same application code. The need then arises to distinguish the callback function calls that are associated with one callback registration from those that are associated with different callback registrations.

In languages without the ability to create functions dynamically, this problem is usually solved by passing a user_data parameter on the registration call, and including the value of this parameter as one of the parameters on the callback function. Here is an example of declarations using this solution in C:

typedef void (event_handler_t) (int event_type,
                                void *user_data);

void register_callback (int event_type,
                        event_handler_t *handler,
                        void *user_data);

In Scheme, closure can be used to achieve the same functionality without requiring the library code to store a user-data value for each callback registration.

;; In the library:

(define (register-callback event-type handler-proc)
  …)

;; In the application:

(define (make-handler event-type user-data)
  (lambda ()
    …
    <code referencing event-type and user-data>
    …))

(register-callback event-type
                   (make-handler event-type …))

As far as the library is concerned, handler-proc is a procedure with no arguments, and all the library has to do is call it when the appropriate event occurs. From the application’s point of view, though, the handler procedure has used closure to capture an environment that includes all the context that the handler code needs — event-type and user-data — to handle the event correctly.
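
To make the outline above concrete, here is a small runnable sketch. Everything in it is an assumption made for illustration: the handler table, fire-event, event-log and the event names do not come from any real library.

```scheme
;; Hypothetical library: a list of (event-type . handler-proc) pairs.
(define handlers '())

(define (register-callback event-type handler-proc)
  (set! handlers (cons (cons event-type handler-proc) handlers)))

;; Call every handler registered for event-type, with no arguments.
(define (fire-event event-type)
  (for-each (lambda (entry)
              (if (eq? (car entry) event-type)
                  ((cdr entry))))
            handlers))

;; Hypothetical application: each handler records what it captured.
(define event-log '())

(define (make-handler event-type user-data)
  (lambda ()
    (set! event-log (cons (list event-type user-data) event-log))))

(register-callback 'click (make-handler 'click "button-1"))
(register-callback 'click (make-handler 'click "button-2"))
(register-callback 'key   (make-handler 'key "editor"))

(fire-event 'click)

event-log
;; ⇒ ((click "button-1") (click "button-2"))
```

The library never sees user-data, yet each handler still knows which registration it belongs to, because that value lives in the handler's captured environment.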



3.4.9 Example 4: Object Orientation

Closure is the capture of an environment, containing persistent variable bindings, within the definition of a procedure or a set of related procedures. This is rather similar to the idea in some object oriented languages of encapsulating a set of related data variables inside an “object”, together with a set of “methods” that operate on the encapsulated data. The following example shows how closure can be used to emulate the ideas of objects, methods and encapsulation in Scheme.

(define (make-account)
  (let ((balance 0))
    (define (get-balance)
      balance)
    (define (deposit amount)
      (set! balance (+ balance amount))
      balance)
    (define (withdraw amount)
      (deposit (- amount)))

    (lambda args
      (apply
        (case (car args)
          ((get-balance) get-balance)
          ((deposit) deposit)
          ((withdraw) withdraw)
          (else (error "Invalid method!")))
        (cdr args)))))

Each call to make-account creates and returns a new procedure, created by the expression in the example code that begins “(lambda args”.

(define my-account (make-account))

my-account
⇒
#<procedure args>

This procedure acts as an account object with methods get-balance, deposit and withdraw. To apply one of the methods to the account, you call the procedure with a symbol indicating the required method as the first parameter, followed by any other parameters that are required by that method.

(my-account 'get-balance)
⇒
0

(my-account 'withdraw 5)
⇒
-5

(my-account 'deposit 396)
⇒
391

(my-account 'get-balance)
⇒
391

Note how, in this example, both the current balance and the helper procedures get-balance, deposit and withdraw, used to implement the guts of the account object’s methods, are all stored in variable bindings within the private local environment captured by the lambda expression that creates the account object procedure.
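
As with the serial number generator, each account object has its own captured environment. The sketch below repeats the definition of make-account so that it stands alone; account-a and account-b are illustrative names.

```scheme
(define (make-account)
  (let ((balance 0))
    (define (get-balance)
      balance)
    (define (deposit amount)
      (set! balance (+ balance amount))
      balance)
    (define (withdraw amount)
      (deposit (- amount)))

    (lambda args
      (apply
        (case (car args)
          ((get-balance) get-balance)
          ((deposit) deposit)
          ((withdraw) withdraw)
          (else (error "Invalid method!")))
        (cdr args)))))

;; Two accounts, each with a private balance binding.
(define account-a (make-account))
(define account-b (make-account))

(account-a 'deposit 100)
;; ⇒ 100
(account-b 'deposit 5)
;; ⇒ 5
(account-a 'withdraw 30)
;; ⇒ 70
(account-b 'get-balance)
;; ⇒ 5
```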



3.5 Further Reading



4 Programming in Scheme

Guile’s core language is Scheme, and a lot can be achieved simply by using Guile to write and run Scheme programs — as opposed to having to dive into C code. In this part of the manual, we explain how to use Guile in this mode, and describe the tools that Guile provides to help you with script writing, debugging, and packaging your programs for distribution.

For detailed reference information on the variables, functions, and so on that make up Guile’s application programming interface (API), see API Reference.



4.1 Guile’s Implementation of Scheme

Guile’s core language is Scheme, which is specified and described in the series of reports known as RnRS. RnRS is shorthand for the Revised^n Report on the Algorithmic Language Scheme. Guile complies fully with R5RS (see Introduction in R5RS), and implements some aspects of R6RS.

Guile also has many extensions that go beyond these reports. Some of the areas where Guile extends R5RS are:



4.2 Invoking Guile

Many features of Guile depend on and can be changed by information that the user provides either before or when Guile is started. Below is a description of what information to provide and how to provide it.



4.2.1 Command-line Options

Here we describe Guile’s command-line processing in detail. Guile processes its arguments from left to right, recognizing the switches described below. For examples, see Scripting Examples.

script arg...
-s script arg...

By default, Guile will read a file named on the command line as a script. Any command-line arguments arg... following script become the script’s arguments; the command-line function returns a list of strings of the form (script arg...).

It is possible to name a file using a leading hyphen, for example, -myfile.scm. In this case, the file name must be preceded by -s to tell Guile that a (script) file is being named.

Scripts are read and evaluated as Scheme source code just as the load function would. After loading script, Guile exits.

-c expr arg...

Evaluate expr as Scheme code, and then exit. Any command-line arguments arg... following expr become command-line arguments; the command-line function returns a list of strings of the form (guile arg...), where guile is the path of the Guile executable.

-- arg...

Run interactively, prompting the user for expressions and evaluating them. Any command-line arguments arg... following the -- become command-line arguments for the interactive session; the command-line function returns a list of strings of the form (guile arg...), where guile is the path of the Guile executable.

-L directory

Add directory to the front of Guile’s module load path. The given directories are searched in the order given on the command line and before any directories in the GUILE_LOAD_PATH environment variable. Paths added here are not in effect during execution of the user’s .guile file.

-C directory

Like -L, but adjusts the load path for compiled files.

-x extension

Add extension to the front of Guile’s load extension list (see %load-extensions). The specified extensions are tried in the order given on the command line, and before the default load extensions. Extensions added here are not in effect during execution of the user’s .guile file.

-l file

Load Scheme source code from file, and continue processing the command line.

-e function

Make function the entry point of the script. After loading the script file (with -s) or evaluating the expression (with -c), apply function to a list containing the program name and the command-line arguments—the list provided by the command-line function.

A -e switch can appear anywhere in the argument list, but Guile always invokes the function as the last action it performs. This is weird, but because of the way script invocation works under POSIX, the -s option must always come last in the list.

The function is most often a simple symbol that names a function that is defined in the script. It can also be of the form (@ module-name symbol), and in that case, the symbol is looked up in the module named module-name.
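
As a sketch of the module-qualified form, a script might define its entry point inside a module. The module name (my-tool) and the file name my-tool.scm are hypothetical.

```scheme
;; Contents of a hypothetical script file my-tool.scm.
(define-module (my-tool)
  #:export (main))

;; args is the list that the command-line function would return:
;; the script name followed by the script's arguments.
(define (main args)
  (for-each (lambda (arg)
              (display arg)
              (newline))
            (cdr args)))

;; A possible invocation, under the assumptions above:
;;   guile -e '(@ (my-tool) main)' -s my-tool.scm a b
```

With such an invocation, Guile would load the script and then apply the main procedure found in the (my-tool) module to the argument list.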

For compatibility with some versions of Guile 1.4, you can also use the form (symbol ...) (that is, a list of only symbols that doesn’t start with @), which is equivalent to (@ (symbol ...) main), or (symbol ...) symbol (that is, a list of only symbols followed by a symbol), which is equivalent to (@ (symbol ...) symbol). We recommend using the equivalent forms directly since they correspond to the (@ ...) read syntax that can be used in normal code. See Using Guile Modules and Scripting Examples.

-ds

Treat a final -s option as if it occurred at this point in the command line; load the script here.

This switch is necessary because, although the POSIX script invocation mechanism effectively requires the -s option to appear last, the programmer may well want to run the script before other actions requested on the command line. For examples, see Scripting Examples.

\

Read more command-line arguments, starting from the second line of the script file. See The Meta Switch.

--use-srfi=list

The option --use-srfi expects a comma-separated list of numbers, each representing a SRFI module to be loaded into the interpreter before evaluating a script file or starting the REPL. Additionally, the feature identifier for the loaded SRFIs is recognized by the procedure cond-expand when this option is used.
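
For comparison, a program can also request a SRFI module itself. The sketch below assumes that loading (srfi srfi-8) with use-modules makes the same receive syntax available that --use-srfi=8 would provide; it does not show how the option is implemented.

```scheme
;; Load SRFI-8 explicitly instead of relying on --use-srfi=8.
(use-modules (srfi srfi-8))

;; receive binds the multiple values produced by its second operand.
(receive (first . rest)
    (values 1 2 3)
  (list first rest))
;; ⇒ (1 (2 3))
```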

Here is an example that loads the modules SRFI-8 (‘receive’) and SRFI-13 (‘string library’) before the Guile interpreter is started:

guile --use-srfi=8,13

--debug

Start with the debugging virtual machine (VM) engine. Using the debugging VM will enable support for VM hooks, which are needed for tracing, breakpoints, and accurate call counts when profiling. The debugging VM is slower than the regular VM, though, by about ten percent. See VM Hooks, for more information.

By default, the debugging VM engine is only used when entering an interactive session. When executing a script with -s or -c, the normal, faster VM is used by default.

--no-debug

Do not use the debugging VM engine, even when entering an interactive session.

Note that, despite the name, Guile running with --no-debug does support the usual debugging facilities, such as printing a detailed backtrace upon error. The only difference with --debug is lack of support for VM hooks and the facilities that build upon it (see above).

-q

Do not load the initialization file, .guile. This option only has an effect when running interactively; running scripts does not load the .guile file. See Init File.

--listen[=p]

While this program runs, listen on a local port or a path for REPL clients. If p starts with a number, it is assumed to be a local port on which to listen. If it starts with a forward slash, it is assumed to be a path to a UNIX domain socket on which to listen.

If p is not given, the default is local port 37146. If you look at it upside down, it almost spells “Guile”. If you have netcat installed, you should be able to run nc localhost 37146 and get a Guile prompt. Alternatively, you can fire up Emacs and connect to the process; see Using Guile in Emacs for more details.

Note that opening a port allows anyone who can connect to that port—in the TCP case, any local user—to do anything Guile can do, as the user that the Guile process is running as. Do not use --listen on multi-user machines. Of course, if you do not pass --listen to Guile, no port will be opened.

That said, --listen is great for interactive debugging and development.

--auto-compile

Compile source files automatically (default behavior).

--fresh-auto-compile

Treat the auto-compilation cache as invalid, forcing recompilation.

--no-auto-compile

Disable automatic source file compilation.

--language=lang

For the remainder of the command line arguments, assume that files mentioned with -l and expressions passed with -c are written in lang. lang must be the name of one of the languages supported by the compiler (see Compiler Tower). When run interactively, set the REPL’s language to lang (see Using Guile Interactively).

The default language is scheme; other interesting values include elisp (for Emacs Lisp), and ecmascript.

The example below shows the evaluation of expressions in Scheme, Emacs Lisp, and ECMAScript:

guile -c "(apply + '(1 2))"
guile --language=elisp -c "(= (funcall (symbol-function '+) 1 2) 3)"
guile --language=ecmascript -c '(function (x) { return x * x; })(2);'

To load a file written in Scheme and one written in Emacs Lisp, and then start a Scheme REPL, type:

guile -l foo.scm --language=elisp -l foo.el --language=scheme

-h, --help

Display help on invoking Guile, and then exit.

-v, --version

Display the current version of Guile, and then exit.



4.2.2 Environment Variables

The environment is a feature of the operating system; it consists of a collection of variables with names and values. Each variable is called an environment variable (or, sometimes, a “shell variable”); environment variable names are case-sensitive, and it is conventional to use upper-case letters only. The values are all text strings, even those that are written as numerals. (Note that here we are referring to names and values that are defined in the operating system shell from which Guile is invoked. This is not the same as a Scheme environment that is defined within a running instance of Guile. For a description of Scheme environments, see About Environments.)

How to set environment variables before starting Guile depends on the operating system and, especially, the shell that you are using. For example, here is how to tell Guile to provide detailed warning messages about deprecated features by setting GUILE_WARN_DEPRECATED using Bash:

$ export GUILE_WARN_DEPRECATED="detailed"
$ guile

Or, detailed warnings can be turned on for a single invocation using:

$ env GUILE_WARN_DEPRECATED="detailed" guile

If you wish to retrieve or change the value of the shell environment variables that affect the run-time behavior of Guile from within a running instance of Guile, see Runtime Environment.
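
As a brief sketch, the getenv and setenv procedures can be used to read and change such a variable in the running process. Whether a change made this late still influences Guile's own behavior depends on when Guile consults the variable, which this example does not attempt to show.

```scheme
;; getenv returns the value as a string, or #f if the variable is unset.
(define previous (getenv "GUILE_WARN_DEPRECATED"))

;; Change the value in this process's environment.
(setenv "GUILE_WARN_DEPRECATED" "detailed")

(getenv "GUILE_WARN_DEPRECATED")
;; ⇒ "detailed"
```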

Here are the environment variables that affect the run-time behavior of Guile:

GUILE_AUTO_COMPILE

This is a flag that can be used to tell Guile whether or not to compile Scheme source files automatically. Starting with Guile 2.0, Scheme source files will be compiled automatically, by default.

If a compiled (.go) file corresponding to a .scm file is not found or is not newer than the .scm file, the .scm file will be compiled on the fly, and the resulting .go file stored away. An advisory note will be printed on the console.

Compiled files will be stored in the directory $XDG_CACHE_HOME/guile/ccache, where XDG_CACHE_HOME defaults to the directory $HOME/.cache. This directory will be created if it does not already exist.

Note that this mechanism depends on the timestamp of the .go file being newer than that of the .scm file; if the .scm or .go files are moved after installation, care should be taken to preserve their original timestamps.

Set GUILE_AUTO_COMPILE to zero (0), to prevent Scheme files from being compiled automatically. Set this variable to “fresh” to tell Guile to compile Scheme files whether they are newer than the compiled files or not.

See Compilation.

GUILE_HISTORY

This variable names the file that holds the Guile REPL command history. You can specify a different history file by setting this environment variable. By default, the history file is $HOME/.guile_history.

GUILE_INSTALL_LOCALE

This is a flag that can be used to tell Guile whether or not to install the current locale at startup, via a call to (setlocale LC_ALL ""). See Locales, for more information on locales.

You may explicitly indicate that you do not want to install the locale by setting GUILE_INSTALL_LOCALE to 0, or explicitly enable it by setting the variable to 1.

Usually, installing the current locale is the right thing to do. It allows Guile to correctly parse and print strings with non-ASCII characters. However, for compatibility with previous Guile 2.0 releases, this option is off by default. The next stable release series of Guile (the 2.2 series) will install locales by default.

GUILE_STACK_SIZE

Guile currently has a limited stack size for Scheme computations. Attempting to call too many nested functions will signal an error. This is good to detect infinite recursion, but sometimes the limit is reached for normal computations. This environment variable, if set to a positive integer, specifies the number of Scheme value slots to allocate for the stack.

In the future we will implement stacks that can grow and shrink, but for now this hack will have to do.

GUILE_LOAD_COMPILED_PATH

This variable may be used to augment the path that is searched for compiled Scheme files (.go files) when loading. Its value should be a colon-separated list of directories. If it contains the special path component ... (ellipsis), then the default path is put in place of the ellipsis, otherwise the default path is placed at the end. The result is stored in %load-compiled-path (see Load Paths).

Here is an example using the Bash shell that adds the current directory, ., and the relative directory ../my-library to %load-compiled-path:

$ export GUILE_LOAD_COMPILED_PATH=".:../my-library"
$ guile -c '(display %load-compiled-path) (newline)'
(. ../my-library /usr/local/lib/guile/2.0/ccache)

GUILE_LOAD_PATH

This variable may be used to augment the path that is searched for Scheme files when loading. Its value should be a colon-separated list of directories. If it contains the special path component ... (ellipsis), then the default path is put in place of the ellipsis, otherwise the default path is placed at the end. The result is stored in %load-path (see Load Paths).

Here is an example using the Bash shell that prepends the current directory to %load-path, and adds the relative directory ../srfi to the end:

$ env GUILE_LOAD_PATH=".:...:../srfi" \
guile -c '(display %load-path) (newline)'
(. /usr/local/share/guile/2.0 \
/usr/local/share/guile/site/2.0 \
/usr/local/share/guile/site \
/usr/local/share/guile \
../srfi)

(Note: The line breaks, above, are for documentation purposes only, and not required in the actual example.)
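
The ellipsis rule described above can be sketched in Scheme as follows. The procedure expand-search-path is a hypothetical helper written for this illustration; it is not the code that Guile uses, and the default path passed to it is just sample data.

```scheme
(use-modules (srfi srfi-1))   ; for append-map

;; expand-search-path is a hypothetical helper, not a Guile procedure.
(define (expand-search-path value default-path)
  (let ((components (string-split value #\:)))
    (if (member "..." components)
        ;; Splice the default path in place of each ellipsis component.
        (append-map (lambda (component)
                      (if (string=? component "...")
                          default-path
                          (list component)))
                    components)
        ;; No ellipsis: the default path is placed at the end.
        (append components default-path))))

(expand-search-path ".:...:../srfi" '("/usr/local/share/guile/2.0"))
;; ⇒ ("." "/usr/local/share/guile/2.0" "../srfi")

(expand-search-path ".:../my-library" '("/usr/local/share/guile/2.0"))
;; ⇒ ("." "../my-library" "/usr/local/share/guile/2.0")
```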

GUILE_WARN_DEPRECATED

As Guile evolves, some features will be eliminated or replaced by newer features. To help users migrate their code as this evolution occurs, Guile will issue warning messages about code that uses features that have been marked for eventual elimination. GUILE_WARN_DEPRECATED can be set to “no” to tell Guile not to display these warning messages, or set to “detailed” to tell Guile to display more lengthy messages describing the warning. See Deprecation.

HOME

Guile uses the environment variable HOME, the name of your home directory, to locate various files, such as .guile or .guile_history.



4.3 Guile Scripting

Like AWK, Perl, or any shell, Guile can interpret script files. A Guile script is simply a file of Scheme code with some extra information at the beginning which tells the operating system how to invoke Guile, and then tells Guile how to handle the Scheme code.



4.3.1 The Top of a Script File

The first line of a Guile script must tell the operating system to use Guile to evaluate the script, and then tell Guile how to go about doing that. Here is the simplest case:

Guile reads the program, evaluating expressions in the order that they appear. Upon reaching the end of the file, Guile exits.



4.3.2 The Meta Switch

Guile’s command-line switches allow the programmer to describe reasonably complicated actions in scripts. Unfortunately, the POSIX script invocation mechanism only allows one argument to appear on the ‘#!’ line after the path to the Guile executable, and imposes arbitrary limits on that argument’s length. Suppose you wrote a script starting like this:

#!/usr/local/bin/guile -e main -s
!#
(define (main args)
  (map (lambda (arg) (display arg) (display " "))
       (cdr args))
  (newline))

The intended meaning is clear: load the file, and then call main on the command-line arguments. However, the system will treat everything after the Guile path as a single argument — the string "-e main -s" — which is not what we want.

As a workaround, the meta switch \ allows the Guile programmer to specify an arbitrary number of options without patching the kernel. If the first argument to Guile is \, Guile will open the script file whose name follows the \, parse arguments starting from the file’s second line (according to rules described below), and substitute them for the \ switch.

Working in concert with the meta switch, Guile treats the characters ‘#!’ as the beginning of a comment which extends through the next line containing only the characters ‘!#’. This sort of comment may appear anywhere in a Guile program, but it is most useful at the top of a file, meshing magically with the POSIX script invocation mechanism.
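
A short sketch of such a comment in the middle of a program, using the illustrative variables x and y:

```scheme
(define x 1)

#!
Everything between the two markers is skipped by the reader,
so this text is never evaluated.
!#

(define y (+ x 1))

y
;; ⇒ 2
```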

Thus, consider a script named /u/jimb/ekko which starts like this:

#!/usr/local/bin/guile \
-e main -s
!#
(define (main args)
  (map (lambda (arg) (display arg) (display " "))
       (cdr args))
  (newline))

Suppose a user invokes this script as follows:

$ /u/jimb/ekko a b c

Here’s what happens:

When Guile sees the meta switch \, it parses command-line arguments from the script file according to the following rules:



4.3.3 Command Line Handling

The ability to accept and handle command line arguments is very important when writing Guile scripts to solve particular problems, such as extracting information from text files or interfacing with existing command line applications. This section describes how Guile makes command line arguments available to a Guile script, and the utilities that Guile provides to help with the processing of command line arguments.

When a Guile script is invoked, Guile makes the command line arguments accessible via the procedure command-line, which returns the arguments as a list of strings.

For example, if the script

#! /usr/local/bin/guile -s
!#
(write (command-line))
(newline)

is saved in a file cmdline-test.scm and invoked using the command line ./cmdline-test.scm bar.txt -o foo -frumple grob, the output is

("./cmdline-test.scm" "bar.txt" "-o" "foo" "-frumple" "grob")

If the script invocation includes a -e option, specifying a procedure to call after loading the script, Guile will call that procedure with (command-line) as its argument. So a script that uses -e doesn’t need to refer explicitly to command-line in its code. For example, the script above would have identical behaviour if it was written instead like this:

#! /usr/local/bin/guile \
-e main -s
!#
(define (main args)
  (write args)
  (newline))

(Note the use of the meta switch \ so that the script invocation can include more than one Guile option; see The Meta Switch.)

These scripts use the #! POSIX convention so that they can be executed using their own file names directly, as in the example command line ./cmdline-test.scm bar.txt -o foo -frumple grob. But they can also be executed by typing out the implied Guile command line in full, as in:

$ guile -s ./cmdline-test.scm bar.txt -o foo -frumple grob

or

$ guile -e main -s ./cmdline-test2.scm bar.txt -o foo -frumple grob

Even when a script is invoked using this longer form, the arguments that the script receives are the same as if it had been invoked using the short form. Guile ensures that the (command-line) or -e arguments are independent of how the script is invoked, by stripping off the arguments that Guile itself processes.

A script is free to parse and handle its command line arguments in any way that it chooses. Where the set of possible options and arguments is complex, however, it can get tricky to extract all the options, check the validity of given arguments, and so on. This task can be greatly simplified by taking advantage of the module (ice-9 getopt-long), which is distributed with Guile; see getopt-long.
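
As a minimal sketch of that approach, the example below parses a sample argument list with getopt-long and option-ref. The option names output and verbose, the default "out.txt", and the helper parse-arguments are assumptions made for illustration.

```scheme
(use-modules (ice-9 getopt-long))

;; Hypothetical option specification: -o/--output takes a value,
;; -v/--verbose is a simple flag.
(define option-spec
  '((output  (single-char #\o) (value #t))
    (verbose (single-char #\v) (value #f))))

;; Return (output verbose? remaining-arguments) for an argument list
;; of the form returned by the command-line function.
(define (parse-arguments args)
  (let ((options (getopt-long args option-spec)))
    (list (option-ref options 'output "out.txt")
          (option-ref options 'verbose #f)
          ;; The key '() holds the non-option arguments.
          (option-ref options '() '()))))

(parse-arguments '("cmdline-test.scm" "-o" "foo" "-v" "bar.txt"))
;; ⇒ ("foo" #t ("bar.txt"))
```

In a real script the argument list would normally come from (command-line) or from the argument passed to the -e entry point.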



4.3.4 Scripting Examples

To start with, here are some examples of invoking Guile directly:

guile -- a b c

Run Guile interactively; (command-line) will return
("/usr/local/bin/guile" "a" "b" "c").

guile -s /u/jimb/ex2 a b c

Load the file /u/jimb/ex2; (command-line) will return
("/u/jimb/ex2" "a" "b" "c").

guile -c '(write %load-path) (newline)'

Write the value of the variable %load-path, print a newline, and exit.

guile -e main -s /u/jimb/ex4 foo

Load the file /u/jimb/ex4, and then call the function main, passing it the list ("/u/jimb/ex4" "foo").

guile -l first -ds -l last -s script

Load the files first, script, and last, in that order. The -ds switch says when to process the -s switch. For a more motivated example, see the scripts below.

Here is a very simple Guile script:

#!/usr/local/bin/guile -s
!#
(display "Hello, world!")
(newline)

The first line marks the file as a Guile script. When the user invokes it, the system runs /usr/local/bin/guile to interpret the script, passing -s, the script’s filename, and any arguments given to the script as command-line arguments. When Guile sees -s script, it loads script. Thus, running this program produces the output:

Hello, world!

Here is a script which prints the factorial of its argument:

#!/usr/local/bin/guile -s
!#
(define (fact n)
  (if (zero? n) 1
    (* n (fact (- n 1)))))

(display (fact (string->number (cadr (command-line)))))
(newline)

In action:

$ ./fact 5
120
$

However, suppose we want to use the definition of fact in this file from another script. We can’t simply load the script file, and then use fact’s definition, because the script will try to compute and display a factorial when we load it. To avoid this problem, we might write the script this way:

#!/usr/local/bin/guile \
-e main -s
!#
(define (fact n)
  (if (zero? n) 1
    (* n (fact (- n 1)))))

(define (main args)
  (display (fact (string->number (cadr args))))
  (newline))

This version packages the actions the script should perform in a function, main. This allows us to load the file purely for its definitions, without any extraneous computation taking place. Then we use the meta switch \ and the entry point switch -e to tell Guile to call main after loading the script.

$ ./fact 50
30414093201713378043612608166064768844377641568960512000000000000

Suppose that we now want to write a script which computes the choose function: given a set of m distinct objects, (choose n m) is the number of distinct subsets containing n objects each. It’s easy to write choose given fact, so we might write the script this way:

#!/usr/local/bin/guile \
-l fact -e main -s
!#
(define (choose n m)
  (/ (fact m) (* (fact (- m n)) (fact n))))

(define (main args)
  (let ((n (string->number (cadr args)))
        (m (string->number (caddr args))))
    (display (choose n m))
    (newline)))

The command-line arguments here tell Guile to first load the file fact, and then run the script, with main as the entry point. In other words, the choose script can use definitions made in the fact script. Here are some sample runs:

$ ./choose 0 4
1
$ ./choose 1 4
4
$ ./choose 2 4
6
$ ./choose 3 4
4
$ ./choose 4 4
1
$ ./choose 50 100
100891344545564193334812497256


4.4 Using Guile Interactively

When you start up Guile by typing just guile, without a -c argument or the name of a script to execute, you get an interactive interpreter where you can enter Scheme expressions, and Guile will evaluate them and print the results for you. Here are some simple examples.

scheme@(guile-user)> (+ 3 4 5)
$1 = 12
scheme@(guile-user)> (display "Hello world!\n")
Hello world!
scheme@(guile-user)> (values 'a 'b)
$2 = a
$3 = b

This mode of use is called a REPL, which is short for “Read-Eval-Print Loop”, because the Guile interpreter first reads the expression that you have typed, then evaluates it, and then prints the result.

The prompt shows you what language and module you are in. In this case, the current language is scheme, and the current module is (guile-user). See Other Languages, for more information on Guile’s support for languages other than Scheme.



4.4.1 The Init File, ~/.guile

When run interactively, Guile will load a local initialization file from ~/.guile. This file should contain Scheme expressions for evaluation.

This facility lets the user customize their interactive Guile environment, pulling in extra modules or parameterizing the REPL implementation.

To run Guile without loading the init file, use the -q command-line option.



4.4.2 Readline

To make it easier for you to repeat and vary previously entered expressions, or to edit the expression that you’re typing in, Guile can use the GNU Readline library. This is not enabled by default for licensing reasons, but all you need to activate Readline is the following pair of lines.

scheme@(guile-user)> (use-modules (ice-9 readline))
scheme@(guile-user)> (activate-readline)

It’s a good idea to put these two lines (without the scheme@(guile-user)> prompts) in your .guile file. See Init File, for more on .guile.



4.4.3 Value History

Just as Readline helps you to reuse a previous input line, value history allows you to use the result of a previous evaluation in a new expression. When value history is enabled, each evaluation result is automatically assigned to the next in the sequence of variables $1, $2, …. You can then use these variables in subsequent expressions.

scheme@(guile-user)> (iota 10)
$1 = (0 1 2 3 4 5 6 7 8 9)
scheme@(guile-user)> (apply * (cdr $1))
$2 = 362880
scheme@(guile-user)> (sqrt $2)
$3 = 602.3952191045344
scheme@(guile-user)> (cons $2 $1)
$4 = (362880 0 1 2 3 4 5 6 7 8 9)

Value history is enabled by default, because Guile’s REPL imports the (ice-9 history) module. Value history may be turned off or on within the REPL, using the options interface:

scheme@(guile-user)> ,option value-history #f
scheme@(guile-user)> 'foo
foo
scheme@(guile-user)> ,option value-history #t
scheme@(guile-user)> 'bar
$5 = bar

Note that previously recorded values are still accessible, even if value history is off. In rare cases, these references to past computations can cause Guile to use too much memory. One may clear these values, possibly enabling garbage collection, via the clear-value-history! procedure, described below.

The programmatic interface to value history is in a module:

(use-modules (ice-9 history))
Scheme Procedure: value-history-enabled?

Return true if value history is enabled, or false otherwise.

Scheme Procedure: enable-value-history!

Turn on value history, if it was off.

Scheme Procedure: disable-value-history!

Turn off value history, if it was on.

Scheme Procedure: clear-value-history!

Clear the value history. If the stored values are not captured by some other data structure or closure, they may then be reclaimed by the garbage collector.
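
The four procedures above can be combined as in the following sketch. The helper reset-value-history! and the two status variables are hypothetical names introduced for illustration; they are not part of Guile.

```scheme
(use-modules (ice-9 history))

;; Hypothetical helper (not part of Guile): forget every recorded $N
;; value, then report whether value history is currently enabled.
(define (reset-value-history!)
  (clear-value-history!)
  (if (value-history-enabled?) 'enabled 'disabled))

;; Stop recording new values, then clear the old ones.
(disable-value-history!)
(define status-while-off (reset-value-history!))

;; Start recording again.
(enable-value-history!)
(define status-while-on (reset-value-history!))
```

Clearing is independent of enabling: the stored values go away in both calls, while the enabled state only changes through enable-value-history! and disable-value-history!.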


4.4.4 REPL Commands

The REPL exists to read expressions, evaluate them, and then print their results. But sometimes one wants to tell the REPL to evaluate an expression in a different way, or to do something else altogether. A user can affect the way the REPL works with a REPL command.

The previous section had an example of a command, in the form of ,option.

scheme@(guile-user)> ,option value-history #t

Commands are distinguished from expressions by their initial comma (‘,’). Since a comma cannot begin an expression in most languages, it is an effective indicator to the REPL that the following text forms a command, not an expression.

REPL commands are convenient because they are always there. Even if the current module doesn’t have a binding for pretty-print, one can always ,pretty-print.

The following sections document the various commands, grouped together by functionality. Many of the commands have abbreviations; see the online help (,help) for more information.


4.4.4.1 Help Commands

When Guile starts interactively, it notifies the user that help can be had by typing ‘,help’. Indeed, help is a command, and a particularly useful one, as it allows the user to discover the rest of the commands.

REPL Command: help [all | group | [-c] command]

Show help.

With one argument, tries to look up the argument as a group name, giving help on that group if successful. Otherwise tries to look up the argument as a command, giving help on the command.

If there is a command whose name is also a group name, use the ‘-c command’ form to give help on the command instead of the group.

Without any argument, a list of help commands and command groups is displayed.

REPL Command: show [topic]

Gives information about Guile.

With one argument, tries to show a particular piece of information; currently supported topics are ‘warranty’ (or ‘w’), ‘copying’ (or ‘c’), and ‘version’ (or ‘v’).

Without any argument, a list of topics is displayed.

REPL Command: apropos regexp

Find bindings/modules/packages.

REPL Command: describe obj

Show description/documentation.


4.4.4.2 Module Commands

REPL Command: module [module]

Change modules / Show current module.

REPL Command: import module …

Import modules / List those imported.

REPL Command: load file

Load a file in the current module.

REPL Command: reload [module]

Reload the given module, or the current module if none was given.

REPL Command: binding

List current bindings.

REPL Command: in module expression
REPL Command: in module command arg …

Evaluate an expression, or alternatively, execute another meta-command in the context of a module. For example, ‘,in (foo bar) ,binding’ will show the bindings in the module (foo bar).


4.4.4.3 Language Commands

REPL Command: language language

Change languages.


4.4.4.4 Compile Commands

REPL Command: compile exp

Generate compiled code.

REPL Command: compile-file file

Compile a file.

REPL Command: expand exp

Expand any macros in a form.

REPL Command: optimize exp

Run the optimizer on a piece of code and print the result.

REPL Command: disassemble exp

Disassemble a compiled procedure.

REPL Command: disassemble-file file

Disassemble a file.


4.4.4.5 Profile Commands

REPL Command: time exp

Time execution.

REPL Command: profile exp

Profile execution.

REPL Command: trace exp [#:width w] [#:max-indent i]

Trace execution.

By default, the trace will limit its width to the width of your terminal, or width if specified. Nested procedure invocations will be printed farther to the right, though if the width of the indentation passes the max-indent, the indentation is abbreviated.


4.4.4.6 Debug Commands

These debugging commands are only available within a recursive REPL; they do not work at the top level.

REPL Command: backtrace [count] [#:width w] [#:full? f]

Print a backtrace.

Print a backtrace of all stack frames, or innermost count frames. If count is negative, the last count frames will be shown.

REPL Command: up [count]

Select a calling stack frame.

Select and print stack frames that called this one. An argument says how many frames up to go.

REPL Command: down [count]

Select a called stack frame.

Select and print stack frames called by this one. An argument says how many frames down to go.

REPL Command: frame [idx]

Show a frame.

Show the selected frame. With an argument, select a frame by index, then show it.

REPL Command: procedure

Print the procedure for the selected frame.

REPL Command: locals

Show local variables.

Show locally-bound variables in the selected frame.

REPL Command: error-message
REPL Command: error

Show error message.

Display the message associated with the error that started the current debugging REPL.

REPL Command: registers

Show the VM registers associated with the current frame.

See Stack Layout, for more information on VM stack frames.

REPL Command: width [cols]

Sets the number of display columns in the output of ,backtrace and ,locals to cols. If cols is not given, the width of the terminal is used.

The next three commands work at any REPL.

REPL Command: break proc

Set a breakpoint at proc.

REPL Command: break-at-source file line

Set a breakpoint at the given source location.

REPL Command: tracepoint proc

Set a tracepoint on the given procedure. This will cause all calls to the procedure to print out a tracing message. See Tracing Traps, for more information.

The rest of the commands in this subsection all apply only when the stack is continuable — in other words when it makes sense for the program that the stack comes from to continue running. Usually this means that the program stopped because of a trap or a breakpoint.

REPL Command: step

Tell the debugged program to step to the next source location.

REPL Command: next

Tell the debugged program to step to the next source location in the same frame. (See Traps for the details of how this works.)

REPL Command: finish

Tell the program being debugged to continue running until the completion of the current stack frame, and at that time to print the result and reenter the REPL.


4.4.4.7 Inspect Commands

REPL Command: inspect exp

Inspect the result(s) of evaluating exp.

REPL Command: pretty-print exp

Pretty-print the result(s) of evaluating exp.


4.4.4.8 System Commands

REPL Command: gc

Garbage collection.

REPL Command: statistics

Display statistics.

REPL Command: option [name] [exp]

With no arguments, lists all options. With one argument, shows the current value of the name option. With two arguments, sets the name option to the result of evaluating the Scheme expression exp.

REPL Command: quit

Quit this session.

Current REPL options include:

compile-options

The options used when compiling expressions entered at the REPL. See Compilation, for more on compilation options.

interp

Whether to interpret or compile expressions given at the REPL, if such a choice is available. Off by default (indicating compilation).

prompt

A customized REPL prompt. #f by default, indicating the default prompt.

print

A procedure of two arguments used to print the result of evaluating each expression. The arguments are the current REPL and the value to print. By default, #f, to use the default procedure.

value-history

Whether value history is on or not. See Value History.

on-error

What to do when an error happens. By default, debug, meaning to enter the debugger. Other values include backtrace, to show a backtrace without entering the debugger, or report, to simply show a short error printout.

Default values for REPL options may be set using repl-default-option-set! from (system repl common):

Scheme Procedure: repl-default-option-set! key value

Set the default value of a REPL option. This function is particularly useful in a user’s init file. See Init File.
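
For instance, the following lines could go into an init file. This is a sketch rather than a recommended configuration: it uses the on-error and print options described above, and the choice of pretty-print as the printer is an assumption made for illustration.

```scheme
;; In ~/.guile: set session-wide defaults for REPL options.
(use-modules (system repl common)
             (ice-9 pretty-print))

;; On error, show a backtrace instead of entering the debugger.
(repl-default-option-set! 'on-error 'backtrace)

;; Print each result with pretty-print.  As described for the print
;; option, the procedure receives the current REPL and the value.
(repl-default-option-set! 'print
                          (lambda (repl value)
                            (pretty-print value)))
```

Defaults set this way apply to REPLs started afterwards; within a running REPL, the ,option command changes the setting for that REPL only.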


4.4.5 Error Handling

When code being evaluated from the REPL hits an error, Guile enters a new prompt, allowing you to inspect the context of the error.

scheme@(guile-user)> (map string-append '("a" "b") '("c" #\d))
ERROR: In procedure string-append:
ERROR: Wrong type (expecting string): #\d
Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
scheme@(guile-user) [1]>

The new prompt runs inside the old one, in the dynamic context of the error. It is a recursive REPL, augmented with a reified representation of the stack, ready for debugging.

,backtrace (abbreviated ,bt) displays the Scheme call stack at the point where the error occurred:

scheme@(guile-user) [1]> ,bt
           1 (map #<procedure string-append _> ("a" "b") ("c" #\d))
           0 (string-append "b" #\d)

In the above example, the backtrace doesn’t have much source information, as map and string-append are both primitives. But in the general case, the space on the left of the backtrace indicates the line and column in which a given procedure calls another.

You can exit a recursive REPL in the same way that you exit any REPL: via ‘(quit)’, ‘,quit’ (abbreviated ‘,q’), or C-d, among other options.


4.4.6 Interactive Debugging

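A recursive debugging REPL exposes a number of other meta-commands that inspect the state of the computation at the time of the error. These commands allow you to display the call stack, move up and down its frames, and examine the values of local variables in each frame.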

See Debug Commands, for documentation of the individual commands. This section aims to give more of a walkthrough of a typical debugging session.

First, we’re going to need a good error. Let’s try to macroexpand the expression (unquote foo), outside of a quasiquote form, and see how the macroexpander reports this error.

scheme@(guile-user)> (macroexpand '(unquote foo))
ERROR: In procedure macroexpand:
ERROR: unquote: expression not valid outside of quasiquote in (unquote foo)
Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
scheme@(guile-user) [1]>

The backtrace command, which can also be invoked as bt, displays the call stack (aka backtrace) at the point where the debugger was entered:

scheme@(guile-user) [1]> ,bt
In ice-9/psyntax.scm:
  1130:21  3 (chi-top (unquote foo) () ((top)) e (eval) (hygiene #))
  1071:30  2 (syntax-type (unquote foo) () ((top)) #f #f (# #) #f)
  1368:28  1 (chi-macro #<procedure de9360 at ice-9/psyntax.scm...> ...)
In unknown file:
           0 (scm-error syntax-error macroexpand "~a: ~a in ~a" # #f)

A call stack consists of a sequence of stack frames, with each frame describing one procedure which is waiting to do something with the values returned by another. Here we see that there are four frames on the stack.

Note that macroexpand is not on the stack – it must have made a tail call to chi-top, as indeed we would find if we searched ice-9/psyntax.scm for its definition.

When you enter the debugger, the innermost frame is selected, which means that the commands for getting information about the “current” frame, or for evaluating expressions in the context of the current frame, will do so by default with respect to the innermost frame. To select a different frame, so that these operations will apply to it instead, use the up, down and frame commands like this:

scheme@(guile-user) [1]> ,up
In ice-9/psyntax.scm:
  1368:28  1 (chi-macro #<procedure de9360 at ice-9/psyntax.scm...> ...)
scheme@(guile-user) [1]> ,frame 3
In ice-9/psyntax.scm:
  1130:21  3 (chi-top (unquote foo) () ((top)) e (eval) (hygiene #))
scheme@(guile-user) [1]> ,down
In ice-9/psyntax.scm:
  1071:30  2 (syntax-type (unquote foo) () ((top)) #f #f (# #) #f)

Perhaps we’re interested in what’s going on in frame 2, so we take a look at its local variables:

scheme@(guile-user) [1]> ,locals
  Local variables:
  $1 = e = (unquote foo)
  $2 = r = ()
  $3 = w = ((top))
  $4 = s = #f
  $5 = rib = #f
  $6 = mod = (hygiene guile-user)
  $7 = for-car? = #f
  $8 = first = unquote
  $9 = ftype = macro
  $10 = fval = #<procedure de9360 at ice-9/psyntax.scm:2817:2 (x)>
  $11 = fe = unquote
  $12 = fw = ((top))
  $13 = fs = #f
  $14 = fmod = (hygiene guile-user)

All of the values are accessible by their value-history names ($n):

scheme@(guile-user) [1]> $10
$15 = #<procedure de9360 at ice-9/psyntax.scm:2817:2 (x)>

We can even invoke the procedure at the REPL directly:

scheme@(guile-user) [1]> ($10 'not-going-to-work)
ERROR: In procedure macroexpand:
ERROR: source expression failed to match any pattern in not-going-to-work
Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.

Well, at this point we’ve caused an error within an error. Let’s just quit back to the top level:

scheme@(guile-user) [2]> ,q
scheme@(guile-user) [1]> ,q
scheme@(guile-user)> 

Finally, as a word to the wise: hackers close their REPL prompts with C-d.


4.5 Using Guile in Emacs

Any text editor can edit Scheme, but some are better than others. Emacs is the best, of course, and not just because it is a fine text editor. Emacs has good support for Scheme out of the box, with sensible indentation rules, parenthesis-matching, syntax highlighting, and even a set of keybindings for structural editing, allowing navigation, cut-and-paste, and transposition operations that work on balanced S-expressions.

As good as it is, though, two things will vastly improve your experience with Emacs and Guile.

The first is Taylor Campbell’s Paredit. You should not code in any dialect of Lisp without Paredit. (They say that unopinionated writing is boring—hence this tone—but it’s the truth, regardless.) Paredit is the bee’s knees.

The second is José Antonio Ortega Ruiz’s Geiser. Geiser complements Emacs’ scheme-mode with tight integration to running Guile processes via a comint-mode REPL buffer.

Of course there are keybindings to switch to the REPL, and a good REPL environment, but Geiser goes well beyond that. See Geiser’s web page at http://www.nongnu.org/geiser/, for more information.


4.6 Using Guile Tools

Guile also comes with a growing number of command-line utilities: a compiler, a disassembler, some module inspectors, and in the future, a system to install Guile packages from the internet. These tools may be invoked using the guild program.

$ guild compile -o foo.go foo.scm
wrote `foo.go'

This program used to be called guile-tools up to Guile version 2.0.1, and for backward compatibility it still may be called as such. However we changed the name to guild, not only because it is pleasantly shorter and easier to read, but also because this tool will serve to bind Guile wizards together, by allowing hackers to share code with each other using a CPAN-like system.

See Compilation, for more on guild compile.
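
The same compilation can be requested from Scheme with compile-file from (system base compile). The sketch below wraps it in a hypothetical helper, compile-to, which is not part of Guile; it does roughly what the guild compile invocation above does.

```scheme
(use-modules (system base compile))

;; Hypothetical helper (not part of Guile): compile SOURCE into OUTPUT,
;; roughly as `guild compile -o OUTPUT SOURCE' does, and return the
;; name of the compiled file.
(define (compile-to source output)
  (compile-file source #:output-file output))
```

Calling (compile-to "foo.scm" "foo.go") would then write foo.go next to foo.scm.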

A complete list of guild scripts can be had by invoking guild list, or simply guild.


4.7 Installing Site Packages

At some point, you will probably want to share your code with other people. To do so effectively, it is important to follow a set of common conventions, to make it easy for the user to install and use your package.

The first thing to do is to install your Scheme files where Guile can find them. When Guile goes to find a Scheme file, it will search a load path to find the file: first in Guile’s own path, then in paths for site packages. A site package is any Scheme code that is installed and not part of Guile itself. See Load Paths, for more on load paths.

There are several site paths, for historical reasons, but the one that should generally be used can be obtained by invoking the %site-dir procedure. See Build Config. If Guile 2.0 is installed on your system in /usr/, then (%site-dir) will be /usr/share/guile/site/2.0. Scheme files should be installed there.
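
To check these values on a particular installation, one can ask Guile directly. The following is a small sketch; the variable names site-dir and on-load-path? are chosen here for illustration.

```scheme
;; Where should Scheme source files for site packages be installed?
(define site-dir (%site-dir))
(display site-dir)
(newline)

;; Is that directory among the ones Guile searches for Scheme files?
(define on-load-path?
  (if (member site-dir %load-path) #t #f))
```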

If you do not install compiled .go files, Guile will compile your modules and programs when they are first used, and cache them in the user’s home directory. See Compilation, for more on auto-compilation. However, it is better to compile the files before they are installed, and to just copy the files to a place that Guile can find them.

As with Scheme files, Guile searches a path to find compiled .go files, the %load-compiled-path. By default, this path has two entries: a path for Guile’s files, and a path for site packages. You should install your .go files into the latter directory, whose value is returned by invoking the %site-ccache-dir procedure. As in the previous example, if Guile 2.0 is installed on your system in /usr/, then (%site-ccache-dir) will be /usr/lib/guile/2.0/site-ccache.

Note that a .go file will only be loaded in preference to a .scm file if it is newer. For that reason, you should install your Scheme files first, and your compiled files second. See Load Paths, for more on the loading process.

Finally, although this section is only about Scheme, sometimes you need to install C extensions too. Shared libraries should be installed in the extensions dir. This value can be had from the build config (see Build Config). Again, if Guile 2.0 is installed on your system in /usr/, then the extensions dir will be /usr/lib/guile/2.0/extensions.
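
The directories for compiled files and for extensions can be looked up in the same way. This sketch assumes that the build configuration alist, %guile-build-info, carries an extensiondir entry; if it does not, the lookup simply yields #f.

```scheme
;; Directory for compiled .go files of site packages.
(define compiled-dir (%site-ccache-dir))

;; Directory for C extensions (shared libraries).
;; Assumption: %guile-build-info has an extensiondir key; assq-ref
;; returns #f when a key is missing.
(define extension-dir (assq-ref %guile-build-info 'extensiondir))

(for-each (lambda (dir)
            (display dir)
            (newline))
          (list compiled-dir extension-dir))
```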


5 Programming in C

This part of the manual explains the general concepts that you need to understand when interfacing to Guile from C. You will learn about how the latent typing of Scheme is embedded into the static typing of C, how the garbage collection of Guile is made available to C code, and how continuations influence the control flow in a C program.

This knowledge should make it straightforward to add new functions to Guile that can be called from Scheme. Adding new data types is also possible and is done by defining smobs.

The Programming Overview section of this part contains general musings and guidelines about programming with Guile. It explores different ways to design a program around Guile, or how to embed Guile into existing programs.

For a pedagogical yet detailed explanation of how the data representation of Guile is implemented, see Data Representation. You don’t need to know the details given there to use Guile from C, but they are useful when you want to modify Guile itself or when you are just curious about how it is all done.

For detailed reference information on the variables, functions etc. that make up Guile’s application programming interface (API), see API Reference.


5.1 Parallel Installations

Guile provides strong API and ABI stability guarantees during stable series, so that if a user writes a program against Guile version 2.0.3, it will be compatible with some future version 2.0.7. We say in this case that 2.0 is the effective version, composed of the major and minor versions, in this case 2 and 0.
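
A running Guile can report its own full and effective versions, as this short sketch shows. The variable names are illustrative, and the values in the comments are examples for a 2.0.11 installation.

```scheme
;; Full version string, for example "2.0.11".
(define full (version))

;; Effective version, for example "2.0".
(define effective (effective-version))

;; The effective version is the major and minor versions joined by a dot.
(define rebuilt
  (string-append (major-version) "." (minor-version)))
```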

Users may install multiple effective versions of Guile, with each version’s headers, libraries, and Scheme files under their own directories. This provides the necessary stability guarantee for users, while also allowing Guile developers to evolve the language and its implementation.

However, parallel installability does have a down-side, in that users need to know which version of Guile to ask for, when they build against Guile. Guile solves this problem by installing a file to be read by the pkg-config utility, a tool to query installed packages by name. Guile encodes the version into its pkg-config name, so that users can ask for guile-2.0 or guile-2.2, as appropriate.

For effective version 2.0, for example, you would invoke pkg-config --cflags --libs guile-2.0 to get the compilation and linking flags necessary to link to version 2.0 of Guile. You would typically run pkg-config during the configuration phase of your program and use the obtained information in the Makefile.

Guile’s pkg-config file, guile-2.0.pc, defines additional useful variables:

sitedir

The default directory where Guile looks for Scheme source and compiled files (see %site-dir). Run pkg-config guile-2.0 --variable=sitedir to see its value. See GUILE_SITE_DIR, for more on how to use it from Autoconf.

extensiondir

The default directory where Guile looks for extensions—i.e., shared libraries providing additional features (see Modules and Extensions). Run pkg-config guile-2.0 --variable=extensiondir to see its value.

See the pkg-config man page, for more information, or its web site, http://pkg-config.freedesktop.org/. See Autoconf Support, for more on checking for Guile from within a configure.ac file.


5.2 Linking Programs With Guile

This section covers the mechanics of linking your program with Guile on a typical POSIX system.

The header file <libguile.h> provides declarations for all of Guile’s functions and constants. You should #include it at the head of any C source file that uses identifiers described in this manual. Once you’ve compiled your source files, you need to link them against the Guile object code library, libguile.

As noted in the previous section, <libguile.h> is not in the default search path for headers. The following command lines give respectively the C compilation and link flags needed to build programs using Guile 2.0:

pkg-config guile-2.0 --cflags
pkg-config guile-2.0 --libs

5.2.1 Guile Initialization Functions

To initialize Guile, you can use one of several functions. The first, scm_with_guile, is the most portable way to initialize Guile. It will initialize Guile when necessary and then call a function that you can specify. Multiple threads can call scm_with_guile concurrently and it can also be called more than once in a given thread. The global state of Guile will survive from one call of scm_with_guile to the next. Your function is called from within scm_with_guile since the garbage collector of Guile needs to know where the stack of each thread is.

A second function, scm_init_guile, initializes Guile for the current thread. When it returns, you can use the Guile API in the current thread. This function employs some non-portable magic to learn about stack bounds and might thus not be available on all platforms.

One common way to use Guile is to write a set of C functions which perform some useful task, make them callable from Scheme, and then link the program with Guile. This yields a Scheme interpreter just like guile, but augmented with extra functions for some specific application — a special-purpose scripting language.

In this situation, the application should probably process its command-line arguments in the same manner as the stock Guile interpreter. To make that straightforward, Guile provides the scm_boot_guile and scm_shell functions.

For more about these functions, see Initialization.


5.2.2 A Sample Guile Main Program

Here is simple-guile.c, source code for a main and an inner_main function that will produce a complete Guile interpreter.

/* simple-guile.c --- Start Guile from C.  */

#include <libguile.h>

static void
inner_main (void *closure, int argc, char **argv)
{
  /* preparation */
  scm_shell (argc, argv);
  /* after exit */
}

int
main (int argc, char **argv)
{
  scm_boot_guile (argc, argv, inner_main, 0);
  return 0; /* never reached, see inner_main */
}

The main function calls scm_boot_guile to initialize Guile, passing it inner_main. Once scm_boot_guile is ready, it invokes inner_main, which calls scm_shell to process the command-line arguments in the usual way.

5.2.3 Building the Example with Make

Here is a Makefile which you can use to compile the example program. It uses pkg-config to learn about the necessary compiler and linker flags.

# Use GCC, if you have it installed.
CC=gcc

# Tell the C compiler where to find <libguile.h>
CFLAGS=`pkg-config --cflags guile-2.0`

# Tell the linker what libraries to use and where to find them.
LIBS=`pkg-config --libs guile-2.0`

simple-guile: simple-guile.o
        ${CC} simple-guile.o ${LIBS} -o simple-guile

simple-guile.o: simple-guile.c
        ${CC} -c ${CFLAGS} simple-guile.c

5.2.4 Building the Example with Autoconf

If you are using the GNU Autoconf package to make your application more portable, Autoconf will settle many of the details in the Makefile automatically, making it much simpler and more portable; we recommend using Autoconf with Guile. Here is a configure.ac file for simple-guile that uses the standard PKG_CHECK_MODULES macro to check for Guile. Autoconf will process this file into a configure script. We recommend invoking Autoconf via the autoreconf utility.

AC_INIT(simple-guile.c)

# Find a C compiler.
AC_PROG_CC

# Check for Guile
PKG_CHECK_MODULES([GUILE], [guile-2.0])

# Generate a Makefile, based on the results.
AC_OUTPUT(Makefile)

Run autoreconf -vif to generate configure.

Here is a Makefile.in template, from which the configure script produces a Makefile customized for the host system:

# The configure script fills in these values.
CC=@CC@
CFLAGS=@GUILE_CFLAGS@
LIBS=@GUILE_LIBS@

simple-guile: simple-guile.o
        ${CC} simple-guile.o ${LIBS} -o simple-guile
simple-guile.o: simple-guile.c
        ${CC} -c ${CFLAGS} simple-guile.c

The developer should use Autoconf to generate the configure script from the configure.ac template, and distribute configure with the application. Here’s how a user might go about building the application:

$ ls
Makefile.in     configure*      configure.ac    simple-guile.c
$ ./configure
checking for gcc... ccache gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether ccache gcc accepts -g... yes
checking for ccache gcc option to accept ISO C89... none needed
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for GUILE... yes
configure: creating ./config.status
config.status: creating Makefile
$ make
[...]
$ ./simple-guile
guile> (+ 1 2 3)
6
guile> (getpwnam "jimb")
#("jimb" "83Z7d75W2tyJQ" 4008 10 "Jim Blandy" "/u/jimb"
  "/usr/local/bin/bash")
guile> (exit)
$

5.3 Linking Guile with Libraries

The previous section has briefly explained how to write programs that make use of an embedded Guile interpreter. But sometimes, all you want to do is make new primitive procedures and data types available to the Scheme programmer. Writing a new version of guile is inconvenient in this case and it would in fact make the life of the users of your new features needlessly hard.

For example, suppose that there is a program guile-db that is a version of Guile with additional features for accessing a database. People who want to write Scheme programs that use these features would have to use guile-db instead of the usual guile program. Now suppose that there is also a program guile-gtk that extends Guile with access to the popular Gtk+ toolkit for graphical user interfaces. People who want to write GUIs in Scheme would have to use guile-gtk. Now, what happens when you want to write a Scheme application that uses a GUI to let the user access a database? You would have to write a third program that incorporates both the database stuff and the GUI stuff. This might not be easy (because guile-gtk might be a quite obscure program, say) and taking this example further makes it easy to see that this approach cannot work in practice.

It would have been much better if both the database features and the GUI features had been provided as libraries that can just be linked with guile. Guile makes it easy to do just this, and we encourage you to make your extensions to Guile available as libraries whenever possible.

You write the new primitive procedures and data types in the normal fashion, and link them into a shared library instead of into a stand-alone program. The shared library can then be loaded dynamically by Guile.


5.3.1 A Sample Guile Extension

This section explains how to make the Bessel functions of the C library available to Scheme. First we need to write the appropriate glue code to convert the arguments and return values of the functions from Scheme to C and back. Additionally, we need a function that will add them to the set of Guile primitives. Because this is just an example, we will only implement this for the j0 function.

Consider the following file bessel.c.

#include <math.h>
#include <libguile.h>

SCM
j0_wrapper (SCM x)
{
  return scm_from_double (j0 (scm_to_double (x)));
}

void
init_bessel ()
{
  scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
}

This C source file needs to be compiled into a shared library. Here is how to do it on GNU/Linux:

gcc `pkg-config --cflags guile-2.0` \
  -shared -o libguile-bessel.so -fPIC bessel.c

For creating shared libraries portably, we recommend the use of GNU Libtool (see Introduction in GNU Libtool).

A shared library can be loaded into a running Guile process with the function load-extension. In addition to the name of the library to load, this function also expects the name of a function from that library that will be called to initialize it. For our example, we are going to call the function init_bessel which will make j0_wrapper available to Scheme programs with the name j0. Note that we do not specify a filename extension such as .so when invoking load-extension. The right extension for the host platform will be provided automatically.

(load-extension "libguile-bessel" "init_bessel")
(j0 2)
⇒ 0.223890779141236

For this to work, load-extension must be able to find libguile-bessel, of course. It will look in the places that are usual for your operating system, and it will additionally look into the directories listed in the LTDL_LIBRARY_PATH environment variable.

To see how these Guile extensions via shared libraries relate to the module system, see Putting Extensions into Modules.
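
As a brief sketch of that idea, the extension can be loaded from inside a module definition, so that j0 becomes an export of the module instead of a top-level binding. The module name (math bessel) and the file name math/bessel.scm are hypothetical, and the code assumes that libguile-bessel was built as shown above and can be found by load-extension.

```scheme
;; math/bessel.scm (hypothetical file and module name).
(define-module (math bessel)
  #:export (j0))

;; Run init_bessel while (math bessel) is the current module, so the
;; j0 primitive it defines is bound in this module.
(load-extension "libguile-bessel" "init_bessel")
```

A program would then write (use-modules (math bessel)) and call (j0 2) without needing to know that j0 is implemented in C.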


5.4 General concepts for using libguile

When you want to embed the Guile Scheme interpreter into your program or library, you need to link it against the libguile library (see Linking Programs With Guile). Once you have done this, your C code has access to a number of data types and functions that can be used to invoke the interpreter, or make new functions that you have written in C available to be called from Scheme code, among other things.

Scheme is different from C in a number of significant ways, and Guile tries to make the advantages of Scheme available to C as well. Thus, in addition to a Scheme interpreter, libguile also offers dynamic types, garbage collection, continuations, arithmetic on arbitrarily sized numbers, and other things.

The two fundamental concepts are dynamic types and garbage collection. You need to understand how libguile offers them to C programs in order to use the rest of libguile. Also, the more general control flow of Scheme caused by continuations needs to be dealt with.

Asynchronous signal handlers and multi-threading are already familiar from C, but a few additional rules apply when you use them together with libguile.


5.4.1 Dynamic Types

Scheme is a dynamically-typed language; this means that the system cannot, in general, determine the type of a given expression at compile time. Types only become apparent at run time. Variables do not have fixed types; a variable may hold a pair at one point, an integer at the next, and a thousand-element vector later. Instead, values, not variables, have fixed types.

In order to implement standard Scheme functions like pair? and string? and provide garbage collection, the representation of every value must contain enough information to accurately determine its type at run time. Often, Scheme systems also use this information to determine whether a program has attempted to apply an operation to an inappropriately typed value (such as taking the car of a string).

Because variables, pairs, and vectors may hold values of any type, Scheme implementations use a uniform representation for values — a single type large enough to hold either a complete value or a pointer to a complete value, along with the necessary typing information.

In Guile, this uniform representation of all Scheme values is the C type SCM. This is an opaque type and its size is typically equivalent to that of a pointer to void. Thus, SCM values can be passed around efficiently and they take up reasonably little storage on their own.

The most important rule is: You never access a SCM value directly; you only pass it to functions or macros defined in libguile.

As an obvious example, although a SCM variable can contain integers, you can of course not compute the sum of two SCM values by adding them with the C + operator. You must use the libguile function scm_sum.

Less obvious and therefore more important to keep in mind is that you also cannot directly test SCM values for trueness. In Scheme, the value #f is considered false and of course a SCM variable can represent that value. But there is no guarantee that the SCM representation of #f looks false to C code as well. You need to use scm_is_true or scm_is_false to test a SCM value for trueness or falseness, respectively.

You also cannot directly compare two SCM values to find out whether they are identical (that is, whether they are eq? in Scheme terms). You need to use scm_is_eq for this.

The one exception is that you can directly assign a SCM value to a SCM variable by using the C = operator.
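To see why these rules matter, consider a toy model of a tagged value word. This is a sketch only, not Guile’s actual encoding (for that, see Data Representation): the type toy_value, the TOY_ constants and the toy_ functions are hypothetical names made up for this illustration, and the encoding itself (small non-negative integers shifted left by two bits with a tag of 2, and a non-zero word for false) is an assumption chosen to keep the code short.

```c
#include <assert.h>
#include <stdint.h>

/* A toy stand-in for SCM: one machine word holding a tagged value.
   Hypothetical encoding, chosen only for this illustration.  */
typedef uintptr_t toy_value;

#define TOY_FIXNUM_TAG 2                 /* low two bits of a small integer */
#define TOY_FALSE ((toy_value) 4)        /* a non-zero word that means #f */

/* Encode a small non-negative integer.  */
static toy_value
toy_from_uint (unsigned int n)
{
  return ((toy_value) n << 2) | TOY_FIXNUM_TAG;
}

static int
toy_is_fixnum (toy_value v)
{
  return (v & 3) == TOY_FIXNUM_TAG;
}

/* Decode a small integer; V must carry the integer tag.  */
static unsigned int
toy_to_uint (toy_value v)
{
  assert (toy_is_fixnum (v));
  return (unsigned int) (v >> 2);
}

/* The only correct way to add two encoded integers in this model:
   decode, add, encode again.  */
static toy_value
toy_sum (toy_value a, toy_value b)
{
  return toy_from_uint (toy_to_uint (a) + toy_to_uint (b));
}

/* As in Scheme, every value except false counts as true.  */
static int
toy_is_true (toy_value v)
{
  return v != TOY_FALSE;
}
```

In this model, adding the encoded words for 2 and 3 with the C + operator yields a word that no longer carries the integer tag, and TOY_FALSE is non-zero, so a plain C if would treat it as true. Only the accessor functions know how to interpret the bits, which is the role that scm_sum and scm_is_true play for real SCM values.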

The following (contrived) example shows how to do it right. It implements a function of two arguments (a and flag) that returns a+1 if flag is true, else it returns a unchanged.

SCM
my_incrementing_function (SCM a, SCM flag)
{
  SCM result;

  if (scm_is_true (flag))
    result = scm_sum (a, scm_from_int (1));
  else
    result = a;

  return result;
}

Often, you need to convert between SCM values and appropriate C values. For example, we needed to convert the integer 1 to its SCM representation in order to add it to a. Libguile provides many functions to do these conversions, both from C to SCM and from SCM to C.

The conversion functions follow a common naming pattern: those that make a SCM value from a C value have names of the form scm_from_type (…) and those that convert a SCM value to a C value use the form scm_to_type (…).

However, it is best to avoid converting values when you can. When you must combine C values and SCM values in a computation, it is often better to convert the C values to SCM values and do the computation by using libguile functions than to do it the other way around (converting SCM to C and doing the computation some other way).

As a simple example, consider this version of my_incrementing_function from above:

SCM
my_other_incrementing_function (SCM a, SCM flag)
{
  int result;

  if (scm_is_true (flag))
    result = scm_to_int (a) + 1;
  else
    result = scm_to_int (a);

  return scm_from_int (result);
}

This version is much less general than the original one: it will only work for values of a that can fit into an int. The original function will work for all values that Guile can represent and that scm_sum can understand, including integers bigger than long long, floating point numbers, complex numbers, and new numerical types that have been added to Guile by third-party libraries.

Also, computing with SCM is not necessarily inefficient. Small integers will be encoded directly in the SCM value, for example, and do not need any additional memory on the heap. See Data Representation to find out the details.

Some special SCM values are available to C code without needing to convert them from C values:

Scheme value    C representation
#f              SCM_BOOL_F
#t              SCM_BOOL_T
()              SCM_EOL

In addition to SCM, Guile also defines the related type scm_t_bits. This is an unsigned integral type of sufficient size to hold all information that is directly contained in a SCM value. The scm_t_bits type is used internally by Guile to do all the bit twiddling explained in Data Representation, but you will encounter it occasionally in low-level user code as well.


5.4.2 Garbage Collection

As explained above, the SCM type can represent all Scheme values. Some values fit entirely into a SCM value (such as small integers), but other values require additional storage in the heap (such as strings and vectors). This additional storage is managed automatically by Guile. You don’t need to explicitly deallocate it when a SCM value is no longer used.

Two things must be guaranteed so that Guile is able to manage the storage automatically: it must know about all blocks of memory that have ever been allocated for Scheme values, and it must know about all Scheme values that are still being used. Given this knowledge, Guile can periodically free all blocks that have been allocated but are not used by any active Scheme values. This activity is called garbage collection.

It is easy for Guile to remember all blocks of memory that it has allocated for use by Scheme values, but you need to help it with finding all Scheme values that are in use by C code.

You do this when writing a SMOB mark function, for example (see Garbage Collecting Smobs). By calling this function, the garbage collector learns about all references that your SMOB has to other SCM values.

Other references to SCM objects, such as global variables of type SCM or other random data structures in the heap that contain fields of type SCM, can be made visible to the garbage collector by calling the functions scm_gc_protect_object or scm_permanent_object. You normally use these functions for long-lived objects such as a hash table that is stored in a global variable. For temporary references in local variables or function arguments, using these functions would be too expensive.

These references are handled differently: Local variables (and function arguments) of type SCM are automatically visible to the garbage collector. This works because the collector scans the stack for potential references to SCM objects and considers all referenced objects to be alive. The scanning considers each and every word of the stack, regardless of what it is actually used for, and then decides whether it could possibly be a reference to a SCM object. Thus, the scanning is guaranteed to find all actual references, but it might also find words that only accidentally look like references. These ‘false positives’ might keep SCM objects alive that would otherwise be considered dead. While this might waste memory, keeping an object around longer than it strictly needs to is harmless. This is why this technique is called “conservative garbage collection”. In practice, the wasted memory seems to be no problem.

The stack of every thread is scanned in this way and the registers of the CPU and all other memory locations where local variables or function parameters might show up are included in this scan as well.
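The idea of conservative scanning can be sketched with a toy heap. The code below is an illustration under simplifying assumptions, not how Guile’s collector is implemented: the heap is a small fixed array, a “reference” is a word equal to the address of one of its objects, and all names (toy_object, toy_heap, toy_find_object, toy_scan_words) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy heap: a fixed array of objects with a mark bit.  */
struct toy_object
{
  int marked;
  int payload;
};

#define TOY_HEAP_SIZE 4
static struct toy_object toy_heap[TOY_HEAP_SIZE];

/* Return the heap object that WORD points to, or NULL if WORD is not
   the address of any object in the toy heap.  */
static struct toy_object *
toy_find_object (uintptr_t word)
{
  size_t i;

  for (i = 0; i < TOY_HEAP_SIZE; i++)
    if (word == (uintptr_t) &toy_heap[i])
      return &toy_heap[i];
  return NULL;
}

/* Conservatively scan N words: anything that looks like a reference
   keeps its target alive, whatever the word is really used for.  */
static void
toy_scan_words (const uintptr_t *words, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    {
      struct toy_object *obj = toy_find_object (words[i]);
      if (obj != NULL)
        obj->marked = 1;
    }
}
```

The scanner cannot tell a genuine pointer from an integer that happens to have the same bit pattern. Both mark the object, which is exactly the kind of false positive described above: it may keep an object alive longer than necessary, but it never frees one that is still in use.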

The consequence of the conservative scanning is that you can just declare local variables and function parameters of type SCM and be sure that the garbage collector will not free the corresponding objects.

However, a local variable or function parameter is only protected as long as it is really on the stack (or in some register). As an optimization, the C compiler might reuse its location for some other value and the SCM object would no longer be protected. Normally, this leads to exactly the right behavior: the compiler will only overwrite a reference when it is no longer needed and thus the object becomes unprotected precisely when the reference disappears, just as wanted.

There are situations, however, where a SCM object needs to be around longer than its reference from a local variable or function parameter. This happens, for example, when you retrieve some pointer from a smob and work with that pointer directly. The reference to the SCM smob object might be dead after the pointer has been retrieved, but the pointer itself (and the memory pointed to) is still in use and thus the smob object must be protected. The compiler does not know about this connection and might overwrite the SCM reference too early.

To get around this problem, you can use scm_remember_upto_here_1 and its cousins. They keep the compiler from overwriting the reference. For a typical example of their use, see Remembering During Operations.


5.4.3 Control Flow

Scheme has a more general view of program flow than C, both locally and non-locally.

Controlling the local flow of control involves things like gotos, loops, calling functions and returning from them. Non-local control flow refers to situations where the program jumps across one or more levels of function activations without using the normal call or return operations.

The primitive means of C for local control flow is the goto statement, together with if. Loops done with for, while or do could in principle be rewritten with just goto and if. In Scheme, the primitive means for local control flow is the function call (together with if). Thus, the repetition of some computation in a loop is ultimately implemented by a function that calls itself, that is, by recursion.

This approach is theoretically very powerful since it is easier to reason formally about recursion than about gotos. In C, using recursion exclusively would not be practical, though, since it would eat up the stack very quickly. In Scheme, however, it is practical: function calls that appear in a tail position do not use any additional stack space (see Tail Calls).

A function call is in a tail position when it is the last thing the calling function does. The value returned by the called function is immediately returned from the calling function. In the following example, the call to bar-1 is in a tail position, while the call to bar-2 is not. (The call to 1- in foo-2 is in a tail position, though.)

(define (foo-1 x)
  (bar-1 (1- x)))

(define (foo-2 x)
  (1- (bar-2 x)))

Thus, when you take care to recurse only in tail positions, the recursion will only use constant stack space and will be as good as a loop constructed from gotos.
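The correspondence between a tail call and a goto can be made concrete in C. The two functions below are a minimal sketch with hypothetical names (sum_recursive and sum_goto); both add up the integers from 1 to n using an accumulator. Note that C itself does not promise to treat the first form as a loop, so only the second is certain to run in constant stack space.

```c
/* Tail-recursive form: the recursive call is the last thing the
   function does.  A C compiler may or may not turn it into a jump.  */
static unsigned long long
sum_recursive (unsigned long long n, unsigned long long acc)
{
  if (n == 0)
    return acc;
  return sum_recursive (n - 1, acc + n);
}

/* The same computation with the tail call replaced by reassigning
   the parameters and jumping back to the top of the function.  */
static unsigned long long
sum_goto (unsigned long long n, unsigned long long acc)
{
 again:
  if (n == 0)
    return acc;
  acc = acc + n;
  n = n - 1;
  goto again;
}
```

Calling either function with n = 100 and acc = 0 gives 5050. A Scheme implementation performs the rewriting from the first form to the second for every call in a tail position, which is why the recursive style is practical there.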

Scheme offers a few syntactic abstractions (do and named let) that make writing loops slightly easier.

But only Scheme functions can call other functions in a tail position: C functions cannot. This matters when you have, say, two functions that call each other recursively to form a common loop. The following (unrealistic) example shows how one might go about determining whether a non-negative integer n is even or odd.

(define (my-even? n)
  (cond ((zero? n) #t)
        (else (my-odd? (1- n)))))

(define (my-odd? n)
  (cond ((zero? n) #f)
        (else (my-even? (1- n)))))

Because the calls to my-even? and my-odd? are in tail positions, these two procedures can be applied to arbitrarily large integers without overflowing the stack. (They will still take a lot of time, of course.)

However, if one or both of the two procedures were rewritten in C, they could no longer call their companion in a tail position (since C does not have this concept). You might need to take this consideration into account when deciding which parts of your program to write in Scheme and which in C.
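If such a pair of functions does end up in C, one generic way to keep the stack from growing is a “trampoline”: each function returns a description of the next call instead of making it, and a small driver loop performs the calls one after another. This is a general C technique shown here as a sketch; it is not something libguile provides, and the names struct step, even_step, odd_step and trampoline_is_even are hypothetical.

```c
#include <stddef.h>

/* One pending step of the computation: which function to run next
   and the argument to give it.  NEXT == NULL means "finished", and
   RESULT is then the answer.  */
struct step;
typedef struct step (*step_fn) (unsigned long n);

struct step
{
  step_fn next;
  unsigned long n;
  int result;
};

static struct step even_step (unsigned long n);
static struct step odd_step (unsigned long n);

static struct step
even_step (unsigned long n)
{
  struct step s = { NULL, 0, 1 };     /* zero is even */

  if (n != 0)
    {
      s.next = odd_step;              /* "call" odd_step by returning it */
      s.n = n - 1;
    }
  return s;
}

static struct step
odd_step (unsigned long n)
{
  struct step s = { NULL, 0, 0 };     /* zero is not odd */

  if (n != 0)
    {
      s.next = even_step;
      s.n = n - 1;
    }
  return s;
}

/* The driver loop: run steps until one reports that it is finished.
   Only one step function is active on the C stack at any time.  */
static int
trampoline_is_even (unsigned long n)
{
  struct step s = even_step (n);

  while (s.next != NULL)
    s = s.next (s.n);
  return s.result;
}
```

The price is that every function taking part in the loop must be written in this indirect style, which is one reason why it can be simpler to leave mutually recursive code in Scheme.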

In addition to calling functions and returning from them, a Scheme program can also exit non-locally from a function so that the control flow returns directly to an outer level. This means that some functions might not return at all.

Moreover, a Scheme program can do more than jump to some outer level of control: it can also jump back into the middle of a function that has already exited. This might cause some functions to return more than once.

In general, these non-local jumps are done by invoking continuations that have previously been captured using call-with-current-continuation. Guile also offers a slightly restricted set of functions, catch and throw, that can only be used for non-local exits. This restriction makes them more efficient. Error reporting (with the function error) is implemented by invoking throw, for example. The functions catch and throw belong to the topic of exceptions.

Since Scheme functions can call C functions and vice versa, C code can experience the more general control flow of Scheme as well. It is possible that a C function will not return at all, or will return more than once. While C does offer setjmp and longjmp for non-local exits, it is still an unusual thing for C code. In contrast, non-local exits are very common in Scheme, mostly to report errors.
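For readers who have not used them, here is a minimal sketch of a non-local exit in plain C with setjmp and longjmp. It only illustrates what “a function that does not return” looks like from the caller’s side; it is not how libguile implements throw, and the names exit_point, steps_completed, checked_step, worker and run_worker are hypothetical.

```c
#include <setjmp.h>

static jmp_buf exit_point;      /* where a non-local exit lands */
static int steps_completed;     /* how far worker got */

/* Exits non-locally when N is negative: control then never comes
   back to the caller of checked_step.  */
static void
checked_step (int n)
{
  if (n < 0)
    longjmp (exit_point, 1);
}

static void
worker (int n)
{
  steps_completed = 1;
  checked_step (n);             /* might not return */
  steps_completed = 2;
}

/* Returns 0 if worker returned normally, 1 if it was left through
   a non-local exit.  */
static int
run_worker (int n)
{
  if (setjmp (exit_point) != 0)
    return 1;
  worker (n);
  return 0;
}
```

After run_worker (-1), steps_completed is still 1: the second assignment in worker was skipped because checked_step never returned. Any C function that calls into libguile can find itself in the position of worker.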

You need to be prepared for the non-local jumps in the control flow whenever you use a function from libguile: it is best to assume that any libguile function might signal an error or run a pending signal handler (which in turn can do arbitrary things).

It is often necessary to take cleanup actions when the control leaves a function non-locally. Also, when the control returns non-locally, some setup actions might be called for. For example, the Scheme function with-output-to-port needs to modify the global state so that current-output-port returns the port passed to with-output-to-port. The global output port needs to be reset to its previous value when with-output-to-port returns normally or when it is exited non-locally. Likewise, the port needs to be set again when control enters non-locally.

Scheme code can use the dynamic-wind function to arrange for the setting and resetting of the global state. C code can use the corresponding scm_internal_dynamic_wind function, or a scm_dynwind_begin/scm_dynwind_end pair together with suitable ‘dynwind actions’ (see Dynamic Wind).
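The bookkeeping behind such unwind actions can be modelled in plain C. The sketch below is a simplified illustration, not libguile’s implementation: it handles only non-local exits (not re-entry), it supports a single catch point, and every name in it (toy_wind_push, toy_throw, toy_catch, toy_with_output_to_port and so on) is hypothetical. Ports are represented by small integers to keep the code short.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

enum { TOY_CONSOLE = 1, TOY_LOGFILE = 2 };

static int toy_current_port = TOY_CONSOLE;   /* the global state */
static int toy_port_seen;                    /* what the thunk observed */

/* A stack of pending cleanup actions, innermost last.  */
struct toy_unwind
{
  void (*fn) (int);
  int data;
};
static struct toy_unwind toy_wind_stack[8];
static size_t toy_wind_depth;

static jmp_buf toy_catch_point;
static size_t toy_catch_depth;

static void
toy_wind_push (void (*fn) (int), int data)
{
  assert (toy_wind_depth < 8);
  toy_wind_stack[toy_wind_depth].fn = fn;
  toy_wind_stack[toy_wind_depth].data = data;
  toy_wind_depth++;
}

/* Remove the innermost cleanup action and run it.  */
static void
toy_wind_pop_and_run (void)
{
  assert (toy_wind_depth > 0);
  toy_wind_depth--;
  toy_wind_stack[toy_wind_depth].fn (toy_wind_stack[toy_wind_depth].data);
}

/* Non-local exit: run every cleanup registered since toy_catch was
   entered, then jump back to it.  */
static void
toy_throw (void)
{
  while (toy_wind_depth > toy_catch_depth)
    toy_wind_pop_and_run ();
  longjmp (toy_catch_point, 1);
}

/* Returns 0 if BODY returned normally, 1 if it called toy_throw.  */
static int
toy_catch (void (*body) (void))
{
  toy_catch_depth = toy_wind_depth;
  if (setjmp (toy_catch_point) != 0)
    return 1;
  body ();
  return 0;
}

static void
toy_restore_port (int port)
{
  toy_current_port = port;
}

/* Run THUNK with the current port set to PORT; the previous port is
   restored on a normal return and on a non-local exit alike.  */
static void
toy_with_output_to_port (int port, void (*thunk) (void))
{
  toy_wind_push (toy_restore_port, toy_current_port);
  toy_current_port = port;
  thunk ();                     /* may leave through toy_throw */
  toy_wind_pop_and_run ();      /* normal return: restore here */
}

static void
toy_failing_thunk (void)
{
  toy_port_seen = toy_current_port;
  toy_throw ();
}

static void
toy_body (void)
{
  toy_with_output_to_port (TOY_LOGFILE, toy_failing_thunk);
}
```

Calling toy_catch (toy_body) reports a non-local exit, the thunk saw TOY_LOGFILE as the current port, and afterwards toy_current_port is TOY_CONSOLE again even though toy_with_output_to_port never returned normally.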

Instead of coping with non-local control flow, you can also prevent it by erecting a continuation barrier (see Continuation Barriers). The function scm_c_with_continuation_barrier, for example, is guaranteed to return exactly once.


5.4.4 Asynchronous Signals

You cannot call libguile functions from handlers for POSIX signals, but you can register Scheme handlers for POSIX signals such as SIGINT. These handlers do not run during the actual signal delivery. Instead, they are run when the program (more precisely, the thread that the handler has been registered for) reaches the next safe point.

The libguile functions themselves have many such safe points. Consequently, you must be prepared for arbitrary actions anytime you call a libguile function. For example, even scm_cons can contain a safe point and when a signal handler is pending for your thread, calling scm_cons will run this handler and anything might happen, including a non-local exit although scm_cons would not ordinarily do such a thing on its own.
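The general pattern of deferring a handler to a safe point can be sketched in standard C. This is a model of the idea only, under the assumption of a single thread and a single signal; it does not show Guile’s own machinery, and the names toy_signal_pending, toy_note_signal, toy_deferred_handler, toy_safe_point and toy_install_handler are hypothetical.

```c
#include <signal.h>

static volatile sig_atomic_t toy_signal_pending;  /* set during delivery */
static int toy_handler_runs;                      /* deferred handler calls */

/* The real signal handler does almost nothing: it only records that
   the signal arrived.  */
static void
toy_note_signal (int signum)
{
  (void) signum;
  toy_signal_pending = 1;
}

/* The work we actually want done in response to the signal.  It runs
   in an ordinary context, so it may do arbitrary things.  */
static void
toy_deferred_handler (void)
{
  toy_handler_runs++;
}

/* A safe point: run the deferred handler if a signal is pending.  */
static void
toy_safe_point (void)
{
  if (toy_signal_pending)
    {
      toy_signal_pending = 0;
      toy_deferred_handler ();
    }
}

/* Returns non-zero on success.  */
static int
toy_install_handler (void)
{
  return signal (SIGINT, toy_note_signal) != SIG_ERR;
}
```

After toy_install_handler () and raise (SIGINT), toy_handler_runs is still 0: delivery only set the flag. The count becomes 1 at the next call to toy_safe_point. Code that never reaches a safe point would never run the handler, which is why long-running sections need explicit ones.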

If you do not want to allow the running of asynchronous signal handlers, you can block them temporarily with scm_dynwind_block_asyncs, for example. See System asyncs.

Since signal handling in Guile relies on safe points, you need to make sure that your functions do offer enough of them. Normally, calling libguile functions in the normal course of action is all that is needed. But when a thread might spend a long time in a code section that calls no libguile function, it is good to include explicit safe points. This can allow the user to interrupt your code with C-c, for example.

You can do this with the macro SCM_TICK. This macro is syntactically a statement. That is, you could use it like this:

while (1)
  {
    SCM_TICK;
    do_some_work ();
  }

Frequent execution of a safe point is even more important in multi-threaded programs (see Multi-Threading).


5.4.5 Multi-Threading

Guile can be used in multi-threaded programs just as well as in single-threaded ones.

Each thread that wants to use functions from libguile must put itself into guile mode and must then follow a few rules. If it doesn’t want to honor these rules in certain situations, a thread can temporarily leave guile mode (but can no longer use libguile functions during that time, of course).

Threads enter guile mode by calling scm_with_guile, scm_boot_guile, or scm_init_guile. As explained in the reference documentation for these functions, Guile will then learn about the stack bounds of the thread and can protect the SCM values that are stored in local variables. When a thread puts itself into guile mode for the first time, it gets a Scheme representation and is listed by all-threads, for example.

Threads in guile mode can block (e.g., do blocking I/O) without causing any problems(2); temporarily leaving guile mode with scm_without_guile before blocking slightly improves GC performance, though. For some common blocking operations, Guile provides convenience functions. For example, if you want to lock a pthread mutex while in guile mode, you might want to use scm_pthread_mutex_lock which is just like pthread_mutex_lock except that it leaves guile mode while blocking.

All libguile functions are (intended to be) robust in the face of multiple threads using them concurrently. This means that there is no risk of the internal data structures of libguile becoming corrupted in such a way that the process crashes.

A program might still produce nonsensical results, though. Taking hashtables as an example, Guile guarantees that you can use them from multiple threads concurrently and a hashtable will always remain a valid hashtable and Guile will not crash when you access it. It does not guarantee, however, that inserting into it concurrently from two threads will give useful results: only one insertion might actually happen, none might happen, or the table might in general be modified in a totally arbitrary manner. (It will still be a valid hashtable, but not the one that you might have expected.) Guile might also signal an error when it detects a harmful race condition.

Thus, you need to put in additional synchronizations when multiple threads want to use a single hashtable, or any other mutable Scheme object.

When writing C code for use with libguile, you should try to make it robust as well. An example that converts a list into a vector will help to illustrate. Here is a correct version:

SCM
my_list_to_vector (SCM list)
{
  SCM vector = scm_make_vector (scm_length (list), SCM_UNDEFINED);
  size_t len, i;

  len = scm_c_vector_length (vector);
  i = 0;
  while (i < len && scm_is_pair (list))
    {
      scm_c_vector_set_x (vector, i, scm_car (list));
      list = scm_cdr (list);
      i++;
    }

  return vector;
}

The first thing to note is that storing into a SCM location concurrently from multiple threads is guaranteed to be robust: you don’t know which value wins but it will in any case be a valid SCM value.

But there is no guarantee that the list referenced by list is not modified in another thread while the loop iterates over it. Thus, while copying its elements into the vector, the list might get longer or shorter. For this reason, the loop must check both that it doesn’t overrun the vector and that it doesn’t overrun the list. Otherwise, scm_c_vector_set_x would raise an error if the index is out of range, and scm_car and scm_cdr would raise an error if the value is not a pair.

It is safe to use scm_car and scm_cdr on the local variable list once it is known that the variable contains a pair. The contents of the pair might change spontaneously, but it will always stay a valid pair (and a local variable will of course not spontaneously point to a different Scheme object).

Likewise, a vector such as the one returned by scm_make_vector is guaranteed to always stay the same length so that it is safe to only use scm_c_vector_length once and store the result. (In the example, vector is safe anyway since it is a fresh object that no other thread can possibly know about until it is returned from my_list_to_vector.)

Of course the behavior of my_list_to_vector is suboptimal when list does indeed get asynchronously lengthened or shortened in another thread. But it is robust: it will always return a valid vector. That vector might be shorter than expected, or its last elements might be unspecified, but it is a valid vector and if a program wants to rule out these cases, it must avoid modifying the list asynchronously.

Here is another version that is also correct:

SCM
my_pedantic_list_to_vector (SCM list)
{
  SCM vector = scm_make_vector (scm_length (list), SCM_UNDEFINED);
  size_t len, i;

  len = scm_c_vector_length (vector);
  i = 0;
  while (i < len)
    {
      scm_c_vector_set_x (vector, i, scm_car (list));
      list = scm_cdr (list);
      i++;
    }

  return vector;
}

This version relies on the error-checking behavior of scm_car and scm_cdr. When the list is shortened (that is, when list holds a non-pair), scm_car will throw an error. This might be preferable to just returning a half-initialized vector.

The API for accessing vectors and arrays of various kinds from C takes a slightly different approach to thread-robustness. In order to get at the raw memory that stores the elements of an array, you need to reserve that array as long as you need the raw memory. During the time an array is reserved, its elements can still spontaneously change their values, but the memory itself and other things like the size of the array are guaranteed to stay fixed. Any operation that would change these parameters of an array that is currently reserved will signal an error. In order to avoid these errors, a program should of course put suitable synchronization mechanisms in place. As you can see, Guile itself is again only concerned about robustness, not about correctness: without proper synchronization, your program will likely not be correct, but the worst consequence is an error message.

Real thread-safety often requires that a critical section of code is executed in a certain restricted manner. A common requirement is that the code section is not entered a second time when it is already being executed. Locking a mutex while in that section ensures that no other thread will start executing it, blocking asyncs ensures that no asynchronous code enters the section again from the current thread, and the error checking of Guile mutexes guarantees that an error is signalled when the current thread accidentally reenters the critical section via recursive function calls.

Guile provides two mechanisms to support critical sections as outlined above. You can either use the macros SCM_CRITICAL_SECTION_START and SCM_CRITICAL_SECTION_END for very simple sections; or use a dynwind context together with a call to scm_dynwind_critical_section.

The macros only work reliably for critical sections that are guaranteed to not cause a non-local exit. They also do not detect an accidental reentry by the current thread. Thus, you should probably only use them to delimit critical sections that do not contain calls to libguile functions or to other external functions that might do complicated things.

The function scm_dynwind_critical_section, on the other hand, will correctly deal with non-local exits because it requires a dynwind context. Also, by using a separate mutex for each critical section, it can detect accidental reentries.
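How an error-checking mutex detects an accidental reentry can be sketched with plain POSIX threads. This example is an analogy, not Guile code: it uses a pthread mutex of type PTHREAD_MUTEX_ERRORCHECK rather than a Guile mutex, it does nothing about non-local exits or asyncs, and the names toy_section_mutex, toy_section_init and toy_critical are hypothetical. It assumes a POSIX system and may need to be linked with the threads library.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t toy_section_mutex;

/* Create the mutex with error checking enabled.  Returns 0 on
   success or an error number.  */
static int
toy_section_init (void)
{
  pthread_mutexattr_t attr;
  int err;

  err = pthread_mutexattr_init (&attr);
  if (err != 0)
    return err;
  err = pthread_mutexattr_settype (&attr, PTHREAD_MUTEX_ERRORCHECK);
  if (err == 0)
    err = pthread_mutex_init (&toy_section_mutex, &attr);
  pthread_mutexattr_destroy (&attr);
  return err;
}

/* A critical section that reenters itself DEPTH times.  Returns 0 if
   the section ran without reentry, or the error reported by the
   mutex (EDEADLK) when the current thread tried to enter it again.  */
static int
toy_critical (int depth)
{
  int err = pthread_mutex_lock (&toy_section_mutex);

  if (err != 0)
    return err;                 /* reentry detected, nothing was locked */
  if (depth > 0)
    err = toy_critical (depth - 1);
  pthread_mutex_unlock (&toy_section_mutex);
  return err;
}
```

With an ordinary mutex, the recursive call in toy_critical (1) would simply deadlock. With error checking, the inner lock attempt fails with EDEADLK and the problem is reported instead.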


5.5 Defining New Types (Smobs)

Smobs are Guile’s mechanism for adding new primitive types to the system. The term “smob” was coined by Aubrey Jaffer, who says it comes from “small object”, referring to the fact that they are quite limited in size: they can hold just one pointer to a larger memory block plus 16 extra bits.

To define a new smob type, the programmer provides Guile with some essential information about the type — how to print it, how to garbage collect it, and so on — and Guile allocates a fresh type tag for it. The programmer can then use scm_c_define_gsubr to make a set of C functions visible to Scheme code that create and operate on these objects.

(You can find a complete version of the example code used in this section in the Guile distribution, in doc/example-smob. That directory includes a makefile and a suitable main function, so you can build a complete interactive Guile shell, extended with the datatypes described here.)


5.5.1 Describing a New Type

To define a new type, the programmer must write two functions to manage instances of the type:

print

Guile will apply this function to each instance of the new type to print the value, as for display or write. The default print function prints #<NAME ADDRESS> where NAME is the first argument passed to scm_make_smob_type.

equalp

If Scheme code asks the equal? function to compare two instances of the same smob type, Guile calls this function. It should return SCM_BOOL_T if a and b should be considered equal?, or SCM_BOOL_F otherwise. If equalp is NULL, equal? will assume that two instances of this type are never equal? unless they are eq?.

When the only resource associated with a smob is memory managed by the garbage collector—i.e., memory allocated with the scm_gc_malloc functions—this is sufficient. However, when a smob is associated with other kinds of resources, it may be necessary to define one of the following functions, or both:

mark

Guile will apply this function to each instance of the new type it encounters during garbage collection. This function is responsible for telling the collector about any other SCM values that the object has stored, and that are in memory regions not already scanned by the garbage collector. See Garbage Collecting Smobs, for more details.

free

Guile will apply this function to each instance of the new type that is to be deallocated. The function should release all resources held by the object. This is analogous to the Java finalization method—it is invoked at an unspecified time (when garbage collection occurs) after the object is dead. See Garbage Collecting Smobs, for more details.

This function operates while the heap is in an inconsistent state and must therefore be careful. See Smobs, for details about what this function is allowed to do.

To actually register the new smob type, call scm_make_smob_type. It returns a value of type scm_t_bits which identifies the new smob type.

The four special functions described above are registered by calling one of scm_set_smob_mark, scm_set_smob_free, scm_set_smob_print, or scm_set_smob_equalp, as appropriate. Each function is intended to be used at most once per type, and the call should be placed immediately following the call to scm_make_smob_type.

There can be at most 256 different smob types in the system. Instead of registering a huge number of smob types (for example, one for each relevant C struct in your application), it is sometimes better to register just one and implement a second layer of type dispatching on top of it. This second layer might use the 16 extra bits to extend its type, for example.
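Such a second layer might look roughly like the following sketch. It models the smob as a plain C struct with a 16-bit field and one data pointer instead of using the real smob macros, so it shows only the dispatching idea; struct toy_cell, the TOY_KIND_ codes, the two application structs and toy_cell_area are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical application structs that all share one registered type.  */
struct toy_point { int x, y; };
struct toy_rect  { int width, height; };

/* Second-level type codes, small enough for 16 bits.  */
enum { TOY_KIND_POINT = 1, TOY_KIND_RECT = 2 };

/* Stand-in for a smob: 16 spare bits plus one data word.  */
struct toy_cell
{
  uint16_t kind;
  void *data;
};

/* Dispatch on the second-level code before touching DATA.
   Returns -1 for a code this layer does not know.  */
static int
toy_cell_area (const struct toy_cell *cell)
{
  switch (cell->kind)
    {
    case TOY_KIND_POINT:
      return 0;                 /* a point covers no area */
    case TOY_KIND_RECT:
      {
        const struct toy_rect *r = cell->data;
        return r->width * r->height;
      }
    default:
      return -1;
    }
}
```

Every operation first checks the code stored in the extra bits and only then interprets the data word, much as Guile itself first checks the smob type tag.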

Here is how one might declare and register a new type representing eight-bit gray-scale images:

#include <libguile.h>

struct image {
  int width, height;
  char *pixels;

  /* The name of this image */
  SCM name;

  /* A function to call when this image is
     modified, e.g., to update the screen,
     or SCM_BOOL_F if no action necessary */
  SCM update_func;
};

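static scm_t_bits image_tag;

/* The mark, free and print functions are defined elsewhere in this
   example; they are declared here so that they can be registered.  */
static SCM mark_image (SCM image_smob);
static size_t free_image (SCM image_smob);
static int print_image (SCM image_smob, SCM port, scm_print_state *pstate);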

void
init_image_type (void)
{
  image_tag = scm_make_smob_type ("image", sizeof (struct image));
  scm_set_smob_mark (image_tag, mark_image);
  scm_set_smob_free (image_tag, free_image);
  scm_set_smob_print (image_tag, print_image);
}

5.5.2 Creating Smob Instances

Normally, smobs can have one immediate word of data. This word stores either a pointer to an additional memory block that holds the real data, or it might hold the data itself when it fits. The word is large enough for a SCM value, a pointer to void, or an integer that fits into a size_t or ssize_t.

You can also create smobs that have two or three immediate words, and when these words suffice to store all data, it is more efficient to use these super-sized smobs instead of using a normal smob plus a memory block. See Double Smobs, for their discussion.

Guile provides functions for managing memory which are often helpful when implementing smobs. See Memory Blocks.

To retrieve the immediate word of a smob, you use the macro SCM_SMOB_DATA. It can be set with SCM_SET_SMOB_DATA. The 16 extra bits can be accessed with SCM_SMOB_FLAGS and SCM_SET_SMOB_FLAGS.

The two macros SCM_SMOB_DATA and SCM_SET_SMOB_DATA treat the immediate word as if it were of type scm_t_bits, which is an unsigned integer type large enough to hold a pointer to void. Thus you can use these macros to store arbitrary pointers in the smob word.

When you want to store a SCM value directly in the immediate word of a smob, you should use the macros SCM_SMOB_OBJECT and SCM_SET_SMOB_OBJECT to access it.

Creating a smob instance can be tricky when it consists of multiple steps that allocate resources. Most of the time, this is mainly about allocating memory to hold associated data structures. Using memory managed by the garbage collector simplifies things: the garbage collector will automatically scan those data structures for pointers, and reclaim them when they are no longer referenced.

Continuing the example from above, if the global variable image_tag contains a tag returned by scm_make_smob_type, here is how we could construct a smob whose immediate word contains a pointer to a freshly allocated struct image:

SCM
make_image (SCM name, SCM s_width, SCM s_height)
{
  SCM smob;
  struct image *image;
  int width = scm_to_int (s_width);
  int height = scm_to_int (s_height);

  /* Step 1: Allocate the memory block.
   */
  image = (struct image *)
     scm_gc_malloc (sizeof (struct image), "image");

  /* Step 2: Initialize it with straight code.
   */
  image->width = width;
  image->height = height;
  image->pixels = NULL;
  image->name = SCM_BOOL_F;
  image->update_func = SCM_BOOL_F;

  /* Step 3: Create the smob.
   */
  smob = scm_new_smob (image_tag, image);

  /* Step 4: Finish the initialization.
   */
  image->name = name;
  image->pixels =
    scm_gc_malloc_pointerless (width * height, "image pixels");

  return smob;
}

We use scm_gc_malloc_pointerless for the pixel buffer to tell the garbage collector not to scan it for pointers. Calls to scm_gc_malloc, scm_new_smob, and scm_gc_malloc_pointerless raise an exception in out-of-memory conditions; the garbage collector is able to reclaim previously allocated memory if that happens.


5.5.3 Type checking

Functions that operate on smobs should check that the passed SCM value indeed is a suitable smob before accessing its data. They can do this with scm_assert_smob_type.

For example, here is a simple function that operates on an image smob, and checks the type of its argument.

SCM
clear_image (SCM image_smob)
{
  int area;
  struct image *image;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  area = image->width * image->height;
  memset (image->pixels, 0, area);

  /* Invoke the image's update function.
   */
  if (scm_is_true (image->update_func))
    scm_call_0 (image->update_func);

  scm_remember_upto_here_1 (image_smob);

  return SCM_UNSPECIFIED;
}

See Remembering During Operations for an explanation of the call to scm_remember_upto_here_1.


5.5.4 Garbage Collecting Smobs

Once a smob has been released to the tender mercies of the Scheme system, it must be prepared to survive garbage collection. In the example above, all the memory associated with the smob is managed by the garbage collector because we used the scm_gc_ allocation functions. Thus, no special care must be taken: the garbage collector automatically scans them and reclaims any unused memory.

However, when data associated with a smob is managed in some other way—e.g., malloc’d memory or file descriptors—it is possible to specify a free function to release those resources when the smob is reclaimed, and a mark function to mark Scheme objects otherwise invisible to the garbage collector.

As described in more detail elsewhere (see Conservative GC), every object in the Scheme system has a mark bit, which the garbage collector uses to tell live objects from dead ones. When collection starts, every object’s mark bit is clear. The collector traces pointers through the heap, starting from objects known to be live, and sets the mark bit on each object it encounters. When it can find no more unmarked objects, the collector walks all objects, live and dead, frees those whose mark bits are still clear, and clears the mark bit on the others.

The two main portions of the collection are called the mark phase, during which the collector marks live objects, and the sweep phase, during which the collector frees all unmarked objects.
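
The two phases can be modeled in a few lines of standalone C. This is a toy sketch, not Guile’s collector: the fixed-size heap, the toy_obj structure and every toy_ function are assumptions made for illustration. It marks from a single root, frees what was not reached, and clears the mark bits of the survivors.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OBJECTS 8

/* A toy heap object: a mark bit, an allocation flag, and up to two
   references to other objects (NULL when unused).  */
struct toy_obj {
  int marked;
  int allocated;
  struct toy_obj *ref[2];
};

static struct toy_obj heap[MAX_OBJECTS];

/* Allocate every slot of the toy heap; clear all marks and references. */
static void
toy_heap_reset (void)
{
  int i;

  for (i = 0; i < MAX_OBJECTS; i++)
    {
      heap[i].marked = 0;
      heap[i].allocated = 1;
      heap[i].ref[0] = heap[i].ref[1] = NULL;
    }
}

/* Mark phase: set the mark bit first, then follow references.  Setting
   the bit before recursing keeps cycles from looping forever.  */
static void
toy_mark (struct toy_obj *obj)
{
  if (obj == NULL || obj->marked)
    return;
  obj->marked = 1;
  toy_mark (obj->ref[0]);
  toy_mark (obj->ref[1]);
}

/* Sweep phase: free unmarked objects and clear the mark bit on the
   survivors.  Returns the number of objects freed.  */
static int
toy_sweep (void)
{
  int i, freed = 0;

  for (i = 0; i < MAX_OBJECTS; i++)
    {
      if (!heap[i].allocated)
        continue;
      if (heap[i].marked)
        heap[i].marked = 0;
      else
        {
          heap[i].allocated = 0;
          freed++;
        }
    }
  return freed;
}

/* One full collection starting from a single root. */
static int
toy_collect (struct toy_obj *root)
{
  toy_mark (root);
  return toy_sweep ();
}
```

If three of the eight objects form a cycle reachable from the root, one call to toy_collect frees the other five and leaves every mark bit clear for the next collection.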

The mark bit of a smob lives in a special memory region. When the collector encounters a smob, it sets the smob’s mark bit, and uses the smob’s type tag to find the appropriate mark function for that smob. It then calls this mark function, passing it the smob as its only argument.

The mark function is responsible for marking any other Scheme objects the smob refers to. If it does not do so, the objects’ mark bits will still be clear when the collector begins to sweep, and the collector will free them. If this occurs, it will probably break, or at least confuse, any code operating on the smob; the smob’s SCM values will have become dangling references.

To mark an arbitrary Scheme object, the mark function calls scm_gc_mark.

Thus, here is how we might write mark_image—again this is not needed in our example since we used the scm_gc_ allocation routines, so this is just for the sake of illustration:

SCM
mark_image (SCM image_smob)
{
  /* Mark the image's name and update function.  */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  scm_gc_mark (image->update_func);

  return SCM_BOOL_F;
}

Note that, even though the image’s update_func could be an arbitrarily complex structure (representing a procedure and any values enclosed in its environment), scm_gc_mark will recurse as necessary to mark all its components. Because scm_gc_mark sets an object’s mark bit before it recurses, it is not confused by circular structures.

As an optimization, the collector will mark whatever value is returned by the mark function; this helps limit depth of recursion during the mark phase. Thus, the code above should really be written as:

SCM
mark_image (SCM image_smob)
{
  /* Mark the image's name and update function.  */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  return image->update_func;
}

Finally, when the collector encounters an unmarked smob during the sweep phase, it uses the smob’s tag to find the appropriate free function for the smob. It then calls that function, passing it the smob as its only argument.

The free function must release any resources used by the smob. However, it must not free objects managed by the collector; the collector will take care of them. For historical reasons, the return type of the free function should be size_t, an unsigned integral type; the free function should always return zero.

Here is how we might write the free_image function for the image smob type—again for the sake of illustration, since our example does not need it thanks to the use of the scm_gc_ allocation routines:

size_t
free_image (SCM image_smob)
{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_free (image->pixels,
               image->width * image->height,
               "image pixels");
  scm_gc_free (image, sizeof (struct image), "image");

  return 0;
}

During the sweep phase, the garbage collector will clear the mark bits on all live objects. The code which implements a smob need not do this itself.

There is no way for smob code to be notified when collection is complete.

It is usually a good idea to minimize the amount of processing done during garbage collection; keep the mark and free functions very simple. Since collections occur at unpredictable times, it is easy for any unusual activity to interfere with normal code.


5.5.5 Remembering During Operations

It’s important that a smob is visible to the garbage collector whenever its contents are being accessed. Otherwise it could be freed while code is still using it.

For example, consider a procedure to convert image data to a list of pixel values.

SCM
image_to_list (SCM image_smob)
{
  struct image *image;
  SCM lst;
  int i;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  lst = SCM_EOL;
  for (i = image->width * image->height - 1; i >= 0; i--)
    lst = scm_cons (scm_from_char (image->pixels[i]), lst);

  scm_remember_upto_here_1 (image_smob);
  return lst;
}

In the loop, only the image pointer is used and the C compiler has no reason to keep the image_smob value anywhere. If scm_cons results in a garbage collection, image_smob might not be on the stack or anywhere else and could be freed, leaving the loop accessing freed data. The use of scm_remember_upto_here_1 prevents this, by creating a reference to image_smob after all data accesses.

There’s no need to do the same for lst, since that’s the return value and the compiler will certainly keep it in a register or somewhere throughout the routine.

The clear_image example previously shown (see Type checking) also used scm_remember_upto_here_1 for this reason.

It’s only in quite rare circumstances that a missing scm_remember_upto_here_1 will bite, but when it happens the consequences are serious. Fortunately the rule is simple: whenever calling a Guile library function or doing something that might, ensure that the SCM of a smob is referenced past all accesses to its insides. Do this by adding an scm_remember_upto_here_1 if there are no other references.

In a multi-threaded program, the rule is the same. As far as a given thread is concerned, a garbage collection still only occurs within a Guile library function, not at an arbitrary time. (Guile waits for all threads to reach one of its library functions, and holds them there while the collector runs.)


5.5.6 Double Smobs

Smobs are called smobs because they are small: they normally have room for only one void* or SCM value plus 16 bits. The reason for this is that smobs are implemented directly on top of Guile’s low-level two-word cells, which are also used to implement pairs, for example. (See Data Representation for the details.) One word of the two-word cell is used for SCM_SMOB_DATA (or SCM_SMOB_OBJECT); the other contains the 16-bit type tag and the 16 extra bits.

In addition to the fundamental two-word cells, Guile also has four-word cells, which are appropriately called double cells. You can use them for double smobs and get two more immediate words of type scm_t_bits.

A double smob is created with scm_new_double_smob. Its immediate words can be retrieved as scm_t_bits with SCM_SMOB_DATA_2 and SCM_SMOB_DATA_3 in addition to SCM_SMOB_DATA. Unsurprisingly, the words can be set to scm_t_bits values with SCM_SET_SMOB_DATA_2 and SCM_SET_SMOB_DATA_3.

Of course there are also SCM_SMOB_OBJECT_2, SCM_SMOB_OBJECT_3, SCM_SET_SMOB_OBJECT_2, and SCM_SET_SMOB_OBJECT_3.
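
To picture how a four-word cell carries a tag, 16 flag bits and three immediate words, here is a standalone model. It does not use libguile: word_t stands in for scm_t_bits, and toy_double_cell and the toy_ functions are hypothetical. The bit positions chosen (tag in the low 16 bits, flags in the next 16) are an assumption made for this sketch; see Data Representation for the layout Guile actually uses.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t word_t;   /* stand-in for scm_t_bits */

/* Toy double cell: word 0 carries the type tag and flag bits,
   words 1 to 3 are the immediate words.  */
struct toy_double_cell {
  word_t word[4];
};

/* Assumed layout for this sketch: tag in bits 0-15, flags in bits 16-31. */
static word_t
toy_pack_word0 (uint16_t tag, uint16_t flags)
{
  return (word_t) tag | ((word_t) flags << 16);
}

/* Extract the 16-bit type tag from word 0. */
static uint16_t
toy_tag (const struct toy_double_cell *cell)
{
  return (uint16_t) (cell->word[0] & 0xffff);
}

/* Extract the 16 extra flag bits from word 0. */
static uint16_t
toy_flags (const struct toy_double_cell *cell)
{
  return (uint16_t) ((cell->word[0] >> 16) & 0xffff);
}

/* Build a double cell holding three immediate words. */
static struct toy_double_cell
toy_new_double (uint16_t tag, uint16_t flags,
                word_t data1, word_t data2, word_t data3)
{
  struct toy_double_cell cell;

  cell.word[0] = toy_pack_word0 (tag, flags);
  cell.word[1] = data1;
  cell.word[2] = data2;
  cell.word[3] = data3;
  assert (toy_tag (&cell) == tag);
  return cell;
}
```

In this model, words 1, 2 and 3 correspond to what SCM_SMOB_DATA, SCM_SMOB_DATA_2 and SCM_SMOB_DATA_3 return for a real double smob.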


5.5.7 The Complete Example

Here is the complete text of the implementation of the image datatype, as presented in the sections above. We also provide a definition for the smob’s print function, and make some objects and functions static, to clarify exactly what the surrounding code is using.

As mentioned above, you can find this code in the Guile distribution, in doc/example-smob. That directory includes a makefile and a suitable main function, so you can build a complete interactive Guile shell, extended with the datatypes described here.

/* file "image-type.c" */

#include <stdlib.h>
#include <string.h>
#include <libguile.h>

static scm_t_bits image_tag;

struct image {
  int width, height;
  char *pixels;

  /* The name of this image */
  SCM name;

  /* A function to call when this image is
     modified, e.g., to update the screen,
     or SCM_BOOL_F if no action necessary */
  SCM update_func;
};

static SCM
make_image (SCM name, SCM s_width, SCM s_height)
{
  SCM smob;
  struct image *image;
  int width = scm_to_int (s_width);
  int height = scm_to_int (s_height);

  /* Step 1: Allocate the memory block.
   */
  image = (struct image *)
     scm_gc_malloc (sizeof (struct image), "image");

  /* Step 2: Initialize it with straight code.
   */
  image->width = width;
  image->height = height;
  image->pixels = NULL;
  image->name = SCM_BOOL_F;
  image->update_func = SCM_BOOL_F;

  /* Step 3: Create the smob.
   */
  smob = scm_new_smob (image_tag, image);

  /* Step 4: Finish the initialization.
   */
  image->name = name;
  image->pixels =
     scm_gc_malloc_pointerless (width * height, "image pixels");

  return smob;
}

SCM
clear_image (SCM image_smob)
{
  int area;
  struct image *image;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  area = image->width * image->height;
  memset (image->pixels, 0, area);

  /* Invoke the image's update function.
   */
  if (scm_is_true (image->update_func))
    scm_call_0 (image->update_func);

  scm_remember_upto_here_1 (image_smob);

  return SCM_UNSPECIFIED;
}

static SCM
mark_image (SCM image_smob)
{
  /* Mark the image's name and update function.  */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  return image->update_func;
}

static size_t
free_image (SCM image_smob)
{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_free (image->pixels,
               image->width * image->height,
               "image pixels");
  scm_gc_free (image, sizeof (struct image), "image");

  return 0;
}

static int
print_image (SCM image_smob, SCM port, scm_print_state *pstate)
{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_puts ("#<image ", port);
  scm_display (image->name, port);
  scm_puts (">", port);

  /* non-zero means success */
  return 1;
}

void
init_image_type (void)
{
  image_tag = scm_make_smob_type ("image", sizeof (struct image));
  scm_set_smob_mark (image_tag, mark_image);
  scm_set_smob_free (image_tag, free_image);
  scm_set_smob_print (image_tag, print_image);

  scm_c_define_gsubr ("clear-image", 1, 0, 0, clear_image);
  scm_c_define_gsubr ("make-image", 3, 0, 0, make_image);
}

Here is a sample build and interaction with the code from the example-smob directory, on the author’s machine:

zwingli:example-smob$ make CC=gcc
gcc `pkg-config --cflags guile-2.0` -c image-type.c -o image-type.o
gcc `pkg-config --cflags guile-2.0` -c myguile.c -o myguile.o
gcc image-type.o myguile.o `pkg-config --libs guile-2.0` -o myguile
zwingli:example-smob$ ./myguile
guile> make-image
#<primitive-procedure make-image>
guile> (define i (make-image "Whistler's Mother" 100 100))
guile> i
#<image Whistler's Mother>
guile> (clear-image i)
guile> (clear-image 4)
ERROR: In procedure clear-image in expression (clear-image 4):
ERROR: Wrong type (expecting image): 4
ABORT: (wrong-type-arg)
 
Type "(backtrace)" to get more information.
guile> 

5.6 Function Snarfing

When writing C code for use with Guile, you typically define a set of C functions, and then make some of them visible to the Scheme world by calling scm_c_define_gsubr or related functions. If you have many functions to publish, it can sometimes be annoying to keep the list of calls to scm_c_define_gsubr in sync with the list of function definitions.

Guile provides the guile-snarf program to manage this problem. Using this tool, you can keep all the information needed to define the function alongside the function definition itself; guile-snarf will extract this information from your source code, and automatically generate a file of calls to scm_c_define_gsubr which you can #include into an initialization function.

The snarfing mechanism works for many kinds of initialization actions, not just for collecting calls to scm_c_define_gsubr. For a full list of what can be done, see Snarfing Macros.

The guile-snarf program is invoked like this:

guile-snarf [-o outfile] [cpp-args ...]

This command will extract initialization actions to outfile. When no outfile has been specified or when outfile is -, standard output will be used. The C preprocessor is called with cpp-args (which usually include an input file) and the output is filtered to extract the initialization actions.

If there are errors during processing, outfile is deleted and the program exits with non-zero status.

During snarfing, the pre-processor macro SCM_MAGIC_SNARFER is defined. You could use this to avoid including snarfer output files that don’t yet exist by writing code like this:

#ifndef SCM_MAGIC_SNARFER
#include "foo.x"
#endif

Here is how you might define the Scheme function clear-image, implemented by the C function clear_image:

#include <libguile.h>

SCM_DEFINE (clear_image, "clear-image", 1, 0, 0,
            (SCM image_smob),
            "Clear the image.")
{
  /* C code to clear the image in image_smob... */
}

void
init_image_type ()
{
#include "image-type.x"
}

The SCM_DEFINE declaration says that the C function clear_image implements a Scheme function called clear-image, which takes one required argument (of type SCM and named image_smob), no optional arguments, and no rest argument. The string "Clear the image." provides a short help text for the function; it is called a docstring.

The SCM_DEFINE macro also defines a static array of characters initialized to the Scheme name of the function. In this case, s_clear_image is set to the C string "clear-image". You might want to use this symbol when generating error messages.

Assuming the text above lives in a file named image-type.c, you will need to execute the following command to prepare this file for compilation:

guile-snarf -o image-type.x image-type.c

This scans image-type.c for SCM_DEFINE declarations, and writes to image-type.x the output:

scm_c_define_gsubr ("clear-image", 1, 0, 0, (SCM (*)() ) clear_image);

When compiled normally, SCM_DEFINE is a macro which expands to the function header for clear_image.

Note that the output file name matches the #include from the input file. Also, you still need to provide all the same information you would if you were using scm_c_define_gsubr yourself, but you can place the information near the function definition itself, so it is less likely to become incorrect or out-of-date.
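
To make the compile-time role of SCM_DEFINE concrete, here is a deliberately simplified stand-in that builds without libguile. TOY_DEFINE, clear_buffer and clear_buffer_name are hypothetical names, and the macro body is an assumption about the general shape of such a macro, not a copy of Guile’s definition. It shows one declaration producing both the name array and the function header.

```c
#include <assert.h>
#include <string.h>

/* TOY_DEFINE is a hypothetical, simplified stand-in for SCM_DEFINE.
   It emits a static character array holding the Scheme-level name,
   followed by the C function header.  The arity and docstring
   arguments are accepted but ignored in this sketch.  */
#define TOY_DEFINE(FNAME, PRIMNAME, REQ, OPT, VAR, ARGLIST, DOCSTRING) \
  static const char s_ ## FNAME[] = PRIMNAME;                          \
  int FNAME ARGLIST

/* Expands to:
     static const char s_clear_buffer[] = "clear-buffer";
     int clear_buffer (char *buf, int len)
   so the braces below become the function body.  */
TOY_DEFINE (clear_buffer, "clear-buffer", 2, 0, 0,
            (char *buf, int len),
            "Fill BUF with zeros.")
{
  memset (buf, 0, (size_t) len);
  return len;
}

/* The generated name array is handy for error messages. */
static const char *
clear_buffer_name (void)
{
  return s_clear_buffer;
}
```

The parenthesized argument list is passed as a single macro argument, which is why it can contain commas.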

If you have many files that guile-snarf must process, you should consider using a fragment like the following in your Makefile:

snarfcppopts = $(DEFS) $(INCLUDES) $(CPPFLAGS) $(CFLAGS)
.SUFFIXES: .x
.c.x:
	guile-snarf -o $@ $< $(snarfcppopts)

This tells make to run guile-snarf to produce each needed .x file from the corresponding .c file.

The program guile-snarf passes its command-line arguments directly to the C preprocessor, which it uses to extract the information it needs from the source code. This means you can pass normal compilation flags to guile-snarf to define preprocessor symbols, add header file directories, and so on.


5.7 An Overview of Guile Programming

Guile is designed as an extension language interpreter that is straightforward to integrate with applications written in C (and C++). The big win here for the application developer is that Guile integration, as the Guile web page says, “lowers your project’s hacktivation energy.” Lowering the hacktivation energy means that you, as the application developer, and your users, reap the benefits that flow from being able to extend the application in a high level extension language rather than in plain old C.

In abstract terms, it’s difficult to explain what this really means and what the integration process involves, so instead let’s begin by jumping straight into an example of how you might integrate Guile into an existing program, and what you could expect to gain by so doing. With that example under our belts, we’ll then return to a more general analysis of the arguments involved and the range of programming options available.


5.7.1 How One Might Extend Dia Using Guile

Dia is a free software program for drawing schematic diagrams like flow charts and floor plans (http://www.gnome.org/projects/dia/). This section conducts the thought experiment of adding Guile to Dia. In so doing, it aims to illustrate several of the steps and considerations involved in adding Guile to applications in general.


5.7.1.1 Deciding Why You Want to Add Guile

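First off, you should understand why you want to add Guile to Dia at all, and that means forming a picture of what Dia does and how it does it. So, what are the constituents of the Dia application? Broadly, there are three: the application domain objects that Dia exists to manipulate (shapes, connectors, pages, and their properties); the code that manages the graphical face of the application, including the layout and display of those objects; and the code that handles input events from the user.

(In other words, a textbook example of the model-view-controller paradigm.)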

Next question: how will Dia benefit once the Guile integration is complete? Several (positive!) answers are possible here, and the choice is obviously up to the application developers. Still, one answer is that the main benefit will be the ability to manipulate Dia’s application domain objects from Scheme.

Suppose that Dia made a set of procedures available in Scheme, representing the most basic operations on objects such as shapes, connectors, and so on. Using Scheme, the application user could then write code that builds upon these basic operations to create more complex procedures. For example, given basic procedures to enumerate the objects on a page, to determine whether an object is a square, and to change the fill pattern of a single shape, the user can write a Scheme procedure to change the fill pattern of all squares on the current page:

(define (change-squares'-fill-pattern new-pattern)
  (for-each-shape current-page
    (lambda (shape)
      (if (square? shape)
          (change-fill-pattern shape new-pattern)))))

5.7.1.2 Four Steps Required to Add Guile

Assuming this objective, four steps are needed to achieve it.

First, you need a way of representing your application-specific objects — such as shape in the previous example — when they are passed into the Scheme world. Unless your objects are so simple that they map naturally into builtin Scheme data types like numbers and strings, you will probably want to use Guile’s SMOB interface to create a new Scheme data type for your objects.

Second, you need to write code for the basic operations like for-each-shape and square? such that they access and manipulate your existing data structures correctly, and then make these operations available as primitives on the Scheme level.

Third, you need to provide some mechanism within the Dia application that a user can hook into to cause arbitrary Scheme code to be evaluated.

Finally, you need to restructure your top-level application C code a little so that it initializes the Guile interpreter correctly and declares your SMOBs and primitives to the Scheme world.

The following subsections expand on these four points in turn.


5.7.1.3 How to Represent Dia Data in Scheme

For all but the most trivial applications, you will probably want to allow some representation of your domain objects to exist on the Scheme level. This is where the idea of SMOBs comes in, and with it issues of lifetime management and garbage collection.

To get more concrete about this, let’s look again at the example we gave earlier of how application users can use Guile to build higher-level functions from the primitives that Dia itself provides.

(define (change-squares'-fill-pattern new-pattern)
  (for-each-shape current-page
    (lambda (shape)
      (if (square? shape)
          (change-fill-pattern shape new-pattern)))))

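Consider what is stored here in the variable shape. For each shape on the current page, the for-each-shape primitive calls (lambda (shape) …) with an argument representing that shape. The question is: how is that argument represented on the Scheme level? Three issues arise. First, whatever the representation, the C code for primitives such as square? and change-fill-pattern must be able to turn the value it receives back into a pointer to the underlying C structure describing the shape. Second, the representation must cope with Scheme code holding on to the value for later use, for example in a global variable, after the underlying shape has been deleted. Third, objects that exist only in Scheme are reclaimed by the garbage collector once nothing refers to them, but the underlying C shape must not be deleted merely because Scheme code has finished with its value.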

One resolution of these issues is for the Scheme-level representation of a shape to be a new, Scheme-specific C structure wrapped up as a SMOB. The SMOB is what is passed into and out of Scheme code, and the Scheme-specific C structure inside the SMOB points to Dia’s underlying C structure so that the code for primitives like square? can get at it.

To cope with an underlying shape being deleted while Scheme code is still holding onto a Scheme shape value, the underlying C structure should have a new field that points to the Scheme-specific SMOB. When a shape is deleted, the relevant code chains through to the Scheme-specific structure and sets its pointer back to the underlying structure to NULL. Thus the SMOB value for the shape continues to exist, but any primitive code that tries to use it will detect that the underlying shape has been deleted because the underlying structure pointer is NULL.

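So, to summarize the steps involved in this resolution of the problem (and assuming that the underlying C structure for a shape is struct dia_shape): define a Scheme-specific structure, struct dia_guile_shape, whose c_shape field points to the underlying struct dia_shape; add a guile_shape field to struct dia_shape that points back to the Scheme-specific structure, if there is one; wrap struct dia_guile_shape as a SMOB type; when passing a C shape into Scheme, reuse the Scheme-specific structure recorded in guile_shape, or create it and its SMOB if the field is NULL; and in every primitive that receives a shape SMOB, check that c_shape is not NULL before using it.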

As far as memory management is concerned, the SMOB values and their Scheme-specific structures are under the control of the garbage collector, whereas the underlying C structures are explicitly managed in exactly the same way that Dia managed them before we thought of adding Guile.

When the garbage collector decides to free a shape SMOB value, it calls the SMOB free function that was specified when defining the shape SMOB type. To maintain the correctness of the guile_shape field in the underlying C structure, this function should chain through to the underlying C structure (if it still exists) and set its guile_shape field to NULL.
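
The pointer bookkeeping described above can be sketched in standalone C. The structure and field names (dia_shape, dia_guile_shape, guile_shape, c_shape) follow this section; the helper functions are hypothetical, and plain malloc and free stand in for both Dia’s allocator and the garbage collector, so this is an illustration of the two-way unhooking rather than working Guile code.

```c
#include <assert.h>
#include <stdlib.h>

struct dia_guile_shape;

/* Stand-in for Dia's own shape structure; only the new back-pointer
   field matters here.  */
struct dia_shape {
  int type;
  struct dia_guile_shape *guile_shape;   /* NULL if never passed to Scheme */
};

/* Scheme-specific structure that would live inside the SMOB. */
struct dia_guile_shape {
  struct dia_shape *c_shape;             /* NULL once the shape is deleted */
};

/* Return the Scheme-side structure for SHAPE, creating it on first use. */
static struct dia_guile_shape *
shape_to_guile (struct dia_shape *shape)
{
  if (shape->guile_shape == NULL)
    {
      struct dia_guile_shape *gs = malloc (sizeof *gs);

      if (gs == NULL)
        return NULL;
      gs->c_shape = shape;
      shape->guile_shape = gs;
    }
  return shape->guile_shape;
}

/* Dia deletes a shape: tell the Scheme side first, then free it. */
static void
delete_shape (struct dia_shape *shape)
{
  if (shape->guile_shape != NULL)
    shape->guile_shape->c_shape = NULL;
  free (shape);
}

/* What the SMOB free function would do: unhook from the C shape if it
   still exists, then release the Scheme-specific structure.  */
static void
free_guile_shape (struct dia_guile_shape *gs)
{
  if (gs->c_shape != NULL)
    gs->c_shape->guile_shape = NULL;
  free (gs);
}

/* A primitive must check for a deleted shape before using it. */
static int
guile_shape_is_live (const struct dia_guile_shape *gs)
{
  return gs->c_shape != NULL;
}
```

Whichever side goes away first clears the other side’s pointer, so neither structure is ever left holding a dangling reference.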

For full documentation on defining and using SMOB types, see Defining New Types (Smobs).


5.7.1.4 Writing Guile Primitives for Dia

Once the details of object representation are decided, writing the primitive function code that you need is usually straightforward.

A primitive is simply a C function whose arguments and return value are all of type SCM, and whose body does whatever you want it to do. As an example, here is a possible implementation of the square? primitive:

static SCM square_p (SCM shape)
{
  struct dia_guile_shape * guile_shape;

  /* Check that arg is really a shape SMOB. */
  scm_assert_smob_type (shape_tag, shape);

  /* Access Scheme-specific shape structure. */
  guile_shape = (struct dia_guile_shape *) SCM_SMOB_DATA (shape);

  /* Find out if underlying shape exists and is a
     square; return answer as a Scheme boolean. */
  return scm_from_bool (guile_shape->c_shape &&
                        (guile_shape->c_shape->type == DIA_SQUARE));
}

Notice how easy it is to chain through from the SCM shape parameter that square_p receives — which is a SMOB — to the Scheme-specific structure inside the SMOB, and thence to the underlying C structure for the shape.

In this code, scm_assert_smob_type, SCM_SMOB_DATA, and scm_from_bool are from the standard Guile API. We assume that shape_tag was given to us when we made the shape SMOB type, using scm_make_smob_type. The call to scm_assert_smob_type ensures that shape is indeed a shape. This is needed to guard against Scheme code using the square? procedure incorrectly, as in (square? "hello"); Scheme’s latent typing means that usage errors like this must be caught at run time.

Having written the C code for your primitives, you need to make them available as Scheme procedures by calling the scm_c_define_gsubr function. scm_c_define_gsubr (see Primitive Procedures) takes arguments that specify the Scheme-level name for the primitive and how many required, optional and rest arguments it can accept. The square? primitive always requires exactly one argument, so the call to make it available in Scheme reads like this:

scm_c_define_gsubr ("square?", 1, 0, 0, square_p);

For where to put this call, see the subsection after next on the structure of Guile-enabled code (see Dia Structure).


5.7.1.5 Providing a Hook for the Evaluation of Scheme Code

To make the Guile integration useful, you have to design some kind of hook into your application that application users can use to cause their Scheme code to be evaluated.

Technically, this is straightforward; you just have to decide on a mechanism that is appropriate for your application. Think of Emacs, for example: when you type ESC :, you get a prompt where you can type in any Elisp code, which Emacs will then evaluate. Or, again like Emacs, you could provide a mechanism (such as an init file) to allow Scheme code to be associated with a particular key sequence, and evaluate the code when that key sequence is entered.

In either case, once you have the Scheme code that you want to evaluate, as a null terminated string, you can tell Guile to evaluate it by calling the scm_c_eval_string function.


5.7.1.6 Top-level Structure of Guile-enabled Dia

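Let’s assume that the pre-Guile Dia code is structured, in outline, as a main function that does a lot of initialization and setup and then enters the Gtk+ main loop.

When you add Guile to a program, one (rather technical) requirement is that Guile’s garbage collector needs to know where the bottom of the C stack is. The easiest way to ensure this is to use scm_boot_guile: main performs its initialization and then calls scm_boot_guile, passing it a new function, inner_main, which in turn enters the Gtk+ main loop.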

In other words, you move the guts of what was previously in your main function into a new function called inner_main, and then add a scm_boot_guile call, with inner_main as a parameter, to the end of main.

Assuming that you are using SMOBs and have written primitive code as described in the preceding subsections, you also need to insert calls to declare your new SMOBs and export the primitives to Scheme. These declarations must happen inside the dynamic scope of the scm_boot_guile call, but also before any code is run that could possibly use them — the beginning of inner_main is an ideal place for this.


5.7.1.7 Going Further with Dia and Guile

The steps described so far implement an initial Guile integration that already gives a lot of additional power to Dia application users. But there are further steps that you could take, and it’s interesting to consider a few of these.

In general, you could progressively move more of Dia’s source code from C into Scheme. This might make the code more maintainable and extensible, and it could open the door to new programming paradigms that are tricky to effect in C but straightforward in Scheme.

A specific example of this is that you could use the guile-gtk package, which provides Scheme-level procedures for most of the Gtk+ library, to move the code that lays out and displays Dia objects from C to Scheme.

As you follow this path, it naturally becomes less useful to maintain a distinction between Dia’s original non-Guile-related source code, and its later code implementing SMOBs and primitives for the Scheme world.

For example, suppose that the original source code had a dia_change_fill_pattern function:

void dia_change_fill_pattern (struct dia_shape * shape,
                              struct dia_pattern * pattern)
{
  /* real pattern change work */
}

During initial Guile integration, you add a change_fill_pattern primitive for Scheme purposes, which accesses the underlying structures from its SMOB values and uses dia_change_fill_pattern to do the real work:

SCM change_fill_pattern (SCM shape, SCM pattern)
{
  struct dia_shape * d_shape;
  struct dia_pattern * d_pattern;

  …

  dia_change_fill_pattern (d_shape, d_pattern);

  return SCM_UNSPECIFIED;
}

At this point, it makes sense to keep dia_change_fill_pattern and change_fill_pattern separate, because dia_change_fill_pattern can also be called without going through Scheme at all, say because the user clicks a button which causes a C-registered Gtk+ callback to be called.

But, if the code for creating buttons and registering their callbacks is moved into Scheme (using guile-gtk), it may become true that dia_change_fill_pattern can no longer be called other than through Scheme. In which case, it makes sense to abolish it and move its contents directly into change_fill_pattern, like this:

SCM change_fill_pattern (SCM shape, SCM pattern)
{
  struct dia_shape * d_shape;
  struct dia_pattern * d_pattern;

  …

  /* real pattern change work */

  return SCM_UNSPECIFIED;
}

So further Guile integration progressively reduces the amount of functional C code that you have to maintain over the long term.

A similar argument applies to data representation. In the discussion of SMOBs earlier, issues arose because of the different memory management and lifetime models that normally apply to data structures in C and in Scheme. However, with further Guile integration, you can resolve this issue in a more radical way by allowing all your data structures to be under the control of the garbage collector, and kept alive by references from the Scheme world. Instead of maintaining an array or linked list of shapes in C, you would instead maintain a list in Scheme.

Rather like the coalescing of dia_change_fill_pattern and change_fill_pattern, the practical upshot of such a change is that you would no longer have to keep the dia_shape and dia_guile_shape structures separate, and so wouldn’t need to worry about the pointers between them. Instead, you could change the SMOB definition to wrap the dia_shape structure directly, and send dia_guile_shape off to the scrap yard. Cut out the middle man!

Finally, we come to the holy grail of Guile’s free software / extension language approach. Once you have a Scheme representation for interesting Dia data types like shapes, and a handy bunch of primitives for manipulating them, it suddenly becomes clear that you have a bundle of functionality that could have far-ranging use beyond Dia itself. In other words, the data types and primitives could now become a library, and Dia becomes just one of the many possible applications using that library — albeit, at this early stage, a rather important one!

In this model, Guile becomes just the glue that binds everything together. Imagine an application that usefully combined functionality from Dia, Gnumeric and GnuCash — it’s tricky right now, because no such application yet exists; but it’ll happen some day …



5.7.2 Why Scheme is More Hackable Than C

Underlying Guile’s value proposition is the assumption that programming in a high level language, specifically Guile’s implementation of Scheme, is necessarily better in some way than programming in C. What do we mean by this claim, and how can we be so sure?

One class of advantages applies not only to Scheme, but more generally to any interpretable, high level, scripting language, such as Emacs Lisp, Python, Ruby, or TeX’s macro language. Common features of all such languages, when compared to C, are that:

In the case of Scheme, particular features that make programming easier — and more fun! — are its powerful mechanisms for abstracting parts of programs (closures — see About Closure) and for iteration (see while do).

The evidence in support of this argument is empirical: the huge amount of code that has been written in extension languages for applications that support this mechanism. Most notable are extensions written in Emacs Lisp for GNU Emacs, in TeX’s macro language for TeX, and in Script-Fu for the Gimp, but there is increasingly now a significant code eco-system for Guile-based applications as well, such as Lilypond and GnuCash. It is close to inconceivable that similar amounts of functionality could have been added to these applications just by writing new code in their base implementation languages.



5.7.3 Example: Using Guile for an Application Testbed

As an example of what this means in practice, imagine writing a testbed for an application that is tested by submitting various requests (via a C interface) and validating the output received. Suppose further that the application keeps track of its current state, and that the “correct” output for a given request may depend on the current application state. A complete “white box” [3] test plan for this application would aim to submit all possible requests in each distinguishable state, and validate the output for all request/state combinations.

To write all this test code in C would be very tedious. Suppose instead that the testbed code adds a single new C function, to submit an arbitrary request and return the response, and then uses Guile to export this function as a Scheme procedure. The rest of the testbed can then be written in Scheme, and so benefits from all the advantages of programming in Scheme that were described in the previous section.

(In this particular example, there is an additional benefit of writing most of the testbed in Scheme. A common problem for white box testing is that mistakes and mistaken assumptions in the application under test can easily be reproduced in the testbed code. It is more difficult to copy mistakes like this when the testbed is written in a different language from the application.)



5.7.4 A Choice of Programming Options

The preceding arguments and example point to a model of Guile programming that is applicable in many cases. According to this model, Guile programming involves a balance between C and Scheme programming, with the aim being to extract the greatest possible Scheme level benefit from the least amount of C level work.

The C level work required in this model usually consists of packaging and exporting functions and application objects such that they can be seen and manipulated on the Scheme level. To help with this, Guile’s C language interface includes utility features that aim to make this kind of integration very easy for the application developer. These features are documented later in this part of the manual: see REFFIXME.

This model, though, is really just one of a range of possible programming options. If all of the functionality that you need is available from Scheme, you could choose instead to write your whole application in Scheme (or one of the other high level languages that Guile supports through translation), and simply use Guile as an interpreter for Scheme. (In the future, we hope that Guile will also be able to compile Scheme code, so lessening the performance gap between C and Scheme code.) Or, at the other end of the C–Scheme scale, you could write the majority of your application in C, and only call out to Guile occasionally for specific actions such as reading a configuration file or executing a user-specified extension. The choices boil down to two basic questions:

These are of course design questions, and the right design for any given application will always depend upon the particular requirements that you are trying to meet. In the context of Guile, however, there are some generally applicable considerations that can help you when designing your answers.



5.7.4.1 What Functionality is Already Available?

Suppose, for the sake of argument, that you would prefer to write your whole application in Scheme. Then the API available to you consists of:

A module in the last category can either be a pure Scheme module — in other words a collection of utility procedures coded in Scheme — or a module that provides a Scheme interface to an extension library coded in C — in other words a nice package where someone else has done the work of wrapping up some useful C code for you. The set of available modules is growing quickly and already includes such useful examples as (gtk gtk), which makes Gtk+ drawing functions available in Scheme, and (database postgres), which provides SQL access to a Postgres database.

Given the growing collection of pre-existing modules, it is quite feasible that your application could be implemented by combining a selection of these modules together with new application code written in Scheme.

If this approach is not enough, because the functionality that your application needs is not already available in this form, and it is impossible to write the new functionality in Scheme, you will need to write some C code. If the required function is already available in C (e.g. in a library), all you need is a little glue to connect it to the world of Guile. If not, you need both to write the basic code and to plumb it into Guile.

In either case, two general considerations are important. Firstly, what is the interface by which the functionality is presented to the Scheme world? Does the interface consist only of function calls (for example, a simple drawing interface), or does it need to include objects of some kind that can be passed between C and Scheme and manipulated by both worlds? Secondly, how do the lifetime and memory management of objects in the C code relate to the garbage-collection-governed approach of Scheme objects? In the case where the basic C code is not already written, most of the difficulties of memory management can be avoided by using Guile’s C interface features from the start.

For the full documentation on writing C code for Guile and connecting existing C code to the Guile world, see REFFIXME.



5.7.4.2 Functional and Performance Constraints



5.7.4.3 Your Preferred Programming Style



5.7.4.4 What Controls Program Execution?



5.7.5 How About Application Users?

So far we have considered what Guile programming means for an application developer. But what if you are instead using an existing Guile-based application, and want to know what your options are for programming and extending this application?

The answer to this question varies from one application to another, because the options available depend inevitably on whether the application developer has provided any hooks for you to hang your own code on and, if there are such hooks, what they allow you to do. [4] For example…

In the last two cases, what you can do is, by definition, restricted by the application, and you should refer to the application’s own manual to find out your options.

The best-known example of the first case is Emacs, with its extension language Emacs Lisp: as well as being a text editor, Emacs supports the loading and execution of arbitrary Emacs Lisp code. The result of such openness has been dramatic: Emacs now benefits from user-contributed Emacs Lisp libraries that extend the basic editing function to do everything from reading news to psychoanalysis and playing adventure games. The only limitation is that extensions are restricted to the functionality provided by Emacs’s built-in set of primitive operations. For example, you can interact and display data by manipulating the contents of an Emacs buffer, but you can’t pop up and draw a window with a layout that is totally different from the Emacs standard.

The situation with a Guile application that supports the loading of arbitrary user code is similar, except perhaps even more so, because Guile also supports the loading of extension libraries written in C. This last point enables user code to add new primitive operations to Guile, and so to bypass the limitation present in Emacs Lisp.

At this point, the distinction between an application developer and an application user becomes rather blurred. Instead of seeing yourself as a user extending an application, you could equally well say that you are developing a new application of your own using some of the primitive functionality provided by the original application. As such, all the discussions of the preceding sections of this chapter are relevant to how you can proceed with developing your extension.



5.8 Autoconf Support

Autoconf, a part of the GNU build system, makes it easy for users to build your package. This section documents Guile’s Autoconf support.



5.8.1 Autoconf Background

As explained in the GNU Autoconf Manual, any package needs configuration at build-time (see Introduction in The GNU Autoconf Manual). If your package uses Guile (or uses a package that in turn uses Guile), you probably need to know what specific Guile features are available and details about them.

The way to do this is to write feature tests and arrange for their execution by the configure script, typically by adding the tests to configure.ac, and running autoconf to create configure. Users of your package then run configure in the normal way.

Macros are a way to make common feature tests easy to express. Autoconf provides a wide range of macros (see Existing Tests in The GNU Autoconf Manual), and Guile installation provides Guile-specific tests in the areas of: program detection, compilation flags reporting, and Scheme module checks.



5.8.2 Autoconf Macros

As mentioned earlier in this chapter, Guile supports parallel installation, and uses pkg-config to let the user choose which version of Guile they are interested in. pkg-config has its own set of Autoconf macros that are probably installed on almost every development system. The most useful of these macros is PKG_CHECK_MODULES.

PKG_CHECK_MODULES([GUILE], [guile-2.0])

This example looks for Guile and sets the GUILE_CFLAGS and GUILE_LIBS variables accordingly, or prints an error and exits if Guile was not found.

Guile comes with additional Autoconf macros providing more information, installed as prefix/share/aclocal/guile.m4. Their names all begin with GUILE_.

Autoconf Macro: GUILE_PKG [VERSIONS]

This macro runs the pkg-config tool to find development files for an available version of Guile.

By default, this macro will search for the latest stable version of Guile (e.g. 2.0), falling back to the previous stable version (e.g. 1.8) if it is available. If no guile-VERSION.pc file is found, an error is signalled. The found version is stored in GUILE_EFFECTIVE_VERSION.

If GUILE_PROGS was already invoked, this macro ensures that the development files have the same effective version as the Guile program.

GUILE_EFFECTIVE_VERSION is marked for substitution, as by AC_SUBST.

Autoconf Macro: GUILE_FLAGS

This macro runs the pkg-config tool to find out how to compile and link programs against Guile. It sets four variables: GUILE_CFLAGS, GUILE_LDFLAGS, GUILE_LIBS, and GUILE_LTLIBS.

GUILE_CFLAGS: flags to pass to a C or C++ compiler to build code that uses Guile header files. This is almost always just one or more -I flags.

GUILE_LDFLAGS: flags to pass to the compiler to link a program against Guile. This includes -lguile-VERSION for the Guile library itself, and may also include one or more -L flags to tell the compiler where to find the libraries. But it does not include flags that influence the program’s runtime search path for libraries, and will therefore lead to a program that fails to start, unless all necessary libraries are installed in a standard location such as /usr/lib.

GUILE_LIBS and GUILE_LTLIBS: flags to pass to the compiler or to libtool, respectively, to link a program against Guile. These include flags that augment the program’s runtime search path for libraries, so that shared libraries will be found at the location where they were during linking, even in non-standard locations. GUILE_LIBS is to be used when linking the program directly with the compiler, whereas GUILE_LTLIBS is to be used when the program is linked through libtool.

The variables are marked for substitution, as by AC_SUBST.

Autoconf Macro: GUILE_SITE_DIR

This looks for Guile’s "site" directory, usually something like PREFIX/share/guile/site, and sets var GUILE_SITE to the path. Note that the var name is different from the macro name.

The variable is marked for substitution, as by AC_SUBST.

Autoconf Macro: GUILE_PROGS [VERSION]

This macro looks for programs guile and guild, setting variables GUILE and GUILD to their paths, respectively. If guile is not found, signal an error.

By default, this macro will search for the latest stable version of Guile (e.g. 2.0). x.y or x.y.z versions can be specified. If an older version is found, the macro will signal an error.

The effective version of the found guile is set to GUILE_EFFECTIVE_VERSION. This macro ensures that the effective version is compatible with the result of a previous invocation of GUILE_FLAGS, if any.

As a legacy interface, it also looks for guile-config and guile-tools, setting GUILE_CONFIG and GUILE_TOOLS.

The variables are marked for substitution, as by AC_SUBST.

Autoconf Macro: GUILE_CHECK_RETVAL var check

var is a shell variable name to be set to the return value. check is a Guile Scheme expression, evaluated with "$GUILE -c", and returning either 0 or non-#f to indicate the check passed. A non-zero number or #f indicates failure. Avoid using the character "#" since that confuses autoconf.

Autoconf Macro: GUILE_MODULE_CHECK var module featuretest description

var is a shell variable name to be set to "yes" or "no". module is a list of symbols, like: (ice-9 common-list). featuretest is an expression acceptable to GUILE_CHECK, q.v. description is a present-tense verb phrase (passed to AC_MSG_CHECKING).

Autoconf Macro: GUILE_MODULE_AVAILABLE var module

var is a shell variable name to be set to "yes" or "no". module is a list of symbols, like: (ice-9 common-list).

Autoconf Macro: GUILE_MODULE_REQUIRED symlist

symlist is a list of symbols, WITHOUT surrounding parens, like: ice-9 common-list.

Autoconf Macro: GUILE_MODULE_EXPORTS var module modvar

var is a shell variable to be set to "yes" or "no". module is a list of symbols, like: (ice-9 common-list). modvar is the Guile Scheme variable to check.

Autoconf Macro: GUILE_MODULE_REQUIRED_EXPORT module modvar

module is a list of symbols, like: (ice-9 common-list). modvar is the Guile Scheme variable to check.



5.8.3 Using Autoconf Macros

Using the autoconf macros is straightforward: Add the macro "calls" (actually instantiations) to configure.ac, run aclocal, and finally, run autoconf. If your system doesn’t have guile.m4 installed, place the desired macro definitions (AC_DEFUN forms) in acinclude.m4, and aclocal will do the right thing.

Some of the macros can be used inside normal shell constructs: if foo ; then GUILE_BAZ ; fi, but this is not guaranteed. It’s probably a good idea to instantiate macros at top-level.

We now include two examples, one simple and one complicated.

The first example is for a package that uses libguile, and thus needs to know how to compile and link against it. So we use PKG_CHECK_MODULES to set the vars GUILE_CFLAGS and GUILE_LIBS, which are automatically substituted in the Makefile.

In configure.ac:

  PKG_CHECK_MODULES([GUILE], [guile-2.0])

In Makefile.in:

  GUILE_CFLAGS  = @GUILE_CFLAGS@
  GUILE_LIBS = @GUILE_LIBS@

  myprog.o: myprog.c
          $(CC) -c -o $@ $(GUILE_CFLAGS) $<
  myprog: myprog.o
          $(CC) -o $@ $< $(GUILE_LIBS)

The second example is for a package of Guile Scheme modules that uses an external program and other Guile Scheme modules (some might call this a "pure scheme" package). So we use the GUILE_SITE_DIR macro, a regular AC_PATH_PROG macro, and the GUILE_MODULE_AVAILABLE macro.

In configure.ac:

  GUILE_SITE_DIR

  probably_wont_work=""

  # pgtype pgtable
  GUILE_MODULE_AVAILABLE(have_guile_pg, (database postgres))
  test $have_guile_pg = no &&
      probably_wont_work="(my pgtype) (my pgtable) $probably_wont_work"

  # gpgutils
  AC_PATH_PROG(GNUPG,gpg)
  test x"$GNUPG" = x &&
      probably_wont_work="(my gpgutils) $probably_wont_work"

  if test ! "$probably_wont_work" = "" ; then
      p="         ***"
      echo
      echo "$p"
      echo "$p NOTE:"
      echo "$p The following modules probably won't work:"
      echo "$p   $probably_wont_work"
      echo "$p They can be installed anyway, and will work if their"
      echo "$p dependencies are installed later.  Please see README."
      echo "$p"
      echo
  fi

In Makefile.in:

  instdir = @GUILE_SITE@/my

  install:
        $(INSTALL) my/*.scm $(instdir)


6 API Reference

Guile provides an application programming interface (API) to developers in two core languages: Scheme and C. This part of the manual contains reference documentation for all of the functionality that is available through both Scheme and C interfaces.



6.1 Overview of the Guile API

Guile’s application programming interface (API) makes functionality available that an application developer can use in either C or Scheme programming. The interface consists of elements that may be macros, functions or variables in C, and procedures, variables, syntax or other types of object in Scheme.

Many elements are available to both Scheme and C, in a form that is appropriate. For example, the assq Scheme procedure is also available as scm_assq to C code. These elements are documented only once, addressing both the Scheme and C aspects of them.

The Scheme name of an element is related to its C name in a regular way. Also, a C function takes its parameters in a systematic way.

Normally, the name of a C function can be derived given its Scheme name, using some simple textual transformations:
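A minimal sketch of such a derivation in C, assuming the common libguile conventions: the prefix scm_ is added, a hyphen becomes an underscore, ? becomes _p, ! becomes _x, and -> becomes _to_. The function names scheme_name_to_c_name and append_piece are hypothetical, only this subset of rules is handled, and not every C name in libguile follows the pattern.

```c
#include <stddef.h>
#include <string.h>

/* Append PIECE to OUT (capacity SIZE, *USED bytes already used).
   Returns 0 on success, -1 if PIECE does not fit. */
static int
append_piece (char *out, size_t size, size_t *used, const char *piece)
{
  size_t len = strlen (piece);

  if (*used + len + 1 > size)
    return -1;
  memcpy (out + *used, piece, len + 1); /* copies the trailing NUL too */
  *used += len;
  return 0;
}

/* Derive the likely C name for the Scheme name SCHEME and store it in
   OUT.  Returns 0 on success, -1 if OUT is too small. */
int
scheme_name_to_c_name (const char *scheme, char *out, size_t size)
{
  size_t used = 0;
  const char *p = scheme;

  if (size == 0)
    return -1;
  out[0] = '\0';
  if (append_piece (out, size, &used, "scm_") != 0)
    return -1;

  while (*p != '\0')
    {
      char single[2] = { *p, '\0' };
      const char *piece = single; /* default: copy the character as is */
      size_t step = 1;

      if (p[0] == '-' && p[1] == '>')
        {
          piece = "_to_";
          step = 2;
        }
      else if (*p == '-')
        piece = "_";
      else if (*p == '?')
        piece = "_p";
      else if (*p == '!')
        piece = "_x";

      if (append_piece (out, size, &used, piece) != 0)
        return -1;
      p += step;
    }
  return 0;
}
```

Under these assumptions, assq maps to scm_assq and boolean? maps to scm_boolean_p, which matches the names used in this manual.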

A C function always takes a fixed number of arguments of type SCM, even when the corresponding Scheme function takes a variable number.

For some Scheme functions, some of the trailing arguments are optional; the corresponding C function must always be invoked with all optional arguments specified. To get the effect as if an argument has not been specified, pass SCM_UNDEFINED as its value. You cannot do this for an argument in the middle; when one argument is SCM_UNDEFINED, all the ones following it must be SCM_UNDEFINED as well.

Some Scheme functions take an arbitrary number of rest arguments; the corresponding C function must be invoked with a list of all these arguments. This list is always the last argument of the C function.

These two variants can also be combined.

The type of the return value of a C function that corresponds to a Scheme function is always SCM. In the descriptions below, types are therefore often omitted, both for the return value and for the arguments.



6.2 Deprecation

From time to time functions and other features of Guile become obsolete. Guile’s deprecation mechanism can help you cope with this.

When you use a feature that is deprecated, you will likely get a warning message at run-time. Also, if you have a new enough toolchain, using a deprecated function from libguile will cause a link-time warning.

The primary source for information about just what interfaces are deprecated in a given release is the file NEWS. That file also documents what you should use instead of the obsoleted things.

The file README contains instructions on how to control the inclusion or removal of the deprecated features from the public API of Guile, and how to control the deprecation warning messages.

The idea behind this mechanism is that normally all deprecated interfaces are available, but you get feedback when compiling and running code that uses them, so that you can migrate to the newer APIs at your leisure.



6.3 The SCM Type

Guile represents all Scheme values with the single C type SCM. For an introduction to this topic, see Dynamic Types.

C Type: SCM

SCM is the user level abstract C type that is used to represent all of Guile’s Scheme objects, no matter what the Scheme object type is. No C operation except assignment is guaranteed to work with variables of type SCM, so you should only use macros and functions to work with SCM values. Values are converted between C data types and the SCM type with utility functions and macros.

C Type: scm_t_bits

scm_t_bits is an unsigned integral data type that is guaranteed to be large enough to hold all information that is required to represent any Scheme object. While this data type is mostly used to implement Guile’s internals, the use of this type is also necessary to write certain kinds of extensions to Guile.

C Type: scm_t_signed_bits

This is a signed integral type of the same size as scm_t_bits.

C Macro: scm_t_bits SCM_UNPACK (SCM x)

Transforms the SCM value x into its representation as an integral type. Only after applying SCM_UNPACK is it possible to access the bits and contents of the SCM value.

C Macro: SCM SCM_PACK (scm_t_bits x)

Takes a valid integral representation of a Scheme object and transforms it into its representation as a SCM value.



6.4 Initializing Guile

Each thread that wants to use functions from the Guile API needs to put itself into guile mode with either scm_with_guile or scm_init_guile. The global state of Guile is initialized automatically when the first thread enters guile mode.

When a thread wants to block outside of a Guile API function, it should leave guile mode temporarily with scm_without_guile (see Blocking).

Threads that are created by call-with-new-thread or scm_spawn_thread start out in guile mode so you don’t need to initialize them.

C Function: void * scm_with_guile (void *(*func)(void *), void *data)

Call func, passing it data and return what func returns. While func is running, the current thread is in guile mode and can thus use the Guile API.

When scm_with_guile is called from guile mode, the thread remains in guile mode when scm_with_guile returns.

Otherwise, it puts the current thread into guile mode and, if needed, gives it a Scheme representation that is contained in the list returned by all-threads, for example. This Scheme representation is not removed when scm_with_guile returns so that a given thread is always represented by the same Scheme value during its lifetime, if at all.

When this is the first thread that enters guile mode, the global state of Guile is initialized before calling func.

The function func is called via scm_with_continuation_barrier; thus, scm_with_guile returns exactly once.

When scm_with_guile returns, the thread is no longer in guile mode (except when scm_with_guile was called from guile mode, see above). Thus, only func can store SCM variables on the stack and be sure that they are protected from the garbage collector. See scm_init_guile for another approach at initializing Guile that does not have this restriction.

It is OK to call scm_with_guile while a thread has temporarily left guile mode via scm_without_guile. It will then simply temporarily enter guile mode again.

C Function: void scm_init_guile ()

Arrange things so that all of the code in the current thread executes as if from within a call to scm_with_guile. That is, all functions called by the current thread can assume that SCM values on their stack frames are protected from the garbage collector (except when the thread has explicitly left guile mode, of course).

When scm_init_guile is called from a thread that already has been in guile mode once, nothing happens. This behavior matters when you call scm_init_guile while the thread has only temporarily left guile mode: in that case the thread will not be in guile mode after scm_init_guile returns. Thus, you should not use scm_init_guile in such a scenario.

When an uncaught throw happens in a thread that has been put into guile mode via scm_init_guile, a short message is printed to the current error port and the thread is exited via scm_pthread_exit (NULL). No restrictions are placed on continuations.

The function scm_init_guile might not be available on all platforms since it requires some stack-bounds-finding magic that might not have been ported to all platforms that Guile runs on. Thus, if you can, it is better to use scm_with_guile or its variation scm_boot_guile instead of this function.

C Function: void scm_boot_guile (int argc, char **argv, void (*main_func) (void *data, int argc, char **argv), void *data)

Enter guile mode as with scm_with_guile and call main_func, passing it data, argc, and argv as indicated. When main_func returns, scm_boot_guile calls exit (0); scm_boot_guile never returns. If you want some other exit value, have main_func call exit itself. If you don’t want to exit at all, use scm_with_guile instead of scm_boot_guile.

The function scm_boot_guile arranges for the Scheme command-line function to return the strings given by argc and argv. If main_func modifies argc or argv, it should call scm_set_program_arguments with the final list, so Scheme code will know which arguments have been processed (see Runtime Environment).

C Function: void scm_shell (int argc, char **argv)

Process command-line arguments in the manner of the guile executable. This includes loading the normal Guile initialization files, interacting with the user or running any scripts or expressions specified by -s or -e options, and then exiting. See Invoking Guile, for more details.

Since this function does not return, you must do all application-specific initialization before calling this function.



6.5 Snarfing Macros

The following macros do two different things: when compiled normally, they expand in one way; when processed during snarfing, they cause the guile-snarf program to pick up some initialization code (see Function Snarfing).

The descriptions below use the term ‘normally’ to refer to the case when the code is compiled normally, and ‘while snarfing’ when the code is processed by guile-snarf.

C Macro: SCM_SNARF_INIT (code)

Normally, SCM_SNARF_INIT expands to nothing; while snarfing, it causes code to be included in the initialization action file, followed by a semicolon.

This is the fundamental macro for snarfing initialization actions. The more specialized macros below use it internally.

C Macro: SCM_DEFINE (c_name, scheme_name, req, opt, var, arglist, docstring)

Normally, this macro expands into

static const char s_c_name[] = scheme_name;
SCM
c_name arglist

While snarfing, it causes

scm_c_define_gsubr (s_c_name, req, opt, var,
                    c_name);

to be added to the initialization actions. Thus, you can use it to declare a C function named c_name that will be made available to Scheme with the name scheme_name.

Note that the arglist argument must have parentheses around it.

C Macro: SCM_SYMBOL (c_name, scheme_name)
C Macro: SCM_GLOBAL_SYMBOL (c_name, scheme_name)

Normally, these macros expand into

static SCM c_name

or

SCM c_name

respectively. While snarfing, they both expand into the initialization code

c_name = scm_permanent_object (scm_from_locale_symbol (scheme_name));

Thus, you can use them to declare a static or global variable of type SCM that will be initialized to the symbol named scheme_name.

C Macro: SCM_KEYWORD (c_name, scheme_name)
C Macro: SCM_GLOBAL_KEYWORD (c_name, scheme_name)

Normally, these macros expand into

static SCM c_name

or

SCM c_name

respectively. While snarfing, they both expand into the initialization code

c_name = scm_permanent_object (scm_c_make_keyword (scheme_name));

Thus, you can use them to declare a static or global variable of type SCM that will be initialized to the keyword named scheme_name.

C Macro: SCM_VARIABLE (c_name, scheme_name)
C Macro: SCM_GLOBAL_VARIABLE (c_name, scheme_name)

These macros are equivalent to SCM_VARIABLE_INIT and SCM_GLOBAL_VARIABLE_INIT, respectively, with a value of SCM_BOOL_F.

C Macro: SCM_VARIABLE_INIT (c_name, scheme_name, value)
C Macro: SCM_GLOBAL_VARIABLE_INIT (c_name, scheme_name, value)

Normally, these macros expand into

static SCM c_name

or

SCM c_name

respectively. While snarfing, they both expand into the initialization code

c_name = scm_permanent_object (scm_c_define (scheme_name, value));

Thus, you can use them to declare a static or global C variable of type SCM that will be initialized to the object representing the Scheme variable named scheme_name in the current module. The variable will be defined when it doesn’t already exist. It is always set to value.



6.6 Simple Generic Data Types

This chapter describes those of Guile’s simple data types which are primarily used for their role as items of generic data. By simple we mean data types that are not primarily used as containers to hold other data — i.e. pairs, lists, vectors and so on. For the documentation of such compound data types, see Compound Data Types.



6.6.1 Booleans

The two boolean values are #t for true and #f for false. They can also be written as #true and #false, as per R7RS.

Boolean values are returned by predicate procedures, such as the general equality predicates eq?, eqv? and equal? (see Equality) and numerical and string comparison operators like string=? (see String Comparison) and <= (see Comparison).

(<= 3 8)
⇒ #t

(<= 3 -3)
⇒ #f

(equal? "house" "houses")
⇒ #f

(eq? #f #f)
⇒ #t

In test condition contexts like if and cond (see Conditionals), where a group of subexpressions will be evaluated only if a condition expression evaluates to “true”, “true” means any value at all except #f.

(if #t "yes" "no")
⇒ "yes"

(if 0 "yes" "no")
⇒ "yes"

(if #f "yes" "no")
⇒ "no"

A result of this asymmetry is that typical Scheme source code more often uses #f explicitly than #t: #f is necessary to represent an if or cond false value, whereas #t is not necessary to represent an if or cond true value.

It is important to note that #f is not equivalent to any other Scheme value. In particular, #f is not the same as the number 0 (like in C and C++), and not the same as the “empty list” (like in some Lisp dialects).
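The following is a minimal sketch of that distinction. The helper truthy? is a hypothetical name introduced only for this illustration; it is not part of Guile.

```scheme
;; `truthy?' is a hypothetical helper, not a Guile procedure.
;; It reports how `if' treats a value: only #f selects the false branch.
(define (truthy? x)
  (if x #t #f))

(truthy? 0)     ; ⇒ #t, unlike C, where 0 is false
(truthy? '())   ; ⇒ #t, unlike Lisp dialects where the empty list is false
(truthy? "")    ; ⇒ #t
(truthy? #f)    ; ⇒ #f

(eq? #f 0)      ; ⇒ #f
(eq? #f '())    ; ⇒ #f
```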

In C, the two Scheme boolean values are available as the two constants SCM_BOOL_T for #t and SCM_BOOL_F for #f. Care must be taken with the false value SCM_BOOL_F: it is not false when used in C conditionals. In order to test for it, use scm_is_false or scm_is_true.

Scheme Procedure: not x
C Function: scm_not (x)

Return #t if x is #f, else return #f.

Scheme Procedure: boolean? obj
C Function: scm_boolean_p (obj)

Return #t if obj is either #t or #f, else return #f.
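As a brief illustration of not and boolean?, assuming nothing beyond the two procedures described above (the name ->boolean is hypothetical and introduced only for this sketch):

```scheme
(not #f)         ; ⇒ #t
(not #t)         ; ⇒ #f
(not 0)          ; ⇒ #f, because 0 is a true value
(not '())        ; ⇒ #f, because the empty list is a true value

(boolean? #f)    ; ⇒ #t
(boolean? 0)     ; ⇒ #f
(boolean? '())   ; ⇒ #f

;; `->boolean' is a hypothetical helper: applying `not' twice maps any
;; value to #t or #f according to its truth value.
(define (->boolean x)
  (not (not x)))

(->boolean "text")   ; ⇒ #t
(->boolean #f)       ; ⇒ #f
```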

C Macro: SCM SCM_BOOL_T

The SCM representation of the Scheme object #t.

C Macro: SCM SCM_BOOL_F

The SCM representation of the Scheme object #f.

C Function: int scm_is_true (SCM obj)

Return 0 if obj is #f, else return 1.

C Function: int scm_is_false (SCM obj)

Return 1 if obj is #f, else return 0.

C Function: int scm_is_bool (SCM obj)

Return 1 if obj is either #t or #f, else return 0.

C Function: SCM scm_from_bool (int val)

Return #f if val is 0, else return #t.

C Function: int scm_to_bool (SCM val)

Return 1 if val is SCM_BOOL_T, return 0 when val is SCM_BOOL_F, else signal a ‘wrong type’ error.

You should probably use scm_is_true instead of this function when you just want to test a SCM value for trueness.


6.6.2 Numerical data types

Guile supports a rich “tower” of numerical types — integer, rational, real and complex — and provides an extensive set of mathematical and scientific functions for operating on numerical data. This section of the manual documents those types and functions.

You may also find it illuminating to read R5RS’s presentation of numbers in Scheme, which is particularly clear and accessible: see Numbers in R5RS.


6.6.2.1 Scheme’s Numerical “Tower”

Scheme’s numerical “tower” consists of the following categories of numbers:

integers

Whole numbers, positive or negative; e.g. -5, 0, 18.

rationals

The set of numbers that can be expressed as p/q where p and q are integers; e.g. 9/16 works, but pi (an irrational number) doesn’t. These include integers (n/1).

real numbers

The set of numbers that describes all possible positions along a one-dimensional line. This includes rationals as well as irrational numbers.

complex numbers

The set of numbers that describes all possible positions in a two dimensional space. This includes real as well as imaginary numbers (a+bi, where a is the real part, b is the imaginary part, and i is the square root of -1.)

It is called a tower because each category “sits on” the one that follows it, in the sense that every integer is also a rational, every rational is also real, and every real number is also a complex number (but with zero imaginary part).
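The containment can be observed with the type predicates documented in the following subsections. This is only a sketch; tower-levels is a hypothetical helper name, not a Guile procedure.

```scheme
;; `tower-levels' is a hypothetical helper.  It lists, from the most
;; specific category to the most general, which predicates accept X.
(define (tower-levels x)
  (list (integer? x) (rational? x) (real? x) (complex? x)))

(tower-levels 18)     ; ⇒ (#t #t #t #t)
(tower-levels 9/16)   ; ⇒ (#f #t #t #t)
(tower-levels 1.5)    ; ⇒ (#f #t #t #t)
(tower-levels 3+4i)   ; ⇒ (#f #f #f #t)
```

Once a predicate in the list answers #t, every more general predicate after it answers #t as well.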

In addition to the classification into integers, rationals, reals and complex numbers, Scheme also distinguishes between whether a number is represented exactly or not. For example, the result of 2*sin(pi/4) is exactly 2^(1/2), but Guile can represent neither pi/4 nor 2^(1/2) exactly. Instead, it stores an inexact approximation, using the C type double.

Guile can represent exact rationals of any magnitude, inexact rationals that fit into a C double, and inexact complex numbers with double real and imaginary parts.

The number? predicate may be applied to any Scheme value to discover whether the value is any of the supported numerical types.

Scheme Procedure: number? obj
C Function: scm_number_p (obj)

Return #t if obj is any kind of number, else #f.

For example:

(number? 3)
⇒ #t

(number? "hello there!")
⇒ #f

(define pi 3.141592654)
(number? pi)
⇒ #t

C Function: int scm_is_number (SCM obj)

This is equivalent to scm_is_true (scm_number_p (obj)).

The next few subsections document each of Guile’s numerical data types in detail.


6.6.2.2 Integers

Integers are whole numbers, that is numbers with no fractional part, such as 2, 83, and -3789.

Integers in Guile can be arbitrarily big, as shown by the following example.

(define (factorial n)
  (let loop ((n n) (product 1))
    (if (= n 0)
        product
        (loop (- n 1) (* product n)))))

(factorial 3)
⇒ 6

(factorial 20)
⇒ 2432902008176640000

(- (factorial 45))
⇒ -119622220865480194561963161495657715064383733760000000000

Readers whose background is in programming languages where integers are limited by the need to fit into just 4 or 8 bytes of memory may find this surprising, or suspect that Guile’s representation of integers is inefficient. In fact, Guile achieves a near optimal balance of convenience and efficiency by using the host computer’s native representation of integers where possible, and a more general representation where the required number does not fit in the native form. Conversion between these two representations is automatic and completely invisible to the Scheme level programmer.

C has a host of different integer types, and Guile offers a host of functions to convert between them and the SCM representation. For example, a C int can be handled with scm_to_int and scm_from_int. Guile also defines a few C integer types of its own, to help with differences between systems.

C integer types that are not covered can be handled with the generic scm_to_signed_integer and scm_from_signed_integer for signed types, or with scm_to_unsigned_integer and scm_from_unsigned_integer for unsigned types.

Scheme integers can be exact or inexact. For example, a number written as 3.0 with an explicit decimal point is inexact, but it is also an integer. The functions integer? and scm_is_integer report true for such a number, but the functions exact-integer?, scm_is_exact_integer, scm_is_signed_integer, and scm_is_unsigned_integer only allow exact integers and thus report false. Likewise, the conversion functions like scm_to_signed_integer only accept exact integers.

The motivation for this behavior is that the inexactness of a number should not be lost silently. If you want to allow inexact integers, you can explicitly insert a call to inexact->exact or to its C equivalent scm_inexact_to_exact. (Only inexact integers will be converted by this call into exact integers; inexact non-integers will become exact fractions.)
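A short sketch of the explicit conversion at the Scheme level, using only procedures described in this section:

```scheme
(integer? 3.0)                          ; ⇒ #t
(exact-integer? 3.0)                    ; ⇒ #f

;; An inexact integer becomes an exact integer ...
(inexact->exact 3.0)                    ; ⇒ 3
(exact-integer? (inexact->exact 3.0))   ; ⇒ #t

;; ... while an inexact non-integer becomes an exact fraction.
(inexact->exact 2.5)                    ; ⇒ 5/2
```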

Scheme Procedure: integer? x
C Function: scm_integer_p (x)

Return #t if x is an exact or inexact integer number, else return #f.

(integer? 487)
⇒ #t

(integer? 3.0)
⇒ #t

(integer? -3.4)
⇒ #f

(integer? +inf.0)
⇒ #f

C Function: int scm_is_integer (SCM x)

This is equivalent to scm_is_true (scm_integer_p (x)).

Scheme Procedure: exact-integer? x
C Function: scm_exact_integer_p (x)

Return #t if x is an exact integer number, else return #f.

(exact-integer? 37)
⇒ #t

(exact-integer? 3.0)
⇒ #f

C Function: int scm_is_exact_integer (SCM x)

This is equivalent to scm_is_true (scm_exact_integer_p (x)).

C Type: scm_t_int8
C Type: scm_t_uint8
C Type: scm_t_int16
C Type: scm_t_uint16
C Type: scm_t_int32
C Type: scm_t_uint32
C Type: scm_t_int64
C Type: scm_t_uint64
C Type: scm_t_intmax
C Type: scm_t_uintmax

The C types are equivalent to the corresponding ISO C types but are defined on all platforms, with the exception of scm_t_int64 and scm_t_uint64, which are only defined when a 64-bit type is available. For example, scm_t_int8 is equivalent to int8_t.

You can regard these definitions as a stop-gap measure until all platforms provide these types. If you know that all the platforms that you are interested in already provide these types, it is better to use them directly instead of the types provided by Guile.

C Function: int scm_is_signed_integer (SCM x, scm_t_intmax min, scm_t_intmax max)
C Function: int scm_is_unsigned_integer (SCM x, scm_t_uintmax min, scm_t_uintmax max)

Return 1 when x represents an exact integer that is between min and max, inclusive.

These functions can be used to check whether a SCM value will fit into a given range, such as the range of a given C integer type. If you just want to convert a SCM value to a given C integer type, use one of the conversion functions directly.

C Function: scm_t_intmax scm_to_signed_integer (SCM x, scm_t_intmax min, scm_t_intmax max)
C Function: scm_t_uintmax scm_to_unsigned_integer (SCM x, scm_t_uintmax min, scm_t_uintmax max)

When x represents an exact integer that is between min and max inclusive, return that integer. Else signal an error, either a ‘wrong-type’ error when x is not an exact integer, or an ‘out-of-range’ error when it doesn’t fit the given range.

C Function: SCM scm_from_signed_integer (scm_t_intmax x)
C Function: SCM scm_from_unsigned_integer (scm_t_uintmax x)

Return the SCM value that represents the integer x. This function will always succeed and will always return an exact number.

C Function: char scm_to_char (SCM x)
C Function: signed char scm_to_schar (SCM x)
C Function: unsigned char scm_to_uchar (SCM x)
C Function: short scm_to_short (SCM x)
C Function: unsigned short scm_to_ushort (SCM x)
C Function: int scm_to_int (SCM x)
C Function: unsigned int scm_to_uint (SCM x)
C Function: long scm_to_long (SCM x)
C Function: unsigned long scm_to_ulong (SCM x)
C Function: long long scm_to_long_long (SCM x)
C Function: unsigned long long scm_to_ulong_long (SCM x)
C Function: size_t scm_to_size_t (SCM x)
C Function: ssize_t scm_to_ssize_t (SCM x)
C Function: scm_t_ptrdiff scm_to_ptrdiff_t (SCM x)
C Function: scm_t_int8 scm_to_int8 (SCM x)
C Function: scm_t_uint8 scm_to_uint8 (SCM x)
C Function: scm_t_int16 scm_to_int16 (SCM x)
C Function: scm_t_uint16 scm_to_uint16 (SCM x)
C Function: scm_t_int32 scm_to_int32 (SCM x)
C Function: scm_t_uint32 scm_to_uint32 (SCM x)
C Function: scm_t_int64 scm_to_int64 (SCM x)
C Function: scm_t_uint64 scm_to_uint64 (SCM x)
C Function: scm_t_intmax scm_to_intmax (SCM x)
C Function: scm_t_uintmax scm_to_uintmax (SCM x)

When x represents an exact integer that fits into the indicated C type, return that integer. Else signal an error, either a ‘wrong-type’ error when x is not an exact integer, or an ‘out-of-range’ error when it doesn’t fit the given range.

The functions scm_to_long_long, scm_to_ulong_long, scm_to_int64, and scm_to_uint64 are only available when the corresponding types are.

C Function: SCM scm_from_char (char x)
C Function: SCM scm_from_schar (signed char x)
C Function: SCM scm_from_uchar (unsigned char x)
C Function: SCM scm_from_short (short x)
C Function: SCM scm_from_ushort (unsigned short x)
C Function: SCM scm_from_int (int x)
C Function: SCM scm_from_uint (unsigned int x)
C Function: SCM scm_from_long (long x)
C Function: SCM scm_from_ulong (unsigned long x)
C Function: SCM scm_from_long_long (long long x)
C Function: SCM scm_from_ulong_long (unsigned long long x)
C Function: SCM scm_from_size_t (size_t x)
C Function: SCM scm_from_ssize_t (ssize_t x)
C Function: SCM scm_from_ptrdiff_t (scm_t_ptrdiff x)
C Function: SCM scm_from_int8 (scm_t_int8 x)
C Function: SCM scm_from_uint8 (scm_t_uint8 x)
C Function: SCM scm_from_int16 (scm_t_int16 x)
C Function: SCM scm_from_uint16 (scm_t_uint16 x)
C Function: SCM scm_from_int32 (scm_t_int32 x)
C Function: SCM scm_from_uint32 (scm_t_uint32 x)
C Function: SCM scm_from_int64 (scm_t_int64 x)
C Function: SCM scm_from_uint64 (scm_t_uint64 x)
C Function: SCM scm_from_intmax (scm_t_intmax x)
C Function: SCM scm_from_uintmax (scm_t_uintmax x)

Return the SCM value that represents the integer x. These functions will always succeed and will always return an exact number.

C Function: void scm_to_mpz (SCM val, mpz_t rop)

Assign val to the multiple precision integer rop. val must be an exact integer, otherwise an error will be signalled. rop must have been initialized with mpz_init before this function is called. When rop is no longer needed the occupied space must be freed with mpz_clear. See Initializing Integers in GNU MP Manual, for details.

C Function: SCM scm_from_mpz (mpz_t val)

Return the SCM value that represents val.


6.6.2.3 Real and Rational Numbers

Mathematically, the real numbers are the set of numbers that describe all possible points along a continuous, infinite, one-dimensional line. The rational numbers are the set of all numbers that can be written as fractions p/q, where p and q are integers. All rational numbers are also real, but there are real numbers that are not rational, for example the square root of 2, and pi.

Guile can represent both exact and inexact rational numbers, but it cannot represent precise finite irrational numbers. Exact rationals are represented by storing the numerator and denominator as two exact integers. Inexact rationals are stored as floating point numbers using the C type double.

Exact rationals are written as a fraction of integers. There must be no whitespace around the slash:

1/2
-22/7

Even though the actual encoding of inexact rationals is in binary, it may be helpful to think of it as a decimal number with a limited number of significant figures and a decimal point somewhere, since this corresponds to the standard notation for non-whole numbers. For example:

0.34
-0.00000142857931198
-5648394822220000000000.0
4.0

The limited precision of Guile’s encoding means that any finite “real” number in Guile can be written in a rational form, by multiplying and then dividing by sufficient powers of 10 (or in fact, 2). For example, ‘-0.00000142857931198’ is the same as -142857931198 divided by 100000000000000000. In Guile’s current incarnation, therefore, the rational? and real? predicates are equivalent for finite numbers.
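For instance, the following sketch shows a finite inexact number satisfying both predicates, and an infinity satisfying only real?:

```scheme
(rational? 0.34)        ; ⇒ #t
(real? 0.34)            ; ⇒ #t

;; 0.25 is exactly representable in binary, so the conversion is tidy.
(inexact->exact 0.25)   ; ⇒ 1/4

;; The equivalence holds only for finite numbers.
(rational? +inf.0)      ; ⇒ #f
(real? +inf.0)          ; ⇒ #t
```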

Dividing by an exact zero leads to an error message, as one might expect. However, dividing by an inexact zero does not produce an error. Instead, the result of the division is either plus or minus infinity, depending on the sign of the divided number and the sign of the zero divisor (some platforms support signed zeroes ‘-0.0’ and ‘+0.0’; ‘0.0’ is the same as ‘+0.0’).

Dividing zero by an inexact zero yields a NaN (‘not a number’) value, although NaN values are actually considered numbers by Scheme. An attempt to compare a NaN value with any number (including itself) using =, <, >, <= or >= always returns #f. Although a NaN value is not = to itself, it is both eqv? and equal? to itself and other NaN values. However, the preferred way to test for them is by using nan?.

The real NaN values and infinities are written ‘+nan.0’, ‘+inf.0’ and ‘-inf.0’. This syntax is also recognized by read as an extension to the usual Scheme syntax. These special values are considered by Scheme to be inexact real numbers but not rational. Note that non-real complex numbers may also contain infinities or NaN values in their real or imaginary parts. To test a real number to see if it is infinite, a NaN value, or neither, use inf?, nan?, or finite?, respectively. Every real number in Scheme belongs to precisely one of those three classes.
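The behavior described above can be sketched as follows. The variable names pos-inf, neg-inf and not-num are chosen for this illustration only, and the results assume a platform with IEEE 754 arithmetic.

```scheme
;; Illustrative names only; assumes IEEE 754 floating point.
(define pos-inf (/ 1 0.0))     ; +inf.0
(define neg-inf (/ -1 0.0))    ; -inf.0
(define not-num (/ 0.0 0.0))   ; +nan.0

(inf? pos-inf)           ; ⇒ #t
(inf? neg-inf)           ; ⇒ #t
(nan? not-num)           ; ⇒ #t
(finite? 1.5)            ; ⇒ #t
(finite? pos-inf)        ; ⇒ #f

;; A NaN is a number, is never = to anything, but is eqv? to itself.
(number? not-num)        ; ⇒ #t
(= not-num not-num)      ; ⇒ #f
(eqv? not-num not-num)   ; ⇒ #t
```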

On platforms that follow IEEE 754 for their floating point arithmetic, the ‘+inf.0’, ‘-inf.0’, and ‘+nan.0’ values are implemented using the corresponding IEEE 754 values. They behave in arithmetic operations as IEEE 754 describes, e.g., (= +nan.0 +nan.0) ⇒ #f.

Scheme Procedure: real? obj
C Function: scm_real_p (obj)

Return #t if obj is a real number, else #f. Note that the sets of integer and rational values form subsets of the set of real numbers, so the predicate will also be fulfilled if obj is an integer number or a rational number.

Scheme Procedure: rational? x
C Function: scm_rational_p (x)

Return #t if x is a rational number, #f otherwise. Note that the set of integer values forms a subset of the set of rational numbers, i.e. the predicate will also be fulfilled if x is an integer number.

Scheme Procedure: rationalize x eps
C Function: scm_rationalize (x, eps)

Returns the simplest rational number differing from x by no more than eps.

As required by R5RS, rationalize only returns an exact result when both its arguments are exact. Thus, you might need to use inexact->exact on the arguments.

(rationalize (inexact->exact 1.2) 1/100)
⇒ 6/5

Scheme Procedure: inf? x
C Function: scm_inf_p (x)

Return #t if the real number x is ‘+inf.0’ or ‘-inf.0’. Otherwise return #f.

Scheme Procedure: nan? x
C Function: scm_nan_p (x)

Return #t if the real number x is ‘+nan.0’, or #f otherwise.

Scheme Procedure: finite? x
C Function: scm_finite_p (x)

Return #t if the real number x is neither infinite nor a NaN, #f otherwise.

Scheme Procedure: nan
C Function: scm_nan ()

Return ‘+nan.0’, a NaN value.

Scheme Procedure: inf
C Function: scm_inf ()

Return ‘+inf.0’, positive infinity.

Scheme Procedure: numerator x
C Function: scm_numerator (x)

Return the numerator of the rational number x.

Scheme Procedure: denominator x
C Function: scm_denominator (x)

Return the denominator of the rational number x.
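A few illustrative calls; exact rationals are kept in lowest terms, so the results refer to the reduced fraction:

```scheme
(numerator 6/4)      ; ⇒ 3   (6/4 is stored as 3/2)
(denominator 6/4)    ; ⇒ 2

;; An integer n is the rational n/1.
(numerator 5)        ; ⇒ 5
(denominator 5)      ; ⇒ 1

;; For an inexact argument the result is inexact as well.
(denominator 0.5)    ; ⇒ 2.0
```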

C Function: int scm_is_real (SCM val)
C Function: int scm_is_rational (SCM val)

Equivalent to scm_is_true (scm_real_p (val)) and scm_is_true (scm_rational_p (val)), respectively.

C Function: double scm_to_double (SCM val)

Returns the number closest to val that is representable as a double. Returns infinity for a val that is too large in magnitude. The argument val must be a real number.

C Function: SCM scm_from_double (double val)

Return the SCM value that represents val. The returned value is inexact according to the predicate inexact?, but it will be exactly equal to val.


6.6.2.4 Complex Numbers

Complex numbers are the set of numbers that describe all possible points in a two-dimensional space. The two coordinates of a particular point in this space are known as the real and imaginary parts of the complex number that describes that point.

In Guile, complex numbers are written in rectangular form as the sum of their real and imaginary parts, using the symbol i to indicate the imaginary part.

3+4i
⇒ 3.0+4.0i

(* 3-8i 2.3+0.3i)
⇒ 9.3-17.5i

Polar form can also be used, with an ‘@’ between magnitude and angle,

1@3.141592 ⇒ -1.0      (approx)
-1@1.57079 ⇒ 0.0-1.0i  (approx)

Guile represents a complex number as a pair of inexact reals, so the real and imaginary parts of a complex number have the same properties of inexactness and limited precision as single inexact real numbers.

Note that each part of a complex number may contain any inexact real value, including the special values ‘+nan.0’, ‘+inf.0’ and ‘-inf.0’, as well as either of the signed zeroes ‘0.0’ or ‘-0.0’.

Scheme Procedure: complex? z
C Function: scm_complex_p (z)

Return #t if z is a complex number, #f otherwise. Note that the sets of real, rational and integer values form subsets of the set of complex numbers, i.e. the predicate will also be fulfilled if z is a real, rational or integer number.

C Function: int scm_is_complex (SCM val)

Equivalent to scm_is_true (scm_complex_p (val)).


6.6.2.5 Exact and Inexact Numbers

R5RS requires that, with few exceptions, a calculation involving inexact numbers always produces an inexact result. To meet this requirement, Guile distinguishes between an exact integer value such as ‘5’ and the corresponding inexact integer value which, to the limited precision available, has no fractional part, and is printed as ‘5.0’. Guile will only convert the latter value to the former when forced to do so by an invocation of the inexact->exact procedure.

The only exception to the above requirement is when the values of the inexact numbers do not affect the result. For example (expt n 0) is ‘1’ for any value of n, therefore (expt 5.0 0) is permitted to return an exact ‘1’.
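A small sketch of how inexactness propagates through a calculation:

```scheme
;; Exact operands give an exact result.
(exact? (+ 1 2))             ; ⇒ #t
(exact? (* 1/2 4))           ; ⇒ #t

;; A single inexact operand makes the result inexact.
(+ 1 2.0)                    ; ⇒ 3.0
(exact? (+ 1 2.0))           ; ⇒ #f

;; Only an explicit conversion brings the value back to exact.
(inexact->exact (+ 1 2.0))   ; ⇒ 3
```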

Scheme Procedure: exact? z
C Function: scm_exact_p (z)

Return #t if the number z is exact, #f otherwise.

(exact? 2)
⇒ #t

(exact? 0.5)
⇒ #f

(exact? (/ 2))
⇒ #t

C Function: int scm_is_exact (SCM z)

Return a 1 if the number z is exact, and 0 otherwise. This is equivalent to scm_is_true (scm_exact_p (z)).

An alternate approach to testing the exactness of a number is to use scm_is_signed_integer or scm_is_unsigned_integer.

Scheme Procedure: inexact? z
C Function: scm_inexact_p (z)

Return #t if the number z is inexact, #f otherwise.

C Function: int scm_is_inexact (SCM z)

Return a 1 if the number z is inexact, and 0 otherwise. This is equivalent to scm_is_true (scm_inexact_p (z)).

Scheme Procedure: inexact->exact z
C Function: scm_inexact_to_exact (z)

Return an exact number that is numerically closest to z, when there is one. For inexact rationals, Guile returns the exact rational that is numerically equal to the inexact rational. Inexact complex numbers with a non-zero imaginary part can not be made exact.

(inexact->exact 0.5)
⇒ 1/2

The following happens because 12/10 is not exactly representable as a double (on most platforms). However, when reading a decimal number that has been marked exact with the “#e” prefix, Guile is able to represent it correctly.

(inexact->exact 1.2)
⇒ 5404319552844595/4503599627370496

#e1.2
⇒ 6/5

Scheme Procedure: exact->inexact z
C Function: scm_exact_to_inexact (z)

Convert the number z to its inexact representation.


6.6.2.6 Read Syntax for Numerical Data

The read syntax for integers is a string of digits, optionally preceded by a minus or plus character, a code indicating the base in which the integer is encoded, and a code indicating whether the number is exact or inexact. The supported base codes are:

#b
#B

the integer is written in binary (base 2)

#o
#O

the integer is written in octal (base 8)

#d
#D

the integer is written in decimal (base 10)

#x
#X

the integer is written in hexadecimal (base 16)

If the base code is omitted, the integer is assumed to be decimal. The following examples show how these base codes are used.

-13
⇒ -13

#d-13
⇒ -13

#x-13
⇒ -19

#b+1101
⇒ 13

#o377
⇒ 255

The codes for indicating exactness (which can, incidentally, be applied to all numerical values) are:

#e
#E

the number is exact

#i
#I

the number is inexact.

If the exactness indicator is omitted, the number is exact unless it contains a radix point. Since Guile can not represent exact complex numbers, an error is signalled when asking for them.

(exact? 1.2)
⇒ #f

(exact? #e1.2)
⇒ #t

(exact? #e+1i)
ERROR: Wrong type argument

Guile also understands the syntax ‘+inf.0’ and ‘-inf.0’ for plus and minus infinity, respectively. The values must be written exactly as shown; that is, they must always have a sign and exactly one zero digit after the decimal point. It also understands ‘+nan.0’ and ‘-nan.0’ for the special ‘not-a-number’ value. The sign is ignored for ‘not-a-number’ and the value is always printed as ‘+nan.0’.


6.6.2.7 Operations on Integer Values

Scheme Procedure: odd? n
C Function: scm_odd_p (n)

Return #t if n is an odd number, #f otherwise.

Scheme Procedure: even? n
C Function: scm_even_p (n)

Return #t if n is an even number, #f otherwise.

Scheme Procedure: quotient n d
Scheme Procedure: remainder n d
C Function: scm_quotient (n, d)
C Function: scm_remainder (n, d)

Return the quotient or remainder from n divided by d. The quotient is rounded towards zero, and the remainder will have the same sign as n. In all cases quotient and remainder satisfy n = q*d + r.

(remainder 13 4) ⇒ 1
(remainder -13 4) ⇒ -1

See also truncate-quotient, truncate-remainder and related operations in Arithmetic.

Scheme Procedure: modulo n d
C Function: scm_modulo (n, d)

Return the remainder from n divided by d, with the same sign as d.

(modulo 13 4) ⇒ 1
(modulo -13 4) ⇒ 3
(modulo 13 -4) ⇒ -3
(modulo -13 -4) ⇒ -1

See also floor-quotient, floor-remainder and related operations in Arithmetic.
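The sign rules for remainder and modulo differ only when n and d have opposite signs. The sketch below places them side by side and checks the identity n = q*d + r stated for quotient and remainder; division-identity-holds? is a hypothetical helper name.

```scheme
;; `division-identity-holds?' is a hypothetical helper that checks
;; n = q*d + r for the truncating pair quotient/remainder.
(define (division-identity-holds? n d)
  (= n (+ (* (quotient n d) d) (remainder n d))))

(division-identity-holds? -13 4)   ; ⇒ #t
(division-identity-holds? 13 -4)   ; ⇒ #t

(quotient -13 4)    ; ⇒ -3  (rounded towards zero)
(remainder -13 4)   ; ⇒ -1  (sign of n)
(modulo -13 4)      ; ⇒ 3   (sign of d)

(remainder 13 -4)   ; ⇒ 1   (sign of n)
(modulo 13 -4)      ; ⇒ -3  (sign of d)
```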

Scheme Procedure: gcd x…
C Function: scm_gcd (x, y)

Return the greatest common divisor of all arguments. If called without arguments, 0 is returned.

The C function scm_gcd always takes two arguments, while the Scheme function can take an arbitrary number.

Scheme Procedure: lcm x…
C Function: scm_lcm (x, y)

Return the least common multiple of the arguments. If called without arguments, 1 is returned.

The C function scm_lcm always takes two arguments, while the Scheme function can take an arbitrary number.
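For example, at the Scheme level:

```scheme
(gcd 12 18)      ; ⇒ 6
(gcd 12 18 27)   ; ⇒ 3
(gcd)            ; ⇒ 0

(lcm 4 6)        ; ⇒ 12
(lcm 4 6 10)     ; ⇒ 60
(lcm)            ; ⇒ 1
```

The results for zero arguments are the identity elements of the two operations: (gcd 0 n) is n and (lcm 1 n) is n for any non-negative integer n.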

Scheme Procedure: modulo-expt n k m
C Function: scm_modulo_expt (n, k, m)

Return n raised to the integer exponent k, modulo m.

(modulo-expt 2 3 5)
   ⇒ 3

Scheme Procedure: exact-integer-sqrt k
C Function: void scm_exact_integer_sqrt (SCM k, SCM *s, SCM *r)

Return two exact non-negative integers s and r such that k = s^2 + r and s^2 <= k < (s + 1)^2. An error is raised if k is not an exact non-negative integer.

(exact-integer-sqrt 10) ⇒ 3 and 1
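Since exact-integer-sqrt returns two values, a caller needs a multiple-values receiver to use them. A minimal sketch with the standard call-with-values:

```scheme
;; Receive s and r and return them as a list.
(call-with-values
    (lambda () (exact-integer-sqrt 10))
  (lambda (s r)
    (list s r)))
;; ⇒ (3 1), since 10 = 3^2 + 1

;; For a perfect square the remainder r is 0.
(call-with-values (lambda () (exact-integer-sqrt 49)) list)
;; ⇒ (7 0)
```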

6.6.2.8 Comparison Predicates

The C comparison functions below always take two arguments, while the Scheme functions can take an arbitrary number. Also keep in mind that the C functions return one of the Scheme boolean values SCM_BOOL_T or SCM_BOOL_F which are both true as far as C is concerned. Thus, always write scm_is_true (scm_num_eq_p (x, y)) when testing the two Scheme numbers x and y for equality, for example.

Scheme Procedure: =
C Function: scm_num_eq_p (x, y)

Return #t if all parameters are numerically equal.

Scheme Procedure: <
C Function: scm_less_p (x, y)

Return #t if the list of parameters is monotonically increasing.

Scheme Procedure: >
C Function: scm_gr_p (x, y)

Return #t if the list of parameters is monotonically decreasing.

Scheme Procedure: <=
C Function: scm_leq_p (x, y)

Return #t if the list of parameters is monotonically non-decreasing.

Scheme Procedure: >=
C Function: scm_geq_p (x, y)

Return #t if the list of parameters is monotonically non-increasing.
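With more than two arguments, each predicate must hold between every adjacent pair. A short sketch:

```scheme
(< 1 2 3)      ; ⇒ #t
(< 1 3 2)      ; ⇒ #f, because 3 < 2 fails
(< 1 1 2)      ; ⇒ #f, because < requires a strict increase
(<= 1 1 2)     ; ⇒ #t
(>= 3 3 1)     ; ⇒ #t

;; Comparison is numerical, so exactness does not matter.
(= 1 1.0 1)    ; ⇒ #t
```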

Scheme Procedure: zero? z
C Function: scm_zero_p (z)

Return #t if z is an exact or inexact number equal to zero.

Scheme Procedure: positive? x
C Function: scm_positive_p (x)

Return #t if x is an exact or inexact number greater than zero.

Scheme Procedure: negative? x
C Function: scm_negative_p (x)

Return #t if x is an exact or inexact number less than zero.


6.6.2.9 Converting Numbers To and From Strings

The following procedures read and write numbers according to their external representation as defined by R5RS (see R5RS Lexical Structure in The Revised^5 Report on the Algorithmic Language Scheme). See the (ice-9 i18n) module, for locale-dependent number parsing.

Scheme Procedure: number->string n [radix]
C Function: scm_number_to_string (n, radix)

Return a string holding the external representation of the number n in the given radix. If n is inexact, a radix of 10 will be used.

Scheme Procedure: string->number string [radix]
C Function: scm_string_to_number (string, radix)

Return a number of the maximally precise representation expressed by the given string. radix must be an exact integer, either 2, 8, 10, or 16. If supplied, radix is a default radix that may be overridden by an explicit radix prefix in string (e.g. "#o177"). If radix is not supplied, then the default radix is 10. If string is not a syntactically valid notation for a number, then string->number returns #f.
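The two conversions can be sketched as follows:

```scheme
(number->string 255)          ; ⇒ "255"
(number->string 255 16)       ; ⇒ "ff"
(number->string 255 2)        ; ⇒ "11111111"

(string->number "255")        ; ⇒ 255
(string->number "ff" 16)      ; ⇒ 255
(string->number "1e3")        ; ⇒ 1000.0

;; An explicit radix prefix in the string overrides the radix argument.
(string->number "#o177" 16)   ; ⇒ 127

;; Invalid input yields #f rather than an error.
(string->number "hello")      ; ⇒ #f
```

Because string->number returns #f on invalid input, its result can be tested directly in a conditional before use.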

C Function: SCM scm_c_locale_stringn_to_number (const char *string, size_t len, unsigned radix)

As per string->number above, but taking a C string, as pointer and length. The string characters should be in the current locale encoding (locale in the name refers only to that, there’s no locale-dependent parsing).


6.6.2.10 Complex Number Operations

Scheme Procedure: make-rectangular real_part imaginary_part
C Function: scm_make_rectangular (real_part, imaginary_part)

Return a complex number constructed from the given real_part and imaginary_part.

Scheme Procedure: make-polar mag ang
C Function: scm_make_polar (mag, ang)

Return the complex number mag * e^(i * ang).

Scheme Procedure: real-part z
C Function: scm_real_part (z)

Return the real part of the number z.

Scheme Procedure: imag-part z
C Function: scm_imag_part (z)

Return the imaginary part of the number z.

Scheme Procedure: magnitude z
C Function: scm_magnitude (z)

Return the magnitude of the number z. This is the same as abs for real arguments, but also allows complex numbers.

Scheme Procedure: angle z
C Function: scm_angle (z)

Return the angle of the complex number z.
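A brief sketch combining the constructors and accessors above. The variable name z is chosen for this illustration, and the angle result is approximate.

```scheme
(define z (make-rectangular 3 4))   ; the complex number 3.0+4.0i

(real-part z)    ; ⇒ 3.0
(imag-part z)    ; ⇒ 4.0
(magnitude z)    ; ⇒ 5.0

;; For a real argument, magnitude behaves like abs.
(magnitude -7)   ; ⇒ 7

;; The point 0+1i lies on the positive imaginary axis, so its angle
;; is approximately pi/2.
(angle (make-rectangular 0 1))   ; ⇒ 1.5707963267949 (approx)
```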

C Function: SCM scm_c_make_rectangular (double re, double im)
C Function: SCM scm_c_make_polar (double x, double y)

Like scm_make_rectangular or scm_make_polar, respectively, but these functions take doubles as their arguments.

C Function: double scm_c_real_part (z)
C Function: double scm_c_imag_part (z)

Returns the real or imaginary part of z as a double.

C Function: double scm_c_magnitude (z)
C Function: double scm_c_angle (z)

Returns the magnitude or angle of z as a double.


6.6.2.11 Arithmetic Functions

The C arithmetic functions below always take two arguments, while the Scheme functions can take an arbitrary number. When you need to invoke them with just one argument, for example to compute the equivalent of (- x), pass SCM_UNDEFINED as the second one: scm_difference (x, SCM_UNDEFINED).

Scheme Procedure: + z1 …
C Function: scm_sum (z1, z2)

Return the sum of all parameter values. Return 0 if called without any parameters.

Scheme Procedure: - z1 z2 …
C Function: scm_difference (z1, z2)

If called with one argument z1, -z1 is returned. Otherwise the sum of all but the first argument is subtracted from the first argument.

Scheme Procedure: * z1 …
C Function: scm_product (z1, z2)

Return the product of all arguments. If called without arguments, 1 is returned.

Scheme Procedure: / z1 z2 …
C Function: scm_divide (z1, z2)

Divide the first argument by the product of the remaining arguments. If called with one argument z1, 1/z1 is returned.
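The argument conventions of the four basic operators can be sketched as:

```scheme
(+)            ; ⇒ 0
(+ 1 2 3)      ; ⇒ 6

(- 5)          ; ⇒ -5
(- 10 1 2 3)   ; ⇒ 4, that is 10 - (1 + 2 + 3)

(*)            ; ⇒ 1
(* 2 3 4)      ; ⇒ 24

(/ 2)          ; ⇒ 1/2
(/ 12 2 3)     ; ⇒ 2, that is 12 / (2 * 3)
```

Note that dividing exact integers yields an exact rational rather than a truncated integer.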

Scheme Procedure: 1+ z
C Function: scm_oneplus (z)

Return z + 1.

Scheme Procedure: 1- z
C Function: scm_oneminus (z)

Return z - 1.

Scheme Procedure: abs x
C Function: scm_abs (x)

Return the absolute value of x.

x must be a number with zero imaginary part. To calculate the magnitude of a complex number, use magnitude instead.

Scheme Procedure: max x1 x2 …
C Function: scm_max (x1, x2)

Return the maximum of all parameter values.

Scheme Procedure: min x1 x2 …
C Function: scm_min (x1, x2)

Return the minimum of all parameter values.

Scheme Procedure: truncate x
C Function: scm_truncate_number (x)

Round the inexact number x towards zero.

Scheme Procedure: round x
C Function: scm_round_number (x)

Round the inexact number x to the nearest integer. When exactly halfway between two integers, round to the even one.

Scheme Procedure: floor x
C Function: scm_floor (x)

Round the number x towards minus infinity.

Scheme Procedure: ceiling x
C Function: scm_ceiling (x)

Round the number x towards infinity.
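The four rounding procedures differ most visibly on negative numbers and on values exactly halfway between two integers:

```scheme
(truncate -2.7)   ; ⇒ -2.0  (towards zero)
(floor -2.7)      ; ⇒ -3.0  (towards minus infinity)
(ceiling -2.7)    ; ⇒ -2.0  (towards infinity)
(round -2.7)      ; ⇒ -3.0  (to nearest)

;; Halfway cases go to the even neighbour.
(round 2.5)       ; ⇒ 2.0
(round 3.5)       ; ⇒ 4.0

;; An exact argument gives an exact result.
(round 7/2)       ; ⇒ 4

;; The result of rounding an inexact number is still inexact; convert
;; explicitly when an exact integer is needed.
(inexact->exact (round 2.5))   ; ⇒ 2
```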

C Function: double scm_c_truncate (double x)
C Function: double scm_c_round (double x)

Like scm_truncate_number or scm_round_number, respectively, but these functions take and return double values.

Scheme Procedure: euclidean/ x y
Scheme Procedure: euclidean-quotient x y
Scheme Procedure: euclidean-remainder x y
C Function: void scm_euclidean_divide (SCM x, SCM y, SCM *q, SCM *r)
C Function: SCM scm_euclidean_quotient (SCM x, SCM y)
C Function: SCM scm_euclidean_remainder (SCM x, SCM y)

These procedures accept two real numbers x and y, where the divisor y must be non-zero. euclidean-quotient returns the integer q and euclidean-remainder returns the real number r such that x = q*y + r and 0 <= r < |y|. euclidean/ returns both q and r, and is more efficient than computing each separately. Note that when y > 0, euclidean-quotient returns floor(x/y), otherwise it returns ceiling(x/y).

Note that these operators are equivalent to the R6RS operators div, mod, and div-and-mod.

(euclidean-quotient 123 10) ⇒ 12
(euclidean-remainder 123 10) ⇒ 3
(euclidean/ 123 10) ⇒ 12 and 3
(euclidean/ 123 -10) ⇒ -12 and 3
(euclidean/ -123 10) ⇒ -13 and 7
(euclidean/ -123 -10) ⇒ 13 and 7
(euclidean/ -123.2 -63.5) ⇒ 2.0 and 3.8
(euclidean/ 16/3 -10/7) ⇒ -3 and 22/21
Scheme Procedure: floor/ x y
Scheme Procedure: floor-quotient x y
Scheme Procedure: floor-remainder x y
C Function: void scm_floor_divide (SCM x, SCM y, SCM *q, SCM *r)
C Function: SCM scm_floor_quotient (x, y)
C Function: SCM scm_floor_remainder (x, y)

These procedures accept two real numbers x and y, where the divisor y must be non-zero. floor-quotient returns the integer q and floor-remainder returns the real number r such that q = floor(x/y) and x = q*y + r. floor/ returns both q and r, and is more efficient than computing each separately. Note that r, if non-zero, will have the same sign as y.

When x and y are integers, floor-remainder is equivalent to the R5RS integer-only operator modulo.

(floor-quotient 123 10) ⇒ 12
(floor-remainder 123 10) ⇒ 3
(floor/ 123 10) ⇒ 12 and 3
(floor/ 123 -10) ⇒ -13 and -7
(floor/ -123 10) ⇒ -13 and 7
(floor/ -123 -10) ⇒ 12 and -3
(floor/ -123.2 -63.5) ⇒ 1.0 and -59.7
(floor/ 16/3 -10/7) ⇒ -4 and -8/21
Scheme Procedure: ceiling/ x y
Scheme Procedure: ceiling-quotient x y
Scheme Procedure: ceiling-remainder x y
C Function: void scm_ceiling_divide (SCM x, SCM y, SCM *q, SCM *r)
C Function: SCM scm_ceiling_quotient (x, y)
C Function: SCM scm_ceiling_remainder (x, y)

These procedures accept two real numbers x and y, where the divisor y must be non-zero. ceiling-quotient returns the integer q and ceiling-remainder returns the real number r such that q = ceiling(x/y) and x = q*y + r. ceiling/ returns both q and r, and is more efficient than computing each separately. Note that r, if non-zero, will have the opposite sign of y.

(ceiling-quotient 123 10) ⇒ 13
(ceiling-remainder 123 10) ⇒ -7
(ceiling/ 123 10) ⇒ 13 and -7
(ceiling/ 123 -10) ⇒ -12 and 3
(ceiling/ -123 10) ⇒ -12 and -3
(ceiling/ -123 -10) ⇒ 13 and 7
(ceiling/ -123.2 -63.5) ⇒ 2.0 and 3.8
(ceiling/ 16/3 -10/7) ⇒ -3 and 22/21
Scheme Procedure: truncate/ x y
Scheme Procedure: truncate-quotient x y
Scheme Procedure: truncate-remainder x y
C Function: void scm_truncate_divide (SCM x, SCM y, SCM *q, SCM *r)
C Function: SCM scm_truncate_quotient (x, y)
C Function: SCM scm_truncate_remainder (x, y)

These procedures accept two real numbers x and y, where the divisor y must be non-zero. truncate-quotient returns the integer q and truncate-remainder returns the real number r such that q is x/y rounded toward zero, and x = q*y + r. truncate/ returns both q and r, and is more efficient than computing each separately. Note that r, if non-zero, will have the same sign as x.

When x and y are integers, these operators are equivalent to the R5RS integer-only operators quotient and remainder.

(truncate-quotient 123 10) ⇒ 12
(truncate-remainder 123 10) ⇒ 3
(truncate/ 123 10) ⇒ 12 and 3
(truncate/ 123 -10) ⇒ -12 and 3
(truncate/ -123 10) ⇒ -12 and -3
(truncate/ -123 -10) ⇒ 12 and -3
(truncate/ -123.2 -63.5) ⇒ 1.0 and -59.7
(truncate/ 16/3 -10/7) ⇒ -3 and 22/21
Scheme Procedure: centered/ x y
Scheme Procedure: centered-quotient x y
Scheme Procedure: centered-remainder x y
C Function: void scm_centered_divide (SCM x, SCM y, SCM *q, SCM *r)
C Function: SCM scm_centered_quotient (SCM x, SCM y)
C Function: SCM scm_centered_remainder (SCM x, SCM y)

These procedures accept two real numbers x and y, where the divisor y must be non-zero. centered-quotient returns the integer q and centered-remainder returns the real number r such that x = q*y + r and -|y/2| <= r < |y/2|. centered/ returns both q and r, and is more efficient than computing each separately.

Note that centered-quotient returns x/y rounded to the nearest integer. When x/y lies exactly half-way between two integers, the tie is broken according to the sign of y. If y > 0, ties are rounded toward positive infinity, otherwise they are rounded toward negative infinity. This is a consequence of the requirement that -|y/2| <= r < |y/2|.

Note that these operators are equivalent to the R6RS operators div0, mod0, and div0-and-mod0.

(centered-quotient 123 10) ⇒ 12
(centered-remainder 123 10) ⇒ 3
(centered/ 123 10) ⇒ 12 and 3
(centered/ 123 -10) ⇒ -12 and 3
(centered/ -123 10) ⇒ -12 and -3
(centered/ -123 -10) ⇒ 12 and -3
(centered/ 125 10) ⇒ 13 and -5
(centered/ 127 10) ⇒ 13 and -3
(centered/ 135 10) ⇒ 14 and -5
(centered/ -123.2 -63.5) ⇒ 2.0 and 3.8
(centered/ 16/3 -10/7) ⇒ -4 and -8/21
Scheme Procedure: round/ x y
Scheme Procedure: round-quotient x y
Scheme Procedure: round-remainder x y
C Function: void scm_round_divide (SCM x, SCM y, SCM *q, SCM *r)
C Function: SCM scm_round_quotient (x, y)
C Function: SCM scm_round_remainder (x, y)

These procedures accept two real numbers x and y, where the divisor y must be non-zero. round-quotient returns the integer q and round-remainder returns the real number r such that x = q*y + r and q is x/y rounded to the nearest integer, with ties going to the nearest even integer. round/ returns both q and r, and is more efficient than computing each separately.

Note that round/ and centered/ are almost equivalent, but their behavior differs when x/y lies exactly half-way between two integers. In this case, round/ chooses the nearest even integer, whereas centered/ chooses in such a way as to satisfy the constraint -|y/2| <= r < |y/2|, which is stronger than the corresponding constraint for round/, -|y/2| <= r <= |y/2|. In particular, when x and y are integers, the number of possible remainders returned by centered/ is |y|, whereas the number of possible remainders returned by round/ is |y|+1 when y is even.

(round-quotient 123 10) ⇒ 12
(round-remainder 123 10) ⇒ 3
(round/ 123 10) ⇒ 12 and 3
(round/ 123 -10) ⇒ -12 and 3
(round/ -123 10) ⇒ -12 and -3
(round/ -123 -10) ⇒ 12 and -3
(round/ 125 10) ⇒ 12 and 5
(round/ 127 10) ⇒ 13 and -3
(round/ 135 10) ⇒ 14 and -5
(round/ -123.2 -63.5) ⇒ 2.0 and 3.8
(round/ 16/3 -10/7) ⇒ -4 and -8/21
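
The six division families share the invariant x = q*y + r and differ only in how q is chosen. The sketch below collects both return values with call-with-values and runs every operator on the same input. The helper names both-values and check-division are hypothetical, introduced only for this illustration.

```scheme
;; Hypothetical helper: collect the quotient and remainder returned by DIV.
(define (both-values div x y)
  (call-with-values (lambda () (div x y)) list))

;; Hypothetical helper: #t when the two results satisfy x = q*y + r.
(define (check-division div x y)
  (call-with-values (lambda () (div x y))
    (lambda (q r) (= x (+ (* q y) r)))))

(define all-divisions
  (list euclidean/ floor/ ceiling/ truncate/ centered/ round/))

(map (lambda (div) (both-values div -123 10)) all-divisions)
;; ⇒ ((-13 7) (-13 7) (-12 -3) (-12 -3) (-12 -3) (-12 -3))
```

With exact arguments the invariant can be checked exactly, for example with (check-division floor/ 16/3 -10/7).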

6.6.2.12 Scientific Functions

The following procedures accept any kind of number as arguments, including complex numbers.

Scheme Procedure: sqrt z

Return the square root of z. Of the two possible roots (positive and negative), the one with a positive real part is returned, or if that’s zero then a positive imaginary part. Thus,

(sqrt 9.0)       ⇒ 3.0
(sqrt -9.0)      ⇒ 0.0+3.0i
(sqrt 1.0+1.0i)  ⇒ 1.09868411346781+0.455089860562227i
(sqrt -1.0-1.0i) ⇒ 0.455089860562227-1.09868411346781i
Scheme Procedure: expt z1 z2

Return z1 raised to the power of z2.

Scheme Procedure: sin z

Return the sine of z.

Scheme Procedure: cos z

Return the cosine of z.

Scheme Procedure: tan z

Return the tangent of z.

Scheme Procedure: asin z

Return the arcsine of z.

Scheme Procedure: acos z

Return the arccosine of z.

Scheme Procedure: atan z
Scheme Procedure: atan y x

Return the arctangent of z, or of y/x.
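
As a small illustration of the two-argument form: it is assumed here, following the usual Scheme convention, that (atan y x) uses the signs of both arguments to select the quadrant of the result. The variable name pi is defined locally for this sketch and is not something the text above provides.

```scheme
;; The arctangent of 1 is pi/4, so this recovers pi as an inexact real.
(define pi (* 4 (atan 1)))

(atan 1 1)    ; first quadrant, approximately pi/4
(atan 1 -1)   ; second quadrant, approximately 3*pi/4
(atan -1 -1)  ; third quadrant, approximately -3*pi/4
(atan -1 1)   ; fourth quadrant, approximately -pi/4
```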

Scheme Procedure: exp z

Return e to the power of z, where e is the base of natural logarithms (2.71828…).

Scheme Procedure: log z

Return the natural logarithm of z.

Scheme Procedure: log10 z

Return the base 10 logarithm of z.

Scheme Procedure: sinh z

Return the hyperbolic sine of z.

Scheme Procedure: cosh z

Return the hyperbolic cosine of z.

Scheme Procedure: tanh z

Return the hyperbolic tangent of z.

Scheme Procedure: asinh z

Return the hyperbolic arcsine of z.

Scheme Procedure: acosh z

Return the hyperbolic arccosine of z.

Scheme Procedure: atanh z

Return the hyperbolic arctangent of z.


6.6.2.13 Bitwise Operations

For the following bitwise functions, negative numbers are treated as infinite precision twos-complements. For instance -6 is bits …111010, with infinitely many ones on the left. It can be seen that adding 6 (binary 110) to such a bit pattern gives all zeros.
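
The infinite run of one bits cannot be printed directly, but masking a negative number with a finite pattern makes the low bits visible. A minimal sketch, using logand (described below) with an eight-bit mask:

```scheme
;; Keep only the low eight bits of -6; the result is a non-negative integer.
(number->string (logand -6 #xFF) 2)  ; ⇒ "11111010"

;; Adding 6 clears those eight bits and carries out of the mask.
(+ (logand -6 #xFF) 6)               ; ⇒ 256
```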

Scheme Procedure: logand n1 n2 …
C Function: scm_logand (n1, n2)

Return the bitwise AND of the integer arguments.

(logand) ⇒ -1
(logand 7) ⇒ 7
(logand #b111 #b011 #b001) ⇒ 1
Scheme Procedure: logior n1 n2 …
C Function: scm_logior (n1, n2)

Return the bitwise OR of the integer arguments.

(logior) ⇒ 0
(logior 7) ⇒ 7
(logior #b000 #b001 #b011) ⇒ 3
Scheme Procedure: logxor n1 n2 …
C Function: scm_logxor (n1, n2)

Return the bitwise XOR of the integer arguments. A bit is set in the result if it is set in an odd number of arguments.

(logxor) ⇒ 0
(logxor 7) ⇒ 7
(logxor #b000 #b001 #b011) ⇒ 2
(logxor #b000 #b001 #b011 #b011) ⇒ 1
Scheme Procedure: lognot n
C Function: scm_lognot (n)

Return the integer which is the ones-complement of the integer argument, i.e. each 0 bit is changed to 1 and each 1 bit to 0.

(number->string (lognot #b10000000) 2)
   ⇒ "-10000001"
(number->string (lognot #b0) 2)
   ⇒ "-1"
Scheme Procedure: logtest j k
C Function: scm_logtest (j, k)

Test whether j and k have any 1 bits in common. This is equivalent to (not (zero? (logand j k))), but without actually calculating the logand, just testing for non-zero.

(logtest #b0100 #b1011) ⇒ #f
(logtest #b0100 #b0111) ⇒ #t
Scheme Procedure: logbit? index j
C Function: scm_logbit_p (index, j)

Test whether bit number index in j is set. index starts from 0 for the least significant bit.

(logbit? 0 #b1101) ⇒ #t
(logbit? 1 #b1101) ⇒ #f
(logbit? 2 #b1101) ⇒ #t
(logbit? 3 #b1101) ⇒ #t
(logbit? 4 #b1101) ⇒ #f
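
A common use of the procedures above is to pack boolean flags into one integer. The following sketch shows setting, testing and clearing flags; the names flag-read, flag-write, flag-exec and the perms variables are hypothetical and exist only for this illustration.

```scheme
;; Hypothetical flag values, one bit each.
(define flag-read  #b001)
(define flag-write #b010)
(define flag-exec  #b100)

;; Combine flags with logior.
(define perms (logior flag-read flag-exec))            ; #b101

(logtest perms flag-write)                             ; ⇒ #f
(logbit? 2 perms)                                      ; ⇒ #t

;; Set a flag with logior, clear one with logand and lognot.
(define perms+w (logior perms flag-write))             ; #b111
(define perms-r (logand perms+w (lognot flag-read)))   ; #b110

;; Toggle a flag with logxor.
(logxor perms+w flag-exec)                             ; ⇒ 3
```
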
Scheme Procedure: ash n count
C Function: scm_ash (n, count)

Return floor(n * 2^count). n and count must be exact integers.

With n viewed as an infinite-precision twos-complement integer, ash means a left shift introducing zero bits when count is positive, or a right shift dropping bits when count is negative. This is an “arithmetic” shift.

(number->string (ash #b1 3) 2)     ⇒ "1000"
(number->string (ash #b1010 -1) 2) ⇒ "101"

;; -23 is bits ...11101001, -6 is bits ...111010
(ash -23 -2) ⇒ -6
Scheme Procedure: round-ash n count
C Function: scm_round_ash (n, count)

Return round(n * 2^count). n and count must be exact integers.

With n viewed as an infinite-precision twos-complement integer, round-ash means a left shift introducing zero bits when count is positive, or a right shift rounding to the nearest integer (with ties going to the nearest even integer) when count is negative. This is a rounded “arithmetic” shift.

(number->string (round-ash #b1 3) 2)     ⇒ "1000"
(number->string (round-ash #b1010 -1) 2) ⇒ "101"
(number->string (round-ash #b1010 -2) 2) ⇒ "10"
(number->string (round-ash #b1011 -2) 2) ⇒ "11"
(number->string (round-ash #b1101 -2) 2) ⇒ "11"
(number->string (round-ash #b1110 -2) 2) ⇒ "100"
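
The difference between ash and round-ash only shows up on right shifts that drop non-zero bits. The sketch below halves a few odd numbers both ways; halve-both is a hypothetical helper name used for illustration.

```scheme
;; Hypothetical helper: shift N right by one bit with both procedures.
(define (halve-both n)
  (list (ash n -1) (round-ash n -1)))

(halve-both 5)   ; ⇒ (2 2)    2.5: floor gives 2, rounding to even gives 2
(halve-both 7)   ; ⇒ (3 4)    3.5: floor gives 3, rounding to even gives 4
(halve-both -5)  ; ⇒ (-3 -2)  -2.5: floor gives -3, rounding to even gives -2
```
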
Scheme Procedure: logcount n
C Function: scm_logcount (n)

Return the number of bits in integer n. If n is positive, the 1-bits in its binary representation are counted. If negative, the 0-bits in its two’s-complement binary representation are counted. If zero, 0 is returned.

(logcount #b10101010)
   ⇒ 4
(logcount 0)
   ⇒ 0
(logcount -2)
   ⇒ 1
Scheme Procedure: integer-length n
C Function: scm_integer_length (n)

Return the number of bits necessary to represent n.

For positive n this is how many bits to the most significant one bit. For negative n it’s how many bits to the most significant zero bit in twos complement form.

(integer-length #b10101010) ⇒ 8
(integer-length #b1111)     ⇒ 4
(integer-length 0)          ⇒ 0
(integer-length -1)         ⇒ 0
(integer-length -256)       ⇒ 8
(integer-length -257)       ⇒ 9
Scheme Procedure: integer-expt n k
C Function: scm_integer_expt (n, k)

Return n raised to the power k. k must be an exact integer, n can be any number.

Negative k is supported, and results in 1/n^abs(k) in the usual way. n^0 is 1, as usual, and that includes 0^0, which is also 1.

(integer-expt 2 5)   ⇒ 32
(integer-expt -3 3)  ⇒ -27
(integer-expt 5 -3)  ⇒ 1/125
(integer-expt 0 0)   ⇒ 1
Scheme Procedure: bit-extract n start end
C Function: scm_bit_extract (n, start, end)

Return the integer composed of the start (inclusive) through end (exclusive) bits of n. The startth bit becomes the 0-th bit in the result.

(number->string (bit-extract #b1101101010 0 4) 2)
   ⇒ "1010"
(number->string (bit-extract #b1101101010 4 9) 2)
   ⇒ "10110"
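
For non-negative n, the same field can be obtained by shifting and masking. The following is a sketch of that equivalence rather than a description of how bit-extract is implemented; my-bit-extract is a hypothetical name.

```scheme
;; Hypothetical equivalent: shift the field down to bit 0, then mask
;; off everything above its width (end - start bits).
(define (my-bit-extract n start end)
  (logand (ash n (- start))
          (1- (ash 1 (- end start)))))

(number->string (my-bit-extract #b1101101010 0 4) 2)  ; ⇒ "1010"
(number->string (my-bit-extract #b1101101010 4 9) 2)  ; ⇒ "10110"
```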

6.6.2.14 Random Number Generation

Pseudo-random numbers are generated from a random state object, which can be created with seed->random-state or datum->random-state. An external representation (i.e. one which can be written with write and read with read) of a random state object can be obtained via random-state->datum. The state parameter to the various functions below is optional; it defaults to the state object in the *random-state* variable.

Scheme Procedure: copy-random-state [state]
C Function: scm_copy_random_state (state)

Return a copy of the random state state.

Scheme Procedure: random n [state]
C Function: scm_random (n, state)

Return a number in [0, n).

Accepts a positive integer or real n and returns a number of the same type between zero (inclusive) and n (exclusive). The values returned have a uniform distribution.

Scheme Procedure: random:exp [state]
C Function: scm_random_exp (state)

Return an inexact real in an exponential distribution with mean 1. For an exponential distribution with mean u use (* u (random:exp)).

Scheme Procedure: random:hollow-sphere! vect [state]
C Function: scm_random_hollow_sphere_x (vect, state)

Fills vect with inexact real random numbers the sum of whose squares is equal to 1.0. Thinking of vect as coordinates in space of dimension n = (vector-length vect), the coordinates are uniformly distributed over the surface of the unit n-sphere.

Scheme Procedure: random:normal [state]
C Function: scm_random_normal (state)

Return an inexact real in a normal distribution. The distribution used has mean 0 and standard deviation 1. For a normal distribution with mean m and standard deviation d use (+ m (* d (random:normal))).

Scheme Procedure: random:normal-vector! vect [state]
C Function: scm_random_normal_vector_x (vect, state)

Fills vect with inexact real random numbers that are independent and standard normally distributed (i.e., with mean 0 and variance 1).

Scheme Procedure: random:solid-sphere! vect [state]
C Function: scm_random_solid_sphere_x (vect, state)

Fills vect with inexact real random numbers the sum of whose squares is less than 1.0. Thinking of vect as coordinates in space of dimension n = (vector-length vect), the coordinates are uniformly distributed within the unit n-sphere.

Scheme Procedure: random:uniform [state]
C Function: scm_random_uniform (state)

Return a uniformly distributed inexact real random number in [0,1).

Scheme Procedure: seed->random-state seed
C Function: scm_seed_to_random_state (seed)

Return a new random state using seed.

Scheme Procedure: datum->random-state datum
C Function: scm_datum_to_random_state (datum)

Return a new random state from datum, which should have been obtained by random-state->datum.

Scheme Procedure: random-state->datum state
C Function: scm_random_state_to_datum (state)

Return a datum representation of state that may be written out and read back with the Scheme reader.
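
The state procedures above make random sequences reproducible. The sketch below shows two things: states built from the same seed yield the same numbers, and a state saved with random-state->datum can be restored later to replay the sequence from that point. The seed 42 and the names s1, s2, draw, saved, next and restored are arbitrary choices for this illustration.

```scheme
;; Two states created from the same seed.
(define s1 (seed->random-state 42))
(define s2 (seed->random-state 42))

;; Draw three numbers with different bounds from STATE.
(define (draw state)
  (map (lambda (n) (random n state)) '(10 100 1000)))

(equal? (draw s1) (draw s2))          ; ⇒ #t

;; Snapshot s1, advance it, then replay from the snapshot.
(define saved (random-state->datum s1))
(define next (random 1000000 s1))
(define restored (datum->random-state saved))
(= next (random 1000000 restored))    ; ⇒ #t
```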

Scheme Procedure: random-state-from-platform
C Function: scm_random_state_from_platform ()

Construct a new random state seeded from a platform-specific source of entropy, appropriate for use in non-security-critical applications. Currently /dev/urandom is tried first, or else the seed is based on the time, date, process ID, an address from a freshly allocated heap cell, an address from the local stack frame, and a high-resolution timer if available.

Variable: *random-state*

The global random state used by the above functions when the state parameter is not given.

Note that the initial value of *random-state* is the same every time Guile starts up. Therefore, if you don’t pass a state parameter to the above procedures, and you don’t set *random-state* to (seed->random-state your-seed), where your-seed is something that isn’t the same every time, you’ll get the same sequence of “random” numbers on every run.

For example, unless the relevant source code has changed, (map random (cdr (iota 19))), if it is the first use of random numbers since Guile started up, will always give:

(map random (cdr (iota 19)))
⇒
(0 1 1 2 2 2 1 2 6 7 10 0 5 3 12 5 5 12)

To seed the random state in a sensible way for non-security-critical applications, do this during initialization of your program:

(set! *random-state* (random-state-from-platform))

6.6.3 Characters

In Scheme, there is a data type to describe a single character.

Defining what exactly a character is can be more complicated than it seems. Guile follows the advice of R6RS and uses The Unicode Standard to help define what a character is. So, for Guile, a character is anything in the Unicode Character Database.

The Unicode Character Database is basically a table of characters indexed using integers called ’code points’. Valid code points are in the ranges 0 to #xD7FF inclusive or #xE000 to #x10FFFF inclusive, which is about 1.1 million code points.

Any code point that has been assigned to a character or that has otherwise been given a meaning by Unicode is called a ’designated code point’. Most of the designated code points, about 200,000 of them, indicate characters, accents or other combining marks that modify other characters, symbols, whitespace, and control characters. Some are not characters but indicators that suggest how to format or display neighboring characters.

If a code point is not a designated code point (that is, if it has not been assigned to a character by The Unicode Standard), it is a ’reserved code point’, meaning that it is reserved for future use. Most of the code points, about 800,000, are ’reserved code points’.

By convention, a Unicode code point is written as “U+XXXX” where “XXXX” is a hexadecimal number. Please note that this convenient notation is not valid code. Guile does not interpret “U+XXXX” as a character.

In Scheme, a character literal is written as #\name where name is the name of the character that you want. Printable characters have their usual single character name; for example, #\a is a lower case a.

Some of the code points are ’combining characters’ that are not meant to be printed by themselves but are instead meant to modify the appearance of the previous character. For combining characters, an alternate form of the character literal is #\ followed by U+25CC (a small, dotted circle), followed by the combining character. This allows the combining character to be drawn on the circle, not on the backslash of #\.

Many of the non-printing characters, such as whitespace characters and control characters, also have names.

The most commonly used non-printing characters have long character names, described in the table below.

Character Name    Codepoint
#\nul             U+0000
#\alarm           U+0007
#\backspace       U+0008
#\tab             U+0009
#\linefeed        U+000A
#\newline         U+000A
#\vtab            U+000B
#\page            U+000C
#\return          U+000D
#\esc             U+001B
#\space           U+0020
#\delete          U+007F

There are also short names for all of the “C0 control characters” (those with code points below 32). The following table lists the short name for each character.

 0 = #\nul    1 = #\soh    2 = #\stx    3 = #\etx
 4 = #\eot    5 = #\enq    6 = #\ack    7 = #\bel
 8 = #\bs     9 = #\ht    10 = #\lf    11 = #\vt
12 = #\ff    13 = #\cr    14 = #\so    15 = #\si
16 = #\dle   17 = #\dc1   18 = #\dc2   19 = #\dc3
20 = #\dc4   21 = #\nak   22 = #\syn   23 = #\etb
24 = #\can   25 = #\em    26 = #\sub   27 = #\esc
28 = #\fs    29 = #\gs    30 = #\rs    31 = #\us
32 = #\sp

The short name for the “delete” character (code point U+007F) is #\del.

The R7RS name for the “escape” character (code point U+001B) is #\escape.

There are also a few alternative names left over for compatibility with previous versions of Guile.

Alternate    Standard
#\nl         #\newline
#\np         #\page
#\null       #\nul

Characters may also be written using their code point values. They can be written as an octal number, such as #\10 for #\bs or #\177 for #\del.

If one prefers hex to octal, there is an additional syntax for character escapes: #\xHHHH – the letter ’x’ followed by a hexadecimal number of one to eight digits.
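
The different spellings described above all denote ordinary character objects, so they can be compared with eqv?. A short sketch using only names and escapes listed in this section:

```scheme
(char->integer #\a)          ; ⇒ 97

;; Long names, short names and alternate names for the same code point.
(eqv? #\newline #\linefeed)  ; ⇒ #t  (both are U+000A)
(eqv? #\nul #\null)          ; ⇒ #t
(eqv? #\delete #\del)        ; ⇒ #t

;; Numeric escapes: octal digits directly, or x followed by hex digits.
(eqv? #\101 #\A)             ; ⇒ #t  (octal 101 is 65)
(eqv? #\x41 #\A)             ; ⇒ #t  (hex 41 is 65)
(eqv? #\177 #\x7F)           ; ⇒ #t
```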

Scheme Procedure: char? x
C Function: scm_char_p (x)

Return #t if x is a character, else #f.

Fundamentally, the character comparison operations below are numeric comparisons of the character’s code points.

Scheme Procedure: char=? x y

Return #t if the code point of x is equal to the code point of y, else #f.

Scheme Procedure: char<? x y

Return #t if the code point of x is less than the code point of y, else #f.

Scheme Procedure: char<=? x y

Return #t if the code point of x is less than or equal to the code point of y, else #f.

Scheme Procedure: char>? x y

Return #t if the code point of x is greater than the code point of y, else #f.

Scheme Procedure: char>=? x y

Return #t if the code point of x is greater than or equal to the code point of y, else #f.

Case-insensitive character comparisons use Unicode case folding. In case folding comparisons, if a character is lowercase and has an uppercase form that can be expressed as a single character, it is converted to uppercase before comparison. All other characters undergo no conversion before the comparison occurs. This includes the German sharp S (Eszett), which is not uppercased before comparison because its uppercase form has two characters. Unicode case folding is language independent: it uses rules that are generally true, but it cannot cover all cases for all languages.

Scheme Procedure: char-ci=? x y

Return #t if the case-folded code point of x is the same as the case-folded code point of y, else #f.

Scheme Procedure: char-ci<? x y

Return #t if the case-folded code point of x is less than the case-folded code point of y, else #f.

Scheme Procedure: char-ci<=? x y

Return #t if the case-folded code point of x is less than or equal to the case-folded code point of y, else #f.

Scheme Procedure: char-ci>? x y

Return #t if the case-folded code point of x is greater than the case-folded code point of y, else #f.

Scheme Procedure: char-ci>=? x y

Return #t if the case-folded code point of x is greater than or equal to the case-folded code point of y, else #f.
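
Because the plain comparisons work on code points, uppercase ASCII letters sort before lowercase ones; the case-insensitive variants remove that effect. A brief sketch with ASCII letters only:

```scheme
;; Code point order: #\A is 65, #\B is 66, #\a is 97.
(char<? #\A #\a)     ; ⇒ #t
(char<? #\a #\B)     ; ⇒ #f
(char=? #\a #\A)     ; ⇒ #f

;; Case-folded order ignores the difference in case.
(char-ci<? #\a #\B)  ; ⇒ #t
(char-ci=? #\a #\A)  ; ⇒ #t
```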

Scheme Procedure: char-alphabetic? chr
C Function: scm_char_alphabetic_p (chr)

Return #t if chr is alphabetic, else #f.

Scheme Procedure: char-numeric? chr
C Function: scm_char_numeric_p (chr)

Return #t if chr is numeric, else #f.

Scheme Procedure: char-whitespace? chr
C Function: scm_char_whitespace_p (chr)

Return #t if chr is whitespace, else #f.

Scheme Procedure: char-upper-case? chr
C Function: scm_char_upper_case_p (chr)

Return #t if chr is uppercase, else #f.

Scheme Procedure: char-lower-case? chr
C Function: scm_char_lower_case_p (chr)

Return #t if chr is lowercase, else #f.

Scheme Procedure: char-is-both? chr
C Function: scm_char_is_both_p (chr)

Return #t if chr is either uppercase or lowercase, else #f.

Scheme Procedure: char-general-category chr
C Function: scm_char_general_category (chr)

Return a symbol giving the two-letter name of the Unicode general category assigned to chr or #f if no named category is assigned. The following table provides a list of category names along with their meanings.

Lu  Uppercase letter             Pf  Final quote punctuation
Ll  Lowercase letter             Po  Other punctuation
Lt  Titlecase letter             Sm  Math symbol
Lm  Modifier letter              Sc  Currency symbol
Lo  Other letter                 Sk  Modifier symbol
Mn  Non-spacing mark             So  Other symbol
Mc  Combining spacing mark       Zs  Space separator
Me  Enclosing mark               Zl  Line separator
Nd  Decimal digit number         Zp  Paragraph separator
Nl  Letter number                Cc  Control
No  Other number                 Cf  Format
Pc  Connector punctuation        Cs  Surrogate
Pd  Dash punctuation             Co  Private use
Ps  Open punctuation             Cn  Unassigned
Pe  Close punctuation
Pi  Initial quote punctuation
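
To make the table concrete, the following sketch looks up the category of a few ASCII characters. The expected symbols in the comment follow the Unicode general categories for these characters.

```scheme
;; One character from several different categories.
(map char-general-category
     (list #\A #\a #\7 #\space #\newline #\+ #\$))
;; ⇒ (Lu Ll Nd Zs Cc Sm Sc)
```
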
Scheme Procedure: char->integer chr
C Function: scm_char_to_integer (chr)

Return the code point of chr.

Scheme Procedure: integer->char n
C Function: scm_integer_to_char (n)

Return the character that has code point n. The integer n must be a valid code point. Valid code points are in the ranges 0 to #xD7FF inclusive or #xE000 to #x10FFFF inclusive.
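
Since integer->char requires a valid code point, code that converts untrusted integers may want to check the ranges first. The sketch below encodes the two ranges given above; valid-code-point? and code-point->char are hypothetical helpers, not procedures provided by Guile.

```scheme
;; Hypothetical predicate: #t when N lies in one of the two valid ranges.
(define (valid-code-point? n)
  (and (integer? n)
       (exact? n)
       (or (<= 0 n #xD7FF)
           (<= #xE000 n #x10FFFF))))

;; Hypothetical wrapper: the character for N, or #f if N is not valid.
(define (code-point->char n)
  (and (valid-code-point? n) (integer->char n)))

(code-point->char 65)         ; ⇒ #\A
(code-point->char #xD800)     ; ⇒ #f  (inside the excluded gap)
(code-point->char #x110000)   ; ⇒ #f  (above the last valid code point)
(char->integer (code-point->char #x10FFFF))  ; ⇒ 1114111
```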

Scheme Procedure: char-upcase chr
C Function: scm_char_upcase (chr)

Return the uppercase character version of chr.

Scheme Procedure: char-downcase chr
C Function: scm_char_downcase (chr)

Return the lowercase character version of chr.

Scheme Procedure: char-titlecase chr
C Function: scm_char_titlecase (chr)

Return the titlecase character version of chr if one exists; otherwise return the uppercase version.

For most characters these will be the same, but the Unicode Standard includes certain digraph compatibility characters, such as U+01F3 “dz”, for which the uppercase and titlecase characters are different (U+01F1 “DZ” and U+01F2 “Dz” in this case, respectively).
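
The case conversions can be tried as follows. The digraph is built with integer->char so that the example does not depend on the encoding of the source file; the variable name dz is chosen only for this sketch.

```scheme
(char-upcase #\a)     ; ⇒ #\A
(char-downcase #\A)   ; ⇒ #\a
(char-titlecase #\a)  ; ⇒ #\A  (titlecase and uppercase coincide here)

;; U+01F3 has distinct uppercase and titlecase forms.
(define dz (integer->char #x01F3))
(char->integer (char-upcase dz))     ; ⇒ 497  (#x01F1)
(char->integer (char-titlecase dz))  ; ⇒ 498  (#x01F2)
```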

C Function: scm_t_wchar scm_c_upcase (scm_t_wchar c)
C Function: scm_t_wchar scm_c_downcase (scm_t_wchar c)
C Function: scm_t_wchar scm_c_titlecase (scm_t_wchar c)

These C functions take an integer representation of a Unicode codepoint and return the codepoint corresponding to its uppercase, lowercase, and titlecase forms respectively. The type scm_t_wchar is a signed, 32-bit integer.


6.6.4 Character Sets

The features described in this section correspond directly to SRFI-14.

The data type charset implements sets of characters (see Characters). Because the internal representation of character sets is not visible to the user, a lot of procedures for handling them are provided.

Character sets can be created, extended, tested for the membership of a character, and compared to other character sets.


6.6.4.1 Character Set Predicates/Comparison

Use these procedures for testing whether an object is a character set, or whether several character sets are equal or subsets of each other. char-set-hash can be used for calculating a hash value, for example for use in fast lookup procedures.

Scheme Procedure: char-set? obj
C Function: scm_char_set_p (obj)

Return #t if obj is a character set, #f otherwise.

Scheme Procedure: char-set= char_set …
C Function: scm_char_set_eq (char_sets)

Return #t if all given character sets are equal.

Scheme Procedure: char-set<= char_set …
C Function: scm_char_set_leq (char_sets)

Return #t if every character set char_set[i] is a subset of the following character set char_set[i+1].

Scheme Procedure: char-set-hash cs [bound]
C Function: scm_char_set_hash (cs, bound)

Compute a hash value for the character set cs. If bound is given and non-zero, it restricts the returned value to the range 0 … bound - 1.
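
A short sketch of these predicates. It uses string->char-set and ucs-range->char-set, both described later in this chapter, to build the sets; the names vowels and letters are arbitrary.

```scheme
(define vowels  (string->char-set "aeiou"))
(define letters (ucs-range->char-set 97 123))   ; code points for a through z

(char-set? vowels)                               ; ⇒ #t
(char-set? "aeiou")                              ; ⇒ #f

;; Equality ignores the order in which members were given.
(char-set= vowels (char-set #\u #\o #\i #\e #\a))  ; ⇒ #t

;; Subset tests read left to right.
(char-set<= vowels letters)                      ; ⇒ #t
(char-set<= letters vowels)                      ; ⇒ #f

;; With a bound, the hash is an integer from 0 to bound - 1.
(char-set-hash vowels 100)
```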


6.6.4.2 Iterating Over Character Sets

Character set cursors are a means for iterating over the members of a character set. After creating a character set cursor with char-set-cursor, a cursor can be dereferenced with char-set-ref and advanced to the next member with char-set-cursor-next. Whether a cursor has moved past the last element of the set can be checked with end-of-char-set?.

Additionally, mapping and (un-)folding procedures for character sets are provided.

Scheme Procedure: char-set-cursor cs
C Function: scm_char_set_cursor (cs)

Return a cursor into the character set cs.

Scheme Procedure: char-set-ref cs cursor
C Function: scm_char_set_ref (cs, cursor)

Return the character at the current cursor position cursor in the character set cs. It is an error to pass a cursor for which end-of-char-set? returns true.

Scheme Procedure: char-set-cursor-next cs cursor
C Function: scm_char_set_cursor_next (cs, cursor)

Advance the character set cursor cursor to the next character in the character set cs. It is an error if the cursor given satisfies end-of-char-set?.

Scheme Procedure: end-of-char-set? cursor
C Function: scm_end_of_char_set_p (cursor)

Return #t if cursor has reached the end of a character set, #f otherwise.
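
The four cursor procedures combine into a simple loop. The sketch below collects the members of a set into a list; char-set-members is a hypothetical name, and the order in which members are visited is not stated in the text above, so the result is sorted before use.

```scheme
;; Hypothetical helper: walk CS with a cursor and collect its members.
(define (char-set-members cs)
  (let loop ((cursor (char-set-cursor cs))
             (acc '()))
    (if (end-of-char-set? cursor)
        (reverse acc)
        (loop (char-set-cursor-next cs cursor)
              (cons (char-set-ref cs cursor) acc)))))

(sort (char-set-members (string->char-set "cab")) char<?)
;; ⇒ (#\a #\b #\c)
```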

Scheme Procedure: char-set-fold kons knil cs
C Function: scm_char_set_fold (kons, knil, cs)

Fold the procedure kons over the character set cs, initializing it with knil.

Scheme Procedure: char-set-unfold p f g seed [base_cs]
C Function: scm_char_set_unfold (p, f, g, seed, base_cs)

This is a fundamental constructor for character sets.
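
The roles of the arguments are not spelled out above. The following sketch assumes the SRFI-14 conventions: g produces the next seed from the current one, p returns true when generation should stop, and f maps each seed to a character that is added to the result. The name digits is arbitrary.

```scheme
;; Build the set of decimal digits from the seeds 0, 1, ..., 9.
(define digits
  (char-set-unfold (lambda (n) (> n 9))                   ; p: stop after seed 9
                   (lambda (n) (integer->char (+ n 48)))  ; f: seed -> character
                   1+                                     ; g: next seed
                   0))                                    ; initial seed

(char-set= digits (string->char-set "0123456789"))  ; ⇒ #t
```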

Scheme Procedure: char-set-unfold! p f g seed base_cs
C Function: scm_char_set_unfold_x (p, f, g, seed, base_cs)

This is a fundamental constructor for character sets.

Scheme Procedure: char-set-for-each proc cs
C Function: scm_char_set_for_each (proc, cs)

Apply proc to every character in the character set cs. The return value is not specified.

Scheme Procedure: char-set-map proc cs
C Function: scm_char_set_map (proc, cs)

Map the procedure proc over every character in cs. proc must be a character -> character procedure.
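
A minimal sketch of folding and mapping. It assumes, as in SRFI-14, that kons is called with a character and the accumulated value, in that order; the name cs is arbitrary.

```scheme
(define cs (string->char-set "abc"))

;; Count the members by folding; the character itself is ignored.
(char-set-fold (lambda (c count) (+ count 1)) 0 cs)   ; ⇒ 3

;; Sum the code points of the members: 97 + 98 + 99.
(char-set-fold (lambda (c sum) (+ sum (char->integer c))) 0 cs)  ; ⇒ 294

;; Map a character -> character procedure over the set.
(char-set= (char-set-map char-upcase cs)
           (string->char-set "ABC"))                  ; ⇒ #t
```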


6.6.4.3 Creating Character Sets

New character sets are produced with these procedures.

Scheme Procedure: char-set-copy cs
C Function: scm_char_set_copy (cs)

Return a newly allocated character set containing all characters in cs.

Scheme Procedure: char-set chr …
C Function: scm_char_set (chrs)

Return a character set containing all given characters.

Scheme Procedure: list->char-set list [base_cs]
C Function: scm_list_to_char_set (list, base_cs)

Convert the character list list to a character set. If the character set base_cs is given, the characters in this set are also included in the result.

Scheme Procedure: list->char-set! list base_cs
C Function: scm_list_to_char_set_x (list, base_cs)

Convert the character list list to a character set. The characters are added to base_cs and base_cs is returned.

Scheme Procedure: string->char-set str [base_cs]
C Function: scm_string_to_char_set (str, base_cs)

Convert the string str to a character set. If the character set base_cs is given, the characters in this set are also included in the result.

Scheme Procedure: string->char-set! str base_cs
C Function: scm_string_to_char_set_x (str, base_cs)

Convert the string str to a character set. The characters from the string are added to base_cs, and base_cs is returned.

Scheme Procedure: char-set-filter pred cs [base_cs]
C Function: scm_char_set_filter (pred, cs, base_cs)

Return a character set containing every character from cs that satisfies pred. If provided, the characters from base_cs are added to the result.

Scheme Procedure: char-set-filter! pred cs base_cs
C Function: scm_char_set_filter_x (pred, cs, base_cs)

Return a character set containing every character from cs that satisfies pred. The characters are added to base_cs and base_cs is returned.

Scheme Procedure: ucs-range->char-set lower upper [error [base_cs]]
C Function: scm_ucs_range_to_char_set (lower, upper, error, base_cs)

Return a character set containing all characters whose character codes lie in the half-open range [lower,upper).

If error is a true value, an error is signalled if the specified range contains characters which are not contained in the implemented character range. If error is #f, these characters are silently left out of the resulting character set.

The characters in base_cs are added to the result, if given.

Scheme Procedure: ucs-range->char-set! lower upper error base_cs
C Function: scm_ucs_range_to_char_set_x (lower, upper, error, base_cs)

Return a character set containing all characters whose character codes lie in the half-open range [lower,upper).

If error is a true value, an error is signalled if the specified range contains characters which are not contained in the implemented character range. If error is #f, these characters are silently left out of the resulting character set.

The characters are added to base_cs and base_cs is returned.

Scheme Procedure: ->char-set x
C Function: scm_to_char_set (x)

Coerces x into a char-set. x may be a string, character or char-set. A string is converted to the set of its constituent characters; a character is converted to a singleton set; a char-set is returned as-is.
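As an illustration, the constructors above can be combined as in the following sketch. The variable names are chosen only for this example, and the sizes are reported with char-set-size so that the result does not depend on element order.

```scheme
;; Build character sets from a list, a string, a predicate and a
;; code-point range.  All variable names here are illustrative.
(define from-list   (list->char-set '(#\a #\b) (string->char-set "bc")))
(define from-string (string->char-set "hello"))
(define uppers      (char-set-filter char-upper-case?
                                     (string->char-set "Hello World")))
(define a-to-c      (ucs-range->char-set 97 100)) ; code points 97, 98, 99

(char-set-size from-list)         ; ⇒ 3  (a, b, c)
(char-set-size from-string)       ; ⇒ 4  (h, e, l, o)
(char-set-size uppers)            ; ⇒ 2  (H, W)
(char-set-size a-to-c)            ; ⇒ 3  (the upper bound 100 is excluded)
(char-set-size (->char-set #\x))  ; ⇒ 1  (a singleton set)
```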



6.6.4.4 Querying Character Sets

Access the elements and other information of a character set with these procedures.

Scheme Procedure: %char-set-dump cs

Returns an association list containing debugging information for cs. The association list has the following entries.

char-set

The char-set itself

len

The number of groups of contiguous code points the char-set contains

ranges

A list of lists where each sublist is a range of code points and their associated characters

The return value of this function cannot be relied upon to be consistent between versions of Guile and should not be used in code.

Scheme Procedure: char-set-size cs
C Function: scm_char_set_size (cs)

Return the number of elements in character set cs.

Scheme Procedure: char-set-count pred cs
C Function: scm_char_set_count (pred, cs)

Return the number of elements in the character set cs which satisfy the predicate pred.

Scheme Procedure: char-set->list cs
C Function: scm_char_set_to_list (cs)

Return a list containing the elements of the character set cs.

Scheme Procedure: char-set->string cs
C Function: scm_char_set_to_string (cs)

Return a string containing the elements of the character set cs. The order in which the characters are placed in the string is not defined.

Scheme Procedure: char-set-contains? cs ch
C Function: scm_char_set_contains_p (cs, ch)

Return #t if the character ch is contained in the character set cs, or #f otherwise.

Scheme Procedure: char-set-every pred cs
C Function: scm_char_set_every (pred, cs)

Return a true value if every character in the character set cs satisfies the predicate pred.

Scheme Procedure: char-set-any pred cs
C Function: scm_char_set_any (pred, cs)

Return a true value if any character in the character set cs satisfies the predicate pred.
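A short sketch of the query procedures, using a small set named vowels that exists only for this example:

```scheme
;; Query a small character set.  `vowels' is an illustrative name.
(define vowels (string->char-set "aeiou"))

(char-set-size vowels)                   ; ⇒ 5
(char-set-contains? vowels #\e)          ; ⇒ #t
(char-set-contains? vowels #\z)          ; ⇒ #f
(char-set-count char-alphabetic? vowels) ; ⇒ 5
(char-set-every char-lower-case? vowels) ; ⇒ #t
(char-set-any char-numeric? vowels)      ; ⇒ #f
(length (char-set->list vowels))         ; ⇒ 5
(string-length (char-set->string vowels)) ; ⇒ 5  (character order is not defined)
```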



6.6.4.5 Character-Set Algebra

Character sets can be manipulated with the common set algebra operations, such as union, complement and intersection. All of these procedures provide side-effecting variants, which modify their character set argument(s).

Scheme Procedure: char-set-adjoin cs chr …
C Function: scm_char_set_adjoin (cs, chrs)

Add all character arguments to the first argument, which must be a character set.

Scheme Procedure: char-set-delete cs chr …
C Function: scm_char_set_delete (cs, chrs)

Delete all character arguments from the first argument, which must be a character set.

Scheme Procedure: char-set-adjoin! cs chr …
C Function: scm_char_set_adjoin_x (cs, chrs)

Add all character arguments to the first argument, which must be a character set.

Scheme Procedure: char-set-delete! cs chr …
C Function: scm_char_set_delete_x (cs, chrs)

Delete all character arguments from the first argument, which must be a character set.

Scheme Procedure: char-set-complement cs
C Function: scm_char_set_complement (cs)

Return the complement of the character set cs.

Note that the complement of a character set is likely to contain many reserved code points (code points that are not associated with characters). It may be helpful to modify the output of char-set-complement by computing its intersection with the set of designated code points, char-set:designated.
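The suggestion above might be sketched as follows; non-digits is a name used only in this example.

```scheme
;; Complement of the digits, restricted to designated code points so
;; that reserved code points are left out.  `non-digits' is illustrative.
(define non-digits
  (char-set-intersection (char-set-complement char-set:digit)
                         char-set:designated))

(char-set-contains? non-digits #\a) ; ⇒ #t
(char-set-contains? non-digits #\5) ; ⇒ #f
```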

Scheme Procedure: char-set-union cs …
C Function: scm_char_set_union (char_sets)

Return the union of all argument character sets.

Scheme Procedure: char-set-intersection cs …
C Function: scm_char_set_intersection (char_sets)

Return the intersection of all argument character sets.

Scheme Procedure: char-set-difference cs1 cs …
C Function: scm_char_set_difference (cs1, char_sets)

Return the difference of all argument character sets.

Scheme Procedure: char-set-xor cs …
C Function: scm_char_set_xor (char_sets)

Return the exclusive-or of all argument character sets.

Scheme Procedure: char-set-diff+intersection cs1 cs …
C Function: scm_char_set_diff_plus_intersection (cs1, char_sets)

Return the difference and the intersection of all argument character sets.

Scheme Procedure: char-set-complement! cs
C Function: scm_char_set_complement_x (cs)

Return the complement of the character set cs.

Scheme Procedure: char-set-union! cs1 cs …
C Function: scm_char_set_union_x (cs1, char_sets)

Return the union of all argument character sets.

Scheme Procedure: char-set-intersection! cs1 cs …
C Function: scm_char_set_intersection_x (cs1, char_sets)

Return the intersection of all argument character sets.

Scheme Procedure: char-set-difference! cs1 cs …
C Function: scm_char_set_difference_x (cs1, char_sets)

Return the difference of all argument character sets.

Scheme Procedure: char-set-xor! cs1 cs …
C Function: scm_char_set_xor_x (cs1, char_sets)

Return the exclusive-or of all argument character sets.

Scheme Procedure: char-set-diff+intersection! cs1 cs2 cs …
C Function: scm_char_set_diff_plus_intersection_x (cs1, cs2, char_sets)

Return the difference and the intersection of all argument character sets.
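The following sketch exercises the non-destructive algebra procedures on two small sets. The names a and b are illustrative, and results are compared with char-set= so that no element order is assumed.

```scheme
;; Set algebra on two small, overlapping character sets.
(define a (string->char-set "abc"))
(define b (string->char-set "bcd"))

(char-set= (char-set-union a b)        (string->char-set "abcd")) ; ⇒ #t
(char-set= (char-set-intersection a b) (string->char-set "bc"))   ; ⇒ #t
(char-set= (char-set-difference a b)   (char-set #\a))            ; ⇒ #t
(char-set= (char-set-xor a b)          (string->char-set "ad"))   ; ⇒ #t

;; char-set-diff+intersection returns two values.
(call-with-values
    (lambda () (char-set-diff+intersection a b))
  (lambda (diff inter)
    (list (char-set-size diff) (char-set-size inter)))) ; ⇒ (1 2)

;; char-set-adjoin returns a new set; `a' itself keeps its three elements.
(char-set-size (char-set-adjoin a #\z)) ; ⇒ 4
(char-set-size a)                       ; ⇒ 3
```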



6.6.4.6 Standard Character Sets

To make the character set data type and its procedures convenient to use, several predefined character set variables exist.

These character sets are locale independent and are not recomputed upon a setlocale call. They contain characters from the whole range of Unicode code points. For instance, char-set:letter contains about 100,000 characters.

Scheme Variable: char-set:lower-case
C Variable: scm_char_set_lower_case

All lower-case characters.

Scheme Variable: char-set:upper-case
C Variable: scm_char_set_upper_case

All upper-case characters.

Scheme Variable: char-set:title-case
C Variable: scm_char_set_title_case

All single characters that function as if they were an upper-case letter followed by a lower-case letter.

Scheme Variable: char-set:letter
C Variable: scm_char_set_letter

All letters. This includes char-set:lower-case, char-set:upper-case, char-set:title-case, and many letters that have no case at all. For example, Chinese and Japanese characters typically have no concept of case.

Scheme Variable: char-set:digit
C Variable: scm_char_set_digit

All digits.

Scheme Variable: char-set:letter+digit
C Variable: scm_char_set_letter_and_digit

The union of char-set:letter and char-set:digit.

Scheme Variable: char-set:graphic
C Variable: scm_char_set_graphic

All characters which would put ink on the paper.

Scheme Variable: char-set:printing
C Variable: scm_char_set_printing

The union of char-set:graphic and char-set:whitespace.

Scheme Variable: char-set:whitespace
C Variable: scm_char_set_whitespace

All whitespace characters.

Scheme Variable: char-set:blank
C Variable: scm_char_set_blank

All horizontal whitespace characters, which notably includes #\space and #\tab.

Scheme Variable: char-set:iso-control
C Variable: scm_char_set_iso_control

The ISO control characters are the C0 control characters (U+0000 to U+001F), delete (U+007F), and the C1 control characters (U+0080 to U+009F).

Scheme Variable: char-set:punctuation
C Variable: scm_char_set_punctuation

All punctuation characters, such as the characters !"#%&'()*,-./:;?@[\\]_{}

Scheme Variable: char-set:symbol
C Variable: scm_char_set_symbol

All symbol characters, such as the characters $+<=>^`|~.

Scheme Variable: char-set:hex-digit
C Variable: scm_char_set_hex_digit

The hexadecimal digits 0123456789abcdefABCDEF.

Scheme Variable: char-set:ascii
C Variable: scm_char_set_ascii

All ASCII characters.

Scheme Variable: char-set:empty
C Variable: scm_char_set_empty

The empty character set.

Scheme Variable: char-set:designated
C Variable: scm_char_set_designated

This character set contains all designated code points. This includes all the code points to which Unicode has assigned a character or other meaning.

Scheme Variable: char-set:full
C Variable: scm_char_set_full

This character set contains all possible code points. This includes both designated and reserved code points.
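A few membership checks against the predefined sets, as a minimal sketch. The CJK example uses U+4E2D, chosen here only as a sample of a letter without case.

```scheme
;; Membership tests against the standard character sets.
(char-set-contains? char-set:hex-digit #\F)        ; ⇒ #t
(char-set-contains? char-set:blank #\tab)          ; ⇒ #t
(char-set-contains? char-set:blank #\newline)      ; ⇒ #f  (not horizontal)
(char-set-contains? char-set:whitespace #\newline) ; ⇒ #t
(char-set-contains? char-set:punctuation #\!)      ; ⇒ #t
(char-set-contains? char-set:symbol #\+)           ; ⇒ #t

;; A letter that has no case: in char-set:letter, but neither lower- nor
;; upper-case.
(define zhong (integer->char #x4e2d))
(char-set-contains? char-set:letter zhong)         ; ⇒ #t
(char-set-contains? char-set:lower-case zhong)     ; ⇒ #f
(char-set-contains? char-set:upper-case zhong)     ; ⇒ #f

(char-set-size char-set:ascii)                     ; ⇒ 128
(char-set-size char-set:empty)                     ; ⇒ 0
```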



6.6.5 Strings

Strings are fixed-length sequences of characters. They can be created by calling constructor procedures, but they can also be entered literally at the REPL or in Scheme source files.

Strings always carry the information about how many characters they are composed of with them, so there is no special end-of-string character, like in C. That means that Scheme strings can contain any character, even the ‘#\nul’ character ‘\0’.

To use strings efficiently, you need to know a bit about how Guile implements them. In Guile, a string consists of two parts, a head and the actual memory where the characters are stored. When a string (or a substring of it) is copied, only a new head gets created; the memory is usually not copied. The two heads start out pointing to the same memory.

When one of these two strings is modified, as with string-set!, their common memory does get copied so that each string has its own memory and modifying one does not accidentally modify the other as well. Thus, Guile’s strings are ‘copy on write’; the actual copying of their memory is delayed until one string is written to.

This implementation makes functions like substring very efficient in the common case that no modifications are done to the involved strings.

If you do know that your strings are getting modified right away, you can use substring/copy instead of substring. This function performs the copy immediately at the time of creation. This is more efficient, especially in a multi-threaded program. Also, substring/copy can avoid the problem that a short substring holds on to the memory of a very large original string that could otherwise be recycled.

If you want to avoid the copy altogether, so that modifications of one string show up in the other, you can use substring/shared. The strings created by this procedure are called mutation sharing substrings since the substring and the original string share modifications to each other.

If you want to prevent modifications, use substring/read-only.
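The difference between these variants can be observed with a sketch like the following. All variable names are illustrative, and string-copy is used to obtain strings that are safe to modify.

```scheme
;; substring: copy on write, so the original is unaffected by a later write.
(define s1  (string-copy "hello world"))
(define cow (substring s1 0 5))
(string-set! cow 0 #\J)
cow  ; ⇒ "Jello"
s1   ; ⇒ "hello world"

;; substring/shared: modifications are visible through both strings.
(define s2     (string-copy "hello world"))
(define shared (substring/shared s2 6 11))
(string-set! shared 0 #\W)
shared ; ⇒ "World"
s2     ; ⇒ "hello World"

;; substring/read-only: an attempt to modify the result signals an error,
;; which is caught here and turned into a symbol.
(define ro (substring/read-only (string-copy "hello") 0 2))
(define ro-attempt
  (catch #t
    (lambda () (string-set! ro 0 #\J) 'modified)
    (lambda (key . args) 'rejected)))
ro-attempt ; ⇒ rejected
```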

Guile provides all procedures of SRFI-13 and a few more.



6.6.5.1 String Read Syntax

The read syntax for strings is an arbitrarily long sequence of characters enclosed in double quotes (").

Backslash is an escape character and can be used to insert the following special characters. \" and \\ are R5RS standard, \| is R7RS standard, the next seven are R6RS standard — notice they follow C syntax — and the remaining four are Guile extensions.

\\

Backslash character.

\"

Double quote character (an unescaped " is otherwise the end of the string).

\|

Vertical bar character.

\a

Bell character (ASCII 7).

\f

Formfeed character (ASCII 12).

\n

Newline character (ASCII 10).

\r

Carriage return character (ASCII 13).

\t

Tab character (ASCII 9).

\v

Vertical tab character (ASCII 11).

\b

Backspace character (ASCII 8).

\0

NUL character (ASCII 0).

\ followed by newline (ASCII 10)

Nothing. This way if \ is the last character in a line, the string will continue with the first character from the next line, without a line break.

If the hungry-eol-escapes reader option is enabled, which is not the case by default, leading whitespace on the next line is discarded.

"foo\
  bar"
⇒ "foo  bar"
(read-enable 'hungry-eol-escapes)
"foo\
  bar"
⇒ "foobar"

\xHH

Character code given by two hexadecimal digits. For example \x7f for an ASCII DEL (127).

\uHHHH

Character code given by four hexadecimal digits. For example \u0100 for a capital A with macron (U+0100).

\UHHHHHH

Character code given by six hexadecimal digits. For example \U010402.
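The effect of several of these escapes can be checked at the REPL. The sketch below assumes the default reader options, that is, with neither r6rs-hex-escapes nor hungry-eol-escapes enabled.

```scheme
;; Each escape sequence contributes exactly one character (or none, for
;; backslash-newline) to the string.
(string-length "a\tb")                     ; ⇒ 3
(string-length "\"Hi\"")                   ; ⇒ 4
(string-ref "\x41" 0)                      ; ⇒ #\A
(char->integer (string-ref "\u0100" 0))    ; ⇒ 256    (#x100)
(char->integer (string-ref "\U010402" 0))  ; ⇒ 66562  (#x10402)

;; A backslash at the end of a line joins it with the next line.
(string=? "foo\
bar" "foobar")                             ; ⇒ #t
```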

The following are examples of string literals:

"foo"
"bar plonk"
"Hello World"
"\"Hi\", he said."

The three escape sequences \xHH, \uHHHH and \UHHHHHH were chosen to not break compatibility with code written for previous versions of Guile. The R6RS specification suggests a different, incompatible syntax for hex escapes: \xHHHH; (that is, \x followed by one to eight hexadecimal digits, terminated with a semicolon). If this escape format is desired instead, it can be enabled with the reader option r6rs-hex-escapes.

(read-enable 'r6rs-hex-escapes)

For more on reader options, see Scheme Read.



6.6.5.2 String Predicates

The following procedures can be used to check whether a given string fulfills some specified property.

Scheme Procedure: string? obj
C Function: scm_string_p (obj)

Return #t if obj is a string, else #f.

C Function: int scm_is_string (SCM obj)

Returns 1 if obj is a string, 0 otherwise.

Scheme Procedure: string-null? str
C Function: scm_string_null_p (str)

Return #t if str’s length is zero, and #f otherwise.

(string-null? "")  ⇒ #t
y                  ⇒ "foo"
(string-null? y)   ⇒ #f

Scheme Procedure: string-any char_pred s [start [end]]
C Function: scm_string_any (char_pred, s, start, end)

Check if char_pred is true for any character in string s.

char_pred can be a character to check for any equal to that, or a character set (see Character Sets) to check for any in that set, or a predicate procedure to call.

For a procedure, calls (char_pred c) are made successively on the characters from start to end. If char_pred returns true (i.e. non-#f), string-any stops and that return value is the return from string-any. The call on the last character (i.e. at end-1), if that point is reached, is a tail call.

If there are no characters in s (i.e. start equals end) then the return is #f.

Scheme Procedure: string-every char_pred s [start [end]]
C Function: scm_string_every (char_pred, s, start, end)

Check if char_pred is true for every character in string s.

char_pred can be a character to check for every character equal to that, or a character set (see Character Sets) to check for every character being in that set, or a predicate procedure to call.

For a procedure, calls (char_pred c) are made successively on the characters from start to end. If char_pred returns #f, string-every stops and returns #f. The call on the last character (i.e. at end-1), if that point is reached, is a tail call and the return from that call is the return from string-every.

If there are no characters in s (i.e. start equals end) then the return is #t.
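For illustration, here are the three kinds of char_pred in use, together with the empty-range cases described above:

```scheme
;; char_pred as a character, a character set, and a procedure.
(string-any #\a "banana")               ; ⇒ #t
(string-any char-set:digit "abc")       ; ⇒ #f
(string-every char-alphabetic? "hello") ; ⇒ #t
(string-every char-set:digit "12a")     ; ⇒ #f

;; With a procedure, string-any returns the first true value it produces.
(string-any (lambda (c) (and (char-numeric? c) c)) "ab7c9") ; ⇒ #\7

;; Empty strings and ranges.
(string-any char-numeric? "")           ; ⇒ #f
(string-every char-numeric? "")         ; ⇒ #t
(string-any char-numeric? "1abc" 1)     ; ⇒ #f  (the search starts at index 1)
```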



6.6.5.3 String Constructors

The string constructor procedures create new string objects, possibly initializing them with some specified character data. See also String Selection, for ways to create strings from existing strings.

Scheme Procedure: string char…

Return a newly allocated string made from the given character arguments.

(string #\x #\y #\z) ⇒ "xyz"
(string)             ⇒ ""

Scheme Procedure: list->string lst
C Function: scm_string (lst)

Return a newly allocated string made from a list of characters.

(list->string '(#\a #\b #\c)) ⇒ "abc"

Scheme Procedure: reverse-list->string lst
C Function: scm_reverse_list_to_string (lst)

Return a newly allocated string made from a list of characters, in reverse order.

(reverse-list->string '(#\a #\B #\c)) ⇒ "cBa"

Scheme Procedure: make-string k [chr]
C Function: scm_make_string (k, chr)

Return a newly allocated string of length k. If chr is given, then all elements of the string are initialized to chr, otherwise the contents of the string are unspecified.

C Function: SCM scm_c_make_string (size_t len, SCM chr)

Like scm_make_string, but expects the length as a size_t.

Scheme Procedure: string-tabulate proc len
C Function: scm_string_tabulate (proc, len)

proc is an integer->char procedure. Construct a string of size len by applying proc to each index to produce the corresponding string element. The order in which proc is applied to the indices is not specified.
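A minimal sketch of make-string and string-tabulate. The procedure passed to string-tabulate is written only for this example and maps index 0 to #\a, 1 to #\b, and so on.

```scheme
;; A string of three identical characters.
(make-string 3 #\*) ; ⇒ "***"

;; Build a string from its indices.
(string-tabulate (lambda (i)
                   (integer->char (+ i (char->integer #\a))))
                 5) ; ⇒ "abcde"
```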

Scheme Procedure: string-join ls [delimiter [grammar]]
C Function: scm_string_join (ls, delimiter, grammar)

Append the strings in the string list ls, using the string delimiter as a delimiter between the elements of ls. grammar is a symbol which specifies how the delimiter is placed between the strings, and defaults to the symbol infix.

infix

Insert the separator between list elements. An empty list will produce an empty string.

strict-infix

Like infix, but will raise an error if given the empty list.

suffix

Insert the separator after every list element.

prefix

Insert the separator before each list element.
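The grammars can be compared side by side, as in this sketch. The list parts is illustrative, and the strict-infix error is caught and turned into a symbol so that the example runs to completion.

```scheme
(define parts '("foo" "bar" "baz"))

(string-join parts ":")          ; ⇒ "foo:bar:baz"   (infix is the default)
(string-join parts ":" 'suffix)  ; ⇒ "foo:bar:baz:"
(string-join parts ":" 'prefix)  ; ⇒ ":foo:bar:baz"
(string-join '() ":")            ; ⇒ ""

;; strict-infix rejects the empty list.
(define strict-result
  (catch #t
    (lambda () (string-join '() ":" 'strict-infix))
    (lambda (key . args) 'error)))
strict-result                    ; ⇒ error
```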



6.6.5.4 List/String conversion

When processing strings, it is often convenient to first convert them into a list representation by using the procedure string->list, work with the resulting list, and then convert it back into a string. These procedures are useful for similar tasks.

Scheme Procedure: string->list str [start [end]]
C Function: scm_substring_to_list (str, start, end)
C Function: scm_string_to_list (str)

Convert the string str into a list of characters.

Scheme Procedure: string-split str char_pred
C Function: scm_string_split (str, char_pred)

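Split the string str into a list of substrings delimited by appearances of characters that equal char_pred, if it is a character; that satisfy the predicate char_pred, if it is a procedure; or that are in the set char_pred, if it is a character set.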

Note that an empty substring between separator characters will result in an empty string in the result list.

(string-split "root:x:0:0:root:/root:/bin/bash" #\:)
⇒
("root" "x" "0" "0" "root" "/root" "/bin/bash")

(string-split "::" #\:)
⇒
("" "" "")

(string-split "" #\:)
⇒
("")
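As a further sketch, and assuming that string-split accepts a character set or a predicate for char_pred in the same way string-any does, splitting and list conversion can be combined like this:

```scheme
;; Splitting on a character set and on a predicate (see the assumption
;; stated above).
(string-split "a b\tc" char-set:whitespace)             ; ⇒ ("a" "b" "c")
(string-split "2014-03-20" (lambda (c) (char=? c #\-))) ; ⇒ ("2014" "03" "20")

;; Converting to a list, optionally for a sub-range, and back again.
(string->list "abc")                          ; ⇒ (#\a #\b #\c)
(string->list "abcdef" 2 4)                   ; ⇒ (#\c #\d)
(list->string (reverse (string->list "abc"))) ; ⇒ "cba"
```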


6.6.5.5 String Selection

Portions of strings can be extracted by these procedures. string-ref delivers individual characters whereas substring can be used to extract substrings from longer strings.

Scheme Procedure: string-length string
C Function: scm_string_length (string)

Return the number of characters in string.

C Function: size_t scm_c_string_length (SCM str)

Return the number of characters in str as a size_t.

Scheme Procedure: string-ref str k
C Function: scm_string_ref (str, k)

Return character k of str using zero-origin indexing. k must be a valid index of str.

C Function: SCM scm_c_string_ref (SCM str, size_t k)

Return character k of str using zero-origin indexing. k must be a valid index of str.

Scheme Procedure: string-copy str [start [end]]
C Function: scm_substring_copy (str, start, end)
C Function: scm_string_copy (str)

Return a copy of the given string str.

The returned string shares storage with str initially, but it is copied as soon as one of the two strings is modified.

Scheme Procedure: substring str start [end]
C Function: scm_substring (str, start, end)

Return a new string formed from the characters of str beginning with index start (inclusive) and ending with index end (exclusive). str must be a string, start and end must be exact integers satisfying:

0 <= start <= end <= (string-length str).

The returned string shares storage with str initially, but it is copied as soon as one of the two strings is modified.

Scheme Procedure: substring/shared str start [end]
C Function: scm_substring_shared (str, start, end)

Like substring, but the strings continue to share their storage even if they are modified. Thus, modifications to str show up in the new string, and vice versa.

Scheme Procedure: substring/copy str start [end]
C Function: scm_substring_copy (str, start, end)

Like substring, but the storage for the new string is copied immediately.

Scheme Procedure: substring/read-only str start [end]
C Function: scm_substring_read_only (str, start, end)

Like substring, but the resulting string can not be modified.

C Function: SCM scm_c_substring (SCM str, size_t start, size_t end)
C Function: SCM scm_c_substring_shared (SCM str, size_t start, size_t end)
C Function: SCM scm_c_substring_copy (SCM str, size_t start, size_t end)
C Function: SCM scm_c_substring_read_only (SCM str, size_t start, size_t end)

Like scm_substring, etc. but the bounds are given as a size_t.

Scheme Procedure: string-take s n
C Function: scm_string_take (s, n)

Return the first n characters of s.

Scheme Procedure: string-drop s n
C Function: scm_string_drop (s, n)

Return all but the first n characters of s.

Scheme Procedure: string-take-right s n
C Function: scm_string_take_right (s, n)

Return the last n characters of s.

Scheme Procedure: string-drop-right s n
C Function: scm_string_drop_right (s, n)

Return all but the last n characters of s.
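The four procedures complement each other, as this short example shows:

```scheme
(string-take "hello" 2)       ; ⇒ "he"
(string-drop "hello" 2)       ; ⇒ "llo"
(string-take-right "hello" 2) ; ⇒ "lo"
(string-drop-right "hello" 2) ; ⇒ "hel"

;; Taking and dropping the same count splits a string in two.
(string-append (string-take "hello" 2) (string-drop "hello" 2)) ; ⇒ "hello"
```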

Scheme Procedure: string-pad s len [chr [start [end]]]
Scheme Procedure: string-pad-right s len [chr [start [end]]]
C Function: scm_string_pad (s, len, chr, start, end)
C Function: scm_string_pad_right (s, len, chr, start, end)

Take characters start to end from the string s and either pad with chr or truncate them to give len characters.

string-pad pads or truncates on the left, so for example

(string-pad "x" 3)     ⇒ "  x"
(string-pad "abcde" 3) ⇒ "cde"

string-pad-right pads or truncates on the right, so for example

(string-pad-right "x" 3)     ⇒ "x  "
(string-pad-right "abcde" 3) ⇒ "abc"

Scheme Procedure: string-trim s [char_pred [start [end]]]
Scheme Procedure: string-trim-right s [char_pred [start [end]]]
Scheme Procedure: string-trim-both s [char_pred [start [end]]]
C Function: scm_string_trim (s, char_pred, start, end)
C Function: scm_string_trim_right (s, char_pred, start, end)
C Function: scm_string_trim_both (s, char_pred, start, end)

Trim occurrences of char_pred from the ends of s.

string-trim trims char_pred characters from the left (start) of the string, string-trim-right trims them from the right (end) of the string, string-trim-both trims from both ends.

char_pred can be a character, a character set, or a predicate procedure to call on each character. If char_pred is not given the default is whitespace as per char-set:whitespace (see Standard Character Sets).

(string-trim " x ")              ⇒ "x "
(string-trim-right "banana" #\a) ⇒ "banan"
(string-trim-both ".,xy:;" char-set:punctuation)
                  ⇒ "xy"
(string-trim-both "xyzzy" (lambda (c)
                             (or (eqv? c #\x)
                                 (eqv? c #\y))))
                  ⇒ "zz"


6.6.5.6 String Modification

These procedures are for modifying strings in-place. This means that the result of the operation is not a new string; instead, the original string’s memory representation is modified.

Scheme Procedure: string-set! str k chr
C Function: scm_string_set_x (str, k, chr)

Store chr in element k of str and return an unspecified value. k must be a valid index of str.

C Function: void scm_c_string_set_x (SCM str, size_t k, SCM chr)

Like scm_string_set_x, but the index is given as a size_t.

Scheme Procedure: string-fill! str chr [start [end]]
C Function: scm_substring_fill_x (str, chr, start, end)
C Function: scm_string_fill_x (str, chr)

Stores chr in every element of the given str and returns an unspecified value.

Scheme Procedure: substring-fill! str start end fill
C Function: scm_substring_fill_x (str, start, end, fill)

Change every character in str between start and end to fill.

(define y (string-copy "abcdefg"))
(substring-fill! y 1 3 #\r)
y
⇒ "arrdefg"

Scheme Procedure: substring-move! str1 start1 end1 str2 start2
C Function: scm_substring_move_x (str1, start1, end1, str2, start2)

Copy the substring of str1 bounded by start1 and end1 into str2 beginning at position start2. str1 and str2 can be the same string.

Scheme Procedure: string-copy! target tstart s [start [end]]
C Function: scm_string_copy_x (target, tstart, s, start, end)

Copy the sequence of characters from index range [start, end) in string s to string target, beginning at index tstart. The characters are copied left-to-right or right-to-left as needed – the copy is guaranteed to work, even if target and s are the same string. It is an error if the copy operation runs off the end of the target string.
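The in-place procedures can be tried on freshly allocated strings, as sketched below. The names target, overlap and dashes are illustrative, and string-copy and make-string are used so that the strings are safe to modify.

```scheme
;; Copy "XYZ" into `target', starting at index 2.
(define target (string-copy "abcdefg"))
(string-copy! target 2 "XYZ")
target  ; ⇒ "abXYZfg"

;; Source and target may be the same string, even with overlapping ranges.
(define overlap (string-copy "abcdef"))
(string-copy! overlap 2 overlap 0 4)
overlap ; ⇒ "ababcd"

;; Fill a sub-range, then set a single element.
(define dashes (make-string 5 #\-))
(string-fill! dashes #\z 1 3)
dashes  ; ⇒ "-zz--"
(string-set! dashes 0 #\A)
dashes  ; ⇒ "Azz--"
```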



6.6.5.7 String Comparison

The procedures in this section are similar to the character ordering predicates (see Characters), but are defined on character sequences.

The first set is specified in R5RS and has names that end in ?. The second set is specified in SRFI-13 and has names that do not end in ?.

The predicates ending in -ci ignore the character case when comparing strings. For now, case-insensitive comparison is done using the R5RS rules, where every lower-case character that has a single character upper-case form is converted to uppercase before comparison. See the (ice-9 i18n) module for locale-dependent string comparison.

Scheme Procedure: string=? s1 s2 s3 …

Lexicographic equality predicate; return #t if all strings are the same length and contain the same characters in the same positions, otherwise return #f.

The procedure string-ci=? treats upper and lower case letters as though they were the same character, but string=? treats upper and lower case as distinct characters.

Scheme Procedure: string<? s1 s2 s3 …

Lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically less than str_i+1.

Scheme Procedure: string<=? s1 s2 s3 …

Lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically less than or equal to str_i+1.

Scheme Procedure: string>? s1 s2 s3 …

Lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically greater than str_i+1.

Scheme Procedure: string>=? s1 s2 s3 …

Lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically greater than or equal to str_i+1.

Scheme Procedure: string-ci=? s1 s2 s3 …

Case-insensitive string equality predicate; return #t if all strings are the same length and their component characters match (ignoring case) at each position; otherwise return #f.

Scheme Procedure: string-ci<? s1 s2 s3 …

Case-insensitive lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically less than str_i+1 regardless of case.

Scheme Procedure: string-ci<=? s1 s2 s3 …

Case-insensitive lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically less than or equal to str_i+1 regardless of case.

Scheme Procedure: string-ci>? s1 s2 s3 …

Case-insensitive lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically greater than str_i+1 regardless of case.

Scheme Procedure: string-ci>=? s1 s2 s3 …

Case-insensitive lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically greater than or equal to str_i+1 regardless of case.
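A few calls illustrating the predicates above, including the forms that take more than two strings:

```scheme
(string=? "abc" "abc" "abc")          ; ⇒ #t
(string=? "abc" "ABC")                ; ⇒ #f
(string-ci=? "abc" "ABC")             ; ⇒ #t

;; With several arguments, every consecutive pair must be in order.
(string<? "apple" "banana" "cherry")  ; ⇒ #t
(string<? "apple" "cherry" "banana")  ; ⇒ #f
(string<=? "a" "a" "b")               ; ⇒ #t
(string>? "b" "a")                    ; ⇒ #t

;; Upper-case ASCII letters sort before lower-case ones unless case is
;; ignored.
(string<? "Zebra" "apple")            ; ⇒ #t
(string-ci<? "Zebra" "apple")         ; ⇒ #f
```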

Scheme Procedure: string-compare s1 s2 proc_lt proc_eq proc_gt [start1 [end1 [start2 [end2]]]]
C Function: scm_string_compare (s1, s2, proc_lt, proc_eq, proc_gt, start1, end1, start2, end2)

Apply proc_lt, proc_eq, proc_gt to the mismatch index, depending upon whether s1 is less than, equal to, or greater than s2. The mismatch index is the largest index i such that for every 0 <= j < i, s1[j] = s2[j] – that is, i is the first position that does not match.

Scheme Procedure: string-compare-ci s1 s2 proc_lt proc_eq proc_gt [start1 [end1 [start2 [end2]]]]
C Function: scm_string_compare_ci (s1, s2, proc_lt, proc_eq, proc_gt, start1, end1, start2, end2)

Apply proc_lt, proc_eq, proc_gt to the mismatch index, depending upon whether s1 is less than, equal to, or greater than s2. The mismatch index is the largest index i such that for every 0 <= j < i, s1[j] = s2[j] – that is, i is the first position where the lowercased letters do not match.
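As a sketch, string-compare can be wrapped in a helper that reports both the outcome and the mismatch index. The helper classify is hypothetical and exists only for this example.

```scheme
;; Return a list of the comparison outcome and the mismatch index.
(define (classify s1 s2)
  (string-compare s1 s2
                  (lambda (i) (list 'less i))
                  (lambda (i) (list 'equal i))
                  (lambda (i) (list 'greater i))))

(classify "apple" "apply")    ; ⇒ (less 4)     (#\e precedes #\y at index 4)
(classify "pear" "peach")     ; ⇒ (greater 3)  (#\r follows #\c at index 3)
(car (classify "same" "same")) ; ⇒ equal
```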

Scheme Procedure: string= s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_eq (s1, s2, start1, end1, start2, end2)

Return #f if s1 and s2 are not equal, a true value otherwise.

Scheme Procedure: string<> s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_neq (s1, s2, start1, end1, start2, end2)

Return #f if s1 and s2 are equal, a true value otherwise.

Scheme Procedure: string< s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_lt (s1, s2, start1, end1, start2, end2)

Return #f if s1 is greater than or equal to s2, a true value otherwise.

Scheme Procedure: string> s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_gt (s1, s2, start1, end1, start2, end2)

Return #f if s1 is less than or equal to s2, a true value otherwise.

Scheme Procedure: string<= s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_le (s1, s2, start1, end1, start2, end2)

Return #f if s1 is greater than s2, a true value otherwise.

Scheme Procedure: string>= s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_ge (s1, s2, start1, end1, start2, end2)

Return #f if s1 is less than s2, a true value otherwise.
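The SRFI-13 predicates accept optional index ranges for both strings. Since they are documented only as returning #f or "a true value", the sketch below wraps each call in (and … #t) to obtain a plain boolean; that wrapper is a choice made for this example.

```scheme
;; Compare "foo" (indices 0-3 of the first string) with "foo"
;; (indices 3-6 of the second).
(and (string= "foobar" "barfoo" 0 3 3 6) #t) ; ⇒ #t

(and (string< "abc" "abd") #t)   ; ⇒ #t
(and (string<> "abc" "abc") #t)  ; ⇒ #f
(and (string<= "abc" "abc") #t)  ; ⇒ #t
(and (string>= "abc" "abd") #t)  ; ⇒ #f
```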

Scheme Procedure: string-ci= s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_ci_eq (s1, s2, start1, end1, start2, end2)

Return #f if s1 and s2 are not equal, a true value otherwise. The character comparison is done case-insensitively.

Scheme Procedure: string-ci<> s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_ci_neq (s1, s2, start1, end1, start2, end2)

Return #f if s1 and s2 are equal, a true value otherwise. The character comparison is done case-insensitively.

Scheme Procedure: string-ci< s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_ci_lt (s1, s2, start1, end1, start2, end2)

Return #f if s1 is greater than or equal to s2, a true value otherwise. The character comparison is done case-insensitively.

Scheme Procedure: string-ci> s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_ci_gt (s1, s2, start1, end1, start2, end2)

Return #f if s1 is less than or equal to s2, a true value otherwise. The character comparison is done case-insensitively.

Scheme Procedure: string-ci<= s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_ci_le (s1, s2, start1, end1, start2, end2)

Return #f if s1 is greater than s2, a true value otherwise. The character comparison is done case-insensitively.

Scheme Procedure: string-ci>= s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_ci_ge (s1, s2, start1, end1, start2, end2)

Return #f if s1 is less than s2, a true value otherwise. The character comparison is done case-insensitively.

Scheme Procedure: string-hash s [bound [start [end]]]
C Function: scm_substring_hash (s, bound, start, end)

Compute a hash value for s. The optional argument bound is a non-negative exact integer specifying the range of the hash function. A positive value restricts the return value to the range [0,bound).

Scheme Procedure: string-hash-ci s [bound [start [end]]]
C Function: scm_substring_hash_ci (s, bound, start, end)

Compute a hash value for s. The optional argument bound is a non-negative exact integer specifying the range of the hash function. A positive value restricts the return value to the range [0,bound).
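A small sketch of the hash procedures follows. The actual hash values are implementation details, so only their range and consistency are checked. The last check assumes, as the name suggests, that string-hash-ci gives the same value for strings that differ only in case.

```scheme
;; With a positive bound, the hash value lies in [0, bound).
(define h (string-hash "hello" 100))
(and (>= h 0) (< h 100))                                       ; ⇒ #t

;; Equal strings hash to equal values.
(= (string-hash "hello" 100) (string-hash "hello" 100))        ; ⇒ #t

;; Assumption: case is ignored by string-hash-ci.
(= (string-hash-ci "Hello" 100) (string-hash-ci "hELLO" 100))  ; ⇒ #t
```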

Because the same visual appearance of an abstract Unicode character can be obtained via multiple sequences of Unicode characters, even the case-insensitive string comparison functions described above may return #f when presented with strings containing different representations of the same character. For example, the Unicode character “LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE” can be represented with a single character (U+1E69) or by the character “LATIN SMALL LETTER S” (U+0073) followed by the combining marks “COMBINING DOT BELOW” (U+0323) and “COMBINING DOT ABOVE” (U+0307).

For this reason, it is often desirable to ensure that the strings to be compared are using a mutually consistent representation for every character. The Unicode standard defines two methods of normalizing the contents of strings: Decomposition, which breaks composite characters into a set of constituent characters with an ordering defined by the Unicode Standard; and composition, which performs the converse.

There are two decomposition operations. “Canonical decomposition” produces character sequences that share the same visual appearance as the original characters, while “compatibility decomposition” produces ones whose visual appearances may differ from the originals but which represent the same abstract character.

These operations are encapsulated in the following set of normalization forms:

NFD

Characters are decomposed to their canonical forms.

NFKD

Characters are decomposed to their compatibility forms.

NFC

Characters are decomposed to their canonical forms, then composed.

NFKC

Characters are decomposed to their compatibility forms, then composed.

The functions below put their arguments into one of the forms described above.

Scheme Procedure: string-normalize-nfd s
C Function: scm_string_normalize_nfd (s)

Return the NFD normalized form of s.

Scheme Procedure: string-normalize-nfkd s
C Function: scm_string_normalize_nfkd (s)

Return the NFKD normalized form of s.

Scheme Procedure: string-normalize-nfc s
C Function: scm_string_normalize_nfc (s)

Return the NFC normalized form of s.

Scheme Procedure: string-normalize-nfkc s
C Function: scm_string_normalize_nfkc (s)

Return the NFKC normalized form of s.
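The effect of normalization on comparison can be sketched with the character discussed above. The strings are built with integer->char so that the example does not depend on the reader's string escape settings; the variable names are illustrative only.

```scheme
;; U+1E69 as a single precomposed character.
(define composed (string (integer->char #x1E69)))

;; The same abstract character spelled as "s" followed by
;; COMBINING DOT BELOW (U+0323) and COMBINING DOT ABOVE (U+0307).
(define decomposed
  (string #\s (integer->char #x0323) (integer->char #x0307)))

;; Without normalization the two representations compare unequal.
(define raw-equal? (string=? composed decomposed))              ; => #f

;; After normalizing both to NFC they compare equal.
(define nfc-equal?
  (string=? (string-normalize-nfc composed)
            (string-normalize-nfc decomposed)))                 ; => #t

;; NFD expands the precomposed character into three code points.
(define nfd-length
  (string-length (string-normalize-nfd composed)))              ; => 3
```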


6.6.5.8 String Searching

Scheme Procedure: string-index s char_pred [start [end]]
C Function: scm_string_index (s, char_pred, start, end)

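Search through the string s from left to right, returning the index of the first occurrence of a character which matches char_pred: one that equals char_pred if it is a character, satisfies char_pred if it is a procedure, or is a member of char_pred if it is a character set.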

Return #f if no match is found.

Scheme Procedure: string-rindex s char_pred [start [end]]
C Function: scm_string_rindex (s, char_pred, start, end)

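Search through the string s from right to left, returning the index of the last occurrence of a character which matches char_pred: one that equals char_pred if it is a character, satisfies char_pred if it is a procedure, or is a member of char_pred if it is a character set.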

Return #f if no match is found.

Scheme Procedure: string-prefix-length s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_prefix_length (s1, s2, start1, end1, start2, end2)

Return the length of the longest common prefix of the two strings.

Scheme Procedure: string-prefix-length-ci s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_prefix_length_ci (s1, s2, start1, end1, start2, end2)

Return the length of the longest common prefix of the two strings, ignoring character case.

Scheme Procedure: string-suffix-length s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_suffix_length (s1, s2, start1, end1, start2, end2)

Return the length of the longest common suffix of the two strings.

Scheme Procedure: string-suffix-length-ci s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_suffix_length_ci (s1, s2, start1, end1, start2, end2)

Return the length of the longest common suffix of the two strings, ignoring character case.

Scheme Procedure: string-prefix? s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_prefix_p (s1, s2, start1, end1, start2, end2)

Is s1 a prefix of s2?

Scheme Procedure: string-prefix-ci? s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_prefix_ci_p (s1, s2, start1, end1, start2, end2)

Is s1 a prefix of s2, ignoring character case?

Scheme Procedure: string-suffix? s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_suffix_p (s1, s2, start1, end1, start2, end2)

Is s1 a suffix of s2?

Scheme Procedure: string-suffix-ci? s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_suffix_ci_p (s1, s2, start1, end1, start2, end2)

Is s1 a suffix of s2, ignoring character case?

Scheme Procedure: string-index-right s char_pred [start [end]]
C Function: scm_string_index_right (s, char_pred, start, end)

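Search through the string s from right to left, returning the index of the last occurrence of a character which matches char_pred: one that equals char_pred if it is a character, satisfies char_pred if it is a procedure, or is a member of char_pred if it is a character set.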

Return #f if no match is found.

Scheme Procedure: string-skip s char_pred [start [end]]
C Function: scm_string_skip (s, char_pred, start, end)

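Search through the string s from left to right, returning the index of the first occurrence of a character which does not match char_pred: one that differs from char_pred if it is a character, does not satisfy char_pred if it is a procedure, or is not a member of char_pred if it is a character set.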

Scheme Procedure: string-skip-right s char_pred [start [end]]
C Function: scm_string_skip_right (s, char_pred, start, end)

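Search through the string s from right to left, returning the index of the last occurrence of a character which does not match char_pred: one that differs from char_pred if it is a character, does not satisfy char_pred if it is a procedure, or is not a member of char_pred if it is a character set.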

Scheme Procedure: string-count s char_pred [start [end]]
C Function: scm_string_count (s, char_pred, start, end)

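Return the number of characters in the string s which match char_pred: those that equal char_pred if it is a character, satisfy char_pred if it is a procedure, or are members of char_pred if it is a character set.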

Scheme Procedure: string-contains s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_contains (s1, s2, start1, end1, start2, end2)

Does string s1 contain string s2? Return the index in s1 where s2 occurs as a substring, or false. The optional start/end indices restrict the operation to the indicated substrings.

Scheme Procedure: string-contains-ci s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_contains_ci (s1, s2, start1, end1, start2, end2)

Does string s1 contain string s2? Return the index in s1 where s2 occurs as a substring, or false. The optional start/end indices restrict the operation to the indicated substrings. Character comparison is done case-insensitively.
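The searching procedures in this section can be tried on a few sample strings, as in the following sketch. The strings and variable names are arbitrary; each comment gives the value the definition is expected to have.

```scheme
(define text "hello world")

;; A character as char_pred: search for that exact character.
(define first-o (string-index text #\o))                ; => 4
(define last-o  (string-rindex text #\o))               ; => 7
(define no-z    (string-index text #\z))                ; => #f

;; A procedure as char_pred: skip leading whitespace.
(define first-non-space
  (string-skip "   indent" char-whitespace?))           ; => 3

;; Count the occurrences of a character.
(define count-of-a (string-count "banana" #\a))         ; => 3

;; "prefix" and "preview" share the three characters "pre".
(define shared-prefix
  (string-prefix-length "prefix" "preview"))            ; => 3

;; Substring search, with and without regard to case.
(define where    (string-contains text "wor"))          ; => 6
(define where-ci (string-contains-ci text "WOR"))       ; => 6
```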


6.6.5.9 Alphabetic Case Mapping

These are procedures for mapping strings to their upper- or lower-case equivalents, respectively, or for capitalizing strings.

They use the basic case mapping rules for Unicode characters. No special language or context rules are considered. The resulting strings are guaranteed to be the same length as the input strings.

See the (ice-9 i18n) module, for locale-dependent case conversions.

Scheme Procedure: string-upcase str [start [end]]
C Function: scm_substring_upcase (str, start, end)
C Function: scm_string_upcase (str)

Upcase every character in str.

Scheme Procedure: string-upcase! str [start [end]]
C Function: scm_substring_upcase_x (str, start, end)
C Function: scm_string_upcase_x (str)

Destructively upcase every character in str.

(define y (string-copy "arrdefg"))
(string-upcase! y)
⇒ "ARRDEFG"
y
⇒ "ARRDEFG"
Scheme Procedure: string-downcase str [start [end]]
C Function: scm_substring_downcase (str, start, end)
C Function: scm_string_downcase (str)

Downcase every character in str.

Scheme Procedure: string-downcase! str [start [end]]
C Function: scm_substring_downcase_x (str, start, end)
C Function: scm_string_downcase_x (str)

Destructively downcase every character in str.

y
⇒ "ARRDEFG"
(string-downcase! y)
⇒ "arrdefg"
y
⇒ "arrdefg"
Scheme Procedure: string-capitalize str
C Function: scm_string_capitalize (str)

Return a freshly allocated string with the characters in str, where the first character of every word is capitalized.

Scheme Procedure: string-capitalize! str
C Function: scm_string_capitalize_x (str)

Upcase the first character of every word in str destructively and return str.

y                      ⇒ "hello world"
(string-capitalize! y) ⇒ "Hello World"
y                      ⇒ "Hello World"
Scheme Procedure: string-titlecase str [start [end]]
C Function: scm_string_titlecase (str, start, end)

Titlecase the first character of every word in str.

Scheme Procedure: string-titlecase! str [start [end]]
C Function: scm_string_titlecase_x (str, start, end)

Destructively titlecase the first character of every word in str.
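The non-destructive procedures above can be compared side by side, as in this small sketch. The input string is an arbitrary example chosen to mix cases; the original string is left untouched by all three calls.

```scheme
(define mixed "hello wORLD")

(define upper (string-upcase mixed))      ; => "HELLO WORLD"
(define lower (string-downcase mixed))    ; => "hello world"
;; Titlecasing raises the first letter of each word and lowers the rest.
(define title (string-titlecase mixed))   ; => "Hello World"
```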


6.6.5.10 Reversing and Appending Strings

Scheme Procedure: string-reverse str [start [end]]
C Function: scm_string_reverse (str, start, end)

Reverse the string str. The optional arguments start and end delimit the region of str to operate on.

Scheme Procedure: string-reverse! str [start [end]]
C Function: scm_string_reverse_x (str, start, end)

Reverse the string str in-place. The optional arguments start and end delimit the region of str to operate on. The return value is unspecified.

Scheme Procedure: string-append arg …
C Function: scm_string_append (args)

Return a newly allocated string whose characters form the concatenation of the given strings, arg ....

(let ((h "hello "))
  (string-append h "world"))
⇒ "hello world"
Scheme Procedure: string-append/shared arg …
C Function: scm_string_append_shared (args)

Like string-append, but the result may share memory with the argument strings.

Scheme Procedure: string-concatenate ls
C Function: scm_string_concatenate (ls)

Append the elements (which must be strings) of ls together into a single string. Guaranteed to return a freshly allocated string.

Scheme Procedure: string-concatenate-reverse ls [final_string [end]]
C Function: scm_string_concatenate_reverse (ls, final_string, end)

Without optional arguments, this procedure is equivalent to

(string-concatenate (reverse ls))

If the optional argument final_string is specified, it is consed onto the beginning of ls before performing the list-reverse and string-concatenate operations. If end is given, only the characters of final_string up to index end are used.

Guaranteed to return a freshly allocated string.

Scheme Procedure: string-concatenate/shared ls
C Function: scm_string_concatenate_shared (ls)

Like string-concatenate, but the result may share memory with the strings in the list ls.

Scheme Procedure: string-concatenate-reverse/shared ls [final_string [end]]
C Function: scm_string_concatenate_reverse_shared (ls, final_string, end)

Like string-concatenate-reverse, but the result may share memory with the strings in the ls arguments.
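The following sketch exercises the reversing and concatenating procedures on small literal inputs. The variable names are illustrative. Note how string-concatenate-reverse suits the common pattern of accumulating pieces in reverse order while looping.

```scheme
(define backwards (string-reverse "hello"))               ; => "olleh"

(define joined (string-concatenate '("a" "b" "c")))       ; => "abc"

;; The list holds the pieces in reverse order of appearance.
(define rejoined
  (string-concatenate-reverse '("c" "b" "a")))            ; => "abc"

;; final_string is consed onto the list first, so it ends up last.
(define with-final
  (string-concatenate-reverse '("b" "a") "c"))            ; => "abc"
```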


6.6.5.11 Mapping, Folding, and Unfolding

Scheme Procedure: string-map proc s [start [end]]
C Function: scm_string_map (proc, s, start, end)

proc is a char->char procedure that is mapped over s. The order in which the procedure is applied to the string elements is not specified.

Scheme Procedure: string-map! proc s [start [end]]
C Function: scm_string_map_x (proc, s, start, end)

proc is a char->char procedure that is mapped over s. The order in which the procedure is applied to the string elements is not specified. The string s is modified in-place; the return value is not specified.

Scheme Procedure: string-for-each proc s [start [end]]
C Function: scm_string_for_each (proc, s, start, end)

proc is mapped over s in left-to-right order. The return value is not specified.

Scheme Procedure: string-for-each-index proc s [start [end]]
C Function: scm_string_for_each_index (proc, s, start, end)

Call (proc i) for each index i in s, from left to right.

For example, to change characters to alternately upper and lower case,

(define str (string-copy "studly"))
(string-for-each-index
    (lambda (i)
      (string-set! str i
        ((if (even? i) char-upcase char-downcase)
         (string-ref str i))))
    str)
str ⇒ "StUdLy"
Scheme Procedure: string-fold kons knil s [start [end]]
C Function: scm_string_fold (kons, knil, s, start, end)

Fold kons over the characters of s, with knil as the terminating element, from left to right. kons must expect two arguments: The actual character and the last result of kons’ application.

Scheme Procedure: string-fold-right kons knil s [start [end]]
C Function: scm_string_fold_right (kons, knil, s, start, end)

Fold kons over the characters of s, with knil as the terminating element, from right to left. kons must expect two arguments: The actual character and the last result of kons’ application.
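A short sketch of mapping and folding over strings follows. Using cons as kons makes the traversal direction visible: folding from the left builds the list in reverse, while folding from the right preserves the original order. The names and inputs are illustrative.

```scheme
;; Map a char->char procedure over a string.
(define shouted (string-map char-upcase "abc"))             ; => "ABC"

;; kons receives the current character and the accumulated result.
(define left-chars  (string-fold cons '() "abc"))           ; => (#\c #\b #\a)
(define right-chars (string-fold-right cons '() "abc"))     ; => (#\a #\b #\c)

;; A fold can also compute a summary value, here a count of #\a.
(define folded-count
  (string-fold (lambda (c n) (if (char=? c #\a) (+ n 1) n))
               0
               "banana"))                                   ; => 3
```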

Scheme Procedure: string-unfold p f g seed [base [make_final]]
C Function: scm_string_unfold (p, f, g, seed, base, make_final)
Scheme Procedure: string-unfold-right p f g seed [base [make_final]]
C Function: scm_string_unfold_right (p, f, g, seed, base, make_final)
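
The argument descriptions for these two procedures are not reproduced here. The sketch below assumes the meanings given by SRFI-13: p is a predicate on the seed that tells when to stop, f maps each seed to a character, g produces the next seed, and base, if given, is a string forming the initial portion of the result. Treat it as an illustration under that assumption rather than a definition.

```scheme
;; Walk a list of characters: stop at the empty list, take the car as
;; the next character, and continue with the cdr.
(define unfolded
  (string-unfold null? car cdr '(#\a #\b #\c)))             ; => "abc"

;; The right-to-left variant builds the string from its end.
(define unfolded-right
  (string-unfold-right null? car cdr '(#\a #\b #\c)))       ; => "cba"

;; Generate the digits 1 to 5 from an integer seed.
(define digits
  (string-unfold (lambda (n) (> n 5))                       ; p: stop after 5
                 (lambda (n) (integer->char (+ n 48)))      ; f: digit character
                 (lambda (n) (+ n 1))                       ; g: next seed
                 1))                                        ; => "12345"
```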

6.6.5.12 Miscellaneous String Operations

Scheme Procedure: xsubstring s from [to [start [end]]]
C Function: scm_xsubstring (s, from, to, start, end)

This is the extended substring procedure that implements replicated copying of a substring of some string.

s is a string; start and end are optional arguments that demarcate a substring of s, defaulting to 0 and the length of s. Replicate this substring up and down index space, in both the positive and negative directions. xsubstring returns the substring of this replicated string beginning at index from and ending at to, which defaults to from + (end - start).

Scheme Procedure: string-xcopy! target tstart s sfrom [sto [start [end]]]
C Function: scm_string_xcopy_x (target, tstart, s, sfrom, sto, start, end)

Exactly the same as xsubstring, but the extracted text is written into the string target starting at index tstart. The operation is not defined if (eq? target s) or these arguments share storage – you cannot copy a string on top of itself.

Scheme Procedure: string-replace s1 s2 [start1 [end1 [start2 [end2]]]]
C Function: scm_string_replace (s1, s2, start1, end1, start2, end2)

Return the string s1, but with the characters start1 … end1 replaced by the characters start2 … end2 from s2.

Scheme Procedure: string-tokenize s [token_set [start [end]]]
C Function: scm_string_tokenize (s, token_set, start, end)

Split the string s into a list of substrings, where each substring is a maximal non-empty contiguous sequence of characters from the character set token_set, which defaults to char-set:graphic. If start or end indices are provided, they restrict string-tokenize to operating on the indicated substring of s.

Scheme Procedure: string-filter char_pred s [start [end]]
C Function: scm_string_filter (char_pred, s, start, end)

Filter the string s, retaining only those characters which satisfy char_pred.

If char_pred is a procedure, it is applied to each character as a predicate; if it is a character, it is tested for equality; and if it is a character set, it is tested for membership.

Scheme Procedure: string-delete char_pred s [start [end]]
C Function: scm_string_delete (char_pred, s, start, end)

Delete characters satisfying char_pred from s.

If char_pred is a procedure, it is applied to each character as a predicate; if it is a character, it is tested for equality; and if it is a character set, it is tested for membership.
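The miscellaneous operations in this section are easiest to grasp from small cases. The following sketch uses arbitrary literal strings; the comments give the expected values.

```scheme
;; Indices 2 through 7 of "abcdef" replicated endlessly: the range
;; runs past the end and wraps around to the start.
(define rotated (xsubstring "abcdef" 2 8))                   ; => "cdefab"

;; Replace characters 6 to 11 of the first string with the second.
(define replaced
  (string-replace "hello world" "there" 6 11))               ; => "hello there"

;; The default token set is char-set:graphic, so punctuation stays
;; attached to its word and whitespace separates the tokens.
(define tokens (string-tokenize "hello, big world"))         ; => ("hello," "big" "world")

;; Keep only the alphabetic characters.
(define letters (string-filter char-alphabetic? "a1b2c3"))   ; => "abc"

;; Remove every occurrence of a single character.
(define without-l (string-delete #\l "hello"))               ; => "heo"
```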


6.6.5.13 Representing Strings as Bytes

Out in the cold world outside of Guile, not all strings are treated in the same way. Out there there are only bytes, and there are many ways of representing a string (a sequence of characters) as binary data (a sequence of bytes).

As a user, usually you don’t have to think about this very much. When you type on your keyboard, your system encodes your keystrokes as bytes according to the locale that you have configured on your computer. Guile uses the locale to decode those bytes back into characters – hopefully the same characters that you typed in.

All is not so clear when dealing with a system with multiple users, such as a web server. Your web server might get a request from one user for data encoded in the ISO-8859-1 character set, and then another request from a different user for UTF-8 data.

Guile provides an iconv module for converting between strings and sequences of bytes. See Bytevectors, for more on how Guile represents raw byte sequences. This module gets its name from the common UNIX command of the same name.

Note that often it is sufficient to just read and write strings from ports instead of using these functions. To do this, specify the port encoding using set-port-encoding!. See Ports, for more on ports and character encodings.

Unlike the rest of the procedures in this section, you have to load the iconv module before having access to these procedures:

(use-modules (ice-9 iconv))
Scheme Procedure: string->bytevector string encoding [conversion-strategy]

Encode string as a sequence of bytes.

The string will be encoded in the character set specified by the encoding string. If the string has characters that cannot be represented in the encoding, by default this procedure raises an encoding-error. Pass a conversion-strategy argument to specify other behaviors.

The return value is a bytevector. See Bytevectors, for more on bytevectors. See Ports, for more on character encodings and conversion strategies.

Scheme Procedure: bytevector->string bytevector encoding [conversion-strategy]

Decode bytevector into a string.

The bytes will be decoded from the character set specified by the encoding string. If the bytes do not form a valid encoding, by default this procedure raises a decoding-error. As with string->bytevector, pass the optional conversion-strategy argument to modify this behavior. See Ports, for more on character encodings and conversion strategies.

Scheme Procedure: call-with-output-encoded-string encoding proc [conversion-strategy]

Like call-with-output-string, but instead of returning a string, returns an encoding of the string according to encoding, as a bytevector. This procedure can be more efficient than collecting a string and then converting it via string->bytevector.
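As a sketch of how the iconv procedures fit together, the example below encodes a one-character string in two encodings and decodes it again. The character U+00E9 (LATIN SMALL LETTER E WITH ACUTE) is an arbitrary choice that happens to encode differently in UTF-8 and ISO-8859-1; the variable names are illustrative.

```scheme
(use-modules (ice-9 iconv)
             (rnrs bytevectors))

;; A one-character string holding U+00E9.
(define e-acute (string (integer->char #xE9)))

;; The same character occupies two bytes in UTF-8 and one in Latin-1.
(define utf8-bytes   (string->bytevector e-acute "UTF-8"))       ; => #vu8(195 169)
(define latin1-bytes (string->bytevector e-acute "ISO-8859-1"))  ; => #vu8(233)

;; Decoding with the matching encoding restores the original string.
(define round-trip (bytevector->string utf8-bytes "UTF-8"))

;; Collect port output directly as encoded bytes.
(define encoded-output
  (call-with-output-encoded-string "UTF-8"
    (lambda (port) (display "hi" port))))                        ; => #vu8(104 105)
```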


6.6.5.14 Conversion to/from C

When creating a Scheme string from a C string or when converting a Scheme string to a C string, the concept of character encoding becomes important.

In C, a string is just a sequence of bytes, and the character encoding describes the relation between these bytes and the actual characters that make up the string. For Scheme strings, character encoding is not an issue (most of the time), since in Scheme you usually treat strings as character sequences, not byte sequences.

Converting to C and converting from C each have their own challenges.

When converting from C to Scheme, it is important that the sequence of bytes in the C string be valid with respect to its encoding. ASCII strings, for example, can’t have any bytes greater than 127. An ASCII byte greater than 127 is considered ill-formed and cannot be converted into a Scheme character.

Problems can occur in the reverse operation as well. Not all character encodings can hold all possible Scheme characters. Some encodings, like ASCII for example, can only describe a small subset of all possible characters. So, when converting to C, one must first decide what to do with Scheme characters that can’t be represented in the C string.

Converting a Scheme string to a C string will often allocate fresh memory to hold the result. You must take care that this memory is properly freed eventually. In many cases, this can be achieved by using scm_dynwind_free inside an appropriate dynwind context. See Dynamic Wind.

C Function: SCM scm_from_locale_string (const char *str)
C Function: SCM scm_from_locale_stringn (const char *str, size_t len)

Creates a new Scheme string that has the same contents as str when interpreted in the character encoding of the current locale.

For scm_from_locale_string, str must be null-terminated.

For scm_from_locale_stringn, len specifies the length of str in bytes, and str does not need to be null-terminated. If len is (size_t)-1, then str does need to be null-terminated and the real length will be found with strlen.

If the C string is ill-formed, an error will be raised.

Note that these functions should not be used to convert C string constants, because there is no guarantee that the current locale will match that of the execution character set, used for string and character constants. Most modern C compilers use UTF-8 by default, so to convert C string constants we recommend scm_from_utf8_string.

C Function: SCM scm_take_locale_string (char *str)
C Function: SCM scm_take_locale_stringn (char *str, size_t len)

Like scm_from_locale_string and scm_from_locale_stringn, respectively, but also frees str with free eventually. Thus, you can use this function when you would free str anyway immediately after creating the Scheme string. In certain cases, Guile can then use str directly as its internal representation.

C Function: char * scm_to_locale_string (SCM str)
C Function: char * scm_to_locale_stringn (SCM str, size_t *lenp)

Returns a C string with the same contents as str in the character encoding of the current locale. The C string must be freed with free eventually, maybe by using scm_dynwind_free. See Dynamic Wind.

For scm_to_locale_string, the returned string is null-terminated and an error is signalled when str contains #\nul characters.

For scm_to_locale_stringn and lenp not NULL, str might contain #\nul characters and the length of the returned string in bytes is stored in *lenp. The returned string will not be null-terminated in this case. If lenp is NULL, scm_to_locale_stringn behaves like scm_to_locale_string.

If a character in str cannot be represented in the character encoding of the current locale, the default port conversion strategy is used. See Ports, for more on conversion strategies.

If the conversion strategy is error, an error will be raised. If it is substitute, a replacement character, such as a question mark, will be inserted in its place. If it is escape, a hex escape will be inserted in its place.

C Function: size_t scm_to_locale_stringbuf (SCM str, char *buf, size_t max_len)

Puts str as a C string in the current locale encoding into the memory pointed to by buf. The buffer at buf has room for max_len bytes and scm_to_locale_stringbuf will never store more than that. No terminating '\0' will be stored.

The return value of scm_to_locale_stringbuf is the number of bytes that are needed for all of str, regardless of whether buf was large enough to hold them. Thus, when the return value is larger than max_len, only max_len bytes have been stored and you probably need to try again with a larger buffer.

For most situations, string conversion should occur using the current locale, such as with the functions above. But there may be cases where one wants to convert strings from a character encoding other than the locale’s character encoding. For these cases, the lower-level functions scm_to_stringn and scm_from_stringn are provided. These functions should seldom be necessary if one is properly using locales.

C Type: scm_t_string_failed_conversion_handler

This is an enumerated type that can take one of three values: SCM_FAILED_CONVERSION_ERROR, SCM_FAILED_CONVERSION_QUESTION_MARK, and SCM_FAILED_CONVERSION_ESCAPE_SEQUENCE. They are used to indicate a strategy for handling characters that cannot be converted to or from a given character encoding. SCM_FAILED_CONVERSION_ERROR indicates that a conversion should throw an error if some characters cannot be converted. SCM_FAILED_CONVERSION_QUESTION_MARK indicates that a conversion should replace unconvertible characters with the question mark character. Finally, SCM_FAILED_CONVERSION_ESCAPE_SEQUENCE requests that a conversion should replace an unconvertible character with an escape sequence.

While all three strategies apply when converting Scheme strings to C, only SCM_FAILED_CONVERSION_ERROR and SCM_FAILED_CONVERSION_QUESTION_MARK can be used when converting C strings to Scheme.

C Function: char *scm_to_stringn (SCM str, size_t *lenp, const char *encoding, scm_t_string_failed_conversion_handler handler)

This function returns a newly allocated C string from the Guile string str. The length of the returned string in bytes will be returned in lenp. The character encoding of the C string is passed as the ASCII, null-terminated C string encoding. The handler parameter gives a strategy for dealing with characters that cannot be converted into encoding.

If lenp is NULL, this function will return a null-terminated C string. It will throw an error if the string contains a null character.

The Scheme interface to this function is string->bytevector, from the ice-9 iconv module. See Representing Strings as Bytes.

C Function: SCM scm_from_stringn (const char *str, size_t len, const char *encoding, scm_t_string_failed_conversion_handler handler)

This function returns a Scheme string from the C string str. The length in bytes of the C string is input as len. The encoding of the C string is passed as the ASCII, null-terminated C string encoding. The handler parameter suggests a strategy for dealing with unconvertible characters.

The Scheme interface to this function is bytevector->string. See Representing Strings as Bytes.

The following conversion functions are provided as a convenience for the most commonly used encodings.

C Function: SCM scm_from_latin1_string (const char *str)
C Function: SCM scm_from_utf8_string (const char *str)
C Function: SCM scm_from_utf32_string (const scm_t_wchar *str)

Return a Scheme string from the null-terminated C string str, which is ISO-8859-1-, UTF-8-, or UTF-32-encoded. These functions should be used to convert hard-coded C string constants into Scheme strings.

C Function: SCM scm_from_latin1_stringn (const char *str, size_t len)
C Function: SCM scm_from_utf8_stringn (const char *str, size_t len)
C Function: SCM scm_from_utf32_stringn (const scm_t_wchar *str, size_t len)

Return a Scheme string from the C string str, which is ISO-8859-1-, UTF-8-, or UTF-32-encoded, of length len. len is the number of bytes pointed to by str for scm_from_latin1_stringn and scm_from_utf8_stringn; it is the number of elements (code points) in str in the case of scm_from_utf32_stringn.

C Function: char *scm_to_latin1_stringn (SCM str, size_t *lenp)
C Function: char *scm_to_utf8_stringn (SCM str, size_t *lenp)
C Function: scm_t_wchar *scm_to_utf32_stringn (SCM str, size_t *lenp)

Return a newly allocated, ISO-8859-1-, UTF-8-, or UTF-32-encoded C string from Scheme string str. An error is thrown when str cannot be converted to the specified encoding. If lenp is NULL, the returned C string will be null-terminated, and an error will be thrown if the C string would otherwise contain null characters. If lenp is not NULL, the string is not null-terminated, and the length of the returned string is returned in lenp. The length returned is the number of bytes for scm_to_latin1_stringn and scm_to_utf8_stringn; it is the number of elements (code points) for scm_to_utf32_stringn.


6.6.5.15 String Internals

Guile stores each string in memory as a contiguous array of Unicode code points along with an associated set of attributes. If all of the code points of a string have an integer value between 0 and 255 inclusive, the code point array is stored as one byte per code point: it is stored as an ISO-8859-1 (aka Latin-1) string. If any of the code points of the string has an integer value greater than 255, the code point array is stored as four bytes per code point: it is stored as a UTF-32 string.

Conversion between the one-byte-per-code-point and four-bytes-per-code-point representations happens automatically as necessary.

No API is provided to set the internal representation of strings; however, there is a pair of procedures available to query it. These are debugging procedures. Using them in production code is discouraged, since the details of Guile’s internal representation of strings may change from release to release.

Scheme Procedure: string-bytes-per-char str
C Function: scm_string_bytes_per_char (str)

Return the number of bytes used to encode a Unicode code point in string str. The result is one or four.

Scheme Procedure: %string-dump str
C Function: scm_sys_string_dump (str)

Returns an association list containing debugging information for str. The association list has the following entries.

string

The string itself.

start

The start index of the string into its stringbuf

length

The length of the string

shared

If this string is a substring, it returns its parent string. Otherwise, it returns #f

read-only

#t if the string is read-only

stringbuf-chars

A new string containing this string’s stringbuf’s characters

stringbuf-length

The number of characters in this stringbuf

stringbuf-shared

#t if this stringbuf is shared

stringbuf-wide

#t if this stringbuf’s characters are stored in a 32-bit buffer, or #f if they are stored in an 8-bit buffer
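The two representations can be observed with the debugging procedures above, as in this sketch. The strings are arbitrary; U+03BB (GREEK SMALL LETTER LAMDA) is used only because its code point exceeds 255. As noted earlier, these procedures expose internal details that may change between releases.

```scheme
;; Every code point is below 256, so one byte per character suffices.
(define narrow "hello")

;; One code point above 255 forces the four-byte representation.
(define wide (string #\h (integer->char #x3BB)))

(define narrow-width (string-bytes-per-char narrow))   ; => 1
(define wide-width   (string-bytes-per-char wide))     ; => 4

;; %string-dump returns an association list keyed by symbols.
(define wide-flag
  (assq-ref (%string-dump wide) 'stringbuf-wide))      ; => #t
```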


6.6.6 Bytevectors

A bytevector is a raw bit string. The (rnrs bytevectors) module provides the programming interface specified by the Revised^6 Report on the Algorithmic Language Scheme (R6RS). It contains procedures to manipulate bytevectors and interpret their contents in a number of ways: bytevector contents can be accessed as signed or unsigned integers of various sizes and endianness, as IEEE-754 floating point numbers, or as strings. It is a useful tool to encode and decode binary data.

The R6RS (Section 4.3.4) specifies an external representation for bytevectors, whereby the octets (integers in the range 0–255) contained in the bytevector are represented as a list prefixed by #vu8:

#vu8(1 53 204)

denotes a 3-byte bytevector containing the octets 1, 53, and 204. Like string literals, booleans, etc., bytevectors are “self-quoting”, i.e., they do not need to be quoted:

#vu8(1 53 204)
⇒ #vu8(1 53 204)

Bytevectors can be used with the binary input/output primitives of the R6RS (see R6RS I/O Ports).


6.6.6.1 Endianness

Some of the following procedures take an endianness parameter. The endianness is defined as the order of bytes in multi-byte numbers: numbers encoded in big endian have their most significant bytes written first, whereas numbers encoded in little endian have their least significant bytes first.

Little-endian is the native endianness of the IA32 architecture and its derivatives, while big-endian is native to SPARC and PowerPC, among others. The native-endianness procedure returns the native endianness of the machine it runs on.

Scheme Procedure: native-endianness
C Function: scm_native_endianness ()

Return a value denoting the native endianness of the host machine.

Scheme Macro: endianness symbol

Return an object denoting the endianness specified by symbol. If symbol is neither big nor little then an error is raised at expand-time.

C Variable: scm_endianness_big
C Variable: scm_endianness_little

The objects denoting big- and little-endianness, respectively.
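To make the byte-order distinction concrete, the sketch below reads the same two bytes under both endiannesses. It uses bytevector-u8-set! and bytevector-u16-ref from (rnrs bytevectors); the variable names and byte values are illustrative.

```scheme
(use-modules (rnrs bytevectors))

;; The native endianness is one of the two endianness objects.
(define native (native-endianness))
(define native-known?
  (if (memq native (list (endianness big) (endianness little))) #t #f))

;; Two bytes, #x12 followed by #x34.
(define two-bytes (make-bytevector 2 0))
(bytevector-u8-set! two-bytes 0 #x12)
(bytevector-u8-set! two-bytes 1 #x34)

;; Big endian: the first byte is the most significant.
(define as-big
  (bytevector-u16-ref two-bytes 0 (endianness big)))       ; => #x1234
;; Little endian: the first byte is the least significant.
(define as-little
  (bytevector-u16-ref two-bytes 0 (endianness little)))    ; => #x3412
```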


6.6.6.2 Manipulating Bytevectors

Bytevectors can be created, copied, and analyzed with the following procedures and C functions.

Scheme Procedure: make-bytevector len [fill]
C Function: scm_make_bytevector (len, fill)
C Function: scm_c_make_bytevector (size_t len)

Return a new bytevector of len bytes. Optionally, if fill is given, fill it with fill; fill must be in the range [-128,255].

Scheme Procedure: bytevector? obj
C Function: scm_bytevector_p (obj)

Return true if obj is a bytevector.

C Function: int scm_is_bytevector (SCM obj)

Equivalent to scm_is_true (scm_bytevector_p (obj)).

Scheme Procedure: bytevector-length bv
C Function: scm_bytevector_length (bv)

Return the length in bytes of bytevector bv.

C Function: size_t scm_c_bytevector_length (SCM bv)

Likewise, return the length in bytes of bytevector bv.

Scheme Procedure: bytevector=? bv1 bv2
C Function: scm_bytevector_eq_p (bv1, bv2)

Return true if bv1 equals bv2, i.e., if they have the same length and contents.

Scheme Procedure: bytevector-fill! bv fill
C Function: scm_bytevector_fill_x (bv, fill)

Fill bytevector bv with fill, a byte.

Scheme Procedure: bytevector-copy! source source-start target target-start len
C Function: scm_bytevector_copy_x (source, source_start, target, target_start, len)

Copy len bytes from source into target, reading from source-start (a valid index within source) and writing at target-start. The source and target regions may overlap.

Scheme Procedure: bytevector-copy bv
C Function: scm_bytevector_copy (bv)

Return a newly allocated copy of bv.
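The copying and comparison procedures above can be sketched as follows. This is a minimal example rather than a canonical usage pattern; it assumes the (rnrs bytevectors) module, and src, dst and dup are hypothetical names.

```scheme
(use-modules (rnrs bytevectors))

(define src (u8-list->bytevector '(1 2 3 4 5)))
(define dst (make-bytevector 5 0))

;; Copy 3 bytes, reading from index 1 of src and writing at index 0 of dst.
(bytevector-copy! src 1 dst 0 3)
(bytevector->u8-list dst)          ; ⇒ (2 3 4 0 0)

;; bytevector-copy returns a distinct object with equal contents.
(define dup (bytevector-copy src))
(bytevector=? src dup)             ; ⇒ #t
(eq? src dup)                      ; ⇒ #f

;; Mutating the copy leaves the original untouched.
(bytevector-fill! dup 9)
(bytevector->u8-list dup)          ; ⇒ (9 9 9 9 9)
(bytevector->u8-list src)          ; ⇒ (1 2 3 4 5)
```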

C Function: scm_t_uint8 scm_c_bytevector_ref (SCM bv, size_t index)

Return the byte at index in bytevector bv.

C Function: void scm_c_bytevector_set_x (SCM bv, size_t index, scm_t_uint8 value)

Set the byte at index in bv to value.

Low-level C macros are available. They do not perform any type-checking; as such they should be used with care.

C Macro: size_t SCM_BYTEVECTOR_LENGTH (bv)

Return the length in bytes of bytevector bv.

C Macro: signed char * SCM_BYTEVECTOR_CONTENTS (bv)

Return a pointer to the contents of bytevector bv.


6.6.6.3 Interpreting Bytevector Contents as Integers

The contents of a bytevector can be interpreted as a sequence of integers of any given size, sign, and endianness.

(let ((bv (make-bytevector 4)))
  (bytevector-u8-set! bv 0 #x12)
  (bytevector-u8-set! bv 1 #x34)
  (bytevector-u8-set! bv 2 #x56)
  (bytevector-u8-set! bv 3 #x78)

  (map (lambda (number)
         (number->string number 16))
       (list (bytevector-u8-ref bv 0)
             (bytevector-u16-ref bv 0 (endianness big))
             (bytevector-u32-ref bv 0 (endianness little)))))

⇒ ("12" "1234" "78563412")

The most generic procedures to interpret bytevector contents as integers are described below.

Scheme Procedure: bytevector-uint-ref bv index endianness size
C Function: scm_bytevector_uint_ref (bv, index, endianness, size)

Return the size-byte long unsigned integer at index index in bv, decoded according to endianness.

Scheme Procedure: bytevector-sint-ref bv index endianness size
C Function: scm_bytevector_sint_ref (bv, index, endianness, size)

Return the size-byte long signed integer at index index in bv, decoded according to endianness.

Scheme Procedure: bytevector-uint-set! bv index value endianness size
C Function: scm_bytevector_uint_set_x (bv, index, value, endianness, size)

Set the size-byte long unsigned integer at index to value, encoded according to endianness.

Scheme Procedure: bytevector-sint-set! bv index value endianness size
C Function: scm_bytevector_sint_set_x (bv, index, value, endianness, size)

Set the size-byte long signed integer at index to value, encoded according to endianness.
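To illustrate the generic accessors, here is a small sketch that reads the same three bytes as a 24-bit integer under different signedness and byte order. It assumes the (rnrs bytevectors) module; the name bv is arbitrary.

```scheme
(use-modules (rnrs bytevectors))

(define bv (u8-list->bytevector '(#xff #xfe #xfd)))

;; The same three bytes, read as one 24-bit integer.
(bytevector-uint-ref bv 0 (endianness big) 3)     ; ⇒ 16776957 (#xfffefd)
(bytevector-sint-ref bv 0 (endianness big) 3)     ; ⇒ -259
(bytevector-uint-ref bv 0 (endianness little) 3)  ; ⇒ 16645887 (#xfdfeff)

;; Write a 24-bit value in little-endian order and list the bytes.
(bytevector-uint-set! bv 0 #x010203 (endianness little) 3)
(bytevector->u8-list bv)                          ; ⇒ (3 2 1)
```

Sizes such as 3 bytes have no specialized accessor, which is where the generic procedures are useful.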

The following procedures are similar to the ones above, but specialized to a given integer size:

Scheme Procedure: bytevector-u8-ref bv index
Scheme Procedure: bytevector-s8-ref bv index
Scheme Procedure: bytevector-u16-ref bv index endianness
Scheme Procedure: bytevector-s16-ref bv index endianness
Scheme Procedure: bytevector-u32-ref bv index endianness
Scheme Procedure: bytevector-s32-ref bv index endianness
Scheme Procedure: bytevector-u64-ref bv index endianness
Scheme Procedure: bytevector-s64-ref bv index endianness
C Function: scm_bytevector_u8_ref (bv, index)
C Function: scm_bytevector_s8_ref (bv, index)
C Function: scm_bytevector_u16_ref (bv, index, endianness)
C Function: scm_bytevector_s16_ref (bv, index, endianness)
C Function: scm_bytevector_u32_ref (bv, index, endianness)
C Function: scm_bytevector_s32_ref (bv, index, endianness)
C Function: scm_bytevector_u64_ref (bv, index, endianness)
C Function: scm_bytevector_s64_ref (bv, index, endianness)

Return the n-bit unsigned (or signed) integer (where n is 8, 16, 32 or 64) from bv at index, decoded according to endianness.

Scheme Procedure: bytevector-u8-set! bv index value
Scheme Procedure: bytevector-s8-set! bv index value
Scheme Procedure: bytevector-u16-set! bv index value endianness
Scheme Procedure: bytevector-s16-set! bv index value endianness
Scheme Procedure: bytevector-u32-set! bv index value endianness
Scheme Procedure: bytevector-s32-set! bv index value endianness
Scheme Procedure: bytevector-u64-set! bv index value endianness
Scheme Procedure: bytevector-s64-set! bv index value endianness
C Function: scm_bytevector_u8_set_x (bv, index, value)
C Function: scm_bytevector_s8_set_x (bv, index, value)
C Function: scm_bytevector_u16_set_x (bv, index, value, endianness)
C Function: scm_bytevector_s16_set_x (bv, index, value, endianness)
C Function: scm_bytevector_u32_set_x (bv, index, value, endianness)
C Function: scm_bytevector_s32_set_x (bv, index, value, endianness)
C Function: scm_bytevector_u64_set_x (bv, index, value, endianness)
C Function: scm_bytevector_s64_set_x (bv, index, value, endianness)

Store value as an n-bit unsigned (or signed) integer (where n is 8, 16, 32 or 64) in bv at index, encoded according to endianness.
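The difference between the signed and unsigned accessors can be sketched as follows, assuming the (rnrs bytevectors) module. The same stored bytes yield different integers depending on which accessor reads them.

```scheme
(use-modules (rnrs bytevectors))

(define bv (make-bytevector 4 0))

;; One byte, read back as unsigned and as signed.
(bytevector-u8-set! bv 0 255)
(bytevector-u8-ref bv 0)                     ; ⇒ 255
(bytevector-s8-ref bv 0)                     ; ⇒ -1

;; A signed 16-bit value stored big-endian at index 2.
(bytevector-s16-set! bv 2 -2 (endianness big))
(bytevector->u8-list bv)                     ; ⇒ (255 0 255 254)
(bytevector-s16-ref bv 2 (endianness big))   ; ⇒ -2
(bytevector-u16-ref bv 2 (endianness big))   ; ⇒ 65534
```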

Finally, a variant specialized for the host’s endianness is available for each of these functions (with the exception of the u8 and s8 accessors, since a single byte has no byte order):

Scheme Procedure: bytevector-u16-native-ref bv index
Scheme Procedure: bytevector-s16-native-ref bv index
Scheme Procedure: bytevector-u32-native-ref bv index
Scheme Procedure: bytevector-s32-native-ref bv index
Scheme Procedure: bytevector-u64-native-ref bv index
Scheme Procedure: bytevector-s64-native-ref bv index
C Function: scm_bytevector_u16_native_ref (bv, index)
C Function: scm_bytevector_s16_native_ref (bv, index)
C Function: scm_bytevector_u32_native_ref (bv, index)
C Function: scm_bytevector_s32_native_ref (bv, index)
C Function: scm_bytevector_u64_native_ref (bv, index)
C Function: scm_bytevector_s64_native_ref (bv, index)

Return the n-bit unsigned (or signed) integer (where n is 16, 32 or 64) from bv at index, decoded according to the host’s native endianness.

Scheme Procedure: bytevector-u16-native-set! bv index value
Scheme Procedure: bytevector-s16-native-set! bv index value
Scheme Procedure: bytevector-u32-native-set! bv index value
Scheme Procedure: bytevector-s32-native-set! bv index value
Scheme Procedure: bytevector-u64-native-set! bv index value
Scheme Procedure: bytevector-s64-native-set! bv index value
C Function: scm_bytevector_u16_native_set_x (bv, index, value)
C Function: scm_bytevector_s16_native_set_x (bv, index, value)
C Function: scm_bytevector_u32_native_set_x (bv, index, value)
C Function: scm_bytevector_s32_native_set_x (bv, index, value)
C Function: scm_bytevector_u64_native_set_x (bv, index, value)
C Function: scm_bytevector_s64_native_set_x (bv, index, value)

Store value as an n-bit unsigned (or signed) integer (where n is 16, 32 or 64) in bv at index, encoded according to the host’s native endianness.
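As a sketch of how the native variants relate to the general ones, the following compares a native read with an explicit read that passes (native-endianness). It assumes the (rnrs bytevectors) module; the names are hypothetical.

```scheme
(use-modules (rnrs bytevectors))

(define bv (u8-list->bytevector '(1 2 3 4)))

;; Both forms decode the same four bytes using the host's byte order.
(define native-value   (bytevector-u32-native-ref bv 0))
(define explicit-value (bytevector-u32-ref bv 0 (native-endianness)))

(= native-value explicit-value)   ; ⇒ #t
;; native-value is #x01020304 on a big-endian host
;; and #x04030201 on a little-endian host.
```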


6.6.6.4 Converting Bytevectors to/from Integer Lists

Bytevector contents can readily be converted to/from lists of signed or unsigned integers:

(bytevector->sint-list (u8-list->bytevector (make-list 4 255))
                       (endianness little) 2)
⇒ (-1 -1)

Scheme Procedure: bytevector->u8-list bv
C Function: scm_bytevector_to_u8_list (bv)

Return a newly allocated list of unsigned 8-bit integers from the contents of bv.

Scheme Procedure: u8-list->bytevector lst
C Function: scm_u8_list_to_bytevector (lst)

Return a newly allocated bytevector consisting of the unsigned 8-bit integers listed in lst.

Scheme Procedure: bytevector->uint-list bv endianness size
C Function: scm_bytevector_to_uint_list (bv, endianness, size)

Return a list of unsigned integers of size bytes representing the contents of bv, decoded according to endianness.

Scheme Procedure: bytevector->sint-list bv endianness size
C Function: scm_bytevector_to_sint_list (bv, endianness, size)

Return a list of signed integers of size bytes representing the contents of bv, decoded according to endianness.

Scheme Procedure: uint-list->bytevector lst endianness size
C Function: scm_uint_list_to_bytevector (lst, endianness, size)

Return a new bytevector containing the unsigned integers listed in lst and encoded on size bytes according to endianness.

Scheme Procedure: sint-list->bytevector lst endianness size
C Function: scm_sint_list_to_bytevector (lst, endianness, size)

Return a new bytevector containing the signed integers listed in lst and encoded on size bytes according to endianness.
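A short round-trip sketch of the list conversions, assuming the (rnrs bytevectors) module. It also shows what happens when a bytevector is decoded with the opposite endianness from the one used to encode it.

```scheme
(use-modules (rnrs bytevectors))

;; Encode two 16-bit unsigned integers in big-endian order.
(define bv (uint-list->bytevector '(1 258) (endianness big) 2))
(bytevector->u8-list bv)                          ; ⇒ (0 1 1 2)

;; Decoding with the matching endianness recovers the list.
(bytevector->uint-list bv (endianness big) 2)     ; ⇒ (1 258)

;; Decoding with the other endianness swaps the bytes of each integer.
(bytevector->uint-list bv (endianness little) 2)  ; ⇒ (256 513)
```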


6.6.6.5 Interpreting Bytevector Contents as Floating Point Numbers

Bytevector contents can also be accessed as IEEE-754 single- or double-precision floating point numbers (32 and 64 bits long, respectively) using the procedures described here.

Scheme Procedure: bytevector-ieee-single-ref bv index endianness
Scheme Procedure: bytevector-ieee-double-ref bv index endianness
C Function: scm_bytevector_ieee_single_ref (bv, index, endianness)
C Function: scm_bytevector_ieee_double_ref (bv, index, endianness)

Return the IEEE-754 single- or double-precision floating point number from bv at index according to endianness.

Scheme Procedure: bytevector-ieee-single-set! bv index value endianness
Scheme Procedure: bytevector-ieee-double-set! bv index value endianness
C Function: scm_bytevector_ieee_single_set_x (bv, index, value, endianness)
C Function: scm_bytevector_ieee_double_set_x (bv, index, value, endianness)

Store real number value in bv at index according to endianness.

Specialized procedures are also available:

Scheme Procedure: bytevector-ieee-single-native-ref bv index
Scheme Procedure: bytevector-ieee-double-native-ref bv index
C Function: scm_bytevector_ieee_single_native_ref (bv, index)
C Function: scm_bytevector_ieee_double_native_ref (bv, index)

Return the IEEE-754 single- or double-precision floating point number from bv at index according to the host’s native endianness.

Scheme Procedure: bytevector-ieee-single-native-set! bv index value
Scheme Procedure: bytevector-ieee-double-native-set! bv index value
C Function: scm_bytevector_ieee_single_native_set_x (bv, index, value)
C Function: scm_bytevector_ieee_double_native_set_x (bv, index, value)

Store real number value in bv at index according to the host’s native endianness.
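The floating point accessors can be sketched as follows, assuming the (rnrs bytevectors) module. The values 1.0 and 0.5 are chosen because they are exactly representable in both precisions.

```scheme
(use-modules (rnrs bytevectors))

(define bv (make-bytevector 8 0))

;; 1.0 as a big-endian double is #x3FF0000000000000.
(bytevector-ieee-double-set! bv 0 1.0 (endianness big))
(bytevector->u8-list bv)                             ; ⇒ (63 240 0 0 0 0 0 0)
(bytevector-ieee-double-ref bv 0 (endianness big))   ; ⇒ 1.0

;; A single-precision number occupies only four bytes.
(define single-bv (make-bytevector 4 0))
(bytevector-ieee-single-set! single-bv 0 0.5 (endianness big))
(bytevector->u8-list single-bv)                           ; ⇒ (63 0 0 0)
(bytevector-ieee-single-ref single-bv 0 (endianness big)) ; ⇒ 0.5
```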


6.6.6.6 Interpreting Bytevector Contents as Unicode Strings

Bytevector contents can also be interpreted as Unicode strings encoded in one of the most commonly available encoding formats. See Representing Strings as Bytes, for a more generic interface.

(utf8->string (u8-list->bytevector '(99 97 102 101)))
⇒ "cafe"

(string->utf8 "café") ;; SMALL LATIN LETTER E WITH ACUTE ACCENT
⇒ #vu8(99 97 102 195 169)

Scheme Procedure: string->utf8 str
Scheme Procedure: string->utf16 str [endianness]
Scheme Procedure: string->utf32 str [endianness]
C Function: scm_string_to_utf8 (str)
C Function: scm_string_to_utf16 (str, endianness)
C Function: scm_string_to_utf32 (str, endianness)

Return a newly allocated bytevector that contains the UTF-8, UTF-16, or UTF-32 (also known as UCS-4) encoding of str. For UTF-16 and UTF-32, endianness should be the symbol big or little; when omitted, it defaults to big endian.

Scheme Procedure: utf8->string utf
Scheme Procedure: utf16->string utf [endianness]
Scheme Procedure: utf32->string utf [endianness]
C Function: scm_utf8_to_string (utf)
C Function: scm_utf16_to_string (utf, endianness)
C Function: scm_utf32_to_string (utf, endianness)

Return a newly allocated string that contains the UTF-8-, UTF-16-, or UTF-32-decoded contents of bytevector utf. For UTF-16 and UTF-32, endianness should be the symbol big or little; when omitted, it defaults to big endian.
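To make the effect of the optional endianness argument concrete, here is a small sketch that encodes an ASCII string as UTF-16 both ways. It assumes the (rnrs bytevectors) module; the variable names are hypothetical.

```scheme
(use-modules (rnrs bytevectors))

;; With no endianness argument, the encoding is big endian.
(define utf16-be (string->utf16 "hi"))
(bytevector->u8-list utf16-be)             ; ⇒ (0 104 0 105)

(define utf16-le (string->utf16 "hi" (endianness little)))
(bytevector->u8-list utf16-le)             ; ⇒ (104 0 105 0)

;; Decoding needs the same endianness that was used for encoding.
(utf16->string utf16-be)                       ; ⇒ "hi"
(utf16->string utf16-le (endianness little))   ; ⇒ "hi"
```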


6.6.6.7 Accessing Bytevectors with the Array API

As an extension to the R6RS, Guile allows bytevectors to be manipulated with the array procedures (see Arrays). When using these APIs, bytes are accessed one at a time as 8-bit unsigned integers:

(define bv #vu8(0 1 2 3))

(array? bv)
⇒ #t

(array-rank bv)
⇒ 1

(array-ref bv 2)
⇒ 2

;; Note the different argument order on array-set!.
(array-set! bv 77 2)
(array-ref bv 2)
⇒ 77

(array-type bv)
⇒ vu8

6.6.6.8 Accessing Bytevectors with the SRFI-4 API

Bytevectors may also be accessed with the SRFI-4 API. See SRFI-4 and Bytevectors, for more information.


6.6.7 Symbols

Symbols in Scheme are widely used in three ways: as items of discrete data, as lookup keys for alists and hash tables, and to denote variable references.

A symbol is similar to a string in that it is defined by a sequence of characters. The sequence of characters is known as the symbol’s name. In the usual case — that is, where the symbol’s name doesn’t include any characters that could be confused with other elements of Scheme syntax — a symbol is written in a Scheme program by writing the sequence of characters that make up the name, without any quotation marks or other special syntax. For example, the symbol whose name is “multiply-by-2” is written, simply:

multiply-by-2

Notice how this differs from a string with contents “multiply-by-2”, which is written with double quotation marks, like this:

"multiply-by-2"

Looking beyond how they are written, symbols are different from strings in two important respects.

The first important difference is uniqueness. If the same-looking string is read twice from two different places in a program, the result is two different string objects whose contents just happen to be the same. If, on the other hand, the same-looking symbol is read twice from two different places in a program, the result is the same symbol object both times.

Given two read symbols, you can use eq? to test whether they are the same (that is, have the same name). eq? is the most efficient comparison operator in Scheme, and comparing two symbols like this is as fast as comparing, for example, two numbers. Given two strings, on the other hand, you must use equal? or string=?, which are much slower comparison operators, to determine whether the strings have the same contents.

(define sym1 (quote hello))
(define sym2 (quote hello))
(eq? sym1 sym2) ⇒ #t

(define str1 "hello")
(define str2 "hello")
(eq? str1 str2) ⇒ #f
(equal? str1 str2) ⇒ #t

The second important difference is that symbols, unlike strings, are not self-evaluating. This is why we need the (quote …)s in the example above: (quote hello) evaluates to the symbol named "hello" itself, whereas an unquoted hello is read as the symbol named "hello" and evaluated as a variable reference … about which more below (see Symbol Variables).


6.6.7.1 Symbols as Discrete Data

Numbers and symbols are similar to the extent that they both lend themselves to eq? comparison. But symbols are more descriptive than numbers, because a symbol’s name can be used directly to describe the concept for which that symbol stands.

For example, imagine that you need to represent some colours in a computer program. Using numbers, you would have to choose arbitrarily some mapping between numbers and colours, and then take care to use that mapping consistently:

;; 1=red, 2=green, 3=purple

(if (eq? (colour-of car) 1)
    ...)

You can make the mapping more explicit and the code more readable by defining constants:

(define red 1)
(define green 2)
(define purple 3)

(if (eq? (colour-of car) red)
    ...)

But the simplest and clearest approach is not to use numbers at all, but symbols whose names specify the colours that they refer to:

(if (eq? (colour-of car) 'red)
    ...)

The descriptive advantages of symbols over numbers increase as the set of concepts that you want to describe grows. Suppose that a car object can have other properties as well, such as whether it has or uses automatic or manual transmission, leaded or unleaded fuel, and power steering (or not).

Then a car’s combined property set could be naturally represented and manipulated as a list of symbols:

(properties-of car1)
⇒
(red manual unleaded power-steering)

(if (memq 'power-steering (properties-of car1))
    (display "Unfit people can drive this car.\n")
    (display "You'll need strong arms to drive this car!\n"))
-|
Unfit people can drive this car.

Remember, the fundamental property of symbols that we are relying on here is that an occurrence of 'red in one part of a program is an indistinguishable symbol from an occurrence of 'red in another part of a program; this means that symbols can usefully be compared using eq?. At the same time, symbols have naturally descriptive names. This combination of efficiency and descriptive power makes them ideal for use as discrete data.
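The fragments above rely on undefined helpers such as colour-of and properties-of. A self-contained sketch of the same idea follows; car1-properties and car-has? are hypothetical names introduced only for illustration.

```scheme
;; A car's properties as a flat list of symbols.
(define car1-properties '(red manual unleaded power-steering))

;; memq compares elements with eq?, which is reliable for symbols.
(define (car-has? properties property)
  (if (memq property properties) #t #f))

(car-has? car1-properties 'power-steering)   ; ⇒ #t
(car-has? car1-properties 'automatic)        ; ⇒ #f
```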


6.6.7.2 Symbols as Lookup Keys

Given their efficiency and descriptive power, it is natural to use symbols as the keys in an association list or hash table.

To illustrate this, consider a more structured representation of the car properties example from the preceding subsection. Rather than mixing all the properties up together in a flat list, we could use an association list like this:

(define car1-properties '((colour . red)
                          (transmission . manual)
                          (fuel . unleaded)
                          (steering . power-assisted)))

Notice how this structure is more explicit and extensible than the flat list. For example it makes clear that manual refers to the transmission rather than, say, the windows or the locking of the car. It also allows further properties to use the same symbols among their possible values without becoming ambiguous:

(define car1-properties '((colour . red)
                          (transmission . manual)
                          (fuel . unleaded)
                          (steering . power-assisted)
                          (seat-colour . red)
                          (locking . manual)))

With a representation like this, it is easy to use the efficient assq-XXX family of procedures (see Association Lists) to extract or change individual pieces of information:

(assq-ref car1-properties 'fuel) ⇒ unleaded
(assq-ref car1-properties 'transmission) ⇒ manual

(assq-set! car1-properties 'seat-colour 'black)
⇒
((colour . red)
 (transmission . manual)
 (fuel . unleaded)
 (steering . power-assisted)
 (seat-colour . black)
 (locking . manual))

Hash tables also have keys, and exactly the same arguments apply to the use of symbols in hash tables as in association lists. The hash value that Guile uses to decide where to add a symbol-keyed entry to a hash table can be obtained by calling the symbol-hash procedure:

Scheme Procedure: symbol-hash symbol
C Function: scm_symbol_hash (symbol)

Return a hash value for symbol.

See Hash Tables for information about hash tables in general, and for why you might choose to use a hash table rather than an association list.
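For comparison with the association list version, here is a minimal sketch that stores the same kind of data in a hash table keyed by symbols. It uses the hashq family of procedures, which compare keys with eq?; the table name car1 is hypothetical.

```scheme
(define car1 (make-hash-table))

(hashq-set! car1 'colour 'red)
(hashq-set! car1 'fuel 'unleaded)

(hashq-ref car1 'colour)             ; ⇒ red
(hashq-ref car1 'locking)            ; ⇒ #f, no such key
(hashq-ref car1 'locking 'unknown)   ; ⇒ unknown, explicit default

;; The same symbol always produces the same hash value,
;; however the symbol was obtained.
(= (symbol-hash 'colour)
   (symbol-hash (string->symbol "colour")))   ; ⇒ #t
```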


6.6.7.3 Symbols as Denoting Variables

When an unquoted symbol in a Scheme program is evaluated, it is interpreted as a variable reference, and the result of the evaluation is the appropriate variable’s value.

For example, when the expression (string-length "abcd") is read and evaluated, the sequence of characters string-length is read as the symbol whose name is "string-length". This symbol is associated with a variable whose value is the procedure that implements string length calculation. Therefore evaluation of the string-length symbol results in that procedure.

The details of the connection between an unquoted symbol and the variable to which it refers are explained elsewhere. See Binding Constructs, for how associations between symbols and variables are created, and Modules, for how those associations are affected by Guile’s module system.
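The distinction between a symbol as data and a symbol as a variable reference can be sketched in a few lines; the variable name answer is arbitrary.

```scheme
(define answer 42)

answer                       ; ⇒ 42, the unquoted symbol is a variable reference
'answer                      ; ⇒ answer, the quoted symbol is the symbol itself
(symbol? 'answer)            ; ⇒ #t
(symbol? answer)             ; ⇒ #f, the variable's value is a number

;; Evaluating the symbol string-length yields a procedure.
(procedure? string-length)   ; ⇒ #t
(string-length "abcd")       ; ⇒ 4
```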


6.6.7.4 Operations Related to Symbols

Given any Scheme value, you can determine whether it is a symbol using the symbol? primitive:

Scheme Procedure: symbol? obj
C Function: scm_symbol_p (obj)

Return #t if obj is a symbol, otherwise return #f.

C Function: int scm_is_symbol (SCM val)

Equivalent to scm_is_true (scm_symbol_p (val)).

Once you know that you have a symbol, you can obtain its name as a string by calling symbol->string. Note that Guile differs by default from R5RS on the details of symbol->string as regards case-sensitivity:

Scheme Procedure: symbol->string s
C Function: scm_symbol_to_string (s)

Return the name of symbol s as a string. By default, Guile reads symbols case-sensitively, so the string returned will have the same case variation as the sequence of characters that caused s to be created.

If Guile is set to read symbols case-insensitively (as specified by R5RS), and s comes into being as part of a literal expression (see Literal expressions in The Revised^5 Report on Scheme) or by a call to the read or string-ci->symbol procedures, Guile converts any alphabetic characters in the symbol’s name to lower case before creating the symbol object, so the string returned here will be in lower case.

If s was created by string->symbol, the case of characters in the string returned will be the same as that in the string that was passed to string->symbol, regardless of Guile’s case-sensitivity setting at the time s was created.

It is an error to apply mutation procedures like string-set! to strings returned by this procedure.

Most symbols are created by writing them literally in code. However it is also possible to create symbols programmatically using the following procedures:

Scheme Procedure: symbol char…

Return a newly allocated symbol made from the given character arguments.

(symbol #\x #\y #\z) ⇒ xyz

Scheme Procedure: list->symbol lst

Return a newly allocated symbol made from a list of characters.

(list->symbol '(#\a #\b #\c)) ⇒ abc

Scheme Procedure: symbol-append arg …

Return a newly allocated symbol whose characters form the concatenation of the given symbols, arg ....

(let ((h 'hello))
  (symbol-append h 'world))
⇒ helloworld

Scheme Procedure: string->symbol string
C Function: scm_string_to_symbol (string)

Return the symbol whose name is string. This procedure can create symbols with names containing special characters or letters in the non-standard case, but it is usually a bad idea to create such symbols because in some implementations of Scheme they cannot be read as themselves.

Scheme Procedure: string-ci->symbol str
C Function: scm_string_ci_to_symbol (str)

Return the symbol whose name is str. If Guile is currently reading symbols case-insensitively, str is converted to lowercase before the returned symbol is looked up or created.

The following examples illustrate Guile’s detailed behaviour as regards the case-sensitivity of symbols:

(read-enable 'case-insensitive)   ; R5RS compliant behaviour

(symbol->string 'flying-fish)    ⇒ "flying-fish"
(symbol->string 'Martin)         ⇒ "martin"
(symbol->string
   (string->symbol "Malvina"))   ⇒ "Malvina"

(eq? 'mISSISSIppi 'mississippi)  ⇒ #t
(string->symbol "mISSISSIppi")   ⇒ mISSISSIppi
(eq? 'bitBlt (string->symbol "bitBlt")) ⇒ #f
(eq? 'LolliPop
  (string->symbol (symbol->string 'LolliPop))) ⇒ #t
(string=? "K. Harper, M.D."
  (symbol->string
    (string->symbol "K. Harper, M.D."))) ⇒ #t

(read-disable 'case-insensitive)   ; Guile default behaviour

(symbol->string 'flying-fish)    ⇒ "flying-fish"
(symbol->string 'Martin)         ⇒ "Martin"
(symbol->string
   (string->symbol "Malvina"))   ⇒ "Malvina"

(eq? 'mISSISSIppi 'mississippi)  ⇒ #f
(string->symbol "mISSISSIppi")   ⇒ mISSISSIppi
(eq? 'bitBlt (string->symbol "bitBlt")) ⇒ #t
(eq? 'LolliPop
  (string->symbol (symbol->string 'LolliPop))) ⇒ #t
(string=? "K. Harper, M.D."
  (symbol->string
    (string->symbol "K. Harper, M.D."))) ⇒ #t

From C, there are lower level functions that construct a Scheme symbol from a C string in the current locale encoding.

When you want to do more from C, you should convert between symbols and strings using scm_symbol_to_string and scm_string_to_symbol and work with the strings.

C Function: SCM scm_from_latin1_symbol (const char *name)
C Function: SCM scm_from_utf8_symbol (const char *name)

Construct and return a Scheme symbol whose name is specified by the null-terminated C string name. These are appropriate when the C string is hard-coded in the source code.

C Function: SCM scm_from_locale_symbol (const char *name)
C Function: SCM scm_from_locale_symboln (const char *name, size_t len)

Construct and return a Scheme symbol whose name is specified by name. For scm_from_locale_symbol, name must be null terminated; for scm_from_locale_symboln the length of name is specified explicitly by len.

Note that these functions should not be used when name is a C string constant, because there is no guarantee that the current locale will match that of the execution character set, used for string and character constants. Most modern C compilers use UTF-8 by default, so in such cases we recommend scm_from_utf8_symbol.

C Function: SCM scm_take_locale_symbol (char *str)
C Function: SCM scm_take_locale_symboln (char *str, size_t len)

Like scm_from_locale_symbol and scm_from_locale_symboln, respectively, but also frees str with free eventually. Thus, you can use this function when you would free str anyway immediately after creating the Scheme symbol. In certain cases, Guile can then use str directly as its internal representation.

The size of a symbol can also be obtained from C:

C Function: size_t scm_c_symbol_length (SCM sym)

Return the number of characters in sym.

Finally, some applications, especially those that generate new Scheme code dynamically, need to generate symbols for use in the generated code. The gensym primitive meets this need:

Scheme Procedure: gensym [prefix]
C Function: scm_gensym (prefix)

Create a new symbol with a name constructed from a prefix and a counter value. The string prefix can be specified as an optional argument. The default prefix is ‘ g’ (a space followed by g). The counter is increased by 1 at each call. There is no provision for resetting the counter.

The symbols generated by gensym are likely to be unique, since their names begin with a space and it is only otherwise possible to generate such symbols if a programmer goes out of their way to do so. Uniqueness can be guaranteed by instead using uninterned symbols (see Symbol Uninterned), though they can’t be usefully written out and read back in.
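A brief sketch of gensym in use follows. The exact names returned depend on the counter, so the example checks properties of the results rather than printing them; tmp-a and tmp-b are hypothetical names.

```scheme
(define tmp-a (gensym "tmp"))
(define tmp-b (gensym "tmp"))

(symbol? tmp-a)                                  ; ⇒ #t
(eq? tmp-a tmp-b)                                ; ⇒ #f, each call returns a new name
(string-prefix? "tmp" (symbol->string tmp-a))    ; ⇒ #t

;; gensym symbols are ordinary interned symbols.
(symbol-interned? tmp-a)                         ; ⇒ #t
```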


6.6.7.5 Function Slots and Property Lists

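In traditional Lisp dialects, symbols are often understood as having three kinds of value at once: a variable value, used when the symbol appears in a variable reference context; a function value, used when the symbol appears in function position (as the first element of an unquoted list); and a property list.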

Although Scheme (as one of its simplifications with respect to Lisp) does away with the distinction between variable and function namespaces, Guile currently retains some elements of the traditional structure in case they turn out to be useful when implementing translators for other languages, in particular Emacs Lisp.

Specifically, Guile symbols have two extra slots, one for a symbol’s property list, and one for its “function value.” The following procedures are provided to access these slots.

Scheme Procedure: symbol-fref symbol
C Function: scm_symbol_fref (symbol)

Return the contents of symbol’s function slot.

Scheme Procedure: symbol-fset! symbol value
C Function: scm_symbol_fset_x (symbol, value)

Set the contents of symbol’s function slot to value.

Scheme Procedure: symbol-pref symbol
C Function: scm_symbol_pref (symbol)

Return the property list currently associated with symbol.

Scheme Procedure: symbol-pset! symbol value
C Function: scm_symbol_pset_x (symbol, value)

Set symbol’s property list to value.

Scheme Procedure: symbol-property sym prop

From sym’s property list, return the value for property prop. The assumption is that sym’s property list is an association list whose keys are distinguished from each other using equal?; prop should be one of the keys in that list. If the property list has no entry for prop, symbol-property returns #f.

Scheme Procedure: set-symbol-property! sym prop val

In sym’s property list, set the value for property prop to val, or add a new entry for prop, with value val, if none already exists. For the structure of the property list, see symbol-property.

Scheme Procedure: symbol-property-remove! sym prop

From sym’s property list, remove the entry for property prop, if there is one. For the structure of the property list, see symbol-property.
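The property list procedures can be sketched as follows. The symbol widget and the property names colour and size are hypothetical, chosen only for illustration.

```scheme
;; Add an entry to widget's property list and read it back.
(set-symbol-property! 'widget 'colour 'red)
(symbol-property 'widget 'colour)    ; ⇒ red

;; A property with no entry yields #f.
(symbol-property 'widget 'size)      ; ⇒ #f

;; Remove the entry again.
(symbol-property-remove! 'widget 'colour)
(symbol-property 'widget 'colour)    ; ⇒ #f
```

Because a missing property and a property whose value is #f both yield #f, this interface cannot distinguish the two cases.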

Support for these extra slots may be removed in a future release, and it is probably better to avoid using them. For a more modern and Schemely approach to properties, see Object Properties.


6.6.7.6 Extended Read Syntax for Symbols

The read syntax for a symbol is a sequence of letters, digits, and extended alphabetic characters, beginning with a character that cannot begin a number. In addition, the special cases of +, -, and ... are read as symbols even though numbers can begin with +, -, or a period (.).

Extended alphabetic characters may be used within identifiers as if they were letters. The set of extended alphabetic characters is:

! $ % & * + - . / : < = > ? @ ^ _ ~

In addition to the standard read syntax defined above (which is taken from R5RS (see Formal syntax in The Revised^5 Report on Scheme)), Guile provides an extended symbol read syntax that allows the inclusion of unusual characters such as space characters, newlines and parentheses. If (for whatever reason) you need to write a symbol containing characters not mentioned above, you can do so by beginning the symbol with the characters #{, writing the characters of the symbol, and finishing it with the characters }#.

Here are a few examples of this form of read syntax. The first symbol needs to use extended syntax because it contains a space character, the second because it contains a line break, and the last because it looks like a number.

#{foo bar}#

#{what
ever}#

#{4242}#
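As a quick check of what these forms denote, the following sketch quotes them and inspects the results. It relies only on procedures described earlier in this section.

```scheme
;; The extended syntax reads as an ordinary symbol with an unusual name.
(symbol->string '#{foo bar}#)                     ; ⇒ "foo bar"
(eq? '#{foo bar}# (string->symbol "foo bar"))     ; ⇒ #t

;; #{4242}# is a symbol, not the number 4242.
(symbol? '#{4242}#)                               ; ⇒ #t
(symbol? 4242)                                    ; ⇒ #f
(symbol->string '#{4242}#)                        ; ⇒ "4242"
```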

Although Guile provides this extended read syntax for symbols, widespread usage of it is discouraged because it is not portable and not very readable.

Alternatively, if you enable the r7rs-symbols read option (see Scheme Read), you can write arbitrary symbols using the same notation used for strings, except delimited by vertical bars instead of double quotes.

|foo bar|
|\x3BB; is a greek lambda|
|\| is a vertical bar|

Note that there’s also an r7rs-symbols print option (see Scheme Write). To enable the use of this notation, evaluate one or both of the following expressions:

(read-enable  'r7rs-symbols)
(print-enable 'r7rs-symbols)

6.6.7.7 Uninterned Symbols

What makes symbols useful is that they are automatically kept unique. There are no two symbols that are distinct objects but have the same name. But of course, there is no rule without exception. In addition to the normal symbols that have been discussed up to now, you can also create special uninterned symbols that behave slightly differently.

To understand what is different about them and why they might be useful, we look at how normal symbols are actually kept unique.

Whenever Guile wants to find the symbol with a specific name, for example during read or when executing string->symbol, it first looks into a table of all existing symbols to find out whether a symbol with the given name already exists. When this is the case, Guile just returns that symbol. When not, a new symbol with the name is created and entered into the table so that it can be found later.

Sometimes you might want to create a symbol that is guaranteed ‘fresh’, i.e. a symbol that did not exist previously. You might also want to somehow guarantee that no one else will ever unintentionally stumble across your symbol in the future. These properties of a symbol are often needed when generating code during macro expansion. When introducing new temporary variables, you want to guarantee that they don’t conflict with variables in other people’s code.

The simplest way to arrange for this is to create a new symbol but not enter it into the global table of all symbols. That way, no one will ever get access to your symbol by chance. Symbols that are not in the table are called uninterned. Of course, symbols that are in the table are called interned.

You create new uninterned symbols with the function make-symbol. You can test whether a symbol is interned or not with symbol-interned?.

Uninterned symbols break the rule that the name of a symbol uniquely identifies the symbol object. Because of this, they cannot be written out and read back in like interned symbols. Currently, Guile has no support for reading uninterned symbols. Note that the function gensym does not return uninterned symbols for this reason.

Scheme Procedure: make-symbol name
C Function: scm_make_symbol (name)

Return a new uninterned symbol with the name name. The returned symbol is guaranteed to be unique and future calls to string->symbol will not return it.

Scheme Procedure: symbol-interned? symbol
C Function: scm_symbol_interned_p (symbol)

Return #t if symbol is interned, otherwise return #f.

For example:

(define foo-1 (string->symbol "foo"))
(define foo-2 (string->symbol "foo"))
(define foo-3 (make-symbol "foo"))
(define foo-4 (make-symbol "foo"))

(eq? foo-1 foo-2)
⇒ #t
; Two interned symbols with the same name are the same object,

(eq? foo-1 foo-3)
⇒ #f
; but a call to make-symbol with the same name returns a
; distinct object.

(eq? foo-3 foo-4)
⇒ #f
; A call to make-symbol always returns a new object, even for
; the same name.

foo-3
⇒ #<uninterned-symbol foo 8085290>
; Uninterned symbols print differently from interned symbols,

(symbol? foo-3)
⇒ #t
; but they are still symbols,

(symbol-interned? foo-3)
⇒ #f
; just not interned.


6.6.8 Keywords

Keywords are self-evaluating objects with a convenient read syntax that makes them easy to type.

Guile’s keyword support conforms to R5RS, and adds a (switchable) read syntax extension to permit keywords to begin with : as well as #:, or to end with :.



6.6.8.1 Why Use Keywords?

Keywords are useful in contexts where a program or procedure wants to be able to accept a large number of optional arguments without making its interface unmanageable.

To illustrate this, consider a hypothetical make-window procedure, which creates a new window on the screen for drawing into using some graphical toolkit. There are many parameters that the caller might like to specify, but which could also be sensibly defaulted, for example the color depth, the background color, and the width and height of the window.

If make-window did not use keywords, the caller would have to pass in a value for each possible argument, remembering the correct argument order and using a special value to indicate the default value for that argument:

(make-window 'default              ;; Color depth
             'default              ;; Background color
             800                   ;; Width
             100                   ;; Height
             …)                  ;; More make-window arguments

With keywords, on the other hand, defaulted arguments are omitted, and non-default arguments are clearly tagged by the appropriate keyword. As a result, the invocation becomes much clearer:

(make-window #:width 800 #:height 100)

On the other hand, for a simpler procedure with few arguments, the use of keywords would be a hindrance rather than a help. The primitive procedure cons, for example, would not be improved if it had to be invoked as

(cons #:car x #:cdr y)

So the decision whether to use keywords or not is purely pragmatic: use them if they will clarify the procedure invocation at point of call.



6.6.8.2 Coding With Keywords

If a procedure wants to support keywords, it should take a rest argument and then use whatever means is convenient to extract keywords and their corresponding arguments from the contents of that rest argument.

The following example illustrates the principle: the code for make-window uses a helper procedure called get-keyword-value to extract individual keyword arguments from the rest argument.

(define (get-keyword-value args keyword default)
  (let ((kv (memq keyword args)))
    (if (and kv (>= (length kv) 2))
        (cadr kv)
        default)))

(define (make-window . args)
  (let ((depth  (get-keyword-value args #:depth  screen-depth))
        (bg     (get-keyword-value args #:bg     "white"))
        (width  (get-keyword-value args #:width  800))
        (height (get-keyword-value args #:height 100))
        …)
    …))
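The helper can be tried on its own with a smaller procedure. The following is a minimal sketch: window-size is a hypothetical stand-in for make-window (it is not part of Guile) that simply returns the width and height it resolved, so the defaulting behaviour is visible.

```scheme
;; Same helper as above: find KEYWORD in ARGS and return the value
;; that follows it, or DEFAULT when the keyword is absent.
(define (get-keyword-value args keyword default)
  (let ((kv (memq keyword args)))
    (if (and kv (>= (length kv) 2))
        (cadr kv)
        default)))

;; Hypothetical stand-in for make-window: returns (width height).
(define (window-size . args)
  (list (get-keyword-value args #:width  800)
        (get-keyword-value args #:height 100)))

(window-size)                           ; ⇒ (800 100)
(window-size #:height 600)              ; ⇒ (800 600)
(window-size #:height 600 #:width 640)  ; ⇒ (640 600)
```

Note that the keyword arguments may be given in any order, because each lookup searches the whole rest list.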

But you don’t need to write get-keyword-value. The (ice-9 optargs) module provides a set of powerful macros that you can use to implement keyword-supporting procedures like this:

(use-modules (ice-9 optargs))

(define (make-window . args)
  (let-keywords args #f ((depth  screen-depth)
                         (bg     "white")
                         (width  800)
                         (height 100))
    ...))

Or, even more economically, like this:

(use-modules (ice-9 optargs))

(define* (make-window #:key (depth  screen-depth)
                            (bg     "white")
                            (width  800)
                            (height 100))
  ...)

For further details on let-keywords, define* and other facilities provided by the (ice-9 optargs) module, see Optional Arguments.
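As a runnable sketch of the define* form, the version below returns its resolved arguments as a list instead of opening a window. The value given to screen-depth and the list-returning body are assumptions made purely for illustration.

```scheme
(use-modules (ice-9 optargs))

;; Hypothetical default; a real toolkit would query the display.
(define screen-depth 24)

;; Illustrative body: return the resolved arguments as a list.
(define* (make-window #:key (depth  screen-depth)
                            (bg     "white")
                            (width  800)
                            (height 100))
  (list depth bg width height))

(make-window)                            ; ⇒ (24 "white" 800 100)
(make-window #:height 600 #:bg "black")  ; ⇒ (24 "black" 800 600)
```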

To handle keyword arguments from procedures implemented in C, use scm_c_bind_keyword_arguments (see Keyword Procedures).



6.6.8.3 Keyword Read Syntax

Guile, by default, only recognizes a keyword syntax that is compatible with R5RS. A token of the form #:NAME, where NAME has the same syntax as a Scheme symbol (see Symbol Read Syntax), is the external representation of the keyword named NAME. Keyword objects print using this syntax as well, so values containing keyword objects can be read back into Guile. When used in an expression, keywords are self-quoting objects.

If the keyword read option is set to 'prefix, Guile also recognizes the alternative read syntax :NAME. Otherwise, tokens of the form :NAME are read as symbols, as required by R5RS.

If the keyword read option is set to 'postfix, Guile recognizes the SRFI-88 read syntax NAME: (see SRFI-88). Otherwise, tokens of this form are read as symbols.

To enable and disable the alternative non-R5RS keyword syntax, you use the read-set! procedure documented in Scheme Read. Note that the prefix and postfix syntax are mutually exclusive.

(read-set! keywords 'prefix)

#:type
⇒
#:type

:type
⇒
#:type

(read-set! keywords 'postfix)

type:
⇒
#:type

:type
⇒
:type

(read-set! keywords #f)

#:type
⇒
#:type

:type
-|
ERROR: In expression :type:
ERROR: Unbound variable: :type
ABORT: (unbound-variable)


6.6.8.4 Keyword Procedures

Scheme Procedure: keyword? obj
C Function: scm_keyword_p (obj)

Return #t if the argument obj is a keyword, else #f.

Scheme Procedure: keyword->symbol keyword
C Function: scm_keyword_to_symbol (keyword)

Return the symbol with the same name as keyword.

Scheme Procedure: symbol->keyword symbol
C Function: scm_symbol_to_keyword (symbol)

Return the keyword with the same name as symbol.
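A short sketch of the three Scheme procedures above, using only the default #: read syntax:

```scheme
(keyword? #:width)                      ; ⇒ #t
(keyword? 'width)                       ; ⇒ #f

(keyword->symbol #:width)               ; ⇒ width
(symbol->keyword 'width)                ; ⇒ #:width

;; Converting a symbol yields the same keyword object as the literal.
(eq? (symbol->keyword 'width) #:width)  ; ⇒ #t
```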

C Function: int scm_is_keyword (SCM obj)

Equivalent to scm_is_true (scm_keyword_p (obj)).

C Function: SCM scm_from_locale_keyword (const char *name)
C Function: SCM scm_from_locale_keywordn (const char *name, size_t len)

Equivalent to scm_symbol_to_keyword (scm_from_locale_symbol (name)) and scm_symbol_to_keyword (scm_from_locale_symboln (name, len)), respectively.

Note that these functions should not be used when name is a C string constant, because there is no guarantee that the current locale will match that of the execution character set, used for string and character constants. Most modern C compilers use UTF-8 by default, so in such cases we recommend scm_from_utf8_keyword.

C Function: SCM scm_from_latin1_keyword (const char *name)
C Function: SCM scm_from_utf8_keyword (const char *name)

Equivalent to scm_symbol_to_keyword (scm_from_latin1_symbol (name)) and scm_symbol_to_keyword (scm_from_utf8_symbol (name)), respectively.

C Function: void scm_c_bind_keyword_arguments (const char *subr, SCM rest, scm_t_keyword_arguments_flags flags, SCM keyword1, SCM *argp1, …, SCM keywordN, SCM *argpN, SCM_UNDEFINED)

Extract the specified keyword arguments from rest, which is not modified. If the keyword argument keyword1 is present in rest with an associated value, that value is stored in the variable pointed to by argp1, otherwise the variable is left unchanged. Similarly for the other keywords and argument pointers up to keywordN and argpN. The argument list to scm_c_bind_keyword_arguments must be terminated by SCM_UNDEFINED.

Note that since the variables pointed to by argp1 through argpN are left unchanged if the associated keyword argument is not present, they should be initialized to their default values before calling scm_c_bind_keyword_arguments. Alternatively, you can initialize them to SCM_UNDEFINED before the call, and then use SCM_UNBNDP after the call to see which ones were provided.

If an unrecognized keyword argument is present in rest and flags does not contain SCM_ALLOW_OTHER_KEYS, or if non-keyword arguments are present and flags does not contain SCM_ALLOW_NON_KEYWORD_ARGUMENTS, an exception is raised. subr should be the name of the procedure receiving the keyword arguments, for purposes of error reporting.

For example:

SCM k_delimiter;
SCM k_grammar;
SCM sym_infix;

SCM my_string_join (SCM strings, SCM rest)
{
  SCM delimiter = SCM_UNDEFINED;
  SCM grammar   = sym_infix;

  scm_c_bind_keyword_arguments ("my-string-join", rest, 0,
                                k_delimiter, &delimiter,
                                k_grammar, &grammar,
                                SCM_UNDEFINED);

  if (SCM_UNBNDP (delimiter))
    delimiter = scm_from_utf8_string (" ");

  return scm_string_join (strings, delimiter, grammar);
}

void my_init ()
{
  k_delimiter = scm_from_utf8_keyword ("delimiter");
  k_grammar   = scm_from_utf8_keyword ("grammar");
  sym_infix   = scm_from_utf8_symbol  ("infix");
  scm_c_define_gsubr ("my-string-join", 1, 0, 1, my_string_join);
}


6.6.9 “Functionality-Centric” Data Types

Procedures and macros are documented in their own sections: see Procedures and Macros.

Variable objects are documented as part of the description of Guile’s module system: see Variables.

Asyncs, dynamic roots and fluids are described in the section on scheduling: see Scheduling.

Hooks are documented in the section on general utility functions: see Hooks.

Ports are described in the section on I/O: see Input and Output.

Regular expressions are described in their own section: see Regular Expressions.



6.7 Compound Data Types

This chapter describes Guile’s compound data types. By compound we mean that the primary purpose of these data types is to act as containers for other kinds of data (including other compound objects). For instance, a (non-uniform) vector with length 5 is a container that can hold five arbitrary Scheme objects.

The various kinds of container object differ from each other in how their memory is allocated, how they are indexed, and how particular values can be looked up within them.



6.7.1 Pairs

Pairs are used to combine two Scheme objects into one compound object. Hence the name: A pair stores a pair of objects.

The data type pair is extremely important in Scheme, just like in any other Lisp dialect. The reason is that pairs are not only used to make two values available as one object, but that pairs are used for constructing lists of values. Because lists are so important in Scheme, they are described in a section of their own (see Lists).

Pairs can be entered literally in source code or at the REPL, in the so-called dotted list syntax. This syntax consists of an opening parenthesis, the first element of the pair, a dot, the second element and a closing parenthesis. The following example shows how a pair consisting of the two numbers 1 and 2, and a pair containing the symbols foo and bar can be entered. It is very important to write the whitespace before and after the dot, because otherwise the Scheme parser would not be able to figure out where to split the tokens.

(1 . 2)
(foo . bar)

But beware, if you want to try out these examples, you have to quote the expressions. More information about quotation is available in the section Expression Syntax. The correct way to try these examples is as follows.

'(1 . 2)
⇒
(1 . 2)
'(foo . bar)
⇒
(foo . bar)

A new pair is made by calling the procedure cons with two arguments. Then the argument values are stored into a newly allocated pair, and the pair is returned. The name cons stands for "construct". Use the procedure pair? to test whether a given Scheme object is a pair or not.

Scheme Procedure: cons x y
C Function: scm_cons (x, y)

Return a newly allocated pair whose car is x and whose cdr is y. The pair is guaranteed to be different (in the sense of eq?) from every previously existing object.

Scheme Procedure: pair? x
C Function: scm_pair_p (x)

Return #t if x is a pair; otherwise return #f.

C Function: int scm_is_pair (SCM x)

Return 1 when x is a pair; otherwise return 0.

The two parts of a pair are traditionally called car and cdr. They can be retrieved with procedures of the same name (car and cdr), and can be modified with the procedures set-car! and set-cdr!.

Since a very common operation in Scheme programs is to access the car of a car of a pair, or the car of the cdr of a pair, etc., the procedures called caar, cadr and so on are also predefined. However, using these procedures is often detrimental to readability, and error-prone. Thus, accessing the contents of a list is usually better achieved using pattern matching techniques (see Pattern Matching).

Scheme Procedure: car pair
Scheme Procedure: cdr pair
C Function: scm_car (pair)
C Function: scm_cdr (pair)

Return the car or the cdr of pair, respectively.

C Macro: SCM SCM_CAR (SCM pair)
C Macro: SCM SCM_CDR (SCM pair)

These two macros are the fastest way to access the car or cdr of a pair; they can be thought of as compiling into a single memory reference.

These macros do no checking at all. The argument pair must be a valid pair.

Scheme Procedure: cddr pair
Scheme Procedure: cdar pair
Scheme Procedure: cadr pair
Scheme Procedure: caar pair
Scheme Procedure: cdddr pair
Scheme Procedure: cddar pair
Scheme Procedure: cdadr pair
Scheme Procedure: cdaar pair
Scheme Procedure: caddr pair
Scheme Procedure: cadar pair
Scheme Procedure: caadr pair
Scheme Procedure: caaar pair
Scheme Procedure: cddddr pair
Scheme Procedure: cdddar pair
Scheme Procedure: cddadr pair
Scheme Procedure: cddaar pair
Scheme Procedure: cdaddr pair
Scheme Procedure: cdadar pair
Scheme Procedure: cdaadr pair
Scheme Procedure: cdaaar pair
Scheme Procedure: cadddr pair
Scheme Procedure: caddar pair
Scheme Procedure: cadadr pair
Scheme Procedure: cadaar pair
Scheme Procedure: caaddr pair
Scheme Procedure: caadar pair
Scheme Procedure: caaadr pair
Scheme Procedure: caaaar pair
C Function: scm_cddr (pair)
C Function: scm_cdar (pair)
C Function: scm_cadr (pair)
C Function: scm_caar (pair)
C Function: scm_cdddr (pair)
C Function: scm_cddar (pair)
C Function: scm_cdadr (pair)
C Function: scm_cdaar (pair)
C Function: scm_caddr (pair)
C Function: scm_cadar (pair)
C Function: scm_caadr (pair)
C Function: scm_caaar (pair)
C Function: scm_cddddr (pair)
C Function: scm_cdddar (pair)
C Function: scm_cddadr (pair)
C Function: scm_cddaar (pair)
C Function: scm_cdaddr (pair)
C Function: scm_cdadar (pair)
C Function: scm_cdaadr (pair)
C Function: scm_cdaaar (pair)
C Function: scm_cadddr (pair)
C Function: scm_caddar (pair)
C Function: scm_cadadr (pair)
C Function: scm_cadaar (pair)
C Function: scm_caaddr (pair)
C Function: scm_caadar (pair)
C Function: scm_caaadr (pair)
C Function: scm_caaaar (pair)

These procedures are compositions of car and cdr, where for example caddr could be defined by

(define caddr (lambda (x) (car (cdr (cdr x)))))

cadr, caddr and cadddr pick out the second, third or fourth elements of a list, respectively. SRFI-1 provides the same under the names second, third and fourth (see SRFI-1 Selectors).
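As an illustration, the sketch below reads the same three-element list first with the composed accessors and then with the match macro from the (ice-9 match) module, which names all the parts at once. The variable point is hypothetical.

```scheme
(use-modules (ice-9 match))

(define point '(1 2 3))

(car point)    ; ⇒ 1
(cadr point)   ; ⇒ 2
(caddr point)  ; ⇒ 3

;; The same access with pattern matching: each element gets a name.
(match point
  ((x y z) (+ x y z)))  ; ⇒ 6
```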

Scheme Procedure: set-car! pair value
C Function: scm_set_car_x (pair, value)

Stores value in the car field of pair. The value returned by set-car! is unspecified.

Scheme Procedure: set-cdr! pair value
C Function: scm_set_cdr_x (pair, value)

Stores value in the cdr field of pair. The value returned by set-cdr! is unspecified.
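The following sketch puts the constructor, accessors and mutators together. The pair is built with cons rather than written as a literal, so that modifying it is unproblematic.

```scheme
(define p (cons 1 2))

(pair? p)        ; ⇒ #t
(car p)          ; ⇒ 1
(cdr p)          ; ⇒ 2

(set-car! p 'a)  ; replace the first field
(set-cdr! p 'b)  ; replace the second field
p                ; ⇒ (a . b)
```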



6.7.2 Lists

A very important data type in Scheme, as well as in all other Lisp dialects, is the data type list.[6]

This is the short definition of what a list is: either the empty list (), or a pair whose cdr is a list.



6.7.2.1 List Read Syntax

The syntax for lists is an opening parenthesis, then all the elements of the list (separated by whitespace) and finally a closing parenthesis.[7]

(1 2 3)            ; a list of the numbers 1, 2 and 3
("foo" bar 3.1415) ; a string, a symbol and a real number
()                 ; the empty list

The last example needs a bit more explanation. A list with no elements, called the empty list, is special in some ways. It is used for terminating lists by storing it into the cdr of the last pair that makes up a list. An example will clear that up:

(car '(1))
⇒
1
(cdr '(1))
⇒
()

This example also shows that lists have to be quoted when written (see Expression Syntax), because they would otherwise be mistakenly taken as procedure applications (see Simple Invocation).



6.7.2.2 List Predicates

Often it is useful to test whether a given Scheme object is a list or not. List-processing procedures could use this information to test whether their input is valid, or they could do different things depending on the datatype of their arguments.

Scheme Procedure: list? x
C Function: scm_list_p (x)

Return #t if x is a proper list, else #f.

The predicate null? is often used in list-processing code to tell whether a given list has run out of elements. That is, a loop somehow deals with the elements of a list until the list satisfies null?. Then, the algorithm terminates.

Scheme Procedure: null? x
C Function: scm_null_p (x)

Return #t if x is the empty list, else #f.

C Function: int scm_is_null (SCM x)

Return 1 when x is the empty list; otherwise return 0.
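The difference between list?, pair? and null?, and the typical null?-terminated loop, can be sketched as follows. The procedure sum is a hypothetical example, not a Guile primitive.

```scheme
(list? '(1 2 3))    ; ⇒ #t
(list? '(1 2 . 3))  ; ⇒ #f, an improper list
(pair? '(1 2 . 3))  ; ⇒ #t, but it is still a pair
(list? '())         ; ⇒ #t
(pair? '())         ; ⇒ #f
(null? '())         ; ⇒ #t

;; Walk down the list until null? reports that it is exhausted.
(define (sum lst)
  (let loop ((rest lst) (acc 0))
    (if (null? rest)
        acc
        (loop (cdr rest) (+ acc (car rest))))))

(sum '(1 2 3))      ; ⇒ 6
```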



6.7.2.3 List Constructors

This section describes the procedures for constructing new lists. list simply returns a list where the elements are the arguments, cons* is similar, but the last argument is stored in the cdr of the last pair of the list.

Scheme Procedure: list elem …
C Function: scm_list_1 (elem1)
C Function: scm_list_2 (elem1, elem2)
C Function: scm_list_3 (elem1, elem2, elem3)
C Function: scm_list_4 (elem1, elem2, elem3, elem4)
C Function: scm_list_5 (elem1, elem2, elem3, elem4, elem5)
C Function: scm_list_n (elem1, …, elemN, SCM_UNDEFINED)

Return a new list containing the elements elem ….

scm_list_n takes a variable number of arguments, terminated by the special SCM_UNDEFINED. That final SCM_UNDEFINED is not included in the list. None of elem … can themselves be SCM_UNDEFINED, or scm_list_n will terminate at that point.

Scheme Procedure: cons* arg1 arg2 …

Like list, but the last arg provides the tail of the constructed list, returning (cons arg1 (cons arg2 (cons … argn))). Requires at least one argument. If given one argument, that argument is returned as result. This function is called list* in some other Schemes and in Common LISP.

Scheme Procedure: list-copy lst
C Function: scm_list_copy (lst)

Return a (newly-created) copy of lst.

Scheme Procedure: make-list n [init]

Create a list of n elements, where each element is initialized to init. init defaults to the empty list () if not given.

Note that list-copy only makes a copy of the pairs which make up the spine of the list. The list elements are not copied, which means that modifying the elements of the new list also modifies the elements of the old list. On the other hand, applying procedures like set-cdr! or delv! to the new list will not alter the old list. If you also need to copy the list elements (making a deep copy), use the procedure copy-tree (see Copying).
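A brief sketch of the constructors, including the shallow-copy behaviour of list-copy. The names orig and copy are chosen for illustration only.

```scheme
(list 1 2 3)          ; ⇒ (1 2 3)
(cons* 1 2 '(3 4))    ; ⇒ (1 2 3 4)
(cons* 1 2 3)         ; ⇒ (1 2 . 3)
(cons* 1)             ; ⇒ 1
(make-list 3 'x)      ; ⇒ (x x x)

(define orig (list (list 1 2) (list 3 4)))
(define copy (list-copy orig))

;; The elements are shared, so changing one is visible through both lists.
(set-car! (car copy) 'changed)
orig                  ; ⇒ ((changed 2) (3 4))

;; The spine is not shared, so truncating the copy leaves orig intact.
(set-cdr! copy '())
copy                  ; ⇒ ((changed 2))
(length orig)         ; ⇒ 2
```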



6.7.2.4 List Selection

These procedures are used to get some information about a list, or to retrieve one or more elements of a list.

Scheme Procedure: length lst
C Function: scm_length (lst)

Return the number of elements in list lst.

Scheme Procedure: last-pair lst
C Function: scm_last_pair (lst)

Return the last pair in lst, signalling an error if lst is circular.

Scheme Procedure: list-ref list k
C Function: scm_list_ref (list, k)

Return the kth element from list.

Scheme Procedure: list-tail lst k
Scheme Procedure: list-cdr-ref lst k
C Function: scm_list_tail (lst, k)

Return the "tail" of lst beginning with its kth element. The first element of the list is considered to be element 0.

list-tail and list-cdr-ref are identical. It may help to think of list-cdr-ref as accessing the kth cdr of the list, or returning the results of cdring k times down lst.

Scheme Procedure: list-head lst k
C Function: scm_list_head (lst, k)

Copy the first k elements from lst into a new list, and return it.
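The selection procedures can be compared side by side on one list. Indices are zero-based throughout. The variable letters is hypothetical.

```scheme
(define letters '(a b c d e))

(length letters)       ; ⇒ 5
(last-pair letters)    ; ⇒ (e)
(list-ref letters 2)   ; ⇒ c
(list-tail letters 2)  ; ⇒ (c d e)
(list-head letters 2)  ; ⇒ (a b)
```

For any valid k, appending (list-head lst k) and (list-tail lst k) gives back a list equal? to lst.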



6.7.2.5 Append and Reverse

append and append! are used to concatenate two or more lists in order to form a new list. reverse and reverse! return lists with the same elements as their arguments, but in reverse order. The procedure variants with an ! directly modify the pairs which form the list, whereas the other procedures create new pairs. This is why you should be careful when using the side-effecting variants.

Scheme Procedure: append lst … obj
Scheme Procedure: append
Scheme Procedure: append! lst … obj
Scheme Procedure: append!
C Function: scm_append (lstlst)
C Function: scm_append_x (lstlst)

Return a list comprising all the elements of lists lst … obj. If called with no arguments, return the empty list.

(append '(x) '(y))          ⇒  (x y)
(append '(a) '(b c d))      ⇒  (a b c d)
(append '(a (b)) '((c)))    ⇒  (a (b) (c))

The last argument obj may actually be any object; an improper list results if the last argument is not a proper list.

(append '(a b) '(c . d))    ⇒  (a b c . d)
(append '() 'a)             ⇒  a

append doesn’t modify the given lists, but the return may share structure with the final obj. append! is permitted, but not required, to modify the given lists to form its return.

For scm_append and scm_append_x, lstlst is a list of the list operands lst … obj. That lstlst itself is not modified or used in the return.

Scheme Procedure: reverse lst
Scheme Procedure: reverse! lst [newtail]
C Function: scm_reverse (lst)
C Function: scm_reverse_x (lst, newtail)

Return a list comprising the elements of lst, in reverse order.

reverse constructs a new list. reverse! is permitted, but not required, to modify lst in constructing its return.

For reverse!, the optional newtail is appended to the result. newtail isn’t reversed, it simply becomes the list tail. For scm_reverse_x, the newtail parameter is mandatory, but can be SCM_EOL if no further tail is required.
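The sketch below contrasts the non-destructive and destructive variants. Because append! and reverse! are only permitted, not required, to reuse the argument pairs, the safe habit is to use their return value and not to rely on the old variables afterwards. The lists are built with list so that the destructive calls never touch literal constants.

```scheme
(define a (list 1 2))
(define b (list 3 4))

(append a b)                  ; ⇒ (1 2 3 4)
a                             ; ⇒ (1 2), unchanged

(reverse '(1 2 3))            ; ⇒ (3 2 1)

;; Destructive variants: always use the value they return.
(define joined (append! (list 1 2) (list 3 4)))
joined                        ; ⇒ (1 2 3 4)

;; newtail is not reversed; it simply becomes the tail of the result.
(reverse! (list 1 2 3) '(0))  ; ⇒ (3 2 1 0)
```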



6.7.2.6 List Modification

The following procedures modify an existing list, either by changing elements of the list, or by changing the list structure itself.

Scheme Procedure: list-set! list k val
C Function: scm_list_set_x (list, k, val)

Set the kth element of list to val.

Scheme Procedure: list-cdr-set! list k val
C Function: scm_list_cdr_set_x (list, k, val)

Set the kth cdr of list to val.

Scheme Procedure: delq item lst
C Function: scm_delq (item, lst)

Return a newly-created copy of lst with elements eq? to item removed. This procedure mirrors memq: delq compares elements of lst against item with eq?.

Scheme Procedure: delv item lst
C Function: scm_delv (item, lst)

Return a newly-created copy of lst with elements eqv? to item removed. This procedure mirrors memv: delv compares elements of lst against item with eqv?.

Scheme Procedure: delete item lst
C Function: scm_delete (item, lst)

Return a newly-created copy of lst with elements equal? to item removed. This procedure mirrors member: delete compares elements of lst against item with equal?.

See also SRFI-1 which has an extended delete (SRFI-1 Deleting), and also an lset-difference which can delete multiple items in one call (SRFI-1 Set Operations).

Scheme Procedure: delq! item lst
Scheme Procedure: delv! item lst
Scheme Procedure: delete! item lst
C Function: scm_delq_x (item, lst)
C Function: scm_delv_x (item, lst)
C Function: scm_delete_x (item, lst)

These procedures are destructive versions of delq, delv and delete: they modify the pointers in the existing lst rather than creating a new list. Caveat evaluator: Like other destructive list functions, these functions cannot modify the binding of lst, and so cannot be used to delete the first element of lst destructively.

Scheme Procedure: delq1! item lst
C Function: scm_delq1_x (item, lst)

Like delq!, but only deletes the first occurrence of item from lst. Tests for equality using eq?. See also delv1! and delete1!.

Scheme Procedure: delv1! item lst
C Function: scm_delv1_x (item, lst)

Like delv!, but only deletes the first occurrence of item from lst. Tests for equality using eqv?. See also delq1! and delete1!.

Scheme Procedure: delete1! item lst
C Function: scm_delete1_x (item, lst)

Like delete!, but only deletes the first occurrence of item from lst. Tests for equality using equal?. See also delq1! and delv1!.

Scheme Procedure: filter pred lst
Scheme Procedure: filter! pred lst

Return a list containing all elements from lst which satisfy the predicate pred. The elements in the result list have the same order as in lst. The order in which pred is applied to the list elements is not specified.

filter does not change lst, but the result may share a tail with it. filter! may modify lst to construct its return.
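The following sketch shows the copying and destructive deletion procedures next to each other, including the caveat about deleting the first element: the result of delv! has to be assigned back to the variable. The variable items is hypothetical.

```scheme
(define items (list 1 2 3 2 1))

(list-set! items 0 'x)
items                             ; ⇒ (x 2 3 2 1)

(delv 2 items)                    ; ⇒ (x 3 1)
items                             ; ⇒ (x 2 3 2 1), delv made a copy

;; delv! cannot rebind items, so assign its result back explicitly.
(set! items (delv! 'x items))
items                             ; ⇒ (2 3 2 1)

;; delete compares with equal?, so it also works for strings.
(delete "b" (list "a" "b" "c"))   ; ⇒ ("a" "c")

(filter odd? '(1 2 3 4 5))        ; ⇒ (1 3 5)
```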



6.7.2.7 List Searching

The following procedures search lists for particular elements. They use different comparison predicates for comparing list elements with the object to be searched. When they fail, they return #f, otherwise they return the sublist whose car is equal to the search object, where equality depends on the equality predicate used.

Scheme Procedure: memq x lst
C Function: scm_memq (x, lst)

Return the first sublist of lst whose car is eq? to x where the sublists of lst are the non-empty lists returned by (list-tail lst k) for k less than the length of lst. If x does not occur in lst, then #f (not the empty list) is returned.

Scheme Procedure: memv x lst
C Function: scm_memv (x, lst)

Return the first sublist of lst whose car is eqv? to x where the sublists of lst are the non-empty lists returned by (list-tail lst k) for k less than the length of lst. If x does not occur in lst, then #f (not the empty list) is returned.

Scheme Procedure: member x lst
C Function: scm_member (x, lst)

Return the first sublist of lst whose car is equal? to x where the sublists of lst are the non-empty lists returned by (list-tail lst k) for k less than the length of lst. If x does not occur in lst, then #f (not the empty list) is returned.

See also SRFI-1 which has an extended member function (SRFI-1 Searching).
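A short sketch of the three search procedures. Since every sublist is a true value and a failed search returns #f, the result can be used directly as a condition. The procedure has-flag? is a hypothetical helper.

```scheme
(memq 'c '(a b c d))             ; ⇒ (c d)
(memq 'z '(a b c d))             ; ⇒ #f
(memv 2 '(1 2 3))                ; ⇒ (2 3)
(member "b" '("a" "b" "c"))      ; ⇒ ("b" "c")

;; Using the result as a boolean.
(define (has-flag? flag flags)
  (if (memq flag flags) #t #f))

(has-flag? 'verbose '(debug verbose))  ; ⇒ #t
(has-flag? 'quiet   '(debug verbose))  ; ⇒ #f
```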



6.7.2.8 List Mapping

List processing is very convenient in Scheme because the process of iterating over the elements of a list can be highly abstracted. The procedures in this section are the most basic iterating procedures for lists. They take a procedure and one or more lists as arguments, and apply the procedure to each element of the list. They differ in their return value.

Scheme Procedure: map proc arg1 arg2 …
Scheme Procedure: map-in-order proc arg1 arg2 …
C Function: scm_map (proc, arg1, args)

Apply proc to each element of the list arg1 (if only two arguments are given), or to the corresponding elements of the argument lists (if more than two arguments are given). The result(s) of the procedure applications are saved and returned in a list. For map, the order of procedure applications is not specified; map-in-order applies the procedure from left to right to the list elements.

Scheme Procedure: for-each proc arg1 arg2 …

Like map, but the procedure is always applied from left to right, and the result(s) of the procedure applications are thrown away. The return value is not specified.

See also SRFI-1 which extends these functions to take lists of unequal lengths (SRFI-1 Fold and Map).
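The difference in return values can be sketched as follows: map collects results, whereas for-each is called only for its side effects. The variable seen is a hypothetical accumulator.

```scheme
(map (lambda (x) (* x x)) '(1 2 3))  ; ⇒ (1 4 9)
(map + '(1 2 3) '(10 20 30))         ; ⇒ (11 22 33)

;; for-each runs left to right; here each element is pushed onto seen.
(define seen '())
(for-each (lambda (x) (set! seen (cons x seen)))
          '(a b c))
seen                                 ; ⇒ (c b a)
```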



6.7.3 Vectors

Vectors are sequences of Scheme objects. Unlike lists, the length of a vector, once the vector is created, cannot be changed. The advantage of vectors over lists is that the time required to access one element of a vector given its position (synonymous with index), a zero-origin number, is constant, whereas lists have an access time linear in the position of the accessed element in the list.

Vectors can contain any kind of Scheme object; it is even possible to have different types of objects in the same vector. For vectors containing vectors, you may wish to use arrays, instead. Note, too, that vectors are the special case of one dimensional non-uniform arrays and that most array procedures operate happily on vectors (see Arrays).

Also see SRFI-43, for a comprehensive vector library.



6.7.3.1 Read Syntax for Vectors

Vectors can literally be entered in source code, just like strings, characters or some of the other data types. The read syntax for vectors is as follows: A sharp sign (#), followed by an opening parenthesis, all elements of the vector in their respective read syntax, and finally a closing parenthesis. Like strings, vectors do not have to be quoted.

The following are examples of the read syntax for vectors. The first vector contains only numbers, and the second contains three different object types: a string, a symbol and a number in hexadecimal notation.

#(1 2 3)
#("Hello" foo #xdeadbeef)


6.7.3.2 Dynamic Vector Creation and Validation

Instead of creating a vector implicitly by using the read syntax just described, you can create a vector dynamically by calling one of the vector and list->vector primitives with the list of Scheme values that you want to place into a vector. The size of the vector thus created is determined implicitly by the number of arguments given.

Scheme Procedure: vector arg …
Scheme Procedure: list->vector l
C Function: scm_vector (l)

Return a newly allocated vector composed of the given arguments. Analogous to list.

(vector 'a 'b 'c) ⇒ #(a b c)

The inverse operation is vector->list:

Scheme Procedure: vector->list v
C Function: scm_vector_to_list (v)

Return a newly allocated list composed of the elements of v.

(vector->list #(dah dah didah)) ⇒  (dah dah didah)
(list->vector '(dididit dah)) ⇒  #(dididit dah)

To allocate a vector with an explicitly specified size, use make-vector. With this primitive you can also specify an initial value for the vector elements (the same value for all elements, that is):

Scheme Procedure: make-vector len [fill]
C Function: scm_make_vector (len, fill)

Return a newly allocated vector of len elements. If a second argument is given, then each position is initialized to fill. Otherwise the initial contents of each position are unspecified.

C Function: SCM scm_c_make_vector (size_t k, SCM fill)

Like scm_make_vector, but the length is given as a size_t.
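The following sketch shows make-vector with a fill value. The names zeros and rows are illustrative. Note that the same fill object is stored in every position; it is not copied for each element.

```scheme
;; A vector of three zeros.
(define zeros (make-vector 3 0))
(vector->list zeros)                            ; ⇒ (0 0 0)

;; Every position refers to the very same fill object.
(define rows (make-vector 2 (list 'x)))
(eq? (vector-ref rows 0) (vector-ref rows 1))   ; ⇒ #t
```

If each position needs its own fresh object, one option is to fill the positions individually with vector-set! after allocation.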

To check whether an arbitrary Scheme value is a vector, use the vector? primitive:

Scheme Procedure: vector? obj
C Function: scm_vector_p (obj)

Return #t if obj is a vector, otherwise return #f.

C Function: int scm_is_vector (SCM obj)

Return non-zero when obj is a vector, otherwise return zero.



6.7.3.3 Accessing and Modifying Vector Contents

vector-length and vector-ref return information about a given vector, respectively its size and the elements that are contained in the vector.

Scheme Procedure: vector-length vector
C Function: scm_vector_length (vector)

Return the number of elements in vector as an exact integer.

C Function: size_t scm_c_vector_length (SCM vec)

Return the number of elements in vec as a size_t.

Scheme Procedure: vector-ref vec k
C Function: scm_vector_ref (vec, k)

Return the contents of position k of vec. k must be a valid index of vec.

(vector-ref #(1 1 2 3 5 8 13 21) 5) ⇒ 8
(vector-ref #(1 1 2 3 5 8 13 21)
            (let ((i (round (* 2 (acos -1)))))
              (if (inexact? i)
                  (inexact->exact i)
                  i))) ⇒ 13

C Function: SCM scm_c_vector_ref (SCM vec, size_t k)

Return the contents of position k (a size_t) of vec.

A vector created by one of the dynamic vector constructor procedures (see Vector Creation) can be modified using the following procedures.

NOTE: According to R5RS, it is an error to use any of these procedures on a literally read vector, because such vectors should be considered as constants. Currently, however, Guile does not detect this error.

Scheme Procedure: vector-set! vec k obj
C Function: scm_vector_set_x (vec, k, obj)

Store obj in position k of vec. k must be a valid index of vec. The value returned by ‘vector-set!’ is unspecified.

(let ((vec (vector 0 '(2 2 2 2) "Anna")))
  (vector-set! vec 1 '("Sue" "Sue"))
  vec) ⇒  #(0 ("Sue" "Sue") "Anna")

C Function: void scm_c_vector_set_x (SCM vec, size_t k, SCM obj)

Store obj in position k (a size_t) of vec.

Scheme Procedure: vector-fill! vec fill
C Function: scm_vector_fill_x (vec, fill)

Store fill in every position of vec. The value returned by vector-fill! is unspecified.

Scheme Procedure: vector-copy vec
C Function: scm_vector_copy (vec)

Return a copy of vec.
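As a brief illustration of vector-copy and vector-fill! together (the names orig and copy are illustrative), modifying the copy leaves the original vector untouched:

```scheme
(define orig (vector 1 2 3))
(define copy (vector-copy orig))

;; Overwrite every position of the copy only.
(vector-fill! copy 0)

(vector->list orig)   ; ⇒ (1 2 3)
(vector->list copy)   ; ⇒ (0 0 0)
```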

Scheme Procedure: vector-move-left! vec1 start1 end1 vec2 start2
C Function: scm_vector_move_left_x (vec1, start1, end1, vec2, start2)

Copy elements from vec1, positions start1 to end1, to vec2 starting at position start2. start1 and start2 are inclusive indices; end1 is exclusive.

vector-move-left! copies elements in leftmost order. Therefore, in the case where vec1 and vec2 refer to the same vector, vector-move-left! is usually appropriate when start1 is greater than start2.

Scheme Procedure: vector-move-right! vec1 start1 end1 vec2 start2
C Function: scm_vector_move_right_x (vec1, start1, end1, vec2, start2)

Copy elements from vec1, positions start1 to end1, to vec2 starting at position start2. start1 and start2 are inclusive indices; end1 is exclusive.

vector-move-right! copies elements in rightmost order. Therefore, in the case where vec1 and vec2 refer to the same vector, vector-move-right! is usually appropriate when start1 is less than start2.
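The choice between the two procedures matters when the source and destination ranges overlap within one vector. The sketch below, using the illustrative names v and w, shifts elements one position to the left and one position to the right.

```scheme
;; Shift elements 1..4 down to positions 0..3.
;; start1 (1) is greater than start2 (0), so copy leftmost first.
(define v (vector 1 2 3 4 5))
(vector-move-left! v 1 5 v 0)
(vector->list v)   ; ⇒ (2 3 4 5 5)

;; Shift elements 0..3 up to positions 1..4.
;; start1 (0) is less than start2 (1), so copy rightmost first.
(define w (vector 1 2 3 4 5))
(vector-move-right! w 0 4 w 1)
(vector->list w)   ; ⇒ (1 1 2 3 4)
```

In both cases the position that was not overwritten keeps its old value.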



6.7.3.4 Vector Accessing from C

A vector can be read and modified from C with the functions scm_c_vector_ref and scm_c_vector_set_x, for example. In addition to these functions, there are two more ways to access vectors from C that might be more efficient in certain situations: you can restrict yourself to simple vectors and then use the very fast simple vector macros; or you can use the very general framework for accessing all kinds of arrays (see Accessing Arrays from C), which is more verbose, but can deal efficiently with all kinds of vectors (and arrays). For vectors, you can use the scm_vector_elements and scm_vector_writable_elements functions as shortcuts.

C Function: int scm_is_simple_vector (SCM obj)

Return non-zero if obj is a simple vector, else return zero. A simple vector is a vector that can be used with the SCM_SIMPLE_* macros below.

The following functions are guaranteed to return simple vectors: scm_make_vector, scm_c_make_vector, scm_vector, scm_list_to_vector.

C Macro: size_t SCM_SIMPLE_VECTOR_LENGTH (SCM vec)

Evaluates to the length of the simple vector vec. No type checking is done.

C Macro: SCM SCM_SIMPLE_VECTOR_REF (SCM vec, size_t idx)

Evaluates to the element at position idx in the simple vector vec. No type or range checking is done.

C Macro: void SCM_SIMPLE_VECTOR_SET (SCM vec, size_t idx, SCM val)

Sets the element at position idx in the simple vector vec to val. No type or range checking is done.

C Function: const SCM * scm_vector_elements (SCM vec, scm_t_array_handle *handle, size_t *lenp, ssize_t *incp)

Acquire a handle for the vector vec and return a pointer to the elements of it. This pointer can only be used to read the elements of vec. When vec is not a vector, an error is signaled. The handle must eventually be released with scm_array_handle_release.

The variables pointed to by lenp and incp are filled with the number of elements of the vector and the increment (number of elements) between successive elements, respectively. Successive elements of vec need not be contiguous in their underlying “root vector” returned here; hence the increment is not necessarily equal to 1 and may well be negative too (see Shared Arrays).

The following example shows the typical way to use this function. It creates a list of all elements of vec (in reverse order).

scm_t_array_handle handle;
size_t i, len;
ssize_t inc;
const SCM *elt;
SCM list;

elt = scm_vector_elements (vec, &handle, &len, &inc);
list = SCM_EOL;
for (i = 0; i < len; i++, elt += inc)
  list = scm_cons (*elt, list);
scm_array_handle_release (&handle);

C Function: SCM * scm_vector_writable_elements (SCM vec, scm_t_array_handle *handle, size_t *lenp, ssize_t *incp)

Like scm_vector_elements but the pointer can be used to modify the vector.

The following example shows the typical way to use this function. It fills a vector with #t.

scm_t_array_handle handle;
size_t i, len;
ssize_t inc;
SCM *elt;

elt = scm_vector_writable_elements (vec, &handle, &len, &inc);
for (i = 0; i < len; i++, elt += inc)
  *elt = SCM_BOOL_T;
scm_array_handle_release (&handle);


6.7.3.5 Uniform Numeric Vectors

A uniform numeric vector is a vector whose elements are all of a single numeric type. Guile offers uniform numeric vectors for signed and unsigned 8-bit, 16-bit, 32-bit, and 64-bit integers, two sizes of floating point values, and complex floating-point numbers of these two sizes. See SRFI-4, for more information.

For many purposes, bytevectors work just as well as uniform vectors, and have the advantage that they integrate well with binary input and output. See Bytevectors, for more information on bytevectors.
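A minimal sketch of uniform numeric vectors, using the SRFI-4 procedures from the (srfi srfi-4) module; the name samples is illustrative:

```scheme
(use-modules (srfi srfi-4))

;; Three double-precision floats, all initially 0.0.
(define samples (make-f64vector 3 0.0))
(f64vector-set! samples 0 1.5)

(f64vector-ref samples 0)             ; ⇒ 1.5
(f64vector-length samples)            ; ⇒ 3

;; Unsigned 8-bit elements hold integers from 0 to 255.
(u8vector->list (u8vector 1 2 255))   ; ⇒ (1 2 255)
```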



6.7.4 Bit Vectors

Bit vectors are zero-origin, one-dimensional arrays of booleans. They are displayed as a sequence of 0s and 1s prefixed by #*, e.g.,

(make-bitvector 8 #f) ⇒
#*00000000

Bit vectors are the special case of one dimensional bit arrays, and can thus be used with the array procedures (see Arrays).

Scheme Procedure: bitvector? obj
C Function: scm_bitvector_p (obj)

Return #t when obj is a bitvector, else return #f.

C Function: int scm_is_bitvector (SCM obj)

Return 1 when obj is a bitvector, else return 0.

Scheme Procedure: make-bitvector len [fill]
C Function: scm_make_bitvector (len, fill)

Create a new bitvector of length len and optionally initialize all elements to fill.

C Function: SCM scm_c_make_bitvector (size_t len, SCM fill)

Like scm_make_bitvector, but the length is given as a size_t.

Scheme Procedure: bitvector bit …
C Function: scm_bitvector (bits)

Create a new bitvector with the arguments as elements.

Scheme Procedure: bitvector-length vec
C Function: scm_bitvector_length (vec)

Return the length of the bitvector vec.

C Function: size_t scm_c_bitvector_length (SCM vec)

Like scm_bitvector_length, but the length is returned as a size_t.

Scheme Procedure: bitvector-ref vec idx
C Function: scm_bitvector_ref (vec, idx)

Return the element at index idx of the bitvector vec.

C Function: SCM scm_c_bitvector_ref (SCM vec, size_t idx)

Return the element at index idx of the bitvector vec.

Scheme Procedure: bitvector-set! vec idx val
C Function: scm_bitvector_set_x (vec, idx, val)

Set the element at index idx of the bitvector vec when val is true, else clear it.

C Function: SCM scm_c_bitvector_set_x (SCM vec, size_t idx, SCM val)

Set the element at index idx of the bitvector vec when val is true, else clear it.

Scheme Procedure: bitvector-fill! vec val
C Function: scm_bitvector_fill_x (vec, val)

Set all elements of the bitvector vec when val is true, else clear them.

Scheme Procedure: list->bitvector list
C Function: scm_list_to_bitvector (list)

Return a new bitvector initialized with the elements of list.

Scheme Procedure: bitvector->list vec
C Function: scm_bitvector_to_list (vec)

Return a new list initialized with the elements of the bitvector vec.
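The basic bitvector procedures described above can be combined as in the following sketch; the name flags is illustrative.

```scheme
;; Four cleared bits, then set the bit at index 1.
(define flags (make-bitvector 4 #f))
(bitvector-set! flags 1 #t)

(bitvector-ref flags 1)      ; ⇒ #t
(bitvector-length flags)     ; ⇒ 4
(bitvector->list flags)      ; ⇒ (#f #t #f #f)

;; Conversion from a list of booleans and back.
(bitvector->list (list->bitvector '(#t #f #t)))   ; ⇒ (#t #f #t)
```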

Scheme Procedure: bit-count bool bitvector
C Function: scm_bit_count (bool, bitvector)

Return a count of how many entries in bitvector are equal to bool. For example,

(bit-count #f #*000111000)  ⇒ 6

Scheme Procedure: bit-position bool bitvector start
C Function: scm_bit_position (bool, bitvector, start)

Return the index of the first occurrence of bool in bitvector, starting from start. If there is no bool entry between start and the end of bitvector, then return #f. For example,

(bit-position #t #*000101 0)  ⇒ 3
(bit-position #f #*0001111 3) ⇒ #f

Scheme Procedure: bit-invert! bitvector
C Function: scm_bit_invert_x (bitvector)

Modify bitvector by replacing each element with its negation.

Scheme Procedure: bit-set*! bitvector uvec bool
C Function: scm_bit_set_star_x (bitvector, uvec, bool)

Set entries of bitvector to bool, with uvec selecting the entries to change. The return value is unspecified.

If uvec is a bit vector, then those entries where it has #t are the ones in bitvector which are set to bool. uvec and bitvector must be the same length. When bool is #t it’s like uvec is OR’ed into bitvector. Or when bool is #f it can be seen as an ANDNOT.

(define bv #*01000010)
(bit-set*! bv #*10010001 #t)
bv
⇒ #*11010011
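The ANDNOT case, where bool is #f, can be sketched in the same way. The target bitvector is built with the bitvector procedure here, and the name bv2 is illustrative.

```scheme
;; bv2 starts out as #*11110000.
(define bv2 (bitvector #t #t #t #t #f #f #f #f))

;; Clear every entry of bv2 that is selected by the mask #*10100101.
(bit-set*! bv2 #*10100101 #f)

(bitvector->list bv2)   ; ⇒ (#f #t #f #t #f #f #f #f), i.e. #*01010000
```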

If uvec is a uniform vector of unsigned long integers, then they’re indexes into bitvector which are set to bool.

(define bv #*01000010)
(bit-set*! bv #u32(5 2 7) #t)
bv
⇒ #*01100111

Scheme Procedure: bit-count* bitvector uvec bool
C Function: scm_bit_count_star (bitvector, uvec, bool)

Return a count of how many entries in bitvector are equal to bool, with uvec selecting the entries to consider.

uvec is interpreted in the same way as for bit-set*! above. Namely, if uvec is a bit vector then entries which have #t there are considered in bitvector. Or if uvec is a uniform vector of unsigned long integers then it’s the indexes in bitvector to consider.

For example,

(bit-count* #*01110111 #*11001101 #t) ⇒ 3
(bit-count* #*01110111 #u32(7 0 4) #f)  ⇒ 2

C Function: const scm_t_uint32 * scm_bitvector_elements (SCM vec, scm_t_array_handle *handle, size_t *offp, size_t *lenp, ssize_t *incp)

Like scm_vector_elements (see Vector Accessing from C), but for bitvectors. The variable pointed to by offp is set to the value returned by scm_array_handle_bit_elements_offset. See scm_array_handle_bit_elements for how to use the returned pointer and the offset.

C Function: scm_t_uint32 * scm_bitvector_writable_elements (SCM vec, scm_t_array_handle *handle, size_t *offp, size_t *lenp, ssize_t *incp)

Like scm_bitvector_elements, but the pointer is good for reading and writing.



6.7.5 Arrays

Arrays are a collection of cells organized into an arbitrary number of dimensions. Each cell can be accessed in constant time by supplying an index for each dimension.

In the current implementation, an array uses a vector of some kind for the actual storage of its elements. Any kind of vector will do, so you can have arrays of uniform numeric values, arrays of characters, arrays of bits, and of course, arrays of arbitrary Scheme values. For example, arrays with an underlying c64vector might be nice for digital signal processing, while arrays made from a u8vector might be used to hold gray-scale images.

The number of dimensions of an array is called its rank. Thus, a matrix is an array of rank 2, while a vector has rank 1. When accessing an array element, you have to specify one exact integer for each dimension. These integers are called the indices of the element. An array specifies the allowed range of indices for each dimension via an inclusive lower and upper bound. These bounds can well be negative, but the upper bound must be greater than or equal to the lower bound minus one. When all lower bounds of an array are zero, it is called a zero-origin array.

Arrays can be of rank 0, which could be interpreted as a scalar. Thus, a zero-rank array can store exactly one object and the list of indices of this element is the empty list.

Arrays contain zero elements when one of their dimensions has a zero length. These empty arrays maintain information about their shape: a matrix with zero columns and 3 rows is different from a matrix with 3 columns and zero rows, which again is different from a vector of length zero.

The array procedures are all polymorphic, treating strings, uniform numeric vectors, bytevectors, bit vectors and ordinary vectors as one dimensional arrays.
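As a short illustration of this polymorphism, the generic array procedures below are applied to a string, an ordinary vector and a bit vector:

```scheme
;; Each of these objects is treated as a one dimensional array.
(array-ref "hello" 1)               ; ⇒ #\e
(array-ref (vector 'a 'b) 0)        ; ⇒ a
(array-ref (bitvector #t #f) 1)     ; ⇒ #f

(map array-rank
     (list "hello" (vector 'a 'b) (bitvector #t #f)))   ; ⇒ (1 1 1)
```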



6.7.5.1 Array Syntax

An array is displayed as # followed by its rank, followed by a tag that describes the underlying vector, optionally followed by information about its shape, and finally followed by the cells, organized into dimensions using parentheses.

In more words, the array tag is of the form

  #<rank><vectag><@lower><:len><@lower><:len>...

where <rank> is a positive integer in decimal giving the rank of the array. It is omitted when the rank is 1 and the array is non-shared and has zero-origin (see below). For shared arrays and for a non-zero origin, the rank is always printed even when it is 1 to distinguish them from ordinary vectors.

The <vectag> part is the tag for a uniform numeric vector, like u8, s16, etc, b for bitvectors, or a for strings. It is empty for ordinary vectors.

The <@lower> part is a ‘@’ character followed by a signed integer in decimal giving the lower bound of a dimension. There is one <@lower> for each dimension. When all lower bounds are zero, all <@lower> parts are omitted.

The <:len> part is a ‘:’ character followed by an unsigned integer in decimal giving the length of a dimension. Like for the lower bounds, there is one <:len> for each dimension, and the <:len> part always follows the <@lower> part for a dimension. Lengths are only then printed when they can’t be deduced from the nested lists of elements of the array literal, which can happen when at least one length is zero.

As a special case, an array of rank 0 is printed as #0<vectag>(<scalar>), where <scalar> is the result of printing the single element of the array.

Thus,

#(1 2 3)

is an ordinary array of rank 1 with lower bound 0 in dimension 0. (I.e., a regular vector.)

#@2(1 2 3)

is an ordinary array of rank 1 with lower bound 2 in dimension 0.

#2((1 2 3) (4 5 6))

is a non-uniform array of rank 2; a 2x3 matrix with index ranges 0..1 and 0..2.

#u32(0 1 2)

is a uniform u32 array of rank 1.

#2u32@2@3((1 2) (2 3))

is a uniform u32 array of rank 2 with index ranges 2..3 and 3..4.

#2()

is a two-dimensional array with index ranges 0..-1 and 0..-1, i.e. both dimensions have length zero.

#2:0:2()

is a two-dimensional array with index ranges 0..-1 and 0..1, i.e. the first dimension has length zero, but the second has length 2.

#0(12)

is a rank-zero array with contents 12.

In addition, bytevectors are also arrays, but use a different syntax (see Bytevectors):

#vu8(1 2 3)

is a 3-byte long bytevector, with contents 1, 2, 3.
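The shapes and types of such literals can be checked with array-shape, array-type and array-ref (see Array Procedures). The following sketch quotes each literal, which is harmless.

```scheme
(array-shape '#@2(1 2 3))               ; ⇒ ((2 4))
(array-shape '#2((1 2 3) (4 5 6)))      ; ⇒ ((0 1) (0 2))
(array-type  '#u32(0 1 2))              ; ⇒ u32
(array-shape '#2u32@2@3((1 2) (2 3)))   ; ⇒ ((2 3) (3 4))
(array-shape '#2:0:2())                 ; ⇒ ((0 -1) (0 1))

;; A rank-zero array takes no indices at all.
(array-ref '#0(12))                     ; ⇒ 12
```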



6.7.5.2 Array Procedures

When an array is created, the range of each dimension must be specified, e.g., to create a 2x3 array with a zero-based index:

(make-array 'ho 2 3) ⇒ #2((ho ho ho) (ho ho ho))

The range of each dimension can also be given explicitly, e.g., another way to create the same array:

(make-array 'ho '(0 1) '(0 2)) ⇒ #2((ho ho ho) (ho ho ho))

The following procedures can be used with arrays (or vectors). An argument shown as idx… means one parameter for each dimension in the array. An idxlist argument means a list of such values, one for each dimension.

Scheme Procedure: array? obj
C Function: scm_array_p (obj, unused)

Return #t if the obj is an array, and #f if not.

The second argument to scm_array_p is there for historical reasons, but it is not used. You should always pass SCM_UNDEFINED as its value.

Scheme Procedure: typed-array? obj type
C Function: scm_typed_array_p (obj, type)

Return #t if the obj is an array of type type, and #f if not.

C Function: int scm_is_array (SCM obj)

Return 1 if the obj is an array and 0 if not.

C Function: int scm_is_typed_array (SCM obj, SCM type)

Return 1 if the obj is an array of type type, and 0 if not.

Scheme Procedure: make-array fill bound …
C Function: scm_make_array (fill, bounds)

Equivalent to (make-typed-array #t fill bound ...).

Scheme Procedure: make-typed-array type fill bound …
C Function: scm_make_typed_array (type, fill, bounds)

Create and return an array that has as many dimensions as there are bounds and (maybe) fill it with fill.

The underlying storage vector is created according to type, which must be a symbol whose name is the ‘vectag’ of the array as explained above, or #t for ordinary, non-specialized arrays.

For example, using the symbol f64 for type will create an array that uses a f64vector for storing its elements, and a will use a string.

When fill is not the special unspecified value, the new array is filled with fill. Otherwise, the initial contents of the array are unspecified. The special unspecified value is stored in the variable *unspecified*, so that, for example, (make-typed-array 'u32 *unspecified* 4) creates an uninitialized u32 vector of length 4.

Each bound may be a positive non-zero integer n, in which case the index for that dimension can range from 0 through n-1; or an explicit index range specifier in the form (LOWER UPPER), where both lower and upper are integers, possibly less than zero, and possibly the same number (however, lower cannot be greater than upper).
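A short sketch of make-typed-array with both kinds of bound; the names m and r are illustrative.

```scheme
;; A 2x3 array backed by an f64vector, filled with 0.0.
(define m (make-typed-array 'f64 0.0 2 3))
(array-type m)      ; ⇒ f64
(array-shape m)     ; ⇒ ((0 1) (0 2))
(array-ref m 1 2)   ; ⇒ 0.0

;; A rank-1 u8 array whose indices run from 1 through 3.
(define r (make-typed-array 'u8 7 '(1 3)))
(array-shape r)     ; ⇒ ((1 3))
(array-ref r 3)     ; ⇒ 7
```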

Scheme Procedure: list->array dimspec list

Equivalent to (list->typed-array #t dimspec list).

Scheme Procedure: list->typed-array type dimspec list
C Function: scm_list_to_typed_array (type, dimspec, list)

Return an array of the type indicated by type with elements the same as those of list.

The argument dimspec determines the number of dimensions of the array and their lower bounds. When dimspec is an exact integer, it gives the number of dimensions directly and all lower bounds are zero. When it is a list of exact integers, then each element is the lower index bound of a dimension, and there will be as many dimensions as elements in the list.
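The two forms of dimspec can be sketched as follows; the names m and v are illustrative.

```scheme
;; dimspec as an exact integer: two dimensions, lower bounds all zero.
(define m (list->array 2 '((1 2) (3 4))))
(array-ref m 1 0)   ; ⇒ 3

;; dimspec as a list of lower bounds: one dimension starting at index 1.
(define v (list->typed-array 'u8 '(1) '(10 20 30)))
(array-shape v)     ; ⇒ ((1 3))
(array-ref v 1)     ; ⇒ 10
```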

Scheme Procedure: array-type array
C Function: scm_array_type (array)

Return the type of array. This is the ‘vectag’ used for printing array (or #t for ordinary arrays) and can be used with make-typed-array to create an array of the same kind as array.

Scheme Procedure: array-ref array idx …
C Function: scm_array_ref (array, idxlist)

Return the element at (idx …) in array.

(define a (make-array 999 '(1 2) '(3 4)))
(array-ref a 2 4) ⇒ 999

Scheme Procedure: array-in-bounds? array idx …
C Function: scm_array_in_bounds_p (array, idxlist)

Return #t if the given indices would be acceptable to array-ref.

(define a (make-array #f '(1 2) '(3 4)))
(array-in-bounds? a 2 3) ⇒ #t
(array-in-bounds? a 0 0) ⇒ #f

Scheme Procedure: array-set! array obj idx …
C Function: scm_array_set_x (array, obj, idxlist)

Set the element at (idx …) in array to obj. The return value is unspecified.

(define a (make-array #f '(0 1) '(0 1)))
(array-set! a #t 1 1)
a ⇒ #2((#f #f) (#f #t))

Scheme Procedure: array-shape array
Scheme Procedure: array-dimensions array
C Function: scm_array_dimensions (array)

Return a list of the bounds for each dimension of array.

array-shape gives (lower upper) for each dimension. array-dimensions instead returns just upper+1 for dimensions with a 0 lower bound. Both are suitable as input to make-array.

For example,

(define a (make-array 'foo '(-1 3) 5))
(array-shape a)      ⇒ ((-1 3) (0 4))
(array-dimensions a) ⇒ ((-1 3) 5)
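
Since both results are suitable as input to make-array, either one can be used to create a second array of the same shape. A minimal sketch, with illustrative names a, b and c:

```scheme
(define a (make-array 'foo '(-1 3) 5))

;; Build new arrays from the bounds reported for a.
(define b (apply make-array 'bar (array-shape a)))
(define c (apply make-array 'baz (array-dimensions a)))

(equal? (array-shape a) (array-shape b))   ; ⇒ #t
(equal? (array-shape a) (array-shape c))   ; ⇒ #t
(array-ref b -1 4)                         ; ⇒ bar
```
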
Scheme Procedure: array-length array
C Function: scm_array_length (array)
C Function: size_t scm_c_array_length (SCM array)

Return the length of an array: its first dimension. It is an error to ask for the length of an array of rank 0.

Scheme Procedure: array-rank array
C Function: scm_array_rank (array)

Return the rank of array.

C Function: size_t scm_c_array_rank (SCM array)

Return the rank of array as a size_t.

Scheme Procedure: array->list array
C Function: scm_array_to_list (array)

Return a list consisting of all the elements, in order, of array.

Scheme Procedure: array-copy! src dst
Scheme Procedure: array-copy-in-order! src dst
C Function: scm_array_copy_x (src, dst)

Copy every element from vector or array src to the corresponding element of dst. dst must have the same rank as src, and be at least as large in each dimension. The return value is unspecified.
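A brief sketch of array-copy! between two arrays of the same shape; the names src and dst are illustrative. The copy is independent of the source afterwards.

```scheme
(define src (list->array 2 '((1 2) (3 4))))
(define dst (make-array 0 2 2))

;; Note the argument order: source first, destination second.
(array-copy! src dst)
(array->list dst)     ; ⇒ ((1 2) (3 4))

;; Changing dst does not affect src.
(array-set! dst 99 0 0)
(array-ref src 0 0)   ; ⇒ 1
```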

Scheme Procedure: array-fill! array fill
C Function: scm_array_fill_x (array, fill)

Store fill in every element of array. The value returned is unspecified.

Scheme Procedure: array-equal? array …

Return #t if all arguments are arrays with the same shape, the same type, and have corresponding elements which are either equal? or array-equal?. This function differs from equal? (see Equality) in that all arguments must be arrays.

Scheme Procedure: array-map! dst proc src …
Scheme Procedure: array-map-in-order! dst proc src1 … srcN
C Function: scm_array_map_x (dst, proc, srclist)

Set each element of the dst array to values obtained from calls to proc. The value returned is unspecified.

Each call is (proc elem1 … elemN), where each elem is from the corresponding src array, at the dst index. array-map-in-order! makes the calls in row-major order; array-map! makes them in an unspecified order.

The src arrays must have the same number of dimensions as dst, and must have a range for each dimension which covers the range in dst. This ensures all dst indices are valid in each src.
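For instance, an element-wise sum of two arrays can be sketched with array-map! as follows; the names a, b and sum are illustrative.

```scheme
(define a (list->array 2 '((1 2) (3 4))))
(define b (list->array 2 '((10 20) (30 40))))

;; The destination must already exist; its old contents are overwritten.
(define sum (make-array 0 2 2))
(array-map! sum + a b)

(array->list sum)   ; ⇒ ((11 22) (33 44))
```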

Scheme Procedure: array-for-each proc src1 src2 …
C Function: scm_array_for_each (proc, src1, srclist)

Apply proc to each tuple of elements of src1 src2 …, in row-major order. The value returned is unspecified.
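A minimal sketch of array-for-each that records the order of the visits; the names m and visited are illustrative.

```scheme
(define m (list->array 2 '((1 2) (3 4))))
(define visited '())

;; Collect each element as it is visited (newest first).
(array-for-each (lambda (x) (set! visited (cons x visited))) m)

(reverse visited)   ; ⇒ (1 2 3 4), i.e. row-major order
```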

Scheme Procedure: array-index-map! dst proc
C Function: scm_array_index_map_x (dst, proc)

Set each element of the dst array to values returned by calls to proc. The value returned is unspecified.

Each call is (proc i1 … iN), where i1 … iN is the destination index, one parameter for each dimension. The order in which the calls are made is unspecified.

For example, to create a 4x4 matrix representing a cyclic group,

    / 0 1 2 3 \
    | 1 2 3 0 |
    | 2 3 0 1 |
    \ 3 0 1 2 /

(define a (make-array #f 4 4))
(array-index-map! a (lambda (i j)
                      (modulo (+ i j) 4)))

Scheme Procedure: uniform-array-read! ra [port_or_fd [start [end]]]
C Function: scm_uniform_array_read_x (ra, port_or_fd, start, end)

Attempt to read all elements of array ra, in lexicographic order, as binary objects from port_or_fd. If an end of file is encountered, the objects up to that point are put into ra (starting at the beginning) and the remainder of the array is unchanged.

The optional arguments start and end allow a specified region of a vector (or linearized array) to be read, leaving the remainder of the vector unchanged.

uniform-array-read! returns the number of objects read. port_or_fd may be omitted, in which case it defaults to the value returned by (current-input-port).

Scheme Procedure: uniform-array-write ra [port_or_fd [start [end]]]
C Function: scm_uniform_array_write (ra, port_or_fd, start, end)

Writes all elements of ra as binary objects to port_or_fd.

The optional arguments start and end allow a specified region of a vector (or linearized array) to be written.

The number of objects actually written is returned. port_or_fd may be omitted, in which case it defaults to the value returned by (current-output-port).



6.7.5.3 Shared Arrays

Scheme Procedure: make-shared-array oldarray mapfunc bound …
C Function: scm_make_shared_array (oldarray, mapfunc, boundlist)

Return a new array which shares the storage of oldarray. Changes made through either affect the same underlying storage. The bound … arguments are the shape of the new array, the same as make-array (see Array Procedures).

mapfunc translates coordinates from the new array to the oldarray. It’s called as (mapfunc newidx1 …) with one parameter for each dimension of the new array, and should return a list of indices for oldarray, one for each dimension of oldarray.

mapfunc must be affine linear, meaning that each oldarray index must be formed by adding integer multiples (possibly negative) of some or all of newidx1 etc, plus a possible integer offset. The multiples and offset must be the same in each call.


One good use for a shared array is to restrict the range of some dimensions, so as to apply say array-for-each or array-fill! to only part of an array. The plain list function can be used for mapfunc in this case, making no changes to the index values. For example,

(make-shared-array #2((a b c) (d e f) (g h i)) list 3 2)
⇒ #2((a b) (d e) (g h))

The new array can have fewer dimensions than oldarray, for example to take a column from an array.

(make-shared-array #2((a b c) (d e f) (g h i))
                   (lambda (i) (list i 2))
                   '(0 2))
⇒ #1(c f i)

A diagonal can be taken by using the single new array index for both row and column in the old array. For example,

(make-shared-array #2((a b c) (d e f) (g h i))
                   (lambda (i) (list i i))
                   '(0 2))
⇒ #1(a e i)

Dimensions can be increased by for instance considering portions of a one dimensional array as rows in a two dimensional array. (array-contents below can do the opposite, flattening an array.)

(make-shared-array #1(a b c d e f g h i j k l)
                   (lambda (i j) (list (+ (* i 3) j)))
                   4 3)
⇒ #2((a b c) (d e f) (g h i) (j k l))

By negating an index the order that elements appear can be reversed. The following just reverses the column order,

(make-shared-array #2((a b c) (d e f) (g h i))
                   (lambda (i j) (list i (- 2 j)))
                   3 3)
⇒ #2((c b a) (f e d) (i h g))

A fixed offset on indexes allows for instance a change from a 0 based to a 1 based array,

(define x #2((a b c) (d e f) (g h i)))
(define y (make-shared-array x
                             (lambda (i j) (list (1- i) (1- j)))
                             '(1 3) '(1 3)))
(array-ref x 0 0) ⇒ a
(array-ref y 1 1) ⇒ a

A multiple on an index allows every Nth element of an array to be taken. The following is every third element,

(make-shared-array #1(a b c d e f g h i j k l)
                   (lambda (i) (list (* i 3)))
                   4)
⇒ #1(a d g j)

The above examples can be combined to make weird and wonderful selections from an array, but it’s important to note that because mapfunc must be affine linear, arbitrary permutations are not possible.

In the current implementation, mapfunc is not called for every access to the new array but only on some sample points to establish a base and stride for new array indices in oldarray data. A few sample points are enough because mapfunc is linear.

Scheme Procedure: shared-array-increments array
C Function: scm_shared_array_increments (array)

For each dimension, return the distance between elements in the root vector.

Scheme Procedure: shared-array-offset array
C Function: scm_shared_array_offset (array)

Return the root vector index of the first element in the array.

Scheme Procedure: shared-array-root array
C Function: scm_shared_array_root (array)

Return the root vector of a shared array.
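The three procedures above can be tried on a shared array that takes every third element of a vector. This is a sketch with illustrative names (base, every-third, shifted); it also shows that a change made through the shared array is visible in the original vector.

```scheme
(define base (vector 'a 'b 'c 'd 'e 'f 'g 'h 'i 'j 'k 'l))

;; Elements 0, 3, 6, 9 of base.
(define every-third
  (make-shared-array base (lambda (i) (list (* i 3))) 4))
(shared-array-increments every-third)   ; ⇒ (3)
(shared-array-offset every-third)       ; ⇒ 0

;; Elements 1, 4, 7, 10 of base: same stride, different offset.
(define shifted
  (make-shared-array base (lambda (i) (list (+ 1 (* i 3)))) 4))
(shared-array-offset shifted)           ; ⇒ 1

;; Storage is shared: index 1 of every-third is index 3 of base.
(array-set! every-third 'X 1)
(vector-ref base 3)                     ; ⇒ X
```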

Scheme Procedure: array-contents array [strict]
C Function: scm_array_contents (array, strict)

If array may be unrolled into a one dimensional shared array without changing the order of its elements (last subscript changing fastest), then array-contents returns that shared array, otherwise it returns #f. All arrays made by make-array and make-typed-array may be unrolled, some arrays made by make-shared-array may not be.

If the optional argument strict is provided, a shared array will be returned only if its elements are stored internally contiguous in memory.
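As a sketch of both outcomes, a freshly made matrix can be unrolled, while a shared array covering only its first two columns cannot, because its elements are not evenly laid out in the underlying storage. The names m and left are illustrative.

```scheme
(define m (list->array 2 '((a b c) (d e f) (g h i))))

;; The whole matrix unrolls in row-major order.
(array->list (array-contents m))   ; ⇒ (a b c d e f g h i)

;; The first two columns of m skip one stored element per row.
(define left (make-shared-array m list 3 2))
(array-contents left)              ; ⇒ #f
```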

Scheme Procedure: transpose-array array dim1 dim2 …
C Function: scm_transpose_array (array, dimlist)

Return an array sharing contents with array, but with dimensions arranged in a different order. There must be one dim argument for each dimension of array. dim1, dim2, … should be integers from 0 up to, but not including, the rank of the array to be returned. Each integer in that range must appear at least once in the argument list.

The values of dim1, dim2, … correspond to dimensions in the array to be returned, and their positions in the argument list to dimensions of array. Several dims may have the same value, in which case the returned array will have smaller rank than array.

(transpose-array '#2((a b) (c d)) 1 0) ⇒ #2((a c) (b d))
(transpose-array '#2((a b) (c d)) 0 0) ⇒ #1(a d)
(transpose-array '#3(((a b c) (d e f)) ((1 2 3) (4 5 6))) 1 1 0) ⇒
                #2((a 4) (b 5) (c 6))


6.7.5.4 Accessing Arrays from C

For interworking with external C code, Guile provides an API to allow C code to access the elements of a Scheme array. In particular, for uniform numeric arrays, the API exposes the underlying uniform data as a C array of numbers of the relevant type.

While pointers to the elements of an array are in use, the array itself must be protected so that the pointer remains valid. Such a protected array is said to be reserved. A reserved array can be read but modifications to it that would cause the pointer to its elements to become invalid are prevented. When you attempt such a modification, an error is signalled.

(This is similar to locking the array while it is in use, but without the danger of a deadlock. In a multi-threaded program, you will need additional synchronization to avoid modifying reserved arrays.)

You must take care to always unreserve an array after reserving it, even in the presence of non-local exits. If a non-local exit can happen between these two calls, you should install a dynwind context that releases the array when it is left (see Dynamic Wind).

In addition, array reserving and unreserving must be properly paired. For instance, when reserving two or more arrays in a certain order, you need to unreserve them in the opposite order.

Once you have reserved an array and have retrieved the pointer to its elements, you must figure out the layout of the elements in memory. Guile allows slices to be taken out of arrays without actually making a copy, such as making an alias for the diagonal of a matrix that can be treated as a vector. Arrays that result from such an operation are not stored contiguously in memory and when working with their elements directly, you need to take this into account.

The layout of array elements in memory can be defined via a mapping function that computes a scalar position from a vector of indices. The scalar position then is the offset of the element with the given indices from the start of the storage block of the array.

In Guile, this mapping function is restricted to be affine: all mapping functions of Guile arrays can be written as p = b + c[0]*i[0] + c[1]*i[1] + ... + c[n-1]*i[n-1] where i[k] is the kth index and n is the rank of the array. For example, a matrix of size 3x3 would have b == 0, c[0] == 3 and c[1] == 1. When you transpose this matrix (with transpose-array, say), you will get an array whose mapping function has b == 0, c[0] == 1 and c[1] == 3.
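
The affine mapping can be illustrated with a small, self-contained C sketch. It does not use libguile: the helper affine_pos and the two coefficient vectors are assumptions made for illustration, chosen to mirror the 3x3 matrix and its transpose described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of libguile: evaluate
   p = b + c[0]*i[0] + ... + c[n-1]*i[n-1] for an array of rank n.  */
ptrdiff_t
affine_pos (ptrdiff_t b, const ptrdiff_t *c, const ptrdiff_t *i, size_t n)
{
  ptrdiff_t p = b;
  size_t k;

  assert (c != NULL && i != NULL);
  for (k = 0; k < n; k++)
    p += c[k] * i[k];
  return p;
}

/* Coefficients of a 3x3 matrix stored row by row, and of its
   transpose, which shares the same storage block.  */
const ptrdiff_t matrix_c[2] = { 3, 1 };
const ptrdiff_t transposed_c[2] = { 1, 3 };
```

With these coefficients, the indices (1, 2) of the matrix and the indices (2, 1) of its transpose both map to position 5, which is what one would expect if both arrays alias the same element.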

The function scm_array_handle_dims gives you (indirect) access to the coefficients c[k].

Note that there are no functions for accessing the elements of a character array yet. Once the string implementation of Guile has been changed to use Unicode, we will provide them.

C Type: scm_t_array_handle

This is a structure type that holds all information necessary to manage the reservation of arrays as explained above. Structures of this type must be allocated on the stack and must only be accessed by the functions listed below.

C Function: void scm_array_get_handle (SCM array, scm_t_array_handle *handle)

Reserve array, which must be an array, and prepare handle to be used with the functions below. You must eventually call scm_array_handle_release on handle, and do this in a properly nested fashion, as explained above. The structure pointed to by handle does not need to be initialized before calling this function.

C Function: void scm_array_handle_release (scm_t_array_handle *handle)

End the array reservation represented by handle. After a call to this function, handle might be used for another reservation.

C Function: size_t scm_array_handle_rank (scm_t_array_handle *handle)

Return the rank of the array represented by handle.

C Type: scm_t_array_dim

This structure type holds information about the layout of one dimension of an array. It includes the following fields:

ssize_t lbnd
ssize_t ubnd

The lower and upper bounds (both inclusive) of the permissible index range for the given dimension. Both values can be negative, but lbnd is always less than or equal to ubnd.

ssize_t inc

The distance from one element of this dimension to the next. Note, too, that this can be negative.

C Function: const scm_t_array_dim * scm_array_handle_dims (scm_t_array_handle *handle)

Return a pointer to a C vector of information about the dimensions of the array represented by handle. This pointer is valid as long as the array remains reserved. As explained above, the scm_t_array_dim structures returned by this function can be used to calculate the position of an element in the storage block of the array from its indices.

This position can then be used as an index into the C array pointer returned by the various scm_array_handle_<foo>_elements functions, or with scm_array_handle_ref and scm_array_handle_set.

Here is how one can compute the position pos of an element given its indices in the vector indices:

ssize_t indices[RANK];
const scm_t_array_dim *dims;  /* as returned by scm_array_handle_dims */
ssize_t pos;
size_t i;

pos = 0;
for (i = 0; i < RANK; i++)
  {
    if (indices[i] < dims[i].lbnd || indices[i] > dims[i].ubnd)
      out_of_range ();
    pos += (indices[i] - dims[i].lbnd) * dims[i].inc;
  }

C Function: ssize_t scm_array_handle_pos (scm_t_array_handle *handle, SCM indices)

Compute the position corresponding to indices, a list of indices. The position is computed as described above for scm_array_handle_dims. The number of the indices and their range is checked and an appropriate error is signalled for invalid indices.

C Function: SCM scm_array_handle_ref (scm_t_array_handle *handle, ssize_t pos)

Return the element at position pos in the storage block of the array represented by handle. Any kind of array is acceptable. No range checking is done on pos.

C Function: void scm_array_handle_set (scm_t_array_handle *handle, ssize_t pos, SCM val)

Set the element at position pos in the storage block of the array represented by handle to val. Any kind of array is acceptable. No range checking is done on pos. An error is signalled when the array can not store val.

C Function: const SCM * scm_array_handle_elements (scm_t_array_handle *handle)

Return a pointer to the elements of an ordinary array of general Scheme values (i.e., a non-uniform array) for reading. This pointer is valid as long as the array remains reserved.

C Function: SCM * scm_array_handle_writable_elements (scm_t_array_handle *handle)

Like scm_array_handle_elements, but the pointer is good for reading and writing.

C Function: const void * scm_array_handle_uniform_elements (scm_t_array_handle *handle)

Return a pointer to the elements of a uniform numeric array for reading. This pointer is valid as long as the array remains reserved. The size of each element is given by scm_array_handle_uniform_element_size.

C Function: void * scm_array_handle_uniform_writable_elements (scm_t_array_handle *handle)

Like scm_array_handle_uniform_elements, but the pointer is good for reading and writing.

C Function: size_t scm_array_handle_uniform_element_size (scm_t_array_handle *handle)

Return the size of one element of the uniform numeric array represented by handle.

C Function: const scm_t_uint8 * scm_array_handle_u8_elements (scm_t_array_handle *handle)
C Function: const scm_t_int8 * scm_array_handle_s8_elements (scm_t_array_handle *handle)
C Function: const scm_t_uint16 * scm_array_handle_u16_elements (scm_t_array_handle *handle)
C Function: const scm_t_int16 * scm_array_handle_s16_elements (scm_t_array_handle *handle)
C Function: const scm_t_uint32 * scm_array_handle_u32_elements (scm_t_array_handle *handle)
C Function: const scm_t_int32 * scm_array_handle_s32_elements (scm_t_array_handle *handle)
C Function: const scm_t_uint64 * scm_array_handle_u64_elements (scm_t_array_handle *handle)
C Function: const scm_t_int64 * scm_array_handle_s64_elements (scm_t_array_handle *handle)
C Function: const float * scm_array_handle_f32_elements (scm_t_array_handle *handle)
C Function: const double * scm_array_handle_f64_elements (scm_t_array_handle *handle)
C Function: const float * scm_array_handle_c32_elements (scm_t_array_handle *handle)
C Function: const double * scm_array_handle_c64_elements (scm_t_array_handle *handle)

Return a pointer to the elements of a uniform numeric array of the indicated kind for reading. This pointer is valid as long as the array remains reserved.

The pointers for c32 and c64 uniform numeric arrays point to pairs of floating point numbers. The even index holds the real part, the odd index the imaginary part of the complex number.
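
As a minimal sketch of this layout, the following C fragment reads one complex number from a buffer of doubles arranged like the storage of a c64 array. The type complex_pair and the helper c64_ref are hypothetical names introduced for illustration; they are not part of libguile.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of one complex number; not a libguile type.  */
struct complex_pair
{
  double re;
  double im;
};

/* Read the complex number at position POS from ELTS, a pointer laid
   out like the one returned for a c64 array: the real part is at the
   even index 2*POS, the imaginary part at the odd index 2*POS + 1.  */
struct complex_pair
c64_ref (const double *elts, ptrdiff_t pos)
{
  struct complex_pair z;

  assert (elts != NULL);
  z.re = elts[2 * pos];
  z.im = elts[2 * pos + 1];
  return z;
}
```

The same indexing would apply to a c32 array with float in place of double.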

C Function: scm_t_uint8 * scm_array_handle_u8_writable_elements (scm_t_array_handle *handle)
C Function: scm_t_int8 * scm_array_handle_s8_writable_elements (scm_t_array_handle *handle)
C Function: scm_t_uint16 * scm_array_handle_u16_writable_elements (scm_t_array_handle *handle)
C Function: scm_t_int16 * scm_array_handle_s16_writable_elements (scm_t_array_handle *handle)
C Function: scm_t_uint32 * scm_array_handle_u32_writable_elements (scm_t_array_handle *handle)
C Function: scm_t_int32 * scm_array_handle_s32_writable_elements (scm_t_array_handle *handle)
C Function: scm_t_uint64 * scm_array_handle_u64_writable_elements (scm_t_array_handle *handle)
C Function: scm_t_int64 * scm_array_handle_s64_writable_elements (scm_t_array_handle *handle)
C Function: float * scm_array_handle_f32_writable_elements (scm_t_array_handle *handle)
C Function: double * scm_array_handle_f64_writable_elements (scm_t_array_handle *handle)
C Function: float * scm_array_handle_c32_writable_elements (scm_t_array_handle *handle)
C Function: double * scm_array_handle_c64_writable_elements (scm_t_array_handle *handle)

Like scm_array_handle_<kind>_elements, but the pointer is good for reading and writing.

C Function: const scm_t_uint32 * scm_array_handle_bit_elements (scm_t_array_handle *handle)

Return a pointer to the words that store the bits of the represented array, which must be a bit array.

Unlike other arrays, bit arrays have an additional offset that must be figured into index calculations. That offset is returned by scm_array_handle_bit_elements_offset.

To find a certain bit you first need to calculate its position as explained above for scm_array_handle_dims and then add the offset. This gives the absolute position of the bit, which is always a non-negative integer.

Each word of the bit array storage block contains exactly 32 bits, with the least significant bit in that word having the lowest absolute position number. The next word contains the next 32 bits.

Thus, the following code can be used to access a bit whose position according to scm_array_handle_dims is given in pos:

SCM bit_array;
scm_t_array_handle handle;
const scm_t_uint32 *bits;
ssize_t pos;
size_t abs_pos;
size_t word_pos;
scm_t_uint32 mask;

/* The array is passed by value; only the handle is passed by address. */
scm_array_get_handle (bit_array, &handle);
bits = scm_array_handle_bit_elements (&handle);

pos = ...
abs_pos = pos + scm_array_handle_bit_elements_offset (&handle);
word_pos = abs_pos / 32;
mask = ((scm_t_uint32) 1) << (abs_pos % 32);

if (bits[word_pos] & mask)
  {
    /* bit is set. */
  }

scm_array_handle_release (&handle);

C Function: scm_t_uint32 * scm_array_handle_bit_writable_elements (scm_t_array_handle *handle)

Like scm_array_handle_bit_elements but the pointer is good for reading and writing. You must take care not to modify bits outside of the allowed index range of the array, even for contiguous arrays.
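
The word and mask arithmetic for reading and writing a single bit can be sketched without libguile as follows. The helpers bit_ref and bit_set are hypothetical, and uint32_t stands in for scm_t_uint32; words plays the role of the pointer returned by scm_array_handle_bit_writable_elements, and abs_pos is the absolute position, that is, the position computed from the dimensions plus the bit offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libguile: return the bit at
   absolute position ABS_POS, where bit 0 is the least significant
   bit of WORDS[0].  */
int
bit_ref (const uint32_t *words, size_t abs_pos)
{
  assert (words != NULL);
  return (words[abs_pos / 32] >> (abs_pos % 32)) & 1u;
}

/* Hypothetical helper, not part of libguile: set or clear the bit at
   absolute position ABS_POS, leaving all other bits untouched.  */
void
bit_set (uint32_t *words, size_t abs_pos, int value)
{
  uint32_t mask = (uint32_t) 1 << (abs_pos % 32);

  assert (words != NULL);
  if (value)
    words[abs_pos / 32] |= mask;
  else
    words[abs_pos / 32] &= ~mask;
}
```

Because bit_set changes only the masked bit, it leaves bits outside the allowed index range alone, provided that abs_pos itself was computed from valid indices.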


6.7.6 VLists

The (ice-9 vlist) module provides an implementation of the VList data structure designed by Phil Bagwell in 2002. VLists are immutable lists, which can contain any Scheme object. They improve on standard Scheme linked lists in several areas:

The idea behind VLists is to store vlist elements in increasingly large contiguous blocks (implemented as vectors here). These blocks are linked to one another using a pointer to the next block and an offset within that block. The sizes of these blocks form a geometric series with ratio block-growth-factor (2 by default).
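
To give a feel for the geometric growth, here is a rough C model that counts how many blocks are needed for a given number of elements. It is a sketch and not the (ice-9 vlist) implementation: the function vlist_block_count is hypothetical, and it assumes that the first block holds a single element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not the (ice-9 vlist) implementation: count how
   many blocks are needed to hold N elements when the first block holds
   one element (an assumption) and each further block is GROWTH times
   larger than the previous one.  */
size_t
vlist_block_count (size_t n, size_t growth)
{
  size_t blocks = 0;
  size_t block_size = 1;

  assert (growth >= 2);
  while (n > 0)
    {
      /* Fill the current block, or use up what is left of N.  */
      n -= (n < block_size) ? n : block_size;
      block_size *= growth;
      blocks++;
    }
  return blocks;
}
```

Under these assumptions and a growth factor of 2, a thousand elements fit in ten blocks, so an operation that walks the chain of blocks visits a number of blocks that grows logarithmically with the number of elements.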

The VList structure also serves as the basis for the VList-based hash lists or “vhashes”, an immutable dictionary type (see VHashes).

However, the current implementation in (ice-9 vlist) has several noteworthy shortcomings:

We hope to address these in the future.

The programming interface exported by (ice-9 vlist) is defined below. Most of it is the same as SRFI-1 with an added vlist- prefix to function names.

Scheme Procedure: vlist? obj

Return true if obj is a VList.

Scheme Variable: vlist-null

The empty VList. Note that it’s possible to create an empty VList not eq? to vlist-null; thus, callers should always use vlist-null? when testing whether a VList is empty.

Scheme Procedure: vlist-null? vlist

Return true if vlist is empty.

Scheme Procedure: vlist-cons item vlist

Return a new vlist with item as its head and vlist as its tail.

Scheme Procedure: vlist-head vlist

Return the head of vlist.

Scheme Procedure: vlist-tail vlist

Return the tail of vlist.

Scheme Variable: block-growth-factor

A fluid that defines the growth factor of VList blocks, 2 by default.

The functions below provide the usual set of higher-level list operations.

Scheme Procedure: vlist-fold proc init vlist
Scheme Procedure: vlist-fold-right proc init vlist

Fold over vlist, calling proc for each element, as for SRFI-1 fold and fold-right (see fold).

Scheme Procedure: vlist-ref vlist index

Return the element at index index in vlist. This is typically a constant-time operation.

Scheme Procedure: vlist-length vlist

Return the length of vlist. This is typically logarithmic in the number of elements in vlist.

Scheme Procedure: vlist-reverse vlist

Return a new vlist whose contents are those of vlist in reverse order.

Scheme Procedure: vlist-map proc vlist

Map proc over the elements of vlist and return a new vlist.

Scheme Procedure: vlist-for-each proc vlist

Call proc on each element of vlist. The result is unspecified.

Scheme Procedure: vlist-drop vlist count

Return a new vlist that does not contain the count first elements of vlist. This is typically a constant-time operation.

Scheme Procedure: vlist-take vlist count

Return a new vlist that contains only the count first elements of vlist.

Scheme Procedure: vlist-filter pred vlist

Return a new vlist containing all the elements from vlist that satisfy pred.

Scheme Procedure: vlist-delete x vlist [equal?]

Return a new vlist corresponding to vlist without the elements equal? to x.

Scheme Procedure: vlist-unfold p f g seed [tail-gen]
Scheme Procedure: vlist-unfold-right p f g seed [tail]

Return a new vlist, as for SRFI-1 unfold and unfold-right (see unfold).

Scheme Procedure: vlist-append vlist …

Append the given vlists and return the resulting vlist.

Scheme Procedure: list->vlist lst

Return a new vlist whose contents correspond to lst.

Scheme Procedure: vlist->list vlist

Return a new list whose contents match those of vlist.


6.7.7 Record Overview

Records, also called structures, are Scheme’s primary mechanism to define new disjoint types. A record type defines a list of fields that instances of the type consist of. This is like C’s struct.

Historically, Guile has offered several different ways to define record types and to create records, offering different features, and making different trade-offs. Over the years, each “standard” has also come with its own new record interface, leading to a maze of record APIs.

At the highest level is SRFI-9, a high-level record interface implemented by most Scheme implementations (see SRFI-9 Records). It defines a simple and efficient syntactic abstraction of record types and their associated type predicate, fields, and field accessors. SRFI-9 is suitable for most uses, and this is the recommended way to create record types in Guile. Similar high-level record APIs include SRFI-35 (see SRFI-35) and R6RS records (see rnrs records syntactic).

Then comes Guile’s historical “records” API (see Records). Record types defined this way are first-class objects. Introspection facilities are available, allowing users to query the list of fields or the value of a specific field at run-time, without prior knowledge of the type.

Finally, the common denominator of these interfaces is Guile’s structure API (see Structures). Guile’s structures are the low-level building block for all other record APIs. Application writers will normally not need to use it.

Records created with these APIs may all be pattern-matched using Guile’s standard pattern matcher (see Pattern Matching).


6.7.8 SRFI-9 Records

SRFI-9 standardizes a syntax for defining new record types and creating predicate, constructor, and field getter and setter functions. In Guile this is the recommended option to create new record types (see Record Overview). It can be used with:

(use-modules (srfi srfi-9))

Scheme Syntax: define-record-type type
(constructor fieldname …)
predicate
(fieldname accessor [modifier]) …

Create a new record type, and make various defines for using it. This syntax can only occur at the top-level, not nested within some other form.

type is bound to the record type, which is as per the return from the core make-record-type. type also provides the name for the record, as per record-type-name.

constructor is bound to a function to be called as (constructor fieldval …) to create a new record of this type. The arguments are initial values for the fields, one argument for each field, in the order they appear in the define-record-type form.

The fieldnames provide the names for the record fields, as per the core record-type-fields etc, and are referred to in the subsequent accessor/modifier forms.

predicate is bound to a function to be called as (predicate obj). It returns #t or #f according to whether obj is a record of this type.

Each accessor is bound to a function to be called (accessor record) to retrieve the respective field from a record. Similarly each modifier is bound to a function to be called (modifier record val) to set the respective field in a record.

An example will illustrate typical usage,

(define-record-type <employee>
  (make-employee name age salary)
  employee?
  (name    employee-name)
  (age     employee-age    set-employee-age!)
  (salary  employee-salary set-employee-salary!))

This creates a new employee data type, with name, age and salary fields. Accessor functions are created for each field, but no modifier function for the name (the intention in this example being that it’s established only when an employee object is created). These can all then be used as for example,

<employee> ⇒ #<record-type <employee>>

(define fred (make-employee "Fred" 45 20000.00))

(employee? fred)        ⇒ #t
(employee-age fred)     ⇒ 45
(set-employee-salary! fred 25000.00)  ;; pay rise

The functions created by define-record-type are ordinary top-level defines. They can be redefined or set! as desired, exported from a module, etc.

Non-toplevel Record Definitions

The SRFI-9 specification explicitly disallows record definitions in a non-toplevel context, such as inside a lambda body or inside a let block. However, Guile’s implementation does not enforce that restriction.

Custom Printers

You may use set-record-type-printer! to customize the default printing behavior of records. This is a Guile extension and is not part of SRFI-9. It is located in the (srfi srfi-9 gnu) module.

Scheme Syntax: set-record-type-printer! name proc

Here name corresponds to the first argument of define-record-type, and proc is a procedure accepting two arguments: the record to print and an output port.

This example prints the employee’s name in brackets, for instance [Fred].

(set-record-type-printer! <employee>
  (lambda (record port)
    (write-char #\[ port)
    (display (employee-name record) port)
    (write-char #\] port)))

Functional “Setters”

When writing code in a functional style, it is desirable to never alter the contents of records. For such code, a simple way to return new record instances based on existing ones is highly desirable.

The (srfi srfi-9 gnu) module extends SRFI-9 with facilities to return new record instances based on existing ones, only with one or more field values changed; these are called functional setters. First, define-immutable-record-type works like define-record-type, except that fields are immutable and setters are defined as functional setters.

Scheme Syntax: define-immutable-record-type type
(constructor fieldname …)
predicate
(fieldname accessor [modifier]) …

Define type as a new record type, like define-record-type. However, the record type is made immutable (records may not be mutated, even with struct-set!), and any modifier is defined to be a functional setter—a procedure that returns a new record instance with the specified field changed, and leaves the original unchanged (see example below.)

In addition, the generic set-field and set-fields macros may be applied to any SRFI-9 record.

Scheme Syntax: set-field record (field sub-fields ...) value

Return a new record of record’s type whose fields are equal to the corresponding fields of record except for the one specified by field.

field must be the name of the getter corresponding to the field of record being “set”. Subsequent sub-fields must be record getters designating sub-fields within that field value to be set (see example below.)

Scheme Syntax: set-fields record ((field sub-fields ...) value) ...

Like set-field, but can be used to set more than one field at a time. This expands to code that is more efficient than a series of single set-field calls.

To illustrate the use of functional setters, let’s assume these two record type definitions:

(define-record-type <address>
  (address street city country)
  address?
  (street  address-street)
  (city    address-city)
  (country address-country))

(define-immutable-record-type <person>
  (person age email address)
  person?
  (age     person-age set-person-age)
  (email   person-email set-person-email)
  (address person-address set-person-address))

First, note that the <person> record type definition introduces named functional setters. These may be used like this:

(define fsf-address
  (address "Franklin Street" "Boston" "USA"))

(define rms
  (person 30 "rms@gnu.org" fsf-address))

(and (equal? (set-person-age rms 60)
             (person 60 "rms@gnu.org" fsf-address))
     (= (person-age rms) 30))
⇒ #t

Here, the original <person> record, to which rms is bound, is left unchanged.

Now, suppose we want to change both the street and age of rms. This can be achieved using set-fields:

(set-fields rms
  ((person-age) 60)
  ((person-address address-street) "Temple Place"))
⇒ #<<person> age: 60 email: "rms@gnu.org"
  address: #<<address> street: "Temple Place" city: "Boston" country: "USA">>

Notice how the above changed two fields of rms, including the street field of its address field, in a concise way. Also note that set-fields works equally well for types defined with just define-record-type.


6.7.9 Records

A record type is a first class object representing a user-defined data type. A record is an instance of a record type.

Note that in many ways, this interface is too low-level for everyday use. Most uses of records are better served by SRFI-9 records. See SRFI-9 Records.

Scheme Procedure: record? obj

Return #t if obj is a record of any type and #f otherwise.

Note that record? may be true of any Scheme value; there is no promise that records are disjoint with other Scheme types.

Scheme Procedure: make-record-type type-name field-names [print]

Create and return a new record-type descriptor.

type-name is a string naming the type. Currently it’s only used in the printed representation of records, and in diagnostics. field-names is a list of symbols naming the fields of a record of the type. Duplicates are not allowed among these symbols.

(make-record-type "employee" '(name age salary))

The optional print argument is a function used by display, write, etc, for printing a record of the new type. It’s called as (print record port) and should look at record and write to port.

Scheme Procedure: record-constructor rtd [field-names]

Return a procedure for constructing new members of the type represented by rtd. The returned procedure accepts exactly as many arguments as there are symbols in the given list, field-names; these are used, in order, as the initial values of those fields in a new record, which is returned by the constructor procedure. The values of any fields not named in that list are unspecified. The field-names argument defaults to the list of field names in the call to make-record-type that created the type represented by rtd; if the field-names argument is provided, it is an error if it contains any duplicates or any symbols not in the default list.

Scheme Procedure: record-predicate rtd

Return a procedure for testing membership in the type represented by rtd. The returned procedure accepts exactly one argument and returns a true value if the argument is a member of the indicated record type; it returns a false value otherwise.

Scheme Procedure: record-accessor rtd field-name

Return a procedure for reading the value of a particular field of a member of the type represented by rtd. The returned procedure accepts exactly one argument which must be a record of the appropriate type; it returns the current value of the field named by the symbol field-name in that record. The symbol field-name must be a member of the list of field-names in the call to make-record-type that created the type represented by rtd.

Scheme Procedure: record-modifier rtd field-name

Return a procedure for writing the value of a particular field of a member of the type represented by rtd. The returned procedure accepts exactly two arguments: first, a record of the appropriate type, and second, an arbitrary Scheme value; it modifies the field named by the symbol field-name in that record to contain the given value. The returned value of the modifier procedure is unspecified. The symbol field-name must be a member of the list of field-names in the call to make-record-type that created the type represented by rtd.

Scheme Procedure: record-type-descriptor record

Return a record-type descriptor representing the type of the given record. That is, for example, if the returned descriptor were passed to record-predicate, the resulting predicate would return a true value when passed the given record. Note that it is not necessarily the case that the returned descriptor is the one that was passed to record-constructor in the call that created the constructor procedure that created the given record.

Scheme Procedure: record-type-name rtd

Return the type-name associated with the type represented by rtd. The returned value is eqv? to the type-name argument given in the call to make-record-type that created the type represented by rtd.

Scheme Procedure: record-type-fields rtd

Return a list of the symbols naming the fields in members of the type represented by rtd. The returned value is equal? to the field-names argument given in the call to make-record-type that created the type represented by rtd.


6.7.10 Structures

A structure is a first class data type which holds Scheme values or C words in fields numbered 0 upwards. A vtable is a structure that represents a structure type, giving field types and permissions, and an optional print function for write etc.

Structures are lower level than records (see Records). Usually, when you need to represent structured data, you just want to use records. But sometimes you need to implement new kinds of structured data abstractions, and for that purpose structures are useful. Indeed, records in Guile are implemented with structures.


6.7.10.1 Vtables

A vtable is a structure type, specifying its layout, and other information. A vtable is actually itself a structure, but there’s no need to worry about that initially (see Vtable Contents.)

Scheme Procedure: make-vtable fields [print]

Create a new vtable.

fields is a string describing the fields in the structures to be created. Each field is represented by two characters, a type letter and a permissions letter, for example "pw". The types are as follows.

The second letter for each field is a permission code,

Here are some examples. See Tail Arrays, for information on the legacy tail array facility.

(make-vtable "pw")      ;; one writable field
(make-vtable "prpw")    ;; one read-only and one writable
(make-vtable "pwuwuw")  ;; one scheme and two uninterpreted
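
The two-characters-per-field convention can be made concrete with a small C sketch that inspects a layout string. The helpers layout_field_count and layout_field_writable_p are hypothetical and are not part of libguile; they only look at the string and do not validate the type letters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libguile: return the number of
   fields described by LAYOUT, or -1 if its length is odd and so
   cannot consist of type/permission pairs.  */
long
layout_field_count (const char *layout)
{
  size_t len;

  assert (layout != NULL);
  len = strlen (layout);
  return (len % 2 == 0) ? (long) (len / 2) : -1;
}

/* Hypothetical helper, not part of libguile: return nonzero if field
   number N of LAYOUT carries the permission letter 'w'.  */
int
layout_field_writable_p (const char *layout, size_t n)
{
  assert (layout != NULL && 2 * n + 1 < strlen (layout));
  return layout[2 * n + 1] == 'w';
}
```

For the layout "prpw", this sketch reports two fields, of which only the second is writable.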

The optional print argument is a function called by display and write (etc) to give a printed representation of a structure created from this vtable. It’s called (print struct port) and should look at struct and write to port. The default print merely gives a form like ‘#<struct ADDR:ADDR>’ with a pair of machine addresses.

The following print function for example shows the two fields of its structure.

(make-vtable "prpw"
             (lambda (struct port)
               (format port "#<~a and ~a>"
                       (struct-ref struct 0)
                       (struct-ref struct 1))))


6.7.10.2 Structure Basics

This section describes the basic procedures for working with structures. make-struct creates a structure, and struct-ref and struct-set! access its fields.

Scheme Procedure: make-struct vtable tail-size init …
Scheme Procedure: make-struct/no-tail vtable init …

Create a new structure, with layout per the given vtable (see Vtables).

The optional init… arguments are initial values for the fields of the structure. This is the only way to put values in read-only fields. If there are fewer init arguments than fields then the defaults are #f for a Scheme field (type p) or 0 for an uninterpreted field (type u).

Structures also have the ability to allocate a variable number of additional cells at the end, at their tails. However, this legacy tail array facility is confusing and inefficient, and so we do not recommend it. See Tail Arrays, for more on the legacy tail array interface.

Type s self-reference fields, permission o opaque fields, and the count field of a tail array are all ignored for the init arguments, i.e., an argument is not consumed by such a field. An s is always set to the structure itself, an o is always set to #f or 0 (with the intention that C code will do something to it later), and the tail count is always the given tail-size.

For example,

(define v (make-vtable "prpwpw"))
(define s (make-struct v 0 123 "abc" 456))
(struct-ref s 0) ⇒ 123
(struct-ref s 1) ⇒ "abc"

C Function: SCM scm_make_struct (SCM vtable, SCM tail_size, SCM init_list)
C Function: SCM scm_c_make_struct (SCM vtable, SCM tail_size, SCM init, ...)
C Function: SCM scm_c_make_structv (SCM vtable, SCM tail_size, size_t n_inits, scm_t_bits init[])

There are a few ways to make structures from C. scm_make_struct takes a list, scm_c_make_struct takes variable arguments terminated with SCM_UNDEFINED, and scm_c_make_structv takes a packed array.

Scheme Procedure: struct? obj
C Function: scm_struct_p (obj)

Return #t if obj is a structure, or #f if not.

Scheme Procedure: struct-ref struct n
C Function: scm_struct_ref (struct, n)

Return the contents of field number n in struct. The first field is number 0.

An error is thrown if n is out of range, or if the field cannot be read because it’s o opaque.

Scheme Procedure: struct-set! struct n value
C Function: scm_struct_set_x (struct, n, value)

Set field number n in struct to value. The first field is number 0.

An error is thrown if n is out of range, or if the field cannot be written because it’s r read-only or o opaque.

Scheme Procedure: struct-vtable struct
C Function: scm_struct_vtable (struct)

Return the vtable that describes struct.

The vtable is effectively the type of the structure. See Vtable Contents, for more on vtables.


6.7.10.3 Vtable Contents

A vtable is itself a structure. It has a specific set of fields describing various aspects of its instances: the structures created from a vtable. Some of the fields are internal to Guile, some of them are part of the public interface, and there may be additional fields added on by the user.

Every vtable has a field for the layout of its instances, a field for the procedure used to print its instances, and a field for the name of the vtable itself. Access to the layout and printer is exposed directly via field indexes. Access to the vtable name is exposed via accessor procedures.

Scheme Variable: vtable-index-layout
C Macro: scm_vtable_index_layout

The field number of the layout specification in a vtable. The layout specification is a symbol like pwpw formed from the fields string passed to make-vtable, or created by make-struct-layout (see Meta-Vtables).

(define v (make-vtable "pwpw"))
(struct-ref v vtable-index-layout) ⇒ pwpw

This field is read-only, since the layout of structures using a vtable cannot be changed.

Scheme Variable: vtable-index-printer
C Macro: scm_vtable_index_printer

The field number of the printer function. This field contains #f if the default print function should be used.

(define (my-print-func struct port)
  ...)
(define v (make-vtable "pwpw" my-print-func))
(struct-ref v vtable-index-printer) ⇒ my-print-func

This field is writable, allowing the print function to be changed dynamically.

Scheme Procedure: struct-vtable-name vtable
Scheme Procedure: set-struct-vtable-name! vtable name
C Function: scm_struct_vtable_name (vtable)
C Function: scm_set_struct_vtable_name_x (vtable, name)

Get or set the name of vtable. name is a symbol and is used in the default print function when printing structures created from vtable.

(define v (make-vtable "pw"))
(set-struct-vtable-name! v 'my-name)

(define s (make-struct v 0))
(display s) -| #<my-name b7ab3ae0:b7ab3730>

6.7.10.4 Meta-Vtables

As a structure, a vtable also has a vtable, which is also a structure. Structures, their vtables, the vtables of the vtables, and so on form a tree of structures. Making a new structure adds a leaf to the tree, and if that structure is a vtable, it may be used to create other leaves.

If you traverse up the tree of vtables, via calling struct-vtable, eventually you reach a root which is the vtable of itself:

scheme@(guile-user)> (current-module)
$1 = #<directory (guile-user) 221b090>
scheme@(guile-user)> (struct-vtable $1)
$2 = #<record-type module>
scheme@(guile-user)> (struct-vtable $2)
$3 = #<<standard-vtable> 12c30a0>
scheme@(guile-user)> (struct-vtable $3)
$4 = #<<standard-vtable> 12c3fa0>
scheme@(guile-user)> (struct-vtable $4)
$5 = #<<standard-vtable> 12c3fa0>
scheme@(guile-user)> <standard-vtable>
$6 = #<<standard-vtable> 12c3fa0>

In this example, we can say that $1 is an instance of $2, $2 is an instance of $3, $3 is an instance of $4, and $4, strangely enough, is an instance of itself. The value bound to $4 in this console session is also bound to <standard-vtable> in the default environment.

Scheme Variable: <standard-vtable>

A meta-vtable, useful for making new vtables.

All of these values are structures. All but $1 are vtables. As $2 is an instance of $3, and $3 is a vtable, we can say that $3 is a meta-vtable: a vtable that can create vtables.

With this definition, we can specify more precisely what a vtable is: a vtable is a structure made from a meta-vtable. Making a structure from a meta-vtable runs some special checks to ensure that the first field of the structure is a valid layout. Additionally, if these checks see that the layout of the child vtable contains all the required fields of a vtable, in the correct order, then the child vtable will also be a meta-vtable, inheriting a magical bit from the parent.

Scheme Procedure: struct-vtable? obj
C Function: scm_struct_vtable_p (obj)

Return #t if obj is a vtable structure: an instance of a meta-vtable.
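The distinction can be checked directly. This is a small sketch with illustrative names v and s; it relies only on make-vtable, make-struct/no-tail and <standard-vtable>.

```scheme
(define v (make-vtable "pw"))
(define s (make-struct/no-tail v 'payload))

(struct-vtable? v)                       ; ⇒ #t, v can create structures
(struct-vtable? s)                       ; ⇒ #f, s is a plain instance
(struct-vtable? <standard-vtable>)       ; ⇒ #t, a meta-vtable is a vtable too
(struct-vtable? 42)                      ; ⇒ #f, not a structure at all

;; v was made by make-vtable, so its vtable is <standard-vtable>,
;; which in turn is its own vtable.
(eq? (struct-vtable v) <standard-vtable>)                  ; ⇒ #t
(eq? (struct-vtable <standard-vtable>) <standard-vtable>)  ; ⇒ #t
```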

<standard-vtable> is a root of the vtable tree. (Normally there is only one root in a given Guile process, but due to some legacy interfaces there may be more than one.)

The set of required fields of a vtable is the set of fields in the <standard-vtable>, and is bound to standard-vtable-fields in the default environment. It is possible to create a meta-vtable with additional fields in its layout, which can be used to create vtables with additional data:

scheme@(guile-user)> (struct-ref $3 vtable-index-layout)
$7 = pruhsruhpwphuhuhprprpw
scheme@(guile-user)> (struct-ref $4 vtable-index-layout)
$8 = pruhsruhpwphuhuh
scheme@(guile-user)> standard-vtable-fields
$9 = "pruhsruhpwphuhuh"
scheme@(guile-user)> (struct-ref $2 vtable-offset-user)
$10 = module

In this continuation of our earlier example, $2 is a vtable that has extra fields, because its vtable, $3, was made from a meta-vtable with an extended layout. vtable-offset-user is a convenient definition that indicates the number of fields in standard-vtable-fields.

Scheme Variable: standard-vtable-fields

A string containing the ordered set of fields that a vtable must have.

Scheme Variable: vtable-offset-user

The first index in a vtable that is available for a user.

Scheme Procedure: make-struct-layout fields
C Function: scm_make_struct_layout (fields)

Return a structure layout symbol, from a fields string. fields is as described under make-vtable (see Vtables). An invalid fields string is an error.
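A brief sketch of make-struct-layout in isolation. The names layout and invalid-rejected? are illustrative, and the odd-length string "pwp" is just one example of an invalid fields string.

```scheme
(define layout (make-struct-layout "pwpw"))
(symbol? layout)                         ; ⇒ #t

;; The same layout is what make-vtable stores in the layout field.
(eq? (struct-ref (make-vtable "pwpw") vtable-index-layout)
     layout)                             ; ⇒ #t

;; Each field needs two characters, so an odd-length string is rejected.
(define invalid-rejected?
  (catch #t
    (lambda () (make-struct-layout "pwp") #f)
    (lambda args #t)))
invalid-rejected?                        ; ⇒ #t
```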

With these definitions, one can define make-vtable in this way:

(define* (make-vtable fields #:optional printer)
  (make-struct/no-tail <standard-vtable>
    (make-struct-layout fields)
    printer))

6.7.10.5 Vtable Example

Let us bring these points together with an example. Consider a simple object system with single inheritance. Objects will be normal structures, and classes will be vtables with three extra class fields: the name of the class, the parent class, and the list of fields.

So, first we need a meta-vtable that allocates instances with these extra class fields.

(define <class>
  (make-vtable
   (string-append standard-vtable-fields "pwpwpw")
   (lambda (x port)
     (format port "<<class> ~a>" (class-name x)))))

(define (class? x)
  (and (struct? x)
       (eq? (struct-vtable x) <class>)))

To make a structure with a specific meta-vtable, we will use make-struct/no-tail, passing it the computed instance layout and printer, as with make-vtable, and additionally the extra three class fields.

(define (make-class name parent fields)
  (let* ((fields (compute-fields parent fields))
         (layout (compute-layout fields)))
    (make-struct/no-tail <class>
      layout 
      (lambda (x port)
        (print-instance x port))
      name
      parent
      fields)))

Instances will store their associated data in slots in the structure: as many slots as there are fields. The compute-layout procedure below can compute a layout, and field-index returns the slot corresponding to a field.

(define-syntax-rule (define-accessor name n)
  (define (name obj)
    (struct-ref obj n)))

;; Accessors for classes
(define-accessor class-name (+ vtable-offset-user 0))
(define-accessor class-parent (+ vtable-offset-user 1))
(define-accessor class-fields (+ vtable-offset-user 2))

(define (compute-fields parent fields)
  (if parent
      (append (class-fields parent) fields)
      fields))

(define (compute-layout fields)
  (make-struct-layout
   (string-concatenate (make-list (length fields) "pw"))))

(define (field-index class field)
  (list-index (class-fields class) field))

(define (print-instance x port)
  (format port "<~a" (class-name (struct-vtable x)))
  (for-each (lambda (field idx)
              (format port " ~a: ~a" field (struct-ref x idx)))
            (class-fields (struct-vtable x))
            (iota (length (class-fields (struct-vtable x)))))
  (format port ">"))

So, at this point we can actually make a few classes:

(define-syntax-rule (define-class name parent field ...)
  (define name (make-class 'name parent '(field ...))))

(define-class <surface> #f
  width height)

(define-class <window> <surface>
  x y)

And finally, make an instance:

(make-struct/no-tail <window> 400 300 10 20)
⇒ <<window> width: 400 height: 300 x: 10 y: 20>

And that’s that. Note that there are many possible optimizations and feature enhancements that can be made to this object system, and the included GOOPS system does make most of them. For simpler use cases, the records facility is usually sufficient. But sometimes you need to make new kinds of data abstractions, and for that purpose, structs are here.


6.7.10.6 Tail Arrays

Guile’s structures have a facility whereby each instance of a vtable can contain a variable-length tail array of values. The length of the tail array is stored in the structure. This facility was originally intended to allow C code to expose raw C structures with word-sized tail arrays to Scheme.

However, the tail array facility is confusing and doesn’t work very well. It is very rarely used, but it insinuates itself into all invocations of make-struct. For this reason the clumsily-named make-struct/no-tail procedure can actually be more elegant in actual use, because it doesn’t have a random 0 argument stuck in the middle.

Tail arrays also inhibit optimization by allowing instances to affect their shapes. In the absence of tail arrays, all instances of a given vtable have the same number and kinds of fields. This uniformity can be exploited by the runtime and the optimizer. The presence of tail arrays makes some of these optimizations more difficult.

Finally, the tail array facility is ad-hoc and does not compose with the rest of Guile. If a Guile user wants an array with user-specified length, it’s best to use a vector. It is more clear in the code, and the standard optimization techniques will do a good job with it.
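As a sketch of the vector-based alternative suggested above, a structure can simply hold a vector in an ordinary field. The layout and the names v and s are illustrative.

```scheme
;; One fixed field plus one ordinary field that holds a vector.
(define v (make-vtable "pwpw"))
(define s (make-struct/no-tail v "fixed field" (vector 'x 'y)))

(struct-ref s 0)                         ; ⇒ "fixed field"
(vector-length (struct-ref s 1))         ; ⇒ 2
(vector-ref (struct-ref s 1) 0)          ; ⇒ x
(vector-ref (struct-ref s 1) 1)          ; ⇒ y
```

Every instance of v then has exactly two fields, whatever the length of the vector it carries.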

That said, we should mention some details about the interface. A vtable that has a tail array has upper-case permission descriptors: W, R or O, corresponding to tail arrays of writable, read-only, or opaque elements. A tail array permission descriptor may only appear in the last element of a vtable layout.

For example, ‘pW’ indicates a tail of writable Scheme-valued fields. The ‘pW’ field itself holds the tail size, and the tail fields come after it.

(define v (make-vtable "prpW")) ;; one fixed then a tail array
(define s (make-struct v 6 "fixed field" 'x 'y))
(struct-ref s 0) ⇒ "fixed field"
(struct-ref s 1) ⇒ 2    ;; tail size
(struct-ref s 2) ⇒ x    ;; tail array ...
(struct-ref s 3) ⇒ y
(struct-ref s 4) ⇒ #f

6.7.11 Dictionary Types

A dictionary object is a data structure used to index information in a user-defined way. In standard Scheme, the main aggregate data types are lists and vectors. Lists are not really indexed at all, and vectors are indexed only by number (e.g. (vector-ref foo 5)). Often you will find it useful to index your data on some other type; for example, in a library catalog you might want to look up a book by the name of its author. Dictionaries are used to help you organize information in such a way.

An association list (or alist for short) is a list of key-value pairs. Each pair represents a single quantity or object; the car of the pair is a key which is used to identify the object, and the cdr is the object’s value.

A hash table also permits you to index objects with arbitrary keys, but in a way that makes looking up any one object extremely fast. A well-designed hash system makes hash table lookups almost as fast as conventional array or vector references.

Alists are popular among Lisp programmers because they use only the language’s primitive operations (lists, car, cdr and the equality primitives). No changes to the language core are necessary. Therefore, with Scheme’s built-in list manipulation facilities, it is very convenient to handle data stored in an association list. Also, alists are highly portable and can be easily implemented on even the most minimal Lisp systems.

However, alists are inefficient, especially for storing large quantities of data. Because we want Guile to be useful for large software systems as well as small ones, Guile provides a rich set of tools for using either association lists or hash tables.
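To make the comparison concrete, here is a minimal sketch of the library-catalog idea using both dictionary types. The variable names are illustrative; the catalog holds two entries keyed by author.

```scheme
;; Look up a book by its author, first with an association list...
(define catalog-alist
  (list (cons "Abelson" "Structure and Interpretation of Computer Programs")
        (cons "Dybvig"  "The Scheme Programming Language")))

(assoc-ref catalog-alist "Dybvig")
;; ⇒ "The Scheme Programming Language"

;; ...and then with a hash table holding the same entries.
(define catalog-table (make-hash-table))
(for-each (lambda (entry)
            (hash-set! catalog-table (car entry) (cdr entry)))
          catalog-alist)

(hash-ref catalog-table "Dybvig")
;; ⇒ "The Scheme Programming Language"
(hash-ref catalog-table "Nobody")
;; ⇒ #f
```

The alist lookup walks the list entry by entry, whereas the hash table goes more or less directly to the right entry, which is why the difference matters mostly for large collections.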


6.7.12 Association Lists

An association list is a conventional data structure that is often used to implement simple key-value databases. It consists of a list of entries in which each entry is a pair. The key of each entry is the car of the pair and the value of each entry is the cdr.

ASSOCIATION LIST ::=  '( (KEY1 . VALUE1)
                         (KEY2 . VALUE2)
                         (KEY3 . VALUE3)
                         …
                       )

Association lists are also known, for short, as alists.

The structure of an association list is just one example of the infinite number of possible structures that can be built using pairs and lists. As such, the keys and values in an association list can be manipulated using the general list structure procedures cons, car, cdr, set-car!, set-cdr! and so on. However, because association lists are so useful, Guile also provides specific procedures for manipulating them.
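For instance, an alist can be built and inspected with nothing but the general pair and list procedures. This is only a sketch; inventory and extended are illustrative names.

```scheme
;; Build the alist from fresh pairs so that it can be modified safely.
(define inventory (list (cons 'apples 3) (cons 'pears 5)))

(car (car inventory))                    ; ⇒ apples   (key of first entry)
(cdr (car inventory))                    ; ⇒ 3        (value of first entry)

;; Change the value of the first entry in place.
(set-cdr! (car inventory) 4)
inventory                                ; ⇒ ((apples . 4) (pears . 5))

;; Add an entry by consing a new pair onto the front.
(define extended (cons (cons 'plums 7) inventory))
(map car extended)                       ; ⇒ (plums apples pears)
```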


6.7.12.1 Alist Key Equality

All of Guile’s dedicated association list procedures, apart from acons, come in three flavours, depending on the level of equality that is required to decide whether an existing key in the association list is the same as the key that the procedure call uses to identify the required entry. Procedures with assq in their name use eq? to determine key equality, those with assv use eqv?, and those with assoc use equal?.

acons is an exception because it is used to build association lists which do not require their entries’ keys to be unique.
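The practical effect of the equality level is easiest to see with string keys. In this sketch the keys are created with string-copy so that they are guaranteed to be distinct objects with the same contents; people, colours and key are illustrative names.

```scheme
(define people (list (cons (string-copy "mary") 'engineer)))

;; Same characters as the stored key, but a different object.
(define key (string-copy "mary"))

(assq  key people)                       ; ⇒ #f, eq? compares object identity
(assv  key people)                       ; ⇒ #f, eqv? also fails for distinct strings
(assoc key people)                       ; ⇒ ("mary" . engineer), equal? compares contents

;; Symbols with the same name are the same object, so assq is enough.
(define colours (list (cons 'red 1)))
(assq 'red colours)                      ; ⇒ (red . 1)
```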


6.7.12.2 Adding or Setting Alist Entries

acons adds a new entry to an association list and returns the combined association list. The combined alist is formed by consing the new entry onto the head of the alist specified in the acons procedure call. So the specified alist is not modified, but its contents become shared with the tail of the combined alist that acons returns.

In the most common usage of acons, a variable holding the original association list is updated with the combined alist:

(set! address-list (acons name address address-list))

In such cases, it doesn’t matter that the old and new values of address-list share some of their contents, since the old value is usually no longer independently accessible.

Note that acons adds the specified new entry regardless of whether the alist may already contain entries with keys that are, in some sense, the same as that of the new entry. Thus acons is ideal for building alists where there is no concept of key uniqueness.

(define task-list (acons 3 "pay gas bill" '()))
task-list
⇒
((3 . "pay gas bill"))

(set! task-list (acons 3 "tidy bedroom" task-list))
task-list
⇒
((3 . "tidy bedroom") (3 . "pay gas bill"))

assq-set!, assv-set! and assoc-set! are used to add or replace an entry in an association list where there is a concept of key uniqueness. If the specified association list already contains an entry whose key is the same as that specified in the procedure call, the existing entry is replaced by the new one. Otherwise, the new entry is consed onto the head of the old association list to create the combined alist. In all cases, these procedures return the combined alist.

assq-set! and friends may destructively modify the structure of the old association list in such a way that an existing variable is correctly updated without having to set! it to the value returned:

address-list
⇒
(("mary" . "34 Elm Road") ("james" . "16 Bow Street"))

(assoc-set! address-list "james" "1a London Road")
⇒
(("mary" . "34 Elm Road") ("james" . "1a London Road"))

address-list
⇒
(("mary" . "34 Elm Road") ("james" . "1a London Road"))

Or they may not:

(assoc-set! address-list "bob" "11 Newington Avenue")
⇒
(("bob" . "11 Newington Avenue") ("mary" . "34 Elm Road")
 ("james" . "1a London Road"))

address-list
⇒
(("mary" . "34 Elm Road") ("james" . "1a London Road"))

The only safe way to update an association list variable when adding or replacing an entry like this is to set! the variable to the returned value:

(set! address-list
      (assoc-set! address-list "bob" "11 Newington Avenue"))
address-list
⇒
(("bob" . "11 Newington Avenue") ("mary" . "34 Elm Road")
 ("james" . "1a London Road"))

Because of this slight inconvenience, you may find it more convenient to use hash tables to store dictionary data. If your application will not be modifying the contents of an alist very often, this may not make much difference to you.

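If you need to keep the old value of an association list in a form independent from the list that results from modification by acons, assq-set!, assv-set! or assoc-set!, copy the old association list before modifying it. Note that list-copy copies only the list itself and not the entry pairs, so a copy made with list-copy still sees replacements that assq-set! and friends make to existing entries; when existing keys may be replaced, copy the entries too, for example with alist-copy from SRFI-1.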
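One point worth checking when copying is how deep the copy goes. The sketch below compares list-copy with alist-copy from SRFI-1, assuming (as in current Guile) that assq-set! updates an existing entry by changing the cdr of its pair. The variable names are illustrative.

```scheme
(use-modules (srfi srfi-1))              ; for alist-copy

(define original   (list (cons 'a 1) (cons 'b 2)))
(define spine-copy (list-copy original))   ; new list, same entry pairs
(define deep-copy  (alist-copy original))  ; new list and new entry pairs

;; Replace the value of an existing key.
(set! original (assq-set! original 'a 100))

(assq-ref original 'a)                   ; ⇒ 100
(assq-ref spine-copy 'a)                 ; ⇒ 100, the entry pair is shared
(assq-ref deep-copy 'a)                  ; ⇒ 1, fully independent

;; Adding a new key only affects the list it is consed onto.
(set! original (assq-set! original 'c 3))
(assq-ref spine-copy 'c)                 ; ⇒ #f
```

So list-copy is enough to protect against added entries, while protecting against replaced values requires copying the entries as well.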

Scheme Procedure: acons key value alist
C Function: scm_acons (key, value, alist)

Add a new key-value pair to alist. A new pair is created whose car is key and whose cdr is value, and the pair is consed onto alist, and the new list is returned. This function is not destructive; alist is not modified.
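A short sketch showing that acons leaves its argument untouched and that the result shares structure with it. The names base and extended are illustrative.

```scheme
(define base (list (cons 'b 2)))
(define extended (acons 'a 1 base))

extended                                 ; ⇒ ((a . 1) (b . 2))
base                                     ; ⇒ ((b . 2)), unchanged

;; The tail of the new alist is the very same list as the old one.
(eq? (cdr extended) base)                ; ⇒ #t
```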

Scheme Procedure: assq-set! alist key value
Scheme Procedure: assv-set! alist key value
Scheme Procedure: assoc-set! alist key value
C Function: scm_assq_set_x (alist, key, value)
C Function: scm_assv_set_x (alist, key, value)
C Function: scm_assoc_set_x (alist, key, value)

Reassociate key in alist with value: find any existing alist entry for key and associate it with the new value. If alist does not contain an entry for key, add a new one. Return the (possibly new) alist.

These functions do not attempt to verify the structure of alist, and so may cause unusual results if passed an object that is not an association list.


6.7.12.3 Retrieving Alist Entries

assq, assv and assoc find the entry in an alist for a given key, and return the (key . value) pair. assq-ref, assv-ref and assoc-ref do a similar lookup, but return just the value.

Scheme Procedure: assq key alist
Scheme Procedure: assv key alist
Scheme Procedure: assoc key alist
C Function: scm_assq (key, alist)
C Function: scm_assv (key, alist)
C Function: scm_assoc (key, alist)

Return the first entry in alist with the given key. The return is the pair (KEY . VALUE) from alist. If there’s no matching entry the return is #f.

assq compares keys with eq?, assv uses eqv? and assoc uses equal?. See also SRFI-1 which has an extended assoc (SRFI-1 Association Lists).

Scheme Procedure: assq-ref alist key
Scheme Procedure: assv-ref alist key
Scheme Procedure: assoc-ref alist key
C Function: scm_assq_ref (alist, key)
C Function: scm_assv_ref (alist, key)
C Function: scm_assoc_ref (alist, key)

Return the value from the first entry in alist with the given key, or #f if there’s no such entry.

assq-ref compares keys with eq?, assv-ref uses eqv? and assoc-ref uses equal?.

Notice that these functions take the key argument last, like other -ref functions, which is the opposite of the argument order used by assq and friends above.

A return value of #f is ambiguous: it can mean either that key was not found, or that an entry was found whose value happens to be #f. Use assq etc above if you need to differentiate these cases.
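The following sketch shows the ambiguity and one way around it. The helper assq-ref/default is hypothetical, not part of Guile; it is defined here only to illustrate the technique of looking at the entry pair.

```scheme
(define flags (list (cons 'verbose #f)))

(assq-ref flags 'verbose)                ; ⇒ #f, entry present with value #f
(assq-ref flags 'debug)                  ; ⇒ #f, no such entry

;; assq returns the whole entry, so the two cases can be told apart.
(assq 'verbose flags)                    ; ⇒ (verbose . #f)
(assq 'debug flags)                      ; ⇒ #f

;; Hypothetical helper: return DEFAULT only when KEY is really absent.
(define (assq-ref/default alist key default)
  (let ((entry (assq key alist)))
    (if entry
        (cdr entry)
        default)))

(assq-ref/default flags 'verbose 'unset) ; ⇒ #f
(assq-ref/default flags 'debug 'unset)   ; ⇒ unset
```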


6.7.12.4 Removing Alist Entries

To remove the element from an association list whose key matches a specified key, use assq-remove!, assv-remove! or assoc-remove! (depending, as usual, on the level of equality required between the key that you specify and the keys in the association list).

As with assq-set! and friends, the specified alist may or may not be modified destructively, and the only safe way to update a variable containing the alist is to set! it to the value that assq-remove! and friends return.

address-list
⇒
(("bob" . "11 Newington Avenue") ("mary" . "34 Elm Road")
 ("james" . "1a London Road"))

(set! address-list (assoc-remove! address-list "mary"))
address-list
⇒
(("bob" . "11 Newington Avenue") ("james" . "1a London Road"))

Note that, when assq/v/oc-remove! is used to modify an association list that has been constructed only using the corresponding assq/v/oc-set!, there can be at most one matching entry in the alist, so the question of multiple entries being removed in one go does not arise. If assq/v/oc-remove! is applied to an association list that has been constructed using acons, or an assq/v/oc-set! with a different level of equality, or any mixture of these, it removes only the first matching entry from the alist, even if the alist might contain further matching entries. For example:

(define address-list '())
(set! address-list (assq-set! address-list "mary" "11 Elm Street"))
(set! address-list (assq-set! address-list "mary" "57 Pine Drive"))
address-list
⇒
(("mary" . "57 Pine Drive") ("mary" . "11 Elm Street"))

(set! address-list (assoc-remove! address-list "mary"))
address-list
⇒
(("mary" . "11 Elm Street"))

In this example, the two instances of the string "mary" are not the same when compared using eq?, so the two assq-set! calls add two distinct entries to address-list. When compared using equal?, both "mary"s in address-list are the same as the "mary" in the assoc-remove! call, but assoc-remove! stops after removing the first matching entry that it finds, and so one of the "mary" entries is left in place.

Scheme Procedure: assq-remove! alist key
Scheme Procedure: assv-remove! alist key
Scheme Procedure: assoc-remove! alist key
C Function: scm_assq_remove_x (alist, key)
C Function: scm_assv_remove_x (alist, key)
C Function: scm_assoc_remove_x (alist, key)

Delete the first entry in alist associated with key, and return the resulting alist.


6.7.12.5 Sloppy Alist Functions

sloppy-assq, sloppy-assv and sloppy-assoc behave like the corresponding non-sloppy- procedures, except that they return #f when the specified association list is not well-formed, where the non-sloppy- versions would signal an error.

Specifically, there are two conditions for which the non-sloppy- procedures signal an error, which the sloppy- procedures handle instead by returning #f. Firstly, if the specified alist as a whole is not a proper list:

(assoc "mary" '((1 . 2) ("key" . "door") . "open sesame"))
⇒
ERROR: In procedure assoc in expression (assoc "mary" (quote #)):
ERROR: Wrong type argument in position 2 (expecting
   association list): ((1 . 2) ("key" . "door") . "open sesame")

(sloppy-assoc "mary" '((1 . 2) ("key" . "door") . "open sesame"))
⇒
#f

Secondly, if one of the entries in the specified alist is not a pair:

(assoc 2 '((1 . 1) 2 (3 . 9)))
⇒
ERROR: In procedure assoc in expression (assoc 2 (quote #)):
ERROR: Wrong type argument in position 2 (expecting
   association list): ((1 . 1) 2 (3 . 9))

(sloppy-assoc 2 '((1 . 1) 2 (3 . 9)))
⇒
#f

Unless you are explicitly working with badly formed association lists, it is much safer to use the non-sloppy- procedures, because they help to highlight coding and data errors that the sloppy- versions would silently cover up.

Scheme Procedure: sloppy-assq key alist
C Function: scm_sloppy_assq (key, alist)

Behaves like assq but does not do any error checking. Recommended only for use in Guile internals.

Scheme Procedure: sloppy-assv key alist
C Function: scm_sloppy_assv (key, alist)

Behaves like assv but does not do any error checking. Recommended only for use in Guile internals.

Scheme Procedure: sloppy-assoc key alist
C Function: scm_sloppy_assoc (key, alist)

Behaves like assoc but does not do any error checking. Recommended only for use in Guile internals.


6.7.12.6 Alist Example

Here is a longer example of how alists may be used in practice.

(define capitals '(("New York" . "Albany")
                   ("Oregon"   . "Salem")
                   ("Florida"  . "Miami")))

;; What's the capital of Oregon?
(assoc "Oregon" capitals)       ⇒ ("Oregon" . "Salem")
(assoc-ref capitals "Oregon")   ⇒ "Salem"

;; We left out South Dakota.
(set! capitals
      (assoc-set! capitals "South Dakota" "Pierre"))
capitals
⇒ (("South Dakota" . "Pierre")
    ("New York" . "Albany")
    ("Oregon" . "Salem")
    ("Florida" . "Miami"))

;; And we got Florida wrong.
(set! capitals
      (assoc-set! capitals "Florida" "Tallahassee"))
capitals
⇒ (("South Dakota" . "Pierre")
    ("New York" . "Albany")
    ("Oregon" . "Salem")
    ("Florida" . "Tallahassee"))

;; After Oregon secedes, we can remove it.
(set! capitals
      (assoc-remove! capitals "Oregon"))
capitals
⇒ (("South Dakota" . "Pierre")
    ("New York" . "Albany")
    ("Florida" . "Tallahassee"))

6.7.13 VList-Based Hash Lists or “VHashes”

The (ice-9 vlist) module provides an implementation of VList-based hash lists (see VLists). VList-based hash lists, or vhashes, are an immutable dictionary type similar to association lists that maps keys to values. However, unlike association lists, accessing a value given its key is typically a constant-time operation.

The VHash programming interface of (ice-9 vlist) is mostly the same as that of association lists found in SRFI-1, with procedure names prefixed by vhash- instead of alist- (see SRFI-1 Association Lists).

In addition, vhashes can be manipulated using VList operations:

(vlist-head (vhash-consq 'a 1 vlist-null))
⇒ (a . 1)

(define vh1 (vhash-consq 'b 2 (vhash-consq 'a 1 vlist-null)))
(define vh2 (vhash-consq 'c 3 (vlist-tail vh1)))

(vhash-assq 'a vh2)
⇒ (a . 1)
(vhash-assq 'b vh2)
⇒ #f
(vhash-assq 'c vh2)
⇒ (c . 3)
(vlist->list vh2)
⇒ ((c . 3) (a . 1))

However, keep in mind that procedures that construct new VLists (vlist-map, vlist-filter, etc.) return raw VLists, not vhashes:

(define vh (alist->vhash '((a . 1) (b . 2) (c . 3)) hashq))
(vhash-assq 'a vh)
⇒ (a . 1)

(define vl
  ;; This will create a raw vlist.
  (vlist-filter (lambda (key+value) (odd? (cdr key+value))) vh))
(vhash-assq 'a vl)
⇒ ERROR: Wrong type argument in position 2

(vlist->list vl)
⇒ ((a . 1) (c . 3))

Scheme Procedure: vhash? obj

Return true if obj is a vhash.

Scheme Procedure: vhash-cons key value vhash [hash-proc]
Scheme Procedure: vhash-consq key value vhash
Scheme Procedure: vhash-consv key value vhash

Return a new hash list based on vhash where key is associated with value, using hash-proc to compute the hash of key. vhash must be either vlist-null or a vhash returned by a previous call to vhash-cons. hash-proc defaults to hash (see hash procedure). With vhash-consq, the hashq hash function is used; with vhash-consv the hashv hash function is used.

All vhash-cons calls made to construct a vhash should use the same hash-proc. Otherwise, the result is undefined.

Scheme Procedure: vhash-assoc key vhash [equal? [hash-proc]]
Scheme Procedure: vhash-assq key vhash
Scheme Procedure: vhash-assv key vhash

Return the first key/value pair from vhash whose key is equal to key according to the equal? equality predicate (which defaults to equal?), and using hash-proc (which defaults to hash) to compute the hash of key. The second form uses eq? as the equality predicate and hashq as the hash function; the last form uses eqv? and hashv.

Note that it is important to consistently use the same hash function for hash-proc as was passed to vhash-cons. Otherwise, the result is unpredictable.
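A minimal sketch of building and querying a vhash with the default equal?-based procedures. It also illustrates that vhash-cons returns a new vhash and leaves the old one as it was. The names vh1 and vh2 are illustrative.

```scheme
(use-modules (ice-9 vlist))

(define vh1 (vhash-cons "a" 1 vlist-null))
(define vh2 (vhash-cons "b" 2 vh1))      ; vh1 itself is not modified

(vhash? vh2)                             ; ⇒ #t
(vhash-assoc "b" vh2)                    ; ⇒ ("b" . 2)
(vhash-assoc "a" vh2)                    ; ⇒ ("a" . 1)
(vhash-assoc "b" vh1)                    ; ⇒ #f, vh1 still has only "a"
```

Both calls use the default hash-proc, and vhash-assoc is called with its defaults as well, so the hash functions are consistent.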

Scheme Procedure: vhash-delete key vhash [equal? [hash-proc]]
Scheme Procedure: vhash-delq key vhash
Scheme Procedure: vhash-delv key vhash

Remove all associations from vhash with key, comparing keys with equal? (which defaults to equal?), and computing the hash of key using hash-proc (which defaults to hash). The second form uses eq? as the equality predicate and hashq as the hash function; the last one uses eqv? and hashv.

Again the choice of hash-proc must be consistent with previous calls to vhash-cons.
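As a sketch, deleting a key that occurs more than once removes every association for it, and the original vhash is left intact. The vhash is built with hashq, so the eq?-based vhash-delq and vhash-assq are the matching procedures; vh and vh-without-a are illustrative names.

```scheme
(use-modules (ice-9 vlist))

(define vh (alist->vhash '((a . 1) (b . 2) (a . 3)) hashq))
(define vh-without-a (vhash-delq 'a vh))

(vhash-assq 'a vh-without-a)             ; ⇒ #f, both a entries are gone
(vhash-assq 'b vh-without-a)             ; ⇒ (b . 2)
(vhash-assq 'a vh)                       ; ⇒ (a . 1), vh is unchanged
```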

Scheme Procedure: vhash-fold proc init vhash
Scheme Procedure: vhash-fold-right proc init vhash

Fold over the key/value elements of vhash in the given direction, with each call to proc having the form (proc key value result), where result is the result of the previous call to proc and init the value of result for the first call to proc.
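For example, the following sketch sums the values of a vhash and collects its keys in both directions. The names vh, total and the two key lists are illustrative.

```scheme
(use-modules (ice-9 vlist))

(define vh (alist->vhash '((a . 1) (b . 2) (c . 3)) hashq))

;; Sum all values; proc receives the key, the value and the running result.
(define total
  (vhash-fold (lambda (key value result) (+ value result)) 0 vh))
total                                    ; ⇒ 6

;; Collecting keys shows the direction of each fold.
(define keys-left
  (vhash-fold (lambda (key value result) (cons key result)) '() vh))
(define keys-right
  (vhash-fold-right (lambda (key value result) (cons key result)) '() vh))
keys-left                                ; ⇒ (c b a)
keys-right                               ; ⇒ (a b c)
```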

Scheme Procedure: vhash-fold* proc init key vhash [equal? [hash]]
Scheme Procedure: vhash-foldq* proc init key vhash
Scheme Procedure: vhash-foldv* proc init key vhash

Fold over all the values associated with key in vhash, with each call to proc having the form (proc value result), where result is the result of the previous call to proc and init the value of result for the first call to proc.

Keys in vhash are hashed using hash and compared using equal?. The second form uses eq? as the equality predicate and hashq as the hash function; the third one uses eqv? and hashv.

Example:

(define vh
  (alist->vhash '((a . 1) (a . 2) (z . 0) (a . 3))))

(vhash-fold* cons '() 'a vh)
⇒ (3 2 1)

(vhash-fold* cons '() 'z vh)
⇒ (0)

Scheme Procedure: alist->vhash alist [hash-proc]

Return the vhash corresponding to alist, an association list, using hash-proc to compute key hashes. When omitted, hash-proc defaults to hash.

