Guile’s compiler is quite simple – its compilers, to put it more accurately. Guile defines a tower of languages, starting at Scheme and progressively simplifying down to languages that resemble the VM instruction set (see Instruction Set).
Each language knows how to compile to the next, so each step is simple and understandable. Furthermore, this set of languages is not hardcoded into Guile, so it is possible for the user to add new high-level languages, new passes, or even different compilation targets.
Languages are registered in the module (system base language):
(use-modules (system base language))
They are registered with the define-language form.
define-language defines a <language> object, bound to the given name in the current environment. In addition, the language will be added to the global language set. For example, this is the language definition for Scheme:
(define-language scheme
  #:title       "Scheme"
  #:reader      (lambda (port env) ...)
  #:compilers   `((tree-il . ,compile-tree-il))
  #:decompilers `((tree-il . ,decompile-tree-il))
  #:evaluator   (lambda (x module) (primitive-eval x))
  #:printer     write
  #:make-default-environment (lambda () ...))
The interesting thing about having languages defined this way is that they present a uniform interface to the read-eval-print loop. This allows the user to change the current language of the REPL:
scheme@(guile-user)> ,language tree-il
Happy hacking with Tree Intermediate Language!  To switch back, type `,L scheme'.
tree-il@(guile-user)> ,L scheme
Happy hacking with Scheme!  To switch back, type `,L tree-il'.
scheme@(guile-user)>
Languages can be looked up by name, as they were above.
Scheme Procedure: lookup-language name
Looks up a language named name, autoloading it if necessary. Languages are autoloaded by looking for a variable named name in a module named (language name spec). The language object will be returned, or #f if there does not exist a language with that name.
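As a small illustration of looking languages up by name, the sketch below fetches two of the languages mentioned in this section and inspects them. It assumes the accessors language-name and language-title and the predicate language? are exported from (system base language), as they are in recent Guile releases; check your version if any of them is unbound.

```scheme
(use-modules (system base language))

;; Look up two languages by symbol name. Each lookup autoloads
;; the corresponding (language NAME spec) module if needed.
(define scheme-lang (lookup-language 'scheme))
(define tree-il-lang (lookup-language 'tree-il))

;; Inspect the resulting <language> objects through their accessors.
(display (language-name scheme-lang))
(newline)
(display (language-title tree-il-lang))
(newline)
```

Only languages that ship with Guile are looked up here, so the sketch does not depend on how a missing language is reported.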
Defining languages this way allows us to programmatically determine the necessary steps for compiling code from one language to another.
Scheme Procedure: lookup-compilation-order from to
Recursively traverses the set of languages to which from can compile, depth-first, and returns the first path that can transform from to to. Returns #f if no path is found. This function memoizes its results in a cache that is invalidated by subsequent calls to define-language, so it should be quite fast.
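The path search can be tried out directly. The following is a minimal sketch that asks for the compilation path from Scheme to bytecode. It passes language objects rather than symbols, on the assumption that this works across Guile versions, and it deliberately does not assume anything about the shape of the individual path elements, which has varied between releases.

```scheme
(use-modules (system base language))

;; Resolve the endpoints to <language> objects first.
(define from-lang (lookup-language 'scheme))
(define to-lang (lookup-language 'bytecode))

;; Ask for the chain of compilation steps between the two languages.
;; The result is a list describing the steps, or #f if no path exists.
(define path (lookup-compilation-order from-lang to-lang))

(display (if path
             (length path)
             "no path found"))
(newline)
```

The number printed is the number of steps in the path found, which depends on the intermediate languages your Guile version defines.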
There is a notion of a “current language”, which is maintained in the current-language parameter, defined in the core (guile) module. This language is normally Scheme, and may be rebound by the user. The run-time compilation interfaces (see Read/Load/Eval/Compile) also allow you to choose other source and target languages.
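To make the role of the current language concrete, here is a small sketch that compiles the same expression three ways. It assumes compile from (system base compile) accepts #:from and #:to keyword arguments naming languages by symbol, and that the code runs outside the REPL with Scheme as the current language.

```scheme
(use-modules (system base compile))

;; No keywords: the source language comes from (current-language).
(define a (compile '(* 6 7)))

;; Source and target languages chosen explicitly.
(define b (compile '(* 6 7) #:from 'scheme #:to 'value))

;; The parameter rebound for the dynamic extent of the call.
(define c
  (parameterize ((current-language 'scheme))
    (compile '(* 6 7))))

(display (list a b c))
(newline)
```

All three calls take the same route through the tower here; rebinding current-language only matters when it names something other than Scheme.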
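The normal tower of languages when compiling Scheme goes like this: Scheme, then Tree Intermediate Language (Tree-IL), then Continuation-Passing Style (CPS), then Bytecode.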
As discussed before (see Object File Format), bytecode is in ELF format, ready to be serialized to disk. But when compiling Scheme at run time, you want a Scheme value: for example, a compiled procedure. For this reason, so as not to break the abstraction, Guile defines a fake language at the bottom of the tower:
Value
Compiling to value loads the bytecode into a procedure, turning cold bytes into warm code.
Perhaps this strangeness can be explained by example: compile-file defaults to compiling to bytecode, because it produces object code that has to live in the barren world outside the Guile runtime; but compile defaults to compiling to value, as its product re-enters the Guile world.
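The difference between the two targets can be seen from a script. The sketch below compiles one expression to value and to bytecode, then loads the bytecode by hand. It assumes that bytecode is represented as a bytevector and that load-thunk-from-memory from (system vm loader) is available, as in Guile 2.2 and later; the manual loading step only approximates what compiling to value does internally.

```scheme
(use-modules (system base compile)
             (system vm loader)
             (rnrs bytevectors))

;; Default target is value: the result is a live procedure.
(define square (compile '(lambda (x) (* x x))))

;; Target bytecode: the result is an inert image of the compiled code.
(define image (compile '(lambda (x) (* x x)) #:to 'bytecode))

;; Warm the cold bytes by hand: load the image as a thunk and call it
;; to obtain the procedure that the expression evaluates to.
(define warm-square ((load-thunk-from-memory image)))

(display (list (square 5) (bytevector? image) (warm-square 5)))
(newline)
```

compile-file follows the bytecode route and writes the image to disk instead of returning it.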
Indeed, the process of compilation can circulate through these different worlds indefinitely, as shown by the following quine:
((lambda (x) ((compile x) x)) '(lambda (x) ((compile x) x)))