9.3.1 Why a VM?

For a long time, Guile only had a Scheme interpreter, implemented in C. Guile’s interpreter operated directly on the S-expression representation of Scheme source code.

But while the interpreter was highly optimized and hand-tuned, it still performed many needless computations during the course of evaluating a Scheme expression. For example, application of a function to arguments needlessly consed up the arguments in a list. Evaluation of an expression like (f x y) always had to figure out whether f was a procedure, or a special form like if, or something else. The interpreter represented the lexical environment as a heap data structure, so every evaluation caused allocation, which was of course slow. Et cetera.

The solution to the slow-interpreter problem was to compile the higher-level language, Scheme, into a lower-level language for which all of the checks and dispatching have already been done—the code is instead stripped to the bare minimum needed to “do the job”.
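To make the contrast concrete, here is a minimal sketch of the two approaches for a toy expression language. It is not Guile's evaluator, compiler, or instruction set: the tuple encoding of expressions, the PRIMITIVES table, the instruction names, and the stack-machine design are all assumptions made for illustration. The first function re-examines each form every time it is evaluated; the second decides what each form is once and emits a flat instruction list that a simple loop can run.

```python
import operator

# Hypothetical primitive table; stands in for procedures bound in an environment.
PRIMITIVES = {"+": operator.add, "*": operator.mul, "<": operator.lt}


def evaluate(expr, env):
    """Tree-walking evaluation: re-inspects the shape of expr on every call."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return env[expr]
    head = expr[0]
    if head == "if":  # special-form check repeated on every evaluation
        _, test, then, alt = expr
        return evaluate(then if evaluate(test, env) else alt, env)
    args = [evaluate(arg, env) for arg in expr[1:]]  # builds an argument list
    return PRIMITIVES[head](*args)


def compile_expr(expr, code=None):
    """Classify each form once and emit flat instructions for a toy stack machine."""
    code = [] if code is None else code
    if isinstance(expr, (int, float)):
        code.append(("const", expr))
    elif isinstance(expr, str):
        code.append(("ref", expr))
    elif expr[0] == "if":
        _, test, then, alt = expr
        compile_expr(test, code)
        jump_if_false = len(code)
        code.append(None)  # placeholder, patched once the target is known
        compile_expr(then, code)
        jump_end = len(code)
        code.append(None)  # placeholder, patched once the target is known
        code[jump_if_false] = ("jump-if-false", len(code))
        compile_expr(alt, code)
        code[jump_end] = ("jump", len(code))
    else:
        for arg in expr[1:]:
            compile_expr(arg, code)
        # The procedure lookup happens here, at compile time, not at run time.
        code.append(("call", PRIMITIVES[expr[0]], len(expr) - 1))
    return code


def run(code, env):
    """Execute the instruction list; no form classification is left to do."""
    stack, pc = [], 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op[0] == "const":
            stack.append(op[1])
        elif op[0] == "ref":
            stack.append(env[op[1]])
        elif op[0] == "jump-if-false":
            if not stack.pop():
                pc = op[1]
        elif op[0] == "jump":
            pc = op[1]
        else:  # "call": pop nargs operands, push the result
            _, proc, nargs = op
            start = len(stack) - nargs
            args = stack[start:]
            del stack[start:]
            stack.append(proc(*args))
    return stack.pop()


# Corresponds to (if (< x 10) (+ x 1) (* x 2)).
expr = ("if", ("<", "x", 10), ("+", "x", 1), ("*", "x", 2))
code = compile_expr(expr)
```

Both paths compute the same value for a given environment, but `code` can be run any number of times without asking again whether a form is an `if` or a call. Being Python, this sketch still allocates freely; it only illustrates moving the dispatch work from run time to compile time.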

The question then becomes which low-level language to choose. There are many options. We could compile to native code directly, but that poses portability problems for Guile, as it is a highly cross-platform project.

So we want the performance gains that compilation provides, but we also want to maintain the portability benefits of a single code path. The obvious solution is to compile to a virtual machine that is present on all Guile installations.

The easiest (and most fun) way to depend on a virtual machine is to implement the virtual machine within Guile itself. Guile contains a bytecode interpreter (written in C) and a Scheme-to-bytecode compiler (written in Scheme). This way the virtual machine provides what Scheme needs (tail calls, multiple values, call/cc) and can provide optimized inline instructions for Guile as well (GC-managed allocations, type checks, etc.).

Guile also includes a just-in-time (JIT) compiler to translate bytecode to native code. Because Guile embeds a portable code generation library (https://gitlab.com/wingo/lightening), we keep the benefits of portability while also benefiting from fast native code. To avoid spending too much time in the JIT compiler itself, Guile is tuned to emit machine code only for bytecode that is called often.
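The idea of compiling only frequently called code can be sketched with a per-procedure counter. The following is an illustration under stated assumptions, not Guile's JIT: the class name TieredProcedure, the compile_native callback, and the threshold value are all hypothetical, and the "native" version here is just another Python callable.

```python
class TieredProcedure:
    """Runs a slow path until the call count reaches a threshold."""

    def __init__(self, interpret, compile_native, threshold=3):
        self.interpret = interpret            # slow path: stands in for bytecode interpretation
        self.compile_native = compile_native  # hypothetical: returns a faster callable
        self.threshold = threshold            # illustrative value, not Guile's setting
        self.calls = 0
        self.native = None

    def __call__(self, *args):
        if self.native is not None:
            return self.native(*args)         # already compiled: take the fast path
        self.calls += 1
        if self.calls >= self.threshold:
            # The procedure is now "hot": pay the compilation cost once.
            self.native = self.compile_native()
            return self.native(*args)
        return self.interpret(*args)


compilations = []


def compile_square():
    # Record each compilation so we can see that it happens only once.
    compilations.append("square")
    return lambda x: x * x


square = TieredProcedure(lambda x: x ** 2, compile_square, threshold=3)
results = [square(n) for n in range(5)]
```

A procedure that is called only once or twice never pays the compilation cost, while one that is called repeatedly is compiled a single time and then always takes the fast path.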

The rest of this section describes the VM that Guile implements, and the compiled procedures that run on it.

Before moving on, though, we should note that though we spoke of the interpreter in the past tense, Guile still has an interpreter. The difference is that before, it was Guile’s main Scheme implementation, and so was implemented in highly optimized C; now, it is actually implemented in Scheme, and compiled down to VM bytecode, just like any other program. (There is still a C interpreter around, used to bootstrap the compiler, but it is not normally used at runtime.)

The upside of implementing the interpreter in Scheme is that we preserve tail calls and multiple-value handling between interpreted and compiled code, and with the advent of the JIT compiler in Guile 3.0 we reach the speed of the old hand-tuned C implementation; it’s the best of both worlds.

Also note that this decision to implement a bytecode compiler does not preclude ahead-of-time native compilation. More possibilities are discussed in Extending the Compiler.