9.4.4.2 CPS in Guile

Guile’s CPS language is composed of continuations. A continuation is a labelled program point. If you are used to traditional compilers, think of a continuation as a trivial basic block. A program is a “soup” of continuations, represented as a map from labels to continuations.

Like basic blocks, each continuation belongs to only one function. Some continuations are special, like the continuation corresponding to a function’s entry point, or the continuation that represents the tail of a function. Others contain a term. A term contains an expression, which evaluates to zero or more values. The term also records the continuation to which it will pass its values. Some terms, like conditional branches, may continue to one of a number of continuations.

Continuation labels are small integers. This makes it easy to sort them and to group them into sets. Whenever a term refers to a continuation, it does so by name, simply recording the label of the continuation. Continuation labels are unique among the set of labels in a program.

Variables are also named by small integers. Variable names are unique among the set of variables in a program.

For example, a simple continuation that receives two values and adds them together can be matched like this, using the match form from (ice-9 match):

(match cont
  (($ $kargs (x-name y-name) (x-var y-var)
      ($ $continue k src ($ $primcall '+ #f (x-var y-var))))
   (format #t "Add ~a and ~a and pass the result to label ~a"
           x-var y-var k)))

Here we see the most common kind of continuation, $kargs, which binds some number of values to variables and then evaluates a term.

CPS Continuation: $kargs names vars term

Bind the incoming values to the variables vars, with original names names, and then evaluate term.

The names of a $kargs are just for debugging, and will end up residualized in the object file for use by the debugger.

The term in a $kargs is always a $continue, which evaluates an expression and continues to a continuation.

CPS Term: $continue k src exp

Evaluate the expression exp and pass the resulting values (if any) to the continuation labelled k. The source information associated with the expression may be found in src, which is either an alist as in source-properties or is #f if there is no associated source.

There are a number of expression kinds. Above you see an example of $primcall.

CPS Expression: $primcall name param args

Perform the primitive operation identified by name, a well-known symbol, passing it the arguments args, and pass all resulting values to the continuation.

param is a constant parameter whose interpretation is up to the primcall in question. Usually it’s #f but for a primcall that might need some compile-time constant information – such as add/immediate, which adds a constant number to a value – the parameter holds this information.

The set of available primitives includes many primitives known to Tree-IL and then some more; see the source code for details. Note that some Tree-IL primcalls need to be converted to a sequence of lower-level CPS primcalls. Again, see (language tree-il compile-cps) for full details.

The variables that are used by $primcall, or indeed by any expression, must be defined before the expression is evaluated. An equivalent way of saying this is that predecessor $kargs continuation(s) that bind the variables(s) used by the expression must dominate the continuation that uses the expression: definitions dominate uses. This condition is trivially satisfied in our example above, but in general to determine the set of variables that are in “scope” for a given term, you need to do a flow analysis to see what continuations dominate a term. The variables that are in scope are those variables defined by the continuations that dominate a term.

Here is an inventory of the kinds of expressions in Guile’s CPS language, besides $primcall which has already been described. Recall that all expressions are wrapped in a $continue term which specifies their continuation.

CPS Expression: $const val

Continue with the constant value val.

CPS Expression: $prim name

Continue with the procedure that implements the primitive operation named by name.

CPS Expression: $call proc args

Call proc with the arguments args, and pass all values to the continuation. proc and the elements of the args list should all be variable names. The continuation identified by the term’s k should be a $kreceive or a $ktail instance.

CPS Expression: $values args

Pass the values named by the list args to the continuation.

CPS Expression: $prompt escape? tag handler

There are two sub-languages of CPS, higher-order CPS and first-order CPS. The difference is that in higher-order CPS, there are $fun and $rec expressions that bind functions or mutually-recursive functions in the implicit scope of their use sites. Guile transforms higher-order CPS into first-order CPS by closure conversion, which chooses representations for all closures and which arranges to access free variables through the implicit closure parameter that is passed to every function call.

CPS Expression: $fun body

Continue with a procedure. body names the entry point of the function, which should be a $kfun. This expression kind is only valid in higher-order CPS, which is the CPS language before closure conversion.

CPS Expression: $rec names vars funs

Continue with a set of mutually recursive procedures denoted by names, vars, and funs. names is a list of symbols, vars is a list of variable names (unique integers), and funs is a list of $fun values. Note that the $kargs continuation should also define names/vars bindings.

The contification pass will attempt to transform the functions declared in a $rec into local continuations. Any remaining $fun instances are later removed by the closure conversion pass. If the function has no free variables, it gets allocated as a constant.

CPS Expression: $const-fun label

A constant which is a function whose entry point is label. As a constant, instances of $const-fun with the same label will not allocate; the space for the function is allocated as part of the compilation unit.

In practice, $const-fun expressions are reified by CPS-conversion for functions whose call sites are not all visible within the compilation unit and which have no free variables. This expression kind is part of first-order CPS.

Otherwise, if the closure has free variables, it will be allocated at its definition site via an allocate-words primcall and its free variables initialized there. The code pointer in the closure is initialized from a $code expression.

CPS Expression: $code label

Continue with the value of label, which should denote some $kfun continuation in the program. Used when initializing the code pointer of closure objects.

However, If the closure can be proven to never escape its scope then other lighter-weight representations can be chosen. Additionally, if all call sites are known, closure conversion will hard-wire the calls by lowering $call to $callk.

CPS Expression: $callk label proc args

Like $call, but for the case where the call target is known to be in the same compilation unit. label should denote some $kfun continuation in the program. In this case the proc is simply an additional argument, since it is not used to determine the call target at run-time.

To summarize: a $continue is a CPS term that continues to a single label. But there are other kinds of CPS terms that can continue to a different number of labels: $branch, $switch, $throw, and $prompt.

CPS Term: $branch kf kt src op param args

Evaluate the branching primcall op, with arguments args and constant parameter param, and continue to kt with zero values if the test is true. Otherwise continue to kf.

The $branch term is like a $continue term with a $primcall expression, except that instead of binding a value and continuing to a single label, the result of the test is not bound but instead used to choose the continuation label.

The set of operations (corresponding to op values) that are valid in a $branch is limited. In the general case, bind the result of a test expression to a variable, and then make a $branch on a true? op referencing that variable. The optimizer should inline the branch if possible.

CPS Term: $switch kf kt* src arg

Continue to a label in the list k* according to the index argument arg, or to the default continuation kf if arg is greater than or equal to the length k*. The index variable arg is an unboxed, unsigned 64-bit value.

The $switch term is like C’s switch statement. The compiler to CPS can generate a $switch term directly, if the source language has such a concept, or it can rely on the CPS optimizer to turn appropriate chains of $branch statements to $switch instances, which is what the Scheme compiler does.

CPS Term: $throw src op param args

Throw a non-resumable exception. Throw terms do not continue at all. The usual value of op is throw, with two arguments key and args. There are also some specific primcalls that compile to the VM throw/value and throw/value+data instructions; see the code for full details.

The advantage of having $throw as a term is that, because it does not continue, this allows the optimizer to gather more information from type predicates. For example, if the predicate is char? and the kf continues to a throw, the set of labels dominated by kt is larger than if the throw notationally continued to some label that would never be reached by the throw.

CPS Term: $prompt k kh src escape? tag

Push a prompt on the stack identified by the variable name tag, which may be escape-only if escape? is true, and continue to kh with zero values. If the body aborts to this prompt, control will proceed at the continuation labelled kh, which should be a $kreceive continuation. Prompts are later popped by pop-prompt primcalls.

At this point we have described terms, expressions, and the most common kind of continuation, $kargs. $kargs is used when the predecessors of the continuation can be instructed to pass the values where the continuation wants them. For example, if a $kargs continuation k binds a variable v, and the compiler decides to allocate v to slot 6, all predecessors of k should put the value for v in slot 6 before jumping to k. One situation in which this isn’t possible is receiving values from function calls. Guile has a calling convention for functions which currently places return values on the stack. A continuation of a call must check that the number of values returned from a function matches the expected number of values, and then must shuffle or collect those values to named variables. $kreceive denotes this kind of continuation.

CPS Continuation: $kreceive arity k

Receive values on the stack. Parse them according to arity, and then proceed with the parsed values to the $kargs continuation labelled k. As a limitation specific to $kreceive, arity may only contain required and rest arguments.

$arity is a helper data structure used by $kreceive and also by $kclause, described below.

CPS Data: $arity req opt rest kw allow-other-keys?

A data type declaring an arity. req and opt are lists of source names of required and optional arguments, respectively. rest is either the source name of the rest variable, or #f if this arity does not accept additional values. kw is a list of the form ((keyword name var) ...), describing the keyword arguments. allow-other-keys? is true if other keyword arguments are allowed and false otherwise.

Note that all of these names with the exception of the vars in the kw list are source names, not unique variable names.

Additionally, there are three specific kinds of continuations that are only used in function entries.

CPS Continuation: $kfun src meta self tail clause

Declare a function entry. src is the source information for the procedure declaration, and meta is the metadata alist as described above in Tree-IL’s <lambda>. self is a variable bound to the procedure being called, and which may be used for self-references. tail is the label of the $ktail for this function, corresponding to the function’s tail continuation. clause is the label of the first $kclause for the first case-lambda clause in the function, or otherwise #f.

CPS Continuation: $ktail

A tail continuation.

CPS Continuation: $kclause arity cont alternate

A clause of a function with a given arity. Applications of a function with a compatible set of actual arguments will continue to the continuation labelled cont, a $kargs instance representing the clause body. If the arguments are incompatible, control proceeds to alternate, which is a $kclause for the next clause, or #f if there is no next clause.