Guile

The GNU extension language

Project Ideas


About this page

This page is largely outdated, but is kept here for reference.

This page is for developers and people learning about Guile who want ideas to pursue. If you've done any of these things, or want advice about them, please feel free to say so on the Guile mailing list!

What this page is about

People occasionally ask if there's anything they can work on in Guile. In the past, I've just been dumb and written up a separate list of ideas in response to each message; so I forget things, I waste a lot of time retyping, and ideas that occur to me when nobody's asking me disappear into the ether. So I've decided to write them down.

Also, these are just the things that have come to my head over the past few months. Each idea kind of argues for its own worldview, but I don't actually claim that they're well-developed, or that the functional boundaries are right, or anything. As a friend of mine used to say, in a similar spirit: ``Why not take two --- they're small!''

This is only meant as a source of ideas, not as a statement of what I think is important, where I think Guile should go, or what I'm willing to incorporate into some ``official'' distribution of something. People should hack on what pleases them. These are basically the things I wish I had time to do.

I hope this is helpful!

--- Jim Blandy

I've added a little here, mostly pointing to some additional information on various bits, and removing some of the bits that have been finished and added to Guile, like guardians.

--- Greg Harvey

Other Guile idea lists

If you've got your own list of Guile modules or features you'd like to see, we'll put a link to it here.

Guile modules

Here are some things I'd like to see people write and distribute as separate Guile modules.

Items are in most-recently-added-or-modified order.

Clean SLIB integration
This may need to wait for the new module system, but it would be nice to have a solid integration of SLIB into Guile, so that SLIB files appear as first-class Guile modules.

Server-side scripting
I'm thinking of something like a translator that reads a mix of HTML-like tags and Scheme code --- basically, something like quasiquote, except that it builds HTML structure, instead of list structure. The translator would compile HTML-like things into expressions that construct a tree which we can finally spit out as HTML.
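As a rough illustration of the "quasiquote that builds HTML structure" idea, here is a minimal sketch in which ordinary quasiquote builds a tree of the form (TAG CHILD ...) and a small procedure spits it out as HTML. The names html-escape and tree->html and the tree representation are assumptions made up for this example, not part of any existing Guile module, and attributes are left out entirely.

```scheme
;; Escape the characters that are special in HTML text.
(define (html-escape str)
  (string-concatenate
   (map (lambda (c)
          (case c
            ((#\<) "&lt;")
            ((#\>) "&gt;")
            ((#\&) "&amp;")
            (else (string c))))
        (string->list str))))

;; Turn a tree into a string of HTML.  A tree is a string, a number,
;; or a list whose first element is a symbol naming the tag
;; (hypothetical representation, chosen for this sketch).
(define (tree->html tree)
  (cond ((string? tree) (html-escape tree))
        ((number? tree) (number->string tree))
        ((and (pair? tree) (symbol? (car tree)))
         (let ((tag (symbol->string (car tree))))
           (string-append "<" tag ">"
                          (string-concatenate (map tree->html (cdr tree)))
                          "</" tag ">")))
        (else (error "unexpected node in HTML tree" tree))))

;; Scheme code and markup mix through ordinary quasiquote and unquote.
(define (greeting name)
  `(p "Hello, " (b ,name) "!"))
```

With these definitions, (tree->html (greeting "Guile & friends")) yields the string <p>Hello, <b>Guile &amp; friends</b>!</p>. A real translator would read the HTML-like syntax and emit expressions like the body of greeting.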

PHP has become really popular, but there's really no such thing as a little language. I bet we'll see the same pattern with that that we've seen with Perl and Tcl --- the interpreter will start out small and fast, but then as people use it for more serious work, it'll start acquiring real control structures, real functions, more datatypes, ... until it'll just be another poorly-implemented lisp with studly libraries. That's basically what Tcl 8.0 is.

I don't know for sure, but I suspect that MHTML invented their own language too. If so, the same rules apply.

We should do it right, by taking the features of PHP and MHTML and giving them to Guile. Someone who does web application development should do this, so it'll get done right. I don't do that much web hacking myself, so I'd get the details all wrong.

Note that it's really easy to hack something up that works here and there. This project will only have a significant impact if it's an industrial-strength job.

See Olin Shivers' ``net.tar.gz''.

IEEE 754 recommended functions and predicates
Guile should provide the functions described in the appendix to IEEE 754. They're handy.

Resurrect the Mesa interface
Long ago, Guile had an interface to the Mesa 3-D graphics library, which is compatible with OpenGL. (See freshmeat for details.) This code could probably be resurrected pretty easily.

Ideally, of course, this would be combined with a Mesa widget for GTK+. :)

This is kind of stupid, but one thing you could use this for is an origami program. I'd like to be able to see the relationship between the folded and unfolded sheets. This is one of those things that's not worth coding in C, but would be fun to whip up in an interpreted language.

Simple Network Management Protocol interface
There's a library called UCD-SNMP (search freshmeat for it) which might be fun to hook up to Guile. I'd love to be able to combine UCD-SNMP with the GTK+ canvas widget and get network utilization graphs from my router.

Bug-tracking system
I'd like something akin to Gnats, but written in Guile. I'm sure we could do better than Gnats, and this would give us a chance to focus on the needs of open source development --- for example, the bug list should be open, and viewable from the web.

Directory walker with list of predicates
People have posted a bunch of functions for walking directory trees, modeled more or less after the Unix `find' command. They're essentially of the form (find DIRECTORY PROC), and apply PROC to each file they find in the directory tree DIRECTORY.

It occurs to me that it might be convenient to actually have `find' accept a list of procedures: (find DIRECTORY PROC ...), and for each file found, apply each PROC successively, until one returns a false value. That way, you could do things like this:

      (find "." (glob ".deps") directory?
            (lambda (d)
               ... now do something with the .deps directory d ...))
    
I'm imagining `glob' to be a function that accepts a filename wildcard pattern and returns a predicate that likes filenames that match the wildcard.

Maciej Stachowiak and others have suggested, as a convention, that any function which accepts a predicate as an argument, and applies that predicate to filenames, should also accept a string as that argument, and treat that string as a wildcard pattern. Thus, the code above would become:

      (find "." ".deps" directory?
            (lambda (d) ...))
    
Now we're starting to get friendly.
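One possible shape for this, as a sketch only: the glob here understands just * and ?, and the helper names ->predicate and run-chain are invented for the example. Strings passed in place of a procedure are treated as wildcard patterns, following the convention suggested above. Symbolic links are not followed.

```scheme
;; Return a predicate that matches the last component of a filename
;; against a wildcard pattern: * matches any run of characters,
;; ? matches exactly one character.
(define (glob pattern)
  (define (match? p s)                  ; p and s are lists of characters
    (cond ((null? p) (null? s))
          ((char=? (car p) #\*)
           (or (match? (cdr p) s)
               (and (pair? s) (match? p (cdr s)))))
          ((null? s) #f)
          ((or (char=? (car p) #\?) (char=? (car p) (car s)))
           (match? (cdr p) (cdr s)))
          (else #f)))
  (lambda (filename)
    (match? (string->list pattern)
            (string->list (basename filename)))))

(define (directory? filename)
  (eq? 'directory (stat:type (lstat filename))))

;; A string stands for a wildcard pattern; anything else is used as is.
(define (->predicate proc-or-pattern)
  (if (string? proc-or-pattern)
      (glob proc-or-pattern)
      proc-or-pattern))

;; Apply each proc to filename in turn, stopping at the first false value.
(define (run-chain procs filename)
  (let loop ((procs procs))
    (or (null? procs)
        (and ((car procs) filename)
             (loop (cdr procs))))))

;; Walk the tree under directory, running the chain on every entry.
(define (find directory . procs)
  (let ((procs (map ->predicate procs)))
    (let walk ((dir directory))
      (let ((stream (opendir dir)))
        (let loop ((entry (readdir stream)))
          (if (eof-object? entry)
              (closedir stream)
              (begin
                (unless (member entry '("." ".."))
                  (let ((path (string-append dir "/" entry)))
                    (run-chain procs path)
                    (when (directory? path)
                      (walk path))))
                (loop (readdir stream)))))))))
```

The last procedure in the chain is the one that does the work; everything before it acts as a filter.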

smart dumping
This is kind of a dumb idea; I have no idea whether it's even worth thinking about.

Would it be possible to write a function which walked data structures in the Scheme heap, and wrote out, directly from Scheme, a valid ELF shared library? Then you could load a module and spit it back out as a shared library. While walking the heap, we have all the information we need to emit the right relocs. The code is certainly machine- and ABI-dependent.

Does this have any advantage over the freezer? The freezer writes out initialized C data structures, which you can then compile into shared libraries; what's wrong with that? I guess the C compiler isn't involved, which is one more variable eliminated. But freezing is much more portable.

Sound/sample processing library
I'd love to use Guile to munge sound samples, try out various effects, etc. The perfect complement to a sound-processing library would be a nice interface to the audio hardware. Look into the Enlightened Sound Daemon (esd), and audiofile, a library that understands how to load a variety of different audio file formats. (Tom Tromey pointed me at esd and audiofile.)

Bill Schottstaedt writes:

I've written a C library of basic audio functions (support for hardware, headers, data types, etc) called sndlib which is already tied into Guile in my sound editor. And a ton of sound-processing functions can be found in clm (a Common Lisp and C based Music V implementation).

If there's interest, I could package all this up in some pretty way; or perhaps help anyone else who wants to do something along these lines. As I understand the copyright issue (I'm no lawyer), Stanford owns the copyright since I'm doing this work as their employee; they, however, have said the software author can decide distribution policy (or whatever the word is), so I placed the sound editor under the GPL, and have always made all the code freeware available via anonymous ftp.

Database engine interfaces
This is pretty important for web scripting, but also just a generally handy thing.

Perl has DBI, which I think we should use as a model; here's the description from CPAN:

The Database Interface. The Perl DBI initiative has standardized the interface to a number of commercial database engines, so that you can move from, say, Oracle to Sybase with a minimum of effort. You'll find DBD::DB2, DBD::Informix, DBD::Oracle, DBD::QBase, DBD::Sybase, DBD::MySQL, and DBD::mSQL inside the DBD module set.

The job here is to discern the important ideas in DBI's design, figure out the nicest way to transpose those into Scheme, write that up, and provide at least one implementation, so that people can write database drivers that meet the spec, and use the implementation as a reference.
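To make the driver idea concrete, here is one very small sketch of how a DBI-style split between a common front end and pluggable drivers might look in Scheme. Every name in it (register-driver!, db-connect, db-query, db-close, and the toy echo driver) is hypothetical; none of this is an existing Guile interface, and a real design would need to cover statements, result sets, errors, and transactions.

```scheme
;; Registry of drivers: an alist from driver name to a connect procedure.
(define drivers '())

(define (register-driver! name connect)
  (set! drivers (acons name connect drivers)))

;; Front end: look the driver up by name and let it make the connection.
;; In this sketch a connection is a procedure that accepts an operation
;; symbol followed by that operation's arguments.
(define (db-connect name . args)
  (let ((connect (assq-ref drivers name)))
    (unless connect
      (error "no such database driver" name))
    (apply connect args)))

(define (db-query conn sql) (conn 'query sql))
(define (db-close conn) (conn 'close))

;; A toy driver that "runs" a query by echoing it back as a single row,
;; purely to exercise the interface.
(register-driver! 'echo
  (lambda args
    (lambda (op . rest)
      (case op
        ((query) (list (car rest)))
        ((close) #t)
        (else (error "unsupported operation" op))))))
```

Code written against db-connect, db-query, and db-close never mentions a particular engine, which is the property that makes moving between engines cheap.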

Database manager interfaces
These would be interfaces to Berkeley DB, GDBM, NDBM, DBM, etc.

I think the difference between a ``database engine'' and a ``database manager'' is that the former usually supports SQL, and handles multiple readers and writers, whereas the latter just implements the data structure on disk, and leaves questions of synchronization and how to actually arrange data usefully in the hands of the caller.

Just as discussed in the ``Database engine interfaces'' entry elsewhere in this list, we want a common interface to these libraries, so people can write code that will operate with any file format.

Query languages
Logic programming languages make great database query languages --- much nicer and more consistent than SQL. It would be really cool if someone could look into the work Richard Salter and Chris Haynes did on embedding backtracking and unification in Scheme, or Per Bothner's similar work, and then use that to produce the best database front end ever.

Of course, this should ride on the

GIMP interface
Peter Mattis did do a Guile/GIMP interface at one point, but we're not the standard there; they're still using SIOD. Someone who uses both Guile and the GIMP needs to pick this up, turn it into a package people can use, and do whatever's needed to get the GIMP people to accept it as their standard. If there are critical Guile changes necessary (Make it smaller? Can do!), I want to help with that.

Basically, I think all we need here is a module that exports the GIMP's functions. Should be real easy.

C parser
It would be cool to hook up a full parser for ANSI C to Guile. Then we could use it to parse header files and generate Guile interfaces, and to scan Guile code for missing argument typechecks, cases where we need to call scm_return_first or scm_remember, and unbalanced calls to SCM_DEFER_INTS. We could even switch Guile to an explicit-marking GC, and then use the checker to catch errors in the explicit marking.

Tom Lord did this for Systas; perhaps his work could be ported back to Guile.

For way extra points, implement a C++ parser. Ouch.

A FastCGI interface
It would be nice to have Guile implement the FastCGI interface, turning each HTTP request into a function call, with arguments and environment broken out.
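The "request becomes a function call" half of this can be sketched without touching the FastCGI wire protocol at all. In the sketch below, the environment is assumed to arrive as an alist of strings, and parse-query and dispatch-request are names invented for the example. Percent-decoding and the FastCGI record framing are deliberately left out.

```scheme
;; Split a query string like "a=1&b=2" into an alist of (name . value)
;; pairs.  No percent-decoding is done in this sketch.
(define (parse-query query)
  (if (string-null? query)
      '()
      (map (lambda (pair)
             (let ((i (string-index pair #\=)))
               (if i
                   (cons (substring pair 0 i) (substring pair (+ i 1)))
                   (cons pair ""))))
           (string-split query #\&))))

;; Call handler with the request's arguments and environment broken out.
;; env is assumed to be an alist with string keys, CGI style.
(define (dispatch-request handler env)
  (handler (parse-query (or (assoc-ref env "QUERY_STRING") ""))
           env))
```

A handler is then just a two-argument procedure, for example (lambda (args env) (assoc-ref args "name")).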

ABI-conformant packed data structure interface
It would be nice if Guile could manipulate arbitrary C data structures. Basically, you'd take the C declaration for a structure, figure out its layout at the bit/byte offset level (given a particular ABI), and generate a bunch of accessors, and a new opaque type for pointers to that structure.
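The layout step can be sketched as follows, assuming the simplest common rule: each field starts at the next multiple of its alignment, and the whole structure is padded to a multiple of its strictest alignment. Real ABIs add more rules (bit-fields, packing pragmas, and so on), and the field description format (NAME SIZE ALIGNMENT) is an assumption made for this example.

```scheme
;; Round offset up to the next multiple of alignment.
(define (align-up offset alignment)
  (* alignment (quotient (+ offset alignment -1) alignment)))

;; fields is a list of (NAME SIZE ALIGNMENT) entries, sizes in bytes.
;; Returns two values: an alist of (NAME . OFFSET), and the total size
;; of the structure including trailing padding.
(define (struct-layout fields)
  (let loop ((fields fields) (offset 0) (max-align 1) (acc '()))
    (if (null? fields)
        (values (reverse acc) (align-up offset max-align))
        (let* ((field (car fields))
               (name  (car field))
               (size  (cadr field))
               (align (caddr field))
               (start (align-up offset align)))
          (loop (cdr fields)
                (+ start size)
                (max max-align align)
                (cons (cons name start) acc))))))
```

For a structure with a 1-byte c, a 4-byte i, and a 2-byte s, each aligned to its own size, (struct-layout '((c 1 1) (i 4 4) (s 2 2))) gives the offsets ((c . 0) (i . 4) (s . 8)) and a total size of 12. Accessors could then be generated from the offset alist.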

We're not going for type safety here --- there should be an operation to turn these opaque values into integers and back, or to treat the data in a string as one of them. But it would be nice to provide some error checking.

If the ABI were something you could choose at run-time, that could make Guile a powerful system for doing cross-platform munging. "Sure, I know exactly how a 6-bit field would be laid out on the i960!" Well, okay --- maybe that's not thrilling. But I still think it would be cool.

Henry Spencer's regexp matcher
I'd like to see someone package up Henry Spencer's latest regexp engine (which supports Unicode's UTF-8 encoding!) as a dynamically linked module for Guile. Actually, I'd like to incorporate this into the Guile core. Our present code just uses whatever regexp engine is in the system's C library, which is sometimes pretty pathetic.

HTTP routines
It would be cool if people could use Guile to implement web robots and the like. Tim Pierce started to work on this, but it's not finished.

General URL functions
This would be a set of functions for just retrieving any URL. I think the WWW Consortium has a library which implements everything.

SGML and DSSSL
As long as we're supporting multiple languages, why not DSSSL? Craig Brozefsky <craig@red-bean.com> has already attached an SGML language parser to Guile.

Emacs-like buffers, for file handling.
Everyone is used to the sed/awk/perl model of file processing --- you munge a line at a time, and maintain state to handle multi-line things. That's just the way it's done.

But actually, there's a totally different system which works a lot better for some applications, exemplified by Emacs. Emacs Lisp gives you buffers with very fast search, insert, and delete operations. You don't have to process the data in any order; there are no line boundaries to obscure the semantic structure of the content, if the file isn't really line-structured; and so on.

So basically, this idea is, "implement Emacs buffers for Guile, with all the searching, editing, and I/O facilities, but none of the redisplay support."
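To give a feel for the programming model (not for an efficient implementation), here is a toy, purely functional buffer: a pair of character lists split at point, with the text before point kept in reverse. This is a stand-in technique chosen for brevity; it is not how Emacs stores buffers, and all the procedure names are made up for the sketch, loosely following their Emacs Lisp counterparts.

```scheme
;; A buffer is (BEFORE . AFTER): BEFORE holds the characters before
;; point in reverse order, AFTER holds the characters after point.
(define (make-buffer text) (cons '() (string->list text)))

(define (buffer-point buf) (length (car buf)))

(define (buffer->string buf)
  (list->string (append (reverse (car buf)) (cdr buf))))

;; Move point one character forward; do nothing at the end.
(define (forward-char buf)
  (if (null? (cdr buf))
      buf
      (cons (cons (cadr buf) (car buf)) (cddr buf))))

;; Insert str at point, leaving point after the inserted text.
(define (insert buf str)
  (cons (append (reverse (string->list str)) (car buf)) (cdr buf)))

;; Delete the character after point, if there is one.
(define (delete-char buf)
  (if (null? (cdr buf))
      buf
      (cons (car buf) (cddr buf))))

(define (list-prefix? prefix lst)
  (cond ((null? prefix) #t)
        ((null? lst) #f)
        ((char=? (car prefix) (car lst))
         (list-prefix? (cdr prefix) (cdr lst)))
        (else #f)))

;; Search forward from point for str.  Return a buffer with point just
;; after the match, or #f if there is no match.
(define (search-forward buf str)
  (let ((needle (string->list str)))
    (let loop ((buf buf))
      (cond ((list-prefix? needle (cdr buf))
             (let skip ((buf buf) (n (length needle)))
               (if (zero? n)
                   buf
                   (skip (forward-char buf) (- n 1)))))
            ((null? (cdr buf)) #f)
            (else (loop (forward-char buf)))))))
```

For example, (buffer->string (insert (search-forward (make-buffer "hello world") "lo ") "big ")) gives "hello big world", with no notion of lines anywhere in the code.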

(Greg) Initial code implementing these is available here. It's not complete, but could be used to build something bigger and better (I intend to go back and reimplement these with goops classes, to allow for more flexible usage).

Cool I/O ports
Guile should be able to talk to compression libraries. You should be able to hand an ordinary output port to a function, and have it give you a new port, where data written to the new port gets written compressed to the original port. And the reverse for uncompression and input ports.

The same principle applies to any kind of stream transformation:

  • encryption
  • uu/base64 encoding, or error correcting codes
  • Unicode/JIS/ISO-8859 conversion
  • CR/LF vs. LF conversion
  • telnet (for implementing FTP)
  • line and column number counting (the port just passes data through unchanged, but counts the number of characters and newlines, and has extra functions that let you read and set the counters).

Guile's port implementation already has the infrastructure needed to implement ports that do arbitrary things with their streams (see the scm_ptobfuns structure). It's just waiting to be used.
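From the Scheme side, the same wrapping pattern can be sketched with soft ports. The example below takes the CR/LF item from the list above: it wraps an ordinary output port in a new port that writes every newline as a carriage return followed by a newline. It assumes Guile's make-soft-port is available; the name make-crlf-output-port is invented for the example.

```scheme
;; Return a new output port.  Text written to it is passed on to PORT
;; with each newline expanded to carriage return plus newline.
(define (make-crlf-output-port port)
  (define (put-char c)
    (when (char=? c #\newline)
      (write-char #\return port))
    (write-char c port))
  (make-soft-port
   (vector put-char                                 ; write one character
           (lambda (s) (string-for-each put-char s)) ; write a string
           (lambda () (force-output port))          ; flush
           #f                                       ; no input side
           (lambda () #f))                          ; close: leave PORT open
   "w"))
```

A compressing or encrypting port would have the same outline, with put-char replaced by something that feeds a buffer to the transformation library. After writing through the wrapper, call force-output on it before inspecting the underlying port.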

SSLeay
The quintessential example of the above. SSLeay is a library that implements the Secure Socket Layer protocol, the foundation of secure http. It's basically a generic authentication and privacy layer for network connections. I think PRMS uses it too.

Adobe Document Structuring Conventions parser
Make it trivial to write psnup, and such. Make it trivial to produce output that conforms to the DSC.

Functional PostScript
Imagine taking the primitives of PostScript and providing them in a more functional-language kind of way. That's what Olin Shivers' group has done.

FPS is a portable system for doing device-independent, resolution-independent graphics from Scheme programs. It is PostScript, with the Forth computational engine replaced with Scheme. At present, it runs on SCSH.

Occam-like thread control structures
The Occam language, designed for the INMOS transputer, made parallelism as concise to use as `let'. It was a much nicer way of thinking about threads, I think. It would be cool if someone implemented the interesting Occam features in Guile:

  • channels: These are one-deep message queues, with the right blocking behavior; Guile's channels would carry objects.
  • SEQ: Well, this is just begin.
  • PAR: Like begin, but execute all the subexpressions in parallel.

And so on.
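Here is a minimal sketch of PAR and a one-deep channel on top of Guile's (ice-9 threads) module, assuming a Guile built with thread support. The names par, make-channel, channel-put!, and channel-get! are invented for the example. Note one simplification: this channel is a one-slot buffer (a writer blocks only while the slot is full), whereas Occam's channels make the writer wait until the reader has actually taken the value.

```scheme
(use-modules (ice-9 threads))

;; PAR: like begin, but run every subexpression in its own thread and
;; wait for all of them to finish.
(define-syntax par
  (syntax-rules ()
    ((_ expr ...)
     (let ((threads (list (call-with-new-thread (lambda () expr)) ...)))
       (for-each join-thread threads)))))

;; A channel is a vector: mutex, condition variable, full flag, value.
(define (make-channel)
  (vector (make-mutex) (make-condition-variable) #f #f))

;; Block until the slot is empty, then store obj in it.
(define (channel-put! ch obj)
  (with-mutex (vector-ref ch 0)
    (let loop ()
      (when (vector-ref ch 2)
        (wait-condition-variable (vector-ref ch 1) (vector-ref ch 0))
        (loop)))
    (vector-set! ch 3 obj)
    (vector-set! ch 2 #t)
    (broadcast-condition-variable (vector-ref ch 1))))

;; Block until the slot is full, then empty it and return its value.
(define (channel-get! ch)
  (with-mutex (vector-ref ch 0)
    (let loop ()
      (unless (vector-ref ch 2)
        (wait-condition-variable (vector-ref ch 1) (vector-ref ch 0))
        (loop)))
    (let ((obj (vector-ref ch 3)))
      (vector-set! ch 2 #f)
      (broadcast-condition-variable (vector-ref ch 1))
      obj)))
```

A producer and a consumer then run side by side with (par (channel-put! ch 42) (display (channel-get! ch))), where ch was made by make-channel.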

RenderMan interface
It would be nice to be able to generate RenderMan scene description files using Guile code.

Translators for other languages

Here are some languages I'd love to see translated into Guile.

Items are in most-recently-added-or-modified order.

Tcl 8.0
Tcl's syntax is remarkably simple for a language its age; I quite like it. Especially with the semantic cleanups made for Tcl 8.0, we should be able to do a good integration here.

Ian Bicking <bickiia@earlham.edu> has done some work on a translator. The first cut was an interpreter, partially evaluated using Similix to produce a compiler; the latest version has been rewritten by hand.

PHP
The simple server-side scripting language.
MetaHTML
A more complex server-side scripting system, but a new language, so still probably clean enough to tackle.
Python
Quite a pretty language, clean in both syntax and semantics. ("Pure in body and mind.") Datatypes seem very friendly with Scheme's, so it should be possible to do a very satisfying integration here.
Emacs Lisp
An interesting challenge. Needs a decent solution for reconciling the nil/()/#f issue, which this translator should use. This is being worked on in the 1.9 pre-release series, which features an Emacs Lisp compiler.
Perl
This is a herculean task, because Perl's syntax and semantics are so complicated. Hats off to whoever even tries this.

Changes to the Guile Core

Here are some improvements I'd like to see made to the core Guile interpreter.

Internationalization/Multilingualization

Guile 1.9 supports Unicode via GNU libunistring. It also comes with a new module called (ice-9 i18n), which provides advanced internationalization capabilities.

Henry Spencer's latest regexp engine handles UTF-8, but as of this writing, it hadn't been optimized yet.

Guile also needs some kind of gettext interface. We could add a new syntax for translatable strings like

#"This is a translatable string."

Custom buffered I/O
Guile has several different kinds of I/O ports. Those that talk to the outside world are implemented on top of the ubiquitous C standard I/O FILE buffered streams. This leads to a few problems:
  • We have to use fgets for speed, but it's difficult to handle lines containing null characters, given fgets's interface. So we use ftell to find out how much we've read with fgets. But that doesn't work on sockets. So on sockets we fall back to our old, slow routine based on getc.
  • We have to use unbuffered input sockets, because standard I/O streams only promise you one buffer, so you can't mix read and write operations, to implement a network protocol, say. You have to do an fflush between a write and a read, and an fseek or something equivalent between a read and a write. This is stupid.
  • There's no way to tell whether there's input immediately available on a buffered stream. You can get the underlying file descriptor and do fcntl magic on that, but that won't tell you whether there are characters waiting in the buffer.

You get the idea. The thing is, every Unix system has read, write, and seek, and I don't know of any system that doesn't have select. So we could actually implement our own buffering port implementation, and address all these problems.

We'd actually do better, because people have had some good ideas since standard I/O was implemented. For example, we could follow the lead of the libio library, and expose the buffer directly to the consumer, thus avoiding some copies. We could run regexp matches directly in the buffer. We could implement our own definition of line boundaries with little penalty. Each port could have a magic writable shared substring object that gave Scheme code direct access to the "current line", with no copies. (Well, maybe that's not such a hot idea. But that's what Perl does.)

Having our own buffered stream implementation would also allow us to start acquiring cool optimizations strictly below the interface. The presence of the interface would protect Guile from whatever weird system-specific stuff we wanted to do for speed.

(Greg) Gary Houston has been working on this; patches against cvs guile (for the brave) are available at Gary's web page.

Generational garbage collection
Guile 1.9 uses the Boehm-Demers-Weiser garbage collector (also referred to as "libgc"), which implements generational garbage collection. The text below is kept for historical purposes.

(Greg Harvey has started taking a shot at this. (Greg) Check out my personal Guile page for news, notes, and code.)

At the moment, anyone who profiles the Guile interpreter notices that it's spending a lot of its time in gc_mark. This is not too surprising. I'd like Guile to have a conservative generational collector. The hard parts here are the write barrier, and managing conservatism. I've put some of my ideas for dealing with conservatism here.

The usual way to keep track of an object's generation is to keep each generation in a separate region of memory, and then check the object's address to see which generation it's in. To age an object, you copy it from a younger generation to an older generation, and update all pointers to that object. There's nothing magic about this; you could just as well have a field in each object saying what generation it's in. However, using address ranges saves space; you don't need that extra field per object.

Unfortunately, when you're using a conservative collector like Guile's, you can't move an object that's pointed to by the stack. You have no idea whether any given word on the stack is actually a pointer, or just some integer, or a piece of a string, so you can't fix up the pointer after you've moved the object. Which complicates the collector a bit, if you want to copy objects.

One approach would be to assume that the stack doesn't contain pointers to too many objects, so you could just leave those there. After all, aging just affects performance; it's not necessary for correctness. I think this is a variant of Joel Bartlett's ``Mostly-copying Collector'' idea. Guile uses a free list to manage its storage, so having a few old objects sticking around (is there some nice concise derogatory term for people who have failed a grade in school and are sticking around for another year?) doesn't affect the allocation strategy at all.

(Greg) This assumption is a pretty safe one; the number of cells traced conservatively is generally a very small fraction of the actual number of cells traced. However, it's worth mentioning that Bartlett's method is patented (don't get me started), so we have to be a little careful about the gc we end up with.

Anyway, anyone considering this project should check out Paul Wilson's survey papers on garbage collection.

Miscellanea

Here are projects that don't fall into the above categories.

GDB support for debugging Guile-using C code
GDB actually has some Scheme support in there; we should teach it how to print Scheme values, how to print interpreter frames, and so on.

This has been done in the past with a mixed GDB/Guile solution, but I think it would be more robust to actually put everything in GDB.

Negotiate design with the GDB group, so it can be merged in.