5.5.4 Foreign Object Memory Management

Once a foreign object has been released to the tender mercies of the Scheme system, it must be prepared to survive garbage collection. In the example above, all the memory associated with the foreign object is managed by the garbage collector because we used the scm_gc_ allocation functions. Thus, no special care must be taken: the garbage collector automatically scans that memory and reclaims whatever is no longer in use.

However, when data associated with a foreign object is managed in some other way—e.g., malloc’d memory or file descriptors—it is possible to specify a finalizer function to release those resources when the foreign object is reclaimed.

As discussed in Garbage Collection, Guile’s garbage collector will reclaim inaccessible memory as needed. This reclamation process runs concurrently with the main program. When Guile analyzes the heap and determines that an object’s memory can be reclaimed, that memory is put on a “free list” of objects that can be reclaimed. Usually that’s the end of it: the object is available for immediate re-use. However, some objects can have “finalizers” associated with them: functions that are called on reclaimable objects to effect any external cleanup actions.

Finalizers are tricky business and it is best to avoid them. They can be invoked at unexpected times, or not at all—for example, they are not invoked on process exit. They don’t help the garbage collector do its job; in fact, they are a hindrance. Furthermore, they perturb the garbage collector’s internal accounting. The GC decides to scan the heap when it thinks that it is necessary, after some amount of allocation. Finalizable objects almost always represent an amount of allocation that is invisible to the garbage collector. The effect can be that the actual resource usage of a system with finalizable objects is higher than what the GC thinks it should be.
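The accounting problem can be made concrete with a small model. The sketch below does not use Guile at all; every toy_ name is hypothetical and exists only for illustration. It assumes a collector that counts only the bytes requested through its own allocator, and shows how a small wrapper object can hide a much larger external allocation from that count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy accounting, NOT Guile's implementation.  */
static size_t toy_gc_visible_bytes = 0;  /* what the "collector" counts */
static size_t toy_actual_bytes = 0;      /* what the process really uses */

/* Allocation the toy collector knows about.  */
static void *
toy_gc_alloc (size_t n)
{
  void *p = malloc (n);
  if (p != NULL)
    {
      toy_gc_visible_bytes += n;
      toy_actual_bytes += n;
    }
  return p;
}

/* Allocation made behind the collector's back, as a finalizable
   object's external resource typically is.  */
static void *
toy_external_alloc (size_t n)
{
  void *p = malloc (n);
  if (p != NULL)
    toy_actual_bytes += n;
  return p;
}

/* A small collector-visible wrapper around a large external buffer.  */
struct toy_wrapper
{
  void *external;
};

static struct toy_wrapper *
toy_make_wrapper (size_t external_size)
{
  assert (external_size > 0);
  struct toy_wrapper *w = toy_gc_alloc (sizeof *w);
  if (w == NULL)
    return NULL;
  w->external = toy_external_alloc (external_size);
  if (w->external == NULL)
    {
      /* Roll back the accounting for the wrapper itself.  */
      toy_gc_visible_bytes -= sizeof *w;
      toy_actual_bytes -= sizeof *w;
      free (w);
      return NULL;
    }
  return w;
}
```

After one call to toy_make_wrapper (4096), the toy collector has seen only sizeof (struct toy_wrapper) bytes, while the process holds that amount plus 4096. A collector that schedules its next scan from the first number will run later than the real memory pressure warrants.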

All those caveats aside, some foreign object types will need finalizers. For example, if we had a foreign object type that wrapped file descriptors (and we aren’t suggesting this, as Guile already has ports), then we might define the type like this:

static SCM file_type;

static void
finalize_file (SCM file)
{
  int fd = scm_foreign_object_signed_ref (file, 0);
  if (fd >= 0)
    {
      scm_foreign_object_signed_set_x (file, 0, -1);
      close (fd);
    }
}

static void
init_file_type (void)
{
  SCM name, slots;
  scm_t_struct_finalize finalizer;

  name = scm_from_utf8_symbol ("file");
  slots = scm_list_1 (scm_from_utf8_symbol ("fd"));
  finalizer = finalize_file;

  file_type =
    scm_make_foreign_object_type (name, slots, finalizer);
}

static SCM
make_file (int fd)
{
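  /* Widen to a pointer-sized integer first, so that the cast does not
     convert between an int and a pointer of a different size.  */
  return scm_make_foreign_object_1 (file_type,
                                    (void *) (scm_t_signed_bits) fd);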
}

Note that the finalizer may be invoked in ways and at times you might not expect. In a Guile built without threading support, finalizers are invoked via “asyncs”, which interleave them with running Scheme code; see Asynchronous Interrupts. If the user’s Guile is built with support for threads, the finalizer will probably be called by a dedicated finalization thread, unless the user invokes scm_run_finalizers () explicitly.

In either case, finalizers run concurrently with the main program, and so they need to be async-safe and thread-safe. If for some reason this is impossible, perhaps because you are embedding Guile in some application that is not itself thread-safe, you have two options. One is to use guardians instead of finalizers, and arrange to pump the guardians for finalizable objects. See Guardians, for more information. The other is to disable automatic finalization entirely, and arrange to call scm_run_finalizers () at appropriate points. See Foreign Objects, for more on these interfaces.
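The idea behind both options is the same: cleanup work is queued when an object becomes unreachable, and the application runs it only at points it has chosen. The following sketch models that pattern in plain C without Guile. All toy_ names are hypothetical, the fixed-size queue is an assumption made for brevity, and nothing here reflects how Guile implements guardians or scm_run_finalizers ().

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PENDING 16

typedef void (*toy_finalizer) (void *object);

struct toy_pending
{
  toy_finalizer fn;
  void *object;
};

/* Hypothetical queue of cleanup actions waiting for a safe point.  */
static struct toy_pending toy_queue[TOY_MAX_PENDING];
static size_t toy_queue_len = 0;

/* Called by the imaginary collector when OBJECT becomes unreachable.
   Returns 0 on success, or -1 if the queue is full.  */
static int
toy_enqueue_finalizer (toy_finalizer fn, void *object)
{
  assert (fn != NULL);
  if (toy_queue_len == TOY_MAX_PENDING)
    return -1;
  toy_queue[toy_queue_len].fn = fn;
  toy_queue[toy_queue_len].object = object;
  toy_queue_len++;
  return 0;
}

/* Called by the application where running cleanup code is safe.
   Runs queued finalizers, most recently queued first, and returns
   how many were run.  */
static size_t
toy_run_pending_finalizers (void)
{
  size_t ran = 0;
  while (toy_queue_len > 0)
    {
      struct toy_pending p = toy_queue[--toy_queue_len];
      p.fn (p.object);
      ran++;
    }
  return ran;
}

/* Example finalizer for this sketch: sets an int flag to 1.  */
static void
toy_mark_finalized (void *object)
{
  *(int *) object = 1;
}
```

Because nothing runs until toy_run_pending_finalizers is called, the cleanup code never interleaves with the rest of the program, at the cost of resources staying open until the next safe point.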

Finalizers are allowed to allocate memory, access GC-managed memory, and in general can do anything any Guile user code can do. This was not the case in Guile 1.8, where finalizers were much more restricted. In particular, in Guile 2.0, finalizers can resuscitate objects. We do not recommend that users avail themselves of this possibility, however, as a resuscitated object can re-expose other finalizable objects that have been already finalized back to Scheme. These objects will not be finalized again, but they could cause use-after-free problems to code that handles objects of that particular foreign object type. To guard against this possibility, robust finalization routines should clear state from the foreign object, as in the above finalize_file example.
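The value of clearing state can be seen in a Guile-free model of the same pattern. In this sketch, struct toy_file stands in for a foreign object with a single descriptor slot and toy_close stands in for close; both are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a foreign object with one "fd" slot.  */
struct toy_file
{
  int fd;
};

/* Counts releases so that a double release would be detectable.  */
static int toy_close_calls = 0;

/* Stand-in for close(2); the sketch never touches real descriptors.  */
static void
toy_close (int fd)
{
  assert (fd >= 0);
  toy_close_calls++;
}

/* Release the descriptor at most once: the slot is cleared before
   the resource is released, so later calls find nothing to do.  */
static void
toy_finalize_file (struct toy_file *file)
{
  assert (file != NULL);
  int fd = file->fd;
  if (fd >= 0)
    {
      file->fd = -1;
      toy_close (fd);
    }
}
```

Calling toy_finalize_file twice on the same object releases the descriptor once. Code that later receives such an object, for instance through a resuscitated reference, sees the slot set to -1 and can report an error instead of using a stale descriptor.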

One final caveat. Foreign object finalizers are associated with the lifetime of a foreign object, not of its fields. If you access a field of a finalizable foreign object, and do not arrange to keep a reference on the foreign object itself, it could be that the outer foreign object gets finalized while you are working with its field.

For example, consider a procedure that reads some data from a file object of the type defined in our example above.

SCM
read_bytes (SCM file, SCM n)
{
  int fd;
  SCM buf;
  size_t len, pos;

  scm_assert_foreign_object_type (file_type, file);

  fd = scm_foreign_object_signed_ref (file, 0);
  if (fd < 0)
    scm_wrong_type_arg_msg ("read-bytes", SCM_ARG1,
                            file, "open file");

  len = scm_to_size_t (n);
  buf = scm_c_make_bytevector (len);

  pos = 0;
  while (pos < len)
    {
      char *bytes = SCM_BYTEVECTOR_CONTENTS (buf);
      ssize_t count = read (fd, bytes + pos, len - pos);
      if (count < 0)
        scm_syserror ("read-bytes");
      if (count == 0)
        break;
      pos += count;
    }

  scm_remember_upto_here_1 (file);

  return scm_values (scm_list_2 (buf, scm_from_size_t (pos)));
}

After the prelude, only the fd value is used and the C compiler has no reason to keep the file object around. If scm_c_make_bytevector results in a garbage collection, file might not be on the stack or anywhere else and could be finalized, leaving read to read a closed (or, in a multi-threaded program, possibly re-used) file descriptor. The use of scm_remember_upto_here_1 prevents this, by creating a reference to file after all data accesses. See Function related to Garbage Collection.

scm_remember_upto_here_1 is only needed on finalizable objects, because garbage collection of other values is invisible to the program – it happens when needed, and is not observable. But if you can, save yourself the headache and build your program in such a way that it doesn’t need finalization.