Next: , Previous: , Up: Defining New Types (Smobs)   [Contents][Index]


5.5.4 Garbage Collecting Smobs

Once a smob has been released to the tender mercies of the Scheme system, it must be prepared to survive garbage collection. In the example above, all the memory associated with the smob is managed by the garbage collector because we used the scm_gc_ allocation functions. Thus, no special care is needed: the garbage collector automatically scans that memory and reclaims whatever is no longer used.

However, when data associated with a smob is managed in some other way—e.g., malloc’d memory or file descriptors—it is possible to specify a free function to release those resources when the smob is reclaimed, and a mark function to mark Scheme objects otherwise invisible to the garbage collector.

As described in more detail elsewhere (see Conservative GC), every object in the Scheme system has a mark bit, which the garbage collector uses to tell live objects from dead ones. When collection starts, every object’s mark bit is clear. The collector traces pointers through the heap, starting from objects known to be live, and sets the mark bit on each object it encounters. When it can find no more reachable unmarked objects, the collector walks all objects, live and dead, frees those whose mark bits are still clear, and clears the mark bit on the others.

The two main portions of the collection are called the mark phase, during which the collector marks live objects, and the sweep phase, during which the collector frees all unmarked objects.
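The two phases can be illustrated with a small, self-contained model in plain C. This is only a sketch: the cell layout and the names toy_mark, toy_sweep, toy_collect and toy_demo are invented for illustration and do not correspond to Guile's actual collector or to any libguile function.

```c
#include <stdbool.h>
#include <stddef.h>

#define HEAP_SIZE 8
#define MAX_REFS  2

/* A toy heap cell; this is not Guile's real object layout.  */
struct cell
{
  bool in_use;                  /* allocated and not yet freed */
  bool marked;                  /* the mark bit */
  struct cell *refs[MAX_REFS];  /* outgoing pointers, or NULL */
};

static struct cell heap[HEAP_SIZE];

/* Mark phase: set the mark bit, then trace outgoing pointers.
   Setting the bit first means a cycle is visited only once.  */
void
toy_mark (struct cell *c)
{
  if (c == NULL || c->marked)
    return;
  c->marked = true;
  for (size_t i = 0; i < MAX_REFS; i++)
    toy_mark (c->refs[i]);
}

/* Sweep phase: free unmarked cells and clear the mark bit on
   the survivors.  Returns the number of cells freed.  */
size_t
toy_sweep (void)
{
  size_t freed = 0;

  for (size_t i = 0; i < HEAP_SIZE; i++)
    {
      struct cell *c = &heap[i];
      if (!c->in_use)
        continue;
      if (c->marked)
        c->marked = false;
      else
        {
          *c = (struct cell) { 0 };   /* "free" the cell */
          freed++;
        }
    }
  return freed;
}

/* One full collection starting from a single root.  */
size_t
toy_collect (struct cell *root)
{
  toy_mark (root);
  return toy_sweep ();
}

/* Hypothetical demo: cells 0 -> 1 -> 2 -> 0 form a live cycle,
   cells 3 -> 4 are unreachable.  Returns the number freed.  */
size_t
toy_demo (void)
{
  for (size_t i = 0; i < HEAP_SIZE; i++)
    heap[i] = (struct cell) { 0 };
  for (size_t i = 0; i < 5; i++)
    heap[i].in_use = true;
  heap[0].refs[0] = &heap[1];
  heap[1].refs[0] = &heap[2];
  heap[2].refs[0] = &heap[0];
  heap[3].refs[0] = &heap[4];
  return toy_collect (&heap[0]);
}
```

In toy_demo, the three cells of the cycle are reached from the root and survive with their mark bits cleared again, while the two unreachable cells are freed by the sweep.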

The mark bit of a smob lives in a special memory region. When the collector encounters a smob, it sets the smob’s mark bit, and uses the smob’s type tag to find the appropriate mark function for that smob. It then calls this mark function, passing it the smob as its only argument.

The mark function is responsible for marking any other Scheme objects the smob refers to. If it does not do so, the objects’ mark bits will still be clear when the collector begins to sweep, and the collector will free them. If this occurs, it will probably break, or at least confuse, any code operating on the smob; the smob’s SCM values will have become dangling references.

To mark an arbitrary Scheme object, the mark function calls scm_gc_mark.

Thus, here is how we might write mark_image—again this is not needed in our example since we used the scm_gc_ allocation routines, so this is just for the sake of illustration:

SCM
mark_image (SCM image_smob)
{
  /* Mark the image's name and update function.  */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  scm_gc_mark (image->update_func);

  return SCM_BOOL_F;
}

Note that, even though the image’s update_func could be an arbitrarily complex structure (representing a procedure and any values enclosed in its environment), scm_gc_mark will recurse as necessary to mark all its components. Because scm_gc_mark sets an object’s mark bit before it recurses, it is not confused by circular structures.

As an optimization, the collector will mark whatever value is returned by the mark function; this helps limit the depth of recursion during the mark phase. Thus, the code above should really be written as:

SCM
mark_image (SCM image_smob)
{
  /* Mark the image's name and update function.  */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  return image->update_func;
}
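To see why returning the last reference helps, consider how a collector might consume that return value. The following is a toy model in plain C, not Guile's implementation: struct obj, the mark_funcs table and every function name are hypothetical. It uses a type tag to find the mark function, as described earlier, and loops on the returned object instead of making a nested call.

```c
#include <stddef.h>

#define CHAIN_LEN 100000

/* Toy object: a type tag, a mark bit and two references.  */
struct obj
{
  int tag;             /* index into mark_funcs */
  int marked;          /* the mark bit */
  struct obj *a, *b;   /* outgoing references, or NULL */
};

typedef struct obj *(*mark_func) (struct obj *);

void toy_gc_mark (struct obj *o);

/* Mark function for tag 0: a leaf with nothing else to mark.  */
struct obj *
mark_leaf (struct obj *o)
{
  (void) o;
  return NULL;
}

/* Mark function for tag 1: mark A with a nested call and return
   B, leaving it to the caller's loop.  */
struct obj *
mark_pair (struct obj *o)
{
  toy_gc_mark (o->a);
  return o->b;
}

static mark_func mark_funcs[] = { mark_leaf, mark_pair };

/* Set the mark bit, look up the mark function by tag, call it,
   then continue with whatever it returned.  */
void
toy_gc_mark (struct obj *o)
{
  while (o != NULL && !o->marked)
    {
      o->marked = 1;
      o = mark_funcs[o->tag] (o);
    }
}

static struct obj chain[CHAIN_LEN];
static struct obj shared_leaf;

/* Hypothetical demo: build a long chain of pairs linked through
   B, all sharing one leaf through A, and mark it from the head.
   Returns the number of objects whose mark bit ended up set.  */
size_t
toy_mark_chain (void)
{
  size_t marked = 0;

  shared_leaf = (struct obj) { .tag = 0 };
  for (size_t i = 0; i < CHAIN_LEN; i++)
    {
      chain[i] = (struct obj) { .tag = 1, .a = &shared_leaf };
      chain[i].b = (i + 1 < CHAIN_LEN) ? &chain[i + 1] : NULL;
    }

  toy_gc_mark (&chain[0]);

  for (size_t i = 0; i < CHAIN_LEN; i++)
    marked += (size_t) chain[i].marked;
  return marked + (size_t) shared_leaf.marked;
}
```

Because each pair hands its b reference back rather than marking it recursively, the C stack stays a few frames deep regardless of the chain's length. Had mark_pair called toy_gc_mark on both fields, the nesting depth would grow with the chain.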

Finally, when the collector encounters an unmarked smob during the sweep phase, it uses the smob’s tag to find the appropriate free function for the smob. It then calls that function, passing it the smob as its only argument.

The free function must release any resources used by the smob. However, it must not free objects managed by the collector; the collector will take care of them. For historical reasons, the return type of the free function should be size_t, an unsigned integral type; the free function should always return zero.

Here is how we might write the free_image function for the image smob type—again for the sake of illustration, since our example does not need it thanks to the use of the scm_gc_ allocation routines:

size_t
free_image (SCM image_smob)
{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_free (image->pixels,
               image->width * image->height,
               "image pixels");
  scm_gc_free (image, sizeof (struct image), "image");

  return 0;
}
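When a smob owns resources that the collector does not manage, such as the malloc'd memory or file descriptors mentioned earlier, the free function is where they are released. The sketch below is plain C rather than libguile code, and struct log_data, make_log, free_log and toy_release_demo are hypothetical names. In a real smob the free function would take the SCM smob and fetch this payload with SCM_SMOB_DATA, as free_image does above.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical smob payload that owns resources the collector
   knows nothing about.  */
struct log_data
{
  char *buffer;   /* obtained from malloc; may be NULL */
  FILE *stream;   /* obtained from tmpfile; may be NULL */
};

/* Counts calls to free_log so that the demo can observe them.  */
static int logs_released;

/* Allocate a payload outside the collector's control.  */
struct log_data *
make_log (size_t size)
{
  struct log_data *log = malloc (sizeof *log);

  if (log == NULL)
    return NULL;
  log->buffer = malloc (size);
  log->stream = tmpfile ();
  return log;
}

/* Shaped like a smob free function: release what the collector
   cannot see, touch nothing else, and return zero.  */
size_t
free_log (struct log_data *log)
{
  if (log->stream != NULL)
    fclose (log->stream);
  free (log->buffer);   /* free (NULL) is a no-op */
  free (log);
  logs_released++;
  return 0;
}

/* Hypothetical demo: create one payload and release it the way
   a sweep would.  Returns the number of payloads released, or
   -1 if allocation failed or free_log returned nonzero.  */
int
toy_release_demo (void)
{
  struct log_data *dead = make_log (64);

  logs_released = 0;
  if (dead == NULL)
    return -1;
  return free_log (dead) == 0 ? logs_released : -1;
}
```

The free function does only the minimum: it closes the stream, releases the two malloc'd blocks, and returns.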

During the sweep phase, the garbage collector will clear the mark bits on all live objects. The code which implements a smob need not do this itself.

Note that the free function can be called in any context. In particular, if your Guile is built with support for threads, the free function, which acts as a finalizer, may be called from any thread that is running Guile. In Guile 2.0, finalizers are invoked via “asyncs”, which interleave them with running Scheme code; see System asyncs. In Guile 2.2 there will be a dedicated finalization thread, to ensure that finalization doesn’t run within the critical section of any other thread known to Guile.

In either case, finalizers (free functions) run concurrently with the main program, and so they need to be async-safe and thread-safe. If for some reason this is impossible, perhaps because you are embedding Guile in some application that is not itself thread-safe, you have two options. One is to use guardians instead of free functions, and arrange to pump the guardians for finalizable objects. See Guardians, for more information. The other is to disable automatic finalization entirely, and arrange to call scm_run_finalizers () at appropriate points. See Smobs, for more on these interfaces.
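Running finalizers only at chosen points amounts to separating the moment an object is found dead from the moment its finalizer runs. The toy model below illustrates that idea only. It is not how Guile implements scm_run_finalizers, and toy_defer_finalizer, toy_run_finalizers and count_release are hypothetical names.

```c
#include <stddef.h>

#define MAX_PENDING 16

typedef void (*finalizer) (void *data);

/* A finalizer waiting to run, together with its argument.  */
struct pending
{
  finalizer fn;
  void *data;
};

static struct pending queue[MAX_PENDING];
static size_t n_pending;

/* Called by the toy collector when it finds a dead object:
   record the finalizer instead of running it right away.
   Returns 0 on success, -1 if the queue is full.  */
int
toy_defer_finalizer (finalizer fn, void *data)
{
  if (n_pending == MAX_PENDING)
    return -1;
  queue[n_pending].fn = fn;
  queue[n_pending].data = data;
  n_pending++;
  return 0;
}

/* Called by the application at a point where running arbitrary
   cleanup code is safe.  Returns the number of finalizers run.  */
size_t
toy_run_finalizers (void)
{
  size_t ran = 0;

  while (n_pending > 0)
    {
      struct pending p = queue[--n_pending];
      p.fn (p.data);
      ran++;
    }
  return ran;
}

/* Hypothetical finalizer: bump the counter that DATA points to.  */
void
count_release (void *data)
{
  int *counter = data;
  (*counter)++;
}
```

With this arrangement nothing is released until the application calls toy_run_finalizers, so cleanup code never interleaves with code that is not prepared for it. The cost is that resources stay allocated until the next such call.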

There is no way for smob code to be notified when collection is complete.

It is usually a good idea to minimize the amount of processing done during garbage collection; keep the mark and free functions very simple. Since collections occur at unpredictable times, it is easy for any unusual activity to interfere with normal code.

