Next: Weight-Balanced Trees, Previous: Object Hashing, Up: Associations

Balanced binary trees are a useful data structure for maintaining large sets of associations whose keys are ordered. While most applications involving large association sets should use hash tables, some applications can benefit from the use of binary trees. Binary trees have two advantages over hash tables:

- The contents of a binary tree can be converted to an alist, sorted by key, in time proportional to the number of associations in the tree. A hash table can be converted into an unsorted alist in linear time; sorting it requires additional time.
- Two binary trees can be compared for equality in linear time. Hash tables, on the other hand, cannot be compared at all; they must be converted to alists before comparison can be done, and alist comparison is quadratic unless the alists are sorted.

MIT/GNU Scheme provides an implementation of red-black trees. The red-black tree-balancing algorithm provides generally good performance because it doesn't try to keep the tree very closely balanced. At any given node in the tree, one side of the node can be twice as high as the other in the worst case. With typical data the tree will remain fairly well balanced anyway.

A red-black tree takes space that is proportional to the number of associations in the tree. For the current implementation, the constant of proportionality is eight words per association.

Red-black trees hold their keys strongly. In other words, if a red-black tree contains an association for a given key, that key cannot be reclaimed by the garbage collector.

The red-black tree implementation is a run-time-loadable option. To use red-black trees, execute

(load-option 'rb-tree)

once before calling any of the procedures defined here.

— procedure: **make-rb-tree**` key=? key<?`

This procedure creates and returns a newly allocated red-black tree. The tree contains no associations.

Key=?andkey<?are predicates that compare two keys and determine whether they are equal to or less than one another, respectively. For any two keys, at most one of these predicates is true.

— procedure: **rb-tree/insert!**` rb-tree key datum`

Associates

datumwithkeyinrb-treeand returns an unspecified value. Ifrb-treealready has an association forkey, that association is replaced. The average and worst-case times required by this operation are proportional to the logarithm of the number of assocations inrb-tree.

— procedure: **rb-tree/lookup**` rb-tree key default`

Returns the datum associated with

keyinrb-tree. Ifrb-treedoesn't contain an association forkey,defaultis returned. The average and worst-case times required by this operation are proportional to the logarithm of the number of assocations inrb-tree.

— procedure: **rb-tree/delete!**` rb-tree key`

If

rb-treecontains an association forkey, removes it. Returns an unspecified value. The average and worst-case times required by this operation are proportional to the logarithm of the number of assocations inrb-tree.

— procedure: **rb-tree->alist**` rb-tree`

Returns the contents of

rb-treeas a newly allocated alist. Each element of the alist is a pair`(`

key`.`

datum`)`

wherekeyis one of the keys ofrb-tree, anddatumis its associated datum. The alist is sorted by key according to thekey<?argument used to constructrb-tree. The time required by this operation is proportional to the number of associations in the tree.

— procedure: **rb-tree/key-list**` rb-tree`

Returns a newly allocated list of the keys in

rb-tree. The list is sorted by key according to thekey<?argument used to constructrb-tree. The time required by this operation is proportional to the number of associations in the tree.

— procedure: **rb-tree/datum-list**` rb-tree`

Returns a newly allocated list of the datums in

rb-tree. Each element of the list corresponds to one of the associations inrb-tree, so if the tree contains multiple associations with the same datum, so will this list. The list is sorted by the keys of the associations, even though they do not appear in the result. The time required by this operation is proportional to the number of associations in the tree.This procedure is equivalent to:

(lambda (rb-tree) (map cdr (rb-tree->alist rb-tree)))

— procedure: **rb-tree/equal?**` rb-tree-1 rb-tree-2 datum=?`

Compares

rb-tree-1andrb-tree-2for equality, returning`#t`

iff they are equal and`#f`

otherwise. The trees must have been constructed with the same equality and order predicates (same in the sense of`eq?`

). The keys of the trees are compared using thekey=?predicate used to build the trees, while the datums of the trees are compared using the equivalence predicatedatum=?. The worst-case time required by this operation is proportional to the number of associations in the tree.

— procedure: **rb-tree/empty?**` rb-tree`

Returns

`#t`

iffrb-treecontains no associations. Otherwise returns`#f`

.

— procedure: **rb-tree/size**` rb-tree`

Returns the number of associations in

rb-tree, an exact non-negative integer. The average and worst-case times required by this operation are proportional to the number of associations in the tree.

— procedure: **rb-tree/height**` rb-tree`

Returns the height of

rb-tree, an exact non-negative integer. This is the length of the longest path from a leaf of the tree to the root. The average and worst-case times required by this operation are proportional to the number of associations in the tree.The returned value satisfies the following:

(lambda (rb-tree) (let ((size (rb-tree/size rb-tree)) (lg (lambda (x) (/ (log x) (log 2))))) (<= (lg size) (rb-tree/height rb-tree) (* 2 (lg (+ size 1))))))

— procedure: **rb-tree/copy**` rb-tree`

Returns a newly allocated copy of

rb-tree. The copy is identical torb-treein all respects, except that changes torb-treedo not affect the copy, and vice versa. The time required by this operation is proportional to the number of associations in the tree.

— procedure: **alist->rb-tree**` alist key=? key<?`

Returns a newly allocated red-black tree that contains the same associations as

alist. This procedure is equivalent to:(lambda (alist key=? key<?) (let ((tree (make-rb-tree key=? key<?))) (for-each (lambda (association) (rb-tree/insert! tree (car association) (cdr association))) alist) tree))

The following operations provide access to the smallest and largest members in a red/black tree. They are useful for implementing priority queues.

— procedure: **rb-tree/min**` rb-tree default`

Returns the smallest key in

rb-tree, ordefaultif the tree is empty.

— procedure: **rb-tree/min-datum**` rb-tree default`

Returns the datum associated with the smallest key in

rb-tree, ordefaultif the tree is empty.

— procedure: **rb-tree/min-pair**` rb-tree`

Finds the smallest key in

rb-treeand returns a pair containing that key and its associated datum. If the tree is empty, returns`#f`

.

— procedure: **rb-tree/max**` rb-tree default`

Returns the largest key in

rb-tree, ordefaultif the tree is empty.

— procedure: **rb-tree/max-datum**` rb-tree default`

Returns the datum associated with the largest key in

rb-tree, ordefaultif the tree is empty.

— procedure: **rb-tree/max-pair**` rb-tree`

Finds the largest key in

rb-treeand returns a pair containing that key and its associated datum. If the tree is empty, returns`#f`

.

— procedure: **rb-tree/delete-min!**` rb-tree default`

— procedure:**rb-tree/delete-min-datum!**` rb-tree default`

— procedure:**rb-tree/delete-min-pair!**` rb-tree`

— procedure:**rb-tree/delete-max!**` rb-tree default`

— procedure:**rb-tree/delete-max-datum!**` rb-tree default`

— procedure:**rb-tree/delete-max-pair!**` rb-tree`

— procedure:

— procedure:

— procedure:

— procedure:

— procedure:

These operations are exactly like the accessors above, in that they return information associated with the smallest or largest key, except that they simultaneously delete that key.