14.6 lengths-list-file in Detail

The core of the lengths-list-file function is a while loop containing a function to move point forward defun by defun, and a function to count the number of words and symbols in each defun. This core must be surrounded by functions that do various other tasks, including finding the file, and ensuring that point starts out at the beginning of the file. The function definition looks like this:

(defun lengths-list-file (filename)
  "Return list of definitions' lengths within FILE.
The returned list is a list of numbers.
Each number is the number of words or
symbols in one function definition."
  (message "Working on `%s' ... " filename)
  (save-excursion
    (let ((buffer (find-file-noselect filename))
          (lengths-list))
      (set-buffer buffer)
      (setq buffer-read-only t)
      (widen)
      (goto-char (point-min))
      (while (re-search-forward "^(defun" nil t)
        (setq lengths-list
              (cons (count-words-in-defun) lengths-list)))
      (kill-buffer buffer)
      lengths-list)))

The function is passed one argument, the name of the file on which it will work. It has four lines of documentation, but no interactive specification. Since people worry that a computer is broken if they don’t see anything going on, the first line of the body is a message.

The next line contains a save-excursion that returns Emacs’s attention to the current buffer when the function completes. This is useful in case you embed this function in another function that presumes point is restored to the original buffer.

In the varlist of the let expression, Emacs finds the file and binds the local variable buffer to the buffer containing the file. At the same time, Emacs creates lengths-list as a local variable.

Next, Emacs switches its attention to the buffer.

In the following line, Emacs makes the buffer read-only. Ideally, this line is not necessary. None of the functions for counting words and symbols in a function definition should change the buffer. Besides, the buffer is not going to be saved, even if it were changed. This line is entirely the consequence of great, perhaps excessive, caution. The reason for the caution is that this function and those it calls work on the sources for Emacs and it is inconvenient if they are inadvertently modified. It goes without saying that I did not realize a need for this line until an experiment went awry and started to modify my Emacs source files …

Next comes a call to widen the buffer if it is narrowed. This function is usually not needed—Emacs creates a fresh buffer if none already exists; but if a buffer visiting the file already exists Emacs returns that one. In this case, the buffer may be narrowed and must be widened. If we wanted to be fully user-friendly, we would arrange to save the restriction and the location of point, but we won’t.

The (goto-char (point-min)) expression moves point to the beginning of the buffer.

Then comes a while loop in which the work of the function is carried out. In the loop, Emacs determines the length of each definition and constructs a lengths’ list containing the information.

Emacs kills the buffer after working through it. This is to save space inside of Emacs. My version of GNU Emacs 19 contained over 300 source files of interest; GNU Emacs 22 contains over a thousand source files. Another function will apply lengths-list-file to each of the files.

Finally, the last expression within the let expression is the lengths-list variable; its value is returned as the value of the whole function.

You can try this function by installing it in the usual fashion. Then place your cursor after the following expression and type C-x C-e (eval-last-sexp).

(lengths-list-file
 "/usr/local/share/emacs/22.1/lisp/emacs-lisp/debug.el")

You may need to change the pathname of the file; the one here is for GNU Emacs version 22.1. To change the expression, copy it to the *scratch* buffer and edit it.

Also, to see the full length of the list, rather than a truncated version, you may have to evaluate the following:

(custom-set-variables '(eval-expression-print-length nil))

(See Specifying Variables using defcustom. Then evaluate the lengths-list-file expression.)

The lengths’ list for debug.el takes less than a second to produce and looks like this in GNU Emacs 22:

(83 113 105 144 289 22 30 97 48 89 25 52 52 88 28 29 77 49 43 290 232 587)

(Using my old machine, the version 19 lengths’ list for debug.el took seven seconds to produce and looked like this:

(75 41 80 62 20 45 44 68 45 12 34 235)

The newer version of debug.el contains more defuns than the earlier one; and my new machine is much faster than the old one.)

Note that the length of the last definition in the file is first in the list.