8.4 Digression into C

The copy-region-as-kill function (see copy-region-as-kill) uses the filter-buffer-substring function, which in turn uses the delete-and-extract-region function. The delete-and-extract-region function removes the contents of a region, and you cannot get them back.

Unlike the other code discussed here, the delete-and-extract-region function is not written in Emacs Lisp; it is written in C and is one of the primitives of the GNU Emacs system. Since it is very simple, I will digress briefly from Lisp and describe it here.

Like many of the other Emacs primitives, delete-and-extract-region is written as an instance of a C macro, a macro being a template for code. The complete macro looks like this:

DEFUN ("delete-and-extract-region", Fdelete_and_extract_region,
       Sdelete_and_extract_region, 2, 2, 0,
       doc: /* Delete the text between START and END and return it.  */)
  (Lisp_Object start, Lisp_Object end)
{
  validate_region (&start, &end);
  if (XFIXNUM (start) == XFIXNUM (end))
    return empty_unibyte_string;
  return del_range_1 (XFIXNUM (start), XFIXNUM (end), 1, 1);
}

Without going into the details of the macro writing process, let me point out that this macro starts with the word DEFUN. The word DEFUN was chosen since the code serves the same purpose as defun does in Lisp. (The DEFUN C macro is defined in emacs/src/lisp.h.)
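To make the idea of a macro as a template more concrete, here is a minimal sketch of a toy macro in the same spirit. It is not the real DEFUN from lisp.h: the names TOY_DEFUN, struct toy_subr, Ftoy_add, and Stoy_add are invented for this illustration, and the toy works on plain C integers rather than Lisp objects. The macro expands into a declaration of the C function, a record describing it, and the start of the function definition; the parameter list and the body are written after the macro call.

```c
/* Hypothetical record kept for each toy primitive.  */
struct toy_subr
{
  const char *lisp_name;        /* name as it would be seen from Lisp */
  int min_args;                 /* minimum number of arguments */
  int max_args;                 /* maximum number of arguments */
  int (*function) (int, int);   /* the C function that does the work */
};

/* Toy template: declare the C function, define a record that
   describes it, then begin the function definition.  The formal
   parameters and the body follow the macro call.  */
#define TOY_DEFUN(lname, fnname, sname, minargs, maxargs)        \
  int fnname (int, int);                                         \
  struct toy_subr sname = { lname, minargs, maxargs, fnname };   \
  int fnname

TOY_DEFUN ("toy-add", Ftoy_add, Stoy_add, 2, 2)
  (int a, int b)
{
  return a + b;
}
```

After expansion, the compiler sees an ordinary C function named Ftoy_add together with a structure named Stoy_add that records the function's name and argument counts.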

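The word DEFUN is followed by seven parts inside of parentheses. As the listing above shows, they are: the name given to the function in Lisp, delete-and-extract-region; the name of the function in C, Fdelete_and_extract_region; the name of the C structure that records information about the function, Sdelete_and_extract_region; the minimum and maximum number of arguments the function takes, both 2 in this case; the interactive specification, which is 0 here because the function is not interactive; and the documentation string, written as a C comment.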

In a C macro, the formal parameters come next, with a statement of what kind of object they are, followed by the body of the macro. For delete-and-extract-region the body consists of the following four lines:

validate_region (&start, &end);
if (XFIXNUM (start) == XFIXNUM (end))
  return empty_unibyte_string;
return del_range_1 (XFIXNUM (start), XFIXNUM (end), 1, 1);

The validate_region function checks whether the values passed as the beginning and end of the region are the proper type and are within range. If the beginning and end positions are the same, the function returns an empty string.

The del_range_1 function actually deletes the text. It is a complex function that updates the buffer and does other things, and we will not look into it. However, it is worth looking at the first two arguments passed to del_range_1. These are XFIXNUM (start) and XFIXNUM (end).

As far as the C language is concerned, start and end are two opaque values that mark the beginning and end of the region to be deleted. More precisely, and requiring more expert knowledge to understand, the two values are of type Lisp_Object, which might be a C pointer, a C integer, or a C struct; C code ordinarily should not care how Lisp_Object is implemented.

Lisp_Object widths depend on the machine, and are typically 32 or 64 bits. A few of the bits are used to specify the type of information; the remaining bits are used as content.

XFIXNUM is a C macro that extracts the relevant integer from the longer collection of bits; the type bits are discarded.
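The following sketch shows one way a few bits of a machine word could carry a type while the rest carry the content. It is an illustration only and makes no claim about the actual layout Emacs uses: the type toy_object, the three-bit tag width, the tag values, and the functions toy_make_int, toy_type_of, and toy_xint are all assumptions made up for this example. For simplicity it handles only small non-negative integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only: the low 3 bits hold a
   type tag and the remaining bits hold the content.  */
typedef intptr_t toy_object;

enum { TOY_TAG_BITS = 3, TOY_TAG_MASK = (1 << TOY_TAG_BITS) - 1 };
enum toy_type { TOY_INT = 2, TOY_STRING = 4 };   /* made-up tag values */

/* Pack a small non-negative integer and its type tag into one word.  */
toy_object
toy_make_int (intptr_t n)
{
  return (toy_object) (((uintptr_t) n << TOY_TAG_BITS) | TOY_INT);
}

/* Read only the type bits.  */
int
toy_type_of (toy_object obj)
{
  return (int) (obj & TOY_TAG_MASK);
}

/* Discard the type bits and return the integer content, in the
   spirit of XFIXNUM.  */
intptr_t
toy_xint (toy_object obj)
{
  assert (toy_type_of (obj) == TOY_INT);
  return obj >> TOY_TAG_BITS;
}
```

Here toy_xint plays the role that XFIXNUM plays in the listing: C code that wants to do arithmetic on a position must first strip the type bits to get an ordinary integer.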

The command in delete-and-extract-region looks like this:

del_range_1 (XFIXNUM (start), XFIXNUM (end), 1, 1);

It deletes the region between the beginning position, start, and the ending position, end.
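To show what "delete the text and return it" means in miniature, here is a toy stand-in that works on an ordinary C string. It is a sketch under stated assumptions, not a description of del_range_1, which operates on Emacs buffers and is far more involved. The function name toy_delete_and_extract is invented, offsets are 0-based rather than Emacs buffer positions, and an invalid range is reported by returning NULL instead of signaling an error.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in: remove the characters in [start, end) from the
   NUL-terminated string BUF and return them in a newly allocated
   string.  Returns NULL if the range is invalid or memory runs out.
   The caller frees the result.  */
char *
toy_delete_and_extract (char *buf, size_t start, size_t end)
{
  size_t len = strlen (buf);

  /* Range check, loosely in the role of validate_region.  */
  if (start > end || end > len)
    return NULL;

  /* When start == end, n is 0 and the result is an empty string.  */
  size_t n = end - start;
  char *removed = malloc (n + 1);
  if (removed == NULL)
    return NULL;

  memcpy (removed, buf + start, n);
  removed[n] = '\0';

  /* Close the hole, moving the tail along with its terminating NUL.  */
  memmove (buf + start, buf + end, len - end + 1);
  return removed;
}
```

For example, calling toy_delete_and_extract on a writable array holding "hello world" with offsets 5 and 11 leaves "hello" in the array and returns " world".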

From the point of view of the person writing Lisp, Emacs is all very simple; but hidden underneath is a great deal of complexity to make it all work.