Appendix B Performance Issues

C and its derivative languages are highly complex creatures. Often, ambiguous code situations arise that require CC Mode to scan large portions of the buffer to determine syntactic context. Such pathological code can cause CC Mode to perform fairly badly. This section gives some insight into how CC Mode operates, how that interacts with some coding styles, and what you can do to improve performance.

The overall goal is that CC Mode shouldn’t be overly slow (i.e., take more than a fraction of a second) in any interactive operation. In other words, it’s tuned to limit the maximum response time of single operations, sometimes at the expense of batch-like operations such as reindenting whole blocks. If you find that CC Mode gets gradually slower in certain situations, perhaps as the file grows in size or as the macro or comment you’re editing gets bigger, then chances are that something isn’t working right. You should consider reporting it, unless it’s something that’s mentioned in this section.

Because CC Mode has to scan the buffer backwards from the current insertion point, and because C’s syntax is fairly difficult to parse in the backwards direction, CC Mode often tries to find the nearest position higher up in the buffer from which to begin a forward scan (it’s typically an opening or closing parenthesis of some kind). The farther this position is from the current insertion point, the slower it gets.

In earlier versions of CC Mode, we used to recommend putting the opening brace of a top-level construct (53) into the leftmost column. Earlier still, this used to be a rigid Emacs constraint, as embodied in the beginning-of-defun function. CC Mode now caches syntactic information much better, so that the delay caused by searching for such a brace when it’s not in column 0 is minimal, except perhaps when you’ve just moved a long way inside the file.

A special note about defun-prompt-regexp in Java mode: The common style is to hang the opening braces of functions and classes on the right side of the line, and that doesn’t work well with the Emacs approach. CC Mode comes with a constant c-Java-defun-prompt-regexp which tries to define a regular expression usable for this style, but there are problems with it. In some cases it can cause beginning-of-defun to hang (54). For this reason, it is not used by default, but if you feel adventurous, you can set defun-prompt-regexp to it in your mode hook. In any event, setting and relying on defun-prompt-regexp will definitely slow things down because (X)Emacs will be doing regular expression searches a lot, so you’ll probably be taking a hit either way!
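The mode hook setup mentioned above might look like the following minimal sketch. It assumes CC Mode is loaded by the time the hook runs, so that c-Java-defun-prompt-regexp is defined; the function name my-java-defun-prompt-setup is hypothetical and chosen only for illustration.

```elisp
;; Sketch only: `my-java-defun-prompt-setup' is a hypothetical name.
;; The regexp can make `beginning-of-defun' hang in some cases, which
;; is why CC Mode does not install it by default.
(defun my-java-defun-prompt-setup ()
  "Use CC Mode's Java regexp for `defun-prompt-regexp' in this buffer."
  (set (make-local-variable 'defun-prompt-regexp)
       c-Java-defun-prompt-regexp))

(add-hook 'java-mode-hook 'my-java-defun-prompt-setup)
```

Setting the variable buffer-locally keeps the regular expression from affecting defun motion in buffers that use other modes.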

CC Mode maintains a cache of the opening parentheses of the blocks surrounding the point, and it adapts that cache as the point is moved around. That means that in bad cases it can take noticeable time to indent a line in new surroundings, but after that it gets fast as long as the point isn’t moved far off. The farther the point is moved, the less useful the cache is. Since editing typically is done in “chunks” rather than on single lines far apart from each other, the cache typically gives good performance even when the code doesn’t fit the Emacs approach to finding the defun starts.

XEmacs users can set the variable c-enable-xemacs-performance-kludge-p to non-nil. This tells CC Mode to use XEmacs-specific built-in functions which, in some circumstances, can locate the top-most opening brace much more quickly than beginning-of-defun. Preliminary testing has shown that for styles where these braces are hung (e.g., most JDK-derived Java styles), this hack can improve performance of the core syntax parsing routines by a factor of 3 to 60. However, for styles which do conform to Emacs’s recommended style of putting top-level braces in column zero, this hack can degrade performance by about as much. Thus this variable is set to nil by default, since the Emacs-friendly styles should be more common (and encouraged!). Note that this variable has no effect in Emacs since the necessary built-in functions don’t exist (in Emacs 22.1 as of this writing in February 2007).
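For XEmacs users whose code hangs its top-level braces, enabling the kludge might be as simple as the following line in an init file. This is a sketch of the setting described above, not a recommendation for every style.

```elisp
;; XEmacs only; this variable has no effect in Emacs.
;; Worth trying when top-level braces are hung (e.g., JDK-derived Java
;; styles); it can slow things down when braces are in column zero.
(setq c-enable-xemacs-performance-kludge-p t)
```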

Text properties are used to speed up skipping over syntactic whitespace, i.e., comments and preprocessor directives. Indenting a line after a huge macro definition can be slow the first time, but after that the text properties are in place and it should be fast (even after you’ve edited other parts of the file and then moved back).

Font locking can be a CPU hog, especially the font locking done on decoration level 3, which tries to be very accurate. Note that this level is designed to be used with a font lock support mode that only fontifies the text that’s actually shown, i.e., Lazy Lock or Just-in-time Lock mode, so make sure you use one of them. Fontification of a whole buffer with a few thousand lines can often take over a minute. That is a known weakness; the idea is that it never should happen.
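As a minimal sketch, Just-in-time Lock can be selected as the font lock support mode from an init file as shown here. This assumes an Emacs version that provides jit-lock-mode and the font-lock-support-mode variable; in many versions this is already the default, in which case the setting is redundant.

```elisp
;; Fontify only the text that is actually displayed.
;; Assumes `jit-lock-mode' is available in this Emacs.
(setq font-lock-support-mode 'jit-lock-mode)
```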

The most effective way to speed up font locking is to reduce the decoration level to 2 by setting font-lock-maximum-decoration appropriately. That level is designed to be as pretty as possible without sacrificing performance. See Font Locking Preliminaries for more information.
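One way to set the decoration level is sketched here, using the alist form of font-lock-maximum-decoration to lower the level only for a few CC Mode languages. The choice of modes is an assumption made for illustration; adjust it to the modes you actually use.

```elisp
;; Use decoration level 2 in C, C++ and Java buffers and keep the
;; maximum level (t) everywhere else.  This should be evaluated before
;; font lock is turned on in the affected buffers.
(setq font-lock-maximum-decoration
      '((c-mode . 2)
        (c++-mode . 2)
        (java-mode . 2)
        (t . t)))
```

Setting the variable to the plain number 2 instead applies that level to every mode.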


Footnotes

(53)

E.g., a function in C, or outermost class definition in C++ or Java.

(54)

This has been observed in Emacs 19.34 and XEmacs 19.15.