Next: , Previous: Input Format, Up: Description

3.2 Output Format for Generated C Code with gperf

Several options control how the generated C code appears on the standard output. Two C functions are generated. They are called hash and in_word_set, although you may modify their names with a command-line option. Both functions require two arguments, a string, char * str, and a length parameter, int len. Their default function prototypes are as follows:

— Function: unsigned int hash (const char * str, unsigned int len)

By default, the generated hash function returns an integer value created by adding len to several user-specified str byte positions indexed into an associated values table stored in a local static array. The associated values table is constructed internally by gperf and later output as a static local C array called ‘hash_table’. The relevant selected positions (i.e. indices into str) are specified via the ‘-k’ option when running gperf, as detailed in the Options section below (see Options).

— Function: in_word_set (const char * str, unsigned int len)

If str is in the keyword set, returns a pointer to that keyword. More exactly, if the option ‘-t’ (or, equivalently, the ‘%struct-type’ declaration) was given, it returns a pointer to the matching keyword's structure. Otherwise it returns NULL.

If the option ‘-c’ (or, equivalently, the ‘%compare-strncmp’ declaration) is not used, str must be a NUL terminated string of exactly length len. If ‘-c’ (or, equivalently, the ‘%compare-strncmp’ declaration) is used, str must simply be an array of len bytes and does not need to be NUL terminated.

The code generated for these two functions is affected by the following options:

Make use of the user-defined struct.
-S total-switch-statements
Generate 1 or more C switch statement rather than use a large, (and potentially sparse) static array. Although the exact time and space savings of this approach vary according to your C compiler's degree of optimization, this method often results in smaller and faster code.

If the ‘-t’ and ‘-S’ options (or, equivalently, the ‘%struct-type’ and ‘%switch’ declarations) are omitted, the default action is to generate a char * array containing the keywords, together with additional empty strings used for padding the array. By experimenting with the various input and output options, and timing the resulting C code, you can determine the best option choices for different keyword set characteristics.