GNU Translate Tables (GNU Gnulib)

18.6.1.7 GNU Translate Tables

If you set the translate field of a pattern buffer to a translate table, then the GNU Regex functions to which you’ve passed that pattern buffer use the table to apply a simple transformation to every regular expression and string character they examine.

A translate table is an array indexed by the characters in your character set. With 8-bit characters (for example, ASCII text stored one character per byte), a translate table therefore has 256 elements. The array’s elements are also characters in your character set. When the Regex functions see a character c, they use translate[c] in its place, with one exception: the character after a ‘\’ is not translated. (This ensures that operators such as ‘\B’ and ‘\b’ are always distinguishable.)

For example, a table that maps every lowercase letter to the corresponding uppercase one, and every other character to itself, would cause the matcher to ignore differences in case.⁷ Under the ASCII encoding, here’s how you could initialize such a table (we’ll call it case_fold):

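unsigned char case_fold[256];
int i;
for (i = 0; i < 256; i++)      /* identity mapping for every character */
  case_fold[i] = i;
for (i = 'a'; i <= 'z'; i++)   /* then lowercase -> uppercase */
  case_fold[i] = i - ('a' - 'A');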

You tell Regex to use a translate table on a given pattern buffer by assigning that table’s address to the translate field of that buffer. If you don’t want Regex to do any translation, set this field to zero (a null pointer). The table’s contents must stay unchanged from the time you compile the pattern buffer, through compiling its fastmap, until you have finished matching or searching with that buffer; if you change the table at any point in between, the results are unpredictable.
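
To make the sequence concrete, here is a minimal sketch of installing a case-folding table before compiling and searching. It assumes the GNU C Library’s <regex.h>, where defining _GNU_SOURCE exposes re_compile_pattern, re_search, and the translate field, and it relies on the default syntax, which is enough for a plain literal pattern. The helper name search_ignoring_case is hypothetical and is not part of Regex.

```c
#define _GNU_SOURCE   /* assumption: glibc, which exposes the GNU interface this way */
#include <regex.h>
#include <string.h>

/* Hypothetical helper: return the offset of the first match of PATTERN in
   TEXT, ignoring ASCII case; -1 if there is no match, -2 on an error. */
static int
search_ignoring_case (const char *pattern, const char *text)
{
  static unsigned char case_fold[256];
  struct re_pattern_buffer buf;
  const char *err;
  int i, pos;

  /* Build the table: identity first, then lowercase -> uppercase. */
  for (i = 0; i < 256; i++)
    case_fold[i] = (unsigned char) i;
  for (i = 'a'; i <= 'z'; i++)
    case_fold[i] = (unsigned char) (i - ('a' - 'A'));

  /* Zero the buffer: no preallocated space, no fastmap. */
  memset (&buf, 0, sizeof buf);

  /* Install the table before compiling, and leave it unchanged afterwards. */
  buf.translate = case_fold;

  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  if (err != NULL)
    return -2;

  pos = (int) re_search (&buf, text, (regoff_t) strlen (text),
                         0, (regoff_t) strlen (text), NULL);

  /* glibc's regfree also frees the translate pointer, and this table is
     not heap-allocated, so detach it first. */
  buf.translate = NULL;
  regfree (&buf);
  return pos;
}
```

With this sketch, search_ignoring_case ("hello", "Say HELLO") would be expected to return 4, because both the pattern and the string are seen through case_fold. Detaching the table before regfree is a design choice specific to this example: the table is a static array, so it must not be handed to free.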


Footnotes

(7)