16.2 Characters

A character is the elementary unit that strings are made of.

What is a character? “A character is an element of a character set” is a somewhat circular definition, but it highlights the fact that a character is not merely a number. Although many characters are visually represented by a single glyph, some characters have, for example, a different glyph at the end of a word than inside a word. A character is also not the smallest unit of rendered text; that is a grapheme cluster, which in general consists of one or more characters. If you want to know more about the concept of a character and the concepts associated with it, refer to the Unicode standard.

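Various types have been used to represent a character in memory, and some of them were failures. char and wchar_t were invented for this purpose, but are not the right types: a char holds only a single byte, so it cannot represent a character of a multibyte encoding, and wchar_t is only 16 bits wide on some platforms (notably Windows), which is too narrow for all Unicode characters. char32_t, the successor of wchar_t, is the right type. mbchar_t (defined by Gnulib) is an alternative for specific kinds of processing.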
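
The distinction between bytes, characters, and grapheme clusters can be illustrated with a small sketch. It is not Gnulib code: decode_utf8 and count_characters are hypothetical helpers written for this example, and they assume that the input is encoded in UTF-8, whereas real programs must deal with the encoding of the current locale. The decoder is also simplified and does not reject overlong forms or surrogates. The sketch assumes a C11 <uchar.h> that declares char32_t.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>   /* char32_t (C11) */

/* Hypothetical helper: decode one UTF-8 encoded character starting at S,
   with N bytes available.  Store the character in *PC and return the
   number of bytes consumed, or 0 if the input is empty, truncated, or
   malformed.  Simplified: overlong forms and surrogates are accepted.  */
static size_t
decode_utf8 (const char *s, size_t n, char32_t *pc)
{
  assert (s != NULL && pc != NULL);
  if (n == 0)
    return 0;

  unsigned char b0 = (unsigned char) s[0];
  size_t len;
  char32_t c;

  if (b0 < 0x80)
    { len = 1; c = b0; }
  else if ((b0 & 0xE0) == 0xC0)
    { len = 2; c = b0 & 0x1F; }
  else if ((b0 & 0xF0) == 0xE0)
    { len = 3; c = b0 & 0x0F; }
  else if ((b0 & 0xF8) == 0xF0)
    { len = 4; c = b0 & 0x07; }
  else
    return 0;                   /* stray continuation or invalid byte */

  if (len > n)
    return 0;                   /* truncated sequence */

  for (size_t i = 1; i < len; i++)
    {
      unsigned char b = (unsigned char) s[i];
      if ((b & 0xC0) != 0x80)
        return 0;               /* not a continuation byte */
      c = (c << 6) | (b & 0x3F);
    }

  *pc = c;
  return len;
}

/* Hypothetical helper: count the characters (not bytes, not grapheme
   clusters) in a NUL-terminated UTF-8 string.  Return (size_t) -1 if
   the string is malformed.  */
static size_t
count_characters (const char *s)
{
  size_t n = strlen (s);
  size_t count = 0;

  while (n > 0)
    {
      char32_t c;
      size_t k = decode_utf8 (s, n, &c);
      if (k == 0)
        return (size_t) -1;
      s += k;
      n -= k;
      count++;
    }
  return count;
}
```

With these helpers, the string "na\xC3\xAFve" (the word “naïve” in UTF-8) occupies 6 char units but contains 5 characters, each of which fits in one char32_t. The string "e\xCC\x81" (the letter e followed by U+0301 COMBINING ACUTE ACCENT) contains 2 characters, although it is rendered as a single grapheme cluster.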