The unibyte and multibyte text representations use different character codes. The valid character codes for unibyte representation range from 0 to 255 (#xFF), the values that can fit in one byte. The valid character codes for multibyte representation range from 0 to 4194303 (#x3FFFFF). In this code space, values 0 through 127 (#x7F) are for ASCII characters, and values 128 (#x80) through 4194175 (#x3FFF7F) are for non-ASCII characters.

Values 0 through 1114111 (#x10FFFF) correspond to Unicode characters of the same codepoint; values 1114112 (#x110000) through 4194175 (#x3FFF7F) represent characters that are not unified with Unicode; and values 4194176 (#x3FFF80) through 4194303 (#x3FFFFF) represent eight-bit raw bytes.
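The ranges above can be sketched as a small classifier. This is only an illustration: the name my-char-code-class and the symbols it returns are hypothetical and not part of Emacs; only characterp is a real function.

```elisp
;; Hypothetical helper (not part of Emacs): name the range of the
;; multibyte code space that CODE falls in.
(defun my-char-code-class (code)
  "Return a symbol naming the range containing CODE, or nil if CODE is invalid."
  (cond
   ((not (characterp code)) nil)        ; outside 0..#x3FFFFF, or not an integer
   ((<= code #x7F) 'ascii)              ; 0..127
   ((<= code #x10FFFF) 'unicode)        ; non-ASCII Unicode codepoints
   ((<= code #x3FFF7F) 'non-unicode)    ; characters not unified with Unicode
   (t 'raw-byte)))                      ; #x3FFF80..#x3FFFFF

(my-char-code-class ?A)        ; ⇒ ascii
(my-char-code-class #x3B1)     ; ⇒ unicode
(my-char-code-class #x110000)  ; ⇒ non-unicode
(my-char-code-class #x3FFF80)  ; ⇒ raw-byte
(my-char-code-class #x400000)  ; ⇒ nil
```

The tests run from the lowest range upward, so each branch only needs to check its upper bound.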
Function: characterp charcode
This returns t if charcode is a valid character, and nil otherwise.

    (characterp 65)
    ⇒ t
    (characterp 4194303)
    ⇒ t
    (characterp 4194304)
    ⇒ nil
Function: max-char
This function returns the largest value that a valid character codepoint can have.

    (characterp (max-char))
    ⇒ t
    (characterp (1+ (max-char)))
    ⇒ nil
Function: get-byte &optional pos string
This function returns the byte at character position pos in the current buffer. If the current buffer is unibyte, this is literally the byte at that position. If the buffer is multibyte, byte values of ASCII characters are the same as character codepoints, whereas eight-bit raw bytes are converted to their 8-bit codes. The function signals an error if the character at pos is neither ASCII nor an eight-bit raw byte.
The optional argument string means to get a byte value from that string instead of the current buffer.
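A brief sketch of these cases follows. It assumes the function described here is Emacs's get-byte, and that with a string argument pos is a zero-based index into the string. The symbol not-a-byte is a placeholder chosen for this example.

```elisp
;; ASCII character: the byte value equals the character code.
(get-byte 0 "A")                            ; ⇒ 65

;; "\xff" is read as a unibyte string, so the byte is returned as-is.
(get-byte 0 "\xff")                         ; ⇒ 255

;; In a multibyte string the same byte is stored as an eight-bit raw
;; byte character, and it is converted back to its 8-bit code.
(get-byte 0 (string-to-multibyte "\xff"))   ; ⇒ 255

;; A non-ASCII character that is not a raw byte signals an error.
(condition-case nil
    (get-byte 0 (string #xE9))
  (error 'not-a-byte))                      ; ⇒ not-a-byte

;; Without a string argument, the current buffer is used; buffer
;; positions start at 1.
(with-temp-buffer
  (insert "AB")
  (get-byte 1))                             ; ⇒ 65
```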