31.5. Encodings

31.5.1. Introduction
31.5.2. Character Sets
31.5.3. Line Terminators
31.5.4. Function EXT:MAKE-ENCODING
31.5.5. Function EXT:ENCODING-CHARSET
31.5.6. Default encodings
31.5.6.1. Default line terminator
31.5.7. Converting between strings and byte vectors

31.5.1. Introduction

An “encoding” describes the correspondence between CHARACTERs and raw bytes during input/output via STREAMs with STREAM-ELEMENT-TYPE CHARACTER.

An EXT:ENCODING is an object composed of the following facets:

character set
This denotes both the set of CHARACTERs that can be represented and passed through the I/O channel, and the way these characters translate into raw bytes, i.e., the mapping between sequences of CHARACTERs and sequences of (UNSIGNED-BYTE 8), whether these take the form of STRINGs and (VECTOR (UNSIGNED-BYTE 8)) or of character and byte STREAMs. In this sense, for example, CHARSET:UTF-8 and CHARSET:UCS-4 are considered different, although they can represent the same set of characters.
line terminator mode
This denotes the way newline characters are represented.
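
The two facets can be illustrated with a minimal sketch, assuming a CLISP build with Unicode support. It uses EXT:MAKE-ENCODING (Section 31.5.4) and EXT:CONVERT-STRING-TO-BYTES (one of the conversion functions covered in Section 31.5.7); the variable names are hypothetical and chosen only for this illustration.

```lisp
;; Sketch only; assumes a CLISP build with Unicode support.
;; An encoding pairs a character set with a line terminator mode.
(defparameter *utf8-unix*
  (ext:make-encoding :charset charset:utf-8 :line-terminator :unix))
(defparameter *ucs4-unix*
  (ext:make-encoding :charset charset:ucs-4 :line-terminator :unix))

;; The same character translates into different raw bytes
;; depending on the character set of the encoding.
(defparameter *utf8-bytes* (ext:convert-string-to-bytes "A" *utf8-unix*))
(defparameter *ucs4-bytes* (ext:convert-string-to-bytes "A" *ucs4-unix*))

;; Print both byte vectors for comparison.
(format t "UTF-8: ~S~%UCS-4: ~S~%" *utf8-bytes* *ucs4-bytes*)
```

Both encodings can represent the character A, but UTF-8 should produce a single byte while UCS-4 should produce four, which is why the two character sets count as different in this context.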

EXT:ENCODINGs are also TYPEs. As such, they represent the set of characters encodable in the character set. In this context, the way characters are translated into raw bytes is ignored, and the line terminator mode is ignored as well. TYPEP and SUBTYPEP can be used on encodings:

(SUBTYPEP CHARSET:UTF-8 CHARSET:UTF-16)
⇒ T ; T
(SUBTYPEP CHARSET:UTF-16 CHARSET:UTF-8)
⇒ T ; T
(SUBTYPEP CHARSET:ASCII CHARSET:ISO-8859-1)
⇒ T ; T
(SUBTYPEP CHARSET:ISO-8859-1 CHARSET:ASCII)
⇒ NIL ; T
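
TYPEP can be sketched in the same spirit. The following is an illustration rather than an excerpt from the implementation; it assumes a CLISP build with Unicode support, and *E-ACUTE* is a hypothetical variable introduced only for this example.

```lisp
;; Sketch only; assumes a CLISP build with Unicode support.
;; Code 233 is LATIN SMALL LETTER E WITH ACUTE, which lies
;; outside ASCII (codes 0-127) but inside ISO-8859-1 (codes 0-255).
(defparameter *e-acute* (code-char 233))

;; Ask whether each character is encodable in the given character set.
(format t "~S ~S ~S~%"
        (typep #\A charset:ascii)
        (typep *e-acute* charset:ascii)
        (typep *e-acute* charset:iso-8859-1))
```

Since ASCII covers only codes 0 to 127, the second test should be false while the other two should be true, consistent with the SUBTYPEP results for CHARSET:ASCII and CHARSET:ISO-8859-1.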

31.5.2. Character Sets
