6.6.6.6 Interpreting Bytevector Contents as Unicode Strings

Bytevector contents can also be interpreted as Unicode strings encoded in one of the most common encoding forms: UTF-8, UTF-16, or UTF-32.

     (utf8->string (u8-list->bytevector '(99 97 102 101)))
     ⇒ "cafe"
     
     (string->utf8 "café") ;; LATIN SMALL LETTER E WITH ACUTE
     ⇒ #vu8(99 97 102 195 169)
— Scheme Procedure: string->utf8 str
— Scheme Procedure: string->utf16 str [endianness]
— Scheme Procedure: string->utf32 str [endianness]
— C Function: scm_string_to_utf8 (str)
— C Function: scm_string_to_utf16 (str, endianness)
— C Function: scm_string_to_utf32 (str, endianness)

Return a newly allocated bytevector that contains the UTF-8, UTF-16, or UTF-32 (a.k.a. UCS-4) encoding of str. For UTF-16 and UTF-32, endianness should be the symbol big or little; when omitted, it defaults to big.
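
As an illustration of the endianness argument, the following sketch encodes a single character in both byte orders. It assumes the (rnrs bytevectors) module is loaded; the variable names are only illustrative and not part of the API.

```scheme
(use-modules (rnrs bytevectors))

;; U+00E9 (LATIN SMALL LETTER E WITH ACUTE), built from its code point
;; so the example does not depend on the source file's encoding.
(define e-acute (string (integer->char #xE9)))

;; No endianness argument: big endian is used.
(define utf16-be (string->utf16 e-acute))
;; Explicit little endian: the two bytes of the code unit are swapped.
(define utf16-le (string->utf16 e-acute 'little))
;; UTF-32 always uses four bytes per character.
(define utf32-le (string->utf32 e-acute 'little))

utf16-be ; ⇒ #vu8(0 233)
utf16-le ; ⇒ #vu8(233 0)
utf32-le ; ⇒ #vu8(233 0 0 0)
```

The same character takes two bytes in UTF-8 (195 169), two in UTF-16, and four in UTF-32, so the choice of encoding affects the length of the resulting bytevector as well as its byte order.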

— Scheme Procedure: utf8->string utf
— Scheme Procedure: utf16->string utf [endianness]
— Scheme Procedure: utf32->string utf [endianness]
— C Function: scm_utf8_to_string (utf)
— C Function: scm_utf16_to_string (utf, endianness)
— C Function: scm_utf32_to_string (utf, endianness)

Return a newly allocated string containing the UTF-8-, UTF-16-, or UTF-32-decoded contents of bytevector utf. For UTF-16 and UTF-32, endianness should be the symbol big or little; when omitted, it defaults to big.
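
The endianness given when decoding has to match the one used when encoding. The sketch below, which again assumes (rnrs bytevectors) is loaded and uses purely illustrative names, decodes the same four bytes both ways and then checks a UTF-8 round trip.

```scheme
(use-modules (rnrs bytevectors))

;; The string "hi" encoded as little-endian UTF-16.
(define bytes-le (u8-list->bytevector '(104 0 105 0)))

;; Decoding with the matching byte order recovers the text.
(define decoded-le (utf16->string bytes-le 'little))

;; Decoding with the default (big endian) reads each byte pair the
;; other way around, yielding the code points #x6800 and #x6900.
(define decoded-be-points
  (map char->integer (string->list (utf16->string bytes-le))))

;; Encoding and then decoding with the same format returns an equal string.
(define round-trip (utf8->string (string->utf8 "cafe")))

decoded-le        ; ⇒ "hi"
decoded-be-points ; ⇒ (26624 26880)
round-trip        ; ⇒ "cafe"
```

A byte-order mismatch does not necessarily signal an error, as the second result shows, so the byte order of externally supplied data needs to be known in advance.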