Next: Licenses, Previous: The wchar_t mess, Up: GNU libunistring   [Contents][Index]


Appendix B The char32_t problem

In response to the wchar_t mess described in the previous section, ISO C 11 introduces two new types: char32_t and char16_t.

char32_t is a type like wchar_t, with the added guarantee that it is at least 32 bits wide (ISO C 11 defines it as the same type as uint_least32_t). So, it is a type that is appropriate for encoding a Unicode character. It is meant to resolve the problems of the 16-bit wide wchar_t on AIX and Windows platforms, and to allow a saner programming model for wide character strings across all platforms.

char16_t is a type like wchar_t, with the added guarantee that it is at least 16 bits wide (ISO C 11 defines it as the same type as uint_least16_t). It is meant to allow porting programs that use the broken wide character strings programming model from Windows to all platforms. Of course, no one needs this.

These types are accompanied by a syntax for defining wide string literals with these element types: u"..." for char16_t elements and U"..." for char32_t elements.
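The two literal forms can be compared with a minimal sketch. It assumes a C11 compiler that defines __STDC_UTF_16__ and __STDC_UTF_32__ (so that the literals are encoded as UTF-16 and UTF-32) and a C library that ships <uchar.h>. The names wide32, wide16, c32_units and c16_units are hypothetical and exist only for this illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */
#include <uchar.h>    /* char16_t, char32_t */

/* U"..." yields an array of char32_t, u"..." an array of char16_t.
   Both literals hold 'a' followed by U+1F600, a character outside
   the Basic Multilingual Plane. */
static const char32_t wide32[] = U"a\U0001F600";
static const char16_t wide16[] = u"a\U0001F600";

/* The element types are at least as wide as their names suggest. */
static_assert(sizeof(char32_t) * 8 >= 32, "char32_t too narrow");
static_assert(sizeof(char16_t) * 8 >= 16, "char16_t too narrow");

/* Number of elements in each literal, not counting the terminator. */
size_t c32_units(void) { return sizeof wide32 / sizeof wide32[0] - 1; }
size_t c16_units(void) { return sizeof wide16 / sizeof wide16[0] - 1; }
```

Under these assumptions the char32_t string has one element per character, while the char16_t string needs a surrogate pair for U+1F600, so element counts and character counts diverge. This is the same difficulty that a 16-bit wide wchar_t has.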

So far, so good. What the ISO C designers forgot is to provide standardized C library functions that operate on these wide character strings. They standardized only the most basic functions, mbrtoc32 and c32rtomb (declared in <uchar.h>), which are analogous to mbrtowc and wcrtomb, respectively. For the rest, GNU gnulib https://www.gnu.org/software/gnulib/ provides the functions:
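As an aside on the two standardized functions, mbrtoc32 and c32rtomb, the following sketch shows how much code even a simple conversion takes with them alone. It assumes a C11 library with <uchar.h> and multibyte input in the encoding of the current locale. The helpers decode_to_c32 and encode_from_c32 are hypothetical names made up for this illustration. They are not part of ISO C, gnulib or libunistring.

```c
#include <assert.h>
#include <limits.h>   /* MB_LEN_MAX */
#include <stddef.h>   /* size_t */
#include <string.h>   /* memset, memcpy, strlen */
#include <uchar.h>    /* char32_t, mbstate_t, mbrtoc32, c32rtomb */

/* Hypothetical helper: decode the multibyte string s into out, writing
   at most max elements. Returns the number of char32_t written, or
   (size_t)-1 on an invalid or incomplete sequence. */
size_t decode_to_c32(const char *s, char32_t *out, size_t max)
{
    assert(s != NULL && out != NULL);
    mbstate_t state;
    memset(&state, 0, sizeof state);      /* initial conversion state */
    size_t remaining = strlen(s);
    size_t count = 0;
    while (remaining > 0 && count < max) {
        char32_t c;
        size_t n = mbrtoc32(&c, s, remaining, &state);
        if (n == (size_t)-1 || n == (size_t)-2)
            return (size_t)-1;            /* invalid or truncated input */
        if (n == (size_t)-3) {            /* character without new input */
            out[count++] = c;
            continue;
        }
        if (n == 0)
            break;                        /* null character reached */
        out[count++] = c;
        s += n;
        remaining -= n;
    }
    return count;
}

/* Hypothetical helper: encode len char32_t elements into the multibyte
   buffer out of size max, adding a terminating null byte. Returns the
   number of bytes written before the terminator, or (size_t)-1 on an
   unencodable character or insufficient space. */
size_t encode_from_c32(const char32_t *in, size_t len, char *out, size_t max)
{
    assert(in != NULL && out != NULL);
    if (max == 0)
        return (size_t)-1;
    mbstate_t state;
    memset(&state, 0, sizeof state);
    size_t used = 0;
    for (size_t i = 0; i < len; i++) {
        char buf[MB_LEN_MAX];
        size_t n = c32rtomb(buf, in[i], &state);
        if (n == (size_t)-1 || used + n >= max)
            return (size_t)-1;
        memcpy(out + used, buf, n);
        used += n;
    }
    out[used] = '\0';
    return used;
}
```

Non-ASCII input only converts as expected after a setlocale call that selects a suitable locale, such as a UTF-8 one. In the default "C" locale the sketch should be relied on for ASCII input only.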

Still, this API has two problems:

