
The following functions inspect and return details about the first character in a Unicode string.

— Function: int **u8_mblen** (`const uint8_t *s, size_t n`)

— Function: int **u16_mblen** (`const uint16_t *s, size_t n`)

— Function: int **u32_mblen** (`const uint32_t *s, size_t n`)

Returns the length (number of units) of the first character in `s`, where `s` is no longer than `n` units. Returns 0 if it is the NUL character. Returns -1 upon failure.

This function is similar to `mblen`, except that it operates on a Unicode string and that `s` must not be NULL.

— Function: int **u8_mbtouc_unsafe** (`ucs4_t *puc, const uint8_t *s, size_t n`)

— Function: int **u16_mbtouc_unsafe** (`ucs4_t *puc, const uint16_t *s, size_t n`)

— Function: int **u32_mbtouc_unsafe** (`ucs4_t *puc, const uint32_t *s, size_t n`)

Returns the length (number of units) of the first character in `s`, putting its `ucs4_t` representation in `*puc`. Upon failure, `*puc` is set to `0xfffd`, and an appropriate number of units is returned.

The number of available units, `n`, must be > 0.

This function is similar to `mbtowc`, except that it operates on a Unicode string, `puc` and `s` must not be NULL, `n` must be > 0, and the NUL character is not treated specially.

— Function: int **u8_mbtouc** (`ucs4_t *puc, const uint8_t *s, size_t n`)

— Function: int **u16_mbtouc** (`ucs4_t *puc, const uint16_t *s, size_t n`)

— Function: int **u32_mbtouc** (`ucs4_t *puc, const uint32_t *s, size_t n`)

This function is like `u8_mbtouc_unsafe`, except that it will detect an invalid UTF-8 character, even if the library is compiled without `--enable-safety`.

— Function: int **u8_mbtoucr** (`ucs4_t *puc, const uint8_t *s, size_t n`)

— Function: int **u16_mbtoucr** (`ucs4_t *puc, const uint16_t *s, size_t n`)

— Function: int **u32_mbtoucr** (`ucs4_t *puc, const uint32_t *s, size_t n`)

Returns the length (number of units) of the first character in `s`, putting its `ucs4_t` representation in `*puc`. Upon failure, `*puc` is set to `0xfffd`, and -1 is returned for an invalid sequence of units, -2 is returned for an incomplete sequence of units.

The number of available units, `n`, must be > 0.

This function is similar to `u8_mbtouc`, except that the return value gives more details about the failure, similar to `mbrtowc`.
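
The return-value convention described above can be illustrated for UTF-16, where the cases are easy to enumerate. The following is a minimal sketch, not the library's implementation: `sketch_u16_mbtoucr` is a hypothetical stand-in, and `uint32_t` is used in place of `ucs4_t` so that the block compiles without the library headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the UTF-16 variant; not the library's code.
   Returns 1 or 2 on success, -1 for an invalid sequence,
   -2 for an incomplete sequence. */
static int sketch_u16_mbtoucr(uint32_t *puc, const uint16_t *s, size_t n)
{
    assert(puc != NULL && s != NULL && n > 0);
    uint16_t c = s[0];

    if (c < 0xd800 || c >= 0xe000) {
        /* A single unit outside the surrogate range is a whole character. */
        *puc = c;
        return 1;
    }
    if (c < 0xdc00) {
        /* High surrogate: a low surrogate must follow. */
        if (n < 2) {
            *puc = 0xfffd;
            return -2;              /* incomplete: the input ended too early */
        }
        if (s[1] >= 0xdc00 && s[1] < 0xe000) {
            *puc = 0x10000 + ((uint32_t)(c - 0xd800) << 10)
                           + (uint32_t)(s[1] - 0xdc00);
            return 2;
        }
    }
    /* Lone low surrogate, or high surrogate followed by something else. */
    *puc = 0xfffd;
    return -1;
}
```

A caller reading from a stream would typically treat -2 as "fetch more units and retry" and -1 as a hard error, which is the distinction the plain `mbtouc` variant does not expose.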

The following function stores a Unicode character as a Unicode string in memory.

— Function: int **u8_uctomb** (`uint8_t *s, ucs4_t uc, int n`)

— Function: int **u16_uctomb** (`uint16_t *s, ucs4_t uc, int n`)

— Function: int **u32_uctomb** (`uint32_t *s, ucs4_t uc, int n`)

Puts the multibyte character represented by `uc` in `s`, returning its length. Returns -1 upon failure, -2 if the number of available units, `n`, is too small. The latter case cannot occur if `n` >= 6/2/1, respectively.

This function is similar to `wctomb`, except that it operates on a Unicode string, `s` must not be NULL, and the argument `n` must be specified.
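
To make the two failure codes concrete, here is a minimal sketch of the UTF-8 case. It is not the library's implementation: `sketch_u8_uctomb` is a hypothetical stand-in, `uint32_t` replaces `ucs4_t`, and treating surrogates and values above U+10FFFF as the "failure" cases is an assumption based on the definition of UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the UTF-8 variant; not the library's code.
   Returns the number of units written, -1 if uc is not encodable,
   -2 if n units are not enough. */
static int sketch_u8_uctomb(uint8_t *s, uint32_t uc, int n)
{
    assert(s != NULL);
    int count;

    if (uc < 0x80) {
        count = 1;
    } else if (uc < 0x800) {
        count = 2;
    } else if (uc < 0x10000) {
        if (uc >= 0xd800 && uc < 0xe000)
            return -1;              /* surrogates are not characters */
        count = 3;
    } else if (uc < 0x110000) {
        count = 4;
    } else {
        return -1;                  /* beyond the Unicode range */
    }

    if (n < count)
        return -2;                  /* not enough room in s */

    switch (count) {
    case 1:
        s[0] = (uint8_t)uc;
        break;
    case 2:
        s[0] = (uint8_t)(0xc0 | (uc >> 6));
        s[1] = (uint8_t)(0x80 | (uc & 0x3f));
        break;
    case 3:
        s[0] = (uint8_t)(0xe0 | (uc >> 12));
        s[1] = (uint8_t)(0x80 | ((uc >> 6) & 0x3f));
        s[2] = (uint8_t)(0x80 | (uc & 0x3f));
        break;
    default:
        s[0] = (uint8_t)(0xf0 | (uc >> 18));
        s[1] = (uint8_t)(0x80 | ((uc >> 12) & 0x3f));
        s[2] = (uint8_t)(0x80 | ((uc >> 6) & 0x3f));
        s[3] = (uint8_t)(0x80 | (uc & 0x3f));
        break;
    }
    return count;
}
```

Passing a buffer of at least the size named in the description means the -2 case never needs handling at the call site.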

The following functions copy Unicode strings in memory.

— Function: uint8_t * **u8_cpy** (`uint8_t *dest, const uint8_t *src, size_t n`)

— Function: uint16_t * **u16_cpy** (`uint16_t *dest, const uint16_t *src, size_t n`)

— Function: uint32_t * **u32_cpy** (`uint32_t *dest, const uint32_t *src, size_t n`)

Copies `n` units from `src` to `dest`.

This function is similar to `memcpy`, except that it operates on Unicode strings.

— Function: uint8_t * **u8_move** (`uint8_t *dest, const uint8_t *src, size_t n`)

— Function: uint16_t * **u16_move** (`uint16_t *dest, const uint16_t *src, size_t n`)

— Function: uint32_t * **u32_move** (`uint32_t *dest, const uint32_t *src, size_t n`)

Copies `n` units from `src` to `dest`, guaranteeing correct behavior for overlapping memory areas.

This function is similar to `memmove`, except that it operates on Unicode strings.
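
Because `n` counts units rather than bytes, the relationship to `memmove` involves the unit size. The sketch below shows this for 16-bit units. `sketch_u16_move` is a hypothetical stand-in, not the library's code, and returning `dest` is an assumption carried over from `memmove`, since the description above does not state the return value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in: moves n 16-bit units, tolerating overlap. */
static uint16_t *sketch_u16_move(uint16_t *dest, const uint16_t *src, size_t n)
{
    assert(dest != NULL && src != NULL);
    /* n is a unit count, so scale it to bytes for memmove. */
    memmove(dest, src, n * sizeof(uint16_t));
    return dest;
}
```

Shifting a buffer one unit to the right, as in `sketch_u16_move(buf + 1, buf, 4)`, is the kind of overlapping copy for which the `cpy` variant would not be appropriate.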

The following function fills a Unicode string.

— Function: uint8_t * **u8_set** (`uint8_t *s, ucs4_t uc, size_t n`)

— Function: uint16_t * **u16_set** (`uint16_t *s, ucs4_t uc, size_t n`)

— Function: uint32_t * **u32_set** (`uint32_t *s, ucs4_t uc, size_t n`)

Sets the first `n` characters of `s` to `uc`. `uc` should be a character that occupies only 1 unit.

This function is similar to `memset`, except that it operates on Unicode strings.

The following function compares two Unicode strings of the same length.

— Function: int **u8_cmp** (`const uint8_t *s1, const uint8_t *s2, size_t n`)

— Function: int **u16_cmp** (`const uint16_t *s1, const uint16_t *s2, size_t n`)

— Function: int **u32_cmp** (`const uint32_t *s1, const uint32_t *s2, size_t n`)

Compares `s1` and `s2`, each of length `n`, lexicographically. Returns a negative value if `s1` compares smaller than `s2`, a positive value if `s1` compares larger than `s2`, or 0 if they compare equal.

This function is similar to `memcmp`, except that it operates on Unicode strings.

The following function compares two Unicode strings of possibly different lengths.

— Function: int **u8_cmp2** (`const uint8_t *s1, size_t n1, const uint8_t *s2, size_t n2`)

— Function: int **u16_cmp2** (`const uint16_t *s1, size_t n1, const uint16_t *s2, size_t n2`)

— Function: int **u32_cmp2** (`const uint32_t *s1, size_t n1, const uint32_t *s2, size_t n2`)

Compares `s1` and `s2`, lexicographically. Returns a negative value if `s1` compares smaller than `s2`, a positive value if `s1` compares larger than `s2`, or 0 if they compare equal.

This function is similar to the gnulib function `memcmp2`, except that it operates on Unicode strings.
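
Lexicographic comparison of strings with different lengths can be sketched as follows for 32-bit units. `sketch_u32_cmp2` is a hypothetical stand-in, not the library's code. The rule that a string which is a proper prefix of the other compares smaller is an assumption taken from the usual meaning of lexicographic order; the description above does not spell it out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: compares two UTF-32 strings of lengths n1 and n2. */
static int sketch_u32_cmp2(const uint32_t *s1, size_t n1,
                           const uint32_t *s2, size_t n2)
{
    assert(s1 != NULL && s2 != NULL);
    size_t common = n1 < n2 ? n1 : n2;

    /* The first differing unit within the common prefix decides. */
    for (size_t i = 0; i < common; i++) {
        if (s1[i] != s2[i])
            return s1[i] < s2[i] ? -1 : 1;
    }
    /* Equal prefixes: the shorter string compares smaller (assumption). */
    return (n1 > n2) - (n1 < n2);
}
```

Callers should test only the sign of the result, since the description promises a negative, zero, or positive value rather than specific numbers.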

The following function searches for a given Unicode character.

— Function: uint8_t * **u8_chr** (`const uint8_t *s, size_t n, ucs4_t uc`)

— Function: uint16_t * **u16_chr** (`const uint16_t *s, size_t n, ucs4_t uc`)

— Function: uint32_t * **u32_chr** (`const uint32_t *s, size_t n, ucs4_t uc`)

Searches the string at `s` for `uc`. Returns a pointer to the first occurrence of `uc` in `s`, or NULL if `uc` does not occur in `s`.

This function is similar to `memchr`, except that it operates on Unicode strings.

The following function counts the number of Unicode characters.

— Function: size_t **u8_mbsnlen** (`const uint8_t *s, size_t n`)

— Function: size_t **u16_mbsnlen** (`const uint16_t *s, size_t n`)

— Function: size_t **u32_mbsnlen** (`const uint32_t *s, size_t n`)

Counts and returns the number of Unicode characters in the `n` units from `s`.

This function is similar to the gnulib function `mbsnlen`, except that it operates on Unicode strings.
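
The difference between units and characters can be illustrated for UTF-8. The sketch below assumes well-formed input, in which case every character contributes exactly one byte that is not a continuation byte (`10xxxxxx`). `sketch_u8_mbsnlen` is a hypothetical stand-in, not the library's code, and it makes no claim about how the library counts malformed input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: counts characters in n units of well-formed UTF-8. */
static size_t sketch_u8_mbsnlen(const uint8_t *s, size_t n)
{
    assert(s != NULL);
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        /* Every byte except a continuation byte starts a new character. */
        if ((s[i] & 0xc0) != 0x80)
            count++;
    }
    return count;
}
```

For the six bytes `61 c3 a9 e2 82 ac` (the characters U+0061, U+00E9 and U+20AC), this sketch counts three characters.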