4.5 Elementary string functions on NUL terminated strings

The following functions inspect and return details about the first character in a Unicode string.

— Function: int u8_strmblen (const uint8_t *s)
— Function: int u16_strmblen (const uint16_t *s)
— Function: int u32_strmblen (const uint32_t *s)

Returns the length (number of units) of the first character in s. Returns 0 if it is the NUL character. Returns -1 upon failure.

— Function: int u8_strmbtouc (ucs4_t *puc, const uint8_t *s)
— Function: int u16_strmbtouc (ucs4_t *puc, const uint16_t *s)
— Function: int u32_strmbtouc (ucs4_t *puc, const uint32_t *s)

Returns the length (number of units) of the first character in s, putting its ucs4_t representation in *puc. Returns 0 if it is the NUL character. Returns -1 upon failure.

— Function: const uint8_t * u8_next (ucs4_t *puc, const uint8_t *s)
— Function: const uint16_t * u16_next (ucs4_t *puc, const uint16_t *s)
— Function: const uint32_t * u32_next (ucs4_t *puc, const uint32_t *s)

Forward iteration step. Returns a pointer past the next character in s, or NULL if the end of the string has been reached. Puts the character's ucs4_t representation in *puc.

The following function inspects and returns details about the previous character in a Unicode string.

— Function: const uint8_t * u8_prev (ucs4_t *puc, const uint8_t *s, const uint8_t *start)
— Function: const uint16_t * u16_prev (ucs4_t *puc, const uint16_t *s, const uint16_t *start)
— Function: const uint32_t * u32_prev (ucs4_t *puc, const uint32_t *s, const uint32_t *start)

Backward iteration step. Returns a pointer to the character that ends just before s, or NULL if the beginning of the string, start, has been reached. Puts the character's ucs4_t representation in *puc.

The following functions determine the length of a Unicode string.

— Function: size_t u8_strlen (const uint8_t *s)
— Function: size_t u16_strlen (const uint16_t *s)
— Function: size_t u32_strlen (const uint32_t *s)

Returns the number of units in s, not counting the terminating NUL unit.

This function is similar to strlen and wcslen, except that it operates on Unicode strings.

— Function: size_t u8_strnlen (const uint8_t *s, size_t maxlen)
— Function: size_t u16_strnlen (const uint16_t *s, size_t maxlen)
— Function: size_t u32_strnlen (const uint32_t *s, size_t maxlen)

Returns the number of units in s, but at most maxlen.

This function is similar to strnlen and wcsnlen, except that it operates on Unicode strings.

The following functions copy portions of Unicode strings in memory.

— Function: uint8_t * u8_strcpy (uint8_t *dest, const uint8_t *src)
— Function: uint16_t * u16_strcpy (uint16_t *dest, const uint16_t *src)
— Function: uint32_t * u32_strcpy (uint32_t *dest, const uint32_t *src)

Copies src to dest.

This function is similar to strcpy and wcscpy, except that it operates on Unicode strings.

— Function: uint8_t * u8_stpcpy (uint8_t *dest, const uint8_t *src)
— Function: uint16_t * u16_stpcpy (uint16_t *dest, const uint16_t *src)
— Function: uint32_t * u32_stpcpy (uint32_t *dest, const uint32_t *src)

Copies src to dest, returning the address of the terminating NUL in dest.

This function is similar to stpcpy, except that it operates on Unicode strings.

— Function: uint8_t * u8_strncpy (uint8_t *dest, const uint8_t *src, size_t n)
— Function: uint16_t * u16_strncpy (uint16_t *dest, const uint16_t *src, size_t n)
— Function: uint32_t * u32_strncpy (uint32_t *dest, const uint32_t *src, size_t n)

Copies no more than n units of src to dest.

This function is similar to strncpy and wcsncpy, except that it operates on Unicode strings.

— Function: uint8_t * u8_stpncpy (uint8_t *dest, const uint8_t *src, size_t n)
— Function: uint16_t * u16_stpncpy (uint16_t *dest, const uint16_t *src, size_t n)
— Function: uint32_t * u32_stpncpy (uint32_t *dest, const uint32_t *src, size_t n)

Copies no more than n units of src to dest. Returns a pointer past the last non-NUL unit written into dest. In other words, if the units written into dest include a NUL, the return value is the address of the first such NUL unit, otherwise it is dest + n.

This function is similar to stpncpy, except that it operates on Unicode strings.

— Function: uint8_t * u8_strcat (uint8_t *dest, const uint8_t *src)
— Function: uint16_t * u16_strcat (uint16_t *dest, const uint16_t *src)
— Function: uint32_t * u32_strcat (uint32_t *dest, const uint32_t *src)

Appends src onto dest.

This function is similar to strcat and wcscat, except that it operates on Unicode strings.

— Function: uint8_t * u8_strncat (uint8_t *dest, const uint8_t *src, size_t n)
— Function: uint16_t * u16_strncat (uint16_t *dest, const uint16_t *src, size_t n)
— Function: uint32_t * u32_strncat (uint32_t *dest, const uint32_t *src, size_t n)

Appends no more than n units of src onto dest.

This function is similar to strncat and wcsncat, except that it operates on Unicode strings.

The following functions compare two Unicode strings.

— Function: int u8_strcmp (const uint8_t *s1, const uint8_t *s2)
— Function: int u16_strcmp (const uint16_t *s1, const uint16_t *s2)
— Function: int u32_strcmp (const uint32_t *s1, const uint32_t *s2)

Compares s1 and s2, lexicographically. Returns a negative value if s1 compares smaller than s2, a positive value if s1 compares larger than s2, or 0 if they compare equal.

This function is similar to strcmp and wcscmp, except that it operates on Unicode strings.

— Function: int u8_strcoll (const uint8_t *s1, const uint8_t *s2)
— Function: int u16_strcoll (const uint16_t *s1, const uint16_t *s2)
— Function: int u32_strcoll (const uint32_t *s1, const uint32_t *s2)

Compares s1 and s2 using the collation rules of the current locale. Returns -1 if s1 < s2, 0 if s1 = s2, 1 if s1 > s2. Upon failure, sets errno; the return value is then unspecified.

This function is similar to strcoll and wcscoll, except that it operates on Unicode strings.

Note that this function may treat two canonically equivalent strings that are normalized differently as unequal, and may even place them far apart in the ordering. It is therefore better to use the function u8_normcoll instead of this one; see uninorm.h.

— Function: int u8_strncmp (const uint8_t *s1, const uint8_t *s2, size_t n)
— Function: int u16_strncmp (const uint16_t *s1, const uint16_t *s2, size_t n)
— Function: int u32_strncmp (const uint32_t *s1, const uint32_t *s2, size_t n)

Compares no more than n units of s1 and s2.

This function is similar to strncmp and wcsncmp, except that it operates on Unicode strings.

The following function allocates a duplicate of a Unicode string.

— Function: uint8_t * u8_strdup (const uint8_t *s)
— Function: uint16_t * u16_strdup (const uint16_t *s)
— Function: uint32_t * u32_strdup (const uint32_t *s)

Duplicates s, returning an identical malloc'd string.

This function is similar to strdup and wcsdup, except that it operates on Unicode strings.

The following functions search for a given Unicode character.

— Function: uint8_t * u8_strchr (const uint8_t *str, ucs4_t uc)
— Function: uint16_t * u16_strchr (const uint16_t *str, ucs4_t uc)
— Function: uint32_t * u32_strchr (const uint32_t *str, ucs4_t uc)

Finds the first occurrence of uc in str.

This function is similar to strchr and wcschr, except that it operates on Unicode strings.

— Function: uint8_t * u8_strrchr (const uint8_t *str, ucs4_t uc)
— Function: uint16_t * u16_strrchr (const uint16_t *str, ucs4_t uc)
— Function: uint32_t * u32_strrchr (const uint32_t *str, ucs4_t uc)

Finds the last occurrence of uc in str.

This function is similar to strrchr and wcsrchr, except that it operates on Unicode strings.

The following functions search for the first occurrence of some Unicode character in or outside a given set of Unicode characters.

— Function: size_t u8_strcspn (const uint8_t *str, const uint8_t *reject)
— Function: size_t u16_strcspn (const uint16_t *str, const uint16_t *reject)
— Function: size_t u32_strcspn (const uint32_t *str, const uint32_t *reject)

Returns the length of the initial segment of str which consists entirely of Unicode characters not in reject.

This function is similar to strcspn and wcscspn, except that it operates on Unicode strings.

— Function: size_t u8_strspn (const uint8_t *str, const uint8_t *accept)
— Function: size_t u16_strspn (const uint16_t *str, const uint16_t *accept)
— Function: size_t u32_strspn (const uint32_t *str, const uint32_t *accept)

Returns the length of the initial segment of str which consists entirely of Unicode characters in accept.

This function is similar to strspn and wcsspn, except that it operates on Unicode strings.

— Function: uint8_t * u8_strpbrk (const uint8_t *str, const uint8_t *accept)
— Function: uint16_t * u16_strpbrk (const uint16_t *str, const uint16_t *accept)
— Function: uint32_t * u32_strpbrk (const uint32_t *str, const uint32_t *accept)

Finds the first occurrence in str of any character in accept.

This function is similar to strpbrk and wcspbrk, except that it operates on Unicode strings.

The following functions search for a given Unicode string within another Unicode string, or test whether it occurs as a prefix or suffix.

— Function: uint8_t * u8_strstr (const uint8_t *haystack, const uint8_t *needle)
— Function: uint16_t * u16_strstr (const uint16_t *haystack, const uint16_t *needle)
— Function: uint32_t * u32_strstr (const uint32_t *haystack, const uint32_t *needle)

Finds the first occurrence of needle in haystack.

This function is similar to strstr and wcsstr, except that it operates on Unicode strings.

— Function: bool u8_startswith (const uint8_t *str, const uint8_t *prefix)
— Function: bool u16_startswith (const uint16_t *str, const uint16_t *prefix)
— Function: bool u32_startswith (const uint32_t *str, const uint32_t *prefix)

Tests whether str starts with prefix.

— Function: bool u8_endswith (const uint8_t *str, const uint8_t *suffix)
— Function: bool u16_endswith (const uint16_t *str, const uint16_t *suffix)
— Function: bool u32_endswith (const uint32_t *str, const uint32_t *suffix)

Tests whether str ends with suffix.

The following function does one step in tokenizing a Unicode string.

— Function: uint8_t * u8_strtok (uint8_t *str, const uint8_t *delim, uint8_t **ptr)
— Function: uint16_t * u16_strtok (uint16_t *str, const uint16_t *delim, uint16_t **ptr)
— Function: uint32_t * u32_strtok (uint32_t *str, const uint32_t *delim, uint32_t **ptr)

Divides str into tokens separated by characters in delim.

This function is similar to strtok_r and wcstok, except that it operates on Unicode strings. Its interface is actually more similar to wcstok than to strtok.