8.1.1 The object-oriented API for general category
— Type:
uc_general_category_t
This data type denotes a general category value. It is an immediate type that
can be copied by simple assignment, without involving memory allocation. It is
not an array type.
The following are the predefined general category values. Additional general
categories may be added in the future.
— Constant: uc_general_category_t
UC_CATEGORY_L
— Constant: uc_general_category_t
UC_CATEGORY_Lu
— Constant: uc_general_category_t
UC_CATEGORY_Ll
— Constant: uc_general_category_t
UC_CATEGORY_Lt
— Constant: uc_general_category_t
UC_CATEGORY_Lm
— Constant: uc_general_category_t
UC_CATEGORY_Lo
— Constant: uc_general_category_t
UC_CATEGORY_M
— Constant: uc_general_category_t
UC_CATEGORY_Mn
— Constant: uc_general_category_t
UC_CATEGORY_Mc
— Constant: uc_general_category_t
UC_CATEGORY_Me
— Constant: uc_general_category_t
UC_CATEGORY_N
— Constant: uc_general_category_t
UC_CATEGORY_Nd
— Constant: uc_general_category_t
UC_CATEGORY_Nl
— Constant: uc_general_category_t
UC_CATEGORY_No
— Constant: uc_general_category_t
UC_CATEGORY_P
— Constant: uc_general_category_t
UC_CATEGORY_Pc
— Constant: uc_general_category_t
UC_CATEGORY_Pd
— Constant: uc_general_category_t
UC_CATEGORY_Ps
— Constant: uc_general_category_t
UC_CATEGORY_Pe
— Constant: uc_general_category_t
UC_CATEGORY_Pi
— Constant: uc_general_category_t
UC_CATEGORY_Pf
— Constant: uc_general_category_t
UC_CATEGORY_Po
— Constant: uc_general_category_t
UC_CATEGORY_S
— Constant: uc_general_category_t
UC_CATEGORY_Sm
— Constant: uc_general_category_t
UC_CATEGORY_Sc
— Constant: uc_general_category_t
UC_CATEGORY_Sk
— Constant: uc_general_category_t
UC_CATEGORY_So
— Constant: uc_general_category_t
UC_CATEGORY_Z
— Constant: uc_general_category_t
UC_CATEGORY_Zs
— Constant: uc_general_category_t
UC_CATEGORY_Zl
— Constant: uc_general_category_t
UC_CATEGORY_Zp
— Constant: uc_general_category_t
UC_CATEGORY_C
— Constant: uc_general_category_t
UC_CATEGORY_Cc
— Constant: uc_general_category_t
UC_CATEGORY_Cf
— Constant: uc_general_category_t
UC_CATEGORY_Cs
— Constant: uc_general_category_t
UC_CATEGORY_Co
— Constant: uc_general_category_t
UC_CATEGORY_Cn
The following are alias names for predefined general category values.
— Macro: uc_general_category_t
UC_LETTER
This is another name for UC_CATEGORY_L.
— Macro: uc_general_category_t
UC_UPPERCASE_LETTER
This is another name for UC_CATEGORY_Lu.
— Macro: uc_general_category_t
UC_LOWERCASE_LETTER
This is another name for UC_CATEGORY_Ll.
— Macro: uc_general_category_t
UC_TITLECASE_LETTER
This is another name for UC_CATEGORY_Lt.
— Macro: uc_general_category_t
UC_MODIFIER_LETTER
This is another name for UC_CATEGORY_Lm.
— Macro: uc_general_category_t
UC_OTHER_LETTER
This is another name for UC_CATEGORY_Lo.
— Macro: uc_general_category_t
UC_MARK
This is another name for UC_CATEGORY_M.
— Macro: uc_general_category_t
UC_NON_SPACING_MARK
This is another name for UC_CATEGORY_Mn.
— Macro: uc_general_category_t
UC_COMBINING_SPACING_MARK
This is another name for UC_CATEGORY_Mc.
— Macro: uc_general_category_t
UC_ENCLOSING_MARK
This is another name for UC_CATEGORY_Me.
— Macro: uc_general_category_t
UC_NUMBER
This is another name for UC_CATEGORY_N.
— Macro: uc_general_category_t
UC_DECIMAL_DIGIT_NUMBER
This is another name for UC_CATEGORY_Nd.
— Macro: uc_general_category_t
UC_LETTER_NUMBER
This is another name for UC_CATEGORY_Nl.
— Macro: uc_general_category_t
UC_OTHER_NUMBER
This is another name for UC_CATEGORY_No.
— Macro: uc_general_category_t
UC_PUNCTUATION
This is another name for UC_CATEGORY_P.
— Macro: uc_general_category_t
UC_CONNECTOR_PUNCTUATION
This is another name for UC_CATEGORY_Pc.
— Macro: uc_general_category_t
UC_DASH_PUNCTUATION
This is another name for UC_CATEGORY_Pd.
— Macro: uc_general_category_t
UC_OPEN_PUNCTUATION
This is another name for UC_CATEGORY_Ps (“start punctuation”).
— Macro: uc_general_category_t
UC_CLOSE_PUNCTUATION
This is another name for UC_CATEGORY_Pe (“end punctuation”).
— Macro: uc_general_category_t
UC_INITIAL_QUOTE_PUNCTUATION
This is another name for UC_CATEGORY_Pi.
— Macro: uc_general_category_t
UC_FINAL_QUOTE_PUNCTUATION
This is another name for UC_CATEGORY_Pf.
— Macro: uc_general_category_t
UC_OTHER_PUNCTUATION
This is another name for UC_CATEGORY_Po.
— Macro: uc_general_category_t
UC_SYMBOL
This is another name for UC_CATEGORY_S.
— Macro: uc_general_category_t
UC_MATH_SYMBOL
This is another name for UC_CATEGORY_Sm.
— Macro: uc_general_category_t
UC_CURRENCY_SYMBOL
This is another name for UC_CATEGORY_Sc.
— Macro: uc_general_category_t
UC_MODIFIER_SYMBOL
This is another name for UC_CATEGORY_Sk.
— Macro: uc_general_category_t
UC_OTHER_SYMBOL
This is another name for UC_CATEGORY_So.
— Macro: uc_general_category_t
UC_SEPARATOR
This is another name for UC_CATEGORY_Z.
— Macro: uc_general_category_t
UC_SPACE_SEPARATOR
This is another name for UC_CATEGORY_Zs.
— Macro: uc_general_category_t
UC_LINE_SEPARATOR
This is another name for UC_CATEGORY_Zl.
— Macro: uc_general_category_t
UC_PARAGRAPH_SEPARATOR
This is another name for UC_CATEGORY_Zp.
— Macro: uc_general_category_t
UC_OTHER
This is another name for UC_CATEGORY_C.
— Macro: uc_general_category_t
UC_CONTROL
This is another name for UC_CATEGORY_Cc.
— Macro: uc_general_category_t
UC_FORMAT
This is another name for UC_CATEGORY_Cf.
— Macro: uc_general_category_t
UC_SURROGATE
This is another name for UC_CATEGORY_Cs. All code points in this
category are invalid characters.
— Macro: uc_general_category_t
UC_PRIVATE_USE
This is another name for UC_CATEGORY_Co.
— Macro: uc_general_category_t
UC_UNASSIGNED
This is another name for UC_CATEGORY_Cn. Some code points in this
category are invalid characters.
The following functions combine general categories, as in a Boolean algebra,
except that there is no ‘not’ operation.
— Function: uc_general_category_t
uc_general_category_or (
uc_general_category_t category1, uc_general_category_t category2)
Returns the union of two general categories.
This corresponds to the union of the two sets of characters.
— Function: uc_general_category_t
uc_general_category_and (
uc_general_category_t category1, uc_general_category_t category2)
Returns the intersection of two general categories as bit masks.
This does not correspond to the intersection of the two sets of
characters.
— Function: uc_general_category_t
uc_general_category_and_not (
uc_general_category_t category1, uc_general_category_t category2)
Returns the intersection of a general category with the complement of a
second general category, as bit masks.
This does not correspond to the intersection with complement, when
viewing the categories as sets of characters.
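The bit-mask behavior of these three functions can be illustrated with a toy model. The sketch below does not use libunistring: the toy_ names and the bit positions are assumptions made up for this illustration, and only a handful of categories are modeled.

```c
#include <stdint.h>

/* Toy model, not libunistring: a category is a plain bit mask.  */
typedef uint32_t toy_category_t;

/* One bit per two-letter category; the bit positions are arbitrary.  */
#define TOY_Lu ((toy_category_t) 1 << 0)
#define TOY_Ll ((toy_category_t) 1 << 1)
#define TOY_Lt ((toy_category_t) 1 << 2)
#define TOY_Lm ((toy_category_t) 1 << 3)
#define TOY_Lo ((toy_category_t) 1 << 4)
#define TOY_Nd ((toy_category_t) 1 << 5)

/* A one-letter category covers all of its two-letter subcategories.  */
#define TOY_L (TOY_Lu | TOY_Ll | TOY_Lt | TOY_Lm | TOY_Lo)

/* Bits that are set in either mask.  */
toy_category_t
toy_category_or (toy_category_t category1, toy_category_t category2)
{
  return category1 | category2;
}

/* Bits that are set in both masks.  */
toy_category_t
toy_category_and (toy_category_t category1, toy_category_t category2)
{
  return category1 & category2;
}

/* Bits that are set in the first mask but not in the second.  */
toy_category_t
toy_category_and_not (toy_category_t category1, toy_category_t category2)
{
  return category1 & ~category2;
}
```

In this model, toy_category_and (TOY_L, TOY_Lu) yields TOY_Lu, and toy_category_and_not (TOY_L, TOY_Lu) yields the mask for Ll, Lt, Lm and Lo. The corresponding library calls are uc_general_category_and (UC_CATEGORY_L, UC_CATEGORY_Lu) and uc_general_category_and_not (UC_CATEGORY_L, UC_CATEGORY_Lu).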
The following functions associate general categories with their name.
— Function: const char *
uc_general_category_name (
uc_general_category_t category)
Returns the name of a general category.
Returns NULL if the general category corresponds to a bit mask that does not
have a name.
— Function: uc_general_category_t
uc_general_category_byname (
const char *category_name)
Returns the general category given by name, e.g. "Lu".
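The relationship between a category and its name, including the NULL result for a combined bit mask that has no name, can be sketched with a small lookup table. This is a toy model and not libunistring's code: the toy_ names, the bit positions and the three-entry table are assumptions. The text above does not say what the library returns for an unknown name; the toy simply returns an empty mask in that case.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model, not libunistring: a category is a plain bit mask.  */
typedef uint32_t toy_category_t;

#define TOY_NONE ((toy_category_t) 0)
#define TOY_Lu ((toy_category_t) 1 << 0)
#define TOY_Ll ((toy_category_t) 1 << 1)
#define TOY_Nd ((toy_category_t) 1 << 2)

/* Only masks listed here have a name.  */
static const struct { toy_category_t mask; const char *name; } toy_names[] = {
  { TOY_Lu, "Lu" },
  { TOY_Ll, "Ll" },
  { TOY_Nd, "Nd" }
};

#define TOY_NAME_COUNT (sizeof toy_names / sizeof toy_names[0])

/* Returns the name of a mask, or NULL if the mask has no name.  */
const char *
toy_category_name (toy_category_t category)
{
  for (size_t i = 0; i < TOY_NAME_COUNT; i++)
    if (toy_names[i].mask == category)
      return toy_names[i].name;
  return NULL;
}

/* Returns the mask for a name, or TOY_NONE if the name is unknown
   (a choice made for this toy only).  */
toy_category_t
toy_category_byname (const char *category_name)
{
  for (size_t i = 0; i < TOY_NAME_COUNT; i++)
    if (strcmp (toy_names[i].name, category_name) == 0)
      return toy_names[i].mask;
  return TOY_NONE;
}
```

Here toy_category_name (TOY_Lu | TOY_Nd) returns NULL because the combined mask matches no table entry. This is why callers of uc_general_category_name should check the result for NULL before printing it.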
The following functions view general categories as sets of Unicode characters.
— Function: uc_general_category_t
uc_general_category (
ucs4_t uc)
Returns the general category of a Unicode character.
This function uses a big table.
— Function: bool
uc_is_general_category (
ucs4_t uc, uc_general_category_t category)
Tests whether a Unicode character belongs to a given category.
The category argument can be a predefined general category or the
combination of several predefined general categories.
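Testing a character against a combination of categories can be pictured as follows. The sketch is a toy model that does not call libunistring: the toy_ names and bit positions are assumptions, and the classifier covers only ASCII letters and digits (which are Lu, Ll and Nd in Unicode) instead of the library's table.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model, not libunistring: a category is a plain bit mask.  */
typedef uint32_t toy_category_t;

#define TOY_Lu ((toy_category_t) 1 << 0)
#define TOY_Ll ((toy_category_t) 1 << 1)
#define TOY_Nd ((toy_category_t) 1 << 2)

/* Classifies a code point.  Only ASCII letters and digits are covered;
   every other code point yields the empty mask in this toy.  */
toy_category_t
toy_general_category (uint32_t uc)
{
  if (uc >= 0x41 && uc <= 0x5A)   /* A..Z */
    return TOY_Lu;
  if (uc >= 0x61 && uc <= 0x7A)   /* a..z */
    return TOY_Ll;
  if (uc >= 0x30 && uc <= 0x39)   /* 0..9 */
    return TOY_Nd;
  return 0;
}

/* Tests membership; CATEGORY may be a single category or a union.  */
bool
toy_is_general_category (uint32_t uc, toy_category_t category)
{
  return (toy_general_category (uc) & category) != 0;
}
```

With the library, the same kind of test would combine uc_general_category_or with uc_is_general_category, for example uc_is_general_category (uc, uc_general_category_or (UC_CATEGORY_Lu, UC_CATEGORY_Ll)) to accept uppercase and lowercase letters.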