10.2 Word break property
This is a lower-level API. The word break property is defined in
Unicode Standard Annex #29, section “Word Boundaries” (see
http://www.unicode.org/reports/tr29/#Word_Boundaries). It is
used to determine the word breaks in a string.
The following are the possible values of the word break property. More values
may be added in the future.
— Constant: int WBP_OTHER
— Constant: int WBP_CR
— Constant: int WBP_LF
— Constant: int WBP_NEWLINE
— Constant: int WBP_EXTEND
— Constant: int WBP_FORMAT
— Constant: int WBP_KATAKANA
— Constant: int WBP_ALETTER
— Constant: int WBP_MIDNUMLET
— Constant: int WBP_MIDLETTER
— Constant: int WBP_MIDNUM
— Constant: int WBP_NUMERIC
— Constant: int WBP_EXTENDNUMLET
The following function looks up the word break property of a character.
— Function: int uc_wordbreak_property (ucs4_t uc)
Returns the Word_Break property of a Unicode character.