

8.9.1 Properties as objects – the object-oriented API

The following type designates a property on Unicode characters.

Type: uc_property_t

This data type denotes a boolean property on Unicode characters. It is an immediate type that can be copied by simple assignment, without involving memory allocation. It is not an array type.

Many Unicode properties are predefined.

The following are general properties.

Constant: uc_property_t UC_PROPERTY_WHITE_SPACE
Constant: uc_property_t UC_PROPERTY_ALPHABETIC
Constant: uc_property_t UC_PROPERTY_OTHER_ALPHABETIC
Constant: uc_property_t UC_PROPERTY_NOT_A_CHARACTER
Constant: uc_property_t UC_PROPERTY_DEFAULT_IGNORABLE_CODE_POINT
Constant: uc_property_t UC_PROPERTY_OTHER_DEFAULT_IGNORABLE_CODE_POINT
Constant: uc_property_t UC_PROPERTY_DEPRECATED
Constant: uc_property_t UC_PROPERTY_LOGICAL_ORDER_EXCEPTION
Constant: uc_property_t UC_PROPERTY_VARIATION_SELECTOR
Constant: uc_property_t UC_PROPERTY_PRIVATE_USE
Constant: uc_property_t UC_PROPERTY_UNASSIGNED_CODE_VALUE

The following properties are related to case folding.

Constant: uc_property_t UC_PROPERTY_UPPERCASE
Constant: uc_property_t UC_PROPERTY_OTHER_UPPERCASE
Constant: uc_property_t UC_PROPERTY_LOWERCASE
Constant: uc_property_t UC_PROPERTY_OTHER_LOWERCASE
Constant: uc_property_t UC_PROPERTY_TITLECASE
Constant: uc_property_t UC_PROPERTY_CASED
Constant: uc_property_t UC_PROPERTY_CASE_IGNORABLE
Constant: uc_property_t UC_PROPERTY_CHANGES_WHEN_LOWERCASED
Constant: uc_property_t UC_PROPERTY_CHANGES_WHEN_UPPERCASED
Constant: uc_property_t UC_PROPERTY_CHANGES_WHEN_TITLECASED
Constant: uc_property_t UC_PROPERTY_CHANGES_WHEN_CASEFOLDED
Constant: uc_property_t UC_PROPERTY_CHANGES_WHEN_CASEMAPPED
Constant: uc_property_t UC_PROPERTY_SOFT_DOTTED

The following properties are related to identifiers.

Constant: uc_property_t UC_PROPERTY_ID_START
Constant: uc_property_t UC_PROPERTY_OTHER_ID_START
Constant: uc_property_t UC_PROPERTY_ID_CONTINUE
Constant: uc_property_t UC_PROPERTY_OTHER_ID_CONTINUE
Constant: uc_property_t UC_PROPERTY_XID_START
Constant: uc_property_t UC_PROPERTY_XID_CONTINUE
Constant: uc_property_t UC_PROPERTY_ID_COMPAT_MATH_START
Constant: uc_property_t UC_PROPERTY_ID_COMPAT_MATH_CONTINUE
Constant: uc_property_t UC_PROPERTY_PATTERN_WHITE_SPACE
Constant: uc_property_t UC_PROPERTY_PATTERN_SYNTAX

The following properties have an influence on shaping and rendering.

Constant: uc_property_t UC_PROPERTY_JOIN_CONTROL
Constant: uc_property_t UC_PROPERTY_GRAPHEME_BASE
Constant: uc_property_t UC_PROPERTY_GRAPHEME_EXTEND
Constant: uc_property_t UC_PROPERTY_OTHER_GRAPHEME_EXTEND
Constant: uc_property_t UC_PROPERTY_MODIFIER_COMBINING_MARK

The following properties relate to bidirectional reordering.

Constant: uc_property_t UC_PROPERTY_BIDI_CONTROL
Constant: uc_property_t UC_PROPERTY_BIDI_LEFT_TO_RIGHT
Constant: uc_property_t UC_PROPERTY_BIDI_HEBREW_RIGHT_TO_LEFT
Constant: uc_property_t UC_PROPERTY_BIDI_ARABIC_RIGHT_TO_LEFT
Constant: uc_property_t UC_PROPERTY_BIDI_EUROPEAN_DIGIT
Constant: uc_property_t UC_PROPERTY_BIDI_EUR_NUM_SEPARATOR
Constant: uc_property_t UC_PROPERTY_BIDI_EUR_NUM_TERMINATOR
Constant: uc_property_t UC_PROPERTY_BIDI_ARABIC_DIGIT
Constant: uc_property_t UC_PROPERTY_BIDI_COMMON_SEPARATOR
Constant: uc_property_t UC_PROPERTY_BIDI_BLOCK_SEPARATOR
Constant: uc_property_t UC_PROPERTY_BIDI_SEGMENT_SEPARATOR
Constant: uc_property_t UC_PROPERTY_BIDI_WHITESPACE
Constant: uc_property_t UC_PROPERTY_BIDI_NON_SPACING_MARK
Constant: uc_property_t UC_PROPERTY_BIDI_BOUNDARY_NEUTRAL
Constant: uc_property_t UC_PROPERTY_BIDI_PDF
Constant: uc_property_t UC_PROPERTY_BIDI_EMBEDDING_OR_OVERRIDE
Constant: uc_property_t UC_PROPERTY_BIDI_OTHER_NEUTRAL

The following properties deal with number representations.

Constant: uc_property_t UC_PROPERTY_HEX_DIGIT
Constant: uc_property_t UC_PROPERTY_ASCII_HEX_DIGIT

The following properties deal with CJK.

Constant: uc_property_t UC_PROPERTY_IDEOGRAPHIC
Constant: uc_property_t UC_PROPERTY_UNIFIED_IDEOGRAPH
Constant: uc_property_t UC_PROPERTY_RADICAL
Constant: uc_property_t UC_PROPERTY_IDS_UNARY_OPERATOR
Constant: uc_property_t UC_PROPERTY_IDS_BINARY_OPERATOR
Constant: uc_property_t UC_PROPERTY_IDS_TRINARY_OPERATOR

The following properties deal with pictographic symbols.

Constant: uc_property_t UC_PROPERTY_EMOJI
Constant: uc_property_t UC_PROPERTY_EMOJI_PRESENTATION
Constant: uc_property_t UC_PROPERTY_EMOJI_MODIFIER
Constant: uc_property_t UC_PROPERTY_EMOJI_MODIFIER_BASE
Constant: uc_property_t UC_PROPERTY_EMOJI_COMPONENT
Constant: uc_property_t UC_PROPERTY_EXTENDED_PICTOGRAPHIC

Other miscellaneous properties are:

Constant: uc_property_t UC_PROPERTY_ZERO_WIDTH
Constant: uc_property_t UC_PROPERTY_SPACE
Constant: uc_property_t UC_PROPERTY_NON_BREAK
Constant: uc_property_t UC_PROPERTY_ISO_CONTROL
Constant: uc_property_t UC_PROPERTY_FORMAT_CONTROL
Constant: uc_property_t UC_PROPERTY_PREPENDED_CONCATENATION_MARK
Constant: uc_property_t UC_PROPERTY_DASH
Constant: uc_property_t UC_PROPERTY_HYPHEN
Constant: uc_property_t UC_PROPERTY_PUNCTUATION
Constant: uc_property_t UC_PROPERTY_LINE_SEPARATOR
Constant: uc_property_t UC_PROPERTY_PARAGRAPH_SEPARATOR
Constant: uc_property_t UC_PROPERTY_QUOTATION_MARK
Constant: uc_property_t UC_PROPERTY_SENTENCE_TERMINAL
Constant: uc_property_t UC_PROPERTY_TERMINAL_PUNCTUATION
Constant: uc_property_t UC_PROPERTY_CURRENCY_SYMBOL
Constant: uc_property_t UC_PROPERTY_MATH
Constant: uc_property_t UC_PROPERTY_OTHER_MATH
Constant: uc_property_t UC_PROPERTY_PAIRED_PUNCTUATION
Constant: uc_property_t UC_PROPERTY_LEFT_OF_PAIR
Constant: uc_property_t UC_PROPERTY_COMBINING
Constant: uc_property_t UC_PROPERTY_COMPOSITE
Constant: uc_property_t UC_PROPERTY_DECIMAL_DIGIT
Constant: uc_property_t UC_PROPERTY_NUMERIC
Constant: uc_property_t UC_PROPERTY_DIACRITIC
Constant: uc_property_t UC_PROPERTY_EXTENDER
Constant: uc_property_t UC_PROPERTY_IGNORABLE_CONTROL
Constant: uc_property_t UC_PROPERTY_REGIONAL_INDICATOR

The following function looks up a property by its name.

Function: uc_property_t uc_property_byname (const char *property_name)

Returns the property with the given name, e.g. "White space". If a property with that name exists, the result satisfies the uc_property_is_valid predicate. Otherwise the result does not satisfy this predicate and must not be passed to functions that expect a uc_property_t argument.

This lookup is case-insensitive, ignores spaces, underscores, and hyphens used as word separators, and supports the aliases listed in Unicode’s PropertyAliases.txt file.

This function references a large table of all predefined properties, so using it can significantly increase the size of your application.

Function: bool uc_property_is_valid (uc_property_t property)

Returns true when the given property is valid, or false otherwise.

The following function views a property as a set of Unicode characters.

Function: bool uc_is_property (ucs4_t uc, uc_property_t property)

Tests whether the Unicode character uc has the given property.

