The Unicode standard defines four normalization forms for Unicode strings. The following type is used to denote a normalization form.
An object of type uninorm_t denotes a Unicode normalization form. This is a scalar type; its values can be compared with ==.
The following constants denote the four normalization forms.
Normalization form D: canonical decomposition.
Normalization form C: canonical decomposition, then canonical composition.
Normalization form KD: compatibility decomposition.
Normalization form KC: compatibility decomposition, then canonical composition.
The following functions operate on uninorm_t objects.
Tests whether the normalization form nf does compatibility decomposition.
Tests whether the normalization form nf includes canonical composition.
Returns the decomposing variant of the normalization form nf. This maps NFC, NFD → NFD and NFKC, NFKD → NFKD.
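The behaviour of the three functions above can be illustrated with a small self-contained model. This is only a sketch: the type nf_model_t, its constants, and the nf_model_* functions are hypothetical names invented for illustration, and the two-bit encoding is an assumption that says nothing about how uninorm_t is actually represented.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for uninorm_t (illustration only).
   Bit 0 set: the form ends with canonical composition.
   Bit 1 set: the form uses compatibility decomposition. */
typedef enum {
    NF_D  = 0,  /* canonical decomposition */
    NF_C  = 1,  /* canonical decomposition, then canonical composition */
    NF_KD = 2,  /* compatibility decomposition */
    NF_KC = 3   /* compatibility decomposition, then canonical composition */
} nf_model_t;

/* True for KD and KC. */
static bool nf_model_is_compat_decomposing(nf_model_t nf)
{
    return (nf & 2) != 0;
}

/* True for C and KC. */
static bool nf_model_is_composing(nf_model_t nf)
{
    return (nf & 1) != 0;
}

/* Drops the composition step: C, D -> D and KC, KD -> KD. */
static nf_model_t nf_model_decomposing_form(nf_model_t nf)
{
    return (nf_model_t)(nf & 2);
}

/* Checks the mapping stated in the text for all four forms. */
void nf_model_self_check(void)
{
    assert(nf_model_decomposing_form(NF_C) == NF_D);
    assert(nf_model_decomposing_form(NF_D) == NF_D);
    assert(nf_model_decomposing_form(NF_KC) == NF_KD);
    assert(nf_model_decomposing_form(NF_KD) == NF_KD);
    assert(nf_model_is_composing(NF_C) && nf_model_is_composing(NF_KC));
    assert(nf_model_is_compat_decomposing(NF_KD) && nf_model_is_compat_decomposing(NF_KC));
}
```

In this model the two properties are independent, which is why the decomposing variant is obtained by clearing only the composition bit.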
The following functions apply a Unicode normalization form to a Unicode string.
Returns the specified normalization form of a string.