2.2.10 Categories

Categories are arranged in a tree. Only the leaf nodes in the tree are really categories; the others just serve as grouping constructs.

Category => Value[name] (Leaf | Group)
Leaf => 00 00 00 i2 int32[leaf-index] i0
Group =>
    bool[merge] 00 01 int32[x23]
    i-1 int32[n-subcategories] Category*[n-subcategories]

name is the name of the category (or group).

A Leaf represents a leaf category. The Leaf’s leaf-index is a nonnegative integer unique within the Dimension and less than n-categories in the Dimension. If the user does not sort or rearrange the categories, then leaf-index starts at 0 for the first Leaf in the dimension and increments by 1 with each successive Leaf. If the user does sort or rearrange the categories, then the order of categories in the file reflects that change and leaf-index reflects the original order.
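The leaf-index rules above can be checked, and the original order recovered, with a short Python sketch. It is illustrative only: the dict layout (keys "name", "leaf_index", "children") and the function names are hypothetical, and n_categories stands for the Dimension's n-categories value, which the caller has to supply.

```python
def leaves_in_file_order(categories):
    """Yield leaf dicts in the order they appear in the file (depth first)."""
    for cat in categories:
        if "children" in cat:          # hypothetical marker for a Group
            yield from leaves_in_file_order(cat["children"])
        else:
            yield cat


def check_leaf_indexes(categories, n_categories):
    """Raise ValueError if a leaf-index is out of range or repeated."""
    seen = set()
    for leaf in leaves_in_file_order(categories):
        index = leaf["leaf_index"]
        if not 0 <= index < n_categories:
            raise ValueError(f"leaf-index {index} out of range")
        if index in seen:
            raise ValueError(f"leaf-index {index} is not unique")
        seen.add(index)


def original_order(categories):
    """Return the leaves sorted back into their pre-rearrangement order."""
    return sorted(leaves_in_file_order(categories),
                  key=lambda leaf: leaf["leaf_index"])


# A dimension whose categories the user sorted: file order is c, a, b.
dimension = [
    {"name": "c", "leaf_index": 2},
    {"name": "group", "children": [
        {"name": "a", "leaf_index": 0},
        {"name": "b", "leaf_index": 1},
    ]},
]
check_leaf_indexes(dimension, n_categories=3)
file_names = [leaf["name"] for leaf in leaves_in_file_order(dimension)]
original_names = [leaf["name"] for leaf in original_order(dimension)]
```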

A dimension can have no leaf categories at all. A table that contains such a dimension necessarily has no data at all.

A Group is a group of nested categories. Usually a Group contains at least one Category, so that n-subcategories is positive, but Groups with zero subcategories have been observed.
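As a rough illustration of the grammar, the following Python sketch reads one Category. It rests on several assumptions: int32 is taken to be a 32-bit little-endian signed integer, iN to be an int32 with the fixed value N, and parsing of Value is left to a caller-supplied function because its encoding is not described in this section. The names parse_category and toy_value, the dict layout, and the one-byte-length name encoding in the example are all hypothetical stand-ins, not the real format.

```python
import struct


def read_int32(buf, pos):
    # Assumption: int32 is little-endian and signed.
    return struct.unpack_from("<i", buf, pos)[0], pos + 4


def expect_int32(buf, pos, want):
    got, pos = read_int32(buf, pos)
    if got != want:
        raise ValueError(f"expected i{want} at offset {pos - 4}, got {got}")
    return pos


def parse_category(buf, pos, parse_value):
    """Parse one Category at buf[pos]; return (node, position after it)."""
    name, pos = parse_value(buf, pos)
    first, second, third = buf[pos], buf[pos + 1], buf[pos + 2]
    pos += 3
    if second != 0 or first not in (0, 1):
        raise ValueError("unexpected Category prefix bytes")
    if third == 0:
        # Leaf => 00 00 00 i2 int32[leaf-index] i0
        if first != 0:
            raise ValueError("Leaf must start with 00")
        pos = expect_int32(buf, pos, 2)
        leaf_index, pos = read_int32(buf, pos)
        pos = expect_int32(buf, pos, 0)
        return {"name": name, "leaf_index": leaf_index}, pos
    if third != 1:
        raise ValueError("unexpected Category prefix bytes")
    # Group => bool[merge] 00 01 int32[x23] i-1 int32[n-subcategories] ...
    x23, pos = read_int32(buf, pos)
    pos = expect_int32(buf, pos, -1)
    n_subcategories, pos = read_int32(buf, pos)
    children = []
    for _ in range(n_subcategories):
        child, pos = parse_category(buf, pos, parse_value)
        children.append(child)
    node = {"name": name, "merge": bool(first), "x23": x23,
            "children": children}
    return node, pos


def toy_value(buf, pos):
    # Hypothetical stand-in for Value: one length byte, then ASCII text.
    length = buf[pos]
    text = buf[pos + 1:pos + 1 + length].decode("ascii")
    return text, pos + 1 + length


# A Group "G" (merge 00, x23 2) holding leaves "a" and "b".
data = (b"\x01G" + b"\x00\x00\x01" + struct.pack("<iii", 2, -1, 2)
        + b"\x01a" + b"\x00\x00\x00" + struct.pack("<iii", 2, 0, 0)
        + b"\x01b" + b"\x00\x00\x00" + struct.pack("<iii", 2, 1, 0))
node, end = parse_category(data, 0, toy_value)
```

The third prefix byte is enough to tell the two alternatives apart in this sketch, since a Leaf has 00 there and a Group has 01. A Group with n-subcategories equal to zero simply yields an empty list of children.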

If a Group’s merge is 00, the most common value, then the group is really a distinct group that should be represented as such in the visual representation and user interface. If merge is 01, the categories in this group should be shown and treated as if they were direct children of the group’s containing group (or if it has no parent group, then direct children of the dimension), and this group’s name is irrelevant and should not be displayed. (Merged groups can be nested!)
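One way a reader might apply the merge rule is to splice merged groups into their parents after parsing. The sketch below assumes a hypothetical in-memory form in which each category is a dict, with groups carrying "merge" and "children" keys; it is not taken from any existing implementation.

```python
def splice_merged(categories):
    """Return categories with every merged group replaced by its children.

    Nested merged groups are handled by recursing before splicing, so
    their children also rise to the nearest non-merged ancestor.
    """
    result = []
    for cat in categories:
        if "children" not in cat:      # a leaf: keep as is
            result.append(cat)
            continue
        children = splice_merged(cat["children"])
        if cat["merge"]:
            # Merged group: drop the group and its name, keep its children.
            result.extend(children)
        else:
            result.append({**cat, "children": children})
    return result


# Top-level categories of a dimension, with nested merged groups.
dimension = [
    {"name": "outer", "merge": True, "children": [
        {"name": "a", "leaf_index": 0},
        {"name": "inner", "merge": True, "children": [
            {"name": "b", "leaf_index": 1},
        ]},
        {"name": "real", "merge": False, "children": [
            {"name": "c", "leaf_index": 2},
        ]},
    ]},
    {"name": "d", "leaf_index": 3},
]
shown = splice_merged(dimension)
shown_names = [cat["name"] for cat in shown]
```

Here "outer" and "inner" disappear, their leaves become direct children of the dimension, and the non-merged group "real" is kept with its own child.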

Writers need not use merged groups.

A Group’s x23 appears to be i2 when all of the categories within a group are leaf categories that directly represent data values for a variable (e.g. in a frequency table or crosstabulation, a group of values in a variable being tabulated) and i0 otherwise. A writer may safely write a constant 0 in this field.
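A writer that follows the two notes above (no merged groups, constant 0 for x23) could look roughly like this Python sketch. It assumes int32 is little-endian and that iN is an int32 with the fixed value N. Encoding of Value is delegated to a caller-supplied function, and the toy encoder shown is purely hypothetical, as are the function names and the dict layout.

```python
import struct


def write_category(cat, write_value):
    """Serialize one category dict (and any subcategories) to bytes."""
    out = bytearray(write_value(cat["name"]))
    if "children" in cat:
        # Group: bool[merge] = 00 (never merged), then 00 01.
        out += b"\x00\x00\x01"
        # int32[x23] = 0, i-1, int32[n-subcategories].
        out += struct.pack("<iii", 0, -1, len(cat["children"]))
        for child in cat["children"]:
            out += write_category(child, write_value)
    else:
        # Leaf: 00 00 00 i2 int32[leaf-index] i0.
        out += b"\x00\x00\x00"
        out += struct.pack("<iii", 2, cat["leaf_index"], 0)
    return bytes(out)


def toy_write_value(text):
    # Hypothetical stand-in for Value: one length byte, then ASCII text.
    raw = text.encode("ascii")
    return bytes([len(raw)]) + raw


group = {"name": "G", "children": [{"name": "a", "leaf_index": 0}]}
encoded = write_category(group, toy_write_value)
```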