The format of the data record varies depending on the value of
compressed in the file header record:
When the data is not compressed, it is arranged as a series of 8-byte elements, one per
variable in the variables record (see Record 1 Variables Record). Numeric values are given in
flt64 format; string
values are literal character strings, padded on the right with spaces
when necessary to fill out 8-byte units.
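The uncompressed layout can be sketched as follows. This is a minimal illustration, not a reference reader: it assumes that flt64 means a little-endian IEEE 754 double, and the function name `read_uncompressed_case` and the `is_string` list are hypothetical names introduced only for this example.

```python
import struct

def read_uncompressed_case(data, offset, is_string):
    """Decode one case of 8-byte elements starting at `offset`.

    `is_string[i]` says whether element i holds string data (hypothetical
    input; a real reader would derive it from the variables record).
    """
    values = []
    for i, string_element in enumerate(is_string):
        chunk = data[offset + 8 * i : offset + 8 * (i + 1)]
        if len(chunk) < 8:
            raise ValueError("truncated case")
        if string_element:
            # Strings are stored literally, space-padded to 8 bytes.
            values.append(chunk)
        else:
            # Assumption: flt64 is a little-endian IEEE 754 double.
            values.append(struct.unpack("<d", chunk)[0])
    return values, offset + 8 * len(is_string)

# One case with a numeric element followed by a string element.
case = struct.pack("<d", 1.5) + b"abc     "
values, next_offset = read_uncompressed_case(case, 0, [False, True])
```

String elements are returned as raw bytes so that the caller can decide how to strip the padding and which character encoding to apply.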
When the data is compressed, the first 8 bytes of the data record are divided into a series of 1-byte command codes. These codes have meanings as described below:
The system-missing value.
A numeric or string value that is not compressible (code 1). The value is stored in the 8 bytes following the current block of command bytes. If this code appears more than once in a block of command bytes, the second occurrence refers to the second group of 8 bytes following the command bytes, and so on.
A number with value code - 100, where code is the value of the compression code. For example, code 105 indicates a numeric value of 5.
The 8-byte group of bytecodes is followed by the 8-byte blocks of non-compressible values, if any, indicated by code 1. After that follows another 8-byte group of bytecodes, then those bytecodes' non-compressible values. The pattern repeats until the number of cases specified by the main header record has been read.
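The interleaving of command-code groups and non-compressible values might be decoded roughly as shown here. This is a sketch under stated assumptions: the value 0 used for the system-missing code is an assumption made for illustration, every code other than 0 and 1 is treated as code - 100 (including the unobserved codes 2 through 95), and the names `decompress_elements`, `SYSMIS_CODE`, `RAW_CODE`, and `BIAS` are hypothetical.

```python
SYSMIS_CODE = 0   # assumption: command code for the system-missing value
RAW_CODE = 1      # non-compressible value stored after the command bytes
BIAS = 100        # compressed numbers are stored as code - 100

def decompress_elements(data, n_elements):
    """Decode `n_elements` elements from bytecode-compressed `data`.

    Returns None for system-missing, raw 8-byte `bytes` for
    non-compressible values, and a float for compressed numbers.
    """
    out = []
    pos = 0
    while len(out) < n_elements:
        codes = data[pos:pos + 8]
        if len(codes) < 8:
            raise ValueError("truncated block of command codes")
        pos += 8
        for code in codes:
            if len(out) == n_elements:
                break  # remaining codes in this group are not data
            if code == SYSMIS_CODE:
                out.append(None)
            elif code == RAW_CODE:
                # The n-th code 1 in a group refers to the n-th 8-byte
                # block that follows the group.
                raw = data[pos:pos + 8]
                if len(raw) < 8:
                    raise ValueError("truncated non-compressible value")
                out.append(raw)
                pos += 8
            else:
                out.append(float(code - BIAS))
    return out

# One group of 8 command codes followed by a single raw 8-byte value.
record = bytes([105, 1, 0, 101, 0, 0, 0, 0]) + b"abc     "
elements = decompress_elements(record, 4)
```

Non-compressible values are left as raw bytes because the command code alone does not say whether they hold a number or a string; that depends on the variable the element belongs to.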
The corpus does not contain any files with command codes 2 through 95, so it is possible that some of these codes are used for special purposes.
Cases of data often, but not always, fill the entire data record. Readers should stop reading after the number of cases specified in the main header record. Otherwise, readers may try to interpret garbage following the data as additional cases.
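One way to honor the case count is to group decoded elements into cases and stop as soon as the expected number has been collected. The sketch below assumes the elements have already been decoded into a flat sequence; the names `take_cases`, `n_vars`, and `n_cases` are hypothetical, with `n_cases` standing in for the count from the main header record.

```python
from itertools import islice

def take_cases(elements, n_vars, n_cases):
    """Group a flat stream of elements into exactly `n_cases` cases.

    Anything left in `elements` after the last case is ignored, since
    it may be garbage rather than data.
    """
    it = iter(elements)
    cases = []
    for _ in range(n_cases):
        case = list(islice(it, n_vars))
        if len(case) < n_vars:
            raise ValueError("data record ended before all cases were read")
        cases.append(case)
    return cases

# Two real cases of two variables each, followed by trailing garbage.
stream = [1.0, 2.0, 3.0, 4.0, 99.0, 98.0]
cases = take_cases(stream, n_vars=2, n_cases=2)
```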