2.2.13 Value

Value is used throughout the SPV light member format. It boils down to a number or a string.

Value => 00? 00? 00? 00? RawValue
RawValue =>
    01 ValueMod int32[format] double[x]
  | 02 ValueMod int32[format] double[x]
    string[var-name] string[value-label] byte[show]
  | 03 string[local] ValueMod string[id] string[c] bool[fixed]
  | 04 ValueMod int32[format] string[value-label] string[var-name]
    byte[show] string[s]
  | 05 ValueMod string[var-name] string[var-label] byte[show]
  | 06 string[local] ValueMod string[id] string[c]
  | ValueMod string[template] int32[n-args] Argument*[n-args]
Argument =>
    i0 Value
  | int32[x] i0 Value*[x]      /* x > 0 */

There are several possible encodings, which one can distinguish by the first nonzero byte in the encoding.
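As a rough illustration of that dispatch, the sketch below skips the optional 00 bytes and classifies the encoding by the byte that follows. The function name classify_value and the kind labels are hypothetical, chosen only for this example; the byte values come from the grammar above.

```python
def classify_value(data: bytes, offset: int = 0):
    """Return (kind, offset of the RawValue) for the Value at `offset`.

    The kind labels are illustrative names, not part of the format.
    """
    # The grammar allows at most four optional 00 bytes before RawValue.
    skipped = 0
    while skipped < 4 and offset < len(data) and data[offset] == 0x00:
        offset += 1
        skipped += 1
    if offset >= len(data):
        raise ValueError("truncated Value")

    kinds = {
        0x01: "number",
        0x02: "number-of-variable",
        0x03: "text",
        0x04: "string",
        0x05: "variable",
        0x06: "text-fixed",
    }
    tag = data[offset]
    if tag in kinds:
        return kinds[tag], offset
    # Any other encoding starts with a ValueMod, which begins with 31 or 58.
    if tag in (0x31, 0x58):
        return "template", offset
    raise ValueError(f"unexpected first byte {tag:#04x}")
```

For a template, the returned offset points at the ValueMod rather than at a type byte, since that encoding has no type byte of its own.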

01

The numeric value x, intended to be presented to the user formatted according to format, which is about the same as the format described for system files (see System File Output Formats). The exception is that format 40 is not MTIME but instead approximately a synonym for F format with a different rule for whether a value is shown in scientific notation: a value in format 40 is shown in scientific notation if and only if it is nonzero and its magnitude is less than small (see Formats).

Most commonly, format has width 40 (the maximum).
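The sketch below unpacks a format field into its parts. It assumes the packing used for system file formats, which this section does not restate: decimals in the low byte, width in the next byte, and the format type above that. The name decode_format is hypothetical.

```python
def decode_format(raw: int):
    """Split a packed format into (type, width, decimals).

    Assumed layout, as for system file formats:
    bits 0-7 decimals, bits 8-15 width, bits 16 and up the format type.
    """
    decimals = raw & 0xFF
    width = (raw >> 8) & 0xFF
    fmt_type = raw >> 16
    return fmt_type, width, decimals


# A format with type code 5, width 40, and 2 decimals packs to 0x052802.
print(decode_format(0x052802))
```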

An x with the maximum negative double value -DBL_MAX represents the system-missing value SYSMIS. (HIGHEST and LOWEST have not been observed.) See System File Format, for more about these special values.
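A reader might detect the system-missing value as in the sketch below. It assumes that doubles are stored little-endian as 8-byte IEEE 754 values, which this section does not itself state; decode_double is a hypothetical helper that returns None for SYSMIS.

```python
import struct
import sys

# -DBL_MAX: the most negative finite double.
SYSMIS = -sys.float_info.max


def decode_double(raw: bytes):
    """Decode an 8-byte double, mapping -DBL_MAX to None (system-missing)."""
    (x,) = struct.unpack("<d", raw)  # assumption: little-endian storage
    return None if x == SYSMIS else x
```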

02

Similar to 01, with the additional information that x is a value of variable var-name and has value label value-label. Both var-name and value-label can be the empty string, the latter very commonly.

show determines whether to show the numeric value or the value label. A value of 1 means to show the value, 2 to show the label, 3 to show both, and 0 means to use the default specified in show-values (see Formats).
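The rule for show can be sketched as follows. The function resolve_show and its default parameter, which stands in for the show-values setting, are hypothetical names for this example.

```python
def resolve_show(show: int, default: int):
    """Return (show_value, show_label) for a show byte.

    `default` plays the role of show-values and is used when show is 0.
    """
    effective = default if show == 0 else show
    if effective not in (1, 2, 3):
        raise ValueError(f"unexpected show value {effective}")
    # 1 = value only, 2 = label only, 3 = both.
    return effective in (1, 3), effective in (2, 3)
```

The same mapping applies to variables in encoding 05, with the name in place of the value and show-variables as the default.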

03

A text string, in two forms: c is in English, and sometimes abbreviated or obscure, and local is localized to the user’s locale. In an English-language locale, the two strings are often the same, and in the cases where they differ, local is more appropriate for a user interface, e.g. c of “Not a PxP table for MCN...” versus local of “Computed only for a PxP table, where P must be greater than 1.”

c and local are always either both empty or both nonempty.

id is a brief identifying string whose form seems to resemble a programming language identifier, e.g. cumulative_percent or factor_14. It is not unique.

fixed is 00 for text taken from user input, such as syntax fragments, expressions, file names, and data set names, and 01 for fixed text strings such as names of procedures or statistics. In the former case, id is always the empty string; in the latter case, id is still sometimes empty.

04

The string value s, intended to be presented to the user formatted according to format. The format for a string is not too interesting, and the corpus contains many clearly invalid formats like A16.39 or A255.127 or A134.1, so readers should probably disregard the format entirely. PSPP checks format only to distinguish AHEX format.

s is a value of variable var-name and has value label value-label. var-name is never empty but value-label is commonly empty.

show has the same meaning as in the encoding for 02.

05

Variable var-name with variable label var-label. In the corpus, var-name is rarely empty and var-label is often empty.

show determines whether to show the variable name or the variable label. A value of 1 means to show the name, 2 to show the label, 3 to show both, and 0 means to use the default specified in show-variables (see Formats).

06

Similar to type 03, with fixed assumed to be true.

otherwise

When the first byte of a RawValue is not one of the above, the RawValue starts with a ValueMod, whose syntax is described in the next section. (A ValueMod always begins with byte 31 or 58.)

This case is a template string, analogous to printf, followed by one or more Arguments, each of which has one or more values. The template string is copied directly into the output except for the following special syntax:

\%
\:
\[
\]

Each of these expands to the character following ‘\’, to escape characters that have special meaning in template strings. These are effective inside and outside the […] syntax forms described below.

\n

Expands to a new-line, inside or outside the […] forms described below.

^i

Expands to a formatted version of argument i, which must have only a single value. For example, ^1 expands to the first argument’s value.

[:a:]i

Expands a for each of the values in argument i. a should contain one or more ^j conversions, which are drawn from the values for argument i in order. Some examples from the corpus:

[:^1:]1

All of the values for the first argument, concatenated.

[:^1\n:]1

Expands to the values for the first argument, each followed by a new-line.

[:^1 = ^2:]2

Expands to x = y where x is the second argument’s first value and y is its second value. (This would be used only if the argument has two values. If there were more values, the second and third values would be directly concatenated, which would look funny.)

[a:b:]i

This extends the previous form so that the first values are expanded using a and later values are expanded using b. For an unknown reason, within a the ^j conversions are instead written as %j. Some examples from the corpus:

[%1:*^1:]1

Expands to all of the values for the first argument, separated by ‘*’.

[%1 = %2:, ^1 = ^2:]1

Given appropriate values for the first argument, expands to X = 1, Y = 2, Z = 3.

[%1:, ^1:]1

Given appropriate values, expands to 1, 2, 3.

The template string is localized to the user’s locale.
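The template rules above can be sketched as a small expander. This is one reading of the description, not PSPP's implementation. It assumes that every value has already been formatted as a string and that args is a list of per-argument value lists; the names expand_template, _split_inner, _read_int, and _expand_inner are hypothetical.

```python
def _read_int(s, pos):
    """Read decimal digits at pos; return (number, new pos), 0 if none."""
    start = pos
    while pos < len(s) and s[pos] in "0123456789":
        pos += 1
    return (int(s[start:pos]) if pos > start else 0), pos


def _split_inner(s, pos):
    """Return (text up to the next unescaped ':', position after it)."""
    start = pos
    while pos < len(s):
        if s[pos] == "\\" and pos + 1 < len(s):
            pos += 2
        elif s[pos] == ":":
            return s[start:pos], pos + 1
        else:
            pos += 1
    return s[start:pos], pos


def _expand_inner(segment, marker, values):
    """Expand one a or b segment; return (text, number of values used)."""
    out, used, pos = [], 0, 0
    while pos < len(segment):
        ch = segment[pos]
        if ch == "\\" and pos + 1 < len(segment):
            out.append("\n" if segment[pos + 1] == "n" else segment[pos + 1])
            pos += 2
        elif ch == marker:
            j, pos = _read_int(segment, pos + 1)
            if 1 <= j <= len(values):
                out.append(values[j - 1])
                used = max(used, j)
        else:
            out.append(ch)
            pos += 1
    return "".join(out), used


def expand_template(template, args):
    """Expand a template; args[i - 1] holds the strings for argument i."""
    out, pos = [], 0
    while pos < len(template):
        ch = template[pos]
        if ch == "\\" and pos + 1 < len(template):
            out.append("\n" if template[pos + 1] == "n" else template[pos + 1])
            pos += 2
        elif ch == "^":
            i, pos = _read_int(template, pos + 1)
            if 1 <= i <= len(args) and args[i - 1]:
                out.append(args[i - 1][0])
        elif ch == "[":
            first, pos = _split_inner(template, pos + 1)  # a (may be empty)
            rest, pos = _split_inner(template, pos)       # b
            if pos < len(template) and template[pos] == "]":
                pos += 1
            i, pos = _read_int(template, pos)
            values = list(args[i - 1]) if 1 <= i <= len(args) else []
            is_first = True
            while values:
                if is_first and first:
                    text, used = _expand_inner(first, "%", values)
                else:
                    text, used = _expand_inner(rest, "^", values)
                is_first = False
                out.append(text)
                if used == 0:  # no conversions: stop rather than loop forever
                    break
                values = values[used:]
        else:
            out.append(ch)
            pos += 1
    return "".join(out)
```

With this sketch, expand_template("[%1 = %2:, ^1 = ^2:]1", [["X", "1", "Y", "2", "Z", "3"]]) yields X = 1, Y = 2, Z = 3, matching the corpus example above.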

A writer may safely omit all of the optional 00 bytes at the beginning of a Value, except that it should write a single 00 byte before a templated Value.