40.20.1 Describing Data Layout

To control unpacking and packing, you write a data layout specification, also called a Bindat type expression. This can be a base type or a composite type made of several fields, where the specification controls the length of each field to be processed, and how to pack or unpack it. We normally keep Bindat type values in variables whose names end in -bindat-spec; that kind of name is automatically recognized as risky (see File Local Variables).

Macro: bindat-type &rest type

Creates a Bindat type value object according to the Bindat type expression type.

A field’s type describes the size (in bytes) of the object that the field represents and, in the case of multibyte fields, how the bytes are ordered within the field. The two possible orderings are big endian (also known as “network byte ordering”) and little endian. For instance, the number #x23cd (decimal 9165) in big endian would be the two bytes #x23 #xcd; and in little endian, #xcd #x23. Here are the possible type values:
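As a small sketch of the two orderings, the number above can be packed both ways. This assumes an Emacs version in which uint accepts the optional le argument described below; append is used only to display the resulting unibyte string as a list of byte values.

```elisp
(require 'bindat)

;; Big endian (network byte order): most significant byte first.
(append (bindat-pack (bindat-type uint 16) #x23cd) nil)
;; ⇒ (35 205), that is, #x23 #xcd

;; Little endian: least significant byte first.
(append (bindat-pack (bindat-type uint 16 t) #x23cd) nil)
;; ⇒ (205 35), that is, #xcd #x23
```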

u8
byte

Unsigned byte, with length 1.

uint bitlen &optional le

Unsigned integer in network byte order (big-endian), with bitlen bits. bitlen has to be a multiple of 8. If le is non-nil, then use little-endian byte order.

sint bitlen le

Signed integer in network byte order (big-endian), with bitlen bits. bitlen has to be a multiple of 8. If le is non-nil, then use little-endian byte order.
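The difference between uint and sint can be illustrated by unpacking the same two bytes in several ways. This is a minimal sketch; the byte values are arbitrary and chosen only for illustration.

```elisp
(require 'bindat)

;; The bytes #xff #xfe read as an unsigned big-endian 16-bit integer.
(bindat-unpack (bindat-type uint 16) (unibyte-string #xff #xfe))
;; ⇒ 65534

;; The same bytes read as a signed big-endian integer.
(bindat-unpack (bindat-type sint 16 nil) (unibyte-string #xff #xfe))
;; ⇒ -2

;; With a non-nil le, the first byte is the least significant one.
(bindat-unpack (bindat-type sint 16 t) (unibyte-string #xff #xfe))
;; ⇒ -257
```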

str len

Unibyte string (see Text Representations) of length len bytes. When packing, the first len bytes of the input string are copied to the packed output. If the input string is shorter than len, the remaining bytes will be null (zero) unless a pre-allocated string was provided to bindat-pack, in which case the remaining bytes are left unmodified. If the input string is multibyte with only ASCII and eight-bit characters, it is converted to unibyte before it is packed; other multibyte strings signal an error. When unpacking, any null bytes in the packed input string will appear in the unpacked output.
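The padding and truncation rules for str can be sketched as follows, assuming no pre-allocated string is passed to bindat-pack. The results are shown as lists of byte values so that the null bytes are visible.

```elisp
(require 'bindat)

;; An input shorter than len is padded with null bytes.
(append (bindat-pack (bindat-type str 4) "ab") nil)
;; ⇒ (97 98 0 0)

;; Only the first len bytes of a longer input are copied.
(append (bindat-pack (bindat-type str 4) "abcdef") nil)
;; ⇒ (97 98 99 100)

;; Unpacking keeps embedded and trailing null bytes.
(append (bindat-unpack (bindat-type str 4) (unibyte-string 97 0 98 0)) nil)
;; ⇒ (97 0 98 0)
```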

strz &optional len

If len is not provided, this is a variable-length null-terminated unibyte string (see Text Representations). When packing into strz, the entire input string is copied to the packed output followed by a null (zero) byte. (If a pre-allocated string is provided for packing into strz, that pre-allocated string should have enough space for the additional null byte appended to the output string contents; see Functions to Unpack and Pack Bytes.) The length of the packed output is the length of the input string plus one (for the null terminator). The input string must not contain any null bytes. If the input string is multibyte with only ASCII and eight-bit characters, it is converted to unibyte before it is packed; other multibyte strings signal an error. When unpacking a strz, the resulting output string will contain all bytes up to (but excluding) the null byte that terminated the input string.

If len is provided, strz behaves the same as str, but with a couple of differences:

  • When packing, a null terminator is written after the packed input string if the number of bytes in the input string is less than len.
  • When unpacking, the first null byte encountered in the packed string is interpreted as the terminating byte, and it and all subsequent bytes are excluded from the result of the unpacking.

Caution: The packed output will not be null-terminated unless the input string is shorter than len bytes or it contains a null byte within the first len bytes.
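The following sketch illustrates both forms of strz, including the unterminated case the caution above warns about. It assumes the fixed-length semantics documented here and no pre-allocated string.

```elisp
(require 'bindat)

;; Variable-length strz: a null terminator is appended when packing ...
(append (bindat-pack (bindat-type strz) "abc") nil)
;; ⇒ (97 98 99 0)

;; ... and unpacking stops at the first null byte.
(bindat-unpack (bindat-type strz) (unibyte-string 97 98 99 0 100))
;; ⇒ "abc"

;; Fixed-length strz: the input is shorter than len, so it is terminated.
(append (bindat-pack (bindat-type strz 4) "ab") nil)
;; ⇒ (97 98 0 0)

;; The input fills all len bytes, so no null terminator is written.
(append (bindat-pack (bindat-type strz 4) "abcd") nil)
;; ⇒ (97 98 99 100)

;; Unpacking drops the first null byte and everything after it.
(bindat-unpack (bindat-type strz 4) (unibyte-string 97 98 0 99))
;; ⇒ "ab"
```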

vec len [type]

Vector of len elements. The type of the elements is given by type, defaulting to bytes. The type can be any Bindat type expression.

repeat len [type]

Like vec, but it unpacks to and packs from lists, whereas vec unpacks to vectors.
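As a brief sketch of the difference, the same six bytes can be unpacked with both types; the element type uint 16 is chosen only for illustration.

```elisp
(require 'bindat)

(let ((raw (unibyte-string 0 1 0 2 0 3)))
  (list
   ;; vec unpacks to a vector ...
   (bindat-unpack (bindat-type vec 3 uint 16) raw)
   ;; ... while repeat unpacks the same bytes to a list.
   (bindat-unpack (bindat-type repeat 3 uint 16) raw)))
;; ⇒ ([1 2 3] (1 2 3))

;; Without an element type, vec reads single bytes.
(bindat-unpack (bindat-type vec 3) (unibyte-string 7 8 9))
;; ⇒ [7 8 9]
```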

bits len

List of bits that are set to 1 in len bytes. The bytes are taken in big-endian order, and the bits are numbered starting with 8 * len − 1 and ending with zero. For example: bits 2 unpacks #x28 #x1c to (2 3 4 11 13) and #x1c #x28 to (3 5 10 11 12).
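The two cases above can be checked directly; the last form is a sketch of the reverse direction, packing a list of bit numbers back into bytes.

```elisp
(require 'bindat)

(bindat-unpack (bindat-type bits 2) (unibyte-string #x28 #x1c))
;; ⇒ (2 3 4 11 13)

(bindat-unpack (bindat-type bits 2) (unibyte-string #x1c #x28))
;; ⇒ (3 5 10 11 12)

;; Packing sets exactly the listed bits.
(append (bindat-pack (bindat-type bits 2) '(2 3 4 11 13)) nil)
;; ⇒ (40 28), that is, #x28 #x1c
```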

fill len

len bytes used as a mere filler. In packing, these bytes are left unchanged, which normally means they remain zero. When unpacking, this just returns nil.

align len

Same as fill except the number of bytes is that needed to skip to the next multiple of len bytes.
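A minimal sketch of align inside a composite type follows. The name my-aligned-bindat-spec and its fields are hypothetical, and no pre-allocated string is used, so the padding bytes come out as zero.

```elisp
(require 'bindat)

;; Hypothetical layout: one byte, padding up to offset 4, then one byte.
(defconst my-aligned-bindat-spec
  (bindat-type
    (a u8)
    (_ align 4)
    (b u8)))

(append (bindat-pack my-aligned-bindat-spec '((a . 1) (b . 2))) nil)
;; ⇒ (1 0 0 0 2)

;; The padding bytes are skipped when unpacking, whatever they contain.
(let ((s (bindat-unpack my-aligned-bindat-spec (unibyte-string 1 9 9 9 2))))
  (list (bindat-get-field s 'a) (bindat-get-field s 'b)))
;; ⇒ (1 2)
```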

type exp

This lets you refer to a type indirectly: exp is a Lisp expression which should return a Bindat type value.

unit exp

This is a trivial type which uses up 0 bits of space. exp describes the value returned when we try to “unpack” such a field.
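The type and unit entries can be sketched together. The spec names below are hypothetical and exist only for this illustration: one spec is reused twice through type, and a unit field supplies a constant tag that occupies no bytes.

```elisp
(require 'bindat)

;; Hypothetical specs, named only for this illustration.
(defconst my-point-bindat-spec
  (bindat-type (x u8) (y u8)))

(defconst my-segment-bindat-spec
  (bindat-type
    (kind unit 'segment)                ; occupies no bytes
    (from type my-point-bindat-spec)    ; reuse the spec above
    (to   type my-point-bindat-spec)))

(let ((s (bindat-unpack my-segment-bindat-spec (unibyte-string 1 2 3 4))))
  (list (bindat-get-field s 'kind)
        (bindat-get-field s 'from 'y)
        (bindat-get-field s 'to 'x)))
;; ⇒ (segment 2 3)
```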

struct fields...

Composite type made of several fields. Every field is of the form (name type) where type can be any Bindat type expression. name can be _ when the field’s value does not deserve to be named, as is often the case for align and fill fields. When the context makes it clear that this is a Bindat type expression, the symbol struct can be omitted.
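As a sketch of these rules, the hypothetical spec below omits struct at top level, spells it out for a nested field, and uses _ for a filler field. All names are assumptions made for the example.

```elisp
(require 'bindat)

;; The top-level struct symbol is omitted; the nested one is spelled out.
(defconst my-header-bindat-spec         ; hypothetical name
  (bindat-type
    (version u8)
    (_ fill 1)                          ; unnamed filler byte
    (size struct (width u8) (height u8))))

(let ((h (bindat-unpack my-header-bindat-spec (unibyte-string 2 0 80 24))))
  (list (bindat-get-field h 'version)
        (bindat-get-field h 'size 'width)
        (bindat-get-field h 'size 'height)))
;; ⇒ (2 80 24)
```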

In the types above, len and bitlen are given as an integer specifying the number of bytes (or bits) in the field. When the length of a field is not fixed, it typically depends on the value of preceding fields. For this reason, the length len does not have to be a constant but can be any Lisp expression and it can refer to the value of previous fields via their name.

For example, the specification of a data layout where a leading byte holds one less than the number of elements in a subsequent vector of 16-bit integers could be:

(bindat-type
  (len      u8)
  (payload  vec (1+ len) uint 16))
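
To see how the length expression is evaluated, the layout above can be bound to a variable and exercised in both directions. This is only a sketch: the variable name my-packet-bindat-spec and the sample bytes are assumptions made for the example.

```elisp
(require 'bindat)

(defconst my-packet-bindat-spec         ; hypothetical name
  (bindat-type
    (len      u8)
    (payload  vec (1+ len) uint 16)))

;; A leading 1 announces (1+ 1) = 2 big-endian 16-bit integers.
(let ((s (bindat-unpack my-packet-bindat-spec (unibyte-string 1 0 1 0 2))))
  (list (bindat-get-field s 'len) (bindat-get-field s 'payload)))
;; ⇒ (1 [1 2])

;; Packing the same structure gives back the original bytes.
(append (bindat-pack my-packet-bindat-spec
                     '((len . 1) (payload . [1 2])))
        nil)
;; ⇒ (1 0 1 0 2)
```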