A technical description of the Position ID - GNU Backgammon Manual V0.16

Next: A technical description of the Match ID, Previous: Equities explained, Up: Technical Notes

11.6 A technical description of the Position ID

This section describes a method for compactly recording a backgammon position. It demonstrates how to encode a position into 10 binary bytes, which is useful for minimizing the space used when recording large numbers of positions in memory or on disk. There is also an ASCII representation in 14 characters, which is convenient for output to the screen, for copying and pasting to transfer positions between programs which support the format, and for communicating positions via Usenet news or e-mail. The 10 byte binary format is called the key, and the 14 character ASCII format is the ID.

The key is essentially a bit string (imagine you start with an empty sequence of bits, and continue adding either 0 or 1 to the end). The way to build up a sequence that corresponds to a given position is:

For every point around the board (starting at the ace point of the player on roll, continuing around to the 24 point and ending at the bar):
append as many 1s as the player on roll has on that point (if any).
append a 0.
For every point around the board (starting at the ace point of the opponent, continuing around to the opponent's 24 point and ending at the bar):
append as many 1s as the opponent has on that point (if any).
append a 0.
Pad out the string to 80 bits with 0s.

The worst-case representation will require 80 bits: you can see that there are always 50 0 bits even if there are no checkers at all. Each player has a maximum of 15 checkers in play (not yet borne off) which require a 1 bit wherever they are positioned. That's 30 bits to take of all checkers, plus the 50 bits of overhead for a total of 80 bits (the last bit is always 0 and isn't strictly necessary, but it makes the code slightly easier). This bit string should be stored in little-endian order when packed into bytes (i.e. the first bits in the string are stored in the least significant bits of the first byte).

As an example, here's what the starting position looks like in the key format:

0 0 0 0 0 player on roll has no checkers on ace to 5 points
11111 0 5 checkers on the 6 point
0 empty bar
111 0 3 on the 8
0 0 0 0 no others in our outfield
11111 0 5 on the midpoint
0 0 0 0 0 none in the opponent's outfield
0 0 0 0 0 or in opponent's board, until...
11 0 two on the 24 point
0 none on the bar
0 0 0 0 0 opponent has no checkers on ace to 5 points
11111 0 5 checkers on the 6 point
0 empty bar
111 0 3 on the 8
0 0 0 0 no others in opponent's outfield
11111 0 5 on the midpoint
0 0 0 0 0 none in our outfield
0 0 0 0 0 or in our board, until...
11 0 two on the 24 point
0 none on the bar

so altogether it's:

00000111110011100000111110000000000011000000011111001110000011111000000000001100

In little endian bytes it looks like:

11100000 01110011 11110000 00000001 00110000 1110000001110011111100000000000100110000
0xE0 0x73 0xF0 0x01 0x30 0xE00x730xF00x010x30

so the 10 byte key (in hex) is E0 73 F0 01 30 E0 73 F0 01 30.

The ID format is simply the Base64 encoding of the key. (Technically, a Base64 encoding of 80 binary bits should consist of 14 characters followed by two = padding characters, but this padding is omitted in the ID format.)

To continue the above example, splitting the 10 8-bit bytes into 14 6-bit groups gives:

111000 000111 001111 110000 000000 010011 000011 100000 011100 111111 000000 000001 001100 000000

In Base64 encoding, these groups are respectively represented as:

4 H P w A T D g c / A B M A

So, the position ID of the checkers at the start of the game is simply:

4HPwATDgc/ABMA

You can set the board in gnubg either by writing the position ID into the position text input field in the GUI or by executing the command

set board 4HPwATDgc/ABMA.

Notes

This encoding is obviously not as compact as it could be: in particular, there are lots of redundant representations of illegal positions where both players have checkers on the same point. Theoretically, it would be possible to get it down to 64 bits by using Walter Trice's D() expressions, but I think you'd have to be a mathematical masochist to try it!
Example code to convert between a raw board encoding (the number of checkers on each point) and these keys/IDs is available licensed under GPL.
Thanks to Tom Keith and David desJardins for their suggestions on simplifying the encoding without increasing the worst case length.

0 0 0 0 0	player on roll has no checkers on ace to 5 points
11111 0	5 checkers on the 6 point
0	empty bar
111 0	3 on the 8
0 0 0 0	no others in our outfield
11111 0	5 on the midpoint
0 0 0 0 0	none in the opponent's outfield
0 0 0 0 0	or in opponent's board, until...
11 0	two on the 24 point
0	none on the bar
0 0 0 0 0	opponent has no checkers on ace to 5 points
11111 0	5 checkers on the 6 point
0	empty bar
111 0	3 on the 8
0 0 0 0	no others in opponent's outfield
11111 0	5 on the midpoint
0 0 0 0 0	none in our outfield
0 0 0 0 0	or in our board, until...
11 0	two on the 24 point
0	none on the bar

11100000	01110011	11110000	00000001	00110000	1110000001110011111100000000000100110000
0xE0	0x73	0xF0	0x01	0x30	0xE00x730xF00x010x30