2 SPSS Viewer File Format

SPSS Viewer or .spv files, here called SPV files, are written by SPSS 16 and later to represent the contents of its output editor. This chapter documents the format, based on examination of a corpus of about 8,000 files from a variety of sources. This description is detailed enough to both read and write SPV files.

SPSS 15 and earlier versions instead use .spo files, which have a completely different output format based on the Microsoft Compound Document Format. This format is not documented here.

An SPV file is a Zip archive that can be read with zipinfo, unzip, and similar programs. The final member in the Zip archive is the manifest, a file named META-INF/MANIFEST.MF. This structure makes SPV files resemble Java “JAR” files (and ODF files), but whereas a JAR manifest contains a sequence of colon-delimited key/value pairs, an SPV manifest contains only the string ‘allowPivoting=true’, without a new-line. PSPP uses this string to identify an SPV file; it is invariant across the corpus.(2)(3)
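The manifest check described above can be sketched in Python with the standard zipfile module. This is an illustrative sketch, not PSPP's own implementation: the names looks_like_spv and make_archive are hypothetical, and the comparison is deliberately strict (an exact match with no trailing new-line), which is an assumption based on the description above.

```python
import io
import zipfile

MANIFEST_NAME = "META-INF/MANIFEST.MF"
MANIFEST_CONTENTS = b"allowPivoting=true"  # no trailing new-line


def looks_like_spv(file) -> bool:
    """Return True if `file` (a path or a binary file object) is a Zip
    archive whose manifest holds exactly the expected string."""
    try:
        with zipfile.ZipFile(file) as archive:
            return archive.read(MANIFEST_NAME) == MANIFEST_CONTENTS
    except (zipfile.BadZipFile, KeyError, OSError):
        # Not a Zip archive, unreadable, or no manifest member.
        return False


def make_archive(members: dict) -> io.BytesIO:
    """Hypothetical helper: build an in-memory Zip archive for a demo."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in members.items():
            archive.writestr(name, data)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    demo = make_archive({
        "outputViewer0000000000.xml": b"<placeholder/>",
        MANIFEST_NAME: MANIFEST_CONTENTS,
    })
    print(looks_like_spv(demo))
```

A reader that wants to accept everything SPSS itself accepts would need a looser test, since SPSS does not require the manifest to exist.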

The rest of the members in an SPV file’s Zip archive fall into two categories: structure and detail members. Structure member names take the form outputViewernumber.xml or outputViewernumber_heading.xml, where number is a 10-digit decimal number. Each of these members represents some kind of output item (a table, a heading, a block of text, etc.) or a group of them. The member whose output goes at the beginning of the document is numbered 0, the next member in the output is numbered 1, and so on.
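As a minimal sketch of the naming rule above, the following Python picks the structure members out of a list of archive member names and puts them in document order. The function name structure_members is hypothetical, and the pattern assumes that the 10-digit number is zero-padded, as the numbering from 0 implies.

```python
import re

# outputViewer + 10 decimal digits + optional "_heading" + ".xml"
STRUCTURE_NAME = re.compile(r"outputViewer(\d{10})(_heading)?\.xml")


def structure_members(names):
    """Return the structure member names from `names`, sorted by the
    number embedded in each name (that is, in output order)."""
    matched = []
    for name in names:
        m = STRUCTURE_NAME.fullmatch(name)
        if m:
            matched.append((int(m.group(1)), name))
    return [name for _, name in sorted(matched)]


if __name__ == "__main__":
    members = [
        "outputViewer0000000001_heading.xml",
        "00000000001_lightTableData.bin",
        "outputViewer0000000000.xml",
        "META-INF/MANIFEST.MF",
    ]
    for name in structure_members(members):
        print(name)
```

Names that do not match the pattern, such as detail members and the manifest, are simply skipped.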

Structure members contain XML. This XML is sometimes self-contained, but it often references detail members in the Zip archive, which are named as follows:

prefix_table.xml and prefix_tableData.bin
prefix_lightTableData.bin

The structure of a table plus its data. Older SPV files pair a prefix_table.xml file that describes the table’s structure with a binary prefix_tableData.bin file that gives its data. Newer SPV files (the majority of those in the corpus) instead include a single prefix_lightTableData.bin file that incorporates both into a single binary format.

prefix_warning.xml and prefix_warningData.bin
prefix_lightWarningData.bin

Same format used for tables, with a different name.

prefix_notes.xml and prefix_notesData.bin
prefix_lightNotesData.bin

Same format used for tables, with a different name.

prefix_chartData.bin and prefix_chart.xml

The structure of a chart plus its data. Charts do not have a “light” format.

prefix_Imagegeneric.png
prefix_PastedObjectgeneric.png
prefix_imageData.bin

A PNG image referenced by an object element (in the first two cases) or an image element (in the final case). See The object and image Elements.

prefix_pmml.scf
prefix_stats.scf
prefix_model.xml

Not yet investigated. The corpus contains few examples.

The prefix in the names of the detail members is typically an 11-digit decimal number that increases for each item, tending to skip values. Older SPV files use different naming conventions for detail members. Structure members refer to detail members by name, so the exact names of detail members do not matter to readers as long as they are unique.
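For inspecting an archive by hand, the detail member names listed above can be split into a prefix and a kind by matching on the suffix. The sketch below is only a convenience for exploration: the function name and the kind labels are hypothetical, the suffixes are taken literally from the list above, and a real reader should follow the references in the structure members rather than rely on names, especially for older files.

```python
# Suffixes copied from the list of detail members; labels are illustrative.
DETAIL_SUFFIXES = {
    "_table.xml": "table structure",
    "_tableData.bin": "table data",
    "_lightTableData.bin": "light table",
    "_warning.xml": "warning structure",
    "_warningData.bin": "warning data",
    "_lightWarningData.bin": "light warning",
    "_notes.xml": "notes structure",
    "_notesData.bin": "notes data",
    "_lightNotesData.bin": "light notes",
    "_chart.xml": "chart structure",
    "_chartData.bin": "chart data",
    "_Imagegeneric.png": "image (object element)",
    "_PastedObjectgeneric.png": "image (object element)",
    "_imageData.bin": "image (image element)",
    "_pmml.scf": "not yet investigated",
    "_stats.scf": "not yet investigated",
    "_model.xml": "not yet investigated",
}


def classify_detail_member(name):
    """Return (prefix, kind) for a recognized detail member name,
    or None if the name matches none of the listed suffixes."""
    for suffix, kind in DETAIL_SUFFIXES.items():
        if name.endswith(suffix) and len(name) > len(suffix):
            return name[: -len(suffix)], kind
    return None


if __name__ == "__main__":
    print(classify_detail_member("00000000001_lightTableData.bin"))
    print(classify_detail_member("META-INF/MANIFEST.MF"))
```

Because every suffix begins with an underscore, a name such as 00000000001_lightTableData.bin cannot be mistaken for the older _tableData.bin form.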

SPSS tolerates corrupted Zip archives that Zip reader libraries tend to reject. These can be fixed up with zip -FF.


Footnotes

(2)

SPV files always begin with the 7-byte sequence 50 4b 03 04 14 00 08, but this is not a useful magic number because most Zip archives start the same way.

(3)

SPSS writes META-INF/MANIFEST.MF to every SPV file, but it does not read it or even require it to exist, so using different contents, e.g. ‘allowingPivot=false’, has no effect.