
8 Controlling the Archive Format

For historical reasons, there are several formats of tar archives. All of them are based on the same principles, but subtle differences often make them incompatible with each other.

GNU tar is able to create and handle archives in a variety of formats. The most frequently used formats are:

gnu

Format used by GNU tar versions up to 1.13.25. This format was derived from an early POSIX standard, adding some improvements such as sparse file handling and incremental archives. Unfortunately, these features were implemented in a way that is incompatible with other archive formats.

Archives in ‘gnu’ format are able to hold file names of unlimited length.

oldgnu

Format used by GNU tar versions prior to 1.12.

v7

Archive format compatible with the V7 implementation of tar. This format imposes a number of limitations. The most important of them are:

  1. File names and symbolic links can contain at most 100 bytes.
  2. File sizes must be less than 8 GiB (2^33 bytes = 8,589,934,592 bytes).
  3. Special files (block and character devices, FIFOs, etc.) cannot be stored.
  4. UIDs and GIDs must be less than 2^21 (2,097,152).
  5. V7 archives do not contain symbolic ownership information (user and group name of the file owner).
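
The numeric limits above can be checked mechanically before choosing the ‘v7’ format. The following Python sketch is only an illustration of the list: the function name v7_problems and the choice of UTF-8 for measuring name length are assumptions, and the code is not part of GNU tar.

```python
# Limits taken from the list above.
V7_MAX_NAME_BYTES = 100   # file names and symbolic link targets
V7_SIZE_LIMIT = 2 ** 33   # file sizes must be below 8 GiB
V7_ID_LIMIT = 2 ** 21     # UIDs and GIDs must be below 2,097,152


def v7_problems(name, size, uid, gid, is_special=False):
    """Return the reasons why an entry would not fit a v7 archive.

    Hypothetical helper; name length is measured in UTF-8 bytes,
    which is an assumption about the file system encoding.
    """
    problems = []
    if len(name.encode("utf-8")) > V7_MAX_NAME_BYTES:
        problems.append("name longer than 100 bytes")
    if size >= V7_SIZE_LIMIT:
        problems.append("size is 8 GiB or more")
    if uid >= V7_ID_LIMIT or gid >= V7_ID_LIMIT:
        problems.append("UID or GID is 2^21 or more")
    if is_special:
        problems.append("special files cannot be stored")
    return problems


# An ordinary small file passes every check.
print(v7_problems("src/main.c", 1024, 1000, 1000))
# A 101-byte name, an 8 GiB file and a UID of 2^21 each fail.
print(v7_problems("a" * 101, 2 ** 33, 2 ** 21, 0))
```

An empty list means that none of the listed limits is exceeded; it does not guarantee that another tar implementation will accept the archive.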

This format has traditionally been used by Automake when producing Makefiles. This practice will change in the future. In the meantime, however, projects containing file names more than 100 bytes long will not be able to use GNU tar 1.35 together with Automake prior to 1.9.

ustar

Archive format defined by POSIX.1-1988 and later. It stores symbolic ownership information. It is also able to store special files. However, it imposes several restrictions as well:

  1. File names can contain at most 255 bytes.
  2. File names longer than 100 bytes must be split at a directory separator in two parts, the first being at most 155 bytes long. So, in most cases file names must be a bit shorter than 255 bytes.
  3. Symbolic links can contain at most 100 bytes.
  4. Files can contain at most 8 GiB (2^33 bytes = 8,589,934,592 bytes).
  5. UIDs, GIDs, device major numbers, and device minor numbers must be less than 2^21 (2,097,152).
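
The splitting rule for long file names is easier to follow with a small example. The Python sketch below tries each directory separator in turn and accepts the first split whose first part fits in 155 bytes and whose second part fits in 100 bytes. The function name ustar_split is hypothetical, UTF-8 is assumed for measuring lengths, and the 100-byte bound on the second part is inferred from the rule that only names longer than 100 bytes need splitting. It illustrates the rule and is not GNU tar's own code.

```python
USTAR_MAX_TOTAL = 255    # limit 1: the whole file name
USTAR_MAX_PREFIX = 155   # limit 2: first part after the split
USTAR_MAX_NAME = 100     # names up to this length need no split


def ustar_split(path):
    """Split path into (prefix, name) under the limits above.

    Hypothetical helper; raises ValueError when no split is possible.
    """
    raw = path.encode("utf-8")
    if len(raw) > USTAR_MAX_TOTAL:
        raise ValueError("file name longer than 255 bytes")
    if len(raw) <= USTAR_MAX_NAME:
        return "", path              # short names are stored unsplit
    parts = path.split("/")
    # Try every directory separator as the split point.
    for i in range(1, len(parts)):
        prefix = "/".join(parts[:i])
        name = "/".join(parts[i:])
        if (prefix and name
                and len(prefix.encode("utf-8")) <= USTAR_MAX_PREFIX
                and len(name.encode("utf-8")) <= USTAR_MAX_NAME):
            return prefix, name
    raise ValueError("no directory separator allows a valid split")


# 150-byte directory plus 90-byte file name: splits at the separator.
prefix, name = ustar_split("d" * 150 + "/" + "f" * 90)
print(len(prefix), len(name))

# 120 bytes without any separator: cannot be stored, although it is
# well under 255 bytes.
try:
    ustar_split("x" * 120)
except ValueError as err:
    print("rejected:", err)
```

The second call shows why, in practice, the usable length depends on where the directory separators fall and is often less than 255 bytes.
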

star

The format used by the late Jörg Schilling’s star implementation. GNU tar is able to read ‘star’ archives but currently does not produce them.

posix

The format defined by POSIX.1-2001 and later. This is the most flexible and feature-rich format. It does not impose arbitrary restrictions on file sizes or file name lengths. This format is more recent, so some tar implementations cannot handle it properly. However, any tar implementation able to read ‘ustar’ archives should be able to read most ‘posix’ archives as well, except that it will extract any additional information (such as long file names) as extra plain text files.

This archive format will be the default format for future versions of GNU tar.
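
The difference between ‘ustar’ and ‘posix’ for long file names can be observed without GNU tar, using the tarfile module from the Python standard library. This is a sketch under the assumption that tarfile's USTAR_FORMAT and PAX_FORMAT constants correspond to the ‘ustar’ and ‘posix’ formats described here; the helper write_archive and the sample file name are made up for the illustration.

```python
import io
import tarfile

# A made-up file name of more than 255 bytes.
long_name = "/".join(["directory-with-a-long-name"] * 12) + "/file.txt"


def write_archive(fmt, name, payload=b"hello\n"):
    """Write one regular file into an in-memory archive (hypothetical helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


# Attempt 1: ustar. tarfile raises ValueError when the name cannot fit.
try:
    write_archive(tarfile.USTAR_FORMAT, long_name)
    ustar_ok = True
except ValueError:
    ustar_ok = False

# Attempt 2: posix (pax). The long name goes into an extended header.
data = write_archive(tarfile.PAX_FORMAT, long_name)
with tarfile.open(fileobj=io.BytesIO(data)) as tar:
    stored_names = tar.getnames()

print("ustar accepted the name:", ustar_ok)
print("posix round trip intact:", stored_names == [long_name])
```

With GNU tar itself, the archive format is selected with the --format option (for example --format=posix).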

The following table summarizes the limitations of each of these formats:

Format   UID        File Size   File Name   Devn
gnu      1.8e19     Unlimited   Unlimited   63
oldgnu   1.8e19     Unlimited   Unlimited   63
v7       2097151    8 GiB - 1   99          n/a
ustar    2097151    8 GiB - 1   255         21
posix    Unlimited  Unlimited   Unlimited   Unlimited
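
The numeric entries for ‘v7’ and ‘ustar’ follow from the number of octal digits a header field can hold. The short Python calculation below reproduces them. The digit counts (7 for UIDs, GIDs and device numbers, 11 for file sizes) are an assumption based on the usual ustar header layout, not something stated in this chapter.

```python
# Octal digits available in the numeric header fields (assumed widths:
# 8-byte and 12-byte fields, one byte of each used as a terminator).
ID_DIGITS = 7
SIZE_DIGITS = 11

max_id = 8 ** ID_DIGITS - 1       # largest UID, GID or device number
max_size = 8 ** SIZE_DIGITS - 1   # largest file size in bytes

GIB = 2 ** 30
print(max_id)                           # 2097151
print(max_size)                         # 8589934591
print(max_size + 1 == 8 * GIB)          # True
print((max_id + 1).bit_length() - 1)    # 21

# For comparison, 2^64 in the same notation as the gnu and oldgnu rows.
print(f"{2 ** 64:.1e}")                 # 1.8e+19
```

The 1.8e19 entry is numerically close to 2^64; whether that is the exact bound used by the ‘gnu’ format is not stated here.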

The default format for GNU tar is defined at compilation time. You can check it by running tar --help and examining the last lines of its output. Usually, GNU tar is configured to create archives in ‘gnu’ format; however, a future version will switch to ‘posix’.
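
The check can also be scripted. The Python sketch below runs tar --help and looks for a --format=NAME setting in the last lines of the output. The function name default_format, the number of trailing lines inspected, and the assumption that the default is printed in the form --format=NAME are all illustrative; other tar implementations may print nothing of the kind, in which case the function returns None.

```python
import re
import shutil
import subprocess


def default_format(help_text, tail_lines=10):
    """Return the format named in the last lines of the help text, or None.

    Hypothetical helper; assumes the default appears as --format=NAME
    with NAME in lower case.
    """
    tail = "\n".join(help_text.splitlines()[-tail_lines:])
    match = re.search(r"--format=([a-z0-9]+)", tail)
    return match.group(1) if match else None


if __name__ == "__main__":
    if shutil.which("tar"):
        result = subprocess.run(
            ["tar", "--help"], capture_output=True, text=True
        )
        print(default_format(result.stdout))
    else:
        print("tar not found on PATH")
```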



This document was generated on August 23, 2023 using texi2html 5.0.