4 GNU `tar` Operations

4.1 Basic GNU `tar` Operations

The basic tar operations, ‘--create’ (‘-c’), ‘--list’ (‘-t’) and ‘--extract’ (‘--get’, ‘-x’), are currently presented and described in the tutorial chapter of this manual. This section provides some complementary notes for these operations.

‘--create’

‘-c’

Creating an empty archive would have some kind of elegance. One can initialize an empty archive and later use ‘--append’ (‘-r’) for adding all members. Some applications would not welcome making an exception in the way of adding the first archive member. On the other hand, many people reported that it is dangerously easy for tar to destroy a magnetic tape with an empty archive(9). The two most common errors are:

Mistakenly using create instead of extract, when the intent was to extract the full contents of an archive. This error is likely: keys c and x are right next to each other on the QWERTY keyboard. Instead of being unpacked, the archive then gets wholly destroyed. When users speak about exploding an archive, they usually mean something else :-).
Forgetting the argument to file, when the intent was to create an archive with a single file in it. This error is likely because a tired user can easily add the f key to the cluster of option letters, by the mere force of habit, without realizing the full consequence of doing so. The usual consequence is that the single file, which was meant to be saved, is rather destroyed.

So, recognizing the likelihood and the catastrophic nature of these errors, GNU tar now takes some distance from elegance, and cowardly refuses to create an archive when the ‘--create’ option is given, there are no arguments besides options, and the ‘--files-from’ (‘-T’) option is not used. To get around the cautiousness of GNU tar and nevertheless create an archive with nothing in it, one may still use, as the value for the ‘--files-from’ option, a file with no names in it, as shown in the following commands:

tar --create --file=empty-archive.tar --files-from=/dev/null
tar -cf empty-archive.tar -T /dev/null
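The refusal and the workaround can be seen side by side in the following sketch. It assumes that GNU tar is installed as ‘tar’ and works in a throwaway temporary directory; the archive name ‘oops.tar’ is made up for the illustration.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar is available as "tar".
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# No file names and no --files-from: GNU tar refuses and exits with a
# nonzero status instead of writing an empty archive.
refuse_status=0
tar --create --file=oops.tar 2>/dev/null || refuse_status=$?

# An explicitly empty list of names is accepted.
tar --create --file=empty-archive.tar --files-from=/dev/null
member_count=$(tar --list --file=empty-archive.tar | wc -l)

echo "exit status of the refused command: $refuse_status"
echo "members in empty-archive.tar: $member_count"
```

The resulting file is not zero bytes long, because tar still writes the end-of-archive marker, but listing it shows no members.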

‘--extract’

‘--get’

‘-x’

A socket is stored, within a GNU tar archive, as a pipe.

‘--list’

‘-t’

GNU tar now shows dates as ‘1996-08-30’, while it used to show them as ‘Aug 30 1996’. Preferably, people should get used to ISO 8601 dates. Local American dates should be made available again with full date localization support, once ready. In the meantime, programs that cannot localize dates should prefer international dates; that is really the way to go.

Look up http://www.cl.cam.ac.uk/~mgk25/iso-time.html if you are curious, it contains a detailed explanation of the ISO 8601 standard.

4.2 Advanced GNU `tar` Operations

Now that you have learned the basics of using GNU tar, you may want to learn about further ways in which tar can help you.

This chapter presents five, more advanced operations which you probably won’t use on a daily basis, but which serve more specialized functions. We also explain the different styles of options and why you might want to use one or another, or a combination of them in your tar commands. Additionally, this chapter includes options which allow you to define the output from tar more carefully, and provide help and error correction in special circumstances.

4.2.1 The Five Advanced `tar` Operations

In the last chapter, you learned about the first three operations to tar. This chapter presents the remaining five operations to tar: ‘--append’, ‘--update’, ‘--concatenate’, ‘--delete’, and ‘--compare’.

You are not likely to use these operations as frequently as those covered in the last chapter; however, since they perform specialized functions, they are quite useful when you do need to use them. We will give examples using the same directory and files that you created in the last chapter. As you may recall, the directory is called ‘practice’, the files are ‘jazz’, ‘blues’, ‘folk’, and the two archive files you created are ‘collection.tar’ and ‘music.tar’.

We will also use the archive files ‘afiles.tar’ and ‘bfiles.tar’. The archive ‘afiles.tar’ contains the members ‘apple’, ‘angst’, and ‘aspic’; ‘bfiles.tar’ contains the members ‘./birds’, ‘baboon’, and ‘./box’.

Unless we state otherwise, all practicing you do and examples you follow in this chapter will take place in the ‘practice’ directory that you created in the previous chapter; see Preparing a Practice Directory for Examples. (Below in this section, we will remind you of the state of the examples where the last chapter left them.)

The five operations that we will cover in this chapter are:

‘--append’
‘-r’: Add new entries to an archive that already exists.
‘--update’
‘-u’: Add more recent copies of archive members to the end of an archive, if they exist.
‘--concatenate’
‘--catenate’
‘-A’: Add one or more pre-existing archives to the end of another archive.
‘--delete’: Delete items from an archive (does not work on tapes).
‘--compare’
‘--diff’
‘-d’: Compare archive members to their counterparts in the file system.

4.2.2 How to Add Files to Existing Archives: ‘`--append`’

If you want to add files to an existing archive, you don’t need to create a new archive; you can use ‘--append’ (‘-r’). The archive must already exist in order to use ‘--append’. (A related operation is the ‘--update’ operation; you can use this to add newer versions of archive members to an existing archive. To learn how to do this with ‘--update’, see section Updating an Archive.)

If you use ‘--append’ to add a file that has the same name as an archive member to an archive containing that archive member, then the old member is not deleted. What does happen, however, is somewhat complex. tar allows you to have an unlimited number of members with the same name. Some operations treat these same-named members no differently than any other set of archive members: for example, if you view an archive with ‘--list’ (‘-t’), you will see all of those members listed, with their data modification times, owners, etc.

Other operations don’t deal with these members as perfectly as you might prefer; if you were to use ‘--extract’ to extract the archive, only the most recently added copy of a member with the same name as other members would end up in the working directory. This is because ‘--extract’ extracts an archive in the order the members appeared in the archive; the most recently archived members will be extracted last. Additionally, an extracted member will replace a file of the same name which existed in the directory already, and tar will not prompt you about this(10). Thus, only the most recently archived member will end up being extracted, as it will replace the one extracted before it, and so on.

There exists a special option that allows you to get around this behavior and extract (or list) only a particular copy of the file. This is the ‘--occurrence’ option. If you run tar with this option, it will extract only the first copy of the file. You may also give this option an argument specifying the number of the copy to be extracted. Thus, for example, if the archive ‘archive.tar’ contained three copies of file ‘myfile’, then the command

tar --extract --file archive.tar --occurrence=2 myfile

would extract only the second copy. See section --occurrence, for the description of the ‘--occurrence’ option.
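To try this out without touching your practice files, the following sketch builds such an archive in a temporary directory. It assumes GNU tar is installed as ‘tar’; the file contents and the scratch directory names ‘second’ and ‘all’ are invented for the illustration.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar is available as "tar".
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Store three successive versions of 'myfile' under the same member name.
echo "first version" > myfile
tar --create --file=archive.tar myfile
echo "second version" > myfile
tar --append --file=archive.tar myfile
echo "third version" > myfile
tar --append --file=archive.tar myfile

# Extract only the second copy into a scratch directory.
mkdir second
tar --extract --file=archive.tar --occurrence=2 --directory=second myfile
second_copy=$(cat second/myfile)

# A plain extraction unpacks all three copies in order, so the last one wins.
mkdir all
tar --extract --file=archive.tar --directory=all myfile
last_copy=$(cat all/myfile)

echo "with --occurrence=2: $second_copy"
echo "plain extraction:    $last_copy"
```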

If you want to replace an archive member, use ‘--delete’ to delete the member you want to remove from the archive, and then use ‘--append’ to add the member you want to be in the archive. Note that you cannot change the order of the archive; the most recently added member will still appear last. In this sense, you cannot truly “replace” one member with another. (Replacing one member with another will not work on certain types of media, such as tapes; see Removing Archive Members Using ‘--delete’ and Tapes and Other Archive Media, for more information.)

4.2.2.1 Appending Files to an Archive

The simplest way to add a file to an already existing archive is the ‘--append’ (‘-r’) operation, which writes specified files into the archive whether or not they are already among the archived files.

When you use ‘--append’, you must specify file name arguments, as there is no default. If you specify a file that already exists in the archive, another copy of the file will be added to the end of the archive. As with other operations, the member names of the newly added files will be exactly the same as their names given on the command line. The ‘--verbose’ (‘-v’) option will print out the names of the files as they are written into the archive.

‘--append’ cannot be performed on some tape drives, unfortunately, due to deficiencies in the formats those tape drives use. The archive must be a valid tar archive, or else the results of using this operation will be unpredictable. See section Tapes and Other Archive Media.

To demonstrate using ‘--append’ to add a file to an archive, create a file called ‘rock’ in the ‘practice’ directory. Make sure you are in the ‘practice’ directory. Then, run the following tar command to add ‘rock’ to ‘collection.tar’:

$ tar --append --file=collection.tar rock

If you now use the ‘--list’ (‘-t’) operation, you will see that ‘rock’ has been added to the archive:

$ tar --list --verbose --file=collection.tar
-rw-r--r-- me/user          28 1996-10-18 16:31 jazz
-rw-r--r-- me/user          21 1996-09-23 16:44 blues
-rw-r--r-- me/user          20 1996-09-23 16:44 folk
-rw-r--r-- me/user          20 1996-09-23 16:44 rock

4.2.2.2 Multiple Members with the Same Name

You can use ‘--append’ (‘-r’) to add copies of files which have been updated since the archive was created. (However, we do not recommend doing this since there is another tar option called ‘--update’; See section Updating an Archive, for more information. We describe this use of ‘--append’ here for the sake of completeness.) When you extract the archive, the older version will be effectively lost. This works because files are extracted from an archive in the order in which they were archived. Thus, when the archive is extracted, a file archived later in time will replace a file of the same name which was archived earlier, even though the older version of the file will remain in the archive unless you delete all versions of the file.

Supposing you change the file ‘blues’ and then append the changed version to ‘collection.tar’. As you saw above, the original ‘blues’ is in the archive ‘collection.tar’. If you change the file and append the new version of the file to the archive, there will be two copies in the archive. When you extract the archive, the older version of the file will be extracted first, and then replaced by the newer version when it is extracted.

You can append the new, changed copy of the file ‘blues’ to the archive in this way:

$ tar --append --verbose --file=collection.tar blues
blues

Because you specified the ‘--verbose’ option, tar has printed the name of the file being appended as it was acted on. Now list the contents of the archive:

$ tar --list --verbose --file=collection.tar
-rw-r--r-- me/user          28 1996-10-18 16:31 jazz
-rw-r--r-- me/user          21 1996-09-23 16:44 blues
-rw-r--r-- me/user          20 1996-09-23 16:44 folk
-rw-r--r-- me/user          20 1996-09-23 16:44 rock
-rw-r--r-- me/user          58 1996-10-24 18:30 blues

The newest version of ‘blues’ is now at the end of the archive (note the different modification dates and file sizes). If you extract the archive, the older version of the file ‘blues’ will be replaced by the newer version. You can confirm this by extracting the archive and running ‘ls’ on the directory.

If you wish to extract the first occurrence of the file ‘blues’ from the archive, use the ‘--occurrence’ option, as shown in the following example:

$ tar --extract -vv --occurrence --file=collection.tar blues
-rw-r--r-- me/user          21 1996-09-23 16:44 blues

See section Changing How tar Writes Files, for more information on ‘--extract’ and see --occurrence, for a description of the ‘--occurrence’ option.

4.2.3 Updating an Archive

In the previous section, you learned how to use ‘--append’ to add a file to an existing archive. A related operation is ‘--update’ (‘-u’). The ‘--update’ operation updates a tar archive by comparing the date of the specified archive members against the date of the file with the same name. If the file has been modified more recently than the archive member, then the newer version of the file is added to the archive (as with ‘--append’).

Unfortunately, you cannot use ‘--update’ with magnetic tape drives. The operation will fail.

Both ‘--update’ and ‘--append’ work by adding to the end of the archive. When you extract a file from the archive, only the version stored last will wind up in the file system, unless you use the ‘--backup’ option. See section Multiple Members with the Same Name, for a detailed discussion.
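The effect of ‘--backup’ on same-named members can be sketched as follows. The example assumes GNU tar installed as ‘tar’, uses the ‘numbered’ backup method so that the backup file name is predictable, and invents the scratch directories ‘plain’ and ‘kept’.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar is available as "tar".
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Build an archive holding two versions of 'blues'.
echo "old blues" > blues
tar --create --file=collection.tar blues
echo "new blues" > blues
tar --append --file=collection.tar blues

mkdir plain kept

# Plain extraction: the second member silently replaces the first.
tar --extract --file=collection.tar --directory=plain
plain_files=$(ls plain)

# With --backup, the file extracted first is renamed before it is replaced.
tar --extract --backup=numbered --file=collection.tar --directory=kept
latest=$(cat kept/blues)
earlier=$(cat 'kept/blues.~1~')

echo "plain extraction leaves: $plain_files"
echo "kept/blues:      $latest"
echo "kept/blues.~1~:  $earlier"
```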

4.2.3.1 How to Update an Archive Using ‘`--update`’

You must use file name arguments with the ‘--update’ (‘-u’) operation. If you don’t specify any files, tar won’t act on any files and won’t tell you that it didn’t do anything (which may end up confusing you).

To see the ‘--update’ option at work, create a new file, ‘classical’, in your practice directory, and add some extra text to the file ‘blues’, using any text editor. Then invoke tar with the ‘--update’ operation and the ‘--verbose’ (‘-v’) option specified, using the names of files in the ‘practice’ directory as file name arguments:

$ tar --update -v -f collection.tar blues folk rock classical
blues
classical
$

Because we have specified verbose mode, tar prints out the names of the files it is working on, which in this case are the names of the files that needed to be updated. If you run ‘tar --list’ and look at the archive, you will see ‘blues’ and ‘classical’ at its end. There will be a total of two versions of the member ‘blues’; the one at the end will be newer and larger, since you added text before updating it.
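If you would rather reproduce this in a scratch directory, the following sketch does so. It assumes GNU tar installed as ‘tar’. The original files are given an old, whole-second timestamp with ‘touch -t’ (an arbitrary date chosen for the illustration) so that the comparison of modification times is unambiguous.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar is available as "tar".
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Archive two files whose modification time lies well in the past.
echo "original blues" > blues
echo "folk" > folk
touch -t 200001010000 blues folk
tar --create --file=collection.tar blues folk

# Change one archived file and add a brand-new one.
echo "extra text" >> blues
echo "classical" > classical

# Only the changed file and the new file are written to the archive.
updated=$(tar --update --verbose --file=collection.tar blues folk classical)
members=$(tar --list --file=collection.tar)

echo "added by --update:"
echo "$updated"
echo "archive now contains:"
echo "$members"
```

Note that ‘folk’ is named on the command line but is not added again, because the copy in the file system is no newer than the archive member.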

The reason tar does not overwrite the older file when updating it is that writing to the middle of a section of tape is a difficult process. Tapes are not designed to go backward. See section Tapes and Other Archive Media, for more information about tapes.

‘--update’ (‘-u’) is not suitable for performing backups for two reasons: it does not change directory content entries, and it lengthens the archive every time it is used. The GNU tar options intended specifically for backups are more efficient. If you need to run backups, please consult Performing Backups and Restoring Files.

4.2.4 Combining Archives with ‘`--concatenate`’

Sometimes it may be convenient to add a second archive onto the end of an archive rather than adding individual files to the archive. To add one or more archives to the end of another archive, you should use the ‘--concatenate’ (‘--catenate’, ‘-A’) operation.

To use ‘--concatenate’, give the first archive with ‘--file’ option and name the rest of archives to be concatenated on the command line. The members, and their member names, will be copied verbatim from those archives to the first one(11). The new, concatenated archive will be called by the same name as the one given with the ‘--file’ option. As usual, if you omit ‘--file’, tar will use the value of the environment variable TAPE, or, if this has not been set, the default archive name.

To demonstrate how ‘--concatenate’ works, create two small archives called ‘bluesrock.tar’ and ‘jazzfolk.tar’, using the relevant files from ‘practice’:

$ tar -cvf bluesrock.tar blues rock
blues
rock
$ tar -cvf jazzfolk.tar folk jazz
folk
jazz

If you like, you can run ‘tar --list’ to make sure the archives contain what they are supposed to:

$ tar -tvf bluesrock.tar
-rw-r--r-- melissa/user    105 1997-01-21 19:42 blues
-rw-r--r-- melissa/user     33 1997-01-20 15:34 rock
$ tar -tvf jazzfolk.tar
-rw-r--r-- melissa/user     20 1996-09-23 16:44 folk
-rw-r--r-- melissa/user     65 1997-01-30 14:15 jazz

We can concatenate these two archives with tar:

$ tar --concatenate --file=bluesrock.tar jazzfolk.tar

If you now list the contents of ‘bluesrock.tar’, you will see that it now also contains the archive members of ‘jazzfolk.tar’:

$ tar --list --file=bluesrock.tar
blues
rock
folk
jazz

When you use ‘--concatenate’, the source and target archives must already exist and must have been created using compatible format parameters. Note that tar does not check whether the archives it concatenates have compatible formats; it does not even check whether the files are really tar archives.

Like ‘--append’ (‘-r’), this operation cannot be performed on some tape drives, due to deficiencies in the formats those tape drives use.

It may seem more intuitive to you to want or try to use cat to concatenate two archives instead of using the ‘--concatenate’ operation; after all, cat is the utility for combining files.

However, tar archives incorporate an end-of-archive marker which must be removed if the concatenated archives are to be read properly as one archive. ‘--concatenate’ removes the end-of-archive marker from the target archive before each new archive is appended. If you use cat to combine the archives, the result will not be a valid tar format archive. If you need to retrieve files from an archive that was added to using the cat utility, use the ‘--ignore-zeros’ (‘-i’) option. See section Ignoring Blocks of Zeros, for further information on dealing with archives improperly combined using the cat shell utility.
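The difference between the two approaches can be observed with two one-member archives. This is a sketch under the assumption that GNU tar is installed as ‘tar’; the archive ‘joined.tar’ is a name made up here, and the small archives reuse the names ‘afiles.tar’ and ‘bfiles.tar’ with a single member each.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar is available as "tar".
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo "apple" > apple
echo "baboon" > baboon
tar --create --file=afiles.tar apple
tar --create --file=bfiles.tar baboon

# Using cat leaves the end-of-archive marker of afiles.tar in the middle
# of the result, so a normal listing stops after the first archive.
cat afiles.tar bfiles.tar > joined.tar
plain_listing=$(tar --list --file=joined.tar)

# --ignore-zeros reads past the embedded marker.
zeros_listing=$(tar --list --ignore-zeros --file=joined.tar)

# --concatenate removes the marker of the target before appending.
tar --concatenate --file=afiles.tar bfiles.tar
concat_listing=$(tar --list --file=afiles.tar)

echo "cat, plain listing:      $(echo $plain_listing)"
echo "cat, with --ignore-zeros: $(echo $zeros_listing)"
echo "after --concatenate:      $(echo $concat_listing)"
```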

4.2.5 Removing Archive Members Using ‘`--delete`’

You can remove members from an archive by using the ‘--delete’ option. Specify the name of the archive with ‘--file’ (‘-f’) and then specify the names of the members to be deleted; if you list no member names, nothing will be deleted. The ‘--verbose’ option will cause tar to print the names of the members as they are deleted. As with ‘--extract’, you must give the exact member names when using ‘tar --delete’. ‘--delete’ will remove all versions of the named file from the archive. The ‘--delete’ operation can run very slowly.

Unlike other operations, ‘--delete’ has no short form.

This operation will rewrite the archive. You can only use ‘--delete’ on an archive if the archive device allows you to write to any point on the media, such as a disk; because of this, it does not work on magnetic tapes. Do not try to delete an archive member from a magnetic tape; the action will not succeed, and you will be likely to scramble the archive and damage your tape. There is no safe way (except by completely re-writing the archive) to delete files from most kinds of magnetic tape. See section Tapes and Other Archive Media.

To delete all versions of the file ‘blues’ from the archive ‘collection.tar’ in the ‘practice’ directory, make sure you are in that directory, and then,

$ tar --list --file=collection.tar
blues
folk
jazz
rock
$ tar --delete --file=collection.tar blues
$ tar --list --file=collection.tar
folk
jazz
rock
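The statement that ‘--delete’ removes all versions of a member can be checked with an archive that holds two copies of ‘blues’. The sketch below assumes GNU tar installed as ‘tar’ and works on a disk file in a temporary directory, never on a tape.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar is available as "tar".
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Build an archive that contains 'blues' twice.
echo "first take" > blues
echo "folk" > folk
tar --create --file=collection.tar blues folk
echo "second take" > blues
tar --append --file=collection.tar blues
before=$(tar --list --file=collection.tar)

# One --delete removes every member named 'blues'.
tar --delete --file=collection.tar blues
after=$(tar --list --file=collection.tar)

echo "before: $(echo $before)"
echo "after:  $(echo $after)"
```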

The ‘--delete’ option has been reported to work properly when tar acts as a filter from stdin to stdout.

4.2.6 Comparing Archive Members with the File System

The ‘--compare’ (‘-d’), or ‘--diff’ operation compares specified archive members against files with the same names, and then reports differences in file size, mode, owner, modification date and contents. You should only specify archive member names, not file names. If you do not name any members, then tar will compare the entire archive. If a file is represented in the archive but does not exist in the file system, tar reports a difference.

You have to specify the record size of the archive when modifying an archive with a non-default record size.

tar ignores files in the file system that do not have corresponding members in the archive.

The following example compares the archive members ‘rock’, ‘blues’ and ‘funk’ in the archive ‘bluesrock.tar’ with files of the same name in the file system. (Note that there is no file, ‘funk’; tar will report an error message.)

$ tar --compare --file=bluesrock.tar rock blues funk
rock
blues
tar: funk not found in archive

The spirit behind the ‘--compare’ (‘--diff’, ‘-d’) option is to check whether the archive represents the current state of files on disk, more than validating the integrity of the archive media. For this latter goal, see Verifying Data as It is Stored.
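In scripts, the exit status of ‘--compare’ is often more convenient than its messages. The following sketch assumes GNU tar installed as ‘tar’, which exits with status 1 when some files differ. The exact wording of the difference messages is not relied upon here, and the timestamp given to ‘touch -t’ is an arbitrary choice.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar is available as "tar".
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo "rock" > rock
echo "blues" > blues
touch -t 200001010000 rock blues
tar --create --file=bluesrock.tar blues rock

# Nothing has changed yet, so the comparison succeeds quietly.
clean_status=0
tar --compare --file=bluesrock.tar || clean_status=$?

# Change one file; tar now reports differences and exits with status 1.
echo "one more verse" >> rock
diff_status=0
diff_report=$(tar --compare --file=bluesrock.tar 2>&1) || diff_status=$?

echo "unchanged files: exit status $clean_status"
echo "after editing rock: exit status $diff_status"
printf '%s\n' "$diff_report"
```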

4.3 Options Used by ‘`--create`’

The previous chapter described the basics of how to use ‘--create’ (‘-c’) to create an archive from a set of files. See section How to Create Archives. This section describes advanced options to be used with ‘--create’.

4.3.1 Overriding File Metadata

As described above, a tar archive keeps, for each member it contains, its metadata, such as modification time, mode and ownership of the file. GNU tar allows you to replace these data with other values when adding files to the archive. The options described in this section affect creation of archives of any type. For POSIX archives, see also Controlling Extended Header Keywords, for additional ways of controlling metadata stored in the archive.

‘--mode=permissions’

When adding files to an archive, tar will use permissions for the archive members, rather than the permissions from the files. permissions can be specified either as an octal number or as symbolic permissions, like with chmod (see File permissions in GNU core utilities; this reference also has useful information for those not overly familiar with the UNIX permission system). Using the latter syntax allows for more flexibility. For example, the value ‘a+rw’ adds read and write permissions for everybody, while retaining executable bits on directories or on any other file already marked as executable:

$ tar -c -f archive.tar --mode='a+rw' .
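To see that only the archived metadata changes, and not the file on disk, you can try the following sketch. It assumes GNU tar installed as ‘tar’; the file name ‘notes.txt’ is invented for the illustration.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar is available as "tar".
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# A file readable and writable by its owner only.
echo "data" > notes.txt
chmod 600 notes.txt

# Store it with read and write permission added for everybody.
tar -c -f archive.tar --mode='a+rw' notes.txt

# The first ten characters of a verbose listing are the permission mask.
stored_mode=$(tar -t -v -f archive.tar | cut -c1-10)
disk_mode=$(ls -l notes.txt | cut -c1-10)

echo "mode stored in the archive: $stored_mode"
echo "mode of the file on disk:   $disk_mode"
```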

‘--mtime=date’

When adding files to an archive, tar uses date as the modification time of members, instead of their actual modification times. The argument date can be either a textual date representation in almost arbitrary format (see section Date input formats) or a name of an existing file, starting with ‘/’ or ‘.’. In the latter case, the modification time of that file is used.

The following example sets the modification date to 00:00:00 UTC on January 1, 1970:

$ tar -c -f archive.tar --mtime='@0' .

When used with ‘--verbose’ (see section The ‘--verbose’ Option) GNU tar converts the specified date back to a textual form and compares it with the one given with ‘--mtime’. If the two forms differ, tar prints both forms in a message, to help the user check that the right date is being used.

For example:

$ tar -c -f archive.tar -v --mtime=yesterday .
tar: Option --mtime: Treating date 'yesterday' as 2006-06-20 13:06:29.152478
…

When used with ‘--clamp-mtime’ GNU tar sets the modification date to date only on files whose actual modification date is later than date. This makes it easier to build reproducible archives given a common timestamp for generated files while still retaining the original timestamps of untouched files. See section Making tar Archives More Reproducible.

$ tar -c -f archive.tar --clamp-mtime --mtime="$SOURCE_EPOCH" .
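The clamping can be observed with one old and one freshly written file. The sketch below assumes GNU tar installed as ‘tar’ in a version that supports ‘--clamp-mtime’. The literal timestamp ‘@1262304000’ (2010-01-01 00:00:00 UTC) merely stands in for whatever common timestamp your build uses, and the file names are made up.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar with --clamp-mtime is available as "tar".
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# 'untouched' is older than the clamp date; 'generated' was written just now.
echo "old" > untouched
echo "new" > generated
TZ=UTC0 touch -t 200001010000 untouched

# Only modification times later than the given date are replaced by it.
tar -c -f archive.tar --clamp-mtime --mtime='@1262304000' untouched generated

# List in UTC so that the displayed dates do not depend on the local zone.
listing=$(TZ=UTC0 tar -t -v -f archive.tar)
printf '%s\n' "$listing"
```

In the listing, ‘untouched’ keeps its date in 2000, while ‘generated’ is shown with the clamp date.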

‘--owner=user’

Specifies that tar should use user as the owner of members when creating archives, instead of the user associated with the source file.

If user contains a colon, it is taken to be of the form name:id where a nonempty name specifies the user name and a nonempty id specifies the decimal numeric user ID. If user does not contain a colon, it is taken to be a user number if it is one or more decimal digits; otherwise it is taken to be a user name.

If a name is given but no number, the number is inferred from the current host’s user database if possible, and the file’s user number is used otherwise. If a number is given but no name, the name is inferred from the number if possible, and an empty name is used otherwise. If both name and number are given, the user database is not consulted, and the name and number need not be valid on the current host.

There is no value indicating a missing number, and ‘0’ usually means root. Some people like to force ‘0’ as the value to offer in their distributions for the owner of files, because the root user is anonymous anyway, so that might as well be the owner of anonymous archives. For example:

$ tar -c -f archive.tar --owner=0 .

or:

$ tar -c -f archive.tar --owner=root .

‘--group=group’

Files added to the tar archive will have a group ID of group, rather than the group from the source file. As with ‘--owner’, the argument group can be an existing group symbolic name, or a decimal numeric group ID, or name:id.
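The name:id form is convenient when the names should not depend on the host that creates the archive. In the following sketch, which assumes GNU tar installed as ‘tar’, the names ‘builder’ and ‘staff’ and the numbers 1234 and 5678 are invented and need not exist on your system.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar is available as "tar".
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo "data" > file

# Both name and number are given, so no user or group database is consulted.
tar -c -f archive.tar --owner=builder:1234 --group=staff:5678 file

# The second column of a verbose listing is owner/group.
names=$(tar -t -v -f archive.tar | awk '{print $2}')
ids=$(tar -t -v --numeric-owner -f archive.tar | awk '{print $2}')

echo "stored names:   $names"
echo "stored numbers: $ids"
```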

The ‘--owner’ and ‘--group’ options affect all files added to the archive. GNU tar also provides two options that allow for more detailed control over owner translation:

‘--owner-map=file’

Read UID translation map from file.

When reading, empty lines are ignored. The ‘#’ sign, unless quoted, introduces a comment, which extends to the end of the line. Each nonempty line defines mapping for a single UID. It must consist of two fields separated by any amount of whitespace. The first field defines original username and UID. It can be a valid user name or a valid UID prefixed with a plus sign. In both cases the corresponding UID or user name is inferred from the current host’s user database.

The second field defines the UID and username to map the original one to. Its format can be the same as described above. Otherwise, it can have the form newname:newuid, in which case neither newname nor newuid are required to be valid as per the user database.

For example, consider the following file:

+10     bin
smith   root:0

Given this file, each input file that is owned by UID 10 will be stored in the archive with owner name ‘bin’ and owner UID corresponding to ‘bin’. Each file owned by user ‘smith’ will be stored with owner name ‘root’ and owner ID 0. Other files will remain unchanged.

When used together with ‘--owner-map’, the ‘--owner’ option affects only files whose owner is not listed in the map file.
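A map file can be tried out with your own UID as the original owner. This is a sketch that assumes GNU tar installed as ‘tar’ in a version that supports ‘--owner-map’; the file name ‘owner.map’, the user name ‘archiver’ and the UID 4242 are made up for the illustration.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar with --owner-map is available as "tar".
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

echo "data" > file

# Map the UID of the current user to the invented owner archiver:4242.
printf '+%s  archiver:4242\n' "$(id -u)" > owner.map

tar -c -f archive.tar --owner-map=owner.map file

# Take the owner part of the owner/group column of a verbose listing.
owner_name=$(tar -t -v -f archive.tar | awk '{print $2}' | cut -d/ -f1)
owner_id=$(tar -t -v --numeric-owner -f archive.tar | awk '{print $2}' | cut -d/ -f1)

echo "stored owner name: $owner_name"
echo "stored owner UID:  $owner_id"
```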

‘--group-map=file’

Read GID translation map from file.

The format of file is the same as for ‘--owner-map’ option:

Each nonempty line defines mapping for a single GID. It must consist of two fields separated by any amount of whitespace. The first field defines original group name and GID. It can be a valid group name or a valid GID prefixed with a plus sign. In both cases the corresponding GID or group name is inferred from the current host’s group database.

The second field defines the GID and group name to map the original one to. Its format can be the same as described above. Otherwise, it can have the form newname:newgid, in which case neither newname nor newgid are required to be valid as per the group database.

When used together with ‘--group-map’, the ‘--group’ option affects only files whose owner group is not rewritten using the map file.

4.3.2 Extended File Attributes

Extended file attributes are name-value pairs that can be associated with each node in a file system. Despite the fact that the POSIX.1e draft which proposed them has been withdrawn, extended file attributes are supported by many file systems. GNU tar can store extended file attributes along with the files. This feature is controlled by the following command line arguments:

‘--xattrs’

Enable extended attributes support. When used with ‘--create’, this option instructs GNU tar to store extended file attributes in the created archive. This implies POSIX.1-2001 archive format (‘--format=pax’).

When used with ‘--extract’, this option tells tar, for each file extracted, to read stored attributes from the archive and to apply them to the file.

‘--no-xattrs’

Disable extended attributes support. This is the default.

Attribute names are strings prefixed by a namespace name and a dot. Currently, four namespaces exist: ‘user’, ‘trusted’, ‘security’ and ‘system’. By default, when ‘--xattrs’ is used, all names are stored in the archive (with ‘--create’), but only the ‘user’ namespace is extracted (if using ‘--extract’). The reason for this behavior is that other, system-defined attributes do not come with a sufficient compatibility promise. Storing all attributes is a safe operation for archiving purposes. Extracting those (often security-related) attributes on a system other than the one where they were archived, however, can lead to extraction failures or even misinterpretation. This behavior can be controlled using the following options:

‘--xattrs-exclude=pattern’: Specify exclude pattern for extended attributes.
‘--xattrs-include=pattern’: Specify include pattern for extended attributes.

Here, the pattern is a globbing pattern. For example, the following command:

$ tar --xattrs --xattrs-exclude='user.*' -cf a.tar .

will include in the archive ‘a.tar’ all attributes, except those from the ‘user’ namespace.

Users shall check the attributes are binary compatible with the target system before any other namespace is extracted with an explicit ‘--xattrs-include’ option.

Any number of these options can be given, thereby creating lists of include and exclude patterns.

When both options are used, first ‘--xattrs-include’ is applied to select the set of attribute names to keep, and then ‘--xattrs-exclude’ is applied to the resulting set. In other words, only those attributes will be stored whose names match one of the patterns in ‘--xattrs-include’ and don’t match any of the patterns from ‘--xattrs-exclude’.

When listing the archive, if both ‘--xattrs’ and ‘--verbose’ options are given, files that have extended attributes are marked with an asterisk following their permission mask. For example:

-rw-r--r--* smith/users      110 2016-03-16 16:07 file

When two or more ‘--verbose’ options are given, a detailed listing of extended attributes is printed after each file entry. Each attribute is listed on a separate line, which begins with two spaces and the letter ‘x’ indicating extended attribute. It is followed by a colon, length of the attribute and its name, e.g.:

-rw-r--r--* smith/users      110 2016-03-16 16:07 file
  x:  7 user.mime_type
  x: 32 trusted.md5sum

File access control lists (ACL) are another actively used feature proposed by the POSIX.1e standard. Each ACL consists of a set of ACL entries, each of which describes the access permissions on the file for an individual user or a group of users as a combination of read, write and search/execute permissions.

Whether or not to use ACLs is controlled by the following two options:

‘--acls’

Enable POSIX ACLs support. When used with ‘--create’, this option instructs GNU tar to store ACLs in the created archive. This implies POSIX.1-2001 archive format (‘--format=pax’).

When used with ‘--extract’, this option tells tar to restore ACLs for each file extracted (provided they are present in the archive).

‘--no-acls’

Disable POSIX ACLs support. This is the default.

When listing the archive, if both ‘--acls’ and ‘--verbose’ options are given, files that have ACLs are marked with a plus sign following their permission mask. For example:

-rw-r--r--+ smith/users      110 2016-03-16 16:07 file

When two or more ‘--verbose’ options are given, a detailed listing of the ACL is printed after each file entry:

-rw-r--r--+ smith/users      110 2016-03-16 16:07 file
  a: user::rw-,user:gray:-w-,group::r--,mask::rw-,other::r--

Security-Enhanced Linux (SELinux for short) is a Linux kernel security module that provides a mechanism for supporting access control security policies, including so-called mandatory access controls (MAC). Support for SELinux attributes is controlled by the following command line options:

‘--selinux’: Enable the SELinux context support.
‘--no-selinux’: Disable SELinux context support.

4.3.3 Ignore Failed Read

‘--ignore-failed-read’: Do not exit with nonzero status if there are mild problems while reading.

This option has effect only during creation. It instructs tar to treat as mild conditions any missing or unreadable files (directories), or files that change while reading. Such failures don’t affect the program exit code, and the corresponding diagnostic messages are marked as warnings, not errors. These warnings can be suppressed using the ‘--warning=failed-read’ option (see section Controlling Warning Messages).

4.4 Options Used by ‘`--extract`’

The previous chapter showed how to use ‘--extract’ to extract an archive into the file system. Various options cause tar to extract more information than just file contents, such as the owner, the permissions, the modification date, and so forth. This section presents options to be used with ‘--extract’ when certain special considerations arise. You may review the information presented in How to Extract Members from an Archive for more basic information about the ‘--extract’ operation.

4.4.1 Options to Help Read Archives

Normally, tar will request data in full record increments from an archive storage device. If the device cannot return a full record, tar will report an error. However, some devices do not always return full records, or do not require the last record of an archive to be padded out to the next record boundary. To keep reading until you obtain a full record, or to accept an incomplete record if it contains an end-of-archive marker, specify the ‘--read-full-records’ (‘-B’) option in conjunction with the ‘--extract’ or ‘--list’ operations. See section Blocking.

The ‘--read-full-records’ (‘-B’) option is turned on by default when tar reads an archive from standard input, or from a remote machine. This is because on BSD Unix systems, attempting to read a pipe returns however much happens to be in the pipe, even if it is less than was requested. If this option were not enabled, tar would fail as soon as it read an incomplete record from the pipe.

If you’re not sure of the blocking factor of an archive, you can read the archive by specifying ‘--read-full-records’ (‘-B’) and ‘--blocking-factor=512-size’ (‘-b 512-size’), using a blocking factor larger than what the archive uses. This lets you avoid having to determine the blocking factor of an archive. See section The Blocking Factor of an Archive.

Reading Full Records

‘--read-full-records’
‘-B’: Use in conjunction with ‘--extract’ (‘--get’, ‘-x’) to read an archive which contains incomplete records, or one which has a blocking factor less than the one specified.
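
A minimal sketch of reading an archive through a pipe follows. The file names are hypothetical. Since ‘-B’ is already enabled when tar reads from standard input, spelling it out here only makes the intent explicit.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
echo "hello" > note.txt
tar -cf archive.tar note.txt

# A pipe may deliver less data than one full record per read;
# '-B' tells tar to keep reading until the record is complete.
cat archive.tar | tar -B -tf -
```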

Ignoring Blocks of Zeros

Normally, tar stops reading when it encounters a block of zeros between file entries (which usually indicates the end of the archive). ‘--ignore-zeros’ (‘-i’) allows tar to completely read an archive which contains a block of zeros before the end (i.e., a damaged archive, or one that was created by concatenating several archives together). This option also suppresses warnings about missing or incomplete zero blocks at the end of the archive. These warnings can be turned back on, if need be, using the ‘--warning=alone-zero-block --warning=missing-zero-blocks’ options (see section Controlling Warning Messages).

The ‘--ignore-zeros’ (‘-i’) option is turned off by default because many versions of tar write garbage after the end-of-archive entry, since that part of the media is never supposed to be read. GNU tar does not write after the end of an archive, but seeks to maintain compatibility among archiving utilities.

‘--ignore-zeros’
‘-i’: Ignore blocks of zeros (i.e., end-of-archive entries) that may be encountered while reading an archive. Use in conjunction with ‘--extract’ or ‘--list’.
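
The concatenation case can be sketched as follows. The archive and member names are invented for the example; the point is only that the end-of-archive blocks of the first archive end up in the middle of the combined file.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
echo "one" > a.txt
echo "two" > b.txt
tar -cf first.tar a.txt
tar -cf second.tar b.txt

# Plain concatenation keeps the zero blocks that end first.tar.
cat first.tar second.tar > both.tar

# Stops at the first block of zeros, so only a.txt is listed.
tar -tf both.tar

# Reads past the embedded zero blocks and lists both members.
tar -itf both.tar
```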

4.4.2 Changing How `tar` Writes Files

Options Controlling the Overwriting of Existing Files

When extracting files, if tar discovers that the extracted file already exists, it normally replaces the file by removing it before extracting it, to prevent confusion in the presence of hard or symbolic links. (If the existing file is a symbolic link, it is removed, not followed.) However, if a directory cannot be removed because it is nonempty, tar normally overwrites its metadata (ownership, permission, etc.). The ‘--overwrite-dir’ option enables this default behavior. To be more cautious and preserve the metadata of such a directory, use the ‘--no-overwrite-dir’ option.

To be even more cautious and prevent existing files from being replaced, use the ‘--keep-old-files’ (‘-k’) option. It causes tar to refuse to replace or update a file that already exists, i.e., a file with the same name as an archive member prevents extraction of that archive member. Instead, it reports an error. For example:

$ ls
blues
$ tar -x -k -f archive.tar
tar: blues: Cannot open: File exists
tar: Exiting with failure status due to previous errors

If you wish to preserve old files untouched, but don’t want tar to treat them as errors, use the ‘--skip-old-files’ option. This option causes tar to silently skip extracting over existing files.

To be more aggressive about altering existing files, use the ‘--overwrite’ option. It causes tar to overwrite existing files and to follow existing symbolic links when extracting.

Some people argue that GNU tar should not hesitate to overwrite files with other files when extracting. When extracting a tar archive, they expect to see a faithful copy of the state of the file system when the archive was created. It is debatable that this would always be a proper behavior. For example, suppose one has an archive in which ‘usr/local’ is a link to ‘usr/local2’. Since then, maybe the site removed the link and renamed the whole hierarchy from ‘/usr/local2’ to ‘/usr/local’. Such things happen all the time. I guess it would not be welcome at all that GNU tar removes the whole hierarchy just to make room for the link to be reinstated (unless it also simultaneously restores the full ‘/usr/local2’, of course!) GNU tar is indeed able to remove a whole hierarchy to reestablish a symbolic link, for example, but only if ‘--recursive-unlink’ is specified to allow this behavior. In any case, single files are silently removed.

Finally, the ‘--unlink-first’ (‘-U’) option can improve performance in some cases by causing tar to remove files unconditionally before extracting them.

Overwrite Old Files

‘--overwrite’

Overwrite existing files and directory metadata when extracting files from an archive.

This causes tar to write extracted files into the file system without regard to the files already on the system; i.e., files with the same names as archive members are overwritten when the archive is extracted. It also causes tar to extract the ownership, permissions, and time stamps onto any preexisting files or directories. If the corresponding file name is a symbolic link, the file pointed to by the symbolic link will be overwritten instead of the symbolic link itself (if this is possible). Moreover, special devices, empty directories and even symbolic links are automatically removed if they are in the way of extraction.

Be careful when using the ‘--overwrite’ option, particularly when combined with the ‘--absolute-names’ (‘-P’) option, as this combination can change the contents, ownership or permissions of any file on your system. Also, many systems do not take kindly to overwriting files that are currently being executed.

‘--overwrite-dir’

Overwrite the metadata of directories when extracting files from an archive, but remove other files before extracting.

Keep Old Files

GNU tar provides two options to control its actions in a situation when it is about to extract a file which already exists on disk.

‘--keep-old-files’

‘-k’

Do not replace existing files from archive. When such a file is encountered, tar issues an error message. Upon end of extraction, tar exits with code 2 (see exit status).

‘--skip-old-files’

Do not replace existing files from archive, but do not treat that as error. Such files are silently skipped and do not affect tar exit status.

Additional verbosity can be obtained using ‘--warning=existing-file’ together with that option (see section Controlling Warning Messages).
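
The difference between the two options, and the default replacement behavior, can be tried out with a sketch like the one below. The directory and file names are placeholders chosen for the example.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
mkdir src dest
echo "new" > src/blues
(cd src && tar -cf ../archive.tar blues)
echo "old" > dest/blues

# --keep-old-files: the clash is an error, so tar exits nonzero.
keep_status=0
(cd dest && tar -x -k -f ../archive.tar) 2>/dev/null || keep_status=$?
after_keep=$(cat dest/blues)
echo "--keep-old-files: exit status $keep_status, content: $after_keep"

# --skip-old-files: the member is skipped and the exit status is unaffected.
skip_status=0
(cd dest && tar -x --skip-old-files -f ../archive.tar) || skip_status=$?
after_skip=$(cat dest/blues)
echo "--skip-old-files: exit status $skip_status, content: $after_skip"

# Default behavior: the existing file is replaced.
(cd dest && tar -x -f ../archive.tar)
echo "default: content: $(cat dest/blues)"
```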

Keep Newer Files

‘--keep-newer-files’: Do not replace existing files that are newer than their archive copies. This option is meaningless with ‘--list’ (‘-t’).
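
A small sketch of this option is shown below. It assumes GNU touch for backdating the archived copy, and the file names are hypothetical. The exit status is printed rather than assumed, since tar also reports the skipped member on standard error.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
mkdir src dest
echo "archived copy" > src/report.txt
# Give the archived copy an old modification time (GNU touch syntax).
touch -d '2020-01-01 00:00:00' src/report.txt
(cd src && tar -cf ../archive.tar report.txt)

# The copy on disk is written just now, so it is newer than the member.
echo "local edits" > dest/report.txt

status=0
(cd dest && tar -x --keep-newer-files -f ../archive.tar) || status=$?
echo "exit status: $status"

# The newer file on disk is left alone.
cat dest/report.txt
```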

Unlink First

‘--unlink-first’
‘-U’: Remove files before extracting over them. This can make tar run a bit faster if you know in advance that the extracted files all need to be removed. Normally this option slows tar down slightly, so it is disabled by default.

Recursive Unlink

‘--recursive-unlink’: When this option is specified, try removing files and directory hierarchies before extracting over them. This is a dangerous option!

If you specify the ‘--recursive-unlink’ option, tar removes anything that keeps you from extracting a file as far as current permissions will allow it. This could include removal of the contents of a full directory hierarchy.

Setting Data Modification Times

Normally, tar sets the data modification times of extracted files to the corresponding times recorded for the files in the archive, but limits the permissions of extracted files by the current umask setting.

To set the data modification times of extracted files to the time when the files were extracted, use the ‘--touch’ (‘-m’) option in conjunction with ‘--extract’ (‘--get’, ‘-x’).

‘--touch’
‘-m’: Sets the data modification time of extracted archive members to the time they were extracted, not the time recorded for them in the archive. Use in conjunction with ‘--extract’ (‘--get’, ‘-x’).
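
To see the effect, one can extract the same archive twice, once with and once without the option. The sketch below assumes GNU touch and GNU stat; the directory names are made up for the example.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
mkdir src plain touched
echo "data" > src/file.txt
touch -d '2020-01-01 00:00:00' src/file.txt   # GNU touch syntax
(cd src && tar -cf ../archive.tar file.txt)

(cd plain && tar -x -f ../archive.tar)
(cd touched && tar -x -m -f ../archive.tar)

# 'plain' keeps the recorded time stamp; 'touched' shows the extraction time.
stat -c '%y %n' plain/file.txt touched/file.txt
```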

Setting Access Permissions

To set the modes (access permissions) of extracted files to those recorded for those files in the archive, use ‘--same-permissions’ in conjunction with the ‘--extract’ (‘--get’, ‘-x’) operation.

‘--preserve-permissions’
‘--same-permissions’
‘-p’: Set modes of extracted archive members to those recorded in the archive, instead of current umask settings. Use in conjunction with ‘--extract’ (‘--get’, ‘-x’).
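
The interaction with the umask can be sketched as follows. The example assumes GNU stat, and it passes ‘--no-same-permissions’ explicitly for the first extraction so that the result does not depend on whether tar is run by the superuser. File and directory names are placeholders.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
mkdir src masked preserved
echo "data" > src/shared.txt
chmod 666 src/shared.txt
(cd src && tar -cf ../archive.tar shared.txt)

umask 077
# umask applied: the group and other bits are stripped on extraction.
(cd masked && tar -x --no-same-permissions -f ../archive.tar)
# -p: the recorded mode is restored despite the umask.
(cd preserved && tar -x -p -f ../archive.tar)

stat -c '%a %n' masked/shared.txt preserved/shared.txt
```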

Directory Modification Times and Permissions

After successfully extracting a file member, GNU tar normally restores its permissions and modification times, as described in the previous sections. This cannot be done for directories, because after extracting a directory tar will almost certainly extract files into that directory and this will cause the directory modification time to be updated. Moreover, restoring that directory's permissions may not permit file creation within it. Thus, restoring directory permissions and modification times must be delayed at least until all files have been extracted into that directory. GNU tar restores directories using the following approach.

The extracted directories are created with the mode specified in the archive, as modified by the umask of the user, which gives sufficient permissions to allow file creation. The meta-information about the directory is recorded in a temporary list of directories. When preparing to extract the next archive member, GNU tar checks if the directory prefix of this file contains the remembered directory. If it does not, the program assumes that all files have been extracted into that directory, restores its modification time and permissions and removes its entry from the internal list. This approach restores directory meta-information correctly in the majority of cases, while keeping memory requirements sufficiently small. It is based on the fact that most tar archives use the predefined order of members: first the directory, then all the files and subdirectories in that directory.

However, this is not always true. The most important exceptions are incremental archives (see section Using tar to Perform Incremental Dumps). The member order in an incremental archive is reversed: first all directory members are stored, followed by other (non-directory) members. So, when extracting from incremental archives, GNU tar alters the above procedure. It remembers all restored directories, and restores their meta-data only after the entire archive has been processed. Notice that you do not need to specify any special options for that, as GNU tar automatically detects archives in incremental format.

There may be cases when such processing is required for normal archives too. Consider the following example:

$ tar --no-recursion -cvf archive \
    foo foo/file1 bar bar/file foo/file2
foo/
foo/file1
bar/
bar/file
foo/file2

During the normal operation, after encountering ‘bar’ GNU tar will assume that all files from the directory ‘foo’ were already extracted and will therefore restore its timestamp and permission bits. However, after extracting ‘foo/file2’ the directory timestamp will be offset again.

To correctly restore directory meta-information in such cases, use the ‘--delay-directory-restore’ command line option:

‘--delay-directory-restore’: Delays restoring of the modification times and permissions of extracted directories until the end of extraction. This way, correct meta-information is restored even if the archive has unusual member ordering.
‘--no-delay-directory-restore’: Cancel the effect of the previous ‘--delay-directory-restore’. Use this option if you have used ‘--delay-directory-restore’ in TAR_OPTIONS variable (see TAR_OPTIONS) and wish to temporarily disable it.
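
The following sketch rebuilds an archive with the unusual member order from the example above and extracts it with ‘--delay-directory-restore’. It assumes GNU touch and GNU stat; the src and out directories are hypothetical scratch locations.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
mkdir src out
mkdir src/foo src/bar
touch src/foo/file1 src/foo/file2 src/bar/file
# Backdate 'foo' after populating it (GNU touch syntax).
touch -d '2001-01-01 00:00:00' src/foo

# 'foo/file2' is stored after 'bar', away from the rest of 'foo'.
(cd src && tar --no-recursion -cf ../archive \
    foo foo/file1 bar bar/file foo/file2)

(cd out && tar --delay-directory-restore -xf ../archive)

# Both lines should show the same modification time for 'foo'.
stat -c '%y %n' src/foo out/foo
```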

Writing to Standard Output

To write the extracted files to the standard output, instead of creating the files on the file system, use ‘--to-stdout’ (‘-O’) in conjunction with ‘--extract’ (‘--get’, ‘-x’). This option is useful if you are extracting files to send them through a pipe, and do not need to preserve them in the file system. If you extract multiple members, they appear on standard output concatenated, in the order they are found in the archive.

‘--to-stdout’
‘-O’: Writes files to the standard output. Use only in conjunction with ‘--extract’ (‘--get’, ‘-x’). When this option is used, instead of creating the files specified, tar writes the contents of the files extracted to its standard output. This may be useful if you are only extracting the files in order to send them through a pipe. This option is meaningless with ‘--list’ (‘-t’).

This can be useful, for example, if you have a tar archive containing a big file and don’t want to store the file on disk before processing it. You can use a command like this:

tar -xOzf foo.tgz bigfile | process

or even like this if you want to process the concatenation of the files:

tar -xOzf foo.tgz bigfile1 bigfile2 | process

However, ‘--to-command’ may be more convenient for use with multiple files. See the next section.
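
A self-contained variant of these commands, using wc as a stand-in for the processing program, might look like this. The member contents are invented for the example, and gzip must be available for the ‘-z’ option.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
printf 'alpha\nbeta\n' > bigfile1
printf 'gamma\n' > bigfile2
tar -czf foo.tgz bigfile1 bigfile2

# Count the lines of both members without extracting either to disk.
tar -xOzf foo.tgz bigfile1 bigfile2 | wc -l
```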

Writing to an External Program

You can instruct tar to send the contents of each extracted file to the standard input of an external program:

‘--to-command=command’

Extract files and pipe their contents to the standard input of command. When this option is used, instead of creating the files specified, tar invokes command and pipes the contents of the files to its standard input. The command may contain command line arguments (see Running External Commands, for more detail).

Notice that command is executed once for each regular file extracted. Non-regular files (directories, etc.) are ignored when this option is used.

The command can obtain the information about the file it processes from the following environment variables:

TAR_FILETYPE

Type of the file. It is a single letter with the following meaning:

f	Regular file
d	Directory
l	Symbolic link
h	Hard link
b	Block device
c	Character device

Currently only regular files are supported.

TAR_MODE

File mode, an octal number.

TAR_FILENAME

The name of the file.

TAR_REALNAME

Name of the file as stored in the archive.

TAR_UNAME

Name of the file owner.

TAR_GNAME

Name of the file owner group.

TAR_ATIME

Time of last access. It is a decimal number, representing seconds since the Epoch. If the archive provides times with nanosecond precision, the nanoseconds are appended to the timestamp after a decimal point.

TAR_MTIME

Time of last modification.

TAR_CTIME

Time of last status change.

TAR_SIZE

Size of the file.

TAR_UID

UID of the file owner.

TAR_GID

GID of the file owner group.

Additionally, the following variables contain information about tar mode and the archive being processed:

TAR_VERSION: GNU tar version number.
TAR_ARCHIVE: The name of the archive tar is processing.
TAR_BLOCKING_FACTOR: Current blocking factor (see section Blocking).
TAR_VOLUME: Ordinal number of the volume tar is processing.
TAR_FORMAT: Format of the archive being processed. See section Controlling the Archive Format, for a complete list of archive format names.

These variables are defined prior to executing the command, so you can pass them as arguments, if you prefer. For example, if the command proc takes the member name and size as its arguments, then you could do:

$ tar -x -f archive.tar \
       --to-command='proc $TAR_FILENAME $TAR_SIZE'

Note the single quotes, which prevent the variable names from being expanded by the shell when invoking tar.

If command exits with a non-0 status, tar will print an error message similar to the following:

tar: 2345: Child returned status 1

Here, ‘2345’ is the PID of the finished process.

If this behavior is not wanted, use ‘--ignore-command-error’:

‘--ignore-command-error’: Ignore exit codes of subprocesses. Notice that if the program exits on signal or otherwise terminates abnormally, the error message will be printed even if this option is used.
‘--no-ignore-command-error’: Cancel the effect of any previous ‘--ignore-command-error’ option. This option is useful if you have set ‘--ignore-command-error’ in TAR_OPTIONS (see TAR_OPTIONS) and wish to temporarily cancel it.
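
As a runnable sketch of the mechanism, the command below stands in for the hypothetical proc: it prints the member name and size taken from the environment, then discards the member contents arriving on its standard input. The archive and member names are made up for the example.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
echo "hello" > a.txt
echo "hi" > b.txt
tar -cf archive.tar a.txt b.txt

# Single quotes keep the variables from being expanded by this shell;
# they are expanded once per member when tar runs the command.
# 'cat >/dev/null' drains the member contents from standard input.
tar -x -f archive.tar \
    --to-command='echo "$TAR_FILENAME $TAR_SIZE"; cat >/dev/null'
```

Nothing is written to the file system by the extraction itself; each member only passes through the command.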

Removing Files

‘--remove-files’: Remove files after adding them to the archive.
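
A short sketch, with hypothetical file names, shows the effect:

```shell
set -eu
work=$(mktemp -d) && cd "$work"
echo "scratch" > old.log
tar --remove-files -cf logs.tar old.log

tar -tf logs.tar   # the member is in the archive
ls                 # old.log is gone from the directory
```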

4.4.3 Coping with Scarce Resources

Starting File

‘--starting-file=name’
‘-K name’: Starts an operation in the middle of an archive. Use in conjunction with ‘--extract’ (‘--get’, ‘-x’) or ‘--list’ (‘-t’).

If a previous attempt to extract files failed due to lack of disk space, you can use ‘--starting-file=name’ (‘-K name’) to start extracting only after member name of the archive. This assumes, of course, that there is now free space, or that you are now extracting into a different file system. (You could also choose to suspend tar, remove unnecessary files from the file system, and then resume the same tar operation. In this case, ‘--starting-file’ is not necessary.) See also Asking for Confirmation During Operations, and Excluding Some Files.

Same Order

‘--same-order’
‘--preserve-order’
‘-s’: Process large lists of file names on machines with small amounts of memory. Use in conjunction with ‘--compare’ (‘--diff’, ‘-d’), ‘--list’ (‘-t’) or ‘--extract’ (‘--get’, ‘-x’).

The ‘--same-order’ (‘--preserve-order’, ‘-s’) option tells tar that the list of file names to be listed or extracted is sorted in the same order as the files in the archive. This allows a large list of names to be used, even on a small machine that would not otherwise be able to hold all the names in memory at the same time. Such a sorted list can easily be created by running ‘tar -t’ on the archive and editing its output.

This option is probably never needed on modern computer systems.

4.5 Backup options

GNU tar offers options for making backups of files before writing new versions. These options control the details of these backups. They may apply to the archive itself before it is created or rewritten, as well as individual extracted members. Other GNU programs (cp, install, ln, and mv, for example) offer similar options.

Backup options may prove unexpectedly useful when extracting archives containing many members having identical names, or when extracting archives on systems having file name limitations, making different members appear to have similar names through the side effect of name truncation.

When any existing file is backed up before being overwritten by extraction, clashing files are automatically renamed to be unique, and the true name is kept for only the last file of a series of clashing files. By using verbose mode, users may track exactly what happens.

At the detail level, some decisions are still experimental and may change in the future; we are awaiting comments from our users. So, please do not learn to depend blindly on the details of the backup features. For example, currently, directories themselves are never renamed through using these options, so extracting a file over a directory is still likely to fail. Also, backup options apply to created archives, not only to extracted members. For created archives, backups will not be attempted when the archive is a block or character device, or when it refers to a remote file.

For the sake of simplicity and efficiency, backups are made by renaming old files prior to creation or extraction, and not by copying. The original name is restored if the file creation fails. If a failure occurs after a partial extraction of a file, both the backup and the partially extracted file are kept.

‘--backup[=method]’

Back up files that are about to be overwritten or removed. Without this option, the original versions are destroyed.

Use method to determine the type of backups made. If method is not specified, use the value of the VERSION_CONTROL environment variable. And if VERSION_CONTROL is not set, use the ‘existing’ method.

This option corresponds to the Emacs variable ‘version-control’; the same values for method are accepted as in Emacs. This option also allows more descriptive names. The valid methods are:

‘t’
‘numbered’: Always make numbered backups.
‘nil’
‘existing’: Make numbered backups of files that already have them, simple backups of the others.
‘never’
‘simple’: Always make simple backups.

‘--suffix=suffix’

Append suffix to each backup file made with ‘--backup’. If this option is not specified, the value of the SIMPLE_BACKUP_SUFFIX environment variable is used. And if SIMPLE_BACKUP_SUFFIX is not set, the default is ‘~’, just as in Emacs.
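
The two naming schemes can be compared with a sketch like the following. The file names and the ‘.orig’ suffix are arbitrary choices for the example.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
mkdir src dest
echo "new" > src/notes.txt
(cd src && tar -cf ../archive.tar notes.txt)

# Numbered backup: the old file is renamed to notes.txt.~1~.
echo "old" > dest/notes.txt
(cd dest && tar -x --backup=numbered -f ../archive.tar)

# Simple backup with a custom suffix: the current file is renamed
# to notes.txt.orig before the member is extracted again.
(cd dest && tar -x --backup=simple --suffix=.orig -f ../archive.tar)

ls dest
```

Adding ‘--verbose’ to the extraction commands makes tar report each renaming as it happens.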

4.6 Looking Ahead: The Rest of this Manual

You have now seen how to use all eight of the operations available to tar, and a number of the possible options. The next chapter explains how to choose and change file and archive names, how to use files to store names of other files which you can then call as arguments to tar (this can help you save time if you expect to archive the same list of files a number of times), and so forth.

If there are too many files to conveniently list on the command line, you can list the names in a file, and tar will read that file. See section Reading Names from a File.

There are various ways of causing tar to skip over some files, and not archive them. See section Choosing Files and Names for tar.

This document was generated on August 23, 2023 using texi2html 5.0.
