GNU tar 1.35: 6 Choosing Files and Names for tar
6 Choosing Files and Names for tar

Certain options to tar enable you to specify a name for your archive. Other options let you decide which files to include or exclude from the archive, based on when or whether files were modified, whether the file names do or don't match specified patterns, or whether files are in specified directories.

This chapter discusses these options in detail.

    6.1 Choosing and Naming Archive Files    Choosing the Archive's Name
    6.2 Selecting Archive Members
    6.3 Reading Names from a File
    6.4 Excluding Some Files
    6.5 Wildcards Patterns and Matching
    6.6 Quoting Member Names    Ways of Quoting Special Characters in Names
    6.7 Modifying File and Member Names
    6.8 Operating Only on New Files
    6.9 Descending into Directories
    6.10 Crossing File System Boundaries

6.1 Choosing and Naming Archive Files

By default, tar uses an archive file name that was compiled when it was built on the system; usually this name refers to some physical tape drive on the machine. However, the person who installed tar on the system may not have set the default to a meaningful value as far as most users are concerned. As a result, you will usually want to tell tar where to find (or create) the archive. The ‘--file=archive-name’ (‘-f archive-name’) option allows you to either specify or name a file to use as the archive instead of the default archive file location.

‘--file=archive-name’
‘-f archive-name’
    Name the archive to create or operate on. Use in conjunction with any operation.

For example, in this tar command,

    $ tar -cvf collection.tar blues folk jazz

‘collection.tar’ is the name of the archive. It must directly follow the ‘-f’ option, since whatever directly follows ‘-f’ will end up naming the archive.
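The placement rule for ‘-f’ can be tried out in a scratch directory. The following is a minimal sketch, assuming GNU tar and a POSIX shell; the scratch directory and the file ‘blues/track.txt’ are hypothetical names introduced only for this illustration.

```shell
#!/bin/sh
# Hypothetical scratch area; all names below are illustrative only.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir blues folk jazz
echo "sample" > blues/track.txt

# Correct: the archive name directly follows -f.
tar -cvf collection.tar blues folk jazz

# The same request written with long options.
tar --create --verbose --file=collection2.tar blues folk jazz

# Pitfall (not run here): in 'tar -cfv collection.tar blues' the letter
# 'v' is what directly follows -f, so the archive would be named 'v'.

# List the members to confirm what was stored.
members=$(tar -tf collection.tar)
printf '%s\n' "$members"
```

Keeping ‘f’ as the last letter of a bundled option group is a simple habit that avoids the pitfall noted in the comment.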
If you neglect to specify an archive name, you may end up overwriting a file in the working directory with the archive you create since tar will use this file's name for the archive name.

An archive can be saved as a file in the file system, sent through a pipe or over a network, or written to an I/O device such as a tape, floppy disk, or CD write drive.

If you do not name the archive, tar uses the value of the environment variable TAPE as the file name for the archive. If that is not available, tar uses a default, compiled-in archive name, usually that for tape unit zero (i.e., ‘/dev/tu00’).

If you use ‘-’ as an archive-name, tar reads the archive from standard input (when listing or extracting files), or writes it to standard output (when creating an archive). If you use ‘-’ as an archive-name when modifying an archive, tar reads the original archive from its standard input and writes the entire new archive to its standard output.

The following example is a convenient way of copying a directory hierarchy from ‘sourcedir’ to ‘targetdir’.

    $ (cd sourcedir; tar -cf - .) | (cd targetdir; tar -xpf -)

The ‘-C’ option lets you avoid using subshells:

    $ tar -C sourcedir -cf - . | tar -C targetdir -xpf -

In both examples above, the leftmost tar invocation archives the contents of ‘sourcedir’ to the standard output, while the rightmost one reads this archive from its standard input and extracts it. The ‘-p’ option tells it to restore permissions of the extracted files.

To specify an archive file on a device attached to a remote machine, use the following:

    --file=hostname:/dev/file-name

tar will set up the remote connection, if possible, and prompt you for a username and password. If you use ‘--file=@hostname:/dev/file-name’, tar will attempt to set up the remote connection using your username as the username on the remote machine.

If the archive file name includes a colon (‘:’), then it is assumed to be a file on another machine.
If the archive file is ‘user@host:file’, then file is used on the host host. The remote host is accessed using the rsh program, with a username of user. If the username is omitted (along with the ‘@’ sign), then your user name will be used. (This is the normal rsh behavior.) It is necessary for the remote machine, in addition to permitting your rsh access, to have the ‘rmt’ program installed (this command is included in the GNU tar distribution and by default is installed under ‘prefix/libexec/rmt’, where prefix means your installation prefix). If you need to use a file whose name includes a colon, then the remote tape drive behavior can be inhibited by using the ‘--force-local’ option.

When the archive is being created to ‘/dev/null’, GNU tar tries to minimize input and output operations. The Amanda backup system, when used with GNU tar, has an initial sizing pass which uses this feature.

6.2 Selecting Archive Members

File Name arguments specify which files in the file system tar operates on, when creating or adding to an archive, or which archive members tar operates on, when reading or deleting from an archive. See section The Five Advanced tar Operations.

To specify file names, you can include them as the last arguments on the command line, as follows:

    tar operation [option1 option2 …] [file name-1 file name-2 …]

If a file name begins with a dash (‘-’), precede it with the ‘--add-file’ option to prevent it from being treated as an option.

By default GNU tar attempts to unquote each file or member name, replacing escape sequences according to the following table:

    Escape    Replaced with
    \a        Audible bell (ASCII 7)
    \b        Backspace (ASCII 8)
    \f        Form feed (ASCII 12)
    \n        New line (ASCII 10)
    \r        Carriage return (ASCII 13)
    \t        Horizontal tabulation (ASCII 9)
    \v        Vertical tabulation (ASCII 11)
    \?        ASCII 127
    \<n>      ASCII <n> (<n> should be an octal number of up to 3 digits)

A backslash followed by any other symbol is retained.

This default behavior is controlled by the following command line options:

‘--unquote’
    Enable unquoting input file or member names (default).

‘--no-unquote’
    Disable unquoting input file or member names.

If you specify a directory name as a file name argument, all the files in that directory are operated on by tar.

If you do not specify files, tar behavior differs depending on the operation mode as described below:

When tar is invoked with ‘--create’ (‘-c’), tar will stop immediately, reporting the following:

    $ tar cf a.tar
    tar: Cowardly refusing to create an empty archive
    Try 'tar --help' or 'tar --usage' for more information.

If you specify either ‘--list’ (‘-t’) or ‘--extract’ (‘--get’, ‘-x’), tar operates on all the archive members in the archive.

If run with the ‘--diff’ option, tar will compare the archive with the contents of the current working directory.

If you specify any other operation, tar does nothing.

By default, tar takes file names from the command line. However, there are other ways to specify file or member names, or to modify the manner in which tar selects the files or members upon which to operate. In general, these methods work both for specifying the names of files and archive members.

6.3 Reading Names from a File

Instead of giving the names of files or archive members on the command line, you can put the names into a file, and then use the ‘--files-from=file-of-names’ (‘-T file-of-names’) option to tar. Give the name of the file which contains the list of files to include as the argument to ‘--files-from’. In the list, the file names should be separated by newlines. You will frequently use this option when you have generated the list of files to archive with the find utility.
‘--files-from=file-name’
‘-T file-name’
    Get names to extract or create from file file-name.

If you give a single dash as a file name for ‘--files-from’ (i.e., you specify either --files-from=- or -T -), then the file names are read from standard input.

Unless you are running tar with ‘--create’, you cannot use both --files-from=- and --file=- (-f -) in the same command.

Any number of ‘-T’ options can be given in the command line.

The following example shows how to use find to generate a list of files smaller than 400 blocks in length (15) and put that list into a file called ‘small-files’. You can then use the ‘-T’ option to tar to specify the files from that file, ‘small-files’, to create the archive ‘little.tgz’. (The ‘-z’ option to tar compresses the archive with gzip; see section Creating and Reading Compressed Archives for more information.)

    $ find . -size -400 -print > small-files
    $ tar -c -v -z -T small-files -f little.tgz

By default, each line read from the file list is first stripped of any leading and trailing whitespace. If the resulting string begins with a ‘-’ character, it is considered a tar option and is processed accordingly (16). Only a subset of GNU tar options is allowed for use in file lists. For a list of such options, see Position-Sensitive Options.

For example, a common use of this feature is to change to another directory by specifying the ‘-C’ option:

    $ cat list
    -C/etc
    passwd
    hosts
    -C/lib
    libc.a
    $ tar -c -f foo.tar --files-from list

In this example, tar will first switch to the ‘/etc’ directory and add the files ‘passwd’ and ‘hosts’ to the archive. Then it will change to the ‘/lib’ directory and archive the file ‘libc.a’. Thus, the resulting archive ‘foo.tar’ will contain:

    $ tar tf foo.tar
    passwd
    hosts
    libc.a

Note that any options used in the file list remain in effect for the rest of the command line.
For example, using the same ‘list’ file as above, the following command

    $ tar -c -f foo.tar --files-from list libcurses.a

will look for the file ‘libcurses.a’ in the directory ‘/lib’, because it was used with the last ‘-C’ option (see section Position-Sensitive Options).

If such option handling is undesirable, use the ‘--verbatim-files-from’ option. When this option is in effect, each line read from the file list is treated as a file name. Notice that this means, in particular, that no whitespace trimming is performed.

The ‘--verbatim-files-from’ option affects all ‘-T’ options that follow it in the command line. The default behavior can be restored using the ‘--no-verbatim-files-from’ option.

To disable option handling for a single file name, use the ‘--add-file’ option, e.g.: --add-file=--my-file.

You can use any GNU tar command line options in the file list file, including the ‘--files-from’ option itself. This allows for including the contents of a file list in another file list file. Note, however, that options that control file list processing, such as ‘--verbatim-files-from’ or ‘--null’, won't affect the file they appear in. They will affect the next ‘--files-from’ option, if there is any.

6.3.1 NUL-Terminated File Names

The ‘--null’ option causes ‘--files-from=file-of-names’ (‘-T file-of-names’) to read file names terminated by a NUL instead of a newline, so files whose names contain newlines can be archived using ‘--files-from’.

‘--null’
    Only consider NUL-terminated file names, instead of files that terminate in a newline.

‘--no-null’
    Undo the effect of any previous ‘--null’ option.

The ‘--null’ option is just like the one in GNU xargs and cpio, and is useful with the ‘-print0’ predicate of GNU find.
In tar, ‘--null’ also disables special handling for file names that begin with a dash (similar to the ‘--verbatim-files-from’ option).

This example shows how to use find to generate a list of files larger than 800 blocks in length and put that list into a file called ‘long-files’. The ‘-print0’ option to find is just like ‘-print’, except that it separates files with a NUL rather than with a newline. You can then run tar with both the ‘--null’ and ‘-T’ options to specify that tar gets the files from that file, ‘long-files’, to create the archive ‘big.tar’. The ‘--null’ option to tar will cause tar to recognize the NUL separator between files.

    $ find . -size +800 -print0 > long-files
    $ tar -c -v --null --files-from=long-files --file=big.tar

The ‘--no-null’ option can be used if you need to read both NUL-terminated and newline-terminated files on the same command line. For example, if ‘flist’ is a newline-terminated file, then the following command can be used to combine it with the above command:

    $ find . -size +800 -print0 | tar -c -f big.tar --null -T - --no-null -T flist

This example uses short options for typographic reasons, to avoid very long lines.

GNU tar tries to automatically detect NUL-terminated file lists, so in many cases it is safe to use them even without the ‘--null’ option. In this case tar will print a warning and continue reading such a file as if ‘--null’ were actually given:

    $ find . -size +800 -print0 | tar -c -f big.tar -T -
    tar: -: file name read contains nul character

The null terminator, however, remains in effect only for this particular file; any following ‘-T’ options will assume newline termination. Of course, the null autodetection applies to these eventual surplus ‘-T’ options as well.

6.4 Excluding Some Files

To avoid operating on files whose names match a particular pattern, use the ‘--exclude’ or ‘--exclude-from’ options.
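As a quick illustration of the two options just named, here is a minimal sketch, assuming GNU tar and a POSIX shell. The scratch directory, the ‘src’ tree, and the ‘patterns’ file are hypothetical names made up for the example.

```shell
#!/bin/sh
# Hypothetical scratch tree; all names are illustrative only.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir src
touch src/main.c src/main.o src/util.c src/util.o

# Pattern given on the command line. The quotes keep the shell from
# expanding '*.o' before tar sees it.
tar -cf src.tar --exclude='*.o' src

# The same pattern read from a file, one pattern per line.
printf '%s\n' '*.o' > patterns
tar -cf src2.tar -X patterns src

# Both archives should list the same members, none ending in .o.
kept=$(tar -tf src.tar | sort)
kept2=$(tar -tf src2.tar | sort)
printf '%s\n' "$kept"
```

Writing the pattern file with printf avoids stray trailing whitespace, which matters because the lines of an exclude file are read verbatim.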
‘--exclude=pattern’
    Causes tar to ignore files that match the pattern.

The ‘--exclude=pattern’ option prevents any file or member whose name matches the shell wildcard (pattern) from being operated on. For example, to create an archive with all the contents of the directory ‘src’ except for files whose names end in ‘.o’, use the command ‘tar -cf src.tar --exclude='*.o' src’.

You may give multiple ‘--exclude’ options.

‘--exclude-from=file’
‘-X file’
    Causes tar to ignore files that match the patterns listed in file.

Use the ‘--exclude-from’ option to read a list of patterns, one per line, from file; tar will ignore files matching those patterns. Thus if tar is called as ‘tar -c -X foo .’ and the file ‘foo’ contains a single line ‘*.o’, no files whose names end in ‘.o’ will be added to the archive.

Notice that lines from file are read verbatim. A frequent error is leaving some extra whitespace after a file name, which is difficult to catch using text editors.

However, empty lines are OK.

When archiving directories that are under some version control system (VCS), it is often convenient to read exclusion patterns from this VCS' ignore files (e.g. ‘.cvsignore’, ‘.gitignore’, etc.). The following options provide this possibility:

‘--exclude-vcs-ignores’
    Before archiving a directory, see if it contains any of the following files: ‘.cvsignore’, ‘.gitignore’, ‘.bzrignore’, or ‘.hgignore’. If so, read ignore patterns from these files.

    The patterns are treated much as the corresponding VCS would treat them, i.e.:

    ‘.cvsignore’
        Contains shell-style globbing patterns that apply only to the directory where this file resides. No comments are allowed in the file. Empty lines are ignored.

    ‘.gitignore’
        Contains shell-style globbing patterns. Applies to the directory where ‘.gitignore’ is located and all its subdirectories. Any line beginning with a ‘#’ is a comment. Backslash escapes the comment character.
    ‘.bzrignore’
        Contains shell globbing patterns and regular expressions (if prefixed with ‘RE:’) (17). Patterns affect the directory and all its subdirectories. Any line beginning with a ‘#’ is a comment.

    ‘.hgignore’
        Contains POSIX regular expressions (18). The line ‘syntax: glob’ switches to shell globbing patterns. The line ‘syntax: regexp’ switches back. Comments begin with a ‘#’. Patterns affect the directory and all its subdirectories.

‘--exclude-ignore=file’
    Before dumping a directory, tar checks if it contains file. If so, exclusion patterns are read from this file. The patterns affect only the directory itself.

‘--exclude-ignore-recursive=file’
    Same as ‘--exclude-ignore’, except that the patterns read affect both the directory where file resides and all its subdirectories.

‘--exclude-vcs’
    Exclude files and directories used by the following version control systems: ‘CVS’, ‘RCS’, ‘SCCS’, ‘SVN’, ‘Arch’, ‘Bazaar’, ‘Mercurial’, and ‘Darcs’.

    As of version 1.35, the following files are excluded:

        ‘CVS/’, and everything under it
        ‘RCS/’, and everything under it
        ‘SCCS/’, and everything under it
        ‘.git/’, and everything under it
        ‘.gitignore’
        ‘.gitmodules’
        ‘.gitattributes’
        ‘.cvsignore’
        ‘.svn/’, and everything under it
        ‘.arch-ids/’, and everything under it
        ‘{arch}/’, and everything under it
        ‘=RELEASE-ID’
        ‘=meta-update’
        ‘=update’
        ‘.bzr’
        ‘.bzrignore’
        ‘.bzrtags’
        ‘.hg’
        ‘.hgignore’
        ‘.hgtags’
        ‘_darcs’

‘--exclude-backups’
    Exclude backup and lock files. This option causes exclusion of files that match the following shell globbing patterns:

        .#*
        *~
        #*#

When creating an archive, the ‘--exclude-caches’ option family causes tar to exclude all directories that contain a cache directory tag. A cache directory tag is a short file with the well-known name ‘CACHEDIR.TAG’ and having a standard header specified in http://www.brynosaurus.com/cachedir/spec.html.
Various applications write cache directory tags into directories they use to hold regenerable, non-precious data, so that such data can be more easily excluded from backups.

There are three ‘exclude-caches’ options, each providing different exclusion semantics:

‘--exclude-caches’
    Do not archive the contents of the directory, but archive the directory itself and the ‘CACHEDIR.TAG’ file.

‘--exclude-caches-under’
    Do not archive the contents of the directory, nor the ‘CACHEDIR.TAG’ file, archive only the directory itself.

‘--exclude-caches-all’
    Omit directories containing the ‘CACHEDIR.TAG’ file entirely.

Another option family, ‘--exclude-tag’, provides a generalization of this concept. It takes a single argument, a file name to look for. Any directory that contains this file will be excluded from the dump. Similarly to ‘exclude-caches’, there are three options in this option family:

‘--exclude-tag=file’
    Do not dump the contents of the directory, but dump the directory itself and the file.

‘--exclude-tag-under=file’
    Do not dump the contents of the directory, nor the file, archive only the directory itself.

‘--exclude-tag-all=file’
    Omit directories containing file entirely.

Multiple ‘--exclude-tag*’ options can be given.

For example, given this directory:

    $ find dir
    dir
    dir/blues
    dir/jazz
    dir/folk
    dir/folk/tagfile
    dir/folk/sanjuan
    dir/folk/trote

The ‘--exclude-tag’ option will produce the following:

    $ tar -cf archive.tar --exclude-tag=tagfile -v dir
    dir/
    dir/blues
    dir/jazz
    dir/folk/
    tar: dir/folk/: contains a cache directory tag tagfile; contents not dumped
    dir/folk/tagfile

Both the ‘dir/folk’ directory and its tagfile are preserved in the archive; however, the rest of the files in this directory are not.
Now, using the ‘--exclude-tag-under’ option will exclude ‘tagfile’ from the dump, while still preserving the directory itself, as shown in this example:

    $ tar -cf archive.tar --exclude-tag-under=tagfile -v dir
    dir/
    dir/blues
    dir/jazz
    dir/folk/
    ./tar: dir/folk/: contains a cache directory tag tagfile; contents not dumped

Finally, using ‘--exclude-tag-all’ omits the ‘dir/folk’ directory entirely:

    $ tar -cf archive.tar --exclude-tag-all=tagfile -v dir
    dir/
    dir/blues
    dir/jazz
    ./tar: dir/folk/: contains a cache directory tag tagfile; directory not dumped

Problems with Using the exclude Options

Some users find ‘exclude’ options confusing. Here are some common pitfalls:

The main operating mode of tar does not act on a file name explicitly listed on the command line, if one of its file name components is excluded. In the example above, if you create an archive and exclude files that end with ‘*.o’, but explicitly name the file ‘dir.o/foo’ after all the options have been listed, ‘dir.o/foo’ will be excluded from the archive.

You can sometimes confuse the meanings of ‘--exclude’ and ‘--exclude-from’. Be careful: use ‘--exclude’ when files to be excluded are given as a pattern on the command line. Use ‘--exclude-from’ to introduce the name of a file which contains a list of patterns, one per line; each of these patterns can exclude zero, one, or many files.

When you use ‘--exclude=pattern’, be sure to quote the pattern parameter, so GNU tar sees wildcard characters like ‘*’. If you do not do this, the shell might expand the ‘*’ itself using files at hand, so tar might receive a list of files instead of one pattern, or none at all, making the command invalid. This might not correspond to what you want.

For example, write:

    $ tar -c -f archive.tar --exclude '*.o' directory

rather than:

    # Wrong!
    $ tar -c -f archive.tar --exclude *.o directory

You must use shell syntax, or globbing, rather than regexp syntax, when using exclude options in tar. If you try to use regexp syntax to describe files to be excluded, your command might fail.

In earlier versions of tar, what is now the ‘--exclude-from’ option was called ‘--exclude’ instead. Now, ‘--exclude’ applies to patterns listed on the command line and ‘--exclude-from’ applies to patterns listed in a file.

6.5 Wildcards Patterns and Matching

Globbing is the operation by which wildcard characters, ‘*’ or ‘?’ for example, are replaced and expanded into all existing files matching the given pattern. GNU tar can use wildcard patterns for matching (or globbing) archive members when extracting from or listing an archive. Wildcard patterns are also used for verifying volume labels of tar archives. This section explains wildcard syntax for tar.

A pattern should be written according to shell syntax, using wildcard characters to effect globbing. Most characters in the pattern stand for themselves in the matched string, and case is significant: ‘a’ will match only ‘a’, and not ‘A’. The character ‘?’ in the pattern matches any single character in the matched string. The character ‘*’ in the pattern matches zero, one, or more single characters in the matched string. The character ‘\’ says to take the following character of the pattern literally; it is useful when one needs to match the ‘?’, ‘*’, ‘[’ or ‘\’ characters themselves.

The character ‘[’, up to the matching ‘]’, introduces a character class. A character class is a list of acceptable characters for the next single character of the matched string. For example, ‘[abcde]’ would match any of the first five letters of the alphabet.
Note that within a character class, all of the “special characters” listed above other than ‘\’ lose their special meaning; for example, ‘[-\\[*?]]’ would match any of the characters ‘-’, ‘\’, ‘[’, ‘*’, ‘?’, or ‘]’. (Due to parsing constraints, the characters ‘-’ and ‘]’ must either come first or last in a character class.)

If the first character of the class after the opening ‘[’ is ‘!’ or ‘^’, then the meaning of the class is reversed. Rather than listing the characters to match, it lists those characters which are forbidden as the next single character of the matched string.

Other characters of the class stand for themselves. The special construction ‘[a-e]’, using a hyphen between two letters, is meant to represent all characters between a and e, inclusive.

Periods (‘.’) or forward slashes (‘/’) are not considered special for wildcard matches. However, if a pattern completely matches a directory prefix of a matched string, then it matches the full matched string: thus, excluding a directory also excludes all the files beneath it.

Controlling Pattern-Matching

For the purposes of this section, we call exclusion members all member names obtained while processing ‘--exclude’ and ‘--exclude-from’ options, and inclusion members those member names that were given in the command line or read from the file specified with the ‘--files-from’ option.

These two kinds of member lists are used in the following operations: ‘--diff’, ‘--extract’, ‘--list’, ‘--update’.

There are no inclusion members in create mode (‘--create’ and ‘--append’), since in this mode the names obtained from the command line refer to files, not archive members.

By default, inclusion members are compared with archive members literally (19) and exclusion members are treated as globbing patterns.
For example:

    $ tar tf foo.tar
    a.c
    b.c
    a.txt
    [remarks]
    # Member names are used verbatim:
    $ tar -xf foo.tar -v '[remarks]'
    [remarks]
    # Exclude member names are globbed:
    $ tar -xf foo.tar -v --exclude '*.c'
    a.txt
    [remarks]

This behavior can be altered by using the following options:

‘--wildcards’
    Treat all member names as wildcards.

‘--no-wildcards’
    Treat all member names as literal strings.

Thus, to extract files whose names end in ‘.c’, you can use:

    $ tar -xf foo.tar -v --wildcards '*.c'
    a.c
    b.c

Notice the quoting of the pattern to prevent the shell from interpreting it.

The effect of the ‘--wildcards’ option is canceled by ‘--no-wildcards’. This can be used to pass part of the command line arguments verbatim and another part as globbing patterns. For example, the following invocation:

    $ tar -xf foo.tar --wildcards '*.txt' --no-wildcards '[remarks]'

instructs tar to extract from ‘foo.tar’ all files whose names end in ‘.txt’ and the file named ‘[remarks]’.

Normally, a pattern matches a name if an initial subsequence of the name's components matches the pattern, where ‘*’, ‘?’, and ‘[...]’ are the usual shell wildcards, ‘\’ escapes wildcards, and wildcards can match ‘/’.

Other than optionally stripping leading ‘/’ from names (see section Absolute File Names), patterns and names are used as-is. For example, trailing ‘/’ is not trimmed from a user-specified name before deciding whether to exclude it.

However, this matching procedure can be altered by the options listed below. These options accumulate. For example:

    --ignore-case --exclude='makefile' --no-ignore-case --exclude='readme'

ignores case when excluding ‘makefile’, but not when excluding ‘readme’.

‘--anchored’
‘--no-anchored’
    If anchored, a pattern must match an initial subsequence of the name's components. Otherwise, the pattern can match any subsequence. Default is ‘--no-anchored’ for exclusion members and ‘--anchored’ for inclusion members.
‘--ignore-case’
‘--no-ignore-case’
    When ignoring case, upper-case patterns match lower-case names and vice versa. When not ignoring case (the default), matching is case-sensitive.

‘--wildcards-match-slash’
‘--no-wildcards-match-slash’
    When wildcards match slash (the default for exclusion members), a wildcard like ‘*’ in the pattern can match a ‘/’ in the name. Otherwise, ‘/’ is matched only by ‘/’.

The ‘--recursion’ and ‘--no-recursion’ options (see section Descending into Directories) also affect how member patterns are interpreted. If recursion is in effect, a pattern matches a name if it matches any of the name's parent directories.

The following table summarizes pattern-matching default values:

    Members      Default settings
    Inclusion    ‘--no-wildcards --anchored --no-wildcards-match-slash’
    Exclusion    ‘--wildcards --no-anchored --wildcards-match-slash’

6.6 Quoting Member Names

When displaying member names, tar takes care to avoid ambiguities caused by certain characters. This is called name quoting. The characters in question are:

Non-printable control characters:

    Character    ASCII    Character name
    \a           7        Audible bell
    \b           8        Backspace
    \f           12       Form feed
    \n           10       New line
    \r           13       Carriage return
    \t           9        Horizontal tabulation
    \v           11       Vertical tabulation

Space (ASCII 32)

Single and double quotes (‘'’ and ‘"’)

Backslash (‘\’)

The exact way tar uses to quote these characters depends on the quoting style. The default quoting style, called escape (see below), uses backslash notation to represent control characters and backslash.

GNU tar offers seven distinct quoting styles, which can be selected using the ‘--quoting-style’ option:

‘--quoting-style=style’
    Sets quoting style. Valid values for the style argument are: literal, shell, shell-always, c, escape, locale, clocale.

These styles are described in detail below.
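One way to experiment with the styles is to build a small archive whose member names contain awkward characters and list it once per style. This is a minimal sketch, assuming GNU tar and a POSIX shell; the scratch directory and the ‘names’ directory are hypothetical, and the files are archived through their parent directory so that no backslash sequences need to be passed on the command line.

```shell
#!/bin/sh
# Hypothetical scratch area; all names are illustrative only.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir names
tab=$(printf '\t')
touch "names/a space" 'names/a"double"quote' "names/a${tab}tab"

# Archive the directory so tar picks up the odd names itself.
tar -cf arch.tar names

# List the same archive once per quoting style.
for style in literal shell shell-always c escape locale clocale; do
    printf '== %s ==\n' "$style"
    tar -tf arch.tar --quoting-style="$style"
done

# Keep the C-style listing for inspection.
c_listing=$(tar -tf arch.tar --quoting-style=c)
```

The output of the ‘locale’ and ‘clocale’ styles depends on the current locale, so it may differ from one system to another.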
To illustrate their effect, we will use an imaginary tar archive ‘arch.tar’ containing the following members:

    # 1. Contains horizontal tabulation character.
    a       tab
    # 2. Contains newline character
    a
    newline
    # 3. Contains a space
    a space
    # 4. Contains double quotes
    a"double"quote
    # 5. Contains single quotes
    a'single'quote
    # 6. Contains a backslash character:
    a\backslash

Here is how the usual ls command would have listed them, if they had existed in the current working directory:

    $ ls
    a\ttab
    a\nnewline
    a\ space
    a"double"quote
    a'single'quote
    a\\backslash

Quoting styles:

‘literal’
    No quoting, display each character as is:

    $ tar tf arch.tar --quoting-style=literal
    ./
    ./a space
    ./a'single'quote
    ./a"double"quote
    ./a\backslash
    ./a     tab
    ./a
    newline

‘shell’
    Display characters the same way Bourne shell does: control characters, except ‘\t’ and ‘\n’, are printed using backslash escapes, ‘\t’ and ‘\n’ are printed as is, and a single quote is printed as ‘\'’. If a name contains any quoted characters, it is enclosed in single quotes. In particular, if a name contains single quotes, it is printed as several single-quoted strings:

    $ tar tf arch.tar --quoting-style=shell
    ./
    './a space'
    './a'\''single'\''quote'
    './a"double"quote'
    './a\backslash'
    './a    tab'
    './a
    newline'

‘shell-always’
    Same as ‘shell’, but the names are always enclosed in single quotes:

    $ tar tf arch.tar --quoting-style=shell-always
    './'
    './a space'
    './a'\''single'\''quote'
    './a"double"quote'
    './a\backslash'
    './a    tab'
    './a
    newline'

‘c’
    Use the notation of the C programming language. All names are enclosed in double quotes. Control characters are quoted using backslash notations, double quotes are represented as ‘\"’, backslash characters are represented as ‘\\’.
    Single quotes and spaces are not quoted:

    $ tar tf arch.tar --quoting-style=c
    "./"
    "./a space"
    "./a'single'quote"
    "./a\"double\"quote"
    "./a\\backslash"
    "./a\ttab"
    "./a\nnewline"

‘escape’
    Control characters are printed using backslash notation, and a backslash as ‘\\’. This is the default quoting style, unless it was changed when the package was configured.

    $ tar tf arch.tar --quoting-style=escape
    ./
    ./a space
    ./a'single'quote
    ./a"double"quote
    ./a\\backslash
    ./a\ttab
    ./a\nnewline

‘locale’
    Control characters, single quote and backslash are printed using backslash notation. All names are quoted using left and right quotation marks, appropriate to the current locale. If the locale does not define quotation marks, ‘'’ is used as both the left and the right quotation mark. Any occurrences of the right quotation mark in a name are escaped with ‘\’, for example:

    $ tar tf arch.tar --quoting-style=locale
    './'
    './a space'
    './a\'single\'quote'
    './a"double"quote'
    './a\\backslash'
    './a\ttab'
    './a\nnewline'

‘clocale’
    Same as ‘locale’, but ‘"’ is used for both left and right quotation marks, if not provided by the currently selected locale:

    $ tar tf arch.tar --quoting-style=clocale
    "./"
    "./a space"
    "./a'single'quote"
    "./a\"double\"quote"
    "./a\\backslash"
    "./a\ttab"
    "./a\nnewline"

You can specify which characters should be quoted in addition to those implied by the current quoting style:

‘--quote-chars=string’
    Always quote characters from string, even if the selected quoting style would not quote them.

For example, using ‘escape’ quoting (compare with the usual escape listing above):

    $ tar tf arch.tar --quoting-style=escape --quote-chars=' "'
    ./
    ./a\ space
    ./a'single'quote
    ./a\"double\"quote
    ./a\\backslash
    ./a\ttab
    ./a\nnewline

To disable quoting of such additional characters, use the following option:

‘--no-quote-chars=string’
    Remove characters listed in string from the list of quoted characters set by the previous ‘--quote-chars’ option.
This option is particularly useful if you have added ‘ --quote-chars ' to your TAR_OPTIONS (see TAR_OPTIONS ) and wish to disable it for the current invocation. Note that ‘ --no-quote-chars ' does not disable those characters that are quoted by default in the selected quoting style.
6.7 Modifying File and Member Names
Tar archives contain detailed information about files stored in them and full file names are part of that information. When storing a file to an archive, its file name is recorded in it, along with the actual file contents. When restoring from an archive, a file is created on disk with exactly the same name as that stored in the archive. In the majority of cases this is the desired behavior of a file archiver. However, there are some cases when it is not. First of all, it is often unsafe to extract archive members with absolute file names or those that begin with ‘ ../ '. GNU tar takes special precautions when extracting such names and provides a special option for handling them, which is described in Absolute File Names . Secondly, you may wish to extract file names without some leading directory components, or with otherwise modified names. In other cases it is desirable to store files under differing names in the archive. GNU tar provides several options for these needs. ‘ --strip-components= number ' Strip the given number of leading components from file names before extraction. For example, suppose you have archived the whole ‘ /usr ' hierarchy to a tar archive named ‘ usr.tar '. Among other files, this archive contains ‘ usr/include/stdlib.h ', which you wish to extract to the current working directory. To do so, you type: $ tar -xf usr.tar --strip=2 usr/include/stdlib.h The option ‘ --strip=2 ' instructs tar to strip the two leading components (‘ usr/ ' and ‘ include/ ') off the file name.
If you add the ‘ --verbose ' (‘ -v ') option to the invocation above, you will note that the verbose listing still contains the full file name, with the two removed components still in place. This can be inconvenient, so tar provides a special option for altering this behavior: ‘ --show-transformed-names ' Display file or member names with all requested transformations applied. For example: $ tar -xf usr.tar -v --strip=2 usr/include/stdlib.h usr/include/stdlib.h $ tar -xf usr.tar -v --strip=2 --show-transformed usr/include/stdlib.h stdlib.h Notice that in both cases the file ‘ stdlib.h ' is extracted to the current working directory; ‘ --show-transformed-names ' affects only the way its name is displayed. This option is especially useful for verifying whether the invocation will have the desired effect. Thus, before running $ tar -x --strip= n it is often advisable to run $ tar -t -v --show-transformed --strip= n to make sure the command will produce the intended results. If you need to apply more complex modifications to the file name, GNU tar provides a general-purpose transformation option: ‘ --transform= expression ' ‘ --xform= expression ' Modify file names using the supplied expression . The expression is a sed -like replace expression of the form: s/ regexp / replace /[ flags ] where regexp is a regular expression and replace is a replacement for each file name part that matches regexp . Both regexp and replace are described in detail in The ‘s' Command in GNU sed . Any delimiter can be used in lieu of ‘ / ', the only requirement being that it be used consistently throughout the expression. For example, the following two expressions are equivalent: s/one/two/ s,one,two, Changing delimiters is often useful when the regex contains slashes. For example, it is more convenient to write s,/,-, than s/\//-/ . As in sed , you can give several replace expressions, separated by a semicolon.
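The delimiter and semicolon rules above can be tried without extracting anything. The following is a minimal sketch, assuming GNU tar is installed as tar; the scratch directory and the replacement names ‘ var/ ' and ‘ inc ' are made up for illustration and do not come from the manual.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar is available as `tar`.
# All file and directory names below are illustrative.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Build a tiny archive with a single member, usr/include/stdlib.h.
mkdir -p usr/include
: > usr/include/stdlib.h
tar -cf usr.tar usr/include/stdlib.h

# Two replace expressions separated by a semicolon, using ',' as the
# delimiter so that the slashes in the pattern need no escaping.
# Listing (-t) with --show-transformed-names previews the resulting
# member names; nothing is written to disk.
listing=$(tar -tf usr.tar --show-transformed-names \
    --transform='s,^usr/,var/,;s,include,inc,')
echo "$listing"
```

Here the first expression rewrites the leading ‘ usr/ ' and the second is applied to its result, so the preview shows the name the member would receive on extraction.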
Supported flags are: ‘ g ' Apply the replacement to all matches to the regexp , not just the first. ‘ i ' Use case-insensitive matching. ‘ x ' regexp is an extended regular expression (see Extended regular expressions in GNU sed ). ‘ number ' Only replace the number th match of the regexp . Note: the POSIX standard does not specify what should happen when you mix the ‘ g ' and number modifiers. GNU tar follows the GNU sed implementation in this regard, so the interaction is defined to be: ignore matches before the number th, and then match and replace all matches from the number th on. In addition, several transformation scope flags are supported; they control which files the transformations apply to. These are: ‘ r ' Apply transformation to regular archive members. ‘ R ' Do not apply transformation to regular archive members. ‘ s ' Apply transformation to symbolic link targets. ‘ S ' Do not apply transformation to symbolic link targets. ‘ h ' Apply transformation to hard link targets. ‘ H ' Do not apply transformation to hard link targets. The default is ‘ rsh ', which means to apply transformations to both archive members and targets of symbolic and hard links. Default scope flags can also be changed using a ‘ flags= ' statement in the transform expression. The flags set this way remain in force until the next ‘ flags= ' statement or the end of the expression, whichever occurs first.
For example: --transform 'flags=S;s|^|/usr/local/|' Here are several examples of ‘ --transform ' usage: Extract ‘ usr/ ' hierarchy into ‘ usr/local/ ': $ tar --transform='s,usr/,usr/local/,' -x -f arch.tar Strip two leading directory components (equivalent to ‘ --strip-components=2 '): $ tar --transform='s,/*[^/]*/[^/]*/,,' -x -f arch.tar Convert each file name to lower case: $ tar --transform 's/.*/\L&/' -x -f arch.tar Prepend ‘ /prefix/ ' to each file name: $ tar --transform 's,^,/prefix/,' -x -f arch.tar Archive the ‘ /lib ' directory, prepending ‘ /usr/local ' to each archive member: $ tar --transform 's,^,/usr/local/,S' -c -f arch.tar /lib Notice the use of flags in the last example. The ‘ /lib ' directory often contains many symbolic links to files within it. It may look, for example, like this: $ ls -l drwxr-xr-x root/root 0 2008-07-08 16:20 /lib/ -rwxr-xr-x root/root 1250840 2008-05-25 07:44 /lib/libc-2.3.2.so lrwxrwxrwx root/root 0 2008-06-24 17:12 /lib/libc.so.6 -> libc-2.3.2.so ... Using the expression ‘ s,^,/usr/local/, ' would mean adding ‘ /usr/local ' to both regular archive members and to link targets. In this case, ‘ /lib/libc.so.6 ' would become: /usr/local/lib/libc.so.6 -> /usr/local/libc-2.3.2.so This is definitely not desired. To avoid this, the ‘ S ' flag is used, which excludes symbolic link targets from filename transformations. The result is: $ tar --transform 's,^,/usr/local/,S' -c -v -f arch.tar \ --show-transformed /lib drwxr-xr-x root/root 0 2008-07-08 16:20 /usr/local/lib/ -rwxr-xr-x root/root 1250840 2008-05-25 07:44 /usr/local/lib/libc-2.3.2.so lrwxrwxrwx root/root 0 2008-06-24 17:12 /usr/local/lib/libc.so.6 \ -> libc-2.3.2.so Unlike ‘ --strip-components ', ‘ --transform ' can be used in any GNU tar operation mode. 
For example, the following command adds files to the archive while replacing the leading ‘ usr/ ' component with ‘ var/ ': $ tar -cf arch.tar --transform='s,^usr/,var/,' / To test the effect of ‘ --transform ', we suggest using the ‘ --show-transformed-names ' option: $ tar -cf arch.tar --transform='s,^usr/,var/,' \ --verbose --show-transformed-names / If both ‘ --strip-components ' and ‘ --transform ' are used together, then ‘ --transform ' is applied first, and the required number of components is then stripped from its result. You can use as many ‘ --transform ' options in a single command line as you want. The specified expressions will then be applied in order of their appearance. For example, the following two invocations are equivalent: $ tar -cf arch.tar --transform='s,/usr/var,/var/,' \ --transform='s,/usr/local,/usr/,' $ tar -cf arch.tar \ --transform='s,/usr/var,/var/,;s,/usr/local,/usr/,'
6.8 Operating Only on New Files
The ‘ --after-date= date ' (‘ --newer= date ', ‘ -N date ') option causes tar to only work on files whose data modification or status change times are newer than the date given. If date starts with ‘ / ' or ‘ . ', it is taken to be a file name; the data modification time of that file is used as the date. If you use this option when creating or appending to an archive, the archive will only include new files. If you use ‘ --after-date ' when extracting an archive, tar will only extract files newer than the date you specify. If you want tar to make the date comparison based only on modification of the file's data (rather than status changes), then use the ‘ --newer-mtime= date ' option. You may use these options with any operation. Note that these options differ from the ‘ --update ' (‘ -u ') operation in that they allow you to specify a particular date against which tar can compare when deciding whether or not to archive the files.
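The difference between the two options can be observed with a reference file whose name starts with ‘ . '. This is a minimal sketch, assuming GNU tar is installed as tar; the directory ‘ data ', the file ‘ stamp ', and the chosen timestamps are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar is available as `tar`.
# File names and timestamps are illustrative.
set -e
workdir=$(mktemp -d)
cd "$workdir"

mkdir data
: > data/old.txt
: > data/new.txt
: > stamp

# Back-date the data modification times: old.txt is older than the
# reference file, new.txt is newer.
touch -t 200101010000 data/old.txt
touch -t 201001010000 stamp
touch -t 202001010000 data/new.txt

# --newer-mtime compares data modification times only, so old.txt is
# skipped. The directory entry itself is still recorded.
tar -cf recent.tar --newer-mtime=./stamp data
members=$(tar -tf recent.tar)
echo "$members"

# --after-date also looks at the status change time. The touch above
# changed old.txt's status just now, so old.txt is archived as well.
tar -cf all.tar --after-date=./stamp data
all_members=$(tar -tf all.tar | sort)
echo "$all_members"
```

The second archive shows why back-dated test files behave differently under ‘ --after-date ': changing a file's timestamps is itself a status change.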
‘ --after-date= date ' ‘ --newer= date ' ‘ -N date ' Only store files newer than date . Acts on files only if their data modification or status change times are later than date . Use in conjunction with any operation. If date starts with ‘ / ' or ‘ . ', it is taken to be a file name; the data modification time of that file is used as the date. ‘ --newer-mtime= date ' Act like ‘ --after-date ', but look only at data modification times. These options limit tar to operate only on files which have been modified after the date specified. A file's status is considered to have changed if its contents have been modified, or if its owner, permissions, and so forth, have been changed. (For more information on how to specify a date, see Date input formats ; remember that the entire date argument must be quoted if it contains any spaces.) Gurus would say that ‘ --after-date ' tests both the data modification time ( mtime , the time the contents of the file were last modified) and the status change time ( ctime , the time the file's status was last changed: owner, permissions, etc.) fields, while ‘ --newer-mtime ' tests only the mtime field. To be precise, ‘ --after-date ' checks both mtime and ctime and processes the file if either one is more recent than date , while ‘ --newer-mtime ' checks only mtime and disregards ctime . Neither option uses atime (the last time the contents of the file were looked at). Date specifiers can have embedded spaces. Because of this, you may need to quote date arguments to keep the shell from parsing them as separate arguments. For example, the following command will add to the archive all the files modified less than two days ago: $ tar -cf foo.tar --newer-mtime '2 days ago' When any of these options is used with the option ‘ --verbose ' (see section The ‘ --verbose ' Option ) GNU tar converts the specified date back to a textual form and compares that with the one given with the option. 
If the two forms differ, tar prints both forms in a message, to help the user check that the right date is being used. For example: $ tar -c -f archive.tar --after-date='10 days ago' . tar: Option --after-date: Treating date '10 days ago' as 2006-06-11 13:19:37.232434 Please Note: ‘ --after-date ' and ‘ --newer-mtime ' should not be used for incremental backups. See section Using tar to Perform Incremental Dumps , for the proper way of creating incremental backups.
6.9 Descending into Directories
Usually, tar will recursively explore all directories (either those given on the command line or through the ‘ --files-from ' option) for the various files they contain. However, you may not always want tar to act this way. The ‘ --no-recursion ' option inhibits tar 's recursive descent into specified directories. If you specify ‘ --no-recursion ', you can use the find (see find in GNU Find Manual ) utility for hunting through levels of directories to construct a list of file names which you could then pass to tar . find allows you to be more selective when choosing which files to archive; see Reading Names from a File , for more information on using find with tar . ‘ --no-recursion ' Prevents tar from recursively descending directories. ‘ --recursion ' Requires tar to recursively descend directories. This is the default. When you use ‘ --no-recursion ', GNU tar grabs directory entries themselves, but does not descend into them recursively. Many people use find for locating files they want to back up, and since tar usually recursively descends into directories, they have to use the ‘ -not -type d ' test in their find invocation (see Type test in Finding Files ), as they usually do not want all the files in a directory. They then use the ‘ --files-from ' option to archive the files located via find .
The problem when restoring files archived in this manner is that the directories themselves are not in the archive, so the ‘ --same-permissions ' (‘ --preserve-permissions ', ‘ -p ') option does not affect them, although users might want it to. Specifying ‘ --no-recursion ' is a way to tell tar to grab only the directory entries given to it, adding no new files on its own. To summarize, if you use find to create a list of files to be stored in an archive, use it as follows: $ find dir tests | \ tar -cf archive --no-recursion -T - The ‘ --no-recursion ' option also applies when extracting: it causes tar to extract only the matched directory entries, not the files under those directories. The ‘ --no-recursion ' option also affects how globbing patterns are interpreted (see section Controlling Pattern-Matching ). The ‘ --no-recursion ' and ‘ --recursion ' options apply to later options and operands, and can be overridden by later occurrences of ‘ --no-recursion ' and ‘ --recursion '. For example: $ tar -cf jams.tar --no-recursion grape --recursion grape/concord creates an archive with one entry for ‘ grape ', and the recursive contents of ‘ grape/concord ', but no entries under ‘ grape ' other than ‘ grape/concord '.
6.10 Crossing File System Boundaries
tar will normally automatically cross file system boundaries in order to archive files which are part of a directory tree. You can change this behavior by running tar and specifying ‘ --one-file-system '. This option only affects files that are archived because they are in a directory that is being archived; tar will still archive files explicitly named on the command line or through ‘ --files-from ', regardless of where they reside. ‘ --one-file-system ' Prevents tar from crossing file system boundaries when archiving. Use in conjunction with any write operation.
The ‘ --one-file-system ' option causes tar to modify its normal behavior in archiving the contents of directories. If a file in a directory is not on the same file system as the directory itself, then tar will not archive that file. If the file is a directory itself, tar will not archive anything beneath it; in other words, tar will not cross mount points. This option is useful for making full or incremental archival backups of a file system. If this option is used in conjunction with ‘ --verbose ' (‘ -v '), files that are excluded are mentioned by name on the standard error. 6.10.1 Changing the Working Directory Changing Directory 6.10.2 Absolute File Names
6.10.1 Changing the Working Directory
To change the working directory in the middle of a list of file names, either on the command line or in a file specified using ‘ --files-from ' (‘ -T '), use ‘ --directory ' (‘ -C '). This will change the working directory to the specified directory after that point in the list. ‘ --directory= directory ' ‘ -C directory ' Changes the working directory in the middle of a command line. For example, $ tar -c -f jams.tar grape prune -C food cherry will place the files ‘ grape ' and ‘ prune ' from the current directory into the archive ‘ jams.tar ', followed by the file ‘ cherry ' from the directory ‘ food '. This option is especially useful when you have several widely separated files that you want to store in the same archive. Note that the file ‘ cherry ' is recorded in the archive under the precise name ‘ cherry ', not ‘ food/cherry '. Thus, the archive will contain three files that all appear to have come from the same directory; if the archive is extracted with plain ‘ tar --extract ', all three files will be written in the current directory.
Contrast this with the command, $ tar -c -f jams.tar grape prune -C food red/cherry which records the third file in the archive under the name ‘ red/cherry ' so that, if the archive is extracted using ‘ tar --extract ', the third file will be written in a subdirectory named ‘ red '. You can use the ‘ --directory ' option to make the archive independent of the original name of the directory holding the files. The following command places the files ‘ /etc/passwd ', ‘ /etc/hosts ', and ‘ /lib/libc.a ' into the archive ‘ foo.tar ': $ tar -c -f foo.tar -C /etc passwd hosts -C /lib libc.a However, the names of the archive members will be exactly what they were on the command line: ‘ passwd ', ‘ hosts ', and ‘ libc.a '. They will not appear to be related by file name to the original directories where those files were located. Note that ‘ --directory ' options are interpreted consecutively. If ‘ --directory ' specifies a relative file name, it is interpreted relative to the then current directory, which might not be the same as the original current working directory of tar , due to a previous ‘ --directory ' option. When using ‘ --files-from ' (see section Reading Names from a File ), you can put various tar options (including ‘ -C ') in the file list. Notice, however, that in this case the option and its argument must not be separated by whitespace. If you use a short option, its argument must either follow the option letter immediately, without any intervening whitespace, or occupy the next line. If you use a long option, separate its argument from the option by an equal sign. For instance, the file list for the above example will be: -C/etc passwd hosts --directory=/lib libc.a To use it, you would invoke tar as follows: $ tar -c -f foo.tar --files-from list The interpretation of options in file lists is disabled by the ‘ --verbatim-files-from ' and ‘ --null ' options.
6.10.2 Absolute File Names
By default, GNU tar drops a leading ‘ / ' on input or output, and complains about file names containing a ‘ .. ' component. There is an option that turns off this behavior: ‘ --absolute-names ' ‘ -P ' Do not strip leading slashes from file names, and permit file names containing a ‘ .. ' file name component. When tar extracts archive members from an archive, it strips any leading slashes (‘ / ') from the member name. This causes absolute member names in the archive to be treated as relative file names. This allows you to have such members extracted wherever you want, instead of being restricted to extracting the member in the exact directory named in the archive. For example, if the archive member has the name ‘ /etc/passwd ', tar will extract it as if the name were really ‘ etc/passwd '. File names containing ‘ .. ' can cause problems when extracting, so tar normally warns you about such files when creating an archive, and rejects attempts to extract such files. Other tar programs do not do this. As a result, if you create an archive whose member names start with a slash, they will be difficult for other people with a non-GNU tar program to use. Therefore, GNU tar also strips leading slashes from member names when putting members into the archive. For example, if you ask tar to add the file ‘ /bin/ls ' to an archive, it will do so, but the member name will be ‘ bin/ls '. Symbolic links containing ‘ .. ' or leading ‘ / ' can also cause problems when extracting, so tar normally extracts them last; it may create empty files as placeholders during extraction. If you use the ‘ --absolute-names ' (‘ -P ') option, tar will do none of these transformations. To archive or extract files relative to the root directory, specify the ‘ --absolute-names ' (‘ -P ') option. Normally, tar acts on files relative to the working directory, ignoring superior directory names when archiving and ignoring leading slashes when extracting.
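The default stripping of a leading ‘ / ' is easy to observe on a scratch file. This is a minimal sketch, assuming GNU tar is installed as tar and that its messages are in English (hence LC_ALL=C); the names ‘ note.txt ', ‘ abs.tar ', ‘ rel.tar ', and ‘ warn.log ' are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar is available as `tar`.
# All file names are illustrative.
set -e
workdir=$(mktemp -d)
: > "$workdir/note.txt"

# Absolute operand: tar reports on standard error that it removes the
# leading '/', and records the member under a relative name.
LC_ALL=C tar -cf "$workdir/abs.tar" "$workdir/note.txt" 2> "$workdir/warn.log"
stored=$(tar -tf "$workdir/abs.tar")
echo "$stored"

# Using -C instead: change to the directory first and name the file
# relative to it. No message is printed, and the member name carries
# no trace of the original directory.
tar -cf "$workdir/rel.tar" -C "$workdir" note.txt
relative=$(tar -tf "$workdir/rel.tar")
echo "$relative"
```

The first listing is the scratch path without its leading slash, while the second is just the bare file name, which is usually what is wanted when the archive should be independent of where the files originally lived.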
When you specify ‘ --absolute-names ' (‘ -P '), tar stores file names including all superior directory names, and preserves leading slashes. If you only invoked tar from the root directory you would never need the ‘ --absolute-names ' option, but using this option may be more convenient than switching to root. ‘ --absolute-names ' Preserves full file names (including superior directory names) when archiving and extracting files. tar prints out a message about removing the ‘ / ' from file names. This message appears once per GNU tar invocation. It represents something which ought to be told; ignoring what it means can cause very serious surprises later. Some people, nevertheless, do not want to see this message. Wanting to play really dangerously, one may of course redirect the standard error of tar to the null device. For example, under sh : $ tar -c -f archive.tar /home 2> /dev/null Another solution, both nicer and simpler, would be to change to the ‘ / ' directory first, and then avoid absolute notation. For example: $ tar -c -f archive.tar -C / home See section Integrity , for some of the security-related implications of using this option. This document was generated on August 23, 2023 using texi2html 5.0 . ...
http://www.gnu.org/savannah-checkouts/gnu/tar/manual/html_chapter/Choosing.html