2.11 Recursive Accept/Reject Options
- ‘-A acclist --accept acclist’
- ‘-R rejlist --reject rejlist’
- Specify comma-separated lists of file name suffixes or patterns to
accept or reject (see Types of Files). Note that if any of the
wildcard characters ‘*’, ‘?’, ‘[’, or ‘]’ appears in an element of
acclist or rejlist, that element is treated as a pattern rather than
a suffix.
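The suffix-versus-pattern rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not Wget's own code; the function names `element_matches` and `list_matches` are made up for the example, and Python's `fnmatch` module stands in for whatever matcher Wget uses.

```python
from fnmatch import fnmatchcase

# Characters that turn a list element into a pattern, per the text above.
WILDCARDS = set("*?[]")


def element_matches(filename: str, element: str) -> bool:
    """Return True if filename matches one acclist/rejlist element."""
    if any(ch in WILDCARDS for ch in element):
        # The element contains a wildcard: treat it as a shell-style pattern.
        return fnmatchcase(filename, element)
    # No wildcard: treat the element as a plain file name suffix.
    return filename.endswith(element)


def list_matches(filename: str, comma_list: str) -> bool:
    """Return True if filename matches any element of a comma-separated list."""
    return any(element_matches(filename, e) for e in comma_list.split(",") if e)


print(list_matches("report.pdf", "pdf,ps"))      # suffix element
print(list_matches("img01.png", "img[0-9]*"))    # pattern element
print(list_matches("notes.txt", "pdf,ps"))       # no element applies
```

When passing patterns on a real command line, quote them so that the shell does not expand the wildcards before Wget sees them.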
- ‘-D domain-list’
- Set domains to be followed. domain-list is a comma-separated list
of domains. Note that it does not turn on ‘-H’.
- ‘--exclude-domains domain-list’
- Specify the domains that are not to be followed
(see Spanning Hosts).
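How the two domain lists might combine can be sketched as follows. This is a simplified model, not Wget's implementation: it assumes a host belongs to a domain when it equals the domain or ends with a dot followed by the domain, and the helper names `in_domain_list` and `host_is_followed` are hypothetical.

```python
def in_domain_list(host: str, domain_list: str) -> bool:
    """Return True if host falls under any domain in a comma-separated list."""
    host = host.lower()
    for domain in domain_list.lower().split(","):
        domain = domain.strip().lstrip(".")
        # Assumption: match the domain itself or any of its subdomains.
        if domain and (host == domain or host.endswith("." + domain)):
            return True
    return False


def host_is_followed(host: str, domains: str = "", exclude_domains: str = "") -> bool:
    """Apply an accept list first, then an exclude list."""
    if domains and not in_domain_list(host, domains):
        return False
    if exclude_domains and in_domain_list(host, exclude_domains):
        return False
    return True


print(host_is_followed("www.example.com", domains="example.com"))
print(host_is_followed("ads.example.com", "example.com", "ads.example.com"))
print(host_is_followed("other.org", domains="example.com"))
```

Because ‘-D’ does not turn on ‘-H’, a check like this only comes into play once host spanning has been enabled separately.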
- ‘--follow-ftp’
- Follow FTP links from HTML documents. Without this option,
Wget will ignore all the FTP links.
- ‘--follow-tags=list’
- Wget has an internal table of HTML tag / attribute pairs that it
considers when looking for linked documents during a recursive
retrieval. If a user wants only a subset of those tags to be
considered, however, he or she should specify such tags in a
comma-separated list with this option.
- ‘--ignore-tags=list’
- This is the opposite of the ‘--follow-tags’ option. To skip
certain HTML tags when recursively looking for documents to download,
specify them in a comma-separated list.
In the past, this option was the best bet for downloading a single page
and its requisites, using a command-line like:
wget --ignore-tags=a,area -H -k -K -r http://site/document
However, the author of this option came across a page with tags like
<LINK REL="home" HREF="/"> and came to the realization that
specifying tags to ignore was not enough. One can't just tell Wget to
ignore <LINK>, because then stylesheets will not be downloaded.
Now the best bet for downloading a single page and its requisites is the
dedicated ‘--page-requisites’ option.
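The interaction between a follow list and an ignore list, and the stylesheet problem just described, can be illustrated with a small Python sketch. The table below is hypothetical and deliberately tiny; Wget's real internal table is larger and is not reproduced here. The function name `tags_to_consider` is likewise made up for the example.

```python
# Hypothetical tag/attribute table, for illustration only.
TAG_TABLE = {
    "a": ["href"],
    "area": ["href"],
    "img": ["src"],
    "link": ["href"],
}


def tags_to_consider(follow_tags: str = "", ignore_tags: str = "") -> dict:
    """Return the subset of TAG_TABLE left after applying both lists."""
    follow = {t.strip().lower() for t in follow_tags.split(",") if t.strip()}
    ignore = {t.strip().lower() for t in ignore_tags.split(",") if t.strip()}
    selected = {}
    for tag, attributes in TAG_TABLE.items():
        if follow and tag not in follow:
            continue  # a follow list was given and this tag is not on it
        if tag in ignore:
            continue  # explicitly ignored
        selected[tag] = attributes
    return selected


# Ignoring only <a> and <area> still leaves <link>, so a "home" link is followed.
print(sorted(tags_to_consider(ignore_tags="a,area")))
# Ignoring <link> as well drops every <link> tag, stylesheets included.
print(sorted(tags_to_consider(ignore_tags="a,area,link")))
```

Filtering by tag name alone cannot tell a stylesheet <LINK> from a navigational one, which is why a dedicated option is the better fit for fetching a page and its requisites.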
- ‘--ignore-case’
- Ignore case when matching files and directories. This influences the
behavior of the ‘-R’, ‘-A’, ‘-I’, and ‘-X’ options, as well as globbing
implemented when downloading from FTP sites. For example, with this
option, ‘-A *.txt’ will match ‘file1.txt’, but also
‘file2.TXT’, ‘file3.TxT’, and so on.
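The effect of case-insensitive matching on the example above can be reproduced with a short Python sketch. It assumes that ignoring case is equivalent to lowercasing both the file name and the pattern before comparison; the helper name `matches` is hypothetical and the code is not taken from Wget.

```python
from fnmatch import fnmatchcase


def matches(filename: str, pattern: str, ignore_case: bool = False) -> bool:
    """Shell-style match, optionally ignoring case."""
    if ignore_case:
        # Assumption: fold both sides to lower case before matching.
        filename, pattern = filename.lower(), pattern.lower()
    return fnmatchcase(filename, pattern)


for name in ("file1.txt", "file2.TXT", "file3.TxT"):
    print(name, matches(name, "*.txt"), matches(name, "*.txt", ignore_case=True))
```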
- ‘-H --span-hosts’
- Enable spanning across hosts when doing recursive retrieving
(see Spanning Hosts).
- ‘-L --relative’
- Follow relative links only. Useful for retrieving a specific home page
without any distractions, not even those from the same hosts
(see Relative Links).
- ‘-I list’
- Specify a comma-separated list of directories you wish to follow when
downloading (see Directory-Based Limits). Elements
of list may contain wildcards.
- ‘-X list’
- Specify a comma-separated list of directories you wish to exclude from
download (see Directory-Based Limits). Elements of
list may contain wildcards.
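One way to picture how an include list and an exclude list of directories might combine is the Python sketch below. It is a simplified model under stated assumptions, not Wget's implementation: an element without wildcards matches a directory and everything beneath it, an element with wildcards is matched with Python's `fnmatch` (whose ‘*’ also matches ‘/’), and the include list is checked before the exclude list. The names `dir_in_list` and `directory_allowed` are hypothetical.

```python
from fnmatch import fnmatchcase


def dir_in_list(directory: str, comma_list: str) -> bool:
    """Return True if directory matches any element of a comma-separated list."""
    directory = "/" + directory.strip("/")
    for element in comma_list.split(","):
        element = element.strip().strip("/")
        if not element:
            continue  # skip empty elements
        element = "/" + element
        if any(ch in element for ch in "*?[]"):
            # Wildcard element: shell-style match against the whole directory.
            if fnmatchcase(directory, element):
                return True
        elif directory == element or directory.startswith(element + "/"):
            # Plain element: the directory itself or anything below it.
            return True
    return False


def directory_allowed(directory: str, include: str = "", exclude: str = "") -> bool:
    """Apply the include list, then the exclude list."""
    if include and not dir_in_list(directory, include):
        return False
    if exclude and dir_in_list(directory, exclude):
        return False
    return True


print(directory_allowed("/pub/linux/docs", include="/pub/linux"))
print(directory_allowed("/pub/windows", include="/pub/linux"))
print(directory_allowed("/pub/tmp1", exclude="/pub/tmp*"))
```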
- ‘-np --no-parent’
- Do not ever ascend to the parent directory when retrieving recursively.
This is a useful option, since it guarantees that only the files
below a certain hierarchy will be downloaded.
See Directory-Based Limits, for more details.
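The "below a certain hierarchy" guarantee can be sketched as a path-prefix check. The Python example below is an illustration under assumptions, not Wget's code: it takes the directory part of the starting URL as the hierarchy root, compares paths only (hosts are not checked), and uses the hypothetical helper names `start_directory` and `below_start`.

```python
import posixpath
from urllib.parse import urlsplit


def start_directory(start_url: str) -> str:
    """Directory part of the start URL's path, always ending in '/'."""
    path = urlsplit(start_url).path or "/"
    if path.endswith("/"):
        return path
    return posixpath.dirname(path).rstrip("/") + "/"


def below_start(start_url: str, candidate_url: str) -> bool:
    """Return True if candidate_url's path lies at or below the start directory."""
    base = start_directory(start_url)
    # Normalise so that '..' segments cannot escape the hierarchy unnoticed.
    path = posixpath.normpath(urlsplit(candidate_url).path or "/")
    return (path + "/").startswith(base)


start = "http://example.com/a/b/index.html"
print(below_start(start, "http://example.com/a/b/c/d.html"))
print(below_start(start, "http://example.com/a/other.html"))
print(below_start(start, "http://example.com/a/b/../x.html"))
```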