2.8 HTTPS (SSL/TLS) Options

To support encrypted HTTP (HTTPS) downloads, Wget must be compiled with an external SSL library. The current default is GnuTLS. In addition, Wget also supports HSTS (HTTP Strict Transport Security). If Wget is compiled without SSL support, none of these options are available.

--secure-protocol=protocol

Choose the secure protocol to be used. Legal values are ‘auto’, ‘SSLv2’, ‘SSLv3’, ‘TLSv1’, ‘TLSv1_1’, ‘TLSv1_2’, ‘TLSv1_3’ and ‘PFS’. If ‘auto’ is used, the SSL library is given the liberty of choosing the appropriate protocol automatically, which is achieved by sending a TLSv1 greeting. This is the default.

Specifying ‘SSLv2’, ‘SSLv3’, ‘TLSv1’, ‘TLSv1_1’, ‘TLSv1_2’ or ‘TLSv1_3’ forces the use of the corresponding protocol. This is useful when talking to old and buggy SSL server implementations that make it hard for the underlying SSL library to choose the correct protocol version. Fortunately, such servers are quite rare.

Specifying ‘PFS’ enforces the use of the so-called Perfect Forward Security cipher suites. In short, PFS adds security by creating a one-time key for each SSL connection. It has a bit more CPU impact on client and server. We use known to be secure ciphers (e.g. no MD4) and the TLS protocol. This mode also explicitly excludes non-PFS key exchange methods, such as RSA.

--https-only

When in recursive mode, only HTTPS links are followed.

--ciphers

Set the cipher list string. Typically this string sets the cipher suites and other SSL/TLS options that the user wish should be used, in a set order of preference (GnuTLS calls it ’priority string’). This string will be fed verbatim to the SSL/TLS engine (OpenSSL or GnuTLS) and hence its format and syntax is dependent on that. Wget will not process or manipulate it in any way. Refer to the OpenSSL or GnuTLS documentation for more information.

--no-check-certificate

Don’t check the server certificate against the available certificate authorities. Also don’t require the URL host name to match the common name presented by the certificate.

As of Wget 1.10, the default is to verify the server’s certificate against the recognized certificate authorities, breaking the SSL handshake and aborting the download if the verification fails. Although this provides more secure downloads, it does break interoperability with some sites that worked with previous Wget versions, particularly those using self-signed, expired, or otherwise invalid certificates. This option forces an “insecure” mode of operation that turns the certificate verification errors into warnings and allows you to proceed.

If you encounter “certificate verification” errors or ones saying that “common name doesn’t match requested host name”, you can use this option to bypass the verification and proceed with the download. Only use this option if you are otherwise convinced of the site’s authenticity, or if you really don’t care about the validity of its certificate. It is almost always a bad idea not to check the certificates when transmitting confidential or important data. For self-signed/internal certificates, you should download the certificate and verify against that instead of forcing this insecure mode. If you are really sure of not desiring any certificate verification, you can specify –check-certificate=quiet to tell wget to not print any warning about invalid certificates, albeit in most cases this is the wrong thing to do.

--certificate=file

Use the client certificate stored in file. This is needed for servers that are configured to require certificates from the clients that connect to them. Normally a certificate is not required and this switch is optional.

--certificate-type=type

Specify the type of the client certificate. Legal values are ‘PEM’ (assumed by default) and ‘DER’, also known as ‘ASN1’.

--private-key=file

Read the private key from file. This allows you to provide the private key in a file separate from the certificate.

--private-key-type=type

Specify the type of the private key. Accepted values are ‘PEM’ (the default) and ‘DER’.

--ca-certificate=file

Use file as the file with the bundle of certificate authorities (“CA”) to verify the peers. The certificates must be in PEM format.

Without this option Wget looks for CA certificates at the system-specified locations, chosen at OpenSSL installation time.

--ca-directory=directory

Specifies directory containing CA certificates in PEM format. Each file contains one CA certificate, and the file name is based on a hash value derived from the certificate. This is achieved by processing a certificate directory with the c_rehash utility supplied with OpenSSL. Using ‘--ca-directory’ is more efficient than ‘--ca-certificate’ when many certificates are installed because it allows Wget to fetch certificates on demand.

Without this option Wget looks for CA certificates at the system-specified locations, chosen at OpenSSL installation time.

--crl-file=file

Specifies a CRL file in file. This is needed for certificates that have been revocated by the CAs.

--pinnedpubkey=file/hashes

Tells wget to use the specified public key file (or hashes) to verify the peer. This can be a path to a file which contains a single public key in PEM or DER format, or any number of base64 encoded sha256 hashes preceded by “sha256//” and separated by “;”

When negotiating a TLS or SSL connection, the server sends a certificate indicating its identity. A public key is extracted from this certificate and if it does not exactly match the public key(s) provided to this option, wget will abort the connection before sending or receiving any data.

--random-file=file

[OpenSSL and LibreSSL only] Use file as the source of random data for seeding the pseudo-random number generator on systems without /dev/urandom.

On such systems the SSL library needs an external source of randomness to initialize. Randomness may be provided by EGD (see ‘--egd-file’ below) or read from an external source specified by the user. If this option is not specified, Wget looks for random data in $RANDFILE or, if that is unset, in $HOME/.rnd.

If you’re getting the “Could not seed OpenSSL PRNG; disabling SSL.” error, you should provide random data using some of the methods described above.

--egd-file=file

[OpenSSL only] Use file as the EGD socket. EGD stands for Entropy Gathering Daemon, a user-space program that collects data from various unpredictable system sources and makes it available to other programs that might need it. Encryption software, such as the SSL library, needs sources of non-repeating randomness to seed the random number generator used to produce cryptographically strong keys.

OpenSSL allows the user to specify his own source of entropy using the RAND_FILE environment variable. If this variable is unset, or if the specified file does not produce enough randomness, OpenSSL will read random data from EGD socket specified using this option.

If this option is not specified (and the equivalent startup command is not used), EGD is never contacted. EGD is not needed on modern Unix systems that support /dev/urandom.

--no-hsts

Wget supports HSTS (HTTP Strict Transport Security, RFC 6797) by default. Use ‘--no-hsts’ to make Wget act as a non-HSTS-compliant UA. As a consequence, Wget would ignore all the Strict-Transport-Security headers, and would not enforce any existing HSTS policy.

--hsts-file=file

By default, Wget stores its HSTS database in ~/.wget-hsts. You can use ‘--hsts-file’ to override this. Wget will use the supplied file as the HSTS database. Such file must conform to the correct HSTS database format used by Wget. If Wget cannot parse the provided file, the behaviour is unspecified.

The Wget’s HSTS database is a plain text file. Each line contains an HSTS entry (ie. a site that has issued a Strict-Transport-Security header and that therefore has specified a concrete HSTS policy to be applied). Lines starting with a dash (#) are ignored by Wget. Please note that in spite of this convenient human-readability hand-hacking the HSTS database is generally not a good idea.

An HSTS entry line consists of several fields separated by one or more whitespace:

<hostname> SP [<port>] SP <include subdomains> SP <created> SP <max-age>

The hostname and port fields indicate the hostname and port to which the given HSTS policy applies. The port field may be zero, and it will, in most of the cases. That means that the port number will not be taken into account when deciding whether such HSTS policy should be applied on a given request (only the hostname will be evaluated). When port is different to zero, both the target hostname and the port will be evaluated and the HSTS policy will only be applied if both of them match. This feature has been included for testing/development purposes only. The Wget testsuite (in testenv/) creates HSTS databases with explicit ports with the purpose of ensuring Wget’s correct behaviour. Applying HSTS policies to ports other than the default ones is discouraged by RFC 6797 (see Appendix B "Differences between HSTS Policy and Same-Origin Policy"). Thus, this functionality should not be used in production environments and port will typically be zero. The last three fields do what they are expected to. The field include_subdomains can either be 1 or 0 and it signals whether the subdomains of the target domain should be part of the given HSTS policy as well. The created and max-age fields hold the timestamp values of when such entry was created (first seen by Wget) and the HSTS-defined value ’max-age’, which states how long should that HSTS policy remain active, measured in seconds elapsed since the timestamp stored in created. Once that time has passed, that HSTS policy will no longer be valid and will eventually be removed from the database.

If you supply your own HSTS database via ‘--hsts-file’, be aware that Wget may modify the provided file if any change occurs between the HSTS policies requested by the remote servers and those in the file. When Wget exits, it effectively updates the HSTS database by rewriting the database file with the new entries.

If the supplied file does not exist, Wget will create one. This file will contain the new HSTS entries. If no HSTS entries were generated (no Strict-Transport-Security headers were sent by any of the servers) then no file will be created, not even an empty one. This behaviour applies to the default database file (~/.wget-hsts) as well: it will not be created until some server enforces an HSTS policy.

Care is taken not to override possible changes made by other Wget processes at the same time over the HSTS database. Before dumping the updated HSTS entries on the file, Wget will re-read it and merge the changes.

Using a custom HSTS database and/or modifying an existing one is discouraged. For more information about the potential security threats arose from such practice, see section 14 "Security Considerations" of RFC 6797, specially section 14.9 "Creative Manipulation of HSTS Policy Store".

--warc-file=file

Use file as the destination WARC file.

--warc-header=string

Use string into as the warcinfo record.

--warc-max-size=size

Set the maximum size of the WARC files to size.

--warc-cdx

Write CDX index files.

--warc-dedup=file

Do not store records listed in this CDX file.

--no-warc-compression

Do not compress WARC files with GZIP.

--no-warc-digests

Do not calculate SHA1 digests.

--no-warc-keep-log

Do not store the log file in a WARC record.

--warc-tempdir=dir

Specify the location for temporary files created by the WARC writer.