5.2 HTTP Time-Stamping Internals

Time-stamping in HTTP is implemented by checking the Last-Modified header. When you retrieve the file foo.html through HTTP, Wget will check whether foo.html exists locally. If it doesn’t, foo.html will be retrieved unconditionally.

If the file does exist locally, Wget will first check its local time-stamp (similar to the way ls -l checks it), and then send a HEAD request to the remote server, asking for information about the remote file.

The Last-Modified header is examined to find which file was modified more recently (which makes it “newer”). If the remote file is newer, it will be downloaded; if it is older, Wget will give up.(2)
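
The comparison described above can be sketched in Python. This is only an illustration of the logic, not Wget's actual C implementation; the function names remote_is_newer and needs_download are hypothetical, and the behaviour when the server sends no Last-Modified header (download anyway) is an assumption made for the sketch.

```python
import os
from email.utils import parsedate_to_datetime


def remote_is_newer(last_modified: str, local_mtime: float) -> bool:
    """Return True if the Last-Modified value is later than the local mtime."""
    # Parse the HTTP date (e.g. "Wed, 21 Oct 2015 07:28:00 GMT") to epoch seconds.
    remote_mtime = parsedate_to_datetime(last_modified).timestamp()
    return remote_mtime > local_mtime


def needs_download(local_path: str, headers: dict) -> bool:
    """Decide whether to fetch, given the headers of a HEAD response."""
    # No local copy: retrieve unconditionally.
    if not os.path.exists(local_path):
        return True
    last_modified = headers.get("Last-Modified")
    if last_modified is None:
        # Assumption for this sketch: with nothing to compare, fetch the file.
        return True
    # Compare against the local modification time, as ls -l would report it.
    return remote_is_newer(last_modified, os.path.getmtime(local_path))
```

Here headers stands for the response headers of the HEAD request as a plain dictionary; a real client would obtain them from its HTTP library.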

When ‘--backup-converted’ (‘-K’) is specified in conjunction with ‘-N’, server file ‘X’ is compared to local file ‘X.orig’, if extant, rather than being compared to local file ‘X’, which will always differ if it’s been converted by ‘--convert-links’ (‘-k’).
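
The choice of which local file to compare against can be sketched as a small Python helper. The name comparison_target and its parameters are hypothetical; the sketch only mirrors the rule stated above and is not taken from Wget's source.

```python
import os


def comparison_target(local_path: str, backup_converted: bool) -> str:
    """Pick the local file whose time-stamp is compared with the server file."""
    orig_path = local_path + ".orig"
    # With -K in effect, prefer the unconverted backup X.orig if it exists,
    # because X itself may have been rewritten by link conversion.
    if backup_converted and os.path.exists(orig_path):
        return orig_path
    return local_path
```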

Arguably, HTTP time-stamping should be implemented using the If-Modified-Since request header.
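
For comparison, a conditional request based on If-Modified-Since might look like the following Python sketch. With this approach the server itself decides whether the file has changed and answers 304 Not Modified if it has not. The helper names conditional_headers and is_unchanged are hypothetical and serve only to illustrate the idea.

```python
from email.utils import formatdate


def conditional_headers(local_mtime: float) -> dict:
    """Build request headers asking for the file only if it changed since local_mtime."""
    # HTTP dates are expressed in GMT, hence usegmt=True.
    return {"If-Modified-Since": formatdate(local_mtime, usegmt=True)}


def is_unchanged(status_code: int) -> bool:
    """A 304 Not Modified response means the local copy is still current."""
    return status_code == 304
```

A single GET carrying this header would replace the separate HEAD request and the client-side date comparison.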



(2) As an additional check, Wget will look at the Content-Length header and compare the sizes; if they are not the same, the remote file will be downloaded no matter what the time-stamp says.
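
The way the size check overrides the time-stamp comparison can be sketched in Python. The function should_download and its parameters are hypothetical; treating a missing Content-Length as "no size information, fall back to the time-stamp" is an assumption of this sketch.

```python
from typing import Optional


def should_download(remote_newer: bool, local_size: int,
                    content_length: Optional[str]) -> bool:
    """Combine the time-stamp result with the Content-Length size check."""
    # A size mismatch forces a download regardless of the time-stamps.
    if content_length is not None and int(content_length) != local_size:
        return True
    # Sizes match (or are unknown): the time-stamp comparison decides.
    return remote_newer
```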