This manual is for GNU ddrescue (version 1.16, 11 June 2012).
Copyright © 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012 Antonio Diaz Diaz.
This manual is free documentation: you have unlimited permission to copy, distribute and modify it.
GNU ddrescue is a data recovery tool. It copies data from one file or block device (hard disc, cdrom, etc) to another, trying hard to rescue data in case of read errors.
The basic operation of ddrescue is fully automatic. That is, you don't have to wait for an error, stop the program, read the log, run it backwards, etc.
If you use the logfile feature of ddrescue, the data is rescued very efficiently (only the needed blocks are read). You can also interrupt the rescue at any time and resume it later at the same point.
Ddrescue does not write zeros to the output when it finds bad sectors in the input, and does not truncate the output file if not asked to. So, every time you run it on the same output file, it tries to fill in the gaps without wiping out the data already rescued.
Automatic merging of backups: If you have two or more damaged copies of a file, cdrom, etc, and run ddrescue on all of them, one at a time, with the same output file, you will probably obtain a complete and error-free file. This is so because the probability of having damaged areas at the same places on different input files is very low. Using the logfile, only the needed blocks are read from the second and successive copies.
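The merging behaviour described above can be pictured with a small Python model. This is only a sketch under stated assumptions, not ddrescue's implementation: each copy is a hypothetical list of sectors in which `None` stands for an unreadable sector, and the `rescued` list plays the role of the logfile.

```python
def merge_copies(copies):
    """Fill one output from several damaged copies of the same data.

    Hypothetical model: every copy is a list of sectors of equal length,
    and None marks a sector that could not be read from that copy.
    """
    size = len(copies[0])
    output = [None] * size
    rescued = [False] * size          # stands in for the logfile
    for copy in copies:
        for i, sector in enumerate(copy):
            # Only sectors still missing are taken from this copy, and
            # data already rescued is never overwritten.
            if not rescued[i] and sector is not None:
                output[i] = sector
                rescued[i] = True
    return output, rescued


first = ["a", None, "c", None]
second = [None, "b", "c", None]
output, rescued = merge_copies([first, second])
print(output)    # ['a', 'b', 'c', None]
print(rescued)   # [True, True, True, False]
```

The last sector is damaged in both copies, so it stays unrescued; every other gap is filled by whichever copy can supply it.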
Ddrescue recommends lzip for compression of backups because of its reliability and data recovery capabilities, including error-checked merging of backup copies. The combination ddrescue + lziprecover is the best option for recovering data from multiple damaged copies. See Example 5 (rescuing a lzip compressed backup from two copies on CD-ROM) for an example.
Recordable CD and DVD media keep their data only for a finite time (typically for many years). After that time, data loss develops slowly with read errors growing from the outer media region towards the inside. Just make two (or more) copies of every important CD/DVD you burn so that you can later recover them with ddrescue.
Because ddrescue needs to read and write at random places, it only works on seekable (random access) input and output files.
If your system supports it, ddrescue can use direct disc access to read the input file, bypassing the kernel cache.
Ddrescue also features a "fill mode" able to selectively overwrite parts of the output file, which has a number of interesting uses like wiping data, marking bad areas or even, in some cases, "repairing" damaged sectors.
GNU ddrescue efficiently manages the status of the rescue in progress and tries to rescue the good parts first, scheduling reads inside bad (or slow) areas for later. This maximizes the amount of data that can finally be recovered from a failing drive.
The standard dd utility can be used to save data from a failing drive, but it reads the data sequentially, which may wear out the drive without rescuing anything if the errors are at the beginning of the drive.
Other programs switch to small size reads when they find errors, but they still read the data sequentially. This is a bad idea because it means spending more time at error areas, damaging the surface, the heads and the drive mechanics, instead of getting out of them as fast as possible. This behavior reduces the chances of rescuing the remaining good data.
The algorithm of ddrescue is as follows (the user may interrupt the process at any point, but be aware that a bad drive can block ddrescue for a long time until the kernel gives up):
1) Optionally read a logfile describing the status of a multi-part or previously interrupted rescue. If no logfile is specified, or if it is empty or does not exist, mark the whole rescue domain as non-tried.
2) Read the non-tried parts of the input file, marking the failed blocks as non-trimmed and skipping beyond them, until all the rescue domain is tried. Only non-tried areas are read in large blocks. Trimming, splitting and retrying are done sector by sector. Each sector is tried at most two times; the first in this step as part of a large block read, the second in one of the steps below as a single sector read.
3) Read the non-trimmed blocks backwards, one sector at a time, until a bad sector is found. For each non-trimmed block, mark the bad sector found as bad-sector and mark the rest of that block as non-split.
4) Read the non-split blocks forwards, one sector at a time, marking the bad sectors found as bad-sector. After a number of consecutive bad sectors is found in a large enough block, the block is split in half and the reading continues on the second half. This recursively splits the largest failed blocks without producing too large a logfile.
5) Optionally try to read again the bad sectors until the specified number of retries is reached.
6) Optionally write a logfile for later use.
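The copying, trimming and splitting steps above can be sketched as a toy simulation in Python. It is a simplified model, not ddrescue's code: the drive is a hypothetical list of booleans (`True` for a readable sector), the `cluster` parameter is an assumed large-read size in sectors, and the halving of blocks in step 4, the retries of step 5 and the logfile itself are left out.

```python
def runs(status, char):
    """Return (start, end) pairs of maximal runs of `char` in `status`."""
    found, start = [], None
    for i, s in enumerate(status + [None]):
        if s == char and start is None:
            start = i
        elif s != char and start is not None:
            found.append((start, i))
            start = None
    return found


def rescue(sectors_ok, cluster=4):
    """Simulate steps 1 to 4 on a hypothetical drive model.

    sectors_ok[i] is True when sector i can be read.
    """
    n = len(sectors_ok)
    status = ["?"] * n                       # step 1: all non-tried

    # Step 2: large reads; a failure marks the rest of the block as
    # non-trimmed and reading continues beyond it.
    for start in range(0, n, cluster):
        end = min(start + cluster, n)
        pos = start
        while pos < end and sectors_ok[pos]:
            status[pos] = "+"
            pos += 1
        for i in range(pos, end):
            status[i] = "*"

    # Step 3: trim each non-trimmed block backwards until a bad sector.
    for start, end in runs(status, "*"):
        pos = end - 1
        while pos >= start and sectors_ok[pos]:
            status[pos] = "+"
            pos -= 1
        if pos >= start:
            status[pos] = "-"
            for i in range(start, pos):
                status[i] = "/"

    # Step 4: read the non-split blocks forwards, sector by sector.
    for start, end in runs(status, "/"):
        for i in range(start, end):
            status[i] = "+" if sectors_ok[i] else "-"
    return "".join(status)


T, F = True, False
print(rescue([T, T, F, T, T, F, F, T, T, T, T, T]))   # ++-++--+++++
```

In this model every sector is read at most twice: once as part of a large read in step 2 and once as a single sector in step 3 or 4.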
Note that as ddrescue splits the failed blocks, making them smaller, the total error size may diminish while the number of errors increases.
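A small worked example may make this clearer. The Python sketch below uses hypothetical block lists of `(position, size, status)` tuples and assumes that an "error" is counted as one contiguous run of failed blocks (`*`, `/` or `-`); that counting rule is an assumption made for illustration.

```python
def error_summary(blocks):
    """Return (total error size, number of error areas)."""
    errsize, errors, in_error = 0, 0, False
    for _pos, size, status in blocks:
        failed = status in "*/-"
        if failed:
            errsize += size
            if not in_error:          # a new run of failed blocks begins
                errors += 1
        in_error = failed
    return errsize, errors


# One large failed block before trimming and splitting ...
before = [(0, 1000, "+"), (1000, 3000, "*"), (4000, 1000, "+")]
# ... turns into two small bad areas with good data between them.
after = [(0, 1000, "+"), (1000, 512, "-"), (1512, 1976, "+"),
         (3488, 512, "-"), (4000, 1000, "+")]

print(error_summary(before))   # (3000, 1)
print(error_summary(after))    # (1024, 2)
```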
The logfile is periodically saved to disc, as well as when ddrescue finishes or is interrupted. So in case of a crash you can resume the rescue with little recopying.
Also, the same logfile can be used for multiple commands that copy different areas of the input file, and for multiple recovery attempts over different subsets. See this example:
Rescue the most important part of the disc first.
ddrescue -i0 -s50MiB /dev/hdc hdimage logfile
ddrescue -i0 -s1MiB -d -r3 /dev/hdc hdimage logfile
Then rescue some key disc areas.
ddrescue -i30GiB -s10GiB /dev/hdc hdimage logfile
ddrescue -i230GiB -s5GiB /dev/hdc hdimage logfile
Now rescue the rest (does not recopy what is already done).
ddrescue /dev/hdc hdimage logfile
ddrescue -d -r3 /dev/hdc hdimage logfile
The format for running ddrescue is:
ddrescue [options] infile outfile [logfile]
ddrescue supports the following options:
Note that specifying a minimum read rate above the possibilities of the
input device will result in a very low average rate because all reads
will be considered slow reads and continuous skipping will occur. This
will also make the logfile grow disproportionately.
If your system does not support direct disc access, ddrescue will warn
you. If the sector size is not correctly set, all reads will result in
errors, and no data will be rescued.
ddrescue -i 100 -s 200 infile outfile logfile
Numbers given as arguments to options (positions, sizes, retries, etc) may be followed by a multiplier and an optional `B' for "byte".
Table of SI and binary prefixes (unit multipliers):
| Prefix | Value                  | Prefix | Value                  |
|        |                        | b      | hardware blocks        |
| k      | kilobyte (10^3 = 1000) | Ki     | kibibyte (2^10 = 1024) |
| M      | megabyte (10^6)        | Mi     | mebibyte (2^20)        |
| G      | gigabyte (10^9)        | Gi     | gibibyte (2^30)        |
| T      | terabyte (10^12)       | Ti     | tebibyte (2^40)        |
| P      | petabyte (10^15)       | Pi     | pebibyte (2^50)        |
| E      | exabyte (10^18)        | Ei     | exbibyte (2^60)        |
| Z      | zettabyte (10^21)      | Zi     | zebibyte (2^70)        |
| Y      | yottabyte (10^24)      | Yi     | yobibyte (2^80)        |
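As an illustration of how such arguments translate to byte counts, here is a minimal Python sketch. It is not ddrescue's parser: the function name `parse_size` and the `hardbs` parameter (the hardware block size, assumed here to be 512 bytes) are hypothetical, and only plain decimal numbers are handled.

```python
import re

# Multipliers taken from the table of SI and binary prefixes.
SI = {p: 1000 ** (i + 1) for i, p in enumerate("kMGTPEZY")}
BINARY = {p + "i": 1024 ** (i + 1) for i, p in enumerate("KMGTPEZY")}


def parse_size(text, hardbs=512):
    """Convert strings such as '50MiB', '1k' or '63b' to a byte count."""
    match = re.fullmatch(r"(\d+)([A-Za-z]*)", text)
    if match is None:
        raise ValueError("bad number: %r" % text)
    number, suffix = int(match.group(1)), match.group(2)
    if suffix == "b":                       # hardware blocks
        return number * hardbs
    if suffix.endswith("B"):                # optional 'B' for "byte"
        suffix = suffix[:-1]
    if suffix == "":
        return number
    if suffix in SI:
        return number * SI[suffix]
    if suffix in BINARY:
        return number * BINARY[suffix]
    raise ValueError("bad multiplier in %r" % text)


print(parse_size("50MiB"))   # 52428800
print(parse_size("1k"))      # 1000
print(parse_size("63b"))     # 32256
```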
Return values: 0 for a normal exit, 1 for environmental problems (file not found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or invalid input file, 3 for an internal consistency error (eg, bug) which caused ddrescue to panic.
The logfile is a text file easy to read and edit. It is formed by three parts, the heading comments, the status line, and the list of data blocks.
Any line beginning with `#' is a comment line. The blocks in the list of data blocks must be contiguous and non-overlapping.
NOTE: Logfiles generated by a version of ddrescue prior to 1.6 lack the status line. If you want to use an old logfile with ddrescue 1.6 or later, you will have to insert a line like `0 +' at the beginning of the logfile.
The heading comments contain the version of ddrescue and the command line used to create the logfile. They are intended as information for the user.
The first non-comment line is the status line. It contains a non-negative integer and a status character. The integer is the position being tried in the input file. The status character is one of these:
| Character | Meaning                        |
| '?'       | copying non-tried blocks       |
| '*'       | trimming non-trimmed blocks    |
| '/'       | splitting non-split blocks     |
| '-'       | retrying bad sectors           |
| 'F'       | filling specified blocks       |
| 'G'       | generating approximate logfile |
| '+'       | finished                       |
Every line in the list of data blocks describes a block of data. It contains 2 non-negative integers and a status character. The first integer is the starting position of the block in the input file, the second integer is the size (in bytes) of the block. The status character is one of these:
| Character | Meaning                    |
| '?'       | non-tried block            |
| '*'       | failed block non-trimmed   |
| '/'       | failed block non-split     |
| '-'       | failed block bad-sector(s) |
| '+'       | finished block             |
And here is an example logfile:
# Rescue Logfile. Created by GNU ddrescue version 1.16
# Command line: ddrescue /dev/fd0 fdimage logfile
# current_pos current_status
0x00120000 ?
# pos size status
0x00000000 0x00117000 +
0x00117000 0x00000200 -
0x00117200 0x00001000 /
0x00118200 0x00007E00 *
0x00120000 0x00048000 ?
If you edit the file, you may use decimal, hexadecimal or octal values, using the same syntax as integer constants in C++.
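The structure described in this chapter is simple enough to read with a few lines of code. The following Python sketch parses a logfile into its status line and list of data blocks and checks that the blocks are contiguous. The names `c_int` and `parse_logfile` are hypothetical helpers, and the parsing rules are inferred only from the description above.

```python
SAMPLE = """\
# Rescue Logfile. Created by GNU ddrescue version 1.16
# current_pos  current_status
0x00120000     ?
#      pos        size  status
0x00000000  0x00117000  +
0x00117000  0x00000200  -
0x00117200  0x00001000  /
0x00118200  0x00007E00  *
0x00120000  0x00048000  ?
"""


def c_int(text):
    """Parse decimal, hexadecimal (0x...) or octal (0...) like C++."""
    if text.lower().startswith("0x"):
        return int(text, 16)
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)
    return int(text)


def parse_logfile(text):
    """Return ((current_pos, current_status), [(pos, size, status), ...])."""
    lines = [line for line in text.splitlines()
             if line.strip() and not line.startswith("#")]
    pos, status = lines[0].split()           # first non-comment line
    blocks = []
    for line in lines[1:]:
        start, size, kind = line.split()
        blocks.append((c_int(start), c_int(size), kind))
    # The blocks must be contiguous and non-overlapping.
    for (p1, s1, _), (p2, _, _) in zip(blocks, blocks[1:]):
        if p1 + s1 != p2:
            raise ValueError("block at %#x does not follow the previous one" % p2)
    return (c_int(pos), status), blocks


(pos, status), blocks = parse_logfile(SAMPLE)
rescued = sum(size for _, size, kind in blocks if kind == "+")
print(hex(pos), status, len(blocks), rescued)   # 0x120000 ? 5 1142784
```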
Ddrescue is like any other power tool. You need to understand what it does, and you need to understand some things about the machines it does those things to, in order to use it safely.
A failing drive tends to develop more and more errors as time passes. Because of this, you should rescue the data from a drive as soon as you notice the first error. Be diligent because every time a physically damaged drive powers up and is able to output some data, it may be the very last time that it ever will.
You should make a copy of the failing drive with ddrescue, and then try to repair the copy. If your data is really important, use the first copy as a master for a second copy, and try to repair the second copy. If something goes wrong, you have the master intact to try again.
IMPORTANT! Always use a logfile unless you know you won't need it. Without a logfile, ddrescue can't resume a rescue, only reinitiate it.
IMPORTANT! Never try to rescue a r/w mounted partition. The resulting copy may be useless.
IMPORTANT! If you use a device or a partition as destination, any data stored there will be overwritten.
IMPORTANT! If you interrupt the rescue and then reboot, any partially copied partitions should be hidden before allowing them to be touched by any operating system that tries to mount and "fix" the partitions it sees.
IMPORTANT! Never try to repair a file system on a drive with I/O errors; you will probably lose even more data.
If you are trying to rescue a whole partition, first repair the copy with e2fsck or some other tool appropriate for the type of partition you are trying to rescue, then mount the repaired copy somewhere and try to recover the files in it.
If the drive is so damaged that the file system in the rescued partition can't be repaired or mounted, you will have to browse the rescued data with a hex editor and extract the desired parts by hand, or use a file recovery tool like photorec.
If the partition table is damaged, you may try to rescue the whole disc, then try to repair the partition table and the partitions on the copy.
If the damaged drive is not listed in /dev, then you cannot rescue it. At least not with ddrescue.
Example 1: Rescue a whole disc with two ext2 partitions in /dev/hda to /dev/hdb.
ddrescue -f -n /dev/hda /dev/hdb logfile
ddrescue -d -f -r3 /dev/hda /dev/hdb logfile
fdisk /dev/hdb
e2fsck -v -f /dev/hdb1
e2fsck -v -f /dev/hdb2
Example 2: Rescue an ext2 partition in /dev/hda2 to /dev/hdb2.
ddrescue -f -n /dev/hda2 /dev/hdb2 logfile
ddrescue -d -f -r3 /dev/hda2 /dev/hdb2 logfile
e2fsck -v -f /dev/hdb2
mount -t ext2 -o ro /dev/hdb2 /mnt
(read rescued files from /mnt)
Example 3: Rescue a CD-ROM in /dev/cdrom.
ddrescue -n -b2048 /dev/cdrom cdimage logfile
ddrescue -d -b2048 /dev/cdrom cdimage logfile
(write cdimage to a blank CD-ROM)
Example 4: Rescue a CD-ROM in /dev/cdrom from two copies.
ddrescue -n -b2048 /dev/cdrom cdimage logfile
ddrescue -d -b2048 /dev/cdrom cdimage logfile
(insert second copy in the CD drive)
ddrescue -d -r1 -b2048 /dev/cdrom cdimage logfile
(write cdimage to a blank CD-ROM)
Example 5: Rescue a lzip compressed backup from two copies on CD-ROM with error-checked merging of copies (See the lziprecover manual for details about lziprecover).
ddrescue -b2048 /dev/cdrom cdimage1 logfile1
mount -t iso9660 -o loop,ro cdimage1 /mnt/cdimage
cp /mnt/cdimage/backup.tar.lz rescued1.tar.lz
umount /mnt/cdimage
(insert second copy in the CD drive)
ddrescue -b2048 /dev/cdrom cdimage2 logfile2
mount -t iso9660 -o loop,ro cdimage2 /mnt/cdimage
cp /mnt/cdimage/backup.tar.lz rescued2.tar.lz
umount /mnt/cdimage
lziprecover -m -v -o rescued.tar.lz rescued1.tar.lz rescued2.tar.lz
Example 6: While rescuing the whole drive /dev/hda to /dev/hdb, /dev/hdb fails and you have to rescue data to a third drive, /dev/hdc.
ddrescue -f -n /dev/hda /dev/hdb logfile1 <-- /dev/hdb fails here
ddrescue -f -m logfile1 /dev/hdb /dev/hdc logfile2
ddrescue -f -n /dev/hda /dev/hdc logfile2
ddrescue -d -f -r3 /dev/hda /dev/hdc logfile2
Example 7: While rescuing the whole drive /dev/hda to /dev/hdb, /dev/hda stops responding and disappears from /dev.
ddrescue -f -n /dev/hda /dev/hdb logfile <-- /dev/hda fails here
(restart /dev/hda or reboot computer as many times as needed)
ddrescue -f -n -A /dev/hda /dev/hdb logfile
ddrescue -d -f -r3 /dev/hda /dev/hdb logfile
If you notice that the positions and sizes in the log file are ALWAYS multiples of the sector size, maybe your kernel is caching the disc accesses and grouping them. In this case you may want to use direct disc access or a raw device to bypass the kernel cache and rescue more of your data.
NOTE! Sector size must be correctly set with the `--block-size' option for this to work. Try the `--direct' option first. If direct disc access is not available in your system, try raw devices. Read your system documentation to find how to bind a raw device to a regular block device.
Ddrescue aligns its I/O buffer to the sector size so that it can be used for direct disc access or to read from raw devices. For efficiency reasons, it also aligns the buffer to the memory page size if the page size is a multiple of the sector size. Ddrescue can't determine the size of a raw device, so an explicit `--max-size' or `--complete-only' option is needed.
Using direct disc access, or reading from a raw device, may be slower or faster than normal cached reading depending on your OS and hardware. In case it is slower you may want to make a first pass using normal cached reads and use direct disc access, or a raw device, only to recover the good sectors inside the failed blocks.
Example 1: using direct disc access.
ddrescue -f -n /dev/hdb1 /dev/hdc1 logfile
ddrescue -d -f -r3 /dev/hdb1 /dev/hdc1 logfile
e2fsck -v -f /dev/hdc1
mount -t ext2 -o ro /dev/hdc1 /mnt
Example 2: using a raw device.
raw /dev/raw/raw1 /dev/hdb1
ddrescue -f -n /dev/hdb1 /dev/hdc1 logfile
ddrescue -C -f -r3 /dev/raw/raw1 /dev/hdc1 logfile
raw /dev/raw/raw1 0 0
e2fsck -v -f /dev/hdc1
mount -t ext2 -o ro /dev/hdc1 /mnt
When ddrescue is invoked with the `--fill' option it operates in "fill mode", which is different from the default "rescue mode". That is, if you use the `--fill' option, ddrescue does not rescue anything. It only fills with data read from the input file the blocks of the output file whose status character from the logfile coincides with one of the type characters specified as argument to the `--fill' option.
In fill mode the input file may have any size. If it is too small, the data will be duplicated as many times as necessary to fill the input buffer. If it is too big, only the data needed to fill the input buffer will be read. Then the same data will be written to every block or sector to be filled.
Note that in fill mode the input file is always read from position 0. If you specify a `--input-position', it refers to the original input file from which the logfile was built, and is only used to calculate the offset between input and output positions.
Note also that when filling the input file of the original rescue run you should set `--input-position' and `--output-position' to identical values, whereas when filling the output file of the original rescue run you should keep the original offset between `--input-position' and `--output-position'.
The `--fill' option implies the `--complete-only' option.
In fill mode the logfile is updated to allow resumability when interrupted or in case of a crash, but as nothing is being rescued the logfile is not destroyed. The status line is the only part of the logfile that is modified.
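The way fill mode builds its buffer and applies it to the selected blocks can be sketched in Python. This is a simplified model and not ddrescue's code: `make_fill_buffer` and `fill` are hypothetical names, the output is an in-memory `bytearray`, and it is an assumption of the sketch that each block is filled starting from the beginning of the buffer.

```python
def make_fill_buffer(data, size):
    """Repeat or truncate `data` so that it is exactly `size` bytes long."""
    if not data:
        raise ValueError("the fill input must not be empty")
    repeats = -(-size // len(data))          # ceiling division
    return (data * repeats)[:size]


def fill(image, blocks, types, data, bufsize=16):
    """Overwrite the blocks of `image` whose status is in `types`."""
    buf = make_fill_buffer(data, bufsize)
    for pos, size, status in blocks:
        if status not in types:
            continue                         # block type not selected
        end = pos + size
        while pos < end:
            chunk = min(bufsize, end - pos)
            image[pos:pos + chunk] = buf[:chunk]
            pos += chunk


image = bytearray(b"." * 24)
blocks = [(0, 8, "+"), (8, 10, "-"), (18, 6, "+")]
fill(image, blocks, "-", b"BAD ", bufsize=8)
print(bytes(image))   # b'........BAD BAD BA......'
```

Only the block marked `-` is touched; the finished blocks on either side keep their contents.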
The fill mode has a number of uses. See the following examples:
Example 1: Mark parts of the rescued copy to allow finding them when examined in a hex editor. For example, the following command line fills all blocks marked as `-' (bad-sector) with copies of the string `BAD SECTOR ':
printf "BAD SECTOR " > tmpfile
ddrescue --fill=- tmpfile outfile logfile
Example 2: Wipe only the good sectors, leaving the bad sectors alone. This way, the drive will still test bad (i.e., with unreadable sectors). This is the fastest way of wiping a failing drive, and is especially useful when sending the drive back to the manufacturer for warranty replacement.
ddrescue --fill=+ --force /dev/zero bad_drive logfile
Example 3: Force the drive to remap the bad sectors, making it usable again. If the drive has only a few bad sectors, and they are not caused by drive age, you can probably just rewrite those sectors, and the drive will reallocate them automatically to new "spare" sectors that it keeps for just this purpose. WARNING! This may not work on your drive.
ddrescue --fill=- --force --synchronous /dev/zero bad_drive logfile
Fill mode can also help you to figure out, independently of the file system used, what files are partially or entirely in the bad areas of the disc. Just follow these steps:
1) Copy the damaged drive with ddrescue until finished. Do not use sparse writes. This yields a logfile with only finished (`+') and bad-sector (`-') blocks.
2) Fill the bad-sector blocks of the copied drive or image file with a string not present in any file, for example "DEADBEEF".
3) Mount the copied drive (or the image file, via loopback device).
4) Grep for the fill string in all the files. Those files containing the string reside (at least partially) in damaged disc areas.
5) Unmount the copied drive or image file.
6) Optionally fill the bad-sector blocks of the copied drive or image file with zeros to restore the disc image.
Example 4: Figure out what files are in the bad areas of the disc.
ddrescue -b2048 /dev/cdrom cdimage logfile
printf "DEADBEEF" > tmpfile
ddrescue --fill=- tmpfile cdimage logfile
rm tmpfile
mount -t iso9660 -o loop,ro cdimage /mnt/cdimage
find /mnt/cdimage -type f -exec grep -l "DEADBEEF" '{}' ';'
umount /mnt/cdimage
ddrescue --fill=- /dev/zero cdimage logfile
So you didn't read the tutorial and started ddrescue without a logfile. Now, two days later, your computer crashed and you can't know how much data ddrescue managed to save. And even worse, you can't resume the rescue; you have to restart it from the very beginning.
Or maybe you started copying a drive with `dd conv=noerror,sync' and are now in the same situation described above. In this case, note that you can't use a copy made by dd unless it was invoked with the `sync' conversion argument.
Don't despair (yet). Ddrescue can in some cases generate an approximate logfile, from the input file and the (partial) copy, that is almost as good as an exact logfile. It does this by simply assuming that sectors containing all zeros were not rescued.
However, if the destination of the copy was a drive or a partition (or an existing regular file and truncation was not requested), you will most probably need to restart ddrescue from the very beginning (this time with a logfile, of course). The reason is that the destination may still contain old data that has not been overwritten yet, so some areas may be non-tried but non-zero.
For example, if you first tried one of these commands:
ddrescue infile outfile
or
dd if=infile of=outfile conv=noerror,sync
then you can generate an approximate logfile with this command:
ddrescue --generate-logfile infile outfile logfile
Note that you must keep the original offset between `--input-position' and `--output-position' of the original rescue run.
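The "all zeros means not rescued" assumption can be sketched in a few lines of Python. This is a simplified illustration, not what ddrescue actually runs: it looks only at the partial copy, held in memory as a `bytes` object, and the function name `approximate_blocks` and the 512-byte sector size are assumptions.

```python
def approximate_blocks(copy, sector=512):
    """Build a block list from a partial copy.

    A sector containing only zeros is assumed to be non-tried ('?');
    any other sector is assumed to be finished ('+').
    """
    blocks = []
    for pos in range(0, len(copy), sector):
        chunk = copy[pos:pos + sector]
        status = "+" if any(chunk) else "?"
        if blocks and blocks[-1][2] == status:
            blocks[-1][1] += len(chunk)      # extend the previous block
        else:
            blocks.append([pos, len(chunk), status])
    return [tuple(block) for block in blocks]


copy = b"\x01" * 1024 + b"\x00" * 512 + b"\x02" * 512
print(approximate_blocks(copy))
# [(0, 1024, '+'), (1024, 512, '?'), (1536, 512, '+')]
```

The sketch also shows why the result is only approximate: a sector that was rescued correctly but really contains only zeros is indistinguishable from one that was never copied.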
Ddrescuelog is a tool that manipulates ddrescue logfiles, shows logfile contents, converts logfiles to/from other formats, compares logfiles, tests rescue status, and can delete a logfile if the rescue is done. Ddrescuelog operations can be restricted to one or several parts of the logfile if the domain setting options are used.
Here are some examples of how to use ddrescuelog, alone or in combination with other tools.
Example 1: Delete the logfile if the rescue is finished (all data is recovered without errors left).
ddrescue -f /dev/hda /dev/hdb logfile
ddrescuelog -d logfile
Example 2: Rescue two ext2 partitions in /dev/hda to /dev/hdb and repair the file systems using badblock lists generated with ddrescuelog. File system block size is 4096.
fdisk /dev/hdb <-- partition /dev/hdb
ddrescue -f /dev/hda1 /dev/hdb1 logfile1
ddrescue -f /dev/hda2 /dev/hdb2 logfile2
ddrescuelog -l- -b4096 logfile1 > badblocks1
ddrescuelog -l- -b4096 logfile2 > badblocks2
e2fsck -v -f -L badblocks1 /dev/hdb1
e2fsck -v -f -L badblocks2 /dev/hdb2
Example 3: Rescue a whole disc with two ext2 partitions in /dev/hda to /dev/hdb and repair the file systems using badblock lists generated with ddrescuelog. Disc sector size is 512, file system block size is 4096. Arguments to options `-i' and `-s' are the starting positions and sizes of the partitions being rescued.
ddrescue -f /dev/hda /dev/hdb logfile
fdisk /dev/hdb <-- get partition sizes
ddrescuelog -l- -b512 -i63b -o0 -s9767457b -b4096 logfile > badblocks1
ddrescuelog -l- -b512 -i9767520b -o0 -s128520b -b4096 logfile > badblocks2
e2fsck -v -f -L badblocks1 /dev/hdb1
e2fsck -v -f -L badblocks2 /dev/hdb2
The format for running ddrescuelog is:
ddrescuelog [options] logfile
Ddrescuelog supports the following options:
type1 and type2 are block status characters as defined in
the chapter Logfile Structure. type1
sets the type for blocks included in the list, while type2 sets
the type for the rest of the logfile. If not specified, type1
defaults to `+' and type2 defaults to `-'.
The list format is one block number per line in decimal, like the output
of the badblocks program, so that it can be used as input for e2fsck or
other similar filesystem repair tools.
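To show what such a list contains, here is a rough Python sketch that turns the bad-sector blocks of a logfile into file system block numbers. It is not ddrescuelog's implementation: `list_blocks` is a hypothetical function, `start` and `length` are assumed stand-ins for a partition's position and size within the rescued disc, and it is an assumption that a file system block is listed as soon as any part of it falls inside a selected logfile block.

```python
def list_blocks(blocks, types="-", blocksize=4096, start=0, length=None):
    """Return the numbers of the blocks touched by the selected types.

    `blocks` holds (position, size, status) tuples in bytes; block
    numbers are counted from `start`.
    """
    limit = None if length is None else start + length
    numbers = []
    for pos, size, status in blocks:
        if status not in types:
            continue
        lo = max(pos, start)
        hi = pos + size if limit is None else min(pos + size, limit)
        if lo >= hi:
            continue                         # outside the chosen area
        first = (lo - start) // blocksize
        last = (hi - 1 - start) // blocksize
        for number in range(first, last + 1):
            if not numbers or numbers[-1] != number:
                numbers.append(number)       # skip duplicates
    return numbers


blocks = [(0x000000, 0x117000, "+"), (0x117000, 0x0200, "-"),
          (0x117200, 0x1000, "/"), (0x118200, 0x7E00, "*"),
          (0x120000, 0x48000, "?")]
for number in list_blocks(blocks, "-"):
    print(number)                            # 279
print(list_blocks(blocks, "-/"))             # [279, 280]
```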
Return values: 0 for a normal exit, 1 for environmental problems (file not found, invalid flags, I/O errors, etc), 2 to indicate a corrupt or invalid input file, 3 for an internal consistency error (eg, bug) which caused ddrescuelog to panic.
There are probably bugs in ddrescue. There are certainly errors and omissions in this manual. If you report them, they will get fixed. If you don't, no one will ever know about them and they will remain unfixed for all eternity, if not longer.
If you find a bug in GNU ddrescue, please send electronic mail to bug-ddrescue@gnu.org. Include the version number, which you can find by running `ddrescue --version'.