Detect Corrupt Photos (using Bash)

Corrupt Images

Corrupt (broken) image files have been popping up in my collection for a few years. Until now, I’ve only ever fixed them by restoring from backups as and when I found them.

Possible Causes

I am not certain what caused the corruption. The images could have been damaged by a dying hard drive, or corrupted in RAM and then saved to disk.

Automated Detection

This weekend I decided to systematically find all the defective images in my collection (both JPEG and .CR2 raw files). I couldn’t find a free ready-made solution, so I ended up creating my own. Getting to this point meant piecing together a lot of posts and pages on the Internet, plus a bit of experimentation, so I thought I would document it here for others to find.


The solution uses Bash and ImageMagick®. I’ve developed and tested it using Cygwin but I’m sure it will work on most Linux distributions as well.

Required Libraries and Dependencies

You will need ImageMagick. If you would also like to check raw files, you will need to install ufraw-batch as well.


Step #1 is to enumerate all the files you wish to check:

find . -type f \( -iname '*.jp*g' -o -iname '*.CR2' \) > all-files.out
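
To see what a grouped find expression of this kind picks up, here is a small sketch that runs it against a throwaway directory tree. The directory and file names are made up for the demonstration. The point it illustrates is that putting the two name tests inside \( ... \) makes -type f apply to both of them, so a directory that happens to be called album.jpg is not listed.

```shell
#!/bin/bash
# Demo on a throwaway directory tree; every name here is hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir -p photos/album.jpg          # a directory whose name looks like a JPEG
touch photos/one.jpg photos/two.JPEG photos/three.cr2 photos/notes.txt

# -type f applies to both name tests because they are grouped in \( ... \).
# -iname matches case-insensitively, so .JPEG and .cr2 are found as well.
found=$(find . -type f \( -iname '*.jp*g' -o -iname '*.CR2' \) | sort)

echo "$found"
```

This should list the three image files and leave out both notes.txt and the album.jpg directory.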

Step #2 is to make sure the two result files exist (touch creates them if they are missing and leaves their contents alone otherwise):

touch done.out
touch failed.out

Then run this line:

awk '{if (f==1) { r[$0] } else if (! ($0 in r)) { print $0 } } ' f=1 done.out failed.out f=2 all-files.out | xargs -n 1 -P 2 -I '{}' ./ "{}"

The first part makes the command resumable. Here awk reads done.out and failed.out first, then prints only those lines of all-files.out that appear in neither, so files that have already been checked are not checked again.
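
The awk filter can be tried on its own, without any images. The sketch below uses made-up file names and pretends that one file has already passed and one has already failed; only the remaining two should be printed.

```shell
#!/bin/bash
# Demo of the resume filter; the file names listed here are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' ./a.jpg ./b.jpg ./c.CR2 ./d.jpg > all-files.out
printf '%s\n' ./a.jpg > done.out      # already checked, image was fine
printf '%s\n' ./c.CR2 > failed.out    # already checked, image was corrupt

# While f==1, each line is remembered as a key of the array r.
# While f==2, a line is printed only if it was never remembered.
pending=$(awk '{ if (f==1) { r[$0] } else if (!($0 in r)) { print $0 } }' \
    f=1 done.out failed.out f=2 all-files.out)

echo "$pending"
```

The f=1 and f=2 arguments are variable assignments that awk applies when it reaches them in the argument list, which is how the same program treats the result files and the full list differently.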

The xargs command then calls the script (see below) once per remaining file, running two at a time (-P 2). The script looks like this:

#!/bin/bash
if identify -verbose "$1" >/dev/null; then
    echo "$1" >> done.out      # identify read the whole image without error
else
    echo "$1" >> failed.out    # identify reported a problem
fi


There is a possibility that two (or more) of the parallel processes could write to done.out or failed.out at exactly the same time, leaving two entries mangled together on one line. This hasn’t happened to me yet, even when running 3 processes in parallel. I think this is largely because processing an image file, especially a raw file, takes far longer than appending a line to a file.
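
If you want to check whether such a collision has occurred, one option is to look for result lines that do not match any path in all-files.out. The following is only a sketch of that idea, run on made-up data with a collision simulated by hand; it is not part of the procedure above.

```shell
#!/bin/bash
# Sanity check on hypothetical data: report result lines that are not
# present in all-files.out, which is what a mangled write would look like.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' ./a.jpg ./b.jpg ./c.CR2 > all-files.out
printf '%s\n' ./a.jpg > done.out
# Simulate two writes that collided and ended up on a single line.
printf '%s\n' './b.jpg./c.CR2' > failed.out

# First file (NR==FNR): remember every known path.
# Later files: print any line that is not a known path, with its file name.
suspect=$(awk 'NR==FNR { known[$0]; next }
               !($0 in known) { print FILENAME ": " $0 }' \
    all-files.out done.out failed.out)

echo "$suspect"
```

On real data, empty output means every recorded line corresponds to a file from the original list. Any file reported this way can be removed from the result file so that the resumable command picks it up again.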


If you have any comments or suggestions please let me know.