Identical MD5 but different files...

Posted: Sun Aug 15, 2010 6:11 pm
by FB
Hi,
Searching for duplicates, two files (.jpg) were found with the same MD5, but they have different names and, moreover, the pictures are not the same. Luckily I checked before deleting. Any idea how this can happen, since from what I've read it's almost impossible?
Thanks

Posted: Sun Aug 15, 2010 6:46 pm
by FB
My mistake - it seems I made an error in comparing the files.

Posted: Sat Aug 28, 2010 8:51 pm
by Stu
I have 2 files in the same folder that differ in both size and picture. They have similar names (red tailed hawk.jpg and red tailed hawk2.jpg). The scan showed both files with the same size (that of the larger file) and an identical MD5.

I am experienced and tried it several times, examined the files, etc.

How can that be?

Posted: Sat Aug 28, 2010 8:57 pm
by Stu
Re: previous post

- Please excuse my sloppy typing

- I am using version 1.4.6, downloaded yesterday - WinXP Pro - 4GB RAM

Thank you

Posted: Sun Aug 29, 2010 2:45 pm
by DV
DC won't even check for an MD5 match if the files are different sizes. They definitely shouldn't have the same MD5 at any rate! Is it possible for you to do a screenshot of the duplicate file list?
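
For anyone curious what a size-first check looks like, here is a rough Python sketch. It is not DC's actual code; the names md5_of and candidate_groups are made up for illustration. Files are bucketed by size first, and an MD5 is only computed for files that share a size with at least one other file.

```python
import hashlib
import os
from collections import defaultdict


def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def candidate_groups(paths):
    """Group paths by (size, MD5); files of a unique size are never hashed."""
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    groups = defaultdict(list)
    for size, same_size in by_size.items():
        if len(same_size) < 2:
            continue  # no other file has this size, so no MD5 is needed
        for path in same_size:
            groups[(size, md5_of(path))].append(path)
    # Only keep groups that actually contain more than one file
    return [group for group in groups.values() if len(group) > 1]
```

With a scheme like this, two files of different sizes can never land in the same group, whatever their MD5s are.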

Posted: Sun Oct 03, 2010 3:51 pm
by anionic
May I add my tuppenceworth here? If two files' MD5s differ, then the files are definitely different, but if the MD5s are identical, the files MAY OR MAY NOT be identical. To be sure of forming duplicate-groups accurately, when a scanned file is found to have the same MD5 as a previously scanned file, their contents should be compared byte-for-byte, and a new group started if necessary.
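
To make the suggestion concrete, here is a minimal Python sketch of that verification step. I don't know how DC is implemented, so this is only an illustration; split_by_content is a made-up name. It takes the files that share an MD5 and splits them into groups whose contents really are identical.

```python
import filecmp


def split_by_content(paths):
    """Split files that share an MD5 into groups of truly identical files."""
    groups = []
    for path in paths:
        for group in groups:
            # shallow=False forces a comparison of the file contents
            if filecmp.cmp(group[0], path, shallow=False):
                group.append(path)
                break
        else:
            groups.append([path])  # matches nothing seen so far: new group
    return groups
```

Any group that comes back with a single file was not a real duplicate and should not be offered for deletion.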

If DC already does this, I am seriously impressed and will donate £5 :-) but if not, there is a risk (higher than theoretical, as MD5 isn't particularly collision resistant) that a user will be misled into deleting a unique file :-/

The risk is reduced when manually selecting files for deletion if filenames give some reassurance that duplicates are genuine, but I wouldn't e.g. noninteractively hardlink all groups on my hard drive in case it screwed up something...