Same Content vs. 99.9% Similar Content with same name & size

Post by **MAS** » Thu Mar 29, 2012 2:05 pm

This may be difficult to answer, but is there a way to determine why the program sees two files as not identical? I'm scanning two hard drives which contain pretty much the same content, but after doing a same content scan and removing the duplicates from one, there are always a handful of files left that have the same file name, the same size, and usually a similar file path. These files get selected when I do a 99.9% similar scan. I have no reason to think the files would be different, especially the PDFs, but I want to be sure they are the same before I delete them.

Thanks

Post by **DigitalVolcano** » Fri Mar 30, 2012 1:26 pm

The files must have a tiny difference (e.g. some kind of metadata date stamp in the header). Another possible reason is corruption, with DC failing when it tries to read the contents. Try checking whether they produce different values in the DigitalVolcano Hash Tool.
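
If you'd rather check this by hand, here is a rough Python sketch of the same idea. It is not the Hash Tool itself, and the function names `file_sha256` and `first_difference` are made up for illustration. It hashes each file with SHA-256 and, if the hashes differ, reports the byte offset where the two files first diverge.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the first mismatching byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No mismatch in the shared part: one file is simply shorter.
                return offset + min(len(a), len(b))
            if not a:
                # Both files ended at the same point with no mismatch.
                return None
            offset += len(a)
```

Calling `file_sha256` on both copies and comparing the results tells you whether they are byte-for-byte identical. If they are not, an offset from `first_difference` that falls near the start of the file would be consistent with a header or metadata difference, while an error raised during reading would point towards the corruption case.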

DigitalVolcano Software Support
