This may be difficult to answer, but is there a way to determine why the program sees two files as not identical? I'm scanning two hard drives that contain pretty much the same content, but after running a Same Content scan and removing the duplicates from one drive, there are always a handful of files left that have the same file name, the same size, and usually a similar file path. These files do get selected when I run a 99.9% similar scan. I have no reason to think the files would be different, especially the PDFs, but I want to be sure they are the same before I delete them.
Thanks
Same Content vs. 99.9% Similar Content with same name & size
- DigitalVolcano
- Site Admin
Re: Same Content vs. 99.9% Similar Content with same name &
The files must have a tiny difference (e.g. some kind of metadata date stamp in the header). Another possibility is corruption, with DC failing when it tries to read the contents. Try checking whether the two files produce different hash values using the DigitalVolcano Hash Tool.
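If you would rather check this by hand, here is a rough Python sketch of the same idea: hash both files, and if the hashes differ, report the offset of the first byte that does not match. This is only an illustration and not how DC or the Hash Tool work internally; the function names `file_hash` and `first_difference`, the chunk size, and the choice of SHA-256 are all my own assumptions.

```python
import hashlib

CHUNK = 1024 * 1024  # read 1 MiB at a time so large files are not loaded whole


def file_hash(path, algo="sha256"):
    """Return the hex digest of a file's contents (hypothetical helper)."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK), b""):
            h.update(chunk)
    return h.hexdigest()


def first_difference(path_a, path_b):
    """Return the offset of the first differing byte, or None if identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(CHUNK)
            b = fb.read(CHUNK)
            if a != b:
                # Locate the exact byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # One chunk is a prefix of the other: the files differ in length.
                return offset + min(len(a), len(b))
            if not a:
                # Both files ended at the same point with no mismatch.
                return None
            offset += len(a)
```

Call `file_hash` on each of the two copies and compare the results. If they differ, `first_difference` tells you where: an offset near the start of the file would be consistent with a header or metadata stamp, while a read error raised during either call would point towards corruption or a failing disk.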