Needing Enhanced PDF compare/duplicate finding

The best solution for finding and removing duplicate files.
Zardoz2293
Posts: 10
Joined: Wed Feb 29, 2012 10:52 pm

Post by Zardoz2293 »

I can take a PDF file, duplicate it, make a single change, and DC v4.1.0 will never flag the pair as "similar content", even though the two files are 99.99999% identical. I have thousands of these cases.
therube
Posts: 615
Joined: Tue Jun 28, 2011 4:38 pm

Re: Needing Enhanced PDF compare/duplicate finding

Post by therube »

Unable to confirm.

Image

Image

I took a PDF and edited it, replacing 3 bytes, as shown.
Duplicate Cleaner, at its default of 90% similarity, showed the files (the original, the edited copy, and a .bak of the edited copy that my editor made) all as "duplicates".

Could some other setting be causing the compare to fail?



(I am surprised that DC does not list a similarity "%", which would distinguish "exact duplicates" from "similar duplicates". RFE.)
Edit: Looks like I've had that thought before, heh, RFE: Similar Searches Should Display Similarity %.
DigitalVolcano
Site Admin
Posts: 1725
Joined: Thu Jun 09, 2011 10:04 am

Re: Needing Enhanced PDF compare/duplicate finding

Post by DigitalVolcano »

It's possible that a small change to a PDF's content alters a large portion of the file, since PDF is a binary format, which would explain why the search isn't picking it up.
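
To illustrate the point, here is a minimal Python sketch. It is not how Duplicate Cleaner compares files; `byte_similarity` and the sample text are hypothetical stand-ins. It assumes the PDF stores its page content in a compressed stream, which is common, and uses `zlib` to mimic that. The same 3-byte edit is measured once on the raw bytes and once on the compressed bytes.

```python
import zlib


def byte_similarity(a: bytes, b: bytes) -> float:
    """Fraction of positions holding the same byte, relative to the longer input."""
    if not a and not b:
        return 1.0
    same = sum(x == y for x, y in zip(a, b))
    return same / max(len(a), len(b))


# Hypothetical page text standing in for the content of a PDF stream.
original = "".join(
    f"Line {i}: the quick brown fox, value {i * 37 % 1000}\n" for i in range(400)
).encode("ascii")

# A same-length, 3-byte edit near the start of the content.
edited = original.replace(b"fox", b"cat", 1)

# Compare the raw bytes, then the bytes as they might be stored after compression.
raw_sim = byte_similarity(original, edited)
packed_sim = byte_similarity(zlib.compress(original), zlib.compress(edited))

print(f"uncompressed: {raw_sim:.4%}")
print(f"compressed:   {packed_sim:.4%}")
```

The uncompressed figure stays above 99.9%, while the compressed figure is lower and on typical inputs drops sharply, because one early change can shift every byte that follows it in the compressed stream. An edit made directly to the file's bytes in a hex editor avoids this, but an edit made in a PDF application that re-saves the document would not.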

Enhanced PDF reading is on the wish list.