Needing Enhanced PDF compare/duplicate finding
Posted: Thu Mar 22, 2018 4:38 pm
by Zardoz2293
I can take a PDF file, duplicate it, make a single change, and DC v4.1.0 will never find the "similar content", even if it's 99.99999% identical. I have thousands of these cases.
Re: Needing Enhanced PDF compare/duplicate finding
Posted: Thu Mar 22, 2018 11:43 pm
by therube
Unable to confirm.
I took a PDF and edited it, replacing 3 bytes, as shown.
Duplicate Cleaner, at its default of 90% similarity, showed the files (the original, the edited copy, and a .bak of the edited copy that my editor made) all as "duplicates".
Could some other setting be causing the compare to fail?
(I am surprised that DC does not list a similarity "%", which would distinguish "exact duplicates" from "similar duplicates". RFE.)
Edit: Looks like I've had that thought before, heh,
RFE: Similar Searches Should Display Similarity %.
Re: Needing Enhanced PDF compare/duplicate finding
Posted: Fri Mar 23, 2018 9:49 am
by DigitalVolcano
It's possible that a small change to a PDF alters a large part of the file's bytes, since PDF is a binary format, which would explain why the search isn't picking it up.
Enhanced PDF reading is on the wish list.
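
To illustrate how a small edit can spread through a binary file, here is a minimal Python sketch. It rests on assumptions: PDF streams are commonly stored Flate-compressed (zlib), and that is only one possible reason a one-character edit would change many bytes on disk. The similarity() helper and the sample text are hypothetical stand-ins, not Duplicate Cleaner's actual comparison.

```python
import zlib
from difflib import SequenceMatcher


def similarity(a: bytes, b: bytes) -> float:
    """Return a 0..1 byte-level similarity ratio (stand-in metric, not DC's)."""
    # autojunk=False stops difflib from ignoring frequently occurring bytes.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


# Hypothetical "page content": a few thousand bytes of varied text.
original = "".join(f"Item {i}: value {i * i % 97}\n" for i in range(200)).encode()
# The "single change": one character in the first line.
edited = original.replace(b"Item 0:", b"Item X:", 1)

# Compress each version, as a Flate-encoded PDF stream would be stored.
packed_original = zlib.compress(original)
packed_edited = zlib.compress(edited)

raw_similarity = similarity(original, edited)
packed_similarity = similarity(packed_original, packed_edited)

print(f"uncompressed: {raw_similarity:.2%}")
print(f"compressed:   {packed_similarity:.2%}")
```

The exact percentages depend on the input. The point is only that the compressed bytes can look much less alike than the text they encode, so a byte-level "similar content" search may miss files that a text-level comparison would match.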