DigitalVolcano Software Support

Posted: **Sun Dec 17, 2017 3:17 pm**

Hi,

Sometimes I need to find files that are similar but not identical, and none of the criteria currently available in Duplicate Cleaner can help.

Therefore I suggest adding two new criteria.

1) Similar size: let the user set a tolerance for sizes. For example, with a 5% tolerance, a 1,000-byte file can match files from 950 to 1,050 bytes. Of course, this is mutually exclusive with the same-size criterion.

2) Similar name: this is trickier. You could use a similarity algorithm (fuzzy search), like this https://en.wikipedia.org/wiki/Approxima ... g_matching, and let the user decide how similar names must be to be considered identical. This is mutually exclusive with the same-name criterion.
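
A minimal sketch of the size criterion in Python. The function name `sizes_match` and the `tolerance_percent` parameter are hypothetical and only illustrate the idea; this is not how Duplicate Cleaner works internally. The tolerance is taken relative to the first file, and integer arithmetic is used to avoid rounding surprises at the boundaries.

```python
def sizes_match(reference_size: int, other_size: int, tolerance_percent: int = 5) -> bool:
    """Return True if other_size is within +/- tolerance_percent of reference_size.

    Hypothetical helper: the tolerance is measured against reference_size,
    so a 1000-byte reference with 5% tolerance accepts 950..1050 bytes.
    """
    # Compare |difference| * 100 against percent * reference to stay in integers.
    return abs(reference_size - other_size) * 100 <= tolerance_percent * reference_size


if __name__ == "__main__":
    for candidate in (949, 950, 1050, 1051):
        print(candidate, sizes_match(1000, candidate))
```

Because the tolerance is measured against the first argument, the check is not perfectly symmetric; a real implementation would have to decide which file is the reference, or use the larger of the two sizes.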

The actual use case is having tons of files created with different zip compression levels (so slightly different sizes) and with different naming conventions.

Let's say that I have:
01-kittens.zip 1,000 bytes
1-Kittens.zip 998 bytes

They appear different, but in my context I should consider them equal, so I would use a similar size tolerance of 3% and a suitable similarity index (it depends on the actual algorithm you would implement for fuzzy search) in order to find this "duplicate".
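
To make the name criterion concrete, here is a rough sketch using `difflib.SequenceMatcher` from the Python standard library. This is only one possible similarity measure, chosen for illustration; it is not necessarily the algorithm from the article linked above, nor anything Duplicate Cleaner implements. The function names and the 0.8 threshold are assumptions.

```python
from difflib import SequenceMatcher


def name_similarity(name_a: str, name_b: str) -> float:
    """Return a similarity score between 0.0 and 1.0 for two file names.

    Hypothetical helper: names are lowercased first so that case
    differences (kittens vs Kittens) do not count against the match.
    """
    return SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()


def names_match(name_a: str, name_b: str, threshold: float = 0.8) -> bool:
    """Treat two names as 'the same' when their score reaches the threshold."""
    return name_similarity(name_a, name_b) >= threshold


if __name__ == "__main__":
    print(name_similarity("01-kittens.zip", "1-Kittens.zip"))
    print(names_match("01-kittens.zip", "1-Kittens.zip"))
    print(names_match("01-kittens.zip", "holiday-photos.zip"))
```

With this measure the two kitten archives score well above 0.9, so a fairly strict threshold would still pair them. The right threshold depends on the algorithm and on how noisy the naming conventions are.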

Of course, it is up to the user to apply these criteria responsibly and combine them with others in order to actually find duplicates, but I think they would really help make this great program even better.

Regards

Luca

Posted: **Mon Dec 18, 2017 10:16 am**

I think the functions you are looking for are both already in Duplicate Cleaner Pro:
https://www.duplicatecleaner.com/manual ... =&sct=MA==

New duplicate criteria
