Issues with "Similar" functions

Romy
Posts: 3
Joined: Sun Nov 05, 2017 3:17 am

Post by Romy »

Hello,

I'm quite happy with this fabulous software, which helps me clean up my disks. However, I sometimes run into issues.

I have two songs with the following filenames:
Sunnery James & Ryan Marciano - Tribeca (Original Mix) (ResidentDJ.org).mp3
Sunnery James & Ryan Marciano - Tribeca (Original Mix).mp3
However, "Similar Filenames" does not detect these songs as duplicates. I had to shorten the first filename to "Sunnery James & Ryan Marciano - Tribeca (Original Mix) ().mp3" before they were detected.

By the way, I also noticed that the "Similar Size" function doesn't always work, even for these two files. If they are in the same folder, all is fine, but when I move one of the songs to another folder full of songs, "Similar Size" (with a byte tolerance set) doesn't select them as duplicates.

Could you tell us more about how these "Similar" functions work, or improve them in light of my comments?

Thanks a lot in advance

PS: It also seems that the French translation needs some improvement. I could contribute some translations if needed, but what would the process be?
DigitalVolcano
Site Admin
Posts: 1725
Joined: Thu Jun 09, 2011 10:04 am

Re: Issues with "Similar" functions

Post by DigitalVolcano »

'Similar Filenames' won't detect your first example, as the names aren't similar enough (generally it detects names where just 3 or 4 characters differ). You might have more luck in Audio mode if the files are tagged.
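To make the "3 or 4 characters" point concrete, here is a minimal sketch that counts differing characters with a standard Levenshtein edit distance. This is only an assumption used for illustration: the thread does not say which measure the program actually uses, and `edit_distance` is a hypothetical helper, not part of the software.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character inserts, deletes or substitutions."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


tagged = "Sunnery James & Ryan Marciano - Tribeca (Original Mix) (ResidentDJ.org).mp3"
plain = "Sunnery James & Ryan Marciano - Tribeca (Original Mix).mp3"
shortened = "Sunnery James & Ryan Marciano - Tribeca (Original Mix) ().mp3"

print(edit_distance(tagged, plain))     # 17, well beyond 3 or 4 characters
print(edit_distance(shortened, plain))  # 3, inside that range
```

Under this measure the original pair differs by 17 characters, while the manually shortened name differs by only 3, which is consistent with the shortened file being picked up.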

'Similar Size' doesn't look at the folder name. Are you sure you didn't also have the 'Same folder' option checked? (You can verify this in the log file.)
Romy
Posts: 3
Joined: Sun Nov 05, 2017 3:17 am

Re: Issues with "Similar" functions

Post by Romy »

Thanks for the reply.

It would be helpful to be able to set the similarity threshold between filenames (as a percentage or a number of characters). I think that 3 or 4 characters is not enough, but this is only my opinion.
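As a rough illustration of what a percentage setting could look like, here is a sketch using Python's standard `difflib.SequenceMatcher` as a stand-in similarity measure. The function name `is_similar` and the `threshold` parameter are hypothetical; nothing here reflects how the program is actually implemented.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity in the range 0.0 to 1.0 (1.0 means identical strings)."""
    return SequenceMatcher(None, a, b).ratio()


def is_similar(a: str, b: str, threshold: float) -> bool:
    """Treat two filenames as similar when their ratio reaches the user-chosen threshold."""
    return similarity(a, b) >= threshold


tagged = "Sunnery James & Ryan Marciano - Tribeca (Original Mix) (ResidentDJ.org).mp3"
plain = "Sunnery James & Ryan Marciano - Tribeca (Original Mix).mp3"

print(f"{similarity(tagged, plain):.0%}")
print(is_similar(tagged, plain, threshold=0.85))  # a looser setting accepts the pair
print(is_similar(tagged, plain, threshold=0.95))  # a stricter setting rejects it
```

A percentage scales with filename length, so a long name with an appended site tag can still match, whereas a fixed count of 3 or 4 characters cannot.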

For 'Similar Size', I saw that my issue was not due to the folders but perhaps to a combination of criteria in a specific mode.

Unfortunately, in my example the 2 songs were not tagged identically; they just have the same length, bit rate and sample rate. But this is funny, because:
  • They are detected as duplicates in Regular Mode with a 300000-byte tolerance
  • They are detected as duplicates in Audio Mode with the same length, bit rate and sample rate
  • They are not detected as duplicates in Audio Mode with the same length, bit rate and sample rate + a 300000-byte tolerance (or even more)
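To show why the third result looks odd, here is a small sketch that assumes the criteria are simply combined with a logical AND. That is my assumption, not a description of the program, and the `Track` values below are made-up numbers, not the real files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    size_bytes: int
    length_seconds: int
    bit_rate_kbps: int
    sample_rate_hz: int


def same_audio_properties(a: Track, b: Track) -> bool:
    """Audio-mode style check: length, bit rate and sample rate must all match."""
    return (a.length_seconds, a.bit_rate_kbps, a.sample_rate_hz) == (
        b.length_seconds, b.bit_rate_kbps, b.sample_rate_hz)


def within_size_tolerance(a: Track, b: Track, tolerance_bytes: int) -> bool:
    """Similar-size style check: file sizes differ by at most the tolerance."""
    return abs(a.size_bytes - b.size_bytes) <= tolerance_bytes


def is_duplicate(a: Track, b: Track, tolerance_bytes: int) -> bool:
    """Hypothetical combination: every selected criterion must pass (plain AND)."""
    return same_audio_properties(a, b) and within_size_tolerance(a, b, tolerance_bytes)


# Made-up example values for two copies of the same song.
first = Track(size_bytes=9_450_000, length_seconds=236, bit_rate_kbps=320, sample_rate_hz=44_100)
second = Track(size_bytes=9_440_000, length_seconds=236, bit_rate_kbps=320, sample_rate_hz=44_100)

print(is_duplicate(first, second, tolerance_bytes=300_000))  # True under this model
```

With a plain AND, two checks that each pass on their own should also pass together, so the third case suggests the criteria are combined differently in Audio Mode than this model assumes.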