Audio file selection variations

kbs
Posts: 9
Joined: Thu Jul 30, 2015 9:55 am

Audio file selection variations

Post by kbs »

It appears that selecting both identical and similar matching for audio details and filenames makes the duplicate scan take longer. To select both exact and similar, select the exact option first and then the similar one, which greys out the exact option but leaves it selected.

I have an extensive collection (67000) and the 'similar' dedupe (artist, title, filename) finishes overnight, but the run time grows considerably if both exact and similar are selected. It might be a function of the filename addition - I'm experimenting, but it's a lengthy experiment... I will report back later.

What is the compare algorithm for these - identify exacts, then similars, or filenames first then audio tags, or what? The help is a little ambiguous.

Thanks, Keith
therube
Posts: 615
Joined: Tue Jun 28, 2011 4:38 pm

Re: Audio file selection variations

Post by therube »

You would expect Similar to also include Same, & from the looks of things, as noted, depending on how selected, you could have it either way.

So you might be correct that different algorithms are used, with some duplication of effort in that regard.
(I have not tested to see if that is the actual case.)

IMO, if you know that you have "exact" duplicates (identical file hashes), you should always start with a Regular Mode scan by Same Content, as that will be far faster.

Once that's knocked out, jump over to Audio Mode scan.

And with that, anything you can do to filter, like scanning only specific directory trees rather than all files at once, should help too. Once those are knocked out & there is no easy way to filter further, broaden your scans to pick up on the not so obvious.

Might take more effort on your part, physically setting up & running the scans, but I'd think you'd save much time (& that assumes that your data is in some fashion conducive to filtering; naming, tagging, directory layout...).
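
For anyone curious why an exact-content pass is so cheap, here is a minimal Python sketch of the general idea: group files by size first, then hash only the files that share a size. It is not Duplicate Cleaner's actual code. The names `file_digest` and `exact_duplicates` are made up for illustration, and SHA-256 is an assumption (any content hash would do).

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def exact_duplicates(root):
    """Return lists of paths under root whose contents are byte-identical."""
    # Pass 1: bucket by size. A file with a unique size cannot have a twin,
    # so it never needs to be read at all.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files that share a size with another file.
    groups = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            groups[(size, file_digest(path))].append(path)

    return [sorted(g) for g in groups.values() if len(g) > 1]
```

Pointing `exact_duplicates` at one directory tree at a time mirrors the filtering suggestion above: the fewer files per run, the fewer size collisions there are to hash.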
kbs
Posts: 9
Joined: Thu Jul 30, 2015 9:55 am

Re: Audio file selection variations

Post by kbs »

I didn't come to any conclusion about scan times from my testing - it might just have depended on which way the wind was blowing at the time! The main problem with audio files, though, is that the tag info can change, which alters the file hash, so files that are identical byte for byte can be rare. The option to compare audio only is a boon which resolves this!
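
To illustrate why a tag edit changes the whole-file hash while the audio itself stays the same, here is a rough Python sketch that hashes an MP3 with the ID3v2 header block and the ID3v1 trailer skipped. It is only an assumption about how an audio-only compare could work, not a description of what the program does internally, and the function names are hypothetical. It handles ID3 tags only, not APE tags or other container formats.

```python
import hashlib


def synchsafe_to_int(b):
    """Decode the 4-byte synchsafe size used by ID3v2 (7 bits per byte)."""
    return ((b[0] & 0x7F) << 21) | ((b[1] & 0x7F) << 14) | \
           ((b[2] & 0x7F) << 7) | (b[3] & 0x7F)


def audio_payload(data):
    """Return the bytes between a leading ID3v2 tag and a trailing ID3v1 tag."""
    start, end = 0, len(data)

    # ID3v2: "ID3", 2 version bytes, 1 flags byte, 4-byte synchsafe size.
    if len(data) >= 10 and data[:3] == b"ID3":
        start = 10 + synchsafe_to_int(data[6:10])
        if data[5] & 0x10:  # footer flag (ID3v2.4) adds 10 more bytes
            start += 10

    # ID3v1: fixed 128-byte block at the end of the file, starting with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128

    return data[start:end]


def audio_digest(data):
    """Hash only the audio payload, ignoring ID3 tag blocks."""
    return hashlib.sha256(audio_payload(data)).hexdigest()
```

Two copies of a track with different tags give different whole-file hashes but the same `audio_digest`, which is the situation described above.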

One minor niggle I've just tripped over - names and titles with 'and' or '&' don't necessarily flag as dupe/similar when comparing audio tag/filename info - it might be that '&' is treated as a special case.
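
As a workaround idea while that is looked at, tags or filenames could be normalised before comparing so that '&' and 'and' come out the same. A small Python sketch follows; `normalize_title` is a made-up helper, and the rules it applies (lower-casing, '&' to 'and', punctuation and underscores to spaces) are my own assumptions rather than anything the program is known to do.

```python
import re


def normalize_title(text):
    """Lower-case a title and treat '&' as the word 'and'."""
    text = text.lower().replace("&", " and ")
    # Collapse punctuation, underscores and runs of whitespace to one space.
    text = re.sub(r"[\W_]+", " ", text)
    return " ".join(text.split())


print(normalize_title("Simon & Garfunkel") == normalize_title("Simon and Garfunkel"))
```

The same trick would work on a copy of the tag data exported elsewhere; it does not change anything inside the scanner itself.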

Regards, Keith
DigitalVolcano
Site Admin
Posts: 1727
Joined: Thu Jun 09, 2011 10:04 am

Re: Audio file selection variations

Post by DigitalVolcano »

This is a minor bug - if 'Same title' and 'Similar title' are both checked (even though 'Same title' is disabled), then both checks will be done, slowing the scan a little.

Will be fixed in v3.2.7!
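
For readers wondering why running both checks is redundant, here is an illustrative Python sketch. It is not the program's source: the function names, the 0.8 threshold and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for the example. The point is only that an identical pair always passes a similarity test, so the exact test adds nothing once 'similar' is on.

```python
from difflib import SequenceMatcher


def same(a, b):
    """Exact title match."""
    return a == b


def similar(a, b, threshold=0.8):
    """Fuzzy title match; identical strings score 1.0 and always pass."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def titles_match(a, b, check_same, check_similar):
    """Run one comparison per pair, even when both boxes are ticked."""
    if check_similar:
        # 'Similar' already covers 'Same', so the exact test is skipped.
        return similar(a, b)
    return check_same and same(a, b)
```

With both options ticked, `titles_match` does one comparison per pair instead of two.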