Feature requests: Re-analyze without re-scan & Custom define "similar filenames"

Post by **wwcanoer** » Wed Feb 24, 2021 2:04 pm

Currently, one must select the scan criteria before the scan. It would be great to be able to re-analyze for duplicates without having to re-scan the drive. (Obviously this only works for data that was included and saved in the first scan.)

For example, there's a trade-off between using MD5/byte-by-byte with or without filename.

If I use it with filename, then I have leftover duplicate copies with (1) or (2) or .bak etc., and duplicates with my sync-program suffix (ex. filename.d20210224.txt).

If I use it without filename, then I get a lot of 0-byte files and small files that intentionally have the same content under different names and should be kept. (Zero-size files can be ignored, but then I would be left with a bunch of remaining files that should be deleted. Plus, the same problem applies to non-zero files too.)

Since my results are too long to review manually, I would like to first run "filename + byte-by-byte" to remove the safest duplicates, then re-analyze with "byte-by-byte" only, which will now give a shorter list that I can check manually. Currently, if I do that, I have to wait for the drive to be scanned again, even though there's no need to, because the only changes were the deletions made in the program.
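A rough Python sketch of the two-pass workflow, just to illustrate the idea. I don't know how the program stores its scan data internally, so `ScanRecord`, `find_duplicates` and the `min_size` option are made-up names, and a stored content hash stands in for the byte-by-byte comparison:

```python
from collections import defaultdict
from dataclasses import dataclass
from pathlib import PurePath


@dataclass(frozen=True)
class ScanRecord:
    """Hypothetical per-file data saved during the one and only drive scan."""
    path: str
    size: int
    content_hash: str  # assumed to be stored at scan time (e.g. an MD5 digest)


def find_duplicates(records, use_filename, min_size=0):
    """Group saved records into duplicate sets without touching the disk."""
    groups = defaultdict(list)
    for rec in records:
        if rec.size < min_size:
            continue  # e.g. skip zero-byte files
        key = (rec.size, rec.content_hash)
        if use_filename:
            key += (PurePath(rec.path).name.lower(),)
        groups[key].append(rec)
    return [group for group in groups.values() if len(group) > 1]


# Saved results of a single scan (made-up sample data).
records = [
    ScanRecord("a/report.txt", 10, "h1"),
    ScanRecord("b/report.txt", 10, "h1"),
    ScanRecord("b/report (1).txt", 10, "h1"),
    ScanRecord("c/empty1.log", 0, "h0"),
    ScanRecord("c/empty2.log", 0, "h0"),
]

# Pass 1: strict criteria (filename + content) catch the safest duplicates.
pass1 = find_duplicates(records, use_filename=True)

# The user deletes b/report.txt inside the program, so only the saved list changes.
records = [rec for rec in records if rec.path != "b/report.txt"]

# Pass 2: content only, on the same saved data, with no second drive scan.
pass2 = find_duplicates(records, use_filename=False, min_size=1)
```

The point is only that the second pass reads the saved records, so the deletions from the first pass are the only thing that has to be updated.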

Also, I've tried "Similar Filenames", but it doesn't recognize my case: my sync program adds a suffix to file versions (ex. "filename.20210224.txt"). So I'd like to be able to define what "similar" means, if that's possible. (Essentially, define a string to ignore, ideally using wildcards, so that the string is stripped out of the filename before it is stored for comparison to other filenames.)
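Something like the following Python sketch is what I have in mind for the "string to ignore". The wildcard syntax here is my own invention for illustration (`*` = any run of characters, `?` = any one character, `#` = one digit), and `wildcard_to_regex`, `normalize` and `group_similar` are hypothetical names, not anything the program currently has:

```python
import re
from collections import defaultdict


def wildcard_to_regex(pattern):
    """Translate the made-up wildcard syntax into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*?")   # any run of characters (non-greedy)
        elif ch == "?":
            parts.append(".")     # any single character
        elif ch == "#":
            parts.append(r"\d")   # a single digit
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def normalize(name, ignore_patterns):
    """Strip every user-defined 'ignore' string from a filename."""
    for pattern in ignore_patterns:
        name = wildcard_to_regex(pattern).sub("", name)
    return name


def group_similar(names, ignore_patterns):
    """Group filenames that become identical once the ignored parts are removed."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name, ignore_patterns)].append(name)
    return dict(groups)


# User-defined strings to ignore: sync-program date suffixes and " (1)"-style copies.
ignore = [".d########", ".########", " (#)"]

names = [
    "filename.txt",
    "filename.d20210224.txt",
    "filename.20210224.txt",
    "filename (1).txt",
    "other.txt",
]
similar = group_similar(names, ignore)
```

With these patterns, all four "filename" variants end up under the same key and "other.txt" stays on its own. Note that in this sketch a pattern matches anywhere in the name, so a short pattern like ".bak" would need some way to be anchored to the end of the filename.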