
Request: A way to scan duplicates according to specific metadata

Posted: Fri Jan 17, 2025 2:04 pm
by Soeroah
Hi,

I'm not entirely sure how feasible this is, but as someone with a lot of files across a few folders who routinely scans to check for duplicates, I'd love a way to be able to sort of filter out a lot of files after the "reading metadata" stage before the "matching duplicates" stage, to cut down on how long the total search time takes. Sometimes I just want to run a duplicate scan against a few thousand files instead of the whole folder.

For example, assume I have a particular string in the Comments field of files, or I just want to scan any files that have *anything* in the Authors field. In the Scan Location tab, I'd like to be able to set up Keywords: 'SpecificKeyword' or Authors: 'SpecificKeyword' or even just 'Yes'. On running the scan, the software would check the folders, ignore any files that lack those specific keywords or that Author name (or that have a blank Author field), and then continue the scan from that point, instead of scanning for duplicates against all items in the folders.

I'm not quite sure if I'm making that clear enough, but it would save a lot of time in "I'll just check I don't already have this file" type scans: if I know what metadata I would add to the file, I could quickly scan against a smaller virtual list of files with that metadata. At the moment I keep an entire second folder containing only the files I expect to run into duplicates of more often, in an attempt to make scans faster, but if the program had a built-in 'filter out files that don't have these metrics before running the matching duplicates subroutine' option, that would save me both time and storage space. For example, it just took me half an hour to do a duplicate scan checking a dozen files against my smaller storage folder, when being able to say "scan against the master folder for any files with X in the Authors field" probably would have ended the scan after five minutes.

I did a bit of a bad mock-up to try to illustrate what I'm thinking. Again, I don't know how viable this would be as a suggestion, but I figured I'd try since you added the Keywords (Sorted) option, which was very helpful.
1.jpg

Re: Request: A way to scan duplicates according to specific metadata

Posted: Fri Jan 31, 2025 1:57 pm
by MegMac
What you want can be achieved, but not the way you are thinking.

In Scan Criteria, include any tags you want to access and set them to Display only.
Then on the Duplicate Files tab, use the 'Mark by Text pattern' section of the Selection Assistant.

Select the tag you want to search from the drop down list.

Check the "Use Regular Expressions" box and enter a period (.) in the text field.

Click Mark.

Any marked files will have something in that field (though sometimes it's just a space, so it looks like there's nothing).
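
To illustrate why a lone period marks fields that look empty, here is a minimal sketch using Python's `re` module. The application's own regular-expression flavor is not stated in this thread, so treat this as an analogy only; the file names and field values are made up for the example. In most regex dialects `.` matches any single character, including a space, while `\S` matches only non-whitespace characters.

```python
import re

# Hypothetical field values keyed by file name (not from the application).
fields = {"a.jpg": "Soeroah", "b.jpg": " ", "c.jpg": ""}

any_char = re.compile(r".")    # any single character, spaces included
non_space = re.compile(r"\S")  # any single non-whitespace character

# Files whose field contains at least one character of any kind.
marked = [name for name, value in fields.items() if any_char.search(value)]

# Files whose field contains at least one visible character.
visible = [name for name, value in fields.items() if non_space.search(value)]
```

Here `marked` includes `b.jpg` even though its field holds only a space, while `visible` does not. Whether the Selection Assistant accepts `\S` is an assumption that would need checking in the application itself.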

OR

Use 'Find in List' (click the magnifying glass icon at left) to look for specific content in a field.

Re: Request: A way to scan duplicates according to specific metadata

Posted: Sun Mar 09, 2025 3:28 am
by Soeroah
Hmmm.. Thank you, but that seems to be more about filtering the results after a scan has been completed. What I'm hoping to do is make scans faster by telling the program to ignore files that don't contain X metadata during an earlier stage of the scan, before it gets to properly matching duplicates.

Sort of like: scan all the files' metadata, ignore several tens of thousands of files because they don't contain a specific piece of metadata, and just match duplicates within the remaining files, thus saving scan time during the matching duplicates phase.
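
The requested order of operations (read metadata, drop non-matching files, then match duplicates only among what remains) can be sketched roughly as follows. This is not how the application works internally; `FileRecord`, `has_metadata`, and `find_duplicates` are hypothetical names invented for the illustration, file contents are held in memory to keep the example self-contained, and duplicates are detected by a simple SHA-256 content hash.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """Hypothetical stand-in for a scanned file and its metadata."""
    path: str
    content: bytes
    metadata: dict = field(default_factory=dict)


def has_metadata(record, key, needle=None):
    """True if the field is non-blank and, if given, contains needle."""
    value = record.metadata.get(key, "").strip()
    if not value:
        return False
    return needle is None or needle.lower() in value.lower()


def find_duplicates(records, key, needle=None):
    """Filter by metadata first, then group the survivors by content hash."""
    # Pre-filter stage: files without the metadata never reach matching.
    candidates = [r for r in records if has_metadata(r, key, needle)]

    # Matching stage: only the reduced candidate list is hashed.
    groups = defaultdict(list)
    for record in candidates:
        digest = hashlib.sha256(record.content).hexdigest()
        groups[digest].append(record.path)

    return [sorted(paths) for paths in groups.values() if len(paths) > 1]


records = [
    FileRecord("a.jpg", b"same", {"Authors": "Soeroah"}),
    FileRecord("b.jpg", b"same", {"Authors": "soeroah"}),
    FileRecord("c.jpg", b"same", {}),  # blank Authors: skipped
    FileRecord("d.jpg", b"other", {"Authors": "Soeroah"}),
]
duplicates = find_duplicates(records, "Authors")
```

Passing no `needle` corresponds to the "anything in the Authors field" case, and passing a string corresponds to the 'SpecificKeyword' case. The saving comes from the matching stage only ever seeing the filtered list.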

Re: Request: A way to scan duplicates according to specific metadata

Posted: Sun Mar 16, 2025 9:41 pm
by MegMac
Soeroah,
I understand your goal. That is not currently possible, but I would like that feature as well. I was just trying to help you achieve your goal with the application as it is now.