"No duplicates found" -- but many duplicates are "unique"?

ikjadoon
Posts: 2
Joined: Sat May 12, 2018 1:57 am

Post by ikjadoon » Sat May 12, 2018 2:19 am

I've either gone mad/stupid or there's a bug here.

Goal: I have two folders, "English Learning Materials" and "English Learning Materials 1". They used to be a direct copy of each other (made via Windows Explorer many moons ago). But since then, I've been editing in either/both so now I want to delete all the duplicates.

Issue: Duplicate Cleaner Pro (just bought it today and have been poring over the manual to no avail) has found no duplicates. The files are all listed as "unique" even though they have the same file name & extension, the same modified date, and even the same folder structure.

Does anyone see my obviously boneheaded mistake?

Image

Image

Image

Image
^^Why are all these being marked as unique? These are definitely duplicates.

Settings: no Master, as I want all files inspected; Scan Against Self is off, since I only want matches between the two folders; Find Uniques is on (e.g., for a Word doc I only created in "English Learning Materials 1"); and yes to subfolders.

What I've tried so far:
  • Closing and re-opening the program
  • Making a new search
  • Resetting all settings/data via "More Options"
  • Restarting my computer
  • Checking "Zip Files" and Unchecking "Don't Follow NTFS Mountpoints..."
  • Ensuring the status is "Included" for both folders
  • Unchecking "Protect important system folders" (though this is definitely an external drive without Windows installed)
  • Ensuring the different Created dates aren't causing issues: "Same content: This option will find files with exactly the same content inside, regardless of name or date. Duplicate files matched with this technique will always be the same size."
  • Uninstalling and re-installing the program
  • Disconnecting and re-connecting my drive
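
For reference, the "Same content" idea quoted above can be checked independently of the program. The following is a minimal sketch, not Duplicate Cleaner's actual implementation: it groups files by size first and then by SHA-256 hash, ignoring names and dates. The function names `file_digest` and `same_content_groups` are made up for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_content_groups(roots: list[Path]) -> list[list[Path]]:
    """Group files under the given roots that have identical content."""
    # Step 1: bucket by size, since files of different sizes cannot match.
    by_size: dict[int, list[Path]] = defaultdict(list)
    for root in roots:
        for p in root.rglob("*"):
            if p.is_file():
                by_size[p.stat().st_size].append(p)

    # Step 2: within each size bucket, bucket again by content hash.
    groups: list[list[Path]] = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a size seen only once has no possible match
        by_hash: dict[str, list[Path]] = defaultdict(list)
        for p in paths:
            by_hash[file_digest(p)].append(p)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Calling `same_content_groups([Path("English Learning Materials"), Path("English Learning Materials 1")])` would list every set of identical files across (and within) the two folders, which gives a second opinion on whether duplicates really exist.
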
System details:

Windows 10 Professional, x64 (1803--build 17134.48)
Duplicate Cleaner Pro (registered/paid version) 4.1.0
Drive is a 4TB Seagate Backup Plus, formatted to exFAT, and connected directly to the motherboard

Let me know if you have any questions. I really must be insane.
therube
Posts: 493
Joined: Tue Jun 28, 2011 4:38 pm

Re: "No duplicates found" -- but many duplicates are "unique

Post by therube » Sat May 12, 2018 12:07 pm

In 'Scan location', disable 'Find uniques'.


And the reason, I'm pretty sure, is going to be:
Note:
If the unique folder is also set to 'Don't scan against self' then all files which do not have duplicates in other folders will be listed in this tab - even if they have duplicates within their own folder. This is useful for drive comparison situations.
(Can't say I've used the Unique feature & would really have to think about it to have a good feel for what's going on? Though I think, with the particular settings you've used, Unique ends up subverting the Duplicate (Same content) selection you're actually wanting.)
DigitalVolcano
Site Admin
Posts: 1310
Joined: Thu Jun 09, 2011 10:04 am

Re: "No duplicates found" -- but many duplicates are "unique

Post by DigitalVolcano » Sat May 12, 2018 4:34 pm

Yes, that's it.

If a drive/folder is set to 'Don't scan against self' and 'find uniques' then any files that aren't duplicated on other drives are listed in the unique tab. This behavior is very useful for drive comparison.

It's mentioned in the comparison tutorial (around 1:30)
https://www.youtube.com/watch?v=lbYFB5w-4nM&t=2s

Try it with the unique flag turned off for both folders.

It looks like you don't have any duplicates between the two folders, else it would have shown in the duplicates tab?
ikjadoon
Posts: 2
Joined: Sat May 12, 2018 1:57 am

Re: "No duplicates found" -- but many duplicates are "unique

Post by ikjadoon » Sun May 13, 2018 3:36 am

Thank you both for the replies.

There are actually many duplicates! After running DCP a few times and manually combining the results, I learned that there are 2,821 duplicates and 6 unique files.

The drive comparison feature is exactly what I want: compare two drives to see exactly what differs between them. What's on drive A only, on drive B only, and on both A & B (duplicates). This is what's happening right now with DCP:

Unique Checked on nothing: 2,821 duplicates found
Unique Checked on Drive A: 2,821 duplicates found + 4 unique files on drive A
Unique Checked on Drive B: 2,821 duplicates found + 2 unique files on drive B
Unique Checked on Drive A & B: no duplicates found + 2,287 unique files

I thought by checking Unique on both A & B, it would yield duplicates + unique files on drive A + unique files on drive B. Right? To compare the two drives? Am I misunderstanding the drive comparison feature? The video shows unique files on only one drive--I'd like to see unique files on both drives (I thought it might be separated by a divider like "Unique in Drive A | Unique in Drive B").
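
To make the expected "two-way" result concrete, here is a rough sketch of the comparison described above, written as a standalone script rather than anything the program does internally. It hashes every file in both folders and splits them into "only in A", "only in B", and "in both". The names `content_index` and `compare_two_way` are hypothetical, and files are read fully into memory for brevity.

```python
import hashlib
from pathlib import Path


def content_index(root: Path) -> dict[str, list[Path]]:
    """Map each content hash to the files under root with that content."""
    index: dict[str, list[Path]] = {}
    for p in sorted(root.rglob("*")):
        if p.is_file():
            # read_bytes() loads the whole file; fine for a sketch.
            digest = hashlib.sha256(p.read_bytes()).hexdigest()
            index.setdefault(digest, []).append(p)
    return index


def compare_two_way(folder_a: Path, folder_b: Path):
    """Return (only_in_a, only_in_b, in_both), judged by file content."""
    a = content_index(folder_a)
    b = content_index(folder_b)
    only_in_a = sorted(p for h in a.keys() - b.keys() for p in a[h])
    only_in_b = sorted(p for h in b.keys() - a.keys() for p in b[h])
    # Each entry pairs the matching files in A with those in B.
    in_both = [(a[h], b[h]) for h in a.keys() & b.keys()]
    return only_in_a, only_in_b, in_both
```

A single call returns all three lists at once, which is the one-pass result I was hoping the drive comparison would give.
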

>If a drive/folder is set to 'Don't scan against self' and 'find uniques' then any files that aren't duplicated on other drives are listed in the unique tab. This behavior is very useful for drive comparison.

Ah, so is this a bug? With 'Don't scan against self' + 'Find uniques' set on both folders, the unique tab actually lists files that are duplicates. See the previous screenshot: how are these files "unique"?

Image

These files are duplicated on the other folder, but are called "unique".

The workaround is to set just one folder as unique and run a scan, then repeat with each of the other folders set as unique in turn (resetting the rest), and finally combine all the unique files and arrange them by folder. I'd hoped DCP could do it all in one go, i.e., the drive comparison feature. It's more like a "one-way drive comparison feature", but I was expecting "two-way": unique files, on either drive, found in a single scan.

EDIT: see here all the duplicates that are found when unique is not checked on either drive.

Image
DigitalVolcano
Site Admin
Posts: 1310
Joined: Thu Jun 09, 2011 10:04 am

Re: "No duplicates found" -- but many duplicates are "unique

Post by DigitalVolcano » Tue May 15, 2018 11:54 am

Thanks for all the detailed info!

It's not a bug - it is working as designed, though I agree there are issues with the terminology used in the program. If you replace 'Unique' with 'Remaining' or 'left over' it starts to make more sense. This is something I hope to address in a future update, and is partly down to the way the program evolved.

It is by design that duplicate files can end up in the unique tab (if set to 'don't scan against self') because in a comparison situation you'd want to see all remaining files missing when comparing two folders - even if they happened to be duplicate only within one folder. Essentially 'Don't scan against self' means treating every file in that folder as unique until matched against another file in a 'Scan against self' and 'Not unique' folder.
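
One way to picture this is the toy model below. It is a simplification inferred from the scan results reported earlier in this thread, not the program's actual code: every folder is assumed to be set to 'Don't scan against self', a file only counts as matched if its content appears in a different folder that is not flagged 'Find uniques', and anything unmatched in a flagged folder lands in the unique list. The `scan` function and its arguments are invented for illustration.

```python
def scan(folders: dict[str, dict[str, str]], find_uniques: set[str]):
    """Toy model of the duplicate/unique tabs.

    folders maps folder name -> {file name: content}.
    find_uniques holds the names of folders flagged 'Find uniques'.
    Returns (duplicates, uniques) as sorted lists of (folder, file) pairs.
    """
    duplicates: set[tuple[str, str]] = set()
    uniques: set[tuple[str, str]] = set()
    for name, files in folders.items():
        for fname, content in files.items():
            # Assumption: only OTHER folders that are NOT flagged
            # 'Find uniques' can supply a match.
            matches = [
                (other, ofname)
                for other, ofiles in folders.items()
                if other != name and other not in find_uniques
                for ofname, ocontent in ofiles.items()
                if ocontent == content
            ]
            if matches:
                duplicates.add((name, fname))
                duplicates.update(matches)
            elif name in find_uniques:
                # Left over: never matched, so it is listed as "unique".
                uniques.add((name, fname))
    return sorted(duplicates), sorted(uniques)


folders = {
    "A": {"doc1.docx": "same", "only_a.docx": "aaa"},
    "B": {"doc1.docx": "same", "only_b.docx": "bbb"},
}
print(scan(folders, set()))        # duplicates only
print(scan(folders, {"A"}))        # duplicates + leftovers in A
print(scan(folders, {"A", "B"}))   # no duplicates, everything "unique"
```

With both folders flagged, no folder is left to match against, so every file falls through to the unique list. That mirrors the "no duplicates found" result from the first post.
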

So you are correct, currently you can't do a two-way drive comparison at once - it has to be done in multiple scans.

This is something we hope to address, along with the confusing terminology. Hope this makes sense! Feel free to throw any suggestions this way.
Shane
Posts: 10
Joined: Mon Nov 02, 2015 3:43 pm

Re: "No duplicates found" -- but many duplicates are "unique

Post by Shane » Sun May 20, 2018 7:31 am

I've just run into this problem (again) too. It is indeed very confusing, and it gets worse as the number of folder trees you're trying to compare (and thus the number of scans) increases.

I would suggest eliminating the awkward interaction (between duplicates and uniques) by replacing "Scan against self: Yes / No" and "Find uniques: Yes / No" with the following...

"Find duplicates: No / Internal / External / Anywhere" and "Find uniques: No / Internal / External / Anywhere".

Use case examples:
  • Find all instances of duplicate and unique files (the OP's task) = set all locations to "Find duplicates: Anywhere" and "Find uniques: Anywhere"
    (if I've misunderstood the OP and they don't want to find duplicates within each of the trees, only between trees, then "Find duplicates: External")
  • Find all files in location X that are duplicated in other locations = set "Find duplicates" for X to "External" and for all others to "No"
  • Find all files in locations X and Y that are duplicated in other locations and anywhere respectively = set "Find duplicates" for X to "External", for Y to "Anywhere" and all others to "No".
  • Find all files that are unique to location X = set X to "Find uniques: Anywhere" and all others to "Find Uniques: No".
Note that using "No" is NOT the same as excluding a location from being compared in the first place or (un)marking afterwards; it instead enables setting "directions" for comparison that allows further granularity.
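
As a rough illustration of the "Find duplicates" half of this proposal, here is a small sketch. It reflects one reading of the four values and is purely hypothetical: nothing like it exists in the program. "Internal" is taken to mean a match inside the same location, "External" a match in any other location, and "Anywhere" either of the two. A location set to "No" reports nothing itself but still serves as a comparison target. The names `Scope` and `find_duplicates` are made up.

```python
from enum import Enum


class Scope(Enum):
    NO = "No"
    INTERNAL = "Internal"
    EXTERNAL = "External"
    ANYWHERE = "Anywhere"


def find_duplicates(locations: dict[str, dict[str, str]],
                    scopes: dict[str, Scope]) -> list[tuple[str, str]]:
    """locations maps name -> {file: content}; scopes maps name -> Scope.

    Returns the (location, file) pairs that would be reported.
    """
    reported = []
    for name, files in locations.items():
        scope = scopes.get(name, Scope.NO)
        if scope is Scope.NO:
            continue  # not reported, but still compared against below
        for fname, content in files.items():
            internal = any(
                c == content for f, c in files.items() if f != fname
            )
            external = any(
                c == content
                for other, ofiles in locations.items()
                if other != name
                for c in ofiles.values()
            )
            if ((scope is Scope.INTERNAL and internal)
                    or (scope is Scope.EXTERNAL and external)
                    or (scope is Scope.ANYWHERE and (internal or external))):
                reported.append((name, fname))
    return sorted(reported)
```

For the second use case above, `find_duplicates(locations, {"X": Scope.EXTERNAL})` would report only the files in X that also exist elsewhere, while the other locations act as silent comparison targets.
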
DigitalVolcano
Site Admin
Posts: 1310
Joined: Thu Jun 09, 2011 10:04 am

Re: "No duplicates found" -- but many duplicates are "unique

Post by DigitalVolcano » Wed May 23, 2018 12:54 pm

Thanks for the suggestions Shane. Some good ideas.
We'll be revisiting the whole unique/scan self area in a future update (possibly 4.1 or 5.0).