Suggestions

Mulvaney
Posts: 4
Joined: Sun Apr 06, 2014 5:05 pm

Suggestions

Post by Mulvaney »

Hi

I'm a happy DC Pro customer with a few suggestions (I've checked the first 5 pages of the forums, so hopefully they're unique?)

******

1. RESET AND RUN
Can we have a "reset and run DC" icon? I've had a 500MB duplicatecleanerpro.data file that, even after crashing out, caused the program not to run, until I found your hint on the forum:
"You can reset Duplicate Cleaner (and the cached information) by renaming/deleting the duplicatecleanerpro.data file in your C:\Users\[**Your Username**]\AppData\Roaming\DigitalVolcano\DuplicateCleaner\ folder."

The icon could perhaps remove it for you before running DC, or run DC in "safe mode" to allow you to get to a button to clear the cache (is there one by the way)?
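Until something like that exists, the manual rename from the quoted hint could be scripted. This is only a rough sketch, not anything shipped with DC: the function name `reset_cache` and the `.bak` naming are made up for illustration, and the only details taken from the hint are the file name and the idea of renaming it.

```python
import time
from pathlib import Path


def reset_cache(data_dir, filename="duplicatecleanerpro.data"):
    """Rename the cache file so the program starts without it.

    Returns the path of the renamed backup, or None if no cache file was found.
    `reset_cache` is a hypothetical helper, not part of Duplicate Cleaner.
    """
    cache = Path(data_dir) / filename
    if not cache.exists():
        return None
    # Rename rather than delete, so the old data can be put back if needed.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = cache.with_name(f"{filename}.{stamp}.bak")
    cache.rename(backup)
    return backup
```

On Windows, `data_dir` would be the DigitalVolcano\DuplicateCleaner folder under the roaming AppData folder from the hint, and the script would need to run while DC is closed.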

***

2. SEED LOCATION
Can we have a "seed location" selector, whereby you can check for duplicates of one location (a file / folder / drive) against many, i.e.

FIRST (SEED) LOCATION: d:\pictures (which contains photo.jpg and snapshot.jpg)
OTHER SCAN LOCATIONS: c: / e: / f:

This won't check to see if c: / e: / f: also have any duplicate files, it will only check to see if photo.jpg and/or snapshot.jpg exists on c: / e: / f:

The "Don't scan against self" is an excellent feature which goes some way towards this, but you'd have to do them in pairs (put the files to check in a folder on their own, and scan against one other place or drive, with "Don't scan against self" enabled). Then repeat that for every drive or location. Otherwise it looks to see if there are *any* files in location c:, e: or f: that match any other files.

I appreciate that I could put in photo.jpg and snapshot.jpg as the filenames, but I'm looking for files with the same hash, but a different filename, so that wouldn't work in this case.
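To make the idea concrete, here is a minimal sketch of a one-way "seed" comparison by content hash. It is not how DC works internally; `find_seed_duplicates` and the other names are invented for the example, SHA-256 is just one reasonable choice of hash, and it assumes the seed location is not inside any of the other locations.

```python
import hashlib
import os


def file_hash(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def walk_files(root):
    """Yield every file path under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def find_seed_duplicates(seed, other_roots):
    """Map each seed file to the files elsewhere that have identical content.

    Files in other_roots are only ever compared with seed files,
    never with each other. File names are ignored.
    """
    seed_files = [seed] if os.path.isfile(seed) else list(walk_files(seed))
    seed_sizes = {os.path.getsize(p) for p in seed_files}
    seed_by_hash = {}
    for path in seed_files:
        seed_by_hash.setdefault(file_hash(path), []).append(path)

    matches = {}
    for root in other_roots:
        for path in walk_files(root):
            try:
                # A file with a size no seed file has cannot match, so skip hashing it.
                if os.path.getsize(path) not in seed_sizes:
                    continue
                digest = file_hash(path)
            except OSError:
                continue  # unreadable file: skip it
            for seed_path in seed_by_hash.get(digest, []):
                matches.setdefault(seed_path, []).append(path)
    return matches
```

With the example above, `find_seed_duplicates(r"d:\pictures", ["c:\\", "e:\\", "f:\\"])` would list any copies of photo.jpg and snapshot.jpg on those drives, whatever they are called, without reporting files that are only duplicated among c:, e: and f: themselves.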

***

3. EXCLUDE INCLUDED PATH (!)
It might seem counter-intuitive, but I'd like to be able to add a search path that is a sub-folder of another search path, and have it excluded from that parent path, rather than getting the error "<path name> is included, Removing it from the search list". This is for situations like suggestion 2 above, where "Don't scan against self" lets you "pair up" one folder against another.
e.g. I'd like to see if e:\music\2014\ (containing track1.m4a and track2.m4a) is duplicated anywhere else on e: - without having to move the 2014 folder to the root and comparing it (one folder at a time) to every other folder in the root.

example folder structure:
e:\music\2014
e:\checkthese
e:\2013
e:\ (various files in the root)

because the e:\music\2014 folder is included in the e:\ tree, it won't let you compare it with e:

I'd have to move the "2014" folder to the root, then compare it with "checkthese", then start again and compare it with "2013", then start again and compare it with "music", then start again and move the loose files from the root to a new folder for comparing.

Otherwise, I suppose I could move everything in the root ("music" / "checkthese" / "2013" / the loose root files) to a "compare" folder and the "2014" folder to the root, so there are only two folders in the root, but with many thousands of files and other programs constantly relying on the folder structure remaining permanently intact, that would be a pain.
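In scripting terms, what I'm after is a walk of the parent path that simply skips the excluded sub-folder, so nothing has to be moved. A minimal sketch of that idea follows; `walk_excluding` is a made-up name for illustration, and nothing here reflects how DC actually scans.

```python
import os


def walk_excluding(root, excluded):
    """Yield every file under root, skipping the excluded sub-folders entirely."""
    skip = {os.path.normcase(os.path.abspath(p)) for p in excluded}
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into them.
        dirnames[:] = [
            d for d in dirnames
            if os.path.normcase(os.path.abspath(os.path.join(dirpath, d))) not in skip
        ]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

For the folder structure above, `walk_excluding("e:\\", [r"e:\music\2014"])` would give the files in "checkthese", "2013", the rest of "music" and the loose root files, which could then be compared against the contents of e:\music\2014 on its own.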

***

4. IGNORE METADATA?
I don't know how the metadata in picture (exif?) and audio (id3? not sure what m4a has?) files works as such, but I know that when adding comments to a jpg or changing something in m4a metadata, it actually alters the file, making it different to the original.

I have many photos that at least seem to be 100% identical, pixel for pixel, same resolution, bit rate, everything, and yet they're different - I assume because of the metadata. Is the metadata contained in a specific, fixed offset / first few bytes etc in the file and therefore separate from the actual photo / audio file? If so, can that metadata be ignored by DC Pro, and the actual audio data / picture data compared, so that regardless of comments and things added, you will know if a file is identical in every other sense? The 99% option is great (as are the rotated and other options for jpg) but they'll also falsely identify "burst mode" snapshots that change very little from one shot to the next.
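For JPEG at least, the metadata is not at a fixed offset: Exif data and comments are stored in their own marker segments (APPn and COM) that sit before the compressed picture data, so they can be skipped when hashing. Below is a minimal sketch of that, not DC's method. `jpeg_content_hash` is a made-up name, and treating every APPn segment as metadata is an assumption, because some of them (colour profiles, for instance) can affect how the picture is displayed. It would also only match files whose picture data was left untouched, not ones re-saved by an editor.

```python
import hashlib
import struct

SOS = 0xDA  # start of scan: the compressed picture data follows
COM = 0xFE  # comment segment


def jpeg_content_hash(data):
    """Hash a JPEG's bytes while skipping APPn (Exif etc.) and COM segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    h = hashlib.sha256()
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte in front of a marker
            pos += 1
            continue
        if marker == SOS:
            # Everything from the first scan to the end of the file is picture data.
            h.update(data[pos:])
            return h.hexdigest()
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        is_metadata = 0xE0 <= marker <= 0xEF or marker == COM
        if not is_metadata:
            h.update(data[pos:pos + 2 + length])
        pos += 2 + length
    raise ValueError("no SOS marker found")
```

Two files would count as the same picture if `jpeg_content_hash(open(a, "rb").read())` equals the same call for `b`, even when their comments or Exif fields differ.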

******

MANY thanks for your time! It (as well as this fantastic program) is hugely appreciated; you've saved me many hours (and GB!) over the years!

:o)
DigitalVolcano
Site Admin
Posts: 1717
Joined: Thu Jun 09, 2011 10:04 am

Re: Suggestions

Post by DigitalVolcano »

Thanks for your suggestions - I've added them to the database.

1. RESET AND RUN
Good idea. A /RESET command line parameter could be added, with a shortcut in the installation folder.

There is a button to clear the image cache. If you want to clear the last scan upon exiting, you can uncheck the 'show results of last scan on start' option.


2. SEED LOCATION
good idea - need to find a nice way to do this that will be obvious to the user

(EDIT - suggested here - similar scenario?)
viewtopic.php?f=4&t=1172

3.
Will look into this.

4. This is already on the 'to-do' list for Image mode. You can already do this in audio mode (Compare Audio data only) - it relies on the audio data matching exactly, though (same bit rate, encoding, etc.).

Thanks!
Mulvaney
Posts: 4
Joined: Sun Apr 06, 2014 5:05 pm

Re: Suggestions

Post by Mulvaney »

Hi

Thanks for your reply - is it possible to subscribe to posts in the forum to be notified of replies, as I didn't receive an email, I just happened to pop by, soon after you posted?

1. RESET AND RUN
I guess unchecking "show results of last scan on start" would work (does that actually do it on exit rather than startup though, i.e. would it work if I'd had to crash out?) but the shortcut would be better, as it's often useful to have the last results displayed and I'd be inclined to leave that set.

2. SEED LOCATION
Yep that's the same general idea.

3.
This wouldn't be necessary if seed location was implemented, although it would have to be able to be a sub location of other locations.

4. Nice!
DigitalVolcano
Site Admin
Posts: 1717
Joined: Thu Jun 09, 2011 10:04 am

Re: Suggestions

Post by DigitalVolcano »

You can click on 'Subscribe topic' at the bottom left of the page.
Mulvaney
Posts: 4
Joined: Sun Apr 06, 2014 5:05 pm

Re: Suggestions

Post by Mulvaney »

Ta!
kira13
Posts: 6
Joined: Thu Jul 10, 2014 7:18 pm

Re: Suggestions

Post by kira13 »

This is an older post, but I wanted to add my "votes" for items 2 and 4 in Mulvaney's list.

I originally downloaded Duplicate Cleaner to de-duplicate backups that my Windows Home Server made when I restored the backups to a different computer in preparation for making that computer my new server. Windows Home Server v.1 automatically de-duplicated backups, so that any given file was stored once, and if the file appeared in multiple locations or was stored again during subsequent backups, there were pointers to the original file instead of copies. So I didn't have enough drive space to store the restored copies of the backups, with all the actual copies instead of pointers.

I would have used Mulvaney's option 2 to pick the newest backup folder (I put the restored backups in folders named with the date of backup) as the seed and check all the others against it. I had to wade thru a lot of duplicates within the same backup that were also duplicated in the newest backup. I had to run the scans multiple times because even with upping the duplicate files limit from 500,000 to 1,000,000 I was still finding more duplicates than that.

As for option 4, I just went thru my own thread of difficulty finding duplicate photos that a new import program gave me when it imported all my photos over again and not just the new ones. Option 4 would have prevented that issue; the import program was adding info to my metadata that prevented normal scans from finding my duplicates.

Thanks,
Kira
jackThom
Posts: 16
Joined: Tue May 27, 2014 7:21 am

Re: Suggestions

Post by jackThom »

Figured I'd add mine from a while back...

I started a thread (here) to ask if there was a way to accomplish what I was trying to do, but it seems the functionality is not yet available... (I'll just continue the original numbering)

5) "scan only against self" option for file paths

6) In the Duplicate Files selection assistant, under "Mark" options, an additional two options for each category...adding "in each path" Mark options to the currently existing "in each group" options.