Comparing and Consolidating drives or folders

DigitalVolcano
Site Admin
Posts: 1280
Joined: Thu Jun 09, 2011 10:04 am

Post by DigitalVolcano » Tue Jan 29, 2019 10:04 pm

Consolidating Drives/Folders/Backups with Duplicate Cleaner Pro 4.

A rough guide. This tutorial can evolve - all suggestions welcome!
There are different ways to do this, but this should be the method that involves the least copying.

There is a tutorial video "Finding Unique files with Duplicate Cleaner" which covers steps 4a-4d here: https://www.youtube.com/watch?v=lbYFB5w-4nM

1. Be sure to have backed up!

2. Pick one drive/folder to be the "base" (or master). Preferably the one with the most files in it - this will reduce the amount of copying you'll have to do later. We'll call this the base from now on.

3. It's probably best to de-duplicate your base at this stage if required using Duplicate Cleaner. Regular Mode-Same Content is always recommended for a first pass. Of course the duplicates removed are up to you - you may want to keep copies of certain files in different places.

4. Run a comparison of the base with the folder(s)/drive(s) you want to merge in. This will determine which files are missing from the base. You can achieve this with Duplicate Cleaner using the following steps:

a. Set up the criteria to find the 'Same Content' with no other restrictions
b. In the Scan Location tab, add the base folder and the other drives/folders for comparison.
c. Set 'Scan against self' to 'No' for each of the folders in the list.
d. Set 'Find Uniques' to 'Yes' for each of the comparison drives/folders *except* the base, which should have 'Find Uniques' set to 'No'.
e. Start the scan.
f. When the scan is complete, the Unique Files tab should show any files that are missing from your base. The Duplicate Files tab shows files shared between the drives.
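
For anyone who wants to see the idea behind step 4 outside the program, here is a minimal Python sketch. It is not how Duplicate Cleaner works internally; it just shows a "same content" comparison that reports files in the other folders whose content does not appear anywhere in the base. The function names (file_digest, find_missing_from_base) and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_missing_from_base(base, others):
    """List files under `others` whose content is not present in `base`.

    Hypothetical helper: file names and locations are ignored,
    only the content hash is compared.
    """
    base_digests = {
        file_digest(p) for p in Path(base).rglob("*") if p.is_file()
    }
    missing = []
    for folder in others:
        for p in sorted(Path(folder).rglob("*")):
            if p.is_file() and file_digest(p) not in base_digests:
                missing.append(p)
    return missing
```

Calling find_missing_from_base("D:/base", ["E:/old_backup"]) (paths are placeholders) would return roughly what the Unique Files tab shows in step 4f. Note that it hashes every file, so it is much slower than a tool that filters by size first.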

5. You can now use the 'File Removal' window to copy the files in the *Unique* tab to your base drive.
--You can quickly mark all the files in the Unique tab using the Selection Assistant, or by right-clicking and selecting 'Invert marked files' from the context menu.
--These can be copied in. It's up to you whether to preserve the source folder structure but it's a good idea to copy them into a new subfolder.
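
As a rough illustration of the "copy into a new subfolder, keeping the source folder structure" option in step 5, here is a small Python sketch. It is an assumption-laden example, not the program's File Removal logic: the function name copy_into_base and the default subfolder name "merged_in" are made up, and existing files are skipped rather than overwritten.

```python
import shutil
from pathlib import Path


def copy_into_base(files, source_root, base, subfolder="merged_in"):
    """Copy `files` into base/subfolder, preserving paths relative to source_root.

    Hypothetical helper: files that already exist at the destination are
    skipped so nothing in the base is ever overwritten.
    """
    source_root = Path(source_root)
    dest_root = Path(base) / subfolder
    copied = []
    for src in files:
        src = Path(src)
        dest = dest_root / src.relative_to(source_root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        if dest.exists():
            continue  # keep whatever is already there
        shutil.copy2(src, dest)  # copy2 also keeps timestamps where possible
        copied.append(dest)
    return copied
```

Copying into a separate subfolder makes it easy to review (or undo) what was merged in before you reorganise the base.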

Note 1: The merged-in files may have contained duplicates which weren't present originally in the base (See video tutorial for an explanation). You may have to de-duplicate the base again at the end.
rbflapjack
Posts: 4
Joined: Fri Jan 10, 2020 2:24 pm

Re: Comparing and Consolidating drives or folders

Post by rbflapjack » Fri Jan 10, 2020 2:26 pm

Is there a way to do this between one folder in a drive and the rest of the drive? I have a sloppy backup folder within a drive that I believe has its contents scattered throughout the rest of the drive, so I want to delete the whole folder. Can I compare this folder to the parent drive minus that folder?
Hugo
Posts: 2
Joined: Sun Dec 08, 2019 11:30 pm

Re: Comparing and Consolidating drives or folders

Post by Hugo » Tue Jan 28, 2020 9:23 pm

I did exactly what you describe here:
Started with de-duplicating both file sets.
My base set is about 100000 files
The other set is about 1000 files
Since I deduplicated both sets, no hashes have to be calculated.
I would expect it to do 1000 file compares (only the new files) but it does 100000 file compares.

My use case is having a base set of files and being able to add only unique files from small file sets. Is there a way to make this process fast and avoid unnecessary file compares?
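
To make the expectation concrete, here is a minimal Python sketch of the kind of comparison I have in mind. It says nothing about how Duplicate Cleaner actually schedules its compares; the function names are hypothetical. The idea is that only base files whose size matches one of the new files ever need their content read.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def unique_new_files(base, new_folder):
    """Return files in new_folder whose content is not in base.

    Hypothetical sketch: base files are grouped by size (a cheap stat call)
    and only hashed when a new file has exactly the same size.
    """
    new_files = sorted(p for p in Path(new_folder).rglob("*") if p.is_file())
    wanted_sizes = {p.stat().st_size for p in new_files}

    base_by_size = defaultdict(list)
    for p in Path(base).rglob("*"):
        if p.is_file():
            size = p.stat().st_size
            if size in wanted_sizes:
                base_by_size[size].append(p)

    hashes_for_size = {}  # size -> set of digests, filled in lazily
    unique = []
    for p in new_files:
        size = p.stat().st_size
        candidates = base_by_size.get(size)
        if not candidates:
            unique.append(p)  # no base file of this size, so no read needed
            continue
        if size not in hashes_for_size:
            hashes_for_size[size] = {sha256_of(b) for b in candidates}
        if sha256_of(p) not in hashes_for_size[size]:
            unique.append(p)
    return unique
```

With 1000 new files, the content reads on the base side are limited to files that share a size with one of those 1000, although the whole base still has to be listed once to get the sizes.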
Lovaduck
Posts: 1
Joined: Mon Apr 20, 2020 3:07 pm

Re: Comparing and Consolidating drives or folders

Post by Lovaduck » Mon Apr 20, 2020 3:30 pm

First of all I want to congratulate the developer(s) for a great product. I have been trying to consolidate my backups for over ten years and the tools out there (like Fast Duplicate File Finder) simply didn't cut it. After using the trial for less than a day I was convinced it was the solution and I bought it without hesitation! (I am now productively using my lockdown days!)
The interface and operation are very intuitive. The options are many, the use of geo tags for finding duplicates is a great idea (that can also be repurposed to group photos by location). The product is blazing fast! Brilliant! Lovely! And it handles folders, which is a great plus as many folders are complete duplicates and can be swiped with one check mark. A great time saver function!

I have a couple questions:
One, when I save the duplicate file list (.csv), am I getting all the information required to restart the work at a later date? For example, I am doing a search by image similarity (set to high this time). Can I do something else, and then come back and resume cleaning from where I was? Or at least would it reduce the amount of work required to restart by a significant factor? I can test this, but that would mean redoing five hours of heavy-duty calculation...
Two, is the aforementioned list updated each time I clear some files, so I won't be presented with files that have been removed already?

On a different subject, I did a de-dup process on an external drive. When using the "move to folder" option, instead of deleting all files, the process took about eight hours to complete. The target folder was on the same disk so my thought was that it would simply move file by file without copying. Was there any copying involved therefore?

Thanks again, and please feel free to use my review comments about the program as you see fit.
DigitalVolcano
Site Admin
Posts: 1280
Joined: Thu Jun 09, 2011 10:04 am

Re: Comparing and Consolidating drives or folders

Post by DigitalVolcano » Fri Apr 24, 2020 7:24 am

Glad you are finding Duplicate Cleaner useful!

The program does save where you were between sessions, but if you mean you want to perform a different scan, then come back to earlier results, the answer is 'Mostly, yes'.

You can import the Duplicate Files csv back into the program and it will retain all your marked files. However, the list does not contain all the data required to calculate the duplicate folder tab, so this tab will be blank.

One hack around this is to back up the DuplicateCleanerPro.data file and copy it back later to restore the program to an exact previous state (you may lose later cached data, though). Version 5.? will allow the saving and backup of full scans.
Note that re-scanning images is much faster as the image metrics are cached.

When files are deleted they will drop off the duplicate file list. If one single file is left in any group this will be hidden from the list as well.

The move option should just use the same operation as Windows Explorer, so should be fast in theory. If you were moving between partitions on the same drive or it is a NAS then it could be slower.
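
To illustrate the difference (this is a generic Python sketch, not Duplicate Cleaner's code, and the function names are made up): a move within one volume is just a rename, while a move to another volume or partition has to copy the data and then delete the original. Python's shutil.move behaves this way, and comparing st_dev is one way to check in advance which case applies.

```python
import os
import shutil
from pathlib import Path


def on_same_volume(src, dest_dir):
    """True if src and dest_dir report the same device/volume ID."""
    return os.stat(src).st_dev == os.stat(dest_dir).st_dev


def move_file(src, dest_dir):
    """Move src into dest_dir and report whether it was a cheap rename.

    Hypothetical helper: shutil.move renames when the destination is on
    the same filesystem, otherwise it copies the file and removes the source.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    cheap = on_same_volume(src, dest_dir)
    target = shutil.move(str(src), str(dest_dir / Path(src).name))
    return Path(target), cheap
```

If on_same_volume returns False for your source and target folders, every moved file is being copied in full, which could explain a run of several hours.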