Whittling down duplicates from very large collection.

bsacco
Posts: 66
Joined: Sun Jan 02, 2022 9:47 pm

Whittling down duplicates from very large collection.

Post by bsacco »

I have a large collection of photos and videos.

Too large to do a one-time scan for duplicates.

I have a MASTER folder where I want to keep all my deduped photos and videos.

What is the best practice/method for comparing another folder full of potential duplicates against my MASTER folder?

Is the answer to set my MASTER (de-duped) folder to "Master" in the Scan Location > Find Duplicates dropdown menu, and then set the folder with the potential duplicates to "External only" in the same dropdown?

Can someone please provide step-by-step instructions? I find the documentation very confusing and not user-friendly.

Thanks, bob
bsacco
Posts: 66
Joined: Sun Jan 02, 2022 9:47 pm

Re: Whittling down duplicates from very large collection.

Post by bsacco »

I have tried asking AI how to do this with DCP5, but all I get is general directions, not step-by-step instructions. Here is what it gave me:

Duplicate Cleaner Pro 5 provides several effective strategies for managing large collections of duplicates:

Initial Setup and Scanning
Start by configuring your scan criteria carefully. Use the file type filters to focus on specific categories (photos, documents, music) rather than scanning everything at once. Set appropriate file size thresholds to exclude very small files that are likely system files or thumbnails.

Smart Selection Methods
The software offers multiple selection modes in the results view. Use "Select Oldest" or "Select Newest" to automatically mark files based on creation dates. The "Select Shortest Path" option helps keep files in more organized folder structures while removing those buried deep in subdirectories.

Preview and Verification
Before deleting anything, use the built-in preview feature to verify duplicates. This is especially important for images and documents where file names might be misleading. The software shows file details like dimensions, creation dates, and folder paths to help you make informed decisions.

Batch Processing Approach
Rather than trying to process everything at once, work in batches. Start with the most obvious duplicates (identical file names and sizes) before moving to more complex matches. This reduces the risk of accidentally deleting important files.

Safe Deletion Options
Use the "Move to Recycle Bin" option rather than permanent deletion initially. For even more safety, consider using the "Move to Folder" feature to relocate suspected duplicates to a review folder before final deletion.

Advanced Filtering
Leverage the advanced filtering options to exclude certain folders (like system directories) or include only specific file extensions. You can also set up ignore lists for files you know you want to keep multiple copies of.

The key is working systematically and verifying your selections before committing to deletions, especially when dealing with large collections where manual review of every duplicate isn't practical.
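For what it's worth, the batch idea above (match on cheap attributes like file size first, then confirm by content hash) is how duplicate finders generally work, and can be sketched in plain Python independent of DCP5. This is an illustrative stand-in, not DCP5's actual code; the function name is my own:

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def find_duplicates(root):
    """Group files under `root` by size first (cheap filter),
    then confirm candidates by SHA-256 content hash (exact check).
    Returns a list of groups of byte-identical files."""
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    duplicates = []
    for paths in by_size.values():
        if len(paths) < 2:  # a unique size can't be a duplicate
            continue
        by_hash = defaultdict(list)
        for path in paths:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
        duplicates.extend(g for g in by_hash.values() if len(g) > 1)
    return duplicates
```

The size pre-filter is what makes scanning large collections tractable: only files whose sizes collide ever get read and hashed in full.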

---------------------------------------------------------------

Any chance I can get step-by-step instructions on how to specifically use DCP5 to achieve my goal of de-duping a large collection?
DigitalVolcano
Site Admin
Posts: 1866
Joined: Thu Jun 09, 2011 10:04 am

Re: Whittling down duplicates from very large collection.

Post by DigitalVolcano »

Don't bother with AI - it will get confused and will give wrong/outdated information.

The process:
MAKE SURE YOU HAVE A SEPARATE BACKUP FIRST

- Set "Master" folder to 'External only + Master' and 'Protected'
- Set "potential duplicates" folder to 'External only'
- Run a Regular mode -> Same content scan

You'll now have a list of files that appear on both the master and the potential folder. The master folder ones are protected. (check this)

If you want to delete the duplicates from the 'potential' folder:
- Use the Selection Assistant
- Mark 'All but one in each group'. This will mark all duplicates in the 'potential' folder, not the master folder.
- DOUBLE CHECK YOU'VE PROTECTED THE MASTER FOLDER. Nothing in it should be marked if it is protected.
- Use the File Removal -> Delete function to delete the dupes. Send them to the Recycle Bin if there aren't too many.

You'll now be left with non-duplicate files in the 'potential' folder.
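For anyone curious what the Master / External-only split amounts to conceptually, here is a rough stand-alone sketch in Python (my own illustration with made-up function names, not DCP5's internals): hash everything in the protected master folder, then flag files in the 'potential' folder whose content hash already appears in the master. Master copies are never candidates for removal:

```python
import hashlib
from pathlib import Path

def hash_file(path):
    """SHA-256 digest of a file's full contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def dupes_against_master(master_dir, potential_dir):
    """Return files in `potential_dir` whose content already exists
    in `master_dir`. Only 'potential' files are ever returned,
    mirroring the protected-master setup described above."""
    master_hashes = {
        hash_file(p) for p in Path(master_dir).rglob("*") if p.is_file()
    }
    return [
        p for p in Path(potential_dir).rglob("*")
        if p.is_file() and hash_file(p) in master_hashes
    ]
```

As with the steps above, it's safer to move the flagged files to a review folder (or the Recycle Bin) and inspect them before deleting anything for good.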
bsacco
Posts: 66
Joined: Sun Jan 02, 2022 9:47 pm

Re: Whittling down duplicates from very large collection.

Post by bsacco »

Thank You! Thank You! Thank You! Thank You! Thank You! Thank You! Thank You! Thank You! Thank You!

The most powerful info I've received on this forum to date!