How do I speed up DC pro?

BinaryFinery
Posts: 2
Joined: Tue May 13, 2014 10:31 am

Post by BinaryFinery »

I started a duplicate scan on a 2TB Seagate HDD and it initially went fine, but it has now been running continuously for 48hrs and is only 52% complete. It is currently checking 1 file per second with over 220,000 files to go, which at that rate means at least another two and a half days, or over four days in total. I'm loath to stop it, because I haven't used it before and don't know whether that will lose the duplicates it's found so far.
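For reference, the remaining time at the quoted rate can be checked with a quick back-of-the-envelope calculation. This is only a rough sketch: the function name is made up for illustration, and it assumes the rate stays constant at 1 file per second, which it may well not if large files are still to come.

```python
SECONDS_PER_DAY = 86_400

def estimate_days_remaining(files_left, files_per_second):
    """Rough estimate of remaining scan time in days, assuming a constant rate."""
    return files_left / files_per_second / SECONDS_PER_DAY

# Figures quoted in the post: over 220,000 files to go at about 1 file per second.
print(f"{estimate_days_remaining(220_000, 1.0):.1f} days remaining")
```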

Can anyone diagnose what is going on here? Most people seem to mention times of several hours, not days, for TB-size file systems.

It's important to note that this is just the vanilla file comparison using MD5 hashes, pretty much the defaults straight after installation. There are no complicated picture or music comparisons going on. Have I selected something wrong?
therube
Posts: 634
Joined: Tue Jun 28, 2011 4:38 pm

Re: How do I speed up DC pro?

Post by therube »

> it initially went fine

In what way did it go "fine"?

> running continuously for 48hrs

Hmm?

Is the disk FAT or NTFS?
Highly fragmented?
IDE or SATA drives? And they're local (as opposed to being on a network or USB)?
If it's an IDE drive, is it running in PIO or DMA mode? (DMA being more efficient.)

Tons of same sized files?

Have you run a CHKDSK on the drive?

> I'm loath to stop it

Can't really see a reason not to?
(Trying to recall if intermediate results will be shown if you do?)



How much free space on the drive?
What are your computer specs, RAM & CPU?



> How do I speed up DC pro?

Depending on what I'm looking for, I tend to start out with "simpler", more targeted searches, allowing me to more quickly identify likely areas for dups, while giving up "exactness". And once I have found most likely areas, I'll turn up the screws & go from there.

So instead of (initially) scanning an entire drive, I'll scan a few suspect directory trees. And instead of immediately starting off with MD5, I might go with Name & Modified date. That helps me narrow things down. Once I've got some ideas in that regard, I'll scan particular areas there by Name & Modified date & MD5. That way I've limited the file set that I'm doing the harder work on. Once I've done that, I may switch to similar name & do likewise. As I work my way through, getting rid of the more obvious dups, at some point, having culled many files already, I'll Ignore name & date & go with MD5 alone, picking up less obvious dups, but on a much smaller data set than what I had to begin with.
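The general idea of doing the cheap comparison first and the expensive one only on what survives can be sketched in a few lines of Python. This is not how DC pro works internally, just an illustration of the staged approach; the function names are made up, and matching on lower-cased file name plus whole-second modified time is an assumption chosen for the example.

```python
import hashlib
import os
from collections import defaultdict

def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not read into memory at once."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(roots):
    """Return groups of paths with the same name, modified time and MD5."""
    # Stage 1 (cheap): group by name and modified date, which needs no file reads.
    candidates = defaultdict(list)
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                try:
                    st = os.stat(path)
                except OSError:
                    continue  # skip files that cannot be examined
                candidates[(name.lower(), int(st.st_mtime))].append(path)

    # Stage 2 (expensive): hash only files that share a stage 1 key.
    duplicates = []
    for paths in candidates.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for p in paths:
            by_hash[md5_of(p)].append(p)
        duplicates.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return duplicates
```

Passing only a few suspect directory trees as `roots` corresponds to the "don't scan the whole drive at first" part; files with a unique name and date are never read at all, which is where the time is saved.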
BinaryFinery
Posts: 2
Joined: Tue May 13, 2014 10:31 am

Re: How do I speed up DC pro?

Post by BinaryFinery »

Thanks for your response.

I did end up stopping it and thankfully DC pro does then let you analyse the duplicates it had already found. I have a lot of large files on the drive and many duplicates, the result of dumping several years' worth of HDDs and backups onto one drive in the hope of rationalising them.

What I really need to know is how to make sure DC pro isn't doing the work all over again when I restart a scan. Do you know if it stores a database of MD5s instead of having to recalculate them for each scan? Some of the large files, like TrueCrypt volumes, can cause it to take a long time.

Any tips you can give other than scanning by name would be much appreciated.
therube
Posts: 634
Joined: Tue Jun 28, 2011 4:38 pm

Re: How do I speed up DC pro?

Post by therube »

> (does it) store a database of MD5s instead of having to recalculate for each scan

No (at least not for a Regular Mode scan).
Though by default, on startup, your last duplicate files list is reloaded (no rescanning done).
(Don't recall, but perhaps an Image/Audio scan may calculate & store hashes?)
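For anyone wondering what such a database of MD5s would amount to in general, here is a minimal sketch in Python. It is purely illustrative and says nothing about DC pro's internals (which, as noted above, don't appear to do this for a Regular Mode scan); the table layout and the function names `open_cache` and `cached_md5` are made up for the example. It assumes a file is unchanged if its path, size and modified time are unchanged.

```python
import hashlib
import os
import sqlite3

def open_cache(db_path):
    """Open (or create) a small SQLite database holding previously computed hashes."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hashes ("
        "path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, md5 TEXT)"
    )
    return conn

def cached_md5(conn, path):
    """Return the MD5 of a file, re-reading it only if its size or mtime changed."""
    st = os.stat(path)
    row = conn.execute(
        "SELECT size, mtime_ns, md5 FROM hashes WHERE path = ?", (path,)
    ).fetchone()
    if row and row[0] == st.st_size and row[1] == st.st_mtime_ns:
        return row[2]  # cache hit: the file itself is not read

    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    digest = h.hexdigest()
    conn.execute(
        "INSERT OR REPLACE INTO hashes VALUES (?, ?, ?, ?)",
        (path, st.st_size, st.st_mtime_ns, digest),
    )
    conn.commit()
    return digest
```

With something like this, a multi-gigabyte container file would be hashed once and then skipped on later runs as long as its size and timestamp stay the same. The trade-off is that a file changed without its size or timestamp changing would go unnoticed.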