Frustrated by speed/resources. Help?

The best solution for finding and removing duplicate files.
BP

Post by BP »

Hi, all,

This is my first post. I've enjoyed test runs of Duplicate Cleaner on small file sets, but when I turned it loose this week to do what I really want it to do - look at my entire external hard drive, used for backup, and identify all the duplicates - it doesn't seem up to the task.

The first time I tried it, I used CRC-only criteria. The scan ran for more than two days, and was only 50 percent complete before my PC ground to a halt and I had to physically power it off in order to reboot. The second time, I thought I'd limit the possible results by checking Name, Time and Date (no CRC). After I let it run for about 24 hours, it was only 17 percent complete. I looked in the Windows Task Manager, and Duplicate Cleaner was using ~350K of memory, ~350K of virtual memory and 50% of the CPU resources, and the amount of memory used seemed to be increasing steadily, in small increments. So, I manually shut down the scan and decided to come here to ask for help.

I'd really like to support the tool and use it, but if it can't handle this many files (approx. 680 GB, just over 1.1 million files), I guess I'll have to look for a different solution. I'd greatly appreciate any tips, pointers or other help anyone here could give me.

Thanks!

Bob
DV

Post by DV »

Sorry you're having trouble - I think that DC just doesn't handle huge sets of files very well. It's partly a limitation of the ageing framework it is written in, and partly because the code needs to be rewritten to manage memory more effectively.
Hans Henrik

Post by Hans Henrik »

Btw DV,
what language is it written in? What compiler did you use? What optimization options did you use?
[From personal experience with the MinGW C++ compiler: when NOT using -O1 or higher optimization, the program never uses more than 50% CPU, btw.]
DV

Post by DV »

DC is written in VB6 with some assembler for the CRC check.
Hans Henrik

Post by Hans Henrik »

Do you know why Duplicate Cleaner never goes beyond roughly 55% CPU?
(Even when I give the exe a higher priority, it never goes beyond 55% CPU.)
DV

Post by DV »

Possibly the bottleneck is the disk speed - not the processing. It has to do a lot of reading!
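
To illustrate why the amount of reading matters, here is a minimal Python sketch of one common way to cut it down. This is not how Duplicate Cleaner works internally (DC is VB6 with an assembler CRC routine); the function names find_duplicates and crc32_of_file are made up for the example. The idea is to group files by size first, which needs no content reads, and only CRC the files that share a size with at least one other file.

```python
import os
import zlib
from collections import defaultdict


def crc32_of_file(path, chunk_size=1 << 20):
    """Return the CRC-32 of a file, reading it in 1 MiB chunks so memory stays flat."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def find_duplicates(root):
    """Return lists of paths under root that share both size and CRC-32.

    Hypothetical helper for illustration only.
    """
    # Pass 1: group by size. This only reads directory metadata, not file contents.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # unreadable or vanished file: skip it

    # Pass 2: read and CRC only the files whose size is shared with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # unique size, cannot be a duplicate
        by_crc = defaultdict(list)
        for path in paths:
            try:
                by_crc[crc32_of_file(path)].append(path)
            except OSError:
                continue
        groups.extend(sorted(g) for g in by_crc.values() if len(g) > 1)
    return groups
```

Calling find_duplicates on the root of the backup drive would return one list per set of candidate duplicates. A CRC-32 match is not proof that two files are identical, so a byte-by-byte comparison before deleting anything is a sensible extra step. How much reading this saves depends entirely on how many files on the drive share a size.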