Some serious? points

The best solution for finding and removing duplicate files.
Nagan
Posts: 26
Joined: Sun May 27, 2012 2:37 am

Re: Some serious? points

Post by Nagan »

The 'Select by Location' doesn't currently uncheck the files in the selected path, it only checks the copies it finds elsewhere (which is why your count went up). I agree this could be confusing, and go against the meaning of the word 'Preserve'. I think a good solution to this is to make this honour the 'Leave existing marks unchanged' option, which is usually off by default. Will add to the to-do list.
The previous version would select one copy from outside and remove the mark from the folder it is trying to preserve (because the option selects copies while leaving the folder in question alone).

Could this preserve-folder behaviour be made a standalone switch that acts like this (a rough sketch follows the list)?
1. If it is the first selection event, it preserves the folder and makes its files unavailable for selection later, until that folder entry is removed.
2. If it runs as a subsequent event, it clears any marks on files inside the folder and keeps the folder preserved for the remaining events.
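Here is a minimal Python sketch of how such a switch might behave; the function name, data structures and path handling are illustrative assumptions on my part, not Duplicate Cleaner internals.

Code: Select all

# Hypothetical sketch of the proposed standalone "preserve folder" switch;
# nothing here is Duplicate Cleaner's actual code or API.
from pathlib import PurePath

def preserve_folder(preserved_root, all_files, marked_files):
    """Return (surviving_marks, locked): the marks left after clearing the
    preserved folder, and the paths later selection events must skip."""
    root = PurePath(preserved_root)

    def inside(path):
        return root in PurePath(path).parents

    # First event or later: everything under the preserved folder stays
    # off-limits until the folder entry itself is removed from the scan.
    locked = {p for p in all_files if inside(p)}
    # Subsequent events also clear any marks already placed inside it.
    surviving_marks = set(marked_files) - locked
    return surviving_marks, locked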


There are also some issues with speed, especially during search and selection, which is worse than in the previous free version.
dcwul62
Posts: 53
Joined: Mon Jun 10, 2013 9:51 am

Re: Some serious? points

Post by dcwul62 »

Nagan wrote:First of all, the comparison between 3.1.0 and 3.1.4. I ran a test on a folder of nearly 80 GB, 40,000 files.
1. 3.1.0 took 5.50 min to list the duplicates whereas 3.1.4 took 6.28 min. Of course the statistics were identical. So obviously the newer version is a tad slower.

2. With a particular selection method, marking duplicates in 3.1.0 was instantaneous, marking 4,000 duplicates. But 3.1.4 suddenly seems to black out, and the marking only takes place after 5 seconds. In fact every duplicate-marking action blacks out the screen. :geek:

3. I use 800 x 600 resolution; 3.1.0 had correctly sized fonts and a better screen layout. In 3.1.4 the fonts seem to have been enlarged and the scroll bars have to be used often. Is it designed only for higher resolutions?


The bugs!
As a new user of Duplicate Cleaner I cannot judge the speed of earlier versions.
But I can compare DC with some other tools, and my impression is that it is a bit slow.
[Sorry]

Comparison:
1 folder, 4,509 subfolders, 77,250 files, total size on disk: 13.6 GB (14,697,803,776 bytes)
Duplicate Cleaner Pro v3.1.5, MD5 checksum, find files on the basis of Same Content
(rest is all default, i.e. any date, any size, any files, don't check filenames and dates)
DC is started immediately after PC start-up (i.e. no other applications are running)
Start->Finish: 25 minutes.

Duplicate File Detective v4.3.54
Check on same basis (MD5, any date, any size, any files, don't check filenames/dates)
Check same folder as above
Start->Finish: 15 minutes

Directory Opus (Explorer replacement, GPSoft)
Same basis/same folder
Start->Finish: 13 minutes

This is by no means meant to criticise your product!

Duplicate Cleaner has some options that DFD so far doesn't have, one of them being the ability to select files based on the shortest path.
Directory Opus isn't as flexible as DC and is less user-friendly (that is to say, in my opinion).
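For illustration, a minimal Python sketch of a shortest-path selection rule like the one mentioned above; the group layout is an assumption and this is not DC's or DFD's actual code.

Code: Select all

# Hypothetical sketch of a "keep the copy with the shortest path" rule.
def mark_all_but_shortest_path(groups):
    """groups: a list of duplicate groups, each a list of path strings.
    Returns the paths to mark for removal, keeping one copy per group."""
    to_remove = []
    for group in groups:
        keep = min(group, key=len)          # shortest path string wins
        to_remove.extend(p for p in group if p != keep)
    return to_remove

# Example: the copy buried deeper in the tree gets marked.
print(mark_all_but_shortest_path([[r"C:\a.txt", r"C:\backup\old\a.txt"]]))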

That said, hopefully DigitalVolcano will try to improve the search speed...?

Thanks

Keep up the good work!

=
Later, another scan (Windows 7 Ultimate 64-bit, 8 GB RAM):
[Screenshot attachment: SnagIt-24062013 103309.jpg]
therube
Posts: 615
Joined: Tue Jun 28, 2011 4:38 pm

Re: Some serious? points

Post by therube »

Nothing scientific...

Caching matters.
So figuring the best way to test would be to reboot between tests.

Anyhow, I did not. Plus I ran sandboxed (Sandboxie).

After running Duplicate Cleaner & DFD a few times (it took me a few runs to get a feel for DFD and what it was or wasn't doing), I'm not seeing much difference. At least not to the point where time would be my deciding factor.

I'll note that the very first run with DFD seemed to take a very long time to enumerate the directory tree.
(Again, that may very well be due to caching. Subsequent runs appeared, and were, faster.)
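A tiny sketch of that warm-cache effect, assuming nothing about either tool: read the same tree twice in one session and compare timings. The second pass is usually served largely from the OS file cache, which is why a reboot (or cache flush) between tests gives a fairer cold comparison. The path below is a placeholder.

Code: Select all

# Illustrative only: time two full reads of the same directory tree.
import os
import time

def read_tree(root):
    start = time.perf_counter()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            try:
                with open(os.path.join(dirpath, name), "rb") as f:
                    while f.read(1024 * 1024):
                        pass
            except OSError:
                pass                         # skip unreadable files
    return time.perf_counter() - start

root = r"C:\TMP"                             # placeholder path
print("first pass :", read_tree(root))       # mostly from disk (cold-ish)
print("second pass:", read_tree(root))       # mostly from the cache (warm)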

I searched my /TMP/ directory.

> 6.5 GB, 26,175 files, 2,527 directories

DFD:

Code: Select all

Time: 00:01:07
Folders: 2526
Folders skipped: 2 (marked as Hidden)
Files: 26022
Files skipped: 136 (0 byte or Hidden files)
Dups found: 5384
Space: 516 MB
Duplicate Cleaner (3.14 Free):

Code: Select all

Total Time Taken: 00:01:25
26175/26175 Files Scanned (6.53 GB)
1638 Groups of duplicates
5481 Files have duplicates(516 MB)
Hashes calculated: 6611
Quick Hashes calculated: 1837
Useful Quick Hashes: 2361
[updated for:]
Duplicate Cleaner (3.01 Free):

Code: Select all

Total Time Taken: 00:01:43
26175/26175 Files Scanned (6.53 GB)
1638 Groups of duplicates
5481 Files have duplicates(516 MB)
To me, the numbers are close enough that time does not matter.

You show a large time discrepancy, and that would be worth investigating further.

Note that the hash method used may or may not be the most efficient.
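The "Quick Hashes" figures above presumably refer to partial hashes used as a cheap pre-filter before full hashing. A generic Python sketch of that kind of pipeline, with an assumed 64 KB chunk size and no claim to being Duplicate Cleaner's actual implementation:

Code: Select all

# Generic duplicate-finding pipeline: group by size, quick-hash the first
# chunk, then full-hash only the remaining candidates. Illustrative only.
import hashlib
import os
from collections import defaultdict

QUICK_CHUNK = 64 * 1024          # assumed quick-hash size

def file_hash(path, quick=False):
    h = hashlib.md5()
    with open(path, "rb") as f:
        if quick:
            h.update(f.read(QUICK_CHUNK))
        else:
            for block in iter(lambda: f.read(1024 * 1024), b""):
                h.update(block)
    return h.hexdigest()

def find_duplicates(paths):
    by_size = defaultdict(list)
    for p in paths:
        by_size[os.path.getsize(p)].append(p)
    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue                          # unique size: nothing to hash
        by_quick = defaultdict(list)
        for p in same_size:                   # cheap pre-filter
            by_quick[file_hash(p, quick=True)].append(p)
        for candidates in by_quick.values():
            if len(candidates) < 2:
                continue
            by_full = defaultdict(list)
            for p in candidates:              # full hash only for survivors
                by_full[file_hash(p)].append(p)
            groups += [g for g in by_full.values() if len(g) > 1]
    return groups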
therube
Posts: 615
Joined: Tue Jun 28, 2011 4:38 pm

Re: Some serious? points

Post by therube »

Does this apply to Duplicate Cleaner, the "hash algorithms in .NET" article Comparing Hash Algorithms: Md5, Sha1 or Sha2?
(Note that what he calls "Sha2" is not what we, or at least I, typically think of.)

I thought there was a "speed" thread in the past where I had benched various hash methods?
Does anyone recall it? I couldn't find it; perhaps it was somewhere else?

Maybe this is what I was thinking of (dated at this point): "much faster search option".


I see that 3.14 only offers byte-by-byte comparison or MD5, whereas 3.01 offered other hashes.

Edit:

Duplicate Cleaner 3.01:

SHA-1: 01:45, so identical to MD5
SHA-256: 02:10, so clearly slower in my case (or at least on this run).
(No way to test the same in the current Duplicate Cleaner.)

WinXP, Intel E4300, 2GB RAM (a powerhouse ;-)).
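To reproduce this kind of hash-speed comparison outside the application, a small Python sketch like the one below times MD5, SHA-1 and SHA-256 over the same file; the file path and buffer size are placeholders, and results will obviously vary with hardware.

Code: Select all

# Illustrative hash-throughput comparison using Python's hashlib.
import hashlib
import time

def time_hash(algo, path, bufsize=1024 * 1024):
    h = hashlib.new(algo)
    start = time.perf_counter()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(bufsize), b""):
            h.update(block)
    return time.perf_counter() - start

for algo in ("md5", "sha1", "sha256"):
    # "big_test_file.bin" is a placeholder; point it at any large file.
    print(algo, round(time_hash(algo, "big_test_file.bin"), 2), "s")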