Cannot use last version !

The best solution for finding and removing duplicate files.
dc_fan

Post by dc_fan »

Dear DV,

I do love your unrivalled program,

but I have been forced to go back to version 1.31 and cannot use version 1.4 or later, for these reasons:

- Much TOO SLOW with the new MD5 check, which brings only slowness to me. I was perfectly satisfied with the fast CRC32. I understand that some may need MD5, but in that case, why not

      -----> provide an OPTION to choose the slower MD5 check instead of CRC32, and KEEP CRC32 available?

- Annoying "About" splash screen when the program starts; I didn't find how to disable it. There should be an option to get rid of it, in order to start using this excellent software without delay.


********************************************

Furthermore, I suggest you could provide a way to suppress that

- Boring message box (useless except for newbies) "Are you sure you want to quit" when exiting. I acknowledge some want to have it, but why is there no OPTION to get rid of it (at least I haven't found a way to stop it appearing)? It prevents closing Duplicate Cleaner from the taskbar while another window has focus, forcing the user to bring DC to the front first.

Hope you can handle these suggestions from one who is VERY fond of your work.
Thanks in advance.
Fool4UAnyway

Post by Fool4UAnyway »

I think you are not correct in stating that you cannot use the latest version. You could have come up with a more correct thread title.
DV

Post by DV »

Glad you like the program.

Surprised you find MD5 to be too slow - in testing there was very little difference between the two. CRC had to be removed as it is prone to hash collisions (falsely marking files as duplicates) when you are comparing a large number of files (70,000+).
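
To put a rough number on the collision risk mentioned above, here is a minimal sketch using the standard birthday approximation. It assumes hash values are uniformly distributed and that every file is distinct and compared against every other one; this is a pessimistic simplification and not a description of how Duplicate Cleaner actually compares files. The function name is made up for illustration.

```python
import math


def collision_probability(num_files: int, hash_bits: int) -> float:
    """Approximate chance that at least two of num_files distinct files
    share the same hash value, assuming uniformly distributed hashes."""
    pairs = num_files * (num_files - 1) / 2
    # Birthday approximation: p ~= 1 - exp(-pairs / 2**bits).
    # expm1 keeps precision when the probability is tiny (128-bit hashes).
    return -math.expm1(-pairs / 2.0 ** hash_bits)


if __name__ == "__main__":
    for n in (1_000, 10_000, 70_000):
        crc = collision_probability(n, 32)    # CRC32 is a 32-bit checksum
        md5 = collision_probability(n, 128)   # MD5 is a 128-bit digest
        print(f"{n:>6} files: CRC32 {crc:.2%}   MD5 {md5:.3e}")
```

Under these assumptions, 70,000 files give a chance of roughly 43% that at least one pair collides on a 32-bit value, while the 128-bit figure is negligible. A tool that only compares files of equal size would see far fewer candidate pairs, so the real-world risk is lower than this upper-end estimate.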

In DC2.0 the exit message box/splash is optional :)
dc_fan

Post by dc_fan »

My apologies !

You're right, there is no significant difference in process time. I performed a few tests as your answer troubled me. Results are shown at the bottom of post.

I was misled by the fact that once a path has been searched, most of the hashing results must be kept somewhere in memory, so any second run on the same range of folders is drastically faster, whatever the version.
That led me to the WRONG conclusion that the old CRC-based version was much faster, as I always ran the old version AFTER trying the latest one!

So, pardon me for casting such a stupid statement around.

Gonna switch to version 1.4.0.7 right now, despite the nag splashscreen.

************************************************

It would be nice though to be able to optionally disable the popup of the "About ..." splash at program start.

What about making Duplicate Cleaner PORTABLE (not using the Windows registry)? For now, I use an AutoIt 3 script to make it "portable" and leave no trace in the registry, but I'd rather have the settings in DC's folder. It would be cleaner, and not machine-dependent.
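
As an illustration of what "settings in the program's folder" could look like, here is a small sketch that keeps options in an INI file next to the application instead of the registry. The file name, section name, and option keys are all hypothetical; nothing here reflects how Duplicate Cleaner stores its settings.

```python
import configparser
from pathlib import Path

SETTINGS_FILE = "settings.ini"  # hypothetical file name
SECTION = "options"             # hypothetical section name


def save_settings(app_dir: Path, settings: dict) -> Path:
    """Write the settings to an INI file inside the application folder."""
    parser = configparser.ConfigParser()
    parser[SECTION] = settings
    path = app_dir / SETTINGS_FILE
    with path.open("w", encoding="utf-8") as handle:
        parser.write(handle)
    return path


def load_settings(app_dir: Path) -> dict:
    """Read the settings back; a missing file simply yields no options."""
    parser = configparser.ConfigParser()
    parser.read(app_dir / SETTINGS_FILE, encoding="utf-8")
    if parser.has_section(SECTION):
        return dict(parser[SECTION])
    return {}
```

Because everything lives in one file beside the executable, copying the folder to a USB stick or another machine carries the settings along with it.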

Thanks for adding PNG preview. Is it normal that PNGs aren't auto-fitted to the preview (they appear at 100% with no zoom slider)?


************************************************ My quick tests ******************************

Tested on 7 PCs - Dell Studio4, Intel Quad 8200 CPU

Criteria: File filter = "*" - Same content

Test 1: an ordinary photos folder of 18.1 GB = 84 586 JPG files (19 480 799 343 bytes in 1599 folders), while some routine work is being done concurrently:
=======
1st run - version 1.31      CRC32  -->   9 ' 37    2 % CPU      6.82 ms/file    33.8 MB/s
2nd run - version 1.4.0.7   MD5    -->   1 ' 06    23 % CPU     on same folder (supplementary run = 1'02)


Test 2: other folder /jpg - 56 608 files - 14 999 816 775 bytes
=======
1 - version 1.4.0.7   MD5    -->   4 ' 30    2-3 % CPU    4.76 ms/file    55.6 MB/s
2 - version 1.31      CRC32  -->   0 ' 30


Test 3: other folder /jpg - 97 918 files - 23 453 339 944 bytes
=======
1 - version 1.4.0.7   MD5    -->  11 ' 28                 7.03 ms/file    37.0 MB/s

Test 4: other folder /jpg - 93 202 files - 25 479 799 193 bytes
=======
1 - version 1.31      CRC32  -->  10 ' 10                 6.54 ms/file    41.8 MB/s

Test 5: other folder /jpg - 17 295 files - 4 520 297 672 bytes
=======
1 - version 1.4.0.7   MD5    -->   1 ' 17                 4.45 ms/file    58.7 MB/s

Test 6: several folders /jpg - 36 730 files - 5 296 523 902 bytes
=======
1 - version 1.31      CRC32  -->   2 ' 07                 3.45 ms/file    41.7 MB/s

Test 7: several folders /jpg - 14 040 files - 3 430 181 137 bytes
=======
1 - version 1.4.0.7   MD5    -->   0 ' 41                 2.92 ms/file    81.5 MB/s


Both versions seem to use only 0.7 - 3 % of CPU on average, with peaks of 8 %, on the first pass; a second "GO" on the same "Search Criteria" consumes around 20-27 % CPU on average.
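
For anyone wanting to repeat this kind of comparison, the per-file time and throughput columns can be derived from the elapsed time, file count, and byte count as sketched below. The function name is made up for illustration, and the sketch assumes decimal megabytes (1 MB = 1,000,000 bytes), which reproduces the figures posted for the first run of Test 1.

```python
def scan_stats(minutes: int, seconds: int, files: int, total_bytes: int) -> tuple:
    """Return (milliseconds per file, megabytes per second) for one scan."""
    elapsed = minutes * 60 + seconds                # total run time in seconds
    ms_per_file = elapsed / files * 1000
    mb_per_s = total_bytes / elapsed / 1_000_000    # decimal megabytes
    return round(ms_per_file, 2), round(mb_per_s, 1)


if __name__ == "__main__":
    # Test 1, first run: 9'37 for 84 586 files totalling 19 480 799 343 bytes
    print(scan_stats(9, 37, 84_586, 19_480_799_343))  # (6.82, 33.8)
```

Note that these numbers only compare the two checks fairly when each run starts from the same state, that is, on folders that have not already been read by an earlier run.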
DV

Post by DV »

Interesting stats. The second run is always much faster as your hard drive/Windows caches the data from the first. Also the second run will use more CPU as it is spending its time processing the data rather than waiting for the hard drive!
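
One way to see the hashing cost separately from the disk cost is to time both checks on data that is already in memory. The sketch below uses Python's standard zlib.crc32 and hashlib.md5; it is not the code Duplicate Cleaner uses, and the function name and buffer size are arbitrary choices for illustration.

```python
import hashlib
import time
import zlib


def hash_throughput(data: bytes) -> dict:
    """Return the MB/s reached by CRC32 and MD5 on data held in memory."""
    checks = (
        ("crc32", zlib.crc32),
        ("md5", lambda buf: hashlib.md5(buf).digest()),
    )
    results = {}
    for name, func in checks:
        start = time.perf_counter()
        func(data)
        elapsed = time.perf_counter() - start
        # Guard against a zero reading on very small inputs.
        results[name] = len(data) / max(elapsed, 1e-9) / 1_000_000
    return results


if __name__ == "__main__":
    buffer = bytes(64 * 1024 * 1024)  # 64 MiB of zeros, no disk access involved
    for name, rate in hash_throughput(buffer).items():
        print(f"{name}: {rate:.0f} MB/s")
```

If both rates come out well above what the drive delivered on a first pass, the choice of hash is not the bottleneck for that pass; the drive is.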

DC 2 is portable, provided the machine has .NET installed (most up-to-date systems will).
PNG support is much improved too (i.e. zoom).
dc_fan

Post by dc_fan »

Good.

I don't see the link for downloading DC 2 ??

DV

Post by DV »

It is not released yet - I hope to have a beta out shortly.