Feature Suggestion - Hash Databases

Ryan
Posts: 11
Joined: Fri May 16, 2014 3:25 am

Post by Ryan »

First, allow me to apologize, as this is a reprint of a suggestion I originally sent to you as a support ticket. I was unaware that there was a forum, which I assume is a more appropriate place to submit this, so I have posted it here. My apologies if a support ticket was not the proper way to send it.

The suggestion follows. Thank you!

===

Duplicate Cleaner is a superb duplicate finder program... very nice work.

There is just one killer feature missing that keeps it from being the best program of this type I have ever seen.

There was a program called DupeMaster that I purchased maybe 10 years ago or so. It was a very strong duplicate finder, but had one feature that to this day I think makes it still unmatched.

You could scan files in the file list to generate their hashes and store those hashes in separate databases. The databases could then be loaded like virtual folders and files could be compared for duplicates against the hash databases as well as against other real files.

This was an enormously powerful feature, as it allowed you to check for duplicate files even if the real files were not present. So, for example, you could load a database of hashes of files on another computer, stored in backups, on optical media, etc., and compare your real files to these hashes even if those items were not physically present.

You can compare your real files to both other real files and the "virtual" files in the hash databases.
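
To illustrate the idea, here is a minimal sketch in Python. It is not how DupeMaster or Duplicate Cleaner works internally; the function names, the choice of SHA-256, and the JSON file format are all assumptions made for the example. It hashes every file under a folder, saves the hashes to a database file, and later compares real files against that database without needing the original files.

```python
import hashlib
import json
from pathlib import Path


def file_hash(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_hash_db(root):
    """Map each file hash to the paths (relative to root) that have it."""
    root = Path(root)
    db = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            db.setdefault(file_hash(path), []).append(str(path.relative_to(root)))
    return db


def save_hash_db(db, db_path):
    """Write the hash database to a JSON file (hypothetical format)."""
    with open(db_path, "w", encoding="utf-8") as handle:
        json.dump(db, handle, indent=2)


def load_hash_db(db_path):
    """Load a hash database previously written by save_hash_db."""
    with open(db_path, encoding="utf-8") as handle:
        return json.load(handle)


def find_duplicates(root, db):
    """Return {real file: [recorded paths]} for files whose hash is in db."""
    root = Path(root)
    matches = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            recorded = db.get(file_hash(path))
            if recorded:
                matches[str(path.relative_to(root))] = recorded
    return matches
```

You would run build_hash_db and save_hash_db once while the backup drive or disc is attached, then call load_hash_db and find_duplicates at any later time against your real files. The database acts as the "virtual folder": only the hashes and recorded paths are needed, not the media itself.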

In all the years I have looked at duplicate finders I only saw one other one that did this besides DupeMaster. I can't remember what it was called, but its implementation was not very good compared to DM. Of course in all this time, I have also thought of ways that DM's implementation could be improved too, but it has not been developed for maybe close to 10 years, and is not even commercially available anymore. So there will be no more changes.

Back then I was in touch with DM's developer and was able to persuade him to make a few changes that I think improved DM a lot. But other things never got changed because I didn't think of them until later. I think it was well written too: even though I believe it is written in an old, slow language like some version of Visual Basic, it is quite fast considering that. Of course, it would be much faster still in something newer, and speed is a critical factor when you are dealing with lots of files.

DM is a killer legacy program that I still have and use as my go-to duplicate finder.

I wanted to mention this to you because I think your program is fantastic. It's the best in this class I have seen in many years. I would LOVE to see you add the capability I described with hash databases and I hope you think it is an idea worth considering. I think it would send Duplicate Cleaner through the roof!

Please let me know if there is anything I can do to assist or answer any questions regarding this. If you are interested, I could probably find a copy of DupeMaster to send you since I don't think you can find it anywhere anymore. But of course I would have to send you the unregistered version... I think it works as a trial.

Again, if you do think this is something you might be interested in, I have some thoughts on how to implement it in a way that is even better than how DupeMaster did it and I would be glad to share those with you.

I hope you like this suggestion! Thanks for reading it. :)
DigitalVolcano
Site Admin
Posts: 1864
Joined: Thu Jun 09, 2011 10:04 am

Re: Feature Suggestion - Hash Databases

Post by DigitalVolcano »

Thanks for this suggestion. It's an interesting idea, being able to build offline databases for removable media, etc.
I will look at this for version 4.

I had a look for DupeMaster and found a version 1.7 by CH-Soft (in German) - is this the correct program or another with the same name?