Request and question re hash calculating

Shane
Posts: 10
Joined: Mon Nov 02, 2015 2:43 pm

Request and question re hash calculating

Post by Shane »

I would like to request the following re hash calculating:

* Option to have DPC calculate and cache hashes for all files in a given (set of) location(s) in advance. That way, for very large data sets, the hashing time can be scheduled for when I'm asleep so that different comparisons can return results quickly when I'm awake!

* Better optimization for hash calculation. As far as I can tell, DPC hashes one file at a time, even when there are multiple drives to scan and the CPU has ~60% spare cycles.
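To illustrate the two requests above, here is a minimal Python sketch of pre-hashing every file under a set of folders with a small thread pool. It is not how DPC works internally; `hash_file`, `list_files`, `prehash` and the `workers` parameter are names made up for this example, and SHA-256 is an assumed choice of hash.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def list_files(roots):
    """Yield the path of every file found under the given root folders."""
    for root in roots:
        for folder, _subdirs, names in os.walk(root):
            for name in names:
                yield os.path.join(folder, name)


def prehash(roots, workers=4):
    """Hash all files under roots using a thread pool; return {path: digest}."""
    paths = list(list_files(roots))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps results in the same order as paths.
        digests = list(pool.map(hash_file, paths))
    return dict(zip(paths, digests))
```

Something like `prehash(["D:\\Photos", "E:\\Backup"])` (hypothetical folders) could then be run from a scheduled task overnight. Whether extra threads actually help depends on the drives: separate physical disks can be read at the same time, while several threads on one spinning disk may just add seek overhead.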

And my question: will stopping the "Calculating hashes" window before it finishes discard the hashes calculated so far? It has currently processed 195k of 333k files over the past 27.5 hours, with 211k hashes and 252k quick-hashes calculated.
Shane
Posts: 10
Joined: Mon Nov 02, 2015 2:43 pm

Re: Request and question re hash calculating

Post by Shane »

I'd like to add another hashing request: could uniques please show (or be made to show) their hashes in the Unique files window? Only some have hashes shown.
DigitalVolcano
Site Admin
Posts: 1729
Joined: Thu Jun 09, 2011 10:04 am

Re: Request and question re hash calculating

Post by DigitalVolcano »

Thanks for the suggestions.

"Option to have DPC calculate and cache hashes for all files in a given (set of) location(s) in advance. That way, for very large data sets, the hashing time can be scheduled for when I'm asleep so that different comparisons can return results quickly when I'm awake!"
Good idea.

"Better optimization for hash calculation (as far as I can tell, DPC seems to hash one file at a time, even if there are multiple drives to scan and the CPU has ~ 60% spare cycles)"
Not currently implemented, but it is on the 'to-do' list!

"And my question is whether stopping the "Calculating hashes" window before it completes its calculations will result in the loss of any hashes calculated so far"
It should save any calculated hashes in the cache even if the scan is cancelled.

"I'd like to add another hashing request, could uniques please show (or be made to show) their hashes in the Unique files window?"
Currently it only shows available hashes. If a hash is missing, it was most likely never calculated (i.e. the file had a unique size). I guess this could be added as an option, though it would increase hash calculation time. This probably ties in well with your first suggestion (pre-calculate all).
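
As a rough illustration of why files with a unique size end up without a hash (a generic Python sketch of the size-first approach, not DPC's actual code; `hash_size_collisions` is a made-up name and SHA-256 is an assumed hash):

```python
import hashlib
import os
from collections import defaultdict


def hash_size_collisions(paths):
    """Hash only files that share their size with at least one other file."""
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    hashes = {}
    for group in by_size.values():
        if len(group) < 2:
            # A file with a unique size cannot have a duplicate,
            # so no time is spent hashing it.
            continue
        for path in group:
            with open(path, "rb") as handle:
                hashes[path] = hashlib.sha256(handle.read()).hexdigest()
    return hashes
```

Any path missing from the returned dictionary is a unique-size file. Showing a hash for it would mean reading the whole file anyway, which is where the extra calculation time would come from.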
Shane
Posts: 10
Joined: Mon Nov 02, 2015 2:43 pm

Re: Request and question re hash calculating

Post by Shane »

Thank you for making DPC! I just used it, along with drivepool, ffmpeg, 7zip and multipar, to find and recover some family movies and records from bitrot that had crept in over time via a bad drive controller.

And another request for the pile: the ability to tell the cache to re-calculate the hash for particular files/folders (e.g. if they have been recovered, repaired or edited by a tool that preserves the last modified date), instead of having to clear the entire cache and start from scratch when performing future comparisons that may involve those files.
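
To show what I mean, here is a small Python sketch. It assumes a cache keyed on path, size and last modified time, which is only my guess at how such a cache might work and not DPC's real format; `open_cache`, `cached_hash` and `invalidate` are names invented for the example.

```python
import hashlib
import os
import sqlite3


def open_cache(db_path=":memory:"):
    """Open (or create) a hash cache keyed by absolute file path."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hashes ("
        "path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, digest TEXT)"
    )
    return conn


def cached_hash(conn, path):
    """Return the cached digest if size and mtime match, else re-hash."""
    path = os.path.abspath(path)
    info = os.stat(path)
    row = conn.execute(
        "SELECT digest FROM hashes WHERE path = ? AND size = ? AND mtime_ns = ?",
        (path, info.st_size, info.st_mtime_ns),
    ).fetchone()
    if row:
        return row[0]
    with open(path, "rb") as handle:
        digest = hashlib.sha256(handle.read()).hexdigest()
    conn.execute(
        "INSERT OR REPLACE INTO hashes VALUES (?, ?, ?, ?)",
        (path, info.st_size, info.st_mtime_ns, digest),
    )
    conn.commit()  # committed per file, so an interrupted run keeps its work
    return digest


def invalidate(conn, target):
    """Forget cached hashes for one file, or for everything under a folder."""
    target = os.path.abspath(target)
    prefix = os.path.join(target, "")  # folder path with trailing separator
    cursor = conn.execute(
        "DELETE FROM hashes WHERE path = ? OR substr(path, 1, ?) = ?",
        (target, len(prefix), prefix),
    )
    conn.commit()
    return cursor.rowcount  # number of entries removed
```

With a cache like this, a file repaired in place with the same size and timestamp would keep returning its old digest until `invalidate` is called on it or on its folder, after which only those entries are re-hashed on the next comparison.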