Comparison clarification
Posted: Fri May 19, 2017 9:28 pm
Hi,
I was just wondering if anyone knows how Duplicate Cleaner Pro uses hashes.
It seems to calculate a hash (e.g. MD5) for every file and then compare only those hashes. Is this correct?
If so, why are no additional checks made to verify that a match isn't an accidental clash? It could compute a second hash (a salted MD5, or SHA1) and compare that, or even compare byte by byte, and perhaps compare the file sizes as well.
Perhaps, instead of only allowing users to change the method of comparison (byte-by-byte, MD5, SHA1, and so on), the program could let the user choose to add an additional verification step that runs only when files are initially found to be identical (e.g. by MD5).
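To make the suggestion concrete, here is a minimal Python sketch of the layered check: group by size, then by MD5, then confirm each hash match with a full byte comparison. This is only an illustration of the idea, not a description of how Duplicate Cleaner Pro works internally; the function names `md5_of` and `find_verified_duplicates` are made up for this example.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def md5_of(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_verified_duplicates(paths):
    """Return groups of paths whose contents are identical byte for byte."""
    # Stage 1: files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    # Stage 2: hash only files that share their size with another file.
    by_hash = defaultdict(list)
    for size, same_size in by_size.items():
        if len(same_size) > 1:
            for path in same_size:
                by_hash[(size, md5_of(path))].append(path)

    # Stage 3: confirm every hash match with a full content comparison,
    # so an accidental hash clash can never be reported as a duplicate.
    groups = []
    for candidates in by_hash.values():
        remaining = candidates
        while len(remaining) > 1:
            first, rest = remaining[0], remaining[1:]
            same = [first] + [
                p for p in rest if filecmp.cmp(first, p, shallow=False)
            ]
            if len(same) > 1:
                groups.append(same)
            remaining = [p for p in rest if p not in same]
    return groups
```

The byte comparison only runs on files that already match in size and MD5, so the extra cost is limited to the candidates that would be reported as duplicates anyway.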
The reason I ask is that I am currently processing a very large number of files and am concerned that there could be a single clash, which would be rather unfortunate.
/tom