
Hard Linking Duplicate Files

Posted: Tue Sep 28, 2010 2:32 pm
by DV
In the next version (2.0) I'm tempted to remove the facility to create hard links of your duplicate files, as I think it's potentially too damaging in the wrong hands.
Any thoughts? Should it be hidden as an 'advanced' option?

Posted: Tue Sep 28, 2010 10:06 pm
by Fool4UAnyway
The possibility to create Hard Links was a good reason for using Duplicate Cleaner. I would like to keep the option.

Others have argued that Duplicate Cleaner by itself is a lethal tool because of possibly "unwanted" results...

Any medicine is bad, because "who would ever _want_ to use it?"

Posted: Thu Sep 30, 2010 12:30 am
by DellDude
Great program and keep up the good work! Please keep the hard link feature as it is THE reason I use DC. In my mind the only safe use for hard links is archived files - files that should never change and perhaps should even have read-only attributes. In a programming environment where automated nightly builds are done, most of the files are duplicates and hard-linking the duplicates saves a LOT of space.

The danger of hard-linking files is that changing any one of the files in a hard-linked group changes the one and only copy of the file and therefore loses the archived original version of the file.
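To make that danger concrete, here is a minimal Python sketch. The file names are made up for illustration, and it assumes a file system that supports hard links (NTFS and most Unix file systems do). It writes through one name of a hard-linked pair and reads back through both:

```python
import os
import tempfile


def demo_shared_content() -> tuple[str, str]:
    """Edit a file through one hard-linked name and read it via both names."""
    with tempfile.TemporaryDirectory() as folder:
        # Hypothetical names, standing in for an archived file and its "duplicate".
        original = os.path.join(folder, "archive_v1.txt")
        linked = os.path.join(folder, "archive_v1_copy.txt")

        with open(original, "w") as f:
            f.write("original contents")

        # Create a second name for the same underlying file.
        os.link(original, linked)

        # Overwrite in place through the second name.
        with open(linked, "w") as f:
            f.write("edited contents")

        with open(original) as f:
            via_original = f.read()
        with open(linked) as f:
            via_link = f.read()
        return via_original, via_link


print(demo_shared_content())
```

Both reads return the edited text, so the "original" is gone. Note that some editors save by writing a new file and renaming it over the old name; in that case the link is broken instead and the other name keeps the old contents, so behaviour depends on the program doing the edit.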

SUGGESTION: If the original "file modified date/time" could be retained when hard-linking files, this often useful file attribute would not be changed and cause confusion (you would know when each "copy" was last modified). Perhaps options to retain the file date/time and to mark the hard-linked files as read-only would make it better and safer to use.
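As a rough illustration of the read-only idea, the sketch below replaces a duplicate with a hard link and then clears the write permission. This is not how Duplicate Cleaner does it; `link_duplicate` and the `.linktmp` suffix are hypothetical names, and both files are assumed to be on the same volume.

```python
import filecmp
import os
import stat


def link_duplicate(keep: str, duplicate: str, read_only: bool = True) -> bool:
    """Replace `duplicate` with a hard link to `keep` if their contents match.

    Returns True if a link was created, False if nothing was changed.
    """
    if os.path.samefile(keep, duplicate):
        return False  # already two names for the same file
    if not filecmp.cmp(keep, duplicate, shallow=False):
        return False  # contents differ, so leave both files alone

    # Link under a temporary name first, then swap it into place, so the
    # duplicate's name never disappears if something fails halfway.
    temp_name = duplicate + ".linktmp"  # hypothetical temporary name
    os.link(keep, temp_name)
    os.replace(temp_name, duplicate)

    if read_only:
        # Permissions belong to the shared file, so this covers both names.
        os.chmod(keep, stat.S_IREAD | stat.S_IRGRP | stat.S_IROTH)
    return True
```

Calling `link_duplicate("builds/mon/lib.dll", "builds/tue/lib.dll")` (example paths only) would leave both paths in place but backed by one copy of the data. The read-only flag only guards against accidental edits; it does not bring back separate timestamps.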

Posted: Fri Oct 01, 2010 7:38 am
by Emerson
First off, I'm with DellDude. I've tried a bunch of dupe finder tools and even used my own scripts on occasion and THE reason I use DC is because of the option to hard link dupes.

I understand hardlinking can be a confusing concept to someone that doesn't quite understand what it means but frankly, it is in no way "more" dangerous than simply deleting the same files. If you were ok deleting something, then hardlinking it shouldn't cause you any problems.

That said, I have no objection to it being only available under an "advanced" mode or whatever (as long as it *is* available and it isn't cumbersome to get to).

I keep backups from all my computers in a centralized location, and being able to hardlink duplicate music/video/etc files while being able to maintain the directories of the various different computer backups "complete" is ideal.

I think there are very many good use cases where people want to be able to leave things where they are but reclaim otherwise wasted disk space. Being able to hardlink files is a great (and most importantly a differentiating) feature of DC. I think you should definitely keep it!

Posted: Fri Oct 01, 2010 8:06 am
by Emerson
@DellDude, I didn't want my first post to get way TLDR so I'm posting this separately :P ... but as far as the timestamp issue goes, hardlinks to a file can't have their own timestamp because of the way hardlinks work under the hood.

The way files work on most file systems is this:
- a file has a name (e.g. "example.txt")
- that name points to an "inode"

An inode is the thing that points to the actual location of the data on your drive as well as most metadata related to that data (such as file size, timestamps, etc).

So taking into account that the files you see on your drive are all filename:inode pairs, when two files are "hardlinked" it just means they point to the same inode (and hence *must* have the same timestamp).
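A small Python sketch of the same idea, using made-up file names and assuming a file system with hard link support: after linking, both names report the same underlying file, and setting the modified time through one name changes what the other name reports.

```python
import os
import tempfile


def demo_shared_inode() -> dict:
    """Show that two hard-linked names share one set of metadata."""
    with tempfile.TemporaryDirectory() as folder:
        first = os.path.join(folder, "example.txt")
        second = os.path.join(folder, "example_link.txt")

        with open(first, "w") as f:
            f.write("data")
        os.link(first, second)  # second name, same inode

        # Set access/modified time via the SECOND name only
        # (1_000_000_000 seconds after the epoch, an arbitrary value).
        os.utime(second, (1_000_000_000, 1_000_000_000))

        info = os.stat(first)  # look at the FIRST name
        return {
            "same_file": os.path.samefile(first, second),
            "link_count": info.st_nlink,
            "mtime_seen_via_first": int(info.st_mtime),
        }


print(demo_shared_inode())
```

The timestamp set through `example_link.txt` shows up on `example.txt` as well, because there is only one place where it is stored.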

If you want to keep track of the original timestamps of files for archival or whatever reason, you can do what I do: before I hardlink files, I export the duplicate list as a csv (which you can do from the file menu) so I can refer to that info if I need it later.
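For anyone scripting this instead of using the csv export, a minimal Python sketch of the same precaution might look like the following. `save_timestamps` and the column names are hypothetical and are not the format Duplicate Cleaner exports.

```python
import csv
import os
from datetime import datetime, timezone


def save_timestamps(paths, csv_path):
    """Write each file's path and last-modified time (UTC) to a CSV file."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "modified_utc"])  # hypothetical column names
        for path in paths:
            mtime = os.stat(path).st_mtime
            stamp = datetime.fromtimestamp(mtime, tz=timezone.utc).isoformat()
            writer.writerow([path, stamp])
```

Run it on the list of duplicates before linking them; afterwards the csv is the only record of the per-copy modified times.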

Hope that clears that up :)

Posted: Fri Oct 01, 2010 1:48 pm
by DV
Hi all
Thanks for your responses. I'll keep hardlinking in, glad you all find it useful. I'll be redesigning the old 'delete' screen, and may put a health warning in ;)

Posted: Sun Oct 03, 2010 8:02 pm
by test
test