Not declaring these files as the same

drysg
Posts: 6
Joined: Wed Sep 04, 2013 6:03 pm

Post by drysg »

I am doing a regular search. Exact match of content. Exact file name. These JPG files are not being declared as the same. I think they should be.

I really only want to match on exact file name, but if I do that I lose the hard link capability.

Enclosed are the files. In the original, they are in three folders that are siblings of each other (same tree level) and the file names are IDENTICAL.
DigitalVolcano
Site Admin
Posts: 1729
Joined: Thu Jun 09, 2011 10:04 am

Re: Not declaring these files as the same

Post by DigitalVolcano »

They aren't identical files - they are actually different sizes. The EXIF data is blank, so the cause could be something like a slight variation in compression or a tiny loss of detail (if a JPEG is loaded and saved again, it loses quality).
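
For anyone who wants to check this kind of thing outside the program, here is a minimal Python sketch of an exact-content comparison. It is only an illustration and not the De-Duplicator's actual code; the function names `file_digest` and `same_content` are made up for this example. It compares sizes first (different sizes can never be identical content) and only hashes when the sizes match.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_content(a, b):
    """True only if both files have identical bytes (hypothetical helper)."""
    a, b = Path(a), Path(b)
    # Cheap check first: files of different size cannot be identical.
    if a.stat().st_size != b.stat().st_size:
        return False
    return file_digest(a) == file_digest(b)
```

Two JPEGs that look the same on screen but were saved separately will usually fail the size check straight away, which matches what happened with the attached files.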
drysg
Posts: 6
Joined: Wed Sep 04, 2013 6:03 pm

Re: Not declaring these files as the same

Post by drysg »

Is there any way to get the De-Duplicator to ignore those differences and declare them the same? I want to use the HardLink feature, but it does not take Image Mode matching into account. Something like a 99% match in Image Mode would work, would it not? But again, I don't have that option.

Actually just same file name would also work for us.

If we don't have this, then over 80% of the functional duplicates will not be found.
DigitalVolcano
Site Admin
Posts: 1729
Joined: Thu Jun 09, 2011 10:04 am

Re: Not declaring these files as the same

Post by DigitalVolcano »

I will get a change made in the next version (3.2.2) to remove the restriction on hard-linking where the files aren't identical. That should solve your problems as you'll be able to use Image Mode to do what you need.
drysg
Posts: 6
Joined: Wed Sep 04, 2013 6:03 pm

Re: Not declaring these files as the same

Post by drysg »

Excellent.

Would you mind posting a notice here when it is released?
DigitalVolcano
Site Admin
Posts: 1729
Joined: Thu Jun 09, 2011 10:04 am

Re: Not declaring these files as the same

Post by DigitalVolcano »

Sure. You can also subscribe to email alerts from the "Releases and updates news" forum - new releases are always posted there.
keiths
Posts: 3
Joined: Sun Oct 13, 2013 2:16 pm

Re: Not declaring these files as the same

Post by keiths »

Can I add my vote for this - I'm trying to dedupe a music collection (original albums vs compilation discs), which means that I have several similar but not identical versions of the same tracks. I'd like to keep one but maintain the album track list integrity by using hard links. The compilation disc directory could comprise mainly hard links to the original album - or the original album directory could link to the better quality remastered versions.
Thanks, Keith
DigitalVolcano
Site Admin
Posts: 1729
Joined: Thu Jun 09, 2011 10:04 am

Re: Not declaring these files as the same

Post by DigitalVolcano »

Version 3.2.2 is now available, with the hard linking restrictions removed:
viewtopic.php?f=13&t=1273
keiths
Posts: 3
Joined: Sun Oct 13, 2013 2:16 pm

Re: Not declaring these files as the same

Post by keiths »

Thanks for the hard link change - I'm trying it at the moment, and it results in a quandary. How is the target of the hard link deduced? I have 4 tracks that are flagged as close versions - same name and artist. The problem is that 2 are live versions and 2 are studio. If I mark one and ask for a hard link, how does it work out which of the 3 remaining to link it to? If I marked 3, then I would assume that the 3 links would point to the remaining one. I haven't tried suck it and see yet, which is the obvious thing to do - uncharacteristically I'm thinking first! I'd like to delete 2 and have the hard links set up accordingly for the 2 versions.

btw, registering for the new version thread didn't result in an alert for the new version.
Thanks, Keith
DigitalVolcano
Site Admin
Posts: 1729
Joined: Thu Jun 09, 2011 10:04 am

Re: Not declaring these files as the same

Post by DigitalVolcano »

The hard link is generally created against the first file it finds in that group. If you want to steer it towards linking to a particular version, the safest thing to do is to drop the one you don't want linking from the group first (right click, drop selected from list). Or you can rename the track titles and rescan so they appear in separate groups.

This kind of issue is why the hard linking was originally locked to exact duplicates only - you've got to be careful!
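
To make the "first file in the group" behaviour concrete, here is a rough Python sketch of the general idea. It is an assumption about how such a step could work, not the program's real implementation; `link_group_to_first` and its `keep` parameter are hypothetical names. Every file in the group except the chosen target is replaced by a hard link to that target, so any file that is not a true duplicate loses its own content.

```python
import os
from pathlib import Path


def link_group_to_first(group, keep=None):
    """Replace each file in `group` (except the target) with a hard link.

    The target defaults to the first path in the group; pass `keep`
    to steer the links towards a particular version instead.
    """
    paths = [Path(p) for p in group]
    target = Path(keep) if keep is not None else paths[0]
    linked = []
    for p in paths:
        if p == target:
            continue
        # Create the link under a temporary name, then swap it in,
        # so the original is only lost once the link already exists.
        tmp = p.with_name(p.name + ".linktmp")
        os.link(target, tmp)
        os.replace(tmp, p)
        linked.append(p)
    return linked
```

In the live/studio case above, passing all four tracks as one group would leave four names pointing at a single recording. Calling it once per pair (two studio files, then two live files) gives one surviving copy of each version. Note that hard links only work between files on the same volume.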