Cache questions

The best solution for finding and removing duplicate files.
Taz
Posts: 2
Joined: Sun Aug 28, 2016 1:41 pm

Cache questions

Post by Taz »

Nice tool, lots of features just the way I want.

With several TB to process against several targets I don't want to reread files unnecessarily so...

What actions invoke a reread?

I'm doing a lot of reorganization, so I presume any change to a path means the contents are treated as new / unknown and therefore reread?

Is there any way to edit the cache database to avoid this? Kind of a shame to spend days reprocessing 10TB just because I renamed the root folder :-/

What about removable disks? If a USB drive gets a different drive letter next time how is that treated?

Would a second USB drive with similar contents be mistaken for the first if it appears on the drive letter previously used by the first?

How is the hash cache pruned? If a network or USB drive is missing will those records get culled?

I have a bunch of other examples, but maybe a brief outline of how the cache is built & culled would be a universal answer.
DigitalVolcano
Site Admin
Posts: 1750
Joined: Thu Jun 09, 2011 10:04 am

Re: Cache questions

Post by DigitalVolcano »

A file hash is cached by full path (including the UNC path or drive letter), date modified AND date created.

The cache isn't pruned automatically, but can be cleared or turned off via the Options window. The cache file isn't currently user editable.
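
As a rough illustration of what a key like that implies (a sketch only, not Duplicate Cleaner's actual code; `HashCache`, `get_hash` and `demo` are made-up names, and the hash algorithm is just a stand-in), a cache keyed on path plus both timestamps misses as soon as any of the three changes. That would explain why a renamed root folder or a new drive letter leads to a reread:

```python
import hashlib
import os
import tempfile


class HashCache:
    """Toy in-memory cache keyed by (full path, modified time, created time)."""

    def __init__(self):
        self._entries = {}
        self.rereads = 0  # how many times file contents were actually read

    def _key(self, path):
        st = os.stat(path)
        # st_ctime is the creation time on Windows; on Unix it is the
        # inode change time, so treat this as an approximation there.
        return (os.path.abspath(path), st.st_mtime_ns, st.st_ctime_ns)

    def get_hash(self, path):
        key = self._key(path)
        if key not in self._entries:
            with open(path, "rb") as fh:
                self._entries[key] = hashlib.sha256(fh.read()).hexdigest()
            self.rereads += 1
        return self._entries[key]


def demo():
    """Hash a file twice, rename it, hash again; return the reread count."""
    cache = HashCache()
    with tempfile.TemporaryDirectory() as root:
        old = os.path.join(root, "a.bin")
        with open(old, "wb") as fh:
            fh.write(b"same bytes")
        cache.get_hash(old)   # first scan: read from disk
        cache.get_hash(old)   # second scan: served from the cache
        new = os.path.join(root, "renamed.bin")
        os.rename(old, new)
        cache.get_hash(new)   # path changed: the key misses, so reread
    return cache.rereads
```

Here `demo()` counts two reads for three lookups: the unchanged file is served from the cache, while the renamed one is hashed again even though its bytes are identical.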
GPC
Posts: 1
Joined: Thu Apr 26, 2018 12:40 pm

Re: Cache questions

Post by GPC »

I use a combination of removable drives and virtual drives. Windows can assign a different drive letter each time I hook one up depending on what else is attached at the time.

Can I suggest that the full path map refer to a "unique identifier" rather than the UNC or drive letter? Volume Serial Number/Volume Size would work for removable drives. It may or may not work for a network drive, but I'm sure something could be found to work in all cases.

The "unique identifier" could then be mapped to the actual UNC or drive letter when required.

This would allow proper use of the hash cache on removable/network drives.
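
A minimal sketch of the idea, assuming the volume identifier for each connected drive is already known (the identifier values, the `mounted` table and both helper names are hypothetical, and nothing here reflects how Duplicate Cleaner stores its cache):

```python
from pathlib import PureWindowsPath


def to_cache_key(path, mounted):
    """Turn r'E:\\Photos\\a.jpg' into (volume_id, r'Photos\\a.jpg').

    `mounted` maps the current drive letter or UNC share (upper case)
    to a volume identifier.
    """
    p = PureWindowsPath(path)
    volume_id = mounted[p.drive.upper()]
    return (volume_id, str(p.relative_to(p.anchor)))


def to_current_path(key, mounted):
    """Map a stored key back to wherever that volume is mounted right now."""
    volume_id, relative = key
    for drive, vid in mounted.items():
        if vid == volume_id:
            return str(PureWindowsPath(drive + "\\") / relative)
    return None  # volume is not connected at the moment


# First session: the USB drive shows up as E:
key = to_cache_key(r"E:\Photos\a.jpg", {"E:": "1A2B-3C4D"})

# Later session: the same drive shows up as G:, and the key still resolves
print(to_current_path(key, {"G:": "1A2B-3C4D"}))
```

Because the key no longer contains the drive letter, a different drive that happens to land on E: would not match either, provided its identifier differs. A serial number alone may not be unique, which is why pairing it with the volume size seems safer.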

Glenn
DigitalVolcano
Site Admin
Posts: 1750
Joined: Thu Jun 09, 2011 10:04 am

Re: Cache questions

Post by DigitalVolcano »

Thanks for the suggestion. I've put a note in to get this looked at for the next update.
ElderP
Posts: 17
Joined: Wed Aug 22, 2018 9:01 pm

Re: Cache questions

Post by ElderP »

I do a lot of "scanning" using different Profiles, and have found that clearing the cache deletes it for all profiles. Would it be possible to save these cache files by Profile?

Also, does turning the hash cache off turn it off for all profiles? Could this Advanced Option be profile dependent?

Another suggestion is adding an option to "Clear Cache When Program Exits" (again, profile dependent).

Thanks,

Steve
therube
Posts: 615
Joined: Tue Jun 28, 2011 4:38 pm

Re: Cache questions

Post by therube »

Is the Cache in DuplicateCleaner4_Pro.data or elsewhere?

If the former, you might be able to do something - in a round-about way.
Say, with multiple "installs" (an actual install isn't really needed), each with database.ini pointing to its own area for a separate "Profile".
(I haven't actually tried this, but would think it would work - at least in concept.)


(Though the concept of separate Profiles, including Cache, would be better.)
DigitalVolcano
Site Admin
Posts: 1750
Joined: Thu Jun 09, 2011 10:04 am

Re: Cache questions

Post by DigitalVolcano »

The files are all cached in the database - DuplicateCleaner4_Pro.data

In theory you could have multiple database caches (along with all settings and any current scan) and point the program to the one you want to use by editing the ini file.
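
A sketch of how that switch might be scripted, with a big caveat: the section name `Database` and key name `Path` below are placeholders, not the real layout of the ini file, so check the actual file before adapting any of this. The helper names are also made up, and the demo only touches a scratch folder:

```python
import configparser
import os
import tempfile

# Placeholder names: the real ini layout is not documented in this thread.
SECTION = "Database"
KEY = "Path"


def point_ini_at(ini_path, database_path):
    """Rewrite (or create) the ini so KEY in SECTION holds database_path."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.optionxform = str  # preserve key capitalisation
    cfg.read(ini_path, encoding="utf-8")  # skipped if the file is missing
    if not cfg.has_section(SECTION):
        cfg.add_section(SECTION)
    cfg.set(SECTION, KEY, database_path)
    with open(ini_path, "w", encoding="utf-8") as fh:
        cfg.write(fh)


def current_database(ini_path):
    """Return the database path the ini currently points at."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.optionxform = str
    cfg.read(ini_path, encoding="utf-8")
    return cfg.get(SECTION, KEY)


def demo():
    """Switch between two per-profile databases inside a scratch folder."""
    with tempfile.TemporaryDirectory() as tmp:
        ini = os.path.join(tmp, "database.ini")
        photos = os.path.join(tmp, "photos", "DuplicateCleaner4_Pro.data")
        music = os.path.join(tmp, "music", "DuplicateCleaner4_Pro.data")
        point_ini_at(ini, photos)
        first = current_database(ini)
        point_ini_at(ini, music)
        second = current_database(ini)
        return first == photos, second == music
```

Any switch like this would presumably need to happen while the program is closed, since each database carries its own settings and current scan along with the cache.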

Being able to save/load complete databases from within the program may be something to look at for version 5.
wwcanoer
Posts: 51
Joined: Wed Aug 19, 2020 5:49 am

Re: Cache questions

Post by wwcanoer »

I still don't see a way to update DC when an HDD drive letter changes. :( So.. rescan.

When "not connected" is identified, then there should be a right-click command to pick the location of the drive. DC can then check that the volume name matches.

Ideally, DC would be able to identify a drive by its volume number by itself, but manual is fine because I once bought two of the same external drives and they had the same volume number.

I had purchased a used desktop with lots of HDD bays and USB ports so that everything could be connected at once but now I'm laptop only and don't have enough adapters to connect everything at once. So, I need to re-examine how to use DC effectively.

I'd still like to be able to store a full MD5 scan of a drive, use it for comparison, identify items to delete, and then when I reconnect the drive I would delete from that drive. Imagine that I had only one port to connect a drive. Allow me to scan each drive, one at a time, run the compare on those saves, select files to delete, and then connect each drive one by one to delete the files.
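
To make the workflow concrete, here is a minimal sketch in Python of what I mean, outside of DC (the function names, drive labels and JSON layout are all my own invention, not a DC feature): hash one drive at a time into a saved manifest, then compare the saved manifests with no drive connected.

```python
import hashlib
import json
import os
from collections import defaultdict


def build_manifest(root):
    """MD5 every file under `root`; keys are paths relative to `root`."""
    manifest = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            digest = hashlib.md5()
            with open(full, "rb") as fh:
                # Read in 1 MiB chunks so large files don't fill memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(full, root)] = digest.hexdigest()
    return manifest


def save_manifest(manifest, out_file):
    with open(out_file, "w", encoding="utf-8") as fh:
        json.dump(manifest, fh, indent=2)


def load_manifest(in_file):
    with open(in_file, encoding="utf-8") as fh:
        return json.load(fh)


def find_duplicates(manifests):
    """`manifests` maps a drive label to its manifest.

    Returns groups of (label, relative path) pairs that share a hash.
    """
    by_hash = defaultdict(list)
    for label, manifest in manifests.items():
        for rel_path, digest in manifest.items():
            by_hash[digest].append((label, rel_path))
    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Because the paths are stored relative to the scan root, the result is a list of (drive label, path) pairs that can be acted on whenever each drive is next plugged in, whatever letter it gets.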
saboy
Posts: 2
Joined: Fri May 10, 2024 6:40 pm

Re: Cache questions

Post by saboy »

I have this exact same issue and requirements...
DigitalVolcano
Site Admin
Posts: 1750
Joined: Thu Jun 09, 2011 10:04 am

Re: Cache questions

Post by DigitalVolcano »

Does using the virtual drives/folders feature help? This allows you to make an image of a drive for offline comparison.
https://www.digitalvolcano.co.uk/duplic ... =&sct=MzQ2