Very slow reading metadata

aeciolemos
Posts: 8
Joined: Thu Nov 24, 2022 7:53 am

Very slow reading metadata

Post by aeciolemos »

Hi everyone,
I don't know if this is an issue or just something my system is doing. I have scanned about 163k files. Most of them were previously indexed, so that took only about 3h. However, the next phase, Reading Metadata, is very slow, taking some 3-10 sec to read each file. Is that normal? I expected it to be fast, much like the Discovering Files phase, since most of the files were indexed previously.

Thanks
DigitalVolcano
Site Admin
Posts: 1863
Joined: Thu Jun 09, 2011 10:04 am

Re: Very slow reading metadata

Post by DigitalVolcano »

It should be the same. Previously, reading metadata was part of the Discovering Files phase, so that phase took a lot longer without you knowing how long to expect it to take. There is currently no caching for the metadata (apart from any done by the hard drive/OS).
aeciolemos
Posts: 8
Joined: Thu Nov 24, 2022 7:53 am

Re: Very slow reading metadata

Post by aeciolemos »

DigitalVolcano wrote: Thu Dec 15, 2022 3:18 pm It should be the same. It's just before it was part of the discovering files phase, so that took a lot longer without you knowing how long to expect it to take. There is currently no caching for the metadata (apart from any done by the hard drive/os).
Hi,

Thank you for your reply. Understood. Is there any plan to cache metadata? I am not sure if what I am saying is viable or even if it would be useful, but I assume if you store the hash of the file along with its metadata, this would make the process much faster. I just am not sure if that would be a good solution.
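As a rough illustration of the kind of metadata cache being asked about, here is a minimal sketch. It is not how the program works (the reply above says metadata is not cached at present). It also swaps the file hash for size plus modified time as the validity check, since computing a hash means reading the whole file, which is the slow part on a remote drive. The names `open_cache`, `get_metadata`, the `meta_cache` table and the `read_metadata` callback are all hypothetical.

```python
import os
import sqlite3


def open_cache(db_path=":memory:"):
    """Open (or create) a small SQLite cache. Hypothetical schema."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS meta_cache ("
        "path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, metadata TEXT)"
    )
    return conn


def get_metadata(conn, path, read_metadata):
    """Return cached metadata if size and modified time still match.

    read_metadata is a caller-supplied function (an assumption here) that
    does the slow read and returns the metadata as a string.
    """
    st = os.stat(path)  # cheap compared with opening and parsing the file
    row = conn.execute(
        "SELECT size, mtime_ns, metadata FROM meta_cache WHERE path = ?",
        (path,),
    ).fetchone()
    if row is not None and row[0] == st.st_size and row[1] == st.st_mtime_ns:
        return row[2]  # cache hit: no slow read needed

    # Cache miss or stale entry: do the slow read and store the result.
    metadata = read_metadata(path)
    conn.execute(
        "INSERT OR REPLACE INTO meta_cache VALUES (?, ?, ?, ?)",
        (path, st.st_size, st.st_mtime_ns, metadata),
    )
    conn.commit()
    return metadata
```

With a cache like this, a second scan would only need one `os.stat` call per unchanged file. Whether that is cheap enough on a cloud-backed mapped drive is an open question.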

I also see it uses very little system resources. Is there any way to tell it to use more resources?

Cheers
aeciolemos
Posts: 8
Joined: Thu Nov 24, 2022 7:53 am

Re: Very slow reading metadata

Post by aeciolemos »

I noticed that even the hashes are not really cached. Note here that it is calculating hashes, since it only found 3417 in the cache. I had scanned all these files before, so I would have expected the hashes at least to be there. At least 162k of these files have not changed in the week since my last scan. I would expect at most 1000 changes, and even that is pushing it.

I hope there is a way to store/cache metadata and hashes. That would greatly speed up the process.

Cheers

Image

In case the image does not show up, here is the link
https://drive.google.com/file/d/1UilBT1 ... share_link
DigitalVolcano
Site Admin
Posts: 1863
Joined: Thu Jun 09, 2011 10:04 am

Re: Very slow reading metadata

Post by DigitalVolcano »

It should be storing/retrieving caches.
Possible issues could be:
- Your mapped drive letter has changed
- Another piece of software has updated the files (e.g. a photo gallery program adding a piece of metadata). It re-scans if the date modified has changed.
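To show why either of those two conditions would produce a cache miss, here is a small sketch. It assumes a cache keyed on the full path (drive letter included) plus the modified time. The actual key the program uses is not stated in this thread, so `cache_key`, `lookup` and the sample values are hypothetical.

```python
import ntpath  # Windows path rules, usable on any OS


def cache_key(path, mtime_ns):
    """Hypothetical key: case-normalised full path plus modified time."""
    return (ntpath.normcase(path), mtime_ns)


# One previously hashed file, stored under its old path and timestamp.
cache = {cache_key(r"Z:\photos\a.jpg", 1000): "abc123"}


def lookup(path, mtime_ns):
    """Return the cached hash, or None if the file must be re-hashed."""
    return cache.get(cache_key(path, mtime_ns))


print(lookup(r"Z:\photos\a.jpg", 1000))  # same path, same time: hit
print(lookup(r"Y:\photos\a.jpg", 1000))  # drive letter changed: miss
print(lookup(r"Z:\photos\a.jpg", 2000))  # date modified changed: miss
```

Under this assumption the file contents play no part in the lookup, so a file that is byte-for-byte identical is still re-hashed if its path or timestamp differs.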
aeciolemos
Posts: 8
Joined: Thu Nov 24, 2022 7:53 am

Re: Very slow reading metadata

Post by aeciolemos »

DigitalVolcano wrote: Fri Dec 16, 2022 1:30 pm It should be storing/retrieving caches.
Possible issues could be-
-Your mapped drive letter has changed
-Another piece of software has updated the files (e.g. a photo gallery program adding a piece of metadata). It re-scans if the date modified has changed.
Hi, yes, I considered that. The mapped drive letter is the same; however, I did recreate the mount. Would that be enough to cause this? Is there perhaps some GUID that is not visible to the user but is visible to the program or filesystem?

The images have not been changed in any way, I am sure of that, though.

I will wait for it to finish and then I'll run the scan again to see if it finds all the hashes.

Would it be something feasible to store/cache all the metadata?

[EDIT] My computer crashed and I lost 2 days of processing. The scan had to start over because metadata is not cached. It takes about 36 hours to read metadata because most of the files are on a mapped drive, which is in the cloud, and that is a speed limitation on its own. However, if metadata were cached, I am guessing this would be reduced to less than 1h. Again, I am not sure this is even desirable, because metadata needs to be checked for file changes, but maybe it could be optional for when one knows that the files have not changed?
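For a sense of scale, the figures quoted in this thread (about 163k files, about 36 hours now, a guessed 1 hour with caching) can be turned into per-file numbers. This is only arithmetic on those estimates, not a measurement, and `seconds_per_file` is just a helper name for the example.

```python
def seconds_per_file(total_hours, file_count):
    """Average time spent per file for a phase of the given length."""
    return total_hours * 3600 / file_count


FILES = 163_000  # approximate file count from the first post

current = seconds_per_file(36, FILES)  # metadata phase as reported
target = seconds_per_file(1, FILES)    # the guessed time with a cache

print(f"current: {current:.2f} s/file")
print(f"target:  {target * 1000:.0f} ms/file")
print(f"speed-up needed: {current / target:.0f}x")
```

That works out to roughly 0.8 s per file on average now, against a budget of about 22 ms per file to finish in an hour, so each cache check would have to be much cheaper than a full metadata read over the mapped drive.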

Cheers