I thought perhaps someone might find this information useful one day ~~
After having excellent success with my first attempt at deduplication while merging two older drives onto a new, larger drive, I moved on to the next effort: a 1.5TB backup drive that was nearly full and known to be loaded with duplicates. Since I have a fairly powerful computer to work with, I decided to see what would happen if I set the maximum number of duplicates to 1,000,000.
After a few hours, the results were in -- 1,000,000 duplicate files were identified. I have spent my spare time over the past month going through the list and have identified just over 806,000 confirmed duplicates with an aggregate file size of 534GB that would be freed up!
Being an experienced data hoarder and routine backer-up of databases, I had periodically saved the Duplicate Files as well as the Marked Duplicates to CSV format for redundancy. I was pleased to discover that each time I had to close the program and return at a later time, the DuplicateCleanerPro.data file loaded without loss (albeit slowly, given its >2GB size).
I was nearing completion of the 1-million-file review when I clicked "Refresh" in the right-click context menu. For no reported or apparent reason, DC stopped responding. After waiting a few minutes, I used the Windows Task Manager to stop DC.
When the program reopened, it loaded a much older .data file than the one that should have been created when I last gracefully closed DC. In fact, it was from a scan of a different drive altogether.
Fortunately, I have the Marked Duplicates CSV file from less than 2 minutes before I refreshed. Unfortunately, I don't have a reasonably current Duplicate Files CSV file, and I don't feel like writing a little executable or macro to merge the two and put me back at the point I was at when the lock-up occurred.
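For anyone who does want to attempt that merge, a minimal sketch in Python follows. The column name "Path" and the marked-flag values are assumptions -- Duplicate Cleaner's actual CSV headers may differ, so adjust them to match a real export before relying on this.

```python
import csv

def merge_duplicate_csvs(duplicates_path, marked_path, out_path):
    """Re-apply saved 'marked' flags to a full duplicate-files export.

    Assumes both CSVs have a 'Path' column identifying each file;
    adjust to whatever headers Duplicate Cleaner actually writes.
    """
    # Collect the full path of every file that had been marked for deletion.
    with open(marked_path, newline="", encoding="utf-8") as f:
        marked = {row["Path"] for row in csv.DictReader(f)}

    # Copy the full duplicates list, adding a Marked column that
    # restores the reviewed state from the Marked Duplicates export.
    with open(duplicates_path, newline="", encoding="utf-8") as f_in, \
         open(out_path, "w", newline="", encoding="utf-8") as f_out:
        reader = csv.DictReader(f_in)
        writer = csv.DictWriter(f_out, fieldnames=reader.fieldnames + ["Marked"])
        writer.writeheader()
        for row in reader:
            row["Marked"] = "yes" if row["Path"] in marked else "no"
            writer.writerow(row)
```

Since the set lookup is O(1) per row, this should chew through even an 806,000-row export in seconds; the slow part would still be re-marking the files inside DC itself.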
Instead, I'm off to delete 806,000+ duplicate files so I can start the process again with the same 1.5TB drive. I believe I'll reduce the file count limitation back down to the default 500,000, however, so I can once again feel a sense of task completion!
1 million was not too high a limit for DC Pro!
Posts: 8
Joined: Wed Sep 02, 2015 5:49 pm
Re: 1 million was not too high a limit for DC Pro!
"I'll reduce the file count limitation back down to the default 500,000"

Hadn't even realized that option existed. Had to open the program & look.
Re: 1 million was not too high a limit for DC Pro!
therube wrote:
"Hadn't even realized that option existed. Had to open the program & look."

I'm glad it is available, considering I am working on drives with files dating back to the 1980s. <wink>
Second pass on the same drive gave me 492,544 identified potential duplicates (same file name and content) taking up 582GB of the 971GB of used space. DC Pro will have freed up almost 2/3 of this drive when I'm done!