First, thanks for the program. I've tried more than a dozen similar programs (I think I researched everything available quite exhaustively) and ended up with this one.
My problem is the following. I have a big backup storage (1TB, 1.3 million files) which contains a lot of identical files.
Duplicate Cleaner seems suitable for such a large task (the scan takes around 16 hours), but it seems to run out of memory while populating the list on my 2GB machine (Duplicate Cleaner's memory allocation at the moment of the crash is 1.9GB). It stops with the message "Error in GO! process... This item's control has been deleted". The window displayed at this moment is "Populating List", with the progress bar at roughly 15% and the text "217609 duplicates found". The version is 1.4.3.
I wonder if it is possible to make Duplicate Cleaner more memory efficient? One idea is to make it automatically save the duplicate list before populating it, so the list can later be loaded or processed manually. Or, in my case, instead of populating a list, it could save a batch file that would replace all duplicates with hardlinks.
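The hardlink idea could be sketched roughly like this. This is a hypothetical post-processing script, not an existing Duplicate Cleaner feature; the "one group per line, paths separated by '|'" list format is an assumption, and all paths must live on the same volume (NTFS supports hard links; on Windows a generated .bat file would use mklink /H instead).

```python
# Hypothetical sketch (not a Duplicate Cleaner feature): read a saved
# duplicate list -- one group per line, file paths separated by '|' --
# keep the first file in each group, and replace the rest with hard
# links to it. Requires all paths to be on the same volume.
import os

def relink_duplicates(listfile):
    replaced = 0
    with open(listfile, encoding="utf-8") as fh:
        for line in fh:
            paths = [p.strip() for p in line.split("|") if p.strip()]
            if len(paths) < 2:
                continue  # not a duplicate group
            original, dupes = paths[0], paths[1:]
            for dupe in dupes:
                os.remove(dupe)          # drop the duplicate copy
                os.link(original, dupe)  # recreate it as a hard link
                replaced += 1
    return replaced
```

Run after a scan, this would reclaim the space taken by every copy beyond the first while keeping all the old paths valid.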
Again thanks for Duplicate Cleaner.
PS: I'll be happy to report on success if any improvement has been made.
More memory efficient?
Good points - I've considered dropping the 'All files' list, as it must waste a lot of memory. (Does anyone use it?) I keep meaning to run some tests on the lists to see at what point they break. Backing up the list to a text file first is a good idea - I might look into it.
All these issues will be cleaned up when I port DC over to .NET for version 2.0 (it currently uses the creaking VB6 framework).
I would also like to report that I have received this same error, with more than 200,000 duplicates found.
Duplicate Cleaner 1.4.3 was running on 4 GB (3.25 GB available under Windows XP), and I checked the amount of RAM used when the error came up.
If I remember correctly, it was around the 2,030,000 KB mark, which, as Eugene quoted, is around 1.9 GB.
I don't use the "All files" list; the function is useless to me and does not tie in with the program's other features.
I am only interested in duplicates found. There are many other dedicated programs that can solely list files along with their file data at a much faster rate.
The workaround for this is to stop the scan before it reaches the error point. Stopping the scan still allows you to work with the duplicate sets that have been found so far. I made an AutoIt script to automate this, as well as to close an "Internet Connection Error" message box that blocks everything until it is dismissed.
Even with these limitations, Duplicate Cleaner is by far the best out there. The AutoIt script is included below. It stops the scan once $WHEN_TO_STOP duplicate groups have been found, so that is the only part you need to modify.
<code>
#Include <WinAPI.au3>

Const $ADDR = 0x004C2038    ; address of the "duplicate groups found" counter
;Const $ADDR = 0x004C2180   ; number of files scanned?
Const $WHEN_TO_STOP = 50000 ; stop the scan after this many duplicate groups

$pid = WinGetProcess("[CLASS:ThunderRT6FormDC]")
$dostop = False
If $WHEN_TO_STOP Then
    $dostop = True
    $s_whentostop = DllStructCreate("DWORD")
    $hProcess = _WinAPI_OpenProcess(0x10, 0, $pid, True) ; 0x10 = PROCESS_VM_READ
    If @error Then Exit @error
EndIf

Do
    Sleep(1000)
    If $dostop Then
        ; read the counter out of the DC process and stop once it hits the limit
        $iRead = 0
        If Not _WinAPI_ReadProcessMemory($hProcess, $ADDR, DllStructGetPtr($s_whentostop), 4, $iRead) Or $iRead <> 4 Then Exit 1
        If DllStructGetData($s_whentostop, 1) >= $WHEN_TO_STOP Then
            _WinAPI_CloseHandle($hProcess)
            CancelScan()
            $dostop = False
        EndIf
    EndIf
    ; dismiss the message box that blocks the scan until it is closed
    If WinExists("[TITLE: Internet Connection Error]") Then WinClose("[TITLE: Internet Connection Error]")
Until Not ProcessExists($pid)
Exit 0

; Click the Stop button on the main form, then confirm the prompt
Func CancelScan()
    WinActivate("[CLASS:ThunderRT6FormDC]")
    Sleep(2000)
    WinWaitActive("[CLASS:ThunderRT6FormDC]")
    ControlClick("[CLASS:ThunderRT6FormDC]", "", "[CLASS:ThunderRT6CommandButton; INSTANCE:8]")
    Sleep(2000)
    WinWaitActive("[TITLE:Duplicate Cleaner]")
    ControlClick("[TITLE:Duplicate Cleaner]", "", "[CLASS:Button; INSTANCE:1]")
EndFunc
</code>
Interesting script you have there, Patrick, though it's hard to estimate how many duplicates will turn up before the error point is reached.
I got a crash message at 55.5% of 2.02 million files, with 142,941 duplicates found at that point.
RAM used for the program was 1,922,356 KB at crash point.
I guess it was a good 20-hour stress test.