Check duplicates using existing MD5 files

The best solution for finding and removing duplicate files.

Post by o.sandoval »

I'm not risking it yet because I don't know how to use it.

Post by Burt »

Does Textcrawler have a batch function to save the extracted text (the hashcode) to new files?

Maybe ExamDiff could do the job, but I am only interested in the results for files which are duplicated. If I have to scroll through thousands of lines to look for a duplicate, it is too much work.

Post by Fool4UAnyway »

I suggest you give ExamDiff Pro a try. There is a 30 day free trial period.

You can configure it to do a line by line comparison, so it will not try to find best matches if differences in MD5's occur.

It will find differences for you. You can easily navigate through them, or even show only all the differences.

You would just have to set the "Ignore part of lines matching regular expression" option to " .*$", or check the Column option and enter "33-" (both without the quotation marks).
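To illustrate what those two settings do to a line, here is a minimal Python sketch. It assumes each line of an MD5 file is a 32-character hex hash followed by spaces and a filename (md5sum-style); that layout, the sample lines and the function names are assumptions for illustration, not something ExamDiff Pro prescribes.

```python
import re

# The pattern from the post: a space followed by everything up to the end of the line.
IGNORE = re.compile(r" .*$")


def hash_only(line: str) -> str:
    """Drop the part of the line that the regular expression would ignore."""
    return IGNORE.sub("", line.rstrip("\n"))


def hash_by_column(line: str) -> str:
    """Same idea as the Column option "33-": keep only columns 1 to 32."""
    return line[:32]


# Hypothetical sample lines: same hash, different filenames.
line_a = "d41d8cd98f00b204e9800998ecf8427e  holiday.jpg"
line_b = "d41d8cd98f00b204e9800998ecf8427e  copy of holiday.jpg"

# With the filename part ignored, the two lines compare as equal.
print(hash_only(line_a) == hash_only(line_b))
```

Either way, only the hash takes part in the comparison, so a different filename no longer makes the line count as a difference.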

You can also perform directory comparisons of lists of files, so it would be as easy to find which lists contain differences.

Just take a look at it.

www.prestosoft.com/edp_examdiffpro.asp

Post by Burt »

I checked ExamDiff Pro but there are so many options...
And I can't figure out the right ones. It looks like files with different filenames are always marked as different, even if the content is the same.

Post by Fool4UAnyway »

Yes, if you perform a Directory Comparison, files will only be considered to potentially be the same if they have the same filename. That is what they are aligned to (first).

However, you can manually select any two files, from the Directory Comparison or from the Explorer('s Shell Extension) to perform a File Comparison and then use the options mentioned above.

Post by Burt »

Mmm, I have thousands of files, some of them the same but with different filenames. So I guess it is not possible with ExamDiff without manually selecting hundreds of thousands of file combinations?

Post by Fool4UAnyway »

Unfortunately, I guess that's how it is. I may have misunderstood what you are trying to accomplish.

If you have files with different names that contain the same MD5's, but with different filenames listed next to those MD5's, I guess you will have to remove those filenames first, using Text Crawler, and then run Duplicate Cleaner on the stripped files, which then contain only the (supposedly) equal (lists of) MD5's.
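The same idea (drop the filenames, then treat files whose remaining MD5 lists are equal as duplicates) can be sketched in Python without comparing every pair of files by hand. This is only a stand-in for the Text Crawler plus Duplicate Cleaner route, not how either tool works internally. It assumes each line starts with a 32-character hex hash and that the MD5 files end in ".md5"; the function names are made up for the example.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: every checksum line starts with a 32-character hex MD5.
HASH_RE = re.compile(r"^[0-9a-fA-F]{32}")


def hashes_in(text: str) -> tuple:
    """Collect the leading hash of every line, ignoring the filename part."""
    found = []
    for line in text.splitlines():
        match = HASH_RE.match(line.strip())
        if match:
            found.append(match.group(0).lower())
    return tuple(found)


def group_by_hashes(contents: dict) -> list:
    """Take {file name: file text} and return groups of names with equal hash lists."""
    groups = defaultdict(list)
    for name, text in contents.items():
        key = hashes_in(text)
        if key:  # skip files in which no hash was found
            groups[key].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


def find_duplicates(root: str, pattern: str = "*.md5") -> list:
    """Read all matching files below root (the pattern is an assumption) and group them."""
    contents = {
        str(path): path.read_text(errors="replace")
        for path in Path(root).rglob(pattern)
    }
    return group_by_hashes(contents)
```

Each file is read once and grouped by its list of hashes, so thousands of files do not turn into hundreds of thousands of manual comparisons. The hash lists are compared in order; if the order of lines inside the MD5 files can differ, sorting the tuple in `hashes_in` would be the obvious change.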

If you had a lot of files containing one or many MD5's with their (different) filenames, but with those "lot of files" having equal names, you could do the Directory Comparison in ExamDiff Pro, having set the File Comparison options to ignore those filenames on the MD5 lines.

Post by Burt »

No problem, I really appreciate your help anyway!

Can Textcrawler automatically save many stripped MD5 files?
I couldn't figure that out.

Post by Fool4UAnyway »

If your files are easy to access, that is, in one directory or (sub)directory tree structure, I guess it is a very basic action to have Text Crawler (find and) process the files. Changes will be written to the files. You can configure Text Crawler to leave a backup (like .bak) file of each changed file, so you won't lose the original contents if anything goes wrong.
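For comparison, the same batch action (strip the filename part in every file of a directory tree and keep a .bak copy of each changed file) could look roughly like this in Python. It does not show how Text Crawler does it; the ".md5" pattern, the function names and the md5sum-style line layout are assumptions.

```python
import re
import shutil
from pathlib import Path

# Same pattern as suggested for ExamDiff Pro, applied to every line of a file.
IGNORE = re.compile(r" .*$", re.MULTILINE)


def strip_filenames(text: str) -> str:
    """Remove everything from the first space to the end of each line."""
    return IGNORE.sub("", text)


def process_tree(root: str, pattern: str = "*.md5") -> int:
    """Strip all matching files below root in place; return how many were changed."""
    changed = 0
    for path in Path(root).rglob(pattern):
        original = path.read_text(errors="replace")
        stripped = strip_filenames(original)
        if stripped != original:
            # Keep the original contents next to the file, e.g. list.md5.bak
            shutil.copy2(path, path.with_name(path.name + ".bak"))
            path.write_text(stripped)
            changed += 1
    return changed
```

Calling `process_tree` on a copy of the directory first is the safer way to try it out, since the changes are written back to the files themselves.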

DV: I still have this question at the Text Crawler forum about how file( change)s are written back to disk.