Scan Criteria tab

The Scan Criteria tab is where you tell Duplicate Cleaner what you are looking for and how you want to match duplicates. (The Scan Location tab lets you say where.)

On this page you set how Duplicate Cleaner matches files. You can also specify additional duplicate matching options and search filters, which narrow down the types of files that are scanned.

Mode of operation

There are four modes of operation:

Regular Mode - Match files by their exact binary content, or match by their file attributes (name, size, etc.), ignoring the content.
Image Mode - Match image files and photos by their visual content, or by their metadata and tags.
Audio Mode - Match audio files by their sound content or metadata.
Video Mode - Match video files by video, audio, or metadata components.

Select the appropriate tab to activate this scan type, and then specify the criteria. Only one scan mode can be active at a time.

More duplicate options

These are additional scan criteria relating to file and folder attributes that can be used with all scan modes.

Same file name - With this option, files with the same filenames (excluding file extension) will be grouped as duplicates.
Same file extension - This option groups files with the same file extension (for example: .txt or .jpg).
Similar file names - This option groups files that have similar names. Match tolerance can be adjusted using the Text matching options below.
Ignore 'Copy' part of filename - When matching file names, any part of a filename that contains copy-related text or a copy number is ignored.

For example, the copy-related part is ignored in names such as:

“File - Copy.txt”

“File - Copy (1).txt”

“File - Copy (2).txt”

“File - Copy (3).txt”

“File (1).txt”

“File (2).txt”

With this option, “MyFile.txt” can be matched to “MyFile - Copy (16).txt” when matching file names. Note that this currently works only with the English 'Copy' suffix.
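As an illustration only, the idea of ignoring the 'Copy' part can be sketched in Python. The pattern and the function name strip_copy_suffix are assumptions made for this example; they are not Duplicate Cleaner's actual implementation, and they only cover the name shapes listed above.

```python
import re
from pathlib import PurePath

# Hypothetical pattern: an optional " - Copy" part followed by an optional
# " (n)" counter, anchored at the end of the name (extension already removed).
_COPY_SUFFIX = re.compile(r"(?: - Copy)?(?: \(\d+\))?$")

def strip_copy_suffix(filename: str) -> str:
    """Return the name without its extension and without a trailing copy marker."""
    stem = PurePath(filename).stem
    return _COPY_SUFFIX.sub("", stem)

# Both names reduce to the same key, so they can be compared as equal.
key_a = strip_copy_suffix("MyFile.txt")
key_b = strip_copy_suffix("MyFile - Copy (16).txt")
```

Comparing the reduced keys rather than the raw names is what lets an original and its numbered copies fall into the same group.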

Same file size - Matched files must have the same size (in bytes).
A matching tolerance can also be set, in Bytes, Kilobytes, Megabytes or as a percentage difference.

Note: This is unavailable if the 'Regular mode' - 'Same Content' option is selected, as this type of scan already assumes the files will be the same size.
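A size comparison with a tolerance could look roughly like the sketch below. The function name sizes_match, the choice of 1024 bytes per kilobyte, and measuring the percentage against the larger file are all assumptions for illustration; the program may define these differently.

```python
def sizes_match(size_a: int, size_b: int, tolerance: float = 0, unit: str = "bytes") -> bool:
    """Compare two file sizes (in bytes) within a tolerance.

    unit is one of "bytes", "kb", "mb" or "percent" (assumed names).
    """
    diff = abs(size_a - size_b)
    if unit == "percent":
        # Assumption: the percentage is taken relative to the larger file.
        larger = max(size_a, size_b)
        # Cross-multiplied to avoid floating-point division.
        return diff * 100 <= tolerance * larger
    multipliers = {"bytes": 1, "kb": 1024, "mb": 1024 ** 2}
    return diff <= tolerance * multipliers[unit]
```

With the default tolerance of 0 the check is an exact size match, which corresponds to using the option without any tolerance.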

Same created date/time - This option will group files with the same created date/time file attribute. You can just match by date, ignoring the time component, by clicking 'Match date only'.

You can also set a matching tolerance in Hours, Minutes and Seconds.

Same modified date/time - This option will group files with the same modified date/time file attribute. You can just match by date, ignoring the time component, by clicking 'Match date only'.

You can also set a matching tolerance in Hours, Minutes and Seconds.
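The date/time options above can be pictured with a short Python sketch. The function name timestamps_match and its parameters are hypothetical and only mirror the behaviour described here.

```python
from datetime import datetime, timedelta

def timestamps_match(a: datetime, b: datetime,
                     tolerance: timedelta = timedelta(0),
                     date_only: bool = False) -> bool:
    """Compare two timestamps exactly, within a tolerance, or by date only."""
    if date_only:
        # 'Match date only': the time component is ignored.
        return a.date() == b.date()
    return abs(a - b) <= tolerance

created_a = datetime(2020, 5, 1, 10, 0, 0)
created_b = datetime(2020, 5, 1, 10, 0, 45)
within_minute = timestamps_match(created_a, created_b, timedelta(minutes=1))
```

A small tolerance is useful when copies were written to file systems that store timestamps with different precision.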

Same drive - Duplicate groups have to be on the same drive or device.

Same folder name - This setting enables you to group duplicates by their folder name. There are several options:
Match full folder name - matches files by their full path excluding the drive letter or network share name. To match the drive letter as well please check the 'Same drive' option.
Match depth from top [depth] - matches folder names to a specified depth from the top of the path. Excludes drive letter.

For example: With a depth of 2, the folders C:\Mine\Stuff\Docs\Sub\ and D:\Backup\Docs\Sub will be matched. With a depth of 3, they will not match.
Match from search root - with this setting the base part (as added as a scan location) of the folder name is ignored when matching.

For example: If C:\Documents is the starting folder and C:\Documents\Mine\ is being checked then only \Mine\ is compared when checking the folder name to another.
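The two partial folder matching options can be sketched as follows. This is an illustration under assumptions: the depth example above suggests that depth is counted from the folder nearest the file, and the sketch follows that reading. The function names are hypothetical, and the depth comparison here is case sensitive.

```python
from pathlib import PureWindowsPath

def folder_parts(folder: str) -> tuple:
    """Split a folder path into components, dropping the drive or share."""
    path = PureWindowsPath(folder)
    parts = path.parts
    # parts[0] is the anchor (for example "C:\\") when the path has one.
    return parts[1:] if path.anchor else parts

def match_by_depth(folder_a: str, folder_b: str, depth: int) -> bool:
    """Compare the last `depth` folder names of both paths (depth >= 1)."""
    a, b = folder_parts(folder_a), folder_parts(folder_b)
    return len(a) >= depth and len(b) >= depth and a[-depth:] == b[-depth:]

def match_from_root(folder_a: str, root_a: str, folder_b: str, root_b: str) -> bool:
    """Compare folders after removing their scan location (search root)."""
    rel_a = PureWindowsPath(folder_a).relative_to(root_a)
    rel_b = PureWindowsPath(folder_b).relative_to(root_b)
    return rel_a == rel_b

depth_2 = match_by_depth("C:\\Mine\\Stuff\\Docs\\Sub\\", "D:\\Backup\\Docs\\Sub", 2)
depth_3 = match_by_depth("C:\\Mine\\Stuff\\Docs\\Sub\\", "D:\\Backup\\Docs\\Sub", 3)
```

Here depth_2 is True because both paths end in Docs\Sub, while depth_3 is False because Stuff and Backup differ.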
Ignore duplicate groups within the same folder

This option will cause duplicates to be EXCLUDED from the final list where ALL the duplicates in that group are in the same folder. This is useful where you want to find duplicates which are in different folders.

Note: A group containing some duplicates from the same folder will be shown if they have at least one match in a different folder.
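The rule can be expressed as a small filter over duplicate groups. This is only a sketch; the function name drop_same_folder_groups and the representation of a group as a list of file paths are assumptions for the example.

```python
from pathlib import PureWindowsPath

def drop_same_folder_groups(groups):
    """Keep only the groups whose files span more than one folder."""
    kept = []
    for group in groups:
        folders = {PureWindowsPath(path).parent for path in group}
        if len(folders) > 1:
            kept.append(group)
    return kept

groups = [
    ["C:\\Pics\\a.jpg", "C:\\Pics\\a - Copy.jpg"],                        # all in one folder
    ["C:\\Pics\\b.jpg", "C:\\Pics\\b - Copy.jpg", "D:\\Backup\\b.jpg"],   # spans two folders
]
remaining = drop_same_folder_groups(groups)
```

Only the second group remains, and it still lists both files from C:\Pics, which matches the note above.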
Text matching options - These settings affect any criteria above that rely on matching text (folder names, file names, tags).

Is case sensitive - Text comparisons are case sensitive.

Similar text tolerance - Sets the matching tolerance for similar text (lower is stricter).
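To give a feel for how a similarity tolerance behaves, here is a sketch that uses Python's standard difflib module. The algorithm Duplicate Cleaner uses is not described on this page, so the similarity measure, the 0 to 1 tolerance scale, and the function name text_matches are all assumptions.

```python
from difflib import SequenceMatcher

def text_matches(a: str, b: str, tolerance: float = 0.0,
                 case_sensitive: bool = False) -> bool:
    """Return True when the two strings differ by no more than `tolerance`.

    tolerance is an assumed scale from 0.0 (identical only) to 1.0 (anything).
    """
    if not case_sensitive:
        a, b = a.casefold(), b.casefold()
    similarity = SequenceMatcher(None, a, b).ratio()
    return 1.0 - similarity <= tolerance

close_enough = text_matches("holiday_2019", "holiday_2020", tolerance=0.2)
```

On this assumed scale, raising the tolerance admits more loosely related names, which is the "lower is stricter" behaviour described above.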

Search filters

Search filters allow you to narrow down the files to scan before any matching takes place. Having filters in place can speed up large scans.

File types

Having 'All file types' checkmarked will scan all the file types supported by the current scan mode (e.g. just image types for Image Mode). Unchecking this setting allows you to set custom filters or use an editable preset.

You can choose one of the preset filters (graphics files, office files, movies, etc.) by clicking on the 'Presets' button, or you can type in your own.

Multiple filters are separated by a semicolon (;). For example:

*.txt;*.bak;*.ini

You can also choose to exclude certain file extensions should you wish. This feature is most useful when a wildcard is used in the 'included' box.

For example:

Included: *.*

Excluded: *.db;*.ini

Additionally, folder names can be excluded from the scan. This option applies to every entry in the Excluded box (for example: .picasaoriginals).
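The include and exclude filters behave like wildcard patterns. A rough Python equivalent, using the standard fnmatch module, is shown below. The function names are hypothetical, and matching is made case insensitive here as an assumption.

```python
from fnmatch import fnmatch

def parse_filters(text: str) -> list:
    """Split a semicolon-separated filter string into lowercase patterns."""
    return [p.strip().lower() for p in text.split(";") if p.strip()]

def passes_filter(filename: str, included: str = "*.*", excluded: str = "") -> bool:
    """Return True if the name matches an included pattern and no excluded one."""
    name = filename.lower()
    if not any(fnmatch(name, p) for p in parse_filters(included)):
        return False
    return not any(fnmatch(name, p) for p in parse_filters(excluded))

# Note: with fnmatch, "*.*" only matches names that contain a dot.
keeps_photo = passes_filter("photo.JPG", "*.*", "*.db;*.ini")
drops_db = passes_filter("Thumbs.db", "*.*", "*.db;*.ini")
```

The excluded patterns are checked after the included ones, so an exclusion always wins when a file matches both boxes.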

The file presets can be saved and edited from the settings tab.

File sizes

You can set the file size range of the search (Minimum and Maximum). You can also set the units for this value (Bytes, KB, MB, GB or TB). If 'Any file size' is selected then all file sizes are scanned.

'Ignore zero size files' will exclude all empty (zero bytes) files from the scan.

File dates

This filter limits the scan to files whose Created or Modified date falls between two dates. 'Any Date' will scan all file dates.

Image / Movie dimensions

This filter is for Image and Video modes. Here you can limit the scan to pictures or videos which fall between specified dimensions, in pixels (px). 'Any dimensions' will scan all sizes.

Hard-links

Hard-links are a way of having a single file appear in multiple places on the file system. Hard-linked files may show up as duplicates, though they actually take up the disk space of only one file.

Count hard-links on file - The number of hard-links each file has will be counted and shown in the list. This will slow down the scan.
Exclude hard-linked files from duplicate list - Any hard-linked files are excluded from the duplicate check. Requires the hard-link count setting above to be checkmarked.
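For readers who want to check hard-links themselves, the sketch below uses Python's standard os module. It is not how Duplicate Cleaner performs the count; the helper names are made up for this example.

```python
import os

def hard_link_count(path: str) -> int:
    """Number of directory entries (hard-links) that point to this file's data."""
    return os.stat(path).st_nlink

def is_same_file(path_a: str, path_b: str) -> bool:
    """True when both paths refer to the same underlying file."""
    return os.path.samefile(path_a, path_b)
```

A count above 1 means the file has at least one other name on the same volume, so deleting one of those names frees no disk space.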

The contents of the Scan Criteria tab

Regular mode

Image mode

Audio mode

Video mode