# find_duplicate_files

Find duplicate files according to their size and hashing algorithm.

"A hash function is a mathematical algorithm that takes an input (in this case, a file) and produces a fixed-size string of characters, known as a hash value or checksum. The hash value acts as a summary representation of the original input. This hash value is unique (disregarding unlikely collisions) to the input data, meaning even a slight change in the input will result in a completely different hash value."

Hash algorithm options are:

1. ahash (used by hashbrown)

2. blake version 3 (default)

3. fxhash (used by `Firefox` and `rustc`)

4. sha256

5. sha512

`find_duplicate_files` only reads the files and never changes their contents. See the function `fn open_file()` to verify.

## Usage examples

1. To find duplicate files in the current directory, run the command:

```
find_duplicate_files
```

2. To find duplicate files with the `fxhash` algorithm and print the result in `yaml` format:

```
find_duplicate_files -csta fxhash -r yaml
```

3. To find duplicate files in the `Downloads` directory and redirect the output to a `json` file for further analysis:

```
find_duplicate_files -p ~/Downloads -r json > fdf.json
```

## Help

Run `find_duplicate_files -h` in the terminal to see the help messages and all available options:

```
find duplicate files according to their size and hashing algorithm

Usage: find_duplicate_files [OPTIONS]

Options:
  -a, --algorithm       Choose the hash algorithm [default: blake3] [possible values: ahash, blake3, fxhash, sha256, sha512]
  -c, --clear_terminal  Clear the terminal screen before listing the duplicate files
  -f, --full_path       Prints full path of duplicate files, otherwise relative path
  -g, --generate        If provided, outputs the completion file for given shell [possible values: bash, elvish, fish, powershell, zsh]
  -m, --max_depth       Set the maximum depth to search for duplicate files
  -o, --omit_hidden     Omit hidden files (starts with '.'), otherwise search all files
  -p, --path            Set the path where to look for duplicate files, otherwise use the current directory
  -r, --result_format   Print the result in the chosen format [default: personal] [possible values: json, yaml, personal]
  -s, --sort            Sort result by file size, otherwise sort by number of duplicate files
  -t, --time            Show total execution time
  -v, --verbose         Show intermediate runtime messages
  -h, --help            Print help (see more with '--help')
  -V, --version         Print version
```

## Building

To build and install from source, run the following command:

```
cargo install find_duplicate_files
```

Another option is to install from GitHub:

```
cargo install --git
```

## Mutually exclusive features

#### Walking a directory recursively: jwalk or walkdir.

In general, jwalk (default) is faster than walkdir.

If you prefer to use walkdir instead:

```
cargo install --features walkdir find_duplicate_files
```
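
## Sketch: grouping by size, then by hash

The idea stated at the top of this README, finding duplicates "according to their size and hashing algorithm", can be illustrated with a minimal sketch. This is not the tool's actual implementation. It assumes only the Rust standard library, and the function names (`collect_files`, `hash_file`, `find_duplicates`) are hypothetical. It also swaps in `std`'s `DefaultHasher` for the algorithms listed above (ahash, blake3, fxhash, sha256, sha512), purely to keep the example free of external crates.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::fs::{self, File};
use std::hash::Hasher;
use std::io::{self, Read};
use std::path::{Path, PathBuf};

/// Recursively collect every regular file under `dir`.
/// Symlinks are skipped because `DirEntry::file_type` does not follow them.
fn collect_files(dir: &Path, files: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            collect_files(&entry.path(), files)?;
        } else if file_type.is_file() {
            files.push(entry.path());
        }
    }
    Ok(())
}

/// Hash a file's contents in 8 KiB chunks. The file is opened read-only.
/// ASSUMPTION: `DefaultHasher` is a stand-in for the tool's real algorithms.
fn hash_file(path: &Path) -> io::Result<u64> {
    let mut file = File::open(path)?;
    let mut hasher = DefaultHasher::new();
    let mut buffer = [0u8; 8192];
    loop {
        let bytes_read = file.read(&mut buffer)?;
        if bytes_read == 0 {
            break;
        }
        hasher.write(&buffer[..bytes_read]);
    }
    Ok(hasher.finish())
}

/// Return groups of paths whose files share both size and content hash.
fn find_duplicates(root: &Path) -> io::Result<Vec<Vec<PathBuf>>> {
    let mut files = Vec::new();
    collect_files(root, &mut files)?;

    // Step 1: group by size, which is cheap because it needs only metadata.
    let mut by_size: HashMap<u64, Vec<PathBuf>> = HashMap::new();
    for path in files {
        let size = fs::metadata(&path)?.len();
        by_size.entry(size).or_default().push(path);
    }

    // Step 2: hash only the files that share their size with another file.
    let mut duplicates = Vec::new();
    for group in by_size.into_values() {
        if group.len() < 2 {
            continue;
        }
        let mut by_hash: HashMap<u64, Vec<PathBuf>> = HashMap::new();
        for path in group {
            by_hash.entry(hash_file(&path)?).or_default().push(path);
        }
        duplicates.extend(by_hash.into_values().filter(|g| g.len() > 1));
    }
    Ok(duplicates)
}
```

Calling `find_duplicates(Path::new("."))` would return one `Vec<PathBuf>` per set of matching files. Comparing sizes first means files with a unique size are never read at all, which is why size is a useful pre-filter before hashing. A 64-bit non-cryptographic hash like the one in this sketch is more collision-prone than blake3 or sha256, so a careful implementation might confirm matches with a byte-by-byte comparison.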
