find_duplicate_files

Find duplicate files by comparing their sizes and the hashes of their contents, computed with a selectable hashing algorithm.

"A hash function is a mathematical algorithm that takes an input (in this case, a file) and produces a fixed-size string of characters, known as a hash value or checksum. This hash value is unique to the input data, meaning even a slight change in the input will result in a completely different hash value."

Hash algorithm options are:

ahash (used by hashbrown)
blake3 (BLAKE version 3, default)
fxhash (used by Firefox and rustc)
sha256
sha512
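
The size-then-hash idea can be illustrated with a minimal sketch. It is not the tool's actual implementation: the function names `hash_file` and `find_duplicates` are hypothetical, and the standard library's `DefaultHasher` stands in for the algorithms listed above so that the example needs no external crates. Files are first grouped by size, which only requires metadata, and only files that share a size are then hashed. Note that matching hashes make identical content very likely but do not strictly guarantee it, especially with a 64-bit hash like the stand-in used here.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::fs;
use std::hash::Hasher;
use std::io;
use std::path::{Path, PathBuf};

/// Hash the full contents of a file.
/// Assumption: DefaultHasher is a std-only stand-in, NOT one of the tool's algorithms.
/// The whole file is read into memory; a real tool would likely stream it.
fn hash_file(path: &Path) -> io::Result<u64> {
    let bytes = fs::read(path)?;
    let mut hasher = DefaultHasher::new();
    hasher.write(&bytes);
    Ok(hasher.finish())
}

/// Group paths by size, then by content hash.
/// Returns only the groups that contain more than one file.
fn find_duplicates(paths: &[PathBuf]) -> io::Result<Vec<Vec<PathBuf>>> {
    // Step 1: group by file size (cheap, metadata only).
    let mut by_size: HashMap<u64, Vec<&PathBuf>> = HashMap::new();
    for p in paths {
        by_size.entry(fs::metadata(p)?.len()).or_default().push(p);
    }

    // Step 2: hash only the files that share their size with another file.
    let mut groups = Vec::new();
    for same_size in by_size.into_values().filter(|v| v.len() > 1) {
        let mut by_hash: HashMap<u64, Vec<PathBuf>> = HashMap::new();
        for p in same_size {
            by_hash.entry(hash_file(p)?).or_default().push(p.clone());
        }
        groups.extend(by_hash.into_values().filter(|v| v.len() > 1));
    }
    Ok(groups)
}

fn main() -> io::Result<()> {
    // Hypothetical demo files written to a temporary directory.
    let dir = std::env::temp_dir().join("fdf_sketch_demo");
    fs::create_dir_all(&dir)?;
    let files = [
        ("a.txt", "same"),
        ("b.txt", "same"),
        ("c.txt", "diff"),
        ("d.txt", "longer text"),
    ];
    let mut paths = Vec::new();
    for (name, content) in files {
        let p = dir.join(name);
        fs::write(&p, content)?;
        paths.push(p);
    }

    // Print each group of files with identical size and hash.
    for group in find_duplicates(&paths)? {
        println!("{:?}", group);
    }

    fs::remove_dir_all(&dir)?;
    Ok(())
}
```

In this demo, `c.txt` has the same size as `a.txt` and `b.txt`, so it is hashed as well, while `d.txt` is excluded by the size check alone and never read.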

Usage examples

To find duplicate files in the current directory, run the command: find_duplicate_files
To find duplicate files using the fxhash algorithm and print the result in YAML format (also clearing the terminal, sorting by file size, and showing the execution time): find_duplicate_files -csta fxhash -r yaml
To find duplicate files in the Downloads directory and redirect the output to a file for further analysis: find_duplicate_files -p ~/Downloads > /tmp/fdf.output

Help

Run find_duplicate_files -h in the terminal to see the help message and all available options:

```
find duplicate files according to their size and hashing algorithm

Usage: find_duplicate_files [OPTIONS]

Options:
  -a, --algorithm
          Choose the hash algorithm [default: blake3] [possible values: ahash, blake3, fxhash, sha256, sha512]
  -c, --clearterminal
          Clear the terminal screen before listing the duplicate files
  -f, --fullpath
          Prints full path of duplicate files, otherwise relative path
  -g, --generate
          If provided, outputs the completion file for given shell [possible values: bash, elvish, fish, powershell, zsh]
  -m, --maxdepth
          Set the maximum depth to search for duplicate files
  -o, --omithidden
          Omit hidden files (starts with '.'), otherwise search all files
  -p, --path
          Set the path where to look for duplicate files, otherwise use the current directory
  -r, --result_format
          Print the result in the chosen format [default: personal] [possible values: json, yaml, personal]
  -s, --sort
          Sort result by file size, otherwise sort by number of duplicate files
  -t, --time
          Show total execution time
  -h, --help
          Print help (see more with '--help')
  -V, --version
          Print version
```

Building

To build and install from source, run the following command: cargo install find_duplicate_files

Another option is to clone the project from GitHub, then compile it and install the executable:

```
git clone https://github.com/claudiofsr/findduplicatefiles.git
cd findduplicatefiles
cargo b -r && cargo install --path=.
```

Mutually exclusive features

Directories are walked recursively with either jwalk or walkdir; only one of the two can be enabled in a given build.

In general, jwalk (default) is faster than walkdir.

If you prefer walkdir, build with the walkdir feature enabled: cargo b -r && cargo install --path=. --features walkdir
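
For readers unfamiliar with what a recursive directory walk involves, here is a rough sketch using only the standard library. It uses neither jwalk nor walkdir and is not how this tool walks directories. The function `walk` and its parameters are hypothetical; `max_depth` and `omit_hidden` are loosely modeled on the -m and -o options, and the exact depth semantics of the real tool are an assumption.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Collect regular files under `root`, descending at most `max_depth` levels.
/// Assumption: files directly inside `root` count as depth 1.
/// Symbolic links are not followed, because `file_type()` reports the link itself.
fn walk(root: &Path, max_depth: usize, omit_hidden: bool) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    // Explicit stack of (directory, depth of that directory) instead of recursion.
    let mut stack = vec![(root.to_path_buf(), 0usize)];

    while let Some((dir, depth)) = stack.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;

            // Skip entries whose name starts with '.', if requested.
            if omit_hidden && entry.file_name().to_string_lossy().starts_with('.') {
                continue;
            }

            let file_type = entry.file_type()?;
            if file_type.is_file() {
                files.push(entry.path());
            } else if file_type.is_dir() && depth + 1 < max_depth {
                // Descend only while the depth limit allows it.
                stack.push((entry.path(), depth + 1));
            }
        }
    }

    // Sort for a stable, reproducible order.
    files.sort();
    Ok(files)
}

fn main() -> io::Result<()> {
    // List non-hidden files in the current directory and one level below it.
    for path in walk(Path::new("."), 2, true)? {
        println!("{}", path.display());
    }
    Ok(())
}
```

A single-threaded walk like this is easy to follow but visits one directory at a time.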