Find duplicate files according to their size and hashing algorithm.
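The size-then-hash idea can be sketched in a few lines of Rust. This is a minimal illustration, not the tool's actual implementation: `hash_file` and `find_duplicates` are hypothetical names, and the standard library's `DefaultHasher` stands in for the selectable hash algorithms. Files are first bucketed by size, and only files that share a size with another file are hashed.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::fs;
use std::hash::Hasher;
use std::io;
use std::path::{Path, PathBuf};

/// Hash a file's full contents.
/// Assumption: DefaultHasher is only a stand-in for the tool's algorithms.
fn hash_file(path: &Path) -> io::Result<u64> {
    let bytes = fs::read(path)?;
    let mut hasher = DefaultHasher::new();
    hasher.write(&bytes);
    Ok(hasher.finish())
}

/// Return groups of paths whose size and content hash both match.
fn find_duplicates(paths: &[PathBuf]) -> io::Result<Vec<Vec<PathBuf>>> {
    // Step 1: bucket by size; a file with a unique size cannot have a duplicate.
    let mut by_size: HashMap<u64, Vec<PathBuf>> = HashMap::new();
    for path in paths {
        let size = fs::metadata(path)?.len();
        by_size.entry(size).or_default().push(path.clone());
    }

    // Step 2: hash only the files that share a size with another file.
    let mut groups = Vec::new();
    for candidates in by_size.into_values().filter(|v| v.len() > 1) {
        let mut by_hash: HashMap<u64, Vec<PathBuf>> = HashMap::new();
        for path in candidates {
            let digest = hash_file(&path)?;
            by_hash.entry(digest).or_default().push(path);
        }
        groups.extend(by_hash.into_values().filter(|v| v.len() > 1));
    }
    Ok(groups)
}

fn main() -> io::Result<()> {
    // Paths to compare are taken from the command line.
    let paths: Vec<PathBuf> = std::env::args().skip(1).map(PathBuf::from).collect();
    for group in find_duplicates(&paths)? {
        println!("{group:?}");
    }
    Ok(())
}
```

Two files with the same size and the same hash are very likely, though not strictly guaranteed, to be identical; a byte-by-byte comparison would remove the remaining doubt.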
"A hash function is a mathematical algorithm that takes an input (in this case, a file) and produces a fixed-size string of characters, known as a hash value or checksum. This hash value is unique to the input data, meaning even a slight change in the input will result in a completely different hash value."
Hash algorithm options are:
To find duplicate files in a directory, run the command:
```
find_duplicate_files
```
To find duplicate files with the fxhash algorithm and yaml format:
```
find_duplicate_files -csta fxhash -r yaml
```
To find duplicate files in the Downloads directory and redirect the output to a file for further analysis:
```
find_duplicate_files -p ~/Downloads > /tmp/fdf.output
```
Run `find_duplicate_files -h` in the terminal to see the help message and all available options:
```
find duplicate files according to their size and hashing algorithm
Usage: find_duplicate_files [OPTIONS]
Options:
-a, --algorithm
```
To build and install from source, run the following command:
```
cargo install find_duplicate_files
```
Another option is to clone the project from GitHub, then compile and generate the executable:
```
git clone https://github.com/claudiofsr/findduplicatefiles.git
cd findduplicatefiles
cargo b -r && cargo install --path=.
```
In general, jwalk (default) is faster than walkdir.
But if you prefer to use walkdir:
```
cargo b -r && cargo install --path=. --features walkdir
```