Deduplicator

Find, Sort, Filter & Delete duplicate files

NOTE: This project is still under development. At the moment, as shown in the screenshot below, Deduplicator is able to scan through and list duplicates, with or without caching. Contributions are welcome.

Usage

```bash
Usage: deduplicator [OPTIONS]

Options:
  -t, --types        Filetypes to deduplicate (default = all)
      --dir          Run Deduplicator on a dir different from pwd
  -i, --interactive  Delete files interactively
  -m, --minsize      Minimum filesize of duplicates to scan (e.g., 100B/1K/2M/3G/4T) [default = 0]
  -h, --help         Print help information
  -V, --version      Print version information
```
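For example, to interactively review duplicate PDFs of at least 1M in a specific directory (the path here is illustrative):

```bash
# Scan ~/Documents for duplicate pdf files of at least 1M, deleting interactively
deduplicator --types pdf --minsize 1M --dir ~/Documents -i
```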

Installation

Currently, Deduplicator is only installable via Rust's Cargo package manager:

```bash
cargo install deduplicator
```

Note that if you use a version manager to install Rust (such as asdf), you will need to reshim (`asdf reshim rust`).

Performance

Deduplicator uses fxhash, an extremely fast non-cryptographic hashing algorithm, to fingerprint files. As a result, it is able to process huge amounts of data in a matter of seconds.
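As a rough illustration of the approach (a minimal sketch, not Deduplicator's actual implementation; it assumes the `fxhash` crate as a dependency and reads each file fully into memory):

```rust
// Minimal sketch of hash-based duplicate grouping with the fxhash crate.
// Assumes `fxhash = "0.2"` in Cargo.toml; not Deduplicator's actual code.
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::PathBuf;

/// Group files by the fxhash of their contents; buckets containing more
/// than one path are likely duplicates.
fn find_duplicates(paths: &[PathBuf]) -> io::Result<HashMap<u64, Vec<PathBuf>>> {
    let mut groups: HashMap<u64, Vec<PathBuf>> = HashMap::new();
    for path in paths {
        let bytes = fs::read(path)?;          // read the whole file into memory
        let digest = fxhash::hash64(&bytes);  // fast, non-cryptographic 64-bit hash
        groups.entry(digest).or_default().push(path.clone());
    }
    groups.retain(|_, files| files.len() > 1); // keep only hashes seen more than once
    Ok(groups)
}
```

In practice a tool like this would typically compare cheap file metadata (such as size) before hashing, and confirm matches byte-for-byte if collisions matter, since fxhash is not collision-resistant.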

While testing, Deduplicator was able to go through 8.6 GB of PDF files and detect duplicates in 2.9 seconds. As of version 0.1.1, in local testing, it was able to process and find duplicates in 120 GB of files (videos, PDFs, images) in ~300 ms.

Screenshots