Warning: alpha/prototype quality software ahead
rdedup is a tool providing data deduplication with compression and public-key
encryption, written in the Rust programming language. It's useful for backups.
I use rdup to make backups, and I also use syncthing to replicate those backups across a lot of systems. Some of them are more trusted (desktops with disk-level encryption, firewalls, stored in the vault etc.), and some not so much (semi-personal laptops, phones etc.).
As my backups tend to contain a lot of shared data (even backups taken on different systems), it makes perfect sense to deduplicate them.
However, I'm paranoid and I don't want one of my hosts, if physically or remotely compromised, to give access to the data inside all my backups from all my systems. Existing deduplication software like ddar or zbackup provides encryption, but only symmetric (zbackup issue, ddar issue), which means you have to share the same key on all your hosts, and one compromised system compromises all your backups.
To fill the missing piece in my master backup plan, I've decided to write it
myself using my beloved Rust programming language. That's how rdedup
started.
rdedup works very much like zbackup and other deduplication software.
rdedup keeps its data in a given directory, using a special format that turns
that directory into a deduplication storage.
When saving data, rdedup will split it into smaller pieces (chunks) using
a rolling sum algorithm, and store each chunk under a unique name (its sha256 digest).
Then the whole backup will be described as a list of chunks (their ids).
When restoring data, rdedup will read the list of chunks and recreate the
original data.
Thanks to this chunking scheme, when similar data is saved frequently, a lot of common chunks will be reused, saving space.
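The scheme above can be illustrated with a small Python sketch. It is not rdedup's actual implementation: the window size, the boundary mask, the in-memory dict standing in for the storage directory, and the function names `split_chunks`, `save` and `restore` are all assumptions made for illustration.

```python
import hashlib

WINDOW = 16          # bytes covered by the rolling sum (assumed value)
MASK = (1 << 6) - 1  # cut where the low 6 bits of the sum are zero (assumed value)


def split_chunks(data: bytes) -> list:
    """Split data at content-defined boundaries found with a rolling sum."""
    chunks = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]  # drop the byte that left the window
        # Require at least WINDOW bytes per chunk, then cut on the mask.
        if i + 1 - start >= WINDOW and (rolling & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder
    return chunks


def save(store: dict, data: bytes) -> list:
    """Store each chunk under its sha256 digest; return the list of chunk ids."""
    ids = []
    for chunk in split_chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # an already-known chunk is not stored again
        ids.append(digest)
    return ids


def restore(store: dict, ids: list) -> bytes:
    """Recreate the original data from the list of chunk ids."""
    return b"".join(store[chunk_id] for chunk_id in ids)


# Demo: save a blob, then a slightly edited copy, into the same store.
original = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(256))
edited = original[:4000] + b"a small edit" + original[4000:]

store = {}
ids_original = save(store, original)
chunks_before = len(store)
ids_edited = save(store, edited)

print("chunks in edited backup:", len(ids_edited))
print("new chunks actually stored:", len(store) - chunks_before)
```

Because a boundary depends only on the bytes inside the rolling window, an insertion in the middle of the data typically changes only the chunks around the edit. The chunks before it, and usually most of those after it, keep their digests and are stored once.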
What makes rdedup unique is that every time a new storage directory is created, a pair
of keys (public and secret) is generated. The public key is saved in the
storage directory itself, while the secret key is supposed to be written down or stored
securely in an outside location.
Every time rdedup saves a new chunk of data, it is encrypted with the public key, so it can
only be decrypted using the corresponding secret key. This way new backups can
be created, with full deduplication, while accessing the data requires the
secret key.
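To illustrate why a host holding only the public key can add chunks but cannot read them, here is a toy Python sketch. It uses textbook RSA built from two small Mersenne primes purely so that the example runs with the standard library (Python 3.8+); this is not secure and is not the cipher rdedup uses. Naming each chunk by the sha256 digest of its unencrypted content, and the helper names `save_chunk` and `load_chunk`, are assumptions made for illustration.

```python
import hashlib

# Toy key pair (assumption, NOT secure): textbook RSA over two Mersenne primes.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # modular inverse, needs Python 3.8+

PUBLIC_KEY = (N, E)   # would live in the storage directory
SECRET_KEY = (N, D)   # would be kept outside the storage directory

PLAIN_BLOCK = 26      # bytes per block; with the 0x01 marker it stays below N
CIPHER_BLOCK = 27     # N is a 216-bit number, so a ciphertext fits in 27 bytes


def encrypt(public_key, data: bytes) -> bytes:
    n, e = public_key
    out = []
    for i in range(0, len(data), PLAIN_BLOCK):
        # The leading 0x01 marker preserves leading zero bytes of the block.
        m = int.from_bytes(b"\x01" + data[i:i + PLAIN_BLOCK], "big")
        out.append(pow(m, e, n).to_bytes(CIPHER_BLOCK, "big"))
    return b"".join(out)


def decrypt(secret_key, blob: bytes) -> bytes:
    n, d = secret_key
    out = []
    for i in range(0, len(blob), CIPHER_BLOCK):
        m = pow(int.from_bytes(blob[i:i + CIPHER_BLOCK], "big"), d, n)
        out.append(m.to_bytes((m.bit_length() + 7) // 8, "big")[1:])  # strip marker
    return b"".join(out)


def save_chunk(store: dict, public_key, chunk: bytes) -> str:
    """Add a chunk using only the public key; known chunks are skipped."""
    digest = hashlib.sha256(chunk).hexdigest()
    if digest not in store:  # deduplication needs no secret key
        store[digest] = encrypt(public_key, chunk)
    return digest


def load_chunk(store: dict, secret_key, digest: str) -> bytes:
    """Reading a chunk back requires the secret key."""
    return decrypt(secret_key, store[digest])


store = {}
chunk_id = save_chunk(store, PUBLIC_KEY, b"some backup data " * 8)
save_chunk(store, PUBLIC_KEY, b"some backup data " * 8)  # duplicate, not stored again
print("stored chunks:", len(store))
print("round trip ok:", load_chunk(store, SECRET_KEY, chunk_id) == b"some backup data " * 8)
```

In this sketch the saving side never touches `SECRET_KEY`: it can tell that a chunk already exists by its name alone, and anything it writes is readable only by whoever holds the secret key.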
rdedup init will create a backup subdirectory in the current directory and generate a keypair
used for encryption.
rdedup save <name> will save any data given on standard input under the given name.
rdedup load <name> will write to standard output the data previously stored under the given name.
In combination with rdup this can be used to store and restore your backup like this:
rdup -x /dev/null "$HOME" | rdedup save home
rdedup load home | rdup-up "$HOME.restored"