rdedup


Introduction

Warning: alpha/prototype quality software ahead

rdedup is a tool that provides data deduplication with compression and public key encryption, written in the Rust programming language. It's useful for backups.

My use case

I use rdup to make backups, and syncthing to replicate those backups across many systems. Some of them are more trusted (desktops with disk-level encryption, firewalls, stored in the vault, etc.), and some not so much (semi-personal laptops, phones, etc.).

As my backups tend to contain a lot of shared data (even backups taken on different systems), it makes perfect sense to deduplicate them.

However, I'm paranoid, and I don't want a physical or remote compromise of one of my hosts to give access to the data inside all my backups from all my systems. Existing deduplication software like ddar or zbackup provides encryption, but only symmetric encryption (zbackup issue, ddar issue), which means you have to share the same key across all your hosts, and one compromised system compromises all your backups.

To fill the missing piece in my master backup plan, I've decided to write it myself using my beloved Rust programming language. That's how rdedup started.

How it works

rdedup works very much like zbackup and other deduplication software.

rdedup turns a given directory into deduplication storage, keeping its data there in its own format.

When saving data, rdedup splits it into smaller pieces (chunks) using a rolling sum algorithm, and stores each chunk under a unique name (its sha256 digest). The whole backup is then described as a list of chunks (their ids).
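As a rough illustration of the splitting step, here is a minimal Python sketch of chunking driven by a rolling sum. It is not rdedup's implementation: the function names, the window size and the boundary mask are all assumptions made for this example, and rdedup's actual algorithm and parameters may differ.

```python
import hashlib

WINDOW = 16          # bytes covered by the rolling sum (assumed value)
MASK = (1 << 6) - 1  # cut when the low 6 bits of the sum are zero (assumed value)


def split_chunks(data, window=WINDOW, mask=MASK):
    """Split `data` (bytes) at positions chosen by a rolling sum of the last `window` bytes."""
    chunks = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]  # drop the byte that just left the window
        # Cut once the chunk is at least `window` bytes long and the sum matches the mask.
        if i + 1 - start >= window and (rolling & mask) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # whatever is left becomes the final chunk
    return chunks


def chunk_id(chunk):
    """Name a chunk by the hex sha256 digest of its content."""
    return hashlib.sha256(chunk).hexdigest()


if __name__ == "__main__":
    sample = bytes((i * 31 + (i >> 3)) % 251 for i in range(5000))
    for piece in split_chunks(sample):
        print(chunk_id(piece), len(piece))
```

Because a cut point depends only on the last few bytes seen, identical stretches of data tend to produce identical chunks even when they appear at different offsets in different backups.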

When restoring data, rdedup will read the list of chunks and recreate the original data.

Thanks to this chunking scheme, when similar data is saved frequently, a lot of common chunks are reused, saving space.
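The save and restore scheme above can be sketched with a small in-memory store. This is only an illustration under stated assumptions: the `Store` class and its methods are hypothetical, the data is assumed to be already split into chunks, and compression, encryption and rdedup's real on-disk format are left out.

```python
import hashlib


class Store:
    """Toy chunk store (hypothetical, for illustration only)."""

    def __init__(self):
        self.chunks = {}  # chunk id (sha256 hex digest) -> chunk bytes
        self.names = {}   # backup name -> ordered list of chunk ids

    def save(self, name, chunks):
        """Store each chunk under its digest and record the backup as a list of ids."""
        ids = []
        for chunk in chunks:
            cid = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(cid, chunk)  # a chunk already present is not stored again
            ids.append(cid)
        self.names[name] = ids

    def restore(self, name):
        """Rebuild the original data by concatenating the listed chunks in order."""
        return b"".join(self.chunks[cid] for cid in self.names[name])


if __name__ == "__main__":
    store = Store()
    # Two backups that share their first two chunks.
    store.save("monday", [b"system files ", b"documents ", b"notes v1"])
    store.save("tuesday", [b"system files ", b"documents ", b"notes v2"])
    print(len(store.chunks))         # 4 chunks stored for 6 chunks saved
    print(store.restore("tuesday"))  # b'system files documents notes v2'
```

The second backup adds only one new chunk to the store, which is where the space saving comes from.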

What makes rdedup unique is that every time a new storage directory is created, a pair of keys (public and secret) is generated. The public key is saved in the storage directory itself, while the secret key is supposed to be written down or stored securely in an outside location.

Every time rdedup saves a new chunk of data, it encrypts the chunk with the public key, so it can only be decrypted using the corresponding secret key. This way new backups can be created, with full deduplication, while accessing the data requires the secret key.

Details

Usage

rdedup init

will create a backup subdirectory in the current directory and generate a keypair used for encryption.

rdedup save <name>

will save any data given on standard input under the given name.

rdedup restore <name>

will write to standard output the data previously stored under the given name.

In combination with rdup this can be used to store and restore your backup like this:

rdup -x /dev/null "$HOME" | rdedup save home
rdedup load home | rdup-up "$HOME.restored"