
# Monotree

Rust implementation of an optimized Sparse Merkle Tree.
It is a kind of binary radix tree based on bitwise branching: for now, the branching unit is a single bit, not a 4-bit nibble or a full byte.
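
To make "bitwise branching" concrete, here is a minimal sketch (a hypothetical helper, not part of monotree's API) of how one bit of a key can decide each branch, assuming MSB-first bit order:

```rust
/// Hypothetical helper (not monotree's API): return the i-th bit of a
/// 32-byte key, MSB first. In a bitwise-branching radix tree, bit 0 decides
/// the branch taken at the root, bit 1 at the next level, and so on.
fn bit_at(key: &[u8; 32], i: usize) -> bool {
    ((key[i / 8] >> (7 - i % 8)) & 1) == 1
}

fn main() {
    let key = [0b1010_0000u8; 32];
    assert!(bit_at(&key, 0));  // first bit is 1  -> take one branch (e.g. right)
    assert!(!bit_at(&key, 1)); // second bit is 0 -> take the other (e.g. left)
}
```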

## Features

Apart from the database adapters and hashers, this library relies only on the Rust standard library.
monotree currently supports the databases and hash functions listed below, and it is designed to be easy to customize and extend (a short construction sketch follows the lists):

Databases include:
- HashMap
- RocksDB
- Sled

Hashers include:
- Blake3
- Blake2s and Blake2b
- SHA-2
- SHA-3 (Keccak)
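
For instance, a minimal construction sketch, assuming the Sled backend is enabled and using the generic constructor shown in examples/advanced.rs (the path is arbitrary):

```rust
use monotree::database::Sled;
use monotree::hasher::Blake2b;
use monotree::utils::random_hash;
use monotree::{Monotree, Result};

fn main() -> Result<()> {
    // Explicitly pick a database backend and a hasher instead of the
    // defaults (HashMap + Blake3); "/tmp/monotree-sled" is an arbitrary path.
    let mut tree = Monotree::<Sled, Blake2b>::new("/tmp/monotree-sled");
    let (key, leaf) = (random_hash(), random_hash());
    let root = tree.insert(None, &key, &leaf)?;
    assert_ne!(root, None);
    Ok(())
}
```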

## Quick start

from examples/basic.rs

For inclusion and non-inclusion proofs, see the Merkle proof section in More Examples below.
```rust
use monotree::{Monotree, Result};
use monotree::utils::random_hash;

fn main() -> Result<()> {
    // Init a monotree instance:
    // by default, with 'HashMap' and the 'Blake3' hash function
    let mut tree = Monotree::default();

    // It is natural that the tree root is initially 'None'
    let root = None;

    // Prepare a random pair of key and leaf.
    // random_hash() gives a fixed-length random array,
    // where Hash -> [u8; HASH_LEN], HASH_LEN = 32
    let key = random_hash();
    let leaf = random_hash();

    // Insert the entry (key, leaf) into the tree, yielding a new root
    let root = tree.insert(root.as_ref(), &key, &leaf)?;
    assert_ne!(root, None);

    // Get the leaf inserted just before. Note that the last root was used.
    let found = tree.get(root.as_ref(), &key)?;
    assert_eq!(found, Some(leaf));

    // Remove the entry
    let root = tree.remove(root.as_ref(), &key)?;

    // Surely, the tree is empty again and the root is back to 'None'
    assert_eq!(tree.get(root.as_ref(), &key)?, None);
    assert_eq!(root, None);

    Ok(())
}
```

## Performance

All benchmarks were performed cumulatively: the root produced by the previous operation was reused for the next one. Entries were randomly generated byte arrays, insertions and gets start from the root of an empty tree, and deletions start from the last root.
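
In code, the cumulative pattern looks roughly like the sketch below (an illustration of the methodology, not the actual benchmark harness):

```rust
use monotree::utils::random_hashes;
use monotree::{Monotree, Result};

fn main() -> Result<()> {
    let mut tree = Monotree::default();
    let (keys, leaves) = (random_hashes(100), random_hashes(100));

    // Each insert starts from the root produced by the previous insert,
    // so the measured cost includes the tree growing step by step.
    let mut root = None;
    for (key, leaf) in keys.iter().zip(leaves.iter()) {
        root = tree.insert(root.as_ref(), key, leaf)?;
    }

    // Deletion starts from the last root and shrinks the tree back down.
    for key in keys.iter() {
        root = tree.remove(root.as_ref(), key)?;
    }
    assert_eq!(root, None);
    Ok(())
}
```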

Tested on: Intel Core i5-3570 CPU @ 3.4GHz and 16 GB RAM / Ubuntu 18.04 with rustc stable 1.42.0
As a hasher, Blake3 was used in this benchmark.

| Op. | DB used | #Entries | Time Measured | Std Error | per Op. |
| :----: | :-----: | -------: | :-----------: | :-------: | :------: |
| Insert | HashMap | 10 | 8.50 us | 0.01 us | 0.85 us |
| Insert | HashMap | 100 | 165.71 us | 0.14 us | 1.66 us |
| Insert | HashMap | 1000 | 2.32 ms | 0.01 ms | 2.32 us |
| Insert | HashMap | 10000 | 37.39 ms | 0.02 ms | 3.74 us |
| Get | HashMap | 10 | 7.82 us | 0.01 us | 0.78 us |
| Get | HashMap | 100 | 114.51 us | 0.14 us | 1.15 us |
| Get | HashMap | 1000 | 1.52 ms | 0.01 ms | 1.52 us |
| Get | HashMap | 10000 | 20.40 ms | 0.01 ms | 2.40 us |
| Remove | HashMap | 10 | 15.65 us | 0.01 us | 1.57 us |
| Remove | HashMap | 100 | 282.68 us | 0.09 us | 2.83 us |
| Remove | HashMap | 1000 | 4.23 ms | 0.01 ms | 4.23 us |
| Remove | HashMap | 10000 | 69.15 ms | 0.07 ms | 6.92 us |
| | | | | | |
| Insert | RocksDB | 10 | 34.80 us | 0.27 us | 3.48 us |
| Insert | RocksDB | 100 | 602.55 us | 2.72 us | 6.03 us |
| Insert | RocksDB | 1000 | 10.35 ms | 0.06 ms | 10.35 us |
| Insert | RocksDB | 10000 | 145.83 ms | 0.20 ms | 14.58 us |
| Get | RocksDB | 10 | 10.20 us | 0.06 us | 1.02 us |
| Get | RocksDB | 100 | 142.20 us | 0.79 us | 1.42 us |
| Get | RocksDB | 1000 | 1.74 ms | 0.01 ms | 1.74 us |
| Get | RocksDB | 10000 | 22.94 ms | 0.03 ms | 2.29 us |
| Remove | RocksDB | 10 | 58.33 us | 0.50 us | 5.83 us |
| Remove | RocksDB | 100 | 1.02 ms | 6.40 us | 10.20 us |
| Remove | RocksDB | 1000 | 20.22 ms | 0.10 ms | 20.22 us |
| Remove | RocksDB | 10000 | 394.95 ms | 1.46 ms | 39.50 us |

## Integration tests and benchmark

The following runs integration tests over all combinations of operations and tree types (i.e., all supported databases and hashers):

```bash
## Some tests are time consuming.
## --release is optional, but without it, it will take a longer time to complete the tests
$ cargo test --release
```

The following runs a Criterion-based micro-benchmark over all combinations of operations and tree types (all supported databases and hashers):

```bash
$ cargo bench
```

Macroscopic wall-clock benchmarks (with rather wider error bars) covering all tree types are also provided in examples/perf.rs:

```bash
$ cargo run --release --example perf
```

There are also examples describing how to use monotree:

```bash
$ cargo run --example basic
$ cargo run --example advanced
```

## Further improvement

monotree is a special case among generalized binary radix trees, which I'd like to call PoT (Power Of Two) radix trees.
If monotree were generalized so that the branching unit is an n-bit nibble, i.e., pow(2, n) children per node, there would be room for further performance improvement.

This generalization is now in progress.
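
As a rough illustration of the idea (a hypothetical helper, not monotree code), an n-bit branching unit means each step consumes n bits of the key and picks one of pow(2, n) children:

```rust
/// Hypothetical sketch (not part of monotree): read the i-th n-bit digit of a
/// 32-byte key, MSB first. With n = 1 this is plain bitwise branching; with
/// n = 4 each step picks one of 16 children, cutting the maximum depth
/// from 256 levels down to 64.
fn digit_at(key: &[u8; 32], i: usize, n: usize) -> usize {
    (0..n).fold(0, |acc, j| {
        let bit_index = i * n + j;
        let bit = (key[bit_index / 8] >> (7 - bit_index % 8)) & 1;
        (acc << 1) | bit as usize
    })
}

fn main() {
    let key = [0b1011_0001u8; 32];
    assert_eq!(digit_at(&key, 0, 1), 1);      // single-bit branching
    assert_eq!(digit_at(&key, 0, 4), 0b1011); // 4-bit (16-way) branching
}
```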

## More Examples

from examples/advanced.rs

```rust
use monotree::database::*;
use monotree::hasher::*;
use monotree::utils::*;
use monotree::*;

fn main() -> Result<()> {
// Init a monotree instance:
// manually select a db and a hasher as you prefer:
//     Monotree::<DATABASE, HASHER>::new(DB_PATH)
// where DATABASE = {MemoryDB, RocksDB, Sled}
//       HASHER = {Blake3, Blake2s, Blake2b, Sha2, Sha3}
let mut tree = Monotree::<RocksDB, Blake2b>::new("/tmp/monotree");

// It is natural that the tree root is initially 'None'
let root = None;

// Prepare 500 random pairs of key and leaf.
// random_hashes() gives Vec<Hash>
// where Hash is a fixed-length random array, [u8; HASH_LEN]
let n = 500;
let mut keys = random_hashes(n); // mutable: the keys are shuffled later in this example
let leaves = random_hashes(n);

// Insert a bunch of (key, leaf) entries into the tree.
// 'inserts()' looks quite similar to 'insert()', but performs batch insertion.
// It is much faster than calling 'insert()' repeatedly, since it relies on:
// (1) DB batch-write, (2) sorting keys before insertion, and (3) a memory cache.
let root = tree.inserts(root.as_ref(), &keys, &leaves)?;
assert_ne!(root, None);

// Similarly, there are methods 'gets()' and 'removes()' for batch use of
// 'get()' and 'remove()', respectively.
let result = tree.gets(root.as_ref(), &keys)?;
assert_eq!(result.len(), keys.len());

let root = tree.removes(root.as_ref(), &keys)?;
// Surely, the tree is empty again and the root is back to 'None'
assert_eq!(root, None);

/////////////////////////////////////////////////////////////////////
// `Merkle proof` section: verifying inclusion of data (inclusion proof)

// `Monotree` has a compressed representation, but it fully retains
// the properties of a Sparse Merkle Tree (SMT).
// Thus, a `non-inclusion proof` is quite straightforward: just walk down
// the tree with the given key (or path). If we cannot get a leaf,
// we can be sure that the leaf is not part of the tree
// (see the short check right after the re-insertion below).
// The process of an inclusion proof is as follows:

// random pre-insertion for Merkle proof test
let root = tree.inserts(root.as_ref(), &keys, &leaves)?;
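
// Illustrative sketch of a non-inclusion check (not part of the original example):
// 'absent_key' is a fresh random key that was never inserted, so walking down
// the tree with it finds no leaf, i.e., it is not part of the tree.
let absent_key = random_hash();
assert_eq!(tree.get(root.as_ref(), &absent_key)?, None);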

// pick a key among those inserted just before
let key = keys[99];

// generate the Merkle proof for the root and the key
let proof = tree.get_merkle_proof(root.as_ref(), &key)?;

// To verify the proof, you need to provide the matching hasher;
// the tree above was initialized with `Blake2b`
let hasher = Blake2b::new();

// get the leaf matching the key: this is where Merkle proof verification starts
let leaf = leaves[99];

// verify the Merkle proof using all those above
let verified = verify_proof(&hasher, root.as_ref(), &leaf, proof.as_ref());
assert_eq!(verified, true);

/////////////////////////////////////////////////////////////////////
// Usage examples with some functional tests
// Carefully trace the variable `root`, as it is frequently shadowed.

let mut tree = Monotree::default();
let mut root = None;
let hasher = Blake3::new();

//--- insert/get and gen_proof/verify_proof over iterator
for (i, (key, value)) in keys.iter().zip(leaves.iter()).enumerate() {
    // insert a key into tree
    root = tree.insert(root.as_ref(), key, value)?;

    // each insert yields a new root; cumulatively check all entries inserted so far
    for (k, v) in keys.iter().zip(leaves.iter()).take(i + 1) {
        // check if the key-value pair was correctly inserted so far
        assert_eq!(tree.get(root.as_ref(), k)?, Some(*v));

        // generate a Merkle proof for each key inserted so far
        let proof = tree.get_merkle_proof(root.as_ref(), k)?;

        // verify the Merkle proof for that key
        assert_eq!(
            verify_proof(&hasher, root.as_ref(), v, proof.as_ref()),
            true
        );
    }
}
assert_ne!(root, None);

//--- insert/get and gen_proof/verify_proof after each deletion of entry
for (i, (key, _)) in keys.iter().zip(leaves.iter()).enumerate() {
    assert_ne!(root, None);

    // assert that all other values are fine after each deletion
    for (k, v) in keys.iter().zip(leaves.iter()).skip(i) {
        // check in the same way as the previous example
        assert_eq!(tree.get(root.as_ref(), k)?, Some(*v));
        let proof = tree.get_merkle_proof(root.as_ref(), k)?;
        assert_eq!(
            verify_proof(&hasher, root.as_ref(), v, proof.as_ref()),
            true
        );
    }
    // delete a key and check if it was correctly removed
    root = tree.remove(root.as_ref(), key)?;
    assert_eq!(tree.get(root.as_ref(), key)?, None);
}
// must be back to the initial state of the tree
assert_eq!(root, None);

//--- faster way to insert/remove entries
// Now tree is empty, and root is back to `None` again
// Redo all those above using methods supporting batch operations
root = tree.inserts(root.as_ref(), &keys, &leaves)?;
assert_ne!(root, None);

// Even if we shuffle the keys before removing them,
shuffle(&mut keys);

// the result is the same: the root goes back to `None` and the tree is empty again.
// That's why the `root` serves as a "state index" of the tree.
root = tree.removes(root.as_ref(), &keys)?;
assert_eq!(root, None);

Ok(())
}
```