This crate implements a Merkle Search Tree as described in the 2019 paper Merkle Search Trees: Efficient State-Based CRDTs in Open Networks by Alex Auvolat & François Taïani.[^cite]
When paired with CRDT-based value types, a Merkle Search Tree can act as an efficient anti-entropy primitive, allowing independent replicas to concurrently modify data within the tree with deterministic and eventual convergence across all peers.
See [MerkleSearchTree
] documentation.
```rust use merklesearchtree::{MerkleSearchTree, diff::diff};
// Initialise a tree using the default configuration, appropriate for most uses. let mut node_a = MerkleSearchTree::default();
// Upsert values into the tree. // // For the MST construct to be a CRDT itself, the values stored into the tree // must also be CRDTs (or at least, have deterministic conflict resolution). // Here the MST is used as an add-only set (a trivial CRDT) by using () as the // key values. nodea.upsert("bananas", &()); nodea.upsert("plátanos", &());
// Another node has differing keys. let mut nodeb = MerkleSearchTree::default(); nodeb.upsert("donkey", &());
// The MST root hash can be used as an efficient consistency check (comparison // is O(1) in space and time). // // In this case, both trees are inconsistent w.r.t each other, which is // indicated by their differing root hashes. assertne!(nodea.roothash(), nodeb.root_hash());
// Generate compact summaries of the MST content, suitable for transmission over // the network, and use it to compute the diff between two trees. let diffrange = diff( nodeb.serialisepageranges().unwrap().intoiter(), nodea.serialisepageranges().unwrap().into_iter(), );
// In this case, node B can obtain the missing/differing keys in node A by // requesting keys within the computed diff range (inclusive): assertmatches::assertmatches!(diffrange.asslice(), [range] => { asserteq!(range.start(), &"bananas"); asserteq!(range.end(), &"plátanos"); }); ```
Operations against a Merkle Search Tree are fast, executing against millions/billions of keys per second:
| Key Count | Insert All Keys | Generate Root | Serialise | Diff (consistent) | Diff (inconsistent) | | ------------ | --------------- | ------------- | --------- | ----------------- | ------------------- | | 100 keys | 7µs | 13µs | 98ns | 188ns | 525ns | | 1,000 keys | 92µs | 131µs | 889ns | 644ns | 6µs | | 10,000 keys | 1356µs | 1317µs | 10µs | 5µs | 71µs | | 100,000 keys | 17ms | 9ms | 112µs | 29µs | 515µs |
The above measurements capture the single-threaded performance of operations against a tree containing varying numbers of keys on a M1 MacBook Pro.
The benchmarks to generate these numbers are included in this repo (run cargo
bench
).
This crate is extensively tested using randomised fuzzing & property testing to validate correctness, and ensure no panics occur in release builds.
State-Based CRDTs in Open Networks. SRDS 2019 - 38th IEEE International
Symposium on Reliable Distributed Systems, Oct 2019, Lyon, France. pp.1-10,
⟨10.1109/SRDS.2019.00032⟩. ⟨hal-02303490⟩