nebari - noun - the surface roots that flare out from the base of a bonsai tree
This crate provides the Roots type, which is the transactional storage layer for BonsaiDb. It is loosely inspired by Couchstore.
This crate blocks the current thread when accessing the filesystem. If you are looking for an async-ready database, BonsaiDb is our vision of an async-aware database built atop Nebari.
This crate is alpha. While its format is considered stable, there may be bugs that could lead to data loss. Please have a good backup strategy while using this crate.
Inserting a key-value pair in an on-disk tree with full revision history:
```rust
use nebari::{
    tree::{Root, Versioned},
    Config,
};

// Open (or create) a database inside a temporary folder.
let database_folder = tempfile::tempdir().unwrap();
let roots = Config::default_for(database_folder.path())
    .open()
    .unwrap();
// Open a tree that keeps full revision history, then store a value.
let tree = roots.tree(Versioned::tree("a-tree")).unwrap();
tree.set("hello", "world").unwrap();
```
For more examples, check out nebari/examples/.
Nebari exposes multiple levels of functionality. The lowest level functionality is the TreeFile. A TreeFile is a key-value store that uses an append-only file format for its implementation.
Using TreeFiles and a transaction log, Roots enables ACID-compliant, multi-tree transactions.
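To make the append-only idea concrete, here is a minimal in-memory sketch. It is not Nebari's file format or API: AppendOnlyLog, Record, and their methods are hypothetical names, and a Vec stands in for the file. Nebari itself uses a B-Tree rather than the linear scan shown here. The point is only that writes never modify earlier data, and reads resolve to the newest record for a key.

```rust
/// One appended record. `None` marks a deletion (a tombstone).
#[derive(Debug, Clone, PartialEq)]
struct Record {
    key: String,
    value: Option<String>,
}

/// Hypothetical stand-in for an append-only file: records are only pushed,
/// never edited in place.
#[derive(Default)]
struct AppendOnlyLog {
    records: Vec<Record>,
}

impl AppendOnlyLog {
    /// Appends a new value for `key`; older records stay untouched.
    fn set(&mut self, key: &str, value: &str) {
        self.records.push(Record {
            key: key.to_string(),
            value: Some(value.to_string()),
        });
    }

    /// Appends a tombstone instead of erasing earlier records.
    fn remove(&mut self, key: &str) {
        self.records.push(Record {
            key: key.to_string(),
            value: None,
        });
    }

    /// The newest record for a key wins, so scan from the end.
    fn get(&self, key: &str) -> Option<&str> {
        self.records
            .iter()
            .rev()
            .find(|record| record.key == key)
            .and_then(|record| record.value.as_deref())
    }
}

fn main() {
    let mut log = AppendOnlyLog::default();
    log.set("hello", "world");
    log.set("hello", "nebari");
    assert_eq!(log.get("hello"), Some("nebari"));

    log.remove("hello");
    assert_eq!(log.get("hello"), None);

    // All three writes are still present in the log.
    assert_eq!(log.records.len(), 3);
}
```

Because nothing is overwritten, an interrupted write can only ever damage the tail of the log, which is what makes the recovery story simple.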
Each tree supports:
- TreeFile::modify(), which allows operating on one or more keys and performing various operations.
- The Vault trait, which allows you to bring your own encryption, compression, or other functionality to this format. Each independently-addressable chunk of data that is written to the file passes through the vault.
- A VersionedTreeRoot, which stores information that allows scanning old revision information. Or, if you want to avoid the extra IO, use the UnversionedTreeRoot, which only stores the information needed to retrieve the latest data in the file.
- ACID-compliance:
  - Every operation on a TreeFile is done atomically. Operation::CompareSwap can be used to perform atomic operations that require evaluating the currently stored value.
  - Transaction IDs are recorded in the tree headers. When restoring from disk, the transaction IDs are verified with the transaction log. Because of the append-only format, if we encounter a transaction that wasn't recorded, we can continue scanning the file to recover the previous state. We do this until we find a successfully committed transaction.
This process is much simpler than most database implementations due to the simple guarantees that append-only formats provide.
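The recovery scan described above can be sketched in a few lines. This is a simplified illustration, not Nebari's implementation: the Header struct, its fields, and the last_committed function are hypothetical, a slice stands in for the headers found in the tree file, and a HashSet stands in for the transaction log.

```rust
use std::collections::HashSet;

/// Hypothetical tree header: where it sits in the file and which
/// transaction wrote it.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Header {
    offset: u64,
    transaction_id: u64,
}

/// Scans headers from newest to oldest and returns the first one whose
/// transaction ID was recorded in the transaction log.
fn last_committed(headers: &[Header], committed: &HashSet<u64>) -> Option<Header> {
    headers
        .iter()
        .rev()
        .find(|header| committed.contains(&header.transaction_id))
        .copied()
}

fn main() {
    let headers = vec![
        Header { offset: 0, transaction_id: 1 },
        Header { offset: 4096, transaction_id: 2 },
        // Written to the tree file, but transaction 3 never reached the
        // transaction log, so it must be ignored on recovery.
        Header { offset: 8192, transaction_id: 3 },
    ];
    let committed: HashSet<u64> = vec![1u64, 2].into_iter().collect();

    let recovered = last_committed(&headers, &committed);
    assert_eq!(
        recovered,
        Some(Header { offset: 4096, transaction_id: 2 })
    );
}
```

Since older headers are never overwritten, skipping an unrecorded transaction is enough to land on the previous consistent state.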
@ecton wasn't a database engineer before starting this project, and depending on your viewpoint may still not be considered a database engineer. Implementing ACID-compliance is not something that should be attempted lightly.
Creating ACID-compliance with append-only formats is much easier to achieve, however, as long as you can guarantee two things:
The B-Tree implementation in Nebari is designed to offer those exact guarantees.
The major downside of append-only formats is that deleted data isn't cleaned up until a maintenance process occurs: compaction. This process rewrites the file's contents, skipping over entries that are no longer alive. This process can happen without blocking the file from being operated on, but it does introduce IO overhead during the operation.
Nebari provides APIs that perform compaction, but currently delegates scheduling and automation to consumers of this library.
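As a rough illustration of what compaction does, the sketch below rewrites a log of entries and keeps only those that are still alive. It is an assumption-laden toy, not Nebari's compaction API: the Entry type and the compact function are hypothetical, and a Vec stands in for the file on disk.

```rust
use std::collections::BTreeMap;

/// An appended entry: `Some(value)` is a write, `None` is a deletion.
type Entry = (String, Option<String>);

/// Rewrites `old` into a new log containing only entries that are still
/// alive: the newest value of each key that has not been deleted.
fn compact(old: &[Entry]) -> Vec<Entry> {
    // Later entries overwrite earlier ones for the same key.
    let mut latest: BTreeMap<&str, Option<&str>> = BTreeMap::new();
    for (key, value) in old {
        latest.insert(key.as_str(), value.as_deref());
    }

    // Drop keys whose newest entry is a deletion.
    latest
        .into_iter()
        .filter_map(|(key, value)| {
            value.map(|v| (key.to_string(), Some(v.to_string())))
        })
        .collect()
}

fn main() {
    let old = vec![
        ("a".to_string(), Some("1".to_string())),
        ("b".to_string(), Some("2".to_string())),
        ("a".to_string(), Some("3".to_string())), // overwrites "a"
        ("b".to_string(), None),                  // deletes "b"
    ];

    let compacted = compact(&old);
    assert_eq!(compacted, vec![("a".to_string(), Some("3".to_string()))]);
}
```

The old log is only read, never modified, which is why a real implementation can keep serving requests from it while the compacted copy is being written.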
This project, like all projects from Khonsu Labs, is open-source. This repository is available under the MIT License or the Apache License 2.0.
To learn more about contributing, please see CONTRIBUTING.md.