Ordered set and map data structures via an arena-based scapegoat tree (memory-efficient, self-balancing binary search tree).
This library is #![no_std]
compatible by default, strictly #![forbid(unsafe_code)]
, and verified using differential fuzzing.
Three APIs:
Strives for two properties:
Maximal safety: strong memory safety guarantees.
unsafe
(no raw pointer dereference, etc.).debug_assert!
for logical invariants exercised in testing.Rc<RefCell<T>>
's runtime check).Minimal footprint: small binary with low resource use.
Other features:
Ord
trait.SGMap
non-exhaustive API example (would work identically for std::collections::BTreeMap
):
```rust use scapegoat::SGMap;
let mut example = SGMap::new();
example.insert(3, String::from("the")); example.insert(2, String::from("don't blame")); example.insert(1, String::from("Please")); example.insert(4, String::from("borrow checker"));
asserteq!(
example.iter().map(|(, v)| v).collect::
assert_eq!(example[&3], "the");
let pleasetuple = example.popfirst().unwrap(); asserteq!(pleasetuple, (1, String::from("Please")));
example.insert(5, String::from("! :P"));
let dontblame = example.getmut(&2).unwrap(); dontblame.remove(0); dontblame.insert(0, 'D');
asserteq!(
example.intoiter().map(|(_, v)| v).collect::
The maximum number of stack-stored elements (set) or key-value pairs (map/tree) is determined at compile-time, via the environment variable SG_MAX_STACK_ELEMS
.
Valid values are in the range [0, 32]
and powers of 2 up to 2048
.
For example, to store up to 1024
items on the stack:
bash
export SG_MAX_STACK_ELEMS=1024
cargo build --release
Please note:
If the SG_MAX_STACK_ELEMS
environment variable is not set, it will default to 2048
.
For any system with dynamic (heap) memory: the first SG_MAX_STACK_ELEMS
elements are stack-allocated and the remainder will be automatically heap-allocated.
For embedded systems without dynamic memory: SG_MAX_STACK_ELEMS
is a hard maximum - attempting to insert beyond this limit will cause a panic.
high_assurance
to ensure error handling and avoid panic (see below).high_assurance
FeatureFor embedded use cases prioritizing robustness, enabling the high_assurance
feature makes two changes:
Front-end, API Tweak: insert
and append
APIs now return Result
. Err
indicates stack storage is already at maximum capacity, so caller must handle. No heap use, no panic potential on insert.
Back-end, Integer Packing: Because the fixed/max size of the stack arena is known, indexing integers (metadata stored at every node!) can be size-optimized. This memory micro-optimization honors the original design goals of the scapegoat data structure.
That second change is a subtle but interesting one. Example of packing saving 53% (but in reality only 61 KB) of RAM usage:
```rust use scapegoat::SGMap; use core::mem::size_of;
// If you're on a 64-bit system, you can compile-time check the below numbers yourself!
// Just do:
//
// $ cargo test --doc
// $ cargo test --doc --features="high_assurance"
//
// One command per set of cfg
macros below.
// Internally, this compile time struct packing is done with the smallnum
crate:
// https://crates.io/crates/smallnum
// This code assumes SG_MAX_STACK_ELEMS == 2048
(default)
let temp: SGMap
// Without packing
{
asserteq!(sizeof::
// With packing
{
asserteq!(sizeof::
This library has three dependencies, each of which have no dependencies of their own (e.g. exactly three total dependencies).
smallvec
- !#[no_std]
compatible Vec
alternative. Used in Mozilla's Servo browser engine.micromath
- !#[no_std]
, #![forbid(unsafe_code)]
floating point approximations.smallnum
- !#[no_std]
, #![forbid(unsafe_code)]
integer packing.This project is an exercise in safe data structure design.
It's not as mature, fast, or memory efficient as the standard library's BTreeMap
/BTreeSet
.
It does, however, offer:
Best-effort Compatibility: APIs are a subset of BTreeMap
's/BTreeSet
's, making it a somewhat "drop-in" replacement for !#[no_std]
systems. Please open an issue if an API you need isn't yet supported!
Dynamic Verification: Coverage-guided differential fuzzing is used to verify that this implementation is logically equivalent and equally reliable.