Sweep-bptree(under development)

In memory general purpose b+ tree, with argumented tree support.

Features(Why use this)

Install

toml [dependencies] sweep-bptree = "0.4"

Example

Map

``` rust use sweep_bptree::BPlusTreeMap;

let mut map = BPlusTreeMap::::new(); map.insert(1, 2);

asserteq!(map.get(&1).unwrap(), &2); assert!(map.get(&2).isnone()); ```

Argumented tree

```rust use sweepbptree::BPlusTreeMap; use sweepbptree::argument::count::Count;

// use Count as Argument to create a order statistic tree let mut map = BPlusTreeMap::::new(); map.insert(1, 2); map.insert(2, 3); map.insert(3, 4);

// get by order, time complexity is log(n) asserteq!(map.getbyargument(0), Some((&1, &2))); asserteq!(map.getbyargument(1), Some((&2, &3)));

// get the offset for key

// 0 does not exists asserteq!(map.rankby_argument(&0), Err(0));

asserteq!(map.rankbyargument(&1), Ok(0)); asserteq!(map.rankbyargument(&2), Ok(1)); asserteq!(map.rankby_argument(&3), Ok(2));

// 4 does not exists asserteq!(map.rankby_argument(&4), Err(3)); ```

Set

```rust use sweep_bptree::BPlusTreeSet;

let mut set = BPlusTreeSet::::new(); set.insert(1); assert!(set.contains(&1)); set.remove(&1); assert!(!set.contains(&1)); ```

Tree

For adanced usage, use BPlusTree directly.

crud

```rust use sweep_bptree::{BPlusTree, NodeStoreVec};

// create a nodestore with u64 as key, (f64, f64) as value let mut nodestore = NodeStoreVec::

// insert new value assert!(tree.insert(3, (0., 0.)).is_none());

// update by insert again assert_eq!(tree.insert(3, (1., 1.)).unwrap(), (0., 0.));

// remove the value assert_eq!(tree.remove(&3).unwrap(), (1.0, 1.0));

assert!(tree.is_empty()); ```

Create multiple owned cursors

``` rust use sweepbptree::{BPlusTree, NodeStoreVec}; let mut nodestore = NodeStoreVec::

for i in 0..100 { tree.insert(i, (i as f64, i as f64)); }

let cursor0 = tree.cursorfirst().unwrap(); let cursor1 = tree.cursorfirst().unwrap();

// remove the key 0 tree.remove(&0);

// cursor's value should be empty now assert!(cursor0.value(&tree).isnone());

// move to next let cursornext = cursor0.next(&tree).unwrap(); asserteq!(*cursornext.key(), 1); asserteq!(cursornext.value(&tree).unwrap().0, 1.);

// insert a new value to key 0 tree.insert(0, (100., 100.));

// now cursor1 should retrieve the new value asserteq!(cursor_1.value(&tree).unwrap().0, 100.); ```

Unsafe

This crate utilize unsafe code to improve performance. Mostly for memory initialize, copy and move operations. It is tested with fuzzy testing.

Benchmark

Data collected on macbook pro m1. The Key type is Point { x: f64, y: f64}

```bash

try it

cargo bench

generate the table

critcmp > benchmarkdata python3 rendertable.py

```

Advantages

Disadvantages

Point benchmark data

Point is a struct with 2 f64 fields, with very cheap clone and cmp cost.

| name | size | btree | bptree | speed factor | | - | - | - | - | - | | clone | 1000 | 12.3±0.08µs | 5.6±0.21µs | 2.20 | | clone | 10000 | 133.2±1.60µs | 60.3±0.53µs | 2.21 | | clone | 100000 | 2.5±0.03ms | 1073.9±93.60µs | 2.33 | | cursor | 1000 | - | 4.6±0.03µs | - | | cursor | 10000 | - | 46.8±0.18µs | - | | cursor | 100000 | - | 470.0±2.39µs | - | | drop | 1000 | 7.9±0.10µs | 1093.1±17.60ns | 7.23 | | drop | 10000 | 84.7±1.08µs | 12.1±0.32µs | 7.00 | | drop | 100000 | 1799.9±23.56µs | 295.2±11.19µs | 6.10 | | intoiter | 1000 | 8.1±0.58µs | 1134.7±13.43ns | 7.14 | | intoiter | 10000 | 84.9±0.71µs | 12.1±0.10µs | 7.02 | | intoiter | 100000 | 1825.7±79.25µs | 254.9±7.95µs | 7.16 | | iter | 1000 | 844.2±8.44ns | 416.8±5.02ns | 2.03 | | iter | 10000 | 11.9±0.29µs | 4.7±0.03µs | 2.53 | | iter | 100000 | 216.5±0.80µs | 54.8±0.13µs | 3.95 | | orderedget | 1000 | 26.8±1.08µs | 10.0±0.10µs | 2.68 | | orderedget | 10000 | 418.7±2.83µs | 107.5±0.53µs | 3.89 | | orderedget | 100000 | 4.7±0.01ms | 1126.8±4.21µs | 4.17 | | orderedremove | 1000 | 32.5±0.84µs | 84.1±0.80µs | 0.39 | | orderedremove | 10000 | 344.9±3.55µs | 873.5±0.72µs | 0.39 | | orderedremove | 100000 | 4.5±0.47ms | 9.6±0.20ms | 0.47 | | randomget | 1000 | 25.1±1.45µs | 18.6±1.77µs | 1.35 | | randomget | 10000 | 664.3±50.24µs | 956.8±6.58µs | 0.69 | | randomget | 100000 | 9.3±0.01ms | 13.4±0.02ms | 0.69 | | randomremove | 1000 | 61.6±4.01µs | 89.6±1.39µs | 0.69 | | randomremove | 10000 | 912.4±0.84µs | 1411.6±2.83µs | 0.65 | | random_remove | 100000 | 12.8±0.90ms | 18.7±0.04ms | 0.68 |

Point benchmark data with bulk_load

| name | size | type | time | | - | - | - | - | | orderedinsert | 1000 | bptree | 26.9±0.23µs | | | bptreebulk | 6.1±0.08µs | | | btree | 50.8±1.80µs | orderedinsert | 10000 | bptree | 358.3±10.02µs | | | bptreebulk | 66.0±0.61µs | | | btree | 778.5±13.34µs | orderedinsert | 100000 | bptree | 4.7±0.02ms | | | bptreebulk | 916.1±20.79µs | | | btree | 8.8±0.04ms | randominsert | 1000 | bptree | 81.3±1.15µs | | | bptreesortload | 43.2±0.29µs | | | btree | 55.5±1.36µs | randominsert | 10000 | bptree | 1381.6±16.30µs | | | bptreesortload | 698.6±7.84µs | | | btree | 962.6±9.49µs | randominsert | 100000 | bptree | 18.3±0.06ms | | | bptreesort_load | 9.5±0.01ms | | | btree | 14.0±0.08ms

String benchmark data

String type is relative heavy for clone, cmp and drop.

For clone and drop, bptree actually needs to do more work, but still faster. For randomget and randomremove, bptree still slower, but the margin is smaller.

| name | size | btree | bptree | speed factor | | - | - | - | - | - | | clone | 1000 | 34.4±0.41µs | 27.6±0.55µs | 1.25 | | clone | 10000 | 431.6±10.88µs | 307.0±8.21µs | 1.41 | | clone | 100000 | 7.2±0.07ms | 5.2±0.02ms | 1.38 | | cursor | 1000 | - | 30.2±0.16µs | - | | cursor | 10000 | - | 306.2±0.89µs | - | | cursor | 100000 | - | 3.5±0.64ms | - | | drop | 1000 | 21.5±1.03µs | 14.7±0.06µs | 1.46 | | drop | 10000 | 234.5±2.83µs | 146.6±2.22µs | 1.60 | | drop | 100000 | 4.5±0.06ms | 2.3±0.02ms | 1.96 | | intoiter | 1000 | 20.5±0.19µs | 13.2±0.82µs | 1.55 | | intoiter | 10000 | 235.2±7.75µs | 147.2±1.07µs | 1.60 | | intoiter | 100000 | 4.4±0.07ms | 2.3±0.02ms | 1.91 | | iter | 1000 | 848.8±6.17ns | 405.1±5.59ns | 2.10 | | iter | 10000 | 18.3±0.42µs | 5.1±0.03µs | 3.59 | | iter | 100000 | 235.7±6.59µs | 55.1±0.40µs | 4.28 | | orderedget | 1000 | 221.5±2.02µs | 213.1±1.81µs | 1.04 | | orderedget | 10000 | 2.4±0.00ms | 2.1±0.02ms | 1.14 | | orderedget | 100000 | 25.8±0.07ms | 22.2±0.02ms | 1.16 | | orderedremove | 1000 | 231.5±2.46µs | 311.2±4.56µs | 0.74 | | orderedremove | 10000 | 2.4±0.00ms | 3.2±0.02ms | 0.75 | | orderedremove | 100000 | 26.2±0.67ms | 34.5±0.92ms | 0.76 | | randomget | 1000 | 63.8±0.84µs | 71.2±1.82µs | 0.90 | | randomget | 10000 | 1156.2±5.75µs | 1464.0±11.94µs | 0.79 | | randomget | 100000 | 23.0±0.68ms | 26.0±0.32ms | 0.88 | | randomremove | 1000 | 126.1±0.93µs | 152.2±2.01µs | 0.83 | | randomremove | 10000 | 1764.6±30.54µs | 2.2±0.00ms | 0.80 | | random_remove | 100000 | 30.7±0.11ms | 37.2±0.19ms | 0.83 |

String benchmark data with bulk_load

| name | size | type | time | | - | - | - | - | | orderedinsert | 1000 | bptree | 220.3±2.30µs | | | bptreebulk | 173.1±2.14µs | | | btree | 268.3±11.22µs | orderedinsert | 10000 | bptree | 2.4±0.07ms | | | bptreebulk | 1736.9±16.15µs | | | btree | 3.0±0.00ms | orderedinsert | 100000 | bptree | 25.5±0.11ms | | | bptreebulk | 17.8±0.13ms | | | btree | 33.0±0.11ms | randominsert | 1000 | bptree | 135.6±1.39µs | | | bptreesortload | 75.4±1.55µs | | | btree | 122.8±1.82µs | randominsert | 10000 | bptree | 2.2±0.00ms | | | bptreesortload | 1449.7±14.30µs | | | btree | 1807.4±18.02µs | randominsert | 100000 | bptree | 36.5±0.53ms | | | bptreesort_load | 20.8±0.46ms | | | btree | 30.9±0.82ms

Contribution

Contributions are welcome! Please open an issue or a PR. Currently, documentation, feature parity with std::collections::BTreeMap, and performance improvements are the main areas of interest.