golomb-set

A Golomb Coded Set implementation in Rust

crates.io docs.rs Build Status

Giovanni Bajo's blog post as well as their Python and C++ implementations were a huge help when writing this library, found here and here respectively.

A GCS is a probabilistic data structure which is typically smaller than a bloom filter but slower to query. A sorted list of differences between samples of a random distribution will roughly have a geometric distribution, which makes for an optimal prefix for Golomb coding. Since a good hash will be randomly distributed, this encoding makes for efficient storage of hashed values.

Example

```rust use gcs::GcsBuilder; use md5::Md5;

// Create a GCS where when 3 items are stored in the set, the // probability of a false positive will be 1/2^5 let mut builder = GcsBuilder::new(3, 5);

// Insert the MD5 hashes of "alpha" and "bravo" builder.insert::(b"alpha")?; builder.insert::(b"bravo")?;

let gcs = builder.build();

assert!(gcs.contains(b"alpha")); assert!(gcs.contains(b"bravo")); assert!(!gcs.contains(b"charlie")); ```