Front-coded string dictionary: Fast and compact indexed string set


This is a Rust library that stores an indexed set of strings and supports fast queries. The data structure is the plain front-coded string dictionary described in Martínez-Prieto et al., Practical compressed string dictionaries, INFOSYS 2016.
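To give a rough idea of what front coding means, here is a minimal, self-contained sketch. It is not this library's implementation or memory layout: the function names `lcp`, `front_encode`, and `front_decode`, the `bucket_size` parameter, and the `(shared, suffix)` pair representation are all assumptions made for illustration. The idea is that sorted keys tend to share long prefixes with their predecessors, so each key only needs to store the length of the shared prefix plus the remaining suffix, with a full key stored at the start of every bucket so that decoding never has to replay more than one bucket.

```rust
/// Length of the longest common prefix of two byte strings.
fn lcp(a: &[u8], b: &[u8]) -> usize {
    a.iter().zip(b.iter()).take_while(|(x, y)| x == y).count()
}

/// Front-codes sorted keys in buckets of `bucket_size` keys (illustrative only).
/// The first key of each bucket is stored in full (shared length 0); every other
/// key stores the prefix length shared with its predecessor and the rest as a suffix.
fn front_encode(keys: &[&[u8]], bucket_size: usize) -> Vec<(usize, Vec<u8>)> {
    assert!(bucket_size > 0, "bucket_size must be positive");
    let mut out = Vec::with_capacity(keys.len());
    for (i, key) in keys.iter().enumerate() {
        let shared = if i % bucket_size == 0 {
            0
        } else {
            lcp(keys[i - 1], key)
        };
        out.push((shared, key[shared..].to_vec()));
    }
    out
}

/// Decodes the `i`-th key by replaying its bucket from the bucket's first key.
fn front_decode(encoded: &[(usize, Vec<u8>)], bucket_size: usize, i: usize) -> Vec<u8> {
    assert!(bucket_size > 0, "bucket_size must be positive");
    let start = i - i % bucket_size;
    let mut key = Vec::new();
    for (shared, suffix) in &encoded[start..=i] {
        // Keep the prefix shared with the previous key, then append the suffix.
        key.truncate(*shared);
        key.extend_from_slice(suffix);
    }
    key
}
```

With the keys `ICDM`, `ICML`, `SIGIR`, `SIGKDD`, `SIGMOD` and a bucket size of 4, `ICML` would be stored as a shared length of 2 plus the suffix `ML`, and `SIGKDD` as a shared length of 3 plus `KDD`. A smaller bucket size speeds up decoding at the cost of storing more full keys.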


Features

Example

```rust
use fcsd::Set;

// Input string keys should be sorted and unique.
let keys = ["ICDM", "ICML", "SIGIR", "SIGKDD", "SIGMOD"];

// Builds an indexed set.
let set = Set::new(keys).unwrap();
assert_eq!(set.len(), keys.len());

// Gets indexes associated with given keys.
let mut locator = set.locator();
assert_eq!(locator.run(b"ICML"), Some(1));
assert_eq!(locator.run(b"SIGMOD"), Some(4));
assert_eq!(locator.run(b"SIGSPATIAL"), None);

// Decodes string keys from given indexes.
let mut decoder = set.decoder();
assert_eq!(decoder.run(0), b"ICDM".to_vec());
assert_eq!(decoder.run(3), b"SIGKDD".to_vec());

// Enumerates indexes and keys stored in the set.
let mut iter = set.iter();
assert_eq!(iter.next(), Some((0, b"ICDM".to_vec())));
assert_eq!(iter.next(), Some((1, b"ICML".to_vec())));
assert_eq!(iter.next(), Some((2, b"SIGIR".to_vec())));
assert_eq!(iter.next(), Some((3, b"SIGKDD".to_vec())));
assert_eq!(iter.next(), Some((4, b"SIGMOD".to_vec())));
assert_eq!(iter.next(), None);

// Enumerates indexes and keys starting with a prefix.
let mut iter = set.predictive_iter(b"SIG");
assert_eq!(iter.next(), Some((2, b"SIGIR".to_vec())));
assert_eq!(iter.next(), Some((3, b"SIGKDD".to_vec())));
assert_eq!(iter.next(), Some((4, b"SIGMOD".to_vec())));
assert_eq!(iter.next(), None);

// Serialization / Deserialization
let mut data = Vec::<u8>::new();
set.serialize_into(&mut data).unwrap();
assert_eq!(data.len(), set.size_in_bytes());
let other = Set::deserialize_from(&data[..]).unwrap();
assert_eq!(data.len(), other.size_in_bytes());
```
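The example assumes its keys are already sorted and unique. When keys come from an unordered source, they can be normalized first. The sketch below uses only the standard library; `sorted_unique_keys` and `is_sorted_unique` are hypothetical helper names and not part of this crate. It also assumes that byte-wise lexicographic order is the order expected for keys, which is how Rust compares `String` values.

```rust
/// Sorts keys in byte-wise lexicographic order and removes duplicates.
/// Hypothetical helper, not part of the fcsd API.
fn sorted_unique_keys(mut keys: Vec<String>) -> Vec<String> {
    keys.sort_unstable();
    keys.dedup();
    keys
}

/// Returns true if every key is strictly greater than its predecessor,
/// i.e. the keys are sorted and contain no duplicates.
/// Hypothetical helper, not part of the fcsd API.
fn is_sorted_unique(keys: &[String]) -> bool {
    keys.windows(2).all(|w| w[0].as_bytes() < w[1].as_bytes())
}
```

The result of `sorted_unique_keys` could then be handed to the set builder in place of a hand-sorted array. Checking with `is_sorted_unique` beforehand is a cheap way to surface ordering problems with a clear message instead of unwrapping a build error.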

Todo

Licensing

This library is free software provided under the MIT License.