Bitcode

Build Documentation

A bitwise encoder/decoder similar to bincode.

Motivation

We noticed relatively low entropy in data serialized with bincode. This library attempts to shrink the serialized size without sacrificing too much speed (as would be the case with compression).

Bitcode does not attempt to have a stable format, so we are free to optimize it.

Comparison with bincode

Features (serde)

Additional features with #[derive(bitcode::Encode, bitcode::Decode)]

Limitations

Benchmarks vs. bincode and postcard

Speed (nanoseconds)

| Format | Serialize | Deserialize | |------------------|-----------|-------------| | Bitcode (derive) | 6,312 | 25,370 | | Bitcode (serde) | 10,015 | 41,223 | | Bincode | 8,247 | 23,317 | | Bincode (varint) | 9,872 | 30,138 | | Postcard | 13,836 | 31,453 |

Aims to be as fast as bincode and postcard. See rust serialization benchmark for more benchmarks.

Size (bits)

| Type | Bitcode (derive) | Bitcode (serde) | Bincode | Bincode (varint) | Postcard | |---------------------|------------------|-----------------|---------|------------------|----------| | bool | 1 | 1 | 8 | 8 | 8 | | u8 | 8 | 8 | 8 | 8 | 8 | | i8 | 8 | 8 | 8 | 8 | 8 | | u16 | 16 | 16 | 16 | 8-24 | 8-24 | | i16 | 16 | 16 | 16 | 8-24 | 8-24 | | u32 | 32 | 32 | 32 | 8-40 | 8-40 | | i32 | 32 | 32 | 32 | 8-40 | 8-40 | | u64 | 64 | 64 | 64 | 8-72 | 8-80 | | i64 | 64 | 64 | 64 | 8-72 | 8-80 | | f32 | 32 | 32 | 32 | 32 | 32 | | f64 | 64 | 64 | 64 | 64 | 64 | | char | 8-32 | 8-32 | 8-32 | 8-32 | 16-40 | | Option<()> | 1 | 1 | 8 | 8 | 8 | | Result<(), ()> | 1 | 1-3 | 32 | 8 | 8 | | enum { A, B, C, D } | 2 | 1-5 | 32 | 8 | 8 |

| Value | Bitcode (derive) | Bitcode (serde) | Bincode | Bincode (varint) | Postcard | |---------------------|------------------|-----------------|---------|------------------|----------| | [true; 4] | 4 | 4 | 32 | 32 | 32 | | vec![(); 0] | 1 | 1 | 64 | 8 | 8 | | vec![(); 1] | 3 | 3 | 64 | 8 | 8 | | vec![(); 256] | 17 | 17 | 64 | 24 | 16 | | vec![(); 65536] | 33 | 33 | 64 | 40 | 24 | | "" | 1 | 1 | 64 | 8 | 8 | | "abcd" | 37 | 37 | 96 | 40 | 40 | | "abcd1234" | 71 | 71 | 128 | 72 | 72 |

Random Struct Benchmark

The following data structure was used for benchmarking. ```rust struct Data { x: Option, y: Option, z: u16, s: String, e: DataEnum, }

enum DataEnum { Bar, Baz(String), Foo(Option), } `` In the table below, **Size (bytes)** is the average size of a randomly generatedData` struct. Zero Bytes are the percentage of bytes that are 0 in the output. If the result contains a large percentage of zero bytes, that is a sign that it could be compressed more.

| Format | Size (bytes) | Zero Bytes | |------------------------|--------------|------------| | Bitcode (derive) | 6.5 | 0.25% | | Bitcode (serde) | 6.7 | 0.19% | | Bincode | 20.3 | 65.9% | | Bincode (varint) | 10.9 | 27.7% | | Bincode (LZ4) | 9.9 | 13.9% | | Bincode (Deflate Fast) | 8.4 | 0.88% | | Bincode (Deflate Best) | 7.8 | 0.29% | | Postcard | 10.7 | 28.3% | | ideal (max entropy) | | 0.39% |

A note on enums

When using serde to serialize enums. Enum variants are encoded such that variants declared earlier will occupy fewer bits. It is advantageous to sort variant declarations from most common to least common.

This limitation can be avoided by using bitcode's derive macros.

Testing

Fuzzing

cargo install cargo-fuzz cargo fuzz run fuzz

32-bit

sudo apt install gcc-multilib rustup target add i686-unknown-linux-gnu cargo test --target i686-unknown-linux-gnu

Acknowledgement

Some test cases were derived from bincode (see comment in tests.rs).

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.