serde_columnar

serde_columnar is an ergonomic columnar storage encoding crate that offers forward and backward compatibility.

It encodes the data you need to serialize and deserialize into a binary columnar format, using nothing more than simple macro annotations.
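To make "columnar storage" concrete, here is a minimal, standard-library-only sketch of regrouping rows by field. It is not how serde_columnar is implemented; `Person`, `PersonColumns`, `to_columns`, and `to_rows` are hypothetical names used purely for illustration.

```rust
/// One logical record, as an application would normally hold it (row layout).
/// Hypothetical type, not part of serde_columnar.
#[derive(Debug, Clone, PartialEq)]
struct Person {
    id: u64,
    married: bool,
}

/// The same records regrouped by field (columnar layout).
#[derive(Debug, Default, PartialEq)]
struct PersonColumns {
    ids: Vec<u64>,
    married: Vec<bool>,
}

/// Transpose a slice of rows into one vector per field.
fn to_columns(rows: &[Person]) -> PersonColumns {
    let mut cols = PersonColumns::default();
    for row in rows {
        cols.ids.push(row.id);
        cols.married.push(row.married);
    }
    cols
}

/// Rebuild the rows by zipping the columns back together.
fn to_rows(cols: &PersonColumns) -> Vec<Person> {
    cols.ids
        .iter()
        .zip(&cols.married)
        .map(|(&id, &married)| Person { id, married })
        .collect()
}
```

Grouping values of the same field together is what lets a per-column compression strategy find runs and small deltas that would be hidden in a row-by-row layout.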

For a more detailed introduction, please refer to this Notion link: Serde-Columnar.

🚧 This crate is a work in progress and not yet stable; it should not be used in production environments.

Features 🚀

serde_columnar comes with several remarkable features:

How to use

Install

```shell
cargo add serde_columnar
```

Or edit your Cargo.toml and add serde_columnar as a dependency:

```toml
[dependencies]
serde_columnar = "0.3.2"
```

Container Attribute

Field Attribute

Examples

```rust
use std::borrow::Cow;

use serde_columnar::{columnar, from_bytes, to_vec};

#[columnar(vec, ser, de)] // this struct can be a row of a vec-like container
struct RowStruct {
    name: String,
    #[columnar(strategy = "DeltaRle")] // this field will be encoded by DeltaRle
    id: u64,
    #[columnar(strategy = "Rle")] // this field will be encoded by Rle
    gender: String,
    #[columnar(strategy = "BoolRle")] // this field will be encoded by BoolRle
    married: bool,
    // This field is optional, which means that it can be added in this
    // version or deleted in a future version.
    #[columnar(optional, index = 0)]
    future: String,
}

#[columnar(ser, de)] // derive Serialize and Deserialize
struct TableStruct<'a> {
    #[columnar(class = "vec")] // this field is a vec-like table container
    pub data: Vec<RowStruct>,
    #[columnar(borrow)] // the same as #[serde(borrow)]
    pub text: Cow<'a, str>,
    #[columnar(skip)] // the same as #[serde(skip)]
    pub ignore: u8,
    #[columnar(optional, index = 0)] // table containers also support optional fields
    pub other_data: u64,
}

let table = TableStruct::new(...);
let bytes = serde_columnar::to_vec(&table).unwrap();
let table_from_bytes = serde_columnar::from_bytes::<TableStruct>(&bytes).unwrap();
```

You can find more examples of serde_columnar in examples and tests.
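For intuition about the `Rle` and `DeltaRle` strategy names used above, the following is a conceptual sketch of run-length and delta-plus-run-length encoding written against the standard library only. It is an assumption-laden illustration, not the crate's actual encoder or wire format; `rle_encode`, `delta_encode`, and `delta_rle_decode` are hypothetical helper names.

```rust
/// Run-length encode: collapse consecutive equal values into
/// (value, run_length) pairs.
fn rle_encode<T: PartialEq + Clone>(values: &[T]) -> Vec<(T, usize)> {
    let mut runs: Vec<(T, usize)> = Vec::new();
    for v in values {
        if let Some((last, count)) = runs.last_mut() {
            if *last == *v {
                // Same value as the current run: extend it.
                *count += 1;
                continue;
            }
        }
        // First value, or a value that differs from the current run.
        runs.push((v.clone(), 1));
    }
    runs
}

/// Delta encode: store each value as its difference from the previous one
/// (the first value is stored as a difference from 0).
fn delta_encode(values: &[u64]) -> Vec<i64> {
    let mut prev = 0u64;
    values
        .iter()
        .map(|&v| {
            let delta = v.wrapping_sub(prev) as i64;
            prev = v;
            delta
        })
        .collect()
}

/// Invert delta encoding followed by run-length encoding.
fn delta_rle_decode(runs: &[(i64, usize)]) -> Vec<u64> {
    let mut out = Vec::new();
    let mut current = 0u64;
    for &(delta, count) in runs {
        for _ in 0..count {
            current = current.wrapping_add(delta as u64);
            out.push(current);
        }
    }
    out
}
```

Under these assumptions, a mostly sequential `id` column such as `[10, 11, 12, 13, 20]` becomes the deltas `[10, 1, 1, 1, 7]`, which `rle_encode` then collapses into three runs. This is why a delta-based strategy suits monotonically increasing fields, while plain run-length encoding suits fields with long stretches of repeated values.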

Iterable

Columnar compression encoding presupposes that each field is iterable. Deserialization can therefore borrow the encoded bytes and expose the data as iterators, without allocating memory for all of the data up front. This mode is also implemented entirely through macros.

To use iter mode when deserializing, you only need to do three things:

  1. mark every row struct with iterable
  2. mark the row-container field with iter="..."
  3. use serde_columnar::iter_from_bytes to deserialize

```rust
#[columnar(vec, ser, de, iterable)]
struct Row {
    #[columnar(strategy = "Rle")]
    rle: String,
    #[columnar(strategy = "DeltaRle")]
    delta_rle: u64,
    other: u8,
}

#[columnar(ser, de)]
struct Table {
    #[columnar(class = "vec", iter = "Row")]
    vec: Vec<Row>,
    other: u8,
}

let table = Table::new(...);
let bytes = serde_columnar::to_vec(&table).unwrap();
let table_iter = serde_columnar::iter_from_bytes::<Table>(&bytes).unwrap();
```
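The idea behind iter mode, borrowing the encoded bytes and decoding values only on demand, can be sketched with a plain `Iterator` over a toy format. This is a hypothetical illustration only: `RleIter` and its `[value, run_length]` byte-pair layout are assumptions made for the example and do not reflect the types or wire format that serde_columnar generates.

```rust
/// Lazily yields the values of a run-length-encoded byte column.
/// Assumed toy format: repeated `[value, run_length]` byte pairs.
struct RleIter<'a> {
    bytes: &'a [u8], // borrowed input that has not been consumed yet
    current: u8,     // value of the run currently being replayed
    remaining: u8,   // how many more times `current` must be yielded
}

impl<'a> RleIter<'a> {
    fn new(bytes: &'a [u8]) -> Self {
        RleIter { bytes, current: 0, remaining: 0 }
    }
}

impl<'a> Iterator for RleIter<'a> {
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        // Load the next run when the current one is exhausted,
        // skipping over any zero-length runs.
        while self.remaining == 0 {
            if self.bytes.len() < 2 {
                return None; // no complete pair left
            }
            self.current = self.bytes[0];
            self.remaining = self.bytes[1];
            self.bytes = &self.bytes[2..];
        }
        self.remaining -= 1;
        Some(self.current)
    }
}
```

With this sketch, `RleIter::new(&[7, 3, 5, 2]).take(2)` decodes only the first two values and never materializes the full column, which is the property the iterable mode is designed to provide.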

Acknowledgements