serde_columnar
serde_columnar
is an ergonomic columnar storage encoding crate that offers forward and backward compatibility.
It allows the contents that need to be serialized and deserialized to be encoded into binary using columnar storage, all by just employing simple macro annotations.
For more detailed introduction, please refer to this Notion
link: Serde-Columnar.
serde_columnar
comes with several remarkable features:
shell
cargo add serde_columnar
Or edit your Cargo.toml
and add serde_columnar
as dependency:
toml
[dependencies]
serde_columnar = "0.3.2"
vec
:
map
:
ser
:
Serialize
trait for this structde
:
Deserialize
trait for this structiterable
:
row
structstrategy
:
Rle
/DeltaRle
/BoolRle
.row
struct.class
:
Vec
or HashMap
and their variants.vec
or map
.table
struct.skip
:
#[serde(skip)]
, do not serialize or deserialize this field.borrow
:
#[serde(borrow)]
, borrow data for this field from the deserializer by using zero-copy deserialization.#[columnar(borrow="'a + 'b")]
to specify explicitly which lifetimes should be borrowed.table
struct for now.iter
:
class
.class="vec"
.compress
.optional
& index
:
optional
.index
.optional
fields must be after other fields.index
is the unique identifier of the optional field, which will be encoded into the result. If the corresponding identifier cannot be found during deserialization, Default
will be used.optional
fields can be added or removed in future versions. The compatibility premise is that the field type of the same index does not change or the encoding format is compatible (such as changing u32
to u64
).compress
:
compress
feature#[columnar(compress(min_size=N))]
: compress the columnar encoded bytes when the size of the bytes is larger than N, default N is 256.#[columnar(compress(level=N))]
: compress the columnar encoded bytes by Deflate algorithm with level N, N is in [0, 9], default N is 6, 0 is no compression, and 9 is the best compression. See flate2 for more details.#[columnar(compress(method="fast"|"best"|"default"))]
: compress the columnar encoded bytes by Deflate algorithm with method "fast", "best" or "default", this attribute is equivalent to #[columnar(compress(level=1|9|6))]
.level
and method
can not be used at the same time.row
struct.```rust use serdecolumnar::{columnar, frombytes, to_vec};
struct RowStruct {
name: String,
#[columnar(strategy = "DeltaRle")] // this field will be encoded by DeltaRle
id: u64,
#[columnar(strategy = "Rle")] // this field will be encoded by Rle
gender: String,
#[columnar(strategy = "BoolRle")] // this field will be encoded by BoolRle
married: bool
#[columnar(optional, index = 0)] // This field is optional, which means that this field can be added in this version or deleted in a future version
future: String
}
Serialize
and Deserialize
struct TableStruct<'a> {
#[columnar(class = "vec")] // this field is a vec-like table container
pub data: Vec#[serde(borrow)]
pub text: Cow<'a, str>
#[columnar(skip)] // the same as #[serde(skip)]
pub ignore: u8
#[columnar(optional, index = 0)] // table container also supports optional field
pub other_data: u64
}
let table = TableStruct::new(...);
let bytes = serdecolumnar::tovec(&table).unwrap();
let tablefrombytes = serdecolumnar::frombytes::
```
You can find more examples of serde_columnar
in examples
and tests
.
When we use columnar for compression encoding, there is a premise that the field is iterable. So we can completely borrow the encoded bytes to obtain all the data in the form of iterator during deserialization without directly allocating the memory of all the data. This implementation can also be achieved completely through macros.
To use iter mode when deserializing, you only need to do 3 things:
iterable
iter="..."
serde_columnar::iter_from_bytes
to deserialize```rust
struct Row{ #[columnar(strategy="Rle")] rle: String #[columnar(strategy="DeltaRle")] delta_rle: u64 other: u8 }
struct Table{
#[columnar(class="vec", iter="Row")]
vec: Vec
let table = Table::new(...); let bytes = serdecolumnar::tovec(&table).unwrap(); let tableiter = serdecolumnar::iterfrombytes::