abin


A library for working with binaries and strings. It tries to avoid heap allocations and memory copies whenever possible by automatically choosing a reasonable strategy: the stack for small binaries, a static-lifetime binary, or reference counting. It's easy to use (no lifetimes; the binary type is sized), Send + Sync is optional (so there is no synchronization overhead unless you need it), it provides optional serde support, and it has a similar API for strings and binaries. Custom binary/string types can be implemented for fine-tuning.

Libraries that provide similar functionality:

License

Licensed under either of

at your option.

Details

Usage

    [dependencies]
    abin = "*"

```rust
use std::iter::FromIterator;
use std::ops::Deref;

use abin::{AnyBin, AnyStr, Bin, BinFactory, NewBin, NewStr, Str, StrFactory};

#[test]
fn usage_basics() {
    // static binary / static string
    let static_bin: Bin = NewBin::from_static("I'm a static binary, hello!".as_bytes());
    let static_str: Str = NewStr::from_static("I'm a static binary, hello!");
    assert_eq!(&static_bin, static_str.as_bin());
    assert_eq!(static_str.as_str(), "I'm a static binary, hello!");
    // non-static (but small enough to be stored on the stack)
    let hello_bin: Bin = NewBin::from_iter([72u8, 101u8, 108u8, 108u8, 111u8].iter().copied());
    let hello_str: Str = NewStr::copy_from_str("Hello");
    assert_eq!(&hello_bin, hello_str.as_bin());
    assert_eq!(hello_str.as_ref() as &str, "Hello");

    // operations for binaries / strings

    // length (number of bytes / number of utf-8 bytes)
    assert_eq!(5, hello_bin.len());
    assert_eq!(5, hello_str.len());
    // is_empty
    assert_eq!(false, hello_bin.is_empty());
    assert_eq!(false, hello_str.is_empty());
    // as_slice / as_str / deref / as_bin
    assert_eq!(&[72u8, 101u8, 108u8, 108u8, 111u8], hello_bin.as_slice());
    assert_eq!("Hello", hello_str.as_str());
    assert_eq!("Hello", hello_str.deref());
    assert_eq!(&hello_bin, hello_str.as_bin());
    // slice
    assert_eq!(
        NewBin::from_static(&[72u8, 101u8]),
        hello_bin.slice(0..2).unwrap()
    );
    assert_eq!(NewStr::from_static("He"), hello_str.slice(0..2).unwrap());
    // clone
    assert_eq!(hello_bin.clone(), hello_bin);
    assert_eq!(hello_str.clone(), hello_str);
    // compare
    assert!(NewBin::from_static(&[255u8]) > hello_bin);
    assert!(NewStr::from_static("Z") > hello_str);
    // convert string into binary and binary into string
    let hello_bin_from_str: Bin = hello_str.clone().into_bin();
    assert_eq!(hello_bin_from_str, hello_bin);
    let hello_str_from_bin: Str = AnyStr::from_utf8(hello_bin.clone()).expect("invalid utf8!");
    assert_eq!(hello_str_from_bin, hello_str);
    // convert into Vec<u8> / String
    assert_eq!(
        Vec::from_iter([72u8, 101u8, 108u8, 108u8, 111u8].iter().copied()),
        hello_bin.into_vec()
    );
    assert_eq!("Hello".to_owned(), hello_str.into_string());
}
```

Notable structs, traits and types & naming

Interfaces:

* Bin: Binary (it's a struct).
* SBin: Synchronized binary (it's a struct).
* Str: String (type Str = AnyStr<Bin>).
* SStr: Synchronized string (type SStr = AnyStr<SBin>).

Factories provided by the default implementation:

* NewBin: Creates Bin.
* NewSBin: Creates SBin.
* NewStr: Creates Str.
* NewSStr: Creates SStr.

See also:

* AnyBin: Trait implemented by Bin and SBin.
* AnyStr: See Str and SStr; string backed by either Bin or SBin.
* BinFactory: Factory trait implemented by NewBin and NewSBin.
* StrFactory: Factory trait implemented by NewStr and NewSStr.

Learn

See the example tests:

Maturity

It's quite young (development started in October 2020). The main functionality has been implemented. Things I might do:

Questions and Answers

There are already other crates with similar functionality, why another one? / Features

This crate provides some features that cannot be found in other crates (or not all of them):

Why NewBin, NewStr? What's this?

Why let string = NewStr::from_static("Hello") instead of just let string = Str::from_static("Hello") (or implement From<&str> for Str)? This is due to the decision to decouple the interface from the implementation. The Str is the interface, whereas NewStr is the factory of the built-in implementation. This library is designed to be extensible; you can provide your own implementation, tweaked for your use case.
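The decoupling can be illustrated with a small standard-library sketch. Every name here (`MyStr`, `StrStorage`, `NewMyStr`, `from_storage`) is hypothetical and not part of abin; the sketch also boxes its storage for simplicity, which the real crate is designed to avoid. It only shows the shape of the idea: the interface type knows nothing about how it was built, and a factory decides the storage.

```rust
/// Hypothetical implementation-side trait: how the string data is stored.
pub trait StrStorage {
    fn as_str(&self) -> &str;
}

/// Hypothetical interface type. It offers operations, but no constructor
/// that commits to a particular storage strategy.
pub struct MyStr(Box<dyn StrStorage>);

impl MyStr {
    /// Entry point for any implementation, built-in or custom.
    pub fn from_storage(storage: impl StrStorage + 'static) -> Self {
        MyStr(Box::new(storage))
    }

    pub fn as_str(&self) -> &str {
        self.0.as_str()
    }
}

/// Storage that borrows static data (nothing is copied).
struct StaticStorage(&'static str);

impl StrStorage for StaticStorage {
    fn as_str(&self) -> &str {
        self.0
    }
}

/// Storage that owns a copy of the data.
struct OwnedStorage(String);

impl StrStorage for OwnedStorage {
    fn as_str(&self) -> &str {
        &self.0
    }
}

/// Hypothetical factory for the "built-in" implementation.
pub struct NewMyStr;

impl NewMyStr {
    pub fn from_static(s: &'static str) -> MyStr {
        MyStr::from_storage(StaticStorage(s))
    }

    pub fn copy_from_str(s: &str) -> MyStr {
        MyStr::from_storage(OwnedStorage(s.to_owned()))
    }
}
```

Code that only consumes `MyStr` never names a factory, so a second factory with a different storage strategy can be swapped in without touching it.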

How does the default-implementation NewBin / NewStr work?

The only difference between NewBin and NewSBin is the reference-counted binaries: SBin created by NewSBin have a synchronized reference counter (AtomicUsize).

Note: The same statements also apply to strings (since strings are backed by the binary implementation).
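The trade-off between the two counters is the same one the standard library makes with `Rc` and `Arc`. The following sketch uses `Rc<[u8]>` and `Arc<[u8]>` purely as stand-ins for the non-synchronized and synchronized variants; it does not use abin's types, and the function names are made up for illustration.

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

/// Non-synchronized counter: cloning is a plain increment,
/// but the value cannot be sent to another thread.
pub fn clone_unsynchronized(data: &Rc<[u8]>) -> Rc<[u8]> {
    Rc::clone(data)
}

/// Synchronized counter: cloning is an atomic increment,
/// and the clone may be moved to another thread.
pub fn sum_on_other_thread(data: &Arc<[u8]>) -> u32 {
    let shared = Arc::clone(data);
    thread::spawn(move || shared.iter().map(|b| u32::from(*b)).sum::<u32>())
        .join()
        .expect("worker thread panicked")
}
```

Passing an `Rc` into `thread::spawn` is rejected at compile time, which is why a separate synchronized type is needed when binaries cross thread boundaries.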

What operations are allocation-free / zero-copy?

It's not documented in text, and it of course depends on the implementation. For the default implementation (NewBin/NewSBin/NewStr/NewSStr), however, there's a test: see tests/no_alloc_guarantees.rs.

Also see these two tests for single-allocation guarantee:

I want to write my own implementation, how to?

There's currently no documentation - but you can use the default implementation for reference. It's found in the module implementation.

Why Boo and not Cow?

Cow requires where B: 'a + ToOwned. This does not work with this crate, since the implementation is separated from the interface. Say we have &[u8] (borrowed), to convert that to owned (Bin or SBin), the implementation has to be known. I don't want Cow to contain information about the implementation.
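To make the difference concrete, here is a minimal sketch of a "borrowed or owned" type whose owned side is a free type parameter. The name `BorrowedOrOwned` and its exact shape are assumptions made for illustration and are not taken from abin's source.

```rust
use std::ops::Deref;

/// Hypothetical "borrowed or owned" type. Unlike `Cow`, the owned type `O`
/// is chosen by the user, so no `ToOwned` bound ties `B` to a single owner.
pub enum BorrowedOrOwned<'a, B: ?Sized, O> {
    Borrowed(&'a B),
    Owned(O),
}

impl<'a, B: ?Sized, O: Deref<Target = B>> Deref for BorrowedOrOwned<'a, B, O> {
    type Target = B;

    fn deref(&self) -> &B {
        match self {
            // `b` is a reference to the stored reference; copy it out.
            BorrowedOrOwned::Borrowed(b) => *b,
            // Go through the owner's own `Deref` to reach the borrowed form.
            BorrowedOrOwned::Owned(o) => &**o,
        }
    }
}
```

With this shape, the same borrowed type `[u8]` can be paired with `Vec<u8>` in one place and `Box<[u8]>` in another, which `Cow<[u8]>` cannot express because `ToOwned` fixes the owner to `Vec<u8>`.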

Aren't Bin and Str huge (stack-size)?

Bin and Str have a size of 4 words and are word-aligned. Yes, that's not small, but for reference, a Vec<u8> already takes 3 words (pointer, length and capacity).
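The comparison can be checked with `std::mem::size_of`. In this sketch, `FourWords` is a hypothetical stand-in with the stated size and alignment; it is not abin's actual layout, and the helper names are made up.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical four-word value, used only to make the comparison concrete.
#[allow(dead_code)]
pub struct FourWords {
    words: [usize; 4],
}

/// Size of `T` expressed in machine words.
pub fn size_in_words<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}

/// Whether `T` has the same alignment as a machine word.
pub fn is_word_aligned<T>() -> bool {
    align_of::<T>() == align_of::<usize>()
}
```

`size_in_words::<Vec<u8>>()` evaluates to 3, so the extra cost relative to a plain vector is one word per value.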

What is re-integration?

Say we have this code (pseudocode):

```
let large_binary_from_network : Vec<u8> = <...>;
let bin = NewBin::from_given_vec(large_binary_from_network);
let slice_of_that_bin : &[u8] = &bin.as_slice()[45..458];

// it's now possible to re-integrate that slice_of_that_bin into the bin it was sliced from.
// re-integration converts the borrowed type &[u8] (slice_of_that_bin) into an owned
// type (Bin) without memory-allocation or memory-copy.
let bin_re_integrated : Bin = bin.try_re_integrate(slice_of_that_bin).unwrap();
```

This is useful if you want to de-serialize to owned (without using Boo) using serde. When deserializing a type, we get slice_of_that_bin from serde; using re-integration it's possible to get an owned binary (Bin) without allocation.

Technical detail: It checks whether slice_of_that_bin lies within the memory range of bin; if so, it increments the reference-count of bin by one, and the returned binary (bin_re_integrated) is then just a sliced reference to bin.
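The range check described above can be sketched with the standard library alone. `RcBin` and its methods are hypothetical and much simpler than abin's implementation; in particular, `from_vec` in this sketch copies the bytes into an `Rc<[u8]>` once, which is only a convenience of the sketch. The part that matters is `try_re_integrate`: it compares addresses, bumps the reference count, and copies nothing.

```rust
use std::ops::Range;
use std::rc::Rc;

/// Hypothetical reference-counted binary: a shared buffer plus the visible range.
#[derive(Debug, Clone)]
pub struct RcBin {
    buffer: Rc<[u8]>,
    range: Range<usize>,
}

impl RcBin {
    /// Sketch-only constructor (copies the bytes into the shared buffer).
    pub fn from_vec(vec: Vec<u8>) -> Self {
        let range = 0..vec.len();
        RcBin { buffer: Rc::from(vec), range }
    }

    pub fn as_slice(&self) -> &[u8] {
        &self.buffer[self.range.clone()]
    }

    /// Number of binaries currently sharing the buffer.
    pub fn ref_count(&self) -> usize {
        Rc::strong_count(&self.buffer)
    }

    /// Returns an owned view of `slice` if it lies within this binary's memory.
    /// Only the reference count changes; no bytes are copied.
    pub fn try_re_integrate(&self, slice: &[u8]) -> Option<RcBin> {
        let own = self.as_slice();
        let own_start = own.as_ptr() as usize;
        let own_end = own_start + own.len();
        let start = slice.as_ptr() as usize;
        let end = start.checked_add(slice.len())?;
        if start < own_start || end > own_end {
            // The slice points somewhere else; it cannot be re-integrated.
            return None;
        }
        let offset = self.range.start + (start - own_start);
        Some(RcBin {
            buffer: Rc::clone(&self.buffer),
            range: offset..offset + slice.len(),
        })
    }
}
```

A slice taken from `as_slice()` re-integrates successfully and shares the original buffer, while a slice of an unrelated vector yields `None`.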

Name abin?

It's named after the trait AnyBin.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

See CONTRIBUTING.md.