An implementation of the BLAKE2b hash with:
- A `b2sum` command line utility, provided as a sub-crate. `b2sum` includes command line flags for all the BLAKE2 associated data features.
- `no_std` support. The `std` Cargo feature is on by default, for CPU feature detection and for implementing `std::io::Write`.
- A `blake2bp` Cargo feature.

```rust
use blake2b_simd::{blake2b, Params};

// One-shot hashing with the default parameters (64-byte output).
let expected = "ca002330e69d3e6b84a46a56a6533fd79d51d97a3bb7cad6c2ff43b354185d6d\
                c1e723fb3db4ae0737e120378424c714bb982d9dc5bbd7a0ab318240ddd18f8d";
let hash = blake2b(b"foo");
assert_eq!(expected, &hash.to_hex());

// Incremental hashing with a custom length, key, and personalization.
let hash = Params::new()
    .hash_length(16)
    .key(b"The Magic Words are Squeamish Ossifrage")
    .personal(b"L. P. Waterhouse")
    .to_state()
    .update(b"foo")
    .update(b"bar")
    .update(b"baz")
    .finalize();
assert_eq!("ee8ff4e9be887297cf79348dc35dab56", &hash.to_hex());
```
An example using the included `b2sum` command line utility:

```bash
$ cd b2sum
$ cargo build --release
    Finished release [optimized] target(s) in 0.04s
$ echo hi | ./target/release/b2sum --length 256
de9543b2ae1b2b87434a730727db17f5ac8b8c020b84a5cb8c5fbcc1423443ba -
```
The AVX2 implementation in this crate is ported from the C implementation in libsodium. That implementation was originally written by Samuel Neves and integrated into libsodium by Frank Denis. All credit for performance goes to those authors.
To run small benchmarks yourself, first install OpenSSL and libsodium on your machine, then:
```sh
cd benches/cargo_bench
cargo +nightly bench
```
The `benches/benchmark_gig` sub-crate allocates a one-gigabyte (10⁹ byte) array and repeatedly hashes it to measure throughput. A similar C program, `benches/bench_libsodium.c`, does the same thing using libsodium's implementation of BLAKE2b. Here are the results from my laptop:

- `gcc -O3 -lsodium benches/bench_libsodium.c` (via the helper script `benches/bench_libsodium.sh`)
- `cargo +nightly run --release`
```table
               ╭────────────┬────────────╮
               │ portable   │ AVX2       │
╭──────────────┼────────────┼────────────┤
│ blake2b_simd │ 0.771 GB/s │ 1.005 GB/s │
│ libsodium    │ 0.743 GB/s │ 0.939 GB/s │
╰──────────────┴────────────┴────────────╯
```
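The kind of measurement loop a throughput benchmark like this runs can be sketched with the standard library alone. This is a minimal illustration, not the code in `benches/benchmark_gig`: the function names `throughput_gb_per_sec` and `best_throughput`, the number of runs, the buffer size, and the byte-sum stand-in for the hash function are all assumptions made for the example.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Convert a byte count and an elapsed time into GB/s (1 GB = 10^9 bytes).
fn throughput_gb_per_sec(bytes: usize, elapsed: Duration) -> f64 {
    bytes as f64 / elapsed.as_secs_f64() / 1e9
}

/// Run `work` over `input` `runs` times and report the throughput of the
/// fastest run. `work` is a hypothetical stand-in for the hash under test.
fn best_throughput<F: FnMut(&[u8]) -> u64>(input: &[u8], runs: usize, mut work: F) -> f64 {
    let mut best = Duration::MAX;
    for _ in 0..runs {
        let start = Instant::now();
        // black_box keeps the optimizer from discarding the work.
        black_box(work(black_box(input)));
        best = best.min(start.elapsed());
    }
    throughput_gb_per_sec(input.len(), best)
}

fn main() {
    // Assumption: a 1 MB buffer, far smaller than the gigabyte used above.
    let input = vec![0xabu8; 1_000_000];
    // Placeholder workload (a byte sum), NOT BLAKE2b.
    let gbps = best_throughput(&input, 5, |data| {
        data.iter().map(|&b| b as u64).sum::<u64>()
    });
    println!("{:.3} GB/s", gbps);
}
```

Taking the fastest of several runs is one common way to reduce noise from the scheduler and caches; averaging is an equally reasonable choice, and the sketch does not claim to match what the real benchmark does.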
The `benches/bench_b2sum.py` script benchmarks `b2sum` against several Coreutils hashes, on a 10 MB file of random data. Here are the results from my laptop:
```table
╭───────────────────────────┬────────────╮
│ blake2b_simd b2sum --mmap │ 0.676 GB/s │
│ blake2b_simd b2sum        │ 0.649 GB/s │
│ coreutils sha1sum         │ 0.628 GB/s │
│ coreutils b2sum           │ 0.536 GB/s │
│ coreutils md5sum          │ 0.476 GB/s │
│ coreutils sha512sum       │ 0.464 GB/s │
╰───────────────────────────┴────────────╯
```
The `benches/count_cycles` sub-crate (`cargo +nightly run --release`) measures a peak throughput of 1.8 cycles per byte.
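Cycles per byte and GB/s are related through the CPU clock frequency: bytes per second equals cycles per second divided by cycles per byte. A minimal sketch of that conversion follows. The 3.0 GHz clock is purely an illustrative assumption (the clock speed of the benchmark machine is not stated here), and the function names are hypothetical.

```rust
/// Throughput in GB/s (1 GB = 10^9 bytes) for a given cost in cycles per
/// byte, on a CPU running at `clock_hz` cycles per second.
fn gb_per_sec(cycles_per_byte: f64, clock_hz: f64) -> f64 {
    clock_hz / cycles_per_byte / 1e9
}

/// The inverse conversion: cycles per byte for a given throughput in GB/s.
fn cycles_per_byte(gb_per_sec: f64, clock_hz: f64) -> f64 {
    clock_hz / (gb_per_sec * 1e9)
}

fn main() {
    // Assumption: a 3.0 GHz clock, chosen only to make the arithmetic concrete.
    let assumed_clock_hz = 3.0e9;
    let gbps = gb_per_sec(1.8, assumed_clock_hz);
    println!("{:.2} GB/s", gbps); // prints "1.67 GB/s"
    println!("{:.2} cycles/byte", cycles_per_byte(gbps, assumed_clock_hz)); // prints "1.80 cycles/byte"
}
```

Because the result scales linearly with the clock rate, the same 1.8 cycles per byte corresponds to a different GB/s figure on every machine, which is why cycle counts are the more portable way to compare implementations.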