** Vision

Easy, powerful, absurdly fast numerical calculations. Chaining, type punning, static dispatch (w/ inlining) based on your platform and vector types, zero-allocation iteration, and support for uneven collections.

#+BEGIN_SRC rust

let some_floats = (&[0u8; 128][..]).simd_iter()
    .map(|v| ((v >> splat(4)).as_u16s() + splat(2000)))
    .map(|v| mem::transmute(v).sqrt().round_down())
    .scalar_collect::<Vec<f32>>();

let some_u8s = [0u8; 100];
let filled_u8s = (&[0u8; 100][..]).simd_iter()
    .uneven_map(|vector| vector * splat(2), |scalar| scalar * 2)
    .fill(&mut some_u8s);

#+END_SRC
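To make the uneven-collection support concrete, here's a plain-Rust sketch of the idea behind ~uneven_map~: full vector-width chunks take the fast path, and the leftover tail falls back to the scalar closure. ~WIDTH~ and both loops are illustrative stand-ins, not this crate's internals.

#+BEGIN_SRC rust
// Illustrative only: WIDTH stands in for the platform's vector width.
fn double_all(input: &[u8], output: &mut [u8]) {
    const WIDTH: usize = 16;
    let mut i = 0;
    // "Vector" path: whole chunks of WIDTH elements. In the real thing,
    // each chunk would be a single SIMD multiply instead of an inner loop.
    while i + WIDTH <= input.len() {
        for j in i..i + WIDTH {
            output[j] = input[j].wrapping_mul(2); // SIMD integer ops wrap
        }
        i += WIDTH;
    }
    // Scalar path: the uneven tail that doesn't fill a whole vector.
    while i < input.len() {
        output[i] = input[i].wrapping_mul(2);
        i += 1;
    }
}
#+END_SRC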

** Current API

It looks something like this:

#+BEGIN_SRC rust

let lots_of_twos = (&[0u8; 128][..]).simd_iter()
    .map(|v| u8x32::splat(9) * v + u8x32::splat(4) - u8x32::splat(2))
    .scalar_collect::<Vec<u8>>();

#+END_SRC
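For comparison, here's the same computation with a plain scalar iterator (using only ~std~): every input byte is zero, so ~9 * 0 + 4 - 2~ yields the twos the binding is named for, while the SIMD version above computes 32 lanes of that arithmetic per iteration.

#+BEGIN_SRC rust
let lots_of_twos: Vec<u8> = [0u8; 128].iter()
    .map(|&x| 9 * x + 4 - 2)
    .collect();
#+END_SRC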

The vector size used by ~simd_iter~ and friends is determined by the machine you're compiling on. I also plan to wrap the ~stdsimd~ vector types and intrinsics to provide true cross-platform SIMD, regardless of arch or feature level.
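A rough sketch of how that compile-time selection can look (hypothetical constants, not this crate's actual mechanism) is to gate a lane-count constant on target features:

#+BEGIN_SRC rust
// Hypothetical width selection; these names are illustrative, not this
// crate's API.
#[cfg(target_feature = "avx2")]
pub const U8_LANES: usize = 32; // 256-bit registers: 32 u8 lanes

#[cfg(all(target_feature = "sse2", not(target_feature = "avx2")))]
pub const U8_LANES: usize = 16; // 128-bit registers: 16 u8 lanes

#[cfg(not(target_feature = "sse2"))]
pub const U8_LANES: usize = 1; // no SIMD available; scalar fallback
#+END_SRC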

Don't actually use this until I have some more time to work on it (pull requests very welcome; I have a lot of intrinsics to wade through!).

** Performance

Here are some extremely unscientific benchmarks which, at least, suggest that this isn't any worse than scalar iterators.

#+BEGIN_SRC shell

running 4 tests
test tests::bench_nop_scalar ... bench:          59 ns/iter (+/- 0)
test tests::bench_nop_simd   ... bench:          52 ns/iter (+/- 0)
test tests::bench_scalar     ... bench:          57 ns/iter (+/- 0)
test tests::bench_simd       ... bench:          52 ns/iter (+/- 0)

#+END_SRC
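Assuming these numbers come from the standard nightly-only ~#[bench]~ harness, they should be reproducible with:

#+BEGIN_SRC shell
cargo +nightly bench
#+END_SRC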