A library that abstracts over SIMD instruction sets, including ones with differing widths. SIMDeez is designed to let you write a function once and produce SSE2, SSE4.1, and AVX2 versions of it. You can have the version you want chosen either at compile time with `cfg` attributes, or at runtime with `target_feature` attributes and the built-in `is_x86_feature_detected!` macro.
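To make the compile-time route concrete, here is a minimal, std-only sketch of selecting a vector width with `cfg` attributes. It does not use SIMDeez itself; `F32_LANES` and `split_len` are hypothetical names introduced only for illustration. The `target_feature` cfg values are set by the compiler flags you build with, for example `-C target-feature=+avx2` or `-C target-cpu=native`.

```rust
// Hypothetical, std-only illustration of compile-time selection.
// Exactly one of these constants is compiled in, depending on the
// target features enabled for the build.

#[cfg(target_feature = "avx2")]
pub const F32_LANES: usize = 8; // 256-bit registers hold eight f32 values

#[cfg(all(target_feature = "sse2", not(target_feature = "avx2")))]
pub const F32_LANES: usize = 4; // 128-bit registers hold four f32 values

#[cfg(not(any(target_feature = "sse2", target_feature = "avx2")))]
pub const F32_LANES: usize = 1; // scalar fallback on other targets

/// Splits `len` elements into a number of full-width chunks
/// plus a remainder that must be handled separately.
pub fn split_len(len: usize) -> (usize, usize) {
    (len / F32_LANES, len % F32_LANES)
}
```

Because the width is a constant, the compiler can fold it into loop bounds, at the cost of producing a binary that only runs on CPUs with the chosen feature set.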

SIMDeez is currently in beta. If there are intrinsics you need that are not yet implemented, create an issue and I'll add them. PRs to add more intrinsics are welcome. Currently things are well fleshed out for the i32, i64, f32, and f64 types.

As Rust stabilizes support for Neon and AVX-512, I plan to add those as well.

Refer to the excellent Intel Intrinsics Guide for documentation on these functions.

Features

Compared to stdsimd

Compared to Faster

All of the above could change! Faster seems to generally have the same performance as long as you don't run into some of the slower fallback functions.

Example

```rust
// Module paths and type names (Simd, Sse2, Sse41, Avx2) are assumed from the crate layout.
use simdeez::*;
use simdeez::avx2::*;
use simdeez::sse2::*;
use simdeez::sse41::*;

// When using runtime feature detection we need to be sure this inlines into each specific
// function using a given `target_feature`, or the intrinsics will get downgraded.
// All intrinsics are unsafe, so functions using them must be unsafe or
// you must wrap all calls with unsafe blocks.
#[inline(always)]
unsafe fn distance<S: Simd>(x1: &[f32], y1: &[f32], x2: &[f32], y2: &[f32]) -> Vec<f32> {
    let mut result: Vec<f32> = Vec::with_capacity(x1.len());
    result.set_len(x1.len()); // for efficiency

    // Operations have to be done in terms of the vector width
    // so that it will work with any size vector.
    // The width of a vector type is provided as a constant
    // so the compiler is free to optimize it more.
    // Note: this loop assumes x1.len() is a multiple of S::VF32_WIDTH;
    // otherwise the last load and store would run past the end of the slices.
    let mut i = 0;
    // S::VF32_WIDTH is a constant: 4 when using SSE, 8 when using AVX2, etc.
    while i < x1.len() {
        // Load data from your vec into a SIMD value
        let xv1 = S::loadu_ps(&x1[i]);
        let yv1 = S::loadu_ps(&y1[i]);
        let xv2 = S::loadu_ps(&x2[i]);
        let yv2 = S::loadu_ps(&y2[i]);

        // Use the usual intrinsic syntax if you prefer
        let mut xdiff = S::sub_ps(xv1, xv2);
        // Or use operator overloading if you like
        let mut ydiff = yv1 - yv2;
        xdiff *= xdiff;
        ydiff *= ydiff;
        let distance = S::sqrt_ps(xdiff + ydiff);
        // Store the SIMD value into the result vec
        S::storeu_ps(&mut result[i], distance);
        // Increment i by the vector width
        i += S::VF32_WIDTH;
    }
    result
}

// Call distance as an SSE2 function
#[target_feature(enable = "sse2")]
unsafe fn distance_sse2(x1: &[f32], y1: &[f32], x2: &[f32], y2: &[f32]) -> Vec<f32> {
    distance::<Sse2>(x1, y1, x2, y2)
}

// Call distance as an SSE41 function
#[target_feature(enable = "sse4.1")]
unsafe fn distance_sse41(x1: &[f32], y1: &[f32], x2: &[f32], y2: &[f32]) -> Vec<f32> {
    distance::<Sse41>(x1, y1, x2, y2)
}

// Call distance as an AVX2 function
#[target_feature(enable = "avx2")]
unsafe fn distance_avx2(x1: &[f32], y1: &[f32], x2: &[f32], y2: &[f32]) -> Vec<f32> {
    distance::<Avx2>(x1, y1, x2, y2)
}
```
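The per-feature wrappers in the example still need something to pick between them at runtime. The following is a minimal, std-only sketch of that dispatch step using `is_x86_feature_detected!`. To stay self-contained it swaps the SIMDeez-generic kernel for a plain scalar one, and `distance_kernel`, `distance_avx2_fallback`, `distance_sse41_fallback`, and `distance_dispatch` are hypothetical names, not part of the library.

```rust
/// Scalar stand-in for a width-generic SIMD kernel (hypothetical name).
/// `inline(always)` lets it be compiled inside each `target_feature` wrapper.
#[inline(always)]
fn distance_kernel(x1: &[f32], y1: &[f32], x2: &[f32], y2: &[f32]) -> Vec<f32> {
    x1.iter()
        .zip(y1)
        .zip(x2.iter().zip(y2))
        .map(|((ax, ay), (bx, by))| {
            let dx = ax - bx;
            let dy = ay - by;
            (dx * dx + dy * dy).sqrt()
        })
        .collect()
}

// Wrapper compiled with AVX2 enabled (hypothetical name).
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn distance_avx2_fallback(x1: &[f32], y1: &[f32], x2: &[f32], y2: &[f32]) -> Vec<f32> {
    distance_kernel(x1, y1, x2, y2)
}

// Wrapper compiled with SSE4.1 enabled (hypothetical name).
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse4.1")]
unsafe fn distance_sse41_fallback(x1: &[f32], y1: &[f32], x2: &[f32], y2: &[f32]) -> Vec<f32> {
    distance_kernel(x1, y1, x2, y2)
}

/// Picks the widest instruction set the running CPU reports,
/// falling back to the plain kernel everywhere else.
pub fn distance_dispatch(x1: &[f32], y1: &[f32], x2: &[f32], y2: &[f32]) -> Vec<f32> {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // Safe to call: the CPU was just confirmed to support AVX2.
            return unsafe { distance_avx2_fallback(x1, y1, x2, y2) };
        }
        if is_x86_feature_detected!("sse4.1") {
            // Safe to call: the CPU was just confirmed to support SSE4.1.
            return unsafe { distance_sse41_fallback(x1, y1, x2, y2) };
        }
    }
    distance_kernel(x1, y1, x2, y2)
}
```

Callers only see the safe `distance_dispatch` function; the `unsafe` calls are justified by the feature check immediately before each one. With SIMDeez, the wrapper bodies would presumably call the generic function with the matching SIMD type instead of the scalar kernel.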