# Easy GPGPU
A high-level, easy-to-use async GPGPU crate based on wgpu.
It is made for very large computations on powerful GPUs.
Main goals:
- deal with binding buffers automatically

Limitations:
- the only types available for buffers are `bool`, `i32`, `u32` and `f32`
- it takes a bit of time to initiate the device (due to wgpu backends)

Recreating wgpu's hello-compute (205 sloc when written with wgpu):
```
fn wgpu_hello_compute() {
    let mut device = Device::new();
    let v = vec![1u32, 4, 3, 295];
    device.create_buffer_from("inputs", &v, BufferUsage::ReadWrite, true);
    let result = device.execute_shader_code(Dispatch::Linear(v.len()), r"
    fn collatz_iterations(n_base: u32) -> u32 {
        var n: u32 = n_base;
        var i: u32 = 0u;
        loop {
            if (n <= 1u) {
                break;
            }
            if (n % 2u == 0u) {
                n = n / 2u;
            }
            else {
                // Overflow? (i.e. 3*n + 1 > 0xffffffffu?)
                if (n >= 1431655765u) {   // 0x55555555u
                    return 4294967295u;   // 0xffffffffu
                }
                n = 3u * n + 1u;
            }
            i = i + 1u;
        }
        return i;
    }

    fn main() {
        inputs[index] = collatz_iterations(inputs[index]);
    }"
    ).into_iter().next().unwrap().unwrap_u32();
    assert_eq!(result, vec![0, 2, 7, 55]);
}
```
=> No binding, no annoying `global_id`, no need to use a low level api. You just declare the name of the buffer and it is immediately available in the wgsl shader.
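To sanity-check the values expected by the assertion, the shader's Collatz function can be mirrored on the CPU. The following is a plain-Rust sketch that is not part of the crate: it reproduces the same loop and overflow guard and needs no GPU.

```rust
/// CPU mirror of the WGSL Collatz function used in the shader.
/// Returns `u32::MAX` when `3 * n + 1` would overflow a `u32`.
fn collatz_iterations(n_base: u32) -> u32 {
    let mut n = n_base;
    let mut i = 0u32;
    while n > 1 {
        if n % 2 == 0 {
            n /= 2;
        } else {
            // 0x5555_5555 is the smallest n for which 3 * n + 1 exceeds u32::MAX.
            if n >= 0x5555_5555 {
                return u32::MAX;
            }
            n = 3 * n + 1;
        }
        i += 1;
    }
    i
}

fn main() {
    let inputs = vec![1u32, 4, 3, 295];
    let result: Vec<u32> = inputs.iter().map(|&n| collatz_iterations(n)).collect();
    // Same values as the assertion in the GPU example.
    assert_eq!(result, vec![0, 2, 7, 55]);
    println!("{:?}", result);
}
```

A reference like this can be handy when changing the shader, since the GPU result can be compared element by element against the CPU one.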
First, create a device:
```
let mut device = Device::new();
```
Then create some buffers, specifying whether you want to read back their content after the execution:
```
let v1 = vec![1i32, 2, 3, 4, 5, 6];
// from a vector
device.create_buffer_from("v1", &v1, BufferUsage::ReadOnly, false);
// creates an empty buffer
device.create_buffer("output", "i32", v1.len(), BufferUsage::WriteOnly, true);
```
Finally, execute a shader:
```
let result = device.execute_shader_code(Dispatch::Linear(v1.len()), r"
fn main() {
    output[index] = v1[index] * 2;
}").into_iter().next().unwrap().unwrap_i32();
println!("{:?}", result);
```
The buffers are available in the shader under the name provided when they were created with the device.
The `index` variable is provided thanks to the use of `Dispatch::Linear`.
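Conceptually, a linear dispatch runs the body of `main` once for every `index` in `0..len`. The sketch below emulates that on the CPU in plain Rust. `run_linear` is a hypothetical helper written for illustration, not a function of this crate, and the real execution happens in parallel on the GPU rather than sequentially.

```rust
/// Hypothetical CPU stand-in for a linear dispatch: calls `kernel` once per
/// index in `0..len` and collects the value written at each index.
fn run_linear<F>(len: usize, kernel: F) -> Vec<i32>
where
    F: Fn(usize) -> i32,
{
    (0..len).map(kernel).collect()
}

fn main() {
    let v1 = vec![1i32, 2, 3, 4, 5, 6];
    // Equivalent of the shader body `output[index] = v1[index] * 2;`
    let output = run_linear(v1.len(), |index| v1[index] * 2);
    assert_eq!(output, vec![2, 4, 6, 8, 10, 12]);
    println!("{:?}", output);
}
```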
We specified only one buffer with `is_output: true`, so we get only one vector as output.
We just need to unwrap the data as a vector of `i32`s with `.unwrap_i32()`.
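Because buffers can hold different element types, the result of an execution has to carry a type tag until it is unwrapped. The sketch below models that idea in plain Rust. The `TypedVec` enum, the `Buffer` struct and the `collect_outputs` helper are hypothetical names chosen for illustration, and the assumption that outputs come back in creation order is ours; none of this is taken from the crate's actual API.

```rust
/// Hypothetical tagged result: one variant per supported element type.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum TypedVec {
    Bool(Vec<bool>),
    I32(Vec<i32>),
    U32(Vec<u32>),
    F32(Vec<f32>),
}

impl TypedVec {
    /// Returns the inner vector, panicking if the tag is not `I32`.
    fn unwrap_i32(self) -> Vec<i32> {
        match self {
            TypedVec::I32(v) => v,
            other => panic!("expected an i32 buffer, got {:?}", other),
        }
    }
}

/// Hypothetical buffer record: its data and whether it should be read back.
struct Buffer {
    data: TypedVec,
    is_output: bool,
}

/// Keeps only the buffers flagged as outputs (assumed: in creation order).
fn collect_outputs(buffers: Vec<Buffer>) -> Vec<TypedVec> {
    buffers
        .into_iter()
        .filter(|b| b.is_output)
        .map(|b| b.data)
        .collect()
}

fn main() {
    let buffers = vec![
        Buffer { data: TypedVec::I32(vec![1, 2, 3]), is_output: false },
        Buffer { data: TypedVec::I32(vec![2, 4, 6]), is_output: true },
    ];
    // Only one buffer has `is_output: true`, so only one vector comes back.
    let result = collect_outputs(buffers).into_iter().next().unwrap().unwrap_i32();
    assert_eq!(result, vec![2, 4, 6]);
}
```

Under this model, calling the unwrap method for the wrong type is a programming error and panics, much like calling `Option::unwrap` on `None`.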