accel-derive


Procedural-macro crate for #[kernel]

Requirements

Install the nightly-2020-01-02 toolchain

rustup toolchain add nightly-2020-01-02 --profile minimal

and rust-ptx-linker

cargo install ptx-linker -f

#[accel_derive::kernel]

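A #[kernel] function is expanded into two parts: device code, which is compiled into PTX assembler, and host code, which loads and launches the compiled device code.

Generate PTX assembler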

```rust
use accel_derive::kernel;

#[kernel]
#[dependencies(accel-core = "0.2.0-alpha")] // Strings in () will be parsed as TOML
pub unsafe fn add(a: *const f64, b: *const f64, c: *mut f64, n: usize) {
    let i = accel_core::index();
    if (i as usize) < n {
        *c.offset(i) = *a.offset(i) + *b.offset(i);
    }
}
```

This generates a crate containing the device code

```rust
#[no_mangle]
pub unsafe extern "ptx-kernel" fn add(a: *const f64, b: *const f64, c: *mut f64, n: usize) {
    let i = accel_core::index();
    if (i as usize) < n {
        *c.offset(i) = *a.offset(i) + *b.offset(i);
    }
}
```

and a Cargo.toml

```toml
[dependencies]
accel-core = "0.2.0-alpha"
```

This crate is compiled into PTX assembler using the nvptx64-nvidia-cuda target. That target does not currently support libstd, so device code must be written in a no_std manner.
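The body of `add` touches only raw pointers and arithmetic, which is what lets it build without libstd. To make the role of the `(i as usize) < n` guard concrete, here is a minimal host-side sketch that runs the same body once per simulated thread on the CPU. The names `add_thread` and `simulate_add` are hypothetical and not part of accel; the parameter `i` stands in for the value that `accel_core::index()` would return on the device.

```rust
/// One call represents one GPU thread executing the kernel body.
/// `i` is a stand-in for `accel_core::index()` (an assumption of this sketch).
unsafe fn add_thread(i: isize, a: *const f64, b: *const f64, c: *mut f64, n: usize) {
    // Threads whose index falls past the end of the arrays must do nothing.
    if (i as usize) < n {
        *c.offset(i) = *a.offset(i) + *b.offset(i);
    }
}

/// Hypothetical helper: run `threads` simulated threads sequentially on the CPU.
/// The output has `min(a.len(), b.len())` elements, initialised to 0.0.
fn simulate_add(a: &[f64], b: &[f64], threads: usize) -> Vec<f64> {
    let n = a.len().min(b.len());
    let mut c = vec![0.0; n];
    for i in 0..threads {
        // Safety: `add_thread` only dereferences index `i` when `i < n`,
        // and all three buffers hold at least `n` elements.
        unsafe { add_thread(i as isize, a.as_ptr(), b.as_ptr(), c.as_mut_ptr(), n) };
    }
    c
}
```

Launching more threads than elements is harmless because the surplus threads fail the guard, while launching fewer leaves the tail of `c` untouched.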

Kernel launcher

The corresponding host code is also generated:

```rust
use ::accel::{Grid, Block, kernel::void_cast, module::Module};

pub fn add(grid: Grid, block: Block, a: *const f64, b: *const f64, c: *mut f64, n: usize) {
    thread_local! {
        // The PTX module is loaded once per thread and reused across calls.
        static module: Module = Module::from_str("{{PTX_STRING}}").expect("Load module failed");
    }
    module.with(|m| {
        let mut kernel = m.get_kernel("add").expect("Failed to get Kernel");
        // One pointer per kernel parameter, in declaration order.
        let mut args = [void_cast(&a), void_cast(&b), void_cast(&c), void_cast(&n)];
        unsafe {
            kernel.launch(args.as_mut_ptr(), grid, block).expect("Failed to launch kernel")
        };
    })
}
```

where {{PTX_STRING}} is the PTX assembler string compiled from the device code. The generated function can be called as follows:

```rust
let grid = Grid::x(1);
let block = Block::x(n as u32);
add(grid, block, a.as_ptr(), b.as_ptr(), c.as_mut_ptr(), n);
```
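The call above puts all `n` threads into a single block, which only works while `n` stays within the per-block thread limit of the GPU (commonly 1024 on current NVIDIA hardware). A minimal sketch of choosing one-dimensional launch dimensions for larger inputs follows. The helper `launch_dims` is hypothetical and not part of accel, and using it with more than one block assumes that `accel_core::index()` yields an index that is unique across blocks, which this document does not state.

```rust
/// Hypothetical helper (not part of accel): pick 1-D launch dimensions for `n` elements.
/// Returns `(blocks, threads_per_block)` such that blocks * threads_per_block >= n.
fn launch_dims(n: usize, max_threads_per_block: u32) -> (u32, u32) {
    assert!(max_threads_per_block > 0, "block size must be positive");
    if n == 0 {
        // Launch a single idle thread rather than an empty grid.
        return (1, 1);
    }
    // Use as many threads per block as needed, capped at the hardware limit.
    let threads = (n as u64).min(max_threads_per_block as u64);
    // Ceiling division so the last, partially filled block is still launched.
    let blocks = (n as u64 + threads - 1) / threads;
    (blocks as u32, threads as u32)
}
```

Under those assumptions the pair could be passed on as `Grid::x(blocks)` and `Block::x(threads)`. The `(i as usize) < n` guard in the kernel is what makes the surplus threads in the last block safe.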