Procedural-macro crate for `#[kernel]`
Install the nightly-2020-01-02 toolchain

```shell
rustup toolchain add nightly-2020-01-02 --profile minimal
```

and rust-ptx-linker

```shell
cargo install ptx-linker -f
```
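A `#[kernel]` function will be converted into two parts: device code, which is compiled into PTX assembler, and host code, which calls the compiled device code through the `accel::module` API.

```rust
use accel_derive::kernel;

#[kernel]
#[dependencies("accel-core" = "0.2.0-alpha")] // Strings in `()` will be parsed as TOML
pub unsafe fn add(a: *const f64, b: *const f64, c: *mut f64, n: usize) {
    let i = accel_core::index();
    // Threads beyond the end of the arrays do nothing
    if (i as usize) < n {
        *c.offset(i) = *a.offset(i) + *b.offset(i);
    }
}
```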
This will generate a crate containing the device code
```rust
#[no_mangle] // keep the symbol name `add` so the host code can look it up
pub unsafe extern "ptx-kernel" fn add(a: *const f64, b: *const f64, c: *mut f64, n: usize) {
    let i = accel_core::index();
    if (i as usize) < n {
        *c.offset(i) = *a.offset(i) + *b.offset(i);
    }
}
```
and `Cargo.toml`

```toml
[dependencies]
accel-core = "0.2.0-alpha"
```
This crate will be compiled into PTX assembler using the `nvptx64-nvidia-cuda` target.
The `nvptx64-nvidia-cuda` target does not currently support libstd, so device code must be written in a `no_std` manner.
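To make the bounds check in the kernel body concrete, the following is a minimal host-side sketch that emulates the device threads sequentially on the CPU. The names `add_thread` and `emulate_add` are hypothetical and not part of accel; the parameter `i` stands in for the value that `accel_core::index()` would return on the device. The per-thread function uses only raw-pointer operations available in `core`, which is the style device code is restricted to.

```rust
/// Body of the `add` kernel for one emulated thread (hypothetical helper).
/// `i` stands in for the value `accel_core::index()` would return.
unsafe fn add_thread(i: isize, a: *const f64, b: *const f64, c: *mut f64, n: usize) {
    // Threads whose index falls outside the arrays do nothing.
    if (i as usize) < n {
        *c.offset(i) = *a.offset(i) + *b.offset(i);
    }
}

/// Runs `threads` emulated threads one after another on the CPU
/// and returns the output array (hypothetical helper).
fn emulate_add(threads: usize, a: &[f64], b: &[f64]) -> Vec<f64> {
    let n = a.len().min(b.len());
    let mut c = vec![0.0; n];
    for i in 0..threads {
        // Safe to call: the guard in `add_thread` keeps accesses below `n`.
        unsafe { add_thread(i as isize, a.as_ptr(), b.as_ptr(), c.as_mut_ptr(), n) };
    }
    c
}
```

Calling `emulate_add(8, &a, &b)` on three-element slices runs five surplus threads; the guard makes them no-ops, which is why a launch may safely use more threads than there are elements.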
On the other hand, the corresponding host code will also be generated:
```rust
use ::accel::{Grid, Block, kernel::void_cast, module::Module};

pub fn add(grid: Grid, block: Block, a: *const f64, b: *const f64, c: *mut f64, n: usize) {
    // The module is loaded from the embedded PTX once per thread and then reused
    thread_local! {
        static module: Module = Module::from_str("{{PTX_STRING}}").expect("Load module failed");
    }
    module.with(|m| {
        let mut kernel = m.get_kernel("add").expect("Failed to get Kernel");
        // One type-erased pointer per kernel argument
        let mut args = [void_cast(&a), void_cast(&b), void_cast(&c), void_cast(&n)];
        unsafe { kernel.launch(args.as_mut_ptr(), grid, block).expect("Failed to launch kernel") };
    })
}
```
where `{{PTX_STRING}}` is the PTX assembler string compiled from the device code.
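The generated wrapper keeps its module in a `thread_local!` static, so the PTX is loaded the first time a thread calls the function and reused afterwards. The sketch below illustrates that pattern using only the standard library; `FakeModule`, `LOADS`, `ptx_len`, and `loads` are hypothetical stand-ins so that it runs without a GPU, and it is not the code accel generates.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for `accel::module::Module`; it only stores
/// the PTX text so the sketch runs without a GPU.
struct FakeModule {
    ptx: String,
}

thread_local! {
    // Counts how many times this thread has "loaded" the module.
    static LOADS: Cell<u32> = Cell::new(0);
    // Built lazily on first access in each thread, then reused.
    static MODULE: FakeModule = {
        LOADS.with(|l| l.set(l.get() + 1));
        FakeModule { ptx: String::from("// PTX text would go here") }
    };
}

/// Mimics the generated wrapper: borrow the cached module, then use it.
fn ptx_len() -> usize {
    MODULE.with(|m| m.ptx.len())
}

/// Number of module loads performed so far on the calling thread.
fn loads() -> u32 {
    LOADS.with(|l| l.get())
}
```

Calling `ptx_len()` repeatedly on one thread triggers a single load. A second thread would perform its own load, since each thread owns a separate copy of the static.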
The generated function can be called as follows:
```rust
let grid = Grid::x(1);
let block = Block::x(n as u32);
add(grid, block, a.as_ptr(), b.as_ptr(), c.as_mut_ptr(), n);
```
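The call above uses a single block of `n` threads, which can only work while `n` stays within the device's per-block thread limit (1024 on current CUDA GPUs). For larger inputs the work has to be spread over several blocks. The following is a minimal sketch of how a one-dimensional launch shape could be chosen; `launch_dims` is a hypothetical helper, not part of accel, and using it with the `add` kernel assumes that `accel_core::index()` returns a global index combining block and thread indices, which this document does not state.

```rust
/// Hypothetical helper: picks a 1-D launch shape for `n` elements.
/// Returns `(grid_x, block_x)` such that `grid_x * block_x >= n`.
fn launch_dims(n: usize, max_block: u32) -> (u32, u32) {
    assert!(max_block > 0, "block size must be positive");
    if n == 0 {
        // Still a valid launch shape; the kernel's guard makes it a no-op.
        return (1, 1);
    }
    // Use at most `max_block` threads per block.
    let block = (n as u64).min(max_block as u64);
    // Ceiling division: enough blocks to cover every element.
    let grid = (n as u64 + block - 1) / block;
    (grid as u32, block as u32)
}
```

With this helper, the launch would read `let (g, b) = launch_dims(n, 1024);` followed by `add(Grid::x(g), Block::x(b), ...)`. The last block may contain surplus threads, which the `(i as usize) < n` check in the kernel turns into no-ops.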