For some time now, it has been possible to write CUDA (PTX) kernels in Rust even without the linker.
You can emit PTX code with the `--emit asm` flag. Problems come up, however, once you need to write more complex kernels that use functions from external crates.
Unfortunately, `--emit asm` can't link several modules into a single PTX file. Another solution emerged from discussion: link the modules with `llvm-link` and compile the result with `llc`.

According to the Rust NVPTX metabug, it is quite realistic to solve some of the bugs within this repo.
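
For comparison, the manual `llvm-link` and `llc` route mentioned above could be driven from Rust roughly as follows. This is only a sketch: the helper names and the `.bc` and `.ptx` file names are hypothetical, and it assumes both LLVM tools are on `PATH` and that the bitcode files were produced separately (for example with `--emit llvm-bc`).

```rust
use std::path::Path;
use std::process::Command;

/// Builds the `llvm-link` invocation that merges several bitcode files into one.
/// (Hypothetical helper, not part of this repo.)
fn llvm_link_command(inputs: &[&Path], linked: &Path) -> Command {
    let mut cmd = Command::new("llvm-link");
    cmd.args(inputs).arg("-o").arg(linked);
    cmd
}

/// Builds the `llc` invocation that lowers the linked bitcode to PTX assembly.
/// `sm_20` is assumed here only because it matches the `cpu` of the target definition.
fn llc_command(linked: &Path, ptx: &Path) -> Command {
    let mut cmd = Command::new("llc");
    cmd.arg("-mcpu=sm_20").arg(linked).arg("-o").arg(ptx);
    cmd
}
```

Calling `.status()` on each returned `Command` in turn would run the two steps. The linker exists so that this pipeline does not have to be wired up by hand.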
The trick is to compile the kernels crate as a `dylib`.
To do so, you usually have to add the following to your `Cargo.toml`:
```toml
[lib]
crate_type = ["dylib"]
```
Some modifications also have to be made to the target definition:
```json
{
  "arch": "nvptx64",
  "cpu": "sm_20",
  "data-layout": "e-i64:64-v16:16-v32:32-n16:32:64",
  "linker": "ptx-linker",
  "linker-flavor": "ld",
  "linker-is-gnu": true,
  "dll-prefix": "",
  "dll-suffix": ".ptx",
  "dynamic-linking": true,
  "llvm-target": "nvptx64-nvidia-cuda",
  "max-atomic-width": 0,
  "os": "cuda",
  "obj-is-bitcode": true,
  "panic-strategy": "abort",
  "target-endian": "little",
  "target-pointer-width": "64",
  "target-c-int-width": "32"
}
```
The options that matter most for the linker are:
* `"linker": "ptx-linker"` - the linker executable, which must be in `PATH`.
* `"linker-flavor": "ld"` - currently only the `ld` flavor of argument parsing is supported.
* `"linker-is-gnu": true` - needed for Rust to pass the optimisation flag.
* `"dll-suffix": ".ptx"` - the correct file extension for PTX assembly output.
* `"dynamic-linking": true` - allows Rust to create a dylib.
* `"obj-is-bitcode": true` - stores bitcode instead of object files.
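
Since a missing key in the target definition is easy to overlook, a small sanity check can help. The sketch below is hypothetical and deliberately naive: `LINKER_KEYS` and `missing_linker_keys` are names made up for illustration, and the check only looks for each key as a quoted name followed directly by a colon instead of parsing the JSON properly.

```rust
/// The target-definition keys listed above as important for the linker.
const LINKER_KEYS: [&str; 6] = [
    "linker",
    "linker-flavor",
    "linker-is-gnu",
    "dll-suffix",
    "dynamic-linking",
    "obj-is-bitcode",
];

/// Returns the keys from `LINKER_KEYS` that do not appear in `spec`.
/// Naive substring match on `"key":`, so `"key" :` with a space would be missed.
fn missing_linker_keys(spec: &str) -> Vec<&'static str> {
    LINKER_KEYS
        .iter()
        .copied()
        .filter(|key| !spec.contains(format!("\"{}\":", key).as_str()))
        .collect()
}
```

An empty result means every key is at least present; it says nothing about whether the values are right.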
After that you can:
```
$ echo "Installing PTX linker"
$ cargo install ptx-linker

$ cd /path/to/kernels/crate
$ echo "Building PTX assembly output"
$ xargo rustc --target nvptx64-nvidia-cuda --release
```
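
If the PTX build should be triggered from a host crate (for instance from a `build.rs`), the same command can be prepared with `std::process::Command`. This is a minimal sketch under assumptions: `ptx_build_command` is a hypothetical helper, and `xargo` and `ptx-linker` are assumed to be installed already.

```rust
use std::path::Path;
use std::process::Command;

/// Prepares the same `xargo rustc` invocation as the shell commands above,
/// to be run from inside the kernels crate directory.
fn ptx_build_command(kernels_crate: &Path) -> Command {
    let mut cmd = Command::new("xargo");
    cmd.current_dir(kernels_crate)
        .args(&["rustc", "--target", "nvptx64-nvidia-cuda", "--release"]);
    cmd
}
```

Calling `.status()` on the returned `Command` runs the build; a non-success exit status should be treated as a failed kernel build.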