Rust PTX Linker

Purpose

For some time, it has been possible to write CUDA (PTX) kernels in Rust, even without a linker.

One could emit PTX code with the --emit asm flag. But problems come up when we need to write more complex kernels that use functions from external crates.

Unfortunately, --emit asm can't link several modules into a single PTX file. Another solution emerged from discussion:

1. Emit LLVM bitcode for every crate.
2. Link the bitcode files with llvm-link.
3. Compile the output bitcode into PTX with llc.
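
The three steps above can be sketched as plain command lines. The Rust helpers below only build the argument vectors and do not run anything. The helper names, file names and the exact rustc invocation are assumptions made for illustration; they are not taken from ptx-linker itself, which may drive LLVM differently.

```rust
/// Step 1 (assumed invocation): ask rustc for LLVM bitcode instead of an object file.
fn emit_bitcode_args(target: &str, source: &str, output: &str) -> Vec<String> {
    ["rustc", "--target", target, "--emit=llvm-bc", "--crate-type=lib", source, "-o", output]
        .iter()
        .map(|s| s.to_string())
        .collect()
}

/// Step 2: merge the bitcode of every crate into one module with llvm-link.
fn llvm_link_args(inputs: &[&str], output: &str) -> Vec<String> {
    let mut argv = vec!["llvm-link".to_string()];
    argv.extend(inputs.iter().map(|s| s.to_string()));
    argv.push("-o".to_string());
    argv.push(output.to_string());
    argv
}

/// Step 3: compile the linked bitcode into PTX assembly with llc.
/// `cpu` is a compute capability such as "sm_20".
fn llc_args(input: &str, output: &str, cpu: &str) -> Vec<String> {
    let mcpu = format!("-mcpu={}", cpu);
    ["llc", mcpu.as_str(), input, "-o", output]
        .iter()
        .map(|s| s.to_string())
        .collect()
}
```

Each returned vector can be handed to std::process::Command (first element as the program, the rest as arguments) to actually run the step. Doing this by hand for every crate in a dependency graph is tedious, which is the part the linker is meant to take over.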

Issues

According to the Rust NVPTX metabug, some of the bugs can realistically be solved within this repo:

[x] Non-inlined functions can't be used cross crate - rust#38787
[x] No "undefined reference" error is raised when it should be - rust#38786

Approach

The trick is to compile the kernels crate as a dylib.

So you usually have to add to your Cargo.toml:

```toml
[lib]
crate_type = ["dylib"]
```

Some modifications also have to be made to the target definition:

```json
{
  "arch": "nvptx64",
  "cpu": "sm_20",
  "data-layout": "e-i64:64-v16:16-v32:32-n16:32:64",
  "linker": "ptx-linker",
  "linker-flavor": "ld",
  "linker-is-gnu": true,
  "dll-prefix": "",
  "dll-suffix": ".ptx",
  "dynamic-linking": true,
  "llvm-target": "nvptx64-nvidia-cuda",
  "max-atomic-width": 0,
  "os": "cuda",
  "obj-is-bitcode": true,
  "panic-strategy": "abort",
  "target-endian": "little",
  "target-pointer-width": "64",
  "target-c-int-width": "32"
}
```

The most important settings for the linker are:

* "linker": "ptx-linker" - the linker executable in PATH.
* "linker-flavor": "ld" - currently only ld flavor argument parsing is supported.
* "linker-is-gnu": true - needed for Rust to pass the optimisation flag.
* "dll-suffix": ".ptx" - the correct file extension for PTX assembly output.
* "dynamic-linking": true - allows Rust to create a dylib.
* "obj-is-bitcode": true - stores bitcode instead of object files.
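
To make "ld flavor parsing" more concrete, here is a minimal sketch of how a linker wrapper might pick apart ld-style arguments. It is an illustration only and is not the actual ptx-linker implementation: the `LinkerArgs` struct, the `parse_ld_args` function and the sample file names are hypothetical, and the real set of flags rustc passes is larger than what is handled here.

```rust
/// Hypothetical summary of the ld-style arguments a linker wrapper cares about.
#[derive(Debug, Default, PartialEq)]
struct LinkerArgs {
    /// Positional arguments: object (here: bitcode) files and rlibs.
    inputs: Vec<String>,
    /// Value following `-o`, e.g. the final `.ptx` path.
    output: Option<String>,
    /// True when an optimisation flag other than `-O0` was seen.
    optimise: bool,
}

fn parse_ld_args(args: &[&str]) -> LinkerArgs {
    let mut parsed = LinkerArgs::default();
    let mut iter = args.iter().copied();

    while let Some(arg) = iter.next() {
        if arg == "-o" {
            // The next argument is the output path.
            parsed.output = iter.next().map(str::to_string);
        } else if arg == "-L" {
            // Library search path: skip its value in this sketch.
            iter.next();
        } else if arg.starts_with("-O") {
            parsed.optimise = arg != "-O0";
        } else if !arg.starts_with('-') {
            // Anything that is not a flag is treated as an input file.
            parsed.inputs.push(arg.to_string());
        }
        // All other flags (e.g. -shared) are ignored here.
    }
    parsed
}
```

In this sketch an optimisation flag such as `-O1` is what "linker-is-gnu" would make Rust pass, and the `-o` value ends in the "dll-suffix" configured above.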

After that you can:

```
$ echo "Installing PTX linker"
$ cargo install ptx-linker

$ cd /path/to/kernels/crate
$ echo "Building PTX assembly output"
$ xargo rustc --target nvptx64-nvidia-cuda --release
```