A Rust implementation of the Khronos OpenCL API.
A relatively simple, object based model of the OpenCL 3.0
API.
It is built upon the cl3 crate, which
provides a functional interface to the OpenCL C API.
OpenCL (Open Computing Language) is framework for general purpose parallel programming across heterogeneous devices including: CPUs, GPUs, DSPs, FPGAs and other processors or hardware accelerators. It is often considered as an open-source alternative to Nvidia's proprietary Compute Unified Device Architecture CUDA for performing General-purpose computing on GPUs, see GPGPU.
OpenCL 3.0
is a unified specification that adds little new functionality to previous OpenCL versions.
It specifies that all OpenCL 1.2 features are mandatory, while all
OpenCL 2.x and 3.0 features are now optional.
This library has:
The library is object based with most OpenCL objects represented by rust structs.
For example, an OpenCL cl_device_id
is represented by Device with methods to get information about the device instead of calling clGetDeviceInfo
with the relevant cl_device_info
value.
The struct methods are simpler to use than their equivalent standalone functions in cl3 because they convert the InfoType
enum into the correct underlying type returned by the clGetDeviceInfo
call for the cl_device_info
value.
Nearly all the structs implement the Drop
trait to release their corresponding
OpenCL objects. The exceptions are Platform
and Device
which don't need to be released. See the crate documentation.
The API for OpenCL versions and extensions are controlled by Rust features such as "CLVERSION20" and "clkhrglsharing". To enable an OpenCL version, the feature for that version and all previous OpenCL versions must be enabled, e.g. for "CLVERSION20"; "CLVERSION11" and "CLVERSION1_2" must also be enabled.
The default features are "CLVERSION11", "CLVERSION12" and "CLVERSION2_0".
Rust deprecation warnings are given for OpenCL API functions that are deprecated by an enabled OpenCL version e.g., clCreateCommandQueue
is deprecated whenever "CLVERSION2_0" is enabled.
Ensure that an OpenCL Installable Client Driver (ICD) and the appropriate OpenCL hardware driver(s) are installed, see OpenCL Installation.
opencl3
supports OpenCL 1.2 and 2.0 ICD loaders by default. If you have an
OpenCL 2.0 ICD loader then just add the following to your project's Cargo.toml
:
toml
[dependencies]
opencl3 = "0.8"
If your OpenCL ICD loader supports higher versions of OpenCL then add the
appropriate features to opencl3, e.g. for an OpenCL 3.0 ICD loader add the
following to your project's Cargo.toml
instead:
toml
[dependencies.opencl3]
version = "0.8"
features = ["CL_VERSION_2_1", "CL_VERSION_2_2", "CL_VERSION_3_0"]
OpenCL extensions and serde
support can also be enabled by adding their features, e.g.:
toml
[dependencies.opencl3]
version = "0.8"
features = ["cl_khr_gl_sharing", "cl_khr_dx9_media_sharing", "serde"]
See the OpenCL Guide and OpenCL Description for background on using OpenCL.
There are examples in the examples directory. The tests also provide examples of how the crate may be used, e.g. see: platform, device, context, integrationtest and opencl2kernel_test.
The library is designed to support events and OpenCL 2 features such as Shared Virtual Memory (SVM) and kernel built-in work-group functions.
It also has optional support for serde
e.g.:
```rust no-run const PROGRAMSOURCE: &str = r#" kernel void inclusivescanint (global int* output, global int const* values) { int sum = 0; sizet lid = getlocalid(0); sizet lsize = getlocal_size(0);
size_t num_groups = get_num_groups(0);
for (size_t i = 0u; i < num_groups; ++i)
{
size_t lidx = i * lsize + lid;
int value = work_group_scan_inclusive_add(values[lidx]);
output[lidx] = sum + value;
sum += work_group_broadcast(value, lsize - 1);
}
}"#;
const KERNELNAME: &str = "inclusivescan_int";
// Create a Context on an OpenCL device let context = Context::fromdevice(&device).expect("Context::fromdevice failed");
// Build the OpenCL program source and create the kernel. let program = Program::createandbuildfromsource(&context, PROGRAMSOURCE, CLSTD20) .expect("Program::createandbuildfromsource failed");
let kernel = Kernel::create(&program, KERNEL_NAME).expect("Kernel::create failed");
// Create a commandqueue on the Context's device let queue = CommandQueue::createdefaultwithproperties( &context, CLQUEUEPROFILINGENABLE, 0, ) .expect("CommandQueue::createdefaultwithproperties failed");
// The input data const ARRAYSIZE: usize = 8; const VALUEARRAY: &str = "[3,2,5,9,7,1,4,2]";
// Create an OpenCL SVM vector
let mut test_values = SvmVec::
// Handle testvalues if device only supports CLDEVICESVMCOARSEGRAINBUFFER
if !testvalues.isfinegrained() {
// SVMCOARSEGRAINBUFFER needs to know the size of the data to allocate the SVM
testvalues = SvmVec::
ExtendSvmVec(&mut testvalues) .deserialize(&mut deserializer) .expect("Error deserializing the VALUEARRAY JSON string.");
// Make testvalues immutable let testvalues = test_values;
// Unmap testvalues if not a CLMEMSVMFINEGRAINBUFFER if !testvalues.isfinegrained() { let unmaptestvaluesevent = unsafe { queue.enqueuesvmunmap(&testvalues, &[])? }; unmaptestvaluesevent.wait()?; }
// The output data, an OpenCL SVM vector
let mut results =
SvmVec::
// Run the kernel on the input data let sumkernelevent = unsafe { ExecuteKernel::new(&kernel) .setargsvm(results.asmutptr()) .setargsvm(testvalues.asptr()) .setglobalworksize(ARRAYSIZE) .enqueuendrange(&queue)? };
// Wait for the kernel to complete execution on the device kernel_event.wait()?;
// Map results if not a CLMEMSVMFINEGRAINBUFFER if !results.isfinegrained() { unsafe { queue.enqueuesvmmap(CLBLOCKING, CLMAPREAD, &mut results, &[])? }; }
// Convert SVM results to json let jsonresults = serdejson::tostring(&results).unwrap(); println!("json results: {}", jsonresults);
// Unmap results if not a CLMEMSVMFINEGRAINBUFFER if !results.isfinegrained() { let unmapresultsevent = unsafe { queue.enqueuesvmunmap(&results, &[])? }; unmapresults_event.wait()?; } ```
The example above was taken from: opencl2serde.rs.
The crate contains unit, documentation and integration tests.
The tests run the platform and device info functions (among others) so they
can provide useful information about OpenCL capabilities of the system.
It is recommended to run the tests in single-threaded mode, since some of them can interfere with each other when run multi-threaded, e.g.:
shell
cargo test -- --test-threads=1 --show-output
The integration tests are marked ignore
so use the following command to
run them:
shell
cargo test -- --test-threads=1 --show-output --ignored
The API has changed considerably since version 0.1
of the library, with the
aim of making the library more consistent and easier to use.
SvmVec was changed recently to provide support for serde
deserialization.
It also changed in version 0.5.0 to provide better support for
coarse grain buffer Shared Virtual Memory now that Nvidia is supporting it,
see Nvidia OpenCL.
In version 0.6.0 the Info enums were removed from the underlying cl3 crate and this crate so that data can be read from OpenCL devices in the future using new values that are currently undefined.
In version 0.8.0 deprecation warnings are given for OpenCL API functions that are deprecated by an enabled OpenCL version e.g., clCreateCommandQueue
is deprecated whenever "CLVERSION2_0" is enabled.
In version 0.9.0 many OpenCL API functions are declared unsafe
since they may cause undefined behaviour if called incorrectly.
For information on other changes, see Releases.
If you want to contribute through code or documentation, the Contributing guide is the best place to start. If you have any questions, please feel free to ask. Just please abide by our Code of Conduct.
Licensed under the Apache License, Version 2.0, as per Khronos Group OpenCL.
You may obtain a copy of the License at: http://www.apache.org/licenses/LICENSE-2.0
Any contribution intentionally submitted for inclusion in the work by you shall be licensed as defined in the Apache-2.0 license above, without any additional terms or conditions, unless you explicitly state otherwise.
OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.