created_at: 2017-06-12 11:29:11.610975
updated_at: 2021-11-21 11:14:09.424734
description: high-performance computation on any hardware
crates (github:spearow:crates)



coaster

coaster is an extensible, pluggable, backend-agnostic framework for parallel, high-performance computations on CUDA, OpenCL and common host CPUs. It is fast, easy to build and provides an extensible Rust struct to execute operations on almost any machine, even if it does not have CUDA or OpenCL capable devices.

coaster abstracts over the different computation languages (Native, OpenCL, Cuda) and lets you run highly performant code, thanks to easy parallelization, on servers, desktops or mobiles without the need to adapt your code for the machine you deploy to. coaster does not require OpenCL or Cuda on the machine and automatically falls back to the native host CPU, making your application highly flexible and fast to build.
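To make the idea of backend-agnostic code concrete, here is a minimal, self-contained sketch in plain Rust. It does not use coaster's API: the `ComputeBackend` trait and the `Sequential` and `Threaded` types are hypothetical stand-ins invented for illustration. The point is only that `double_all` is written once against a trait and runs unchanged on either implementation.

```rust
use std::thread;

/// Hypothetical stand-in for a compute backend; NOT coaster's real trait.
trait ComputeBackend {
    /// Apply `f` to every element of `data` in place.
    fn map_in_place(&self, data: &mut [f32], f: fn(f32) -> f32);
}

/// Hypothetical backend that runs everything on the calling thread.
struct Sequential;

/// Hypothetical backend that splits the slice across scoped threads.
struct Threaded {
    workers: usize,
}

impl ComputeBackend for Sequential {
    fn map_in_place(&self, data: &mut [f32], f: fn(f32) -> f32) {
        for v in data.iter_mut() {
            *v = f(*v);
        }
    }
}

impl ComputeBackend for Threaded {
    fn map_in_place(&self, data: &mut [f32], f: fn(f32) -> f32) {
        if data.is_empty() {
            return;
        }
        let workers = self.workers.max(1);
        // Ceiling division; at least 1 because `data` is non-empty.
        let chunk_len = (data.len() + workers - 1) / workers;
        // Scoped threads may borrow `data`; all are joined when the scope ends.
        thread::scope(|scope| {
            for chunk in data.chunks_mut(chunk_len) {
                scope.spawn(move || {
                    for v in chunk.iter_mut() {
                        *v = f(*v);
                    }
                });
            }
        });
    }
}

/// Application code, written once against the trait.
fn double_all<B: ComputeBackend>(backend: &B, data: &mut [f32]) {
    backend.map_in_place(data, |v| v * 2.0);
}
```

Calling `double_all(&Sequential, &mut data)` and `double_all(&Threaded { workers: 4 }, &mut data)` on equal inputs produces equal outputs; only the execution strategy differs.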

coaster is powering Juice.

  • Parallelizing Performance
    coaster makes it easy to parallelize computations on your machine, putting all the available cores of your CPUs/GPUs to use. coaster provides optimized operations through Plugins that you can use right away to speed up your application.

  • Easily Extensible
    Writing custom operations for GPU execution becomes easy with coaster, as it already takes care of Framework peculiarities, memory management, safety and other overhead. coaster provides Plugins (see examples below) that you can use to extend the coaster backend with your own, business-specific operations.

  • Butter-smooth Builds
    As coaster does not require the installation of various frameworks and libraries, it will not add significantly to the build time of your application. coaster checks at run-time if these frameworks can be used and gracefully falls back to the standard, native host CPU if they are not. No long and painful build procedures for you or your users.
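The run-time fallback described above can be pictured with a small sketch. This is not how coaster implements its check: `Framework`, `Probe` and `select_framework` are hypothetical names, and the preference order (CUDA, then OpenCL, then native) is an assumption made for the example.

```rust
/// Hypothetical framework identifiers; illustrative only, not coaster types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Framework {
    Cuda,
    OpenCl,
    Native,
}

/// Hypothetical result of probing the machine at run time.
struct Probe {
    cuda_available: bool,
    opencl_available: bool,
}

/// Pick an accelerator framework if one was found; otherwise fall back
/// to the native host CPU, which is always usable.
/// The CUDA-before-OpenCL ordering is an assumption of this sketch.
fn select_framework(probe: &Probe) -> Framework {
    if probe.cuda_available {
        Framework::Cuda
    } else if probe.opencl_available {
        Framework::OpenCl
    } else {
        Framework::Native
    }
}
```

Because the decision is made from a probe at run time rather than at build time, the same binary can run on a machine with or without a GPU.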

For more information,

Disclaimer: coaster is currently at a very early stage of heavy development. If you run into bugs that are not caused by features that are simply not implemented yet, feel free to create an issue.

Getting Started

If you're using Cargo, just add coaster to your Cargo.toml:

coaster = "0.2"

If you're using Cargo Edit, you can call:

$ cargo add coaster


You can easily extend coaster's Backend with more backend-agnostic operations through Plugins. Plugins provide a set of related operations; BLAS would be a good example. To extend coaster's Backend with operations from a Plugin, just add the desired Plugin crate to your Cargo.toml file. Here is a list of available coaster Plugins that you can use right away for your own application, or take as a starting point if you would like to create your own Plugin.

  • BLAS - coaster plugin for backend-agnostic Basic Linear Algebra Subprogram Operations.
  • NN - coaster plugin for Neural Network related algorithms.

You can easily write your own backend-agnostic, parallel operations and provide them to others via a Plugin. We are happy to feature your Plugin here; just send us a PR.
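One way to picture a Plugin is as a trait that adds a set of related operations to a backend type. The sketch below is self-contained and does not use coaster's real types: `Backend`, `Native` and the `Axpy` trait are hypothetical stand-ins, and the operation shown is the BLAS-style `axpy` (y = a * x + y) computed on the host CPU.

```rust
use std::marker::PhantomData;

/// Hypothetical marker for the host-CPU framework (not coaster's type).
struct Native;

/// Hypothetical backend, generic over a framework marker.
struct Backend<F> {
    _framework: PhantomData<F>,
}

impl<F> Backend<F> {
    fn new() -> Self {
        Backend { _framework: PhantomData }
    }
}

/// A hypothetical plugin trait: one BLAS-style operation.
trait Axpy {
    /// Computes y = a * x + y element-wise.
    fn axpy(&self, a: f32, x: &[f32], y: &mut [f32]) -> Result<(), String>;
}

/// The plugin supplies one implementation per framework it supports.
impl Axpy for Backend<Native> {
    fn axpy(&self, a: f32, x: &[f32], y: &mut [f32]) -> Result<(), String> {
        if x.len() != y.len() {
            return Err(format!("length mismatch: {} vs {}", x.len(), y.len()));
        }
        for (yi, xi) in y.iter_mut().zip(x) {
            *yi += a * *xi;
        }
        Ok(())
    }
}
```

With the trait in scope, `Backend::<Native>::new().axpy(2.0, &x, &mut y)` reads like a built-in method, which is the ergonomic effect the example later in this README relies on when it calls `backend.sigmoid(...)`.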


coaster comes without any operations. The following example therefore assumes that you have added both coaster and the coaster Plugin coaster-nn to your Cargo manifest.

use coaster as co;
use coaster_nn as nn;

use co::prelude::*;
use nn::*;

fn write_to_memory<T: Copy>(mem: &mut FlatBox, data: &[T]) {
    // View the native memory as a mutable slice of `T` and copy `data` into it.
    let mem_buffer = mem.as_mut_slice::<T>();
    for (index, datum) in data.iter().enumerate() {
        mem_buffer[index] = *datum;
    }
}

fn main() {
    // Initialize a CUDA Backend.
    let backend = Backend::<Cuda>::default().unwrap();
    // Initialize two SharedTensors.
    let mut x = SharedTensor::<f32>::new(&(1, 1, 3));
    let mut result = SharedTensor::<f32>::new(&(1, 1, 3));
    // Fill `x` with some data.
    let payload: &[f32] = &::std::iter::repeat(1f32).take(x.capacity()).collect::<Vec<f32>>();
    let native = Backend::<Native>::default().unwrap();
    write_to_memory(x.write_only(native.device()).unwrap(), payload); // Write to native host memory.
    // Run the sigmoid operation, provided by the NN Plugin, on your CUDA enabled GPU.
    backend.sigmoid(&mut x, &mut result).unwrap();
    // See the result.
    println!("{:?}", result.read(native.device()).unwrap().as_slice::<f32>());
}
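As a sanity check for the example above, the expected values can be computed on the host with plain Rust. This sketch assumes the NN Plugin's sigmoid is the standard logistic function 1 / (1 + e^(-x)); `sigmoid` and `sigmoid_all` are hypothetical helper names, not part of coaster or coaster-nn.

```rust
/// Logistic sigmoid: 1 / (1 + e^(-x)).
/// Assumed to match the NN Plugin's definition; not taken from its source.
fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// Element-wise sigmoid over a slice, returning a new vector.
fn sigmoid_all(input: &[f32]) -> Vec<f32> {
    input.iter().map(|&v| sigmoid(v)).collect()
}
```

Since the example fills `x` with three ones, each element of the result should be close to `sigmoid(1.0)`, roughly 0.731, under that assumption.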


Want to contribute? Awesome! We have instructions to help you get started contributing code or documentation. We also have high-priority issues that we could use your help with.

We have a mostly real-time collaboration culture, and it happens here on GitHub and on the Gitter channel. You can also reach out to the maintainer(s): @drahnr.

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as below, without any additional terms or conditions.


You can find the release history in the root file

A changelog is a log or record of all the changes made to a project, such as a website or software project, usually including such records as bug fixes, new features, etc. - Wikipedia

We are using Clog, the Rust tool for auto-generating CHANGELOG files.


Licensed under either of

at your option.

Commit count: 178
