This project is a work in progress. Feel free to contribute.
Efficient parallelization of pipelined computations on multiple cores with a simple interface.
The interface should look something like:
```rust
let data = vec![0, 1, 2];
let mut sum = 0;
pipeline!(4, data
    => (|x| (x * 2, x) => |(x2, x)| x2 - x,
        |x| -x)
    => |x| sum += x);
```
In this example, the tasks to be done are:
- Iterate over the elements in `data`.
- Clone each element and pass it to both the `|x| (x * 2, x)` and `|x| -x` closures.
- Apply the output of `|x| (x * 2, x)` to `|(x2, x)| x2 - x`.
- Sum all outputs of `|(x2, x)| x2 - x` and `|x| -x` into the `sum` variable.
This constructs a graph in which each node is a closure. Data flows between the closures and gets processed. Except for the first and the last nodes in this example (the iteration and the sum nodes), all nodes are completely parallelizable. The `pipeline!` macro will instantiate a scheduler which will use 4 threads to run all nodes and maximize computation throughput.
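To make the graph concrete, here is a hand-rolled sketch of the same computation using std threads and channels. This is only an illustration of the data flow, not the macro's actual expansion; the function name `run_pipeline` and the channel layout are made up for the example.

```rust
use std::sync::mpsc;
use std::thread;

// Hand-rolled sketch of the example graph: a producer feeds two branches,
// and both branches feed a single summing consumer.
fn run_pipeline(data: Vec<i32>) -> i32 {
    let (tx_a, rx_a) = mpsc::channel();
    let (tx_b, rx_b) = mpsc::channel();
    let (tx_sum, rx_sum) = mpsc::channel();

    // First node: iterate over `data`, cloning each element into both branches.
    let producer = thread::spawn(move || {
        for x in data {
            tx_a.send(x).unwrap();
            tx_b.send(x).unwrap();
        }
    });

    // Branch A: |x| (x * 2, x), then |(x2, x)| x2 - x.
    let tx_sum_a = tx_sum.clone();
    let branch_a = thread::spawn(move || {
        for x in rx_a {
            let (x2, x) = (x * 2, x);
            tx_sum_a.send(x2 - x).unwrap();
        }
    });

    // Branch B: |x| -x.
    let branch_b = thread::spawn(move || {
        for x in rx_b {
            tx_sum.send(-x).unwrap();
        }
    });

    // Last node: |x| sum += x. The loop ends once every sender is dropped.
    let mut sum = 0;
    for x in rx_sum {
        sum += x;
    }

    producer.join().unwrap();
    branch_a.join().unwrap();
    branch_b.join().unwrap();
    sum
}

fn main() {
    println!("{}", run_pipeline(vec![0, 1, 2])); // prints 0
}
```

Note that for every element, branch A yields `x2 - x = x` and branch B yields `-x`, so the final sum for this example is 0.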
Support stateless producers? Rayon-style splittable iterators?
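A Rayon-style splittable producer could look something like this minimal sketch. The `Splittable` trait and `RangeProducer` type are hypothetical names for this illustration, not existing API: the idea is that a scheduler can recursively split a producer and hand the halves to idle threads.

```rust
// Hypothetical sketch of a Rayon-style splittable producer.
trait Splittable: Sized {
    /// Split into two halves if possible; `None` means the producer is
    /// too small to split and should be run as a single unit of work.
    fn split(self) -> (Self, Option<Self>);
}

#[derive(Debug, PartialEq)]
struct RangeProducer {
    start: usize,
    end: usize,
}

impl Splittable for RangeProducer {
    fn split(self) -> (Self, Option<Self>) {
        if self.end - self.start < 2 {
            return (self, None); // too small to split further
        }
        let mid = self.start + (self.end - self.start) / 2;
        (
            RangeProducer { start: self.start, end: mid },
            Some(RangeProducer { start: mid, end: self.end }),
        )
    }
}

fn main() {
    let (left, right) = RangeProducer { start: 0, end: 10 }.split();
    // left covers 0..5, right covers 5..10
    println!("{:?} {:?}", left, right);
}
```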
When running short tasks, threads spend significant time synchronizing while pushing to and popping from the task queue. The usual solution for this is to implement work stealing.
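Work stealing could be sketched roughly as below. This is illustrative only: tasks are stand-in integers rather than closures, and the `Worker` type is a made-up name, not the current scheduler.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

// Each worker owns a deque: it pops from the back of its own queue and,
// when that is empty, steals from the front of another worker's queue.
// Tasks are stand-in integers here; a real scheduler would store closures
// or node ids.
type Queue = Arc<Mutex<VecDeque<i32>>>;

struct Worker {
    local: Queue,
    others: Vec<Queue>,
}

impl Worker {
    fn next_task(&self) -> Option<i32> {
        if let Some(t) = self.local.lock().unwrap().pop_back() {
            return Some(t); // fast path: own work, LIFO for locality
        }
        for q in &self.others {
            if let Some(t) = q.lock().unwrap().pop_front() {
                return Some(t); // steal the oldest task from a busy worker
            }
        }
        None // nothing to do anywhere
    }
}

fn main() {
    let busy: Queue = Arc::new(Mutex::new(VecDeque::from(vec![1, 2, 3])));
    let idle = Worker {
        local: Arc::new(Mutex::new(VecDeque::new())),
        others: vec![busy.clone()],
    };
    // The idle worker's own queue is empty, so it steals from the front.
    println!("{:?}", idle.next_task()); // prints Some(1)
}
```

With per-worker queues, most pushes and pops touch only the local lock; contention happens only on steals.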
Have the `consume` interface return a boolean and use `try_lock` when executing functions in a mutex. This might be less problematic once we implement work stealing. Also, we might want to use a `RefCell` instead of a `Mutex`.

Smartly manage the balance between having many ready tasks and memory usage, by knowing which nodes produce more work and which nodes consume more work, and prioritizing them according to the current workload.
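A non-blocking `consume` along the lines of the `try_lock` idea above might look like this sketch; the node's state type and the exact signature are assumptions for illustration, not the current code.

```rust
use std::sync::Mutex;

// Sketch: `consume` returns false instead of blocking when the node's
// state is already locked, so the worker can move on to another task.
fn consume(node: &Mutex<Vec<i32>>, x: i32) -> bool {
    match node.try_lock() {
        Ok(mut state) => {
            state.push(x); // node was free: do the work
            true
        }
        Err(_) => false, // node busy: caller should requeue the task
    }
}

fn main() {
    let node = Mutex::new(Vec::new());
    println!("{}", consume(&node, 7)); // prints true: lock was free

    let guard = node.lock().unwrap();
    // While the lock is held, consume does not block; it reports failure.
    println!("{}", consume(&node, 8)); // prints false
    drop(guard);
}
```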
Support fanout in the macro.