Easy, Erlang-inspired fault-tolerance framework for Rust Futures.
Features:
`Box<dyn Any>`, LOL.

Beta quality. Everything appears to work correctly, but we want to write more tests before we feel confident it is correct. I have fixed little bugs as I've noticed them, so clearly we needed better tests.
The API may change slightly before the initial crates.io release, but nothing major, I hope. Broadly speaking, I'm delighted with it; I'm just polishing it up and trying to make the documentation less awful.
The Backplane (that's a fancy word for 'motherboard') is a dynamic mesh of `Device`s. The `Device` object is a Future's connection into the Backplane. It maintains connections to other Devices so that, when it disconnects (completes), it notifies them. We can connect to another Device with `Device.link()`, passing a `LinkMode`, of which there are three:

* `Monitor` - be notified when the other Device disconnects.
* `Notify` - notify the other Device when this Device disconnects.
* `Peer` - both notify each other when they disconnect.

The way we react to these disconnections is what makes our applications reliable. Erlang's equivalent of a spawned future, a `process`, is categorised according to how it handles errors:

* `worker` processes notified of a failure will fail themselves.
* `supervisor` processes notified of a completion will apply some sort of logic to restart processes under their supervision.

In async-backplane, `worker` corresponds to the `Device.manage()` method. Here's an example using the 'smol' futures executor:

```rust
use async_backplane::*;
use smol::Task;

fn example() {
    let device = Device::new();
    Task::spawn(async move {
        device.manage(async { ... });
    }).detach();
}
```

There are three logical steps here:

* Creating the Device (`Device::new()`).
* Spawning a Future on the executor (`Task::spawn(...).detach()`).
* In the spawned Future, putting the Device into managed mode with an async block to execute (`device.manage(async { ... })`).

Managed devices will run until the first of:

* The async block returning a result.
* The async block unwind panicking.
* A Device sending us a message:
  * On receiving a shutdown request, complete successfully.
  * On receiving a disconnect notification that is fatal, fault.

The async block you provide should return a `Result` of some kind. If you return `Ok`, the Device will be considered to have successfully completed its work. If you return `Err`, the Device will be considered to have faulted.

When any of these conditions has occurred, the Device will notify all Devices that are monitoring it of its status, and the Device will be dropped. The `manage()` method returns a `Result<T, Crash<C>>`, where `T` is the success type of the Result returned by the async block and `C` is the error type of that same Result. `Crash` is just an enum with an arm for each kind of failure.
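
To make the shape of that return value more concrete, here is a minimal, self-contained sketch of the rules just described. `Stop`, `ToyCrash` and `resolve` are hypothetical names invented for illustration; they are not the library's `Crash` type or its internals. How the real library represents a shutdown request in its return value is not covered here, so the sketch assumes `Ok(None)` for that case.

```rust
/// Hypothetical stand-in for the reasons a managed Device stops.
#[derive(Debug, PartialEq)]
enum Stop<T, C> {
    /// The async block returned a Result.
    Returned(Result<T, C>),
    /// The async block panicked (unwound).
    Panicked,
    /// Another Device asked us to shut down.
    ShutdownRequested,
    /// A linked Device disconnected in a way that is fatal to us.
    FatalDisconnect,
}

/// Hypothetical failure enum with one arm per kind of failure.
#[derive(Debug, PartialEq)]
enum ToyCrash<C> {
    Error(C),
    Panic,
    Cascade,
}

/// Maps a stop reason to a success or a crash.
/// Assumption: a shutdown request has no `T` to hand back, so success is
/// modelled as `Option<T>` in this sketch.
fn resolve<T, C>(stop: Stop<T, C>) -> Result<Option<T>, ToyCrash<C>> {
    match stop {
        Stop::Returned(Ok(value)) => Ok(Some(value)),
        Stop::Returned(Err(error)) => Err(ToyCrash::Error(error)),
        Stop::Panicked => Err(ToyCrash::Panic),
        Stop::ShutdownRequested => Ok(None),
        Stop::FatalDisconnect => Err(ToyCrash::Cascade),
    }
}
```

Matching on an enum like this is what handling a crash would look like at the call site: each kind of failure gets its own arm, and the compiler complains if one is forgotten.
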

I'm still trying to work out what to do with crashes. I don't want this library to be too opinionated or to bloat the dependency tree too much. Maybe I'll do an opinionated library that uses this one, or maybe you'll just create your own `manage_panic()` function in each project and use that? Suggestions gratefully received!

`Device.watch()` is the tool for building more complex behaviours. It protects against unwind panics and listens for disconnects, but it simply returns whichever comes first: the provided future's result or the next disconnect.

One of the more useful things you can do with `watch()` is recreate futures that have failed. Indeed, this is how Erlang supervisors work!
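
As a rough illustration of that restart idea, the sketch below uses only the standard library and plain closures instead of futures, so it does not touch `Device.watch()` at all. `supervise`, `max_restarts` and `restart_demo` are hypothetical names; a real supervisor built on this library would react to disconnect notifications rather than call the work directly.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Runs `make_attempt` until it succeeds or the restart budget is exhausted.
/// Both an `Err` return and a panic count as a failure that triggers a restart.
/// Returns the successful value together with the number of restarts used.
fn supervise<T, F>(max_restarts: usize, mut make_attempt: F) -> Option<(T, usize)>
where
    F: FnMut() -> Result<T, String>,
{
    for restarts in 0..=max_restarts {
        // catch_unwind stands in for the unwind protection described above.
        let outcome = catch_unwind(AssertUnwindSafe(|| make_attempt()));
        if let Ok(Ok(value)) = outcome {
            return Some((value, restarts));
        }
        // Otherwise the attempt failed or panicked: go round again and recreate it.
    }
    None
}

fn restart_demo() -> Option<(u32, usize)> {
    let mut calls = 0;
    // Fails twice, then succeeds on the third attempt.
    supervise(5, || {
        calls += 1;
        if calls < 3 {
            Err(format!("attempt {} failed", calls))
        } else {
            Ok(42)
        }
    })
}
```

Here `restart_demo()` returns `Some((42, 2))`: two failed attempts, two restarts, then success. The restart budget is the one piece of policy in the sketch; deciding that policy is what a supervisor is for.
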

There's a lot of work still to do here. Much of it will probably be in libraries that build on top of this one.
Devices can be linked together by calling their `link()` method. It takes a `LinkMode`, as described at the start of the guide. Example:

```rust
use async_backplane::*;

fn demo() {
    let a = Device::new();
    let b = Device::new();
    let c = Device::new();
    a.link(&b, LinkMode::Peer);
    b.link(&c, LinkMode::Peer);
    // ... now go spawn them all ...
}
```

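
To see what those two `Peer` links imply, here is a toy model of the mesh using only the standard library. `ToyLinkMode`, `ToyMesh`, `notified_by` and `mesh_demo` are hypothetical and exist only for this sketch; they mirror the three link modes described earlier, not the library's implementation.

```rust
use std::collections::BTreeSet;

/// Hypothetical mirror of the three link modes.
#[derive(Clone, Copy)]
enum ToyLinkMode {
    /// The linking device is notified when the other disconnects.
    Monitor,
    /// The other device is notified when the linking device disconnects.
    Notify,
    /// Both are notified when the other disconnects.
    Peer,
}

/// Directed "tell `listener` when `subject` disconnects" edges between named devices.
#[derive(Default)]
struct ToyMesh {
    edges: BTreeSet<(&'static str, &'static str)>, // (subject, listener)
}

impl ToyMesh {
    /// Records a link created by `from` towards `to`.
    fn link(&mut self, from: &'static str, to: &'static str, mode: ToyLinkMode) {
        if matches!(mode, ToyLinkMode::Monitor | ToyLinkMode::Peer) {
            self.edges.insert((to, from)); // `from` hears about `to`
        }
        if matches!(mode, ToyLinkMode::Notify | ToyLinkMode::Peer) {
            self.edges.insert((from, to)); // `to` hears about `from`
        }
    }

    /// Lists, in sorted order, who is notified when `subject` disconnects.
    fn notified_by(&self, subject: &str) -> Vec<&'static str> {
        self.edges
            .iter()
            .filter(|(s, _)| *s == subject)
            .map(|(_, listener)| *listener)
            .collect()
    }
}

/// Rebuilds the a-b-c topology from the example above.
fn mesh_demo() -> ToyMesh {
    let mut mesh = ToyMesh::default();
    mesh.link("a", "b", ToyLinkMode::Peer);
    mesh.link("b", "c", ToyLinkMode::Peer);
    mesh
}
```

In this model, `mesh_demo().notified_by("b")` yields `a` and `c`, while `a` disconnecting only reaches `b`. Whether `b` then goes down too, and so passes the failure on to `c`, depends on how `b` reacts, which is the worker versus supervisor distinction.
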
Most of our Devices will be running in managed mode after they have been set up. Managed mode takes ownership of our `Device`, so how do we link? Enter the `Line`, a reference to a `Device` that can be cloned and passed around freely.

Getting a `Line` is simple: `device.line()`. Linking to a `Line` from a `Device` is much like linking to a `Device`, except we call `link_line()` instead of `link()`. Unlike `link()`:

* It can fail if the `Device` the line is connected to has disconnected, so it returns a `Result`.

You can link between `Line`s directly as well: `Line` also has a `link_line()` method!

```rust
use async_backplane::*;

fn demo() {
    let a = Device::new();
    let b = Device::new();
    let c = Device::new();
    let c2 = c.line();
    let d = Device::new();
    let d2 = d.line();
    a.link(&b, LinkMode::Peer);
    b.link_line(c2, LinkMode::Peer).unwrap();
    c2.link_line(d2, LinkMode::Peer).unwrap();
    // ... now go spawn them all ...
}
```

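
Why can linking through a `Line` fail when linking to a `Device` cannot? One way to picture it, as an analogy only, is an owner and a weak handle from the standard library: the handle can outlive the thing it points to. `ToyDevice`, `ToyLine` and `LinkError` are hypothetical names for this single-threaded sketch and say nothing about how the library stores its state.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Hypothetical device: owns the list of names it has been linked with.
struct ToyDevice {
    links: Rc<RefCell<Vec<String>>>,
}

/// Hypothetical line: a cloneable handle that does not keep the device alive.
#[derive(Clone)]
struct ToyLine {
    links: Weak<RefCell<Vec<String>>>,
}

/// Returned when the device behind a line no longer exists.
#[derive(Debug, PartialEq)]
struct LinkError;

impl ToyDevice {
    fn new() -> Self {
        ToyDevice { links: Rc::new(RefCell::new(Vec::new())) }
    }

    /// Hands out a handle that can be cloned and passed around freely.
    fn line(&self) -> ToyLine {
        ToyLine { links: Rc::downgrade(&self.links) }
    }

    /// Number of links recorded so far.
    fn link_count(&self) -> usize {
        self.links.borrow().len()
    }
}

impl ToyLine {
    /// Fails if the device behind this handle has already gone away.
    fn link(&self, name: &str) -> Result<(), LinkError> {
        let links = self.links.upgrade().ok_or(LinkError)?;
        links.borrow_mut().push(name.to_string());
        Ok(())
    }
}
```

While the `ToyDevice` is alive, every clone of its `ToyLine` can link successfully; once the device is dropped, `link()` returns `Err(LinkError)`. That is the same reason the example above has to deal with a `Result` from `link_line()`.
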
Once you have linked with something through a `Line`, you should only unlink it through the `Line`. Device-to-Device linkage is fast because it avoids the work that would make it handle this case correctly. In general, you should only link or unlink with `Device`s when you know you have not previously linked with the corresponding `Line`s.

async-backplane does not implement actors, only links and monitors. It is a lower level tool that allows for a wider range of usage patterns. You could build actors (and other things!) on top of this.

These work great alongside `async-backplane`:

Note: these will likely be new libraries, linked from here when public.

These numbers are random, unscientific benchmark measurements from my shitty 2015 MacBook Pro. Your numbers may be different. Run the benchmarks or, better still, benchmark your real-world code using it.

```
     Running target/release/deps/device-8add01b9803770b5

running 11 tests
test create_destroy              ... bench:         212 ns/iter (+/- 9)
test device_monitor_drop         ... bench:         585 ns/iter (+/- 64)
test device_monitor_drop_notify  ... bench:         771 ns/iter (+/- 39)
test device_monitor_error_notify ... bench:         798 ns/iter (+/- 39)
test device_peer_drop_notify     ... bench:         964 ns/iter (+/- 40)
test device_peer_error_notify    ... bench:         941 ns/iter (+/- 304)
test line_monitor_drop           ... bench:         805 ns/iter (+/- 48)
test line_monitor_drop_notify    ... bench:         975 ns/iter (+/- 48)
test line_monitor_error_notify   ... bench:         993 ns/iter (+/- 55)
test line_peer_drop_notify       ... bench:       1,090 ns/iter (+/- 62)
test line_peer_error_notify      ... bench:       1,181 ns/iter (+/- 65)

test result: ok. 0 passed; 0 failed; 0 ignored; 11 measured; 0 filtered out

     Running target/release/deps/line-c87021ef05fddd66

running 6 tests
test create_destroy            ... bench:          13 ns/iter (+/- 4)
test line_monitor_drop         ... bench:         793 ns/iter (+/- 51)
test line_monitor_drop_notify  ... bench:         968 ns/iter (+/- 357)
test line_monitor_error_notify ... bench:       1,018 ns/iter (+/- 54)
test line_peer_drop_notify     ... bench:       1,343 ns/iter (+/- 70)
test line_peer_error_notify    ... bench:       1,370 ns/iter (+/- 77)
```

Note that when linking, it is cheaper to use a Device than a Line, that is:

* `device.link()` is fastest.
* `device.link_line()` is slightly more expensive.
* `line.link_line()` is slightly more expensive still.

If performance really matters, do not use dynamic topologies. Also spend some time micro-optimising this library, because we haven't yet.

Copyright (c) 2020 James Laver, async-backplane Contributors
This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0. If a copy of the MPL was not distributed with this file, You can obtain one at http://mozilla.org/MPL/2.0/.