someday
Eventually consistent, multi-version concurrency.

`someday` is a multi-version concurrency control primitive.
All `Reader`'s receive lock-free `Commit`'s of data along with a timestamp. The single `Writer` can write lock-free and chooses when to `push()` their changes to the readers.

`push()` is atomic, and all future readers from that point will be able to see the new data.

Readers who are holding onto old copies of data can continue to do so indefinitely. If needed, they can always acquire a fresh copy of the data using `head()`, but holding onto old `Commit`'s will not block the writer from continuing.
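A minimal sketch of these snapshot semantics, using only standard-library types (a lock-based stand-in for illustration only; the crate's real internals are lock-free and this is not its API):

```rust
use std::sync::{Arc, RwLock};

fn main() {
    // The shared "head": readers grab the inner Arc, never copy the data.
    let head: Arc<RwLock<Arc<Vec<&str>>>> =
        Arc::new(RwLock::new(Arc::new(vec!["a"])));

    // A reader acquires a cheap snapshot of the current head.
    let old_snapshot: Arc<Vec<&str>> = head.read().unwrap().clone();

    // The writer publishes new data by swapping the pointer,
    // not by mutating the data readers are looking at.
    *head.write().unwrap() = Arc::new(vec!["a", "b"]);

    // New readers see the new data...
    assert_eq!(**head.read().unwrap(), vec!["a", "b"]);

    // ...while the old snapshot stays valid indefinitely.
    assert_eq!(*old_snapshot, vec!["a"]);
}
```

The key point is that "publishing" swaps a pointer to immutable data, so old snapshots are never invalidated.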
`someday`'s API uses `git` syntax and performs semantically similar actions.
The `Writer`:

1. Calls `add()` to add a `Patch` to their data
2. Actually executes those changes by `commit()`'ing
3. Can see local or remote (reader) data at any time
4. Can atomically `push()` those changes to the `Reader`'s
5. Can continue writing without having to wait on `Reader`'s
The `Reader`(s):

1. Can continually call `head()` to cheaply acquire the latest "head" `Commit`
2. Can hang onto those `Commit` objects forever (although at the peril of memory usage)
3. Will eventually catch up whenever the `Writer` calls `push()`
This example shows the typical use case, where the `Writer`:

1. Adds some changes
2. Reads their local changes
3. Adds some more changes
4. Locks in those changes by calling `commit()`
5. Finally reveals those changes to the readers by calling `push()`

and the `Reader`:

1. Continually reads their latest head `Commit` of the current data
2. Eventually catches up when the `Writer` publishes with `push()`
The code:

```rust
use someday::patch::PatchVec;
use someday::{Writer, Reader, CommitRef, Commit, Apply};

// Create a vector.
let v = vec!["a"];

// Create Reader/Writer for the vector `v`.
let (r, mut w) = someday::new(v);

// The readers see the data.
let commit: CommitRef<Vec<&str>> = r.head();
assert_eq!(commit, vec!["a"]);

// Writer writes some data, but does not commit.
w.add(PatchVec::Push("b"));
// Nothing committed, data still the same everywhere.
let data: &Vec<&str> = w.data();
assert_eq!(*data, vec!["a"]);
// Patches not yet committed:
assert_eq!(w.staged().len(), 1);

// Readers still see old data.
assert_eq!(r.head(), vec!["a"]);

// Writer writes some more data.
w.add(PatchVec::Push("c"));
// Readers still see old data.
assert_eq!(r.head(), vec!["a"]);

// Writer commits their patches.
let patches: usize = w.commit();
// The 2 operations were committed locally
// (only the Writer sees them).
assert_eq!(patches, 2);

// Readers still see old data.
assert_eq!(r.head(), vec!["a"]);

// Writer finally reveals those
// changes by calling push().
let commits_pushed = w.push();
assert_eq!(commits_pushed, 1);

// Now readers see updates.
let commit: CommitRef<Vec<&str>> = r.head();
// Each commit() added 1 to the timestamp.
assert_eq!(commit.timestamp(), 1);
```
Readers are lock-free and most of the time wait-free.
The writer is lock-free, but may block briefly in worst-case scenarios.
When the writer wants to `push()` updates to readers, it must:

1. Atomically update a pointer, at which point all future readers will see the new data
2. Re-apply the patches to the old, reclaimed data

The old data can be cheaply reclaimed and re-used by the `Writer` if there are no `Reader`'s hanging onto old `Commit`'s.
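A single-threaded sketch of that reclaim-and-re-apply step (the variable names are illustrative, and `std::mem::swap` here stands in for the crate's atomic pointer swap):

```rust
fn main() {
    // Two copies of the data: the writer's local copy and the published one.
    let mut local: Vec<&str> = vec!["a"];
    let mut published: Vec<&str> = vec!["a"];

    // Patches are kept around because they must be applied twice.
    let patches: Vec<fn(&mut Vec<&str>)> = vec![
        |v| v.push("b"),
        |v| v.push("c"),
    ];

    // commit(): apply staged patches to the local copy only.
    for p in &patches {
        p(&mut local);
    }
    assert_eq!(local, vec!["a", "b", "c"]);
    assert_eq!(published, vec!["a"]); // readers unaffected

    // push(): swap the two copies (atomically, in the real crate)...
    std::mem::swap(&mut local, &mut published);

    // ...then re-apply the same patches to the reclaimed old copy,
    // so both copies converge on the same state.
    for p in &patches {
        p(&mut local);
    }
    assert_eq!(local, published);
}
```

This also shows why patches must be deterministic: the same patch list is replayed against the reclaimed copy.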
This library is very similar to `left_right`, which uses 2 copies (left and right) of the same data to allow for high concurrency.

The big difference is that `someday` theoretically allows infinite copies of new data, as long as the readers continue to hold onto the old references.

A convenience that comes from that is that all data lives as long as there is a reader/writer, so there is no `None`-returning `.get()` like in `left_right`. In `someday`, if there is a `Reader`, they can always access data, even if the `Writer` is dropped, and vice-versa.

The downside is that there are potentially infinite copies of very similar data. This is actually a positive in some cases, but has obvious tradeoffs, see below.
If there are old `Reader`'s preventing the `Writer` from reclaiming old data, the `Writer` will create a new copy so that it can continue.

In regular read/write/mutex locks, this is where `lock()` would hang, waiting to acquire the lock. In `left_right`, this is where the `publish()` function would hang, waiting for all old readers to evacuate. In `someday`, if the `Writer` cannot reclaim old data, instead of waiting it will completely clone the data to continue.

This means old `Reader`'s are allowed to hold onto old `Commit`'s indefinitely and will never block the `Writer`.
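The standard library's `Arc::make_mut` demonstrates the same clone-instead-of-block idea in miniature (a conceptual analogy, not the crate's implementation):

```rust
use std::sync::Arc;

fn main() {
    let mut writer: Arc<Vec<i32>> = Arc::new(vec![1, 2, 3]);
    let reader = Arc::clone(&writer); // a "Reader" holding onto old data

    // Because `reader` still exists, make_mut cannot mutate in place;
    // it clones the data and mutates the clone instead of blocking.
    Arc::make_mut(&mut writer).push(4);

    assert_eq!(*writer, vec![1, 2, 3, 4]); // the writer moved on
    assert_eq!(*reader, vec![1, 2, 3]);    // the old data is untouched

    // With no readers left, the allocation is exclusively owned,
    // so make_mut mutates in place with no clone.
    drop(reader);
    Arc::make_mut(&mut writer).push(5);
    assert_eq!(*writer, vec![1, 2, 3, 4, 5]);
}
```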
This is great for small data structures that aren't too expensive to clone and/or when your `Reader`'s are holding onto the data for a while.

The obvious downside is that the `Writer` will fully clone the data over and over again. Depending on how heavy your data is (and whether it is de-duplicated via `Arc`, `Cow`, etc.), this may take a while.
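When the elements themselves live behind `Arc`, that full clone only copies pointers, not payloads, which keeps the repeated clones cheap (a quick illustration):

```rust
use std::sync::Arc;

fn main() {
    // A "heavy" payload, shared behind an Arc.
    let blob: Arc<String> = Arc::new("x".repeat(1_000_000));
    let data: Vec<Arc<String>> = vec![Arc::clone(&blob)];

    // Cloning the Vec clones the Arc handles,
    // not the megabyte of text behind them.
    let copy = data.clone();

    assert!(Arc::ptr_eq(&data[0], &copy[0])); // same allocation
    assert_eq!(Arc::strong_count(&blob), 3);  // blob + data[0] + copy[0]
}
```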
As with `left_right`, `someday` retains all the same downsides:

- Increased memory use: the `Writer` keeps two copies of the backing data structure, and `Reader`'s can keep an infinite amount (although this is actually wanted in some cases)
- Deterministic patches: the patches applied to your data must be deterministic, since the `Writer` must apply them twice
- Single writer: there is only a single `Writer`. To have multiple `Writer`'s, you need to ensure exclusive access to the `Writer` through something like a `Mutex`
- Slow writes: writes are slower than they would be directly against the backing data structure
- Patches must be enumerated: you yourself must define the patches that can be applied to your data
- Limited to simple patches: complex patches with lifetimes, return values, etc., are trickier to implement and sometimes impossible. Patches are usually limited to simple operations like setting, adding, and removal.
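Enumerating patches typically means a plain enum plus an apply function, along these lines (a sketch; the variants and the `apply` method here are illustrative, not the crate's `PatchVec` definition):

```rust
// Every allowed mutation is spelled out up front. Each variant must be
// deterministic, because the writer replays it against both copies.
enum VecPatch {
    Push(&'static str),
    Clear,
}

impl VecPatch {
    fn apply(&self, data: &mut Vec<&'static str>) {
        match self {
            VecPatch::Push(s) => data.push(*s),
            VecPatch::Clear => data.clear(),
        }
    }
}

fn main() {
    let mut data = vec!["a"];
    let staged = vec![VecPatch::Push("b"), VecPatch::Push("c")];

    // Replaying the staged patches produces the committed state.
    for patch in &staged {
        patch.apply(&mut data);
    }
    assert_eq!(data, vec!["a", "b", "c"]);
}
```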
`someday` is useful in situations where your data:

- Is relatively cheap to clone (or de-duplicated)

and you have readers who:

- Want to acquire the latest copy of data, lock-free
- Hold onto data for a little while (or forever)

and a writer that:

- Wants to make changes to data, lock-free
- Wants to "publish" those changes ASAP to new readers, lock-free
- Doesn't need to "publish" data at an extremely fast rate (e.g., 100,000 times a second)