Array Cow
Introduction
In-memory array de-duplication, useful for efficiently storing many versions of data.
This is suitable for storing undo history, for example, and is effective with both binary and text data.
- Configurable block sizes.
- Supports array-stride to avoid the overhead of detecting blocks at unaligned offsets
(a stride of 1 for bytes works too).
- Uses block hashing for de-duplication.
- Each state only needs to reference its previous state, making both linear and tree structures possible.
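
The features above can be illustrated with a minimal Python sketch. It is not this project's implementation or API: the names `BlockStore`, `add_state`, `get_data`, and `block_elems` are hypothetical, and it assumes fixed block boundaries and a BLAKE2b digest as the block hash. Identical blocks are stored once, block boundaries always fall on a multiple of the stride, and each state keeps only a reference to its parent.

```python
import hashlib


class BlockStore:
    """Toy de-duplicating store: identical blocks are kept only once."""

    def __init__(self, stride=1, block_elems=4):
        # Block boundaries always fall on a multiple of `stride` bytes.
        self.stride = stride
        self.block_size = stride * block_elems
        self.blocks = {}  # digest -> block bytes, shared by all states

    def add_state(self, data, parent=None):
        if len(data) % self.stride != 0:
            raise ValueError("data length must be a multiple of the stride")
        keys = []
        for i in range(0, len(data), self.block_size):
            block = bytes(data[i:i + self.block_size])
            key = hashlib.blake2b(block, digest_size=16).digest()
            # Store the block only if an identical one is not already present.
            self.blocks.setdefault(key, block)
            keys.append(key)
        # A state records its block keys and a reference to its parent only.
        return {"keys": keys, "parent": parent}

    def get_data(self, state):
        # Rebuild the full array by joining the referenced blocks in order.
        return b"".join(self.blocks[k] for k in state["keys"])


store = BlockStore(stride=4, block_elems=2)  # 8-byte blocks
a = store.add_state(b"AAAAAAAABBBBBBBBCCCCCCCC")
b = store.add_state(b"AAAAAAAAXXXXXXXXCCCCCCCC", parent=a)  # linear history
c = store.add_state(b"AAAAAAAABBBBBBBBYYYYYYYY", parent=a)  # branch from `a`
```

Here the three states reference nine blocks in total, but only five distinct blocks are stored. Because `b` and `c` both point to `a`, the states form a tree rather than a single chain. Fixed boundaries are a simplification: an insertion that shifts the data would defeat this sketch, which is the kind of case a real implementation has to handle.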
Further Work
It may be worth using mmap for data storage.