###############
Block Array Cow
###############

Introduction

In-memory array de-duplication, useful for efficiently storing many versions of data.

This is suitable for storing undo history, for example, where the size of a struct can be used as the stride. It is effective with both binary and text data.
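To make the idea of a stride concrete, here is a minimal sketch in C. It is not the library's API: ``Vertex``, ``count_shared_elements`` and ``example_shared_vertices`` are hypothetical names introduced only for illustration. It treats two versions of a struct array as raw bytes and counts how many stride-sized elements are byte-identical at the same index, which is the kind of redundancy de-duplication can exploit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical element type, used only for this illustration. */
typedef struct Vertex {
    float co[3];
    int flag;
} Vertex;

/* Count elements of `stride` bytes that are byte-identical
 * at the same index in both arrays. */
size_t count_shared_elements(const void *a, const void *b, size_t stride, size_t count)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    size_t shared = 0;
    assert(stride > 0);
    for (size_t i = 0; i < count; i++) {
        if (memcmp(pa + i * stride, pb + i * stride, stride) == 0) {
            shared++;
        }
    }
    return shared;
}

/* Two undo states of the same array, differing in a single element. */
size_t example_shared_vertices(void)
{
    Vertex before[4];
    Vertex after[4];
    /* Zero everything first so any padding bytes compare equal. */
    memset(before, 0, sizeof(before));
    for (int i = 0; i < 4; i++) {
        before[i].co[0] = (float)i;
        before[i].flag = i;
    }
    memcpy(after, before, sizeof(before));
    after[2].flag = 99; /* the only edit between the two states */
    /* The struct size is the stride. */
    return count_shared_elements(before, after, sizeof(Vertex), 4);
}
```

Here one of the four elements changes between states, so ``example_shared_vertices()`` returns 3: three elements could in principle be stored once and shared by both states.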

The code is Apache 2.0 licensed and has no dependencies.

Motivation

For an undo system (or any other history storage) you may want to store many versions of your data.

In some cases it makes sense to write a `persistent data structure <https://en.wikipedia.org/wiki/Persistent_data_structure>`__, but this depends a lot on the kind of data you're dealing with.

In other cases it's nice to have the convenience of being able to serialize your data and store it without worrying about the details of how duplication is managed.

That's the motivation for writing this library.

Algorithm

Where *N* is currently ``stride * 7``, see: ``BCHUNK_HASH_TABLE_ACCUMULATE_STEPS``.

.. note::

   This has a slight emphasis on performance, since this method is used in a 3D modeler's undo system, where changes accumulate and are freed at run-time.

The use of fixed-size chunks is better suited to in-memory data storage than to, for example, a command line utility that creates a one-off binary diff, where more exhaustive tests may be preferred.
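As a rough illustration of storing versions as fixed-size chunks, here is a simplified sketch in C. It is not the library's implementation: ``Chunk``, ``State``, ``state_create`` and ``state_free`` are hypothetical names, and the sketch assumes a chunk is shared only when the bytes at the same chunk index match the reference state. It makes no attempt at hash-based lookup of moved data.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration; not the library's API. */
typedef struct Chunk {
    unsigned char *data;
    size_t len;
    int users; /* number of states referencing this chunk */
} Chunk;

typedef struct State {
    Chunk **chunks;
    size_t chunks_num;
} State;

/* Build a state from `data`, sharing a chunk with `reference` (may be NULL)
 * when the bytes at the same chunk index are identical.
 * Allocation failures are not handled, to keep the sketch short. */
State *state_create(const unsigned char *data, size_t len, size_t chunk_size,
                    const State *reference, size_t *r_chunks_allocated)
{
    assert(chunk_size > 0);
    State *state = malloc(sizeof(*state));
    state->chunks_num = (len + chunk_size - 1) / chunk_size;
    state->chunks = malloc(sizeof(Chunk *) * (state->chunks_num ? state->chunks_num : 1));
    *r_chunks_allocated = 0;
    for (size_t i = 0; i < state->chunks_num; i++) {
        const size_t offset = i * chunk_size;
        const size_t chunk_len = (len - offset < chunk_size) ? (len - offset) : chunk_size;
        Chunk *chunk = NULL;
        if (reference && i < reference->chunks_num) {
            Chunk *candidate = reference->chunks[i];
            if (candidate->len == chunk_len &&
                memcmp(candidate->data, data + offset, chunk_len) == 0) {
                chunk = candidate; /* identical bytes: share instead of copying */
                chunk->users++;
            }
        }
        if (chunk == NULL) {
            chunk = malloc(sizeof(*chunk));
            chunk->data = malloc(chunk_len);
            memcpy(chunk->data, data + offset, chunk_len);
            chunk->len = chunk_len;
            chunk->users = 1;
            (*r_chunks_allocated)++;
        }
        state->chunks[i] = chunk;
    }
    return state;
}

/* Release a state; a chunk is freed only when its last user is gone. */
void state_free(State *state)
{
    for (size_t i = 0; i < state->chunks_num; i++) {
        Chunk *chunk = state->chunks[i];
        if (--chunk->users == 0) {
            free(chunk->data);
            free(chunk);
        }
    }
    free(state->chunks);
    free(state);
}
```

With a 10-byte array and a chunk size of 4, the first state allocates three chunks. A second state that differs in one byte allocates only the single chunk containing that byte and shares the other two. Because chunk boundaries are fixed, each comparison is a single ``memcmp``, which keeps the cost of adding a state low. The trade-off is that data shifted by an insertion no longer lines up with the reference chunks in this naive form.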

Supported

Unsupported

In general, operations that would require excessive calculation are avoided, since there are many possible changes that would improve memory usage at the cost of performance.

Further Work

Some things that may be worth considering.

Links