Meshanina is a rather strange key-value database, with three assumptions:
H
maps every value to every key: H(v) = k
. That is, the same key will never be rebound to a different value.Meshanina is designed for use as a content-addressed datastore, where keys are typically hashes of values and deletion is ill-defined.
A Meshanina database maps 256-bit integers to arbitrary byteslices. At its heart, it is simply an open-addressed hashtable mapping 256-bit keys to "records".
Database files are fixed-size. When they fill up, a new one twice as big is created and all records are moved across.
Each record takes up exactly 1024 bytes, and looks like this:
This, naturally, does not support values that are bigger than 512-4-32-4=984 bytes. Bigger values are supported through a separate mechanism.
We use linear probing: first we modulo the key with the number of records to find a "record number", then we scan until we find the correct one or hit an empty slot. Slots with non-validating checksums are treated as "empty", nonallocated space.
We probe for an "empty" space to insert the key. As a special case, if we find a record with identical key, we simply abort insertion.
Assuming no "blast radius" (edits to one record cannot corrupt other records), we have the following important property: once a key is bound to a value, and its records are safely on disk, the binding can never be corrupted no matter how wrongly subsequent writes go.
Achieving this property is the main reason why Meshanina chooses to use a naive open-addressed hashtable. It's much harder if we use b-trees or other linked data structures, especially without any kind of journaling or crash-recovery mechanism.