Whisper

A turbo-charged whisper database implementation.

Open tasks

What is Whisper?

Whisper is a fixed-size file format for storing one run of time series measurements. A measurement comes from a single instance of a thing, such as CPU0 on Computer A or Bytes Transmitted on eth0 for Computer B. To measure a whole system you will end up with multiple whisper files.

The fixed size of the file means it has a fixed retention: it can only store so many measurements. Each measurement has a timestamp which corresponds to a predetermined location in the file. When you get to the end of the file, the location wraps around and overwrites the oldest data.
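The timestamp-to-location mapping can be sketched as a ring buffer. This is a minimal illustration, not the code from this library: the `Archive` struct, its field names, and the 12-byte point size are assumptions made for the example, and the real on-disk format may anchor slots differently (for example, relative to the first timestamp written).

```rust
/// Assumed size of one stored point in bytes: a 4-byte timestamp
/// followed by an 8-byte value.
const POINT_SIZE: u64 = 12;

/// Hypothetical description of one fixed-size run of points.
struct Archive {
    /// Byte offset in the file where this run's points begin.
    offset: u64,
    /// Resolution: how many seconds each slot covers.
    seconds_per_point: u64,
    /// Total number of slots; this is what fixes the retention.
    points: u64,
}

impl Archive {
    /// Round a timestamp down to the start of its interval.
    fn align(&self, timestamp: u64) -> u64 {
        timestamp - (timestamp % self.seconds_per_point)
    }

    /// Byte position of the slot that holds `timestamp`.
    /// The modulo is the wrap-around: once every slot has been used,
    /// newer timestamps land on (and overwrite) the oldest slots.
    fn slot_offset(&self, timestamp: u64) -> u64 {
        let slot = (timestamp / self.seconds_per_point) % self.points;
        self.offset + slot * POINT_SIZE
    }
}
```

With 60 one-second slots, timestamps 0 and 60 map to the same byte position, so the second write replaces the first.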

Whisper has clear benefits: easy capacity planning and no dynamic allocations. Whisper files are among the simplest ways of storing time series data. This simplicity has its tradeoffs, but it buys raw performance and throughput.
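To make the capacity planning point concrete, here is a rough sketch that estimates a file's size from retention specs of the form "1s:60s" (the form used in the usage example in this README). The byte sizes are assumptions based on the layout of the Python implementation (16 bytes of metadata, 12 bytes of header per archive, 12 bytes per point), a year is assumed to be 365 days, and `parse_duration`, `parse_spec`, and `file_size` are hypothetical helpers, not functions from this library.

```rust
// Assumed header and point sizes, in bytes.
const METADATA_SIZE: u64 = 16;
const ARCHIVE_INFO_SIZE: u64 = 12;
const POINT_SIZE: u64 = 12;

/// Convert a duration such as "60s", "1m" or "1y" into seconds.
fn parse_duration(s: &str) -> Option<u64> {
    let s = s.trim();
    // Split at the first non-digit character: "60s" -> ("60", "s").
    let split = s.find(|c: char| !c.is_ascii_digit())?;
    let (digits, unit) = s.split_at(split);
    let n: u64 = digits.parse().ok()?;
    let multiplier = match unit {
        "s" => 1,
        "m" => 60,
        "h" => 3_600,
        "d" => 86_400,
        "w" => 604_800,
        "y" => 31_536_000, // assumption: 365-day year
        _ => return None,
    };
    Some(n * multiplier)
}

/// Parse "precision:retention" into (seconds_per_point, points).
fn parse_spec(spec: &str) -> Option<(u64, u64)> {
    let (precision, retention) = spec.split_once(':')?;
    let seconds_per_point = parse_duration(precision)?;
    let retention_secs = parse_duration(retention)?;
    if seconds_per_point == 0 || retention_secs < seconds_per_point {
        return None;
    }
    Some((seconds_per_point, retention_secs / seconds_per_point))
}

/// Estimated size in bytes of a file holding the given archives.
fn file_size(specs: &[&str]) -> Option<u64> {
    let mut total = METADATA_SIZE + ARCHIVE_INFO_SIZE * specs.len() as u64;
    for spec in specs {
        let (_, points) = parse_spec(spec)?;
        total += points * POINT_SIZE;
    }
    Some(total)
}
```

Under these assumptions, `file_size(&["1s:60s", "1m:1y"])` comes to 6,307,960 bytes, and that number never changes no matter how long the file is written to.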

Note: if you want a more modern, clustered, appending time series database, we highly recommend exploring InfluxDB. It is still under heavy development but reflects the future we want.

The Python Implementation

The original whisper system is written in Python; check out the project on GitHub. It has a severe lack of tests, the code has multiple unused variables, and the most interesting parts are large, undocumented methods.

This Rust Implementation

This is actually version 2 of Xavier's reimplementation. The aim is to create a small, fast library which can become the kernel of a full graphite implementation. It maintains full backwards compatibility with your existing whisper files.

How is it faster?

How is the code better?

How do I use it?

Clone the repository with git clone https://github.com/tureus/whisper-mmap, then run cargo test to verify that everything works.

Simply opening a whisper file:

```
use std::path::Path;

// Schema and WhisperFile are provided by this crate.
let path = Path::new("/tmp/blah.wsp").to_path_buf();
let default_specs = vec!["1s:60s".to_string(), "1m:1y".to_string()];
let schema = Schema::new_from_retention_specs(default_specs);

let file = WhisperFile::new(&path, schema).unwrap();
// do things with the file
```