Fast zero-configuration single-binary simple queue service.
queued requires persistent storage. It's preferred to provide a block device directly (e.g. `/dev/my_block_device`) to bypass the file system for higher performance, but a standard file can be used too (e.g. `/var/lib/queued/data`). In either case, the entire device/file will be used.
```
cargo install queued
# Format the device or file before the first run.
queued --device /dev/my_block_device --format
# Start the server.
queued --device /dev/my_block_device
```
```jsonc
// 🌐 POST localhost:3333/push
{
  "messages": [
    { "contents": "Hello, world!", "visibility_timeout_secs": 0 }
  ]
}
// ✅ 200 OK
{
  "id": 190234
}

// 🌐 POST localhost:3333/poll
{
  "visibility_timeout_secs": 30
}
// ✅ 200 OK
{
  "message": {
    "contents": "Hello, world!",
    "created": "2023-01-03T12:00:00Z",
    "id": 190234,
    "poll_count": 1,
    "poll_tag": "f914659685fcea9d60"
  }
}

// 🌐 POST localhost:3333/update
{
  "id": 190234,
  "poll_tag": "f914659685fcea9d60",
  "visibility_timeout_secs": 15
}
// ✅ 200 OK
{}

// 🌐 POST localhost:3333/delete
{
  "id": 190234,
  "poll_tag": "f914659685fcea9d60"
}
// ✅ 200 OK
{}
```
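As a rough illustration of the lifecycle above, here is a minimal client sketch. It assumes Node.js 18+ (global `fetch`) and a server listening on `localhost:3333` as in the examples; all endpoint paths and field names are taken from those examples, and the handling of an empty poll (no `message` in the response) is an assumption.

```ts
// Minimal push → poll → delete sketch against a local queued server.
const BASE = "http://localhost:3333";

const post = async (path: string, body: unknown) => {
  const res = await fetch(`${BASE}${path}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`${path} failed with ${res.status}`);
  return res.json();
};

const main = async () => {
  // Enqueue a message that becomes visible immediately.
  await post("/push", {
    messages: [{ contents: "Hello, world!", visibility_timeout_secs: 0 }],
  });

  // Poll a message and hide it from other consumers for 30 seconds.
  const { message } = await post("/poll", { visibility_timeout_secs: 30 });
  if (message) {
    // ... process message.contents here ...
    // Acknowledge it with the poll tag returned by /poll.
    await post("/delete", { id: message.id, poll_tag: message.poll_tag });
  }
};

main().catch(console.error);
```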
With a single Intel Alder Lake CPU core and NVMe SSD, queued manages around 300,000 operations (push, poll, update, or delete) per second with 4,096 concurrent clients and a batch size of 64. There is minimal memory usage; only metadata of each message is stored in memory.
As every operation is durably persisted to the underlying storage, the storage I/O performance can quickly become a bottleneck. Consider using RAID 0 and tuning the write latency for better performance.
At the API layer, only a successful response (i.e. `2xx`) means that the request has been successfully persisted (`fdatasync`) to disk. Assume any interrupted or failed requests did not safely get stored, and retry as appropriate. Changes are immediately visible to all other callers.
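Since an interrupted request may or may not have been persisted, a client typically wraps calls in a retry loop. A hedged sketch follows; the attempt count and backoff are arbitrary choices, not queued behaviour:

```ts
// Retry a request until a 2xx response confirms it was durably persisted.
// Non-2xx responses are retried here for simplicity; real clients may want
// to distinguish permanent errors from transient ones.
const postWithRetry = async (url: string, body: unknown, attempts = 5) => {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await fetch(url, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(body),
      });
      if (res.ok) return res.json(); // 2xx: the request has been persisted.
    } catch {
      // Network error or timeout: the outcome is unknown. Note that retrying a
      // push whose original request actually succeeded will enqueue a duplicate.
    }
    await new Promise((r) => setTimeout(r, 100 * 2 ** i)); // simple backoff
  }
  throw new Error(`no successful response after ${attempts} attempts`);
};
```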
It's recommended to use error-correcting durable storage when running in production, like any other stateful workload.
Performing backups can be done by stopping the process and taking a copy of the contents of the file/device.
`POST /suspend` can suspend specific API endpoints, useful for temporary debugging or emergency intervention without stopping the server. It takes a request body like:
```json
{
  "delete": true,
  "poll": false,
  "push": false,
  "update": true
}
```
Set a property to `true` to disable that endpoint, and `false` to re-enable it. Disabled endpoints will return `503 Service Unavailable`. Use `GET /suspend` to get the currently suspended endpoints.
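For example, a sketch of temporarily pausing consumers while leaving producers untouched; the base URL is assumed to be the same as in the examples above, and all four endpoints are listed explicitly to match the body shape shown:

```ts
// Disable /poll (pause consumers) while keeping push, update, and delete enabled.
await fetch("http://localhost:3333/suspend", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ delete: false, poll: true, push: false, update: false }),
});
// From now on, /poll requests return 503 Service Unavailable until the endpoint
// is re-enabled with { "poll": false }. GET /suspend shows the current state.
```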
`POST /throttle` will configure poll throttling, useful for flow control and rate limiting. It takes a request body like:
```json
{
  "throttle": {
    "max_polls_per_time_window": 100,
    "time_window_sec": 60
  }
}
```
This will rate limit poll requests to 100 every 60 seconds. No other endpoint is throttled. Throttled requests will return `429 Too Many Requests`. Use `GET /throttle` to get the current throttle setting. To disable throttling:
```json
{
  "throttle": null
}
```
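A sketch of applying the throttle and reacting to it from a consumer follows; the one-second pause on a 429 is an arbitrary choice, not something queued prescribes:

```ts
// Limit polling to 100 requests per 60 seconds.
await fetch("http://localhost:3333/throttle", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    throttle: { max_polls_per_time_window: 100, time_window_sec: 60 },
  }),
});

// Consumers should treat 429 as a signal to back off before polling again.
const res = await fetch("http://localhost:3333/poll", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ visibility_timeout_secs: 30 }),
});
if (res.status === 429) {
  await new Promise((r) => setTimeout(r, 1000));
}
```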
`GET /healthz` returns the current build version.
`GET /metrics` returns metrics in the Prometheus or JSON (`Accept: application/json`) format:
```
queued_empty_poll 0 1678525380549
queued_invisible 0 1678525380549
queued_io_sync_background_loops 19601 1678525380549
queued_io_sync 0 1678525380549
queued_io_sync_delayed 0 1678525380549
queued_io_sync_longest_delay_us 0 1678525380549
queued_io_sync_shortest_delay_us 0 1678525380549
queued_io_sync_us 0 1678525380549
queued_io_write_bytes 0 1678525380549
queued_io_write 0 1678525380549
queued_io_write_us 0 1678525380549
queued_missing_delete 0 1678525380549
queued_missing_update 0 1678525380549
queued_successful_delete 0 1678525380549
queued_successful_poll 0 1678525380549
queued_successful_push 0 1678525380549
queued_successful_update 0 1678525380549
queued_suspended_delete 0 1678525380549
queued_suspended_poll 0 1678525380549
queued_suspended_push 0 1678525380549
queued_suspended_update 0 1678525380549
queued_throttled_poll 0 1678525380549
queued_vacant 0 1678525380549
queued_visible 4000000 1678525380549
```
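To fetch the same metrics as JSON instead of the Prometheus text format above, set the `Accept` header; this sketch only shows the content negotiation, and the exact JSON field layout is not documented here:

```ts
// Request the JSON representation of the metrics.
const res = await fetch("http://localhost:3333/metrics", {
  headers: { Accept: "application/json" },
});
console.log(await res.json());
```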
Clients in `example-client` can help with running synthetic workloads for stress testing, performance tuning, and profiling.
As I/O becomes the main focus of optimisation, keep in mind:

- We assume powersafe overwrites, i.e. a `write` won't affect any data outside of the target range.
- `write` syscall data is immediately visible to all `read` syscalls in all threads and processes.
- `write` syscalls can be reordered, unless `fdatasync`/`fsync` is used, which acts as both a barrier and a cache flusher. This means that a fast sequence of `write` (1: create) -> `read` (2: inspect) -> `write` (3: update) can actually cause 1 to clobber 3; see the sketch below. Ideally there would be two different APIs for creating a barrier and flushing the cache.
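To make the last point concrete, here is a minimal sketch (assuming Node.js and a hypothetical scratch file; the path and payloads are illustrative only) of using `fdatasync` as a barrier between two dependent overwrites of the same range:

```ts
import { closeSync, fdatasyncSync, openSync, readSync, writeSync } from "node:fs";

// Hypothetical scratch file; a preallocated file or block device behaves the same way.
const fd = openSync("/tmp/io-ordering-demo", "w+");
const offset = 0;

// 1: create the record.
writeSync(fd, Buffer.from("record v1"), 0, 9, offset);

// Barrier + cache flush. Without it, the kernel may persist the later
// overwrite before this one, so a crash could leave 1 clobbering 3.
fdatasyncSync(fd);

// 2: inspect. Reads always see the latest written data, even before any sync.
const buf = Buffer.alloc(9);
readSync(fd, buf, 0, 9, offset);

// 3: update the same range, now ordered after 1 on persistent storage.
writeSync(fd, Buffer.from("record v2"), 0, 9, offset);
fdatasyncSync(fd);

closeSync(fd);
```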