zkmq.
rust zookeeper backed message queue. designed as a low throughput, high durability message queue using only
zookeeper. messages also support filtering: the producer can set filters, and the client can choose to respect the
filters.
note: this is still experimental and in development. don't assume anything will work or that the api will remain the same.
tasks are given a randomly generated ID and written into /dir/tasks/<task>. once the task tree is written, a message
is inserted into /dir/queue.
a worker will try to read the next message from /dir/queue. once it has a task, it will try to create a lock at
/dir/tasks/<task>/claim. once the lock is in place, it starts processing the task.
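roughly, the claim step could look like the sketch below. this assumes the community zookeeper crate and that queue entries are named after their task id; claim_next is illustrative, not zkmq's actual api.

```rust
use zookeeper::{Acl, CreateMode, ZkError, ZooKeeper};

// scan /dir/queue and try to claim the first unclaimed task.
fn claim_next(zk: &ZooKeeper, dir: &str) -> Option<String> {
    let pending = zk.get_children(&format!("{}/queue", dir), false).ok()?;
    for task in pending {
        let claim = format!("{}/tasks/{}/claim", dir, task);
        // an ephemeral node acts as the lock: it disappears when the session
        // ends, so a crashed worker never strands a task permanently.
        match zk.create(&claim, vec![], Acl::open_unsafe().clone(), CreateMode::Ephemeral) {
            Ok(_) => return Some(task),           // lock acquired, start processing
            Err(ZkError::NodeExists) => continue, // claimed by another worker
            Err(_) => continue,                   // e.g. the task was gc'd under us
        }
    }
    None
}
```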
a background thread will perform garbage collection at the interval configured in /dir/conf/task_gc_interval.
an example of filtering: a task has filter performance_class=standard, a node has filter
performance_class=high_performance, and the filter is configured as
OrderedEnum(performance_class)=("low", "standard", "performance", "high_performance").
if filter_method=absolute, only a performance_class=standard node will be chosen (exact match).
if filter_method=at_least, the performance_class=high_performance node can also be selected, since
high_performance is ordered above standard.
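a minimal sketch of that ordering logic; the type and function names here are illustrative, not zkmq's api.

```rust
// derive(PartialOrd, Ord) orders variants by declaration order, matching the
// OrderedEnum configuration above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum PerformanceClass {
    Low,
    Standard,
    Performance,
    HighPerformance,
}

enum FilterMethod {
    Absolute, // node value must equal the task's value exactly
    AtLeast,  // node value must be at least the task's value
}

fn node_matches(task: PerformanceClass, node: PerformanceClass, method: FilterMethod) -> bool {
    match method {
        FilterMethod::Absolute => node == task,
        FilterMethod::AtLeast => node >= task,
    }
}

fn main() {
    let task = PerformanceClass::Standard;
    // absolute: only an exact match is chosen
    assert!(node_matches(task, PerformanceClass::Standard, FilterMethod::Absolute));
    assert!(!node_matches(task, PerformanceClass::HighPerformance, FilterMethod::Absolute));
    // at_least: anything ordered at or above the task's value qualifies
    assert!(node_matches(task, PerformanceClass::HighPerformance, FilterMethod::AtLeast));
}
```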
tree layout.
/dir
    /queue                       Pending Messages
        /<task>
        /<task>
        /<task>
    /tasks
        /<task>
            /filters
                /<filter_name>   => filter value
            /state               => state enum
            /metadata            => json encoded metadata
            /data                => binary data
    /consumers                   Active Consumers
        /<id>
            /current             Current Task
    /conf
        /task_gc_lifetime
        /task_gc_interval
    /lock
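bootstrapping this layout could look something like the sketch below, assuming the community zookeeper crate (its ZooKeeperExt::ensure_path creates each missing path segment as a persistent node). the connect string and the /zkmq chroot are placeholders.

```rust
use std::time::Duration;
use zookeeper::{WatchedEvent, Watcher, ZkResult, ZooKeeper, ZooKeeperExt};

struct NoopWatcher;
impl Watcher for NoopWatcher {
    fn handle(&self, _event: WatchedEvent) {}
}

// create the persistent skeleton; task and consumer nodes appear at runtime.
fn ensure_layout(zk: &ZooKeeper, dir: &str) -> ZkResult<()> {
    for node in [
        "queue",
        "tasks",
        "consumers",
        "conf/task_gc_lifetime",
        "conf/task_gc_interval",
        "lock",
    ] {
        zk.ensure_path(&format!("{}/{}", dir, node))?;
    }
    Ok(())
}

fn main() -> ZkResult<()> {
    let zk = ZooKeeper::connect("127.0.0.1:2181", Duration::from_secs(5), NoopWatcher)?;
    ensure_layout(&zk, "/zkmq")
}
```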
design.
messages.
messages are stored in two parts. first, the message is assigned a uuid and its payload is written to
/<dir>/tasks/<id>/data, then metadata & filters are written under /<dir>/tasks/<id>. once all the data is written, a
message is inserted into /<dir>/queue to represent the pending task (a sketch of this write sequence follows the
list below).
this structure allows for two things.
1. limited tracking of the task state, from inserted, to claimed, to finished.
2. tasks can be filtered based on consumer preferences.
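a sketch of that two-part write, assuming the community zookeeper and uuid crates; the "inserted" state value and the queue entry being named after the task id are assumptions.

```rust
use uuid::Uuid;
use zookeeper::{Acl, CreateMode, ZkResult, ZooKeeper};

// write the full task tree first, then publish it to the queue.
fn produce(zk: &ZooKeeper, dir: &str, data: Vec<u8>, metadata: &str) -> ZkResult<String> {
    let id = Uuid::new_v4().to_string();
    let task = format!("{}/tasks/{}", dir, id);
    let acl = Acl::open_unsafe().clone();
    // part one: the task tree. invisible to consumers until queued.
    zk.create(&task, vec![], acl.clone(), CreateMode::Persistent)?;
    zk.create(&format!("{}/data", task), data, acl.clone(), CreateMode::Persistent)?;
    zk.create(&format!("{}/metadata", task), metadata.as_bytes().to_vec(), acl.clone(), CreateMode::Persistent)?;
    zk.create(&format!("{}/filters", task), vec![], acl.clone(), CreateMode::Persistent)?;
    zk.create(&format!("{}/state", task), b"inserted".to_vec(), acl.clone(), CreateMode::Persistent)?;
    // part two: the queue entry makes the task visible as pending.
    zk.create(&format!("{}/queue/{}", dir, id), vec![], acl, CreateMode::Persistent)?;
    Ok(id)
}
```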
the main drawback is that there needs to be some garbage collection to remove completed tasks. this is configured by
the value in /<dir>/conf/task_gc_lifetime: tasks older than this value will be deleted. the polling interval is
governed by /<dir>/conf/task_gc_interval. every consumer will try and perform this garbage collection (unless
explicitly configured not to).
(TTL could be used to replace this on clusters that have TTL enabled and configured.)
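a sketch of one gc pass under those settings, assuming task_gc_lifetime is stored as milliseconds and that a task's age can be taken from its znode's mtime (both are assumptions):

```rust
use std::time::{SystemTime, UNIX_EPOCH};
use zookeeper::{ZkResult, ZooKeeper};

// one gc pass: delete any task whose znode is older than task_gc_lifetime.
fn gc_pass(zk: &ZooKeeper, dir: &str) -> ZkResult<()> {
    let (raw, _) = zk.get_data(&format!("{}/conf/task_gc_lifetime", dir), false)?;
    // on a bad config value, default to never collecting rather than collecting everything.
    let lifetime_ms: i64 = String::from_utf8_lossy(&raw).trim().parse().unwrap_or(i64::MAX);
    let now_ms = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("clock before epoch")
        .as_millis() as i64;
    for task in zk.get_children(&format!("{}/tasks", dir), false)? {
        let path = format!("{}/tasks/{}", dir, task);
        let (_, stat) = zk.get_data(&path, false)?;
        if now_ms - stat.mtime > lifetime_ms {
            delete_recursive(zk, &path)?;
        }
    }
    Ok(())
}

// zookeeper deletes are not recursive, so remove children first.
fn delete_recursive(zk: &ZooKeeper, path: &str) -> ZkResult<()> {
    for child in zk.get_children(path, false)? {
        delete_recursive(zk, &format!("{}/{}", path, child))?;
    }
    zk.delete(path, None)
}
```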
consumers.
the consumers also perform coordinated maintenance tasks, like cleanup and garbage collection. the reason this is
done on the consumer side is that the consumers are assumed to outlive the producers: the consumers will always be
around to read messages, even if none are being produced.
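that coordination could be as simple as racing for an ephemeral lock under /dir/lock, as in this sketch (the "gc" child name is hypothetical):

```rust
use zookeeper::{Acl, CreateMode, ZkError, ZooKeeper};

// whichever consumer creates the ephemeral lock runs the maintenance pass;
// the rest skip this round.
fn try_maintenance(zk: &ZooKeeper, dir: &str) -> bool {
    let lock = format!("{}/lock/gc", dir);
    match zk.create(&lock, vec![], Acl::open_unsafe().clone(), CreateMode::Ephemeral) {
        Ok(_) => {
            // run the gc pass here, then release the lock.
            let _ = zk.delete(&lock, None);
            true
        }
        Err(ZkError::NodeExists) => false, // another consumer is already on it
        Err(_) => false,
    }
}
```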