tailsrv is a high-performance file-streaming server. It's like tail -f in
server form. It has high throughput, low latency, and scales to lots of
clients (see Performance). Setup is very simple, and clients
don't require a special library. It is, however, Linux-only (see
Limitations).
Here's how it works in a nutshell:
Compared to a simple TCP connection between your producer and consumer, a tailsrv instance in the middle can be used to provide:
For a quick-start, see the example usage.
Clients open a TCP connection and send a header. This header should be a single signed integer, formatted as a decimal string, UTF-8-encoded, and terminated with a newline. The integer is the byte offset at which to start. A negative value is interpreted as counting back from the end of the file. Examples:

- "0\n" - start from the beginning of the file
- "1000\n" - start from byte 1000
- "-1000\n" - send the last 1000 bytes
After sending a header, the client should start reading data from the socket. At all times, communication is one-way: first the client sends a header, then tailsrv sends some data. tailsrv will send nothing until a newline is received, and once a newline has been received it will ignore anything sent by the client. The client may unceremoniously hang up at any time.
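The header rules above can be sketched as a small function that turns a header line into an absolute byte offset. This is a minimal sketch, not tailsrv's actual code: `resolve_offset` is a hypothetical name, and both clamping an over-large negative value to the start of the file and rejecting malformed headers are assumptions, since the behaviour in those cases is not specified here.

```rust
/// Hypothetical helper: turn a header line into an absolute byte offset.
/// Returns None if the header is not a newline-terminated decimal integer.
fn resolve_offset(header: &str, file_len: u64) -> Option<u64> {
    // The header is a signed decimal integer terminated by a newline.
    let n: i64 = header.strip_suffix('\n')?.parse().ok()?;
    if n >= 0 {
        Some(n as u64)
    } else {
        // Negative: count back from the end of the file.
        // Clamping at 0 is an assumption made for this sketch.
        Some(file_len.saturating_sub(n.unsigned_abs()))
    }
}
```

For a 5000-byte file, `resolve_offset("-1000\n", 5000)` gives `Some(4000)`, i.e. the start of the last 1000 bytes.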
tailsrv will not terminate the connection for any reason, unless it is shutting down. If the watched file is deleted or moved, tailsrv will exit.
We use inotify to track modifications to files. This allows us to avoid the latency associated with polling. It also means that watches of quiescent files don't have any performance cost.
We use epoll to track whether clients are writable. This means that a slow client can receive data at its own pace, but it won't block other clients (even though tailsrv uses only a single thread for sending data).
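The idea of letting each client progress at its own pace can be illustrated with non-blocking sockets from the Rust standard library. This is a rough sketch under stated assumptions, not how tailsrv is implemented: `Client` and `pump` are hypothetical names, the data is sent with a plain `write` from a buffer rather than with sendfile, and the epoll wait that would say when to call `pump` again is left out.

```rust
use std::io::{self, ErrorKind, Write};
use std::net::TcpStream;

/// Hypothetical per-client state: the socket plus how far this client has got.
struct Client {
    sock: TcpStream, // assumed to already be in non-blocking mode
    cursor: usize,
}

/// Push as much of `data` as the socket accepts right now, then return.
/// Returns Ok(true) once the client has caught up with `data`.
fn pump(client: &mut Client, data: &[u8]) -> io::Result<bool> {
    while client.cursor < data.len() {
        match client.sock.write(&data[client.cursor..]) {
            Ok(0) => return Err(ErrorKind::WriteZero.into()),
            Ok(n) => client.cursor += n,
            // Socket buffer is full: stop here so other clients can be served.
            Err(e) if e.kind() == ErrorKind::WouldBlock => return Ok(false),
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(true)
}
```

Because `pump` returns as soon as the socket would block, a single thread can call it for each client in turn, and a client that is not reading only delays itself.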
The use of sendfile means that all data is sent by the kernel directly from the pagecache to the network card. No data is ever copied into userspace. This gives tailsrv really good throughput.
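A rough way to try kernel-side copying from Rust without any extra crates is `std::io::copy`. As far as the standard library documentation describes, on Linux it may hand a file-to-socket transfer to sendfile(2) or a related system call when given a `File` and a `TcpStream`. The sketch below is an illustration only: `send_from` is a hypothetical name, it stops at the current end of the file instead of following new data, and it is not tailsrv's own send loop.

```rust
use std::fs::File;
use std::io::{self, Seek, SeekFrom};
use std::net::TcpStream;
use std::path::Path;

/// Hypothetical helper: send everything from `offset` up to the current end
/// of the file, returning the number of bytes sent.
fn send_from(path: &Path, offset: u64, sock: &mut TcpStream) -> io::Result<u64> {
    let mut file = File::open(path)?;
    file.seek(SeekFrom::Start(offset))?;
    // With a File source and a TcpStream sink, io::copy can let the kernel
    // move the bytes on Linux instead of looping over a userspace buffer.
    io::copy(&mut file, sock)
}
```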
Clients can read data as fast or as slow as they please, without affecting each other. Some fairness properties are guaranteed (TODO: document these).
TODO: Benchmarks
tailsrv is Linux-only, due to its use of sendfile.
tailsrv uses an inotify watch for each file. This puts an upper limit on the
number of watched files: see /proc/sys/fs/inotify/max_user_watches (the
default is 64k). If two clients watch the same file, only one watch is used.
When all clients for a file disconnect, the watch is removed.
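The watch-sharing rule amounts to reference counting per file. The following sketch shows only that bookkeeping, with no actual inotify calls. `Watches`, `connect` and `disconnect` are hypothetical names chosen for illustration, and the real data structures in tailsrv may well differ.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Hypothetical bookkeeping: one entry per watched file, counting its clients.
#[derive(Default)]
struct Watches {
    clients: HashMap<PathBuf, usize>,
}

impl Watches {
    /// Returns true for the first client of a file, i.e. a new watch is needed.
    fn connect(&mut self, path: &Path) -> bool {
        let n = self.clients.entry(path.to_path_buf()).or_insert(0);
        *n += 1;
        *n == 1
    }

    /// Returns true when the last client has left, i.e. the watch can be removed.
    fn disconnect(&mut self, path: &Path) -> bool {
        let n = match self.clients.get_mut(path) {
            Some(n) => n,
            None => return false, // unknown file: nothing to do
        };
        *n -= 1;
        if *n == 0 {
            self.clients.remove(path);
            true
        } else {
            false
        }
    }
}
```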
The server operator must ensure that all watched files are append-only. tailsrv won't crash if you modify the middle of a file, but any expectations about log replayability your clients may have will be broken.
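One way for a producer running on the same machine to stay append-only is to open the file in append mode, so that every write lands at the current end of the file. This is a minimal sketch with a hypothetical helper name, `append_record`. It does not stop some other process from rewriting the file, which remains the operator's responsibility.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical producer helper: add one record to the end of the log file.
fn append_record(path: &Path, record: &[u8]) -> io::Result<()> {
    // Append mode means writes go to the end of the file, never the middle.
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    file.write_all(record)
}
```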
tailsrv opts for extremely simple interfaces for both producers and clients; it also makes operations very simple for users who don't have complicated requirements. It therefore lacks some features you might expect if you're coming from, e.g., Kafka.
Non-features:
Limitations of the design:
Perhaps you want to write all your log-structured data to one place, and then consume it elsewhere? This Kafka-style approach to data processing has become popular lately, and tailsrv can function as a component in such a setup.
tailsrv will allow consumers to connect to your log server, but it doesn't help you get data onto it in the first place - for this task you'll need to use something else. Here are some ideas:
- producerprog | ssh logserver "cat >> logfile"?
- rsync --append?
- nc producerserver 5432 >> logfile on the logserver?

Let's say the machine is called logserver. Pick a port number and start tailsrv:
```console
$ tailsrv -p 4321 /var/log/nginx/access.log
```
Now that tailsrv is running, the following commands will do roughly the same thing:
```console
$ ssh logserver -- tail -f -c +1001 /var/log/nginx/access.log
$ echo "1000" | nc logserver 4321
```
Rather than using netcat, however, you probably want to connect to tailsrv directly from your log-consuming application. This is very easy:
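```rust
use std::io::{BufRead, BufReader, Write};
use std::net::TcpStream;

fn main() -> std::io::Result<()> {
    let mut sock = TcpStream::connect("logserver:4321")?;
    writeln!(sock, "{}", 1000)?; // header: start at byte offset 1000
    for line in BufReader::new(sock).lines() {
        let _line = line?; /* handle log data */
    }
    Ok(())
}
```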
The example above is written in Rust, but you can connect to tailsrv from any programming language without the need for a special client library.
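Because the header is a byte offset, a consumer that counts the bytes it has handled can reconnect later and ask for exactly the next unread byte. The sketch below illustrates this under assumptions: `follow` is a hypothetical helper, it works on any stream that implements `Read` and `Write` (a `TcpStream` in practice), and it only handles non-negative starting offsets.

```rust
use std::io::{self, ErrorKind, Read, Write};

/// Hypothetical helper: stream from `*offset` and keep `*offset` up to date,
/// so a later reconnect can resume at the next unread byte.
fn follow<S: Read + Write>(
    stream: &mut S,
    offset: &mut u64,
    mut handle: impl FnMut(&[u8]),
) -> io::Result<()> {
    // The header is the only thing the client ever sends.
    writeln!(stream, "{}", offset)?;
    let mut buf = [0u8; 8192];
    loop {
        match stream.read(&mut buf) {
            Ok(0) => return Ok(()), // the connection was closed
            Ok(n) => {
                handle(&buf[..n]);
                *offset += n as u64;
            }
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```

If `follow` returns, whether cleanly or with an error, `*offset` still holds the position of the first byte that was not handled, so the caller can open a new connection and call `follow` again with the same variable.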
This software is in the public domain. See UNLICENSE for details.