tailsrv

Crates.io: tailsrv
lib.rs: tailsrv
version: 0.9.2
source: src
created_at: 2022-01-25 08:53:34.91873
updated_at: 2024-12-13 08:35:47.282993
description: A high-performance file-streaming server
repository: https://github.com/asayers/tailsrv
id: 520667
size: 54,628
CI (github:colearn-dev:ci)


tailsrv

tailsrv watches a single file and streams its contents to multiple clients as it grows. It's like tail -f, but as a server.

  • When a client connects, tailsrv sends it the current contents of the file.
  • When the file grows, tailsrv sends the new data to all clients.
  • Clients can specify a byte-offset when they connect; tailsrv will not send data before that position in the file.
  • If a client's socket is full, tailsrv waits for the client to consume some data before sending more. Other clients are not affected.

tailsrv is low-latency, high-throughput, and consumes minimal system resources. It requires Linux >=5.7.

Some implementation details:

  • All data goes directly from the page cache to the network card and is never copied into userspace. This is done using the splice() syscall. (Effectively we're doing sendfile(), but I have to emulate it since io_uring doesn't support sendfile yet.)
  • We use inotify to track modifications to the file. This means that, when things are calm, the tailsrv process can go to sleep, and will be woken up by the kernel when the file grows (or a new client connects).
  • The I/O is dispatched using io_uring. This means that the number of threads required doesn't depend on the number of clients. Thousands of clients can connect simultaneously without slowing down the system.

If you're interested in how tailsrv compares to Kafka, see here for a comparison.

Usage example

Let's say you have a machine called webserver. Pick a port number and start tailsrv:

$ tailsrv -p 4321 /var/log/nginx/access.log

tailsrv is now watching access.log. You can connect to tailsrv from your laptop and stream the contents of the file:

$ echo "1000" | nc webserver 4321

You will immediately see the contents of access.log, starting from byte 1000, up to the end of the file. The connection remains open, waiting for new data. As soon as nginx writes a line to access.log, it will appear on your laptop. It's more-or-less the same as if you did this:

$ ssh webserver -- tail -f -c+1000 /var/log/nginx/access.log

Rather than using netcat, however, you probably want to connect to tailsrv directly from your log-consuming application.

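use std::io::{BufRead, BufReader, Write};
use std::net::TcpStream;

let mut sock = TcpStream::connect("webserver:4321")?;
writeln!(sock, "{}", 1000)?; // header: start from byte 1000
for line in BufReader::new(sock).lines() {
    let line = line?; // each item is an io::Result<String>
    /* handle log data */
}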

The example above is written in Rust, but as you can see it's very straightforward: you can do this from any programming language without the need for a special client library.

Protocol

Step 1: the client sends a header to tailsrv

The header is just an integer, in ASCII, terminated with a newline. If the integer is zero or positive, it is the initial byte offset. If the integer is negative, it is counted back from the end of the file. Examples:

  • 0\n - start from the beginning of the file
  • 1000\n - start from byte 1000
  • -1000\n - send the last 1000 bytes
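As a rough illustration of the header rules above, here is a minimal sketch of how a header line could be turned into an absolute byte offset. The function name resolve_offset is hypothetical and this is not tailsrv's actual code. Clamping an over-large negative offset to the start of the file is an assumption, and the sketch does not say what happens when a positive offset lies beyond the end of the file.

```rust
/// Resolve a header line to an absolute byte offset in a file of `file_len` bytes.
/// Hypothetical helper for illustration; not taken from tailsrv itself.
fn resolve_offset(header: &str, file_len: u64) -> Option<u64> {
    // The header is an ASCII integer terminated by a newline.
    let n: i64 = header.strip_suffix('\n')?.parse().ok()?;
    if n >= 0 {
        // Zero or positive: the initial byte offset, used as-is.
        Some(n as u64)
    } else {
        // Negative: count back from the end of the file.
        // Assumption: clamp to the start of the file if |n| exceeds its length.
        Some(file_len.saturating_sub(n.unsigned_abs()))
    }
}
```

For a 5000-byte file, the three example headers above resolve to offsets 0, 1000 and 4000 respectively.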

Step 2: tailsrv sends data to the client

Once it receives a header, tailsrv will start sending you file data.

...and that's it as far as the protocol goes. tailsrv will ignore everything you send to it after the newline. When you're done, just close the connection. tailsrv will not terminate the connection unless it is shutting down.

There's no in-band session control: if you want to seek to a different position in the file, close the connection and open a new one.
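Because seeking means reconnecting, a client that wants to survive dropped connections can keep count of the bytes it has received and send that count as the header next time. The following is a minimal sketch of that idea using only the standard library. The function name stream_from and the 8 KiB buffer size are assumptions made for illustration, not part of tailsrv.

```rust
use std::io::{self, Read, Write};
use std::net::TcpStream;

/// Connect to `addr`, request data starting at `*offset`, and pass each
/// received chunk to `on_data`. `*offset` is advanced as data arrives, so
/// after a clean close or an error the caller can reconnect from where it
/// left off. Hypothetical helper, not part of tailsrv.
fn stream_from(
    addr: &str,
    offset: &mut u64,
    mut on_data: impl FnMut(&[u8]),
) -> io::Result<()> {
    let mut sock = TcpStream::connect(addr)?;
    // Header: the starting byte offset in ASCII, terminated by a newline.
    writeln!(sock, "{}", *offset)?;
    let mut buf = [0u8; 8192];
    loop {
        match sock.read(&mut buf) {
            // A zero-length read means the server closed the connection.
            Ok(0) => return Ok(()),
            Ok(n) => {
                on_data(&buf[..n]);
                *offset += n as u64;
            }
            // Retry reads that were interrupted by a signal.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```

A caller would typically keep `offset` in a variable outside a retry loop and call stream_from("webserver:4321", &mut offset, ...) again whenever it returns.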

The file

tailsrv expects a file which will be appended to. If the watched file is deleted or moved, tailsrv will exit. If you modify the middle of the file - well, nothing disastrous will happen, but your clients might get confused.

Features

tracing-journald

Enables a dependency on the tracing-journald crate and adds a new --journald command-line flag. This flag redirects all tracing output to the system journal (journald), which records much richer information than the default output formatter. Especially useful if you're planning to run tailsrv as a systemd service.

sd-notify

Enables a dependency on the sd-notify crate. tailsrv will send a systemd readiness notification once it starts accepting connections from clients. This is useful in combination with the systemd service type "notify".

Licence

This software is in the public domain. See UNLICENSE for details.
