Crates.io | drainrs |
lib.rs | drainrs |
version | 0.1.0 |
source | src |
created_at | 2023-02-12 19:03:03.299078 |
updated_at | 2023-02-12 19:03:03.299078 |
description | An implementation of the drain logparsing algorithm |
homepage | https://github.com/haydenflinner/drainrs |
repository | https://github.com/haydenflinner/drainrs |
id | 783397 |
size | 49,385 |
drainrs implements the Drain algorithm for automatic log parsing.
cargo run ./apache-short.log | tail
{"template":"[Sat Jun <*> <*> <*> [error] [client <*> script not found or unable to stat: /var/www/cgi-bin/awstats",
"values":["11","03:03:04","2005]","202.133.98.6]"]}
{"template":"[Sat Jun <*> <*> <*> [error] [client <*> script not found or unable to stat: /var/www/cgi-bin/awstats",
"values":["11","03:03:04","2005]","202.133.98.6]"]}
{"template":"[Sat Jun <*> <*> <*> [error] [client <*> script not found or unable to stat: /var/www/cgi-bin/awstats",
"values":["11","03:03:04","2005]","202.133.98.6]"]}
{"template":"[Sat Jun <*> <*> <*> [error] <*> Can't find child <*> in scoreboard",
"values":["11","03:03:04","2005]","jk2_init()","4210"]}
{"template":"[Sat Jun <*> <*> <*> [notice] workerEnv.init() ok <*>",
"values":["11","03:03:04","2005]","/etc/httpd/conf/workers2.properties"]}
{"template":"[Sat Jun <*> <*> <*> [error] mod_jk child init <*> <*>",
"values":["11","03:03:04","2005]","1","-2"]}
{"template":"[Sat Jun <*> <*> <*> [error] [client <*> script not found or unable to stat: /var/www/cgi-bin/awstats",
"values":["11","03:03:04","2005]","202.133.98.6]"]}
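Each output line pairs a template with the values that were masked out of it. As a rough illustration of how the two relate, the sketch below substitutes the values back into the template. It assumes that each `<*>` stands for exactly one whitespace-delimited token and that tokens were separated by single spaces; `fill_template` is a hypothetical helper written for this example, not part of the drainrs API.

```rust
/// Hypothetical helper (not part of drainrs): rebuilds a log line by replacing
/// each `<*>` token in `template` with the next entry of `values`.
/// Returns None if the number of wildcards and values differ.
fn fill_template(template: &str, values: &[&str]) -> Option<String> {
    let mut remaining = values.iter();
    let mut tokens = Vec::new();
    for token in template.split_whitespace() {
        if token == "<*>" {
            // Ran out of values: the template has more wildcards than values.
            tokens.push(*remaining.next()?);
        } else {
            tokens.push(token);
        }
    }
    // Leftover values mean the template had too few wildcards.
    if remaining.next().is_some() {
        return None;
    }
    Some(tokens.join(" "))
}

fn main() {
    let template = "[Sat Jun <*> <*> <*> [error] mod_jk child init <*> <*>";
    let values = ["11", "03:03:04", "2005]", "1", "-2"];
    // Prints: Some("[Sat Jun 11 03:03:04 2005] [error] mod_jk child init 1 -2")
    println!("{:?}", fill_template(template, &values));
}
```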
A log record is an entry in a text file, typically a single line, though it does not have to be.
E.g. [Thu Jun 09 06:07:05 2005] [notice] Digest logline here: done
A log template is the string template that was used to format the log record.
For example, in Python format syntax, that would look like this:
"[{date}] [{log_level}] Digest logline here: {status}".format(...)
Or in the syntax output by drain.py and drain3.py:
"[<date>] [<log_level>] Digest logline here: <*>"
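To make the record/template relationship concrete, here is a minimal sketch that checks whether a record fits a template and, if so, returns the tokens hidden behind each `<*>`. It assumes whitespace tokenization with one token per wildcard, which is what the sample output above suggests; `extract_values` is a hypothetical name for this example and is not the drainrs API, and the sketch does not attempt to discover templates the way Drain does.

```rust
/// Hypothetical helper (not part of drainrs): matches a whitespace-tokenized
/// record against a template in which `<*>` stands for exactly one token.
/// Returns the wildcard values on a match, or None if the record does not fit.
fn extract_values<'a>(template: &str, record: &'a str) -> Option<Vec<&'a str>> {
    let template_tokens: Vec<&str> = template.split_whitespace().collect();
    let record_tokens: Vec<&'a str> = record.split_whitespace().collect();
    if template_tokens.len() != record_tokens.len() {
        return None;
    }
    let mut values = Vec::new();
    for (t, r) in template_tokens.iter().zip(record_tokens.iter()) {
        if *t == "<*>" {
            values.push(*r); // wildcard position: capture the record token
        } else if t != r {
            return None; // constant position: tokens must be identical
        }
    }
    Some(values)
}

fn main() {
    let template = "[Sat Jun <*> <*> <*> [notice] workerEnv.init() ok <*>";
    let record = "[Sat Jun 11 03:03:04 2005] [notice] workerEnv.init() ok /etc/httpd/conf/workers2.properties";
    // Prints: Some(["11", "03:03:04", "2005]", "/etc/httpd/conf/workers2.properties"])
    println!("{:?}", extract_values(template, record));
}
```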
None of the parameters that are configurable in the Python version are yet configurable here.
The first drain allowed split_line_provided, which let you write a simple token-mapper like this:
<timestamp> <loglevel> <content>
And then drain would only apply its logic to <content>.
Drain3 appears to have dropped this in favor of preprocessing on the user-code side, which is fair enough, although the feature is very helpful from a CLI/no-coding perspective.
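Until something like that exists here, preprocessing on the user-code side might look like the sketch below. It assumes Apache-style lines of the form `[timestamp] [loglevel] content`, as in the sample log above; `split_line` is a hypothetical helper for this example, not a drainrs function. Only the returned content would then be handed to the parser, which would keep the date and time tokens from showing up as `<*>` values.

```rust
/// Hypothetical preprocessing helper (not part of drainrs): splits a line of the
/// form `[timestamp] [loglevel] content` into its three parts.
/// Returns None if the line does not have that shape.
fn split_line(line: &str) -> Option<(&str, &str, &str)> {
    let rest = line.strip_prefix('[')?;
    // The first "] [" separates the timestamp from the log level.
    let (timestamp, rest) = rest.split_once("] [")?;
    // The next "] " closes the log level; everything after it is content.
    let (loglevel, content) = rest.split_once("] ")?;
    Some((timestamp, loglevel, content))
}

fn main() {
    let line = "[Thu Jun 09 06:07:05 2005] [notice] Digest logline here: done";
    if let Some((timestamp, loglevel, content)) = split_line(line) {
        println!("timestamp = {timestamp}");
        println!("loglevel  = {loglevel}");
        println!("content   = {content}");
    }
}
```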