# audis

Audis is an implementation of a multi-index audit log,
built atop the Redis key-value store.

An audit log consists of zero or more Event objects,
each indexed against one or more subjects.
The correspondence between Redis databases and audit
logs is 1:1 -- each audit log inhabits precisely one
Redis instance, and an instance can only house a single
audit log.

### Example: Logging Events

The simplest way to use audis is to point it at a Redis
instance and start logging to it:

```rust
extern crate audis;

fn main() {
    let client = audis::Client::connect("redis://127.0.0.1:6379").unwrap();

    client.log(&audis::Event{
        id: "foo1".to_string(),
        data: "{\"some\":\"data\"}".to_string(),
        subjects: vec![
            "system".to_string(),
            "user:42".to_string(),
        ],
    }).unwrap();

    // ... etc ...
}
```

### Retrieving The Audit Log

What good is an audit log if you can't ever review it?

```rust
extern crate audis;

fn main() {
    let client = audis::Client::connect("redis://127.0.0.1:6379").unwrap();

    for subject in &client.subjects().unwrap() {
        println!("## {} ######################", subject);
        for event in &client.retrieve(subject).unwrap() {
            println!("  {}", event.data);
        }
        println!();
    }
}
```

### Distributed Audit Logging via Threads

A common pattern with audis is to dedicate a single thread
to the task of shunting event logs into Redis, allowing other
threads to focus on their work, without being slowed down by
momentary hiccups in the auditing layer.

You can do this via the `background()` function, which returns
a buffered channel you can send events to, and the join handle
of the executing background thread:

```rust
extern crate audis;

fn main() {
    let client = audis::Client::connect("redis://127.0.0.1:6379").unwrap();

    // buffer 50 events
    let (tx, thread) = client.background(50).unwrap();

    tx.send(audis::Event{
        id: "foo1".to_string(),
        data: "{\"some\":\"data\"}".to_string(),
        subjects: vec![
            "system".to_string(),
            "user:42".to_string(),
        ],
    }).unwrap();

    // ... etc ...

    thread.join().unwrap();
}
```
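
To make the pattern concrete, here is a minimal sketch of a
background logger built only from the standard library: a
bounded `sync_channel` plus a worker thread.  The `spawn_logger`
helper, the `sink` callback (standing in for "write this event
to Redis"), and the local `Event` struct are assumptions made for
illustration; the real `background()` may be implemented
differently.

```rust
use std::sync::mpsc::{sync_channel, SyncSender};
use std::thread::{self, JoinHandle};

/// Stand-in for `audis::Event`, with the three fields used in this README.
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub id: String,
    pub data: String,
    pub subjects: Vec<String>,
}

/// Hypothetical helper: spawn a worker that drains a bounded channel.
/// Returns the sending half and a handle that yields the number of
/// events the worker processed.
pub fn spawn_logger<F>(buffer: usize, mut sink: F) -> (SyncSender<Event>, JoinHandle<usize>)
where
    F: FnMut(Event) + Send + 'static,
{
    // Senders block once `buffer` events are waiting to be handled.
    let (tx, rx) = sync_channel::<Event>(buffer);
    let handle = thread::spawn(move || {
        let mut handled = 0;
        // The loop ends once every sender has been dropped.
        for event in rx {
            sink(event);
            handled += 1;
        }
        handled
    });
    (tx, handle)
}
```

In this sketch the worker only exits after every sender has been
dropped, so a caller would `drop(tx)` before calling `join()`.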

### Implementation Details

Audis uses four (4) types of objects: A) the events
themselves, B) reference counts for inserted events,
C) per-subject lists of events, kept in insertion order,
and D) a master list of all known subjects.

Each audit event is stored as an opaque blob, usually
JSON -- for the purposes of this library, the exact contents
of events are immaterial.  Each event gets its own Redis key,
derived from its globally unique ID, in the form `audit:$id`.
This makes retrieving events an _O(1)_ operation, using the
Redis `GET` command.

Accompanying each event object is a reference count, kept
under a parallel keying structure that appends `:ref` to the
main event object key.  These reference counts are integers
that track how many different subjects are currently still
referencing the given event.  For example, `audit:ae2:ref`
is the reference count key for `audit:ae2`.
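
The key scheme described above is small enough to write down
directly.  The helper names `event_key` and `ref_key` are
illustrative and not part of the audis API:

```rust
/// Key under which an event's opaque data blob lives, e.g. `audit:ae2`.
pub fn event_key(id: &str) -> String {
    format!("audit:{}", id)
}

/// Key holding the event's reference count, e.g. `audit:ae2:ref`.
pub fn ref_key(id: &str) -> String {
    format!("{}:ref", event_key(id))
}
```

One consequence of this layout is that an event ID ending in
`:ref` would share a key with another event's reference count,
so such IDs are probably best avoided.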

Each subject in the audit log maintains its own list of
event IDs that are relevant to it.  These lists are stored
under keys derived from the subject itself.  Callers are
strongly urged to ensure that subject names are as unique
as they need to be for analysis.

Finally, a single Redis Set, called `subjects`, exists to
track the complete set of known subject strings.  This
facilitates discovery of the different subsets of the audit
log.

Here is some pseudocode for the insertion logic of the
`LOG(e)` operation, where `e` is an object:

```redis-pseudo-code
LOG(e):
    var id = $e[id]
    SET "audit:$id" $e[data]
    for s in $e[subjects]:
        SADD "subjects" "$s"
        RPUSH "$s" "$id"
        INCR "audit:$id:ref"
```

Technically speaking, `LOG(e)` runs in _O(n)_ time, linear
in the number of subjects that the audit log event applies
to.  However, given that this `n` is usually very small
(almost always < 100), `LOG(e)` performs well.
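
For readers who prefer Rust to pseudocode, the insertion logic
can be modelled in memory.  This is only a sketch: the `Store`
struct and its fields are hypothetical stand-ins for the Redis
keyspace, not how audis talks to Redis.

```rust
use std::collections::{BTreeSet, HashMap};

/// In-memory stand-in for the Redis keyspace used by audis.
#[derive(Default)]
pub struct Store {
    /// "audit:$id" -> opaque data blob
    pub events: HashMap<String, String>,
    /// "audit:$id:ref" -> reference count
    pub refs: HashMap<String, i64>,
    /// "$s" -> event IDs, oldest first
    pub lists: HashMap<String, Vec<String>>,
    /// "subjects" -> every known subject name
    pub subjects: BTreeSet<String>,
}

impl Store {
    /// Mirrors LOG(e): store the blob once, then index it under each subject.
    pub fn log(&mut self, id: &str, data: &str, subjects: &[&str]) {
        // Store the blob under the event key.
        self.events.insert(format!("audit:{}", id), data.to_string());
        for &s in subjects {
            // SADD "subjects" "$s"
            self.subjects.insert(s.to_string());
            // RPUSH "$s" "$id"
            self.lists.entry(s.to_string()).or_default().push(id.to_string());
            // INCR "audit:$id:ref"
            *self.refs.entry(format!("audit:{}:ref", id)).or_insert(0) += 1;
        }
    }
}
```

The loop body does three constant-time updates per subject, which
is where the linear cost in the number of subjects comes from.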

`RETR(s)` is straightforward: iterate over the subject list
in Redis via `LRANGE` and then `GET` the referenced event
objects:

```redis-pseudo-code
RETR(s):
    var log = []
    for id in LRANGE "$s" 0 -1:
        $log.append( GET "audit:$id" )
    return $log
```
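
The same lookup can be sketched over plain in-memory maps.  The
`retrieve` function and its map parameters are illustrative
stand-ins for the Redis keyspace; skipping IDs whose blob is
missing is an assumption of this sketch, since the pseudocode
does not say what happens in that case.

```rust
use std::collections::HashMap;

/// Mirrors RETR(s): walk the subject's ID list in insertion order
/// (LRANGE "$s" 0 -1) and look up each blob (GET "audit:$id").
/// IDs whose blob is missing are skipped.
pub fn retrieve(
    lists: &HashMap<String, Vec<String>>,
    events: &HashMap<String, String>,
    subject: &str,
) -> Vec<String> {
    lists
        .get(subject)
        .map(|ids| {
            ids.iter()
                .filter_map(|id| events.get(&format!("audit:{}", id)).cloned())
                .collect()
        })
        .unwrap_or_default()
}
```

An unknown subject simply yields an empty log here, and the cost
is one list read plus one lookup per event ID.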

Since `LOG(e)` only ever adds to our audit log dataset,
and `RETR(s)` is a read-only operation, our Redis footprint
will forever grow, unless we define operations to clear out
old log entries.  Deleting parts of our audit log seems
wrong and counter-productive -- the whole point is to know
what happened!  Storage (especially memory), however, isn't
unlimited, and for debugging purposes (at least), audit log
events become less relevant over time.

For these reasons, we define two pruning operations:
`TRUNC(s,n)`, for truncating a subject eventset to the most
recent `n` audit events, and `PURGE(s,last)`, for deleting
events from the oldest end of a subject eventset, up to and
including a given ID.

These two operations allow us to define a cleanup routine,
outside of the audis library, which can do things like
render and persist audit events to a log file in a filesystem
or external blobstore (e.g. S3), before ultimately deleting
them from Redis.

Here is the pseudo-code for `TRUNC(s,n)`:

```redis-pseudo-code
TRUNC(s,n):
    var end = 0 - n - 1
    for id in LRANGE "$s" 0 $end:
        LPOP "$s"
        DECR "audit:$id:ref"
        if GET "audit:$id:ref" <= 0:
            DEL "audit:$id:ref"
            DEL "audit:$id"
```

As events are truncated from the subject's index, the
associated reference counts are checked to determine if any
larger cleanup (via `DEL`) needs to be performed.
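
As a sketch of the reference-counting behaviour, here is the same
truncation over in-memory maps.  The `truncate` function and its
map parameters are hypothetical stand-ins for the Redis keyspace,
not the library's implementation.

```rust
use std::collections::HashMap;

/// Mirrors TRUNC(s,n): keep only the newest `n` IDs in the subject's
/// list and release one reference for every ID that was dropped.
pub fn truncate(
    lists: &mut HashMap<String, Vec<String>>,
    refs: &mut HashMap<String, i64>,
    events: &mut HashMap<String, String>,
    subject: &str,
    n: usize,
) {
    let list = match lists.get_mut(subject) {
        Some(list) => list,
        None => return,
    };
    let excess = list.len().saturating_sub(n);
    // drain(..excess) removes the oldest entries, like repeated LPOP.
    for id in list.drain(..excess) {
        let ref_key = format!("audit:{}:ref", id);
        // DECR "audit:$id:ref"
        let remaining = match refs.get_mut(&ref_key) {
            Some(count) => {
                *count -= 1;
                *count
            }
            None => 0,
        };
        // No subject references the event any more: delete both keys.
        if remaining <= 0 {
            refs.remove(&ref_key);
            events.remove(&format!("audit:{}", id));
        }
    }
}
```

An event that is still indexed under another subject keeps its
blob; only its reference count drops.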

`PURGE(s,last)` is similar:

```redis-pseudo-code
PURGE(s,last):
    for id in LRANGE "$s" 0 -1:
        LPOP "$s"
        DECR "audit:$id:ref"
        if GET "audit:$id:ref" <= 0:
            DEL "audit:$id:ref"
            DEL "audit:$id"
        if $id == $last:
            break
```
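
An in-memory sketch of the same idea follows.  The `purge`
function, its map parameters, and its return value (the number of
entries removed) are assumptions made for illustration.  Like the
pseudocode, it empties the whole list when `last` never appears.

```rust
use std::collections::HashMap;

/// Mirrors PURGE(s,last): remove IDs from the head of the subject's list
/// until `last` has been removed.  Returns how many IDs were removed.
pub fn purge(
    lists: &mut HashMap<String, Vec<String>>,
    refs: &mut HashMap<String, i64>,
    events: &mut HashMap<String, String>,
    subject: &str,
    last: &str,
) -> usize {
    let list = match lists.get_mut(subject) {
        Some(list) => list,
        None => return 0,
    };
    // Drop everything up to and including `last`, or the whole list
    // if `last` is not present.
    let count = list
        .iter()
        .position(|id| id == last)
        .map_or(list.len(), |i| i + 1);
    for id in list.drain(..count) {
        let ref_key = format!("audit:{}:ref", id);
        // DECR "audit:$id:ref"
        let remaining = match refs.get_mut(&ref_key) {
            Some(n) => {
                *n -= 1;
                *n
            }
            None => 0,
        };
        if remaining <= 0 {
            refs.remove(&ref_key);
            events.remove(&format!("audit:{}", id));
        }
    }
    count
}
```

A cleanup routine that wants to avoid the "empties everything"
case could check that `last` is present in the list before purging.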

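Neither of these operations is safe to run concurrently
with the other, or with another call to itself: each one
reads the list with `LRANGE` and then pops entries one at a
time, so overlapping calls can pop more entries than intended
and leave reference counts out of step with the lists.  A
future version of this library will correct this, by the
judicious use of `LOCK()`/`UNLOCK()` primitives implemented
inside of the same Redis database.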