ratatoskr-cli

version: 0.3.1
description: Trace-first, deterministic execution for language model workflows
repository: https://github.com/andrewrgarcia/ratatoskr
owner: Andrew (andrewrgarcia)


README


RATATOSKR

Local-first execution layer for language models, with explicit memory and verifiable provenance enforced mechanically.


Overview

RATATOSKR executes language models locally as pure, replaceable functions.

Everything else—tasks, documents, conversations, prompts, outputs—exists as explicit, frozen material on disk. Nothing is inferred. Nothing is hidden. Nothing persists outside the trace.

Each execution produces a complete, inspectable record: what information was available, how it was assembled, what the model generated, and which material atoms were cited.

If an output cannot be grounded in declared material, execution fails.


Execution Model

  1. Intent frozen
    Task specification (task.yaml) defines question, scope, engine, and memory references.

  2. Memory explicit
    Conversations and documents resolve to immutable material atoms (e.g. FUR messages, document excerpts).

  3. Context assembled deterministically
    Prompts are constructed in a fully specified, reproducible order.

  4. Model executes
    The language model is treated as an opaque executor: input → output.

  5. Provenance enforced
    Outputs must cite material atoms. Invalid or missing citations fail execution.

  6. Trace finalized
    Inputs, outputs, validation results, and metadata are persisted as durable artifacts.

No hidden state.
If it is not on disk, it did not happen.


Core Principle

Every answer must prove where it came from.
Enforced mechanically, not by convention.
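
For illustration (the [FUR:...] citation syntax is inferred from the test prompt later in this README; the message id is hypothetical):

Cited:   "Retries were capped at 5 [FUR:msg-0042]."  → cites a frozen material atom, run succeeds
Uncited: "Retries are capped at 5."                  → no material atom cited, run fails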


Local-First Execution

RATATOSKR is built for offline and air-gapped environments.

  • Models are downloaded once and run locally
  • No network access is required for execution
  • No telemetry, background calls, or hidden dependencies
  • Engines are invoked explicitly by path

Local-first execution is not an optimization or deployment choice. It is a hard architectural constraint.


Local Inference Setup (llama.cpp + Hugging Face)

RATATOSKR requires a local inference engine and a local model file. It does not download, install, or manage either.

The reference local-first setup uses:

  • llama.cpp — inference engine
  • GGUF model files — downloaded from Hugging Face

1. Install llama.cpp (Linux)

Download a prebuilt llama.cpp release:

https://github.com/ggerganov/llama.cpp/releases

Extract the archive and move it to a stable location:

mkdir -p ~/engines
mv ~/Downloads/llama-b7684 ~/engines/llama.cpp

2. Create a local wrapper (required)

llama.cpp release builds ship shared libraries alongside the binaries. To avoid installing them system-wide, create a local wrapper that points the loader at the bundled copies.

cd ~/engines/llama.cpp
nano llama

Paste:

#!/usr/bin/env bash
# Resolve the directory this wrapper lives in, so it works from any working directory.
DIR="$(cd "$(dirname "$0")" && pwd)"
# Point the dynamic loader at the bundled shared libraries instead of installing them system-wide.
export LD_LIBRARY_PATH="$DIR"
# Hand off to the real binary, forwarding all arguments.
exec "$DIR/llama-cli" "$@"

Make it executable:

chmod +x llama

Verify:

~/engines/llama.cpp/llama --help

3. Download a GGUF model (Hugging Face)

Models are not included. Download one GGUF file explicitly.

Recommended (balanced quality, instruction-tuned):

mkdir -p ~/models
cd ~/models

wget https://huggingface.co/TheBloke/CapybaraHermes-2.5-Mistral-7B-GGUF/resolve/main/capybarahermes-2.5-mistral-7b.Q4_K_M.gguf

Verify:

ls -lh ~/models
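
Optionally, verify the download's integrity. Hugging Face shows a SHA-256 checksum on each file's listing page; compare it against a local hash:

sha256sum ~/models/capybarahermes-2.5-mistral-7b.Q4_K_M.gguf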

4. Test the engine directly

~/engines/llama.cpp/llama \
  -m ~/models/capybarahermes-2.5-mistral-7b.Q4_K_M.gguf \
  -p "Test output [FUR:test]." \
  --n-predict 128

If text prints, local inference is working.
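
Here -m selects the model file, -p supplies the prompt, and --n-predict caps the number of generated tokens.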


5. Configure RATATOSKR

Reference the engine and model explicitly in task.yaml:

engine:
  type: llama.cpp
  name: /home/youruser/engines/llama.cpp/llama
  model: /home/youruser/models/capybarahermes-2.5-mistral-7b.Q4_K_M.gguf

RATATOSKR treats the engine as an opaque executor. All execution must be reproducible from disk.
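
A complete task.yaml also carries the question, scope, and memory references described in the execution model above. The sketch below reuses the engine block verbatim; every field outside it is an illustrative assumption, not a published schema:

# Sketch only: fields outside `engine` are assumptions, not a published schema.
question: "What retry policy did we agree on?"
scope: deployment-notes
memory:
  - fur: ~/fur/logs/2026-01-05.jsonl     # conversation atoms (hypothetical path)
  - document: ~/notes/retry-policy.md    # document excerpt (hypothetical path)
engine:
  type: llama.cpp
  name: /home/youruser/engines/llama.cpp/llama
  model: /home/youruser/models/capybarahermes-2.5-mistral-7b.Q4_K_M.gguf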


Integration

RATATOSKR operates in concert with:

  • FUR — durable conversation and memory logging
  • Yggdrasil-CLI — project and codebase flattening
  • Inference engines — external executors invoked explicitly

RATATOSKR defines execution truth.
Other systems produce or consume its traces.


Output

Each run creates a trace directory containing:

  • frozen task specification
  • resolved memory and context artifacts
  • assembled prompt
  • engine declaration
  • raw model output
  • material usage ledger
  • execution metadata and status

Traces are append-only, inspectable, and replayable.
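
As a sketch, one run's trace directory might look like the following; the file names are illustrative assumptions, and only the categories listed above come from this README:

traces/2026-01-09T23-41-07Z/   # hypothetical layout
  task.yaml      # frozen task specification
  memory/        # resolved memory and context artifacts
  prompt.txt     # assembled prompt
  engine.yaml    # engine declaration
  output.txt     # raw model output
  ledger.json    # material usage ledger
  meta.json      # execution metadata and status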


Direction

RATATOSKR is evolving toward a complete local-first execution stack where humans curate knowledge explicitly, models remain frozen and replaceable, and reasoning is reproducible by construction.

Future work extends execution patterns and supported engines without weakening the core contract.


The execution record your model cannot escape.
