| Crate | ratatoskr-cli |
| Version | 0.3.1 |
| Description | Trace-first, deterministic execution for language model workflows |
| Repository | https://github.com/andrewrgarcia/ratatoskr |
Local-first execution layer for language models with enforced memory explicitness and verifiable provenance.
RATATOSKR executes language models locally as pure, replaceable functions.
Everything else—tasks, documents, conversations, prompts, outputs—exists as explicit, frozen material on disk. Nothing is inferred. Nothing is hidden. Nothing persists outside the trace.
Each execution produces a complete, inspectable record: what information was available, how it was assembled, what the model generated, and which material atoms were cited.
If an output cannot be grounded in declared material, execution fails.
Every run follows the same pipeline:

1. **Intent frozen.** A task specification (`task.yaml`) defines the question, scope, engine, and memory references (see the sketch after this list).
2. **Memory explicit.** Conversations and documents resolve to immutable material atoms (e.g. FUR messages, document excerpts).
3. **Context assembled deterministically.** Prompts are constructed in a fully specified, reproducible order.
4. **Model executes.** The language model is treated as an opaque executor: input → output.
5. **Provenance enforced.** Outputs must cite material atoms; invalid or missing citations fail execution.
6. **Trace finalized.** Inputs, outputs, validation results, and metadata are persisted as durable artifacts.
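As a concrete sketch of step 1, here is a hypothetical `task.yaml`. Only the `engine` block matches the configuration shown later in this README; the `question`, `scope`, and `memory` keys are illustrative names for the fields described above, not a documented schema.

```yaml
# Hypothetical task specification (illustrative field names, not the real schema).
question: "What did the design review conclude about caching?"
scope: design-review-2026-01
engine:
  type: llama.cpp
  name: /home/youruser/engines/llama.cpp/llama
  model: /home/youruser/models/capybarahermes-2.5-mistral-7b.Q4_K_M.gguf
memory:
  - fur: conversations/design-review.fur    # resolves to FUR message atoms
  - document: docs/caching-proposal.md      # resolves to document excerpts
```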
- No hidden state.
- If it is not on disk, it did not happen.
- Every answer must prove where it came from.
- Enforced mechanically, not by convention (a minimal illustration follows this list).
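To make "mechanical" concrete, the shape of such a gate can be shown in plain shell. This is an illustration only, not RATATOSKR's actual validator; per the pipeline above, the real check also rejects citations of atoms that were never declared.

```bash
# Illustration only: refuse any model output that cites no material atom.
# RATATOSKR's real validation also fails on invalid (undeclared) citations.
if ! grep -qE '\[FUR:[A-Za-z0-9_-]+\]' output.txt; then
  echo "provenance check failed: output cites no material atoms" >&2
  exit 1
fi
```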
RATATOSKR is built for offline and air-gapped environments.
Local-first execution is not an optimization or deployment choice. It is a hard architectural constraint.
RATATOSKR requires a local inference engine and a local model file. It does not download, install, or manage either.
The reference local-first setup uses llama.cpp as the inference engine and a single GGUF model file.
Download a prebuilt llama.cpp release:

https://github.com/ggerganov/llama.cpp/releases

Extract the archive and move it to a stable location:

```bash
mkdir -p ~/engines
mv ~/Downloads/llama-b7684 ~/engines/llama.cpp
```
llama.cpp ships shared libraries. To avoid a system-wide installation, create a local wrapper:

```bash
cd ~/engines/llama.cpp
nano llama
```

Paste:

```bash
#!/usr/bin/env bash
# Resolve the directory this wrapper lives in, so it works from any CWD.
DIR="$(cd "$(dirname "$0")" && pwd)"
# Point the dynamic linker at the shared libraries bundled alongside it.
export LD_LIBRARY_PATH="$DIR"
exec "$DIR/llama-cli" "$@"
```
Make it executable:

```bash
chmod +x llama
```

Verify:

```bash
~/engines/llama.cpp/llama --help
```
Models are not included. Download one GGUF file explicitly. Recommended (balanced quality, instruction-tuned):

```bash
mkdir -p ~/models
cd ~/models
wget https://huggingface.co/TheBloke/CapybaraHermes-2.5-Mistral-7B-GGUF/resolve/main/capybarahermes-2.5-mistral-7b.Q4_K_M.gguf
```

Verify:

```bash
ls -lh ~/models
```
Run a quick inference test:

```bash
~/engines/llama.cpp/llama \
  -m ~/models/capybarahermes-2.5-mistral-7b.Q4_K_M.gguf \
  -p "Test output [FUR:test]." \
  --n-predict 128
```

If text prints, local inference is working.
Reference the engine and model explicitly in `task.yaml`:

```yaml
engine:
  type: llama.cpp
  name: /home/youruser/engines/llama.cpp/llama
  model: /home/youruser/models/capybarahermes-2.5-mistral-7b.Q4_K_M.gguf
```
RATATOSKR treats the engine as an opaque executor. All execution must be reproducible from disk.
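Before running a task, it is worth confirming that both paths referenced in `task.yaml` resolve on disk. This is generic shell, not a RATATOSKR command:

```bash
# Sanity-check the paths referenced in task.yaml (plain shell, not ratatoskr).
test -x ~/engines/llama.cpp/llama && echo "engine OK"
test -f ~/models/capybarahermes-2.5-mistral-7b.Q4_K_M.gguf && echo "model OK"
```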
RATATOSKR operates in concert with other systems: it defines execution truth, and they produce or consume its traces.
Each run creates a trace directory containing the frozen inputs, the assembled context, the model's outputs, validation results, and run metadata. Traces are append-only, inspectable, and replayable (a hypothetical layout is sketched below).
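For orientation only, a trace directory in this spirit might look like the following. The file names are invented for illustration; they are not RATATOSKR's documented trace format.

```text
traces/2026-01-09T23-41-07Z/   # one directory per run (hypothetical layout)
├── task.yaml        # frozen intent
├── context.txt      # deterministically assembled prompt
├── output.txt       # raw model generation with [FUR:...] citations
├── validation.json  # provenance check results
└── meta.json        # engine, model, and run metadata
```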
RATATOSKR is evolving toward a complete local-first execution stack where humans curate knowledge explicitly, models remain frozen and replaceable, and reasoning is reproducible by construction.
Future work extends execution patterns and supported engines without weakening the core contract.
The execution record your model cannot escape.