| Crates.io | swarm-engine-llm |
| lib.rs | swarm-engine-llm |
| version | 0.1.0 |
| created_at | 2026-01-25 17:51:34.325403+00 |
| updated_at | 2026-01-25 17:51:34.325403+00 |
| description | LLM integration backends for SwarmEngine |
| homepage | |
| repository | https://github.com/ytknishimura/swarm-engine |
| max_upload_size | |
| id | 2069175 |
| size | 312,654 |
A high-throughput, low-latency agent swarm execution engine written in Rust.
SwarmEngine is designed for running multiple AI agents in parallel with tick-based synchronization, optimized for batch LLM inference and real-time exploration scenarios.
Measured on the troubleshooting scenario (exploration-based, no per-tick LLM calls):
| Metric | Value |
|---|---|
| Throughput | ~80 actions/sec |
| Tick latency (exploration) | 0.1-0.2ms per action |
| Task completion | 5 actions in ~60ms |
Note: LLM-based decision making adds latency per call. The exploration-based mode uses graph traversal instead of per-tick LLM calls.
┌──────────────────────────────────────────────────────────────────────┐
│ SwarmEngine │
│ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ Orchestrator │ │
│ │ │ │
│ │ Tick Loop: │ │
│ │ 1. Collect Async Results │ │
│ │ 2. Manager Phase (LLM Decision / Exploration) │ │
│ │ 3. Worker Execution (Parallel) │ │
│ │ 4. Merge Results │ │
│ │ 5. Tick Advance │ │
│ └────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌────────────────────┐ │
│ │ SwarmState │ │ ExplorationSpace│ │ BatchInvoker │ │
│ │ ├─ SharedState │ │ ├─ GraphMap │ │ ├─ LlamaCppServer │ │
│ │ └─ WorkerStates│ │ └─ Operators │ │ └─ Ollama │ │
│ └─────────────────┘ └─────────────────┘ └────────────────────┘ │
└──────────────────────────────────────────────────────────────────────┘
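The five tick-loop phases in the diagram map naturally onto a single orchestrator method. The following is a minimal, synchronous sketch with hypothetical names (`Orchestrator`, `Decision`, `ActionResult`); the real orchestrator in swarm-engine-core is async and its API is not shown here:

```rust
// Illustrative sketch of the five-phase tick loop; all names are hypothetical,
// not the actual swarm-engine-core API.
#[derive(Debug)]
struct Decision(String);

#[derive(Debug)]
struct ActionResult(String);

#[derive(Default)]
struct Orchestrator {
    tick: u64,
    shared_state: Vec<ActionResult>,
}

impl Orchestrator {
    fn run_tick(&mut self) {
        // 1. Collect results of async work finished since the last tick
        //    (stubbed here with a single fake result).
        let completed = vec![ActionResult("CheckStatus: nginx is down".into())];

        // 2. Manager phase: pick next actions (LLM decision or graph exploration).
        let decisions: Vec<Decision> = completed
            .iter()
            .map(|r| Decision(format!("follow-up for {r:?}")))
            .collect();

        // 3. Worker execution: run the decided actions (in parallel in the real engine).
        let results: Vec<ActionResult> = decisions
            .into_iter()
            .map(|d| ActionResult(format!("executed {}", d.0)))
            .collect();

        // 4. Merge worker results back into the shared swarm state.
        self.shared_state.extend(results);

        // 5. Advance the tick counter.
        self.tick += 1;
    }
}

fn main() {
    let mut orch = Orchestrator::default();
    orch.run_tick();
    println!("tick = {}", orch.tick);
}
```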
| Crate | Description |
|---|---|
| swarm-engine-core | Core runtime, orchestrator, state management, exploration, and learning |
| swarm-engine-llm | LLM integrations (llama.cpp server, Ollama, prompt building, batch processing) |
| swarm-engine-eval | Scenario-based evaluation framework with assertions and metrics |
| swarm-engine-ui | CLI and Desktop GUI (egui) |
# Clone the repository
git clone https://github.com/ytknishimura/swarm-engine.git
cd swarm-engine
# Build
cargo build --release
SwarmEngine uses llama.cpp server as the primary LLM backend. LFM2.5-1.2B is the recommended model for development and testing due to its balance of speed and quality.
# Using Hugging Face CLI (recommended)
pip install huggingface_hub
huggingface-cli download LiquidAI/LFM2.5-1.2B-Instruct-GGUF \
LFM2.5-1.2B-Instruct-Q4_K_M.gguf
# Or download directly from Hugging Face:
# https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF
# Start with the downloaded model (using glob pattern for snapshot hash)
cargo run --package swarm-engine-ui -- llama start \
-m ~/.cache/huggingface/hub/models--LiquidAI--LFM2.5-1.2B-Instruct-GGUF/snapshots/*/LFM2.5-1.2B-Instruct-Q4_K_M.gguf
# With custom options (GPU acceleration, parallel slots)
cargo run --package swarm-engine-ui -- llama start \
-m ~/.cache/huggingface/hub/models--LiquidAI--LFM2.5-1.2B-Instruct-GGUF/snapshots/*/LFM2.5-1.2B-Instruct-Q4_K_M.gguf \
--n-gpu-layers 99 \
--parallel 4 \
--ctx-size 4096
# Check if server is running and healthy
cargo run --package swarm-engine-ui -- llama status
# View server logs
cargo run --package swarm-engine-ui -- llama logs -f
# Stop the server
cargo run --package swarm-engine-ui -- llama stop
| Model | Size | Speed | Quality | Use Case |
|---|---|---|---|---|
| LFM2.5-1.2B | 1.2B | Fast | Good | Development, testing (recommended) |
| Qwen2.5-Coder-3B | 3B | Medium | Better | Complex scenarios |
| Qwen2.5-Coder-7B | 7B | Slow | Best | Production quality testing |
# Run a troubleshooting scenario
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml -n 5 -v
# With learning data collection
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml -n 5 --learning
# Show help
cargo run --package swarm-engine-ui -- --help
# Initialize configuration
cargo run --package swarm-engine-ui -- init
# Show current configuration
cargo run --package swarm-engine-ui -- config
# Open scenarios directory
cargo run --package swarm-engine-ui -- open scenarios
# Launch Desktop GUI
cargo run --package swarm-engine-ui -- --gui
Scenarios are defined in TOML format and describe the task, environment, actions, and success criteria:
[meta]
name = "Service Troubleshooting"
id = "user:troubleshooting:v2"
description = "Diagnose and fix a service outage"
[task]
goal = "Diagnose the failing service and restart it"
[llm]
provider = "llama-server"
model = "LFM2.5-1.2B"
endpoint = "http://localhost:8080"
[[actions.actions]]
name = "CheckStatus"
description = "Check the status of services"
[[actions.actions]]
name = "ReadLogs"
description = "Read logs for a specific service"
[app_config]
tick_duration_ms = 10
max_ticks = 150
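For illustration, a scenario file in this shape can be deserialized with serde and the toml crate. The struct names below are hypothetical mirrors of the fields shown above, not the actual swarm-engine-eval types; sections not listed (such as `[[actions.actions]]`) are simply ignored by serde's default behavior:

```rust
// Hypothetical structs mirroring the scenario fields shown above;
// the real swarm-engine-eval types may differ.
use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct Scenario {
    meta: Meta,
    task: Task,
    llm: LlmConfig,
    app_config: AppConfig,
}

#[derive(Debug, Deserialize)]
struct Meta {
    name: String,
    id: String,
    description: String,
}

#[derive(Debug, Deserialize)]
struct Task {
    goal: String,
}

#[derive(Debug, Deserialize)]
struct LlmConfig {
    provider: String,
    model: String,
    endpoint: String,
}

#[derive(Debug, Deserialize)]
struct AppConfig {
    tick_duration_ms: u64,
    max_ticks: u64,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let text = std::fs::read_to_string("crates/swarm-engine-eval/scenarios/troubleshooting.toml")?;
    let scenario: Scenario = toml::from_str(&text)?;
    println!("goal: {}", scenario.task.goal);
    Ok(())
}
```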
Scenarios can define variants for different configurations:
# List available variants
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml --list-variants
# Run with a specific variant
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml --variant complex
SwarmEngine includes a comprehensive learning system with offline parameter optimization and LoRA fine-tuning support.
┌─────────────────────────────────────────────────────────────────┐
│ Learning System │
│ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ Data Collection ││
│ │ Eval (--learning) → ActionEvents → Session Snapshots ││
│ └─────────────────────────────────────────────────────────────┘│
│ ↓ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ Offline Analysis ││
│ │ learn once → Stats Analysis → OptimalParamsModel ││
│ │ → RecommendedPaths ││
│ └─────────────────────────────────────────────────────────────┘│
│ ↓ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ Model Application ││
│ │ Next Eval → Load OfflineModel → Apply Parameters ││
│ │ → LoRA Adapter (optional) ││
│ └─────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘
# 1. Collect data with --learning flag
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml -n 30 --learning
# 2. Run offline learning
cargo run --package swarm-engine-ui -- learn once troubleshooting
# 3. Next eval run will automatically use the learned model
cargo run --package swarm-engine-ui -- eval crates/swarm-engine-eval/scenarios/troubleshooting.toml -n 5 -v
# → "Offline model loaded: ucb1_c=X.XXX, strategy=..."
| Model | Purpose | Lifetime |
|---|---|---|
| ScoreModel | Action selection scores (transitions, N-gram patterns) | 1 session |
| OptimalParamsModel | Parameter optimization (ucb1_c, thresholds) | Cross-session |
| LoRA Adapter | LLM fine-tuning for decision quality | Persistent |
{
"parameters": {
"ucb1_c": 1.414, // UCB1 exploration constant
"learning_weight": 0.3, // Learning weight for selection
"ngram_weight": 1.0 // N-gram pattern weight
},
"strategy_config": {
"initial_strategy": "ucb1",
"maturity_threshold": 5,
"error_rate_threshold": 0.45
},
"recommended_paths": [...] // Optimal action sequences
}
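A model in this shape can be loaded back with serde_json. The struct below is a hypothetical mirror of the fields shown above, not the actual OptimalParamsModel definition in swarm-engine-core; the file path follows the learning data layout described later in this README:

```rust
// Hypothetical mirror of the offline model JSON shown above.
use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct OfflineModel {
    parameters: Parameters,
    strategy_config: StrategyConfig,
}

#[derive(Debug, Deserialize)]
struct Parameters {
    ucb1_c: f64,
    learning_weight: f64,
    ngram_weight: f64,
}

#[derive(Debug, Deserialize)]
struct StrategyConfig {
    initial_strategy: String,
    maturity_threshold: u32,
    error_rate_threshold: f64,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let home = std::env::var("HOME")?;
    let path = format!("{home}/.swarm-engine/learning/scenarios/troubleshooting/offline_model.json");
    let model: OfflineModel = serde_json::from_str(&std::fs::read_to_string(path)?)?;
    println!("ucb1_c = {}", model.parameters.ucb1_c);
    Ok(())
}
```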
For continuous learning during long-running evaluations:
# Start daemon mode (monitors and learns continuously)
cargo run --package swarm-engine-ui -- learn daemon troubleshooting
# Daemon features:
# - Watches for new session data
# - Triggers learning based on configurable conditions
# - Applies learned models via Blue-Green deployment
Fine-tune the LLM for improved decision quality:
# LoRA training requires:
# - Episode data collected from successful runs
# - llama.cpp with LoRA support
# - Training triggers (count, time, or quality-based)
~/.swarm-engine/learning/
├── global_stats.json # Global statistics across scenarios
└── scenarios/
└── troubleshooting/ # Per-scenario (learning_key based)
├── stats.json # Accumulated statistics
├── offline_model.json # Learned parameters
├── lora/ # LoRA adapters (if trained)
│ └── v1/
│ └── adapter.safetensors
└── sessions/ # Session snapshots
└── {timestamp}/
├── meta.json
└── stats.json
The learning system optimizes selection strategy parameters:
| Strategy | Description | When Used |
|---|---|---|
| UCB1 | Upper Confidence Bound | Early exploration |
| Thompson | Bayesian sampling | Probabilistic exploration |
| Greedy | Best known action | Exploitation after learning |
| Adaptive | Dynamic switching | Production (based on error rate) |
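As a concrete reference for the UCB1 row above: the standard UCB1 score is the mean reward plus an exploration bonus `c * sqrt(ln(total_visits) / action_visits)`, where `c` corresponds to the learned `ucb1_c` parameter (1.414 ≈ √2 in the example JSON). A minimal sketch of that formula, not the engine's actual selector code:

```rust
// Standard UCB1 scoring; a minimal sketch, not the engine's actual selector.
// `c` corresponds to the learned `ucb1_c` parameter (e.g. 1.414 ≈ sqrt(2)).
struct ActionStats {
    visits: u64,
    total_reward: f64,
}

fn ucb1_score(stats: &ActionStats, total_visits: u64, c: f64) -> f64 {
    if stats.visits == 0 {
        return f64::INFINITY; // always try unvisited actions first
    }
    let mean = stats.total_reward / stats.visits as f64;
    let bonus = c * ((total_visits as f64).ln() / stats.visits as f64).sqrt();
    mean + bonus
}

fn main() {
    // Hypothetical statistics for two actions from the example scenario.
    let actions = vec![
        ("CheckStatus", ActionStats { visits: 10, total_reward: 7.0 }),
        ("ReadLogs", ActionStats { visits: 3, total_reward: 2.5 }),
    ];
    let total: u64 = actions.iter().map(|(_, s)| s.visits).sum();
    let best = actions
        .iter()
        .max_by(|a, b| {
            ucb1_score(&a.1, total, 1.414)
                .partial_cmp(&ucb1_score(&b.1, total, 1.414))
                .unwrap()
        })
        .unwrap();
    println!("select: {}", best.0);
}
```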
llama.cpp server provides true batch processing with continuous batching:
cargo run --package swarm-engine-ui -- llama start \
-m model.gguf \
--parallel 4 \
--ctx-size 4096 \
--n-gpu-layers 99
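With `--parallel 4`, the server can hold four requests in flight and interleave them via continuous batching, so a batch invoker only needs to issue requests concurrently. The sketch below assumes the OpenAI-compatible `/v1/chat/completions` endpoint exposed by llama.cpp server and the tokio, reqwest, futures, and serde_json crates; it is not the swarm-engine-llm BatchInvoker itself:

```rust
// Sketch only: fires several prompts at the llama.cpp server concurrently so
// its continuous batching (--parallel slots) can interleave them.
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::Client::new();
    let prompts = ["Check service status", "Read nginx logs", "Summarize errors"];

    let requests = prompts.iter().map(|p| {
        let client = client.clone();
        async move {
            client
                .post("http://localhost:8080/v1/chat/completions")
                .json(&json!({
                    "model": "LFM2.5-1.2B",
                    "messages": [{ "role": "user", "content": p }],
                    "max_tokens": 128
                }))
                .send()
                .await?
                .json::<serde_json::Value>()
                .await
        }
    });

    // All requests are in flight at once; the server spreads them across slots.
    let responses = futures::future::join_all(requests).await;
    for r in responses {
        println!("{}", r?["choices"][0]["message"]["content"]);
    }
    Ok(())
}
```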
Ollama can be used but does not support true batch processing:
ollama serve
Note: Ollama processes requests sequentially internally, so throughput measurements may not reflect true parallel performance.
Configuration file (~/.swarm-engine/config.toml):
[general]
default_project_type = "eval"
[eval]
default_runs = 30
target_tick_duration_ms = 10
[llm]
default_provider = "llama-server"
cache_enabled = true
[logging]
level = "info"
file_enabled = true
| Path | Purpose |
|---|---|
| ~/.swarm-engine/ | System configuration, cache, logs |
| ~/swarm-engine/ | User data: scenarios, reports |
| ./swarm-engine/ | Project-local configuration |
# Type check
cargo check
# Build
cargo build
# Run tests
cargo test
# Run with verbose logging
RUST_LOG=debug cargo run --package swarm-engine-ui -- eval ...
swarm-engine/
├── crates/
│ ├── swarm-engine-core/ # Core runtime
│ │ ├── src/
│ │ │ ├── orchestrator/ # Main loop
│ │ │ ├── agent/ # Worker/Manager definitions
│ │ │ ├── exploration/ # Graph-based exploration
│ │ │ ├── learn/ # Offline learning
│ │ │ └── ...
│ ├── swarm-engine-llm/ # LLM integrations
│ ├── swarm-engine-eval/ # Evaluation framework
│ │ └── scenarios/ # Built-in scenarios
│ └── swarm-engine-ui/ # CLI and GUI
Detailed design documentation is available in the RustDoc comments of each crate:
# Generate and open documentation
cargo doc --open --no-deps
Key documentation locations:
MIT License