| Crates.io | inferno-ai |
| lib.rs | inferno-ai |
| version | 0.6.0 |
| created_at | 2025-10-01 02:04:49.54842+00 |
| updated_at | 2025-10-01 02:04:49.54842+00 |
| description | Enterprise AI/ML model runner with automatic updates, real-time monitoring, and multi-interface support |
| homepage | https://github.com/ringo380/inferno |
| repository | https://github.com/ringo380/inferno |
| max_upload_size | |
| id | 1861970 |
| size | 10,028,881 |
Run any AI model locally with enterprise-grade performance and privacy
Inferno is a production-ready AI inference server that runs entirely on your hardware. Think of it as your private ChatGPT that works offline, supports any model format, and gives you complete control over your AI infrastructure.
Choose your preferred installation method:
Native macOS application with Metal GPU acceleration, optimized for Apple Silicon (M1/M2/M3/M4)
Download Inferno.dmg (universal binary for Intel & Apple Silicon) from the Releases page.
Build from source:
# Clone and build
git clone https://github.com/ringo380/inferno.git
cd inferno
./scripts/build-desktop.sh --release --universal
# Development mode with hot reload
cd dashboard && npm run tauri dev
Homebrew
# Add tap and install
brew tap ringo380/tap
brew install inferno
# Or directly
brew install ringo380/tap/inferno
# Start as service
brew services start inferno
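To confirm the background service came up, check Homebrew's service listing:
# Verify the service is running
brew services list | grep inferno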
Quick Install Script
curl -sSL https://github.com/ringo380/inferno/releases/latest/download/install-inferno.sh | bash
# Pull the latest image
docker pull ghcr.io/ringo380/inferno:latest
# Run with GPU support
docker run --gpus all -p 8080:8080 ghcr.io/ringo380/inferno:latest
# With custom models directory
docker run -v /path/to/models:/home/inferno/.inferno/models \
-p 8080:8080 ghcr.io/ringo380/inferno:latest
Or define the same container with Docker Compose:
version: '3.8'
services:
  inferno:
    image: ghcr.io/ringo380/inferno:latest
    ports:
      - "8080:8080"
    volumes:
      - ./models:/home/inferno/.inferno/models
      - ./config:/home/inferno/.inferno/config
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
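With the file above saved as docker-compose.yml, the usual Compose commands bring the service up and let you watch it start:
# Start in the background and follow logs for the inferno service
docker compose up -d
docker compose logs -f inferno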
# From crates.io
cargo install inferno
# From GitHub Packages
cargo install --registry github inferno
# From GitHub Packages
npm install @ringo380/inferno-desktop
# From npm registry
npm install inferno-desktop
# Download for your architecture
wget https://github.com/ringo380/inferno/releases/latest/download/inferno-linux-x86_64
# or
wget https://github.com/ringo380/inferno/releases/latest/download/inferno-linux-aarch64
# Make executable and move to PATH
chmod +x inferno-linux-*
sudo mv inferno-linux-* /usr/local/bin/inferno
Windows: download inferno-windows-x86_64.exe from the Releases page, or install with cargo install inferno.
# Clone the repository
git clone https://github.com/ringo380/inferno.git
cd inferno
# Build release binary
cargo build --release
# Install globally (optional)
cargo install --path .
# Build desktop app (optional)
cd desktop-app && npm install && npm run build
inferno upgrade check # Check for updates
inferno upgrade install # Install latest version
# Homebrew
brew upgrade inferno
# Docker
docker pull ghcr.io/ringo380/inferno:latest
# Cargo
cargo install inferno --force
# NPM
npm update @ringo380/inferno-desktop
Note: DMG and installer packages automatically detect existing installations and preserve your settings during upgrade.
# Check version
inferno --version
# Verify GPU support
inferno gpu status
# Run health check
inferno doctor
# List available models
inferno models list
# Run inference
inferno run --model MODEL_NAME --prompt "Your prompt here"
# Start HTTP API server (see the example request below)
inferno serve
# Launch terminal UI
inferno tui
# Launch desktop app (if installed from DMG)
open /Applications/Inferno.app
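Once inferno serve is up, you can exercise the HTTP API from another shell. The endpoint and JSON payload below are illustrative assumptions rather than the documented API surface, so treat this as a sketch and check the server's API documentation for the actual routes:
# Hypothetical request -- the /v1/completions path and payload shape are assumptions,
# not a confirmed part of Inferno's API; adjust to your version's documented routes
curl -s http://localhost:8080/v1/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "MODEL_NAME", "prompt": "Your prompt here"}'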
Built with a modular, trait-based architecture supporting pluggable backends:
src/
├── main.rs # CLI entry point
├── lib.rs # Library exports
├── config.rs # Configuration management
├── backends/ # AI model execution backends
├── cli/ # 40+ CLI command modules
├── api/ # HTTP/WebSocket APIs
├── batch/ # Batch processing system
├── models/ # Model discovery and metadata
└── [Enterprise] # Advanced production features
Create inferno.toml:
# Basic settings
models_dir = "/path/to/models"
log_level = "info"
[server]
bind_address = "0.0.0.0"
port = 8080
[backend_config]
gpu_enabled = true
context_size = 4096
batch_size = 64
[cache]
enabled = true
compression = "zstd"
max_size_gb = 10
See CLAUDE.md for comprehensive development documentation.
# Run tests
cargo test
# Format code
cargo fmt
# Run linter
cargo clippy
# Full verification
./verify.sh
Licensed under either of:
🔥 Ready to take control of your AI infrastructure? 🔥
Built with ❤️ by the open source community