| Field | Value |
|---|---|
| Crates.io | bitmamba |
| lib.rs | bitmamba |
| version | 0.1.0 |
| created_at | 2026-01-09 00:24:52.792616+00 |
| updated_at | 2026-01-09 00:24:52.792616+00 |
| description | BitMamba: 1.58-bit Mamba language model with infinite context window - includes OpenAI-compatible API server |
| homepage | https://github.com/rileyseaburg/bitmamba |
| repository | https://github.com/rileyseaburg/bitmamba |
| max_upload_size | |
| id | 2031337 |
| size | 116,341 |
A 1.58-bit Mamba language model with infinite context window, implemented in Rust.
```sh
cargo install bitmamba
```
Or build from source:
```sh
git clone https://github.com/rileyseaburg/bitmamba
cd bitmamba
cargo build --release
```
```sh
# Run inference directly
bitmamba

# Start the server
bitmamba-server
```
The server runs at http://localhost:8000 with these endpoints:
| Endpoint | Method | Description |
|---|---|---|
| `/v1/models` | GET | List available models |
| `/v1/chat/completions` | POST | Chat completions (streaming supported) |
| `/v1/completions` | POST | Text completions |
| `/health` | GET | Health check |
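Once the server is up, any OpenAI-style client can talk to it. As a dependency-free sketch, the raw HTTP request for the chat endpoint can be assembled like this (the JSON fields follow the OpenAI chat schema; the `chat_request` helper is illustrative, not part of the crate's API — send the result over a `TcpStream` or use any HTTP client):

```rust
// Build a raw HTTP/1.1 request for POST /v1/chat/completions.
// Hypothetical helper for illustration; field names match the OpenAI chat schema.
fn chat_request(model: &str, user_msg: &str) -> String {
    let body = format!(
        r#"{{"model":"{model}","messages":[{{"role":"user","content":"{user_msg}"}}]}}"#
    );
    format!(
        "POST /v1/chat/completions HTTP/1.1\r\n\
         Host: localhost:8000\r\n\
         Content-Type: application/json\r\n\
         Content-Length: {}\r\n\
         Connection: close\r\n\r\n{}",
        body.len(),
        body
    )
}

fn main() {
    // Write this string to a TcpStream connected to 127.0.0.1:8000.
    println!("{}", chat_request("bitmamba-student", "Write a haiku about Rust"));
}
```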
OpenAI-compatible clients can be pointed at the local server with a configuration like:

```json
{
  "apiProvider": "openai-compatible",
  "baseUrl": "http://localhost:8000/v1",
  "model": "bitmamba-student"
}
```
```rust
use anyhow::Result;

fn main() -> Result<()> {
    // Load the default model and tokenizer.
    let (model, tokenizer) = bitmamba::load()?;

    let prompt = "def fibonacci(n):";
    let tokens = tokenizer.encode(prompt, true)?;

    // Generate up to 50 new tokens at temperature 0.7.
    let output = model.generate(tokens.get_ids(), 50, 0.7)?;
    println!("{}", tokenizer.decode(&output, true)?);
    Ok(())
}
```
The default model is rileyseaburg/bitmamba-student on Hugging Face, a 278M parameter BitMamba model distilled from Qwen2.5-Coder-1.5B.
```
Input -> RMSNorm -> BitLinear (in_proj) -> Conv1d -> SiLU -> SSM Scan -> Gate -> BitLinear (out_proj) -> Residual
```
The SSM Scan is the key component that enables infinite context:
```
// Fixed-size state, O(1) memory per token
h = dA * h + dB * x   // State update
y = h @ C + D * x     // Output
```
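The recurrence above can be run numerically. Here is a minimal scalar sketch (real layers use per-channel vectors for `dA`, `dB`, `C`, `D` and input-dependent discretization, which this toy version omits); note that the state `h` has a fixed size, so memory stays constant regardless of sequence length:

```rust
// Scalar SSM scan: one f64 of state per channel, O(1) memory per token.
fn ssm_scan(da: f64, db: f64, c: f64, d: f64, xs: &[f64]) -> Vec<f64> {
    let mut h = 0.0;
    xs.iter()
        .map(|&x| {
            h = da * h + db * x; // state update: h <- dA*h + dB*x
            h * c + d * x        // output: y = h*C + D*x
        })
        .collect()
}

fn main() {
    let ys = ssm_scan(0.9, 0.1, 1.0, 0.5, &[1.0, 2.0, 3.0, 4.0]);
    println!("{ys:?}");
}
```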
| Metric | Value |
|---|---|
| Parameters | 278M |
| Memory (inference) | ~1.1 GB |
| Context Window | Unlimited |
| Quantization | 1.58-bit weights |
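The "1.58-bit" figure comes from ternary weights: each weight is constrained to {-1, 0, +1}, which carries log2(3) ≈ 1.585 bits of information. A toy threshold quantizer illustrates the idea (the actual BitNet-style scheme also scales by the mean absolute weight, which this sketch omits):

```rust
// Map a full-precision weight to a ternary value in {-1, 0, +1}.
// Illustrative only; the threshold here is arbitrary.
fn quantize_ternary(w: f64, threshold: f64) -> i8 {
    if w > threshold {
        1
    } else if w < -threshold {
        -1
    } else {
        0
    }
}

fn main() {
    // log2(3): information content of one ternary weight.
    println!("{:.3}", 3f64.log2()); // 1.585

    let weights = [0.7, -0.02, -0.9, 0.1];
    let q: Vec<i8> = weights.iter().map(|&w| quantize_ternary(w, 0.05)).collect();
    println!("{q:?}"); // [1, 0, -1, 1]
}
```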
If you use BitMamba in your research, please cite:
```bibtex
@software{bitmamba2024,
  author = {Seaburg, Riley},
  title  = {BitMamba: 1.58-bit Mamba with Infinite Context},
  year   = {2024},
  url    = {https://github.com/rileyseaburg/bitmamba}
}
```
MIT License - see LICENSE for details.