| Crates.io | unsloth-rs |
| lib.rs | unsloth-rs |
| version | 1.0.1 |
| created_at | 2026-01-09 18:00:47.311682+00 |
| updated_at | 2026-01-25 03:02:26.708027+00 |
| description | Rust implementations of transformer building blocks for LLM inference |
| homepage | |
| repository | https://github.com/tzervas/unsloth-rs |
| max_upload_size | |
| id | 2032504 |
| size | 1,081,841 |
Rust implementations of transformer building blocks for LLM inference and fine-tuning.
unsloth-rs provides Rust implementations of common transformer operations built on the Candle ML framework.
Version 1.0.0 marks the core functionality as stable. The current kernels are CPU reference implementations; GPU dispatch goes through Candle's CUDA backend.
Add the crate to your `Cargo.toml`:
```toml
[dependencies]
unsloth-rs = "1.0.0"
```
For CUDA support (via Candle's CUDA backend), enable the `cuda` feature:
```toml
[dependencies]
unsloth-rs = { version = "1.0.0", features = ["cuda"] }
```
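With the `cuda` feature enabled, device selection happens through Candle itself. A minimal sketch (this uses Candle's own `Device::cuda_if_available`, not an unsloth-rs API) that prefers a GPU and falls back to the CPU reference path:
```rust
use candle_core::Device;

fn main() -> anyhow::Result<()> {
    // Prefer GPU 0 when CUDA is available; otherwise fall back to the CPU path.
    let device = Device::cuda_if_available(0)?;
    println!("using CUDA: {}", device.is_cuda());
    Ok(())
}
```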
A basic fused-attention forward pass on the CPU:
```rust
use unsloth_rs::kernels::{FusedAttention, FusedAttentionConfig};
use candle_core::{Device, Tensor};

fn main() -> anyhow::Result<()> {
    let device = Device::Cpu;

    let config = FusedAttentionConfig {
        hidden_size: 768,
        num_heads: 12,
        head_dim: 64,
        num_kv_heads: Some(4), // grouped-query attention (GQA)
        ..Default::default()
    };
    let attention = FusedAttention::new(config, &device)?;

    // Random input of shape (batch=1, seq_len=128, hidden=768):
    // Tensor::randn(mean, std_dev, shape, device)
    let hidden_states = Tensor::randn(0.0f32, 1.0, (1, 128, 768), &device)?;
    let output = attention.forward(&hidden_states, None, None)?;
    println!("output shape: {:?}", output.dims());

    Ok(())
}
```
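Setting `num_kv_heads` below `num_heads` is what enables GQA and shrinks the KV cache. A back-of-envelope illustration for the config above (plain arithmetic, not an unsloth-rs API): with 4 KV heads of dimension 64 instead of 12, the cache holds 2 × 4 × 64 = 512 values per token per layer rather than 2 × 12 × 64 = 1536, a 3× reduction.
```rust
// Illustrative only: per-token, per-layer KV-cache size (keys + values).
fn kv_cache_bytes_per_token(num_kv_heads: usize, head_dim: usize, bytes_per_elem: usize) -> usize {
    2 * num_kv_heads * head_dim * bytes_per_elem
}

fn main() {
    let mha = kv_cache_bytes_per_token(12, 64, 4); // full multi-head: 6144 B (f32)
    let gqa = kv_cache_bytes_per_token(4, 64, 4);  // GQA as configured: 2048 B (f32)
    println!("KV cache per token, per layer: MHA {mha} B vs GQA {gqa} B");
}
```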
The memory module can estimate forward-pass memory for a given model shape, taking activation checkpointing into account:
```rust
use unsloth_rs::memory::{estimate_forward_memory, CheckpointConfig};

fn main() {
    let checkpoint = CheckpointConfig {
        enabled: true,
        checkpoint_every: 2,
    };

    let mem_bytes = estimate_forward_memory(
        4,    // batch_size
        2048, // seq_len
        4096, // hidden_size
        32,   // num_layers
        &checkpoint,
    );
    println!("Estimated memory: {} GB", mem_bytes as f64 / 1e9);
}
```
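To see how much checkpointing saves, the same helper can be called with checkpointing disabled and the two estimates compared; the exact numbers depend on the crate's internal formula. A sketch using only the API shown above:
```rust
use unsloth_rs::memory::{estimate_forward_memory, CheckpointConfig};

fn main() {
    // Same model shape as above: batch 4, seq_len 2048, hidden 4096, 32 layers.
    let no_ckpt = CheckpointConfig { enabled: false, checkpoint_every: 1 };
    let ckpt = CheckpointConfig { enabled: true, checkpoint_every: 2 };

    let baseline = estimate_forward_memory(4, 2048, 4096, 32, &no_ckpt);
    let reduced = estimate_forward_memory(4, 2048, 4096, 32, &ckpt);
    println!(
        "without checkpointing: {:.1} GB, with: {:.1} GB",
        baseline as f64 / 1e9,
        reduced as f64 / 1e9
    );
}
```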
Run benchmarks with:
```sh
cargo bench
```
Benchmarks test CPU performance across various configurations. GPU benchmarks require the `cuda` feature.
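Assuming the benches are gated on the same Cargo feature as the library, the GPU path can be benchmarked with `cargo bench --features cuda`.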
For detailed development plans and task breakdowns, see the repository.
Contributions are welcome. See TASKS.md for specific tasks that need implementation.
Licensed under the MIT License. See LICENSE for details.