| Crates.io | gguf-llms |
| lib.rs | gguf-llms |
| version | 0.0.2 |
| created_at | 2025-06-01 12:52:12.278359+00 |
| updated_at | 2025-06-01 12:54:28.813143+00 |
| description | A Rust library for parsing GGUF (GGML Universal Format) files |
| homepage | |
| repository | https://github.com/yourusername/gguf-rs |
| max_upload_size | |
| id | 1697140 |
| size | 61,116 |
A comprehensive Rust library for parsing, analyzing, and working with GGUF files. The crate provides a type-safe interface for extracting model configuration, metadata, and tensor data from GGUF files, the format used to distribute large language models (LLMs).
Current implementation supports:
- F32 (FP32)
- F16 (FP16)
- I8, I16, I32, I64
- F64
- Quantized tensor types are recognized but currently not loaded (contributions welcome)
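These tensor types correspond to the `ggml_type` IDs defined by the GGUF specification. The sketch below is illustrative only: the enum name and the `element_size` helper are hypothetical, not this crate's API.

```rust
// Illustrative mapping of GGUF tensor-type IDs (per the ggml_type enum in
// the GGUF spec) to the types listed above. Names here are hypothetical.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum GgmlType {
    F32 = 0,
    F16 = 1,
    Q4_0 = 2, // quantized: recognized, but not loaded
    Q8_0 = 8, // quantized: recognized, but not loaded
    I8 = 24,
    I16 = 25,
    I32 = 26,
    I64 = 27,
    F64 = 28,
}

impl GgmlType {
    /// Bytes per element for the non-quantized types; quantized types use
    /// block encodings, so a flat per-element size does not apply.
    fn element_size(self) -> Option<usize> {
        match self {
            GgmlType::I8 => Some(1),
            GgmlType::F16 | GgmlType::I16 => Some(2),
            GgmlType::F32 | GgmlType::I32 => Some(4),
            GgmlType::I64 | GgmlType::F64 => Some(8),
            GgmlType::Q4_0 | GgmlType::Q8_0 => None,
        }
    }
}

fn main() {
    assert_eq!(GgmlType::F16.element_size(), Some(2));
    assert_eq!(GgmlType::Q4_0.element_size(), None);
}
```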
Add to your Cargo.toml:
```toml
[dependencies]
gguf-llms = "0.0.2"
```
Basic usage example:
```rust
use gguf_llms::*;
use std::fs::File;

fn main() -> Result<(), GgufError> {
    let mut file = File::open("model.gguf")?;

    // Parse file header
    let header = GgufHeader::parse(&mut file)?;

    // Read metadata
    let metadata = GgufReader::read_metadata(&mut file, header.n_kv)?;

    // Extract model configuration
    let config = extract_model_config(&metadata)?;

    // Read tensor information
    let tensor_infos = TensorLoader::read_tensor_info(&mut file, header.n_tensors)?;

    // Get tensor data start position
    let tensor_data_start = TensorLoader::get_tensor_data_start(&mut file)?;

    // Load all tensors
    let tensors = TensorLoader::load_all_tensors(&mut file, &tensor_infos, tensor_data_start)?;

    // Build structured model
    let model = ModelBuilder::new(tensors, config).build()?;
    println!("Loaded {} layers", model.num_layers());

    Ok(())
}
```
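For reference, the fixed-size portion that `GgufHeader::parse` reads is defined by the GGUF spec: a 4-byte magic (`GGUF`), a `u32` version, then `u64` tensor and metadata-KV counts, all little-endian. The sketch below is a standalone illustration (it does not use this crate; struct and field names simply mirror the usage example above):

```rust
use std::io::{self, Read};

// Minimal sketch of GGUF header parsing, per the GGUF spec:
// magic "GGUF", u32 version, u64 tensor count, u64 metadata-KV count.
struct GgufHeader {
    version: u32,
    n_tensors: u64,
    n_kv: u64,
}

fn read_u32<R: Read>(r: &mut R) -> io::Result<u32> {
    let mut b = [0u8; 4];
    r.read_exact(&mut b)?;
    Ok(u32::from_le_bytes(b))
}

fn read_u64<R: Read>(r: &mut R) -> io::Result<u64> {
    let mut b = [0u8; 8];
    r.read_exact(&mut b)?;
    Ok(u64::from_le_bytes(b))
}

fn parse_header<R: Read>(r: &mut R) -> io::Result<GgufHeader> {
    let mut magic = [0u8; 4];
    r.read_exact(&mut magic)?;
    if &magic != b"GGUF" {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "not a GGUF file"));
    }
    Ok(GgufHeader {
        version: read_u32(r)?,
        n_tensors: read_u64(r)?,
        n_kv: read_u64(r)?,
    })
}

fn main() -> io::Result<()> {
    // Synthetic header bytes: magic, version 3, 2 tensors, 5 KV pairs.
    let mut bytes: Vec<u8> = Vec::new();
    bytes.extend_from_slice(b"GGUF");
    bytes.extend_from_slice(&3u32.to_le_bytes());
    bytes.extend_from_slice(&2u64.to_le_bytes());
    bytes.extend_from_slice(&5u64.to_le_bytes());

    let header = parse_header(&mut bytes.as_slice())?;
    assert_eq!(header.version, 3);
    assert_eq!(header.n_tensors, 2);
    assert_eq!(header.n_kv, 5);
    Ok(())
}
```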
```text
gguf-llms/
├── src/
│   ├── config.rs    // Model configuration extraction
│   ├── metadata.rs  // GGUF format parsing and types
│   ├── model.rs     // Model layer organization
│   ├── tensors.rs   // Tensor loading functionality
│   └── lib.rs       // Public API
└── README.md
```
### Metadata Parsing

### Model Configuration
```rust
pub fn extract_model_config(metadata: &HashMap<String, Value>) -> Result<ModelConfig> {
    let architecture = /* ... */;
    let arch_prefix = &architecture;
    let block_count = get_u32_field(metadata, &format!("{}.block_count", arch_prefix))?;
    let context_length = get_u32_field(metadata, &format!("{}.context_length", arch_prefix))?;
    // ...
}
```
This extracts model parameters such as the layer count, embedding dimensions, and number of attention heads.
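A helper like `get_u32_field` amounts to a checked lookup into the metadata map. The sketch below is self-contained and hypothetical: the `Value` and `GgufError` variants are stand-ins for the crate's actual types.

```rust
use std::collections::HashMap;

// Stand-ins for the crate's metadata value and error types (illustrative).
#[derive(Debug, Clone)]
enum Value {
    U32(u32),
    String(String),
}

#[derive(Debug)]
enum GgufError {
    MissingField(String),
    WrongType(String),
}

/// Look up `key` in the metadata map and require a u32 value.
fn get_u32_field(metadata: &HashMap<String, Value>, key: &str) -> Result<u32, GgufError> {
    match metadata.get(key) {
        Some(Value::U32(v)) => Ok(*v),
        Some(_) => Err(GgufError::WrongType(key.to_string())),
        None => Err(GgufError::MissingField(key.to_string())),
    }
}

fn main() {
    let mut metadata = HashMap::new();
    metadata.insert("llama.block_count".to_string(), Value::U32(32));
    metadata.insert("general.name".to_string(), Value::String("demo".into()));

    assert_eq!(get_u32_field(&metadata, "llama.block_count").unwrap(), 32);
    assert!(get_u32_field(&metadata, "llama.context_length").is_err()); // missing
    assert!(get_u32_field(&metadata, "general.name").is_err());        // wrong type
}
```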
```rust
pub fn load_tensor<R: Read + Seek>(
    reader: &mut R,
    tensor_info: &TensorInfo,
    tensor_data_start: u64,
) -> Result<Tensor> {
    // ...
    let mut data = vec![0u8; byte_size as usize];
    reader.read_exact(&mut data)?;
    // ...
}
```
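The elided parts of `load_tensor` come down to simple offset arithmetic: the byte size is the product of the tensor's dimensions times the per-element size, and the absolute file position is the data-section start plus the tensor's relative offset. A self-contained sketch, with illustrative field names (the crate's `TensorInfo` may differ):

```rust
// Hypothetical shape of the tensor metadata needed for loading.
struct TensorInfo {
    dims: Vec<u64>,    // tensor dimensions
    element_size: u64, // bytes per element (e.g. 4 for F32, 2 for F16)
    offset: u64,       // offset relative to the start of the tensor data section
}

/// Total bytes to read: product of dimensions times bytes per element.
fn byte_size(info: &TensorInfo) -> u64 {
    info.dims.iter().copied().product::<u64>() * info.element_size
}

/// Absolute file position of the tensor's first byte.
fn absolute_offset(info: &TensorInfo, tensor_data_start: u64) -> u64 {
    tensor_data_start + info.offset
}

fn main() {
    // A 4096x4096 F16 tensor placed 1024 bytes into the data section.
    let info = TensorInfo { dims: vec![4096, 4096], element_size: 2, offset: 1024 };
    assert_eq!(byte_size(&info), 4096 * 4096 * 2);
    assert_eq!(absolute_offset(&info, 8192), 8192 + 1024);
}
```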
```rust
fn build_transformer_block(&mut self, layer_idx: usize) -> Result<TransformerBlock> {
    let prefix = format!("blk.{}", layer_idx);
    let attention = AttentionLayer {
        query_weights: self.take_tensor(&format!("{}.attn_q.weight", prefix))?,
        // ...
    };
    // ...
}
```
This organizes tensors into structured layers such as `TransformerBlock` and `AttentionLayer`.
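One plausible way to implement `take_tensor` is to remove entries from a name-to-tensor map, so each tensor is moved out of the map and owned by exactly one layer. The sketch below is a hypothetical reconstruction, not the crate's actual code:

```rust
use std::collections::HashMap;

struct Tensor; // placeholder for the crate's tensor type

// Hypothetical builder: tensors are keyed by GGUF's "blk.<i>.<name>" names.
struct ModelBuilder {
    tensors: HashMap<String, Tensor>,
}

impl ModelBuilder {
    /// Remove the named tensor from the map, transferring ownership to the
    /// caller; a second take for the same name fails.
    fn take_tensor(&mut self, name: &str) -> Result<Tensor, String> {
        self.tensors
            .remove(name)
            .ok_or_else(|| format!("missing tensor: {}", name))
    }
}

fn main() {
    let mut builder = ModelBuilder {
        tensors: HashMap::from([("blk.0.attn_q.weight".to_string(), Tensor)]),
    };
    assert!(builder.take_tensor("blk.0.attn_q.weight").is_ok());
    // The tensor has been moved into its layer, so taking it again fails.
    assert!(builder.take_tensor("blk.0.attn_q.weight").is_err());
}
```

Using `remove` rather than `get` makes double-assignment of a tensor a hard error and leaves any unclaimed tensors behind for diagnostics.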
The crate has been tested with these architectures:
Note: This crate is in active development. More model architectures will be added.
Contributions are welcome! Please open issues or pull requests for:
MIT License - see LICENSE for details.