kodegen_simd

Crates.io	kodegen_simd
lib.rs	kodegen_simd
version	0.10.9
created_at	2025-10-29 00:14:37.511565+00
updated_at	2026-01-02 15:00:16.305182+00
description	KODEGEN.ᴀɪ: SIMD optimization library
homepage	https://kodegen.ai
repository	https://github.com/cyrup-ai/kodegen-simd
id	1905809
size	4,385,783

David Maple (kloudsamurai)

kodegen_simd

Ultra-high-performance SIMD-accelerated operations for AI/ML workloads in Rust. Part of the KODEGEN.ᴀɪ ecosystem.

Features

🚀 Automatic SIMD Optimization: Runtime CPU feature detection with zero-overhead dispatch
- x86_64: AVX-512, AVX2, SSE4.1 support
- ARM64: NEON support
- Automatic fallback to optimized scalar implementations
🎯 Vector Similarity Operations: High-performance cosine similarity with intelligent implementation selection
🔥 Logits Processing: Complete pipeline for LLM inference
- Temperature scaling with SIMD acceleration
- Top-k and nucleus (top-p) sampling
- Repetition, frequency, and presence penalties
- Numerically stable softmax and argmax operations
📐 Structured Generation: Type-safe constrained output
- JSON syntax validation
- JSON schema-based constraints
- Generate from Rust types with #[derive(JsonSchema)]
- Predefined constraint presets
⚡ Zero-Allocation Hot Paths: Stack-based buffers for inference-critical operations
🔧 Hardware Acceleration: Optional backends for neural network operations
- CUDA / cuDNN support
- Metal (Apple Silicon) support
- Intel MKL and Apple Accelerate framework support
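
The README names the repetition, frequency, and presence penalties without defining them. The sketch below shows one common convention, assumed here rather than taken from the crate: the repetition penalty divides positive logits and multiplies negative ones, while the frequency and presence penalties are subtracted per occurrence and once per seen token, respectively. The function `apply_penalties` and its parameters are hypothetical and are not part of the kodegen_simd API.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of kodegen_simd): applies the three
/// penalties to `logits` in place, based on the token ids in `history`.
fn apply_penalties(
    logits: &mut [f32],
    history: &[usize],
    repetition: f32, // > 1.0 discourages repeats; 1.0 disables it
    frequency: f32,  // subtracted once per prior occurrence
    presence: f32,   // subtracted once if the token occurred at all
) {
    // Count how often each token id has been generated so far.
    let mut counts: HashMap<usize, u32> = HashMap::new();
    for &token in history {
        *counts.entry(token).or_insert(0) += 1;
    }

    for (&token, &count) in &counts {
        // Ignore ids that fall outside the vocabulary.
        let Some(logit) = logits.get_mut(token) else { continue };

        // Repetition penalty: shrink positive logits, push negative ones lower.
        if *logit > 0.0 {
            *logit /= repetition;
        } else {
            *logit *= repetition;
        }

        // Frequency penalty grows with the count; presence penalty is flat.
        *logit -= frequency * count as f32 + presence;
    }
}
```

Tokens that never appear in `history` are left untouched, so the cost of the pass depends on the number of distinct generated tokens rather than on the vocabulary size.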

Installation

Add to your Cargo.toml:

[dependencies]
kodegen_simd = "0.10"

Optional Hardware Acceleration

For CUDA support:

[dependencies]
kodegen_simd = { version = "0.10", features = ["cuda"] }

For Metal (Apple Silicon):

[dependencies]
kodegen_simd = { version = "0.10", features = ["metal"] }

For Intel MKL:

[dependencies]
kodegen_simd = { version = "0.10", features = ["mkl"] }

For Apple Accelerate:

[dependencies]
kodegen_simd = { version = "0.10", features = ["accelerate"] }

Quick Start

Vector Similarity

use kodegen_simd::similarity::cosine_similarity;

let embedding_a = vec![1.0, 2.0, 3.0, 4.0];
let embedding_b = vec![4.0, 3.0, 2.0, 1.0];

// Automatically uses best available SIMD implementation
let similarity = cosine_similarity(&embedding_a, &embedding_b);
println!("Cosine similarity: {}", similarity);
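
For reference, cosine similarity is the dot product of the two vectors divided by the product of their Euclidean norms. The scalar sketch below shows the quantity being computed; `cosine_similarity_scalar` is a hypothetical name used for illustration, and the handling of zero vectors is a convention chosen for this sketch, not documented behavior of the crate.

```rust
/// Scalar reference for cosine similarity: dot(a, b) / (|a| * |b|).
/// Hypothetical helper for illustration; not the crate's implementation.
fn cosine_similarity_scalar(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same length");

    let mut dot = 0.0_f32;
    let mut norm_a = 0.0_f32;
    let mut norm_b = 0.0_f32;
    for (&x, &y) in a.iter().zip(b) {
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }

    let denom = (norm_a * norm_b).sqrt();
    // Convention chosen for this sketch: a zero vector yields 0.0, not NaN.
    if denom == 0.0 { 0.0 } else { dot / denom }
}
```

For the two embeddings in the example above, the dot product is 20 and both squared norms are 30, so the formula gives 20 / 30 ≈ 0.667.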

Logits Processing

use kodegen_simd::{scale_temperature, softmax, argmax};

let mut logits = vec![2.0, 1.0, 0.1, 3.0];

// Apply temperature scaling
scale_temperature(&mut logits, 0.7)?;

// Compute softmax probabilities
let probabilities = softmax(&logits)?;

// Find most likely token
let best_token = argmax(&probabilities)?;
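
The three steps above can be sketched in plain scalar Rust to show what each one computes. The `*_scalar` functions are hypothetical stand-ins written for this illustration; they do not reproduce the crate's signatures or error handling. The softmax subtracts the maximum logit before exponentiating, which is the usual way to keep it numerically stable.

```rust
/// Divide every logit by `temperature` (hypothetical stand-in, not the crate's code).
fn scale_temperature_scalar(logits: &mut [f32], temperature: f32) {
    assert!(temperature > 0.0, "temperature must be positive");
    for logit in logits.iter_mut() {
        *logit /= temperature;
    }
}

/// Numerically stable softmax: subtract the maximum before exponentiating
/// so that `exp` never overflows.
fn softmax_scalar(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// Index of the largest value, or `None` for an empty slice.
fn argmax_scalar(values: &[f32]) -> Option<usize> {
    values
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i)
}
```

A temperature below 1.0 sharpens the distribution and a temperature above 1.0 flattens it; the index returned by the argmax step does not change, because dividing by a positive constant preserves the ordering of the logits.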

Processing Context for Generation

use kodegen_simd::context::ProcessingContext;
use kodegen_simd::logits::process_logits_scalar;
use kodegen_simd::config::ProcessorConfig;

let mut context = ProcessingContext::new()
    .with_temperature(0.8)
    .with_top_k(Some(50))
    .with_top_p(Some(0.95));

let mut logits = vec![2.0, 1.0, 0.1, 3.0, 1.5];
let config = ProcessorConfig::default();

// Apply temperature, top-k, and top-p filtering
process_logits_scalar(&mut logits, &context, &config)?;

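// Track token history. `best_token` is the token selected after sampling,
// for example the `argmax` result from the previous snippet.
context.extend_history(&[best_token]);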
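
The top-k and top-p filtering applied in the snippet above can be sketched as two masking passes over the logits. This is a minimal scalar illustration under assumed conventions (filtered tokens are set to negative infinity); `top_k_filter` and `top_p_filter` are hypothetical names, and the crate's own implementation may differ.

```rust
/// Keep only the `k` largest logits; mask the rest with negative infinity.
/// Ties at the threshold are all kept, so more than `k` may survive.
fn top_k_filter(logits: &mut [f32], k: usize) {
    if k == 0 || k >= logits.len() {
        return; // nothing to filter
    }
    let mut sorted: Vec<f32> = logits.to_vec();
    sorted.sort_by(|a, b| b.total_cmp(a)); // descending order
    let threshold = sorted[k - 1];
    for logit in logits.iter_mut() {
        if *logit < threshold {
            *logit = f32::NEG_INFINITY;
        }
    }
}

/// Nucleus filter: keep the smallest set of tokens whose cumulative
/// probability reaches `p`; mask the rest with negative infinity.
fn top_p_filter(logits: &mut [f32], p: f32) {
    // Stable softmax numerators over the current logits.
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();

    // Token indices ordered by descending probability.
    let mut order: Vec<usize> = (0..logits.len()).collect();
    order.sort_by(|&a, &b| exps[b].total_cmp(&exps[a]));

    let mut cumulative = 0.0_f32;
    let mut keep = vec![false; logits.len()];
    for &i in &order {
        keep[i] = true;
        cumulative += exps[i] / sum;
        if cumulative >= p {
            break;
        }
    }
    for (logit, kept) in logits.iter_mut().zip(keep) {
        if !kept {
            *logit = f32::NEG_INFINITY;
        }
    }
}
```

Masking with negative infinity means a later softmax assigns the filtered tokens a probability of exactly zero, so the two filters can be chained in either order before sampling.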

Structured Generation with Type Safety

use kodegen_simd::context::ProcessingContext;
use kodegen_simd::serde_constraints::constraint_for_type;
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, JsonSchema)]
struct User {
    name: String,
    age: u32,
    email: Option<String>,
    is_active: bool,
}

// Create a constraint from the Rust type (`tokenizer` is assumed to be loaded elsewhere)
let constraint = constraint_for_type::<User>(&tokenizer)?;

// Use in ProcessingContext
let mut context = ProcessingContext::new()
    .with_schema_constraint(constraint);

// During generation, check if tokens are valid
for token in candidate_tokens {
    if context.is_token_valid_schema(token)? {
        // Token produces valid JSON matching User schema
        context.update_schema_constraint_state(token)?;
    }
}

JSON Schema Constraints

use kodegen_simd::serde_constraints::constraint_for_schema;

let schema = r#"{
    "type": "object",
    "properties": {
        "name": { "type": "string" },
        "age": { "type": "integer", "minimum": 0 },
        "tags": { 
            "type": "array",
            "items": { "type": "string" }
        }
    },
    "required": ["name", "age"]
}"#;

let constraint = constraint_for_schema(schema, &tokenizer)?;

Predefined Constraint Presets

use kodegen_simd::serde_constraints::presets;

// Array of strings
let constraint = presets::array_of_strings(&tokenizer)?;

// Array of integers
let constraint = presets::array_of_integers(&tokenizer)?;

// Generic object with string keys
let constraint = presets::object_with_string_keys(&tokenizer)?;

Performance

The library uses runtime CPU feature detection to automatically select the fastest implementation:

- Cosine Similarity: Up to 8x faster than naive implementations on AVX2
- Temperature Scaling: SIMD vectorization across the entire logits tensor
- Softmax: Numerically stable with minimal allocation overhead
- Constraint Validation: Efficient state machine with token lookahead

Run benchmarks:

cargo bench --features bench

Architecture

SIMD Abstraction

Operations are implemented multiple times for different CPU features:

- AVX-512 (512-bit registers, 16 × 32-bit lanes)
- AVX2 (256-bit registers, 8 × 32-bit lanes)
- SSE4.1 (128-bit registers, 4 × 32-bit lanes)
- NEON (128-bit registers, 4 × 32-bit lanes, ARM64)
- Scalar fallback

The best implementation is selected once at startup and cached.
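
The select-once-and-cache pattern described above can be sketched with the standard library's feature-detection macros and `OnceLock`. The `SimdLevel` enum and both function names are hypothetical, introduced only for this illustration; the crate's actual dispatch mechanism is not shown in this README and may differ.

```rust
use std::sync::OnceLock;

/// Instruction sets from the list above, plus the scalar fallback.
/// Hypothetical type for this sketch; not part of the crate's API.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SimdLevel {
    Avx512,
    Avx2,
    Sse41,
    Neon,
    Scalar,
}

/// Probe the CPU at runtime, preferring the widest instruction set.
fn detect_simd_level() -> SimdLevel {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx512f") {
            return SimdLevel::Avx512;
        }
        if std::arch::is_x86_feature_detected!("avx2") {
            return SimdLevel::Avx2;
        }
        if std::arch::is_x86_feature_detected!("sse4.1") {
            return SimdLevel::Sse41;
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            return SimdLevel::Neon;
        }
    }
    SimdLevel::Scalar
}

/// Detect once, then serve the cached answer on every later call.
fn simd_level() -> SimdLevel {
    static LEVEL: OnceLock<SimdLevel> = OnceLock::new();
    *LEVEL.get_or_init(detect_simd_level)
}
```

After the first call, `simd_level()` costs one atomic load, so a hot path can branch on it (or store a function pointer chosen from it) without repeating the CPU probe.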

Constraint System

The constrained generation system uses state machines to track valid token transitions:

- JSON Constraints: Validates JSON syntax during generation
- Schema Constraints: Enforces JSON schema structure (types, required fields, ranges)
- Type Constraints: Generated from Rust serde types with JsonSchema derive

Constraints can force deterministic token sequences when only one valid path exists, improving generation efficiency.
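
To make the state-machine idea concrete, here is a deliberately tiny sketch: a constraint that only accepts the JSON literals `true` or `false`, with single characters standing in for tokens. `LiteralConstraint` and its methods are hypothetical and far simpler than a real JSON or schema constraint; the sketch only illustrates validity checks, state updates, and the forced-token case where exactly one continuation is valid.

```rust
/// Toy constraint: the output must be exactly the JSON literal `true` or `false`.
/// Each "token" is one character here; real tokenizers use multi-character tokens.
struct LiteralConstraint {
    /// Characters accepted so far (this is the state of the machine).
    prefix: String,
}

const LITERALS: [&str; 2] = ["true", "false"];

impl LiteralConstraint {
    fn new() -> Self {
        Self { prefix: String::new() }
    }

    /// Every character that keeps the output a prefix of some allowed literal.
    fn valid_next(&self) -> Vec<char> {
        let mut next: Vec<char> = LITERALS
            .iter()
            .filter_map(|lit| lit.strip_prefix(self.prefix.as_str()))
            .filter_map(|rest| rest.chars().next())
            .collect();
        next.sort_unstable();
        next.dedup();
        next
    }

    /// If exactly one character is valid, it can be emitted without sampling.
    fn forced_next(&self) -> Option<char> {
        match self.valid_next().as_slice() {
            [only] => Some(*only),
            _ => None,
        }
    }

    /// Advance the state; returns false (and changes nothing) on an invalid token.
    fn advance(&mut self, token: char) -> bool {
        if self.valid_next().contains(&token) {
            self.prefix.push(token);
            true
        } else {
            false
        }
    }

    /// True once the accepted characters spell a complete literal.
    fn is_complete(&self) -> bool {
        LITERALS.iter().any(|lit| *lit == self.prefix.as_str())
    }
}
```

After the first character is chosen, every remaining character is forced: once `t` is accepted, `forced_next()` yields `r`, then `u`, then `e`, so no sampling is needed for the rest of the literal.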

Requirements

- Rust: Nightly toolchain (specified in rust-toolchain.toml)
- Edition: 2024

Documentation

- API Documentation
- CLAUDE.md - Architecture and development guide
- Examples - Additional usage examples

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

Licensed under either of:

- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)

at your option.

Acknowledgments

Part of the KODEGEN.ᴀɪ ecosystem for AI-powered database tools and MCP servers.

Commit count: 0

