| Crates.io | ohms-adaptq |
| version | 2.0.3 |
| created_at | 2025-08-07 19:53:37.305706+00 |
| updated_at | 2025-08-19 20:13:42.752529+00 |
| description | NOVAQ: Normalized Outlier-Vector Additive Quantization - Democratic 93-100x LLM compression with 99%+ accuracy retention. No restrictions, no gatekeeping. |
| homepage | https://ohms-project.org |
| repository | https://github.com/OHMS-DeAI/ohms-adaptq |
| max_upload_size | |
| id | 1785735 |
| size | 496,597 |
Normalized Outlier-Vector Additive Quantization: 93-100x LLM compression with 99%+ accuracy retention. No restrictions, no gatekeeping, pure democratic access.
NOVAQ is completely open and accessible to everyone. No admin controls, no restrictions, no gatekeeping. Anyone can compress any AI model with NOVAQ technology.
NOVAQ (Normalized Outlier-Vector Additive Quantization) is a three-stage compression pipeline: distribution normalization, multi-stage vector codebooks, and teacher-guided refinement.
# Clone the repository
git clone https://github.com/OHMS-DeAI/ohms-adaptq.git
cd ohms-adaptq
# Build the democratic NOVAQ CLI
cargo build --release
# Install globally (optional)
cargo install --path .
# Compress any Hugging Face model
novaq hf meta-llama/Llama-3-8B --output llama3-8b-novaq.bin
# Specify custom compression settings
novaq hf microsoft/Phi-3-mini-4k-instruct \
--bits 1.5 \
--subspaces 4 \
--output phi3-mini-novaq.bin
# Compress any Ollama model
novaq ollama llama3:8b --output llama3-8b-novaq.bin
# Compress with custom settings
novaq ollama mistral:7b \
--bits 1.5 \
--subspaces 4 \
--output mistral-7b-novaq.bin
# Compress model from direct URL
novaq url https://example.com/model.safetensors --output model-novaq.bin
# Compress local model file
novaq local /path/to/model.safetensors --output local-model-novaq.bin
# Validate NOVAQ compressed model
novaq validate llama3-8b-novaq.bin
# Show compression statistics
novaq stats llama3-8b-novaq.bin
# Hugging Face token (for private models)
export HF_TOKEN="your_token_here"
export HUGGINGFACE_HUB_TOKEN="your_token_here"
# Enable accelerated downloads
export HF_HUB_ENABLE_HF_TRANSFER=1
--bits: Target bits per weight (default: 1.5)
--subspaces: Number of vector subspaces (default: 4)
--output: Output file path (default: novaq_compressed.bin)
Supported input formats:
.safetensors - Most common for modern models
.bin, .pt, .pth - Traditional PyTorch format
.gguf - Ollama and llama.cpp format
.onnx - Open Neural Network Exchange format
# Download and compress in one command
novaq hf meta-llama/Llama-3-8B \
--bits 1.5 \
--subspaces 4 \
--output llama3-8b-novaq.bin
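As a rough sanity check on the --bits figure: in two-level product quantization, per-weight index cost is the total index bits divided by the subvector length. A minimal sketch, where the codebook sizes and subvector length are illustrative assumptions rather than NOVAQ's actual internals:

```python
import math

# Illustrative assumptions (not NOVAQ's actual internals):
first_entries = 256    # first-level codebook size -> 8-bit index per subvector
second_entries = 256   # second-level (residual) codebook size -> 8-bit index
subvector_len = 11     # weights grouped into each quantized subvector

index_bits = math.log2(first_entries) + math.log2(second_entries)
bits_per_weight = index_bits / subvector_len
print(f"{bits_per_weight:.2f} bits/weight")  # ~1.45, near the 1.5 default
```

Codebook storage is amortized across the whole matrix, so for large layers the index cost dominates.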
novaq hf microsoft/Phi-3-mini-4k-instruct \
--bits 1.5 \
--subspaces 4 \
--output phi3-mini-novaq.bin
Input Model (FP32)
    ↓
Distribution Normalization
    ↓
Multi-stage Vector Codebooks
    ↓
Teacher-guided Refinement
    ↓
NOVAQ Compressed Model
For a weight matrix W ∈ ℝ^{m×d}:
Normalization:
Ŵ_{i,:} = (W_{i,:} − μ_i) / s_i
Two-level PQ:
b^{(1)}_{i,k} = argmin_c ||v_{i,k} − C^{(1)}_{c,k}||²
r_{i,k} = v_{i,k} − C^{(1)}_{b^{(1)}_{i,k},k}
b^{(2)}_{i,k} = argmin_c ||r_{i,k} − C^{(2)}_{c,k}||²
Inference reconstruction:
W̃_{i,:} = s_i · Σ_k (C^{(1)}_{b^{(1)}_{i,k},k} + C^{(2)}_{b^{(2)}_{i,k},k}) + μ_i
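The three stages can be sketched end-to-end in a few lines of NumPy. This is a minimal illustration only: the codebooks here are random (untrained), the dimensions and codebook sizes are assumptions, and the real pipeline trains its codebooks and then applies teacher-guided refinement, none of which is shown.

```python
import numpy as np

rng = np.random.default_rng(0)
m, d, K = 8, 16, 4                 # rows, columns, subspaces (d divisible by K)
W = rng.normal(size=(m, d))        # stand-in for an FP32 weight matrix

# Stage 1: per-row distribution normalization  W_hat[i] = (W[i] - mu_i) / s_i
mu = W.mean(axis=1, keepdims=True)
s = W.std(axis=1, keepdims=True)
W_hat = (W - mu) / s

# Split each normalized row into K subvectors v_{i,k}
v = W_hat.reshape(m, K, d // K)

# Stage 2: two-level codebooks per subspace (random here; trained in practice)
C1 = rng.normal(size=(K, 64, d // K))          # first-level codewords
C2 = 0.1 * rng.normal(size=(K, 64, d // K))    # residual codewords

recon = np.empty_like(v)
for i in range(m):
    for k in range(K):
        b1 = np.linalg.norm(v[i, k] - C1[k], axis=1).argmin()   # argmin_c ||v - C1||^2
        r = v[i, k] - C1[k, b1]                                 # residual r_{i,k}
        b2 = np.linalg.norm(r - C2[k], axis=1).argmin()         # argmin_c ||r - C2||^2
        recon[i, k] = C1[k, b1] + C2[k, b2]                     # sum of codewords

# Inference reconstruction: undo the normalization
W_tilde = s * recon.reshape(m, d) + mu
err = np.linalg.norm(W - W_tilde) / np.linalg.norm(W)
print(f"relative reconstruction error: {err:.3f}")
```

With trained codebooks and the teacher-guided refinement stage, the reconstruction error would be far smaller than this random-codebook baseline.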
Open Source: Complete source code available
No Restrictions: Use on any model, any platform
No Approval: No admin review or approval process
No Licensing: MIT license - use freely
NOVAQ builds on established research in model compression, including vector/product quantization and teacher-guided (distillation-style) refinement.
NOVAQ is democratic and open to contributions from everyone.
MIT License - Use NOVAQ freely for any purpose.
# Install NOVAQ
cargo install --git https://github.com/OHMS-DeAI/ohms-adaptq.git
# Compress your first model
novaq hf microsoft/Phi-3-mini-4k-instruct --output my-first-novaq.bin
# Validate the result
novaq validate my-first-novaq.bin
Welcome to democratic AI compression! No restrictions, no gatekeeping - just pure technological advancement for everyone.