| Crates.io | mappy-core |
| lib.rs | mappy-core |
| version | 0.3.2 |
| created_at | 2025-10-12 18:32:26.365843+00 |
| updated_at | 2025-11-15 17:13:09.771751+00 |
| description | Core maplet data structure implementation |
| homepage | |
| repository | https://github.com/entropy-tamer/mappy |
| max_upload_size | |
| id | 1879508 |
| size | 416,581 |
A Rust implementation of maplets - space-efficient data structures for approximate key-value mappings, based on the research paper "Time To Replace Your Filter: How Maplets Simplify System Design".
Maplets provide the same space benefits as filters while natively supporting key-value associations with one-sided error guarantees. Unlike traditional filters that only support set membership queries, maplets allow you to associate values with keys and retrieve them during queries.
O(log 1/ε + v) bits per item where ε is the false-positive rate and v is the value sizeM[k] ≺ m[k] for application-specific ordering relationsThe implementation follows a multi-crate workspace structure:
use mappy_core::{Maplet, CounterOperator};
// Create a maplet for counting with 1% false-positive rate
let mut maplet = Maplet::<String, u64, CounterOperator>::new(1000, 0.01);
// Insert key-value pairs
maplet.insert("key1".to_string(), 5).unwrap();
maplet.insert("key2".to_string(), 3).unwrap();
// Query values (may return approximate results)
let count1 = maplet.query(&"key1".to_string()); // Some(5) or Some(5 + other_values)
let count2 = maplet.query(&"key2".to_string()); // Some(3) or Some(3 + other_values)
// Check if key exists
let exists = maplet.contains(&"key1".to_string()); // true
// Get statistics
let stats = maplet.stats();
println!("Load factor: {:.2}%", stats.load_factor * 100.0);
println!("Memory usage: {} bytes", stats.memory_usage);
Enable the quotient-filter feature for enhanced slot finding capabilities with run detection, shifting support, and comprehensive benchmarking:
[dependencies]
mappy-core = { version = "0.3.1", features = ["quotient-filter"] }
use mappy_core::{Maplet, CounterOperator};
// Create a maplet with quotient filter support
let maplet = Maplet::<String, u64, CounterOperator>::new(1000, 0.01).unwrap();
// Insert some data
maplet.insert("test_key".to_string(), 42).await.unwrap();
// Find the actual slot where a key's fingerprint is stored
// This handles runs, shifting, and all complex quotient filter logic
let slot = maplet.find_slot_for_key(&"test_key".to_string()).await;
match slot {
Some(slot_index) => println!("Key found at slot {}", slot_index),
None => println!("Key not found"),
}
Quotient Filter Benefits:
The quotient filter implementation includes extensive testing and benchmarking infrastructure:
# Run all tests with quotient filter features
cargo test --features quotient-filter
# Run comprehensive test suite
./run_tests_and_benchmarks.sh
# Run specific benchmarks
cargo bench --bench basic_quotient_filter_benchmarks
# Run complete test suite with Python support
./test_quotient_filter_complete.sh
Full Python bindings are available for the quotient filter features:
import mappy_python
# Create Maplet with quotient filter features
maplet = mappy_python.PyMaplet(capacity=1000, false_positive_rate=0.01)
maplet.insert("key", 42)
slot = maplet.find_slot_for_key("key") # Returns slot index
# Create Engine with quotient filter features
config = mappy_python.PyEngineConfig(capacity=1000, false_positive_rate=0.01)
engine = mappy_python.PyEngine(config)
engine.set("key", b"value")
slot = engine.find_slot_for_key("key") # Returns slot index
Python Features:
find_slot_for_key() method for both Maplet and Engineuse mappy_core::{Maplet, CounterOperator};
let mut kmer_counter = Maplet::<String, u32, CounterOperator>::new(1_000_000, 0.001);
// Count k-mers in DNA sequences with high accuracy
use mappy_core::{Maplet, SetOperator};
let mut routing_table = Maplet::<String, HashSet<String>, SetOperator>::new(100_000, 0.01);
// Map network prefixes to sets of next-hop routers
use mappy_core::{Maplet, MaxOperator};
let mut sstable_index = Maplet::<String, u64, MaxOperator>::new(10_000_000, 0.001);
// Map keys to SSTable identifiers for efficient lookups
Our comprehensive benchmarks show Mappy significantly outperforms Redis for approximate key-value operations:
| Dataset Size | Operation | Redis Performance | Mappy Performance | Mappy Advantage |
|---|---|---|---|---|
| 100 items | SET/INSERT | 13.9-18.0 ms | 41.9-47.7 µs | ~300x faster |
| 1,000 items | SET/INSERT | 107-130 ms | 414-481 µs | ~250x faster |
| 10,000 items | SET/INSERT | 976-1,244 ms | 4.9-5.7 ms | ~200x faster |
Mappy's approximate nature has been proven not to hurt ML performance when using Huffman-compressed tags. Comprehensive benchmarks demonstrate production-ready performance for all ML tasks:
| Task | Accuracy | Speed Ratio | Status | Improvement |
|---|---|---|---|---|
| Similarity Search | 100.00% | 0.72x | ✅ Faster than exact! | 50.8x faster |
| Embeddings | 100.00% | 1.13x | ✅ Excellent! | 1894x faster |
| Classification | 92.50% | 1.08x | ✅ Excellent! | Stable |
| Clustering | 34.00%* | 2.42x | ✅ Acceptable | Stable |
*Clustering accuracy varies by test data, but exact and approximate match perfectly.
Key Findings:
Optimization Strategy: The critical optimization was moving mappy operations out of the hot path:
Storage Efficiency:
For detailed ML benchmark results and optimization techniques, see the Stilts ML Benchmarks documentation.
# Run Redis comparison benchmarks
cd mappy-core
cargo bench --bench redis_comparison
# Run ML benchmarks (requires stilts package)
cd ../stilts
cargo run --example ml_benchmark_demo --features mappy-integration
# Run all benchmarks
cd ..
./benchmark_runner.sh --all
# Run specific benchmark categories
./benchmark_runner.sh --redis
Maplets provide the strong maplet property:
m[k] = M[k] ⊕ (⊕ᵢ₌₁ˡ M[kᵢ])
Where Pr[ℓ ≥ L] ≤ ε^L, meaning even when wrong, the result is close to correct.
MIT License - see LICENSE file for details.
Based on the research paper:
Bender, M. A., Conway, A., Farach-Colton, M., Johnson, R., & Pandey, P. (2025). Time To Replace Your Filter: How Maplets Simplify System Design. arXiv preprint arXiv:2510.05518.