| Crates.io | zeropool |
| lib.rs | zeropool |
| version | 0.3.1 |
| created_at | 2025-10-18 19:09:14.293463+00 |
| updated_at | 2025-11-23 16:24:10.971963+00 |
| description | High-performance buffer pool with constant-time allocation, thread-safe operations, and 5x speedup over bytes crate |
| homepage | |
| repository | https://github.com/botirk38/zeropool |
| max_upload_size | |
| id | 1889544 |
| size | 180,474 |
A high-performance buffer pool for Rust - Performance First
ZeroPool is a high-performance buffer pool that prioritizes speed above all else. Unlike traditional buffer pools that trade speed for security features such as automatic zeroing, ZeroPool skips that overhead and hands buffers back as fast as possible.
Perfect for high-throughput applications where raw speed is the primary requirement.
```rust
use std::fs::File;
use std::io::Read;
use zeropool::BufferPool;

fn main() -> std::io::Result<()> {
    let pool = BufferPool::new();
    // Get a buffer (high-performance, not zeroed by default)
    let mut buffer = pool.get(1024 * 1024); // 1MB
    // Use it for I/O or data processing ("data.bin" is a placeholder path)
    let mut file = File::open("data.bin")?;
    file.read(&mut buffer)?;
    // Zero manually if needed for security
    buffer.fill(0);
    // Buffer automatically returned to pool when dropped
    Ok(())
}
```
get() and drop(): buffers automatically return to the pool when dropped.

```
Thread 1       Thread 2       Thread N
┌─────────┐    ┌─────────┐    ┌─────────┐
│ TLS (4) │    │ TLS (4) │    │ TLS (4) │   ← Lock-free (60-110 ns)
└────┬────┘    └────┬────┘    └────┬────┘
     └──────────────┬──────────────┘
                    ▼
           ┌────────────────┐
           │  Sharded Pool  │   ← Thread affinity
           │  [0][1]...[N]  │   ← Minimal contention
           └────────────────┘
```

- Fast path: thread-local cache (lock-free, ~60-110 ns)
- Slow path: thread-affinity shard selection (better cache locality)
- Optimization: power-of-2 shard counts enable a bitwise AND instead of a modulo
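To make the fast path concrete, here is a minimal, illustrative sketch of per-thread buffer caching. This is not ZeroPool's actual implementation, just the general pattern: each thread checks its own small free list before falling back to a shared allocation, so the common case needs no locking.

```rust
use std::cell::RefCell;

thread_local! {
    // Per-thread stack of returned buffers; no locking is needed because
    // each thread only ever touches its own cache.
    static TLS_CACHE: RefCell<Vec<Vec<u8>>> = RefCell::new(Vec::new());
}

// Fast path: reuse a cached buffer if one is large enough, otherwise fall
// back to a fresh allocation (a stand-in here for the sharded pool).
fn get_buffer(size: usize) -> Vec<u8> {
    TLS_CACHE.with(|cache| {
        let mut cache = cache.borrow_mut();
        if let Some(pos) = cache.iter().position(|b| b.capacity() >= size) {
            return cache.swap_remove(pos);
        }
        Vec::with_capacity(size)
    })
}

// Return path: keep only a few buffers per thread, mirroring `tls_cache_size`.
fn return_buffer(buf: Vec<u8>) {
    TLS_CACHE.with(|cache| {
        let mut cache = cache.borrow_mut();
        if cache.len() < 4 {
            cache.push(buf);
        }
        // Otherwise drop; a real pool would hand the buffer back to a shard instead.
    });
}
```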
| Pattern | Metric | Result |
|---|---|---|
| Ping-pong (LIFO) | Time per operation | 3.56 µs |
| Ping-pong (ClockPro) | Time per operation | 3.68 µs |
| Hot/cold buffers | Time per operation | 1.05 µs |
| Multi-size workload | Time per operation | 6.2 µs |
| TLS cache (2 bufs) | Allocation latency | 60.5 ns |
| TLS cache (4 bufs) | Allocation latency | 108 ns |
| TLS cache (8 bufs) | Allocation latency | 288 ns |
| Eviction pressure | Time per operation | 400 ns |
| Threads | Time per 1000 ops | Notes |
|---|---|---|
| 1 | 44.7 µs | Single-threaded baseline |
| 4 | 141 µs | Good scaling with TLS cache |
| 8 | 282 µs | Near-linear scaling |
| 16 | 605 µs | Still scales well at high concurrency |
Run the benchmarks yourself:

```
cargo bench
```
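For a quick sanity check outside the bench harness, a rough timing loop like the one below works. It only assumes the `BufferPool::get` API shown in the quick start, and the numbers it prints will vary with hardware and will not match the tables above exactly.

```rust
use std::time::Instant;
use zeropool::BufferPool;

fn main() {
    let pool = BufferPool::new();
    let iterations = 100_000u32;

    let start = Instant::now();
    for _ in 0..iterations {
        // Get a 1 MB buffer and drop it immediately so it returns to the pool.
        let _buf = pool.get(1024 * 1024);
    }
    let elapsed = start.elapsed();

    println!(
        "avg get + return: {:.0} ns",
        elapsed.as_nanos() as f64 / f64::from(iterations)
    );
}
```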
```rust
use zeropool::BufferPool;

let pool = BufferPool::builder()
    .tls_cache_size(8)           // Buffers per thread
    .min_buffer_size(512 * 1024) // Keep buffers ≥ 512KB
    .max_buffers_per_shard(32)   // Max pooled buffers
    .num_shards(16)              // Override auto-detection
    .build();
```
Defaults are auto-configured based on CPU count; see the auto-configuration table below for the values ZeroPool picks.
Lock buffer memory in RAM to prevent swapping (performance optimization):
```rust
use zeropool::BufferPool;

let pool = BufferPool::builder()
    .pinned_memory(true)
    .build();
```
Useful for high-performance computing or real-time systems. May require elevated privileges on some systems. Falls back gracefully if pinning fails.
Choose between simple LIFO or intelligent CLOCK-Pro buffer eviction:
```rust
use zeropool::{BufferPool, EvictionPolicy};

let pool = BufferPool::builder()
    .eviction_policy(EvictionPolicy::ClockPro) // Better cache locality (default)
    .build();

let pool_lifo = BufferPool::builder()
    .eviction_policy(EvictionPolicy::Lifo)     // Simple, lowest overhead
    .build();
```
CLOCK-Pro (default): Uses access counters to favor recently-used buffers, preventing cache thrashing in mixed-size workloads. ~8 bytes overhead per buffer.
LIFO: Simple last-in-first-out eviction. Minimal memory overhead, best for uniform buffer sizes.
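To illustrate the general idea behind the access-tracking policy, here is a generic CLOCK-style sketch. It is not ZeroPool's data structures (the crate tracks small per-buffer access counters rather than a single bit), but it shows why recently used buffers survive eviction pressure: each entry gets a "second chance" before the sweeping hand removes it.

```rust
// Minimal CLOCK-style eviction sketch: each slot carries a reference flag
// that is set on access and cleared as the hand sweeps past. A slot is only
// evicted once its flag has been cleared, so recently used buffers outlive
// cold ones.
struct ClockSlot {
    buffer: Vec<u8>,
    referenced: bool,
}

struct ClockShard {
    slots: Vec<ClockSlot>,
    hand: usize,
}

impl ClockShard {
    fn touch(&mut self, index: usize) {
        self.slots[index].referenced = true; // mark as recently used
    }

    fn evict(&mut self) -> Option<Vec<u8>> {
        if self.slots.is_empty() {
            return None;
        }
        loop {
            let i = self.hand % self.slots.len();
            if self.slots[i].referenced {
                // Second chance: clear the flag and keep sweeping.
                self.slots[i].referenced = false;
                self.hand = self.hand.wrapping_add(1);
            } else {
                self.hand = i; // next sweep resumes here
                return Some(self.slots.swap_remove(i).buffer);
            }
        }
    }
}
```

LIFO, by contrast, simply pops the most recently returned buffer, which is why it carries no per-buffer metadata.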
- Thread-local caching (lock-free)
- Thread-local shard affinity: `shard = hash(thread_id) & (num_shards - 1)`, a bitwise AND instead of a modulo (see the sketch below)
- First-fit allocation
- Performance-first memory management
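As a sketch of that shard mapping (illustrative only, not the crate's internals): hashing the thread ID and masking with `num_shards - 1` gives the same result as a modulo whenever the shard count is a power of two, without the division.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::thread;

// Illustrative shard selection: assumes `num_shards` is a power of two,
// so the bitwise AND below is equivalent to `% num_shards` but avoids
// a division instruction on the hot path.
fn shard_for_current_thread(num_shards: usize) -> usize {
    debug_assert!(num_shards.is_power_of_two());
    let mut hasher = DefaultHasher::new();
    thread::current().id().hash(&mut hasher);
    (hasher.finish() as usize) & (num_shards - 1)
}
```

The auto-configured shard counts in the table below are all powers of two for exactly this reason.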
BufferPool is Clone and thread-safe:
```rust
use zeropool::BufferPool;

let pool = BufferPool::new();
let mut handles = Vec::new();

for _ in 0..4 {
    let pool = pool.clone();
    handles.push(std::thread::spawn(move || {
        // Each thread gets its own TLS cache
        let _buf = pool.get(1024);
        // Buffer automatically returned when dropped
    }));
}

for handle in handles {
    handle.join().unwrap();
}
```
Before ZeroPool, loading GPT-2 checkpoints took 200 ms, with roughly 70% of that time spent on buffer allocation. With ZeroPool the same load takes 53 ms (3.8x faster), with no zeroing or other security overhead on the hot path.
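As a sketch of that kind of workload (the function name, path argument, and 1 MB chunk size are illustrative, using only the `get` API from the quick start), reading a large checkpoint through pooled buffers looks roughly like this:

```rust
use std::fs::File;
use std::io::Read;
use zeropool::BufferPool;

// Hypothetical helper: stream a checkpoint file into memory, reusing pooled
// buffers for each read instead of allocating a fresh buffer per chunk.
fn load_checkpoint(pool: &BufferPool, path: &str) -> std::io::Result<Vec<u8>> {
    let mut file = File::open(path)?;
    let mut output = Vec::new();

    loop {
        // Pooled 1 MB scratch buffer; returns to the pool when dropped.
        let mut chunk = pool.get(1024 * 1024);
        let n = file.read(&mut chunk)?;
        if n == 0 {
            break;
        }
        output.extend_from_slice(&chunk[..n]);
    }
    Ok(output)
}
```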
ZeroPool automatically adapts to your system:
| System | Cores | TLS Cache | Shards | Buffers/Shard | Total Capacity |
|---|---|---|---|---|---|
| Embedded | 4 | 4 | 4 | 16 | 64 (~64MB) |
| Laptop | 8 | 6 | 8 | 16 | 128 (~128MB) |
| Workstation | 16 | 6 | 8 | 32 | 256 (~256MB) |
| Small Server | 32 | 8 | 16 | 64 | 1024 (~1GB) |
| Large Server | 64 | 8 | 32 | 64 | 2048 (~2GB) |
| Supercompute | 128 | 8 | 64 | 64 | 4096 (~4GB) |
| Feature | ZeroPool | bytes::BytesMut | Lifeguard | Sharded-Slab |
|---|---|---|---|---|
| Memory zeroing | ❌ No (performance-first) | ❌ No | ❌ No | ❌ No |
| Safe Rust | ✅ 100% | ⚠️ Some unsafe | ⚠️ Some unsafe | ⚠️ Heavy unsafe |
| Thread-safe | ✅ Yes | ❌ No | ⚠️ Limited | ✅ Yes |
| Lock-free path | ✅ TLS cache | ❌ No | ❌ No | ⚠️ Partial |
| Auto-configured | ✅ CPU-aware | ❌ Manual | ❌ Manual | ❌ Manual |
| Performance focus | ✅ Primary | ❌ No | ❌ No | ❌ No |
ZeroPool aims to be the fastest buffer pool available, designed purely for maximum performance while maintaining safety.
Dual licensed under Apache-2.0 or MIT.
PRs welcome! Please include benchmarks for performance changes and ensure all tests pass:
```
cargo test
cargo bench
cargo fmt
cargo clippy
```
See CHANGELOG.md for version history.
Built with ❤️ for the Rust community. Inspired by the need for high-performance buffer management in production systems.