| Crates.io | precise_rate_limiter |
| lib.rs | precise_rate_limiter |
| version | 0.3.0 |
| created_at | 2025-11-08 11:33:12.56375+00 |
| updated_at | 2025-11-28 04:34:30.334247+00 |
| description | A high-performance, precise rate limiter using tokio channels and atomic operations |
| homepage | |
| repository | https://github.com/wushilin/precise_rate_limiter.rs |
| max_upload_size | |
| id | 1922785 |
| size | 105,638 |
A high-performance, precise rate limiter for Rust using tokio channels and atomic operations.
This crate provides four implementations:
Async implementations (for tokio async contexts):
- FastQuota: A high-performance variant with 10x better throughput in low-contention scenarios
- Quota: A fair, FIFO rate limiter with guaranteed ordering

Synchronous implementations (for non-async contexts):
- FastQuotaSync: Synchronous version with best-effort FIFO ordering and fast-path optimization
- QuotaSync: Synchronous version with perfect FIFO ordering

Both Quota::new() and FastQuota::new() (as well as their sync variants) return Arc<Self>, so you can easily share them among many threads. You can clone it, send it, no issues.
Typical use cases:
- A shared global quota, e.g. a VPN server that must cap total bandwidth. You can consume millions of bytes or bits in one call and refill millions of bits or bytes in one call.
- An API quota shared among many threads on a single server.
- Any other situation with a high refill rate and a high acquisition rate, including weighted acquisitions where the amount depends on actual token consumption (e.g. bytes sent).
Performance: Because Quota is always fair and FIFO, it uses an mpsc channel internally, with a buffer of 500. When you call quota.acquire, a wait request is sent to the queue and the caller waits to be woken. If the queue is full, the caller is blocked until there is room. Otherwise the request sits in the queue waiting to be processed and the caller enters the Notify.await state.
The thread that determines when the head of the queue can proceed wakes each caller in order.
Based on benchmarks, the await operation (assuming it is not blocked on a quota refill) can handle roughly 400,000 operations per second. At one token per acquisition, your quota granularity therefore tops out at about 400,000 acquisitions per second.
If this is not fast enough, you can acquire tokens in batches and buffer them locally for immediate reuse. For example, acquire 5000 tokens at once, consume them one by one locally until the counter reaches zero, then call quota.acquire(5000) again. In this case your throughput can reach 5000 * 400,000 = 2 billion ops per second. The global quota of ops per second is still honored.
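The batching idea above can be sketched with the standard library alone. This is an illustration under assumptions, not part of the crate: `LocalBatch`, `take_one` and `count_shared_calls` are hypothetical names, and the `refill` closure stands in for a call such as `quota.acquire(batch)`.

```rust
/// Hands out tokens one at a time from a locally cached batch.
/// `refill` is a hypothetical stand-in for a call such as `quota.acquire(batch)`.
struct LocalBatch<F: FnMut(usize)> {
    batch: usize,     // tokens fetched from the shared quota per call
    remaining: usize, // tokens still cached locally
    refill: F,
}

impl<F: FnMut(usize)> LocalBatch<F> {
    fn new(batch: usize, refill: F) -> Self {
        assert!(batch > 0, "batch size must be positive");
        Self { batch, remaining: 0, refill }
    }

    /// Consume one token, touching the shared quota only when the cache is empty.
    fn take_one(&mut self) {
        if self.remaining == 0 {
            (self.refill)(self.batch);
            self.remaining = self.batch;
        }
        self.remaining -= 1;
    }
}

/// Run `ops` single-token operations and count how often the shared quota is hit.
fn count_shared_calls(ops: usize, batch: usize) -> usize {
    let mut calls = 0;
    {
        let mut local = LocalBatch::new(batch, |_n| calls += 1);
        for _ in 0..ops {
            local.take_one();
        }
    }
    calls
}

fn main() {
    // 12,000 operations with batches of 5000 reach the shared quota 3 times.
    println!("shared quota calls: {}", count_shared_calls(12_000, 5000));
}
```

Tokens left in the local cache when a worker stops are simply lost, so a smaller batch wastes less quota at the cost of more shared calls.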
FastQuota is an optimized variant of Quota that provides significantly better performance in low-contention scenarios while maintaining reasonable fairness guarantees.
FastQuota uses a dual-path architecture:
- Fast Path: Direct token acquisition using atomic operations when there's no contention
- Slow Path: FIFO-queued processing when there is contention

Use FastQuota when contention is low and throughput matters more than strict ordering. Use Quota when strict FIFO ordering is required.
| Metric | Quota | FastQuota |
|---|---|---|
| Low contention throughput, mostly no waits | ~400K ops/sec | ~4M ops/sec |
| High contention throughput, everyone waits | ~400K ops/sec | ~400K ops/sec |
| Fairness | Perfect FIFO | Mostly FIFO (with edge cases) |
| Latency (low contention) | Higher (channel overhead) | Lower (direct atomic ops) |
FastQuota prioritizes performance over perfect fairness: ordering is mostly FIFO, with some edge cases.
For most practical use cases, FastQuota provides excellent performance with reasonable fairness guarantees.

    use precise_rate_limiter::FastQuota;
    use std::time::Duration;

    #[tokio::main]
    async fn main() {
        // Create a fast quota with the same parameters as Quota
        let quota = FastQuota::new(
            1000,                       // max_buffer: 1000 tokens
            100,                        // refill_amount: 100 tokens
            Duration::from_millis(100), // refill_interval: every 100ms
        );

        // Fast path: acquires immediately if tokens available
        quota.acquire(50).await;

        // Automatically falls back to slow path if needed
        quota.acquire(200).await;
    }

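The fast path takes tokens directly with atomic operations when there is no contention.
The slow path queues the request and processes it in FIFO order, as Quota does.
This design ensures that when there's no contention, operations are extremely fast, while still maintaining fairness when contention occurs.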
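A lock-free fast path of this kind is commonly written as a compare-and-swap loop over an atomic counter. The sketch below shows that general technique with `AtomicUsize::fetch_update`. It is an assumption about how such a path can look, not the crate's actual code, and `try_acquire_fast` is a hypothetical name.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Try to take `n` tokens without queueing.
/// Returns false when fewer than `n` tokens are available, in which case a
/// caller would fall back to the queued slow path. The counter is left
/// unchanged on failure.
fn try_acquire_fast(tokens: &AtomicUsize, n: usize) -> bool {
    tokens
        .fetch_update(Ordering::AcqRel, Ordering::Acquire, |available| {
            // None aborts the update when the bucket holds fewer than `n` tokens.
            available.checked_sub(n)
        })
        .is_ok()
}

fn main() {
    let tokens = AtomicUsize::new(100);
    println!("take 50:  {}", try_acquire_fast(&tokens, 50));
    println!("take 200: {}", try_acquire_fast(&tokens, 200));
    println!("left:     {}", tokens.load(Ordering::Acquire));
}
```

A real fast path would also have to check that no earlier request is already queued, otherwise late arrivals could overtake waiting callers. That check is omitted here.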
Token refills and deductions use an atomic usize, so the accounting is accurate and atomic. Refills are driven by a time ticker, and missed ticker events (unlikely) are made up in a burst.
As a result, the refill is very reliable.
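One way to picture how missed ticks are made up: count the whole refill intervals that have elapsed and credit them all at once, capped at the bucket size. The sketch below models that arithmetic only. The function names and the elapsed-time approach are assumptions for illustration, and the crate may implement catch-up differently.

```rust
use std::time::Duration;

/// Number of whole refill intervals contained in `elapsed`.
fn ticks_elapsed(elapsed: Duration, interval: Duration) -> u128 {
    assert!(!interval.is_zero(), "interval must be non-zero");
    elapsed.as_nanos() / interval.as_nanos()
}

/// Token count after crediting every tick in `elapsed`, never exceeding `max`.
fn catch_up(
    current: usize,
    max: usize,
    refill: usize,
    elapsed: Duration,
    interval: Duration,
) -> usize {
    let owed = (refill as u128).saturating_mul(ticks_elapsed(elapsed, interval));
    let room = max.saturating_sub(current) as u128;
    // `owed.min(room)` is at most `room`, which fits in usize.
    current + owed.min(room) as usize
}

fn main() {
    let interval = Duration::from_millis(100);
    // 350 ms contain three whole 100 ms ticks, so 3 * 100 tokens are credited.
    println!("{}", catch_up(0, 1000, 100, Duration::from_millis(350), interval));
    // A nearly full bucket is capped at its maximum.
    println!("{}", catch_up(900, 1000, 100, Duration::from_millis(350), interval));
}
```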
The bucket starts full: the initial number of tokens is always max_token. If you consume quota more slowly than it refills, you are never throttled and always operate at full speed.
If you consume quota faster than it refills, the bucket soon runs dry and you are limited to the quota refill rate.
The quota refill stops once the buffered tokens reach max_token.
If you try to acquire max_token + N tokens, where N is positive, your code will panic.
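The bucket rules above can be modelled in a few lines. This is a single-threaded, non-blocking sketch for reasoning about the semantics, not the crate's implementation. `Bucket`, `tick` and `try_acquire` are hypothetical names, and where the real acquire waits, the model returns false.

```rust
/// Single-threaded model of the bucket rules described above.
struct Bucket {
    max: usize,
    tokens: usize,
    refill: usize,
}

impl Bucket {
    fn new(max: usize, refill: usize) -> Self {
        // The bucket starts full.
        Self { max, tokens: max, refill }
    }

    /// One refill tick: add tokens, but never exceed `max`.
    fn tick(&mut self) {
        self.tokens = self.tokens.saturating_add(self.refill).min(self.max);
    }

    /// Returns true if `n` tokens can be granted right now.
    /// Panics when `n` exceeds the bucket size, since such a request
    /// could never be satisfied.
    fn try_acquire(&mut self, n: usize) -> bool {
        assert!(n <= self.max, "cannot acquire more than the bucket holds");
        if n <= self.tokens {
            self.tokens -= n;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut bucket = Bucket::new(10, 1);
    println!("drain full bucket: {}", bucket.try_acquire(10));
    println!("immediately again: {}", bucket.try_acquire(1));
    bucket.tick();
    println!("after one refill:  {}", bucket.try_acquire(1));
}
```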
Add this to your Cargo.toml:

    [dependencies]
    precise_rate_limiter = "0.3.0"
    tokio = { version = "1", features = ["full"] }

For non-async contexts (e.g., standard threads, libraries without async runtime), this crate provides synchronous versions that use blocking operations instead of async/await.
QuotaSync provides the same functionality as Quota but uses std::sync primitives and blocks the calling thread instead of using async/await.
Features:
- Uses std::sync primitives instead of tokio async
- Same functionality as Quota in non-async contexts

Example:

    use precise_rate_limiter::QuotaSync;
    use std::thread;
    use std::time::Duration;

    fn main() {
        let quota = QuotaSync::new(100, 10, Duration::from_millis(100));

        // Blocks until tokens are available
        quota.acquire(50);
        println!("Acquired 50 tokens!");

        // Can be shared across threads
        let quota_clone = quota.clone();
        thread::spawn(move || {
            quota_clone.acquire(30);
            println!("Thread acquired 30 tokens!");
        })
        .join()
        .unwrap();
    }

FastQuotaSync provides the same functionality as FastQuota but uses blocking operations.
Features:
- Best-effort FIFO ordering with the same fast-path optimization as FastQuota

Example:

    use precise_rate_limiter::FastQuotaSync;
    use std::time::Duration;

    fn main() {
        let quota = FastQuotaSync::new(1000, 100, Duration::from_millis(100));

        // Fast path: acquires immediately if tokens available
        quota.acquire(50);

        // Automatically falls back to slow path if needed
        quota.acquire(200);
    }

| Use Case | Recommended Implementation |
|---|---|
| Tokio async application | Quota or FastQuota |
| Standard threads | QuotaSync or FastQuotaSync |
| Library without async | QuotaSync or FastQuotaSync |
| Mixed async/sync code | Use both (they're independent) |
The synchronous implementations have similar performance characteristics to their async counterparts:
- QuotaSync processes ~400K requests/sec (similar to Quota)
- FastQuotaSync can achieve ~4M requests/sec on the fast path (similar to FastQuota)

Basic usage with Quota:

    use precise_rate_limiter::Quota;
    use std::time::Duration;

    #[tokio::main]
    async fn main() {
        // Create a rate limiter with:
        // - max_buffer: 10 tokens (capacity)
        // - refill_amount: 1 token per refill
        // - refill_interval: 1 second
        let quota = Quota::new(10, 1, Duration::from_secs(1));

        // Acquire tokens (waits until available)
        quota.acquire(5).await;
        println!("Acquired 5 tokens!");

        // Acquire more tokens
        quota.acquire(3).await;
        println!("Acquired 3 more tokens!");
    }

Basic usage with FastQuota:

    use precise_rate_limiter::FastQuota;
    use std::time::Duration;

    #[tokio::main]
    async fn main() {
        // FastQuota provides 10x better performance in low-contention scenarios
        let quota = FastQuota::new(10, 1, Duration::from_secs(1));

        // Fast path: acquires immediately if tokens available
        quota.acquire(5).await;

        // Automatically falls back to slow path if needed
        quota.acquire(3).await;
    }

Unlike most rate limiter implementations, precise_rate_limiter supports flexible acquire and refill amounts up to the maximum buffer size:
- Acquire amounts: you can request up to max_buffer tokens in a single call. The rate limiter will wait until enough tokens accumulate to fulfill the request.

      // Request up to max_buffer tokens (10 in this case)
      let quota = Quota::new(10, 1, Duration::from_secs(1));
      quota.acquire(10).await; // Acquires all available tokens

- Refill amounts: the refill amount can be large (up to max_buffer), allowing for burst capacity or faster recovery. Refill amounts exceeding the buffer are automatically clamped.

      // Refill up to 100 tokens per interval (clamped to max_buffer)
      let quota = Quota::new(100, 100, Duration::from_secs(1));

This flexibility is particularly useful for weighted acquisitions, where the amount depends on actual consumption (e.g. bytes sent).
Most existing implementations restrict acquire amounts to small fixed values, which can be limiting for real-world use cases.
The Quota rate limiter uses a completely fair FIFO (First-In-First-Out) queueing mechanism. All acquire requests are processed through a single channel and handled in strict order.
This guarantees that if multiple tasks request tokens simultaneously, they will be served in the order they called acquire(), preventing any task from being starved or receiving preferential treatment.
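Strict FIFO service means the request at the head of the queue blocks everything behind it, even a smaller request that could be satisfied immediately. The sketch below models that rule with a plain `VecDeque`. It is a simplified, synchronous illustration, not the crate's channel-based implementation, and `serve_fifo` is a hypothetical name.

```rust
use std::collections::VecDeque;

/// Serve queued `(id, amount)` requests strictly in arrival order.
/// Stops at the first request that cannot be satisfied, even if a later,
/// smaller one would fit. Returns the ids served in this pass.
fn serve_fifo(queue: &mut VecDeque<(u32, usize)>, tokens: &mut usize) -> Vec<u32> {
    let mut served = Vec::new();
    while let Some(&(id, amount)) = queue.front() {
        if amount > *tokens {
            break; // the head of the queue must wait, and nobody may overtake it
        }
        *tokens -= amount;
        served.push(id);
        queue.pop_front();
    }
    served
}

fn main() {
    let mut queue = VecDeque::from(vec![(1, 4), (2, 8), (3, 1)]);
    let mut tokens = 5;

    // Request 1 is served. Request 2 needs 8 tokens, so request 3 waits behind it.
    println!("first pass:  {:?}", serve_fifo(&mut queue, &mut tokens));

    tokens += 10; // a refill arrives
    println!("second pass: {:?}", serve_fifo(&mut queue, &mut tokens));
}
```

Request 3 needs only one token and one token is left after the first pass, yet it is not served until request 2 has gone through. That is the cost of ruling out starvation.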
Licensed under the Apache License, Version 2.0. See LICENSE for details.