genswap

version: 0.1.0
repository: https://github.com/temporalxyz/genswap
documentation: https://docs.rs/genswap

A high-performance Rust library providing generation-tracked caching around ArcSwap for read-heavy workloads. It achieves roughly 5x (single-threaded) to 1500x (multi-threaded) faster reads than ArcSwap::load_full() by eliminating atomic refcount operations on cache hits.

Overview

In read-heavy scenarios where shared data is updated infrequently, repeatedly calling ArcSwap::load_full() incurs unnecessary overhead from atomic refcount operations. This library solves that by:

  1. Tracking updates with a generation counter (cheap atomic u64)
  2. Caching Arc references locally in each reader
  3. Only reloading when the generation changes

The result: cache hits only perform a single atomic load and comparison, avoiding all refcount operations.

Performance

See BENCHMARK_RESULTS.md for detailed analysis.

Single-threaded performance:

  • 5.8x faster than ArcSwap::load_full() on cache hits
  • 0.72ns per read vs 4.2ns for load_full()

Multi-threaded performance (8 threads):

  • 1548x faster than ArcSwap::load_full()
  • 0.40ns per read vs 619ns for load_full()
  • Perfect scaling with thread count (no cache line contention)

Cache hit rate impact:

  • 90% hit rate → 1.8x faster
  • 99% hit rate → 3.6x faster
  • 99.9% hit rate → 5.2x faster

Installation

Add this to your Cargo.toml:

[dependencies]
genswap = "0.1.0"

Usage

Basic Example

use genswap::GenSwap;
use std::sync::Arc;
use std::thread;

fn main() {
    // Create shared config
    let config = Arc::new(GenSwap::new(42u64));

    // Spawn reader threads
    let handles: Vec<_> = (0..4).map(|_| {
        let config = Arc::clone(&config);
        thread::spawn(move || {
            // Each thread gets its own cached reader
            let mut reader = config.reader();

            for _ in 0..1_000_000 {
                let value = reader.get();  // Fast! Only checks generation
                let _ = value;             // Use value...
            }
        })
    }).collect();

    // Update from another thread (infrequent)
    config.update(100);

    for h in handles {
        h.join().unwrap();
    }
}

Real-World Example: Configuration Management

use genswap::GenSwap;
use std::sync::Arc;

#[derive(Clone)]
struct AppConfig {
    feature_flags: Vec<String>,
    rate_limit: u32,
    timeout_ms: u64,
}

fn main() {
    let config = Arc::new(GenSwap::new(AppConfig {
        feature_flags: vec!["feature_a".into()],
        rate_limit: 100,
        timeout_ms: 1000,
    }));

    // In your request handler (called millions of times):
    let mut reader = config.reader();

    loop {
        // This is incredibly fast - just an atomic load + comparison
        let cfg = reader.get();

        if cfg.rate_limit > 0 {
            // Process request with current config
        }

        // Config updates are automatically detected and loaded
    }
}

API Overview

GenSwap<T>

The producer-side type that holds the data and generation counter.

// Create
let swap = GenSwap::new(data);
let swap = GenSwap::new_from_arc(arc);

// Update
swap.update(new_data);              // Increments generation
swap.update_arc(Arc::new(new_data));

// Read-copy-update
swap.rcu(|current| modify(current)); // Atomic update based on current value

// Direct access (for one-off reads)
let arc = swap.load_full();
let guard = swap.load();

// Create cached readers
let mut reader = swap.reader();

// Query
let gen = swap.generation();

CachedReader<T>

The consumer-side handle that caches an Arc<T> locally.

let mut reader = swap.reader();

// Primary method - fast path on cache hit
let data: &Arc<T> = reader.get();

// Check if stale without reloading
if reader.is_stale() {
    println!("Data has been updated");
}

// Get owned Arc
let arc = reader.get_clone();

// Force reload even if generation matches
reader.force_refresh();

// Query
let gen = reader.cached_generation();
let cached_ref = reader.cached(); // Without staleness check

When to Use

Ideal for:

  • ✅ Shared configuration data (feature flags, rate limits, routing tables)
  • ✅ Read-heavy workloads (>95% reads vs writes)
  • ✅ Multi-threaded scenarios with frequent reads
  • ✅ When Arc refcount contention is a bottleneck
  • ✅ Data that changes infrequently but is read constantly

Not ideal for:

  • ❌ Write-heavy workloads (use ArcSwap directly)
  • ❌ Single-threaded applications with few reads
  • ❌ Data that changes on every access
  • ❌ When you need strong consistency guarantees (readers may lag by one update)

Memory Ordering

The implementation uses carefully chosen memory orderings for correctness:

// Producer: Release ordering ensures data is visible before generation
self.data.store(new_arc);
self.generation.fetch_add(1, Ordering::Release);

// Consumer: Acquire ordering pairs with Release
let current_gen = source.generation.load(Ordering::Acquire);
if current_gen != cached_generation {
    // Generation changed, reload data
}

This guarantees that when a reader observes a new generation, the corresponding data update is visible.

Thread Safety

  • GenSwap<T> is Send + Sync where T: Send + Sync
  • CachedReader<T> is Send but NOT Sync
  • Each thread should have its own CachedReader instance
  • Multiple readers can exist for the same GenSwap
  • Readers own an Arc<GenSwap<T>>, so no lifetime constraints

How It Works

Without GenSwap (ArcSwap::load_full)

Thread 1: load_full() → atomic_fetch_add(refcount) → use data → atomic_fetch_sub(refcount)
Thread 2: load_full() → atomic_fetch_add(refcount) → use data → atomic_fetch_sub(refcount)
...

Problem: Every read requires two atomic operations (increment + decrement) on a shared refcount, causing cache line contention.

With GenSwap

Setup (once per thread):
  reader = swap.reader() → stores Arc<T> + generation

Hot path (millions of times):
  reader.get() → load(generation) → compare → return cached Arc
                 ↑
                 Only this atomic load, no refcount operations!

Update (rare):
  swap.update(new_data) → store(data) → fetch_add(generation)

Solution: Cached readers only check a generation counter (single atomic load). The cached Arc's refcount doesn't change, avoiding contention.

Benchmarks

Run benchmarks with:

cargo bench

See BENCHMARK_RESULTS.md for detailed analysis.

Comparison with Alternatives

| Approach             | Single-thread | Multi-thread (8 cores) | Notes                       |
|----------------------|---------------|------------------------|-----------------------------|
| Arc::clone()         | 3.88 ns       | High contention        | Direct clone, high overhead |
| ArcSwap::load()      | 3.03 ns       | Moderate contention    | Returns Guard, no refcount  |
| ArcSwap::load_full() | 4.20 ns       | 619 ns                 | Returns Arc, refcount ops   |
| CachedReader::get()  | 0.72 ns       | 0.40 ns                | Best performance            |

Examples

See the examples/ directory:

# Basic usage example
cargo run --example basic_usage

Testing

# Run all tests
cargo test

# Run with output
cargo test -- --nocapture

# Run specific test
cargo test test_cached_reader_cache_hit

Limitations

  1. Generation counter wrapping: The u64 generation counter will wrap after 2^64 updates. In practice this is not a concern: at 1M updates/sec, wrapping would take roughly 580,000 years.

  2. Memory overhead: Each CachedReader stores an Arc<T> and a u64, adding ~16-24 bytes per reader.

  3. Eventual consistency: Readers may observe stale data for a brief period between updates. They're guaranteed to see the update on their next get() call.

  4. Not Clone: CachedReader is intentionally not Clone to prevent cache aliasing. Each reader should be independent.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

Licensed under either of the licenses provided in the repository, at your option.

Acknowledgments

Built on top of the excellent arc-swap crate by Michal 'vorner' Vaner.

Inspired by the need for high-performance shared configuration in read-heavy systems.

See Also

  • arc-swap - Atomic Arc swap operations
  • left-right - Alternative read-optimized concurrency primitive
  • evmap - Eventually consistent, lock-free map