sliding-ring

Crates.io: sliding-ring
lib.rs: sliding-ring
version: 0.1.1
created_at: 2025-10-26 17:43:12.035886+00
updated_at: 2025-10-27 15:49:59.47743+00
description: Cache-friendly sliding ring buffer keyed to an anchor coordinate for ultra-low-latency workloads.
repository: https://github.com/ChetanBhasin/sliding-ring
id: 1901659
size: 68,368
owner: Chetan Bhasin (ChetanBhasin)

documentation: https://docs.rs/sliding-ring

README

Sliding Ring

sliding-ring provides a cache-friendly, allocation-free ring buffer that keeps its slots indexed relative to a moving anchor index. It was extracted from our high-performance orderbook engine and generalized so any Copy slot type can participate in consistent sliding-window semantics. A const-generic DEPTH defines how many offsets past the anchor stay live at any time; newly entered slots are proactively zeroed on each shift so the hot window never carries stale data. Store any domain-specific coordinate (price, timestamp, etc.) outside the ring and combine it with the offsets it exposes.

Installation

cargo add sliding-ring

The crate embeds this README in its rustdoc, so cargo doc --open -p sliding-ring renders the full API docs, including these examples, locally.

Why this exists

Traditional ring buffers optimize for FIFO semantics, but exchange and telemetry workloads often care about the best coordinate (highest bid, current time bucket, etc.) and need random access around it. SlidingRing keeps a 256-slot ring that re-anchors in O(1) time, clearing only the newly entered slots while preserving the rest. This keeps performance predictable, avoids per-slot allocations, and keeps the CPU cache warm even under heavy churn.
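
As a rough illustration of that re-anchoring pattern, the sketch below tracks the best coordinate outside the ring and slides the window in one call. It uses only the shift_by, slot_mut, and borrow_anchor calls shown in the examples further down; the tick bookkeeping is illustrative rather than part of the crate.

use sliding_ring::SlidingRing;

const DEPTH: u8 = 32;
let mut depth_window = SlidingRing::<u64, DEPTH>::new(0);
let mut best_tick: i128 = 50_000; // absolute best coordinate, tracked outside the ring

*depth_window.slot_mut(0) = 120;  // quantity at the current best
*depth_window.slot_mut(3) = 40;   // quantity three ticks past the best

// The best coordinate advances three ticks: re-anchor in O(1), keep the
// external coordinate in sync, and read random offsets around the new best.
depth_window.shift_by(3);
best_tick += 3;
assert_eq!(*depth_window.borrow_anchor(), 40);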

When to reach for it

  • Aggregated orderbooks keyed by discrete price ticks.
  • Sliding time buckets for counters, rate-limiters, or observability windows.
  • Any moving reference window where only a bounded number of offsets from the anchor matter.

When it is the wrong tool

  • You need more than 256 simultaneous offsets (use a wider structure or a map).
  • The anchor jumps sparsely across a huge domain, making the clearing work wasteful.
  • Slots own large heap allocations that would be expensive to recreate on every clear (store handles or IDs instead).

Model

  • The ring stores an anchor index (0-255). Track any absolute coordinate outside the structure.
  • A const DEPTH describes the contiguous window of offsets [0, DEPTH) that stays live; everything else is reset to the caller-supplied “zero” value.
  • Shifting by k steps rewires indices modulo 256, then clears the |k| offsets that newly enter the window (or resets the whole ring when |k| >= DEPTH).
  • shift_by_keep_anchor skips clearing the newly entered anchor slot on backward moves, which is useful when promoting the next best level without losing its quantity (see the sketch after this list).
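
A minimal sketch of these shift semantics, assuming shift_by_keep_anchor takes the same signed step argument as shift_by (this README does not show its exact signature):

use sliding_ring::SlidingRing;

const DEPTH: u8 = 8;
let mut ring = SlidingRing::<u32, DEPTH>::new(0);

*ring.slot_mut(0) = 7;           // the anchor slot holds a quantity
ring.shift_by(1);                // anchor advances; the old anchor slot leaves the window
ring.shift_by(-1);               // anchor moves back; the re-entering anchor slot is cleared
assert_eq!(*ring.borrow_anchor(), 0);

*ring.slot_mut(0) = 7;
ring.shift_by(1);
ring.shift_by_keep_anchor(-1);   // same backward move, but the anchor slot keeps its value
assert_eq!(*ring.borrow_anchor(), 7);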

Performance guardrails

  • No dynamic dispatch: SlidingRing<T, DEPTH> stores [T; 256], so every slot type is monomorphized at compile time and the compiler inlines the small helper methods. T must be Copy and DEPTH controls the number of contiguous offsets that stay hot beyond the anchor.
  • Anchor slices for SIMD: slices_from_anchor* return the two contiguous spans covering exactly DEPTH slots (tail/head), so you can run vectorized scans without modulo math. This is the primary primitive for ultra-low-latency users.
  • Stable SIMD workflow: Pair slices_from_anchor* with the wide crate (see the simd_scan example below) to stay on stable Rust while still using vector instructions.
  • Raw-pointer iterators: iter() / iter_mut() walk storage order, while iter_from_anchor* emit (offset, slot) relative to the current anchor; the contrast is sketched after this list. Both rely on pointer arithmetic, so they stay zero-cost abstractions.
  • Inline hot paths: ring methods stay tiny and are annotated with #[inline(always)] so LLVM can flatten the helpers.
  • Memory over CPU: we willingly dedicate 256 slots per ring to keep operations branch-predictable. If memory becomes tighter than latency, this crate is not the right fit.
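
To make the two iteration orders concrete, here is a small sketch, assuming iter_mut() yields mutable slot references in storage order and iter_from_anchor pairs each slot with its anchor-relative offset, as described above:

use sliding_ring::{Direction, SlidingRing};

#[derive(Default, Clone, Copy)]
struct Bucket { hits: u64 }

const DEPTH: u8 = 16;
let mut ring = SlidingRing::<Bucket, DEPTH>::new(Bucket::default());
ring.slot_mut(0).hits = 4;
ring.slot_mut(3).hits = 6;

// Storage-order pass: bulk maintenance (here, decaying every slot) without
// caring where the anchor currently sits.
for bucket in ring.iter_mut() {
    bucket.hits /= 2;
}

// Anchor-relative pass: offsets count from the current anchor, mirroring the
// telemetry example below.
let total: u64 = ring
    .iter_from_anchor(Direction::Forward)
    .map(|(_, bucket)| bucket.hits)
    .sum();
assert_eq!(total, 5); // 4/2 + 6/2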

Ultra-low-latency workflow

  1. Call slices_from_anchor*() to obtain the tail/head spans.
  2. Run SIMD loads/stores or hand-unrolled loops directly over those spans (tail first, then head) to scan for non-zero slots, update counters, etc.
  3. If you also need absolute coordinates, pair the SIMD scan with iter_from_anchor*(): they share the same traversal order and hand you offsets you can add to your externally tracked anchor (see the sketch below).
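
A sketch of step 3, resolving occupied slots back to absolute coordinates. The anchor_price variable and the Level struct are illustrative; only the slot_mut and iter_from_anchor calls shown elsewhere in this README are assumed.

use sliding_ring::{Direction, SlidingRing};

#[derive(Default, Clone, Copy)]
struct Level { qty: u64 }

const DEPTH: u8 = 64;
let mut ring = SlidingRing::<Level, DEPTH>::new(Level::default());
let anchor_price: i128 = 10_000; // absolute coordinate tracked outside the ring

ring.slot_mut(7).qty = 250; // quantity resting seven steps past the anchor

// Steps 1-2 would scan slices_from_anchor* with SIMD; here we turn the hit
// back into an absolute coordinate by adding its offset to the external anchor.
let occupied: Vec<i128> = ring
    .iter_from_anchor(Direction::Forward)
    .filter(|(_, level)| level.qty != 0)
    .map(|(offset, _)| anchor_price + offset as i128)
    .collect();
assert_eq!(occupied, vec![10_007]);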

Slot requirements

The ring stores [T; 256], so T must be Copy. Construction takes the value that represents a cleared slot:

use sliding_ring::SlidingRing;

const DEPTH: u8 = 48;
let mut ring = SlidingRing::<u64, DEPTH>::new(0);

That “zero” gets copied into every slot initially and whenever a slot needs to be cleared because the anchor moved. It can be anything cheap to copy: numeric zeroes, None, or lightweight structs. Only the DEPTH slots starting at the anchor remain significant; everything else is cleared automatically.
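
For instance, here is a small sketch using Option<u32> as the slot type with None as the cleared value; it uses only the constructor, slot_mut, shift_by, and borrow_anchor calls already shown in this README.

use sliding_ring::SlidingRing;

const DEPTH: u8 = 16;
// None is the "zero": every slot starts as None and resets to it when cleared.
let mut ring = SlidingRing::<Option<u32>, DEPTH>::new(None);

*ring.slot_mut(2) = Some(99);
ring.shift_by(2);                // old offset 2 becomes the new anchor
assert_eq!(*ring.borrow_anchor(), Some(99));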

Example

use sliding_ring::SlidingRing;

#[derive(Default, Clone, Copy)]
struct Bucket {
    hits: u64,
}

const DEPTH: u8 = 64;
let mut ring = SlidingRing::<Bucket, DEPTH>::new(Bucket::default());
let mut anchor = 10_000i128;     // absolute coordinate tracked outside the ring
let offset = 5u8;
ring.slot_mut(offset).hits += 1; // bump the bucket five steps past the anchor
ring.shift_by(offset as i128);   // slide the window so that bucket becomes the anchor
anchor += offset as i128;        // keep the external coordinate in sync
assert_eq!(ring.borrow_anchor().hits, 1);

Telemetry counter

use sliding_ring::{Direction, SlidingRing};

#[derive(Default, Clone, Copy)]
struct Bucket { hits: u64 }

const DEPTH: u8 = 32;
let mut ring = SlidingRing::<Bucket, DEPTH>::new(Bucket::default());
ring.slot_mut(5).hits += 1;
ring.slot_mut(6).hits += 3;
let sum: u64 = ring
    .iter_from_anchor(Direction::Forward)
    .map(|(_, bucket)| bucket.hits)
    .sum();
assert_eq!(sum, 4);

SIMD scanning

use sliding_ring::SlidingRing;
use wide::u64x4;

const DEPTH: u8 = 64;
let mut ring = SlidingRing::<u64, DEPTH>::new(0);
for i in 0..DEPTH {
    *ring.slot_mut(i) = i as u64;
}
let (tail, head) = ring.slices_from_anchor();
let needle = u64x4::splat(42);

for chunk in tail.chunks_exact(4).chain(head.chunks_exact(4)) {
    let vec = u64x4::new([chunk[0], chunk[1], chunk[2], chunk[3]]);
    // wide's cmp_eq returns an all-ones lane (u64::MAX) wherever the values match.
    if vec
        .cmp_eq(needle)
        .to_array()
        .iter()
        .any(|&lane| lane == u64::MAX)
    {
        println!("found 42");
        break;
    }
}

Run cargo run --example telemetry or cargo run --example simd_scan for the full samples. The SIMD example relies on the wide crate, so it stays on stable Rust while still demonstrating contiguous-slice scanning.

Documentation

  • API reference on docs.rs.
  • Run cargo doc --open -p sliding-ring for local browsing.

Status

This crate is extracted from production systems but is still evolving. Expect additions such as configurable ring widths, iterators, and more slot utilities as we work toward a polished open-source release.

Examples

  • cargo run --example telemetry – basic per-slot counter with anchor-ordered reduction.
  • cargo run --example simd_scan – demonstrates splitting the ring into contiguous spans and scanning with SIMD on stable via the wide crate.

Benchmarks

  • cargo bench -p sliding-ring runs the Criterion suite measuring anchor shifts and slot updates.
  • Bazel users can run bazel run //lib/collections/sliding-ring:sliding_ring_bench (tagged benchmark) to launch the same binary.

License

Licensed under either of

  • Apache License, Version 2.0
  • MIT License

at your option.

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in sliding-ring by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
