ranged-mmap

Crates.ioranged-mmap
lib.rsranged-mmap
version0.4.0
created_at2025-11-06 04:18:20.331504+00
updated_at2025-12-02 05:25:11.421516+00
descriptionType-safe memory-mapped file library with lock-free concurrent writes to non-overlapping ranges
homepagehttps://github.com/ShaoG-R/ranged-mmap
repositoryhttps://github.com/ShaoG-R/ranged-mmap
max_upload_size
id1919049
size185,140
Shao G. (shaogme)

documentation

https://docs.rs/ranged-mmap

README

ranged-mmap

Crates.io Documentation License

English | δΈ­ζ–‡

A type-safe, high-performance memory-mapped file library optimized for lock-free concurrent writes to non-overlapping ranges.

Features

  • πŸš€ Zero-Copy Writes: Data is written directly to mapped memory without system calls
  • πŸ”’ Lock-Free Concurrency: Multiple threads can write to different file regions simultaneously without locks
  • βœ… Type-Safe API: Prevents overlapping writes at compile-time through the type system
  • πŸ“¦ Reference Counting: Can be cloned and shared among multiple workers
  • ⚑ High Performance: Optimized for concurrent random writes (see benchmarks)
  • πŸ”§ Manual Flushing: Fine-grained control over when data is synchronized to disk
  • 🌐 Runtime Agnostic: Works with any async runtime (tokio, async-std) or without one
  • πŸ“ 4K Alignment: All allocations are automatically aligned to 4K boundaries for optimal I/O performance
  • πŸ”„ Dual Allocators: Sequential allocator for single-thread use, wait-free concurrent allocator for multi-thread scenarios

When to Use

Perfect for:

  • 🌐 Multi-threaded downloaders: Concurrent writes to different file chunks
  • πŸ“ Logging systems: Multiple threads writing to different log regions
  • πŸ’Ύ Database systems: Concurrent updates to different data blocks
  • πŸ“Š Large file processing: Parallel processing of different file sections

Not suitable for:

  • Files that need dynamic resizing (size must be known at creation)
  • Sequential or small file operations (overhead not justified)
  • Systems with limited virtual memory

Quick Start

Add to your Cargo.toml:

[dependencies]
ranged-mmap = "0.4"

Type-Safe Version (Recommended)

The MmapFile API provides compile-time safety guarantees through range allocation:

use ranged_mmap::{MmapFile, allocator::ALIGNMENT};
use std::num::NonZeroU64;

fn main() -> ranged_mmap::Result<()> {
    // Create a file (size in 4K units) and range allocator
    // All allocations are 4K aligned automatically
    let (file, mut allocator) = MmapFile::create_default(
        "output.bin",
        NonZeroU64::new(ALIGNMENT * 256).unwrap()  // 1MB (256 * 4K)
    )?;

    // Allocate non-overlapping ranges in the main thread (4K aligned)
    let range1 = allocator.allocate(NonZeroU64::new(ALIGNMENT * 128).unwrap()).unwrap(); // [0, 512KB)
    let range2 = allocator.allocate(NonZeroU64::new(ALIGNMENT * 128).unwrap()).unwrap(); // [512KB, 1MB)

    // Concurrent writes to different ranges (compile-time safe!)
    std::thread::scope(|s| {
        let f1 = file.clone();
        let f2 = file.clone();
        
        s.spawn(move || {
            let receipt = f1.write_range(range1, &vec![1u8; (ALIGNMENT * 128) as usize]);
            f1.flush_range(receipt);
        });
        
        s.spawn(move || {
            let receipt = f2.write_range(range2, &vec![2u8; (ALIGNMENT * 128) as usize]);
            f2.flush_range(receipt);
        });
    });

    // Final synchronous flush to ensure all data is written
    unsafe { file.sync_all()?; }
    Ok(())
}

Unsafe Version (Maximum Performance)

For scenarios where you can manually guarantee non-overlapping writes:

use ranged_mmap::MmapFileInner;
use std::num::NonZeroU64;

fn main() -> ranged_mmap::Result<()> {
    let file = MmapFileInner::create("output.bin", NonZeroU64::new(1024).unwrap())?;

    let file1 = file.clone();
    let file2 = file.clone();

    std::thread::scope(|s| {
        // ⚠️ Safety: You must ensure non-overlapping regions
        s.spawn(|| unsafe { 
            file1.write_at(0, &[1; 512]);
        });
        s.spawn(|| unsafe { 
            file2.write_at(512, &[2; 512]);
        });
    });

    unsafe { file.flush()?; }
    Ok(())
}

API Overview

Main Types

  • MmapFile: Type-safe memory-mapped file with compile-time safety
  • MmapFileInner: Unsafe high-performance version for manual safety management
  • RangeAllocator: Trait for range allocators
  • allocator::sequential::Allocator: Sequential allocator for single-thread use
  • allocator::concurrent::Allocator: Wait-free concurrent allocator for multi-thread scenarios
  • AllocatedRange: Represents a valid, non-overlapping file range
  • WriteReceipt: Proof that a range has been written (enables type-safe flushing)
  • SplitUpResult: Result of splitting with 4K upper alignment
  • SplitDownResult: Result of splitting with 4K lower alignment
  • ALIGNMENT: 4K alignment constant (4096 bytes)
  • align_up: Function to align values up to 4K boundary
  • align_down: Function to align values down to 4K boundary

Core Methods

MmapFile (Type-Safe)

use std::num::NonZeroU64;
use ranged_mmap::allocator::{sequential, concurrent, ALIGNMENT};

// Create file with default sequential allocator
let (file, mut allocator) = MmapFile::create_default(path, NonZeroU64::new(size).unwrap())?;

// Or specify allocator type explicitly
let (file, mut allocator) = MmapFile::create::<sequential::Allocator>(path, NonZeroU64::new(size).unwrap())?;

// Use concurrent allocator for multi-thread allocation
let (file, allocator) = MmapFile::create::<concurrent::Allocator>(path, NonZeroU64::new(size).unwrap())?;

// Allocate ranges (4K aligned, returns Option)
let range = allocator.allocate(NonZeroU64::new(ALIGNMENT).unwrap()).unwrap();

// Write to range (returns receipt directly)
let receipt = file.write_range(range, data);

// Flush using receipt
file.flush_range(receipt);

// Sync all data to disk
unsafe { file.sync_all()?; }

MmapFileInner (Unsafe)

use std::num::NonZeroU64;

// Create file
let file = MmapFileInner::create(path, NonZeroU64::new(size).unwrap())?;

// Write at offset (must ensure non-overlapping)
unsafe { file.write_at(offset, data); }

// Flush to disk
unsafe { file.flush()?; }

Safety Guarantees

Compile-Time Safety (MmapFile)

The type system ensures:

  • βœ… All ranges are allocated through RangeAllocator
  • βœ… Ranges are allocated sequentially/atomically, preventing overlaps
  • βœ… All allocations are 4K aligned for optimal I/O performance
  • βœ… Data length must match range length
  • βœ… Only written ranges can be flushed (via WriteReceipt)

Runtime Safety (MmapFileInner)

You must ensure:

  • ⚠️ Different threads write to non-overlapping memory regions
  • ⚠️ No reads occur to a region during writes
  • ⚠️ Proper synchronization if violating the above rules

Performance

This library is optimized for concurrent random write scenarios. Compared to standard tokio::fs::File, it offers:

  • Zero system calls for writes: Direct memory modification
  • No locks required: True parallel writes to different regions
  • Batch flushing: Control when data is synchronized to disk

See benches/concurrent_write.rs for detailed benchmarks.

Advanced Usage

With Concurrent Allocator (Multi-Thread Allocation)

use ranged_mmap::{MmapFile, allocator::{concurrent, ALIGNMENT}};
use std::num::NonZeroU64;
use std::sync::Arc;

fn main() -> ranged_mmap::Result<()> {
    // Use concurrent allocator for wait-free allocation from multiple threads
    let (file, allocator) = MmapFile::create::<concurrent::Allocator>(
        "output.bin",
        NonZeroU64::new(ALIGNMENT * 100).unwrap()
    )?;
    let allocator = Arc::new(allocator);
    
    std::thread::scope(|s| {
        for _ in 0..4 {
            let f = file.clone();
            let alloc = Arc::clone(&allocator);
            s.spawn(move || {
                // Each thread can allocate independently (wait-free)
                while let Some(range) = alloc.allocate(NonZeroU64::new(ALIGNMENT).unwrap()) {
                    let receipt = f.write_range(range, &vec![42u8; ALIGNMENT as usize]);
                    f.flush_range(receipt);
                }
            });
        }
    });
    
    unsafe { file.sync_all()?; }
    Ok(())
}

With Tokio Runtime

use ranged_mmap::{MmapFile, allocator::ALIGNMENT};
use std::num::NonZeroU64;
use tokio::task;

#[tokio::main]
async fn main() -> ranged_mmap::Result<()> {
    let (file, mut allocator) = MmapFile::create_default(
        "output.bin",
        NonZeroU64::new(ALIGNMENT * 256).unwrap()  // 1MB
    )?;
    
    // Allocate ranges (4K aligned)
    let range1 = allocator.allocate(NonZeroU64::new(ALIGNMENT * 128).unwrap()).unwrap();
    let range2 = allocator.allocate(NonZeroU64::new(ALIGNMENT * 128).unwrap()).unwrap();
    
    // Spawn async tasks
    let f1 = file.clone();
    let f2 = file.clone();
    
    let task1 = task::spawn_blocking(move || {
        f1.write_range(range1, &vec![1u8; (ALIGNMENT * 128) as usize])
    });
    
    let task2 = task::spawn_blocking(move || {
        f2.write_range(range2, &vec![2u8; (ALIGNMENT * 128) as usize])
    });
    
    let receipt1 = task1.await.unwrap();
    let receipt2 = task2.await.unwrap();
    
    // Flush specific ranges
    file.flush_range(receipt1);
    file.flush_range(receipt2);
    
    unsafe { file.sync_all()?; }
    Ok(())
}

Reading Data

use ranged_mmap::{MmapFile, allocator::ALIGNMENT};
use std::num::NonZeroU64;

fn main() -> ranged_mmap::Result<()> {
    let (file, mut allocator) = MmapFile::create_default(
        "output.bin",
        NonZeroU64::new(ALIGNMENT).unwrap()
    )?;
    // Allocations are 4K aligned
    let range = allocator.allocate(NonZeroU64::new(ALIGNMENT).unwrap()).unwrap();
    
    // Write data (data length must match range length)
    file.write_range(range, &vec![42u8; ALIGNMENT as usize]);
    
    // Read back
    let mut buf = vec![0u8; ALIGNMENT as usize];
    file.read_range(range, &mut buf)?;
    
    assert_eq!(buf[0], 42u8);
    Ok(())
}

Opening Existing Files

use ranged_mmap::{MmapFile, allocator::ALIGNMENT};
use std::num::NonZeroU64;

fn main() -> ranged_mmap::Result<()> {
    // Open existing file with default sequential allocator
    let (file, mut allocator) = MmapFile::open_default("existing.bin")?;
    
    println!("File size: {} bytes", file.size());
    println!("Remaining allocatable: {} bytes", allocator.remaining());
    
    // Continue allocating and writing (4K aligned)
    if let Some(range) = allocator.allocate(NonZeroU64::new(ALIGNMENT).unwrap()) {
        file.write_range(range, &vec![0u8; ALIGNMENT as usize]);
    }
    
    Ok(())
}

Limitations

  • Fixed Size: File size must be specified at creation and cannot be changed
  • Virtual Memory: Maximum file size is limited by system virtual memory
  • Platform Support: Currently optimized for Unix-like systems and Windows
  • No Built-in Locking: Users must manage concurrent access patterns

How It Works

  1. Memory Mapping: The file is memory-mapped using memmap2, making it accessible as a continuous memory region
  2. Range Allocation: Allocators provide non-overlapping ranges:
    • sequential::Allocator: Simple sequential allocation for single-thread use
    • concurrent::Allocator: Wait-free atomic allocation for multi-thread scenarios
  3. 4K Alignment: All allocations are aligned to 4K boundaries for optimal I/O performance
  4. Type Safety: AllocatedRange can only be created through the allocator, guaranteeing validity
  5. Lock-Free Writes: Each thread writes to its own AllocatedRange, avoiding locks
  6. Manual Flushing: Users control when data is synchronized to disk for optimal performance

Comparison

Feature ranged-mmap tokio::fs::File std::fs::File
Concurrent writes βœ… Lock-free ❌ Requires locks ❌ Requires locks
Zero-copy βœ… Yes ❌ No ❌ No
Type safety βœ… Compile-time ⚠️ Runtime ⚠️ Runtime
System calls (write) βœ… Zero ❌ Per write ❌ Per write
Dynamic size ❌ Fixed βœ… Yes βœ… Yes
Async support βœ… Runtime agnostic βœ… Tokio only ❌ No

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

License

This project is licensed under either of:

at your option.

Acknowledgments

Built on top of the excellent memmap2 crate.

Commit count: 0

cargo fmt