getattrlistbulk

Crates.iogetattrlistbulk
lib.rsgetattrlistbulk
version0.1.0
created_at2026-01-18 21:44:58.321933+00
updated_at2026-01-18 21:44:58.321933+00
descriptionSafe Rust bindings for macOS getattrlistbulk() system call for high-performance directory enumeration
homepage
repositoryhttps://github.com/quivent/getattrlistbulk-rs
max_upload_size
id2053168
size133,922
quivent (quivent)

documentation

https://docs.rs/getattrlistbulk

README

getattrlistbulk

Crates.io Documentation License

Safe Rust bindings for the macOS getattrlistbulk() system call. Enumerate directories and retrieve file metadata in bulk with minimal syscalls.

Why?

Traditional directory reading requires N+1 syscalls for N files:

opendir() → readdir() × N → stat() × N → closedir()

getattrlistbulk() retrieves entries AND metadata together:

open() → getattrlistbulk() × ceil(N/batch) → close()

For a directory with 10,000 files, this means ~10 syscalls instead of ~20,000.

Requirements

  • macOS 10.10+ (Yosemite or later)
  • Rust 1.70+

This crate only compiles on macOS. On other platforms, it will fail to compile with a clear error message.

Installation

[dependencies]
getattrlistbulk = "0.1"

Usage

Basic Example

use getattrlistbulk::{read_dir, RequestedAttributes};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let attrs = RequestedAttributes {
        name: true,
        size: true,
        object_type: true,
        ..Default::default()
    };

    for entry in read_dir("/Users/me/Documents", attrs)? {
        let entry = entry?;
        println!("{}: {} bytes", entry.name, entry.size.unwrap_or(0));
    }

    Ok(())
}

Get All Available Metadata

use getattrlistbulk::{read_dir, RequestedAttributes};

let attrs = RequestedAttributes {
    name: true,
    object_type: true,
    size: true,
    alloc_size: true,
    modified_time: true,
    permissions: true,
    inode: true,
    entry_count: true,  // for directories
};

for entry in read_dir("/path/to/dir", attrs)? {
    let entry = entry?;

    if let Some(modified) = entry.modified_time {
        println!("{} last modified: {:?}", entry.name, modified);
    }
}

Custom Buffer Size

Larger buffers mean fewer syscalls for large directories:

use getattrlistbulk::{read_dir_with_buffer, RequestedAttributes};

let attrs = RequestedAttributes::default().with_name().with_size();

// 256KB buffer for very large directories
let entries = read_dir_with_buffer("/big/directory", attrs, 256 * 1024)?;

Using the Builder

use getattrlistbulk::DirReader;

let entries = DirReader::new("/path/to/dir")
    .name()
    .size()
    .object_type()
    .buffer_size(128 * 1024)
    .follow_symlinks(false)
    .read()?;

Performance

Syscall Comparison

The performance advantage comes entirely from reducing syscalls:

Approach Syscall Pattern Complexity 10K Files
Traditional POSIX readdir() + stat() per file O(n) ~20,000 syscalls
Swift FileManager Uses POSIX internally O(n) ~10,000-20,000 syscalls
getattrlistbulk Bulk metadata per call O(n/batch) ~12 syscalls

Why This Matters

Traditional:  opendir() → [readdir() + stat()] × N → closedir()
This crate:   open() → getattrlistbulk() × ceil(N/800) → close()

Each syscall requires a user→kernel context switch. Reducing syscalls by ~1,600x eliminates this overhead.

Benchmarks (10,000 files)

Run the included benchmark yourself:

cargo bench --bench compare

Example output (Apple Silicon SSD):

Method Avg Time Speedup
std::fs::read_dir + metadata() ~19ms 1.0x
getattrlistbulk ~5ms ~4x faster

Syscall reduction: ~1,600x fewer (from ~20,000 to ~12)

Why "only" 4x faster with 1,600x fewer syscalls?

On fast NVMe SSDs, the kernel's VFS cache handles most metadata requests in-memory. The syscall overhead (~1μs each) becomes the bottleneck only partially.

Expected speedups by storage type:

Storage Speedup Why
NVMe SSD (cached) ~4x VFS cache masks I/O, syscall overhead partial
SATA SSD ~5-8x More I/O latency exposed
HDD ~10-20x Seek time dominates, batching helps significantly
Network (NFS/SMB) ~20-50x Round-trip latency makes batching critical

Swift Comparison

Swift's FileManager does NOT use getattrlistbulk internally—it wraps POSIX calls:

// Swift - still O(n) syscalls under the hood
let contents = try FileManager.default.contentsOfDirectory(
    at: url,
    includingPropertiesForKeys: [.fileSizeKey, .isDirectoryKey]
)

Swift can call getattrlistbulk via C interop, but Apple's high-level frameworks don't. This crate provides the optimized path that Apple's own tools use internally (Finder, ls, etc.).

Comparison with Alternatives

Crate Bulk Metadata macOS Optimized Cross-Platform
std::fs No No Yes
walkdir No No Yes
jwalk No No Yes
getattrlistbulk Yes Yes No

Use this crate when:

  • You're targeting macOS only
  • You need to read large directories quickly
  • You need metadata along with filenames

Use std::fs or walkdir when:

  • You need cross-platform support
  • You're reading small directories
  • You don't need metadata

Error Handling

use getattrlistbulk::{read_dir, RequestedAttributes, Error};

match read_dir("/some/path", RequestedAttributes::default()) {
    Ok(entries) => { /* ... */ }
    Err(Error::Open(e)) => eprintln!("Failed to open directory: {}", e),
    Err(Error::Syscall(e)) => eprintln!("System call failed: {}", e),
    Err(Error::Parse(msg)) => eprintln!("Buffer parsing error: {}", msg),
    Err(Error::NotSupported) => eprintln!("Not running on macOS"),
}

Safety

This crate uses unsafe internally to call the C system call, but exposes a fully safe public API. All buffer parsing is bounds-checked, and file descriptors are properly managed.

License

Licensed under either of:

at your option.

Contributing

Contributions welcome! Please read the SPECIFICATION.md for implementation details and requirements.

Commit count: 2

cargo fmt