| Crates.io | bitcoinleveldb-posixwfile |
| lib.rs | bitcoinleveldb-posixwfile |
| version | 0.1.1 |
| created_at | 2025-12-01 17:54:46.803803+00 |
| updated_at | 2025-12-01 17:54:46.803803+00 |
| description | POSIX-backed implementation of a buffered, durable LevelDB-style WritableFile for the Bitcoin-leveldb stack, using libc syscalls for append, flush, sync, and close with manifest-aware directory syncing. |
| homepage | |
| repository | |
| max_upload_size | |
| id | 1960196 |
| size | 143,308 |
A small, tightly scoped crate providing a POSIX-backed implementation of a LevelDB-style WritableFile for the Bitcoin-derived LevelDB port. It encapsulates buffered appends, durable syncing, and careful error handling on UNIX-like systems.
bitcoinleveldb-posixwfile implements a PosixWritableFile type that:
- Wraps a raw file descriptor (fd: i32) with a fixed-size write buffer.
- Implements the traits WritableFile, WritableFileAppend, WritableFileFlush, WritableFileSync, and WritableFileClose used by the Bitcoin-flavored LevelDB core.
- Calls the POSIX primitives directly (open, write, fsync / fdatasync, close, and fcntl(F_FULLFSYNC) on Apple targets) via libc for predictable behavior.
- Distinguishes manifest files (MANIFEST*) from regular files and applies Bitcoin/LevelDB’s durable-fsync discipline: syncing the directory containing the manifest before syncing the manifest itself.

This crate is intentionally minimal and low-level. It is not a general-purpose file API but a precise building block in the Bitcoin-leveldb storage stack, with durability and crash semantics tuned to LevelDB’s expectations.
PosixWritableFile

pub struct PosixWritableFile {
    buf: [u8; WRITABLE_FILE_BUFFER_SIZE],
    pos: usize,
    fd: i32,
    is_manifest: bool,
    filename: String,
    dirname: String,
    ever_valid_fd: bool,
}
Key invariants and behavior:
- buf + pos: in-memory staging buffer for appends. Writes are batched to reduce syscall overhead and improve I/O throughput.
- fd:
  - When fd >= 0, the OS file descriptor is considered open and owned by this instance.
  - When fd < 0, the descriptor is treated as closed. A negative descriptor that was once valid is handled idempotently on close().
- ever_valid_fd distinguishes "never opened" from "already closed"; this allows close() to return a realistic EBADF-style error when the file was never valid.
- is_manifest and dirname are precomputed:
  - is_manifest is true when the basename starts with "MANIFEST".
  - dirname holds the directory portion of filename (or "." if none).

Construction is explicit:
impl PosixWritableFile {
pub fn new(filename: String, fd: i32) -> Self { /* ... */ }
}
new does not open the file; the caller is responsible for obtaining an fd with the desired flags and mode. This separation keeps POSIX concerns localized and predictable.
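The precomputed is_manifest and dirname values described above can be pictured with a small std-only sketch. The function names dirname_of, basename_of, and is_manifest_name are hypothetical stand-ins; the crate's own helpers take &String and return String or Slice, and may differ in edge cases such as files directly under the root directory.

```rust
/// Directory portion of `filename`, or "." when it contains no '/'.
/// Hypothetical helper; not the crate's API.
fn dirname_of(filename: &str) -> &str {
    match filename.rfind('/') {
        Some(idx) => &filename[..idx],
        None => ".",
    }
}

/// Last path component: everything after the final '/', or the whole
/// string when there is no '/'.
fn basename_of(filename: &str) -> &str {
    match filename.rfind('/') {
        // '/' is a single byte, so idx + 1 is always a valid boundary.
        Some(idx) => &filename[idx + 1..],
        None => filename,
    }
}

/// A file counts as a manifest when its basename starts with "MANIFEST".
fn is_manifest_name(filename: &str) -> bool {
    basename_of(filename).starts_with("MANIFEST")
}
```

For example, dirname_of("db/MANIFEST-000001") yields "db", while a bare "MANIFEST-000001" yields "."; only the basename is inspected, so a directory that happens to be called MANIFEST.d does not make its contents manifests.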
WritableFileAppend implementation

impl WritableFileAppend for PosixWritableFile {
    fn append(&mut self, data: &Slice) -> Status { /* ... */ }
}
Algorithm outline:
1. If data is empty, return Status::ok().
2. Copy as much of data as fits into buf until either data is exhausted or the buffer becomes full.
3. If all of data fit, return Ok without a syscall (purely buffered).
4. Otherwise, flush the buffer (flush_buffer()). On error, propagate the Status.
5. If the remaining write_size < WRITABLE_FILE_BUFFER_SIZE, copy into an empty buffer and return Ok.
6. If the remaining write_size >= WRITABLE_FILE_BUFFER_SIZE, bypass the buffer and call write_unbuffered directly.

This strategy approximates a batching policy: many small appends are batched in memory; large writes do not incur extra buffering overhead.
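The outline above can be modeled without any syscalls. The following is a minimal sketch, not the crate's code: BufferedSink, BUF_SIZE, written, and write_calls are hypothetical names, the buffer is deliberately tiny, a Vec stands in for the file, and error propagation is left out.

```rust
// Tiny on purpose; the crate uses WRITABLE_FILE_BUFFER_SIZE instead.
const BUF_SIZE: usize = 8;

/// Hypothetical model of the append policy with an in-memory "file".
struct BufferedSink {
    buf: [u8; BUF_SIZE],
    pos: usize,
    written: Vec<u8>,   // stand-in for bytes that reached the file
    write_calls: usize, // how many "syscalls" were issued
}

impl BufferedSink {
    fn new() -> Self {
        Self { buf: [0; BUF_SIZE], pos: 0, written: Vec::new(), write_calls: 0 }
    }

    fn write_unbuffered(&mut self, data: &[u8]) {
        self.written.extend_from_slice(data);
        self.write_calls += 1;
    }

    fn flush_buffer(&mut self) {
        if self.pos == 0 {
            return; // nothing staged
        }
        self.written.extend_from_slice(&self.buf[..self.pos]);
        self.write_calls += 1;
        self.pos = 0;
    }

    fn append(&mut self, mut data: &[u8]) {
        if data.is_empty() {
            return;
        }
        // Fit as much as possible into the staging buffer.
        let n = data.len().min(BUF_SIZE - self.pos);
        self.buf[self.pos..self.pos + n].copy_from_slice(&data[..n]);
        self.pos += n;
        data = &data[n..];
        if data.is_empty() {
            return; // purely buffered, no write issued
        }
        // The buffer is full: at least one write is needed.
        self.flush_buffer();
        if data.len() < BUF_SIZE {
            // Small remainder goes into the now-empty buffer.
            self.buf[..data.len()].copy_from_slice(data);
            self.pos = data.len();
        } else {
            // Large remainder bypasses the buffer entirely.
            self.write_unbuffered(data);
        }
    }
}
```

With an 8-byte buffer, appending 3 bytes and then 7 bytes issues a single write of 8 bytes and leaves 2 bytes staged; a later 16-byte append fills and flushes the buffer, then writes the 10-byte remainder directly.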
flush_buffer

impl PosixWritableFile {
    pub fn flush_buffer(&mut self) -> crate::Status { /* ... */ }
}
impl WritableFileFlush for PosixWritableFile {
    fn flush(&mut self) -> crate::Status { self.flush_buffer() }
}
- If pos == 0, it is a no-op and returns Status::ok().
- Otherwise, it calls write_unbuffered with the buffer content.
- It resets pos to 0, even if the underlying write fails, to match the C++ LevelDB semantics.

write_unbuffered

impl PosixWritableFile {
    pub fn write_unbuffered(&mut self, data: *const u8, size: usize) -> crate::Status { /* ... */ }
}
Low-level loop:
- Calls libc::write(fd, data, size) repeatedly until all bytes are written.
- Handles partial writes by advancing data and decreasing size.
- On EINTR, logs and retries.
- On any other error, converts errno into a Status via posix_error and returns immediately.

This design ensures that higher layers see a simple all-or-error contract, hiding the complexities of partial POSIX writes.
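The same all-or-error loop can be sketched against std::io::Write instead of a raw descriptor. This is an illustration under assumptions, not the crate's implementation: write_fully and Choppy are hypothetical names, ErrorKind::Interrupted plays the role of EINTR, and the zero-length-write guard is an addition that the description above does not mention.

```rust
use std::io::{self, ErrorKind, Write};

/// Keep writing until `data` is exhausted, retrying on interruption.
fn write_fully<W: Write>(sink: &mut W, mut data: &[u8]) -> io::Result<()> {
    while !data.is_empty() {
        match sink.write(data) {
            // Added guard: a zero-byte write would otherwise loop forever.
            Ok(0) => return Err(io::Error::new(ErrorKind::WriteZero, "wrote zero bytes")),
            // Partial write: advance past the bytes that were accepted.
            Ok(n) => data = &data[n..],
            // EINTR equivalent: retry the same slice.
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            // Any other error aborts immediately.
            Err(e) => return Err(e),
        }
    }
    Ok(())
}

/// Test double: the first call is "interrupted", later calls accept at
/// most three bytes each, which forces the partial-write path.
struct Choppy {
    out: Vec<u8>,
    interrupted_once: bool,
}

impl Write for Choppy {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        if !self.interrupted_once {
            self.interrupted_once = true;
            return Err(io::Error::from(ErrorKind::Interrupted));
        }
        let n = buf.len().min(3);
        self.out.extend_from_slice(&buf[..n]);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

The standard library's Write::write_all follows essentially the same pattern; the crate works on a raw fd through libc so that it can report failures through posix_error and its own Status type.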
Durability semantics are vital for LevelDB’s correctness under power loss. This crate implements a conservative, Bitcoin-inspired policy.
sync_fd

impl PosixWritableFile {
    pub fn sync_fd(fd: i32, fd_path: &String, syncing_dir: bool) -> crate::Status { /* ... */ }
}
Behavior:
- On Apple targets, try fcntl(fd, F_FULLFSYNC) first, which provides stronger guarantees than fsync() with respect to power failures.
- If that succeeds, return Status::ok().
- Otherwise, use fdatasync where available (Linux, *BSD, Android) for metadata-light syncing.
- Fall back to fsync when fdatasync is not available.
- On success (sync_result == 0), return Status::ok().
- On failure, read errno. If syncing_dir == true and errno == EINVAL, treat it as a non-fatal success. Some filesystems do not support directory fsync; Bitcoin explicitly tolerates this.
- Otherwise, convert the error into a Status via posix_error(fd_path, err).

This function centralizes platform-conditional behavior and error normalization.
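The error-normalization part of this behavior is a pure decision and can be sketched in isolation. SyncOutcome and classify_sync are hypothetical names; the EINVAL constant is hard-coded to its Linux and macOS value to keep the sketch dependency-free, whereas real code would take it from libc.

```rust
// Assumed value of EINVAL on Linux and macOS; real code would use libc::EINVAL.
const EINVAL: i32 = 22;

/// Hypothetical stand-in for the Status returned by sync_fd.
#[derive(Debug, PartialEq)]
enum SyncOutcome {
    Success,
    Failed(i32), // carries the errno that would go to posix_error
}

/// Map the raw result of fsync / fdatasync to an outcome.
fn classify_sync(sync_result: i32, errno: i32, syncing_dir: bool) -> SyncOutcome {
    if sync_result == 0 {
        return SyncOutcome::Success;
    }
    // Some filesystems reject fsync on a directory with EINVAL;
    // that case is tolerated rather than reported.
    if syncing_dir && errno == EINVAL {
        return SyncOutcome::Success;
    }
    SyncOutcome::Failed(errno)
}
```

Note that the tolerance applies only when a directory is being synced: the same EINVAL on a regular file is still an error.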
sync_dir_if_manifest

impl PosixWritableFile {
    pub fn sync_dir_if_manifest(&mut self) -> crate::Status { /* ... */ }
}
- If !is_manifest, it returns Status::ok() without making syscalls.
- Otherwise, it converts dirname to CString. If the directory name contains an embedded NUL, return io_error with a descriptive message.
- It opens the directory with open(dir, O_RDONLY).
- On success, it calls sync_fd(dir_fd, dirname, true) and then close(dir_fd).
- If open fails, it converts errno into a Status via posix_error.

This mirrors Bitcoin’s strategy: ensure that directory entries referencing newly created files are durable before the manifest referencing them is considered persistent.
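A rough std-only analogue of these steps is shown below, assuming a UNIX-like target where a directory can be opened read-only and synced. It uses std::fs::File rather than raw libc calls, so it is a sketch of the idea rather than the crate's code; the free function and its bool return value (true when a directory sync was attempted) are hypothetical.

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Sync the parent directory of `path`, but only for manifest files.
/// Returns Ok(false) when no directory sync was needed.
fn sync_dir_if_manifest(path: &Path) -> io::Result<bool> {
    let is_manifest = path
        .file_name()
        .and_then(|name| name.to_str())
        .map_or(false, |name| name.starts_with("MANIFEST"));
    if !is_manifest {
        return Ok(false); // regular file: no syscalls at all
    }
    // A bare file name has an empty parent; fall back to ".".
    let dir = match path.parent() {
        Some(p) if !p.as_os_str().is_empty() => p,
        _ => Path::new("."),
    };
    // Read-only open of the directory; the handle closes on drop.
    let dir_handle = File::open(dir)?;
    match dir_handle.sync_all() {
        Ok(()) => Ok(true),
        // EINVAL surfaces as InvalidInput: the filesystem does not
        // support directory sync, which is tolerated.
        Err(e) if e.kind() == io::ErrorKind::InvalidInput => Ok(true),
        Err(e) => Err(e),
    }
}
```

Because the handle is an owned File, the close(dir_fd) step happens implicitly when dir_handle goes out of scope.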
sync

impl WritableFileSync for PosixWritableFile {
    fn sync(&mut self) -> crate::Status { /* ... */ }
}
// Rough order of operations inside sync():
// 1. sync_dir_if_manifest();
// 2. flush_buffer();
// 3. sync_fd(fd, filename, false);

- First, it runs sync_dir_if_manifest(), so the containing directory is synced when the file is a manifest.
- Then it calls flush_buffer().
- Finally, it calls sync_fd on the file descriptor itself.

The sequencing is deliberate: LevelDB’s metadata consistency and recovery logic assume that files referenced by the manifest are already on disk by the time the manifest is reported as synced.
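The ordering can be made explicit with a small trait whose default sync method chains the three steps. Everything here is illustrative: SyncSteps, Recorder, and StepResult are hypothetical names, Result<(), String> stands in for Status, and stopping at the first failing step is an assumption modeled on how upstream C++ LevelDB sequences these calls.

```rust
/// Hypothetical stand-in for a Status: Ok(()) or an error message.
type StepResult = Result<(), String>;

trait SyncSteps {
    fn sync_dir_if_manifest(&mut self) -> StepResult;
    fn flush_buffer(&mut self) -> StepResult;
    fn sync_fd(&mut self) -> StepResult;

    /// Directory first, then buffered bytes, then the file itself.
    /// Assumption: the first failing step ends the sequence.
    fn sync(&mut self) -> StepResult {
        self.sync_dir_if_manifest()?;
        self.flush_buffer()?;
        self.sync_fd()
    }
}

/// Test double that records which steps ran, in order.
struct Recorder {
    log: Vec<&'static str>,
    fail_flush: bool,
}

impl SyncSteps for Recorder {
    fn sync_dir_if_manifest(&mut self) -> StepResult {
        self.log.push("sync_dir_if_manifest");
        Ok(())
    }

    fn flush_buffer(&mut self) -> StepResult {
        self.log.push("flush_buffer");
        if self.fail_flush {
            Err("flush failed".to_string())
        } else {
            Ok(())
        }
    }

    fn sync_fd(&mut self) -> StepResult {
        self.log.push("sync_fd");
        Ok(())
    }
}
```

A Recorder with fail_flush set never reaches sync_fd, which is the property that matters: a file is not reported as synced if its buffered bytes could not be written.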
close

impl WritableFileClose for PosixWritableFile {
    fn close(&mut self) -> Status { /* ... */ }
}
Behavior:
- If fd < 0:
  - If ever_valid_fd is true, treat this as a no-op, returning Status::ok() (idempotent close).
  - Otherwise, behave as close(-1) would and return an EBADF-style error using posix_error.
- If fd >= 0:
  - Call flush_buffer() first.
  - Call libc::close(fd).
  - If close() fails and the flush was successful, propagate the close error as Status.
  - Set fd to -1 regardless of error, reflecting ownership transfer back to the OS.

This yields a pragmatic mix of realism (EBADF when the file was never valid) and ergonomic idempotence (multiple close() calls on a previously valid file are safe).
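The close rules form a small state machine that can be sketched with the flush and OS-level close injected as closures. Handle and its close signature are hypothetical, Result<(), String> stands in for Status, and deriving ever_valid_fd from the initial fd is an assumption about how the crate sets that flag.

```rust
/// Hypothetical model of the descriptor state tracked by the file.
struct Handle {
    fd: i32,
    ever_valid_fd: bool,
}

impl Handle {
    fn new(fd: i32) -> Self {
        // Assumption: the flag records whether the initial fd was valid.
        Self { fd, ever_valid_fd: fd >= 0 }
    }

    /// `flush` and `os_close` stand in for flush_buffer() and libc::close.
    fn close(
        &mut self,
        flush: impl FnOnce() -> Result<(), String>,
        os_close: impl FnOnce(i32) -> Result<(), String>,
    ) -> Result<(), String> {
        if self.fd < 0 {
            return if self.ever_valid_fd {
                Ok(()) // already closed: idempotent no-op
            } else {
                Err("EBADF: bad file descriptor".to_string())
            };
        }
        let mut status = flush();
        let close_result = os_close(self.fd);
        // A close error is reported only if the flush itself succeeded.
        if close_result.is_err() && status.is_ok() {
            status = close_result;
        }
        self.fd = -1; // the descriptor is gone regardless of errors
        status
    }
}
```

When both the flush and the close fail, the flush error wins, because it is the earlier and usually more informative failure.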
Drop implementation

impl Drop for PosixWritableFile {
    fn drop(&mut self) {
        if *self.fd() >= 0 {
            let _ = self.close();
        }
    }
}

- If fd >= 0, drop calls close() and discards any resulting Status.
- If fd < 0, it simply logs that the fd is already closed.

This mirrors the C++ LevelDB destructor: best-effort closure on drop, with errors only observable if you call close() explicitly.
PosixWritableFile includes static helpers for file path analysis:
dirname_static

impl PosixWritableFile {
    pub fn dirname_static(filename: &String) -> String { /* ... */ }
}
- Splits filename using the last '/'.
- If no '/' is found, returns ".".
- Includes a debug_assert! that the basename does not itself contain /.

basename

impl PosixWritableFile {
    pub fn basename(filename: &String) -> Slice { /* ... */ }
}
- Returns a Slice pointing into the original string’s buffer.
- If a / exists, it returns the substring after the final slash as a borrowed slice.
- If no / is found, returns a slice over the entire filename.
- The Slice borrows from the filename value; callers must ensure the string outlives the Slice use.

is_manifest_static

impl PosixWritableFile {
    pub fn is_manifest_static(filename: &String) -> bool { /* ... */ }
}
- Uses basename to obtain the last path component.
- Treats any basename starting with "MANIFEST" as a manifest file.
- The result determines whether the directory is synced in sync_dir_if_manifest.

Throughout the implementation, logging macros such as trace! and debug! are invoked to record:
- Calls to the main operations (append, flush_buffer, write_unbuffered, sync, close, drop).

This crate assumes a logging facade compatible with those macros (e.g., tracing). When integrated with the rest of the bitcoinleveldb ecosystem, these logs provide high-resolution introspection of I/O behavior under load and during error conditions.
This crate is designed to be used as part of a larger bitcoinleveldb workspace or dependency graph. Typical integration points include:
- Wrapping PosixWritableFile into a higher-level environment or factory that returns concrete WritableFile trait objects to LevelDB’s table and log writers.
- Obtaining a descriptor via libc::open or Rust’s std::os::fd::AsRawFd interfaces from higher layers, then wrapping that descriptor in PosixWritableFile::new.
- Calling append, flush, sync, and close following LevelDB’s state machine.

The provided behavior is intentionally conservative: it favors correctness and persistence guarantees over absolute peak throughput.
unsafe Usage

The implementation uses unsafe blocks for:
- Building Slice values from raw pointers and lengths (Slice::from_ptr_len).
- Calling into libc (open, write, fsync, fdatasync, fcntl, close).

These usages are localized and guarded by consistent invariants:
- write errors treat EINTR as retryable.

Consumers of the crate interact through safe Rust APIs (append, flush, sync, close) and do not need to touch the underlying unsafe elements.
Below is a schematic example. Exact types like Slice, Status, and various WritableFile* traits come from the surrounding bitcoinleveldb ecosystem and are not redefined here.
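use bitcoinleveldb_posixwfile::PosixWritableFile;
use bitcoinleveldb_core::{Slice, Status};
use std::ffi::CString;

// Schematic only: the WritableFile* traits must also be in scope, and
// Status::is_ok() is assumed to exist on the ecosystem's Status type.
fn open_posix_writable(path: &str) -> Status {
    let c_path = CString::new(path).expect("path must not contain NUL bytes");
    let fd = unsafe { libc::open(c_path.as_ptr(), libc::O_CREAT | libc::O_WRONLY, 0o644) };
    if fd < 0 {
        // convert errno to a Status via posix_error in the real integration
        unimplemented!();
    }
    let mut wf = PosixWritableFile::new(path.to_string(), fd);
    let data = Slice::from("hello world");
    // Each call returns a Status, so check it explicitly instead of using `?`.
    let s = wf.append(&data);
    if !s.is_ok() { return s; }
    let s = wf.sync();
    if !s.is_ok() { return s; }
    wf.close()
}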
In a real system the Status type provides rich error reporting and the traits allow polymorphic substitution of other WritableFile backends (e.g., in-memory or testing variants).
- Crate: bitcoinleveldb-posixwfile
- Version: 0.1.1
- Edition: 2024
- Platform: UNIX-like systems with a libc-compatible C library.

This crate exists to deliver robust on-disk behavior under adverse conditions:
- Durability is enforced via fsync/fdatasync and F_FULLFSYNC (where available).
- Filesystem quirks, such as directory fsync returning EINVAL, are tolerated in a manner consistent with Bitcoin Core.
- Errors are reported through the Status type rather than panicking.

Systems relying on Bitcoin-leveldb can use this component as the final stage of their persistence pipeline, trusting that it adheres to the same invariants as the C++ implementation it mirrors.