Crates.io | nebari |
lib.rs | nebari |
version | 0.5.5 |
source | src |
created_at | 2021-09-25 15:05:14.632965 |
updated_at | 2023-02-27 15:05:24.12291 |
description | ACID-compliant database storage implementation using an append-only file format. |
repository | https://github.com/khonsulabs/nebari |
id | 456207 |
size | 428,441 |
nebari - noun - the surface roots that flare out from the base of a bonsai tree
This crate provides the Roots type, which is the transactional storage layer for BonsaiDb. It is loosely inspired by Couchstore.
This crate blocks the current thread when accessing the filesystem. If you are looking for an async-ready database, BonsaiDb is our vision of an async-aware database built atop Nebari.
This crate is alpha. While its format is considered stable, there may be bugs that could lead to data loss. Please have a good backup strategy while using this crate.
Inserting a key-value pair in an on-disk tree with full revision history:
use nebari::{
    tree::{Root, Versioned},
    Config,
};

// Open (or create) a database in a temporary folder.
let database_folder = tempfile::tempdir().unwrap();
let roots = Config::default_for(database_folder.path())
    .open()
    .unwrap();
// A Versioned tree keeps full revision history.
let tree = roots.tree(Versioned::tree("a-tree")).unwrap();
tree.set("hello", "world").unwrap();
For more examples, check out nebari/examples/.
Nebari exposes multiple levels of functionality. The lowest level functionality is the TreeFile. A TreeFile is a key-value store that uses an append-only file format for its implementation.
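To make the append-only idea concrete, here is a minimal in-memory sketch. It is not Nebari's TreeFile and does not reflect its on-disk format or B-Tree structure; the `AppendOnlyLog` type and its methods are hypothetical names used only for illustration. A `Vec<u8>` stands in for the file, and an index maps each key to the location of its most recent value.

```rust
use std::collections::HashMap;

/// A toy append-only key-value log (hypothetical; not Nebari's TreeFile).
pub struct AppendOnlyLog {
    /// Stands in for the file: bytes are only ever appended, never rewritten.
    data: Vec<u8>,
    /// Maps each key to the (offset, length) of its latest value in `data`.
    index: HashMap<Vec<u8>, (usize, usize)>,
}

impl AppendOnlyLog {
    pub fn new() -> Self {
        Self {
            data: Vec::new(),
            index: HashMap::new(),
        }
    }

    /// Appends the value and points the key at the new location.
    /// Any older value for the same key stays in `data` untouched.
    pub fn set(&mut self, key: &[u8], value: &[u8]) {
        let offset = self.data.len();
        self.data.extend_from_slice(value);
        self.index.insert(key.to_vec(), (offset, value.len()));
    }

    /// Returns the latest value stored for the key, if any.
    pub fn get(&self, key: &[u8]) -> Option<&[u8]> {
        let &(offset, len) = self.index.get(key)?;
        Some(&self.data[offset..offset + len])
    }

    /// Total bytes appended so far, including superseded values.
    pub fn bytes_written(&self) -> usize {
        self.data.len()
    }
}
```

Overwriting a key grows the log rather than replacing bytes in place, which is why superseded values linger until something rewrites the file.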
Using TreeFiles and a transaction log, Roots enables ACID-compliant, multi-tree transactions.
Each tree supports:
TreeFile::modify(), which allows operating on one or more keys and performing various operations.
The Vault trait allows you to bring your own encryption, compression, or other functionality to this format. Each independently-addressable chunk of data that is written to the file passes through the vault.
Use VersionedTreeRoot to store information that allows scanning old revision information. Or, if you want to avoid the extra IO, use the UnversionedTreeRoot, which only stores the information needed to retrieve the latest data in the file.
Atomicity: Every operation on a TreeFile is done atomically. Operation::CompareSwap can be used to perform atomic operations that require evaluating the currently stored value.
Consistency: Atomic locking operations are used when publishing a new transaction state. This ensures that readers can never operate on a partially updated state.
Isolation: Currently, each tree can only be accessed exclusively within a transaction. This means that if two transactions request the same tree, one will execute and complete before the second is allowed access to the tree. This strategy could be modified in the future to allow for more flexibility.
Durability: The append-only file format is designed to only allow reading data that has been fully flushed to disk. Any writes that were interrupted will be truncated from the end of the file.
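The compare-and-swap idea mentioned under Atomicity can be sketched independently of Nebari. The `Store` type below is a hypothetical in-memory stand-in, not Nebari's API: it assumes a single mutex guards the whole map, so checking the stored value and replacing it happen as one indivisible step.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;

/// Hypothetical in-memory store used only to illustrate compare-and-swap.
pub struct Store {
    entries: Mutex<BTreeMap<Vec<u8>, Vec<u8>>>,
}

impl Store {
    pub fn new() -> Self {
        Self {
            entries: Mutex::new(BTreeMap::new()),
        }
    }

    /// Returns a copy of the value currently stored for `key`.
    pub fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.entries.lock().unwrap().get(key).cloned()
    }

    /// Replaces the value for `key` with `new` only if the stored value
    /// currently equals `expected`. `None` means "no value stored".
    /// On mismatch, nothing changes and the actual value is returned.
    pub fn compare_and_swap(
        &self,
        key: &[u8],
        expected: Option<&[u8]>,
        new: Option<&[u8]>,
    ) -> Result<(), Option<Vec<u8>>> {
        // The lock is held for the comparison and the write together.
        let mut entries = self.entries.lock().unwrap();
        let current = entries.get(key).map(|v| v.as_slice());
        if current != expected {
            return Err(current.map(|v| v.to_vec()));
        }
        match new {
            Some(value) => {
                entries.insert(key.to_vec(), value.to_vec());
            }
            None => {
                entries.remove(key);
            }
        }
        Ok(())
    }
}
```

A caller that loses the race receives the value that is actually stored and can decide whether to retry with it.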
Transaction IDs are recorded in the tree headers. When restoring from disk, the transaction IDs are verified against the transaction log. Because of the append-only format, if we encounter a transaction that wasn't recorded, we can continue scanning the file to recover the previous state. We do this until we find a successfully committed transaction.
This process is much simpler than most database implementations due to the simple guarantees that append-only formats provide.
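The recovery scan described above can be sketched as follows. This is a simplified model under stated assumptions, not Nebari's implementation: `Header` and `latest_committed` are hypothetical names, the headers are assumed to be already parsed from the file in write order, and the transaction log is modeled as a set of committed IDs.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified view of a tree header found in the file.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Header {
    /// The transaction that wrote this header.
    pub transaction_id: u64,
    /// Where this header's tree root lives in the file.
    pub root_offset: u64,
}

/// Walks the headers from newest to oldest and returns the first one
/// whose transaction ID appears in the transaction log.
pub fn latest_committed<'a>(
    headers: &'a [Header],
    committed: &HashSet<u64>,
) -> Option<&'a Header> {
    headers
        .iter()
        .rev()
        .find(|header| committed.contains(&header.transaction_id))
}
```

Headers written by a transaction that never reached the log are simply skipped, so the state from the previous committed transaction is what gets restored.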
@ecton wasn't a database engineer before starting this project, and depending on your viewpoint may still not be considered a database engineer. Implementing ACID-compliance is not something that should be attempted lightly.
Creating ACID-compliance with append-only formats is much easier to achieve, however, as long as you can guarantee two things:
The B-Tree implementation in Nebari is designed to offer those exact guarantees.
The major downside of append-only formats is that deleted data isn't cleaned up until a maintenance process occurs: compaction. This process rewrites the file's contents, skipping over entries that are no longer alive. This process can happen without blocking the file from being operated on, but it does introduce IO overhead during the operation.
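As a rough illustration of what compaction does, the sketch below rewrites a flat list of records and keeps only the latest live value for each key. It is a toy model, not Nebari's compaction API or file format: `Record` and `compact` are hypothetical names, and a deletion is assumed to be represented by a record whose value is `None`.

```rust
use std::collections::BTreeMap;

/// One appended record: a key and either a value or a deletion marker.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Record {
    pub key: Vec<u8>,
    /// `None` marks the key as deleted.
    pub value: Option<Vec<u8>>,
}

/// Builds a new log containing only the latest live value for each key,
/// ordered by key. The input log is left untouched.
pub fn compact(log: &[Record]) -> Vec<Record> {
    // Later records override earlier ones for the same key.
    let mut latest: BTreeMap<&[u8], &Option<Vec<u8>>> = BTreeMap::new();
    for record in log {
        latest.insert(record.key.as_slice(), &record.value);
    }
    // Drop keys whose latest record is a deletion marker.
    latest
        .into_iter()
        .filter_map(|(key, value)| {
            value.as_ref().map(|v| Record {
                key: key.to_vec(),
                value: Some(v.clone()),
            })
        })
        .collect()
}
```

Because the sketch builds a fresh log instead of mutating the old one, readers could keep using the original until the new one is ready.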
Nebari provides APIs that perform compaction, but currently delegates scheduling and automation to consumers of this library.
This project, like all projects from Khonsu Labs, is open-source. This repository is available under the MIT License or the Apache License 2.0.
To learn more about contributing, please see CONTRIBUTING.md.