fjall

Crates.io	fjall
lib.rs	fjall
version	3.0.1
created_at	2023-12-21 17:17:56.379608+00
updated_at	2026-01-06 00:52:02.00723+00
description	Log-structured, embeddable key-value storage engine
homepage	https://github.com/fjall-rs/fjall
repository	https://github.com/fjall-rs/fjall
id	1077261
size	411,535

Marvin (marvin-j97)

README

Fjall (Nordic: "Mountain") is a log-structured, embeddable key-value storage engine written in Rust. It features:

A thread-safe BTreeMap-like API
100% safe & stable Rust
LSM-tree-based storage similar to RocksDB
Range & prefix searching with forward and reverse iteration
Multiple keyspaces (a.k.a. column families) with cross-keyspace atomic semantics
Built-in compression (default = LZ4)
Serializable transactions (optional)
Key-value separation for large blob use cases (optional)
Automatic background maintenance

It is not:

A standalone database server
A relational or wide-column database: it has no built-in notion of columns or query language

Basic usage

cargo add fjall

use fjall::{PersistMode, Database, KeyspaceCreateOptions};

// A database may contain multiple keyspaces
// You should probably only use a single database for your application
let db = Database::builder(folder).open()?;
// TxDatabase::builder for transactional semantics

// Each keyspace is its own physical LSM-tree, and thus isolated from other keyspaces
let items = db.keyspace("my_items", KeyspaceCreateOptions::default)?;

// Write some data
items.insert("a", "hello")?;

// And retrieve it
let bytes = items.get("a")?;

// Or remove it again
items.remove("a")?;

// Search by prefix
for kv in items.prefix("prefix") {
  // ...
}

// Search by range
for kv in items.range("a"..="z") {
  // ...
}

// Iterators implement DoubleEndedIterator, so you can search backwards, too!
for kv in items.prefix("prefix").rev() {
  // ...
}

// Sync the journal to disk to make sure data is definitely durable
// When the database is dropped, it will try to persist with `PersistMode::SyncAll` automatically
db.persist(PersistMode::SyncAll)?;

[!TIP] Like any typical key-value store, keys are stored in lexicographic order. If you are storing integer keys (e.g. time series data), you should encode them in big-endian form to get predictable ordering.
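To illustrate the tip above, here is a minimal std-only sketch. It uses a `BTreeMap` keyed by byte arrays as a stand-in for a byte-ordered store; the helper names (`timestamp_key`, `little_endian_key`, `scan_order`) are made up for this example and are not part of Fjall.

```rust
use std::collections::BTreeMap;

/// Encodes a timestamp as a fixed-width big-endian key, so that byte-wise
/// (lexicographic) order matches numeric order.
fn timestamp_key(ts: u64) -> [u8; 8] {
    ts.to_be_bytes()
}

/// Counter-example: little-endian bytes do NOT sort numerically.
fn little_endian_key(ts: u64) -> [u8; 8] {
    ts.to_le_bytes()
}

/// Returns the timestamps in the order a byte-ordered store would yield them
/// for the given key encoding.
fn scan_order(timestamps: &[u64], encode: fn(u64) -> [u8; 8]) -> Vec<u64> {
    // A BTreeMap with byte-array keys sorts lexicographically,
    // which is the same ordering rule a key-value store applies to keys.
    let map: BTreeMap<[u8; 8], u64> = timestamps
        .iter()
        .map(|&ts| (encode(ts), ts))
        .collect();
    map.into_values().collect()
}
```

With big-endian keys, `scan_order(&[256, 1, 255], timestamp_key)` yields the timestamps in numeric order; with `little_endian_key`, 256 sorts first because its lowest byte is zero.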

Durability

To support different kinds of workloads, Fjall is agnostic about the type of durability your application needs. After writing data (insert, remove or committing a write batch/transaction), you can choose to call Database::persist which takes a PersistMode parameter. By default, any operation will flush to OS buffers, but not to disk. This matches RocksDB's default durability. Also, when dropped, the database will try to persist the journal to disk synchronously.
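The difference between "flushed to OS buffers" and "synced to disk" can be sketched with the standard library alone. This is not Fjall's journal code: the `Durability` enum and `append_record` function are hypothetical names for this illustration, and only the general idea of a persist mode is taken from the text above.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// How far an appended record is pushed before the call returns.
/// Hypothetical enum for this sketch only.
#[derive(Clone, Copy, Debug)]
enum Durability {
    /// Hand the bytes to the OS; they may still sit in the page cache.
    OsBuffer,
    /// Ask the OS to write the file contents to the storage device.
    SyncData,
    /// Ask the OS to write contents and file metadata to the storage device.
    SyncAll,
}

/// Appends `record` to the file at `path` with the requested durability.
fn append_record(path: &Path, record: &[u8], durability: Durability) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    // After write_all, the bytes are in OS buffers but not necessarily on disk.
    file.write_all(record)?;
    match durability {
        // std::fs::File does no user-space buffering, so flush is effectively
        // a no-op here; it is shown for symmetry.
        Durability::OsBuffer => file.flush()?,
        Durability::SyncData => file.sync_data()?,
        Durability::SyncAll => file.sync_all()?,
    }
    Ok(())
}
```

A record written with `OsBuffer` survives an application crash but may be lost on power failure; the two sync variants trade write latency for protection against that case.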

Multithreading, Async and Multiprocess

[!WARNING] A single database may not be loaded in parallel from separate processes.

Fjall is internally synchronized for multi-threaded access, so you can clone the Database and Keyspaces and pass them around as needed, without adding your own locking.
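The usage pattern this enables (clone a handle, move the clone into a thread) can be sketched with a std-only stand-in. `SharedMap` below is a hypothetical type for illustration; it says nothing about how Fjall synchronizes internally.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for an internally synchronized, cheaply clonable handle.
/// Cloning copies the Arc pointer, not the data.
#[derive(Clone, Default)]
struct SharedMap {
    inner: Arc<RwLock<BTreeMap<Vec<u8>, Vec<u8>>>>,
}

impl SharedMap {
    /// Takes `&self`: callers never deal with the lock directly.
    fn insert(&self, key: &[u8], value: &[u8]) {
        self.inner.write().unwrap().insert(key.to_vec(), value.to_vec());
    }

    fn len(&self) -> usize {
        self.inner.read().unwrap().len()
    }
}

/// Spawns `threads` writers, each inserting `per_thread` distinct keys
/// through its own clone of the handle.
fn write_from_threads(map: &SharedMap, threads: u8, per_thread: u8) {
    let handles: Vec<_> = (0..threads)
        .map(|t| {
            let map = map.clone(); // each thread owns its own handle
            thread::spawn(move || {
                for i in 0..per_thread {
                    map.insert(&[t, i], b"v");
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}
```

Application code following this pattern contains no `Mutex` of its own around the handle; all synchronization lives behind the handle's methods.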

For an async example, see the tokio example.

Memory usage

Generally, memory for loaded data, indexes etc. is managed on a per-block basis, and capped by the block cache capacity. Note that this also applies to returned values: When you hold a Slice, it keeps the backing buffer alive (which may be a block). If you know that you are going to keep a value around for a long time, you may want to copy it out into a new Vec<u8>, Box<[u8]>, Arc<[u8]> or new Slice (using Slice::new).
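Why a small value can pin a whole block is easiest to see in a toy model. The `BlockSlice` type and `load_block` function below are hypothetical and only mimic the behaviour described above using `Arc<[u8]>`; they are not Fjall's `Slice` implementation.

```rust
use std::sync::Arc;

/// Hypothetical model of a value that points into a shared, reference-counted block.
struct BlockSlice {
    block: Arc<[u8]>, // keeps the entire block allocated
    start: usize,
    len: usize,
}

impl BlockSlice {
    /// Borrows the value's bytes out of the backing block.
    fn as_bytes(&self) -> &[u8] {
        &self.block[self.start..self.start + self.len]
    }

    /// Copies the bytes into an independent allocation, so the
    /// backing block can be freed once this slice is dropped.
    fn to_owned_bytes(&self) -> Vec<u8> {
        self.as_bytes().to_vec()
    }
}

/// Simulates loading a block of `size` bytes from disk.
fn load_block(size: usize) -> Arc<[u8]> {
    Arc::from(vec![7u8; size])
}
```

In this model, a 4-byte `BlockSlice` into a 4 KiB block holds all 4 KiB until it is dropped, while the `Vec<u8>` returned by `to_owned_bytes` holds only the 4 bytes it copied.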

[!NOTE] It is recommended to configure the block cache capacity to be ~20-25% of the available memory - or more if the data set fits fully into memory.

Additionally, orthogonally to the block cache, each Keyspace has its own write buffer ("Memtable") which is the unit of data flushed back into the "proper" index structure.

Error handling

Fjall returns an error enum; however, its variants are mostly used for debugging and tracing purposes, so your application is not expected to handle specific errors.

It's best to let the application crash and restart, which is the safest way to recover from transient I/O errors.

Transactional modes

The backing store (lsm-tree) is an MVCC key-value store, allowing repeatable snapshot reads. However, this isolation level cannot perform read-modify-write operations without the chance of lost updates. Also, WriteBatch does not allow reading the intermediary state back as you would expect from a proper transaction. For that reason, if you need transactional semantics, you need to use one of the transactional database implementations (OptimisticTransactionDatabase or SingleWriterTransactionDatabase).

TL;DR: Fjall supports both transactional and non-transactional workloads. Chances are you want to use a transactional database, unless you know your workload does not need serializable transaction semantics.

Single writer

Opens a transactional database for single-writer (serialized) transactions. Single writer means only a single write transaction can run at a time. This is trivially serializable because it literally serializes write transactions.
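The single-writer idea can be modeled in a few lines of std-only Rust: one lock is held for the whole write transaction, so read-modify-write sequences cannot interleave. `SingleWriterStore` and `write_tx` are hypothetical names for this sketch and do not reflect Fjall's API or internals.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical single-writer store: one lock guards every write transaction.
#[derive(Clone, Default)]
struct SingleWriterStore {
    data: Arc<Mutex<BTreeMap<String, u64>>>,
}

impl SingleWriterStore {
    /// Runs `f` as a write transaction. The lock is held for the whole
    /// closure, so no other write transaction can run in between.
    fn write_tx<R>(&self, f: impl FnOnce(&mut BTreeMap<String, u64>) -> R) -> R {
        let mut guard = self.data.lock().unwrap();
        f(&mut *guard)
    }
}

/// Increments a shared counter from several threads and returns the final value.
fn concurrent_increments(threads: usize, per_thread: usize) -> u64 {
    let store = SingleWriterStore::default();
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let store = store.clone();
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Read-modify-write inside one transaction: no lost updates.
                    store.write_tx(|data| {
                        let current = data.get("counter").copied().unwrap_or(0);
                        data.insert("counter".to_string(), current + 1);
                    });
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    store.write_tx(|data| data.get("counter").copied().unwrap_or(0))
}
```

Because writers queue up behind the lock, a transaction in this model never has to be retried; the cost is that write transactions cannot run in parallel.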

Optimistic

Opens a transactional database for multi-writer, serializable transactions. Conflict checking is done using optimistic concurrency control, meaning transactions can conflict and may have to be rerun.
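What "may have to be rerun" means in practice is a retry loop around the transaction. The sketch below is a toy, std-only model of optimistic concurrency control on a single versioned value; `OptimisticCell`, `Conflict` and the method names are hypothetical, and Fjall's actual conflict detection is not shown here.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical versioned value: every successful commit bumps `version`.
#[derive(Default)]
struct Versioned {
    version: u64,
    value: u64,
}

/// Returned when another writer committed after our read.
#[derive(Debug, PartialEq)]
struct Conflict;

#[derive(Clone, Default)]
struct OptimisticCell {
    inner: Arc<Mutex<Versioned>>,
}

impl OptimisticCell {
    /// Snapshot read: returns the value and the version it was read at.
    fn read(&self) -> (u64, u64) {
        let guard = self.inner.lock().unwrap();
        (guard.value, guard.version)
    }

    /// Commits only if nobody else committed since `read_version`.
    fn commit(&self, read_version: u64, new_value: u64) -> Result<(), Conflict> {
        let mut guard = self.inner.lock().unwrap();
        if guard.version != read_version {
            return Err(Conflict);
        }
        guard.value = new_value;
        guard.version += 1;
        Ok(())
    }

    /// Read-modify-write with retry on conflict; returns the number of retries.
    fn update(&self, f: impl Fn(u64) -> u64) -> u32 {
        let mut retries = 0;
        loop {
            // The lock is NOT held between read and commit, so writers can
            // overlap; validation at commit time detects the overlap.
            let (value, version) = self.read();
            match self.commit(version, f(value)) {
                Ok(()) => return retries,
                Err(Conflict) => retries += 1,
            }
        }
    }
}
```

The closure passed to `update` may run more than once, so in this model it should be free of side effects other than computing the new value.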

Feature flags

lz4

Allows using LZ4 compression, powered by lz4_flex.

Enabled by default.

bytes_1

Uses bytes 1.x as the underlying Slice type. Otherwise, byteview is used instead.

Disabled by default.

Stable disk format

Future breaking changes will result in a major version bump and a migration path.

For the underlying LSM-tree implementation, see: https://crates.io/crates/lsm-tree.

Examples

See here for practical examples.

Contributing

How can you help?

Ask a question
- or join the Discord server: https://discord.com/invite/HvYGp4NFFk
Post benchmarks and things you created
Open a PR
- or see the open issues for something to pick up
Open an issue (bug report, weirdness)

License

All source code is licensed under MIT OR Apache-2.0.

All contributions are to be licensed as MIT OR Apache-2.0.

Commit count: 1902