cassadilia

Crates.io	cassadilia
lib.rs	cassadilia
version	0.4.1
created_at	2025-06-23 11:40:06.653472+00
updated_at	2025-09-25 11:16:24.78777+00
description	A content-addressable storage (CAS) system optimized for large blobs with read-mostly access patterns
homepage	https://github.com/broxus/cassadilia
repository	https://github.com/broxus/cassadilia
id	1722830
size	227,009

Vladimir (0xdeafbeef)

documentation

https://docs.rs/cassadilia

You want to store blobs. Blobs are huge files, 10 MB and up.
You have a read-mostly access pattern.
You want get_range(from, to) functionality.
You want zero write amplification: exactly one write per blob, with no compactions or similar background rewrites.
You tried to use RocksDB for this, and it killed your disk and brain :)
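
To make the get_range(from, to) requirement concrete, here is a minimal sketch of a ranged read over a single blob file using only the standard library. The function name `read_range` and the half-open `[from, to)` convention are assumptions for illustration; this is not Cassadilia's API.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Read the bytes in the half-open range [from, to) of the file at `path`.
/// Hypothetical helper for illustration, not part of Cassadilia.
pub fn read_range(path: &Path, from: u64, to: u64) -> io::Result<Vec<u8>> {
    if from > to {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "from > to"));
    }
    let mut file = File::open(path)?;
    // Jump straight to the start of the range; nothing before it is read.
    file.seek(SeekFrom::Start(from))?;
    let mut buf = Vec::new();
    // `take` caps the read at the requested length; a range that runs
    // past the end of the file simply yields a shorter buffer.
    file.take(to - from).read_to_end(&mut buf)?;
    Ok(buf)
}
```

Because each blob is one plain file that is never rewritten, a ranged read like this is a seek plus a bounded read.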

Store data in the CAS first, then index it. With this approach you don't need to handle consistency errors in user code: a blob without an index record is just an orphan, and you can get the list of orphaned blobs on startup and decide what to do with them. The opposite failure, a record in the index with no actual blob in the CAS, leaves the user with nothing to do. Hi, Mr. RocksDB :)
Blobs are stored in the classic CAS layout, /h/a/s/h.
The hash is BLAKE3.
The index is stored in a single file, which is periodically rewritten on WAL roll-over.
Internally, the index is a BTreeMap.
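
The /h/a/s/h layout can be sketched as a function from a hex-encoded hash to a path. The name `blob_path` and the `levels` and `width` parameters are hypothetical; the text above does not say how many directory levels Cassadilia uses or how wide each one is, so treat the values in the usage note as placeholders.

```rust
use std::path::{Path, PathBuf};

/// Map a hex-encoded hash to a sharded path under `root`.
/// `levels` directories of `width` hex characters each are peeled off the
/// front of the hash; the remainder becomes the file name.
/// Hypothetical helper; the real sharding depth is an assumption.
pub fn blob_path(root: &Path, hex_hash: &str, levels: usize, width: usize) -> Option<PathBuf> {
    if width == 0 || !hex_hash.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // Require enough characters to leave a non-empty file name.
    if hex_hash.len() <= levels * width {
        return None;
    }
    let (prefix, rest) = hex_hash.split_at(levels * width);
    let mut path = root.to_path_buf();
    for i in 0..levels {
        path.push(&prefix[i * width..(i + 1) * width]);
    }
    path.push(rest);
    Some(path)
}
```

With `levels = 2` and `width = 2`, the hash `abcdef0123` under `/cas` maps to `/cas/ab/cd/ef0123`. Sharding by prefix keeps any single directory from accumulating every blob.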

Cassadilia uses a two-level locking strategy to ensure consistency:

A single shared mutex protects CAS fs operations:

Directory creation: Prevents races when multiple threads create the same hash prefix directory
Blob commits: Ensures atomic renames when moving staged files to CAS
Blob deletions: Ensures multiple blob deletions happen atomically as a group

The lock is held only during filesystem metadata operations, not during index updates or blob content writes.
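
A rough sketch of that locking discipline, using only the standard library. The `CasFs` type and its method names are hypothetical and the error handling is simplified. The point it illustrates is that blob content is written to a staging file before the lock is taken, so the mutex covers only directory creation, the rename, and unlinks.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::Mutex;

/// Hypothetical wrapper around the CAS directory; not Cassadilia's type.
pub struct CasFs {
    fs_lock: Mutex<()>,
}

impl CasFs {
    pub fn new() -> Self {
        CasFs { fs_lock: Mutex::new(()) }
    }

    /// Move an already fully written staged file into its final CAS path.
    pub fn commit_blob(&self, staged: &Path, dest: &Path) -> io::Result<()> {
        let _guard = self.fs_lock.lock().unwrap_or_else(|e| e.into_inner());
        // Under the lock: create the hash prefix directory, then rename.
        if let Some(parent) = dest.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::rename(staged, dest)
    }

    /// Delete a group of blobs while holding the lock once for the group.
    pub fn delete_blobs(&self, paths: &[PathBuf]) -> io::Result<()> {
        let _guard = self.fs_lock.lock().unwrap_or_else(|e| e.into_inner());
        for path in paths {
            match fs::remove_file(path) {
                Ok(()) => {}
                // An already missing blob is treated as deleted.
                Err(e) if e.kind() == io::ErrorKind::NotFound => {}
                Err(e) => return Err(e),
            }
        }
        Ok(())
    }
}
```

Keeping the slow part, writing many megabytes of content, outside the critical section is what lets a single mutex serve all writers without becoming a bottleneck.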

We use scoped intents to prevent races between concurrent puts and deletes.

Register intent: record key -> blob_hash in pending_intents map; no refcount changes.
Commit: append WAL Put, apply it to the index updating refcounts, remove the intent, compute unreferenced blobs, exclude hashes still referenced by active intents, delete the rest.
Drop without commit: remove the current intent and restore any previously replaced intent; no refcount changes.
Remove ops: append WAL Remove, apply it to the state, filter unreferenced blobs against active intents, delete the rest before doing a checkpoint.
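
The four operations above can be modelled in memory as follows. This is a simplified sketch under stated assumptions: the `State` type, its field layout, and the method names are hypothetical, keys are plain strings, WAL appends and the actual file deletions are omitted, and each method returns the hashes that would be safe to delete.

```rust
use std::collections::{BTreeMap, HashMap, HashSet};

type Hash = [u8; 32];

/// Hypothetical in-memory model; not Cassadilia's internal types.
#[derive(Default)]
pub struct State {
    index: BTreeMap<String, Hash>,
    refcounts: HashMap<Hash, usize>,
    pending_intents: HashMap<String, Hash>,
}

impl State {
    /// Register intent: touches only `pending_intents`, no refcount changes.
    /// Returns the intent it replaced so a later drop can restore it.
    pub fn register_intent(&mut self, key: &str, hash: Hash) -> Option<Hash> {
        self.pending_intents.insert(key.to_string(), hash)
    }

    /// Drop without commit: remove the intent, restore the replaced one.
    pub fn drop_intent(&mut self, key: &str, previous: Option<Hash>) {
        match previous {
            Some(prev) => { self.pending_intents.insert(key.to_string(), prev); }
            None => { self.pending_intents.remove(key); }
        }
    }

    /// Commit (WAL append omitted): apply the put, return deletable hashes.
    pub fn commit(&mut self, key: &str) -> Vec<Hash> {
        let Some(hash) = self.pending_intents.remove(key) else { return Vec::new() };
        *self.refcounts.entry(hash).or_insert(0) += 1;
        let mut candidates = Vec::new();
        if let Some(old) = self.index.insert(key.to_string(), hash) {
            if self.release(old) { candidates.push(old); }
        }
        self.filter_active(candidates)
    }

    /// Remove (WAL append omitted): drop the key, return deletable hashes.
    pub fn remove(&mut self, key: &str) -> Vec<Hash> {
        let mut candidates = Vec::new();
        if let Some(old) = self.index.remove(key) {
            if self.release(old) { candidates.push(old); }
        }
        self.filter_active(candidates)
    }

    /// Decrement a refcount; true if the hash is no longer referenced.
    fn release(&mut self, hash: Hash) -> bool {
        let Some(n) = self.refcounts.get_mut(&hash) else { return false };
        *n -= 1;
        if *n == 0 { self.refcounts.remove(&hash); true } else { false }
    }

    /// Exclude hashes that an active intent still points at.
    fn filter_active(&self, candidates: Vec<Hash>) -> Vec<Hash> {
        let active: HashSet<Hash> = self.pending_intents.values().copied().collect();
        candidates.into_iter().filter(|h| !active.contains(h)).collect()
    }
}
```

The filter step is what closes the race: if one writer has registered an intent for a hash while another removes the last key that referenced it, the blob survives until the intent is committed or dropped.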

Intents are in-memory only; if a process exits before commit, any staged files become orphans, handled by startup orphan scanning.
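
Startup orphan scanning can be sketched as a directory walk that reports every file whose hash is not referenced by the index. The function name `find_orphans` is hypothetical, and the sketch assumes that a blob's hex hash is the concatenation of its path components below the CAS root and that the set of referenced hashes has already been loaded from the index.

```rust
use std::collections::HashSet;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Walk `root` and return the files whose reconstructed hash is not in
/// `referenced`. Hypothetical helper; not Cassadilia's implementation.
pub fn find_orphans(root: &Path, referenced: &HashSet<String>) -> io::Result<Vec<PathBuf>> {
    let mut orphans = Vec::new();
    let mut stack = vec![root.to_path_buf()];
    while let Some(dir) = stack.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            if path.is_dir() {
                stack.push(path);
                continue;
            }
            // Rebuild the hex hash by joining the components below `root`,
            // e.g. ab/cd/ef0123 becomes "abcdef0123".
            let hash: String = path
                .strip_prefix(root)
                .unwrap_or(path.as_path())
                .components()
                .map(|c| c.as_os_str().to_string_lossy().into_owned())
                .collect();
            if !referenced.contains(&hash) {
                orphans.push(path);
            }
        }
    }
    // Sort so the result does not depend on directory iteration order.
    orphans.sort();
    Ok(orphans)
}
```

What to do with the returned paths (delete them, re-index them, or just log them) is left to the caller, which matches the "do something with them" wording above.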

Commit count: 29