Crates.io | lsmtk |
lib.rs | lsmtk |
version | 0.11.0 |
source | src |
created_at | 2023-09-21 19:49:06.373868 |
updated_at | 2024-12-04 14:38:45.614294 |
description | lsmtk provides a log-structured-merge-graph |
homepage | |
repository | https://github.com/rescrv/blue |
max_upload_size | |
id | 979925 |
size | 172,064 |
This library provides an implementation of an log-structured merge tree; this merge tree uses a new compaction algorithm called triangular compaction that achieves a factor of 6.5x write amplification both in theory and in practice.
Triangular compaction takes note of a single observation: moving from one level to the next is inefficent; instead, the compaction should spread across levels to move data across as many levels as is possible.
The core of the compaction algorithm comes from the intuition embedded in this Python snippet:
sums = 0
count = 0
ingest = 0
for i in range(2**ceil(LSM_NUM_LEVELS)):
bits = 0
while i and (1 << bits) & i:
bits += 1
sums += (1 << bits) * TARGET_FILE_SIZE * LSM_LEVEL_0_NUM_FILES
ingest += TARGET_FILE_SIZE * LSM_LEVEL_0_NUM_FILES
count += 1
COMPACTION_AVERAGE_SIZE = sums / count + LSM_LEVEL_0_NUM_FILES * TARGET_FILE_SIZE # bytes
WRITE_AMPLIFICATION = sums / ingest
In this compaction we select the lowest N levels that are full, stopping at the
first level that's empty. In this way we amortize the cost of a compaction
similar to how a Vec
works in Rust or vector
in C++.
The triangular compaction algorithm generalizes this intuition to select triangles from the LSM tree such that the transitive closure of files under that level will be included in the compaction.
Check src/tree/mod.rs
for the compaction algorithm.
Write amplification is bounded above by 6.5x.
In an empirical benchmark where we ingested 44GiB of data we saw the write amplification stayed under a factor of 3x (trash is the ssts that were compacted away; it will be emptied periodically in a real system):
4.0K db/compaction
71M db/ingest
8.8M db/mani
44G db/sst
127G db/trash
In process counters confirms a 2.9x write amplification.
lsmtk.bytes_ingested = 46552563486
lsmtk.compaction.bytes_written = 135401535775
Active development.
This library provides pieces of an lsm graph. It will eventually grow to support everything necessary to create an embedded lsm graph analogous to LevelDB and RocksDB.
The latest documentation is always available at docs.rs.