| Crates.io | surrealkv |
| lib.rs | surrealkv |
| version | 0.17.1 |
| created_at | 2024-02-08 13:00:15.865719+00 |
| updated_at | 2026-01-16 00:02:47.859217+00 |
| description | A low-level, versioned, embedded, ACID-compliant, key-value database for Rust |
| homepage | https://github.com/surrealdb/surrealkv |
| repository | https://github.com/surrealdb/surrealkv |
| id | 1132066 |
| size | 1,926,249 |
⚠️ Development Status: SurrealKV is currently under active development and is not feature complete. The API and implementation may change significantly between versions. Use with caution in production environments.
SurrealKV is a versioned, low-level, persistent, embedded key-value database implemented in Rust using an LSM (Log-Structured Merge) tree architecture with built-in support for time-travel queries.
use surrealkv::TreeBuilder;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a new LSM tree using TreeBuilder
    let tree = TreeBuilder::new()
        .with_path("path/to/db".into())
        .build()?;

    // Start a read-write transaction
    let mut txn = tree.begin()?;

    // Set some key-value pairs
    txn.set(b"hello", b"world")?;

    // Commit the transaction (async)
    txn.commit().await?;

    Ok(())
}
SurrealKV can be configured through various options when creating a new LSM tree:
use surrealkv::TreeBuilder;

let tree = TreeBuilder::new()
    .with_path("path/to/db".into())            // Database directory path
    .with_max_memtable_size(100 * 1024 * 1024) // 100MB memtable size
    .with_block_size(4096)                     // 4KB block size
    .with_level_count(7)                       // Number of levels in LSM tree
    .build()?;
Options:
with_path() - Database directory where SSTables and WAL files are stored
with_max_memtable_size() - Size threshold for memtable before flushing to SSTable
with_block_size() - Size of data blocks in SSTables (affects read performance)
with_level_count() - Number of levels in the LSM tree structure
SurrealKV supports per-level compression for SSTable data blocks, allowing different compression algorithms for different LSM levels. By default, no compression is used.
use surrealkv::{CompressionType, Options, TreeBuilder};

// Default: No compression (for maximum write performance)
let tree = TreeBuilder::new()
    .with_path("path/to/db".into())
    .build()?;

// Explicitly disable compression (same as default)
let opts = Options::new()
    .with_path("path/to/db".into())
    .without_compression();
let tree = TreeBuilder::with_options(opts).build()?;

// Per-level compression configuration
let opts = Options::new()
    .with_path("path/to/db".into())
    .with_compression_per_level(vec![
        CompressionType::None,              // L0: No compression for speed
        CompressionType::SnappyCompression, // L1+: Snappy compression
    ]);
let tree = TreeBuilder::with_options(opts).build()?;

// Convenience: No compression on L0, Snappy on other levels
let opts = Options::new()
    .with_path("path/to/db".into())
    .with_l0_no_compression();
let tree = TreeBuilder::with_options(opts).build()?;
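One way to read the per-level vector above is as a lookup from level number to compression setting. The following is a std-only sketch, not SurrealKV code: `Compression` and `compression_for_level` are hypothetical names, and the rule that levels past the end of the vector reuse the last entry is an assumption inferred from the `L1+` comment, not confirmed library behavior.

```rust
/// Hypothetical stand-in for a compression setting (not SurrealKV's `CompressionType`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Compression {
    None,
    Snappy,
}

/// Pick the compression for `level` from a per-level list (index = level number).
/// Assumption: levels beyond the end of the list reuse the last entry,
/// and an empty list means no compression.
fn compression_for_level(per_level: &[Compression], level: usize) -> Compression {
    per_level
        .get(level)
        .or_else(|| per_level.last())
        .copied()
        .unwrap_or(Compression::None)
}
```

Under these assumptions, a two-entry list `[None, Snappy]` leaves L0 uncompressed and applies Snappy to every deeper level.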
Options:
without_compression() - Disable compression for all levels (default behavior)
with_compression_per_level() - Set compression type per level (vector index = level number)
with_l0_no_compression() - Convenience method for no compression on L0, Snappy compression on other levels
Compression Types:
CompressionType::None - No compression (fastest writes, largest files)
CompressionType::SnappyCompression - Snappy compression (good balance of speed and compression ratio)
The Value Log (VLog) separates large values from the LSM tree for more efficient storage and compaction.
// Assumes VLogChecksumLevel is exported at the crate root, like the other types used here
use surrealkv::{TreeBuilder, VLogChecksumLevel};

let tree = TreeBuilder::new()
    .with_path("path/to/db".into())
    .with_enable_vlog(true)                     // Enable VLog
    .with_vlog_max_file_size(256 * 1024 * 1024) // 256MB VLog file size
    .with_vlog_gc_discard_ratio(0.5)            // Trigger GC at 50% garbage
    .with_vlog_checksum_verification(VLogChecksumLevel::Full)
    .build()?;
Options:
with_enable_vlog() - Enable/disable Value Log for large value storage
with_vlog_max_file_size() - Maximum size of VLog files before rotation
with_vlog_gc_discard_ratio() - Threshold (0.0-1.0) for triggering VLog garbage collection
with_vlog_checksum_verification() - Checksum verification level (Disabled or Full)
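To make the discard-ratio threshold concrete, here is a small std-only sketch of the kind of check a value-log garbage collector could perform. It is not SurrealKV's implementation: `VlogFileStats`, its fields, and `should_collect` are hypothetical, and it assumes the ratio means discarded bytes divided by total bytes in a file.

```rust
/// Hypothetical per-file statistics; the field names are illustrative only.
struct VlogFileStats {
    total_bytes: u64,
    discarded_bytes: u64,
}

/// Returns true when the stale fraction of the file reaches `discard_ratio`.
/// Assumption: "discard ratio" = discarded bytes / total bytes, in 0.0..=1.0.
fn should_collect(stats: &VlogFileStats, discard_ratio: f64) -> bool {
    if stats.total_bytes == 0 {
        // An empty file has nothing to reclaim.
        return false;
    }
    let stale = stats.discarded_bytes as f64 / stats.total_bytes as f64;
    stale >= discard_ratio
}
```

With a ratio of 0.5, a file in which half of the bytes belong to overwritten or deleted values would qualify for collection, while a file that is only 10% stale would be left alone.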
Enable time-travel queries to read historical versions of your data:
use surrealkv::{Options, TreeBuilder};

let opts = Options::new()
    .with_path("path/to/db".into())
    .with_versioning(true, 0); // Enable versioning, retention_ns = 0 means no limit
let tree = TreeBuilder::with_options(opts).build()?;
Note: Versioning requires VLog to be enabled. When you call with_versioning(true, retention_ns), VLog is automatically enabled and configured appropriately.
use surrealkv::TreeBuilder;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let tree = TreeBuilder::new()
        .with_path("path/to/db".into())
        .build()?;

    // Write Transaction
    {
        let mut txn = tree.begin()?;

        // Set multiple key-value pairs
        txn.set(b"foo1", b"bar1")?;
        txn.set(b"foo2", b"bar2")?;

        // Commit changes (async)
        txn.commit().await?;
    }

    // Read Transaction
    {
        let txn = tree.begin()?;
        if let Some(value) = txn.get(b"foo1")? {
            println!("Value: {:?}", value);
        }
    }

    Ok(())
}
Note: The transaction API accepts flexible key and value types through the IntoBytes trait. You can use &[u8], &str, String, Vec<u8>, or Bytes for both keys and values.
SurrealKV supports three transaction modes for different use cases:
use surrealkv::Mode;

// Read-write transaction (default)
let mut txn = tree.begin()?;

// Read-only transaction - prevents any writes
let txn = tree.begin_with_mode(Mode::ReadOnly)?;

// Write-only transaction - optimized for writes, no reads allowed
let mut txn = tree.begin_with_mode(Mode::WriteOnly)?;
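The restrictions each mode imposes can be pictured with a toy model. This is a std-only sketch, not the library's transaction type: `TxnMode`, `ToyTxn`, and the error strings are hypothetical, and it assumes a disallowed operation returns an error rather than panicking.

```rust
use std::collections::BTreeMap;

/// Hypothetical mode enum mirroring the three modes described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TxnMode {
    ReadWrite,
    ReadOnly,
    WriteOnly,
}

/// Toy transaction that only enforces which operations a mode allows.
struct ToyTxn {
    mode: TxnMode,
    data: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl ToyTxn {
    fn new(mode: TxnMode) -> Self {
        ToyTxn { mode, data: BTreeMap::new() }
    }

    /// Writes are rejected in read-only mode.
    fn set(&mut self, key: &[u8], value: &[u8]) -> Result<(), &'static str> {
        if self.mode == TxnMode::ReadOnly {
            return Err("write attempted in a read-only transaction");
        }
        self.data.insert(key.to_vec(), value.to_vec());
        Ok(())
    }

    /// Reads are rejected in write-only mode.
    fn get(&self, key: &[u8]) -> Result<Option<&[u8]>, &'static str> {
        if self.mode == TxnMode::WriteOnly {
            return Err("read attempted in a write-only transaction");
        }
        Ok(self.data.get(key).map(|v| v.as_slice()))
    }
}
```

In this model a `ReadWrite` transaction accepts both calls, a `ReadOnly` one fails on `set`, and a `WriteOnly` one fails on `get`.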
Range operations support efficient iteration over key ranges:
// Range scan between keys (inclusive start, exclusive end)
let mut txn = tree.begin()?;
let range: Vec<_> = txn.range(b"key1", b"key5")?
    .map(|r| r.unwrap())
    .collect();

// Keys-only scan (faster, doesn't fetch values when vlog is enabled)
let keys: Vec<_> = txn.keys(b"key1", b"key5")?
    .map(|r| r.unwrap())
    .collect();

// Range with limit using .take()
let limited: Vec<_> = txn.range(b"key1", b"key9")?
    .take(10)
    .map(|r| r.unwrap())
    .collect();

// Delete a key
txn.delete(b"key1")?;
txn.commit().await?;
Note: Range iterators are double-ended, supporting both forward and backward iteration.
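The half-open bounds and double-ended iteration described above behave much like range scans over a sorted map in the standard library. The sketch below uses `std::collections::BTreeMap` as a stand-in (it does not call SurrealKV); `keys_in_range` and `keys_in_range_rev` are hypothetical helper names.

```rust
use std::collections::BTreeMap;

/// Keys in the half-open range [start, end), in ascending order.
fn keys_in_range(map: &BTreeMap<Vec<u8>, Vec<u8>>, start: &[u8], end: &[u8]) -> Vec<Vec<u8>> {
    if start > end {
        // BTreeMap::range panics on an inverted range, so treat it as empty.
        return Vec::new();
    }
    map.range(start.to_vec()..end.to_vec())
        .map(|(k, _)| k.clone())
        .collect()
}

/// Same range, walked from the back: the iterator is double-ended, so `.rev()` works.
fn keys_in_range_rev(map: &BTreeMap<Vec<u8>, Vec<u8>>, start: &[u8], end: &[u8]) -> Vec<Vec<u8>> {
    if start > end {
        return Vec::new();
    }
    map.range(start.to_vec()..end.to_vec())
        .rev()
        .map(|(k, _)| k.clone())
        .collect()
}
```

With keys `key1`, `key3`, `key5`, and `key7` stored, scanning from `key1` to `key5` yields `key1` and `key3` only, because the end bound is exclusive.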
Efficiently count keys in a range without iterating through all values:
use surrealkv::ReadOptions;

let mut txn = tree.begin()?;

// Count all keys between "key1" and "key9"
let count = txn.count(b"key1", b"key9")?;
println!("Found {} keys", count);

// Count with custom options (bounds, timestamp, etc.)
let options = ReadOptions::new()
    .with_iterate_lower_bound(Some(b"a".to_vec()))
    .with_iterate_upper_bound(Some(b"z".to_vec()));
let count = txn.count_with_options(&options)?;
Note: The count() operation is optimized and more efficient than manually counting iterator results, as it doesn't need to fetch or resolve values from the value log.
Control the durability guarantees for your transactions:
use surrealkv::Durability;

let mut txn = tree.begin()?;

// Eventual durability (default) - faster, data written to OS buffer
txn.set_durability(Durability::Eventual);

// Immediate durability - slower, fsync before commit returns
txn.set_durability(Durability::Immediate);

txn.set(b"key", b"value")?;
txn.commit().await?;
Durability Levels:
Eventual: Commits are guaranteed to be persistent eventually. Data is written to the kernel buffer but not fsynced before returning from commit(). This is the default and provides the best performance.
Immediate: Commits are guaranteed to be persistent as soon as commit() returns. Data is fsynced to disk before returning. This is slower but provides the strongest durability guarantees.
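The difference between the two levels comes down to whether an fsync happens before the call returns. The following std-only sketch shows that distinction on a plain file; it is not how SurrealKV writes its WAL, and `append_record` with its `fsync` flag is a hypothetical helper.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Append `record` to the file at `path`.
/// With `fsync == false` the bytes are only handed to the OS and may still sit
/// in the kernel buffer when this returns (comparable to Eventual).
/// With `fsync == true`, `sync_all` blocks until the OS reports the data has
/// reached the storage device (comparable to Immediate).
fn append_record(path: &Path, record: &[u8], fsync: bool) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    file.write_all(record)?;
    if fsync {
        file.sync_all()?;
    }
    Ok(())
}
```

Both calls leave the same bytes in the file under normal operation; the levels differ only in what survives a crash or power loss that occurs right after the call returns.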
Time-travel queries allow you to read historical versions of your data at specific points in time.
use surrealkv::{Options, TreeBuilder};

let opts = Options::new()
    .with_path("path/to/db".into())
    .with_versioning(true, 0); // retention_ns = 0 means no retention limit
let tree = TreeBuilder::with_options(opts).build()?;

// Write data with explicit timestamps
let mut tx = tree.begin()?;
tx.set_at_version(b"key1", b"value_v1", 100)?;
tx.commit().await?;

// Update with a new version at a later timestamp
let mut tx = tree.begin()?;
tx.set_at_version(b"key1", b"value_v2", 200)?;
tx.commit().await?;
Query data as it existed at a specific timestamp:
let tx = tree.begin()?;

// Get value at specific timestamp
let value = tx.get_at_version(b"key1", 100)?;
assert_eq!(value.unwrap().as_ref(), b"value_v1");

// Get value at later timestamp
let value = tx.get_at_version(b"key1", 200)?;
assert_eq!(value.unwrap().as_ref(), b"value_v2");

// Range query at specific timestamp
let range: Vec<_> = tx.range_at_version(b"key1", b"key9", 150, None)?
    .map(|r| r.unwrap())
    .collect();

// Keys-only query at timestamp (faster, when vlog enabled)
let keys: Vec<_> = tx.keys_at_version(b"key1", b"key9", 150, None)?
    .map(|r| r.unwrap())
    .collect();

// Count keys at a specific timestamp
let count = tx.count_at_version(b"key1", b"key9", 150)?;
println!("Found {} keys at timestamp 150", count);
Get all historical versions of keys in a range:
let tx = tree.begin()?;
let versions = tx.scan_all_versions(b"key1", b"key2", None)?;
for (key, value, timestamp, is_tombstone) in versions {
    if is_tombstone {
        println!("Key {:?} deleted at timestamp {}", key, timestamp);
    } else {
        println!("Key {:?} = {:?} at timestamp {}", key, value, timestamp);
    }
}
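As a mental model for point-in-time reads and tombstones, the toy store below keeps every version of a key and answers a read with the newest version at or before the requested timestamp. It is a std-only sketch, not SurrealKV's storage format: `VersionedMap` and its methods are hypothetical, and the "newest version at or before the timestamp" rule is an assumption consistent with the examples above.

```rust
use std::collections::BTreeMap;

/// Toy versioned store: key -> (timestamp -> Some(value) or None for a tombstone).
#[derive(Default)]
struct VersionedMap {
    entries: BTreeMap<Vec<u8>, BTreeMap<u64, Option<Vec<u8>>>>,
}

impl VersionedMap {
    /// Record `value` for `key` at timestamp `ts`.
    fn set_at(&mut self, key: &[u8], value: &[u8], ts: u64) {
        self.entries
            .entry(key.to_vec())
            .or_default()
            .insert(ts, Some(value.to_vec()));
    }

    /// Record a deletion (tombstone) for `key` at timestamp `ts`.
    fn delete_at(&mut self, key: &[u8], ts: u64) {
        self.entries.entry(key.to_vec()).or_default().insert(ts, None);
    }

    /// Newest version at or before `ts`; `None` if the key did not exist yet
    /// or the newest such version is a tombstone.
    fn get_at(&self, key: &[u8], ts: u64) -> Option<&[u8]> {
        let versions = self.entries.get(key)?;
        let (_, value) = versions.range(..=ts).next_back()?;
        value.as_deref()
    }
}
```

With versions written at timestamps 100 and 200, a read at 150 in this model sees the value written at 100, and a read after a tombstone sees nothing.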
Use ReadOptions for fine-grained control over read operations:
use surrealkv::ReadOptions;

let tx = tree.begin()?;

// Range query with bounds
let options = ReadOptions::new()
    .with_iterate_lower_bound(Some(b"a".to_vec()))
    .with_iterate_upper_bound(Some(b"z".to_vec()));
let results: Vec<_> = tx.range_with_options(&options)?
    .take(10) // Use .take() to limit results
    .map(|r| r.unwrap())
    .collect();

// Keys-only iteration (faster, doesn't fetch values from disk when vlog is enabled)
let options = ReadOptions::new()
    .with_keys_only(true)
    .with_iterate_lower_bound(Some(b"a".to_vec()))
    .with_iterate_upper_bound(Some(b"z".to_vec()));
let keys: Vec<_> = tx.keys_with_options(&options)?
    .take(100) // Use .take() to limit results
    .map(|r| r.unwrap())
    .collect();

// Point-in-time read with options (requires versioning enabled)
let options = ReadOptions::new()
    .with_timestamp(Some(12345))
    .with_iterate_lower_bound(Some(b"a".to_vec()))
    .with_iterate_upper_bound(Some(b"z".to_vec()));
let historical_data: Vec<_> = tx.range_with_options(&options)?
    .take(50) // Use .take() to limit results
    .map(|r| r.unwrap())
    .collect();
Create consistent point-in-time snapshots of your database for backup and recovery.
let tree = TreeBuilder::new()
    .with_path("path/to/db".into())
    .build()?;

// Insert some data
let mut txn = tree.begin()?;
txn.set(b"key1", b"value1")?;
txn.set(b"key2", b"value2")?;
txn.commit().await?;

// Create checkpoint
let checkpoint_dir = "path/to/checkpoint";
let metadata = tree.create_checkpoint(&checkpoint_dir)?;
println!("Checkpoint created at timestamp: {}", metadata.timestamp);
println!("Sequence number: {}", metadata.sequence_number);
println!("SSTable count: {}", metadata.sstable_count);
println!("Total size: {} bytes", metadata.total_size);

// Restore database to checkpoint state
tree.restore_from_checkpoint(&checkpoint_dir)?;

// Data is now restored to the checkpoint state
// Any data written after checkpoint creation is discarded
What's included in a checkpoint:
Note: Restoring from a checkpoint discards any pending writes in the active memtable and returns the database to the exact state when the checkpoint was created.
WebAssembly (WASM): Not supported due to fundamental incompatibilities:
Windows (x86_64): Basic functionality supported, but some features are limited:
SurrealKV has undergone a significant architectural evolution to address scalability challenges:
The original implementation used a versioned adaptive radix trie (VART) architecture with the following components:
Why the Change? The VART-based design had fundamental scalability limitations:
The new LSM (Log-Structured Merge) tree architecture provides:
This architectural change enables SurrealKV to handle larger-than-memory datasets.
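For readers new to LSM trees, the core write and read paths can be sketched in a few dozen lines. This is a deliberately simplified, std-only toy and not SurrealKV's implementation: `ToyLsm` and its fields are hypothetical, the flushed runs stay in memory instead of being written to disk as SSTables, and there is no WAL, compaction, or deletion.

```rust
use std::collections::BTreeMap;

/// Toy LSM tree: a mutable sorted memtable plus immutable sorted runs.
struct ToyLsm {
    memtable: BTreeMap<Vec<u8>, Vec<u8>>,
    memtable_bytes: usize,     // approximate: overwrites are counted again
    max_memtable_bytes: usize, // flush threshold
    runs: Vec<Vec<(Vec<u8>, Vec<u8>)>>, // oldest run first
}

impl ToyLsm {
    fn new(max_memtable_bytes: usize) -> Self {
        ToyLsm {
            memtable: BTreeMap::new(),
            memtable_bytes: 0,
            max_memtable_bytes,
            runs: Vec::new(),
        }
    }

    /// Writes go to the memtable; once it reaches the threshold it is flushed.
    fn set(&mut self, key: &[u8], value: &[u8]) {
        self.memtable_bytes += key.len() + value.len();
        self.memtable.insert(key.to_vec(), value.to_vec());
        if self.memtable_bytes >= self.max_memtable_bytes {
            self.flush();
        }
    }

    /// Turn the memtable into an immutable sorted run and start a fresh memtable.
    fn flush(&mut self) {
        let run: Vec<_> = std::mem::take(&mut self.memtable).into_iter().collect();
        if !run.is_empty() {
            self.runs.push(run);
        }
        self.memtable_bytes = 0;
    }

    /// Reads check the memtable first, then the runs from newest to oldest.
    fn get(&self, key: &[u8]) -> Option<&[u8]> {
        if let Some(v) = self.memtable.get(key) {
            return Some(v.as_slice());
        }
        for run in self.runs.iter().rev() {
            if let Ok(idx) = run.binary_search_by(|(k, _)| k.as_slice().cmp(key)) {
                return Some(run[idx].1.as_slice());
            }
        }
        None
    }
}
```

Because only the memtable must be held in memory for writes, a design along these lines can grow past available RAM once the runs are stored on disk, which is the property the paragraph above refers to.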
Licensed under the Apache License, Version 2.0 - see the LICENSE file for details.