| Crates.io | bitcoinleveldb-key |
| lib.rs | bitcoinleveldb-key |
| version | 0.1.19 |
| created_at | 2023-01-18 05:37:22.539968+00 |
| updated_at | 2025-12-01 16:59:21.625439+00 |
| description | Low-level LevelDB-compatible internal key encoding, comparison, and filter-policy utilities used by bitcoin-rs for on-disk key/value layout and MemTable/SSTable lookups. |
| homepage | |
| repository | https://github.com/klebs6/bitcoin-rs |
| max_upload_size | |
| id | 761524 |
| size | 262,605 |
Low-level internal key encoding utilities for the bitcoin-rs LevelDB port. This crate provides a faithful, byte-for-byte reproduction of LevelDB's internal key machinery, specialized for Bitcoin's storage engine.
It focuses on:
(user_key, sequence_number, value_type) into an opaque byte sequence.InternalKeyComparator that respects both user key order and sequence numbers.LevelDB (and this port) do not store raw user keys directly. Instead, they use an internal key:
internal_key := user_key || tag
where
tag := pack(sequence_number: 56 bits, value_type: 8 bits)
This design allows multiple versions of the same logical user key to coexist, with different sequence_number values and ValueType variants:
ValueType::TypeValue — key is present with a valueValueType::TypeDeletion — key is logically deleted at that sequenceThe sort order for internal keys is:
SliceComparator (or bytewise lexicographic fallback).This crate encapsulates these invariants so that higher-level components (MemTable, SSTable, DB implementation) can operate safely without re-implementing the encoding logic.
The following invariants are critical and preserved here:
ValueType is #[repr(u8)] and the discriminants are stable on disk — they must not change.SequenceNumber is a 64‑bit integer, but only the high 56 bits are used when packed, leaving the lower 8 bits for ValueType.#[repr(u8)]
#[derive(Copy, Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum ValueType {
TypeDeletion,
TypeValue,
}
Represents the logical type of a versioned key. The enum values are encoded directly into the low 8 bits of the tag and therefore must remain stable.
ValueType::from_tag(tag: u8) -> Option<Self> maps raw tags back to the enum, enforcing the valid subset {0x00, 0x01}.
pub trait Key {
fn key(&self) -> Slice;
}
pub trait Value {
fn value(&self) -> Slice;
}
Minimal traits abstracting over key/value providers that expose their representation as a Slice (defined elsewhere in the bitcoinleveldb ecosystem).
These traits are intentionally narrow: they push policy into higher layers and keep the encoding layer purely about bytes.
pub type SequenceNumber = u64;
#[derive(Setters, Getters)]
#[getset(get = "pub", set = "pub")]
pub struct ParsedInternalKey {
user_key: Slice,
sequence: SequenceNumber,
ty: ValueType,
}
ParsedInternalKey is a structured view of an internal key:
user_key: the raw user key bytessequence: the sequence numberty: the ValueTypeKey operations:
ParsedInternalKey::new(u: &Slice, seq: &SequenceNumber, t: ValueType) -> SelfParsedInternalKey::default() — uses an empty key and VALUE_TYPE_FOR_SEEK as configured elsewhere.debug_string(&self) -> String — stable debug formatting: '<escaped_user>' @ seq : type_tag.#[derive(Clone)]
pub struct InternalKey {
rep: String,
}
Represents a fully encoded internal key as a sequence of bytes. The encoding is not UTF‑8 by design; the use of String is purely a compatibility artifact with the original LevelDB.
Key methods:
InternalKey::new(user_key: &Slice, s: SequenceNumber, t: ValueType) -> Self
Encodes (user_key, s, t) into the rep buffer using append_internal_key.
fn decode_from(&mut self, s: &Slice) -> bool
Decodes an internal key from raw bytes into self.rep. Returns false for empty input.
fn encode(&self) -> Slice
Returns a Slice view over the internal key bytes. Panics if rep is empty.
fn user_key(&self) -> Slice
Extracts the user key portion by stripping the trailing 8‑byte tag.
fn set_from(&mut self, p: &ParsedInternalKey) / fn clear(&mut self)
Reset or rebuild rep from a parsed representation.
fn debug_string(&self) -> String
Parses the internal representation and renders a ParsedInternalKey debug string, or (bad)<escaped_bytes> if parsing fails.
The Debug implementation for InternalKey uses debug_string() instead of raw bytes, yielding stable, human-meaningful logging.
pub fn pack_sequence_and_type(seq: u64, t: ValueType) -> u64
Packs seq and t into a single u64:
seq (constrained by MAX_SEQUENCE_NUMBER elsewhere)t as u8 (constrained by VALUE_TYPE_FOR_SEEK elsewhere)Ensures:
seq <= MAX_SEQUENCE_NUMBERt as u64 <= VALUE_TYPE_FOR_SEEK as u64pub fn internal_key_encoding_length(k: &ParsedInternalKey) -> usize
Returns user_key_len + 8. This is deterministic and matches the C++ implementation.
pub fn parse_internal_key(internal_key: &Slice, result: *mut ParsedInternalKey) -> bool
Given an encoded internal key, fills *result with:
user_key = all but last 8 bytessequence = high 56 bits of the decoded tagty = ValueType::from_tag(low_8_bits)Returns false if:
< 8.VALUE_TYPE_FOR_SEEK.Caller provides a non-null pointer to a ParsedInternalKey instance.
pub fn append_internal_key(result: *mut String, k: &ParsedInternalKey)
Appends the encoded internal key for k onto the provided String buffer:
pack_sequence_and_type and encode_fixed64_le.These functions provide self-contained, allocation-free primitives for manipulating key bytes:
encode_fixed64_le(value: u64) -> [u8; 8] and decode_fixed64_le(ptr: *const u8) -> u64 — fixed 64‑bit little-endian encode/decode.put_varint32_vec(dst: &mut Vec<u8>, v: u32) — varint32 encoding into an existing Vec<u8>.decode_varint32(src: &[u8]) -> (u32, usize) — decodes a varint32, asserting on malformed input.slice_as_bytes(s: &Slice) -> &[u8] — safe view over a Slice.bytewise_compare(a: &[u8], b: &[u8]) -> i32 — lexicographic compare with LevelDB-style {-1,0,1} result.Key-range shortening helpers, used by compaction and iterators:
shorten_separator_user_keys(start: &[u8], limit: &[u8]) -> Option<Vec<u8>>
Attempts to construct a minimal user key between start and limit by bumping the first differing byte when possible.
find_short_successor_user_key(key: &mut Vec<u8>) -> bool
Makes key a short successor by incrementing the first non-0xff byte and truncating after it.
Debug utility:
escape_for_debug(input: &[u8]) -> Stringpub fn extract_user_key(internal_key: &Slice) -> Slice
Returns the user key portion from a full internal key. Panics if the key is shorter than 8 bytes.
This is the canonical way to slice away the tag when you know you have a valid internal key.
pub struct InternalFilterPolicy {
user_policy: *const dyn FilterPolicy,
}
Wraps a user-provided FilterPolicy (e.g., Bloom filter) and transparently converts internal keys into user keys before delegating.
Implements:
FilterPolicyNamedCreateFilterKeyMayMatchKey behavior:
create_filter(&self, keys: *const Slice, n: i32, dst: &mut Vec<u8>)
keys array in place: each entry becomes its user key by applying extract_user_key.user_policy.create_filter if provided.key_may_match(&self, k: &Slice, f: &Slice) -> bool
user_key from internal key k.user_policy.key_may_match.name(&self) -> Cow<'_, str>
InternalFilterPolicy::new(p: *const dyn FilterPolicy) -> Self constructs a wrapper around p. Passing a null pointer disables filtering.
pub struct InternalKeyComparator {
user_comparator: *const dyn SliceComparator,
}
Implements the precise ordering semantics for internal keys:
user_comparator if provided, else bytewise.sequence+type tag such that higher sequence numbers sort first.Implements:
SliceComparatorCompareNamedFindShortSuccessorFindShortestSeparatorCore methods:
pub fn compare_slices(&self, akey: &Slice, bkey: &Slice) -> i32
extract_user_key.user_comparator or bytewise_compare.decode_fixed64_le and orders by descending tag (a_num > b_num yields -1).compare_internal_key(&self, a: &InternalKey, b: &InternalKey) -> i32 is a convenience wrapper taking InternalKey objects.
Used during compaction to minimize index key sizes while preserving ordering.
find_short_successor(&self, k: &mut Vec<u8>)
k as an internal-key byte vector.find_short_successor_user_key).MAX_SEQUENCE_NUMBER and VALUE_TYPE_FOR_SEEK.find_shortest_separator(&self, start: &mut Vec<u8>, limit: &[u8])
start and limit.MAX_SEQUENCE_NUMBER and VALUE_TYPE_FOR_SEEK for the tag.start and limit in internal-key order.InternalKeyComparator::new(c: *const dyn SliceComparator) -> Self creates a comparator; null_slice_comparator() can be used to trigger fallback bytewise logic without providing a concrete comparator.
The name() implementation returns a stable identifier:
"leveldb.InternalKeyComparator"
pub struct LookupKey {
start: *const u8,
kstart: *const u8,
end: *const u8,
space: [u8; 200],
buf: Vec<u8>,
}
LookupKey assists with constructing the exact key bytes used when issuing point lookups against MemTables and internal iterators.
The layout encoded in buf is:
varint32(internal_key_len) || user_key || tag
Where internal_key_len = user_key_len + 8.
Construction:
impl LookupKey {
pub fn new(user_key: &Slice, sequence: SequenceNumber) -> Self;
}
The constructor:
buf.(sequence, VALUE_TYPE_FOR_SEEK) into a tag and appends it.start, kstart, and end inside buf for fast slicing.Accessors:
memtable_key(&self) -> Slice
Returns varint_len || internal_key — used directly as the key in the MemTable.
internal_key(&self) -> Slice
Returns user_key || tag, i.e., the internal key view.
user_key(&self) -> Slice
Returns the user portion, ensuring there is room for the tag.
The Drop implementation is trivial: the backing Vec and stack storage clean themselves up; raw pointers are only references into that storage and do not own anything.
pub fn null_slice_comparator() -> *const dyn SliceComparator
Constructs a syntactically valid but semantically null trait-object pointer to SliceComparator.
This is never dereferenced intentionally; it exists solely to exercise and validate the "no user comparator provided" code paths. It uses transmute over (0usize, 0usize) for the vtable/data pair and must be handled with care.
use bitcoinleveldb_key::{
ParsedInternalKey, InternalKey, ValueType,
SequenceNumber, internal_key_encoding_length,
parse_internal_key, extract_user_key,
};
use bitcoinleveldb_slice::Slice; // assuming this is the Slice type used in the project
fn roundtrip_example() {
let user_bytes = b"example-key";
let user_slice = unsafe { Slice::from_ptr_len(user_bytes.as_ptr(), user_bytes.len()) };
let seq: SequenceNumber = 42;
let ty = ValueType::TypeValue;
// Structured representation
let parsed = ParsedInternalKey::new(&user_slice, &seq, ty);
let encoded_len = internal_key_encoding_length(&parsed);
assert_eq!(encoded_len, user_bytes.len() + 8);
// Opaque InternalKey wrapper
let ikey = InternalKey::new(&user_slice, seq, ty);
let encoded_slice = ikey.encode();
// Parse back into ParsedInternalKey
let mut parsed_back = ParsedInternalKey::default();
let ok = parse_internal_key(&encoded_slice, &mut parsed_back as *mut ParsedInternalKey);
assert!(ok);
let back_user = extract_user_key(&encoded_slice);
let back_user_bytes = unsafe {
std::slice::from_raw_parts(*back_user.data(), *back_user.size())
};
assert_eq!(back_user_bytes, user_bytes);
assert_eq!(*parsed_back.sequence(), seq);
assert_eq!(*parsed_back.ty(), ty);
}
use bitcoinleveldb_key::{
InternalKey, InternalKeyComparator,
ValueType, SequenceNumber,
};
use bitcoinleveldb_slice::SliceComparator; // your implementation
fn comparator_example() {
// For pure bytewise ordering, you may pass a null comparator and rely on fallback.
let cmp = InternalKeyComparator::new(bitcoinleveldb_key::null_slice_comparator());
let uk1 = b"a";
let uk2 = b"a";
let s1: SequenceNumber = 1;
let s2: SequenceNumber = 2;
let k1 = InternalKey::new(
&unsafe { bitcoinleveldb_slice::Slice::from_ptr_len(uk1.as_ptr(), uk1.len()) },
s1,
ValueType::TypeValue,
);
let k2 = InternalKey::new(
&unsafe { bitcoinleveldb_slice::Slice::from_ptr_len(uk2.as_ptr(), uk2.len()) },
s2,
ValueType::TypeValue,
);
// k2 is newer (higher sequence) and must sort before k1.
assert!(cmp.compare_internal_key(&k2, &k1) < 0);
}
use bitcoinleveldb_key::{LookupKey, SequenceNumber};
use bitcoinleveldb_slice::Slice;
fn memtable_lookup_example() {
let user_key_bytes = b"height:00000010";
let user_key = unsafe { Slice::from_ptr_len(user_key_bytes.as_ptr(), user_key_bytes.len()) };
let snapshot_seq: SequenceNumber = 123_456;
let lookup = LookupKey::new(&user_key, snapshot_seq);
let memtable_key = lookup.memtable_key(); // varint32(len) || internal_key
let internal_key = lookup.internal_key(); // user_key || tag
let just_user = lookup.user_key(); // user_key only
// The MemTable layer treats `memtable_key` as the primary key.
drop((memtable_key, internal_key, just_user));
}
bitcoin-rs repository and is intended to be used in concert with sibling crates providing Slice, filter policies, comparators, MemTable implementations, and SSTable abstractions.ValueType discriminants or encoding routines if you care about read-compatibility.unsafe with raw pointers (Slice, trait-object pointers, etc.). The invariants are mirroring the C++ library; ensure any external uses respect lifetimes and aliasing constraints.bitcoinleveldb-keyklebs <none>If you extend or wrap this crate, preserve the encoding, comparison, and filter semantics to maintain compatibility with existing data and with canonical LevelDB behavior.