| Crates.io | jdb_pgm |
| lib.rs | jdb_pgm |
| version | 0.3.16 |
| created_at | 2026-01-05 07:32:15.964868+00 |
| updated_at | 2026-01-16 21:43:51.279181+00 |
| description | Ultra-fast learned index for sorted keys / 面向排序键的超快学习型索引 |
| homepage | https://github.com/js0-site/rust |
| repository | https://github.com/js0-site/rust.git |
| max_upload_size | |
| id | 2023306 |
| size | 236,640 |
A highly optimized, single-threaded Rust implementation of the Pgm-index (Piecewise Geometric Model index), designed for ultra-low latency lookups and minimal memory overhead.
jdb_pgm is a specialized reimplementation of the Pgm-index data structure. It approximates the distribution of sorted keys using piecewise linear models, enabling search operations with O(log ε) complexity.
This crate focuses on single-threaded performance, preparing for a "one thread per CPU" architecture. By removing concurrency overhead and optimizing memory layout (e.g., SIMD-friendly loops), it achieves statistically significant speedups over standard binary search and traditional tree-based indexes.
Add this to your Cargo.toml:
[dependencies]
jdb_pgm = "0.3"
Pgm<K> - Core index without data ownership (ideal for SSTable, mmap scenarios)
use jdb_pgm::Pgm;
fn main() {
let data: Vec<u64> = (0..1_000_000).collect();
// Build index from data reference
let pgm = Pgm::new(&data, 32, true).unwrap();
// Get predicted search range
let range = pgm.predict_range(123_456);
// Search in your own data store
// Note: Range is not Copy, so we slice with range.start..range.end and keep `range` usable afterwards.
if let Ok(pos) = data[range.start..range.end].binary_search(&123_456) {
println!("Found at index: {}", range.start + pos);
}
}
PgmData<K> - Index with data ownership (convenient for in-memory use)
use jdb_pgm::PgmData;
fn main() {
let data: Vec<u64> = (0..1_000_000).collect();
// Build index and take ownership of data
let index = PgmData::load(data, 32, true).unwrap();
// Direct lookup
if let Some(pos) = index.get(123_456) {
println!("Found at index: {}", pos);
}
}
- data (default): Enables the PgmData struct with data ownership
- bitcode: Enables serialization via bitcode
- key_to_u64: Enables the key_to_u64() helper for byte keys
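A hedged Cargo.toml sketch for enabling the optional features (assuming data is the crate's only default feature):

```toml
[dependencies]
# Core index plus bitcode serialization and the byte-key helper.
jdb_pgm = { version = "0.3", features = ["bitcode", "key_to_u64"] }

# Index-only build (Pgm<K> without PgmData), if data ownership is not needed:
# jdb_pgm = { version = "0.3", default-features = false }
```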
Based on internal benchmarks with 1,000,000 u64 keys (jdb_pgm's Pgm does not own data, so memory is index-only):

- Index memory is only 1.01 MB at ε=32 (pgm_index uses 8.35 MB).

This crate (jdb_pgm) is a specialized fork/rewrite of the original concept found in pgm_index. While the original library aims for general-purpose usage with multi-threading support (Rayon), jdb_pgm takes a different approach:
| Feature | jdb_pgm | pgm_index |
|---|---|---|
| Threading | Single-threaded | Multi-threaded (Rayon) |
| Segment Building | Shrinking Cone O(N) | Parallel Least Squares |
| Prediction Model | slope * key + intercept | (key - intercept) / slope |
| Prediction Accuracy | ε-bounded (guaranteed) | Heuristic (not guaranteed) |
| Memory | Arc-free, zero-copy | Arc<Vec<K>> |
| Data Ownership | Optional (Pgm vs PgmData) | Always owns data |
| Dependencies | Minimal | rayon, num_cpus, num-traits |
The original pgm_index introduces Rayon for parallel processing. However, in modern high-performance databases (like ScyllaDB or specialized engines), the thread-per-core architecture is often superior.
jdb_pgm: Shrinking Cone (Optimal PLA)
The streaming algorithm guarantees that prediction error never exceeds ε, while least squares fitting provides no such guarantee.
// O(N) streaming algorithm with a guaranteed ε-bound
while end < n {
    dx = key[end] - key[first]
    slope_lo = (end - first - ε) / dx
    slope_hi = (end - first + ε) / dx
    // The new point would shrink the cone to this intersection
    if max(min_slope, slope_lo) > min(max_slope, slope_hi) { break } // cone collapsed
    min_slope = max(min_slope, slope_lo) // update shrinking cone bounds
    max_slope = min(max_slope, slope_hi)
    end += 1
}
slope = (min_slope + max_slope) / 2
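For readers who want the cone update spelled out, here is a minimal, self-contained Rust sketch of the idea (illustrative only; the crate's actual build_segments handles more cases, e.g. duplicate keys):

```rust
/// One linear segment: predicted position ≈ slope * key + intercept.
#[derive(Debug)]
struct Seg {
    slope: f64,
    intercept: f64,
    start: usize,
    end: usize, // exclusive
}

/// Minimal shrinking-cone segmentation over strictly increasing u64 keys.
fn cone_segments(keys: &[u64], eps: f64) -> Vec<Seg> {
    let mut segs = Vec::new();
    let mut first = 0usize;
    while first < keys.len() {
        let (mut lo, mut hi) = (f64::NEG_INFINITY, f64::INFINITY);
        let mut end = first + 1;
        while end < keys.len() {
            let dx = (keys[end] - keys[first]) as f64; // > 0: keys strictly increase
            let dy = (end - first) as f64;
            // The new point confines the slope to [(dy - eps) / dx, (dy + eps) / dx].
            let (new_lo, new_hi) = (lo.max((dy - eps) / dx), hi.min((dy + eps) / dx));
            if new_lo > new_hi {
                break; // cone collapsed: close this segment, start a new one here
            }
            lo = new_lo;
            hi = new_hi;
            end += 1;
        }
        let slope = if lo.is_finite() { (lo + hi) / 2.0 } else { 0.0 };
        // Anchor the line so keys[first] predicts exactly position `first`;
        // every point in [first, end) is then within eps of its prediction.
        let intercept = first as f64 - slope * keys[first] as f64;
        segs.push(Seg { slope, intercept, start: first, end });
        first = end;
    }
    segs
}

fn main() {
    let keys: Vec<u64> = (0..1_000u64).map(|i| i * i).collect(); // non-linear distribution
    let segs = cone_segments(&keys, 32.0);
    println!("built {} segments, first = {:?}", segs.len(), segs[0]);
}
```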
pgm_index: Parallel Least Squares
// Divides data into fixed chunks, fits each with least squares
target_segments = optimal_segment_count_adaptive(data, epsilon)
segments = (0..target_segments).par_iter().map(|i| {
fit_segment(&data[start..end], start) // least squares fit
}).collect()
jdb_pgm: pos = slope * key + intercept
pgm_index: pos = (key - intercept) / slope
While based on the same Pgm theory, the implementation details are significantly more aggressive:

- Replaces floating-point rounding (round/floor) with bitwise-based integer casting (as isize + 0.5).
- Optimizes the predict and search phases with manual clamping and branchless logic, drastically reducing pipeline stalls.
- Pre-allocates the segment vector to (N / 2ε) ahead of time, effectively eliminating vector reallocations during construction.
- Works with primitive key types: u8, u16, u32, u64, i8, i16, i32, i64.
- The epsilon parameter strictly controls the search range.
- Pgm for external data, PgmData for owned data.

The index construction and lookup process allows for extremely fast predictions of key positions.
graph TD
subgraph Construction [Construction Phase]
A[Sorted Data] -->|build_segments| B[Linear Segments]
B -->|build_lut| C[Look-up Table]
end
subgraph Query [Query Phase]
Q[Search Key] -->|find_seg| S[Select Segment]
S -->|predict| P[Approximate Pos]
P -->|binary_search| F[Final Position]
end
C -.-> S
B -.-> S
The dataset is scanned to create Piecewise Linear Models (segments) that approximate the key distribution within an error ε. Each segment stores:
- min_key, max_key: key range boundaries
- slope, intercept: linear model parameters
- start_idx, end_idx: data position range

A secondary lookup table (LUT) enables O(1) segment selection by mapping key ranges to segment indices.
1. Use the LUT to select the segment that covers the key.
2. Compute pos = slope * key + intercept to get an approximate position.
3. Binary search within [pos - ε, pos + ε] for the exact match.

This design ensures that the binary search operates on a tiny window (typically < 64 elements) regardless of dataset size, achieving near-constant lookup time.
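A minimal sketch of that bounded window (the helper below is illustrative, not part of jdb_pgm's API):

```rust
/// Clamp an ε window around a predicted position, then binary-search
/// only inside that window.
fn search_window(data: &[u64], pred: usize, eps: usize, key: u64) -> Option<usize> {
    let lo = pred.saturating_sub(eps).min(data.len());
    let hi = (pred + eps + 1).min(data.len());
    data[lo..hi].binary_search(&key).ok().map(|i| lo + i)
}
```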
Dependencies: aok, static_init, criterion (for benchmarks)

jdb_pgm/
├── src/
│ ├── lib.rs # Exports and entry point
│ ├── pgm.rs # Core Pgm struct (no data ownership)
│ ├── data.rs # PgmData struct (with data ownership)
│ ├── build.rs # Segment building algorithm
│ ├── types.rs # Key trait, Segment, PgmStats
│ ├── consts.rs # Constants
│ └── error.rs # Error types
├── tests/
│ ├── pgm.rs # Integration tests for Pgm
│ └── data.rs # Integration tests for PgmData
├── benches/
│ ├── main.rs # Criterion benchmark suite
│ └── bench_*.rs # Individual benchmark files
├── examples/
│ ├── simple_benchmark.rs
│ └── test_bitcode.rs
└── readme/
├── en.md
└── zh.md
Pgm<K> (Core, no data ownership)

The primary index structure that holds only the index metadata, not the data itself. Ideal for scenarios where data is stored externally (SSTable, memory-mapped files).
Construction
pub fn new(data: &[K], epsilon: usize, check_sorted: bool) -> Result<Self>
Builds the index from a sorted data slice.
- data: reference to the sorted key array
- epsilon: maximum prediction error (controls segment granularity)
- check_sorted: if true, validates that the data is sorted before building

Returns Result<Pgm<K>> with a potential PgmError.

Prediction Methods
pub fn predict(&self, key: K) -> usize
Returns the predicted position for a key using the linear model.
pub fn predict_range(&self, key: K) -> std::ops::Range<usize>
Returns the search range start..end for a key, bounded by epsilon.
Search Methods
pub fn find<'a, Q, F>(&self, key: &Q, get_key: F) -> usize
where
Q: ToKey<K> + ?Sized,
F: Fn(usize) -> Option<&'a [u8]>,
Finds the insertion point for a key using byte comparison. Returns the index where the key would be inserted.
pub fn find_key<F>(&self, key: K, get_key: F) -> usize
where
F: Fn(usize) -> Option<K>,
Finds the insertion point using Key type comparison.
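A hedged usage sketch for find_key, assuming keys live in an external store read through a closure (the data layout here is illustrative):

```rust
use jdb_pgm::Pgm;

fn main() {
    // Stand-in for an external key column (e.g. an SSTable or mmap'd file).
    let external: Vec<u64> = (0..1_000_000u64).map(|i| i * 2).collect();
    let pgm = Pgm::new(&external, 32, true).unwrap();

    // find_key only needs a way to read the key stored at a given position.
    let pos = pgm.find_key(2_467, |i| external.get(i).copied());
    println!("insertion point for 2_467: {pos}"); // expected: 1_234
}
```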
Metadata Methods
pub fn segment_count(&self) -> usize
Returns the number of segments in the index.
pub fn avg_segment_size(&self) -> f64
Returns the average number of keys per segment.
pub fn mem_usage(&self) -> usize
Returns memory usage of the index (excluding data).
pub fn len(&self) -> usize
pub fn is_empty(&self) -> bool
Standard collection methods.
PgmData<K> (With data ownership, requires the data feature)

Convenient wrapper that owns both the index and the data, providing direct lookup methods.
Construction
pub fn load(data: Vec<K>, epsilon: usize, check_sorted: bool) -> Result<Self>
Builds the index and takes ownership of data.
Lookup Methods
pub fn get(&self, key: K) -> Option<usize>
Returns the index of the key if found, or None.
pub fn get_many<'a, I>(&'a self, keys: I) -> impl Iterator<Item = Option<usize>> + 'a
where
I: IntoIterator<Item = K> + 'a,
Batch lookup returning an iterator of results.
pub fn count_hits<I>(&self, keys: I) -> usize
where
I: IntoIterator<Item = K>,
Counts how many keys from the iterator exist in the index.
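A small usage sketch of the batch APIs (requires the default data feature; expected output follows the documented behaviour above):

```rust
use jdb_pgm::PgmData;

fn main() {
    let data: Vec<u64> = (0..1_000).collect();
    let index = PgmData::load(data, 32, true).unwrap();

    // get_many: lazy batch lookup, one Option<usize> per query key.
    let queries = [5u64, 500, 9_999];
    let positions: Vec<Option<usize>> = index.get_many(queries).collect();
    println!("{positions:?}"); // expected: [Some(5), Some(500), None]

    // count_hits: how many of the query keys exist in the index.
    println!("hits: {}", index.count_hits(queries));
}
```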
Metadata Methods
pub fn data(&self) -> &[K]
Returns reference to underlying data.
pub fn memory_usage(&self) -> usize
Returns total memory usage (data + index).
pub fn stats(&self) -> PgmStats
Returns comprehensive statistics including segment count, average segment size, and memory usage.
Segment<K>

Represents a single linear segment in the index.
pub struct Segment<K: Key> {
pub min_key: K, // Minimum key in this segment
pub max_key: K, // Maximum key in this segment
pub slope: f64, // Linear model slope
pub intercept: f64, // Linear model intercept
pub start_idx: u32, // Starting data index
pub end_idx: u32, // Ending data index (exclusive)
}
PgmStats

Index statistics structure.
pub struct PgmStats {
pub segments: usize, // Number of segments
pub avg_segment_size: f64, // Average keys per segment
pub memory_bytes: usize, // Total memory usage
}
Key Trait

Trait defining requirements for indexable key types.
pub trait Key: Copy + Send + Sync + Ord + Debug + 'static {
fn as_f64(self) -> f64;
}
Implemented for: u8, i8, u16, i16, u32, i32, u64, i64, u128, i128, usize, isize.
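As a sketch, implementing Key for a hypothetical newtype could look like this (assuming as_f64 is the trait's only required method, as shown above):

```rust
use jdb_pgm::Key;

/// Hypothetical millisecond-timestamp key.
#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Debug)]
struct TimestampMs(u64);

impl Key for TimestampMs {
    // The linear models operate in f64 space, so a key only has to
    // expose a monotone mapping to f64.
    fn as_f64(self) -> f64 {
        self.0 as f64
    }
}
```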
ToKey<K> Trait

Trait for types that can be converted to a Key and provide a byte reference.
pub trait ToKey<K: Key> {
fn to_key(&self) -> K;
fn as_bytes(&self) -> &[u8];
}
Implemented for: [u8], &[u8], Vec<u8>, Box<[u8]>, [u8; N].
PgmError

Error types for index operations.
pub enum PgmError {
EmptyData, // Data cannot be empty
NotSorted, // Data must be sorted
InvalidEpsilon { // Epsilon must be >= MIN_EPSILON
provided: usize,
min: usize,
},
}
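A short sketch of matching on these variants when building an index (assuming the enum is exactly as listed):

```rust
use jdb_pgm::{Pgm, PgmError};

fn main() {
    let unsorted = vec![3u64, 1, 2];
    match Pgm::new(&unsorted, 32, true) {
        Ok(pgm) => println!("built {} segments", pgm.segment_count()),
        Err(PgmError::NotSorted) => eprintln!("keys must be sorted ascending"),
        Err(PgmError::EmptyData) => eprintln!("cannot index an empty slice"),
        Err(PgmError::InvalidEpsilon { provided, min }) => {
            eprintln!("epsilon {provided} is below the minimum {min}")
        }
    }
}
```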
pub fn key_to_u64(key: &[u8]) -> u64 // Requires `key_to_u64` feature
Converts key bytes to a u64 prefix (big-endian, zero-padded).
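For intuition, the conversion behaves like this illustrative re-implementation (not the crate's code; it just mirrors the "big-endian prefix, zero-padded" description):

```rust
// Illustrative only: big-endian 8-byte prefix, zero-padded on the right.
fn key_prefix_u64(key: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    let n = key.len().min(8);
    buf[..n].copy_from_slice(&key[..n]);
    u64::from_be_bytes(buf)
}

fn main() {
    // Byte order is preserved: "abc" < "abd", and the prefixes compare the same way.
    assert!(key_prefix_u64(b"abc") < key_prefix_u64(b"abd"));
}
```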
pub fn build_segments<K: Key>(data: &[K], epsilon: usize) -> Vec<Segment<K>>
Low-level function to build segments using the shrinking cone algorithm.
pub fn build_lut<K: Key>(data: &[K], segments: &[Segment<K>]) -> (Vec<u32>, f64, f64)
Low-level function to build the lookup table.
In the era of "Big Data," traditional B-Trees became a bottleneck due to their memory consumption and cache inefficiency. Each node in a B-Tree stores multiple keys and pointers, leading to poor cache locality and high memory overhead.
The breakthrough came in 2020 when Paolo Ferragina and Giorgio Vinciguerra introduced the Piecewise Geometric Model (Pgm) index in their paper "The PGM-index: a fully-dynamic compressed learned index with provable worst-case bounds." Their key insight was simple yet revolutionary: why store every key when the data's distribution often follows a predictable pattern?
By treating the index as a machine learning problem—learning the Cumulative Distribution Function (CDF) of the data—they reduced the index size by orders of magnitude while maintaining O(log N) worst-case performance. The Pgm-index approximates the key distribution using piecewise linear functions, where each segment guarantees that the prediction error never exceeds a specified epsilon.
Before learned indexes, the field was dominated by heuristic approaches like B-Trees (1970s), Skip Lists (1989), and various hash-based structures. These all relied on predetermined structural properties rather than learning from the data itself. The Pgm-index pioneered the concept of "learned indexes" that adapt to data characteristics, opening a new research direction at the intersection of databases and machine learning.
This project, jdb_pgm, takes that concept and strips it down to its bare metal essentials for Rust. By focusing on single-threaded performance and eliminating overhead, it prioritizes raw speed on modern CPUs where every nanosecond counts—exactly what high-performance databases need in the era of thread-per-core architectures.
Performance comparison of Pgm-Index against binary search, HashMap, and BTreeMap at different epsilon values.
| Algorithm | Epsilon | Mean Time | Std Dev | Throughput | Memory |
|---|---|---|---|---|---|
| jdb_pgm | 32 | 17.85ns | 58.01ns | 56.01M/s | 1.01 MB |
| jdb_pgm | 64 | 17.91ns | 56.67ns | 55.83M/s | 512.00 KB |
| pgm_index | 32 | 20.13ns | 54.58ns | 49.67M/s | 8.35 MB |
| pgm_index | 64 | 23.16ns | 66.31ns | 43.18M/s | 8.38 MB |
| pgm_index | 128 | 25.91ns | 62.66ns | 38.60M/s | 8.02 MB |
| jdb_pgm | 128 | 26.15ns | 96.65ns | 38.25M/s | 256.00 KB |
| HashMap | null | 39.99ns | 130.55ns | 25.00M/s | 40.00 MB |
| Binary Search | null | 40.89ns | 79.06ns | 24.46M/s | - |
| BTreeMap | null | 84.21ns | 99.32ns | 11.87M/s | 16.83 MB |
| Data Size | Epsilon | jdb_pgm (Max) | jdb_pgm (Avg) | pgm_index (Max) | pgm_index (Avg) |
|---|---|---|---|---|---|
| 1,000,000 | 128 | 128 | 46.80 | 1024 | 511.28 |
| 1,000,000 | 32 | 32 | 11.35 | 256 | 127.48 |
| 1,000,000 | 64 | 64 | 22.59 | 512 | 255.39 |
| Data Size | Epsilon | jdb_pgm (Time) | pgm_index (Time) | Speedup |
|---|---|---|---|---|
| 1,000,000 | 128 | 1.28ms | 1.26ms | 0.98x |
| 1,000,000 | 32 | 1.28ms | 1.27ms | 0.99x |
| 1,000,000 | 64 | 1.28ms | 1.20ms | 0.94x |
Query Count: 1,500,000; Data Sizes: 10,000, 100,000, 1,000,000; Epsilon Values: 32, 64, 128
Epsilon (ε) controls the accuracy-speed trade-off:
Mathematical definition: ε defines the maximum absolute error between the predicted position and the actual position in the data array. When calling load(data, epsilon, ...), ε guarantees |pred - actual| ≤ ε, where positions are indices within the data array of length data.len().
Example: For 1M elements with ε=32, if the actual key is at position 1000:
ε=32 predicts position between 968-1032, then checks up to 64 elements
ε=128 predicts position between 872-1128, then checks up to 256 elements
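In comparison counts, the final binary search costs about log2(2ε) steps: log2(64) = 6 comparisons at ε=32 and log2(256) = 8 at ε=128, versus log2(1,000,000) ≈ 20 for a plain binary search over the whole array.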
Pgm-Index (Piecewise Geometric Model Index) is a learned index structure that approximates the distribution of keys with piecewise linear models. It provides O(log ε) search time with guaranteed error bounds, where ε controls the trade-off between memory and speed.
Binary search is the baseline for sorted array lookup. Pgm-Index aims to beat it by shrinking the search window from the full array (O(log N) comparisons) to a 2ε window (O(log ε) comparisons), at the cost of a small index.
This project is an open-source component of js0.site ⋅ Refactoring the Internet Plan.
We are redefining the development paradigm of the Internet in a componentized way. Welcome to follow us.