| Crates.io | idpath |
| lib.rs | idpath |
| version | 0.1.1 |
| created_at | 2025-12-26 03:44:23.021338+00 |
| updated_at | 2025-12-26 03:44:23.021338+00 |
| description | Hierarchical path encoding for scalable storage systems / 为可扩展存储系统设计的层次化路径编码 |
| homepage | https://github.com/js0/rust/tree/main/idpath |
| repository | https://github.com/js0/rust.git |
| max_upload_size | |
| id | 2005235 |
| size | 52,289 |
idpath provides utilities to encode 64-bit integers into hierarchical filesystem paths. By placing low-order bits at higher directory levels, it ensures uniform distribution of files across directories, preventing filesystem performance degradation caused by directory saturation.
HipStr and stack buffers to eliminate heap allocations during successful decoding._, -, ~).use idpath::{encode, decode};
fn main() -> idpath::Result<()> {
let id = 1234567u64;
let prefix = "/storage";
// Encode
let path = encode(prefix, id);
println!("Path: {}", path);
// Decode
let decoded_id = decode(&path)?;
assert_eq!(id, decoded_id);
Ok(())
}
Modern filesystems (like ext4 or XFS) experience performance drops when a single directory contains thousands of entries. Hierarchical structures mitigate this. By reversing bit significance during path construction (Low-Bits-First), idpath avoids hotspots even when IDs are generated sequentially (e.g., from databases).
graph TD
ID[u64 ID] --> B32[Crockford Base32]
B32 --> Reorder{Bits Reordering}
Reorder -- "len 0-2" --> P1[Level 1: prefix/xx_]
Reorder -- "len 3" --> P2[Level 2: prefix/xx/x-]
Reorder -- "len 4" --> P3[Level 2: prefix/xx/xx~]
Reorder -- "len >=5" --> P4[Level 3: prefix/xx/xx/xxx]
P1/P2/P3/P4 --> Output[HipStr Path]
.
├── src/
│ ├── lib.rs # Core encoding/decoding logic
│ └── error.rs # Custom error definitions
└── tests/
└── main.rs # Integration tests
encode(prefix: impl AsRef<str>, id: u64) -> HipStr<'static>
Encodes an ID into a path string.decode(path: impl AsRef<str>) -> Result<u64>
Decodes a path string back into an ID.DEPTH1: b'_', suffix for 1-level paths.DEPTH2: b'-', suffix for 2-level paths (3-char encoded).DEPTH3: b'~', suffix for 2-level paths (4-char encoded).Result<T>: Alias for std::result::Result<T, Error>.Error: Enum containing InvalidPath(HipStr) and DecodeFailed(HipStr).Hierarchical directory structures for data distribution date back to early Unix systems. Large-scale caching proxies like Squid and version control systems like Git popularized this "fan-out" approach. Git, for instance, stores objects in a .git/objects/xx/xxxxxxxx... structure using the first two hex digits of a SHA-1 hash. idpath extends this concept by prioritizing low-order bits, specifically optimized for numeric IDs where sequential growth is common.
This project is an open-source component of js0.site ⋅ Refactoring the Internet Plan.
We are redefining the development paradigm of the Internet in a componentized way. Welcome to follow us:
idpath 工具用于将 64 位整数编码为层次化文件系统路径。通过将低位比特放置在目录的高层级,确保文件均匀分布在不同目录中,防止目录由于条目过多导致文件系统性能下降。
HipStr 与栈分配缓冲区,成功解码过程中无需堆分配。_, -, ~)。use idpath::{encode, decode};
fn main() -> idpath::Result<()> {
let id = 1234567u64;
let prefix = "/data";
// 编码
let path = encode(prefix, id);
println!("编码路径: {}", path);
// 解码
let decoded_id = decode(&path)?;
assert_eq!(id, decoded_id);
Ok(())
}
现代文件系统(如 ext4 或 XFS)在单个目录包含数千个条目时性能会大幅下降。层次化结构能有效缓解此问题。idpath 在构建路径时反转比特重要性(低位优先),有效规避了 ID 顺序生成(例如数据库自增主键)产生的存储热点。
graph TD
ID[u64 ID] --> B32[Crockford Base32]
B32 --> Reorder{比特重排}
Reorder -- "长度 0-2" --> P1[1层深度: prefix/xx_]
Reorder -- "长度 3" --> P2[2层深度: prefix/xx/x-]
Reorder -- "长度 4" --> P3[2层深度: prefix/xx/xx~]
Reorder -- "长度 >=5" --> P4[3层深度: prefix/xx/xx/xxx]
P1/P2/P3/P4 --> Output[HipStr 路径]
.
├── src/
│ ├── lib.rs # 核心编解码逻辑
│ └── error.rs # 自定义错误定义
└── tests/
└── main.rs # 集成测试
encode(prefix: impl AsRef<str>, id: u64) -> HipStr<'static>
将 ID 编码为路径字符串。decode(path: impl AsRef<str>) -> Result<u64>
将路径字符串解码回 ID。DEPTH1: b'_',1 层深度的后缀。DEPTH2: b'-',2 层深度的后缀(编码长度为 3)。DEPTH3: b'~',2 层深度的后缀(编码长度为 4)。Result<T>:std::result::Result<T, Error> 的别名。Error:包含 InvalidPath(HipStr) 与 DecodeFailed(HipStr) 的枚举。层次化目录结构用于数据分发的设计最早可追溯到早期 Unix 系统。Squid 缓存代理和 Git 版本控制系统等大型工具使这种“扇出”(fan-out)方法广为人知。以 Git 为例,它利用 SHA-1 哈希值的前两位作为目录名(如 .git/objects/xx/xxxxxxxx...)来存储对象。idpath 扩展了这一理念,通过优先处理低位比特,专门针对数字 ID 顺序增长的场景进行了优化,实现了更优的磁盘分布规律。
本项目为 js0.site ⋅ 重构互联网计划 的开源组件。
我们正在以组件化的方式重新定义互联网的开发范式,欢迎关注: