| Crates.io | u64_2 |
| lib.rs | u64_2 |
| version | 0.1.3 |
| created_at | 2025-12-30 17:29:41.26027+00 |
| updated_at | 2025-12-30 17:37:20.64424+00 |
| description | Ultra-fast dual u64 variable-length encoding |
| homepage | https://github.com/js0-site/rust/tree/main/u64_2 |
| repository | https://github.com/js0-site/rust.git |
| max_upload_size | |
| id | 2013066 |
| size | 104,421 |
u64_2 is a variable-length encoding scheme designed specifically for storing two u64 integers together.
Compared with standard VByte (Varint/LEB128), this scheme groups the metadata (length information) into a single byte. That removes the per-byte branch loop from decoding and makes better use of the pipelines and branch predictors of modern CPUs (such as the Apple Silicon M series). It maintains a high compression ratio while providing decoding speed close to that of a memory copy (`memcpy`).
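For contrast, the sketch below shows the kind of byte-by-byte loop a typical unsigned LEB128 decoder uses. It is a generic illustration rather than code from `vb` or from this crate, and the name `leb128_decode` is hypothetical. The point is the data-dependent branch on the continuation bit, which runs once per input byte.

```rust
/// Decodes one unsigned LEB128 value from the front of `input`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input ends before a terminating byte or runs past 10 bytes.
fn leb128_decode(input: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    let mut shift: u32 = 0;
    for (i, &byte) in input.iter().enumerate() {
        if shift >= 64 {
            return None; // more than 10 bytes cannot be a u64
        }
        value |= u64::from(byte & 0x7F) << shift;
        // Data-dependent branch: the continuation bit decides whether to loop again.
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
        shift += 7;
    }
    None
}

fn main() {
    // 500 and 100000 take 2 and 3 bytes in LEB128.
    assert_eq!(leb128_decode(&[0xF4, 0x03]), Some((500, 2)));
    assert_eq!(leb128_decode(&[0xA0, 0x8D, 0x06]), Some((100_000, 3)));
}
```

Each value costs one loop iteration and one hard-to-predict branch per byte, which is the overhead the grouped-length design aims to avoid.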
With no `while` loops, decoding requires only simple bitwise operations and memory reads.

The encoded data stream consists of a Tag (tag byte) followed by a Data Body.
Structure diagram:

```
[Tag (1 Byte)] [Integer A Bytes...] [Integer B Bytes...]
```
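The Tag is the first byte of the data stream and records the byte lengths of both integers. It is split into the high 4 bits and the low 4 bits:

- High 4 bits: length of Integer A
- Low 4 bits: length of Integer B

(This assignment is inferred from the decode example, where tag `0x12` precedes a 2-byte value and a 3-byte value.)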
Length Encoding Rule (-1 offset):
To maximize space utilization, the stored length value = actual byte count - 1.
- 0 (0000) $\rightarrow$ represents an actual length of 1 byte.
- 7 (0111) $\rightarrow$ represents an actual length of 8 bytes.

Following the Tag is the pure data part, arranged in order:

- Integer A bytes
- Integer B bytes
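The layout above can be sketched in a few lines. This is an illustrative re-implementation, not the crate's actual code: `byte_len` and `sketch_encode` are hypothetical names, and the nibble order (A high, B low) and little-endian byte order are assumptions chosen to match the bytes shown in the decode example.

```rust
/// Number of bytes needed to hold `v`, from 1 to 8 (zero still takes 1 byte).
fn byte_len(v: u64) -> usize {
    let bits = 64 - v.leading_zeros() as usize;
    ((bits + 7) / 8).max(1)
}

/// Hypothetical encoder for the Tag + Data Body layout.
/// Writes into `out` (at least 17 bytes) and returns the encoded length.
fn sketch_encode(a: u64, b: u64, out: &mut [u8]) -> usize {
    let (len_a, len_b) = (byte_len(a), byte_len(b));
    // Stored length = actual byte count - 1; A in the high nibble, B in the low nibble.
    out[0] = (((len_a - 1) << 4) | (len_b - 1)) as u8;
    // Assumed little-endian: keep only the low `len` bytes of each integer.
    out[1..1 + len_a].copy_from_slice(&a.to_le_bytes()[..len_a]);
    out[1 + len_a..1 + len_a + len_b].copy_from_slice(&b.to_le_bytes()[..len_b]);
    1 + len_a + len_b
}

fn main() {
    let mut buf = [0u8; 17]; // worst case: 1 tag byte + 8 + 8
    let n = sketch_encode(500, 100_000, &mut buf);
    assert_eq!(buf[..n], [0x12u8, 0xF4, 0x01, 0xA0, 0x86, 0x01]);
}
```

Because each length fits in 3 bits, one tag byte covers both integers, so a pair costs between 3 and 17 bytes.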
Decoding is the core optimization of this algorithm: it uses a masking technique in place of traditional byte-by-byte reading.
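1. Read the Tag and extract the high 4 bits to obtain `LenA`.
2. Extract the low 4 bits to obtain `LenB`.
3. Load a complete 64-bit word from the start of the data body, look up the table entry for `LenA`, and use the mask to clear the high-bit garbage data.
4. Advance the pointer (offset `1 + LenA`), similarly load a complete 64-bit word, look up the table based on `LenB`, and apply the mask.
5. Return the pair of `u64` values.

Encoding example:

```rust
use u64_2::encode;

let mut buffer = [0u8; 32];
let num1: u64 = 500; // Needs 2 bytes
let num2: u64 = 100000; // Needs 3 bytes
let len = encode(num1, num2, &mut buffer);
// Encoded data is in &buffer[..len]
```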
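Decoding example:

```rust
use u64_2::decode;

// Tag 0x12, then 500 as 2 bytes (0xF4 0x01) and 100000 as 3 bytes (0xA0 0x86 0x01)
let encoded_data = [0x12, 0xF4, 0x01, 0xA0, 0x86, 0x01];
let (num1, num2, consumed) = decode(&encoded_data);
// num1 = 500, num2 = 100000, consumed = 6
```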
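The masking steps described earlier can be sketched as follows. This is a hedged illustration, not the crate's implementation: `MASKS`, `load_word`, and `sketch_decode` are hypothetical names, the nibble and byte order are assumptions consistent with the example bytes, and `load_word` zero-pads short input for safety where a tuned decoder would presumably rely on a single unaligned 8-byte load.

```rust
/// MASKS[n - 1] keeps the low `n` bytes of a 64-bit word (n = 1..=8).
const MASKS: [u64; 8] = [
    0xFF,
    0xFFFF,
    0xFF_FFFF,
    0xFFFF_FFFF,
    0xFF_FFFF_FFFF,
    0xFFFF_FFFF_FFFF,
    0xFF_FFFF_FFFF_FFFF,
    u64::MAX,
];

/// Loads 8 little-endian bytes starting at `pos`; bytes past the end read as 0.
fn load_word(data: &[u8], pos: usize) -> u64 {
    let mut word = [0u8; 8];
    let tail = data.get(pos..).unwrap_or(&[]);
    let n = tail.len().min(8);
    word[..n].copy_from_slice(&tail[..n]);
    u64::from_le_bytes(word)
}

/// Hypothetical masking decoder: returns (a, b, bytes consumed).
/// Panics on empty input; this sketch does no further validation.
fn sketch_decode(data: &[u8]) -> (u64, u64, usize) {
    let tag = data[0] as usize;
    // Stored value is length - 1; `& 0x7` keeps the table index in range.
    let len_a = ((tag >> 4) & 0x7) + 1;
    let len_b = (tag & 0x7) + 1;
    // Over-read a full word, then mask away the bytes that belong to later data.
    let a = load_word(data, 1) & MASKS[len_a - 1];
    let b = load_word(data, 1 + len_a) & MASKS[len_b - 1];
    (a, b, 1 + len_a + len_b)
}

fn main() {
    let encoded = [0x12, 0xF4, 0x01, 0xA0, 0x86, 0x01];
    assert_eq!(sketch_decode(&encoded), (500, 100_000, 6));
}
```

The work per pair is two loads, two table lookups, and two AND operations regardless of the values' sizes, so the number of steps does not depend on the data.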
For detailed benchmark results, see benches/RESULTS.md.
Run benchmarks:

```sh
cargo bench --bench u64_encode_decode
```
This project is licensed under MulanPSL-2.0.
Benchmark comparing u64_2 (pair encoding) with vb (varint) on 100,000 integers (mixed distribution: 60% small, 30% medium, 10% large). Throughput is in millions per second (M/s); higher is better.
| Library | Encode (M/s) | Decode (M/s) |
|---|---|---|
| u64_2 | 2727.6 | 2258.2 |
| vb | 199.5 | 288.3 |
macOS 26.1 (arm64) · Apple M2 Max · 12 cores · 64.0GB · rustc 1.94.0-nightly (21ff67df1 2025-12-15)
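To read the table as relative speedups, the ratios can be computed directly from the four figures above. The helper name `speedup` is just for illustration, and the numbers apply only to this benchmark setup.

```rust
/// Ratio of two throughput figures in the same unit (how many times faster).
fn speedup(fast: f64, slow: f64) -> f64 {
    fast / slow
}

fn main() {
    let encode = speedup(2727.6, 199.5); // u64_2 vs vb, encode
    let decode = speedup(2258.2, 288.3); // u64_2 vs vb, decode
    println!("encode: {encode:.1}x, decode: {decode:.1}x");
    // prints "encode: 13.7x, decode: 7.8x"
}
```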
This project is an open-source component of js0.site ⋅ Refactoring the Internet Plan.
We are redefining the Internet's development paradigm through componentization, and you are welcome to follow us.