| Crates.io | sosorted |
| lib.rs | sosorted |
| version | 0.2.0 |
| created_at | 2023-01-08 16:36:30.228744+00 |
| updated_at | 2026-01-11 22:28:12.921137+00 |
| description | A set of methods to efficiently manipulated sorted arrays |
| homepage | |
| repository | |
| max_upload_size | |
| id | 753793 |
| size | 170,493 |
This crate provides various methods for efficiently manipulating arrays of sorted data using SIMD optimizations. It supports all primitive integer types (u8, u16, u32, u64, i8, i16, i32, i64).
All mutable operations follow a consistent pattern: the destination buffer is always the first argument, followed by immutable input slices. This design:
// All mutable APIs follow: (dest, inputs...) -> result_length
let len = intersect(&mut dest, &a, &b);
let len = union(&mut dest, &a, &b);
let len = difference(&mut dest, &a, &b);
let len = deduplicate(&mut dest, &input);
Note that different operations handle duplicates differently:
intersect - Computes the multiset intersection. If an element appears $n$ times in a and $m$ times in b, it will appear $\min(n, m)$ times in the result.union - Computes the set union. Merges two sorted arrays and deduplicates the result. If an element appears multiple times in inputs, it appears exactly once in the result.union_size - Calculates the size of the set union without allocation.difference - Computes a modified set difference. Removes all occurrences of elements found in b from a. However, duplicates in a that are not in b are preserved.difference_size - Calculates the size of the difference without allocation.deduplicate - Removes repeated elements from a sorted slice. Writes the result to a destination buffer.find_first_duplicate - Finds the index of the first duplicate entry in a sorted slice. Returns the length if no duplicates exist.symmetric_difference - Elements in either array but not in bothis_subset - Check if the first array is a subset of the secondis_superset - Check if the first array is a superset of the secondmerge - Merge two sorted arrays (preserving duplicates)merge_unique - Merge two sorted arrays without duplicatesfind_range - Find all elements within a specified rangecontains - Check if a specific element existslower_bound - Find the first element not less than the given valueupper_bound - Find the first element greater than the given valuecount_unique - Count the number of unique elementsnth_unique - Find the nth unique elementAll operations are generic over integer types via the SortedSimdElement trait. This trait is implemented for all primitive integer types: u8, u16, u32, u64, i8, i16, i32, and i64.
use sosorted::intersect;
// Works with u64
let a = [1u64, 2, 3, 4, 5];
let b = [2, 4];
let mut dest = [0u64; 5];
assert_eq!(intersect(&mut dest, &a, &b), 2);
// Works with i32
let c = [1i32, 3, 5, 7];
let d = [1, 5, 9];
let mut dest = [0i32; 4];
assert_eq!(intersect(&mut dest, &c, &d), 2);
This crate uses Rust's portable SIMD (std::simd) and automatically selects optimal SIMD lane counts at compile time based on the target CPU features.
| Target Feature | Register Width | Detection |
|---|---|---|
| AVX-512 | 512-bit | target_feature = "avx512f" |
| AVX2 | 256-bit | target_feature = "avx2" |
| SSE2 (fallback) | 128-bit | Default for x86_64 |
Each element type uses the optimal number of SIMD lanes to fully utilize the available register width:
| Element Type | AVX-512 (512-bit) | AVX2 (256-bit) | SSE2 (128-bit) |
|---|---|---|---|
u8 / i8 |
64 lanes | 32 lanes | 16 lanes |
u16 / i16 |
32 lanes | 16 lanes | 8 lanes |
u32 / i32 |
16 lanes | 8 lanes | 4 lanes |
u64 / i64 |
8 lanes | 4 lanes | 2 lanes |
To enable AVX2 or AVX-512 optimizations, set the target CPU at compile time:
# For AVX2 (most modern x86_64 CPUs)
RUSTFLAGS="-C target-cpu=native" cargo build --release
# Or specify a specific CPU
RUSTFLAGS="-C target-cpu=skylake" cargo build --release
# For AVX-512 (Intel Skylake-X, Ice Lake, or newer)
RUSTFLAGS="-C target-cpu=skylake-avx512" cargo build --release
Alternatively, create a .cargo/config.toml in your project:
[build]
rustflags = ["-C", "target-cpu=native"]
The SIMD width is determined at compile time, not runtime. If you need to support multiple CPU generations with a single binary, compile with the lowest common denominator (SSE2) or use separate binaries for different targets.
The SIMD_WIDTH_BITS constant is exported and indicates the detected register width (128, 256, or 512 bits).
This repository includes bench-compare, a CLI tool for A/B benchmark comparison with statistical hypothesis testing. See crates/bench-compare for details.
The intersection algorithm is based on the following paper: