| Crates.io | avila-arrow |
| lib.rs | avila-arrow |
| version | 0.2.0 |
| created_at | 2025-11-22 21:52:57.564123+00 |
| updated_at | 2025-12-02 02:13:07.304776+00 |
| description | Zero-copy columnar format with scientific arrays (Quaternions, Complex, Tensors, Spinors), SIMD acceleration (35x), and native compression (125x RLE, 16x BitPack, 4x Delta) - Zero external dependencies |
| homepage | https://avila.cloud |
| repository | https://github.com/avilaops/arxis |
| id | 1945783 |
| size | 220,303 |
Zero-copy columnar format with scientific extensions, SIMD acceleration, and native compression.
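byteorder required
Avila-arrow presents itself as the only columnar format that combines scientific array types (Quaternions, Complex, Tensors, Spinors), SIMD acceleration, and native compression codecs.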
[dependencies]
avila-arrow = "0.2"
use avila_arrow::{Schema, Field, DataType, RecordBatch};
// Float64Array is assumed to be exported next to Int64Array
use avila_arrow::array::{Float64Array, Int64Array};
use avila_arrow::compute::*;
// Create schema
let schema = Schema::new(vec![
Field::new("id", DataType::Int64),
Field::new("value", DataType::Float64),
]);
// Create arrays
let ids = Int64Array::from(vec![1, 2, 3, 4, 5]);
let values = Float64Array::from(vec![10.0, 20.0, 30.0, 40.0, 50.0]);
// Compute operations
let sum = sum_f64(&values);
let mean = mean_f64(&values).unwrap();
let filtered = filter_f64(&values, &gt_f64(&values, 25.0))?;
println!("Sum: {}, Mean: {}, Filtered: {:?}", sum, mean, filtered.values());
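The filter call above pairs a comparison kernel, which yields a boolean mask, with a filter kernel that keeps the selected elements. Here is a minimal standalone sketch of that pattern on plain slices. `gt_mask` and `filter_by_mask` are hypothetical helpers, not avila-arrow functions, and the crate's own kernels may differ in signature and error type.

```rust
/// Build a boolean mask marking elements strictly greater than `threshold`.
/// (Hypothetical helper, not part of avila-arrow.)
fn gt_mask(values: &[f64], threshold: f64) -> Vec<bool> {
    values.iter().map(|&v| v > threshold).collect()
}

/// Keep the elements whose mask entry is `true`.
/// Returns an error when the mask and the data differ in length.
fn filter_by_mask(values: &[f64], mask: &[bool]) -> Result<Vec<f64>, String> {
    if values.len() != mask.len() {
        return Err(format!(
            "length mismatch: {} values, {} mask entries",
            values.len(),
            mask.len()
        ));
    }
    Ok(values
        .iter()
        .zip(mask)
        .filter(|(_, keep)| **keep)
        .map(|(&v, _)| v)
        .collect())
}
```

Calling `filter_by_mask(&values, &gt_mask(&values, 25.0))` on the five values above keeps `30.0`, `40.0`, and `50.0`.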
use avila_arrow::scientific::*;
use std::f64::consts::PI; // PI is used below for the rotation angles
// Quaternion arrays for spacecraft orientation
let q1 = Quaternion::from_axis_angle([0.0, 0.0, 1.0], PI / 2.0);
let q2 = Quaternion::from_axis_angle([0.0, 0.0, 1.0], PI);
let array1 = QuaternionArray::new(vec![q1; 1000]);
let array2 = QuaternionArray::new(vec![q2; 1000]);
// SLERP interpolation for smooth rotation
let interpolated = array1.slerp(&array2, 0.5).unwrap();
// Complex arrays for FFT
let signal = ComplexArray::new(vec![
Complex64::new(1.0, 0.0),
Complex64::new(0.0, 1.0),
]);
let magnitudes = signal.magnitude();
let phases = signal.phase();
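For readers unfamiliar with SLERP, the sketch below shows the per-element math that a call such as `array1.slerp(&array2, 0.5)` would presumably apply. The `Quat` type and its methods are illustrative assumptions, not avila-arrow's `Quaternion` API. The 0.9995 cutoff for the near-parallel fallback is a common convention, not something the crate documents.

```rust
/// Unit quaternion stored as (w, x, y, z).
/// Illustrative type only; avila-arrow's `Quaternion` may be laid out differently.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Quat {
    w: f64,
    x: f64,
    y: f64,
    z: f64,
}

impl Quat {
    /// Rotation of `angle` radians about `axis` (assumed non-zero).
    fn from_axis_angle(axis: [f64; 3], angle: f64) -> Quat {
        let norm = (axis[0] * axis[0] + axis[1] * axis[1] + axis[2] * axis[2]).sqrt();
        let (s, c) = (angle / 2.0).sin_cos();
        Quat {
            w: c,
            x: axis[0] / norm * s,
            y: axis[1] / norm * s,
            z: axis[2] / norm * s,
        }
    }

    fn dot(self, o: Quat) -> f64 {
        self.w * o.w + self.x * o.x + self.y * o.y + self.z * o.z
    }

    /// Weighted sum `a * self + b * o`, component by component.
    fn blend(self, a: f64, o: Quat, b: f64) -> Quat {
        Quat {
            w: a * self.w + b * o.w,
            x: a * self.x + b * o.x,
            y: a * self.y + b * o.y,
            z: a * self.z + b * o.z,
        }
    }

    fn normalized(self) -> Quat {
        let n = self.dot(self).sqrt();
        Quat { w: self.w / n, x: self.x / n, y: self.y / n, z: self.z / n }
    }

    /// Spherical linear interpolation from `self` (t = 0) to `other` (t = 1).
    fn slerp(self, other: Quat, t: f64) -> Quat {
        let mut end = other;
        let mut cos_theta = self.dot(other);
        // q and -q encode the same rotation; flip to follow the shorter arc.
        if cos_theta < 0.0 {
            end = other.blend(-1.0, other, 0.0);
            cos_theta = -cos_theta;
        }
        // Nearly parallel inputs: sin(theta) is close to 0, so use normalized lerp.
        if cos_theta > 0.9995 {
            return self.blend(1.0 - t, end, t).normalized();
        }
        let theta = cos_theta.acos();
        let sin_theta = theta.sin();
        let a = ((1.0 - t) * theta).sin() / sin_theta;
        let b = (t * theta).sin() / sin_theta;
        self.blend(a, end, b)
    }
}
```

Interpolating halfway between the 90 degree and 180 degree rotations about the z axis used above gives the 135 degree rotation about the same axis.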
use avila_arrow::compression::*;
// RLE: 125x compression for repeated values
let data = vec![42u8; 10000];
let encoded = rle::encode(&data).unwrap();
// 10000 bytes -> 80 bytes!
// Delta: 4x compression for timestamps
let timestamps: Vec<i64> = (0..10000).map(|i| 1700000000 + i * 1000).collect();
let encoded = delta::encode_i64(&timestamps).unwrap();
// Bit-Packing: 16x compression for small integers
let small_ints: Vec<i64> = (0..10000).map(|i| i % 16).collect();
let bit_width = bitpack::detect_bit_width(&small_ints); // 4 bits
let packed = bitpack::pack(&small_ints, bit_width).unwrap();
// Dictionary: Optimal for low cardinality
let mut encoder = DictionaryEncoderI64::new();
for i in 0..10000 {
encoder.encode(i % 20); // Only 20 unique values
}
let (dict, indices) = encoder.finish();
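The 10000 to 80 byte figure in the RLE example is consistent with a simple byte-oriented run-length layout. The sketch below is one such layout, offered as an assumption and not as the crate's actual on-disk format. Each run is stored as a (count, value) byte pair, with the count capped at 255. `rle_encode` and `rle_decode` are illustrative names, not avila-arrow functions.

```rust
/// Encode bytes as (run_length, value) pairs, with run_length capped at 255.
/// Assumed layout for illustration; avila-arrow's format may differ.
fn rle_encode(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut iter = data.iter().copied().peekable();
    while let Some(value) = iter.next() {
        let mut run: u8 = 1;
        // Extend the run while the next byte matches and the counter has room.
        while run < u8::MAX && iter.peek() == Some(&value) {
            iter.next();
            run += 1;
        }
        out.push(run);
        out.push(value);
    }
    out
}

/// Inverse of `rle_encode`: expand each (run_length, value) pair.
fn rle_decode(encoded: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    for pair in encoded.chunks_exact(2) {
        out.extend(std::iter::repeat(pair[1]).take(pair[0] as usize));
    }
    out
}
```

With this layout, 10,000 identical bytes split into 40 runs (39 of length 255 plus one of length 55), so the encoded size is 80 bytes, a 125x reduction.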
| Codec | Best For | Compression Ratio | Example |
|---|---|---|---|
| RLE | Repeated values | 125x | [1,1,1,...] |
| Bit-Pack | Small integers (0-15) | 16x | Flags, counters |
| Delta | Sequential data | 4x | Timestamps, IDs |
| Dictionary | Low cardinality | 1-10x | Categories, enums |
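The 16x ratio for Bit-Pack matches storing 64-bit integers in 4 bits each (64 / 4 = 16). A minimal sketch of that idea follows. It uses `u64` instead of the `i64` values shown earlier to sidestep sign handling, and `bit_width_for`, `pack_bits`, and `unpack_bits` are hypothetical names, not the crate's `bitpack` API.

```rust
/// Smallest number of bits that can represent every value (at least 1).
fn bit_width_for(values: &[u64]) -> u32 {
    let max = values.iter().copied().max().unwrap_or(0);
    (64 - max.leading_zeros()).max(1)
}

/// Pack the low `bit_width` bits of each value, least significant bit first.
fn pack_bits(values: &[u64], bit_width: u32) -> Vec<u8> {
    assert!((1..=64).contains(&bit_width));
    let total_bits = values.len() * bit_width as usize;
    let mut out = vec![0u8; (total_bits + 7) / 8];
    let mut bit_pos = 0usize;
    for &v in values {
        for b in 0..bit_width {
            if (v >> b) & 1 == 1 {
                out[bit_pos / 8] |= 1 << (bit_pos % 8);
            }
            bit_pos += 1;
        }
    }
    out
}

/// Inverse of `pack_bits`: read back `count` values of `bit_width` bits each.
fn unpack_bits(packed: &[u8], bit_width: u32, count: usize) -> Vec<u64> {
    let mut out = Vec::with_capacity(count);
    let mut bit_pos = 0usize;
    for _ in 0..count {
        let mut v = 0u64;
        for b in 0..bit_width {
            let bit = (packed[bit_pos / 8] >> (bit_pos % 8)) & 1;
            v |= (bit as u64) << b;
            bit_pos += 1;
        }
        out.push(v);
    }
    out
}
```

For 10,000 values in the range 0 to 15, the detected width is 4 bits and the packed buffer is 5,000 bytes, compared with 80,000 bytes as raw 64-bit integers.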
All codecs are 100% native Rust - no external dependencies!
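Delta encoding itself does not shrink data; it turns slowly changing sequences into small differences that a narrower integer type or a bit-packer can then store compactly. The sketch below shows only the transform. `delta_encode` and `delta_decode` are illustrative names and not the signature of the crate's `delta::encode_i64`. One plausible reading of the 4x figure, stated here as an assumption, is that deltas fitting in 16 bits replace 64-bit values.

```rust
/// Store the first value followed by successive differences.
/// Wrapping arithmetic keeps the transform lossless even on overflow.
fn delta_encode(values: &[i64]) -> Vec<i64> {
    let mut out = Vec::with_capacity(values.len());
    let mut prev = 0i64;
    for &v in values {
        out.push(v.wrapping_sub(prev));
        prev = v;
    }
    out
}

/// Inverse of `delta_encode`: a running (wrapping) sum of the deltas.
fn delta_decode(deltas: &[i64]) -> Vec<i64> {
    let mut out = Vec::with_capacity(deltas.len());
    let mut acc = 0i64;
    for &d in deltas {
        acc = acc.wrapping_add(d);
        out.push(acc);
    }
    out
}
```

For the timestamp series used earlier, the encoded output is the starting value `1700000000` followed by 9,999 copies of `1000`.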
Avila-arrow uses AVX2 intrinsics for hardware-accelerated operations, with measured speedups from about 1.1x to 35x depending on the operation and data size:
use avila_arrow::compute::*;
let data = Float64Array::from((0..1_000_000).map(|i| i as f64).collect::<Vec<_>>());
// Automatically uses SIMD when AVX2 is available
let sum = sum_f64(&data); // 4.24x faster than scalar
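How the crate switches between SIMD and scalar code is not shown here. A common way to get "automatic" SIMD in Rust is runtime feature detection with a scalar fallback, sketched below using only `std::arch`. The function names are hypothetical, and this is not avila-arrow's implementation.

```rust
/// Scalar reference implementation.
fn sum_scalar(data: &[f64]) -> f64 {
    data.iter().sum()
}

/// Vectorized sum over 256-bit registers (four f64 lanes at a time).
/// The intrinsics used here are AVX-level, which AVX2 support implies.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(data: &[f64]) -> f64 {
    use std::arch::x86_64::*;
    let mut acc = _mm256_setzero_pd();
    let chunks = data.chunks_exact(4);
    let tail = chunks.remainder();
    for chunk in chunks {
        // Unaligned load of four f64 values, added lane-wise to the accumulator.
        acc = _mm256_add_pd(acc, _mm256_loadu_pd(chunk.as_ptr()));
    }
    // Spill the four partial sums and finish with the leftover elements.
    let mut lanes = [0.0f64; 4];
    _mm256_storeu_pd(lanes.as_mut_ptr(), acc);
    lanes.iter().sum::<f64>() + tail.iter().sum::<f64>()
}

/// Use AVX2 when the CPU supports it, otherwise fall back to the scalar path.
fn sum_f64_dispatch(data: &[f64]) -> f64 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified at runtime just above.
            return unsafe { sum_avx2(data) };
        }
    }
    sum_scalar(data)
}
```

The SIMD path adds elements in a different order than the scalar loop, so results can differ in the last bits for general floating-point input.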
Basic Operations:
| Operation | Size | Scalar | SIMD | Speedup |
|---|---|---|---|---|
| Sum | 100K | 61.4μs | 14.5μs | 4.24x |
| Add | 10K | 38.8μs | 4.4μs | 8.81x |
| Multiply | 100 | 856ns | 24.4ns | 35x |
| Subtract | 1K | 4.64μs | 611ns | 7.59x |
| Divide | 10K | 76.3μs | 34.7μs | 2.20x |
| Sqrt | 1M | 8.67ms | 4.98ms | 1.74x |
| FMA | 10K | 54.9μs | 9.43μs | 5.82x |
Complex Pipelines (3 operations):
| Size | Scalar | SIMD | Speedup |
|---|---|---|---|
| 10K | 99.3μs | 24.7μs | 4.02x |
| 100K | 1.03ms | 586μs | 1.75x |
| 1M | 12.0ms | 10.8ms | 1.11x |
Memory Throughput:
| Elements | Scalar | SIMD | Speedup |
|---|---|---|---|
| 100K | 61.4μs | 14.5μs | 4.24x |
| 1M | 721μs | 292μs | 2.47x |
Note: Benchmarks were run on an Intel CPU with AVX2. SIMD gains are largest on small to medium datasets (100 to 100K elements) and shrink as arrays grow, as the 1M rows above show. For 10M+ elements, consider parallel processing.
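The note above recommends parallel processing for very large arrays without showing what that looks like. One dependency-free option is to split the data into chunks and sum them on scoped threads, as sketched below. `parallel_sum` is a hypothetical helper and not part of avila-arrow, and a data-parallelism crate such as rayon would be the more usual choice in practice.

```rust
use std::thread;

/// Sum `data` by splitting it into roughly `num_threads` contiguous chunks,
/// summing each chunk on its own scoped thread, then combining the partials.
fn parallel_sum(data: &[f64], num_threads: usize) -> f64 {
    let num_threads = num_threads.max(1);
    // Ceiling division so every element lands in a chunk; at least 1 so
    // `chunks` never receives a zero length.
    let chunk_len = ((data.len() + num_threads - 1) / num_threads).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().sum::<f64>()))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .sum::<f64>()
    })
}
```

Each worker could call a SIMD kernel on its chunk, combining both forms of parallelism. Thread start-up cost means this only pays off for large inputs.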
See examples/ directory:
- basic.rs - Arrays and RecordBatch
- scientific.rs - Quaternions, Complex, Tensors
- compression.rs - Native compression codecs (125x!)
- ipc.rs - Serialization (coming soon)

Run with:
cargo run --example compression
[dependencies.avila-arrow]
version = "0.2"
features = ["scientific", "compression", "ipc"]
- scientific (default): Scientific array types
- compression: Compression support
- ipc: Arrow IPC format
- aviladb: AvilaDB integration

Contributions welcome! Please open an issue or PR.
Dual licensed under MIT OR Apache-2.0.
Built with ❤️ by avilaops for the Brazilian scientific computing community.
Status: v0.2.0 - 80+ tests passing ✅ | 35x SIMD ✅ | 125x Compression ✅