| Crates.io | recoco-utils |
| lib.rs | recoco-utils |
| version | 0.2.1 |
| created_at | 2026-01-25 00:16:37.46383+00 |
| updated_at | 2026-01-25 20:52:14.953599+00 |
| description | Common utilities for ReCoco, an all-Rust fork of CocoIndex with greater flexibility. |
| homepage | |
| repository | https://github.com/knitli/recoco |
| max_upload_size | |
| id | 2067762 |
| size | 241,419 |
Common utilities for the ReCoco ecosystem.
This crate provides shared building blocks used across ReCoco's core and operation modules. While primarily intended for internal use within ReCoco, these utilities can be useful for developing custom ReCoco operations or for standalone use in Rust projects.
[dependencies]
recoco-utils = { version = "0.2", features = ["batching", "fingerprint"] }
recoco-utils is highly modular with no default features to keep dependencies minimal. Enable only what you need.
| Feature | Description | Key Dependencies | Use When |
|---|---|---|---|
batching |
Async batch processing with concurrency control | tokio-util, serde |
Building efficient data pipelines with batch operations |
bytes_decode |
Smart encoding detection and UTF-8 decoding | encoding_rs |
Processing files with unknown or mixed encodings |
concur_control |
Concurrency limiting and rate control primitives | tokio |
Managing concurrent operations and backpressure |
deserialize |
JSON deserialization helpers with better error messages | serde, serde_json, serde_path_to_error |
Parsing JSON with detailed error reporting |
fingerprint |
Content hashing (BLAKE3) and fingerprinting | blake3, base64, hex |
Change detection, deduplication, caching |
immutable |
Immutable data structures (Arc-based collections) | None | Safe concurrent access to shared data |
retryable |
Exponential backoff retry logic | tokio, rand, time |
Network calls, external APIs, unreliable operations |
str_sanitize |
String cleaning and SQL-safe sanitization | serde, sqlx |
Input validation, SQL injection prevention |
yaml |
YAML parsing and serialization | yaml-rust2, base64 |
Configuration files, structured data |
[!NOTE] This list isn't exhaustive. It doesn't include features that are intended for recoco-core. The above features cover all functionality of the crate, providing granular by-module access.
Efficient batch processing with concurrency control:
use recoco_utils::batching::{Batcher, BatchConfig};
let config = BatchConfig {
max_batch_size: 100,
max_wait_ms: 1000,
max_inflight: 10,
};
let batcher = Batcher::new(config, |batch| async move {
// Process batch
Ok(())
}).await?;
batcher.send(item).await?;
Content-addressable hashing with BLAKE3:
use recoco_utils::fingerprint::{fingerprint, Fingerprint};
let hash = fingerprint(b"hello world");
let hex_string = hash.to_hex();
let base64_string = hash.to_base64();
Exponential backoff for unreliable operations:
use recoco_utils::retryable::{retry_with_backoff, RetryConfig};
let result = retry_with_backoff(
|| async {
// Your operation that might fail
api_call().await
},
RetryConfig {
max_attempts: 5,
initial_delay_ms: 100,
max_delay_ms: 10000,
backoff_multiplier: 2.0,
}
).await?;
Limit concurrent operations:
use recoco_utils::concur_control::Semaphore;
let sem = Semaphore::new(10); // Max 10 concurrent operations
let _permit = sem.acquire().await?;
// Do work while holding permit
// Permit is released when dropped
Arc-based collections for safe sharing:
use recoco_utils::immutable::{ImmArcVec, ImmArcMap};
let vec = ImmArcVec::from(vec![1, 2, 3]);
let cloned = vec.clone(); // Cheap Arc clone
let map = ImmArcMap::from([("key", "value")]);
Smart encoding detection:
use recoco_utils::bytes_decode::decode_bytes_to_string;
let text = decode_bytes_to_string(&bytes)?;
// Automatically detects UTF-8, UTF-16, latin1, etc.
Some features depend on others. Most are fully independent, except:
batching requires concur_control, fingerprint, and retryablefingerprint requires deserializeWhen you enable a feature, its dependencies are automatically enabled.
recoco-utils = { version = "0.2", features = ["batching", "fingerprint", "retryable"] }
recoco-utils = { version = "0.2", features = ["server", "deserialize", "uuid"] }
recoco-utils = { version = "0.2", features = ["s3", "azure", "retryable"] }
recoco-utils = { version = "0.2", features = ["sqlx", "uuid", "fingerprint"] }
recoco-utils = { version = "0.2", features = ["openai", "batching", "retryable"] }
This crate is part of the ReCoco workspace. See the main repository for development guidelines.
Apache-2.0. See main repository for details.