Crates.io | range-reader |
lib.rs | range-reader |
version | 0.2.0 |
source | src |
created_at | 2021-11-20 08:00:06.394866 |
updated_at | 2022-05-17 06:39:42.332679 |
description | Converts low-level APIs to read ranges of bytes to `Read + Seek` |
homepage | https://github.com/DataEngineeringLabs/ranged-reader-rs |
repository | https://github.com/DataEngineeringLabs/ranged-reader-rs |
id | 484847 |
size | 20,216 |
Convert low-level APIs to read ranges of files into structs that implement `Read + Seek` and `AsyncRead + AsyncSeek`. See parquet_s3_async.rs for an example that uses this API to read parts of a large Parquet file from S3 asynchronously.
Blob storage HTTPS APIs can read a range of bytes from a single blob, i.e. they offer functions of the form
fn read_range_blocking(path: &str, start: usize, length: usize) -> Vec<u8>;
async fn read_range(path: &str, start: usize, length: usize) -> Vec<u8>;
together with functions that return the blob's total size,
async fn length(path: &str) -> usize;
fn length(path: &str) -> usize;
These APIs are usually IO-bound: they spend most of their time waiting on the network.
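To make the shape of these functions concrete, here is a minimal sketch of the blocking variants backed by an in-memory map. `InMemoryStore` is a hypothetical stand-in invented for illustration; a real blob store would issue a network request for the byte range instead, and the behavior chosen here for missing paths and out-of-bounds ranges is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a blob store (illustration only).
struct InMemoryStore {
    blobs: HashMap<String, Vec<u8>>,
}

impl InMemoryStore {
    /// Returns up to `length` bytes of the blob at `path`, starting at `start`.
    /// Ranges that run past the end are truncated; unknown paths yield no bytes.
    fn read_range_blocking(&self, path: &str, start: usize, length: usize) -> Vec<u8> {
        let blob = match self.blobs.get(path) {
            Some(blob) => blob,
            None => return Vec::new(),
        };
        let begin = start.min(blob.len());
        let end = start.saturating_add(length).min(blob.len());
        blob[begin..end].to_vec()
    }

    /// Returns the total size of the blob at `path`, or 0 if it does not exist.
    fn length(&self, path: &str) -> usize {
        self.blobs.get(path).map_or(0, |blob| blob.len())
    }
}
```

A production API would more likely return a `Result` so that network failures and missing blobs can be told apart from empty reads.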
Some file formats (e.g. Apache Parquet, Apache Avro, Apache Arrow IPC) allow reading only parts of a file, which enables filter and projection pushdown.
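As an illustration of why such formats need `Seek`, a Parquet file ends with a 4-byte little-endian metadata length followed by the magic bytes `PAR1`, so a reader typically starts by fetching only the last 8 bytes. The sketch below does that for any `Read + Seek` source; the function name `parquet_footer_len` is made up for this example and is not part of this crate.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Reads the metadata length stored in the last 8 bytes of a Parquet file:
/// a 4-byte little-endian length followed by the magic bytes `PAR1`.
/// Hypothetical helper, for illustration only.
fn parquet_footer_len<R: Read + Seek>(reader: &mut R) -> io::Result<u32> {
    // Jump straight to the tail instead of reading the whole file.
    reader.seek(SeekFrom::End(-8))?;
    let mut tail = [0u8; 8];
    reader.read_exact(&mut tail)?;
    if &tail[4..] != b"PAR1" {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "missing PAR1 magic",
        ));
    }
    Ok(u32::from_le_bytes([tail[0], tail[1], tail[2], tail[3]]))
}
```

With a local file this costs one seek; against blob storage the same call pattern translates into a single small range request rather than a full download.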
This crate offers 2 structs, `RangedReader` and `RangedStreamer`, that implement `Read + Seek` and `AsyncRead + AsyncSeek` respectively, to bridge the blob storage APIs mentioned above to the traits used by most Rust APIs to read bytes.
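The bridging idea can be sketched as follows. This is not the crate's `RangedReader` and does not reflect its constructor or internals; `RangeAdapter` is a hypothetical, unbuffered adapter that wraps a closure of the assumed form `(start, length) -> Vec<u8>` and exposes it as `Read + Seek`.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical adapter (not the crate's `RangedReader`): exposes a
/// range-reading closure as `Read + Seek`, without any buffering.
struct RangeAdapter<F: FnMut(usize, usize) -> Vec<u8>> {
    pos: u64,
    length: u64,
    read_range: F,
}

impl<F: FnMut(usize, usize) -> Vec<u8>> RangeAdapter<F> {
    /// `length` is the blob's total size; `read_range` takes (start, length).
    fn new(length: u64, read_range: F) -> Self {
        Self { pos: 0, length, read_range }
    }
}

impl<F: FnMut(usize, usize) -> Vec<u8>> Read for RangeAdapter<F> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Never request bytes past the end of the blob.
        let remaining = self.length.saturating_sub(self.pos);
        let want = (buf.len() as u64).min(remaining) as usize;
        if want == 0 {
            return Ok(0);
        }
        // Each call issues one range request at the current position.
        let bytes = (self.read_range)(self.pos as usize, want);
        let n = bytes.len().min(want);
        buf[..n].copy_from_slice(&bytes[..n]);
        self.pos += n as u64;
        Ok(n)
    }
}

impl<F: FnMut(usize, usize) -> Vec<u8>> Seek for RangeAdapter<F> {
    fn seek(&mut self, to: SeekFrom) -> io::Result<u64> {
        // Seeking only moves the cursor; no request is made until `read`.
        let target = match to {
            SeekFrom::Start(offset) => offset as i128,
            SeekFrom::End(offset) => self.length as i128 + offset as i128,
            SeekFrom::Current(offset) => self.pos as i128 + offset as i128,
        };
        self.pos = u64::try_from(target)
            .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "seek out of range"))?;
        Ok(self.pos)
    }
}
```

Because this sketch issues one request per `read` call, many small reads would each pay a network round trip; a practical adapter would likely fetch larger ranges into an internal buffer to amortize that cost.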
Licensed under either of
at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.