parquet-format-async-temp

Crates.io: parquet-format-async-temp
lib.rs: parquet-format-async-temp
version: 0.3.1
source: src
created_at: 2021-08-08 15:43:26.959715
updated_at: 2022-06-14 06:22:46.852487
description: Temporary crate containing thrift library + parquet definitions compiled to support read+write async.
homepage: https://github.com/jorgecarleitao/parquet-format-rs
repository: https://github.com/jorgecarleitao/parquet-format-rs
id: 433185
size: 476,703
Jorge Leitao (jorgecarleitao)

README

parquet-format-async-temp

This is a temporary crate containing a subset of Rust's thrift library and the parquet definitions, to support native async parquet read and write.

Specifically, it:

  • supports async read API (via futures)
  • supports async write API (via futures)
  • returns the number of written bytes from the write API

It must be used with the fork of thrift's compiler available at https://github.com/jorgecarleitao/thrift/tree/write_size .

Why

To read and write files with thrift (e.g. parquet) without committing to a particular runtime (e.g. tokio, hyper, etc.), the protocol needs to support AsyncRead + AsyncSeek for reading and AsyncWrite for writing.

To avoid requiring Seek and AsyncSeek on write, the protocol must return the number of written bytes from its write_* API.
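The idea of returning the written byte count can be illustrated with a minimal sketch. It is not this crate's API: `write_field` and `write_fields` are hypothetical helpers, the length-prefixed layout is invented for the example, and the synchronous `std::io::Write` trait stands in for an async writer to keep the example self-contained. Because each write reports how many bytes it produced, the caller can keep a running position and record offsets without ever seeking.

```rust
use std::io::{self, Write};

/// Hypothetical helper (not part of this crate): writes a 4-byte
/// little-endian length prefix followed by the payload, and returns
/// the total number of bytes written.
fn write_field<W: Write>(writer: &mut W, payload: &[u8]) -> io::Result<usize> {
    let len = payload.len() as u32;
    writer.write_all(&len.to_le_bytes())?;
    writer.write_all(payload)?;
    Ok(4 + payload.len())
}

/// Writes the fields back to back and records where each one starts.
/// The position is derived only from the returned byte counts, so the
/// writer needs `Write` but not `Seek`.
fn write_fields<W: Write>(
    writer: &mut W,
    fields: &[&[u8]],
) -> io::Result<(Vec<u64>, u64)> {
    let mut offsets = Vec::with_capacity(fields.len());
    let mut position = 0u64;
    for field in fields {
        offsets.push(position);
        position += write_field(&mut *writer, field)? as u64;
    }
    Ok((offsets, position))
}
```

Writing the two payloads `b"ab"` and `b"cde"` into a `Vec<u8>` this way yields the start offsets 0 and 6 and a final position of 13, all computed from return values alone. The same bookkeeping carries over to an async writer, where the position would be accumulated from the awaited result of each write call.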

This crate addresses these two concerns for parquet. It is essentially:

Commit count: 12