orc-format

crates.io: orc-format
lib.rs: orc-format
version: 0.3.0
source: src
created_at: 2022-07-19 21:00:28.160586
updated_at: 2022-07-30 16:06:10.72384
description: Unofficial implementation of Apache ORC spec in safe Rust
homepage: https://github.com/DataEngineeringLabs/orc-format
repository: https://github.com/DataEngineeringLabs/orc-format
id: 628536
size: 92,667
Jorge Leitao (jorgecarleitao)

Read Apache ORC from Rust

Read Apache ORC in Rust.

This repository is similar to parquet2 and Avro-schema, providing a toolkit to:

  • Read ORC files (proto structures)
  • Read stripes (the conversion from proto metadata to memory regions)
  • Decode stripes (the math of decoding stripes into e.g. booleans, runs of RLE, etc.)
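
To make the "decode" step concrete, here is a minimal sketch of how a boolean stream is laid out according to the ORC specification: bytes are compressed with byte run-length encoding, and each decoded byte holds eight booleans, most significant bit first. The function names `decode_byte_rle` and `decode_booleans` are hypothetical and are not this crate's API; the sketch only illustrates the math involved.

```rust
/// Decodes ORC byte run-length encoding (hypothetical helper, not this crate's API).
/// A header in 0..=127 means "repeat the next byte header + 3 times";
/// a header in 128..=255 (negative as i8) means "copy the next 256 - header bytes".
/// Returns `None` when the input is truncated.
fn decode_byte_rle(mut input: &[u8]) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    while let Some((&header, rest)) = input.split_first() {
        if header < 0x80 {
            // Run: one value repeated (header + 3) times.
            let (&value, rest) = rest.split_first()?;
            out.extend(std::iter::repeat(value).take(header as usize + 3));
            input = rest;
        } else {
            // Literal list: (256 - header) bytes copied verbatim.
            let len = 256 - header as usize;
            if rest.len() < len {
                return None;
            }
            out.extend_from_slice(&rest[..len]);
            input = &rest[len..];
        }
    }
    Some(out)
}

/// Decodes `count` booleans from a byte-RLE stream (hypothetical helper).
/// Bits are read from the most significant to the least significant.
fn decode_booleans(input: &[u8], count: usize) -> Option<Vec<bool>> {
    let bytes = decode_byte_rle(input)?;
    if bytes.len() * 8 < count {
        return None;
    }
    let bits: Vec<bool> = bytes
        .iter()
        .flat_map(|&byte| (0..8).rev().map(move |bit| (byte >> bit) & 1 == 1))
        .take(count)
        .collect();
    Some(bits)
}
```

With these helpers, the two-byte input `[0xff, 0x80]` is a literal list of one byte (`0x80`), which unpacks to one `true` followed by seven `false` values.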

It currently reads the following (logical) types:

  • booleans
  • strings
  • integers
  • floats

What is not yet implemented:

  • Snappy, LZO decompression
  • RLE v2 Patched Base decoding
  • RLE v1 decoding
  • Utility functions to decode non-native logical types:
    • decimal
    • timestamp
    • struct
    • list
    • union

Run tests

python3 -m venv venv
venv/bin/pip install -U pip
venv/bin/pip install -U pyorc
venv/bin/python write.py
cargo test