xpq2

version: 0.2.2
created_at: 2023-09-10 15:41:22.084304
updated_at: 2023-09-10 15:41:22.084304
description: Simple command line tool for analyzing parquet files
homepage: https://github.com/d33tah/xpq2
repository: https://github.com/d33tah/xpq2
documentation: https://github.com/d33tah/xpq2
id: 968871
size: 126,040
Jacek Wielemborek (d33tah)

README

xpq2

xpq2 is a fork of xpq: https://github.com/FabioBatSilva/xpq

xpq is a simple command line program for analyzing parquet files.

Fork

xpq2 was forked so I could quickly try my patches to xpq.

I have no intention of maintaining this fork. It might become stale, so use xpq instead.

Requirements

  • Rust nightly

See "Working with nightly Rust" to install the nightly toolchain and set it as the default.

Installation

Binaries for Linux and macOS are available from GitHub.

To install the binary, download the latest release. The following command fetches the macOS (apple-darwin) build:

curl -s https://api.github.com/repos/FabioBatSilva/xpq/releases/latest \
  | grep "browser_download_url" \
  | grep apple-darwin \
  | cut -d : -f 2,3 \
  | tr -d \" \
  | wget -qi -

Make it executable and move it into a directory on your PATH:

chmod +x ./xpq-*-apple-darwin

mv ./xpq-*-apple-darwin /usr/local/bin/xpq
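
The download command above is hard-coded to the macOS asset. As a rough sketch, the same pipeline can pick the asset by platform. The function names `xpq_asset_pattern` and `xpq_download_latest` are hypothetical, and the `linux` substring is an assumption about how the Linux release asset is named; check the release page before relying on it.

```shell
# Map a kernel name (as printed by `uname -s`) to a substring of the asset name.
# "apple-darwin" comes from the command above; "linux" is an assumed substring.
xpq_asset_pattern() {
  case "$1" in
    Darwin) echo "apple-darwin" ;;
    Linux)  echo "linux" ;;
    *)      echo "unsupported platform: $1" >&2; return 1 ;;
  esac
}

# Download the matching asset from the latest release (same pipeline as above).
xpq_download_latest() {
  pattern="$(xpq_asset_pattern "$(uname -s)")" || return 1
  curl -s https://api.github.com/repos/FabioBatSilva/xpq/releases/latest \
    | grep "browser_download_url" \
    | grep "$pattern" \
    | cut -d : -f 2,3 \
    | tr -d \" \
    | wget -qi -
}
```

Calling `xpq_download_latest` downloads the asset into the current directory; the chmod and mv steps then apply to whichever file was fetched.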

Alternatively, you can compile and install using Cargo:

cargo install xpq

You can also compile from source using Cargo:

cargo install --git https://github.com/FabioBatSilva/xpq.git --force

Available commands

  • read - Read rows.
  • count - Show the number of rows.
  • schema - Show parquet schema.
  • sample - Randomly sample rows from parquet.
  • frequency - Show frequency counts for each value.

Quick tour

Grab some parquet data:

wget -O users.parquet https://github.com/apache/spark/blob/master/examples/src/main/resources/users.parquet?raw=true

Check the schema:

xpq schema users.parquet

message example.avro.User {
  REQUIRED BYTE_ARRAY name (UTF8);
  OPTIONAL BYTE_ARRAY favorite_color (UTF8);
  REQUIRED group favorite_numbers (LIST) {
    REPEATED INT32 array;
  }
}

Check the number of rows:

xpq count users.parquet

 count
 2

Read some data:

xpq read users.parquet

 name      favorite_color  favorite_numbers
 "Alyssa"  null            [3, 9, 15, 20]
 "Ben"     "red"           []