xpq is a simple command-line program for analyzing Parquet files.
See Working with nightly Rust to install the nightly toolchain and set it as the default.
Binaries for Linux and macOS are available from GitHub.
To install the binary, download the latest release:
```bash
curl -s https://api.github.com/repos/FabioBatSilva/xpq/releases/latest \
  | grep "browser_download_url" \
  | grep apple-darwin \
  | cut -d : -f 2,3 \
  | tr -d \" \
  | wget -qi -
```

Make it executable:

```bash
chmod +x ./xpq-*-apple-darwin
mv ./xpq-*-apple-darwin /usr/local/bin/xpq
```
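The download command above filters release assets with `grep apple-darwin`, which only matches the macOS build. As a rough sketch, a small helper can pick the filter from the output of `uname -s`. The function name `xpq_asset_pattern` is hypothetical, and the `linux` substring is an assumption about how the Linux asset is named, so check the release page before relying on it.

```bash
# Map a kernel name (as printed by `uname -s`) to a substring of the
# release asset name. "apple-darwin" comes from the command above;
# "linux" is an assumed substring for the Linux asset.
xpq_asset_pattern() {
  case "$1" in
    Darwin) echo "apple-darwin" ;;
    Linux)  echo "linux" ;;
    *)      echo "unsupported platform: $1" >&2; return 1 ;;
  esac
}
```

To use it, replace `grep apple-darwin` in the pipeline with `grep "$(xpq_asset_pattern "$(uname -s)")"` and adjust the file names in the `chmod` and `mv` steps to match the downloaded asset.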
Alternatively, you can compile and install using Cargo:

```bash
cargo install xpq
```

You can also compile from source using Cargo:

```bash
cargo install --git https://github.com/FabioBatSilva/xpq.git --force
```
Grab some Parquet data:

```
wget -O users.parquet "https://github.com/apache/spark/blob/master/examples/src/main/resources/users.parquet?raw=true"
```
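If the download goes wrong, `users.parquet` may contain an HTML page instead of Parquet data. A quick sanity check, sketched here under the assumption that the file is an ordinary unencrypted Parquet file, is to confirm that it starts and ends with the 4-byte magic string `PAR1`. The helper name `is_parquet` is hypothetical and not part of xpq.

```bash
# Succeed only if the file begins and ends with the Parquet magic "PAR1".
# Relies on `head -c` and `tail -c`, available on GNU and BSD/macOS.
is_parquet() {
  [ "$(head -c 4 "$1")" = "PAR1" ] && [ "$(tail -c 4 "$1")" = "PAR1" ]
}
```

For example, `is_parquet users.parquet && echo "looks like Parquet"` prints the message only when both checks pass.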
Check the schema:

```
xpq schema users.parquet

message example.avro.User {
  REQUIRED BYTE_ARRAY name (UTF8);
  OPTIONAL BYTE_ARRAY favorite_color (UTF8);
  REQUIRED group favorite_numbers (LIST) {
    REPEATED INT32 array;
  }
}
```
Check the number of rows:

```
xpq count users.parquet

count
2
```
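When the row count is needed in a script, the header line has to be dropped. The sketch below assumes the output always has the shape shown above, a `count` header followed by the number; the helper name `last_value` is hypothetical and not part of xpq.

```bash
# Read command output on stdin and print the first field of the last
# non-blank line (the number under the "count" header).
last_value() {
  awk 'NF { last = $1 } END { print last }'
}
```

For example, `rows="$(xpq count users.parquet | last_value)"` stores the count in a shell variable.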
Read some data:

```
xpq read users.parquet

name      favorite_color  favorite_numbers
"Alyssa"  null            [3, 9, 15, 20]
"Ben"     "red"           []
```