Parquimetro is a small (10MB), simple tool for interacting with parquet files, built around parquet-go.
To check parquet schemas:
parquimetro schema ~/path/to/file.parquet
Options available:
- Format: -f or --format, output format, json or go. (default json)
- Tags: --tags, show Go struct tags (only available if format is go)
- Threads: -t or --threads, number of threads to use. (default 1)
The schema command can easily be combined with jq:
parquimetro schema ~/path/to/file.parquet | jq .
To read parquet files:
parquimetro read ~/path/to/file.parquet
Options available:
- Count: -c or --count, number of rows to show. (default 25)
- Skip: -s or --skip, number of rows to skip (from the beginning)
- Threads: -t or --threads, number of threads to use. (default 1)
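The count and skip options can be combined to page through a large file. The following is a minimal sketch, not part of parquimetro itself: parquet_page is a hypothetical helper name, and it assumes that flags may be placed before the file path and that values may be passed as separate arguments (--count 10), which is common for Go command-line tools but is not confirmed here.

```shell
# Hypothetical helper (not shipped with parquimetro): print one "page" of rows.
# Usage: parquet_page FILE PAGE [PAGE_SIZE]
#   PAGE is 1-based; PAGE_SIZE defaults to 25, matching the default of --count.
parquet_page() {
  pp_file=$1
  pp_page=$2
  pp_size=${3:-25}

  # Rows to skip = rows contained in all previous pages.
  pp_skip=$(( (pp_page - 1) * pp_size ))

  # Assumption: flags are accepted before the file path.
  parquimetro read --count "$pp_size" --skip "$pp_skip" "$pp_file"
}
```

For example, parquet_page ~/path/to/file.parquet 3 10 would ask for 10 rows after skipping the first 20.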
Like schema, the read command can easily be combined with jq:
parquimetro read ~/path/to/file.parquet | jq .
To get size-related data:
parquimetro size ~/path/to/file.parquet
Options available:
- Uncompressed: --uncompressed, show uncompressed size. (default true)
- Compressed: --compressed, show compressed size. (default false)
- Pretty: --pretty, show the size in a human-readable form, using the most suitable unit. (default true)
- Format: --format or -f, unit for the printed size. Accepted values: KB, MB, GB, TB. (Lower priority than pretty; set --pretty=false to use it)
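Because format only takes effect when pretty is disabled, a fixed-unit report needs both flags together. A minimal sketch, assuming boolean flags accept the --flag=value form shown for --pretty=false and that flags may precede the file path; parquet_size_in is a hypothetical helper name, not something parquimetro provides.

```shell
# Hypothetical helper (not shipped with parquimetro): report both the
# compressed and uncompressed size of a file in one fixed unit.
# Usage: parquet_size_in UNIT FILE   (UNIT is one of KB, MB, GB, TB)
parquet_size_in() {
  ps_unit=$1
  ps_file=$2

  # --pretty=false is required, otherwise --format is ignored.
  # Assumption: --compressed and --uncompressed accept explicit =true values.
  parquimetro size --pretty=false --format "$ps_unit" \
    --compressed=true --uncompressed=true "$ps_file"
}
```

For example, parquet_size_in MB ~/path/to/file.parquet would request both sizes in megabytes.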
To install, if you have Go installed:
go install github.com/ovaladares/parquimetro@latest
Alternatively, download a release from our releases page and install it.