catcsv: Concatenate directories of possibly-compressed CSV files

This is a small utility that we use to reassemble many small CSV files into much larger ones. In our case, the small CSV files are generated by highly parallel Pachyderm pipelines doing map/reduce-style operations.
Usage:
```
catcsv - Combine many CSV files into one

Usage:
  catcsv

Options:
    --help      Show this screen.
    --version   Show version.

Input files must have the extension *.csv or *.csv.sz. The latter are assumed
to be in Google's "snappy framed" format: https://github.com/google/snappy

If passed a directory, this will recurse over all files in that directory.
```
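To make the input rules above concrete, here is a minimal sketch of recursive input collection using only the Rust standard library. It is not taken from the `catcsv` source: the names `is_supported_input` and `collect_inputs` are hypothetical, and both the sorted traversal order and the choice to silently skip other files are assumptions made for illustration.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns true if the file name ends in `.csv` or `.csv.sz`.
/// (Hypothetical helper, not part of catcsv's actual API.)
fn is_supported_input(path: &Path) -> bool {
    match path.file_name().and_then(|name| name.to_str()) {
        Some(name) => name.ends_with(".csv") || name.ends_with(".csv.sz"),
        None => false,
    }
}

/// Recursively collects supported input files found at or under `path`.
/// Files with any other extension are skipped here; that is an assumption
/// of this sketch, and the real tool may report them as errors instead.
fn collect_inputs(path: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    if path.is_dir() {
        // Read the directory and sort its entries so that the
        // traversal order is deterministic.
        let mut entries: Vec<PathBuf> = fs::read_dir(path)?
            .map(|entry| entry.map(|e| e.path()))
            .collect::<io::Result<_>>()?;
        entries.sort();
        for entry in entries {
            collect_inputs(&entry, out)?;
        }
    } else if is_supported_input(path) {
        out.push(path.to_path_buf());
    }
    Ok(())
}
```

Calling `collect_inputs` on a directory fills the vector with every `*.csv` and `*.csv.sz` file below it, in sorted order within each directory.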
If you'd like to add support for other common compression formats, such as `*.gz`,
we'll happily accept PRs that depend either on pure Rust crates or on crates that
include C code but still cross-compile easily with musl.
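As a rough illustration of where a new format might slot in, the sketch below maps a file name to a compression format by extension. It does not reflect how `catcsv` is actually organized: `InputFormat`, `detect_format`, and the `.csv.gz` suffix are all hypothetical, and the decoding itself (which would need a suitable crate) is not shown.

```rust
use std::path::Path;

/// Input formats selected from the file extension.
/// (Hypothetical type for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum InputFormat {
    /// Plain `*.csv`.
    Plain,
    /// `*.csv.sz`, assumed to be in snappy framed format.
    SnappyFramed,
    /// `*.csv.gz`: an assumed suffix for a format that is not supported today.
    Gzip,
}

/// Picks a format based on the file name, or `None` if it is unsupported.
fn detect_format(path: &Path) -> Option<InputFormat> {
    let name = path.file_name()?.to_str()?;
    if name.ends_with(".csv") {
        Some(InputFormat::Plain)
    } else if name.ends_with(".csv.sz") {
        Some(InputFormat::SnappyFramed)
    } else if name.ends_with(".csv.gz") {
        // A new format would add a branch like this one, plus a decoder.
        Some(InputFormat::Gzip)
    } else {
        None
    }
}
```

With a layout like this, supporting another format would mean adding one enum variant, one extension check, and a reader that wraps the file in the matching decoder.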
If you're interested in this utility, you might also be interested in:
- `xsv`'s `cat` command, which has many options that `catcsv` doesn't (but which doesn't do directory walking or automatic decompression as far as I know).