# `catcsv`: Concatenate directories of possibly-compressed CSV files

This is a small utility that we use to reassemble many small CSV files into much larger ones. In our case, the small CSV files are generated by highly parallel [Pachyderm][] pipelines doing map/reduce-style operations.

Usage:

```
catcsv - Combine many CSV files into one

Usage:
  catcsv ...
  catcsv (--help | --version)

Options:
  --help        Show this screen.
  --version     Show version.

Input files must have the extension *.csv or *.csv.sz. The latter are assumed
to be in Google's "snappy framed" format: https://github.com/google/snappy

If passed a directory, this will recurse over all files in that directory.
```

## Wish list

If you'd like to add support for other common compression formats, such as `*.gz`, we'll happily accept PRs that depend either on pure Rust crates or on crates that include C code but still cross-compile easily with musl.

## Related utilities

If you're interested in this utility, you might also be interested in:

- BurntSushi's excellent [xsv][] utility, which features a wide variety of subcommands for working with CSV files. Among these is a powerful `xsv cat` command, which has many options that `catcsv` doesn't (but which doesn't do directory walking or automatic decompression, as far as I know).
- Faraday's [scrubcsv][] utility, which attempts to normalize non-standard CSV files.

[xsv]: https://github.com/BurntSushi/xsv
[scrubcsv]: https://github.com/faradayio/scrubcsv
[Pachyderm]: https://www.pachyderm.io
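
## Sketch of the directory walk

To make the behaviour described in the usage text concrete, here is a minimal sketch of recursing over a directory and combining the `*.csv` files it contains. It is not the actual `catcsv` source. The function names `collect_csv_files` and `concat_csv` are hypothetical, and the header handling (write the first file's header once, reject files whose header differs) is an assumption, since the usage text does not say how headers are treated. The sketch uses only the standard library, so it skips `*.csv.sz` files rather than decompressing them.

```rust
use std::fs::{self, File};
use std::io::{self, BufRead, BufReader, Write};
use std::path::{Path, PathBuf};

/// Recursively collect every `*.csv` file under `root`.
/// Entries are sorted so the output order is stable across runs.
/// Note: `*.csv.sz` files are skipped here; decoding them would need
/// a snappy crate, which this std-only sketch leaves out.
fn collect_csv_files(root: &Path, found: &mut Vec<PathBuf>) -> io::Result<()> {
    if root.is_dir() {
        let mut entries: Vec<PathBuf> = fs::read_dir(root)?
            .map(|entry| entry.map(|e| e.path()))
            .collect::<io::Result<_>>()?;
        entries.sort();
        for path in entries {
            collect_csv_files(&path, found)?;
        }
    } else if root.extension().map_or(false, |ext| ext == "csv") {
        found.push(root.to_path_buf());
    }
    Ok(())
}

/// Write all `files` to `out`, emitting the header row only once.
/// Assumption: every file starts with a single-line header, and all
/// headers must match. Reading is line-based, so CRLF line endings
/// are normalized to LF and a header containing a quoted newline
/// would not be recognized correctly.
fn concat_csv<W: Write>(files: &[PathBuf], out: &mut W) -> io::Result<()> {
    let mut header: Option<String> = None;
    for path in files {
        let mut lines = BufReader::new(File::open(path)?).lines();
        // An empty file contributes nothing.
        let first = match lines.next() {
            Some(line) => line?,
            None => continue,
        };
        if header.is_none() {
            // First non-empty file: emit its header and remember it.
            writeln!(out, "{}", first)?;
            header = Some(first);
        } else if header.as_deref() != Some(first.as_str()) {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                format!("{} has a different header", path.display()),
            ));
        }
        // Copy the remaining data rows unchanged.
        for line in lines {
            writeln!(out, "{}", line?)?;
        }
    }
    Ok(())
}
```

A caller would first fill a `Vec<PathBuf>` with `collect_csv_files`, then pass it to `concat_csv` together with any `Write` target, such as a locked `std::io::stdout()`. Collecting the paths before writing keeps the traversal separate from the I/O, which makes the ordering easy to inspect.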