catcsv

crates.io: catcsv
lib.rs: catcsv
version: 1.0.0
source: src
created_at: 2017-05-15 22:38:28.138414
updated_at: 2022-05-25 14:26:11.480847
description: Concatenate directories of (possibly-compressed CSV) files into one CSV file
repository: https://github.com/faradayio/catcsv
id: 14745
size: 21,037
Eric Kidd (emk)

README

catcsv: Concatenate directories of possibly-compressed CSV files

This is a small utility that we use to reassemble many small CSV files into much larger ones. In our case, the small CSV files are generated by highly parallel Pachyderm pipelines doing map/reduce-style operations.

Usage:

catcsv - Combine many CSV files into one

Usage:
  catcsv <input-file-or-dir>...
  catcsv (--help | --version)

Options:
  --help        Show this screen.
  --version     Show version.

Input files must have the extension *.csv or *.csv.sz.  The latter are assumed
to be in Google's "snappy framed" format: https://github.com/google/snappy

If passed a directory, this will recurse over all files in that directory.
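
The input rules above (accept *.csv and *.csv.sz, recurse into directories) can be sketched roughly as follows. This is a minimal illustration using only the Rust standard library, not catcsv's actual source: the function names is_supported_input and collect_inputs are hypothetical, and both the sorted traversal order and the silent skipping of non-matching files are assumptions made for the example.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns true if the file name ends in `.csv` or `.csv.sz`.
/// (Hypothetical helper; the name is not taken from catcsv.)
fn is_supported_input(path: &Path) -> bool {
    match path.file_name().and_then(|name| name.to_str()) {
        Some(name) => name.ends_with(".csv") || name.ends_with(".csv.sz"),
        None => false,
    }
}

/// Recursively collects supported input files under `path` into `out`.
/// Directory entries are visited in sorted order so the result is
/// deterministic; that ordering is an assumption of this sketch.
fn collect_inputs(path: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    if path.is_dir() {
        let mut entries = fs::read_dir(path)?
            .map(|entry| entry.map(|e| e.path()))
            .collect::<io::Result<Vec<PathBuf>>>()?;
        entries.sort();
        for entry in entries {
            collect_inputs(&entry, out)?;
        }
    } else if is_supported_input(path) {
        out.push(path.to_path_buf());
    }
    // Files with other extensions are skipped here; the real tool may
    // instead report an error for them.
    Ok(())
}
```

A caller would invoke collect_inputs once per command-line argument and then read the collected paths in order, decompressing any *.csv.sz file before appending its contents to the output.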

Wish list

If you'd like to add support for other common compression formats, such as *.gz, we'll happily accept PRs that either depend only on pure Rust crates or include C code in the crate but still cross-compile easily with musl.

Related utilities

If you're interested in this utility, you might also be interested in:

  • BurntSushi's excellent xsv utility, which features a wide variety of subcommands for working with CSV files. Among these is a powerful xsv cat command, which has many options that catcsv doesn't (but which doesn't do directory walking or automatic decompression as far as I know).

  • Faraday's scrubcsv utility, which attempts to normalize non-standard CSV files.

Commit count: 4
