This repository has been archived by the owner on May 24, 2022. It is now read-only.
Showing 1 changed file with 1 addition and 48 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,48 +1 @@ | ||
# `catcsv`: Concatenate directories of possibly-compressed CSV files | ||
|
||
This is a small utility that we use to reassemble many small CSV files into | |
much larger ones. In our case, the small CSV files are generated by | |
highly parallel [Pachyderm][] pipelines doing map/reduce-style | |
operations. | |
|
||
Usage: | ||
|
||
``` | ||
catcsv - Combine many CSV files into one | ||
Usage: | ||
catcsv <input-file-or-dir>... | ||
catcsv (--help | --version) | ||
Options: | ||
--help Show this screen. | ||
--version Show version. | ||
Input files must have the extension *.csv or *.csv.sz. The latter are assumed | ||
to be in Google's "snappy framed" format: https://github.com/google/snappy | ||
If passed a directory, this will recurse over all files in that directory. | ||
``` | ||
|
||
## Wish list | ||
|
||
If you'd like to add support for other common compression formats, such as `*.gz`, | ||
we'll happily accept PRs that depend on either pure Rust crates, or which | ||
include C code in the crate but still cross-compile easily with musl. | ||
|
||
## Related utilities | ||
|
||
If you're interested in this utility, you might also be interested in: | ||
|
||
- BurntSushi's excellent [xsv][] utility, which features a wide variety of | ||
subcommands for working with CSV files. Among these is a powerful `xsv | ||
cat` command, which has many options that `catcsv` doesn't (but which | ||
doesn't do directory walking or automatic decompression as far as I | ||
know). | ||
- Faraday's [scrubcsv][] utility, which attempts to normalize non-standard | ||
CSV files. | ||
|
||
|
||
[xsv]: https://github.com/BurntSushi/xsv | ||
[scrubcsv]: https://github.com/faradayio/scrubcsv | ||
[Pachyderm]: https://www.pachyderm.io | ||
Moved to https://github.com/faradayio/csv-tools. |
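
The usage text in the removed README describes two behaviors: inputs are selected by extension (`*.csv` or `*.csv.sz`), and directories are walked recursively. The following is a minimal sketch of that idea using only the Rust standard library. It is not the actual `catcsv` source: the function names `is_csv_input`, `collect_inputs`, and `concat_csv` are hypothetical, the sorted traversal order is an assumption, and so is the choice to keep only the first file's header row. Snappy decompression of `*.csv.sz` files is left out because it would need an external crate.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns true if the file name ends in `.csv` or `.csv.sz`
/// (the two extensions named in the usage text).
fn is_csv_input(path: &Path) -> bool {
    match path.file_name().and_then(|n| n.to_str()) {
        Some(name) => name.ends_with(".csv") || name.ends_with(".csv.sz"),
        None => false,
    }
}

/// Recursively collects matching files under `root`, which may be a file
/// or a directory. Entries are sorted within each directory so the result
/// is deterministic (an assumption, not documented `catcsv` behavior).
fn collect_inputs(root: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    if root.is_dir() {
        let mut entries = fs::read_dir(root)?
            .map(|entry| entry.map(|e| e.path()))
            .collect::<io::Result<Vec<PathBuf>>>()?;
        entries.sort();
        for entry in entries {
            collect_inputs(&entry, out)?;
        }
    } else if is_csv_input(root) {
        out.push(root.to_path_buf());
    }
    Ok(())
}

/// Concatenates already-decompressed CSV texts, keeping the header line of
/// the first input only (assumed behavior). The header is found by a naive
/// split on the first newline, so quoted headers containing newlines are
/// not handled.
fn concat_csv(inputs: &[&str]) -> String {
    let mut out = String::new();
    for (i, text) in inputs.iter().enumerate() {
        let body = if i == 0 {
            *text
        } else {
            // Drop everything up to and including the first newline.
            match text.find('\n') {
                Some(pos) => &text[pos + 1..],
                None => "",
            }
        };
        out.push_str(body);
        // Make sure each chunk ends with a newline before the next begins.
        if !body.is_empty() && !body.ends_with('\n') {
            out.push('\n');
        }
    }
    out
}
```

A caller would run `collect_inputs` once per command-line argument, read each returned path, and pass the contents to `concat_csv`. A real tool would stream rows rather than hold every file in memory.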