csv-split

0.1.0-1 • Public • Published

Command-line tool to split CSV files

This small tool grew out of a need we had during the Nigeria MDGs Info System data mop-up process: we had to process CSV files with millions of lines under a memory constraint, and a good approach was to split the CSV by one column and process each resulting file separately in R.

We used streams to read the input one line at a time and write each row to the output directory.
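
The streaming approach described above can be sketched roughly as follows. This is a minimal illustration, not csv-split's actual source: `splitByColumn` is a hypothetical helper, and it assumes the input has a header row, that fields contain no quoted commas or embedded newlines, and that every group value is safe to use as a file name. It keeps only one line in memory at a time plus one open write stream per group.

```javascript
const fs = require('fs');
const path = require('path');
const readline = require('readline');
const { once } = require('events');

// Hypothetical helper (an assumption, not part of csv-split's API).
// Reads `input` line by line and writes each row to <outDir>/<group value>.csv.
async function splitByColumn(input, column, outDir) {
  fs.mkdirSync(outDir, { recursive: true });
  const rl = readline.createInterface({ input, crlfDelay: Infinity });
  const outputs = new Map(); // group value -> write stream
  let header = null;
  let index = -1;

  for await (const line of rl) {
    if (header === null) {
      // First line: remember the header and locate the group-by column.
      header = line;
      index = line.split(',').indexOf(column);
      if (index === -1) throw new Error(`column not found: ${column}`);
      continue;
    }
    if (line === '') continue; // skip blank lines

    // Naive field split: assumes no quoted commas.
    const key = line.split(',')[index];
    let out = outputs.get(key);
    if (!out) {
      // First row for this group: open its file and repeat the header.
      out = fs.createWriteStream(path.join(outDir, `${key}.csv`));
      out.write(header + '\n');
      outputs.set(key, out);
    }
    // Respect backpressure so buffered rows do not pile up in memory.
    if (!out.write(line + '\n')) await once(out, 'drain');
  }

  // Close every output file and wait until each one has been flushed.
  await Promise.all([...outputs.values()].map(
    (out) => new Promise((resolve, reject) => {
      out.on('error', reject);
      out.end(resolve);
    })
  ));
  return [...outputs.keys()];
}
```

Under these assumptions, `splitByColumn(fs.createReadStream('data.csv'), 'state', 'out')` would split a file, and passing `process.stdin` as the input would cover the piped case. A column with a very large number of distinct values would open that many files at once, so a real tool may need to cap or recycle open streams.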

To use it, run:

csv-split -i [file_name] -b [group_by_column] [-o [output_directory]]

If output_directory is not specified, it defaults to [file_name]_by_[group].

You can also pipe from stdin:

cat data.csv | csv-split -b [group_by_column]

  Usage: csv-split [options]

  Options:

    -h, --help                output usage information
    -V, --version             output the version number
    -i, --input <file>        select an input csv
    -o, --output <directory>  select an output directory
    -b, --groupby <group>     the column you want to group by

Install

npm i csv-split

Version

0.1.0-1

License

MIT

Collaborators

  • myf