untar-csv
is a Node.js library for reading data from a tar archive that contains multiple similarly structured CSV files, exposed as a single stream of objects.
Installation
npm install untar-csv
Usage
const fs = require ('fs')
const zlib = require ('zlib')
const {TarCsvReader} = require ('untar-csv')
const reader = TarCsvReader ({
  // test: entry => entry.name.indexOf ('.csv') > -1,
  // delimiter: ',',
  // skip: 0, // header lines
  // fileNumField: '#', // how to name the file # property
  // rowNumField: '##', // how to name the line # property
  // empty: null,
  columns: ['id', 'name'],
})
fs.createReadStream ('lots-of-data.tar.gz').pipe (zlib.createGunzip ()).pipe (reader)
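// `for await` must run inside an async function in a CommonJS script
async function main () {
  for await (const {id, name} of reader) {
    // do something with `id` and `name`
  }
}
main ().catch (console.error)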
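The commented-out `test` option above takes a predicate on a tar entry. As a minimal sketch, assuming the entry is a tar-stream header carrying `name` and `type` fields, a stricter filter might look like the following. The name `isCsvEntry` is hypothetical and not part of the library.

```javascript
// Hypothetical filter for the `test` option (not part of untar-csv).
// Assumes a tar-stream style header with `name` and, optionally, `type`.
const isCsvEntry = ({name, type}) =>
  (type === undefined || type === 'file')          // regular files only
  && name.toLowerCase ().endsWith ('.csv')         // case-insensitive extension check
  && !name.split ('/').pop ().startsWith ('._')    // skip macOS "._" sidecar files

console.log (isCsvEntry ({name: 'data/2020.csv', type: 'file'}))
console.log (isCsvEntry ({name: 'data/._2020.csv', type: 'file'}))
```

It could then be supplied as `test: isCsvEntry` alongside `columns` in the options object.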
Options
Most options are passed through to CSVReader; see its documentation for details.
Name | Default value | Description |
---|---|---|
test | ({name}) => true | tar entry filter, structure described at tar-stream |
columns | | Array of column definitions |
delimiter | ',' | Column delimiter |
skip | 0 | Number of header lines to ignore |
fileNumField | null | The name of the file # property (null for no numbering) |
rowNumField | null | The name of the line # property (null for no numbering) |
empty | null | The value corresponding to zero length cell content |
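To make the interplay of `columns`, `delimiter`, `skip`, `rowNumField` and `empty` more concrete, here is a naive stand-in that applies them to the text of a single file. It does not use untar-csv and is not its implementation: the function name `parseCsvText` is hypothetical, quoted fields are not handled, column definitions are assumed to be plain strings, and numbering that starts at 1 and counts only data rows is an assumption.

```javascript
// Illustrative only: a naive approximation of what the options describe.
function parseCsvText (text, {columns, delimiter = ',', skip = 0, rowNumField = null, empty = null}) {
  // Split into lines and drop blank ones (e.g. after a trailing newline)
  const lines = text.split (/\r?\n/).filter (line => line.length > 0)
  // `skip` header lines are ignored before any row is produced
  return lines.slice (skip).map ((line, i) => {
    const cells = line.split (delimiter)
    const row = {}
    columns.forEach ((name, j) => {
      const cell = cells [j]
      // Zero length (or missing) cells map to the `empty` value
      row [name] = cell === undefined || cell.length === 0 ? empty : cell
    })
    // Assumption: numbering starts at 1 and counts data rows only
    if (rowNumField !== null) row [rowNumField] = i + 1
    return row
  })
}

const rows = parseCsvText ('id;name\n1;Alice\n2;\n', {
  columns: ['id', 'name'],
  delimiter: ';',
  skip: 1,
  rowNumField: '##',
})
console.log (rows)
```

Under these assumptions the header line is dropped, the empty `name` cell in the second data row becomes `null`, and each object carries its row number under the `##` key.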