Motivation
Provide a simple way to convert very large (and small) XML files into JSON records or CSV files, or to perform a custom action on each record, while keeping a small footprint and staying fast. Solutions that load the entire XML file do not scale, perform poorly, and may not work at all when the files are several GB in size.
Example usages
For the usage demonstration below let's assume an input file, sample.xml, with the following contents:
<users>
  <user>
    <name>Peter</name>
    <age>45</age>
  </user>
  <user>
    <name>John</name>
    <age>25</age>
  </user>
  <user>
    <name>Cindy</name>
    <age>32</age>
  </user>
  <user>
    <name>Alex</name>
    <age>15</age>
  </user>
</users>
Given that input file, let's see some simple ways to convert it into records.
Convert XML and write to a CSV file
var xml2rec=require('xml2rec');
xml2rec('sample.xml', 'user', 'sample.csv');
Would write the contents below to sample.csv
name, age
Peter, 45
John, 25
Cindy, 32
Alex, 15
Convert XML and write to a JSON file
var xml2rec=require('xml2rec');
xml2rec('sample.xml', 'user', 'sample.json');
Would write JSON records of user to sample.json
Convert to CSV
var xml2rec=require('xml2rec');
xml2rec('sample.xml', ['user'], {}, (err, rec) => {
console.log(rec);
});
Would output
name, age
Peter, 45
John, 25
Cindy, 32
Alex, 15
Convert to CSV with no heading
By default, CSV files are output with a heading. To not output the heading, you
will have to override the default options like so:
var xml2rec=require('xml2rec');
xml2rec('sample.xml', ['user'], {heading:false}, (err, rec) => {
console.log(rec);
});
Would output
Peter, 45
John, 25
Cindy, 32
Alex, 15
Convert to JSON
var xml2rec=require('xml2rec');
xml2rec('sample.xml', ['user'], {format:'json'}, (err, rec) => {
console.log(JSON.stringify(rec));
});
Would output
{name:'Peter', age:45}
{name:'John', age:25}
{name:'Cindy', age:32}
{name:'Alex', age:15}
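Because each rec in JSON mode is a plain object, the callback does not have to print it; it can also fold records into a running summary so that nothing large is kept in memory. The sketch below is only an illustration: makeAgeStats is a hypothetical helper (not part of xml2rec), the records are simulated rather than read from sample.xml, and since the way err is populated is not described here, the sketch merely counts it.

```javascript
// Hypothetical accumulator (not part of xml2rec): returns a callback with the
// cb(err, rec) shape that keeps running totals instead of storing records.
function makeAgeStats() {
  var stats = { count: 0, total: 0, errors: 0 };
  function cb(err, rec) {
    if (err) {
      stats.errors += 1; // assumption: a truthy err means this call carries no record
      return;
    }
    stats.count += 1;
    stats.total += Number(rec.age); // assumption: age may arrive as a string or a number
  }
  return { cb: cb, stats: stats };
}

var acc = makeAgeStats();

// Simulated records, shaped like the JSON output shown above.
[
  { name: 'Peter', age: 45 },
  { name: 'John', age: 25 },
  { name: 'Cindy', age: 32 },
  { name: 'Alex', age: 15 }
].forEach(function (rec) {
  acc.cb(null, rec);
});

console.log(acc.stats.total / acc.stats.count); // 29.25

// With the real module, the same callback would presumably be passed as:
// xml2rec('sample.xml', ['user'], {format:'json'}, acc.cb);
```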
Features
When the common use cases described above do not suffice, the general parameter list below is the way to go.
Example use cases include converting XML input into some other format, updating a database, or rendering XML via Jade/Pug.
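As a rough illustration of the database use case, a JSON record can be turned into a parameterized INSERT statement before being handed to whatever database client is in use. Everything in this sketch is an assumption made for illustration: toInsert is a hypothetical helper (not part of xml2rec), the table name users is made up, and the element names are trusted as column names.

```javascript
// Hypothetical helper (not part of xml2rec): builds a parameterized INSERT
// from one JSON record. Column names come straight from the record's keys,
// so this assumes the XML element names are trusted.
function toInsert(table, rec) {
  var columns = Object.keys(rec);
  var placeholders = columns.map(function () {
    return '?';
  });
  return {
    sql: 'INSERT INTO ' + table +
      ' (' + columns.join(', ') + ')' +
      ' VALUES (' + placeholders.join(', ') + ')',
    params: columns.map(function (column) {
      return rec[column];
    })
  };
}

// Simulated record, shaped like one entry of the JSON output.
var stmt = toInsert('users', { name: 'Peter', age: 45 });
console.log(stmt.sql);    // INSERT INTO users (name, age) VALUES (?, ?)
console.log(stmt.params); // [ 'Peter', 45 ]

// Inside a real callback this might look like:
// xml2rec('sample.xml', ['user'], {format:'json'}, (err, rec) => {
//   if (!err) { var s = toInsert('users', rec); /* hand s.sql and s.params to a DB client */ }
// });
```

Values are kept separate from the SQL text so that quoting is left to the database driver.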
xml2rec()
calls a user-supplied callback function cb(err, rec) for every record identified by a user-supplied list of element name(s).
xml2rec(input_filename, [element_name1, element_name2, ..], {}, function cb(err, rec) {
});
Where
input_filename
: is the name of the (large) file that needs to be converted
[element_name1, element_name2, ..]
: is the list of XML element names in the input file. For each occurrence of element_name* in the input file, the callback will be called with rec representing the child elements. The type of the object rec depends on the third parameter, the options object {}
{}
: the options parameter
trimWhiteSpaceTextNodes=true|false // default: true. Nodes that have just white-spaces are treated as empty nodes
trimTrailingSpaces=true|false // default: false. Trailing white-spaces in text nodes are removed
trimLeadingSpaces=true|false // default: false. Leading white-spaces in text nodes are removed
escapeNewLinesInText=true|false // default: false. New lines are replaced with \n
format='csv'|'json' // default: csv
heading=true|false // default: true
outputAttributes=true|false // default: true. Not implemented yet.
cb(err, rec)
: the function to call per parsed record. The value of rec depends on the value of options.format.
If preserveWhiteSpaces is true, the whitespace between XML nodes will be preserved in rec. Otherwise, it will be discarded.
If format=='csv', cb()'s rec parameter will be a string: a line of comma-separated values, where each value is the text node of one of the element_name*'s children.
If format=='json', rec will be a JavaScript object representing the element's children.
If heading==true, then before cb() is called for each record, it will be called once with rec holding the names of the child elements.
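The heading behavior described above means a CSV callback sees the column names on its first call and data lines afterwards. A minimal sketch of a callback that takes advantage of this is shown below. makeHeadingAwareHandler is a hypothetical helper (not part of xml2rec), the input lines are simulated, and the comma split is deliberately naive: it assumes no value contains a comma.

```javascript
// Hypothetical handler (not part of xml2rec): treats the first CSV line it
// receives as the heading and turns every later line into an object keyed
// by those names. Assumes heading==true and values without embedded commas.
function makeHeadingAwareHandler(onRecord) {
  var names = null;
  return function cb(err, rec) {
    if (err) {
      throw err;
    }
    var fields = rec.split(',').map(function (field) {
      return field.trim();
    });
    if (names === null) {
      names = fields; // first call: remember the heading, emit nothing
      return;
    }
    var obj = {};
    names.forEach(function (name, i) {
      obj[name] = fields[i];
    });
    onRecord(obj);
  };
}

var seen = [];
var handler = makeHeadingAwareHandler(function (obj) {
  seen.push(obj);
});

// Simulated calls: heading line first, then one line per record.
['name, age', 'Peter, 45', 'John, 25'].forEach(function (line) {
  handler(null, line);
});

console.log(seen.length); // 2
```

Note that every value stays a string here, since a CSV line carries no type information.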