xml2rec

1.0.26 • Public • Published

Motivation

Provide a simple way to convert very large (and small) XML files into JSON records or CSV files, or to perform a custom action per record, while keeping a small footprint and staying fast. Solutions that load the entire XML document scale and perform poorly, and may not work at all when the files are several GB in size.

Example usages

For the usage demonstration below let's assume an input file, sample.xml, with the following contents:

  <users>
    <user>
      <name>Peter</name>
      <age>45</age>
    </user>
    <user>
      <name>John</name>
      <age>25</age>
    </user>
    <user>
      <name>Cindy</name>
      <age>32</age>
    </user>
    <user>
      <name>Alex</name>
      <age>15</age>
    </user>
  </users>

Given that input file, let's see some simple ways to convert it into records.

Convert XML and write to a CSV file

var xml2rec=require('xml2rec');
xml2rec('sample.xml', 'user', 'sample.csv');
Would write the contents below to sample.csv

  name, age
  Peter, 45
  John, 25
  Cindy, 32
  Alex, 15

Convert XML and write to a JSON file

var xml2rec=require('xml2rec');
xml2rec('sample.xml', 'user', 'sample.json');
Would write JSON records of user to sample.json

Convert to CSV

var xml2rec=require('xml2rec');
xml2rec('sample.xml', ['user'], {}, (err, rec) => {
  console.log(rec);
});
Would output

  name, age
  Peter, 45
  John, 25
  Cindy, 32
  Alex, 15

Convert to CSV with no heading

By default, CSV files are output with a heading. To not output the heading, you will have to override the default options like so:

var xml2rec=require('xml2rec');
xml2rec('sample.xml', ['user'], {heading:false}, (err, rec) => {
  console.log(rec);
});

Would output

  Peter, 45
  John, 25
  Cindy, 32
  Alex, 15

Convert to JSON

var xml2rec=require('xml2rec');
xml2rec('sample.xml', ['user'], {format:'json'}, (err, rec) => {
  console.log(JSON.stringify(rec));
});

Would output

	{"name":"Peter","age":45}
	{"name":"John","age":25}
	{"name":"Cindy","age":32}
	{"name":"Alex","age":15}
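
The JSON form lends itself to custom actions beyond printing. The following is a minimal sketch, not part of xml2rec: makeCollector and averageAge are hypothetical helper names, and the calls at the bottom only simulate what xml2rec would do for each user element of sample.xml. It assumes nothing about the library beyond the cb(err, rec) callback shape used in the examples above.

```javascript
// Hypothetical helper (not part of xml2rec): builds a cb(err, rec) callback
// that stores every record and every error it is handed.
function makeCollector() {
  var records = [];
  var errors = [];
  function cb(err, rec) {
    if (err) {
      errors.push(err);
      return;
    }
    records.push(rec);
  }
  return { cb: cb, records: records, errors: errors };
}

// Summarize collected records; assumes each rec has an `age` field,
// as the records of sample.xml do.
function averageAge(records) {
  if (records.length === 0) return 0;
  var total = records.reduce(function (sum, rec) {
    return sum + Number(rec.age);
  }, 0);
  return total / records.length;
}

// Simulated calls, standing in for what xml2rec would do per <user> element.
var collector = makeCollector();
collector.cb(null, { name: 'Peter', age: 45 });
collector.cb(null, { name: 'John', age: 25 });
collector.cb(new Error('simulated parse error'));
console.log(averageAge(collector.records));
```

With the real library, collector.cb would be passed as the fourth argument: xml2rec('sample.xml', ['user'], {format:'json'}, collector.cb). Note that collecting every record in memory only makes sense for small inputs; for very large files the callback should process each record and let it go.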

Features

When the common use cases described above do not suffice, use the general parameter list below. Examples include converting XML input into some other format, updating a database, or rendering XML via Jade/Pug. xml2rec() calls a user-supplied callback function cb(err, rec) for every record identified by the user-supplied list of element names.

xml2rec(input_filename, [element_name1, element_name2, ..], {}, function cb(err, rec) {
});
Where

  • input_filename: is the name of the (large) file that needs to be converted
  • [element_name1, element_name2, ..]: the list of XML element names to look for in the input file. For each occurrence of element_name* in the input file, the callback is called with rec representing the child elements. The type of the object rec depends on the third parameter, the options object {}
  • {}: the options parameter
    • trimWhiteSpaceTextNodes=true|false // default: true. Nodes that have just white-spaces are treated as empty nodes
    • trimTrailingSpaces=true|false // default: false. Trailing white-spaces in text nodes are removed
    • trimLeadingSpaces=true|false // default: false. Leading white-spaces in text nodes are removed
    • escapeNewLinesInText=true|false // default: false. New lines are replaced with \n
    • format='csv'|'json' // default: csv
    • heading=true|false // default: true
    • outputAttributes=true|false // default: true. Not implemented yet.
  • cb(err, rec): the function to call per parsed record. The value of rec depends on the value of options.format.

If preserveWhiteSpaces is true, the whitespace between XML nodes is preserved in rec; otherwise it is discarded. If format=='csv', cb()'s rec parameter is a string: a line of comma-separated values, where each value is the text node of one of element_name*'s children. If format=='json', rec is a JavaScript object representing the element's children. If heading==true, cb() is called once, before the first record, with rec holding the names of the child elements.
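
Because the heading arrives through the same callback as the data lines, a CSV consumer has to treat its first call differently. The following is a minimal sketch of one way to do that, not part of xml2rec: makeCsvRowMapper is a hypothetical helper, the split on commas is naive (it assumes values contain no quoted or embedded commas), and the calls at the bottom simulate the lines xml2rec would deliver for sample.xml with heading enabled.

```javascript
// Hypothetical helper (not part of xml2rec): wraps a row handler in a
// cb(err, rec) callback that treats the first CSV line as the heading.
function makeCsvRowMapper(onRow) {
  var columns = null; // filled by the first call, the heading line
  return function cb(err, rec) {
    if (err) {
      console.error(err);
      return;
    }
    // Naive split: assumes no quoted or embedded commas in values.
    var cells = String(rec).split(',').map(function (cell) {
      return cell.trim();
    });
    if (columns === null) {
      columns = cells; // remember the column names, emit nothing
      return;
    }
    var row = {};
    columns.forEach(function (name, i) {
      row[name] = cells[i];
    });
    onRow(row);
  };
}

// Simulated calls, standing in for xml2rec output with heading enabled.
var rows = [];
var cb = makeCsvRowMapper(function (row) { rows.push(row); });
cb(null, 'name, age');
cb(null, 'Peter, 45');
cb(null, 'John, 25');
console.log(rows);
```

All values stay strings here, since a CSV line carries no type information. If heading is disabled, this mapper would mistake the first data line for the heading, so it should only be used with heading==true.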

Install

npm i xml2rec

Weekly Downloads

11

Version

1.0.26

License

ISC

Collaborators

  • arunsundaram