xml-splitter

Provides an easy way to split very large XML files or to extract some of their nodes

XML Splitter for NodeJS

It is a native, pure JavaScript class that provides an easy way to split huge XML data along one or more paths.

Installation

With npm do:

$ npm install xml-splitter

Examples

    var XMLSplitter = require('xml-splitter')
 
    var xs = new XMLSplitter('/root/item')
    xs.on('data', function(data) {
        console.log(data)
    })
    xs.on('end', function(counter) {
        console.log(counter+' slices !')
    })
    xs.parseString('<root><item><id>1</id></item><item><id>2</id></item></root>')

Output:

{ id: { '$t': '1' } }
{ id: { '$t': '2' } }
2 slices !

Several paths can be passed as an array:

    var XMLSplitter = require('xml-splitter')

    var xs = new XMLSplitter(['/root/item', '/root/entry'])
    xs.on('data', function(data) {
        console.log(data)
    })
    xs.on('end', function(counter) {
        console.log(counter+' slices !')
    })
    xs.parseString('<root><item><id>1</id></item><entry><id>2</id></entry></root>')

Output:

{ id: { '$t': '1' } }
{ id: { '$t': '2' } }
2 slices !

The same splitter can also read from a stream:

    var XMLSplitter = require('xml-splitter')

    var xs = new XMLSplitter('/root/item')
    xs.on('data', function(data) {
        console.log(data)
    })
    xs.on('end', function(counter) {
        console.log(counter+' slices !')
    })
    xs.parseStream(process.stdin) // or process.stdin.pipe(xs.stream) 
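
The outputs above wrap element text in a '$t' key. If plain values are easier to work with, a small helper along these lines could unwrap them inside a data handler. This is only a sketch: unwrapText is a hypothetical name and not part of xml-splitter, and it only collapses objects whose sole key is '$t'; any object with additional keys is kept as a nested object.

```javascript
// Hypothetical helper (not part of xml-splitter): replaces every
// { '$t': 'text' } object with the bare text value.
function unwrapText(node) {
  // Primitives (strings, numbers, null, undefined) are returned unchanged.
  if (node === null || typeof node !== 'object') return node;
  // Arrays, if the parser produces any, are mapped element by element.
  if (Array.isArray(node)) return node.map(unwrapText);

  var keys = Object.keys(node);
  // An object holding only '$t' collapses to its text.
  if (keys.length === 1 && keys[0] === '$t') return node.$t;

  // Otherwise, rebuild the object with unwrapped children.
  var out = {};
  keys.forEach(function (key) {
    out[key] = unwrapText(node[key]);
  });
  return out;
}

console.log(unwrapText({ id: { '$t': '1' } })); // { id: '1' }
```

Inside a handler this would be used as console.log(unwrapText(data)).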

Tests

Use nodeunit to run the tests.

$ npm install nodeunit
$ nodeunit test

API Documentation

Create a new splitter. cutter is a string, or an array of strings, containing the path(s) to split on. The options are:

  • regular: indicates whether the cutter is applied to non-nested XML parts. Defaults to true (to reduce memory consumption).
  • ignoreError: do NOT emit the error event when an XML error is encountered. Defaults to false.

parseString: splits XML contained in a string.

parseStream: splits XML read from a stream.

The data event passes three values for each slice: the data node (object), the node's tag name (string), and the node's path (string). For example:

    var xs = new XMLSplitter('//(item|unit)')
    xs.on('data', function (node, tag, path) {
        console.log(node)
        console.log(tag)
        console.log(path)
    })
    xs.parseString('<record><item><value>X</value></item><unit><value>Y</value></unit></record>')

Output:

{ value: { '$t': 'X' } }
item
/record/item
{ value: { '$t': 'Y' } }
unit
/record/unit
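
Because the handler receives the tag and the path along with the node, slices can be grouped as they arrive. The sketch below is illustrative only: createTagIndex is a hypothetical helper, not part of xml-splitter, and the handler is called by hand with the values from the output above so that it runs without the library.

```javascript
// Hypothetical helper (not part of xml-splitter): groups slices by tag name.
function createTagIndex() {
  // Null-prototype object, so tag names such as "constructor" are safe keys.
  var index = Object.create(null);

  // Same argument order as the data handler described above.
  function onData(node, tag, path) {
    if (!index[tag]) index[tag] = [];
    index[tag].push({ path: path, node: node });
  }

  return { index: index, onData: onData };
}

var tags = createTagIndex();
// With the real library, the handler would be attached with:
//   xs.on('data', tags.onData)
tags.onData({ value: { '$t': 'X' } }, 'item', '/record/item');
tags.onData({ value: { '$t': 'Y' } }, 'unit', '/record/unit');

console.log(Object.keys(tags.index)); // [ 'item', 'unit' ]
console.log(tags.index.item[0].path); // /record/item
```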

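Emitted when the stream emits the close event or when the stream is destroyed.

Emitted at the end of XML parsing (the end event used in the examples above, which receives the number of slices).

Emitted when something goes wrong (the error event).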

The XPath standard is not supported; only basic paths (including namespaces) and a few operators are implemented:

  • / : /record, /record/item
  • // : //para, /root//item
  • * : /root/*/item, /root/item/*
  • | : /(record|item), /root/(item|unit)
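
To make the four operators concrete, here is one way such patterns could be matched against an element path by translating them into a regular expression. This is a sketch under assumptions, not the library's actual implementation: cutterToRegExp is a hypothetical name, and it assumes that // spans zero or more intermediate elements and that * stands for exactly one element name.

```javascript
// Hypothetical sketch (not xml-splitter's own code): turns a cutter
// pattern into a RegExp that is tested against a full element path.
function cutterToRegExp(cutter) {
  var out = '';
  for (var i = 0; i < cutter.length; i++) {
    var c = cutter[i];
    if (c === '/' && cutter[i + 1] === '/') {
      out += '/(?:[^/]+/)*'; // "//": zero or more intermediate elements
      i++;                   // skip the second slash
    } else if (c === '*') {
      out += '[^/]+';        // "*": exactly one element name
    } else if (c === '(') {
      out += '(?:';          // "(a|b)": non-capturing alternation
    } else if (c === ')' || c === '|' || c === '/') {
      out += c;
    } else {
      // Escape characters that are special in a regular expression.
      out += c.replace(/[.+?^${}[\]\\]/g, '\\$&');
    }
  }
  return new RegExp('^' + out + '$');
}

console.log(cutterToRegExp('//(item|unit)').test('/record/item')); // true
console.log(cutterToRegExp('/root/*/item').test('/root/item'));    // false
```

Under these assumptions, /root//item matches both /root/item and /root/a/item, whereas /root/*/item requires exactly one element between root and item.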

I do not think I will implement more operators.

See also

  • https://github.com/jahewson/node-big-xml
  • https://github.com/DamonOehlman/xmlslicer

License

MIT/X11