saxophonist

2.0.0 • Public • Published

saxophonist

Extract elements from large XML files with node.js streams

Usage

'use strict'
 
var fs = require('fs')
var p = require('path')
var saxophonist = require('./')
var count = 0
 
console.time('parsing time')
 
fs.createReadStream(p.join(__dirname, 'wikipedia', '1.xml'))
  .pipe(saxophonist('page'))
  .on('data', function () {
    count++
  })
  .on('end', function () {
    console.timeEnd('parsing time')
    console.log('read', count, 'pages')
  })

The data format is:

{
  path: ['a', 'path', 'to', 'page'], // the path in the XML document
  children: null, // or an array with elements like this
  attributes: {}, // object with all element attribute
  text: null // or string, containing the element text
}

Acknowledgements

saxophonist is sponsored by nearForm.

License

MIT

Dependents (2)

Package Sidebar

Install

npm i saxophonist

Weekly Downloads

347

Version

2.0.0

License

MIT

Unpacked Size

7.62 kB

Total Files

11

Last publish

Collaborators

  • matteo.collina