datexparser
An XML stream parser designed to make it easier to handle large volumes of XML in a pipeline. It is used in production as a datex2 parser accepting large volumes of datex2 traffic data, hence the name. It is built on xml-flow and is primarily a convenience wrapper around that module.
Installation
$ npm install @warerebel/datexparser
Constructor
The constructor accepts an options object, detailed below. No option is mandatory:
- tagsToPush - emits a 'data' event for each XML tag named in the array, carrying the corresponding XML data
- metaData - when emitting a 'data' event, adds data collected from elsewhere in the XML to the emitted object
- externals - emits a custom event when the named XML tag is parsed, and provides the tag data
const options = {
    tagsToPush: ["string"],
    metaData: [{tag: "string", data: "string"}],
    externals: [{tag: "string", event: "string"}]
};
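Because every option is optional, it can be convenient to normalize a partial options object before passing it to the constructor. The following is a minimal sketch, not part of datexparser: `buildOptions` is a hypothetical helper, and it assumes that empty arrays behave the same as omitted options.

```javascript
// Hypothetical helper (not part of datexparser): fills in missing options
// with empty arrays and checks the entry shapes listed above.
// Assumption: the constructor treats an empty array like an omitted option.
function buildOptions({ tagsToPush = [], metaData = [], externals = [] } = {}) {
    if (!tagsToPush.every((tag) => typeof tag === "string")) {
        throw new TypeError("tagsToPush must be an array of strings");
    }
    if (!metaData.every((m) => typeof m.tag === "string" && typeof m.data === "string")) {
        throw new TypeError("metaData entries must look like {tag, data}");
    }
    if (!externals.every((e) => typeof e.tag === "string" && typeof e.event === "string")) {
        throw new TypeError("externals entries must look like {tag, event}");
    }
    return { tagsToPush, metaData, externals };
}

// Only tagsToPush is supplied; the other two keys default to empty arrays.
const normalized = buildOptions({ tagsToPush: ["name"] });
console.log(normalized);
```

The returned object can then be passed to the constructor in place of a hand-written options literal.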
Getting started
The simplest example takes the following data.xml:
<data>
    <item>
        <name>item 1</name>
    </item>
    <item>
        <name>item 2</name>
    </item>
    <dataobject>
        <name>dataobject 1</name>
    </dataobject>
</data>
and emits a "data" event for every "name":
const datexparser = require("@warerebel/datexparser");
const filestream = require("fs").createReadStream("data.xml", "utf8");
const options = {
    tagsToPush: ["name"]
};
const myParser = new datexparser(options);
myParser.on("data", (data) => {
    console.log(data.$text);
});
myParser.handleRequest(filestream, (error) => {
    // callback is called when parsing is finished
});
Output:
item 1
item 2
dataobject 1
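In a pipeline it is often easier to await the end of parsing than to nest logic inside the completion callback. The sketch below wraps `handleRequest` in a Promise. `parseStream` is a hypothetical helper, not part of datexparser, and it assumes the callback receives a falsy value when parsing succeeds and an error otherwise.

```javascript
// Hypothetical helper (not part of datexparser): wraps the callback-style
// handleRequest(stream, callback) call in a Promise.
// Assumption: the callback's first argument is falsy on success.
function parseStream(parser, stream) {
    return new Promise((resolve, reject) => {
        parser.handleRequest(stream, (error) => {
            if (error) {
                reject(error);
            } else {
                resolve();
            }
        });
    });
}
```

Inside an async function this could be used as `await parseStream(myParser, filestream);`, with the "data" listeners registered beforehand as in the example above.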
Add meta data
We might want to capture information from elsewhere in the XML and emit it along with specific matching tags.
<data>
    <country>US</country>
    <items>
        <item>
            <name>item 1</name>
        </item>
        <item>
            <name>item 2</name>
        </item>
        <dataobject>
            <name>dataobject 1</name>
        </dataobject>
    </items>
</data>
In the above data.xml we want dataobjects and items to be emitted with a country code. We add metaData to options. It is an array of objects, each pairing a target tag name (tag) with the name of the metadata tag (data) to associate with it.
const datexparser = require("@warerebel/datexparser");
const filestream = require("fs").createReadStream("data.xml", "utf8");
const options = {
    tagsToPush: ["item", "dataobject"],
    metaData: [{tag: "item", data: "country"}, {tag: "dataobject", data: "country"}]
};
const myParser = new datexparser(options);
myParser.on("data", (data) => {
    console.log(`Item named ${data.name} has country code ${data.country}`);
});
myParser.handleRequest(filestream, (error) => {
    // callback is called when parsing is finished
});
Output:
Item named item 1 has country code US
Item named item 2 has country code US
Item named dataobject 1 has country code US
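When several target tags share the same metadata tag, the metaData array becomes repetitive. The sketch below generates it from two lists. `buildMetaData` is a hypothetical helper, not part of datexparser. Whether the parser supports more than one metadata entry per target tag is not stated here, so passing several data tags is an untested assumption.

```javascript
// Hypothetical helper (not part of datexparser): pairs every target tag with
// every metadata tag, producing entries in the {tag, data} shape used above.
function buildMetaData(targetTags, dataTags) {
    return targetTags.flatMap((tag) => dataTags.map((data) => ({ tag, data })));
}

const generatedOptions = {
    tagsToPush: ["item", "dataobject"],
    metaData: buildMetaData(["item", "dataobject"], ["country"])
};

console.log(JSON.stringify(generatedOptions.metaData));
// [{"tag":"item","data":"country"},{"tag":"dataobject","data":"country"}]
```

The generated array matches the hand-written metaData value in the example above.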
Custom events
If we want to treat some data differently, we can ask datexparser to emit a custom event for us to listen to, rather than just emitting data events:
<data>
    <country>US</country>
    <items>
        <item>
            <name>item 1</name>
        </item>
        <item>
            <name>item 2</name>
        </item>
        <dataobject>
            <name>dataobject 1</name>
        </dataobject>
    </items>
</data>
const datexparser = require("@warerebel/datexparser");
const filestream = require("fs").createReadStream("data.xml", "utf8");
const options = {
    tagsToPush: ["item"],
    externals: [{tag: "dataobject", event: "dataobject"}]
};
const myParser = new datexparser(options);
myParser.on("data", (data) => {
    console.log("Received data");
});
myParser.on("dataobject", (dataobject) => {
    console.log("Received dataobject");
});
myParser.handleRequest(filestream, (error) => {
    // callback is called when parsing is finished
});
Output:
Received data
Received data
Received dataobject
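With several custom events, registering listeners one at a time becomes noisy. The sketch below attaches them from a single map of event names to handlers. `registerHandlers` is a hypothetical helper, not part of datexparser. To stay runnable without the package or a data file, it uses a plain Node.js EventEmitter as a stand-in for the parser, and the emitted payloads are illustrative assumptions rather than the parser's real output.

```javascript
const { EventEmitter } = require("events");

// Hypothetical helper (not part of datexparser): attaches one listener per
// event name. Works with any object exposing an EventEmitter-style on().
function registerHandlers(emitter, handlers) {
    for (const [eventName, handler] of Object.entries(handlers)) {
        emitter.on(eventName, handler);
    }
    return emitter;
}

// Stand-in for the parser so the sketch runs on its own.
const fakeParser = new EventEmitter();
const counts = { data: 0, dataobject: 0 };

registerHandlers(fakeParser, {
    data: () => { counts.data += 1; },
    dataobject: () => { counts.dataobject += 1; }
});

// Simulate the event sequence from the example above (payloads are made up).
fakeParser.emit("data", { name: "item 1" });
fakeParser.emit("data", { name: "item 2" });
fakeParser.emit("dataobject", { name: "dataobject 1" });

console.log(counts);
```

With the real parser, the same call would take `myParser` as the first argument, since the examples above show it exposing `on()`.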