bfj-node4

BFJ

Big-Friendly JSON. Asynchronous streaming functions for large JSON data sets.

Why would I want those?

If you need to parse huge JSON strings or stringify huge JavaScript data sets, doing so synchronously monopolises the event loop and can lead to out-of-memory exceptions. BFJ implements asynchronous functions and uses pre-allocated fixed-length arrays to try and alleviate those issues.

Is it fast?

No. BFJ yields frequently to avoid monopolising the event loop, interrupting its own execution to let other event handlers run. The frequency of those yields can be controlled with the yieldRate option, but fundamentally it is not designed for speed. If you need quick results, BFJ is not for you.
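The idea of yielding can be sketched without BFJ at all. The helper below is hypothetical and is not how BFJ is implemented; it only illustrates what a yieldRate-style setting means, by processing a fixed number of items per tick and then handing control back to the event loop with setImmediate.

```javascript
// Hypothetical helper, not part of BFJ: sums an array while yielding
// to the event loop after every `yieldRate` items.
// `yieldRate` must be a positive integer, otherwise no progress is made.
function sumWithYields(items, yieldRate) {
  return new Promise(resolve => {
    let total = 0;
    let index = 0;
    let yields = 0;

    function step() {
      // Process at most `yieldRate` items in this tick.
      const stop = Math.min(index + yieldRate, items.length);
      while (index < stop) {
        total += items[index];
        index += 1;
      }

      if (index < items.length) {
        // More work remains: let other event handlers run first.
        yields += 1;
        setImmediate(step);
      } else {
        resolve({ total, yields });
      }
    }

    step();
  });
}

sumWithYields([1, 2, 3, 4, 5], 2).then(result => {
  console.log(result.total, result.yields); // prints "15 2"
});
```

A smaller rate means more trips through the event loop and a longer total run time, which is the trade-off described above.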

What functions does it implement?

Eight functions are exported.

Four are concerned with parsing, or turning JSON strings into JavaScript data:

  • read asynchronously parses a JSON file from disk.

  • parse and unpipe are for asynchronously parsing streams of JSON.

  • walk asynchronously walks a stream, emitting events as it encounters JSON tokens. Analogous to a SAX parser.

The other four functions handle the reverse transformations, serialising JavaScript data to JSON:

  • write asynchronously serialises data to a JSON file on disk.

  • stringify asynchronously serialises data to a JSON string.

  • streamify asynchronously serialises data to a stream of JSON.

  • eventify asynchronously traverses a data structure depth-first, emitting events as it encounters items. By default it coerces promises, buffers and iterables to JSON-friendly values.

How do I install it?

If you're using npm:

npm i bfj --save

Or if you just want the git repo:

git clone git@github.com:philbooth/bfj.git

How do I read a JSON file?

const bfj = require('bfj');
 
bfj.read(path, options)
  .then(data => {
    // :)
  })
  .catch(error => {
    // :(
  });

read returns a bluebird promise and asynchronously parses a JSON file from disk.

It takes two arguments; the path to the JSON file and an options object.

If there are no syntax errors, the returned promise is resolved with the parsed data. If syntax errors occur, the promise is rejected with the first error.

How do I write a JSON file?

const bfj = require('bfj');
 
bfj.write(path, data, options)
  .then(() => {
    // :)
  })
  .catch(error => {
    // :(
  });

write returns a bluebird promise and asynchronously serialises a data structure to a JSON file on disk. The promise is resolved when the file has been written, or rejected with the error if writing failed.

It takes three arguments; the path to the JSON file, the data structure to serialise and an options object.

How do I parse a stream of JSON?

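const bfj = require('bfj');
const fs = require('fs');
const request = require('request'); // assumed: the request package from npm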
 
// By passing a readable stream to bfj.parse():
bfj.parse(fs.createReadStream(path), options)
  .then(data => {
    // :)
  })
  .catch(error => {
    // :(
  });
 
// ...or by passing the result from bfj.unpipe() to stream.pipe():
request({ url }).pipe(bfj.unpipe((error, data) => {
  if (error) {
    // :(
  } else {
    // :)
  }
}));

  • parse returns a bluebird promise and asynchronously parses a stream of JSON data.

    It takes two arguments; a readable stream from which the JSON will be parsed and an options object.

    If there are no syntax errors, the returned promise is resolved with the parsed data. If syntax errors occur, the promise is rejected with the first error.

  • unpipe returns a writable stream that can be passed to stream.pipe, then parses JSON data read from the stream.

    It takes two arguments; a callback function that will be called after parsing is complete and an options object.

    If there are no errors, the callback is invoked with the result as the second argument. If errors occur, the first error is passed to the callback as the first argument.

How do I create a JSON string?

const bfj = require('bfj');
 
bfj.stringify(data, options)
  .then(json => {
    // :)
  })
  .catch(error => {
    // :(
  });

stringify returns a bluebird promise and asynchronously serialises a data structure to a JSON string. The promise is resolved to the JSON string when serialisation is complete.

It takes two arguments; the data structure to serialise and an options object.

How do I create a stream of JSON?

const bfj = require('bfj');
 
const stream = bfj.streamify(data, options);
 
// Get data out of the stream with event handlers
stream.on('data', chunk => { /* ... */ });
stream.on('end', () => { /* ... */ });
stream.on('dataError', () => { /* ... */ });
 
// ...or you can pipe it to another stream
stream.pipe(someOtherStream);

streamify returns a readable stream and asynchronously serialises a data structure to JSON, pushing the result to the returned stream.

It takes two arguments; the data structure to serialise and an options object.
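As a rough illustration of the event-handler approach, the sketch below gathers every chunk from a readable stream into a single string. The collect helper is hypothetical, and a stream built with Readable.from stands in for the stream that streamify returns, so the example runs without BFJ.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: gathers every chunk from a readable stream
// into one string and resolves with it when the stream ends.
function collect(stream) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    stream.on('data', chunk => chunks.push(String(chunk)));
    stream.on('end', () => resolve(chunks.join('')));
    stream.on('error', reject);
  });
}

// Stand-in for the stream returned by bfj.streamify(data, options).
const standIn = Readable.from(['{"foo":', '"bar"}']);

collect(standIn).then(json => {
  console.log(JSON.parse(json).foo); // prints "bar"
});
```

Collecting the whole output in memory gives up the main benefit of streaming, so for large data sets piping to another stream is usually the better fit.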

What other methods are there?

bfj.walk (stream, options)

const bfj = require('bfj');
const fs = require('fs');
 
const emitter = bfj.walk(fs.createReadStream(path), options);
 
emitter.on(bfj.events.array, () => { /* ... */ });
emitter.on(bfj.events.object, () => { /* ... */ });
emitter.on(bfj.events.property, name => { /* ... */ });
emitter.on(bfj.events.string, value => { /* ... */ });
emitter.on(bfj.events.number, value => { /* ... */ });
emitter.on(bfj.events.literal, value => { /* ... */ });
emitter.on(bfj.events.endArray, () => { /* ... */ });
emitter.on(bfj.events.endObject, () => { /* ... */ });
emitter.on(bfj.events.error, error => { /* ... */ });
emitter.on(bfj.events.end, () => { /* ... */ });

walk returns an event emitter and asynchronously walks a stream of JSON data, emitting events as it encounters tokens.

It takes two arguments; a readable stream from which the JSON will be read and an options object.

The emitted events are defined as public properties of an object, bfj.events:

  • bfj.events.array indicates that an array context has been entered by encountering the [ character.

  • bfj.events.endArray indicates that an array context has been left by encountering the ] character.

  • bfj.events.object indicates that an object context has been entered by encountering the { character.

  • bfj.events.endObject indicates that an object context has been left by encountering the } character.

  • bfj.events.property indicates that a property has been encountered in an object. The listener will be passed the name of the property as its argument and the next event to be emitted will represent the property's value.

  • bfj.events.string indicates that a string has been encountered. The listener will be passed the value as its argument.

  • bfj.events.number indicates that a number has been encountered. The listener will be passed the value as its argument.

  • bfj.events.literal indicates that a JSON literal (either true, false or null) has been encountered. The listener will be passed the value as its argument.

  • bfj.events.error indicates that a syntax error has occurred. The listener will be passed the Error instance as its argument.

  • bfj.events.end indicates that the end of the input has been reached and the stream is closed.
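To show how the events listed above fit together, here is a minimal sketch that rebuilds a value from a walk-style event sequence. It runs against a plain EventEmitter rather than BFJ, and the events object is a hypothetical stand-in for bfj.events; the real event names may differ, so with BFJ you would use the bfj.events properties instead.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for bfj.events; the real event names may differ.
const events = {
  array: 'array', endArray: 'endArray',
  object: 'object', endObject: 'endObject',
  property: 'property', string: 'string',
  number: 'number', literal: 'literal', end: 'end'
};

// Rebuilds a value from walk-style events and passes it to `done` on `end`.
function rebuild(emitter, done) {
  const stack = []; // containers that are currently open
  let pendingKey;   // property name waiting for its value
  let root;

  function add(value) {
    const parent = stack[stack.length - 1];
    if (parent === undefined) {
      root = value;
    } else if (Array.isArray(parent)) {
      parent.push(value);
    } else {
      parent[pendingKey] = value;
    }
  }

  function open(container) {
    add(container);
    stack.push(container);
  }

  emitter.on(events.array, () => open([]));
  emitter.on(events.object, () => open({}));
  emitter.on(events.endArray, () => stack.pop());
  emitter.on(events.endObject, () => stack.pop());
  emitter.on(events.property, name => { pendingKey = name; });
  emitter.on(events.string, add);
  emitter.on(events.number, add);
  emitter.on(events.literal, add);
  emitter.on(events.end, () => done(root));
}

// Simulate the events for the JSON text {"list":[1,"two",null]}.
const emitter = new EventEmitter();
rebuild(emitter, value => console.log(JSON.stringify(value)));
emitter.emit(events.object);
emitter.emit(events.property, 'list');
emitter.emit(events.array);
emitter.emit(events.number, 1);
emitter.emit(events.string, 'two');
emitter.emit(events.literal, null);
emitter.emit(events.endArray);
emitter.emit(events.endObject);
emitter.emit(events.end); // prints {"list":[1,"two",null]}
```

Rebuilding the whole structure is only for illustration. The point of a SAX-style interface is usually to react to individual tokens without holding everything in memory.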

bfj.eventify (data, options)

const bfj = require('bfj');
 
const emitter = bfj.eventify(data, options);
 
emitter.on(bfj.events.array, () => { /* ... */ });
emitter.on(bfj.events.object, () => { /* ... */ });
emitter.on(bfj.events.property, name => { /* ... */ });
emitter.on(bfj.events.string, value => { /* ... */ });
emitter.on(bfj.events.number, value => { /* ... */ });
emitter.on(bfj.events.literal, value => { /* ... */ });
emitter.on(bfj.events.endArray, () => { /* ... */ });
emitter.on(bfj.events.endObject, () => { /* ... */ });
emitter.on(bfj.events.end, () => { /* ... */ });

eventify returns an event emitter and asynchronously traverses a data structure depth-first, emitting events as it encounters items. By default it coerces promises, buffers and iterables to JSON-friendly values.

It takes two arguments; the data structure to traverse and an options object.

The emitted events are defined as public properties of an object, bfj.events:

  • bfj.events.array indicates that an array has been encountered.

  • bfj.events.endArray indicates that the end of an array has been encountered.

  • bfj.events.object indicates that an object has been encountered.

  • bfj.events.endObject indicates that the end of an object has been encountered.

  • bfj.events.property indicates that a property has been encountered in an object. The listener will be passed the name of the property as its argument and the next event to be emitted will represent the property's value.

  • bfj.events.string indicates that a string has been encountered. The listener will be passed the value as its argument.

  • bfj.events.number indicates that a number has been encountered. The listener will be passed the value as its argument.

  • bfj.events.literal indicates that a JSON literal (either true, false or null) has been encountered. The listener will be passed the value as its argument.

  • bfj.events.error indicates that a circular reference was encountered in the data. The listener will be passed the Error instance as its argument.

  • bfj.events.end indicates that the end of the data has been reached and no further events will be emitted.
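The order in which those events arrive can be illustrated with a simplified, synchronous traversal. The listEvents function below is hypothetical and is not BFJ's implementation: it returns the events as a list instead of emitting them asynchronously, uses plain strings in place of the bfj.events properties, leaves out the final end event, and ignores promises, buffers and iterables.

```javascript
// Hypothetical, synchronous simplification of a depth-first traversal.
// Each entry is [eventName] or [eventName, value].
function listEvents(data, ancestors = new Set(), out = []) {
  if (data === null || typeof data === 'boolean') {
    out.push(['literal', data]);
  } else if (typeof data === 'string') {
    out.push(['string', data]);
  } else if (typeof data === 'number') {
    out.push(['number', data]);
  } else if (typeof data === 'object') {
    if (ancestors.has(data)) {
      // The value is one of its own ancestors: a circular reference.
      out.push(['error', new Error('Circular reference')]);
      return out;
    }
    ancestors.add(data);
    if (Array.isArray(data)) {
      out.push(['array']);
      data.forEach(item => listEvents(item, ancestors, out));
      out.push(['endArray']);
    } else {
      out.push(['object']);
      Object.keys(data).forEach(key => {
        out.push(['property', key]);
        listEvents(data[key], ancestors, out);
      });
      out.push(['endObject']);
    }
    ancestors.delete(data);
  }
  // Other types (undefined, functions, symbols) are skipped in this sketch.
  return out;
}

const names = listEvents({ a: [1, true] }).map(event => event[0]);
console.log(names.join(' '));
// prints "object property array number literal endArray endObject"
```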

What options can I specify?

Options for parsing functions

  • options.reviver: Transformation function, invoked depth-first against the parsed data structure. This option is analogous to the reviver parameter for JSON.parse.

  • options.yieldRate: The number of data items to process before yielding to the event loop. Smaller values yield to the event loop more frequently, meaning less time will be consumed by bfj per tick but the overall parsing time will be longer. Larger values yield to the event loop less often, meaning longer tick times but a shorter overall parsing time. The default value is 16384.

  • options.Promise: Promise constructor that will be used for promises returned by all methods. If you set this option, please be aware that some promise implementations (including native promises) may cause your process to die with out-of-memory exceptions. Defaults to bluebird's implementation, which does not have that problem.
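Since the reviver option is described as analogous to the reviver parameter for JSON.parse, the built-in function is a convenient way to see what depth-first invocation means. The sketch below uses JSON.parse only; it assumes, but does not verify, that a reviver passed to the BFJ parsing functions is called in the same way. The key name created is made up for the example.

```javascript
const visited = [];

// The reviver receives each key with its already-parsed value, and is
// called for nested values before the values that contain them.
function reviver(key, value) {
  visited.push(key);
  // Convert ISO date strings stored under the hypothetical key "created".
  return key === 'created' ? new Date(value) : value;
}

const parsed = JSON.parse(
  '{"meta":{"created":"2020-01-02T03:04:05.000Z"},"id":7}',
  reviver
);

// visited is now ['created', 'meta', 'id', '']: the innermost property
// first and the root value (with an empty key) last.
console.log(visited);
console.log(parsed.meta.created instanceof Date); // prints true
```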

Options for serialisation functions

  • options.space: Indentation string or the number of spaces to indent each nested level by. This option is analogous to the space parameter for JSON.stringify.

  • options.promises: By default, promises are coerced to their resolved value. Set this property to 'ignore' for improved performance if you don't need to coerce promises.

  • options.buffers: By default, buffers are coerced using their toString method. Set this property to 'ignore' for improved performance if you don't need to coerce buffers.

  • options.maps: By default, maps are coerced to plain objects. Set this property to 'ignore' for improved performance if you don't need to coerce maps.

  • options.iterables: By default, other iterables (i.e. not arrays, strings or maps) are coerced to arrays. Set this property to 'ignore' for improved performance if you don't need to coerce iterables.

  • options.circular: By default, circular references will cause the write to fail. Set this property to 'ignore' if you'd prefer to silently skip past circular references in the data.

  • options.bufferLength: The length of the write buffer. Smaller values use less memory but may result in a slower serialisation time. The default value is 1024.

  • options.yieldRate: The number of data items to process before yielding to the event loop. Smaller values yield to the event loop more frequently, meaning less time will be consumed by bfj per tick but the overall serialisation time will be longer. Larger values yield to the event loop less often, meaning longer tick times but a shorter overall serialisation time. The default value is 16384.

  • options.Promise: Promise constructor that will be used for promises returned by all methods. If you set this option, please be aware that some promise implementations (including native promises) may cause your process to die with out-of-memory exceptions. Defaults to bluebird's implementation, which does not have that problem.
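The default coercions described above can be approximated with a short synchronous helper. The coerce function below is a hypothetical sketch, not BFJ's code: it handles maps, buffers and other iterables, but not promises, and the exact output BFJ produces for edge cases may differ. The final call uses the space parameter of JSON.stringify, which the space option is said to mirror.

```javascript
// Hypothetical helper approximating the default coercions listed above.
function coerce(value) {
  if (value instanceof Map) {
    // Maps become plain objects.
    const object = {};
    value.forEach((item, key) => { object[key] = coerce(item); });
    return object;
  }
  if (Buffer.isBuffer(value)) {
    // Buffers are converted with their toString method.
    return value.toString();
  }
  if (Array.isArray(value)) {
    return value.map(item => coerce(item));
  }
  if (value !== null && typeof value === 'object') {
    if (typeof value[Symbol.iterator] === 'function') {
      // Other iterables, such as sets, become arrays.
      return Array.from(value, item => coerce(item));
    }
    const object = {};
    Object.keys(value).forEach(key => { object[key] = coerce(value[key]); });
    return object;
  }
  return value;
}

const data = {
  tags: new Set(['a', 'b']),
  sizes: new Map([['s', 1]]),
  raw: Buffer.from('hi')
};

console.log(JSON.stringify(coerce(data)));
// prints {"tags":["a","b"],"sizes":{"s":1},"raw":"hi"}

// The same data indented by two spaces per nested level.
console.log(JSON.stringify(coerce(data), null, 2));
```

Each check adds work for every item in the data, which is why skipping a coercion you do not need can improve performance.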

Why does it default to bluebird promises?

Until version 4.2.4, native promises were used. But they were found to cause out-of-memory errors when serialising large amounts of data to JSON, due to well-documented problems with the native promise implementation. So in version 5.0.0, bluebird promises were used instead. In version 5.1.0, an option was added that enables callers to specify the promise constructor to use. Use it at your own risk.

Can I specify a different promise implementation?

Yes. Just pass the Promise option to any method. If you get out-of-memory errors when using that option, consider changing your promise implementation.

Is there a change log?

Yes.

How do I set up the dev environment?

The development environment relies on Node.js, ESLint, Mocha, Chai, Proxyquire and Spooks. Assuming that you already have Node.js and npm set up, you just need to run npm install to install all of the dependencies listed in package.json.

You can lint the code with the command npm run lint.

You can run the tests with the command npm test.

What versions of Node.js does it support?

Versions 4 and later.

What license is it released under?

MIT.