streamsearch

Streaming Boyer-Moore-Horspool searching for node.js

Description

streamsearch is a module for node.js that allows searching a stream using the Boyer-Moore-Horspool algorithm.

This module is based heavily on the Streaming Boyer-Moore-Horspool C++ implementation by Hongli Lai.
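
For readers unfamiliar with the algorithm, here is a minimal sketch of a plain (non-streaming) Boyer-Moore-Horspool search over a single Buffer. It is not this module's code, and the names `buildShiftTable` and `horspoolIndexOf` are made up for illustration. It only shows the bad-character shift idea, not how a needle that spans two chunks is handled.

```javascript
// Illustrative only: single-buffer Horspool search, not streamsearch's code.

// For each byte value, how far the window may advance when that byte is
// aligned with the last position of the needle.
function buildShiftTable(needle) {
  var table = new Array(256),
      last = needle.length - 1;
  for (var i = 0; i < 256; ++i)
    table[i] = needle.length;
  for (var j = 0; j < last; ++j)
    table[needle[j]] = last - j;
  return table;
}

// Returns the index of the first occurrence of needle in haystack, or -1.
function horspoolIndexOf(haystack, needle) {
  if (needle.length === 0)
    return 0;
  var table = buildShiftTable(needle),
      last = needle.length - 1,
      pos = 0;
  while (pos + last < haystack.length) {
    // Compare from the end of the window backwards.
    var k = last;
    while (k >= 0 && haystack[pos + k] === needle[k])
      --k;
    if (k < 0)
      return pos;
    // Shift based on the byte under the window's last position.
    pos += table[haystack[pos + last]];
  }
  return -1;
}
```

Because the shift depends only on the byte under the end of the window, many positions are skipped without being compared at all.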

Requirements

Installation

  npm install streamsearch

Example

  var StreamSearch = require('streamsearch'),
      inspect = require('util').inspect;
 
  // Buffer.from() replaces the deprecated `new Buffer()` constructor
  var needle = Buffer.from([13, 10]), // CRLF
      s = new StreamSearch(needle),
      chunks = [
        Buffer.from('foo'),
        Buffer.from(' bar'),
        Buffer.from('\r'),
        Buffer.from('\n'),
        Buffer.from('baz, hello\r'),
        Buffer.from('\n world.'),
        Buffer.from('\r\n Node.JS rules!!\r\n\r\n')
      ];
  s.on('data', function(data, start, end) {
    console.log('data: ' + inspect(data.toString('ascii', start, end)));
  });
  s.on('match', function() {
    console.log('match!');
  });
  for (var i = 0, len = chunks.length; i < len; ++i)
    s.push(chunks[i]);
 
  // output: 
  // 
  // data: 'foo' 
  // data: ' bar' 
  // match! 
  // data: 'baz, hello' 
  // match! 
  // data: ' world.' 
  // match! 
  // data: ' Node.JS rules!!' 
  // match! 
  // data: '' 
  // match! 
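
A common next step is to gather the pieces between matches instead of logging them. The following is a rough sketch of that, assuming only the `data` and `match` events and the `push()` function described in this README. `splitOnNeedle` is a hypothetical helper, not part of streamsearch.

```javascript
// Hypothetical helper, not part of streamsearch. `searcher` is assumed to
// emit 'data' (chunk, start, end) and 'match' events and to accept push().
function splitOnNeedle(searcher, chunks) {
  var parts = [],   // one Buffer per completed match
      pieces = [];  // non-needle data seen since the last match
  searcher.on('data', function(data, start, end) {
    // Copy the slice: this sketch makes no assumption about whether the
    // searcher reuses the buffer it hands out.
    pieces.push(Buffer.from(data.subarray(start, end)));
  });
  searcher.on('match', function() {
    parts.push(Buffer.concat(pieces));
    pieces = [];
  });
  for (var i = 0; i < chunks.length; ++i)
    searcher.push(chunks[i]);
  // Whatever follows the final match is returned separately.
  return { parts: parts, trailing: Buffer.concat(pieces) };
}
```

With the example above, this would be called as `splitOnNeedle(new StreamSearch(needle), chunks)`. Collecting Buffers rather than strings avoids corrupting multi-byte characters that are split across chunks.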

API

Events

  • data(< Buffer >chunk, < integer >start, < integer >end) - Emitted when non-needle data is available. The data is the portion of chunk from start (inclusive) to end (exclusive).

  • match() - Emitted each time the needle is found in the stream.

Properties

  • maxMatches - < integer > - The maximum number of matches to find. This is especially useful when multiple matches exist within a single chunk passed to push(), because it tells the module when to stop matching within that chunk. Defaults to Infinity.

  • matches - < integer > - The current match count.

Functions

  • (constructor)(< Buffer >needle) - Creates and returns a new instance that searches for needle.

  • push(< Buffer >chunk) - integer - Processes chunk. The return value is the last processed index in chunk + 1.

  • reset() - (void) - Resets internal state. Useful when you want to start searching a new or different stream.
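
The return value of push() is what makes maxMatches useful in practice: it tells the caller where the unprocessed remainder of a chunk begins. Below is a small sketch of that pattern. `pushAndGetRest` is a hypothetical helper, and it assumes push() behaves exactly as described above.

```javascript
// Hypothetical helper, not part of streamsearch. Assumes push() returns
// the last processed index in chunk + 1, as described in this README.
function pushAndGetRest(searcher, chunk) {
  var consumed = searcher.push(chunk);
  // Bytes from `consumed` onward were not processed by the searcher.
  // subarray() returns a view on the same memory, not a copy.
  return chunk.subarray(consumed);
}
```

For instance, with a needle of '\r\n\r\n' and maxMatches set to 1, the returned Buffer would presumably hold whatever follows the first blank line, which can then be handed to other code. Call reset() before reusing the instance on another stream.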