fulltext-engine

A level-queryengine plugin to index documents and perform full-text search on them in levelup/leveldb

Query your levelup/leveldb engine using full text search phrases with INDEXES.

This is a plugin for level-queryengine.

Install through npm:

$ npm install fulltext-engine

Usage:

var levelQuery = require('level-queryengine'),
    fulltextEngine = require('fulltext-engine'),
    levelup = require('levelup'),
    db = levelQuery(levelup('my-db'));
 
db.query.use(fulltextEngine());
 
// index the properties you want (the 'doc' property on objects in this case): 
db.ensureIndex('doc', 'fulltext', fulltextEngine.index());
 
db.batch(makeSomeData(), function (err) {
  if (err) throw err;
  // will find all objects where both 'my' AND 'query' are present
  db.query('doc', 'my query')
    .on('data', console.log)
    .on('stats', function (stats) {
      // stats contains the query statistics in the format 
      //  { indexHits: 1, dataHits: 1, matchHits: 1 }
    });
 
  // will find all objects where either 'my' OR 'query' is present
  db.query('doc', 'my query', 'or')
    .on('data', console.log)
    .on('stats', function (stats) {
      // stats contains the query statistics in the format 
      //  { indexHits: 1, dataHits: 1, matchHits: 1 }
    });
});

Currently only one index strategy is supported:

  • 'fulltext' (default) - full text index the property defined by the indexName.

Note: if you want to index a property whose name differs from the indexName, pass that property's path to fulltextEngine.index().

db.query.use(fulltextEngine());
 
// index 'stringfield' property of objects 
db.ensureIndex('stringfield', 'fulltext', fulltextEngine.index());
 
// index the 'anotherName' property of objects but store it in the 'oneName' index 
db.ensureIndex('oneName', 'fulltext', fulltextEngine.index('anotherName'));

If no full-text index exists for the queried property, the query falls back to a full leveldb "table" scan. The results are the same as for an indexed query; the scan just takes longer.
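
To make the trade-off concrete, here is a minimal in-memory sketch. It is not fulltext-engine's actual implementation: the docs store and the tokenize, matchesAll, buildIndex, run, queryWithIndex and queryWithScan helpers are all hypothetical, and the counters only mimic the shape of the stats object shown in this README.

```javascript
// Hypothetical in-memory illustration; fulltext-engine's internals may differ.
var docs = {
  a: { doc: 'my first query' },
  b: { doc: 'another document' },
  c: { doc: 'my second note' }
};

// Assumed tokenizer: lowercase words split on non-word characters.
function tokenize(text) {
  return String(text).toLowerCase().split(/\W+/).filter(Boolean);
}

// AND match: every search word must appear in the text.
function matchesAll(text, words) {
  var present = tokenize(text);
  return words.every(function (w) { return present.indexOf(w) !== -1; });
}

// Build a word -> [keys] map for one property of every stored object.
function buildIndex(store, prop) {
  var index = Object.create(null);
  Object.keys(store).forEach(function (key) {
    tokenize(store[key][prop]).forEach(function (word) {
      index[word] = index[word] || [];
      if (index[word].indexOf(key) === -1) index[word].push(key);
    });
  });
  return index;
}

// Test a list of candidate keys and count how many objects were read.
function run(store, prop, words, keys, indexHits) {
  var out = { results: [], stats: { indexHits: indexHits, dataHits: 0, matchHits: 0 } };
  keys.forEach(function (key) {
    out.stats.dataHits++;
    if (matchesAll(store[key][prop], words)) {
      out.stats.matchHits++;
      out.results.push(key);
    }
  });
  return out;
}

// Indexed query: only read the objects listed under the first search word.
function queryWithIndex(store, index, prop, searchText) {
  var words = tokenize(searchText);
  var candidates = index[words[0]] || [];
  return run(store, prop, words, candidates, candidates.length);
}

// Full scan: read every object in the store.
function queryWithScan(store, prop, searchText) {
  return run(store, prop, tokenize(searchText), Object.keys(store), 0);
}

var index = buildIndex(docs, 'doc');
var indexed = queryWithIndex(docs, index, 'doc', 'my query');
var scanned = queryWithScan(docs, 'doc', 'my query');
console.log(indexed.results, indexed.stats);
console.log(scanned.results, scanned.stats);
```

Both calls return the same matching key; the indexed version reads two objects while the scan reads all three, which is the difference the dataHits counter is meant to expose.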

The result stream that gets returned from db#query also emits 'stats' events so you can tell if an index did or didn't get used.

  db.query('doc', 'my query', 'or')
    .on('data', console.log)
    .on('stats', function (stats) {
      // stats looks like this if an index got used 
      //  { indexHits: 1, dataHits: 1, matchHits: 1 }
 
      // stats looks like this if an index did not get used 
      //  { indexHits: 0, dataHits: 100, matchHits: 1 }
    });
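
If you want to act on these numbers rather than read them by eye, a small helper can classify a stats object. The describeStats function below is hypothetical (it is not part of fulltext-engine) and assumes, as in the examples above, that indexHits is 0 when no index was used. A query whose index simply had no entries for the search terms could presumably also report 0, so treat the flag as a hint.

```javascript
// Hypothetical helper: summarise a stats object emitted by the result stream.
function describeStats(stats) {
  return {
    // Assumption: a non-zero indexHits means an index was consulted.
    usedIndex: stats.indexHits > 0,
    // Fraction of the objects read from the store that actually matched.
    selectivity: stats.dataHits > 0 ? stats.matchHits / stats.dataHits : 0
  };
}

console.log(describeStats({ indexHits: 1, dataHits: 1, matchHits: 1 }));
console.log(describeStats({ indexHits: 0, dataHits: 100, matchHits: 1 }));
```

It could be wired into a query with .on('stats', function (stats) { console.log(describeStats(stats)); }).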

fulltextEngine() returns a full-text query engine for use with level-queryengine.

Note: you can pass an optional boolean parameter to the fulltextEngine factory function to enable "fuzzy" search, in which similar-sounding words match (e.g. "for" and "fear" would match).
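
This README does not say which phonetic algorithm the fuzzy mode uses. Purely as an illustration, the sketch below uses classic Soundex (an assumption, not necessarily what fulltext-engine does) to show how "for" and "fear" can collapse to the same key. The soundex and soundsAlike functions are hypothetical names.

```javascript
// Illustration only: classic Soundex, assumed here as a stand-in for
// whatever phonetic matching the fuzzy mode really uses.
var CODES = {
  b: 1, f: 1, p: 1, v: 1,
  c: 2, g: 2, j: 2, k: 2, q: 2, s: 2, x: 2, z: 2,
  d: 3, t: 3,
  l: 4,
  m: 5, n: 5,
  r: 6
};

// Reduce a word to a first letter plus three digits, e.g. "Robert" -> "R163".
function soundex(word) {
  var letters = String(word).toLowerCase().replace(/[^a-z]/g, '');
  if (!letters) return '';
  var key = letters[0].toUpperCase();
  var prev = CODES[letters[0]] || 0;
  for (var i = 1; i < letters.length && key.length < 4; i++) {
    var ch = letters[i];
    var code = CODES[ch] || 0;
    // Append a digit unless it repeats the previous consonant's code.
    if (code && code !== prev) key += code;
    // 'h' and 'w' do not separate repeated codes; vowels do.
    if (ch !== 'h' && ch !== 'w') prev = code;
  }
  while (key.length < 4) key += '0';
  return key;
}

// Two words "sound alike" when their keys are identical.
function soundsAlike(a, b) {
  return soundex(a) === soundex(b);
}

console.log(soundex('for'), soundex('fear'), soundsAlike('for', 'fear'));
```

Under this scheme a fuzzy index would store the phonetic key instead of the literal word, so the trade-off is more matches at the cost of precision.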

fulltextEngine.index() returns a full-text indexing strategy to use with db.ensureIndex.

If no property path is passed to fulltextEngine.index(), ensureIndex indexes the object path defined by the index name.

db.query(pathName, searchText) searches the object path pathName for the presence of searchText. The default search is an AND (i.e. all search terms must be present). Pass 'or' as the third argument to match documents that contain ANY of the search terms.
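
The difference between the two modes can be sketched on plain strings. The tokenize and matches functions below are hypothetical and only illustrate the AND/OR semantics described above; the tokenizer in fulltext-engine may split or normalise words differently.

```javascript
// Assumed tokenizer: lowercase words split on non-word characters.
function tokenize(text) {
  return String(text).toLowerCase().split(/\W+/).filter(Boolean);
}

// Hypothetical matcher: AND by default, OR when mode is 'or'.
function matches(text, searchText, mode) {
  var present = tokenize(text);
  var terms = tokenize(searchText);
  var has = function (term) { return present.indexOf(term) !== -1; };
  return mode === 'or' ? terms.some(has) : terms.every(has);
}

console.log(matches('this is my query', 'my query'));  // both terms present
console.log(matches('my document', 'my query'));       // 'query' is missing
console.log(matches('my document', 'my query', 'or')); // one term is enough
```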

This project is under active development. Here's a list of things I'm planning to add:

  • indexing word frequencies and using them to rank better-matching documents higher
  • proper Information Retrieval ranking algorithms