resourceful-couchelastic

Hybrid CouchDB/Elasticsearch engine for Resourceful

The goal of CouchElastic is to provide seamless full-text and parametric search to Node.js apps, accomplished via Flatiron's Resourceful ODM layer and the awesome power of CouchDB and Elasticsearch.

Start up an empty instance of CouchDB and an empty instance of Elasticsearch, then do this:

    var resourceful = require('resourceful'),
        Couchelastic = require('resourceful-couchelastic').Couchelastic;
    
    resourceful.engines.Couchelastic = Couchelastic;  // patch in the engine 
    resourceful.use( 'couchelastic', {
        // CouchDB settings 
        host : 'localhost', // optional - defaults to localhost 
        port : 5984,        // optional - defaults to 5984 
        database:'db-name', // required 
        search:{
            // Elasticsearch settings 
            index:'my-index-name'  // optional - defaults to db name 
        }
    });
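The defaults noted in the comments above can be pictured with a small sketch. `resolveOptions` is a hypothetical helper written for illustration only; it is not part of resourceful-couchelastic, and the engine may well resolve its settings differently.

```javascript
// Hypothetical helper (not the engine's own code): fills in the defaults
// described in the comments above for a given options object.
function resolveOptions(options) {
    if (!options || !options.database) {
        throw new Error('database is required');
    }
    var search = options.search || {};
    return {
        host: options.host || 'localhost',        // defaults to localhost
        port: options.port || 5984,               // defaults to 5984
        database: options.database,               // required
        search: {
            index: search.index || options.database  // defaults to db name
        }
    };
}

console.log(resolveOptions({ database: 'db-name' }));
```

Under these assumptions, passing only `database` is enough for a local setup.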

From this point onwards each Resource that you define will have its data stored in your specified CouchDB cluster and have an Elasticsearch indexing river configured to index the data for searching.

Assuming you're already defining resources with Resourceful, nothing more is required, but there are some additional options you might want to use. Everything specified within a property's search option is passed straight into Elasticsearch's Mapping API, giving you fine-grained control over how that individual field is treated in the index.

    var Creature = resourceful.define('Creature', function(){
        this.string('name',{ required:true });
        this.string('genus');
        this.number('population');
        this.number('legs',{
            required: true,        // validation rules 
            minimum: 0,
            search:{               
                type:'integer'       // this tells ES what type of number this is 
            }
        });
        this.array('eats',{
            search:{ type:'string' }
        });
        this.string('geography',{
            search:{
                analyzer:'snowball'  // stem words to match word stem rather than exact form. 
            }
        });
    });
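To make the idea of passing search options through to the mapping more concrete, here is a rough sketch of how per-property `search` options could be gathered into a mapping-style object. `buildMapping` and the shape of its output are assumptions made for illustration; the engine's real mapping document may differ, and properties without a `search` option are simply left out here.

```javascript
// Hypothetical sketch (not the engine's implementation): collect each
// property's `search` options under a mapping keyed by the resource name.
function buildMapping(typeName, properties) {
    var fields = {};
    Object.keys(properties).forEach(function (name) {
        var search = properties[name].search;
        if (search) {
            fields[name] = search;   // passed through untouched
        }
    });
    var mapping = {};
    mapping[typeName] = { properties: fields };
    return mapping;
}

var mapping = buildMapping('Creature', {
    name:      { type: 'string', required: true },
    legs:      { type: 'number', required: true, minimum: 0, search: { type: 'integer' } },
    geography: { type: 'string', search: { analyzer: 'snowball' } }
});

console.log(JSON.stringify(mapping, null, 2));
```

Validation rules such as `required` and `minimum` stay on the Resourceful side in this sketch; only the contents of `search` end up in the mapping.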

The sync() method sets up your data sources for you. Until you sync you can save() and get() but not much more.

For each Resource type sync does the following:

  • Saves a CouchDB design document
  • Saves an Elasticsearch Mapping document
  • Creates a River on the Elasticsearch cluster that listens to CouchDB's replication stream

You'll need to sync after you make any significant design changes so that they're carried through to the schemas of both servers. Be aware that changing the data type of a field after data has already been indexed can be problematic.
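If you define several resources, you may want to sync them one after another at startup. The sketch below assumes that each resource exposes `sync(callback)` with a Node-style callback, which this document does not spell out; `syncAll` and `fakeResource` are hypothetical names used only so the example runs without CouchDB or Elasticsearch.

```javascript
// Sketch: call sync() on each resource in turn, stopping at the first error.
// Assumes resource.sync(callback) where callback receives (err).
function syncAll(resources, callback) {
    var index = 0;
    function next(err) {
        if (err) { return callback(err); }
        if (index >= resources.length) { return callback(null); }
        resources[index++].sync(next);
    }
    next();
}

// Stand-in resources so the sketch can run on its own.
var synced = [];
function fakeResource(name) {
    return {
        sync: function (cb) { synced.push(name); cb(null); }
    };
}

syncAll([fakeResource('Creature'), fakeResource('Genus')], function (err) {
    console.log(err ? 'sync failed: ' + err.message : 'synced: ' + synced.join(', '));
});
```

With real resources you would pass something like `[Creature, Genus]` instead of the stand-ins.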

Assuming you've defined a Creature as per the Resourceful docs, you should be able to run a free-text search against your database like this:

    Creature.search("fur",function( err, creatures, response ){
        // creatures is a convenient array of Creature objects that match the search 
        // response contains the entire ES response including search results and metadata 
    });

You can also search using the Elasticsearch query DSL. For example, the following fuzzy query would also match the word "oscar":

    Actor.search({
        query : {
            fuzzy : { text : "socar" }  
        }
    },function( err, actors, response ){
        // actors is an array 
        // response is the full response containing metadata, e.g. facets 
    });

Or use Resourceful's parametric query method, Resource.find():

    Creature.find({
        legs:4,
        eats:"wasps"
    },function( err, creatures, response ){
        // creatures is an array of Creature objects matching both parameters 
    });

If you have two data types related by a one-to-many foreign key you can do this (many-many not available yet):

    Creature.search("name:wolf",function( err, creatures ){
        Genus.join( creatures, function( err, creatures ){
            // creatures now contains: 
            //     creature.genus_id - foreign key 
            //     creature.genus - linked entity 
        });
    });
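Conceptually, a one-to-many join attaches the linked entity to each result by looking up its foreign key. The following is an in-memory illustration of that idea only; `joinOne` is a hypothetical function, the `id` field on the parent objects is an assumption, and the engine's actual join is not shown here.

```javascript
// Illustrative only: attach each child's parent object by foreign key.
function joinOne(children, parents, foreignKey, property) {
    var byId = {};
    parents.forEach(function (parent) {
        byId[parent.id] = parent;
    });
    children.forEach(function (child) {
        // null when no parent matches the foreign key
        child[property] = byId[child[foreignKey]] || null;
    });
    return children;
}

var genera = [{ id: 'canis', name: 'Canis' }];
var creatures = [
    { name: 'wolf', genus_id: 'canis' },
    { name: 'mystery', genus_id: 'missing' }
];

joinOne(creatures, genera, 'genus_id', 'genus');
console.log(creatures[0].genus.name);
```

Each creature keeps its `genus_id` foreign key and gains a `genus` property holding the linked entity, mirroring the comments in the example above.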

If some of your Resource data is sensitive, you might want to block it from being indexed:

    var Employee = resourceful.define('Employee', function(){
        this.string('name');
        this.number('salary',{search:{index:'no'}});
    });

TODO - clean up the object search API and allow faceting etc.

Install with npm:

    npm install resourceful-couchelastic

All tests are written with vows and should be run with npm:

    $ npm test