este

This module was renamed to esta. See: https://www.npmjs.com/package/esta

esta

The Simplest ElasticSearch Node.js Module

## Guide to esta Documentation

Usage:

Philosophy / Background / Detail:

Installation

Install from NPM

npm install esta --save

To check the connection status of the ElasticSearch instance/cluster, use the handy ES.CONNECT method:

var ES = require('esta');
 
ES.CONNECT(function (response) {
  console.log(response);
  // for more detailed stats see: STATS method below 
});

example ES.CONNECT response:

{ status: 200,
  name: 'Ultragirl',
  cluster_name: 'elasticsearch',
  version:
   { number: '1.4.2',
     build_hash: '927caff6f05403e936c20bf4529f144f0c89fd8c',
     build_timestamp: '2014-12-16T14:11:12Z',
     build_snapshot: false,
     lucene_version: '4.10.2' },
  tagline: 'You Know, for Search' }

Note: Esta expects you to have environment variables set up for ES_HOST and ES_PORT (see below)
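
As a minimal sketch, the CONNECT response shown above can be reduced to a simple up/down check. The helper name isConnected is hypothetical (it is not part of esta), and it assumes a healthy instance reports status: 200 as in the sample response.

```javascript
// Hypothetical helper (not part of esta): decide whether ElasticSearch is reachable.
// Assumes a healthy instance responds with status 200, as in the sample response.
function isConnected(response) {
  return Boolean(response) && response.status === 200;
}

// Usage with esta (needs a running ElasticSearch instance):
// var ES = require('esta');
// ES.CONNECT(function (response) {
//   console.log(isConnected(response) ? 'ElasticSearch is up' : 'ElasticSearch is down');
// });
```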


### CRUD Methods

Creating a new record is easy:

// define the record you want to store: 
var record = {
  index: 'twitter',
  type: 'tweet',
  id: Math.floor(Math.random() * (100000)), // or whatever GUID you want 
  message: 'Your amazing message goes here'
};
ES.CREATE(record, function(response) {
  // do whatever you like with the response 
});

A typical successful ES.CREATE response:

{ _index: 'twitter',
  _type: 'tweet',
  _id: '112669114721',
  _version: 1,
  created: true }
  • index can be compared to a database in SQL. See: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/glossary.html#glossary-index
  • type is like a table in the SQL world or a collection in other NoSQL systems. See: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/glossary.html#glossary-type
  • id is the unique key for your record, equivalent to the primary key in the SQL world.
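
A random number below 100000 can collide once many records exist. As a sketch of one alternative, the id could be generated with Node's core crypto module. The helper name randomId is hypothetical, and this assumes esta accepts a string id in the same way it accepts a number.

```javascript
var crypto = require('crypto'); // Node.js core module, no third-party dependency

// Hypothetical helper (not part of esta): build a random 32-character hex id.
function randomId() {
  return crypto.randomBytes(16).toString('hex');
}

var record = {
  index: 'twitter',
  type: 'tweet',
  id: randomId(), // assumption: a string id is accepted just like a numeric one
  message: 'Your amazing message goes here'
};

// With esta loaded as ES:
// ES.CREATE(record, function (response) { /* use the response */ });
```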


READing your record:

// define the record you want to retrieve: 
var record = {
  index: 'twitter',
  type: 'tweet',
  id: 1234 // or whatever GUID you want to look up 
};
ES.READ(record, function(response) {
  // do whatever you like with the response 
});

A typical successful ES.READ response:

{ _index: 'twitter',
  _type: 'tweet',
  _id: '735981868114',
  _version: 1,
  found: true,
  _source: { message: 'My Awesome Message' }
}

Here _source is the original data you inserted as the record.

When a record does not exist response.found is false. e.g:

{ _index: 'twitter',
  _type: 'tweet',
  _id: '804164689732',
  found: false }
  • index: the "database" our record is in
  • type: the "table" our record is in
  • id: the unique key for the record you are looking up.
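
The found flag makes it easy to separate "record exists" from "record missing". A minimal sketch, using a hypothetical helper name (sourceOrNull is not part of esta) and the response fields shown in the samples above:

```javascript
// Hypothetical helper (not part of esta): return the stored data,
// or null when the record does not exist.
function sourceOrNull(response) {
  return response && response.found ? response._source : null;
}

// With esta loaded as ES:
// ES.READ(record, function (response) {
//   var data = sourceOrNull(response);
//   console.log(data ? data.message : 'record not found');
// });
```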


UPDATE an existing record:

// define the record you want to update: 
var record = {
  index: 'twitter',
  type: 'tweet',
  id: 1234, // or whatever GUID you want 
  message: 'Revised message'
};
ES.UPDATE(record, function(response) {
  // do whatever you like with the response 
});

A typical successful ES.UPDATE response:

{ _index: 'twitter',
  _type: 'tweet',
  _id: '639403095701',
  _version: 2,
  created: false }

Notice how the _version gets incremented to 2

  • index: the "database" our record is in
  • type: the "table" our record is in
  • id: the unique key for the record you are updating.

Note: UPDATE actually performs an UPSERT:
UPdate the record if it already exists, or inSERT (create) it if it is new.
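
Because of the upsert behaviour noted above, the caller may want to know which of the two happened. A minimal sketch, assuming an upsert that inserts reports created: true (as the CREATE sample does) and one that updates reports created: false. The helper name describeWrite is hypothetical.

```javascript
// Hypothetical helper (not part of esta): describe what an UPDATE (upsert) did.
// Assumes the created and _version fields shown in the sample responses.
function describeWrite(response) {
  var action = response.created ? 'inserted' : 'updated';
  return action + ' (version ' + response._version + ')';
}

// With esta loaded as ES:
// ES.UPDATE(record, function (response) {
//   console.log(describeWrite(response));
// });
```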


DELETE an existing record:

// define the record you want to delete: 
var record = {
  index: 'twitter',
  type: 'tweet',
  id: 1234 // the GUID of the record you want to delete 
};
ES.DELETE(record, function(response) {
  // do whatever you like with the response 
});

A typical successful ES.DELETE response:

{ found: true,
  _index: 'twitter',
  _type: 'tweet',
  _id: '137167415115',
  _version: 2,
  deleted: true }

Notice that deleted is true.

  • index: the "database" our record is in
  • type: the "table" our record is in
  • id: the unique key for the record you are deleting.

If the record is NOT found, there is nothing to delete. In that case the response looks like this (found is false):

{ found: false,
  _index: 'twitter',
  _type: 'tweet',
  _id: '951078315032',
  _version: 1 }


Searching is super easy:

// set up the query: 
var query = {
  index: 'twitter',
  type:  'tweet',
  field: 'text',     // the field we want to search in 
  text:  'amazing'   // string we are searching for 
};
 
ES.SEARCH(query, function(response) {
  // response.hits.total is the number of records that matched 
  console.log('Search results found: ' + response.hits.total);
});

A typical successful ES.SEARCH response:

{ took: 8,
  timed_out: false,
  _shards: { total: 5, successful: 5, failed: 0 },
  hits:
   { total: 924,
     max_score: 0.6355637,
     hits:
      [ [Object],
        [Object],
        etc...
  }
}

The response.hits.total is 924 (the number of records that matched our SEARCH query)

  • index: the "database" our records are in
  • type: the "table" our records are in
  • field: the field in the record you want to search in.
  • text: the text you are searching for.

When NO RECORDS are FOUND the response will look like this:

{ took: 2,
  timed_out: false,
  _shards: { total: 5, successful: 5, failed: 0 },
  hits: { total: 0, max_score: null, hits: [] } }

We check the total before using the results: if(response.hits.total > 0) { /* use/display results */ } else { /* show sad face */ }
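
The check on response.hits.total can be wrapped in a small function. This is a sketch only: the helper name searchResults is hypothetical, and it assumes each entry in hits.hits carries the stored record under _source, as in ElasticSearch's standard search response.

```javascript
// Hypothetical helper (not part of esta): return the matching records,
// or an empty array when nothing matched.
// Assumes each hit exposes the stored record as hit._source.
function searchResults(response) {
  if (response && response.hits && response.hits.total > 0) {
    return response.hits.hits.map(function (hit) {
      return hit._source;
    });
  }
  return [];
}

// With esta loaded as ES:
// ES.SEARCH(query, function (response) {
//   var results = searchResults(response);
//   console.log(results.length ? results : 'no results');
// });
```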


The ES.STATS method exposes the ElasticSearch instance/cluster _stats. See: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-stats.html

ES.STATS(function (response) {
  // do something awesome with the response 
});

ElasticSearch returns rich information on cluster health, document count, etc. See: #31 for the complete STATS output.


Most of the Node.js developers I've worked with don't handle errors well.
A typical (bad) example:

if(error) {
  console.log(error); // this is worse than useless! 
}

So instead of having code full of if(err) ... we have deliberately cut errors out of the callback functions completely.

Thus, all the methods in this module have the simplified signature:

ES.METHOD(record, function(response){
  // do something with response 
});

Instead, we propose using a central error catcher, e.g.:

process.on('uncaughtException', function(err) {
  console.log('ERROR: ' + err); // preferably handle errors appropriately 
});

Or, if you are using Hapi.js, use https://github.com/hapijs/poop

For more on Errors, please read: https://www.joyent.com/developers/node/design/errors



Required: Use Environment Variables for HOST & PORT

We need to move away from using config files.
Read: http://12factor.net/config (Store config in the environment - no more config.json!)

To use environment variables for HOST & PORT on your local machine, run the following shell commands:

export ES_HOST="127.0.0.1"
export ES_PORT=9200
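
Once exported, the variables are visible to Node.js through process.env. The following is a sketch of how an application could validate them at start-up; it is not how esta itself reads them, and the helper name esConfig is hypothetical.

```javascript
// Hypothetical helper (not part of esta): read and validate ES_HOST / ES_PORT.
// Takes the environment as an argument so it can be exercised without touching process.env.
function esConfig(env) {
  var missing = ['ES_HOST', 'ES_PORT'].filter(function (name) {
    return !env[name];
  });
  if (missing.length > 0) {
    throw new Error('Missing environment variable(s): ' + missing.join(', '));
  }
  // environment variables are always strings, so convert the port to a number
  return { host: env.ES_HOST, port: parseInt(env.ES_PORT, 10) };
}

// Usage:
// var config = esConfig(process.env);
```

Failing early with a clear message is easier to diagnose than a refused connection later on.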

Sample .travis.yml file:

language: node_js
node_js:
  - "0.10"
services:
  - elasticsearch
env:
  - ES_HOST="127.0.0.1" ES_PORT=9200

If you are new to Travis-CI see: https://github.com/docdis/learn-travis

(Optional) Use Vagrant to Run ElasticSearch

If, like me, you prefer not to have Java running on your dev machine (because it's chronically insecure), I highly recommend using Vagrant to run a lightweight virtual machine that isolates ElasticSearch, so Java is only installed in the VM.

The other obvious benefit of using Vagrant is that all your fellow developers will have exactly the same (latest) build so there's no risk of version incompatibility. Learn more at: https://github.com/nelsonic/learn-vagrant

I've included a Vagrantfile in this repo which will get you up-and-running with Ubuntu, Node.js & ElasticSearch with a single command: vagrant up

If you have any questions, just ask!



Philosophy / Background / Detail

We wanted something simpler.
Easier to understand (under 300 lines of code!) and thus much easier to extend if you need to!

We wanted a way of "soft-deleting" records (i.e. avoiding data loss). If you like the idea of being able to recover accidentally deleted data, you will love our DELETE method. See: lib/delete.js

Only Core Modules

Zero external dependencies (3rd party modules).

There are quite a few modules in the Node.js ecosystem for use with ElasticSearch. However, the "Official" ElasticSearch Node.js module https://github.com/elasticsearch/elasticsearch-js has so many dependencies, and especially devDependencies, that I found it hard to contribute to the project.

Our aim is to build something that only uses core modules with Stable APIs, so we never have to think about upgrading - it also makes it a lot easier for others to learn how the module works, which invites contribution from the community.
Given that ElasticSearch has a REST API, we only use Node's http (core) module, and this is kept DRY (in a single file). See: lib/http_request.js
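
To illustrate why the core http module is enough, here is a rough sketch of a REST call to ElasticSearch. It is not the code in lib/http_request.js: the names requestOptions and send are hypothetical, and the sketch assumes ElasticSearch's /index/type/id document path.

```javascript
var http = require('http'); // Node.js core module

// Hypothetical helper: build http.request options for one record.
// Assumes the ElasticSearch document path /index/type/id.
function requestOptions(method, record, env) {
  return {
    host: env.ES_HOST,
    port: env.ES_PORT,
    method: method,
    path: '/' + [record.index, record.type, record.id].map(encodeURIComponent).join('/'),
    headers: { 'Content-Type': 'application/json' }
  };
}

// Hypothetical helper: send the request and pass the parsed JSON reply to callback.
// No 'error' listener is attached, so a connection failure surfaces as an
// uncaught exception (to be handled by a central error catcher).
function send(options, body, callback) {
  var req = http.request(options, function (res) {
    var chunks = [];
    res.on('data', function (chunk) { chunks.push(chunk); });
    res.on('end', function () {
      callback(JSON.parse(Buffer.concat(chunks).toString()));
    });
  });
  if (body) {
    req.write(JSON.stringify(body));
  }
  req.end();
}

// Usage (needs a running ElasticSearch instance):
// var record = { index: 'twitter', type: 'tweet', id: 1234 };
// send(requestOptions('GET', record, process.env), null, console.log);
```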

Dev Dependencies

We carefully select and only use well-maintained "pure" JavaScript modules in our development toolchain:

  • Tape for testing: https://github.com/substack/tape
  • Istanbul for Code Coverage: https://github.com/nelsonic/learn-istanbul
  • Chalk for colors in test output (readability)
  • Pre-commit for ensuring all commits pass strict quality checks before being pushed to GitHub. see: https://github.com/nelsonic/learn-pre-commit
  • jshint checks code style is consistent: https://github.com/nelsonic/learn-jshint
  • CodeClimate for tracking code quality and test coverage: https://github.com/nelsonic/learn-codeclimate

Code Quality

If you are looking for a module you can trust, these are the "badges" you are looking for.

Contributing

All contributions are welcome.
If anything is unclear please create an issue: https://github.com/nelsonic/esta/issues

We prefer to have the METHOD names in UPPERCASE because it makes them easy to spot and differentiate from your code. If you feel they are a bit "shouty", all methods are available in lowercase too; take your pick! See: http://git.io/pZ6t
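
For readers curious how a module could expose both spellings, here is one possible sketch. It is not taken from esta's source; the helper name withLowercaseAliases and the example object are hypothetical.

```javascript
// Hypothetical sketch: give every UPPERCASE method a lowercase alias
// that points at the same function.
function withLowercaseAliases(api) {
  Object.keys(api).forEach(function (name) {
    var lower = name.toLowerCase();
    if (typeof api[name] === 'function' && !(lower in api)) {
      api[lower] = api[name];
    }
  });
  return api;
}

// Example with a stand-in object (not the real esta module):
var example = withLowercaseAliases({
  CREATE: function () { return 'created'; }
});
// example.create and example.CREATE are now the same function
```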

The choice of module name was the answer to the question:

Q: Which ElasticSearch Node Module should I use...?
A: https://translate.google.com/#auto/en/esta

MIT