This package has been deprecated

Author message:

No longer supported

kypseli

1.0.18 • Public • Published

Kypseli (Κυψέλη in Greek, meaning beehive) is a search engine written in pure JavaScript that uses MySQL as its database. It focuses on robustness while taking up as little disk space as possible.

To install it, run:

npm install --save kypseli

How it Works

Kypseli indexes documents in JSON format by parsing their fields and tokenizing every string field. It then saves records that associate each keyword with the newly inserted document.
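The tokenize-and-associate step can be pictured with a small in-memory inverted index. This is only a sketch of the general technique, not Kypseli's actual code: Kypseli keeps its records in MySQL, and the names `tokenize`, `indexDocument` and `lookup` are assumptions made up for this illustration.

```javascript
// Illustrative in-memory inverted index; Kypseli itself stores its records in MySQL.

// Split a string into lowercase word tokens (hypothetical tokenizer).
function tokenize(text) {
    return text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean);
}

// keyword -> Set of ids of the documents that contain the keyword
const index = new Map();

// Tokenize every string field of a document and record keyword -> docId.
function indexDocument(docId, doc) {
    for (const value of Object.values(doc)) {
        if (typeof value !== "string") continue; // only string fields are tokenized
        for (const token of tokenize(value)) {
            if (!index.has(token)) index.set(token, new Set());
            index.get(token).add(docId);
        }
    }
}

// Return the ids of the documents that contain the keyword.
function lookup(keyword) {
    return [...(index.get(keyword.toLowerCase()) || [])];
}

indexDocument(1, { title: "This is my post", content: "This is the content of my post.", views: 3 });
indexDocument(2, { title: "Another post" });
console.log(lookup("post"));    // [ 1, 2 ]
console.log(lookup("content")); // [ 1 ]
```

The numeric `views` field is skipped because only string fields are tokenized.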

JSON documents are indexed and stored based on which fields the user wishes to store. The results of every search query are cached, and a cached result is updated whenever a document indexed afterwards contains the search term.
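The caching behaviour described here could look roughly like the sketch below. It is a simplified in-memory model limited to single-word queries; the `cache` and `documents` maps and the `add`, `search` and `contains` functions are hypothetical and do not reflect how Kypseli lays out its MySQL tables.

```javascript
// Hypothetical cache: search term -> array of matching document ids.
const cache = new Map();
// Hypothetical document store: document id -> document text.
const documents = new Map();

// True when the text contains the term as a whole word (case-insensitive).
function contains(text, term) {
    return text.toLowerCase().split(/\W+/).includes(term.toLowerCase());
}

// Answer from the cache when possible; otherwise scan and cache the result.
function search(term) {
    if (!cache.has(term)) {
        const hits = [];
        for (const [id, text] of documents) {
            if (contains(text, term)) hits.push(id);
        }
        cache.set(term, hits);
    }
    return cache.get(term);
}

// Store a new document and append it to every cached query it matches.
function add(id, text) {
    documents.set(id, text);
    for (const [term, hits] of cache) {
        if (contains(text, term) && !hits.includes(id)) hits.push(id);
    }
}

add(1, "This is my post");
console.log(search("post")); // [ 1 ]  (computed, then cached)
add(2, "A second post");
console.log(search("post")); // [ 1, 2 ]  (the cached entry was updated on insert)
```

Updating the cache at insert time keeps repeated queries cheap, at the price of extra work per indexed document.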

Documentation

Versions earlier than 1.0.14 were built to support older ECMAScript versions, because some of my servers (and many hosting services) had not updated to newer versions of Node. As of 1.0.14, the package is transitioning to full support for current JavaScript.

Require the File

const SearchIndex = require("kypseli");

Create a Search Index

const si = new SearchIndex({
    "host": "mysqlhost",
    "user": "username",
    "password": "password",
    "dbname": "search_index"
});

Options

  • host*: MySQL host
  • user*: MySQL database username
  • password*: MySQL database password
  • dbname: The name of the MySQL database to create or access
  • fields: An array of strings indicating which fields of the JSON documents to save in the database (explained below)

*: Required fields

If you only want to save some of the documents' fields, add the fields option to the options object passed to the class, as follows:

{
    "fields": [ "field_i_want_to_keep" ],
}
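As a rough mental model, limiting the stored fields amounts to copying only the listed keys of each document. The helper below is a hypothetical sketch of that idea and is not part of Kypseli's API; the source does not say whether fields outside the list are still tokenized for searching.

```javascript
// Hypothetical helper: keep only the listed fields of a document.
function pickFields(doc, fields) {
    const kept = {};
    for (const name of fields) {
        // Copy a field only if the document actually has it.
        if (Object.prototype.hasOwnProperty.call(doc, name)) {
            kept[name] = doc[name];
        }
    }
    return kept;
}

const doc = {
    title: "This is my post",
    content: "This is the content of my post.",
    draft: true
};

console.log(pickFields(doc, ["title"])); // { title: 'This is my post' }
```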

Index and Search

Note that indexing currently takes a long time. The code has not yet been optimized for fast document inserts; this is planned and will be an important milestone.

The search index always takes documents as JSON. The add function accepts either a single document or an array of documents. To add new documents and search them, do the following:

const data = {
    "title": "This is my post",
    "content": "This is the content of my post."
};

// Index a document
si.add(data, (docId) => {
    console.log("Added document " + docId);
});

const page = 1;

// Search for something
si.search("search terms", page, (error, result) => {
    if (error) {
        console.error(error);
        return;
    }
    console.log(result);
});
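Since the search callback follows the Node (error, result) convention, it can probably be wrapped with Node's util.promisify for use with promises or async/await. The sketch below assumes only that convention; `makeSearchAsync` and `fakeIndex` are names invented for this example, and `fakeIndex` is a stand-in so the snippet runs without a MySQL server. The add callback shown above receives only a document id, so this wrapper would not suit it as-is.

```javascript
const util = require("util");

// Wrap a callback-style search method so that it returns a promise.
// `si` is any object whose search(terms, page, callback) calls back with (error, result).
function makeSearchAsync(si) {
    return util.promisify(si.search.bind(si));
}

// Stand-in object, used only so this sketch runs without a MySQL server.
const fakeIndex = {
    search(terms, page, callback) {
        setImmediate(() => callback(null, { terms, page, hits: [] }));
    }
};

const searchAsync = makeSearchAsync(fakeIndex);

searchAsync("search terms", 1)
    .then((result) => console.log(result))
    .catch((error) => console.error(error));
```

With a real index, passing the SearchIndex instance instead of `fakeIndex` would be the intended use.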
 

Updating or Removing

To update a document, pass its id and the new version of the document to the update function:

let newDoc = { ... };
 
si.update(123, newDoc, (error, newId) => {
    console.log(error, newId);
});

To delete a document, pass its id to the remove function:

si.remove(123, (error) => {
    console.log(error);
});

Misc Functions

To get the total size of the database:

si.getSize((error, size) => {
    console.log(size);
});

To get the size of each individual table:

si.getTableSizes((error, sizes) => {
    console.log(sizes);
});

Alternatives

I created this package because I could not find a good search engine for Node.js that was still maintained. The only maintained one I found is search-index, but it has two problems: it requires a LOT of disk space, and it isn't as fast as I'd hoped, since it uses LevelDB.

Changelog

Version 1.0.15

This version makes importing batches of documents more efficient and fixes a timing and duplication issue in updates to cached search results.

Version 1.0.12

This version fixes various issues, including a bug in recalculating the cached search results of a query.

Version 1.0.8

This version fixes the failure that occurred when the database or its structure did not exist yet, and lets the user define the database name.

TODO

  • Use MySQL as the data storage
  • Search query caching
  • Support adding a batch of documents
  • Natural language processing for similar words or search queries
  • "Did you mean xxxx?"
  • Deletion of indexed files
  • Using Redis for memory caching

License

GNU General Public License version 3. Check out the LICENSE.md file.

Collaborators

  • conmarap