Apache Lucene API, types and bindings for Node.js.
- Reference API
- Node.js API ported straightforwardly from the Java API (respecting the original Java packages, types and interfaces as far as I could).
- By default, methods are synchronous, but there are also signatures that return Promises.
- Lucene Java API calls are made directly from Node.js through node-java (no IPC, no process spawn, no server).
- Written in TypeScript.
- Apache Lucene is bundled with the project and is easy to upgrade.
- Requirements: Node.js and Java. Python is also required for the installation.
npm install --save node-lucene
Also, the Lucene .jar files need to be accessible. By default, the Lucene
7.7.2 jar files are read from
node_modules/node-lucene/lucene-lib/. This can be configured in
node-lucene/src/util/getLuceneJava.ts. Lucene version 8 could be adapted easily, but it currently does not pass the tests because its API is not 100% compatible with Lucene version 7.
Important: Node.js version 12 is not currently supported (since the npm package
java does not support it yet). Node.js version 11 is supported.
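As a rough illustration of what "accessible" means here: node-java exposes a java.classpath array onto which jar paths are pushed before the JVM is first used. The sketch below only collects the .jar paths from a directory. `listJars` is a hypothetical helper, not part of node-lucene, and the actual logic in getLuceneJava.ts may well differ.

```typescript
import * as fs from 'fs';
import * as path from 'path';

// Hypothetical helper (not part of node-lucene): returns the absolute paths
// of all .jar files found directly inside `dir`, sorted by file name.
function listJars(dir: string): string[] {
  return fs
    .readdirSync(dir)
    .filter(name => name.toLowerCase().endsWith('.jar'))
    .sort()
    .map(name => path.resolve(dir, name));
}

// Assumed usage with node-java (needs the `java` npm package and a JVM):
//   const java = require('java');
//   listJars('node_modules/node-lucene/lucene-lib')
//     .forEach(jar => java.classpath.push(jar));
```

Keeping the directory scan separate from the classpath push makes it possible to check the jar location without starting a JVM.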
Example: indexing and searching in RAM
// or `import * as lucene from 'node-lucene'`
// We will be indexing the following "books":
// before using this library make sure you call the following function for loading lucene .jars in Java classpath:
lucene.initialize
// create lucene index in memory:
// add the documents:
books.forEach
// because we finished adding documents we close the index so we can start searching:
writer.close
// now we want to search so we create a directory reader, index searcher and a query parser
// searching for 'phrase does not exists' should return 0 results
// searching for 'cold' should return 1 results. We iterate found documents and print its author and titles:
topDocs = searcher.search(parser.parse('cold'), 10)
// TODO: lucene issue: why searching for 'the' is returning 0 results?
// now get back the document from the index to access matched book's author and title
console.log`Found "" authored by `;
Example: indexing in File System
WIP / Planning
- Still deciding between handcrafting the implementation and auto-generating it with a java2js-like tool. Right now it is handcrafted, respecting the original Lucene Java package structure with folders and TS namespaces, and using auto-generated java2js interfaces as a reference and for copy/paste.
- Lucene is THE best index / search engine implementation.
- I want to call the Lucene API directly from my Node.js program without fork(), spawn(), request(), sockets, or any IPC.
- I want to use the original Lucene Java implementation, not a clone/port like lucy, and be able to upgrade the Java Lucene implementation easily at any time.
- I want to call Lucene Java from Node.js through an analogous API, as fast as possible.
- I don't want to start a server like solr or elasticsearch. Example: a text editor indexing lots of source code, as the eclipse IDE does.
- support Node.js 12: currently the java npm package does not support Node.js 12, so neither do we. Node.js 11 is supported.
- make every class extend java.lang.Object, and have java.lang.Object extend JavaBase, so they all implement toString() etc.
- contribute to the node-java README: document the standard
promisify, require('util').promisify, so that no library is necessary. Also use it as the default promisify implementation when it exists, since it is supported from Node.js 8.0.0 (an LTS version now, so it should be available in most cases).
- use https://github.com/joeferner/node-java-maven to manage the Lucene jars and run it in the npm prepare script
- use reflection in Java to get method / class signatures and generate the TS interfaces and implementations automatically using java2js. We realized this would be a very hard task, but it would be ideal.
- contribute node-java README: fix several typos : "classes separate usi", "if an error occurs", "sted classes separate u"
- consider deprecating the *Async methods, since promises are available OOTB in Node LTS, can be polyfilled in older Node versions, and callbacks are "obsolete" today in favor of promises. Supporting *Async is double work and testing it is harder (testing promises is just copy & paste of the *Sync tests plus async/await, while testing *Async implies callbacks and require('async'), so the test code is totally different: too much work and no real gain).
- async constructors? Today, calling
var o = new java.lang.Object() is sync. Should we support an async analog, for example:
var o = await java.lang.Object.newAsync()
- node-java README: the getStaticFieldValue parameter description "callback(err, item) - Callback to be called when the class is created." is confusing. Does that parameter really exist? If it does, "class is created" is wrong.
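The promisify items in the list above can be illustrated with Node's built-in util.promisify. In this minimal sketch, `getGreetingAsync` is a made-up callback-style function standing in for a node-java *Async method; it is not a real node-java or node-lucene call.

```typescript
import { promisify } from 'util';

// Node-style callback: error first, result second.
type Callback<T> = (err: Error | null, result?: T) => void;

// Hypothetical stand-in for a callback-style *Async method.
function getGreetingAsync(name: string, callback: Callback<string>): void {
  setImmediate(() => {
    if (name.length === 0) {
      callback(new Error('name must not be empty'));
    } else {
      callback(null, `hello ${name}`);
    }
  });
}

// Promise-returning version, built without any third-party library.
const getGreeting = promisify(getGreetingAsync);

async function main(): Promise<void> {
  // Reads like the synchronous call, plus `await`.
  const greeting = await getGreeting('lucene');
  console.log(greeting); // prints "hello lucene"
}

main();
```

With the promise version, a test is mostly the synchronous test with `await` added, which is the argument made above for dropping the callback variants.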