levi-chinese

0.1.3 • Public • Published

Levi Chinese

Chinese text processing plugins for Levi.

Levi Chinese aims to facilitate Chinese support in Levi full-text search. It is under active development, and I am no expert in Chinese NLP, so any comments or PRs are appreciated.

npm install levi-chinese

Levi Chinese provides two text processing plugins, chinese.converter() and chinese.segmenter(). Mount them after Levi's default plugins.

var levi = require('levi')
var chinese = require('levi-chinese')

var lv = levi('db')
  .use(levi.tokenizer())
  .use(levi.stemmer())
  .use(levi.stopword())
  .use(chinese.converter()) // chinese plugin
  .use(chinese.segmenter()) // chinese plugin

lv.pipeline('Lorem Ipsum is dummy text我是拖拉機學院手扶拖拉機專業的。', function (err, tokens) {
  // tokens
  ['lorem', 'ipsum', 'dummi', 'text',
    '手扶拖拉机', '拖拉机', '学院', '专业' ]
})

chinese.converter()

Converts Traditional Chinese text tokens into Simplified Chinese, based on the dictionary from Tongwen.
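
To make the idea of dictionary-based conversion concrete, here is a minimal sketch. It is not the plugin's actual code: the `TRAD_TO_SIMP` table and the `toSimplified` helper are hypothetical names, and the four-entry table stands in for the far larger Tongwen dictionary.

```javascript
// Hypothetical, tiny Traditional -> Simplified table for illustration only;
// the plugin itself is based on the much larger Tongwen dictionary.
var TRAD_TO_SIMP = {
  '機': '机',
  '學': '学',
  '專': '专',
  '業': '业'
}

// Map each character through the table, leaving unknown characters untouched.
function toSimplified (text) {
  return Array.from(text).map(function (ch) {
    return Object.prototype.hasOwnProperty.call(TRAD_TO_SIMP, ch)
      ? TRAD_TO_SIMP[ch]
      : ch
  }).join('')
}

console.log(toSimplified('拖拉機學院')) // prints 拖拉机学院
```

Characters missing from the table pass through unchanged, which is why Latin text and already-simplified characters are unaffected in this sketch.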

chinese.segmenter()

Chinese word segmentation using nodejieba. It requires native bindings, so it only works on Node.js.
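
For readers new to word segmentation, the sketch below shows what the task involves using forward maximum matching, a much simpler technique than the one nodejieba implements. It is only an illustration under stated assumptions: `DICT`, `MAX_LEN`, and `segment` are hypothetical names, and the four-word dictionary is made up for this example.

```javascript
// Hypothetical mini-dictionary; a real segmenter such as nodejieba ships a
// large dictionary and a statistical model instead.
var DICT = ['手扶拖拉机', '拖拉机', '学院', '专业']
var MAX_LEN = 5 // length of the longest dictionary entry

// Forward maximum matching: at each position take the longest dictionary
// word; fall back to a single character when nothing matches.
function segment (text) {
  var chars = Array.from(text)
  var words = []
  var i = 0
  while (i < chars.length) {
    var matched = chars[i]
    for (var len = Math.min(MAX_LEN, chars.length - i); len > 1; len--) {
      var candidate = chars.slice(i, i + len).join('')
      if (DICT.indexOf(candidate) !== -1) {
        matched = candidate
        break
      }
    }
    words.push(matched)
    i += Array.from(matched).length
  }
  return words
}

console.log(segment('我是拖拉机学院手扶拖拉机专业的'))
// prints [ '我', '是', '拖拉机', '学院', '手扶拖拉机', '专业', '的' ]
```

Unlike the pipeline output shown earlier, this sketch neither drops stopwords such as 我, 是, and 的 nor emits overlapping tokens such as 拖拉机 inside 手扶拖拉机.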

License

MIT

Collaborators

  • cshum