@caijs/lang-def

1.0.2 • Public • Published

@caijs/lang-en

@caijs/lang-en is an NLP library for English written in javascript. It supports normalization, tokenization and stemming of english. It use the snowball implementation of the stemmer: https://snowballstem.org/

Installation

Run this in your project folder:

$ npm install @caijs/lang-en

Tokenize and stem a sentence

You can tokenize and stem a sentence. It will normalize the sentence, tokenize it and return the stems.

const { tokenizeAndStem } = require('@caijs/lang-en');

const stems = tokenizeAndStem('what else is developing your enterprise');
console.log(stems); // ['what', 'els', 'is', 'develop', 'your', 'enterpris']

Normalize a sentence

You can normalize a sentence, it will pass the sentece to lower case and replace special characters with the equivalent characters.

const { normalize } = require('@caijs/lang-en');

const normalized = normalize('What döès youR Compañy develop');
console.log(normalized); // what does your company develop

Tokenize a sentence

It tokenizes a sentence, without normalizing it. Split the sentence into tokens

const { tokenize } = require('@caijs/lang-en');

const tokens = tokenize('If you\'re here, then enter');
console.log(tokens); // ['If', 'you', 'are', 'here', 'then', 'enter']

Stem a word

It stems a word, without normalizing it.

const { stem } = require('@caijs/lang-en');

const stemmed = stem('enterprise');
console.log(stemmed); // enterpris

Readme

Keywords

none

Package Sidebar

Install

npm i @caijs/lang-def

Weekly Downloads

3

Version

1.0.2

License

MIT

Unpacked Size

9.69 kB

Total Files

9

Last publish

Collaborators

  • jesus-seijas
  • jseijas