Light-weight sentence tokenizer for Chinese languages.
published 1.0.1, 3 years ago

Light-weight sentence tokenizer for Korean. Supports both full-width and half-width punctuation marks.
published 1.0.1, 3 years ago

Light-weight tool for normalizing whitespace, splitting lines, and accurately tokenizing words (no regex). Multiple natural languages supported.
published 1.0.3, 2 years ago

Tool for stripping and normalizing punctuation and other non-alphanumeric characters. Supports multiple natural languages. Useful for scraping, machine learning, and data analysis.
published 1.0.2, 2 years ago

Light-weight tool for converting characters in a string into common HTML entities (without regex).
published 1.0.2, 2 years ago

Tool for escaping script tags using backslashes (no regex).
published 1.0.4, 2 years ago
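A minimal sketch of how the Chinese and Korean sentence tokenizers above might work, splitting on full-width (。！？) and half-width (.!?) sentence-ending punctuation without regular expressions. The names SENTENCE_ENDINGS and splitSentences are illustrative, not the packages' actual API.

```javascript
// Sentence-ending punctuation in both full-width and half-width forms.
const SENTENCE_ENDINGS = new Set(['。', '！', '？', '.', '!', '?']);

function splitSentences(text) {
  const sentences = [];
  let current = '';
  for (const ch of text) { // for…of iterates code points, so CJK is safe
    current += ch;
    if (SENTENCE_ENDINGS.has(ch)) {
      const trimmed = current.trim();
      if (trimmed) sentences.push(trimmed);
      current = '';
    }
  }
  const rest = current.trim();
  if (rest) sentences.push(rest); // keep a trailing fragment with no terminator
  return sentences;
}

console.log(splitSentences('안녕하세요. 반갑습니다! 이름이 뭐예요?'));
// [ '안녕하세요.', '반갑습니다!', '이름이 뭐예요?' ]
```

A real tokenizer would also need to handle closing quotes and runs of punctuation like "?!", which this sketch splits after the first terminator.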
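The whitespace-normalizing and punctuation-stripping tools above could be sketched with plain character scans instead of regex, roughly as follows. Both function names and the ASCII-only alphanumeric check are assumptions for illustration; a production version would need Unicode-aware classification for full multi-language support.

```javascript
// Common whitespace characters to collapse or preserve.
const WHITESPACE = new Set([' ', '\t', '\n', '\r', '\f', '\v', '\u00a0']);

// Collapse runs of whitespace to a single space and trim the ends.
function normalizeWhitespace(text) {
  let out = '';
  let inSpace = false;
  for (const ch of text.trim()) {
    if (WHITESPACE.has(ch)) {
      inSpace = true;
    } else {
      if (inSpace && out) out += ' ';
      inSpace = false;
      out += ch;
    }
  }
  return out;
}

// Drop punctuation and other non-alphanumeric characters, keeping whitespace.
function stripPunctuation(text) {
  let out = '';
  for (const ch of text) {
    const keep =
      (ch >= '0' && ch <= '9') ||
      (ch >= 'a' && ch <= 'z') ||
      (ch >= 'A' && ch <= 'Z') ||
      ch.codePointAt(0) > 127 || // crude pass-through for non-ASCII letters
      WHITESPACE.has(ch);
    if (keep) out += ch;
  }
  return out;
}

console.log(normalizeWhitespace('  hello   world \n')); // "hello world"
console.log(stripPunctuation("it's done!"));            // "its done"
```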
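Converting characters to HTML entities without regex, as the entity tool above describes, can be done with a single pass over the string and a lookup table. The entity set and the toHtmlEntities name here are a hedged sketch, not the package's actual API.

```javascript
// The five characters most commonly escaped for safe HTML embedding.
const ENTITIES = new Map([
  ['&', '&amp;'],
  ['<', '&lt;'],
  ['>', '&gt;'],
  ['"', '&quot;'],
  ["'", '&#39;'],
]);

function toHtmlEntities(text) {
  let out = '';
  for (const ch of text) {
    out += ENTITIES.get(ch) ?? ch; // replace if mapped, otherwise keep as-is
  }
  return out;
}

console.log(toHtmlEntities('<a href="x">&</a>'));
// &lt;a href=&quot;x&quot;&gt;&amp;&lt;/a&gt;
```

Note that `&` must map to `&amp;` in the same pass, not a second one, or already-escaped output would be double-escaped.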
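Escaping script tags with backslashes, per the last tool above, usually means rewriting `</script>` as `<\/script>` so a string can sit safely inside an inline script block. A minimal regex-free sketch using indexOf and slice (the function name is illustrative):

```javascript
// Replace every "</script>" (case-insensitively) with "<\/script>",
// preserving the original casing of the "script" part.
function escapeScriptTags(html) {
  const needle = '</script>';
  let out = '';
  let i = 0;
  for (;;) {
    const j = html.toLowerCase().indexOf(needle, i); // case-insensitive search
    if (j === -1) {
      out += html.slice(i);
      return out;
    }
    // Emit text before the match, then "<\/" plus the original "script>".
    out += html.slice(i, j) + '<\\/' + html.slice(j + 2, j + needle.length);
    i = j + needle.length;
  }
}

console.log(escapeScriptTags('var s = "</script>";'));
// var s = "<\/script>";
```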