simple-text-tokenizer
1.0.1 • Public • Published

Simple Text Tokenizer

Tokenize text to paragraphs, sentences, subsentences, and words.

Installation

Use npm:

npm install simple-text-tokenizer

How to

Import functions:

import * as tokenizer from 'simple-text-tokenizer';

Tokenize text into paragraphs:

tokenizer.getParagraphTokens('this is the text of paragraph1\n\n\n this is the text of paragraph2\n');
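
To make the idea of paragraph tokenization concrete, here is a minimal standalone sketch. The helper name `splitParagraphs` is hypothetical and is not part of this package; it assumes paragraphs are separated by at least one blank line, which may differ from the rule the package actually applies.

```typescript
// Hypothetical helper (not the package's implementation).
// Assumption: a paragraph break is a newline, optional whitespace, and another newline.
function splitParagraphs(text: string): string[] {
  return text
    .split(/\n\s*\n/)               // split on blank lines
    .map((p) => p.trim())           // drop leading/trailing whitespace
    .filter((p) => p.length > 0);   // discard empty fragments
}

const paragraphs = splitParagraphs(
  'this is the text of paragraph1\n\n\n this is the text of paragraph2\n'
);
console.log(paragraphs);
// [ 'this is the text of paragraph1', 'this is the text of paragraph2' ]
```

A single newline is treated as part of the same paragraph in this sketch.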

Tokenize a paragraph into sentences:

tokenizer.getSentenceTokens('this is the text of sentence1. And this is sentence2!');
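
As an illustration of what sentence tokenization involves, the following sketch splits on the terminators `.`, `!` and `?`. The name `splitSentences` is hypothetical, and the rule is an assumption rather than the package's documented behavior.

```typescript
// Hypothetical helper (not the package's implementation).
// Assumption: a sentence is a run of non-terminator characters
// followed by any number of '.', '!' or '?' characters.
function splitSentences(paragraph: string): string[] {
  const matches = paragraph.match(/[^.!?]+[.!?]*/g) ?? [];
  return matches
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

const sentences = splitSentences('this is the text of sentence1. And this is sentence2!');
console.log(sentences);
// [ 'this is the text of sentence1.', 'And this is sentence2!' ]
```

A rule this simple also splits at abbreviations such as "e.g." and at decimal numbers, so real-world text usually needs extra handling.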

Tokenize a sentence into subsentences:

tokenizer.getSubSentenceTokens('this is the text of subsentence1, this is subsentence2; and this is the 3rd one!');
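
A subsentence can be thought of as a clause delimited by inner punctuation. The sketch below uses a hypothetical helper, `splitSubSentences`, and assumes commas, semicolons and colons are the delimiters; the package may use a different set.

```typescript
// Hypothetical helper (not the package's implementation).
// Assumption: ',', ';' and ':' separate subsentences.
function splitSubSentences(sentence: string): string[] {
  return sentence
    .split(/[,;:]/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

const parts = splitSubSentences(
  'this is the text of subsentence1, this is subsentence2; and this is the 3rd one!'
);
console.log(parts);
// [ 'this is the text of subsentence1', 'this is subsentence2', 'and this is the 3rd one!' ]
```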

Tokenize a sentence into words:

tokenizer.getSubSentenceTokens('this is the text of subsentence1, this is subsentence2; and this is the 3rd one!');
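
For word-level tokenization, a minimal standalone sketch might look like the following. The helper `splitWords` is hypothetical and is not an export of this package; it assumes a word is a run of ASCII letters, digits or apostrophes, so accented and non-Latin characters would need a broader pattern.

```typescript
// Hypothetical helper (not the package's implementation).
// Assumption: words consist of ASCII letters, digits and apostrophes.
function splitWords(sentence: string): string[] {
  return sentence.match(/[A-Za-z0-9']+/g) ?? [];
}

const words = splitWords('and this is the 3rd one!');
console.log(words);
// [ 'and', 'this', 'is', 'the', '3rd', 'one' ]
```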

Weekly Downloads

2

Version

1.0.1

License

MIT

Unpacked Size

13.5 kB

Total Files

27

Collaborators

  • akosbalasko