simple-text-tokenizer
1.0.1 • Public • Published

Simple Text Tokenizer

Tokenizes text into paragraphs, sentences, subsentences, and words.

Installation

Use npm:

npm install simple-text-tokenizer

How to

Import functions

 
import * as tokenizer from 'simple-text-tokenizer'

Tokenize text into paragraphs:

tokenizer.getParagraphTokens('this is the text of paragraph1\n\n\n this is the text of paragraph2\n');
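
To illustrate what paragraph tokenization can look like, here is a minimal sketch. It is not the package's implementation: `splitParagraphs` is a hypothetical helper, and it assumes paragraphs are separated by at least one blank line. The package itself may use different rules.

```typescript
// Hypothetical stand-in, NOT the package's code.
// Assumption: paragraphs are separated by one or more blank lines.
function splitParagraphs(text: string): string[] {
  return text
    .split(/\n\s*\n/)              // a blank line (possibly containing spaces) ends a paragraph
    .map((p) => p.trim())          // drop leading/trailing whitespace and newlines
    .filter((p) => p.length > 0);  // ignore empty pieces
}

console.log(
  splitParagraphs('this is the text of paragraph1\n\n\n this is the text of paragraph2\n')
);
```

Under this assumption a single newline does not start a new paragraph, so hard-wrapped text stays together.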

Tokenize a paragraph into sentences:

tokenizer.getSentenceTokens('this is the text of sentence1. And this is sentence2!');
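
As a rough picture of sentence tokenization, the sketch below splits on `.`, `!`, and `?`. `splitSentences` is a hypothetical helper written for this example, not the package's function, and the rule is deliberately naive: abbreviations such as "e.g." would be split incorrectly.

```typescript
// Hypothetical stand-in, NOT the package's code.
// Assumption: a sentence ends at a run of '.', '!' or '?'.
function splitSentences(paragraph: string): string[] {
  // Each match is a run of non-terminators followed by its terminators, if any.
  const matches = paragraph.match(/[^.!?]+[.!?]*/g) || [];
  return matches.map((s) => s.trim()).filter((s) => s.length > 0);
}

console.log(
  splitSentences('this is the text of sentence1. And this is sentence2!')
);
```

This sketch keeps the terminating punctuation with each sentence; whether the package does the same is not stated here.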

Tokenize a sentence into subsentences:

tokenizer.getSubSentenceTokens('this is the text of subsentence1, this is subsentence2; and this is the 3rd one!');
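
A subsentence can be thought of as a clause delimited by punctuation inside a sentence. The sketch below assumes commas, semicolons, and colons act as delimiters. `splitSubSentences` is a hypothetical helper, and the package's actual delimiter set may differ.

```typescript
// Hypothetical stand-in, NOT the package's code.
// Assumption: ',', ';' and ':' separate subsentences.
function splitSubSentences(sentence: string): string[] {
  return sentence
    .split(/[,;:]/)                // cut at each assumed delimiter
    .map((s) => s.trim())          // remove the whitespace left around each piece
    .filter((s) => s.length > 0);  // skip empty pieces, e.g. from ",,"
}

console.log(
  splitSubSentences('first clause, second clause; and the 3rd one!')
);
```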

Tokenize a sentence into words:

tokenizer.getWordTokens('this is the text of a sentence, with a few words!'); // function name assumed by analogy with the calls above; check the package's exports
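
Word tokenization, in its simplest form, splits on whitespace and strips surrounding punctuation. The sketch below shows that idea with a hypothetical `splitWords` helper. It is not the package's implementation, and it assumes ASCII letters and digits only.

```typescript
// Hypothetical stand-in, NOT the package's code.
// Assumption: words are whitespace-separated and consist of ASCII letters/digits.
function splitWords(sentence: string): string[] {
  return sentence
    .split(/\s+/)                                                    // cut at runs of whitespace
    .map((w) => w.replace(/^[^A-Za-z0-9]+|[^A-Za-z0-9]+$/g, ''))     // strip leading/trailing punctuation
    .filter((w) => w.length > 0);                                    // drop tokens that were only punctuation
}

console.log(splitWords('this is sentence2; and this is the 3rd one!'));
```

Punctuation inside a token (as in "don't") is left untouched by this sketch; only the edges are trimmed.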

Package details

Version: 1.0.1
License: MIT
Unpacked size: 13.5 kB
Total files: 27
Weekly downloads: 1
Collaborators: akosbalasko