@shelf/text-normalizer
1.1.0 • Public • Published

text-normalizer

Originally taken from openai/whisper and rewritten in TypeScript.

A TypeScript library for normalizing English text. It provides a utility class, EnglishTextNormalizer, that normalizes contractions, abbreviations, spacing, and similar variations. EnglishTextNormalizer is composed of other classes that you can reuse independently:

  • EnglishSpellingNormalizer - maps English words to their American spelling using a dictionary stored in a JSON file named english.json.
  • EnglishNumberNormalizer - converts numbers written out as English words into numerals.
  • BasicTextNormalizer - provides methods for removing special characters and diacritics from text, as well as splitting words into separate letters.
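To make the building blocks above more concrete, here is a minimal, self-contained sketch of the kind of work two of them describe: dictionary-based spelling normalization and basic cleanup of diacritics and special characters. It does not use the package's classes or reproduce their implementation. Every function name and the three dictionary entries are hypothetical, and the shape of english.json is assumed to be a simple British-to-American word map.

```typescript
// Hypothetical sample entries; the real dictionary lives in english.json
// and its exact structure is an assumption here.
const britishToAmericanSketch = new Map<string, string>([
  ['colour', 'color'],
  ['centre', 'center'],
  ['organise', 'organize'],
]);

// Replace each space-separated word with its American spelling when the
// dictionary has an entry; leave every other word unchanged.
function normalizeSpellingSketch(text: string): string {
  return text
    .split(' ')
    .map((word) => britishToAmericanSketch.get(word) ?? word)
    .join(' ');
}

// Lowercase, strip diacritics, and replace special characters with spaces.
function basicNormalizeSketch(text: string): string {
  return text
    .toLowerCase()
    .normalize('NFKD') // decompose "é" into "e" plus a combining accent
    .replace(/[\u0300-\u036f]/g, '') // drop the combining accents
    .replace(/[^a-z0-9\s]/g, ' ') // replace symbols and punctuation with spaces
    .replace(/\s+/g, ' ') // collapse runs of whitespace
    .trim();
}

// Split words into separate letters, dropping the original whitespace.
function splitLettersSketch(text: string): string {
  return text.replace(/\s+/g, '').split('').join(' ');
}

console.log(normalizeSpellingSketch('the colour of the centre')); // the color of the center
console.log(basicNormalizeSketch('Café & Déjà-vu!')); // cafe deja vu
console.log(splitLettersSketch('abc de')); // a b c d e
```

The sketch only handles ASCII letters after diacritics are stripped, which is a simplification chosen for English text; the package's own classes may treat other scripts and symbols differently.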

Install

$ yarn add @shelf/text-normalizer

Usage

import {EnglishTextNormalizer} from '@shelf/text-normalizer';

const normalizer = new EnglishTextNormalizer();

console.log(normalizer.normalize("Let's")); // Output: let us
console.log(normalizer.normalize("he's like")); // Output: he is like
console.log(normalizer.normalize("she's been like")); // Output: she has been like
console.log(normalizer.normalize('10km')); // Output: 10 km
console.log(normalizer.normalize('10mm')); // Output: 10 mm
console.log(normalizer.normalize('RC232')); // Output: rc 232
console.log(
  normalizer.normalize('Mr. Park visited Assoc. Prof. Kim Jr.')
); // Output: mister park visited associate professor kim junior
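As a rough illustration of how outputs like those above can come about, the following sketch applies a small table of regular-expression rules, then separates digits from letters and strips punctuation. It is not the library's implementation: the rule table, the function name expandSketch, and the ordering of the steps are all assumptions made for this example, and it deliberately leaves out ambiguous cases such as "he's" (is or has), which the real normalizer handles.

```typescript
// Hypothetical rule table: each entry is [pattern, replacement].
// Patterns assume the input has already been lowercased.
const sketchRules: Array<[RegExp, string]> = [
  [/\blet's\b/g, 'let us'],
  [/\bmr\b/g, 'mister'],
  [/\bassoc\b/g, 'associate'],
  [/\bprof\b/g, 'professor'],
  [/\bjr\b/g, 'junior'],
];

function expandSketch(text: string): string {
  let result = text.toLowerCase();

  // Expand contractions and abbreviations from the rule table.
  for (const [pattern, replacement] of sketchRules) {
    result = result.replace(pattern, replacement);
  }

  // Insert a space wherever a letter touches a digit, in either order.
  result = result.replace(/([a-z])([0-9])/g, '$1 $2');
  result = result.replace(/([0-9])([a-z])/g, '$1 $2');

  // Replace leftover punctuation with spaces and tidy the whitespace.
  return result
    .replace(/[^a-z0-9\s]/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();
}

console.log(expandSketch("Let's")); // let us
console.log(expandSketch('10km')); // 10 km
console.log(expandSketch('RC232')); // rc 232
console.log(expandSketch('Mr. Park visited Assoc. Prof. Kim Jr.'));
// mister park visited associate professor kim junior
```

Keeping the rules in a data table rather than in branching code makes it easy to add or reorder replacements, which matters because earlier rules can change what later patterns match.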

Publish

$ git checkout master
$ yarn version
$ yarn publish
$ git push origin master --tags

License

MIT © Shelf
