punctuation-restore

0.1.0 • Public • Published

🧑‍🏭 Punctuation Restore

A Node.js package that restores punctuation and casing to unpunctuated text using the punctuation_fullstop_truecase_english ONNX model: https://huggingface.co/1-800-BAD-CODE/punctuation_fullstop_truecase_english

Features

  • Restores punctuation marks (periods, commas, question marks, etc.)
  • Restores capitalization (true-casing)
  • Supports batch processing of multiple texts
  • Uses efficient ONNX runtime for inference
  • Automatically downloads required models

Notes

  • Models are automatically downloaded from Hugging Face on first use and saved locally to the ./models directory for future use.
  • Punctuation restoration is not perfect. It is good enough for most use cases, but review the output where accuracy matters.
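Because the output is not guaranteed to be correct, a cheap sanity check can catch cases where the restored text no longer matches the input word for word. The sketch below is not part of punctuation-restore; `wordSequence` and `sameWords` are hypothetical helpers, and the check assumes the model only changes punctuation and casing.

```javascript
// Hypothetical helpers, not part of punctuation-restore.

// Reduce a text to its lowercase word sequence, ignoring punctuation.
function wordSequence(text) {
  return text
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s']/gu, ' ') // keep letters, digits, whitespace, apostrophes
    .split(/\s+/)
    .filter(Boolean); // drop empty strings left by leading/trailing whitespace
}

// True when `restored` contains the same words, in the same order, as `original`.
function sameWords(original, restored) {
  const a = wordSequence(original);
  const b = wordSequence(restored);
  return a.length === b.length && a.every((word, i) => word === b[i]);
}
```

A result that fails `sameWords` is worth inspecting by hand. A result that passes can still carry misplaced punctuation, so this only guards against dropped or altered words.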

Installation

npm install punctuation-restore

Quick Start

import PunctuationRestorer from 'punctuation-restore';

const restorer = new PunctuationRestorer();

const texts = [
  "this is a string without any punctuation or casing yesterday i went to disneyworld and had a great time",
  "washing your dog once a month is important nothing quite beats a walk on the beach"
];

const results = await restorer.restore(texts);
console.log(results);

API Reference

PunctuationRestorer

The main class for handling punctuation restoration.

Methods

  • async restore(texts: string[]): Promise<string[]>

    • Takes an array of unpunctuated texts
    • Returns an array of punctuated and cased sentences
    • Automatically handles model initialization and cleanup
  • async cleanup()

    • Manually release ONNX session resources
    • Called automatically after restore(), but can be called explicitly if needed
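For large inputs, it may help to feed texts to restore() in smaller chunks and to make sure cleanup() runs even if a call fails. The following is a minimal sketch under stated assumptions: `restoreInBatches` and its `batchSize` parameter are hypothetical and not part of the package, and `restorer` is any object exposing the restore() and cleanup() methods described above. The README does not state whether repeated restore() calls reload the model, so measure before relying on this for throughput.

```javascript
// Hypothetical helper, not part of punctuation-restore.
// `restorer` must expose async restore(texts) and async cleanup().
async function restoreInBatches(restorer, texts, batchSize = 16) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const results = [];
  try {
    for (let i = 0; i < texts.length; i += batchSize) {
      const batch = texts.slice(i, i + batchSize);
      // Keep the outputs in the same order as the inputs.
      results.push(...(await restorer.restore(batch)));
    }
  } finally {
    // Release session resources even when a batch throws.
    await restorer.cleanup();
  }
  return results;
}
```

Usage would look like `const results = await restoreInBatches(new PunctuationRestorer(), texts, 8);`, assuming the class is imported as in the Quick Start.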

Model Architecture

The package uses two model files:

  • model.onnx: Main ONNX model for punctuation and casing prediction
  • tokenizer.model: Tokenizer model for text preprocessing

Models are automatically downloaded from Hugging Face on first use and saved locally to the ./models directory for future use.
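When deploying to an environment without network access, it can be useful to check ahead of time whether the model files are already cached. The sketch below uses only Node.js built-ins. The file names come from the list above, but their placement directly inside ./models is an assumption, and `missingModelFiles` is a hypothetical helper rather than a package API.

```javascript
import { existsSync } from 'node:fs';
import path from 'node:path';

// File names are taken from the list above; their location directly
// inside the models directory is an assumption.
const MODEL_FILES = ['model.onnx', 'tokenizer.model'];

// Return the names of the model files that are not present in `modelsDir`.
function missingModelFiles(modelsDir = './models') {
  return MODEL_FILES.filter((name) => !existsSync(path.join(modelsDir, name)));
}

const missing = missingModelFiles();
console.log(
  missing.length === 0
    ? 'Model files are cached locally.'
    : `Not cached yet: ${missing.join(', ')}`
);
```

If files are reported as missing, running restore() once on a machine with network access should populate the directory, per the note above.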

Example

Check out example/example.js for a complete working example:

import PunctuationRestorer from '../punctuationRestore.js';

const testTexts = [
  "this is a string without any punctuation or casing yesterday i went to disneyworld and had a great time",
  "washing your dog once a month is important nothing quite beats a walk on the beach"
];

try {
    const restorer = new PunctuationRestorer();
    const results = await restorer.restore(testTexts);
    results.forEach(result => console.log(result));
} catch (error) {
    console.error('Test failed:', error);
}

Development

Scripts

  • npm run clean: Clean install dependencies
  • npm run example: Run the example script

Project Structure

punctuation-restore/
├── modules/
│   ├── downloadModel.js    # Model download handling
│   ├── tokenizer.js        # Text tokenization
│   └── postProcessor.js    # Output processing
├── example/
│   └── example.js          # Usage example
└── punctuationRestore.js   # Main package entry

Dependencies

  • onnxruntime-node: ^1.16.3 - ONNX runtime for Node.js
  • sentence-parse: ^1.3.0 - Sentence parsing utilities

License

MIT

Collaborators

  • jparkerweb