node-speech-recognition

Transcribe speech to text on Node.js using OpenAI's Whisper models, converted to the cross-platform ONNX format.

Installation

  1. Add the dependency to your project:
npm i node-speech-recognition

Usage

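// Top-level await below requires an ES module (a .mjs file or "type": "module" in package.json).
import NSR from "node-speech-recognition";
const { default: Whisper } = NSR;

// Load the model once before transcribing.
const whisper = new Whisper();
await whisper.init("base.en");

// Transcribe a WAV file; the result has the shape shown under "Result (JSON)".
const transcribed = await whisper.transcribe("your/audio/path.wav");

console.log(transcribed);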

Result (JSON)

[
  {
    text: " And so my fellow Americans ask not what your country can do for you, ask what you can do for your country.",
    chunks: [
      { timestamp: [0, 8.18], text: " And so my fellow Americans ask not what your country can do for you" },
      { timestamp: [8.18, 11.06], text: " ask what you can do for your country." }
    ]
  }
]
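
The chunk timestamps can be used to print one timed line per chunk. The following is a minimal sketch, assuming the result always has the shape shown above. The `Chunk` and `Transcription` types and the `formatTime` and `toTimedLines` helpers are illustrative names, not part of the package.

```typescript
// Assumed shapes, inferred from the sample result above
// (not taken from the package's own type declarations).
interface Chunk {
  timestamp: [number, number]; // [start, end] in seconds
  text: string;
}

interface Transcription {
  text: string;
  chunks: Chunk[];
}

// Hypothetical helper: format seconds as mm:ss.cc.
function formatTime(seconds: number): string {
  const totalCentis = Math.round(seconds * 100);
  const minutes = Math.floor(totalCentis / 6000);
  const secs = Math.floor((totalCentis % 6000) / 100);
  const centis = totalCentis % 100;
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  return `${pad(minutes)}:${pad(secs)}.${pad(centis)}`;
}

// Hypothetical helper: one "[start - end] text" line per chunk.
function toTimedLines(results: Transcription[]): string[] {
  const lines: string[] = [];
  for (const result of results) {
    for (const chunk of result.chunks) {
      const [start, end] = chunk.timestamp;
      lines.push(`[${formatTime(start)} - ${formatTime(end)}] ${chunk.text.trim()}`);
    }
  }
  return lines;
}

// Sample data copied from the result above.
const sample: Transcription[] = [
  {
    text: " And so my fellow Americans ask not what your country can do for you, ask what you can do for your country.",
    chunks: [
      { timestamp: [0, 8.18], text: " And so my fellow Americans ask not what your country can do for you" },
      { timestamp: [8.18, 11.06], text: " ask what you can do for your country." },
    ],
  },
];

console.log(toTimedLines(sample).join("\n"));
```

The helpers only read `timestamp` and `text`, so they keep working if the real result carries extra fields.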

API

Whisper

The Whisper class has the following methods:

  • init(modelName: string): initializes the given model. You must call it before trying to transcribe any audio.

    • modelName: name of the Whisper model to use. Available models:

          | Model     | Size on disk |
          |-----------|--------------|
          | tiny      | 235 MB       |
          | tiny.en   | 235 MB       |
          | base      | 400 MB       |
          | base.en   | 400 MB       |
          | small     | 1.1 GB       |
          | small.en  | 1.1 GB       |
          | medium    | 1.2 GB       |
          | medium.en | 1.2 GB       |
      
  • transcribe(filePath: string, language?: string): transcribes speech from a WAV file.

    • filePath: path to the WAV file.
    • language (optional): target language for recognition, given as its full English name, e.g. 'spanish'.

  • disposeModel(): disposes of the initialized model.
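
The three methods suggest an init, transcribe, dispose lifecycle. The sketch below wraps that lifecycle so the model is always disposed, even if a transcription fails. It is written against a small `WhisperLike` interface rather than the package itself; the interface, its return types, `isModelName`, and `transcribeFiles` are assumptions made for illustration, so check the package's type declarations before relying on them.

```typescript
// Model names copied from the table above.
const MODEL_NAMES = [
  "tiny", "tiny.en", "base", "base.en",
  "small", "small.en", "medium", "medium.en",
] as const;
type ModelName = (typeof MODEL_NAMES)[number];

// Minimal structural view of the three documented methods.
// Return types are assumptions, not the package's declared types.
interface WhisperLike {
  init(modelName: string): Promise<unknown>;
  transcribe(filePath: string, language?: string): Promise<unknown>;
  disposeModel(): unknown;
}

// Hypothetical guard: is the string one of the documented model names?
function isModelName(name: string): name is ModelName {
  return (MODEL_NAMES as readonly string[]).indexOf(name) !== -1;
}

// Hypothetical helper: load a model, transcribe files in order, always dispose.
async function transcribeFiles(
  whisper: WhisperLike,
  modelName: string,
  filePaths: string[],
  language?: string
): Promise<unknown[]> {
  if (!isModelName(modelName)) {
    throw new Error(`Unknown model "${modelName}"`);
  }
  await whisper.init(modelName);
  try {
    const results: unknown[] = [];
    for (const filePath of filePaths) {
      results.push(await whisper.transcribe(filePath, language));
    }
    return results;
  } finally {
    // Runs whether transcription succeeded or threw.
    await whisper.disposeModel();
  }
}
```

Assuming the real `Whisper` class matches this shape, a call might look like `await transcribeFiles(new Whisper(), "base", ["your/audio/path.wav"], "spanish")`. Passing the instance in, rather than constructing it inside the helper, also makes the helper easy to exercise with a stub.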

Package details

  • Version: 1.0.7
  • License: ISC
  • Unpacked size: 16.7 kB
  • Total files: 6
  • Collaborators: eric-t-a