node-red-contrib-tesseract

Tesseract

Tesseract.js is a pure JavaScript port of the popular Tesseract OCR engine. It performs all OCR tasks locally, without requiring a connection to any external service.

Tesseract was originally developed at Hewlett-Packard Laboratories Bristol and at Hewlett-Packard Co., Greeley, Colorado, between 1985 and 1994, with further changes in 1996 to port it to Windows and some conversion to C++ in 1998. HP open-sourced Tesseract in 2005, and Google took over its development in 2006.

Tesseract flow

This Node-RED implementation of Tesseract.js has been provided by Sjoerd van der Hoorn.

Settings

Input

  • msg.payload - Local filename, URL, or image buffer.
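
As a rough illustration of the three accepted input kinds, the sketch below classifies a value before it is handed to the node. The helper name `describePayload` and the "starts with http(s)://" rule for spotting URLs are assumptions made for this example; they are not part of the package.

```javascript
// Hypothetical helper (not part of node-red-contrib-tesseract):
// report which of the documented input kinds a payload looks like.
function describePayload(payload) {
    // Image data already loaded in memory.
    if (Buffer.isBuffer(payload)) {
        return "buffer";
    }
    if (typeof payload === "string") {
        // Assumption: strings starting with http:// or https:// are URLs,
        // every other string is treated as a local filename.
        return /^https?:\/\//i.test(payload) ? "url" : "filename";
    }
    // Anything else is outside the documented input types.
    return "unsupported";
}
```

In a Node-RED function node placed before the Tesseract node, a check like this could be used to drop or reroute messages whose `msg.payload` is not one of the documented types.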

Output

  • msg.payload - String with recognized text.
  • msg.tesseract - Object with the recognized text broken down per line and per word, each with its own confidence value. It has the following shape:
{
    text: "Text from image\nSecond line",
    confidence: 87,
    lines: 
    [
        {
            text: "Text from image",
            confidence: 93,
            words:
            [
                {
                    text: "Text",
                    confidence: 97
                },
                {
                    ...
                }
            ]
        },
        {
            ...
        }
    ]
}
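
The per-word confidence values make it possible to flag doubtful results downstream. The following is a minimal sketch that assumes the object shape shown above; the function name `lowConfidenceWords` and the threshold of 80 are illustrative choices, and all sample values other than those in the shape above are made up.

```javascript
// Hypothetical helper: collect the text of every word whose confidence
// falls below a threshold, walking the lines -> words structure.
function lowConfidenceWords(tesseract, threshold) {
    const flagged = [];
    for (const line of tesseract.lines || []) {
        for (const word of line.words || []) {
            if (word.confidence < threshold) {
                flagged.push(word.text);
            }
        }
    }
    return flagged;
}

// Made-up sample in the documented shape.
const sample = {
    text: "Text from image",
    confidence: 87,
    lines: [
        {
            text: "Text from image",
            confidence: 93,
            words: [
                { text: "Text", confidence: 97 },
                { text: "from", confidence: 62 },
                { text: "image", confidence: 91 }
            ]
        }
    ]
};

console.log(lowConfidenceWords(sample, 80)); // [ 'from' ]
```

Inside a function node after the Tesseract node, the same loop could be run on `msg.tesseract` to decide whether a result needs manual review.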

Additional information

Install

npm i node-red-contrib-tesseract

Version

1.1.4

License

ISC

Collaborators

  • sjoerdvanderhoorn