Tesseract
Tesseract.js is a pure JavaScript port of the popular Tesseract OCR engine. It performs all OCR tasks locally, without requiring a connection to any external service.
Tesseract was originally developed at Hewlett-Packard Laboratories Bristol and at Hewlett-Packard Co, Greeley, Colorado, between 1985 and 1994, with further changes made in 1996 to port it to Windows and some conversion to C++ in 1998. HP open sourced Tesseract in 2005. Since 2006 it has been developed by Google.
This Node-RED implementation of Tesseract.js has been provided by Sjoerd van der Hoorn.
Settings
- Language: code of the language to recognize (see the list of available language codes).
Input
msg.payload
- Local filename, URL, or image buffer.
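As a rough illustration, the three accepted input forms could be wrapped in a message like this before it reaches the node. The helper name `makeOcrMessage`, the file path, and the URL are hypothetical, and the buffer holds placeholder bytes rather than a real image.

```javascript
// Hypothetical helper: wraps an OCR input in a Node-RED style message object.
// Accepts a string (local filename or URL) or a Buffer with image data.
function makeOcrMessage(input) {
  const ok = typeof input === "string" || Buffer.isBuffer(input);
  if (!ok) {
    throw new TypeError("payload must be a filename, URL, or Buffer");
  }
  return { payload: input };
}

// Placeholder values, for illustration only.
const fromFile = makeOcrMessage("/home/pi/scan.png");
const fromUrl = makeOcrMessage("https://example.com/scan.png");
const fromBuffer = makeOcrMessage(Buffer.from([0x89, 0x50, 0x4e, 0x47]));
```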
Output
msg.payload
- String with recognized text.
msg.tesseract
- Object with recognized text split out per line and word, plus confidence information.
text: "Text from image\nSecond line"
confidence: 87
lines:
  text: "Text from image"
  confidence: 93
  words:
    text: "Text"
    confidence: 97
    ...
  ...
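The per-word confidence values make it possible to flag uncertain results downstream. The following is a minimal sketch, assuming that `lines` and `words` are arrays of objects with the `text` and `confidence` fields shown above. The function name `lowConfidenceWords`, the threshold of 80, and the second sample word are invented for illustration.

```javascript
// Collects every word whose confidence is below the given threshold.
// Assumption: tesseract.lines is an array, and each line has a words array.
function lowConfidenceWords(tesseract, threshold) {
  const result = [];
  for (const line of tesseract.lines || []) {
    for (const word of line.words || []) {
      if (word.confidence < threshold) {
        result.push({ text: word.text, confidence: word.confidence });
      }
    }
  }
  return result;
}

// Sample data modeled on the msg.tesseract example; "from" is an invented entry.
const sample = {
  text: "Text from image\nSecond line",
  confidence: 87,
  lines: [
    {
      text: "Text from image",
      confidence: 93,
      words: [
        { text: "Text", confidence: 97 },
        { text: "from", confidence: 62 },
      ],
    },
  ],
};

const flagged = lowConfidenceWords(sample, 80);
console.log(flagged);
```

In a function node placed after the OCR node, the same helper could be applied with something like `msg.payload = lowConfidenceWords(msg.tesseract, 80); return msg;`.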