aws-lambda-tesseract-french
1.0.0 • Public • Published

aws-lambda-tesseract

Tesseract 5.1 (with French training data), compiled to fit inside AWS Lambda

Forked from https://github.com/shelfio/aws-lambda-tesseract. All credit goes to shelf.io; I only compiled Tesseract 5.1 with French language data, changed the parameters passed to the CLI, and published the result.

Inspired by chrome-aws-lambda & lambda-scanner-ocr

Install

$ yarn add aws-lambda-tesseract-french

Works with the Node.js 16.x runtime and is compiled with Tesseract 5.1.0. For now, it supports x86_64 CPUs only.

How does it work?

This package contains an archive with Tesseract 5.1 compiled for use in the AWS Lambda environment.

When a Lambda starts, the package unpacks the archive containing the binary into the /tmp folder. It does this only once per Lambda cold start, so warm invocations reuse the already unpacked binary.
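
The unpack-once behavior can be sketched with a module-level cache. This is a minimal illustration under stated assumptions, not the package's actual code: `unpackArchive`, `getBinaryPath`, and the `unpack-once-demo` directory are hypothetical names, and the stand-in writes a placeholder file instead of extracting a real archive.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Counts how many times the (simulated) extraction actually runs.
let unpackCount = 0;

// Hypothetical stand-in for the real extraction step: it only creates
// a placeholder file where an unpacked binary would live.
async function unpackArchive(targetDir) {
  unpackCount += 1;
  await fs.promises.mkdir(targetDir, {recursive: true});
  const binaryPath = path.join(targetDir, 'placeholder-binary');
  await fs.promises.writeFile(binaryPath, '#!/bin/sh\n');
  return binaryPath;
}

// Module scope lives as long as the Lambda container, so this cache
// is filled on the first call and reused by later invocations.
let unpackPromise = null;

function getBinaryPath(targetDir = path.join(os.tmpdir(), 'unpack-once-demo')) {
  if (!unpackPromise) {
    // Cache the promise itself so concurrent callers share one unpack.
    unpackPromise = unpackArchive(targetDir);
  }
  return unpackPromise;
}
```

Caching the promise rather than the resolved path means two calls made before the first extraction finishes still trigger only one unpack.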

Usage

const {getTextFromImage, isSupportedFile} = require('aws-lambda-tesseract-french');

module.exports.handler = async event => {
  // Assumes photo.jpg already exists in the /tmp directory.
  // Note: the original file is deleted after processing.

  if (!isSupportedFile('/tmp/photo.jpg')) {
    return false;
  }

  return getTextFromImage('/tmp/photo.jpg');
};

isSupportedFile checks that the file has an image-like extension and that the extension is not on the list of file extensions Tesseract does not support.
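
A check of this kind can be sketched as follows. This is an illustration only: `looksSupported` is a hypothetical helper, and both extension lists are assumptions made for the example, not the lists the package actually uses.

```javascript
const path = require('path');

// Assumed set of "image-like" extensions, for illustration only.
const IMAGE_EXTENSIONS = new Set(['jpg', 'jpeg', 'png', 'tif', 'tiff', 'bmp', 'gif', 'svg']);

// Assumed deny-list of image extensions treated as unsupported, for illustration only.
const UNSUPPORTED_EXTENSIONS = new Set(['gif', 'svg']);

// Returns true when the extension is image-like and not on the deny-list.
function looksSupported(filePath) {
  const ext = path.extname(filePath).slice(1).toLowerCase();
  return IMAGE_EXTENSIONS.has(ext) && !UNSUPPORTED_EXTENSIONS.has(ext);
}
```

Note that a check like this looks only at the file name, so it cannot detect a file whose contents do not match its extension.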

Compile It Yourself

See compile-tesseract.sh

Run the test.sh script as a smoke test to confirm that the build works.

Publish

$ git checkout master
$ yarn version
$ yarn publish
$ git push origin master --tags

License

MIT © Shelf

Collaborators

  • kimchicharlie