fork-pdf-parse-with-pagepertext

1.1.2 • Public • Published

Pure JavaScript cross-platform module to extract text from PDFs.

Info

This is a fork of https://gitlab.com/autokent/pdf-parse. All credit goes to mehmet.kozan. I forked the package to add the per-page text output shown in "Basic Usage". If you don't need the text of every page (for example, for indexing), use the original package instead. I'm publishing this fork only so that I can host it on one of my Elastic Beanstalk instances without giving access to the private repo, which lets it autoscale easily. Thanks so much for the package; I hope this helps. I will also open a PR on the original project.

Installation

npm install fork-pdf-parse-with-pagepertext

Basic Usage - Local Files

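const fs = require('fs');
const pdf = require('fork-pdf-parse-with-pagepertext');

// Read the whole PDF into a Buffer
const dataBuffer = fs.readFileSync('path to PDF file...');

pdf(dataBuffer).then(function(data) {
	// PDF text per page (array of {page: number, text: string})
	console.log(data.textPerPage);
}).catch(function(err) {
	// Log read or parse failures instead of leaving the rejection unhandled
	console.error(err);
});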
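
Since the per-page text is meant for indexing, here is a minimal sketch of how the array could be searched. It assumes only the shape described in the usage comment above (an array of {page, text} objects). The helper name findPagesContaining and the sample data are hypothetical and not part of the package.

```javascript
// Hypothetical helper (not part of the package): returns the page numbers
// whose text contains `term`, ignoring case.
// Assumes `textPerPage` is an array of { page: number, text: string }.
function findPagesContaining(textPerPage, term) {
  const needle = term.toLowerCase();
  return textPerPage
    .filter((entry) => entry.text.toLowerCase().includes(needle))
    .map((entry) => entry.page);
}

// Made-up sample data standing in for `data.textPerPage`.
const samplePages = [
  { page: 1, text: 'Introduction to invoices' },
  { page: 2, text: 'Terms and conditions' },
  { page: 3, text: 'Invoice totals' },
];

console.log(findPagesContaining(samplePages, 'invoice')); // [ 1, 3 ]
```

In real use, you would pass data.textPerPage from the promise callback in place of samplePages.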

License

MIT licensed. All of its dependencies are MIT or BSD licensed.

Collaborators

  • gope153