fork-pdf-parse-with-pagepertext

Pure JavaScript cross-platform module to extract text from PDFs.

Info

This is a fork of https://gitlab.com/autokent/pdf-parse. All credit goes to mehmet.kozan. I forked the package to add the feature shown in "Basic Usage": the text of every page, which is useful for indexing. If you don't need per-page text, use the original package instead. I'm publishing this fork only so that it can be installed on one of my Elastic Beanstalk instances without granting access to the private repo, which makes autoscaling easy. Thanks so much for the original package; I hope this helps. I will also open a PR on the original project.

Installation

npm install pdf-parse

Basic Usage - Local Files

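const fs = require('fs');
const pdf = require('pdf-parse');

// Read the PDF into a Buffer
const dataBuffer = fs.readFileSync('path to PDF file...');

pdf(dataBuffer).then(function(data) {
	// PDF text per page (array of {page: number, text: string})
	console.log(data.textPerPage);
}).catch(function(err) {
	// Report read or parse failures instead of leaving the rejection unhandled
	console.error(err);
});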
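
Since the per-page text is meant for indexing, here is a minimal sketch of how the array might be searched. It assumes only the shape described in the comment above (`{page: number, text: string}`). The helper `findPages` and the `samplePages` data are hypothetical and not part of this package.

```javascript
// Hypothetical helper (not part of the package): return the page numbers
// whose text contains the search term, ignoring case.
// Assumes textPerPage is an array of { page: number, text: string }.
function findPages(textPerPage, term) {
  const needle = term.toLowerCase();
  return textPerPage
    .filter((entry) => entry.text.toLowerCase().includes(needle))
    .map((entry) => entry.page);
}

// Stand-in data with the same shape as data.textPerPage
const samplePages = [
  { page: 1, text: 'Introduction to invoices' },
  { page: 2, text: 'Payment terms' },
  { page: 3, text: 'Invoice totals' },
];

console.log(findPages(samplePages, 'invoice')); // [ 1, 3 ]
```

In real use, `data.textPerPage` from the `pdf(dataBuffer)` promise would be passed in place of `samplePages`.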

License

MIT licensed, and all of its dependencies are MIT or BSD licensed.
