Popplonode

Return metadata & text extraction of a PDF file

Why Popplonode

Popplonode is an addons node.js which means it use pure c++ code, it's clearly faster than PDFJS & it's faster than a spawn of Poppler pdfinfo too! because we only use specific c++ class of it.

Requirements

This version is working for Node.js v10, v8 & v6 (LTS) on Linux & OSX, a working for windows is in progress..

Install

npm install popplonode

INFO If you want to use it with a particular version of node(eg: 8.5) you will need:

sudo apt-get install cmake g++
brew install cmake

Usage

const Popplonode = require('popplonode');
 
const poppl = new Popplonode();
 
// We load the PDF file into poppl
poppl.load('path/to/my/file.pdf'); 
 
// We can access the metadata of the PDF file
const metadata = poppl.getMetadata(); // 
 
poppl.getTextFromPage(0, (error, content) => {
  // do something with the content page
});

API

load(string)

arguments:

string path to your pdf file

getMetadata()

returns:

object returns an object that contains all of the pdf's metadata

// example of an metadata object return
{ 
  CreationDate: 'D:20100304130800+01\'00\'',
  Author: 'manshanden',
  Creator: 'PScript5.dll Version 5.2',
  Producer: 'Acrobat Distiller 7.0.5 (Windows)',
  ModDate: 'D:20100304130837+01\'00\'',
  Title: 'Microsoft Word - Test document Word.doc',
  TotalNbPages: 1,
  PDFFormatVersion: '1.4'
}

getTextFromPage(number, function)

arguments :

number page number (first page start at zero)
function callback who return page text

Windows

If anyone could help us to build poppler on windows we could then build it for node.js :D

popplonode

Popplonode

Why Popplonode

Requirements

Install

Usage

API

load(string)

getMetadata()

getTextFromPage(number, function)

Windows

Readme

Keywords

Package Sidebar

Install

Repository

Homepage

Weekly Downloads

Version

License

Unpacked Size

Total Files

Last publish

Collaborators

popplonode

Popplonode

Why Popplonode

Requirements

Install

Usage

API

load(string)

getMetadata()

getTextFromPage(number, function)

Windows

Readme

Keywords

Package Sidebar

Install

Repository

Homepage

DownloadsWeekly Downloads

Version

License

Unpacked Size

Total Files

Last publish

Collaborators

Weekly Downloads