
    pdf2table

    0.0.2 • Public • Published


    pdf2table is a Node.js library that attempts to extract tables from a PDF.

    The tables are returned as an array of rows.

    It uses pdf2json to read the PDF data.

    Install

    You can install pdf2table using the Node Package Manager (npm):

    npm install pdf2table
    

    Simple example

     
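    var pdf2table = require('pdf2table');
    var fs = require('fs');

    // Read the PDF into a buffer, then hand the buffer to pdf2table.
    fs.readFile('./test.pdf', function (err, buffer) {
        if (err) return console.error(err);

        // rows is the array of extracted rows; rowsdebug is not used here.
        pdf2table.parse(buffer, function (err, rows, rowsdebug) {
            if (err) return console.error(err);

            console.log(rows);
        });
    });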
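
Once the rows are available, a common next step is to write them out as CSV. The following is a minimal sketch, assuming each row is an array of cell values; `rowsToCsv` and the sample data are hypothetical and not part of pdf2table.

```javascript
// Hypothetical helper, not part of pdf2table.
// Assumption: `rows` is an array of rows, and each row is an array of cell values.
function rowsToCsv(rows) {
    return rows.map(function (row) {
        return row.map(function (cell) {
            var text = String(cell);
            // Quote cells containing a comma, quote, or line break; double embedded quotes.
            if (/[",\r\n]/.test(text)) {
                return '"' + text.replace(/"/g, '""') + '"';
            }
            return text;
        }).join(',');
    }).join('\n');
}

// Sample data standing in for the rows passed to the parse callback.
var sampleRows = [
    ['Name', 'Amount'],
    ['Widget, large', '12.50'],
    ['Gadget "X"', '3']
];

console.log(rowsToCsv(sampleRows));
```

In the example above, the same function could be called inside the `pdf2table.parse` callback in place of `console.log(rows)`.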
     

    Note

    This is a simplistic implementation. If your PDF contains content that is not part of a table, pdf2table will still attempt to shape that content into rows. Improvements and pull requests are welcome.
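
One way to cope with non-table content is to filter the rows after parsing. The sketch below assumes that each row is an array of cells and that genuine table rows share the most common column count; `keepTableRows` and the sample data are hypothetical and not part of pdf2table, and the heuristic will not suit every document.

```javascript
// Hypothetical helper, not part of pdf2table.
// Assumption: each row is an array of cells, and real table rows share one column count.
function keepTableRows(rows) {
    // Count how many rows exist for each row length.
    var counts = {};
    rows.forEach(function (row) {
        counts[row.length] = (counts[row.length] || 0) + 1;
    });

    // Pick the most common row length as the table width.
    var width = 0;
    var best = 0;
    Object.keys(counts).forEach(function (key) {
        if (counts[key] > best) {
            best = counts[key];
            width = Number(key);
        }
    });

    // Keep only the rows that match that width.
    return rows.filter(function (row) {
        return row.length === width;
    });
}

// Sample data: a title line and a footer mixed in with three table rows.
var mixedRows = [
    ['Quarterly report'],
    ['Name', 'Amount'],
    ['Widget', '12.50'],
    ['Gadget', '3'],
    ['Page 1']
];

console.log(keepTableRows(mixedRows));
```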
