Simple utility for scraping data from HTML tables on a given website into a list of JavaScript objects.


npm install --save table-scraper



Returns a promise that resolves to a list of the tables found on the input website. Each HTML table row is converted to a JavaScript object keyed by column heading.

For example, suppose the website being scraped consisted of the following:

    <tr><th>State</th><th>Capitol City</th><th>Pop.</th></tr>
    <tr><td>Minnesota</td><td>Saint Paul</td><td>3</td></tr>
    <tr><td>New York</td><td>Albany</td><td>Eight Million</td></tr>

The following code would result in the array displayed below:

    var scraper = require('table-scraper');
    scraper
      .get(url) // url is the address of the page to scrape
      .then(function(tableData) {
        /* tableData === [ [
             { State: 'Minnesota', 'Capitol City': 'Saint Paul', 'Pop.': '3' },
             { State: 'New York', 'Capitol City': 'Albany', 'Pop.': 'Eight Million' }
           ] ] */
      });
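
To make the row-to-object conversion concrete, here is a minimal sketch of how headings and cells could be paired up. The helper `rowsToObjects` and its arguments are hypothetical names used only for illustration; this is not the library's actual implementation, and it works on plain arrays rather than on parsed HTML.

```javascript
// Hypothetical helper, not part of table-scraper: pairs each cell with its heading.
function rowsToObjects(headings, rows) {
  return rows.map(function (cells) {
    var record = {};
    cells.forEach(function (cell, i) {
      // Fall back to the column index when the column has no heading.
      var key = i < headings.length ? headings[i] : String(i);
      record[key] = cell;
    });
    return record;
  });
}

// Heading and cell text taken from the example table above.
var headings = ['State', 'Capitol City', 'Pop.'];
var rows = [
  ['Minnesota', 'Saint Paul', '3'],
  ['New York', 'Albany', 'Eight Million']
];

var table = rowsToObjects(headings, rows);
// table[0]['Capitol City'] === 'Saint Paul'
```

Note that every value stays a string, so a column such as `Pop.` would need to be parsed by the caller if numbers are wanted.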

Important to note: the tableData returned is a list of lists. So, if the page contained three tables, the response would be structured like this:

  [
    [ /* list of data from the first table */ ],
    [ /* list of data from the second table */ ],
    [ /* list of data from the third table */ ]
  ]
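
Because the result is nested, each table has to be picked out by its position before its rows can be read. The sketch below works on a hand-written sample shaped like the resolved value; the second table and its contents are made up for illustration.

```javascript
// Hypothetical sample shaped like the value resolved for a page with two tables.
var tableData = [
  [
    { State: 'Minnesota', 'Capitol City': 'Saint Paul', 'Pop.': '3' },
    { State: 'New York', 'Capitol City': 'Albany', 'Pop.': 'Eight Million' }
  ],
  [
    { '0': 'Minnesota', '1': 'MN' }
  ]
];

// Pick one table by its position on the page.
var firstTable = tableData[0];

// Read a single column out of that table.
var states = firstTable.map(function (row) { return row.State; });
// states is ['Minnesota', 'New York']

// Count the rows in each table.
var rowCounts = tableData.map(function (table) { return table.length; });
// rowCounts is [2, 1]

// Flatten every table into one list of rows when table boundaries do not matter.
var allRows = tableData.reduce(function (acc, table) {
  return acc.concat(table);
}, []);
// allRows.length is 3
```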

If a table has NO headings (no <th> elements), the object keys are simply the column index:

  {'0': <first column data of first row>, '1': <second column data of first row>, .... }
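
Index-keyed rows can be awkward to work with, so one option is to turn them into plain arrays or to attach your own headings afterwards. The helpers `rowToArray` and `applyHeadings` below are hypothetical and not part of table-scraper; the sample row is invented for the sketch.

```javascript
// Hypothetical row as produced for a table without <th> elements.
var row = { '0': 'Minnesota', '1': 'Saint Paul', '2': '3' };

// Convert an index-keyed row into an array ordered by column index.
function rowToArray(row) {
  return Object.keys(row)
    .sort(function (a, b) { return Number(a) - Number(b); })
    .map(function (key) { return row[key]; });
}

// Re-key a row using headings supplied by the caller.
function applyHeadings(row, headings) {
  var named = {};
  rowToArray(row).forEach(function (cell, i) {
    // Keep the column index as the key when no heading was supplied for it.
    named[i < headings.length ? headings[i] : String(i)] = cell;
  });
  return named;
}

var cells = rowToArray(row);
// cells is ['Minnesota', 'Saint Paul', '3']

var named = applyHeadings(row, ['State', 'Capitol City', 'Pop.']);
// named.State === 'Minnesota'
```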

Feedback/PRs welcome! Please include tests around any new functionality, and make sure existing tests pass:

npm test

The following node libraries make this utility super easy: