Simple web scraper chassis, scrapes fields from a list of web pages and dumps the results to JSON/CSV files.


A very simple web scraper chassis. Give it:


  • a page url
  • a selector to scrape
  • fields to pull out from within that selection
  • an output folder
  • a number of seconds to delay between requests

and it creates CSV and JSON files with the results. Claw creates a separate file for each page it scrapes.
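To make the output side concrete, here is a minimal sketch (not claw's actual source) of how one page's scraped rows could be turned into the JSON and CSV text that gets saved. The `toOutputs` function and its `rows` shape are hypothetical, assuming each row is an object keyed by the field names:

```javascript
// Hypothetical sketch: serialize one page's results to JSON and CSV strings.
// `rows` is an array of objects keyed by the field names, e.g. { text, href }.
function toOutputs(rows) {
    // JSON output: pretty-printed array of row objects
    var json = JSON.stringify(rows, null, 2);

    // CSV output: header row from the field names, then one line per row
    var headers = Object.keys(rows[0] || {});
    var escape = function (v) {
        // quote every value and double any embedded quotes
        return '"' + String(v).replace(/"/g, '""') + '"';
    };
    var lines = [headers.join(',')].concat(
        rows.map(function (row) {
            return headers.map(function (h) { return escape(row[h]); }).join(',');
        })
    );
    return { json: json, csv: lines.join('\n') };
}
```

Each page's `rows` would then be written to its own pair of files in the output folder.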

// libraries
var claw = require('claw');

// settings
var page = '';          // url of the page to scrape
var selector = 'h3 a';  // elements to select on the page
var fields = {          // field name -> method to call on each matched element
    "text" : "text()",
    "href" : "attr('href')"
};

// scrape the page, write CSV/JSON to the 'output' folder, delay 3 seconds between requests
claw(page, selector, fields, 'output', 3);

Give it an array of pages, and it will save the results of each page to a separate file.

claw(['', ''], selector, fields, 'output', 3);

Claw can also grab its page list from a JSON file containing a list of urls (or objects with .href properties).

claw("pages.json", selector, fields, 'output', 3);
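For example, a pages.json of bare urls might look like this (the example.com urls are placeholders):

```json
[
    "http://example.com/page1",
    "http://example.com/page2"
]
```

The object form would instead be entries like `{ "href": "http://example.com/page1" }`.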

Questions? Ideas? Hit me up on Twitter: @dylanized