eval-spider -- Programmable spidering of web sites with node.js


From npm:

  npm install eval-spider

(How to use the) API

Creating a Spider

  // Keep the constructor and the instance under separate names.
  var Spider = require('eval-spider');
  var spider = new Spider(options);


The options object can have the following fields:

  • maxPages - Integer. Maximum number of pages to crawl. Default: 10
  • requestThrottle - Integer. Number of requests made at a time. Default: 5
  • url - String. Website URL. Default: ''
  • fileName - String. Output file name. Default: 'output.csv'
  • connect - Map. Aerospike DB connection details. Default: { host: 'localhost', port: 3000, namespace: 'test', set: 'webcrawler', metadata: {} }
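
As a minimal sketch, the documented defaults can be written out as a plain object and merged with caller overrides. `DEFAULT_OPTIONS` and `buildOptions` are hypothetical names used for illustration and are not part of eval-spider; the package presumably applies its own defaults when a field is omitted. The URL and host values are placeholders.

```javascript
// Defaults as listed in the options list above (hypothetical constant name).
var DEFAULT_OPTIONS = {
  maxPages: 10,
  requestThrottle: 5,
  url: '',
  fileName: 'output.csv',
  connect: {
    host: 'localhost',
    port: 3000,
    namespace: 'test',
    set: 'webcrawler',
    metadata: {}
  }
};

// Hypothetical helper: merge caller overrides over the documented defaults.
// The nested `connect` object is merged separately so that a partial
// override (for example only `host`) keeps the remaining defaults.
function buildOptions(overrides) {
  overrides = overrides || {};
  var options = Object.assign({}, DEFAULT_OPTIONS, overrides);
  options.connect = Object.assign({}, DEFAULT_OPTIONS.connect, overrides.connect || {});
  return options;
}

var options = buildOptions({
  url: 'http://example.com',     // placeholder site
  maxPages: 25,
  connect: { host: '10.0.0.5' }  // placeholder Aerospike host
});
```

The resulting object could then be passed as the `options` argument of the constructor shown earlier.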

Queuing a URL for the spider to fetch

spider.crawler() - Returns a Promise.

Response: the value delivered when the promise resolves.

  response : Array, // Result set
  crawledUrls : Map, // Crawled URLs
  count : Integer // Number of crawled URLs
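
A minimal sketch of consuming the resolved value, assuming it is an object with the three fields listed above. `summarizeCrawl` and `fakeSpider` are hypothetical: the stand-in spider only mimics the documented response shape so the snippet runs without the package or network access, and its sample rows are invented placeholders.

```javascript
// Hypothetical helper: run the crawl and reduce the documented response
// fields to a short summary. `spider` is any object whose crawler()
// returns a Promise resolving to { response, crawledUrls, count }.
function summarizeCrawl(spider) {
  return spider.crawler().then(function (result) {
    return {
      rows: result.response.length, // size of the result set
      pages: result.count           // number of crawled URLs
    };
  });
}

// Stand-in for a real spider instance (assumption: same response shape;
// crawledUrls is modelled here as a plain object, which may differ).
var fakeSpider = {
  crawler: function () {
    return Promise.resolve({
      response: [{ url: 'http://example.com/' }, { url: 'http://example.com/about' }],
      crawledUrls: { 'http://example.com/': true, 'http://example.com/about': true },
      count: 2
    });
  }
};

summarizeCrawl(fakeSpider)
  .then(function (summary) {
    console.log('rows:', summary.rows, 'pages:', summary.pages);
  })
  .catch(function (err) {
    // A rejected promise (for example a network failure) ends up here.
    console.error('crawl failed:', err);
  });
```

With a real spider instance, the same helper would be called as `summarizeCrawl(spider)`.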

Writing the response to CSV

spider.writeToFile(name, data) - Writes data to a CSV file. name (String, optional): result file name. data (Array): the data to write.

Writing the response to Aerospike

spider.aerospike(data) - Writes data to Aerospike. data (Array): the data to write.
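
The two write methods can be chained after a crawl, as in the sketch below. It assumes both methods accept the `response` array from the crawl result; whether they return a plain value or a Promise is not documented here, so the calls are sequenced through `Promise.resolve` and `.then`, which handle both cases. `persistResults` and the recording stub are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: crawl, then write the result set to CSV and Aerospike.
// Promise.resolve() and .then() accept plain values and Promises alike, so
// the sketch does not depend on whether the write methods are synchronous.
function persistResults(spider, fileName) {
  return spider.crawler().then(function (result) {
    return Promise.resolve(spider.writeToFile(fileName, result.response))
      .then(function () {
        return spider.aerospike(result.response);
      })
      .then(function () {
        return result.count; // number of crawled URLs
      });
  });
}

// Recording stub standing in for a real spider (assumption: same method names).
var calls = [];
var stubSpider = {
  crawler: function () {
    return Promise.resolve({
      response: [{ url: 'http://example.com/' }], // placeholder row
      crawledUrls: {},
      count: 1
    });
  },
  writeToFile: function (name, data) { calls.push(['writeToFile', name, data.length]); },
  aerospike: function (data) { calls.push(['aerospike', data.length]); }
};

var done = persistResults(stubSpider, 'output.csv');
done.then(function (count) {
  console.log('persisted', count, 'crawled URL(s)');
});
```

Writing to the CSV file first and Aerospike second is an arbitrary ordering for the sketch; the two writes are independent of each other.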