scraper.js - crawl the web.
scraper.js is a tool for crawling web pages. It provides several strategies for scraping a page.
Quick start guide
Install scraper.js:
npm install --save scraper.js
Install a queue manager:
npm install --save scraper.js-queue-bull
Write a scraper:
// save as reddit.js
module.exports = {
  start: {
    url: 'https://www.reddit.com/',
    method: 'home'
  },
  name: 'Reddit'
  // ...
};
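The start entry names a method ('home'), which suggests that the scraper module also defines a handler under that name. The sketch below shows one way such a lookup could work. It is not scraper.js's actual API: the handler signature (page HTML in, extracted data out) and the runStart helper are assumptions made for illustration only.

```javascript
// Hypothetical sketch, NOT the real scraper.js API.
const definition = {
  name: 'Reddit',
  start: { url: 'https://www.reddit.com/', method: 'home' },
  // Assumed handler shape: receives the page HTML, returns extracted data.
  home(html) {
    const match = /<title>([^<]*)<\/title>/i.exec(html);
    return { title: match ? match[1].trim() : null };
  },
};

// Hypothetical helper: find the handler named by start.method and run it.
function runStart(def, html) {
  const handler = def[def.start.method];
  if (typeof handler !== 'function') {
    throw new Error(`No handler named "${def.start.method}" in ${def.name}`);
  }
  return handler(html);
}
```

Keeping the method name as a string in start means the definition stays plain data, so a queue could in principle serialize it; whether scraper.js relies on this is not stated here.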
Write a data manager:
// save as data.js
module.exports = function (data) {
  console.log(data);
};
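A data manager that only logs is handy for a first run. As a minimal sketch of something more durable, the variant below appends each record to a file as one JSON line. It assumes the data manager is a function called once per scraped record with a JSON-serializable value. The createDataManager factory and the results.jsonl path are hypothetical names, not part of scraper.js.

```javascript
const fs = require('fs');

// Hypothetical factory: returns a data manager that appends each record
// to filePath as a single JSON line.
function createDataManager(filePath) {
  return function handleData(data) {
    // Assumes `data` is JSON-serializable.
    fs.appendFileSync(filePath, JSON.stringify(data) + '\n');
  };
}

// Export in the same shape as data.js above (a single function).
module.exports = createDataManager('./results.jsonl');
```

The synchronous append keeps the sketch short; with higher --concurrency values an asynchronous write stream may be a better fit.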
Run it!
scraperjs --concurrency 2 --queue scraper.js-queue-bull --data ./data.js ./reddit.js
Todo
- Clean up readme
- Documentation
- Tests