scrape-cache

Scrape web pages, storing them locally to avoid repeated network requests.

Uses cheerio for scraping.

Installation

git clone https://github.com/kevinschaul/scrape-cache
cd scrape-cache
npm install

Usage

scrape-cache exposes one method: scrape(url, scraper, callback). Its parameters:

  • url String

    The URL to scrape.

  • scraper($) Function

    A function that scrapes the HTML and returns data that will be passed to callback.

    The parameter $ is a cheerio jQuery-like object with the HTML already loaded.

  • callback(result) Function

    A function that is called with result, the data returned by scraper.
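
Since scraper can return any value, it may be convenient to return an object rather than a single string. The sketch below is an illustration only: the name scrapeSummary and the chosen selectors are assumptions, not part of scrape-cache. It relies only on the cheerio selection calls text() and length.

```javascript
// Hypothetical scraper: gathers several values from the page into one object.
// `$` is the cheerio object that scrape() passes to the scraper.
function scrapeSummary($) {
    return {
        // Text of the <title> element, with surrounding whitespace removed.
        title: $('title').text().trim(),
        // Combined text of all <h1> elements on the page.
        heading: $('h1').text().trim(),
        // Number of <a> elements matched by the selector.
        linkCount: $('a').length
    };
}
```

Passed as the second argument to scrape, a function like this would hand the whole object to callback as result.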

Full usage example

To scrape the contents of an H1:

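var scrapeCache = require('scrape-cache');

var url = 'https://github.com/';

// Receives the cheerio object for the page and returns the H1 text.
var scrapeH1 = function($) {
    return $('h1').text();
};

// Pass the scraper defined above; its return value arrives as result.
scrapeCache.scrape(url, scrapeH1, function(result) {
    console.log(result);
});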
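
This README does not describe how pages are stored locally. As a rough illustration of the general idea only, and not scrape-cache's actual implementation, the sketch below keeps each page body in a file named after a hash of its URL and touches the network only on a cache miss. The names cachedGet, cacheDir, and fetchPage are all hypothetical; fetchPage stands in for whatever HTTP client is used.

```javascript
var crypto = require('crypto');
var fs = require('fs');
var path = require('path');

// Hypothetical helper, not part of scrape-cache.
// Calls callback(err, body) with the page body for `url`. `fetchPage(url, cb)`
// is only invoked when no readable cached copy exists in `cacheDir`.
function cachedGet(url, cacheDir, fetchPage, callback) {
    // Hash the URL so it is safe to use as a file name.
    var key = crypto.createHash('sha1').update(url).digest('hex');
    var file = path.join(cacheDir, key + '.html');

    fs.readFile(file, 'utf8', function(readErr, cached) {
        if (!readErr) {
            // Cache hit: no network request is made.
            return callback(null, cached);
        }
        // Cache miss (or unreadable file): fetch, then store for next time.
        fetchPage(url, function(fetchErr, body) {
            if (fetchErr) {
                return callback(fetchErr);
            }
            fs.mkdir(cacheDir, { recursive: true }, function(mkdirErr) {
                if (mkdirErr) {
                    return callback(mkdirErr);
                }
                fs.writeFile(file, body, 'utf8', function(writeErr) {
                    callback(writeErr || null, body);
                });
            });
        });
    });
}
```

A sketch like this never expires entries, so a cached page stays in use until its file is deleted.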

Install

npm i scrape-cache

Version

0.0.1

License

MIT

Unpacked Size

3.32 kB

Total Files

4

Collaborators

  • kevin.schaul