Scrappy

Extract rich metadata from URLs.

Installation

npm install scrappy --save

Usage

Scrappy attempts to parse and extract rich structured metadata from URLs.

import { scraper, urlScraper } from "scrappy";
import * as plugins from "scrappy/dist/plugins";

Scraper

scraper accepts a request function and a list of plugins to use. The request function is expected to return a "page" object with the same shape as the input to scrape(page).

// `request` is a function you supply: given a URL, it returns a "page" object.
const scrape = scraper({
  request,
  plugins: [plugins.htmlmetaparser, plugins.exifdata],
});

const res = await fetch("http://example.com"); // `fetch` comes from an HTTP client, e.g. `popsicle`.

await scrape({
  url: res.url,
  status: res.status,
  headers: res.headers.asObject(),
  body: res.stream(), // Stream the response body instead of buffering it, to support large responses.
});
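
The page object passed to scrape above has four fields: url, status, headers, and body. As a minimal sketch of how such an object might be assembled from a generic response, the helper below flattens header pairs into a plain object. The names `Page`, `ResponseLike`, `headersToObject`, and `toPage` are hypothetical and not part of scrappy, and the exact header shape scrappy expects is an assumption inferred from the example.

```typescript
// Hypothetical page shape, inferred from the example above (not scrappy's own type).
interface Page<Body> {
  url: string;
  status: number;
  headers: Record<string, string | string[]>;
  body: Body; // A stream in practice; left generic in this sketch.
}

// Minimal structural view of a response (an assumption for illustration).
interface ResponseLike<Body> {
  url: string;
  status: number;
  headers: ReadonlyArray<[string, string]>;
  body: Body;
}

// Collapse [name, value] pairs into a plain object with lowercased names.
// Repeated headers are collected into an array.
function headersToObject(
  headers: ReadonlyArray<[string, string]>
): Record<string, string | string[]> {
  const out: Record<string, string | string[]> = {};
  for (let i = 0; i < headers.length; i++) {
    const key = headers[i][0].toLowerCase();
    const value = headers[i][1];
    const existing = out[key];
    if (existing === undefined) {
      out[key] = value;
    } else if (Array.isArray(existing)) {
      existing.push(value);
    } else {
      out[key] = [existing, value];
    }
  }
  return out;
}

// Build a page object from a response without touching (or buffering) the body.
function toPage<Body>(res: ResponseLike<Body>): Page<Body> {
  return {
    url: res.url,
    status: res.status,
    headers: headersToObject(res.headers),
    body: res.body,
  };
}
```

Passing the body through untouched keeps the streaming behaviour noted above: nothing in the helper reads or buffers it.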

URL Scraper

A simpler wrapper around scraper that automatically calls request(url) to fetch the page.

const scrape = urlScraper({ request });

await scrape("http://example.com");
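
Conceptually, a wrapper of this kind chains the two steps: request the page for a URL, then hand that page to a page-level scraper. The sketch below shows that composition in isolation. `composeUrlScraper`, `RequestFn`, and `ScrapeFn` are hypothetical names for illustration, and this is not a description of how urlScraper is actually implemented.

```typescript
// Hypothetical types for illustration; not scrappy's actual signatures.
type RequestFn<P> = (url: string) => Promise<P>;
type ScrapeFn<P, R> = (page: P) => Promise<R>;

// Compose a request function and a page scraper into a URL-based scraper:
// fetch the page for the URL first, then scrape the resulting page object.
function composeUrlScraper<P, R>(
  request: RequestFn<P>,
  scrape: ScrapeFn<P, R>
): (url: string) => Promise<R> {
  return (url) => request(url).then((page) => scrape(page));
}
```

With this shape, the caller only ever deals with URLs, while the request function stays swappable (for example, to add caching or custom headers).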

License

Apache 2.0
