simple-html-scraper
1.0.2 • Public • Published

Simple Html Scraper

Simple Html Scraper is a small Puppeteer-based scraper that fetches the HTML content and the image URLs of a web page.

Installation

Use the package manager npm to install Simple Html Scraper.

npm i simple-html-scraper

Usage

import { Scraper } from 'simple-html-scraper';

// Options are optional; their shape is listed at the end of this example.
const scraper = new Scraper(/* { options } */);

(async () => {
  // Placeholder address: replace it with the page to scrape.
  const result = await scraper.get('https://example.com');
  console.log(result.content); // HTML of the page
  console.log(result.images); // Array of image URLs
})();

/* Options
{
  scroll?: boolean; // enable scrolling
  maxScroll?: number | 'MAX'; // scroll iterations
  scrollWait?: number; // time to wait after each scroll
  resources?: string[]; // resources to accept during the page load
  puppeteer?: LaunchOptions; // options sent to Puppeteer
}
*/
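
The usage example above returns image URLs as plain strings. Here is a minimal sketch of one way to post-process them, assuming some entries may be relative or duplicated (the README does not say whether they are). The ScrapeResult interface mirrors only the two fields shown above. absoluteImageUrls is a hypothetical helper and is not part of simple-html-scraper. The sample object stands in for a real result so the sketch runs without launching a browser.

```typescript
// Shape of the value returned by scraper.get(), mirrored from the usage
// example above (only the two documented fields; an assumption, not the
// package's own type declaration).
interface ScrapeResult {
  content: string;
  images: string[];
}

// Hypothetical helper, not part of simple-html-scraper: resolves every image
// URL against the page URL, then drops duplicates and unparsable entries.
function absoluteImageUrls(result: ScrapeResult, pageUrl: string): string[] {
  const seen = new Set<string>();
  for (const src of result.images) {
    try {
      seen.add(new URL(src, pageUrl).href);
    } catch {
      // Skip values that cannot be parsed as a URL.
    }
  }
  return Array.from(seen);
}

// Stand-in for a real result, so the sketch runs without a browser.
const sample: ScrapeResult = {
  content: '<html></html>',
  images: ['/logo.png', 'https://example.com/logo.png', 'img/a.jpg'],
};

const urls = absoluteImageUrls(sample, 'https://example.com/blog/post');
console.log(urls);
```

With a real scraper, the value returned by scraper.get(pageUrl) would take the place of sample, and the same pageUrl would be passed as the second argument.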

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT

Package details

Version: 1.0.2
License: MIT
Unpacked size: 10.7 kB
Total files: 8
Collaborators: aminekamal