epic-crawler

1.0.21 • Public • Published

Epic Crawler

A simple crawler for scraping important data from web pages.

Installation

$ npm i epic-crawler --save

Usage

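// Assumes epicCrawler has already been imported from the "epic-crawler" package.
const crawler = new epicCrawler();

crawler.init("https://google.com", { depth: 5 })
    // Return the crawl promise so a failure in either step reaches catch().
    .then(() => crawler.crawl())
    .then((data) => {
        console.log(data);
    })
    .catch((error) => {
        console.error(error);
    });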

Options

Three options are currently supported.

  • depth - number, 1 to 5 (default 1) | Crawling depth, i.e. how many levels of links to follow.
  • strict - boolean (default true) | Set to false to also collect links that point to other websites.
  • cache - boolean (default true) | Speeds up crawling by saving data in a cache.
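
The defaults and the depth range listed above can be pictured with a small sketch. The names CrawlOptions and withDefaults are hypothetical and are not part of epic-crawler, and clamping an out-of-range depth (rather than rejecting it) is an assumption, since the package's actual behavior is not documented here.

```typescript
// Hypothetical shape of the options object, based on the list above.
interface CrawlOptions {
  depth?: number;   // 1 to 5, default 1
  strict?: boolean; // default true: do not collect links to other websites
  cache?: boolean;  // default true: save crawled data in a cache
}

// Fills in the documented defaults and clamps depth to the documented range.
// Illustration only; this is not the package's own code.
function withDefaults(options: CrawlOptions = {}): Required<CrawlOptions> {
  const depth = Math.min(5, Math.max(1, Math.trunc(options.depth ?? 1)));
  return {
    depth,
    strict: options.strict ?? true,
    cache: options.cache ?? true,
  };
}

console.log(withDefaults({ depth: 9, strict: false }));
// → { depth: 5, strict: false, cache: true }
```

Passing only the options you want to change, and letting the rest fall back to their defaults, keeps calls to init short.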

Methods

  • init: (url: string, { depth, strict, cache }?: options) => Promise - Initializes the crawler for the given URL.
  • blackList: (fingerPrintList: (string | RegExp)[]) => this - Blacklists links that match any of the given strings or regular expressions.
  • clearCache: () => this - Clears the cache from previous crawls.
  • crawl: () => Promise - Starts crawling.
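
To illustrate what a fingerPrintList of type (string | RegExp)[] could do, here is a minimal sketch of link filtering. The helper isBlackListed is hypothetical and not part of epic-crawler. It assumes that strings match as substrings and that regular expressions are applied with test(); the package may match differently.

```typescript
type FingerPrint = string | RegExp;

// Returns true when the URL matches at least one fingerprint.
// Assumption: strings match as substrings, RegExps via test().
function isBlackListed(url: string, fingerPrints: FingerPrint[]): boolean {
  return fingerPrints.some((fp) => {
    if (typeof fp === "string") {
      return url.includes(fp);
    }
    fp.lastIndex = 0; // keep global/sticky regexes stateless between calls
    return fp.test(url);
  });
}

// Example links on a placeholder domain.
const links = [
  "https://example.com/about",
  "https://example.com/login",
  "https://example.com/files/report.pdf",
];

const kept = links.filter((link) => !isBlackListed(link, ["/login", /\.pdf$/i]));
console.log(kept); // only the /about link remains
```

Accepting both strings and regular expressions lets simple cases stay readable ("/login") while still allowing pattern-based rules such as file extensions.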

Package Details

  • Version: 1.0.21
  • License: MIT
  • Unpacked size: 32.1 kB
  • Total files: 9
  • Weekly downloads: 6
  • Collaborators: selfsofts