Search results
162 packages found
A set of shared utilities that can be used by crawlers
A library to recursively retrieve and serialize Notion pages with customization for machine learning applications.
Dependency-free module for scraping and crawling websites using the [Crawlbase](https://crawlbase.com) API
- scraping
- crawling
- scraper
- scrape
- crawler
- crawlbase
- scraping-websites
- scraping-framework
- crawlbase-api
- leads
- leads-api
Web crawler for Node.js
Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.
- automation
- bot
- bot-detection
- crawler
- crawling
- chromedriver
- webdriver
- headless
- headless-chrome
- stealth
- captcha
- scraping
- web-scraping
- cloudflare
Crawler is a web spider written in Node.js. It gives you the full power of jQuery on the server to parse a large number of pages asynchronously as they are downloaded.
Node SDK for Hyperbrowser API
Paginator extends the ability to paginate over pages in Goose Parser
A web crawler for Node.js.
Crawler (spider) for a website's pages by domain name
Distributed web crawler powered by Headless Chrome
Crawlyx is an open-source command-line interface (CLI) based web crawler built using Node.js. It is designed to crawl websites and extract useful information like links, images, and text. It is lightweight, fast, and easy to use.
- web crawler
- web scraping
- data extraction
- SEO analysis
- command-line tool
- Node.js
- HTML reporting
- cross-platform
- configurable options
- plugin system
- open-source
- crawling
- crawler
- scraper
Priority-based semantic web crawler.
- semantic-crawler
- pdf-crawler
- text-crawler
- priority
- priority-crawler
- scraper
- crawling
- spider
- scraping
- simplecrawler
- crawler
- osmosis
- js-crawler
- supercrawler
A `URL` parser for crawling purposes.
- crawler-url-parser
- url-parser
- extract-url
- url-parse
- is-parent-url
- is-child-url
- url
- parser
- parse
- crawler
- extract
- extractor
- absolute
- relative
A truly transparent HTTP proxy server. Route your requests upstream wherever you want!
- proxy
- tunnel
- ssl
- http-proxy
- mitm
- pinning
- proxy-authentication
- transparent
- upstream
- server
- squid
- privoxy
- tcp
- intercept
A damn simple tool to extract JSON-LD metadata from a webpage using a jQuery-like API (jQuery, Cheerio, CashDOM, ...).
Collects torrents from various sources (dumps, RSS, HTML pages) and associates the video files they contain with IMDb IDs.
JS client for WecrawlerAPI
Sample website text content over time.
Simple, polite crawling of the web.