scraptor
!!This library is a work in progress. The API will most likely change.!!
This library is my attempt to wrap puppeteer and cheerio into a library that
lets me easily construct web scrapers. A DSL implements common patterns while
still allowing you to break out into the underlying libraries if necessary.
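To illustrate the idea of a DSL made of composable steps with an escape hatch, here is a minimal sketch. It is not scraptor's implementation: `pipeline`, `visit`, `readContent`, `upperCase`, and `fakePage` are hypothetical names, and a plain object stands in for the browser page so the example runs without puppeteer or cheerio.

```javascript
// Hypothetical sketch: none of these names are part of scraptor's API.
// A step is an async function that receives a context and returns a new one.
const pipeline = (...steps) => async (ctx) => {
  let current = ctx;
  for (const step of steps) {
    current = await step(current);
  }
  return current;
};

// A stand-in "page" so the example runs without a browser (assumption).
const fakePage = { url: null, content: "<h1>Hello</h1>" };

// Step factories: each call returns a step that can be composed with others.
const visit = (url) => async (ctx) => ({ ...ctx, page: { ...ctx.page, url } });
const readContent = () => async (ctx) => ({ ...ctx, result: ctx.page.content });

// Breaking out of the DSL: any ad-hoc async function is also a valid step.
const upperCase = async (ctx) => ({ ...ctx, result: ctx.result.toUpperCase() });

const scrape = pipeline(visit("https://example.com"), readContent(), upperCase);
```

Calling `await scrape({ page: fakePage })` threads the context through each step in order. In a real scraper the steps would presumably delegate to puppeteer for navigation and to cheerio for HTML selection.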
Synopsis
;;
const spinnerDone = "document.querySelector('.spinner').classList.contains('hide')";
const waitForSpinner = ;
const search = ;
; // Prints full HTML
API
usingBrowser: Execute a scrape in a browser session.
usingHeadlessBrowser: Execute a scrape in a headless browser session.
browse: Visit a URL and load the page.
html: Select the inner HTML of a DOM node.
fillForm: Input a string into a form field.
click: Click on a DOM node.
once: Continue the browser session once a predicate is fulfilled.
onceLoaded: Continue the browser session once the page has loaded.
onceMs: Continue the browser session once a set time has passed.
doUntil: Run an action once a predicate is fulfilled.
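The waiting functions in this list (a fixed delay, or continuing once a predicate holds) can be pictured as small polling helpers. The sketch below is an assumption about the general technique, not scraptor's code: `sleep` and `waitFor`, along with the `intervalMs` and `timeoutMs` options, are hypothetical names chosen for illustration.

```javascript
// Hypothetical helpers; scraptor's own implementation may differ.

// Resolve after `ms` milliseconds (the idea behind a fixed delay).
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll `predicate` until it returns a truthy value or the timeout elapses.
const waitFor = async (predicate, { intervalMs = 50, timeoutMs = 2000 } = {}) => {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    // The predicate may be sync or async; `await` handles both.
    if (await predicate()) return;
    if (Date.now() >= deadline) {
      throw new Error(`Predicate not fulfilled within ${timeoutMs} ms`);
    }
    await sleep(intervalMs);
  }
};
```

With puppeteer, the predicate could be something like `() => page.evaluate(spinnerDone)`, since `page.evaluate` accepts a string expression to run in the page. Puppeteer's own `page.waitForFunction` offers comparable in-page polling.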