web crawler, parser and scraper with storage capabilities
published 0.3.8 6 years ago

Responsible for handling toolbar integration. Opens the admin front-end in a new tab.
published 0.3.0 4 years ago

Contains common typescript definitions and utilities used throughout the lerna monorepo.
published 0.4.1 4 years ago

Extract Resources scenario is used for extracting various resources from the corresponding sites.
published 0.4.1 4 years ago

Handles the crawling and scraping logic.
published 0.4.1 4 years ago

React based frontend communicating with the backend (extension background script) via chrome.runtime.sendMessage. Bundled as a single page app with the help of react-router-dom.
published 0.4.1 4 years ago

Bundles the entire monorepo sub-packages into a valid extension folder.
published 0.4.1 4 years ago

Extract Html Headings scenario is used for extracting H1, H2, H3, H4, H5, H6 text content.
published 0.1.0 5 years ago

Extract html headings (H1 - H6) content.
published 0.2.0 4 years ago

Extract Html Content scenario is used for extracting html nodes text based on dom selectors.
published 0.1.2-rc.2 5 years ago

-
published 0.4.1 4 years ago

Extract article content using Mozilla Readability library.
published 0.2.0 4 years ago

extracts text and binary content from dynamic (javascript) pages based on CSS selectors
published 0.4.1 4 years ago

extracts text and binary content from static html pages based on CSS selectors
published 0.4.1 4 years ago

scraping test definitions, launches resources to be scraped under a configurable web server
published 0.8.0 3 years ago

Plugin based node.js web scraper. It scrapes, stores and exports data. Supports multiple storage options: SQLite, MySQL, PostgreSQL. Supports multiple browser or dom-like clients: Puppeteer, Playwright, Cheerio, Jsdom.
published 0.11.0 2 years ago