Search results

16 packages found

the readability script ported to a sax parser

published version 1.6.1, 12 years ago, 6 dependents, licensed under BSD-like
191,466

A web page content extractor

published version 3.2.0, 7 years ago, 62 dependents
26,579

Built for Node.js, this package empowers users to effortlessly convert PDF files into images of exceptional quality, supporting multiple formats including PNG, JPG, GIF, and others. Its streamlined functionality ensures a smooth and reliable conversion pr

published version 1.0.7-alpha, 9 months ago, 1 dependent, licensed under ISC
6,495

Mozilla Readability in Rust

published version 0.1.3, 3 years ago, 0 dependents, licensed under MIT
449

An automatic web page content extractor.

published version 1.0.0, 5 years ago, 0 dependents, licensed under ISC
125

A web page content extractor

published version 1.1.2, 8 years ago, 0 dependents
49

A web page content extractor

published version 1.1.0-forkv4, 9 years ago, 0 dependents
26

A web page content extractor

published version 3.3.4, 5 years ago, 0 dependents
23

Domsi is a powerful web scraping library that allows you to query HTML elements based on DOM hierarchy, element attributes, and CSS styles. Works across *all* automated browsers, so long as they allow execution of arbitrary JavaScript. That includes non

published version 1.0.2, 2 years ago, 0 dependents, licensed under MIT
17

A web page content extractor

published version 3.3.1, 5 years ago, 0 dependents
15

A web page content extractor based on https://github.com/ageitgey/node-unfluff, but ready for browserify

published version 1.3.2, 8 years ago, 1 dependent
14

Fork of node-red-contrib-unfluff. Handles redirects and user-agent for scrape.

published version 1.0.2, 3 years ago, 0 dependents, licensed under ISC
14

published version 1.0.0, 4 months ago, 0 dependents, licensed under ISC
12

A web page content extractor

published version 3.2.3, 4 years ago, 0 dependents
12

A web page content extractor

published version 3.2.2, 6 years ago, 0 dependents
12

A library for web scraping, web content extraction, and Google Custom Search.

published version 1.0.0, 5 months ago, 0 dependents, licensed under ISC
10