readability-cli

    Firefox Reader View in your terminal!

    readability-cli takes any HTML page and strips out unnecessary bloat by using Mozilla's Readability library. As a result, you get a web page which contains only the core content and nothing more. The resulting HTML is suitable for terminal browsers, text readers, and other uses.

    Here is a before-and-after comparison, using an article from The Guardian as a test subject.

    Standard view in W3M

    An article from The Guardian in W3M

    So much useless stuff that the main article does not even fit on the screen!

    readability-cli + W3M

    An article from The Guardian in W3M using readability-cli

    Ah, much better.

    Installation

    readability-cli can be installed on any system with Node.js:

    npm install -g readability-cli

    Arch Linux

    Arch Linux users may use the readability-cli AUR package instead.

    Usage

    readable [SOURCE] [options]

    readable [options] -- [SOURCE]

    where SOURCE is a file, an http(s) URL, or '-' for standard input

    See readable --help for more information.
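The second usage form can be handy in scripts where file names are not known in advance. The following is a minimal sketch, not part of readability-cli: `clean_files` is a hypothetical helper, and it assumes that `--` has its conventional end-of-options meaning, so that a file name starting with a dash is treated as SOURCE rather than as an option.

```shell
# Hypothetical helper (not shipped with readability-cli):
# run `readable` on each given file and write the cleaned HTML
# to OUT_DIR under the same file name.
# Usage: clean_files OUT_DIR FILE...
clean_files() {
  out_dir=$1
  shift
  mkdir -p -- "$out_dir" || return 1
  status=0
  for f in "$@"; do
    name=$(basename -- "$f")
    # "--" follows the `readable [options] -- [SOURCE]` form, so a
    # name such as "-draft.html" is assumed to be read as a file.
    if ! readable -- "$f" > "$out_dir/$name"; then
      echo "readable failed for: $f" >&2
      rm -f -- "$out_dir/$name"   # do not keep an empty output file
      status=1
    fi
  done
  return "$status"
}
```

For example, `clean_files cleaned index.html -draft.html` would be expected to leave two cleaned pages in the `cleaned` directory.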

    Examples

    Read HTML from a file and output the result to the console:

    readable index.html

    Fetch a random Wikipedia article, get its title and an excerpt:

    readable https://en.wikipedia.org/wiki/Special:Random -p title,excerpt

    Fetch a web page and read it in W3M:

    readable https://www.nytimes.com/2020/01/18/technology/clearview-privacy-facial-recognition.html | w3m -T text/html

    Download a web page using cURL, parse it and output as JSON:

    curl https://github.com/mozilla/readability | readable --base=https://github.com/mozilla/readability --json

    It's a good idea to supply the --base parameter when piping input. Otherwise, readable won't know the document's URL, and things like relative links won't work.
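To avoid typing the URL twice, the cURL example can be wrapped in a small shell function. This is only a sketch: `readable_url` is a hypothetical name, and the curl flags (`-f`, `-s`, `-S`, `-L`) are my choice rather than anything readability-cli requires. Note that if the server redirects, the URL passed to `--base` may differ from the final one.

```shell
# Hypothetical helper (not shipped with readability-cli):
# download a page with cURL and pass the same URL to --base,
# so that readable can resolve relative links.
# Usage: readable_url URL
readable_url() {
  url=$1
  if [ -z "$url" ]; then
    echo "usage: readable_url URL" >&2
    return 2
  fi
  # -f: fail on HTTP errors, -sS: quiet but still report errors,
  # -L: follow redirects
  curl -fsSL "$url" | readable --base="$url" --json
}
```

Calling `readable_url https://github.com/mozilla/readability` would then be equivalent to the cURL example above, apart from the extra curl flags.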

    Localization

    See locales.

    Why Node.js? It's so slow!

    I know that it's slow, but JavaScript is the most sensible option here, since Mozilla's Readability library is written in JavaScript. The Readability algorithm has been ported to other languages, but Mozilla's version is the only one that's actively maintained as of 2020.


    Weekly Downloads

    49

    Version

    2.3.0

    License

    GPL-3.0-only

    Unpacked Size

    76 kB

    Total Files

    8
