endl

Link extractor, downloader, executer, unzipper

endl (Link Extractor and Downloader)

A Node.js module for extracting links from web pages and downloading them.

endl offers both a simple and an advanced API for extracting links and for downloading, executing, and unzipping files.

Every version below 1.0 is beta: expect bugs, and features may change.

This is written in CoffeeScript.

endl = require 'endl'
 
endl.page('http://lame.buanzo.org/')
  .find('a[href^="http://lame.buanzo.org/Lame_"]')
  .download(pageUrlAsReferrer: true, filenameMode: { urlBasename: true })

More examples here

  1. We require our endl module.
  2. endl.page() loads the page we want. (It takes two arguments; the second is an optional options object.)
  3. find() finds the elements we want. (It works just like jQuery and querySelectorAll.)
  4. download() saves the file to the current directory, using the basename of the download link as the file name and the page URL as the Referer header.
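
What step 4 does can be pictured with a small standalone sketch in plain Node.js. This is not endl's implementation: `planDownload` is a hypothetical helper, and the link URL is made up for illustration. It only shows how a file name can be taken from the basename of a link and how the page URL can be passed as the Referer header.

```javascript
// Sketch only: mirrors what the two download options appear to mean.
// planDownload is a hypothetical helper, not part of endl.
const path = require('path');

function planDownload(pageUrl, linkUrl) {
  // Resolve the link against the page so relative hrefs also work.
  const resolved = new URL(linkUrl, pageUrl);
  // urlBasename: last path segment of the link, percent-decoded.
  const filename = decodeURIComponent(path.posix.basename(resolved.pathname));
  // pageUrlAsReferrer: send the page URL in the Referer request header.
  const headers = { Referer: pageUrl };
  return { url: resolved.href, filename, headers };
}

const plan = planDownload(
  'http://lame.buanzo.org/',
  'http://lame.buanzo.org/Lame_v3.99.3_for_Windows.exe' // hypothetical link
);
console.log(plan.filename);        // Lame_v3.99.3_for_Windows.exe
console.log(plan.headers.Referer); // http://lame.buanzo.org/
```

The returned `url` and `headers` could then be handed to any HTTP client, and `filename` used as the output path in the current directory.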

Things to note:

  • find() actually matches 4 elements here, but download() automatically selects the first one (index 0). Use index() to select a different element from the array.
  • Calling download() directly after find() is a shortcut. The long way is: find(...).href().download(...)

See CHANGELOG.md

Prerequisites: Tools for building Node.js native modules (Visual Studio or Visual Studio Express)

Like Handel the composer, but without the handel :)

  • findXpath doesn't work. Blame web pages (for incorrect structure), xmldom and xpath modules.
  • Unify all downloading, extraction and execution options across the submodules (endl.coffee, file.coffee, parser.coffee). These 3 submodules currently have different default options for each task.
  • Add more tests.
  • Turn every blocking function into an async one. (I'm using deasync in some places.)

See endl-cli

Go to API page