endl

Link extractor, downloader, executor, unzipper

endl (Link Extractor and Downloader)

A program for extracting links from web pages and downloading them.

endl offers a very simple API, as well as a more advanced one, for link extraction, file downloading, executing and unzipping.

Every version below 1.0 is a beta. This means it has bugs, and features can change.

  • Changed endl.load() to endl.page()
  • Changed endl.parse() to endl.load()
  • Changed download option fileDirectory to directory

Prerequisites: tools for building Node.js native modules

endl has a command line shortcut!

Like Handel the composer, but without the handel :)

This example is written in CoffeeScript.

endl = require 'endl'
 
endl.page('http://lame.buanzo.org/')
  .find('a[href^="http://lame.buanzo.org/Lame_"]')
  .download(pageUrlAsReferrer: true, filenameMode: { urlBasename: true })

More examples here

  1. We require the endl module (Node style).
  2. endl.page() loads the page we want. It takes two arguments; the second is an optional options object.
  3. find() finds the elements we want. It works just like jQuery and querySelectorAll.
  4. download() saves the file to the current directory, using the basename of the download link as the file name and the page URL as the Referer header.
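
Step 4 can be pictured with a minimal sketch. It assumes that the urlBasename mode takes the last path segment of the download URL and that pageUrlAsReferrer sends the page URL in the Referer header. It is plain Node.js JavaScript rather than CoffeeScript so that it runs without a compile step. fileNameFromUrl, refererHeaders and the example.com URLs are hypothetical and are not part of endl.

```javascript
// Hypothetical helpers, NOT endl's implementation: they only illustrate
// the two download options used in the example above.
const path = require('path');

// Derive a file name from the basename of a download URL.
function fileNameFromUrl(downloadUrl) {
  // The WHATWG URL parser separates the path from the query string and hash.
  const { pathname } = new URL(downloadUrl);
  // URL paths always use forward slashes, so use the POSIX flavour of basename.
  return decodeURIComponent(path.posix.basename(pathname));
}

// Build request headers that name the page the link was found on.
function refererHeaders(pageUrl) {
  return { Referer: pageUrl };
}

// Made-up URLs, for illustration only.
const pageUrl = 'http://example.com/download.html';
const linkUrl = 'http://example.com/files/tool-1.0.zip?mirror=1';

console.log(fileNameFromUrl(linkUrl)); // tool-1.0.zip
console.log(refererHeaders(pageUrl));
```

Taking the name from the URL path rather than from the whole URL keeps query strings such as ?mirror=1 out of the file name.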

Things to note:

  • find() actually returns 4 elements here, but download() automatically selects the first one (index 0). Use index() to select a different element of the array.
  • Calling download() right after find() is a shortcut. The long way is: find(...).href().download(...)

Known issues:

  • findXpath doesn't work. Blame web pages (for incorrect structure) and the xmldom and xpath modules.

To do:

  • Unify all downloading, extraction and execution options across submodules (endl.coffee, file.coffee, parser.coffee). These 3 submodules have different default options for each task.
  • Add more tests.
  • Turn every blocking function into an async one (deasync is used in some places).
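
Returning to the first note under "Things to note": the default selection of element 0, and the role of index(), can be modelled with a small sketch. The MatchedLinks class and the file names are hypothetical. The class is a simplified model written in plain Node.js JavaScript, not endl's source, and its href() returns a plain string instead of a chainable object.

```javascript
// Hypothetical model, NOT endl's source: a list of matched links with a
// current position that starts at 0.
class MatchedLinks {
  constructor(hrefs) {
    this.hrefs = hrefs;
    this.current = 0; // a download step would use this element by default
  }

  // Select another element; returns this so that calls can be chained.
  index(i) {
    if (!Number.isInteger(i) || i < 0 || i >= this.hrefs.length) {
      throw new RangeError(`index ${i} is outside 0..${this.hrefs.length - 1}`);
    }
    this.current = i;
    return this;
  }

  // Return the link that a later download step would fetch.
  href() {
    return this.hrefs[this.current];
  }
}

// Made-up values standing in for the 4 matched elements.
const links = new MatchedLinks(['a.zip', 'b.zip', 'c.zip', 'd.zip']);
console.log(links.href());          // a.zip
console.log(links.index(2).href()); // c.zip
```

Returning this from index() is what allows a chain of the shape find(...).index(n).download(...) in a fluent API.
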

Command line example:

endl d "http://www.mp3tag.de/en/download.html" "div.download a"

More info about Command Line

Go to API page