sinew-node

Sinew-Node collects structured data from web sites (screen scraping).

Sinew-Node is distributed as an npm package:

npm install sinew-node

Here's an example that prints the first five post titles from Reddit's r/javascript listing:

sinew = require '../lib/sinew-node'
sinew.get 'http://www.reddit.com/r/javascript/', ->
  (@$ '#siteTable div.thing a.title').each (index)->
    console.log @innerHTML if index < 5

• Sinew caches all HTTP requests on disk. That makes it possible to iterate quickly. Crawl once and then continue to work on your recipe. Run the recipe over and over while you tune your CSS selectors and regular expressions.
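
To illustrate the idea behind the disk cache, here is a minimal sketch in plain JavaScript. It is not Sinew-Node's actual implementation: `cachePath`, `cachedGet`, the SHA-1 file naming, and the temporary cache directory are all assumptions made for this example, and the fetcher is a synchronous stand-in for what would normally be an asynchronous HTTP request.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper (not part of sinew-node's API):
// map a URL to a stable file name by hashing it.
function cachePath(cacheDir, url) {
  const key = crypto.createHash('sha1').update(url).digest('hex');
  return path.join(cacheDir, key + '.html');
}

// Hypothetical helper: return the body for `url`, calling
// `fetchBody(url)` only when no cached copy exists on disk.
function cachedGet(cacheDir, url, fetchBody) {
  const file = cachePath(cacheDir, url);
  if (fs.existsSync(file)) {
    return fs.readFileSync(file, 'utf8'); // cache hit: no request made
  }
  const body = fetchBody(url); // cache miss: fetch once
  fs.mkdirSync(cacheDir, { recursive: true });
  fs.writeFileSync(file, body, 'utf8');
  return body;
}

// Demo with a fake fetcher so the example runs offline.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'sinew-cache-'));
let requests = 0;
const fakeFetch = (url) => {
  requests += 1;
  return '<html>' + url + '</html>';
};

const first = cachedGet(demoDir, 'http://www.reddit.com/r/javascript/', fakeFetch);
const second = cachedGet(demoDir, 'http://www.reddit.com/r/javascript/', fakeFetch);
console.log(requests, first === second);
```

The second call is served from disk, so the fetcher runs only once. That is what lets you rerun a recipe repeatedly while adjusting selectors without hitting the site again.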