crawlstream
A website crawler that gives a readable stream of request streams.
Development of this module has been sponsored by Knowit.
Installation
$ npm install crawlstream
Running the tests
$ npm test
Examples
Printing out the paths of all the pages found.
Streaming API
var crawlstream = require('crawlstream');
crawlstream('mysite.com', 10)
  .on('data', function(req) {
    console.log(req.uri.path);
  });
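To act on the full list of paths rather than printing them one by one, the 'data' events can be accumulated until the stream finishes. The following is only a sketch under stated assumptions: collectPaths is a hypothetical helper, not part of crawlstream, and it assumes the crawl stream emits 'end' and 'error' like a standard Node readable stream. A stand-in object stream replaces the real crawl so the snippet runs without touching the network.

```javascript
var stream = require('stream');

// Hypothetical helper (not part of crawlstream): gather req.uri.path from
// every item the crawl stream emits, then call done(err, paths) once.
function collectPaths(crawl, done) {
  var paths = [];
  var finished = false;
  function finish(err) {
    if (finished) return;
    finished = true;
    done(err, err ? undefined : paths);
  }
  crawl.on('data', function(req) {
    paths.push(req.uri.path);
  });
  crawl.on('error', finish);
  crawl.on('end', function() {
    finish(null);
  });
}

// Stand-in for crawlstream('mysite.com', 10): an object-mode stream that
// emits objects shaped like the request streams used in the examples.
var fakeCrawl = new stream.PassThrough({ objectMode: true });
collectPaths(fakeCrawl, function(err, paths) {
  if (err) return console.error(err);
  console.log(paths.join('\n'));
});
fakeCrawl.write({ uri: { path: '/' } });
fakeCrawl.write({ uri: { path: '/about' } });
fakeCrawl.end();
```

With a real crawl, the stream returned by crawlstream(...) would be passed in place of fakeCrawl, provided the 'end' assumption holds.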
Callback API
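var crawlstream = require('crawlstream');
crawlstream('mysite.com', 10, function(err, req) {
  // Report crawl errors instead of reading from an undefined req.
  if (err) return console.error(err);
  console.log(req.uri.path);
});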
Methods
var crawlstream = require('crawlstream');
crawlstream(baseUrl, concurrency, limit, [callback])
Crawl all pages under baseUrl.
Optionally supply a callback(err, req), which is called with the request stream(!) for each page found.
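Because the callback receives a stream rather than a finished page body, the content has to be read from that stream. The sketch below is one possible way to do it, not part of crawlstream: readBody is a hypothetical helper, and it assumes each req emits the page content as 'data' chunks followed by 'end'. A stand-in stream is used so the snippet runs on its own.

```javascript
var stream = require('stream');

// Hypothetical helper (not part of crawlstream): buffer a readable stream
// and pass the complete content to done(err, body) as a UTF-8 string.
function readBody(req, done) {
  var chunks = [];
  req.on('data', function(chunk) {
    chunks.push(Buffer.from(chunk));
  });
  req.on('error', done);
  req.on('end', function() {
    done(null, Buffer.concat(chunks).toString('utf8'));
  });
}

// Stand-in for a req handed to the crawlstream callback.
var fakeReq = new stream.PassThrough();
readBody(fakeReq, function(err, body) {
  if (err) return console.error(err);
  console.log(body.length + ' characters');
});
fakeReq.end('<html><body>Hello</body></html>');
```

Buffering keeps the example short. For large pages, piping req to its destination would avoid holding the whole body in memory.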
License
Copyright 2012 Knowit
MIT