google-search-scraper
Google search scraper with captcha solving support
This module lets you extract Google search results in a simple yet flexible way, and handles captcha solving transparently (through external services or your own hand-made solver).
Out of the box you can target a specific Google search host, specify a language, and limit the number of search results returned. These defaults can be extended with custom URL parameters passed through the options.
A word of warning: This code is intended for educational and research use only. Use responsibly.
Installation
$ npm install google-search-scraper
Examples
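Grab the first 10 results for 'nodejs'
    var scraper = require('google-search-scraper');

    var options = {
      query: 'nodejs',
      limit: 10
    };

    scraper.search(options, function(err, url) {
      // Called once for each result
      if (err) throw err;
      console.log(url);
    });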
Various options combined
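    var scraper = require('google-search-scraper');

    var options = {
      query: 'grenouille',
      host: 'www.google.fr',
      lang: 'fr',
      age: 'd1', // last 24 hours ([hdwmy]\d? as in google URL)
      limit: 10,
      params: {} // params will be copied as-is in the search URL query string
    };

    scraper.search(options, function(err, url) {
      // Called once for each result
      if (err) throw err;
      console.log(url);
    });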
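To make the option-to-URL mapping more concrete, here is a rough sketch of how such options might end up in a search URL. This is not the module's actual code: `buildSearchUrl` is a hypothetical helper, the parameter names (`q`, `hl`, `tbs=qdr:`) are assumed from Google's public URL format, and `safe: 'active'` is only an illustrative extra parameter.

```javascript
// Hypothetical helper, NOT part of google-search-scraper.
// The URL parameter names (q, hl, tbs) are assumptions based on
// Google's public search URL format.
function buildSearchUrl(options) {
  const host = options.host || 'www.google.com';
  const url = new URL('https://' + host + '/search');

  url.searchParams.set('q', options.query);
  if (options.lang) url.searchParams.set('hl', options.lang);

  if (options.age) {
    // Same pattern as documented for `age`: a unit letter and an optional digit
    if (!/^[hdwmy]\d?$/.test(options.age)) {
      throw new Error('Invalid age: ' + options.age);
    }
    url.searchParams.set('tbs', 'qdr:' + options.age);
  }

  // Extra params are copied as-is; in this sketch they can override the defaults
  for (const [key, value] of Object.entries(options.params || {})) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

console.log(buildSearchUrl({
  query: 'grenouille',
  host: 'www.google.fr',
  lang: 'fr',
  age: 'd1',
  params: { safe: 'active' } // illustrative extra parameter
}));
```

The point of the sketch is only that `params` is a pass-through: anything placed there is appended to the query string without interpretation.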
Extract all results on edu sites for "information theory" and solve captchas along the way
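    var scraper = require('google-search-scraper');
    var DeathByCaptcha = require('deathbycaptcha');

    var dbc = new DeathByCaptcha('username', 'password');

    var options = {
      query: 'site:edu "information theory"',
      age: 'y', // less than a year
      solver: dbc
    };

    scraper.search(options, function(err, url) {
      // Called once for each result
      if (err) throw err;
      console.log(url);
    });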
You can easily plug in your own solver by implementing a solve method with the following signature:
    var customSolver = {
      solve: function(imageData, callback) {
        // Do something with image data, like displaying it to the user
        // id is used by DBC to allow reporting solving errors and can be safely ignored here
        var id = null;
        callback(null, id, solutionText); // solutionText: the text read from the captcha image
      }
    };
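As an illustration, a manual solver might save the captcha image to disk and hand its path to a function that asks a human for the answer. The sketch below assumes the solver contract is `solve(imageData, callback)` with `callback(err, id, solution)`, and that `imageData` is a Buffer or string; `createManualSolver` and `askUser` are hypothetical names, not part of the module.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical factory, NOT part of google-search-scraper.
// `askUser(imagePath, done)` is supplied by the caller: it shows the image
// to a human and calls done(err, text) with what they typed.
function createManualSolver(askUser) {
  return {
    // Assumed contract: solve(imageData, callback), callback(err, id, solution)
    solve: function (imageData, callback) {
      // The .jpg extension is a guess; the real image format is not documented here
      const imagePath = path.join(os.tmpdir(), 'captcha-' + Date.now() + '.jpg');
      try {
        // A synchronous write keeps the sketch short
        fs.writeFileSync(imagePath, imageData);
      } catch (err) {
        return callback(err);
      }
      askUser(imagePath, function (askErr, text) {
        if (askErr) return callback(askErr);
        // No external service is involved, so there is no id to report
        callback(null, null, text);
      });
    }
  };
}
```

The object returned by `createManualSolver` could then be passed as `solver` in the search options, in place of the DeathByCaptcha client.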