Wappalyzer-puppeteer
Wappalyzer-puppeteer is a simple library, built on top of the Wappalyzer dataset, that uncovers the technologies used on websites.
Wappalyzer uses zombie, which can handle most websites but fails on complex, large, or JavaScript-heavy sites.
Since puppeteer has been stable and available for years, this project combines the best of both worlds: a real browser and Wappalyzer's dataset.
The internal logic is rewritten from scratch, since the original Wappalyzer code relies heavily on Promises and on-the-fly regex parsing.
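One way to avoid on-the-fly regex parsing is to compile every pattern once, when the dataset is loaded, and reuse the compiled matchers for each page. The sketch below assumes Wappalyzer's pattern string syntax (a regex followed by `\;`-separated tags such as `version:\1` or `confidence:50`). The function names `parsePattern`, `compileApps`, and `detect`, and the two-entry sample dataset, are hypothetical and are not taken from this library's internals.

```javascript
// Parse one Wappalyzer-style pattern string into a precompiled matcher.
// Assumed syntax: "<regex>\;version:\1\;confidence:50"
function parsePattern(pattern) {
  const [source, ...tags] = pattern.split('\\;');
  const parsed = { regex: new RegExp(source, 'i'), version: '', confidence: 100 };
  for (const tag of tags) {
    const idx = tag.indexOf(':');
    if (idx === -1) continue;
    const key = tag.slice(0, idx);
    const value = tag.slice(idx + 1);
    if (key === 'version') parsed.version = value;
    if (key === 'confidence') parsed.confidence = Number(value);
  }
  return parsed;
}

// Compile the "script" patterns of every app once, up front.
function compileApps(apps) {
  return Object.entries(apps).map(([name, app]) => ({
    name,
    script: [].concat(app.script || []).map(parsePattern),
  }));
}

// Match precompiled patterns against the script URLs found on a page.
function detect(compiled, scriptUrls) {
  const found = [];
  for (const app of compiled) {
    for (const p of app.script) {
      const hit = scriptUrls.map((u) => p.regex.exec(u)).find(Boolean);
      if (!hit) continue;
      // Substitute capture groups (\1, \2, ...) into the version template.
      const version = p.version.replace(/\\(\d)/g, (_, n) => hit[n] || '');
      found.push({ name: app.name, version, confidence: p.confidence });
      break;
    }
  }
  return found;
}

// Hypothetical sample dataset, shaped like entries in apps.json.
const compiled = compileApps({
  jQuery: { script: 'jquery[.-]([\\d.]*\\d)[^/]*\\.js\\;version:\\1' },
  React: { script: ['react(?:\\.min)?\\.js\\;confidence:50'] },
});
const result = detect(compiled, ['https://example.com/js/jquery-3.6.0.min.js']);
console.log(result);
```

Only script URLs are matched here to keep the sketch short; headers, HTML, and cookies could be compiled and matched the same way.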
Installation
$ npm i -g wappalyzer-puppeteer # Globally
$ npm i wappalyzer-puppeteer --save # As a dependency
There are three main dependencies for this project:
- Wappalyzer - for apps.json only
- puppeteer-cluster
- puppeteer
Run from the command line
wappalyzer [url] [options]
Options
--max-wait=ms Wait no more than ms milliseconds for page resources to load.
--user-agent=str Set the user agent string.
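The flags above appear to correspond to the `maxWait` and `userAgent` option names used when running from a script. As a minimal sketch of that mapping, and not necessarily how the real CLI parses its arguments, a hypothetical `parseArgs` helper could convert `--kebab-case=value` flags into a camelCase options object:

```javascript
// Map "--max-wait=5000" style flags to camelCase option names.
// parseArgs is a hypothetical helper, not part of the wappalyzer CLI.
function parseArgs(argv) {
  const options = {};
  let url;
  for (const arg of argv) {
    const match = /^--([^=]+)(?:=(.*))?$/.exec(arg);
    if (!match) {
      if (url === undefined) url = arg; // first bare argument is the URL
      continue;
    }
    // "max-wait" -> "maxWait", "user-agent" -> "userAgent"
    const key = match[1].replace(/-([a-z])/g, (_, c) => c.toUpperCase());
    const raw = match[2] === undefined ? true : match[2];
    // maxWait is a number of milliseconds; other values stay as given.
    options[key] = key === 'maxWait' ? Number(raw) : raw;
  }
  return { url, options };
}

const parsed = parseArgs([
  'https://www.wappalyzer.com',
  '--max-wait=5000',
  '--user-agent=Wappalyzer',
]);
console.log(parsed);
```

In a real command-line entry point the input would be `process.argv.slice(2)` rather than a literal array.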
Run from a script
const { AppAnalytics, PuppeteerCluster, Cluster } = …;

const url = 'https://www.wappalyzer.com';

const options = {
  maxWait: 5000,
  userAgent: 'Wappalyzer',
  // puppeteerClusterOptions is passed to puppeteer-cluster
  // More options here: https://github.com/thomasdondorf/puppeteer-cluster
  puppeteerClusterOptions: {
    concurrency: Cluster.CONCURRENCY_CONTEXT,
    maxConcurrency: 2,
    puppeteerOptions: {
      headless: true,
      ignoreHTTPSErrors: true
    }
  }
};

const appAnalytics = …;
const wappalyzer = …(appAnalytics, options);

// Load apps.json (you can provide your own json file as well)
appAnalytics…
  // start the puppeteer cluster
  // queue an url and wait for the result
  // do whatever you want with the result
  // close the cluster
;
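The steps named in the comments above (start the cluster, queue a URL, use the result, close the cluster) can be illustrated with puppeteer-cluster used directly. This is a rough sketch and not wappalyzer-puppeteer's own API: `collect` and `gotoOptions` are hypothetical helpers, the `domcontentloaded` wait condition is an assumption, and the result contains only raw page signals rather than detected technologies.

```javascript
// Hypothetical sketch using puppeteer-cluster directly.
// It mirrors the listed steps; it is NOT wappalyzer-puppeteer's API.
const options = {
  maxWait: 5000,
  userAgent: 'Wappalyzer',
  puppeteerClusterOptions: {
    maxConcurrency: 2,
    puppeteerOptions: { headless: true },
  },
};

// Navigation settings derived from the options (pure, no browser needed).
// Assumption: maxWait is used as the navigation timeout.
function gotoOptions({ maxWait }) {
  return { timeout: maxWait, waitUntil: 'domcontentloaded' };
}

async function collect(url, opts = options) {
  // Required lazily so that loading this file does not need a browser.
  const { Cluster } = require('puppeteer-cluster');

  // Start the puppeteer cluster.
  const cluster = await Cluster.launch({
    concurrency: Cluster.CONCURRENCY_CONTEXT,
    ...opts.puppeteerClusterOptions,
  });

  try {
    // Define what happens for each queued URL.
    await cluster.task(async ({ page, data }) => {
      await page.setUserAgent(opts.userAgent);
      await page.goto(data, gotoOptions(opts));
      return {
        html: await page.content(),
        scripts: await page.evaluate(() =>
          Array.from(document.scripts, (s) => s.src).filter(Boolean)),
      };
    });

    // Queue the URL and wait for its result.
    return await cluster.execute(url);
  } finally {
    // Close the cluster even if the task failed.
    await cluster.idle();
    await cluster.close();
  }
}
```

With puppeteer and puppeteer-cluster installed, `collect('https://www.wappalyzer.com').then(console.log)` would print the collected HTML and script URLs, which could then be matched against the dataset.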