Automate full website screen shots and PDF generation with multiple view port support
- Crawls the specified host and generates a sitemap.xml on the fly
- Generates entire website screen shots based on the sitemap.xml file
- Define multiple view ports
- Automated PDF generation
- Includes crawled meta data in generated PDF
- Reports on broken website links (404 http response)
- Supports HTTP basic authentication
- Supports Microsoft Online 3-step authentication
- Supports Salesforce Visualforce 3-step authentication
- Supports site maps with HTTP, HTTPS, and FTP protocol URLs
- Follows HTTP 301 redirects
- Triggers page events by passing querystring values to a custom inject.js file
Install the following prerequisite on your development machine:
Notable npm Modules
$ npm install siteshooter --global
If siteshooter is installed, make sure you have the latest version by running:
$ npm update siteshooter --global
- You may need to run these commands with elevated privileges (e.g. sudo); you will be prompted to do so if needed.
- Installing with the --global flag makes the siteshooter command available on your machine's command line at any path.
- Read more about the
Create a Siteshooter Configuration File
$ siteshooter --init
Update Siteshooter Configuration File
Edit siteshooter.yml to add additional options.
- All Simple Web Crawler options can be added to sitecrawler_options and will pass through to the crawler process
- Generated screenshot image files are optimized using imagemin and imagemin-pngquant modules, which reduce the overall size of generated PDFs. To adjust the image quality, update the image_quality option in your siteshooter.yml file.
domain:
  name:
  auth:
    user:
    pwd:
pdf_options:
  excludeMeta: true
screenshot_options:
  delay: 2000
  image_quality: '60-80'
sitecrawler_options:
  exclude:
    - "pdf"
  stripQuerystring: false
  ignoreInvalidSSL: true
viewports:
  - viewport: desktop-large
    width: 1600
    height: 1200
  - viewport: tablet-landscape
    width: 1024
    height: 768
  - viewport: iPhone5
    width: 320
    height: 568
  - viewport: iPhone6
    width: 375
    height: 667
$ siteshooter --help

Usage: siteshooter [options]

OPTIONS
_______________________________________________________________________________________
-c --config       Show configuration
-C --cwd          Set working directory, which will load a siteshooter.yml file in the specified path
-e --debug        Output exceptions
-h --help         Print this help
-i --init         Create siteshooter.yml template file in working directory
-p --pdf          Generate PDFs, by defined view ports, based on screen shots created via Siteshooter
-q --quiet        Only return final output
-s --screenshots  Generate screen shots, by view ports, based on sitemap.xml file
-S --sitemap      Crawl domain name specified in siteshooter.yml file and generate a local sitemap.xml file
-v --version      Print version number
-V --verbose      Verbose output
-w --website      Report on website information based on Siteshooter crawled results
When running a siteshooter command without any options, the following options will run in order by default:
To manipulate the DOM prior to the screen shot process, add an inject.js file in the same working directory as the siteshooter.yml file.
Example: inject.js file
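As a minimal sketch of what such a file might contain, the snippet below hides elements that would otherwise clutter a screen shot. The hideElements helper and the two CSS selectors are hypothetical and not part of Siteshooter; the sketch also assumes inject.js is evaluated in the page context, and it sticks to ES5 syntax because the page is rendered by PhantomJS.

```javascript
/**
 * @file: inject.js
 * @description: hides distracting elements before the screen shot is taken
 */

// Hypothetical helper: hide every element matching any of the given selectors.
// Returns the number of elements that were hidden.
function hideElements(doc, selectors) {
    var hidden = 0;
    for (var i = 0; i < selectors.length; i++) {
        var nodes = doc.querySelectorAll(selectors[i]);
        for (var j = 0; j < nodes.length; j++) {
            nodes[j].style.display = 'none';
            hidden++;
        }
    }
    return hidden;
}

// Only touch the DOM when a document is actually available.
if (typeof document !== 'undefined') {
    // Assumed selectors; replace with the ones used on your site.
    hideElements(document, ['.cookie-banner', '.chat-widget']);
}
```

Passing the document in as an argument keeps the helper easy to exercise outside the browser with a stub object.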
When using the optional inject.js file, events can be triggered based on the following querystring parameter: pevent
// Add URL with pevent querystring parameter in the generated sitemap.xml
<url>
    <loc>https://www.devopsgroup.io?pevent=open-privacy-overlay</loc>
    <changefreq>weekly</changefreq>
</url>
Example: Event detection & triggering
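One way the detection and triggering could look inside inject.js is sketched below. It reads the pevent value from the querystring and runs a matching handler. The function names, the pageEvents map, and the .js-privacy-overlay-trigger selector are assumptions for illustration; only the pevent parameter and the open-privacy-overlay value come from the sitemap example above. The code uses ES5 and a manually dispatched click event so that it does not rely on newer browser APIs.

```javascript
// Return the value of the pevent querystring parameter, or null if absent.
function getPageEvent(search) {
    var query = (search || '').replace(/^\?/, '');
    var pairs = query.split('&');
    for (var i = 0; i < pairs.length; i++) {
        var parts = pairs[i].split('=');
        if (decodeURIComponent(parts[0]) === 'pevent' && parts.length > 1) {
            return decodeURIComponent(parts.slice(1).join('='));
        }
    }
    return null;
}

// Hypothetical map of event names to handlers.
var pageEvents = {
    'open-privacy-overlay': function (doc) {
        // Assumed selector for the element that opens the overlay.
        var trigger = doc.querySelector('.js-privacy-overlay-trigger');
        if (trigger) {
            var evt = doc.createEvent('MouseEvents');
            evt.initEvent('click', true, true);
            trigger.dispatchEvent(evt);
        }
    }
};

// Run the handler registered for the requested event, if there is one.
// Returns the event name that was handled, or null.
function triggerPageEvent(search, doc) {
    var name = getPageEvent(search);
    if (name && Object.prototype.hasOwnProperty.call(pageEvents, name)) {
        pageEvents[name](doc);
        return name;
    }
    return null;
}

// Only run automatically when loaded inside a page.
if (typeof window !== 'undefined' && typeof document !== 'undefined') {
    triggerPageEvent(window.location.search, document);
}
```

Unknown pevent values are ignored, so URLs without a registered handler are captured unchanged.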
Tests are written with Mocha and can be run with
If you're having issues with Siteshooter, submit a GitHub Issue.
- Make sure you have a siteshooter.yml file in your working directory and that the YAML file is well formatted
- Experiencing font-loading issues? Try increasing the delay setting in your siteshooter.yml file
- Trying to take a screenshot of a page with a video? Unfortunately, PhantomJS does not support videos. As such, here's one approach to showing a video's poster image.
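/**
 * @file: inject.js
 * @description: used to display a video's poster image
 */
// Sketch under an assumption: the poster is shown as the background of the video's parent.
var videos = document.querySelectorAll('video');
if (videos.length > 0) {
    for (var i = 0; i < videos.length; i++) {
        var poster = videos[i].getAttribute('poster');
        if (poster) {
            videos[i].parentNode.style.backgroundImage = 'url("' + poster + '")';
            videos[i].style.visibility = 'hidden';
        }
    }
}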
- SimpleCrawler TypeError: The header content contains invalid characters
  - Try setting the acceptCookies option to false under sitecrawler_options in your siteshooter.yml file
Code of Conduct
Take a moment to read our Code of Conduct.
Contributing to the project
We are always looking for quality contributions! Please check the CONTRIBUTING.md for contribution guidelines.