npm install --save @studiowebux/crawler
- Open a browser and a page, then navigate to a URL
- Collect HAR Data
- Collect media resources (Audio, Video, Images)
- Take a screenshot (full page or not)
- Extract HTML Elements
- Extract named Metadata
- Generate an identifier (using the page title and URL)
- Extract HTML (`document.*`) and inner HTML (`body.innerHTML`)
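
The identifier feature above combines the page title and URL. How the crawler derives it is not documented here, so the following is only a minimal sketch of one plausible approach: hash the normalized title and URL with SHA-256 and keep a short hex prefix. The function name `generateIdentifier`, the 16-character length, and the normalization steps are all assumptions made for illustration and are not part of the `@studiowebux/crawler` API.

```javascript
// Hypothetical helper for illustration only; NOT the crawler's actual implementation.
const { createHash } = require("crypto");

function generateIdentifier(title, url) {
  // Normalize both inputs so trivial differences (surrounding whitespace,
  // host casing, a missing trailing slash on the origin) yield the same id.
  const normalizedTitle = title.trim();
  const normalizedUrl = new URL(url).href;

  // Hash "title\nurl" and keep the first 16 hex characters (an assumed length).
  return createHash("sha256")
    .update(`${normalizedTitle}\n${normalizedUrl}`)
    .digest("hex")
    .slice(0, 16);
}

const id = generateIdentifier("Example Domain", "https://example.com/");
console.log(id);
```

A hash keeps the identifier at a fixed length and safe to use in filenames, at the cost of not being human-readable. A slug built from the title would be the opposite trade-off.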
Run the example:
node __tests__/index.js