vacuumjs

A low-level Node.js web page content extractor based on parse5.

Usage

var parse5 = require('parse5')
var extract = require('vacuumjs')

// the page to extract content from
var targetDOM = parse5.parse('some page content')
// the reference DOM (required, not optional)
var refDOM = parse5.parse('reference page content')
console.log(extract(targetDOM, refDOM))

Principles

  • Layout similarity
  • Text density
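
The two heuristics listed above can be illustrated with a minimal sketch. This is not vacuumjs's actual implementation, which this README does not describe. Every function name here (textLength, tagCount, textDensity, tagPaths, layoutSimilarity, el, text) is hypothetical, and the formulas are assumptions: text density is taken as characters of text per element, and layout similarity as the Jaccard overlap of tag paths. The nodes are hand-built plain objects in the shape of parse5's default tree adapter (nodeName, childNodes, and value on text nodes), so the sketch runs without any dependency.

```javascript
// Hypothetical helpers, not part of the vacuumjs API.
// Nodes mimic parse5's default tree adapter: text nodes are
// { nodeName: '#text', value }, other nodes carry childNodes.

function isElement(node) {
  return node.nodeName.charAt(0) !== '#'
}

// Total number of text characters under a node (each text node trimmed).
function textLength(node) {
  if (node.nodeName === '#text') return node.value.trim().length
  var children = node.childNodes || []
  return children.reduce(function (sum, child) {
    return sum + textLength(child)
  }, 0)
}

// Number of element nodes in the subtree, including the node itself.
function tagCount(node) {
  var children = node.childNodes || []
  return children.reduce(function (sum, child) {
    return sum + tagCount(child)
  }, isElement(node) ? 1 : 0)
}

// Assumed definition: characters of text per element in the subtree.
function textDensity(node) {
  return textLength(node) / Math.max(tagCount(node), 1)
}

// Collect the set of tag paths such as 'ul/li/a' found in a subtree.
function tagPaths(node, prefix, out) {
  out = out || new Set()
  var path = prefix || ''
  if (isElement(node)) {
    path = path ? path + '/' + node.nodeName : node.nodeName
    out.add(path)
  }
  var children = node.childNodes || []
  children.forEach(function (child) {
    tagPaths(child, path, out)
  })
  return out
}

// Assumed definition: Jaccard overlap of the two tag-path sets.
function layoutSimilarity(a, b) {
  var pathsA = tagPaths(a)
  var pathsB = tagPaths(b)
  var shared = 0
  pathsA.forEach(function (p) {
    if (pathsB.has(p)) shared += 1
  })
  var union = pathsA.size + pathsB.size - shared
  return union === 0 ? 1 : shared / union
}

// Tiny hand-built trees standing in for parsed pages.
function el(name, children) {
  return { nodeName: name, tagName: name, childNodes: children || [] }
}
function text(value) {
  return { nodeName: '#text', value: value }
}

var nav = el('ul', [
  el('li', [el('a', [text('Home')])]),
  el('li', [el('a', [text('About')])])
])
var article = el('div', [
  el('p', [text('A long paragraph of article text that carries the actual content.')])
])

console.log(textDensity(nav), textDensity(article))
console.log(layoutSimilarity(nav, nav), layoutSimilarity(nav, article))
```

In this sketch the link-heavy navigation list scores a much lower density than the article block. Subtrees whose tag paths also appear in a reference page would be candidates for shared template markup rather than page-specific content. How vacuumjs actually combines the two signals is not stated here.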

Install

npm i vacuumjs

Version

1.0.1

License

MIT

Collaborators

  • damngoto