Wrapper around libhubbub, API-compatible with node-htmlparser 1.x and 2.x


A forgiving HTML parser with a native backend based on the HTML parser from the NetSurf browser. It is fully backwards compatible with both tautologistics/node-htmlparser 1.x and 2.x.

The tautologistics parser could not handle some kinds of HTML, so I created this native addon, which uses an actual web browser's parser. It can be operated in blocking or non-blocking mode.

Like the tautologistics parser, it can also operate in chunked mode. There are currently a few known UTF-8 conversion bugs, which I hope to fix soon.
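As a rough illustration of chunked mode, here is a minimal sketch. It assumes the module exposes the same interface as node-htmlparser 1.x (`DefaultHandler`, `Parser`, `parseChunk`, `done`), which is what the compatibility claim above suggests but which is not spelled out here. The helper name `parseInChunks` is hypothetical, and the parser module is passed in as an argument so the sketch does not depend on how the addon is loaded.

```javascript
// Sketch only: `parserLib` is assumed to expose the node-htmlparser 1.x
// interface (DefaultHandler, Parser, parseChunk, done).
// `parseInChunks` is a hypothetical helper, not part of the module.
function parseInChunks(parserLib, chunks, callback) {
  // The handler collects the DOM and reports it through callback(error, dom).
  var handler = new parserLib.DefaultHandler(callback);
  var parser = new parserLib.Parser(handler);

  // Feed the input piece by piece, e.g. as it arrives from a stream.
  chunks.forEach(function (chunk) {
    parser.parseChunk(chunk);
  });

  // Signal the end of input so the handler can finish and call back.
  parser.done();
}

// Possible usage, assuming the addon is installed as "hubbub":
// parseInChunks(require("hubbub"), ["<p>Hello ", "world</p>"], function (err, dom) {
//   if (err) throw err;
//   console.log(JSON.stringify(dom));
// });
```

Parsing a complete string in one call would use `parseComplete` in the node-htmlparser 1.x interface instead of the `parseChunk`/`done` pair.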

$ npm install hubbub

You can use it with jsdom, overriding the default parser by invoking node-hubbub's jsdom configuration function before requiring jsdom. Here's a brief example:

var jsdom = require('hubbub').jsdomConfigure(require("jsdom"));

jsdom.env({
  html: "",
  scripts: [""],
  done: function (errors, window) {
    var $ = window.$;
    console.log("HN Links");
    $("td.title:not(:last) a").each(function() {
      console.log(" -", $(this).text());