minimize

Minimize HTML

HTML minifier

Minimize is an HTML minifier based on node-htmlparser. Relying on this dependency keeps the output solid and correct. Minimize focuses on HTML5 and will not support older HTML drafts; supporting them is not worth the effort, and the web should move forward. Currently, Minimize is only usable server side. Client-side minification will be added in a future release.

Minimize does not parse inline PHP or raw template files. Templates are not valid HTML, so they are outside the scope of Minimize. Instead, parse and minify the output of the templating engine.

  • fast and stable HTML minification (no inline PHP or templates)
  • highly configurable
  • CLI interface usable with stdin and files
  • can distinguish conditional IE comments and/or SSI
  • built on the foundations of htmlparser2
  • pluggable interface that allows hooking into each element
  • minification of inline JavaScript with uglify or similar

To get the minified content, make sure to provide a callback. Optionally, an options object can be provided. All options are listed below and default to false.

var Minimize = require('minimize')
  , minimize = new Minimize({
      empty: true,        // KEEP empty attributes 
      cdata: true,        // KEEP CDATA from scripts 
      comments: true,     // KEEP comments 
      ssi: true,          // KEEP Server Side Includes 
      conditionals: true, // KEEP conditional internet explorer comments 
      spare: true,        // KEEP redundant attributes 
      quotes: true,       // KEEP arbitrary quotes 
      loose: true         // KEEP one whitespace 
    });
 
minimize.parse(content, function (error, data) {
  console.log(data);
});
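
Because the result is only available inside the callback, it can be convenient to wrap the call in a Promise. The following is a minimal sketch, not part of Minimize itself: parseAsync is a hypothetical helper name, and it assumes only the parse(content, callback) signature shown above.

```javascript
// Hypothetical helper (not part of minimize): wraps the callback-style
// parse(content, callback) method of a Minimize instance in a Promise.
function parseAsync(minimize, content) {
  return new Promise(function (resolve, reject) {
    minimize.parse(content, function (error, data) {
      if (error) return reject(error); // parsing failed
      resolve(data);                   // minified HTML string
    });
  });
}

// Usage, assuming `minimize` is an instance created as shown above:
// parseAsync(minimize, '<p>  hello  </p>').then(console.log, console.error);
```

The helper takes the instance as an argument, so it works with any configured Minimize instance.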

Supplying a custom instance to do the HTML parsing is possible. For example, this can be useful if the HTML contains SVG or if you need to pass specific options to the parser.

var Minimize = require('minimize')
  , html = require('htmlparser2')
  , minimize = new Minimize(new html.Parser(
      new html.FeedHandler(this.emits('read'))
    ), { /* options */ });
 
minimize.parse(content, function (error, data) {
  console.log(data);
});

Empty

Empty attributes can usually be removed. By default all are removed, except HTML5 data-* and microdata attributes. To retain empty attributes regardless of their value, use:

var Minimize = require('minimize')
  , minimize = new Minimize({ empty: true });
 
minimize.parse(
  '<h1 id=""></h1>',
  function (error, data) {
    // data output: <h1 id=""></h1> 
  }
);
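
The rule described above can be sketched as a small predicate. This is an illustration only, not Minimize's actual implementation: keepAttribute and the microdata list are hypothetical names introduced here, and the list simply enumerates the HTML microdata attributes.

```javascript
// Hypothetical sketch of the "empty" rule; not minimize's real code.
// HTML microdata attributes, which may legitimately be empty (e.g. itemscope).
var microdata = ['itemscope', 'itemtype', 'itemid', 'itemprop', 'itemref'];

function keepAttribute(name, value, options) {
  if (value !== '') return true;                 // non-empty: always kept
  if (options && options.empty) return true;     // empty: true keeps everything
  if (name.indexOf('data-') === 0) return true;  // data-* attributes are excluded
  return microdata.indexOf(name) !== -1;         // microdata is excluded as well
}
```

With these assumptions, an empty id is dropped by default, while an empty data-state or itemscope survives.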

CDATA

CDATA is only required for HTML to parse as valid XML. For normal web pages this is rarely the case, so CDATA around JavaScript can be omitted. By default CDATA is removed; if you would like to keep it, pass true:

var Minimize = require('minimize')
  , minimize = new Minimize({ cdata: true });
 
minimize.parse(
  '<script type="text/javascript">\n//<![CDATA[\n...code...\n//]]>\n</script>',
  function (error, data) {
    // data output: <script type=text/javascript>//<![CDATA[\n...code...\n//]]></script> 
  }
);
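
To illustrate what removing CDATA means for an inline script, here is a minimal sketch. stripCdata is a hypothetical helper, not a Minimize function, and it assumes the commented-out CDATA markers used in the example above.

```javascript
// Hypothetical sketch; not minimize's real code.
// Removes the commented-out CDATA markers that wrap inline script content.
function stripCdata(script) {
  return script
    .replace(/\/\/\s*<!\[CDATA\[/g, '') // opening marker: //<![CDATA[
    .replace(/\/\/\s*\]\]>/g, '')       // closing marker: //]]>
    .trim();                            // drop the leftover newlines
}
```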

Comments

Comments inside HTML are usually beneficial while developing. Hiding your comments in production is sane, safe, and reduces data transfer. If you insist on keeping them, for instance to show a nice easter egg, set the option to true. Keeping comments will also retain any Server Side Includes and conditional IE statements.

var Minimize = require('minimize')
  , minimize = new Minimize({ comments: true });
 
minimize.parse(
  '<!-- some HTML comment -->\n     <div class="slide nodejs">',
  function (error, data) {
    // data output: <!-- some HTML comment --><div class="slide nodejs"> 
  }
);

Server Side Includes (SSI)

Server Side Includes are a special set of commands supported by several web servers. The markup is very similar to regular HTML comments. Minimize can be configured to retain SSI comments.

var Minimize = require('minimize')
  , minimize = new Minimize({ ssi: true });
 
minimize.parse(
  '<!--#include virtual="../quote.txt" -->\n     <div class="slide nodejs">',
  function (error, data) {
    // data output: <!--#include virtual="../quote.txt" --><div class="slide nodejs"> 
  }
);

Conditionals

Conditional comments only work in IE, which makes them well suited for instructions meant only for IE. Minimize can be configured to retain these comments. However, since conditional comments only work up to and including IE9, the default is to remove them.

var Minimize = require('minimize')
  , minimize = new Minimize({ conditionals: true });
 
minimize.parse(
  "<!--[if ie6]>Cover microsofts' ass<![endif]-->\n<br>",
  function (error, data) {
    // data output: <!--[if ie6]>Cover microsofts' ass<![endif]-->\n<br> 
  }
);
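
The comments, ssi and conditionals options interact, so a sketch of the decision may help. The code below is an assumption about how the three kinds of comment could be told apart; it is not taken from Minimize's source. classifyComment and keepComment are hypothetical names, and data stands for the text between the comment delimiters.

```javascript
// Hypothetical sketch; not minimize's real code.
// `data` is the comment text without the surrounding <!-- and -->.
function classifyComment(data) {
  var text = data.trim();
  if (/^#\w+/.test(text)) return 'ssi'; // e.g. #include virtual="..."
  if (/^\[if\s[^\]]+\]/i.test(text) || /<!\[endif\]$/i.test(text)) {
    return 'conditional';               // e.g. [if ie6]>...<![endif]
  }
  return 'comment';
}

function keepComment(data, options) {
  if (options.comments) return true;    // comments: true retains every kind
  var kind = classifyComment(data);
  if (kind === 'ssi') return !!options.ssi;
  if (kind === 'conditional') return !!options.conditionals;
  return false;                         // plain comments are removed by default
}
```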

Spare

Spare attributes are boolean attributes whose value can be omitted in HTML5. To keep these attributes intact for older browsers, supply:

var Minimize = require('minimize')
  , minimize = new Minimize({ spare: true });
 
minimize.parse(
  '<input type="text" disabled="disabled"></h1>',
  function (error, data) {
    // data output: <input type=text disabled=disabled></h1> 
  }
);
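
As a rough illustration of the spare option, the sketch below collapses a boolean attribute to its bare name unless spare is set. formatBoolean and the short booleans list are hypothetical and deliberately incomplete; this is not Minimize's implementation.

```javascript
// Hypothetical sketch; not minimize's real code.
// A small, incomplete sample of HTML boolean attributes.
var booleans = ['checked', 'disabled', 'hidden', 'multiple', 'readonly', 'required', 'selected'];

function formatBoolean(name, value, options) {
  var isBoolean = booleans.indexOf(name) !== -1;
  if (isBoolean && !(options && options.spare)) return name; // value omitted
  return name + '=' + value;                                 // value kept as-is
}
```

Under these assumptions, disabled="disabled" becomes disabled by default and stays disabled=disabled with spare: true.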

Quotes

Quotes are always added around attributes that have spaces or an equal sign in their value. To force quotes around all attributes, pass quotes: true, as shown below.

var Minimize = require('minimize')
  , minimize = new Minimize({ quotes: true });
 
minimize.parse(
  '<p class="paragraph" id="title">\n    Some content\n  </p>',
  function (error, data) {
    // data output: <p class="paragraph" id="title">Some content</p> 
  }
);
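
The quoting rule can be expressed in a few lines. This is a simplified sketch under the rule stated above, not Minimize's actual code: formatAttribute is a hypothetical name, and escaping of quote characters inside values is left out.

```javascript
// Hypothetical sketch; not minimize's real code.
// Quotes are required when the value contains whitespace or an equal sign,
// and forced for every attribute when options.quotes is true.
function formatAttribute(name, value, options) {
  var needsQuotes = /[\s=]/.test(value);
  if (needsQuotes || (options && options.quotes)) {
    return name + '="' + value + '"';
  }
  return name + '=' + value;
}
```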

Loose

By default, Minimize only keeps whitespace in structural elements and removes all other redundant whitespace. Set loose to true if you need whitespace to keep the flow between text and input elements. Downside: whitespace or newlines after block-level elements are also reduced to one trailing whitespace character.

var Minimize = require('minimize')
  , minimize = new Minimize({ loose: true });
 
minimize.parse(
  '<h1>title</h1>  <p class="paragraph" id="title">\n  content\n  </p>    ',
  function (error, data) {
    // data output: <h1>title</h1> <p class="paragraph" id="title"> content </p> ' 
  }
);
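
The difference between the default and loose can be sketched for a single text node. collapseText is a hypothetical helper, not Minimize's implementation, and it ignores structural elements, where whitespace is kept untouched.

```javascript
// Hypothetical sketch; not minimize's real code.
// Collapses every run of whitespace to a single space. With loose: true the
// single leading and trailing space is kept; otherwise it is trimmed away.
function collapseText(text, options) {
  var collapsed = text.replace(/\s+/g, ' ');
  return options && options.loose ? collapsed : collapsed.trim();
}
```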

Plugins

Register a set of plugins that will be run on each iterated element. Plugins are run in order; an error stops the iteration and invokes the completion callback.

var Minimize = require('minimize')
  , minimize = new Minimize({ plugins: [{
      id: 'remove',
      element: function element(node, next) {
        if (node.type === 'text') delete node.data;
        next();
      }
    }]});
 
minimize.parse(
  '<h1>title</h1>',
  function (error, data) {
    // data output: <h1></h1> 
  }
);

Note: plugins have no control over the flow of Minimize. The DOM structure parsed by htmlparser2 is reduced asynchronously. Each element is handed off to the plugin's element method. Plugins nevertheless have full control over the properties of each node, because objects are always passed by reference in JavaScript.
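
The plugin contract (run in order, stop on the first error, mutate the node by reference) can be sketched independently of Minimize. runPlugins below is a hypothetical stand-in written for illustration; it is not how Minimize schedules plugins internally.

```javascript
// Hypothetical sketch; not minimize's real code.
// Calls plugin.element(node, next) for each plugin in order. Passing an error
// to next() stops the iteration and invokes the completion callback.
function runPlugins(node, plugins, done) {
  var index = 0;

  function next(error) {
    if (error) return done(error);                        // stop on first error
    if (index >= plugins.length) return done(null, node); // all plugins ran
    plugins[index++].element(node, next);
  }

  next();
}

// Example: the 'remove' plugin from above deletes the text of a node in place.
var textNode = { type: 'text', data: 'title' };
runPlugins(textNode, [{
  id: 'remove',
  element: function element(node, next) {
    if (node.type === 'text') delete node.data;
    next();
  }
}], function (error, node) {
  // node is the same object as textNode, now without a data property
});
```

Because the node is shared by reference, the change made by the plugin is visible to every later plugin and to the completion callback.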

Tests can be run with any of the following commands. Travis CI is used for continuous integration.

make test
make test-watch
npm test

Minimize is influenced by the HTML minifier of kangax. That module parses the DOM as a string as opposed to an object. However, retaining flow is more difficult if the DOM is parsed sequentially. Minimize is not client-side ready. The kangax minifier also provides some additional options, like linting. Minimize sticks strictly to the business of minifying. Minimize is already used in production by Nodejitsu.

node-htmlparser of fb55 is used to create an object representation of the DOM.