Simple HTML tag-stripping library
Install
via npm:
    npm install htmlstrip-native
or from source:
    git clone git://github.com/zaro/node-htmlstrip-native.git
    cd node-htmlstrip-native
    npm install
Use
Example:
    var html_strip = require('htmlstrip-native');

    var html = '<style>b {color: red;}</style>' +
        ' Yey, <b> No more, tags</b>' +
        '<script>document.write("Hello from Javascript")</script>';
    var options = {
        include_script : false,
        include_style : false,
        compact_whitespace : true,
        include_attributes : { 'alt': true }
    };

    // Strip tags and decode HTML entities
    var text = html_strip.html_strip(html, options);

    console.log(text);

    // Decode HTML entities only (function name assumed here: html_entities_decode)
    var no_entities = html_strip.html_entities_decode(html);
The html_strip function expects its first argument to be either a string or a 'utf-16le' encoded Buffer. The optional second argument can hold the following options:
    {
        include_script : true,      // include the content of <script> tags
        include_style : true,       // include the content of <style> tags
        compact_whitespace : false, // compact consecutive '\s' whitespace into single char
        include_attributes : {      // include attribute values in the output
            '*': true,              // special value, means: include ALL attributes
            'alt': true             // include attributes named 'alt'
        }
    }
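Since the first argument may also be a 'utf-16le' encoded Buffer, here is a minimal sketch of how such a Buffer could be built with Node's built-in Buffer API. The helper name toUtf16leBuffer is hypothetical and not part of this module; the sketch only shows the encoding step, not the call into html_strip.

```javascript
// Hypothetical helper (not part of htmlstrip-native):
// encode a JavaScript string as a 'utf-16le' Buffer.
function toUtf16leBuffer(str) {
  // 'utf16le' is the encoding name Node's Buffer API uses for UTF-16 little-endian.
  return Buffer.from(str, 'utf16le');
}

var input = '<b>Hi</b>';
var buf = toUtf16leBuffer(input);

// Each of these characters is one UTF-16 code unit, i.e. two bytes.
console.log(buf.length);

// Decoding with the same encoding gives back the original string.
console.log(buf.toString('utf16le'));
```

Data that arrives as a UTF-8 Buffer would first need to be decoded to a string (for example with buf.toString('utf8')) before being re-encoded this way.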
Speed
The same result can be achieved fairly simply without native modules, for example with htmlparser2. This module, however, is ~30 times faster than using htmlparser2.
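To check a speed figure like this against your own input, a small timing harness along the following lines may help. This is only a sketch: timeIt and naiveStrip are hypothetical names, and naiveStrip is a regex stand-in used so the example runs without any dependencies. It is not how this module or htmlparser2 work, and a regex is not a real HTML parser.

```javascript
// Stand-in stripper for illustration only (assumption: NOT this module's
// algorithm). Removes anything that looks like a tag.
function naiveStrip(html) {
  return html.replace(/<[^>]*>/g, '');
}

// Call fn(input) `iterations` times and return the elapsed time in milliseconds.
function timeIt(fn, input, iterations) {
  var start = process.hrtime.bigint();
  for (var i = 0; i < iterations; i++) {
    fn(input);
  }
  var end = process.hrtime.bigint();
  // hrtime.bigint() reports nanoseconds; convert to milliseconds.
  return Number(end - start) / 1e6;
}

var sample = '<p>Hello <b>world</b></p>';
var elapsedMs = timeIt(naiveStrip, sample, 10000);
console.log('naiveStrip: ' + elapsedMs.toFixed(2) + ' ms for 10000 runs');
```

To compare real implementations, pass a wrapper around each one to timeIt with the same input and iteration count, and compare the returned times.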