This module first tries to decode the response using the charset declared in the Content-Type header. If that doesn't work, it tries to extract the charset from the HTML meta tag.
It should work with the popular request library and with Node's http module, and possibly with other libraries too.
The design is simple: wrap your request and listen for the data event.
    var request = require('request'),
        decode = require('request-stream-charset')

    // encoding: null is required, so that request hands over raw Buffers instead of strings
    var req = request({url: 'http://host.my.url', encoding: null}),
        decodedReq = decode(req)

    decodedReq.on('data', function (decodedStringChunk) {
      // consume decoded chunk
    })

    decodedReq.on('end', function () {
      // finished!
    })
Despite the module name, you cannot use pipe, because decodedReq is just an event emitter. I couldn't make the HTML metadata charset detection work with streams, and I don't know whether it's possible. PRs are welcome.
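Since pipe is not available, one way to get the whole body is to buffer the decoded chunks yourself. The following is only a sketch: `collectDecoded` is a hypothetical helper, not part of this module, and it relies on nothing but the data and end events described above. The error listener is an assumption, since this README does not say whether decodedReq emits error. A plain EventEmitter stands in for decodedReq so the sketch runs without a network request.

```javascript
var EventEmitter = require('events').EventEmitter

// Hypothetical helper: gather every decoded chunk, then hand back one string.
function collectDecoded(decodedReq, callback) {
  var chunks = []
  decodedReq.on('data', function (decodedStringChunk) {
    chunks.push(decodedStringChunk)
  })
  decodedReq.on('end', function () {
    callback(null, chunks.join(''))
  })
  // Assumption: failures surface as an 'error' event; the README does not confirm this.
  decodedReq.on('error', function (err) {
    callback(err)
  })
}

// Stand-in emitter playing the role of decode(req).
var fake = new EventEmitter()
collectDecoded(fake, function (err, body) {
  if (err) throw err
  console.log(body)
})
fake.emit('data', 'olá, ')
fake.emit('data', 'mundo')
fake.emit('end')
```

In real code you would pass the result of `decode(req)` instead of `fake`. Keep in mind that this holds the entire body in memory, so it only suits reasonably small responses.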
You can pass the following options:
- order (default: ['decodeFromHeader', 'readHTMLMetadata']) The detection steps to try, in sequence. By default it first tries to decode from the HTTP header and, if that fails, tries the HTML metadata. You can change the order, e.g.:
    decode(req, {order: ['readHTMLMetadata', 'decodeFromHeader']}) // will first read from HTML metadata
- shouldDecodeUtf8 (default: false) Whether to still run the decoding step when the detected encoding is UTF-8.
- maxBytesToReadMeta (default: 1024 * 10) The maximum number of bytes to read while looking for the meta tag charset information. By default it gives up after 10 KB.
- forceDecode (default: null) Shorthand for calling the iconv-lite library directly. Setting it disables all automatic detection. Useful for testing.
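To make the order and maxBytesToReadMeta options more concrete, here is a rough sketch of how a header-then-metadata fallback could work. It is not this module's actual implementation: `charsetFromHeader`, `charsetFromMeta` and `detectCharset` are hypothetical names, and the regular expressions are simplified assumptions about what a charset declaration looks like. Only the option names and the two step names come from the list above.

```javascript
// Hypothetical helpers, not the module's real internals.

// Pull the charset parameter out of a Content-Type header value.
function charsetFromHeader(contentType) {
  var match = /charset\s*=\s*["']?([^"';\s]+)/i.exec(contentType || '')
  return match ? match[1].toLowerCase() : null
}

// Look for a charset declaration inside a <meta> tag,
// scanning at most maxBytes bytes of the body.
function charsetFromMeta(buffer, maxBytes) {
  // latin1 maps every byte to one character, so ASCII markup survives intact.
  var head = buffer.subarray(0, maxBytes).toString('latin1')
  var match = /<meta[^>]*charset\s*=\s*["']?([^"'>\s;\/]+)/i.exec(head)
  return match ? match[1].toLowerCase() : null
}

// Run the detection steps in the configured order; the first hit wins.
function detectCharset(contentType, buffer, options) {
  var detectors = {
    decodeFromHeader: function () {
      return charsetFromHeader(contentType)
    },
    readHTMLMetadata: function () {
      return charsetFromMeta(buffer, options.maxBytesToReadMeta)
    }
  }
  for (var i = 0; i < options.order.length; i++) {
    var found = detectors[options.order[i]]()
    if (found) return found
  }
  return null
}

var html = Buffer.from('<html><head><meta charset="utf-8"></head></html>')
var defaults = {
  order: ['decodeFromHeader', 'readHTMLMetadata'],
  maxBytesToReadMeta: 1024 * 10
}
console.log(detectCharset('text/html; charset=ISO-8859-1', html, defaults))
console.log(detectCharset('text/html', html, defaults))
```

With the default order, the first call reports the charset from the header and the second falls back to the meta tag because the header carries none. Reversing order would make the meta tag win whenever both are present.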
You can turn on debug messages with:
DEBUG=request-stream-charset node yourapp.js
Write tests