Determine the Encoding of a HTML Byte Stream
This package implements the HTML Standard's encoding sniffing algorithm in all its glory. The most interesting part of this is how it pre-scans the first 1024 bytes in order to search for certain
<meta charset>-related patterns.
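To make the pre-scan idea concrete, here is a deliberately simplified sketch of looking for a charset declaration in the first 1024 bytes. It is not this package's implementation and not the full algorithm from the HTML Standard: `prescanMetaCharset` is a hypothetical name, the regular expression is an assumption that only covers the common `<meta charset=...>` and `http-equiv` forms, and the sketch does no comment skipping or encoding-label resolution.

```javascript
// Hypothetical helper for illustration only; the real pre-scan is a
// byte-level state machine, not a regular expression.
function prescanMetaCharset(bytes) {
  // Only the first 1024 bytes are inspected. Decoding them as latin1 maps
  // every byte to exactly one character, so ASCII markup stays searchable
  // whatever the real encoding turns out to be.
  const head = Buffer.from(bytes).subarray(0, 1024).toString("latin1");

  // Matches both <meta charset="..."> and
  // <meta http-equiv="Content-Type" content="text/html; charset=...">.
  const match = /<meta[\s/][^>]*?charset\s*=\s*["']?\s*([^\s"'\/>;]+)/i.exec(head);

  // Returns the raw label as written in the markup, or null if none is found.
  return match ? match[1] : null;
}
```

Called on a buffer that starts with `<meta charset="utf-8">`, this returns the label `"utf-8"`; a declaration that begins after byte 1024 is not seen at all, which is the point of bounding the scan.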
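    const htmlEncodingSniffer = require("html-encoding-sniffer");
    const fs = require("fs");

    // "./html-page.html" is a placeholder path.
    const htmlBuffer = fs.readFileSync("./html-page.html");
    const sniffedEncoding = htmlEncodingSniffer(htmlBuffer);

    // Decode the bytes using the sniffed encoding.
    const whatwgEncoding = require("whatwg-encoding");
    const htmlString = whatwgEncoding.decode(htmlBuffer, sniffedEncoding);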
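You can pass two potential options to htmlEncodingSniffer:

    const sniffedEncoding = htmlEncodingSniffer(htmlBuffer, {
      transportLayerEncodingLabel,
      defaultEncoding
    });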
These represent two possible inputs into the encoding sniffing algorithm:
transportLayerEncodingLabel is an encoding label that is obtained from the "transport layer" (probably a HTTP Content-Type header), which overrides everything but a BOM.
defaultEncoding is the ultimate fallback encoding, used if no valid encoding is supplied by the transport layer and no encoding is sniffed from the bytes. It defaults to
"windows-1252", as recommended by the algorithm's table of suggested defaults for "All other locales" (including the