
    chardet

    1.3.0 • Public • Published


    Chardet is a character encoding detection module written in pure TypeScript. The module uses occurrence analysis to determine the most probable encoding of the input.

    • Packed size is only 22 KB
    • Works in all environments: Node / Browser / Native
    • Works on all platforms: Linux / Mac / Windows
    • No dependencies
    • No native code / bindings
    • 100% written in TypeScript
    • Extensive code coverage
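
    To give a feel for what "occurrence analysis" means, here is a minimal sketch that counts how often each byte value appears in a buffer. It is an illustration only, not chardet's actual algorithm, and byteHistogram is a hypothetical name that is not part of the chardet API.

```javascript
// Illustrative only: count how often each byte value (0-255) occurs in a buffer.
// This is NOT chardet's implementation; byteHistogram is a hypothetical helper.
function byteHistogram(buffer) {
  const counts = new Array(256).fill(0);
  // Iterating a Buffer yields its bytes as numbers.
  for (const byte of buffer) {
    counts[byte] += 1;
  }
  return counts;
}

const counts = byteHistogram(Buffer.from('hello there!'));
console.log(counts[0x65]); // occurrences of the byte for 'e'
```

    A detector can compare statistics like these against what is typical for each candidate encoding and rank the candidates accordingly.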

    Installation

    npm i chardet
    

    Usage

    To return the encoding with the highest confidence:

    const chardet = require('chardet');
     
    chardet.detect(Buffer.from('hello there!'));
    // or
    chardet.detectFile('/path/to/file').then(encoding => console.log(encoding));
    // or
    chardet.detectFileSync('/path/to/file');

    To return the full list of possible encodings, use the analyse method:

    const chardet = require('chardet');
    chardet.analyse(Buffer.from('hello there!'));

    The returned value is an array of objects sorted by confidence in descending order:

    [
      { confidence: 90, name: 'UTF-8' },
      { confidence: 20, name: 'windows-1252', lang: 'fr' }
    ];
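
    Since the result is a plain array, it can be post-processed with ordinary array methods. The sketch below keeps only the matches at or above a chosen confidence threshold. It uses the sample data shown above rather than a live chardet.analyse call, and confidentMatches is a hypothetical helper, not part of chardet.

```javascript
// Hypothetical helper (not part of chardet): keep only matches whose
// confidence is at or above the given threshold.
function confidentMatches(matches, minConfidence) {
  return matches.filter((match) => match.confidence >= minConfidence);
}

// Sample data in the shape shown above; in real code this would be
// the array returned by chardet.analyse(buffer).
const matches = [
  { confidence: 90, name: 'UTF-8' },
  { confidence: 20, name: 'windows-1252', lang: 'fr' },
];

const likely = confidentMatches(matches, 50);
console.log(likely.map((match) => match.name));
```

    The threshold of 50 is an arbitrary choice for illustration; a suitable value depends on your data.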

    Working with large data sets

    When the data set is large and you want to improve performance at the cost of some accuracy, you can sample only the first N bytes of the buffer:

    chardet
      .detectFile('/path/to/file', { sampleSize: 32 })
      .then(encoding => console.log(encoding));
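
    Conceptually, sampling means the detector only sees a prefix of the data. The sketch below shows that idea with Node's Buffer.subarray. It is an assumption about the general approach, not chardet's internal code, and takeSample is a hypothetical name.

```javascript
// Hypothetical helper (not part of chardet): return a view of at most
// `sampleSize` leading bytes. subarray does not copy the underlying memory.
function takeSample(buffer, sampleSize) {
  return buffer.subarray(0, sampleSize);
}

// Build a large buffer, then keep only its first 32 bytes.
const data = Buffer.from('hello there! '.repeat(1000));
const sample = takeSample(data, 32);
console.log(sample.length); // 32
```

    The resulting sample could then be passed to chardet.detect in place of the full buffer. A very small sample gives the detector less evidence to work with, which is the accuracy tradeoff mentioned above.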

    Supported Encodings:

    • UTF-8
    • UTF-16 LE
    • UTF-16 BE
    • UTF-32 LE
    • UTF-32 BE
    • ISO-2022-JP
    • ISO-2022-KR
    • ISO-2022-CN
    • Shift_JIS
    • Big5
    • EUC-JP
    • EUC-KR
    • GB18030
    • ISO-8859-1
    • ISO-8859-2
    • ISO-8859-5
    • ISO-8859-6
    • ISO-8859-7
    • ISO-8859-8
    • ISO-8859-9
    • windows-1250
    • windows-1251
    • windows-1252
    • windows-1253
    • windows-1254
    • windows-1255
    • windows-1256
    • KOI8-R

    Currently only these encodings are supported.

    TypeScript?

    Yes. Type definitions are included.

    License

    MIT

    Unpacked Size

    207 kB

    Total Files

    33
