Nimble Polyglot Microcosm

    utfx

    0.6.0 • Public • Published

    utfx - A compact library to process, convert, encode and decode UTF8 / UTF16 in JavaScript.

    utfx operates on arbitrary sources and destinations through callbacks, and includes polyfills for String.fromCodePoint and String#codePointAt.

    Background

    Many UTF8 libraries already exist, but most (if not all) of them are tied to a specific data representation (e.g. binary strings) that does not suit every use case. To work around that, utfx leaves the low-level operations (obtaining and outputting data) to the developer.
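    To make the callback idea concrete, here is a minimal sketch of adapters that turn concrete containers into the source and destination shapes the API section describes (a function returning the next number or null, and a function called once per number). The helper names arraySource, typedArraySink and pump are hypothetical and not part of utfx.

```javascript
// Hypothetical helpers (not part of utfx): adapt concrete containers to the
// "source" and "destination" callback shapes described in the API section.

// Source: returns the next number on each call, then null when exhausted.
function arraySource(values) {
    var i = 0;
    return function() {
        return i < values.length ? values[i++] : null;
    };
}

// Destination: receives one number per call and writes it into a Uint8Array.
function typedArraySink(target) {
    var offset = 0;
    var sink = function(b) {
        if (offset >= target.length)
            throw RangeError("destination is full");
        target[offset++] = b;
    };
    sink.written = function() { return offset; };
    return sink;
}

// Moves every value from a source to a destination.
function pump(src, dst) {
    var v;
    while ((v = src()) !== null)
        dst(v);
}

var out = new Uint8Array(4);
var sink = typedArraySink(out);
pump(arraySource([0xE2, 0x82, 0xAC]), sink);
console.log(sink.written(), Array.prototype.slice.call(out));
```

    Because both ends are plain functions, the same pattern should carry over to typed arrays, streams or any other container without changes to the code in between.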

    PLEASE NOTE: Though utfx is the outsourced UTF8 component of ByteBuffer.js, it is still in its early stages and hasn't been heavily tested yet.

    API

    Class TruncatedError

    An error indicating a truncated source. Contains the remaining bytes as an array in its bytes property.

    encodeUTF8(src, dst)

    Encodes UTF8 code points to an arbitrary output destination of UTF8 bytes.

    Parameter Type Description
    src function():(number | null) | !Array.<number> | number Code points source: a function returning the next code point (or null when no code points are left), an array of code points, or a single numeric code point.
    dst function(number) | Array.<number> | undefined Bytes destination, either as a function successively called with the next byte, an array to be filled with the encoded bytes or omitted to make this function return a binary string.
    @returns undefined | string A binary string if dst has been omitted, otherwise undefined
    @throws TypeError If arguments are invalid
    @throws RangeError If a code point is invalid in UTF8
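    The contract above can be illustrated with a small standalone encoder. This is a sketch of the general UTF8 encoding technique driven by source and destination callbacks, not utfx's actual source; the name encodeSketch is hypothetical, and the range check is an assumption about what counts as an invalid code point.

```javascript
// Illustrative sketch only; not utfx's source. Pulls code points from src()
// until it returns null and pushes the resulting UTF8 bytes to dst(byte).
function encodeSketch(src, dst) {
    var cp;
    while ((cp = src()) !== null) {
        if (cp < 0 || cp > 0x10FFFF) // assumption: only the Unicode range is checked
            throw RangeError("Illegal code point: " + cp);
        if (cp < 0x80) {
            dst(cp);
        } else if (cp < 0x800) {
            dst(0xC0 | (cp >> 6));
            dst(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            dst(0xE0 | (cp >> 12));
            dst(0x80 | ((cp >> 6) & 0x3F));
            dst(0x80 | (cp & 0x3F));
        } else {
            dst(0xF0 | (cp >> 18));
            dst(0x80 | ((cp >> 12) & 0x3F));
            dst(0x80 | ((cp >> 6) & 0x3F));
            dst(0x80 | (cp & 0x3F));
        }
    }
}

var codePoints = [0x24, 0x20AC, 0x1F600]; // U+0024, U+20AC, U+1F600
var pos = 0;
var encoded = [];
encodeSketch(
    function() { return pos < codePoints.length ? codePoints[pos++] : null; },
    function(b) { encoded.push(b); }
);
console.log(encoded);
```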

    decodeUTF8(src, dst)

    Decodes an arbitrary input source of UTF8 bytes to UTF8 code points.

    Parameter Type Description
    src function():(number | null) | !Array.<number> | string Bytes source: a function returning the next byte (or null when no bytes are left), an array of bytes, or a binary string.
    dst function(number) | !Array.<number> Code points destination, either as a function successively called with each decoded code point or an array to be filled with the decoded code points.
    @throws TypeError If arguments are invalid
    @throws RangeError If a starting byte is invalid in UTF8
    @throws utfx.TruncatedError If the last sequence is truncated. Has an array property bytes holding the remaining bytes.
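    The truncation behaviour is easiest to see in code. The following standalone sketch decodes UTF8 bytes through callbacks and reports an incomplete trailing sequence through an error with a bytes property. It is not utfx's implementation: TruncatedSketchError, decodeSketch and fromArray are hypothetical names, and for brevity continuation bytes are not validated.

```javascript
// Illustrative sketch only; not utfx's source. TruncatedSketchError is a
// hypothetical stand-in for the TruncatedError described above.
function TruncatedSketchError(bytes) {
    this.name = "TruncatedSketchError";
    this.message = "Truncated sequence";
    this.bytes = bytes; // bytes of the incomplete trailing sequence
}
TruncatedSketchError.prototype = Object.create(Error.prototype);
TruncatedSketchError.prototype.constructor = TruncatedSketchError;

// Pulls bytes from src() and pushes decoded code points to dst(codePoint).
// Simplification: continuation bytes are not checked for the 10xxxxxx form.
function decodeSketch(src, dst) {
    var a, b, need, cp, seen;
    while ((a = src()) !== null) {
        if (a < 0x80) { dst(a); continue; }
        if ((a & 0xE0) === 0xC0) { need = 1; cp = a & 0x1F; }
        else if ((a & 0xF0) === 0xE0) { need = 2; cp = a & 0x0F; }
        else if ((a & 0xF8) === 0xF0) { need = 3; cp = a & 0x07; }
        else throw RangeError("Illegal starting byte: " + a);
        seen = [a];
        while (need-- > 0) {
            b = src();
            if (b === null) throw new TruncatedSketchError(seen);
            seen.push(b);
            cp = (cp << 6) | (b & 0x3F);
        }
        dst(cp);
    }
}

function fromArray(arr) {
    var i = 0;
    return function() { return i < arr.length ? arr[i++] : null; };
}

var decoded = [];
decodeSketch(fromArray([0x24, 0xE2, 0x82, 0xAC]), function(cp) { decoded.push(cp); });

// A chunk that ends in the middle of a three-byte sequence.
var remaining = null;
try {
    decodeSketch(fromArray([0xE2, 0x82]), function() {});
} catch (e) {
    if (!(e instanceof TruncatedSketchError)) throw e;
    remaining = e.bytes; // could be prepended to the next chunk
}
console.log(decoded, remaining);
```

    Carrying the leftover bytes over to the next chunk is one plausible way to decode a stream whose chunk boundaries fall inside a multi-byte sequence.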

    UTF16toUTF8(src, dst)

    Converts an arbitrary input source of UTF16 characters to an arbitrary output destination of UTF8 code points.

    Parameter Type Description
    src function():(number | null) | !Array.<number> | string Characters source: a function returning the next char code (or null when no characters are left), an array of char codes, or a standard JavaScript string.
    dst function(number) | Array.<number> Code points destination, either as a function successively called with each converted code point or an array to be filled with the converted code points.
    @throws TypeError If arguments are invalid or a char code is invalid
    @throws RangeError If a char code is out of range
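    The core of this conversion is combining surrogate pairs into single code points. A minimal standalone sketch of that step follows; utf16ToCodePointsSketch is a hypothetical name, and passing unpaired surrogates through unchanged is an assumption that may differ from what utfx does.

```javascript
// Illustrative sketch only; not utfx's source. Converts the UTF16 char codes
// of a JavaScript string into an array of code points.
function utf16ToCodePointsSketch(str) {
    var out = [], i = 0, c1, c2;
    while (i < str.length) {
        c1 = str.charCodeAt(i++);
        if (c1 >= 0xD800 && c1 <= 0xDBFF && i < str.length) {
            c2 = str.charCodeAt(i);
            if (c2 >= 0xDC00 && c2 <= 0xDFFF) {
                i++;
                // high and low surrogate combine into one supplementary code point
                out.push((c1 - 0xD800) * 0x400 + (c2 - 0xDC00) + 0x10000);
                continue;
            }
        }
        out.push(c1); // BMP char or unpaired surrogate, passed through (assumption)
    }
    return out;
}

var points = utf16ToCodePointsSketch("a\uD83D\uDE00");
console.log(points);
```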

    UTF8toUTF16(src, dst)

    Converts an arbitrary input source of UTF8 code points to an arbitrary output destination of UTF16 characters.

    Parameter Type Description
    src function():(number | null) | !Array.<number> | number Code points source: a function returning the next code point (or null when no code points are left), an array of code points, or a single numeric code point.
    dst function(number) | !Array.<number> | undefined Characters destination, either as a function successively called with each converted char code, an array to be filled with the converted char codes or omitted to make this function return a standard JavaScript string.
    @returns undefined | string A standard JavaScript string if dst has been omitted, otherwise undefined
    @throws TypeError If arguments are invalid or a code point is invalid
    @throws RangeError If a code point is out of range
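    The reverse direction splits supplementary code points into surrogate pairs. The sketch below shows that step in isolation, including the two error cases listed above; codePointsToUtf16Sketch is a hypothetical name and the exact validity checks are assumptions rather than utfx's rules.

```javascript
// Illustrative sketch only; not utfx's source. Converts an array of code
// points into a standard JavaScript (UTF16) string.
function codePointsToUtf16Sketch(codePoints) {
    var units = [];
    for (var i = 0; i < codePoints.length; i++) {
        var cp = codePoints[i];
        if (typeof cp !== "number" || isNaN(cp))
            throw TypeError("Illegal code point: " + cp);
        if (cp < 0 || cp > 0x10FFFF)
            throw RangeError("Illegal code point: " + cp);
        if (cp <= 0xFFFF) {
            units.push(cp);
        } else {
            cp -= 0x10000;
            // upper 10 bits go to the high surrogate, lower 10 bits to the low one
            units.push(0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF));
        }
    }
    return String.fromCharCode.apply(String, units);
}

var text = codePointsToUtf16Sketch([0x61, 0x1F600]);
console.log(text.length);
```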

    encodeUTF16toUTF8(src, dst)

    Converts and encodes an arbitrary input source of UTF16 characters to an arbitrary output destination of UTF8 bytes.

    Parameter Type Description
    src function():(number | null) | !Array.<number> | string Characters source: a function returning the next char code (or null when no characters are left), an array of char codes, or a standard JavaScript string.
    dst function(number) | Array.<number> | undefined Bytes destination, either as a function successively called with the next byte, an array to be filled with the encoded bytes or omitted to make this function return a binary string.
    @returns undefined | string A binary string if dst has been omitted, otherwise undefined
    @throws TypeError If arguments are invalid or a char code is invalid
    @throws RangeError If a char code is out of range
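    For readers unfamiliar with the term, a binary string is assumed here to carry its usual meaning: a JavaScript string in which every character holds one byte value (0 to 255). The following illustration uses Node.js's Buffer as an independent reference rather than utfx itself.

```javascript
// Independent reference using Node.js Buffer, not utfx.
var bytes = Buffer.from("\u20AC", "utf8");   // UTF8 bytes of U+20AC
var binary = bytes.toString("latin1");       // one character per byte

// Each char code of the binary string is one encoded byte.
var charCodes = [];
for (var i = 0; i < binary.length; i++)
    charCodes.push(binary.charCodeAt(i));

// Reading the binary string back as bytes restores the original text.
var roundTrip = Buffer.from(binary, "latin1").toString("utf8");
console.log(charCodes, roundTrip === "\u20AC");
```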

    decodeUTF8toUTF16(src, dst)

    Decodes and converts an arbitrary input source of UTF8 bytes to an arbitrary output destination of UTF16 characters.

    Parameter Type Description
    src function():(number | null) | !Array.<number> | string Bytes source: a function returning the next byte (or null when no bytes are left), an array of bytes, or a binary string.
    dst function(number) | !Array.<number> | undefined Characters destination, either as a function successively called with each converted char code, an array to be filled with the converted char codes or omitted to make this function return a standard JavaScript string.
    @returns undefined | string A standard JavaScript string if dst has been omitted, otherwise undefined
    @throws TypeError If arguments are invalid
    @throws RangeError If a starting byte is invalid in UTF8
    @throws utfx.TruncatedError If the last sequence is truncated. Has an array property bytes holding the remaining bytes.

    calculateUTF8(src)

    Calculates the number of UTF8 bytes required to store an arbitrary input source of UTF8 code points.

    Parameter Type Description
    src function():(number | null) | Array.<number> Code points source: a function returning the next code point (or null when no code points are left) or an array of code points.
    @returns number Number of UTF8 bytes required
    @throws TypeError If arguments are invalid
    @throws RangeError If a code point is invalid in UTF8
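    A length calculation like this is typically used to size a buffer before encoding. As a rough sketch of the arithmetic involved (utf8LengthSketch is a hypothetical name and not utfx's implementation), each code point contributes one to four bytes depending on its magnitude:

```javascript
// Illustrative sketch only; not utfx's source. Counts the UTF8 bytes needed
// for an array of code points.
function utf8LengthSketch(codePoints) {
    var n = 0;
    for (var i = 0; i < codePoints.length; i++) {
        var cp = codePoints[i];
        if (cp < 0 || cp > 0x10FFFF)
            throw RangeError("Illegal code point: " + cp);
        n += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
    }
    return n;
}

// One code point from each size class: 1 + 2 + 3 + 4 bytes.
var needed = utf8LengthSketch([0x24, 0xA2, 0x20AC, 0x1F600]);
var buffer = new Uint8Array(needed); // exactly large enough for the encoded bytes
console.log(needed, buffer.length);
```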

    calculateUTF16asUTF8(src)

    Calculates the number of UTF8 bytes required to store an arbitrary input source of UTF16 characters when converted to UTF8 code points.

    Parameter Type Description
    src function():(number | null) | !Array.<number> | string Characters source: a function returning the next char code (or null when no characters are left), an array of char codes, or a standard JavaScript string.
    @returns number Number of UTF8 bytes required
    @throws TypeError If arguments are invalid
    @throws RangeError If an intermediate code point is invalid in UTF8
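    When starting from a JavaScript string, surrogate pairs have to be counted as one four-byte code point rather than two three-byte ones. The sketch below shows one way to do that and cross-checks the result against Node.js's Buffer.byteLength; utf16AsUtf8LengthSketch is a hypothetical name, and counting an unpaired surrogate as three bytes is an assumption.

```javascript
// Illustrative sketch only; not utfx's source. Counts the UTF8 bytes needed
// for a JavaScript string without building the intermediate code points.
function utf16AsUtf8LengthSketch(str) {
    var n = 0;
    for (var i = 0; i < str.length; i++) {
        var c = str.charCodeAt(i);
        if (c < 0x80) {
            n += 1;
        } else if (c < 0x800) {
            n += 2;
        } else if (c >= 0xD800 && c <= 0xDBFF && i + 1 < str.length &&
                   str.charCodeAt(i + 1) >= 0xDC00 && str.charCodeAt(i + 1) <= 0xDFFF) {
            n += 4; // a surrogate pair is one supplementary code point
            i++;    // skip the low surrogate
        } else {
            n += 3; // remaining BMP chars; unpaired surrogates too (assumption)
        }
    }
    return n;
}

var sample = "a\u00E9\u20AC\uD83D\uDE00";
var counted = utf16AsUtf8LengthSketch(sample);
var reference = Buffer.byteLength(sample, "utf8"); // Node.js built-in, for comparison
console.log(counted, reference);
```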

    fromCodePoint(var_args)

    A polyfill for String.fromCodePoint.

    Parameter Type Description
    var_args ...number Code points
    @returns string Standard JavaScript string
    @throws TypeError If arguments are invalid or a code point is invalid
    @throws RangeError If a code point is out of range
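    Since this function is described as a polyfill, its behaviour can be previewed with the native String.fromCodePoint it stands in for. The snippet below uses only the native function, on the assumption that the polyfill behaves the same way.

```javascript
// Uses the native String.fromCodePoint; utfx.fromCodePoint is assumed to match.
var s = String.fromCodePoint(0x61, 0x1F600);
// The supplementary code point occupies two UTF16 char codes.
var highSurrogate = s.charCodeAt(1);
var lowSurrogate = s.charCodeAt(2);

// Values above U+10FFFF are rejected.
var rangeError = null;
try {
    String.fromCodePoint(0x110000);
} catch (e) {
    rangeError = e;
}
console.log(s.length, rangeError instanceof RangeError);
```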

    codePointAt(s, i)

    A polyfill for String.prototype.codePointAt.

    Parameter Type Description
    s string Standard JavaScript string
    i number Index
    @returns number | undefined Code point or undefined if i is out of range
    @throws TypeError If arguments are invalid
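    The indexing rules are worth spelling out, because the index counts UTF16 char codes rather than code points. The snippet below demonstrates them with the native String.prototype.codePointAt; the utfx function takes the string as its first argument instead of being called as a method, and is assumed to return the same values.

```javascript
// Uses the native String.prototype.codePointAt for illustration.
var str = "a\uD83D\uDE00";

var first = str.codePointAt(0);   // a plain BMP character
var pair = str.codePointAt(1);    // index of a high surrogate: the full code point
var lowOnly = str.codePointAt(2); // index of a low surrogate: just that char code
var beyond = str.codePointAt(3);  // past the end of the string

console.log(first, pair, lowOnly, beyond);
```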

    polyfill(override=)

    Installs utfx as a polyfill for String.fromCodePoint and String#codePointAt if not implemented.

    Parameter Type Description
    override boolean Overrides an existing implementation if true, defaults to false
    @returns !Object.<string,*> utfx namespace
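    The "install only if missing, unless overridden" rule can be sketched in a few lines. This is an illustration of the general pattern on a plain object, not utfx's code; installSketch and the fake target are hypothetical.

```javascript
// Illustrative sketch only; not utfx's source. Installs impl under name unless
// the target already has a function there and override is not set.
function installSketch(target, name, impl, override) {
    if (override || typeof target[name] !== "function")
        target[name] = impl;
    return target;
}

// A stand-in for an environment that already provides one of the two functions.
var fake = { fromCodePoint: function() { return "native"; } };
var sketchImpl = function() { return "sketch"; };

installSketch(fake, "fromCodePoint", sketchImpl); // existing implementation is kept
installSketch(fake, "codePointAt", sketchImpl);   // missing one is installed
var beforeOverride = fake.fromCodePoint();

installSketch(fake, "fromCodePoint", sketchImpl, true); // override replaces it
var afterOverride = fake.fromCodePoint();
console.log(beforeOverride, fake.codePointAt(), afterOverride);
```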

    Usage

    • node.js: npm install utfx

      var utfx = require("utfx");
      ...
    • Browser: <script src="/path/to/utfx.min.js"></script>

      var utfx = dcodeIO.utfx;
      ...
    • Require.js/AMD

      require.config({
          "paths": {
              "utfx": "/path/to/utfx.min.js"
          }
      });
      require(["utfx"], function(utfx) {
          ...
      });

    Downloads

    Examples

    License

    Apache License, Version 2.0

    Install

    npm i utfx@0.6.0

    Collaborators

    • dcode