valid-8

Pure JavaScript implementation of UTF-8 validation.

Intended as a drop-in replacement for utf-8-validate.

Most of the time and effort went into developing an extensive test suite (over 18k assertions).

Testing

Tests are run with mocha using the usual command:

npm test

Many non-obvious aspects of UTF-8 validation are tested, including:

  • UTF-16 surrogate code points
  • long sequences
  • overlong sequences
  • incomplete sequences
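
As a rough illustration of these categories, the sketch below builds one malformed byte sequence for each and checks it with Node's built-in TextDecoder in fatal mode. It does not use valid-8 and says nothing about how valid-8 works internally; `isStrictUtf8` and the `samples` table are hypothetical names introduced only for this example.

```javascript
// Illustration only: relies on Node's built-in TextDecoder, not on valid-8.

// Hypothetical helper: true when the bytes decode as strict UTF-8.
function isStrictUtf8(bytes) {
  try {
    // A fresh decoder per call keeps the calls independent of each other.
    new TextDecoder('utf-8', { fatal: true }).decode(bytes);
    return true;
  } catch (err) {
    return false;
  }
}

const samples = {
  valid:      Buffer.from('你好', 'utf8'),                  // well-formed text
  surrogate:  Buffer.from([0xED, 0xA0, 0x80]),              // encodes 0xD800
  long:       Buffer.from([0xF8, 0x88, 0x80, 0x80, 0x80]),  // 5-byte form
  overlong:   Buffer.from([0xC0, 0x80]),                    // 2 bytes for 0x00
  incomplete: Buffer.from([0xE4, 0xBD]),                    // missing last byte
};

for (const [name, bytes] of Object.entries(samples)) {
  console.log(name, isStrictUtf8(bytes));
}
```

Only the `valid` sample should pass a strict validator; the other four are the kinds of input the test suite is meant to catch.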

Testing other libraries

To test other UTF-8 validation libraries, first install them:

cd test/others
npm install
cd ../..

and then run the tests for one library, e.g.:

npm test --lib=utf-8-validate

or:

npm test --lib=is-utf8
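
For readers wondering how such a flag can reach a test script, here is a minimal sketch. It assumes the flag is read from the `npm_config_lib` environment variable, which is how npm passes command-line config flags to scripts; the mechanism actually used in this repository is not shown in this README, and `pickLibraryName` is a hypothetical name.

```javascript
// Hypothetical sketch: how a test script could read `npm test --lib=<name>`.
// Assumption: npm exposes the flag as the environment variable npm_config_lib.
function pickLibraryName(env) {
  // Fall back to the package under test when no --lib flag was given.
  return env.npm_config_lib || 'valid-8';
}

console.log('Testing library:', pickLibraryName(process.env));
```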

Speed

Validation speed is measured during the tests. So far this validator is the fastest (this is not a joke!).

  • valid-8: 300 Mb/s (pure JavaScript)
  • utf-8-validate: 260 Mb/s (C++)
  • is-utf8: 110 Mb/s (also pure JavaScript)
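
A throughput figure like the ones above can be obtained by timing repeated validation of a known buffer. The sketch below is one possible way to do it, not the measurement code used by this package: `measureThroughput` and `standInValidator` are hypothetical names, the stand-in validator uses Node's TextDecoder so the example runs without installing anything, and the result is reported in megabytes (10^6 bytes) per second, which is an assumption about the unit.

```javascript
// Hypothetical benchmark sketch; not the measurement code used by valid-8.
function measureThroughput(validate, bytes, rounds) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < rounds; i++) {
    validate(bytes);
  }
  const elapsedNs = Number(process.hrtime.bigint() - start);
  const totalMegabytes = (bytes.length * rounds) / 1e6;
  // Guard against a zero reading on very fast runs.
  return totalMegabytes / (Math.max(elapsedNs, 1) / 1e9);
}

// Stand-in validator so the sketch runs without any installed package.
function standInValidator(bytes) {
  try {
    new TextDecoder('utf-8', { fatal: true }).decode(bytes);
    return true;
  } catch (err) {
    return false;
  }
}

const sample = Buffer.from('你好,世界!'.repeat(10000), 'utf8');
const speed = measureThroughput(standInValidator, sample, 20);
console.log(speed.toFixed(1) + ' MB/s');
```

To compare libraries, pass each library's validation function as the first argument and keep the buffer and round count fixed.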

API

Validation is simple:

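const valid8 = require('valid-8')

// Buffer.from replaces the deprecated new Buffer(string) constructor
if (!valid8(Buffer.from('你好,世界!')))
{
  // handle invalid UTF-8 input here
}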

For compatibility with utf-8-validate, an alias is provided: valid8.Validation.isValidUTF8 === valid8.

By default, valid8 rejects surrogate code points (0xD800-0xDFFF) and code points above 0x10FFFF, as the UTF-8 specification requires.

To let surrogates pass validation, set valid8.surrogates = true.

To allow long sequences (say, 5 or 6 bytes), set valid8.maxBytes to 5 or 6. 7-byte sequences are always rejected. The default is valid8.maxBytes = 4; it can also be set to 1, 2 or 3. For example, set valid8.maxBytes = 2 to reject Chinese ideograms (and many other symbols), which need 3 or more bytes.
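
To make the effect of these two settings concrete, here is a minimal standalone sketch of a validator that takes similar options. It is not the valid-8 implementation: `isValidUtf8Sketch` and its `options` object are hypothetical, and the choice to lift the 0x10FFFF ceiling once `maxBytes` exceeds 4 is an assumption made so that 5- and 6-byte sequences can pass at all.

```javascript
// Hypothetical sketch, not the valid-8 source code.
// Smallest code point that legitimately needs n bytes (index = n).
const MIN_CODE_POINT = [0, 0, 0x80, 0x800, 0x10000, 0x200000, 0x4000000];

function isValidUtf8Sketch(bytes, options = {}) {
  const allowSurrogates = options.surrogates === true;
  const maxBytes = options.maxBytes || 4;
  let i = 0;
  while (i < bytes.length) {
    const lead = bytes[i];
    let n, cp;
    if (lead < 0x80) { n = 1; cp = lead; }
    else if (lead < 0xC0) return false;            // stray continuation byte
    else if (lead < 0xE0) { n = 2; cp = lead & 0x1F; }
    else if (lead < 0xF0) { n = 3; cp = lead & 0x0F; }
    else if (lead < 0xF8) { n = 4; cp = lead & 0x07; }
    else if (lead < 0xFC) { n = 5; cp = lead & 0x03; }
    else if (lead < 0xFE) { n = 6; cp = lead & 0x01; }
    else return false;                             // 0xFE and 0xFF never pass
    if (n > maxBytes) return false;                // sequence too long
    if (i + n > bytes.length) return false;        // incomplete sequence
    for (let k = 1; k < n; k++) {
      const c = bytes[i + k];
      if ((c & 0xC0) !== 0x80) return false;       // not a continuation byte
      cp = cp * 64 + (c & 0x3F);
    }
    if (cp < MIN_CODE_POINT[n]) return false;      // overlong encoding
    if (!allowSurrogates && cp >= 0xD800 && cp <= 0xDFFF) return false;
    // Assumption: the Unicode ceiling applies only while maxBytes <= 4.
    if (maxBytes <= 4 && cp > 0x10FFFF) return false;
    i += n;
  }
  return true;
}

const surrogateBytes = Buffer.from([0xED, 0xA0, 0x80]);
console.log(isValidUtf8Sketch(surrogateBytes));
console.log(isValidUtf8Sketch(surrogateBytes, { surrogates: true }));
console.log(isValidUtf8Sketch(Buffer.from('你好'), { maxBytes: 2 }));
```

With `maxBytes: 2`, any character outside the range 0x0000-0x07FF fails, which is why Chinese text is rejected while Latin text with accents still passes.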

Rivals

See also

Install

npm i valid-8

Version

1.0.1

License

ISC

Collaborators

  • ukoloff