Pure JavaScript implementation of UTF-8 validation




Pure JavaScript implementation of UTF-8 validation.

Intended as a drop-in replacement for utf-8-validate.

Most of the time and effort went into developing an extensive test suite (over 18k assertions).


Tests are run with mocha using the usual command:

npm test

Many non-obvious aspects of UTF-8 validation are tested, including:

  • UTF-16 surrogates
  • long sequences
  • overlong sequences
  • incomplete sequences
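
To make these categories concrete, here is a small sketch with one hand-built byte sequence per case. It does not use valid-8 itself: Node's built-in TextDecoder in fatal mode stands in as a strict validator (an assumption made for illustration only), and the names samples and isStrictUtf8 are hypothetical.

```javascript
// One hand-built byte sequence per category (illustrative, not taken from the test suite).
const samples = {
  surrogate:  Uint8Array.from([0xED, 0xA0, 0x80]),             // would encode U+D800
  longSeq:    Uint8Array.from([0xF8, 0x88, 0x80, 0x80, 0x80]), // 5-byte form of U+200000
  overlong:   Uint8Array.from([0xC0, 0xAF]),                   // '/' (U+002F) stretched to 2 bytes
  incomplete: Uint8Array.from([0xE4, 0xBD]),                   // first 2 of the 3 bytes of '你'
  valid:      Uint8Array.from([0xE4, 0xBD, 0xA0]),             // '你' (U+4F60)
};

// Stand-in strict validator (assumption): TextDecoder throws on malformed input
// when created with { fatal: true }.
function isStrictUtf8(bytes) {
  try {
    new TextDecoder('utf-8', { fatal: true }).decode(bytes);
    return true;
  } catch (e) {
    return false;
  }
}

for (const [name, bytes] of Object.entries(samples)) {
  console.log(name, isStrictUtf8(bytes));
}
```

Only the last sample is well-formed under the strict rules; the other four are the kinds of input the test suite is described as covering.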

Testing other libraries

To test other UTF-8 validation libraries, first install them:

cd test/others
npm install
cd ../..

and then run the tests for a single library, e.g.:

npm test --lib=utf-8-validate


npm test --lib=is-utf8


Validation speed is measured during the tests. So far this validator is the fastest (this is not a joke!).

  • valid-8: 300 Mb/s (pure JavaScript)
  • utf-8-validate: 260 Mb/s (C++)
  • is-utf8: 110 Mb/s (also pure JavaScript)
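
For readers who want to reproduce this kind of measurement, the following is a rough sketch of a throughput timer. It is not the project's benchmark code: measureThroughput and standInValidate are hypothetical names, the validator is a TextDecoder-based stand-in, and reporting in units of 10^6 bytes per second is an assumption.

```javascript
// Stand-in validator (assumption): Node's TextDecoder in fatal mode.
// Swap in any function that takes a Buffer and returns a boolean.
const decoder = new TextDecoder('utf-8', { fatal: true });
function standInValidate(bytes) {
  try {
    decoder.decode(bytes);
    return true;
  } catch (e) {
    return false;
  }
}

// Runs `validate` over the same buffer `iterations` times and returns
// the throughput in megabytes (10^6 bytes) per second.
function measureThroughput(validate, bytes, iterations = 50) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) {
    validate(bytes);
  }
  const seconds = Number(process.hrtime.bigint() - start) / 1e9;
  return (bytes.length * iterations) / 1e6 / seconds;
}

const sample = Buffer.from('你好,世界! hello, world! '.repeat(10000));
console.log(measureThroughput(standInValidate, sample).toFixed(0) + ' MB/s');
```

Results from a loop like this depend heavily on the machine, the Node version, and the input mix, so they are only comparable between validators measured in the same run.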


Validation is simple:

const valid8 = require('valid-8');

// Buffer.from replaces the deprecated `new Buffer(string)` constructor
if (!valid8(Buffer.from('你好,世界!'))) {
  // ...
}

For compatibility with utf-8-validate, an alias is provided: valid8.Validation.isValidUTF8 === valid8.

By default, valid8 rejects UTF-16 surrogates (0xD800-0xDFFF) and code points above 0x10FFFF, as required by the UTF-8 specification.

To let surrogates pass validation, set valid8.surrogates = true.

To allow long sequences (5 or 6 bytes), set valid8.maxBytes to 5 or 6. 7-byte sequences are always rejected. The default is valid8.maxBytes = 4; it can also be set to 1, 2 or 3. For example, set valid8.maxBytes = 2 to reject Chinese ideograms (and many other symbols).
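
The effect of these two settings can be illustrated with a minimal, self-contained validator. This is a sketch written for this explanation, not valid-8's actual implementation: the function isValidUtf8 and its options object are hypothetical, and it assumes that the 0x10FFFF cap applies only to sequences of up to 4 bytes (longer sequences necessarily encode larger values).

```javascript
// Hypothetical sketch: `surrogates` and `maxBytes` mirror the settings described above.
function isValidUtf8(bytes, { surrogates = false, maxBytes = 4 } = {}) {
  // Smallest code point that genuinely needs a sequence of each length;
  // anything below it is an overlong encoding.
  const minCodePoint = [0, 0, 0x80, 0x800, 0x10000, 0x200000, 0x4000000];
  let i = 0;
  while (i < bytes.length) {
    const lead = bytes[i];
    let len;
    let cp;
    if (lead < 0x80) { len = 1; cp = lead; }
    else if (lead < 0xC0) { return false; }            // stray continuation byte
    else if (lead < 0xE0) { len = 2; cp = lead & 0x1F; }
    else if (lead < 0xF0) { len = 3; cp = lead & 0x0F; }
    else if (lead < 0xF8) { len = 4; cp = lead & 0x07; }
    else if (lead < 0xFC) { len = 5; cp = lead & 0x03; }
    else if (lead < 0xFE) { len = 6; cp = lead & 0x01; }
    else { return false; }                             // 0xFE/0xFF: always rejected

    if (len > maxBytes) return false;                  // sequence longer than allowed
    if (i + len > bytes.length) return false;          // incomplete sequence

    for (let k = 1; k < len; k++) {
      const b = bytes[i + k];
      if ((b & 0xC0) !== 0x80) return false;           // not a continuation byte
      cp = cp * 64 + (b & 0x3F);                       // multiply to stay clear of 32-bit overflow
    }

    if (cp < minCodePoint[len]) return false;          // overlong encoding
    if (!surrogates && cp >= 0xD800 && cp <= 0xDFFF) return false;
    if (len <= 4 && cp > 0x10FFFF) return false;       // assumption: cap applies to <= 4 bytes
    i += len;
  }
  return true;
}

const hello = Buffer.from('你好');
console.log(isValidUtf8(hello));                    // true
console.log(isValidUtf8(hello, { maxBytes: 2 }));   // false: ideograms need 3 bytes

const surrogate = Uint8Array.from([0xED, 0xA0, 0x80]);
console.log(isValidUtf8(surrogate));                        // false
console.log(isValidUtf8(surrogate, { surrogates: true }));  // true
```

Passing the settings as an options argument keeps the sketch free of shared state; the library itself is described as using properties on the exported function instead.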


See also