split-graphemes

Divides a string into graphemes.

Usage (no npm install needed!)

<script type="module">
  import splitGraphemes from 'https://cdn.skypack.dev/split-graphemes';
</script>

Splits text containing ligature letters (such as Thai and Khmer letters) and complex emoji into an array of graphemes. Array.from splits a string by code point, so it breaks these sequences apart; use this library in its place when you need graphemes.

Installation

$ npm install split-graphemes

Examples

Emoji

// The emoji '👨‍👩‍👦‍👦' consists of four person emoji joined by Zero Width Joiners (ZWJ).
const codePoints = Array.from('👨‍👩‍👦‍👦') // ['👨', ZWJ, '👩', ZWJ, '👦', ZWJ, '👦']
// splitGraphemes interprets the whole sequence as exactly one character.
const graphemes = splitGraphemes('👨‍👩‍👦‍👦') // ['👨‍👩‍👦‍👦']
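To make the seven-element result from Array.from concrete, the following sketch builds the same family emoji from escape sequences and inspects it using only built-in string methods. It does not call split-graphemes; the names ZWJ, family and toHex are chosen for this illustration only.

```javascript
// The family emoji, written with escapes so each code point stays visible.
const ZWJ = '\u200D'; // Zero Width Joiner
const family = ['\u{1F468}', '\u{1F469}', '\u{1F466}', '\u{1F466}'].join(ZWJ);

// Array.from iterates by code point: 4 emoji + 3 joiners.
const codePoints = Array.from(family);
console.log(codePoints.length); // 7

// .length counts UTF-16 code units: each emoji is a surrogate pair (2 units).
console.log(family.length); // 11

// Hypothetical display helper: format a single code point as U+XXXX.
const toHex = (ch) => 'U+' + ch.codePointAt(0).toString(16).toUpperCase();
console.log(codePoints.map(toHex).join(' '));
// U+1F468 U+200D U+1F469 U+200D U+1F466 U+200D U+1F466
```

Neither count matches what a reader perceives as one character, which is the gap a grapheme splitter fills.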

Khmer characters

Array.from('ប៉ុស្ដិ៍') // ['ប', '៉', 'ុ', 'ស', '្', 'ដ', 'ិ', '៍']
splitGraphemes('ប៉ុស្ដិ៍') // ['ប៉ុ', 'ស្ដិ៍']

Japanese NFD

splitGraphemes('ごん゙に゙ぢば') // ['ご', 'ん゙', 'に゙', 'ぢ', 'ば']
splitGraphemes('パピプペポ') // ['パ', 'ピ', 'プ', 'ペ', 'ポ']
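NFD here means the decomposed Unicode normalization form, in which a voiced kana is stored as a base letter followed by a combining mark. The sketch below, which uses only the built-in String.prototype.normalize and not this library, shows why such text has more code points than visible characters. The variable names are illustrative.

```javascript
// Precomposed (NFC) hiragana "go" is a single code point, U+3054.
const nfc = '\u3054';

// NFD decomposes it into base U+3053 plus combining dakuten U+3099.
const nfd = nfc.normalize('NFD');

console.log(Array.from(nfc).length); // 1
console.log(Array.from(nfd).length); // 2
console.log(nfd === '\u3053\u3099'); // true

// Recomposing returns the original single code point.
console.log(nfd.normalize('NFC') === nfc); // true
```

Text copied from some file systems and input methods arrives in this decomposed form, so splitting by code point separates each mark from its base letter.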

English

splitGraphemes('Hello') // ['H', 'e', 'l', 'l', 'o']
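As a point of comparison, recent JavaScript runtimes ship a built-in grapheme segmenter, Intl.Segmenter. The sketch below is not this library's implementation, and its output for some scripts may differ from what splitGraphemes returns; segmentGraphemes is a hypothetical helper name used only here.

```javascript
// Built-in grapheme segmentation (a separate mechanism from split-graphemes).
const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });

// Hypothetical helper: collect each grapheme segment into an array.
const segmentGraphemes = (text) =>
  Array.from(segmenter.segment(text), (s) => s.segment);

console.log(segmentGraphemes('Hello')); // ['H', 'e', 'l', 'l', 'o']

// Family emoji built from escapes: four emoji joined by ZWJ (U+200D).
const familyEmoji = '\u{1F468}\u200D\u{1F469}\u200D\u{1F466}\u200D\u{1F466}';
console.log(segmentGraphemes(familyEmoji).length); // 1

// Two kana, each followed by a combining dakuten (NFD form).
console.log(segmentGraphemes('\u3053\u3099\u3093\u3099').length); // 2
```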

Supported ligature characters

The list of supported characters is available here.