JS Library to sort Tibetan
State of the art
The most logical way to sort Tibetan would be to use Intl.Collator. The problem is that all browsers seem to use ICU to implement this object, and ICU has a bug in Tibetan collation that won't be fixed in the short term. It will take even longer for the fix to reach mainstream browsers, so it is not even a medium-term solution. Bugs have been filed for Firefox, ChakraCore, Chrome and Safari.
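For reference, the Intl.Collator approach mentioned above would look roughly like the sketch below. The helper name makeIntlComparator is hypothetical, and 'bo' is the BCP 47 language tag for Tibetan. The resulting order depends on the ICU data shipped with the runtime, so it should not be relied on for Tibetan for the reasons given above.

```javascript
// Hypothetical helper: builds a comparator backed by the built-in collator.
// If the runtime has no Tibetan collation data, it silently falls back to
// its default locale.
function makeIntlComparator(locale) {
  const collator = new Intl.Collator(locale);
  return (a, b) => collator.compare(a, b);
}

// 'bo' is the language tag for Tibetan.
const intlCompare = makeIntlComparator('bo');

// The comparator can be passed to Array.prototype.sort(); the resulting
// order is whatever the runtime's ICU version produces.
const intlSorted = ['ཁ', 'ཀ', 'ག'].sort(intlCompare);
```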
The only library we found that would be of possible use is lasca, but it proved very buggy and extremely inefficient.
This implementation aims to be very efficient, at the cost of not handling some difficult corner cases in Tibetan. As a consequence:
- it does not normalize strings (\u0F77 is not treated like its decomposed sequence)
- it does not handle Sanskrit stacks very precisely (the ICU rule &ཀར<ཀརྐ is too difficult to handle)
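Because the library does not normalize its input, callers who need this can normalize strings themselves before comparing. The sketch below is one possible approach and not part of this library. It assumes that Unicode compatibility decomposition (NFKD) is acceptable for your data, and normalizeForSort is a hypothetical helper name. NFKD also rewrites other compatibility characters, so check that this suits your text, and whether the comparator handles the decomposed sequences as you expect, before relying on it.

```javascript
// Hypothetical helper: decomposes characters such as \u0F77 into their
// component code points using the standard String.prototype.normalize().
function normalizeForSort(str) {
  return str.normalize('NFKD');
}

const precomposed = '\u0F77';
const decomposed = normalizeForSort(precomposed);

// Without normalization the two forms are different strings, so any
// comparator that works code point by code point sees them as different.
const sameBefore = precomposed === decomposed;

// After normalizing both sides, they are identical.
const sameAfter = normalizeForSort(precomposed) === normalizeForSort(decomposed);
```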
yarn add tibetan-sort-js --save
Compares two strings in Tibetan Unicode; it can be passed as the argument of Array.prototype.sort(). The behavior is undefined if the arguments are not strings. It does not work well with non-Tibetan strings.
Returns a number: 0 if a and b are equivalent, 1 if a > b, -1 if a < b.
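A minimal sketch of how a comparator with this contract plugs into Array.prototype.sort(). To keep the example self-contained, placeholderCompare is a hypothetical stand-in that only compares UTF-16 code units; in real use you would pass this library's compare instead. sortStrings is also a hypothetical helper. It drops non-string items first, since the behavior on non-strings is undefined.

```javascript
// Stand-in comparator with the same contract (returns -1, 0 or 1).
// It compares by UTF-16 code unit, which is NOT correct Tibetan collation;
// replace it with the library's compare in real code.
function placeholderCompare(a, b) {
  if (a === b) return 0;
  return a < b ? -1 : 1;
}

// Hypothetical helper: keeps only strings, then returns a sorted copy
// so the caller's array is left untouched.
function sortStrings(items, compare) {
  const strings = items.filter((item) => typeof item === 'string');
  return strings.slice().sort(compare);
}

const input = ['ཁ', 'ཀ', 42, 'ག'];
const sortedLetters = sortStrings(input, placeholderCompare);
```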
Compares two strings in EWTS; it takes the same arguments and returns the same values as compare. The function only works on customary EWTS and does not handle oddly encoded cases such as b.r+g+ya in place of the customary spelling.
- add an option to normalize strings?
See change log.
The code is Copyright 2017-2019 Buddhist Digital Resource Center, and is provided under the MIT License.