utf8-binary-cutter
A small Node.js library to truncate UTF-8 strings to a given binary size. Useful when dealing with older systems that handle UTF-8 as ASCII/Latin-1, e.g. MySQL or Oracle databases.
Usage
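- Works on regular JavaScript strings and measures their size once encoded as UTF-8 (JavaScript strings are UTF-16 in memory; they only become UTF-8 bytes when written out, e.g. to a database)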
var Cutter = require('utf8-binary-cutter');
getBinarySize()
Returns the binary size of the given string, i.e. its length in bytes :
var utf8String = 'abc☃☃☃'; // abc then 3 times the UTF-8 « snowman » char which takes 3 bytes
console.log(Cutter.getBinarySize(utf8String)); // 12 = 1 + 1 + 1 + 3 + 3 + 3
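As a rough illustration of what "binary size" means here, the sketch below measures a string's UTF-8 byte length with Node's built-in `Buffer.byteLength`. The helper name `binarySizeOf` is hypothetical, and this is not necessarily how the library computes the size internally.

```javascript
// Hypothetical helper (not part of utf8-binary-cutter): counts the bytes
// a string occupies once encoded as UTF-8.
function binarySizeOf(str) {
  return Buffer.byteLength(str, 'utf8');
}

console.log('abc☃☃☃'.length);        // 6  (UTF-16 code units)
console.log(binarySizeOf('abc☃☃☃')); // 12 (UTF-8 bytes)
```

The gap between the two numbers is what causes trouble when a column is sized in bytes but the application counts characters.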
truncateToBinarySize()
Truncates the string so that its final binary size is lower than or equal to the given limit :
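var utf8String = 'abc☃☃☃'; // abc then 3 times the UTF-8 « snowman » char which takes 3 bytes
console.log(Cutter.truncateToBinarySize(utf8String, 20)); // 'abc☃☃☃' -> no change
console.log(Cutter.truncateToBinarySize(utf8String, 12)); // 'abc☃☃☃' -> no change
console.log(Cutter.truncateToBinarySize(utf8String, 11)); // 'abc☃...' -> to avoid cutting utf8 chars,
                                                          // the two last snowmen had to be removed. Final size = 9 bytes
console.log(Cutter.truncateToBinarySize(utf8String, 10)); // 'abc☃...' -> idem
console.log(Cutter.truncateToBinarySize(utf8String, 9));  // 'abc☃...' -> idem
console.log(Cutter.truncateToBinarySize(utf8String, 8));  // 'abc...'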
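To make the "never cut a UTF-8 char" behaviour concrete, here is a minimal sketch of one way such a truncation could work. The function `truncateUtf8` is hypothetical and is not the library's actual implementation; it assumes a 3-byte `...` suffix, as suggested by the outputs shown above.

```javascript
// Hypothetical helper (not the library's actual code): truncates `str` so
// that its UTF-8 encoding fits in `maxBytes`, never splitting a code point.
// Assumption: a 3-byte '...' suffix marks the truncation.
function truncateUtf8(str, maxBytes) {
  const ELLIPSIS = '...';
  if (Buffer.byteLength(str, 'utf8') <= maxBytes) {
    return str; // already fits, no change
  }
  const budget = maxBytes - Buffer.byteLength(ELLIPSIS, 'utf8');
  if (budget < 0) {
    throw new RangeError('maxBytes is too small to hold the ellipsis');
  }
  let kept = '';
  let used = 0;
  // for...of iterates over whole code points, so surrogate pairs stay together.
  for (const ch of str) {
    const size = Buffer.byteLength(ch, 'utf8');
    if (used + size > budget) break;
    kept += ch;
    used += size;
  }
  return kept + ELLIPSIS;
}

console.log(truncateUtf8('abc☃☃☃', 12)); // 'abc☃☃☃'
console.log(truncateUtf8('abc☃☃☃', 11)); // 'abc☃...'
console.log(truncateUtf8('abc☃☃☃', 8));  // 'abc...'
```

Note that this sketch works at the code point level, so a base letter followed by a combining mark could still be separated.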
truncateFieldsToBinarySize()
Performs multiple truncations at the same time :
- NOTE : returns a new object.
- NOTE : iterates only on own properties.
- NOTE : only truncated strings are copied, other members are shared with the original object.
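var maxBinarySizes = {
  title: 40,
  content: 200
};

console.log(Cutter.truncateFieldsToBinarySize({
    title: '☃☃☃ A véry véry long title with UTF-8 ☃☃☃',
    content: 'I ❤ utf8-binary-cutter !',
    foo: 42
  },
  maxBinarySizes
));
// --> {
//   title: '☃☃☃ A véry véry long title wi...',
//   content: 'I ❤ utf8-binary-cutter !',
//   foo: 42
// }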
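The three notes above (new object, own properties only, untouched members shared) can be sketched as follows. Both `truncateUtf8` and `truncateFields` are hypothetical names used for illustration, and the code only approximates the documented behaviour under the assumption of a `...` suffix and limits of at least 3 bytes.

```javascript
// Hypothetical stand-in for a byte-bounded truncation.
// Assumes a '...' suffix and maxBytes >= 3.
function truncateUtf8(str, maxBytes) {
  if (Buffer.byteLength(str, 'utf8') <= maxBytes) return str;
  let kept = '';
  for (const ch of str) {
    if (Buffer.byteLength(kept + ch, 'utf8') > maxBytes - 3) break;
    kept += ch;
  }
  return kept + '...';
}

// Sketch of the per-field behaviour: returns a new object, visits own
// enumerable properties only, and leaves fields without a limit untouched.
function truncateFields(obj, maxBinarySizes) {
  const result = {};
  for (const key of Object.keys(obj)) {
    const value = obj[key];
    const limit = maxBinarySizes[key];
    result[key] = (typeof value === 'string' && typeof limit === 'number')
      ? truncateUtf8(value, limit)
      : value; // non-string or unlimited members are shared, not cloned
  }
  return result;
}

const input = {
  title: '☃☃☃ A véry véry long title with UTF-8 ☃☃☃',
  content: 'I ❤ utf8-binary-cutter !',
  foo: 42
};
const output = truncateFields(input, { title: 40, content: 200 });
console.log(output.title);
console.log(output === input); // false: a new object is returned
```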
truncateToCharLength()
A regular truncation is also provided for convenience. It truncates the string so that its final character length is lower than or equal to the given limit :
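var utf8String = 'abc☃☃☃'; // 6 chars
console.log(Cutter.truncateToCharLength(utf8String, 10)); // 'abc☃☃☃' -> no change
console.log(Cutter.truncateToCharLength(utf8String, 6));  // 'abc☃☃☃' -> no change
console.log(Cutter.truncateToCharLength(utf8String, 5));  // 'ab...' -> 5 chars, ok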
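For comparison with the byte-based variant, a character-based truncation might look like the sketch below. `truncateChars` is a hypothetical name, and the choice to count Unicode code points (rather than UTF-16 code units) is an assumption, not something the library documents.

```javascript
// Hypothetical helper (not the library's code): truncates by character count.
// Assumptions: a "char" is one Unicode code point, and '...' counts as 3 chars.
function truncateChars(str, maxChars) {
  const chars = Array.from(str); // splits into code points, not UTF-16 units
  if (chars.length <= maxChars) return str;
  return chars.slice(0, Math.max(0, maxChars - 3)).join('') + '...';
}

const s = 'abc☃☃☃';
console.log(Array.from(s).length);         // 6 chars
console.log(Buffer.byteLength(s, 'utf8')); // 12 bytes
console.log(truncateChars(s, 5));          // 'ab...'
```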
- optional callback when truncating (useful for logging) :
truncateToBinarySize(foo, 42, function(maxBinarySize, originalString, truncatedString) {
  logger.warn(...);
});
truncateToCharLength(foo, 42, function(maxCharLength, originalString, truncatedString) {
  logger.warn(...);
});
Cutter.truncateFieldsToBinarySize({
    title: '☃☃☃ A véry véry long title with UTF-8 ☃☃☃',
    content: 'I ❤ utf8-binary-cutter !',
    foo: 42
  },
  // maxBinarySizes
  {
    title: 40,
    content: 200
  },
  // callback
  // will be called for each member truncated.
  // 4th param : the key of the member being truncated.
  function(maxBinarySize, originalString, truncatedString, key) {
    logger.warn(...);
  }
);
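The callback contract described above (invoked only when something was actually truncated) can be sketched like this. `truncateWithCallback` and `log` are hypothetical names, and the truncation itself is a simplified character-based one used purely for illustration.

```javascript
// Hypothetical wrapper (names are illustrative): calls `onTruncate` only
// when the string was actually shortened, using the callback signature
// (limit, originalString, truncatedString).
function truncateWithCallback(str, maxChars, onTruncate) {
  const chars = Array.from(str);
  if (chars.length <= maxChars) return str; // nothing truncated, no callback
  const truncated = chars.slice(0, Math.max(0, maxChars - 3)).join('') + '...';
  if (typeof onTruncate === 'function') {
    onTruncate(maxChars, str, truncated);
  }
  return truncated;
}

// Collect warnings in an array instead of a real logger.
const warnings = [];
const log = (limit, original, truncated) =>
  warnings.push(`"${original}" cut to ${limit} chars: "${truncated}"`);

truncateWithCallback('short', 42, log);                // fits, no warning
truncateWithCallback('a rather long string', 10, log); // truncated, one warning
console.log(warnings.length); // 1
console.log(warnings[0]);
```

Keeping the callback optional lets callers opt into logging without changing the return value of the truncation.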
Contributing
- clone repo
- ensure your editor is decent and picks up the .editorconfig and .jshintrc files
- npm install
- npm test
- add tests, add features, send a PR
Thanks !