String Lerp
This is a library to linearly interpolate between two string values, that is, progressively turn one string into another. It implements several ways to do this, and can automatically pick the best one based on the strings given to it.
For example, it knows:
- "explore" is between "implore" and "explode".
- Or conversely, "implode" between "explode" and "implore".
- "chicken wing" and "buffalo wing" are actually the same "wing".
- Likewise, "apple core" and "core dump" can keep the "core".
- 1/3 of the way from "0%" to "100%" is "33%".
- Halfway between rgb(255, 0, 0) and rgb(0, 0, 255) is rgb(128, 0, 128). (Well, it's good at strings, not color science.)
To try more, open up demo.html in your browser.
Setting Up
Include the following in your HTML:
<script type="text/javascript" src="string-lerp.js"></script>
Then, you can use window.stringLerp (or just stringLerp).
Or if you're using Node or some other non-browser whatever,
var stringLerp = require("string-lerp");
If you want a minified version, at a shell run
$ make ugly
If you think something's wrong, or make changes, run
$ make check
(Running these will also download modules from NPM.)
API
stringLerp.lerp(source, target, amount)
Interpolate between strings as best as possible.
This automatically picks the best interpolator based on the strings provided. If the strings are identical aside from numbers in them, they are passed through numericLerp.
If the strings are not numeric and are short, they are passed through diffLerp.
Otherwise, they are passed through fastLerp.
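The selection rules above can be sketched roughly as follows. This is only an illustration, not the library's code: pickStrategy and MAX_DIFF_LENGTH are hypothetical names, and the cutoff for "short" is an assumed value, since the text above does not give one.

```javascript
// Hypothetical cutoff for "short"; the real threshold is not documented here.
const MAX_DIFF_LENGTH = 256;

// Optional "-", digits, and optionally "." followed by at least one digit,
// following the number syntax described for numericLerp.
const NUMBER = /-?\d+(?:\.\d+)?/g;

// Return the name of the interpolator the rules above would choose.
function pickStrategy(source, target) {
  // Splitting on numbers leaves only the non-numeric pieces of each string.
  const sourceParts = source.split(NUMBER);
  const targetParts = target.split(NUMBER);
  const sameApartFromNumbers =
    sourceParts.length > 1 && // at least one number was found
    sourceParts.length === targetParts.length &&
    sourceParts.every((part, i) => part === targetParts[i]);
  if (sameApartFromNumbers) return "numericLerp";
  if (Math.max(source.length, target.length) <= MAX_DIFF_LENGTH) return "diffLerp";
  return "fastLerp";
}

console.log(pickStrategy("0%", "100%"));          // numericLerp
console.log(pickStrategy("implore", "explode"));  // diffLerp
```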
stringLerp.numericLerp(source, target, amount)
Interpolate numerically between strings containing numbers.
Numbers may have a leading "-" and a single "." to mark the decimal point, but at least one digit must follow the ".". No other floating point syntax (e.g. 1e6) is supported. They are treated as fixed-point values, with the point's position itself interpolating.
For example, numericLerp("0.0", "100.0", 0.123) === "12.3" because the . in 0.0 is interpreted as a decimal point.
But numericLerp("0.", "100.", 0.123) === "12." because the strings are interpreted as integers followed by a full stop.
Calling this function on strings that differ in more than numerals gives undefined results.
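To make the fixed-point behaviour concrete, here is a minimal sketch of the idea. It is not the library's implementation: numericLerpSketch and decimals are hypothetical names, and rounding through toFixed is an assumption that happens to reproduce the examples in this README.

```javascript
// Optional "-", digits, and optionally "." followed by at least one digit.
const NUMBER = /-?\d+(?:\.\d+)?/g;

// Count the digits after the decimal point ("12.30" -> 2, "12" -> 0).
function decimals(text) {
  const dot = text.indexOf(".");
  return dot === -1 ? 0 : text.length - dot - 1;
}

// Replace each number in source with a value part-way to the matching
// number in target. Assumes the strings differ only in their numbers.
function numericLerpSketch(source, target, amount) {
  const targetNumbers = target.match(NUMBER) || [];
  let index = 0;
  return source.replace(NUMBER, (from) => {
    const to = targetNumbers[index++];
    if (to === undefined) return from; // no counterpart: leave untouched
    const a = parseFloat(from);
    const b = parseFloat(to);
    // The number of decimal places is interpolated as well.
    const places = Math.round(
      decimals(from) + (decimals(to) - decimals(from)) * amount
    );
    return (a + (b - a) * amount).toFixed(places);
  });
}

console.log(numericLerpSketch("0.0", "100.0", 0.123)); // 12.3
console.log(numericLerpSketch("0.", "100.", 0.123));   // 12.
console.log(numericLerpSketch("0%", "100%", 1 / 3));   // 33%
```

In the second call the pattern only matches the digits, so the trailing full stop is carried over as ordinary text.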
stringLerp.diffLerp(source, target, amount)
Interpolate between two strings using edit operations.
This interpolation algorithm applies a partial edit of one string into the other. This produces nice-looking results, but can take a significant amount of time and memory to compute the edits. It is not recommended for strings longer than a few hundred characters.
stringLerp.fastLerp(source, target, amount)
Interpolate from source to target based on length.
This interpolation algorithm progressively replaces the front of one string with another. This approach is fast but does not look good when the strings are similar.
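One plausible reading of "replacing the front" is sketched below. This is an assumption about the general approach, not the library's actual code, and fastLerpSketch is a hypothetical name: the result takes its first part from target and the rest from the end of source, with the overall length interpolated too.

```javascript
// Build a string whose front comes from target and whose back comes from
// source. Array.from splits by code point, so surrogate pairs stay whole.
function fastLerpSketch(source, target, amount) {
  const a = Array.from(source);
  const b = Array.from(target);
  // Interpolated overall length, and how much of target's front to use.
  const length = Math.round(a.length + (b.length - a.length) * amount);
  const cut = Math.round(b.length * amount);
  // Fill the remaining length from the end of source.
  const keep = Math.max(0, length - cut);
  const head = b.slice(0, cut);
  const tail = a.slice(a.length - keep);
  return head.concat(tail).join("");
}

console.log(fastLerpSketch("aaaa", "bbbb", 0));   // aaaa
console.log(fastLerpSketch("aaaa", "bbbb", 0.5)); // bbaa
console.log(fastLerpSketch("aaaa", "bbbb", 1));   // bbbb
```

Each call does work proportional to the string lengths, with no comparison between the two strings, which is why similar strings get no special treatment.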
stringLerp.diff(source, target) and stringLerp.patch(diff, source)
These are the functions used to implement diffLerp. diff calculates an array of edit operations - substitutions, insertions, and deletions - to turn source into target.
The type of the edit operations is unspecified. What is guaranteed is:
- There's an Array of them.
- The array can be cut up and applied in-order but piecemeal.
- They are simple objects, i.e. can be (de)serialized via JSON and fed into the same version of patch later.
Do not rely on edit operations to be exactly the same type (or same operations) between versions / installations.
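To illustrate what "cut up and applied in-order but piecemeal" can look like, here is a self-contained sketch built on a plain Levenshtein table. The names diffSketch and patchSketch and the { op, pos, ch } shape are made up for this example; as noted above, the library's real edit type is unspecified and may look nothing like this.

```javascript
// Compute edits turning source into target. Each edit's position refers
// to the string as it looks after all earlier edits have been applied.
function diffSketch(source, target) {
  const a = Array.from(source);
  const b = Array.from(target);
  const cols = b.length + 1;
  // cost[i * cols + j] = edits needed to turn a[0..i) into b[0..j)
  const cost = new Uint32Array((a.length + 1) * cols);
  for (let i = 0; i <= a.length; i++) cost[i * cols] = i;
  for (let j = 0; j <= b.length; j++) cost[j] = j;
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      const change = a[i - 1] === b[j - 1] ? 0 : 1;
      cost[i * cols + j] = Math.min(
        cost[(i - 1) * cols + (j - 1)] + change, // keep or substitute
        cost[(i - 1) * cols + j] + 1,            // delete
        cost[i * cols + (j - 1)] + 1             // insert
      );
    }
  }
  // Walk back from the bottom-right corner, collecting steps in reverse.
  const steps = [];
  let i = a.length;
  let j = b.length;
  while (i > 0 || j > 0) {
    const here = cost[i * cols + j];
    const change = i > 0 && j > 0 && a[i - 1] !== b[j - 1] ? 1 : 0;
    if (i > 0 && j > 0 && here === cost[(i - 1) * cols + (j - 1)] + change) {
      steps.push(change ? ["sub", b[j - 1]] : ["keep"]);
      i--; j--;
    } else if (i > 0 && here === cost[(i - 1) * cols + j] + 1) {
      steps.push(["del"]);
      i--;
    } else {
      steps.push(["ins", b[j - 1]]);
      j--;
    }
  }
  steps.reverse();
  // Turn the left-to-right steps into positioned edits.
  const edits = [];
  let pos = 0;
  for (const [op, ch] of steps) {
    if (op !== "keep") edits.push(op === "del" ? { op, pos } : { op, pos, ch });
    if (op !== "del") pos++;
  }
  return edits;
}

// Apply edits, or any leading slice of them, to source.
function patchSketch(edits, source) {
  const chars = Array.from(source);
  for (const { op, pos, ch } of edits) {
    if (op === "sub") chars[pos] = ch;
    else if (op === "ins") chars.splice(pos, 0, ch);
    else chars.splice(pos, 1);
  }
  return chars.join("");
}

const edits = diffSketch("implore", "explode");
console.log(patchSketch(edits.slice(0, 2), "implore")); // explore
console.log(patchSketch(edits, "implore"));             // explode
```

Applying only the first part of the array is what gives an intermediate string, which is the idea behind diffLerp.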
stringLerp.costMatrix(source, target, ins, del, sub)
Calculate the edit distance between the source and target sequences, and return a matrix representing the possible edits and their costs. The matrix returned is a flat typed array.
Because stringLerp needs to be able to reconstruct the edit path, this is not an optimal algorithm if you only need the Levenshtein distance.
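As a rough picture of such a matrix, the sketch below fills a flat typed array with cumulative edit costs. It makes several assumptions that this README does not confirm: that ins, del, and sub are non-negative integer costs, that the layout is row-major, and that a Uint32Array is a suitable element type. costMatrixSketch is a hypothetical name.

```javascript
// Cell (i, j), stored at index i * cols + j, holds the cheapest cost of
// turning the first i items of source into the first j items of target.
// Works on strings or arrays, since both support indexing and length.
function costMatrixSketch(source, target, ins, del, sub) {
  const rows = source.length + 1;
  const cols = target.length + 1;
  const matrix = new Uint32Array(rows * cols);
  for (let i = 1; i < rows; i++) matrix[i * cols] = i * del; // delete all
  for (let j = 1; j < cols; j++) matrix[j] = j * ins;        // insert all
  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const same = source[i - 1] === target[j - 1];
      matrix[i * cols + j] = Math.min(
        matrix[(i - 1) * cols + (j - 1)] + (same ? 0 : sub),
        matrix[(i - 1) * cols + j] + del,
        matrix[i * cols + (j - 1)] + ins
      );
    }
  }
  return matrix;
}

const matrix = costMatrixSketch("kitten", "sitting", 1, 1, 1);
// The last cell is the total edit distance.
console.log(matrix[matrix.length - 1]); // 3
```

Keeping the whole table costs memory proportional to the product of the two lengths. If only the distance is needed, two rows are enough, which is the trade-off the paragraph above refers to.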
Unicode Concerns
String Lerp handles Unicode reasonably well. Surrogate pairs are recognized and not split, and combining characters stay attached to the character they started with. All algorithms will be noticeably slower and more memory-intensive when given such strings.
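One way to get units that respect surrogate pairs and combining characters is sketched below. It is an illustration of the idea only, not how the library does it: splitUnits is a hypothetical name, and the pattern relies on Unicode property escapes, which need a reasonably modern JavaScript engine.

```javascript
// Split text into units of one non-mark code point plus any combining
// marks that follow it. The "u" flag makes the pattern work on whole
// code points, so a surrogate pair is never cut in half.
function splitUnits(text) {
  return text.match(/\P{M}\p{M}*|\p{M}+/gu) || [];
}

// "e" + combining acute accent, an emoji outside the BMP, then "b".
const units = splitUnits("e\u0301\u{1F600}b");
console.log(units.length); // 3
```

This does not implement full grapheme cluster segmentation, so sequences that are joined by other means are still split apart.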
Some scripts, such as Hangul and Tamil, do not work ideally. Multi-glyph graphemes will be split up, and potentially rejoined, during interpolation. The intermediate string is always a valid Unicode string containing only glyphs present in one string or the other, but the glyphs may be arranged very differently.
A similar problem occurs when switching between LTR and RTL in the same string. The codepoints indicating bidi switches may move around the string, capturing glyphs in ways that are not visually appealing.
License
Copyright 2014 Joe Wreschnig
Licensed under the terms of the GNU GPL v2 or later
@license http://www.gnu.org/licenses/gpl-2.0.html
@source: http://yukkurigames.com/string-lerp/
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of the
License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
As additional permission, you may distribute the program or works
based on it without the copy of the GNU GPL normally required,
provided you include this license notice and a URL through which
recipients can access the corresponding source code.