A sequence of strings that are lexicographically ordered and grow slowly.
npm i save lexsequence
This library provides a sequence of strings that are lexicographically ordered (in order by <
) and grow slowly (a few 1char strings, then some 2char strings, then many 3char strings, etc.).
For example, the base10 sequence is
"0", "1", "2", "3", "4", "50", "51", "52", ..., "73", "74", "750", "751", ...
There are 5 length1 strings, 5^2 length2 strings, 5^3 length3 strings, etc. So the nth string is only as long as the base5 encoding of n. (For more short strings, you can use a base larger than 10.)
Additional properties:
 No string is a prefix of another. E.g., we skip from
"4"
to"50"
without emitting"5"
.  The strings are internally represented as numbers; these numbers are in numeric order, in addition to lexicographic order. (You obtain strings by encoding those numbers in the chosen base.)

Why not ordinary integers?
 They are not lexicographically ordered:
"10" < "2"
.
 They are not lexicographically ordered:

Why not fixedlength strings? (
"000"
,"001"
,"002"
, ...,"999"
) That limits you to a fixed number of strings, instead of an indefinite number. Also, the first few strings (which you'll probably use most often) are more than one char long.

What might I use this library for?
 Naming file versions so that they show up in order.
 Encoding a timestampplustiebreaker as a lexicographicallyordered string
`${sequence(timestamp, BASE)}${tiebreaker}`
, so that you can sort by this single field. (E.g., Lamport timestamps with a process ID tiebreaker.)  I use it in listpositions's lexicographicString function, to assign slowlygrowing strings to sequential positions.
It's a good idea to specify your base as a constant:
import {
sequence,
sequenceInv,
sequenceInvSafe,
successor,
} from "lexsequence";
const BASE = 10;
sequence(n, BASE)
returns the nth entry in the sequence as an integer, which you can then BASE
encode:
for (let n = 0; n < 100; n++) {
console.log(sequence(n, BASE).toString(BASE));
}
// Prints "0", "1", ..., "4", "50", ..., "74", "750", ..., "819"
sequenceInv(seq, BASE)
converts a sequence member back to its index:
console.log(sequenceInv(819, BASE)); // Prints 99
"Safe" version that will return 1 instead of throwing an error, if seq
is not a valid sequence member:
console.log(sequenceInvSafe(5, BASE) === 1); // Prints true
successor(seq, BASE)
is a fast way to go from sequence(n, BASE)
to sequence(n + 1, BASE)
:
let seq = 0;
for (let i = 0; i < 100; i++) {
console.log(seq.toString(BASE));
seq = successor(seq, BASE);
}
// Prints "0", "1", ..., "4", "50", ..., "74", "750", ..., "819"

BASE
must even and >= 4. For a base2 (binary) sequence, binaryencode the numbers from the base4 sequence.  The
BASE
encoding ofsequence(n, BASE)
is always as long as theBASE/2
encoding ofn
.  You can use any alphabet to encode numbers as strings, so long as it consists of
BASE
chars and they are in lexicographic order. E.g., you can use base64 chars in the nonstandard, lexicographic ordering+/09AZaz
.  lexicographicinteger implements the same idea, but only in base 16 or 256, with strings that are generally a bit longer than ours.
 The sequence is as follows, with examples in base 10:
 Starting with 0, enumerate
BASE/2
numbers. (0, 1, ..., 4.)  Add 1, multiply by
BASE
, then enumerate(BASE/2)^2
numbers. (50, 51, ..., 74.)  Add 1, multiply by
BASE
, then enumerate(BASE/2)^3
numbers. (750, 751, ..., 874.)  Repeat this pattern indefinitely, enumerating
(BASE/2)^d
lengthd numbers for each d >= 1. Imagining a decimal place in front of each number, each d consumes 2^(d) of the unit interval, so we never "reach 1" (overflow to d+1 digits when we meant to use d digits).
 Starting with 0, enumerate
 I have not found an existing source describing this sequence, but the binaryencoded base4 sequence is similar to Elias gamma coding.