Node API for detecting meaningful indent levels à la Python, including on lines that mix tabs and spaces.
Node API usage
const {
  indentmonitor,
  IndentError,
} = require('indentmon');

const indentlevel = indentmonitor();

try {
  for (const line of somearray) {
    // Each call returns the indent level and the trimmed line.
    const [level, trimmed] = indentlevel(line);
  }
} catch (e) {
  if (e instanceof IndentError) {
    console.error('Your indentation ruined everything');
    console.error(e);
  }
}
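The `[level, trimmed]` pairs are enough to rebuild nesting. The following is a minimal sketch, not part of indentmon: `buildTree` is a hypothetical helper, and it assumes the pairs have been collected into an array, that levels are zero-based, and that a level never grows by more than one from one line to the next.

```javascript
// Hypothetical helper (not part of indentmon): nest lines by indent level.
// Assumes `pairs` is an array of [level, trimmed] tuples with zero-based
// levels that grow by at most one from one line to the next.
function buildTree(pairs) {
  const root = { text: null, children: [] };
  // stack[i] is the node whose children live at indent level i.
  const stack = [root];
  for (const [level, trimmed] of pairs) {
    const node = { text: trimmed, children: [] };
    // Drop back to the parent that owns this level.
    stack.length = level + 1;
    stack[level].children.push(node);
    stack.push(node);
  }
  return root;
}

const tree = buildTree([
  [0, 'foo'],
  [1, 'bar'],
  [1, 'baz'],
  [0, 'snafu'],
]);
console.log(JSON.stringify(tree, null, 2));
```

Here `foo` ends up with two children (`bar` and `baz`), and `snafu` is a sibling of `foo`.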
Motivation
When writing my own tooling I keep finding new reasons to write DSLs. Having a way to parse meaningful indents that is aware of mixed tabs and spaces lets me quickly establish scope.
Indent rules
When indentmon scans each line, it follows these rules (paraphrased from Antti Haapala):
- If both the number of tabs and the number of spaces match those of the previous line (no matter the order), then the indent level does not change.
- If the count of one of (tabs, spaces) is greater than on the previous line and the count of the other is at least equal to that on the previous line, this is an indented block.
- If the tuple (tabs, spaces) matches an indent from a previous block, dedent to that block.
- Otherwise, raise IndentError.
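The rules above can be sketched as a small stack of (tabs, spaces) tuples. This is an illustration written from the rule list only, not indentmon's actual source: `makeTracker` and `SketchIndentError` are hypothetical names, the first block is assumed to start at (0, 0), and blank lines get no special treatment.

```javascript
// Illustrative sketch of the four rules above; not indentmon's actual source.
// `SketchIndentError` and `makeTracker` are hypothetical names.
class SketchIndentError extends Error {}

function makeTracker() {
  // One (tabs, spaces) tuple per open block; the level is the stack index.
  const stack = [{ tabs: 0, spaces: 0 }];
  return function track(line) {
    const lead = line.match(/^[\t ]*/)[0];
    const tabs = lead.split('\t').length - 1;
    const spaces = lead.length - tabs;
    const top = stack[stack.length - 1];
    if (tabs === top.tabs && spaces === top.spaces) {
      // Rule 1: same counts as the previous line, so the level is unchanged.
    } else if (tabs >= top.tabs && spaces >= top.spaces) {
      // Rule 2: one count grew and the other did not shrink, so indent.
      stack.push({ tabs, spaces });
    } else {
      // Rule 3: dedent to an earlier block with the same tuple.
      const i = stack.findIndex((s) => s.tabs === tabs && s.spaces === spaces);
      // Rule 4: no earlier block matches.
      if (i === -1) {
        throw new SketchIndentError(`inconsistent indent: ${JSON.stringify(lead)}`);
      }
      stack.length = i + 1;
    }
    return [stack.length - 1, line.slice(lead.length)];
  };
}

const track = makeTracker();
const levels = ['a', '\tb', '\t  c', '\tb2', 'a2'].map((l) => track(l)[0]);
console.log(levels); // [ 0, 1, 2, 1, 0 ]
```

With this sketch, a line indented with four spaces directly after a line indented with one tab throws, because the tab count shrank and no earlier block used (0 tabs, 4 spaces).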
Testing
While testing may be automated using the below command, you may also use the Node.js REPL to interact with indentmon directly.
$ npm run test MINLEN NTRIALS
The test suite generates MINLEN*NTRIALS indented code samples and uses brute force to ensure they are all parsed correctly. It also intentionally indents code incorrectly to make sure indentmon raises exceptions.
* `MINLEN`: The minimum lines of code to create in each production.
* `NTRIALS`: The number of productions belonging to each family to generate for testing.
Test reports have the format FM#: PROD, where FM is a two-letter code for one of the production families below, # is the trial (the instance of the production tested), and PROD is the indent DSL used to generate code according to a given indent style and pattern.
Production families
CT (Constant): Indentation never changes.
foo
bar
baz
snafu
IN (Increasing): Indentation increases each line.
foo
bar
baz
snafu
ND (Nondecreasing): Indentation may or may not increase per line, but will never decrease.
foo
bar
baz
snafu
fubar
barbaz
NM (Nonmonotonic): Indentation grows, then shrinks.
foo
bar
baz
snafu
fubar
barbaz
goofus
gallant
archangel of shamalama
AN (Anchored): Indentation always returns to column 0 at the end.
foo
bar
baz
snafu
fubar
barbaz
goofus
gallant
archangel of shamalama
DO (Dropoff): Indentation grows sharply, then falls to a low indent level.
foo
bar
baz
snafu
fubar
barbaz
foo
bar
baz
snafu
fubar
barbaz
gallant
archangel of shamalama
Indent DSL
DSL productions use the following charset: ><-0123456789
- `-`: Print line number, then move to next line.
- `>`: Indent one level, then '-' command.
- `<`: Dedent one level, then '-' command.
- `0-9`: Go to indicated indent level, then '-' command.
The following rules apply:
- Indent levels are zero-based.
- You cannot use `<` before the first `>`.
- There cannot be more `<`s than `>`s.
- Digits may only exceed the current level by at most 1.
Examples:
'+' indicates an indent. 'n' indicates a newline.
---- prints 1n2n3n4n
->>- prints 1n+2n++3n++4n
><> prints +1n2n+3n
->>>0><>>1 prints 1n+2n++3n+++4n5n+6n7n+8n++9n+10n
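->>59 is invalid. After >> the level is 2, so 5 is already out of range; even at level 5, the highest digit that could follow is 6.
->>95 is invalid. The highest digit that can follow level 2 (implied by >>) is 3.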
>>3 is valid. (Same as >>>)
>>>1 is valid.
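To make the commands and rules concrete, here is a small interpreter sketch written from the description above. It is not the `render` function in test.js: `expand` is a hypothetical name, it emits the '+' and 'n' notation from the examples with 'n' placed only between lines, and the extra check against dedenting below level 0 is an assumption of this sketch.

```javascript
// Hypothetical interpreter for the indent DSL, written from the rules above.
// Not the `render` function from test.js.
function expand(dsl) {
  let level = 0;
  let lineNo = 0;
  let opens = 0;  // number of '>' seen so far
  let closes = 0; // number of '<' seen so far
  const lines = [];
  for (const ch of dsl) {
    if (ch === '>') {
      opens += 1;
      level += 1;
    } else if (ch === '<') {
      closes += 1;
      // The level-0 guard is an assumption of this sketch.
      if (closes > opens || level === 0) {
        throw new Error(`'<' has nothing to dedent from in ${dsl}`);
      }
      level -= 1;
    } else if (ch >= '0' && ch <= '9') {
      const target = Number(ch);
      if (target > level + 1) {
        throw new Error(`digit ${ch} exceeds level ${level} by more than 1`);
      }
      level = target;
    } else if (ch !== '-') {
      throw new Error(`unexpected character ${ch}`);
    }
    // Every command ends with the '-' action: print the line number.
    lineNo += 1;
    lines.push('+'.repeat(level) + lineNo);
  }
  return lines.join('n');
}

console.log(expand('->>-')); // 1n+2n++3n++4
console.log(expand('><>'));  // +1n2n+3
```

Under these assumptions `expand('>>3')` and `expand('>>>')` give the same string, while `expand('->>59')` and `expand('->>95')` both throw at the first digit.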
Using in Node.js REPL
$ node
> .load ./test.js
> render('>>--->><<--')
1
2
3
4
5
6
7
8
9
10
11
You can also generate productions from families like so, where L is an integer for the minimum number of lines you want to print. Removing the call to render will show the indent DSL used.
> render(createConstantProduction(L))
> render(createIncreasingProduction(L))
> render(createNonDecreasingProduction(L))
> render(createAnchoredProduction(L))
> render(createNonMonotonicProduction(L))
> render(createDropOffProduction(L))
TODO
Consider using a generator like so:
const {
  indentmonitor,
  IndentError,
} = require('indentmon');

try {
  for (const [level, trimmed] of indentmonitor(someiterable)) {
    // Should user keep access to original line?
    // If so, is it worth storing two strings for each line?
  }
} catch (e) {
  ...
}
Also, distinguish between TabError and IndentError.
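One way the generator idea might look is a thin wrapper around any per-line function that returns `[level, trimmed]`. This is only a sketch: `monitorLines` and `countTabs` are hypothetical names, and `countTabs` is a stand-in that counts leading tabs, not indentmon's algorithm.

```javascript
// Sketch of the generator idea: wrap a per-line function that returns
// [level, trimmed]. `monitorLines` and `countTabs` are hypothetical names.
function* monitorLines(lines, indentlevel) {
  for (const line of lines) {
    const [level, trimmed] = indentlevel(line);
    // Yield the original line too, so the caller decides whether to keep it.
    yield [level, trimmed, line];
  }
}

// Stand-in per-line function: level = number of leading tabs.
function countTabs(line) {
  const lead = line.match(/^\t*/)[0];
  return [lead.length, line.slice(lead.length)];
}

const results = [...monitorLines(['foo', '\tbar', '\t\tbaz'], countTabs)];
console.log(results.map(([level, trimmed]) => `${level}:${trimmed}`).join(' '));
// 0:foo 1:bar 2:baz
```

Because a generator is lazy, only the current line's strings are produced per step; whether they are retained is up to the consumer, which may bear on the question about storing two strings per line.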