Convert a Paragraph node to plain text with a source map.
This means you can map a position in the plain text back to its original position in the source.
This library is intended for textlint and textstat.
Installation
npm install textlint-util-to-string
Terminology
The concepts of position and index are the same as in the TxtAST Interface and textlint/structured-source.
- position is a { line, column } object.
  - The column property of position is 0-based.
  - The line property of position is 1-based.
- index is an offset number.
  - The index property is 0-based.
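As a rough illustration of these conventions, the following sketch converts between an index and a position on a plain string. The helper names indexToPosition and positionToIndex are hypothetical and written only for this example; they are not part of this library's API, and line breaks are assumed to be "\n".

```javascript
// Hypothetical helpers for illustration; not part of textlint-util-to-string.
// Assumes lines are separated by "\n".
function indexToPosition(text, index) {
    if (index < 0 || index > text.length) {
        return undefined;
    }
    const lines = text.slice(0, index).split("\n");
    return {
        line: lines.length, // 1-based
        column: lines[lines.length - 1].length // 0-based
    };
}

function positionToIndex(text, position) {
    const lines = text.split("\n");
    if (position.line < 1 || position.line > lines.length) {
        return undefined;
    }
    if (position.column < 0 || position.column > lines[position.line - 1].length) {
        return undefined;
    }
    let index = 0;
    for (let i = 0; i < position.line - 1; i++) {
        index += lines[i].length + 1; // +1 for the "\n" that ends the line
    }
    return index + position.column;
}

const text = "first line\nsecond line";
console.log(indexToPosition(text, 0)); // { line: 1, column: 0 }
console.log(indexToPosition(text, 11)); // { line: 2, column: 0 }
console.log(positionToIndex(text, { line: 2, column: 7 })); // 18
```

Both representations point at the same character; the position form is only easier to read when a text spans several lines.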
API
new StringSource(node: TxtParentNode, options?: StringSourceOptions)
Create a new StringSource instance for a Paragraph node.
toString(): string
Get plain text from the Paragraph node.
This plain text is the concatenation of the value of every child node of the Paragraph node.
import { StringSource } from "textlint-util-to-string";
const report = function (context) {
    const { Syntax, report, RuleError } = context;
    return {
        // "This is **a** `code`."
        [Syntax.Paragraph](node) {
            const source = new StringSource(node);
            const text = source.toString(); // => "This is a code."
        }
    }
};
In some cases, you may want to replace some characters in the plain text to avoid false positives.
You can replace the value of child nodes with options.replacer.
options.replacer is a function that receives a node and command functions such as maskValue and emptyValue.
To modify the value of a node, return the result of calling one of these command functions.
// "This is a `code`."
const source = new StringSource(paragraphNode, {
    replacer({ node, maskValue }) {
        if (node.type === Syntax.Code) {
            return maskValue("_"); // code => ____
        }
    }
});
console.log(source.toString()); // => "This is a ____."
- maskValue(character: string): mask the value of the node with the given character.
- emptyValue(): replace the value of the node with an empty string.
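The effect of the two commands can be pictured with a simplified model that does not use the library at all. In this sketch each child node is reduced to a { type, value } object, and toPlainText is a hypothetical function written for this example; it is an assumption about the behavior described above, not the library's implementation.

```javascript
// Simplified model for illustration only; not the library's implementation.
// Each child node is reduced to { type, value }.
function toPlainText(nodes, replacer) {
    return nodes
        .map((node) => {
            const commands = {
                // Replace every character of the value with the mask character.
                maskValue: (character) => ({ value: character.repeat(node.value.length) }),
                // Replace the value with an empty string.
                emptyValue: () => ({ value: "" })
            };
            const result = replacer ? replacer({ node, ...commands }) : undefined;
            // No command returned: keep the node's value unchanged.
            return result === undefined ? node.value : result.value;
        })
        .join("");
}

// "This is a `code`."
const children = [
    { type: "Str", value: "This is a " },
    { type: "Code", value: "code" },
    { type: "Str", value: "." }
];

const masked = toPlainText(children, ({ node, maskValue }) => {
    if (node.type === "Code") {
        return maskValue("_");
    }
});
console.log(masked); // => "This is a ____."

const emptied = toPlainText(children, ({ node, emptyValue }) => {
    if (node.type === "Code") {
        return emptyValue();
    }
});
console.log(emptied); // => "This is a ."
```

In this model, masking keeps the length of the text so the characters that follow do not shift, while emptying removes the value entirely.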
originalIndexFromIndex(generatedIndex): number | undefined
Get the original index from a generated index.
originalPositionFromPosition(position): Position | undefined
Get the original position from a generated position.
originalIndexFromPosition(generatedPosition): number | undefined
Get the original index from a generated position.
originalPositionFromIndex(generatedIndex): Position | undefined
Get the original position from a generated index.
Examples
Create plain text from a Paragraph node and get the original position from the plain text.
import assert from "assert";
import { StringSource } from "textlint-util-to-string";
const report = function (context) {
    const { Syntax, report, RuleError } = context;
    return {
        // "This is [Example!?](http://example.com/)"
        [Syntax.Paragraph](node) {
            const source = new StringSource(node);
            const text = source.toString(); // => "This is Example!?"
            // "Example" is located at index 8 in the plain text
            const index1 = text.indexOf("Example");
            assert.strictEqual(index1, 8);
            // "Example" is located at index 9 in the original text
            assert.strictEqual(source.originalIndexFromIndex(index1), 9);
            assert.deepStrictEqual(source.originalPositionFromPosition({
                line: 1,
                column: 8
            }), {
                line: 1,
                column: 9
            });
            // Another example with "!?", which is located at index 15 in the plain text
            // and at index 16 in the original text
            const index2 = text.indexOf("!?");
            assert.strictEqual(index2, 15);
            assert.strictEqual(source.originalIndexFromIndex(index2), 16);
        }
    }
};
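To picture how a generated index can be mapped back to the original text, here is a minimal sketch that is independent of the library. It assumes the Paragraph in the example above contributes two text segments and records where each one starts in the plain text and in the original text. The segment table and the lookupOriginalIndex function are hypothetical; the library's internal source map may be organized differently.

```javascript
// Hypothetical segment table for "This is [Example!?](http://example.com/)".
// generatedStart: offset in the plain text "This is Example!?"
// originalStart: offset in the original Markdown text
const segments = [
    { generatedStart: 0, originalStart: 0, length: 8 }, // "This is "
    { generatedStart: 8, originalStart: 9, length: 9 } // "Example!?" (after "[")
];

// Find the segment that contains the generated index and apply its offset.
function lookupOriginalIndex(segments, generatedIndex) {
    for (const segment of segments) {
        const offset = generatedIndex - segment.generatedStart;
        if (offset >= 0 && offset < segment.length) {
            return segment.originalStart + offset;
        }
    }
    return undefined; // outside every known segment
}

console.log(lookupOriginalIndex(segments, 8)); // => 9
console.log(lookupOriginalIndex(segments, 15)); // => 16
console.log(lookupOriginalIndex(segments, 100)); // => undefined
```

The undefined result for an index outside every segment mirrors the number | undefined return type documented for the mapping methods.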
Integration with sentence-splitter
sentence-splitter splits a paragraph into sentences.
You can pass the Sentence node to StringSource to get the plain text of the sentence.
import { splitAST, SentenceSplitterSyntax } from "sentence-splitter";
import { StringSource } from "textlint-util-to-string";
import type { TextlintRuleModule } from "@textlint/types";
// Placeholder for this rule's own options type
type Options = {};
const report: TextlintRuleModule<Options> = function (context) {
    const { Syntax, report, RuleError } = context;
    return {
        // "First sentence. Second sentence."
        [Syntax.Paragraph](node) {
            // { children: [Sentence, WhiteSpace, Sentence] }
            const sentenceRoot = splitAST(node);
            // ["First sentence." node, "Second sentence." node]
            const sentences = sentenceRoot.children.filter((node) => node.type === SentenceSplitterSyntax.Sentence);
            for (const sentence of sentences) {
                const sentenceSource = new StringSource(sentence);
                const sentenceText = sentenceSource.toString();
                console.log(sentenceText);
                const sentenceIndex = sentenceText.indexOf("sentence");
                const originalSentenceIndex = sentenceSource.originalIndexFromIndex(sentenceIndex);
                console.log({ sentenceIndex, originalSentenceIndex });
            }
        }
    }
};
export default report;
Rules that use this library
- azu/textlint-rule-first-sentence-length: textlint rule that limits the maximum length of the first sentence of a section.
- azu/textlint-rule-en-max-word-count: textlint rule that specifies the maximum word count of a sentence.
- nodaguti/textlint-rule-spellchecker: textlint rule that checks spelling with the native spellchecker.
FAQ
Why are positions returned relative to rootNode?
const AST = {...}
const rootNode = AST.children[10];
const source = new StringSource(rootNode);
source.originalIndexFromIndex(0); // should be 0
Relative positions make it easy to compute other positions (we think).
Only one coordinate space should hold absolute positions; every other position should be relative to it.
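If an absolute index into the whole document is needed, one option is to add the start offset of the node to the relative index. The sketch below assumes the node carries a TxtAST-style range: [start, end] property and uses a hand-written object in place of a real node; toAbsoluteIndex is a hypothetical helper, not part of this library.

```javascript
// Hand-written stand-in for a TxtAST node; only `range` matters here.
const paragraphNode = { type: "Paragraph", range: [120, 160] };

// Hypothetical helper: relative index (from the node start) to document offset.
function toAbsoluteIndex(node, relativeIndex) {
    return node.range[0] + relativeIndex;
}

console.log(toAbsoluteIndex(paragraphNode, 0)); // => 120
console.log(toAbsoluteIndex(paragraphNode, 9)); // => 129
```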
Tests
npm test
Contributing
- Fork it!
- Create your feature branch: git checkout -b my-new-feature
- Commit your changes: git commit -am 'Add some feature'
- Push to the branch: git push origin my-new-feature
- Submit a pull request :D
License
MIT