remark-code-processor

remark plugin that visits code blocks with specific meta information and passes them to processor functions.

What is this?
When should I use this?
Future
Install
API
- unified().use(remarkCodeProcessor, options)
- Options
Examples
Compatibility
Security
License

What is this?

This package is a unified (remark) plugin that visits a tree's code blocks, looks for meta information matching a pattern, and, if a match is found, runs a function on the node.

unified is a project that transforms content with abstract syntax trees (ASTs). remark adds support for Markdown to unified. mdast is the Markdown AST that remark uses. This is a remark plugin that uses mdast.

When should I use this?

This project is useful for a handful of objectives, each of which has an example documented in Examples:

Literate programming

With a to:file somefile.js directive, one could collect code fragments from a document and extract them to files. This enables a literate style of programming where code is the artifact of a document. You can also send code to multiple files from a single markdown document in this manner, enabling something like "single file components" in places that otherwise don't support them.

Furthermore, with Markdown wiki tools like [Obsidian][], [Logseq][], or [Markdown Oxide][], your code fragments can live in a hyperlinked wiki.

Additional Metadata

 If one wanted to add document metadata besides what's easily represented in a YAML/JSON/TOML frontmatter format, or to mix-and-match those formats, or to add document metadata based on the result of a code expression, this plugin can help enable that goal.
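As a rough illustration of this use case, a processor could parse a code block's contents and store the result on a shared object. The `data` directive name, the `metadata` object, and the hand-built node below are assumptions made for this sketch; in real use the plugin would call the processor for you.

```javascript
// Hypothetical store for metadata gathered while the document is processed.
const metadata = {}

// Sketch of a processors object that could be passed as this plugin's options.
// `data` is an assumed directive name, written as "to:data <key>" in the meta.
const metadataProcessors = {
	data: (arg, node) => {
		const [key] = arg.trim().split(/\s+/)
		if (node.lang === 'json') {
			metadata[key] = JSON.parse(node.value)
		} else {
			// Other formats would need their own parser; keep the raw text.
			metadata[key] = node.value
		}
	}
}

// Hand-built stand-in for the mdast node of a "json to:data author" code block.
const authorNode = {
	type: 'code',
	lang: 'json',
	meta: 'to:data author',
	value: '{"name": "Ada"}'
}

// Simulate the call the plugin would make for this node.
metadataProcessors.data('author', authorNode, 0, undefined)
```

Because the processor only reads `node.lang` and `node.value`, the same pattern could be extended to YAML or TOML blocks by swapping in a parser for those formats.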

Inline Templates

Some visual Markdown editors make working with inline HTML less than pleasant. With code blocks such as `\`\`\`html to:inline`, `\`\`\`javascript to:inline`, or `\`\`\`css to:inline`, one could write HTML, JavaScript, or CSS code in an area the editor treats *as code*, potentially with syntax highlighting, and then either directly inline the HTML, or wrap the contents in script or style tags as appropriate.

Future

This project is a work-in-progress, and I am actively using it for my own immediate use-cases, described in When should I use this?.

At the moment, the only change I see being useful would be to customize the directive heading (to:<somefn>); however, I will probably not take this on myself. If someone wanted to add support for this, I would like any customization of the to: pattern to support multiple tests for opting in to code processing.

Install

This package is ESM only. In Node.js (version 20+, see Compatibility), install with npm:

npm install remark-code-processor

API

This package exports no identifiers. The default export is remarkCodeProcessor.

unified().use(remarkCodeProcessor, options)

Process code blocks containing a to: signifier in their "meta" part (the part after the whitespace that follows the language name), with the processors provided.

Parameters

options (Options)
- configuration

Returns

Transform (Transformer)

Options

Configuration of processors. This is a Record<string, fn>:

The key string value is the "directive name", what this plugin will match after to:
The fn processor is a function with the signature: arg, node, index, parent, and its return value is given to unist-util-visit to decide how to proceed.

If a to: directive with no matching key is found, remark-code-processor will throw an error.
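The lookup described above can be pictured roughly as follows. This is only a sketch of the behavior, not the plugin's actual source: the `DIRECTIVE` pattern, the `dispatch` helper, and the error message are all assumptions.

```javascript
// Assumed pattern: "to:" plus a directive name, then optional whitespace and an argument.
const DIRECTIVE = /(?:^|\s)to:(\S+)(?:\s+(.*))?$/

// Hypothetical helper: find the directive in node.meta and call its processor.
function dispatch(processors, node, index, parent) {
	const match = DIRECTIVE.exec(node.meta || '')
	if (!match) return undefined // no directive: leave the node alone
	const [, name, arg = ''] = match
	// Only accept keys defined on the options object itself.
	if (!Object.prototype.hasOwnProperty.call(processors, name)) {
		throw new Error(`No processor registered for "to:${name}"`)
	}
	return processors[name](arg, node, index, parent)
}

// Usage with a hand-built code node.
const seenArgs = []
dispatch(
	{ file: (arg) => { seenArgs.push(arg) } },
	{ type: 'code', lang: 'jsx', meta: 'to:file component.jsx', value: '' },
	0,
	undefined
)
```

Code blocks without a to: directive fall through untouched, while a directive with no registered processor fails loudly, which matches the behavior described above.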

Processor Function Arguments

arg: (string) -- the node's 'meta' value following the whitespace after the processing directive
node: (Node) -- the mdast node for the code block. It will have lang, meta, and value keys.
index: (number or undefined) -- the index of node in parent
parent: (Node or undefined) -- parent of node
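To make the four arguments concrete, here is a small sketch that calls a processor by hand against a hand-built tree. The tree, the `calls` array, and the manual invocation are assumptions for illustration; in real use the plugin supplies these values while visiting the document.

```javascript
// Hand-built stand-in for the mdast tree of a document with one paragraph
// followed by a code block whose info string is "jsx to:file component.jsx".
const exampleCode = {
	type: 'code',
	lang: 'jsx',
	meta: 'to:file component.jsx',
	value: '<div></div>'
}
const exampleRoot = {
	type: 'root',
	children: [{ type: 'paragraph', children: [] }, exampleCode]
}

const calls = []
const fileProcessor = (arg, node, index, parent) => {
	calls.push({
		arg, // the meta text after "to:file "
		lang: node.lang, // language of the code block
		source: node.value, // the code block's contents
		position: index, // node is parent.children[index]
		parentType: parent ? parent.type : undefined
	})
}

// Simulate the call the plugin would make for this node.
fileProcessor('component.jsx', exampleCode, exampleRoot.children.indexOf(exampleCode), exampleRoot)
```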

Processor Function Returns

What to do next. This is passed directly to unist-util-visit's Visitor, but you will likely only want to use:

'skip': Do not process this node's children. Code blocks do not normally have children, however if you have transformed this node into a different type that DOES have children and do not want code blocks in the resultant children processed, you can return 'skip'
Index (number): The index of the parent's children to process next. If you modified the parent's children nodes, either by introducing or removing siblings, or have removed the target node itself, this will tell the processor where it should look next.
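As a sketch of the index return value, the hypothetical `lines` processor below replaces a code block with one paragraph per line and then tells the visitor to resume after the nodes it inserted. The directive name and the hand-built tree are assumptions for illustration.

```javascript
// Hypothetical `lines` processor: replaces a code block with one paragraph per line.
const lines = (arg, node, index, parent) => {
	if (!parent || index === undefined) return undefined
	const paragraphs = node.value.split('\n').map((text) => ({
		type: 'paragraph',
		children: [{ type: 'text', value: text }]
	}))
	// Swap the code node for the new paragraphs.
	parent.children.splice(index, 1, ...paragraphs)
	// Resume visiting after the inserted paragraphs.
	return index + paragraphs.length
}

// Simulate the plugin's call on a hand-built tree.
const linesCode = { type: 'code', lang: 'text', meta: 'to:lines', value: 'one\ntwo' }
const linesRoot = { type: 'root', children: [linesCode, { type: 'thematicBreak' }] }
const nextIndex = lines('', linesCode, 0, linesRoot)
```

Returning `index` instead would make the visitor look at the first inserted paragraph next, which is harmless here but can matter if a processor inserts nodes that are themselves code blocks.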

NOTE: The processor function may not be asynchronous. If it returns a promise, that promise will be ignored and the document will continue being processed as normal. This is a limitation of the underlying unist-util-visit library.
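One possible way to work within this limitation is to start asynchronous work inside the processor, keep the promise in a list, and await the list once processing has finished. The sketch below assumes a hypothetical `shout` directive and a stand-in `slowUppercase` function; it is a pattern, not something the plugin provides.

```javascript
// Promises started by processors: collected synchronously, awaited later.
const pending = []
const results = {}

// Hypothetical stand-in for real asynchronous work (a fetch, a file write, ...).
const slowUppercase = (text) => Promise.resolve(text.toUpperCase())

const asyncFriendlyProcessors = {
	shout: (arg, node) => {
		// Returning the promise would have no effect, so store it instead.
		pending.push(
			slowUppercase(node.value).then((value) => {
				results[arg] = value
			})
		)
	}
}

// Call this after the unified run has finished.
async function settle() {
	await Promise.all(pending)
	return results
}

// Simulate the plugin's call, then wait for the collected work.
asyncFriendlyProcessors.shout('greeting', { type: 'code', lang: 'text', meta: 'to:shout greeting', value: 'hello' })
const settled = settle()
```

Note that results gathered this way are only available after the document has been processed, so they cannot change the tree during the same run.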

Examples

Example: extracting code blocks for further usage.

providing these processors:

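	import { unified } from 'unified'
	import remarkParse from 'remark-parse'
	import remarkStringify from 'remark-stringify'
	import remarkCodeProcessor from 'remark-code-processor'

	// Collected code fragments, keyed by target file name.
	const codeBlocks = {}

	unified()
		.use(remarkParse)
		.use(remarkCodeProcessor, {
			file: (arg, node, index, parent) => {
				// The first word after "to:file" is the file name.
				const [name] = arg.split(/\s+/)
				if (!codeBlocks[name]) { codeBlocks[name] = [] }
				codeBlocks[name].push(node.value)
			}
		})
		.use(remarkStringify)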

Will parse this markdown:

Here is my component:
```jsx to:file component.jsx
<div class="example"><slot></slot></div>
```

and here are its styles:
```css to:file component.css
.example { color: red; }
```

and result in this value for codeBlocks:

{
	'component.jsx': ['<div class="example"><slot></slot></div>'],
	'component.css': ['.example { color: red; }']
}

The code blocks will remain in the document's output.
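From there, the collected fragments could be turned into file contents. The sketch below is one assumed way to do that: `toFiles` is a hypothetical helper, and the second CSS fragment is added only to show how several fragments for the same file are joined. Writing the results to disk is left to the caller.

```javascript
// Shape produced by the `file` processor: file name -> array of fragments.
const collected = {
	'component.jsx': ['<div class="example"><slot></slot></div>'],
	'component.css': ['.example { color: red; }', '.example { margin: 0; }']
}

// Join each file's fragments in document order; returns [name, contents] pairs.
function toFiles(blocks) {
	return Object.entries(blocks).map(([name, fragments]) => [
		name,
		fragments.join('\n') + '\n'
	])
}

const files = toFiles(collected)
```

Each pair can then be passed to whatever writes files in your build, for example `fs.writeFile` from Node.js.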

Example: removing a code block from the output document

providing these processors:

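	import { unified } from 'unified'
	import remarkParse from 'remark-parse'
	import remarkStringify from 'remark-stringify'
	import remarkCodeProcessor from 'remark-code-processor'

	unified()
		.use(remarkParse)
		.use(remarkCodeProcessor, {
			elide: (arg, node, index, parent) => {
				// Remove this node from its parent.
				parent.children.splice(index, 1)
				// The next sibling now sits at the same index, so continue from there.
				return index
			}
		})
		.use(remarkStringify)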

Will parse this markdown:

Here is my document.
```text to:elide
This code block will be removed
```

And here is more of my document.

and result in the following markdown rendered by remarkStringify:

Here is my document.

And here is more of my document.

Example: transforming a node to inline html:

providing these processors:

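	import { unified } from 'unified'
	import remarkParse from 'remark-parse'
	import remarkRehype from 'remark-rehype'
	import rehypeRaw from 'rehype-raw'
	import rehypeStringify from 'rehype-stringify'
	import remarkCodeProcessor from 'remark-code-processor'

	unified()
		.use(remarkParse)
		.use(remarkCodeProcessor, {
			inline: (arg, node, index, parent) => {
				// Wrap JavaScript and CSS in the matching tag; HTML is used as-is.
				if (node.lang === 'javascript') {
					node.value = `<script>${node.value}</script>`
				} else if (node.lang === 'css') {
					node.value = `<style>${node.value}</style>`
				}
				// Turn the code node into a raw HTML node.
				node.type = 'html'
				delete node.lang
				delete node.meta
			}
		})
		// remark-rehype converts mdast to hast; raw HTML must be explicitly allowed.
		.use(remarkRehype, { allowDangerousHtml: true })
		.use(rehypeRaw)
		.use(rehypeStringify)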

will transform this markdown:

This is my document
```html to:inline
<aside>This is an aside</aside>
```
```javascript to:inline
console.log('hello')
```
```css to:inline
aside { font-size: 12px; }
```

into this html:

<p>This is my document</p>
<aside>This is an aside</aside>
<script>console.log('hello')</script>
<style>aside { font-size: 12px; }</style>

Compatibility

This project is compatible with Node.js version 20 and later.

This plugin works with unified version 11+. It may work with versions 6+ but I have not tested it with those.

Security

Use of remark-code-processor is safe by default; however, you may introduce security issues depending on how you process the resultant code extractions.
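For example, a processor that writes extracted code to disk takes its file name from the document, so an untrusted document could point it outside the intended directory. The guard below is a minimal sketch of one possible check; `isSafeTarget` is a hypothetical helper, not part of this package, and it is not a complete defense on its own.

```javascript
// Hypothetical guard: accept only relative paths that stay inside the output directory.
function isSafeTarget(name) {
	if (typeof name !== 'string' || name === '') return false
	// Reject embedded NUL characters.
	if (name.includes('\0')) return false
	// Reject absolute POSIX paths, UNC paths, and Windows drive paths.
	if (/^([\/\\]|[A-Za-z]:)/.test(name)) return false
	// Reject any parent-directory segment.
	return !name.split(/[\/\\]/).includes('..')
}

// A processor could call this before using its argument as a file name.
const safeExamples = ['component.jsx', 'src/component.css'].map(isSafeTarget)
const unsafeExamples = ['../secrets.txt', '/etc/passwd', 'a/../../b'].map(isSafeTarget)
```

Similar care applies to the inline HTML example: inlining script or style content from a document you do not control lets that document run code in the rendered page.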