cmark-gfm-js
- A port of GitHub's cmark to JavaScript (using Emscripten)
- Supports Node.js and the browser
- GitHub Flavored Markdown (GFM) Compatibility
- HTML Sanitization
- Benchmarks
- TypeScript friendly
Installation
Node.js
yarn add cmark-gfm-js
Browser
Download cmark-gfm.js
Usage
- convert: converts a GitHub Flavored Markdown (GFM) string to HTML.
- convertUnsafe: calls convert with GFM's tagfilter extension disabled (see "HTML Sanitization" below for details).
Examples
In Node.js:
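const gfm = require('cmark-gfm-js');

const markdown = '# Hi\nThis ~text~~~~ is ~~~~curious 😡🙉🙈~.';
let html = gfm.convert(markdown);
console.log(html);
/** Prints:
<h1>Hi</h1>
<p>This <del>text</del> is <del>curious 😡🙉🙈</del>.</p>
*/

// Specify an option.
// `sourceposOption` is a placeholder for the library's source-position
// option value; substitute the value listed under "cmark-gfm Options".
html = gfm.convert(markdown, sourceposOption);
console.log(html);
/** Prints
<h1 data-sourcepos="1:1-1:4">Hi</h1>
<p data-sourcepos="2:1-2:44">This <del>text</del> is <del>curious 😡🙉🙈</del>.</p>
*/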
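The data-sourcepos values in the second output use cmark's "start line:start column-end line:end column" form. If you need those positions as numbers (for example, to map a rendered element back to the markdown source), a small helper along these lines may be enough. parseSourcepos is a hypothetical name and is not part of cmark-gfm-js.

```javascript
// Hypothetical helper, not part of cmark-gfm-js.
// Parses a source position such as "2:1-2:44" into numeric fields.
// Returns null when the value does not have the expected shape.
function parseSourcepos(value) {
  const match = /^(\d+):(\d+)-(\d+):(\d+)$/.exec(String(value));
  if (!match) {
    return null;
  }
  const [startLine, startColumn, endLine, endColumn] = match.slice(1).map(Number);
  return { startLine, startColumn, endLine, endColumn };
}

// Example: the paragraph from the output above.
const pos = parseSourcepos('2:1-2:44');
console.log(pos.startLine, pos.endColumn);
```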
In browser:
GFM Compatibility
Task list items are not supported (issue). Use emojis instead, e.g.:
✅ Done.
❌ To be done.
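If your markdown already contains GFM task list markers, one way to apply the emoji workaround is to rewrite them before conversion. The sketch below is only an illustration: taskListToEmoji is a hypothetical helper, not part of cmark-gfm-js, and it works line by line without parsing markdown, so it would also rewrite markers that appear inside code blocks.

```javascript
// Hypothetical pre-processing helper, not part of cmark-gfm-js.
// Rewrites "- [x] item" to "- ✅ item" and "- [ ] item" to "- ❌ item".
// Only list items (-, *, + or ordered markers) at the start of a line are touched.
function taskListToEmoji(markdown) {
  const taskMarker = /^([ \t]*(?:[-*+]|\d+[.)])[ \t]+)\[([ xX])\][ \t]+/gm;
  return markdown.replace(taskMarker, (whole, prefix, mark) => {
    return prefix + (mark === ' ' ? '❌ ' : '✅ ');
  });
}

console.log(taskListToEmoji('- [x] Done.\n- [ ] To be done.'));
```

The rewritten string can then be passed to convert as usual.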
HTML Sanitization
Some background
TL;DR: See A Good HTML Sanitizer for a working example of an HTML sanitizer.
The current CommonMark Spec 0.27 allows raw HTML tags in markdown but says nothing about sanitizing raw HTML. cmark-gfm comes with two built-in (but imperfect) solutions:
- cmark comes with a SAFE option, which suppresses most raw HTML tags (see Options below). Drawbacks: many safe tags are removed, and it is not configurable.
- cmark-gfm comes with an extension called tagfilter, which filters a set of HTML tags and is defined in the GFM spec (see spec). Drawbacks: it cannot filter tags with malicious attributes, and it is not configurable.
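To see why tagfilter cannot catch malicious attributes, it helps to look at the rule itself: the GFM spec's "Disallowed Raw HTML" extension replaces the leading < of a fixed set of tag names with &lt; and looks at nothing else. The following is a rough re-implementation of that rule for illustration only. It is not cmark-gfm's code, and tagfilterSketch is a hypothetical name.

```javascript
// Illustrative sketch of the GFM tagfilter rule; not cmark-gfm's implementation.
// Tag names taken from the GFM spec's "Disallowed Raw HTML" extension.
const FILTERED_TAGS = [
  'title', 'textarea', 'style', 'xmp', 'iframe',
  'noembed', 'noframes', 'script', 'plaintext',
];

// Matches a "<" that starts an opening or closing tag with a filtered name.
const FILTER_PATTERN = new RegExp(
  '<(?=/?(?:' + FILTERED_TAGS.join('|') + ')(?:[\\s/>]|$))',
  'gi'
);

function tagfilterSketch(html) {
  // Only the tag name is inspected; attributes such as onclick pass through.
  return html.replace(FILTER_PATTERN, '&lt;');
}

console.log(tagfilterSketch('<script>alert(1)</script>\n<img src="x.jpg" onclick="alert(1)"/>'));
```

Because the rule keys on tag names alone, the img element with an onclick handler is left untouched.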
Let's see a real example:
const gfm = require('cmark-gfm-js');

/** Consider the following markdown
❌ <script>alert(1)</script>
❌ <img src="x.jpg" onclick="alert(1)"/>
✅ <img src="cool.jpg"/>
✅ <figcaption>caption</figcaption>
*/
const dangerous = '<script>alert(1)</script>\n<img src="x.jpg" onclick="alert(1)"/>\n<img src="cool.jpg"/>\n<figcaption>caption</figcaption>';

// GFM's tagfilter is enabled by default.
const tagfiltered = gfm.convert(dangerous);
console.log(tagfiltered);
/** Prints
&lt;script>alert(1)&lt;/script>
<img src="x.jpg" onclick="alert(1)"/>
<img src="cool.jpg"/>
<figcaption>caption</figcaption>
*/

// Do not use GFM's tagfilter, use cmark's SAFE option.
// gfm.convertUnsafe will disable GFM's tagfilter extension.
// `safeOption` is a placeholder for the library's SAFE option value;
// substitute the value listed under "cmark-gfm Options".
const cmarkSafe = gfm.convertUnsafe(dangerous, safeOption);
console.log(cmarkSafe);
/** Prints
<!-- raw HTML omitted -->
<!-- raw HTML omitted -->
*/
So neither built-in solution works perfectly. GFM's tagfilter cannot filter tags that carry malicious attributes, while cmark's SAFE option is overkill because it removes safe tags as well.
A Good HTML Sanitizer
If you want to sanitize HTML properly, I suggest ignoring both built-in cmark-gfm solutions above. Instead, output raw HTML with gfm.convertUnsafe and pass it through a dedicated HTML sanitizer. For example, ting:
const gfm = require('cmark-gfm-js');
const ting = require('ting'); // module name assumed from the library name

/** Dangerous markdown
❌ <script>alert(1)</script>
❌ <img src="x.jpg" onclick="alert(1)"/>
✅ <img src="cool.jpg"/>
✅ <figcaption>caption</figcaption>
*/
const dangerous = '<script>alert(1)</script>\n<img src="x.jpg" onclick="alert(1)"/>\n<img src="cool.jpg"/>\n<figcaption>caption</figcaption>';

const unsafeHTML = gfm.convertUnsafe(dangerous);
// The sanitize call below is an assumption; see examples/sanitizeHTML for the exact usage.
const safeHTML = ting.sanitize(unsafeHTML);

console.log(`Unsafe:\n${unsafeHTML}\nSafe:\n${safeHTML}`);
/** Prints
Unsafe:
<script>alert(1)</script>
<img src="x.jpg" onclick="alert(1)"/>
<img src="cool.jpg"/>
<figcaption>caption</figcaption>
Safe:
<img src="x.jpg" />
<img src="cool.jpg" />
<figcaption>caption</figcaption>
*/
See examples/sanitizeHTML for full source code.
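If you render markdown in several places, it may be worth wiring the two steps together once so that unsanitized HTML never escapes by accident. The sketch below is one possible shape, not part of cmark-gfm-js. makeRenderer is a hypothetical name, and both the converter and the sanitizer are passed in as plain functions, so the sketch makes no assumption about either library's API.

```javascript
// Hypothetical helper: compose "markdown -> raw HTML" with "raw HTML -> safe HTML".
// convertUnsafe: function (markdown: string) => string, e.g. a wrapper around gfm.convertUnsafe.
// sanitize: function (html: string) => string, e.g. a wrapper around your sanitizer of choice.
function makeRenderer(convertUnsafe, sanitize) {
  if (typeof convertUnsafe !== 'function' || typeof sanitize !== 'function') {
    throw new TypeError('makeRenderer expects two functions');
  }
  return function render(markdown) {
    // The raw HTML stays local to this function; callers only see sanitized output.
    const rawHTML = convertUnsafe(String(markdown));
    return sanitize(rawHTML);
  };
}

// Usage with stand-in functions (replace them with the real converter and sanitizer).
const render = makeRenderer(
  (md) => '<p>' + md + '</p>',
  (html) => html.replace(/<\/?script>/gi, '')
);
console.log(render('hello'));
```

Keeping the sanitizer as an injected function also makes it easy to swap sanitizers or to test the pipeline with stubs.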
cmark-gfm Options
Who is using cmark-gfm-js
Coldfunction uses cmark-gfm-js to generate HTML from markdown in the browser, and uses cmark-gfm in its backend services.