cmark-gfm-js

A port of GitHub's cmark fork (cmark-gfm) to JavaScript, built with Emscripten
Supports Node.js and the browser
GitHub Flavored Markdown (GFM) Compatibility
HTML Sanitization
Benchmarks
TypeScript friendly

Installation

Node.js

yarn add cmark-gfm-js

Browser

Download cmark-gfm.js

Usage

/**
 * convert converts a GitHub Flavored Markdown (GFM) string to HTML.
 */
function convert(markdown: string, options?: number): string;
 
/**
 * convertUnsafe calls convert with GFM's tagfilter extension disabled. (See "HTML Sanitization" below for details)
 */
function convertUnsafe(markdown: string, options?: number): string;

Examples

In Node.js:

const gfm = require('cmark-gfm-js');
 
const markdown = '# Hi\nThis ~text~~~~ is ~~~~curious 😡🙉🙈~.';
let html = gfm.convert(markdown);
console.log(html);
/** Prints: 
  <h1>Hi</h1>
  <p>This <del>text</del> is <del>curious 😡🙉🙈</del>.</p>
*/
 
// Specify an option
html = gfm.convert(markdown, gfm.Option.sourcePos);
console.log(html);
/** Prints
  <h1 data-sourcepos="1:1-1:4">Hi</h1>
  <p data-sourcepos="2:1-2:44">This <del>text</del> is <del>curious 😡🙉🙈</del>.</p>
*/

In browser:

<p id="text"></p>
<hr/>
<p id="html"></p>
<p id="htmlPreview"></p>
<script src="../dist/cmark-gfm.js"></script>
<script>
  // typeof avoids a ReferenceError when the script failed to load.
  if (typeof CmarkGFM === 'undefined') {
    document.getElementById('text').textContent = 'window.CmarkGFM not defined. Please build the project and refresh this page.';
  } else {
    var markdown = '# Hi\nThis ~text~~~~ is ~~~~curious 😡🙉🙈~.';
    var html = CmarkGFM.convert(markdown);
    
    document.getElementById('text').innerHTML = 'Markdown (GFM): <p><code>' + markdown + '</code></p>';
    document.getElementById('html').innerHTML = html;
 
    // Specify an option
    var htmlWithSourcePos = CmarkGFM.convert(markdown, CmarkGFM.Option.sourcePos);
    document.getElementById('htmlPreview').textContent = htmlWithSourcePos;
  }
</script>

GFM Compatibility

Task list items are not supported (see the linked issue). Use emoji instead, for example:

✅ Done.
❌ To be done.

HTML Sanitization

Some background

TL;DR: See A Good HTML Sanitizer below for a working example of an HTML sanitizer.

The current CommonMark Spec (0.27) allows raw HTML tags in markdown but says nothing about sanitizing raw HTML. cmark-gfm ships with two built-in (but imperfect) solutions:

cmark comes with a SAFE option, which suppresses most raw HTML tags (see cmark-gfm Options below). Drawbacks: many safe tags are removed too, and it is not configurable.
cmark-gfm comes with an extension called tagfilter, which is defined in the GFM Spec (see spec) and filters a fixed set of HTML tags. Drawbacks: it cannot filter tags with malicious attributes, and it is not configurable.

Let's see a real example:

const gfm = require('cmark-gfm-js');
 
/** Consider the following markdown
  ❌ <script>alert(1)</script>
  ❌ <img src="x.jpg" onclick="alert(1)"/>
  ✅ <img src="cool.jpg"/>
  ✅ <figcaption>caption</figcaption>
*/
const dangerous = '<script>alert(1)</script>\n<img src="x.jpg" onclick="alert(1)"/>\n<img src="cool.jpg"/>\n<figcaption>caption</figcaption>';
 
// GFM's tagfilter is enabled by default.
const tagfiltered = gfm.convert(dangerous);
console.log(tagfiltered);
/** Prints
  &lt;script>alert(1)&lt;/script>
  <img src="x.jpg" onclick="alert(1)"/>
  <img src="cool.jpg"/>
  <figcaption>caption</figcaption>
*/
 
// Do not use GFM's tagfilter, use cmark's SAFE option.
// gfm.convertUnsafe will disable GFM's tagfilter extension.
const cmarkSafe = gfm.convertUnsafe(dangerous, gfm.Option.safe);
console.log(cmarkSafe);
/** Prints
  <!-- raw HTML omitted -->
  <!-- raw HTML omitted -->
*/

So neither of the above solutions works perfectly. GFM's tagfilter cannot filter tags with malicious attributes, while cmark's SAFE option is overkill.
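To make the tagfilter limitation concrete, here is a rough sketch of what such a filter does. It is not cmark-gfm's actual code: the function name tagfilterSketch is hypothetical, the tag list is taken from the "Disallowed Raw HTML" section of the GFM Spec, and the matching rule is a simplification. The point is that the filter only looks at tag names, so an allowed tag with a dangerous attribute passes through untouched.

```javascript
// Tag names listed in the GFM Spec's "Disallowed Raw HTML (extension)" section.
const FILTERED_TAGS = [
  'title', 'textarea', 'style', 'xmp', 'iframe',
  'noembed', 'noframes', 'script', 'plaintext',
];

// Match a '<' that opens (or closes) one of the filtered tags.
// Simplified assumption: the tag name must be followed by whitespace,
// '/', '>' or the end of the input.
const TAGFILTER_RE = new RegExp(
  '<(?=/?(?:' + FILTERED_TAGS.join('|') + ')(?:[\\s/>]|$))',
  'gi'
);

// Hypothetical helper: escape the leading '<' of every filtered tag.
function tagfilterSketch(html) {
  return html.replace(TAGFILTER_RE, '&lt;');
}

const filtered = tagfilterSketch('<script>alert(1)</script>');
const untouched = tagfilterSketch('<img src="x.jpg" onclick="alert(1)"/>');

console.log(filtered);  // the script tag is neutralized
console.log(untouched); // the onclick attribute survives
```

Because attributes are never inspected, a filter of this kind cannot be tuned to catch onclick and similar handlers.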

A Good HTML Sanitizer

If you want to sanitize HTML properly, I suggest ignoring the built-in solutions from cmark-gfm altogether: output raw HTML with gfm.convertUnsafe and pass it through a dedicated HTML sanitizer, for example ting:

const gfm = require('cmark-gfm-js');
const ting = require('ting');
 
/** Dangerous markdown
  ❌ <script>alert(1)</script>
  ❌ <img src="x.jpg" onclick="alert(1)"/>
  ✅ <img src="cool.jpg"/>
  ✅ <figcaption>caption</figcaption>
*/
const dangerous = '<script>alert(1)</script>\n<img src="x.jpg" onclick="alert(1)"/>\n<img src="cool.jpg"/>\n<figcaption>caption</figcaption>';
 
const unsafeHTML = gfm.convertUnsafe(dangerous);
const safeHTML = ting.sanitize(unsafeHTML);
 
console.log(`Unsafe:\n${unsafeHTML}\nSafe:\n${safeHTML}`);
/** Prints 
  Unsafe:
  <script>alert(1)</script>
  <img src="x.jpg" onclick="alert(1)"/>
  <img src="cool.jpg"/>
  <figcaption>caption</figcaption>
 
  Safe:
  <img src="x.jpg" />
  <img src="cool.jpg" />
  <figcaption>caption</figcaption>
*/

See examples/sanitizeHTML for full source code.
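If you use this convert-then-sanitize pattern in several places, one option is to wrap both steps in a single function so the unsanitized HTML never leaves it. The sketch below is only an illustration: makeRenderer is a hypothetical helper, not part of cmark-gfm-js, and the two stand-in functions exist only so the snippet runs without any packages installed.

```javascript
// Hypothetical helper: builds a render function that always sanitizes.
function makeRenderer(convertUnsafe, sanitize) {
  if (typeof convertUnsafe !== 'function' || typeof sanitize !== 'function') {
    throw new TypeError('convertUnsafe and sanitize must be functions');
  }
  return function render(markdown, options) {
    // The raw HTML is passed straight to the sanitizer and never returned.
    return sanitize(convertUnsafe(markdown, options));
  };
}

// Stand-ins for illustration only; they do NOT convert or sanitize anything.
const fakeConvertUnsafe = (markdown) => '<p>' + markdown + '</p>\n';
const fakeSanitize = (html) => '[sanitized]' + html;

const render = makeRenderer(fakeConvertUnsafe, fakeSanitize);
const rendered = render('hello');
console.log(rendered);
```

With the real packages, the same helper would be called as makeRenderer(gfm.convertUnsafe, ting.sanitize), using the two functions shown in the example above.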

cmark-gfm Options

enum Option {
  /**
   * ### Options affecting rendering
   */
 
  /** Include a `data-sourcepos` attribute on all block elements. */
  sourcePos = (1 << 1),
 
  /** Render `softbreak` elements as hard line breaks.
   */
  softBreak = (1 << 2),
 
  /** Suppress raw HTML and unsafe links (`javascript:`, `vbscript:`,
   * `file:`, and `data:`, except for `image/png`, `image/gif`,
   * `image/jpeg`, or `image/webp` mime types).  Raw HTML is replaced
   * by a placeholder HTML comment. Unsafe links are replaced by
   * empty strings.
   */
  safe = (1 << 3),
 
  /** Render `softbreak` elements as spaces.
   */
  noBreaks = (1 << 4),
 
  /**
   * ### Options affecting parsing
   */
 
  /** Legacy option (no effect).
   */
  normalize = (1 << 8),
 
  /** Validate UTF-8 in the input before parsing, replacing illegal
   * sequences with the replacement character U+FFFD.
   */
  validateUTF8 = (1 << 9),
 
  /** Convert straight quotes to curly, --- to em dashes, -- to en dashes.
   */
  smart = (1 << 10),
 
  /** Use GitHub-style <pre lang="x"> tags for code blocks instead of <pre><code
   * class="language-x">.
   */
  githubPreLang = (1 << 11),
 
  /** Be liberal in interpreting inline HTML tags.
   */
  liberalHTMLTag = (1 << 12),
 
  /** Parse footnotes.
   */
  footnotes = (1 << 13),
 
  /** Only parse strikethroughs if surrounded by exactly 2 tildes.
   * Gives some compatibility with redcarpet.
   */
  strikethroughDoubleTilde = (1 << 14),
 
  /** Use style attributes to align table cells instead of align attributes.
   */
  tablePreferStyleAttributes = (1 << 15),
  
  /** Default options: currently the same as tablePreferStyleAttributes.
   */
  default = tablePreferStyleAttributes,
}
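Each option is a distinct bit, and convert takes a single number, so it is reasonable to assume that several options can be combined with bitwise OR, as in cmark's C API. The README does not state this explicitly, so treat it as an assumption. The sketch below mirrors a few enum values locally so it runs without the package; hasOption is a hypothetical helper.

```javascript
// Local mirror of a few values from the Option enum above,
// so the snippet runs without the package installed.
const Option = {
  sourcePos: 1 << 1,
  safe: 1 << 3,
  smart: 1 << 10,
  footnotes: 1 << 13,
  tablePreferStyleAttributes: 1 << 15,
};

// Combine several flags into one number with bitwise OR.
const combined = Option.sourcePos | Option.smart | Option.footnotes;

// Hypothetical helper: check whether a flag is set in a combined value.
function hasOption(options, flag) {
  return (options & flag) === flag;
}

console.log(combined);                          // 2 + 1024 + 8192
console.log(hasOption(combined, Option.smart)); // smart is part of the mix
console.log(hasOption(combined, Option.safe));  // safe was not requested
```

Under that assumption, the combined value would be passed as the second argument, for example gfm.convert(markdown, gfm.Option.sourcePos | gfm.Option.smart).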

Who is using cmark-gfm-js

Coldfunction uses cmark-gfm-js to generate HTML from markdown in the browser, and uses cmark-gfm in its backend services.
