remark-rehype

remark plugin that turns markdown into HTML to support rehype.

What is this?

This package is a unified (remark) plugin that switches from remark (the markdown ecosystem) to rehype (the HTML ecosystem). It does this by transforming the current markdown (mdast) syntax tree into an HTML (hast) syntax tree. remark plugins deal with mdast and rehype plugins deal with hast, so plugins used after remark-rehype have to be rehype plugins.

The reason that there are different ecosystems for markdown and HTML is that turning markdown into HTML is, while frequently needed, not the only purpose of markdown. Checking (linting) and formatting markdown are also common use cases for remark and markdown. There are several aspects of markdown that do not translate 1-to-1 to HTML. In some cases markdown contains more information than HTML: for example, there are several ways to add a link in markdown (autolinks: <https://url>, resource links: [label](url), and reference links with definitions: [label][id] and [id]: url). In other cases HTML contains more information than markdown: HTML has many tags that add new meaning (semantics) and have no markdown equivalent. If there were just one AST, it would be quite hard to perform the tasks that several remark and rehype plugins currently do.

unified is a project that transforms content with abstract syntax trees (ASTs). remark adds support for markdown to unified. rehype adds support for HTML to unified. mdast is the markdown AST that remark uses. hast is the HTML AST that rehype uses. This is a remark plugin that transforms mdast into hast to support rehype.
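
To make the difference between the two trees more concrete, here is a rough sketch of an mdast tree for the markdown `# Hello *world*` and a hand-written converter that produces a matching hast tree. The `toHast` function is a hypothetical, heavily simplified stand-in written for this illustration; the real conversion is done by mdast-util-to-hast, which handles many more node types and details.

```javascript
// mdast: what remark produces for `# Hello *world*` (positions omitted).
const mdast = {
  type: 'root',
  children: [
    {
      type: 'heading',
      depth: 1,
      children: [
        {type: 'text', value: 'Hello '},
        {type: 'emphasis', children: [{type: 'text', value: 'world'}]}
      ]
    }
  ]
}

// Hypothetical toy converter, NOT the library's implementation.
function toHast(node) {
  // Convert children first, so every case below can reuse them.
  const children = (node.children || []).map((child) => toHast(child))

  switch (node.type) {
    case 'root':
      return {type: 'root', children}
    case 'heading':
      // Markdown stores the rank as `depth`; HTML encodes it in the tag name.
      return {type: 'element', tagName: 'h' + node.depth, properties: {}, children}
    case 'emphasis':
      return {type: 'element', tagName: 'em', properties: {}, children}
    case 'text':
      return {type: 'text', value: node.value}
    default:
      throw new Error('Node type not covered by this sketch: ' + node.type)
  }
}

// hast: the kind of tree that rehype plugins work on.
const hast = toHast(mdast)
```

The resulting `hast` tree holds an `h1` element that contains a text node and an `em` element. This is why plugins used after remark-rehype have to understand elements and properties rather than headings and emphasis.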

When should I use this?

This project is useful when you want to turn markdown to HTML. It opens up a whole new ecosystem with tons of plugins to do all kinds of things. You can minify HTML, format HTML, make sure it’s safe, highlight code, add metadata, and a lot more.

A different plugin, rehype-raw, adds support for raw HTML written inside markdown. This is a separate plugin because supporting HTML inside markdown is a heavy task and not always needed. To use both together, you also have to configure remark-rehype with allowDangerousHtml: true.

The rehype plugin rehype-remark does the inverse of this plugin. It turns HTML into markdown.

Install

This package is ESM only. In Node.js (version 12.20+, 14.14+, or 16.0+), install with npm:

npm install remark-rehype

In Deno with Skypack:

import remarkRehype from 'https://cdn.skypack.dev/remark-rehype@10?dts'

In browsers with Skypack:

<script type="module">
  import remarkRehype from 'https://cdn.skypack.dev/remark-rehype@10?min'
</script>

Use

Say we have the following file example.md:

# Hello world

> Block quote.

Some _emphasis_, **importance**, and `code`.

And our module example.js looks as follows:

import {read} from 'to-vfile'
import {reporter} from 'vfile-reporter'
import {unified} from 'unified'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
import rehypeDocument from 'rehype-document'
import rehypeFormat from 'rehype-format'
import rehypeStringify from 'rehype-stringify'

main()

async function main() {
  const file = await unified()
    .use(remarkParse)
    .use(remarkRehype)
    .use(rehypeDocument)
    .use(rehypeFormat)
    .use(rehypeStringify)
    .process(await read('example.md'))

  console.error(reporter(file))
  console.log(String(file))
}

Now, running node example.js yields:

example.md: no issues found
<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>example</title>
    <meta name="viewport" content="width=device-width, initial-scale=1">
  </head>
  <body>
    <h1>Hello world</h1>
    <blockquote>
      <p>Block quote.</p>
    </blockquote>
    <p>Some <em>emphasis</em>, <strong>importance</strong>, and <code>code</code>.</p>
  </body>
</html>

API

This package exports defaultHandlers, all, and one. The default export is remarkRehype.

unified().use(remarkRehype[, destination][, options])

Plugin that turns markdown into HTML to support rehype.

destination

If a Unified destination processor is given, that processor runs with a new HTML (hast) tree (bridge-mode). As the given processor runs with a hast tree, and rehype plugins support hast, that means rehype plugins can be used with the given processor. The hast tree is discarded in the end.

👉 Note: It’s highly unlikely that you want to do this.

options

Configuration (optional).

options.allowDangerousHtml

Whether to persist raw HTML in markdown in the hast tree (boolean, default: false). Raw HTML is available in the markdown (mdast) tree as html nodes and can be embedded in the HTML (hast) tree as semistandard raw nodes. Most rehype plugins ignore raw nodes, but two notable plugins don’t:

  • rehype-stringify also has an option allowDangerousHtml which will output the raw HTML. This is typically discouraged as noted by the option name but is useful if you completely trust who authors the markdown
  • rehype-raw can handle the raw embedded HTML strings in hast trees by parsing them into standard hast nodes (element, text, etc). This is a heavy task as it needs a full HTML parser, but it is the only way to support untrusted content
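
The effect of this option on a single node can be sketched as follows. `convertHtmlNode` is a hypothetical helper written for this illustration, not a function exported by this package; it only models the behavior described above, where an mdast html node is either dropped or kept as a semistandard raw node.

```javascript
// Hypothetical model of what happens to one mdast `html` node.
function convertHtmlNode(node, options = {}) {
  if (options.allowDangerousHtml) {
    // Kept as a `raw` node; the HTML is still an unparsed string here.
    return {type: 'raw', value: node.value}
  }

  // Default: the raw HTML does not make it into the hast tree.
  return undefined
}

const htmlNode = {type: 'html', value: '<img onerror="alert(1)">'}

const dropped = convertHtmlNode(htmlNode)
const kept = convertHtmlNode(htmlNode, {allowDangerousHtml: true})
```

Note that `kept.value` is the untouched string, including the onerror attribute. Nothing has parsed or cleaned it yet, which is why a later step (rehype-raw followed by sanitizing, or a fully trusted author) is needed.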

options.clobberPrefix

Prefix to use before the id attribute on footnotes to prevent it from clobbering (string, default: 'user-content-'). DOM clobbering is this:

<p id=x></p>
<script>alert(x) // `x` now refers to the DOM `p#x` element</script>

Elements by their ID are made available by browsers on the window object, which is a security risk. Using a prefix solves this problem.

👉 Note: this option affects footnotes. Footnotes are not specified by CommonMark so they’re not supported in remark by default. They are supported by GitHub, so they can be enabled by using the remark plugin remark-gfm.
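
As a small illustration of how the prefix ends up in the output, the sketch below builds the two ids used for a footnote. The helper `footnoteIds` and the property names `definitionId` and `callId` are hypothetical and exist only in this sketch; the `fn-` and `fnref-` parts are taken from the footnote output shown in the examples in this readme.

```javascript
// Documented default of `options.clobberPrefix`.
const defaultClobberPrefix = 'user-content-'

// Hypothetical helper: builds the ids for one footnote identifier.
function footnoteIds(identifier, clobberPrefix = defaultClobberPrefix) {
  return {
    // Id on the footnote definition (the list item in the footnotes section).
    definitionId: clobberPrefix + 'fn-' + identifier,
    // Id on the footnote call (the superscript link in the content).
    callId: clobberPrefix + 'fnref-' + identifier
  }
}

const ids = footnoteIds('1')
// ids.definitionId is 'user-content-fn-1'
// ids.callId is 'user-content-fnref-1'

// A custom prefix, as could be passed with `{clobberPrefix: 'note-'}`.
const customIds = footnoteIds('1', 'note-')
// customIds.definitionId is 'note-fn-1'
```

Because every id starts with the prefix, a footnote labelled, say, `alert` cannot shadow `window.alert`.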

options.footnoteLabel

Label to use for the footnotes section (string, default: 'Footnotes'). Affects screen readers. Change it when the markdown is not in English.

👉 Note: this option affects footnotes. Footnotes are not specified by CommonMark so they’re not supported in remark by default. They are supported by GitHub, so they can be enabled by using the remark plugin remark-gfm.

options.footnoteBackLabel

Label to use from backreferences back to their footnote call (string, default: 'Back to content'). Affects screen readers. Change it when the markdown is not in English.

👉 Note: this option affects footnotes. Footnotes are not specified by CommonMark so they’re not supported in remark by default. They are supported by GitHub, so they can be enabled by using the remark plugin remark-gfm.

options.handlers

This option is a bit advanced as it requires knowledge of ASTs, so we defer to the documentation available in mdast-util-to-hast.

options.passThrough

This option is a bit advanced as it requires knowledge of ASTs, so we defer to the documentation available in mdast-util-to-hast.

options.unknownHandler

This option is a bit advanced as it requires knowledge of ASTs, so we defer to the documentation available in mdast-util-to-hast.
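
For readers who want a rough mental model before reading those docs, the following toy dispatcher sketches how the three options above could interact for a single node. Everything in it is an assumption made for illustration: `convertNode` is hypothetical, the toy handlers take only the node (real handlers receive more arguments), the lookup order shown is our reading of the utility rather than a documented guarantee, and the `spoiler` and `customWidget` node types are made up.

```javascript
// Toy model only, NOT the implementation in mdast-util-to-hast.
function convertNode(node, options = {}) {
  const handlers = options.handlers || {}
  const passThrough = options.passThrough || []

  // 1. A handler registered for this node type wins.
  if (Object.prototype.hasOwnProperty.call(handlers, node.type)) {
    return handlers[node.type](node)
  }

  // 2. Node types listed in `passThrough` are kept as they are.
  if (passThrough.includes(node.type)) {
    return node
  }

  // 3. Anything else goes to `unknownHandler`, if one was given.
  if (typeof options.unknownHandler === 'function') {
    return options.unknownHandler(node)
  }

  // The real utility has a built-in fallback here; this sketch gives up.
  return undefined
}

const options = {
  handlers: {
    // Hypothetical `spoiler` node turned into a `details` element.
    spoiler: (node) => ({
      type: 'element',
      tagName: 'details',
      properties: {},
      children: [{type: 'text', value: node.value}]
    })
  },
  passThrough: ['customWidget'],
  unknownHandler: (node) => ({type: 'comment', value: 'unknown: ' + node.type})
}

const handled = convertNode({type: 'spoiler', value: 'secret'}, options)
const passed = convertNode({type: 'customWidget', name: 'chart'}, options)
const fallback = convertNode({type: 'mystery'}, options)
```

In this model, `handled` is a `details` element, `passed` is the original node unchanged, and `fallback` is a comment node.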

defaultHandlers

The defaultHandlers export from mdast-util-to-hast, useful when passing in your own handlers.

all

The all export from mdast-util-to-hast, useful when passing in your own handlers.

one

The one export from mdast-util-to-hast, useful when passing in your own handlers.

Examples

Example: supporting HTML in markdown naïvely

If you completely trust the authors of the input markdown and want to allow them to write HTML inside markdown, you can pass allowDangerousHtml to this plugin (remark-rehype) and rehype-stringify:

import {unified} from 'unified'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
import rehypeStringify from 'rehype-stringify'

main()

async function main() {
  const file = await unified()
    .use(remarkParse)
    .use(remarkRehype, {allowDangerousHtml: true})
    .use(rehypeStringify, {allowDangerousHtml: true})
    .process('It <i>works</i>! <img onerror="alert(1)">')

  console.log(String(file))
}

Running that code yields:

<p>It <i>works</i>! <img onerror="alert(1)"></p>

⚠️ Danger: Observe that the XSS attack through the onerror attribute is still present.

Example: supporting HTML in markdown properly

If you do not trust the authors of the input markdown, or if you want to make sure that rehype plugins can see HTML embedded in markdown, use rehype-raw. The following example passes allowDangerousHtml to this plugin (remark-rehype), then turns the raw embedded HTML into proper HTML nodes (rehype-raw), and finally sanitizes the HTML by only allowing safe things (rehype-sanitize):

import {unified} from 'unified'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
import rehypeRaw from 'rehype-raw'
import rehypeSanitize from 'rehype-sanitize'
import rehypeStringify from 'rehype-stringify'

main()

async function main() {
  const file = await unified()
    .use(remarkParse)
    .use(remarkRehype, {allowDangerousHtml: true})
    .use(rehypeRaw)
    .use(rehypeSanitize)
    .use(rehypeStringify)
    .process('It <i>works</i>! <img onerror="alert(1)">')

  console.log(String(file))
}

Running that code yields:

<p>It <i>works</i>! <img></p>

👉 Note: Observe that the XSS attack through the onerror attribute is no longer present.

Example: footnotes in languages other than English

If you know that the markdown is authored in a language other than English, and you’re using remark-gfm to match how GitHub renders markdown, and you know that footnotes are (or can be) used, you should translate the labels associated with them.

Let’s first set the stage:

import {unified} from 'unified'
import remarkParse from 'remark-parse'
import remarkGfm from 'remark-gfm'
import remarkRehype from 'remark-rehype'
import rehypeStringify from 'rehype-stringify'

main()

async function main() {
  const file = await unified()
    .use(remarkParse)
    .use(remarkGfm)
    .use(remarkRehype)
    .use(rehypeStringify)
    .process('Hallo[^1]\n\n[^1]: Wereld!')

  console.log(String(file))
}

Running that code yields:

<p>Hallo<sup><a href="#user-content-fn-1" id="user-content-fnref-1" data-footnote-ref aria-describedby="footnote-label">1</a></sup></p>
<section data-footnotes class="footnotes"><h2 id="footnote-label" class="sr-only">Footnotes</h2>
<ol>
<li id="user-content-fn-1">
<p>Wereld! <a href="#user-content-fnref-1" data-footnote-backref class="data-footnote-backref" aria-label="Back to content">↩</a></p>
</li>
</ol>
</section>

This is a mix of English and Dutch that screen readers can’t handle nicely. Let’s say our program does know that the markdown is in Dutch. In that case, it’s important to translate and define the labels relating to footnotes so that screen readers can properly pronounce the page:

@@ -10,7 +10,7 @@ async function main() {
   const file = await unified()
     .use(remarkParse)
     .use(remarkGfm)
-    .use(remarkRehype)
+    .use(remarkRehype, {footnoteLabel: 'Voetnoten', footnoteBackLabel: 'Terug'})
     .use(rehypeStringify)
     .process('Hallo[^1]\n\n[^1]: Wereld!')

Running the code with the above patch applied yields:

@@ -1,8 +1,8 @@
 <p>Hallo<sup><a href="#user-content-fn-1" id="user-content-fnref-1" data-footnote-ref aria-describedby="footnote-label">1</a></sup></p>
-<section data-footnotes class="footnotes"><h2 id="footnote-label" class="sr-only">Footnotes</h2>
+<section data-footnotes class="footnotes"><h2 id="footnote-label" class="sr-only">Voetnoten</h2>
 <ol>
 <li id="user-content-fn-1">
-<p>Wereld! <a href="#user-content-fnref-1" data-footnote-backref class="data-footnote-backref" aria-label="Back to content">↩</a></p>
+<p>Wereld! <a href="#user-content-fnref-1" data-footnote-backref class="data-footnote-backref" aria-label="Terug">↩</a></p>
 </li>
 </ol>
 </section>

Syntax tree

A frequent problem arises when having to turn one syntax tree into another. As the original tree (in this case, mdast for markdown) is in some cases limited compared to the destination (in this case, hast for HTML) tree, is it possible to provide more info in the original to define what the result will be in the destination? This is possible by defining data on mdast nodes, which this plugin will read as instructions on what hast nodes to create.

An example is remark-math, which defines semistandard math nodes that this plugin doesn’t understand. To solve this, remark-math defines instructions on mdast nodes that this plugin does understand because they define a certain hast structure.

As these instructions are somewhat advanced in that they require knowledge of ASTs, we defer to the documentation available in the low-level utility we use: mdast-util-to-hast.
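
As a taste of what such instructions look like, here is a minimal sketch. The field names hName, hProperties, and hChildren are the ones listed in the Security section of this readme. The `fancyMath` node type, the class name, and the `elementFromData` helper are hypothetical, and the merging logic is a simplification rather than the utility’s actual code.

```javascript
// An mdast node of a made-up type that carries instructions in `data`,
// loosely modelled on what a plugin such as remark-math might attach.
const mathNode = {
  type: 'fancyMath',
  value: 'x^2',
  data: {
    hName: 'span', // tag name of the element to create
    hProperties: {className: ['math']}, // properties to set on it
    hChildren: [{type: 'text', value: 'x^2'}] // hast children to use
  }
}

// Hypothetical helper: builds a hast element from the `data` fields alone.
function elementFromData(node, fallbackTagName = 'div') {
  const data = node.data || {}

  return {
    type: 'element',
    tagName: data.hName || fallbackTagName,
    properties: {...data.hProperties},
    children: data.hChildren || []
  }
}

const element = elementFromData(mathNode)
// element.tagName is 'span'
// element.properties.className is ['math']
```

The point of the sketch is that the plugin creating the node decides what HTML it maps to, so this plugin does not need to know anything about the `fancyMath` type itself.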

CSS

Assuming you know how to use (semantic) HTML and CSS, it should generally be straightforward to style the HTML produced by this plugin. With CSS, you can get creative and style the results as you please.

Some semistandard features, notably remark-gfm’s task lists and footnotes, generate HTML that can be unintuitive, as it matches exactly what GitHub produces for their website. There is a project, sindresorhus/github-markdown-css, that exposes the stylesheet that GitHub uses for rendered markdown, which might either be inspirational for more complex features, or can be used as-is to exactly match how GitHub styles rendered markdown.

Types

This package is fully typed with TypeScript. It exports Options and Processor types, which specify the interfaces of the accepted options.
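
In a plain JavaScript project, one way to benefit from these types is a JSDoc annotation, as in the sketch below. It assumes that TypeScript checks your JavaScript (for example through `checkJs` or a `// @ts-check` comment), which is a setup choice on your side and not something this package requires. The values shown are the defaults documented in the API section above.

```javascript
// The annotation is only a comment at runtime; a type checker can use it
// to verify the option names and value types.
/** @type {import('remark-rehype').Options} */
const options = {
  allowDangerousHtml: false,
  clobberPrefix: 'user-content-',
  footnoteLabel: 'Footnotes',
  footnoteBackLabel: 'Back to content'
}
```

The object can then be passed to the plugin as `.use(remarkRehype, options)`.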

Compatibility

Projects maintained by the unified collective are compatible with all maintained versions of Node.js. As of now, that is Node.js 12.20+, 14.14+, and 16.0+. Our projects sometimes work with older versions, but this is not guaranteed.

This plugin works with unified version 6+, remark-parse version 3+ (used in remark version 7), and rehype-stringify version 3+ (used in rehype version 5).

Security

Use of remark-rehype can open you up to a cross-site scripting (XSS) attack. Embedded hast properties (hName, hProperties, hChildren) in mdast, custom handlers, and the allowDangerousHtml option all provide openings. Use rehype-sanitize to make the tree safe.

Related

Contribute

See contributing.md in remarkjs/.github for ways to get started. See support.md for ways to get help.

This project has a code of conduct. By interacting with this repository, organization, or community you agree to abide by its terms.

License

MIT © Titus Wormer