Markdown-it - high speed, pluggable parser for JS, with CommonMark support

This thread tracks the current state of the parser, developed by me & Alex.

Why it’s done

  1. Because we needed an easy way to modify syntax & output. Almost all parsers I know have hardcoded logic.
  2. Because I wanted to improve my skills in writing high-speed JS, and this project was a good fit for it.

Key differences from the reference parser

  • Focused on producing safe HTML out of the box in one step, instead of a universal data abstraction layer.
  • Token stream instead of AST. Simpler (IMHO), and enough to support local transformation filters.
    • It’s still technically possible to convert the data into a CommonMark AST, but there is no reason to do it in core.
  • HTML disabled by default. Any feature can be added via plugins in a secure way.
    • We can’t run the OWASP validator in the browser, so it’s simpler to disable HTML entirely.
    • But it’s not a big problem. In 99% of cases people use HTML because it’s hard to modify the parser. markdown-it has no such problem. You will be able to do everything without HTML, in a secure way.
  • Source mapping limited to line numbers only (no time, not needed now for our tasks).
  • High speed (it’s actually faster than all available JS parsers for Markdown).
  • Very easy to modify. Everything!

With default settings the parser is closer to GitHub’s, and tends to support the most demanded features not yet covered by the CM spec. But it still has a strict CM mode, and the existing syntax is periodically synchronized with CM on spec updates.

This parser is not better or worse than the reference one. It has different goals, different priorities, a different architecture and so on. Promoting a new standard is not our primary goal, but we like to follow the CM spec.

PS. Of course, tons of thanks to @jgm for his outstanding efforts in writing the CommonMark spec.

Started splitting rarely used things like footnotes into separate plugins. It helps to verify that the existing architecture is flexible enough, and to fix issues.

First example of this progress: the emoji plugin. Already available in the demo.

25 posts were split to a new topic: Remarkable vs. Markdown-it

markdown-it is now included in Babelmark2.

Let’s return to development, as was naively expected in the first post :smile:

markdown-it 2.2.0 released. Quick summary of the 2.x changes:

  • CommonMark spec conformance updated to v0.13. Now all tests pass.
  • Added API docs and more examples.
  • API simplification:
    • new is no longer required
    • chainable configuration methods
    • helpers for bulk enabling/disabling of rules
    • new “zero” preset for “opt-in” configuration
  • Plugin interface polish (checked on markdown-it-emoji).
  • Various fixes.
  • Demo reworked; it can now be built with plugins (the current one already includes emoji).

The plan for 3.0 is to polish the plugin system, split several rules out into plugins, and apply some postponed breaking changes.

3.0.0 released

  • Spec v0.15, all tests pass.
  • Fixed regexp hangs on some HTML markup, discussed in the CM tracker (#263, #267).
  • All features outside of [ CM + tables + strike ] moved to plugins.
    • that should help reduce the number of possible collisions with the spec in the future.
    • that helped to detect possible issues for plugin writers.
    • each plugin is a good demo in itself.
  • Dropped the full preset. Currently supported presets are zero, commonmark and default.
  • Polished the location of some rules (core -> block, better data flow) & their names.
  • Other changes.

This release contains some incompatible internal changes, but they should not be noticeable at the top level unless you used the full preset. If you used full, just drop it and load the ‘missing’ features via use().

4.0.0 released

The renderer and token format were changed in v4.0.0 to allow manipulation of tag attributes and to unify the renderer logic. Plugins should be updated.

Update details are described in the changelog, but I’d like to summarize some important things:

  • linkify-it is now used to autodetect links (since v3.1.0).
  • Improved link encoding & normalization:
    • IDNA encode/decode for domains.
    • percent-decode for link texts.
    • mdurl is used to process links.
  • CM spec v0.18 conformance (with additional fix for #12).

Also, as a bonus: a custom container plugin for defining your own block wrappers in fenced style.

Performance

Due to the renderer refactoring, there is a speed regression of ~20% in node v0.10, and up to 50% in node v0.12 and io.js v1.5.0. Note that some of the speed loss is caused by V8 in the new node versions. There are things that could be made faster but, honestly, I don’t want to spend much time on it, because the overall speed is good enough.

As you can see in the tracker, we have solved all issues except some minor ones related to the current spec state. We have done everything we needed, and have to move on to other projects. Of course, we will keep supporting future spec conformance and bugfixes.

In regards to the custom container plugin

I checked the GitHub page for examples, but I couldn’t find any instances where the source wasn’t specifying the default block-element <div>.

Your example:

::: warning
*here be dragons*
:::

Gets parsed to:

<div class="warning">
<em>here be dragons</em>
</div>

But what if I wanted to do something besides a <div>, like, say, a custom <h1>:

::: h1.postTitle
Star Wars is The Best
:::

Getting parsed to:

<h1 class="postTitle">Star Wars is The Best</h1>

Or is that getting so complicated that I might as well just write pure HTML?


PS

What is the required syntax here?

var md = require('markdown-it')()
.use(require('markdown-it-container'), name [, options]);

I tried .use(require('markdown-it-container'), name); but that threw an error in my node.js app.

You can override the default renderer: https://github.com/markdown-it/markdown-it-container/blob/master/index.js#L29.

For call examples, see the tests.

Are you talking about this page? If so, there are only examples of the default <div> getting parsed.

Also, since I’m not a JavaScript expert, I’m not sure what to put here for default settings:

var md = require('markdown-it')()
.use(require('markdown-it-container'), name [, options]);

Again, I tried

.use(require('markdown-it-container'), name);

And got this error:

And when I tried

.use(require('markdown-it-container'));

I got no parsing of ::: blah ::: to <div>

:frowning:

Do you maintain a list of high profile projects that use markdown-it? I noticed that the pretty trendy ProseMirror project was using it. Now I’m curious to know what other projects might be using your library.

Edit: I guess the list of dependents on npmjs.com has some decent coverage:

https://www.npmjs.com/browse/depended/markdown-it

Released 5.0.0, conforming to the CommonMark 0.22 spec.

The version bump was caused by some incompatible internal changes (which can break external plugins). The public API is left intact. See the migration info for details.

If you are a plugin author and have trouble updating your plugin, create an issue in the tracker with details; Alex and I will try to help. Also see the list of our updated plugins and the src changes.

Just for info: markdown-it is now at 8.0.0 and follows the CM 0.26 spec. It’s stable and works fine with the current features.

The next major improvement could be sourcemap support, but that requires:

  • AST design
  • a major rewrite (token stream to AST, plugins, AST API)
  • a more stable CM spec (desired), because it can affect some algorithms very much

We have no plans to do it in the near future, because it requires a lot of time. But if anyone is interested, we could combine efforts to “split costs”. Feel free to contact me if you have some ideas about it.
