[ANN] Cmarkit 0.1.0 – CommonMark parser and renderer for OCaml

dbuenzli · April 6, 2023, 11:37pm

Hello,

It’s my pleasure to announce the first release of the Cmarkit OCaml library (API docs), which parses and renders CommonMark documents.

It’s likely unexciting for members of this forum, so I’d just like to take this opportunity to thank @jgm and the other participants for their answers to my (perhaps silly :–) specification misunderstandings. I also liked @mity’s own description and implementation of multipass inline parsing, which I personally found more palatable than the one described in the specification (I adopted a similar strategy, albeit a more explicit one: a list of tokens that is gradually transformed into inlines).

Other than that these features could be of interest:

The supported extensions in non-strict parsing mode are: strikethrough, LaTeX math, task items, (djot) pipe tables, (djot) footnotes.
The library has a notion of label resolver that can be used to alter label definitions and references during parsing. Notably, this lets you create synthetic label definitions for undefined label references (rather than having them end up as plain text, as in standard CommonMark), which you can process later in the abstract syntax tree. This lets you treat the very liberal link label syntax as your own DSL: for data binding, for referencing program identifiers, for treating link syntax as a generalized span, etc.
The abstract syntax tree takes care to preserve (on demand) the original document layout in dedicated layout fields, so that documents can be rendered back to CommonMark without normalizing them too much. You can read about the approach and its limitations here.
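
To make the first two points concrete, here is a minimal sketch. It assumes the `cmarkit` opam package is installed and that `Cmarkit.Doc.of_string`, `Cmarkit.Label.default_resolver` and `Cmarkit_html.of_doc` behave as described in the API docs. The names `synthetic_refs` and `html_of_md` and the sample text are made up for illustration.

```ocaml
(* Sketch only: requires the cmarkit library (e.g. `opam install cmarkit`). *)

(* Hypothetical resolver: keep the default behaviour for label definitions,
   but give undefined references a synthetic definition (here simply the
   referencing label itself) so they survive in the AST as links instead
   of ending up as plain text. *)
let synthetic_refs : Cmarkit.Label.resolver = function
| `Def _ as ctx -> Cmarkit.Label.default_resolver ctx
| `Ref (_, _, (Some _ as def)) -> def  (* defined in the document *)
| `Ref (_, ref, None) -> Some ref      (* undefined: synthesize one *)

(* Parse with the extensions enabled (~strict:false) and render to HTML.
   ~safe:true asks the HTML renderer to leave out raw HTML and unsafe URLs. *)
let html_of_md ?(strict = false) md =
  let doc = Cmarkit.Doc.of_string ~strict ~resolver:synthetic_refs md in
  Cmarkit_html.of_doc ~safe:true doc

let () =
  print_string (html_of_md "Some ~~struck~~ text and a [dangling ref].\n")
```

What the stock HTML renderer prints for a synthetic definition is not the interesting part. The point is that the reference is still a link node in the abstract syntax tree, so a custom mapper or renderer can pick it up later.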

Only 379 out of the 652 specification tests round trip exactly, but these are rather extreme examples (a per-test classification of the round trip failures can be found here; it’s mostly due to eager escaping). In practice it seems to work quite well on your average README.
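
A quick way to try the round trip on your own files is sketched below, assuming the `cmarkit` library is installed and that `Cmarkit.Doc.of_string ~layout:true` and `Cmarkit_commonmark.of_doc` work as described in the API docs. The helper name `commonmark_roundtrip` and the sample text are hypothetical.

```ocaml
(* Sketch only: requires the cmarkit library (e.g. `opam install cmarkit`). *)

(* Parse while keeping layout information, then render back to CommonMark. *)
let commonmark_roundtrip md =
  let doc = Cmarkit.Doc.of_string ~layout:true md in
  Cmarkit_commonmark.of_doc doc

let () =
  let src = "# Title\n\nA *simple* paragraph.\n" in
  let out = commonmark_roundtrip src in
  print_string out;
  (* Report on stderr whether this input came back byte for byte. *)
  if String.equal src out
  then prerr_endline "exact round trip"
  else prerr_endline "output differs from source"
```

Comparing the source and output strings this way is the same kind of check as the exact round trip count quoted above, just applied to a single document.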

vas · July 20, 2023, 1:42pm

You should add it to the list of implementations!