CommonMark Formal Grammar

PEG isn’t a toolchain; it’s a class of grammars, like context-free or mildly context-sensitive. There are many toolchains that can handle PEGs, including Haskell’s parsec (which has been cloned into quite a few languages), Scala’s scala.util.parsing.combinator, Python’s pyparsing, and my own Hammer, which is written in C but provides bindings to many scripting languages.
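To make the "class of grammars, not a toolchain" point concrete, here is a rough sketch of PEG-style matching in plain Python with no library at all. It is not how Hammer, parsec, or pyparsing are implemented; the helper names (`lit`, `seq`, `choice`, `many1`) and the toy emphasis rules are made up for illustration and are not taken from any Markdown spec. The thing to notice is ordered choice: the first alternative that matches wins, so the grammar itself says which reading is correct.

```python
# Each parser is a function (text, pos) -> new position, or None on failure.

def lit(s):
    """Match the literal string s."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def seq(*parsers):
    """Match every parser in order; fail if any one fails."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*parsers):
    """PEG ordered choice: commit to the first alternative that matches."""
    def parse(text, pos):
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return parse

def many1(parser):
    """Match parser one or more times, greedily."""
    def parse(text, pos):
        end = parser(text, pos)
        if end is None:
            return None
        while end is not None:
            pos, end = end, parser(text, end)
        return pos
    return parse

# Hypothetical toy rules:
#   marker <- "**" / "*"
#   strong <- "**" [ab]+ "**"
marker = choice(lit("**"), lit("*"))
swapped = choice(lit("*"), lit("**"))  # same alternatives, other order
strong = seq(lit("**"), many1(choice(lit("a"), lit("b"))), lit("**"))

print(marker("**x", 0))      # 2: the "**" alternative is tried first
print(swapped("**x", 0))     # 1: "*" matches first, "**" is never tried
print(strong("**ab**", 0))   # 6: the whole input is consumed
```

Swapping the order of the alternatives changes what is matched. In a PEG that is a property of the grammar, not an accident of whichever implementation happens to be parsing it.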

RFCs that define text-based formats – such as URIs (RFC 3986) or HTTP/1.1 messages (RFC 7230) – include a formal grammar defined in ABNF (which is itself currently defined in RFC 5234) so that implementors have an unambiguous reference for the layout of the format. Such grammars are typically collected in an appendix, after the natural-language description of the protocol. People who don’t know how to read BNF aren’t expected to understand them, but they are an essential component of the specs they belong to. Without them, “implementation-dependent” behaviour proliferates and we’re back to why we wanted a Standard Markdown spec in the first place.

The STDM spec is incomplete without a formal grammar. If the formal grammar does not cover the corner cases you describe, then the formal grammar is incorrect and incomplete. But this standard must not ship without one.
