How is it decided whether a break with existing Markdown implementations is acceptable?

I am confused about how a break between the CommonMark spec and existing Markdown implementations is considered a good thing in one case and a bad thing in another. There appears to be no guiding list of principles to keep these decisions consistent, such as (just an example, not a suggested list):

  1. Disambiguate the spec
  2. Intuitive results
  3. Readable source
  4. Do not break current implementations

A few examples of where I am confused:

  1. "Setext headers and empty list items" seems to argue that, because existing documents contain a lot of lazy setext headers with a single -, requiring at least two - is a bad thing, even though accepting a single - introduces an ambiguity with empty list items.

  2. "Blank lines before lists, revisited" argues against requiring a blank line between a paragraph and a list, even though this may produce an inadvertent list item when a paragraph is wrapped and a -, *, + or 1. winds up at the beginning of a line. It also breaks with most other Markdown implementations.

  3. On the other hand, the spec has no issue breaking with existing implementations when it comes to many other elements, such as:

    1. Different list markers start a new list
    2. A first ordered list item numbered greater than 1 makes the list start at that number
    3. Block quotes do not lazily continue to a blank line (see "Spec section 5.1: Ref-impl bug and/or unclear spec?")
    4. An ordered list item can break a paragraph, as in 2. above, even though this is not supported by any existing Markdown processor
    5. HTML block processing is, IMO, an improvement, but it is a break with existing processors

If potentially breaking the rendering of existing Markdown documents is not an issue for 3.1 to 3.5, then why is it an issue when it comes to requiring a setext marker of at least two characters?
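To make the setext ambiguity concrete, here is a toy sketch in Python. It is not the spec's block parser: the function name, the `follows_paragraph` flag, and the `min_setext_len` knob are all hypothetical, made up only to compare accepting a single - with requiring at least two.

```python
def classify_dash_line(line, follows_paragraph, min_setext_len=1):
    """Toy classification of a line made only of '-' characters.

    min_setext_len is a hypothetical knob: 1 models accepting a single '-'
    as a setext underline, 2 models the "at least two characters" rule.
    """
    marker = line.strip()
    if not marker or set(marker) != {"-"}:
        return "other"
    if follows_paragraph and len(marker) >= min_setext_len:
        return "setext underline"
    if len(marker) == 1:
        # A lone '-' that is not an underline can only be read as a list marker.
        return "empty list item candidate"
    return "other"


print(classify_dash_line("-", follows_paragraph=True))
print(classify_dash_line("-", follows_paragraph=True, min_setext_len=2))
print(classify_dash_line("-", follows_paragraph=False))
```

With the default, a lone - under paragraph text is taken as an underline; with `min_setext_len=2` the same line is no longer an underline, and what it should become instead is exactly the kind of decision that needs a stated principle.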

Item 2. is worse. It is a break with the majority of existing Markdown processors on Babelmark 2.0, including Markdown.pl. The argument that it allows list items to break paragraphs of other list items is weak. Simply adding that as a clause to the spec would do what most other processors already do, without adding much implementation complexity and without introducing surprises in the middle of paragraph text. Should the “if it ain’t broke, don’t fix it” rule apply to the spec? If Markdown.pl already does it right, why change it?
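As a rough illustration of the "surprise" in item 2., the sketch below flags continuation lines of a wrapped paragraph that happen to begin with a list marker. The regex is my own simplification (an assumption, not the spec's grammar for list markers), and the function name is made up for the example.

```python
import re

# Simplified pattern (an assumption, not the spec's full grammar): up to three
# spaces of indentation, then -, *, + or digits followed by . or ), then
# whitespace and some content.
LIST_MARKER = re.compile(r"^ {0,3}(?:[-*+]|\d{1,9}[.)])[ \t]+\S")


def find_accidental_list_markers(paragraph_lines):
    """Return indices of continuation lines that begin with a list marker."""
    return [
        i
        for i, line in enumerate(paragraph_lines)
        if i > 0 and LIST_MARKER.match(line)  # line 0 starts the paragraph
    ]


wrapped = [
    "The committee met twice and settled on option",
    "1. The minutes were circulated afterwards, with a",
    "- sign marking each withdrawn proposal.",
]
print(find_accidental_list_markers(wrapped))  # [1, 2]
```

If a blank line were required before a list, both flagged lines would stay plain paragraph text; without that requirement, a rewrap of the same sentence can change the rendered structure.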

I have also seen an argument that a break with existing implementations can be easily addressed by using an existing processor to render the HTML and then passing the HTML to pandoc to generate CommonMark. A web service automating the conversion would go a long way toward making this argument valid, but then all compatibility breaks are reduced to a “just convert it” argument, and the spec can ignore current implementations wherever they make no sense for CommonMark.

I suspect that most users do not read the spec and instead find what works by trial and error. If a widely used Markdown processor allows a construct, whether by design or through an implementation bug, then that construct will be widely used.

If breaking with existing Markdown processors is a consideration, then should the length and quality of existing documents be taken into consideration too? There are a lot of poorly formatted Markdown documents that rely on the rendered result rather than on making the original source readable, which was the original goal of Markdown. Do we want the CommonMark spec to be constrained by what amounts to ignorance and laziness in large numbers?

I do not see a consistent approach to making these trade-off decisions. Am I missing something, or is there no consistent approach and none is desired?
