Here’s my vote on the twelve (now eleven!) MUST items, cc: @jgm
Preservation of spaces in backtick code
`are trailing spaces trimmed from here--> `
We should preserve spaces in code spans, unless it’s super hard. There’s a near 50/50 split in babelmark so either approach is really OK.
Backtick fences and inline code collision
``` hello this is NOT inline code with one backtick ` and two backticks ``; it is a code block! ```
Multi-word restriction of some kind, or leading space restriction (if possible). Removing code fences is absolutely not an option. Do note that in GitHub code fences the official documentation shows no space between the backticks and the language.
Links within links
Whatever the simplest way is to disallow this, I think we should do it. As already mentioned by @balpha, it feels pathological to me; I don’t see a good use for it.
Inconsistent handling of spaces in links
<http://example.com/hey nice link>
There’s a very well understood way to encode spaces into links, our old pal `%20`, and spaces in links are some bad mojo anyway that we should not be encouraging. We should not allow spaces in links.
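For anyone who wants to see the encoding in action, here is a minimal sketch using Python's standard `urllib.parse.quote`. The helper name `encode_link_destination` and the choice of `safe` characters are assumptions made for this illustration; nothing here comes from the spec.

```python
from urllib.parse import quote

def encode_link_destination(url: str) -> str:
    """Percent-encode spaces (and other unsafe characters) in a URL.

    Hypothetical helper for illustration only.
    """
    # URL delimiters and '%' are listed as safe so they pass through unchanged;
    # a space is not in the safe set, so it is emitted as %20.
    return quote(url, safe=":/?#[]@!$&'()*+,;=%")

print(encode_link_destination("http://example.com/hey nice link"))
# prints http://example.com/hey%20nice%20link
```

Keeping `%` in the safe set means a destination that is already encoded is not encoded twice.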
Quotes in titles
Foo [bar](/url/ "Title with "quotes" inside")
`&quot;` seems crazily HTML-specific, which long term is not the goal of CommonMark, so I support your solution of allowing the escape `\"` to work, instead.
Already decided, so great!
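To make the backslash-escape behaviour concrete, here is a small sketch of matching a double-quoted title in which `\"` is allowed. The regex and the `parse_title` helper are assumptions for illustration; this is not how any particular implementation does it.

```python
import re

# A double-quoted title: runs of characters that are neither a quote nor a
# backslash, or a backslash followed by any single character.
TITLE = re.compile(r'"((?:[^"\\]|\\.)*)"')

def parse_title(s: str):
    """Return the title text at the start of s, or None (hypothetical helper)."""
    m = TITLE.match(s)
    if m is None:
        return None
    # Unescape \" to a plain quote; other backslash pairs are left untouched.
    return m.group(1).replace('\\"', '"')

print(parse_title(r'"Title with \"quotes\" inside"'))
# prints Title with "quotes" inside
```

With the unescaped form, the same sketch stops at the first inner quote, which is the ambiguity the escape avoids.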
Should space be required after # in ATX headers?
I am 100% certain the answer is “yes” here. Way too much damage to average user input if we allow this unnecessary flexibility, largely due to the rise of the `#hashtag` in popular media. Tighten it down, we have ample proof there’s a problem, and the tension from the fix is minor.
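A rough sketch of what "space required" means for a parser, written as a Python regex. The pattern and the `parse_atx` helper are assumptions for illustration, and optional closing `#` sequences are deliberately not handled.

```python
import re

# Up to 3 spaces of indentation, 1-6 '#' characters, then either
# whitespace followed by the heading text, or the end of the line.
ATX_HEADING = re.compile(r"^ {0,3}(#{1,6})(?:[ \t]+(.*?))?[ \t]*$")

def parse_atx(line: str):
    """Return (level, text) for an ATX heading line, else None (hypothetical helper)."""
    m = ATX_HEADING.match(line)
    if m is None:
        return None
    return len(m.group(1)), m.group(2) or ""

print(parse_atx("# Heading"))   # prints (1, 'Heading')
print(parse_atx("#hashtag"))    # prints None
```

Under this rule `#hashtag` and `#5 bolt` stay ordinary paragraph text, while a bare `#` is still an empty heading.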
Setext header and list precedence issue
1. Juli
------
- Event 1
- Event 2
I think this should be escaped like any other list special case, or the user can use the `##` header form.
Allow setext headers to interrupt paragraphs for consistency
Paragraph
Header
=====
Paragraph
Babelmark says this is quite divergent, and I think we should continue to be strict here and NOT allow setext headers to interrupt paragraphs, as that reads quite poorly to me in plaintext – what kind of heading has no whitespace?
I don’t find multiline headers particularly compelling.
Odd reference link/list case
- [foo]: bar baz
Babelmark says a list with one item is what this should produce. Given that there’s virtually no divergence here, do we care? Are people really running into this? Is it a useful set of input?
Remove two-blanks rule
I’ve come to think that the “two blank lines breaks out of lists” rule is more trouble than it’s worth…
… I think the current spec could be clarified if rule 1 for list items explicitly said “a sequence of lines not containing two consecutive blank lines that are not in a fenced code block.”
I completely trust your instincts on this one. Everyone in the topic says your suggestions are reasonable, so go with whatever you believe is best here.
Handling of tabs needs to be further specified
Seems to me a tab is not quite a space, so the “is this a space after a blockquote” rule doesn’t apply.
I think the babelmark results show a rough consensus that this should be a code block with 3 spaces.
Honestly I think as long as a code block is rendered, which is definitely the consensus here, no one will be too bothered by some extra spaces.
Revise what the spec requires a propos entities:
Only replying here to the MUST issues marked by @codinghorror as “no topic”.
Quotes in titles
Foo [bar](/url/ "Title with "quotes" inside")
Foo [bar](/url/ "Title with \"quotes\" inside")
The latter should absolutely work, although it currently doesn’t in many implementations.
Since most implementations (incl. Markdown.pl) seem to handle the former as a naive author would expect (i.e. automatically escape the inner quotes), it would be fine if the spec could define that behavior, too.
Allow setext headers to interrupt paragraphs for consistency
Paragraph
Heading
----
Paragraph
We should not be looking at top-level `=` setext headings, but second-level `-` ones, because they’re ambiguous with “thematic breaks”. There are four general ways to parse this:
- Paragraph, heading, paragraph
- Heading (with 2 lines), paragraph
- Paragraph (with 2 lines), separator, paragraph
- Paragraph (with 4 lines)
The second interpretation is rare (only Parsedown in Babelmark). It assumes lazy wrapping of headings. That can be expensive, because every soft-wrapped paragraph could turn into a heading this way and the parser wouldn’t know until it had consumed the last line with only equal signs or dashes in it. All other lazy wrapping is decided by the marker of the first line, so it’s reasonable to avoid this behavior (like the CM spec does), although some authors may expect otherwise.
The third option, which the reference implementation currently adopts (as do a few others), makes no sense in my opinion. Setext headings should always take precedence over mere horizontal lines! Since they’re restricted to a single line of text for reasons explained above, that suggests option 1.
The last option has some merit over the first, because – like the third – it allows consistent treatment if put inside a list, for example, where headings would not be expected to occur at all.
* Paragraph
  Heading
  ----
  Paragraph
To fulfill both requirements, i.e. heading wins over separator and no different treatment inside a list item, there’s actually no choice but to treat it as a single paragraph (like only Pandoc, Kramdown and Minima do), unless we want (setext) headings inside lists (which Cebe, Maruku and Discount support). Most implementations actually support ATX headers and unambiguous setext headings inside lists, though. Since the reference implementation is among them, option 2 makes more sense again.
Odd reference link/list case
- [foo]: bar baz
It seems to be a single-item tight list with content “baz”.
With the algorithm currently used by cmark and commonmark.js, there is no efficiency cost to allowing multiline setext headers. We store successive lines in paragraphs, and when we hit one that looks like a setext header line, we just convert the paragraph into a setext header. No backtracking is needed. Indeed, allowing multiline setext headers would eliminate the need for a check we currently do (to see if only one line of text has been parsed).
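The approach described above can be sketched roughly as follows. This is a simplified illustration in Python, not the actual cmark or commonmark.js code: the `parse_blocks` function and its `(kind, text)` output are assumptions, and thematic breaks, lists and other block types are ignored.

```python
import re

# A setext underline: only '=' or only '-' characters, optional trailing space.
SETEXT_UNDERLINE = re.compile(r"^ {0,3}(=+|-+)[ \t]*$")

def parse_blocks(lines):
    """Return a list of (kind, text) blocks; hypothetical, paragraphs/headings only."""
    blocks = []
    paragraph = []  # lines of the currently open paragraph

    def close_paragraph():
        if paragraph:
            blocks.append(("p", "\n".join(paragraph)))
            paragraph.clear()

    for line in lines:
        m = SETEXT_UNDERLINE.match(line)
        if m and paragraph:
            # Convert whatever has accumulated, however many lines, into a
            # heading. No backtracking and no line-count check is needed.
            level = 1 if m.group(1)[0] == "=" else 2
            blocks.append((f"h{level}", "\n".join(paragraph)))
            paragraph.clear()
        elif not line.strip():
            close_paragraph()
        else:
            paragraph.append(line.strip())
    close_paragraph()
    return blocks

print(parse_blocks(["Paragraph", "Heading", "----", "Paragraph"]))
# prints [('h2', 'Paragraph\nHeading'), ('p', 'Paragraph')]
```

Restricting headings to a single line would mean adding a `len(paragraph) == 1` check to the first branch, which is the extra check mentioned above.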
I’m leaning towards thinking multiline headers are the best interpretation here. People who hard-wrap their text to a fixed column width may occasionally need these (although stylistically it’s generally best to avoid overly long headers). Currently there’s no way to do multiline headers in CommonMark (or Markdown generally), so the change would increase expressive power.
I’ve gone with allowing multiline setext headings. Spec changes have been pushed, please have a look at
The code changes in cmark and commonmark.js amounted to deleting one line, and had no effect on benchmarks.
I’m wondering how we can help get to a 1.0 release. What would help you concretely, @jgm?
I have added some items to the list, I’m afraid.
What and where is the current list?
Regarding the odd list case with the link reference definition, the source is probably at Example 0.28:
Link reference definitions can occur inside block containers, like lists and block quotations.
Example 175 shows how a paragraph can start directly after a line containing a reference link definition.
A sub-question is whether this should be an empty list item:
There is no example or prose explaining exactly that.
I have found myself wishing such embedded link reference definitions would work but would still be shown in the output, especially in lists where they otherwise always lead to empty list items because nothing can precede or follow them therein.
- [foo]: bar baz . <ul> <li><a href="bar">[foo]: bar</a></li> </ul> <p>baz</p>
It’s been 2½ years since this list was started, at least 3½ years since this forum was opened, over 5 years since Jeff’s The Future of Markdown.
It’s now been a year since GitHub adopted CommonMark.
Releasing 1.0 will further the goals and adoption of CommonMark. It will also allow the development of CommonMark v1.1 or v2 to move forward, which will further the goals and adoption even more.
Perhaps add a V1 milestone to the issues database, add the issues corresponding to the above to the milestone, and start taking pull requests? BTW, maybe announce that this forum is for design/new feature discussion, but bugs in the spec should get moved into GitHub, with a message on such threads here with a link to the GitHub issue. It will make this all easier to manage.
I think part of the reason for the delay in releasing a “1.0” version of the core spec is that it would be very hard to patch without breaking backward compatibility, and a 1.0 version isn’t strictly needed for the spec to be useful in production. I don’t disagree with your points though @vas.
That said, I see no reason why work couldn’t commence on extension specs before the core spec reaches 1.0 (by other members of the community if @jgm is busy). Formalising some of the extensions from GitHub Flavored Markdown - tables, task list items, strikethrough, autolinks, and disallowed raw HTML - here on commonmark.org would be a good place to start. Alternatively, if CommonMark extensions as part of the CommonMark project are no longer on the cards, that’s something the community would benefit from being made aware of so that extension specs could be developed independently.
If what you say is true, it has been true for 2½ years, and implies such wide adoption that breaking backward compatibility keeps the spec stuck this way for 2½ years, which in turn means we have a de facto 1.0 release. If so, we should make it official and move on to v1.1 where we’d have the freedom to fix things with minor backward incompatibility.
If what you say is not true, we can address this list for v1.0 and do it soon.
A standard that can’t take a stand is just a recommendation.
Some of the issues listed here aren’t so minor. For the issues with linked forum topics, these are known issues that are still under discussion; the implementation hasn’t yet been decided. If you want to see a 1.0 release sooner, contributing to those individual discussions would help move the spec towards 1.0 as they have been explicitly stated as 1.0 blockers (independent of time).
Seniority certainly has to be taken into account at some point.
But let’s see. GitHub does not use it any more. It uses CommonMark.
How do we count now? Does that leave 13,537,268 - 1 users of kramdown? Overstated?
Maybe we should also specify that GitHub uses a GitHub-flavored CommonMark spec rather than GitHub Flavored Markdown (GFM).
Is that an unconditional endorsement of CommonMark spreading around a well-approved status quo?
Like others, I appreciate and am thankful for the efforts of the community around this project.
But IMHO, it seems there is some bikeshedding around important issues like:
- css class
These issues are far more important than handling countless variants of line breaks.
Like any project, an open source project should know how to stay on track and on time…
I agree, and that’s essentially what I’ve been saying in a different, but less presumptive, way.
But saying things like “It seems that commonmark is slowly dying” is neither accurate nor helpful. It’s no more accurate than saying “It seems that Gruber’s Markdown is slowly dying.” We all know the opposite is true. The problem isn’t death/lack of adoption, it’s fragmentation.
Neither are your statistics helpful, because you know that saying about lies and statistics. Such comments aren’t going to spur things to move. It’s not how leaders talk, nor is it how you get leaders to listen.
How about you start a new forum topic making your above bikeshedding case, with a solid line of reasoning, sans hyperbole?
GitHub Pages used to use kramdown, but now it follows CommonMark.
FYI: GitHub Pages still uses kramdown. GitHub uses CommonMark (with GitHub Flavored Markdown extensions) for READMEs and markdown rendered on GitHub itself (but not for GitHub Pages). Sorry if this sounds confusing.
You’re right. I was misremembering the time they switched to kramdown.
From @notriddle’s link:
GitHub-flavored Markdown is supported by kramdown by default, so you can use Markdown with GitHub Pages the same way you use Markdown on GitHub.
In other words you can use CommonMark with GitHub Pages the same way you use it on GitHub.
And yes, even kramdown supports CommonMark. Does that mean we get to add kramdown’s numbers to CommonMark’s?
That’s a rhetorical question, so please don’t answer. This debate is starting to get silly, and sounding like national politics. Let’s get back to progress with CommonMark, whether that’s nailing 1.0 first or declaring it done, we need to move on to 1.1 asap. We need to move to head off even more fragmentation.
JFTR, Kramdown’s GFM mode for Jekyll (which is what GitHub Pages is built upon) does not conform to CommonMark and probably never will. If I’m not confusing things again, GitHub uses commonmarker for .md file previews but cmark-gfm for READMEs.