"comment" facility in commonmark?

  1. The triple dash makes pandoc ignore the comment when it parses the markdown file.

  2. The Comments Plugin ( https://github.com/ryneeverett/python-markdown-comments ) for python-markdown ( https://github.com/waylan/Python-Markdown ) implements the same pandoc commenting style mentioned above.

  3. Even if you use another markdown engine, the comment WILL show up in the generated HTML but will still be invisible on render, unless someone uses “view source”. So it is a graceful fallback.

  4. NOTE: if you use a double dash, it is always an ordinary HTML comment, in pandoc and in all other markdown processors.
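
As a rough illustration of the behaviour described in the list above, the following Python sketch drops comments that open with three dashes and leaves ordinary two-dash HTML comments alone. The function name strip_triple_dash_comments and the regular expression are assumptions made for this example only. It is not how pandoc or python-markdown-comments is implemented, and it does not protect code blocks or code spans.

```python
import re

# Assumption: a "triple dash" comment opens with "<!---" and closes at the
# first following "-->". DOTALL lets a comment span several lines.
TRIPLE_DASH_COMMENT = re.compile(r"<!---.*?-->", re.DOTALL)


def strip_triple_dash_comments(source: str) -> str:
    """Remove <!--- ... --> comments; keep ordinary <!-- ... --> comments."""
    return TRIPLE_DASH_COMMENT.sub("", source)


sample = "Visible text <!--- note to self --> and <!-- a normal HTML comment -->"
print(strip_triple_dash_comments(sample))
```

An engine that does not implement this step would simply pass both comments through as HTML comments, which is the graceful fallback mentioned in point 3.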

If we do not standardise, some will invent their own syntax, as http://criticmarkup.com/ (a superset of markdown) does with its {>>This is a comment<<} syntax.

There is a great need for a uniform comment syntax across all markdown implementations.

The triple dash has the advantage of a graceful fallback: the comment will not show up in the browser even where the syntax is not implemented.

Hence I request that you include pandoc's triple dash as the comment syntax of CommonMark, which conforming implementations should not render.



Commenting is such a dire need. Markdown is used for documentation in many places.
But documenting the raw markdown file itself, via a uniform comment syntax that does not render, is not possible.
Is it not ironic that documenting (commenting) a file written in a documentation language (CommonMark) is itself not uniformly possible, as of now, “as per spec”?

(Earlier I filed this as an issue, https://github.com/jgm/CommonMark/issues/348 ; later I saw this forum and asked the question here.)



This is exactly the thing I came here seeking.

As a professional writer and amateur coder, I edit other people’s writing all the time, and have come to embrace Markdown as a quick, easy way to get the job done. However, when critiquing other documents, I’m forced to use a word processor/google docs/draftly to insert my thoughts. There’s no common markup for commenting, or even for having a discussion about the comments within the document.

I understand it could get really messy, so that's why I've taken to using fenced codeblocks as a way of separating my comments in-line with the doc. 

Perhaps we can then take this approach and use another set of punctuation like "%%%" or something and then use the triple-dashes to separate conversations between the collaborators? 

Just an idea.

Obviously in my example above, since it’s using <pre> and parsing for code, some of the words get formatted funny. Using a designated commenting markdown tag would eliminate this.




Maybe you would like to support this cause at this URL:


Please add your comment there.
Do you like this syntax, or do you have some other comment syntax in mind?



Something like using double forward and backward slashes as denotation…
// comment would go in here \\
would work fine with me.



I do think it would be great to have a way to put comments in the source. zaxebo1 is actually not correct that pandoc does not pass HTML comments beginning with three dashes (<!--- like this -->) through to the generated HTML, though I think I did once propose this.

For ease of writing and reading, something that makes comments stand out more would be good. One possibility is to make lines beginning with % behave as comments (so that they were entirely ignored in parsing).
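
A minimal sketch of that possibility, in Python: every line whose first character is % is dropped before the text reaches the parser. The name strip_percent_comments is made up for this example, and the sketch deliberately ignores fenced code blocks, where a leading % would presumably have to be kept.

```python
def strip_percent_comments(source: str) -> str:
    """Drop every line whose first character is '%' (hypothetical rule)."""
    kept = [line for line in source.splitlines() if not line.startswith("%")]
    return "\n".join(kept)


draft = "First paragraph.\n% TODO: check this claim\nSecond paragraph."
print(strip_percent_comments(draft))
```

Only lines that start with % in the first column are treated as comments here; an indented % is left alone.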



What would an additional syntax to “put comments in the source” text accomplish?

I thought that according to the 0.22 CommonMark spec an “HTML Comment” (which is also an “XML Comment”, and is a restricted form of an ISO 8879:1986 SGML comment declaration) would be

[…] parsed as a raw HTML tag and will be rendered in HTML without escaping.

What would be the need that introducing a “native” CommonMark comment syntax should satisfy? And how would the content of such a comment be represented in the output, if at all?

If the only goal is to discard the “comment in the source”, and not transfer it into the output document, I would prefer

  1. an option that a CommonMark processor discards all “HTML comments”, and/or

  2. a “special form” of “HTML comments” that will be discarded in any case by a CommonMark processor, like the one cited by @jgm above.

Because by definition a comment declaration aka “HTML comment” aka “XML comment” is not part of the (SGML/XML/HTML/XHTML) document content, I see no compelling reason why such comments should be passed on into the CommonMark processor’s output anyway (by default at least).

In case a comment is supposed to “mean” something for a processor down the line: that’s what processing instructions are for, and they too are mentioned in the CommonMark spec as being passed through (IMO rightly so).



One more thing regarding “HTML comments”:

Both the HTML (by which I mean “real” HTML conforming to the W3C HTML 4.01 and/or ISO 15445 specifications, not the “look-alike” HTML5) and XML rules restrict the comment syntax to start with <!-- and end with -->:

All comments in HTML document instances shall appear in comment declarations. There shall be exactly one comment per comment declaration.

There is technically the “degenerate” form <!> too: it is not mentioned in the CommonMark spec, and (consequently) is passed through as literal text by cmark, and browsers like Mozilla would not recognize it as a comment anyway (as they should).

But I think that <!> would be a perfectly fitting candidate to play the same role in CommonMark that \& has in groff (or nroff, or troff):

Insert a zero-width character, which is invisible. Its intended use is to stop interaction of a character with its surrounding.

Why would this be useful? In many cases it would provide an alternative to the “backslash escape” used to prevent parsing, e.g. instead of “hiding” the FULL STOP to prevent recognition of a list item like this:

1\. Lorem ipsum dolor sit amet.

one could write:

<!>1. Lorem ipsum dolor sit amet.
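
To make the idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the regular expression is only a rough approximation of the spec's ordered list marker rule, the names ORDERED_MARKER, NULL_MARKER, starts_ordered_list and strip_null_markers are invented, and treating <!> as a zero-width marker is the proposal made here, not something cmark or the spec does today.

```python
import re

# Rough approximation of an ordered list marker: up to three spaces of
# indentation, 1-9 digits, "." or ")", then a space, a tab or end of line.
ORDERED_MARKER = re.compile(r"^ {0,3}\d{1,9}[.)](?: |\t|$)")

# The proposed zero-width marker (hypothetical behaviour).
NULL_MARKER = "<!>"


def starts_ordered_list(line: str) -> bool:
    """Return True if the line would open an ordered list item."""
    return ORDERED_MARKER.match(line) is not None


def strip_null_markers(text: str) -> str:
    """Remove the markers once block structure has been decided."""
    return text.replace(NULL_MARKER, "")


for line in ["1. Lorem ipsum", r"1\. Lorem ipsum", "<!>1. Lorem ipsum"]:
    print(repr(line), starts_ordered_list(line))

print(strip_null_markers("<!>1. Lorem ipsum"))
```

Both the backslash form and the <!> form stop the line from matching the list marker pattern; the difference is that the <!> marker is removed as a whole afterwards and leaves the text 1. Lorem ipsum untouched.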

More importantly, and not possible using backslash (as far as I can tell), one could differentiate “inline” and “block” tag markup (or rather: element types) in the CommonMark text, and thus prevent or request the CommonMark parser to wrap the content in a <P> (or <LI>) element.

This input would produce the first paragraph wrapped into <P>, but the second paragraph would generate an instance of the “block-level” element type <block-elem>, following the <P> element instance.

Lorem ipsum dolor sit amet.

<block-elem>consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

On the other hand, the following input would produce two consecutive <P> elements, where the second has an <inline-elem> as an immediate (and only) child:

Lorem ipsum dolor sit amet.

<!><inline-elem>consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

The second use is of course the important one: this could fix an IMO serious flaw in the CommonMark specification, and at the same time simplify the rules for handling “HTML tags”, namely to:

  • make the CommonMark specification as general as possible,
  • in particular to avoid biasing it towards HTML, or
  • constraining the specification to the HTML syntax, or
  • the HTML element type names (which variant of HTML anyway?), or
  • which named character references are available in HTML etc.

The current specification contains all kinds of HTML-specific rules, in particular in section “4.6 HTML blocks”, naturally. On the other hand, in section “6.2 Entities” it explicitly says:

With the goal of making this standard as HTML-agnostic as possible, all valid HTML entities (except in code blocks and code spans) are recognized as such and converted into Unicode characters before they are stored in the AST. This means that renderers to formats other than HTML need not be HTML-entity aware.

Maybe it is just me, but is the requirement that

  1. every CommonMark processor has to know the complete HTML character entity set (again: of which HTML variant?) so that

  2. the CommonMark specification can assign fixed meanings defined by HTML to all entity references which look like an HTML named character reference, and

  3. require the processor to “silently” substitute the corresponding Unicode character for these references

not the direct and extreme opposite of “making this standard as HTML-agnostic as possible”?
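
To show what that requirement amounts to in practice, here is a small Python sketch that uses the standard library's html module as a stand-in. This is an assumption for illustration only: html.unescape works from the HTML5 entity table and is more lenient than the CommonMark rules (for instance, it accepts some references without the trailing semicolon), so it is not a conforming implementation of section 6.2. The wrapper name resolve_entities is invented.

```python
import html
import html.entities


def resolve_entities(text: str) -> str:
    """Replace HTML character references with the Unicode characters they name."""
    return html.unescape(text)


print(resolve_entities("caf&eacute; &copy; &amp;"))  # café © &

# The processor has to carry a table like this one to do the substitution.
print("copy;" in html.entities.html5)  # True
```

The point of the sketch is the dependency, not the code: the substitution is only possible because the processor ships a complete, HTML-defined table of entity names.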

[NOTE: Admittedly, this should be taken to another discussion topic.]