HTML Comments, blank lines, and CommonMark Appendix B

OK, I’ll admit it in public: I was, for a while, a professional tester (or maybe I was a testee; I’m not sure, I hope not). Either way, I developed strategies to cope with my role.

The following is intended as a critique of the current version of the CommonMark specification and stmd implementations, written in hope of improving the same. It in no way represents something that can’t be fixed.

Please look at this example in Babelmark 2:

<!--
(C) 2014 by Burt Harris

Anyone may copy this text freely, but the author prefers this one sentence remain hidden in web browser HTML views.  
-->
This document is another of my patent-pending (just-kidding) lightweight markup language torture tests.  Please view it using http://johnmacfarlane.net/babelmark2.   Thank you.

I trust this will come as no surprise to @jgm, but BabelMark 2 shows a lot of different interpretations of this example in existing implementations: at least 14 of them by my count in the “code” tab view.

But how many of these variations matter, and which one is “best”? It’s important to switch to BabelMark 2’s “Preview” tab to discuss this.

There’s a lot less that matters when this example passes through both a lightweight language processor and a browser, and if you (like me) don’t care too much whether it hyperlinks the inline URL or not, the number of Visual Variants drops to a reasonably small number. I count 5 using an eyeball and brain.

As to “best”: unfortunately, stmd and Markdown.pl still show a dramatic difference in their interpretation of my example, and when it comes to matching my explicitly stated intent, Markdown.pl (and many others) win the “best” contest in my book.

I think the biggest reason for this is related to issues addressed in a hard-to-find relative of the specification: Appendix B: An alternate spec for HTML blocks, only visible to people exploring the source code.

There is no reason the alternate spec described in Appendix B couldn’t be implemented within the parsing strategy described in Appendix A. And perhaps it should be. In any case, there needs to be a special rule for HTML comments, which are quite often multi-line.
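To make "a special rule for HTML comments" a bit more concrete, here is a minimal sketch in Python (not stmd's actual code, and not the Appendix B algorithm). It assumes the simplest possible rule: a comment starts at `<!--` and ends at the first `-->`, regardless of blank lines or line boundaries. The name `split_comments` is hypothetical, and the sketch deliberately ignores complications such as comment markers inside code blocks or code spans.

```python
import re

# Assumed rule for illustration only: a comment runs from "<!--" to the
# first "-->". DOTALL lets the match cross newlines, including blank lines.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def split_comments(text):
    """Split text into (kind, chunk) pairs; kind is 'comment' or 'markdown'."""
    parts = []
    pos = 0
    for match in COMMENT.finditer(text):
        if match.start() > pos:
            # Everything before the comment is left for the Markdown parser.
            parts.append(("markdown", text[pos:match.start()]))
        parts.append(("comment", match.group(0)))
        pos = match.end()
    if pos < len(text):
        parts.append(("markdown", text[pos:]))
    return parts


sample = "<!--\nline one\n\nline two\n-->\n## In the beginning\n"
for kind, chunk in split_comments(sample):
    print(kind, repr(chunk))
```

With a rule like this, the blank line inside the comment does not end the block, and a heading that follows the closing `-->` without an intervening blank line is still handed to the Markdown parser rather than being swallowed as raw HTML.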

Thanks @jgm. I’ve removed the reference to Appendix A from the original post.

In experimenting with modifications to handle HTML comments better (using several torture tests), I’ve had to change the current stmd.js assumption that blocks end on line boundaries, which is what made me think Appendix A had something to do with it. I’ll accept your expert opinion, but offer this little example, where stmd produces another minority-opinion visual result; it may have more to do with requiring blank lines than with Appendix A.

<!-- comment -->
## In the beginning

Do you think an Appendix B implementation would get that right?