Ordered vs Unordered List

tin-pot · January 1, 2016, 1:36pm

First of all, I’d like to wish you and everybody around here a happy new year!

If I understand your remark correctly, you do include LaTeX, RTF, nroff etc among the “multitude of other general markup languages” here? If that’s the case, then I have a pretty different view on Markdown in general and CommonMark in particular. Mine is based for the most part on the extent to which the Markdown description and CommonMark specification do support/include/refer to/borrow from/rely on “HTML” or generally XML syntax, structure, and notions—which you seem to regard as merely inherited baggage, so to say? After all, none of these “XML support features” are useful for creating any of these “other general markup languages” directly from CommonMark input, right?

But anyway, what really matters though is the definition of CommonMark, as expressed in the specification: and I see no reason why the same specification (more or less as it is right now) couldn’t be “compatible” with both points of view.

[To be precise:
As I said, I’m primarily referring to the document content model of XML/SGML anyway—formally the XML Infoset rsp SGML ESIS as “upper bounds”¹⁾ of what CommonMark can express, but the far, far simpler “data model” of µXML would probably suffice too (being isomorphic to what you refer to as the “AST”)—and I’m not so much concerned with the syntax: if this is what you mean by “representing this abstract structure”, then we do largely agree again after all … And be assured that I certainly don’t want to “require” every CommonMark processor to generate output in XML syntax, ie produce XML documents or fragments!

On the contrary, I think that the CommonMark specification should not prescribe any syntax at all for the representation of the parsing result (the parsed content, the AST, the content model instance, the abstract structure, whatever you call it). For example, a CommonMark parser could well “represent” the result only as a sequence of invocations of SAX-style callback functions, and the manner in which this is done is clearly outside the scope of any CommonMark specification. But still the specification must allow to test and decide of a parser of this kind does in fact parse correctly. The same considerations are behind the XML Infoset and ESIS specifications, which was the reason I pointed to RAST and to canonical XML in the following related discussion:

But it sure would be useful to have a simple but precise notation for the “abstract” parsing result for use in the specification text (and for denoting test case results): that discussion is over there

________

The term “upper bound” is used here in an only “half-sloppy way”, because (any reasonable definition of) an “is-lossless-translatable-to” relation between content models would obviously induce a pre-order among content models (and thus document types in general), in which “upper bound” would have the usual, precise meaning.
]

Personally, I’d prefer “itemized list” or—equally good—“unordered list” over “bullet list”, because “bullet” refers to a particular glyph (or family of glyphs), thus this term is in my view too much tied to a presentation style. And also because there is relevant “prior art” (and in LaTeX too) for the term “itemized list”, while the term “bullet list” seems to be used primarily in and around MS Office documentation.

But all in all I don’t care that much either.

However, and more generally, I still think that a distinction in terminology between

syntactic parts of input text on the one side and
output elements/nodes/subtrees/structural parts of the parsed content (or “AST”) on the other

could be occasionally helpful in the specification, if only to avoid confusion.