Alternative (1) ordered list syntax

@chrisalley I’m not too convinced of the utility of automatic paragraph numbers in CommonMark either (and one could still use the alternative syntax “1-␣” for that); I just wanted to point out that “(1)␣” is commonly used to mark text elements other than “ordered list items”, too. This is more a matter of style and typography than of CommonMark syntax, if you see it this way.

And then there are the differences between, say, Anglo-American typography and popular writing style, and the customs prevalent in (Western) Europe, and of course the “rest of the world”. I have no idea whether there are “typical Latin American” typographical guidelines used in South America, for example.


Problematically, these types are not all represented in HTML.

Depending on which HTML definition you look at, there might be no variations in the style of unordered lists represented at all, as this is purely a presentational choice. The same holds, even more so, when converting CommonMark to non-HTML document types and formats like LaTeX, OpenDocument, DITA, and so on.

Thus it is clearly “out of scope” for the CommonMark specification to decree which particular “item marker styles” can or can not be represented in the rendered, “final”, output document (from a post included in a web forum, to a printed article or book). As a consequence, what (any version of) HTML provides or can represent should not be the ultimate arbiter on what can be expressed in the CommonMark input syntax, but only some rough guideline.

What the specification could—and in my opinion: also should—require from a CommonMark processor however, is that the “list item marker style” used in each list item is somehow made available in the parser’s output (that is: the “AST”, the “CommonMark XML DTD”, the “CommonMark document content model”).


Specification proposal

One obvious way to represent this information would be a “marker=” attribute¹ in the DTD’s <item> element type (replacing or augmenting the current delimiter= attribute), which would store the (trimmed) marker string used in the input text for this particular list item. It would hold CDATA content like:

  1. marker="-", marker="+", and marker="*" for the unordered list items of now;
  2. marker="42.", marker="07)" etc. for the ordered list items of now; and maybe
  3. marker="a)", marker="ii)" and what not for the ordered list items of the future.
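
To make the idea a bit more concrete, here is a rough Python sketch of how a processor might pick up the trimmed marker string and expose it as the proposed attribute. The function names item_marker and item_start_tag and the regular expression are my own assumptions for illustration, not taken from the specification or from any existing implementation, and the pattern only covers the markers “of now” (cases 1 and 2), not things like "a)" or "ii)".

```python
import re
from xml.sax.saxutils import quoteattr

# Assumed pattern (illustration only): up to three spaces of indentation,
# then a bullet character, or 1 to 9 digits followed by "." or ")".
# The marker must be followed by a space or the end of the line.
_MARKER = re.compile(r"^ {0,3}([-+*]|\d{1,9}[.)])(?= |$)")


def item_marker(line):
    """Return the trimmed list item marker at the start of `line`, or None."""
    m = _MARKER.match(line)
    return m.group(1) if m else None


def item_start_tag(line):
    """Build a hypothetical <item> start tag carrying the proposed marker= attribute."""
    marker = item_marker(line)
    if marker is None:
        return None
    # quoteattr() adds the surrounding quotes and escapes the value if needed.
    return "<item marker=%s>" % quoteattr(marker)
```

For an input line like “07) seven”, item_marker would hand back the string "07)" unchanged, leading zero and delimiter included, which is the whole point: nothing about the author’s marker choice is thrown away at this stage.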

But how exactly these markers are mapped into and represented in the target document (HTML or otherwise) is the job and decision of the particular application, typically in conjunction with some style sheet, and cannot be mandated by the CommonMark specification, as I see it.
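
Just to illustrate what such an application-side decision could look like, here is a small Python sketch of one possible policy for an HTML back end. The function name html_list_open and the policy itself are assumptions of mine; another application could just as well emit CSS classes, LaTeX enumerate options, or ignore the marker altogether.

```python
import re


def html_list_open(marker):
    """Map the marker string of a list's first item to an opening HTML list tag.

    One possible policy only, not something the specification would mandate.
    """
    # Bullet markers: HTML has no per-marker variation, so all become <ul>.
    if marker in ("-", "+", "*"):
        return "<ul>"
    # Numeric markers: keep the start number, drop the delimiter style.
    m = re.fullmatch(r"(\d+)[.)]", marker)
    if m:
        start = int(m.group(1))
        return "<ol>" if start == 1 else '<ol start="%d">' % start
    # Anything else (say "a)" or "ii)") is not interpreted here;
    # this sketch simply falls back to a plain ordered list.
    return "<ol>"
```

Note that this particular mapping loses information (the “.” versus “)” delimiter, the leading zero in "07)"), which is fine for this back end precisely because the parser output still carries the full marker string for any other back end that wants it.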

______

  1. The “=” suffix in “name=” is just meant to indicate an attribute name, in contrast to an element type name, which is often written as “<name>”—a convention that I’ve seen used elsewhere and happen to find useful.