Inline lists: a) bananas b) apples c) oranges

tin-pot · November 26, 2016, 1:48pm

Regarding the “inline list” topic: You’re right that HTML only has “block-level” lists, which can’t occur inside a <p> element (i.e. as “inline-level” content). But other document types do have such lists, for example <simplelist> in DocBook.

However, I dislike the urge to pile on special-purpose “extensions” to the Markdown syntax for things like this. How about a “generic syntax” that represents your (slightly expanded) example like this:

<subject/Coco/ likes fruit. Her favorites are: <il/bananas,<>apples,<>oranges and<>lemons./ Lorem ipsum dolor etc. etc

This is IMO even somewhat nicer and terser (and as always the list labels should be generated by the processing application)—but most importantly, it is already valid (SGML) markup.

The only change required here in Markdown would thus be to recognize and “pass through” empty start tags <> (and similarly, empty end tags </>), and so-called NET-enabling start tags <gi att=value ... /. (A “NET” or “null end tag” is, usually, a solidus / that acts as an end tag.) This is trivial to implement, and the resulting output

<p><subject/Coco/ likes fruit. Her favorites are: <il/bananas,<>apples,<>oranges and<>lemons./ Lorem ipsum dolor etc. etc</p>

is already valid SGML with respect to a DTD like this one:

<!DOCTYPE test [
<!ENTITY % m.ph    "#PCDATA|subject|il">
<!ELEMENT  test    O O (p+)>
<!ELEMENT  p       - O (%m.ph)*>
<!ELEMENT  subject - - RCDATA>
<!ELEMENT  il      - - (li+)>
<!ELEMENT  li      O O (%m.ph)*>
]>
<p><subject/Coco/ likes fruit. Her favorites are: <il/bananas,<>apples,<>oranges and<>lemons./ Lorem ipsum dolor etc. etc</p>

Note that this makes rather heavy use of tag omissions (the <li> element is not even mentioned in the Markdown text!).
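To make the “recognize and pass through” step a bit more concrete, here is a minimal Python sketch of how the three short tag forms could be spotted in a line of text. The names `SHORT_TAG` and `find_short_tags` are made up for this illustration and do not come from any existing Markdown implementation. The pattern is a simplification, not a full SGML tokenizer: it only finds the tag openers, it does not track which later `/` acts as the null end tag, and it deliberately skips `<br />`-style tags so that XHTML-flavoured input is left alone.

```python
import re

# Hypothetical pattern for the three SGML short tag forms discussed above.
# Assumption: attribute values are either quoted or contain no whitespace,
# quotes, angle brackets, or solidus.
SHORT_TAG = re.compile(
    r"""
    (?P<empty_end></>)                      # empty end tag
  | (?P<empty_start><>)                     # empty start tag
  | (?P<net_start>                          # NET-enabling start tag
        <[A-Za-z][A-Za-z0-9.-]*             #   generic identifier
        (?:\s+[A-Za-z][A-Za-z0-9.-]*        #   attribute name
            (?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'<>/]+))?
        )*
        \s*/(?!>)                           #   solidus, but not "/>"
    )
    """,
    re.VERBOSE,
)


def find_short_tags(text):
    """Return (kind, matched_text) pairs for every short tag opener found."""
    return [(m.lastgroup, m.group()) for m in SHORT_TAG.finditer(text)]


if __name__ == "__main__":
    sample = ("<subject/Coco/ likes fruit. Her favorites are: "
              "<il/bananas,<>apples,<>oranges and<>lemons./ Lorem ipsum")
    for kind, tag in find_short_tags(sample):
        print(kind, tag)
```

A real implementation would of course have to live inside the inline parser and respect code spans, autolinks and the like; the sketch only shows that the lexical forms themselves are easy to tell apart.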

If you (or your tools) prefer less “minimized” markup, you can for example pass this document through sgmlnorm to “restore” the omitted tags and produce:

<TEST>
<P><SUBJECT>Coco</SUBJECT> likes fruit. Her favorites are: <IL>
<LI>bananas,</LI>
<LI>apples,</LI>
<LI>oranges and</LI>
<LI>lemons.</LI>
</IL> Lorem ipsum dolor etc. etc</P>
</TEST>

This same approach would allow a Markdown author, for example, to use HTML phrase elements like <DFN> (for which there is no Markdown syntax) with “null end tags” (NETs) in this manner:

A <dfn/empty start tag/ is a start tag of the form `<>`, ...

as a shorthand alternative to the “regular” and already existing form

A <dfn>empty start tag</dfn> is a start tag of the form `<>`, ...
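For tools downstream that only understand the “regular” form, the equivalence between the two spellings can be sketched in a few lines of Python. This is only an illustration under a strong assumption: the element content contains no nested markup and no solidus, so a single regular expression is enough. The names `SIMPLE_NET` and `expand_simple_net` are hypothetical; a real SGML parser (or a normalizer such as sgmlnorm) handles the general case, including omitted tags, which this sketch does not attempt.

```python
import re

# Assumption: content between the NET-enabling start tag and the null end
# tag has no "<", ">" or "/" in it, and the start tag has no attributes.
SIMPLE_NET = re.compile(r"<([A-Za-z][A-Za-z0-9.-]*)/([^<>/]*)/")


def expand_simple_net(text):
    """Rewrite <gi/content/ as <gi>content</gi> for the simple case only."""
    return SIMPLE_NET.sub(r"<\1>\2</\1>", text)


if __name__ == "__main__":
    print(expand_simple_net(
        "A <dfn/empty start tag/ is a start tag of the form `<>`, ..."
    ))
```

Note that the inline list example from earlier is left untouched by this function, because its content contains further tags; that case really needs the DTD to decide where the omitted <li> tags go.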

A quick rant

Given (1.) Markdown’s (or Gruber’s) stance that (emphasis mine)

Markdown is not a replacement for HTML, or even close to it. Its syntax is very small, corresponding only to a very small subset of HTML tags. The idea is not to create a syntax that makes it easier to insert HTML tags. In my opinion, HTML tags are already easy to insert.

and (2.) the assertion in the CommonMark spec that (emphasis mine)

Tag and attribute names are not limited to current HTML tags, so custom tags (and even, say, DocBook tags) may be used.

and (3.) the fact that SGML (and HTML to some extent) already has various ways to “minimize” markup as a convenience for human authors, as argued for by Goldfarb (in annex A.3 of ISO 8879:1986) regarding tag omission (emphasis mine):

The price of this simplicity, though, is that an end-tag must be present for every element.

This price would be totally unacceptable had the user to enter all the tags himself. He knows that the start of a paragraph, for example, terminates the previous one, so he would be reluctant to go to the trouble and expense of entering an explicit end-tag for every single paragraph just to share his knowledge with the system. He would have equally strong feelings about other element types he might define for himself, if they occurred with any great frequency.

With SGML, however, it is possible to omit much markup by advising the system about the structure and attributes of any type of element the user defines.

I would therefore much prefer it if a Markdown processor (or, respectively, the CommonMark spec) would “know enough” about already existing markup minimization techniques to “get out of the way” of authors who want to use them, instead of bolting on additional ad-hoc syntax for all kinds of situations.

Just because XML (for good reasons, which are irrelevant here) went in the opposite direction of the Goldfarb quote above and did away with all the SGML conveniences for human authors (aka markup minimization rules) in the interest of simplicity for implementations, and famously stated “terseness in XML markup is of minimal importance” as a design goal, should IMO not mean that CommonMark needs to reinvent them all over again. At least not in cases where they already provide clear, simple, short and standard ways to mark up text.