Should single element lists be supported?

chrisalley · January 3, 2016, 12:35pm

1986. What a great season.

This produces an ordered list with one list item. To avoid creating a list, you must type 1986\. What a great season.

Markua has a nice alternative. In Markua, single element ordered lists are not lists. So 1986. What a great season. on its own becomes a paragraph.

To me, the notion of a list is something with more than one thing in it. If I have a list of chores to do, there are many of them. If I have one chore to do, I don’t call it a list of chores–I call it a chore.

So, in Markua, the automatic creation of an ordered list only happens if you have two or more lines starting with numbers and periods.
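As a rough illustration of that rule, here is a minimal sketch in Python. It is not Markua's actual implementation; the function name `is_ordered_list` and the marker pattern are assumptions made up for this example. A block counts as an ordered list only when two or more of its lines start with a number and a period.

```python
import re

# Hypothetical marker pattern: one or more digits, a period, then a space.
ORDERED_MARKER = re.compile(r"^\d+\. ")

def is_ordered_list(block: str) -> bool:
    """Return True only if two or more lines of the block start with a marker."""
    marked = sum(1 for line in block.splitlines() if ORDERED_MARKER.match(line))
    return marked >= 2

print(is_ordered_list("1986. What a great season."))  # False
print(is_ordered_list("1. Apples\n2. Oranges"))        # True
```

Under this sketch the escaped form `1986\.` is never needed for a solitary line, because one marker alone does not trigger a list.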

Should CommonMark adopt this rule? Are there any use cases for single element lists (ordered or unordered)?

tin-pot · January 3, 2016, 1:30pm

I happen to write, or rather: misuse, single-item “lists” from time to time, mostly as a means to highlight the item content. But I’m not sure how valuable this possibility should be considered.

On the other hand—and this is not really a form of misuse, but more like an edge case—there might be cases where “conceptionally” several items belong to the same list, while “technically” they are separated by other, non-item stuff, like paragraphs or images or even headings.

In this case, it would be useful to have

- single-item “lists” to host the dispersed list items, and also
- to continue numbering the items in an ordered list “across” the intervening stuff.

Is it not the case that the problem you point out occurs primarily because a line starting with “1986.⎵What a great season.” would interrupt the current paragraph? At least I see no real problem in the case when above and below this line are blank lines in the input text.

So maybe it would be sufficient to “forbid” single-item lists only from occurring inside a paragraph, ie without being separated by blank lines?

Crissov · January 3, 2016, 1:41pm

I’ve seen continuous lists being interrupted by (often explanatory) paragraphs that were not part of list items. This may happen with single items, of course. Contrived example:

Fruits I like the best:
1. Apples
2. Oranges

… but cannot digest properly:
3. Bananas

So, yes, a single-item list would not have to be supported, but it can’t be distinguished reliably from a solitary item of an interrupted list.

tin-pot · January 3, 2016, 3:15pm

but it can’t be distinguished reliably from a solitary item of an interrupted list.

There’s no need for such a distinction, as “interrupted list” is a “meta-term” – from the point of CommonMark, there are only (nested) lists, any “interruption” simply terminates the list. However, a reader may “see” an interrupted list, and see “conceptionally” as a whole (list) what technically are multiple lists.

In a (technically, not “conceptionally”!) single-item list the distinction between “compact” and “loose” style is moot anyway, as far as I can see—right?

So the “Bananas” solitary item in your example (which is in fact what I had in mind!) could be equally well written as:

… but cannot digest properly:

3. Bananas

I would say that according to the current spec, a processor must produce the same result for both “Bananas”. So the “Bananas” item can be separated by a blank line, and the rule “no single-item list in a paragraph” would not impede the use of single-item lists in general (for notating “conceptionally” single lists which were “accidentally” interrupted).

Technically, both “Bananas” items constitute each an independent, single-item list. “Conceptionally” (ie, for the author and/or the reader, well: hopefully “and” …), conceptionally the “Bananas” item is the third item in an “interrupted” list, starting with “Apples” and “Oranges”.

chrisalley · January 4, 2016, 10:36am

I’m not sure what you mean by “inside a paragraph” - can you explain how the 1986. What a great season. example differs from the 3. Bananas example? Or give another example of where the “no single-item list in a paragraph” rule would apply.

tin-pot · January 4, 2016, 12:06pm

The “no single-item list inside a paragraph” rule would apply—and would exclude the recognition of a list item—both in the original example (concerning the “1986.⎵” at the line’s beginning):

Oh, I still remember
1986.⎵What a great season.

or here:

Oh, I still remember
1986.⎵What a great season.
We don't have those today anymore.

And also in the “Banana” example:

… but cannot digest properly:
3.⎵Bananas

Here “3.⎵” would not be accepted as a list item marker, because there is only one (candidate) item in the block, too: just as in the “1986.⎵” examples.

But this rule does no harm, because for example the “Bananas” input text can be written in this (I think completely equivalent) way:

… but cannot digest properly:

3.⎵Bananas

if one wants to create a single-item list containing the numbered “Bananas” item.

And the same holds for the two “1986.⎵” examples, if one would insist on creating a single-item list using this as the item number: This input would produce a single-item list:

Oh, I still remember

1986.⎵What a great season.

As would this:

Oh, I still remember

1986.⎵What a great season.

We don't have those today anymore.

[ Edit: ] Note that said rule does not apply to a “candidate” for a list item marker in the first line of a block (that is, when the block itself starts with an item marker). So this:

Oh, I still remember:

1986.⎵What a great season.
We don't have those today anymore.

would still produce a single-item list, just as it does according to the current CommonMark rules (whether one likes it or not).

chrisalley · January 4, 2016, 12:41pm

Thanks for the clarification. The proposal you’re suggesting is to allow single element lists in some cases, but not others. That’s moving in the direction of a rule that’s non-uniform, something which CommonMark strives against, so perhaps it is better to leave out the “no single-item list inside a paragraph” exception even though it would be useful in some cases.

Regarding the other replies to this topic, some strong arguments have been raised for continuing to support single element lists.

tin-pot · January 4, 2016, 4:42pm

No, that’s not what I wrote. Let me clarify my view further (you might want to skip this, and jump to my “actual list syntax-rule change” proposal below …):

If the rules should be changed with the intent to cursorily (and IMO arbitrarily, ie without good reason or gain) preclude single-item lists in CommonMark, then at least make sure that this will produce a (yes: one-item long) list:

Lorem ipsum dolor sit amet, consectetur adipiscing elit.

1.⎵Donec a diam lectus.

Sed sit amet ipsum mauris.

Because the following will still produce a list, right? (Or I’ll let go all hopes for the syntax …)

Lorem ipsum dolor sit amet, consectetur adipiscing elit.

1.⎵Donec a diam lectus.
2.⎵Donec sed odio eros.

Sed sit amet ipsum mauris.

If you disagree, and suggest that the first input should not create a list but the second input should, then I’d ask you: how is that not moving in the direction of a rule that’s non-uniform? A rule that enforced this would not only be “non-uniform”, but would also lack any good reason and fail to accomplish any advantage, as far as I can see from the discussion so far.

So if you ask me what I’d propose: I see only two reasonable decisions here:

1. Leave the syntax rules as they are now: They are not “perfect”, but I see no pressing need to change them. A free-standing line (preceded and followed by a blank line) starting with “1986.⎵” would still produce a list item. But such a situation is (a) quite easy to recognize visually and to fix while writing, and (b) impossible to arise “inadvertently” through re-formatting a paragraph’s text, so why the impetus to rule out any and all lists with only one item as a “remedy”?
2. Change the syntax rules to avoid “unwanted” list items: The motivation behind your “1986.⎵What a great season.” example seems to be that this might produce a list item where none is intended; otherwise, this whole discussion would be for naught. But “avoiding unwanted list items” has little to do with the number of items in a list, and precluding single-item lists is IMO the wrong way to do it, see below.

Consider an input text where your line appears inside a paragraph like this:

Oh, I still remember
1986.⎵What a great season.
We don't have those today anymore. Or wait 'till
3.⎵October.

Simply declaring single-item lists “verboten” won’t help in this case: as I understand it, this would still produce (as it will now, and as cmark does output):

1. An ordinary paragraph with character content “Oh, I still remember”;
2. A two-item long, ordered list where
   - the first item has content “What a ... Or wait 'till”, and a start attribute value of 1986; and
   - the second item has content “October”, and an (implied) item number of 1987.

Assuming that creating a list like that was not the intention here, observe that:

- Even if single-item lists are forbidden, this paragraph would still generate a (two-item!) list;
- If you’d replace the FULL STOP following “3” in the last line by, say, a COMMA, then (assuming single-item lists were forbidden) the paragraph would suddenly not produce a list any more: how’s that for a subtle change in the input text with non-local consequences?
- The “3.⎵” in the last line could plausibly have ended up there as a result of re-formatting the input text paragraph (as I have tried to show by making it grammatically part of a sentence).
- To check whether this (or any other) block of lines will be treated by a CommonMark processor as a “plain” paragraph, or will be split into list items, one must inspect each and every line start in it: this is so according to the current syntax rules, and it would still be the case according to rules where “single-item” lists were “forbidden” (excluded, suppressed, not accepted, …).

An actual list syntax-rule change proposal

All in all I’d agree that the current rules make the “inadvertent creation” of list items too easy, and too hard to avoid, in particular in cases where the input text has undergone some automatic line-breaking process (be it a text formatter, be it a text generator).

But instead of precluding single-item lists it would seem more sensible and useful to drop the rule that “list item lines” (ie lines starting with “3.⎵” or “-⎵” etc) can interrupt a paragraph even if that paragraph so far was nothing else but an ordinary piece of text, ie a “plain” paragraph which did not start with a list item (marker).

Dropping this rule seems even more sensible when taking into account that the rule is not needed anyway: it achieves nothing that couldn’t be achieved without it (as the “Bananas” examples above should make clear), simply by inserting a blank line before the intended first item line of the list to create.

Does this make sense?

tin-pot · January 4, 2016, 6:02pm

The pertinent quote from the CommonMark spec, as I suppose that’s what you meant by “uniformity”, is (bold highlighting done by me):

Our adherence to the principle of uniformity thus inclines us to think that there are two coherent packages:

1. Require blank lines before all lists and blockquotes, including lists that occur as sublists inside other list items.

2. Require blank lines in none of these places.

reStructuredText takes the first approach, for which there is much to be said. But the second seems more consistent with established practice with Markdown.

To be honest: I don’t buy into the assertion that there are only two “coherent packages” (I’ll take that to mean “design decisions”), but since there are no other “packages” discussed, that seems to be the implication here.

In fact I believe that there is (at least!) one more “consistent package”, and that accepting this simple dichotomy between “everywhere” or “nowhere” (to require a blank line, in our case) as the primary guideline for syntax design does more harm than good.

The “third package” (regarding lists) I’d like to propose would be simply:

Require a blank line before the first line of a list.

That is, here a blank line would (newly) be required:

Aurea prima sata est aetas qua vindice nullo

1. Sponte sua, sine lege, fidem rectumque colebat.

But (as before) not here:

 1. Sponte sua, 
    - sine lege,
    - fidem rectumque colebat.

The main effect (and purpose) of restricting the recognition of “list items”, or more precisely: lines that start a new list, in this way is to avoid creating “accidental” items inside a “plain” paragraph. I think this would be worthwhile, and I see no drawback incurred by introducing this rule.
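To make the proposed rule concrete, here is a minimal sketch in Python. It is only an illustration of “a new list can only start after a blank line”, not a CommonMark parser: the function `classify_lines` and the marker pattern are hypothetical, and indented code blocks, block quotes, and multi-paragraph list items are ignored.

```python
import re

# Hypothetical marker pattern: optional indentation, then a bullet (-, *, +)
# or a decimal numeral followed by "." or ")", then a space.
LIST_MARKER = re.compile(r"^\s*(?:[-*+]|\d{1,9}[.)]) ")

def classify_lines(text: str) -> list[str]:
    """Label each line 'list', 'text' or 'blank' under the proposed rule."""
    labels = []
    in_list = False
    prev_blank = True  # the start of the input counts as a blank line
    for line in text.splitlines():
        if not line.strip():
            labels.append("blank")
            prev_blank, in_list = True, False
            continue
        has_marker = bool(LIST_MARKER.match(line))
        if has_marker and (prev_blank or in_list):
            in_list = True           # a list starts, or an existing one continues
            labels.append("list")
        elif in_list:
            labels.append("list")    # continuation line of the current item
        else:
            labels.append("text")    # a marker inside a plain paragraph is ignored
        prev_blank = False
    return labels

verse = "Aurea prima sata est aetas qua vindice nullo"
item = "1. Sponte sua, sine lege, fidem rectumque colebat."
print(classify_lines(verse + "\n" + item))    # ['text', 'text']
print(classify_lines(verse + "\n\n" + item))  # ['text', 'blank', 'list']
print(classify_lines(" 1. Sponte sua, \n    - sine lege,\n    - fidem rectumque colebat."))
# ['list', 'list', 'list']
```

The last call shows that a sublist inside an existing list item still needs no blank line in this sketch, matching the “(as before) not here” case.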

jgm · January 5, 2016, 10:47pm

This “third package” does violate the principle of uniformity, since if you take this text:

Sponte sua,
- sine lege

and put it in a list item, it would have a different meaning than it does on its own. If you don’t care about the principle of uniformity, then yes, there are many packages, but my claim was about packages that respect the principle.

tin-pot · January 5, 2016, 11:00pm

You mean that

Sponte sua,
- sine lege

would not be a list, but a vanilla paragraph, while “putting it in a list item” like this:

-   Sponte sua
    - sine lege

would produce a “different meaning”, in that the second line would be seen as a subordinate list item?

Well, yes, that’s certainly true. But I would reckon that “determining the (and that may be: a different!) meaning” of a piece of text is precisely what mark-up is supposed to do. I mean, if you take said paragraph and insert four SPACEs in front of each line, wouldn’t that produce a “different meaning”, too? And notably, whether or not the paragraph would be seen as a list before or not: it certainly isn’t afterwards.

The same holds true, for example, switching between ATX heading and setext heading syntax: there too, and with much weaker rationale, you can’t just “transplant” a piece of text from one context into the other without side effects on meaning.

Or do you mean something quite different than that with “principle of uniformity”?

jgm · January 5, 2016, 11:57pm

I mean this: http://spec.commonmark.org/0.23/#principle-of-uniformity

tin-pot · January 6, 2016, 12:10am

Well, okay. That’s a principle.

Do you agree that an implication of this principle is that in:

Sponte sua,
- sine lege

and in:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. 
Donec a diam lectus. 2. Sed sit amet ipsum mauris. 
Maecenas congue ligula ac quam viverra nec consectetur.
3. ante hendrerit. Donec et mollis dolor. 4. Praesent et diam 
eget libero egestas mattis sit amet vitae augue. 5. Nam tincidunt 
congue enim, ut porta lorem lacinia consectetur. 
Donec ut libero sed arcu vehicula ultricies a non tortor. 
6. Lorem ipsum dolor sit amet, consectetur adipiscing elit.

and so on, an author now

1. must inspect each and every line, and
2. must edit each and every line-start which “looks like” a list item marker (that is here: the “-␣”, the “3.␣” and “6.␣”, but not the “2.␣” and “4.␣” etc) into say “3\.␣” and “6\.␣”, and
3. must do this
   - for every decimal numeral after SPACE followed by FULL STOP, and
   - for every HYPHEN-MINUS between SPACES, and
   - for every such ASTERISK and PLUS SIGN too,

in order to ensure that his paragraph will be interpreted as a vanilla paragraph? And is this not more effort, and more special cases, than to have the simple rule “a new list can only start after a blank line”? [ Nota bene: had the second example started with “1.␣” at the beginning of its first line, the situation would be very different: in that case, the decision that this is (meant to be) a list, or at least a list item, would be done and dealt with right in the first line. ]
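The manual chore described above can be sketched as a small Python helper. This is an assumption-laden illustration, not part of any CommonMark tool: the name `escape_list_markers` and both patterns are hypothetical, and only markers at a line start (after at most three spaces of indentation) are escaped.

```python
import re

# Hypothetical patterns for line starts that "look like" list item markers.
BULLET = re.compile(r"^( {0,3})([-*+])(?= )")           # "- ", "* ", "+ "
ORDERED = re.compile(r"^( {0,3}\d{1,9})([.)])(?= )")    # "3. ", "6) "

def escape_list_markers(paragraph: str) -> str:
    """Backslash-escape marker-like line starts so the text stays a paragraph."""
    escaped = []
    for line in paragraph.splitlines():
        line = BULLET.sub(r"\1\\\2", line)    # "- x"  becomes "\- x"
        line = ORDERED.sub(r"\1\\\2", line)   # "3. x" becomes "3\. x"
        escaped.append(line)
    return "\n".join(escaped)

sample = "Maecenas congue ligula.\n3. ante hendrerit. 4. Praesent et diam\n- sine lege"
print(escape_list_markers(sample))
# Maecenas congue ligula.
# 3\. ante hendrerit. 4. Praesent et diam
# \- sine lege
```

Note that the “4.” in the middle of the second line is left alone, since only a marker at the start of a line can open a list item.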

Because that’s how I see the uniformity at play here.

jgm · January 6, 2016, 5:12am

It’s a matter of balances. Each of the proposals has advantages and disadvantages. If I were designing for myself, I’d take the reStructuredText package, requiring blank lines everywhere. I don’t think that’s really an option, though, since it would break too many existing Markdown documents. Another option is to give up the principle of uniformity, as you suggest. But then a completely new spec for block quotes and list items would need to be drawn up, since the current one assumes that the principle holds. The principle is also, I think, a good principle; uniformity makes the syntax easier to remember. The drawback is, just as you say, an increased risk of accidentally triggering lists. One has to weigh the pros and cons (and believe me, I’ve done that).

Note that even if one gives up the principle of uniformity to reduce risk of accidental lists at the outer level, that wouldn’t do anything to reduce the same kind of risk in list items themselves:

* Lorem ipsum dolor sit amet, consectetur adipiscing elit. 
  Donec a diam lectus. 2. Sed sit amet ipsum mauris. 
  Maecenas congue ligula ac quam viverra nec consectetur.
  3. ante hendrerit. Donec et mollis dolor. 4. Praesent et diam 
  eget libero egestas mattis sit amet vitae augue. 5. Nam tincidunt 
  congue enim, ut porta lorem lacinia consectetur. 
  Donec ut libero sed arcu vehicula ultricies a non tortor. 
  6. Lorem ipsum dolor sit amet, consectetur adipiscing elit.

So, authors are going to need to look out for this kind of thing in any case.

tin-pot · January 6, 2016, 7:05am

Yes, that was the gist of my “nota bene” remark; at least in this case a look at the first line of the block would give a hint whether it falls into the “list” or “vanilla paragraph” camp.

Apart from that, I agree that at the core of this matter is, as you state:

The principle is also, I think, a good principle; uniformity makes the syntax easier to remember. The drawback is, just as you say, an increased risk of accidentally triggering lists. One has to weigh the pros and cons (and believe me, I’ve done that).

I happen to weigh said risk a bit heavier than formally sticking to a principle—but that’s indeed a matter of preferences.

Given the disparity in current implementations’ behavior, any Markdown writer should watch out for that risk anyway, and probably for a long time to come.

An even nastier case is documented in your BabelMark FAQ as List items and code spans:

- `a long code span can contain a hyphen character like this
  - and it can screw things up`

In light of the various implementations’ results, including the CommonMark implementation, the phrase “can screw things up” seems more than appropriate: there’s not one implementation that keeps the code span intact; but then again there seems to be no remedy as long as the basic parsing model, “blocks first, then inline content”, applies. And here I do agree that this should probably stay as it is.

[ So in a sense, disrupting the code span is correct by definition; and the only guideline for writers would be “just don’t do that!” ]
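
A toy Python sketch of the “blocks first, then inline content” model shows why the code span cannot survive. Everything here is a deliberate simplification and an assumption for illustration: the helpers `split_items` and `code_spans` are hypothetical, nesting is flattened, and real implementations are far more involved.

```python
import re

# Hypothetical block-phase pattern: any line starting with "- " opens an item.
ITEM = re.compile(r"^\s*- (.*)$")

def split_items(text: str) -> list[str]:
    """Block phase: cut the input into list items before looking at inlines."""
    items = []
    for line in text.splitlines():
        match = ITEM.match(line)
        if match:
            items.append(match.group(1))
    return items

def code_spans(item: str) -> list[str]:
    """Inline phase: find backtick-delimited spans inside one item only."""
    return re.findall(r"`([^`]+)`", item)

source = (
    "- `a long code span can contain a hyphen character like this\n"
    "  - and it can screw things up`"
)
items = split_items(source)
print(len(items))                             # 2
print([code_spans(item) for item in items])   # [[], []]
```

Each item ends up holding a single unmatched backtick, so the inline phase finds no code span in either one.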