Nested ordered lists indented with 2 spaces are broken

#1

According to the list-items rule (that sublists need to be indented to the column of the first non-whitespace character after the list marker), the following Markdown:

1. item
2. another item

  1. sub-list item
  2. sub-list another item
  3. sub-list last item

3. last item

Represents a single non-nested list:

<ol>
  <li><p>item</p></li>
  <li><p>another item</p></li>
  <li><p>sub-list item</p></li>
  <li><p>sub-list another item</p></li>
  <li><p>sub-list last item</p></li>
  <li><p>last item</p></li>
</ol>

Which seems wrong, because using a 2-space indent is enough to have the list rendered correctly with marked, redcarpet (& therefore GitHub), the original Markdown implementation, and PHP Markdown (see comparison on Babelmark).

I haven’t gotten any stats yet on how popular 2-space indents are among people who write Markdown, but since there’s at least one style guide recommending them, I think it’s more than an edge case.

For people who are thinking about indentation in terms of the number of spaces before the list marker, it’s quite strange that you can use 2 spaces to nest a list within an unordered list, but need at least 3 spaces to nest within an ordered one. The absurdity of this rule is illustrated further when you have ordered lists in the 2-3 digit range:

99. list item 99

    - sublist item
    - sublist item 2

100. list item 100

    - sublist item
    - sublist item 2

The tutorial at http:// commonmark.org/help/tutorial/10-nestedLists.html says:

To nest one list within another, indent each item in the sublist by four spaces.

…But that doesn’t work in the above example because once you reach item 100 you need to switch to a 5-space minimum indent (see comparison on Babelmark).
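
To make the arithmetic concrete, here is a minimal Python sketch of the rule as described in this post: the minimum sublist indent is the column of the first non-whitespace character after the list marker. The regex and the `content_column` helper are illustrative names of my own, not part of any CommonMark implementation, and the sketch deliberately ignores edge cases such as empty list items or five or more spaces after the marker.

```python
import re
from typing import Optional

# Hypothetical, simplified pattern: up to 3 leading spaces, then a bullet
# ("-", "+", "*") or an ordered marker ("1." / "1)"), then 1-4 spaces
# followed by actual content.
LIST_MARKER = re.compile(r"^( {0,3})(\d{1,9}[.)]|[-+*])( {1,4})(?=\S)")


def content_column(line: str) -> Optional[int]:
    """Return the 0-based column where the item's content starts,
    or None if the line does not look like a list item."""
    match = LIST_MARKER.match(line)
    if match is None:
        return None
    # The match ends right before the first non-space character
    # after the marker, which is the required sublist indent.
    return match.end()


for item in ["- item", "1. item", "99. list item 99", "100. list item 100"]:
    print(f"{item!r}: sublist needs at least {content_column(item)} spaces")
```

Under these assumptions the required indent is 2 for `- item`, 3 for `1. item`, 4 for `99. list item 99`, and 5 for `100. list item 100`, which is why a fixed four-space habit stops working at item 100.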

I’ve read the “motivation” section of the spec discussing this rule, but I’m unconvinced because cases like:

1.  foo

        indented code

…Are already handled perfectly by all parsers. And cases like:

10. foo

   bar

…Probably should be read as a list item with a subparagraph.

This was moved from https:// github.com/jgm/CommonMark/issues/399 and a few other discussions have covered the same topic:

PS: The “changing the bullet or ordered list delimiter starts a new list” rule is an awful thing to put in the standard. No other parsers do that.

PPS: Discourse says “new users can only put 4 links in a post”, so you should go to GitHub to read this post without most of the links intentionally broken or removed.


#2

Some implementations often allow a 2-space indent for a sublist, but no implementation on BabelMark2 follows a 2-space indent rule consistently:
http://johnmacfarlane.net/babelmark2/?normalize=1&text=+1.+a +++2.+b +++++3.+c +++++++4.+d

A fairly recent style guide does recommend this. I think it’s incredibly bad advice for a style guide. Many implementations don’t allow a two-space indent for a sublist (and as I’ve noted above, none allow it consistently). Gruber’s original syntax description says nothing about a two-space indent, and hints at a four-space rule, which many implementations follow. A style guide should encourage a style that will work well across implementations, and a four-space indent for sublists is the best advice for portability.

If you want the list items to line up, you can achieve that in CommonMark like this:

99.  list item 99

     - sublist item
     - sublist item

100. list item 100

     - sublist item
     - sublist item

Let me elaborate on the indented code issue. Take

foo

    indented code

What is that? A paragraph followed by a code block. Now, let’s put it in a list item. Intuitively, you might think you could do that this way: put a list marker in front of the thing and indent the whole thing consistently,

- foo

      indented code

But as you can see from BabelMark2, many implementations will parse this as two paragraphs. You have to add two extra spaces to the indented code (bringing it to a total of 8) in order to get a code block:

- foo

        indented code

This eight-space requirement is mentioned in Gruber’s original syntax description. It makes sense if we have a consistent four-space rule: if all block-level content under a list item needs a four-space indent, then it’s natural for indented code to require an eight-space indent. That’s the choice pandoc makes. But if we want to allow nested list items to be indented less than four spaces, then always requiring an eight-space indent for indented code in a list item is unnatural. It means that, when you put a block of text in a list item, you sometimes need to adjust the relative indentation inside that block of text, or it has a different meaning, which is a violation of what in the spec we call the Principle of Uniformity.

So, CommonMark uses the first nonblank text after the list item marker as the “baseline” for determining what following content is a child of the list item. This preserves the very nice property that, if you have some text and stick it in a list item, it will have the same meaning inside the list item as it had outside of it. This decision, however, is incompatible with a rule that always allows sublists to be indented two spaces.
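
As a rough illustration of that property, the sketch below puts a block of text into a list item by prefixing the first line with a marker and indenting every following non-blank line by the marker's width. The `wrap_in_list_item` helper is a hypothetical name used only for this example, and it assumes the block starts with a non-blank line.

```python
def wrap_in_list_item(block: str, marker: str = "- ") -> str:
    """Prefix the first line with `marker` and shift the remaining
    non-blank lines right by the marker's width, so their indentation
    relative to the first line's content is unchanged."""
    pad = " " * len(marker)
    lines = block.split("\n")
    wrapped = [marker + lines[0]]
    for line in lines[1:]:
        # Blank lines stay empty; everything else keeps its relative indent.
        wrapped.append(pad + line if line.strip() else "")
    return "\n".join(wrapped)


# A paragraph followed by a four-space indented code block.
block = "foo\n\n    indented code"

print(wrap_in_list_item(block))           # code line ends up at 2 + 4 = 6 spaces
print(wrap_in_list_item(block, "100. "))  # code line ends up at 5 + 4 = 9 spaces
```

Because the baseline is the content column rather than a fixed width, the code line stays four spaces past the item's text whichever marker is used, so no per-marker adjustment of the inner indentation is needed.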

Now, of course, you’re right that if someone writes

1. a
  1. b

they probably meant ‘b’ to be a sublist of ‘a’. But then, why not also allow a one-space indent? Not only does a two-space rule violate Uniformity, it also seems arbitrary.

By the way, if you think that marked, RedCarpet, kramdown, and PHP Markdown have good list parsing rules, what about this case?

The question is whether there are real-world cases where it would make sense to change the list bullet without starting a new list. But if you want to discuss this further, please open a separate topic so we can keep the issues separate.


#3

Now that I’ve found out about this issue (and the lack of proper support from implementations), I agree with you. I’m thinking about changing my tool tidy-markdown so it stops using 2-space indents as well.

However, none of that changes the fact that people (myself included) have been writing quite a bit of Markdown with 2-space indents. If CommonMark ignores the use of 2-space indents then either many documents will break or people just won’t switch to CommonMark.

I’m not worried about getting the list items to line up. I’m wondering how we expect people to know that once their lists have markers as long as 100. they won’t be able to use the 4-space rule that the tutorial taught them. My point is that it’s not intuitive for the width of your list item marker to dictate the minimum amount of indentation needed.

Yeah, I wouldn’t mind breaking compatibility for something like that… A decrease in indentation shouldn’t result in a nested list.

I wouldn’t mind breaking compatibility for something like that either. IMO, triple-nested lists are pretty rare in the wild, and I highly doubt that anyone has written their Markdown with the expectation that:

1. a
  1. b
    1. c
      1. d

…will compile into anything other than 4 lists nested within each other.

I could see a case for allowing a 1-space indent. Personally, I’ve never found someone who writes their Markdown with 1-space indents, and I haven’t found any other languages that use an indent smaller than 2 spaces, but I guess that:

- foo
 - bar

…Kinda looks like a nested list, and there is a fair bit of support for it.

I’m not arguing for a 2-space rule… That would probably break even more documents than the current rule. I actually don’t know what rule would be best. So far, the rule that marked uses is the closest to how I would expect Markdown to work, but it’s complicated to describe. All I want to do right now is point out that the current rule breaks in some rather odd ways and that 2-space indented documents should be supported.


#4

I don’t know of any way to support 2-space indented nested lists (even with ordered list markers) without breaking other desirable properties. I’ve tried to explain why above, and in the spec itself. Given the massive divergences between implementations in their treatments of lists, some breakage of existing documents is inevitable—no matter what rules we use. Fortunately, given a reliable HTML → CommonMark converter (such as pandoc -f html -t commonmark or https://github.com/jgm/html2cmark), there will be a good automatic migration path: just use your current converter to produce HTML, and then convert that back to CommonMark.
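
For anyone who wants to script the second half of that migration path, here is a minimal Python sketch that pipes HTML through the `pandoc -f html -t commonmark` command mentioned above. It assumes pandoc is installed and on the PATH; the `html_to_commonmark` helper name is made up for this example, and producing the HTML with your current converter is left to whatever tool you already use.

```python
import subprocess

# The pandoc invocation quoted in the post above.
PANDOC_CMD = ["pandoc", "-f", "html", "-t", "commonmark"]


def html_to_commonmark(html: str) -> str:
    """Feed HTML to pandoc on stdin and return the CommonMark it prints.
    Raises CalledProcessError if pandoc exits with a non-zero status."""
    result = subprocess.run(
        PANDOC_CMD,
        input=html,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `html_to_commonmark("<ol><li>item</li></ol>")` on a machine with pandoc available should return the list rewritten in CommonMark syntax; it is worth checking the output by hand on a few documents before converting in bulk.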

I hope you will change your linter to stop giving the 2-space indent advice, since it’s incompatible with a good many implementations.

As for the use of the 4-space rule in our tutorial: the tutorial is just meant to get people up and running, not to instill all the finer points of the syntax. It’s not too likely that people will have ordered lists with more than 100 elements, so I think the advice is fine. (The author of the tutorial apparently thought it best not to explain the actual CommonMark rule, opting for something simpler, and something more portable across implementations.)


#5

I really hate this. Do we expect people to switch the TAB key to insert 3 spaces? It’s going to be really hard to fix indentation when you have a lot of child content. I can’t just multi-select lines and hit TAB according to the level it needs to be at. From an authoring standpoint this is bad. From an “I want to look at a pretty source document” standpoint it’s good.
