Tabs in bullet lists, how do they work?

I can’t understand how tabs work when indenting bullet lists. According to the spec,

in contexts where indentation is significant for the document’s structure, tabs behave as if they were replaced by spaces with a tab stop of 4 characters.

When I try

 - 1 space
   - 3 spaces
     - 5 spaces
     - tab + space

the result is equivalent to

  • 1 space
  • 3 spaces
    • 5 spaces
  • tab + space

but when I change the order of the last two lines

 - 1 space
   - 3 spaces
     - tab + space
     - 5 spaces

I get

  • 1 space
  • 3 spaces
  • tab + space
  • 5 spaces

Not only does it completely change the structure of the list depending on the order of the items, but both cases differ from what I would have expected and what I see in the editor (with tabstop=4):

  • 1 space
  • 3 spaces
    • tab + space (or 5 spaces)
    • 5 spaces (or tab + space)

Could someone please explain to me the reasoning behind this behavior and how it follows from the spec?

Here’s a Dingus for minimum and maximum number of spaces around a list item marker for three-level lists, no tabulators.

[Note: In code blocks below, I mark spaces with a middle dot ·, tabulators with an elongated arrow ——⇥, and level start column with a circumflex ^ if there is indentation afterwards.]

The actual spec may differ slightly, but basically the minimal indentation per level is the start column of the previous level, which is determined by its indentation plus characters in its line prefix (here 1: -, but 2+ in enumerated list items) plus spaces after the prefix (1–4). Therefore, each nested level must be indented at least 2 places more than the previous one.

-·0 space (0+0+0)
^·-·2 spaces (0+1+1)
^·^·-·4 spaces (2+1+1)

The maximal indentation per level is the start column of the previous level (as defined above) plus 3.

^··-·3 spaces (0+0+0+3)
^····^··-·8 spaces (3+1+1+3)
^····^····^··-·13 spaces (8+1+1+3)
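
The two bounds above can be sketched as a small calculation. This only illustrates the rule as I described it here, not the spec's exact wording; the function name `child_indent_range` and its parameters are made up for the example.

```python
def child_indent_range(parent_indent: int, marker_width: int = 1,
                       spaces_after: int = 1) -> tuple[int, int]:
    """Return (min, max) indentation for a list item nested under a parent.

    Assumption: the parent's start column is its own indentation plus the
    marker width plus the spaces after the marker; a child may be indented
    by that amount, or by up to 3 columns more.
    """
    start_column = parent_indent + marker_width + spaces_after
    return start_column, start_column + 3


# Minimal nesting: always pick the lower bound.
indent = 0
minimal = [indent]
for _ in range(2):
    indent = child_indent_range(indent)[0]
    minimal.append(indent)
print(minimal)  # [0, 2, 4]

# Maximal nesting: the first level may itself be indented by 3,
# then always pick the upper bound.
indent = 3
maximal = [indent]
for _ in range(2):
    indent = child_indent_range(indent)[1]
    maximal.append(indent)
print(maximal)  # [3, 8, 13]
```

The printed values match the indentation counts in the two examples above (0, 2, 4 and 3, 8, 13).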

Each tabulator before the line prefix should advance the nesting level by exactly one, because it’s more than the maximal 3 spaces there. It jumps to the next start column, which therefore works like a tabstop.

·- 1 space
^  ^ start for 2nd level
···- 3 spaces
^  ^ ^ start for 3rd level
·····- 5 spaces
^  ^ ^ ^ start for 4th level
——⇥·- 1 tab (to first ^) + 1 space
^   ^ ^ start for 3rd level

·- 1 space
^  ^ start for 2nd level
···- 3 spaces
^  ^ ^ start for 3rd level
——⇥·- 1 tab + 1 space
^   ^ ^ start for 3rd level
·····- 5 spaces
^    ^ ^ start for 3rd level

A tabulator cannot jump to a column more than 4 characters away!

···- 1 space
^    ^ start for 2nd level
———⇥- 1 tab (max. 4 spaces)
^     ^ start for 2nd level

These flexible tabulators actually make some sense, but lesson learned: don’t mix tabs and spaces for indentation!

Thanks for your reply!

So tabs are expanded as if there was a tabstop after 4 spaces or at the next start column?

I understand how these rules create the results that I got, but I still don’t understand why these rules exist. In particular, where in the spec does it say so? Because it seems like a clear violation of 2.2 to me…

This looks like an implementation bug to me. The spec pretty clearly says that tab + space indentation should be equivalent to 5 spaces indentation. Could you put an issue up at either jgm/cmark or jgm/commonmark.js or both?
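
To make the expected equivalence concrete, here is a minimal sketch of the tab rule quoted from the spec, assuming a fixed tab stop of 4. The helper name `indent_columns` is hypothetical and is not taken from cmark or commonmark.js; it only measures how many columns of leading whitespace a line has.

```python
def indent_columns(line: str, tab_stop: int = 4) -> int:
    """Return the width, in columns, of the leading whitespace of `line`.

    A tab advances to the next multiple of `tab_stop`; a space advances
    by one column. Counting stops at the first other character.
    """
    col = 0
    for ch in line:
        if ch == " ":
            col += 1
        elif ch == "\t":
            col += tab_stop - (col % tab_stop)
        else:
            break
    return col


print(indent_columns("\t - tab + space"))  # 5
print(indent_columns("     - 5 spaces"))   # 5
```

Under this reading both lines start their list marker at the same column, so they should end up at the same nesting level regardless of their order.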

I added an issue for each repository. That closes this issue for me.