Counterintuitive interaction of tabstops and list items

There are a couple of instances where the tabstop rule (“in contexts where whitespace helps to define block structure, tabs behave as if they were replaced by spaces with a tab stop of 4 characters”) interacts in possibly unexpected ways with the list item rules.

Since I had some problems with them when trying to write an implementation, I was going to submit a pull request to the spec with explanation and examples. But first, I decided I wanted to bring the examples up here and make sure that they were indeed the results people wanted in the spec going forward.

Both of the examples below follow from the tabstop rule, so I’m not suggesting they’re bugs. The confusion arises from the fact that tabstops are calculated before a list item is parsed, so the meaning of a block can change when a list marker is added or altered. This means that rules like 5.2.1 and 5.2.4, which claim that blocks shouldn’t be changed by the addition of list markers, should mention the tabstop complications. In addition, the tabstop rule should mention that it takes priority over other rules.

(I will admit, though, that I’m pretty unconvinced that tabstops are worth these complications.)

Example 1
If we have the following two blocks:

Foo

→Bar

we get

<p>Foo</p>
<pre><code>Bar
</code></pre>

but if we add a 3-wide list marker, we get different block contents:

1.␣Foo

␣␣␣→Bar

produces

<ol>
<li>
<p>Foo</p>
<p>Bar</p>
</li>
</ol>

However, if we add a 4-wide list marker,

1.␣␣Foo

␣␣␣␣→Bar

we get the original block contents:

<ol>
<li>
<p>Foo</p>
<pre><code>Bar
</code></pre>
</li>
</ol>
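
To make the arithmetic behind Example 1 concrete, here is a rough sketch in Python. It assumes the destructive expansion model (tabs replaced by spaces up to the next multiple of 4) and a deliberately simplified test: a line whose indentation exceeds the list item's content offset by 4 or more columns is treated as indented code. The helper names `indent_width` and `classify` are my own, purely for illustration; they are not from the spec or from any implementation.

```python
# Illustrative sketch only: indent_width and classify are hypothetical helpers,
# not part of the spec or of any existing implementation.

def indent_width(line: str, tabstop: int = 4) -> int:
    """Count columns of leading whitespace after expanding tabs to tab stops."""
    expanded = line.expandtabs(tabstop)
    return len(expanded) - len(expanded.lstrip(" "))


def classify(line: str, marker_width: int) -> str:
    """Simplified rule: 4+ columns past the content offset means indented code."""
    relative = indent_width(line) - marker_width
    return "code" if relative >= 4 else "paragraph"


print(classify("\tBar", 0))      # no list marker
print(classify("   \tBar", 3))   # continuation line under "1. Foo"
print(classify("    \tBar", 4))  # continuation line under "1.  Foo"
```

Under these assumptions the three calls print code, paragraph, code. The tab in the second case only advances from column 3 to column 4, so Bar ends up just one column past the content offset, whereas in the third case the tab starts at column 4 and advances all the way to column 8.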

Example 2
The tabstop rule means that adding 1-3 spaces before a list marker can change the meaning of the blocks in the list:

-␣␣→Foo

produces

<ul>
<li>Foo</li>
</ul>

but adding a space before the bullet

 -␣␣→Foo

produces a different result

<ul>
<li>
<pre><code> Foo
</code></pre>
</li>
</ul>

This would seem to go against 5.2.4:

If a sequence of lines Ls constitutes a list item according to rule #1, #2, or #3, then the result of indenting each line of Ls by 1-3 spaces (the same for each line) also constitutes a list item with the same contents and attributes.

Because of the tabstop rule, it doesn’t quite contradict it, but 5.2.4 certainly seems to need a bit of elaboration.
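
The same column counting shows what happens in Example 2. The sketch below again assumes destructive tab expansion with a tab stop of 4, and it only handles a `-` bullet; `gap_after_marker` is a hypothetical helper name, not something defined in the spec.

```python
# Illustrative sketch only: gap_after_marker is a hypothetical helper that
# handles a single "-" bullet and nothing else.

def gap_after_marker(line: str, tabstop: int = 4) -> int:
    """Columns of whitespace between the bullet and the content, after tab expansion."""
    expanded = line.expandtabs(tabstop)
    marker_end = expanded.index("-") + 1
    rest = expanded[marker_end:]
    return len(rest) - len(rest.lstrip(" "))


print(gap_after_marker("-  \tFoo"))   # bullet at column 0
print(gap_after_marker(" -  \tFoo"))  # bullet at column 1
```

With the bullet at column 0 the tab sits at column 3 and is worth a single column, giving a gap of 3, so Foo is ordinary content. Shifting the bullet one column to the right puts the tab at column 4, where it is worth four columns, and the gap jumps to 6. One column of that belongs to the list marker and the remaining five exceed the indented-code threshold, leaving one leading space inside the code block.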

Conclusion
As I said, I’m not sure myself that the tabstop rule is worth all this, but I also don’t know the full background of the decision, so I’m not going to argue against it. If we do have the tabstop rule, though, I think that there needs to be

  1. some discussion of precedence in the tabstop rule, and

  2. examples/tests like the above in 5.2.1 and 5.2.4, along with explanation.

I agree it would be helpful to have in the spec examples and explanations like the ones you give, and perhaps some explicit discussion of how the list rules interact with the tabstop rule.

I don’t think changing the tabstop rule would be a good idea; it is designed to ensure backwards compatibility with existing Markdown implementations that do a destructive tab expansion before parsing. It also makes good sense: if your editor tabstop is set to 4, you can use spaces or tabs as you like for indentation, and it won’t make any difference.

I also had a hard time understanding the implications of the tabstop rule; see issue #410 on GitHub.

I understand the wish for backwards compatibility, but is it really worth it?

Are there any numbers available about how many existing Markdown sources are actually using tabs?
Or at least some anecdotal evidence that people are using tabs?
