Tab-related issues



Here’s a nice case:

>[TAB][TAB]x

> x

Here is a summary of the approaches that different implementations take:

  1. Treat this exactly as if it’s expanded to spaces, that is, as equivalent to > + 7 spaces + x. In this case the first space is gobbled up as an optional space after >, the next 4 are treated as indentation for a code block, and we get a code block with two leading spaces.

  2. Treat this as if it’s expanded to spaces, but don’t treat the first space as an optional space after >. Then we get a code block with three leading spaces.

  3. Treat the first tab as an optional space character after >, and the second tab as the indentation of a code block. Then we get a code block with just x, with no leading spaces.

  4. Don’t treat the first tab as an optional space character. Treat it as code block indentation, leaving the second tab as part of the code block contents. Then we get a code block with a leading tab.
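To make the first three approaches concrete, here is a rough Python sketch. The helper names (`approach_1`, `approach_2`, `approach_3`) are made up for illustration and are not taken from any implementation. It assumes a 4-column tab stop and a line that is already known to be a block quote line whose content is indented enough to be a code block.

```python
TAB_STOP = 4     # tab stop width assumed in this thread
CODE_INDENT = 4  # columns of indentation that open an indented code block


def approach_1(line: str) -> str:
    """Expand tabs, drop one optional space after '>', then strip the code indent."""
    expanded = line.expandtabs(TAB_STOP)
    rest = expanded[1:]              # everything after the '>' marker
    if rest.startswith(" "):
        rest = rest[1:]              # the optional space after '>'
    return rest[CODE_INDENT:]


def approach_2(line: str) -> str:
    """Expand tabs, but do not consume an optional space after '>'."""
    expanded = line.expandtabs(TAB_STOP)
    return expanded[1 + CODE_INDENT:]


def approach_3(line: str) -> str:
    """Work on raw characters: one whitespace character is the optional
    space after '>', and one tab is the code block indentation."""
    rest = line[1:]
    if rest[:1] in (" ", "\t"):
        rest = rest[1:]
    if rest.startswith("\t"):
        rest = rest[1:]
    return rest


line = ">\t\tx"
print(repr(line.expandtabs(TAB_STOP)))  # '>       x'  (7 spaces)
print(repr(approach_1(line)))           # '  x'
print(repr(approach_2(line)))           # '   x'
print(repr(approach_3(line)))           # 'x'
```

The first tab only advances from column 1 to column 4, so it contributes 3 columns, while the second contributes a full 4. That is where the 7 spaces come from.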

Now that we’re allowing tabs in code blocks, I find the first two approaches strange. After all, there are no spaces in the Markdown source – only tabs – so why should the code block contain spaces, which aren’t there? (This made more sense when we converted all tabs to spaces before parsing.)

I’m somewhat inclined to favor the 4th approach, but I fear that it might break some existing documents, making regular block quotes into quoted code blocks. So maybe the 3rd approach is best overall.

There is a related issue about lists. How, exactly, do we calculate padding when the list marker is followed by a tab? If we pretend we’ve converted tabs to spaces here, we’ll again get code blocks that contain spaces when there are none in the text.

1.[TAB][TAB]hi

hi

Note that this looks like a bug; this should contain a code block of some kind. But I need to get clearer on the spec before this can be fixed.
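For reference, the column arithmetic for the list example can be sketched like this. `columns_after_marker` is a hypothetical helper, and the last step assumes the same reading as approach 1 above (one column separates the marker from the content, four columns are code block indentation), which is exactly the point the spec still has to settle.

```python
TAB_STOP = 4  # tab stop width assumed in this thread


def columns_after_marker(line: str, marker: str) -> int:
    """Count the columns of whitespace between a list marker and the text,
    after expanding tabs to the assumed tab stop."""
    expanded = line.expandtabs(TAB_STOP)
    rest = expanded[len(marker):]
    return len(rest) - len(rest.lstrip(" "))


line = "1.\t\thi"
width = columns_after_marker(line, "1.")
print(width)  # 6: the first tab spans columns 2-3, the second spans columns 4-7

# Assumption: 1 column separates the marker from the content and 4 columns
# are code block indentation, so whatever is left ends up inside the code block.
leftover = width - 1 - 4
print(leftover)  # 1
```

Under that reading the code block would contain one leading space before `hi`, even though the source has no spaces at all.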

Comments welcome.

Issues we MUST resolve before 1.0 release [8 remaining]

In approach 4, why would the first tab be treated as code block indentation? It’s equivalent to only 3 spaces when using the “4 space tab stop” rule, no?


Yes, I suppose you’re right that #4 doesn’t make much sense.
I’d be curious about your thoughts on the others.


I have adjusted the reference implementations to implement approach 1, which I think is the one most consistent with existing processors. (The changes are in the repository but not yet in a released version.) Also added a few more tab-related cases to the spec.


Oh, I thought you made a good argument in favor of option #3, which I preferred (even without @robinst debunking #4) since it comes closest to elastic tab-stops, but backwards compatibility is also a good point and apparently in favor of #1.

Option #2 didn’t make sense at all, because quotations are (now) the only block elements where the space is optional, which is a point that should be hidden as much as possible (if it can’t be changed). The mental model for authors is simpler if at least one space after the (possibly indented) line prefix is always required.


My thought is that if two texts look the same (with tab stop set to 4 spaces), then they should render the same. Also, backwards compatibility is important.
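One way to state that rule as a check, as a small Python sketch (`looks_same` is a made-up name, and the 4-column tab stop is the assumption used throughout this thread):

```python
def looks_same(a: str, b: str, tab_stop: int = 4) -> bool:
    """Two source texts look the same if they are identical once tabs
    are expanded to the given tab stop."""
    return a.expandtabs(tab_stop) == b.expandtabs(tab_stop)


print(looks_same(">\t\tx", ">" + " " * 7 + "x"))  # True
print(looks_same(">\t\tx", ">" + " " * 2 + "x"))  # False
```

A processor following this rule would have to render any two inputs for which `looks_same` is true identically, which is what approach 1 does.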