Example 2 ambiguity

Topogram · May 24, 2019, 7:18am

I’m trying to implement a parser for Markdown in a language called Xojo.

I think there is some ambiguity in example 2:

••→foo→baz→→bim

The spec says this should output:

<pre><code>foo→baz→→bim
</code></pre>

The spec states:

in contexts where whitespace helps to define block structure, tabs behave as if they were replaced by spaces with a tab stop of 4 characters.

And slightly later in the same section it states:

In the following case > is followed by a tab, which is treated as if it were expanded into three spaces. Since one of these spaces is considered part of the delimiter, foo is considered to be indented six spaces inside the block quote context, so we get an indented code block starting with two spaces.

Doesn’t this mean that the output of example 2 should be:

<pre><code>••foo→baz→→bim
</code></pre>

I.e., why is the tab before foo not being expanded at all? This seems confusing.

mity · May 24, 2019, 11:50am

Tabs are not treated as if they were replaced with 4 spaces. They are treated as if there were tab stops every four columns. That means that at the beginning of a line <tab> is equivalent to <space><space><tab>.

The logic is the same as how any sane editor behaves when you set the tab width to 4 in its configuration.
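
To make the tab-stop rule concrete, here is a minimal Python sketch (not taken from any CommonMark implementation; `leading_indent` and `TAB_STOP` are names made up for illustration). It measures the column reached by a line's leading whitespace, assuming tab stops every four columns:

```python
TAB_STOP = 4  # assumed tab stop width, as used by the spec for block structure


def leading_indent(line: str) -> int:
    """Return the column reached after the leading spaces and tabs of `line`."""
    column = 0
    for ch in line:
        if ch == " ":
            column += 1
        elif ch == "\t":
            # A tab jumps to the next multiple of TAB_STOP.
            column += TAB_STOP - (column % TAB_STOP)
        else:
            break
    return column


print(leading_indent("\tfoo"))      # 4
print(leading_indent("  \tfoo"))    # 4
print(leading_indent("    \tfoo"))  # 8
```

Both `<tab>` and `<space><space><tab>` end at column 4, which is why example 2 is indented exactly far enough to start an indented code block.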

Topogram · May 24, 2019, 12:17pm

Ah, I think I see. So if a tab is encountered it should be treated as “up to 4 spaces” for the purposes of determining indentation. Correct?

mity · May 24, 2019, 12:23pm

Yes. A tab counts as N spaces, where 1 <= N <= 4 is chosen so that you reach the next offset divisible by four (computed from the start of the line).
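
As a rough illustration of how this plays out for both cases quoted above, the following Python sketch removes a given number of indentation columns from a line. `strip_columns` and its parameters are hypothetical names, and the handling of a partly consumed tab (its leftover columns become spaces) is my reading of the spec text, not code from a real parser:

```python
def strip_columns(line: str, columns: int, start: int = 0, tab_stop: int = 4) -> str:
    """Remove `columns` columns of leading whitespace from `line`.

    `start` is the column at which `line` begins. A tab that is only
    partly consumed leaves its remaining columns behind as spaces.
    """
    column = start
    target = start + columns
    i = 0
    while i < len(line) and column < target:
        ch = line[i]
        if ch == " ":
            column += 1
        elif ch == "\t":
            # N = tab_stop - (column % tab_stop), so 1 <= N <= tab_stop.
            column += tab_stop - (column % tab_stop)
        else:
            break
        i += 1
    return " " * max(column - target, 0) + line[i:]


# Example 2: two spaces plus a tab reach column 4, exactly the code block indent.
print(repr(strip_columns("  \tfoo\tbaz\t\tbim", 4)))  # 'foo\tbaz\t\tbim'

# Block quote case: after ">" the rest of the line starts at column 1.
rest = strip_columns("\t\tfoo", 1, start=1)   # drop the delimiter's one space
print(repr(rest))                             # '  \tfoo'
print(repr(strip_columns(rest, 4, start=2)))  # '  foo'
```

In example 2 the first tab is used up entirely by the four columns of code block indentation, so nothing of it is left over, and the later tabs are ordinary content that is passed through unchanged. In the block quote case the first tab spans three columns, one goes to the delimiter, and six columns of indentation remain, of which four are the code block indent and two survive as spaces.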

Topogram · May 24, 2019, 12:37pm

Got it. Thank you for explaining.