Example 2 ambiguity

I’m trying to implement a parser for Markdown in a language called Xojo.

I think there is some ambiguity in example 2:

••→foo→baz→→bim

The spec says this should output:

<pre><code>foo→baz→→bim
</code></pre>

The spec states:

in contexts where whitespace helps to define block structure, tabs behave as if they were replaced by spaces with a tab stop of 4 characters.

And slightly later in the same section it states:

In the following case > is followed by a tab, which is treated as if it were expanded into three spaces. Since one of these spaces is considered part of the delimiter, foo is considered to be indented six spaces inside the block quote context, so we get an indented code block starting with two spaces.

Doesn’t this mean that the output of example 2 should be:

<pre><code>••foo→baz→→bim
</code></pre>

I.e., why is the tab before foo not expanded at all? This seems confusing.

Tabs are not treated as if they were replaced with 4 spaces. They are treated as if there were tab stops every four columns. That means that at the beginning of a line <tab> is equivalent to <space><space><tab>: both advance to the first tab stop, at column 4.

This is the same logic any sane editor follows when you set the tab width to 4 spaces in its configuration.
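
In case it helps, here is a rough sketch of that rule in Python (not Xojo, and not taken from any reference implementation). The names `TAB_STOP` and `indent_width` are just ones I made up for illustration; the function only measures how many columns of indentation a line starts with.

```python
TAB_STOP = 4  # tab stops at columns 4, 8, 12, ...


def indent_width(line: str) -> int:
    """Return the column reached by the leading spaces and tabs of `line`."""
    col = 0
    for ch in line:
        if ch == " ":
            col += 1
        elif ch == "\t":
            # A tab advances to the next tab stop, not by a fixed 4 columns.
            col += TAB_STOP - col % TAB_STOP
        else:
            break
    return col


print(indent_width("\tfoo"))      # 4
print(indent_width("  \tfoo"))    # 4
print(indent_width("    \tfoo"))  # 8
```

Both `<tab>` and `<space><space><tab>` come out as four columns of indentation, which is why the line in example 2 starts an indented code block with nothing left over in front of foo.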

Ah, I think I see. So if a tab is encountered it should be treated as “up to 4 spaces” for the purposes of determining indentation. Correct?

Yes. A tab counts as N spaces, where 1 <= N <= 4, with N chosen so that you reach an offset divisible by four (counted from the start of the line).
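
To tie this back to the two spec examples quoted in the question, here is a hedged sketch of how a parser might consume indentation column by column. `strip_columns` is a hypothetical helper of my own, not something named in the spec; it assumes you track the column at which the remaining text starts, and that a tab which is only partly consumed leaves its unused columns behind as spaces.

```python
def strip_columns(text: str, start_col: int, count: int, tab_stop: int = 4) -> str:
    """Remove up to `count` columns of leading whitespace from `text`.

    `start_col` is the column at which `text` begins on its line.
    """
    col = start_col
    target = start_col + count
    i = 0
    while i < len(text) and col < target:
        ch = text[i]
        if ch == " ":
            col += 1
        elif ch == "\t":
            # N columns, 1 <= N <= tab_stop, up to the next tab stop.
            col += tab_stop - col % tab_stop
        else:
            break
        i += 1
    # If a tab overshot the target, keep the extra columns as spaces.
    leftover = max(col - target, 0)
    return " " * leftover + text[i:]


# Example 2: two spaces and a tab are exactly four columns, so nothing is left.
print(repr(strip_columns("  \tfoo\tbaz\t\tbim", 0, 4)))  # 'foo\tbaz\t\tbim'

# Block quote case: after ">" we are at column 1 with "\t\tfoo" remaining.
rest = strip_columns("\t\tfoo", 1, 1)  # one column belongs to the delimiter
print(repr(rest))                      # '  \tfoo'
print(repr(strip_columns(rest, 2, 4))) # '  foo'
```

In example 2 the indentation is used up exactly and the later tabs are passed through untouched, while in the block quote case six columns minus the four for the code block leaves the two spaces the spec mentions.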

Got it. Thank you for explaining.