Example 126 confusion (autoclosing fence blocks)

dbuenzli · November 25, 2022, 5:59pm

Hello,

I’m slightly confused by the output of example 126 whose input is "```\n" and output:

<pre><code></code></pre>

My reading is that we have a(n empty) line here and that it should rather be:

<pre><code>
</code></pre>

Since the spec says:

If the end of the containing block (or document) is reached and no closing code fence has been found, the code block contains all of the lines after the opening code fence until the end of the containing block (or document).

For example with "```\nbla" cmark gives me:

<pre><code>bla
</code></pre>

and "```\nbla\n"

<pre><code>bla

</code></pre>

and "```\n\n":

<pre><code>

</code></pre>

What am I missing ?

dbuenzli · November 26, 2022, 8:56am

It seems I got my input wrong; in the test the output is:

<pre><code>
</code></pre>

I guess this is what I’m missing:

CommonMark Spec
Blank lines at the beginning and end of the document are also ignored.

But that is a bit ambiguous. So I guess, given the above, a single blank line is ignored at the end of input ?

dbuenzli · November 26, 2022, 9:08am

Not really it seems: in the reference implementations a final blank line is not ignored, only a final empty line seems to be. That is "```\n\n " (two final spaces) gives:

<pre><code>

</code></pre>

Which has two lines, the final one containing two spaces.

Now confused again :–)

jgm · November 30, 2022, 5:42pm

The first \n is considered part of the fence; the code only starts on the next line. So if you just have ```\n then you have a code block with no content.
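A minimal sketch of this reading, assuming only "\n" line endings and an input that begins with an opening fence and never closes it. The helper name `unclosed_fence_body` and the `FENCE` constant are hypothetical, introduced purely for illustration; this is not how cmark is implemented.

```python
FENCE = "`" * 3  # three backticks, the opening code fence

def unclosed_fence_body(text: str) -> str:
    # The first line, together with its "\n", is the fence itself.
    # Everything after that newline is the code block content.
    _fence_line, _newline, body = text.partition("\n")
    return body

print(repr(unclosed_fence_body(FENCE + "\n")))       # ''
print(repr(unclosed_fence_body(FENCE + "\naaa\n")))  # 'aaa\n'
```

Under this model, an input consisting only of the fence line leaves nothing after the first newline, hence a code block with no content.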

dbuenzli · November 30, 2022, 8:50pm

Sorry but I’m not convinced by your answer :–)

My intuition was that, to explicitly simulate the closure of an autoclosing fence block ```, you’d append \n```, since that’s the only reliable way to do it. But that doesn’t seem to be the case; the rule rather seems to be: append \n``` if the last line is non-empty, and ``` if the last line is empty.

More precisely both this input (no final newline):

```
aaa

and this input (with a final newline)

```
aaa

yield the same rendering:

<pre><code>aaa
</code></pre>

While fencing the same content explicitly

```
aaa
```

```
aaa

```

yields different renderings:

<pre><code>aaa
</code></pre>
<pre><code>aaa

</code></pre>

I don’t think this logic can be inferred from the current text. I then thought it was an application of the document’s initial and trailing blank line stripping, but as shown above that doesn’t seem to apply to autoclosing blocks, which in turn could also be clarified.
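The closing rule observed at the start of this post can be sketched as follows. This is only an illustration of the rule as stated here, with "last line is empty" taken to mean that the input ends with a newline; `close_fence` and `FENCE` are hypothetical names, and only "\n" line endings are considered.

```python
FENCE = "`" * 3  # three backticks

def close_fence(text: str) -> str:
    # Last line empty (input ends with "\n"): the closing fence goes right after it.
    if text.endswith("\n"):
        return text + FENCE
    # Last line non-empty: end that line first, then add the closing fence.
    return text + "\n" + FENCE

without_newline = FENCE + "\naaa"
with_newline = FENCE + "\naaa\n"
print(close_fence(without_newline) == close_fence(with_newline))  # True
```

Both unclosed inputs map to the same explicitly fenced text, which is consistent with them rendering identically.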

vas · December 6, 2022, 6:23pm

First, you’re using the wrong kind of intuition. Markdown and CommonMark aren’t looking at this from a parser’s perspective, but from that of the human eye. Second, it’s wrong to call an unclosed fence block “auto-closing”. The idea here is not to make closing fences optional, but to gracefully handle the situation when a closing fence is missing.

To illustrate my first point:

More precisely both this input (no final newline):
```
aaa
and this input (with a final newline)
```
aaa
yield the same rendering:
<pre><code>aaa
</code></pre>

That’s because both of those look exactly the same to the human eye in any plain text editor.

While fencing the same content explicitly
```
aaa
```

```
aaa

```
yields different renderings:
<pre><code>aaa
</code></pre>
<pre><code>aaa

</code></pre>

Because these do not look the same to the human eye. You can see a blank line in the second case.

So I guess, given the above, a single blank line is ignored at the end of input ?

Not really it seems: in the reference implementations a final blank line is not ignored, only a final empty line seems to be. That is "```\n\n " (two final spaces) gives:
<pre><code>

</code></pre>
Which has two lines, the final one containing two spaces.

I think this is a flaw, as to the human eye the plain text looks identical to "```\n", and ideally it should be corrected, but I don’t think it’s a priority because it’s a flaw in CommonMark’s attempt to gracefully handle flawed input (missing closing fence). The plain text author should use a closing fence to make explicit their intent with regard to trailing blank lines in the code, or the lack thereof.

dbuenzli · December 7, 2022, 11:44am

Second, it’s wrong to call an unclosed fence block “auto-closing”. The idea here is not to make closing fences optional, but to gracefully handle the situation when a closing fence is missing.

Not sure I see what that distinction brings to the discussion. But in any case all that still doesn’t tell me where in the specification is the logic that defines all this and what the exact rules are. If it’s

a final empty line is ignored

Then I personally can’t infer that from the text.

Regarding the discussion about intuitions: personally, I’m not here to define the standard but to implement it, so I’m not really interested in it. But if I had to design the standard, I would say that being able to remove or add a trailing code fence without changing the concrete content of the document would be a good property for humans as well (especially since code fences are for verbatim text). UX is not only about visuals.

dbuenzli · February 15, 2023, 3:59pm

Note that similar confusion seems to occur with HTML blocks.

Take example 173: this example has a final empty line which, according to the test, is ignored.

However if you add blanks on this final line then while the dingus still ignores the final line, neither cmark nor md2html do (they include the final blank line):

> printf '<style\n  type="text/css">\n\nfoo\n ' | cmark --unsafe  
<style
  type="text/css">

foo
 
> printf '<style\n  type="text/css">\n\nfoo\n ' | md2html       
<style
  type="text/css">

foo

So I guess something needs fixing, either the spec or these implementations. The spec says:

Blank lines at the beginning and end of the document are also ignored.

But first, that’s ambiguous: is it one line at the beginning and one at the end, or multiple lines? And at least for the cmark reference implementation, on code blocks and HTML blocks it’s one empty (vs. blank) final line that gets ignored.

> printf '```\n  \n' | cmark    
<pre><code>  
</code></pre>
> printf '```\n  \n ' | cmark
<pre><code>  
 
</code></pre>
> printf '```\n  \n \n' | cmark
<pre><code>  
 
</code></pre>
> printf '```\n  \n \n \n' | cmark
<pre><code>  
 
 
</code></pre>

(md2html behaves similarly)

dbuenzli · February 15, 2023, 4:11pm

If the cmark reference implementation is deemed correct then I guess the spec could be fixed with something like:

Blank lines at the beginning of a document are ignored. Blank lines not part of
an open block at the end of a document are ignored. A final empty line is always ignored.

jgm · February 15, 2023, 4:40pm

dbuenzli:

More precisely both this input (no final newline):
aaa
and [this input](https://spec.commonmark.org/dingus/?text=%60%60%60%0Aaaa%0A) (with a final newline)
aaa
yield the same rendering:

The parser simply adds a newline to the end of content that doesn’t end in a newline, before parsing. That is why these cases don’t differ. (It’s a general expectation that text files end with a newline, and to keep things simple we ensure that this expectation is met before parsing.)
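The normalization step described above can be sketched like this. The function name `ensure_final_newline` is hypothetical, only "\n" is treated as a line ending, and leaving empty input untouched is an assumption of this sketch rather than something stated in the thread.

```python
FENCE = "`" * 3  # three backticks

def ensure_final_newline(text: str) -> str:
    # Append "\n" only when the input is non-empty and lacks a final newline.
    # (Leaving empty input alone is an assumption made for this sketch.)
    if text and not text.endswith("\n"):
        return text + "\n"
    return text

# The two inputs quoted above become identical before parsing.
print(ensure_final_newline(FENCE + "\naaa") == FENCE + "\naaa\n")  # True
```

The same step would also turn the printf input ending in a lone space into the one ending in a space plus newline, which matches those two inputs producing the same cmark output in the experiments earlier in the thread.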

jgm · February 15, 2023, 4:45pm

I’m confused about how you’re understanding this. The last line of example 173 is not empty; it has the text “foo.” Nor is it ignored.

dbuenzli · February 15, 2023, 5:04pm

Ah I see, my problem seems to be with the definition of a line: characters followed by a line ending, versus a line counter that increments on each newline, like editors do. The file "\n" gives you either one line or two depending on the definition. Sorry, I used the wrong definition.

I guess it makes more sense but I think even equipped with the right definition these examples need clarification (witness the difference between the dingus and cmark).
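To make the two definitions of a line concrete, here is a small sketch that counts lines both ways. Both function names are hypothetical, and only "\n" is treated as a line ending.

```python
def count_terminated_lines(text: str) -> int:
    # A line is a run of characters plus its line ending; a trailing run
    # without a line ending still counts as one line.
    n = text.count("\n")
    if text and not text.endswith("\n"):
        n += 1
    return n

def count_editor_lines(text: str) -> int:
    # Editor-style counting: start on line 1 and move to a new line
    # after every newline character.
    return text.count("\n") + 1

print(count_terminated_lines("\n"), count_editor_lines("\n"))  # 1 2
```

With the first definition, a fence line followed by a single newline is one line (the fence) and nothing else; with the second, there appears to be an additional empty line after it.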