Not sure if I managed to miss something entirely, but I noticed an inconsistency between the spec description and the reference implementation for type-1 HTML blocks. Here’s what the spec (v0.29) says:
Start condition: line begins with the string <script, <pre, or <style (case-insensitive), followed by whitespace, the string >, or the end of the line.
Because no characters other than whitespace, >, or the end of the line can follow these tags, I expect that the following snippet does not satisfy the start condition of type-1 HTML blocks:
<pre trailing-chars
*foo*
</pre>
Since none of the lines satisfy any other start conditions for HTML blocks, everything should be treated as valid Markdown and rendered as such:
<p>
<pre trailing-chars
</p>
<p>
<em>foo</em>
</p>
<p>
</pre>
</p>
But the dingus still parses everything as an HTML block, and renders it as:
<pre trailing-chars
*foo*
</pre>
I believe the dingus interpretation, which allows for trailing characters, makes more sense here, since people might write something like:
<script defer
type="text/javascript">
console.log("not *emphasis*");
</script>
To avoid collision with type-7 blocks, which are terminated by blank lines, we should require at least one whitespace character between <script, <pre, or <style and any trailing characters, so that these special tag names stay recognizable. For instance:
<scriptBlock>
This is a paragraph *with emphasis*.
</scriptBlock>
<script>
... but this one is *not*.
</script>
So the spec can say something like:
Start condition: line begins with the string <script, <pre, or <style (case-insensitive), followed by at least one whitespace character.
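One way to read the proposed condition is as a regular expression, sketched here in Python. The names TYPE1_START and starts_type1_block are hypothetical and not taken from any reference implementation. As an assumption, the sketch also keeps the > and end-of-line cases from the v0.29 wording, so that a bare <script> like the one in the example above still opens a block, and it does not handle leading indentation.

```python
import re

# Hypothetical pattern: a special tag name must be followed by whitespace,
# ">", or the end of the line. Keeping ">" and end-of-line is an assumption
# carried over from the v0.29 wording, not part of the proposed sentence.
TYPE1_START = re.compile(r"^<(?:script|pre|style)(?:\s|>|$)", re.IGNORECASE)


def starts_type1_block(line: str) -> bool:
    """Return True if `line` would open a type-1 HTML block under this reading."""
    return TYPE1_START.match(line) is not None


# Samples taken from the snippets discussed in this post.
for sample in ["<script defer", "<script>", "<pre trailing-chars", "<scriptBlock>"]:
    print(f"{sample!r}: {starts_type1_block(sample)}")
```

Under this reading, trailing characters after the whitespace are allowed, while <scriptBlock> is rejected because the character right after the tag name is a letter.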