The spec says:
A link reference definition does not correspond to a structural element of a document.
and that is all it currently has to say about the fact that link reference definitions (“LRDs”) are removed.
Looking at the behavior of the reference implementation, this is arguably not true, or at least imprecise.
Consider the following Markdown:
-
x
which yields
<ul>
<li></li>
</ul>
<p>x</p>
so the x is outside the list.
However, if we add an LRD:
- [foo]: /foo.html
x
we get
<ul>
<li>x</li>
</ul>
with the x inside the list item, because it is now considered paragraph continuation text of the first list item line, which is no longer empty.
So while you could argue that the LRD is not a structural element, its presence in the Markdown source certainly changes the structure of the document (beyond the obvious question whether or not a link may be created elsewhere in the document).
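To make the ordering issue concrete, here is a minimal toy model in Python. It is not the reference implementation: the function name `parse_blocks`, the simplified `LRD` pattern, and the tuple output are all assumptions made for illustration, and it only understands bare bullets, bullets with text, and plain lines. It decides block structure line by line and strips LRDs only when a paragraph closes, which is the ordering that would explain the two results above.

```python
import re

# Toy pattern for a one-line link reference definition such as "[foo]: /foo.html".
LRD = re.compile(r"^\[([^\]]+)\]:\s*(\S+)\s*$")


def parse_blocks(markdown):
    """Toy two-phase parse: fix the block structure first, strip LRDs on close."""
    blocks = []        # finished blocks as ("li", text) or ("p", text)
    open_kind = None   # kind of the block whose paragraph is still open
    open_lines = []    # raw lines collected for the open paragraph

    def close():
        nonlocal open_kind, open_lines
        if open_kind is None:
            return
        # LRDs are removed only now, after the line-by-line decisions were made.
        while open_lines and LRD.match(open_lines[0]):
            open_lines.pop(0)
        blocks.append((open_kind, "\n".join(open_lines)))
        open_kind, open_lines = None, []

    for line in markdown.split("\n"):
        if line == "-":
            # Empty list item: there is no open paragraph for later text to join.
            close()
            blocks.append(("li", ""))
        elif line.startswith("- "):
            # List item whose first line opens a paragraph (even if it is an LRD).
            close()
            open_kind, open_lines = "li", [line[2:]]
        elif open_kind is not None:
            # Continuation text joins whatever paragraph is currently open.
            open_lines.append(line)
        else:
            open_kind, open_lines = "p", [line]
    close()
    return blocks


print(parse_blocks("-\nx"))                    # [('li', ''), ('p', 'x')]
print(parse_blocks("- [foo]: /foo.html\nx"))   # [('li', 'x')]
```

In this sketch the LRD line keeps the list item's paragraph open long enough for x to join it, and only afterwards disappears from the content.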
In fact, it is close to impossible to say something like “after the LRDs are collected, the document is treated as if they hadn’t been there in the first place”, because you can end up with weird cycles. Imagine this:
x
- [foo]: /foo.html
Because it’s at the beginning of a paragraph in a list item, the LRD is parsed and removed. If you now treat the document like the LRD hadn’t been there, you’re converting this:
x
-
which is <h2>x</h2>, because the dash is now a setext header underline*. However, if the dash is now interpreted as an underline rather than a bullet, then the original LRD would no longer have been in a spot where an LRD is legal, and thus shouldn’t have been removed. But if the LRD hadn’t been removed, the dash would go back to being a bullet. GOTO START.
* This is not what the spec currently says, but it is what the reference implementation does, and it is likely that the spec will be changed accordingly. You can construct a similar example with an LRD and a horizontal line using the current spec as-is.
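The cycle can be sketched with a tiny toy classifier. Everything here is hypothetical and covers only two-line documents: `second_line_role` encodes the behavior described above (a lone dash under a text line reads as a setext underline, a dash followed by a space as a bullet), and `pretend_lrd_was_never_there` applies the naive "treat the document as if the LRD hadn't been there" rule.

```python
import re

# Toy pattern for a one-line link reference definition.
LRD = re.compile(r"^\[[^\]]+\]:\s*\S+\s*$")


def second_line_role(lines):
    """Classify line 2 of a two-line document using the toy rules above."""
    second = lines[1]
    if second.startswith("- "):
        return "bullet"
    if second == "-":
        # Behavior attributed to the reference implementation in the footnote.
        return "setext underline"
    return "paragraph text"


def pretend_lrd_was_never_there(lines):
    """Delete an LRD that starts a list item, leaving only the bare bullet."""
    first, second = lines
    if second_line_role(lines) == "bullet" and LRD.match(second[2:]):
        return [first, "-"]
    return lines


original = ["x", "- [foo]: /foo.html"]
rewritten = pretend_lrd_was_never_there(original)

print(second_line_role(original))    # bullet
print(rewritten)                     # ['x', '-']
print(second_line_role(rewritten))   # setext underline
```

The removal is only justified while line 2 is a bullet, yet the rewritten document no longer reads it as one, so the two readings cannot both hold.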
So it seems inevitable that LRDs, even after being removed, have an influence on the document structure, and the reference implementation’s behavior makes some sense to me (at least I could not think of any highly problematic issues).
However, the spec should describe this behavior accurately so that spec and implementation match, because this issue seems very prone to edge cases.
Any suggestions on how this could be phrased? Or are there any problems or inconsistencies with the current behavior that you can think of and that I’ve missed?