Abbreviations: longest/shortest valid match?


#1

Consider the following example (babelmark):

*[Foo]: foo
*[Foo Bar]: foobar

Foo Bar

and the discrepancy between outputs:
<abbr title="foo">Foo</abbr> Bar (e.g. php-markdown-extra) and
<abbr title="foobar">Foo Bar</abbr> (e.g. Markdig)

I believe that the behaviour of Markdig is correct in this case, matching the longest possible abbreviation definition. Thoughts?

I’m new to this forum, so a side question: are there any plans for CommonMark-unified specifications of commonly used extensions?


#2

I am surprised by how many implementations support this abbreviation syntax: at least Kramdown, Markdig, Maruku, Multimarkdown and PHP Markdown Extra.

An abbreviation definition is a line beginning with an asterisk *, optionally indented by up to three spaces, immediately followed by an opening square bracket [, followed by the abbreviation, followed by a closing bracket ], immediately followed by a colon :, followed by optional whitespace, followed by the expansion.
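
To make that description concrete, here is a minimal Python sketch of a matcher for such definition lines. The names ABBR_DEF and parse_definitions are my own, purely for illustration, and are not taken from any of the implementations listed above. I am also assuming that the abbreviation is non-empty and contains no closing bracket, which real parsers may handle differently.

```python
import re

# Hypothetical pattern for the syntax described above: up to three spaces
# of indentation, "*[", the abbreviation, "]:", optional whitespace, and
# then the expansion. Assumes the abbreviation contains no "]".
ABBR_DEF = re.compile(r"^ {0,3}\*\[(?P<abbr>[^\]]+)\]:[ \t]*(?P<title>.*)$")


def parse_definitions(text):
    """Return (abbreviation, expansion) pairs in definition order."""
    defs = []
    for line in text.splitlines():
        match = ABBR_DEF.match(line)
        if match:
            defs.append((match.group("abbr"), match.group("title").strip()))
    return defs
```

Applied to the example from the original post, this should yield the two definitions in the order they were written, which is the information the rest of the discussion depends on.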

There is at least one similar thing in CommonMark from which the expected behavior of this extension can be derived: two pairs of square brackets immediately following each other could be interpreted either as one full reference link or as two shortcut reference links. CommonMark prefers the longer link syntax, even if the link text would resolve to a link definition but the link label does not.

[foo][nil] 
[foo][bar] 

[foo]: baz
[bar]: quuz 
.
<p>[foo][nil] 
<a href="quuz">foo</a></p>

One should also note that all of the implementations that support this syntax match the longer term if it is defined first. CommonMark also prefers the first definition when a link label is defined more than once, although many legacy parsers prefer the last one.

[foo]

[foo]: bar
[foo]: baz

[foo] 
.
<p><a href="bar">foo</a></p>
<p><a href="bar">foo</a></p>

So, the question is not actually about the longest or shortest match, but about the longest or first defined match. I tend to go for first defined.
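
The difference between the two policies can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how any of the parsers above work: render_abbreviations and its strategy parameter are hypothetical names, matching is done on plain text with a regular expression, and an abbreviation only matches when it is not adjacent to a word character.

```python
import re
from html import escape


def render_abbreviations(text, defs, strategy="first"):
    """Wrap abbreviations in <abbr> tags.

    defs is a list of (abbreviation, expansion) pairs in definition order.
    strategy is "first" (definition order decides) or "longest".
    """
    titles = {}
    for abbr, title in defs:
        titles.setdefault(abbr, title)  # a repeated definition is ignored
    keys = list(titles)  # dicts keep insertion order, i.e. definition order
    if strategy == "longest":
        keys.sort(key=len, reverse=True)  # stable: ties keep definition order
    if not keys:
        return text
    # Regex alternation tries alternatives left to right, so the order of
    # keys decides which definition wins at a given position.
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    )

    def wrap(match):
        abbr = match.group(0)
        return '<abbr title="%s">%s</abbr>' % (escape(titles[abbr]), abbr)

    return pattern.sub(wrap, text)
```

With the definitions from the original post, strategy="first" wraps only Foo, as php-markdown-extra does, while strategy="longest" wraps Foo Bar, as Markdig does. If Foo Bar is defined before Foo, both strategies agree.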


#3

I myself prefer that there be no redefinitions at all (the same goes for link references, for example), so my Markdown Validator project will emit errors for Markdown that contains them.

Going by first defined, with Markdown like that in the original post, Foo Bar could never be matched.
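
A check along these lines could look roughly like the Python sketch below. It is not code from my validator. lint_definitions is a hypothetical name, and it assumes a first-defined policy in which an abbreviation only matches at word boundaries. Under that assumption, a later definition can never match if an earlier one is a prefix of it that ends at a word boundary.

```python
def _is_word_char(ch):
    # Assumed notion of a "word" character: letters, digits, underscore.
    return ch.isalnum() or ch == "_"


def lint_definitions(defs):
    """Report redefined or unreachable abbreviations.

    defs is a list of (abbreviation, expansion) pairs in definition order.
    Returns (kind, abbreviation, earlier_abbreviation) tuples.
    """
    problems = []
    seen = []
    for abbr, _title in defs:
        for earlier in seen:
            if abbr == earlier:
                problems.append(("redefinition", abbr, earlier))
                break
            # An earlier prefix that ends at a word boundary always wins
            # under first-defined, so the longer term is never matched.
            if abbr.startswith(earlier) and not _is_word_char(abbr[len(earlier)]):
                problems.append(("shadowed", abbr, earlier))
                break
        seen.append(abbr)
    return problems
```

For the original post this would report Foo Bar as shadowed by Foo. Swapping the two definitions would produce no report.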