Optional title of link reference definitions

Hello,

I find the spec slightly unclear or underspecified on Link reference definitions.

Namely:

A link reference definition consists of a link label, optionally preceded by up to three spaces of indentation, followed by a colon (:), optional spaces or tabs (including up to one line ending), a link destination, optional spaces or tabs (including up to one line ending), and an optional link title, which if it is present must be separated from the link destination by spaces or tabs.

First, the sentence seems slightly contradictory: it says that a destination may be followed by optional spaces, tabs and one line ending, but the end of the sentence says that the title must be separated by spaces or tabs, with no line ending mentioned. I guess what was meant is:

[…] it should be separated from the link destination by at least one space, tab or line ending.

Second, I don’t understand how this wording defines the behaviour of example 197. Somehow in that example the “failure” to parse the link title results in the whole link reference definition not being recognized (rather than taking the definition up to the link destination and treating the rest as inline content).

But in the following example:

[foo]: 
/url
'the 

title'

[foo]

The definition gets recognized up to the link destination by both cmark and md2html:

<p>'the</p>
<p>title'</p>
<p><a href="/url">foo</a></p>

A link reference definition consists of a link label, optionally preceded by up to three spaces of indentation, followed by a colon (:), optional spaces or tabs (including up to one line ending), a link destination, optional spaces or tabs (including up to one line ending), and an optional link title, which if it is present must be separated from the link destination by spaces or tabs.

First, the sentence seems slightly contradictory: it says that a destination may be followed by optional spaces, tabs and one line ending, but the end of the sentence says that the title must be separated by spaces or tabs, with no line ending mentioned. I guess what was meant is:

[…] it should be separated from the link destination by at least one space, tab or line ending.

You are right that it would be correct to mention the possibility of
a line ending here, but this formulation isn’t quite right: you
can’t use two newlines, for example. Better to formulate like
the others: “by spaces or tabs (including up to one [line ending]).”

Second, I don’t understand how this wording defines the behaviour of example 197. Somehow in that example the “failure” to parse the link title results in the whole link reference definition not being recognized (rather than taking the definition up to the link destination and treating the rest as inline content).

A link reference definition is a block-level construct, so it
can’t be followed directly by inline content. Perhaps we should
make that explicit by saying that after the items mentioned, and
optional spaces or tabs, there should be a line ending. I note
that this is also left implicit for some other block-level
elements, e.g. in setext headings it is not explicitly stated
that the setext underline must be followed by end-of-line and not
other content.

But in the following example:

[foo]: 
/url
'the 

title'

[foo]

The definition gets recognized up to the link destination by both cmark and md2html:

<p>'the</p>
<p>title'</p>
<p><a href="/url">foo</a></p>

Yes: the relevant difference is that in this example, the link
destination is followed by end-of-line, and in the previous one,
it is followed by some non-whitespace content.


Thanks for your answers.

It could be added, yes, but in the setext heading case I found it clearer since the spec has

…any number of trailing spaces or tabs

Emphasis is mine. The word “trailing” makes it hard to think that other content is allowed there. But there’s no harm in making that more explicit.

Right, but had there been no blank line in my example, the title would have been accepted. So basically the logic is:

  1. If the link destination is followed by spaces or tabs that do not include a line ending, you must parse the rest of the line as a link title. If that fails, the whole thing is not a link reference definition.
  2. If the link destination is followed by spaces or tabs that include one line ending, you need to try to parse a link title on the line following the destination. If that succeeds, the title is part of the link reference definition; otherwise you get a link reference definition without a title.
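
The two rules above can be sketched in Python roughly as follows. This is only an illustration under simplifying assumptions, not how cmark or md2html implement it: labels contain no brackets, destinations are plain runs of non-whitespace (no `<...>` form), titles use only `'` or `"` with no escapes, and the name `parse_definition` is made up for this example.

```python
import re

# Simplified building blocks (assumptions, not the full CommonMark rules).
LABEL = re.compile(r" {0,3}\[([^\[\]]+)\]:")
SPNL = re.compile(r"[ \t]*(?:\n[ \t]*)?")      # spaces/tabs incl. at most one line ending
DEST = re.compile(r"\S+")                      # bare destination only
TITLE = re.compile(r"\"[^\"]*\"|'[^']*'")      # may span lines, no escapes
EOL = re.compile(r"[ \t]*(?:\n|\Z)")           # trailing spaces/tabs, then end of line
BLANK_LINE = re.compile(r"\n[ \t]*\n")


def parse_definition(text):
    """Return (label, destination, title, end_offset) or None."""
    m = LABEL.match(text)
    if not m:
        return None
    label = m.group(1)
    m = DEST.match(text, SPNL.match(text, m.end()).end())
    if not m:
        return None
    dest, after_dest = m.group(0), m.end()

    # Try a title: it must be separated from the destination, must not
    # contain a blank line, and must be followed only by spaces/tabs.
    sep = SPNL.match(text, after_dest)
    if sep.end() > after_dest:
        t = TITLE.match(text, sep.end())
        if t and not BLANK_LINE.search(t.group(0)):
            eol = EOL.match(text, t.end())
            if eol:
                return label, dest, t.group(0)[1:-1], eol.end()

    # No usable title. Rule 2: if the destination ends its line, accept a
    # definition without a title. Rule 1: otherwise it is not a definition.
    eol = EOL.match(text, after_dest)
    if eol:
        return label, dest, None, eol.end()
    return None


print(parse_definition('[foo]: /url "title" ok\n'))
print(parse_definition("[foo]: \n/url\n'the \n\ntitle'\n\n[foo]\n"))
```

With these simplifications, the first call returns `None` (rule 1: junk after a same-line title), while the second returns a definition for `/url` without a title (rule 2), leaving the rest to be parsed as paragraphs.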

P.S. Tell me if you want me to file an issue for clarification on the spec issue tracker.

P.S. Tell me if you want me to file an issue for clarification on the spec issue tracker.

Sure, an issue would be good.

It’s easier to explain the syntax this way:

‘[’ REFNAME ‘]’ ‘:’ SPNL LINKDEST (SPNL TITLE)? SP EOL

where

SPNL = zero or more space characters, including at most one
newline
SP = zero or more space characters
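
As a rough illustration, the grammar above translates almost directly into a Python regular expression. The sub-patterns for REFNAME, LINKDEST and TITLE below are simplified assumptions (no escapes, no `<...>` destinations, single-line titles only), so this is a sketch of the shape of the rule rather than a conforming parser.

```python
import re

SPNL = r"[ \t]*(?:\n[ \t]*)?"          # zero or more spaces/tabs, at most one newline
SP = r"[ \t]*"                         # zero or more spaces/tabs
EOL = r"(?:\n|\Z)"
REFNAME = r"[^\[\]]+"                  # assumption: no nested brackets or escapes
LINKDEST = r"\S+"                      # assumption: bare destination only
TITLE = r"\"[^\"\n]*\"|'[^'\n]*'"      # assumption: single-line quoted title

# '[' REFNAME ']' ':' SPNL LINKDEST (SPNL TITLE)? SP EOL
DEFINITION = re.compile(
    rf"\[(?P<ref>{REFNAME})\]:{SPNL}(?P<dest>{LINKDEST})"
    rf"(?:{SPNL}(?P<title>{TITLE}))?{SP}{EOL}"
)

for sample in (
    "[foo]: /url 'title'\n",
    '[foo]: /url "title" ok\n',
    '[foo]: /url\n"title" ok\n',
):
    m = DEFINITION.match(sample)
    print(repr(sample), "->", m.groupdict() if m else None)
```

The optional `(SPNL TITLE)?` group is what produces the behaviour discussed in this thread: when the title attempt cannot be followed by `SP EOL`, the matcher falls back to checking `SP EOL` directly after the destination, which only succeeds if the destination ends its line.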


+1 for formal grammars :–)

Even though it’s likely hard to describe Markdown using one, I wouldn’t mind if the spec described syntax fragments via the IETF’s ABNF notation.

I opened “Clarify title parsing of link reference definitions · Issue #697 · commonmark/commonmark-spec” on the spec issue tracker.