Footnote extensions vs Link Reference Definition


#1

I’d like to reach a general agreement on footnote extensions.

There are lots of discussions about footnotes, but I can’t find any that point out that pandoc’s block-style footnote notation conflicts with link reference definitions in CommonMark.

Pandoc’s style footnote notation is like this:

http://pandoc.org/MANUAL.html#footnotes

Here is a footnote reference.[^1]

[^1]: Here is the footnote.

It means that if a CommonMark-compatible markdown processor has footnotes, its syntax must differ from pandoc’s, even though lots of CommonMark-incompatible markdown processors already implement pandoc’s footnotes.

I think there are some choices:

  1. Adopt pandoc’s inline-style footnotes (^[inline-style footnotes])
  2. Redefine CommonMark to accept pandoc’s block-style footnotes by specifying that link labels must not start with ^
  3. Give up compatibility with pandoc’s block-style footnotes

(2) seems the best because there are lots of markdown processors that have pandoc’s block-style footnotes.
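
To make option (2) concrete, here is a minimal Python sketch of how a processor might tell the two kinds of definition apart. The function name `classify_definition` and the simplified pattern are assumptions for illustration only, not taken from CommonMark or pandoc; the real label rules (escapes, multi-line labels, length limits) are not modelled.

```python
import re

# Simplified: up to three spaces of indentation, a bracketed label, then a colon.
# This is an illustrative approximation, not the CommonMark grammar.
_DEFINITION = re.compile(r"^ {0,3}\[(?P<label>[^\]\[]+)\]:\s*(?P<rest>.*)$")


def classify_definition(line: str) -> str:
    """Return "footnote", "link", or "other" for a single line."""
    match = _DEFINITION.match(line)
    if match is None:
        return "other"
    label = match.group("label")
    if label.startswith("^"):
        # Under option (2) a leading ^ is reserved for footnotes.
        # A bare "^" carries no identifier, so it is treated as neither.
        return "footnote" if len(label) > 1 else "other"
    return "link"
```

With this rule, `[^1]: Here is the footnote.` is classified as a footnote definition, while `[link]: http://example.com` stays a link reference definition.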


#2

Found some related topics in this forum:


#3

If we went down this route, the core CommonMark spec would need to “know” about a number of other extensions and where not to use particular syntax. Unless we plan in advance all of the extensions, that seems too demanding of the core spec. It might be better to include something in each extension spec stating that you will lose the ability to do X from the CommonMark core feature set if you choose to adopt the extension.


#4

I don’t think CommonMark must know all the syntax extensions, but it should know about footnotes because they are supported by a lot of markdown parsers.

In other words, if CommonMark ignores footnotes, markdown parsers that support footnotes could never be compatible with CommonMark.

I think there’s another option: to remove link reference definitions from CommonMark. It makes CommonMark simpler, gives markdown parsers with footnotes a chance to be CommonMark-compatible, and does not break backward compatibility.


#5

Or CommonMark could disallow link references that start with punctuation. I’m betting that the vast majority of link references start with numbers or letters.


#6

@notriddle

Possibly better, because the markdown parsers that support footnotes also support link reference definitions:

https://johnmacfarlane.net/babelmark2/?normalize=1&text=Here+is+a+[link]. Here+is+a+footnote[^footnote]. [link]%3A+http%3A%2F%2Fexample.com+"link+ref+definition" [^footnote]%3A+http%3A%2F%2Fexample.com


#7

Actually, rethinking it, we'd want to exempt quotation marks and maybe a few other pieces of punctuation, too. Like ["this"].

["this"]: http://example.com/
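
A rough Python sketch of that rule, in case it helps the discussion. The function `label_allowed` and its `exempt` parameter are hypothetical names, and "punctuation" here means the ASCII set in Python's `string.punctuation`; which characters should really be exempt is still an open question.

```python
import string


def label_allowed(label: str, exempt: str = "\"'") -> bool:
    """Return True if a link label would be accepted under the proposed rule.

    Labels whose first character is ASCII punctuation are rejected,
    unless that character is listed in `exempt` (quotation marks by default).
    """
    stripped = label.strip()
    if not stripped:
        return False
    first = stripped[0]
    return first not in string.punctuation or first in exempt
```

Here `label_allowed('"this"')` is accepted because of the exemption, while `label_allowed("^footnote")` is rejected and so would be free for a footnote extension to claim.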

#8

UPDATE:

github/cmark now supports Kramdown-style footnotes: https://github.com/github/cmark/pull/64

And thus the CommonMarker gem does too, so I enabled the footnotes option for my product.


#9

Babelmark 3 currently interfaces with “GitHub Flavored Markdown”, i.e. cmark-gfm, version 0.17.4, which does not seem to have this extension enabled yet.

GitHub’s documentation extensions.txt should probably be written more like spec.txt. It is missing several edge cases right now.

Anyway, I did not really get the point of this thread. No other implementation seems to support Pandoc’s inline footnotes, with a circumflex preceding bracketed text, ^[footnote], and an automatically generated mark. MMD will turn the link label [^footnote] into inline footnote text if it cannot be resolved to a reference link definition.

Many existing implementations largely agree with CM about link labels which start with a circumflex, so will try to turn them into links. If the respective reference link definition only contains a single string without whitespace, it will be used as the URL. I believe CM could be prepared better for extensions if reference link definitions were slightly changed: the link destination should be optional when a link title (in parentheses, single or double quotation marks) follows (or possibly any part of an info string that cannot be interpreted as the destination URL). However, feeding empty or degenerate URLs did not have the expected results in CM-compatible implementations and some others – at least unless I separate the ill-formed link definitions from the rest by blank lines.

An extension or postprocessor that hooks into the AST could simply check whether the first character of <text> within <link> is ^ and then modify it accordingly. This also avoids problems like parsing the destination # as the start of a heading, which happens in several implementations that support footnotes. Link titles, which become footnote texts, do not support blank lines inside and thus no paragraphs, but may contain inline formatting.

A minor drawback, perhaps: the original location of the link definition is already lost at this point, so all notes will necessarily be placed automatically at either the (page) foot or the (document) end.
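
As a sketch of what such a postprocessor could look like, here is a small Python example. The `Node` class is a made-up stand-in for a parser's AST, not the API of cmark or any other real library, and `extract_footnotes` is a hypothetical name. It assumes the link title has already been attached to the link node, and it collects the notes into a list because their original position is no longer known.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    kind: str                     # e.g. "document", "paragraph", "text", "link"
    text: str = ""                # literal content for "text" nodes
    title: Optional[str] = None   # link title, if any
    children: List["Node"] = field(default_factory=list)


def link_text(node: Node) -> str:
    """Concatenate the plain text directly inside a link node."""
    return "".join(c.text for c in node.children if c.kind == "text")


def extract_footnotes(root: Node) -> List[Tuple[str, str]]:
    """Rewrite ^-labelled links in place; return (label, note text) pairs."""
    notes: List[Tuple[str, str]] = []

    def visit(node: Node) -> None:
        if node.kind == "link":
            label = link_text(node)
            if label.startswith("^"):
                # The link title becomes the footnote text.
                notes.append((label[1:], node.title or ""))
                node.kind = "footnote_reference"
                node.text = label[1:]
                node.title = None
                node.children = []
                return
        for child in node.children:
            visit(child)

    visit(root)
    return notes
```

A renderer would then emit the returned list at the foot of the page or the end of the document. This sketch records a note once per reference, so a real implementation would still need to deduplicate labels that are referenced more than once.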


PS: The reference implementations do not agree about empty URLs inside angle brackets:

[label]: <> "title"

A link destination consists of either

  • a sequence of zero or more characters between an opening < and a closing >