Cross-references and citations

Numerous programs have implemented cross-references and citations on top of Markdown. Earlier discussions of similar proposals have not resulted in a standard syntax for cross-references and citations being added to CommonMark.

Existing implementations have a reasonably consistent syntax, including:

  • pandoc citations extension and R Studio: [@bibKey] (e.g., [@doe99; @smith2000; @smith2004], [see @doe99, pp. 33-35 and *passim*; @smith04, chap. 1], [-@smith04], @smith04), plus an extensive set of locator terms
  • PandocCiter: @bibKey or [@bibKey]
  • zotxt: [@bibKey] (e.g., [@Doe2006], [@Doe:2006])
  • MultiMarkdown: [][#bibKey] (e.g., [p. 26][#Doe2006])
  • HTML: <cite>...</cite>
  • Zotero’s Scannable Cite: {See | Smith, (2012) |p. 45 | for an example |zu:2433:WQVBH98K}, which also includes legal types

Citations and cross-references are similar concepts. Internal cross-references have already been discussed at length.

One astute observation from that thread:

Though perhaps this mechanism could replace the current mechanism for numbered examples: {#ex:foo}, @ex:foo?

A possible issue with the lengthy discussion on cross-references is that content and presentation logic are intermingled. That is, writing Figure @fig:label is redundant because @fig already denotes that the reference is a figure. How @fig is rendered can be left to the presentation layer. (The pandoc-fignos demo exemplifies this scenario.)

Rather than create a specific form for each reference type, consider the general form:

{#type-name:label}
[@type-name:label]

Where the type name can be any two- to four-letter value (including I18N). Thus the following pairs of anchors (braces) and references (brackets) are valid:

{#fig:cats}
[@fig:cats]
{#図版:猫}
[@図版:猫]
{#eq:mass-energy}
[@eq:mass-energy]
{#eqn:laplace}
[@eqn:laplace]
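
To make the pairing concrete, here is a minimal Python sketch that extracts anchors and references written in the general form. The regular expressions, the function names, and the rule that labels are made of word characters and hyphens are assumptions for illustration, not taken from any existing implementation.

```python
import re

# Assumption: a type name is 2 to 4 word characters (Unicode-aware in Python 3,
# so non-Latin names work); a label is word characters and hyphens.
# Note that \w also admits digits and underscores, which may be too permissive.
ANCHOR = re.compile(r"\{#(\w{2,4}):([\w-]+)\}")
REFERENCE = re.compile(r"\[@(\w{2,4}):([\w-]+)\]")


def find_anchors(text):
    """Return (type, label) pairs for every {#type:label} anchor."""
    return ANCHOR.findall(text)


def find_references(text):
    """Return (type, label) pairs for every single [@type:label] reference."""
    return REFERENCE.findall(text)


sample = "{#fig:cats} See [@fig:cats] and [@図版:猫]. {#eq:mass-energy}"
print(find_anchors(sample))     # [('fig', 'cats'), ('eq', 'mass-energy')]
print(find_references(sample))  # [('fig', 'cats'), ('図版', '猫')]
```

Because anchors and references share the same type:label shape, a renderer could check that every reference has a matching anchor by comparing the two lists.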

Note that the bare form @type:label (without brackets) could be considered invalid (or syntactic sugar?), although disallowing it would be a breaking change because some implementations support it.

Multiple cross-references could be written as:

see [@fig:cat; @fig:dog; @fig:dolphin; @tab:habitats].

That could be rendered in one of many ways, depending on the presentation logic:

see Figures 1.1 to 1.3 and Table 1.1.
see Figures 1.1–1.3 and Table 1.1.
see Figures 1.1, 1.2, 1.3, and Table 1.1.
see Figure 1.1, Figure 1.2, Figure 1.3, and Table 1.1.
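
As a rough sketch of how presentation logic might produce the first of these renderings, the Python below groups consecutive references of the same type and collapses each run into a range. The NAMES and NUMBERS tables and both function names are hypothetical; a real renderer would derive the numbering from the anchors, and this sketch naively assumes the numbers within a run are contiguous.

```python
import re
from itertools import groupby

# Hypothetical presentation settings: singular and plural names per type.
NAMES = {"fig": ("Figure", "Figures"), "tab": ("Table", "Tables")}

# Hypothetical numbering, as a renderer might assign after resolving anchors.
NUMBERS = {
    ("fig", "cat"): "1.1",
    ("fig", "dog"): "1.2",
    ("fig", "dolphin"): "1.3",
    ("tab", "habitats"): "1.1",
}


def parse_group(group):
    """Split '[@fig:cat; @tab:habitats]' into (type, label) pairs."""
    return re.findall(r"@(\w+):([\w-]+)", group)


def render(group, range_word="to"):
    """Render a reference group, collapsing each run of one type into a range."""
    parts = []
    for kind, run in groupby(parse_group(group), key=lambda ref: ref[0]):
        numbers = [NUMBERS[ref] for ref in run]
        singular, plural = NAMES[kind]
        if len(numbers) == 1:
            parts.append(f"{singular} {numbers[0]}")
        else:
            # Naive: assumes the run is numbered contiguously.
            parts.append(f"{plural} {numbers[0]} {range_word} {numbers[-1]}")
    return " and ".join(parts)


print("see " + render("[@fig:cat; @fig:dog; @fig:dolphin; @tab:habitats]") + ".")
# prints: see Figures 1.1 to 1.3 and Table 1.1.
```

The other renderings would differ only in how each run is joined, which is why the choice can stay in the presentation layer.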

This allows for:

see [@fig:cat; @fig:dog; @fig:dolphin; @tab:habitats] 
starting on [@page:cat].

The rendering software would have to be flexible enough to allow the user to define the behaviour for the type name (or provide suitable defaults). For example, @page could map to \at{page}[label]. This allows the flexibility of:

[@fig:cat]

to map to see \in[cat] on \at{page}[cat], which could render as:

see Figure 1.1 on page 9
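
One way such a user-defined mapping might look is sketched below in Python: each type name maps to an output template, and references are expanded by substitution. The TEMPLATES table, the DEFAULT fallback, and the expand function are hypothetical; the template strings simply mirror the mappings mentioned above.

```python
import re
from string import Template

# Hypothetical user configuration: one output template per type name.
TEMPLATES = {
    "page": Template(r"\at{page}[$label]"),
    "fig": Template(r"see \in[$label] on \at{page}[$label]"),
}

# Assumed fallback for type names the user has not configured.
DEFAULT = Template(r"\in[$label]")

REFERENCE = re.compile(r"\[@(\w+):([\w-]+)\]")


def expand(text):
    """Replace each [@type:label] with the template configured for its type."""
    def substitute(match):
        kind, label = match.groups()
        return TEMPLATES.get(kind, DEFAULT).substitute(label=label)

    return REFERENCE.sub(substitute, text)


print(expand("[@fig:cat]"))   # see \in[cat] on \at{page}[cat]
print(expand("[@page:cat]"))  # \at{page}[cat]
```

Keeping the templates in configuration means the Markdown source never has to state how a given type is presented.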

For bibliographic references, labels are cross-referenced against an external database. The renderer must be told which type name denotes such a reference, meaning the following may be valid:

[@bib:doe2021]
[@参考文献:doe2021]

Labels not found in the database may result in a warning by the rendering software.
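
A minimal Python sketch of that lookup follows. The BIB_TYPES set, the DATABASE dictionary (a stand-in for something like a parsed .bib file), and the resolve function are all hypothetical names chosen for illustration.

```python
import re
import warnings

# Assumption: these type names are configured to mean "bibliographic reference".
BIB_TYPES = {"bib", "参考文献"}

# Hypothetical stand-in for an external database of citation keys.
DATABASE = {"doe2021": "Doe (2021)"}

REFERENCE = re.compile(r"\[@(\w+):([\w-]+)\]")


def resolve(reference):
    """Look up a bibliographic reference; warn and return None if unknown."""
    match = REFERENCE.fullmatch(reference)
    if match is None or match.group(1) not in BIB_TYPES:
        return None  # not a bibliographic reference
    label = match.group(2)
    if label not in DATABASE:
        warnings.warn(f"citation key not found: {label}")
        return None
    return DATABASE[label]


print(resolve("[@bib:doe2021]"))       # Doe (2021)
print(resolve("[@参考文献:doe2021]"))  # Doe (2021)
```

Calling resolve with a key that is missing from the database, such as [@bib:nobody], emits a warning and returns None rather than stopping the render.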

A locator follows the reference after a comma and gives a locator type and numbering, such as:

[@bib:doe2021, pp. 33-35, and *passim*; @bib:smith2024, ch. 1]

Here, too, what qualifies as a locator type would need to be configurable to allow for I18N, such as:

[@bib:descartes, lv. 2]

Where lv. would be rendered as livre. The default locator type is page, allowing it to be omitted, as per:

[@bib:doe2021, 33-35, and *passim*; @bib:smith2024, ch. 1]
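
The locator rules above could be sketched in Python roughly as follows. The LOCATORS table, DEFAULT_LOCATOR, and both function names are hypothetical, and splitting items on semicolons is a simplification that would break if a locator's trailing text itself contained a semicolon.

```python
import re

# Hypothetical, user-configurable locator terms mapped to their rendered names.
LOCATORS = {"p.": "page", "pp.": "pages", "ch.": "chapter", "lv.": "livre"}

# Per the proposal, the locator type defaults to page when omitted.
DEFAULT_LOCATOR = "page"

ITEM = re.compile(r"@(\w+):([\w-]+)(?:,\s*(.*))?")


def parse_item(item):
    """Split one citation item into (label, locator type, remaining text)."""
    match = ITEM.fullmatch(item.strip())
    if match is None:
        raise ValueError(f"not a citation item: {item!r}")
    _, label, rest = match.groups()
    if not rest:
        return label, None, ""  # no locator given
    term, _, remainder = rest.partition(" ")
    if term in LOCATORS:
        return label, LOCATORS[term], remainder
    return label, DEFAULT_LOCATOR, rest  # locator type omitted


def parse_citation(citation):
    """Parse '[@bib:a, ch. 1; @bib:b, 33-35]' into a list of parsed items."""
    return [parse_item(item) for item in citation.strip("[]").split(";")]


print(parse_citation("[@bib:doe2021, 33-35, and *passim*; @bib:smith2024, ch. 1]"))
# [('doe2021', 'page', '33-35, and *passim*'), ('smith2024', 'chapter', '1')]
print(parse_citation("[@bib:descartes, lv. 2]"))
# [('descartes', 'livre', '2')]
```

Since the locator terms live in a table, adding a language amounts to adding entries rather than changing the syntax.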

Thoughts?

P.S.
This topic stems from a question posed in a KeenWrite discussion. See "Is it possible to support autocompletion of citing references in bib file?" (Discussion #144, DaveJarvis/keenwrite, GitHub) for details.