Expanding multiple definition behavior

It’s been five years since this rule was proposed, but I think it should be reconsidered.

As it notes, this is the most sensible rule to follow to ensure that multiple concatenated documents won’t interfere with each other: any simpler rule will cause one document using the same set of tokens as another concatenated document to break the other’s links.

(And yes, by arguing for a prescriptive change like this to the spec, I’m doing a complete 180 from the hardline-descriptivist position I held when CommonMark was first launched five years ago. Lesson learned: if you’re going to ask every implementation to make some changes in order to conform to a first-of-its-kind formal specification, it’s worth sneaking in minor cleanups that’ll apply to all of them.)

I am not opposed to multiple link definitions using the same link label, although that would be a major change so close to 1.0. The exact algorithm would, however, need to consider automatically generated labels from headings. I therefore think that the rules proposed in the old thread are not sufficient:

  1. A link uses the first label defined after it appears.
  2. If no label appears after a link, the link uses the closest label defined above.
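
For reference, the two rules from the old thread can be sketched in a few lines of Python. This is only an illustration: `pick_definition` and its line-number arguments are names I made up, and it assumes all definitions share the same label.

```python
def pick_definition(link_line, definition_lines):
    """Return the line of the definition a link would use, or None.

    Rule 1: the first definition after the link wins.
    Rule 2: otherwise, the closest definition above the link wins.
    """
    after = [n for n in definition_lines if n > link_line]
    if after:
        return min(after)
    before = [n for n in definition_lines if n < link_line]
    return max(before) if before else None


# A link on line 22 with definitions on both sides uses the one on line 24.
print(pick_definition(22, [19, 20, 24, 25]))
# With nothing after the link, it falls back to the closest one above (20).
print(pick_definition(22, [19, 20]))
```

Note that this only looks at line order and knows nothing about headings, which is the gap the hierarchical approach is meant to close.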

Instead, I would like to suggest a hierarchical approach. There are several ways to specify the details of the required algorithm; I would prefer a strong vertical priority.

  1. A reference link uses the deepest (alternative: closest) link definition within its hierarchy. Vertical relationships come before any (alternative: equidistant) horizontal ones.
  2. If both ancestor and descendant sections contain a link definition, the highest descendant (if any) is chosen over the lowest ancestor. Alternative: … the one with the closest distance is chosen, descendant before ancestor.
  3. If two implicit or two explicit definitions have the same horizontal distance, the closest one appearing after the link reference is preferred.
  4. If an implicit and an explicit definition have the same distance to the link, the explicit definition is always preferred.
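
| Relationship | Vertical Distance | Horizontal Distance | Distance |
|---|---|---|---|
| Self | ±0 | 0 | 0 |
| Child | +1 | 0 | 1 |
| Parent | –1 | 0 | 1 |
| Grandchild | +2 | 0 | 2 |
| Grandparent | –2 | 0 | 2 |
| Descendant | +1…6 | 0 | 1…6 |
| Ancestor | –1…6 | 0 | 1…6 |
| Root | –0…6 | 0 | 0…6 |
| Sibling | ±0 | 1 | 1 |
| Cousin | ±0 | 2 | 2 |
| Niece / Nephew | +1 | 1 | 2 |
| Aunt / Uncle | –1 | 1 | 2 |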

Another alternative

The link definition distance is 0 for the section of the reference link itself. Otherwise it is the sum of the absolute values of the vertical link distance and the horizontal link distance. The vertical link distance is increased by 1 for each section level below the one of the reference link or it is decreased by 1 for each section level above. The horizontal link distance is the vertical link distance to the closest common ancestor section.
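
Here is a minimal Python sketch of that distance, under a few assumptions of mine: sections are written as tuples of heading numbers (`()` for the document root, `(1, 2)` for section 1.2), and the horizontal distance is read as the number of levels between the shallower of the two sections and their closest common ancestor. That reading is the one that matches the relationship table; the function name `link_distance` is hypothetical.

```python
def link_distance(ref_path, def_path):
    """Return (vertical, horizontal, total) between two sections.

    ref_path: section containing the reference link, e.g. (1, 2)
    def_path: section containing the link definition, e.g. (1, 2, 1)
    """
    # Length of the shared prefix = depth of the closest common ancestor.
    shared = 0
    for a, b in zip(ref_path, def_path):
        if a != b:
            break
        shared += 1
    # Positive for deeper sections, negative for shallower ones.
    vertical = len(def_path) - len(ref_path)
    # Assumed reading: levels from the shallower section up to the ancestor.
    horizontal = min(len(ref_path), len(def_path)) - shared
    return vertical, horizontal, abs(vertical) + horizontal


ref = (1, 2)
for name, path in [("self", (1, 2)), ("child", (1, 2, 1)), ("parent", (1,)),
                   ("sibling", (1, 3)), ("aunt/uncle", (2,))]:
    print(name, link_distance(ref, path))
```

Other readings of the horizontal distance are possible; they would only change the rows for nieces, nephews, aunts and uncles.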

Sample structure

[foo]: e <!-- first explicit link definition -->
[foo]: x <!-- last explicit link definition -->

# Foo 
<!-- 1i, implicit link definition -->

[foo]: 1e
[foo]: 1x

## Foo 
<!-- 1.1i, preceding sibling -->

[foo]: 1.1e
[foo]: 1.1x

## Foo 
<!-- 1.2i, self -->

[foo]: 1.2e
[foo]: 1.2b <!-- explicit link definition before -->

[Link][foo]

[foo]: 1.2a <!-- explicit link definition after -->
[foo]: 1.2x

### Foo 
<!-- 1.2.1i, child -->

[foo]: 1.2.1e
[foo]: 1.2.1x

## Foo 
<!-- 1.3i, succeeding sibling -->

[foo]: 1.3e
[foo]: 1.3x

## Foo 
<!-- 1.4i, final sibling -->

[foo]: 1.4e
[foo]: 1.4x

# Foo 
<!-- 2i, sibling of parent, end of document -->

[foo]: 2e
[foo]: 2x

1.2a > 1.2x > 1.2b > 1.2e > 1.2i >
1.2.1e > 1.2.1x > 1.2.1i >
1e > 1x > 1i > e > x >
1.3e > 1.3x > 1.3i >
1.4e > 1.4x > 1.4i >
1.1e > 1.1x > 1.1i >
2e > 2x > 2i

Compatibility

Most existing Markdown parsers use either the very first definition, e (as specified by CommonMark), or the very last one, 2x. RDiscount seems to choose the deepest definition, 1.2.1e.
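
To make the two common behaviours concrete, here is a small Python sketch of a first-wins and a last-wins definition table. It is not taken from any particular parser: `normalize_label` and `collect_definitions` are hypothetical names, the destinations are placeholders, and the normalization (trim, collapse whitespace, case fold) is a simplified approximation of CommonMark's label matching.

```python
import re


def normalize_label(label):
    # Simplified label matching: trim, collapse inner whitespace, case fold.
    return re.sub(r"\s+", " ", label.strip()).casefold()


def collect_definitions(definitions, first_wins=True):
    """Build a label -> destination table from (label, destination) pairs."""
    table = {}
    for label, destination in definitions:
        key = normalize_label(label)
        if first_wins:
            table.setdefault(key, destination)  # keep the earliest definition
        else:
            table[key] = destination            # later definitions overwrite
    return table


defs = [("foo", "/first"), ("FOO", "/second"), ("foo", "/last")]
print(collect_definitions(defs, first_wins=True))   # first definition kept
print(collect_definitions(defs, first_wins=False))  # last definition kept
```

Both variants are a single flat table for the whole document, which is why neither can take the section structure into account.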

Reconsidering, Iʼm afraid it would make more sense and improve compatibility to prefer any explicit link definition over all implicit link definitions:

1.2a > 1.2b > 1.2x > 1.2e >
1.2.1e > 1.2.1x >
1e > 1x > e > x >
1.3e > 1.3x > 1.4e > 1.4x > 1.1e > 1.1x >
2e > 2x >
1.2i > 1.2.1i > 1i > 1.3i > 1.4i > 1.1i > 2i
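
As a sanity check, the ordering above can be reproduced with a single sort key. The Python below is only a sketch under my own assumptions: sections are tuples of heading numbers, line numbers are counted from the top of the sample structure (the link sits on line 22), and the tie-breaks (closest line first inside the link's own section, document order with following sections first elsewhere) are one guess that happens to fit this sample. `rank_key` and `CANDIDATES` are made-up names.

```python
# (name, section path, line in the sample, is_implicit)
CANDIDATES = [
    ("e", (), 1, False), ("x", (), 2, False),
    ("1i", (1,), 4, True), ("1e", (1,), 7, False), ("1x", (1,), 8, False),
    ("1.1i", (1, 1), 10, True), ("1.1e", (1, 1), 13, False),
    ("1.1x", (1, 1), 14, False),
    ("1.2i", (1, 2), 16, True), ("1.2e", (1, 2), 19, False),
    ("1.2b", (1, 2), 20, False), ("1.2a", (1, 2), 24, False),
    ("1.2x", (1, 2), 25, False),
    ("1.2.1i", (1, 2, 1), 27, True), ("1.2.1e", (1, 2, 1), 30, False),
    ("1.2.1x", (1, 2, 1), 31, False),
    ("1.3i", (1, 3), 33, True), ("1.3e", (1, 3), 36, False),
    ("1.3x", (1, 3), 37, False),
    ("1.4i", (1, 4), 39, True), ("1.4e", (1, 4), 42, False),
    ("1.4x", (1, 4), 43, False),
    ("2i", (2,), 45, True), ("2e", (2,), 48, False), ("2x", (2,), 49, False),
]
REF_PATH, REF_LINE = (1, 2), 22  # section and line of [Link][foo]


def rank_key(cand, ref_path=REF_PATH, ref_line=REF_LINE):
    name, path, line, implicit = cand
    shared = 0
    for a, b in zip(path, ref_path):
        if a != b:
            break
        shared += 1
    vertical = len(path) - len(ref_path)
    horizontal = min(len(path), len(ref_path)) - shared
    # Same level first, then deeper sections, then shallower ones.
    level_class = 0 if vertical == 0 else (1 if vertical > 0 else 2)
    if path == ref_path:
        # Own section: closest line wins, "after" beats "before" on ties.
        tie = (abs(line - ref_line), line < ref_line)
    else:
        # Other sections: following ones first, then document order.
        tie = (line < ref_line, line)
    # Explicit before implicit, vertical before horizontal relationships.
    return (implicit, horizontal, level_class, abs(vertical)) + tie


ranking = [c[0] for c in sorted(CANDIDATES, key=rank_key)]
print(" > ".join(ranking))
```

With more than one preceding sibling the "document order" tie-break would probably need to be replaced by "nearest section first"; the sample does not exercise that case.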

The alternatives outlined above mostly differ for edge cases.

Distance over Sequence

1.2a > 1.2b > 1.2x > 1.2e > 1.2i >

1.3e > 1.3x > 1.3i >
1.1e > 1.1x > 1.1i >
1.4e > 1.4x > 1.4i >