Anchors in markdown

I agree that this is a good syntax. It’s already widely supported (e.g. in pandoc). I think the main questions are:

  1. Since this is really an extension, should it wait until we’ve got the existing core nailed down, or should we just plow ahead?

  2. Should this be thought of as a special case of a more general attribute specifier? In pandoc you can have {#identifier .class .other-class key="value"}. Of course we could also support the simple identifier form for now and leave the others for later.
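
To make the "general attribute specifier" idea concrete, here is a minimal sketch of how a block like {#identifier .class .other-class key="value"} could be split into an ID, classes, and key/value pairs. The function name `parse_attributes` and the token rules are assumptions for illustration only; this is not pandoc's actual grammar, which accepts more forms.

```python
import re

# Hypothetical token pattern: "#id", ".class", or key="value".
# Real implementations (e.g. pandoc) accept more forms than this sketch.
_TOKEN = re.compile(
    r'#(?P<id>[\w-]+)|\.(?P<cls>[\w-]+)|(?P<key>[\w-]+)="(?P<val>[^"]*)"'
)


def parse_attributes(block: str) -> dict:
    """Split an attribute block into its id, classes and key/value pairs."""
    inner = block.strip()
    if not (inner.startswith("{") and inner.endswith("}")):
        raise ValueError("attribute block must be wrapped in curly braces")
    result = {"id": None, "classes": [], "attrs": {}}
    for match in _TOKEN.finditer(inner[1:-1]):
        if match.group("id"):
            # If several IDs are given, the last one wins in this sketch.
            result["id"] = match.group("id")
        elif match.group("cls"):
            result["classes"].append(match.group("cls"))
        else:
            result["attrs"][match.group("key")] = match.group("val")
    return result


parsed = parse_attributes('{#identifier .class .other-class key="value"}')
```

Supporting only the simple identifier form first would amount to accepting the `#id` token and rejecting the other two, which leaves room to add classes and key/value pairs later without changing the delimiter.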


Plow ahead with this being the first extension. It’s been two years and the distraction could prove to be a refreshing break from dealing with edge cases in the core. Plus lots of people have been asking for a way to add anchors, and manual header ID generation is relatively simple (compared to, say, tables). Automatic header ID generation could be considered later.

It would make sense to group these together. #id for IDs and .class for classes is intuitive. key="value" could be key: "value" or key: value and look a bit less “programmerish” so it’s less obvious to me what the best syntax is here.


There are lots of subtle syntax variants that most people would also accept, but some of them may score better on compatibility. Several could be supported and others forbidden. Some alternatives play better with the info strings of fenced code blocks, others with current or proposed link syntax.

Meta data inside curly braces

  1. ## Heading {#ID .class}

  2. ## Heading ## {#ID .class}

  3. ## Heading {#ID .class} ##

  4. ## {#ID .class} Heading

  5. {#ID .class} ## Heading

  6. {#ID .class}
    ## Heading

  7. {#ID .class}
    ## Heading ##

  8. ## Heading
    {#ID .class}

  9. ## Heading ##
    {#ID .class}

  10. Heading {#ID .class}
    -------

  11. {#ID .class} Heading
    -------

  12. Heading
    ------- {#ID .class}

  13. Heading
    {#ID .class} -------

Meta data (only) separated by line affix

  1. ## Heading ## #ID .class

  2. #ID .class ## Heading ##

  3. #ID .class ## Heading

  4. #ID ## Heading ## .class

  5. .class ## Heading ## #ID

  6. Heading
    ------- #ID .class

Explicit IDs by reusing link (definition) syntax

  1. [ID]
    ## Heading

  2. [ID]:
    ## Heading

  3. ## Heading [][ID]

  4. ## [][ID] Heading

  5. ## [Heading][ID]

  6. ## Heading
    [][ID]

  7. [][ID]
    ## Heading

  8. ## Heading …
    [Heading]: ID

  9. ## Heading …
    [Heading]: #ID

  10. ## Heading …
    [#Heading]: ID

  11. ## Heading …
    [#Heading]: #ID

  12. ## Heading …
    [Heading]: [ID]

  13. ## [Heading] …
    [Heading]: ID

  14. ## [Heading] …
    [Heading]: #ID

  15. ## [Heading] …
    [#Heading]: ID

  16. ## [Heading] …
    [#Heading]: #ID

  17. ## [Heading] …
    [Heading]: [ID]

I’m probably forgetting some possibilities and proposals.


It would be nice to see progress with autogenerated anchors, but with security considerations in mind.

As I explained earlier, it’s not safe to generate id/name values without a prefix, because a value can become equal to window.<anything> in the browser. It would be very inconvenient for developers if the spec ignored this problem.


Perhaps the spec could include a default prefix, e.g. # My Header {#id-of-header} becomes <h1 id="commonmark-id-of-header">My Header</h1>.
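
A minimal sketch of that default-prefix idea, assuming a renderer that receives the heading level, text, and manual ID separately. The names `ID_PREFIX` and `render_heading` are hypothetical; only the "commonmark-" prefix and the expected output come from the example above.

```python
import html

# Prefix taken from the example above; a real renderer would likely make it configurable.
ID_PREFIX = "commonmark-"


def render_heading(level: int, text: str, manual_id: str) -> str:
    """Render a heading whose id is always namespaced with ID_PREFIX."""
    if not 1 <= level <= 6:
        raise ValueError("heading level must be between 1 and 6")
    prefixed = ID_PREFIX + manual_id
    # Escape both the attribute value and the text so neither can break out of the tag.
    return (
        f'<h{level} id="{html.escape(prefixed, quote=True)}">'
        f"{html.escape(text)}</h{level}>"
    )


heading_html = render_heading(1, "My Header", "id-of-header")
```

With a fixed prefix, an author-supplied ID such as `location` ends up as `commonmark-location`, so it can no longer match a bare name on the browser's `window` object.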

That’s a completely different thing. Direct manual manipulation of ids/classes/attrs is almost as unsafe as using raw HTML, and it should be disabled for untrusted input if you don’t wish to use sanitizers.

Here I am speaking only about autogenerated header ids; that use case is specific.

If my point about scopes is right, then this is the responsibility of the embedding scope to address.


Very similar question: Feature request: automatically generated ids for headers

That’s moving the problem from one place to another (and more difficult) one instead of resolving it.

But as I explained in my above linked comment, it’s moving the problem to the right place. For example, pre-HTML5, there should only be one H1 on a page. But the Markdown spec, which I believe we all agree should be portable and not tightly coupled to HTML, doesn’t and shouldn’t concern itself with possible collisions between a level one Markdown heading and an H1 in the embedding context. It’s the responsibility of the embedding context (e.g. this discourse page) to demote the Markdown headings if it wanted to implement the “only one H1” rule. It’s actually far more complex to try and solve this problem for every possible downstream context, both those that exist and ones that haven’t been invented yet. It’s far more complex to solve it in the wrong place.
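
As a rough illustration of what "demoting in the embedding context" could look like, here is a sketch that shifts heading levels in an already-rendered HTML fragment. The function `demote_headings` is hypothetical, and using a regular expression on HTML is a simplification; a real embedding application would more likely operate on a parsed tree.

```python
import re

# Matches the start of an opening or closing h1..h6 tag.
_HEADING_TAG = re.compile(r"<(/?)h([1-6])\b")


def demote_headings(html_fragment: str, offset: int = 1) -> str:
    """Shift every heading down by `offset` levels, capping at h6."""

    def shift(match: re.Match) -> str:
        level = min(6, int(match.group(2)) + offset)
        return f"<{match.group(1)}h{level}"

    return _HEADING_TAG.sub(shift, html_fragment)


embedded = demote_headings('<h1 id="a">Title</h1><h2>Sub</h2>')
```

The Markdown source stays unchanged here; only the host page decides how its own outline absorbs the fragment.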

That’s a subjective personal opinion. From my point of view, this is not the right place :), because the implementation will be much more difficult. At least from your posts, I don’t see that you are familiar with implementations or know an easy way to sanitize inputs.

Hi,

Is this working?

I can only say that from a user’s point of view this is generating a smorgasbord of dialects, when just adding a piece of text like {#get-back-here} would be sufficient.


Hi,
I see a very simple, easy-to-use implementation of anchors in the CommonMark spec:

A [whitespace character](@) is a space …

…A [non-whitespace character](@) is any character that is not a [whitespace character].

Will that not be suitable? It is very simple, clear…
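
A small sketch of how that convention could be processed, assuming that `[term](@)` marks a definition and that a bare `[term]` refers back to it, as the quoted excerpt suggests. The function `render_term_anchors` and the ID scheme are hypothetical and are not taken from the spec's own tooling.

```python
import html
import re

# "[term](@)" defines an anchor; "[term]" alone refers to a defined term.
_DEFINITION = re.compile(r"\[([^\]]+)\]\(@\)")
_REFERENCE = re.compile(r"\[([^\]]+)\](?![\(\[:])")


def _anchor_id(term: str) -> str:
    """Derive an id from the term text (assumed scheme: lowercase, hyphenated)."""
    return re.sub(r"[^a-z0-9]+", "-", term.lower()).strip("-")


def render_term_anchors(text: str) -> str:
    defined = {m.group(1) for m in _DEFINITION.finditer(text)}

    def define(m: re.Match) -> str:
        term = m.group(1)
        return f'<a id="{_anchor_id(term)}">{html.escape(term)}</a>'

    def refer(m: re.Match) -> str:
        term = m.group(1)
        if term not in defined:
            return m.group(0)  # leave unknown bracketed text untouched
        return f'<a href="#{_anchor_id(term)}">{html.escape(term)}</a>'

    # Definitions are replaced first so they are not mistaken for references.
    return _REFERENCE.sub(refer, _DEFINITION.sub(define, text))


sample = (
    "A [whitespace character](@) is a space. "
    "A [non-whitespace character](@) is any character that is not a [whitespace character]."
)
rendered = render_term_anchors(sample)
```

One thing the sketch makes visible is that the anchor ID is derived from the link text, so this style covers defining terms well but gives the author no way to choose a different ID, unlike {#ID}.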


I actually like Pal_Petho’s general idea - though it should match the style of the already-agreed on manual header IDs, and shouldn’t delay implementation of that.

Questionnaire

The topic is complex and there are a lot of options. I have tried to condense the principles behind them into a set of questions. This is not intended as a deciding vote but for finding out the collective opinion.

If existing CommonMark constructs are used to generate target anchors in the output format, these are known as automatic anchors or implicit anchors. A new construct or convention would be needed for manual anchors or explicit anchors.

Automatic anchors

  • All headings should become anchors automatically (using their textual content)
  • Implicit anchors (e.g. ## Heading) should automatically be available as reference link definition labels for overrides (e.g. [Heading]: {#ID})
  • Unused reference link definitions ([label]:) should become anchors automatically


  • All links should become anchors automatically (using their textual content)
  • Specific inline links (e.g. [text](@) or [text]()) should become anchors automatically
  • All reference links ([text][label]) should become anchors automatically (using their label)
  • Specific reference links (e.g. [text][#label] or [][label]) should become anchors automatically
  • Other links should not become anchors automatically


Manual anchor restrictions

Manual anchors …

  • may be restricted to headings
  • may be restricted to defining terms (e.g. <dfn> in HTML output)
  • may be restricted to blocks (i.e. headings, code blocks, quotations, …)
  • may be restricted to headings and defining terms
  • may be restricted to blocks and defining terms
  • should be available in arbitrary locations


Manual anchor positions

Manual anchors …

  • should always come before/above text (e.g. ## {#ID} Heading ##)
  • should always come after/below text (e.g. ## Heading {#ID} ##)
  • may come either before/above or after/below text


Manual anchors in headings

  • should come between text and affix ## or underline === (e.g. ## Heading {#ID} ##)
  • should stay outside text and affix ## or underline === (e.g. ## Heading ## {#ID})


Manual anchors in links

  • should be inside the text part (e.g. [text {#ID}](target))
  • should be inside the target or label part (e.g. [text](target {#ID}))
  • should be outside current text and target or label parts (e.g. [text](target){#ID})


Stylistic preferences

  • Anchor ID should always be inside curly braces {}
  • Anchor ID should always be prefixed by a hash sign #
  • Manual anchors should always be on a separate line




I’d be happy to have automated anchors, but it will be a big pain if they land in the spec without security considerations. Also, manual IDs are not convenient (IMHO) as the primary solution and may lead users to make security mistakes.

“Choose at least two options” on the link anchor poll is wack when I don’t think links should become anchors.

It’s headers (as in h1, h2, h3, h4, h5 and h6) that should become anchors.

Slugified from the text, sanitized for safety, and disambiguated with numbers iff there are duplicates.
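
The three steps in that sentence can be sketched as follows. This is one possible reading, not a proposed spec algorithm: the ASCII-only slug rule, the "section" fallback for empty slugs, and the function names are all assumptions.

```python
import re
import unicodedata


def slugify(text: str) -> str:
    """Lowercase ASCII slug: only a-z, 0-9 and single hyphens survive."""
    normalized = unicodedata.normalize("NFKD", text)
    ascii_text = normalized.encode("ascii", "ignore").decode("ascii").lower()
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text).strip("-")
    return slug or "section"  # assumed fallback when nothing survives


def assign_heading_ids(headings: list[str]) -> list[str]:
    """Give each heading a unique id, adding -1, -2, ... only on collision."""
    used: set[str] = set()
    ids: list[str] = []
    for text in headings:
        base = slugify(text)
        candidate, n = base, 1
        while candidate in used:
            candidate = f"{base}-{n}"
            n += 1
        used.add(candidate)
        ids.append(candidate)
    return ids


ids = assign_heading_ids(["Intro", "Intro", "Intro 1", "Héllo, World!"])
```

Checking candidates against the set of IDs already handed out, rather than just counting repeats, avoids the case where a second "Intro" and a literal "Intro 1" heading would both end up as `intro-1`. The sketch does not add a prefix, so the `window.<anything>` concern raised earlier would still need to be handled separately.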

So, are we anywhere useful on this?

Nope. Many people are just using GFM instead of CM when they really need this feature.


From a practical point of view, I’m wondering if this is actually a real problem. GFM is a strict superset of CommonMark, so you’re using CommonMark if you’re using GFM.

The goal of CommonMark was to be a strongly defined, highly compatible specification of Markdown. Anchors were never a part of Markdown. CommonMark has been successful in formalising the features of Markdown, but hasn’t had any movement toward going beyond this and formalising extensions. GitHub has a practical need to formalise particular extensions to the CommonMark spec, so why not just use their spec?
