Anchors in markdown

See also this related thread: Turning empty link definitions into anchors

No, please do not merge this topic with http://talk.commonmark.org/t/consistent-attribute-syntax

These are completely different things.

Additional attributes are presentation-level bells and whistles - classes for different styling, image sizes, custom attributes… These can be additional extensions to the spec - no problem.

But anchors (this topic) should be part of the spec core, part of links. Why does the spec allow me to link to another page/doc, but not to another place in the same doc?


I’m very sad to see this thread hasn’t moved in 2 years. I feel this is an absolutely essential core element of markdown that should supersede all other considerations right now. Markdown simply cannot be used in any serious document format without the ability to set and reference anchors for use in tables of contents, chapter indexes, bibliographic and footnote references… you name it. Every single document format ever created, from richtext to word, hlp to chm, from epub to mobi, and 99 other ebook and document formats, has an anchor naming and linking system in place.

Markdown is the absolute singular contender yet to enter the arena.

Anchors for headers have been discussed quite extensively in the Automatically generated IDs for headers topic.

Thanks for the link! Btw, I’m not sure that headers are the only place for anchors.


tldr; There is absolute consensus that manual header id assignment is an important feature for CommonMark and its absence is a weakness of the spec. There is general consensus that the syntax that best fits this use case is a {#anchor-id} following the definition of the header. All the contentious issues that have held it up do not actually apply to manual header id generation. How can this move forward?

Consensus syntax:

# Header example {#Anchor-id}

To create

<h1 id="Anchor-id"> Header example </h1>  
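
As an illustration only, here is a minimal Python sketch of how the mapping above could work for ATX headers. The names `HEADING_RE` and `render_heading` are hypothetical, the set of characters allowed in an id is an assumption, and none of this is taken from the spec or from any existing parser.

```python
import html
import re
from typing import Optional

# Assumed shape: 1-6 hashes, the heading text, then an optional trailing {#id}.
# The characters allowed in the id are an assumption, not a spec rule.
HEADING_RE = re.compile(
    r"^(?P<hashes>#{1,6})[ \t]+(?P<text>.*?)[ \t]*"
    r"(?:\{#(?P<id>[A-Za-z][\w:.-]*)\})?[ \t]*$"
)


def render_heading(line: str) -> Optional[str]:
    """Render one ATX heading line to HTML, or return None if it is not one."""
    m = HEADING_RE.match(line)
    if m is None:
        return None
    level = len(m.group("hashes"))
    text = html.escape(m.group("text"))
    anchor = m.group("id")
    # Emit the id attribute only when the author supplied one.
    id_attr = f' id="{html.escape(anchor, quote=True)}"' if anchor else ""
    return f"<h{level}{id_attr}>{text}</h{level}>"


print(render_heading("# Header example {#Anchor-id}"))
```

A heading without the trailing braces is rendered without an id, so the sketch says nothing about automatic id generation either way.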

Estimates of consensus:

This thread

Of 16 posts on this thread 11 speak positively of this syntax or the importance of a solution. 4 of the remaining do not address the merits of the syntax or the solution. 1 states that the problem is likely to have no perfect solution.

Automatic id generation thread

Of 84 posts on the Automatically generated IDs for headers topic, 15 speak positively of the {# } syntax & of the importance of having a solution.

Three other syntax styles were proposed, with 4, 2, and 1 posts in favour from 2, 1, and 1 proponents respectively. All of the posts proposing other syntaxes also mention the more general attribute assignment problem. Of those favouring other syntaxes, 1 of the 2 proponents of the 4-post syntax later agreed that the {#anchor} syntax is better, on the grounds that it is already present in pandoc.

The rest of the posts (62) do not explicitly discuss the syntax or the need for manual anchor ids, focusing instead on the main issue of that other thread (how to automatically generate ids).

Conclusion: Consensus has been reached, progress has been arrested by scope creep

Having analyzed this thread as well as the discussion at Automatically generated IDs for headers topic, it seems that the correct next move is to proceed toward implementing the {# } syntax for manually specifying ids for atx headers.

The discussion began 2 years ago, and it should not need to continue longer before this feature is included in the spec.

There is universal consensus that allowing manual ids on headers is an important feature and an assured improvement to the CommonMark spec.

There is widespread (but not universal) consensus on the use of the {# } syntax for manual header id assignment. Of posts that comment on syntaxes, 79% support this syntax, over 6× the level of support for any other syntax.

Addressing Dissent: Manual header ids are not objects of dissent

All of the dissent in these threads seems to revolve around whether this should be a more general way of assigning attributes and whether there should be auto-generated header ids.

These concerns don’t need to block progress on manually specified header ids. This approach to including header ids leaves open the possibility of autogenerating them (but says nothing about autogeneration one way or the other). Additionally, it allows for other syntaxes (as well as an expansion of the same syntax) as a means of assigning attributes to headers.

Steps forward?

@jgm and @codinghorror, what are the next steps needed to see progress on this? Happy to put in the effort wherever it is needed.


I agree that this is a good syntax. It’s already widely supported (e.g. in pandoc). I think the main questions are:

  1. Since this is really an extension, should it wait til we’ve got the existing core nailed down, or should we just plow ahead?

  2. Should this be thought of as a special case of a more general attribute specifier? In pandoc you can have {#identifier .class .other-class key="value"}. Of course we could also support the simple identifier form for now and leave the others for later.
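
To make the second question concrete, here is a rough Python sketch of reading such an attribute specifier, where the simple identifier form is just the special case with no classes or key/value pairs. `TOKEN_RE` and `parse_attributes` are hypothetical names, the token grammar is an assumption, and this is not pandoc's actual parser.

```python
import re

# Assumed token grammar: #id, .class, or key="value".
TOKEN_RE = re.compile(
    r'#(?P<id>[\w:-]+)'
    r'|\.(?P<cls>[\w-]+)'
    r'|(?P<key>[\w-]+)="(?P<val>[^"]*)"'
)


def parse_attributes(spec: str) -> dict:
    """Split a {...} attribute block into an id, classes, and key/value pairs."""
    inner = spec.strip()
    if not (inner.startswith("{") and inner.endswith("}")):
        raise ValueError("attribute block must be wrapped in curly braces")
    attrs = {"id": None, "classes": [], "pairs": {}}
    # Lenient on purpose: anything that is not a recognised token is skipped.
    for m in TOKEN_RE.finditer(inner[1:-1]):
        if m.group("id") is not None:
            attrs["id"] = m.group("id")
        elif m.group("cls") is not None:
            attrs["classes"].append(m.group("cls"))
        else:
            attrs["pairs"][m.group("key")] = m.group("val")
    return attrs


print(parse_attributes('{#identifier .class .other-class key="value"}'))
print(parse_attributes("{#identifier}"))
```

Supporting only the identifier form now would amount to accepting the second input and rejecting the first, which leaves room to add the rest later.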


Plow ahead with this being the first extension. It’s been two years and the distraction could prove to be a refreshing break from dealing with edge cases in the core. Plus lots of people have been asking for a way to add anchors, and manual header ID generation is relatively simple (compared to, say, tables). Automatic header ID generation could be considered later.

It would make sense to group these together. #id for IDs and .class for classes is intuitive. key="value" could be key: "value" or key: value and look a bit less “programmerish” so it’s less obvious to me what the best syntax is here.


There are lots of subtle syntax variants that most people would also accept, but which may score better on compatibility. Several of them could be supported and others forbidden. Some alternatives play better with info strings of fenced code blocks, others with current or proposed link syntax.

Meta data inside curly braces

  1. ## Heading {#ID .class}

  2. ## Heading ## {#ID .class}

  3. ## Heading {#ID .class} ##

  4. ## {#ID .class} Heading

  5. {#ID .class} ## Heading

  6. {#ID .class}
    ## Heading

  7. {#ID .class}
    ## Heading ##

  8. ## Heading
    {#ID .class}

  9. ## Heading ##
    {#ID .class}

  10. Heading {#ID .class}
    -------

  11. {#ID .class} Heading
    -------

  12. Heading
    ------- {#ID .class}

  13. Heading
    {#ID .class} -------

Meta data (only) separated by line affix

  1. ## Heading ## #ID .class

  2. #ID .class ## Heading ##

  3. #ID .class ## Heading

  4. #ID ## Heading ## .class

  5. .class ## Heading ## #ID

  6. Heading
    ------- #ID .class

Explicit IDs by reusing link (definition) syntax

  1. [ID]
    ## Heading

  2. [ID]:
    ## Heading

  3. ## Heading [][ID]

  4. ## [][ID] Heading

  5. ## [Heading][ID]

  6. ## Heading
    [][ID]

  7. [][ID]
    ## Heading

  8. ## Heading …
    [Heading]: ID

  9. ## Heading …
    [Heading]: #ID

  10. ## Heading …
    [#Heading]: ID

  11. ## Heading …
    [#Heading]: #ID

  12. ## Heading …
    [Heading]: [ID]

  13. ## [Heading] …
    [Heading]: ID

  14. ## [Heading] …
    [Heading]: #ID

  15. ## [Heading] …
    [#Heading]: ID

  16. ## [Heading] …
    [#Heading]: #ID

  17. ## [Heading] …
    [Heading]: [ID]

I’m probably forgetting some possibilities and proposals.


It would be nice to see progress with autogenerated anchors, but with security considerations in mind.

As I explained earlier, it’s not safe to generate ids/names without prefixes (when the value can become equal to window.<anything> in the browser). And it would be very inconvenient for developers if this problem is ignored in the spec.


Perhaps the spec could include a default prefix, e.g. # My Header {#id-of-header} becomes <h1 id="commonmark-id-of-header">My Header</h1>.

That’s a completely different thing. Manual direct manipulation of ids/classes/attrs is almost as unsafe as using HTML, and it should be disabled for unsafe input if you don’t wish to use sanitizers.

Here I am speaking only about autogenerated header ids; this use case is specific.
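
For autogenerated ids specifically, a prefixed scheme could look roughly like the Python sketch below. The function name `make_heading_id`, the slug rules, and the numeric de-duplication suffix are all assumptions for illustration; the `commonmark-` prefix is borrowed from the example earlier in the thread, not from any spec text.

```python
import re


def make_heading_id(text: str, used: set, prefix: str = "commonmark-") -> str:
    """Derive a prefixed, unique id from heading text (hypothetical scheme)."""
    # Drop punctuation, then collapse whitespace and underscores into hyphens.
    slug = re.sub(r"[^\w\s-]", "", text.lower()).strip()
    slug = re.sub(r"[\s_]+", "-", slug) or "section"
    candidate = prefix + slug
    n = 1
    # Append a counter until the id is unique within this document.
    while candidate in used:
        candidate = f"{prefix}{slug}-{n}"
        n += 1
    used.add(candidate)
    return candidate


seen = set()
print(make_heading_id("Location", seen))
print(make_heading_id("Location", seen))
```

With the prefix in place, a heading titled "Location" no longer produces the bare id `location`, which is the kind of collision with a `window` property described above.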

If my point about scopes is right, then this is the responsibility of the embedding scope to address.


Very similar question: Feature request: automatically generated ids for headers

That’s moving the problem from one place to another (more difficult) one instead of resolving it.

But as I explained in my above linked comment, it’s moving the problem to the right place. For example, pre-HTML5, there should only be one H1 on a page. But the Markdown spec, which I believe we all agree should be portable and not tightly coupled to HTML, doesn’t and shouldn’t concern itself with possible collisions between a level one Markdown heading and an H1 in the embedding context. It’s the responsibility of the embedding context (e.g. this discourse page) to demote the Markdown headings if it wanted to implement the “only one H1” rule. It’s actually far more complex to try and solve this problem for every possible downstream context, both those that exist and ones that haven’t been invented yet. It’s far more complex to solve it in the wrong place.
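
As a rough illustration of what handling this in the embedding context could look like, here is a Python sketch that demotes headings in an already rendered HTML fragment. `demote_headings` is a hypothetical helper, and rewriting tags with a regular expression is a simplification; a real embedder would more likely use an HTML parser.

```python
import re

# Matches the start of an opening or closing h1-h6 tag.
HEADING_TAG_RE = re.compile(r"<(/?)h([1-6])\b", re.IGNORECASE)


def demote_headings(fragment: str, offset: int = 1) -> str:
    """Shift every heading in an HTML fragment down by `offset`, capped at h6."""
    def shift(m):
        level = min(int(m.group(2)) + offset, 6)
        return f"<{m.group(1)}h{level}"
    return HEADING_TAG_RE.sub(shift, fragment)


print(demote_headings('<h1 id="a">Title</h1><h2>Sub</h2>'))
```

The Markdown source and the renderer stay unchanged here; only the page that embeds the output decides how deep its headings start.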

That’s a subjective personal opinion. From my point of view, this place is not right :). Because implementation will be much more difficult. At least, from your posts, I don’t see that you are familiar with implementations or know an easy way to sanitize inputs.

Hi,

Is this working?

I can only say that from a user’s point of view this is generating a smorgasbord of dialects when just adding a piece of text like {#get-back-here} would be sufficient.

Hi,
I see a very simple, easy-to-use implementation of anchors in the CommonMark spec:

A [whitespace character](@) is a space …

…A [non-whitespace character](@) is any character that is not a [whitespace character].

Would that not be suitable? It is very simple, clear…
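
One way to read that convention is sketched in Python below: `[term](@)` defines an anchor, and a later bare `[term]` links back to it. The names `slug` and `render_terms` are hypothetical, and deriving the id by lowercasing and hyphenating the term is an assumption, not something the spec's own tooling is known to do in exactly this way.

```python
import html
import re

DEFINITION_RE = re.compile(r"\[([^\]]+)\]\(@\)")
# A bare [term] that is not followed by "(", "[" or ":".
REFERENCE_RE = re.compile(r"\[([^\]]+)\](?![(\[:])")


def slug(term: str) -> str:
    """Assumed id scheme: lowercase, with whitespace runs turned into hyphens."""
    return re.sub(r"\s+", "-", term.strip().lower())


def render_terms(text: str) -> str:
    """Turn [term](@) into anchors and bare [term] into links to them."""
    defined = {slug(m.group(1)) for m in DEFINITION_RE.finditer(text)}

    def link(m):
        target = slug(m.group(1))
        if target not in defined:
            return m.group(0)  # leave unknown references untouched
        return f'<a href="#{target}">{html.escape(m.group(1))}</a>'

    def anchor(m):
        return f'<a id="{slug(m.group(1))}">{html.escape(m.group(1))}</a>'

    # Resolve references first, then replace the definitions themselves.
    return DEFINITION_RE.sub(anchor, REFERENCE_RE.sub(link, text))


print(render_terms("A [whitespace character](@) is a space. "
                   "See [whitespace character]."))
```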


I actually like Pal_Petho’s general idea - though it should match the style of the already-agreed on manual header IDs, and shouldn’t delay implementation of that.