Anchors in markdown

vitaly · December 25, 2016, 6:12am

It would be nice to see progress with autogenerated anchors, but with security considerations in mind.

As I explained earlier, it’s not safe to generate IDs/names without prefixes (when the value can become equal to window.<anything> in the browser). And it would be very inconvenient for developers if this problem is ignored in the spec.
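To make the concern concrete: browsers expose elements with an id as named properties of window, so an ID generated straight from heading text could coincide with a global name that a page script relies on. A minimal sketch of the prefixing idea, assuming a hypothetical helper `safe_heading_id` and an arbitrary prefix `md-` (neither comes from any spec), with a small illustrative sample of risky names:

```python
# Hypothetical prefix; any value works as long as the embedding page reserves it.
ID_PREFIX = "md-"


def safe_heading_id(slug: str, prefix: str = ID_PREFIX) -> str:
    """Return a prefixed ID so it does not equal a bare name such as 'location'."""
    if not slug:
        raise ValueError("slug must be non-empty")
    return prefix + slug


# Headings such as "Location" or "Name" would otherwise yield these bare ids.
for risky in ("location", "name", "top"):
    print(safe_heading_id(risky))
```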

chrisalley · January 30, 2017, 8:06pm

Perhaps the spec could include a default prefix, e.g. # My Header {#id-of-header} becomes <h1 id="commonmark-id-of-header">My Header</h1>.
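A rough sketch of what that could look like in a renderer, assuming a hypothetical function `render_heading` and a simplified pattern for the `{#id}` syntax (ATX headings only, no inline Markdown in the heading text):

```python
import html
import re

# Simplified pattern for an ATX heading ending in {#id},
# e.g. "# My Header {#id-of-header}". This is an assumption, not spec syntax.
HEADING_WITH_ID = re.compile(
    r"^(#{1,6})[ \t]+(.*?)[ \t]+\{#([A-Za-z][\w-]*)\}[ \t]*$"
)


def render_heading(line: str, prefix: str = "commonmark-") -> str:
    """Render a heading with an explicit ID, adding the default prefix."""
    match = HEADING_WITH_ID.match(line)
    if match is None:
        raise ValueError("not a heading with an explicit {#id}")
    hashes, text, anchor = match.groups()
    level = len(hashes)
    # Escape both the attribute value and the text before emitting HTML.
    return (
        f'<h{level} id="{html.escape(prefix + anchor)}">'
        f"{html.escape(text)}</h{level}>"
    )


print(render_heading("# My Header {#id-of-header}"))
```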

vitaly · January 30, 2017, 8:32pm

That’s a completely different thing. Manual, direct manipulation of IDs/classes/attrs is almost as unsafe as using HTML. And it should be disabled for unsafe input if you don’t wish to use sanitizers.

Here I am speaking only about autogenerated header IDs; this use case is specific.

vas · February 8, 2018, 4:52pm

If my point about scopes is right, then this is the responsibility of the embedding scope to address.

koppor · February 9, 2018, 9:26pm

Very similar question: Feature request: automatically generated ids for headers

vitaly · February 24, 2018, 3:31am

That’s moving the problem from one place to another (and more difficult) one instead of resolving it.

vas · March 2, 2018, 9:15pm

But as I explained in my above linked comment, it’s moving the problem to the right place. For example, pre-HTML5, there should only be one H1 on a page. But the Markdown spec, which I believe we all agree should be portable and not tightly coupled to HTML, doesn’t and shouldn’t concern itself with possible collisions between a level one Markdown heading and an H1 in the embedding context. It’s the responsibility of the embedding context (e.g. this discourse page) to demote the Markdown headings if it wanted to implement the “only one H1” rule. It’s actually far more complex to try and solve this problem for every possible downstream context, both those that exist and ones that haven’t been invented yet. It’s far more complex to solve it in the wrong place.
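As an illustration of handling this in the embedding context, here is a minimal sketch that demotes headings in already-rendered HTML. The function name `demote_headings` is hypothetical, and matching tags with a regular expression is a simplification; a real embedder would more likely use an HTML parser.

```python
import re

# Matches the start of an opening or closing heading tag, e.g. "<h1" or "</h2".
HEADING_TAG = re.compile(r"<(/?)h([1-6])\b")


def demote_headings(rendered_html: str, by: int = 1) -> str:
    """Shift every <hN> tag down by `by` levels, capping at h6."""

    def shift(match: re.Match) -> str:
        closing, level = match.group(1), int(match.group(2))
        return f"<{closing}h{min(level + by, 6)}"

    return HEADING_TAG.sub(shift, rendered_html)


print(demote_headings("<h1>Title</h1><h2>Part</h2>"))
```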

vitaly · March 2, 2018, 9:33pm

That’s a subjective personal opinion. From my point of view, this place is not right :). Because implementation will be much more difficult. At least, from your posts, I don’t see that you are familiar with implementations or know an easy way to sanitize inputs.

serverhorror · July 7, 2018, 1:28am

Hi,

Is this working?

I can only say that from a user’s point of view this is generating a smorgasbord of dialects when just adding a piece of text like {#get-back-here} would be sufficient.

Pal_Petho · September 2, 2018, 10:07am

Hi,
I see a very simple, easy-to-use implementation of anchors in the CommonMark spec:

A [whitespace character](@) is a space …

…A [non-whitespace character](@) is any character that is not a [whitespace character].

Will that not be suitable? It is very simple, clear…

DStaal · September 16, 2018, 2:48pm

I actually like Pal_Petho’s general idea, though it should match the style of the manual header IDs already agreed on, and shouldn’t delay implementation of that.

Crissov · September 17, 2018, 10:54am

Questionnaire

The topic is complex and there are a lot of options. I have tried to condense the principles behind them into a set of questions. This is not intended as a deciding vote but for finding out the collective opinion.

If existing CommonMark constructs are used to generate target anchors in the output format, these are known as automatic anchors or implicit anchors. A new construct or convention would be needed for manual anchors or explicit anchors.

Automatic anchors

All headings should become anchors automatically (using their textual content)
Implicit anchors (e.g. ## Heading) should automatically be available as reference link definition labels for overrides (e.g. [Heading]: {#ID})
Unused reference link definitions ([label]:) should become anchors automatically


All links should become anchors automatically (using their textual content)
Specific inline links (e.g. [text](@) or [text]()) should become anchors automatically
All reference links ([text][label]) should become anchors automatically (using their label)
Specific reference links (e.g. [text][#label] or [][label]) should become anchors automatically
Other links should not become anchors automatically


Manual anchor restrictions

Manual anchors …

may be restricted to headings
may be restricted to defining terms (e.g. <dfn> in HTML output)
may be restricted to blocks (i.e. headings, code blocks, quotations, …)
may be restricted to headings and defining terms
may be restricted to blocks and defining terms
should be available in arbitrary locations


Manual anchor positions

Manual anchors …

should always come before/above text (e.g. ## {#ID} Heading ##)
should always come after/below text (e.g. ## Heading {#ID} ##)
may come either before/above or after/below text


Manual anchors in headings …

should come between text and affix ## or underline === (e.g. ## Heading {#ID} ##)
should stay outside text and affix ## or underline === (e.g. ## Heading ## {#ID})


Manual anchors in links …

should be inside the text part (e.g. [text {#ID}](target))
should be inside the target or label part (e.g. [text](target {#ID}))
should be outside current text and target or label parts (e.g. [text](target){#ID})


Stylistic preferences

Anchor ID should always be inside curly braces {}
Anchor ID should always be prefixed by a hash sign #
Manual anchors should always be on a separate line


vitaly · September 29, 2018, 8:07pm

Reminder: Anchors in markdown

I’d be happy to have automated anchors, but it will be a big ass pain if such things appear in the spec without security considerations. Also, manual IDs are not convenient (IMHO) as the primary solution and may lead users to make security mistakes.

snan · February 16, 2021, 10:05am

“Choose at least two options” on the link anchor poll is wack when I don’t think links should become anchors.

It’s headers (as in h1, h2, h3, h4, h5 and h6) that should become anchors.

Slugified from the text, sanitized for safety, and disambiguated with numbers iff there are duplicates.
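A minimal sketch of that recipe, under some assumptions: the names `slugify` and `heading_ids`, the `h-` prefix, and the ASCII-only slug rule are all illustrative choices, not anything the spec defines. Non-Latin headings fall back to a placeholder here, which a real implementation would want to handle better.

```python
import re
import unicodedata


def slugify(text: str) -> str:
    """Lowercase, reduce to ASCII letters/digits, and join words with hyphens."""
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")
    return slug or "section"  # placeholder when nothing usable remains


def heading_ids(headings: list[str], prefix: str = "h-") -> list[str]:
    """Assign one ID per heading; a numeric suffix is added only for repeats."""
    used: set[str] = set()
    ids: list[str] = []
    for text in headings:
        base = slugify(text)
        candidate, n = base, 1
        # Keep trying suffixes until the ID is unique within this document.
        while candidate in used:
            candidate = f"{base}-{n}"
            n += 1
        used.add(candidate)
        ids.append(prefix + candidate)
    return ids


print(heading_ids(["Setup", "Usage", "Setup"]))
```

Because the slug only ever contains lowercase letters, digits, and hyphens, it is safe to place in an id attribute without further escaping.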

jackdw · February 27, 2021, 4:30am

So, are we anywhere useful on this?

trallnag · January 2, 2022, 12:47pm

Nope. Many people are just using GFM instead of CM when they really need this feature.

chrisalley · January 2, 2022, 9:59pm

From a practical point of view, I’m wondering if this is actually a real problem. GFM is a strict superset of CommonMark, so you’re using CommonMark if you’re using GFM.

The goal of CommonMark was to be a strongly defined, highly compatible specification of Markdown. Anchors were never a part of Markdown. CommonMark has been successful in formalising the features of Markdown, but hasn’t had any movement toward going beyond this and formalising extensions. GitHub has a practical need to formalise particular extensions to the CommonMark spec, so why not just use their spec?