Standardized way to fix document outline

Markdown is often used to render user-generated content as part of a website. In this context, it is important that the rendered headings fit into the document outline of the surrounding page, especially for accessibility.

Unfortunately, the HTML document outline algorithm has not been widely implemented, so just wrapping the document in an <article> tag is not enough. Instead, we have to shift the level of headings. (See, for example, how a first-level heading is rendered as an <h3> on GitHub.)

I think this is an important feature that most (all?) renderers should implement. But there do not seem to be many implementations (I could only find a Python-Markdown extension).

I understand that this would probably not be a feature of the language. But still it would be nice to get this reliably working across implementations.

Update: pandoc has the --base-header-level=NUMBER option.

Update: GitHub no longer shifts headings for READMEs.

I don’t think this is necessary within markdown at all.

I think it makes sense as a simple post-processing step, as that’s what you’ve described.

Particularly: you’re going to need to define what happens with <h6> tags if you have to shift h1-h6 down 1 level. There are a variety of options available, and I don’t think markdown should have to define a single solution to that problem.

It’s not particularly hard to run the HTML output from markdown through a parser to perform secondary transformation. While you’re manipulating headings you can also automatically add IDs that don’t conflict with the existing IDs on the page. You can also add multiple versions of linked images via srcset.
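As a rough sketch of that kind of post-processing, here is one way to shift heading levels in already-rendered HTML. The function name `shift_headings` and its parameters are made up for illustration, and clamping overflowing headings to <h6> is only one of the possible policies. It uses a regular expression for brevity; a real HTML parser would be more robust, for example if heading-like text appears inside comments or scripts.

```python
import re

# Matches the start of an opening or closing heading tag, e.g. "<h2" or "</h2".
# The \b keeps it from matching unrelated tags; <hr> and <header> do not match
# because a digit from 1 to 6 is required right after the "h".
HEADING_TAG = re.compile(r"<(/?)h([1-6])\b", re.IGNORECASE)


def shift_headings(html: str, offset: int, max_level: int = 6) -> str:
    """Shift every heading by `offset` levels, clamping to the range 1..max_level.

    Hypothetical helper: clamping is an assumption, not a standard. Other
    policies (e.g. turning overflowing headings into styled paragraphs) are
    equally valid.
    """
    def replace(match: re.Match) -> str:
        slash = match.group(1)  # "/" for a closing tag, "" for an opening tag
        level = int(match.group(2)) + offset
        level = max(1, min(max_level, level))
        return f"<{slash}h{level}"

    return HEADING_TAG.sub(replace, html)


if __name__ == "__main__":
    rendered = '<h1 id="intro">Intro</h1><p>Text</p><h6>Deep</h6>'
    print(shift_headings(rendered, 2))
```

Attributes on the heading tags are left untouched, since only the tag name is rewritten.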

There are a lot of enhancements to be made after markdown is processed that don’t require any sort of interaction with the raw markdown, and, in my mind, don’t belong in markdown at all.

@xim isn’t saying it belongs in the CommonMark spec. They are saying that it belongs somewhere, perhaps as a recommendation for parser implementers. Perhaps it makes sense for CommonMark.org to host a Best Practices for Renderer Implementations document.

Some things, such as automatically generated IDs, might belong in the spec, but if so they should be accompanied by a note about the implications for renderers. These could all be collected in the best practices doc. @jgm and @codinghorror: something to consider?

Of course the reference implementations should follow those best practices.
