What is the point of limiting URI schemes in autolinks?

Burt_Harris · September 12, 2014, 10:53pm

There may be some confusion here: I’m discussing the CommonMark input specification, not the target language (e.g. HTML 5 vs XHTML). My question is really independent of that.

Specifically, I asked about the 1.0 autolink syntax, which sort of looks like HTML due to the use of angle brackets. But autolinks are not HTML (or XML).

The confusion arises because the HTML block and raw HTML features also use angle bracket notation. But these features, which do use HTML as input, are distinct, and in the current spec they trigger on tag name definitions which exclude colons.

Note: Within a CommonMark HTML block you can already use notation like <svg:svg> to your heart’s content. You just can’t start the block with <svg:svg>. Instead, just wrap it in one of the supported HTML block tags like <div> (or start it with an HTML comment), eliminate any blank lines, and you are set to go!

But that’s not my point.

The point is that at the beginning of, or inside, the text of a paragraph, the input <svg:svg> wouldn’t be recognized as any of the above under the 1.0 specification. The reference implementation would translate it to HTML &lt;svg:svg&gt;. But unless I’ve missed something, that’s being done based on an unwritten rule.

I’m not unhappy that CommonMark 5<7 renders as HTML 5&lt;7 – that’s a good thing that deserves an explicit rule in the spec.
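To make the three-way decision concrete, here is a rough Python sketch of how I understand the behavior. It is not the reference implementation: the names `KNOWN_SCHEMES`, `classify`, and `render` are made up for illustration, the scheme set is a tiny stand-in for the spec's much longer list, and both regular expressions are simplified approximations of the spec's rules.

```python
import html
import re

# Hypothetical stand-in for the spec's scheme list (the real list is far longer).
KNOWN_SCHEMES = {"http", "https", "ftp", "mailto"}

# Simplified tag-name rule: a letter followed by letters or digits, so no colon.
OPEN_TAG = re.compile(r"<[A-Za-z][A-Za-z0-9]*\s*/?>")

# Simplified autolink shape: <scheme:rest> with no whitespace or angle brackets inside.
AUTOLINK = re.compile(r"<([A-Za-z][A-Za-z0-9+.-]*):([^\s<>]*)>")


def classify(span: str) -> str:
    """Return 'autolink', 'raw_html', or 'text' for one inline span."""
    m = AUTOLINK.fullmatch(span)
    if m and m.group(1).lower() in KNOWN_SCHEMES:
        return "autolink"
    if OPEN_TAG.fullmatch(span):
        return "raw_html"
    # The "unwritten rule": anything else is literal text.
    return "text"


def render(span: str) -> str:
    """Render one inline span to HTML according to its classification."""
    kind = classify(span)
    if kind == "autolink":
        url = html.escape(span[1:-1])
        return f'<a href="{url}">{url}</a>'
    if kind == "raw_html":
        return span  # passed through untouched
    return html.escape(span)  # literal text gets escaped


print(render("<svg:svg>"))
print(render("5<7"))
print(render("<http://example.com>"))
```

In this sketch <svg:svg> falls through to the last branch only because svg is missing from the scheme set and the colon rules it out as a tag name, which is exactly the implied boundary I'm asking about.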

It’s the boundary conditions between the unwritten rule and the autolink syntax in the 1.0 spec that concern me. The current boundary is implied by the list of schemes, which adds complexity to the spec and hurts some extension scenarios.