I think that, in practice, the current whitelist is not of much use, because it allows the unsafe javascript/vbscript schemes. Removing the checks will not make things worse, but will make things simpler and more flexible.
As you can see from the thread above, the intent of the whitelist was not to help with security, but to allow tags with XML qnames, like <math:mrow>. So that is the primary issue.
Hm… I can understand XML output somehow (for advanced structure validations), but input… isn't HTML5 enough? I've seen only 2 related mentions: math & epub. Math does not need a qname in HTML5. I have no experience with epub. It looks like epub3 is OK with HTML5, and will require additional cryptic converters anyway.
Taking into account the "no SPACE in auto-link" requirement, that is: if this requirement still stands, and thus the example 552 input is not treated as an auto-link:
Spaces are not allowed in autolinks:
Example 552
<http://foo.bar/baz·bim>
Then it seems to me that there is no need for a hard-coded list of URI schemes to distinguish URIs from XML tags starting with a QName [1]: you can always add a SPACE in any (start, end, or "empty-element") tag that contains a GI, and thus prevent interpretation of, say, <m:math> as an auto-link by writing <m:math > instead. I think inserting this SPACE into "empty-element" tags, like this: <m:pi />, is or was even recommended (to help user agents cope with XML).
______
[1] Actually, element type and attribute names containing COLON were already allowed in W3C HTML 4, but were just not of much use there; so I would say that CommonMark should be able to handle "raw HTML" (and particularly XML markup) using such names in a more general way than just allowing one letter in front of the (first) ":".
Hmm, well, per the title of the topic itself, the most relevant bit of this topic is only a "should" 1.0 issue, not a "must" 1.0 blocking issue:
Remove hard-coded list of protocols for autolinks? (SHOULD)
And I think the answer is, yes, we should remove the hard-coded list of protocols. See the last post by @jgm up above.
Well, yes. So? Is anything but blocking issues off-topic now? [I'd reckon that this topic is equally pertaining to the blocking issue "Inconsistent handling of spaces in links? (MUST)" anyway.]
[…] I think the answer is, yes, we should remove the hard-coded list of protocols.
seems pretty harsh and, as I tried to explain, unneeded, and would plausibly not suffice for applications other people have in mind. Thus I'd rather not see such a restriction introduced, notwithstanding whether it's meant to be a resolution of a SHOULD issue or a MUST issue ….
@tin-pot, thanks for the comment, that's useful.
I think we should make the change I suggested above, or something like it, for 1.0.
It's not a "must," but I don't see a strong argument against making it, and it's not a difficult change.
I've made the change, updated the spec and both reference implementations. Please have a look.
Of course, we now allow more things that aren't valid absolute URIs, e.g. what-the-heck:is!this. I had to switch a test with <localhost:5001>, which previously didn't count as an auto-link but now does (with localhost interpreted as the scheme!). But I think that's okay. We were making no real attempt to weed out invalid URIs before (except for invalid schemes).
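To make the effect of dropping the scheme list concrete, here is a minimal Python sketch. It is not the code from either reference implementation. It assumes the relaxed rule is roughly "a scheme is an ASCII letter followed by 1 to 31 letters, digits, +, . or -, then a colon, then anything except control characters, SPACE, < and >"; the exact limits and the name `autolink_target` are my assumptions for illustration.

```python
import re

# Assumed rule (not taken from this thread): scheme = ASCII letter plus
# 1-31 letters, digits, "+", "." or "-"; the remainder may contain anything
# except ASCII control characters, SPACE, "<" and ">".
AUTOLINK_RE = re.compile(
    r"<([A-Za-z][A-Za-z0-9+.\-]{1,31}:[^\x00-\x20\x7f<>]*)>"
)


def autolink_target(text):
    """Return the URI if `text` is exactly one autolink, else None.

    Hypothetical helper, for illustration only.
    """
    match = AUTOLINK_RE.fullmatch(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    samples = [
        "<what-the-heck:is!this>",   # any scheme is accepted now
        "<localhost:5001>",          # "localhost" is read as the scheme
        "<http://foo.bar/baz bim>",  # a SPACE rules out an autolink
        "<m:math>",                  # one-letter prefix is too short for a scheme
        "<math:mrow>",               # longer prefix looks like a scheme
        "<math:mrow >",              # the added SPACE prevents that
    ]
    for sample in samples:
        print(repr(sample), "->", autolink_target(sample))
```

Under these assumptions a QName with a prefix of two or more letters does get read as an autolink, and the added SPACE is what keeps it from being one.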
1. This is [hopp](localhost:8080)
2. This is [hopp](localhost:80 80)
3. This is <foobar:dingbat class="invalid">.
4. This is <f:dingbat class="invalid">
5. This is <dingbat class="invalid">
for (1.) a link, for (2.) not (the SPACE!); no markup for (3.) or (4.); but (5.) gets passed through as markup (as html_inline in the AST).
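The outcomes for (3.) to (5.) can be reproduced with a rough sketch like the one below. It is a deliberately simplified model, not the parser from the reference implementations: the autolink pattern assumes the relaxed scheme rule (letter plus 1 to 31 scheme characters), the open-tag pattern only handles double-quoted attribute values and assumes that tag names cannot contain a colon, and `classify` is a hypothetical name.

```python
import re

# Assumed relaxed autolink rule: scheme, colon, then no SPACE, "<" or ">".
AUTOLINK_RE = re.compile(r"<[A-Za-z][A-Za-z0-9+.\-]{1,31}:[^\x00-\x20\x7f<>]*>")

# Simplified open tag: the tag name is letters, digits and hyphens (no colon),
# followed by attributes whose values, if present, are double-quoted.
OPEN_TAG_RE = re.compile(
    r'<[A-Za-z][A-Za-z0-9\-]*'
    r'(?:\s+[A-Za-z_:][A-Za-z0-9_.:\-]*(?:\s*=\s*"[^"]*")?)*'
    r'\s*/?>'
)


def classify(span):
    """Classify one angle-bracket span as autolink, html_inline or text."""
    if AUTOLINK_RE.fullmatch(span):
        return "autolink"
    if OPEN_TAG_RE.fullmatch(span):
        return "html_inline"
    return "text"


if __name__ == "__main__":
    for span in (
        '<foobar:dingbat class="invalid">',  # example 3
        '<f:dingbat class="invalid">',       # example 4
        '<dingbat class="invalid">',         # example 5
    ):
        print(span, "->", classify(span))
```

In this model, (3.) and (4.) fall through to plain text because the SPACE rules out an autolink and the colon rules out a tag name, while (5.) matches the open-tag pattern.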