Imagine for a moment being able to type a shortcut notation like <WP:TL;DR>
to generate an external hyperlink that helped readers understand your concise intent. Better yet, try mousing over the link above.
Compact URIs
About 5 years back, the W3C standardized a textual datatype called the Compact URI (a.k.a. CURIE). CURIEs are similar to QNames, but resolve some of their shortcomings. CURIEs are the sort of building-block datatype that would have been a great foundation for XML, YAML, etc., but they came a bit too late for those standards. They seem worth considering for CommonMark.
Leveraging CURIEs in CommonMark could be a great way to help standardize a way of being non-standard. To do so, we need a few things:
- A set of locations where text would be interpreted as CURIEs
- A syntax for declaring CURIE prefixes
- Some conventions on dealing with unrecognized prefixes
Let's illustrate the first two of these with an example. There are potentially many locations in CommonMark that could benefit from CURIE syntax, but let's start with the simple case of a hyperlink:
<?prefix wp: <http://en.wikipedia.org/wiki/>?>
[JavaScript]: wp:JavaScript
[prototype-based]: wp:prototype-based
**[JavaScript]** is a [prototype-based] programming language.
Which (under this proposal) should produce HTML output of:
<p><strong><a href="http://en.wikipedia.org/wiki/JavaScript">JavaScript</a></strong> is a <a href="http://en.wikipedia.org/wiki/prototype-based">prototype-based</a> programming language.</p>
To be clear, CURIE processing should only occur in contexts where a CURIE is expected; a naked instance of something that looks like a CURIE in plain text would not invoke the auto-linking mechanism. So a counterexample is in order:
<?prefix wp: <http://en.wikipedia.org/wiki/>?>
The text wp:foo is not treated as a CURIE.
Should produce HTML output of:
<p>The text wp:foo is not treated as a CURIE.</p>
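The two examples above can be sketched as a small pre-processing pass. This is only an illustration under stated assumptions: the function name `expand_reference_definitions` is made up for this post, the prefix table is handed in directly rather than parsed, and link reference definitions are the only context treated as "CURIE expected".

```python
import re

# Matches a link reference definition such as "[JavaScript]: wp:JavaScript".
REF_DEF = re.compile(r"^\[([^\]]+)\]:\s*(\S+)\s*$")

def expand_reference_definitions(lines, prefixes):
    """Expand CURIEs only in the targets of link reference definitions."""
    out = []
    for line in lines:
        m = REF_DEF.match(line)
        if m:
            label, target = m.groups()
            prefix, sep, reference = target.partition(":")
            if sep and prefix in prefixes:
                # Declared prefix: concatenate the URI prefix and the reference.
                target = prefixes[prefix] + reference
            out.append(f"[{label}]: {target}")
        else:
            # Ordinary text is left alone, even if it contains "wp:foo".
            out.append(line)
    return out

prefixes = {"wp": "http://en.wikipedia.org/wiki/"}
source = [
    "[JavaScript]: wp:JavaScript",
    "The text wp:foo is not treated as a CURIE.",
]
result = expand_reference_definitions(source, prefixes)
```

The rewritten lines could then be handed to an unmodified CommonMark processor, which is one way the post-processor style of implementation might work.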
But when angle brackets set in…
Angle brackets are frequently an indication of hyperlink intent. But we don't want to misinterpret angle brackets needlessly, so we need some clearer indication of the writer's intent before interpreting something that might be an extension as an extension.
A processor can distinguish between a URI and a CURIE only if the prefix has been declared. Pre-assignment of some prefixes (like wp) might be part of some flavors of CommonMark, but we also need a way for authors to declare prefixes unanticipated by a flavor implementer.
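To make the disambiguation rule concrete, here is a rough sketch. The names `FLAVOR_PREFIXES` and `resolve_target` are hypothetical, and letting author declarations override flavor defaults is my own assumption rather than something this proposal has settled.

```python
# Hypothetical prefixes pre-assigned by a flavor; "wp" is the example from this post.
FLAVOR_PREFIXES = {"wp": "http://en.wikipedia.org/wiki/"}

def resolve_target(target, declared=None):
    """Return (kind, uri), where kind is 'curie' or 'uri'."""
    # Assumption: author declarations take precedence over flavor defaults.
    prefixes = {**FLAVOR_PREFIXES, **(declared or {})}
    prefix, sep, reference = target.partition(":")
    if sep and prefix in prefixes:
        return "curie", prefixes[prefix] + reference
    # Undeclared prefix (or no colon at all): treat the text as an ordinary URI.
    return "uri", target

from_flavor = resolve_target("wp:JavaScript")
plain_uri = resolve_target("mailto:someone@example.com")
from_author = resolve_target("x:button", {"x": "http://www.sample.com/formspackage/"})
```

Because `mailto` has not been declared as a prefix, the second call leaves the text untouched and reports it as a plain URI.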
Prefix Processing Instruction
The <?prefix... processing directive would be passed through to the output verbatim, but would be ignored by a web browser. Such an extension could be implemented in a post-processor, or a more integrated extension could address it during CommonMark processing. A CURIE prefix follows the XML definition of an NCName, and is followed by the URI prefix definition. As shown in this example, it's generally useful to define the prefix with any trailing delimiter included.
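A post-processor might pick the declarations out with a regular expression along these lines. Treat it as a sketch: the pattern only approximates NCName with ASCII characters (the real XML production allows many more), and `parse_prefix_declarations` is a name invented for this example.

```python
import re

# ASCII-only approximation of an XML NCName; the real production also
# allows many non-ASCII characters. An NCName never contains a colon.
NCNAME = r"[A-Za-z_][A-Za-z0-9._-]*"

# <?prefix NAME: <URI-PREFIX>?>
PREFIX_PI = re.compile(r"<\?prefix\s+(" + NCNAME + r"):\s*<([^<>\s]+)>\s*\?>")

def parse_prefix_declarations(text):
    """Collect {prefix: URI prefix} from every <?prefix ...?> instruction."""
    return {m.group(1): m.group(2) for m in PREFIX_PI.finditer(text)}

doc = "<?prefix wp: <http://en.wikipedia.org/wiki/>?>\n[JavaScript]: wp:JavaScript\n"
declared = parse_prefix_declarations(doc)

# With the trailing "/" kept in the declaration, expansion is plain concatenation.
expanded = declared["wp"] + "JavaScript"
```

Keeping the trailing delimiter in the declaration means the processor never has to guess whether a "/" or "#" belongs between the two halves.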
CURIEs for CommonMark extensions
The same Compact URI prefixes that reduce the amount of text needed to specify the URL target of a hyperlink could be leveraged in a framework for CommonMark extensions. In an integrated extension, however, the CURIE would only be treated as an identifier, and generally not de-referenced by the CommonMark processor, only (potentially) referenced by it. For example, let's say that the @ character is treated as an extension mechanism for link syntax; then something like this might make sense:
<?prefix x: <http://www.sample.com/formspackage/>?>
@x:button[OK]: submit.aspx
@x:button[Cancel]: home.htm
Are you sure:<br> [OK] [Cancel]
Now, if the prefix and CURIE were recognized by an integrated CommonMark processor extension, then it could output something like:
<p>Are you sure:<br> <button href='submit.aspx'>OK</button> <button href='home.htm'>Cancel</button></p>
If the URI were not a recognized extension, it might instead make sense to generate links rather than buttons. If use of an extension were semantically "necessary" to the document content, then perhaps a different prefix (e.g. !) might be appropriate. The best output from the processor for an unrecognized ! extension might be a warning message.
It's important to emphasize that the CommonMark processor should never need to de-reference an extension URI generated from a CURIE prefixed with @ or !; it would instead simply match it against the identifiers specified by available extensions. But the use of a resolvable URL for extensions might be a convenient way to find documentation on the extension.
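The lookup-without-dereferencing idea, together with the two fallback behaviours, might look roughly like this. Everything here is an assumption made for illustration: the line pattern, the `EXTENSIONS` registry, the function names, and the exact wording of the warning. Note that the extension URI is only ever compared as a string; nothing is fetched.

```python
import re
from html import escape

# Hypothetical extension line: sigil, CURIE, [label], then a target,
# e.g. "@x:button[OK]: submit.aspx".
EXT_DEF = re.compile(r"^([@!])([A-Za-z_][\w.-]*):([^\[\s]+)\[([^\]]+)\]:\s*(\S+)$")

def render_button(label, target):
    return f"<button href='{escape(target)}'>{escape(label)}</button>"

# Registry keyed by the expanded extension URI (used purely as an identifier).
EXTENSIONS = {"http://www.sample.com/formspackage/button": render_button}

def render_extension(line, prefixes, extensions=EXTENSIONS):
    m = EXT_DEF.match(line)
    if m is None:
        return None  # not an extension definition
    sigil, prefix, reference, label, target = m.groups()
    uri = prefixes[prefix] + reference if prefix in prefixes else None
    handler = extensions.get(uri) if uri is not None else None
    if handler is not None:
        return handler(label, target)
    if sigil == "@":
        # Optional extension: degrade to an ordinary link.
        return f"<a href='{escape(target)}'>{escape(label)}</a>"
    # Required extension ("!"): report a warning instead of guessing.
    return f"warning: unrecognized extension {prefix}:{reference}"

prefixes = {"x": "http://www.sample.com/formspackage/"}
known = render_extension("@x:button[OK]: submit.aspx", prefixes)
optional = render_extension("@x:slider[OK]: submit.aspx", prefixes)
required = render_extension("!x:slider[OK]: submit.aspx", prefixes)
```

Here `known` comes out as a button, `optional` falls back to a plain link, and `required` yields only a warning message.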