Optional syntax

Related to the discussions here: Generic directives/plugins syntax, here: Guide for syntax extensions and here: Add “plugin” syntax to the spec, but from a different perspective, I propose that the CommonMark specification include optional elements.

I don’t mean optional elements with special namespaces. I mean optional elements that are completely generic in terms of syntax, like the rest of the spec, but optional. At least the following would have to be considered:

  • What would be the process for getting optional elements accepted into the spec?
  • How would the spec convey that an element is optional?
  • How could this be done in a way that doesn’t adversely affect the deterministic nature of implementations of the spec?

For a made up example consider two imaginary optional elements that both render text small:

  1. surround text with $$: $$small text$$
  2. surround text with $$$: $$$small text$$$

If a markdown rendering engine implemented both of these optional elements, would $$$small text$$$ render as something like small text (in small type) or as $small text$? It depends on which element is considered to take priority. The spec should handle this cleanly, for example by assigning each optional syntax a sequence number that fixes the order of processing.
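As a rough illustration of the sequence idea, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `OptionalRule` class, the `sequence` field, the rule names, and the regex-based matching are hypothetical and not part of any CommonMark spec or implementation.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class OptionalRule:
    name: str        # hypothetical identifier for the optional element
    sequence: int    # hypothetical spec field: lower values are processed first
    pattern: re.Pattern
    replacement: str


def small_rule(name: str, dollars: int, sequence: int) -> OptionalRule:
    """Build a rule that renders text wrapped in `dollars` dollar signs as <small>."""
    delim = r"\$" * dollars
    return OptionalRule(name, sequence, re.compile(delim + r"(.+?)" + delim),
                        r"<small>\1</small>")


def render(text: str, rules: list[OptionalRule]) -> str:
    """Apply every optional rule once, in ascending sequence order."""
    for rule in sorted(rules, key=lambda r: r.sequence):
        text = rule.pattern.sub(rule.replacement, text)
    return text


# Triple-dollar rule first: the whole delimiter is consumed.
triple_first = [small_rule("small-3", 3, sequence=10),
                small_rule("small-2", 2, sequence=20)]
# Double-dollar rule first: a stray dollar sign survives at each end.
double_first = [small_rule("small-3", 3, sequence=20),
                small_rule("small-2", 2, sequence=10)]

print(render("$$$small text$$$", triple_first))  # <small>small text</small>
print(render("$$$small text$$$", double_first))  # <small>$small text</small>$
```

The same input gives two different results depending only on the sequence numbers, which is why the ordering would need to be fixed by the spec rather than left to each implementation.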

Namespacing and plugins can then be used to handle esoteric or ‘uncommon’ extensions and ‘optional syntax’ can cover the things like table and footnotes syntax where competing implementations exist.

This opens the door to including syntax that is hard to parse or maintain (like some proposed table syntax) as supporting the syntax is then optional for each implementation.

I think that this is more important than people might think. It would allow the standardization of common features such as tables and anchor links, but leave the inclusion of those things up to the developer when they build their implementation. The key thing here (IMHO) is the standardization concept – that way common features will all behave predictably, regardless of which implementation is being used or whether the features are considered “core” or not.

Without something like this, we risk the fragmentation of how other features work. The CommonMark spec should at least standardize things to prevent 100 different table syntaxes, even if the CommonMark implementation doesn’t include tables itself. Because, believe me, if a spec isn’t laid down for something popular like tables, there is going to be fragmentation. That runs counter to CommonMark’s goal.

Could we be running into a small (and, I expect, temporary) nomenclature cul de sac?

  • Is “The Spec” always going to be “The Spec”, that is, limited to the “canonical” Markdown markup?
  • Will all non-canonical elements, then, be in the domain of “extension”?
  • So will there be two “specs”: CommonMark Core Spec + CommonMark Extended Spec? or just one “spec”, with a set of optional elements (is that what you’re envisaging, @Jack_Douglas?)?

Either way, the intention is to avoid proliferation of competing popular extensions (tables and footnotes being the most prominent) which is against the whole spirit of CommonMark, of course. So another way of thinking of my conundrum (and perhaps it’s only me!) is whether this thread is best categorized with “spec” or “extension”?

It would be good to have some clarity on this, even while the attention of the core team is on the core spec, and justifiably so.

So I see three categories emerging:

  • Core: the scope of what currently is in the spec, which all CommonMark implementations need to support.
  • Recommended/endorsed/standardized extensions: syntax that the CommonMark project (either through a section in the spec or through a separate document) endorses and says: “if you are going to implement this feature (e.g. footnotes, definition lists, etc.), then you SHOULD use this syntax here.” (Or if each extension is published in a separate document, it could even say: “you MUST use this syntax.”) Of course we must make sure that endorsed extensions do not conflict with one another in any way.
  • Custom extensions: all extensions that haven’t been officially endorsed by the CommonMark project.

Since every string is valid markdown, you can always construct cases where an implementation that supports an extension outputs different HTML than an implementation that doesn’t. That’s why the core spec doesn’t say that all implementations must behave the same way on all input strings, but rather that there are extensive test cases for all features in core and that all implementations must behave the same on those. Endorsed extensions would then just add a few test cases, and implementations that claim to support an extension would need to pass those. That way extensions can be cherry-picked by implementors while still remaining compliant with the spec.
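To make the cherry-picking idea concrete, here is a minimal Python sketch of a conformance check. It is only an illustration under stated assumptions: the example pairs are placeholders (not taken from the real spec), the "small" extension is the imaginary one from earlier in the thread, and all function names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]          # (markdown source, expected HTML)
Renderer = Callable[[str], str]

# Placeholder examples, NOT taken from the real spec.
CORE_EXAMPLES: List[Example] = [("hello", "<p>hello</p>")]
EXTENSION_EXAMPLES: Dict[str, List[Example]] = {
    "small": [("$$hello$$", "<p><small>hello</small></p>")],
}


def failures(render: Renderer, examples: List[Example]) -> List[Example]:
    """Return the examples whose rendered output differs from the expected HTML."""
    return [(src, want) for src, want in examples if render(src) != want]


def conformance_report(render: Renderer, claimed: List[str]) -> Dict[str, List[Example]]:
    """Check core examples plus only the extensions the implementation claims."""
    report = {"core": failures(render, CORE_EXAMPLES)}
    for name in claimed:
        report[name] = failures(render, EXTENSION_EXAMPLES[name])
    return report


def is_compliant(report: Dict[str, List[Example]]) -> bool:
    """Compliant means no failing example in any checked group."""
    return not any(report.values())


def core_only(src: str) -> str:
    """Toy renderer that supports no extensions."""
    return f"<p>{src}</p>"


def with_small(src: str) -> str:
    """Toy renderer that also supports the imaginary 'small' extension."""
    if len(src) > 4 and src.startswith("$$") and src.endswith("$$"):
        src = f"<small>{src[2:-2]}</small>"
    return f"<p>{src}</p>"
```

Here `core_only` is compliant as long as it claims no extensions, `with_small` is compliant when it claims `"small"`, and `core_only` fails only if it claims an extension it does not implement.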

It would be nice if some of the authors (@jgm, @codinghorror) could quickly chime in and say which direction they think is most appropriate for “endorsed extensions”. That is: add section to current spec or publish in separate document(s)? work on this already now or wait till core has reached 1.0?

One way to structure different editions of CommonMark is this:

Specification Levels:

CommonMark Barebone, Core, and Extended each come as both a specification and an implementation. Only CommonMark Draft comes as a specification sheet only (as a wiki?). Anything written to a lower-numbered specification can be rendered by a higher-numbered one.

  1. CommonMark Barebone := The most minimalistic specification. Designed to be most computationally efficient. Uses only the most commonly used markups. E.g. Here is a tiny implementation of markdown: micromarkdown

  2. CommonMark Core := Is considered the base standard. This will cover 80% of all use cases. If an extension is used in too many CommonMark Extended package editions, then it should probably be in core.

  3. CommonMark Extended := In theory, the most complete officially implemented specification. It is a set of packages, each aimed at a different audience, with a mix of extensions relevant to a field. Examples shown below:

  • Scholar :~ Supports extensions often requested in academia like Figure tags
  • Film Script :~ Stuff useful for those storyboarding or film scripting
  • Coder :~ Stuff useful for programmers documentation
  • Etc… you get the point.
  4. CommonMark Draft := This spec is like the ‘w3 draft specification’ sheets. It is not implemented anywhere, but is there to be revised over and over again by the community (via a wiki?) until it is ready to be implemented in CommonMark Extended. (But at the very least, it will let other implementers know ahead of time how to write their semi-fork, e.g. the moz- prefix for HTML/CSS specification drafts in Firefox.)
  • Perhaps every month, the wiki is generated as a PDF and stamped with a version number, so that people can say they are compliant with CommonMark Draft V2.XX etc… for implementations that are not officially supported.

Fail Gracefully

A key philosophy of the HTML specification is to fail gracefully. So for CommonMark, if you render using a lower-level implementation, the result should still come out as legible writing. This comes through writing higher-level syntaxes carefully, so that they fall back to other forms of rendering at lower levels that are still acceptable in both appearance and content.

e.g. !youtube[Cat Video](https://www.youtube.com/watch?v=dQw4w9WgXcQ) will degrade gracefully to "!youtube" followed by the link "Cat Video", because the older parser will still parse the []() part as an ordinary link.
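A small Python sketch of that fallback behaviour, under loud assumptions: the two regexes are a drastic simplification of real link parsing, the `!name[label](url)` directive form is only the proposal from this post, and the `<div class="video">` output is a made-up placeholder rather than real embed markup. No HTML escaping is done.

```python
import re

# Simplified inline link: [label](url). Real CommonMark link parsing is far richer.
INLINE_LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]*)\)")
# Hypothetical directive form from the post: !name[label](url)
DIRECTIVE = re.compile(r"!(\w+)\[([^\]]*)\]\(([^)\s]*)\)")


def render_basic(text: str) -> str:
    """'Older' renderer: only knows plain inline links."""
    return INLINE_LINK.sub(r'<a href="\2">\1</a>', text)


def render_extended(text: str) -> str:
    """'Newer' renderer: handles the youtube directive, then plain links."""
    def embed(match: re.Match) -> str:
        name, label, url = match.groups()
        if name == "youtube":
            # Placeholder markup; a real embed would look different.
            return f'<div class="video" data-src="{url}">{label}</div>'
        return match.group(0)  # unknown directive: leave it for the link rule
    return render_basic(DIRECTIVE.sub(embed, text))


src = "!youtube[Cat Video](https://www.youtube.com/watch?v=dQw4w9WgXcQ)"
print(render_basic(src))
# !youtube<a href="https://www.youtube.com/watch?v=dQw4w9WgXcQ">Cat Video</a>
print(render_extended(src))
# <div class="video" data-src="https://www.youtube.com/watch?v=dQw4w9WgXcQ">Cat Video</div>
```

The older renderer still produces a usable link with a stray "!youtube" prefix, and the newer renderer treats directives it does not recognise the same way.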

I’m envisaging just one spec, with a set of optional elements, one of which will be a generic syntax for extensions/plugins. There is a set of features of which tables and footnotes are examples, which are already commonly implemented in various ways and there is an advantage in having these part of the ‘spec’ but not mandatory.

In my view, the first ‘complete’ version of the spec should include at least two ‘optional’ elements so we are forced to think and work through any practical issues pertaining to including optional elements in the spec early in the process.

Ideally I’d like to see three or four, such as:

  1. Syntax for extensions
  2. Syntax for pipe delimited tables
  3. Syntax for footnotes
  4. Syntax for definition lists

No implementation would be forced to handle any, but those that choose to do so (like GitHub Flavored Markdown for example) would be able to benefit from the consistency provided by the spec here.

We can then sidestep some of the questions here about whether tables (for example) should be included in the ‘core’ spec, without encumbering them with namespacing or leaving them without a clear set of rules for implementation.

I do want to be clear that I agree that attention should be on the core spec: what I’m trying to promote is that the process/mechanism for including optional elements be part of the core spec. I’m not primarily trying to persuade anyone which optional elements in particular should be included.

What do you think about having specification levels like in my previous post in this thread? Wouldn’t that provide the process for the eventual inclusion of optional elements in the core spec once they are found to be of use in a majority of ‘extension packages’?

You make some good and helpful points about the process or evolution of syntax into the ‘core’ spec that I haven’t addressed. I think we are looking at things slightly differently though - I’m imagining that most optional elements would be forever optional and implementations would always be allowed to be ‘CommonMark Compliant’ without including them.

Well, that can be part of a CommonMark Draft wiki of community-editable specifications, which means you could just call the extension CommonMark Draft compliant. If you need consistency, then you could implement a ‘version system’ for the draft specification wiki (e.g. one that automatically compiles a PDF spec document every month with a version number).

The issue with making elements CommonMark Compliant instead of CommonMark Draft Compliant is that people will end up expecting it to work in many other places, and probably will just end up getting frustrated. The draft speccing wiki could contain a list under each draft extension, on what websites or software is currently using that particular draft extension.

Hope that sounds more reasonable. If you like it, then maybe I should post the spec levels idea as a new thread and see what the general response is.

Edit: Posted a separate thread: Multiple levels of CommonMark specification?


In practice: This engine is compliant with 'CommonMark Core' and includes the figure extension from 'CommonMark Draft V2.XX'.