Generic directives/plugins syntax

The main idea is to make it easy to extend CommonMark.

And an include directive is more than welcome as part of core CommonMark.

Agreed.

A directive only specifies intent, not implementation. So it’s fine to use it for !include. Plus, if it’s specific to a site, we want them to stay away from modifying core syntax and just use directives where possible. That makes non-standard directives easier to ignore and lets them fail gracefully.

```latex
[some latex stuffs]
```

This should be restricted to highlighting code only. A directive may not necessarily display its content visually on unsupported implementations. Plus, the ! in front of !name indicates its special status as a directive. Hence:

!!!name
... content ...
!!!

Still wondering whether this is not too bad an alternative approach to block representation. What I think this can allow for is selecting how it fails: e.g. let [] be displayed visually if the extension is not found, while () is hidden from the user’s view.

!name
[[[
    ... content ...
]]]

RE the block syntax:

I disagree on this point. Several concerns:

  1. I don’t think there should be any element that, by default, acts as a comment and yet isn’t an explicit comment syntax.
  2. I view ```thing as saying “hand this block off to a parser for ‘thing’”. Extensions would operate no differently than code coloring and the only change that would need to be made is to provide the initiating line to the extension so it can parse it if it desires to.
  3. Any syntax needs to be human-readable and easy for writers/authors/creators. Once the syntax starts including all manner of special control characters, it is more complex than just using HTML. Even the standard image and link syntax is very hard to remember; IMO the image syntax should be the upper bound on syntax complexity in CommonMark.
  4. I’m concerned that in some cases this is outside the scope of what Markdown/CommonMark should be.

RE the inline syntax:

The various use cases you provided in the Generic Directive Extension List (commonmark/commonmark-spec wiki on GitHub) are all very different. IMO that means there should be separate discussions around those use cases and whether it makes sense to provide a syntax for them.

For example, I like extending ![]() to mean “embed this resource” not “create an image tag”. It is in keeping with its semantic use and the spirit of MD. But that use case is completely different (both semantically and to a user) from providing behavioral interaction à la a spoiler tag, or generating content à la a table of contents, and as such it shouldn’t have the same syntax. So, at a minimum, that should rule out ![]() as a generic extension syntax.

Further, I don’t see the point or purpose of defining additional fields like {}. All that needs to be defined are delimiters, and the rest can be left up to any plugins/extensions.

I’ll say again, there are massive assumptions being made when designing things in this manner and I don’t think it’s wise.

What is the “more complex” syntax? This:

<figure>
    <iframe width="420" height="315" src="//www.youtube.com/embed/dQw4w9WgXcQ" frameborder="0" allowfullscreen></iframe>
    <figcaption>Funny video</figcaption>
</figure>

or:

@youtube[Funny video](dQw4w9WgXcQ)

You said you would use code blocks for that. So you’d want the following?

```youtube
Funny video
dQw4w9WgXcQ
```

which CommonMark parsers should translate to:

<pre><code class="language-youtube">
Funny video
dQw4w9WgXcQ
</code></pre>

I appreciate that not everybody has use for custom directives, but many people do. And for those it should be a great win if there is a standardized generic way to use and implement them. Keep in mind that I’m not advocating this to be part of “Core CommonMark”: it is a recommendation by the spec, not a requirement.

The use case you picked is the one I explicitly said is in keeping with the spirit of Markdown.

For example, I like extending ![]() to mean “embed this resource” not “create an image tag”. It is in keeping with its semantic use and the spirit of MD.

A theoretical !youtube[Funny video](http://youtu.be/dQw4w9WgXcQ) would render as “!youtubeFunny video” which, IMO, is a perfectly acceptable fallback since it still provides a link to the content and the exposed exclamatory declaration isn’t so jarring as to render the output confusing or meaningless.

But to specifically answer your straw man, no, I do not see any problem with this either:

```youtube
http://youtu.be/dQw4w9WgXcQ
```

It would render as an embed where a “youtube” plugin is available and as

http://youtu.be/dQw4w9WgXcQ

where no such plugin is available. That is a reasonable fallback and non-breaking for existing implementations.
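The “hand this block off to a parser” idea, with the code-block fallback, can be sketched roughly as follows. This is only an illustration: `render_fenced_block`, `youtube_handler`, the handler signature and the placeholder link it emits are all hypothetical names and choices, not part of any CommonMark implementation.

```python
import html
from typing import Callable, Dict

# Hypothetical plugin signature: (initiating line, block body) -> HTML string.
Handler = Callable[[str, str], str]


def render_fenced_block(info_line: str, body: str, handlers: Dict[str, Handler]) -> str:
    """Render a fenced block, delegating to a plugin when one is registered."""
    words = info_line.split()
    tag = words[0] if words else ""
    handler = handlers.get(tag)
    if handler is not None:
        # The plugin receives the whole initiating line so it can parse
        # any extra arguments itself.
        return handler(info_line, body)
    # No plugin: fall back to an ordinary code block.
    class_attr = f' class="language-{html.escape(tag)}"' if tag else ""
    return f"<pre><code{class_attr}>{html.escape(body)}</code></pre>"


def youtube_handler(info_line: str, body: str) -> str:
    """Placeholder plugin (assumption): emits a link instead of a real embed."""
    url = html.escape(body.strip())
    return f'<a class="embed" href="{url}">{url}</a>'
```

Calling `render_fenced_block("youtube", body, {})` yields the plain code block, while passing `{"youtube": youtube_handler}` hands the same block to the plugin.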


But even that doesn’t answer a multitude of other questions:

  • Why would you resolve this to a <figure> element?
  • Why an iframe instead of an object or video embed?
  • What if someone else wanted a different output format? Are there now multiple different potential outputs for “!youtube” depending on what site you’re posting on or what parser you’re using?

Maybe it is worth moving the block directive markup discussion to a separate thread? I’d like to implement something. It seems that such blocks are independent of inlines, and their syntax can be simpler.

I agree. Inline discussion is stabilising around the ![](){} vs @[](){} but there is still some debate around block syntax. There should be a separate discussion on block syntax.

http://talk.commonmark.org/t/block-directives/802


nexussays: It’s a directive, not a function. It only specifies the user’s intent, not how it looks in the output. So the question about HTML element output makes no sense in this context, because it ignores other output contexts like PDF or LaTeX.

@mb21, could you join the Block Directives discussion? I think blocks have a better chance of being stabilized in a short time.

I’ve updated the first post in this thread with my takeaways from the block directives thread…

Upon @jgm’s remark, which echoes concerns that have been voiced before, and after reading through these pandoc div syntax proposals, I’ve tried to simplify the syntax even more and updated the first post yet again. So now the (arg) is gone and wrapped into the attributes.

Having read the entire thread, I find the use of !foo[](){} convincing for inline directives (where !image[]() is the default directive that you get if you don’t specify a name). I don’t think there’s sufficient goodness from other syntaxes to justify introducing something new here; they’ll always look somewhat code-y, so we might as well just reuse the existing code-y smell we’ve already accepted. I also like the use of []{} for generic spans, while we’re at it.

For blocks, I’m still somewhat convinced by “just use tagged code blocks”, again for the “just re-use the syntax we already have” reasons. The html-based input language for the spec preprocessor I maintain has a number of parsing extensions done with <pre class=foo>, which is very similar, and it seems to work fine. However, these extensions are all special-syntaxes, not nestable HTML; in other words, they’re a type of code, not a type of content. It may make sense to keep tagged code blocks for, well, code, even when it’s some specialized language. A block of latex that’s meant to be rendered, for example, might be good for this.

So then, the case in the OP for ::: block directives is relatively convincing. It allows both leaf and container blocks, which is great. It makes nesting of content possible without the processor needing to know what’s going on (it can just render unknown container directives as divs and continue processing the contents). Tagged code blocks can’t do this - if the processor doesn’t know the tag, it has to treat it as a <pre>. And I think the aesthetics are just quite nice, particularly the “spoiler” example without any arguments. That looks like something I might actually write in my plain-text documents, which is a big plus in its favor. None of the other proposed syntaxes have this.

So, in summary, I think we should be consistent with image links and use !foo for inline custom directives. We should use tagged code blocks for code, including cases where the code is interpreted and the result is somehow used in the place of the code block, like inline LaTeX. We should use the :::foo syntax for leaf custom blocks, and container custom blocks that contain more markdown. This introduces a minimum of new syntax, while preserving some imo decent plain-text aesthetics.
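The fallback for unknown container directives (render as a div and keep processing the contents) could look something like this. It is a sketch under assumptions: `render_unknown_container` is a hypothetical name, reusing the directive name as a class is my own choice, and `toy_render` is only a stand-in for a real Markdown renderer.

```python
import html
from typing import Callable


def render_unknown_container(
    name: str, inner_markdown: str, render_markdown: Callable[[str], str]
) -> str:
    """Wrap the rendered contents of an unrecognized ::: container in a div."""
    # Assumption: the directive name is exposed as a class so that
    # stylesheets can still target it.
    class_attr = f' class="{html.escape(name)}"' if name else ""
    return f"<div{class_attr}>{render_markdown(inner_markdown)}</div>"


def toy_render(markdown: str) -> str:
    """Stand-in renderer: one escaped <p> per blank-line-separated block."""
    blocks = [block.strip() for block in markdown.split("\n\n") if block.strip()]
    return "".join(f"<p>{html.escape(block)}</p>" for block in blocks)
```

A tagged code block cannot take this path: without knowing the tag, the processor has no way to tell that the body is Markdown rather than code.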

Thanks for sifting through the whole discussion! So you think !foo[](){} is preferable to just !foo[]{}? The () were supposed to contain an id, url or path, similar to the image syntax. However, this can always be done with {src=foo}. It’s a tradeoff between a more concise but harder to explain/remember syntax, and a simpler alternative with only contents and attributes.

(btw, I think !foo, @foo and ::foo are all fine for inline directives.)
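To make the !foo[](){} versus !foo[]{} tradeoff concrete, here is a rough sketch of a matcher that accepts both forms. It is deliberately simplified (no nested brackets, no backslash escapes, attributes kept as a raw string), and `InlineDirective` and `parse_inline_directive` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class InlineDirective(NamedTuple):
    name: str
    content: str
    target: Optional[str]  # the (...) part, if present
    attrs: Optional[str]   # the raw {...} part, if present


INLINE_DIRECTIVE = re.compile(
    r"!(?P<name>[A-Za-z][\w-]*)"   # !name (a name is required)
    r"\[(?P<content>[^\]]*)\]"     # [content]
    r"(?:\((?P<target>[^)]*)\))?"  # optional (target)
    r"(?:\{(?P<attrs>[^}]*)\})?"   # optional {attributes}
)


def parse_inline_directive(text: str) -> Optional[InlineDirective]:
    """Parse a directive at the start of `text`, or return None."""
    match = INLINE_DIRECTIVE.match(text)
    if match is None:
        return None
    return InlineDirective(
        match["name"], match["content"], match["target"], match["attrs"]
    )
```

With this shape, `!video[foo](/foo.mpeg)` and `!video[foo]{src=/foo.mpeg}` carry the same information; the only difference is whether the target has a dedicated slot or is one attribute among others.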

Or, a somewhat more coherent version of my last post:

I think inline directives will always smell kinda bad, because plain text doesn’t really do anything with them. So, reusing an existing syntax at least minimizes the additional bad-smell introduced to the language, and increases learnability. So, I think we should stick with !foo for inlines, a la images, rather than introducing another meaningless bit of characters. There’s no functional difference between !foo and ::foo otherwise, just the prefix.

For blocks, I think we can usefully separate them into three categories:

  1. Leaf blocks
  2. Code-containing containers
  3. Markdown-containing containers

#1 is technically a subset of the others, but a lot of languages have special support for them, and I think they look decent in plain-text to support.

#2 and #3 should be distinguished so that processors which don’t understand the extension can do something useful - show preformatted text, or continue formatting contents as markdown. This is an important feature of generic extensions.

I think tagged code blocks already fulfill #2. It works for the obvious case of syntax highlighting, and I think is appropriate for “evaluated” code as well - falling back to displaying raw LaTeX, or raw railroad diagram descriptions, is totally fine and useful.

I think the OP’s proposal for using :::foo for #3 is good. It doesn’t clash with existing syntaxes, and it has good plain-text aesthetics, particularly when used with no arguments.

I also think it’s good for #1, for consistency. I assume you tell leaf apart from container by the presence of a blank line after it?

So you think !foo[](){} is preferable to just !foo[]{}?

Not hugely important to me. I’m fine if the core language directives use some special syntax; I don’t think we really need to “explain” ref links ([foo][ref]) in this syntax.

On the other hand, consistency is nice: the two existing forms that use () share a syntax for it, a url and possibly a string for the title. We could enforce this consistency: allow () on custom directives, but force it to have that same syntax. Not all custom directives need a url, but for those that do, having them all use the same basic syntax (rather than having to remember which key this or that particular custom directive uses) is good. (I liked the examples of !video[foo](/foo.mpeg)/etc; those seemed very easy to understand if you already knew how images worked.)

Using () on a directive that doesn’t use that information is the same as passing a key it doesn’t use - it’s just wasted data.

(Things with more complex needs, like the earlier slideshow example that needs multiple urls, couldn’t use () for that. That’s fine, I think.)


Edit: I expressed an opinion on the syntax inside of {} here, but I was wrong. Matching Pandoc and Markdown Extra is valuable here; we should use their syntax, per http://talk.commonmark.org/t/consistent-attribute-syntax/272.
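For reference, the attribute style in question looks roughly like `#id .class key=value`, as I understand it. The sketch below parses the text between the braces under that assumption; it is not Pandoc’s actual parser, and `Attributes` and `parse_attributes` are hypothetical names.

```python
import shlex
from typing import Dict, List, NamedTuple, Optional


class Attributes(NamedTuple):
    identifier: Optional[str]
    classes: List[str]
    pairs: Dict[str, str]


def parse_attributes(raw: str) -> Attributes:
    """Parse the text between { and } into an id, classes and key/value pairs."""
    identifier: Optional[str] = None
    classes: List[str] = []
    pairs: Dict[str, str] = {}
    # shlex.split keeps quoted values such as title="Funny video" together.
    for token in shlex.split(raw):
        if token.startswith("#"):
            identifier = token[1:]
        elif token.startswith("."):
            classes.append(token[1:])
        elif "=" in token:
            key, value = token.split("=", 1)
            pairs[key] = value
        # Tokens of any other shape are ignored in this sketch.
    return Attributes(identifier, classes, pairs)
```

For example, `parse_attributes(".my-empty-div")` gives a single class and nothing else, while `parse_attributes("src=foo")` gives one key/value pair.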

If 2. is implemented using fenced code blocks with attributes, that makes sense. Otherwise, the implementation could always keep the raw string contents of the container in the AST (in addition to the parsed markdown content) so that plugins etc. can do whatever they want with it.


No, the container blocks are closed with ::: on its own line, while leaf blocks are not. For nesting, a different number of colons can be used, using the same rule as currently for fenced code blocks.
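A rough sketch of that closing rule, assuming it mirrors fenced code blocks exactly (a closing fence is a line of colons at least as long as the opening one). `find_container_end` and both regexes are hypothetical, and indentation handling is ignored.

```python
import re
from typing import List, Optional

OPEN_FENCE = re.compile(r"^(:{3,})\s*(\S.*)?$")  # e.g. "::: spoiler" or ":::: {.note}"
CLOSE_FENCE = re.compile(r"^(:{3,})\s*$")        # colons alone on the line


def find_container_end(lines: List[str], start: int) -> Optional[int]:
    """Return the index of the line closing the container opened at `start`."""
    opening = OPEN_FENCE.match(lines[start])
    if opening is None:
        return None
    width = len(opening.group(1))
    for index in range(start + 1, len(lines)):
        closing = CLOSE_FENCE.match(lines[index])
        # Shorter fences belong to nested containers and are skipped.
        if closing is not None and len(closing.group(1)) >= width:
            return index
    return None  # no closing fence before the end of the input
```

Nesting then works by giving the outer container more colons than the inner one, so the inner closing fence is too short to close the outer block. When no closing fence exists, the scan runs to the end of the input.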

If 2. is implemented using fenced code blocks with attributes, that makes sense. Otherwise, the implementation could always keep the raw string contents of the container in the AST (in addition to the parsed markdown content) so that plugins etc. can do whatever they want with it.

My reason for separating the two was because they have different “ideal fallbacks” when presented in viewers that don’t understand the extension. “code containers” should fallback to preformatted text, while “markdown containers” should fall back to formatting their contents.

No, the leaf blocks are closed with ::: on its own line, while container blocks are not. For nesting, a different number of colons can be used, using the same rule as currently for fenced code blocks.

The OP doesn’t have ::: closing any of the leaf directives, and one of the container directive examples (:::::SPOILERS:::::) has ::: on both sides of the initial tag.

Oh, obviously it’s exactly the other way, edited my post to:

No, the container blocks are closed with ::: on its own line, while leaf blocks are not.

btw, I’m the OP ;)

Oh, obviously it’s exactly the other way,

The problem is that you currently have to search forward an arbitrary number of lines to tell whether the directive is a leaf or a container. That’s not very good for parsing; ideally you have a finite amount of lookahead (right now, parsing MD requires 1 or 2 lines of lookahead, depending on your strategy in some cases). It also means that you won’t be able to use a bare ::: without any further arguments to just mean “div”, as that would also trigger the “I’m a container!” behavior of a preceding block directive.

It would be better if there was some way to distinguish containers from leaves in the first line, or the line after. That’s why I suggested “blank line after” to mean leaf; it seems consistent with how we handle other types of blocks, and is easy to learn. (“end of container block” would also work, obviously, for those blocks where that’s unambiguous.)
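The “blank line after” rule needs only one line of lookahead, which can be sketched like this. `classify_block_directive` is a hypothetical name, and treating the end of input as “leaf” is my own assumption.

```python
from typing import List


def classify_block_directive(lines: List[str], index: int) -> str:
    """Classify the ::: directive at `index` by looking at the next line only."""
    next_index = index + 1
    # Assumption: running out of input counts the same as a blank line.
    if next_index >= len(lines) or lines[next_index].strip() == "":
        return "leaf"
    return "container"
```

The decision never depends on anything beyond the following line, so the parser does not have to scan ahead for a closing fence before it knows what kind of block it is in.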

btw, I’m the OP

I know, I was using it to mean “original post”, to specifically point at the first post in the thread for clarity.

I’m no parsing expert, but fenced code blocks currently work the same way. You potentially have to look ahead to the end of the document to make sure that three backticks aren’t starting a code block, but are actually just part of the paragraph text. jgm seems to be fine with the proposal for pandoc as well.

Since you usually don’t want an empty div, I think it makes sense for the div syntax to require a closing ::: as well:

::: {.my-empty-div}
:::

Without a blank-line-after rule, you can do something like:

::: spoiler

# my title

my very long paragraph...

:::