Generic directives/plugins syntax

Unfortunately, the details element is not widely supported by browsers.

:confused: Then at least support it as a ‘detail’ block within CommonMark, expressed in HTML in whatever way works best for now (e.g. how spoiler text is done in HTML nowadays).

There is no need to use a <details> element, which is already more or less defined in HTML but may or may not be recognized by browsers and other HTML consumers. If the purpose (from the point of view of the CommonMark processor) is to “wrap” an uninterpreted block of raw text (which is what I take from @mofosyne’s remark about “to see how the custom container function in the wild”), it is better to “invent” an element type name that does not conflict with popular DTDs like HTML.

This already works quite nicely for an element type called <mark-up> in my little project (not exactly “in the wild”, but we’ll get there): just as the CommonMark processor cmark_filter does now, other markup processors would recognize and process only this special “container element” type, until it has vanished from the output document.

Something along those lines is also used by recommonmark (e.g. to bridge rst-specific constructs that are too annoying to support natively).

That’s an interesting key word you mentioned there: “recommonmark”—so thank you for the hint!

Yes, that seems similar to the concept I’ve based my project on; but so far I have only glanced at the first page that came up in the Google search …

[The rst means reStructuredText, if I understand correctly.]

Exactly, and this part should probably be agreed on soon, since it would be nice to be able to use it.

On the other hand, I wouldn’t see any harm in allowing either colons : or equals signs = when specifying attributes, or in allowing optional commas ,. The values on the right-hand side of a key-value pair shouldn’t contain either symbol, and the additional overhead for parsers is minimal.

Markdown is closely related to HTML (e.g. it allows HTML code), so HTML attribute syntax is preferable.

I would like to specify attributes for RDFa. In such a case values can have colons:

[tiger]{property=ov:preferredAnimal}
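
To make the point concrete, here is a minimal sketch of how an implementation might read HTML-style attributes, so that a colon inside a value needs no special treatment. The function name `parse_attributes` and the exact grammar are assumptions for illustration only; nothing here is taken from a spec or an existing parser.

```python
import re

# Hypothetical grammar: key=value pairs separated by whitespace, where a
# value is either double-quoted or a run of non-space characters.
ATTR_RE = re.compile(r'([A-Za-z_][\w:.-]*)=(?:"([^"]*)"|(\S+))')


def parse_attributes(text: str) -> dict[str, str]:
    """Parse the text of a {...} attribute block into a dict."""
    inner = text.strip()
    # Drop the surrounding braces if they are present.
    if inner.startswith("{") and inner.endswith("}"):
        inner = inner[1:-1]
    attrs = {}
    for match in ATTR_RE.finditer(inner):
        key, quoted, bare = match.groups()
        attrs[key] = quoted if quoted is not None else bare
    return attrs


print(parse_attributes("{property=ov:preferredAnimal}"))
# → {'property': 'ov:preferredAnimal'}
```

Because `=` is the only key/value delimiter, the colon in `ov:preferredAnimal` is just part of the value.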

It seems like this discussion has significantly stalled; is that right? I’m trying to research what the current state of extending Markdown is, and this thread seemed like the place. Is that true?

As I understand it, the priority right now is to get version 1.0 of the core CommonMark spec finalized before adding extensions. See this thread about adding tables to CommonMark:

That’s a very subjective one :wink: I prefer the equals sign ‘=’.

Good point!

If we used colons, I would leave out the space after the colon. It looks good in multiline listings but not on a single line, where it competes with the literal spaces that belong to the sentence and thus creates line noise. I think a line is perceived differently here than in source code, where I prefer spaces after most symbols.

But I can live with either. Though, for uniformity, we should settle on only one.

I think the space " " itself creates ambiguity. Take programs that let you add tags (to posts, etc.) separated by spaces: I cannot have spaces inside a tag. Then I forget about it, assume a comma is the separator, and bam! I have an unneeded tag.

The comma ‘,’ is what is used to list things within a sentence in ordinary writing. This is why we should be consistent about it. We also use commas in arrays and in function signatures in programming. A space means separation of words, not of items.

Is there further progress on this, or is it stalled?

On the other hand, spaces do have precedent for item separation. Shell is one, and Fish shell notably goes all in on this. The other notable (and relevant) example off the top of my head is HTML’s tag attributes.

As such, for consistency with HTML, I’m against commas.

Hi, everyone!

I made a plugin for markdown-it which fully implements this spec. :tada:

If you use JavaScript as your development language, just have a look!

If you think this work can help you, please give me a star! :smiley:


I don’t think it’s yet in the current official standard, but thanks @lookas for making an implementation. It will be interesting to see how well it will work in the real world. Is there a way to track if people are having issues with how your plugin is implemented that the community should account for?

It doesn’t look like anyone has explicitly discussed significant whitespace in directives. Currently the only Markdown syntax that supports significant whitespace is code blocks, and otherwise the spec says that leading whitespace should be trimmed. However, even in the OP’s examples, there is code inside the directives. I’m currently using an implementation of generic directives via the remark-directive plugin, and virtually all of the container directives I’ve implemented could benefit greatly from respecting significant whitespace.

Interestingly, I can maintain significant whitespace in my directives by also wrapping them in code fences, e.g.

:::ditaa
```
+---+
| A |
+-+-+
  |  +---+
  +->+ B +
     +---+
```
:::

While this works, it’s not particularly intuitive, and honestly it’s probably more of a bug than a feature.

The biggest concern that I have with making all directives respect significant whitespace is that there may be some directives that specifically shouldn’t respect it. Considering that all generic directives will need an implementation to be built into their renderer, though, I think it’s reasonable to assume that significant whitespace could be trimmed in the renderer in those cases.

Considering that the parser already has to decide how to handle the contents of a directive, this is not something that can be handled in the renderer. IMHO, the contents of directives should always be parsed as Markdown. For your use case, using code blocks (with attributes) seems to be the right solution:

``` {.ditaa}
+---+
| A |
+-+-+
  |  +---+
  +->+ B +
     +---+
```

Reading your original post, it really feels to me like that wasn’t the intention, especially given that directives such as :::eval, :::form and :::md-example appear as examples in the post, and they definitely contain things that cannot be interpreted as plain Markdown.

Also, I wanted to share a suggestion regarding this. Talking specifically about block directives: is it really necessary to interpret attributes in any way in the CommonMark specification itself?

The specification currently makes no effort to actually parse the info string in code blocks, leaving it up to implementations to handle it whichever way they prefer. Would it not make sense to do the same with blocks of nested Markdown?

For example, consider the following:

~~~ class="lang-c" id="block"
int main() { return 0; }
~~~

CommonMark doesn’t specify that the info string should be interpreted as a sole language name, so if some implementation wants to parse the above as attributes to apply to the pre or code elements (if converting to HTML), it is free to do so. (And it can choose whatever syntax it wants.)

I think that could be extended for blocks of Markdown too, where the “info string” is also left to be parsed by the implementation itself:

::: aside class="ad"
Buy **our product**!
:::

The whole aside class="ad" would be parsed as an opaque string in the CommonMark spec, and the implementation could be free to interpret whichever way it wants.
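
As a rough illustration of what “left to the implementation” could look like, the sketch below treats the info string as an opaque string that one particular implementation chooses to read as an optional name followed by HTML-style attributes. The function `interpret_info_string` and this particular reading are my assumptions; another implementation could just as well pick a different syntax.

```python
import shlex


def interpret_info_string(info: str) -> tuple[str, dict[str, str]]:
    """One possible reading of an opaque info string: [name] key="value" ..."""
    # shlex handles the double quotes, so class="ad" becomes the token class=ad.
    tokens = shlex.split(info)
    name = ""
    # A leading token without '=' is taken to be the directive name.
    if tokens and "=" not in tokens[0]:
        name = tokens.pop(0)
    attrs = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        attrs[key] = value if sep else ""
    return name, attrs


print(interpret_info_string('aside class="ad"'))
# → ('aside', {'class': 'ad'})
print(interpret_info_string('class="lang-c" id="block"'))
# → ('', {'class': 'lang-c', 'id': 'block'})
```

The spec would only need to say where the info string starts and ends; everything inside this function would be the implementation's business.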

For inline text spans, I think a good approach would be to simply have a []{} syntax, where the text inside [] is always parsed as inline Markdown, and the text inside {} is also an opaque string (that implementations can choose what to do with).

Considering that already the parser has to decide how to handle the contents of a directive, this is not something that can be handled in the renderer.

Directives seem like a really great opportunity to enable any type of custom rendering, but being prescriptive about how the contents are handled narrows that to “any type of custom rendering, so long as you’re fine with the contents being parsed as Markdown first”.

To remedy this, I think the parser should leave the content of the directive alone. If the user wants the content parsed as Markdown, they can run the parser over it separately.

As a more concrete example, consider a directive that renders Markdown into a specific element, like a figure with a figcaption:

:::figure
![Image](image_url)
::figcaption[A description of the *diagram* rendered ~below~ above.]
:::

Outputs:

<figure>
  <img alt="Image" src="image_url">
  <figcaption>A description of the <em>diagram</em> rendered <s>below</s> above.</figcaption>
</figure>

The parser would not parse the contents of the directive, which kind of sucks. However, if the contents of the directive need to be rendered as Markdown, the renderer could run the parser again over contents of the directive. That’s maybe not ideal, but it enables so many more opportunities.
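
A minimal sketch of that idea, under stated assumptions: the parser hands each directive's raw text to a handler chosen by name, and only handlers that want Markdown call the parser again. `render_markdown` is a placeholder standing in for a real Markdown renderer, and the handler names are hypothetical; this is not how remark-directive or any other library works today.

```python
import html
from typing import Callable


def render_markdown(source: str) -> str:
    """Placeholder for a real Markdown renderer (assumption for this sketch)."""
    return f"<p>{source.strip()}</p>"


def render_aside(raw: str) -> str:
    # This handler opts in to Markdown by running the parser again.
    return f"<aside>{render_markdown(raw)}</aside>"


def render_ditaa(raw: str) -> str:
    # This handler keeps the raw text, including significant whitespace.
    return f'<pre class="ditaa">{html.escape(raw)}</pre>'


HANDLERS: dict[str, Callable[[str], str]] = {
    "aside": render_aside,
    "ditaa": render_ditaa,
}


def render_directive(name: str, raw: str) -> str:
    """Dispatch raw directive content; unknown names fall back to Markdown."""
    handler = HANDLERS.get(name, render_markdown)
    return handler(raw)


print(render_directive("aside", "Buy our product!"))
print(render_directive("ditaa", "| A |\n  |"))
```

The cost is the second parser pass for Markdown-bearing directives; the gain is that handlers like the diagram one see their input untouched.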

For your use-case, using code blocks (with attributes) seems to be the right solution…

In my case, I’m not trying to render a code block. The contents of the directive are actually being captured, encoded, passed to an API, then the API returns an SVG to be rendered in place of the original ASCII diagram. You can see a live example of this on my website. All of the diagrams are rendered with a generic directive, similar to the one I posted above with the code fences. The figures are also using generic directives. The implementation looks like this:

::::figure
:::ditaa
```
(diagram)
```
:::
::figcaption[A description of the diagram rendered above.]
::::

You can also check out the GitHub issue I created on the remark-directive repository about enabling this option, where I go into more technical detail on how I’d like to see it work.

Now, I realize that I’m talking very specifically about my personal desires for what generic directives could be, but I feel like it’s a really good example of the types of opportunities that generic directives can enable.

That doesn’t work well because of reference links and embedded HTML. It can also behave awkwardly in light of code blocks.

Example 1:

[a]: https://example

See [b].

::: foo
[b]: https://example

See [a].
:::

Example 2:

::: foo
<script>
/*
:::
*/
</script>
:::

Example 3:

::: foo

~~~ colons
:::
~~~

:::

Edit: Removed “code spans” from the list of potential problems, because it doesn’t make much sense.
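
To show what goes wrong in Example 3, here is a sketch of a deliberately naive scanner that does not parse the directive body and simply closes the container at the first bare `:::` line. The function `split_naive` is hypothetical and exists only to demonstrate the failure mode.

```python
def split_naive(lines: list[str]) -> list[str]:
    """Return the raw body of the first ::: container, closing at the first bare ':::'."""
    body: list[str] = []
    inside = False
    for line in lines:
        stripped = line.strip()
        if not inside:
            # An opening fence is ':::' followed by a name.
            if stripped.startswith(":::") and stripped != ":::":
                inside = True
            continue
        if stripped == ":::":
            # The scanner cannot tell that this line sits inside a code block.
            return body
        body.append(line)
    return body


example_3 = [
    "::: foo",
    "",
    "~~~ colons",
    ":::",
    "~~~",
    "",
    ":::",
]
print(split_naive(example_3))
# → ['', '~~~ colons']
```

The container ends inside the fenced code block, so the code block is never closed and the real closing fence is left dangling. Avoiding that requires the block parser to understand the contents, which is the point of the examples above.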

Using indented code blocks instead of fenced makes this syntax a bit less noisy. It might still be unintuitive to require a code block, but custom directives already presume some author knowledge.

::: ditaa
    +---+
    | A |
    +-+-+
      |  +---+
      +->+ B +
         +---+
:::

The most likely issue with this is uninitiated authors editing an existing document and not understanding why the new diagram they added isn’t displaying properly.
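
For reference, a small sketch of why the indented form keeps the diagram intact: removing up to four leading spaces from each line (as is done for indented code blocks) leaves the relative alignment untouched. The helper `strip_code_indent` is a hypothetical name and a simplification that ignores tabs.

```python
def strip_code_indent(block: str, width: int = 4) -> str:
    """Remove up to `width` leading spaces from every line of an indented block."""
    out = []
    for line in block.splitlines():
        removed = 0
        # Count leading spaces, but never strip more than `width` of them.
        while removed < width and removed < len(line) and line[removed] == " ":
            removed += 1
        out.append(line[removed:])
    return "\n".join(out)


body = (
    "    +---+\n"
    "    | A |\n"
    "    +-+-+\n"
    "      |  +---+\n"
    "      +->+ B +\n"
    "         +---+"
)
print(strip_code_indent(body))
```

The extra indentation on the last three lines survives, so a directive handler would receive the diagram with its columns still aligned.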