Generic directives/plugins syntax

I couldn’t agree more with @jgm’s suggestion. As an example, I have been able to implement a convention for floating environments, such as figures, tables and code listings, purely using existing md syntax (aside from attributed images, but that’s another story).

Like John suggested, if you are making an extension that maps to block-level content, making the starting delimiter a special section that starts with a keyword, such as

### Figure: proposed trajectory of Apollo 14 {#apollo14-plan}

gives some useful advantages:

  1. Headers serve as a relatively unambiguous marker, both in the raw Markdown (it’s easy for regex to parse) and the parsed AST (there’s no need to recurse into AST trees since it’s only allowed in the “top level”).

  2. Text editors, even with just the most rudimentary Markdown support, have no problem parsing headers. Most of them will also give you some sort of automatic TOC based on the headers. For my case, it was very nice because I jump between several text editors throughout the day and I was able to have my figures show up (along with their anchor label) in the TOC without having to write a single custom line of editor plugin code.
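To make the first point concrete, here is a rough sketch of picking such headers out of raw Markdown with a regex. The keyword list (Figure, Table, Listing), the pattern, and the `find_floats` name are assumptions made for illustration, not part of any existing tool.

```python
import re

# Assumed convention: an ATX header whose text starts with a keyword and a
# colon, optionally followed by a {#anchor} attribute.
FLOAT_HEADER = re.compile(
    r"^(?P<hashes>#{1,6})[ \t]+"
    r"(?P<kind>Figure|Table|Listing):[ \t]*"
    r"(?P<caption>.*?)"
    r"(?:[ \t]*\{#(?P<anchor>[\w-]+)\})?[ \t]*$",
    re.MULTILINE,
)


def find_floats(markdown_text):
    """Return (kind, caption, anchor) for every float header in the text."""
    return [
        (m.group("kind"), m.group("caption"), m.group("anchor"))
        for m in FLOAT_HEADER.finditer(markdown_text)
    ]


sample = "# Intro\n\n### Figure: proposed trajectory of Apollo 14 {#apollo14-plan}\n"
floats = find_floats(sample)
```

Ordinary headers such as `# Intro` are skipped because they lack the keyword, and a float header without an anchor simply yields `None` in the third slot.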

Thanks for the thoughtful replies.

My current use case is trying to map Markdown to reStructuredText directives. I’m one of the maintainers of Read the Docs and we currently support Sphinx with rST, and are planning to support the current spec. We will be waiting until an extension syntax is defined before adopting one, so we don’t end up supporting two.

We will be adding support to Sphinx for the current CommonMark spec through the CommonMark-Py implementation. For now, we will just implement it with no extensions, but it would be amazing to be able to get a more complete mapping to rst concepts.

My current thought is that rST directives (ref) are a bit underspecified. This leads to situations, like the admonition directive, where the boundary between arguments and content is ambiguous. Another instance is the note directive, which takes no arguments but has content.

An example of this: http://rst.ninjs.org/?n=fac7927439c29f7eb80db0f255a21427&theme=basic

I’d argue this is an annoying implementation detail of rST, but my main concern is being able to map properly. Having explicit argument, option, and content syntax would solve a lot of it for us from the Markdown side. Then we will just need to figure out how to map it to rST.

I’d also love to get extensions into CommonMark so that we can begin to support them. We might go with your suggestions of extending current syntax for now – but the main power for us comes from a relatively complete mapping of the markup.

Some sphinx/docutils features do not map neatly with the current extension we are discussing here.

Namely, the least awkward course of action is to import the directive as-is and then use the generic attribute syntax for things like roles.

(I’m the guy adapting the remarkdown to use commonmark-py and I extended it already to support block attributes).

Yea, the main thing I’m worried about in the deeper mapping is the dependence of the directives on the rst parser. I guess at the end of the day, even if we push the content into the directive as a string, it would be parsed as rst, but the author would be writing markdown, so we need to use a generic way of mapping parsed commonmark objects.

Not really, the directive interface itself just takes the text and the option and passes it down to a specialized parser. Usually the parser feeds back docutils nodes to be processed by the generic docutils code.

The problem I have is that the model of those directives is that options can have the same name and multiple instances, while here we have a dictionary, thus a single key.
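A small sketch of that mismatch, using made-up field lines in the rST style. The `parse_fields` helper is hypothetical; the point is only that a list of pairs keeps repeated names while a dictionary silently keeps the last one.

```python
# Hypothetical option lines, in the style of repeated rST fields.
lines = [
    ":raises ValueError: if the body is too long",
    ":raises TypeError: if the body is not a string",
]


def parse_fields(field_lines):
    """Split ':name argument: text' lines into (name, argument, text) tuples."""
    fields = []
    for line in field_lines:
        _, head, text = line.split(":", 2)
        name, _, argument = head.partition(" ")
        fields.append((name, argument, text.strip()))
    return fields


fields = parse_fields(lines)
as_pairs = [(name, text) for name, _, text in fields]  # keeps both entries
as_dict = dict(as_pairs)                               # later key overwrites earlier
```

So an attribute syntax that is modelled as a dictionary would need either list-valued keys or an ordered list of pairs to carry this kind of content.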

Interesting, we first thought about having both arguments (i.e. filenames, urls and other identifiers) and options (i.e. key-value pairs which we call attributes) as well. But then we dropped the former since having both is somewhat redundant—you can simply write :youtube[funny cat]{id=1234 fullscreen=true} instead of :youtube[funny cat](1234){fullscreen=true}—and it has the potential to confuse users unnecessarily.

(Note that you can still have attributes as well as content, the difference being that attribute-key and -values are plain strings, while content is markdown as well.)
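For illustration, a minimal sketch of parsing the inline form above. The grammar is an assumption of this sketch (content may not contain `]`, attribute values may not contain spaces or quotes), and `parse_inline_directive` is a hypothetical name, not an existing API.

```python
import re

# Assumed grammar for this sketch: :name[content]{key=value key=value}
INLINE_DIRECTIVE = re.compile(
    r":(?P<name>[A-Za-z][\w-]*)"
    r"\[(?P<content>[^\]]*)\]"
    r"(?:\{(?P<attrs>[^}]*)\})?"
)


def parse_inline_directive(text):
    """Parse the first inline directive in text, or return None."""
    m = INLINE_DIRECTIVE.search(text)
    if m is None:
        return None
    attrs = {}
    for pair in (m.group("attrs") or "").split():
        key, _, value = pair.partition("=")
        attrs[key] = value  # keys and values stay plain strings
    return {"name": m.group("name"), "content": m.group("content"), "attrs": attrs}


parsed = parse_inline_directive(":youtube[funny cat]{id=1234 fullscreen=true}")
```

The content is returned as a raw string here; in a real implementation it would be handed back to the Markdown inline parser, while the attribute values would stay plain strings.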

Some sphinx/docutils features do not map neatly with the current extension we are discussing here.

@lu_zero, are you talking about the markdown directives proposal having no arguments, or?

Do you guys think this poses a major problem for mapping to docutils?

@ericholscher, would love to hear your feedback on the proposal in the first post in this topic, which I’ve updated a few times as the discussion progressed.

I think the proposal generally looks good. My main concern is with the above example, which conflates content with arguments. I.e., I would consider “Foobar” an argument to the wikipedia directive, but “content” is actual content for the smallcaps directive. I believe rst did this same thing with mixing arguments and content, but I don’t know if there is really a way around it, since users will likely do it because it’s a more natural syntax.

For example, this feels kinda off to me:

:wikipedia[**Foobar**]
:smallcaps[**we went for a run**]

Though having it be :wikipedia{Foobar} might be workable, I think the separation will likely not be kept by third-party authors. I think it might be an unavoidable side effect of offering this kind of feature, and probably not something a spec can prevent.

I’m not totally set on RST’s differentiation between arguments and options. I think we could do a hacky thing and map {arguments='string of arguments'} or something to produce a more compatible mapping. I think at the end of the day, mapping perfectly is going to be impossible, or at least really awkward.
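A rough sketch of what that hacky mapping could look like. The special `arguments` attribute and the `to_rst_directive` helper are assumptions for illustration only; the output is plain rST text that would still have to go through docutils.

```python
def to_rst_directive(name, attrs, content=""):
    """Render a parsed Markdown directive as reStructuredText.

    The special "arguments" attribute (an assumption of this sketch) becomes
    the directive's argument string; all other attributes become options.
    """
    options = dict(attrs)
    arguments = options.pop("arguments", "")
    lines = [f".. {name}:: {arguments}".rstrip()]
    for key, value in options.items():
        lines.append(f"   :{key}: {value}".rstrip())
    if content:
        lines.append("")  # blank line separates options from the body
        lines.extend(f"   {line}".rstrip() for line in content.splitlines())
    return "\n".join(lines) + "\n"


rst = to_rst_directive(
    "image",
    {"arguments": "picture.png", "alt": "A picture", "width": "200px"},
)
note = to_rst_directive("note", {}, "Be careful.")
```

This keeps arguments, options and content as three separate inputs on the Markdown side, which is the explicit separation I was asking for, even if the attribute name itself is awkward.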

Example:

.. py:function:: send_message(sender, recipient, message_body, [priority=1])

   Send a message to a recipient

   :param str sender: The person sending the message
   :param str recipient: The recipient of the message
   :param str message_body: The body of the message
   :param priority: The priority of the message, can be a number 1-5
   :type priority: integer or None
   :return: the message id
   :rtype: int
   :raises ValueError: if the message_body exceeds 160 characters
   :raises TypeError: if the message_body is not a basestring

The simplest way is to make something like

:::rst
.. py:function:: send_message(sender, recipient, message_body, [priority=1])

   Send a message to a recipient

   :param str sender: The person sending the message
   :param str recipient: The recipient of the message
   :param str message_body: The body of the message
   :param priority: The priority of the message, can be a number 1-5
   :type priority: integer or None
   :return: the message id
   :rtype: int
   :raises ValueError: if the message_body exceeds 160 characters
   :raises TypeError: if the message_body is not a basestring
::: 

or

:::py_function[send_message(sender, recipient, message_body, [priority=1])]


   Send a message to a recipient

   :param[str](sender){c="The person sending the message"}
   :param[str](recipient){c="The recipient of the message"}
   :param[str](message_body){comment="The body of the message"}
   :param[](priority){c="The priority of the message, can be a number 1-5"}
   :type[priority](integer, None)
   :return[](){c="the message id"}
   :rtype[int]
   :raises[ValueError](if the message_body exceeds 160 characters)
   :raises[TypeError](if the message_body is not a basestring)
::: 

to use something close to what we discussed so far.

Interesting. That is certainly one way to make it work – and likely less hacky than all the other ways I was looking at doing it. It will make the user write rst, but it makes a nice bridge between the two.

Luckily the number of directives with a body is small, so it would be possible to use the markdown parser for the body but keep the option parsing.

One of the items that should be checked twice is the cross-reference support.

Wild idea. How about unifying code for plugins and code formatting? The average user cannot tell the difference anyway, so why bother.

~~~ youtube { vid=09jf3ow9jfw }
Here is a video of my cat. 
~~~

The markup looks leaner and more intuitive. I wonder how it would behave with other formatting needs (e.g. sphinx/docutils bridging).

+++ nielsle [Jun 02 15 13:23 ]:

Wild idea. How about unifying code for plugins and code formatting. The
average user cannot tell the difference anyway, so why bother.

Here is a video of my cat.

One advantage of this sort of approach is that it requires no
changes in the parser. You just need to go through the AST,
looking for specially marked code blocks and transforming
them to links or whatnot.
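As a sketch of that post-processing pass, here is a walk over a toy AST. The `Node` class is a stand-in invented for this example, not the API of any real parser, and `youtube_handler` is a hypothetical plugin.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy stand-in for a parsed Markdown node (not a real parser's API)."""
    type: str
    info: str = ""      # info string of a fenced code block
    literal: str = ""   # raw text of the block
    children: list = field(default_factory=list)


def transform_plugins(node, handlers):
    """Replace code blocks whose info string names a plugin, in place."""
    for i, child in enumerate(node.children):
        name = child.info.split(None, 1)[0] if child.info.strip() else ""
        if child.type == "code_block" and name in handlers:
            node.children[i] = handlers[name](child)
        else:
            transform_plugins(child, handlers)
    return node


def youtube_handler(block):
    # Hypothetical handler: wrap the block text in a placeholder HTML block.
    return Node(type="html_block", literal=f'<p class="video">{block.literal.strip()}</p>')


doc = Node("document", children=[
    Node("paragraph"),
    Node("code_block", info="youtube { vid=09jf3ow9jfw }", literal="Here is a video of my cat.\n"),
    Node("code_block", info="python", literal="print(1)\n"),
])
transform_plugins(doc, {"youtube": youtube_handler})
```

Code blocks whose first info word has no registered handler, like the `python` block here, are left untouched, so ordinary code formatting keeps working.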

Compare my comment here.

One nice thing about most lightweight markup languages is that they do not rely on English words for tag and attribute names or keyword values since they are repurposing ASCII punctuation/non-alphanumeric characters instead. All those extensions with key-value pairs and explicit names harm that principle. They are a step back. They are not based upon prior art, i.e. actual practice in plain-text media. They are not in the spirit of Markdown. They impose coder habits onto a general purpose language. They must be considered harmful. They are a bad idea.

That being said, there is a place for names that are not predefined, i.e. IDs and classes, or that are proper names, e.g. language names for automatic highlighting in code blocks. IDs can be auto-generated, however, in most if not all cases.

In other words, if you want verbose tags you have come to the wrong place and should use XML/HTML or (La)TeX instead.

+++ Crissov [Jun 15 15 23:47 ]:

One nice thing about most lightweight markup languages is that they do
not rely on English words for tag and attribute names or keyword values
since they are repurposing ASCII punctuation/non-alphanumeric
characters instead. All those extensions with key-value pairs and
explicit names harm that principle. They are a step back. They are not
based upon prior art, i.e. actual practice in plain-text media. They
are not in the spirit of Markdown. They impose coder habits onto a
general purpose language. They must be considered harmful. They are a
bad idea.

This is a real advantage of Markdown (and its ilk) over LaTeX, DocBook, HTML, etc. If you write in Swedish, your source document looks like Swedish – not Swedish with a bunch of English words mixed in. So I’m against hard-coding English words into the syntax.

Of course, going too far in the other direction makes your source document look like Perl.

Just discovered and played around with @vitaly’s markdown-it container plugin, which allows the most basic form of 3. Container Block Directives proposed above. Pretty cool! Looking forward to hearing about your experiences when you’ve got more user feedback etc.

We use it for quotes & spoilers:

Don’t be confused by the ``` syntax; that’s for convenience. When the keywords quote and spoiler are used, it overrides the defaults and runs our plugin to allow markup in the inner content.

I meant:

::: warning
here be dragons
:::

Yup mb21, I would be interested to see how the custom container functions in the wild. Seems to be fine so far.

What would you think of the details block as a solution for spoiler blocks? (See the w3schools page on the <details> tag.)

^^^ classname summary text
full content here
^^^

would render as

<details>
  <summary>summary text</summary>
  <p> full content here </p>
</details>

e.g.

^^^ spoiler for harry potter ^^^
snape dies
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
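A minimal sketch of such a transform, working on raw lines. It assumes that everything after the opening carets (minus any trailing carets) is the summary and that a line of three or more carets closes the block; `details_to_html` is a hypothetical name.

```python
import html
import re

# Opening line: three or more carets, then summary text, then optional carets.
OPEN = re.compile(r"^\^{3,}\s*(?P<summary>.*?)\s*\^*\s*$")
# Closing line: nothing but three or more carets.
CLOSE = re.compile(r"^\^{3,}\s*$")


def details_to_html(lines):
    """Convert ^^^ blocks in a list of lines to <details> elements."""
    out, i = [], 0
    while i < len(lines):
        m = OPEN.match(lines[i])
        if m and m.group("summary"):
            body, j = [], i + 1
            while j < len(lines) and not CLOSE.match(lines[j]):
                body.append(lines[j])
                j += 1
            if j < len(lines):  # only convert blocks that are actually closed
                out.append("<details>")
                out.append(f"  <summary>{html.escape(m.group('summary'))}</summary>")
                out.append(f"  <p>{html.escape(' '.join(body))}</p>")
                out.append("</details>")
                i = j + 1
                continue
        out.append(lines[i])
        i += 1
    return out


rendered = details_to_html([
    "^^^ spoiler for harry potter ^^^",
    "snape dies",
    "^^^^^^^^",
])
```

The body is escaped and wrapped in a single paragraph here for brevity; a real implementation would run the Markdown block parser on it instead.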

Unfortunately, the details element is not widely supported by browsers.