No-markdown islands


@mb21, I think there’s some confusion: I’m not talking about parsing LaTeX, or any other language, and acting on it to render it. When MathJax is used in a web page, for example, it is the MathJax script’s job to take the LaTeX and render it appropriately. MathJax recognizes its own delimiters, $...$ or \(...\) IIRC, and renders what is inside them. But what’s inside is just verbatim LaTeX, and it should not have been pre-processed by a Markdown parser.
The job MathJax does with JavaScript could also be done by other scripts in other contexts. I don’t have another example, but one can infer that there, too, input intended to be consumed by a script must not be parsed as Markdown.


If you’re interested, I think I proposed something similar to you for the mathjax example at least: Mathematics extension


Something like this would be useful for embedding MathJax. We also ran into this problem at Stack Exchange and ended up hard-coding an ignore sequence whenever we saw the MathJax start and end blocks (the dollar signs, etc., as I recall). For example


Yes, I think it would be good to have in the spec that anything between $…$ and $$…$$, or between \(…\) and \[…\], shouldn’t be processed by CommonMark. This is the standard way of indicating LaTeX math blocks. And it would be great if it were always possible to add MathJax support to any CommonMark editor without having to write difficult plug-ins, like the plugin I wrote for Discourse:


IMHO this is part of a more generic problem: a format for inline extensions. Block extensions are now being discussed in several threads, but that markup is not suitable for inline elements (in paragraphs and lists).

In practice, inline elements follow a different philosophy: the syntax should be as simple as possible. So it would be more convenient for users to have a different marker for each case than one universal marker with a name and params.


Does a <![CDATA[]]> section fit your definition of a no-markdown island?

You can put almost anything you want in one, except maybe ]]>. There seem to be a few fine points to work out around them, but they are already in the spec.

I was testing in the block context and found a spec issue, see: Drawing a distinction between HTML block elements and non-element tags. We may need to pay similar attention to the difference between Raw HTML and non-element tags like CDATA.


No, it doesn’t fit since the CDATA section ends up in the rendered output wrapping its contents. MathJax LaTeX, for example, is not expected to be wrapped by this in the output.


I see. So it sounds like this topic belongs in the Extensions category.


Indeed, but I’d like to point out that the no-markdown literal I’m suggesting can encompass MathJax but is not limited to it. It’s a simpler feature for asking the parser not to parse the contents and just output them verbatim. So, let’s say the delimiters were ''': then _foo_ '''$E = mc^2$''' *bar* would make $E = mc^2$ go to the output without being parsed, leaving the $...$ part to MathJax.
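
To make the idea concrete, here is a minimal sketch of how such a literal could be handled outside the parser: the islands are swapped for placeholders before Markdown runs and put back afterwards. The ''' delimiter is the hypothetical one from the post above, `protectIslands`, `restoreIslands` and `toyRender` are made-up names, and the toy renderer only stands in for a real Markdown parser. It also assumes the private-use character U+E000 never appears in the source.

```javascript
// Hypothetical island delimiter from the post above; not part of any spec.
const ISLAND = /'''([\s\S]*?)'''/g;

// Swap each island for a placeholder that Markdown has no reason to touch.
// Assumption: U+E000 (a private-use character) never occurs in the source.
function protectIslands(source) {
  const islands = [];
  const text = source.replace(ISLAND, (_, body) => {
    islands.push(body);
    return `\uE000${islands.length - 1}\uE000`;
  });
  return { text, islands };
}

// Put the verbatim contents back once Markdown rendering is done.
function restoreIslands(rendered, islands) {
  return rendered.replace(/\uE000(\d+)\uE000/g, (_, i) => islands[Number(i)]);
}

// Stand-in for a real Markdown renderer: handles only *em* and _em_.
function toyRender(text) {
  return text.replace(/([*_])(.+?)\1/g, "<em>$2</em>");
}

const { text, islands } = protectIslands("_foo_ '''$E = mc^2$''' *bar*");
const html = restoreIslands(toyRender(text), islands);
console.log(html); // <em>foo</em> $E = mc^2$ <em>bar</em>
```

The island reaches the output untouched and without any wrapper, so a script such as MathJax still sees its own $...$ delimiters.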


Maybe, but, at least for me, this is such a must that it belongs right in the spec. I regret that it wasn’t included in Markdown from day one.

Mathematics extension

One approach could be to have a general directive like this in the beginning of the page.

    latex: $ ... $
    C: /* ... */ 

$ 1+1 = 2 $
/* hello world */

The good thing about this approach is that it is easy to refer to and can be adapted as needed.

Also, the above is local to the page only, but I’m sure it would be possible to create a sitewide header for declaring these (e.g. for a website hosting maths tutorials). And perhaps it could be baked into core if used often enough by everyone, like LaTeX `$ … $`.
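
As a rough sketch of how such directives could be consumed, the code below reads "name: open ... close" lines into delimiter pairs and then collects the spans that should be left alone. Treating a literal "..." as the separator is my reading of the example above, and `parseDirectives` and `findIslands` are hypothetical names, not anything a CommonMark implementation provides.

```javascript
// Parse directive lines such as "latex: $ ... $" into delimiter pairs.
// Assumption: a literal "..." separates the opening and closing delimiter.
function parseDirectives(header) {
  const rules = [];
  for (const line of header.split("\n")) {
    const m = /^\s*(\w+):\s*(\S+)\s+\.\.\.\s+(\S+)\s*$/.exec(line);
    if (m) rules.push({ name: m[1], open: m[2], close: m[3] });
  }
  return rules;
}

// Scan the text and collect every span enclosed by a declared delimiter pair.
function findIslands(text, rules) {
  const found = [];
  let pos = 0;
  while (pos < text.length) {
    const rule = rules.find((r) => text.startsWith(r.open, pos));
    const end = rule ? text.indexOf(rule.close, pos + rule.open.length) : -1;
    if (end === -1) {
      pos += 1; // no island starts here (or it is never closed)
      continue;
    }
    const stop = end + rule.close.length;
    found.push({ name: rule.name, raw: text.slice(pos, stop) });
    pos = stop;
  }
  return found;
}

const rules = parseDirectives("latex: $ ... $\nC: /* ... */");
const found = findIslands("$ 1+1 = 2 $\n/* hello world */", rules);
console.log(found);
```

A sitewide header would just be a second directive source whose rules are merged into the same list before scanning.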


This is the key point: who knows what kind of other parsing scripts will conflict with Markdown in the future? Having a simple “no Markdown” notation would make forward compatibility much simpler and possibly reduce the need for extensions to the Markdown parser.

I still think that the best bet would be to somehow combine this with the raw HTML notation, since the whole point of that is for Markdown to pass it along unaltered.


mightymax, this could be your solution. Use markdown=1 flags in markdown (where all html tags are by default markdown=0)

<div markdown="1">
This is *true* markdown text.
</div>


I’ll just point out that the markdown="1" trick should be credited to John Gruber. It was his plan to incorporate this into Markdown at some point (and if I recall correctly, one of the 1.0.2 betas had the feature enabled, but it was removed in later betas because it had issues).



It’d be a lot cleaner if the parser parsed the guts of capitalised tag names, and converted them to lowercase. This example looks messy, but in practice, you wouldn’t need to use the feature very often.

# this is a header one
<p> # this is just text <SPAN> but `this is code` </SPAN></p>

…becomes, more or less…

<h1> this is a header one </h1>
<p> # this is just text <span> but <code>this is code</code> </span></p>

This is not exactly backwards compatible, but you could easily convert existing docs by just capitalising every tag name.
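
A minimal sketch of that rule, applied to the paragraph from the example: the contents of all-caps tags are handed to an inline renderer and the tag name is lowercased, while lowercase tags pass through untouched. `renderCapitalisedTags` and `toyInline` are hypothetical names, the toy renderer only knows code spans, and the regex does not handle nested tags of the same name.

```javascript
// Toy inline renderer standing in for a real Markdown parser: code spans only.
function toyInline(text) {
  return text.replace(/`([^`]+)`/g, "<code>$1</code>");
}

// Render the contents of ALL-CAPS tags as Markdown and lowercase the tag name.
// Lowercase tags and their contents are left exactly as written.
// Limitation of this sketch: nested tags with the same name are not handled.
function renderCapitalisedTags(html, renderInline) {
  return html.replace(
    /<([A-Z][A-Z0-9]*)(\s[^>]*)?>([\s\S]*?)<\/\1>/g,
    (_, tag, attrs = "", inner) => {
      const name = tag.toLowerCase();
      return `<${name}${attrs}>${renderInline(inner)}</${name}>`;
    }
  );
}

const input = "<p> # this is just text <SPAN> but `this is code` </SPAN></p>";
const output = renderCapitalisedTags(input, toyInline);
console.log(output);
// <p> # this is just text <span> but <code>this is code</code> </span></p>
```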


interesting approach…


Indeed, that’s a key point. I also think it may be OK to include this in the raw HTML notation section; the only thing I find less than optimal about that is the requirement to have HTML involved. This no-markdown thing doesn’t imply any involvement of HTML. If it were just a kind of Markdown notation instead of being HTML-based, we could get verbatim content in the output without it having to be wrapped in <span> or <div> tags. When using MathJax inside an HTML file, for example, you can have foo \( math \) bar instead of foo <span> \( math \) </span> bar, which is what you would get with a no-markdown thing based on raw HTML, even though the latter works for display purposes.


I was playing around today to see whether I could get commonmark to work with MathJax.

It was quite easy, since commonmark already has something to parse raw input:

// Parse a run of ordinary characters, or a single character with
// a special meaning in markdown, as a plain string, adding to inlines.
var parseString = function(inlines) {
  var m;
  if ((m = this.match(reMain))) {
    inlines.push({ t: 'Str', c: m });
    return m.length;
  } else {
    return 0;
  }
};

It also worked for \begin{align} etc. for me. I now use it in this project:

I’ve kind of stolen the layout of the editor on the CommonMark try page. I hope that this is no problem; I kind of like it :smile:
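
For readers wondering what an inline rule for math might look like, here is a standalone sketch. It is not the patch from the post above and does not use the commonmark.js API. It reuses the `{ t: 'Str', c: ... }` node shape from the snippet, while the 'Math' node type, `MATH_DELIMS`, `matchMath` and `splitInlines` are hypothetical names. The delimiter pairs are the ones mentioned earlier in this thread.

```javascript
// Delimiter pairs mentioned in this thread. Longer openers come first so that
// "$$" is tried before "$".
const MATH_DELIMS = [
  ["$$", "$$"],
  ["\\[", "\\]"],
  ["\\(", "\\)"],
  ["$", "$"],
];

// If a math span starts at `pos`, return it verbatim (delimiters included);
// otherwise return null so the ordinary inline rules can run.
function matchMath(subject, pos) {
  for (const [open, close] of MATH_DELIMS) {
    if (!subject.startsWith(open, pos)) continue;
    const end = subject.indexOf(close, pos + open.length);
    if (end !== -1) return subject.slice(pos, end + close.length);
  }
  return null;
}

// Split a line into verbatim 'Math' nodes and plain 'Str' nodes.
// 'Math' is a hypothetical node type used only in this sketch.
function splitInlines(subject) {
  const inlines = [];
  let pos = 0;
  let plain = "";
  while (pos < subject.length) {
    const math = matchMath(subject, pos);
    if (math === null) {
      plain += subject[pos];
      pos += 1;
      continue;
    }
    if (plain) inlines.push({ t: "Str", c: plain });
    inlines.push({ t: "Math", c: math });
    plain = "";
    pos += math.length;
  }
  if (plain) inlines.push({ t: "Str", c: plain });
  return inlines;
}

console.log(splitInlines("a $x_1$ b \\(y*z*\\)"));
```

Only the 'Str' nodes would go on to normal Markdown inline parsing; a 'Math' node would be written to the output as-is, so the underscore and asterisks inside it never become emphasis.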


But isn’t it hard to make Markdown support exactly the delimiter that MathJax expects? And if we do that, wouldn’t other languages need other delimiters, which would also need supporting?

Also, you can use anything inside CDATA, including the delimiters you need, right? E.g., doesn’t including ` work? Of course it isn’t very pretty, that’s true, and perhaps MathJax or whatever gets confused by the CDATA tags that stick around in the resulting HTML?


What’s being suggested is not related to MathJax specifically, as has been stressed all along.


Yes, that was exactly my point. Use a CDATA tag to suppress markdown parsing and inside use whatever magic sequence to trigger whatever further parsing you want?

I think I misinterpreted this part. I thought you meant that MathJax doesn’t use CDATA as a delimiter, so Markdown should use the same delimiter as MathJax. Reading it again, I think you mean that MathJax doesn’t recognize its normal delimiter ($...$ I think?) when it is wrapped inside CDATA, meaning that CDATA doesn’t actually help here (even when the “right” delimiters are included inside the CDATA tags). Did I get that right (this time)?