No-markdown islands

IMHO this is part of a more generic problem: the format for inline extensions. Block extensions are now discussed in several threads, but that markup is not suitable for inline elements (in paragraphs and lists).

In practice, inline elements have a different philosophy: syntax should be as simple as possible. So it would be more convenient for users to have a different marker for each case than one universal marker with a name and params.

Does a <![CDATA[]]> section fit your definition of a no-markdown island?

You can put almost anything you want in one, except maybe ]]>. There seem to be a few fine points to work out around them, but they are already in the spec.

I was testing in the block context and found a spec issue, see: Drawing a distinction between HTML block elements and non-element tags. We may need to pay similar attention to the difference between Raw HTML and non-element tags like CDATA.

No, it doesn’t fit since the CDATA section ends up in the rendered output wrapping its contents. MathJax LaTeX, for example, is not expected to be wrapped by this in the output.

I see. So it sounds like this topic belongs in the Extensions category.

Indeed, but I’d like to point out that the no-markdown literal I’m suggesting can encompass, but is not limited to, MathJax. It’s a simpler feature for asking the parser not to parse the contents and just output them verbatim. So, let’s say the delimiters were ''': then _foo_ '''$E = mc^2$''' *bar* would make $E = mc^2$ go to the output without being parsed, leaving the $...$ part for MathJax to handle.

Maybe, but, at least for me, this is such a must-have for the spec that I’m sorry it wasn’t included in Markdown from day one.
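
To make the idea concrete, here is a minimal sketch of how such an island could be handled before inline parsing. Everything in it is an assumption for illustration: the ''' delimiter is only the hypothetical one from this post, and `splitRawIslands`, `toyInline` and `render` are made-up names. `toyInline` is a stand-in for a real inline parser and only handles simple emphasis.

```javascript
// Hypothetical delimiter taken from the post above; not part of any spec.
const RAW_DELIM = "'''";

// Split a string into segments: text between delimiters is marked raw,
// everything else is left for normal Markdown parsing.
function splitRawIslands(input) {
  const segments = [];
  let pos = 0;
  while (pos < input.length) {
    const open = input.indexOf(RAW_DELIM, pos);
    if (open === -1) break;
    const close = input.indexOf(RAW_DELIM, open + RAW_DELIM.length);
    if (close === -1) break; // unclosed delimiter: treat the rest as Markdown
    if (open > pos) segments.push({ raw: false, text: input.slice(pos, open) });
    segments.push({ raw: true, text: input.slice(open + RAW_DELIM.length, close) });
    pos = close + RAW_DELIM.length;
  }
  if (pos < input.length) segments.push({ raw: false, text: input.slice(pos) });
  return segments;
}

// Toy stand-in for an inline parser: turns _x_ and *x* into <em>x</em>.
function toyInline(text) {
  return text.replace(/([_*])(\S(?:.*?\S)?)\1/g, "<em>$2</em>");
}

// Raw segments go to the output untouched and without any wrapper.
function render(input) {
  return splitRawIslands(input)
    .map((s) => (s.raw ? s.text : toyInline(s.text)))
    .join("");
}

console.log(render("_foo_ '''$E = mc^2$''' *bar*"));
console.log(render("'''$a*b*c$'''"));
```

The second call shows why the island matters: without it, the toy parser would turn `*b*` inside the formula into emphasis, while inside the island the formula reaches the output unchanged and without any wrapper.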

One approach could be to have a general directive like this in the beginning of the page.

!!!ignore
    latex: $ ... $
    C: /* ... */ 
!!!

$ 1+1 = 2 $
/* hello world */

The good thing about this approach is that it is easy to refer to and can be adapted as needed.

Also, the above is local to the page only, but I’m sure it would be possible to create a sitewide header for declaring these (e.g. for a website hosting maths tutorials). And perhaps a rule could be baked into core if it is used often enough by everyone, like LaTeX’s `$ … $`.
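
A rough sketch of how a parser might read such a directive, assuming the header sits at the very top of the page and each rule line has the form `name: OPEN ... CLOSE`. The function names `parseIgnoreHeader` and `splitByRules` are hypothetical, and the matched spans keep their delimiters so that a later tool such as MathJax still sees them.

```javascript
// Read a leading "!!!ignore ... !!!" block (hypothetical syntax from the post)
// and return the declared delimiter pairs plus the remaining body.
function parseIgnoreHeader(source) {
  const lines = source.split("\n");
  if (lines[0].trim() !== "!!!ignore") return { rules: [], body: source };
  const rules = [];
  let i = 1;
  for (; i < lines.length && lines[i].trim() !== "!!!"; i++) {
    // Each rule line is assumed to look like "name: OPEN ... CLOSE".
    const m = /^\s*(\w+):\s*(\S+)\s+\.\.\.\s+(\S+)\s*$/.exec(lines[i]);
    if (m) rules.push({ name: m[1], open: m[2], close: m[3] });
  }
  if (i === lines.length) return { rules: [], body: source }; // header never closed
  return { rules, body: lines.slice(i + 1).join("\n") };
}

// Split the body into verbatim spans (delimiters included) and Markdown spans.
function splitByRules(body, rules) {
  const out = [];
  let pos = 0;
  while (pos < body.length) {
    // Pick the earliest opening delimiter that also has a matching close.
    let best = null;
    for (const r of rules) {
      const start = body.indexOf(r.open, pos);
      if (start === -1) continue;
      const end = body.indexOf(r.close, start + r.open.length);
      if (end === -1) continue;
      if (best === null || start < best.start) {
        best = { start, end: end + r.close.length, rule: r.name };
      }
    }
    if (best === null) break;
    if (best.start > pos) out.push({ verbatim: false, text: body.slice(pos, best.start) });
    out.push({ verbatim: true, rule: best.rule, text: body.slice(best.start, best.end) });
    pos = best.end;
  }
  if (pos < body.length) out.push({ verbatim: false, text: body.slice(pos) });
  return out;
}

const page = [
  "!!!ignore",
  "    latex: $ ... $",
  "    C: /* ... */",
  "!!!",
  "",
  "$ 1+1 = 2 $",
  "/* hello world */",
].join("\n");

const { rules, body } = parseIgnoreHeader(page);
const spans = splitByRules(body, rules);
console.log(spans.filter((s) => s.verbatim));
```

A sitewide header could presumably feed the same `rules` array from a config file instead of the page itself.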

This is the key point: who knows what kind of other parsing scripts will conflict with Markdown in the future? Having a simple “no Markdown” notation would make forward compatibility much simpler and possibly reduce the need for extensions to the Markdown parser.

I still think that the best bet would be to somehow combine this with the raw HTML notation, since the whole point of that is for Markdown to pass it along unaltered.

mightymax, this could be your solution: use the markdown="1" attribute on HTML tags (all HTML tags default to markdown="0").

<div markdown="1"> 
This is *true* markdown text.
</div>

src: PHP Markdown Extra

I’ll just point out that the markdown="1" trick should be credited to John Gruber. It was his plan to incorporate this into Markdown at some point (and if I recall correctly, one of the 1.0.2 betas had the feature enabled, but it was removed in later betas because it had issues).

src: Markdown within block-level elements

It’d be a lot cleaner if the parser parsed the contents of elements with capitalised tag names as Markdown, and converted the tag names to lowercase. This example looks messy, but in practice, you wouldn’t need to use the feature very often.

<SECTION>
# this is a header one
<p> # this is just text <SPAN> but `this is code` </SPAN></p>
</SECTION>

…becomes, more or less…

<section>
<h1> This is a header one </h1>
<p> # this is just text <span> but <code>this is code</code> </span></p>
</section>

This is not exactly backwards compatible, but you could easily convert existing docs by just capitalising every tag name.
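
The tag-level part of this could look roughly like the sketch below. It is only an assumption about how the rule might be applied: `normalizeTag` is a made-up helper that takes a single tag, lowercases its name, and reports whether the name was written in capitals (meaning the contents should be parsed as Markdown). Walking the document and parsing the contents is left out.

```javascript
// Inspect one HTML tag. A tag whose name is written entirely in capitals is
// taken to mean "parse my contents as Markdown"; the name is lowercased on output.
function normalizeTag(tag) {
  const m = /^<(\/?)([A-Za-z][A-Za-z0-9]*)([^>]*)>$/.exec(tag);
  if (!m) return null; // not a simple tag
  const name = m[2];
  const parseContents = /[A-Z]/.test(name) && name === name.toUpperCase();
  return { html: `<${m[1]}${name.toLowerCase()}${m[3]}>`, parseContents };
}

console.log(normalizeTag("<SECTION>"));
console.log(normalizeTag("<p>"));
console.log(normalizeTag("</SPAN>"));
```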

interesting approach…

Indeed, that’s a key point. I also think it may be OK if it is included in the raw HTML notation section; the only thing I find suboptimal is the requirement to have HTML involved. This no-markdown thing doesn’t imply any involvement of HTML. If it were just a kind of Markdown notation, instead of being HTML-based, we could get verbatim content in the output without it having to be wrapped by <span> or <div> tags. When using MathJax, for example, inside an HTML file, you can have foo \( math \) bar instead of foo <span> \( math \) </span> bar, which is what you would get with a no-markdown thing based on raw HTML, even though that works for display purposes.

I was playing around today to see whether I could get commonmark to work with MathJax.

And it was quite easy: commonmark already has something to parse raw input:

// Parse a run of ordinary characters, or a single character with
// a special meaning in markdown, as a plain string, adding to inlines.
var parseString = function(inlines) {
  var m;
  if ((m = this.match(reMain))) {
    inlines.push({ t: 'Str', c: m });
    return m.length;
  } else {
    return 0;
  }
};

It also worked for \begin{align} etc. for me. I now use it in this project:
http://kasperpeulen.github.io/CoffeeTeX/

I’ve kind of stolen the layout of the editor on the CommonMark try page. I hope that’s not a problem; I kind of like it :smile:

But isn’t it hard to make Markdown support exactly the delimiter that MathJax expects? And if we do that, other languages would need other delimiters, which would also need supporting?

Also, you can use anything inside CDATA, including the delimiters you need, right? E.g., doesn’t including ` work? Of course it isn’t very pretty, that’s true, and perhaps MathJax or whatever gets confused by the CDATA tags that stick around in the resulting HTML?

What’s being suggested is not related to MathJax specifically, as has been stressed all along.

Yes, that was exactly my point. Use a CDATA tag to suppress markdown parsing and inside use whatever magic sequence to trigger whatever further parsing you want?

I think I misinterpreted this part. I thought you meant that MathJax doesn’t use CDATA as a delimiter, so Markdown should use the same delimiter as MathJax. Reading it again, I think you mean that MathJax doesn’t recognize its normal delimiter ($...$ I think?) when it is wrapped inside CDATA, meaning that CDATA doesn’t actually help here (even when the “right” delimiters are included inside the CDATA tags). Did I get that right (this time)?

You got that CDATA can’t work, yes.

If you get this, you understand the proposal.

Yup, I see it now. Basically what you’re saying is needed is a “no markdown” syntax that does not leave any trace of itself in the output (unlike the current HTML no markdown syntax). Makes sense to me.

Yes, I’m not even sure whether there’s any HTML syntax that’s able to cover no-markdown, even when outputting some surrounding HTML (except for CDATA). I feel that not involving HTML for this is a good thing because it could then ease the production of other kinds of output beyond HTML. For example, one could have mainly CommonMark content, with some short pieces of LaTeX (through no-markdown), for a source file meant to produce LaTeX output, not HTML.
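
To illustrate the last point, here is a small sketch of one parsed document feeding two output formats. The node shape `{ t, c }` mirrors the commonmark snippet quoted earlier, but the `Raw` node type, the `render` function and both escape helpers are assumptions made up for this example, not part of commonmark.

```javascript
const escapeHtml = (s) =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

// Characters LaTeX treats specially in running text; three need named macros.
const LATEX_NAMED = {
  "\\": "\\textbackslash{}",
  "~": "\\textasciitilde{}",
  "^": "\\textasciicircum{}",
};
const escapeLatex = (s) =>
  s.replace(/[\\{}$&#%_^~]/g, (ch) => LATEX_NAMED[ch] || "\\" + ch);

// Render a list of inline nodes. "Raw" (hypothetical) is the no-markdown
// island: it is emitted verbatim in every format, with no wrapper.
function render(nodes, format) {
  const escape = format === "latex" ? escapeLatex : escapeHtml;
  return nodes
    .map((n) => {
      switch (n.t) {
        case "Str":
          return escape(n.c);
        case "Raw":
          return n.c;
        case "Emph": {
          const inner = render(n.c, format);
          return format === "latex" ? `\\emph{${inner}}` : `<em>${inner}</em>`;
        }
        default:
          throw new Error("unknown node type: " + n.t);
      }
    })
    .join("");
}

const doc = [
  { t: "Str", c: "Energy & mass: " },
  { t: "Raw", c: "$E = mc^2$" },
  { t: "Str", c: " is " },
  { t: "Emph", c: [{ t: "Str", c: "famous" }] },
];

console.log(render(doc, "html"));
console.log(render(doc, "latex"));
```

In the HTML output the formula is left for MathJax to pick up, and in the LaTeX output it is already valid math, so the same island serves both targets.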