No-markdown islands

mofosyne · September 17, 2014, 8:50am

One approach could be to have a general directive like this in the beginning of the page.

!!!ignore
    latex: $ ... $
    C: /* ... */ 
!!!

$ 1+1 = 2 $
/* hello world */

The good thing about this approach is that it is easy to refer to and can be adapted as needed.

Also, the above is local to the page only, but I’m sure it would be possible to create a sitewide header for declaring these (e.g. for a website handling maths tutorials). And perhaps it could be baked into core if used often enough by everyone, like LaTeX `$ … $`.
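A minimal sketch of how a parser might honour such a header, assuming the declared delimiter pairs have already been read into an array. The `splitIslands` name and the segment shape are hypothetical, and escaping and nesting are ignored here.

```javascript
// Hypothetical helper: split `text` into markdown and verbatim segments,
// given the delimiter pairs an `!!!ignore` header might declare.
function splitIslands(text, pairs) {
  const segments = [];
  let pos = 0;
  while (pos < text.length) {
    // Find the earliest terminated island starting at or after `pos`.
    let best = null;
    for (const [open, close] of pairs) {
      const start = text.indexOf(open, pos);
      if (start === -1) continue;
      const end = text.indexOf(close, start + open.length);
      if (end === -1) continue; // unterminated: leave it as ordinary text
      if (best === null || start < best.start) {
        best = { start, end: end + close.length };
      }
    }
    if (best === null) break;
    if (best.start > pos) {
      segments.push({ type: "markdown", text: text.slice(pos, best.start) });
    }
    // The island keeps its delimiters, so it reaches the output untouched.
    segments.push({ type: "verbatim", text: text.slice(best.start, best.end) });
    pos = best.end;
  }
  if (pos < text.length) {
    segments.push({ type: "markdown", text: text.slice(pos) });
  }
  return segments;
}

const pairs = [["$", "$"], ["/*", "*/"]];
const segments = splitIslands("a *b* $ 1*1 = 1 $ c /* *x* */", pairs);
```

Only the segments marked `markdown` would be handed to the Markdown parser; the `verbatim` ones would be copied through as written.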

mightymax · September 19, 2014, 1:31pm

This is the key point: who knows what kind of other parsing scripts will conflict with Markdown in the future? Having a simple “no Markdown” notation would make forward compatibility much simpler and possibly reduce the need for extensions to the Markdown parser.

I still think that the best bet would be to somehow combine this with the raw HTML notation, since the whole point of that is for Markdown to pass it along unaltered.

mofosyne · September 19, 2014, 2:43pm

mightymax, this could be your solution. Use markdown=1 flags in markdown (where all html tags are by default markdown=0)

<div markdown="1"> 
This is *true* markdown text.
</div>

src: PHP Markdown Extra

I’ll just point out that the markdown="1" trick should be credited to John Gruber. It was his plan to incorporate this into Markdown at some point (and if I recall well, one of the 1.0.2 betas had the feature enabled, but it was removed in later betas because it had issues).

src: Markdown within block-level elements

carlsmith · September 19, 2014, 4:58pm

It’d be a lot cleaner if the parser parsed the guts of capitalised tag names, and converted them to lowercase. This example looks messy, but in practice, you wouldn’t need to use the feature very often.

<SECTION>
# this is a header one
<p> # this is just text <SPAN> but `this is code` </SPAN></p>
</SECTION>

…becomes, more or less…

<section>
<h1> This is a header one </h1>
<p> # this is just text <span> but <code>this is code</code> </span></p>
</section>

This is not exactly backwards compatible, but you could easily convert existing docs by just capitalising every tag name.

pepper_chico · September 21, 2014, 6:46pm

interesting approach…

pepper_chico · September 21, 2014, 6:56pm

Indeed, that’s a key point. I also think it may be OK if included in the raw HTML notation section; the only thing I find suboptimal is the requirement to have HTML involved. This no-markdown thing doesn’t imply involvement of HTML. If it were just a kind of Markdown notation, instead of HTML-based, we could get verbatim content in the output without it having to be wrapped in <span> or <div> tags. When using MathJax inside an HTML file, for example, you can have foo $ math $ bar instead of foo <span> $ math $ </span> bar, which is what you would get with a no-markdown feature based on raw HTML, even though that works for display purposes.

Kasper · September 23, 2014, 3:42pm

I was playing around today to see if I could get commonmark to work with MathJax.

And it was quite easy; commonmark already has something to parse raw input:

// Parse a run of ordinary characters, or a single character with
// a special meaning in markdown, as a plain string, adding to inlines.
var parseString = function(inlines) {
  var m;
  if ((m = this.match(reMain))) {
    inlines.push({ t: 'Str', c: m });
    return m.length;
  } else {
    return 0;
  }
};

It also worked for \begin align etc. for me. I use it now at this project:
http://kasperpeulen.github.io/CoffeeTeX/

I’ve kind of stolen the layout of the editor on the CommonMark try page. I hope that this is no problem; I kind of like it.

Matthijs_Kooijman · October 6, 2014, 9:05pm

But isn’t it hard to let Markdown support exactly the delimiter that MathJax expects? And if we do that, other languages would need other delimiters, which would also need supporting?

Also, you can use anything inside CDATA, including the delimiters you need, right? E.g., doesn’t including ` work? Of course it isn’t very pretty, that’s true, and perhaps MathJax or whatever gets confused by the CDATA tags that stick around in the resulting HTML?

pepper_chico · October 6, 2014, 9:19pm

What’s being suggested is not related to MathJax specifically, as has been stressed all along.

Matthijs_Kooijman · October 7, 2014, 5:56am

Yes, that was exactly my point. Use a CDATA tag to suppress markdown parsing and inside use whatever magic sequence to trigger whatever further parsing you want?

I think I misinterpreted this part. I thought you meant that MathJax doesn’t use CDATA as a delimiter, so Markdown should use the same delimiter as MathJax. Reading it again, I think you mean that MathJax doesn’t recognize its normal delimiter ( $...$ I think?) when it is wrapped inside CDATA, meaning that CDATA doesn’t actually help here (even when the “right” delimiters are included inside the CDATA tags). Did I get that right (this time)?

pepper_chico · October 7, 2014, 6:14am

You got that CDATA can’t work, yes.

pepper_chico · October 7, 2014, 6:21am

If you get this, you understand the proposal.

Matthijs_Kooijman · October 7, 2014, 6:35am

Yup, I see it now. Basically what you’re saying is needed is a “no markdown” syntax that does not leave any trace of itself in the output (unlike the current HTML no markdown syntax). Makes sense to me.
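One way to picture a syntax that leaves no trace: mask the islands before rendering and restore them afterwards. This is only a sketch under loud assumptions: `renderEmphasis` is a toy stand-in for a real renderer, and the `$ ... $` island pattern and NUL-based placeholders are choices made for this example, not anything proposed in the thread.

```javascript
// Toy stand-in for a real Markdown renderer: only handles *emphasis*.
function renderEmphasis(text) {
  return text.replace(/\*([^*]+)\*/g, "<em>$1</em>");
}

// Hypothetical island syntax for this sketch: $ ... $ on a single line.
const ISLAND = /\$[^$\n]*\$/g;

function renderWithIslands(source, render) {
  const islands = [];
  // Swap each island for a placeholder the renderer will not touch.
  const masked = source.replace(ISLAND, (match) => {
    islands.push(match);
    return "\u0000" + (islands.length - 1) + "\u0000";
  });
  // Render, then put the islands back exactly as they were written,
  // with no wrapping <span> or <div> around them.
  return render(masked).replace(
    /\u0000(\d+)\u0000/g,
    (_, index) => islands[Number(index)]
  );
}

const html = renderWithIslands("foo *bar* $ a*b*c $ baz", renderEmphasis);
```

Here the `*b*` inside the island is not turned into emphasis, and nothing in the output marks where the island was.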

pepper_chico · October 7, 2014, 6:52am

Yes, I’m not even sure whether there’s any HTML syntax that’s able to cover no-markdown, even when outputting some surrounding HTML (except for CDATA). I feel that not involving HTML for this is a good thing, because it could then ease the production of other kinds of output beyond HTML. For example, one could have mainly CommonMark content, with some short pieces of LaTeX (through no-markdown), in a source file meant to produce LaTeX output, not HTML.

vitaly · October 7, 2014, 10:08am

The problem is what to do with the presentation of those islands. For example, you may prefer MathML, I may prefer AsciiMath, and someone else LaTeX. We will end up in a situation where the spec can’t guarantee the same output.

So I think that such islands, without a content type marker, will be useless in practice. Instead of working with a generic form, I’d prefer to solve particular tasks. Right now, that is how to embed math/LaTeX inline and as a block. If we reserve syntax for a particular case, we can make that syntax more user-friendly.

mofosyne · October 7, 2014, 10:26am

I would push for this to be done via a generic directive like the ones below, though it’s probably not the prettiest syntax compared to something native like HTML CDATA.

!nomarkup[ ...etc... ]

!!!nomarkup
  ...etc ...
!!!
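A rough sketch of how both forms could be recognised, assuming the block form sits on lines of its own and the inline form contains no `]`. The `extractNomarkup` name and the returned shape are made up for illustration; escaping and nesting are left unspecified, as in the post.

```javascript
// Sketch only: collects the contents of `!nomarkup[...]` (inline) and
// `!!!nomarkup` ... `!!!` (block) islands from a source string.
function extractNomarkup(source) {
  const islands = [];
  let block = null; // array of lines while inside a block, else null
  for (const line of source.split("\n")) {
    if (block !== null) {
      if (line.trim() === "!!!") {
        // Closing marker: the collected lines form one verbatim island.
        islands.push({ kind: "block", text: block.join("\n") });
        block = null;
      } else {
        block.push(line);
      }
    } else if (line.trim() === "!!!nomarkup") {
      block = [];
    } else {
      // Inline form: assumes the island itself contains no "]".
      for (const m of line.matchAll(/!nomarkup\[([^\]]*)\]/g)) {
        islands.push({ kind: "inline", text: m[1] });
      }
    }
  }
  // An unterminated block is silently dropped in this sketch.
  return islands;
}

const found = extractNomarkup(
  "a !nomarkup[ $x*y$ ] b\n!!!nomarkup\n  *raw*\n!!!\nc"
);
```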

Btw, vitaly, will you support $$ latex math block at least?

vitaly · October 7, 2014, 10:41am

Personally, I’d prefer to use $$ for AsciiMath. But I’ll do any syntax that falls into the spec. I really need math. The only reason it’s not implemented yet in remarkable is that I’m waiting until math markup becomes official. Many people need it, and I hope that this question will not be postponed for a long time.

pepper_chico · October 7, 2014, 4:08pm

I don’t get what you mean by not having a guarantee of output presentation. Are you making the same argument as mb21?

This is just about getting what’s in the islands verbatim in the output. It’s up to the person editing the file to know what their CommonMark source is for (producing HTML, LaTeX, or whatever) and then to know the best use of the no-markdown literals, be it for outputting some pieces of MathJax, or just raw LaTeX (which doesn’t need surrounding $…$), etc.

mofosyne · October 7, 2014, 4:19pm

Would it be viable to specify an expression of what to ignore in a document declaration? E.g. like No-markdown islands

@vitaly would it be a plausible approach to allow users to specify perhaps a regular expression of what to ignore (definable perhaps in a YAML document header)? I think it would provide the flexibility for @pepper_chico and others to deal with any variety of syntax that needs to be passed through verbatim (thus could be for futureproofing as well).
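As a sketch of the regular-expression idea, assume the document header has already been parsed into a plain object. The `ignore` key and the `findIgnoredSpans` helper are hypothetical and not part of any spec or library.

```javascript
// Assumed shape of an already-parsed document header; the `ignore` key
// is invented for this example.
const header = { ignore: ["\\$[^$]*\\$", "/\\*[\\s\\S]*?\\*/"] };

// Return the non-overlapping spans of `text` matched by any pattern.
function findIgnoredSpans(text, patterns) {
  const spans = [];
  for (const source of patterns) {
    const re = new RegExp(source, "g");
    for (const m of text.matchAll(re)) {
      spans.push({ start: m.index, end: m.index + m[0].length, text: m[0] });
    }
  }
  // Sort by position and drop any span that overlaps an earlier one.
  spans.sort((a, b) => a.start - b.start);
  const result = [];
  let lastEnd = 0;
  for (const span of spans) {
    if (span.start >= lastEnd) {
      result.push(span);
      lastEnd = span.end;
    }
  }
  return result;
}

const spans = findIgnoredSpans("x $ a_b $ y /* _c_ */", header.ignore);
```

Everything outside the returned spans would go through normal Markdown parsing. User-supplied patterns can be slow or surprising, which is part of the extra complexity this approach brings.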

pepper_chico · October 7, 2014, 4:43pm

IMO, I don’t have a problem with that; it’ll work too, although it adds an extra bit of complexity. My suggestion for avoiding clashes between delimiters and content follows the raw string literals that exist in programming languages like C++ and Rust (for which I have provided some samples already).

With the front matter approach, you get a more readable source file: when reading the CommonMark file, you can readily recognise what each island is for, because the islands are self-descriptive even though they all do the same thing for output. With a single kind of literal, it is readily recognisable as a CommonMark no-markdown literal, and no front matter configuration is needed.
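To illustrate why raw-string-style delimiters avoid clashes, here is a small sketch that borrows Rust's `r#"..."#` shape purely as an example. The thread does not fix a concrete CommonMark syntax, and `readRawLiteral` is a made-up name.

```javascript
// Read one raw literal starting at `start`. The writer picks as many "#"
// marks as needed so that the closing sequence never appears in the content.
function readRawLiteral(text, start) {
  const open = /^r(#*)"/.exec(text.slice(start));
  if (!open) return null; // no literal begins here
  const hashes = open[1];
  const bodyStart = start + open[0].length;
  // The closer is a quote followed by the same number of "#" marks.
  const closer = '"' + hashes;
  const bodyEnd = text.indexOf(closer, bodyStart);
  if (bodyEnd === -1) return null; // unterminated literal
  return {
    content: text.slice(bodyStart, bodyEnd),
    end: bodyEnd + closer.length,
  };
}

const literal = readRawLiteral('r##"say "# here"## tail', 0);
```

Because the opener uses two `#` marks, the `"#` inside the content does not end the literal; only `"##` does.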