Ignore LaTeX-like Math mode (or parse it)

I write a lot of blog articles about computer science and mathematics, so I want formulas to be rendered the way TeX renders them. I don’t expect CommonMark to specify or implement this (although it would be nice). I would just appreciate it if CommonMark did not mess with blocks of TeX-like math.

Example

The source

This is a set: \(A = \{1,2\}\)

gets transformed into

<p>This is a set: (A = {1,2})</p>

whereas I would like it to be

<p>This is a set: \(A = \{1,2\}\)</p>

so that MathJax can grab the \(...\) and \[...\] blocks and render them.

What about the Dollar-Notation?

People who write about mathematics probably would like to use a notation like $A = \{1,2\}$ or $$A = \{1,2\}$$. However, as pointed out in a similar discussion, this is a problem for parsing.

For example, consider:

Let $y = m * x + b$ where $b$ is US$50,000

As dollar signs are probably too common in text written by non-mathematicians, I suggest simply not dealing with that. The notation \(...\) for inline mathematics and \[...\] for display mathematics should probably be used in LaTeX documents anyway (see Are \( and \) preferable to dollar signs for math mode?).

People do care about it

Does quoting the backslashes suffice?

This is a set: `\(A = \{1,2\}\)`

cmparser --inline-code-default-language=latex --inline-code-treatment=eval foo.md

``` !latex
\(A = \{1,2\}\)
```

It seems as if this would turn inline math into a code block.

What is the difference between !latex and latex? What is cmparser?

Does quoting the backslashes suffice?

If you want to know whether, instead of writing

This is a set: \(A = \{1,2\}\)

writing

This is a set: \\(A = \\{1,2\\}\\)

is enough, then yes, for this example it is enough. However, LaTeX uses a lot of backslashes, so doing this everywhere is not convenient. And convenience is the main reason why I use Markdown rather than writing HTML directly.
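For what it's worth, the doubling could be automated with a small preprocessing step. The following is only a sketch under assumptions: the name protect_math is made up, it only looks for \(...\) and \[...\] spans, and it does not know about code spans or backslashes that are already doubled.

```python
import re

# Hypothetical helper, not part of any CommonMark tool.
# Matches \( ... \) and \[ ... \] spans, including multi-line ones.
MATH_SPAN = re.compile(r"\\\(.+?\\\)|\\\[.+?\\\]", re.DOTALL)

def protect_math(source):
    # Double every backslash inside a math span so that the Markdown
    # backslash-escape step turns each pair back into a single backslash.
    return MATH_SPAN.sub(lambda m: m.group(0).replace("\\", "\\\\"), source)

print(protect_math(r"This is a set: \(A = \{1,2\}\)"))
```

This only moves the inconvenience into a build step; the source file stays readable, but every tool in the chain has to run the preprocessor first.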

The problem with the \(..\) and \[..\] forms is that \( and \[ already have clear and important meanings in CommonMark (and other Markdown versions). They are escaped parentheses and brackets. It’s very important to keep this behavior, or you’re left without an easy way to write literal special characters when this is needed.
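To make the escaping rule concrete, here is a minimal Python sketch of just the backslash-escape step. It assumes the rule that a backslash before any ASCII punctuation character yields that character literally. The function name unescape is made up for illustration, and a real parser also has to leave code spans alone, escape HTML, and so on.

```python
import re
import string

# Illustrative only: a backslash followed by ASCII punctuation is replaced
# by the punctuation character itself. string.punctuation is used here as
# the set of ASCII punctuation characters.
ESCAPE = re.compile(r"\\([" + re.escape(string.punctuation) + "])")

def unescape(text):
    # Drop the backslash, keep the escaped character.
    return ESCAPE.sub(r"\1", text)

print(unescape(r"This is a set: \(A = \{1,2\}\)"))
print(unescape(r"This is a set: \\(A = \\{1,2\\}\\)"))
```

The first call loses the math delimiters, and the second (with doubled backslashes) keeps them, which is the behavior described earlier in the thread.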

Hence I prefer the $ syntax. Pandoc has supported this for a decade now, with some simple heuristics that prevent unwanted capture of regular $ characters. This has worked just fine. It’s extremely uncommon to write things like US$50,000, and one can always escape the $ in cases like this. Anyway, my experience with pandoc is that this is not a pain point. People simply don’t complain about unwanted capturing of $ characters.

I’m sorry, I actually meant to elaborate a bit more.

Some implementations already support math blocks, written as a fenced code block whose info string is `math`.

The exclamation mark was proposed to differentiate between code to be parsed or simply syntax highlighted.
cmparser is just your hypothetical CM parser that supports this.
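As a rough illustration of the proposal, a parser could inspect the first word of the fence's info string. Everything here is hypothetical: the function name parse_info_string and the meaning of the leading exclamation mark come only from the suggestion above, not from any existing implementation.

```python
def parse_info_string(info):
    # Hypothetical convention from this thread: "!latex" means "evaluate
    # the block as LaTeX", while plain "latex" means "syntax-highlight only".
    stripped = info.strip()
    word = stripped.split(maxsplit=1)[0] if stripped else ""
    evaluate = word.startswith("!")
    return word.lstrip("!"), evaluate

print(parse_info_string(" !latex"))
print(parse_info_string("latex"))
```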

Agreed, we used the $ method at Stack Exchange for the sites that needed MathJax, and it worked great; nobody complained.

https://blog.stackoverflow.com/2011/04/stack-exchange-partners-with-mathjax/

There are several dialects for expressing math equations. I know LaTeX, MathML, and AsciiMath.

$$ looks attractive, but I’d like the possibility to select the math language mode. For non-tech people, LaTeX is too complex. Maybe I’m wrong and the problem can be solved by a WYSIWYG editor, but I don’t know of any good open-source one for the web or for desktop text editors.

Is there an extension to CommonMark that implements the Pandoc (well, TeX) style for math mode?

@jgm you mentioned in a previous thread the heuristics Pandoc uses to parse the $ syntax. In particular, you require that there is no whitespace right after the first delimiter and right before the last delimiter.

I was wondering if there is a particular reason for this, a particular case. StackEdit and StackExchange allow the whitespace, and a math guy who uses these sites extensively has told me that having an extra space next to the $ helps readability.

Is this something that could be relaxed without breaking anything? I can’t think of a test case where it would break.

Update: actually I withdraw the question. The example in the Pandoc docs, which I overlooked, $20,000 and $30,000, demonstrates it perfectly. And StackEdit/Exchange does indeed not handle it correctly, though I’m guessing it’s not something they run into often.
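For anyone who wants to experiment, here is a rough Python approximation of the inline rule as I understand it from the Pandoc docs: the opening $ must be followed by a non-space character, and the closing $ must be preceded by a non-space character and not be followed by a digit. This is a regex sketch, not Pandoc's actual implementation; the names are made up, and display math with $$ is out of scope.

```python
import re

# Approximation of the heuristic, for experimentation only.
INLINE_MATH = re.compile(
    r"(?<!\\)\$(?=\S)"      # opening $: not escaped, next char is not a space
    r"((?:\\.|[^$\\])+?)"   # body: escaped characters or anything except $
    r"(?<=\S)\$(?!\d)"      # closing $: previous char not a space, no digit after
)

def find_inline_math(text):
    # Return the bodies of all spans that would be treated as inline math.
    return [m.group(1) for m in INLINE_MATH.finditer(text)]

print(find_inline_math("Let $y = m * x + b$ where $b$ is US$50,000"))
print(find_inline_math("$20,000 and $30,000"))
```

With these rules the currency amounts are left alone, because the second $ in "$20,000 and $30,000" has a space in front of it and so cannot close a span.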

Note that pandoc does allow spaces after $$ for display math, since $$ isn’t likely to occur in other contexts. So you can do

$$
e=mc^2
$$

for better readability of displayed formulas.
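A matching sketch for display math is simpler, since whitespace inside the delimiters is allowed. Again this is only an illustration with made-up names, not how pandoc is implemented:

```python
import re

# Illustrative only: anything between an unescaped $$ and the next $$,
# possibly spanning several lines.
DISPLAY_MATH = re.compile(r"(?<!\\)\$\$(.+?)\$\$", re.DOTALL)

def find_display_math(text):
    # Strip the surrounding whitespace that is there only for readability.
    return [m.group(1).strip() for m in DISPLAY_MATH.finditer(text)]

print(find_display_math("$$\ne=mc^2\n$$"))
```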