Current rule on backslash escapes (6.1) conflicts with stated "Overriding Design Goal"

The current spec reads:

Any ASCII punctuation character may be backslash-escaped

With ASCII punctuation characters being:

!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

(CommonMark Spec)
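To make the rule concrete, here is a minimal Python sketch of what it implies for a parser. It is not taken from any CommonMark implementation; the names `ASCII_PUNCTUATION` and `unescape` are made up for illustration. It also ignores the contexts where backslash escapes do not apply (code spans, code blocks, autolinks, raw HTML). Python's `string.punctuation` happens to contain the same 32 ASCII punctuation characters the spec lists.

```python
import re
import string

# Assumption for this sketch: string.punctuation matches the spec's
# set of ASCII punctuation characters (32 characters).
ASCII_PUNCTUATION = string.punctuation

# A backslash followed by exactly one ASCII punctuation character.
_ESCAPE = re.compile(r"\\([" + re.escape(ASCII_PUNCTUATION) + r"])")


def unescape(text: str) -> str:
    """Replace each backslash + ASCII punctuation pair with the bare character.

    A backslash before any other character is left untouched.
    """
    return _ESCAPE.sub(r"\1", text)
```

The point of the sketch is that the rule needs no per-character knowledge of the syntax: one fixed character class covers every case.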

This allows for backslash escaping in more situations than strictly necessary, along two dimensions:

  • Some of these characters do not have any syntactical meaning within CM and therefore never require escaping. Examples (as far as I can tell): ^ : ; and others

  • Other characters are meaningful, but the spec allows escaping even in situations where it is not necessary. The most common example is quotation marks: they frequently appear as literal characters, but escaping is only necessary inside link titles. Even there, choosing a different delimiter (single quotes, double quotes and parentheses are all allowed) almost always avoids the need to escape.
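As a rough illustration of the second point, a formatter could pick the link-title delimiter that needs no escaping and fall back to backslashes only when all three delimiters occur in the title. This is a sketch under assumptions, not how any existing tool works: the function name `quote_link_title` is hypothetical, and the title is assumed to contain no backslashes and no blank lines.

```python
def quote_link_title(title: str) -> str:
    """Wrap a link title in a delimiter that avoids backslash escapes.

    Assumes the title contains no backslashes and no blank lines.
    """
    if '"' not in title:
        return f'"{title}"'
    if "'" not in title:
        return f"'{title}'"
    if "(" not in title and ")" not in title:
        return f"({title})"
    # All three delimiters occur in the title: escape the double quotes.
    return '"' + title.replace('"', '\\"') + '"'
```

Only the last branch ever emits a backslash, which matches the claim that escaping is almost always avoidable here.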

The current spec presumably intends to reduce the complexity of parsers and formatters. Unfortunately, they eagerly chose to use this freedom at the cost of Markdown’s core idea:

The overriding design goal for Markdown’s formatting syntax is to make it as readable as possible.

(Spec: What is Markdown?)

Case in point: Calibre’s ebook-convert escapes all occurrences of punctuation characters. Example:

…opposition was fierce \(\"You\'re stupid\!\" one protester shouted\)

While one could argue that Calibre is just plain wrong, the author defends this behaviour with reference to the original spec, which is identical to CommonMark’s current rule 6.1 in all but the somewhat smaller set of characters it is applied to: Bug #1840567 “exclamation mark not converted in markdown” : Bugs : calibre

Matthias Winkelmann via CommonMark Discussion <noreply@talk.commonmark.org> writes:

The current spec presumably intends to reduce the complexity of parsers and formatters. Unfortunately, they eagerly chose to use this freedom at the cost of Markdown’s core idea:

No, it’s very easy for computers to remember arbitrary lists of
characters that need escaping. But it’s hard for humans.

The intent is to reduce the complexity of the mental model you
need to keep in your head when you’re writing CommonMark. With
the original Markdown, you always had to think, “Is this one of
the special characters that can be backslash-escaped?” And this
becomes harder if you’re using extensions that assign special
meanings to more symbols. Better to have a simple rule that’s
easy to remember.

The overriding design goal for Markdown’s formatting syntax is to make it as readable as possible.

Case in point: Calibre’s ebook-convert escapes all occurrences of punctuation characters. Example:

…opposition was fierce \(\"You\'re stupid\!\" one protester shouted\)

While one could argue that Calibre is just plain wrong, the author defends this behaviour with reference to the original spec, which is identical to CommonMark’s current rule 6.1 in all but the somewhat smaller set of characters it is applied to: Bug #1840567 “exclamation mark not converted in markdown” : Bugs : calibre

The output is legal but not optimal for reading. It’s not the
intent of the spec to permit only optimal documents. For example,
I think that it’s more readable not to use “lazy” line wrapping
in lists and block quotes, but the spec allows it (as did
original Markdown).
