A proposal to support the <mark> tag with Markdown

As I was thinking about the <mark> tag, I realized that it could be supported without
a Markdown syntax change, by taking advantage of the current redundancy between
underscores _ and asterisks *. I wonder what others think about this. I have explained
my reasoning in more detail in this blog post:

http://www.jandecaluwe.com/blog/mark-tag-support.html

No change whatsoever would be required, neither in the reference implementation nor in the specification, if only:

  1. the specification required that distinctions like this one (was “__” used or “**”? was “_” used or “*”? and so on) be represented in the parser’s output, and
  2. the CommonMark DTD provided appropriate attributes to hold this information.

Then the “application” or “renderer” (the piece of software that turns the parser’s result into HTML, HTML5, XHTML, DocBook, LaTeX, or whatever) could in effect map “__” to <mark> or to whatever is desired.
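To make the idea concrete, here is a minimal Python sketch of a renderer that picks the output tag from the delimiter recorded on the node. It is not the reference implementation's AST or API: the `Text` and `Emphasis` classes, the `delimiter` attribute, and the `render` function are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List, Union


@dataclass
class Text:
    literal: str


@dataclass
class Emphasis:
    strong: bool      # True for ** / __, False for * / _
    delimiter: str    # hypothetical attribute: "*" or "_", as written in the source
    children: List["Node"] = field(default_factory=list)


Node = Union[Text, Emphasis]

# Default mapping: the delimiter character makes no difference.
DEFAULT_TAGS = {
    (False, "*"): "em",
    (False, "_"): "em",
    (True, "*"): "strong",
    (True, "_"): "strong",
}


def render(node, tags=DEFAULT_TAGS):
    """Render a node to HTML, choosing the tag from (strong, delimiter)."""
    if isinstance(node, Text):
        return escape(node.literal)
    tag = tags[(node.strong, node.delimiter)]
    inner = "".join(render(child, tags) for child in node.children)
    return f"<{tag}>{inner}</{tag}>"


# An opt-in mapping that sends "__" to <mark>; everything else is unchanged.
MARK_TAGS = {**DEFAULT_TAGS, (True, "_"): "mark"}
```

With the default table, `render(Emphasis(True, "_", [Text("hi")]))` gives `<strong>hi</strong>`; passing `MARK_TAGS` gives `<mark>hi</mark>` for the same node, while `**hi**` would still come out as `<strong>`.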

There’s an item about this in the thread “Issues we MUST resolve before 1.0 release [6 remaining]” (post #25, by jgm): how many of these aspects of the concrete input should be recorded in the AST, so that renderers have access to them.

I think it’s been suggested before to use the distinction in the source to differentiate between em/strong and i/b. Double equals signs == are used in some extensions to mark up what becomes <mark> in HTML5 output.
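For comparison, the == approach can be sketched in a few lines of Python. This is a toy regex pass, not how any particular extension implements it, and the `highlight` function and `MARK_RE` pattern are made-up names; a real implementation would hook into the inline parser so that code spans, escapes, and nesting are handled properly.

```python
import re
from html import escape

# "==" directly followed by a non-space, a lazy run of text,
# then "==" directly preceded by a non-space.
MARK_RE = re.compile(r"==(?=\S)(.+?)(?<=\S)==")


def highlight(text):
    """Escape HTML, then wrap ==spans== in <mark> tags."""
    escaped = escape(text, quote=False)
    return MARK_RE.sub(lambda m: f"<mark>{m.group(1)}</mark>", escaped)
```

The non-space checks keep a stray comparison such as `x == y` from being treated as a highlight.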

Some related topics:

http://talk.commonmark.org/t/highlighting-text-with-the-mark-element/840

http://talk.commonmark.org/t/alternate-voice-or-mood-i-tag-in-html5/1206

I’m not convinced that either * or _ should render as something other than emphasis, since this would change the meaning of thousands of legacy documents.

Sure, that should always be a non-default option or extension. I need the italics/emphasis distinction for linguistic texts, for instance.

Being able to write these other kinds of markup using a non-HTML syntax would be beneficial. Even as a non-default option, overriding the meaning of the asterisk or underscore could get very messy when combining CommonMark documents from multiple sources (where not all authors intended the asterisk/underscore to mean something different from the default).