Emphasis and East Asian text

I fear that distinguishing Western and East Asian punctuation marks may not be an ideal solution.

Although this PR works for Japanese and Chinese text (please note that Korean text uses “Western” punctuation marks), it does not solve a related but slightly different issue in Korean text reported here (github/javascript-tutorial, #2040).

Koreans expect *스크립트(script)*라고 to be rendered as <em>스크립트(script)</em>라고. Since Korean text uses “Western” punctuation marks, neither the current CommonMark spec nor this PR renders the above Korean text “correctly.”

This Korean-text issue may be resolved by adding one more condition to @jgm’s simple rule in this comment:

Right flanking:

  • before char is non-space, AND
  • one of the following:
    • before char is EA punctuation or non-punctuation
    • after char is space or punctuation or any EA character,

although it will break nested emphases more severely.
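To make the extended rule concrete, here is a minimal Python sketch of the right-flanking test exactly as listed above. The helper names are hypothetical, and two definitions are my own assumptions rather than anything fixed by the spec or the PR: an "EA character" is taken to be any character whose Unicode East Asian Width is Wide or Fullwidth (which covers Hangul syllables, kana, and CJK ideographs), and "punctuation" is any character in a Unicode P or S general category.

```python
import unicodedata


def is_space(ch: str) -> bool:
    # An empty string stands for the start or end of the line,
    # which is treated like whitespace here.
    return ch == "" or ch.isspace()


def is_punct(ch: str) -> bool:
    # Assumption: punctuation = Unicode general category P* or S*.
    return ch != "" and unicodedata.category(ch)[0] in ("P", "S")


def is_ea(ch: str) -> bool:
    # Assumption: "EA character" = East Asian Width Wide (W) or Fullwidth (F).
    return ch != "" and unicodedata.east_asian_width(ch) in ("W", "F")


def is_right_flanking(before: str, after: str) -> bool:
    """Right-flanking test with the extra East Asian conditions."""
    if is_space(before):
        return False
    before_ok = not is_punct(before) or is_ea(before)  # non-punctuation or EA punctuation
    after_ok = is_space(after) or is_punct(after) or is_ea(after)
    return before_ok or after_ok


# Closing delimiter in *스크립트(script)*라고: before is ")", after is "라".
print(is_right_flanking(")", "라"))
# Closing delimiter in **「のどか」**という: before is "」", after is "と".
print(is_right_flanking("」", "と"))
```

In this sketch the Korean case passes only through the "after char is any EA character" clause, since `)` is Western punctuation, while the Japanese case already passes through the "before char is EA punctuation" clause.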


By the way, I think a better way to solve CJK-related emphasis issues is to introduce the new syntax ~_ , _~ , ~* , and *~ that Prof. John MacFarlane originally suggested for intra-word emphasis. His suggestion is equally applicable to any CJK-related emphasis issue arising from the lack of whitespace:

猫は~**「のどか」**~という。
~**有经验的人总会猜测对手会怎么做。**~这样的话
~*스크립트(script)*~라고
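
As a rough illustration of how the three examples above might render, here is a small regex-based Python sketch. It is not how a CommonMark parser would implement the proposal; the function name `render_explicit_emphasis`, the mapping of doubled delimiters to <strong>, and the lack of any nesting or escaping support are all assumptions made for this example.

```python
import re

# Assumption: "~" followed by a delimiter run opens, and the same run
# followed by "~" closes, regardless of the surrounding characters.
EXPLICIT = re.compile(r"~(\*\*|__|\*|_)(.+?)\1~")


def render_explicit_emphasis(text: str) -> str:
    def repl(match: re.Match) -> str:
        # Doubled delimiters map to <strong>, single ones to <em>.
        tag = "strong" if len(match.group(1)) == 2 else "em"
        return f"<{tag}>{match.group(2)}</{tag}>"

    return EXPLICIT.sub(repl, text)


print(render_explicit_emphasis("猫は~**「のどか」**~という。"))
print(render_explicit_emphasis("~*스크립트(script)*~라고"))
```

Because the tilde marks the delimiter run as opening or closing explicitly, the sketch never needs to inspect the neighbouring characters, which is the property that makes the syntax attractive for text without whitespace.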