What does the phrase "must not be a multiple of 3" mean in the emphasis section of the CommonMark Spec?

queued · April 3, 2017, 2:47pm

I had asked this question on Stack Overflow, but I couldn’t get a proper answer. Can you help me? Here is the whole text:

In the emphasis and strong emphasis section of CommonMark Spec, it says:

Emphasis begins with a delimiter that can open emphasis and ends with a delimiter that can close emphasis, and that uses the same character (_ or *) as the opening delimiter. The opening and closing delimiters must belong to separate delimiter runs. If one of the delimiters can both open and close emphasis, then the sum of the lengths of the delimiter runs containing the opening and closing delimiters must not be a multiple of 3.

But I have no idea what it means by “the sum of the lengths of the delimiter runs containing the opening and closing delimiters.”

What is it, and why must it not be a multiple of 3?

What happens if it is a multiple of 3?

Edit: I found the demonstration on the same website, and its source contains this code:

```
odd_match = (closer.can_open || opener.can_close) &&
    (opener.numdelims + closer.numdelims) % 3 === 0;
if (opener.cc === closer.cc && opener.can_open && !odd_match) {
    opener_found = true;
    break;
}
```

([source](http://spec.commonmark.org/dingus/commonmark.js)).

My rough understanding is that this rule is intended to restrict three-character delimiter runs in the middle of alphanumeric text. For example, `***hello***` is rendered as ***hello***, but `***hello***world` is not rendered as emphasis. However, I'm not sure whether my understanding is correct, or, if it is, why such a restriction is needed.
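To make the two inputs above concrete, here is a minimal sketch that applies the quoted condition to hand-written delimiter records. The helper names `isOddMatch` and `canPair` are hypothetical, and the `can_open`/`can_close` flags are filled in by hand on the assumption that a `*` run can open when a letter follows it and can close when a letter precedes it. It only illustrates the check, not the full parser.

```javascript
// Hypothetical helper names; the condition itself is copied from the snippet above.
function isOddMatch(opener, closer) {
  return (closer.can_open || opener.can_close) &&
    (opener.numdelims + closer.numdelims) % 3 === 0;
}

function canPair(opener, closer) {
  return opener.cc === closer.cc && opener.can_open && !isOddMatch(opener, closer);
}

const STAR = "*".charCodeAt(0);

// Leading "***" in both inputs: a letter follows, nothing precedes, so it can only open.
const leading = { cc: STAR, numdelims: 3, can_open: true, can_close: false };
// Trailing "***" of "***hello***": a letter precedes, nothing follows, so it can only close.
const atEnd = { cc: STAR, numdelims: 3, can_open: false, can_close: true };
// Second "***" of "***hello***world": letters on both sides, so it can open and close.
const interior = { cc: STAR, numdelims: 3, can_open: true, can_close: true };

console.log(canPair(leading, atEnd));    // true: neither run can both open and close
console.log(canPair(leading, interior)); // false: 3 + 3 = 6 is a multiple of 3
```

Under these assumptions the sum of the lengths only matters when one of the two runs can both open and close, which is why the second input is treated differently from the first.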

jgm · April 3, 2017, 3:49pm

See the explanations at
http://spec.commonmark.org/0.27/#example-388
and https://github.com/jgm/cmark/commit/c50197bab81d7105c9c790548821b61bcb97a62a

jgm · April 3, 2017, 3:47pm

There’s an explanation of the original change here:

github.com/commonmark/cmark

Changed `process_emphasis` to get better results in corner cases.

This will need corresponding spec changes.

The change is this:  when considering matches between an interior
delimiter run (one that can open and can close) and another delimiter
run, we require that the sum of the lengths of the two delimiter
runs mod 3 is not 0.

Thus, for example, in

    *a**b*
    1 23 4

delimiter 1 cannot match 2, since the sum of the lengths of
the first delimiter run (1) and the second (1,2) == 3.
Thus we get `<em>a**b</em>` instead of `<em>a</em><em>b</em>`.

This gives better behavior on things like

    *a**b**c*

which previously got parsed as

    <em>a</em><em>b</em><em>c</em>

and now would be parsed as

    <em>a<strong>b</strong>c</em>

With this change we get four spec test failures, but in each
case the output seems more "intuitive":

```
Example 386 (lines 6490-6494) Emphasis and strong emphasis
*foo**bar**baz*

--- expected HTML
+++ actual HTML
@@ -1 +1 @@
-<p><em>foo</em><em>bar</em><em>baz</em></p>
+<p><em>foo<strong>bar</strong>baz</em></p>

Example 389 (lines 6518-6522) Emphasis and strong emphasis
*foo**bar***

--- expected HTML
+++ actual HTML
@@ -1 +1 @@
-<p><em>foo</em><em>bar</em>**</p>
+<p><em>foo<strong>bar</strong></em></p>

Example 401 (lines 6620-6624) Emphasis and strong emphasis
**foo*bar*baz**

--- expected HTML
+++ actual HTML
@@ -1 +1 @@
-<p><em><em>foo</em>bar</em>baz**</p>
+<p><strong>foo<em>bar</em>baz</strong></p>

Example 442 (lines 6944-6948) Emphasis and strong emphasis
**foo*bar**

--- expected HTML
+++ actual HTML
@@ -1 +1 @@
-<p><em><em>foo</em>bar</em>*</p>
+<p><strong>foo*bar</strong></p>
```

by jgm on 11:50PM - 13 Apr 16

changed 1 files with 17 additions and 3 deletions.
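The `*a**b*` example from the commit message can be checked with a small sketch. It assumes the input contains only letters and `*`, so a run is taken to be able to open when a character follows it and to close when a character precedes it; the real flanking rules are more involved. The names `delimiterRuns` and `blockedBySumRule` are hypothetical and are not part of cmark or commonmark.js.

```javascript
// Split text into "*" runs and record a simplified can_open/can_close for each.
// Assumption: the text holds only letters and "*", so flanking reduces to
// "is there a character after / before the run".
function delimiterRuns(text) {
  const runs = [];
  const re = /\*+/g;
  let m;
  while ((m = re.exec(text)) !== null) {
    const before = text[m.index - 1];
    const after = text[m.index + m[0].length];
    runs.push({
      numdelims: m[0].length,
      can_open: after !== undefined,
      can_close: before !== undefined,
    });
  }
  return runs;
}

// The check described in the commit message: it applies only when one of the
// runs is interior (can both open and close).
function blockedBySumRule(opener, closer) {
  return (closer.can_open || opener.can_close) &&
    (opener.numdelims + closer.numdelims) % 3 === 0;
}

const [first, middle, last] = delimiterRuns("*a**b*");
console.log(blockedBySumRule(first, middle)); // true: 1 + 2 = 3
console.log(blockedBySumRule(first, last));   // false: neither run is interior
```

With the first pairing blocked, the leading `*` is free to match the final `*`, which corresponds to the `<em>a**b</em>` result quoted in the commit message.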

queued · April 3, 2017, 4:44pm

Thank you for your answer. Then the purpose of the rule is to handle the ‘em-inside-strong’ (and also the ‘strong-inside-em’) cases, right? Isn’t there a simpler and more intuitive way to make those cases work properly? I mean, I think this rule unintentionally restricts some other cases: for example, the `***hello***world` syntax that I mentioned.

jgm · April 4, 2017, 8:13am

+++ queued [Apr 03 17 16:55 ]:

> Thank you for your answer. Then the purpose of the rule is to handle
> the ‘em-inside-strong’ (and also the ‘strong-inside-em’) cases,
> right? Isn’t there any simpler and more intuitive way to make those
> cases work properly? I mean, I think this rule is unintentionally
> restricting some other cases: for example, `***hello***world` syntax that
> I have mentioned.

Well, I’m certainly open-minded about this, if you have a better approach to suggest. We have a fairly large corpus of cases in the test suite. If you can come up with a different algorithm, it can easily be tested against those cases to see whether it has unintended bad results.

Jonarod · November 5, 2017, 3:53am

Hello!

I had the same trouble understanding this rule. Now I “know”, thanks.
However, taking this case a bit further, I find this example troublesome:

`**bar***` = `<strong>bar</strong>*`

`foo**bar***` = `foo<strong>bar</strong>*`

`*foo***bar***` = `<em>foo</em>**bar***`

Why couldn't the last example just be parsed like the rest? Is this because of the same rule?