When exactly should numeric character references be replaced?

mgeier · June 3, 2016, 12:15pm

I’m not sure if I’ve missed that in the spec …

Obviously, numeric character references cannot be replaced before code spans are parsed, but according to the spec any later time would be OK, right?

The problem is: when deciding if a delimiter run is left- or right-flanking, should numeric character references be treated as punctuation (because they start and end with & and ;, respectively), or should they be treated as the Unicode code points they represent?

The reference implementation at http://try.commonmark.org seems to resolve emphasis before replacing numeric character references.
But I think if an author encodes a non-punctuation character as a reference, she still wants it to behave like a non-punctuation character with respect to emphasis.

For example, is the following emphasized or not?

i*&#105;i*i
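
To make the two readings concrete, here is a rough Python sketch of the flanking test applied to the first `*`, once on the raw text and once after decoding the reference. The helper names (`is_punct`, `is_space`, `flanking`) are made up for illustration, the punctuation and whitespace checks only approximate the spec's definitions, and `html.unescape` merely stands in for whatever reference replacement a real parser performs.

```python
import html
import unicodedata


def is_punct(ch):
    # Approximation: treat Unicode categories P* and S* as punctuation.
    return ch != "" and unicodedata.category(ch)[0] in ("P", "S")


def is_space(ch):
    # The start and end of the line count as whitespace for flanking purposes.
    return ch == "" or ch.isspace()


def flanking(text, start, end):
    """Return (left_flanking, right_flanking) for the delimiter run text[start:end]."""
    before = text[start - 1] if start > 0 else ""
    after = text[end] if end < len(text) else ""
    left = not is_space(after) and (
        not is_punct(after) or is_space(before) or is_punct(before)
    )
    right = not is_space(before) and (
        not is_punct(before) or is_space(after) or is_punct(after)
    )
    return left, right


raw = "i*&#105;i*i"
decoded = html.unescape(raw)  # "i*ii*i"

# The first "*" sits at index 1 in both strings.
print(flanking(raw, 1, 2))      # followed by "&", which is punctuation
print(flanking(decoded, 1, 2))  # followed by "i", an ordinary letter
```

On the raw text the first `*` comes out as right-flanking only, so it cannot open emphasis. On the decoded text it is both left- and right-flanking, so it can.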

Here is a live example: http://spec.commonmark.org/dingus/?text=%23%20iiii%0A%0A%23%20i*%3Bii%0A%0A%23%20ii%3Bi%0A%0A%23%20i%26%23105%3Bii%0A%0A%23%20ii%26%23105%3B*i%0A%0A