Bug in commonmark.js about initial and final Unicode whitespace in a paragraph




There are two IDEOGRAPHIC SPACE characters (U+3000) before and after the word “aaa”.
(Update: I found that the example did not actually contain IDEOGRAPHIC SPACE characters; this is now corrected.)

Section 4.8 of the spec says:

The paragraph’s raw content is formed by concatenating the lines and removing initial and final whitespace.

However, the commonmark.js dingus returns:
(Update: I modified the link; it now points to Babelmark.)


These U+3000s shouldn’t be removed, because the spec defines U+3000 as a Unicode whitespace character but not as a whitespace character, and section 4.8 only calls for removing whitespace.
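
To make the difference concrete, here is a minimal sketch in JavaScript. It assumes the "whitespace character" set from the spec version I am reading (space, tab, newline, line tabulation, form feed, carriage return). The helper `stripCommonMarkWhitespace` is a hypothetical name of mine, and I am not claiming that commonmark.js strips paragraphs with `trim()`; the sketch only shows that a generic Unicode-aware trim would produce the behaviour I observed.

```javascript
// Hypothetical helper: strips only the CommonMark "whitespace characters"
// (assumed set: space, tab, newline, line tabulation, form feed, carriage return).
function stripCommonMarkWhitespace(s) {
  return s.replace(/^[ \t\n\v\f\r]+/, "").replace(/[ \t\n\v\f\r]+$/, "");
}

// A paragraph line with ASCII spaces outside and U+3000 directly around "aaa".
const raw = " \u3000aaa\u3000 \n";

// Spec-style stripping keeps the U+3000 characters.
const specStyle = stripCommonMarkWhitespace(raw);

// String.prototype.trim() also removes U+3000, because it treats every
// code point in the Unicode Zs category as whitespace.
const unicodeStyle = raw.trim();

console.log(specStyle.length);    // 5: U+3000, "aaa", U+3000
console.log(unicodeStyle.length); // 3: "aaa"
```

If my reading of 4.8 is right, the paragraph content should match the first result, not the second.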


Interesting. I wonder what the rationale is for distinguishing between a (CommonMark) whitespace character and a Unicode whitespace character.


As far as I know, when CJKV writers use fullwidth whitespace, it is for text layout, since they don’t use fullwidth whitespace as a word divider. Unicode whitespace characters in general seem to be for text layout, too.