Native support for Unicode no-break space

#1

Hello,
I am using Markdown for narrative texts and I would like to optimize my workflow. One of my problems is that I can’t get my files to honor a no-break space entered as a Unicode character. Do you plan to have native support for Unicode U+00A0?

I tried it on http://spec.commonmark.org/dingus/ and my no-break space is not taken into account if I insert it with Ctrl+Shift+0020.
But if I type &nbsp; it is OK.
Am I doing something wrong? Is there any way to have NBSP taken into account when it is entered as a Unicode character?
All the best from France


#2

Both reference parsers handle U+00A0 with no difficulty. Your question is really about the online dingus, I suppose, but I’d suggest you download one of the reference implementations and use it directly.

Note that you can also insert the character with an entity reference such as &nbsp;.


#3

Thanks for your answer. I was speaking about offline tools. I ran a test: I created a simple Markdown file with gedit, saved it, and exported it to EPUB with pandoc (a simple command line, all defaults). The NBSP I inserted as a Unicode character with Ctrl+Shift+0020 was not taken into account in the resulting EPUB.
Is there some declaration to include somewhere in the file? Or are some text editors preferable?


#4

Hello. I have run new tests and checked that the spaces I used were actually 0x00a0 and not 0x20.
I get a normal space in the resulting EPUB when I export through pandoc. Is there some tweak or parameter I should set?
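
For anyone who wants to repeat this kind of check without relying on what an editor displays, here is a minimal Python sketch. It only counts U+00A0 versus U+0020 in a string and shows which lines contain a no-break space. The helper names `count_spaces` and `nbsp_lines` and the sample text are made up for illustration; they are not part of pandoc or any Markdown tool.

```python
NBSP = "\u00a0"  # NO-BREAK SPACE, encoded in UTF-8 as the two bytes C2 A0


def count_spaces(text):
    """Count no-break spaces (U+00A0) and ordinary spaces (U+0020)."""
    return {"nbsp": text.count(NBSP), "space": text.count(" ")}


def nbsp_lines(text):
    """Return (line number, line) pairs with every NBSP made visible."""
    return [
        (number, line.replace(NBSP, "<NBSP>"))
        for number, line in enumerate(text.splitlines(), start=1)
        if NBSP in line
    ]


# Hypothetical sample: French punctuation preceded by no-break spaces.
sample = "Bonjour\u00a0! Comment allez-vous\u00a0?\nNo special space here."

print(count_spaces(sample))
for number, line in nbsp_lines(sample):
    print(number, line)
```

To check a real file, read it first with `open(path, encoding="utf-8").read()` and pass the result to the same functions (this assumes the file is saved as UTF-8).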


#5

There are many ways to encode NBSP in HTML. Two of them are the literal U+00A0 character, as in <p>non-breaking space</p>, and the entity reference, as in <p>non-breaking&nbsp;space</p>.

There are plenty of ways to encode NBSP in Markdown too. The two most common are again the literal character (non-breaking space) and the entity reference (non-breaking&nbsp;space). Regardless of which Markdown representation you choose, the CommonMark reference implementation always converts it to the literal character in HTML: <p>non-breaking space</p>.

That’s why you see <p>non-breaking space</p>, not <p>non-breaking&nbsp;space</p>, in the “HTML” tab of the online dingus. See for yourself.
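
The equivalence of the two spellings can be illustrated with a small Python sketch. Note the assumption: this uses `html.unescape` from Python's standard library to decode the references, not the CommonMark reference implementation itself, so it only shows that both forms name the same character.

```python
import html

literal = "non-breaking\u00a0space"    # literal U+00A0 between the words
entity = "non-breaking&nbsp;space"     # named entity reference
numeric = "non-breaking&#160;space"    # decimal numeric reference

decoded = html.unescape(entity)

# After decoding, all three spellings are the same string.
print(decoded == literal)                  # True
print(html.unescape(numeric) == literal)   # True

# ascii() makes the otherwise invisible character visible.
print(ascii(decoded))                      # 'non-breaking\xa0space'
```

Since the decoded string looks identical to one with an ordinary space on screen, printing it with `ascii()` or `repr()` is a more reliable check than reading the rendered text.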

Now try to copy non-breaking space from any rendered output. What you’ll get in your clipboard is non-breaking and space with the plain old 0x20 space between them. Even though what you copy actually contains NBSP, your browser replaces it with the plain old space on copying, for some unknown reason.

It’s a bug (or maybe a feature) which is present in both Firefox and Chromium. I couldn’t find any exact info on why this happens.

There’s this 10-year-old open bug in Mozilla’s Bugzilla: https://bugzilla.mozilla.org/show_bug.cgi?id=359303.


#6

EPUB is an XML-based format. If you check for NBSP by copying and pasting, you may stumble upon the same bug I describe above. How do you check it?
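
One way to check without any copying and pasting is to read the files inside the EPUB directly, since an EPUB is a ZIP archive containing XHTML documents. Below is a rough Python sketch under some assumptions: the content files end in `.xhtml`, `.html`, or `.htm` and are UTF-8 encoded, and only a few common spellings of the character reference are counted. The function name `count_nbsp_in_epub` and the in-memory demo archive are hypothetical.

```python
import io
import zipfile

NBSP = "\u00a0"


def count_nbsp_in_epub(epub):
    """Map each (X)HTML file in the archive to its number of no-break spaces.

    `epub` may be a file path or a binary file-like object.
    """
    counts = {}
    with zipfile.ZipFile(epub) as archive:
        for name in archive.namelist():
            if not name.lower().endswith((".xhtml", ".html", ".htm")):
                continue
            text = archive.read(name).decode("utf-8", errors="replace")
            total = text.count(NBSP)              # literal character
            total += text.count("&nbsp;")         # named reference
            total += text.count("&#160;")         # decimal reference
            total += text.lower().count("&#xa0;")  # hex reference, any case
            counts[name] = total
    return counts


# Demo on a tiny in-memory archive standing in for a real EPUB.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("ch1.xhtml", "<p>non-breaking\u00a0space and 10&#160;km</p>")
    archive.writestr("style.css", "p { margin: 0 }")
buffer.seek(0)

print(count_nbsp_in_epub(buffer))  # {'ch1.xhtml': 2}
```

With a real file, calling `count_nbsp_in_epub("book.epub")` (a placeholder path) should tell you whether the NBSP survived the export, independently of how any viewer or clipboard treats it.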


#7

Maybe somebody should ask the Firefox and Chromium devs on their respective IRC channels. Definitely not me, or at least not today. I’m done with investigating obscure software bugs for today.


#8

Thanks for your input. I will definitely run more tests. Perhaps it is only a viewer problem and not an export problem, as you say.


#9

To prove it, I created this even easier way to reproduce this behavior: https://codepen.io/anon/pen/PbXeNx.

Let us know when you find out more. I’m especially interested in why this weird behavior happens.


#10

OK. So I took the opportunity on New Year’s Eve to update my system (Debian Sid) and installed the latest pandoc package from the pandoc website. Now I get exactly the behaviour you describe.
The problem is that many writing tools (ReText, for instance) don’t show the NBSP in their preview (which uses a web engine such as WebKit) and, moreover, save the NBSP as a normal space, which is a problem in its own right.
Anyway that’s no longer a markdown issue for me, thanks a lot for your help :slight_smile:
