Getting back to nbsp. I still don’t understand what’s wrong with using escaped space.
Rest assured that there is absolutely nothing wrong with using “escaped space” “\ ” (REVERSE SOLIDUS, SPACE) as a keyboarding string for NBSP, if you so wish to.
What would be “wrong” (or rather: unjustified, an unneeded complication, not worth the hassle; you know what I mean) in my opinion is: changing the CommonMark specification to accommodate this single itch of yours, and in doing so also weakening and complicating the “basic rule for using backslash” in Markdown, namely to:
- use backslash ("\") to “hide” a mark-up significant character from the parser, like in “
This is the primary purpose of backslash in any Markdown variant, and in CommonMark as well. Note that this rule
- corresponds exactly to the purpose and use of “<” in XML/XHTML/SGML/HTML; while
- also corresponds to the use of “\” in W3C [CSS level 2][css] syntax (but “\” is used there for a range of other things too, like specifying characters by hex-digits giving the code point); and
- has a sole exception in the current CommonMark specification (AFAIK), namely that:
- a backslash at end-of-line effects a hard line break,
which is kind of the opposite of “hiding the end-of-line”, and thus this exception is in my view a mistake, even more so as there is a (IMO better, as in: more consistent, more unobtrusive, and easier to type) alternative already, namely:
- use two or more " " (SPACE) at end-of-line to effect a hard line break.
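To make the two hard-line-break spellings concrete, here is a minimal Python sketch that classifies a single source line. It is not taken from any CommonMark implementation: the function name `hard_break_kind` is made up for illustration, and it ignores details such as the fact that a hard break is not produced at the end of a paragraph.

```python
import re
from typing import Optional


def hard_break_kind(line: str) -> Optional[str]:
    """Say how a source line (given without its newline) asks for a hard break.

    Hypothetical helper for illustration only; a real parser also checks
    that the line is not the last one of its block.
    """
    # Count the run of backslashes at the very end of the line.
    trailing = len(line) - len(line.rstrip("\\"))
    # An odd count means the last backslash is not itself escaped.
    if trailing % 2 == 1:
        return "backslash"
    # Two or more SPACE characters at end-of-line.
    if re.search(r" {2,}$", line):
        return "spaces"
    return None
```

With this sketch, a line ending in one backslash and a line ending in two spaces are both reported as hard breaks, while a line ending in a single space or in an escaped backslash is not.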
Of course you can have a completely different opinion concerning this “use of backslash” issue in CommonMark, but do you at least understand where I’m coming from with these remarks?
Yes, you can press Alt+Space on Mac (or some other combination on other systems) to insert the Unicode nbsp char, but it will be invisible. Same thing as with two spaces at the end of the line.
Yes, so what? I can’t see a NBSP, or EM SPACE or EN SPACE either. If I can’t use a decent editor (I’m typing this text into a shabby HTML <textarea> in my browser right now!), I simply type
Or I write the goddam <br /> at the end-of-line by myself, but most of the time I simply remember that I have typed two or more SPACEs there and am done with it.
I really can’t understand why you find that so hard on the one hand, and still insist that you don’t have nor want to use better tools like a decent editor …
That said …
… I agree that one could define a “better” (i.e. more versatile) use of backslash “\” in CommonMark, and IMO the use of backslash in CSS would be a good starting point for such an extension. This could introduce new rules for using “\” which can be more or less literally cloned from the CSS rules about backslash:
Hexadecimal character specification: “a non-breaking\0A0space” inserts a NBSP (U+00A0), but in “a space\0A0 after a hex-sequence is gobbled up” the SPACE after “\0A0” will get “gobbled up”, resulting in only the NBSP being between the “e” and the “a”.
The “usual” backslash-escape sequences (most of them are probably not that useful or needed), like:
- “\t” for HT,
- “\n” for a new-line or line break (not equivalent to entering a U+000A LF control character!),
- “\f” for a FF,
- “\b” for BS,
- “\a” for BEL,
- And I would welcome “\s” for NBSP as a new backslash-escape sequence too!
As usual (like in the C preprocessor, and also inside strings in CSS): a backslash followed by an end-of-line is ignored, the result is as if neither the backslash nor the line break had been in the input [or the line break is replaced by a single SPACE], so you can write e.g. long section heading texts across multiple lines connected by “\” at EOL. (C programmers know how to do this.) But the parser (and the CommonMark rules too!) would still see only a single line:
This section title \
\nis a bit long so we write it into \
multiple input lines
But the parser sees only *one* title line here! While in the *output* we will get *two* lines in the section
title, because of the "hard line break" introduced by "`\n`".
Also as usual: “\\” will enter a literal U+005C REVERSE SOLIDUS character.
If the backslash is not followed by a decimal digit or a lower-case letter, the usual “hide it from the parser” rule continues to apply.
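The rules above can be sketched as a small decoder. This is only an illustration of the proposal, not anything in the CommonMark specification or in an existing parser. The names `decode_backslashes` and `HARD_BREAK` are made up, and several details are assumptions: a hex sequence starts with a decimal digit and has at most six hex digits (as in CSS), one SPACE after it is swallowed, a backslash plus end-of-line removes both, “\n” is represented by a placeholder string, and a backslash before an unlisted lower-case letter is left untouched because the proposal does not say what it should do.

```python
import re

NBSP = "\u00a0"
HARD_BREAK = "<br />"  # assumption: placeholder for a hard line break

# The proposed named escapes, including the new "\s" for NBSP.
NAMED = {"t": "\t", "n": HARD_BREAK, "f": "\f", "b": "\b", "a": "\a", "s": NBSP}

_ESCAPE = re.compile(
    r"\\(?:"
    r"(?P<hex>[0-9][0-9A-Fa-f]{0,5}) ?"  # hex code point, one optional SPACE swallowed
    r"|(?P<eol>\r?\n)"                   # backslash at end-of-line: line continuation
    r"|(?P<other>.)"                     # any other single character
    r")",
    re.DOTALL,
)


def _replace(match: "re.Match[str]") -> str:
    hex_digits = match.group("hex")
    if hex_digits is not None:
        code_point = int(hex_digits, 16)
        # Out-of-range values become U+FFFD instead of raising an error.
        return chr(code_point) if code_point <= 0x10FFFF else "\ufffd"
    if match.group("eol") is not None:
        return ""  # neither the backslash nor the line break survives
    ch = match.group("other")
    if ch in NAMED:
        return NAMED[ch]
    if "a" <= ch <= "z":
        return match.group(0)  # assumption: unlisted letter, leave as typed
    # "Hide" rule: the character is taken literally; a real parser would
    # additionally mark it as not being mark-up.
    return ch


def decode_backslashes(text: str) -> str:
    """Apply the sketched backslash rules to a piece of source text."""
    return _ESCAPE.sub(_replace, text)
```

For example, `decode_backslashes("a space\\0A0 after")` yields the text with a single NBSP between “space” and “after”, and a three-line title joined with backslashes at EOL comes out as one line containing one hard-break placeholder.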
NOTE 1: The explicit EOL sequence “\n” would also obviate the ugly use of backslash to mark-up a hard line break.
NOTE 2: We’re writing text, not program code, so control characters like CR and LF (and BEL, and BS) are not really useful here. Personally, I would like to have backslash-escape sequences for often-used typographical characters like EM SPACE, or EN DASH and EM DASH and so on.
NOTE 3: By the simple rule (3) above, the sequence “\ ” (backslash, then space) would now be well-defined, and would “hide” the space from the parser (I think there are situations where SPACE is relevant in CommonMark parsing, so this could also be useful, beyond being consistent).
NOTE 4: According to the current CommonMark rules, the backslash only “hides” punctuation characters, and is taken literally otherwise: I find this rule pretty byzantine, too, and worth replacing by simpler, more versatile, and more useful rules.
This would be an extension of the CommonMark specification I’d be happy to support, and I hope that it would at least be acceptable for you to use “\s” (and not backslash-followed-by-space) to “mark up” (or just: type) a NBSP …