Converting line endings to spaces in code spans is inconvenient (CommonMark v0.29)

Problem

As of 0.29, code span content is normalized in the following ways:

  • First, line endings are converted to spaces.
  • If the resulting string both begins and ends with a space
    character, but does not consist entirely of space
    characters, a single space character is removed from the
    front and back. This allows you to include code that begins
    or ends with backtick characters, which must be separated by
    whitespace from the opening or closing backtick strings.

However, when you need to hard-wrap a very long string (e.g. a sha256 hash) inside a code span, this inserts an unwanted space.

This CommonMark text

There is a chance to hard-wrap very long text (e.g. `sha256:e3b0c44298fc
1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855`).

will become

There is a chance to hard-wrap very long text (e.g. sha256:e3b0c44298fc 1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855).

When you copy and paste this sha256 hash, you get a space between 44298fc and 1c149af.
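The behaviour described above can be reproduced with a small Python sketch of the two normalization rules quoted from the spec. This is only an illustration, not the reference implementation: the function name `normalize_code_span` is made up for this example, and it is assumed to receive the raw text found between the opening and closing backtick strings.

```python
import re


def normalize_code_span(raw: str) -> str:
    """Sketch of the 0.29 code span normalization (hypothetical helper)."""
    # Rule 1: every line ending (LF, CR, or CRLF) becomes a single space.
    text = re.sub(r"\r\n|\r|\n", " ", raw)
    # Rule 2: strip one space from each end, but only if both ends have one
    # and the content does not consist entirely of spaces.
    if len(text) >= 2 and text[0] == " " and text[-1] == " " and text.strip(" "):
        text = text[1:-1]
    return text


# The hard-wrapped hash from the example above.
wrapped = "sha256:e3b0c44298fc\n1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
print(normalize_code_span(wrapped))
```

The printed string contains a space where the line ending was, which is the space that shows up when the hash is copied from the rendered page.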


Suggestion

Would the following be sufficient?

In code span normalization:

  • First, line endings are removed.
  • If the resulting string both begins and ends with a space
    character, but does not consist entirely of space
    characters, a single space character is removed from the
    front and back. This allows you to include code that begins
    or ends with backtick characters, which must be separated by
    whitespace from the opening or closing backtick strings.
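As a rough Python sketch, the suggested rule differs from the 0.29 behaviour only in its first step. The function name `normalize_code_span_proposed` is hypothetical, and the input is assumed to be the raw text between the backtick strings.

```python
import re


def normalize_code_span_proposed(raw: str) -> str:
    """Sketch of the suggested normalization (not what the spec does)."""
    # Suggested rule 1: line endings are removed instead of becoming spaces.
    text = re.sub(r"\r\n|\r|\n", "", raw)
    # Rule 2 is unchanged: strip one space from each end if both ends have
    # one and the content is not made up entirely of spaces.
    if len(text) >= 2 and text[0] == " " and text[-1] == " " and text.strip(" "):
        text = text[1:-1]
    return text


wrapped_hash = "sha256:e3b0c44298fc\n1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
print(normalize_code_span_proposed(wrapped_hash))  # the hash comes out unbroken

# Ordinary wrapped code is affected too: the words on either side of the
# line ending are joined.
print(normalize_code_span_proposed("git commit\n--amend"))  # git commit--amend
```

The second call shows a consequence of the suggestion: a code span that was wrapped between two words would lose the separator unless the author left a trailing or leading space on one of the lines.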

Sorry. I overlooked the note on GitHub saying ‘you should use Discourse instead of GitHub issues for questions’ and opened an issue on GitHub.

Are you using code spans rather than indented code blocks or fenced code blocks? Those may work better for you.

Would escaping line breaks work? (It does not currently.)

foo ``sha256:e3b0c44298fc1c149afbf4c8996fb924\
27ae41e4649b934ca495991b7852b855`` bar

@Brian_Lalonde

Yes, I do, because I would like to embed a long string (e.g. a sha256 hash) in a paragraph.

Indented code blocks and fenced code blocks cannot do this, can they?

They’ll wrap if you show them where (depending on available width):

foo `sha256:e3b0c44298fc1c149afbf4c8996fb924`
`27ae41e4649b934ca495991b7852b855` bar

Otherwise, CommonMark doesn’t currently have the level of sophistication to preserve line breaks in a block within a block that doesn’t preserve line breaks (this simplicity is why it’s so popular). However, you can use a little HTML:

foo `sha256:e3b0c44298fc1c149afbf4c8996fb924`<br/>
`27ae41e4649b934ca495991b7852b855` bar

Or else, if you find that you frequently, absolutely require a level of formatting sophistication beyond what CommonMark can provide, there are formats that may work better for you: HTML or AsciiDoc or many others.

I’m not sure how much impact changing the line-ending behavior in code spans would have. If it’s not too big, and there are important use cases, it could be worth pursuing further here.

You mean this?

(This isn’t what the current spec says, but as an idea.)

In a code span, if and only if a backslash comes at the end of a line, it escapes the line ending and no space is added in its place. Backslashes that aren’t at the end of a line are treated literally.

If so, that looks good to me.
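The idea can be sketched in Python as follows. This is a hypothetical rule, not current CommonMark behaviour; the function name `normalize_with_backslash_escape` is invented for the example, and the input is assumed to be the raw text between the backtick strings.

```python
import re


def normalize_with_backslash_escape(raw: str) -> str:
    """Sketch of the backslash-at-end-of-line idea (hypothetical rule)."""
    # A backslash directly before a line ending removes both characters.
    text = re.sub(r"\\(?:\r\n|\r|\n)", "", raw)
    # Any remaining line ending becomes a space, as in 0.29.
    text = re.sub(r"\r\n|\r|\n", " ", text)
    # The 0.29 space-stripping rule is kept as is.
    if len(text) >= 2 and text[0] == " " and text[-1] == " " and text.strip(" "):
        text = text[1:-1]
    return text


escaped = "sha256:e3b0c44298fc1c149afbf4c8996fb924\\\n27ae41e4649b934ca495991b7852b855"
print(normalize_with_backslash_escape(escaped))      # hash without a space
print(normalize_with_backslash_escape("C:\\temp"))   # C:\temp (backslash kept)
print(normalize_with_backslash_escape("foo\nbar"))   # foo bar
```

Only a backslash that is the last character on its line is consumed, so code spans containing backslashes elsewhere would keep their current meaning under this sketch.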

@Brian_Lalonde

No. What I expect isn’t to preserve the line break.

What I expect is this:

<p>foo <code>sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855</code> bar</p>

But when I hard-wrap it in a code span like this

foo ``sha256:e3b0c44298fc1c149afbf4c8996fb924
27ae41e4649b934ca495991b7852b855`` bar

it becomes

<p>foo <code>sha256:e3b0c44298fc1c149afbf4c8996fb924 27ae41e4649b934ca495991b7852b855</code> bar</p>

Yeah. Using another tool can be one approach.

In HTML, though, hard-wrapping gives the same result.

<p>foo <code>sha256:e3b0c44298fc1c149afbf4c8996fb92427
ae41e4649b934ca495991b7852b855</code> bar</p>

I get a space between 92427 and ae41e when viewing this file in a browser:

<!DOCTYPE html>
<html>
 <body>
  <p>foo <code>sha256:e3b0c44298fc1c149afbf4c8996fb92427
  ae41e4649b934ca495991b7852b855</code> bar</p>
 </body>
</html>

What am I doing wrong? Could you tell me what I am missing?

vanou, IIUC you want the hash to remain a single unbroken string in the output — sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 — right?

Neither HTML nor Markdown supports hard-wrapping inside a word. All talk of “hard-wrapping” Markdown is about the freedom (in many, though not all, places) to break a “logical line” into multiple “physical lines” by putting a newline between words.
But a newline in the middle of a word will always split it into two words (in a few cases with a line break in the output, in others simply with a space).
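The HTML case can be checked with a short Python sketch that pulls the text out of a `<code>` element and collapses whitespace. The regular expression is only a rough stand-in for how browsers treat whitespace under the default `white-space: normal`; the names `CodeText` and `rendered_code_text` are made up for this example.

```python
import re
from html.parser import HTMLParser


class CodeText(HTMLParser):
    """Collect the text found inside <code> elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "code":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "code" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def rendered_code_text(html: str) -> str:
    parser = CodeText()
    parser.feed(html)
    parser.close()
    # Approximation: any run of spaces, tabs, or line endings is shown
    # as a single space.
    return re.sub(r"[ \t\r\n\f]+", " ", "".join(parser.chunks))


html_doc = """<p>foo <code>sha256:e3b0c44298fc1c149afbf4c8996fb92427
  ae41e4649b934ca495991b7852b855</code> bar</p>"""
print(rendered_code_text(html_doc))
```

The newline and the indentation that follows it come out as one space in the middle of the hash, matching what the browser shows.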

Markdown has no hard goal of allowing the input to always fit a fixed width, say 80 columns. If you want to write a single 200-character word, you must write at least that word unbroken.
