Beyond Markdown

Simpler is better for everyone. CommonMark should stand apart with no (or minimal) reliance on other languages. IMO, backward compatibility is a goal, not an absolute. Where backward compatibility is possible, go for it, but do not be bound by it. Very probably not all variants of Markdown can be built into CommonMark. CommonMark needs to “exceed” the other variants so they go away. Simplicity and unambiguous ways of writing will eventually prevail. Getting everyone to use CommonMark is not likely.

  • I agree with @alehed that CommonMark should provide the ability to create the “normal” features of writing documents (tables, footnotes, and so on).

  • Eliminate multiple ways of performing the same task. For example, no shortcut reference links. There are probably others.

  • Emphasis: not sure how to solve bold and strong. Bold = “*”. Strong = “**”. I agree with one character to identify letter format.

  • A truly radical proposal: use words, i.e. this becomes an attribute so there is no ambiguity (strong is strong; bold is bold). For clarity in human readability, each attribute stands alone; you cannot put multiple attributes in the same “holder”.

  • For attributes: {=…} @adiantwoods.

  • With an unambiguous statement of attributes, HTML is not needed. Not everyone knows HTML or cares to learn it.

  • A list should only be a list (no fancy complications).

  • All code that needs to “pass through” goes inside a code block.

Always open to comments and suggestions.


Having to visually parse <strong>words that are not</strong> part of the sentence does <emphasis>not</emphasis> achieve “clarity in human readability”. See what <bold>I</bold> did there?

I think you are confusing *markup* readability with *content* readability, and Markdown is focused on the latter.


I also wonder whether one can expect to get those “normal features” of (more serious) documents, like tables and footnotes, in CommonMark in the form of extensions, or whether it is better to focus on systems supporting other markups (Markdown Extra, rst, Asciidoc, …)?

In other words, what is the practical meaning of Proposed Extensions? Can one expect to see, e.g., support for footnotes, or is it more probable that we won’t see much of them in the final spec?

1. Emphasis

No personal strong opinion. Will follow best consensus/practices.

2. Reference links

No personal strong opinion. Will follow best consensus/practices.

3. Indented code blocks and lists

+1.0 with your fix.

4. Raw HTML

INLINE FORM

+1.0 with your fix

BLOCK FORM

+0.5 with your fix

  • should be completed with a ::: section too, as with pandoc’s fenced_divs.
  • which should now rather be understood as a section marker compatible with the HTML5 section, article, main, etc.

5. Lists and blank lines

No personal strong opinion. Will follow best consensus/practices.

6. Attributes

+1.0 with your fix. Very good work.

However

  • {.class} should be recognized too
  • Attributes at the end rather than at the beginning have appeared here and there.
    Why not relax the conditions on these?
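
To make the {.class} request concrete, here is a minimal sketch in Python of a parser for a brace-delimited attribute block. It is not taken from any existing implementation: the function name parse_attributes, the returned dictionary layout, and the exact set of accepted entries (#id, .class, key=value) are assumptions for illustration. It deliberately says nothing about where the block may appear, so the same parser would serve attributes placed at the beginning or at the end.

```python
import re

# One entry inside the braces: #id, .class, key="quoted value" or key=bare
# (assumed grammar, for illustration only).
ATTR_TOKEN = re.compile(r'#([\w-]+)|\.([\w-]+)|([\w-]+)=(?:"([^"]*)"|(\S+))')


def parse_attributes(block):
    """Parse a block such as '{#intro .note lang=en}' into its parts."""
    if not (block.startswith("{") and block.endswith("}")):
        raise ValueError("attribute block must be wrapped in braces")
    attrs = {"id": None, "classes": [], "pairs": {}}
    for match in ATTR_TOKEN.finditer(block[1:-1]):
        ident, cls, key, quoted, bare = match.groups()
        if ident:
            attrs["id"] = ident          # a later #id overrides an earlier one
        elif cls:
            attrs["classes"].append(cls)
        else:
            attrs["pairs"][key] = quoted if quoted is not None else bare
    return attrs


print(parse_attributes("{.class}"))
print(parse_attributes('{#intro .note lang=en title="A b"}'))
```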

Edited: This link on GitHub goes in this direction too: generic directive extension list


An alternate fix for emphasis:

Require an exact match between the opening and closing delimiters. Kind of like inline code spans.

Emphasis would begin with a left-flanking delimiter run of exactly 1, 2, or 3 asterisks or 1, 2, or 3 underscores, and end with a right-flanking delimiter run of exactly the same length and character.

*emphasis*
_emphasis_

**strong emphasis**
__strong emphasis__

***strong plus regular emphasis***
___strong plus regular emphasis___

Four or more sequential asterisks or underscores would render literally.

****no emphasis****
____no emphasis____

Unmatched delimiter runs would not create emphasis at all, and could not divide into emphasis plus literal characters.

**no emphasis*

**no emphasis***

_no emphasis*

Any unmatched delimiters, including within emphasis, would render literally.

**asterisk* within strong emphasis**
__underscore_ within strong emphasis__

*asterisks** within emphasis*
_underscores__ within emphasis_

To create a literal asterisk or underscore next to emphasized text, a character can be escaped…

*asterisk\** inside emphasized text

*asterisk*\* outside emphasized text

…or a different delimiter character can be used.

_asterisk*_ inside emphasized text

_asterisk_* outside emphasized text

Nested emphasis would work.

**strong and *emphasis* within strong**

It would also be possible to nest the same type of emphasis by alternating asterisks and underscores.

*lots _of *emphasized* text_ here*

When using asterisks in a single word, emphasis would start with a left-flanking or both-flanking delimiter run, and end with a both-flanking or right-flanking delimiter run. This would allow intraword emphasis.

*emphasized*
*em*phasized
em*pha*sized
empha*sized*

These rules should be pretty intuitive and easy to learn, and backwards compatible to a large extent.

And they eliminate a huge amount of complexity and ambiguity.
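
As a rough illustration of how little machinery the exact-match rule needs, here is a minimal Python sketch. It is only a sketch under stated assumptions: render_emphasis is a hypothetical name, the flanking test looks at whitespace only (the punctuation rules of the CommonMark spec are ignored), underscores are treated exactly like asterisks (so intraword underscores would also match), and no HTML escaping is done.

```python
import re

# Either a backslash-escaped delimiter character or a delimiter run.
TOKEN = re.compile(r"\\([*_])|(\*+|_+)")
TAGS = {
    1: ("<em>", "</em>"),
    2: ("<strong>", "</strong>"),
    3: ("<em><strong>", "</strong></em>"),
}


def render_emphasis(text):
    out = []      # output pieces; unmatched runs simply stay as literal text
    openers = []  # stack of (run, index into out) for possible openers
    pos = 0
    for match in TOKEN.finditer(text):
        out.append(text[pos:match.start()])
        pos = match.end()
        if match.group(1):                # escaped delimiter: always literal
            out.append(match.group(1))
            continue
        run = match.group(2)
        if len(run) > 3:                  # four or more: always literal
            out.append(run)
            continue
        before = text[match.start() - 1] if match.start() > 0 else " "
        after = text[match.end()] if match.end() < len(text) else " "
        can_open = not after.isspace()    # simplified left-flanking test
        can_close = not before.isspace()  # simplified right-flanking test
        matched = False
        if can_close:
            # Nearest opener with exactly the same character and length.
            for k in reversed(range(len(openers))):
                if openers[k][0] == run:
                    open_tag, close_tag = TAGS[len(run)]
                    out[openers[k][1]] = open_tag
                    out.append(close_tag)
                    del openers[k:]       # skipped inner openers stay literal
                    matched = True
                    break
        if matched:
            continue
        if can_open:
            openers.append((run, len(out)))
        out.append(run)
    out.append(text[pos:])
    return "".join(out)


print(render_emphasis("**asterisk* within strong emphasis**"))
print(render_emphasis("*lots _of *emphasized* text_ here*"))
```

Because a closer only ever matches an opener with the identical run, there is no splitting of runs and no backtracking; everything that fails to match falls through as literal text.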


I think @aoudad’s suggestions on emphasis are sound, keeping both ease of reading and backward compatibility. On the rest, I tend to agree with @jgm. However, the bit that I find toughest to get behind is getting rid of shortcut reference links. Those are not only convenient, but are very readable as well. [foo][] gives up a bit of that human-readability for parsing convenience.

At any rate, looking forward to CommonMark 1.0.


Some thoughts…

  1. Emphasis

    I like using _emphasis_ and *strong emphasis*. As somebody mentioned already, that’s how it works on WhatsApp, Facebook, and Slack, and it seems very logical. They also support ~strikethrough~, `monospaced`, and triple-backtick code blocks, which are all nice.

    I personally don’t care much for intra-word emphasis and I’d rather keep the single tilde free for strikethrough syntax.

  2. Reference links

    Shortcut reference links are great, and it would be a shame to remove them. They are very intuitive and readable. Just go to a random Hacker News comment thread and you’ll see people instinctively using them, even though they’re not supported there (just Ctrl/Cmd+F [1] to see examples).

    They are also essential for wiki-style and academic writing, which are both extremely rich in references. The extra noise caused by [this style][] would hurt readability.

  3. Indented code blocks

    Big yes to the more logical list indentation style, and to removing indented code blocks. What a pain they are.

  4. Raw HTML

    Not a fan of the specific syntax (why the extra =?), but this sounds good in general.

  5. Lists and blank lines

    I’d rather we just allow the creation of a list even without a blank line separating it from a paragraph. It seems to me this would rarely be a problem in practice. The example given can just be fixed by having 220. be on the previous line. Or maybe one could allow escaping the period like so 200\. to mean that you really do want to write 200. at the beginning of the line, and not start a list. Again, I really doubt this will happen very often.

  6. Attributes

    Why not use a consistent way of creating attributes for headers, like on GitHub? This would avoid having to introduce extra syntax, and keep documents cleaner.


There is a discussion regarding adding header IDs to CommonMark, but it seems that - at least in some cases - having the ability to manually specify these is useful.

To keep shortcut reference links readable, how about double brackets?

In other words, links would use external brackets for a reference label [foo][bar], and nested brackets if the link text is its own label [[foo]].

That makes it immediately clear there’s a link, not just a span or literal brackets; and it’s more natural-looking than appending empty brackets [foo][].

[[foo]]

[foo]: https://example.com
<a href="https://example.com">foo</a>

If there’s no link reference definition, the double-bracketed text would still be a link. It could fall back to an implicit page link:

[[foo]]
<a href="foo">foo</a>
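
The lookup-with-fallback behaviour described above can be sketched in a few lines of Python. This is an illustration only: render_double_bracket_links is a hypothetical name, the two regular expressions are simplified assumptions (no titles, no nested brackets, no multi-line definitions), and labels are matched case-insensitively.

```python
import html
import re

# Simplified link reference definition: "[label]: destination" on its own line.
DEFINITION = re.compile(r"^\[([^\]]+)\]:\s*(\S+)\s*$", re.MULTILINE)
# Double-bracketed link text with no nested brackets.
DOUBLE_BRACKET = re.compile(r"\[\[([^\[\]]+)\]\]")


def render_double_bracket_links(source):
    # Collect the definitions, keyed by case-folded label.
    refs = {m.group(1).casefold(): m.group(2) for m in DEFINITION.finditer(source)}
    body = DEFINITION.sub("", source)

    def replace(match):
        label = match.group(1)
        # No definition: fall back to an implicit page link.
        href = refs.get(label.casefold(), label)
        return '<a href="{}">{}</a>'.format(html.escape(href, quote=True), html.escape(label))

    return DOUBLE_BRACKET.sub(replace, body).strip()


print(render_double_bracket_links("[[foo]]\n\n[foo]: https://example.com"))
print(render_double_bracket_links("[[foo]]"))
```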

That’s a good suggestion. This is similar to the [[text]] format of the Wiki markup used by MediaWiki, and it is human-readable and doesn’t look horrid.

The trouble would be that it isn’t backward-compatible. But it is something I feel one can support, since I feel standardization is more important than backward-compatibility.

I would say MediaWiki is exactly the reason why IMHO Markdown (or derivatives of it) should not use [[foo]] for the ordinary links. Wiki is one very natural application for Markdown-like syntax and it’s therefore better to reserve that syntax for wiki-links to other articles defined by the database of its articles; not to some arbitrary URIs.



Is there sufficient consensus to move forward on any of the 6 items that @jgm flagged? Given that CommonMark is yet to have a 1.0 release, I feel some simplification and rationalization would still be a good idea. Given that a tool like pandoc exists, people could convert from non-compliant Markdown formats into a CommonMark-compliant format quite easily, and it is a one-time process. Hence, backward compatibility should not be a hard block, in my opinion.
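
As a sketch of what that one-time conversion could look like when scripted from Python: pandoc’s --from, --to and --output options and its commonmark output format exist, but the wrapper function names, the file names, and the choice of markdown_phpextra as the source dialect are assumptions made for the example.

```python
import subprocess


def pandoc_command(src, dst, source_format="markdown"):
    """Build the pandoc invocation that rewrites src as CommonMark."""
    return ["pandoc", "--from", source_format, "--to", "commonmark", "--output", dst, src]


def convert_to_commonmark(src, dst, source_format="markdown"):
    # Needs the pandoc executable on PATH; raises if pandoc is missing
    # or exits with a non-zero status.
    subprocess.run(pandoc_command(src, dst, source_format), check=True)


# Hypothetical file names, shown without running the conversion.
print(" ".join(pandoc_command("notes.md", "notes.commonmark.md", "markdown_phpextra")))
```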

All things considered, I still stand by this assessment that I posted 1.5 years back: Beyond Markdown


Can you share your current thoughts on the state of the 6 items @jgm? Is there anything specifically we can do to help?

Jeff Atwood via CommonMark Discussion
noreply@talk.commonmark.org writes:

Can you share your current thoughts on the state of the 6 items @jgm?
Is there anything specifically we can do to help?

As I say in the document, it’s not really a list of proposals for
modifying commonmark, which has the aim of retaining compatibility.

(Though I have implemented the general attribute syntax as
an extension in my Haskell commonmark library, commonmark-hs,
and it’s available as an extension to commonmark in pandoc.)

I tend to agree the next order of business is to pick the most popular “extension” (as observed in the wild, on the internet) and add that to the CommonMark spec. My vote would strongly be for tables.

That’s assuming we’re confident in the base CommonMark spec and there aren’t any major outstanding, unresolved issues in it? :thinking:

The discussion around creating a new language is interesting, but a bit of a distraction from the goal of strongly specifying Markdown. Unless there’s some consensus about moving applications to the new language, there’s probably more practical value in formalising what’s already in wide use. I agree with @codinghorror about moving forward with extensions; assuming that it is relatively stable, could the core spec be left in a pre-1.0 state for the time being and extension specs built on top of this?


I wouldn’t be against adding a table extension.

But even for simple pipe tables there are a number of tricky
cases to consider, and if you want to open up more features
(row/colspans, side headings, headers and footers, etc.),
it starts to get very complex very fast.

For now I’d suggest that anyone who is adding a table extension
should make it compatible with the gfm table spec.
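
One of the simpler tricky cases is a literal pipe inside a cell, which the gfm table spec allows when it is written as a backslash-escaped pipe. A minimal Python sketch of splitting a single row follows; split_row is a hypothetical helper, and it ignores further complications such as an escaped backslash directly before a pipe, alignment rows, and mismatched column counts.

```python
import re


def split_row(line):
    """Split one pipe-table row into cells, honouring backslash-escaped pipes."""
    line = line.strip()
    if line.startswith("|"):
        line = line[1:]                      # optional leading pipe
    if line.endswith("|") and not line.endswith("\\|"):
        line = line[:-1]                     # optional trailing pipe
    cells = re.split(r"(?<!\\)\|", line)     # split on unescaped pipes only
    return [cell.strip().replace("\\|", "|") for cell in cells]


print(split_row("| a | b \\| c |"))
```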


PS. This is not the right thread in which to discuss extensions
or the commonmark spec.


It would be very useful if you implemented a comment character (or a double comment character) that turns the rest of the line following it into a comment, so that it is ignored when the text is displayed.
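
To show the behaviour I have in mind, here is a minimal Python sketch of a pre-processing step that drops such comments before rendering. Everything specific in it is an assumption: the marker %% is only a placeholder for whatever character would be chosen, strip_comments is a hypothetical name, and the only protection against false positives is that text inside backtick-fenced code blocks is left alone.

```python
FENCE = "`" * 3  # a backtick code fence marker


def strip_comments(source, marker="%%"):
    """Remove everything from the marker to the end of each line."""
    kept = []
    in_fence = False
    for line in source.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence          # entering or leaving a code block
        if not in_fence:
            cut = line.find(marker)
            if cut != -1:
                line = line[:cut].rstrip()
                if not line:
                    continue                 # the line was only a comment
        kept.append(line)
    return "\n".join(kept)


print(strip_comments("Visible text %% hidden note\n%% whole-line comment\nMore text"))
```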