Not sure if this is really within the scope of this standard to change, but this has irked me and several others: * and _ both act exactly the same when producing italics or bold text. This makes markdown a bit more confusing to learn, and to avoid making a mess, the author has to pick one scheme and stick to it.
One solution would be to scrap one of them altogether and use * or _ exclusively. For example, *i* could be the only way to create italics, and **b** the only way to create bold text. Another solution would be to let *b* be bold and _i_ be italics.
As I said, I'm not sure whether it's within the scope of this standardization effort, but in my opinion, it is a real issue with the markdown standard, and has to be changed at some point. This will break some backwards compatibility, so if it should be changed at all, it has to be changed as soon as possible.
I agree here, but I wonder how disruptive this would be for sites/apps with existing content to adopt SMD… I wonder to what degree backwards compatibility is a concern that should be acknowledged. Making it relatively painless to upgrade would spur adoption.
First, remember that according to the HTML spec it's not italics and bold, it's emphasis and strong importance. I think what's more at issue here is a contradiction between the HTML spec and SMD spec:
HTML
The em element represents stress emphasis of its contents… The level of stress that a particular piece of content has is given by its number of ancestor em elements… The em element also isn't intended to convey importance; for that purpose, the strong element is more appropriate.
SMD
Markdown treats asterisks (*) and underscores (_) as indicators of emphasis.
So while HTML stresses that em and strong are completely different elements, SMD clearly conflates them and considers strong to be "strong emphasis". I don't see a clear path here.
Using only * for strong and only _ for em would be more compliant with the HTML standard by separating the two elements. SMD does already allow for nesting of em and strong, so "strong emphasis" could still be conveyed by nesting.
The existing behavior is more consistent with Markdown's original (though technically incorrect) interpretation of em and strong. I think this is also pretty consistent with those elements' usage on the net.
When it comes down to it, I think this is really an HTML5 problem. i, b, em, and strong are pretty high on the list of inconsistently used elements, so what should SMD do? Be a stickler and try to follow the HTML5 standard or go for ease-of-use and do what people usually intend?
Markdown is based on the way people used to mark up conversations on USENET. In that environment, people used *emphasis* and _emphasis_ and sometimes also -emphasis- interchangeably. If they wanted to indicate extra emphasis, they'd put **double stars** or __double underscores__, again interchangeably. People generally tended to pick one or the other and stick to it, except sometimes to avoid ambiguity (e.g. I hate *identifiers_with_underscores*).
That's why it's interchangeable in Markdown. Making _ and * each do something different would be an unpleasant and annoying break with tradition.
You say that this makes markdown a bit more confusing to learn, and to avoid making a mess, the author has to pick one scheme and stick to it. I don't see why this is confusing. There are two different ways to write each level of emphasis; how is that more confusing than it would be if * and _ did different things? (And then you would have to remember to nest them properly!)
Backwards compatibility was a major goal. Of course, implementations vary enough that it's impossible to avoid all breakage. But we want to minimize it.
In my experience, most people intuitively see _ as denoting emphasis (it's visually lighter) and * as denoting strong importance (it's visually heavier).
Gruber had some logical arguments behind his decision, but it ultimately seems to have been his personal preference and stubbornness that won through. Aaron Swartz, Merlin Mann and several others were advocates for _emphasis_ and *strong*, but Gruber was not convinced…
In short, you're sort of screwed, because that's how I write, and it's how I've written since around 1992.
Unfortunately, any change to this would break backwards compatibility with legacy Markdown documents. Not to mention the habits of everyone who has used Markdown for years. Trying to get this fixed in SMD will not be easy.
That said, I think it could be done with relatively little breakage in legacy documents.
Have the spec declare that _ denotes emphasis and * denotes strong importance, while still allowing doubles (__, **) to denote the same as their respective single character. Double characters must be used to add emphasis and strong importance in the middle of words.
The breakage in legacy documents would then be:
__foo__, intended as strong importance, instead comes out as emphasis
*bar*, intended as emphasis, instead comes out as strong importance
It's mainly a semantic issue: the degrees of emphasis change. Unless I've overlooked something? (Not unlikely. Despite its simple syntax, Markdown is full of quirks and surprises!)
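To make the breakage concrete, here is a rough sketch that renders the same text under both rule sets and lists the spans whose tag would change. It is a toy regex matcher, not a real Markdown parser: the names `LEGACY`, `PROPOSED`, `render` and `breakage` are made up for illustration, and the intra-word rule is deliberately not modeled.

```python
import re

# Legacy Markdown: the delimiter length picks the tag, the character does not matter.
LEGACY = {"*": "em", "_": "em", "**": "strong", "__": "strong"}
# Proposed rule: the character picks the tag, doubling does not matter.
PROPOSED = {"_": "em", "__": "em", "*": "strong", "**": "strong"}

# One or two identical delimiters, text without delimiters, then the same delimiter run.
SPAN = re.compile(r"(\*\*|__|\*|_)([^*_]+)\1")

def render(text, rules):
    """Replace each delimited span with the tag the given rule set assigns."""
    def repl(m):
        tag = rules[m.group(1)]
        return f"<{tag}>{m.group(2)}</{tag}>"
    return SPAN.sub(repl, text)

def breakage(text):
    """Return the spans whose tag differs between the legacy and proposed rules."""
    return [m.group(0) for m in SPAN.finditer(text)
            if LEGACY[m.group(1)] != PROPOSED[m.group(1)]]

sample = "_a_ **b** __foo__ *bar*"
print(render(sample, LEGACY))
print(render(sample, PROPOSED))
print(breakage(sample))
```

Under these assumptions only the `__foo__` and `*bar*` forms change meaning, which matches the two cases listed above.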
I'd love to see this getting fixed in SMD. But Markdown is in a very different position today than when Gruber made his decision back in 2006. It will not be easy to get a change like this accepted today either.
I would be vehemently against "fixing" this, since it would break backwards compatibility in a big way. And not only in existing documents, but it would also change patterns that millions of people might have learned already.
That's my conclusion as well. Long story short: It's too late.
Edit: Just as a tease, here's how I would have liked Markdown to be:
I _really_ like Markdown, but *beware*, stick to the spec.
It is supercali**fragilistic**expialidocious.
Emphasis: _foo_
Strong: *foo*
Intra-word emphasis and strong: foo__bar__baz, foo**bar**baz
No intra-word emphasis and strong: foo_bar_baz, foo*bar*baz
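The rules in that tease can be sketched in a few lines of Python. This is only an approximation under my own assumptions: `TAGS`, `DOUBLE`, `SINGLE` and `render_tease` are hypothetical names, and the regexes handle the simple cases above, not nesting or escapes.

```python
import re

# Assumed mapping for the tease: the character picks the tag, doubled or not.
TAGS = {"_": "em", "*": "strong"}

# Doubled delimiters may open and close anywhere, including inside a word.
DOUBLE = re.compile(r"(\*\*|__)(.+?)\1")
# Single delimiters only count when they are not glued to a word character
# (or another delimiter) on the outside, and not next to whitespace on the inside.
SINGLE = re.compile(r"(?<![\w*])([*_])(?!\s)(.+?)(?<!\s)\1(?![\w*])")

def render_tease(text):
    """Render doubled delimiters first, then single ones."""
    def repl(m):
        tag = TAGS[m.group(1)[0]]  # first character decides the tag
        return f"<{tag}>{m.group(2)}</{tag}>"
    text = DOUBLE.sub(repl, text)
    return SINGLE.sub(repl, text)

for line in ["Emphasis: _foo_", "foo__bar__baz, foo**bar**baz", "foo_bar_baz, foo*bar*baz"]:
    print(render_tease(line))
```

The last input is left untouched, because single delimiters inside a word are treated as literal characters.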
What if asterisks were used for <em> and <strong>, and underscores for <i> and <b>. This would preserve backwards compatibility visually, while allowing authors to make the appropriate distinction when necessary.
E.g.:
I really like The Big Lebowski!
I *really* like _The Big Lebowski_!
(Note that this may introduce accessibility issues, where things that were properly emphasized in legacy documents suddenly are not properly emphasized. However, I'm guessing legacy documents already had accessibility issues with strings that were emphasized inappropriately.)
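As a quick illustration of that split, here is a small sketch. The mapping of the doubled forms to <strong> and <b> is my reading of the proposal rather than something spelled out in it, and `SPLIT_TAGS` and `render_split` are hypothetical names.

```python
import re

# Assumed mapping: asterisks keep the semantic tags, underscores become presentational.
SPLIT_TAGS = {"*": "em", "**": "strong", "_": "i", "__": "b"}

# One or two identical delimiters, text without delimiters, then the same delimiter run.
SPLIT_SPAN = re.compile(r"(\*\*|__|\*|_)([^*_]+)\1")

def render_split(text):
    """Wrap each delimited span in the tag assigned to its delimiter."""
    return SPLIT_SPAN.sub(
        lambda m: "<{0}>{1}</{0}>".format(SPLIT_TAGS[m.group(1)], m.group(2)),
        text,
    )

print(render_split("I *really* like _The Big Lebowski_!"))
```

Both spans still render in italics in a typical browser, but only the asterisk span carries emphasis semantics.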
Generally speaking, if there are 2 ways to achieve the same thing, then it's confusing to learn. As a novice you don't know if it's really the same, if one is different from the other in some way or which you should use for what. Once someone tells you it's actually the same, it's not confusing anymore, but you can't intuitively decide that for yourself (necessarily).
In this case though I'm with you in that 1) it's already too late and 2) it's not confusing because these notations have been around forever and they both communicate well what they do. If rewinding time were possible, I'd definitely remove one, though.
While I would like to get rid of underscores (I find them confusing as used for emphasis), I suspect that it would break compatibility with far too many documents to be worth the change.
Even after using Markdown for ages, I still think the Markdown convention for * and _ is wrong.
I've always seen italics as _ and bold as *.
Even if it's not backwards compatible, we really do need to sort this one out, or this mistake will be set in stone even harder (much like the switch from Python 2.x to 3.x).
Maybe you could make:
<em> → _em_
<strong> → *strong*
<i> → __italic__
<b> → **bold**
My justification for the above is based on this section about the HTML5 <b> tag, which indicates that <strong> should be preferred over <b> for bold text. HTML b Tag:
Note: According to the HTML 5 specification, the <b> tag should be used as a LAST resort when no other tag is more appropriate. The HTML 5 specification states that headings should be denoted with the <h1> to <h6> tags, emphasized text should be denoted with the <em> tag, important text should be denoted with the <strong> tag, and marked/highlighted text should use the <mark> tag.
Your quote highlights why your proposition is a bad idea; it's semantically incorrect. __text__ is strongly emphasized due to being surrounded by double underscores. The <i> tag does not mean strong emphasis/importance.
It's not about "bold" or "italics"; it's about emphasis and importance. In which case you could only argue for _ as emphasis and * as importance. This would be incompatible with having double asterisks or double underscores as means of indicating importance, resulting in massive backwards compatibility issues.
Not to mention that your proposal from the beginning already breaks many, many existing documents and implementations. (Everyone's single asterisk wrapped text would be changed from being in <em> tags to being in <strong> tags, and vice versa for double underscore wrapped text.)
This is one of those topics where the original Markdown spec is quite clear in its definitions… I also think the logic is reasonably justifiable.
And since the goal of CommonMark is to be as true as possible to the original spec, and render existing Markdown docs as faithfully as possible, our path forward is clear here.
Excuse my possible mistakes. This is my first post on this platform.
@codinghorror, with all due respect, this particular issue isn't about logical consistency (sorry, I'm afraid I haven't been able to find its justification here).
In my opinion, the main problem for users is that this is visually misleading.
And this isn't limited to new users; it also affects source readability in CommonMark for experienced users.
Markdown already has an installed base of thousands of websites and apps, and millions of documents. Changing the language in backwards-incompatible ways for commonly used syntax is not really on the table. The only thing that would accomplish is a further fragmentation of the language, where the same text renders in different ways depending on what markdown implementation and version is being used.