The issue as I see it is not that someone would start a sentence with a hashtag so much as that one might end up at the beginning of a line from hard wrapping.
Also, it seems far nicer to require the space just from a readability perspective.
So what are our test cases here:
#heading
# heading
#hashtag
#15 number
#1533 github issue
Did I miss any?
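To make the comparison concrete, here is a minimal sketch that runs the test cases above through both rules. The two regexes and the `is_heading` helper are my own simplification for illustration, not the spec's wording; they ignore leading indentation and closing hashes.

```python
import re

# Lenient rule (assumption): 1-6 hashes open a heading, space optional.
LENIENT = re.compile(r"^#{1,6}(?!#)")
# Strict rule (assumption): 1-6 hashes must be followed by a space, tab, or end of line.
STRICT = re.compile(r"^#{1,6}(?:[ \t]|$)")


def is_heading(line, require_space):
    """Return True if the line would open an ATX heading under the chosen rule."""
    pattern = STRICT if require_space else LENIENT
    return pattern.match(line) is not None


CASES = ["#heading", "# heading", "#hashtag", "#15 number", "#1533 github issue"]

for case in CASES:
    lenient = is_heading(case, require_space=False)
    strict = is_heading(case, require_space=True)
    print(f"{case!r}: lenient={lenient} strict={strict}")
```

Under the strict rule only `# heading` survives as a heading; under the lenient rule all five lines do, which is exactly the ambiguity being discussed.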
When I arrived at this discussion I was in favor of allowing no space for backwards compatibility, but after reading through the thread and considering the different cases I think I'm converted to requiring the space.
I've felt the pain of manually editing lots of markdown files to include the space in headers due to moving between blog engines. As far as I can tell, there's no simple way around it.
@balpha, how many comments or posts in chat begin with a hash?
Personally I think that requiring a space after the hash will only really break existing Markdown users who bothered to read the docs, and those are the most likely to be able to adapt to the new behaviour, and then we should know better than to use h1 tags in post bodies.
The easier way out would be to explicitly assign #this special semantics. Shorter post contexts like reddit and discourse probably want hashtags. Longer form posting like SE and blogs probably want anchoring for intra-article links. (Github has issue numbers, though those are numbers so #this remains ambiguous.)
If you don't want to use extensions though and you really want to automagically do the right thing, at least for most languages, disambiguate on the character immediately following the # sign. If that character is uppercase, it's a title, otherwise it's a hashtag. (Define uppercase as `x == x.upper()`, so "1", " (" and "本" would count as uppercase.) This renders `#swag` and `#ifdef` verbatim, it leaves `#Loud noises` loud and goes with backwards compatibility with the rest of the cases.
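A rough sketch of that heuristic, assuming the check applies to the first character after the whole run of leading hashes (the post does not say what happens with `##`, so that part is my guess, as is the `classify` name):

```python
def classify(line):
    """Return 'heading', 'hashtag', or 'text' for a single line."""
    if not line.startswith("#"):
        return "text"
    rest = line.lstrip("#")
    if not rest:
        # Assumption: a line of bare hashes is treated as an (empty) heading.
        return "heading"
    first = rest[0]
    # "Uppercase" as defined above: the character is unchanged by upper(),
    # so digits, spaces, and caseless scripts all count as uppercase.
    return "heading" if first == first.upper() else "hashtag"


for sample in ["#swag", "#ifdef", "#Loud noises", "# heading", "#15 number"]:
    print(sample, "->", classify(sample))
```

Note that `#15 number` and `#1533 github issue` still come out as headings here, so this only helps with lowercase hashtags.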
As one of the #noobs who contributed to this by always leaving out the space, sorry.
I see the point of changing this; I know GitHub does require the space (which incidentally catches me out all the time).
It is important to note this only affects people who want to put a #hashtag at the start of the line (which I guess is a fairly common use case).
In a chat-like application that uses hashtags, the "line starts with a hashtag" case is an extremely common use case. Often a chat message can be just a single hashtag. Adding barriers like having to escape them in this case (starting a line) vs. everywhere else would be really confusing and a major pain point. In a chat application, the pace of communication is easily a lot faster than when writing blog posts or comments, so breaking that workflow is in my opinion also more harmful.
It would be preferable not to include too many of these options. Ideally, none; every option essentially creates another Markdown flavour. I want to be able to copy and paste between different CommonMark-compliant applications and have everything just work.
Breaking backward compatibility seems reasonable to account for expected behaviour, as with the list marker being significant in CommonMark.
#hello
Twitter users have also been trained to put a character before a message starting with @ when they want their tweet to be publicly visible. People can learn to deal with strange workarounds if they have to… (like what I did above).
@balpha - for your 1.5% number, is that the number of a random set of sample posts? (that was my initial interpretation)
Or did you only select posts that already contain ATX headers?
I'd be interested in what % of ATX headers in your sample are using a space vs. what % are not using a space.
But I'm currently in favor of requiring the space after the hash, because a) 1.5% doesn't sound that bad, b) there are plenty of ways to make headers, c) GitHub already does it, and d) Twitter / Instagram, aka:
Think of the children
Your initial interpretation was correct. The posts I checked were simply the last 200k posts on meta.SE (at the time my local copy was made).
##Philosophy
This markup's primary utility comes from an ability to mostly just Do The Right Thing with ASCII-formatted text. If I write a bit of text that looks like a heading, it becomes a heading. If I write text that looks like a paragraph, it becomes a paragraph.
Narrowing the spec to require authors to remember strict rules in order for their documents to be displayed is counter to this philosophy. If I have a short bit of text prefixed by some number of hashes on its own line followed by a paragraph, I intended that to be a header. If the system I'm using has some other convention for hashes (tagging, numbered article references, whatever) then they should consider adopting a similarly-obvious convention for their use that does not conflict with headers.
##Practicality
Who uses H1 headers? Who uses multiple H1 headers in a document? How common is the use of multiple H1 headers inline or at the end of documents? I'm not using any H1s here, because this document I'm contributing to already has a title…
If the goal is to avoid conflicts with hash tags (which are commonly NOT at the start of a document), then a client which expects hashtags would be well-served by just ignoring H1 headers after the start of a document, or even pre-processing hashtag-like things.
##Scope
In no case is the processing of hashtags, issue refs, or other such entities a part of CommonMark; how, when, or IF they are processed is likely to continue being the responsibility of the client, with all the inconsistencies that implies. But headers are, and breaking compatibility with most existing implementations for the sake of something that is wholly out of scope is careless.
I wouldn't say "item with space" is a strict list rule…
-this is not a list
-is it?
In general from what I have seen, requiring the space is the correct choice, reduces casual errors, and has the most support; even from posting this topic on Twitter, the responses were 80% positive in favor of requiring the space.
Being slightly more strict, when it comes to Markdown, is kind of what we're doing with CommonMark. Not so much that it breaks a ton of stuff, but where it will make everyone's future lives easier by reducing confusion. And I believe that is true in this decision as well.
My concern is still that this is largely a "But hashtags!" argument, which has more to do with resolving an issue of conflating another markup convention with CommonMark than it does with making CommonMark itself straightforward.
At a minimum, I feel like the spec should state that a space must always work, and provide an option that for implementations which also interpret hashtags externally, a space may be required. Unfortunately this introduces a level of ambiguity which I would imagine we're trying to avoid, so that's not much of a solution at all.
I still feel that simply expecting that users will use a backslash to prevent headerfication when there's a conflict is a perfectly fine option, but I suspect you aren't swayed by that argument. Admittedly I'm also not entirely sure what cases there are (except GitHub?) where hashtags are interpreted a special way and headers are considered allowed markup though, so.
We are using the CommonMark spec (markdown-it with some plugins) for diaspora*, a decentralized social network with more than 80000 active users on more than 300 servers. We like the current rules for headings because it allows our users to easily express themselves using #hashtags (we wrote a plugin for those) while still being able to define headers in an intuitive way.
Twitter has hashtags. Twitter does not have markdown. I'm sure there are lots of markdown users among your followers, but still you have some major potential for sampling bias there.
Unnecessary in any case. There's a massive amount of existing markdown publicly available; if the (ahem) common practice is to eschew spaces, then the standard is fine. If it's mixed, not fine. A unifying standard should reflect what we got.
What if we did this? Require a space for `<h1>` only, and putting another # at the end of the line makes it an h1 again.
For a first-level ATX header, if there is only the `#` sign at the beginning (and not the end) of the line, and there is no space character immediately after the `#` symbol, then an ATX header is not created; interpret the line as the start of a paragraph.
I'm not seeing many cons to having this rule besides having to vet some previous data once.
#hashtag?
#heading#
##heading
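As a sketch of how that proposal might look in code (the `heading_level` function and the `ATX` regex are hypothetical, and this ignores indentation and stripping of closing hashes from the heading text):

```python
import re

# 1-6 hashes not followed by another hash, then the rest of the line.
ATX = re.compile(r"^(#{1,6})(?!#)(.*)$")


def heading_level(line):
    """Return the heading level (1-6), or None if the line starts a paragraph."""
    m = ATX.match(line)
    if m is None:
        return None
    hashes, rest = m.group(1), m.group(2)
    if len(hashes) > 1:
        return len(hashes)  # "##heading" stays a heading, space or not
    if rest == "" or rest[0] in " \t":
        return 1  # "# heading" (or a bare "#")
    if rest.rstrip().endswith("#"):
        return 1  # "#heading#": the closing hash opts back in
    return None  # "#hashtag?" is read as paragraph text


for sample in ["#hashtag?", "#heading#", "##heading", "# heading"]:
    print(sample, "->", heading_level(sample))
```

The three examples above come out as paragraph, h1, and h2 respectively, which matches the intent of the proposal as I read it.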
It's only really ambiguous for top-level headings, it seems. Anyhow …
When I was starting to write a more general specification for lightweight markup languages a little over a year ago, one of the first things I decided was a line syntax that required at least one space (or tabulator) after the line prefix to avoid ambiguity and enforce uniformity.
) Line
attribute) Line
(attribute) Line
My parser that is still in the making will have a Markdown-like mode, but it won't support:
#Heading
>Blockquote
-Bullet point
1.Enumerated list item
Of these, the only one supported by CommonMark is `>Blockquote`.
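For what it's worth, a uniform "prefix, then whitespace" rule is easy to express. The following is only a guess at the shape of such a rule, applied to Markdown-style prefixes; `PREFIX` and `split_prefix` are made-up names and this is not the parser described above.

```python
import re

# Assumption: a line prefix only counts when followed by at least one space or tab.
PREFIX = re.compile(r"^(#{1,6}|>|[-*+]|\d+[.)])[ \t]+(.*)$")


def split_prefix(line):
    """Return (prefix, content), or (None, line) when no valid prefix is found."""
    m = PREFIX.match(line)
    if m is None:
        return None, line
    return m.group(1), m.group(2)


for sample in ["#Heading", "# Heading", ">Blockquote", "-Bullet point", "1.Enumerated list item", "1. Item"]:
    print(repr(sample), "->", split_prefix(sample))
```

All four unspaced forms listed above fall through as plain text under this rule, while their spaced equivalents are split into prefix and content.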
My vote goes to requiring the space for headers.
There are relatively few places online where you're allowed to write headings anyway. GitHub is a major exception (but they already made the space mandatory), and of course there's writing your own blog post (or other documents), but those people tend to be more markdown pros anyway than the average users that just want to write their hashtag in their favourite social network or whatever. If we want these online services to adopt CommonMark, we should optimize for the novice use case (and even pros should use a space since it's just ugly without).
I agree with @mb21 and I also think @riking's proposal has merit.
#Hashtags come up way too often in the real world, and I see problems caused by not requiring the space all the time; it confuses people.
This one is an easy and obvious call.
And even that one is kind of debatable now that we've gotten deeper into this…