What are "blanks"?

Section 4.2 ATX Headings states:

Leading and trailing blanks are ignored in parsing inline content

and then gives # foo => <h1>foo</h1> as an example.

The term “blank” appears in the spec (v0.27) 85 times: 83 times in the phrase “blank line[s]”, once in the phrase “blank HTML comment”, and once in the above-quoted statement.

Earlier, in section 2.1, the spec takes care to define the terms “blank line”, “whitespace character”, “whitespace”, “Unicode whitespace character”, “Unicode whitespace”, “space”, and “non-whitespace character”. But the term “blank”, in reference to a character, is nowhere defined, which renders the above-quoted statement ambiguous.

I think that it’s clear from context and from precedent that “blank” should here be treated as synonymous with “space”, but I believe it would be prudent to make that explicit in the text.

The JavaScript reference implementation at try.commonmark.org takes a different approach from the one I expected: it treats U+0009 (horizontal tab) as a “blank”. This is evidence that “blank” should be synonymous with “whitespace character”, although the intended meaning could still be “Unicode whitespace character”.
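
To make the ambiguity concrete, here is a rough Python sketch of how the candidate readings of “blank” would strip the same heading content differently. It is not taken from any reference implementation. The names `strip_blanks`, `is_unicode_whitespace` and the three sets are made up for illustration, and the character sets are my reading of the section 2.1 definitions.

```python
import unicodedata

# Candidate meanings of "blank" (hypothetical names, based on my reading
# of the definitions in section 2.1 of the spec).
SPACE = {" "}
SPACE_OR_TAB = {" ", "\t"}
WHITESPACE = {" ", "\t", "\n", "\x0b", "\x0c", "\r"}


def is_unicode_whitespace(ch):
    """Zs class, or tab, carriage return, newline, or form feed."""
    return unicodedata.category(ch) == "Zs" or ch in "\t\r\n\x0c"


def strip_blanks(text, is_blank):
    """Drop leading and trailing characters for which is_blank is true."""
    start, end = 0, len(text)
    while start < end and is_blank(text[start]):
        start += 1
    while end > start and is_blank(text[end - 1]):
        end -= 1
    return text[start:end]


# Heading content with a tab, a line tabulation, and a no-break space.
sample = "\t\x0bfoo\u00a0"

as_space = strip_blanks(sample, lambda ch: ch in SPACE)
as_space_or_tab = strip_blanks(sample, lambda ch: ch in SPACE_OR_TAB)
as_whitespace = strip_blanks(sample, lambda ch: ch in WHITESPACE)
as_unicode_whitespace = strip_blanks(sample, is_unicode_whitespace)
```

With this sample, all four readings give a different result: the “space” reading leaves the string untouched, “space or tab” removes only the tab, “whitespace character” also removes the line tabulation, and “Unicode whitespace character” removes the tab and the no-break space but keeps the line tabulation.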

It would be best to avoid the term “blanks” altogether and rewrite each case using one of the defined terms you mentioned.

From what I’ve seen in the reference implementations, “blank” is a space or tab, NOT all whitespace (like \f or \v).

Furthermore, even where a definition links to the term “whitespace”, I’ve noticed that in many cases the reference implementations really mean space or tab.