What are "blanks"?

release-1.0
#1

Section 4.2 ATX Headings states:

Leading and trailing blanks are ignored in parsing inline content

and then gives an example in which # foo, with the heading text surrounded by runs of spaces, yields <h1>foo</h1>.

The term “blank” appears in the spec (v0.27) 85 times: 83 times in the phrase “blank line[s]”, once in the phrase “blank HTML comment”, and once in the above-quoted statement.

Earlier, in section 2.1, the spec takes care to define the terms “blank line”, “whitespace character”, “whitespace”, “Unicode whitespace character”, “Unicode whitespace”, “space”, and “non-whitespace character”. But the term “blank”, in reference to a character, is nowhere defined, which renders the above-quoted statement ambiguous.

I think that it’s clear from context and from precedent that “blank” should here be treated as synonymous with “space”, but I believe it would be prudent to make that explicit in the text.
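
To make the ambiguity concrete, here is a minimal sketch of the three character classes from section 2.1 that “blank” could plausibly map to. The names `SPACE`, `WHITESPACE`, `is_unicode_whitespace`, and `classify` are my own for illustration and do not come from the spec or from any implementation.

```python
import unicodedata

# Character classes as defined in section 2.1 of the spec (v0.27).
# All names below are illustrative, not taken from any implementation.
SPACE = {"\u0020"}
WHITESPACE = {"\u0020", "\u0009", "\u000A", "\u000B", "\u000C", "\u000D"}


def is_unicode_whitespace(ch: str) -> bool:
    """Any code point in Unicode class Zs, or tab, CR, newline, form feed."""
    return unicodedata.category(ch) == "Zs" or ch in {"\u0009", "\u000D", "\u000A", "\u000C"}


def classify(ch: str) -> dict:
    """Report which candidate reading of "blank" would cover this character."""
    return {
        "space": ch in SPACE,
        "whitespace": ch in WHITESPACE,
        "unicode_whitespace": is_unicode_whitespace(ch),
    }


if __name__ == "__main__":
    for ch in [" ", "\t", "\u000B", "\u00A0"]:
        print(repr(ch), classify(ch))
```

The classes overlap without nesting neatly: a no-break space (U+00A0) is only a Unicode whitespace character, while a line tabulation (U+000B) is only a whitespace character. Which reading of “blank” is chosen therefore changes the result for real inputs.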

#2

The JavaScript reference implementation at try.commonmark.org takes a different approach than I expected and treats U+0009 (horizontal tab) as a “blank”. This is evidence that “blank” should be synonymous with “whitespace character”, although it could still mean “Unicode whitespace character”.

#3

It would be best to avoid the term “blanks” and rewrite using one of the defined terms you mentioned in each case.

#4

From what I’ve seen in the reference implementations, “blank” is a space or tab, NOT all whitespace (like \f or \v).

Furthermore, even when a definition links the term “whitespace”, I’ve noticed that in many cases the reference implementations really mean space or tab.
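
As a rough illustration of how the two readings differ, the sketch below strips the raw content of a heading under each one. `strip_blanks` is a hypothetical helper written for this post, not code from any reference implementation, and the “space or tab” default is only my reading of the observed behavior.

```python
# Candidate sets of characters to ignore at the edges of heading content.
SPACE_OR_TAB = " \t"                    # the behavior described above (assumed)
SPEC_WHITESPACE = " \t\n\x0b\x0c\r"     # "whitespace character" per section 2.1


def strip_blanks(raw: str, blanks: str = SPACE_OR_TAB) -> str:
    """Remove leading and trailing characters that count as blanks."""
    return raw.strip(blanks)


if __name__ == "__main__":
    raw = "\t foo \x0c"  # tab, space, "foo", space, form feed
    print(repr(strip_blanks(raw)))                   # form feed survives
    print(repr(strip_blanks(raw, SPEC_WHITESPACE)))  # form feed removed
```

Under the “space or tab” reading the trailing form feed stops the stripping, so both it and the space before it remain in the heading content. Under the “whitespace character” reading everything around foo is removed.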
