Setext headers and empty list items

Spec says: “If a line containing a single - can be interpreted as an empty list items [sic], it should be interpreted this way and not as a setext header underline.”

But if you think about it, this rules out all one-character hyphen underlines. After all, the following is a list with one empty item:

-

A list can start directly after a paragraph, with no intervening blank line:

hi
- a

So, in this example, the second line is interpretable as an empty list item:

a
-

According to the spec as written, it shouldn’t be a setext header, but the reference parser treats it as one.
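To make the overlap concrete, here is a minimal Python sketch. The two patterns and the `interpretations` helper are hypothetical and heavily simplified (bullet lists only, no container handling); they are not taken from the spec or from any reference parser. They only show that a lone `-` satisfies both readings, while `--` satisfies one.

```python
import re

# Simplified, assumed patterns: up to three leading spaces, optional trailing whitespace.
SETEXT_UNDERLINE = re.compile(r"^ {0,3}(=+|-+)[ \t]*$")
EMPTY_BULLET_ITEM = re.compile(r"^ {0,3}[-+*][ \t]*$")

def interpretations(line: str) -> list[str]:
    """Return every reading of `line` that the simplified patterns allow."""
    found = []
    if SETEXT_UNDERLINE.match(line):
        found.append("setext underline")
    if EMPTY_BULLET_ITEM.match(line):
        found.append("empty list item")
    return found

print(interpretations("-"))    # ['setext underline', 'empty list item']
print(interpretations("--"))   # ['setext underline']
print(interpretations("- a"))  # []
```

Whichever reading wins for `-` is a precedence decision that the patterns alone cannot settle, which is the question raised here.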

Should the spec be changed? (To ban empty list items? To require at least two -- for a setext header underline?) Or the implementations?


You might want an empty list item. Perhaps to represent a missing item in a list that will be filled in later.

Potentially there could be a header that is only one character long.

I expect both cases to be rare, so perhaps make a decision based on the least used case?


An empty list item can be written as:

- \  

I am with @chrisalley, which of these is more common in the wild? A header with one dash underneath seems weird, but then so does an empty list…

+++ codinghorror [May 02 15 06:32 ]:

I am with @chrisalley, which of these is more common in the wild? A header with one dash underneath seems weird, but then so does an empty list…

They’re both uncommon, but both can come up in a fairly natural way.

A one-dash underline makes sense when you have a one-character header.
(And some people may be in the habit of using it for longer headers out
of laziness.)

An empty list item can occur when you’re gradually filling in a list.

Empty list items are less widely supported by existing Markdown
processors, so if we had to choose, it would make sense to exclude
empty list items. But maybe there’s a good way to have both.

As a workaround we could require at least 2 chars for setext.

From the arguments it seems that an empty list item is likely to be a work in progress, so it should be safer to parse this as a header. Even if the intention had been an empty list, often it will be just a draft that was “messed up”.


Single letter names, while rare, might be used as a heading title. In such cases, it would look more elegant if a single - is used.

Examples of single letter names:

+++ vitaly [May 02 15 21:05 ]:

As a workaround we could require at least 2 chars for setext.

As others have also pointed out, there is a legitimate use for these
in things like

A
-

Also, since existing implementations only require one character in the
setext header line, there may be many existing documents that lazily
use just one character, e.g.

My title
-

and these would break with the change you propose.
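A rough sketch of what the proposed minimum length would change, in Python. The function name `is_setext_underline` and its `min_len` parameter are made up for illustration, and the pattern is a simplification rather than the spec's full rule.

```python
import re

def is_setext_underline(line: str, min_len: int = 1) -> bool:
    """True if `line` is a run of at least `min_len` '=' or '-' characters.

    Allows up to three leading spaces and trailing spaces or tabs.
    """
    match = re.match(r"^ {0,3}(=+|-+)[ \t]*$", line)
    return bool(match) and len(match.group(1)) >= min_len

# Current behavior (min_len=1) versus the proposed rule (min_len=2).
for underline in ["-", "--", "="]:
    print(repr(underline),
          is_setext_underline(underline),
          is_setext_underline(underline, min_len=2))
# '-' True False
# '--' True True
# '=' True False
```

Under the assumed `min_len=2` rule, the single `-` below `A` or `My title` stops counting as an underline, which is the breakage described above.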

Please note that this problem is likely related to this:

- one
- - -
- two

If this is supposed to be a thematic break that separates two list items, which example 30 suggests and the majority of implementations agree on, then the problem might be solved!

Let me elaborate:

  • A setext heading must have higher precedence than a thematic break. Otherwise setext headings with --- simply would never work.

  • If the above example should create a thematic break, then a thematic break must have a higher precedence than list items.

From those two points it follows logically that a setext heading must have higher
precedence than a list item. Done. The implementations are right, the spec is wrong.

Or am I missing something?

Note: This somewhat depends on the definition of “precedence” and on whether a block parser is supposed to “look ahead” to later input lines. But if that were the case, a setext heading would also trump an empty list item.

Note: This doesn’t say anything about whether empty list elements should in general be allowed or not.

It turns out that I did miss something, but it doesn’t change the outcome of my previous comment:

There is this invisible mystical meta-container called “List” that makes it impossible to parse list items and thematic breaks independently. So in a way, both have the same precedence (still pending a proper definition of “precedence”).

This changes my argument slightly, but the result is the same: A setext heading must still have higher precedence than a thematic break. And since list items basically have the same precedence as thematic breaks, this means that setext headings beat list items!
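The ordering argued for here can be written down as a small line classifier. This is only a sketch under assumptions: the patterns are simplified, `classify` and the `after_paragraph` flag are hypothetical names, and real parsers track far more state (containers, lazy continuation, ordered lists).

```python
import re

SETEXT_UNDERLINE = re.compile(r"^ {0,3}(=+|-+)[ \t]*$")
# Three or more matching -, _ or * characters, optionally separated by spaces or tabs.
THEMATIC_BREAK = re.compile(r"^ {0,3}([-_*])[ \t]*(?:\1[ \t]*){2,}$")
# A bullet marker, either alone or followed by whitespace and content.
BULLET_ITEM = re.compile(r"^ {0,3}[-+*](?:[ \t]+.*)?$")

def classify(line: str, after_paragraph: bool) -> str:
    """Try the block types in the argued order of precedence."""
    if after_paragraph and SETEXT_UNDERLINE.match(line):
        return "setext underline"
    if THEMATIC_BREAK.match(line):
        return "thematic break"
    if BULLET_ITEM.match(line):
        return "list item"
    return "paragraph text"

print(classify("---", after_paragraph=True))     # setext underline
print(classify("---", after_paragraph=False))    # thematic break
print(classify("- - -", after_paragraph=False))  # thematic break
print(classify("-", after_paragraph=True))       # setext underline
print(classify("-", after_paragraph=False))      # list item
```

With this ordering a lone `-` directly below paragraph text is taken as an underline, and only becomes an empty list item when there is no paragraph to underline.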

Another use case: single-ideogram heading titles, for example:

愛
-

This is the most common markdown-related bug report we get from users: weird highlighting/rendering when they type a list item after text, because it just so happens that you usually pause a bit after you type the opening - and then wonder what’s going on.

Personally, I’m for requiring 2 dashes at minimum.

As for the 1-letter setext headings: IMO, I’d de-prioritize setext headings (in tutorials and toolbar buttons) in favor of ATX headings, since ATX headings are more useful and more efficient, but that’s probably just me.


Hello all! I hope bumping this thread is the right thing to do, despite its age, since it’s the exact topic I wanted to bring up :bowing_man:

I recently opened an issue on HedgeDoc about this ‘empty, hyphenated list item vs. h2 setext’ issue. The maintainer also initially thought it was a problem on their side, but then realized the behavior was conformant with CommonMark, so opened an issue on CommonMark.

The argument for setext only requiring one - or = for single-character titles does make sense. However, the argument for setext requiring multiple - or = is that it would solve the list/header issue, and would be arguably closer to Gruber’s description:

Markdown offers two styles of headers: Setext and atx. Setext-style headers for <h1> and <h2> are created by “underlining” with equal signs (=) and hyphens (-), respectively.

(emphasis added)

This description, which arguably cites multiple - or =, appears to be backed up with the examples some lines later:

A First Level Header
====================

A Second Level Header
---------------------

In contrast, CommonMark example 83 states:

The underlining can be any length

With respective examples of h2 and h1:

Foo
-------------------------

Foo
=

Furthermore, as mentioned in the first post, the CommonMark 0.30 Specification, section 4.3 (Setext headings), still states:

If a line containing a single - can be interpreted as an empty list items, it should be interpreted this way and not as a setext heading underline.

While it might be argued that…

Foo
-

should still be interpreted as an h2, I believe interpreting…

- Foo
    -

as an h2 is an interpretation too far, and goes against the ‘readability’ maxim.

In conclusion, I:

  • generally support requiring multiple - or = for setext headers, and
  • strongly support indented, hyphenated list items not being interpreted as h2 setext header underlines for parent list items.

A solution which would only solve the ‘indented point/h2 setext’ issue could be to require setext headers and/or their underlines to start at the beginning of their respective lines.

This would break headers which are indented for whatever reason. However, this is possibly a less breaking solution than requiring multiple - or =.
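Here is a hedged Python sketch of the difference between the two rules for the nested example. Both function names are hypothetical, and `underline_inside_container` is only a rough model of the behavior described above (strip the list item's content indent, then test the rest), not the reference parser's actual code.

```python
import re

# Simplified: a run of '=' or '-' with up to three leading spaces.
UNDERLINE = re.compile(r"^ {0,3}(=+|-+)[ \t]*$")

def underline_inside_container(raw_line: str, content_indent: int) -> bool:
    """Rough model of the current behavior: strip the enclosing list
    item's content indent, then test what is left."""
    prefix = " " * content_indent
    if not raw_line.startswith(prefix):
        return False
    return bool(UNDERLINE.match(raw_line[content_indent:]))

def underline_at_line_start(raw_line: str) -> bool:
    """Hypothetical stricter rule: the underline must begin in column 0
    of the raw line, so nothing nested or indented can qualify."""
    return raw_line[:1] in ("=", "-") and bool(UNDERLINE.match(raw_line))

nested = "    -"  # second line of the "- Foo" example; content indent of "- Foo" is 2
print(underline_inside_container(nested, 2))  # True
print(underline_at_line_start(nested))        # False
print(underline_at_line_start("-"))           # True
print(underline_at_line_start(" ---"))        # False
```

The last call shows the cost mentioned above: an underline indented by even one space would no longer count under the stricter rule.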