Blank lines before lists, revisited

release-1.0

#21

jkdev noreply@talk.commonmark.org writes:

It seems sensible to require at least two numbered list items.

How many situations are there in which someone actually wants to 
interrupt a paragraph with a single-item numbered list? Probably
1. Or maybe a couple more. Other than that, a numbered list would
be created by happenstance, without intending or expecting it.

It’s actually pretty common in some contexts. For example, in a
paper one might discuss a number of numbered examples, with regular
paragraph text in between. I’ve frequently used one-item numbered
lists (with different start numbers) for this kind of thing.


#22

To weigh in on the proposals:

(1) (single-digit start numbers) seems okay to me. It might fix some
practical problems (and might cause some others). On the other hand,
even with this fix we’d have a mismatch between parser behavior and the
spec of the sort described in commonmark/cmark#204.

(3) is bad for the reason I just gave in my previous post.
One-item lists are, in general, useful, so we don’t want to
rule them out altogether. I suppose we could require at least
two items when the list interrupts a paragraph, but that creates
difficulties for parsing: you can’t know you’ve got a list until
you’ve parsed the whole list item and seen what comes after it.
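
To make the lookahead problem concrete, here is a minimal Python sketch under loud simplifying assumptions. The names `ORDERED_MARKER` and `second_item_lookahead` are hypothetical, and an item is taken to be its marker line plus any following lines indented by two or more spaces, which is far cruder than the spec or cmark.

```python
import re

# Hypothetical, simplified marker: 1-9 digits, then "." or ")", then a space.
ORDERED_MARKER = re.compile(r"^(\d{1,9})[.)] ")

def second_item_lookahead(lines, start):
    """Return (is_list, buffered) for a simplified version of proposal (3).

    Assumes lines[start] begins with an ordered-list marker. `buffered`
    is how many lines of the first item had to be held back before the
    parser could decide whether a list had started at all.
    """
    i = start + 1
    # Skip the body of the first item (indented, non-blank lines).
    while i < len(lines) and lines[i].startswith("  ") and lines[i].strip():
        i += 1
    # Only at this point can we see whether a second item follows.
    is_list = i < len(lines) and ORDERED_MARKER.match(lines[i]) is not None
    return is_list, i - start

wrapped = [
    "interrupt a paragraph with a single-item numbered list? Probably",
    "1. Or maybe a couple more. Other than that, a numbered list would",
    "be created by happenstance, without intending or expecting it.",
]
print(second_item_lookahead(wrapped, 1))  # (False, 1)
```

The decision about line 1 is deferred until the first item has been read in full, so the parser can no longer classify each line as it arrives.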

(4) has a similar issue: we can’t tell if the list is going
to be loose unless we’ve parsed the whole thing. And I’m not
sure what is gained by allowing only tight lists to interrupt
a paragraph.

(5) is going to be problematic for people who wrap their text
to a reasonable width (say, 72 characters), and also for people
who don’t hard-wrap at all. And I echo @mity’s worry about surprising
behavior.

(2) seems the most promising to me, but there is the worry
about languages with different punctuation conventions.

I guess it might be a good idea at this point if someone
summarized clearly and concisely why we need to change things.
My own intervention above was motivated by
https://github.com/commonmark/cmark/issues/204, but I believe
that issue could be handled at the implementation level without
a change in the spec. At any rate, I’ve written a Haskell
implementation, roughly following the same strategy as cmark,
which gives the right results in this case.


#23

An enumerated example is not a list item. I don’t think this counts as a strong argument against (3). I did assume a minimum of two list items would only be expected in lists that are not preceded by a blank line. If this makes it too complicated, I’m fine with dropping this idea.

The reasoning behind (4) is that a tight list could well be a child of a paragraph (in output formats which support this nesting), whereas a loose list, which can contain paragraphs (i.e. blank lines) itself, seems strange inside a paragraph and thus could only end it.

<p><list.tight/></p>

<p/><list.loose/><p/>

Anyhow, I prefer (2), too. The colon at the end of the line preceding a list works in two ways:

  1. Existing content in many languages will have a colon introduce a list without an intervening blank line. It works as a heuristic rule.
  2. New content in any language can be authored with the colon as a new active markup character.

The problem is that for much of variant 1 the colon should be retained in the output, whereas it should be dropped for many cases of variant 2.
This can be done with an additional rule, but I’m not sure whether that would still be acceptable.


#24

Christoph Päper noreply@talk.commonmark.org writes:

An enumerated example is not a list item.

Well, a list item is the closest thing in commonmark to represent it
with. If you make it a regular paragraph, the indentation will be
wrong and it won’t stand out.

The reasoning behind (4) is that a tight list could well be a child of
a paragraph (in output formats which support this nesting), whereas a
loose list, which can contain paragraphs (i.e. blank lines) itself,
seems strange inside a paragraph and thus could only end it.

I see. But the way the spec is designed, a paragraph can never resume
after a tight list either. So, without much larger changes, (4) doesn’t
seem motivated.

Anyhow, I prefer (2), too. The colon at the end of the line preceding a list works in two ways:

  1. Existing content in many languages will have a colon introduce a list without an intervening blank line. It works as a heuristic rule.
  2. New content in any language can be authored with the colon as a new active markup character.

The problem is that for much of variant 1 the colon should be retained in the output, whereas it should be dropped for many cases of variant 2.
This can be done with an additional rule, but I’m not sure whether that would still be acceptable.

For something a bit like this, see
http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#literal-blocks

‘As a convenience, the “::” is recognized at the end of any
paragraph. If immediately preceded by whitespace, both colons will be
removed from the output (this is the “partially minimized” form). When
text immediately precedes the “::”, one colon will be removed from the
output, leaving only one colon visible (i.e., “::” will be replaced by
“:”; this is the “fully minimized” form).’

So, we could say that if the final colon is preceded by whitespace,
it gets removed.
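
A minimal Python sketch of that rule, for illustration only. The function name `strip_spaced_colon` is hypothetical, and the sketch looks at a single line in isolation rather than checking that a list actually follows.

```python
def strip_spaced_colon(line):
    """Drop a line-final colon only when whitespace comes right before it."""
    text = line.rstrip()
    if len(text) >= 2 and text.endswith(":") and text[-2].isspace():
        # "items :" -> "items": the colon acted purely as markup.
        return text[:-1].rstrip()
    # "items:" stays as written: the colon is part of the content.
    return text

print(strip_spaced_colon("Here are the items :"))  # Here are the items
print(strip_spaced_colon("Here are the items:"))   # Here are the items:
```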


#25

<space><colon> would not work well with French practice, I guess. I thought about leaving a single colon (:) as is, but removing both if there are two (::).
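
A minimal Python sketch of this variant, for illustration only. The function name `strip_double_colon` is hypothetical, and the sketch does not check whether a list follows the line.

```python
def strip_double_colon(line):
    """Keep a single line-final colon; remove a line-final '::' entirely."""
    text = line.rstrip()
    if text.endswith("::"):
        # Both colons are treated as markup and dropped from the output.
        return text[:-2].rstrip()
    return text

print(strip_double_colon("Les éléments suivants :"))  # Les éléments suivants :
print(strip_double_colon("The following items::"))    # The following items
```

With this variant the French spacing before a single colon is left alone, because only the doubled colon is treated as markup.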


#26

I honestly believed that list items needed to have some whitespace and not start at the left margin.

What if list items needed either a blank line separating them from the paragraph or some initial whitespace before the first bullet/number? Then you wouldn’t get accidental list items from linewrapped paragraphs.
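
A minimal Python sketch of this suggestion, for illustration only. `MARKER` and `starts_list` are hypothetical names, the marker pattern is simplified, and the sketch does not model the spec's limit on how far a list marker may be indented.

```python
import re

# Hypothetical, simplified marker: optional leading spaces, then a bullet
# or 1-9 digits followed by "." or ")", then a space.
MARKER = re.compile(r"^( *)(?:[-+*]|\d{1,9}[.)]) ")

def starts_list(prev_line, line):
    """Decide whether `line` opens a list under the suggested rule.

    A marker line opens a list only if the previous line is blank (or
    absent) or the marker itself is indented by at least one space.
    """
    m = MARKER.match(line)
    if m is None:
        return False
    after_blank = prev_line is None or prev_line.strip() == ""
    indented = len(m.group(1)) > 0
    return after_blank or indented

# A hard-wrapped paragraph line that happens to begin with "1." is not a list.
print(starts_list("numbered list? Probably", "1. Or maybe a couple more."))  # False
print(starts_list("Some paragraph text", "  1. an intended item"))           # True
```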