Maintaining sanity in lists with different kinds of spacing

jcracknell · September 18, 2014, 1:11am

So I am puttering away at my own (PEG-based) markdown implementation, and being very much in tune with the Markdown ideal of producing output which predictably resembles the input, I was all over this.

  * a
  * b
  
  * c

  * d
  * e

The above is very clearly three lists: a tight list containing items a and b, a loose list containing item c, and a tight list containing items d and e.

This behavior is easy enough to achieve: you make the rule for tight lists willing to “settle”, meaning it will accept whatever items it can get, so essentially the spacing between the first and second items in the list determines its character. And that works great. Then you start looking at edge cases:

  * a
    
    b
  * c

So I guess this is a tight list followed by a code block and another list? Wait, I know: we’ll allow the omission of the trailing blank line in a loose list item if it contains a continuation! Yeah!

  * a
  * b

    c
  * d

Yeah! And this is… uhhh… a tight list? With a single item? Followed by a loose list with two items - two lists!
No wait! Uhhh…

It is at this point you start to ask yourself important questions like:

When exactly are you going to need support for adjacent lists?
Who writes lists like this anyways?
Why am I trying to help them?

The current behavior will happily munge together anything even remotely resembling a list; it is basically as flexible as it can get. This has very real benefits in terms of reducing the fragility of the list syntax. In terms of user expectations, and in the context of real-time preview (as seen on this site), it is best if small edits to the input do not cause the structure of the document to vary wildly.

Given the very real questions surrounding the utility of a more rigid list syntax, I am currently mulling over whether or not such a change would actually be beneficial to the average user. The current behavior encodes the notion that adjacent lists with the same marker style are fundamentally ambiguous, which has a certain appeal.
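
For reference, the “settling” rule described earlier can be sketched roughly as follows. This is a simplified Python model and not the actual PEG grammar: it assumes the items have already been recognized and tagged with whether a blank line follows them, and `Item` and `split_lists` are names made up purely for this illustration.

```python
from typing import List, NamedTuple, Tuple


class Item(NamedTuple):
    """Hypothetical pre-parsed list item (illustration only)."""
    text: str
    blank_after: bool  # True if a blank line follows this item


def split_lists(items: List[Item]) -> List[Tuple[str, List[str]]]:
    """Group adjacent items into tight and loose lists.

    Assumed rules (one reading of the behavior described in the post):
      * a loose list is a run of items that are each followed by a blank line;
      * a tight list "settles": it takes items until it reaches one that is
        followed by a blank line, keeps that item, and stops there.
    """
    result = []
    i = 0
    while i < len(items):
        j = i
        if items[i].blank_after:
            kind = "loose"
            # Consume every consecutive item that has a trailing blank line.
            while j < len(items) and items[j].blank_after:
                j += 1
        else:
            kind = "tight"
            # Consume items with no trailing blank line...
            while j < len(items) and not items[j].blank_after:
                j += 1
            # ...and settle for the first item that does have one, if any.
            if j < len(items):
                j += 1
        result.append((kind, [item.text for item in items[i:j]]))
        i = j
    return result


# The first example from the post: a, b, (blank), c, (blank), d, e
example = [
    Item("a", False),
    Item("b", True),
    Item("c", True),
    Item("d", False),
    Item("e", False),
]
lists = split_lists(example)
```

Run on the first example, this yields a tight list of a and b, a loose list containing c, and a tight list of d and e. It deliberately says nothing about continuation paragraphs, which is exactly where the edge cases start to bite.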