Trying to understand parsing of list item

I’m implementing a CommonMark parser and I’ve got one case where I parse a construct differently from the reference implementations. The simplest case is

*
a

which my code parses as <ul><li>a</li></ul>, but the reference parses as <ul><li></li></ul><p>a</p> - i.e. the a is not getting treated as a lazy-continuation line.

I had thought that since

*
   a

parses as <ul><li>a</li></ul>, the lazy-continuation rule would result in my behaviour.
What am I missing? Is it that while a could be paragraph continuation text (if we changed the previous line), since it does not actually continue a paragraph we disallow it?


Lazy continuation only applies to paragraphs. The a is the start of a paragraph, not the continuation of one. Also, you cannot start a paragraph lazily, only continue one started with the proper indentation (in the case of list items, or properly prefixed in the case of block quotes). Here a paragraph is started within a list item, and continues lazily:

* z
a

(you can quickly test all my examples with the online CommonMark demo)

In both your examples, the list item does not start with a paragraph, but a blank line. See point 3 under List Items in the spec (starts right above example 248). A list item is allowed to start with at most one blank line. Here is a list item starting with a blank line, then continuing with a properly indented paragraph start, the single z, which then continues lazily with the unindented a:

* 
  z
a

It is the paragraph that continues lazily, not the list item. Here the list item does not lazily continue with the paragraph b:

* 
  z
a

b

Here the list item does continue with the paragraph b c:

* 
  z
a

  b
c
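The rule running through these examples can be sketched roughly as follows. This is only an illustration, not how any reference implementation is written: the `open_blocks` list and its block names are made up here, and the sketch ignores the separate requirement that the line must not itself start a new block (a heading, a list marker, and so on).

```python
def is_lazy_continuation(open_blocks, line):
    """Return True if `line` may lazily continue the current paragraph.

    open_blocks: hypothetical list of open block names, outermost first,
                 e.g. ["document", "list", "list_item", "paragraph"].
    line:        the raw line, which failed to match the indentation or
                 prefix of one of the open container blocks.
    """
    if not open_blocks:
        return False
    innermost = open_blocks[-1]
    # Only an open paragraph can be continued lazily, and a blank line
    # never continues a paragraph.
    return innermost == "paragraph" and line.strip() != ""


# "*" followed by "a": the list item is empty, so no paragraph is open.
empty_item = ["document", "list", "list_item"]
# "* z" followed by "a": the paragraph "z" is still open.
item_with_paragraph = ["document", "list", "list_item", "paragraph"]

print(is_lazy_continuation(empty_item, "a"))
print(is_lazy_continuation(item_with_paragraph, "a"))
```

In the first case the a cannot be absorbed, so the list item closes and a starts a new paragraph at the top level, which is the behaviour the reference shows.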

@drmikeando, you should execute the spec as a test suite against your implementation. In fact, the spec is designed for such use, as the reference implementation demonstrates. I’m pretty sure doing so would have more or less let you know what I explained above.
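A minimal sketch of such a test loop, in case it is useful. It assumes the spec examples have already been loaded as a list of dicts with `markdown`, `html` and `example` keys (the shape I would expect from a JSON dump of the spec examples), and that `render` is your own function from Markdown text to HTML. Both names are placeholders.

```python
def run_spec(examples, render):
    """Run every spec example through `render` and collect mismatches.

    examples: list of dicts with "markdown" and "html" keys, and
              optionally an "example" number (assumed format).
    render:   callable taking Markdown text and returning HTML text.
    """
    failures = []
    for ex in examples:
        actual = render(ex["markdown"])
        if actual != ex["html"]:
            # Keep enough detail to reproduce the failure by hand.
            failures.append((ex.get("example"), ex["markdown"], ex["html"], actual))
    return failures
```

An exact string comparison is the strictest choice; you may prefer to normalise the HTML on both sides first so that insignificant whitespace differences do not count as failures.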

@jgm, does it make sense to break out the topic of Lazy Paragraph Continuation in the spec into a dedicated section? Currently its explanation is scattered around the spec. Spec links to “Lazy Continuation Lines” point to this section under List items, which in turn references “paragraph continuation text” with a link to a section under Block quotes. The closest thing the spec has to a generalized notion of lazy paragraph continuation is within the Appendix on parsing under Phase 1: block structure.


@vas Thanks for the clarification - it explains the behaviour I’m seeing perfectly. My implementation is being run against the spec, but it’s still a work in progress. I’m also fuzz testing against another implementation, which allows me to find smaller failure cases - which is what threw that example up.
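For anyone curious, the shrinking step is roughly the following. This is a simplified sketch rather than my actual harness: `differs` is a placeholder for a function that renders the input with both implementations and returns True when the outputs disagree.

```python
def shrink(text, differs):
    """Greedily delete single characters while `differs(text)` stays True.

    text:    an input on which the two implementations disagree.
    differs: callable returning True if the implementations still
             disagree on the given input (hypothetical helper).
    """
    changed = True
    while changed:
        changed = False
        for i in range(len(text)):
            candidate = text[:i] + text[i + 1:]
            if differs(candidate):
                # Keep the smaller input and start scanning again.
                text = candidate
                changed = True
                break
    return text
```

The result is only locally minimal (no single character can be removed), but in practice that is usually enough to get down to something like the two-line case above.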