Beginning new parser - some spec feedback

Paul_LeBeau · May 11, 2015, 6:11am

Hi

I’m just starting on a new parser, and wanted to give some feedback based on my first hour or so reading the spec and beginning a new implementation.

In the preamble to section 3 (“Blocks and inlines”) it states:

Blocks can contain other blocks, …

but a couple of paragraphs later “leaf blocks” are introduced, and these are blocks that cannot contain other blocks. This contradiction should be removed. If leaf blocks aren’t really blocks, then shouldn’t they be called something else, like “leaf elements”? At the very least, the section 3 preamble should be clarified.

The glossary goes to great lengths defining all sorts of variants of “whitespace”, but the most commonly used variant, “space” / “spaces”, is not defined explicitly. I assume the authors mean ASCII 32 here. It did make me wonder for a while whether other whitespace characters are allowed, for example as indent characters or between the horizontal rule characters. Should this perhaps be made explicit?

The preamble to section “4.2 ATX headers” states:

The opening sequence of # characters cannot be followed directly by a non-space character.

A “non-space character” is defined as anything other than ASCII 32.

However, in Example 40 the second line is “#” followed by a newline. A newline is not ASCII 32, so by that definition it is a non-space character, and the line should not qualify as a header. The reference parser nevertheless accepts it as valid. Either the parser is wrong, or the spec is.
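To make the discrepancy concrete, here is a minimal sketch of the two readings. It is not the reference parser: the function names are hypothetical, leading indentation and closing # sequences are ignored, and the limit of 1 to 6 # characters is my assumption about the opening sequence.

```python
import re

# Literal reading of the quoted sentence: the run of '#' characters
# must be followed directly by a space (ASCII 32).
# Assumption: an opening sequence is 1 to 6 '#' characters.
STRICT_OPENING = re.compile(r"^#{1,6} ")

# Lenient reading, matching what the reference parser appears to do
# for Example 40: the '#' run may also be followed by end of line.
LENIENT_OPENING = re.compile(r"^#{1,6}(?: |$)")


def is_atx_header_strict(line: str) -> bool:
    """Hypothetical check: '#' run followed by ASCII 32 only."""
    return STRICT_OPENING.match(line.rstrip("\n")) is not None


def is_atx_header_lenient(line: str) -> bool:
    """Hypothetical check: '#' run followed by ASCII 32 or end of line."""
    return LENIENT_OPENING.match(line.rstrip("\n")) is not None


if __name__ == "__main__":
    for sample in ["# foo\n", "#\n", "#foo\n"]:
        print(repr(sample),
              is_atx_header_strict(sample),
              is_atx_header_lenient(sample))
```

The two functions agree on “# foo” and “#foo” and differ only on a bare “#” line, which is exactly the case in Example 40.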

jgm · May 12, 2015, 3:01am

Thanks for the feedback. It is really useful to get impressions from someone reading the spec for the first time, so please post more if you have more comments!

+++ Paul_LeBeau [May 11 15 06:21 ]: