Spec nits, typos and other minor issues

Minor issues reading the spec.
It would help if parts such as the examples were marked as informative, with the remainder taken as normative. Without that, I’m taking the examples as normative.

  1. EOLN is specified by characters, EOF is not.
    Line ending is used twice, differently, without definition.

  2. Leaf is used without definition.

4.1 Please define characters rather than glyphs for underscore, hyphen and asterisk, ideally using Unicode character names.
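To illustrate what I mean by character names, here is a small Python sketch (my own, not taken from the spec) that prints the code point and Unicode name of the three characters in question, using the standard `unicodedata` module. The helper name `describe` is just for illustration.

```python
import unicodedata

def describe(ch):
    """Return the code point and Unicode name of one character, e.g. 'U+005F LOW LINE'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# The three characters the horizontal rule section refers to by glyph.
for ch in "_-*":
    print(repr(ch), describe(ch))
```

A table of this kind in the spec would remove any doubt about, say, hyphen versus other dash characters.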

Lots of wasted typing for character sequences. Why not use ebnf or similar?

“Horizontal rules can interrupt a paragraph:” it appears that the para is terminated and restarted. Interrupt suggests the hr is placed mid para

… which seems wrong? Please clarify.

" this is a setext header" Is this a typo? setext?

Example 21 implies hr terminates a list. If true, please be explicit. Does it terminate any block?

4.2 "The closing # characters may be followed by spaces only. " Is it still a header with zero content after \n? Unclear what terminates the \ws sequence. Only a new line? ‘only’ makes it confusing.

“A space is required between the # characters and the header’s contents.” contradicts above, may be followed by ws only?

"many implementations currently do not require the space. " Implementations of what? This spec is for version?

“because the first # is escaped:” term ‘escaped’ is undefined.

"# foo bar *baz*" escaping used before definition.

"Four spaces are too much:" Too much for what? Suggest ‘this is not a header’ or some such.

“A closing sequence of # characters is optional:” optional to what? Again, a syntax schema would make this redundant.

“It need not be the same length as the opening sequence:” In which case is a termination character(s) required for an ATX header? What is it?

“Spaces are allowed after the closing sequence:” Flaw in the spec? Since ws is important, perhaps use underscore character or something to ‘show’ ws? Is ‘spaces’ defined? Not found? Are tabs included? What of other unicode characters such as thin space, non breaking space? Are they ‘spaces’?

‘spaces’ plural. One, 1000? Loose wording?
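As a sketch of the two things I am asking for here (visible whitespace in examples, and a clear statement of which characters count as ‘spaces’), something like the following Python would do. The marker symbols in `VISIBLE` are my own hypothetical choice; the spec defines none.

```python
import unicodedata

# Hypothetical visible stand-ins for whitespace characters.
VISIBLE = {" ": "␣", "\t": "→", "\u00a0": "⍽", "\u2009": "·"}

def show_ws(line):
    """Replace whitespace characters with visible markers."""
    return "".join(VISIBLE.get(ch, ch) for ch in line)

def classify(ch):
    """Return (code point, Unicode general category, str.isspace() result)."""
    return (f"U+{ord(ch):04X}", unicodedata.category(ch), ch.isspace())

print(show_ws("# foo ##   "))
# Space, tab, no-break space and thin space: all 'whitespace', but not all alike.
for ch in (" ", "\t", "\u00a0", "\u2009"):
    print(classify(ch))
```

Tab falls in category Cc while the other three are Zs, which is exactly why ‘spaces’ needs a definition rather than being left to the implementer.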

“A sequence of # characters with a nonspace character following it is not a closing sequence, but counts as part of the contents of the header:” Specified in the negative? Please specify in the positive for implementations? What does terminate an ATX header?

"Backslash-escaped # characters " Again, no definition prior to use (or link to definition)

“ATX headers need not be separated from surrounding content by blank lines, and they can interrupt paragraphs:” Vague negative specification? Should this be a part of a block specification, i.e. more general? At least specify in the positive.

“ATX headers can be empty:” Can or may? Previous comment could / should be linked to this. Use of shall/should/may/can has same problem as W3C has addressed.
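To show what I mean by a syntax schema, here is my reading of the ATX header sentences quoted above, written as a rough Python sketch. It is not the spec's definition: the limit of one to six opening # characters and three spaces of indentation is my understanding of the text, the line is assumed to arrive without its newline, and tabs, backslash escapes and inline parsing are ignored. The names `OPENING`, `CLOSING` and `parse_atx` are hypothetical.

```python
import re

# Up to three spaces, 1-6 '#', then either end of line or spaces plus content.
OPENING = re.compile(r"^ {0,3}(#{1,6})(?: +(.*))?$")
# A closing run of '#' counts only at the start of the content or after a space.
CLOSING = re.compile(r"(?:^| +)#+$")

def parse_atx(line):
    """Return (level, content) if the line reads as an ATX header, else None."""
    m = OPENING.match(line)
    if m is None:
        return None
    rest = (m.group(2) or "").rstrip(" ")  # spaces after the closing sequence
    rest = CLOSING.sub("", rest)           # drop the optional closing sequence
    return len(m.group(1)), rest.strip(" ")

print(parse_atx("## foo ####   "))
print(parse_atx("#foo"))
```

Two short patterns state positively what roughly a dozen examples state piecemeal, and they make the open questions (what ends the header, what counts as a space) hard to overlook.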

4.3 Setext headers

" followed by a setext header underline. " Suggest ‘immediately’ followed? Can I insert 100 blank lines? In which case the setext header will align with this requirement (but likely be wrong by your intent)?

“a sequence of = characters or a sequence of - characters” Again, request actual character definitions, e.g. U+003D, etc.

“The underlining can be any length:” Is any 0…n or 1…n? Is there a minimum at all?

setext header underline definition does not indicate start of line, which I assume is required? Only mentions indentation, which is undefined as being wrt newline?
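My guess at the intended definition, as a sketch: the underline starts at the beginning of a line, may be indented by at most three spaces, consists of one or more identical = or - characters, and may be followed by spaces. The minimum length of one and the anchoring to the start of the line are my assumptions, which is the point of the questions above. `UNDERLINE` and `setext_level` are hypothetical names.

```python
import re

# Assumed: anchored at line start, 0-3 spaces indent, at least one '=' or '-'
# (no mixing), optional trailing spaces, no trailing newline on the line.
UNDERLINE = re.compile(r"^ {0,3}(=+|-+) *$")

def setext_level(line):
    """Return 1 for an '=' underline, 2 for a '-' underline, else None."""
    m = UNDERLINE.match(line)
    if m is None:
        return None
    return 1 if m.group(1)[0] == "=" else 2

for candidate in ("===", "-", "", "=-=", "    ==="):
    print(repr(candidate), setext_level(candidate))
```

If the spec intends something different (for example that an empty underline is allowed), a pattern like this would make the difference explicit.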

" is a level 1 header if = characters are used," I need to infer ‘where’ it is used? Should be explicit. (again a syntax schema?)

"In general, a setext header need not be preceded or followed by a blank line." Very woolly? ‘need not’? Is this a requirement?

“However, it cannot interrupt a paragraph, so when a setext header comes after a paragraph, a blank line is needed between them.” Please reword for clarity? Again, it would seem this requirement belongs in the block definition section? It seems to sit in between a para definition and a setext definition? What of a list termination / setext start? Not at all clear. It clearly contradicts the earlier “ATX headers need not be separated from surrounding content by blank lines, and they can interrupt paragraphs:” statement? Are they different then? Should they be different? Is there consensus on this?

example 42 leads to testing where the header content is grossly indented? Is this outside the spec?

“be a lazy line:” Is this defined? What does it mean?

ex 51 duplicates earlier statements. (lots of examples which could be clarified by a syntax schema?)

“But in general a blank line is not required before or after:” In general? Is it required in specific cases?

4.4 Indented code blocks

[This section would benefit from a symbol for a ws character?]

The ‘indented chunk’ links to 4.4? Incorrect or undefined?

“An indented chunk is a sequence of non-blank lines, each indented four or more spaces.” This excludes lines consisting of 4 ws characters? Or does it. I ‘see’ that as blank? No definition of ‘non blank’? example 56 may contradict this?

“An indented code block has no attributes.” attributes are as yet undefined.

Example 57 unclear, with ‘how many’ white spaces?

Example 57 confuses when read with “each indented four or more spaces.” Need to differentiate between content and markup?

“An indented code block cannot interrupt a paragraph. (This allows hanging indents and the like.)” Confusion. One more for block termination definition? What can terminate a para? What defines the start of a code block since we have this as an exception? Weakly specified?

“However, any non-blank line with fewer than four leading spaces ends the code block immediately. So a paragraph may occur immediately after indented code:” Another weak ‘however’. This is a partial definition for termination. What of a blank line (when defined) with four leading space characters only? I think that slips in between
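To make the gap concrete, here is a sketch under one possible definition of ‘blank’: a line that is empty or contains only space characters (U+0020). That definition is my assumption; the spec may intend tabs or other whitespace to count too. Both function names are hypothetical.

```python
def is_blank(line):
    """Assumed definition: empty, or nothing but U+0020 space characters."""
    return line.strip(" ") == ""

def is_indented_chunk_line(line):
    """Non-blank and indented four or more spaces, per the quoted sentence."""
    return not is_blank(line) and line.startswith("    ")

# A line of exactly four spaces is 'indented four spaces' yet also blank.
for line in ("    code", "    ", "   text", ""):
    print(repr(line), is_blank(line), is_indented_chunk_line(line))
```

Under this reading a line of four spaces neither belongs to an indented chunk nor ends the code block, which is the case I think slips in between.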

“And indented code can occur immediately before and after other kinds of blocks:” Please be explicit? ‘any other’? Other than what? paragraph? Leaves room for loose interpretation

“The first line can be indented more than four spaces:” Already said. Leaves room for error in interpretation? Leaves the status of space characters 5…n undefined.

"Blank lines preceding or following " I believe blank lines are used sufficiently to require definition.

4.5 Fenced code blocks

“all subsequent lines, until a” suggest ‘up until’ for clarity

“The closing code fence may be indented up to three spaces, and may be followed only by spaces,” Add … on the same line unless this is meant to terminate the file?

Indentation - should it be identical on both code-fences? Not made clear?

"If the end of the containing block (or document) is reached and no closing code fence has been found, the code block contains all of the lines after the opening code fence until the end of the containing block (or document). " Unclear. No definition of ‘end of containing block’? Perhaps should be the start of a new block (if that is defined?). Again the exception is the setext header? The bracketed comment has no place in the spec IMHO.

“A fenced code block may interrupt a paragraph,” Again, interrupt seems inaccurate. The para is terminated. Why is para singled out? Would a code block ‘interrupt’ (terminate) a list? It would appear so from the initial implementation? Again this points to a higher level ‘block definition’ section as being necessary? Supported by the rider “and does not require a blank line either before or after.” which is semantically a block level requirement.

“The first word of the info string is typically used” Typically? Is this a specification? Rephrase if so. And define word please. Some countries don’t use space char as a word separator? What of non-single-word languages?

No action appears to be placed on an implementer for the remainder of the ‘info string’? Is there one? example 83 implies it might be lost by an implementer? Is this implementation dependent? If so, please say so.

Example 70 appears to contradict the ?normative? requirement (terminating a block)

“Four spaces indentation produces an indented code block:” Contradicts the earlier statement (up to and including 3 spaces), again the example does not show indentation. Confusing

Example 78. internal spaces. Unclear. Perhaps use ‘within the line’ to clarify? Raises the issue of ‘blocks’ being used as inlines, which the example shows, but which is not explicit?

"Other blocks can also occur " Fluffy specification? What does ‘other’ mean?

“Opening and closing spaces will be stripped, and the first word, prefixed with language-, is used as the value for the class attribute of the code element within the enclosing pre element.”

  1. Opening and closing? I can guess what it means, requires firming up IMHO? Does the user need to insert ‘language’ or the implementer? The example is HTML, is that the only output? Weak specification, hard for the implementer to retain consistency.
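My best guess at what that sentence asks of an implementer, as a sketch. It assumes ‘spaces’ means U+0020 only and that a ‘word’ is a run of characters other than U+0020; neither is stated in the spec, and `language_class` is a hypothetical name.

```python
def language_class(info_string):
    """Return the class attribute value for the code element, or None.

    Assumes the implementer (not the author) adds the 'language-' prefix,
    and that everything after the first word is simply dropped.
    """
    first = info_string.strip(" ").split(" ")[0]
    return f"language-{first}" if first else None

print(language_class("  ruby startline=3 "))
print(language_class(""))
```

Note that the rest of the info string is silently discarded here, which is exactly the behaviour the spec leaves unspecified.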

“Closing code fences cannot have info strings:” should be clearly stated in para 3 of this section.

4.6 html blocks

" It ends when a blank line or the end of the input is encountered. " blank line link does not link to a definition.

“or the end of the input is encountered.” End of input used here, end of document used elsewhere? Inconsistent.

Implies (not stated) that an html block may not contain a blank line? If true please state it. Example 89 implies this. Is that normative?

" and will not be escaped in HTML output." Definition of escaped (direct or referenced)? Implication for author is to escape < &, is that true? Should be stated for the implementer if so. Alpha implementation is unclear on this.

Example 90 “until a blank line or the end of the document is reached:” extends what is said previously about termination of the html block.

“An HTML block can interrupt a paragraph,” Same comment as previously. Inappropriate, incomplete and confusing use of one block element, ignoring others. Should be at block specification level.

"This rule differs from John Gruber’s original Markdown syntax " Suggest remove. Inappropriate in a specification. Extract the ‘blank line terminates’ to bolster the previous statement and make it clear perhaps? A long and inappropriate section defending design decisions. If they are necessary, put them in a separate document linked from this, perhaps ‘rationale’?

“An incomplete HTML block” no definition of ‘incomplete’ other than by example? Syntactically incorrect? what of html> does that start an html block? Confusing for implementers.

4.7 Link reference definitions

Confusing, not consistent? Why is the word ‘definitions’ appended? No other sections have it, IMHO it makes the section unclear as to whether it is defining anchors, internal links, external (http protocol) links or something else?
Reference and definition are words used in html for links, which makes this worse? Not at all clear.

“No further non-space characters may occur on the line.” Suggest needs rewording. non-space undefined (possibly use ‘characters other than the space character (U+0020)’). Also ‘which’ line? since this can occur over a number of lines? Perhaps reference the end of the current line?

Made worse by "A link reference-definition … defines a label which can be used in reference links " Is the word ‘anchor’ inappropriate? See http://www.w3.org/TR/html401/struct/links.html#h-12.1.3

The ‘reference implementation’ does not create the link unless the target exists? Is this part of the spec, implementation dependent or unspecified?

ex 101. which is the destination? foo or /url?

ex 103 uses escapes before defining them.

All the examples duplicate [foo] with no explanation as to why? Via the ref impl, no links work, internal or external, without the second one being present? Very confusing.

ex 110 is nice, especially since you ignore character encoding. Starting impl wars nice and early? Please allow specification of encoding. If nothing else, using emacs class first lines? Give i18n a chance?

ex110 what are non space characters? I can guess, but this is a spec?

“A link reference definition cannot interrupt a paragraph.” Same complaint as with other blocks. In this case, what can it interrupt since I’m told what it cannot interrupt.

It does beg the question of inline links, not linked from this section of the spec, nor mentioned.

" it can directly follow other block elements," Which others?

Ex 117, despite being on their own line, via the ref impl, these links appear as inlines? Confusing.

"Link reference definitions can occur inside block containers, like lists and block quotations. " Unclear. inside ‘any’ block element? Some block elements? Please be explicit. Saying ‘like’ is not helpful.

“They affect the entire document, not just the container in which they are defined:” affect the document in what manner?

ex 118 I cannot understand what this example shows. Does it have a purpose wrt the para above it?

4.8 Paragraphs

"A sequence of non-blank lines that cannot be interpreted as other kinds of blocks " By whom? A negative definition?
Please state in a positive manner for implementers. Again the use of ‘non-blank lines’ without a definition.

“removing initial and final spaces.” Is this sufficiently clear for an implementer (who didn’t write it)? Should the leading and trailing \n be removed as a ‘space’?

Example 120 implies embedded \n characters are retained? Is this a requirement / part of the spec? Unspecified.

Ex 121, again, should be part of a block definition section. It has little to do with the para block?

ex 122 “Leading spaces are skipped:” Is this true? I think not. How many leading spaces before it is no longer a para?

ex 124. or an indented code block will be triggered: - is ‘triggered’ defined? Rephrase please.

ex 129 “Final spaces are stripped before inline parsing, so a paragraph that ends with two or more spaces will not end with a hard line break:” final spaces? Stripped? Neither defined? What is meant please, precisely?

4.9 Blank lines

http://jgm.github.io/stmd/spec.html#blank-line link is to a target which does not define blank lines.

“Blank lines between block-level elements are ignored, except for the role they play in determining whether a list is tight or loose.” so they are not ignored? Suggest reword accurately.

“Blank lines at the beginning and end of the document are also ignored.” Perhaps rephrase exactly?

5 Container blocks

Is ‘container’ superfluous? Normally referred to as nesting?

“There are two basic kinds of container blocks” are there non-basic container blocks?

“and list items. Lists are meta-containers for list items.” s/list items/Lists/ if List is the container block?

No explanation of the purpose of container block?

“We define the syntax for container blocks recursively. The general form of the definition is:
If X is a sequence of blocks, then the result of transforming X in such-and-such a way is a container of type Y with these blocks as its content.” Do these paragraphs serve any purpose in the specification?

“So we explain…” Makes me question who are the audience for this spec? Writers of md or implementers?

5.1 Block quotes.

" of initial indent," redundant.

"If a string of lines " String? Is this defined? series or sequence of lines

"appending a [block quote marker] to the beginning of each line " perhaps prepending a block quote marker?

The link to ‘paragraph continuation text’ is to the same para? Needs defining. Suggest separately - the definition is after the use, suggest should be with other definitions or precede its use.

" If a string of lines Ls constitute a block quote … is a block quote with Bs " Bad English, Not clear.

Personally I can’t make head nor tail of that ‘laziness’ para.

“A document cannot contain two block quotes in a row unless there is a blank line between them.” Again specified in the negative, please invert.

“Nothing else counts” redundant? Having specified what a blockquote is, please leave it at that? Bit like saying blue paint isn’t a block quote?

ex 131 again needs to show clearly the 4 sp indent?

ex 138 shows \n inserted. Is this required? ditto 139?

“A blank line always separates block quotes:” A variant on the ‘interrupt’ usage seen previously?
A single instance rather than a general case, a better definition would define conditions required to terminate/ start a blockquote?

“(Most current Markdown implementations,” commentary, not required in a specification?

“To get a block quote with two paragraphs, use:” If this is part of the specification, please rephrase as a requirement rather than ‘it is better…’? for consistent implementation (‘standard’?)

“Block quotes can interrupt paragraphs:” Vague and partial statement of specification? as previously, wrong place.

“In general, blank lines are not needed before or after block quotes:” Only in general? Suggest delete.

“a blank line is needed between a block quote and a following paragraph:” One element specified, makes one ask about all the others? The clearer method would be to fully specify the start/end conditions. Ex 148 appears to contradict this.

ex 149 “a nested block quote:” first mention of nesting blockquotes? is there a purpose to this? Is it part of the spec? Please specify if it is, rather than it popping up in an example. E.g. improper ‘nesting’.

ex 150 contradicts " (b) a single character > not followed by a space." at the top of this section. Bug?
The ref impl seems not to implement this?
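For reference, this is how I read the block quote marker definition at the top of this section, as a sketch: at most three spaces of initial indent, then a > character, then an optional single space. The three-space limit and the treatment of the optional space are my reading, not confirmed behaviour of the reference implementation; `MARKER`, `add_marker` and `strip_marker` are hypothetical names.

```python
import re

# Assumed: 0-3 spaces of indent, '>', then at most one space consumed.
MARKER = re.compile(r"^ {0,3}> ?")

def add_marker(lines):
    """Prepend a block quote marker to each line."""
    return ["> " + line for line in lines]

def strip_marker(line):
    """Return the line without its block quote marker, or None if it has none."""
    m = MARKER.match(line)
    return line[m.end():] if m else None

print(add_marker(["# Foo", "bar"]))
for line in ("> bar", ">bar", "   > bar", "    > bar"):
    print(repr(line), repr(strip_marker(line)))
```

Stating the marker this way would settle whether form (b), a > with no following space, is meant to behave as the example shows.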

5.2 list items

I suggest that any of these you feel are clear and distinct enough to be individual bugs be opened as issues on the GitHub issue tracker.