I think that idea currently works because all block structure is encoded at the start of lines: indentation, list bullets, `>`, blank lines. Will this be a problem for any attempt to introduce block structure in the middle of a line?
Possibly silly questions: In what way do we want tables to be block structure? Could we consider only table start/end to be block structure, and cell boundaries to be inline structure?
Well, putting the code span and `\|` problems aside, it makes typographic sense to think of each cell as having a separate inline structure. For example:
| table | head |
|-------|------|
| A*B | C*D |
- GFM and almost all table implementations treat these as unmatched asterisks, not markup.
- a couple (maruku, s9e/TextFormatter) make B and/or D italic, but still treat C as a “fresh start of an independent cell”.
- nobody makes B C italic across cells. Good! That would make little sense as an AST and would not fit HTML at all…
- But multimarkdown and cebe/gfm have an interesting alternative: a single cell “AB | CD” where the inner `|` is NOT a cell separator, just regular text. => I guess this is what it means to treat cell boundaries as inline structure.
I suspect it’s a bit more error-prone than parsing each cell separately, and previews will flicker more during editing… But at least it’s a consistent position!
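To make the two positions concrete, here is a toy sketch in Python. It is not how any of the implementations named above actually work: the function names (`cells_first`, `inline_first`) are made up for illustration, and the only inline rule is a naive `*text*` emphasis with no nesting.

```python
import re

EMPHASIS = re.compile(r"\*([^*]+)\*")  # toy rule: *text* with no nesting

def strip_outer_pipes(row):
    """Remove the optional leading and trailing pipe of a table row."""
    inner = row.strip()
    if inner.startswith("|"):
        inner = inner[1:]
    if inner.endswith("|"):
        inner = inner[:-1]
    return inner

def cells_first(row):
    """Split on every pipe, then run the inline pass inside each cell."""
    cells = strip_outer_pipes(row).split("|")
    return [EMPHASIS.sub(r"<em>\1</em>", cell.strip()) for cell in cells]

def inline_first(row):
    """Match emphasis over the whole row, then split on the pipes left outside it."""
    def protect(match):
        # Hide pipes inside the emphasis so the split below skips them.
        return "<em>" + match.group(1).replace("|", "\x00") + "</em>"
    protected = EMPHASIS.sub(protect, strip_outer_pipes(row))
    return [cell.strip().replace("\x00", "|") for cell in protected.split("|")]

print(cells_first("| A*B | C*D |"))   # ['A*B', 'C*D']
print(inline_first("| A*B | C*D |"))  # ['A<em>B | C</em>D']
```

With `cells_first`, each cell is parsed in isolation, so both asterisks stay literal. With `inline_first`, the emphasis swallows the inner pipe and the row collapses into one cell, which is also why a preview would jump between one and two cells while the second asterisk is being typed.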
Also, what about escaped \| outside backticks resulting in a single cell with textual “|”?
Well, backslashes can inhibit block AND inline constructs in markdown, so it’s consistent with both positions. And it’s important to have a way to spell “|” inside a cell (other than an ugly character reference such as `&#124;`).
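A minimal sketch of the backslash position, assuming a hypothetical helper `split_unescaped` (not taken from any real parser): split a row only on pipes that are not preceded by a backslash, then turn each `\|` into a literal pipe.

```python
import re

UNESCAPED_PIPE = re.compile(r"(?<!\\)\|")  # a pipe not preceded by a backslash

def split_unescaped(row):
    """Split a table row on unescaped pipes and unescape the rest.

    Simplification: an escaped backslash followed by a pipe (two backslashes,
    then a pipe) is not handled specially here.
    """
    inner = row.strip()
    if inner.startswith("|"):
        inner = inner[1:]
    if inner.endswith("|") and not inner.endswith("\\|"):
        inner = inner[:-1]
    return [cell.strip().replace("\\|", "|") for cell in UNESCAPED_PIPE.split(inner)]

print(split_unescaped(r"| E*F \| G*H |"))  # ['E*F | G*H']
```

The escape is resolved at the splitting stage, before any inline parsing, so it behaves the same whether cell boundaries are seen as block or inline structure.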
Now let’s talk code spans. I’d think that if we want:
| I`J | K`L |
to mean a single cell whose content is I, the code span `J | K`, then L, we had better treat “A*B | C*D” similarly.
Unfortunately, the reality is more fragmented: https://babelmark.github.io/?text=|+table+|+head+| |-------|------| |+A*B+++|+C*D++| |+E*F++\|+G*H++| |+I`J+++|+K`L++| |+M`N++\|+O`P++|
- github/cmark and a few others are consistent in first parsing cell boundaries, then treating A*B and I`J as an unterminated asterisk and backtick.
- markdown-it and a few others treat A*B the same way, but produce a single cell with a `J | K` code span.
- maruku does the opposite! But its table support is weird in other ways, and apparently it doesn’t allow escaping | in any way: neither `\` nor a code span nor even `\` inside a code span.
- multimarkdown consistently treats all 4 combinations as a single cell. But nobody else does.
- There is more variation about `\|` inside a code span becoming `|` vs `\|` in the output.
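For comparison, here is a rough sketch of the "code spans win over cell boundaries" behaviour. It is a simplified illustration, not markdown-it's actual algorithm: `split_code_aware` is a hypothetical name, only single-backtick spans are recognised, and backslash escapes are ignored.

```python
def split_code_aware(row):
    """Split a table row on pipes, except pipes inside a closed `code span`."""
    inner = row.strip()
    if inner.startswith("|"):
        inner = inner[1:]
    if inner.endswith("|"):
        inner = inner[:-1]

    cells, current, i = [], [], 0
    while i < len(inner):
        ch = inner[i]
        if ch == "`":
            close = inner.find("`", i + 1)
            if close != -1:
                # Copy the whole span, delimiters included, without splitting.
                current.append(inner[i:close + 1])
                i = close + 1
                continue
            # No closing backtick: fall through and keep it as literal text.
        if ch == "|":
            cells.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    cells.append("".join(current).strip())
    return cells

print(split_code_aware("| I`J | K`L |"))  # ['I`J | K`L']
print(split_code_aware("| A*B | C*D |"))  # ['A*B', 'C*D']
```

The asymmetry in the second bullet falls out directly: backticks are tracked while splitting, asterisks are not, so the same row shape gives one cell in one case and two in the other.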
I’ll post more thoughts about this soon.