Tables in pure Markdown

Since Markdown uses soft breaks, you could just break the sentence up over multiple lines. The parser would then render the lines together as a single sentence/line.

It might be best to require explicit divider lines between cells in case the cell contains multiple paragraphs. In any case, I think it’s worth making the formatting rules within a cell the same as the formatting rules for content outside of a cell.

The downside to this is that it isn’t applicable to “pipe” tables, and changes to “simple” tables frequently cause “respacing” and “rebreaking”.

I’m not sure how pipe tables behave now in the various implementations, but is there any reason why they couldn’t be designed this way in CommonMark?

If all you need is some text or data presented in a table-like way, you need no extension. You could just use a pre block and tabulate to taste:

A header            Another header             Price
----------------------------------------------------
Some text here      Another bit of text        34,10
Lorem ipsum         In principo creavit       624,45
----------------------------------------------------
                                    Total:    658,55

The result is primitive but flexible as you can be very creative with your formatting.

If you need to output an HTML table, then you need a specific syntax.
I really like the Pandoc simple+multiline tables. They have the same benefits as GitHub-flavored ones and are very legible, easier to write, visually light, and intuitive (no special characters to remember).

If we don’t need alignment (some argued it should be left to CSS) we could imagine an even simpler syntax for basic tables:

# Header       Header 2
| First field  Second field

A header (# or another character) followed by a pipe on the next line would be a table header line. 2 or more spaces mean a new column.
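To make the rule concrete, here is a minimal Python sketch of how a parser might read such lines. It assumes the first character is the row marker (`#` for a header row, `|` for a body row) and that runs of two or more spaces separate columns. The function name `split_columns` is made up for illustration.

```python
import re

def split_columns(line):
    """Split one row of the proposed syntax into (kind, cells).

    Assumption: the first character is a marker ('#' = header row,
    anything else = body row) and two or more spaces start a new column.
    """
    marker, rest = line[0], line[1:].strip()
    kind = "header" if marker == "#" else "body"
    cells = re.split(r" {2,}", rest)
    return kind, cells

print(split_columns("# Header       Header 2"))
print(split_columns("| First field  Second field"))
```

Note that a single space inside a cell ("Header 2", "First field") does not split it, which is the whole point of requiring at least two spaces.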


I think your pre block and tabulate is quite nice. I would still request that tables require/support pipes, as it is not always a given that alignment is respected in all text viewers/editors (e.g. one that strips out all whitespace).

A header           | Another header      |   Price
----------------------------------------------------
Some text here     | Another bit of text |    34,10
Lorem ipsum        | In principo creavit |    624,45
----------------------------------------------------
||                                   Total:   658,55

The difference from the standard pipe-and-dash syntax, and the similarity to yours, is that it allows for lazy horizontal rules to separate cells. But it doesn’t omit the cell separator within a row, which keeps the division clearer for both the reader and the parser.

(E.g., it is easy for the parser to tell that “Total: 658,55” belongs in the third cell.)
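As a rough illustration of that point, a naive split on the pipe character is already enough to place the total in the third cell. This is only a sketch in Python, not how any particular implementation works:

```python
row = "||                                   Total:   658,55"

# Two leading pipes mean two empty cells before the total.
cells = [cell.strip() for cell in row.split("|")]

print(len(cells))  # number of cells in the row
print(cells[2])    # the cell holding the total
```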


If there is a need to distinguish headers in tables, one option is to specify a thicker line:

A header           | Another header      |   Price
====================================================
Some text here     | Another bit of text |    34,10
Lorem ipsum        | In principo creavit |    624,45
----------------------------------------------------
||                                   Total:   658,55

One request I would make when implementing a spec for tables: don’t make a header required. For example, in kramdown, you can simply use

| Item 1  | Item 2 |
| Item 3  | Item 4 |

and it will produce a table without a header. I’ve used this style in many instances where I just need some information columnized but don’t want a header row. If a line of dashes is needed to start the table, that’s fine. But let’s not require a table head (most other implementations do require it, unfortunately).


Instead of discussing additional or less used table formats we should first specify pipe tables:

| First Header  | Second Header |
| ------------- | ------------- |
| Content Cell  | Content Cell  |
| Content Cell  | Content Cell  |

Pipes before the first and after the last column are optional unless the table has only one column:

First Header  | Second Header
------------- | -------------
Content Cell  | Content Cell
Content Cell  | Content Cell

Spaces before and after pipes are optional:

First Header|Second Header
--------------|   --------------
Content Cell |    Content Cell

But this is not allowed, nor can the header be omitted:

First Header  | Second Header
-----------------------------
Content Cell  | Content Cell

Colons can be used to indicate column alignment:

| Left-Aligned  | Center Aligned  | Right Aligned |
| :------------ |:---------------:| -------------:|
| col 3 is      | some wordy text | $1600         |
| col 2 is      | centered        | $12           |
| zebra stripes | are neat        | $1            |

Columns need not be vertically aligned:

fruit| price
-----|-----:
apple|2.05
pear|1.37
orange|3.09

That’s all. The hard work is putting these rules into a clear and strict specification.
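As a starting point for that work, the delimiter-row rules above can be sketched in a few lines of Python. This is only one reading of the rules, not a spec: the names `split_row`, `parse_alignments` and `DELIM_CELL` are invented here, and escaped pipes are deliberately ignored.

```python
import re

# One delimiter cell: optional colon, one or more dashes, optional colon.
DELIM_CELL = re.compile(r"^(:?)-+(:?)$")

def split_row(line):
    """Split a row on pipes, dropping the optional outer pipes."""
    line = line.strip()
    if line.startswith("|"):
        line = line[1:]
    if line.endswith("|"):
        line = line[:-1]
    return [cell.strip() for cell in line.split("|")]

def parse_alignments(delimiter_line):
    """Return one alignment per column, or None if the line is not a delimiter row."""
    if "|" not in delimiter_line:
        return None  # a bare run of dashes is rejected, as in the example above
    alignments = []
    for cell in split_row(delimiter_line):
        match = DELIM_CELL.match(cell)
        if match is None:
            return None
        left, right = match.group(1), match.group(2)
        if left and right:
            alignments.append("center")
        elif right:
            alignments.append("right")
        elif left:
            alignments.append("left")
        else:
            alignments.append(None)  # no colon: alignment unspecified
    return alignments

print(parse_alignments("| :------------ |:---------------:| -------------:|"))
print(parse_alignments("-----|-----:"))
print(parse_alignments("-----------------------------"))
```

A full table check would additionally compare the number of header cells with the number of delimiter cells, which is left out here to keep the sketch short.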


FYI this form is supported by “pandoc 1.15.0.6” and “Maruku 0.7.2”

 field1 | field2
 ----------------
 entry0 | entry1
 entry2 | entry3
 entry4 | entry5

I like this form (in addition to the standard GitHub-flavored pipe tables), since it is easier to edit tables without having to worry about the exact alignment of the spaces.

http://johnmacfarlane.net/babelmark2/?text=+field1+|+field2 +---------------- +entry0+|+entry1 +entry2+|+entry3 +entry4+|+entry5

http://johnmacfarlane.net/babelmark2/?text=+field1+|+field2 +-------|-------- +entry0+|+entry1 +entry2+|+entry3 +entry4+|+entry5

You mean:

 field1 | field2
 -------|--------
 entry0 | entry1
 entry2 | entry3
 entry4 | entry5

I just read through all of these posts. It seems the preferences surrounding the table syntax for CommonMark are as variegated as the Markdown “spec” itself. Some folks want it to be more like HTML, some value ease of parsing over readability in the pure Markdown form, and still others don’t think it should be in the CommonMark spec at all!

@vitaly has basically taken the GitHub Flavored Markdown version and incorporated it into markdown-it. (This is what I use for my node.js-powered blog, based on Markdown source.) Most people even remotely interested in Markdown are familiar with GitHub, so this seems like a good choice on @vitaly’s part.

Going one step further, Byword’s implementation of Fletcher Penney’s MultiMarkdown 3 table spec seems the most straightforward amalgamation of the core ideas for tables in Markdown. Penney’s ideas are a close cousin to the GitHub Flavored Markdown table spec, except for one crucial addition: table captions.

Captions for tables do not seem to appear in any of the other Markdown specs that include table parsing. (Likewise, no Markdown spec except Penney’s has any support for <figcaption>. I’m not sure why.)

Example [Multi]Markdown source for a table:

| First Header  | Second Header | Third Header         |
| :------------ | :-----------: | -------------------: |
| First row     | Data          | Very long data entry |
| Second row    | **Cell**      | *Cell*               |
| Third row     | Cell that spans across two columns  ||
[Table caption, works as a reference][section-mmd-tables-table1]

The final line, [Table caption, works as a reference][section-mmd-tables-table1] is what forms the <caption> for the <table>.
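For illustration, recognizing such a caption line could look like the Python sketch below. It assumes, as in the example above, that the caption is a single line of the form `[text][label]`. The pattern and the name `parse_caption` are my own and are not taken from MultiMarkdown itself.

```python
import re

# Assumed shape of a caption line: [caption text][reference label]
CAPTION = re.compile(r"^\[([^\]]+)\]\[([^\]]+)\]$")

def parse_caption(line):
    """Return (caption, label) for a caption line, or None otherwise."""
    match = CAPTION.match(line.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)

print(parse_caption("[Table caption, works as a reference][section-mmd-tables-table1]"))
print(parse_caption("| Third row     | Cell that spans across two columns  ||"))
```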

Is this not the simplest execution for tables in CommonMark?


My choice was made to minimize breaking changes when the spec is complete. I’m not sure it’s the best possible choice for the spec.


@vitaly Understood. Still, it was a good choice in and of itself :slight_smile:

More than a year has passed. Is the core settled enough? Can you please elaborate on how we can create these extensions? Should it be a pull request on the spec itself in a new section ‘extensions’? Or should it be its own git repo?

If we would create a fork, the entire idea of having one spec gets lost. That doesn’t seem useful.


Having a spec for the core elements is still useful even if there are implementations that do extensions to the core in different ways. That is a much better situation than having implementations that render core elements differently. Anyway, I’m sorry this is trying your patience. This thread is a good place to record and discuss ideas for table syntax, which should make things faster when we get to that.


If/when standardization of pipe tables occurs, it would be good to address escaping of pipes. I just had the following scenario in some documentation:

|`E[foo|="en"]`|...some text...|

Note the internal | character within the first cell in the row. The Markdown implementation I am using considers it a column separator.

The only solution with common Markdown processors appears to be to escape to HTML, for example:

|<code>E[foo&#124;="en"]</code>	|...some text...|

But I don’t find that very satisfying.

In my example, given that I am within a backtick code span, we could make a case that the internal pipe should not be considered a column separator at all. But in the case where there are no backticks, there should be a non-HTML way to escape pipes.

According to CommonMark 0.23 Section 6.1. Backslash escapes,

Any ASCII punctuation character may be backslash-escaped […]

I believe pipe (|) is an ASCII punctuation character, and \ is a decidedly non-HTML way to escape it.

As for code spans, Section 6.3. Code spans states:

Code span backticks have higher precedence than any other inline constructs except HTML tags and autolinks.

I don’t see how pipe tables, if/when standardized, would change these simple rules.

The implementation you are using seems to either use different precedence rules or (more likely) disregard precedence altogether.

@ebruchez Any implementation which uses “|” to delimit table cells (or whatever syntax construct), but

  1. does not provide the escape sequence “\|” to “hide” that character, and
  2. does not treat it as data inside a code span

seems pretty broken, in particular since (2.) already holds for any “markup-relevant” character in the very first Markdown description by Gruber.
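To show what honoring both points could look like, here is a small Python sketch of a row splitter that treats `\|` as a literal pipe and ignores pipes inside code spans. It is one possible reading of the two rules, not the behavior of any existing implementation. The name `split_cells` is hypothetical, and only single-backtick code spans are handled.

```python
def split_cells(row):
    """Split a table row on '|' unless the pipe is escaped or inside a code span."""
    cells, current = [], []
    in_code = False
    i = 0
    while i < len(row):
        ch = row[i]
        # Rule 1: outside code spans, a backslash-escaped pipe is literal data.
        if ch == "\\" and not in_code and i + 1 < len(row) and row[i + 1] == "|":
            current.append("|")
            i += 2
            continue
        if ch == "`":
            # Simplification: toggle on every backtick (single-backtick spans only).
            in_code = not in_code
            current.append(ch)
        elif ch == "|" and not in_code:
            cells.append("".join(current).strip())
            current = []
        else:
            # Rule 2: inside a code span, '|' falls through here as plain data.
            current.append(ch)
        i += 1
    cells.append("".join(current).strip())
    # Drop the empty cells produced by optional leading/trailing pipes.
    if cells and cells[0] == "":
        cells = cells[1:]
    if cells and cells[-1] == "":
        cells = cells[:-1]
    return cells

print(split_cells('|`E[foo|="en"]`|...some text...|'))
print(split_cells(r"a \| b | c"))
```

With this reading, the row from the gitbook example splits into two cells rather than three.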


That said, based on examples I tried in BabelMark, it seems that botching escape sequence recognition is not an uncommon problem in Markdown implementations, and that using a character reference like you did with &#124; is in fact the most robust work-around (and sometimes the only one). For a simple example:

*foo \* bar*

will not render as “foo * bar” (wrapped in <EM>) in all implementations, and even less so

*foo * bar*

(though I’m pretty sure that both forms should, by very basic Markdown rules), but

*foo &#42; bar*

will in every implementation employed there (even in some really dumb ones!).


@Dmitry Hmm, now that you quote it from the specification, the use of the term “precedence” in this context doesn’t feel quite right—am I the only one having this hunch?

Indeed, select implementations of CommonMark use the term priority, but I don’t see much harm in using precedence in this context.


@tin-pot I am not sure which implementation this is (it’s the one used by gitbook). I agree it’s quite broken. Hopefully CommonMark can make sure these kinds of scenarios are fully covered, and if the core of CommonMark already does cover escapes and code spans properly, then it’s even better!