Tables in pure Markdown

Standardizing “pipe” tables (also known as PHP Markdown Extra tables) might be a step in the right direction, since many implementations already support them in some way.

However, I wanted to share the notes I made for myself while adding table support to the Minima converter, to show how many corner cases and ambiguities are out there:

Example 1

  • Pandoc: 4 cols in header, 4 cols in 1st and 2nd row, 2 cols in 3rd row.
  • PHP Markdown Extra: 4 cols everywhere.
h1|h2|h3|h4
-:|-|-|-
1|2|3|.|
a|b|c|d
I|.

Example 2

4th column ignored in Pandoc, present in PHP Markdown Extra.

A|B|C|D
-|-|-
1|2|3|4

Example 3

A complete table in PHP Markdown Extra, no table detected in Pandoc.

A|B|C|D
-|
1|2|3|4

Example 4

OK table in Pandoc. It doesn’t work in PHP Markdown Extra because of missing header line.

 --:|--|---|--
  1 |2 |3  |4
  a |b |c  |d
  I |II|III|IV

Example 5

Not detected as a table in Pandoc, due to single :, but works in PHP Markdown Extra.

|a|b|c|
|-|-|:|
||2|3|

Example 6

An empty table in PHP Markdown Extra, not detected as a table in Pandoc.

||||
|-|-|:|
||||

Example 7

2 cols in header and 4 cols tbody in Pandoc, erratic behaviour in PHP Markdown Extra.

x|y
-|-|-|-
||||||||||

Example 8

Pipe chars entered as \| or `|` should not trigger a cell separation. Works that way in Pandoc and kramdown.

|  1 |   2 |  3
| -- | --- | --
| \| | `|` | \|
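
A minimal sketch of the splitting rule Example 8 asks for, written in Python. The function name `split_row` is hypothetical and not taken from any of the implementations mentioned here. It treats a backslash-escaped pipe and a pipe inside a single-backtick code span as literal text, and it assumes that leading and trailing pipes are optional. It does not try to handle the other inline constructs that appear in Example 9.

```python
def split_row(line):
    """Split one pipe-table row into a list of cell strings.

    A pipe preceded by a backslash, or inside a backtick code span,
    is kept as literal text instead of ending the cell.
    """
    cells, current = [], []
    in_code = False
    i = 0
    while i < len(line):
        ch = line[i]
        # An escaped pipe outside a code span stays in the cell text.
        if ch == "\\" and not in_code and i + 1 < len(line) and line[i + 1] == "|":
            current.append("\\|")
            i += 2
            continue
        if ch == "`":
            # Simplification: only single-backtick code spans are tracked.
            in_code = not in_code
        if ch == "|" and not in_code:
            cells.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    cells.append("".join(current).strip())
    # Leading and trailing pipes are optional, so drop the empty edge cells.
    if cells and cells[0] == "":
        cells = cells[1:]
    if cells and cells[-1] == "":
        cells = cells[:-1]
    return cells
```

With this rule the data line of Example 8 yields three cells, and `||2|3|` from Example 5 yields an empty cell followed by `2` and `3`. An unterminated backtick would swallow the rest of the line, which a real implementation would have to decide about.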

Example 9

While Pandoc recognizes only 5 cols in <thead> in the case below, PHP Markdown Extra sees 7 cols, and that feels like the right thing.

| `|` | <!--|--> | \| |    *|*      |  __|__  |
| --- | -------- | -- | --- | ----- | -- | -- |
| a   | b        | c  | d   | e     | f  | g  |

Example 10

Pandoc does not recognize one-column tables. PHP Markdown Extra does not recognize them either, and it needs the data line written as |a|, which conflicts with its own documentation.

| 1 | 2 | 3
|---|:--|--:
a|

Example 11

A table inside a second list item. Pandoc handles it somehow, but kudos to kramdown!

- item1
- | -
a | b | c | d 

Example 12

There should be some empty headers and 13 `|` cells. OK in PHP Markdown Extra and kramdown. Only 7 `|` cells in Pandoc.

||||||||||||||||||||||
-|-|-|-|-|-|-
`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`

Example 13

A table without header. Works in kramdown only.

:--- | ---- | ---:
A    | B    | C
1    | 2
I    |

Ideally, all of the above inconsistencies should go away with a proper table syntax specification. And it feels like it’s going to be a bit of work.
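
To make the column-count ambiguity concrete, here is a small Python sketch of one possible rule. It is an assumption for illustration, not the behaviour of any implementation listed above: the delimiter row fixes the number of columns, short rows are padded with empty cells, and extra cells are dropped. The name `normalize_rows` is hypothetical, and rows are assumed to be already split into lists of cells.

```python
def normalize_rows(width, rows):
    """Force every row to exactly `width` cells.

    Short rows are padded with empty strings; extra cells are dropped.
    `width` is assumed to be the number of cells in the delimiter row.
    """
    return [(list(row) + [""] * width)[:width] for row in rows]


# Body rows of Example 1, already split into cells.
rows = [["1", "2", "3", "."], ["a", "b", "c", "d"], ["I", "."]]
print(normalize_rows(4, rows))
```

Under this rule Example 1 comes out with 4 columns everywhere, and the fourth cell of the data row in Example 2 is ignored. A specification could just as well choose a different rule; the point is only that it has to choose one.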

10 Likes

The third example is already a ‘pipe’ table (the beginning and ending pipes being optional). The fourth is potentially viable; however, I believe pandoc’s markdown specifically forbids this, maybe because it is difficult to implement? Might be relevant since John also created pandoc, obviously.

Good point rwzy. I would like to caution that the 3rd example you mention does not work. This is because it is mistaken for an ‘h1’ header, as shown in babelmark test 1.

But it’s easily fixed by appending a single |, so that is not too painful (babelmark test 2). Ah… but pandoc only recognizes the first column (which matches what jks is talking about in terms of corner cases). PHP Markdown Extra, parsedown, and kramdown work though (so good on them!)

Really do wish pandoc could notice the context of the ------- line when making tables without needing |, but yes, it’s most likely due to difficulty in implementation (due to the need for context).

They just need to sort out the context issue of ------. Aside from that, using `,` for inline CSV data input should be relatively straightforward (which means… just give us the option of choosing `,` or `|` as the cell delimiter).

2 Likes

At the end of the day, CommonMark comes from the Markdown implementations of Discourse, Stack Exchange, Reddit and GitHub.

Discourse: no tables
Stack Exchange: no tables
Github: pipe delimited cells, colons on second line mark alignment
Reddit: pipe delimited cells, colons on second line mark alignment

Thus, if any syntax should be chosen, pipe delimited cells with colons on second line marking alignment should be it. This is in line with the goal of backwards compatibility.
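
As a rough illustration of “colons on second line mark alignment”, here is a Python sketch that reads a delimiter row and returns one alignment per column. The function name `parse_alignments` and the returned labels are hypothetical, and the rule that each cell needs at least one dash is an assumption.

```python
import re

# One delimiter cell: optional colon, one or more dashes, optional colon.
_DELIMITER_CELL = re.compile(r"^(:?)-+(:?)$")


def parse_alignments(delimiter_row):
    """Return one alignment label per column, or None if this is not a delimiter row."""
    row = delimiter_row.strip()
    # Leading and trailing pipes are treated as optional.
    if row.startswith("|"):
        row = row[1:]
    if row.endswith("|"):
        row = row[:-1]
    alignments = []
    for cell in row.split("|"):
        match = _DELIMITER_CELL.match(cell.strip())
        if match is None:
            return None  # a cell without dashes means this is not a table
        left = match.group(1) == ":"
        right = match.group(2) == ":"
        if left and right:
            alignments.append("center")
        elif left:
            alignments.append("left")
        elif right:
            alignments.append("right")
        else:
            alignments.append("default")
    return alignments
```

Because a cell must contain a dash, a row such as `|-|-|:|` is rejected here, which happens to match what jks reports for Pandoc in Example 5; a spec could also decide to accept a lone colon.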

7 Likes

I said I’d argue for more table syntaxes but didn’t exactly mention why. I’ll try to list them now, in order from most important to least.

  1. Pipe
  • Seems to be the most obvious one that I think we can all agree to have, right?
  • That being said, bp_, I still think having some more alternate syntaxes (below) in the spec is beneficial and doesn’t affect backwards compatibility, since we could have those in addition to pipe tables.
  • Again, I would like to see a reason not to include multimarkdown’s colspanning here, since it easily allows for horizontal grouping, which is used quite commonly in tables.
  2. Simple (from pandoc’s markdown)
  • Allows for a very simple readable syntax that doesn’t need to resort to pipes. Less powerful, but if that’s all one needs, it’s much easier/faster for them to do simple tables, right?
  3. Multiline (from pandoc’s markdown again)
  • Sort of an extension of the simple syntax which is still very useful because it allows multiple lines in a cell, unlike any other syntax (except for grid tables, which I refer to below).
  • For example, using lists in a table cell is common.
  • The syntax also looks very similar to what’s commonly used to style tables in academia (none of the vertical lines that pipe tables have).
  4. CSV
  • Pipes looking ugly when unaligned is the main reason, I think?
  • So the same as pipe tables, but with commas instead of pipes?
    • But I’m not sure if mofosyne/anyone still wants ‘compact headers’ for comma delimited tables?
  5. Grid Tables
  • Initially I thought that if this makes the spec, emacs users’ lives would become very easy and therefore it’s worth it.
  • However, putting it in the spec means people who don’t use emacs might be presented with such a table, which they might need to edit themselves…
  • Since it’s quite difficult to do so without an advanced editor, maybe it shouldn’t be in the spec? Thoughts? (I’m not an emacs user though.)
4 Likes

Except of course for where major contributors to the spec and @jgm himself have stated several times they were looking to standardise on what other implementations were doing and keep much backwards-compatibility in.

Even then it makes sense to include tables in the spec in some form.

4 Likes

Has @jgm mentioned anything about why they didn’t include tables?

I’m not saying there should be only one syntax, but it would probably be easier to start with one, and if we have to pick one, we should pick the one that users of the services that will be picking up CommonMark first are already used to.

I believe the goal is for an almost silent transition from Markdown to CommonMark, with only side cases being changed for the better.

3 Likes

Do you mean like the third reply in this thread? Perhaps they didn’t consider tables a ‘core’ feature because gruber’s original didn’t include any table syntax?

Oh, it seemed as though you meant only the one. But yes I agree with that. Judging by the number of likes on op, I think pipe tables are the most popular across the flavours and therefore the most likely to get accepted, at the least.

Yea. The biggest point for pipe tables is that they are already in widespread use on GitHub and Reddit. Backwards compatibility with the most common dialect of markdown is a must if CommonMark is to be common.

4 Likes

That is probably the case. And while I can certainly see the merit in being just a formalised standard of Gruber’s Markdown, over time I have been getting the feeling CommonMark is trying to be more.

(And, at the risk of repeating myself, there is a precedent in fenced code blocks.)

2 Likes

Yea… well historically it was initially Standard Markdown. It’s just that it’s now CommonMark.

Which is a good opportunity for us to be more than ‘just markdown’, rather than being forced to adhere to tradition to our detriment. Much like how Python 3.x broke compatibility with Python 2.x to correct fundamental issues with the language, we could perhaps strategically break away from traditional specs in order to get a more cohesive markdown language (e.g. introducing code fencing and tables, both in common use, and both not in the original specs).

1 Like

Just brainstorming:

| title: The Title | name: The Name | ph: The Phone | ← this introduces a table and row names
---------- ← this introduces a new row
title: value1
name: value2
ph: value3
---------- ← another row

| ← indicates the end of the table

  • Still easy to read because you have column key in front of each value
  • The order in which cells are described does not matter (looks more like a JSON object)
  • Easy to maintain by using keys title, name, ph, so you can still easily add / remove columns
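
To check that the brainstormed format is parseable, here is a rough Python sketch. Everything in it is an assumption layered on the example above: the name `parse_record_table` is hypothetical, a line made only of dashes starts a new row, a lone `|` ends the table, unknown keys are ignored, and the `←` annotations are not part of the input.

```python
def parse_record_table(text):
    """Parse the key/value table sketch into (columns, rows).

    `columns` maps each key to its header label; `rows` is a list of
    dicts keyed by column key. Rows with no values are dropped.
    """
    lines = [ln.strip() for ln in text.strip().splitlines()]
    # Header line: "| key: Label | key: Label |"
    columns = {}
    for part in lines[0].strip("|").split("|"):
        key, _, label = part.partition(":")
        columns[key.strip()] = label.strip()
    rows, current = [], None
    for ln in lines[1:]:
        if ln == "|":                      # end of table
            break
        if ln and set(ln) == {"-"}:        # dashes: start a new row
            if current:
                rows.append(current)
            current = {}
            continue
        key, sep, value = ln.partition(":")
        if sep and current is not None and key.strip() in columns:
            current[key.strip()] = value.strip()
    if current:
        rows.append(current)
    return columns, rows
```

Since each value carries its key, the cells of a row can be written in any order and still land in the right column, which is the property the second bullet describes.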
2 Likes

I thought I’d chime in; we use Markdown for public documentation, internal documentation as well as a quick way to keep notes. Being able to format tables has come up fairly early in each case.

  • Tables are used as a way to show information, not a way to structure it (i.e. it’s not a database). As an example, if we create a feature list, we may have the feature ID, the component and the description. This can be shown in a bullet list but becomes very hard to read; it’s much better to create columns. It becomes much more human-readable.
  • The pain of maintaining Markdown tables is smaller than the advantages they bring. Anyway, most IDEs have multiple cursors, so adding/removing spaces on all rows is usually as easy as clicking on the last character of the first row you want to expand, Ctrl+Alt+Click on the last one, then add spaces.
  • Even though colspan, rowspan and cell alignment are very nice to have, allowing basic table formatting (creating columns, maybe headers) would still be much better than nothing.

And my humble opinion: if I ask someone to make a table using a basic text editor, I expect them to create it using pipes and dashes. Maybe I’m wrong. I think it also makes it more readable in pure text, which is a huge plus when comparing in Git. I agree CSV makes the writer’s life easier, but I’ll argue that if somebody decides to use Markdown, it’s because it’s as easy to read as it is to write.

10 Likes

Actually this would cause issues as soon as you need to resize a column. That sounds like a recipe for annoying, bogus merge conflicts.

1 Like

I’m currently implementing a feature for Python’s html2text which converts HTML tables into Markdown.
I added two options: insert the table as an HTML block, or as GitHub Flavored Markdown.
Here’s example output.
So I vote for the GitHub table syntax to be included in the spec. As for me, tables are needed when you write content in a text editor.
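
For readers who have not seen the target format, here is a minimal Python sketch of emitting a GitHub-style pipe table from rows of text. This is not html2text’s code or API; the function name `rows_to_pipe_table` is hypothetical, and padding the columns and escaping pipes with a backslash are choices made for the example.

```python
def rows_to_pipe_table(header, rows):
    """Render a header and body rows as a padded pipe table string."""
    def esc(cell):
        # A literal pipe inside a cell would otherwise split it.
        return str(cell).replace("|", "\\|")

    n = len(header)
    table = [[esc(c) for c in header]]
    for row in rows:
        cells = [esc(c) for c in row]
        table.append((cells + [""] * n)[:n])   # pad or trim to header width
    # Each column is as wide as its longest cell, and at least 3 for "---".
    widths = [max(3, max(len(r[i]) for r in table)) for i in range(n)]

    def fmt(cells):
        return "| " + " | ".join(c.ljust(w) for c, w in zip(cells, widths)) + " |"

    lines = [fmt(table[0]), "| " + " | ".join("-" * w for w in widths) + " |"]
    lines += [fmt(r) for r in table[1:]]
    return "\n".join(lines)


print(rows_to_pipe_table(["Feature", "Component"], [["F1", "parser"], ["F2", "a|b"]]))
```

The padding is purely cosmetic; the same table is valid without it, which keeps diffs smaller when a column later needs to grow.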

2 Likes

I find the kramdown full syntax quite handy since it supports header/footer.

Hi,

I have a job need for tables and I am concerned that without them, this specification will have less support than it might (and make life harder for those that need tables where the spec is implemented).

As this is something relevant to my interests, I’ve started working on building up the ideas I have.

If enough people are interested, I’m willing to start work on the actual spec for the ideas below and submit them to the project.


Table Syntax Ideas

  1. Pipe delimited tables appear to be the most common implementation
  2. Support for colspan and rowspan are considered nice to have by many, but critical for those that need them
  3. Tables must be human readable
  4. Idea: Make the divider a double pipe to allow more command flexibility and enable formatting directives within the pipes
  5. The example should define the syntax
||            || Column Header One || Column Header Two || Column Header Three || if this line is followed by a line of ||--|| then it is a header, also, Row Headers are indicated by an optional *empty* A1 cell
||-----------:||:------------------||:-----------------:||--------------------:|| : to indicate default column alignment as per ME
||Row Hdr 1   ||Left Aligned||Centre Aligned||Right Aligned|| whitespace should not matter
||Row Hdr 2   |>|colspan the next cell                  ||Row 2, Col 3         || Merge the cells in Col1 and 2
||Row Hdr 3   |>>|colspan the next two cells                                   || Merge the cells in Col1, 2, and 3
||Row Hdr 4   |v|rowspan the next cell down||           || Column 3            || rowspan on col1 and an empty cell
||Row Hdr 5   ||                   || Column 2          || Column 3            ||
||Row Hdr 6   |vv|merge down 2     |>| Merge right                             ||
||Row Hdr 7   ||                   || Column 2         :||: Column 3           || Row7 Col2 RAlign, Col3 LAlign (in exception to column default)
||Row Hdr 8   ||                   || Column 2         :||: Column 3           ||
  • This should produce something like the following (Yes css folks would create this differently, I just wanted to give people an idea of what the rendered table should resemble):
    <table border="2" width="0">
    <tr>
        <th></th><th>Column Header One</th><th>Column Header Two</th><th>Column Header Three</th>
    </tr>
    <tr>
        <td align="right">Row Hdr 1</td><td align="left">Left Aligned</td><td align="center">Centre Aligned</td><td align="right">Right Aligned</td>
    </tr>
    <tr>
        <td align="right">Row Hdr 2</td><td colspan="2" align="left">colspan the next cell</td><td align="right">Row 2, Col 3</td>
    </tr>
    <tr>
        <td align="right">Row Hdr 3</td><td colspan="3" align="left">colspan the next two cells</td>
    </tr>
    <tr>
        <td align="right">Row Hdr 4</td><td rowspan="2" align="left">rowspan the next cell down</td><td align="center"></td><td align="right">Column 3</td>
    </tr>
    <tr>
        <td align="right">Row Hdr 5</td><td align="center">Column 2</td><td align="right">Column 3</td>
    </tr>
    <tr>
        <td align="right">Row Hdr 6</td><td rowspan="3" align="left">Merge Down 2</td><td colspan="2" align="center">Merge Right</td>
    </tr>
    <tr>
        <td align="right">Row Hdr 7</td><td align="right">Column 2</td><td align="left">Column 3</td>
    </tr>
    <tr>
        <td align="right">Row Hdr 8</td><td align="right">Column 2</td><td align="left">Column 3</td>
    </tr>
    </table>

2 Likes

Relevant xkcd

I don’t think it is a good idea to implement another table syntax in stmd. There is already a widely used table syntax, the one that e.g. GitHub is using. Everybody knows it and most people want exactly this. So I think the best idea is to implement this syntax in stmd and implement all your crazy table ideas in separate plugins.

I mean, the whole idea of a standard is to describe what most people do anyway. So there should be no space for any new and fancy ideas. That’s not what a standard should be used for.

7 Likes