Tables in pure Markdown

For normal stuff, I don’t really see how it’s that much different. You can just as well add whitespace to align the columns etc. The only extra things are the row separators and the leading {| and trailing |}.

Yes, those extra things are the things that don’t look like plain text. Additionally, the use of ! for headers is unusual. The syntax in the OP, used by MarkdownExtra, looks like a plain-text table you’d see people manually write in an email or the like, which is precisely the aesthetic that Markdown adheres to and what makes it attractive. Taken together, MediaWiki tables look nothing like a table; nobody unfamiliar with the syntax would understand what it was trying to do when looking at the source.

(MarkdownExtra has a special syntax for header alignment which doesn’t map to plain-text conventions, using a colon on the separator line below the header, which is weird. I’ve seen similar support for this feature by literally aligning the text; putting any whitespace on one or both sides between the header text and the pipes aligns the text accordingly. This is a better syntax, because it looks natural, does what it looks like, and acts correctly by default (centering when there’s no space or space on both sides).)

I agree, at first glance I had no idea what the MediaWiki table syntax meant. Markdown is supposed to be human and machine readable :thumbsup:

I like that MarkdownExtra gives multiple formats for tables. I like the fact that you can do either:

| Header | Header | 
|--------|--------|
| Row    | Row    |

or

Header | Header
-------|-------
Row    | Row

If it gets much more complicated than that, as in MarkdownExtra, I don’t think it’d be a standard feature.

However, I’m sticking to my guns in saying that yes, Markdown needs a standard table syntax.

To be honest, I see it as a failing of HTML that we can’t use a syntax such as

<table src="data.csv"></table>

to separate data from markup.

Of course, it’s possible to polyfill the behavior using

<table data-src="data.csv"></table>

and a bit of JavaScript.

Overall, tables would add a level of complexity that probably belongs in an extension. That way, a standards-compliant Markdown parser can choose not to support tables, or choose to handle tables in a different manner.

Me too, tables are an essential feature for markdown and should be in the standard.
I don’t see the need to only implement things that were in the old markdown “standard”.

Building table lines out of “|” and “-” seems like a good idea.

So this:

| Header | Header |
|--------|--------|
| Row    | Row    |

would be the stmd table, and if you want a fancier one you can use a table addon.
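To make the minimal form concrete, here is a rough Python sketch of what a converter for it could look like. It is not taken from any existing implementation: the function name `render_pipe_table` is made up for illustration, and it assumes a header line, one line of dashes, and then data rows, with the outer pipes optional.

```python
from html import escape


def _split_cells(line):
    # Outer pipes are optional, so drop them before splitting on "|".
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def render_pipe_table(text):
    """Render the minimal pipe table (header, dashes, rows) as HTML.

    Hypothetical helper: no alignment colons, no escaped pipes.
    """
    lines = text.strip().splitlines()
    header = _split_cells(lines[0])
    delimiter = _split_cells(lines[1])
    # Every delimiter cell must consist of dashes only.
    if any(set(cell) != {"-"} for cell in delimiter):
        raise ValueError("second line must be a row of dashes")
    out = ["<table>", "<thead>"]
    out.append("<tr>" + "".join(f"<th>{escape(c)}</th>" for c in header) + "</tr>")
    out += ["</thead>", "<tbody>"]
    for line in lines[2:]:
        cells = _split_cells(line)
        out.append("<tr>" + "".join(f"<td>{escape(c)}</td>" for c in cells) + "</tr>")
    out += ["</tbody>", "</table>"]
    return "\n".join(out)
```

Both variants quoted earlier in the thread (with and without the leading and trailing pipes) go through the same code path, which is part of why this form looks cheap to standardise.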

My problem with addons is that I think that many people won’t use them and if we don’t have tables in standard markdown we probably have no tables at all.

Agreed. Making certain features optional means they won’t be fully widespread. And things like tables and anchors are common enough to belong in the core standard.

A common complaint I hear about tables in the Markdown variants that implement them is that they are hard to maintain. So here are some ways I think the “Markdown Extra” syntax can be simplified for this effort.

This is the [Markdown Extra syntax for tables](https://michelf.ca/projects/php-markdown/extra/):

 | Item      | Value |
 | --------- | -----:|
 | Computer  | $1600 |
 | Phone     |   $12 |

First example (Compress the pipe headers):

To indicate a field is a header you use |-, -| .
For header alignment: |:- left aligned -|, |- right aligned -:| |:- Centre aligned -:| .

|:- Header -:|:- Header -:|
|   Row      |   Row      |
|   Row      |   Row      |
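As a sanity check that the compressed header is easy to parse, here is a small Python sketch. The function name `parse_compact_header` and the returned `(title, alignment)` tuples are my own invention for illustration; the only thing taken from the proposal is the meaning of `|:-`, `-:|` and the plain `|-`, `-|` markers.

```python
import re

# One header cell looks like ":- Title -:", with both colons optional.
_CELL = re.compile(r"^(:?)-\s*(.*?)\s*-(:?)$")


def parse_compact_header(line):
    """Return a list of (title, alignment) for a compressed pipe header.

    alignment is "left", "right", "center", or None when no colon is given.
    """
    columns = []
    for cell in line.strip().strip("|").split("|"):
        match = _CELL.match(cell.strip())
        if match is None:
            raise ValueError(f"not a compact header cell: {cell!r}")
        left, title, right = match.groups()
        if left and right:
            alignment = "center"
        elif left:
            alignment = "left"
        elif right:
            alignment = "right"
        else:
            alignment = None
        columns.append((title, alignment))
    return columns
```

Because the header markers double as the "this is a table" signal, the rows below need no separator line at all.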

Second Example ( CSV Input):

The second issue is that people find it hard to deal with formatting the pipes. If alignment of cell data is of no concern to the user, then we should use CSV data as the inspiration.

I’m a big fan of CSV data, due to how easy it is to type. The ease of use comes from sticking to csv which most people use already, and combining it with a simplified table header.

If you still need alignment control for each cell, then you can just use the previous (but simplified) pipe tables shown above using |:, :|

|:- Year -|:- Make  -|:- Model                         -:| 
  1997,      Ford,      E350
  1999,      Chevy,    "Venture ""Extended Edition"""
  1999,      Chevy,    "Venture ""Extended Edition"""
  1996,      Jeep,      Grand Cherokee

 This is some other text, since the end of a table is implied by a new paragraph. 

example data from: http://en.wikipedia.org/wiki/Comma-separated_values

Essentially, just treat pipes as ‘optional’ for the actual cell data (which is the part that gets modified most often anyway, compared to the header). This way we avoid too much formatting, and heck, if you are lazy, you could just remove the whitespace and it would still be very maintainable, like so:

|:- Year -|:- Make -|:- Model  -:| 
1997, Ford, E350
1999, Chevy, "Venture ""Extended Edition"" "
1999, Chevy, "Venture ""Extended Edition"" "
1996, Jeep, Grand Cherokee
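Here is a hedged Python sketch of how the body of such a table could be read with an off-the-shelf CSV parser. The name `parse_csv_table` is hypothetical, the header handling is deliberately crude (it just strips the `:` and `-` markers), and the table is assumed to end at the first blank line.

```python
import csv


def parse_csv_table(text):
    """Split a 'compact header + CSV rows' block into (headers, rows)."""
    lines = text.strip("\n").splitlines()
    header_line, body = lines[0], lines[1:]
    # Crude header parse: drop outer pipes, then the ":-" / "-:" markers.
    headers = [cell.strip(" :-") for cell in header_line.strip().strip("|").split("|")]
    data = []
    for line in body:
        if not line.strip():
            break  # a blank line ends the table
        data.append(line.strip())
    # skipinitialspace lets 'Chevy, "Venture ..."' be read as a quoted field.
    reader = csv.reader(data, skipinitialspace=True)
    rows = [[field.strip() for field in row] for row in reader]
    return headers, rows
```

The same function handles the whitespace-padded version and the "lazy" version, since leading spaces after each comma are skipped either way.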

The second approach is my preference, since I believe Markdown is about getting formatting out of the way of your writing.

edit: crossposted to https://github.com/jgm/stmd/issues/73

You could also remove the "s around cells. A new cell is started with a comma, so whitespaces should be no problem. If you want a comma you have to escape it.
As long as I get my pipe tables this sounds like a good idea. :slight_smile:
You could just copy and paste csv into markdown. That’s really handy.

|:- Year -|:- Make  -|:- Model                         -:| 
  1999,      Chevy,    Venture \, "Extended Edition"

hmmm… well, the issue with that is that it looks rather ugly. At the very least, commas like these

|:-  Year -|:- Make  -|:- Model                         -:| 
     1999,     Chevy,    "Venture, ""Extended Edition"" "

are not too ugly. I’m literally using the ‘pseudo-standard’ for CSV explained by Wikipedia and documented in RFC 4180. So again, the benefit of this approach is that you can copy and paste most CSV data (and most CSV conforms to RFC 4180).


RFC 4180 : http://tools.ietf.org/html/rfc4180#page-2

  1. If double-quotes are used to enclose fields, then a double-quote
    appearing inside a field must be escaped by preceding it with
    another double quote. For example:
  "aaa","b""bb","ccc"
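The doubled-quote rule quoted above is what Python's standard `csv` module already implements, so a short round-trip sketch may help show what the escaping looks like in practice. The helper names `parse_record` and `format_record` are just illustrative wrappers, not part of any proposal here.

```python
import csv
import io


def parse_record(line):
    """Parse one CSV line; spaces right after a comma are ignored."""
    return next(csv.reader([line], skipinitialspace=True))


def format_record(fields):
    """Write one CSV line, quoting only the fields that need it."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(fields)
    return buffer.getvalue().rstrip("\r\n")
```

For the RFC example, `parse_record('"aaa","b""bb","ccc"')` gives `['aaa', 'b"bb', 'ccc']`, and `format_record` on that list writes `aaa,"b""bb",ccc`, quoting only the field that contains a double quote. A field with a comma in it is handled the same way, which is the case the backslash-escape suggestion was trying to cover.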

Seems we can sum this up:

Standard pipe (markdown extra)

Pros:

  • Is traditional, and most people already remember how to do it.
  • Visually appealing

Cons:

  • Takes time to create
  • Takes time to modify
  • Parsing is not simple

Compact Pipe Table

Pros:

  • Is more compact. And marginally faster to write.
  • Parsing is a little easier

Con:

  • loses flexibility of |:------:| in setting all cells below to same alignment

CSV Tables (With compressed pipe headers)

Pros:

  • Fastest to write
  • Aside from parsing the compressed pipe header, the rest of the data is in CSV format which is much faster to parse.
  • Easiest to modify

Cons:

  • Doesn’t have the same beauty as piped headers.

Wrap up

My opinion is that we should aim to support CSV tables first, simply due to the ease of implementation. But the pipe tables from Markdown Extra should be included as well, since they occur so often in emails and text documents.

In summary

  • Pipe tables are best for ‘one off’ presentation like emails, where flexibility comes before ease of maintenance.

  • CSV tables are best when the table is likely to be constantly updated, and the user is willing to sacrifice flexibility in alignment for maximum ease of modification.

Sorry guys. I continued off what mofosyne suggested over at github https://github.com/jgm/stmd/issues/73#issuecomment-54611781. As long as this site stays up, obviously the discussion should be kept here. I’m going to quote myself so that the reader doesn’t have to click the link if she does not wish to (feel free to object to what I put forward!):


I’d also argue for the inclusion of the three other types of tables supported by pandoc markdown, as well as support of using a + in the hyphen line of pipe tables as produced by emacs orgtbl-mode, as pandoc markdown also supports it. Also, the grouping (colspan) feature from multimarkdown’s table syntax.

So now in summary:

  1. Simple
  2. Multiline
  3. Grid
  4. CSV
  5. Pipe, without the compact/compressed headers as would be used with the csv table, but with multimarkdown’s grouping (colspan) feature.
  • I haven’t actually seen a reason for not having multimarkdown’s grouping feature in pipe tables already in pandoc’s markdown though (whether in the github issues, github wiki, or in the discuss mailing list)? So it would be nice if john or someone else could specify or link to the reason for its exclusion. And then if that exists, it could be used as a reason to keep it out of the spec here too.

This way we could probably satisfy quite a lot of needs, while still retaining fairly simple and readable syntaxes (with varying degrees of flexibility and maintainability).

Of course, I may have easily overlooked a conflict or something in supporting five different table syntaxes, or overlooked a difficult implementation detail so feel free to bring up such a case. This is only a suggestion for the future; for whenever john wants to deal with tables.

Super simple semi-CSV table (discussed on markdown-discuss@six.pairlist.net). Only the commas in the header row are converted into " | ".

    Year  |   Make  |   Model
    1997,    Ford,      E350
    1999,    Chevy,    Venture "Extended Edition"
    1999,    Chevy,    Venture "Extended Edition"
    1996,    Jeep,      Grand Cherokee

Pros: Very easy to convert from csv.
Con: Very inflexible.

Also if needed you can include ------- for better legibility

    Year  |   Make  |   Model
    ----------------------------------------------
    1997,    Ford,      E350
    1999,    Chevy,    Venture "Extended Edition"
    1999,    Chevy,    Venture "Extended Edition"
    1996,    Jeep,      Grand Cherokee
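A rough Python sketch of this semi-CSV idea, to show how little parsing it needs. The name `parse_semi_csv` is hypothetical, and the "very inflexible" con shows up directly in the code: rows are split on every comma, so a cell cannot contain one.

```python
def parse_semi_csv(block):
    """Parse a 'pipe header + comma rows' block into (headers, rows).

    An optional line made only of dashes after the header is skipped.
    """
    lines = [line.strip() for line in block.strip().splitlines() if line.strip()]
    headers = [cell.strip() for cell in lines[0].split("|")]
    body = lines[1:]
    if body and set(body[0]) == {"-"}:
        body = body[1:]  # purely decorative separator line
    # Naive split: no quoting rules, so commas inside cells are not supported.
    rows = [[cell.strip() for cell in line.split(",")] for line in body]
    return headers, rows
```

Quotes are left untouched here, which matches the examples above where `Venture "Extended Edition"` is written without any CSV escaping.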

I’m sure you guys have seen this style of table before.

Well if you like pipes instead of commas (make sure you align it right!):

    Year  |   Make  |   Model
    ----------------------------------------------
    1997  |   Ford  |   E350
    1999  |   Chevy |   Venture "Extended Edition"
    1999  |   Chevy |   Venture "Extended Edition"
    1996  |   Jeep  |   Grand Cherokee

Another very interesting concept from Bill Costa.

In other words push the text up against the pipe for left/right
justification, and spaces on either side for centered.

|    Year|Make     |    Model                        |
| -------------------------------------------------- |
|    1997|Ford     |    E350                         |
|    1999|Chevy    |    Venture "Extended Edition"   |
|    1999|Chevy    |    Venture "Extended Edition"   |
|    1996|Jeep     |    Grand Cherokee               |

Pros: Can align tables with minimal effort. Alignment is intuitive.
Cons: Changing the alignment of a whole column is painful. (Might as well just use pandoc’s -:| style pipe alignment, aye? But then again, that might use more mental energy than this for single-shot tables.)
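To show how the "push the text against the pipe" rule could be detected, here is a minimal Python sketch. The function names are made up for illustration, and it assumes the rule exactly as described above: padding only on the left means right-aligned, padding only on the right means left-aligned, and padding on both sides (or on neither) means centered.

```python
def cell_alignment(raw_cell):
    """Infer alignment from the whitespace around the text in one cell."""
    if not raw_cell.strip():
        return None  # empty cell, nothing to infer
    padded_left = raw_cell[0].isspace()
    padded_right = raw_cell[-1].isspace()
    if padded_left and not padded_right:
        return "right"  # text sits against the right-hand pipe
    if padded_right and not padded_left:
        return "left"  # text sits against the left-hand pipe
    return "center"


def row_alignments(line):
    """Infer the alignment of every cell in one pipe-delimited row."""
    inner = line.strip()
    if inner.startswith("|"):
        inner = inner[1:]
    if inner.endswith("|"):
        inner = inner[:-1]
    return [cell_alignment(cell) for cell in inner.split("|")]
```

Run on the header row of the example, this gives right, left, center for Year, Make and Model. Whether alignment should come from the header row only or from each row separately is left open here; the sketch just reports what one line says.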

Standardizing “pipe” tables (also known as PHP Markdown Extra tables) might be a step in the right direction, since many implementations already support them in some way.

However, I wanted to share the notes I made for myself when adding table support to the Minima converter, to show you how many corner cases and ambiguities are out there:

Example 1

  • Pandoc: 4 cols in header, 4 cols in 1st and 2nd row, 2 cols in 3rd row.
  • PHP Markdown Extra: 4 cols everywhere.

h1|h2|h3|h4
-:|-|-|-
1|2|3|.|
a|b|c|d
I|.

Example 2

4th column ignored in Pandoc, present in PHP Markdown Extra.

A|B|C|D
-|-|-
1|2|3|4
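Examples 1 and 2 come down to which line decides the column count. As an illustration only, here is one rule a specification could pick, sketched in Python: take the count from one chosen line (for instance the separator line) and pad or truncate every data row to match. This is not a description of what Pandoc or PHP Markdown Extra actually do internally; `normalize_rows` is a hypothetical helper.

```python
def normalize_rows(column_count, rows):
    """Pad short rows with empty cells and drop cells beyond column_count."""
    return [(row + [""] * column_count)[:column_count] for row in rows]
```

With a count of 3 taken from the `-|-|-` line, the row `1|2|3|4` loses its fourth cell, and a one-cell row is padded with two empty cells. Picking the header line or the widest row as the reference instead would give the other outcomes listed in these examples.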

Example 3

A complete table in PHP Markdown Extra, no table detected in Pandoc.

A|B|C|D
-|
1|2|3|4

Example 4

OK table in Pandoc. It doesn’t work in PHP Markdown Extra because of missing header line.

 --:|--|---|--
  1 |2 |3  |4
  a |b |c  |d
  I |II|III|IV

Example 5

Not detected as a table in Pandoc, due to single :, but works in PHP Markdown Extra.

|a|b|c|
|-|-|:|
||2|3|

Example 6

An empty table in PHP Markdown Extra, not detected as a table in Pandoc.

||||
|-|-|:|
||||

Example 7

2 cols in header and 4 cols tbody in Pandoc, erratic behaviour in PHP Markdown Extra.

x|y
-|-|-|-
||||||||||

Example 8

Pipe chars entered as \| or `|` should not trigger a cell separation. Works that way in Pandoc and kramdown.

|  1 |   2 |  3
| -- | --- | --
| \| | `|` | \|
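The rule in Example 8 can be sketched as a small character scanner in Python. This is a simplified illustration, not the algorithm of any of the implementations named here: `split_row` is a made-up name, and code spans are assumed to be delimited by single backticks only.

```python
def split_row(line):
    """Split a table row on pipes, ignoring \\| and pipes inside `code`."""
    cells, current = [], []
    in_code = False
    i = 0
    while i < len(line):
        char = line[i]
        if char == "\\" and not in_code and i + 1 < len(line):
            # Keep the backslash and the escaped character together.
            current.append(line[i:i + 2])
            i += 2
            continue
        if char == "`":
            in_code = not in_code  # simplistic: single-backtick spans only
        if char == "|" and not in_code:
            cells.append("".join(current))
            current = []
        else:
            current.append(char)
        i += 1
    cells.append("".join(current))
    cells = [cell.strip() for cell in cells]
    # Optional leading/trailing pipes produce one empty cell at each end.
    if cells and cells[0] == "":
        cells = cells[1:]
    if cells and cells[-1] == "":
        cells = cells[:-1]
    return cells
```

On the last row of Example 8 this yields three cells, and each of them still contains its pipe character. Examples 9 and 12 show that real inline syntax (comments, emphasis, unbalanced backticks) makes the general case harder than this.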

Example 9

While Pandoc recognizes only 5 cols in <thead> in the case below, PHP Markdown Extra sees 7 cols, and that feels like the right thing.

| `|` | <!--|--> | \| |    *|*      |  __|__  |
| --- | -------- | -- | --- | ----- | -- | -- |
| a   | b        | c  | d   | e     | f  | g  |

Example 10

Pandoc does not recognize one-column tables. PHP Markdown Extra does not recognize them either: it needs the data line written as |a|, which conflicts with its own documentation.

| 1 | 2 | 3
|---|:--|--:
a|

Example 11

A table inside a second list item. Pandoc handles it somehow, but kudos to kramdown!

- item1
- | -
a | b | c | d 

Example 12

There should be some empty headers and 13 `|` cells. OK in PHP Markdown Extra and kramdown. Only 7 `|` cells in Pandoc.

||||||||||||||||||||||
-|-|-|-|-|-|-
`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`

Example 13

A table without header. Works in kramdown only.

:--- | ---- | ---:
A    | B    | C
1    | 2
I    |

Ideally, all of the above inconsistencies should go away with a proper table syntax specification. And it feels like it’s going to be a bit of work.

The third example is already a ‘pipe’ table (the beginning and ending pipes being optional). The fourth is potentially viable, however, I believe pandoc’s markdown specifically forbids this, maybe due to difficulty in implementation detail? Might be relevant since john also created pandoc obviously.

Good point rwzy. I would like to caution that the 3rd example you mention does not work, because it is mistaken for an ‘h1’ header, as shown in babelmark test 1.

But it’s easily fixed by appending a single |, so that is not too painful: babelmark test 2. Ah… but pandoc only recognizes the first column (which matches what jks is saying about corner cases). PHP Markdown Extra, parsedown, and kramdown work, though (so good on them!)

I really do wish pandoc could notice the context of the ------- line when making tables without needing |, but yes, it’s most likely due to difficulty in implementation (due to the need for context).

They just need to sort out the context issue of ------. Aside from that, using , for inline CSV data input should be relatively straightforward (which means: just give us the option of choosing , or | as the cell delimiter).

At the end of the day, CommonMark comes from the Markdown implementations of Discourse, Stack Exchange, Reddit and GitHub.

Discourse: no tables
Stack Exchange: no tables
Github: pipe delimited cells, colons on second line mark alignment
Reddit: pipe delimited cells, colons on second line mark alignment

Thus, if any syntax should be chosen, pipe delimited cells with colons on second line marking alignment should be it. This is in line with the goal of backwards compatibility.
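For reference, reading the alignment off the second line is a small amount of code. The Python sketch below is my own illustration rather than the parser of any of the sites named above; `parse_delimiter_row` is a hypothetical name, and it assumes each cell on that line is one or more dashes with an optional colon on either end.

```python
import re

# A delimiter cell: optional colon, one or more dashes, optional colon.
_DELIMITER_CELL = re.compile(r"^(:?)-+(:?)$")


def parse_delimiter_row(line):
    """Return per-column alignments, or None if line is not a delimiter row."""
    inner = line.strip()
    if inner.startswith("|"):
        inner = inner[1:]
    if inner.endswith("|"):
        inner = inner[:-1]
    alignments = []
    for cell in inner.split("|"):
        match = _DELIMITER_CELL.match(cell.strip())
        if match is None:
            return None  # e.g. a lone ":" or an empty cell
        left, right = match.groups()
        if left and right:
            alignments.append("center")
        elif left:
            alignments.append("left")
        elif right:
            alignments.append("right")
        else:
            alignments.append(None)
    return alignments
```

Under this particular rule the `|-|-|:|` line from Example 5 is rejected because its last cell has no dash. A spec would have to say explicitly whether that is the intended behaviour.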

I said I’d argue for more table syntaxes but didn’t exactly mention why. I’ll try to list them now, in order from most important to least.

  1. Pipe
  • Seems to be the most obvious one that I think we can all agree to have, right?
  • That being said, bp_, I still think having some more alternate syntaxes (below) in the spec is beneficial, and doesn’t affect backwards compatibility, since we could have those in addition to pipe tables.
  • Again, I would like to see a reason not to include multimarkdown’s colspanning here, since it easily allows for horizontal grouping, which is used quite commonly in tables.
  2. Simple (from pandoc’s markdown)
  • Allows for a very simple readable syntax that doesn’t need to resort to pipes. Less powerful, but if that’s all one needs, it’s much easier/faster for them to do simple tables, right?
  3. Multiline (from pandoc’s markdown again)
  • Sort of an extension of the simple syntax which is still very useful due to allowing multiple lines in a cell, unlike in any other syntax (except for grid tables, which I refer to below).
  • For example, using lists in a table cell is common.
  • The syntax also looks very similar to what’s commonly used to style tables in academia (no vertical lines like pipe tables have).
  4. CSV
  • Pipes looking ugly when unaligned is the main reason, I think?
  • So the same as pipe tables, but with commas instead of pipes?
    • But I’m not sure if mofosyne/anyone still wants ‘compact headers’ for comma delimited tables?

  5. Grid Tables
  • Initially I thought that if this makes the spec, emacs users’ lives would become very easy and therefore it’s worth it.
  • However, putting it in the spec means people who don’t use emacs might be presented with such a table, which they might need to edit themselves…
  • Since it’s quite difficult to do so without an advanced editor, maybe it shouldn’t be in the spec? Thoughts? (I’m not an emacs user though.)

Except of course for where major contributors to the spec and @jgm himself have stated several times they were looking to standardise on what other implementations were doing and keep much backwards-compatibility in.

Even then it makes sense to include tables in the spec in some form.

Has @jgm mentioned anything about why they didn’t include tables?

I’m not saying there should be only one syntax, but it would probably be easier to start with one, and if we have to pick one, we should pick the one that users of the services that will be picking up CommonMark first are already used to.

I believe the goal is for an almost silent transition from Markdown to CommonMark, with only side cases being changed for the better.

Do you mean like the third reply in this thread? Perhaps they didn’t consider tables a ‘core’ feature because Gruber’s original didn’t include any table syntax?