Tables in pure Markdown

xengi · September 4, 2014, 7:01pm

Me too, tables are an essential feature for markdown and should be in the standard.
I don’t see the need to only implement things that were in the old markdown “standard”.

Building table lines out of “|” and “-” seems like a good idea.

So this:

| Header | Header |
|--------|--------|
| Row    | Row    |

would be the stmd table and if you want a more fancy one you can use a table addon.
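A minimal sketch of how such a basic table could be read, assuming exactly the three-part shape shown above (header line, delimiter line of “|” and “-”, body lines). The name `parse_pipe_table` is made up for illustration and is not taken from any existing implementation; empty edge cells are not handled.

```python
def parse_pipe_table(text):
    """Parse a minimal pipe table into (header, rows).

    Assumes a header line, a delimiter line made of '|' and '-',
    then zero or more body lines.
    """
    def split_row(line):
        # Drop the optional outer pipes, then split on the inner ones.
        return [cell.strip() for cell in line.strip().strip("|").split("|")]

    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        raise ValueError("a table needs a header and a delimiter line")
    header = split_row(lines[0])
    delimiter = split_row(lines[1])
    # Every delimiter cell must be a run of dashes, e.g. '--------'.
    if not all(cell and set(cell) == {"-"} for cell in delimiter):
        raise ValueError("second line is not a delimiter row")
    rows = [split_row(ln) for ln in lines[2:]]
    return header, rows


table = """
| Header | Header |
|--------|--------|
| Row    | Row    |
"""
header, rows = parse_pipe_table(table)
print(header, rows)
```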

My problem with addons is that I think that many people won’t use them and if we don’t have tables in standard markdown we probably have no tables at all.

mofosyne · September 5, 2014, 5:41am

Agreed. Making certain features optional means they won’t be widely supported. And things like tables and anchors are common enough to belong in the core standard.

A common complaint about tables in the markdown variants that implement them is that they are hard to maintain. So here are some ways I think the “Markdown Extra” syntax can be simplified for this effort.

This is the [Markdown Extra syntax for tables](https://michelf.ca/projects/php-markdown/extra/):

 | Item      | Value |
 | --------- | -----:|
 | Computer  | $1600 |
 | Phone     |   $12 |

First example (Compress the pipe headers):

To indicate that a field is a header, wrap it in |- and -|.
For header alignment: |:- left aligned -|, |- right aligned -:|, |:- centre aligned -:|.

|:- Header -:|:- Header -:|
|   Row      |   Row      |
|   Row      |   Row      |
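To make the compact header rule concrete, here is a rough sketch that reads one header line and reports each column's title and alignment. It assumes the markers are exactly `-`/`:-` on the left and `-`/`-:` on the right; `parse_compact_header` and the regular expression are illustrative only.

```python
import re

# One compact header cell: optional ':' then '-', the title, then '-' and optional ':'.
CELL = re.compile(r"^(:?)-\s*(.*?)\s*-(:?)$")


def parse_compact_header(line):
    """Return a list of (title, alignment) pairs for a compact header line."""
    columns = []
    for raw in line.strip().strip("|").split("|"):
        match = CELL.match(raw.strip())
        if match is None:
            raise ValueError(f"not a compact header cell: {raw!r}")
        left, title, right = match.groups()
        if left and right:
            align = "center"
        elif left:
            align = "left"
        elif right:
            align = "right"
        else:
            align = None  # no colon: leave alignment to the renderer
        columns.append((title, align))
    return columns


print(parse_compact_header("|:- Header -:|:- Header -:|"))
```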

Second Example ( CSV Input):

The second issue is that people find it hard to keep the pipes formatted. If alignment of cell data is of no concern to the user, then we should use CSV data as the inspiration.

I’m a big fan of CSV data, due to how easy it is to type. The ease of use comes from sticking to csv which most people use already, and combining it with a simplified table header.

If you still need alignment control for each cell, then you can just use the previous (but simplified) pipe tables shown above, using |:- and -:|.

|:- Year -|:- Make  -|:- Model                         -:| 
  1997,      Ford,      E350
  1999,      Chevy,    "Venture ""Extended Edition"""
  1999,      Chevy,    "Venture ""Extended Edition"""
  1996,      Jeep,      Grand Cherokee

 This is some other text, since the end of a table is implied by a new paragraph.

example data from: http://en.wikipedia.org/wiki/Comma-separated_values

Essentially, just treat pipes as ‘optional’ for the actual cell data (which is the part that gets modified most often anyway, compared to the header). This way we can avoid too much formatting, and heck, if you are lazy, you could just remove the whitespace and it would still be very maintainable, like so:

|:- Year -|:- Make -|:- Model  -:| 
1997, Ford, E350
1999, Chevy, "Venture ""Extended Edition"" "
1999, Chevy, "Venture ""Extended Edition"" "
1996, Jeep, Grand Cherokee

The second approach is my preference, since I believe markdown is about getting formatting out of the way of your writing.
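For what it's worth, the CSV body is cheap to parse with an off-the-shelf CSV reader. A sketch using Python's standard `csv` module, assuming the first line is a compact pipe header and every following line is plain CSV; `parse_csv_table` is a hypothetical helper, and titles that themselves start or end with `-` or `:` would be mangled by the simple `strip` used here.

```python
import csv


def parse_csv_table(text):
    """Split a compact-header CSV table into (titles, rows)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Header: peel the ':' / '-' markers and padding off each cell.
    titles = [cell.strip(" :-") for cell in lines[0].strip().strip("|").split("|")]
    # Body: ordinary CSV; skipinitialspace tolerates padding after each comma.
    reader = csv.reader(lines[1:], skipinitialspace=True)
    rows = [[field.strip() for field in row] for row in reader]
    return titles, rows


sample = '''|:- Year -|:- Make -|:- Model  -:|
1997, Ford, E350
1999, Chevy, "Venture ""Extended Edition"" "
1996, Jeep, Grand Cherokee
'''
titles, rows = parse_csv_table(sample)
print(titles)
for row in rows:
    print(row)
```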

edit: crossposted to https://github.com/jgm/stmd/issues/73

xengi · September 5, 2014, 7:28am

You could also remove the "s around cells. A new cell is started with a comma, so whitespaces should be no problem. If you want a comma you have to escape it.
As long as I get my pipe tables this sounds like a good idea.
You could just copy and paste csv into markdown. That’s really handy.
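A quick sketch of that quote-free variant, assuming a backslash is the escape character (so `\,` is a literal comma) and that padding around cells is discarded. `split_escaped` is an invented name; note that this is not RFC 4180 behaviour.

```python
def split_escaped(line, delimiter=",", escape="\\"):
    """Split on unescaped delimiters; an escaped delimiter stays in the cell."""
    cells, current = [], []
    chars = iter(line)
    for ch in chars:
        if ch == escape:
            # Keep the next character literally and drop the backslash itself.
            # A trailing backslash with nothing after it is kept as-is.
            current.append(next(chars, escape))
        elif ch == delimiter:
            cells.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    cells.append("".join(current).strip())
    return cells


print(split_escaped(r'1999, Chevy, Venture \, "Extended Edition"'))
```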

mofosyne · September 5, 2014, 8:23am

|:- Year -|:- Make  -|:- Model                         -:| 
  1999,      Chevy,    Venture \, "Extended Edition"

hmmm… well, the issue with that is that it looks rather ugly. At the very least, commas like these

|:-  Year -|:- Make  -|:- Model                         -:| 
     1999,     Chevy,    "Venture, ""Extended Edition"" "

are not too ugly. I’m literally using the ‘pseudo-standard’ for csv explained by wikipedia and documented in RFC 4180. So again, the benefit of this approach is that you can copy and paste most csv data as-is (and most csv conforms to RFC 4180).
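As a sanity check on the copy-paste claim, Python's standard `csv` module follows the same doubled-quote convention by default. The sketch below reads the row above and writes it back out; the only non-default option is `skipinitialspace`, needed here because of the padding after the commas.

```python
import csv
import io

line = '1999,     Chevy,    "Venture, ""Extended Edition"" "'

# Reading: the quoted comma stays in the cell, and "" becomes one literal quote.
fields = [f.strip() for f in next(csv.reader([line], skipinitialspace=True))]
print(fields)

# Writing: the csv module re-quotes the field and doubles the embedded quotes.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerow(fields)
print(buffer.getvalue())
```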

RFC 4180, Common Format and MIME Type for Comma-Separated Values (CSV) Files:

If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:

  "aaa","b""bb","ccc"

mofosyne · September 5, 2014, 9:01am

Seems we can sum this up:

Standard pipe (markdown extra)

Pros:

Is traditional, and most people already remember how to do it.
Visually appealing

Cons:

Takes time to create
Takes time to modify
Parsing is not simple

Compact Pipe Table

Pros:

Is more compact. And marginally faster to write.
Parsing is a little easier

Con:

loses flexibility of |:------:| in setting all cells below to same alignment

CSV Tables (With compressed pipe headers)

Pros:

Fastest to write
Aside from parsing the compressed pipe header, the rest of the data is in CSV format which is much faster to parse.
Easiest to modify

Cons:

Doesn’t have the same beauty as piped headers.

Wrap up

My opinion is that we should aim to support csv tables first, simply due to the ease of implementation. But the pipe tables from markdown extra should be included as well, since they occur so often in emails and text documents.

In summary

Pipe tables are best for ‘one off’ presentation like emails, where flexibility comes before ease of maintenance.
CSV tables are best when the table is likely to be constantly updated, and the user is willing to sacrifice flexibility in alignment for maximum ease of modification.

rwzy · September 5, 2014, 12:23pm

Sorry guys. I continued off what mofosyne suggested over at github https://github.com/jgm/stmd/issues/73#issuecomment-54611781. As long as this site stays up, obviously the discussion should be kept here. I’m going to quote myself so that the reader doesn’t have to click the link if she does not wish to (feel free to object to what I put forward!):

I’d also argue for the inclusion of the three other types of tables supported by pandoc markdown, as well as support of using a + in the hyphen line of pipe tables as produced by emacs orgtbl-mode, as pandoc markdown also supports it. Also, the grouping (colspan) feature from multimarkdown’s table syntax.

So now in summary:

Simple
Multiline
Grid
CSV
Pipe, without the compact/compressed headers as would be used with the csv table, but with multimarkdown’s grouping (colspan) feature.

I haven’t actually seen a reason why multimarkdown’s grouping feature isn’t already in pandoc’s pipe tables (whether in the github issues, the github wiki, or the discuss mailing list). So it would be nice if john or someone else could state or link to the reason for its exclusion. If such a reason exists, it could be used to keep the feature out of the spec here too.

This way we could probably satisfy quite a lot of needs, while still retaining fairly simple and readable syntaxes (with varying degrees of flexibility and maintainability).

Of course, I may have easily overlooked a conflict or something in supporting five different table syntaxes, or overlooked a difficult implementation detail so feel free to bring up such a case. This is only a suggestion for the future; for whenever john wants to deal with tables.

mofosyne · September 5, 2014, 12:57pm

Super simple semi-csv table (discussed on markdown-discuss@six.pairlist.net). It converts only the commas in the header line into " | ".

    Year  |   Make  |   Model
    1997,    Ford,      E350
    1999,    Chevy,    Venture "Extended Edition"
    1999,    Chevy,    Venture "Extended Edition"
    1996,    Jeep,      Grand Cherokee

Pros: Very easy to convert from csv.
Con: Very inflexible.

Also if needed you can include ------- for better legibility

    Year  |   Make  |   Model
    ----------------------------------------------
    1997,    Ford,      E350
    1999,    Chevy,    Venture "Extended Edition"
    1999,    Chevy,    Venture "Extended Edition"
    1996,    Jeep,      Grand Cherokee

I’m sure you guys have seen this style of tables before

Well if you like pipes instead of commas (make sure you align it right!):

    Year  |   Make  |   Model
    ----------------------------------------------
    1997  |   Ford  |   E350
    1999  |   Chevy |   Venture "Extended Edition"
    1999  |   Chevy |   Venture "Extended Edition"
    1996  |   Jeep  |   Grand Cherokee

Another very interesting concept from Bill Costa.

In other words push the text up against the pipe for left/right
justification, and spaces on either side for centered.

|    Year|Make     |    Model                        |
| -------------------------------------------------- |
|    1997|Ford     |    E350                         |
|    1999|Chevy    |    Venture "Extended Edition"   |
|    1999|Chevy    |    Venture "Extended Edition"   |
|    1996|Jeep     |    Grand Cherokee               |

Pros: Can align tables, with minimal effort. Alignment is intuitive
Cons: Changing alignment of whole rows is painful. (Might as well just use pandoc’s -:| style pipe alignment aye? But then again, that might use more mental energy than this, for single shot tables)
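A sketch of how the padding rule could be detected per cell, assuming alignment is inferred purely from whether the raw cell text touches its left or right pipe. The helper names are invented, and a cell with no padding on either side is treated as ambiguous.

```python
def cell_alignment(raw):
    """Infer alignment from the padding of one raw cell (the text between two pipes)."""
    if not raw.strip():
        return None  # empty cell: nothing to infer
    pad_left = raw[0].isspace()
    pad_right = raw[-1].isspace()
    if pad_left and pad_right:
        return "center"
    if pad_left:
        return "right"  # text pushed against the right-hand pipe
    if pad_right:
        return "left"  # text pushed against the left-hand pipe
    return None  # no padding at all: ambiguous


def row_alignments(line):
    # Remove only the outermost pipes so the inner padding is preserved.
    inner = line.strip()
    if inner.startswith("|"):
        inner = inner[1:]
    if inner.endswith("|"):
        inner = inner[:-1]
    return [cell_alignment(cell) for cell in inner.split("|")]


print(row_alignments("|    Year|Make     |    Model                        |"))
```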

jks · September 5, 2014, 2:43pm

Standardizing “pipe” tables (also known as PHP Markdown Extra tables) might be a step in the right direction, since many implementations already support them in some way.

However, I wanted to share the notes I made for myself when adding table support to the Minima converter, to show you how many corner cases and ambiguities are out there:

Example 1

Pandoc: 4 cols in header, 4 cols in 1st and 2nd row, 2 cols in 3rd row.
PHP Markdown Extra: 4 cols everywhere.

h1|h2|h3|h4
-:|-|-|-
1|2|3|.|
a|b|c|d
I|.

Example 2

4th column ignored in Pandoc, present in PHP Markdown Extra.

A|B|C|D
-|-|-
1|2|3|4

Example 3

A complete table in PHP Markdown Extra, no table detected in Pandoc.

A|B|C|D
-|
1|2|3|4

Example 4

OK table in Pandoc. It doesn’t work in PHP Markdown Extra because of missing header line.

 --:|--|---|--
  1 |2 |3  |4
  a |b |c  |d
  I |II|III|IV

Example 5

Not detected as a table in Pandoc, due to single :, but works in PHP Markdown Extra.

|a|b|c|
|-|-|:|
||2|3|

Example 6

An empty table in PHP Markdown Extra, not detected as a table in Pandoc.

||||
|-|-|:|
||||

Example 7

2 cols in header and 4 cols tbody in Pandoc, erratic behaviour in PHP Markdown Extra.

x|y
-|-|-|-
||||||||||

Example 8

Pipe chars entered as \| or `|` should not trigger a cell separation. Works that way in Pandoc and kramdown.

|  1 |   2 |  3
| -- | --- | --
| \| | `|` | \|
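One way to get the behaviour described in this example is to split rows with a small scanner instead of a plain `split("|")`. The sketch below is only an illustration (`split_cells` is an invented name): it handles backslash-escaped pipes and single-backtick code spans, and ignores the other constructs from the later examples such as HTML comments or emphasis.

```python
def split_cells(line):
    """Split a table row on '|', ignoring escaped pipes and pipes in code spans."""
    cells, current = [], []
    in_code = False
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and not in_code and i + 1 < len(line) and line[i + 1] == "|":
            current.append("\\|")  # escaped pipe: keep it inside the cell
            i += 2
            continue
        if ch == "`":
            in_code = not in_code  # simplistic: single-backtick spans only
        if ch == "|" and not in_code:
            cells.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    cells.append("".join(current).strip())
    # A leading or trailing pipe yields an empty edge cell; drop those.
    if cells and cells[0] == "":
        cells = cells[1:]
    if cells and cells[-1] == "":
        cells = cells[:-1]
    return cells


print(split_cells(r"| \| | `|` | \|"))
```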

Example 9

While Pandoc recognizes only 5 cols in <thead> in the case below, PHP Markdown Extra sees 7 cols, and that feels like the right thing.

| `|` | <!--|--> | \| |    *|*      |  __|__  |
| --- | -------- | -- | --- | ----- | -- | -- |
| a   | b        | c  | d   | e     | f  | g  |

Example 10

Pandoc does not recognize one-column tables. PHP Markdown Extra does not recognize them here either: it needs the data line written as |a|, which conflicts with its own documentation.

| 1 | 2 | 3
|---|:--|--:
a|

Example 11

A table inside a second list item. Pandoc handles it somehow, but kudos to kramdown!

- item1
- | -
a | b | c | d

Example 12

There should be some empty headers and 13 `|` cells. OK in PHP Markdown Extra and kramdown. Only 7 `|` cells in Pandoc.

||||||||||||||||||||||
-|-|-|-|-|-|-
`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`|`

Example 13

A table without header. Works in kramdown only.

:--- | ---- | ---:
A    | B    | C
1    | 2
I    |

Ideally, a proper table syntax specification should make all of the above inconsistencies go away. And it feels like it’s going to be a bit of work.
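To illustrate what a specification would have to pin down, here is one candidate policy for the mismatched column counts in Examples 1 and 2: treat the delimiter row as authoritative, pad short rows and truncate long ones. This is only an assumption for the sake of discussion, not the behaviour of Pandoc, PHP Markdown Extra, or kramdown, and `normalize_rows` is a made-up name.

```python
def normalize_rows(header, delimiter, rows):
    """Force the header and every body row to the width of the delimiter row.

    Short rows are padded with empty cells; long rows are truncated.
    """
    width = len(delimiter)

    def fit(cells):
        return (cells + [""] * width)[:width]

    return fit(header), [fit(row) for row in rows]


# Example 2: four header cells, three delimiter cells, four body cells.
print(normalize_rows(["A", "B", "C", "D"], ["-", "-", "-"], [["1", "2", "3", "4"]]))

# Last row of Example 1: two cells under a four-column delimiter row.
print(normalize_rows(["h1", "h2", "h3", "h4"], ["-:", "-", "-", "-"], [["I", "."]]))
```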

rwzy · September 5, 2014, 2:46pm

The third example is already a ‘pipe’ table (the beginning and ending pipes being optional). The fourth is potentially viable, however, I believe pandoc’s markdown specifically forbids this, maybe due to difficulty in implementation detail? Might be relevant since john also created pandoc obviously.

mofosyne · September 5, 2014, 3:08pm

Good point rwzy. I would like to caution that the 3rd example you mention does not work. This is because it is mistaken for an ‘h1’ header, as shown in babelmark test 1.

But it’s easily fixed by appending a single |, so that is not too painful: babelmark test 2. Ah… but pandoc only recognizes the first column (which matches what jks is talking about in terms of corner cases). PHP Markdown Extra, parsedown, and kramdown work though (so good on them!)

I really do wish pandoc could notice the context of the ------- line when making tables without needing |, but yes, it’s most likely due to difficulty in implementation (due to the need for context).

They just need to sort out the context issue of ------. Aside from that, using , for inline csv data input should be relatively straightforward (which means… just give us the option of choosing , or | as the cell delimiter).

bp_ · September 5, 2014, 11:28pm

At the end of the day, CommonMark comes from the m******n implementations of Discourse, Stack Exchange, Reddit and Github.

Discourse: no tables
Stack Exchange: no tables
Github: pipe delimited cells, colons on second line mark alignment
Reddit: pipe delimited cells, colons on second line mark alignment

Thus, if any syntax should be chosen, pipe delimited cells with colons on second line marking alignment should be it. This is in line with the goal of backwards compatibility.
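For reference, the “colons on second line” convention is small enough to sketch in a few lines. This assumes each delimiter cell is one or more dashes with an optional colon on either side; `parse_alignments` is an invented helper, and individual implementations differ in the corner cases.

```python
import re

# One delimiter cell: optional ':', one or more '-', optional ':'.
DELIM_CELL = re.compile(r"^(:?)-+(:?)$")


def parse_alignments(delimiter_line):
    """Map a delimiter row such as '| --------- | -----:|' to column alignments."""
    alignments = []
    for raw in delimiter_line.strip().strip("|").split("|"):
        match = DELIM_CELL.match(raw.strip())
        if match is None:
            return None  # not a delimiter row, so not a table
        left, right = match.groups()
        if left and right:
            alignments.append("center")
        elif right:
            alignments.append("right")
        elif left:
            alignments.append("left")
        else:
            alignments.append(None)  # no colon: renderer's default
    return alignments


print(parse_alignments("| --------- | -----:|"))
print(parse_alignments("|:--|:-:|--:|"))
```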

rwzy · September 6, 2014, 2:25am

I said I’d argue for more table syntaxes but didn’t exactly mention why. I’ll try to list them now, in order from most important to least.

Pipe

Seems to be the most obvious one that I think we can all agree to have right?
That being said, bp_, I still think having some more alternate syntaxes (below) in the spec is beneficial, and doesn’t affect backwards compatibility, since we could have those in addition to pipe tables.
Again, I would like to see a reason not to include multimarkdown’s colspanning here, since it easily allows for horizontal grouping, which is used quite commonly in tables.

Simple (from pandoc’s markdown)

Allows for a very simple readable syntax that doesn’t need to resort to pipes. Less powerful, but if that’s all one needs, it’s much easier/faster for them to do simple tables, right?

Multiline (from pandoc’s markdown again)

Sort of an extension of the simple syntax which is still very useful due to allowing multiple lines in a cell, unlike in any other syntax (except for grid tables, which I refer to below).
For example, using lists in a table cell is common.
The syntax also looks very similar to what’s commonly used to style tables in academia. (No vertical lines that pipe tables have).

CSV

Pipes look ugly when unaligned being the main reason I think?
So the same as pipe tables, but with commas instead of pipes?
- But I’m not sure if mofosyne/anyone still wants ‘compact headers’ for comma delimited tables?

Grid Tables

Initially I thought that if this makes the spec, emacs users’ lives would become very easy and therefore it’s worth it.
However, putting it in the spec means people who don’t use emacs might be presented with such a table, which they might need to edit themselves…
Since it’s quite difficult to do so without an advanced editor, maybe it shouldn’t be in the spec? Thoughts? (I’m not an emacs user though.)

Zegnat · September 6, 2014, 8:49am

Except of course for where major contributors to the spec and @jgm himself have stated several times they were looking to standardise on what other implementations were doing and keep much backwards-compatibility in.

Even then it makes sense to include tables in the spec in some form.

mofosyne · September 6, 2014, 9:03am

Has @jgm mentioned anything about why they didn’t include tables?

bp_ · September 6, 2014, 9:17am

I’m not saying there should be only one syntax, but it would probably be easier to start with one, and if we have to pick one, we should pick the one that users of the services that will be picking up CommonMark first are already used to.

I believe the goal is for an almost silent transition from m******n to CommonMark, with only side cases being changed for the better.

rwzy · September 6, 2014, 9:17am

Do you mean like the third reply in this thread? Perhaps they didn’t consider tables a ‘core’ feature because gruber’s original didn’t include any table syntax?

rwzy · September 6, 2014, 9:21am

Oh, it seemed as though you meant only the one. But yes I agree with that. Judging by the number of likes on op, I think pipe tables are the most popular across the flavours and therefore the most likely to get accepted, at the least.

mofosyne · September 6, 2014, 9:50am

Yea. The biggest point for pipe tables is that they are already in widespread use on github and reddit. Backwards compatibility with the most common dialects of markdown is a must if CommonMark is to be common.

Zegnat · September 7, 2014, 3:25pm

That is probably the case. And while I can certainly see the merit in being just a formalised standard of Gruber’s Markdown, over time I have been getting the feeling CommonMark is trying to be more.

(And, at the risk of repeating myself, there is a precedent in fenced code blocks.)

mofosyne · September 7, 2014, 4:47pm

Yea… well historically it was initially Standard Markdown. It’s just that it’s now CommonMark.

Which is a good opportunity for us to be more than ‘just markdown’, rather than being forced to adhere to tradition to our detriment. Much like how Python 3.x broke compatibility with Python 2.x to correct fundamental issues with the language, we could perhaps strategically break away from the traditional specs in order to bring about a more cohesive markdown language (e.g. by introducing code fencing and tables, both in common use, and both not in the original spec).