Tables in pure Markdown


#123

Or like this:

| Server | IP | Description |
|--------|----|-------------|
| cl1 
| 192.168.100.1
| This is my first server in the list
|
| cl2 
| 10.10.1.22
| This is another one server
|
| windows-5BSD567DSLOS
| 127.0.0.12
| This is customer windows vm. dont touch this! 
|
| DFHSDDFFUCKENLONGNAME
| 192.168.1.50
| Some printer
|
...

And this:

| Item | Amount | Cost |
|:-----|-------:|-----:|
| Orange
| 10
| 7.00
|
| Bread 
| 4
| 3.00
|
| Butter
| 1
| 5.00
|
| Total |      | 15.00 |

This solution gives you more space for creativity. Besides:

  • It needs no global changes (only one micro fix is needed).
  • It does not break anything.
  • Everyone will be happy.

This is the best solution, isn’t it?


#124

I agree with you that pipe tables aren’t very good for big tables. I still think they should be standardized; pipe tables are such a widely-supported Markdown extension that people expect them to work whether they’re in the spec or not.

On the other hand, I think you’re unfairly characterizing HTML tables. You don’t have to put in the end tags for rows or cells. This is fine (at least, it’s specified by HTML5 to work the way we want it to):

<table>
<tr>
<th> Server
<th> IP
<th> Description
<tr>
<td> cl1
<td> 192.168.100.1
<td> This is my first server in the list
<tr>
<td> cl2
<td> 10.10.1.22
<td> This is another one server
<tr>
<td> windows-5BSD567DSLOS
<td> 127.0.0.12
<td> This is customer windows vm. dont touch this!
<tr>
<td> DFHSDDFFUCKENLONGNAME
<td> 192.168.1.50
<td> Some printer
..
</table>

I guess that MediaWiki tables are better. I’m not convinced that they’re so much better that CommonMark should add them to the spec when they’re so rare among existing implementations.


#125

Yes, I see, HTML5 tables are quite simple, thanks for this. But I love Markdown and I want to make it better for working with tables too.
I described to you the main problem with using Markdown tables.

I don’t want a full MediaWiki table implementation in Markdown, but I want to have the opportunity to write tables in vertical mode.

This solution fully solves this problem.
It needs only a simple check in the code for a newline character before a pipe.
That would allow common Markdown tables to be written in vertical mode too.
Why not? It does not break anything, but may be very useful for all users.
And they are still the same Markdown tables…
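
To make the "simple check" concrete, here is a minimal Python sketch. It is not taken from any existing parser. It assumes that a row is complete once a physical line ends with a pipe, so cell lines without a trailing pipe are glued onto the lines that follow. The function name `join_vertical_rows` is hypothetical, and escaped pipes are ignored for brevity.

```python
def join_vertical_rows(lines):
    """Merge physical lines into logical pipe-table rows.

    Assumption (not from any spec): a row ends at the first line
    whose last non-space character is '|'.
    """
    rows = []
    pending = ""
    for line in lines:
        piece = line.strip()
        pending = piece if not pending else pending + " " + piece
        if pending.endswith("|"):
            rows.append(pending)
            pending = ""
    if pending:
        rows.append(pending)  # unterminated last row is kept as-is
    return rows


vertical = [
    "| Server | IP |",
    "|--------|----|",
    "| cl1 ",
    "| 192.168.100.1",
    "|",
]
for row in join_vertical_rows(vertical):
    print(row)
```

Ordinary one-line rows pass through unchanged, which is why the change would not break existing pipe tables under this assumption.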


#126

I agree with your implementation.
But I would remove the alignment row beneath the headers and put the alignment markers into the headers themselves.

I think this is the best solution and the most flexible. People can have a grid table or a list-style table, which is something I like.


#127

I find it odd that no-one mentioned the necessity of row headers (i.e. th cells that describe rows, not columns). For some reason all the current Markdown flavours which support tables only support column headers. But it is really important for accessibility to support both!

Just consider the following table of pizzas and their cost:

              Small    Large
Salami         8.99    10.99
Hawaii         9.49    11.49
Margherita     7.99     9.99

For screen reader users to understand, e.g. the cell with “11.49”, they would need to have read out the pizza (row header = Hawaii) and the size (column header = Large). Just the one or the other is not enough to understand what the content of a cell means.

Most simple text markup languages allow a header cell anywhere in a table with a quite simple syntax, e.g.

  • Creole: |=
  • txt2tags: ||
  • DokuWiki: ^
  • Textile: |_.
  • I personally like: |# (because users already associate # with a heading in Markdown)

Colspans and rowspans are also important for accessibility. Cells that are wrongly left empty can confuse screen readers. But unless they occur in header cells, they are not a big barrier.

Depending on length and complexity, not having header cells marked up correctly can make a table very difficult or impossible to understand for screen reader users.
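
As an illustration of what a renderer would need to emit, here is a small Python sketch that turns the pizza table into HTML with both kinds of header cells. The function name `render_table` and its input shape are assumptions made for this example. The `scope` attribute on `th` is standard HTML.

```python
from html import escape


def render_table(col_headers, rows):
    """Render HTML with column headers and one row header per row.

    rows is a list of (row_header, [cell, ...]) pairs (hypothetical shape).
    """
    head = "".join(f'<th scope="col">{escape(h)}</th>' for h in col_headers)
    out = ["<table>", f"<tr><td></td>{head}</tr>"]
    for row_header, cells in rows:
        tds = "".join(f"<td>{escape(c)}</td>" for c in cells)
        out.append(f'<tr><th scope="row">{escape(row_header)}</th>{tds}</tr>')
    out.append("</table>")
    return "\n".join(out)


print(render_table(
    ["Small", "Large"],
    [("Salami", ["8.99", "10.99"]),
     ("Hawaii", ["9.49", "11.49"]),
     ("Margherita", ["7.99", "9.99"])],
))
```

With this markup a screen reader can announce both "Hawaii" and "Large" for the cell containing 11.49.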


#128

I agree that row headers are important, but marking them up may become clumsy – perhaps add a colon before the pipe:

|          | Small | Large |  
| -------- | ----- | ----- |  
|  Salami :|  8.99 | 10.99 |  
|  Hawaii :|  9.49 | 11.49 |  
|  Marge. :|  7.99 |  9.99 |  

Another observation I want to throw in here: fallback rendering for pipe tables is improved in most implementations if each line ends with a double space. The only exception is Blackfriday, which then does not render a table.

## With double space

| Column Header | Second Column |  
| ------------- | ------------- |  
| data cell     | second cell   |  
| third cell    | fourth cell   |  

## Without double space

| Column Header | Second Column |
| ------------- | ------------- |
| data cell     | second cell   |
| third cell    | fourth cell   |

#129

Could the parser automatically figure out that it is a row header based on the empty cell in the top row?


#130

It could, but that’s not always the case. Too unreliable.


#131

John, given that it’s been 3½ years since you wrote that, and since in that time GitHub and many others have adopted CommonMark, should the core be settled without further delay?


#132

I would like to say at this point, I don’t care much about what table format is supported, but rather, that any are.


#133

There is already a tables extension as part of the GitHub Flavored Markdown spec (which is a superset of CommonMark). You can use this today. If a tables extension is ever formalised as part of the CommonMark project, it would need to aim for compatibility with GFM since the GFM table syntax is already widely used and the goal of CommonMark is to be highly compatible with existing implementations.


#134

I agree that compatibility with the widely used pipe table syntax is a good idea. Here are some thoughts I wrote up a couple years ago, towards a spec for tables that is largely compatible with existing pipe tables but more flexible:

I tentatively agree with the current syntax’s decision that pipes | create cell structure, and that literal pipes need to be escaped even inside code backticks. This is a departure from the general principle that nothing needs to be escaped inside code backticks, but it conforms to the general commonmark idea that block structure is discerned prior to inline structure.

Headers should be optional. So, this should be a table:

| a | b |

This too:

|:--:|--:|
| a  | b |

I think | should be required at the beginning and end of the row. I’d like to reserve this syntax for line blocks:

| 15 Main St.
| Chicago, IL

Note that existing pipe tables allow the leading and trailing pipe to be skipped. However, this makes it harder for parsers to tell right away whether we have a table line, and there’s also the issue flagged above about line blocks. Finally, especially if headers are not required, this increases the probability that a line with a literal | will be wrongly treated as a table.
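
The row rules described so far (a leading and trailing pipe is required, and every unescaped pipe splits cells even inside backticks) might be sketched in Python like this. The name `split_row` is hypothetical, and the handling of a backslash that is itself escaped is deliberately left out.

```python
import re


def split_row(line):
    """Return the cells of a table row, or None if the line is not a row.

    Assumptions: the row must start and end with an unescaped '|',
    and '\\|' is the only escape recognised.
    """
    line = line.strip()
    if len(line) < 2 or not line.startswith("|"):
        return None
    if not line.endswith("|") or line.endswith("\\|"):
        return None
    inner = line[1:-1]
    # Split on pipes that are not preceded by a backslash.
    cells = re.split(r"(?<!\\)\|", inner)
    return [cell.strip().replace("\\|", "|") for cell in cells]


print(split_row("| a | b |"))             # ['a', 'b']
print(split_row("| `x \\| y` | c |"))     # ['`x | y`', 'c']
print(split_row("| 15 Main St."))         # None
```

Because the leading and trailing pipes are mandatory, the line-block example is rejected on its first line without any lookahead.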

Alignments should be supported, in the now-standard way:

| right | left | center |
| ---:  | :--- | :----: |
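
A minimal Python sketch of reading alignments from the cells of such a delimiter row follows. The function name `parse_alignments` is made up for this example, and the input is assumed to be already split into cells.

```python
import re

# One delimiter cell: optional colon, one or more hyphens, optional colon.
_DELIM_CELL = re.compile(r"^(:?)-+(:?)$")


def parse_alignments(cells):
    """Map delimiter cells to 'left', 'right', 'center' or None.

    Returns None if any cell is not a valid delimiter cell.
    """
    aligns = []
    for cell in cells:
        match = _DELIM_CELL.match(cell.strip())
        if match is None:
            return None
        left = match.group(1) == ":"
        right = match.group(2) == ":"
        if left and right:
            aligns.append("center")
        elif right:
            aligns.append("right")
        elif left:
            aligns.append("left")
        else:
            aligns.append(None)  # no explicit alignment
    return aligns


print(parse_alignments(["---:", ":---", ":----:"]))
# ['right', 'left', 'center']
```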

What to do if the rows have different numbers of columns? Presumably just add empty columns to all the rows. But there’s a question whether we should take the headers to determine the number of columns (and truncate body rows if needed) or just take the maximum number of cells in any row.
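
The two options can be compared with a short Python sketch. The function name `normalize_rows` and its `header` parameter are assumptions for illustration only. Passing a header makes the header decide the width, and omitting it falls back to the widest row.

```python
def normalize_rows(rows, header=None):
    """Pad or truncate rows to a common width.

    With a header, its length decides the width and longer rows are
    truncated. Without one, the widest row decides and nothing is lost.
    """
    if header is not None:
        width = len(header)
    else:
        width = max((len(row) for row in rows), default=0)
    # A negative multiplier yields an empty list, so long rows just get cut.
    return [list(row[:width]) + [""] * (width - len(row)) for row in rows]


ragged = [["a"], ["b", "c", "d"]]
print(normalize_rows(ragged))                       # [['a', '', ''], ['b', 'c', 'd']]
print(normalize_rows(ragged, header=["h1", "h2"]))  # [['a', ''], ['b', 'c']]
```

The second call shows the cost of letting the header decide: the cell `d` is silently dropped.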

That would be the minimum.

I think colspans could be supported thus:

| this spans two columns  || third column |
| this spans three columns              |||
| aaa      | bbb          |  cccc         |

This means that if you want a blank cell you need space between the ||s. This would be a departure from the way pipe tables currently work.
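
Here is a rough Python sketch of that rule, assuming rows always carry a leading and trailing pipe and ignoring escaped pipes. A zero-length cell widens the cell before it, while a cell containing only spaces stays a blank cell. The function name `parse_colspans` is hypothetical.

```python
def parse_colspans(line):
    """Return a list of (text, colspan) pairs for one table row.

    Assumptions: the line starts and ends with '|', and there are no
    escaped pipes in it.
    """
    inner = line.strip()[1:-1]
    cells = []
    for raw in inner.split("|"):
        if raw == "" and cells:
            cells[-1][1] += 1  # '||' extends the previous cell
        else:
            cells.append([raw.strip(), 1])
    return [tuple(cell) for cell in cells]


print(parse_colspans("| this spans two columns  || third column |"))
# [('this spans two columns', 2), ('third column', 1)]
print(parse_colspans("| a |  | b |"))
# [('a', 1), ('', 1), ('b', 1)]
```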

Maybe for rowspans:

| this spans two rows |  bbb |
|^                    |  ccc |

These can be combined:

| this spans two rows and two columns|| bbb |
|^                                   || ccc |
| aaa             | bbb               | ccc |

One more thing that would be attractive in table syntax is a way to include long paragraphs in cells (or even multiple block elements).

For long paragraphs, one could do something like this:

| aaa | this is really long so I just |
!     | continue down here            |
| new | row here                      |

Here the exclamation marks say: add the contents of these cells to the cells above them. One could even have a list in a table cell using this syntax:

| aaa | - item one         |
!     | - item two         |
| new | row here           |

The ! is very similar to the |, which has its good points and its bad points. Alternatively one could choose a different character with more contrast.

| aaa | - item one         |
+     | - item two         |
| new | row here           |
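
A possible reading of the continuation rule, as a Python sketch: a line that starts with the marker contributes no row of its own, and each of its non-empty cells is appended, on a new line, to the cell above. The function name `merge_continuations` and the newline joiner are assumptions. Escaped pipes and span markers are not handled.

```python
def merge_continuations(lines, marker="!"):
    """Fold continuation lines into the row above them.

    Assumptions: every line ends with '|', ordinary rows start with '|',
    and continuation lines start with `marker` instead.
    """
    rows = []
    for line in lines:
        line = line.strip()
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if line.startswith(marker) and rows:
            previous = rows[-1]
            for i, text in enumerate(cells):
                if text and i < len(previous):
                    previous[i] = previous[i] + "\n" + text if previous[i] else text
        else:
            rows.append(cells)
    return rows


table = [
    "| aaa | - item one         |",
    "+     | - item two         |",
    "| new | row here           |",
]
print(merge_continuations(table, marker="+"))
# [['aaa', '- item one\n- item two'], ['new', 'row here']]
```

Joining with a newline keeps the two list items on separate lines, so the merged cell text could then be handed to a block parser.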

Note that allowing long cell contents, and especially block level content inside cells, raises some issues about table layout. In HTML output this isn’t a problem, since browsers compute column widths that (usually) make sense for the content. But in other output formats, like latex, one must explicitly specify column widths. So one question is whether to have something in the syntax that represents relative column widths. Pandoc does this by using the lines of - under the headers, but only in cases where the cells are too long to be represented without wrapping.

EDIT: fixed initial |s in code blocks, which this forum converted to > for some reason… @codinghorror - this seems to be a bug in discourse’s processing of posts received by HTML.


#135

I’ve tried to find an implementation of it to no avail.

Joel Gerber


#136

We should be very careful not to create a Catch-22. Consider a paragraph with many code spans, each having a pipe in it. It might easily be misinterpreted as a table. Yes, you may escape the pipes to prevent that. Alas, when the text is not a table, the escapes are no longer escapes but literal \| sequences in a normal code span.

IMHO, it would be absolutely awesome if such ideas were included somewhere in the CommonMark repo, even in such a short and informal way as your previous post. E.g. something in a contrib/ or staging/ subdir which would contain proposals for future CommonMark features.

It could give CommonMark some momentum for discussing them, and it would allow PRs which incrementally change them into more formal spec-like wording, bringing them step by step to the level required for inclusion in the core spec as a new chapter. Implementations of them would also be more likely to go in the right direction instead of reinventing something way too incompatible.

The activity in the form of PRs, or discussion referring to them, would also provide a natural metric of demand for those features, allowing some prioritization.


#137

The overriding design goal for Markdown’s formatting syntax is to make it as readable as possible. The idea is that a Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions.

That’s from Gruber’s original Markdown spec, and is also quoted in the second paragraph of the CommonMark spec.

The approach I outline below has the overriding goal of readability/publishability as-is. It supports rich table functionality and is backwards compatible with GFM tables. It riffs off of the ideas of @jgm and others in this thread.

Start with the GFM tables extension. Then add the following:

Allow zero or multiple header rows. If there is no header, the header delimiter is optional (you may want to keep it for column alignment). Support column spans.

| heading 1 |          heading 2          ||
|           |  sub head a  |  sub head b   |
|-----------|--------------|---------------|
| aaa       | bbb          |  cccc         |
| spans two cols          ||  cccc         |
| aaa       | bbb          |  cccc         |
| spans three cols                       |||
| aaa       | bbb          |  cccc         |

Support row spans as well as long paragraphs and multiple block elements as cell content. Normally each text row is a table row, but the inclusion of a row delimiter signals that all rows in the table will be explicitly delimited. Notice that despite the complexity, it is easy to visually parse this as a 3 x 5 table:

| aaa | this is really long so I just   | ccc |
|     | continue down here              |     |
|.....|.................................|.....|
| aaa | bbb                             | ccc |
|.....|.................................|.....|
| this spans 2 (not 3) rows and 2 cols || ccc |
|                                      ||.....|
|                                      || ccc |
|.......................................|.....|
| aaa | bbb                             | ccc |
|.......................................|.....|
| aaa | - item one                      | ccc |
|     | - item two                      |     |

In addition to column-level default alignment specified in the header delimiter (per GFM), granular cell-level alignment can be specified in row delimiters.

|..........|:........:|..........|
|    right |  center  |  center  |
|---------:|:---------|:--------:|
|    right | left     |  center  |
|..........|..........|..........|
|    right | left     |  center  |
|:........:|.........:|:.........|
|  center  |    right | left     |
|..........|..........|..........|
|    right | left     |  center  |

If row headers are important (for accessibility as @selfthinker says above), support identifying them in the header delimiter:

|            |   Small |  Large |
|============|---------|--------|
| Salami     |    8.99 |  10.99 |
| Hawaii     |    9.49 |  11.49 |
| Margherita |    7.99 |   9.99 |

Optionally (if this is not too hard for parsers), column spans can be determined by pipe alignment. This is about as readable and as intuitive for writers as it gets:

| heading |              heading 2              |
|         |      sub head a      |  sub head b  |
|---------|----------------------|--------------|
| aaa     | this is still just   | ccc          |
|         | a single row but I   |              |
|         | talk too much        |              |
|.........|......................|..............|
| aaa     | bbb                  | ccc          |
|.........|......................|..............|
| this spans two rows and two    | ccc          |
| columns                        |..............|
|                                | ccc          |
|.........|......................|..............|
| aaa     | bbb                  | ccc          |
|.........|......................|..............|
| aaa     | - item one           | ccc          |
|         | - item two           |              |

Notice that the header clearly has two rows despite not having an explicit row delimiter because of the differing column spans of the two text rows. If this is too hard for parsers to see, we could require that an explicit row delimiter be used.


#138

@vas - some nice ideas there. Two comments:

  1. In some of pandoc’s simple, multiline, and grid table formats (all very readable), we make use of column alignment. This doesn’t create a big problem for parsing. But it does create a problem for people entering tables in text boxes, which very often (as on this forum) don’t use monospaced fonts. When you’re using a proportionally spaced font, it’s next to impossible to line things up. So it’s probably better if the proposed syntax does not rely on column alignment.

  2. Instead of using periods, it would look better to use hyphens, and use = signs for the headers. Hyphens are in the middle of the line, while periods are on the bottom, and I think this makes them a better separator. I see that this has two drawbacks, though: it loses backwards compatibility, and it requires finding some other way to mark row headers. Still, it might be worth considering.

@mity - your point about the Catch-22 is a great one. Remember, though, that on this proposal an initial and final | is always required, and a table can’t interrupt a paragraph (or so I think it would be reasonable to stipulate). So to get the scenario you describe, you’d need a paragraph that starts with an unescaped |, like:

| some `code with pipes |
| and` more `code with pipes |
|` and a final unescaped |

Here we’d get a table, but this could be fixed easily by backslash-escaping the first | in the paragraph, or by re-wrapping. (The second line can’t start a table because it would be interrupting a paragraph.) It’s also extremely unlikely that anything like this would occur in normal writing: how often do you begin a paragraph with a pipe character?
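
The stipulation can be sketched in Python as a single pass over the lines. This is only an illustration of the "cannot interrupt a paragraph" rule, under the assumption that a blank line is the only thing that ends a paragraph. The function name `table_start_lines` is hypothetical.

```python
def table_start_lines(lines):
    """Return indices of lines that start a table.

    Assumptions: a row starts and ends with an unescaped '|', and a
    table may begin only at the start of the input or after a blank line.
    """
    starts = []
    in_paragraph = False
    in_table = False
    for index, line in enumerate(lines):
        text = line.strip()
        if not text:
            in_paragraph = in_table = False
            continue
        is_row = (len(text) >= 2 and text.startswith("|")
                  and text.endswith("|") and not text.endswith("\\|"))
        if in_table and is_row:
            continue  # another row of the current table
        if is_row and not in_paragraph:
            starts.append(index)
            in_table = True
        else:
            in_table = False
            in_paragraph = True  # row-like lines here stay paragraph text
    return starts


unescaped = ["| some `code with pipes |", "| and` more `code with pipes |"]
escaped = ["\\| some `code with pipes |", "| and` more `code with pipes |"]
print(table_start_lines(unescaped))  # [0]
print(table_start_lines(escaped))    # []
```

Escaping only the first pipe is enough, since the second line can no longer open a table once a paragraph has begun.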

I agree that it might be good to put some proposals for tables on GitHub, maybe in the CommonMark repository.


#139

So it’s probably better if the proposed syntax does not rely on column alignment.

Agreed. Would it be a bad idea if the spec supported both means of determining column span? e.g,

| aaa | bbb |
| ccc ||

would be equivalent to

| aaa | bbb |
| ccc       |

Instead of using periods, it would look better to use hyphens

But only marginally better I think:

| heading |              heading 2              |
|         |      sub head a      |  sub head b  |
|---------|----------------------|--------------|
| aaa     | this is still just   | ccc          |
|         | a single row but I   |              |
|         | talk too much        |              |
|.........|......................|..............|
| aaa     | bbb                  | ccc          |
|.........|......................|..............|
| this spans two rows and two    | ccc          |
| columns                        |..............|
|                                | ccc          |
|.........|:....................:|..............|
| aaa     |      centered        | ccc          |
|.........|......................|..............|
| aaa     | - item one           | ccc          |
|         | - item two           |              |

| heading |              heading 2              |
|         |      sub head a      |  sub head b  |
|=========|======================|==============|
| aaa     | this is still just   | ccc          |
|         | a single row but I   |              |
|         | talk too much        |              |
|---------|----------------------|--------------|
| aaa     | bbb                  | ccc          |
|---------|----------------------|--------------|
| this spans two rows and two    | ccc          |
| columns                        |--------------|
|                                | ccc          |
|---------|:--------------------:|--------------|
| aaa     |      centered        | ccc          |
|---------|----------------------|--------------|
| aaa     | - item one           | ccc          |
|         | - item two           |              |

Options:

  1. - for headers, . for rows, because GFM backwards compatibility is worth more
  2. = for headers, - for rows, because it’s more readable. # or + can be used to mark row headers.
  3. support both, with the occurrence of a = delimiter row changing the meaning of a - delimiter row.

Option 3 might be too complex for users. Not so much worried about parsers.

Also I think @chrisalley might be right above when he says cell alignment is a presentation thing, in which case I’d remove the column alignment stuff from my proposal. It would certainly simplify it.


#140

Not often, of course. However, your counter-example silently assumes that a table cannot interrupt a paragraph the way, e.g., lists can.


#141

Not silently – I say that explicitly: “a table can’t interrupt a paragraph (or so I think it would be reasonable to stipulate)”. I don’t think this is in the crude syntax description I provided, but it should be, for exactly this reason.


#142

@jgm Sorry, overlooked that.