One of my common gripes about CSV is its limited ability to distinguish a “header row” from a “data row”.
If we ever come to an agreement on a simple enough commonmark table parser (or maybe even before), would it also be a good idea to promote a new “PSV” format, in which a file may look like, say… a github style markdown table?
It doesn’t have to be a pipe format, but it should ideally be a format that is likely to be used by commonmark. The idea is that it would be included as a possible input format for python, matlab, excel, and any other spreadsheet software as a competing alternative to csv.
# Spreadsheet Title
Spreadsheet comments
## Worksheet Name
Sheet comments
| Tables | Are | Cool |
|----------|---------------|-------|
| col 1 is | ok | $1600 |
| col 2 is | cool | $12 |
| col 3 is | normal | $1 |
(The terms “spreadsheet” and “worksheet” are taken from Excel, as sourced from here.)
Yeah, I know there is tab separation, but tabs are difficult to distinguish from spaces. Also, csv files do not make headers visually obvious.
Plus with smart design of the parser (proposed in halfbakery below), you could also have multiple sheets (along with sheet comments). So it becomes almost akin to an ASCII readable spreadsheet. Of course you do want to keep the base parser as basic as possible however, to encourage adoption.
At its simplest, the most basic parser should be able to parse the table, locate the header, and gracefully ignore all other comments and tables.
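To make the "most basic parser" idea concrete, here is a minimal sketch in Python. It assumes the header is simply the first pipe row that is directly followed by a divider row, and that the first table ends at the first non-table line. The names `is_divider`, `split_row` and `parse_first_table` are made up for illustration and are not part of any existing library or spec.

```python
def is_divider(line):
    """True for rows like |-----|-----| or |=====|=====|."""
    line = line.strip()
    return (
        line.startswith("|")
        and set(line) <= set("|-=: ")
        and ("-" in line or "=" in line)
    )


def split_row(line):
    """Split '| a | b |' into ['a', 'b'] (no quoting support in this sketch)."""
    line = line.strip()
    if line.startswith("|"):
        line = line[1:]
    if line.endswith("|"):
        line = line[:-1]
    return [cell.strip() for cell in line.split("|")]


def parse_first_table(text):
    """Return (header, rows) for the first table and ignore everything else."""
    header, rows = None, []
    lines = [ln.strip() for ln in text.splitlines()]
    for i, line in enumerate(lines):
        if header is None:
            # Assumption: the header is a pipe row immediately followed by a divider.
            nxt = lines[i + 1] if i + 1 < len(lines) else ""
            if line.startswith("|") and not is_divider(line) and is_divider(nxt):
                header = split_row(line)
        elif is_divider(line):
            continue
        elif line.startswith("|"):
            rows.append(split_row(line))
        else:
            break  # the first non-table line ends the table
    return header, rows
```

Titles, comments and any later tables are skipped without error, which is the "graceful" part. Everything fancier (multiple sheets, multirow headers) would be layered on top of this.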
If this is a good idea and there is enough support, we could spread the concept via a website (like json did), which would provide a simple visual explanation of how to construct such a parser. And if it becomes popular enough that people write parsers for it, we could also include libs for popular languages like python etc…
Header Handling:
Standard Single Header
| Tables | Are | Cool |
|----------|---------------|-------|
| col 1 is | ok | $1600 |
| col 2 is | cool | $12 |
Multirow Header
|==========|===============|=======|
| Tables | Are | Cool |
| Tables | Are | Cool |
|==========|===============|=======|
| col 1 is | ok | $1600 |
| col 2 is | cool | $12 |
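A rough Python sketch of how a parser might read the multirow header form: every pipe row between the first and second = divider is a header row, and everything after the second divider is data. The function names are hypothetical, and the sketch assumes the table contains only = dividers.

```python
def is_eq_divider(line):
    """True for rows like |=====|=====| made only of pipes, '=' and spaces."""
    line = line.strip()
    return line.startswith("|") and "=" in line and set(line) <= set("|= ")


def split_row(line):
    """Split '| a | b |' into ['a', 'b'] (no quoting support in this sketch)."""
    line = line.strip()
    if line.startswith("|"):
        line = line[1:]
    if line.endswith("|"):
        line = line[:-1]
    return [cell.strip() for cell in line.split("|")]


def parse_multirow_header(lines):
    """Return (header_rows, data_rows) for a table opened by an '=' divider."""
    header_rows, data_rows = [], []
    dividers_seen = 0
    for line in lines:
        if is_eq_divider(line):
            dividers_seen += 1
        elif not line.strip().startswith("|"):
            continue  # comments and other non-table lines are ignored
        elif dividers_seen == 1:
            header_rows.append(split_row(line))
        elif dividers_seen >= 2:
            data_rows.append(split_row(line))
    return header_rows, data_rows
```

How the header rows are then combined (for example into nested column names) is left open here, since the proposal does not say.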
Multicell Header
|==========|===============|=======|
| Tables | Are | Cool |
: Tables : Are : Cool :
|==========|===============|=======|
| col 1 is | ok | $1600 |
| col 2 is | cool | $12 |
Multicell rows
| Tables | Are | Cool |
|----------|---------------|-------|
| col 1 is | * apples | $1600 |
: : * oranges : :
| col 2 is | cool | $12 |
: : or hot : :
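As a sketch of how multicell continuation could be handled in Python: a line starting with : is split on : and each non-empty cell is appended to the matching cell of the previous row. Joining the pieces with a newline is my assumption, as are the names `split_on` and `merge_continuations`. The function expects only the rows below the divider.

```python
def split_on(line, sep):
    """Split a row on `sep`, dropping the leading and trailing separator."""
    line = line.strip()
    if line.startswith(sep):
        line = line[1:]
    if line.endswith(sep):
        line = line[:-1]
    return [cell.strip() for cell in line.split(sep)]


def merge_continuations(lines):
    """Fold ':' continuation lines into the pipe row that precedes them."""
    rows = []
    for line in lines:
        line = line.strip()
        if line.startswith("|"):
            rows.append(split_on(line, "|"))
        elif line.startswith(":") and rows:
            previous = rows[-1]
            for i, extra in enumerate(split_on(line, ":")):
                if extra and i < len(previous):
                    # Assumption: continuation text is joined with a newline.
                    previous[i] = f"{previous[i]}\n{extra}" if previous[i] else extra
    return rows
```

With the example above, the second cell of the first row becomes "* apples" and "* oranges" on two lines, and the second cell of the second row becomes "cool" and "or hot".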
Non data field?
Not sure how to deal with this, but maybe = is for the header divider, and - is for the non data field below?
|==========|===============|=======|
| Tables | Are | Cool |
|==========|===============|=======|
| col 1 is | ok | $1600 |
| col 2 is | cool | $12 |
|----------|---------------|-------|
|| TOTAL: | $1612 |
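If the = / - split were adopted, a parser could sort rows into three buckets. The Python sketch below assumes that rows between the two = dividers are header rows, rows after the second = divider are data, and rows after a later - divider are non data rows. `split_sections` is a hypothetical name, and this is only one possible reading of the idea.

```python
def split_row(line):
    """Split '| a | b |' into ['a', 'b'], keeping empty cells."""
    line = line.strip()
    if line.startswith("|"):
        line = line[1:]
    if line.endswith("|"):
        line = line[:-1]
    return [cell.strip() for cell in line.split("|")]


def split_sections(lines):
    """Return (header, data, non_data) row lists for an '='-delimited table."""
    header, data, non_data = [], [], []
    eq_seen, in_footer = 0, False
    for line in lines:
        line = line.strip()
        if not line.startswith("|"):
            continue  # ignore anything that is not a table row
        body = set(line) - set("| ")
        if body == {"="}:
            eq_seen += 1
        elif body == {"-"} and eq_seen >= 2:
            in_footer = True  # rows after a '-' divider are non data rows
        elif eq_seen == 1:
            header.append(split_row(line))
        elif eq_seen >= 2:
            (non_data if in_footer else data).append(split_row(line))
    return header, data, non_data
```

A consumer that only wants the data can then drop the third list entirely, so a TOTAL row never gets mixed into the values.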
But then how do you declare the logic, e.g. SUM(A3:E3) in excel? Or should that be considered beyond the scope of PSV?
What if we also used ! ?
|==========|===============|=======|
| Tables | Are | Cool |
|==========|===============|=======|
| col 1 is | ok | $1600 |
| col 2 is | cool | $12 |
|----------|---------------|-------|
|| TOTAL: | $1612 |
!! ! =SUM(col:"Cool") !
Hmmm… maybe out of scope of this. Seems rather ugly to try and shove in.
Value Type detection
Supports all datatypes that json can handle, but also records a type guess, e.g. $50 is seen as {"value":50,"type":"currency"}. The type can either be autodetected from the first data field or declared in the header field, e.g.
|=======|===============|========|
| ID | Comment | Cool |
: (key) : : ($USD) :
|=======|===============|========|
| 1 | ok | $1600 |
| 2 | cool | $12 |
Other value declarations could look like (SI:cm^2), which says SI units of centimetres squared, etc…
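A small Python sketch of what the type guess and the header declaration could look like. Only the "currency" type name comes from the example above. The other type names ("number", "boolean", "null", "string") and the function names `guess_value` and `declared_type` are assumptions made for illustration.

```python
import re


def guess_value(cell):
    """Return {'value': ..., 'type': ...} as a guess for one cell's text."""
    text = cell.strip()
    if re.fullmatch(r"\$\d+(\.\d+)?", text):
        number = float(text[1:])
        value = int(number) if number.is_integer() else number
        return {"value": value, "type": "currency"}
    if re.fullmatch(r"-?\d+", text):
        return {"value": int(text), "type": "number"}
    if re.fullmatch(r"-?\d+\.\d+", text):
        return {"value": float(text), "type": "number"}
    if text in ("true", "false"):
        return {"value": text == "true", "type": "boolean"}
    if text in ("", "null"):
        return {"value": None, "type": "null"}
    return {"value": text, "type": "string"}


def declared_type(cell):
    """Return the text inside '(...)' from a header continuation cell, else None."""
    match = re.fullmatch(r"\((.+)\)", cell.strip())
    return match.group(1) if match else None
```

So `guess_value("$50")` gives the `{"value": 50, "type": "currency"}` shape from the example, and `declared_type("($USD)")` pulls out `$USD` from the header continuation line. What a parser does with a declaration like `SI:cm^2` is left open.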
Difference from typical markdown tables
- : alignment characters are ignored. This is a data format like csv.
- : at the start of a line while in table mode means multicell continuation.
- " can be used to include strings with | or : safely.
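The quoting rule could be sketched in Python as a character-by-character splitter that ignores the separator while inside double quotes. `split_cells` is a hypothetical helper. This sketch drops the quote characters and has no escape for a literal quote, which the proposal does not define.

```python
def split_cells(line, sep="|"):
    """Split a row on `sep`, treating text inside double quotes as literal."""
    cells, current, in_quotes = [], [], False
    for ch in line.strip():
        if ch == '"':
            in_quotes = not in_quotes  # quotes toggle literal mode and are dropped
        elif ch == sep and not in_quotes:
            cells.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    cells.append("".join(current).strip())
    # A row starts and ends with the separator, so drop the empty edge cells.
    return cells[1:-1]
```

The same function works for continuation lines by passing `sep=":"`, so a cell such as "10:30" can survive inside a : row.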
Originally made the proposal in halfbakery:
edit: Updated with these considerations. Side Thoughts: Promoting pipe tables as a potential alternative to .CSV format (e.g. .PSV ?)