Made a first stab at a C program that converts markdown tables to JSON - psv.c

A while ago, I was bikeshedding over the idea of a pipe-separated value ‘psv’ file format, which is essentially just a markdown table. I think I've figured out at least a reasonable format now. Hopefully this will inform what the ‘psv’ standard should look like in the psv-format/psv-format.github.io repository on GitHub (PSV spec, with reference implementations in C and JavaScript). Participants are welcome to join (I will add you to the org)!

Basically, there is a consistent attribute syntax placed above the table as an optional way to give it an ID. If it is not present, the tables are simply called table1, table2, table3 and so on. This also makes it easier to locate a particular table of interest in the middle of a normal CommonMark document.
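
As a rough illustration of that ID rule, here is a minimal C sketch. It is not taken from psv.c: the function name `table_id_from_line` is hypothetical, and numbering the fallback by the table's position in the document is my assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from psv.c): derive a table ID from an optional
 * attribute line such as "{#test2}". If the line is not an attribute block
 * or carries no "#id" token, fall back to "table<N>", where N is assumed to
 * be the table's 1-based position in the document. */
static void table_id_from_line(const char *line, int table_number,
                               char *out, size_t out_size)
{
    const char *p = line + strspn(line, " \t"); /* skip leading blanks */

    if (*p == '{') {
        const char *close = strchr(p, '}');
        const char *hash = strchr(p, '#');

        if (hash && close && hash < close) {
            /* The ID runs until whitespace or the closing brace. */
            size_t len = strcspn(hash + 1, " \t}");
            if (len > 0 && len < out_size) {
                memcpy(out, hash + 1, len);
                out[len] = '\0';
                return;
            }
        }
    }
    snprintf(out, out_size, "table%d", table_number);
}
```

Under these assumptions, `{#test2}` gives `test2`, while a line that is not an attribute block (for example the header row of the first table) gives `table1`.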

So how does it look at the moment? I’ve written it as a very bare-bones MVP, so it doesn’t have any ‘quote’ or ‘escaping’ capability yet. But it is at least able to recognize numbers and booleans.
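
To make the number and boolean recognition a bit more concrete, here is a small C sketch of how a trimmed cell could be classified. The names `cell_type` and `classify_cell` are hypothetical and this is not the logic in psv.c; I am assuming that only `true` and `false` count as booleans and that numbers follow the JSON number grammar, so they can be printed into the output unchanged.

```c
#include <ctype.h>
#include <string.h>

typedef enum { CELL_STRING, CELL_NUMBER, CELL_BOOL } cell_type;

/* Returns 1 if s matches the JSON number grammar:
 * optional '-', integer part without leading zeros, optional fraction,
 * optional exponent. */
static int is_json_number(const char *s)
{
    if (*s == '-') s++;
    if (!isdigit((unsigned char)*s)) return 0;
    if (s[0] == '0' && isdigit((unsigned char)s[1])) return 0; /* e.g. "007" */
    while (isdigit((unsigned char)*s)) s++;
    if (*s == '.') {
        s++;
        if (!isdigit((unsigned char)*s)) return 0;
        while (isdigit((unsigned char)*s)) s++;
    }
    if (*s == 'e' || *s == 'E') {
        s++;
        if (*s == '+' || *s == '-') s++;
        if (!isdigit((unsigned char)*s)) return 0;
        while (isdigit((unsigned char)*s)) s++;
    }
    return *s == '\0';
}

/* Hypothetical classifier: decide how an already-trimmed cell would be
 * emitted in JSON. Anything that is not a boolean or a number is a string. */
static cell_type classify_cell(const char *cell)
{
    if (strcmp(cell, "true") == 0 || strcmp(cell, "false") == 0)
        return CELL_BOOL;
    if (is_json_number(cell))
        return CELL_NUMBER;
    return CELL_STRING;
}
```

With this, `25` and `32.4` would be emitted as numbers, while `New York` and a date such as `2024-04-23` would stay strings.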

I also decided to make it similar to ‘jq’ in that it primarily works on the standard streams, so you can pipe in a markdown document and get JSON out that can in turn be piped into jq. I think this is a good approach, since more people are likely to already know how to use jq, and it keeps the complexity of this MVP down.

Overall, what are the possible applications of this? Maybe you have a ‘default settings page’ that you want to read from a program. This could help keep the source of truth in one location.

Anyway, this is what I've got! At least it’s something now! I won’t be working on this too much until I get extra motivation or feedback, but hopefully this will get the ball rolling on whether this is a good idea or not.

make && ./psv << 'HEREDOC'
| Name    | Age | City         |
| ------- | --- | ------------ |
| Alice   | 25  | New York     |
| Bob     | 32.4  | San Francisco|
| Bob     | 32  | 
| Charlie | 19  | London       |

{#test2}
| Name    | Age | City         |
| ------- | --- | ------------ |
| Alice   | 25  | New York     |
| Bob     | 32  | San Francisco|
| Bob     | 32  | Melbourne    |
| Charlie | 19  | London       |
HEREDOC

The above command would give output that may look like the following:

[
    {
        "id": "table1",
        "headers": ["Name", "Age", "City"],
        "rows": [
            {"Name": "Alice", "Age": 25, "City": "New York"},
            {"Name": "Bob", "Age": 32.4, "City": "San Francisco"},
            {"Name": "Bob", "Age": 32},
            {"Name": "Charlie", "Age": 19, "City": "London"}
        ]
    }
    ,
    {
        "id": "test2",
        "headers": ["Name", "Age", "City"],
        "rows": [
            {"Name": "Alice", "Age": 25, "City": "New York"},
            {"Name": "Bob", "Age": 32, "City": "San Francisco"},
            {"Name": "Bob", "Age": 32, "City": "Melbourne"},
            {"Name": "Charlie", "Age": 19, "City": "London"}
        ]
    }
]
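
In the output above, the short third row (`| Bob | 32 |` with no City cell) comes out as an object without a "City" key. One way that behaviour could fall out of the row splitting is sketched below in C. This is my assumption rather than a copy of psv.c, and `split_row` is a hypothetical name: a cell only counts if it is closed by a pipe, so trailing text or whitespace after the last pipe is dropped.

```c
#include <string.h>

/* Hypothetical row splitter (not from psv.c). Splits a row such as
 * "| Alice | 25 | New York |" in place into trimmed cells and returns the
 * number of cells found. There is no quoting or escaping, matching the MVP.
 * A cell is only counted when it is terminated by '|'. */
static size_t split_row(char *line, char **cells, size_t max_cells)
{
    size_t count = 0;
    char *p = line + strspn(line, " \t");

    if (*p != '|')
        return 0;               /* not a table row */
    p++;

    while (count < max_cells) {
        char *bar = strchr(p, '|');
        if (!bar)
            break;              /* nothing closed by a pipe is left */
        *bar = '\0';

        /* Trim blanks on both sides of the cell. */
        p += strspn(p, " \t");
        char *end = p + strlen(p);
        while (end > p && (end[-1] == ' ' || end[-1] == '\t'))
            end--;
        *end = '\0';

        cells[count++] = p;
        p = bar + 1;
    }
    return count;
}
```

A caller would then pair `cells[i]` with `headers[i]` for each `i` below the returned count, so a row with only two cells simply produces two key/value pairs.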

There is also a compact mode:

make && ./psv -c << 'HEREDOC'
| Name    | Age | City         |
| ------- | --- | ------------ |
| Alice   | 25  | New York     |
| Bob     | 32.4  | San Francisco|
| Bob     | 32  | 
| Charlie | 19  | London       |

| Name    | Age | City         |
| ------- | --- | ------------ |
| Alice   | 25  | New York     |
| Bob     | 32  | San Francisco|
| Bob     | 32  | Melbourne    |
| Charlie | 19  | London       |
HEREDOC

which has a more compact representation:

[
   [
       {"Name": "Alice", "Age": 25, "City": "New York"},
       {"Name": "Bob", "Age": 32.4, "City": "San Francisco"},
       {"Name": "Bob", "Age": 32},
       {"Name": "Charlie", "Age": 19, "City": "London"}
   ]
   ,
   [
       {"Name": "Alice", "Age": 25, "City": "New York"},
       {"Name": "Bob", "Age": 32, "City": "San Francisco"},
       {"Name": "Bob", "Age": 32, "City": "Melbourne"},
       {"Name": "Charlie", "Age": 19, "City": "London"}
   ]
]

Finally, you can select a table by ID:

make && ./psv --id dog -c << 'HEREDOC'
{#cat}
| Name    | Age | City         |
| ------- | --- | ------------ |
| Alice   | 25  | New York     |
| Bob     | 32.4  | San Francisco|
| Bob     | 32  | 
| Charlie | 19  | London       |

{#dog}
| Name    | Age | City         |
| ------- | --- | ------------ |
| Alice   | 25  | New York     |
| Bob     | 32  | San Francisco|
| Bob     | 32  | Melbourne    |
| Charlie | 19  | London       |
HEREDOC

which would output just the table marked as dog:

   [
       {"Name": "Alice", "Age": 25, "City": "New York"},
       {"Name": "Bob", "Age": 32, "City": "San Francisco"},
       {"Name": "Bob", "Age": 32, "City": "Melbourne"},
       {"Name": "Charlie", "Age": 19, "City": "London"}
   ]

Has anything been settled on what the consistent attribute syntax would look like?

{#table1 .table .sortable data-types=[integer, string, datetime]}
| Customer ID       | Name   | Date of Purchase  |
|-------------------|--------|-------------------|
| 1                 | Alice  | 2024-04-23        |
| 2                 | Bob    | 2024-04-24        |
| 3                 | Charlie| 2024-04-25        |

I was wondering what a ‘schema’ for this might look like. This is what I've got so far as a potential idea, but it would be good to hear some opinions.

Also, if anyone is interested in this concept, feel free to ping me with your GitHub username to be added to the psv working group in the PSV File Format Group on GitHub.

Some extra thoughts on specifying data types:

{#pet}
| Name    | Age  {#age int} | City         |
| ------- | --- | ------------ |
| Alice   | 25  | New York     |
| Bob     | 32  | San Francisco|
| Bob     | 32  | Melbourne    |
| Charlie | 19  | London       |

This looks cleaner, and basically says that the Age field has an ID of “age” and is represented as an integer.
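
To get a feel for whether this per-column syntax is easy to parse, here is a rough C sketch. None of it comes from psv.c or from an agreed spec. The `column_spec` struct and `parse_header_cell` are hypothetical names, and I am assuming that `#name` sets the ID, the first bare word is the data type, and `.class` tokens are skipped for now.

```c
#include <string.h>

typedef struct {
    char label[64]; /* visible header text, e.g. "Age" */
    char id[64];    /* from "#age"; empty if absent */
    char type[32];  /* first bare word, e.g. "int"; empty if absent */
} column_spec;

/* Copy at most dst_size - 1 bytes and always NUL-terminate. */
static void copy_token(char *dst, size_t dst_size, const char *src, size_t len)
{
    if (len >= dst_size)
        len = dst_size - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Hypothetical parser for a header cell such as "Age  {#age int}". */
static void parse_header_cell(const char *cell, column_spec *spec)
{
    spec->label[0] = spec->id[0] = spec->type[0] = '\0';

    const char *open = strchr(cell, '{');
    const char *close = open ? strchr(open, '}') : NULL;

    /* The label is everything before the attribute block, trimmed. */
    const char *start = cell + strspn(cell, " \t");
    const char *end = (open && close) ? open : cell + strlen(cell);
    while (end > start && (end[-1] == ' ' || end[-1] == '\t'))
        end--;
    copy_token(spec->label, sizeof spec->label, start, (size_t)(end - start));

    if (!open || !close)
        return;

    /* Walk the whitespace-separated tokens between '{' and '}'. */
    const char *p = open + 1;
    while (p < close) {
        p += strspn(p, " \t");
        if (p >= close)
            break;
        size_t len = strcspn(p, " \t}");
        if (*p == '#')
            copy_token(spec->id, sizeof spec->id, p + 1, len - 1);
        else if (*p != '.' && spec->type[0] == '\0')
            copy_token(spec->type, sizeof spec->type, p, len);
        p += len;
    }
}
```

For the header cell `Age  {#age int}` this yields the label `Age`, the ID `age` and the type `int`; a plain cell such as `City` keeps its label and leaves the other two fields empty.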

But I will need to figure out how to distinguish between data types and semantic tags, e.g. a value that is stored as a string but interpreted as a datetime. Maybe {#creationtime string .datetime}, where the CSS class also doubles as a semantic tag?