Intended fields of application for CommonMark/Markdown

Mike · October 1, 2014, 12:18pm

Hi guys,

first a short disclaimer. I’ve been following the discussions here for only a few days and I have limited experience with Markdown myself; I just use it as a file format for rich-text documents. So, please bear with me.

I would like to get a grasp of where the community sees CommonMark/Markdown in the future.

As far as I understand, it was developed as a simple way of writing blog posts. But its use expanded over time, and now we have complete publishing solutions for various media based on Markdown. Obviously this brings new needs and new user groups.

What user groups should write Markdown in your opinion?

Just web or software developers who know other programming and markup languages and are happy to have a simpler way to type their forum or blog posts?
Would you aim for a general wiki markup, and have tech savvy users?
Shall it be simple and forgiving enough to serve even casual users?

From some topics I get the impression that it should also be strong enough to handle nearly every formatting and embedding task (without HTML fallback), which of course imposes constraints on simplicity.

Which are the fields of application you’re aiming for?

Blog and forum posts
Web site creation
Publishing (ebook, print)
Common writing tasks, like word processors

It would be very nice if you could show me where you are heading with CommonMark/Markdown.

Regards
Mike

mofosyne · October 1, 2014, 1:30pm

I would like to see this adopted more widely in places like government, and by office workers.

I imagine productivity would increase if people did not have to worry about how pretty the layout is (except at the final stage of publishing).

In general I would like to see a “core syntax” with “profiles/flavors” declared by “document declarations” that can be done either as a sitewide declaration (like blog or forum post), or per document (reports, emails).

Different profiles/flavors will share the same ‘core syntax’, but may have extended syntax that conflict with each other. This is acceptable, since I think it’s impossible to have a single syntax to suit all use cases without too many edge cases.

Lite Mark: Compact, stripped-down syntax. Simplest to implement. (Use case: IRC chat) (Singular)
Core Mark: Has the most common syntax (is a superset of Lite CommonMark) (Singular)
Flavors: Have extended syntax that may conflict with other flavors, but works best for each use case. (Each is a superset of Core CommonMark) (Multiple)
- Each flavor can declare what layouts it supports.

In the document declaration, you could declare specifically which flavor to use. If you do not declare a flavor but do declare a layout, e.g.

| !CommonMark: 0.1.23-github.username.projectname
| title: Title for the top bar of any browser
| layout: report

then a flavor that can handle the layout is chosen; otherwise it reverts to pure ‘core commonmark’.

If there is no document declaration, then ‘core commonmark’ is used by default (unless specified otherwise, e.g. a math flavor enabled for all posts on a math website).
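The selection rule described above could be sketched roughly like this in Python. Everything in it is hypothetical and only meant as illustration: the `Flavor` class, the registry contents, the optional `flavor:` key, and the simple `| key: value` header parsing are assumptions, not part of any CommonMark proposal.

```python
from dataclasses import dataclass, field

CORE = "core"  # stands in for plain 'core commonmark'


@dataclass
class Flavor:
    name: str
    layouts: set = field(default_factory=set)  # layouts this flavor declares support for


# Hypothetical registry; the names and layouts are made up for this example.
REGISTRY = [
    Flavor("academic", {"report", "paper"}),
    Flavor("filmscript", {"screenplay"}),
]


def parse_declaration(text):
    """Read leading '| key: value' lines into a dict; stop at the first other line."""
    fields = {}
    for line in text.splitlines():
        if not line.startswith("|"):
            break
        key, sep, value = line[1:].partition(":")
        if sep:
            fields[key.strip().lstrip("!").lower()] = value.strip()
    return fields


def choose_flavor(text, site_default=CORE):
    """Pick a flavor name following the fallback order described in the post."""
    fields = parse_declaration(text)
    if not fields:
        # No document declaration: use the sitewide default.
        return site_default
    if "flavor" in fields:
        # An explicitly declared flavor wins (the 'flavor' key is an assumption).
        return fields["flavor"]
    layout = fields.get("layout")
    for flavor in REGISTRY:
        if layout in flavor.layouts:
            return flavor.name
    # Declaration present, but no flavor handles the layout: revert to core.
    return CORE
```

With the example declaration above (`layout: report`), `choose_flavor` would pick the made-up `academic` flavor, while a document without any declaration falls back to the site default.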

Mike wrote: “From some topics I get the impression that it should also be strong enough to handle nearly every formatting and embedding task (without HTML fallback), which of course imposes constraints on simplicity.”

Any general ‘embedding task’ will be handled by the generic directive syntax !extensionName[](){}. E.g. !youtube[ rick roll ]( https://www.youtube.com/watch?v=dQw4w9WgXcQ )
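To make the directive idea a bit more concrete, here is a minimal sketch of how such a construct might be recognised with a regular expression in Python. The group names (`name`, `label`, `target`, `attrs`), the optional `{}` part, and the decision not to handle nested brackets are all assumptions for illustration; the actual directive syntax was still under discussion.

```python
import re

# !name[label](target){attrs} -- the {attrs} part is treated as optional here.
DIRECTIVE = re.compile(
    r"!(?P<name>[A-Za-z][\w-]*)"   # extension name directly after the '!'
    r"\[(?P<label>[^\]]*)\]"       # [label], no nested brackets
    r"\((?P<target>[^)]*)\)"       # (target), no nested parentheses
    r"(?:\{(?P<attrs>[^}]*)\})?"   # optional {attributes}
)


def find_directives(text):
    """Return one dict per directive found in text, with whitespace trimmed."""
    results = []
    for m in DIRECTIVE.finditer(text):
        results.append({
            "name": m.group("name"),
            "label": m.group("label").strip(),
            "target": m.group("target").strip(),
            "attrs": (m.group("attrs") or "").strip(),
        })
    return results
```

Because the name must start with a letter, an ordinary image such as `![alt](url)` is not picked up as a directive by this pattern.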

Extended syntax should be restricted to flavors aimed at different fields like “filmscript”, “report”, “academic”, “blog”, etc. This is because of conflicting edge cases between the needs of different fields.

tl;dr: I would like to see it used everywhere that speed matters more than flexibility. E.g. for typesetting a report to publish as a book, you want to do it in LaTeX for maximum flexibility at the cost of time. Draft writing requires higher speed at the cost of lower flexibility.

HansBKK · October 1, 2014, 4:51pm

See the various syntaxes (sp?) listed here:

http://johnmacfarlane.net/pandoc/README.html

and for each one, ask “what is this used for?” accepting multiple answers.

Compile into a single list, and there you go!

Mike · October 1, 2014, 6:22pm

Thank you.

Considering your conclusion:
Before I would publish a report, I would spend quite some time drafting, and at some stage the drafts are ready and should be published. So I guess you cannot divide these two tasks; rather, one has to have a simple way to switch between the two, otherwise one would have to copy or re-edit the text, which is cumbersome.

Mike · October 1, 2014, 6:24pm

So you are saying it should do everything all the others can do, and with the same ease as specialized formats.

That might be the ultimate wish, but is that reasonable or possible?

mofosyne · October 2, 2014, 2:32am

Well, considering that you can convert CommonMark --> HTML, and HTML is a higher-level markup language, I really don’t see the problem.

The main issue is really maintainability of the code base, which is why we should define an Abstract Syntax Tree. That way we can compile all CommonMark to an AST, and then create multiple renderers that use that same AST to up-convert it to a higher-level markup language.

CommonMark text --(parsed)--> LightMark AST

LightMark AST --(rendered)--> HTML, PDF, Word, etc.

And we should really encourage other lightweight markup languages, like the upcoming Z.M.L, to compile to a common AST, so we don’t duplicate the AST-to-HTML process (and can focus more on the actual parsing process instead).
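As a rough illustration of the parse-once, render-many idea, here is a toy sketch in Python. It is not a CommonMark parser: it only understands `#` headings and blank-line-separated paragraphs, and `Node`, `parse`, `render_html`, and `render_text` are made-up names for this example.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Node:
    kind: str       # "heading" or "paragraph"
    text: str
    level: int = 0  # heading level; 0 for paragraphs


def parse(source):
    """Toy parser: '#' headings and blank-line-separated paragraphs only."""
    nodes = []
    for block in source.strip().split("\n\n"):
        block = " ".join(block.split())  # collapse line breaks and extra spaces
        if not block:
            continue
        hashes = len(block) - len(block.lstrip("#"))
        if 1 <= hashes <= 6 and block[hashes:hashes + 1] == " ":
            nodes.append(Node("heading", block[hashes:].strip(), hashes))
        else:
            nodes.append(Node("paragraph", block))
    return nodes


def render_html(nodes):
    """One renderer: the shared node list to HTML."""
    out = []
    for n in nodes:
        tag = f"h{n.level}" if n.kind == "heading" else "p"
        out.append(f"<{tag}>{escape(n.text)}</{tag}>")
    return "\n".join(out)


def render_text(nodes):
    """Another renderer: the same node list to plain text, headings in capitals."""
    out = []
    for n in nodes:
        out.append(n.text.upper() if n.kind == "heading" else n.text)
    return "\n\n".join(out)
```

Both renderers consume the same node list, so adding another output format would mean writing one more `render_*` function without touching the parser.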

jgm · October 2, 2014, 5:15pm

The implementations already parse to an AST. I think what you have in mind is a common AST serialization format that can be used as a medium of exchange between different tools. I agree that this makes a lot of sense, and will work on it.
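To illustrate what a serialized AST used as an exchange medium might look like, here is a small hedged sketch in Python using JSON. The envelope fields (`format`, `version`, `children`), the `example-ast` label, and the node shapes are invented for this example and do not describe any actual CommonMark or tool format.

```python
import json

FORMAT_NAME = "example-ast"  # hypothetical format label, not a real standard


def to_json(ast):
    """Wrap a list of node dicts in a small envelope and serialize it."""
    envelope = {"format": FORMAT_NAME, "version": 1, "children": ast}
    return json.dumps(envelope, sort_keys=True)


def from_json(payload):
    """Parse the envelope back and return the list of node dicts."""
    doc = json.loads(payload)
    if doc.get("format") != FORMAT_NAME:
        raise ValueError("not an example-ast document")
    return doc["children"]


# A tiny AST as one tool might emit it and another tool might read it.
ast = [
    {"type": "heading", "level": 1, "text": "Title"},
    {"type": "paragraph", "text": "Hello"},
]
```

A tool that only knows this JSON shape could consume the output of any parser that emits it, which is the exchange role described above.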

mofosyne · October 3, 2014, 7:25am

That’s exactly what I was thinking: a common serialized format. Incidentally, if designed well, it will help prevent bitrot due to parsers becoming outdated in, say, 20 to 40 years’ time (by translating it into new formats). This is especially important as lightweight markup languages become more complex.

Mike · October 3, 2014, 9:07pm

While an AST is a very good thing from a technical point of view, it won’t help the users. They write the source documents.

In a way, when you asked me (I think it was you; if not, I’m sorry) to add a Markdown view to my WYSIWYG editor, it was very similar: seeing the Markdown as the technical specification of the rich text. Developers like to see the technical aspects. Ordinary people (if they are to be the target audience) don’t care about the format, whether it is Markdown or an unreadable AST; they care about what they have to do to get their job done. And the people you can convince to write Markdown don’t care either, as long as they get tools that help them get the job done.

Having different renderers will be part of the solution, but how would the source look? Even with some kind of style sheet used only for the production pass, one would need to specify the applicable style in the text source, and that might clutter the source if not done carefully.

My only point is that it is very hard to have a source format that can adapt to several needs. And that is part of my question: what should CommonMark/Markdown be in the end? History seems to have shown that if it tries to be everything, it will be nothing.

bowerbird · October 3, 2014, 9:19pm

call me crazy if you like, i don’t mind, but…

there’s no need to plan for “20 to 40 years”.

by then, there will be no need for “markup”,
“markdown”, or “mark-all-around-the-town”.

it’s not that difficult even now to "figure out"
fairly unstructured text, so once we humans
make slightly smarter programs and accept
the responsibility to write in a structured way,
leaving no room for ambiguous interpretation
– which really isn’t as hard as you imagine –
we’ll be able to leave explicit markup behind.

it will be “zen”.

i say this because i was able to ascertain
the structure of project gutenberg e-texts
without too much difficulty in most cases.

likewise, you can scan a print-book and
do o.c.r. on the scans, and get output
which also reveals the structure of the
underlying text in a straightforward way.

if you want to see that on a large scale,
examine google books, which offers stuff
which it ascertained, like header markup
and a respective linked table of contents.

that stuff certainly wasn’t “marked up” in
the print-book, but it just isn’t that hard to
"figure out" either, from the data available,
if you ponder it a while, and apply yourself.

in fact, i wouldn’t be the least bit surprised
if google can already grok unstructured text,
because they have the firepower to solve it.

heck, i’m merely a garage hacker, and i have
achieved much of the solution all by myself.

this “light-markup” stage is just a little path,
meant to show us the way to a bright future.

-bowerbird

Mike · October 6, 2014, 11:05am

Actually I’m a bit irritated right now.

Apart from the answer that it should be everything, there has been no real response about what the people active on this project intend with their work.

Of course, if that really is the direction for CommonMark, that is fine, but I honestly would have expected something different, given other specification and planning efforts and the distinguished projects supporting CommonMark.

Concluding:

Current spec: Remove the ambiguities from Gruber’s Markdown.
Future roadmap: Do what all competing projects are doing, just standardized and better ;-). (Side question: Is it even clear which projects are considered competing?)

Still, I’m glad I asked; the answers told me a lot.

bowerbird · October 6, 2014, 6:50pm

mike, mike, mike, don’t be irritated. it’s not worth it.

but i think your conclusion is right on.

the current version of commonmark has the goal of
removing the ambiguities from gruber’s “specification”,
and suggesting some resolution to the inconsistencies
which have arisen from all the various implementations.

the future roadmap is up for grabs, so in a real sense
it is indeed “do everything”. but that will soon end up
in a messy pile, and someone will have to sort it out,
and then the priorities will surface as to what it will be.

still, there are a number of features that’ll be needed
by “enough” users that the mess is still down the line.
nobody’s gonna argue if you put in tables, or footnotes.

so i don’t think that’s the problem facing commonmark.

instead, i would suggest that “fixing” gruber-markdown
with “yet another flavor” won’t do people any real good,
since gruber-markdown is so primitive that nobody has
a good reason to use it – or, therefore, commonmark –
so this project won’t get adoption that challenges any of
the powerful flavors out there, such as multimarkdown.

meanwhile, i don’t see much reason for powerful flavors
to see commonmark as anything more than a potential
competitor, and thus to decide to not support it now,
in its youth, as a way to stall it so that it won’t succeed.

to put all of this another way, everybody agrees that it
will be good to “standardize” gruber-markdown, but it
won’t make many people happy if that’s all you do.

they left gruber-markdown long ago, for more power.

but once you try to “extend” markdown, you’re gonna
run into established flavors, with an existing user-base
and backwards-compatibility to protect, and since you
have no user-base already using your extended features,
you have no way to compete against those other flavors
who, for the most part, always had “gruber-compatibility”.

maybe i’m wrong.

if anyone thinks i’m wrong, please feel free to say why.

but that’s how i see it.

-bowerbird

HansBKK · October 13, 2014, 12:41am

I disagree with that everything/nothing generalization.

And I don’t see the md ecosystem projects as competing with each other, nor in most cases the various md syntax flavors that have evolved over the years.

A given program may have a specific purpose, but dozens, even hundreds, of such projects can share a core syntax. When a program requires additional syntax for some ‘outlier’ function, then a well-defined extension to the core’s definition can accommodate that need.

And a toolset like Pandoc has proven able to reliably convert between the various flavors, so as the various tools migrate to the canonical syntax, it should usually be straightforward for past doc collections to be migrated over.