Project Announcement and Request for Comments: Using CommonMark in new ways
I intend to extend the use of CommonMark and the usefulness of cmark
in a project of mine with two main goals:
1. Make “foreign” mark-up syntaxes available in CommonMark texts: Blocks and inline spans written in a “foreign” mark-up syntax can be used in CommonMark texts.
   Blocks of this kind either use the existing fenced code block syntax and announce the type of mark-up they contain in an info string, or they use configurable start and end lines as delimiters.
   Inline spans announce the type of mark-up they contain using a (preliminary) syntax which also includes an info string; this is interpreted in the same way as the info string on fenced code blocks.
2. Make CommonMark and (a modified) cmark usable in an XML/SGML environment: The “conventional” transformation of plain text files into structured documents (i.e. HTML/XML/XHTML) is of course retained (and can use the “foreign mark-up” extensions mentioned). In addition, it should be possible to feed XML/SGML documents into the mark-up processing which contain plain text fragments as the character data content of designated elements. The processing substitutes these “container elements” with the elements generated from the contained plain text, so that the final output is again an XML/SGML document.
The motivation for the first goal probably needs no explanation. Achieving the second goal would allow CommonMark (and a processor for it), as well as “foreign” syntaxes (and processors for them), to be used in an XML/SGML authoring process, e.g. to produce DocBook documents.
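To make the second goal more concrete, here is a minimal Python sketch of the container substitution step. It is only an illustration under assumptions: the container element name `mark-up`, the `para` output element, and the trivial `convert_plain_text` stand-in (which takes the place of a real processor such as cmark) are all hypothetical and not taken from the project.

```python
import re
import xml.etree.ElementTree as ET

CONTAINER_TAG = "mark-up"  # hypothetical name of the designated container element


def convert_plain_text(text):
    """Stand-in for a real mark-up processor: each blank-line-separated
    chunk of plain text becomes one <para> element (assumed output element)."""
    elements = []
    for chunk in re.split(r"\n\s*\n", text):
        if chunk.strip():
            para = ET.Element("para")
            para.text = " ".join(chunk.split())
            elements.append(para)
    return elements


def substitute_containers(root):
    """Replace every container element below `root` with the elements
    generated from its plain text content, keeping document order.
    Assumes containers hold character data only and are never the root."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for container in list(root.iter(CONTAINER_TAG)):
        parent = parent_of[container]
        index = list(parent).index(container)
        new_elements = convert_plain_text(container.text or "")
        if new_elements:
            # Keep any text that followed the container in the input.
            new_elements[-1].tail = container.tail
        parent.remove(container)
        for offset, element in enumerate(new_elements):
            parent.insert(index + offset, element)
    return root


if __name__ == "__main__":
    doc = ET.fromstring(
        "<article><title>Demo</title>"
        "<mark-up>First\nparagraph.\n\nSecond paragraph.</mark-up>"
        "</article>"
    )
    print(ET.tostring(substitute_containers(doc), encoding="unicode"))
```

In a real chain, the stand-in converter would be replaced by a call to the actual mark-up processor, and a validating parser could check both the input and the resulting document against a DTD.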
Some expected properties of my solution are:
- The CommonMark specification needs no change at all if the “foreign” syntax is only used in fenced code blocks.
- The CommonMark syntax is processed by (a modified) cmark or a similar Markdown processor, while each of the “foreign” syntaxes is processed by its own, specific processor, in a robust but flexible manner.
- Adding a new “foreign” syntax for use in
  - fenced code blocks with info strings,
  - “foreign mark-up blocks”, or
  - code spans with info strings

  would solely require adding one line to a configuration file, and no changes to the cmark processor.
- Input to mark-up processing can be in a variety of formats:
  - plain text files as ever,
  - well-formed XML documents (without using a DTD or XML Schema),
  - validated XML documents (using an XML parser to check the document against a DTD or XML Schema),
  - validated SGML/HTML documents (using an SGML parser to check the document against a DTD, and to parse it as a first stage of processing).
- The mark-up processing tools used in this concept (cmark, processors for “foreign” syntaxes) would not need to parse and generate XML or SGML, but a simple text format instead.
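The "one line per syntax" configuration mentioned in the list above could work roughly as in the following Python sketch. The configuration format (`name: command`), the function names, and the `my-math-filter` command are assumptions made for illustration only. The fence pattern is also deliberately simplified: it expects the closing fence to equal the opening one and ignores indentation, which is narrower than what the CommonMark specification allows.

```python
import re
import shlex

# Hypothetical configuration: one "<syntax name>: <command line>" entry per line.
EXAMPLE_CONFIG = """\
# syntax name : command that would receive the block content
dot: dot -Tsvg
math: my-math-filter --inline
"""

# Simplified fenced code block pattern (closing fence must equal opening fence).
FENCE = re.compile(
    r"^(`{3,}|~{3,})[ \t]*([^\n`]*)\n(.*?)^\1[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def load_processors(config_text):
    """Map each configured syntax name to the argv list of its processor."""
    processors = {}
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, command = line.partition(":")
        processors[name.strip()] = shlex.split(command)
    return processors


def plan_blocks(text, processors):
    """Return (syntax, command, content) for every fenced code block.
    `command` is None when no "foreign" processor is configured, meaning
    the block stays an ordinary code block."""
    plan = []
    for match in FENCE.finditer(text):
        info_words = match.group(2).split()
        # The first word of the info string names the syntax.
        syntax = info_words[0] if info_words else ""
        plan.append((syntax, processors.get(syntax), match.group(3)))
    return plan
```

Supporting another syntax then means adding one more line to the configuration text; neither `plan_blocks` nor the Markdown processor itself needs to change.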
The key idea is to compose cmark and other mark-up processors together with specialized tools into a chain of processors (typically in a U*IX-style pipeline, or controlled by a Makefile, i.e. each processor is also a process), so that this chain of processes transforms the plain text mark-up: it is a “plain text mark-up processing chain”. Most of the rest follows naturally from this idea, driven by some design decisions I made, under constraints and requirements I assumed.
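The chain idea can be sketched in a few lines of Python, where every stage is a separate process and the output of one stage becomes the input of the next. This is a hedged illustration, not the planned implementation: `run_chain` is a hypothetical helper, the stages run one after another rather than concurrently as in a real shell pipeline, and the two demo stages are small Python one-liners standing in for real tools.

```python
import subprocess
import sys


def run_chain(stages, text):
    """Pass `text` through each stage in order. Every stage is an argv list
    that is started as its own process, reads stdin, and writes stdout."""
    data = text.encode("utf-8")
    for argv in stages:
        completed = subprocess.run(
            argv, input=data, stdout=subprocess.PIPE, check=True
        )
        data = completed.stdout
    return data.decode("utf-8")


# Stand-in stages for demonstration; in an actual chain these would be
# commands such as a Markdown processor followed by specialized filter tools.
UPPERCASE = [
    sys.executable, "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]
WRAP = [
    sys.executable, "-c",
    "import sys; sys.stdout.write('<p>' + sys.stdin.read().strip() + '</p>')",
]

if __name__ == "__main__":
    print(run_chain([UPPERCASE, WRAP], "hello chain\n"))
```

Because each stage only reads and writes a byte stream, stages can be added, removed, or reordered without touching the others, which is the property the pipeline or Makefile approach relies on.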
Most (or even all) of the details about the motivation, concept, and design of the planned implementation can be found in a very detailed article I wrote: A Plain Text Mark-Up Processing Chain.
Since the goals of the project are relevant for the greater community (as I would hope), and the implementation would be related to several topics discussed here recently, like
- ways of using “foreign” mark-up syntax blocks in CommonMark, either in the form of fenced code blocks or, alternatively or additionally, in “foreign mark-up blocks”;
- ways of using the same “foreign” mark-up syntax in code spans (with an extended syntax to attach an info string to such code spans);
- modifying the cmark implementation by adding another “mode of operation” (i.e. a new value for the -t option), or alternatively implementing a new processor based on cmark and using its API;
- generating “native” elements from mark-up, and thus using the CommonMark DTD for “production purposes”, not primarily for testing;
I would like to invite everyone who finds some (or all) of the project’s goals or topics attractive and worthwhile to comment, discuss, and help make this project a success.
— tin-pot