Okay:
For metadata:
Use an external metadata parser that ignores anything between Jekyll-style fences:
http://talk.commonmark.org/t/jekyll-style-do-not-show-sections/
FWIW I've added a YAML metadata parser for the remarkable markdown parser here: https://github.com/eugeneware/remarkable-meta.
I've taken the approach of just using `---` separators at the top of the file for now, though this could be configurable.
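To make the "external parser" idea concrete, here is a rough Python sketch of a pre-processor that peels off a `---` block at the top of a file before the Markdown parser ever sees it. The function names `split_front_matter` and `parse_flat_pairs` are made up for illustration (they are not part of remarkable-meta), and the flat `key: value` reader is only a stand-in for a real YAML parser.

```python
def split_front_matter(text, fence="---"):
    """Return (metadata_text, body); metadata_text is None without front matter."""
    lines = text.splitlines(keepends=True)
    # Front matter only counts when the very first line is the fence.
    if not lines or lines[0].strip() != fence:
        return None, text
    for i in range(1, len(lines)):
        if lines[i].strip() == fence:
            return "".join(lines[1:i]), "".join(lines[i + 1:])
    # No closing fence: treat the whole input as ordinary Markdown.
    return None, text


def parse_flat_pairs(meta_text):
    """Read flat `key: value` lines only (assumption: no nesting, no lists)."""
    pairs = {}
    for line in meta_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            pairs[key.strip()] = value.strip()
    return pairs


doc = "---\ntitle: Hello\nlayout: post\n---\n# Heading\n\nBody text.\n"
meta_text, body = split_front_matter(doc)
print(parse_flat_pairs(meta_text))
print(body)
```

Making `fence` a parameter is one way the separator could be made configurable.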
Static website generators' markers for YAML:
Pelican uses `---` and `---`
Hugo uses `---` and `...`
Based on my previous comment, and in case it's useful or helps to distinguish between what is necessary in markdown and what can be handled by an "external" tool: I created a lib called gray-matter for parsing front matter from markdown files. YAML is the most popular front-matter language, but gray-matter can also parse coffee front matter and JSON front matter. It's very stable, it's the fastest implementation I've tested, and it's used on hundreds of projects (including Assemble).
I really liked how you allowed for different languages in gray-matter via option switches.
Just one suggestion. Can you allow for an optional descriptive field after the "language specifier"? This is most useful for repeated metadata within a document. E.g. slideshow apps might need to set a different stylesheet for each slide, so they need to be able to distinguish between different metadata boxes.
---yaml: slide01 ---
CSS: style.css
---
Thanks!
Can you allow for optional descriptive field after the "language specifier"?
We could, but it depends on the specifics. I've thought about a need for something similar. Would you want to continue the discussion on a gray-matter feature request? It might be better to move the discussion there.
I like the terseness and "calmness" of pandoc-style headers, or "Metadata Blocks": just (up to) three lines right in front of the Markdown typescript, each line beginning with a `%`. This looks like this:
% Document Title
% A. Uthor
% 2015-01-01
... Document content begins here ...
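For comparison with the fenced variants, reading such a header takes very little code. The sketch below is a simplified reading of the three-line form shown above; `parse_percent_header` is a hypothetical name, and it deliberately ignores continuation lines and multiple authors.

```python
def parse_percent_header(text):
    """Split up to three leading '%' lines (title, author, date) from the body."""
    fields = ("title", "author", "date")
    lines = text.splitlines()
    meta = {}
    consumed = 0
    # zip() stops after three lines, so later '%' lines stay in the body.
    for name, line in zip(fields, lines):
        if not line.startswith("%"):
            break
        meta[name] = line[1:].strip()
        consumed += 1
    return meta, "\n".join(lines[consumed:])


sample = "% Document Title\n% A. Uthor\n% 2015-01-01\nDocument content begins here\n"
meta, body = parse_percent_header(sample)
print(meta)
print(body)
```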
I have implemented this kind of "Meta-Information" in my clone of cmark (in `cm2html`, to be exact), which puts the meta-information into Dublin Core `<META>` elements in the HTML `<HEAD>`, and uses the title for the `<TITLE>` element too (if not overridden by a command-line option specifying the `<TITLE>` to use this time).
The resulting HTML header is shown below. I particularly longed for this feature (copied from the discount parser I use too) because ISO HTML requires, among other things, each HTML document to have a `<TITLE>` to be valid, and there's no good way to specify the title otherwise (well, maybe a parser could use the first section header text, but I wouldn't find that a great work-around).
<!DOCTYPE HTML PUBLIC "ISO/IEC 15445:2000//DTD HTML//EN">
<HTML>
<HEAD>
<META name="GENERATOR"
content="cmark 0.22.0 (https://github.com/tin-pot/cmark.git d57b73fedd68)">
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<LINK rel="schema.DC" href="http://purl.org/dc/elements/1.1/">
<META name="DC.format" scheme="DCTERMS.IMT" content="text/html">
<META name="DC.type" scheme="DCTERMS.DCMIType" content="Text">
<META name="DC.title" content="Document Title">
<META name="DC.creator" content="A. Uthor">
<META name="DC.date" content="2015-01-01">
<LINK rel="stylesheet" type="text/css"
href="default.css">
<TITLE>Document Title</TITLE>
</HEAD>
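The mapping from the three header fields to the Dublin Core elements above is mechanical. A minimal Python sketch of that step follows; it is not the cm2html code, the name `dublin_core_meta` is invented here, and it assumes the metadata has already been collected into a dict.

```python
from html import escape


def dublin_core_meta(meta):
    """Render title/author/date as Dublin Core META lines plus a TITLE element."""
    mapping = (("title", "DC.title"), ("author", "DC.creator"), ("date", "DC.date"))
    out = []
    for key, dc_name in mapping:
        if key in meta:
            # Escape quotes too, since the value lands inside an attribute.
            value = escape(meta[key], quote=True)
            out.append('<META name="%s" content="%s">' % (dc_name, value))
    if "title" in meta:
        out.append("<TITLE>%s</TITLE>" % escape(meta["title"]))
    return "\n".join(out)


print(dublin_core_meta({
    "title": "Document Title",
    "author": "A. Uthor",
    "date": "2015-01-01",
}))
```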
Thanks for your comments! They cover exactly what I'm struggling with right now.
Background.
I use markdown to store notes, clippings from articles, ideas... So I need something fast and uncomplicated. MD is perfect for this. And CommonMark is even better.
To distinguish reference data (URLs, the language of the note, who said that, etc.) I simply put a paragraph starting with "~" somewhere in the text. This note refers to the enclosing header, or to the whole file if it appears before any header.
Anything more complicated (or with too much structure) will not be used for my peculiar use case, I'm sure. Also, visually intrusive markup, I think, distracts too much from the real content.
So I suggest the language could define a beginning-of-line marker to say "Ignore this" or "This is special" or "This is a comment". The standard parsers (pandoc...) could ignore it, but libcmark could create a node for it marked as `CMARK_NODE_CUSTOM_BLOCK`. Then it is my database loading code (for example) that uses this information, with no need to look at each paragraph's first character to see if it is a "~".
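To show what a consumer of such notes has to do today without parser support, here is a rough line-based Python sketch. It is an assumption-laden toy: `collect_tilde_notes` is a hypothetical name, only ATX (`#`) headings are recognised, and code blocks are not skipped.

```python
def collect_tilde_notes(text):
    """Map each heading (None = whole file) to the '~' paragraphs under it."""
    notes = {}
    heading = None         # None means "before any header"
    previous_blank = True  # True at a paragraph boundary
    current = None         # lines of the note being accumulated, if any
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            heading = stripped.lstrip("#").strip()
            current = None
            previous_blank = True  # a heading ends the previous paragraph
            continue
        if not stripped:
            current = None
            previous_blank = True
            continue
        if previous_blank and stripped.startswith("~"):
            current = [stripped[1:].strip()]
            notes.setdefault(heading, []).append(current)
        elif current is not None:
            current.append(stripped)  # continuation line of the same note
        previous_blank = False
    return {h: [" ".join(parts) for parts in ps] for h, ps in notes.items()}


sample = (
    "~ lang: en\n\n# Clipping\n\nSome quoted text.\n\n"
    "~ source: https://example.org\nsaid by someone\n\n# Idea\n\nPlain paragraph.\n"
)
print(collect_tilde_notes(sample))
```

With a dedicated node type from the parser, the first-character check in the loop would disappear and only the heading bookkeeping would remain.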
Thanks for the discussion!
Pandoc % block is too limited.
What if I want to mark a section with the language used? The pandoc block is OK for your use case, but then I need a different mechanism for this one.
% Title: multi-language
% language: fr
<text in French>
# Article
% language: en
<text in English>
# Articolo
% language: it
<text in Italian>
I propose that we overload the fenced code block info string to indicate to post-processors that a code block may be interpreted:
```yaml #!
---
foo: bar
---
```
Whilst it is a little more verbose than some of the alternatives, it has several benefits:
- `#!` in an info string is a post-processor directive
- the block literal can be handed straight to a parser (`yaml.safeLoadAll(node.literal)`)
- the directive can name a specific processor: `#!js-yaml` or `#!yaml-lite` etc.

I think this addresses several of the concerns in this thread; I wonder what people think...
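A post-processor for this proposal could look roughly like the Python sketch below. It is only an illustration under simplifying assumptions: it scans raw text for backtick fences of exactly three characters instead of walking a real CommonMark AST, and `find_directive_blocks` is a made-up name. The fence string is built with `"`" * 3` so the example can itself sit inside a fenced block.

```python
FENCE = "`" * 3  # three backticks, built indirectly to keep this block intact


def find_directive_blocks(text, marker="#!"):
    """Return (language, directive, literal) for fenced blocks flagged with marker."""
    results = []
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        if lines[i].startswith(FENCE):
            words = lines[i][len(FENCE):].split()
            # Collect the literal content up to the closing fence.
            j = i + 1
            literal = []
            while j < len(lines) and lines[j].strip() != FENCE:
                literal.append(lines[j])
                j += 1
            directives = [w for w in words[1:] if w.startswith(marker)]
            if directives:
                name = directives[0][len(marker):]  # "" for a bare marker
                results.append((words[0], name, "\n".join(literal)))
            i = j + 1
        else:
            i += 1
    return results


sample = "\n".join([
    FENCE + "yaml #!js-yaml", "---", "foo: bar", "---", FENCE,
    "", "Ordinary text.", "",
    FENCE + "python", "print(1)", FENCE,
])
print(find_directive_blocks(sample))
```

Blocks without the marker (the `python` one here) are left alone, so they still render as ordinary code.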
@tmpfs I prefer the Jekyll-style metadata blocks because "a Markdown-formatted document should be publishable as-is, as plain text, without looking like it's been marked up with tags or formatting instructions." The list item points that you mention are good for parsers, but adding additional syntax around the metadata block makes the document less readable for humans.
I see your point of view but I have a few issues with the standard YAML frontmatter approach:
- it overloads `---` to mean YAML frontmatter, thematic break and level 2 setext heading

As noted elsewhere, with the YAML frontmatter approach there is no real need to specify anything, as it can be trivially parsed by a pre-processor.
However I think there could be some value in defining a commonmark extension that allows embedding arbitrary data in arbitrary formats anywhere in a document.
Just to throw it out there: to improve legibility, rather than take inspiration from the shebang (I figured most people using this functionality would see the shebang as something that would be interpreted), how about a single period `.`, as in `source`:
```json .
{"meta": "foo"}
```
I think it is more readable and also implies the code block would be interpreted.
I also believe it would be useful if an author could define structured data in multiple places in the document.
I think that there are many different uses for markdown and we shouldn't be restricted by it always being accessible to a layperson.
If a technical person is creating a markdown blog post then some additional meta data is useful. If I want to share a recipe with my family then I would use plain markdown.
I think we should be able to support both use cases.
I'm not sure that we can. You can get a technical person to understand a non-technical document, but a non-technical person is going to be confused or put off by seeing technical complexity mixed in with a regular Markdown document. So long as we allow complex syntax, some documents are inevitably going to include that syntax with the (wrong) assumption that a non-technical person will just ignore the complex syntax.
My point is that in the first use case no layperson will see the document, i.e. it is published to HTML for public consumption, so we should be able to specify this extension to fulfil that use case.
We know this use case exists as there are various markdown -> blog publishing tools.
The idea we should strive for is that "technical features" targeting technical users can and should be more complex, but at the same time clearly distinguished from the normal markdown syntax for normal markdown users, via some "start" and "stop" markers.
An example of this is `# h1 heading { #idname .classname var=1 }` and how it clearly distinguishes the much less intuitive anchor and styling classnames from the normal intuitive markdown syntax via `{}`.
What this allows is for normal and expert users to share the same markdown vocabulary for the basics, while for more advanced features the more technical users can have a more expressive syntax (at the cost of intuitiveness). So we get the best of both without impacting writers of low or high technical ability.
This also allows for defining how to ignore the more technical features (or strip them), to provide a more compact markdown display in tools that don't implement the more advanced features.
In which case we could still overload the fenced code block, just be more specific:
```yaml {meta}
---
data:
- foo
---
```
What do you think about something like that?
@jgm I would like to "revive" this, now that we support CommonMark. At a minimum I want to offer a "user option" for "traditional multiline" support, but it needs some metadata to accompany it.
Simplest solution in my mind would be to add this to the top of any documents you create.
<!-- softbreak: false -->
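Reading such a comment before handing the text to the renderer is cheap. The Python sketch below shows one possible shape, not how any existing editor does it: `read_leading_options` and the `key: value` comment convention are assumptions based on the single example above.

```python
import re

# One `<!-- key: value -->` comment at the very start of the remaining text.
OPTION_COMMENT = re.compile(
    r"\A\s*<!--\s*([A-Za-z_][\w-]*)\s*:\s*(.*?)\s*-->[ \t]*\n?"
)


def read_leading_options(text):
    """Collect option comments from the top of the document; return (options, rest)."""
    options = {}
    while True:
        match = OPTION_COMMENT.match(text)
        if not match:
            return options, text
        key, value = match.groups()
        # Map true/false to booleans; keep anything else as a string.
        options[key] = {"true": True, "false": False}.get(value.lower(), value)
        text = text[match.end():]


options, rest = read_leading_options("<!-- softbreak: false -->\nLine one\nline two\n")
print(options)
print(rest)
```

Because the options live in an HTML comment, a renderer that knows nothing about them simply emits an invisible comment.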
We could also go the whole hog here, but this chews up premium real estate in the editor.
---
softbreak: false
---
I guess the editor could be made smart enough just to hide the metadata and expose it via the options dialog.
Not sure what to do here. What is your preference?
An implementation can always decide to treat some specially marked initial lines of the document as metadata, rather than as part of the CommonMark content. I don't think there's any need to standardize on a format for this, since different applications will have different needs. So I still think that metadata should not be part of the spec.