Info strings elsewhere

## Heading # still heading
## Heading ## info string
## Heading ### still heading
## Heading## still heading
## Heading ##still heading
## Heading # # still heading


In the GitHub issue, I’m requesting a small syntax change to enable info strings, as known from fenced code blocks, for suffixed ATX headings. I believe they are the perfect entry point for many kinds of extensions. Even if not parsed further, info strings can be considered comments that will not appear in the output. I am not proposing a particular syntax for the info string itself here. I just want to collect and discuss ideas on how to enable them elsewhere.
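For reference, here is how I read the heading examples at the top: a run of # that has the same length as the opening run, with whitespace on both sides, starts the info string. The following is only a rough Python sketch of that reading; the function name split_atx and the returned tuple are made up for illustration, and ordinary closing sequences at the end of the line are deliberately not handled.

```python
import re

def split_atx(line: str):
    """Split an ATX heading into (level, text, info).

    Hypothetical helper: `info` is None unless a run of '#' with the
    same length as the opening run appears with whitespace on both sides.
    """
    m = re.fullmatch(r" {0,3}(#{1,6})[ \t]+(.*?)[ \t]*", line)
    if m is None:
        return None  # not an ATX heading at all
    level, rest = len(m.group(1)), m.group(2)
    # Exactly `level` hashes, preceded and followed by whitespace.
    closer = re.fullmatch(rf"(.*?)[ \t]+#{{{level}}}[ \t]+(\S.*)", rest)
    if closer is None:
        return (level, rest, None)
    return (level, closer.group(1), closer.group(2))

print(split_atx("## Heading ## info string"))
print(split_atx("## Heading ### still heading"))
```

Under this reading only the second of the six example lines yields an info string; the others keep their full text as heading content.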

Setext headings

Suggestion: the number of dashes - or equals signs = must match the number of characters in (the first/last/shortest/longest line of) the heading (without leading and trailing whitespace) for successive characters after a whitespace to be considered an info string.

------ no, a paragraph

------- info string

-------- no, a paragraph
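To make the length rule concrete, here is a small Python sketch. It assumes a single-line heading (so the first/last/shortest/longest question does not arise) and a seven-character heading text such as "Heading" for the three lines above; the function name setext_info is hypothetical.

```python
import re

def setext_info(heading_text: str, underline: str):
    """Return (level, info) if `underline` carries an info string, else None.

    Assumed rule: the run of '=' or '-' must be exactly as long as the
    stripped heading text and must be followed by whitespace.
    """
    m = re.fullmatch(r" {0,3}(=+|-+)[ \t]+(\S.*?)[ \t]*", underline)
    if m is None:
        return None
    marks, info = m.group(1), m.group(2)
    if len(marks) != len(heading_text.strip()):
        return None  # length mismatch: treat the line as paragraph text
    return (1 if marks[0] == "=" else 2, info)

print(setext_info("Heading", "------- info string"))
print(setext_info("Heading", "------ no, a paragraph"))
```

With a seven-character heading, only the seven-dash line is accepted; the six- and eight-dash lines fall through to ordinary paragraph handling.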



Quotations are complicated, due to lazy wrapping. An approach similar to the one I’m proposing for ATX headings might work, though.

>> Quotation < more quotation

>> Quotation << info string

>> Quotation <<< more quotation

>> Quotation <<more quotation

>> Quotation<< more quotation

>>Quotation << info string?

>> Quotation 
<< info string?


List items

Unlike HTML, CommonMark has no explicit markup for lists. They are made implicitly from consecutive list items. We could try the same approach as proposed for headings, but unfortunately, the bullet characters are rather likely to appear verbatim with spaces on both sides. The situation is better for numbered lists. In any case, it would probably make sense to restrict info strings to the first line of a list item or to single-line items, because otherwise it gets really confusing and probably ambiguous with nested lists.

* List item * info string?
* List item + more list item
* List item - more list item

+ List item * more list item
+ List item + info string?
+ List item - more list item

- List item * more list item
- List item + more list item
- List item - info string?

1. List item . info string
1) List item ) info string


Thematic breaks

We could require a certain pattern of whitespace and only dashes, only asterisks or only underscores to allow an info string after it, e.g. three times three characters separated by three spaces, but that’s a really arbitrary choice to make.

---   ---   ---   info string
***   ***   ***   info string
___   ___   ___   info string


This has many problems in existing implementations because two or three dashes may be combined into an en or em dash (Smartypants), respectively, while consecutive asterisks and underscores may be interpreted as emphasis markup.
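The spaced pattern itself is easy to recognise. Below is a minimal Python sketch of the three-times-three variant suggested above; the name break_info is invented for illustration, and the sketch does nothing about the Smartypants and emphasis conflicts just mentioned.

```python
import re

# Three groups of three identical marker characters, separated by
# exactly three spaces, optionally followed by an info string.
BREAK_WITH_INFO = re.compile(
    r"([-*_])\1{2}(?: {3}\1{3}){2}(?:[ \t]+(\S.*?))?[ \t]*"
)

def break_info(line: str):
    """Return the info string ('' if absent), or None if the line does
    not follow the spaced three-by-three pattern."""
    m = BREAK_WITH_INFO.fullmatch(line)
    if m is None:
        return None
    return m.group(2) or ""

print(break_info("---   ---   ---   info string"))
print(break_info("---"))
```

A plain `---` returns None here, meaning it would be left to the existing thematic break rule without any info string.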

Link definitions

One could consider the link title in parentheses or plain quotation marks as a rudimentary info string. That means any additional information would simply follow it. In other words, everything after the first whitespace after the link address is considered an info string, and the title syntax is the only part of it that is described in the core spec.

  [link-id]: "title" info string
  [link-id]: (title) info string
  [link-id]: <>
  [link-id]: <> "title" info string
  [link-id]: <> (title) info string


Note that #, ? and perhaps even ; can be considered minimal relative URLs that can safely be ignored. This may be useful for certain extensions.

Inline links

Basically the same considerations as for link definitions also apply to inline links, although existing support there is less universal.

[link text]( "title" info string)


One could argue that URLs inside angle brackets should allow spaces and other characters that need to be percent-encoded. This would probably negate the possibility of info strings therein.

< part of the address>


Related topics

Possible internal info string syntax

The exact use and syntax of info strings should not be specified by CommonMark, but most parsers would probably agree on some common extensions:

  • #id unique identifier, usable as an anchor, i.e. a link target, for instance (not a hash-tag)
  • .class named category, type or class, e.g. for styling or specialized behavior
  • "title", 'title', maybe (title) alternative or additional textual content, e.g. for a table of contents or a tool tip
  • key=value arbitrary parameters or attributes that may only be useful for a certain output format or processor
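As an illustration of how little is needed to read these four common forms, here is a rough Python sketch. It is not a proposed grammar: the function name parse_info, the result layout and the allowed characters in names are all assumptions, and the (title) form is left out.

```python
import re

# One whitespace-delimited token per alternative.
TOKEN = re.compile(r"""
    (?<!\S)(?:
        \#(?P<id>[\w-]+)                            # #id
      | \.(?P<cls>[\w-]+)                           # .class
      | (?P<key>[\w-]+)=(?P<value>"[^"]*"|\S+)      # key=value or key="a b"
      | (?P<q>["'])(?P<title>.*?)(?P=q)             # "title" or 'title'
    )(?!\S)
""", re.VERBOSE)

def parse_info(info: str) -> dict:
    """Collect id, classes, title and attributes; unknown tokens are skipped."""
    result = {"id": None, "classes": [], "title": None, "attrs": {}}
    for m in TOKEN.finditer(info):
        if m.group("id"):
            result["id"] = m.group("id")
        elif m.group("cls"):
            result["classes"].append(m.group("cls"))
        elif m.group("key"):
            result["attrs"][m.group("key")] = m.group("value").strip('"')
        elif m.group("title") is not None:
            result["title"] = m.group("title")
    return result

print(parse_info('#intro .note .wide "Short title" lang=en'))
```

Anything the pattern does not recognise is silently ignored, which matches the idea that an unparsed info string is just a comment.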

Other syntax extensions for info strings are less common:

  • @attribute a boolean property that should be activated (true), can also be a semantic category (not a mention)
  • $variable, $value
  • {template}, {{template}}
  • :lang, ((lang)) a BCP47 language code


Without more changes, autolinks could either support info strings or spaces in absolute URLs, but not both.

Assuming we preferred info strings, should they be dropped from, retained for or even used as the link text?

< info string>
<p><a href=""></a></p>
<p><a href=""> info string</a></p>
<p><a href="">info string</a></p>

List items 2

For enumerated lists with parentheses, reversing the marker would be possible as well:

1) List item ) more list item
1) List item ( info string
1) List item (1 info string? 
1) List item 1( more list item

Autolinks 2

The example may be clearer this way:

< #info .string>
<p><a href="" id="info" class="string"></a></p>
<p><a href="" id="info" class="string"> #info .string</a></p>
<p><a href="" id="info" class="string">info string</a></p>

I strongly suggest we stay with automatically converted spaces, though:

<p><a href=""> #info .string</a></p>

Given that this proposal adds quite a few syntax rules (and thus additional test cases), and the community’s desire to stabilise the core spec, should we move this to “Extensions”?


The info strings enable extensions, but the syntax itself belongs in the core, preferably in 1.0, because we don’t want breaking changes later.

How about…

{.class #id key="value" info="Arbitrary info string"}

This provides maximum flexibility for future extensions, since any info string could be applied in any location where attribute blocks (or attribute references) can be applied.

It’s additional markup, but it keeps things future-compatible – consistent attribute syntax will almost inevitably be added to CommonMark core or a standard extension.


I’m in favour of using consistent attribute syntax for info strings too, rather than adding a bunch of new syntax rules relating to the placement of the info string.

Also, by keeping text that isn’t rendered in the visible output of the document inside curly braces, there’s a clear separation of concerns. I realise that fenced code blocks can already have info strings outside of curly braces; that’s unfortunate, but too late to change now. So maybe info strings outside of curly braces could be reserved for fenced block elements only.
