I’ve often tried to convert legal documents (org. constitutions, contracts, etc.) to Markdown. Legalese is often full of nested ordered lists of several varieties. A main pet peeve is that letter-ordered lists are frequent and not really supported.
1. A vegetarian shall not eat:
(a) Chicken
(b) Beef
(c) Bacon
(d) Fish, on these days:
i. Tuesday
ii. The first Wednesday of October
Would that be worthy of support?
I would be in favor. Pandoc supports all the list styles you used.
However, in our initial discussions the majority wanted to stay with
decimal lists, with . and ) delimiters.
I would support the addition of letter-ordered lists. One use case that might not have been considered yet is for todo.txt, which has a format that requires lettered lists.
My only concern is whether it would break existing Markdown implementations. For example, you might have a sentence beginning with a. or (a) but not intend for that to start an ordered list. If it’s an extension, at least you’d need to turn on the extension before breaking anything.
I am, myself, a fan of letter-ordered lists, and they are supported
in pandoc.
It may occasionally happen that a. or (a) or a) will occur by
accident at the beginning of a hard-wrapped line, but this is not going
to be any more likely than 1. or 1) appearing.
However, if capital-letter lists are allowed, there is a significant
risk that names with initials will be wrongly interpreted as lists: B. Russell says.... The solution in pandoc is to require two spaces
after the period in these cases.
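To make the rule above concrete, here is a minimal sketch of a marker check. The function name `starts_lettered_item` and the regular expressions are made up for illustration and are not taken from pandoc's source or from any spec; the sketch only assumes single-letter markers and the two-space rule for capital letters followed by a period, as described above.

```python
import re

# Hypothetical patterns for illustration only (single-letter markers).
# Lowercase markers: "a.", "a)" or "(a)", followed by at least one space.
LOWER = re.compile(r"^(?:\([a-z]\)|[a-z][.)]) +\S")
# A capital letter plus a period needs two or more spaces, mirroring the
# pandoc rule described above, so "B. Russell" stays ordinary text.
UPPER_PERIOD = re.compile(r"^[A-Z]\. {2,}\S")
# Capital letters with parentheses are not mistaken for initials.
UPPER_OTHER = re.compile(r"^(?:\([A-Z]\)|[A-Z]\)) +\S")


def starts_lettered_item(line: str) -> bool:
    """Return True if the line would open a letter-ordered list item."""
    text = line.lstrip(" ")
    return any(p.match(text) for p in (LOWER, UPPER_PERIOD, UPPER_OTHER))


print(starts_lettered_item("(d) Fish, on these days:"))  # True
print(starts_lettered_item("B. Russell says..."))        # False
print(starts_lettered_item("B.  Second item"))           # True
```

A lowercase-only variant, as suggested later in the thread, would simply drop the two uppercase patterns.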
Good point. Are capital-letter lists common? Lowercase letters appear to be common in Terms & Conditions. If capital-letter lists are uncommon, perhaps CommonMark could support only lowercase lists. The web developer could always use CSS to make the list capitalised if necessary.
I have a concern with the double space rule after the B.. Writers who are “just muddling through” and haven’t read the documentation might be confused when the B. turns into a list item. Without searching the documentation on how to apply the override, they might give up.
Lower-case only sounds like a reasonable solution for the main spec.
I do think this is a common need. It would come up, for example, in nearly any copyright license, a common case where people want something edited in plain text but still very readable (after all, GitHub supports Markdown for the sake of plain-text README and related files).
I was going to submit this as an issue before finding this page, but below are my thoughts on implementing this spec:
CommonMark ordered lists are pretty limited in scope, but CSS lets us render them in a variety of ways (upper- or lowercase letters; Roman, Georgian, Hebrew, Armenian, and other numerals). I would like to see an implementation spec covering the following use cases.
I tried to demo existing spec features, such as distinguished list groups based on delimiter change [Lists: example 186] and nesting list items [Lists: example 199]:
I based the delimiters on the type attribute values outlined in the HTML5 <ol> tag spec since this format is consistent with current practices and is fairly declarative, while still being easy to read and author.
Caveat:
This deliberately limited subset potentially keeps names that begin with an initial from starting an ordered list.
// Lettered: simple example
a. item              >>>  a. item
a. item              >>>  b. item
   a. item sub 1     >>>     a. item sub 1
a. item              >>>  c. item

A. item              >>>  A. item
A. item              >>>  B. item
A. item              >>>  C. item
// Roman numeral: simple example
i. item              >>>  i. item
i. item              >>>  ii. item
i. item              >>>  iii. item

I. item              >>>  I. item
I. item              >>>  II. item
I. item              >>>  III. item
// More contrived example
a. item                   >>>  a. item
a. item                   >>>  b. item
   1. item sub 1          >>>     1. item sub 1
      i. item sub 2       >>>        i. item sub 2
      i. item sub 2       >>>        ii. item sub 2
   1. item sub 1          >>>     2. item sub 1
      A. item sub 2       >>>        A. item sub 2
         - item sub 3     >>>           - item sub 3
         - item sub 3     >>>           - item sub 3
      A. item sub 2       >>>        B. item sub 2
      A. item sub 2       >>>        C. item sub 2
a. item                   >>>  c. item
Markdown renders:
<!-- Lettered: simple render -->
<ol type="a">
  <li>item</li>
  <li>item
    <ol type="a">
      <li>item sub 1</li>
    </ol>
  </li>
  <li>item</li>
</ol>
<ol type="A">
  <li>item</li>
  <li>item</li>
  <li>item</li>
</ol>
<!-- Roman numeral: simple render -->
<ol type="i">
  <li>item</li>
  <li>item</li>
  <li>item</li>
</ol>
<ol type="I">
  <li>item</li>
  <li>item</li>
  <li>item</li>
</ol>
<!-- AND our really contrived example -->
<ol type="a">
  <li>item</li>
  <li>item
    <ol type="1">
      <li>item sub 1
        <ol type="i">
          <li>item sub 2</li>
          <li>item sub 2</li>
        </ol>
      </li>
      <li>item sub 1
        <ol type="A">
          <li>item sub 2
            <ul>
              <li>item sub 3</li>
              <li>item sub 3</li>
            </ul>
          </li>
          <li>item sub 2</li>
          <li>item sub 2</li>
        </ol>
      </li>
    </ol>
  </li>
  <li>item</li>
</ol>
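As a rough illustration of the marker-to-`type` mapping used in the examples above, here is a small Python sketch. The helper names (`ol_type_for`, `label`, `to_alpha`, `to_roman`) are hypothetical, and the rule that a list opening with `i` or `I` is Roman rather than alphabetic is an assumption of this sketch, not something the proposal settles. In practice the browser computes the labels from the `type` attribute; `label` only mimics that to show what the right-hand column would contain.

```python
ROMAN = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"), (100, "c"),
         (90, "xc"), (50, "l"), (40, "xl"), (10, "x"), (9, "ix"),
         (5, "v"), (4, "iv"), (1, "i")]


def ol_type_for(marker: str) -> str:
    """Map the first marker of a list ("a.", "(a)", "I.", "1)") to an <ol> type."""
    body = marker.strip("().")
    if body.isdigit():
        return "1"
    if body in ("i", "I"):
        return body  # assumption: a list that opens with i/I is Roman
    if len(body) == 1 and body.isascii() and body.isalpha():
        return "a" if body.islower() else "A"
    raise ValueError(f"unsupported marker: {marker!r}")


def to_alpha(n: int) -> str:
    """1 -> a, 26 -> z, 27 -> aa (bijective base 26)."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("a") + rem) + letters
    return letters


def to_roman(n: int) -> str:
    """1 -> i, 4 -> iv, 9 -> ix."""
    parts = []
    for value, symbol in ROMAN:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


def label(ol_type: str, n: int) -> str:
    """Label for the n-th item (1-based) of a list with the given type."""
    if ol_type == "1":
        return str(n)
    text = to_alpha(n) if ol_type in ("a", "A") else to_roman(n)
    return text.upper() if ol_type.isupper() else text


print([label(ol_type_for("a."), n) for n in (1, 2, 3)])  # ['a', 'b', 'c']
print([label(ol_type_for("I."), n) for n in (1, 2, 3)])  # ['I', 'II', 'III']
```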
Presentation Styling
The developer or designer could utilize CSS to adjust presentational properties of the lists in a more targeted way:
With the roman numeral example, how is i. set apart from the letter i.? Is it when the list marker changes from a. to i.? What happens when the previous marker is h.?
For capital letters, we still have the problem of A. being used to represent an initial, e.g. A. Hitchcock.
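One possible answer to the `h.` question, sketched in Python: let the previous sibling marker decide. This is only a guess at a workable rule, not something any existing spec or implementation is known to do, and the function name `classify` is made up. It handles single lowercase letters only and does nothing for the capital-initial problem.

```python
from typing import Optional


def classify(marker: str, previous: Optional[str] = None) -> str:
    """Return "alpha" or "roman" for a single lowercase letter marker.

    `previous` is the letter of the preceding sibling item, if any.
    """
    if len(marker) != 1 or not ("a" <= marker <= "z"):
        raise ValueError(f"expected one lowercase letter, got {marker!r}")
    # A marker that directly follows its alphabetic predecessor continues
    # the lettered list, so "i." after "h." is the ninth letter.
    if previous is not None and len(previous) == 1 and ord(marker) == ord(previous) + 1:
        return "alpha"
    # Assumption: otherwise an "i" opens a Roman-numeral list.
    if marker == "i":
        return "roman"
    return "alpha"


print(classify("i"))                # roman
print(classify("i", previous="h"))  # alpha
print(classify("i", previous="a"))  # roman (marker change, so a new list)
```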
What actually should be part of the core, is a more generic description of line prefixes, so that extensions could give them meaning. I’m thinking of something like this (with nesting and indentation mostly ignored):
I prefer a separation of semantics from presentation, similar to HTML, where the symbols used to represent the ordering are a rendering concern defined not in the HTML but in CSS (see the CSS list-style-type property. So many options! Traditional Katakana iroha numbering, anyone?).
Sorry, I must have missed your post earlier @vas. What are your thoughts on using the type attribute (examples earlier in the topic), which is part of HTML? The writer might want to preserve the type of ordered list so that this information is transferable even when the stylesheets are different, e.g. to ensure that a sublist remains letter-ordered so that the items can be referred to elsewhere in the text as item a, b, etc. Otherwise these items might vary depending on the stylesheet used, changing the meaning of the document.
Since the numbers given in Markdown for ordered lists are not literal, the numbering in the rendered output may differ. Still, your use case, @chrisalley, is an important one: how do you refer to items in an ordered list from elsewhere in the same content? The proper solution is to provide a reference syntax, offhand something like (but better than):
1. Item A
3. Item C (Item B got deleted)
4. Item D
Here is a paragraph that refers to Option {{#3}}.
Which would render as:
1. Item A
2. Item C (Item B got deleted)
3. Item D
Here is a paragraph that refers to Option 2.
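For what it is worth, the renumbering above can be sketched in a few lines of Python. This is a toy under stated assumptions, not a proposal for real parser behaviour: the `{{#N}}` syntax is the offhand one from this post, the function name `render` is made up, the input is a single flat decimal list with unique source numbers, and the output is plain text rather than HTML. The rendered list starts from the first item's number, which is how CommonMark picks the start of an ordered list.

```python
import re

ITEM = re.compile(r"^(\d+)\. ")      # "3. Item C" style list items only
REF = re.compile(r"\{\{#(\d+)\}\}")  # hypothetical {{#N}} reference syntax


def render(source: str) -> str:
    """Renumber a flat ordered list and resolve {{#N}} references to it."""
    lines = source.splitlines()
    numbers = [int(m.group(1)) for line in lines if (m := ITEM.match(line))]
    start = numbers[0] if numbers else 1
    # Map each number written in the source to the number shown on output.
    rendered = {src: start + i for i, src in enumerate(numbers)}

    def resolve(ref: re.Match) -> str:
        n = int(ref.group(1))
        # Leave unknown references untouched rather than guessing.
        return str(rendered[n]) if n in rendered else ref.group(0)

    out, index = [], 0
    for line in lines:
        m = ITEM.match(line)
        if m:
            line = f"{start + index}. {line[m.end():]}"
            index += 1
        out.append(REF.sub(resolve, line))
    return "\n".join(out)


source = """1. Item A
3. Item C (Item B got deleted)
4. Item D

Here is a paragraph that refers to Option {{#3}}."""
print(render(source))
```

The last line of the output reads "Here is a paragraph that refers to Option 2." Because the sketch only ever produces decimal numbers, it does not address references to lettered or Roman-numeral items.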
EDIT: I just realized my idea might not work, since CSS styling would be applied after the Markdown was rendered to HTML, meaning the reference could not reflect the style (number, letter, roman numeral)… Or can it? Is there an HTML/CSS trick to support this?
My original point is just a preference. If it turns out that the pros of supporting specified list styles in Markdown outweigh the pros of separation of concerns, I can get behind it. But it does seem to go against the design principles of Markdown and its most common output format, HTML. Yes, HTML supports inline declaration of styling, but that’s because HTML supports both SOP and monolithic approaches. Markdown currently doesn’t support inline styling in any way other than embedding HTML.
I think I’ve rambled on too long about something that may not be that important!