Abbreviations (and acronyms)

Abbreviations can be defined in HTML5 as follows:

<abbr title="Hyper Text Markup Language">HTML</abbr>

There is no abbreviation syntax in the original Markdown spec; however, Markdown Extra includes the following syntax:

*[HTML]: Hyper Text Markup Language

and (anywhere) else in the document, the abbreviation can be written in plain text like this:

The HTML specification is maintained by the W3C.

resulting in any instance of ‘HTML’ becoming wrapped in <abbr> like this:

The <abbr title="Hyper Text Markup Language">HTML</abbr> specification is maintained by the W3C.
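
For readers who want to experiment, here is a minimal Python sketch of the replace-all behaviour described above. It is not the Markdown Extra implementation: the function name `expand_abbreviations` is made up for illustration, and the sketch works on plain text only, so it would also rewrite matches inside code spans or raw HTML.

```python
import html
import re

# A definition line looks like: *[HTML]: Hyper Text Markup Language
DEF_RE = re.compile(r"^\*\[([^\]]+)\]:[ \t]*(.*)$", re.MULTILINE)


def expand_abbreviations(text):
    """Wrap every defined abbreviation in <abbr> (hypothetical helper)."""
    # Pass 1: collect the definitions and drop their lines from the output.
    definitions = {m.group(1): m.group(2).strip() for m in DEF_RE.finditer(text)}
    body = DEF_RE.sub("", text).strip()
    if not definitions:
        return body

    # Pass 2: replace whole-word occurrences, longest abbreviation first.
    names = sorted(definitions, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, names)) + r")\b")

    def wrap(match):
        name = match.group(0)
        title = html.escape(definitions[name], quote=True)
        return f'<abbr title="{title}">{name}</abbr>'

    return pattern.sub(wrap, body)
```

Calling it on the definition and sentence above should give the `<abbr>`-wrapped sentence. Note that the definitions are collected before any replacement happens, which is why their position in the document does not matter in this variant.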

Quite an elegant solution if you ask me. The same syntax is also supported in the following flavours/implementations:

Flavours that do not use the Markdown Extra syntax:

Sidenote: HTML previously contained a separate <acronym> element. This element has been made obsolete in HTML5. An acronym is also an abbreviation; the spec recommends the use of <abbr> for acronyms instead.


I like it, and the impact seems minimal.

Still, all additions / extensions beyond core should be postponed until we get the basic Markdown part of the spec solidified.


Thank you so much for your well crafted statement.

I was able to take this and add an extension to html-pipeline very easily: html-pipeline-abbr

I also added AutoAbbr, which finds the verbose text and converts it into the abbreviation. Same syntax, just a different filter.


Quick thoughts:

I remember that programming languages allow for scoping variables.

I wonder if scoping rules would be handy for this system. I would imagine that the scoping environment would be between header levels.

This would help avoid abbreviation pollution in larger documents, especially once abbreviations are supported in maths equations (where mathematical symbols may mean different things between sections).

I can see a more general issue of an abbreviation standing for two different things, even within the same section. For example, “Cross Site Scripting” and “Cascading Style Sheets” could both be used in a section about web development. You could just use raw HTML in those (rare) cases, but it is an issue if you want to avoid the use of HTML.

I tend to do the reverse. So ambiguity isn’t an issue for me.

I type “Cascading Style Sheets” and have it convert that into CSS. But that probably does not work for others. e.g.:

#input:
Cascading Style Sheets is fun.

*[CSS]: Cascading Style Sheets

#output:
<abbr title="Cascading Style Sheets">CSS</abbr> is fun.
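
A rough Python sketch of this reverse direction, for illustration only. It is not the html-pipeline-abbr code; the name `auto_abbreviate` is hypothetical, and the match is an exact, case-sensitive string match, so the definition has to be spelled exactly like the text it replaces.

```python
import html
import re

# A definition line looks like: *[CSS]: Cascading Style Sheets
DEF_RE = re.compile(r"^\*\[([^\]]+)\]:[ \t]*(.*)$", re.MULTILINE)


def auto_abbreviate(text):
    """Replace each spelled-out expansion with its <abbr> (hypothetical helper)."""
    definitions = {m.group(1): m.group(2).strip() for m in DEF_RE.finditer(text)}
    body = DEF_RE.sub("", text).strip()
    for name, expansion in definitions.items():
        if not expansion:
            continue  # nothing to search for
        title = html.escape(expansion, quote=True)
        body = body.replace(expansion, f'<abbr title="{title}">{name}</abbr>')
    return body
```

Because the lookup key is the long form, two different meanings of the same abbreviation never collide here; the ambiguity only exists when the short form is the key.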

I often put all links ([]: url form) and abbreviations at the bottom of my document. I have been lucky finding alternate abbreviations, but yes, that may not always be possible. E.g.: XSS for cross site scripting.

So while scoping sounds like a good idea, it may introduce a few wrinkles for me (though would probably work)

Would it still be an issue, though? I would encourage dumb forward substitution (like #defines); otherwise you need to scan the document twice to fully render the page (collect the token declarations, then replace the tokens).
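
To make the single-pass idea concrete, here is a small Python sketch under the assumption that a definition only affects the lines that come after it, much like a C `#define`. The function name `forward_substitute` and the line-by-line model are illustrative choices, not part of any existing implementation.

```python
import html
import re

# A definition line looks like: *[CSS]: Cascading Style Sheets
DEF_RE = re.compile(r"^\*\[([^\]]+)\]:[ \t]*(.*)$")


def forward_substitute(lines):
    """One pass: a definition applies only to the lines that follow it."""
    definitions = {}
    output = []
    for line in lines:
        match = DEF_RE.match(line)
        if match:
            # Remember the definition and keep it out of the rendered output.
            definitions[match.group(1)] = match.group(2).strip()
            continue
        if definitions:
            names = sorted(definitions, key=len, reverse=True)
            pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, names)) + r")\b")

            def wrap(m):
                title = html.escape(definitions[m.group(0)], quote=True)
                return f'<abbr title="{title}">{m.group(0)}</abbr>'

            line = pattern.sub(wrap, line)
        output.append(line)
    return output
```

The trade-off is visible in the first line of any document: a use that appears before its definition is left untouched, which is exactly what a one-pass renderer buys its speed with.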

But anyway, presuming we go with what you proposed, kbrock, this still seems alright:

#h1_a
some content with abbr1 text to abbreviate

*[abbr1]: scoped to h1_a & h2

##h2 
*[abbr2]: scoped to h1_a & h2

You can see abbr1 and abbr2

#h1b

you cannot see abbr1 or abbr2 here.

But I can see how this could be inflexible when certain abbreviations need a global scope. Would this be an issue in normal writing? I can’t say, but it comes down to whether abbr tags should be allowed to have a global scope at all.
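
The scoping rule in the example above can be sketched in Python as follows. This assumes that the scope is the enclosing level-1 section, so a definition under a `##` heading is still visible in its parent `#` section; the name `scoped_definitions` is hypothetical.

```python
import re

# A definition line looks like: *[abbr1]: some expansion
DEF_RE = re.compile(r"^\*\[([^\]]+)\]:[ \t]*(.*)$")
# A level-1 heading: one '#' that is not followed by another '#'.
H1_RE = re.compile(r"^#(?!#)")


def scoped_definitions(text):
    """Return one (heading, {abbreviation: expansion}) pair per level-1 section."""
    sections = []
    for line in text.splitlines():
        is_h1 = bool(H1_RE.match(line))
        if is_h1 or not sections:
            # Text before the first heading gets a section with an empty heading.
            sections.append((line if is_h1 else "", {}))
        match = DEF_RE.match(line)
        if match:
            sections[-1][1][match.group(1)] = match.group(2).strip()
    return sections
```

A renderer would then expand abbreviations one section at a time, using only that section's dictionary. Global abbreviations would need an extra rule, for instance treating definitions that appear before the first heading as document-wide.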

In ebooks, I often wondered if you could define phrases in just a chapter.
That way, the definition for a character could change throughout the book.

So it seems that it would be nice if abbreviations, a kind of definition, were scoped.

I do feel that it would be nice if the *[abbr]: def behaved like [ref]: http://link/.
And links are global.

Like this, kbrock? It won’t clash with the link definition syntax, since empty links plus quoted strings are rare. But then again… what if the abbr tag supported clickable abbr links… is there such a thing?

#h1_a
What is CSS? It’s a way for you to make your stuff prettier!

[CSS]: "Cascading Style Sheets"

[CSSLink]: css.org "this link syntax won't clash !!"

Though I do worry whether it’s bad to overload such syntax.


##alternative syntax proposal (why not use ‘:~’ instead?)

Btw a common thing I personally do with my own textfiles is to write like this:

#h1_a
What is CSS? It’s a way for you to make your stuff prettier!

CSS :~ Cascading Style Sheets
HTML :~ Hyper Text Markup Language

Maybe that’s a nicer approach instead? I use it personally because I find :~ to be more directional.
(My use case is for declaring what each variable in a mathematical equation means. So I certainly hope this would extend to math equations)

Plus, it avoids overloading the link syntax. Two birds with one stone, and much cleaner.

I’d rather see a syntax that expands on the link and image syntax:

Reusing * from above, it would make more sense to me to use a syntax along the lines of:

*[HTML](Hyper Text Markup Language) is awesome!

or

*[HTML][1] is awesome

[1]: Hyper Text Markup Language

As for the flag used to mark abbreviations, I think an asterisk is a poor choice, since it is already used for em and strong. I’d prefer the question mark, because abbr elements are commonly styled with cursor: help.

?[HTML][1] is awesome

[1]: Hyper Text Markup Language
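
A minimal Python sketch of how the inline and reference forms above could be rendered. The regexes and the name `render_inline_abbr` are assumptions made for illustration. In particular, the sketch treats every `[label]: text` line as an abbreviation title and removes it, so a real parser would need a way to tell these apart from ordinary link reference definitions.

```python
import html
import re

# Reference definition, e.g.: [1]: Hyper Text Markup Language
REF_DEF_RE = re.compile(r"^\[([^\]]+)\]:[ \t]*(.*)$", re.MULTILINE)
# Inline form ?[HTML](title) or reference form ?[HTML][1]
INLINE_RE = re.compile(r"\?\[([^\]]+)\](?:\(([^)]*)\)|\[([^\]]+)\])")


def render_inline_abbr(text):
    """Render ?[abbr](title) and ?[abbr][ref] as <abbr> (hypothetical helper)."""
    refs = {m.group(1): m.group(2).strip() for m in REF_DEF_RE.finditer(text)}
    body = REF_DEF_RE.sub("", text).strip()

    def wrap(match):
        label, inline_title, ref = match.groups()
        title = inline_title if inline_title is not None else refs.get(ref)
        if title is None:
            return match.group(0)  # unknown reference: leave the source as written
        return f'<abbr title="{html.escape(title, quote=True)}">{label}</abbr>'

    return INLINE_RE.sub(wrap, body)
```

Unlike the replace-all syntax, only the marked occurrence is wrapped, so the same abbreviation can carry different titles in different places.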

As far as authoring is concerned, I rarely find a need for the abbr tag. Typically, content editors do what newspapers and magazines do and define the abbreviation the first time it’s used.

A good example of this is the Wikipedia article on HTML:

HyperText Markup Language, commonly referred to as HTML, is the standard markup language used to create web pages. It is written in the form of HTML elements consisting of tags enclosed in angle brackets (like <html>).


This problem could be overcome by allowing multiple abbreviation declarations in a document using the existing Markdown Extra syntax. Just use the first abbreviation declaration below the use of that abbreviation. For example:

CSS

Another use of CSS

*[CSS]: Cross Site Scripting

CSS

*[CSS]: Cascading Style Sheets

Yet another use of CSS.

would produce:

<p><abbr title="Cross Site Scripting">CSS</abbr></p>

<p>Another use of <abbr title="Cross Site Scripting">CSS</abbr></p>

<p><abbr title="Cascading Style Sheets">CSS</abbr></p>

<p>Yet another use of <abbr title="Cascading Style Sheets">CSS</abbr></p>

The actual placement of the abbreviation definition wouldn’t matter so long as it came after the use of the abbreviation. If no further definition is listed, the last definition will be used, as in the “Yet another use of CSS.” example. I can imagine definitions being listed at the end of a chapter or section to keep the document tidy though.
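
Here is one way the resolution rule could be sketched in Python: each use takes the first definition that follows it, and falls back to the last definition in the document when none follows. The name `resolve_uses` and the line-by-line model are assumptions made for illustration.

```python
import re

# A definition line looks like: *[CSS]: Cross Site Scripting
DEF_RE = re.compile(r"^\*\[([^\]]+)\]:[ \t]*(.*)$")


def resolve_uses(lines, name):
    """Pair each line that uses `name` with the expansion that applies to it."""
    # Positions and expansions of every definition of `name`, in document order.
    definitions = []
    for index, line in enumerate(lines):
        match = DEF_RE.match(line)
        if match and match.group(1) == name:
            definitions.append((index, match.group(2).strip()))

    word = re.compile(r"\b" + re.escape(name) + r"\b")
    resolved = []
    for index, line in enumerate(lines):
        if DEF_RE.match(line) or not word.search(line):
            continue
        later = [expansion for position, expansion in definitions if position > index]
        if later:
            expansion = later[0]            # first definition below the use
        elif definitions:
            expansion = definitions[-1][1]  # fallback: last definition seen
        else:
            expansion = None                # never defined
        resolved.append((line, expansion))
    return resolved
```

Note that this lookup needs the whole document before it can render the first use, so it cannot be combined with a strict single-pass renderer.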

I see. So its behaviour is nearly identical to normal link definitions.

So we shall have declarations below usage, and scoping based on headers?

Also what do you think of using :~ instead for a cleaner look? e.g.

CSS
Another use of CSS

CSS :~ Cross Site Scripting

CSS

CSS :~ Cascading Style Sheets

Yet another use of CSS.

In the proposal above there would be no need for the document to contain any headers. It’s more flexible that way.

I prefer the Markdown Extra syntax. An asterisk looks like a reference to something, and the square brackets + colon make it look similar to other types of Markdown definitions. Also, the Markdown Extra syntax is already quite well established.


Any updates on this?

+1 for reusing the link style.

The only problem that I have with it is that it’s no longer clear whether you want every occurrence of the abbreviation in the document (or paragraph, as discussed above) to be marked, or whether you just intended to mark the current one.

In my opinion, the * in the common markup denotes the repetition, i.e. “every occurrence found should be replaced”. Therefore, in normal writing I wouldn’t necessarily expect an asterisk to be there.

That’s what I like about the point of @mofosyne: why not add it into the link syntax? The asterisk part in the link/abbr definition would then only be used if you want to mark every occurrence of the abbreviation.

I don’t know if that would ever be useful, but one could even extend this thought to links:

You can use google to search [stuff][1]. I mean: google is your friend. Damn you, just google it!

*[google]: google.com <!-- Link on every occurrence -->

*[friend]: "enemy of the enemy" <!-- Abbr -->

[1]: example.com/stuff <!-- single link -->

PS: there should probably be an option to highlight (or link) only the first occurrence in a paragraph.


I would like to be able to use an abbreviation (or some equivalent functionality) in a URL, e.g.:

*[_wp]: https://en.wikipedia.org/wiki
[_wp_md2]: _wp/Markdown
...
[Markdown][_wp_md2]

FWIW, some wiki engines (e.g., Foswiki, TWiki) support the use of “variables”:

... [[%WP_MD%][Markdown]] ...
...
  * Set WP        = https://en.wikipedia.org/wiki
  * Set WP_MD     = %WP%/Markdown

Also, many wiki engines support Interwiki links.
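
The variable-style substitution shown above can be approximated in a few lines of Python. This is only a sketch of the general idea, not how Foswiki or TWiki implement it; the function name `expand_variables` and the nesting limit are illustrative choices.

```python
import re

# A variable reference looks like %WP% or %WP_MD%.
VAR_RE = re.compile(r"%([A-Z_][A-Z0-9_]*)%")


def expand_variables(value, variables, depth=0):
    """Recursively replace %NAME% with its value (hypothetical helper)."""
    if depth > 10:
        # Crude guard against definitions that refer to themselves.
        raise ValueError("variable nesting too deep (possible cycle)")

    def substitute(match):
        name = match.group(1)
        if name not in variables:
            return match.group(0)  # unknown variable: leave it as written
        return expand_variables(variables[name], variables, depth + 1)

    return VAR_RE.sub(substitute, value)
```

With `WP` and `WP_MD` defined as in the wiki example, expanding `%WP_MD%` yields the full Wikipedia URL, which is the behaviour being asked for with `_wp` and `_wp_md2`.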

-r


I would like to point out that the title attribute in <abbr> is optional.

A common practice is to explain an abbreviation on first use. Subsequent uses are marked up with <abbr> so screenreaders know how to pronounce them.

The syntax options discussed so far do not really support that pattern. The Markdown Extra syntax will add a title on every occasion (which really is redundant). The inline syntax proposed by @zzzzBov could be extended by making the title optional:

*[HTML] is awesome!
<abbr>HTML</abbr> is awesome!

Stumbled across this thread while looking for an abbreviations extension for the parser I use.

I appreciate the points and direction and wanted to throw some copper as I’m going to go ahead and try building an extension to the aforementioned: https://github.com/joshbruce/commonmark-abbreviations

  1. Inline over replace all: This covers the only define once point as well as one acronym meaning two different things (though I usually recommend just spelling both out at that point).
  2. ~[]() - Tilde for approximately, or “kind of,” and so on.

Details are in the README. Again, just throwing this into an otherwise dead thread, as the points seemed relevant and unmentioned.

Cheers.


There is certainly value in simply adopting the Markdown Extra convention:

foo abbr bar

*[abbr]: abbreviation expansion

I understand this feels too much like magic with no markup at all in prose. Also, most current implementations do not support automatically generating a definition with full expansion on first use (and maybe they should not):

<p>foo <dfn>abbreviation expansion</dfn> (<abbr>abbr</abbr>) bar</p>

<p>foo <abbr title="abbreviation expansion">abbr</abbr> bar</p>
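
As an illustration of what such automatic first-use output could look like, here is a small Python sketch that reproduces the two paragraphs above. The function name `render_first_use` is hypothetical, the paragraphs are assumed to be plain text, and no current implementation is claimed to work this way.

```python
import html
import re


def render_first_use(paragraphs, name, expansion):
    """Spell the abbreviation out on first use, add a title on later uses."""
    pattern = re.compile(r"\b" + re.escape(name) + r"\b")
    title = html.escape(expansion, quote=True)
    seen = False

    def wrap(match):
        nonlocal seen
        if not seen:
            # First occurrence in the document: full expansion plus a bare <abbr>.
            seen = True
            return f"<dfn>{html.escape(expansion)}</dfn> (<abbr>{name}</abbr>)"
        return f'<abbr title="{title}">{name}</abbr>'

    return [f"<p>{pattern.sub(wrap, paragraph)}</p>" for paragraph in paragraphs]
```

The expansion is written once in the source and rendered in both places, which is what keeps the author from repeating it by hand.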

Therefore, authors would have to do this manually and thus violate the DRY principle:

foo _abbreviation expansion_ (abbr) bar

*[abbr]: abbreviation expansion

I also understand the desire to reuse inline link syntax, but I do not like adding an arbitrary mark in front of the square brackets.

foo ~[abbr](abbreviation expansion) bar

foo ~[abbr] bar

  [abbr]: abbreviation expansion

Instead, I favor reinterpreting square brackets as markup for generic text spans that can get more specific by context, i.e. become a hyperlink or image embedding currently.

foo [abbr]("abbreviation expansion") bar

foo [abbr] bar

  [abbr]: "abbreviation expansion"

For improved backwards compatibility, because the link destination is currently mandatory, one might need to add some kind of empty or void URL:

  [abbr]: <> "abbreviation expansion"
  [abbr]: # "abbreviation expansion"
  [abbr]: ? "abbreviation expansion"
  [abbr]: . "abbreviation expansion"

This could also be used for differentiation.

Draft conventions

  • A text span with its first attribute being a destination attribute <foo> is considered a hyperlink <a>.
  • A text span with an empty destination attribute and its second attribute being an ID attribute #id is considered an anchor <a>.
  • A text span with its first attribute being a quoted title attribute "foo" / 'foo' (but not (foo)) is considered an abbreviation <abbr>.
  • A text span without destination and title attributes or with its first attribute being a parenthetical title attribute (foo) is considered a defined term <dfn>.
  • A text span with its first attribute being a language attribute :en is considered a citation <cite> and if this attribute is empty : it is considered a name.
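
One possible reading of the draft conventions above, written as a small Python classifier. Everything here is an assumption layered on the draft: the attributes are taken to be already split into a list of strings, an empty destination is taken to be written `<>`, and the name `classify_span` is made up for illustration.

```python
def classify_span(attributes):
    """Map a span's attribute list to the kind of element it would become."""
    first = attributes[0] if attributes else ""
    second = attributes[1] if len(attributes) > 1 else ""

    if first == "<>" and second.startswith("#"):
        return "anchor"          # empty destination followed by an #id
    if len(first) > 2 and first.startswith("<") and first.endswith(">"):
        return "hyperlink"       # <destination>
    if len(first) >= 2 and first[0] == first[-1] and first[0] in "\"'":
        return "abbreviation"    # "title" or 'title'
    if first == ":":
        return "name"            # empty language attribute
    if first.startswith(":"):
        return "citation"        # language attribute such as :en
    if not first or (first.startswith("(") and first.endswith(")")):
        return "defined term"    # no attributes, or a (parenthetical title)
    return "unknown"
```

Combinations that the bullets do not cover, such as an empty destination followed by a quoted title, are left as "unknown" here rather than guessed at.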

I can see that and want to focus on the mention of DRY (as that’s where my head is right now) and a distinction in approach to Markdown: I tend to view Markdown as a way to create a plain-text document first and HTML second. The tables extension mirrors this approach, same with footnotes.

With that said, when it comes to abbreviations and acronyms, the US Plain Language Guidance says, in short:

  1. Avoid them when possible.
  2. Define the first instance of use per section of a document. (For this one I usually recommend defining a “section” as anything starting with a header.)

So…

Plain text header

Something about CSS (Cascading Stylesheets) ...blah blah... 
CSS is awesome ...blah blah... CSS has variables

# Markdown header (using inline with tilde)

Something about ~[CSS](Cascading Stylesheets) ...blah blah...
CSS is awesome ...blah blah... CSS has variables

// html translation
<h1>Markdown header (using inline with tilde)</h1>

<p>Something about <abbr title="Cascading Stylesheets">CSS</abbr> ...blah blah...
CSS is awesome ...blah blah... CSS has variables</p>

# Markdown header (using extension format - replace all)

Something about CSS ...blah blah...
CSS is awesome ...blah blah... CSS has variables.

*[CSS]: Cascading Stylesheets

// html translation
<h1>Markdown header (using extension format - replace all)</h1>

<p>Something about <abbr title="Cascading Stylesheets">CSS</abbr> ...blah blah...
<abbr title="Cascading Stylesheets">CSS</abbr> is awesome ...blah blah... <abbr title="Cascading Stylesheets">CSS</abbr> has variables.</p>

In the final case, the HTML is marked up semantically on every use. I’m not sure of the impact on assistive technologies such as screen readers. That said, as plain text it breaks the recommendation, turning acronyms and abbreviations into a glossary mechanism when uncompiled.

As I push forward using this in my own pipeline I should be able to get more information related to other points made.

Cheers.