The different kinds of italics in HTML

Semantic HTML distinguishes the following kinds of italics:

  1. Emphasizing text (which affects how you’d read it out loud):
    Now <em>that</em> is a good idea.
    
  2. Definition (introducing new terms):
    A <dfn>bicycle</dfn> is a vehicle with two wheels.
    
  3. Citing names, book titles, etc.:
    I enjoyed <cite>His dark materials</cite>.
    
  4. Mentioning foreign-language text:
    She said <i lang="fr">au revoir</i>.
    

How would you express 2–4 in pure Markdown (without HTML tags)? Two options:

  • New markup. Not sure what that would look like.
  • Pandoc spans:
    [bicycle]{.dfn}
    [His dark materials]{.cite}
    [au revoir]{.i lang="fr"}
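
To make the span option concrete: Pandoc itself renders a bracketed span such as [bicycle]{.dfn} as a generic span element with a class, so getting the semantic tags from section 1 takes an extra mapping step. Here is a minimal Python sketch of that step. The function name, the regular expression, and the set of class names treated as tags are all assumptions for illustration; this is not Pandoc's parser and it handles only the simple forms shown above.

```python
import re
from html import escape

# Assumption for this sketch: these span classes map directly to HTML tags.
SEMANTIC_TAGS = {"dfn", "cite", "i"}

# Matches [text]{.class key="value" ...} in the simple form used above.
SPAN = re.compile(r'\[([^\]]+)\]\{\.(\w+)((?:\s+[\w-]+="[^"]*")*)\}')

def spans_to_html(text: str) -> str:
    """Rewrite simple bracketed spans as semantic HTML elements."""
    def repl(match: re.Match) -> str:
        body, cls, attrs = match.group(1), match.group(2), match.group(3)
        if cls in SEMANTIC_TAGS:
            return f"<{cls}{attrs}>{escape(body)}</{cls}>"
        # Unknown classes fall back to a generic span with that class.
        return f'<span class="{cls}"{attrs}>{escape(body)}</span>'
    return SPAN.sub(repl, text)

print(spans_to_html('She said [au revoir]{.i lang="fr"}.'))
```

With these assumptions, the three spans above come out as the dfn, cite, and i elements from the first list.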
    

The problem with using Pandoc spans: it goes against the point of a human-friendly markup language.

The problem with adding new markup for each of your cases: limited and dwindling markup design space (i.e. the number of unused ASCII characters suitable for use as markup is very limited). There isn’t enough for all of the things people will want to do. For example, the following are already taken up by common extensions, and this list includes neither the uncommon ones nor the proprietary syntax that many tools add on their own:

  • ==highlight==
  • ^superscript^
  • ~subscript~
  • ~~strikethrough~~
  • $inline math$
  • :emoji_shortcode:
  • : definition in a definition list
  • @at_mention
  • #hashtag, but on GitHub, #issue-number
  • [[wikilink]]

One could argue that 2 and 3 are common enough that they deserve to consume precious design space, e.g. =bicycle= and ""His dark materials"" (since quotes are the old-school way of doing citations). But 4 is pretty esoteric, as is evident from how even HTML uses the tag for regular italics, just with an attribute.
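
As a rough illustration of how tight the design space is, here is a Python sketch of the two hypothetical delimiters. Neither =term= nor ""title"" exists in CommonMark; the rule table and function name are made up for this example. Note how much guarding the single = needs just to stay out of the way of ==highlight== and ordinary equals signs.

```python
import re

# Hypothetical delimiters from this discussion; not part of CommonMark.
RULES = [
    # ""title"" becomes a citation.
    (re.compile(r'""(.+?)""'), r"<cite>\1</cite>"),
    # =term= becomes a definition. The lookarounds skip ==highlight==
    # and equals signs that touch word characters, as in x=1.
    (re.compile(r"(?<![=\w])=([^=\s][^=]*?)=(?![=\w])"), r"<dfn>\1</dfn>"),
]

def render_inline(text: str) -> str:
    """Apply each hypothetical inline rule in turn."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text

print(render_inline('A =bicycle= is a vehicle.'))
print(render_inline('I enjoyed ""His dark materials"".'))
```

Under these rules, ==highlight== and text like x=1 pass through unchanged, which is exactly the kind of collision every new delimiter has to be checked against.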

I’ve been working on a meta-markup language, Plain Text Style Sheets (PTSS), so that people can declaratively define syntax and semantics in a way that is portable across tools that support PTSS. This would avoid consuming the remaining design space with fixed meanings, allowing that space to be used differently when the needs are different (e.g. scientific writing vs journalism). PTSS isn’t quite ready, but I’ve had a working version for well over a year that fully supports CommonMark/GFM[1] as well as completely novel markup languages. Getting closer!


  1. Their full syntax is expressed in PTSS, and I’ve added many extensions using PTSS.

Great feedback, thanks!

  • "" feels like a good choice – I had considered it too.
  • I’m not as sure about =bicycle=. How about ::bicycle::?