What is CommonMark's semantic definition for relative links?

How should relative links in CommonMark be interpreted so the link destination can be determined consistently?

If I have a link destination of “images/image1.png” what is this link relative to?

  • Is it relative to the root of your CommonMark file system structure?
  • Is it relative to the directory where the markdown file containing the relative link is located?
  • Is it relative to the directory from which the person is reading the Markdown text?
  • Is it intentionally left undefined?

I know GitHub Markdown resolves relative links against the Markdown file that contains them, but I could not find anything in the CommonMark spec saying that CommonMark does the same.

The spec only states that a link destination is a URI with the following constraints:

A link destination consists of either

  • a sequence of zero or more characters between an opening < and a closing > that contains no line endings or unescaped < or > characters, or
  • a nonempty sequence of characters that does not start with <, does not include ASCII control characters or space character, and includes parentheses only if (a) they are backslash-escaped or (b) they are part of a balanced pair of unescaped parentheses. (Implementations may impose limits on parentheses nesting to avoid performance issues, but at least three levels of nesting should be supported.)

As you can see from the constraints above and from the spec’s many examples for Links, the URI is almost any string and is passed through as-is. This is consistent with Gruber Markdown and with nearly all other variants: most pass the URI through as-is. A few “URL escape” spaces as %20 and one disallows spaces, but otherwise they make no changes.
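To make the two quoted forms concrete, here is a minimal Python sketch of a check for them. The function name is_link_destination is made up for illustration, and the code only mirrors the quoted wording (with backslash escapes assumed to apply to ASCII punctuation); it is not taken from any CommonMark implementation and ignores nesting limits. Note that it says nothing about what the destination is relative to.

```python
import string


def is_link_destination(s: str) -> bool:
    """Rough check of s against the two quoted forms (illustrative only)."""
    if s.startswith("<"):
        # Form 1: <...> with no line endings and no unescaped < or > inside.
        if len(s) < 2 or not s.endswith(">"):
            return False
        inner = s[1:-1]
        i = 0
        while i < len(inner):
            c = inner[i]
            if c == "\\":
                if i + 1 == len(inner):
                    # The backslash escapes the closing >, so nothing closes.
                    return False
                if inner[i + 1] in string.punctuation:
                    i += 2  # skip the escaped punctuation character
                    continue
            if c in "<>\r\n":
                return False
            i += 1
        return True

    # Form 2: nonempty, no space or ASCII control characters,
    # parentheses either escaped or balanced.
    if not s:
        return False
    depth = 0
    i = 0
    while i < len(s):
        c = s[i]
        if c == "\\" and i + 1 < len(s) and s[i + 1] in string.punctuation:
            i += 2  # escaped character, including escaped parentheses
            continue
        if c == " " or ord(c) < 0x20 or ord(c) == 0x7F:
            return False
        if c == "(":
            depth += 1
        elif c == ")":
            depth -= 1
            if depth < 0:
                return False
        i += 1
    return depth == 0


print(is_link_destination("images/image1.png"))
print(is_link_destination("<my image.png>"))
print(is_link_destination("my image.png"))
```

Both relative and absolute destinations pass this check in exactly the same way, which is the point: the syntax rules do not distinguish them.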

When rendered to HTML and that HTML is rendered by a browser, the URI works according to the rules of the web, including the rules for relative URLs, and according to the logic of the web server asked to serve the response for that URI (i.e. via HTTP GET).
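As a small illustration of those web rules, the sketch below uses Python's standard urllib.parse.urljoin to resolve the same destination against a few base URLs. The example.com URLs and the resolve helper are hypothetical; the point is only that the result depends on the URL of the page the browser loaded, not on anything in the Markdown source.

```python
from urllib.parse import urljoin


def resolve(page_url: str, destination: str) -> str:
    """Resolve a link destination the way a browser would, against the page URL."""
    return urljoin(page_url, destination)


# Hypothetical URLs at which the rendered HTML might be served.
bases = [
    "https://example.com/docs/guide/intro.html",
    "https://example.com/docs/guide/",
    "https://example.com/docs/guide",  # no trailing slash: "guide" is treated as a file
]

for base in bases:
    print(base, "->", resolve(base, "images/image1.png"))
```

The first two bases resolve to https://example.com/docs/guide/images/image1.png, while the third resolves to https://example.com/docs/images/image1.png, so even a trailing slash in the published URL changes what “relative” means.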

GitHub relative links work as they do because the web servers that host GitHub’s web interface to repos translate relative URIs into relative file paths within the repo. They do this because it makes intuitive sense, not because it is specified by any Markdown specification.

Thank you for the reply.

I was hoping to find some semantic definitions to rely on when building, understanding and displaying CommonMark content. Right now it seems we need to look at various implementations to figure this out. Shouldn’t there be some expectation of what relative means, i.e. relative to what? The spec could say it is the implementer’s choice, but it should say something.

The CommonMark spec doesn’t “say something” about relative destination link semantics because Markdown never said anything about it, and CommonMark’s purpose is to provide a spec to lock down the syntax unambiguously, not to provide any new semantics. Doing the latter would be a “breaking change”.

Markdown (and thus CommonMark) probably doesn’t clarify this because it was concerned with (1) the syntax for individual documents which (2) were rendered to HTML. It assumed the user of Markdown would use relative paths to images and other pages appropriate to their HTML publishing setup.

But you are right, many people are concerned with publishing entire assemblies of Markdown pages and resources. It’s not great that each publishing platform has its own rules. For example:

  • GitHub supports relative paths to other pages or resources in the same repo and automatically transforms the paths as necessary for the HTML rendered versions.
  • Pandoc by default expects paths relative to the project root, but can be configured to use page-relative paths with its rebase_relative_paths extension. It does not natively support automatic transformation of links to .md files into links to corresponding published .html files, but you can use a custom Lua filter to make that happen.
  • Hugo until recently required the use of proprietary “shortcodes” to achieve what GitHub does automatically, but apparently has since mimicked GitHub’s behavior.
  • Some setups have no support for this at all, and one is forced to hard-code into the Markdown sources the expected publish URLs of referenced pages and images. This coupling of source and specific publish path is just plain bad.
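To show what the page-relative, GitHub-style behavior in the list above amounts to, here is a minimal Python sketch. It is not how GitHub, Pandoc or Hugo are implemented; the function names and the docs/guide paths are hypothetical, and it assumes POSIX-style paths inside a repo and a publishing step that maps each .md source to an .html page in the same location.

```python
import posixpath


def resolve_in_repo(source_file: str, destination: str) -> str:
    """Resolve a relative destination against the directory of the source file."""
    base_dir = posixpath.dirname(source_file)
    return posixpath.normpath(posixpath.join(base_dir, destination))


def to_published(path: str) -> str:
    """Map a .md source path to its assumed .html output; leave other files alone."""
    root, ext = posixpath.splitext(path)
    return root + ".html" if ext == ".md" else path


source = "docs/guide/intro.md"  # hypothetical file containing the links
for dest in ["images/image1.png", "../api/index.md"]:
    target = resolve_in_repo(source, dest)
    print(dest, "->", target, "->", to_published(target))
```

With rules like these the Markdown sources only ever refer to other source files, and the mapping to published URLs lives in one place instead of being hard-coded into every link.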

I’ve long wanted to address this issue. I’ve put considerable time into developing a proposal which I call TextAssembly. I haven’t pushed my draft spec to the repo yet… right now there’s just the README and the manifesto. I’ve temporarily back-burnered that proposal until I finish a different related project. Feel free to watch or star TextAssembly so you are alerted when I get back to it :)
