What should the Rust community do for linkage?

steveklabnik · June 29, 2016, 12:45pm

Hey everyone!

I work on the documentation for the Rust programming language. We use Markdown for all our docs. It’s been largely great; we had a debate long ago about which format to use, and Markdown won.

However, there’s one thing about Markdown which is an actual pain point for us, and that’s when it comes to writing API documentation. When referring to another type, having links to that type is great, but it can be a real pain to generate by hand. We currently recommend something like this:

A major drawback of Markdown is that it cannot automatically link types in API documentation.
Do this yourself with the reference-style syntax, for ease of reading:

    /// The [`String`] passed in lorum ipsum...
    ///
    /// [`String`]: ../string/struct.String.html

This is manageable, but less than ideal. First of all, it’s a lot to write by hand. But secondly, it can lead to errors: in Rust, a type can appear under two namespaces, which means that these relative links get broken. As an example, BTreeSet is in two places:

All of the links work in the first case, but not the second case.
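To make the failure mode concrete, here is a minimal sketch of how a relative link resolves against the page that contains it. The page paths `docs/outer/...` and `docs/outer/inner/...` are hypothetical stand-ins for a type that is documented at two depths, and `resolve_relative` is an illustrative helper, not part of rustdoc. Only the link text `../string/struct.String.html` comes from the example above.

```rust
/// Resolves a relative link against the page that contains it.
/// Handles only `..`, `.`, and plain path segments (illustrative helper).
fn resolve_relative(page: &str, link: &str) -> String {
    // Start from the directory that contains the page.
    let mut parts: Vec<&str> = page.split('/').collect();
    parts.pop();
    for segment in link.split('/') {
        match segment {
            // `..` climbs one directory.
            ".." => {
                parts.pop();
            }
            // `.` and empty segments change nothing.
            "." | "" => {}
            other => parts.push(other),
        }
    }
    parts.join("/")
}
```

With the page at `docs/outer/struct.Thing.html` the link resolves to `docs/string/struct.String.html`, but the same doc comment rendered at `docs/outer/inner/struct.Thing.html` resolves to `docs/outer/string/struct.String.html`, which is a different (and presumably missing) page.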

The basic position of the Rust team has been “when CommonMark has an extensibility mechanism, we can use it to generate these kinds of links.” I personally care a lot about being compliant; I’d hate to be contributing to even further fragmentation of the Markdown ecosystem. But, as it was helpfully pointed out to me in another thread, it seems like that might take some time. In the meantime, people are feeling a lot of pain around this, and have made various proposals for what we should do; even suggesting we shouldn’t do the linking until it can be made automatic.

All of that is a lot of context and preamble for my real question: What do you all think Rust should do here? Should we invent some sort of Markdown syntax for this, and then try to propose it as an extension? Should we wait until the extensibility proposal is accepted someday, and just deal in the meantime? Should we use regular links, but create some kind of URL scheme that we post-process into “real” URLs? Do some post-processing on our “markdown” files to add the links in ourselves? Something else?

Any thoughts or guidance would be appreciated.

Side note: Due to being a new user, I had to remove a bunch of links in this post, thanks to an anti-spam measure. That’s totally fine, but amusing in a thread about how to properly add links.

jgm · June 30, 2016, 6:29am

Can’t this problem be solved by just generating the list of reference definitions programmatically?
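One way this could look, as a rough sketch rather than anything rustdoc actually does: a pre-processing pass appends a reference definition for every known type that a doc comment mentions as [`Name`]. The function name `append_reference_definitions` and the `known` map from type name to URL are assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Appends a Markdown reference definition for every known type that the
/// doc text mentions as [`Name`]. `known` maps a type name to its page URL
/// (a hypothetical index that the doc tool would have to build).
fn append_reference_definitions(doc: &str, known: &BTreeMap<&str, &str>) -> String {
    let mut out = String::from(doc);
    let mut defs = Vec::new();
    for (name, url) in known {
        let label = format!("[`{}`]", name);
        let definition_prefix = format!("{}:", label);
        // Emit a definition only when the label is used and the author
        // has not already defined it by hand.
        if doc.contains(&label) && !doc.contains(&definition_prefix) {
            defs.push(format!("{} {}", definition_prefix, url));
        }
    }
    if !defs.is_empty() {
        out.push_str("\n\n");
        out.push_str(&defs.join("\n"));
    }
    out
}
```

The output stays plain CommonMark, so no new syntax is needed; the cost is that the tool has to know the correct URL for each type relative to the page being rendered.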

xoofx · June 30, 2016, 12:24pm

This problem is usually solved by using domain-specific URLs, as you said, where the links use custom semantics without referencing a fixed path on disk (e.g. api:my_module.my_function.tparam1.tparam2).

Then you don’t really need particular extensions to handle this, just a post-processing step, as you suggested.

The generated API pages contain the link metadata (e.g. in front matter), and the doc generator can extract this information and glue it together while patching the Markdown links. It is also possible to work on the HTML output instead: process all the a elements and patch them from there (this enables mixed HTML/Markdown content, which can sometimes be quite convenient).

The docfx project for the .NET documentation uses a similar system, with uids and xref links.
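A minimal sketch of such a post-processing pass, assuming links are written as `[text](api:UID)` and that the doc generator has already built an index from UID to final URL. The `api:` prefix follows the example above; `resolve_api_links` and the index contents are hypothetical, and this is not how docfx itself is implemented.

```rust
use std::collections::HashMap;

/// Rewrites every `](api:UID)` link target to the URL registered for UID.
/// Unknown UIDs are left untouched so a later pass can report them.
fn resolve_api_links(markdown: &str, index: &HashMap<&str, &str>) -> String {
    const MARKER: &str = "](api:";
    let mut out = String::new();
    let mut rest = markdown;
    while let Some(start) = rest.find(MARKER) {
        let uid_start = start + MARKER.len();
        // The UID runs up to the closing parenthesis of the link.
        let len = match rest[uid_start..].find(')') {
            Some(len) => len,
            None => break,
        };
        let uid = &rest[uid_start..uid_start + len];
        out.push_str(&rest[..start]);
        match index.get(uid) {
            // Known UID: substitute the real URL.
            Some(url) => {
                out.push_str("](");
                out.push_str(url);
            }
            // Unknown UID: keep the original target.
            None => {
                out.push_str(MARKER);
                out.push_str(uid);
            }
        }
        out.push(')');
        rest = &rest[uid_start + len + 1..];
    }
    out.push_str(rest);
    out
}
```

Because the source text only contains ordinary Markdown links, any CommonMark parser can still read it; the custom scheme matters only to the patching step.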

gjtorikian · July 2, 2016, 4:03am

Hello @steveklabnik! I’ve been doing API documentation for over ten years. I’d like to share with you a syntax we’ve been using at GitHub.

For the Atom documentation, I helped design the format for atomdoc. For your specific question, we’ve been using curly braces ({...}) to denote types. We have an example right there in the README:

    * `count` {Number} representing count
    * `callback` {Function} that will be called when finished
      * `options` Options {Object} passed to your callback with the options:
        * `someOption` A {Bool}
        * `anotherOption` Another {Bool}

I can’t remember where I first saw this syntax, but it might have been from jsduck.

Post-processing is a must for this to work. In fact, in my opinion, for proper documentation of an API, you must have a processing step and a rendering step. I don’t know the details of how Rust’s documentation is generated, but here’s how it might go down (as we do for Atom):

Processing

- Iterate over all the files and extract the comments.
- From the comments, generate an AST that conforms to some syntax.
- Ensure that the AST also contains information about the item that’s being documented.

That last bullet point is key. For example, here’s a truncated snippet from some Atom code:

    # Essential: This class represents all essential editing state for a single
    # {TextBuffer}, including cursor and selection positions, folds, and soft wraps.
    module.exports =
    class TextEditor

As it’s reading a file, Atom’s documentation tool says:

- Okay, I have a comment, I better start reading it all in.
- Okay, the thing right after a comment looks important. (In this case, it sees that it’s a class, TextEditor.)
- It records TextEditor as a type in its list of classes.

It’s kept both the comment and the context for that comment in memory.
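The processing step can be sketched roughly as follows. This is a simplified illustration transposed to Rust-style `///` comments, not Atom’s actual tooling: `DocItem` and `extract_doc_items` are hypothetical names, and a real tool would build a proper AST instead of keeping raw lines.

```rust
/// A doc comment together with the declaration it documents.
#[derive(Debug, PartialEq)]
struct DocItem {
    comment: String,
    item: String,
}

/// Collects runs of `///` lines and pairs each run with the first
/// non-blank, non-comment line that follows it.
fn extract_doc_items(source: &str) -> Vec<DocItem> {
    let mut items = Vec::new();
    let mut pending: Vec<&str> = Vec::new();
    for line in source.lines() {
        let trimmed = line.trim();
        if let Some(text) = trimmed.strip_prefix("///") {
            // Still inside a comment: keep reading it in.
            pending.push(text.trim());
        } else if !trimmed.is_empty() && !pending.is_empty() {
            // The thing right after a comment is the documented item.
            items.push(DocItem {
                comment: pending.join("\n"),
                item: trimmed.to_string(),
            });
            pending.clear();
        }
    }
    items
}
```

Keeping the item next to its comment is what later lets the renderer know which types exist and where their pages live.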

Rendering

- With the documentation AST processed, walk the AST.
- Render that AST into HTML.
- As a new type is found (like a curly reference to a class), go and look up that class to make sure it exists.
  - If it does, generate a link.
  - If it does not, pass.
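The lookup at the end of the rendering step might look like this in Rust. It is a sketch under stated assumptions: `link_type_references` is a hypothetical helper, `known_types` stands in for the list of classes gathered during processing, and the `Name.html` URL layout is invented for the example.

```rust
use std::collections::HashSet;

/// Replaces each `{Name}` with an HTML link when `Name` is a known type,
/// and leaves the text as-is otherwise.
fn link_type_references(text: &str, known_types: &HashSet<&str>) -> String {
    let mut out = String::new();
    let mut rest = text;
    while let Some(open) = rest.find('{') {
        let close = match rest[open..].find('}') {
            Some(offset) => open + offset,
            None => break,
        };
        let name = &rest[open + 1..close];
        out.push_str(&rest[..open]);
        if known_types.contains(name) {
            // Known type: generate a link (hypothetical URL layout).
            out.push_str(&format!("<a href=\"{0}.html\">{0}</a>", name));
        } else {
            // Unknown type: pass the original text through unchanged.
            out.push_str(&rest[open..=close]);
        }
        rest = &rest[close + 1..];
    }
    out.push_str(rest);
    out
}
```

Passing unknown names through unchanged means a typo produces unlinked text instead of a broken link; a stricter tool could emit a warning at that point.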

I only just started (literally, this week) looking into learning Rust. I’d love to help out in any aspect of this, as I’ve found the Rust doc content to be some of the most accessible to a total newbie.

The tools used by Atom are:

Hope this helps!

steveklabnik · July 7, 2016, 12:12am

Yeah this is great. Thanks so much!

(And to everyone else in this thread)