Image with link inside - Example 459 is not very consistent

Example 459

![foo [bar](/url)](/url2)
<p><img src="/url2" alt="foo bar" /></p>

Based on the spec, [bar] is interpreted as a link and then converted to literal text because it is inside the image text. Is this good behaviour?

I think not allowing image text to embed a link is a good option, so it would render as:

<p>![foo <a href="/url">bar</a>](/url2)</p>

+++ textnut [Jan 20 15 11:00 ]:

Example 459

![foo [bar](/url)](/url2)
<p><img src="/url2" alt="foo bar" /></p>

Based on the spec, [bar] is interpreted as a link and then converted to literal text because it is inside the image text. Is this good behaviour?

I think not allowing image text to embed a link is a good option, so it would render as:

<p>![foo <a href="/url">bar</a>](/url2)</p>

As I say in this note,

Links inside image descriptions (and even inline images) might make sense if
the description is used as an image caption (as it is for links in
paragraphs by themselves, in some Markdown extensions). When the
description is used as the image’s alt text, a plain-text version of
the embedded link or image can be substituted.
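
The plain-text substitution described in that note can be sketched roughly as follows. This is only an illustration, not the reference implementation: the `Text`, `Link`, and `Image` node classes and the `plain_text` and `render_image_html` helpers are hypothetical names made up for this example.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List, Union


@dataclass
class Text:
    literal: str


@dataclass
class Link:
    destination: str
    children: List["Inline"] = field(default_factory=list)


@dataclass
class Image:
    destination: str
    children: List["Inline"] = field(default_factory=list)


# Hypothetical inline node type for this sketch only.
Inline = Union[Text, Link, Image]


def plain_text(nodes):
    """Flatten inline nodes to their text content, dropping link/image markup."""
    parts = []
    for node in nodes:
        if isinstance(node, Text):
            parts.append(node.literal)
        else:
            # Links and nested images contribute only their textual content.
            parts.append(plain_text(node.children))
    return "".join(parts)


def render_image_html(image):
    """Render an image, using the flattened description as the alt attribute."""
    src = escape(image.destination, quote=True)
    alt = escape(plain_text(image.children), quote=True)
    return f'<img src="{src}" alt="{alt}" />'


# Example 459: ![foo [bar](/url)](/url2)
example_459 = Image("/url2", [Text("foo "), Link("/url", [Text("bar")])])
print(render_image_html(example_459))  # <img src="/url2" alt="foo bar" />
```

The nested link stays in the tree; only the HTML renderer chooses to flatten it.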

I couldn’t think of any use for links within links, hence the asymmetry.
However, see

@jgm, I’m not an expert in the spirit of Markdown, but the described case looks like a design error or misuse.

If [] is reserved for alt, it can’t hold anything but a text alternative to the image. The description should be in the title. Before discussing possible content markup, we need to clarify what each markup component is for.

Original Markdown was designed as an easy way to write HTML. Our vision is a bit broader: CommonMark is a way of writing structured documents that can be rendered into a variety of different formats.

The spec calls the part in brackets in an image the “image description,” not the alt text, which is an HTML-specific notion. Nothing about the notion of an image description limits it to plain text. And in fact, the syntax of Markdown generally allows formatting within the square brackets in links and images. We need to parse that formatting, at least to a degree, in parsing links and images. So, why not retain it in the AST, even if it is not going to be used in every renderer?

To give one example of where this extra information might be useful: pandoc and some other implementations have an extension that renders an image in a paragraph by itself as a figure. The image description is then used as the caption. Obviously, we want a caption to be able to contain formatting and links.
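
As a rough sketch of why a figure-style renderer would want the formatted description: the tuple-based AST and the `inline_html`, `inline_text`, and `render_image` helpers below are hypothetical, and the `<figure>` output is only an assumption about what such an extension might emit, not a description of what pandoc actually produces.

```python
from html import escape

# Hypothetical minimal AST for this sketch:
#   ("text", literal)  or  ("link", destination, children)


def inline_html(nodes):
    """Render inline nodes as HTML, keeping links intact."""
    out = []
    for node in nodes:
        if node[0] == "text":
            out.append(escape(node[1]))
        elif node[0] == "link":
            href = escape(node[1], quote=True)
            out.append(f'<a href="{href}">{inline_html(node[2])}</a>')
    return "".join(out)


def inline_text(nodes):
    """Flatten inline nodes to plain text for use in an attribute."""
    return "".join(n[1] if n[0] == "text" else inline_text(n[2]) for n in nodes)


def render_image(src, description, as_figure=False):
    """Render the same parsed description as alt text or as a caption."""
    alt = escape(inline_text(description), quote=True)
    img = f'<img src="{escape(src, quote=True)}" alt="{alt}" />'
    if not as_figure:
        return f"<p>{img}</p>"
    return f"<figure>{img}<figcaption>{inline_html(description)}</figcaption></figure>"


description = [("text", "foo "), ("link", "/url", [("text", "bar")])]
print(render_image("/url2", description))
print(render_image("/url2", description, as_figure=True))
```

Both outputs come from one parsed description; only the caption form needs the link to have survived parsing.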

Anyway, this was my thinking. I am open to discussion.

You are right that, at least for HTML, title seems a better place to put an “image description.” But the square brackets form a more natural container for formatted inline content than the quotes; we already have to deal with precedence rules for the square brackets, because of links. Less confusing for the user if [...] always functions as a container for formatted inline elements, and "..." as a container for raw string content.

Of course, one could switch things around:

![title here](/url "alt text")

I think that would have been a better design, probably. But it would be backwards incompatible.

I agree that sometimes using [] for the description is more convenient for humans (swapping the legacy alt/title definitions). It looks like a target-dependent thing, and that’s not good.

IMHO it’s caused by an improper attempt to generalise the use of IMG. An IMG can’t have text markup elements, by design. (Image + complex description) is something like FIGURE + FIGCAPTION.

I’m not touching AST details; the collision happens before the AST, at the markup definition level. Until the spec fixes a unified mapping principle for the ![..](... "...") components, the markup collision will cause endless pain for implementations (AST structure + code).

One solution could be to invent separate markup for figures with captions, for when a caption is really needed. The alternative is to define “special cases” for the use of img markup (but, honestly, I’m not sure that’s possible).

Could you elaborate on why this is proving complicated in your parser? It was very straightforward for me to implement, and it is actually simpler to parse everything into the AST and then let the renderer decide what to do with any nested content it does not support.

I should add that we haven’t just left this as a “puzzle for the reader.” The reference implementations contain an algorithm for efficient parsing of nested emphasis, links, and images (due mostly to @Knagis) that conforms to the spec and does not backtrack. You could use the same strategy – and it would eliminate the need for your artificial nesting limits.

That’s not a technical problem, that’s a logical problem. When a data structure has “special cases”, the result is more complex data and more complex logic. The current AST just reflects the spec’s cases. If the spec has “strange cases”, those are reflected as “strange cases” in the AST.

A direct example: the image alt attribute has to be rendered in a special way for HTML.

@jgm, I have nothing against the reference implementation, and I think the information exchange goes both ways, with benefits for both parties. Our priority is to have pluggable syntax, and some nesting limit is an acceptable price for it. I’m sure any parser can be done better if someone wishes, but I’m OK with the current one.

Anyway, the problem is not in the implementations, but in the spec for image markup use.

Why, considering that

[foo *bar* foo](/url)

is normal approach, you consider

![foo *bar* foo](/url)

to be a “special case”?

In any case - the point @jgm is making is that ![..]() is the syntax for image, not html <img> tag - and there is no point of pulling html limitation (no markup in description attribute) into CommonMark. Although, I am the one who thinks that the AST should also contain parsed nested links and then let the renderer decide what to do with them.

@Knagis, the problem is that the spec allows image markup to be interpreted as <img> OR as <figure> + <figcaption> WITHOUT a clear definition of when each form should be used. And those have different “convenient” locations for the “image description”.

  1. For the <img> tag, the description should be after the URL.
  2. For <figure>, the description is convenient before the URL.

Each separate case is perfect, but they are not consistent together. That produces questions like “should we clear markup or not”, which can’t be answered.

I don’t see a logical problem here. The spec defines the function from source text to AST, and makes it clear that the contents of [...] in an image are to be parsed as formatted inlines and represented as such in the AST. That’s perfectly logical, and it maintains a parallel between how links and images are treated. To me, it would be much more of a “special case” if [...] in one context were a container for formatted inline text, and in another a container for plain text. If there’s going to be a special case, better for it to be in the description of how the AST is to be rendered to HTML rather than how the document is to be parsed.

My preference would be not to say anything about HTML rendering in the spec, making the spec purely a definition of the function from source text to AST. If we move to an XML representation of the AST in the spec examples, we could in principle remove any language about the alt attribute. However, leaving this undetermined would make it difficult to test implementations that only produce HTML, and make it less clear what counted as a conforming parser, so it’s probably good to keep it.

Note that if we have any ambition of targeting multiple formats, there are going to be many cases where an AST element must be treated differently in different output formats. For example, take the info string in a fenced code block. In HTML output, we add a class language-* using the first word of the info string, if present. But we’d do something different in LaTeX output, and probably just ignore the info string in man page output. (In a direct XML representation of the AST, we include the entire info string, not just the first word.) So, if the spec says anything about rendering, there are always going to be special cases like the one you describe. Another example is the distinction between tight and loose lists, which may not make sense in some formats.
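
To make the info-string example concrete, here is a minimal sketch of two renderers treating the same node differently. The function names and the `<code_block info="...">` XML shape are assumptions made for illustration; only the `language-*` class built from the first word comes from the description above.

```python
from html import escape


def render_code_block_html(literal, info=""):
    """HTML output: only the first word of the info string is used."""
    words = info.split()
    cls = f' class="language-{escape(words[0], quote=True)}"' if words else ""
    return f"<pre><code{cls}>{escape(literal)}</code></pre>"


def render_code_block_xml(literal, info=""):
    """AST-style XML output: the whole info string is kept (hypothetical shape)."""
    attr = f' info="{escape(info, quote=True)}"' if info else ""
    return f"<code_block{attr}>{escape(literal)}</code_block>"


code = "x = 1\n"
info = "python startline=3"
print(render_code_block_html(code, info))
print(render_code_block_xml(code, info))
```

A man page renderer in this style would simply ignore `info`, which is the same kind of per-format decision as flattening an image description for `alt`.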

There is a definite answer in the spec: “yes.”

The “image by itself becomes a figure” idea I was referring to would be an extension or transformation; it’s not part of the spec, and I agree that it makes more sense to handle figures more explicitly.

I don’t agree with your approach of defining only the markup -> AST path for the data flow. That’s not enough to guarantee data integrity.

To verify data flow, the system should be considered as a black box with input and output. You don’t see a logical collision only because it’s like “let’s define behaviour for 1/2 of the black box”. The AST cannot be referred to as output, because it’s intermediate data. I don’t have experience with all possible output formats, but I think I can point out when something is disputable with HTML.

IMHO, this problem did not exist in original Markdown. It appeared with the attempt to generalize the markup into something wider, when the spec allowed an image to be rendered in different forms (which also have different mappings for the “description” location).

Note, I skipped the case “can the description contain markup or not”. But how are you going to resolve the case where the description location in the markup is different? Without that you will get images “for HTML only”, “for PDF only” and so on. Is that acceptable?

IMHO, the current resolution with captions is “intuitive”, as all of Markdown was for the last 10 years, until you started writing the spec. And IMHO, if you want a wider spec, it’s worth finding a clearer solution for image captions too.

Yes, that’s the root of the problem. If we manage to define how to handle figures, that will remove all questions like “where to place the description” and “should the description contain markup or not”.

Because figures are referred to as proof of “why we should have those features”, but the transformation process itself is not clear (for figures).

Well, forget about figures for now, that’s not the main issue. I think the main argument for including the structured inlines in the image description is for consistency with how links are handled. (And, if we ever introduce a generic span-like container, it would make sense for it to use brackets too.) We need to parse inline structure inside ![...] anyway, and it’s easy to do that using the same machinery we use for links. So, why throw away the information once we’ve got it? As I said above, and I think this is @Knagis’s point too, it would be inconsistent for [...] sometimes to be an inline container, sometimes a plain text container (but one that is sensitive to precedence rules governing inline content!). How the contents are to be rendered is a separate issue.

No problem. Let’s drop figure cases and leave only links with images.

Then for me the answer is very simple: your examples are an implementation peculiarity, and implementation should not affect the spec. If ![…] can only be an HTML image, then its content can only be pure text. Interpreting it any other way is an overcomplication (for the spec) and an implementation dependency.

My English is not perfect; please don’t consider it offensive. I don’t mean anything against the spec and reference code. I’m only applying pure logic to the initial conditions (what if we have links and images only).

Then we should really consider the image reference as a picture with an optional caption instead of an image.

I vote on this format:

![title here](/url "alt text")

My point is that [title here] should be consistent between link and image because, in the spec, both are defined as link text. It can include emphasis, code etc., but it CANNOT include another link. For HTML, the image title can ignore the emphasis at rendering. But the alt text can be any text, even including link syntax.

That will break compatibility. IMHO figures with captions are a very separate issue, for a big discussion.

But I don’t see a technical problem in storing both the source text and the parsed data in the reference object and taking the proper form depending on usage. That’s only implementation internals, not a markup inconsistency.
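
One way to read that suggestion, as a very small sketch: the `ImageDescription` class, its field names, and the `"alt"`/`"caption"` usage labels are all hypothetical, and the parsed form is shown as plain tuples rather than any real parser's node types.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ImageDescription:
    """Keeps both forms of the bracketed text so a renderer can pick one."""

    source: str          # raw text between the brackets, unparsed
    parsed: List[Tuple]  # parsed inlines (hypothetical tuple-based nodes)

    def form_for(self, usage):
        # "alt" wants the untouched source text; anything else gets the inlines.
        return self.source if usage == "alt" else self.parsed


desc = ImageDescription(
    source="foo [bar](/url)",
    parsed=[("text", "foo "), ("link", "/url", [("text", "bar")])],
)
print(desc.form_for("alt"))
print(desc.form_for("caption"))
```

Note that this variant would put the raw `foo [bar](/url)` into the alt text, which differs from the `foo bar` that Example 459 currently specifies.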