I think that’s a mistake. HTML defines these attributes to contain plain text, so a screen reader will announce ![*this*]() as some variation of:
less than em greater than this less than slash em greater than
And !(img "*this*") will render as <em>this</em> in the tooltip (where tooltips are supported).
This gives readers text that is completely different from what the author wrote. It’s not something anybody would want; it just exposes an implementation quirk.
I think formatting should stay allowed in Markdown’s image alt/title syntax (so that conversion straight to something more expressive than HTML can take advantage of it), but Markdown-to-HTML converters must be required to handle the limitations of HTML attributes gracefully, without throwing unparsed angle brackets at users.
The simplest rule may be: parse alt and title as Markdown, strip all HTML tags, HTML-escape the result where necessary:
![*foo*]() <img alt="foo">
![\*foo\\*]() <img alt="*foo\*">
![& & "]() <img alt="&amp; &amp; &quot;">
![<div>1 < 2</div>]() <img alt="1 &lt; 2">
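For what it's worth, the "strip tags, then escape" half of that rule can be sketched in a few lines of Python using only the standard library. This is only an illustration, not any implementation's actual code: the function name `alt_from_rendered_html` is made up, and it assumes the alt text has already been run through an inline Markdown renderer, so its input is an HTML fragment.

```python
from html import escape
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects character data and silently drops every tag."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &lt; before handle_data
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def alt_from_rendered_html(rendered: str) -> str:
    """Hypothetical helper: turn rendered inline HTML into a safe attribute value."""
    parser = _TextOnly()
    parser.feed(rendered)
    parser.close()  # flush any text still buffered by the parser
    plain = "".join(parser.chunks)
    # Re-escape &, <, > and quotes so the text is valid inside alt="..."
    return escape(plain, quote=True)


print(alt_from_rendered_html("<em>foo</em>"))         # foo
print(alt_from_rendered_html("<div>1 &lt; 2</div>"))  # 1 &lt; 2
```

The point of decoding and then re-escaping is that the attribute never sees a raw tag or a bare quote, whatever the author typed.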
I think the right behaviour is to not parse markup in image alt at all. The current spec has 5 tests for this case. IMHO, such behaviour should not be pushed onto other parsers. It would be nice to remove those tests from the spec. Also, I don’t like the idea of doing parse + cleanup, because it overcomplicates things.
It looks like the reference implementation has peculiarities that were promoted to the spec without strong reasons. Here is our JS implementation to compare. It now passes all tests except these.
Since this issue has been mentioned in other topics as well (including https://github.com/jgm/CommonMark/issues/145), I thought I would bump it up and add my vote for the scenario where the parser parses the inlines in alt/title but the HTML renderer outputs plain text.
The reasoning is that it seems very likely that people would want to create renderers that would output these attributes like this:
Another opinion is that the writer should not need to know that, for example, emphasis is not supported in the image title and that in this context *foo* will be output literally. If the parser can output warnings, it should do so; but otherwise it should output the HTML according to the principle of least surprise.
That brings us back to the question of “what is Markdown’s soul”. Is it a minimalistic markup or a universal one?
If anyone wants markup in a caption that much for a custom renderer, they can make a nested call on the title content. Let’s keep things simple. As far as I understand, the primary target for Markdown is HTML. IMHO it should be correct by default, without sanitizing kludges.
It would be fairly straightforward to add to the renderer a function renderInlinesPlain, which renders the inlines without any tags or formatting. This would mean that ![*foo*](/url) would be rendered <img src="/url" alt="foo" />, not <img src="/url" alt="*foo*" />. This would give people the flexibility to write a custom renderer that renders the alt text as a formatted caption, while still outputting HTML that is guaranteed to be correct.
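To make the idea concrete, here is a rough Python sketch of what such a plain-text inline renderer could look like. It is not the reference implementation: the `Node` shape, the node kinds, and the names `render_inlines_plain` and `render_image` are assumptions made up for this illustration.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List


@dataclass
class Node:
    """Hypothetical inline AST node; real parsers use their own shapes."""
    kind: str                      # e.g. "text", "code", "emph", "strong", "image"
    literal: str = ""              # text content for leaf nodes
    children: List["Node"] = field(default_factory=list)
    dest: str = ""                 # URL, only meaningful for "image"


def render_inlines_plain(nodes: List[Node]) -> str:
    """Concatenate the text of the inlines, ignoring all formatting."""
    out = []
    for node in nodes:
        if node.kind in ("text", "code"):
            out.append(node.literal)
        else:
            # Containers (emph, strong, ...) contribute only their children's text
            out.append(render_inlines_plain(node.children))
    return "".join(out)


def render_image(node: Node) -> str:
    """HTML renderer for an image: alt is plain text, escaped for an attribute."""
    alt = escape(render_inlines_plain(node.children), quote=True)
    src = escape(node.dest, quote=True)
    return f'<img src="{src}" alt="{alt}" />'


# AST for ![*foo*](/url)
image = Node("image", dest="/url",
             children=[Node("emph", children=[Node("text", "foo")])])
print(render_image(image))  # <img src="/url" alt="foo" />
```

Because the parsed children stay in the AST, a custom renderer could still walk `image.children` with its normal inline renderer to produce a formatted caption.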
IMHO, that’s not very clear until the details are defined in the spec. I see many problems. First, an image is an inline tag, while the alternate examples are blocks. That’s a potential conflict. Second, such markup does not fit well into the AST provided by HTML parsers, because an image is not a container.
My suggestion is to keep things simple (leave all src intact) until the spec better defines all levels of alternate image use (markup attributes, AST, presentation). Freezing the img HTML output format now does not guarantee anything in the end. It just adds unnecessary resyncs for all other implementations.
This can be solved very easily by marking tests as internal / unstable / experimental:
.. <- two points
not mandatory src
This can be moved to mandatory later, when the use cases are finalized.