Does this link to the song or embed an audio player in the page? It is not clear what the author’s intention was. With an explicit syntax, !(http://example.com/song.mp3) for embeds and <http://example.com/song.mp3> for links, both can be described unambiguously.
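To make the distinction concrete, here is a minimal sketch of how a tool could tell the two explicit forms apart. The `!(url)` form is only the proposal from this post, not existing CommonMark syntax, and the names `EMBED_RE`, `LINK_RE` and `classify` are made up for illustration.

```python
import re

# Hypothetical explicit forms discussed above (not part of the current spec):
#   !(url)  -> embed the media in the page
#   <url>   -> plain link to the media
EMBED_RE = re.compile(r"^!\((?P<url>[^()\s]+)\)$")
LINK_RE = re.compile(r"^<(?P<url>[^<>\s]+)>$")


def classify(source: str):
    """Return ("embed" | "link", url), or None if neither form matches."""
    text = source.strip()
    for kind, pattern in (("embed", EMBED_RE), ("link", LINK_RE)):
        match = pattern.match(text)
        if match:
            return kind, match.group("url")
    # A bare URL stays ambiguous, which is the problem described above.
    return None
```

With this, `classify("!(http://example.com/song.mp3)")` reports an embed and `classify("<http://example.com/song.mp3>")` reports a link, while a bare URL gives `None`.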
@vitaly, how do I distinguish between these cases within a markdown-it plugin? This is the code I wrote some time ago. It overrides the “md.renderer.rules.link_open” renderer rule. How can I tell whether the link being parsed is inline or in a separate paragraph?
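One possible approach, sketched here in Python rather than as a real plugin: as far as I understand, markdown-it puts a paragraph's contents into the children of an `inline` token, with each link wrapped in `link_open` / `link_close` tokens. If that holds, a link "on its own" is one whose siblings are nothing but its own open, content and close tokens. The `Token` class below is a hypothetical stand-in that only models the `type` and `content` fields, and `is_standalone_link` is an illustrative name, not a markdown-it API.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Minimal stand-in for a markdown-it token (only the fields used here)."""
    type: str
    content: str = ""


def is_standalone_link(children):
    """True if the inline children consist of exactly one link and nothing else."""
    # Ignore text tokens that contain only whitespace.
    significant = [
        t for t in children
        if not (t.type == "text" and not t.content.strip())
    ]
    if len(significant) < 2:
        return False
    if significant[0].type != "link_open" or significant[-1].type != "link_close":
        return False
    # Another link_open/link_close in between means there are several links.
    inner = significant[1:-1]
    return all(t.type not in ("link_open", "link_close") for t in inner)
```

This only looks at the inline siblings; it does not tell you whether the enclosing block is a paragraph, a heading or a list item, so a real plugin would probably also need to inspect the block-level tokens.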
Should we change the spec so that instead of “images” it talks of media more generally, and allow the syntax to be used for audio and video? (Renderers could render appropriately for the media.)
My thoughts: for 1.0, rename this to a generic “embedded media” type, with image being the default media type that is rendered and used in the spec examples. The spec could mention that the embedded media syntax may be used to render other types of media based on file extension, the same way that the parser may render soft breaks as hard breaks (with soft breaks being the default). E.g. the spec could state:
A renderer may also provide an option to render the embedded media as a media type other than image, based on the file extension of the media file.
The whitelist of valid URL schemes for autolinks was removed in CommonMark 0.24. Omitting mention of particular file extensions/media types would be consistent with this. A formal list of file extensions is a moving target; keeping this list up to date falls quite far outside the scope of formalising Gruber’s Markdown, but perhaps a later extension could formalise how the media (based on file extension) should be rendered.
@vitaly, that seems equally true for images – can we really “know” that a given URL represents an image without inspecting it? Right now when I use a CommonMark compliant parser and embed an MP4 URL, I get an <img> tag, even though the URL clearly provides useful information to the parser suggesting that this is not in fact what the user intended.
I also think it’s important to distinguish between YouTube/Vimeo style embeds (which rely on third party code) and browser-supported HTML5 video/audio playback (which is an open standard, just like <img> tags).
As a markdown user, I would find a system that inserts standards-compliant <video> or <audio> tags for appropriate embed URLs on the basis of extensions more than sufficient for most practical purposes. Having special syntax to force the embed type might be useful for special cases. But if I want to embed a locally uploaded MP4 video in my markdown blog post, that should “just work” on the basis of the URL.
I do agree that any advanced embedding of third party libraries doesn’t belong in the spec, but embedding a video or audio file in a standards compliant manner is, as far as I can tell, not meaningfully different from the already supported image use case. Am I missing something?
@vitaly, that seems equally true for images – can we really “know” that a given URL represents an image without inspecting it?
You missed the end goal: the output markup. If it is always <img>, there is no need to analyze src, because it does not affect the output. But if the output can be img/audio/video, some additional criteria are required to choose the right one. Assumptions about file extensions in URLs are too weak for a spec, IMHO.
I think a case could be made not to maintain a list of file extensions mapped against <img>, <video> and <audio> in the spec, simply because the file types that frequently updated browsers like Chrome support natively can change pretty often.
The spec should IMO, however, make the recommendation to implementers to vary the output of what’s currently image-only markup based on the file extension of commonly natively supported file formats (with <img> being the default if the extension is not recognized, or if there is none), and provide implementers with sample output for <img>, <video> and <audio>. All three are first-class citizens in HTML5 and it makes no sense to me that markdown would not support them as such.
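As a rough idea of what such sample output could look like, here is a small sketch that renders each of the three elements once the media kind has been decided. The exact attributes (`controls`, using the alt text as fallback content inside `<audio>` and `<video>`) are my assumptions, not anything the spec prescribes, and `render_media` is a made-up name.

```python
from html import escape


def render_media(kind: str, url: str, alt: str = "") -> str:
    """Render one of the three HTML5 media elements for an embed."""
    src = escape(url, quote=True)
    text = escape(alt, quote=True)
    if kind == "audio":
        # Assumption: the alt text doubles as fallback content.
        return f'<audio controls src="{src}">{text}</audio>'
    if kind == "video":
        return f'<video controls src="{src}">{text}</video>'
    # <img> stays the default, matching today's behaviour.
    return f'<img src="{src}" alt="{text}" />'
```

For example, `render_media("audio", "https://someurl.com/some.mp3", "some text")` gives `<audio controls src="https://someurl.com/some.mp3">some text</audio>`.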
The behavior that markup like ![some text](https://someurl.com/some.mp3) produces an <img> embed of a URL that’s obviously not an image is counter-intuitive, and the whole point of using a language like markdown is IMO to provide the most intuitive result for the common cases, using markup that people can keep in their head. To the extent that the spec produces obviously counterintuitive results when implemented, I would argue that it needs to be revised not to do so.
I agree with previous commenters who have suggested extending the syntax to override that behavior on an as-needed basis, but if there’s no consensus what that should look like, I would still recommend changing the default behavior of ![x](y) type invocations in the way described above, i.e. make intelligent guesses based on commonly used extensions, and fall back to image markup if no intelligent guess can be made. I don’t think that would qualify as AI.
Although I disagree with you here, let me thank you for all your hard work on markdown-it; it’s a pleasure to use! And again - I may be missing something obvious and am mostly speaking from the perspective of someone using other people’s markdown parsers in a few different codebases.
Well, I would argue that leaving handling of even basic HTML5 features like <video> and <audio> tags entirely up to implementers is far worse from the perspective of implementation proliferation (and also anachronistic). After thinking about it a bit more, I do think it would be best to include a non-exhaustive list of extensions in the spec, with a note that implementers are free to add formats natively supported by widely used browsers to that list.
The list, AFAICT, would be:
<audio>: default for URLs that end with .wav, .mp3, .ogg (see below)
<video>: default for URLs that end with .mp4, .ogv, .webm
<img>: default for all other URLs
All matching would be case-insensitive.
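The list above could be sketched roughly like this. One assumption on my part: "ends with" is applied to the path component of the URL, so a query string or fragment does not hide the extension. The names `AUDIO_EXTENSIONS`, `VIDEO_EXTENSIONS` and `media_tag` are illustrative only.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Extension lists as proposed above; .ogg is treated as audio by default.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".ogg"}
VIDEO_EXTENSIONS = {".mp4", ".ogv", ".webm"}


def media_tag(url: str) -> str:
    """Pick "audio", "video" or "img" from the URL's file extension."""
    # Assumption: only the path counts, not the query string or fragment.
    path = urlsplit(url).path
    # Lower-casing the suffix makes the match case-insensitive.
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    # Unknown or missing extension: fall back to an image.
    return "img"
```

So `https://someurl.com/some.mp3` maps to `audio`, `http://example.com/CLIP.MP4?t=10` to `video`, and anything unrecognized to `img`.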
.ogg is the most challenging since it is a container format associated with both video and audio in the wild, which has led to proliferation of the .ogv extension. This and similar edge cases would be a reason to offer an override in the markup, but in the most common cases, the above matching should work just fine, IMO.
I think all formats natively supported by widely used browsers should be included in the list to avoid ambiguity. As mentioned earlier, this is a moving target, but we can attempt to keep the list up to date. As browsers add support for new formats, file extensions could be added to the list using a “never remove, only add” strategy.
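To illustrate what "never remove, only add" could mean in practice, here is a small sketch of an extension table that accepts new entries but refuses to remap an existing one. The class `ExtensionRegistry` and its methods are hypothetical, and `.flac` is used purely as an example of a later addition.

```python
class ExtensionRegistry:
    """Add-only map from file extension to media tag."""

    def __init__(self, initial):
        self._tags = {}
        for extension, tag in initial.items():
            self.add(extension, tag)

    def add(self, extension: str, tag: str) -> None:
        key = extension.lower()
        existing = self._tags.get(key)
        # Re-adding the same mapping is harmless; changing it is not allowed.
        if existing is not None and existing != tag:
            raise ValueError(f"{key} is already mapped to <{existing}>")
        self._tags[key] = tag

    def lookup(self, extension: str) -> str:
        # Unknown extensions fall back to an image.
        return self._tags.get(extension.lower(), "img")

    # Deliberately no remove(): entries are only ever added.


registry = ExtensionRegistry({".mp3": "audio", ".mp4": "video"})
registry.add(".flac", "audio")  # hypothetical later addition
```

The point of refusing to remap is that a document written against an older version of the list keeps rendering the same way after the list grows.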