Embedded audio and video

Well, I already have working code, and it would be simpler to modify what I have and not rewrite everything from scratch, if it’s possible. Especially if it is faster :)

So is it possible within the renderer?

No. The renderer operates on individual tokens and is not expected to analyze or modify combinations of them.

You can write a plugin for the core chain that scans for paragraphs, analyzes whether the content is link_open + text + link_close, and does the replacement. Or manually call parse, then scan/replace, then render.
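To make the scan/replace step concrete, here is a minimal sketch in Python. It is not markdown-it's API: the `Token` class is a stand-in for illustration, the `media_embed` token type and the `MEDIA_EXTENSIONS` tuple are hypothetical, and the stream layout (`paragraph_open`, `inline` with children, `paragraph_close`) is only modeled loosely on markdown-it-style tokens.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """Stand-in token type for this sketch (not a real library class)."""
    type: str
    content: str = ""
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


# Hypothetical list of extensions that should trigger the replacement.
MEDIA_EXTENSIONS = (".mp3", ".mp4")


def replace_bare_media_links(tokens):
    """Replace paragraphs whose content is exactly link_open + text + link_close."""
    for index, token in enumerate(tokens):
        in_paragraph = index > 0 and tokens[index - 1].type == "paragraph_open"
        if token.type != "inline" or not in_paragraph:
            continue
        child_types = [child.type for child in token.children]
        if child_types != ["link_open", "text", "link_close"]:
            continue
        href = token.children[0].attrs.get("href", "")
        if href.lower().endswith(MEDIA_EXTENSIONS):
            # Swap the three link tokens for one hypothetical media token.
            token.children = [Token("media_embed", attrs={"src": href})]
    return tokens
```

A renderer rule for `media_embed` would then decide what markup to emit; paragraphs that contain anything besides the single link are left untouched.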


Anything new on this? Is there a standard way to do it now?

Yes, I agree.

For spec makers: they don’t need to create new syntax for inserting audio/video URLs, which could turn into a complex debate because it’s hard to find a simple and elegant way to do that.

For end users: they don’t need to think more or learn more; they just copy/paste the code given by audio/video websites, as we do now.

For developers: they can just write extensions to do that work if they want. Once a better way is found, we can decide whether or not to put it into the spec.

Regarding this comment in the Issues we SHOULD resolve before 1.0 release topic:

Should we change the spec so that instead of “images” it talks of media more generally, and allow the image syntax to be used for audio and video? (Renderers could render appropriately for the media.)

My thoughts: for 1.0, rename this to a generic “embedded media” type, with image being the default media type that is rendered and used in the spec examples. The spec could mention that the embedded media syntax may be used to render other types of media based on file extension, the same way that the parser may render soft breaks as hard breaks (with soft breaks being the default). E.g. the spec could state:

A renderer may also provide an option to render the embedded media as a media type other than image, based on the file extension of the media file.

The whitelist of valid URL schemes for autolinks was removed in CommonMark 0.24. Omitting mention of particular file extensions/media types would be consistent with this. A formal list of file extensions is a moving target; keeping this list up to date falls quite far outside the scope of formalising Gruber’s Markdown, but perhaps a later extension could formalise how the media (based on file extension) should be rendered.


You will not be able to select the proper wrapper without information about the media type. And you cannot trust the extension in the URL. The data behind the URL would have to be loaded and analyzed. There are 2 problems:

  • You cannot define a clear method to resolve the media type.
  • Downloading from a remote host is async by nature. That would be a pain in the ass for the parser architecture.

My suggestion is to avoid adding AI to the spec.


@vitaly, that seems equally true for images – can we really “know” that a given URL represents an image without inspecting it? Right now when I use a CommonMark compliant parser and embed an MP4 URL, I get an <img> tag, even though the URL clearly provides useful information to the parser suggesting that this is not in fact what the user intended.

I also think it’s important to distinguish between YouTube/Vimeo style embeds (which rely on third party code) and browser-supported HTML5 video/audio playback (which is an open standard, just like <img> tags).

As a markdown user, I would find a system that inserts standards-compliant <video> or <audio> tags for appropriate embed URLs on the basis of extensions more than sufficient for most practical purposes. Having special syntax to force the embed type might be useful for special cases. But if I want to embed a locally uploaded MP4 video in my markdown blog post, that should “just work” on the basis of the URL.

I do agree that any advanced embedding of third party libraries doesn’t belong in the spec, but embedding a video or audio file in a standards compliant manner is, as far as I can tell, not meaningfully different from the already supported image use case. Am I missing something?

@vitaly, that seems equally true for images – can we really “know” that a given URL represents an image without inspecting it?

You missed the end goal: the output markup. If it is always <img>, there is no need to analyze src, because it does not affect the output. But if the output can be img/audio/video, some additional criteria are required to choose correctly. Assumptions about file extensions in URLs are too weak for a spec, IMHO.


I think a case could be made not to maintain a list of file extensions mapped against <img>, <video> and <audio> in the spec, just because the file types that frequently updated browsers like Chrome support natively can change pretty often.

The spec should IMO, however, make the recommendation to implementers to vary the output of what’s currently image-only markup based on the file extension of commonly natively supported file formats (with <img> being the default if the extension is not recognized, or if there is none), and provide implementers with sample output for <img>, <video> and <audio>. All three are first-class citizens in HTML5 and it makes no sense to me that markdown would not support them as such.
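As an illustration of what such sample output might look like, here is a minimal Python sketch. The function name `render_media`, the `controls` attribute, and the use of the alt text as fallback content inside `<video>`/`<audio>` are assumptions made for this example, not anything the spec currently defines.

```python
from html import escape


def render_media(kind, src, alt=""):
    """Render one embedded-media node as HTML (illustrative output only)."""
    src_attr = escape(src, quote=True)
    if kind == "video":
        # Assumption: alt text becomes fallback content for old browsers.
        return f'<video src="{src_attr}" controls>{escape(alt)}</video>'
    if kind == "audio":
        return f'<audio src="{src_attr}" controls>{escape(alt)}</audio>'
    # Default: the image markup CommonMark renderers emit today.
    return f'<img src="{src_attr}" alt="{escape(alt, quote=True)}" />'
```

With this sketch, `render_media("audio", "https://someurl.com/some.mp3", "some text")` yields an `<audio>` element, and any unrecognized `kind` falls back to `<img>`.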

The behavior that markup like ![some text](https://someurl.com/some.mp3) produces an <img> embed of a URL that’s obviously not an image is counter-intuitive, and the whole point of using a language like markdown is IMO to provide the most intuitive result for the common cases, using markup that people can keep in their head. To the extent that the spec produces obviously counterintuitive results when implemented, I would argue that it needs to be revised not to do so.

I agree with previous commenters who have suggested extending the syntax to override that behavior on an as-needed basis, but if there’s no consensus what that should look like, I would still recommend changing the default behavior of ![x](y) type invocations in the way described above, i.e. make intelligent guesses based on commonly used extensions, and fall back to image markup if no intelligent guess can be made. I don’t think that would qualify as AI.

Although I disagree with you here, let me thank you for all your hard work on markdown-it; it’s a pleasure to use! And again - I may be missing something obvious and am mostly speaking from the perspective of someone using other people’s markdown parsers in a few different codebases.

Such recommendations, as definitions with partial/incomplete coverage, do not work well in programming. They quickly become “one more standard”, as has already happened with markdown.

I understand people who need audio/video with the same markup, but I don’t see a way to implement it correctly.

BTW, we use embedza for post-processing to beautify links (not images), and that works very well in practice. It is also more convenient on forums than using image tags.

Well, I would argue that leaving handling of even basic HTML5 features like <video> and <audio> tags entirely up to implementers is far worse from the perspective of implementation proliferation (and also anachronistic). After thinking about it a bit more, I do think it would be best to include a non-exhaustive list of extensions in the spec, with a note that implementers are free to add formats natively supported by widely used browsers to that list.

The list, AFAICT, would be:

  • <audio>: default for URLs that end with .wav, .mp3, .ogg (see below)
  • <video>: default for URLs that end with .mp4, .ogv, .webm
  • <img>: default for all other URLs

All matching would be case-insensitive.

.ogg is the most challenging since it is a container format associated with both video and audio in the wild, which has led to proliferation of the .ogv extension. This and similar edge cases would be a reason to offer an override in the markup, but in the most common cases, the above matching should work just fine, IMO.
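The proposed defaults could be sketched like this in Python. The function name `guess_media_tag` is made up for the example, and the sketch matches against the end of the whole URL, exactly as the list is worded, so a URL with a query string after the extension falls through to `<img>`.

```python
# Extension lists taken from the proposal above; implementers could extend them.
AUDIO_EXTENSIONS = (".wav", ".mp3", ".ogg")
VIDEO_EXTENSIONS = (".mp4", ".ogv", ".webm")


def guess_media_tag(url):
    """Pick the default tag name for a URL, matching case-insensitively."""
    lowered = url.lower()
    if lowered.endswith(AUDIO_EXTENSIONS):
        return "audio"
    if lowered.endswith(VIDEO_EXTENSIONS):
        return "video"
    # Anything unrecognized, or with no extension at all, stays an image.
    return "img"
```

Note that `.ogg` always resolves to `audio` here, which is the edge case an explicit override in the markup would address.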


I think all formats natively supported by widely used browsers should be included in the list to avoid ambiguity. As mentioned earlier, this is a moving target, but we can attempt to keep the list up to date. As browsers add support for new formats, file extensions could be added to the list using a never remove, only add strategy.


We can discuss this forever. What’s the final decision? We really want to merge @cmrd_senya’s PR into diaspora* core, but we are waiting for the standard to be decided…

I think relying on the file name extension is much too brittle. Also, any list of file types/extensions will be quickly out-of-date.

I think it would be best to keep ![]() for images and introduce new syntax for “video” and “audio” (and possibly more in the future).

I think that images are used far more frequently, and it wouldn’t hurt if the others were a bit more verbose, e.g. !video[...](...) and !audio[...](...) (as was suggested before in this thread).


Literal keywords like video and audio are absolutely not the CM/MD way.


OK, I admit that using plain words like video and audio as part of the syntax isn’t ideal. But relying only on lists of file extensions is IMHO much worse. Especially since it is very common for (auto-generated) URLs to have no extension at all.

I’ve read through the whole thread again and I think the most reasonable solution would be a combination of the two ideas as suggested in this post. This would allow the nice and simple syntax with file-extension-detection like ![](my_file.mp3), but it would also allow the (IMHO very important) disambiguation in cases where the file extension is ambiguous, not recognized or just missing, like in

!audio[](some_url?id=superhit)
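
A rough Python sketch of that combination follows. It is only an illustration: the regular expression handles a single, simplified embed and is nowhere near a real CommonMark inline parser, and the names `EMBED`, `EXTENSION_DEFAULTS` and `resolve_embed` are invented for the example.

```python
import re

# Simplified pattern for one embed: optional keyword, [alt], (url).
EMBED = re.compile(
    r"!(?P<kind>audio|video)?\[(?P<alt>[^\]]*)\]\((?P<url>[^)\s]+)\)"
)

# Extension defaults, following the list proposed earlier in the thread.
EXTENSION_DEFAULTS = {
    ".wav": "audio", ".mp3": "audio", ".ogg": "audio",
    ".mp4": "video", ".ogv": "video", ".webm": "video",
}


def resolve_embed(markup):
    """Return (tag, url) for one embed, or None if the markup does not match."""
    match = EMBED.fullmatch(markup)
    if match is None:
        return None
    url = match["url"]
    if match["kind"]:
        # An explicit keyword always wins over extension detection.
        return match["kind"], url
    lowered = url.lower()
    for extension, kind in EXTENSION_DEFAULTS.items():
        if lowered.endswith(extension):
            return kind, url
    return "img", url
```

Here `![](my_file.mp3)` resolves to audio through the extension, while `!audio[](some_url?id=superhit)` resolves to audio only because of the explicit keyword.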

This is a fine solution from my perspective. The complaint that

but .mp4 could mean audio only!

does not hold much weight. It could… but almost never will. And if so, use some other filename.


There are a lot of parallels between the comments in this topic and the discussions about adding support for image dimensions and alignments. While it may be useful to override the rules and have more control over specific elements in some scenarios, these scenarios often seem like special cases.

In scenarios where the filename cannot be changed, we have two solutions:

  1. Use raw HTML. The HTML5 spec is already very well defined and provides the user with a lot of power for customisation.
  2. Use the consistent attribute syntax extension to explicitly define the source type, e.g. ![](audio.mp4){type: audio/mp4}.

Neither of these approaches requires adding syntax to the specification for embedded audio and video in CommonMark; both (1) and (2) are separate specifications. Defining audio and video can remain lightweight, a majority of documents can remain uncluttered with additional syntax, and (perhaps most importantly) we get syntactic continuity with images.
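
To show how option (2) might resolve the source type, here is a small Python sketch. It assumes the attribute block looks exactly like the `{type: audio/mp4}` example above; the attribute syntax extension is not finalized, and the names `EMBED_WITH_TYPE` and `explicit_tag` are hypothetical.

```python
import re

# Simplified pattern: an image-style embed with an optional {type: ...} block.
EMBED_WITH_TYPE = re.compile(
    r"!\[(?P<alt>[^\]]*)\]\((?P<url>[^)\s]+)\)"
    r"(?:\{type:\s*(?P<mime>[\w.+-]+/[\w.+-]+)\})?"
)

TAG_BY_TOP_LEVEL_TYPE = {"audio": "audio", "video": "video", "image": "img"}


def explicit_tag(markup):
    """Return the tag named by an explicit type attribute, else None."""
    match = EMBED_WITH_TYPE.fullmatch(markup)
    if match is None or match["mime"] is None:
        # No attribute given: the caller falls back to its default rule.
        return None
    top_level = match["mime"].split("/", 1)[0].lower()
    return TAG_BY_TOP_LEVEL_TYPE.get(top_level)
```

Returning `None` when no type is given keeps the override optional, so documents that follow the file extension convention need no extra syntax.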

I also understand the appeal of having some means to explicitly state the media file’s source type so that media can be embedded that does not follow the file extension convention. But because this is desirable in some (edge) cases, does this mean that writers should be forced to explicitly state the source type every time they want to embed a video? That isn’t appealing.


Happy New Year!
Any update on this?
Any considerations finalized?
