Embedded audio and video

I see, well, that could work. What if the URL doesn't end in .mp3 but should still be treated as audio? I guess we could do both.


The solution I posted works because the system deals with very particular file formats and extensions. I can see it creating problems with more flexible systems.

Something like the solution you posted seems good. I would use sensible defaults for the options, rather than getting the user to explicitly define them.

Need to necro this topic, because it's worth it.
As @chrisalley did. How about we agree on letting the implementation decide what to display, and just use a common syntax for everything?

![](whatever it is.png/jpg/gif/mp4/webm/mp3/audio)

That way we do not need a different syntax.


I like the idea of

![](filename.mp3)

producing

<audio controls="controls">
  <source type="audio/mp3" src="filename.mp3"></source>
  <source type="audio/ogg" src="filename.ogg"></source>
  <p>Your browser does not support the audio element.</p>
</audio>

I would like to suggest that we use

![](filename.mp3 filename.ogg)

using whitespace to separate the files, with the first file dictating the HTML element.
For example,

![](filename.mp4 filename.mp3)

will create

<video controls="controls">
  <source type="video/mp4" src="filename.mp4"></source>
  <source type="video/mp3" src="filename.mp3"></source>
  <p>Your browser does not support the video element.</p>
</video>
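
A minimal sketch of this proposal in JavaScript, assuming a hypothetical helper `renderSources` that receives the raw text between the parentheses. The element is chosen from the first file's extension, and each `type` attribute simply reuses the element name plus the extension, as in the example above; a real implementation would probably map extensions to registered MIME types instead. The extension lists are assumptions for illustration only.

```javascript
// Hypothetical helper: turn "a.mp4 b.mp3" into an <audio> or <video> element.
// The extension lists are assumptions, not an agreed whitelist.
const AUDIO_EXTENSIONS = ["mp3", "ogg", "wav"];
const VIDEO_EXTENSIONS = ["mp4", "webm"];

function extensionOf(file) {
  const dot = file.lastIndexOf(".");
  return dot === -1 ? "" : file.slice(dot + 1).toLowerCase();
}

// Escape characters that would break out of a double-quoted attribute.
function escapeAttr(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function renderSources(destination) {
  const files = destination.trim().split(/\s+/).filter(Boolean);
  if (files.length === 0) return null;

  // The first file dictates the HTML element.
  const first = extensionOf(files[0]);
  let element;
  if (AUDIO_EXTENSIONS.includes(first)) element = "audio";
  else if (VIDEO_EXTENSIONS.includes(first)) element = "video";
  else return null; // not a known media extension; caller can fall back to <img>

  const sources = files.map(
    (file) =>
      `  <source type="${element}/${extensionOf(file)}" src="${escapeAttr(file)}"></source>`
  );
  return [
    `<${element} controls="controls">`,
    ...sources,
    `  <p>Your browser does not support the ${element} element.</p>`,
    `</${element}>`,
  ].join("\n");
}

console.log(renderSources("filename.mp4 filename.mp3"));
```

Returning `null` for unknown extensions leaves the decision to the caller, so ordinary images keep their usual rendering.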

I agree, hence ![foo](bar) should not be seen and described as “image syntax”, but as “embedded resource syntax”. Embedding is a special kind of hyperlinking.

Back in the day, when <img> was new to HTML and terminals were more common than GUIs and bandwidth was scarce, images – most likely figures then – would often be shown on demand only and not embedded within the text, although the name of the src attribute already suggests that it should be treated differently than those with href. Anyhow, an embedded resource is a special kind of link, whether it be a static image, video, audio, text (<iframe>) or an interactive component (<embed>).

A third type of link is the transclusion. This kind is not necessary to support in a front-end format such as HTML, but should be a feature of a back-end format like CommonMark. Otherwise one needs another tool layer, i.e. a preprocessor. A well-known example is {{templates}} in MediaWiki. Transclusions in CM/MD should use a syntax similar to embeds, probably just swapping the exclamation mark for a different punctuation character, and maybe requiring them to be in a line or block by themselves, e.g.:

<[Introduction](chapter0.md)

This does not solve external code listings well, though, because you would have to put markup inside them since fences and indents won’t work:

~~~
<[Bubble Sort Class](bubble.c)
~~~

    <[Bubble Sort Class](bubble.c)
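
To illustrate the preprocessor layer mentioned above, here is a rough JavaScript sketch that expands `<[Title](file)` lines before the text reaches a Markdown parser. The function name `expandTransclusions` and the in-memory `files` map are made up for this example; a real tool would read from disk and guard against recursive includes.

```javascript
// Hypothetical preprocessor for the proposed <[Title](path) transclusion syntax.
// `files` maps a path to its content; it stands in for reading from disk.
function expandTransclusions(source, files) {
  // Only a line consisting solely of <[...](...) counts as a transclusion.
  const pattern = /^<\[([^\]]*)\]\(([^)\s]+)\)$/;
  return source
    .split("\n")
    .map((line) => {
      const match = pattern.exec(line);
      if (match === null) return line;
      const path = match[2];
      // Leave the line untouched when the target is unknown.
      return Object.prototype.hasOwnProperty.call(files, path) ? files[path] : line;
    })
    .join("\n");
}

const files = { "chapter0.md": "# Introduction\n\nHello." };
console.log(expandTransclusions("<[Introduction](chapter0.md)\n\nMore text.", files));
```

Note that this naive version knows nothing about code fences: it expands a matching line wherever it appears at the start of a line, while an indented line is left alone.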

Agreed on a common syntax. There should be a sensible default rendering as well so that CommonMark documents are compatible across different websites and apps.

Explicitly listing the file types like this is probably the safest route. Otherwise the parser would have to know in advance (or check at the time of rendering) which files to include. If another file format is added in the future (say, a flac version), the document could be updated either manually or programmatically (for instance, if a flac version is added for every audio track in the larger file set).

I’m in agreement (as a CommonMark extension, not as part of the core spec).

If the extension is enabled, the ![]() syntax should render the content based on the specified file extension, e.g. ![](file.mp4) would render the HTML <video> tag. If the extension is not enabled, the syntax will attempt to render the HTML <img> tag regardless of the specified file extension.

For the extension to be viable we would need a whitelist of file extensions, each mapped to the content type it should render as: image, audio, video, and perhaps other content types.
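
As a rough illustration of such a whitelist, the JavaScript sketch below maps file extensions to content types and falls back to an image when the extension is unknown or the media extension is switched off. The names `MEDIA_WHITELIST` and `contentTypeFor`, and the particular extensions listed, are assumptions for the example rather than anything agreed in this thread.

```javascript
// Assumed whitelist for illustration; the thread has not agreed on a final list.
const MEDIA_WHITELIST = {
  image: ["png", "jpg", "jpeg", "gif"],
  audio: ["mp3", "wav", "flac"],
  video: ["mp4", "webm"],
};

// Returns "image", "audio" or "video" for a URL.
// With the extension disabled, everything is treated as an image.
function contentTypeFor(url, extensionEnabled = true) {
  if (!extensionEnabled) return "image";
  // Ignore any query string or fragment before looking at the file extension.
  const path = url.split(/[?#]/)[0];
  const match = /\.([a-z0-9]+)$/i.exec(path);
  if (match === null) return "image";
  const ext = match[1].toLowerCase();
  for (const [type, extensions] of Object.entries(MEDIA_WHITELIST)) {
    if (extensions.includes(ext)) return type;
  }
  return "image"; // unknown extensions keep the default <img> behaviour
}

console.log(contentTypeFor("file.mp4"));        // extension enabled
console.log(contentTypeFor("file.mp4", false)); // extension disabled
```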


Good ideas! For the whitelist, I suggest whitelisting HTML5-compatible formats.


I am going to make a plugin for markdown-it. The CommonMark reference implementation is not ready for plugins/extensions yet, right?


The reference implementation is only for CommonMark syntax I believe. But @jgm has not yet confirmed if an audio and video extension will be supported, or if the syntax we’ve discussed here will be used.

It would be nice if @jgm could comment on this.

I am implementing one using markdown-it @chrisalley @jgm

I think it makes a lot of sense to use the

![alt](path/to/resource.ext)

syntax not just for images but for embedded audio and video, and let the renderer create tags appropriate for the resource, based on the extension.

This might call for some changes in the spec, renaming the element from “image” to “media” or something more generic, and adding some words about the flexibility in rendering. I’m not sure. In any case, as an extension this idea is completely natural.


Thanks a lot @jgm.

Here is what I’ve done by monkey-patching markdown-it’s rule for images. It is not a proper plugin yet, but it is working very well for me.


var md = window.markdownit();
var defaultRender = md.renderer.rules.image;

// Override the image rule: choose <iframe>, <audio>, <video> or <img> from the src URL.
md.renderer.rules.image = function (tokens, idx, options, env, self) {
  // No `g` flag here: a global regex keeps lastIndex between exec() calls.
  var vimeoRE = /^https?:\/\/(www\.)?vimeo\.com\/(\d+)($|\/)/;
  var audioRE = /\.(ogg|mp3)$/i;
  var videoRE = /\.(mp4|webm)$/i;

  var token = tokens[idx];
  var src = token.attrs[token.attrIndex('src')][1];
  // Escape the URL before placing it inside a quoted attribute.
  var safeSrc = md.utils.escapeHtml(src);

  var vimeoMatch = vimeoRE.exec(src);
  var audioMatch = audioRE.exec(src);
  var videoMatch = videoRE.exec(src);

  if (vimeoMatch !== null) {
    // Group 2 of vimeoRE is the numeric video id.
    return '<div class="embed-responsive embed-responsive-16by9">\n' +
      '  <iframe class="embed-responsive-item" src="//player.vimeo.com/video/' + vimeoMatch[2] + '"></iframe>\n' +
      '</div>\n';
  } else if (audioMatch !== null) {
    return ['<p><audio width="320" controls class="audioplayer">',
      '<source type="audio/' + audioMatch[1].toLowerCase() + '" src="' + safeSrc + '"></source>',
      '</audio></p>'
    ].join('\n');
  } else if (videoMatch !== null) {
    return ['<p><video width="320" height="240" class="audioplayer" controls>',
      '<source type="video/' + videoMatch[1].toLowerCase() + '" src="' + safeSrc + '"></source>',
      '</video></p>'
    ].join('\n');
  } else {
    // Anything else is rendered as a normal image.
    return defaultRender(tokens, idx, options, env, self);
  }
};

Yes, that is very reasonable. This will make Markdown a lot richer.


@v3ss0n, are you going to shape your code into a plugin? We at diaspora need it to implement audio/video embedding. If not, I could probably make the plugin myself, based on your code.


I will make a repo. I am still learning how to write a proper markdown-it plugin.
We are also building something similar to diaspora, but aimed at group conversation instead of social networking.
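
For reference, a markdown-it plugin is just a function that receives the parser instance (plus any options) when passed to `md.use()`. The sketch below shows one way the image-rule override could be packaged in that form. The name `mediaEmbedPlugin`, its option names and the default extension lists are hypothetical, and the output is deliberately simpler than the monkey-patched version earlier in this thread.

```javascript
// Sketch of a markdown-it plugin; `mediaEmbedPlugin` is a hypothetical name.
// markdown-it calls a plugin as plugin(md, options) when it is passed to md.use().
function mediaEmbedPlugin(md, options) {
  // Default extension lists are assumptions; callers can override them.
  const opts = Object.assign({ audio: ["mp3"], video: ["webm"] }, options);
  const fallback = md.renderer.rules.image;

  md.renderer.rules.image = function (tokens, idx, renderOptions, env, self) {
    const src = tokens[idx].attrGet("src") || "";
    const match = /\.([a-z0-9]+)$/i.exec(src);
    const ext = match ? match[1].toLowerCase() : "";
    const tag = opts.audio.includes(ext)
      ? "audio"
      : opts.video.includes(ext)
        ? "video"
        : null;
    // Anything that is not whitelisted media goes through the original image rule.
    if (tag === null) return fallback(tokens, idx, renderOptions, env, self);
    return `<${tag} controls src="${md.utils.escapeHtml(src)}"></${tag}>`;
  };
}

// Usage, assuming markdown-it is installed:
//   const md = require("markdown-it")().use(mediaEmbedPlugin, { audio: ["mp3", "oga"] });
//   md.render("![](song.mp3)");
```

Keeping a reference to the original rule means the plugin only adds behaviour and ordinary images still render as before.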


I think it would be best if the core spec continues to describe the element as “image”. The HTML5 spec refers to the category of elements as embedded content. If the element is called “embedded content” or “media” in the core spec, people might be led to believe that other embedded content is available as part of CommonMark-core. But this is intended as an extension, correct?

I was actually not thinking of it as an extension. More of a terminology change.

Currently the spec attempts to define parsing (transformation from source to AST) without being too detailed about exactly how each element should be rendered into HTML (or other formats). Loosening up the terminology would invite renderers to do something useful with movie and sound URLs in ![]() contexts, without requiring anything specific.

A further step would be to say explicitly that renderers must be sensitive to the URL or file extension (if there is one) and render the media in a way that makes sense. But I’m thinking this might be a bit much to require. And maybe the terminology change doesn’t make sense without this, I’m not sure.


If file extensions are used to determine the media type, I think the mapping of file extensions to particular media types should be strongly specified.

Consider the following Markdown:

![](chihiro-onitsuka-everyhome.mp4)

Generally, this would refer to the music video of “everyhome” because of the file extension .mp4. However, the .mp4 extension can also be used for audio: it can be used for AAC-encoded music in iTunes, for example (even though .m4a is the norm for audio). So the file could contain just the audio of “everyhome”, or both audio and video. If the implementation developer chooses to render it using the HTML <audio> tag (and the file is indeed video), the video will fail to display. This is a problem for cross-compatibility between implementations. Enforcing the .mp4 = video rule via a whitelist of file extensions for each media type would resolve this.

@chrisalley, you raise a good point. I’m reluctant to enforce .mp4 = video, though, if it might just be audio. I suppose we could abuse the title field to resolve ambiguities?

![](chihiro-onitsuka-everyhome.mp4 "audio: real title here")

This would degrade fairly well. Renderers that support video and audio could strip off the “audio:” from the title.
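
A small JavaScript sketch of how a renderer might read such a hint, assuming a hypothetical helper `parseTitleHint` and assuming that only the prefixes `audio:` and `video:` are recognised. Titles without a prefix are returned unchanged, so the renderer can fall back to the file extension.

```javascript
// Hypothetical helper for the "audio: real title" convention discussed above.
// Returns the media kind hinted by the title (or null) plus the cleaned title.
function parseTitleHint(title) {
  // [\s\S]* rather than .* so that multi-line titles are handled too.
  const match = /^(audio|video):\s*([\s\S]*)$/i.exec(title);
  if (match === null) return { kind: null, title: title };
  return { kind: match[1].toLowerCase(), title: match[2] };
}

console.log(parseTitleHint("audio: real title here"));
console.log(parseTitleHint("just a title"));
```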

Or maybe we should leave ![]() for pictures, and let people use raw HTML for audio and video. After all, these are not going to be supported in many output formats besides HTML.


I can see why you’re not keen on enforcing .mp4 = video.

As a modification of my original proposal, the whitelist of file extensions could consist of only unambiguous file extensions. That would rule out .mp4 being used to represent either audio or video, but .m4a and .m4v would be valid extensions (representing <audio> and <video> respectively). Similarly, .ogg is ambiguous, but .oga and .ogv are not.

In cases where there is a .mp4 or .ogg file, the file could either be renamed or if this is not possible, the writer could fall back to using HTML.

.mp3, .wav and .flac are only used for audio as far as I am aware, and .webm is only used for video.
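
The modified proposal could be sketched in JavaScript roughly as follows. The extension lists come from the posts above; the function name `checkMediaExtension` and the shape of its return value are invented for illustration. Ambiguous extensions are reported together with the renames that would disambiguate them, so a tool could warn the writer instead of guessing.

```javascript
// Extension lists taken from the discussion above; a proposal, not a spec.
const UNAMBIGUOUS = {
  m4a: "audio", oga: "audio", mp3: "audio", wav: "audio", flac: "audio",
  m4v: "video", ogv: "video", webm: "video",
};

// Ambiguous extensions and the renames that would disambiguate them.
const AMBIGUOUS = {
  mp4: [".m4a", ".m4v"],
  ogg: [".oga", ".ogv"],
};

function hasKey(table, key) {
  return Object.prototype.hasOwnProperty.call(table, key);
}

function checkMediaExtension(filename) {
  const dot = filename.lastIndexOf(".");
  const ext = dot === -1 ? "" : filename.slice(dot + 1).toLowerCase();
  if (hasKey(UNAMBIGUOUS, ext)) {
    return { status: "ok", element: UNAMBIGUOUS[ext] };
  }
  if (hasKey(AMBIGUOUS, ext)) {
    // The writer should rename the file or fall back to raw HTML.
    return { status: "ambiguous", renameTo: AMBIGUOUS[ext] };
  }
  return { status: "unknown" };
}

console.log(checkMediaExtension("everyhome.m4v"));
console.log(checkMediaExtension("everyhome.mp4"));
```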

The use of English doesn’t seem very Markdownish.

Some kind of lightweight markup for audio and video is preferable. Video, especially, is becoming increasingly common on the web, but the HTML needed to mark up a video is not easy to write (or remember).