Embedded audio and video

If file extensions are used to determine the media type, I think the mapping of file extensions to particular media types should be strongly specified.

Consider the following Markdown:

![](chihiro-onitsuko-everyhome.mp4)


Generally, this would refer to the music video of “everyhome” because of the file extension .mp4. However, the .mp4 extension can also be used for audio: iTunes, for example, can use it for AAC-encoded music (even though .m4a is the norm for audio). So the file could contain just the audio of “everyhome”, or both audio and video. If the implementation developer chooses to render it using the HTML <audio> tag (and the file is indeed video), the video will fail to display. This is a problem for cross-compatibility between implementations. Enforcing the .mp4 = video rule via a whitelist of file extensions for each media type would resolve this.

@chrisalley, you raise a good point. I’m reluctant to enforce .mp4 = video, though, if it might just be audio. I suppose we could abuse the title field to resolve ambiguities?

![](chihiro-onitsuko-everyhome.mp4 "audio: real title here")

This would degrade fairly well. Renderers that support video and audio could strip off the “audio:” from the title.
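
To make the title-prefix idea concrete, here is a minimal Python sketch of how a renderer might strip the hint. The function name `split_media_hint` and the extra "video:" prefix are assumptions for illustration; only "audio:" was actually proposed above.

```python
import re

# Hypothetical convention from this thread: a title that starts with
# "audio:" (or, by assumption, "video:") overrides the file extension.
_PREFIX = re.compile(r"^\s*(audio|video)\s*:\s*", re.IGNORECASE)


def split_media_hint(title):
    """Return (media_hint, clean_title); media_hint is None without a prefix."""
    match = _PREFIX.match(title)
    if match is None:
        return None, title
    # Drop the prefix so only the real title reaches the output.
    return match.group(1).lower(), title[match.end():]


print(split_media_hint("audio: real title here"))
```

One limitation of this approach: a genuine title that happens to begin with "Audio:" would be misread as a hint.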

Or maybe we should leave ![]() for pictures, and let people use raw HTML for audio and video. After all, these are not going to be supported in many output formats besides HTML.

I can see why you’re not keen on enforcing .mp4 = video.

As a modification of my original proposal, the whitelist of file extensions could consist of only unambiguous file extensions. That would rule out .mp4 being used to represent either audio or video, but .m4a and .m4v would be valid extensions (representing <audio> and <video> respectively). Similarly, .ogg is ambiguous, but .oga and .ogv are not.

In cases where there is a .mp4 or .ogg file, the file could be renamed or, if this is not possible, the writer could fall back to using HTML.

.mp3, .wav and .flac are only used for audio as far as I am aware, and .webm is only used for video.
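
A rough Python sketch of such a whitelist, using only the extensions named above. The names `MEDIA_TYPES` and `media_type_for` are made up for illustration, and the mapping is a reading of this proposal rather than anything in a spec.

```python
import os
from urllib.parse import urlparse

# Extensions described in this thread as unambiguous.
# .mp4 and .ogg are deliberately left out because they can be either.
MEDIA_TYPES = {
    ".m4a": "audio", ".oga": "audio", ".mp3": "audio",
    ".wav": "audio", ".flac": "audio",
    ".m4v": "video", ".ogv": "video", ".webm": "video",
}


def media_type_for(url):
    """Return "audio", "video", or None when the extension is not listed."""
    path = urlparse(url).path  # ignore any query string or fragment
    extension = os.path.splitext(path)[1].lower()
    return MEDIA_TYPES.get(extension)


print(media_type_for("everyhome.m4v"))
print(media_type_for("chihiro-onitsuko-everyhome.mp4"))
```

A `None` result would leave the renderer free to treat the target as an ordinary image, which is where the fallback to raw HTML comes in.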

The use of English doesn’t seem very Markdownish.

Some kind of lightweight markup for audio and video is preferable. Video, especially, is becoming increasingly common on the web, but the HTML to mark up a video is not easy to write (or remember).

True, but to do it correctly you may need the HTML tags, which allow you to specify width, height, controls, autoplay, poster, subtitles, looping, muting, and alternative sources.

From a writer’s perspective, I just want a simple and quick way to express the intent to “embed my video at this point in the article”.

Many of the extra attributes for videos are application features. I’m not sure that writers should (or would want to) control these settings. The autoplay setting, for example, could be set at an application level and apply to all videos used by the application, removing the need for writer-controlled configuration.

Images also have height and width attributes, but these aren’t part of Markdown. There are often cleaner ways to set these than explicitly declaring them, such as the application checking the file’s dimensions and updating the generated HTML automatically. I’m inclined to think the same approach should apply to most of the <video> tag’s attributes.


Another thing to consider here is the upcoming <picture> tag that will allow multiple sources for images to be specified. This is planned for HTML 5.1. What <video>, <audio>, and <picture> all have in common is the ability to specify multiple sources via the <source> tag.

<picture> builds upon <img>, providing further reason to use ![]() for this family of elements; a lightweight syntax for specifying the media type and multiple sources, building upon the ![]() syntax already established for images, would be intuitive.

Do you have any data on how common this is in the real world, though? I’ve never seen an mp4 file that was audio only…

I have no real world examples or data. It’s possible to use .mp4 for audio, but this appears to be an edge case. .mp4 for video and .ogg for audio are the norm in my experience. Perhaps this informal convention should be adopted for CommonMark if the alternative is not used in practice? Users can always fall back to HTML if for some reason they cannot (or will not) follow the convention.

@v3ss0n, any progress so far?

@v3ss0n, ok, I’ve taken your code and made the plugin. Thanks for the code.

Thanks, and sorry, I was busy with a deadline.
I will fork and add some changes.

How about naming it markdown-it-multimedia-html5?
Also, can you credit me in the authors list? I will be contributing from time to time.

Checking whether a file is audio or video can be done by reading the header, right?

That’s ok. I’m looking forward to it :smile:
Check for the recent changes (v0.1.0)

I don’t have an authors list yet, but I’ll add one if you want. Usually, though, the authors are the contributors listed in the commit history. I also added a link to this discussion in the source code to credit where the idea came from.

I don’t really want to change the name, because it is already registered in npm and bower, though it is not a big deal.

Yes, based on the MIME type rather than the extension. That would be better, but I decided to keep it simple for now.

@v3ss0n, I can give you write permissions for the repository, if you want, so we don’t get bothered with forking and merging.

Write permission is fine :slight_smile:. But I will work on branches.

What are you going to implement?

It’s worth noting that Markua is using the .mp4 file extension together with the Markdown image syntax for video.


Markua Processors rely on file extension to determine the type of media and must not attempt to parse files to determine their type. Because of this, the choice of acceptable file extensions for the various media types is a subset of the total available, so that audio and video files can be distinguished solely by their file extension instead of by examining the file or by requiring authors to type some special metadata syntax.


.mp4
The file is treated as an MP4 video.

and when describing the audio file extensions:

.m4a, .aac
The file is treated as an MP4 AAC (Advanced Audio Coding) audio file. Note that .mp4 is not supported as a file extension for MP4 AAC audio, since that is the file extension used for MP4 video.

This is essentially the same as what I suggested for CommonMark.

While Markua isn’t exactly Markdown (it’s not aiming for backward compatibility: it removes some syntax from Markdown and adds new syntax), I think it would be wise to aim for some level of syntax compatibility. Markdown and Markua are close enough to allow copy/pasting text between the two (with some modification).

Markua? More Markdown specs? Will this ever end lol.

I am itching to do libmagic-like header detection in JavaScript. Should I attempt that route? I’ve done header parsing for detecting file types in Python, and I think it should work fine for a JavaScript file object, except that we do not have mmap in JavaScript.

With that, we can easily detect the MIME type from the file contents.
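
As a rough idea of what header sniffing could look like, here is a small Python sketch (the same byte comparisons should carry over to JavaScript typed arrays). The function name `sniff_container` and its return labels are invented for illustration. The signatures checked are the usual magic bytes for each container, and the list is far from complete.

```python
def sniff_container(header):
    """Guess the container format from the first bytes of a file.

    Returns a short label, or None if no known signature matches.
    """
    if header.startswith(b"OggS"):
        return "ogg"
    if header.startswith(b"fLaC"):
        return "flac"
    if header.startswith(b"RIFF") and header[8:12] == b"WAVE":
        return "wav"
    if header.startswith(b"\x1a\x45\xdf\xa3"):
        return "matroska/webm"  # EBML header, shared by WebM and Matroska
    if header.startswith(b"ID3"):
        return "mp3"  # ID3v2 tag, most often an MP3; untagged MP3s are missed
    if header[4:8] == b"ftyp":
        brand = header[8:12]  # major brand of an MP4-family file
        if brand == b"M4A ":
            return "mp4-audio"
        if brand == b"M4V ":
            return "mp4-video"
        return "mp4"  # generic brand: audio or video cannot be told apart here
    return None


print(sniff_container(b"\x00\x00\x00\x20ftypM4A \x00\x00\x00\x00"))
```

Note that this only identifies the container. For Ogg, WebM, and generic MP4 brands, deciding between audio and video would still require parsing the track information further into the file.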

The CMS that I primarily work with serves all media (images, video, audio, pdfs, you name it) from requests that use the .asax extension. This is similar to a .php file serving any sort of data. If this extension is going to be useful, there needs to be a way to mark what type of media you’re serving.

My initial reaction would be to mark the type of media via the document fragment (I blogged about leveraging this for styling images with markdown), but it could conflict with existing document fragments used in SVG.

As this is being proposed as an extension, maybe it makes sense to use different symbols for the different embeds.

Audio could use @[](), and video could use ^[](). That way it’s up to the author to specify what they’re embedding.
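
A quick Python sketch of how these two symbols might be rendered, purely to illustrate the suggestion. The function name `render_embeds`, the choice to always emit `controls`, and the use of the bracket text as fallback content are all assumptions; titles and nested brackets are not handled.

```python
import html
import re

# Hypothetical syntax from the suggestion above:
#   @[text](url) -> <audio>,  ^[text](url) -> <video>
_EMBED = re.compile(r"([@^])\[([^\]]*)\]\(([^)\s]+)\)")
_TAGS = {"@": "audio", "^": "video"}


def render_embeds(text):
    """Replace audio/video embeds in text with HTML5 media elements."""
    def replace(match):
        tag = _TAGS[match.group(1)]
        fallback = html.escape(match.group(2))  # shown if the tag is unsupported
        src = html.escape(match.group(3), quote=True)
        return f'<{tag} src="{src}" controls>{fallback}</{tag}>'
    return _EMBED.sub(replace, text)


print(render_embeds("^[Everyhome](everyhome.mp4)"))
```

Because the author picks the symbol, this works even when the URL has an uninformative extension such as .asax.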
