The CMS that I primarily work with serves all media (images, video, audio, PDFs, you name it) from requests that use the .asax extension, much like a .php file can serve any sort of data. If this Markdown extension is going to be useful, there needs to be a way to mark what type of media you’re serving.
My initial reaction would be to mark the type of media via the URL fragment (I blogged about leveraging this for styling images with Markdown), but that could conflict with the fragment identifiers SVG already uses.
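To make the fragment idea and its SVG problem concrete, here is a minimal Python sketch. The fragment names (`audio`, `video`, `image`), the function name, and the example URLs are all assumptions for illustration; nothing here is part of any existing spec.

```python
from urllib.parse import urlsplit

# Hypothetical hint names; no spec defines these.
MEDIA_FRAGMENTS = {"audio", "video", "image"}


def media_type_from_fragment(url: str, default: str = "image") -> str:
    """Return the media type hinted by the URL fragment, or the default."""
    fragment = urlsplit(url).fragment.lower()
    return fragment if fragment in MEDIA_FRAGMENTS else default


print(media_type_from_fragment("/media.asax?id=42#video"))  # video
print(media_type_from_fragment("/media.asax?id=7"))         # image
# An SVG fragment pointing at an element whose id happens to be "video"
# looks exactly like the hint, so it is misread as a video embed.
print(media_type_from_fragment("/diagram.svg#video"))       # video
```

The last call shows the conflict: the parser has no way to tell a media hint from a legitimate SVG fragment that happens to share the name.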
As this is being proposed as an extension, maybe it makes sense to use different symbols for the different embeds.
Audio could use @[](), and video could use ^[](). That way it’s up to the author to specify what they’re embedding.
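As a rough sketch of how a renderer might handle the symbol-based approach, the Python below maps each leading symbol to an HTML element. The regular expression, the HTML templates, and the `render_embeds` name are assumptions made for illustration; a real Markdown parser would handle nesting, titles, and escaping far more carefully.

```python
import re
from html import escape

# Leading symbol -> HTML template. "!" is the standard Markdown image syntax;
# "@" (audio) and "^" (video) are the symbols proposed above.
TEMPLATES = {
    "!": '<img src="{src}" alt="{text}">',
    "@": '<audio controls src="{src}">{text}</audio>',
    "^": '<video controls src="{src}">{text}</video>',
}

# Simplified pattern: symbol, [text], (url). Titles and nested brackets
# are not supported in this sketch.
EMBED = re.compile(r"([!@^])\[([^\]]*)\]\(([^)\s]*)\)")


def render_embeds(markdown: str) -> str:
    """Replace every embed in the text with its HTML element."""
    def replace(match: re.Match) -> str:
        sigil, text, src = match.groups()
        return TEMPLATES[sigil].format(
            src=escape(src, quote=True),
            text=escape(text, quote=True),
        )
    return EMBED.sub(replace, markdown)


print(render_embeds("@[Theme song](/media.asax?id=3)"))
# <audio controls src="/media.asax?id=3">Theme song</audio>
print(render_embeds("^[Demo](/media.asax?id=9)"))
# <video controls src="/media.asax?id=9">Demo</video>
```

Because the symbol carries the media type, the URL itself can stay opaque, which is what makes this work for extension-less or generic endpoints like the .asax handler.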