Regarding this comment in the “Issues we SHOULD resolve before 1.0 release” topic:
> Should we change the spec so that instead of “images” it talks of media more generally, and allow the syntax to be used for audio and video? (Renderers could render appropriately for the media.)
My thoughts: for 1.0, rename this to a generic “embedded media” type, with image as the default media type that is rendered and used in the spec examples. The spec could mention that the embedded media syntax may be used to render other types of media based on file extension, in the same way that it lets a renderer render soft breaks as hard breaks (with soft breaks being the default). E.g. the spec could state:
> A renderer may also provide an option to render the embedded media as a media type other than image, based on the file extension of the media file.
The whitelist of valid URL schemes for autolinks was removed in CommonMark 0.24. Not naming particular file extensions or media types in the spec would be consistent with that. A formal list of file extensions is a moving target, and keeping such a list up to date falls well outside the scope of formalising Gruber’s Markdown. Perhaps a later extension could formalise how the media should be rendered based on file extension.
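
To make the idea concrete, here is a rough sketch of what such a renderer option might look like. Everything in it is an assumption for illustration: the function name `render_embedded_media`, the `by_extension` flag, and the two extension sets are hypothetical and are not taken from the spec or from any existing implementation. The extension lists in particular are exactly the kind of detail the spec itself would leave to renderers.

```python
import html
import posixpath
from urllib.parse import urlsplit

# Hypothetical, deliberately incomplete lists. The proposal is that the spec
# would NOT enumerate these; each renderer would choose its own.
AUDIO_EXTENSIONS = {".mp3", ".ogg", ".wav"}
VIDEO_EXTENSIONS = {".mp4", ".webm"}


def render_embedded_media(url, alt="", by_extension=False):
    """Render an embedded-media node as HTML.

    By default every embedded media node is rendered as an image. If the
    (hypothetical) by_extension option is enabled, the file extension of
    the URL path selects an <audio> or <video> element instead.
    """
    src = html.escape(url, quote=True)
    text = html.escape(alt, quote=True)

    if by_extension:
        # Look only at the path, so query strings and fragments are ignored.
        ext = posixpath.splitext(urlsplit(url).path)[1].lower()
        if ext in AUDIO_EXTENSIONS:
            return f'<audio controls src="{src}">{text}</audio>'
        if ext in VIDEO_EXTENSIONS:
            return f'<video controls src="{src}">{text}</video>'

    # Default, and fallback for unknown extensions: render as an image.
    return f'<img src="{src}" alt="{text}" />'
```

With the option off, `render_embedded_media("song.mp3", "A song")` still produces an `<img>` element, so existing documents render unchanged. Only a renderer that opts in would emit `<audio>` or `<video>`.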