CDATA and processing instructions should not be recognized

release-1.0

#1

According to the HTML specification, CDATA sections and processing instructions are parse errors, so they should never occur in a valid HTML document. CommonMark should either not recognize them at all or, in the case of CDATA sections, treat the content as verbatim text to be output as HTML. I prefer the former.


#2

That’s not totally correct. From the docs on golang.org/x/net/html:

Strictly speaking, an HTML5 compliant tokenizer should allow CDATA if and only if tokenizing foreign content, such as MathML and SVG. However, tracking foreign-contentness is difficult to do purely in the tokenizer, as opposed to the parser, due to HTML integration points: an <svg> element can contain a <foreignObject> that is foreign-to-SVG but not foreign-to-HTML.

I use processing instructions embedded in my markdown documents extensively for various functionality (see https://github.com/mkdoc/mkpi), and I strongly object to removing support for parsing them.
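
For reference, here is a minimal sketch of the two constructs under discussion. It assumes the usual delimiters (`<?` ... `?>` for a processing instruction, `<![CDATA[` ... `]]>` for a CDATA section) and simply locates them in a string. The function name `find_constructs` and the sample instruction text are hypothetical, and this is not how any CommonMark implementation or mkpi actually parses them.

```python
import re

# Assumed delimiters: "<?" ... "?>" for a processing instruction and
# "<![CDATA[" ... "]]>" for a CDATA section. Both match lazily so that
# each construct ends at its first closing delimiter.
PI_RE = re.compile(r"<\?.*?\?>", re.DOTALL)
CDATA_RE = re.compile(r"<!\[CDATA\[.*?\]\]>", re.DOTALL)

def find_constructs(markdown: str) -> list[tuple[str, str]]:
    """Return (kind, text) pairs for each match, in document order."""
    found = []
    for kind, pattern in (("pi", PI_RE), ("cdata", CDATA_RE)):
        for m in pattern.finditer(markdown):
            found.append((m.start(), kind, m.group(0)))
    # Sort by start offset so results follow the order in the input.
    return [(kind, text) for _, kind, text in sorted(found)]

# The instruction content below is made up for illustration.
sample = "Intro <? @include intro.md ?> and <![CDATA[a < b]]> end."
print(find_constructs(sample))
# → [('pi', '<? @include intro.md ?>'), ('cdata', '<![CDATA[a < b]]>')]
```

Dropping recognition of these constructs would mean both spans in `sample` get treated as ordinary text, which is the behavior change being debated here.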