The AST output from Dingus is invalid XML according to xmllint


#1

When I run the XML AST output through xmllint, it reports a validation error:

element document: validity error : root and DTD name do not match 'document' and 'CommonMark'

The cause seems to be the document type declaration in combination with the document’s root element:

<!DOCTYPE CommonMark SYSTEM "CommonMark.dtd">
<document xmlns="http://commonmark.org/xml/1.0">

Evidently, the DTD name CommonMark does not match the name of the root element <document>.

The relevant rule is given in the W3C XML specification as

The Name in the document type declaration MUST match the element type of the root element.
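
To illustrate the rule, here is a minimal Python sketch that compares the DOCTYPE name with the name of the root element, using only the standard library's xml.dom.minidom. The function name doctype_matches_root and the two sample strings are made up for this example. It checks only this one constraint and does not validate the document against the DTD itself.

```python
from xml.dom import minidom


def doctype_matches_root(xml_text):
    """Return True if the DOCTYPE name equals the root element's name.

    A document without a DOCTYPE is treated as passing, since the
    constraint only applies when a document type declaration is present.
    """
    doc = minidom.parseString(xml_text)
    if doc.doctype is None:
        return True
    return doc.doctype.name == doc.documentElement.tagName


# Hypothetical sample: the declaration shape reported in this thread.
mismatched = (
    '<!DOCTYPE CommonMark SYSTEM "CommonMark.dtd">'
    '<document xmlns="http://commonmark.org/xml/1.0"/>'
)

# Hypothetical sample: the same document with the DOCTYPE name changed
# to match the root element.
matched = (
    '<!DOCTYPE document SYSTEM "CommonMark.dtd">'
    '<document xmlns="http://commonmark.org/xml/1.0"/>'
)

print(doctype_matches_root(mismatched))
print(doctype_matches_root(matched))
```

The parser used here does not fetch CommonMark.dtd, so the check works without the DTD file being available.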

Is this a known issue, and are any changes planned to address it (in particular, changes to CommonMark.dtd, such as renaming the root element to match the DTD name)?

The reason for my question is that in a project I am working on, ASTs are stored as XML files at some point, and I’d like to make sure these XML files are valid while also sticking to the conventions laid down by official CommonMark materials.


#2

No, that rule wasn’t known (by me anyway).

Which would be better: calling the root element CommonMark, or renaming the DTD file to document.dtd?


#3

Calling the root element CommonMark gets my vote. It’s more consistent with HTML using <html> and SVG using <svg>. It’s also unambiguous.


#4

Sorry, I think I was confused. This doesn’t concern the filename of the DTD; it concerns the name in the DOCTYPE declaration, i.e. the first word after <!DOCTYPE.

In your example, this is CommonMark, but in cmark’s XML output, it is document (which indeed matches the name of the root element).

Ah, I see. In commonmark.js, CommonMark is used instead of document. Well, that’s easily fixed (done in commit d171379431e958c45481f7df8420104d14650691).