Surely, to have a standard, these kinds of definitions need to be in place?
I think the standard should actually specify the exact HTML text that everything converts to, with the aim of making the output valid according to the WHATWG HTML Living Standard.
That sounds perfectly reasonable. Shouldn't that at least be specified in the standard?
The standard is meant to specify the abstract syntax of a document.
HTML is used in the examples as a way of representing the abstract
syntax tree. (It was chosen partly because this way, people can test
their implementations directly against the examples.)
It’s consistent with the spec to have variations in how this abstract
syntax tree is rendered. You might render it to HTML 4, XHTML, HTML 5,
or even LaTeX.
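
To make the distinction concrete, here is a rough sketch of one abstract syntax tree rendered two different ways. The `Node` class and the `render_html` / `render_latex` functions are hypothetical names made up for illustration; they are not taken from the spec or from any existing implementation, and the LaTeX renderer skips escaping of special characters to keep the example short.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Node:
    """Hypothetical AST node: kind is "paragraph", "emph", or "text"."""
    kind: str
    text: str = ""                      # only used by "text" nodes
    children: list = field(default_factory=list)


def render_html(node):
    """Render the tree as an HTML fragment."""
    if node.kind == "text":
        return escape(node.text)
    inner = "".join(render_html(child) for child in node.children)
    if node.kind == "emph":
        return "<em>" + inner + "</em>"
    if node.kind == "paragraph":
        return "<p>" + inner + "</p>\n"
    raise ValueError("unknown node kind: " + node.kind)


def render_latex(node):
    """Render the same tree as LaTeX (no escaping; sketch only)."""
    if node.kind == "text":
        return node.text
    inner = "".join(render_latex(child) for child in node.children)
    if node.kind == "emph":
        return "\\emph{" + inner + "}"
    if node.kind == "paragraph":
        return inner + "\n\n"
    raise ValueError("unknown node kind: " + node.kind)


# One abstract tree, roughly what "Hello *world*" would parse to.
doc = Node("paragraph", children=[
    Node("text", "Hello "),
    Node("emph", children=[Node("text", "world")]),
])

html_output = render_html(doc)
latex_output = render_latex(doc)
```

Both outputs come from the same tree, so under this reading the parser is what the spec constrains, and the choice of output format is left to the renderer.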