The current spec says that any valid Unicode code point is allowed in numeric entities (like this: &#3777;). Do you think it’s a good idea to allow control codes like 0x00-0x1F?
In JS I have no access to libraries like punycode, so I wrote this naive, paranoid check:
function isValidEntityCode(c) {
  // broken sequence: surrogate halves, plus values that are invalid as UTF-8 bytes
  if (c >= 0xD800 && c <= 0xDFFF) { return false; }
  if (c >= 0xF5 && c <= 0xFF) { return false; }
  if (c === 0xC0 || c === 0xC1) { return false; }
  // never used: noncharacters
  if (c >= 0xFDD0 && c <= 0xFDEF) { return false; }
  if ((c & 0xFFFF) === 0xFFFF || (c & 0xFFFF) === 0xFFFE) { return false; }
  // control codes: C0, DEL and C1
  if (c <= 0x1F) { return false; }
  if (c >= 0x7F && c <= 0x9F) { return false; }
  // out of Unicode range
  if (c > 0x10FFFF) { return false; }
  return true;
}
Do you think it’s OK, or should it be relaxed?
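
For comparison, here is a rough sketch of what a relaxed version could look like. It is only an illustration under a few assumptions, not what the spec requires: the names isValidEntityCodeRelaxed and decodeDecimalEntity are made up for this example, tab, LF and CR are assumed to be harmless, and invalid codes are assumed to fall back to U+FFFD. One thing worth noting: 0xC0, 0xC1 and 0xF5-0xFF are invalid as UTF-8 bytes, but as code points they are ordinary Latin-1 letters (À, Á, õ ... ÿ), so the sketch accepts them.

```javascript
// Hypothetical relaxed variant: a sketch, not the spec's rule.
function isValidEntityCodeRelaxed(c) {
  // must be an integer inside the Unicode range
  if (!Number.isInteger(c) || c < 0 || c > 0x10FFFF) { return false; }
  // surrogate halves are not characters on their own
  if (c >= 0xD800 && c <= 0xDFFF) { return false; }
  // noncharacters: U+FDD0..U+FDEF and the last two code points of every plane
  if (c >= 0xFDD0 && c <= 0xFDEF) { return false; }
  if ((c & 0xFFFF) === 0xFFFF || (c & 0xFFFF) === 0xFFFE) { return false; }
  // C0 controls, except tab, LF and CR (assumption: plain whitespace is fine)
  if (c <= 0x1F && c !== 0x09 && c !== 0x0A && c !== 0x0D) { return false; }
  // DEL and C1 controls
  if (c >= 0x7F && c <= 0x9F) { return false; }
  return true;
}

// Decode the digits of a decimal entity such as "3777" (from "&#3777;").
// Assumption: anything rejected above becomes U+FFFD.
function decodeDecimalEntity(digits) {
  const c = parseInt(digits, 10);
  return isValidEntityCodeRelaxed(c) ? String.fromCodePoint(c) : '\uFFFD';
}
```

With this version, decodeDecimalEntity('3777') gives the Lao character U+0EC1, while decodeDecimalEntity('0') gives the replacement character.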