Funky, yet standard-conforming email addresses

Consider the following conversation:

> What is 2 + 2 equal to?

2 + 2 = 4.

Want to know more? Feel free to contact me at <"4 > 2"@basic-arithmetic-facts.com>.

<"4 > 2"@basic-arithmetic-facts.com> is a valid email address, but it doesn’t become a link.

<mailto:"4>2"@basic-arithmetic-facts.com> (note missing spaces around >) is even worse. It becomes a link to mailto:"4.

Aside from “don’t use such email addresses”, what’s your position on this?

I stand corrected. mailto:"4>2"@basic-arithmetic-facts.com is not a valid URI. To become a valid URI, such an email address should be encoded as mailto:%224%3E2%22@basic-arithmetic-facts.com.
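For anyone who wants to reproduce that encoding, here is a minimal sketch using Python's standard `urllib.parse.quote`. The helper name `mailto_uri` is made up for illustration, and the sketch only encodes the local part, assuming the domain is already plain ASCII.

```python
from urllib.parse import quote


def mailto_uri(local_part: str, domain: str) -> str:
    """Build a mailto: URI with the local part percent-encoded.

    Hypothetical helper: assumes the domain needs no encoding.
    """
    # safe="" tells quote() to encode everything except ASCII letters,
    # digits and the characters "_.-~".
    return "mailto:" + quote(local_part, safe="") + "@" + domain


print(mailto_uri('"4>2"', "basic-arithmetic-facts.com"))
# prints mailto:%224%3E2%22@basic-arithmetic-facts.com
```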

Regarding my point about <"4 > 2"@basic-arithmetic-facts.com>: your spec defines an email address as anything that matches the non-normative regex from the HTML5 spec. Well, that’s one way to solve it. And one can still use mailto: and escaping to link to any RFC-conformant email address.
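To see why the quoted address is rejected, here is a small sketch that applies the HTML5 pattern in Python. The regex is transcribed from the HTML5 spec as I remember it, so please check it against the spec text before relying on it; the function name `is_html5_email` is my own.

```python
import re

# Non-normative email pattern from the HTML5 spec (transcribed; verify
# against the spec). Note that the local part allows neither '"' nor ' '.
HTML5_EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def is_html5_email(candidate: str) -> bool:
    """Return True if the whole string matches the HTML5-style pattern."""
    return HTML5_EMAIL.fullmatch(candidate) is not None


print(is_html5_email("user@example.com"))
print(is_html5_email('"4 > 2"@basic-arithmetic-facts.com'))
```

The first call should report a match and the second should not, because the pattern has no notion of quoted local parts.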

Can you add my example (<"4 > 2"@basic-arithmetic-facts.com>) or something similar to the “These are not autolinks:” section of the spec? Would be nice to mention that it’s an RFC-conformant, yet not autolinked email address.

See related thread under “Email addresses regex”:

That’s nice. I thought you were more on the side of “nobody cares, nobody uses them”. And you pointed to another spec (HTML5) which handles email addresses in a similar manner. That’s why I decided not to argue.

There are 2 ways to solve the internationalized email problem though: 1) follow RFCs completely or 2) create another regex (or something similar). Which solution would you prefer?

IMHO it’s worth making some recommendations. But it’s not worth reinventing the standard in the spec, and it’s not worth forcing everyone to follow the RFC strictly. Implementing all RFC features for emails is not trivial, not fast, and a bit useless.

I would fix IDN support and… maybe… quotes.

Also, I’d recommend digging into https://github.com/markdown-it/linkify-it. That will give some ideas about proper IDN support in JS.

Yes, I agree – it would be better just to get a more satisfactory regex, rather than requiring implementations to implement the rather complex RFC.

https://github.com/markdown-it/linkify-it/blob/2.0.2/lib/re.js#L93-L104 - that’s the JS regex for domains, if you need it.

> don’t allow -- in domain names

That may mess with punycoded IDNAs, which start with xn--.

Not a problem, there is a special case for xn--: https://github.com/markdown-it/linkify-it/blob/2.0.2/lib/re.js#L96.
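To make the idea concrete, here is a rough sketch of a per-label check that rejects `--` but lets Punycode labels through. This is not linkify-it's actual regex. The `LABEL` pattern, the 63-character limit and the name `is_allowed_label` are my own assumptions for illustration.

```python
import re

# Assumed label shape: letter or digit at both ends, hyphens only inside,
# at most 63 characters in total.
LABEL = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?")


def is_allowed_label(label: str) -> bool:
    """Reject labels containing '--' unless they are Punycode (xn--...)."""
    label = label.lower()
    if LABEL.fullmatch(label) is None:
        return False
    if label.startswith("xn--"):
        # Special case: Punycode-encoded IDN labels legitimately contain "--".
        return True
    return "--" not in label


for candidate in ["example", "foo--bar", "xn--bcher-kva", "-foo"]:
    print(candidate, is_allowed_label(candidate))
```

Here `xn--bcher-kva` (the Punycode form of "bücher") is accepted, while `foo--bar` and `-foo` are rejected.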

Implementing full RFC compliance for email addresses is not that hard. It can be implemented with a comparatively simple regex. No need for any higher order grammars. If you are willing to consider this option, I can write such a regex together with meaningful comments for easy reviewing. As it takes some time to do it, I’m asking first before doing something maybe unnecessary.

I am no expert on e-mail addresses, but I have strong doubts about that. That’s IMHO not what any sane person would like to maintain or translate to a different regexp flavor. Furthermore, not all implementations are regexp-based.

Also note that there are newer RFCs for e-mail addresses, such as RFC 2822, but I could not find any regexp targeting that RFC.