[tei-council] Internationalised domains

Stuart A. Yeates syeates at gmail.com
Tue Sep 20 04:49:29 EDT 2011

Currently domain names in TEI can occur in typed fields (such as
data.pointer) or in many other fields where type checking is more
relaxed (or non-existent). I would like to propose the following note
to appear somewhere in the standard (I'm thinking the data.pointer
page, but I'm open to suggestions). The URL in the example is perhaps
the best-known punycode URL (see
http://en.wikipedia.org/wiki/Masr_%28domain_name%29 ), but if Arabic
script causes problems in the publishing process I can probably find a
more Latin-esque one.



Internationalised domains containing non-ASCII characters should
always be escaped using RFC 3492 syntax ("punycode"). Thus
http://موقع.وزارة-الاتصالات.مصر/ is written
http://xn--4gbrim.xn----rmckbbajlc6dj7bxne2c.xn--wgbh1c/ . Such escaping
permits internationalised domains to be used with a full range of
software tools.
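
For anyone who wants to check the escaping themselves, here is a
minimal sketch in Python. It is only an illustration, not part of the
proposed note. It relies on Python's built-in "idna" codec, which
implements the older IDNA 2003 rules on top of RFC 3492; the helper
name escape_idn_url is my own invention, and the sketch assumes a
simple URL with no user name or password in it.

```python
from urllib.parse import urlsplit, urlunsplit


def escape_idn_url(url):
    """Return url with its host name converted to the ASCII "xn--" form.

    Hypothetical helper for illustration only. Any userinfo part
    (user:password@) is dropped, and the path, query and fragment
    are passed through unchanged.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    # The built-in "idna" codec punycode-encodes each non-ASCII label
    # and prefixes it with "xn--"; ASCII labels are left as they are.
    ascii_host = host.encode("idna").decode("ascii")
    netloc = ascii_host
    if parts.port is not None:
        netloc = f"{ascii_host}:{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(escape_idn_url("http://موقع.وزارة-الاتصالات.مصر/"))
```

Decoding goes the other way with bytes.decode("idna"), so a tool that
only accepts ASCII host names can still show the original script to
the reader.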

