[tei-council] Internationalised domains

Kevin Hawkins kevin.s.hawkins at ultraslavonic.info
Tue Sep 20 18:12:52 EDT 2011


I guess what I'm saying is that Punycode is prescribed for use with the 
Domain Name System, but our TEI documents might outlive DNS or be used 
in a system that doesn't use DNS.  After all, even the URI specification 
(RFC 3986) gives DNS as one example of a name registry mechanism, not 
the only one.

We tie ourselves to a few external standards (maintained by the W3C) 
which may become obsolete at some point, but I'm not sure whether we 
should add systems maintained by ICANN to the list.

--Kevin

On 9/20/2011 2:31 PM, Stuart A. Yeates wrote:
> Punycode is already required (and happens automatically with modern
> tools and formats) for URIs. View the source of the (UTF-8) web page
> of my example website to see what I mean.
>
> The issue is when people put URIs in free text fields where the
> tools are unaware that these are URIs and expect them to 'just work'.
>
> cheers
> stuart
>
>
>
> On Wed, Sep 21, 2011 at 1:26 AM, Kevin Hawkins
> <kevin.s.hawkins at ultraslavonic.info>  wrote:
>> I'm not sure about prescribing use of RFC 3492.  This seems to me like
>> prescribing use of US-ASCII with character entity references instead of
>> UTF-8 within XML documents to ensure that we can use our documents with
>> a full range of software tools -- something that fewer and fewer people
>> support doing.
>>
>> On 9/20/2011 4:49 AM, Stuart A. Yeates wrote:
>>> Currently domain names in TEI can occur in typed fields (such as
>>> data.pointer) or in many other fields where type checking is more
>>> relaxed (or non-existent). I would like to propose the following note
>>> to appear somewhere in the standard (I'm thinking the data.pointer
>>> page, but I'm open to suggestions). The URL in the example is perhaps
>>> the best-known punycode URL (see
>>> http://en.wikipedia.org/wiki/Masr_%28domain_name%29 ), but if Arabic
>>> script causes problems in the publishing process I can probably find a
>>> more Latin-esque one.
>>>
>>> cheers
>>> stuart
>>>
>>> ----
>>>
>>> Internationalised domains containing non-ASCII characters should
>>> always be escaped using RFC 3492 syntax ("punycode"). Thus
>>> http://موقع.وزارة-الاتصالات.مصر/ is written
>>> http://xn--4gbrim.xn----rmckbbajlc6dj7bxne2c.xn--wgbh1c/ Such escaping
>>> permits internationalised domains to be used with a full range of
>>> software tools.
>>>
>>> ----
>>> _______________________________________________
>>> tei-council mailing list
>>> tei-council at lists.village.Virginia.EDU
>>> http://lists.village.Virginia.EDU/mailman/listinfo/tei-council
>>>
>>> PLEASE NOTE: postings to this list are publicly archived
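
For anyone who wants to see the escaping discussed in this thread in
action, here is a minimal sketch. It assumes Python's built-in "idna"
codec, which implements the older IDNA 2003 rules (RFC 3490) on top of
the RFC 3492 Punycode encoding; results may differ for some names under
the later IDNA 2008 rules. The helper names (host_to_ascii,
host_to_unicode, url_to_ascii) and the bücher.example host are
illustrative only and are not part of any TEI proposal.

```python
from urllib.parse import urlsplit, urlunsplit


def host_to_ascii(host: str) -> str:
    """Convert a host name to its ASCII ("xn--") form, label by label."""
    # The "idna" codec applies nameprep and then Punycode to each label
    # that contains non-ASCII characters; ASCII labels pass through.
    return host.encode("idna").decode("ascii")


def host_to_unicode(host: str) -> str:
    """Convert an ASCII ("xn--") host name back to its Unicode form."""
    return host.encode("ascii").decode("idna")


def url_to_ascii(url: str) -> str:
    """Rewrite only the host part of a URL into its ASCII form.

    Simplification: any userinfo ("user:pass@") is dropped, and the
    path, query and fragment are left untouched.
    """
    parts = urlsplit(url)
    if parts.hostname is None:
        return url  # nothing that looks like a host, leave as-is
    netloc = host_to_ascii(parts.hostname)
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    # The escaped form quoted in the proposed note above.
    escaped = "xn--4gbrim.xn----rmckbbajlc6dj7bxne2c.xn--wgbh1c"
    print(host_to_unicode(escaped))
    print(url_to_ascii("http://bücher.example/katalog"))
```

A tool that knows a field holds a URI can apply something like
url_to_ascii automatically; a free-text field gets no such treatment,
which is the case the proposed note is meant to cover.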

