[tei-council] Internationalised domains

Kevin Hawkins kevin.s.hawkins at ultraslavonic.info
Fri Oct 7 16:52:25 EDT 2011

On 10/7/2011 4:31 PM, Stuart A. Yeates wrote:
> I am proposing that we prescribe using a specific encoding for at
> least the file part of URLs.
> My rationale for that is that without encoding there is (a) ambiguity
> about where one URL stops and another starts in lists of 1–∞ URLs and
> (b) ambiguity about whether the URL is encoded leading to issues with
> generic conversion to HTML, ODF, RDF, etc needing to guess the
> encoding of URLs and sometimes getting it wrong.

> To return to the original question, the answer is: No, I suggest we
> revise http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-data.pointer.html
> to follow RFC 3987 in all details. I further suggest that we include
> some motivating / worked examples.

Okay, so to make sure others understand that this isn't a typo, you want 
to change the definition of data.pointer from RFC 3986 to RFC 3987.  I 
suppose the declaration would also change from

data.pointer = xsd:anyURI

to

data.pointer = xsd:anyIRI

But as for Stuart's rationales, I don't see how (a) and (b) are problems 
if people properly follow RFC 3986 for any attributes using 
data.pointer.  There's the problem of Stuart's validation setup not 
catching these problems in his data, but that's partly due to putting 
URLs in @key (or other attributes that don't use data.pointer) and 
partly due to some still-undetermined cause of misvalidation of 
data.pointer attributes.
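
To illustrate rationale (a), here is a minimal Python sketch. It assumes a list of pointers is whitespace-separated and that a generic processor splits it naively; the file names and the split_pointers helper are made up for illustration.

```python
from urllib.parse import quote

def split_pointers(attr_value):
    """Split a whitespace-separated list of pointers, as a generic processor might."""
    return attr_value.split()

# Two hypothetical targets; the first file name contains a space.
raw = ["my file.xml", "other.xml"]

# Unencoded: the space inside the first name looks exactly like a separator.
unencoded = " ".join(raw)

# Percent-encoded as RFC 3986 requires: the space becomes %20,
# so the boundaries between the two pointers are unambiguous.
encoded = " ".join(quote(name) for name in raw)

print(split_pointers(unencoded))  # ['my', 'file.xml', 'other.xml']
print(split_pointers(encoded))    # ['my%20file.xml', 'other.xml']
```

With properly encoded values the list splits into two pointers as intended, which is why I don't think (a) arises when RFC 3986 is followed.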

But if the switch from URIs to IRIs solves other problems, then I 
support it.  IRIs are a generalization of URIs, so we wouldn't violate 
the Birnbaum doctrine by doing this.
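
As a rough illustration of that relationship, here is a simplified Python sketch of the IRI-to-URI mapping described in RFC 3987, section 3.1. The iri_to_uri name and the example addresses are hypothetical, and the sketch only percent-encodes non-ASCII characters as UTF-8; it does not attempt the separate handling of internationalised host names.

```python
from urllib.parse import quote

# ASCII characters that must pass through untouched: the RFC 3986
# reserved characters plus "%" so existing escapes are not re-encoded.
# (Letters, digits and "-._~" are always left alone by quote().)
ASCII_PASSTHROUGH = ":/?#[]@!$&'()*+,;=%"

def iri_to_uri(iri):
    """Percent-encode the non-ASCII characters of an IRI as UTF-8 octets."""
    return quote(iri, safe=ASCII_PASSTHROUGH, encoding="utf-8")

# A value that is already a URI is returned unchanged,
# since every URI is also a valid IRI.
print(iri_to_uri("http://example.org/a%20b.xml"))

# A non-ASCII character (here U+00E9) is mapped to its UTF-8 escapes.
print(iri_to_uri("http://example.org/r\u00e9sum\u00e9.xml"))
```

Existing RFC 3986 values would therefore remain valid under the broader definition, which is the sense in which the change is backwards compatible.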
