[tei-council] Constraints on anyURI

Martin Holmes mholmes at uvic.ca
Mon Nov 19 17:51:10 EST 2012


On 12-11-19 01:30 PM, Sebastian Rahtz wrote:
> jing seems to fail to check anyURI. but rnv checks the xsd:anyURI better:
>
> Sebastians-MacBook-Pro:Exemplars rahtz$ rnv tei_all.rnc /tmp/foo.xmk
> /tmp/foo.xmk
> /tmp/foo.xmk:7:0: error: attribute ^sameAs with invalid value "n-CTL t-TR Ø-OBJ, n-SUBJ"
> required:
> 	data http://www.w3.org/2001/XMLSchema-datatypes^anyURI
> error: some documents are invalid
>
> But then again, jing may be silently URL-encoding the spaces and commas for us.
>
> I think you could maybe use http://www.w3.org/TR/xpath-functions/#func-resolve-uri in
> a Schematron check to check the URI. But surprisingly Saxon seems to accept
> almost anything as a valid URI.

I think Saxon is probably playing it safe. As the spec says, there's no 
way to reliably resolve a URI, especially given the range of registered 
and unregistered URI schemes that exist, so the best we can probably do 
is check for things we know must be wrong (such as spaces, commas, 
and perhaps percent signs that aren't followed by two hex digits). If we 
do this through Schematron, it will presumably work for everyone who is 
using RNG with embedded Schematron.
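
To make the idea concrete, here is a rough sketch of the kind of test I 
have in mind, written in Python rather than Schematron just to show the 
logic. The function name obviously_bad_uri is made up for illustration 
and is not part of any TEI tooling. Note that the comma test is a policy 
choice for our pointer attributes rather than a URI syntax rule, since 
RFC 3986 itself permits commas.

```python
import re

# A "%" that is NOT followed by exactly two hex digits is a broken escape.
_BAD_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def obviously_bad_uri(value: str) -> list[str]:
    """Return a list of problems found in a candidate URI value.

    Hypothetical helper: it only flags things assumed to be wrong
    (whitespace, commas, malformed percent escapes). An empty list does
    NOT mean the value is a valid or resolvable URI.
    """
    problems = []
    if re.search(r"\s", value):
        problems.append("whitespace")
    if "," in value:
        # Assumption: commas are treated as suspicious here by policy,
        # even though generic URI syntax allows them.
        problems.append("comma")
    if _BAD_PERCENT.search(value):
        problems.append("bad percent escape")
    return problems


if __name__ == "__main__":
    # The sameAs value from the rnv report quoted above.
    print(obviously_bad_uri("n-CTL t-TR Ø-OBJ, n-SUBJ"))
    # A plain fragment pointer should come back clean.
    print(obviously_bad_uri("#n-CTL"))
```

A Schematron rule would presumably express the same three tests with 
XPath regular expressions, but I haven't tried writing that yet.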

Cheers,
Martin

-- 
Martin Holmes
University of Victoria Humanities Computing and Media Centre
(mholmes at uvic.ca)

