[tei-council] what is text (was "Re: <content> vs <mixedContent>")

Lou Burnard lou.burnard at retired.ox.ac.uk
Mon Oct 6 14:39:48 EDT 2014


On 06/10/14 15:05, Sebastian Rahtz wrote:
>
>> (Also, this makes it very clear to an ODD-writer who uses
>> mixed content that text nodes will be validated by matching against
>> the RELAX NG <text/> pattern, not the RELAX NG "string" datatype, nor
>> the RELAX NG "token" datatype, nor the W3C Schema "string" datatype,
>> nor the W3C Schema "token" datatype.)
> this dark and evil world of “text” is exactly why I am scared.

Just to clarify once more: the extra thing you get when you ask for
<mixedContent> is rng:text, and NOT what the TEI confusingly calls
data.text (which maps to rng:data type="string").
Although the two are lexically identical (i.e. what is valid for one is
also valid for the other), an XSLT processor will treat them differently
when normalizing spaces. Or so I believe. According to
http://eric.van-der-vlist.com/blog/2005/01/07/562_relax_ng_text_versus_data/
we want rng:text, since mixed content is clearly document-centric rather
than data-centric. http://www.xmlplease.com/normalized is also helpful
on normalization.
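
For anyone unsure what "normalizing spaces" amounts to in practice, here
is a minimal Python sketch. It only illustrates the whitespace rule used
by XPath normalize-space() and by the RELAX NG built-in "token" datatype
(strip leading and trailing whitespace, collapse internal runs to one
space), contrasted with the "string" datatype, which compares values as
written. The function names are hypothetical, and the sketch makes no
claim about how any particular XSLT processor or validator handles
rng:text.

```python
import re

# XML whitespace characters: space, tab, carriage return, line feed.
_XML_WS_RUN = re.compile(r"[ \t\r\n]+")


def normalize_space(value: str) -> str:
    """Collapse runs of XML whitespace to one space and trim both ends."""
    return _XML_WS_RUN.sub(" ", value).strip(" ")


def equal_as_string(a: str, b: str) -> bool:
    """Compare as the "string" datatype does: no whitespace normalization."""
    return a == b


def equal_as_token(a: str, b: str) -> bool:
    """Compare as the "token" datatype does: whitespace-normalized values."""
    return normalize_space(a) == normalize_space(b)


if __name__ == "__main__":
    left = "document-centric  text"
    right = " document-centric\n text "
    print(equal_as_string(left, right))
    print(equal_as_token(left, right))
```

The two comparisons disagree on values that differ only in whitespace,
which is the kind of difference at stake when choosing a datatype.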

To avoid having this discussion again every few years, I wonder if we 
shouldn't deprecate "data.text" in favour of something called "data.string"?




> --
> Sebastian Rahtz
> Director (Research) of Academic IT
> University of Oxford IT Services
> 13 Banbury Road, Oxford OX2 6NN. Phone +44 1865 283431
>
> I am nothing.
> I shall never be anything.
> I cannot wish to be anything.
> Apart from that, I have in me all the dreams of the world.
>


