schema association (was Re: [tei-council] date attributes: summary, problems, and some suggestions)

Christian Wittern wittern at kanji.zinbun.kyoto-u.ac.jp
Fri Feb 16 07:30:10 EST 2007


Sebastian Rahtz wrote:
> Christian Wittern wrote:
>> How about we just give the users a spot in the header to declare which 
>> one they are using ("xsd", "oxygen", "nxml", "you-name-it"), without 
>> actually implementing our own?  The only thing we will have to ponder 
>> is if we want to maintain a registry of values for this list -- I 
>> would say that might be the price we have to pay for this.
>>
>> To me, this is one of the infrastructural issues we should not lightly 
>> postpone.
>>
> I think you need to go over the whole process and explain how it will 
> work in real life.
> 
> Two simple examples
> 
>  a) I declare my schema to be foo.rnc, in oxygen notation. I am 
> processing using XSL.
>      in XSL I cannot read .rnc files (at all easily). how do I access 
> the datatype info?
> 

For that, your only real choices are XSD or RNG files, since both are 
XML and can be read from XSL.

>  b) I am moving hundreds of documents into eXist, and I pull out small 
> fragments all
>      the time. for each one, I laboriously find the header, find the way 
> of referring to the
>      schema, locate the schema, and look! I have 4 different schemas, so 
> how now
>      do I evaluate @value?
> 

This is a use case where most of our assumptions about files and headers 
fall on their face, which is reasonable IMHO, since we define a format 
for *interchange*.  In your document repository, you are bound to want 
to normalize these kinds of things into one standard form, so the 
processing required here is done on import and then you're done with it.

>>
>> You will need to check for the type before blindly processing them.
> how? I'm an experienced XML processing person, and I just dread the
> thought of considering it. Plus, I want my validation!

Looking at the value of substring-before(., '-') should give you what 
you need to decide.  And I am sure Syd will come up with a regex that 
gives you a reasonable validation on the @value.
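
To make that concrete, here is a rough sketch in Python of the kind of 
check I have in mind.  It is only an illustration: the function names 
and the three category labels are made up for this example, and the 
actual prefixes would depend on whatever the all-in-one proposal ends 
up specifying.  The helper mimics XPath's substring-before(), which 
returns the empty string when the separator is absent.

```python
def substring_before(value: str, sep: str) -> str:
    """Mimic XPath substring-before(): '' when sep does not occur."""
    head, found, _ = value.partition(sep)
    return head if found else ""


def classify(value: str) -> str:
    """Hypothetical dispatch on the part of @value before the first '-'.

    The category labels are placeholders, not part of any TEI proposal.
    """
    prefix = substring_before(value, "-")
    if prefix == "":
        # Either no hyphen at all, or the value starts with one.
        return "no-prefix"
    if prefix.isdigit():
        # Something like a year followed by more fields.
        return "numeric-prefix"
    # Anything else is treated as a named type marker.
    return "named-prefix:" + prefix
```

A processor would call classify() on each @value first and only then 
pick the routine that knows how to parse that kind of value.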

>>
>> Which will put a burden on you if you suddenly discover a new type of 
>> dates in your texts to go back to the schema-drawing-room called Roma 
>> for another round.
> seems a reasonable price to me!

Well, maybe.  It's a weak argument ;-)

>>
>> Except that we still have the all-in-one proposal.  I might be biased 
>> since I
>> work more or less daily with xml:lang attributes, but that solution 
>> has the advantage
>> of being simpler technically.
> and I claim it's not at all simple for validators and processors...
> 

See above.

I will need to give the whole thing a bit more thought, again.

Christian


-- 

  Christian Wittern
  Institute for Research in Humanities, Kyoto University
  47 Higashiogura-cho, Kitashirakawa, Sakyo-ku, Kyoto 606-8265, JAPAN


