[tei-council] datatype issues (part 1)

Mon Sep 12 15:21:21 EDT 2005

Syd Bauman wrote:

>LB> Says who? If you're arabic or chinese, why is "m" more intuitive
>LB> than "1"? (or "u" than "0")?
>
>That's not really fair, in that if you're Arabic or Chinese you'll
>either be dealing with an equally non-intuitive element name (e.g.
>"person") and attribute name (i.e. "sex"), or you will have
>internationalized your schema, including these values.
>
You can internationalize the attribute name and description
to help the Arabic encoder, and easily get back to canonical TEI
with no loss. If you start monkeying with allowed values,
your task immediately becomes noticeably harder (in this case,
not _that_ hard, but no longer commodity).

>SR> If you want your archival XML to have ISO values for sex, but
>SR> your editors see "mfu", then you have to use an alternate
>SR> authoring DTD, and impose a transformation in your workflow.
>
>In many many cases this is going to be a really good idea, for lots
>of less sexy reasons than this one.
>
Hopefully, the talk by myself and G Bina in Sofia will discuss
this sort of thing.
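
For concreteness, here is a minimal sketch of what such a workflow transformation might look like. It assumes the authoring values are "m", "f" and "u", that the archival values are the ISO 5218 codes (1 = male, 2 = female, 0 = not known), and that the attribute sits on a <person> element without a namespace. The function name to_archival and the mapping table are hypothetical, not anything defined by TEI.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from authoring values to ISO 5218 codes
# (0 = not known, 1 = male, 2 = female). Illustrative only.
AUTHORING_TO_ISO = {"u": "0", "m": "1", "f": "2"}


def to_archival(xml_text):
    """Rewrite sex="m|f|u" on <person> elements to ISO 5218 codes."""
    root = ET.fromstring(xml_text)
    for person in root.iter("person"):
        value = person.get("sex")
        # Leave unknown or already-converted values untouched.
        if value in AUTHORING_TO_ISO:
            person.set("sex", AUTHORING_TO_ISO[value])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = '<listPerson><person sex="m"/><person sex="u"/></listPerson>'
    print(to_archival(sample))
```

In a real workflow this would more likely be an XSLT step between the authoring and archival schemas; the point is only that the value mapping is a separate, explicit stage.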

>That seems like it might be a bit of a slippery slope. I mean, one
>of the main selling points of XML is that it is human readable.
>
Hmm. The thing about XML is that it's text, and so
easily read by any application, including "cat".
That's as far as I would go. Can anyone claim <respStmt>
is "human-readable", in the sense that "sex='m'" is "readable"?

-- 
Sebastian Rahtz      
Information Manager, Oxford University Computing Services
13 Banbury Road, Oxford OX2 6NN. Phone +44 1865 283431

OSS Watch: JISC Open Source Advisory Service
http://www.oss-watch.ac.uk