[tei-council] Datatypes.... continued
Lou Burnard
lou.burnard at computing-services.oxford.ac.uk
Tue Sep 20 14:13:41 EDT 2005
Syd Bauman wrote:
>>1. add at place
>> addSpan at place
>> --> tei.data.enumerated
>>
>>
>
>What do you propose the value list be? The list{} method proposed
>permits things like "opposite top left" (which is also permitted in
>P4).
>
>
>
I have no particular proposal. Any list I came up with would only be
advisory anyway.
>> metDecl at type
>> --> tei.data.enumerated
>>
>>
>
>Again, we need to account for the fact that more than 1 "value" is
>permitted. Thus the value list would have to account for the various
>combinations. Luckily, since there are only 3 of 'em, in this case
>it's not hard:
> met
> real
> rhyme
> met;real
> met;rhyme
> real;rhyme
> met;real;rhyme
>(In EDW90 I just copied what P4 says to do: "One or more of the three
>attribute names met, real, or rhyme, separated by whitespace", thus
> list { ("met" | "real" | "rhyme")+ }
>Both P4 and EDW90 permit silly combinations like "met real met
>real".)
>
>
>
So here's a case where an enumeration is both possible and desirable. Good!
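
As a rough illustration of the enumeration listed above, here is a minimal Python sketch that generates the seven semicolon-separated values from the three names. The function names and the choice of Python are assumptions for illustration only; nothing here is part of EDW90 or P4.

```python
from itertools import combinations

# The three attribute names that metDecl's type= may mention.
NAMES = ["met", "real", "rhyme"]


def enumerate_values(names, sep=";"):
    """Return every non-empty combination of names, in a fixed order."""
    values = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            values.append(sep.join(combo))
    return values


VALUES = enumerate_values(NAMES)


def is_valid(value):
    """True only if value is one of the enumerated combinations."""
    return value in VALUES


print(VALUES)
```

Unlike the list { ("met" | "real" | "rhyme")+ } pattern, a closed list of this kind rejects repeats such as "met real met real".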
>
>
>>2. tei.pointerGroup at domains
>> --> tei.data.pointers
>>
>>
>
>I don't really see it as a plus to permit values that the prose says
>are invalid just to say we used a datatype directly. The EDW90
>recommendation matches the prose perfectly. (It is
> list { tei.data.pointer, tei.data.pointers }
>.)
>
>
>
>On the other hand, if everyone really thinks it is extremely
>important to use an abstract TEI datatype instead of a perfectly
>reasonable combination of 'em like the above, I suppose we could move
>the "must be 2 or more" check into a Schematron rule. Seems like the
>lesser of two evils, at least.
>
>
>
Actually, I wonder if it might not be better to rethink this attribute
into two?
>
>
>
>>3. schemaSpec at start
>> specDesc at atts
>> --> tei.data.names
>>
>>
>
>Again, why use a lax constraint when the proper one is readily
>available just to say you used a datatype? There are three possible
>declarations for tei.data.names on the table, and only one of them
>actually constrains this attribute properly:
>* list { xsd:NCName+ } -> does not permit "musicML:note", but since
> there is an ns= attribute, I'm betting
> that's considered a good thing, right
> Sebastian?
>* list { xsd:NMTOKEN+ } -> permits "--notAllowed"
>* list { xsd:token { pattern="\S+" } } -> permits "${notAllowed}"
>
>If we decide tei.data.names should boil down to something other than
>the first, then I think we should just use the proper constraint
>without a TEI datatype and not fret it.
>
>
>
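
To make the difference between the three candidate declarations concrete, here is a hedged Python sketch. The regular expressions are ASCII-only approximations of xsd:NCName and xsd:NMTOKEN (the real definitions admit many more Unicode characters), and the helper name accepts is hypothetical.

```python
import re

# ASCII-only approximations; the real XML name productions are wider.
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")   # no colon, no leading "-"
NMTOKEN = re.compile(r"[A-Za-z0-9._:-]+")          # any run of name characters
NON_SPACE = re.compile(r"\S+")                     # anything without whitespace


def accepts(pattern, value):
    """True if value is a whitespace-separated list of one or more
    tokens, each of which matches pattern in full."""
    tokens = value.split()
    return bool(tokens) and all(pattern.fullmatch(t) for t in tokens)


for candidate in ["musicML:note", "--notAllowed", "${notAllowed}"]:
    print(candidate,
          accepts(NCNAME, candidate),
          accepts(NMTOKEN, candidate),
          accepts(NON_SPACE, candidate))
```

Under these approximations only the NCName version rejects all three of the example values.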
Is this another instance of the particular problem that we're using
"name" sometimes to mean an NMTOKEN and sometimes not?
>>4. date at precision
>> --> tei.data.certainty
>>
>>
>
>I think this is a really bad idea. First off, <date> should probably
>not have a precision= attribute. The precision should be expressed in
>the value=. Furthermore, while I suppose vague terms like "high",
>"medium", and "low" are occasionally applied to the precision, it is
>much more common, and far more useful, to express the unit to which a
>measurement is precise. So if we really wanted to separate precision
>from the value=, we would want precision= of date to have values like
>"century", "decade", "year", "month", "week", and "day".
>
>
>
I can't remember who suggested having "precision" as an explicit
attribute on date. It does seem to overlap with the value attribute.
Unless anyone wants to fight to the death for it, I propose we remove it.
>
>
>>5. tei.datable at dateAttrib
>> --> tei.data.enumerated
>>
>>
>
>Yes, that's what is recommended: tei.enumerated with a value list of
>"datable" | "dated" | "unknown".
>
>
>
OK
>
>
>>6. locus at scheme
>> --> tei.data.name
>>
>>
>
>I presume that's because you don't want to argue with Matthew and
>David over the possible values? :-) Seriously, I don't really
>understand the purpose of this attribute. <locus> describes a
>location in the current manuscript, which (supposedly) has only one
>foliation scheme which should be described in //supportDesc/foliation
>(of which only 0 or 1 are permitted, right?). So what does scheme=
>buy us?
>
>
>
Matthew? What is this attribute for? I'm too tired to remember...
>
>
>>7. fragmentPattern at pat
>> --> tei.data.notation
>>
>>
>
>As above, if "tei.data.notation" is for "notations TEI made up", this
>doesn't belong.
>
>
>
See other note.
>>8. schemaSpec at namespace
>> elementSpec at ns (why isn't it "namespace" btw?)
>> --> these could be xsd:uri as proposed, or tei.data.pointer, but
>> since they have to be real namespace names (i.e. "#foo" won't do),
>> maybe they should be their own datatype?
>>
>>
>
>Yes, all our ns= and namespace= attributes need to be brought into
>alignment. I don't care which.
>
>However, as I read the spec, "#foo" is a perfectly valid namespace.
>Stupid perhaps, but valid.
>
>
>
OK, that solves that one.
>
>
>>9. tei.declarable at default
>> tei.identifiable at predeclare
>> metSym at terminal
>> numeric at trunc
>> binary at value
>> --> are all xsd:boolean (so "unknown" not allowed); could just
>> be tei.data.truthValue with extra rule
>>
>>
>
>Could be. And at one point in the history of that EDW90 table, they
>were. But a week or two ago we agreed to go directly with
>xsd:boolean.
>
>
>
So we either have to have a pair of datatypes, one permitting "unknown"
and one not, or we have to have an additional rule to exclude "unknown"
in cases where it makes no sense, like these.
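
A small Python sketch of the two options, for the sake of argument. It assumes that tei.data.truthValue would accept the xsd:boolean lexical forms plus the token "unknown"; that value set and the function names are assumptions, not an agreed definition.

```python
# The four lexical forms W3C Schema allows for xsd:boolean.
XSD_BOOLEAN = {"true", "false", "1", "0"}

# Assumed value set for a truth-value datatype that also admits "unknown".
TRUTH_VALUE = XSD_BOOLEAN | {"unknown"}


def is_boolean(value):
    """Option 1: a strict datatype with no 'unknown'."""
    return value in XSD_BOOLEAN


def is_truth_value(value, allow_unknown=True):
    """Option 2: one wider datatype plus an extra rule that can
    switch 'unknown' off for attributes where it makes no sense."""
    if value == "unknown" and not allow_unknown:
        return False
    return value in TRUTH_VALUE


print(is_boolean("unknown"), is_truth_value("unknown"),
      is_truth_value("unknown", allow_unknown=False))
```

With the extra rule applied, the second option accepts exactly what the first does.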
>
>
>>10. timeline at interval
>> when at interval
>> --> tei.data.numeric | -1 (or think of a better way of
>> doing the -1)
>>
>>
>
>a) We'd go back to needing unit= (I'm not saying this is so horrible,
> just want to make sure everyone understands the implications) and
> violate the policy of using W3C Schema datatypes where applicable.
>
>b) All of the proposed declarations of tei.data.numeric already
> permit -1.
>
>c) This permits all other negative numbers, which P4 explicitly
> disallows.
>
>d) If you meant 'tei.data.count', it won't do as fractions may be
> needed.
>
>e) Now that I think about it, the pattern EDW90 recommends has a
> problem, too: it permits "-0.5".
>
>Thus, I am now leaning towards
> xsd:long { minInclusive = "0" } | xsd:token "unknown"
>
>
>
Where did the "unknown" come from? I think it should remain
tei.data.numeric, but with the constraint that it can't be smaller than -1.
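
For comparison, here is a hedged Python sketch of the two constraints on interval= under discussion: a non-negative xsd:long or the token "unknown", versus a plain number no smaller than -1. The function names are hypothetical, and the decimal pattern is a simplification of whatever tei.data.numeric ends up being.

```python
import re
from decimal import Decimal

INTEGER = re.compile(r"[+-]?[0-9]+")
DECIMAL = re.compile(r"[+-]?(?:[0-9]+(?:\.[0-9]+)?|\.[0-9]+)")
XSD_LONG_MAX = 2**63 - 1


def accepts_long_or_unknown(value):
    """xsd:long { minInclusive = "0" } | xsd:token "unknown"."""
    if value == "unknown":
        return True
    if not INTEGER.fullmatch(value):
        return False
    return 0 <= int(value) <= XSD_LONG_MAX


def accepts_numeric_from_minus_one(value):
    """A decimal number constrained to be no smaller than -1."""
    if not DECIMAL.fullmatch(value):
        return False
    return Decimal(value) >= Decimal("-1")


for v in ["-1", "-0.5", "0", "0.5", "-2", "unknown"]:
    print(v, accepts_long_or_unknown(v), accepts_numeric_from_minus_one(v))
```

Note that the second check, as sketched, still lets "-0.5" through, which is the case raised in point (e), while the first rejects fractions altogether.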
>
>
>>11. several attributes with proposed datatypes of "xsd:NCName" ->
>> tei.data.name
>>
>>
>
>Again, only if we agree tei.data.name -> xsd:NCName, of which I've
>yet to be convinced.
>
>
>
See discussion elsewhere.
>
>
>>12. several attributes with proposed datatypes of
>> "xsd:nonNegativeInteger" --> there are enough of these that I
>> propose a new datatype "tei.data.count"
>>
>>
>
>Fine with me, although I'm not sure what it gains for us.
>
>
>
It gives us a meaningful TEI datatype.
>
>
>>13. sense at level -> tei.data.count
>>
>>
>
>Fine.
>(Just so everyone understands, the only real difference between
> xsd:unsignedShort
>and
> tei.data.count -> xsd:nonNegativeInteger
>is that the software engineer designing an application that reads a
>TEI-encoded dictionary knows that in the former case whenever she
>comes across a level= of <sense> she need only set aside 16 bits to
>hold the value; with the latter she gets no such assurances. The
>other difference, that xsd:unsignedShort has a maximum value of
>65,535, isn't a practical problem.)
>
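
The 16-bit point can be illustrated with a short Python sketch. The helper names are hypothetical, and the use of the struct module's "H" format is only one way an application might store such a value.

```python
import struct

# Largest value an unsigned 16-bit integer (xsd:unsignedShort) can hold.
UNSIGNED_SHORT_MAX = 2**16 - 1


def fits_unsigned_short(n):
    """True if n is within the xsd:unsignedShort value range."""
    return 0 <= n <= UNSIGNED_SHORT_MAX


def pack_level(n):
    """Store a level= value in exactly two bytes (little-endian).
    struct.pack raises struct.error if n is out of range."""
    return struct.pack("<H", n)


print(UNSIGNED_SHORT_MAX, len(pack_level(3)), fits_unsigned_short(70000))
```

With xsd:nonNegativeInteger there is no upper bound, so an application would need an arbitrary-precision integer, or its own limit, instead of a fixed two-byte field.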