[tei-council] date attributes: summary, problems, and some suggestions

Christian Wittern wittern at kanji.zinbun.kyoto-u.ac.jp
Fri Feb 2 18:05:29 EST 2007


Thanks for the reminder, Syd.  Here are my comments.

Syd Bauman wrote:


> 
>       value=  of  <date>, <time>, <distance>, and <docDate>
>       date=   of  <birth>, <change>, and <death>
>   I am not bothered by this in the least, because I think the
>   semantics are clearer with these names, and the combined
>   alternative (dateValue=) is at least cumbersome if not misleading
>   (i.e., on <time>).
>   Suggestion: leave names as they are. 

fine

> 
> * We haven't implemented classes as well as we could.
>   Suggestions: 
> 
>   - Put <docDate> into att.datePart. This has the disadvantage of
>     giving <docDate> a dur= attribute, but I'm not sure it is worth
>     making another class just for this one case. Thoughts?

We should not carry the class economy too far, I think.  Having 
attributes that do not make sense for a certain element should 
automatically recommend it for a separate class.

> 
>   - Create a new attribute class for the date= of <birth>, <death>,
>     and <change>. (Any suggestions for the name?) 

att.datePart.date?

>   - If we keep <distance>[1] we may wish to reconsider its class
>     membership, as value= is a bit silly on <distance>. It needs only
>     dur= from att.datePart, making two cases that benefit from
>     splitting att.datePart. (See <docDate>, above.)

see <docDate>, above.


> * Users want a method of expressing things like "Oct 27 of 1909, 1910,
>   or 1911" or "an Oct 27, but I don't know which one". The W3C format
>   that expresses only a month and day explicitly (xsd:gMonthDay) means
>   "a set of one-day long, annually periodic instances". These users
>   don't want the entire set, they want only one. ISO 8601:2004 does
>   not seem to have even a method to represent the set, let alone a
>   singleton. (James, can you verify that? How would one represent
>   month & day, no year, in 8601?)
>   Suggestion: I haven't got one, thus defer to P5 1.1.

What's the point of putting this in an attribute anyway?
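
To make the distinction in the quoted paragraph concrete, here is a minimal
Python sketch. The function names are made up for illustration and are not
part of any TEI or W3C tooling. It contrasts the xsd:gMonthDay lexical form,
which stands for the recurring day in every year, with an explicit list of
candidate dates, of which exactly one is the real one.

```python
from datetime import date
from typing import Iterable, List


def g_month_day(month: int, day: int) -> str:
    """Lexical form of xsd:gMonthDay: the same day recurring in every year."""
    return f"--{month:02d}-{day:02d}"


def candidate_dates(month: int, day: int, years: Iterable[int]) -> List[str]:
    """One xsd:date string per candidate year; only one of them is the real date."""
    return [date(year, month, day).isoformat() for year in years]


print(g_month_day(10, 27))                         # --10-27
print(candidate_dates(10, 27, range(1909, 1912)))  # ['1909-10-27', '1910-10-27', '1911-10-27']
```

Neither form says "one of these, but I don't know which" in a single
attribute value, which is the gap described above.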


> * At least one user has expressed a need to express dates in other
>   than the [proleptic] Gregorian calendar. He believes this would be a
>   requirement of many historians were they to use TEI.
>   Solutions: see below
> 
> Below
> -----
> Two different suggestions have been floated for trying to get a handle
> on the last three problems, to which I will add two more. 
> 
> The basic idea is to provide two capabilities: 
> * simple date format: conform to W3C spec, easily validatable, software
>   support in the world-at-large
> * complex date format: should conform to ISO 8601 if possible
> 
> Note that "simple" and "complex" are mostly just labels: it is
> possible to have a W3C date expression that is more complex than some
> other format. The complex date format could be split into two: those
> that conform to ISO 8601 and those that don't; this would give us
> three formats, W3C, ISO, and User-generated.
> 
> Note that P4 has only complex format dates. Further note that right
> now our P5 dates are very like the simple date format, except that a
> single complexity has been added: expressing times precise only to the
> minute or hour. This complexity is validatable, but enjoys no support
> in the world of XSLT 2.0. If we go with *any* of the following
> systems, I recommend that our "simple date" formats revert to being
> truly W3C-only, and thus those who need to express times less
> precisely than to the second would be forced to use the "complex date"
> format.

This seems to be quite desirable to me.


> The question is at what level to apportion these capabilities. Here
> are the four possibilities I have come up with. Note: the names are
> ones I have MADE UP on the spot, and are merely stand-ins for whatever
> Council eventually decides they should be named.
> 
> attribute level: each of the dating attrs is split into two
> datatype level: we provide one datatype for each date format, user
>                 chooses which for each attribute
> class level: for each attribute set, we provide two (or more) classes,
>              one for each format, user chooses which for each element
> all-in-one: syntax of attribute value differentiates
> 
> datatype level
> -------- -----
> We create two or three datatypes, one for each date format. 
> 
> data.w3cTemporal = xsd:date | xsd:gYear | xsd:gMonth | xsd:gDay |
>                    xsd:gYearMonth | xsd:gMonthDay | xsd:time |
>                    xsd:dateTime
> 
> data.isoTemporal = [if & when a datatype library is written, plug it
>                     in here; in the meantime, a bunch of gnarly
>                     regexes might do the trick.]
> 
> data.usrTemporal = xsd:token [3] or whatever user chooses to use
> 
> (Latter two could easily be rolled into one 'data.looseTemporal'.)
> 
> The user, at schema-creation time (perhaps with easy radio buttons in
> Roma) chooses which datatype to use for any given attribute. (A nice
> UI feature would allow user to select a datatype for *all* dating
> attributes at one shot.)

At the moment, I am inclined to go with this solution.  It seems to hide 
most of the complexity for standard use cases, but gives the building 
blocks for more flexibility, if needed.
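
As a rough illustration of what the proposed data.w3cTemporal union would
accept, here is a Python sketch. The regular expressions are my own
approximations of the W3C lexical forms, not a complete implementation: they
do not check days against month lengths, and they ignore edge cases such as
24:00:00 or year 0000. The name classify_w3c_temporal is hypothetical.

```python
import re
from typing import Optional

# Approximate building blocks for the W3C Schema lexical forms (assumption:
# simplified, no per-month day checks, no special-casing of 24:00:00).
TZ = r"(?:Z|[+-]\d{2}:\d{2})?"
YEAR = r"-?\d{4,}"
MONTH = r"(?:0[1-9]|1[0-2])"
DAY = r"(?:0[1-9]|[12]\d|3[01])"
TIME = r"(?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d(?:\.\d+)?"

# One pattern per member of the proposed data.w3cTemporal union.
W3C_TEMPORAL = {
    "xsd:date": rf"{YEAR}-{MONTH}-{DAY}{TZ}",
    "xsd:gYear": rf"{YEAR}{TZ}",
    "xsd:gMonth": rf"--{MONTH}{TZ}",
    "xsd:gDay": rf"---{DAY}{TZ}",
    "xsd:gYearMonth": rf"{YEAR}-{MONTH}{TZ}",
    "xsd:gMonthDay": rf"--{MONTH}-{DAY}{TZ}",
    "xsd:time": rf"{TIME}{TZ}",
    "xsd:dateTime": rf"{YEAR}-{MONTH}-{DAY}T{TIME}{TZ}",
}


def classify_w3c_temporal(value: str) -> Optional[str]:
    """Return the name of the first union member whose pattern matches, else None."""
    for name, pattern in W3C_TEMPORAL.items():
        if re.fullmatch(pattern, value):
            return name
    return None


print(classify_w3c_temporal("1909-10-27"))  # xsd:date
print(classify_w3c_temporal("--10-27"))     # xsd:gMonthDay
print(classify_w3c_temporal("13:45"))       # None
```

The last call shows the consequence Syd mentions: a time precise only to the
minute falls outside the W3C-only format and would have to use the "complex
date" datatype.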

What still bothers me is
 > The rule in P4 for all of the attributes that held a date/time value
 > is pretty simple. It boils down to "if you can use ISO 8601, do so; if
 > not, document whatever you do in <stdVals> in the header".

Now in P5, we do everything in the ODDs.  Do we still require this 
documentation, or do we just assume that the ODD will be available? 
This brings us once again to the eternal question of how to link from 
the instance to the ODD that governed the creation of the schema 
according to which the instance validates (the instance's grandmother, 
so to speak).

best, chw



