[tei-council] two <date> proposals: 1 lumping, 1 splitting

Lou Burnard lou.burnard at computing-services.oxford.ac.uk
Sun Oct 8 10:37:35 EDT 2006

A week or two ago, James and I had a discussion about these and related 
issues which I never got round to posting here. To summarize briefly, we 
thought it would be helpful

Syd Bauman wrote:
> lumping
> -------
> We currently have two attribute classes for attributes directly
> related to dates & dating:
>   att.datePart (should be 'att.dated') --
>   provides value= and dur= to:
>       <date>, <day>, <distance>, <hour>, <minute>, <month>,
>       <occasion>, <second>, <time>, <week>, <year>
The class was called att.datePart because the members of it were all 
also members of model.datePart, i.e. they could appear as subcomponents 
of a <date>. (Yes, a date can appear within a date, as in "the Monday 
after my birthday", or
<date>the <day>Monday</day> after <date>my birthday</date></date>.)

I suggested, and James did not violently dissent from the view, that maybe 
this degree of tagging was a bit unhealthy, and that we might consider 
either silently dropping these elements from the already rather bloated 
ND chapter, or giving them only as an example of the sort of extension 
someone might like to make. We've been ridiculed in some quarters for 
this sort of "markup voodoo" and I don't know of any single case where 
it's been used in earnest. These are not conclusive arguments for 
removing these elements, I agree.

>   att.datable --
>   provides notBefore=, notAfter=, from=, and to= to:
>       <acquisition>, <affiliation>, <binding>, <birth>, <custEvent>,
>       <date>, <death>, <education>, <faith>, <floruit>,
>       <langKnowledge>, <langKnown>, <nationality>, <origDate>,
>       <origin>, <persEvent>, <persName>, <persState>, <persTrait>,
>       <provenance>, <relation>, <residence>, <seal>, <sex>,
>       <socecStatus>, and <time>
> It has been suggested that these two classes be merged. At
> first look this seems to me like a very good idea. After all, any
> element that is in att.datable is there so one can describe rough
> dates about it. If one knew an exact date, one would be happy to use
> it, and
>     value="2006-09-10"
> is a lot nicer than 
>     notBefore="2006-09-10" notAfter="2006-09-10"
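
The equivalence claimed here is mechanical. A minimal sketch in Python, 
purely illustrative: the function name to_range and the use of a plain 
dict to hold the attributes are assumptions of this example, not anything 
defined by the TEI scheme.

```python
def to_range(attrs):
    """Reduce dating attributes to a (notBefore, notAfter) pair.

    `attrs` is a hypothetical dict of attribute names to date strings.
    An exact value= is treated as a range whose two ends coincide.
    """
    if "value" in attrs:
        return (attrs["value"], attrs["value"])
    # A missing bound is returned as None, i.e. open-ended on that side.
    return (attrs.get("notBefore"), attrs.get("notAfter"))


exact = to_range({"value": "2006-09-10"})
rough = to_range({"notBefore": "2006-09-10", "notAfter": "2006-09-10"})
```

Both spellings of the exact date reduce to the same pair, which is the 
sense in which value= is only a shorthand.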


> But on closer examination, it certainly makes no sense to have a
> value= attribute on <persName> or <langKnown> that is a date!
I am not so sure about that. Doesn't it mean that we can locate the name 
or language knowledge to a specific date, but don't wish to claim 
anything about how long the state of affairs maintained?

There is a general ambiguity about whether a date range is meant to be 
interpreted as a span of time with exact start and end, or as some point 
in time within that span. Does notBefore="jun" notAfter="aug" mean "the 
whole period from 1 Jun to 31 Aug" or "some point in time between those 
two dates"? Presumably this depends on what is being dated -- if we are 
talking about say a birth (a persEvent), then it's more likely we mean 
the latter; if a name (a persTrait), it is probably the former. 
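
The two readings can be made concrete with a small sketch in Python. It 
is illustrative only: the function name applies_on, the labels "span" and 
"point", and the choice of 2006 as the year are all assumptions of the 
example.

```python
from datetime import date


def applies_on(not_before, not_after, day, reading):
    """Say whether the dated thing applies on `day`: 'yes', 'no' or 'maybe'.

    reading="span":  the state held throughout the whole range.
    reading="point": the event fell on some unknown day within the range.
    """
    if not (not_before <= day <= not_after):
        return "no"
    if reading == "span" or not_before == not_after:
        return "yes"
    # Point reading over more than one day: possible, but not certain.
    return "maybe"


june_1 = date(2006, 6, 1)     # assumed year, for illustration only
august_31 = date(2006, 8, 31)
mid_july = date(2006, 7, 15)
```

Under the "span" reading a name is certainly in use on 15 July; under the 
"point" reading a birth only may have happened on that day. The 
attributes are identical in both cases, so only the element they sit on 
tells the reader which answer is intended.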

I think we are stuck with that ambiguity and to be honest it doesn't 
really worry me very much. I am more concerned to try to find some way 
of reducing the explosion of attributes for dating which will come by 
combining all the ideas so far mooted. See separate note on this topic, 
which will come in a moment.

> splitting
> ---------
> Currently we support only W3C recommended formats for normalization
> of date & time format via data.temporal and data.duration, with one
> exception that I insisted on: that times can be expressed with
> reduced precision.
> Emerging from a conversation with the author of a date-related
> feature request is an idea to simultaneously support both W3C
> recommended and ISO standard formats. We would have two different
> formats, data.temporal.w3c and data.temporal.iso. The former would
> just use the W3C recs w/o that horrific regexp to permit reduced
> precision times. The latter would be full of horrific regexps[1] to
> constrain it to the ISO 8601:2004 standard (which permits quite a few
> things W3C does not, e.g. times w/ reduced precision, the year "0",
> time spans indicated by duration & end-point, etc.).
> Users could then choose, in their ODD, whether they wanted to go the
> easy but limited W3C route that has guaranteed software support, or
> the kitchen sink but no software ISO route. We would get to argue
> over which is the default.

I think it would be an excellent idea to have a datatype which is more 
restrictive than the current data.temporal and which makes no bones 
about being so. I think we should also provide a user-definable datatype 
-- or possibly a set of them for commonly used varieties. And I think 
we should provide a pair of attributes, one which is guaranteed to 
always have a data.temporal.w3c value and another which is guaranteed to 
always have a data.temporal.myNormalisationStyle value. More on this in 
my next post as well.
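
To illustrate the difference between a strict and a permissive datatype, 
here is a rough sketch in Python covering times of day only. The two 
patterns are deliberately simplified stand-ins, not the real W3C or ISO 
grammars (no time zones, fractional seconds, or basic format without 
colons), and the names W3C_TIME, ISO_TIME and accepted_by are invented 
for the example.

```python
import re

# Simplified stand-in for the W3C form: hours, minutes and seconds required.
W3C_TIME = re.compile(r"(?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d")

# Simplified stand-in for ISO 8601 reduced precision: seconds, or minutes
# and seconds, may be dropped from the right.
ISO_TIME = re.compile(r"(?:[01]\d|2[0-3])(?::[0-5]\d(?::[0-5]\d)?)?")


def accepted_by(value):
    """List which of the two simplified datatypes accept `value`."""
    names = []
    if W3C_TIME.fullmatch(value):
        names.append("w3c")
    if ISO_TIME.fullmatch(value):
        names.append("iso")
    return names
```

With these patterns "10:37:35" satisfies both, whereas "10:37" and "10" 
satisfy only the ISO-style pattern, so a value held in a strictly 
W3C-typed attribute would always be acceptable to the looser one as well.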
