[tei-council] DCR alignment inside ODD
Piotr Bański
bansp at o2.pl
Thu Apr 26 14:10:29 EDT 2012
[keeping Laurent in the loop, as he requested]
On 26/04/12 18:48, Lou Burnard wrote:
> On 26/04/12 17:12, Piotr Bański wrote:
>
> [... snip ... ]
>
>>> a <elementSpec> can contain a <valList>, whose <valItem> children
>>> can have <equiv> children
>>>
>>> Does that help?
>>
>> Some. Thanks. I looked at valItem but the description made me shy away
>> from it ("contains one or more valItem elements defining possible values
>> for an *attribute*") -- it made me think that using it for element
>> content is Bad.
>
>
> I'd say that description is erroneous and should be revised. Please put
> in a ticket.
Done.
https://sourceforge.net/tracker/?func=detail&aid=3521714&group_id=106328&atid=644062
>>> I suspect what you'd really like is to use a DTD which supplied
>>> default dcr:cat attributes to instances of <pos>.
>>
>> I'm not sure how to handle this in DTDs. default dcr:datcat pointing at
>> a definition of the POS, sure. But I can't see how to use this approach
>> for the values (noun, verb, etc.), maybe I'm missing something again.
>>
>
> I am coming to this discussion under-prepared, but for what it's worth,
> it seems to me that if what you want is to say "my <pos> elements all
> have content/values defined by the ISO DCR", you certainly don't need to
> say it on every <pos> occurrence. You could either say it in your ODD
> using <equiv> (as previously noted), or you could also say it in the
> <encodingDesc> somewhere. Similarly if you wanted to say that for your
> @type attributes or anything else. But this seems different from saying
> that your @type attribute or <pos> element itself is defined by the ISO
> DCR.
I want to say about <pos>noun</pos> that:
1) the concept expressed by <pos> is this-and-that Data Category kept at
PID X (that's the dcr:datcat pointing at the definition of
"part-of-speech"), and
2) the value of that POS is this-and-that Simple Data Category kept at
PID Y (that's the dcr:valueDatcat pointing at the definition of the
concept "noun").
(note that I am restricting this to linguistic examples, but you can
just as well have Data Categories for the concept of "author" or "sex",
or "trochee", etc., with the same reference machinery -- this is why
Laurent wants them global)
In particular, I would like to know that when dictionary A says that
something is "fem", dictionary B that it is "f", and C that it is
"feminine" (or "ż", "żeń.", or "weibl.", etc.), they all talk about the
same value of the category "Gender" (so I use dcr:datcat for the concept
"Gender", and dcr:valueDatcat for the concept "Feminine").
Conversely, when one dictionary tells me that something is "n", and
another that something else is "n", I want to make sure to indicate that
the first one talks about the concept "noun", but the other about the
concept "neuter", so I don't want to combine them in my search, or in my
combined mega-dictionary.
So it's not just about saying that "my pos elements have content defined
by the ISO DCR", but I need to be more granular, and actually identify
the concepts by their PIDs. I could indicate that to humans by e.g.
"neut" and to machines by the appropriate valueDatcat, at the same time
-- this is roughly the extension of the <f> example mentioned by
Laurent. And I guess this is the stage which can be encoded in the
Guidelines right now.
<gen dcr:datcat="{PID of 'Gender'}"
dcr:valueDatcat="{PID of 'Neuter'}">neut</gen>
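A sketch of how the same markup would cover the cases above. I am assuming
here that the dcr prefix is bound to the ISOcat namespace; the <entries>
wrapper and the comments are only there for illustration, and the braces
are unfilled placeholders for PIDs, as in the example just given:

```xml
<entries xmlns:dcr="http://www.isocat.org/ns/dcr">
  <!-- dictionaries A, B and C: different strings, same value category -->
  <gen dcr:datcat="{PID of 'Gender'}"
       dcr:valueDatcat="{PID of 'Feminine'}">fem</gen>
  <gen dcr:datcat="{PID of 'Gender'}"
       dcr:valueDatcat="{PID of 'Feminine'}">f</gen>
  <gen dcr:datcat="{PID of 'Gender'}"
       dcr:valueDatcat="{PID of 'Feminine'}">feminine</gen>

  <!-- the same string "n", two different concepts -->
  <pos dcr:datcat="{PID of 'part-of-speech'}"
       dcr:valueDatcat="{PID of 'noun'}">n</pos>
  <gen dcr:datcat="{PID of 'Gender'}"
       dcr:valueDatcat="{PID of 'Neuter'}">n</gen>
</entries>
```

A search keyed on dcr:valueDatcat rather than on the element content would
then group the first three and keep the last two apart.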
-------------------------------------------
What I talked about in Zadar was a way to state, just *once* per
dictionary, that "wherever I use 'neut' below as the value of <gen>, I
mean this-and-that DC under this PID". So in the body of the dictionary,
one would only use "neut" (incidentally human-readable and short), but
the header would tie this string appropriately to the relevant PID. I
guess that this is a matter for at least one Council session, and I hope
that LingSIG will come up with a coherent proposal around College
Station or Oxford, whichever comes first.
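To make the idea concrete, here is one possible shape for such a
once-per-dictionary statement, borrowing the <valList>/<valItem>/<equiv>
route that Lou mentioned. This is only a sketch under assumptions, not a
proposal: it assumes that <equiv> with @name and @uri may be read as a
pointer to a Data Category, and the braces are again unfilled placeholders
for PIDs.

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="gen" mode="change">
  <!-- the concept expressed by the element itself -->
  <equiv name="Gender" uri="{PID of 'Gender'}"/>
  <valList type="closed">
    <!-- each string used as content in the body, tied to its value category -->
    <valItem ident="neut">
      <equiv name="Neuter" uri="{PID of 'Neuter'}"/>
    </valItem>
    <valItem ident="fem">
      <equiv name="Feminine" uri="{PID of 'Feminine'}"/>
    </valItem>
  </valList>
</elementSpec>
```

The body of the dictionary would then carry only <gen>neut</gen>, and a
processor would have to resolve the string against this declaration.
Whether that reading of <equiv> is acceptable is exactly the kind of thing
the Council would need to decide.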
best,
P.