[tei-council] DCR alignment inside ODD

Piotr Bański bansp at o2.pl
Thu Apr 26 11:51:55 EDT 2012


Thank you, Sebastian and Laurent,

There are several issues here. (I concentrate on Laurent's post and will
reply to Sebastian's separately.)

Firstly, you (Laurent) want the most radical move, i.e., making the
datcat attributes available to *all* elements. Two comments:

* I had the impression that this was not accepted by the Council, and
that the recommendation was to go from the bottom up, i.e. to select the
items that need the datcat stuff, and possibly increase the scope as
needed. I agree that the radical solution would simplify the issue,
given that ISOcat is able to provide alignment for practically any kind
of data category, not just purely linguistic ones. Perhaps we could
reopen the discussion on that; I'd be happy to see att.datcat where you
suggest.

The title of the ticket is general, but the description may indeed
suggest that this is a proposal restricted to linguistic stuff:

https://sourceforge.net/tracker/index.php?func=detail&aid=3432520&group_id=106328&atid=644065


* if you assume that the att.datcat attributes are global, why on earth
NOT use them on <equiv>? Where's the consistency? Sure thing, <equiv> has
the @uri attribute, which was, or could be, used for DCR alignment, since
there was no other tool to do it. But if you postulate global datcat
attributes, I see it as inconsistent and counterintuitive to demand that
on <equiv> alone, DCR alignment be handled by @uri rather than by the
available datcat attributes.

Secondly, yes, I know the example. It's nice until you imagine lots of
<fs> at the POS layer (take any serious corpus out there), at which
point it stops being nice and becomes seriously redundant, and makes
you think of shifting the DCR stuff at least to the level of the FSD.
Granted, FSD is still not quite there (sigh), so keeping all the stuff
within <f> is an unhappy temporary solution, good for presenting as one
of the examples in the spec, but maybe not in the Dictionaries chapter.

Sure thing, I can do the <equiv>alence for POS in the ODD, and then do:
<pos dcr:valueDatcat="http://www.isocat.org/datcat/DC-1256">CN</pos>
(for "common noun"), except that it's a variant of the problem with <f>,
namely redundancy, very clear in the context of a dictionary. It is also
a problem of a split mechanism (<equiv> for containers vs. local
dcr:valueDatcat for values) instead of a unified mechanism.

Let me make sure it's clear what I consider redundancy in this very
case: <pos> or @name="part of speech" have to be repeated many times,
that's OK. But if we add the dcr: stuff, then, together with the "local"
identifiers, we repeat the "global" identifiers, in every place
affected, instead of saying once, either in the ODD, schema, or header:
<pos> = "http://www.isocat.org/datcat/DC-1345", and then using <pos>,
with its meaning now clarified.
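
A rough sketch of the contrast, not a proposal for the final syntax: the
entry content is made up, the dcr: prefix is assumed to be bound to
http://www.isocat.org/ns/dcr as in the example quoted below, and the
use of dcr:datcat on <pos> presupposes the global att.datcat variant.

    <!-- global identifiers repeated in every entry -->
    <gramGrp>
      <pos dcr:datcat="http://www.isocat.org/datcat/DC-1345"
           dcr:valueDatcat="http://www.isocat.org/datcat/DC-1256">CN</pos>
    </gramGrp>

    <!-- versus: stated once, here in the ODD -->
    <elementSpec ident="pos" mode="change">
      <equiv uri="http://www.isocat.org/datcat/DC-1345"/>
    </elementSpec>

    <!-- after which each entry carries only the local label -->
    <gramGrp><pos>CN</pos></gramGrp>

The second variant still leaves open where the alignment of the value
"CN" itself would be declared.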

Still, I'm grateful for the replies and discussion because they removed
my doubts concerning the here-and-now: it's better to have the DCR stuff
officially in the TEI than to create roundabout solutions of the type I
talked about in Zadar and implemented in FreeDict. For the reasons that
I gave above, it feels to me like a half-way solution, but still, as we
all know, it's better to have it than not to have it. I will now put
an example into the DI chapter (maybe even without mentioning <equiv>
for the time being), and will be happy to make a step forward. I think I
wanted too much too soon (and feared how overwhelming it might become,
and that it would go beyond the brief Council discussion that we've
had).

Best,

  P.


On 26/04/12 09:44, Laurent Romary wrote:
> I guess you are currently working on 3432520
> 
> There are two distinct mechanisms here:
> - the normal use of <equiv> within an ODD spec
> - the on-the-fly declaration of equivalence on an element instance ("I
> used <pos> here, meaning exactly the POS in ISOCat")
> For the latter purpose, ISO 12620 introduces two attributes in the dcr:
> namespace, for instance (example provided by Menzo in CC), you can
> decorate an FS as follows
> <tei:TEI xmlns:tei="http://www.tei-c.org/ns/1.0" xmlns:dcr="http://www.isocat.org/ns/dcr">
>     ...
>     <tei:fs>
>         ...
>         <tei:f
>             name="part of speech"
>             dcr:datcat="http://www.isocat.org/datcat/DC-1345"
>             fVal="common noun"
>             dcr:valueDatcat="http://www.isocat.org/datcat/DC-1256"
>         />
>         ...
>     </tei:fs>
>     ...
> </tei:TEI>
> 
> 
>  So, looking again at the ticket the situation is clear, you
> make att.global a member of att.datcat, but make clear in the guidelines
> that this does not replace <equiv>
> 
> 
> Le 25 avr. 2012 à 22:59, Sebastian Rahtz a écrit :
> 
>>
>> On 25 Apr 2012, at 21:40, Piotr Bański wrote:
>>
>>> I'm working on the ISO DCR / ISOcat issues.[1] Got stuck at the point of
>>> adding the relevant pieces of text to the Guidelines.
>>>
>>> The enlightened way to align grammatical categories with the values of
>>> the DCR is to put the appropriate references into the ODD, and I guess
>>> <equiv> is the ideal place for that.
>>>
>>> I imagine, and please correct me if I am wrong, that for elements such
>>> as <pos>, this action may be trivial:
>>>
>>> <elementSpec ident="pos" mode="change">
>>>  <equiv dcr:datcat="http://www.isocat.org/datcat/DC-1345"/>
>>> </elementSpec>
>>
>> <equiv url="http://www.isocat.org/datcat/DC-1345"/> is the syntax, I
>> think.
>>
>>> The above makes it possible for us to happily realize that whenever we
>>> do e.g.
>>>
>>> <gramGrp><pos>...</pos></gramGrp>
>>>
>>> all the machines in the world may know that by <pos>, we mean
>>> http://www.isocat.org/datcat/DC-1345 .
>> well, if they read the ODD yes. I think there is a certain amount
>> of "simple matter of programming" involved here.
>>
>>>
>>> However, there is also the content of <pos> to be handled, and it is not
>>> so obvious to me how to represent this in the ODD. Intuitively, I'm
>>> thinking of
>>>
>>> <elementSpec>
>>> ...
>>> <content>
>>> {list of values with their DCR references}
>>> </content>
>>
>> a <elementSpec> can contain a <valList>, whose <valItem> children
>> can have <equiv> children
>>
>> Does that help?
>>
>> I suspect what you'd really like is to use a DTD which supplied
>> default dcr:cat attributes to
>> instances of <pos>.
>>
>> Sebastian
>>
> 
> Laurent Romary
> INRIA & HUB-IDSL
> laurent.romary at inria.fr
> 
> 
> 