[tei-council] Another question....

Laurent Romary laurent.romary at loria.fr
Tue Jun 9 09:25:58 EDT 2009


In a way, <seg> does not prejudge the kind of granularity that people
might have in mind. If we think of things like dialogue acts (as
discussed two weeks ago in an ISO/TC 37/SC 4 meeting), we may need
content such as <q> or <list> (remember that you can have a list within
a sentence).
That said, I find it hard to convince myself that this is a good
argument against achieving better coherence among the model.segLike
objects...

OK, so this points in the direction of a "+1"...


On 9 June 2009, at 14:03, Lou Burnard wrote:

> OK, that's cool.
>
> Next question: what about <seg>?
>
> At present, its content model is macro.paraContent. All the other
> members of model.segLike have a content model of macro.phraseSeq (the
> difference is that macro.paraContent additionally permits members of
> model.inter, such as lists).
>
> I think this is anomalous. Can anyone come up with a specific use case
> where a <seg> should need to include things that can appear between
> paragraphs? Note that it cannot currently contain paragraphs!
>
>
>
> Laurent Romary <laurent.romary at loria.fr> writes:
>> I would definitely avoid breaking this, and would not impose too many
>> constraints at the level of the Guidelines. The flexibility we have at
>> present should be counterbalanced by annotation projects defining
>> precise encoding manuals, depending on the kind of precision and depth
>> they want to achieve. We had a long discussion on what <pc> should be
>> equivalent to, namely <w> or <c>, and I am still not sure that we
>> should restrict it to <w>.
>>
>> On 9 June 2009, at 13:36, Lou Burnard wrote:
>>
>>> At present, segments smaller than <w> (e.g. <c>, <m>) and those
>>> larger (e.g. <phr>) are all members of model.segLike. This means that
>>> (inter alia) the following are all valid:
>>> (i) <phr> <c/> </phr>
>>> (ii) <phr> <w> <c/> </w> <c/> </phr>
>>> (iii) <c> <w> <phr/></w></c>
>>> While we can all agree that (iii) is barking mad and should be
>>> stopped, it's less clear what to do about (ii). On the one hand, we do
>>> now have a nice new <pc> element for punctuation, which could be
>>> defined as a sibling of <w>, so that (ii) could be replaced by
>>> (ii*) <phr> <w> <c/> </w> <pc/> </phr>
>>> On the other hand, in the absence till now of <pc>, there are
>>> literally millions of words of corpus out there which follow the
>>> pattern of (ii). Do we really want them all to become broken?
>>> I think not, but maybe (as one of the perpetrators) I'm biassed.
>>> _______________________________________________
>>> tei-council mailing list
>>> tei-council at lists.village.Virginia.EDU
>>> http://lists.village.Virginia.EDU/mailman/listinfo/tei-council
>>
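
To make the nesting question in examples (i) to (iii) above concrete,
here is a minimal sketch in Python. It is not derived from the TEI
schema: the RANK table and the function name inverted_nestings are
hypothetical, and the relative order of <c> and <m> is an assumption
(the thread only says both are smaller than <w>). The check simply
flags any element that directly contains a "larger" segment.

```python
import xml.etree.ElementTree as ET

# Hypothetical size ranking, smallest first. This is an illustration only,
# not a TEI class definition; <pc> is ranked as a sibling of <w>.
RANK = {"c": 0, "m": 1, "w": 2, "pc": 2, "phr": 3}


def inverted_nestings(xml_text):
    """Return (parent, child) tag pairs where the child outranks the parent."""
    root = ET.fromstring(xml_text)
    problems = []
    for parent in root.iter():          # document order, root included
        for child in parent:            # direct children only
            if parent.tag in RANK and child.tag in RANK:
                if RANK[child.tag] > RANK[parent.tag]:
                    problems.append((parent.tag, child.tag))
    return problems


if __name__ == "__main__":
    examples = {
        "(i)": "<phr> <c/> </phr>",
        "(ii)": "<phr> <w> <c/> </w> <c/> </phr>",
        "(iii)": "<c> <w> <phr/></w></c>",
        "(ii*)": "<phr> <w> <c/> </w> <pc/> </phr>",
    }
    for label, text in examples.items():
        print(label, inverted_nestings(text))
```

Under this ranking only (iii) is flagged; (ii) and (ii*) both pass, so a
rule of this kind would not by itself break existing corpora that follow
pattern (ii).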


