[tei-council] Note on 2372570, "element for punctuation marks"

Lou Burnard lou.burnard at oucs.ox.ac.uk
Sun Mar 29 16:06:15 EDT 2009


I just want to point out that the "level of abstraction" purportedly 
introduced by the entity reference markup proposed in P4 was a snare 
and a delusion, like any other semantic markup based on the use of 
character entity names, which is why we got rid of it. If you want to 
introduce semantic distinctions amongst your punctuation marks, you must 
do it with proper XML markup constructs like distinct elements, 
attribute values or whatever.
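
To make the point concrete, here is a minimal sketch in Python using the 
standard xml.etree.ElementTree parser. The entity names (dash.pause, 
dash.break) and the type values on <c> are invented for illustration and 
are not taken from P3, P4 or P5. It shows that a distinction carried only 
by entity names disappears once the document is parsed, whereas one 
carried by an attribute value is still there afterwards.

```python
import xml.etree.ElementTree as ET

# Two hypothetical entity names that both expand to the same dash (U+2014).
ENTITY_DOC = """<!DOCTYPE l [
  <!ENTITY dash.pause "&#x2014;">
  <!ENTITY dash.break "&#x2014;">
]>
<l>first clause &dash.pause; second clause &dash.break;</l>"""

# The same distinction expressed with elements and (made-up) type values.
ELEMENT_DOC = (
    '<l>first clause <c type="pause">&#x2014;</c> '
    'second clause <c type="break">&#x2014;</c></l>'
)

entity_line = ET.fromstring(ENTITY_DOC)
element_line = ET.fromstring(ELEMENT_DOC)

# After parsing, the entity names are gone: only two identical dashes remain.
entity_text = "".join(entity_line.itertext())
surviving_children = len(entity_line)

# The attribute values are still available to any downstream process.
dash_types = [c.get("type") for c in element_line.iter("c")]

print(repr(entity_text))
print(surviving_children)
print(dash_types)
```

Any XML processor that expands internal entities behaves the same way 
here, which is the sense in which the entity-name "abstraction" never 
reaches the application.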

Without wishing to prejudge the issue, I remain to be convinced that we 
need something different from <g> or <c> to do this. It's hard enough 
explaining the difference between <g> and <c>. I'm not looking forward 
to explaining why the TEI has three different ways of tagging 
punctuation marks.



  David Sewell wrote:
> As an addendum to this note, if you compare the P3 section on "Treatment 
> of Punctuation"
> 
>    http://www.tei-c.org.uk/Vault/GL/P3/CO.htm#COPU
> 
> with the P5 version
> 
>    http://www.tei-c.org/release/doc/tei-p5-doc/en/html/CO.html#COPU
> 
> it seems that the TEI-specific entity names in P3 offered a level of 
> abstraction for punctuation marks that we have lost in P5 and that the 
> proposed <punct> element would restore.
> 
> (For what it's worth, the word "punctuation" has occurred in only 3 
> different threads since 1997 on TEI-L.)
> 
> David
> 
> On Sun, 29 Mar 2009, David Sewell wrote:
> 
>> ["Adopt-a-RED" note]
>>
>> Submitter: Alexei Lavrentev
>>
>> Reference:
>> http://sourceforge.net/tracker/?func=detail&aid=2372570&group_id=106328&atid=644065
>>
>> DISCUSSION:
>>
>> Alexei has presented an extended argument for the addition of a <punct>
>> element in his revised TEI MM 2008 paper, available here:
>>
>> http://sourceforge.net/tracker/download.php?group_id=106328&atid=644065&file_id=303629&aid=2372570
>>
>> I think we're going to need to talk about this face to face; I don't
>> feel competent to make a yes/no recommendation without further
>> discussion. I would urge everyone to read the paper ahead of our meeting
>> (it's only 5 pages), out of courtesy to Alexei if nothing else, as he is
>> one of the local organizers.
>>
>> My own opinion is that he makes a strong case for a dedicated
>> punctuation element. I think he's right that within the context of
>> Linguistic Segment Categories (17.1), <c> makes much more sense as
>> markup for characters that can be part of words or morphemes. I'm not in
>> a position to evaluate his arguments based on automated language
>> processing or medieval manuscript practice, but I can immediately see an
>> application in an area I'm more familiar with, encoding of manuscript
>> verse by Emily Dickinson. Her use of the dash is notorious for its
>> polysemy, and attempts have been made to characterize different types of
>> dash. From that point of view there is no single dash "character",
>> rather a set of idiosyncratic punctuation markers that may or may not
>> take identifiable distinct forms. Having a <punct> element available
>> would simplify interpretive markup of her verse.
>>
>> David
>>
>>
> 


