[tei-council] entities delenda sunt?

Lou Burnard lou.burnard at oucs.ox.ac.uk
Sun Oct 7 06:00:03 EDT 2007


Syd Bauman wrote:
>> Thanks for the suggestion. Yes, of course, they are gaiji. I have
>> hacked the text accordingly.
>>     
>
> I'm not 100% sold on this use of <g>. I don't have any way to access
> the Gavioli and Mansfield source, but were they really using
> non-standard characters to indicate things like "lengthened
> syllable", or rather were they using standard characters with
> non-standard semantics? If the former, great -- gaiji to the rescue!
> But if the latter, I'm not sure I'm comfortable with <g>. After all,
> when I use "a" as a variable instead of an indefinite article I don't
> use <g>, I use <code>.
>
>   
In Gavioli and Mansfield (the book) they used conventional punctuation 
characters or other features (like underlining) in unconventional ways. 
For example, they use the hyphen to mean "preceding syllable cut short" 
and the sequence "space hyphen" to mean "preceding tone group 
interrupted".  Some of their conventions map directly to elements we 
already have (e.g. they use + to mean <pause>, and underlining to mean 
<emph>); others don't. I think it would be a really nice project to go 
back and retrofit some of the things we don't have proper elements for 
into this module, but not for 1.0.
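
For the conventions that do map directly, the correspondence is roughly 
as follows. This is only a sketch: the utterance is made up for 
illustration and is not taken from G&M, and I am assuming the underlining 
covers a single word.

```xml
<!-- G&M printed form (hypothetical utterance):  so + you want the _book_ -->
<u>so <pause/> you want the <emph>book</emph></u>
```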

My understanding of Unicode is that the distinction you seem to be 
making between "nonstandard characters" and "standard characters with 
nonstandard semantics" is a bit dubious. A "character" as distinct from 
a "glyph" (notably) has semantics. Unicode distinguishes the *character* 
aleph-meaning-infinity from the *character* aleph; it defines the 
"number separator" character independently from its two (or more) glyph 
variants "," and ".". So the fact that G&M use comma to mean 
"fall-rise-intonation-detected-here" makes it plausibly a different 
character in my view. (I also considered whether it might be regarded as 
a glyph variant, but then I would have had to decide what it was a 
variant of, which seemed difficult.)
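
As a rough sketch of what treating the G&M comma as a distinct character 
could look like in markup: the declaration would sit in a <charDecl> 
inside the header's <encodingDesc>, and the transcription would point at 
it with <g>. The xml:id "fallrise", the character name, and the utterance 
are all invented here for illustration; none of them comes from the 
actual text or from Unicode.

```xml
<!-- In the header: declare the non-standard character.
     The id and name below are hypothetical. -->
<charDecl>
  <char xml:id="fallrise">
    <charName>FALL-RISE INTONATION MARK</charName>
    <!-- the standard character whose shape G&M borrow -->
    <mapping type="standard">,</mapping>
  </char>
</charDecl>

<!-- In the transcription: refer to the declaration instead of
     typing a bare comma, so the special meaning is recorded. -->
<u>well<g ref="#fallrise"/> I suppose so</u>
```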

The alternative would be to introduce a whole lot of "user-defined" 
elements, which seemed pedagogically confusing at this point. And a lot 
more work to undo when we do get round to defining a better set of 
elements for transcription!
