[tei-council] entities delenda sunt?
Lou Burnard
lou.burnard at oucs.ox.ac.uk
Sun Oct 7 06:00:03 EDT 2007
Syd Bauman wrote:
>> Thanks for the suggestion. Yes, of course, they are gaijin. I have
>> hacked the text accordingly.
>>
>
> I'm not 100% sold on this use of <g>. I don't have any way to access
> the Gavioli and Mansfield source, but were they really using
> non-standard characters to indicate things like "lengthened
> syllable", or rather were they using standard characters with
> non-standard semantics? If the former, great -- gaiji to the rescue!
> But if the latter, I'm not sure I'm comfortable with <g>. After all,
> when I use "a" as a variable instead of an indefinite article I don't
> use <g>, I use <code>.
>
>
In Gavioli and Mansfield (the book) they used conventional punctuation
characters or other features (like underlining) in unconventional ways.
For example, they use the hyphen to mean "preceding syllable cut short"
and the sequence "space hyphen" to mean "preceding tone group
interrupted". Some of their conventions map directly to elements we
already have (e.g. they use + to mean <pause>, and underlining to mean
<emph>); others don't. I think it would be a really nice project to go
back and retrofit some of the things we don't have proper elements for
into this module, but not for 1.0.

My understanding of Unicode is that the distinction you seem to be
making between "nonstandard characters" and "standard characters with
nonstandard semantics" is a bit dubious. A "character" as distinct from
a "glyph" (notably) has semantics. Unicode distinguishes the *character*
aleph-meaning-infinity from the *character* aleph; it defines the
"number separator" character independently from its two (or more) glyph
variants "," and ".". So the fact that G&M use comma to mean
"fall-rise-intonation-detected-here" makes it plausibly a different
character in my view. (I also considered whether it might be regarded as
a glyph variant, but then I would have had to decide what it was a
variant on, which seemed difficult.)

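
To make the aleph case concrete, here is a minimal sketch using Python's
standard unicodedata module. It assumes that the "aleph-meaning-infinity"
above is U+2135 ALEF SYMBOL and the plain aleph is U+05D0 HEBREW LETTER
ALEF; the variable names are purely illustrative.

```python
import unicodedata

# Assumption: these are the two code points meant by the aleph example.
ALEF_SYMBOL = "\u2135"  # aleph used as a mathematical symbol
HEBREW_ALEF = "\u05d0"  # aleph as a letter of the Hebrew alphabet

# Print each code point alongside its formal Unicode character name.
for ch in (ALEF_SYMBOL, HEBREW_ALEF):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# Similar glyphs, but two separate characters as far as Unicode is concerned.
distinct = ALEF_SYMBOL != HEBREW_ALEF
```

The point is only that the standard assigns separate code points and names
where the meaning differs, even though the glyphs look much the same.
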
The alternative would be to introduce a whole lot of "user-defined"
elements, which seemed pedagogically confusing at this point. And a lot
more work to undo when we do get round to defining a better set of
elements for transcription!