[tei-council] how to encode a hyphen at the end of a line, column, or page when you are encoding hyphens
kevin.s.hawkins at ultraslavonic.info
Mon Dec 27 20:51:01 EST 2010
Finally getting back to this! See below ...
Lou Burnard wrote:
>> I find "inWord" and "nobreak" entirely non-intuitive
> "inWord" seems fairly obvious to me. More significantly perhaps, it was
> the value which the Epidockers agreed on after a fairly heated debate.
My problem with "inWord" is that, without further explanation, I'm not
sure whether it only applies to cases like:
UTF-8 is a char-
acter encoding for Unicode.
or also to:
This is not a run-
That is, I'm not sure whether we're talking about orthographic or
If some explanation can be added to P5 on this point, I'll probably be
much happier with it.
> Maybe "inToken" or "internal" ?
Without further explanation, I find these opaque too. You see, "inWord"
sounds like something internal to a word, and if that's true, how is
>> I prefer these values for type=:
>> * lexicalBoundary
>> * noLexicalBoundary
>> * uncertainLexicalBoundary
> I am not comfortable with "lexical" here, because where I come from
> "lexical entries" may include multiple "tokens". If I treat "apple pie"
> as a lexical entry, and there happens to be a <lb/> between the "apple"
> and the "pie" I don't think I'd mark the <lb/> any different from any
> other. I think we should stick with the idea that line-end hyphenation
> (or not) is to do with simple minded orthographic tokens, not tricky
> things like lexical items.
>> However, these may not be expressive enough for everything you'd like to
>> encode. Paul Schaffner provided the following examples (which I've
>> a) street<lb/>walker -- line break between components of a usually
>> non-hyphenated compound
> Not sure what a "compound" is here. For me, the critical point is
> whether elsewhere in this text I find, or expect to find,
> "streetwalker" (in which case the <lb/> is "inWord") or "street walker"
> (in which case it isn't). And if I don't want to take a stand either
> way, then it is "undecided".
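To make that test concrete, here is a rough sketch of the three outcomes.
It assumes the values go on type= of <lb/>, as discussed in this thread.
"inWord" and "undecided" are the values named above; leaving the <lb/>
unmarked in the two-word case is only my assumption about what "it isn't"
would look like, not agreed practice.

```xml
<!-- Sketch only: type= values are taken from this thread, not from published P5 -->
<div>
  <!-- "streetwalker" is found elsewhere in the text: the break falls inside a word -->
  <p>street<lb type="inWord"/>walker</p>
  <!-- "street walker" is found elsewhere: assumed here to need no special marking -->
  <p>street<lb/>walker</p>
  <!-- the encoder does not want to take a stand either way -->
  <p>street<lb type="undecided"/>walker</p>
</div>
```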
By "compound", I meant a compound word, such as "policeman",
"must-have", "ice cream", or "street walker".
Aside from this, my recollection of Dublin has faded significantly, and
I don't have any strong feelings on this, except that we should give
people clear instructions they can follow. I think that's what Martin
Mueller is looking for as well.