[tei-council] word-dividing
Lou Burnard
lou.burnard at oucs.ox.ac.uk
Tue Jun 30 15:43:28 EDT 2009
Gabriel BODARD wrote:
> Lou Burnard wrote:
>>> (9) lb: should we add an example of the usage of
>>> lb/type=word-dividing, which currently sits a little uncomfortably in
>>> the note? I suggest "Cae<lb type="worddiv"/>sari".
>> Don't know what note you're referring to. Don't see the point of the
>> @type attribute. Haven't done anything.
>
> This was discussed some months ago, and is the reason @type was allowed
> on <lb> in the first place. There is currently a note at the bottom of
> LB that says: "The type attribute may be used to characterize the
> linebreak in any respect, for example as word-breaking or not." We have
> literally thousands of examples of this in EpiDoc files, where words are
> not always tagged explicitly and it's the only way we can be sure to
> tokenize correctly. I just thought an example would help to clarify the
> use-case.
>
> (If people feel strongly that [e.g.] "wordDividing" would be a better
> recommended value than "worddiv", I'm happy to make that part of our P5
> upgrade script.)
>
I don't mind adding examples, but this one confuses me. Isn't the point
that the <lb/> in your example does NOT divide the word? If so, both
"wordDividing" and "worddiv" seem exactly the opposite of what you want
here. How about "nowordbreak" or "nwb"?
I know I lost this argument last time, but I still think in practice I'd
deal with this by putting in whitespace where the <lb> coincided with a
word boundary and leaving it out where it didn't!
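
To make the two approaches concrete, here is a minimal Python sketch, not a
recommended implementation. It assumes that type="worddiv" (the value from
Gabriel's example, not a settled recommendation) marks a line break falling
inside a word, and that any other <lb/> coincides with a word boundary. The
function names and the sample <ab> content are hypothetical.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    # Strip a namespace prefix such as "{http://www.tei-c.org/ns/1.0}lb".
    return tag.rsplit("}", 1)[-1]


def _walk(el, inword, out):
    if _local(el.tag) == "lb":
        # Assumption: an <lb/> without the in-word type is a word boundary.
        if el.get("type") != inword:
            out.append(" ")
        return
    out.append(el.text or "")
    for child in el:
        _walk(child, inword, out)
        out.append(child.tail or "")


def tokens_by_type(xml, inword="worddiv"):
    """Attribute approach: the type on <lb/> decides whether words are joined."""
    out = []
    _walk(ET.fromstring(xml), inword, out)
    return "".join(out).split()


def tokens_by_whitespace(xml):
    """Whitespace approach: <lb/> is ignored; only whitespace separates words."""
    return "".join(ET.fromstring(xml).itertext()).split()


typed = '<ab>Cae<lb type="worddiv"/>sari Augusto<lb/>imperatori</ab>'
spaced = '<ab>Cae<lb/>sari Augusto <lb/>imperatori</ab>'
print(tokens_by_type(typed))
print(tokens_by_whitespace(spaced))
```

Both calls yield the same three tokens. The difference is where the
information lives: in the attribute for the first encoding, in the presence or
absence of whitespace next to the <lb/> for the second.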
> Best,
>
> G
>