[tei-council] soft hyphens (again)

Kevin Hawkins kevin.s.hawkins at ultraslavonic.info
Sun Jun 27 16:53:58 EDT 2010

On 6/16/2010 6:07 AM, Gabriel Bodard wrote:
> Has anyone ever had any use-case for characterizing linebreaks (and cb,
> pb, etc.) other than by whether they break words or not?

I asked Paul Schaffner about this, and he offered the following:

-- use of @type to distinguish word-breaking
     from in-word <lb>s seems to me a little strange
     to begin with. I should think that there are
     lots of other ways in which lines differ (and
     therefore their breaks differ) other than whether
     they occur in a word-dividing position. And
     that some of those ways are a more natural fit
     for @type. E.g. ="forced" (by lack of space) vs.
     ="deliberate"; or "significant" vs. "insignificant";
     or "vertical" vs. (whatever--line breaks can
     appear between lines in all sorts of formatted
     text, e.g. chunks of a 'scroll'-style heading
     in engravings are most easily divided by <lb>s,
     even though one such 'line' does not sit neatly below
     the previous one).

-- the trio of "inWord" "betweenWords" and "uncertain"
     may not express all the options. One might want to
     distinguish (e.g.) (?)

     street<lb/>walker   between components of a non-hyphenated compound
     bag<lb/>lady        between components of a usually hyphenated compound
     win<lb/>some        between syllables (or morphemes) in a single word
     iP<lb/>hone         word-internal break (misplaced according to usual rules*)
     gentle<lb/>man      may or may not be regarded as a compound
     abusive<lb/>tagger  between words

     (*this is the way that the WSJ breaks "iPhone" at line
     ends, for some reason)

     though I have to admit that when tagging inscriptions
     (or rather, when tagging transcriptions of inscriptions),
     my commonest need is to distinguish breaks that should
     be treated as word breaks, those that should not be,
     and those about which I have doubts.
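
For illustration, the trio of values mentioned above might be encoded
roughly as follows. This is only a sketch: the @type values are the three
named in the message, while the wrapping <ab> element and the choice of
sample words are assumptions made for the example, not an agreed encoding.

```xml
<!-- Sketch only. The @type values are the trio named in the message;
     the <ab> wrapper and the sample text are assumed for illustration. -->
<ab>
  win<lb type="inWord"/>some
  abusive<lb type="betweenWords"/>tagger
  gentle<lb type="uncertain"/>man
</ab>
```

The finer distinctions in the list above (compound vs. syllable breaks,
misplaced breaks) would need additional values or a second attribute,
which is the concern the message raises.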
