[tei-council] soft hyphens (again)
Martin Holmes
mholmes at uvic.ca
Mon Jun 28 12:17:00 EDT 2010
Taking Paul's examples:
<phr>street<lb/>walker</phr> between components of a non-hyphenated cpd
<phr type="hyphenated">bag-<lb/>lady</phr> between components of a usu.
hyphenated cpd
<w>win-<lb/>some</w> between syllables (or morphemes) in a single word
<w>iP-<lb/>hone</w> word-internal breaks (misplaced according to usual
rules*)
gentle<lb/>man may or may not be regarded as a compound
<w>abusive</w>-<lb/><w>tagger</w> between words
Lou responded to my previous message like this:
> But the issue currently on the table is what to do about LINEBREAKS. As
> I said in an earlier post, it isn't necessarily a hyphen character which
> is used to mark where a word (despite appearances) runs on to the next
> line. It may be something else entirely. It may be nothing at all.
At the risk of another roasting, I still think that the linebreak tag is
the wrong place to supply information about
whatever-it-is-that-is-being-broken (word, phrase or whatever) and
whatever-it-is-that-is-signalling-the-break (hyphen or whatever). The
linebreak tag says there is a linebreak in the text. The context, and
the glyph that precedes the linebreak, are not attributes of the linebreak.
I think it would be better to encourage the use of <w>, <phr> and other
inline-level tags to mark the context of the linebreak. Even if such
tags are not being used for any other purpose in a text -- or perhaps
_especially_ if they aren't -- they could be used for exactly this
purpose, and it's easy for a processor to detect when a
linebreak-signalling glyph or a linebreak tag occurs within such a context
and to process it accordingly.
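
To illustrate that last point, here is a minimal Python sketch of such a
processor. It is only a sketch under stated assumptions: the function names
(rejoin, rejoined_forms) and the BREAK_GLYPHS set are hypothetical, the
type="hyphenated" convention is the one from the examples above rather than
anything in the Guidelines, and the input is an un-namespaced fragment rather
than a real TEI document in the TEI namespace.

```python
import xml.etree.ElementTree as ET

# Assumed set of glyphs that may signal a break at line end (hyphen,
# soft hyphen, Unicode hyphen, double hyphen). Purely illustrative.
BREAK_GLYPHS = "-\u00ad\u2010="


def rejoin(elem):
    """Return the text of a <w> or <phr> with internal <lb/> breaks undone.

    The break glyph before each <lb/> is dropped; a hyphen is restored only
    when the wrapper is <phr type="hyphenated">.
    """
    keep_hyphen = elem.tag == "phr" and elem.get("type") == "hyphenated"
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "lb":
            before = parts[-1].rstrip()
            if before and before[-1] in BREAK_GLYPHS:
                before = before[:-1]  # the glyph belongs to the break
            if keep_hyphen:
                before += "-"  # the compound is normally hyphenated
            parts[-1] = before
            parts.append((child.tail or "").lstrip())
        else:
            # Any other inline child: keep its text and its tail as they are.
            parts.append("".join(child.itertext()))
            parts.append(child.tail or "")
    return "".join(parts)


def rejoined_forms(xml_text):
    """Rejoin every <w> or <phr> that directly contains an <lb/>."""
    root = ET.fromstring(xml_text)
    return [
        rejoin(e)
        for e in root.iter()
        if e.tag in ("w", "phr") and e.find("lb") is not None
    ]
```

Under these assumptions the <lb/> itself carries no information about the
word or the glyph; everything the processor needs comes from the enclosing
<w> or <phr> and from the character that precedes the break.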
Cheers,
Martin
On 10-06-27 01:53 PM, Kevin Hawkins wrote:
> street<lb/>walker between components of a non-hyphenated cpd
> bag<lb/>lady between components of a usu. hyphenated cpd
> win<lb/>some between syllables (or morphemes) in a single word
> iP<lb/>hone word-internal breaks (misplaced according to usual rules*)
> gentle<lb/>man may or may not be regarded as a compound
> abusive<lb/>tagger between words
--
Martin Holmes
University of Victoria Humanities Computing and Media Centre
(mholmes at uvic.ca)
Half-Baked Software, Inc.
(mholmes at halfbakedsoftware.com)
martin at mholmes.com