[tei-council] how to encode a hyphen at the end of a line, column, or page when you are encoding hyphens

Martin Holmes mholmes at uvic.ca
Wed Jan 5 18:10:06 EST 2011


Hi Lou,

On 11-01-05 01:06 PM, Lou Burnard wrote:
> Each of us will have their own preferences, and these may well be
> different for different types of text or different types of application.
> I don't think that's a disaster, is it?

Absolutely not. But I think we do have to provide a detailed set of 
suggestions for (what we consider to be) the most all-round useful and 
effective way of encoding linebreaks and hyphenation, when those things 
matter to you. There will be people making decisions about how to encode 
this kind of thing in TEI who don't really know enough about the 
implications of their decisions down the road (how easy it will be to 
tokenize text effectively, for instance). We owe it to those people to 
find (if we can) a set of practices that we consider to be optimal.
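
To make the tokenizing point concrete, here is a minimal sketch in Python of how the two styles under discussion behave when you try to recover words. The sample sentences and the helper name words() are invented for illustration, and the rule "a hyphen directly before <lb/> is a soft hyphen" is only an assumption, not anything the Guidelines say. It also handles only a flat paragraph whose only children are <lb/> elements.

```python
import xml.etree.ElementTree as ET

def words(xml_fragment, strip_literal_hyphen):
    """Return whitespace-separated tokens from a flat paragraph.

    <lb/> itself contributes no characters; only the character data
    around it is kept. If strip_literal_hyphen is true, a "-" directly
    before an <lb/> is assumed (hypothetically) to be a soft hyphen.
    """
    root = ET.fromstring(xml_fragment)
    pieces = [root.text or ""]
    for child in root:
        if child.tag == "lb" and strip_literal_hyphen and pieces[-1].endswith("-"):
            # Literal style: we can only guess that this hyphen is soft.
            pieces[-1] = pieces[-1][:-1]
        pieces.append(child.tail or "")
    return "".join(pieces).split()

literal = '<p>It was help-<lb/>ful</p>'                # hyphen as character data
symbolic = '<p>It was help<lb rend="hyphen"/>ful</p>'  # hyphen recorded in @rend
hard = '<p>a well-<lb/>known text</p>'                 # a real hyphen at line end

print(words(symbolic, strip_literal_hyphen=False))
print(words(literal, strip_literal_hyphen=True))
print(words(literal, strip_literal_hyphen=False))
print(words(hard, strip_literal_hyphen=True))
```

With @rend the character data is already clean, so the tokenizer never has to look at the hyphen at all. With the literal hyphen, the tokenizer has to guess, and the same guess that repairs "help-ful" also collapses "well-known" into "wellknown" unless the encoder has distinguished the two cases in some other way.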

Cheers,
Martin

On 11-01-05 01:06 PM, Lou Burnard wrote:
> On 05/01/11 17:03, Martin Holmes wrote:
>
>>
>> Do we believe that the existence of a hyphen, doubling, etc. should be
>> expressed through character data external to the break, or should it be
>> expressed through @rend? In other words:
>>
>> help-<lb/>ful
>>
>> or
>>
>> help<lb rend="hyphen"/>ful
>>
>
> I fear I don't think this is a question for voting on. It is clear, if
> you look back through the discussion, that there is simply no consensus
> in the community. For some people, it's obvious that you must try to
> preserve the way hyphenation occurs in the text; for others, it's
> equally obviously either of no importance or entirely counterproductive
> to do so. Very good and persuasive reasons can be amassed on either
> side, and have been.
>
> But this shouldn't depress us! We should simply recognise that it is another
> instance of the generally liberal attitude the Guidelines try to defend.
> People's needs and priorities vary. I think we can still provide helpful
> guidance by saying:
>
> 1. You should probably not falsify the text, so do distinguish in some
> way "helpful" which has been split across a line break from "helpful"
> which has not been so divided
>
> 2. It's up to you whether you want to indicate the presence of the
> "metacharacter" hyphen (or whatever) and there are two ways you could do
> that:
>
> (a) symbolically (@rend="hyphen")
>
> (b) explicitly (in which case you need to find the right Unicode character)
>
> This is almost exactly what we already recommend for quotation marks:
> we provide explicit tags to distinguish "quoted" from quoted (several of
> them, in fact); you can retain the quote marks themselves if you like;
> you can replace them with a description supplied as the value for @rend.
>
> Each of us will have their own preferences, and these may well be
> different for different types of text or different types of application.
> I don't think that's a disaster, is it?

-- 
Martin Holmes
University of Victoria Humanities Computing and Media Centre
(mholmes at uvic.ca)
