[tei-council] Fwd: TITE again

Sun Jul 15 12:29:06 EDT 2012

This makes me rather uncomfortable, because we're unambiguously 
importing presentational elements into the schema. On the other hand, I 
have to admit that this:

<b>stuff</b>

is simpler, clearer and more user-friendly than any of these:

<hi rend="bold">stuff</hi>
<hi rend="font-weight: bold;">stuff</hi>
<hi style="font-weight: bold;">stuff</hi>

So I think on balance I like Sebastian's idea of a "kiss" module 
containing these elements, but I think we obviously have to keep @rend 
(and I still think we need @style for CSS). I disagree that the list 
should be as short as possible, though; if we include <sup> but not 
<sub>, for instance, we'll just get lots of feature requests which 
rightly point out that such a choice is essentially arbitrary -- unless 
we're prepared to do a huge amount of research and statistical analysis 
to support the exclusions.
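
To make the equivalence concrete, here is a minimal sketch of how the 
convenience elements could be expanded back into <hi> during 
processing. It is only an illustration: the function name, the @rend 
values in the mapping, and the assumption that the input is in no 
namespace are all mine, not anything TITE or P5 prescribes.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from convenience elements to @rend values.
# The values are illustrative only; a project would choose its own.
REND_FOR = {
    "b": "bold",
    "i": "italic",
    "u": "underline",
    "sup": "superscript",
    "sub": "subscript",
}


def expand_convenience(xml_text: str) -> str:
    """Rewrite <b>, <i>, <u>, <sup>, <sub> as <hi rend="...">.

    Assumes un-namespaced input; returns the serialized XML as a string.
    """
    root = ET.fromstring(xml_text)
    for el in root.iter():
        rend = REND_FOR.get(el.tag)
        if rend is not None:
            # Renaming in place keeps text, tail and children untouched.
            el.tag = "hi"
            el.set("rend", rend)
    return ET.tostring(root, encoding="unicode")


print(expand_convenience("<p>w<sup>ch</sup> is <b>stuff</b></p>"))
```

Because the mapping is one-to-one, nothing is lost in the conversion, 
which is the sense in which these elements are pure syntactic sugar. 
Adding or dropping an element such as <sub> is a one-line change to the 
table.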

There is one other option, though: the kiss module could explicitly 
import elements from the XHTML namespace.

Cheers,
Martin

On 12-07-15 04:38 AM, James Cummings wrote:
>
> For your consideration.
>
> -James
>
>
> -------- Original Message --------
> Subject: 	TITE again
> Date: 	Sun, 15 Jul 2012 10:32:41 +0000
> From: 	Martin Mueller <martinmueller at northwestern.edu>
> To: 	James Cummings <James.Cummings at OUCS.OX.AC.UK>
>
>
>
> Dear James,
>
> As you know, some time ago I raised the question whether the TITE
> convenience elements <b>, <i>, <u>, <sup>, and <sub> should
> become part of P5 on the grounds that <i> relates to <hi
> rend="italics"> in the same way that <lb/> relates to <milestone
> unit="line"/>. There followed a lively discussion on the Council
> list, which you accurately summarized as not very conclusive.
>
> I'd like to come back to this discussion and argue that on
> balance the case for 'yes' is a little stronger than the case for
> 'no'. Please take this letter to the Council. If you think it
> would be helpful to put it on TEI list please do so.
>
> I read through the thread again in the particular context of
> wondering how many of the varied and common superscripts in the
> TCP texts could be expressed through Unicode characters, a
> possibility raised by Lou Burnard. There were other comments from
> Piotr Banski, James Cummings, Martin Holmes, Kevin Hawkins,
> Sebastian Rahtz, and Paul Schaffner.
>
> The thread consisted of a mixture of theoretical and pragmatic
> arguments. On the more theoretical side, James, Piotr, and Gabby
> had reservations about mixing up semantic and presentational
> markup, coming too close to HTML, or encouraging encoders to be
> lazy. Piotr shared James's sense that <lb> and <pb> were somehow
> different from <i> or <sup>. I don't see the difference, but I
> respect such intuitions and recognize that they are hard to
> resolve by argument.
>
> On the pragmatic side, Kevin, Paul, and Sebastian argued in
> favour of various options for inclusion. Paul said that elements
> like <i> and <sup> have a "reassuring rootedness in actual page
> phenomena." In that regard, they may be like line or page breaks:
> you can't really argue about "the fact that." But most page or
> line breaks have a compelling reason: there is no more space.
> Italics and similar phenomena are never compulsory in that way.
> Perhaps that is why you and Piotr think they are different:
> italics must have a reason, even if it is hard to figure out.
>
> If you admit things like <i> or <sup>, where do you stop? Paul
> raised that question when he said that from the TCP perspective
> the list of TITE elements was inadequate and <b> wasn't needed.
> Implicit in Sebastian's comments, I think, is the argument that
> Frequency is King and a good enough guide to a tightly limited
> set of canned options. Sebastian suggested a kiss module (keep it
> simple stupid?) of i/b/u/bl/larger/smaller/sup/sub and removing
> the rend attribute.
>
> Martin Holmes replied that you'd always need a rend option to
> cover eventualities and wondered whether there would be a
> continued stream of feature requests for more canned options.
> Kevin on the other hand argued that a limited set of canned
> options helps the cause of interoperability.
>
> To return to Lou's suggestion about superscripts, it turns out
> that you can represent a high percentage of superscripts in the
> TCP texts (perhaps as many as 98% of tokens) with Unicode
> characters. But there doesn't seem to be a superscripted 'c',
> which rules out the common w<sup>ch</sup>, there is no lower-case
> 'i' (so much for the common superscripted forms of 'Majestie'),
> and there is no superscripted period sign, which is also common.
> But 98% is 98%, and the various Unicode characters, although
> cobbled together from different lists, play well with each other
> in browser displays.
>
> Where does that leave us? In some ways, the question of
> convenience or syntactic sugar elements is like that of
> 'favourites' on Windows or OS X. Commonly used directories are
> freed from their status in the hierarchy and given a sort of VIP
> treatment. This works as long as two conditions are met:
>
>    1. The list must be short
>    2. The candidates must be really obvious in terms of their
>       frequency across a lot of different document types
>
> I would say that <i> and <sup> clearly meet the second criterion.
> If you're not bound by Systemzwang you would probably throw out
> <sub>, because it is much less common than <sup>. The TITE
> inclusion of <b> and <u> may have more to do with HTML than with
> conventions of print culture. So I'd urge the Council to go with
> the pragmatists and come up with a really short list that covers
> a high percentage of cases and is worth the trade-off of giving
> up a little consistency for a lot of convenience.
>
> Perhaps there is no such list. But I think there is.
>
>

-- 
Martin Holmes
mholmes at uvic.ca
UVic Humanities Computing and Media Centre