[tei-council] let's sort out the <w> problem first...

Tue Aug 5 16:18:47 EDT 2008

There is long-standing discontent about the content model of the
model.segLike elements <m>, <w>, <phr>, and <cl>. These are intended to be
specialisations of <seg> (which is also a member of the model.segLike
class). However, since <m> and <w> are meant to be used for
segmentation at or below the single "word" level, they have different
content models. Specifically, these two permit a mixture of text,
model.gLike, model.global, and model.segLike elements only, whereas the
others contain macro.paraContent. The intention is to prevent nonsense
like the introduction of a <list> within a word, but the cost is that
useful tags like <hi> or <am> are also not available. It's not
unreasonable at all (as several have pointed out) to want to use such
tags at a sublexical level; yet the only way to do so at present is to
wrap the content of the <w> within a <seg> (which, being a member of
model.segLike, is permitted!). So you cannot say <w>M<am>.</am></w> --
but you can say <w><seg>M<am>.</am></seg></w>. Which just looks silly.

There seem to be two possible solutions (if you agree that the status
quo is broken):

a. change <w> and <m> to have macro.paraContent, like all the other
model.segLike elements
b. permit these two to contain an appropriate subset of "sublexical"
elements rather than model.segLike

My preference would be for the latter. I suggest that the appropriate
subset would consist of
- current members of model.pPart.edit
- current members of model.hiLike
Whether this should constitute a new "sublexical" class is probably
best left to the next revision of the class system, however.
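
For concreteness, option b might look something like the following ODD
fragment. This is only a sketch, not a worked-out proposal: it assumes the
RELAX NG-based <content> syntax and the class names as given above, and it
shows <w> only (<m> would be changed in the same way).

```xml
<!-- Sketch only: a possible revised content model for <w> under option b. -->
<elementSpec ident="w" module="analysis" mode="change"
             xmlns:rng="http://relaxng.org/ns/structure/1.0">
  <content>
    <rng:zeroOrMore>
      <rng:choice>
        <rng:text/>
        <rng:ref name="model.gLike"/>
        <rng:ref name="model.global"/>
        <!-- model.segLike is dropped here, following the wording of
             option b; keep a ref to it if nesting <m> inside <w>
             should stay valid. -->
        <rng:ref name="model.pPart.edit"/>  <!-- brings in <am> etc. -->
        <rng:ref name="model.hiLike"/>      <!-- brings in <hi> etc. -->
      </rng:choice>
    </rng:zeroOrMore>
  </content>
</elementSpec>
```

With a content model along these lines, <w>M<am>.</am></w> would be valid
without the extra <seg>, while a <list> directly inside a word would still
be rejected.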

Opinions? Counter-suggestions? Cries of "about time too"?