[tei-council] on <editorialDecl>

Wed Jan 30 11:15:43 EST 2008

[My apologies -- I've already been traveling for 2 weeks this year,
and as a result am a bit behind on reading Council threads. If this
has already been brought up, please just refer me to the recent
conversation.]

The P5 content model of <editorialDecl> is currently more restrictive
than the P4 equivalent in one important way. I think P4 had it right,
and that the current model demonstrates arrogance on our part.
Furthermore, because reverting to the P4 world view would not break
any existing TEI P5 documents, this may well be considered a
corrigible error.

P5 model:    ( model.pLike+ | model.editorialDeclPart+ )

P4 model:    ( p+ | ( model.editorialDeclPart+, p* ) )

That is, in P4 we said "you can just have chunks of prose, or, for
those things that we can forecast what you might need to talk about,
you can use these nifty special-purpose elements ... BTW, if you use
both free-prose <p> elements and nifty special-purpose elements, you
have to put the special-purpose ones first".

This was great. Where the TEI had anticipated the kinds of editorial
policies I want to express, I had some useful elements and attributes
with controlled vocabularies to express them with. If there was more
information to include, I tacked it on in a <p> element and went on
my merry way.

But now in P5 we say "you can just have chunks of prose, or, if we
have forecast everything you might need to talk about, you can use
these nifty special-purpose elements ... but, if you have even so
much as 1 thing to say that we haven't predicted and created a nifty
special-purpose element for, you have to use all prose".

So now if I have even so much as 1 bit of editorial policy I wish to
express that the TEI has not anticipated, I am forced to make a
choice: use TEI's useful special-purpose elements and forgo other
information, or use all prose and forgo the controlled vocabulary,
predictability, etc. of the special-purpose elements.
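
To make the difference concrete, here is a minimal sketch (not part of
any TEI tooling) that encodes each content model as a regular
expression over the sequence of child element names. The function
name, the regex encoding, and the simplification of model.pLike to
just <p> are assumptions made for illustration only.

```python
import re

# Members of model.editorialDeclPart, per note [1].
DECL_PART = ["correction", "hyphenation", "interpretation",
             "normalization", "quotation", "segmentation", "stdVals"]

# Children are serialized as "name " so each token ends with a space.
PART = "(?:" + "|".join(DECL_PART) + ") "
P = "p "

# P4: ( p+ | ( model.editorialDeclPart+, p* ) )
P4_MODEL = re.compile(f"(?:(?:{P})+|(?:{PART})+(?:{P})*)")
# P5: ( model.pLike+ | model.editorialDeclPart+ ), model.pLike simplified to <p>
P5_MODEL = re.compile(f"(?:(?:{P})+|(?:{PART})+)")

def allowed(model, children):
    """Return True if the child-element sequence matches the content model."""
    return model.fullmatch("".join(name + " " for name in children)) is not None

# Special-purpose elements first, then one paragraph of extra prose.
mixed = ["correction", "hyphenation", "p"]
print(allowed(P4_MODEL, mixed))  # True
print(allowed(P5_MODEL, mixed))  # False
```

The mixed sequence is exactly the case described above: accepted by
the P4 model, rejected by the P5 one.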

Now, there is a reason for what (to me, and perhaps to you) seems
like madness. This change was made in our effort to ensure that users
could easily remove elements in their customizations and get valid
DTDs out. If a user were to delete all elements in
model.editorialDeclPart[1], the P4 content model would be reduced to 
  ( p+ | ( p* ) )
which is valid in RELAX NG, but illegal in DTDs (and in XSD version
1). At the time it was argued that it is an inordinate strain upon a
customizing user to have to go in and change the content model of
<editorialDecl>, and that it was too difficult to have the DTD
generation software do something else.

But by now the DTD generation software *does* do something else. When
asked to produce a DTD for
  ( p+ | ( model.editorialDeclPart+, p* ) )
Roma now produces
  ( p+ | ( _DUMMY_model.editorialDeclPart+, p* ) )
which is perfectly reasonable. (The _DUMMY_model.editorialDeclPart
token is not declared, but it is valid in DTDs to refer to an
undeclared element in a content model.)
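
The substitution can be pictured with a short sketch. This is not
Roma's actual code; the helper names below are hypothetical, and the
expansion of a non-empty class into a plain alternation is an
assumption made to keep the example small.

```python
def dtd_class_ref(class_name, members):
    """Expand a model class reference for use in a DTD content model.

    Hypothetical helper: a non-empty class becomes an alternation of its
    members; an empty class becomes an undeclared placeholder name, so the
    surrounding content model keeps its shape.
    """
    if members:
        return "(" + " | ".join(members) + ")"
    return "_DUMMY_" + class_name

def editorial_decl_model(members):
    """Build the P4-style content model string for <editorialDecl>."""
    part = dtd_class_ref("model.editorialDeclPart", members)
    return f"( p+ | ( {part}+, p* ) )"

# All class members deleted by a customization:
print(editorial_decl_model([]))
# ( p+ | ( _DUMMY_model.editorialDeclPart+, p* ) )
```

With the placeholder in place, the second branch never collapses to
( p* ), so the problematic ( p+ | ( p* ) ) model does not arise.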

So I am hoping the time is ripe to give P5 users the expressive
flexibility of P4, here.

Notes
-----
[1] <correction>, <hyphenation>, <interpretation>, <normalization>,
    <quotation>, <segmentation>, and <stdVals>.