[tei-council] MD chapter revised: namespace rules

Thu Apr 12 14:22:08 EDT 2007

> OK, it's time to get radical. What do we need "TEI Interchange
> format" for? Why do we need to define it at all?

So that we can move all these draconian restrictions out of the
definition of "conformance", but still put them somewhere useful.

Remember that this marvelous capability to differentiate those XML
structures that come from the TEI namespace from those that do not is
only helpful if you have software that can process the generic TEI
structures and you find that processing useful. That is, a lot of TEI
users will have to jump through a lot of hoops to generate documents
that fit the current namespace-concerned definition of conformance,
with no benefit to themselves at all.

On the other hand, when they want to exchange that document with
someone else (say, an archive), following the more draconian rules is
likely to be quite helpful.

Some further thoughts. If we insist that TEI conformant documents
follow draconian namespace rules, there are some pretty predictable,
and potentially problematic, consequences:

* Most projects will not use TEI for document capture -- dealing with
  those namespaces will just be too annoying. E.g., imagine a project
  that expects to capture TEI P5 texts using a customization that
  replaces <choice> with the old Janus attributes (because they know
  they are dealing with only one language and have no characters
  outside Unicode), and that wants to change xml:id= to id= and
  target= to an IDREF attribute for capture so that their XML editor
  will do the right thing.

* Many projects will not even bother to store their texts locally in
  TEI, and will either abandon TEI conformance completely, or only
  trot out the namespace-conformant version for funding agencies and
  blind interchange.

* We will never have a conformant <soCalled>TEI Tite</soCalled> for
  vendor use. I don't see this as a major problem, personally, as the
  goal has always been to use Tite as a capture format, and transform
  into more canonical TEI later.
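
To make the capture-then-transform idea in the first bullet concrete,
here is a minimal sketch of the "transform later" step. It assumes a
capture format that differs from TEI only in using id= for xml:id= and
a bare IDREF in target=; the function name and the sample document are
made up for illustration, and the TEI namespace itself is left out to
keep the sketch short.

```python
import xml.etree.ElementTree as ET

# Qualified name of xml:id in ElementTree's {namespace}local notation.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def capture_to_tei(root):
    """Undo two capture-time conveniences, in place.

    Hypothetical helper: assumes id= stands in for xml:id= and that
    target= holds a bare IDREF rather than a '#'-prefixed pointer.
    """
    for el in root.iter():
        # id= was used only so the XML editor would treat it as an ID.
        if "id" in el.attrib:
            el.set(XML_ID, el.attrib.pop("id"))
        # A bare IDREF becomes a same-document pointer.
        target = el.get("target")
        if target is not None and not target.startswith("#"):
            el.set("target", "#" + target)
    return root

doc = ET.fromstring(
    '<text><p id="p1">See <ref target="p1">above</ref>.</p></text>'
)
capture_to_tei(doc)
print(ET.tostring(doc, encoding="unicode"))
```

A real project would more likely do this in XSLT; the point is only
that the step back to canonical TEI can be small and mechanical.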

I really think that this is an issue about which users should be
canvassed before we commit to constraining conformance, as opposed to
interchange format.

BTW, if we are going to go this route of force-feeding namespaces
which confer no direct benefit on users, we had better make it as
easy as possible. That is, roma and webRoma should do the right thing
w/o significant extra work. This may not be easy. For example, roma
will need to know that changing an attribute from data.numeric to
data.count is a clean change, but from data.name to data.word is not.
(There are things we may not be able to expect roma to do, of course,
like read users' regular expressions and figure out whether each one
represents a clean subset of a datatype or not.)
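
As a rough sketch of the kind of check roma would need, assume a
hand-maintained table recording which datatypes are subsets of which.
The table and function names below are hypothetical, and the table
holds only the one relation asserted above (data.count within
data.numeric); nothing here reflects how roma actually works.

```python
# Hypothetical table of "narrower -> broader" datatype relations. Only the
# one relation asserted in this message is recorded; a real tool would have
# to derive these from the datatype definitions themselves.
NARROWER_THAN = {
    "data.count": {"data.numeric"},
}

def is_clean_change(old, new):
    """True if `new` is recorded as the same as, or a subset of, `old`."""
    if old == new:
        return True
    # Walk upward from `new` through ever broader datatypes, looking for `old`.
    seen, todo = set(), [new]
    while todo:
        current = todo.pop()
        if current == old:
            return True
        if current not in seen:
            seen.add(current)
            todo.extend(NARROWER_THAN.get(current, ()))
    return False

print(is_clean_change("data.numeric", "data.count"))  # True
print(is_clean_change("data.name", "data.word"))      # False
```

Note that False here means only "not known to be clean": the check is
conservative, so any relation missing from the table would push a
harmless change into the user's own namespace.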

I just asked a colleague what he thought of all this. He suggested a
different likely pattern of behavior for users: many would tolerate
the namespace business, but would not make the effort, or would not
be able, to figure out which customizations are clean subsets and
which are not, and thus would stick all customizations into their own
namespace. He left me with a very insightful parting thought: "Boy, I
hope the Council doesn't make TEI into W3C Schema".

P.S. As it applies to interchange format, as opposed to conformance,
     I think I agree with the direction this thread is headed,
     including that translated or renamed elements should be in
     another namespace.