[tei-council] TEI Conformance

Syd Bauman Syd_Bauman at Brown.edu
Tue Nov 21 15:05:20 EST 2006

Thanks, James. As y'all might guess, I have a *lot* to say on this
topic. I am going to limit myself to the two points Sebastian raised,
for now, as I have my own action items I should be working on!

SR>  a) the suggestion of a standard namespace for TEI additions.
SR>  It's a very attractive idea, and I'd welcome it. It both
SR>  differentiates these from normal elements, and groups them for
SR>  processing.

Am I mis-remembering -- I thought you (Sebastian) were the one who
convinced Council that user-added elements "had" to be in the TEI
namespace! (When was that -- 2002?) I still regret not feeling I knew
enough about how namespaces worked at the time to object.

I'm not at all sure I like the idea of asking a TEI user to put "new"
elements in a different namespace. Face it,

      <div type="letter">
        <dateline>Written in an office that's so cold I can almost
        see my breath</dateline>
        <salute>Dear James</salute>
        <p>Namespaces: can't live with 'em, can't ignore 'em.</p>
        <ps>If I weren't so cold I would have thought of
        something funny</ps>
      </div>

is a lot nicer than

      <div type="letter">
        <dateline>Written in an office that's so cold I can almost
        see my breath</dateline>
        <salute>Dear James</salute>
        <p>Namespaces: can't live with 'em, can't ignore 'em.</p>
        <syd:ps>If I weren't so cold I would have thought of
        something funny</syd:ps>
      </div>

However, I think I like the idea of deliberately foiling the utility
of namespaces even less! Face it, that's exactly what P5 currently
does -- it flies in the face of the W3C Namespaces in XML
recommendation, which exists just so that software can very easily
tell which elements are "mine" and which are not. By having users
create what James calls "Modifications" and "Extensions" with the
resulting elements in the same namespace as the canonical TEI element
set, we strip away any such advantage that namespaces could provide.
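
To show the kind of advantage at stake, here is a minimal Python
sketch, not anything P5 prescribes. It assumes the added element
lives in a made-up namespace (the example.org URI and the function
names are hypothetical; the TEI URI is the P5 one) and sorts element
names into "TEI" and "not TEI" without knowing anything about either
vocabulary.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"   # the P5 namespace
SYD_NS = "http://example.org/ns/syd"     # hypothetical, for illustration only

LETTER = f"""
<div xmlns="{TEI_NS}" xmlns:syd="{SYD_NS}" type="letter">
  <salute>Dear James</salute>
  <p>Namespaces: can't live with 'em, can't ignore 'em.</p>
  <syd:ps>If I weren't so cold I would have thought of
  something funny</syd:ps>
</div>
"""

def split_tag(tag):
    """Split ElementTree's '{namespace-uri}local' form into (uri, local)."""
    if tag.startswith("{"):
        uri, _, local = tag[1:].partition("}")
        return uri, local
    return "", tag

def split_by_namespace(xml_text, home_ns=TEI_NS):
    """Return (home, foreign): local names inside and outside home_ns."""
    home, foreign = [], []
    for element in ET.fromstring(xml_text).iter():
        uri, local = split_tag(element.tag)
        (home if uri == home_ns else foreign).append(local)
    return home, foreign

home, foreign = split_by_namespace(LETTER)
print(home, foreign)   # prints: ['div', 'salute', 'p'] ['ps']
```

If <ps> sat in the TEI namespace alongside everything else, the same
loop would put it in the first list, and software would need a copy
of the vanilla element list to spot it.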

HOWEVER, I still think one of the most important aspects of
conformance is the political (as opposed to the technical). I am very
worried that if we were to say something simple like "add any
elements you want, just use an ODD and put 'em in a different
namespace" we would leave the door open for funding agencies to say
"if you want funding, your document has to be TEI without use of other
namespaces" or some such. If we do this, we have to define
conformance *very* carefully and *very* explicitly to ensure that
Modifications and Extensions are still encouraged.

SR>  b) raising the profile of <equiv>, to make it the mechanism by
SR>  which one legally adds new elements to the TEI namespace. this
SR>  is the perfect opportunity to sort out the use of <equiv>, ...

Yes, there is a lot of <equiv> sorting out that needs to be done
before we rely on it. Just to give a flavor of the type of problem we
need to think through: 

* Does <equiv> relate only to syntax, or to semantics also?

* I know of at least 2 schemas in the world that use an invented
  element <called> to deliberately conflate the TEI <mentioned> and
  <soCalled> elements. How does an <equiv> say that element X could
  map to either Y or Z?

* If, in my schema, <X type="a"> maps to a TEI <Y> element, and <X
  type="b"> maps to a TEI <Z> element ...

* How similar does the syntax (and semantics?) of my invented element
  have to be to the TEI vanilla element to call it equivalent? E.g.,
  for taking meeting minutes I have invented an <action> element.
  (This is true, BTW.) It is roughly equivalent to a TEI <note>
  element -- semantically similar[1], would belong to class
  model.noteLike if it were in P5, etc. But it has a very restrictive
  content model that allows no PCDATA and only 3 children: <name>,
  <resp>, and <date>. Should it be listed as <equiv> to a <note>?[2]
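
The second and third bullets suggest that a plain one-name-to-one-name
mapping is not enough. As a rough sketch only (this is not how
<equiv> works today; EquivRule, RULES and tei_targets are names
invented for illustration), the information an equivalence
declaration would have to carry might look like this in Python:

```python
from typing import NamedTuple, Optional, Tuple

class EquivRule(NamedTuple):
    local_name: str               # the invented element
    type_value: Optional[str]     # required value of @type, or None for "any"
    targets: Tuple[str, ...]      # one or more TEI element names

# Hypothetical rules covering the cases in the bullets above.
RULES = [
    EquivRule("called", None, ("mentioned", "soCalled")),  # X maps to Y or Z
    EquivRule("X", "a", ("Y",)),                           # depends on @type
    EquivRule("X", "b", ("Z",)),
]

def tei_targets(local_name, attrs):
    """Return the TEI element names an invented element could map to."""
    for rule in RULES:
        if rule.local_name != local_name:
            continue
        # A rule with no @type condition matches any attribute set.
        if rule.type_value is None or rule.type_value == attrs.get("type"):
            return rule.targets
    return ()

print(tei_targets("called", {}))          # prints: ('mentioned', 'soCalled')
print(tei_targets("X", {"type": "b"}))    # prints: ('Z',)
```

Note that the sketch says nothing about content models or semantics,
which is exactly the part the last bullet worries about.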

SR> I also don't think we should insist on a source description in
SR> the header.
JC> That the source for a file is 'born digital' is important, and
JC> better than having an absence of that information.

I agree w/ James, and believe the <sourceDesc> (or equivalent) should
remain a required feature of a conformant TEI text. Personally I'd
like to see a controlled-vocabulary mechanism (read: attribute with
closed list of values) for saying "born digital", just so we don't
have to put up with all the variations:

  Born digital
  None; this electronic document is the source
  None. This TEI file is the source.


Lou had some principled objection to this idea, but I can't remember
what it was at the moment.  

[1] This is less true now that CW has asked that we include
    contextual information in the action item. When originally
    conceived the <action> element was just a flag, with the main
    discussion taking place in its parent <p>. However, since Lou &
    Sebastian cleverly put a list of the action items up at the top
    of the HTML output, it now makes more sense to include more
    information in the action item itself.
[2] I'm inclined to say "no", because I think <syd:A> being
    equivalent to <tei:B> means that if you (piece of software) know
    how to process a <tei:B>, then you will be able to process a
    <syd:A> (even if not optimally). This is not true in the case
    above, because <resp> is not a valid child of a vanilla TEI
    <note> element.
