[tei-council] on conformance document

Syd Bauman Syd_Bauman at Brown.edu
Tue Aug 22 06:19:37 EDT 2006

                    TEI conformance is not about making interchange a
                    trivial activity. TEI conformance is about making
                    what you've done explicit.

The architects of P3 did an outstanding job defining conformance and
creating a formal extension mechanism. The definition of conformance
was crafted, purposefully, to pare the very concept of conformance down
to the bare bones.[1]

However we define it for P5, the idea of conformance is likely to be
an influential one: people will have an incentive to ensure that their
documents are "conformant" whether because of pressure from funding
bodies, or a desire for intellectual traction among colleagues, or
just a sense that it's the right thing to do, and that the TEI
encourages conformance. The main concern, of course, is that in an
effort to be "conformant" users will commit tag abuse, fail to encode
salient features, or not use TEI at all. For this reason, I think
it's very important that we not define conformance such that it
actively discourages the customizations people need to make in order
to use the TEI to describe what they find interesting and esoteric
about their documents. This is *particularly* true in light of our
claims that the TEI can and should be customized, and that it can and
should be used for just about everything. We can't encourage
experimentation on the one hand, and then use terminology that
severely deprecates the results.

The fact that it is easier to interchange files that are less
drastically customized (i.e., a 1 on Sebastian's scale) is almost a
red herring. The main obstacles to most interchange lie elsewhere.
E.g., for what may be the vast majority of the world's TEI encoded
documents, those created by large-scale projects like digital
libraries, the packaging (i.e., how the thumbnails & JPEGs are
associated with the TEI file) is going to affect overall
interchangeability more than the encoding itself. Even when the
encoding is an issue, in many cases the problems will arise *within*
the confines of vanilla TEI. E.g., issues like slightly different
values for type= attributes, differing application semantics (you
apply <persName> to all personal names, I apply it only to those names
I've bothered keying), and differing methodologies (<note> placed at the
anchor point vs. <note> pointing to the anchor point).

On the other hand, there is no doubt that some practices lend
themselves to easier interchange than others. I do not think it is a
bad idea to discuss what kinds of customizations are likely to make
interchange harder, and which are not. I am not even saying that such
a discussion does not belong in the Guidelines, just that such things
don't affect conformance. The list of "levels" Sebastian provides in
http://www.tei-c.org.uk/wiki/index.php/Conformance may well be a good
starting place for a discussion of these ease-of-interchange issues.
But it is not a good starting place for a discussion of _conformance_.

A TEI encoding that is closer to vanilla (e.g., a strict subset of
tei_all, a 1 on Sebastian's scale) is not ex facie "better" TEI than a
TEI encoding that is more drastically customized. Yes, it may be
easier to process with software designed for vanilla TEI, but it is
not better if the salient features are not properly encoded. Tag abuse
is *far* worse than customization, as is not being able to encode that
which is of interest.

While the details are up for discussion, I think it is really really
important that the definition of P5 TEI conformance be in the same
vein as the P3 & P4 definition: a broad, inclusive, yes/no distinction
that revolves around the need to make explicit the differences between
the current document and vanilla TEI. That is, conformance should not
be primarily about ease of interchange.

[1] I think anyone who wishes to contribute to this discussion should
    certainly read the entire 8-page chapter first. But here's a quick
    overview, skipping over all the SGML-related and some other less
    important stuff.

    TEI-conformance applies to documents, not software. A document
    instance is TEI-conformant if it:
     * is valid XML
     * uses the TEI DTD -- modifications, if any, are made according
       to the prescribed method and declared in the DTD subset
     * has all modifications to meaning or use of defined tags, and
       all new tags, documented in TEI Tag Set Declarations which
       accompany the document [these days that would be the ODD file]
     * includes a TEI header with the required elements
    [See forthcoming post "conforming binary objects".]
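
    To make the mechanical part of that list concrete, here is a rough
    sketch (mine, not anything from the chapter) of how the first and
    last bullets might be checked. It assumes the P5 namespace
    http://www.tei-c.org/ns/1.0 and assumes that "the required
    elements" means <fileDesc> with <titleStmt>, <publicationStmt>,
    and <sourceDesc>. The function name header_problems is
    hypothetical. It tests only well-formedness, not validity against
    a DTD or schema, and it says nothing about whether modifications
    are documented.

```python
import xml.etree.ElementTree as ET

# Assumption: documents use the P5 TEI namespace.
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Assumption: these are the header elements treated as required.
REQUIRED = [
    "fileDesc/titleStmt",
    "fileDesc/publicationStmt",
    "fileDesc/sourceDesc",
]


def header_problems(xml_text):
    """Return a list of problems found; an empty list means none were found."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Well-formedness only; validity needs a DTD or schema validator.
        return [f"not well-formed XML: {err}"]

    ns = {"tei": TEI_NS}
    header = root.find("tei:teiHeader", ns)
    if header is None:
        return ["no teiHeader child of the root element"]

    problems = []
    for path in REQUIRED:
        # Prefix every step of the path with the TEI namespace.
        qualified = "/".join("tei:" + step for step in path.split("/"))
        if header.find(qualified, ns) is None:
            problems.append(f"missing teiHeader/{path}")
    return problems


# Hypothetical sample document that lacks <sourceDesc>.
SAMPLE = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader><fileDesc>'
    "<titleStmt><title>Example</title></titleStmt>"
    "<publicationStmt><p>Unpublished</p></publicationStmt>"
    "</fileDesc></teiHeader><text><body><p>Hi</p></body></text></TEI>"
)

if __name__ == "__main__":
    for problem in header_problems(SAMPLE):
        print(problem)
```

    Note that a check like this gives a yes/no answer and is
    indifferent to how heavily the document has been customized,
    which is the point of the definition.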

    Note that for non-SGML purposes "TEI local storage format" and
    "TEI interchange format" are the same, a fact we may want to take
    advantage of as we struggle to find names for the various classes
    of documents we're talking about. Unless, of course, we still plan
    to support SGML -- do we? [See forthcoming post "SGML or not to
    SGML, that is the question".]
