[tei-council] TEI Conformance

James Cummings James.Cummings at oucs.ox.ac.uk
Tue Nov 21 11:28:05 EST 2006


In the last conference call I was tasked (though it didn't seem to become an
action) with re-examining TEI conformance.  I have not drafted a new chapter or
anything because I don't think we are ready to do so.  However, I have distilled
a number of people's thinking and, broadly agreeing with Laurent's way of
looking at conformance, have come up with my own thoughts and types of
conformance.  I'm not entirely sure about the details (such as names for
different types of conformant schemas), but wanted to re-open the issue and see
how far from other people's thinking I've wandered.

I would be willing to take this further and draft a new chapter, based on and
expanding the current one, but did not wish to do so until the council had
debated the issue further, since I am confident that there will be numerous
outcries against some of my suggestions.

=====start=====
Thoughts towards principles of TEI conformance.

- The existing chapter needs to be significantly rewritten because it does not
address modern TEI issues.  However, the ideas behind it are useful in any
attempt to define principles of TEI conformance, and my thoughts have been based
on the concerns expressed in that chapter.

- Conformance should not necessarily be an issue for local encoding formats, but
becomes important when files are made available for interchange.  For example,
where fragmentary files used locally are dynamically assembled into a valid TEI
master document, conformance is only an issue for the master document as a
whole rather than for the individual fragmentary files.

- The definition of namespaces in XML is central to addressing the problems of
recognition and collision where multiple XML standards are used in the same
document.  Since this encourages re-use and modularisation, wherever
possible, namespaces should be used to differentiate elements or attributes from
different standards.  One additional possibility is to recommend that, where
reasonable, new elements or attributes which have no TEI equivalent
(documentable via <equiv>) should be in a separate TEI Extensions namespace to
avoid confusion and pollution of the TEI namespace.  Having a clear demarcation
between elements the TEI sanctions and those which others have created is
beneficial.  Moreover, dealing with multiple-namespaced documents is becoming
less of a problem as more tools develop which support this.

- There needs to be a mechanism, preferably as metadata in the teiHeader, by
which a document instance can record details about the schema against which it
is intended to validate.  This should include at minimum options for a prose
description and multiple URI references.  Recommended best practice should be to
reference both the ODD and schema URIs when the document is not intended to
validate against the tei_all schema.
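
As a rough illustration of the namespace point above, the following Python
sketch counts the elements of a document by namespace, so that TEI-sanctioned
elements can be told apart from added ones.  The TEI namespace URI is the real
P5 one; the extensions URI, the <ext:marginalSketch> element and the function
names are invented for the example, since no TEI Extensions namespace has been
agreed.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Hypothetical URI: no TEI Extensions namespace has been agreed.
EXT_NS = "http://example.org/ns/tei-extensions"


def namespace_of(tag):
    """Return the namespace URI of an ElementTree '{uri}local' tag, or ''."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def count_by_namespace(xml_text):
    """Count the elements of a document, grouped by namespace URI."""
    root = ET.fromstring(xml_text)
    return Counter(namespace_of(el.tag) for el in root.iter())


# <ext:marginalSketch> is an invented element used only for illustration.
sample = """<TEI xmlns="http://www.tei-c.org/ns/1.0"
     xmlns:ext="http://example.org/ns/tei-extensions">
  <text><body><p>Some text <ext:marginalSketch/> here.</p></body></text>
</TEI>"""

counts = count_by_namespace(sample)
```

Here counts[TEI_NS] covers <TEI>, <text>, <body> and <p>, while counts[EXT_NS]
covers the single added element.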

What is a TEI document?:
- A TEI-conformant TEI P5 document must be a well-formed and valid XML document.
- A TEI-conformant TEI P5 document must validate against a schema derived from
the TEI Guidelines.
- A TEI-conformant TEI P5 document must use the TEI namespace for all TEI elements.
- A TEI-conformant TEI P5 document must have a <teiHeader> element which
includes some elements for a title statement, publication statement and source
description.
- Any new customisations of the TEI schemas should be documented with a valid
TEI ODD file.
- Where possible, any renamings or new elements, attributes and classes should
be related to existing TEI structures with the use of <equiv>.
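
Some of the document-level rules above can be checked mechanically.  The
following is only a minimal Python sketch, under several assumptions: it
handles a single <TEI> root (not a corpus), it looks for <titleStmt>,
<publicationStmt> and <sourceDesc> inside <fileDesc>, and it does not attempt
validation against a schema, which the standard library cannot do.  The
function name and sample documents are invented for the example.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def check_basic_conformance(xml_text):
    """Return a list of problems; an empty list means these checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]

    problems = []
    # Assumption: a single <TEI> root element in the TEI namespace.
    if root.tag != f"{{{TEI_NS}}}TEI":
        problems.append("root element is not TEI in the TEI namespace")

    ns = {"tei": TEI_NS}
    header = root.find("tei:teiHeader", ns)
    if header is None:
        problems.append("missing teiHeader")
        return problems

    # Title statement, publication statement and source description.
    for name in ("titleStmt", "publicationStmt", "sourceDesc"):
        if header.find(f"tei:fileDesc/tei:{name}", ns) is None:
            problems.append(f"missing {name}")
    return problems


good = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc>
    <titleStmt><title>Sample</title></titleStmt>
    <publicationStmt><p>Unpublished</p></publicationStmt>
    <sourceDesc><p>Born digital</p></sourceDesc>
  </fileDesc></teiHeader>
  <text><body><p>Hello</p></body></text>
</TEI>"""

# The same document with the TEI namespace declaration stripped.
no_namespace = good.replace(' xmlns="http://www.tei-c.org/ns/1.0"', "")

good_problems = check_basic_conformance(good)
no_namespace_problems = check_basic_conformance(no_namespace)
```

A real check would follow this with validation against the schema that the
document names, using a RELAX NG or similar validator.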

Types of TEI Schemas:
- Pure Subset: A Pure Subset schema is one which is identical to or further
constrains the tei_all schema.  This may include: the removal of unused
elements, attributes and classes; the provision of attribute value lists; or the
further constraint of existing datatypes or content models. However, a document
instance which validates against a Pure Subset must always also validate against
the tei_all schema.

- Renaming Subset: A Renaming Subset schema is one which is identical to or
further constrains a Pure Subset schema, but which also renames existing
elements, attributes, or classes.  A Renaming Subset schema can also add new
elements, attributes or classes, if and only if an <equiv> element in their ODD
documents an existing Pure Subset equivalent for them.  (e.g. <email> is
equivalent to <addrLine type="email">).  None of these changes should conflict
with the names of existing TEI elements, attributes or classes.  A document
instance which validates against a Renaming Subset schema must always also
validate against the tei_all schema if the renamings were reversed or replaced
with their documented equivalents.

- Modification: A Modification schema is one which modifies a Pure Subset or
Renaming Subset by: changing existing elements, attributes, or classes; adding
existing elements or attributes as members of existing classes; or changing
existing datatypes or content models.  These changes should not add new
elements, attributes or classes, and all changes should be documented in an ODD.
A document instance which validates against a Modification schema is not
expected to validate against the tei_all schema.

- Extension: An Extension schema is one which extends a Pure Subset, Renaming
Subset, or Modification by adding new elements, attributes, classes, datatypes
or content models.  One option is for these additions to be in a separate
namespace.  All changes should be documented in an ODD file.  A document
instance which validates against an Extension schema is not expected to validate
against the tei_all schema. [And perhaps the TEI may wish to consider a TEI
Extension namespace for users.  Then all major customisations for which one is
not able to provide an <equiv> should be in this separate namespace.  This
avoids namespace pollution and deters the addition of elements without the
provision of an <equiv> where possible.]

- Supported Extension: A Supported Extension schema is a special case of the
Extension schema, where the extension has been created and is (to some degree)
supported by the TEI.  Examples of this include the example customisations the
TEI provides for including one or more of SVG, MathML or XInclude inside your
TEI documents.  A document instance which validates against a Supported
Extension schema is not expected to validate against the tei_all schema.

- TEI Based: A TEI Based schema uses ODD to define itself, and may or may not
take advantage of existing TEI elements, but differs substantially from the TEI.
If existing TEI elements are used, they must be in the TEI namespace.  A
document instance which validates against a TEI Based schema is not expected to
validate against the tei_all schema.  If a TEI Based document does not use the
TEI namespace for TEI elements, then one supposes it could be referred to as
inspired but would not be considered conformant in any case.
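
To make the Renaming Subset rule above more concrete, here is a minimal Python
sketch of "reversing" renamings before validating against tei_all.  It assumes
the <equiv> declarations of an ODD have already been read into a simple lookup
table, and that the renamed elements are in the TEI namespace; the table, the
function name and the sample are invented for the example, using the <email>
equivalence mentioned above.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical table standing in for the <equiv> declarations of an ODD:
# customised local name -> (TEI local name, attributes to add).
EQUIVALENTS = {"email": ("addrLine", {"type": "email"})}


def reverse_renamings(root, equivalents=EQUIVALENTS):
    """Rewrite renamed elements in place to their documented equivalents."""
    prefix = f"{{{TEI_NS}}}"
    for el in root.iter():
        if not el.tag.startswith(prefix):
            continue  # leave elements from other namespaces untouched
        local = el.tag[len(prefix):]
        if local in equivalents:
            tei_name, attrs = equivalents[local]
            el.tag = prefix + tei_name
            for key, value in attrs.items():
                el.set(key, value)
    return root


sample = """<address xmlns="http://www.tei-c.org/ns/1.0">
  <addrLine>Oxford</addrLine>
  <email>someone@example.org</email>
</address>"""

restored = reverse_renamings(ET.fromstring(sample))
```

After this step the second child of <address> is <addrLine type="email">, and
the result could then be validated against tei_all with a separate validator.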

=====end=====


-James

-- 
Dr James Cummings, Oxford Text Archive, University of Oxford
James dot Cummings at oucs dot ox dot ac dot uk


