[tei-council] Tite and conformance (long)
Daniel Paul O'Donnell
daniel.odonnell at gmail.com
Thu Jul 9 12:59:24 EDT 2009
I know this is a tired subject, but since we are closing in on
finalising an official TEI benefit on the basis of Tite, and since
conformance and extension are important concepts, I'd like to nail this
down now--at least to the point that I'm confident for the next year or
so that we're not breaking our own recommendations.
As I understand it, Tite has two features which may or may not be
problematic:
1) It drops the teiHeader
2) It introduces convenience elements that are algorithmically
convertible to canonical TEI elements+specific attribute values.
The questions we need to answer to be comfortable assigning it a
conformance level involve whether these two things are done in the ways
allowed by the guidelines section on customisation (23.2).
The Header
As I see it, dropping the header is allowed by 23.2.1.1 (Deleting
elements)
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/USE.html#MDMDSU. It
is a pretty big drop, but it doesn't wreck the TEI abstract model, since
it just deletes something rather than reorganises a content model (see
23.3.3 Conformance to the TEI Abstract Model
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/USE.html#CFAM).
Dropping the header makes Tite an Extension rather than a conformant or
conformable document, since conversion to TEI cannot be done
algorithmically (the information needed for the header is not found in
the source document). This is countenanced (though in quite negative
terms) by 23.3.3.2 (Mandatory Components of a TEI document)
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/USE.html#CFAMmc
The Convenience Elements
The convenience elements map directly to canonical TEI elements with
specific attvalues. So <i> = <hi rend="italics">. This is a reversible
but not a clean modification, since the convenience elements cannot be
validated using an unmodified TEI ODD. According to 23.2.1.2 (renaming
of elements), these elements need to go in either a new namespace or an
empty namespace.
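To make "algorithmically convertible" concrete, here is a minimal sketch of the conversion in Python, using only the standard library. Only the <i> mapping comes from the example above; the <b> entry, the function name and the sample text are assumptions made for illustration, and the input is left un-namespaced to keep the sketch short. It is not the official Tite-to-TEI conversion.

```python
import xml.etree.ElementTree as ET

# Mapping from convenience element name to (canonical name, attributes).
# Only <i> is taken from the message; <b> is an assumed, illustrative entry.
CONVENIENCE = {
    "i": ("hi", {"rend": "italics"}),
    "b": ("hi", {"rend": "bold"}),  # assumption, for illustration only
}

def to_canonical(root):
    """Rewrite convenience elements in place as canonical elements."""
    for el in root.iter():
        if el.tag in CONVENIENCE:
            name, attrs = CONVENIENCE[el.tag]
            el.tag = name              # rename, e.g. i -> hi
            for key, value in attrs.items():
                el.set(key, value)     # add the fixed attribute value
    return root

doc = ET.fromstring("<p>A <i>very</i> short text.</p>")
to_canonical(doc)
print(ET.tostring(doc, encoding="unicode"))
# prints: <p>A <hi rend="italics">very</hi> short text.</p>
```

Because each convenience element corresponds to exactly one element-plus-attribute pair, the mapping can be run in the other direction as well, which is what makes the modification reversible.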
In Tite, no distinct namespace is supplied for these elements (though it
is easily and automatically determined). This is because it wants to
avoid requiring keying the additional namespace identifiers. But by not
supplying a distinct namespace, Tite is engaging in a practice that is
"strongly deprecated" by the Guidelines (23.3.4
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/USE.html#CFNS).
Leaving aside the question of the wisdom of the whole endeavour--we're
too far down the road to change that right now--the problem we have is
that Tite is in the process of becoming something the TEI uses itself
for a specific purpose (the digitization benefit) but which does
something we say we consider to be "strongly deprecated"--namely the
namespace issue.
Initially, I figured that the namespace issue wasn't too big a deal
because it is so easily rectified--as soon as the keying is over, you
can run a Tite document through a stylesheet and add appropriate
namespace information, or of course, convert the document right into
canonical TEI.
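As a rough sketch of what such a post-keying pass might look like (in Python rather than a stylesheet, purely for illustration): the Tite namespace URI below is hypothetical, since no such namespace has been assigned, and the set of convenience element names is an assumption apart from <i>.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Hypothetical URI: no Tite namespace is named anywhere in this message.
TITE_NS = "http://example.org/ns/tite"
# Assumed list of convenience element names; only <i> is from the message.
CONVENIENCE_NAMES = {"i"}

def add_namespaces(root):
    """Qualify every un-namespaced element once keying is finished."""
    for el in root.iter():
        if el.tag.startswith("{"):
            continue  # already namespace-qualified, leave untouched
        ns = TITE_NS if el.tag in CONVENIENCE_NAMES else TEI_NS
        el.tag = f"{{{ns}}}{el.tag}"
    return root

keyed = ET.fromstring("<p>A <i>very</i> short text.</p>")
add_namespaces(keyed)
print(keyed.tag)     # {http://www.tei-c.org/ns/1.0}p
print(keyed[0].tag)  # {http://example.org/ns/tite}i
```

The keyer types nothing extra; the namespace membership is determined entirely by the element name, which is why it is "easily and automatically determined".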
But while practically speaking this still is the case, theoretically
speaking and in terms of good practice, we as an organisation should not
promulgate a practice that we deprecate for others. So either we need to
revise the namespace deprecation or we should fix Tite so it uses
namespaces correctly. Since I think we are right in our attitude towards
namespaces, we need to fix Tite.
There are two ways of doing this: either requiring the addition of
namespace information covering the renamed elements (something the lead
designers of Tite rejected [correctly or not] because of the keystrokes
involved), or moving all of Tite (including the unmodified TEI elements)
into its own namespace. Since Tite is an extension managed by the TEI
now, this presumably could be an official namespace, similar to those
used by the non-English language versions.
The main argument against moving all of Tite into its own namespace is
that this could be understood as implying that Tite varies far more
fundamentally from canonical TEI than it actually does--i.e. that there
is something special about an element like tite:p that distinguishes it
from tei:p, when in fact it is identical.
This leads me to two questions for more knowledgeable heads than mine:
1) Is there a way of assigning namespaces to specific elements in a
single location within the document so that, for example, <i> is
understood as <tite:i> and <p> as <tei:p> without having to indicate the
namespace affiliation on every instance? It is possible to do something
like this in XSLT if I remember correctly. If we could do that, it would
solve the whole problem.
2) Do we currently list simple copies of canonical TEI elements in
distinct namespaces? I.e. when we publish an internationalisation and
assign it to its own namespace, do we only include the elements that are
changed in this new namespace, or do these internationalisation
namespaces also include elements that are direct copies of the canonical
elements in the tei namespace? (In other words, if I am using the French
internationalisation and the element for tei:p is also named <p>, do I
need to include a namespace reference to the tei namespace every time I
use p, or has p been copied into the French namespace?). While I'd
prefer to go with a solution like that in (1) above, if we do copy
elements into our internationalised namespaces, then we have a precedent
for putting all of Tite into its own space without implying that tite:p
is somehow different from tei:p.
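For reference on question (1), here is a small sketch of how namespace declarations behave in the document itself, as far as I understand the XML Namespaces rules: a default declaration applies to every unprefixed element in its scope and cannot be restricted to particular element names, while a prefix declared once still has to be typed on each instance. The Tite URI is again hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
TITE_NS = "http://example.org/ns/tite"  # hypothetical URI, an assumption

# One default declaration on the root: every unprefixed descendant
# inherits it, so <i> lands in the TEI namespace along with <p>.
single = ET.fromstring(
    f'<p xmlns="{TEI_NS}">A <i>very</i> short text.</p>'
)
print(single[0].tag)    # {http://www.tei-c.org/ns/1.0}i

# Declaring the prefix once on the root is possible, but the prefix
# must still appear on each convenience element to take effect.
prefixed = ET.fromstring(
    f'<p xmlns="{TEI_NS}" xmlns:tite="{TITE_NS}">'
    "A <tite:i>very</tite:i> short text.</p>"
)
print(prefixed[0].tag)  # {http://example.org/ns/tite}i
```

If that reading is right, a per-element assignment made in a single place would have to happen outside the document, for instance in a processing step run after keying.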
Any suggestions? We should solve this problem before Tite becomes
recommended for a fairly high profile benefit of the Consortium.
--
Daniel Paul O'Donnell
Associate Professor of English
University of Lethbridge
Chair and CEO, Text Encoding Initiative (http://www.tei-c.org/)
Co-Chair, Digital Initiatives Advisory Board, Medieval Academy of America
President-elect (English), Society for Digital Humanities/Société pour l'étude des médias interactifs (http://sdh-semi.org/)
Founding Director (2003-2009), Digital Medievalist Project (http://www.digitalmedievalist.org/)
Vox: +1 403 329-2377
Fax: +1 403 382-7191 (non-confidential)
Home Page: http://people.uleth.ca/~daniel.odonnell/