[tei-council] Chapter 23 - Using the TEI - part I

Lou's Laptop lou.burnard at oucs.ox.ac.uk
Sat Feb 2 14:10:15 EST 2008


[For completeness, I include here some comments from SPQR which Brett 
has already seen, but Council members have not]

Brett Zamir wrote:
> If I didn't mention it for Chapter 22, I think it would be helpful to 
> indicate whether module definitions, etc. should go in their own 
> dedicated documents, and if so, what a barest outline might be (e.g., 
> does it have a <TEI> and <teiHeader>, etc.?).
>
Well, this does rather depend on the application. Roma requires that a 
<schemaSpec> be given within a complete TEI document, but other 
processors might not.
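
For illustration only, here is a barest outline of the kind of document 
Roma expects: a complete TEI document whose body contains the 
<schemaSpec>. This is a sketch, not a prescribed layout; the title, the 
ident "myTEI" and the choice of modules are invented for the example.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <!-- hypothetical title -->
        <title>An example customization</title>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished example</p>
      </publicationStmt>
      <sourceDesc>
        <p>Written from scratch as an illustration</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>Prose documentation of the customization goes here.</p>
      <!-- "myTEI" is an invented identifier for the resulting schema -->
      <schemaSpec ident="myTEI" start="TEI">
        <moduleRef key="tei"/>
        <moduleRef key="header"/>
        <moduleRef key="core"/>
        <moduleRef key="textstructure"/>
      </schemaSpec>
    </body>
  </text>
</TEI>
```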


> *
> 23.2.1.1 Deletion of Elements
> *
> 1) I think it would be helpful to explain /why/ someone would want to 
> delete an element.
>
Why deletion in particular? Because it makes the schema smaller and 
easier to handle?
But you could argue the same for e.g. modification of valLists. There 
are many reasons for wanting to tighten up on a schema like tei_all!
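
As a hedged sketch of what such a tightening looks like in practice, the 
following specification deletes one element from an otherwise ordinary 
selection of modules. The ident "myTEI" and the choice of <mentioned> as 
the element to delete are arbitrary, picked only for the example.

```xml
<schemaSpec ident="myTEI" start="TEI">
  <moduleRef key="tei"/>
  <moduleRef key="header"/>
  <moduleRef key="core"/>
  <moduleRef key="textstructure"/>
  <!-- the element chosen here is arbitrary: it simply shows mode="delete" -->
  <elementSpec ident="mentioned" module="core" mode="delete"/>
</schemaSpec>
```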

> 2) It seems to me that this subsection and all of the others at the 
> same level are redundant with Chapter 22.
>
There is some repetition inevitably: 22 is the formal definition, 
whereas here we are giving the "how to".

> *23.2.1.4 Modification of Attribute and Attribute Value Lists
> *
> For the line, "It is often desirable to constrain the possible values 
> for an attribute to a greater extent than is possible by simply 
> supplying a TEI datatype for it. This facility is provided by the 
> <gi>valList</gi> element...", isn't there a way to use <rng:> prefixed 
> enumeration type elements to do this?
>
SPQR> "yes, you could do. the reason we want you to use <valList> is 
that it provides a context for documentation (desc, gloss and equiv)" 
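
To make that point concrete, here is a minimal sketch of a <valList> 
that constrains an attribute while documenting each permitted value. 
The element, the attribute and the values "person" and "place" are 
assumptions made up for the example, not taken from the chapter.

```xml
<elementSpec ident="name" module="core" mode="change">
  <attList>
    <attDef ident="type" mode="change">
      <!-- a closed list: only the values given here are permitted -->
      <valList type="closed" mode="replace">
        <valItem ident="person">
          <gloss>personal name</gloss>
          <desc>the name refers to a person</desc>
        </valItem>
        <valItem ident="place">
          <gloss>place name</gloss>
          <desc>the name refers to a place</desc>
        </valItem>
      </valList>
    </attDef>
  </attList>
</elementSpec>
```

An enumeration written directly in RELAX NG would constrain the values 
equally well, but would offer nowhere to put the <gloss> and <desc>.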
>
> *23.2.1.5 Class Modification
> *
> 1) "To add an element to a class in which it is not already a member, 
> all that is needed is to supply a <gi>memberOf</gi> element *with the 
> implicit value for its <att>mode</att> attribute of
> <val>add</val>*. *For example, *to add an element to the <ident 
> type="class">att.typed</ident> class, we include a declaration like
> the following:
> <egXML xmlns="http://www.tei-c.org/ns/Examples" 
> rend="full"><elementSpec ident="eg" module="tagdocs" mode="change" 
> ns="http://example.com/ns">
>   <classes mode="change">
>     *<memberOf *key="att.typed"/>
>   </classes>
> </elementSpec></egXML>
>
> I don't see how the implicit value here would be "add", since the mode 
> specified is "change". Even if it is, I find this explanation a bit 
> confusing.
>
SPQR> "The mode on <classes> is "change"; the implicit mode is on 
<memberOf>. reworded, hopefully clearer now."

I don't think the @mode on <memberOf> need be mentioned at all, since it 
confuses the issue.


> 2) For this section, there also seem to be inconsistency between the 
> explanation and example:
>
> "...the <gi>classes</gi> element may indicate this by *defaulting its 
> <att>mode</att> attribut*e. *By default, this attribute has the value 
> <val>replace</val>*, implying that the memberships indicated by its 
> child <gi>memberOf</gi> elements the only ones applicable.
> *Thus *the following declaration: <egXML 
> xmlns="http://www.tei-c.org/ns/Examples" rend="full"><elementSpec 
> ident="term" module="core" *mode="change" *ns="http://example.com/ns">
>   <classes>
>     <memberOf key="att.interpLike"/>
>   </classes>
> </elementSpec></egXML>
> would have the *effect of removing the element* <gi>term</gi> from
> both its existing attribute classes, and adding it to the <ident 
> type="class">att.interpLike</ident> class.</p>
> <p>*If however the <att>mode</att> attribute is set to <val>change</val>*"
SPQR> "no, it's correct. the <elementSpec> has a mode "change" as normal, 
but the <classes> has no mode, and thus defaults to "replace". It's weird, 
tho, I will change it."

(This slightly anomalous behaviour of @mode on <classes> was a recent 
change; I have also tried to clarify the text a bit further.)
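
For contrast with the "replace" case quoted above, here is a sketch of 
the "change" case, in which existing memberships are kept unless 
explicitly deleted. It assumes that att.typed is one of the classes 
<term> currently belongs to; substitute whichever class applies.

```xml
<elementSpec ident="term" module="core" mode="change">
  <!-- mode="change": memberships not mentioned here are left alone -->
  <classes mode="change">
    <!-- no mode given on memberOf, so the membership is added -->
    <memberOf key="att.interpLike"/>
    <!-- assumed existing membership, explicitly removed -->
    <memberOf key="att.typed" mode="delete"/>
  </classes>
</elementSpec>
```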

>
> *23.2.2 Modification and Namespaces*
>
> 1) I changed the line from "*All the *elements defined in the TEI 
> scheme are labelled as belonging to a single namespace" to 
> "*Essentially all of the *elements defined in the TEI scheme are 
> labelled as belonging to a single namespace" due to e.g., 
> http://www.tei-c.org/ns/Examples
>
> Likewise for the line later in 23.3.4: "All elements in a TEI Schema 
> which represents concepts from the TEI abstract model belong to the 
> TEI namespace, <ident type="ns">http://www.tei-c.org/ns/1.0</ident>, 
> maintained by the TEI along with additional namespaces for language 
> variants." Maybe the former example should also mention the language 
> variants as well?
>
SPQR> "I'm removing that hostage to fortune of the language variants; 
Lou, reinstate if you disagree "

I've made some textual changes to reflect this concern, though not 
exactly the ones you propose.

> 2) I think the content discussed here, e.g., about @ns would be well 
> to be introduced in Chapter 22.
>
Yes, it should be better documented there.

>
> 3) For the lines, "Suppose, for example, that we wish to add a new 
> attribute <att scheme="imaginary">topic</att> to the existing TEI 
> element <gi>p</gi>.  *In the absence of namespace considerations, this 
> would be an unclean modification*, since <gi>p</gi> does not currently 
> have such an attribute. The most appropriate action is to *explicitly 
> attach the new attribute to a new namespace *by a declaration such as 
> the following:....*Since <att scheme="imaginary">topic</att> is 
> explicitly labelled as belonging to something other than the TEI 
> namespace, we regard the modification which introduced it as clean.*", 
> as I understand it, adding a namespace would still not be "clean" 
> (although it might be conformable if the attribute could be safely 
> stripped or converted), because it still requires the addition of new 
> attributes not otherwise allowed in a regular TEI conformant document.
>
SPQR> "not sure I get you here. do you mean the namespace-declaring 
attributes? the new attributes will be added, but not being in the TEI 
namespace are ok"

Yes, I am a bit puzzled by your comment too. As the example shows, you 
can add attributes from other namespaces with impunity, since 
cleanliness only applies to elements etc. from the TEI namespace.
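
A minimal sketch of the kind of declaration under discussion, followed 
by a document fragment that uses it. The namespace URI 
http://www.example.org/ns/nonTEI, the prefix "my" and the value 
"rabbits" are all invented for the example.

```xml
<!-- in the ODD: the new attribute is explicitly placed in a non-TEI namespace -->
<elementSpec ident="p" module="core" mode="change">
  <attList>
    <attDef ident="topic" mode="add"
            ns="http://www.example.org/ns/nonTEI">
      <desc>indicates the topic of a TEI paragraph</desc>
    </attDef>
  </attList>
</elementSpec>

<!-- in a document: the attribute carries a prefix bound to that namespace -->
<p xmlns="http://www.tei-c.org/ns/1.0"
   xmlns:my="http://www.example.org/ns/nonTEI"
   my:topic="rabbits">Flopsy, Mopsy, Cottontail, and Peter...</p>
```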


> 4) For the line, "A namespace-aware processor will regard this 
> document as valid according to the unmodified schema.", if the schema 
> is "unmodified", then it will not have imported the namespace, which 
> would mean it was not valid, no?
>
SPQR> "yes, this sentence is odd. reworded."

> 5) "The namespace for such translations is the same as that for the 
> canonical namespace, *suffixed by the two character *language 
> identifier. A schema specification using the  Chinese translation, for 
> example, would use the namespace <ident 
> type="ns">http://www.tei-c.org/ns/1.0/zh</ident>". Might this be 
> changed to say "two or four or more character language identifier"? 
> Especially in the case cited, that makes a big difference what the 
> "dialect" is.
>
SPQR> "yes, agreed"

I've added a cross reference to #CHSH

> *23.3 Conformance*
>
> 1) I would think that even the first chapter might contain details 
> about the information presented here.
>
Yes, this might be a good idea.

> 2) How about turning the definitions for TEI Conformant, etc. into a 
> list or subsections? I think this is important enough to want to be 
> able to refer back to it quickly.
>
There are quite a few lists there already, of course.

> 3) I think it would be helpful to indicate here how a "clean 
> modification" relates to conformance.
>
It's a contentious topic: perhaps you would like to expound the issues 
as you see them?

> 4) For the description of an extension, /"A document is said to use a 
> <term>TEI Extension</term> if it is a well-formed XML document which 
> is valid against a TEI Schema which contains additional distinctions, 
> representing concepts not present in the TEI abstract model, and 
> therefore not documented in these Guidelines. Such a document cannot, 
> in general, be algorithmically conformant since it cannot be 
> automatically transformed without loss of information."/, might the 
> thought be conveyed here that some loss of information might not be 
> very critical in certain cases? For example, if a TEI processor, as 
> long as it were programmed to ignore elements it didn't recognize (or, 
> with my suggestion to allow ANY content within certain elements like 
> <graphic>), might be used to view a TEI document with some SVG inside, 
> without any deleterious effects. One might also point out that such 
> "extensions" might in some cases be even easier to process than 
> "Conformable" ones, given that they might use the more easily 
> recognized namespace mechanism rather than some other algorithm.
>

I don't agree. A TEI processor *must* be able to validate documents, and 
that means that their schema must declare all the elements used, 
excepting only those which come from other namespaces. But maybe I am 
misunderstanding your suggestion?


> 5) When discussing "TEI Recommended Practice", might mention be made 
> here of being able to use Schematron to enforce certain practices not 
> representable by the other schemas?
>
Better integration of schematron as a means of enforcing constraints not 
expressible or not expressed by schemas is a major work item for the 
next release...

> *23.3.1 Well-formedness criterion*
>
> 1) The reference here to successors to XML 1.0, made me look back and 
> perhaps this line in Chapter 1 might be changed, since theoretically 
> at least you might have some people looking to put documents into 
> Mongolian, etc., which can only be done by XML 1.1: "Attributes of 
> type <ident type="datatype">data.name</ident> are also words in this 
> sense, but they have the additional constraint that they must be legal 
> XML identifiers, as defined by the XML 1.0 specification* (or 
> successors)*."
>
Agreed: done.

> 2) Would this line, "Other ways of representing the concepts of the 
> TEI abstract model are possible, and other representations may be 
> considered appropriate for use in particular situations (for example, 
> for data capture, or project-internal processing)." be suitable for 
> adding mention of representing non-hierarchical information (e.g., 
> SGML-like ones)?
>
What SGML-like ways of representing non-hierarchical info are you 
thinking of? CONCUR is the only one I am aware of.

> *23.3.2 Validation Constraint
> *
> 1) "All <term>TEI Conformant</term> documents must validate against a 
> schema file that has been *derived from the published TEI 
> _Guidelines_*, combined and documented in the manner described in 
> section <ptr target="#MD"/>. We call the formal output of this process 
> a <term>TEI Schema</term>". Is this supposed to be derived from the 
> published ODD files or the like instead of "Guidelines"? If it in fact 
> does mean the Guidelines, maybe this should read "derived from the 
> directions outlined in the published TEI Guidelines"?
>
The Guidelines comprise both text and schema specifications.

> 2) Could this line, "No schema language fully captures all the 
> constraints implied by conformance to the TEI  abstract model." safely 
> be qualified to say "No *single *schema language"?
>
I don't see any significant difference between the two assertions, but I 
am happy to insert the word.
 
> 3) If the W3C Schema and Relax NG are fully interchangeable as far as 
> conversion from ODD/expression of TEI, might this line "A document 
> which is valid according to a TEI schema  represented using one schema 
> language may not be valid against the same schema expressed in other 
> languages" be expanded to make that clear (when I use Roma, for 
> example, I am unaware as to whether W3C Schema and RNG are equally 
> constraining)?
>
This is Roma-specific, however, and we are talking general principles 
here. It just so happens that Roma generates XSD by using trang to 
convert the RELAX NG; a different schema processor might decide to 
generate XSD directly, and might also be able to carry over constraints 
expressed in the ODD by means of schematron rules.

> 4) Given the earlier line, "TEI conformance implies that the schema 
> used to determine validity of a given document should be derived from 
> the present Guidelines, preferably by means of an ODD which references 
> and documents the schema fragments which the Guidelines define.", 
> might the line "derivation from an ODD is a *necessary* but not a 
> sufficient condition" be altered to reflect that a TEI-conformant 
> document must only /preferentially/ be derived from an ODD? 

I've removed "preferably". So sue me.

> And likewise for the line later in 23.3.5, "a TEI Schema can *only be 
> *generated from a TEI ODD..."?
>
See above.

> *23.3.3 Conformance to the TEI Abstract Model*
>
> For the lines, "...the class membership of an existing TEI element 
> cannot therefore be changed without changing the model. Elements can 
> however be removed from a class by deletion, and *new non-TEI elements 
> can be added to existing TEI classes*." Might it be stated here what 
> effect doing so may have on conformance?
>
Not sure what that effect is... we are talking about classes as a 
component of the abstract model here, so a non-TEI-namespaced membership 
in a class is a rather nebulous idea.


> *23.3.5 Documentation Constraint*
>
> 1) For the line, "A TEI Conformant document should therefore always be 
> accompanied by (or refer to) a valid <term>TEI ODD file</term> 
> specifying which modules, elements, classes, etc., are in use together 
> with any modifications or renamings applied, and from which a TEI 
> Schema can be generated to validate the document.", I think this might 
> be a good place to mention RDDL (the XHTML extension placed where a 
> namespace URL points, which lets you indicate further resources 
> related to a given namespace).
>
This has been mentioned before (by you I think?) and is definitely 
something we need to consider further. Please put in a SF feature request!

> 2) Does Roma have a means of providing an ODD file? Might a link be 
> added here to refer to resources for preparing such a file? I imagine 
> it could be pretty intimidating coming to the guidelines if one 
> weren't aware of resources like Roma and perhaps people giving up if 
> they thought they had to do all of this from scratch.
>
Roma does indeed allow you to save an ODD file as well as process one.

> *23.3.6 Varieties of TEI Conformance*
>
> For the line "If not, then the document can only be considered TEI 
> Conformant if it validates against a predefined TEI Schema *and 
> conforms to the TEI abstract model*.", isn't the last part redundant 
> with the next line asking whether the markup represents the TEI 
> abstract model?
>
Not really. The questions are to be answered in the sequence given, so 
if you answered "yes" to this one, you get the test for conformance to 
the abstract model next; if you answered "no", you get it mentioned 
conditionally...


