[tei-council] review of IM

Daniel O'Donnell daniel.odonnell at uleth.ca
Sun Oct 28 00:57:29 EDT 2007


On Sat, 2007-10-27 at 21:51 -0400, Syd Bauman wrote:
> I read revision r3778 of IM. I note, however, that since then the
> source files have been re-arranged a bit. The filename of the chapter
> should be USE-UsingTEI.xml.
> 
> 
> It's not clear to me that this section needs to be in the Guidelines
> at all. (It's very clear that it needs to exist -- it is important
> documentation.) It is currently part of the chapter "Using the TEI",
> but really has very little to do with _using_ the TEI. It is probably
> better positioned as an appendix.

These are not things we want to hear the Saturday before launch. At this
point, always, always, always say what needs to be fixed.


I'll go through these tomorrow a.m.
> 
> The entire section does not mention the inclusion of Schematron rules
> in an ODD, nor their extraction.
> 
> 
> * #IM/p[1]: At the moment, the second word should be "section" not
>   "chapter". However, as I say above, I think maybe it should be
>   "appendix", if anything.
> 
> * Passim: very often the adjectival "firstly", "secondly", and
>   "thirdly" are used where I think we usually use "first", "second",
>   and "third".
> 
> * #IM/p[4]: deserves a re-write; here's a starting point:
>       <p>An ODD processor is not mandated to perform these two stages
>       in sequence, but this may well be the simplest approach. The
>       ODD processing tools provided by the TEI Consortium and used to
>       process the source of these Guidelines take this approach.</p>
> 
> * #IM-unified/p[1], 1st sentence, "... which specifies the name and
>   default namespace of the result.": I'm not sure what is meant by
>   "the name" (name of what? the schema? -- it doesn't have a name,
>   does it? the file that holds the schema? the identifier used to
>   refer to the schema?) or how it is specified (the ident=, I
>   presume?); it is not stated how the namespace is specified. Here is
>   a suggested re-work of what I think this sentence is trying to
>   convey. 
>       <p>Merging an ODD customization with the TEI P5 ODD
>       specification is driven by a <gi>schemaSpec</gi> element:
>       <specDesc key="schemaSpec" atts="ident ns"/> 
>       The <att>ident</att> attribute is required; it provides a name
>       for the generated schema. Other components of the ODD
>       processing system may use this name to refer to the schema
>       being generated, e.g. in issuing error messages or as the base
>       name of the generated output schema file or files. The
>       <att>ns</att> attribute may be used to specify the default
>       namespace within which elements valid against the resulting
>       schema belong, as discussed in <ptr ref="#MDNS"/>.
> 
> * #IM-unified/p[1], 2nd sentence (pre-list): should perhaps be a bit
>   more descriptive: "The main content of the <gi>schemaSpec</gi>
>   element consists of a series of specialized elements, in any order,
>   each of which falls into one of four types."
> 
> * #IM-unified/p[1]/list[1]: suggested revision follows. Note the PIs,
>   which indicate spots someone who understands this process better
>   than I should check.
> 
>   <list type="ordered">
>     <label>specifications</label>
>     <item>The TEI ODD specification elements <gi>elementSpec</gi>,
>       <gi>classSpec</gi>, and <gi>macroSpec</gi> may appear as direct children
>       of <gi>schemaSpec</gi>. Each occurrence must bear a <att>mode</att>
>       attribute which determines how it will be processed.<note place="foot">We
>         do not here say what happens in case of errors; a specification in
>           <val>add</val> mode which is also present in an imported module should
>         obviously be flagged as an error.</note> If the value of <att>mode</att>
>       is <val>add</val>, then the object is simply copied to the output, but if
>       it is <val>change</val>, <val>delete</val>, or <val>replace</val>, then it
>       will be looked at by other parts of the process.</item>
>     <label>reference to specifications</label>
>     <item><gi>specGrpRef</gi> elements refer to <gi>specGrp</gi> elements that
>       occur elsewhere in the current ODD document or even in another document
>       entirely. A <gi>specGrp</gi> element, in turn, groups together a set of
>       ODD specifications (among other things, including further
>       <gi>specGrpRef</gi> elements). The use of <gi>specGrp</gi> and
>         <gi>specGrpRef</gi> permits the ODD markup to occur at the points in
>       documentation where they are discussed, rather than all inside
>         <gi>schemaSpec</gi>. The <att>target</att> attribute of any
>         <gi>specGrpRef</gi> should be followed, and the <gi>elementSpec</gi>,
>         <gi>classSpec</gi>, and <gi>macroSpec</gi> elements in the
>       corresponding <gi>specGrp</gi> should be processed as described in the
>       previous item; <gi>specGrpRef</gi> elements should be processed as
>       described here.</item>
>     <label>references to TEI modules</label>
>     <item><gi>moduleRef</gi> elements with <att>key</att> attributes refer to
>       components of the TEI. The value of the <att>key</att> attribute matches
>       the <att>ident</att> attribute of a TEI module. The <att>key</att> must be
>       dereferenced by some means, such as reading an XML file with the TEI ODD
>       specification (either from the local hard drive or off the web), or
>       looking up the reference in an XML database (again, locally or remotely);
>       whatever means is used, it should return a stream of XML containing the
>       element, class, and macro specifications <?tei is this right? --sb ?>
>       belonging to the specified module. These specification elements can then
>       be checked against overrides in the <gi>schemaSpec</gi> being processed.</item>
>     <label>references to external modules</label>
>     <item><gi>moduleRef</gi> elements with <att>url</att> attributes refer to
>       external schemas written in RELAX
>       NG<?tei do these have to be in XML or compact syntax? --sb?>. These
>       should remain untouched, and be passed directly to the output schema when
>       it is created. </item>
>   </list>
> 
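For illustration, a minimal sketch of a <schemaSpec> holding each of the four kinds of child described in the suggested list above. The ident value, the specGrp identifier, and the example.org URL are invented for this sketch, and the choice of modules and of the deleted element is arbitrary.

```xml
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0"
            ident="myTEI" ns="http://www.tei-c.org/ns/1.0">
  <!-- references to TEI modules: each key matches the ident of a TEI module -->
  <moduleRef key="tei"/>
  <moduleRef key="header"/>
  <moduleRef key="core"/>
  <moduleRef key="textstructure"/>
  <!-- reference to an external RELAX NG schema (hypothetical URL),
       meant to be passed through to the output untouched -->
  <moduleRef url="http://example.org/schemas/external.rng"/>
  <!-- reference to a specGrp (hypothetical id) defined elsewhere in the ODD -->
  <specGrpRef target="#mySpecs"/>
  <!-- a specification as a direct child; mode says how it is processed -->
  <elementSpec ident="hi" module="core" mode="delete"/>
</schemaSpec>
```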
> * #IM-unified/p[2]: 
>   - I don't believe the term "object" has been defined, and it should be
>   - insert "with the <att>key</att> attribute" between
>     "<gi>moduleRef</gi>" and "must"
>   - in the list, the values of mode= are mis-encoded as <att> instead of <val>
>   - the list says what to do with objects of same ident= when mode=
>     is 'delete', 'replace', or 'change', but not 'add' (footnote 87
>     above notwithstanding -- it is too far removed from this list,
>     I'd say)
> 
> * #IM-unified/p[3], 2nd sentence, before the <list>: how about
>   "Each component could fall into one of four categories:"?
> 
> * #IM-unified/p[3]/list[1]/item[1]: Last I knew, this included the
>   xml:id= attribute, with the result that you could not use an
>   xml:id= value in your customization that occurs on an object in the
>   TEI ODD specification (at least, if that module is included). I'm
>   wondering if xml:id= should be excluded from this rule, so that
>   other ODD processors may get around this restriction. Or does that
>   lead to madness?
> 
> * #IM-unified/p[3]/list[1]/item[3], parenthetical: missing ", and",
>   but moreover I'm uncomfortable with the loose use of "elements",
>   "macros", and "attributes" for "when the ODD processor is building
>   an element" or whatever. Would the following do?
>   
>         (<gi>equiv</gi>, <gi>desc</gi>, <gi>gloss</gi>,
>         <gi>exemplum</gi>, <gi>remarks</gi>, and <gi>listRef</gi> in
>         the specifications of elements or macros, and
>         <gi>datatype</gi> and <gi>defaultVal</gi> in the
>         specification of attributes)
> 
> * #IM-unified/p[3]/list[1]/item[3], after parenthetical: insert
>   "occurrences" after "all".
> 
> * #IM-unified/p[3]/list[1]/item[4]: "i.e." -> "e.g."; make
>   "attribute" plural:
>      <item>identified objects (i.e. those with an <att>ident</att>
>      attribute, e.g. <gi>attDef</gi> and <gi>valItem</gi>) are
>      processed according to their <att>mode</att> attributes,
>      following the rules in this list.</item>
>   It would be better, I think, to reword the whole list to use
>   singular subjects, e.g. "Each object which can occur ... is taken
>   ..." 
> 
> * #IM-unified/p[4], sentence 2: Should we be pointing out that the
>    example demonstrates a non-conformant customization? In any case,
>    the term "element" should probably be more specific:
>       Consider this simple example of a non-conformant customization
>       to the <gi>p</gi> element:
> 
> * #IM-unified/p[4], between the <egXML>s: s/affect/effect/; reverse
>   "not" and "to"; also probably good to expand "the att.typed class";
>   thus perhaps
>        The effect of making <gi>p</gi> a member of the <name
>        type="class">att.typed</name> class is to provide it with both
>        the <att>type</att> and <att>subtype</att> attributes. If we
>        want <gi>p</gi> <emph>not</emph> to have the
>        <att>subtype</att> attribute, ...
> 
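A sketch of the kind of customization under discussion here. This is a reconstruction for illustration only, not the text of the <egXML> in the chapter, and it does not settle whether a bare <classes> in change mode replaces or extends the existing class memberships of <p>.

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
             ident="p" module="core" mode="change">
  <!-- make <p> a member of att.typed, which supplies type and subtype -->
  <classes>
    <memberOf key="att.typed"/>
  </classes>
  <!-- then remove the subtype attribute from <p> alone -->
  <attList>
    <attDef ident="subtype" mode="delete"/>
  </attList>
</elementSpec>
```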
> * #IM-unified/p[4], after 2nd <egXML>: change <code> to <tag>
> 
> * #IM-unified/p[6]: s/entire/entirely/; but moreover, why is it
>   easier to deal with multiple examples? 
> 
> * #IM-unified/p[7]: delete first comma; I'm not fond of the
>   "whether to take account of" construct. How about "<p>When
>   processing the content models of elements and the content of
>   macros, the processor has to decide whether to take deleted
>   elements into account or not."?
> 
> * #IM-unified/p[7]/note, sentence 1: s/PizzaChef/Pizza Chef/; 
> 
> * #IM-unified/p[7]/note, sentence 2: would "The roma program behind
>   the P5 Roma application is not as sophisticated, ..." be incorrect?
>   It reads better.
> 
> * #IM-unified/p[7], between the <egXML>s --
> 
>   "... the <gi>choice</gi> is simply <att>model.global</att>.":
>   should be more like "... then <name
>   type="class">model.global</name> is left as the only child of
>   <gi>rng:choice</gi>".
> 
>   Notice that <choice> needs to be qualified, as it is also the name
>   of a TEI element. (In general, I think we should qualify all
>   elements not from the TEI namespace, except perhaps in SG.)
> 
>   "is itself inside an <gi>zeroOrMore</gi> inside a <gi>group</gi>":
>   the "an" should be an "a".
> 
> * #IM-unified/p[7], right after 2nd <egXML>: before the example we
>    were talking in generic terms, but after it we are talking about a
>    specific element name.
>       "and it has been deleted (for example, if <gi>figDesc</gi> had
>       been deleted in the customization in which the above example
>       occurs)"
>    That's not too good, but you get the idea.
>    BTW, I'm curious: why is it necessary to remove the reference?
>    Couldn't it just be resolved to the pattern "empty"?
> 
> * #IM-unified/p[7]/note: How about the following:
>       Note that deletion of required elements will cause the schema
>       specification to mark as valid instances that cannot be TEI
>       Conformant documents since they break the TEI abstract model.
>       Conformance topics are addressed in more detail in <ptr
>       target="#CF"/>.
> 
> * Same para, next sentence, "consequentially": I wonder if the
>   word "consequently" is what is intended, in which case it should be
>   moved to be the 1st word of the sentence:
>     Consequently, surrounding constructs, such as a
>     <gi>rng:zeroOrMore</gi>, may also have to be removed.</p>
>   If "consequentially" is what was meant, we need to explain what
>   consequence is of concern.
> 
> * #IM-unified/p[8]: "flat set" is not explained. (I think it would
>   be good to explain it, but low priority.)
> 
> * #IMGS: In this section the voice switches from making the ODD
>   processor the active party ("an ODD processor must ...") and things
>   like "it will be necessary to remove" (what is that -- 'impersonal
>   passive'?) to the first person plural.
> 
> * #IMGS/p[1]: The fact that order matters in order to give "the best
>   chance of successfully supporting all the schema languages" perhaps
>   should be mentioned before the actual sequence of events. Although
>   I have to admit, I have not quite figured out why processing order
>   matters with respect to schema language. (It is very clear that
>   output order matters for DTDs: see #IM-makeDTD.)
> 
> * #IMGS/p[2], 1st 2 sentences, "Firstly, a decision must be made
>   about which schema language is going to be used. The TEI ODD
>   specification, using RELAX NG to express content models, is
>   slightly biased towards this language,": The first sentence seems
>   odd -- I would kinda hope software engineers designing an ODD
>   processor know what output they want. I also would hope that we
>   consider ourselves a wee bit more than _slightly_ biased towards
>   RELAX NG. 
> 
>      An ODD processor may use any desired schema language or
>      languages as its schema output. The TEI ODD specification uses
>      RELAX NG to express content models, and is therefore biased
>      towards this language. However, the current TEI ODD processing
>      system is capable of producing schema output in the three main
>      schema languages, as follows:
> 
> * #IMGS/p[2]/list/item[1]: s/direct/directly/; also "a RELAX NG
>   compact version" should be "a version in the compact syntax" or
>   some such.
> 
> * #IMGS: In this section the `trang` program is encoded as an
>   <ident>; in the previous section Roma, I think it was, was not
>   encoded at all. I think that all references to programs, utilities,
>   commands, etc. should be encoded as <name type='pgm'>. (After all,
>   "trang" is the name of a program.)
> 
> * #IMGS/p[3]: if the rewrite of the beginning of para 2 is accepted,
>   then this should be deleted.
> 
> * #IMGS/p[4], last sentence: is "Roma processors" (plural) correct?
>   Also, to anyone who has read a schema, "in as simple a style as
>   possible" seems like an exaggeration. (E.g., much of the indirection
>   could be resolved -- not that I think this is a good idea, mind
>   you.) How about "in a comparatively simple style"?
> 
> * #IMGS/p[5]/eg[1] and eg[2]: Since there is no markup in the
>   examples, the CDATA marked sections are superfluous.
> 
> * #IMGS/p[5] text in between the two <eg>: The idea that "the
>   knowledge that the attributes such as <att>n</att> and
>   <att>rend</att> come from the global attribute class is lost" seems
>   pretty counter-intuitive: everyone and anyone can see that n= and
>   rend= come from the global attribute class, because the patterns
>   used are named "att.global.n" and "att.global.rend". Here is a
>   suggested re-wording:
>     In the above, a redefinition of an attribute class will have no
>     effect, as each class has already been expanded to its
>     constituent attributes.
> 
> * #IMGS/p[5] text after the 2nd <eg>:
>   - change "class attributes" to "attribute classes", no?
>   - change "with a pointer" to "via a reference"
> 
> * #IMGS/p[6], last sentence, "An ODD processor is not required to
>   support both.": Perhaps we should mention that for processing TEI
>   ODDs, the simple schema output is at least vastly preferred, if not
>   required.
> 
> * #IMGS/p[7]: the example <sp> declaration is not simplified, it is
>   completely different (there is no place to put the speech!). If we
>   want to keep this example, I'd change "simplified" to "fictitious".
> 
> * #IMGS/p[7], after <eg>s: I'm not fond of the wording here (no
>   reason not to use the more precise industry-wide term "deterministic";
>   the last sentence makes it sound like it is a problem that RELAX NG
>   does not require determinism), but I think it is low priority and
>   can await 1.1, unless someone can re-word this a lot faster than I.
> 
> * #IMGS/p[8], "... mandate any particular schema, but it is ...":
>   s/schema/mechanism/;
> 
> * #IMGS/p[8], rest of para: Why are we recommending this only for
>   DTDs? Just because it is hard for us to do for RELAX NG doesn't
>   mean we should not recommend ODD processors do this.
> 
> * #IM-naming/head: how about "Names and Documentation in Generated
>   Schemas"? 
> 
> * #IM-naming/p[1], sentence 1: insert Oxford comma after "element".
> 
> * #IM-naming/p[1]/list/item[1]
>   - "... value of the <att>ident</att> attribute, prefixed ...":
>     insert "corresponding" after "the"
>   - "... distinctive prefix such as e.g. <val>tei_</val>.": remove
>     either "such as" or "e.g.".
>   - "(compact)": we haven't mentioned that examples are in the
>     compact syntax before, but I think it is a good idea that we do.
>     I suggest we standardize on "RELAX NG (compact syntax)" both here
>     and at #IMGS/p[5], just before the <egXML>. (Anywhere else?)
>   - I think "Referring strings have to be adjusted accordingly."
>     should be expanded. What exactly is a "referring string"?
>     Would something like "References to these patterns (or, in DTDs,
>     parameter entities) also need to be prefixed with the same
>     value." be correct?
> 
> * #IM-naming/p[1]/list/item[2], "... <gi>altIdent</gi> child, the
>   value of that is ...": re-word: "... <gi>altIdent</gi> child, its
>   content is ...".
> 
> * #IM-naming/p[1]/list/item[3], 2nd sentence: suggested re-wording: 
>      If there is only one occurrence of either of these elements, it
>      should be used; however if there are two or more occurrences with
>      different values of <att>xml:lang</att>, a locale indication in
>      the processing environment should be used to decide which to
>      use.
>   Note that this does not give advice on what to do when there are
>   two or more with the same value of xml:lang=. Fodder for a 1.1
>   improvement. 
> 
> * #IM-naming/p[2]/list/item[2]: there is an exception: colons are
>   removed first, so that the namespace prefix and attribute name are
>   run together, as in 'att.global.attribute.xmlid'.
> 
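To make the naming exception concrete, a sketch of what a generated pattern for the xml:id attribute of the global attribute class could look like in the RELAX NG XML syntax. Only the pattern name comes from the review above; the body of the definition is an assumption made for illustration.

```xml
<rng:define xmlns:rng="http://relaxng.org/ns/structure/1.0"
            datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
            name="att.global.attribute.xmlid">
  <!-- the colon of xml:id is dropped in the pattern name,
       but the attribute itself keeps its qualified name -->
  <rng:optional>
    <rng:attribute name="xml:id">
      <rng:data type="ID"/>
    </rng:attribute>
  </rng:optional>
</rng:define>
```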
> * #IMMA, after the <egXML>: reword to something like the following.
>      Note that in much of these Guidelines, RELAX NG schema fragments
>      are shown in the compact syntax; both the content of the
>      <gi>contents</gi> element and the unified ODD specification
>      generated by the TEI ODD processing software store RELAX NG in
>      the more verbose XML format. However, the two formats are
>      interchangeable.
> 
> * #IMCL/p[1], sentence 1: actually, a definition is generated, not
>   just an alternation. suggested rewording:
>      An ODD model class generates a RELAX NG pattern definition
>      listing all the members of the class present in the ODD in
>      alternation.
> 
> * #IMCL/p[2]/egXML[2]:
>   - I expected to see an <a:documentation> element; am I crazy?
>   - it would probably be a good idea to explain the reason behind two
>     definitions, one as 'empty' (I do not understand well enough to
>     explain it)
> 
> * #IMCL/p[2]/quote/following-sibling::text(), "Naturally, this
>   sort of use of the documentation elements is not mandatory, and
>   other ODD processors may ignore them when creating schemas.": other
>   ODD processors could do something else, too, so I'd suggest
>   something like:
>         Naturally, this sort of use of the documentation elements is
>         not mandatory, and other ODD processors may generate
>         alternate documentation or ignore them when creating schemas.
> 
> * #IMCL/p[3], before the <egXML>s: this paragraph does not follow
>   house style in referring to elements and attributes.
>       <p>An individual attribute consists of a <gi>rng:attribute</gi>
>       element with a <att>name</att> attribute derived according to
>       the naming rules described above. In addition, the ODD model
>       supports a <gi>defaultVal</gi> element, which is transformed to
>       a <att>defaultValue</att> attribute in the <ident
>       type="ns">http://relaxng.org/ns/compatibility/annotations/1.0</ident>
>       namespace on the <gi>rng:attribute</gi> element. The body of
>       the attribute definition is taken from the <gi>datatype</gi>
>       child, unless there is a supporting <gi>valList</gi> element
>       with a <att>type</att> attribute with a value of
>       <val>closed</val>. In that case a <gi>rng:choice</gi> is
>       generated, listing the allowed values.
> 
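A sketch of the output the suggested paragraph describes, for an attribute with a default value and a closed value list. The attribute name and its values are hypothetical; the two namespace URIs are the standard RELAX NG and compatibility-annotations namespaces.

```xml
<rng:attribute xmlns:rng="http://relaxng.org/ns/structure/1.0"
               xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
               name="status" a:defaultValue="draft">
  <!-- closed valList: the datatype is replaced by a choice of the allowed values -->
  <rng:choice>
    <rng:value>draft</rng:value>
    <rng:value>final</rng:value>
  </rng:choice>
</rng:attribute>
```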
> * #IMCL/p[3], after the <egXML>s: <ident> needs type="ns"; need to
>   cite the recommendation for marking up annotations this way.
>   (http://relaxng.org/compatibility-20011203.html, is it?)
> 
> * #IM-makeDTD/p[1], "... classes generate DTD entities,
>   the TEI ...": insert "parameter" after "DTD".
> 
> * #IM-makeDTD/p[1]/list/following-sibling::text(): I think this
>   sentence is far too colloquial for use in the Guidelines. I think
>   it can just be deleted.
> 
> * #IM-makeDTD/p[2]/eg[1]: I realize this is probably correctly
>   copied-and-pasted from some real DTD output, but I'm thinking that
>   the xmlns attribute should be declared with #FIXED.
> 
> * #IM-makeDTD/p[2], last sentence: "... the document is processing
>   by a DTD-aware ...": s/ing/ed/;
> 
> * #IMGD/p[1], 1st sentence:
>   - need a citation for Knuth's literate programming.
>   - latter half of sentence a bit wordy; suggested revision:
>        ... the previous sections have dealt with the
>        <term>tangle</term> process; to generate documentation, we now
>        turn to the <term>weave</term> process.
> 
> * #IMGD/p[2]: suggested revision:
>          An ODD customization may consist largely of general
>          documentation and examples, which should be processed
>          normally, but in addition it will contain a
>          <gi>schemaSpec</gi> and possibly some <gi>specGrp</gi>
>          fragments.
> 
> * #ref-faith: probably would be good to come up with a more recent
>   image. 
> 
> * #STPE: This section deals with instructions on how to "stitch
>   together" the RELAX NG or DTD schema fragments into a usable
>   schema. My recollection is that Council decided this information
>   should not be included in the Guidelines themselves, so I am
>   recommending we delete the entire section, and I am not giving it a
>   closer reading.
> 
> _______________________________________________
> tei-council mailing list
> tei-council at lists.village.Virginia.EDU
> http://lists.village.Virginia.EDU/mailman/listinfo/tei-council
-- 
Daniel Paul O'Donnell, PhD
Department Chair and Associate Professor of English
Director, Digital Medievalist Project http://www.digitalmedievalist.org/
Chair, Text Encoding Initiative http://www.tei-c.org/

Department of English
University of Lethbridge
Lethbridge AB T1K 3M4
Vox +1 403 329-2377
Fax +1 403 382-7191
Email: daniel.odonnell at uleth.ca
WWW: http://people.uleth.ca/~daniel.odonnell/


