[tei-council] Chapter 1

Mon Jan 7 13:29:52 EST 2008

Lou's convinced me on everything below except the standards business:
can one not have international standards that are not ISO? What about,
for example, doping standards in international athletics, or judging
standards in figure skating? Neither is a good standard, in that both
are problematically implemented, but they are international, do strive
for standard status, and are not ISO.

On Sat, 2008-05-01 at 20:22 +0000, Lou's Laptop wrote:
> Brett Zamir wrote:
> > Hi,
> >
> > This first chapter raised more questions for me than some of the 
> > others, though, as usual, many are more in the nature of stylistic 
> > suggestions. Some suggestions might occasionally seem redundant, but I 
> > think good communicative systems have some inherent redundancy, 
> > especially for dim-wits--I mean slow learners--like me.
> >
> > *General issues:*
> >
> > 1) Shouldn't specific chapters be prefixed with "Chapter" capitalized 
> > (and same with "Section" too)?
> >
> This could be generated by stylesheet if there was a general feeling it 
> should be, obviously. But no-one has suggested it till now. Some of the 
> titles are quite long...
>  
> > 2) From my limited perspective, customization, unless absolutely 
> > necessary, seems to be something to be avoided rather than encouraged.
> This is a fundamental point about the TEI. You cannot use the TEI scheme 
> without "customizing" in some sense -- you have to select which modules 
> you want.
> 
> > HTML had been fractured between only a couple of browsers and it was 
> > frustrating enough. While there may not be TEI processors besides Roma 
> > yet, I imagine that tools will eventually develop to allow browsing 
> > and searching of such documents in specifically focused, user-friendly 
> > ways, given the great potential of such semantically rich documents as 
> > TEI and the increasing popularity of XQuery, etc. I think it is great 
> > that TEI is modular, but, in my humble estimation, I think the 
> > documentation ought to (emphatically) highlight the disadvantages of 
> > customizing, for those who might ever wish to share their documents 
> > beyond their own internal use.
> >
> You'll have to tell us what you consider those disadvantages to be, I'm 
> afraid!
> 
> 
> > *The TEI Infrastructure - Specific Issues*
> >
> > 1.2 - Defining a TEI Schema
> >
> > 1) "The method...recommended by these Guidelines is to provide 
> > explicitly or by reference a TEI schema specification against which 
> > the document may be validated."  What is the means recommended here in 
> > saying "by reference"? Just by referring to a URL where the 
> > specification is kept? Is there a standard element for doing this 
> > (when not actually including the schema documentation elements)?
> 
> Maybe the phrase "by reference" is a bit misleading. All it means is 
> that there has to be some way of associating a document instance with 
> its schema. You can do that in a number of different ways, of course 
> (as discussed elsewhere), but one way is actually to provide the schema 
> and the instance together in the same document -- or to provide some 
> kind of reference for the schema. There is no standard element or 
> attribute for doing that in TEI (unless you are using XSD, of course).
> >
> > 2) "*A TEI-conformant schema* is a specific combination of TEI 
> > modules, *possibly also* including additional declarations that modify 
> > the element and attribute declarations contained by each module, for 
> > example *to suppress or rename some elements*." How can it still be a 
> > TEI-conformant schema if its elements are being renamed? Does this 
> > just mean that the process of renaming, etc. is conformantly-documented?
> >
> You need to read the chapter on conformance for the full picture but, 
> for example, a document in which all the element names are translated 
> into German or Chinese according to the mappings provided by the TEI is 
> still a TEI-conformant document.
> 
> 
> > 3) Might I suggest referring to a resources page which includes 
> > reference to Roma for the discussion which mentions that "the 
> > specification may be processed to generate a formal schema..."? If I'm 
> > reading this and looking for help, I have no idea how this process 
> > would be accomplished or where to look.
> 
> Well, there is already a reference at the end of the next paragraph but 
> one, but there's no reason not to add another here; so I have.
> 
> >
> > 1.3.1 Attribute Classes
> >
> > Might I suggest this section beginning with an example of a class with 
> > at least two attributes? You do give an example of a class with only 
> > one attribute later, and I think it is more salient to grasp the 
> > rationale for classes if the example already has more than one attribute.
> >
> Good suggestion. As it happens, att.naming adds more than one attribute 
> so I've added reference to the other one, along with a further 
> "rationale" for classes.
> 
> > 1.3.1.1 For the definition of xml:lang, it can not only indicate the 
> > language of the element content, but also potentially of a text 
> > attribute, no?
> >
> Only "potentially" because we have gone to some lengths to abolish 
> "text" attributes.
> 
> 
> > 1.3.1.1 Might I suggest adding a definition of xml:space here too (and 
> > a section for it?) or even the schema attachment attributes like 
> > xsi:schemaLocation? (since the discussion is for global attributes)
> >
> There has been some controversy about that within the Council, and the 
> current feeling seems to be that xml:space is such a bad thing we'd 
> rather not talk about it at all!
> 
> 
> > 1.3.1.1.1 In reference to @n, it is said "Its value may be any string 
> > of characters". Should this be stated as being limited to 
> > non-whitespace characters? I see the definition in the schema as @n 
> > being of type "data.word", but I'm not familiar with the regular 
> > expression components which define it ((\p{L}|\p{N}|\p{P}|\p{S})+).
> >
> 
> This has also been the subject of some controversy. The regular 
> expression syntax is standard enough, but there is some debate as to 
> whether we should relax it a bit more.
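
For readers unfamiliar with the pattern quoted above: \p{L}, \p{N}, \p{P} and \p{S} are the Unicode letter, number, punctuation and symbol categories, so the expression accepts one or more such characters and nothing else; whitespace belongs to none of them. A minimal Python sketch of an equivalent check follows. The function name is_data_word is hypothetical, and the check uses the standard unicodedata module as a stand-in; it is not how any TEI schema processor is known to implement the datatype.

```python
import unicodedata


def is_data_word(value: str) -> bool:
    """Approximate the pattern (\\p{L}|\\p{N}|\\p{P}|\\p{S})+ as a whole-string match.

    Every character must be in a Unicode general category whose first
    letter is L (letter), N (number), P (punctuation) or S (symbol),
    and the string must not be empty.
    """
    return len(value) > 0 and all(
        unicodedata.category(ch)[0] in "LNPS" for ch in value
    )


# Illustrative values only; none of these come from a real document.
for candidate in ["One", "iv.b", "chapter 1", ""]:
    print(repr(candidate), is_data_word(candidate))
```

On this reading "One" and "iv.b" would be acceptable values for @n, while "chapter 1" would not, because of the space.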
> 
> > 1.3.1.1.1. While I know that @n can be something like "One", it would 
> > seem to me that this kind of usage ought to be discouraged, as it 
> > could make queries much more difficult (e.g., it should be much easier 
> > to process a query for a range from 40-63 than trying to figure out 
> > "forty" to "sixty-three") (Some preparers of these documents must have 
> > no real idea of how useful tagging is for searching, so they don't 
> > give consideration to these things when they make such decisions.)
> >
> 
> This is a usage note, I think. For some applications it's considered 
> more important to preserve exactly the form of the identifier supplied 
> in the original than it is to facilitate its use as a navigation tool -- 
> there are plenty of other ways of doing the latter, after all.
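
To illustrate the trade-off discussed above, here is a minimal Python sketch of a range query over @n values. The sample values and the helper name n_in_range are made up for illustration. It only shows that plain numeric values can be selected by a numeric range directly, whereas spelled-out ones would need some further normalisation first.

```python
def n_in_range(n_value: str, low: int, high: int) -> bool:
    """Return True if n_value is a plain ASCII decimal number within [low, high]."""
    if not (n_value.isascii() and n_value.isdigit()):
        # Values such as "forty" cannot be compared numerically as they stand.
        return False
    return low <= int(n_value) <= high


# Hypothetical @n values as they might appear on a run of divisions.
n_values = ["39", "40", "forty", "63", "sixty-three", "64"]
selected = [v for v in n_values if n_in_range(v, 40, 63)]
print(selected)
```

The spelled-out values are simply skipped here; an application that preserves the original form in @n would have to carry a normalised number somewhere else to support this kind of query.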
> 
> > 1.3.1.1.1 While it is described as being "redundant" to add numbering 
> > where there are no unusual deviations, I'd think that their presence 
> > might also indicate during a tagging project that the numbering has or 
> > has not yet been addressed.
> >
> "addressed" in what sense?
> 
> > 1.3.1.1.2 - "The xml:lang attribute indicates the language, writing 
> > system, and character set associated with a given element and all its 
> > contents." Shouldn't this read something like "language, script, and 
> > regional or other variant associated with..."?  (Same with the 
> > definition of data.language).
> >
> I am not sure why it refers to character set, but I don't think "script 
> and regional or other variant" gains in precision on "writing system". 
> What do others think?
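
For comparison, the language tags that xml:lang takes (BCP 47) are built from subtags for language, script and region, among others. A deliberately simplified Python sketch of pulling those three apart follows. The helper split_lang_tag is hypothetical, and it ignores extended-language, variant, extension and private-use subtags, so it is not a full tag parser.

```python
def split_lang_tag(tag: str) -> dict:
    """Split a language tag into language, script and region (simplified).

    Assumes the first subtag is the language, a four-letter subtag is a
    script, and a two-letter or three-digit subtag is a region.
    """
    parts = tag.split("-")
    result = {"language": parts[0].lower(), "script": None, "region": None}
    for sub in parts[1:]:
        if len(sub) == 4 and sub.isalpha() and result["script"] is None:
            result["script"] = sub.title()
        elif result["region"] is None and (
            (len(sub) == 2 and sub.isalpha()) or (len(sub) == 3 and sub.isdigit())
        ):
            result["region"] = sub.upper()
    return result


# Illustrative tags only.
for tag in ["la", "en-GB", "zh-Hant-TW"]:
    print(tag, split_lang_tag(tag))
```

Nothing in such a tag names a character set, which may be relevant to the wording question raised above.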
> 
> 
> > 1.3.1.1.3 - "Although the contents of the rend attribute are free 
> > text, in any given project, encoders are advised *to settle on* a 
> > standard vocabulary with which to describe typographic or manuscript 
> > rendition of the text." Might I suggest changing this to "adopt" 
> > (since otherwise, it might sound a little more like the project 
> > encoders should come up with their own internal standard).
> >
> But that's precisely what they *do* do! However, I am happy to "settle 
> on" your proposed rewording.
> 
> 
> > 1.3.2 - Do divPart, etc. have superclasses all the way to the top? 
> > Aren't all classes eventually orphaned? Why is addrPart unique in such 
> > a regard--just because other classes often have at least one parent or 
> > child class?
> 
> addrPart is unusual (not unique) because it is used only within one 
> element -- whereas most classes are subclasses of other *classes*. The 
> hierarchy isn't complete -- not every class goes "all the way to the top".
> 
> >
> > 1.4.1 - macro.schemaPattern might be defined to begin "is a pattern to 
> > match elements" rather than as "A pattern to match elements..." since 
> > the other macros (at least in this context) all include the title 
> > within a sentence, rather than starting a new sentence.
> 
> That's a flagrant breach of house style: thanks for spotting it!
> 
> >
> > 1.4.2 - The documentation states "TEI-defined datatypes may be grouped 
> > into those which define normalised values for numeric quantities or 
> > probabilities, those which define various kinds of short-hand codes or 
> > keys, and those which define pointers or links" What about dates, 
> > etc., as mentioned in detail shortly afterward?
> 
> OK, added reference to temporal expressions.
> >
> > 1.4.2 - Maybe add "ISO" in front of "international standard" for the 
> > definition for data.temporal.iso (I know it's in the name itself, but 
> > it might help clarify...).
> >
> Doesn't "international standard" *mean* ISO?
> 
> > 1.4.2 - I made a change here to give an example which started with an 
> > underscore.
> >
> > 1.4.2. - Re: data.enumerated, perhaps change "This list may be open (in 
> > which case the list is advisory..." to the following?:
> >                 "This list may be open (in which case the list is 
> > advisory, following "TEI Recommended Practice"--see section 23.3 
> > Conformance)
> Have added wording similar to this.
> >
> > 1.4.2 - Mention that data.code is of the URI type?
> >
> We use the term data.pointer.
> 
> > 1.4.2. - "An attribute may, of course, take more than one value of a 
> > given type, for example a list of pointer values, or a list of words. 
> > In the TEI scheme, this information is regarded as a property of the 
> > datatype element used to document the attribute in question rather 
> > than as a distinct datatype" But DTDs, etc. may distinguish in some 
> > cases (e.g., NMTOKEN vs. NMTOKENS). Does such a difference in TEI 
> > still allow translation of the difference into the DTDs/schemas? (I 
> > presume it does, just wanted to check.)
> 
> Yes, this is implemented by the ODD processor that generates the schema.
> 
> >
> > 1.5 (very end of chapter) - Do the RELAX NG null values need also to 
> > be in a particular order? Otherwise, it seems this information doesn't 
> > belong here according to the context (that the classes are arranged in 
> > order), unless there is a stronger transition.
> >
> Yes, I believe they do need to be declared first.
> 
> > Unrelated issue:
> > http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-data.outputMeasurement.html 
> > spells XSLFO without a hyphen
> >
> 
> Not any more it doesn't.
> 
> 
> > take care,
> > Brett
> I'll try!
> 
> Lou
-- 
Daniel Paul O'Donnell, PhD
Chair, Text Encoding Initiative <http://www.tei-c.org/>
Director, Digital Medievalist Project <http://www.digitalmedievalist.org/>
Associate Professor and Chair of English
University of Lethbridge
Lethbridge AB T1K 3M4
Vox: +1 403 329 2378
Fax: +1 403 382-7191
Homepage: http://people.uleth.ca/~daniel.odonnell/