[tei-council] Chapter 1

Sat Jan 5 15:22:10 EST 2008

Brett Zamir wrote:
> Hi,
>
> This first chapter raised more questions for me than some of the
> others, though, as usual, many of them are stylistic suggestions.
> Some suggestions might occasionally seem redundant, but I
> think good communicative systems have some inherent redundancy, 
> especially for dim-wits--I mean slow learners--like me.
>
> *General issues:*
>
> 1) Shouldn't specific chapters be prefixed with "Chapter" capitalized 
> (and same with "Section" too)?
>
This could be generated by stylesheet if there was a general feeling it 
should be, obviously. But no-one has suggested it till now. Some of the 
titles are quite long...

> 2) From my limited perspective, customization, unless absolutely 
> necessary, seems to be something to be avoided rather than encouraged.
This is a fundamental point about the TEI. You cannot use the TEI scheme 
without "customizing" in some sense -- you have to select which modules 
you want.

> HTML had been fractured between only a couple of browsers and it was 
> frustrating enough. While there may not be TEI processors besides Roma 
> yet, I imagine that tools will eventually develop to allow browsing 
> and searching of such documents in specifically focused, user-friendly 
> ways, given the great potential of such semantically rich documents as 
> TEI and the increasing popularity of XQuery, etc. I think it is great 
> that TEI is modular, but, in my humble estimation, I think the 
> documentation ought to (emphatically) highlight the disadvantages of 
> customizing, for those who might ever wish to share their documents 
> beyond their own internal use.
>
You'll have to tell us what you consider those disadvantages to be, I'm 
afraid!

> *The TEI Infrastructure - Specific Issues*
>
> 1.2 - Defining a TEI Schema
>
> 1) "The method...recommended by these Guidelines is to provide 
> explicitly or by reference a TEI schema specification against which 
> the document may be validated."  What is the means recommended here in 
> saying "by reference"? Just by referring to a URL where the 
> specification is kept? Is there a standard element for doing this 
> (when not actually including the schema documentation elements)?

Maybe the phrase "by reference" is a bit misleading. All it means is 
that there has to be some way of associating a document instance with 
its schema. You can do that in a number of different ways, of course, 
(as discussed elsewhere) but one way is actually to provide the schema 
and the instance together in the same document -- or to provide some 
kind of reference for the schema. There is no standard element or
attribute for doing that in TEI (unless you are using XSD, of course).
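
For the XSD case only, a minimal sketch of how a processor might pick up
the schema reference from an instance. The file name my-tei.xsd and the
helper schema_hints are made up for illustration; the xsi attributes
themselves belong to XML Schema, not to the TEI.

```python
import xml.etree.ElementTree as ET

# Namespace of the XML Schema instance attributes (xsi:schemaLocation etc.)
XSI = "http://www.w3.org/2001/XMLSchema-instance"

def schema_hints(xml_text):
    """Return any XSD schema-location hints found on the root element."""
    root = ET.fromstring(xml_text)
    hints = {}
    for name in ("schemaLocation", "noNamespaceSchemaLocation"):
        value = root.get("{%s}%s" % (XSI, name))
        if value is not None:
            # schemaLocation holds whitespace-separated namespace/location pairs
            hints[name] = value.split()
    return hints

# "my-tei.xsd" is a hypothetical schema file used only for this example.
doc = """<TEI xmlns="http://www.tei-c.org/ns/1.0"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://www.tei-c.org/ns/1.0 my-tei.xsd">
  <text><body><p>Example</p></body></text>
</TEI>"""

print(schema_hints(doc))
# {'schemaLocation': ['http://www.tei-c.org/ns/1.0', 'my-tei.xsd']}
```

A document with no such attribute simply yields an empty result, in which
case the association has to be made some other way.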
>
> 2) "*A TEI-conformant schema* is a specific combination of TEI 
> modules, *possibly also *including additional declarations that modify 
> the element and attribute declarations contained by each module, for 
> example *to suppress or rename some elements*." How can it still be a 
> TEI-conformant schema if its elements are being renamed? Does this 
> just mean that the process of renaming, etc. is conformantly-documented?
>
You need to read the chapter on conformance for the full picture but, 
for example, a document in which all the element names are translated 
into German or Chinese according to the mappings provided by the TEI is 
still a TEI conformant document.
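
To illustrate why renaming need not lose anything, here is a rough sketch
that maps translated element names back to canonical ones. The mapping
below is invented for the example (it is not the TEI's own German
translation table), and namespaces are left out to keep it short.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from translated element names to canonical ones.
# These German names are illustrative assumptions, not the TEI's mappings.
TO_CANONICAL = {"abschnitt": "div", "kopf": "head", "absatz": "p"}

def canonicalize(elem, mapping):
    """Rename every element found in the mapping; leave other names alone."""
    for node in elem.iter():
        node.tag = mapping.get(node.tag, node.tag)
    return elem

translated = "<abschnitt><kopf>Titel</kopf><absatz>Text</absatz></abschnitt>"
root = canonicalize(ET.fromstring(translated), TO_CANONICAL)
print(ET.tostring(root, encoding="unicode"))
# <div><head>Titel</head><p>Text</p></div>
```

Because the renaming is a documented one-to-one mapping, the document can
be turned back into canonical form mechanically.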

> 3) Might I suggest referring to a resources page which includes 
> reference to Roma for the discussion which mentions that "the 
> specification may be processed to generate a formal schema..."? If I'm 
> reading this and looking for help, I have no idea how this process 
> would be accomplished or where to look.

Well, there is already a reference at the end of the next paragraph but
one, but there's no reason not to add another here; so I have.

>
> 1.3.1 Attribute Classes
>
> Might I suggest this section beginning with an example of a class with 
> at least two attributes? You do give an example of a class with only 
> one attribute later, and I think it is more salient to grasp the 
> rationale for classes if the example already has more than one attribute.
>
Good suggestion. As it happens, att.naming adds more than one attribute,
so I've added a reference to the other one, along with a further
"rationale" for classes.

> 1.3.1.1 For the definition of xml:lang, it can not only indicate the 
> language of the element content, but also potentially of a text 
> attribute, no?
>
Only "potentially" because we have gone to some lengths to abolish 
"text" attributes.

> 1.3.1.1 Might I suggest adding a definition of xml:space here too (and 
> a section for it?) or even the schema attachment attributes like 
> xsi:schemaLocation? (since the discussion is for global attributes)
>
There has been some controversy about that within the Council, and the
current feeling seems to be that xml:space is such a bad thing we'd 
rather not talk about it at all!

> 1.3.1.1.1 In reference to @n, it is said "Its value may be any string 
> of characters". Should this be stated as being limited to 
> non-whitespace characters? I see the definition in the schema as @n 
> being of type "data.word", but I'm not familiar with the regular 
> expression components which define it ((\p{L}|\p{N}|\p{P}|\p{S})+).
>

This has also been the subject of some controversy. The regular 
expression syntax is standard enough, but there is some debate as to 
whether we should relax it a bit more.
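
For anyone unfamiliar with the notation: \p{L}, \p{N}, \p{P} and \p{S}
are the Unicode letter, number, punctuation and symbol categories, so the
pattern accepts one or more such characters and nothing else (no
whitespace). A rough sketch of the same test follows; it emulates the
pattern with Unicode category lookups rather than a regex engine, and the
function name matches_data_word is made up.

```python
import unicodedata

# Major Unicode categories allowed by (\p{L}|\p{N}|\p{P}|\p{S})+
ALLOWED_MAJOR_CLASSES = {"L", "N", "P", "S"}

def matches_data_word(value):
    """True if value is non-empty and every character is a letter,
    number, punctuation mark or symbol (so no whitespace at all)."""
    if not value:
        return False
    return all(unicodedata.category(ch)[0] in ALLOWED_MAJOR_CLASSES
               for ch in value)

for candidate in ["42", "One", "iv.b", "chapter 1", ""]:
    print(repr(candidate), matches_data_word(candidate))
# '42' True
# 'One' True
# 'iv.b' True
# 'chapter 1' False
# '' False
```

Note that the sketch follows the pattern as quoted, so combining marks
and other categories outside those four are rejected as well.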

> 1.3.1.1.1. While I know that @n can be something like "One", it would 
> seem to me that this kind of usage ought to be discouraged, as it 
> could make queries much more difficult (e.g., it should be much easier 
> to process a query for a range from 40-63 than trying to figure out 
> "forty" to "sixty-three") (Some preparers of these documents must have 
> no real idea of how useful tagging is for searching, so they don't 
> give consideration to these things when they make such decisions.)
>

This is a usage note, I think. For some applications it's considered
more important to preserve exactly the form of the identifier supplied
in the original than it is to facilitate its use as a navigation tool -- 
there are plenty of other ways of doing the latter, after all.
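
To make the trade-off concrete, a small sketch of the kind of range query
Brett has in mind. The sample markup and the helper lines_in_range are
invented for the example; the point is only that a value such as
"forty-one" drops out of a numeric range test.

```python
import xml.etree.ElementTree as ET

# Invented sample: verse lines whose n values are mostly numeric.
sample = """<div>
  <l n="39">line text</l>
  <l n="40">line text</l>
  <l n="forty-one">line text</l>
  <l n="63">line text</l>
  <l n="64">line text</l>
</div>"""

def lines_in_range(root, low, high):
    """Collect the n values of <l> elements whose n is a decimal number
    between low and high inclusive; non-numeric values are skipped."""
    selected = []
    for line in root.iter("l"):
        n = line.get("n", "")
        if n.isdecimal() and low <= int(n) <= high:
            selected.append(n)
    return selected

root = ET.fromstring(sample)
print(lines_in_range(root, 40, 63))
# ['40', '63']
```

A project that keeps the original form in n would have to supply the
numeric value by some other route if it wanted this sort of query.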

> 1.3.1.1.1 While it is described as being "redundant" to add numbering 
> where there are no unusual deviations, I'd think that their presence 
> might also indicate during a tagging project that the numbering has or 
> has not yet been addressed.
>
"addressed" in what sense?

> 1.3.1.1.2 - "The xml:lang attribute indicates the language, writing 
> system, and character set associated with a given element and all its 
> contents." Shouldn't this read something like "language, script, and 
> regional or other variant associated with..."?  (Same with the 
> definition of data.language).
>
I am not sure why it refers to character set, but I don't think "script
and regional or other variant" gains in precision on "writing system".
What do others think?
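
For reference in that discussion, a simplified sketch of how a language
tag breaks down into language, script and region subtags. It only looks
at the shape of each subtag and ignores variants, extensions and
private-use subtags, so it is an illustration rather than a full parser;
the function name split_language_tag is made up.

```python
def split_language_tag(tag):
    """Split a tag such as 'sr-Latn-RS' into language, script and region.
    Simplified: subtags are recognised by shape only, others are ignored."""
    parts = tag.split("-")
    result = {"language": parts[0].lower(), "script": None, "region": None}
    for part in parts[1:]:
        is_script = len(part) == 4 and part.isalpha()
        is_region = ((len(part) == 2 and part.isalpha())
                     or (len(part) == 3 and part.isdigit()))
        if is_script and result["script"] is None and result["region"] is None:
            result["script"] = part.title()
        elif is_region and result["region"] is None:
            result["region"] = part.upper()
    return result

print(split_language_tag("sr-Latn-RS"))
# {'language': 'sr', 'script': 'Latn', 'region': 'RS'}
print(split_language_tag("en-GB"))
# {'language': 'en', 'script': None, 'region': 'GB'}
```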

> 1.3.1.1.3 - "Although the contents of the rend attribute are free 
> text, in any given project, encoders are advised *to settle on *a 
> standard vocabulary with which to describe typographic or manuscript 
> rendition of the text." Might I suggest changing this to "adopt" 
> (since otherwise, it might sound a little more like the project 
> encoders should come up with their own internal standard).
>
But that's precisely what they *do* do! However, I am happy to "settle 
on" your proposed rewording.

> 1.3.2 - Do divPart, etc. have superclasses all the way to the top? 
> Aren't all classes eventually orphaned? Why is addrPart unique in such 
> a regard--just because other classes often have at least one parent or 
> child class?

addrPart is unusual (not unique) because it is used only within one 
element -- whereas most classes are subclasses of other *classes*. The 
hierarchy isn't complete -- not every class goes "all the way to the top".

>
> 1.4.1 - macro.schemaPattern might be defined to begin "is a pattern to 
> match elements" rather than as "A pattern to match elements..." since 
> the other macros (at least in this context) all include the title 
> within a sentence, rather than starting a new sentence.

That's a flagrant breach of house style: thanks for spotting it!

>
> 1.4.2 - The documentation states "TEI-defined datatypes may be grouped 
> into those which define normalised values for numeric quantities or 
> probabilities, those which define various kinds of short-hand codes or 
> keys, and those which define pointers or links" What about dates, 
> etc., as mentioned in detail shortly afterward?

OK, added reference to temporal expressions.
>
> 1.4.2 - Maybe add "ISO" in front of "international standard" for the 
> definition for data.temporal.iso (I know it's in the name itself, but
> it might help clarify...).
>
Doesn't "international standard" *mean* ISO?

> 1.4.2 - I made a change here to give an example which started with an 
> underscore
>
> 1.4.2. - Re: data.enumerated, "This list may be open (in which case 
> the list is advisory..." to the following?:
>                 "This list may be open (in which case the list is 
> advisory, following "TEI Recommended Practice"--see section 23.3 
> Conformance)
Have added wording similar to this.
>
> 1.4.2 - Mention that data.code is of the URI type?
>
We use the term data.pointer.
> 1.4.2. - "An attribute may, of course, take more than one value of a 
> given type, for example a list of pointer values, or a list of words. 
> In the TEI scheme, this information is regarded as a property of the 
> datatype element used to document the attribute in question rather 
> than as a distinct datatype" But DTD's, etc. may distinguish in some 
> cases (e.g., NMTOKEN vs. NMTOKENS). Does such a difference in TEI 
> still allow translation of the difference into the DTDs/schemas? (I 
> presume it does, just wanted to check.)

Yes, this is implemented by the ODD processor that generates the schema.
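
As an illustration of the single-value versus list distinction (the
NMTOKEN versus NMTOKENS case), a minimal sketch of how a consumer might
read such an attribute. This is not how the ODD processor itself works;
the function parse_attribute_value and its allow_list flag are
assumptions made for the example.

```python
def parse_attribute_value(raw, allow_list):
    """Split a raw attribute value on whitespace.
    With allow_list=True return a list of tokens; otherwise require
    exactly one token and return it as a string."""
    tokens = raw.split()
    if not tokens:
        raise ValueError("empty attribute value")
    if not allow_list and len(tokens) > 1:
        raise ValueError("expected a single token, got %d" % len(tokens))
    return tokens if allow_list else tokens[0]

print(parse_attribute_value("#a #b #c", allow_list=True))
# ['#a', '#b', '#c']
print(parse_attribute_value("word", allow_list=False))
# word
```

The datatype of each token is the same in both cases; only the permitted
number of occurrences differs.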

>
> 1.5 (very end of chapter) - Do the RELAX NG null values need also to 
> be in a particular order? Otherwise, it seems this information doesn't 
> belong here according to the context (that the classes are arranged in 
> order), unless there is a stronger transition.
>
Yes, I believe they do need to be declared first.

> Unrelated issue:
> http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-data.outputMeasurement.html 
> spells XSLFO without a hyphen
>

Not any more it doesn't.

> take care,
> Brett
I'll try!

Lou