[tei-council] Chapter 1
Lou's Laptop
lou.burnard at oucs.ox.ac.uk
Sat Jan 5 15:22:10 EST 2008
Brett Zamir wrote:
> Hi,
>
> This first chapter raised more questions for me than some
> of the others, though, as usual, many are more like stylistic
> suggestions. Some suggestions might occasionally seem redundant, but I
> think good communicative systems have some inherent redundancy,
> especially for dim-wits--I mean slow learners--like me.
>
> *General issues:*
>
> 1) Shouldn't specific chapters be prefixed with "Chapter" capitalized
> (and same with "Section" too)?
>
This could be generated by stylesheet if there was a general feeling it
should be, obviously. But no-one has suggested it till now. Some of the
titles are quite long...
> 2) From my limited perspective, customization, unless absolutely
> necessary, seems to be something to be avoided rather than encouraged.
This is a fundamental point about the TEI. You cannot use the TEI scheme
without "customizing" in some sense -- you have to select which modules
you want.
> HTML had been fractured between only a couple of browsers and it was
> frustrating enough. While there may not be TEI processors besides Roma
> yet, I imagine that tools will eventually develop to allow browsing
> and searching of such documents in specifically focused, user-friendly
> ways, given the great potential of such semantically rich documents as
> TEI and the increasing popularity of XQuery, etc. I think it is great
> that TEI is modular, but, in my humble estimation, I think the
> documentation ought to (emphatically) highlight the disadvantages of
> customizing, for those who might ever wish to share their documents
> beyond their own internal use.
>
You'll have to tell us what you consider those disadvantages to be, I'm
afraid!
> *The TEI Infrastructure - Specific Issues*
>
> 1.2 - Defining a TEI Schema
>
> 1) "The method...recommended by these Guidelines is to provide
> explicitly or by reference a TEI schema specification against which
> the document may be validated." What is the means recommended here in
> saying "by reference"? Just by referring to a URL where the
> specification is kept? Is there a standard element for doing this
> (when not actually including the schema documentation elements)?
Maybe the phrase "by reference" is a bit misleading. All it means is
that there has to be some way of associating a document instance with
its schema. You can do that in a number of different ways, of course,
(as discussed elsewhere) but one way is actually to provide the schema
and the instance together in the same document -- or to provide some
kind of reference for the schema. There is no standard element or
attribute for doing that in TEI (unless you are using XSD, of course).
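
For the XSD case mentioned above, the association is conventionally made
with the xsi:schemaLocation attribute, whose value is a whitespace-separated
list of namespace/location pairs. A minimal sketch in Python of reading
those pairs from an instance; the schema URL and the helper name
schema_locations are invented for illustration, not taken from the
Guidelines:

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical instance: the schema URL is invented for this example.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://www.tei-c.org/ns/1.0 http://example.org/my-tei.xsd">
  <text><body><p>Hello</p></body></text>
</TEI>"""


def schema_locations(xml_text: str) -> dict:
    """Map each namespace URI to the schema location paired with it on the root."""
    root = ET.fromstring(xml_text)
    tokens = root.get(f"{{{XSI_NS}}}schemaLocation", "").split()
    # The attribute value is a flat list: namespace, location, namespace, location...
    return dict(zip(tokens[0::2], tokens[1::2]))


print(schema_locations(SAMPLE))
```

A document with no such attribute simply yields an empty mapping, in which
case the schema has to be associated by some other means.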
>
> 2) "*A TEI-conformant schema* is a specific combination of TEI
> modules, *possibly also* including additional declarations that modify
> the element and attribute declarations contained by each module, for
> example *to suppress or rename some elements*." How can it still be a
> TEI-conformant schema if its elements are being renamed? Does this
> just mean that the process of renaming, etc. is conformantly-documented?
>
You need to read the chapter on conformance for the full picture but,
for example, a document in which all the element names are translated
into German or Chinese according to the mappings provided by the TEI is
still a TEI conformant document.
> 3) Might I suggest referring to a resources page which includes
> reference to Roma for the discussion which mentions that "the
> specification may be processed to generate a formal schema..."? If I'm
> reading this and looking for help, I have no idea how this process
> would be accomplished or where to look.
Well, there is already a reference at the end of the next paragraph but
one, but there's no reason not to add another here; so I have.
>
> 1.3.1 Attribute Classes
>
> Might I suggest this section beginning with an example of a class with
> at least two attributes? You do give an example of a class with only
> one attribute later, and I think it is more salient to grasp the
> rationale for classes if the example already has more than one attribute.
>
Good suggestion. As it happens, att.naming adds more than one attribute
so I've added reference to the other one, along with a further
"rationale" for classes.
> 1.3.1.1 For the definition of xml:lang, it can not only indicate the
> language of the element content, but also potentially of a text
> attribute, no?
>
Only "potentially" because we have gone to some lengths to abolish
"text" attributes.
> 1.3.1.1 Might I suggest adding a definition of xml:space here too (and
> a section for it?) or even the schema attachment attributes like
> xsi:schemaLocation? (since the discussion is for global attributes)
>
There has been some controversy about that without the Council, and the
current feeling seems to be that xml:space is such a bad thing we'd
rather not talk about it at all!
> 1.3.1.1.1 In reference to @n, it is said "Its value may be any string
> of characters". Should this be stated as being limited to
> non-whitespace characters? I see the definition in the schema as @n
> being of type "data.word", but I'm not familiar with the regular
> expression components which define it ((\p{L}|\p{N}|\p{P}|\p{S})+).
>
This has also been the subject of some controversy. The regular
expression syntax is standard enough, but there is some debate as to
whether we should relax it a bit more.
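
For readers unfamiliar with the expression quoted above: \p{L}, \p{N},
\p{P} and \p{S} are Unicode category escapes for letters, numbers,
punctuation and symbols, so the pattern accepts one or more such
characters and rejects whitespace. Python's built-in re module does not
support \p{...}, so the sketch below emulates the check with unicodedata.
The function name is_data_word is an invention for illustration, and the
sketch assumes the pattern must match the whole value, as XSD patterns do:

```python
import unicodedata

# Major Unicode categories allowed by the pattern:
# Letter, Number, Punctuation, Symbol.
ALLOWED_MAJOR_CATEGORIES = {"L", "N", "P", "S"}


def is_data_word(value: str) -> bool:
    """Return True if value is non-empty and every character is in L, N, P or S."""
    return bool(value) and all(
        unicodedata.category(ch)[0] in ALLOWED_MAJOR_CATEGORIES for ch in value
    )


print(is_data_word("One"))        # letters only
print(is_data_word("forty two"))  # contains a space (category Zs)
```

Under this reading "One", "40" and "sixty-three" are all acceptable
values, while anything containing a space or tab is not.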
> 1.3.1.1.1. While I know that @n can be something like "One", it would
> seem to me that this kind of usage ought to be discouraged, as it
> could make queries much more difficult (e.g., it should be much easier
> to process a query for a range from 40-63 than trying to figure out
> "forty" to "sixty-three") (Some preparers of these documents must have
> no real idea of how useful tagging is for searching, so they don't
> give consideration to these things when they make such decisions.)
>
This is a usage note, I think. For some applications it's considered
more important to preserve exactly the form of the identifier supplied
in the original than it is to facilitate its use as a navigation tool --
there are plenty of other ways of doing the latter, after all.
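
To make the trade-off concrete, here is a small Python sketch of the kind
of range query Brett describes. The XML fragment, the use of <l> elements
and the function name lines_in_range are all invented for illustration.
Values of n that are not plain decimal numbers are skipped rather than
interpreted, which is exactly the cost of preserving the original form:

```python
import xml.etree.ElementTree as ET

# Invented fragment: three lines numbered with digits, one spelled out.
SAMPLE = """<lg>
  <l n="39">a</l>
  <l n="40">b</l>
  <l n="forty-one">c</l>
  <l n="63">d</l>
</lg>"""


def lines_in_range(xml_text, low, high):
    """Return the n values of <l> elements whose n is an integer in [low, high]."""
    root = ET.fromstring(xml_text)
    hits = []
    for line in root.iter("l"):
        n = line.get("n", "")
        # Spelled-out or otherwise non-numeric values are skipped, not guessed at.
        if n.isdecimal() and low <= int(n) <= high:
            hits.append(n)
    return hits


print(lines_in_range(SAMPLE, 40, 63))  # ['40', '63']
```

The line labelled "forty-one" is silently missed, so a project that keeps
such labels would need some other mechanism for navigation.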
> 1.3.1.1.1 While it is described as being "redundant" to add numbering
> where there are no unusual deviations, I'd think that their presence
> might also indicate during a tagging project that the numbering has or
> has not yet been addressed.
>
"addressed" in what sense?
> 1.3.1.1.2 - "The xml:lang attribute indicates the language, writing
> system, and character set associated with a given element and all its
> contents." Shouldn't this read something like "language, script, and
> regional or other variant associated with..."? (Same with the
> definition of data.language).
>
I am not sure why it refers to character set, but I don't think "script
and regional or other variant" gains in precision on "writing system".
What do others think?
> 1.3.1.1.3 - "Although the contents of the rend attribute are free
> text, in any given project, encoders are advised *to settle on* a
> standard vocabulary with which to describe typographic or manuscript
> rendition of the text." Might I suggest changing this to "adopt"
> (since otherwise, it might sound a little more like the project
> encoders should come up with their own internal standard).
>
But that's precisely what they *do* do! However, I am happy to "settle
on" your proposed rewording.
> 1.3.2 - Do divPart, etc. have superclasses all the way to the top?
> Aren't all classes eventually orphaned? Why is addrPart unique in such
> a regard--just because other classes often have at least one parent or
> child class?
addrPart is unusual (not unique) because it is used only within one
element -- whereas most classes are subclasses of other *classes*. The
hierarchy isn't complete -- not every class goes "all the way to the top".
>
> 1.4.1 - macro.schemaPattern might be defined to begin "is a pattern to
> match elements" rather than as "A pattern to match elements..." since
> the other macros (at least in this context) all include the title
> within a sentence, rather than starting a new sentence.
That's a flagrant breach of house style: thanks for spotting it!
>
> 1.4.2 - The documentation states "TEI-defined datatypes may be grouped
> into those which define normalised values for numeric quantities or
> probabilities, those which define various kinds of short-hand codes or
> keys, and those which define pointers or links" What about dates,
> etc., as mentioned in detail shortly afterward?
OK, added reference to temporal expressions.
>
> 1.4.2 - Maybe add "ISO" in front of "international standard" for the
> definition for data.temporal.iso (I know it's in the name itself, but
> it might help clarify...).
>
Doesn't "international standard" *mean* ISO?
> 1.4.2 - I made a change here to give an example which started with an
> underscore
>
> 1.4.2. - Re: data.enumerated, "This list may be open (in which case
> the list is advisory..." to the following?:
> "This list may be open (in which case the list is
> advisory, following "TEI Recommended Practice"--see section 23.3
> Conformance)
Have added wording similar to this.
>
> 1.4.2 - Mention that data.code is of the URI type?
>
We use the term data.pointer.
> 1.4.2. - "An attribute may, of course, take more than one value of a
> given type, for example a list of pointer values, or a list of words.
> In the TEI scheme, this information is regarded as a property of the
> datatype element used to document the attribute in question rather
> than as a distinct datatype" But DTDs, etc. may distinguish in some
> cases (e.g., NMTOKEN vs. NMTOKENS). Does such a difference in TEI
> still allow translation of the difference into the DTDs/schemas? (I
> presume it does, just wanted to check.)
Yes, this is implemented by the ODD processor that generates the schema.
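
As an illustration of the idea only (this is not the ODD processor's
code), a list-valued attribute can be treated as a single-value check
plus a constraint on how many values may occur. In the Python sketch
below the function name check_value_list and the parameters min_occurs
and max_occurs are hypothetical:

```python
def check_value_list(raw, is_valid, min_occurs=1, max_occurs=None):
    """Validate a whitespace-separated attribute value.

    is_valid checks one token; min_occurs and max_occurs bound the
    number of tokens (max_occurs=None means no upper limit).
    """
    tokens = raw.split()
    if len(tokens) < min_occurs:
        return False
    if max_occurs is not None and len(tokens) > max_occurs:
        return False
    return all(is_valid(token) for token in tokens)


def looks_like_local_pointer(token):
    """Toy single-value check: a fragment pointer such as '#a1'."""
    return token.startswith("#") and len(token) > 1


print(check_value_list("#a1 #b2", looks_like_local_pointer))                # True
print(check_value_list("#a1 #b2", looks_like_local_pointer, max_occurs=1))  # False
```

The same single-value check thus serves both the one-value and the
many-value case, much as NMTOKEN and NMTOKENS relate in a DTD.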
>
> 1.5 (very end of chapter) - Do the RELAX NG null values need also to
> be in a particular order? Otherwise, it seems this information doesn't
> belong here according to the context (that the classes are arranged in
> order), unless there is a stronger transition.
>
Yes, I believe they do need to be declared first.
> Unrelated issue:
> http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-data.outputMeasurement.html
> spells XSLFO without a hyphen
>
Not any more it doesn't.
> take care,
> Brett
I'll try!
Lou