[tei-council] Chapter 17 - Simple Analytic Mechanisms
Lou's Laptop
lou.burnard at oucs.ox.ac.uk
Thu Jan 31 14:20:37 EST 2008
Brett Zamir wrote:
> Have you considered giving the chapters a reordering? I'd think (as
> Lou indicated as well) that chapters 22 and 23 really might be put
> together in a block with the first core chapters to emphasize their
> general and basic nature and maybe some other changes.
>
> Here's my own suggestion:
>
> *Dealing with the basic infrastructure:*
>
> #1 The TEI Infrastructure
> #2 The TEI Header
> #3 Elements Available in All TEI Documents
> #4 Default Text Structure
> #15 Corpora and Language Corpora /<Might go under genres, but I think
> it deals enough with a fundamental infrastructure issue to merit going
> here>/
> #5 Representation of Non-standard Characters and Glyphs
> #13 Names, Dates, People, and Places
> #22 Documentation Elements
> #23 Using the TEI
>
> *Special Features:*
>
> #16 Linking, Segmentation
> #14 Tables, Formulae, and Graphics
> #19 Graphs, Networks, and Trees
> #20 Non-hierarchical Structures
>
> *Meta-information:*
>
> #21 Certainty and Responsibility
> #17 Simple Analytic Mechanisms
> #18 Feature Structures
>
> *Special Genres:*
>
> #6 Verse
> #7 Performance Texts
> #8 Transcriptions of Speech
> #9 Dictionaries
>
> *Ancient Texts Genre:* (though these might perhaps also go under
> Meta-information)
>
> #10 Manuscript Description
> #11 Representation of Primary Sources
> #12 Critical Apparatus
The current order is what we arrived at after quite a bit of debate.
Two factors prevailed in establishing it: we didn't want to introduce a
two-level hierarchy into the body of the text (four levels of
subdivision is already too many), and we didn't really expect many
readers to begin at the beginning, go on to the end, and then stop.
It's a reference manual, for dipping into. That said, there is a clear
progression from general to increasingly specific subject matter from
beginning to end, with the USE chapter as a coda. However, your
proposed reordering is certainly feasible. May I suggest that you post
it as a feature request on SourceForge for further discussion?
>
>
> *17.1 Linguistic Segment Categories*
>
> 1) Out of curiosity, anyone actually go down to the phonemic
> representation level in TEI? If so, why no tag?
>
I'm not aware of anyone having done this. Such a segmentation would of
course interfere with other levels of linguistic analysis (there is some
reference to this problem in the chapter on transcribed speech if I
remember rightly).
> 2) When the docs state, "the <gi>c</gi> element can contain only plain
> text, and will often contain only a single character", is this
> because a combining diacritic and its base form might be allowable
> together (as it presumably should be, especially since the guidelines
> recommend using these over precombined forms)? If so, might the
> reference stating "Should only contain a single character or an entity
> that represents a single character" be emended to refer to such
> combination characters as well?
>
I've made the descriptions consistent.
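
The ambiguity in "a single character" shows up at the code point level.
A minimal sketch in Python, using only the standard unicodedata module;
the choice of "é" as the sample is an assumption made for illustration:

```python
import unicodedata

precomposed = "\u00e9"  # é as one precomposed code point
decomposed = unicodedata.normalize("NFD", precomposed)  # e + combining acute

# One perceived character, but one or two code points depending on the form.
print(len(precomposed), len(decomposed))  # 1 2

# A non-zero combining class marks the second code point as a diacritic.
combining_classes = [unicodedata.combining(ch) for ch in decomposed]
print(combining_classes)  # [0, 230]
```

Whether a base letter plus its diacritic counts as "one character" inside
<c> is exactly what the wording of the description has to settle.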
> 3) One example here has clauses of type "finite-declarative" and
> "declarative-finite". Any problem with that?
>
Err, not as far as I know. They mean different things.
> 4) Might the line,
>
> "The lemma attribute may be used to specify the lemma, that is the
> head- or base- form of an inflected verb or noun, for example"
>
> be changed to:
>
> "The lemma attribute may be used to specify the lemma, that is the
> head- or base- form of an inflected form (or of a non-standard form).
> For example,"
>
> In our texts, we plan to use <w lemma> to indicate what the standard
> form of a non-standard transliteration is...
>
I am not sure what you mean by "non-standard" here, but lemmatization is
definitely not the same thing as regularization. Using the @lemma
attribute to regularise nonstandard orthography sounds like
attribute-abuse to me... you should be using the <reg> element for this
purpose.
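
To make the distinction concrete, here is a minimal sketch in Python
(standard library only). The sample words, the variable names, and the
use of <choice> to pair <orig> with <reg> are illustrative assumptions,
not examples taken from the Guidelines:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: @lemma records the headword of an inflected form;
# <reg> records a regularized spelling of what was actually written.
fragment = """
<s>
  <w lemma="go">went</w>
  <choice><orig>towne</orig><reg>town</reg></choice>
</s>
"""

s = ET.fromstring(fragment)

# Lemmatization: map each tagged word form to its headword.
lemmas = {w.text: w.get("lemma") for w in s.iter("w")}

# Regularization: map each original spelling to its regularized form.
regularized = {c.findtext("orig"): c.findtext("reg") for c in s.iter("choice")}

print(lemmas)
print(regularized)
```

The two mappings answer different questions, which is why they are kept
in different places in the markup.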
> *17.3 Spans and Interpretations*
>
> An example states "other spans identified by DTL here". Who is DTL?
> Should there be a @resp on the <spanGrp>?
>
D. Terence Langendoen, who originally drafted much of this chapter. And
yes, there should!
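
A minimal sketch of what such a @resp could look like and how it would
be resolved, in Python. The xml:id value "DTL", the collapsed document
structure, the span content, and the <resp> wording are all hypothetical;
in a real document the <respStmt> would sit in the header:

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical, heavily simplified document: not valid TEI as it stands.
fragment = """
<TEI>
  <respStmt xml:id="DTL">
    <resp>analysis</resp>
    <name>D. Terence Langendoen</name>
  </respStmt>
  <spanGrp resp="#DTL" type="example">
    <span from="#s1" to="#s2">a hypothetical span</span>
  </spanGrp>
</TEI>
"""

root = ET.fromstring(fragment)

# Index every element that carries an xml:id.
by_id = {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}

# Follow @resp on the <spanGrp> back to the person responsible.
group = root.find("spanGrp")
responsible = by_id[group.get("resp").lstrip("#")]
print(responsible.findtext("name"))
```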
> *17.4 Linguistic Annotation*
>
> 1) Is whitespace inevitable between <w ana="#NN1">victim</w> and <w
> ana="#POS">'s</w> as there was whitespace in the CLAWS output? If so,
> do you want to add mention of the shortcoming that this adds?
This is quite a headache (and is commented on elsewhere, I think);
mentioning it again here would distract from the main point of
discussion, which is the analysis codes themselves.
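
For readers of this thread, the effect of that whitespace can be seen in
a minimal Python sketch. The enclosing <s> and the two variable names
are assumptions; only the two <w> elements come from the example under
discussion:

```python
import xml.etree.ElementTree as ET

# The same two tokens, with and without whitespace between the elements.
spaced = ET.fromstring("""<s><w ana="#NN1">victim</w> <w ana="#POS">'s</w></s>""")
unspaced = ET.fromstring("""<s><w ana="#NN1">victim</w><w ana="#POS">'s</w></s>""")

# itertext() yields element text and the text between elements, in order.
spaced_text = "".join(spaced.itertext())
unspaced_text = "".join(unspaced.itertext())

print(repr(spaced_text))    # "victim 's"
print(repr(unspaced_text))  # "victim's"
```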
>
> 2) Why does the line, "However, analysis into phrase and clause
> elements can be superimposed on the word and morpheme tagging in the
> preceding illustration." begin with "However"? Might this be clarified?
>
Stylistic tic. I have deleted it.
> 3) I changed the line "*These mechanisms all depend to a greater or
> lesser degree *on the ability to associate a unique identifier with
> any element in a TEI-conformant text, and then to specify that
> identifier as the target of a pointing element of some kind." to
> "*Many of these mechanisms will depend *on the ability to associate a
> unique identifier with any element in a TEI-conformant text, and then
> to specify that identifier as the target of a pointing element of some
> kind." since XPointer doesn't necessitate use of identifiers at all.
Actually, since we are using XPointer, the whole sentence needs revision
to indicate that other kinds of pointer would work too.
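
A minimal sketch of the two styles of addressing, in Python. The
fragment and its xml:id values are hypothetical, and ElementTree's
limited path syntax stands in here for a real XPointer processor:

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

fragment = """
<p>
  <s><w xml:id="w1">the</w> <w xml:id="w2">victim</w></s>
</p>
"""

root = ET.fromstring(fragment)

# Pointing by identifier: requires the target to carry an xml:id.
by_identifier = next(el for el in root.iter() if el.get(XML_ID) == "w2")

# Pointing by position: reaches the same element with no identifier at all.
by_position = root.find("s/w[2]")

print(by_identifier is by_position)  # True
```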
>
> For note 69, why is it required that the whole text be segmented into
> <s> if it is segmented?
Because that's how <s> is defined. It provides end-to-end segmentation
of the whole text. That's what it's for.
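
A minimal sketch of what "end-to-end" means in practice, in Python. The
function name and the paragraph-level check are assumptions made for
illustration, not a rule quoted from the Guidelines:

```python
import xml.etree.ElementTree as ET

def fully_segmented(p):
    """True if every child of p is an <s> and no text lies outside them."""
    loose_text = (p.text or "") + "".join(child.tail or "" for child in p)
    return all(child.tag == "s" for child in p) and not loose_text.strip()

complete = ET.fromstring("<p><s>One sentence.</s> <s>Another one.</s></p>")
partial = ET.fromstring("<p><s>One sentence.</s> Another one.</p>")

print(fully_segmented(complete))  # True
print(fully_segmented(partial))   # False
```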