[tei-council] NH final but typos
Syd Bauman
Syd_Bauman at Brown.edu
Sun Oct 28 16:45:53 EST 2007
> The great thing about NH is that it can be totally rewritten for a
> putative 5.1 release with nary a bleep from the Birnbaum Doctrine.
Indeed, that is true of many, many minor corrections we make. But
the better P5 is at 1.0, the better we look and the more useful TEI
is to encoders!
* I don't think the names of the various views should be capitalized.
They're not really proper nouns, but more importantly it looks
awful.
* Change "his/her" to "his or her".
* Change "only lines and line groups" to "only metrical lines"
* #NHME, 2nd <egXML>:
Maybe I don't get this poetry stuff, but if so, lots of other
readers won't, either. The punctuation in the extract seems (to me)
to give pretty strong evidence that this passage is 1 sentence with
a bunch of clauses. It looks mis-encoded.
* #NHME, 4th <egXML>:
What are the intervening <p> elements for? Why not just
<egXML xmlns="http://www.tei-c.org/ns/Examples" corresp="#NH-eg-02">
<p>
<s>Catholic woman of twenty-seven with five children And a body
first-rate -- pointed her finger at the back of one certain man and
asked me, "Is that guy a psychiatrist?" and by god he was!</s>
<s>"Yes," She said, "He <emph>looks</emph> like a psychiatrist."</s>
<s>Grown quiet, I looked at his pink back, and thought.</s>
</p>
</egXML>
* "... involves marking the starting and (usually) end points of the
non-nesting ...": either "start" and "end" or "starting" and
"ending". But more importantly, the "(usually)" should be deleted.
We do not, and should not, recommend any methods of encoding that
make use of implicit, rather than marked, structures.
* "There are several variations on this method of encoding:</p>": We
should not end a paragraph with a colon.
* Next para implies that milestones are segment boundary delimiters,
which they are not. (Lou has argued a generic <milestone> could be
licensed for such use; I think such a license would inappropriately
lump two different things into the same bucket -- but in either
case, it's not how <milestone> is currently licensed to be used.)
I think the solution is to move the two paras "For some common
structural ... would break down entirely." up into a new section
that precedes #NHBM, call it "Transition Marking with Empty
Elements" or some such. However, I'm not sure the "don't use <lb>
as if it were <l>" warning is really necessary. (I don't think I
object, either, it just seems a little out of place.)
Then remove "also" from "The segment boundaries also may be
delimited by", which is now the start of #NHBM.
* Not that I expect anything to be done about it, but let me say
again that I think this use of <anchor> is an abuse that we may
live to regret. Segment boundary delimiters do *not* mark points in
the text -- they mark ranges.
* That said, if we are going to have this abusive use of <anchor>, we
should demonstrate some useful practices for the value of subtype=:
- use the name of the element that would otherwise have been used,
not its gloss (so in this case, "s", not "sentence")
- use something other than camel case to separate the "start" and
"end" from the element name, lest we run afoul of someday wanting
to have a <quoteStart> element.
Thus I would suggest "s-start" and "s-end".
* Given that we have an example demonstrating <anchor>, the next
example that demonstrates custom elements either has to encode more
information or should not exist. I.e., if we do not demonstrate
what is to be gained by custom elements, we shouldn't bother
discussing them here.
* Para starting "Finally, elements that are normally used ..." up to
the <egXML>:
- <l>, generally speaking, cannot easily be encoded this way (the
content model of <lg> has to be modified)
- "they serve as empty boundary delimiters when": insert "segment"
before "boundary"
- attributes are not added to content models in the vernacular; how
about "are defined"?
- I think the parenthetical should be a footnote.
* "... to an existing TEI milestone or anchor tag automatically
without ...": besides the mismatch in number, it is *not* a
requirement that they be transformed into <milestone> or <anchor>
elements! How about:
... and if the modified elements and attributes can be mapped
directly to existing TEI markup structures automatically without
loss of information
* "... interpretations of the <choice> <expan>Noun Phrase</expan>
<abbr>NP</abbr> </choice> <q>fast trains and planes</q>."
"Noun Phrase" should not be capitalized; the <q> should be a
<quote>. Unless we can fix the stylesheets ASAP, we'll have to drop
the use of <choice>.
* "... of the phrase <mentioned>Fast trains and planes ...": "Fast"
should be lower case.
* <egXML> following fig 5:
- my gut instinct is that the subtype= values should be "phr-start"
and "phr-end"
- I really dislike overloading corresp=, but I suppose I'm alone on
this Island of Purity
* "Despite their advantages, segment-boundary delimiters incur the"
^
space
* "This means, amongst other things, that the reconstituted document
may not itself be valid." I'm not sure I know what this means
in this context, and it sounds like a non sequitur. I'm inclined to
delete this sentence. Furthermore, the footnote (#79 currently)
about rule-based languages should be on the previous sentence,
anyway.
* #NHVE, first 2 <egXML>s: I am uncomfortable with implying that it
is OK or good to be reconstituting partial elements by co-indexing
n=. We have an attribute for the simple case, part=; we have
attributes for the complicated case: next= and prev= (shown in next
xmp); we have at least 2 out-of-line methods (<join> and stand-off
w/ <xi:include>). Why are we introducing another, demonstrably
problematic method? I realize we discuss the problems, but this
abuse of n= rubs me the wrong way, anyhow.
* "... example, is a Prepositional Phrase, not a sentence ...":
"prepositional phrase" should be lower case.
* "An example comes if we attempt to combine a detailed Grammatical
view of the Pinsky example with its metrical encoding" reword:
This problem can be demonstrated by using the <att>part</att>
attribute when encoding both the grammatical and metrical
views in the Pinsky example, as follows.
* "to provide targets for <gi>include</gi>": use <gi>xi:include</gi>.
* #NHSO, 1st <egXML>: there is no indication, except the footnote,
that the <include> element is from the XInclude specification.
Either make it explicit in the example (as with the
"http://www.example.org/cannot/really/use/XInclude" namespace), or
use an <eg> and the correct encoding.
* "In as much as it uses elements not included in the TEI namespace,
stand-off markup involves an extension of the TEI." This is false
on 2 grounds: using an element outside the TEI namespace does *not*
make something an extension. (More later, maybe). But more
importantly, this method does not use any elements outside the TEI
namespace. (Except <xi:include>, and remember, validation is
performed *after* inclusion processing, so these stand-off
documents are still valid against tei_all.)
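To make the last two points concrete, here is a rough sketch (not taken
from the chapter) of a stand-off fragment in which the XInclude
namespace is declared explicitly. The file name "pinsky.xml", the
xml:id values "frag-01" and "frag-02", and the assumption that the
targets are <seg> elements in that file are all hypothetical.

```xml
<!-- Sketch only: "pinsky.xml", "frag-01" and "frag-02" are invented
     for illustration and do not come from the chapter. -->
<p xmlns="http://www.tei-c.org/ns/1.0"
   xmlns:xi="http://www.w3.org/2001/XInclude">
  <!-- The xi: prefix is bound to the XInclude namespace, so
       <xi:include> is visibly not a TEI element. -->
  <s><xi:include href="pinsky.xml" xpointer="frag-01"/></s>
  <!-- Each <xi:include> is replaced by its target during inclusion
       processing, which happens before validation. -->
  <s><xi:include href="pinsky.xml" xpointer="frag-02"/></s>
</p>
```

Once inclusion has been processed, each <s> contains only whatever
TEI content the pointers resolved to, and it is that result which is
validated.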