[tei-council] NH final but typos

Syd Bauman Syd_Bauman at Brown.edu
Sun Oct 28 16:45:53 EST 2007


> The great thing about NH is that it can be totally rewritten for a
> putative 5.1 release with nary a bleep from the Birnbaum Doctrine.

Indeed, that is true of many, many minor corrections we make. But
the better P5 is at 1.0, the better we look and the more useful TEI
is to encoders!


* I don't think the names of the various views should be capitalized.
  They're not really proper nouns, but more importantly it looks
  awful. 

* Change "his/her" to "his or her".

* Change "only lines and line groups" to "only metrical lines"

* #NHME, 2nd <egXML>:
  Maybe I don't get this poetry stuff, but if so, lots of other
  readers won't, either. The punctuation in the extract seems (to me)
  to give pretty strong evidence that this passage is 1 sentence with
  a bunch of clauses. It looks mis-encoded.

* #NHME, 4th <egXML>:
  What are the intervening <p> elements for? Why not just
  <egXML xmlns="http://www.tei-c.org/ns/Examples" corresp="#NH-eg-02">
    <p>
      <s>Catholic woman of twenty-seven with five children And a body
        first-rate -- pointed her finger at the back of one certain man and
        asked me, "Is that guy a psychiatrist?" and by god he was!</s>
      <s>"Yes," She said, "He <emph>looks</emph> like a psychiatrist."</s>
      <s>Grown quiet, I looked at his pink back, and thought.</s>
    </p>
  </egXML>

* "... involves marking the starting and (usually) end points of the
  non-nesting ...": either "start" and "end" or "starting" and
  "ending". But more importantly, the "(usually)" should be deleted.
  We do not, and should not, recommend any methods of encoding that
  make use of implicit, rather than marked, structures.

* "There are several variations on this method of encoding:</p>": We
  should not end a paragraph with a colon.

* Next para implies that milestones are segment boundary delimiters,
  which they are not. (Lou has argued a generic <milestone> could be
  licensed for such use; I think such a license would inappropriately
  lump two different things into the same bucket -- but in either
  case, it's not how <milestone> is currently licensed to be used.)

  I think the solution is to move the two paras "For some common
  structural ... would break down entirely." up into a new section
  that precedes #NHBM, call it "Transition Marking with Empty
  Elements" or some such. However, I'm not sure the "don't use <lb>
  as if it were <l>" warning is really necessary. (I don't think I
  object, either; it just seems a little out of place.)

  Then remove "also" from "The segment boundaries also may be
  delimited by", which is now the start of #NHBM.

* Not that I expect anything to be done about it, but let me say
  again that I think this use of <anchor> is an abuse that we may
  live to regret. Segment boundary delimiters do *not* mark points in
  the text -- they mark ranges.

* That said, if we are going to have this abusive use of <anchor>, we
  should demonstrate some useful practices for the value of subtype=: 
  - use the name of the element that would otherwise have been used,
    not its gloss (so in this case, "s", not "sentence")
  - use something other than camel case to separate the "start" and
    "end" from the element name, lest we run afoul of someday wanting
    to have a <quoteStart> element.
  Thus I would suggest "s-start" and "s-end".
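
  As a rough sketch of that naming, the delimiters around one
  sentence that crosses a line boundary might look as follows. The
  type= value "delimiter" and the elided line contents are
  assumptions made for illustration, not taken from the chapter.

  ```xml
  <lg>
    <!-- type="delimiter" is an assumed value; line contents are elided -->
    <l><anchor type="delimiter" subtype="s-start"/>... start of the sentence ...</l>
    <l>... end of the sentence ...<anchor type="delimiter" subtype="s-end"/></l>
  </lg>
  ```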

* Given that we have an example demonstrating <anchor>, the next
  example that demonstrates custom elements either has to encode more
  information or should not exist. I.e., if we do not demonstrate
  what is to be gained by custom elements, we shouldn't bother
  discussing them here.

* Para starting "Finally, elements that are normally used ..." up to
  the <egXML>:
  - <l>, generally speaking, cannot easily be encoded this way (the
    content model of <lg> has to be modified)
  - "they serve as empty boundary delimiters when": insert "segment"
    before "boundary"
  - attributes are not added to content models in the vernacular; how
    about "are defined"?
  - I think the parenthetical should be a footnote.

* "... to an existing TEI milestone or anchor tag automatically
  without ...": besides the mismatch in number, it is *not* a
  requirement that they be transformed into <milestone> or <anchor>
  elements! How about:
     ... and if the modified elements and attributes can be mapped
     directly to existing TEI markup structures automatically without
     loss of information
  
* "... interpretations of the <choice> <expan>Noun Phrase</expan>
  <abbr>NP</abbr> </choice> <q>fast trains and planes</q>."

  "Noun Phrase" should not be capitalized; the <q> should be a
  <quote>. Unless we can fix the stylesheets ASAP, we'll have to drop
  the use of <choice>.

* "... of the phrase <mentioned>Fast trains and planes ..."
                                ^
                                lower case

* <egXML> following fig 5:
  - my gut instinct is that the subtype= values should be "phr-start"
    and "phr-end"
  - I really dislike overloading corresp=, but I suppose I'm alone on
    this Island of Purity

* "Despite their advantages, segment-boundary delimiters incur the"
                                    ^
                                    space

* "This means, amongst other things, that the reconstituted document
  may not itself be valid." I'm not sure I know what this means
  in this context, and it sounds like a non sequitur. I'm inclined to
  delete this sentence. Furthermore, the footnote (#79 currently)
  about rule-based languages should be on the previous sentence,
  anyway. 

* #NHVE, first 2 <egXML>s: I am uncomfortable with implying that it
  is OK or good to be reconstituting partial elements by co-indexing
  n=. We have an attribute for the simple case, part=; we have
  attributes for the complicated case: next= and prev= (shown in next
  xmp); we have at least 2 out-of-line methods (<join> and stand-off
  w/ <xi:include>). Why are we introducing another, demonstrably
  problematic method? I realize we discuss the problems, but this
  abuse of n= rubs me the wrong way, anyhow.
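
  For comparison, a minimal sketch of the two in-line methods named
  above, applied to a sentence split across two metrical lines. The
  xml:id values are hypothetical and the text is elided; nothing
  here is copied from the chapter's own examples.

  ```xml
  <div>
    <!-- simple case: part= (I = initial fragment, F = final fragment) -->
    <lg>
      <l><s part="I">... start of a sentence ...</s></l>
      <l><s part="F">... end of the same sentence ...</s></l>
    </lg>
    <!-- complicated case: next= and prev= chain the fragments explicitly -->
    <lg>
      <l><s xml:id="s1a" next="#s1b">... start of a sentence ...</s></l>
      <l><s xml:id="s1b" prev="#s1a">... end of the same sentence ...</s></l>
    </lg>
  </div>
  ```

  In both cases the relationship between the fragments is carried by
  attributes defined for that purpose, rather than by co-indexed n=
  values.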

* "... example, is a Prepositional Phrase, not a sentence ..."
                     ^ lower case  ^

* "An example comes if we attempt to combine a detailed Grammatical
  view of the Pinsky example with its metrical encoding" reword:

       This problem can be demonstrated by using the <att>part</att>
       attribute when encoding both the grammatical and metrical
       views in the Pinsky example, as follows.

* "to provide targets for <gi>include</gi>": use <gi>xi:include</gi>.

* #NHSO, 1st <egXML>: there is no indication, except the footnote,
  that the <include> element is from the XInclude specification.
  Either make it explicit in the example (as with the
  "http://www.example.org/cannot/really/use/XInclude" namespace), or
  use an <eg> and the correct encoding.
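
  A sketch of what the explicit encoding could look like, with the
  XInclude namespace declared so that the origin of the element is
  visible in the example itself. The file name "poem.xml" and the
  pointer values "w01" and "w02" are hypothetical placeholders.

  ```xml
  <text xmlns="http://www.tei-c.org/ns/1.0"
        xmlns:xi="http://www.w3.org/2001/XInclude">
    <body>
      <div type="grammatical">
        <p>
          <s>
            <!-- href and xpointer values are hypothetical placeholders -->
            <xi:include href="poem.xml" xpointer="w01"/>
            <xi:include href="poem.xml" xpointer="w02"/>
          </s>
        </p>
      </div>
    </body>
  </text>
  ```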

* "In as much as it uses elements not included in the TEI namespace,
  stand-off markup involves an extension of the TEI." This is false
  on 2 grounds: using an element outside the TEI namespace does *not*
  make something an extension. (More later, maybe). But more
  importantly, this method does not use any elements outside the TEI
  namespace. (Except <xi:include>, and remember, validation is
  performed *after* inclusion processing, so these stand-off
  documents are still valid against tei_all.)




