[tei-council] Revised AB

Syd Bauman Syd_Bauman at Brown.edu
Thu Oct 25 18:20:44 EDT 2007


I read version r3778, and overall it looks very good. Because I
realize not everyone will want to read everything, I've marked those
items I consider more important for Council, or more likely to be of
interest to some, with "@" instead of my usual "*".

* #AB/p[4], last sentence: the ending sequence of 4 prepositional
  phrases makes this sentence clumsy. Julia had the following
  recommendation: "We believe they reflect the widest variety of
  digital textual practices currently in use, though they are by no
  means limited to these."

* #AB/p[5]: Para ends with two "for more information see". Neither
  seems necessary here, especially since this kind of information is
  more thoroughly addressed a few paras later. The first one at least
  fits in with the flow of the rest of the para, but the second one
  feels like a non sequitur.

* #ABSTRUNC/head: Why is this 1 section and not 2? (Just because the
  notational conventions would be so short? If so, makes some sense,
  esp. as I'm recommending below it be shorter!)

@ #ABSTRUNC/p[2]: I still think that if we'd like chapters to be
  arranged in increasing order of specialist interest, then WD
  "Representation of Non-standard Characters and Glyphs" needs to be
  moved much lower, down near NH "Non-hierarchical Structures" and
  "Certainty and Responsibility".

  But as for the rest of this paragraph, it looks like the list of
  chapters described presumes that WD has already been moved?
  Otherwise the counts of 4 general, 8 genre, 9 special, 2 technical
  just don't line up.

@ #ABSTRUNC/p[6] (the one para purely on notational conventions): I
  am not at all sure it is a good idea to discuss the on-line
  formatting at this level of detail in the Guidelines. At this
  level, shouldn't the discussion be presentation-independent? This
  information belongs on a website page outside of (but linked by the
  nav bar of) the Guidelines proper, I suspect.

  If we keep this information, the first sentence's "XML elements or
  TEI classes" should read "TEI elements or classes", because non-TEI
  elements are not links. However, I do not understand why datatypes
  and macros aren't links. Seems to me a different set of stylesheets
  should be able to make them all (or none) links w/o necessitating
  a change to the content of the Guidelines.

  Second sentence ignores the fact that empty elements are displayed
  differently. I am strongly against displaying the name of an
  element differently based on whether or not it is permitted to have
  content, and have asked Chris, Dot, and James to remove the
  slash from names of empty elements, particularly in running prose.

@ #ABTEI2/p[1], "TEI scheme is ... committed to providing a maximum of
  ... flexibility, and extensibility": I am not at all sure what to do
  about this; perhaps nothing. But I think it is very problematic to
  claim that we are committed to providing maximum flexibility and
  extensibility, but then to define "conformance" in a way that at
  least erects barriers to, if not outright curtails, the flexibility
  and extensibility of which the system is capable.

* #ABTEI2/p[1]/list[1]/item[2], "provide guidance for encoding of
  texts in this format":
  Should be either
  "provide guidance for the encoding of texts in this format"
  or
  "provide guidance for encoding texts in this format".

* #ABTEI2/p[1]/list[2]/item[4], "... the same text features":
  s/text/textual/

* #ABTEI2/p[2], sentence 1: I'd change from plural to singular, i.e.:
  "The goal of creating a common interchange format which is
  application independent requires ...".

* #ABTEI2/p[2], sentence 2: "... which defines an Extensible Markup
  Language, but their definition is as far as possible independent of
  any particular schema language." First, even if the wording stays,
  "extensible markup language" should not be capitalized. Second, the
  claim itself is arguable: XML does not define a markup language, it
  defines a metalanguage.

* #ABTEI2/p[3], last sentence "... the Guidelines very rarely require
  any particular level of encoding, their correct use requires
  conformance to the meanings attached to the encodings they
  propose.": OK, maybe it's me: maybe I'm not as smart as the rest of
  you, or maybe I'm just overtired. But I've read that 3 times, and
  I'm still not 100% sure I understand it.

* #ABTEI2/p[4], last sentence: c/their/his or her/.

* #ABTEI2/p[5], last clause, "and the two terms <term>text
  creation</term> and <term>text capture</term> are often used
  interchangeably": I think whole clause should be deleted, as I
  don't think it's true. The regular expression "text\s+creation"
  does not occur outside of AB.
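
  For anyone who wants to repeat that kind of check, here is a minimal
  sketch in Python. The chapter texts below are made-up stand-ins, not
  the real Guidelines source, and chapters_matching is a hypothetical
  helper; the only thing taken from the comment above is the regular
  expression itself. The \s+ matters because the phrase may be broken
  across a line in the source.

```python
import re

# The pattern quoted above; \s+ also matches a line break plus indentation.
PATTERN = re.compile(r"text\s+creation")

def chapters_matching(chapters: dict) -> list:
    """Return the (sorted) names of chapters whose text contains the phrase."""
    return sorted(name for name, text in chapters.items() if PATTERN.search(text))

# Hypothetical sample data, for illustration only.
sample = {
    "AB": "the two terms text\n  creation and text capture are often used",
    "CO": "text capture is discussed here, but not the other term",
}

print(chapters_matching(sample))
```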

* #ABTEI2/p[6], last sentence: this is probably important to say, but
  it's not phrased quite right. It says basically that P5 does not
  support the "strip out the tags, and you get the Gutenberg
  plain-text version :-)" theory of markup; P4 did. But that's not
  quite right, either: P4 did not *support* that theory of markup,
  but rather licensed it or permitted it. You can certainly encode
  things in P4 in such a way that it does not hold. But in P5, while
  it is *possible* to create encodings where that theory holds, it is
  not possible in the general case (falls apart as soon as you want
  to encode an error and its correction, e.g.), and licensing that
  theory was explicitly *not* a design goal. Lou, perhaps you can
  take the following as a starting point for a re-write:

    Further it should be noted that the encoding system described by
    these Guidelines no longer licenses the capability to encode
    texts such that simply removing the markup reveals an unmediated
    version of the source text; this capability was permitted by
    previous versions.
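
  To make the error-and-correction case concrete, here is a small
  sketch in Python. The sample sentence is invented for illustration,
  and strip_markup is a hypothetical helper that just concatenates all
  text nodes, which is roughly what "strip out the tags" amounts to.

```python
import xml.etree.ElementTree as ET

# Invented fragment: an apparent error recorded together with its correction.
FRAGMENT = (
    "<p>He was <choice><sic>teh</sic><corr>the</corr></choice> first.</p>"
)

def strip_markup(xml: str) -> str:
    """Remove every tag and keep all character data, in document order."""
    return "".join(ET.fromstring(xml).itertext())

print(strip_markup(FRAGMENT))  # He was tehthe first.
```

  The result contains both the error and the correction run together,
  so it matches neither the source reading nor the corrected one.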

* #ABTEI2/p[8], "... a variety of text features, but sometimes ...":
  I'd change "text" to "textual".

@ #ABTEI2/p[9], 2nd sentence ("Because no predefined ..."): I am
  uncomfortable with the implication that the term "customization"
  only refers to reducing the scope -- we have always (including in
  MD, I believe) used it to mean both reductions and modifications by
  adding new stuff.

* #ABAPP1/p[2], "TEI interchange format": This format is not even
  mentioned, let alone defined, anywhere else in the Guidelines.

* #ABAPP1/p[4], sentence 3, "... which scans word-processor ...":
  while there are uses of the term "scans" which make this sentence
  true, the default use in most users' heads (and emphasized by its
  use 2 words later) does not. How about:

    Special-purpose software may be purchased which reads
    word-processor files or the output of page scanners and inserts
    tags.

* #ABAPP2/p[1], sentence 2, "If there are <val>n</val> different
  encoding formats, to provide a mapping between any given pair of
  formats requires <val>n*n-1</val> translations; with an interchange
  format, only <val>2n</val> such mappings are needed."
  
  OK, first, those are not <val>s.
  
  Second, "to provide a mapping between any given pair of formats"
  only requires 2 mappings (A -> B and B -> A). It might be better to
  be explicit and say "to and from" rather than "between". Not sure.
  In any case, what requires a whole lot is "to provide mappings
  to and from each given pair of formats".

  But the number of mappings required is not N*N-1 (which is really
  (N*N)-1, since multiplication takes precedence over subtraction),
  but rather N*(N-1).

  So something like:
      If there are <formula>n</formula> different encoding formats,
      to provide mappings between each possible pair of formats
      requires <formula>n*(n-1)</formula> translations; with an
      interchange format, only <formula>2n</formula> such mappings
      are needed.
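
  The arithmetic can be checked with a quick sketch in Python. The
  function names are my own, and I am assuming each translation is
  one-directional (A -> B counts separately from B -> A), which is the
  reading argued for above.

```python
from itertools import permutations

def pairwise_translations(n: int) -> int:
    """Translations needed when every format maps directly to every other."""
    return n * (n - 1)

def interchange_translations(n: int) -> int:
    """Translations needed when every format maps only to and from one hub."""
    return 2 * n

for n in (2, 3, 4, 10):
    # Count the ordered pairs of distinct formats by brute force.
    enumerated = len(list(permutations(range(n), 2)))
    assert enumerated == pairwise_translations(n)
    # Columns: n, the mis-parenthesized n*n-1, n*(n-1), 2n
    print(n, n * n - 1, pairwise_translations(n), interchange_translations(n))
```

  Note that 2n is only the smaller number once n is greater than 3:
  for two formats the interchange route needs 4 translations against
  2 direct ones, and for three formats both approaches need 6.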

@ #ABAPP2/p[3]/list[1]/item[2]/list/following-sibling::text(), "The
  second requires an extension to the TEI scheme, as described ...":
  I think it may be worth somehow pointing out here that these are,
  definitionally, non-conformant.
  
  My reading of CF for this comment has brought to light an
  inconsistency, about which I hope to post shortly ...

* #ABTEI/p[3]: "This version includes substantial amounts of ...":
  s/includes/included/

* #ABTEI/p[4]: I'd drop the comma after "June".

* #ABTEI/p[5], "... Extensible Markup Language, XML. <note
  place="foot">XML was originally ...": Extra space before <note>. 
  Should this be linked to the bibliography? How about TEI P1, P2,
  P3? 

@ #ABTEI: I think there is too much devoted to describing the TEI
  Consortium (which is really not that relevant to the Guidelines),
  but paring down such a discussion is quite difficult. Certainly we
  can drop the bit about TEI Members' Meetings, no?

* #ABTEI/p[8], "... the TEI Board appointed a Technical Council ...":
  I don't like the word "appointed" here. "Created"? "Formed"?
 
* #ABTEI//note[1]: "... from their inception up till 1998" sounds
  awfully informal. How about "... from their inception until 1998"?


