[tei-council] Revised AB
Syd Bauman
Syd_Bauman at Brown.edu
Thu Oct 25 18:20:44 EDT 2007
I read version r3778, and overall it looks very good. Because I
realize not everyone will want to read everything, I've marked those
items I consider more important for Council, or more likely to be of
interest to some, with "@" instead of my usual "*".
* #AB/p[4], last sentence: the ending sequence of 4 prepositional
phrases makes this sentence clumsy. Julia had the following
recommendation: "We believe they reflect the widest variety of
digital textual practices currently in use, though they are by no
means limited to these."
* #AB/p[5]: Para ends with two "for more information see". Neither
seems necessary here, especially since this kind of information is
more thoroughly addressed a few paras later. The first one at least
fits in with the flow of the rest of the para, but the second one
feels like a non-sequitur.
* #ABSTRUNC/head: Why is this 1 section and not 2? (Just because the
notational conventions would be so short? If so, makes some sense,
esp. as I'm recommending below it be shorter!)
@ #ABSTRUNC/p[2]: I still think that if we'd like chapters to be
arranged in increasing order of specialist interest that WD
"Representation of Non-standard Characters and Glyphs" needs to be
moved much lower, down near NH "Non-hierarchical Structures" and
"Certainty and Responsibility".
But as for the rest of this paragraph, it looks like the list of
chapters described presumes that WD has already been moved?
Otherwise the counts of 4 general, 8 genre, 9 special, 2 technical
just don't line up.
@ #ABSTRUNC/p[6] (the one para purely on notational conventions): I
am not at all sure it is a good idea to discuss the on-line
formatting at this level of detail in the Guidelines. At this
level, shouldn't the discussion be presentation-independent? This
information belongs on a website page outside of (but linked by the
nav bar of) the Guidelines proper, I suspect.
If we keep this information, the first sentence's "XML elements or
TEI classes" should read "TEI elements or classes", because non-TEI
elements are not links. However, I do not understand why datatypes
and macros aren't links. Seems to me a different set of stylesheets
should be able to make them all (or none) links w/o necessitating
a change to the content of the Guidelines.
Second sentence ignores the fact that empty elements are displayed
differently. I am strongly against displaying the name of an
element differently based on whether or not it is permitted to have
content, and have asked Chris, Dot, and James to remove the
slash from names of empty elements, particularly in running prose.
@ #ABTEI2/p[1], "TEI scheme is ... committed to providing a maximum of
... flexibility, and extensibility": I am not at all sure what to do
about this; perhaps nothing. But I think it is very problematic to
claim that we are committed to providing maximum flexibility and
extensibility, but then to define "conformance" in a way that at
least erects barriers to, if not outright curtails, the flexibility
and extensibility of which the system is capable.
* #ABTEI2/p[1]/list[1]/item[2], "provide guidance for encoding of
texts in this format":
Should be either
"provide guidance for the encoding of texts in this format"
or
"provide guidance for encoding texts in this format".
* #ABTEI2/p[1]/list[2]/item[4], "... the same text features":
s/text/textual/;.
* #ABTEI2/p[2], sentence 1: I'd change from plural to singular, i.e.:
"The goal of creating a common interchange format which is
application independent requires ...".
* #ABTEI2/p[2], sentence 2: "... which defines an Extensible Markup
Language, but their definition is as far as possible independent of
any particular schema language." First, even if the wording stays,
"extensible markup language" should not be capitalized. But it is
an arguable point: XML does not define a markup language, it
defines a metalanguage.
* #ABTEI2/p[3], last sentence "... the Guidelines very rarely require
any particular level of encoding, their correct use requires
conformance to the meanings attached to the encodings they
propose.": OK, maybe it's me: maybe I'm not as smart as the rest of
you, or maybe I'm just overtired. But I've read that 3 times, and
I'm still not 100% sure I understand it.
* #ABTEI2/p[4], last sentence: c/their/his or her/.
* #ABTEI2/p[5], last clause, "and the two terms <term>text
creation</term> and <term>text capture</term> are often used
interchangeably": I think the whole clause should be deleted, as I
don't think it's true. The regular expression "text\s+creation"
does not occur outside of AB.
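For anyone who wants to repeat that check, here is a minimal Python sketch of the kind of search I mean. The function name and the sample strings are invented stand-ins for illustration, not text from the Guidelines; in practice you would load the actual chapter sources into the dictionary.

```python
import re

# Same expression as quoted above; \s+ also matches a line break
# between the two words.
PATTERN = re.compile(r"text\s+creation")

def chapters_matching(chapters):
    """Return the sorted names of the chapters whose text matches PATTERN.

    `chapters` maps a chapter identifier (e.g. "AB") to its text.
    """
    return sorted(name for name, text in chapters.items()
                  if PATTERN.search(text))

# Invented sample data, purely to show the call.
sample = {
    "AB": "the terms text\ncreation and text capture",
    "XX": "a chapter that only mentions text capture",
}
print(chapters_matching(sample))
```

If the result for the real sources is just AB, the "often used interchangeably" claim has no support elsewhere in the text.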
* #ABTEI2/p[6], last sentence: this is probably important to say, but
it's not phrased quite right. It says basically that P5 does not
support the "strip out the tags, and you get the Gutenberg
plain-text version :-)" theory of markup; P4 did. But that's not
quite right, either: P4 did not *support* that theory of markup,
but rather licensed it or permitted it. You can certainly encode
things in P4 in such a way that it is not true. But in P5, while
it is *possible* to create encodings where that theory holds, it is
not possible in the general case (falls apart as soon as you want
to encode an error and its correction, e.g.), and licensing that
theory was explicitly *not* a design goal. Lou, perhaps you can
take the following as a starting point for a re-write:
Further it should be noted that the encoding system described by
these Guidelines no longer licenses the capability to encode
texts such that simply removing the markup reveals an unmediated
version of the source text; this capability was permitted by
previous versions.
* #ABTEI2/p[8], "... a variety of text features, but sometimes ...":
I'd change "text" to "textual".
@ #ABTEI2/p[9], 2nd sentence ("Because no predefined ..."): I am
uncomfortable with the implication that the term "customization"
only refers to reducing the scope -- we have always (including in
MD, I believe) used it to mean both reductions and modifications by
adding new stuff.
* #ABAPP1/p[2], "TEI interchange format": This format is not even
mentioned, let alone defined, anywhere else in the Guidelines.
* #ABAPP1/p[4], sentence 3, "... which scans word-processor ...":
while there are uses of the term "scans" which make this sentence
true, the default use in most users' heads (and emphasized by its
use 2 words later) does not. How about:
Special-purpose software may be purchased which reads
word-processor files or the output of page scanners and inserts
tags.
* #ABAPP2/p[1], sentence 2, "If there are <val>n</val> different
encoding formats, to provide a mapping between any given pair of
formats requires <val>n*n-1</val> translations; with an interchange
format, only <val>2n</val> such mappings are needed."
OK, first, those are not <val>s.
Second, "to provide a mapping between any given pair of formats"
only requires 2 mappings (A -> B and B -> A). It might be better to
be explicit and say "to and from" rather than "between". Not sure.
In any case, what requires a whole lot is "to provide mappings
to and from each given pair of formats".
But the number of mappings required is not N*N-1 (because that is
really (N*N)-1, because multiplication takes precedence over
subtraction), but rather N*(N-1).
So something like:
If there are <formula>n</formula> different encoding formats,
to provide mappings between each possible pair of formats
requires <formula>n*(n-1)</formula> translations; with an
interchange format, only <formula>2n</formula> such mappings
are needed.
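To make the arithmetic concrete, here is a small Python sketch comparing the three expressions. The function names are mine and purely illustrative; the only assumption is that each translation is one-directional, so A -> B and B -> A count separately.

```python
def pairwise_translations(n):
    """Direct translations needed to and from every pair of n formats."""
    return n * (n - 1)

def interchange_translations(n):
    """Translations needed when every format maps to and from one hub."""
    return 2 * n

def as_originally_written(n):
    """What n*n-1 evaluates to: multiplication binds first, so (n*n)-1."""
    return n * n - 1

for n in (2, 3, 4, 5, 10):
    print(n, pairwise_translations(n), interchange_translations(n),
          as_originally_written(n))
```

Note that the interchange format only comes out ahead once n is greater than 3; at n = 3 both approaches need 6 translations.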
@ #ABAPP2/p[3]/list[1]/item[2]/list/following-sibling::text(), "The
second requires an extension to the TEI scheme, as described ...":
I think it may be worth somehow pointing out here that these are,
definitionally, non-conformant.
My reading of CF for this comment has brought to light an
inconsistency, about which I hope to post shortly ...
* #ABTEI/p[3]: "This version includes substantial amounts of ...":
s/includes/included/;
* #ABTEI/p[4]: I'd drop the comma after "June".
* #ABTEI/p[5], "... Extensible Markup Language, XML. <note
place="foot">XML was originally ...": Extra space before <note>.
Should this be linked to the bibliography? How about TEI P1, P2,
P3?
@ #ABTEI: I think there is too much devoted to describing the TEI
Consortium (which is really not that relevant to the Guidelines),
but paring down such a discussion is quite difficult. Certainly we
can drop the bit about TEI Members' Meetings, no?
* #ABTEI/p[8], "... the TEI Board appointed a Technical Council ...":
I don't like the word "appointed" here. "Created"? "Formed"?
* #ABTEI//note[1]: "... from their inception up till 1998" sounds
awfully informal. How about "... from their inception until 1998"?