[tei-council] Responses to Primary Sources #1 (up to the end of 11.1)
mholmes at uvic.ca
Thu Nov 24 13:42:42 EST 2011
This is my first batch of feedback on the Primary Sources chapter draft.
First, I should say that this is becoming an excellent piece of work,
and although I have many quibbles and criticisms, I don't want to appear
merely critical. This is what I have so far:
Repetition of "for example". Recommend substituting "such as" for the
"It may sometimes contain a variety of images of the same source pages,
for example of different resolutions, or of different kinds. Such a
collection may form part of any kind of document, for example a
commentary of a codicological or paeleographic nature, where there is a
need to align explanatory text with image data."
Superfluous "And" at the beginning of this sentence, especially since
"also" is present:
"And it may also be complemented..."
In this sentence:
"These elements make it possible to accommodate multiple images of each
page, as well as to record arbitrary planar coordinates of textual
elements on any kind of written surface and to link such elements with
digital facsimile images of them."
I don't believe that we need the word "textual"; it implies (to me at
any rate) that non-textual elements on the page cannot be identified by
<zone>s. Suggest either deletion of the word, or "textual or other
The description of sourceDoc depends on the phrase "dossier génétique".
I think this should be glossed in English. I don't know what it should
be glossed with, of course.
In this sentence:
"Either of the facsimile and sourceDoc elements may be used to represent
a digital facsimile."
I maintain that "and" should be "or".
The first example of mapping coordinate spaces, using the Karlsruhe
image, is pointlessly complicated. First, we create a <surface> whose
coordinate space is not identical with the graphic we're working with,
then create a <zone> which is larger than the <surface>, with the
graphic inside the <zone>. Why such complexity? The simplest case would
be to create a <surface> whose coordinate space is 0, 0, 500, 321, and
then use <zone>s to define the spaces of interest (the left and right
pages). My argument is not that what's currently in the document is
wrong; it's that it's a rather abnormal way to proceed, that it's too
complicated for the first example of <surface> and <zone>, and that it
will be off-putting and confusing for readers. I suggest the first
example should work like this (using the same page-image):
<zone ulx="37" uly="16" lrx="230" lry="293" xml:id="k95v"></zone>
<zone ulx="232" uly="16" lrx="416" lry="293" xml:id="k96r"></zone>
In other words, a single surface coterminous with the graphic, with two
zones established on it, one for each page.
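
Spelled out in full, the simple case might look something like the
sketch below. It assumes the 0, 0, 500, 321 coordinate space mentioned
above; the graphic file name is a hypothetical placeholder, not the one
used in the draft.

```xml
<facsimile>
  <!-- One surface whose coordinate space is coterminous with the image -->
  <surface ulx="0" uly="0" lrx="500" lry="321">
    <!-- Hypothetical file name, for illustration only -->
    <graphic url="page-image.png"/>
    <!-- Left-hand page -->
    <zone ulx="37" uly="16" lrx="230" lry="293" xml:id="k95v"/>
    <!-- Right-hand page -->
    <zone ulx="232" uly="16" lrx="416" lry="293" xml:id="k96r"/>
  </surface>
</facsimile>
```

Because the surface and the graphic share one coordinate space, the zone
coordinates can be read directly off the image in pixels.
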
If necessary, this simple example could be re-worked into the way it
currently appears, with the addition of a good explanation of why one
might do this, but I think the simple example needs to come first; it
will be what most people want to do, and will be all that many people
actually need. In fact, the Bovelles example, which is carefully worked
out below, starts from this simple approach, but many readers will not
get that far because they will stumble at the first, more confusing,
example.
In Figure 3, Zones within a surface, the added zone boundaries should be
in a different colour from the original image, so it's clear to the
reader that they are an artifact of the encoding, not part of the
The first example of using the @points attribute, on the Bovelles image,
is pointlessly complicated:
points="4.88147,31.0344 5.46483,30.7339 5.58857,32.2011
5.85374,32.8022 6.10123,33.4386 5.53554,33.7744 5.11128,33.3679
Why not just use integers here? Nothing is gained by five decimal
places, other than to slightly intimidate the reader. Most uses of
@points will use whole numbers (pixel positions within the image; there
is little purpose in going below pixel precision).
In the Bovelles transcription, which links with the image further up the
page, the zone "B49rHead" is defined to contain both <head> elements
that appear at the beginning of the <div> (including "Chapitre
septiesme"). However, in the transcription example, only the first
<head> is linked to that <zone> using @facs. I suggest that either:
- The transcription be modified to contain a single <head>, so it can
be unambiguously linked to the <zone>, or
- The image of zones be modified to split that zone into two, so that
each can be linked to its appropriate <head>.
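
As a rough illustration of the second option, something along these
lines might work. The ids B49rHead1 and B49rHead2 are hypothetical, and
the zone coordinates and the text of the first heading are deliberately
left out rather than guessed.

```xml
<!-- In the facsimile: the single zone B49rHead split in two
     (hypothetical ids; coordinates omitted) -->
<zone xml:id="B49rHead1"/>
<zone xml:id="B49rHead2"/>

<!-- In the transcription: each head points at its own zone -->
<div>
  <head facs="#B49rHead1"><!-- text of the first heading --></head>
  <head facs="#B49rHead2">Chapitre septiesme</head>
</div>
```
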
This whole section, which is tiny, seems superfluous to me. Its contents
have already been covered above ("a legal TEI document may thus comprise
any of the following : ..."), and if the explanation above is
insufficient, it should be expanded so this section can be deleted.
Another way of looking at this section is that it comprises the
introduction to 11.1.3, in which case it should be folded into it.
In this sentence:
"An embedded transcription is one in which words and other written
traces are encoded as subcomponents of elements representing the
physical surfaces carrying them rather than independently of them. "
I recommend a comma after "carrying them".
This sentence might not be true:
"Equally, the encoder may choose to provide only graphics without any
transcription, to provide only a structured (non-embedded)
transcription, or to provide any combination of the three."
I don't think <facsimile> + <sourceDoc> + <text> is actually allowed, is
it? If it is, then it needs to be included in the list further up the
page ("a legal TEI document may thus comprise any of the following : ").
More in a bit...
University of Victoria Humanities Computing and Media Centre
(mholmes at uvic.ca)