[tei-council] facsimile draft

Conal Tuohy Conal.Tuohy at vuw.ac.nz
Thu Aug 9 02:32:14 EDT 2007


G'day Lou, et al

I have had a look at the draft and I have a few comments and queries about your proposed changes:

Regarding <att>box</att>, I agree with the others who've suggested that the origin of our coordinate spaces should be at the upper left, and that positive values are to the right, and below, the origin. That's pretty conventional in my experience. I like the name of <att>box</att>! 
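
To make that convention concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that a box value is four whitespace-separated integers giving left, top, right and bottom; the draft's actual datatype may well differ, and the names Box, parse_box and is_below are hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical rectangle: origin at upper left, x grows right, y grows down."""
    left: int
    top: int
    right: int
    bottom: int


def parse_box(value: str) -> Box:
    """Parse an assumed 'left top right bottom' string into a Box."""
    parts = [int(p) for p in value.split()]
    if len(parts) != 4:
        raise ValueError(f"expected 4 numbers, got {len(parts)}")
    box = Box(*parts)
    # With a top-left origin, no value is negative, right >= left, bottom >= top.
    if box.left < 0 or box.top < 0 or box.right < box.left or box.bottom < box.top:
        raise ValueError(f"not a valid box under a top-left origin: {value!r}")
    return box


def is_below(a: Box, b: Box) -> bool:
    """True if a lies entirely below b, i.e. a starts at a larger y value."""
    return a.top >= b.bottom
```

Under this reading, a line further down the page simply has larger y values than the line above it, which is how most image software already reports coordinates.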

Can you comment on the new facsimile/front and facsimile/back - what would be encoded there?

My main set of issues relates to the changes about how graphics relate to zones, and to surfaces, and how textual elements relate to zones or graphics. This is quite a different system - can you explain the rationale behind the changes? 

In the previous draft, it was possible to assign graphical coordinates to individual elements in a transcription, but this is now dropped. What was the reason for that? I am pretty sure that Dot was particularly keen on that feature, and I was also convinced of its utility. For instance, imagine a TEI transcript originating in an OCR process, which would have image coordinates assigned to each word by the OCR software. Using your draft markup, if I understand correctly, it would be necessary to create a distinct zone element for each word, essentially a parallel of the transcription, and link each word in the transcript to its corresponding zone. This would be quite an overhead!
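
To put a rough number on that overhead, here is a small Python sketch that builds both encodings for the same OCR output and counts the elements. The element and attribute names used (w, zone, surface, box, facs), the zone ids and the sample coordinates are assumptions for illustration only, not quotations from either draft.

```python
import xml.etree.ElementTree as ET

# Clark-notation name for xml:id, which ElementTree serialises with the xml prefix.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def inline_encoding(words):
    """Coordinates carried directly on each transcribed word."""
    p = ET.Element("p")
    for text, box in words:
        w = ET.SubElement(p, "w", {"box": box})
        w.text = text
    return p


def parallel_encoding(words):
    """One zone per word, plus a pointer from each word to its zone."""
    surface = ET.Element("surface")
    p = ET.Element("p")
    for n, (text, box) in enumerate(words, start=1):
        zone_id = f"z{n}"  # hypothetical id scheme
        ET.SubElement(surface, "zone", {XML_ID: zone_id, "box": box})
        w = ET.SubElement(p, "w", {"facs": f"#{zone_id}"})
        w.text = text
    return surface, p


def count_elements(*trees):
    """Total number of elements across the given trees."""
    return sum(1 for tree in trees for _ in tree.iter())
```

For n words the inline form needs n + 1 elements, whereas the parallel form needs 2n + 2, plus an id and a pointer to keep in step for every single word.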

I also think that the value space for @facs is too loose, in the sense that a <p> or a <div> could use a @facs pointer to point to an image file, a zone, or a graphic. I have a feeling this is not going to be so convenient for processing, since a processor would have to check what kind of thing each pointer resolves to. In the previous draft, the idea was that such links would be ONLY to zones, which were facsimile equivalents of <anchor> elements in a transcription. 

You've also allowed <graphic> inside <zone>, and I'm having a hard time understanding the rationale for this change. It seems to be of a piece with the change to remove <graphic> from att.coordinated. Now, since a graphic has no @box of its own, it inherits one from its parent <zone>, is that right? In my previous draft, a graphic had a @box (or @coords as it was still called) attribute of its own, and hence didn't need to be enclosed in a zone, and I don't see why we'd want to wrap those graphics in zones, when they could just have their own @box. What does that gain us?

Requiring graphics to be contained in zones would be convenient to the extent that the distinct graphics correspond exactly to areas of interest (i.e. if they have been exactly cropped to that size), but I'm not sure this is likely to be a common case. It seems to me more likely that graphics will tend to be larger than zones, in almost all cases. Hence there would need to be an analytical zone (highlighting the area of interest) and a graphical zone (to contain a graphic which showed the area of interest). Only if the graphic had been cropped to exactly cover the area of interest could its parent zone be accurately used as an analytical zone, and linked to a piece of transcript. 
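
The distinction can be sketched in a few lines of Python. This assumes boxes are (left, top, right, bottom) tuples with the origin at the upper left; the function names and sample numbers are hypothetical.

```python
from typing import Tuple

# Assumed layout: (left, top, right, bottom), origin at the upper left.
Box = Tuple[int, int, int, int]


def contains(outer: Box, inner: Box) -> bool:
    """True if inner lies wholly within outer (edges may touch)."""
    o_left, o_top, o_right, o_bottom = outer
    i_left, i_top, i_right, i_bottom = inner
    return (o_left <= i_left and o_top <= i_top
            and i_right <= o_right and i_bottom <= o_bottom)


def usable_as_analytical_zone(graphic_zone: Box, area_of_interest: Box) -> bool:
    """A graphic's parent zone can stand in for the analytical zone only
    when the graphic was cropped to exactly the area of interest."""
    return graphic_zone == area_of_interest
```

A full-page image of (0, 0, 2000, 3000) contains a decorated initial at (150, 200, 450, 520), but its zone cannot be reused to mark that initial, so a second zone is needed.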

Removing graphic from zone (and giving graphic its own @box) would mean that zones would be always empty, and this would simplify processing, too, I believe.

Regarding the "short-cut" which allows facsimile/graphic instead of requiring facsimile/surface/graphic, this seems reasonable, though I wonder if there's much prospect of people using it; if not, I think it should be abolished (to simplify processing). The reason I doubt it would be popular is that if you have a single graphic, you already have the option of linking to it directly from a pb, which is an even shorter short-cut. And if you use the facsimile/graphic shortcut (i.e. a graphic as a direct child of facsimile, rather than mediated by a surface), you don't have the option of using zones anyway, so this slightly-longer shortcut doesn't cater for any distinct use case as far as I can see.

In short, I'm a bit flummoxed. I liked the linking better the way it was.

Con

-----Original Message-----
From: tei-council-bounces at lists.village.Virginia.EDU on behalf of Lou Burnard
Sent: Mon 06/08/07 9:07
To: tei-council at lists.village.Virginia.EDU
Subject: [tei-council] facsimile draft 
 
As mentioned in the call, I've been working on trying to produce a 
section about facsimile markup which could be plugged into the current 
chapter on physical transcription, using as many as possible of the 
ideas discussed here by Conal and others over the last few weeks.

Time is running out, and we need to get closure on this, so I hope Conal 
and Dot will excuse me for steaming ahead on this without consulting 
with them privately first. I've used the documents circulated and 
followed (as far as I can) the discussion so far to produce a 
straw-person kind of a draft which is now posted for your (particularly 
their) urgent attention at http://www.tei-c.org/Drafts/facs.odd

I've deliberately restricted the scope of what this draft makes possible 
to what I hope we can all agree on as a bare minimum of functionality. 
It supports linking from text to image and image to text with a minimum 
of fuss; it also supports linking between text and image fragments, but 
only provides one way to do it. It tries to fit in with existing TEI 
idiom and practice.

It is however in desperate need of help on the following counts:

-- I haven't the faintest idea how to transcribe the Old English ms 
we're using as an example. (The one Conal circulated earlier). Either 
someone needs to transcribe it for me, or I need to find another example 
which I can transcribe. (Actually, as this one claims to be copyright of 
the Bodleian, the second is probably the wiser course)

-- in defining how the co-ordinate system works, I have had to rely on 
my vague recollections of O level maths. Someone who actually knows 
about this stuff should read it carefully to see how plausible this is. 
Also how feasible it is to implement it!

-- I've also made a wild guess about how to specify the datatype of my 
@box attribute (formerly known as @coords -- I renamed it because it is 
considerably more restricted than the synonymous XHTML attribute)

All comments, bouquets, and brickbats gratefully received

Lou
