[tei-council] Tite and conformance

James Cummings James.Cummings at oucs.ox.ac.uk
Fri Jul 10 05:40:09 EDT 2009


David Sewell wrote:
> Just one comment on the TEI Header issue. Given a strong enough 
> statement to the effect that a TEI Tite document is intended for initial 
> keyboard capture and is in a non-archival, nonconformant TEI format 
> where post-processing is expected, I don't think it is horrible that 
> <text> is permitted as a root element.

I agree that Tite users would probably just abuse a header if they had 
one.  I suppose it isn't *just* the removal of the header that bothers 
me.  When Tite was first conceived a valid TEI document consisted of a 
teiHeader and a text element. That is no longer the case.  TEI documents 
can now consist of a teiHeader plus a text _or_ a facsimile (or fsdDecl, but 
let's not go there...)

My point is that large digitisation projects are often working from processed 
images where they've started with OCR and have hordes of students 
marking up (with a graphical interface) and proofreading and correcting 
texts.  Alternatively, when they are double-keying, they are sometimes 
doing it against the images, so there is at least a page/image 
relationship.

But the current setup of Tite does not allow them to have a <facsimile> 
element to preserve this information.  Output from OCR programs or 
formats like OmniPage, FineReader, DjVu in their XML forms actually 
has every word marked up with corresponding co-ordinates.  (I've 
written some XSLT to change DjVu to TEI facsimile, for example.)
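
To make the kind of conversion mentioned above concrete, here is a minimal sketch in Python (not the XSLT referred to). It assumes a simplified, hypothetical OCR input format, a <page> with <word x1 y1 x2 y2> children, which is not the real DjVu, OmniPage or FineReader schema. The function name and the w1, w2, ... identifiers are likewise invented for illustration. Only the TEI P5 output elements (facsimile, surface, graphic, zone and the ulx/uly/lrx/lry attributes) are standard.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def ocr_page_to_facsimile(ocr_xml: str) -> ET.Element:
    """Turn one hypothetical OCR <page> into a TEI <facsimile> element.

    Assumed input: <page image="..."> containing <word> elements whose
    x1/y1/x2/y2 attributes give the upper-left and lower-right corners.
    """
    page = ET.fromstring(ocr_xml)
    facsimile = ET.Element(f"{{{TEI_NS}}}facsimile")
    surface = ET.SubElement(facsimile, f"{{{TEI_NS}}}surface")
    ET.SubElement(surface, f"{{{TEI_NS}}}graphic", {"url": page.get("image")})
    # One <zone> per OCR word, carrying that word's bounding box.
    for n, word in enumerate(page.iter("word"), start=1):
        ET.SubElement(surface, f"{{{TEI_NS}}}zone", {
            XML_ID: f"w{n}",  # hypothetical id scheme
            "ulx": word.get("x1"), "uly": word.get("y1"),
            "lrx": word.get("x2"), "lry": word.get("y2"),
        })
    return facsimile


# Invented sample data, for illustration only.
SAMPLE = """<page image="page1.png">
  <word x1="10" y1="20" x2="90" y2="40">Tite</word>
  <word x1="100" y1="20" x2="160" y2="40">and</word>
</page>"""

facsimile = ocr_page_to_facsimile(SAMPLE)
zones = facsimile.findall(f"{{{TEI_NS}}}surface/{{{TEI_NS}}}zone")
ET.register_namespace("", TEI_NS)  # serialize with TEI as the default namespace
print(ET.tostring(facsimile, encoding="unicode"))
```

A real converter would also need to handle each tool's own coordinate conventions and units, which this sketch ignores.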

But exactly the people who are in a position potentially to preserve 
this information are unable to under the format that we're suggesting to 
them.  If I were suggesting a reDesign of Tite, it would include a 
wrapper element (TEI? Something else?) around the <text> so as to also 
allow a parallel <facsimile>.  But I'm not suggesting a reDesign of 
Tite.  I just thought I'd point out that having <text> as a root has 
other implications.
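
As a rough illustration of the wrapper idea (a sketch only, not a proposal for Tite), the following Python builds a root element holding a <facsimile> alongside the keyed <text>. The wrapper name "TEI", the function name, the file name page1.png and the page1 identifier are all assumptions made for the example. The facs pointer on <pb> is standard TEI P5, but whether a redesigned Tite would permit it is also an assumption.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialize with TEI as the default namespace


def wrap_with_facsimile(text: ET.Element, facsimile: ET.Element,
                        wrapper: str = "TEI") -> ET.Element:
    """Place <facsimile> and <text> side by side under one wrapper element."""
    root = ET.Element(f"{{{TEI_NS}}}{wrapper}")
    root.append(facsimile)  # page images and zones come first
    root.append(text)       # keyed text follows, pointing back via facs
    return root


# Invented sample content, for illustration only.
text = ET.fromstring(
    f'<text xmlns="{TEI_NS}"><body>'
    '<pb facs="#page1"/><p>Some keyed text.</p>'
    '</body></text>'
)
facsimile = ET.fromstring(
    f'<facsimile xmlns="{TEI_NS}">'
    '<surface xml:id="page1"><graphic url="page1.png"/></surface>'
    '</facsimile>'
)

doc = wrap_with_facsimile(text, facsimile)
print(ET.tostring(doc, encoding="unicode"))
```

With <text> as the root there is nowhere to hang the <facsimile>; any wrapper of this shape restores the page/image relationship at the cost of one extra element.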

-James
-- 
Dr James Cummings, Research Technologies Service, University of Oxford
James dot Cummings at oucs dot ox dot ac dot uk

