[tei-council] Tite and conformance
James Cummings
James.Cummings at oucs.ox.ac.uk
Fri Jul 10 05:40:09 EDT 2009
David Sewell wrote:
> Just one comment on the TEI Header issue. Given a strong enough
> statement to the effect that a TEI Tite document is intended for initial
> keyboard capture and is in a non-archival, nonconformant TEI format
> where post-processing is expected, I don't think it is horrible that
> <text> is permitted as a root element.
I agree that Tite users would probably just abuse a header if they had
one. I suppose it isn't *just* the removal of the header that bothers
me. When Tite was first conceived, a valid TEI document consisted of a
teiHeader and a text element. That is no longer the case. TEI documents
can now consist of a teiHeader, a text _or_ a facsimile (or fsdDecl, but
let's not go there...)
My point is that large digitisation projects are often working from processed
images where they've started with OCR and have hordes of students
marking up (with a graphical interface) and proofreading and correcting
texts. Alternatively sometimes when they are double-keying they are
doing it against the images and there is at least a page/image
relationship.
But the current setup of Tite does not allow them to have a <facsimile>
element to preserve this information. The XML output of OCR programs and
formats like Omnipage, Finereader, DejaVu actually has every word
marked up with corresponding co-ordinates. (I've
written some XSLT to change DejaVu to TEI facsimile for example.)
But exactly the people who are in a position potentially to preserve
this information are unable to under the format that we're suggesting to
them. If I were suggesting a redesign of Tite, it would include a
wrapper element (TEI? Something else?) around the <text> so as to also
allow a parallel <facsimile>. But I'm not suggesting a redesign of
Tite. I just thought I'd point out that having <text> as a root has
other implications.
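
To make the wrapper idea concrete, here is a rough Python sketch of what a post-processing step might produce: a wrapper holding a <facsimile> alongside the <text>, with each <pb> pointing at its page image. It is only an illustration under stated assumptions: the function name, the input tuples and the image file names are all hypothetical, the wrapper is simply called TEI, and without a teiHeader the result is still not a conformant TEI document.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def tei(name):
    """Return the namespace-qualified tag for a TEI element name."""
    return f"{{{TEI_NS}}}{name}"


def wrap_with_facsimile(pages):
    """Build a wrapper element with a <facsimile> parallel to a <text>.

    `pages` is a hypothetical list of (page_id, image_url, transcription)
    tuples, e.g. produced by a keying or OCR workflow.
    """
    ET.register_namespace("", TEI_NS)
    root = ET.Element(tei("TEI"))  # wrapper; no teiHeader in this sketch
    facsimile = ET.SubElement(root, tei("facsimile"))
    text = ET.SubElement(root, tei("text"))
    body = ET.SubElement(text, tei("body"))

    for page_id, image_url, transcription in pages:
        # One <surface> per page image, identified by xml:id.
        surface = ET.SubElement(facsimile, tei("surface"))
        surface.set(f"{{{XML_NS}}}id", page_id)
        graphic = ET.SubElement(surface, tei("graphic"))
        graphic.set("url", image_url)

        # In the transcription, <pb facs="#id"> links back to that surface.
        para = ET.SubElement(body, tei("p"))
        pb = ET.SubElement(para, tei("pb"))
        pb.set("facs", "#" + page_id)
        pb.tail = transcription

    return root


if __name__ == "__main__":
    doc = wrap_with_facsimile([
        ("page1", "page1.png", "Text keyed from the first page."),
        ("page2", "page2.png", "Text keyed from the second page."),
    ])
    print(ET.tostring(doc, encoding="unicode"))
```

This only records the page/image relationship. Word-level co-ordinates from OCR output could presumably be carried as <zone> elements inside each <surface>, but that is left out here.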
-James
--
Dr James Cummings, Research Technologies Service, University of Oxford
James dot Cummings at oucs dot ox dot ac dot uk