[tei-council] encoding page scans

Conal Tuohy Conal.Tuohy at vuw.ac.nz
Wed Dec 14 18:01:37 EST 2005


Dot Porter wrote:

> I agree that it is preferable to provide image file references via
> indexing - either through a METS file or a list of files in the TEI
> Header - rather than including them in the text encoding itself using
> something like pb/@url. My main reason is that a single page in TEI
> may point to multiple scanned images - the same manuscript page under
> different lighting conditions, for example. Building those links into
> the pb element would make it cumbersome to link one encoded page to
> several different files, while an index could group images of the same
> page together providing a single reference for the pb.

That's true. And there are different purposes (thumbnail, reference) and
different formats too (e.g. JPEG, TIFF).
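
To make the grouping concrete, here is a rough sketch of how a METS
index might collect several files for one page by purpose and format.
The IDs, file names, and the USE, TYPE and LABEL values are all made up
for illustration, and this is only my guess at a layout, not the markup
Dot is actually using:

```xml
<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <!-- one fileGrp per purpose; all IDs and URLs here are hypothetical -->
    <mets:fileGrp USE="thumbnail">
      <mets:file ID="p12-thumb" MIMETYPE="image/jpeg">
        <mets:FLocat LOCTYPE="URL" xlink:href="images/p12-thumb.jpg"/>
      </mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="reference">
      <mets:file ID="p12-ref" MIMETYPE="image/tiff">
        <mets:FLocat LOCTYPE="URL" xlink:href="images/p12-ref.tif"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap>
    <!-- a single div gathers every image of the same page -->
    <mets:div TYPE="page" LABEL="12">
      <mets:fptr FILEID="p12-thumb"/>
      <mets:fptr FILEID="p12-ref"/>
    </mets:div>
  </mets:structMap>
</mets:mets>
```

The page in the TEI text would then need only one reference, to the
page-level div, however many images get added to it later.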

> Going a step further, what if we want to create relationships between
> areas of that page scan and the corresponding range in the TEI text?
> Is this something that the TEI should cover, or is it better to rely
> on other standards (METS, SVG)? The Edition Production Technology[1]

I have a preference for something internal to TEI for the sake of making
it very easy to encode in the simple case. But I'm not too fussed so
long as some recommended practice is documented as part of the TEI
guidelines.

> depends on @coords, which can be added to any tag and which indicates
> where the tag contents reside on the image. Unfortunately this only
> allows for one set of coordinates, even if there are multiple image
> files for the same page. For another project, I've been working on a

The same problem as above then - the links should go from the image to
the text, rather than the other way? Or we could use link/@targets to
encode n-ary links, i.e. a single link element indicating a
correspondence between some text markup and 1 or more graphics (or
regions within graphics)? The weakness of link/@targets for multiple
links is that it doesn't attach any semantics to the different links.
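
As a rough sketch of what I mean, assuming graphic elements gathered
somewhere in the document, pointers written as URI references, and
identifiers that I have invented for the example, a single link could
tie a page break to two scans of the same page:

```xml
<!-- hypothetical identifiers and file names throughout -->
<pb xml:id="page12" n="12"/>

<graphic xml:id="p12-daylight" url="images/p12-daylight.jpg"/>
<graphic xml:id="p12-uv" url="images/p12-uv.jpg"/>

<!-- one n-ary link: the page break plus every image of that page -->
<link targets="#page12 #p12-daylight #p12-uv"/>
```

Nothing in the link itself says which target is the text and which are
the images, or how one image differs from the other.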

> system for encoding multiple sets of coordinates in a METS Structural
> Map (stored in a separate file, rather than as a wrapper for the TEI),
> and it seems to work pretty well. 

I'm sure it does :-) and I've read the METS profiles for doing it[1]
though I've never done it myself ... but do you think it should really
be necessary to go to these lengths? It seems to me that we could use
METS to handle TEI figure elements too. Page scans are a different level
of description of the text from figures, but TEI is full of different
analytical levels of all kinds, so I don't find that argument convincing
in itself. But I suppose I could be convinced :-) The main thing, to my
mind, is that the guidelines should clearly document some standard
practice that doesn't involve treating page scans as figures, or making
custom extensions to the TEI schema.

Dot, could you post a little example of the METS and TEI markup you are
using to associate an image file with a TEI page?

> The UVic Image Markup Tool[2] uses
> SVG within the TEI body to link to both file and coordinates; 

That's an interesting example. I agree that inline SVG could be a good
way to mark up regions within a graphic. Just a quibble, though: in your
example, is the link between the graphical region and the text
represented by a purely conventional correspondence between the @n and
the @xml:id attributes? The guidelines suggest using a ptr to
provide a TEI-namespaced proxy for the SVG region, then linking the ptr
and the text markup with a TEI <link> element.
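
Roughly like this, I think, though the identifiers, coordinates and
choice of elements are invented for the sake of the example and I
haven't checked it against the current draft:

```xml
<!-- inline SVG: the region on the page image (hypothetical values) -->
<svg:rect xmlns:svg="http://www.w3.org/2000/svg"
          id="region1" x="120" y="340" width="600" height="45"/>

<!-- a TEI ptr standing in for the SVG region -->
<ptr xml:id="region1-proxy" target="#region1"/>

<!-- the transcribed line that the region corresponds to -->
<l xml:id="line5"><!-- transcribed text --></l>

<!-- the explicit link between the proxy and the text -->
<link targets="#region1-proxy #line5"/>
```

That way the correspondence is stated in the markup rather than implied
by matching attribute values.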

> another
> approach might be to have a TEI module for incorporating image files
> and their areas in a project.

This last is the approach I think I would prefer. One of my criteria for
such a feature would be that in the simplest case it should be dead easy
to associate a page image with a page. The METS approach is perfectly
capable, but I would guess it's not so convenient for encoders.
Having a separate file is an extra hassle, for a start,
though perhaps a "METS" TEI module could be produced which would add
METS as a root element, and introduce the TEI element embedded in the
METS structure map? Could the same be done with SVG? That could be a way
to at least provide recommended encoding mechanisms defined using the
standard TEI customisation practice.

Cheers

Con

[1] http://www.loc.gov/standards/mets/profiles/00000005.xml


