[tei-council] facsimile markup

Thu Nov 23 03:25:28 EST 2006

Good evening all!

Thanks to Dot who put in a bit more work on this while I slept (until lunch time!) before her Thanksgiving vacation.

This is a short document intended to provide descriptions and examples of markup which could be used to link graphical images to pages, and within each page, to link segments of text to regions within those pages. I'm aware that our working documents have been horrendous. With this one I've tried to keep it concise so that it may be read without too much boggling of the mind. There's no attempt to justify the design, and no comparisons with SVG, MPEG-21, METS, etc. either.

Next I will be working on turning this into an ODD (by the 11th), producing some small sample docs, and writing XSLT for typical processing of such docs (such as highlighting the occurrence of certain words or phrases in a facsimile image). 

Regards

Con

PS happy Thanksgiving to the American council members!

Fax Markup

Linking page breaks to pages

Each <pb> is the link target of a <pg> element in the <sourceDesc> which represents the physical page which the <pb> introduces. e.g.

<sourceDesc>
   ...
   <pg start="#pb-1"/>
   ...
</sourceDesc>
...
<pb xml:id="pb-1"/>

An open question is how this <pg> element will fit with the physical bibliography draft. It appears that the <pg> element here may be equivalent to the <page> element described in the PB draft (http://www.tei-c.org/Activities/PB/draft-0714.html).
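
As a rough illustration of how a processor might follow these links, here is a minimal Python sketch using only the standard library. The sample document, the absence of a TEI namespace, and the function name link_pages are assumptions made for the example; the proposal itself only specifies that the @start of a <pg> points at a <pb>.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample; a real TEI file would also use the TEI namespace.
SAMPLE = """
<TEI>
  <sourceDesc>
    <pg start="#pb-1"/>
    <pg start="#pb-2"/>
  </sourceDesc>
  <text>
    <pb xml:id="pb-1"/>
    <p>first page</p>
    <pb xml:id="pb-2"/>
    <p>second page</p>
  </text>
</TEI>
"""


def link_pages(root):
    """Map each pb's xml:id to the pg element whose @start points at it."""
    pb_ids = {pb.get(XML_ID) for pb in root.iter("pb")}
    links = {}
    for pg in root.iter("pg"):
        target = pg.get("start", "")
        if target.startswith("#"):
            target = target[1:]  # drop the fragment marker
        if target in pb_ids:
            links[target] = pg
    return links


root = ET.fromstring(SAMPLE)
links = link_pages(root)
print(sorted(links))
```

A <pg> whose @start does not resolve to any <pb> is simply skipped here; a real tool would probably want to report it.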

Linking pages to images

Each <pg> element may contain one or more <graphic> elements, each of which represents a facsimile of the page.

e.g.
<!-- example showing page with multiple facsimile images -->
<pg start="#pb-1">
	<graphic url="p1.jpg" mimeType="image/jpeg" type="#access"/>
	<graphic url="p1-t.jpg" mimeType="image/jpeg" type="#thumbnail"/>
</pg>

Geometric transformations

The existing P5 element <graphic> refers to an image file with a @url attribute, and provides a @scalefactor attribute to scale the image uniformly (i.e. the same horizontally and vertically). 

This proposal defines a larger set of attributes (in a class "att.projection") to permit the encoding of a more general class of geometric projection: not only scaling but potentially also shearing (skewing), rotating, and translating (moving) the linked image. This would allow a <graphic> element to document the geometric relationship between a page and an image file which is a facsimile of that page. For example, if an image file is a scan of 2 pages, then 2 <graphic> elements might point at the same file, but pan and zoom to show either of the 2 pages contained in it.

The att.projection class might include a variety of attributes to allow for this transformation to be specified as a combination of rotation, scaling, etc, which would be more intuitive for encoders, or using matrix algebra, which might be a more convenient encoding scheme for using with image-markup tools, or for unusual cases. A perspective projection, for instance, might be used to correct the foreshortening effect in photographs of inscriptions on large monuments.

We may need to tweak this class a bit, and it would be a candidate for customisation in projects where the source material is unusual. It may be that there are better ways to represent these geometric transformations as XML attributes, such as a single attribute containing multiple tokens. For now, though, att.projection would define:
@scale
@rotation
@x (left edge)
@y (top edge)
@width
@height

e.g.
<graphic mimeType="image/jpeg" url="foo.jpg" x="0" y="0" width="100px" height="100px"/>
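
To make the relationship between the "intuitive" attributes and the matrix form more concrete, here is a small Python sketch. It rests on assumptions which the proposal does not settle: that @scale is applied first, then @rotation (taken to be in degrees), then a translation by (@x, @y), and that the result maps a point in page coordinates to a point in image coordinates. The function names and the two-page scan are invented for the example.

```python
import math


def projection_matrix(scale=1.0, rotation=0.0, x=0.0, y=0.0):
    """Return a 3x3 affine matrix: scale, then rotate (degrees), then translate.

    The order of operations is an assumption, not part of the proposal.
    """
    t = math.radians(rotation)
    c, s = math.cos(t) * scale, math.sin(t) * scale
    return [
        [c, -s, x],
        [s, c, y],
        [0.0, 0.0, 1.0],
    ]


def apply(matrix, px, py):
    """Map a page point (px, py) to image coordinates."""
    ix = matrix[0][0] * px + matrix[0][1] * py + matrix[0][2]
    iy = matrix[1][0] * px + matrix[1][1] * py + matrix[1][2]
    return ix, iy


# Hypothetical case: one scan holds two pages side by side, at 2 image
# pixels per page unit; the right-hand page starts 2000 pixels in.
left_page = projection_matrix(scale=2.0)
right_page = projection_matrix(scale=2.0, x=2000.0)

print(apply(left_page, 100, 50))   # (200.0, 100.0)
print(apply(right_page, 100, 50))  # (2200.0, 100.0)
```

Shear and perspective correction would need extra matrix entries which the six attributes above cannot express, which is one argument for also allowing the matrix to be given directly.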

Classifying graphics

Each graphic has a MIME content-type, but it can also be typed in other ways by linking it to one or more category elements defined in a taxonomy. We should define a base taxonomy including core concepts such as "access image", "thumbnail", "colour", "monochrome", etc, and allow users to refine these concepts by adding subordinate taxa to the taxonomy.

e.g.

<taxonomy xml:id="graph-tax">
	<bibl>
		<ref target="http://www.tei-c.org/P5/FAX.html#graph-tax">
			The TEI basic graphics taxonomy
		</ref>
	</bibl>
	<category xml:id="master">
		<catDesc>Master image files. Not for access.</catDesc>
	</category>
	<category xml:id="access">
		<catDesc>Regular access files.</catDesc>
	</category>
	<category xml:id="thumbnail">
		<catDesc>Small thumbnail image files</catDesc>
	</category>
	<category xml:id="ultraviolet">
		<catDesc>Image files taken under UV lighting</catDesc>
	</category>
	<category xml:id="microfilm">
		<catDesc>Image files scanned from microfilm</catDesc>
	</category>
</taxonomy>

...

<pg start="#pb-1">
	<graphic url="p1.tiff" mimeType="image/tiff" type="#master"/>
	<graphic url="p1.jpg" mimeType="image/jpeg" type="#access"/>
	<graphic url="p1-t.jpg" mimeType="image/jpeg" type="#thumbnail"/>
	<graphic url="../uv/p1.tiff" mimeType="image/tiff" type="#ultraviolet #master"/>
	<graphic url="../mf/p1.tiff" mimeType="image/tiff" type="#microfilm #master"/>
	<graphic url="../uv/p1.jpg" mimeType="image/jpeg" type="#ultraviolet #access"/>
	<graphic url="../mf/p1.jpg" mimeType="image/jpeg" type="#microfilm #access"/>
</pg>
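
A processor could use these pointers to choose the right image for a given purpose. The following Python sketch is only an illustration: it uses an abbreviated copy of the taxonomy above, a wrapper element <doc> invented for the example, and hypothetical helper names (categories_of, select_graphics). It checks that every @type pointer resolves to a declared category, then selects graphics which belong to all of the requested categories.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Abbreviated, un-namespaced sample based on the example above.
SAMPLE = """
<doc>
  <taxonomy xml:id="graph-tax">
    <category xml:id="master"><catDesc>Master image files.</catDesc></category>
    <category xml:id="access"><catDesc>Regular access files.</catDesc></category>
    <category xml:id="ultraviolet"><catDesc>UV lighting</catDesc></category>
  </taxonomy>
  <pg start="#pb-1">
    <graphic url="p1.tiff" mimeType="image/tiff" type="#master"/>
    <graphic url="p1.jpg" mimeType="image/jpeg" type="#access"/>
    <graphic url="../uv/p1.tiff" mimeType="image/tiff" type="#ultraviolet #master"/>
    <graphic url="../uv/p1.jpg" mimeType="image/jpeg" type="#ultraviolet #access"/>
  </pg>
</doc>
"""


def categories_of(graphic):
    """Return the set of category ids a graphic's @type points at."""
    return {token.lstrip("#") for token in graphic.get("type", "").split()}


def select_graphics(pg, *wanted):
    """Return URLs of graphics in pg that belong to every wanted category."""
    wanted = set(wanted)
    return [g.get("url") for g in pg.iter("graphic") if wanted <= categories_of(g)]


root = ET.fromstring(SAMPLE)
known = {c.get(XML_ID) for c in root.iter("category")}
pg = root.find("pg")

# Pointers that do not resolve to any declared category.
unresolved = [t for g in pg.iter("graphic") for t in categories_of(g) if t not in known]
print(unresolved)
print(select_graphics(pg, "ultraviolet", "access"))
print(select_graphics(pg, "master"))
```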

Linking units of text to regions within pages

Each unit of text may be linked to a geometric region of a page. The coordinates of this region are given in the coordinate system of the page. In turn, this coordinate system is independently related to the coordinate systems of each facsimile image of that page, by the att.projection attributes of the <graphic> element of each facsimile image. If the att.projection attributes of a <graphic> were to specify an identity transformation (e.g. if these attributes were all empty) then the region coordinates would be interpreted as defining a region in terms of the coordinate space of the <graphic>. 

In general each unit of text falls within a region of a page. Any element whose position on a page it is desirable to encode should therefore be a member of the att.projection class so that it has the geometric attributes. The coordinate space of the element would be relative to the page on which the element falls. e.g.

<pb xml:id="pb-1"/>
<seg x="100" y="50" width="20" height="10">foo</seg>
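
As an illustration of the kind of processing this enables (for instance, finding where a word occurs so that it can be highlighted in a facsimile), here is a minimal Python sketch. It assumes an un-namespaced sample, that every located element carries @x and @y, and that the governing page is simply the most recent <pb> in document order; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample extending the example above to two pages.
SAMPLE = """
<body>
  <pb xml:id="pb-1"/>
  <p><seg x="100" y="50" width="20" height="10">foo</seg> and
     <seg x="130" y="50" width="20" height="10">bar</seg></p>
  <pb xml:id="pb-2"/>
  <p><seg x="10" y="15" width="30" height="10">foo</seg></p>
</body>
"""


def regions_by_page(root):
    """Yield (page id, text, (left, top, right, bottom)) for each located element."""
    current_page = None
    for el in root.iter():  # iter() walks the tree in document order
        if el.tag == "pb":
            current_page = el.get(XML_ID)
        elif el.get("x") is not None and el.get("y") is not None:
            x, y = float(el.get("x")), float(el.get("y"))
            w, h = float(el.get("width", 0)), float(el.get("height", 0))
            yield current_page, (el.text or "").strip(), (x, y, x + w, y + h)


def find_word(root, word):
    """Return (page id, box) pairs for every located element whose text is word."""
    return [(page, box) for page, text, box in regions_by_page(root) if text == word]


root = ET.fromstring(SAMPLE)
print(find_word(root, "foo"))
```

The boxes returned are in page coordinates; they would still have to be passed through the att.projection attributes of a particular <graphic> before being drawn on that image.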

In a variorum edition, a unit of text may appear in multiple editions, and hence it must relate to distinct regions of distinct pages from each edition. A given piece of text may appear at the top of page 2 in one edition, and at the bottom of page 1 in another, so two regions must be defined separately, one for each edition. Each region could then be encoded using a special-purpose <region> element, which would be a child of the <pg> representing the page on which it falls, and would be linked by @corresp to the element containing the unit of text.

Alternatively, these regions could be encoded multiply (parallel segmentation, using <app> and <rdg>), with each <rdg> having its own location. The coordinate space of each <rdg> would be relative to the <pg> corresponding to the <pb> which corresponds to the <rdg> (i.e. the preceding <pb> whose @ed matches that of the <rdg>).

e.g.
<pb n="1" xml:id="pb-A-1" ed="A"/>
...
<pb n="2" xml:id="pb-B-2" ed="B"/>
...
<app>
	<rdg ed="A" x="100" y="100">foo</rdg>
	<rdg ed="B" x="200" y="100">bar</rdg>
</app>
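
The rule "the preceding pb whose @ed matches that of the rdg" can be sketched in Python as follows. This is only an illustration under stated assumptions: the sample is un-namespaced, coordinate attributes are omitted because the code does not read them, and the function name pages_for_readings is hypothetical.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: edition A turns a page between the two <app>s,
# edition B does not.
SAMPLE = """
<body>
  <pb n="1" xml:id="pb-A-1" ed="A"/>
  <pb n="2" xml:id="pb-B-2" ed="B"/>
  <app>
    <rdg ed="A">foo</rdg>
    <rdg ed="B">bar</rdg>
  </app>
  <pb n="2" xml:id="pb-A-2" ed="A"/>
  <app>
    <rdg ed="A">baz</rdg>
    <rdg ed="B">baz</rdg>
  </app>
</body>
"""


def pages_for_readings(root):
    """Pair each rdg with the id of the latest preceding pb of the same edition."""
    latest = {}  # edition sigil -> xml:id of the most recent pb seen
    pairs = []
    for el in root.iter():  # document order
        if el.tag == "pb":
            latest[el.get("ed")] = el.get(XML_ID)
        elif el.tag == "rdg":
            ed = el.get("ed")
            pairs.append((el.text, ed, latest.get(ed)))
    return pairs


root = ET.fromstring(SAMPLE)
for text, ed, page in pages_for_readings(root):
    print(text, ed, page)
```

A <rdg> with no preceding <pb> for its edition is paired with None, which a real processor would presumably treat as an encoding error.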