[tei-council] tei stemma model

Daniel O'Donnell daniel.odonnell at uleth.ca
Fri Jul 20 15:26:43 EDT 2007


Having followed this discussion, I'd say:

a) I think David has done what we asked: he has shown how we could
represent a stemma within the existing TEI plus some slight modifications.

b) Despite the advantages enunciated for a heavily packed @n, I really
think Conal and James are right that using ids for ids is the more
TEI-like, and even the more XML-like, approach. David's issue with
repetition of sigla-as-id in other places can be addressed in other ways
within the TEI if necessary. Even a <linkGrp> could keep track of them.

c) I like the ID label.

d) I would be less happy with a constraint on contaminates that
restricted the pointer to one idref: while it is true that
contaminations from the same tradition/manuscript can be different
impulses, they can also repeat themselves identically. I think we need
to be able to use either multiple idrefs or multiple contaminates.

This raises the interesting question of whether it is worthwhile trying
to capture information I know about each act of contamination: if I am
going to have multiple instances, presumably I am doing this for a
reason. Is there space for a note or something on this element?

e) For 5.0, I wonder if a basic version of David's proposal as modified
by Conal and James is not eminently doable. I think there may be
additional implications to contaminates that we won't see or solve in
the next week, but I think we definitely have the makings of an
extremely useful development.
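
On (d), a rough sketch in Python of how a processor might treat the two
forms alike. The element name <contaminates>, its @target attribute, and
the <note> child are my assumptions for illustration only; none of this
is agreed syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: <contaminates>, @target and <note> are assumed names,
# not agreed TEI syntax. The first element packs two idrefs into one pointer;
# the second repeats a source and carries a note about that particular act.
SAMPLE = """
<eTree xml:id="delta">
  <contaminates target="#gamma #R"/>
  <contaminates target="#gamma"><note>second, later correction</note></contaminates>
</eTree>
"""

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def contamination_edges(xml_text):
    """Return (source, contaminated node, note) triples, one per act."""
    node = ET.fromstring(xml_text.strip())
    node_id = node.get(XML_ID)
    edges = []
    for element in node.findall("contaminates"):
        note = element.findtext("note")  # None when no note is given
        # One pointer may hold several whitespace-separated idrefs.
        for ref in element.get("target", "").split():
            edges.append((ref.lstrip("#"), node_id, note))
    return edges
```

Both forms flatten to the same kind of edge, so allowing them side by
side costs little; only the repeated-element form has anywhere to hang a
note about an individual act of contamination.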

On Wed, 2007-07-18 at 13:26 -0400, David J Birnbaum wrote:
> Dear Lou (cc James, Council),
> 
> I may have translated the assignment from "describe a stemma using only 
> the limited set of tools you have in front of you" into "describe a 
> stemma in the best way possible, and then we'll see how to integrate 
> that into the TEI." In any case, I was inclined toward an <eTree>-like 
> solution because contamination both is and isn't parentage, which led me 
> to conclude not that a contaminated stemma isn't a tree (although that's 
> a fair way to look at it), but that it's a tree with one other type of 
> relationship tacked on.
> 
> The existing TEI <graph>, with <node> and <arc>, could describe a 
> stemma, since a stemma is a special type of graph and the existing model 
> is more than sufficiently powerful, but I think it's a clumsy tool for 
> the job. If we consider what we might want to do with a stemma other 
> than render it, one possibility is that we might want to use it for the 
> semi-automated evaluation of variation. I think this type of possibility 
> is the most exciting aspect of the whole enterprise, and the sort of 
> thing that makes humanities computing interesting even to philologists 
> who may not otherwise be interested in computing.
> 
> For example, suppose we have the stemma in my sample (you've all 
> memorized it by now, right? :-) ) and the following variation in an edition:
> 
>     <app>
>         <rdg wit="L t">Chocolate</rdg>
>         <rdg wit="R A I X">Peanut butter</rdg>
>     </app>
> 
> According to stemmatic principles, the reading in alpha was "Peanut 
> butter," and "Chocolate" was introduced in delta. If, on the other hand, 
> we have:
> 
>     <app>
>         <rdg wit="L t R A">Chocolate</rdg>
>         <rdg wit="I X">Peanut butter</rdg>
>     </app>
> 
> the "vote" is still two-to-four, but here we have a crux, since one 
> reading goes back to beta and the other to gamma, and the stemma doesn't 
> help us determine which goes back, in turn, to alpha.
> 
> If we've taken an <eTree> approach, we can examine the text() nodes of 
> the <rdg> elements for each <app> element and for each <rdg> element 
> find the youngest common parent (in the stemma) of the manuscripts cited 
> in the @wit attribute. If those common parents are at the same depth in 
> the stemma, we have a crux, and the stemma cannot resolve which is 
> primary. If, however, the youngest common parent of one is deeper in the 
> tree than the youngest common parent of the other, the former is the 
> error and the latter can be projected back to alpha.
> 
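For what it is worth, the comparison described above can be sketched
outside XSLT as well. The following is a minimal Python sketch, not
David's implementation. The stemma is a hypothetical one that I have
reconstructed only so that it agrees with the two <app> examples (alpha
over beta and gamma, delta under beta); the names PARENT, ancestors,
depth, youngest_common_parent and evaluate are mine. It ignores
contamination and assumes exactly two readings.

```python
# Hypothetical stemma, reconstructed only to agree with the two <app>
# examples in the quoted message; it is not David's actual sample.
# Each key is a witness or hyparchetype, each value is its parent.
PARENT = {
    "beta": "alpha", "gamma": "alpha",
    "delta": "beta", "R": "beta", "A": "beta",
    "L": "delta", "t": "delta",
    "I": "gamma", "X": "gamma",
}

def ancestors(node):
    """Return the path from node up to the root, node first."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def depth(node):
    """Number of steps between node and the root (the root has depth 0)."""
    return len(ancestors(node)) - 1

def youngest_common_parent(witnesses):
    """Deepest node that lies on the path to the root of every witness."""
    paths = [ancestors(w) for w in witnesses]
    shared = set(paths[0]).intersection(*paths[1:])
    return max(shared, key=depth)

def evaluate(readings):
    """readings maps reading text to a list of sigla (two readings assumed).

    Returns the reading that can be projected back to the archetype, or
    None when both groups go back to nodes of equal depth (a crux).
    """
    (text_a, wits_a), (text_b, wits_b) = readings.items()
    depth_a = depth(youngest_common_parent(wits_a))
    depth_b = depth(youngest_common_parent(wits_b))
    if depth_a == depth_b:
        return None
    # The group whose common parent sits deeper is treated as the innovation.
    return text_b if depth_a > depth_b else text_a
```

With this assumed stemma, evaluate({"Chocolate": ["L", "t"], "Peanut
butter": ["R", "A", "I", "X"]}) returns "Peanut butter", and the second
example, with "L t R A" against "I X", returns None for the crux.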
> This is difficult but manageable XSLT/XPath programming with an 
> <eTree>-like model. With the <graph>/<node>/<arc> model, on the other 
> hand, it becomes much more complicated. It isn't impossible (after all, 
> the <graph>/<node>/<arc> model can be transformed into the <eTree> 
> model), but if we were going to try to build something like this for 
> production, I think we agree on which model has the greater engineering 
> advantages (by far). I'd like to see the TEI Guidelines say something 
> like "here's The Best Way to represent a stemma because in addition to 
> describing the graph [which one could do in a variety of ways], it also 
> lets one automate some of the analysis of variation, and that's part of 
> what humanities computing is all about."
> 
> With this in mind, I see no reason not to enhance the <eTree> model with 
> a <contaminates> feature. It doesn't get in the way for those who are 
> modeling true trees, and it lets us use the <eTree> structure for 
> modeling stemmata, which I think makes those models much more useful for 
> textual analysis than would be the case under the <graph>/<node>/<arc> 
> approach.
> 
> Cheers,
> 
> David
> 
> Lou Burnard wrote:
> > Unless I'm mistaken, when this investigation of how to represent ms 
> > stemma was first proposed, it was mostly as an exercise to see how 
> > applicable the existing TEI model for trees and graphs is. I may be 
> > inventing this, but my recollection is that the conversation went 
> > something like
> > x: we've got this lovely way of representing graphs and networks and 
> > stuff
> > y: why would anyone ever want to use such a thing
> > x: well you could use it to represent, err, airplane networks, or family 
> > trees, or um,
> > y: or MANUSCRIPT TRADITIONS! wow that's really innerestink
> >
> > So the exercise intended for David was really to test the capabilities 
> > of the existing TEI model, which I think his work does rather well.
> >
> > I don't have anything to add to what James has already said about
> > labels and the use of existing TEI styles for identification and 
> > linkage. I do however wish to confess to a feeling of unease about 
> > contamination.
> >
> > As I understand it, this is a way of saying that a given node has one 
> > or more parent-nodes *other* than the one it's actually attached to -- 
> > which of course means that you're not looking at a tree any more, but 
> > a directed acyclic graph. So it cannot be represented using <tree> or 
> > <eTree> -- you need to use the more general <graph>.
> >
> > I would be interested to see how David's example would play as a <graph>
> >
> > Lou
> >
> 
-- 
Daniel Paul O'Donnell, PhD
Director, Digital Medievalist Project http://www.digitalmedievalist.org/
Associate Professor and Chair, Department of English
University of Lethbridge
Lethbridge AB T1K 3M4
Canada
Vox: +1 403 329-2378
Fax: +1 403 382-7191



