12.0616 TEI & the individual scholar; research = display

Humanist Discussion Group (humanist@kcl.ac.uk)
Thu, 6 May 1999 19:03:03 +0100 (BST)

Humanist Discussion Group, Vol. 12, No. 616.
Centre for Computing in the Humanities, King's College London
<http://www.princeton.edu/~mccarty/humanist/>
<http://www.kcl.ac.uk/humanities/cch/humanist/>

[1] From: John Unsworth <jmu2m@virginia.edu> (41)
Subject: Re: 12.0610 TEI and the individual scholar

[2] From: C M Sperberg-McQueen <cmsmcq@acm.org> (87)
Subject: Re: 12.0610 TEI and the individual scholar

[3] From: Wendell Piez <wapiez@mulberrytech.com> (56)
Subject: Re: 12.0610 TEI and the individual scholar

[4] From: John Unsworth <jmu2m@virginia.edu> (17)
Subject: Re: 12.0610 TEI and the individual scholar

[5] From: Patrick Durusau <pdurusau@emory.edu> (100)
Subject: Re: Research is Display

[6] From: Domenico Fiormonte <itadfp@srv0.arts.ed.ac.uk> (22)
Subject: Re: browser

--[1]------------------------------------------------------------------
Date: Thu, 06 May 1999 18:56:34 +0100
From: John Unsworth <jmu2m@virginia.edu>
Subject: Re: 12.0610 TEI and the individual scholar

At 06:41 PM 5/5/1999 +0100, Charles Faulhaber wrote:

>--[1]------------------------------------------------------------------
> Date: Wed, 05 May 1999 18:14:21 +0100
> From: "by way of Humanist <humanist@kcl.ac.uk>"
><cbf@socrates.berkeley.edu>
> >
>Can someone give me examples of _individual scholars_ as opposed to
>large-scale projects, that are using TEI for the production of either
>electronic or paper editions?

We have a number of projects at the Institute for Advanced Technology in
the Humanities which are headed up by individual scholars and are using TEI
in the production of electronic editions or, more broadly, thematic
research archives:

--Hoyt Duggan's Piers Plowman project
--Michael Levenson's "Monuments and Dust" (Victorian London) project
--David Germano's "Nyingma Tantra Research Archives"
--Steven Railton's "Uncle Tom's Cabin and American Culture project"
--Susan Schreibman's Thomas MacGreevy Archive
--Lloyd Benson's "American Newspaper Editorials in the Secession Era"
--Ken Price/Ed Folsom's Whitman Archive
--Martha Nell Smith et al., The Dickinson Archive
--Frank Grizzard's "Documentary History of the Construction of the
Buildings at the University of Virginia, 1817-1828."
--Richard Guy Wilson's "The Architecture of Thomas Jefferson"
--Ed Ayers' "The Valley of the Shadow"
--Gary Anderson's "The Life of Adam and Eve: The Biblical Story in Judaism
and Christianity"
--Deborah Parker's "The World of Dante"

Granted, many of these have extended TEI in some way, but they're still
TEI-conformant. We have some others (a much shorter list) that have
developed their own DTDs:

--Eaves/Essick/Viscomi's "The William Blake Archive"
--Jerome McGann's "The Rossetti Archive"
--Michael Satlow's "Inscriptions from the Land of Israel"
--Elizabeth Meyers' "New Interpretive Study of the Evolution of Slavery in
Hellenistic and Roman Greece"
--Katherine Rinne, "The Waters of the City Rome"

And one using EAD:

--Marion Roberts, "The Salisbury Project"

John Unsworth

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
http://jefferson.village.virginia.edu/~jmu2m/

--[2]------------------------------------------------------------------
Date: Thu, 06 May 1999 18:57:23 +0100
From: C M Sperberg-McQueen <cmsmcq@acm.org>
Subject: Re: 12.0610 TEI and the individual scholar

On Wed, 5 May 1999 18:41:24 +0100 (BST), Charles Faulhaber asks

>Can someone give me examples of _individual scholars_ as opposed to
>large-scale projects, that are using TEI for the production of either
>electronic or paper editions?

The best place I know of to get an overview of TEI users is the TEI
applications page at http://www.uic.edu/orgs/tei/app/ -- this page
doesn't give a direct answer to the question, because it does not
distinguish projects by size. Working through it, I see between ten
and fifteen projects (of 64 listed) which I believe are individual
faculty research activities; I may be wrong.

The examples I am most certain about include, in alphabetical order of
their entries in the applications page, Karl Uitti's work on Chretien
de Troyes (listed as the Charrette Project), the corpus of spoken
Japanese being created by Syun Tutiya and collaborators at Chiba
University (listed as the Chiba Corpus of Map Task Dialogues in
Japanese) -- Prof. Tutiya is not working alone, so I am not sure if
this actually qualifies, but it is not a large-scale funded project --
Hoyt Duggan's edition of Piers Plowman (Piers Plowman Electronic
Archive), and Lew Barth's edition of Pirqe Rabbi Eliezer (Pirque Rabbi
Eliezer Electronic Text Editing Project).

Of course the way we have asked for information may have led some
individuals not to tell us about their use of TEI, on the grounds that
they are individuals, not 'projects'. But it seems to me likely that
the rate of adoption of SGML in general (other than HTML), and the TEI in
particular, is much higher among projects involving more than one
person (and often outside funding) than among individuals doing
whatever they choose in the research time supported by their
institution. While this is a source of continuing frustration to
those of us who believe we know there are better ways to do things
than those our colleagues stubbornly persist in choosing, it is not
illogical. It could have been foreseen and predicted.

And indeed it was foreseen and predicted -- I scandalized at least one
member of the TEI Advisory Board by saying I expected large organized
projects to adopt the TEI first, and individuals to follow only later.
My reasons for expecting this are simply stated:

(1) The TEI is (or was when it was published) new technology aimed
at making it easier to create software-independent data, and thus
at making it easier to reuse data.

(2) All new technology imposes a cost to adopters; projects with a
designated 'technical' person can bear that cost more easily than an
individual. (So I suspect that full-disk tape backups, custom
programming, and production use of VRML are more frequent among
large-scale projects than among individual faculty members, too.)

(3) The benefits of the TEI, especially reuse by other people, are
of course most obviously critical to those creating resources
*intended* to be used by other people. They are naturally less
imposing to someone interested first of all in making the data
usable for their own purposes.

(4) So the benefits of TEI use are typically higher for projects
than for individuals, and the costs are more manageable. Both
projects and individuals exhibit enough rationality that this
difference in the cost/benefit ratio will be reflected in a
difference in rate of adoption.

Individual scholars will use the TEI in greater numbers when there is
more software around that will allow them to do things they really
want to do, and that either produces the data in TEI form, or uses TEI
form internally and requires TEI form for its input. One thing I did
not foresee was that so many of the people who develop software for
humanities research would receive the TEI so coolly, and would believe
that it is simpler to invent their own scheme for text encoding than
to use an existing one. Those who do develop simple software that
uses the TEI (myself, for example) have not made any great effort to
make that software public. It's not finished, it is too specific to
my own needs, it's buggy, I want to provide a better user interface,
no one else has an interpreter for the programming language I use, ...

Perhaps the recent discussions about the need for a new generation
of software for text analysis will lead to an improvement of this
situation. Let us all hope so.

Of course, it's also possible that humanists simply care a lot less
about reusability, sharing of resources, and the electronic
preservation of the cultural heritage than was thought when the TEI
was created. (But of course, most of the people at the Poughkeepsie
Conference in 1987 were in fact involved in large-scale projects, not
individual unfunded research. Biased sample, perhaps.)

A cynic might observe that if the TEI's goal was to make it possible
to create reusable data with markup that allowed researchers to do the
work they were most interested in, then the TEI has already succeeded:
it is now in fact possible to do that. The fact that so few humanists
take advantage of the TEI should (the cynic might continue) be taken
not as an indictment of the TEI but of the humanists who work with
computers.

It's late, and I'm tired and worn down with problems in other
projects, but I resist the cynic's interpretation. Progress seems
slow, but it's only five years since the Guidelines were published.
Five years is not, should not be, a long time to humanists.

-C. M. Sperberg-McQueen
Senior Research Programmer, University of Illinois at Chicago
Co-Editor, ACH/ACL/ALLC Text Encoding Initiative

--[3]------------------------------------------------------------------
Date: Thu, 06 May 1999 18:58:27 +0100
From: Wendell Piez <wapiez@mulberrytech.com>
Subject: Re: 12.0610 TEI and the individual scholar

Willard and HUMANIST:

I hope you don't mind getting spammed by me again, but this thread
entwines itself very close to my concerns.

Charles Faulhaber asks about individual scholars and the TEI. I'm sure
we'll hear from, or of them: there are a few. If there's a relative
scarcity, however, I think that's at least as much for cultural reasons
as it is due to the technology and tools. While virtually every scholar
in the Humanities depends on electronic text (name your favorite
word-processor or desktop publisher), it seems that those who become
interested in the technologies per se of production, publication,
"reading" itself -- how to make or use them better, serving long-term
scholarly interests -- are soon branded "techies" and thus marginalized
in ways both subtle and not-so. Not to mention the whole new set of
questions with their publishers (who can be as conservative as anyone,
and for plenty of good reasons).

[That's my take on that. Readers, please skip the rest of this message
if you're not following the state of the art in SGML/XML encoding....]

As for Prof. Vanhoutte's question about the feasibility of actually
browsing TEI: at this stage (May 1999), he basically has a choice.

(1) Express his requirements in the semantics of HTML (with scripting
for things like targeting different windows) and down-convert from
TEI into such HTML using such tools as Jade (free) and a DSSSL style
sheet. This can be done today, although DSSSL is a bear to learn.
Advantages: can then serve to HTML browsers; software is free.
Disadvantages: may have to sacrifice functionality to the limitations of
HTML; also, the significant benefits he is getting in his TEI source
code, for creation, quality control, maintenance, and the avoidance of
"application lock-in," will be invisible to all but the
technology-savvy.

(2) Hang in there and wait for the next generation of XML-capable
browsers. We need at least the linking and style specifications for XML
(XLink/XPointer and XSL, respectively) to mature somewhat more before
they are properly supported in stable products. Even then, he may still
have to do a "down" or "cross" conversion to get his texts to display
with all the functionalities he wants (using tools that may or may not be
as easy or as cheap as the current ones), but at least he shouldn't have
to completely brutalize his TEI markup to do so.
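
A rough sketch of what option (1) amounts to, written in Python rather than Jade and DSSSL. It assumes a tiny TEI-like fragment in which each <p> carries an id and a corresp attribute; the sample text, the function name tei_to_html, and the "version-" window names are all hypothetical. Each corresp target becomes an HTML link whose target attribute asks the browser for a separate named window.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical TEI-like fragment: each <p> has an id and a corresp
# attribute naming the matching paragraphs in the other versions.
SAMPLE = """<text>
  <div type="version" n="A">
    <p id="a1" corresp="b1">First version of the paragraph.</p>
  </div>
  <div type="version" n="B">
    <p id="b1" corresp="a1">Second version of the paragraph.</p>
  </div>
</text>"""


def tei_to_html(source: str) -> str:
    """Down-convert the fragment to HTML paragraphs with window-targeted links."""
    root = ET.fromstring(source)
    out = []
    for p in root.iter("p"):
        parts = [escape("".join(p.itertext()).strip())]
        for ref in p.get("corresp", "").split():
            safe = escape(ref)
            # target="..." asks the browser to open the link in a named window
            parts.append(f'<a href="#{safe}" target="version-{safe}">[{safe}]</a>')
        out.append(f'<p id="{escape(p.get("id", ""))}">{" ".join(parts)}</p>')
    return "\n".join(out)


print(tei_to_html(SAMPLE))
```

Note what is lost: the HTML records only "a link to somewhere," not that the two paragraphs correspond as versions of one another, which is the sacrifice of functionality mentioned under option (1).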

In the meantime, he might be playing with Fujitsu's HyBrick browser
(which I haven't used but which is said to support DSSSL and most of the
draft XLink) and/or the prototype XML browsers (Internet Explorer and
one or two others). There are various implementation perils in these
several directions, of course, and support is usually thin.

But that's life at the cutting edge..."no one understands!" On the other
hand, as I keep telling myself, there's no reason all of this has to
happen fast ... isn't it a systems theory principle (I read somewhere)
that something that grows slowly, declines slowly?

Respectfully,
Wendell

======================================================================
Wendell Piez mailto:wapiez@mulberrytech.com
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9635
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
----------------------------------------------------------------------
Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================

--[4]------------------------------------------------------------------
Date: Thu, 06 May 1999 18:58:57 +0100
From: John Unsworth <jmu2m@virginia.edu>
Subject: Re: 12.0610 TEI and the individual scholar

At 06:41 PM 5/5/1999 +0100, Edward Vanhoutte wrote:

>The problem now is that no browser I know of (Panopro, Multidoc) has the
>ability to show all of the versions in different windows on the screen
>when requested for. Trials with the CORRESP attribute to the <P> element
>resulted in the browser jumping to the corresponding paragraphs in both
>the other versions of the text, but what I really want is for the
>browser to open a new window in which the corresponding texts can be
>viewed.

We've been working on software that does this with sgml-tagged Unicode, and
we're continuing to develop that. If you're interested, have a look at

http://www.iath.virginia.edu/babble/

and--if you have the patience to work on an academic time-scale (in which
progress will be made, but slowly) we'd be happy to have you using the
software and advising on its development--in particular, I expect that
we'll be concentrating on this development effort in the fall.

John Unsworth
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
http://jefferson.village.virginia.edu/~jmu2m/

--[5]------------------------------------------------------------------
Date: Thu, 06 May 1999 18:59:24 +0100
From: Patrick Durusau <pdurusau@emory.edu>
Subject: Re: Research is Display

Greetings,

Edward Vanhoutte wrote:

<deletions>

> As for the markup, TEI (Lite) provides me with all the markup facilities
> I need. As for the software side, I'm in deep trouble. There does not
> seem to be an adequate display mechanism which can convince the academic
> community that what I have been doing (on a full time basis) for the
> last one and a half years isn't just a loss of money.

Display is one component of the use of TEI and, as I noted in an earlier
post, an important one. There are other academic research uses that are
served by finely grained markup that do not require user-friendly display,
such as linguistic analysis, concordances and the like. If you are trying
to sell the project and TEI markup to a non-technical audience then display
takes on a greater role. I mention this just as a caution not to equate
display with actual use of a text for research purposes.
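
As an illustration of such a non-display use, here is a minimal sketch of a keyword-in-context concordance built directly from the markup, in Python. The sample fragment, the function name concordance, and the width parameter are hypothetical and stand in for whatever a real project would use.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical TEI-like fragment with identified paragraphs.
SAMPLE = """<text>
  <p id="p1">The browser jumps to the corresponding paragraph.</p>
  <p id="p2">A new window should show the corresponding text.</p>
</text>"""


def concordance(source: str, keyword: str, width: int = 2):
    """Return (paragraph id, left context, keyword, right context) tuples."""
    hits = []
    for p in ET.fromstring(source).iter("p"):
        words = re.findall(r"\w+", "".join(p.itertext()))
        for i, word in enumerate(words):
            if word.lower() == keyword.lower():
                left = " ".join(words[max(0, i - width):i])
                right = " ".join(words[i + 1:i + 1 + width])
                hits.append((p.get("id"), left, word, right))
    return hits


for pid, left, word, right in concordance(SAMPLE, "corresponding"):
    print(f"{pid}: {left} [{word}] {right}")
```

Because each hit carries the id of its paragraph, the result can be traced back to the encoded text without any display software at all.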

<deletions>

>
> The problem now is that no browser I know of (Panopro, Multidoc) has the
> ability to show all of the versions in different windows on the screen
> when requested for. Trials with the CORRESP attribute to the <P> element
> resulted in the browser jumping to the corresponding paragraphs in both
> the other versions of the text, but what I really want is for the
> browser to open a new window in which the corresponding texts can be
> viewed.

(I pass over your attempts to use Panopro and Multidoc without comment as I
have not looked at either package in some time. It may be possible to create
the multi-window effect you desire with either of these software packages.
My experience with Panopro was 3-4 years ago with an earlier version of the
software and not entirely positive both from the standpoint of implementation
of the SGML standard and/or product documentation/capabilities.)

Actually there are several other options that might meet your needs:

1. IBabble: A Synoptic Unicode Browser: http://www.iath.virginia.edu/babble/

Written in Java, IBabble (Institute for Advanced Technology in the
Humanities) is a tool that will display parallel texts in parallel windows
either as a standalone application or as a helper to Netscape.

2. Fujitsu's "HyBrick" SGML/XML Browser: http://www.fsc.fujitsu.com/hybrick/

HyBrick is a very sophisticated implementation of the HyTime standard (SGML
architectures and sophisticated linking mechanisms) and "includes a DSSSL
renderer and XLink/XPointer engine running on top of James Clark's SP and
Jade."

3. Link: an XML-XSL-XLL browser: http://pages.wooster.edu/ludwigj/xml/index.html

Link is a Java application that was written by Justin Ludwig as part of a
Senior Independent Study Project at the College of Wooster. It requires a
number of other modules (available free from other sources) but is a good
demonstration of the type of tools that are just on the horizon.

I have tried to list these options in order of increasing difficulty of use
by scholars who are not primarily computer users. I have omitted several
commercial options such as DynaText/DynaWeb (Inso Corporation) not because
they would not perform the requested task quite well but as large commercial
packages they require heavy technical support to make full use of their
capabilities.

<deletions>

>
> Does anyone have an idea of how this can be achieved in a scholarly
> integer way, for a demo in which I put the corresponding paragraphs in a
> nested notes architecture resulted exactly in what I wanted, but the
> result is mere 'bricolage'.
> If it proves to be impossible, I think text encoding is a lost case to
> the potential dense use it could have in textual criticism and the
> publication of electronic scholarly editions.
>

<deletions>

For the short range I would suggest taking a look at the IBabble package.
For the mid-range, 6 months to a year, I would suggest visiting Robin Cover's
SGML/XML site to remain aware of new software releases that may be of
interest to academics. (The SGML/XML Web Page:
http://www.oasis-open.org/cover/) Jon Bosak (Sun Microsystems) seemed to be
of the opinion at XTech '99 that we may see the new XLink/XPointer
specification this summer (XSL: Extensible Style Language was released quite
recently). This area is changing quite rapidly and tools that have long been
dreamed of are coming ever closer to being a reality.

If you have any programmers on staff or have access to programmers there
are several other SGML/XML packages I could suggest for your project off
list.

Patrick

--
Patrick Durusau
Information Technology Services
Scholars Press
pdurusau@emory.edu
Interim Manager, ITS

--[6]------------------------------------------------------------------
Date: Thu, 06 May 1999 18:59:42 +0100
From: Domenico Fiormonte <itadfp@srv0.arts.ed.ac.uk>
Subject: Re: browser

Hi Edward,

> The problem now is that no browser I know of (Panopro, Multidoc) has
> the ability to show all of the versions in different windows on the
> screen when requested for. Trials with the CORRESP attribute to the
> <P> element resulted in the browser jumping to the corresponding
> paragraphs in both the other versions of the text, but what I really
> want is for the browser to open a new window in which the
> corresponding texts can be viewed.

Have you come across the Digital Variants Browser? It's a prototype on
which two CS researchers at Goteborg's Viktoria Institute have been
working. If you go to:
http://www.viktoria.informatics.gu.se/groups/play/
and click on "Information visualization" and then "Previous projects",
you can read the paper that one of them presented to our seminar
"Computers, Literature and Philology"
(http://www.ed.ac.uk/~esit04/seminar.htm).

See also the original application, the Zoom Browser:
http://www.informatik.gu.se/~bjork/flipzoom/text/index.html

It seems to me that this browser does *exactly* what you need -- and they
might also be interested in developing something specific for your project.

Good luck!

-------------------------------------------------------------------------
Humanist Discussion Group
Information at <http://www.kcl.ac.uk/humanities/cch/humanist/>
<http://www.princeton.edu/~mccarty/humanist/>
=========================================================================