10.0844 TEI: rain, umbrellas, and dances

WILLARD MCCARTY (willard.mccarty@kcl.ac.uk)
Wed, 9 Apr 1997 21:54:52 +0100 (BST)

Humanist Discussion Group, Vol. 10, No. 844.
Center for Electronic Texts in the Humanities (Princeton/Rutgers)
Centre for Computing in the Humanities, King's College London
Information at http://www.princeton.edu/~mccarty/humanist/

[1] From: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU> (134)
Subject: Re: 10.0841 rain on the workshops

[2] From: Mavis Cournane <cournane@curia.ucc.ie> (36)
Subject: Re: 10.0841 rain on the workshops

[3] From: orlandi@rmcisadu.let.uniroma1.it (15)
Subject: Re: 10.0841 rain on the workshops

[4] From: Patrick Durusau <pdurusau@emory.edu> (111)
Subject: Re: 10.0841 rain on the workshops

[5] From: michelle stanton <vcoao0dj@email.csun.edu> (5)
Subject: Re: 10.0841 rain on the workshops

[6] From: Francois Lachance <lachance@chass.utoronto.ca> (38)
Subject: precipitation control

--[1]----------------------------------------------------------------
Date: Tue, 08 Apr 97 18:28:14 CDT
From: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>
Subject: Re: 10.0841 rain on the workshops

On Tue, 8 Apr 1997 21:56:06 +0100 (BST) my esteemed colleague and
fellow resident of the great County of Cook, Mark Olsen, said:

>I am surprised that so many of the recently announced workshops and
>institutes seem to make the teaching of encoding, and particularly
>TEI encoding, a primary objective. The TEI, it seems to me, is being
>treated as if it is an accepted standard. This is not at all clear.

If Dr. Olsen is surprised, I think it's because he's making an
elementary logical error, with the predictable consequences. (The
pedant in me burns to say, with Dr. Johnson, that he is not, in any
case, surprised at all, but astonished.) The curriculum of these
workshops and institutes does not, in general, depend on what the
organizers take to be popular belief, but on what the organizers believe
participants can usefully learn. So I think it's a grave error for Dr.
Olsen to attempt to reason from the announced curricula to the
organizers' beliefs about what is and is not 'accepted' by others.

If the courses at (alphabetically) CETH, Michigan, New Brunswick,
Oxford, and Virginia all plan to teach TEI and SGML, it's far more
likely that this is because the organizers believe, after some
experience with electronic texts, that SGML and the TEI offer the best
method of encoding texts for research purposes. They are not "treating"
the TEI "as if" it were an accepted standard. They have accepted it,
and are teaching it because they think it's worth knowing about.

Of course, it's true that few people will offer a course on a topic if
they think no one is likely to attend. The experience of the last few
years suggests, however, that there is ample demand for instruction in
the application of SGML and TEI, and I expect that the courses recently
announced on Humanist will fill up, just as course after course after
course on this subject has done over the past several years.

Not everyone involved in making electronic texts available for research
shares the view that SGML and the TEI offer the best way to go: in
addition to Dr. Olsen, readers of Humanist will remember the remarks of
Ian Lancashire in a discussion on this topic in late 1995. But the TEI
'applications page' (http://www.tei.uic/orgs/tei/app), a list of
projects who have told us that they are, in some sense, using the TEI,
has over fifty entries. Some of these are very small projects, no match
in size for Mark Olsen's ARTFL. Some are several times the size of
ARTFL. Some use TEI extensively, some only tangentially. But the
steady growth of the list, and the steady adoption of the TEI scheme by
new projects, suggests that despite Dr. Olsen's reservations, many
people have examined the TEI and the alternatives (rolling their own
scheme, or adopting any of the hundreds or thousands of other extant
schemes, dozens of which are actually publicly documented) and decided
to adopt the TEI for their own work.

For better or worse, I am unaware of any other encoding scheme in such
wide use for research using texts. More to the point, I am unaware
of any other encoding scheme which *deserves* to be in wider use.

>In fact, there are serious design and implementation questions
>that can suggest that TEI is neither an acceptable standard nor an
>implementable standard.
>
>Last year, I gave a rather inflammatory talk against the TEI
>guidelines at the ALLC/ACH conference in Bergen. ...

It would be interesting to hear about these serious design and
implementation issues sometime.

I had hoped they would be brought forward in Bergen, but even on their
face, the arguments Dr. Olsen made there had nothing to do with the
conclusions he would like to draw from them. I pointed this out
in Bergen, in a talk which I have not yet put on the Web, but
(fired by the example of my South Side colleague) will do soon.

>Given the facts that:
>
>-- the TEI Guidelines have never been subjected to significant
> peer review,

Rubbish. The TEI Guidelines were drafted by work groups drawn from the
relevant use communities, which worked in public; they have been
available for public comment (in draft or final form) since 1990; they
were reviewed by an Advisory Board composed of representatives of
relevant scholarly and professional societies (at least, the member
organizations agreed to review them, and sent representatives to the
Advisory Board meetings); TEI funding proposals have gone through the
normal review process at NEH several times since then, always with
success. Over fifty projects have adopted them for some part of their
work.

If you want to criticize the TEI or the Guidelines, feel free to do so.
If you want to start a career as a fiction writer, do that. But don't
confuse the two.

>-- that there are many in the field who consider the current Guidelines
> to be unworkable for many reasons, and

They, like everyone else, are invited to comment on the Guidelines,
to suggest work items for their improvement, or to help revise them.

>-- that there is a dearth of analytical software (other than systems
> to verify that a document is TEI-conformant, for what that is
> worth),

There is a dearth of analytical software for any format one cares
to name, including HTML (possible exception: the TLG beta format;
make enough data available and the software will appear).

Come to that, there is a dearth of analytical software for the ARTFL
data format; does that count as a sign that it is deeply, fundamentally
flawed and should be replaced?

>I hope that the various course instructors will at the very least
>inform participants that the TEI is NOT the only, or even best,
>way to encode documents, and that encoding itself -- a labor intensive
>activity -- may have far more limited results than hoped. As I

I hope so too.

Anyone who comes away from any TEI course thinking that the TEI is
the 'only way to encode documents' has not been paying attention.
Many people do think it's the best -- if not the best imaginable, then
at least the best available.

But if anyone thinks there *is* any one 'best' way to encode documents,
in any absolute sense, I think they invite the suspicion that they
haven't yet managed to think seriously about the subject for longer than
ninety seconds at a time. Two minutes' thought usually suffices to
persuade anyone that there is no single unique best way to represent a
document, and cannot be. But of course A. E. Housman was quite right:
thinking is hard, and two minutes is a long time.

>suggested in Bergen, I do find the effort by TEI proponents to teach
>a specification that has not been sufficiently tested to be a risky
>endeavor because we are asking people to spend significant amounts
>of time and money performing tasks that may not live up to the
>advance billings.

I've been listening for several years, and I have yet to hear anyone
at any course on electronic texts ask anything of the kind. On the
contrary, the first and most insistent request is that the encoders
of texts make up their own minds about what is and what is not worth
encoding; the TEI is not there to serve as anyone's excuse for avoiding
independent thought.

As to the risk: does anyone think research with texts is not
inherently a risky endeavor? If one knows in advance what one's
work will prove, it's no longer research.

>I'm sure I'll be hearing from the TEI supporters in droves. :-)

I certainly hope so, but perhaps many will give you up as a lost
cause; I won't, because you're here in Cook County and serve as an
excellent excuse to repeat points often made already, but
perpetually in need of repetition.

>Nothing will ever be attempted if all possible objections
>must first be overcome. --- Samuel Johnson
>

I'll subscribe to that. It's why I think waiting to use or teach the
TEI until everyone in the community subscribes to it is such a silly
tomfool idea. Those engaged in research projects should consider the
alternatives and choose to do what makes most sense. That is their
responsibility to themselves, their later readers, and their funding
agencies. Neither Mark Olsen nor I can, nor should we, take that
responsibility off their shoulders.

-C. M. Sperberg-McQueen

--[2]----------------------------------------------------------------
Date: 09 Apr 1997 11:28:32 +0100 (BST)
From: Mavis Cournane <cournane@curia.ucc.ie>
Subject: Re: 10.0841 rain on the workshops

[Mark Olsen]
> I am surprised that so many of the recently announced workshops and
> institutes seem to make the teaching of encoding, and particularly
> TEI encoding, a primary objective.

I can only speak of my experience of the TEI workshop at CETH. To be
fair to the instructor, he did emphasize that the TEI _are_ guidelines,
_not_ a recognized standard. Anyone who does some reading will see
that emphasized too.

> The TEI, it seems to me, is being
> treated as if it is an accepted standard.

I would like to know what alternatives there are out there for the
encoding of medieval text.

> -- the TEI Guidelines have never been subjected to significant
> peer review,

How can you subject something to peer review if nobody knows enough
about TEI to implement and experiment with it? I would have thought the
first step would be to teach people how to use it. Let them
experiment and then come back with their suggestions. Workshops, while
promoting TEI, also teach people so that they can decide for themselves
its virtues and vices.

> -- that there are many in the field who consider the current Guidelines
> to be unworkable for many reasons, and

This is very interesting and I would like to hear more. I haven't seen
any discussions of such problems on comp.text.sgml or on TEI-L. Could
you point me towards a forum for this?

> I hope that the various course instructors will at the very least
> inform participants that the TEI is NOT the only, or even best,
> way to encode documents,

If there are alternatives to TEI, why aren't we inundated with postings
for workshops on them? The TEI has problems, as anyone who uses it is
aware. However, at least its proponents offer instruction about it
so we can all decide for ourselves. Attending a workshop and learning
about TEI does not mean that people are led up a blind alley. It
merely gives people an option. At the very least it starts them
thinking about their text in a useful, logical way.

You can lead a horse to water but you can't make him drink it :-)

Mavis Cournane

--[3]----------------------------------------------------------------
Date: Wed, 9 Apr 1997 11:56:01 +0100 (BST)
From: orlandi@rmcisadu.let.uniroma1.it
Subject: Re: 10.0841 rain on the workshops

About Mark Olsen's observations, I feel bound to admit that in
my opinion the TEI was (and is) a very interesting and very
well meant initiative, but as matters go it is more or less
dead stuff for the "cognoscenti". I speak as one who has studied
rather deeply the TEI application problems both for Italian and
Coptic texts. No need to enter into details, for those who have
similar experiences.

Mind you, very different is the situation for SGML: there I
am quite at ease, and I think that it will last as a standard.

Hope a witness is useful...

-----------------------------------------------------------------

Tito Orlandi orlandi@rmcisadu.let.uniroma1.it
CISADU - Fac. di Lettere Tel. 39.6.4991-3936
P.zale Aldo Moro, 5 Fax 39.6.4991-3945
00185 Roma http://rmcisadu.let.uniroma1.it/~orlandi

--[4]----------------------------------------------------------------
Date: Wed, 09 Apr 1997 09:45:56 -0400
From: Patrick Durusau <pdurusau@emory.edu>
Subject: Re: 10.0841 rain on the workshops

Just a few quick notes on Mark Olsen's post concerning workshops on TEI
encoding.

Mark Olsen <mark@barkov.uchicago.edu> (43)
Subject: Re: 10.0831 workshops & institutes

writes:

<omissions>
>
> Last year, I gave a rather inflammatory talk against the TEI
> guidelines at the ALLC/ACH conference in Bergen. A slightly
> revised version is at:
>
> http://tuna.uchicago.edu/homes/mark/talks/TEI.talk.html
>
<omissions>
> Given the facts that:
>
> -- the TEI Guidelines have never been subjected to significant
> peer review,

The claim that "the TEI Guidelines have never been subjected to
significant peer review" betrays a lack of appreciation for the goals of
the TEI Guidelines. Unlike older schemes mentioned in his _Text Theory
and Coding Practice: Assessing the TEI_ (the resource found at the cited
URL), the TEI Guidelines were not intended to set forth a definitive
solution for all texts or any group of texts important to humanists. The
claim is further weakened by the participation of literally hundreds of
scholars in the formation of the Guidelines, which were a product of
peer participation. The lack of what Mark would consider peer review is
not very significant unless he can point to some shortcoming of TEI that
peer review would have avoided.

I hasten to point out that the need for peer review is not proven by the
existence of inconsistently encoded texts using the TEI Guidelines. The
TEI Guidelines were not meant to be a definitive and exhaustive listing
of all tagging solutions for all texts. They should be judged for their
effort to create a model for the use of SGML with texts of concern to
humanists and not for their application to specific texts, which was not
their goal. (I personally think formal peer review of the TEI Guidelines
in light of their goals would have given them high marks.)

I think there is much to be said for Mark's concern over such
inconsistent texts, but those objections can be met by scholars
concerned with particular types of texts collaborating on guidelines for
implementing the TEI Guidelines for a particular class of texts. One
such effort is already underway with the formation by the Society of
Biblical Literature of a new seminar on Electronic Standards for
Biblical Language Texts. The results of this effort, being a specific
implementation of the TEI Guidelines for a limited group of texts, will
be the subject of peer review. The Seminar is addressing this year the
creation of Writing System Declarations (WSDs) and entity sets for
common references for use in connection with biblical language texts.
Humanists interested in this effort can contact the undersigned for more
details.

> -- that there are many in the field who consider the current Guidelines
> to be unworkable for many reasons, and

I am unable to answer this particular objection adequately, since it is
devoid of any real content. Since Mark is apparently aware of prior
discussions concerning the suitability of the use of the TEI Guidelines
on this list, perhaps he will remember my call last year for an example
of a manuscript or text that could not be encoded using the TEI
Guidelines. The principals to that discussion never responded with any
such examples. If anyone would like to propose such a text and make it
available in photocopy or high resolution scans, I am sure there are
users of the TEI Guidelines who would be glad to propose and debate
possible encodings. If the Guidelines "are unworkable for many reasons"
I would expect one of the "many" to be able to produce at least one
example of such a failure. To date such objections have been couched
only in the vaguest of terms while the "many" continue to use
proprietary solutions and texts available only to the few.

> -- that there is a dearth of analytical software (other than systems
> to verify that a document is TEI-conformant, for what that is
> worth),

Quite true, but a serious lack that is slowly being corrected. Consider
the Babble program, http://www.iath.virginia.edu/babble, which is a
synoptic text viewer described at its homepage as follows:

>Babble, under development by Robert Bingler at the Institute for
>Advanced Technology in the Humanities, is an SGML-capable synoptic
>text tool that can display multiple texts in parallel windows. It
>uses Unicode, an ISO 16-bit character set standard, which allows
>multilingual texts, using mixed character sets, to be displayed
>simultaneously. Babble also allows users to search for strings in
>text or in tags, and to link open texts for scrolling and searching.
>Currently, Babble runs as an application, and not as an applet: we
>hope that coming releases of Web browsers will have the necessary
>intelligence about system fonts to permit Babble to run as an applet
>in the near future.

Recent editions of WordPerfect and Word both contain SGML components and
one would hope that is only the beginning. Humanists need to make their
needs known to both their own computer science departments and software
companies.

>
> I hope that the various course instructors will at the very least
> inform participants that the TEI is NOT the only, or even best,
> way to encode documents, and that encoding itself -- a labor intensive
> activity -- may have far more limited results than hoped. As I
> suggested in Bergen, I do find the effort by TEI proponents to teach
> a specification that has not been sufficiently tested to be a risky
> endeavor because we are asking people to spend significant amounts
> of time and money performing tasks that may not live up to the
> advance billings.

As I noted above, TEI does not compel, beyond minimal compliance with
the Guidelines, any particular encoding. There is one noted dictionary
project that is rekeying its records since it lost access to files
created for TRS-80 computers. My only objection is the data is being
entered on Apple IIe's! I would be loath to advise participants to use
closed system encoding methods which are rapidly going the way of
TRS-80's.

It may well be that encodings will not meet the expectations of their
encoders. I hardly see that as an objection to the use of the TEI
Guidelines. In my area of concern, biblical languages and Ancient Near
Eastern texts, TEI encoding holds the promise of making texts more
widely available and subject to more rapid analysis than hard copy.

I will try to finish a more formal announcement concerning the SBL
seminar by later this week and post it to the list.

Patrick

Patrick Durusau
Information Technology
Scholars Press
pdurusau@emory.edu
Chair, SBL Seminar on Electronic Standards for
Biblical Language Texts

--[5]----------------------------------------------------------------
Date: Wed, 9 Apr 1997 13:38:06 -0400 (EDT)
From: Francois Lachance <lachance@chass.utoronto.ca>
Subject: precipitation control

Willard,

Mark Olsen's April showers might not rain on all parades, TEI
supportive or otherwise.

This little shot of hail caught my attention:

> endeavor because we are asking people to spend significant amounts
> of time and money performing tasks that may not live up to the
> advance billings.

I read four comparators here:

Time, Money, People, and Claims

My first move is to bracket out {Claims} and ask if any research
exists that addresses the teaching of mark-up skills (as opposed to
the best practices for scanning/ocr or keying text): how fast do
learners acquire the material? how long do they retain it? What
prior knowledge affects the learning curve? That's effectiveness. The
next set of questions would target an analysis of spend. That's
efficiency. The one is a pedagogical question and the other is within
the purview of administration.

My second move would be then to examine claims as they are generated
in the historical contingency of the interaction between administrative fiat
and pedagogical practice.

Of course, the whole issue is muddled further because of
"cross-learning". For example, just as learning (or attempting to
learn) a second language can have a rebound effect on native language
skills, being exposed to a set of encoding guidelines can be a way
into a whole realm of problems involving not only markup, transfer
protocols, processing, and even software development, but also the
interpersonal skills required to motivate and implement cooperative
work schemes and the PR skills required to challenge the social
construction of "waste" and the numbers game as to whose labour counts.

The best show-stopping question I ever witnessed was Mavis Cournane's
challenge "Who does the donkey work?" That is where the tedious eros
resides -- to capture some threads from the motivation/preparedness
debate of computers in the classroom. The tedious eros is also the one
that takes humans where they are at and makes the claim they can be
elsewhere. And it is that very claim of being, possibly, elsewhere
that keeps me here,

hewing and hawing,

as ever a smart ass

Francois