Offline 22 (450)

Willard McCarty (MCCARTY@VM.EPAS.UTORONTO.CA)
Fri, 10 Feb 89 22:45:28 EST


Humanist Mailing List, Vol. 2, No. 588. Friday, 10 Feb 1989.

Date: Friday, 10 February 1989 1015-EST
From: KRAFT@PENNDRLS
Subject: OFFLINE 22

----------------------------------

<O F F L I N E 22>
by Robert A. Kraft

----------------------------------

The "pilot" column for OFFLINE appeared nearly five
years ago under the title "In Quest of Computer
Literacy" (CSR Bulletin 15.2 [April 1984] 41-45). At
about the same time, I prepared a two-page stopgap
"Computer News Update" for use in responding to
inquiries that were arriving rather regularly in my
mail. Not surprisingly, the topics discussed in these
two pieces often overlapped -- the need for reliable
information, for accessible electronic texts and data
with which to work, for easy transfer capabilities to
permit individuals to work independently on their own
microcomputers, and for appropriate multilingual display
systems for screen and printer. During subsequent
installments of OFFLINE, attention has returned again
and again to these and closely related issues.
Significant progress has been made on all fronts,
although the informational need remains and will remain
most vexing, given the rapidly changing nature of the
technology and its applications.

Humanists have come a long way in the quest to harness
this technology for their needs. People whose faces once
turned pale (or some other shade) at the suggestion that
they might want to investigate how to use computers in
their work now routinely expose their thoughts and
locutions to "word processing," and perhaps their
finances (and grading?) to a "spreadsheet" approach.
Bibliographies and similarly ornery materials are also
atomized and reshaped by means of "data base management"
systems. With increasing frequency,
self-confessed novices are getting accounts on their
local mainframe computers and are linking into the
electronic bulletin boards and discussion groups such as
HUMANIST or the various field-oriented listings for
history, philosophy, Anglo-Saxon studies, folklore,
archaeology, music, and the like. For biblical studies
and related interests, the wealth of information
recorded in John J. Hughes' BITS, BYTES & BIBLICAL
STUDIES (Zondervan, 1987) strikingly attests this
explosion of progress. The new annual HUMANITIES
COMPUTING YEARBOOK (Oxford University Press), coordinated by
Ian Lancashire and Willard McCarty at Toronto,
will help survey the larger context of humanistic
scholarship and teaching.

<The Toronto Fair and Conference>

Of course, "seeing is believing," but the
opportunities for seeing even a small sampling of the
latest developments in humanistic computing are still
relatively rare. Fortunately, professional societies
such as SBL, AAR, ASOR, APA, MLA, and others have made
various attempts to expose their memberships to these
developments to some degree, although perhaps not always
as consistently as might be wished. The new technofocal
humanistic societies, born out of this very revolution
in technology, exist in part to mediate the
technological advances to the scholarly interests,
although this has also taken place with varied degrees
of success. On 6-10 June, two of the most prestigious of
these "new" societies -- The Association for Computers
and the Humanities (ACH) and The Association for
Literary and Linguistic Computing (ALLC) -- will hold a
joint international meeting hosted by the University of
Toronto Centre for Computing in the Humanities at which,
it is hoped and planned, the latest and best
computer-related developments for humanistic academic interests
will be demonstrated in the setting of a gala "Software
and Hardware Fair."

Keeping up with technology does not "come cheap," and
the present bifurcation in professional scholarship
between the traditional societies (SBL, AAR, ASOR, etc.)
and the "computers and ..." groups causes added
hardships. It is not clear that our deans and
administrations are aware of this type of problem -- at
my University, faculty are permitted a maximum allowance
of $400 towards formal participation in one professional
conference per year. If I attend the annual SBL/AAR/ASOR
meetings, as I ought, there are no funds left for
meetings such as the ALLC-ACH. But the
computer/humanities meetings are also very important for
scholarship in my field, and there needs to be a way in
which the traditional scholarly support structures
(professional societies, academic institutions) provide
incentives, rather than discouragements, for such dual
or even multiple participation!

Registration for the ALLC-ACH Conference and the Fair
is in the neighborhood of US$200 for non-members of ALLC
or ACH (about US$100 for students). In addition to the
Fair, and the traditional smorgasbord/banquet of papers
and panels, there will also be an associated Summer
School in Humanities Computing, jointly sponsored by the
University of Toronto and Oxford University. Educational
institutions, professional societies, and other possible
patrons should be encouraged to consider underwriting
the cost of sending representatives to take advantage of
this unusually rich opportunity. Indeed, many OFFLINE
readers should seriously consider attending these
sessions at their own expense, if that proves necessary.

The following courses are tentatively scheduled for
the Summer School, on a graduated fee scale starting at
about US$150 ($125 for ALLC or ACH members) for one
course (the more you take, the less each costs, maximum
of four courses per week). During 29 May through 2 June,
the topics are WordPerfect, Computer Assisted
Instructional Writing, Desktop Publishing, Computer
Assisted Language Learning, Humanities Computing in
China-Japan-Korea, Hypertext, Interactive Writing for
Students, HyperCard, Meeting Campus Needs in Humanities
Computing, Meeting School Needs in Humanities
Computing, and Writing with Computer Support in the
Schools. On 5 June there will be a one-day workshop on
Advanced Function Workstations. From 12-16 June, three
of the earlier courses will be repeated (WordPerfect,
Desktop Publishing, HyperCard) plus Scholarly
Publishing, Interactive Video, Relational Database
Systems, Programming in SNOBOL4, Study of Reader
Response, Tools for Translation, Nota Bene, Literary and
Linguistic Computing, and Discourse Analysis.

For further information, contact Professor Ian
Lancashire, Centre for Computing in the Humanities,
Robarts Library, 14th Floor, 130 St. George Street,
University of Toronto, Ontario M5S 1A5 CANADA (tel. 416
978-4238; BITNET IAN@UTOREPAS).

<Bringing Archives/Repositories into Sharper Focus>

In addition to any involvement with software/hardware
displays, my own special assignment for Toronto is to
coordinate a panel on humanities Archives/Repositories.
As is clear to readers of OFFLINE, this is a long and
abiding interest of mine. The computer offers a
fantastic set of tools for textual research, but they
cannot work in a vacuum. We must have access to the
electronic texts and related data. Over the years -- now
even decades! -- a wide variety of electronic materials
have been generated in a wide variety of forms and under
widely varying conditions. Some -- perhaps many -- of
the early individual efforts are no longer recoverable.
Certainly many electronically typeset books survive now
only as hardcopy orphans, having lost the electronic
parent.

Although there have been sporadic efforts to catalogue
and/or collect the surviving sea of material, none have
yet proved successful in any comprehensive sense. The
Oxford Text Archive is probably the largest unstructured
collection of such materials -- and it distributes a
catalogue of holdings as well -- but it is at the mercy
of the various data producers, who may or may not choose
to list or deposit their materials at the Archive. The
off-again, on-again Rutgers Inventory of Machine
Readable Texts deserves encouragement and support for
its intent to create a comprehensive list of what is out
there, although for a variety of reasons, progress has
been slow and sporadic.

Archiving is largely a thankless task, and requires
both personal commitment and fiscal support to be
effective. That the Oxford Text Archive has survived as
an active enterprise as long as it has is perhaps more a
tribute to British resourcefulness and tenacity on the
part of its staff than anything else. As its (usually)
amiable overseer, Lou Burnard, would be among the first
to admit, the fact that such a collection exists does
not guarantee that the needs of the people for whom it
exists are being met or even actively addressed. It
takes time and resources to document adequately what is
in an archive, to correct errors, to harmonize formats
and make coding choices consistent, to service inquiries
and orders, to stock tapes and diskettes, to make and
dispatch copies, to protect legal rights and keep track
of the whole business -- to mention only some of the
most obvious desirable functions.

<Kinds of Archives/Repositories>

At the most basic level, an archive (or repository) is
involved in collecting and preserving. This can be
viewed as a predominantly passive function -- to serve
as a storage area for whatever relevant materials are
submitted for deposit. Apart from anything else it has
done or hoped to do, the Oxford Text Archive (OTA)
has been able to fill this function. It is there, and
welcomes contributions of data from whatever source --
including material that is not allowed to circulate
independently under any conditions. The fact that not all
producers of electronic textual material have sent
their materials to, or even listed them with,
the OTA is unfortunate, and hopefully can gradually be
remedied. At one level, CCAT is among the guilty. We
have sent some materials to the OTA and have agreed to
provide a complete listing, but thus far have not
fulfilled the promise. But at least we are committed to
and are working in a cooperative mode. If every producer
and collector of electronic text would take similar
steps towards cooperation with the OTA we would all be
in a position to reap significant benefits!

Why do I emphasize working with the OTA? Because it is
in place (and has been for many years), is widely known,
and is willing to serve this function. The OTA issues a
catalogue of holdings, classified by language and
author/work, which includes references to the holdings
of cooperating archives elsewhere. Is there really any
point in spending scarce humanistic resources to try to
replicate this function elsewhere? That makes no sense
to me.

Many other archives and levels of archiving exist,
usually with a specific area of focus. I have not
attempted to include projects that are primarily
concerned with excerpting and indexing data although
they also qualify, in a general sense, as archives.
Instead, my main focus here is on consecutive textual
data. The classicists saw the need for making electronic
material available quite early in the game, and created
the American Philological Association's repository of
machine readable texts. The Latin side of this endeavor
has recently been taken up by the Packard Humanities
Institute (PHI), while the Thesaurus Linguae Graecae
(TLG) has worked for many years on encoding the ancient
Greek literature. At Duke University there is a related
project to encode Greek documentary papyri. Projects
that focus on ancient Greek inscriptions are underway at
Cornell and at the Institute for Advanced Study in
Princeton. Electronic versions of Ancient Near Eastern
materials can be found at UCLA. The Comprehensive
Aramaic Lexicon project is creating its archive centered
at Johns Hopkins, and the Yiddish Dictionary project at
Columbia. At CCAT, we have concentrated on producing and
collecting electronic materials related to biblical
studies. Bar Ilan University has its massive "Global
Jewish Database." French efforts have produced the
"Treasury of the French Language," now being continued
also at Chicago. Spanish is centered at Wisconsin.

The list goes on and on. It would not surprise me to
find that more than 50 major archival centers for
electronic texts and related humanities materials exist
throughout the western world. (I have only the vaguest
idea of the situation in Japan, China, and Russia, for
example, and should know more about Australia.) I have
not yet mentioned major collections and efforts of which
I am aware in the Universities and associated
institutions of Canada (e.g. Laval, McGill, Toronto,
Waterloo), Great Britain (Cambridge, Essex, Glasgow,
London), Scandinavia-Iceland (Bergen, Copenhagen,
Goeteborg, Oslo, Reykjavik), the Netherlands (Amsterdam,
Leiden, Nijmegen), Belgium (Liege, Louvain-la-Neuve,
Maredsous), France (Nancy, Paris), Spain (Madrid),
Germany-Austria (Bonn, Cologne, Goettingen, Mannheim,
Tuebingen, Ulm, Vienna), Italy (Pisa, Turin), and Israel
(Academy of the Hebrew Language, Hebrew University). In
the USA, other institutions with major collections
include Berkeley, Brigham Young, Cleveland State,
Colorado, Dartmouth, Rutgers, San Diego, Southern
Mississippi, Stanford -- and there is always
talk of new archival projects and centers being
developed. In preparation for the Toronto panel on
Archives, I hope to be able to make available a more
precise list of such resources, with at least some
general characterization of their holdings. For this, I
will need a great deal of cooperation.

In most instances, the primary function of such
institutions and organizations as those mentioned above
is not simply to collect data, but to do something
special with the data. And herein lies a labyrinth of
problems. Working with data within a specific context
and strategy is not necessarily easily compatible with
distributing data to general users. It can be very
expensive and bothersome to field requests, provide
information, replicate the data in various formats, etc.
Few places are adequately equipped for such tasks. Thus
it is not really surprising that although a relatively
large amount of humanistic data has been encoded, it may
not be possible to obtain access to that which interests
you. And even if you can locate what you want, and can
get permission to use it, you may find that the amount
of preparatory work necessary for using it is
forbidding.

Sometimes the data is protected in some way so that it
can only be used within a specific framework. Access may
be only "online" -- that is, through a direct electronic
connection with the archive/repository (e.g. by
telephone line, or limited on-site use) -- without the
possibility of the user taking electronic material to
work on elsewhere. In some instances, the data can be
obtained and referred to at the user's convenience, but
can only be accessed by means of special software that
places limits on the process (e.g. CD-ROM packages under
software control). Often the need to protect and control
the data is dictated by legal considerations (e.g.
copyrighted material) or financial ones (recouping expenses,
if not making a profit). Even where no intent to
restrict is present, the circumstances may cause such a
situation -- e.g. when distribution is only possible in
a form incompatible with the users' equipment (9 track
tape, CD-ROM, etc.), or with the available software (a
specific data base management system, for example).

<What Can Be Done?>

In short, there are many obstacles between the
would-be user and the extant data. Concerted efforts are
needed to attack at least the following overlapping
areas:

(1) Information is needed about the existence of
materials in electronic form, whether they are in large
"archival" centers or are the products of isolated
individuals. Please provide basic information (e.g.
title, format, ownership, availability) to OTA (Lou
Burnard, 13 Banbury Road, Oxford OX2 6NN, England; BITNET
ARCHIVE@VAX.OX.AC.UK) or to OFFLINE. And please alert me
to the existence of collections ("archives") that I may
have overlooked in the preceding discussion!

(2) Support is needed for gathering available materials
into appropriate locations for preservation, access,
and/or distribution. This is a more difficult problem
since few places are ready and willing to attempt to
handle all available external formats (diskettes, tapes,
etc.). Frequently most of the necessary equipment for
such tasks is available in the major centers, but there
is no staff or funding to do the job. Fortunately,
concern at least for preservation seems to be growing,
as evidenced by recent discussions with some
professional societies (e.g. SBL) concerning the
archiving of relevant electronic materials (book
manuscripts, articles, reports, bibliographies, etc.).
CCAT hopes to launch a pilot project to explore this
type of archiving on mass storage media such as WORM
laser/optical drives. Hopefully, other professional
groups and centers will also commit themselves to
this important step. Authors who have an electronic copy
of their own published work should consider depositing
it with such an archive.

(3) Support is needed for reshaping the data, as needed,
into consistent internal formats that can be manipulated
effectively by readily available software. At this point
the "archive" becomes an active participant in ensuring
that the data can be put to good use. An example of this
process is the TLG data, which is internally consistent
so that appropriate software will work on the entire
data bank or on any of its parts. Similarly, the
"on-line" data banks mentioned above (e.g. Global Jewish
Database or the ARTFL/French Language project) have
already performed this service. The costs involved in
such a process are enormous, but the resulting increase
in value for users cannot easily be measured. Again,
close cooperation of the various archival centers will
be required to move effectively towards this goal. And
the development of widely accepted standards for coding
new electronic materials will help to bring this ideal
closer to realization (OFFLINE has mentioned recent
efforts in this direction in earlier columns).

<Archiving for the Future and the Future of Archiving>

We are discussing an area of major transition for
traditional educational and research institutions. With
regard to textual materials, the major archives of the
past and present are our libraries. And it is to the
libraries, expanded to embrace electronic "text," that
we doubtless will look in the future. They are rapidly
gearing up, trying to catch up with the topsy-turvy
growth of the computerized archives during the infancy
stages of the new technologies, trying to harness any
useful results. Also playing catch-up are the publishing
houses, whose fates will become increasingly tied to
their integration of computer-related activities. As the
situation gradually stabilizes, with publishers and
libraries finding their proper balance in relation to
the computing expertise of the future, individual
scholars and humanities computing centers will probably
have much less to be concerned about at the archival
level. Our grandchildren probably will have little
firsthand knowledge about these struggles. But for the
moment, we are presented with the opportunity and the
responsibility to help shape that future, and it is to
our own benefit and the benefit of those who follow that
we make the most of this challenge.

<Quick Notes>

Moises Silva of Westminster Theological Seminary has
prepared an electronic index to the Westminster
Theological Journal for the years 1938-1988, and has
made it available for distribution for non-commercial
purposes. Contact OFFLINE for details.

The latest Newsletter from the ATARI ST User Group
announces the availability of the main CCAT biblical
texts on diskette for that machine. Contact Doug Oakman,
1114 - 121st Street South, Tacoma WA 98444, who also
reports that he has acquired an IBM/DOS Emulator for the
ATARI.

Dove Booksellers (3165 W. 12 Mile Rd., Berkley MI
48072), with its growing line of computer materials,
announces a new "After-Hours Computer BBS"
(bulletin-board service) at 300/1200 baud M-F 5-8pm,
weekends and holidays 24 hrs. Dial 313 547-9693.

The Winter 1988 issue of the ACH Newsletter contains
these items of more general interest to OFFLINE readers:
a report on a proposed "Sanskrit Text Archive Project,"
and a summary of the past 6 months of HUMANIST
discussions on BITNET. Do you have access to a library
that subscribes to the publications of the Association
for Computers and the Humanities?

<----->

Please send information, suggestions or queries
concerning OFFLINE to Robert A. Kraft, Box 36 College
Hall, University of Pennsylvania, Philadelphia, PA
19104-6303. Telephone (215) 898-5827. BITNET address:
KRAFT at PENNDRLS (no longer PENNDRLN). To request
information or materials from OFFLINE (or from CCAT),
please supply an appropriately sized, self-addressed
envelope or an address label.