4.0252 Indexing (4/145)

Elaine Brennan & Allen Renear (EDITORS@BROWNVM.BITNET)
Mon, 9 Jul 90 15:45:47 EDT

Humanist Discussion Group, Vol. 4, No. 0252. Monday, 9 Jul 1990.


(1) Date: Fri, 06 Jul 90 06:17:43 IST (14 lines)
From: Daniel Boyarin <BOYARIN@TAUNIVM>
Subject: Re: 4.0247 Qs: Indexing ...

(2) Date: 7 July 1990 11:50:15 CDT (60 lines)
From: "M. R. Sperberg-McQueen " <U15440@UICVM>
Subject: low tech indexing; NLCindex

(3) Date: Mon, 09 Jul 90 11:39:54 EDT (36 lines)
From: "David R. Chesnutt" <N330004@UNIVSCVM>
Subject: Re: 4.0247 Qs: Indexing ...

(4) Date: 09 Jul 90 13:29:03 EST (35 lines)
From: James O'Donnell <JODONNEL@PENNSAS>
Subject: indexing report

(1) --------------------------------------------------------------------
Date: Fri, 06 Jul 90 06:17:43 IST
From: Daniel Boyarin <BOYARIN@TAUNIVM>
Subject: Re: 4.0247 Qs: Indexing ...

On indexing. I think that you will probably find it easiest to remain
with nb's indexing program rather than massaging the text for someone
else's. It is a fairly slow process but not otherwise cumbersome in my
experience and very accurate. Perhaps you could borrow a 386 machine
for this job. On a fast machine it takes a fraction of the time that it
does on a slow one of course. Also a job like that can be left to run
overnight or over a weekend if you are sure about the electricity
supply. By the way, your book sounds fascinating to me. What is it
about? My guess is Philo or patristics? Both topics of great interest
to me. Good luck, Daniel Boyarin
(2) --------------------------------------------------------------73----
Date: 7 July 1990 11:50:15 CDT
From: "M. R. Sperberg-McQueen " <U15440@UICVM>
Subject: low tech indexing; NLCindex


The following addresses the questions on indexing raised by O'Donnell
only indirectly. But perhaps a general discussion of people's
experiences with indexing might be of interest.

When I did an index last summer, I ultimately settled on a very low tech
solution that, however, worked satisfactorily: I followed the Chicago
manual of style recommendations for manually marking the page proofs and
then, instead of writing the entries on index cards, I created the
electronic equivalent in the form of a list of entries in a file that I
later sorted alphabetically (as one would sort the index cards). In
part, this solution reflects my being more comfortable working on a hard
copy, where you can flip back and forth to look in the margins to see
how you treated a particular concept on an earlier or later page. And,
having marked the hard copy, it seemed silly to re-mark a copy on the
screen. The idea that indexing an electronic copy permits you to begin
your index before you get the page proofs doesn't seem to me to be
persuasive. If one has a publisher who's not going to allow reasonable
time to do the index on the basis of page proofs, one should find a
different publisher. It's not a weekend project.
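
As a rough illustration of the "electronic index card" approach described
above, the sketch below sorts a list of entries and merges the page numbers
for repeated headings. The one-entry-per-line "heading<TAB>page" layout, the
function name build_index, and the sample entries are all assumptions made
for the example; the message does not say how the file was actually laid out
or sorted.

```python
from collections import defaultdict


def build_index(lines):
    """Group 'heading<TAB>page' lines by heading and sort them alphabetically."""
    pages_by_heading = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the entry file
        heading, page = line.rsplit("\t", 1)
        pages_by_heading[heading.strip()].add(int(page))
    # Sort headings without regard to case, as one would file index cards.
    ordered = sorted(pages_by_heading.items(), key=lambda kv: kv[0].casefold())
    return [
        f"{heading}, {', '.join(str(p) for p in sorted(pages))}"
        for heading, pages in ordered
    ]


# Invented sample entries, typed in the order they were met in the proofs.
entries = [
    "melancholy\t45",
    "Gryphius, Andreas\t12",
    "melancholy\t17",
    "Opitz, Martin\t3",
    "Gryphius, Andreas\t88",
]
index_lines = build_index(entries)
for entry in index_lines:
    print(entry)
```

Because the headings are typed by hand rather than picked out of the text,
this works equally well for concepts that never appear verbatim on the page.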

I found trying to index the text electronically frustrating: WP is
happiest if you simply mark words in your text; since I was often
indexing concepts that did not consistently appear as words or phrases
in the text, I was having to type these in--and having to do so
repeatedly, as WP didn't allow me to write a macro for a phrase to be
called up and entered into the index entry field. I was also indexing
titles of German baroque poems which, notoriously, are not short, and WP
had a limit on the length of its entries. It's easier to get around
these problems in NB, since you can invoke stored phrases into index
entry fields. For indexes of names and places such as O'Donnell speaks
of, either WP or NB may well provide good solutions; for indexes of
subjects, they may be less helpful. I will not dwell on the quality of
subject indexes of the majority of computer manuals, as these disasters
presumably have been produced by people who don't realize that an index
is not a word list and that a good one requires intelligent reflection
on the part of the human indexer.

I have the feeling my low tech route is not going to be very appealing;
as compensation I suggest looking into NLCindex, which was originally
developed for indexing a scholarly edition (the Jefferson papers?) with
very complex and detailed multiple indexes. It permits 3 levels of
entries; it reduces typing by allowing one to use abbreviations (one
could type, for example, TJ to get Jefferson, Thomas). It runs on IBM
compatibles. It may or may not seem like a bargain at $160 (you can get
a $25 demo disk with the $25 to be applied to the purchase of the
program). It can certainly handle very large indexes; I don't know how
it does speed-wise, which is one of O'Donnell's main concerns. But I
know at least one person who thinks it's the cat's pyjamas, and Michael
would doubtless be willing to talk about it when he returns to Chicago
Monday. For official information on the program, you can write NLCindex
/ The Newberry Library / 60 West Walton Street / Chicago, IL 60610 or
phone 312/943-9090.
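
The abbreviation feature mentioned above (typing TJ to get Jefferson,
Thomas) and the three-level limit can be pictured with a small sketch. This
is not NLCindex's actual input syntax, which the message does not describe:
the colon separator, the ABBREVIATIONS table, the function name expand_entry,
and the sample subentries are hypothetical.

```python
# Hypothetical abbreviation table; only the TJ example comes from the message.
ABBREVIATIONS = {"TJ": "Jefferson, Thomas"}

MAX_LEVELS = 3  # the message says the program permits 3 levels of entries


def expand_entry(raw, abbreviations=ABBREVIATIONS):
    """Split 'level1: level2: level3' and expand any abbreviated level."""
    levels = [part.strip() for part in raw.split(":")]
    if len(levels) > MAX_LEVELS:
        raise ValueError(f"at most {MAX_LEVELS} levels allowed: {raw!r}")
    return tuple(abbreviations.get(level, level) for level in levels)


expanded = expand_entry("TJ: correspondence: with Adams")
print(expanded)
```

A lookup table of this kind saves the most typing on headings that recur
hundreds of times, which is the case for the main figures in an edition of
papers.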

Marian Sperberg-McQueen
U. of Illinois at Chicago.
(3) --------------------------------------------------------------44----
Date: Mon, 09 Jul 90 11:39:54 EDT
From: "David R. Chesnutt" <N330004@UNIVSCVM>
Subject: Re: 4.0247 Qs: Indexing ...


In response to Jim O'Donnell's query re indexing, one approach might be
to use a dedicated indexing program rather than trying to handle the
various tasks with a word processing package.

The old mainframe version of CINDEX we developed for the Laurens Papers
project has been revised into a user-friendly PC version that will run
on IBMs or IBM-compatibles and is available through the Newberry Library
at Chicago, 60 West Walton St., Chicago, IL 60610.

Running on a standard IBM-AT, an index with 10,000 cites takes about 45
minutes to sort. We've used the PC version for the last three indexes
of the Laurens Papers. The PC files are fully compatible with the
mainframe version which means that we can easily update our in-house
cumulative index as we publish volumes.

Like its predecessor, NLC follows the indexing rules of the Chicago
Manual of Style.

One of the major improvements in the PC version is its flexibility in
allowing the user to edit and re-sort the index files it creates. On a
normal index at Laurens, we probably spend 50% of our time editing and
refining the index, so it's very important to us to be able to produce a
"rough draft," edit that, and then have a new draft which incorporates
those changes.

Editing can be done in any word processing package which writes an ASCII
file back out for reprocessing via NLC.

David Chesnutt
Papers of Henry Laurens
University of South Carolina
(4) --------------------------------------------------------------38----
Date: 09 Jul 90 13:29:03 EST
From: James O'Donnell <JODONNEL@PENNSAS>
Subject: indexing report

From: Jim O'Donnell (Penn, Classics)

My thanks to numerous correspondents. I thought I would report briefly
on what I have found about indexing large files.

1. There is what sounds like a good dedicated index program out there,
and I have encouraged the specialist who knows about it to report
directly.

2. WordPerfect is going through a turtle stage. I have confirmed both
with other users and with their 800-line that if you are at all crowding
the limits of available RAM (not hard, since the program alone takes
well over 400K now in 5.1), the program slows down astonishingly.
Executing a simple command in a 228K file took a full and unbelievable
thirty seconds, just to mark one word for indexing. Expanded memory may
help, if you can get enough headroom that way, and WP reports that they
`are working on it'. Breath-holding not advised.

3. Detailed experiments with Nota Bene have been more encouraging. I put
600+ index markers randomly through the same 228K file, and it generated
a very satisfactory index (on my AT at 8MHz with 640K RAM) in only about
ten minutes. For myself for now, that is probably the way I will go.

4. One piece of advice. If you are creating a text that will need to be
indexed, you may think about including some fence or marker when you are
generating the text. I have a lot of abbreviated references to ancient
works (e.g., Aen. 6.234-238, Aug. conf. 11.10.13): it would speed
things marvelously in marking these things for index if I had put some
fence character (even one that is invisible to the printer) after the
238 or the 13 in those examples: then in one program or another a macro
or a global search/replace could speed the marking of items dramatically.
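
The fence idea in the last point can be sketched as a global
search-and-replace. Everything specific here is an assumption for
illustration: the vertical bar as the fence character, the <<...>> wrapper
standing in for whatever index marker a given program expects, and the
pattern for what counts as an abbreviated reference. Neither Nota Bene's nor
WordPerfect's real marker syntax is shown.

```python
import re

FENCE = "|"  # hypothetical fence character typed right after each reference

REFERENCE = re.compile(
    r"((?:[A-Za-z]+\.\s)+"      # one or more abbreviated words: "Aug. conf. "
    r"\d+(?:\.\d+)*(?:-\d+)?)"  # the locator: "11.10.13" or "6.234-238"
    + re.escape(FENCE)          # the fence that closes the reference
)


def mark_references(text):
    """Wrap each fenced reference in <<...>> and drop the fence itself."""
    return REFERENCE.sub(r"<<\1>>", text)


sample = "Compare Aen. 6.234-238| with Aug. conf. 11.10.13| on time."
marked = mark_references(sample)
print(marked)
```

The fence is what makes the end of each reference unambiguous: a number
without it is left alone. The pattern's guess at where a reference begins is
cruder, and a sentence ending just before a reference would be swallowed into
it, so a second fence at the start would be safer in a real text.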