6.0188 Summary: Downloading from OPACS (1/150)
Elaine Brennan & Allen Renear (EDITORS@BROWNVM.BITNET)
Wed, 12 Aug 1992 16:03:58 EDT
Humanist Discussion Group, Vol. 6, No. 0188. Wednesday, 12 Aug 1992.
Date: Mon, 10 Aug 92 12:56:30 EDT
From: junger@samsara.law.cwru.edu (Peter D. Junger)
Subject: Downloading from OPACS and filtering out computer droppings
A while back I posted a message on HUMANIST--which I gather
was reposted on PACS-L (with my permission)--discussing "Electronic
library catalogues and unreadable searches". I received many responses
to that posting, both over the lists and by personal communication. (I
tried to respond, and, I think, in most cases succeeded in responding, to
the personal messages.) I want to thank everyone who did come up with
sympathy and suggestions. And I want to summarize what I have learned about
this issue.
The problem that started all this was, as some of you will recall,
the fact that here at Case Western Reserve University our old--very old
and most incapable--electronic library catalogue was suddenly replaced with
a nice new INNOPAC catalogue. And suddenly I discovered that--though I could
now, mirabile dictu, actually do Boolean searches (sort of)--I could no
longer record my searches or results in machine readable form on my PC, which
left me feeling pretty desperate. The problem was that, though I could
still physically record what I was seeing as my search went on, I could not
read or edit the resulting file because it was so filled with computer droppings.
Now I suppose, if I had never used an electronic catalogue before,
I would not have missed what I had never had. But our old electronic
catalogue was so incapable that it actually wrote its screens in pure ASCII
code in a plodding linear fashion, never trying to go backwards or draw
boxes with non-ASCII symbols, or anything like that. I mean its output
was so crude and old-fashioned that it would have satisfied even the
zealots of the GUTNBERG list. And I had, since I get confused easily and
have little memory, and what there is of it can be quite creative, and since
I have the messiest desk in Cleveland and no place to write notes (and anyway
I can never find a pen), I had come to depend on the ability to record my
so often failing efforts to find a book that I knew was there somewhere.
And now _they_--the people who are supposed to support and encourage us
in the risky business of actually using these computers and networks and
things that allow _them_ to advertise that _we_ are on the cutting edge of
technology, the death of the thousand cuts--had once again, without warning
or compunction, pulled the electronic rug out from under my ability to
do my pedestrian research in my pedestrian way.
And that's why I sent my original message on this subject to
HUMANIST.
Now before I summarize what I have learned from the lists--learned
from you--I should take a moment to explain what happened here at CWRU
after I protested to the Provost that _they_ were getting rid of the old
electronic catalogue before the new one was working properly. (I didn't
dream at the time that there were electronic catalogs from which all
downloading was effectively impossible; I just assumed--correctly as it
turned out--that the new catalog was not yet fully operational.) The Provost
referred me to the Director of the University Libraries, who is quite
clearly not one of _them_. She was horrified, although apparently not
surprised, by my predicament and explained that the ability to download
records of a search would be available in thirty days. She also told
me that she thought that the problem could be fixed within three days
and suggested that I talk to one of that subset of _them_ that is
supposed to support the libraries' computers, including the new electronic
catalog, but which is not in the Director of the Libraries' chain-of-command.
That conversation was most unpleasant. (I think that I am a civilian
casualty of a turf war.) But in the end the Director prevailed and the end
result was that within three days our new INNOPAC system had an additional
function which allows one to "export" the bibliographic records that one has
located in TEXT (i.e., ASCII) format. (This export function supposedly also
allows one to download the bibliographic records in MARC and Pro-Cite formats,
but that capability has not yet been implemented.)
This ability to export a bibliographic record is not, of course,
the same as the ability to download a history of the search by which one
located the bibliographic records. But, if that capability had existed
at the time I suddenly discovered that the old electronic catalogue was
being prematurely disconnected, I would not have been so unhappy. In fact,
I do not think that I would have sent my original message to HUMANIST on
this subject or have begun writing a filter to remove the computer droppings--
ANSI escape codes--from a downloaded file.
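
For readers who have not seen these droppings up close, here is a minimal sketch in Python of the simplest kind of filter. It is not the filter described in this message (which is written in 8086 assembler), and the names `ANSI_ESCAPE` and `strip_ansi` are made up for illustration. It assumes the recording contains only the common "ESC [ ... letter" control sequences and a few two-character escapes, and it simply deletes them.

```python
import re

# Matches an ESC character followed by either
#   - a control sequence: "[" + parameter bytes + intermediate bytes + final byte
#   - or a single byte of a two-character escape (e.g. ESC =, ESC >, ESC 7, ESC 8)
# Longer sequences such as character-set selection (ESC ( B) are NOT handled here.
ANSI_ESCAPE = re.compile(r'\x1b(?:\[[0-?]*[ -/]*[@-~]|[@-Z\\-_=>78])')


def strip_ansi(text):
    """Return text with the escape sequences matched above removed."""
    return ANSI_ESCAPE.sub('', text)


if __name__ == '__main__':
    sample = '\x1b[2J\x1b[1;1HAUTHOR\x1b[7m Junger\x1b[0m'
    print(strip_ansi(sample))
```

Deleting the sequences makes the file readable, but it also throws away the cursor positioning they carried, so text that the catalogue painted out of order will stay out of order in the cleaned file.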
The responses to my message on this subject to HUMANIST made clear
that the problem of computer droppings afflicts most electronic catalogues,
not just INNOPAC. In fact, the problem is worse with other systems which
do not have INNOPAC's export function.
I think that this is another example of _their_ work. (In this case
_they_ being the vendors of electronic catalogues.) There is no reason
why _they_ should not use pure ASCII codes to put information on the user's
screen, but that would deprive _them_ of the ability to claim to be
up-to-date (viz., to have gotten into the early 1970s) because their output
can only be interpreted by DEC VT-100 (or, in the case of INNOPAC, VT-102)
terminals (or by communication programs that emulate them).
Several respondents claimed that INNOPAC's export function and
Pro-Cite were complete solutions to the problem and could not see why
one would want to keep a history of one's search. Others claimed that
the Pro-Cite/INNOPAC combination was a conspiracy in violation of the
anti-trust laws and strongly advised against buying a copy of Pro-Cite.
(I don't intend to buy one.)
A few suggested that the problem could be solved by taking
"snapshots" of each screen. That would technically solve the problem,
but wouldn't work for me; I can't take snapshots and think about my
search at the same time, especially not when the communications software
supplied by CWRU requires one to name the file to hold the snapshot
each time that one shoots. Some suggested that sending the record
to a printer, but then redirecting the printer output to a file on the
computer, would solve the problem. I never could figure out how that
would work--and anyway our telnet package does not, yet, allow one
to send the copy to a printer.
And then several people came up with a real solution. It seems
that Clyde Grotophorst at George Mason has written a program called
CITEREAD that will remove the computer droppings from a session recorded
from a NOTIS OPAC (which is one of the systems without any export
capability). It even allows the user to specify additional items to
filter from the downloaded file.
The only trouble was that CITEREAD, which is available at several
anonymous FTP sites, does not do a perfect job of handling records
downloaded from INNOPAC. But it does a good enough job, so I would
not have started working on my filter had I known of CITEREAD when I
first discovered the problem. But by the time I did hear of CITEREAD
and located a copy--using Archie--I had already sunk so much time in
my filter that it seemed a shame not to finish it.
And then nothing happened for a couple of weeks and then this
morning I finished my filter and it seems to work on recordings from
INNOPAC catalogues and on recordings from NOTIS catalogues and on recordings
from that other system that uses VT-100 emulation and is used by the
Cleveland Public Library: DRA I think it is called. I've now got a few
people around here alpha-testing the filter.
I am going on vacation for a week come dawn on Wednesday, so I
won't really be sure that it is bug-free until I return and can do some more
checking. (I know it needs to do a better job of checking boundary
conditions than it does at the moment, but none of the systems I have
tested it on have tried to move the cursor off the screen; INNOPAC and
NOTIS and DRA seem to produce well-behaved output.)
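
To make the boundary-condition worry concrete, here is a hedged sketch in Python of a filter that replays a recording onto an imaginary screen instead of just deleting the escape codes. It is an illustration only, not the assembler filter itself. The 24-by-80 screen size, the names `replay` and `clamp`, and the small set of codes handled (cursor position, erase display, erase to end of line, carriage return, line feed, backspace) are all assumptions. Any request to move the cursor off the screen is clamped to the nearest edge, and there is no scrolling or line wrapping.

```python
import re

ROWS, COLS = 24, 80  # assumed screen size

# One token per match: a control sequence (parameters + final byte),
# some other two-character escape, or a single ordinary character.
TOKEN = re.compile(r'\x1b\[([0-?]*)[ -/]*([@-~])|\x1b[^\[]|([^\x1b])')


def clamp(value, low, high):
    """Force value into the range low..high."""
    return max(low, min(high, value))


def replay(recording):
    """Replay a recorded session onto a blank screen and return its text."""
    screen = [[' '] * COLS for _ in range(ROWS)]
    row = col = 0
    for match in TOKEN.finditer(recording):
        params, final, char = match.groups()
        if final in ('H', 'f'):                 # cursor position, 1-based
            parts = (params.split(';') + ['', ''])[:2]
            nums = [int(p) if p.isdigit() else 1 for p in parts]
            row = clamp(nums[0] - 1, 0, ROWS - 1)
            col = clamp(nums[1] - 1, 0, COLS - 1)
        elif final == 'J' and params == '2':    # erase the whole display
            screen = [[' '] * COLS for _ in range(ROWS)]
        elif final == 'K' and params in ('', '0'):  # erase to end of line
            screen[row][col:] = [' '] * (COLS - col)
        elif char == '\r':
            col = 0
        elif char == '\n':
            row = clamp(row + 1, 0, ROWS - 1)   # stay on the last line, no scroll
        elif char == '\b':
            col = clamp(col - 1, 0, COLS - 1)
        elif char is not None and char >= ' ':  # printable character
            screen[row][col] = char
            col = clamp(col + 1, 0, COLS - 1)   # stay in the last column, no wrap
    lines = [''.join(line).rstrip() for line in screen]
    return '\n'.join(lines).rstrip('\n')


if __name__ == '__main__':
    demo = '\x1b[2J\x1b[2;5HTITLE\x1b[1;1HAUTHOR'
    print(replay(demo))
```

Because the text is placed where the cursor was told to go, fields that a catalogue paints out of order come out in screen order. A recording that spans several screens would need the buffer written out before each erase, which this sketch does not attempt.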
But I am feeling proud of myself, so I thought I would report this
apparent success. (If any of you have a desperate need for such a filter
or are compulsive alpha-testers of programs written in 8086 assembler and
just have to have a copy to see if you can break it, Judy Kaul at our
library (jak4@po.cwru.edu) can probably tell you how to get a copy. And
I will be around until tomorrow (Tuesday) evening.)
I think that there is a moral to all this. _They_ are going to
get us in the end unless we can support ourselves. With the INNOPAC
export facility and the existence of CITEREAD, my filter is hardly necessary,
but I feel a lot safer knowing that I can write it.
Peter D. Junger
Case Western Reserve University Law School, Cleveland, OH
Internet: JUNGER@SAMSARA.LAW.CWRU.Edu -- Bitnet: JUNGER@CWRU