18.052 new on WWW: Nameless Shakespeare; Rich Site Services

From: Humanist Discussion Group (by way of Willard McCarty <willard.mccarty@kcl.ac.uk>)
Date: Tue Jun 08 2004 - 04:55:22 EDT

                Humanist Discussion Group, Vol. 18, No. 52.
       Centre for Computing in the Humanities, King's College London
                   www.kcl.ac.uk/humanities/cch/humanist/
                        www.princeton.edu/humanist/
                     Submit to: humanist@princeton.edu

[1] From: Martin Mueller <martinmueller@northwestern.edu> (51)
Subject: A new digital Shakespeare from Northwestern University

[2] From: Gerry Mckiernan <gerrymck@IASTATE.EDU> (37)
Subject: Announcing: _RSS(sm): Rich Site Services_

--[1]------------------------------------------------------------------
         Date: Tue, 08 Jun 2004 09:47:22 +0100
         From: Martin Mueller <martinmueller@northwestern.edu>
         Subject: A new digital Shakespeare from Northwestern University

May I draw the attention of humanists everywhere to a new electronic
Shakespeare, which is accessible from the Northwestern University Library at
www.library.northwestern.edu/shakespeare.

The Nameless Shakespeare, as it is provisionally called, is the product of
collaboration between the Perseus Project at Tufts University and
Northwestern faculty and staff in Academic Technologies and the
Library. The project is very much a work in progress and will become part
of WordHoard, a larger project at Northwestern, which has received funding
from the Mellon Foundation.

The aim of the Nameless Shakespeare is to create a freely available text
that fully supports the query potential of the digital surrogate.
The text is derived from a scanned version of the Globe Shakespeare but has
been thoroughly revised to create a text that is standardized in its
spelling but reflects as closely as possible the prosodic and morphological
properties of the folio or quarto copy texts. The text is tagged in a
TEI-conformant manner, and in addition to its own citation scheme it
carries references to the Hinman TLN numbers. It is fully lemmatized and
has been parsed with the CLAWS part-of-speech tagger developed at Lancaster
University and used for the British National Corpus. In the course of this
summer we will add a level of semantic tagging to this text, using the USAS
tagger developed by Lancaster University.

The current interface for the Nameless Shakespeare is a stopgap measure
while we develop the new WordHoard interface, which will let users take
full advantage of this deeply tagged text. But clunky and inconsistent as
the current interface may be (especially in its delivery of complex query
results) it lets you do now what you cannot easily do through any other
site. You can, for instance, make a list of words spoken by Ophelia in
verse, or a list of words that occur only in Hamlet and Lear, adjectives in
the Comedy of Errors, and so forth.
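
As a rough illustration of why such queries become possible once every word
occurrence carries its speaker, lemma, part of speech, and a verse/prose flag,
here is a minimal Python sketch. The record layout, the field names, and the
sample rows are all invented for the example; they are assumptions, not the
actual Nameless Shakespeare markup or data.

```python
from collections import namedtuple

# Hypothetical token record. The field names are illustrative assumptions
# and do not reflect the project's real tagging scheme.
Token = namedtuple("Token", "work speaker lemma pos is_verse")

# Invented toy rows, not taken from the corpus.
TOKENS = [
    Token("Hamlet", "Ophelia", "alpha", "noun", True),
    Token("Hamlet", "Ophelia", "beta", "noun", False),
    Token("Hamlet", "Hamlet", "gamma", "adjective", True),
    Token("The Comedy of Errors", "Adriana", "delta", "adjective", True),
    Token("The Comedy of Errors", "Adriana", "epsilon", "noun", True),
]

def lemmas(tokens, **criteria):
    """Return the sorted set of lemmas whose tokens match every criterion."""
    return sorted({
        t.lemma
        for t in tokens
        if all(getattr(t, field) == value for field, value in criteria.items())
    })

# "Words spoken by Ophelia in verse"
ophelia_verse = lemmas(TOKENS, speaker="Ophelia", is_verse=True)

# "Adjectives in the Comedy of Errors"
errors_adjectives = lemmas(TOKENS, work="The Comedy of Errors", pos="adjective")
```

Each query is simply a conjunction of conditions on per-token attributes,
which is what deep tagging makes available.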

At the moment the text of the Nameless Shakespeare will be accessible only
through the Northwestern interface. We expect to release the text early in
the fall of 2004 after we have added the level of semantic tagging and
corrected many remaining errors in the part-of-speech tagging, especially
in the assignment of grammatical words to such categories as adverb,
conjunction, or determiner.

We will be most interested in hearing from users of the Nameless
Shakespeare what they would like to see in the better interface we plan to
develop through WordHoard, and we will also be very grateful for error
reports. Automatic tagging of textual data has an error rate on the order
of 5%. Through manual corrections we have now reached a stage where we
believe the error rate hovers around 1%. But in a text of some 850,000 word
occurrences that still means about 10,000 wrongly assigned word
occurrences. Error reports, even of individual errors, are very useful in
directing attention to systemic problems.
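
The orders of magnitude above can be checked with a few lines of Python. The
sketch below uses only the figures quoted in this message (some 850,000 word
occurrences, error rates of about 5% and about 1%); the function name is an
assumption made for the example.

```python
TOTAL_OCCURRENCES = 850_000  # "some 850,000 word occurrences", as quoted above

def expected_errors(total, percent):
    """Expected number of wrongly tagged occurrences at a given error rate (%)."""
    return total * percent // 100

automatic = expected_errors(TOTAL_OCCURRENCES, 5)   # automatic tagging, ~5%
corrected = expected_errors(TOTAL_OCCURRENCES, 1)   # after manual correction, ~1%
```

At exactly 1% this gives 8,500 occurrences, the same order of magnitude as the
rounded "about 10,000" given above, against 42,500 at the 5% rate of purely
automatic tagging.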

Martin Mueller
Professor of English and Classics
Department of English
Northwestern University
Evanston, Illinois 60208
martinmueller@northwestern.edu
847-864-3496

--[2]------------------------------------------------------------------
         Date: Tue, 08 Jun 2004 09:49:13 +0100
         From: Gerry Mckiernan <gerrymck@IASTATE.EDU>
         Subject: Announcing: _RSS(sm): Rich Site Services_

RSS(sm): Rich Site Services

I am pleased to announce the establishment of my latest Web registry,
titled _RSS(sm): Rich Site Services_. _RSS(sm)_ is a categorized registry
of library services that are delivered or provided through RSS/XML feeds
and is available at:

[ http://www.public.iastate.edu/~CYBERSTACKS/RSS.htm ]

RSS is an initialism for RDF Site Summary / Rich Site Summary / Really
Simple Syndication.

[ http://www.libraryjournal.com/article/CA296443 ]

RSS(sm) has been seeded with examples in the following groups:

For each entry within a category, a link is provided to an RSS (and/or
XML) feed for the item, or to an information page that provides one or
more such links.
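
For readers unfamiliar with what such a feed contains, the following Python
sketch reads the item titles and links out of a small RSS 2.0 document using
only the standard library. The sample feed, its titles, and its example.org
addresses are invented for illustration and do not correspond to any entry in
the registry.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed; every title and URL here is a placeholder.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Library: New Acquisitions</title>
    <link>http://example.org/library/</link>
    <description>Invented feed for illustration</description>
    <item>
      <title>First new title</title>
      <link>http://example.org/library/items/1</link>
    </item>
    <item>
      <title>Second new title</title>
      <link>http://example.org/library/items/2</link>
    </item>
  </channel>
</rss>"""

def feed_items(xml_text):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]

items = feed_items(SAMPLE_FEED)
```

A live feed would be fetched over HTTP first; RDF-based RSS 1.0 feeds use XML
namespaces and would need namespace-aware element names.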

I am greatly interested in learning about other examples of ANY and ALL
RSS service(s) provided by any type of Library (academic, corporate,
public, research, special, etc.) for potential inclusion in this new
registry.

I am particularly interested in the use of syndication services such as
that provided by Amazon.com
[http://www.amazon.com/exec/obidos/subst/xs/syndicate.html/ ] for
Collection Development (or other library service), as well as any that
relate to other types of library services (e.g., ADMINISTRATION |
ACQUISITIONS | CATALOGING | CIRCULATION | COLLECTION DEVELOPMENT |
INSTRUCTION | INTERLIBRARY LOAN | ONLINE PUBLIC ACCESS CATALOGS |
REFERENCE SERVICES | TABLE OF CONTENTS | ETC.)

Regards,

Gerry

Gerry McKiernan
Syndicated Librarian
Iowa State University
Ames IA 50011

gerrymck@iastate.edu

The RSS(sm) registry was inspired by recent postings to the Web4Lib
e-list
[ http://sunsite.berkeley.edu/Web4Lib/archive.html ]
