18.052 new on WWW: Nameless Shakespeare; Rich Site Services

From: Humanist Discussion Group (by way of Willard McCarty willard.mccarty@kcl.ac.uk)
Date: Tue Jun 08 2004 - 04:55:22 EDT

                    Humanist Discussion Group, Vol. 18, No. 52.
           Centre for Computing in the Humanities, King's College London
                       www.kcl.ac.uk/humanities/cch/humanist/
                            www.princeton.edu/humanist/
                         Submit to: humanist@princeton.edu

       [1] From: Martin Mueller <martinmueller@northwestern.edu> (51)
             Subject: A new digital Shakespeare from Northwestern University

       [2] From: Gerry Mckiernan <gerrymck@IASTATE.EDU> (37)
             Subject: Announcing: _RSS(sm): Rich Site Services_

    --[1]------------------------------------------------------------------
             Date: Tue, 08 Jun 2004 09:47:22 +0100
             From: Martin Mueller <martinmueller@northwestern.edu>
             Subject: A new digital Shakespeare from Northwestern University

    May I draw the attention of humanists everywhere to a new electronic
    Shakespeare, which is accessible from the Northwestern University Library at
    www.library.northwestern.edu/shakespeare.

    The Nameless Shakespeare, as it is provisionally called, is the product of
    collaboration between the Perseus Project at Tufts University and
    Northwestern faculty and staff in Academic Technologies and the
    Library. The project is very much a work in progress and will become part
    of WordHoard, a larger project at Northwestern, which has received funding
    from the Mellon Foundation.

    The aim of the Nameless Shakespeare is to create a freely available text
    that fully supports the query potential of the digital surrogate.
    The text is derived from a scanned version of the Globe Shakespeare but has
    been thoroughly revised to create a text that is standardized in its
    spelling but reflects as closely as possible the prosodic and morphological
    properties of the folio or quarto copy texts. The text is tagged in a
    TEI-conformant manner, and in addition to its own citation scheme it
    carries references to the Hinman TLN numbers. It is fully lemmatized and
    has been parsed with the CLAWS part-of-speech tagger developed at Lancaster
    University and used for the British National Corpus. In the course of this
    summer we will add a level of semantic tagging to this text, using the USAS
    tagger developed by Lancaster University.

    The current interface for the Nameless Shakespeare is a stopgap measure
    while we develop the new WordHoard interface, which will let users take
    full advantage of this deeply tagged text. But clunky and inconsistent as
    the current interface may be (especially in its delivery of complex query
    results), it lets you do now what you cannot easily do through any other
    site. You can, for instance, make a list of words spoken by Ophelia in
    verse, or a list of words that occur only in Hamlet and Lear, adjectives in
    the Comedy of Errors, and so forth.
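    The three example queries can be sketched over a token table in which
    each word occurrence carries its play, speaker, verse/prose status, lemma,
    and part of speech. This is only an illustration: the Token fields, the
    coarse tag names, and the eight sample rows are invented here and are not
    taken from the Nameless Shakespeare data or its interface. "Only in Hamlet
    and Lear" is read as "in both plays and in no other", which is an
    assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One word occurrence. All field names are hypothetical."""
    work: str
    speaker: str
    mode: str   # "verse" or "prose"
    lemma: str
    pos: str    # coarse part-of-speech label, invented for this sketch


# Toy rows for illustration only; not real corpus data.
TOKENS = [
    Token("Hamlet", "Ophelia", "verse", "noble", "adj"),
    Token("Hamlet", "Ophelia", "verse", "mind", "noun"),
    Token("Hamlet", "Ophelia", "prose", "rosemary", "noun"),
    Token("King Lear", "Lear", "verse", "mind", "noun"),
    Token("King Lear", "Lear", "verse", "nothing", "noun"),
    Token("Hamlet", "Hamlet", "verse", "nothing", "noun"),
    Token("The Comedy of Errors", "Adriana", "verse", "nothing", "noun"),
    Token("The Comedy of Errors", "Adriana", "verse", "jealous", "adj"),
]


def words_spoken(tokens, speaker, mode):
    """Distinct lemmas spoken by one speaker in verse or in prose."""
    return sorted({t.lemma for t in tokens
                   if t.speaker == speaker and t.mode == mode})


def words_only_in(tokens, works):
    """Lemmas found in every listed work and in no other work."""
    seen_in = defaultdict(set)
    for t in tokens:
        seen_in[t.lemma].add(t.work)
    wanted = set(works)
    return sorted(lemma for lemma, ws in seen_in.items() if ws == wanted)


def lemmas_with_pos(tokens, work, pos):
    """Distinct lemmas with a given part of speech in one work."""
    return sorted({t.lemma for t in tokens
                   if t.work == work and t.pos == pos})


ophelia_verse = words_spoken(TOKENS, "Ophelia", "verse")
hamlet_lear_only = words_only_in(TOKENS, ["Hamlet", "King Lear"])
errors_adjectives = lemmas_with_pos(TOKENS, "The Comedy of Errors", "adj")
```

    Each query is a filter over token attributes followed by a reduction to
    distinct lemmas, which is why lemmatization and part-of-speech tagging
    are what make such lists possible.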

    For the moment the text of the Nameless Shakespeare will be accessible only
    through the Northwestern interface. We expect to release the text early in
    the fall of 2004 after we have added the level of semantic tagging and
    corrected many remaining errors in the part-of-speech tagging, especially
    in the assignment of grammatical words to such categories as adverb,
    conjunction, or determiner.

    We will be most interested in hearing from users of the Nameless
    Shakespeare what they would like to see in the better interface we plan to
    develop through WordHoard, and we will also be very grateful for error
    reports. Automatic tagging of textual data has an error rate on the order
    of 5%. Through manual corrections we have now reached a stage where we
    believe the error rate hovers around 1%. But in a text of some 850,000 word
    occurrences that still means about 10,000 wrongly assigned word
    occurrences. Error reports, even of individual errors, are very useful in
    directing attention to systemic problems.
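    The arithmetic behind these figures can be checked in a few lines. The
    function name is made up for this sketch, and the inputs are the rounded
    figures quoted above, so the results are rough estimates rather than
    measured counts.

```python
def expected_errors(token_count: int, error_rate: float) -> int:
    """Estimated number of wrongly tagged word occurrences."""
    return round(token_count * error_rate)


TOKEN_COUNT = 850_000  # "some 850,000 word occurrences"

automatic = expected_errors(TOKEN_COUNT, 0.05)  # rate on the order of 5%
corrected = expected_errors(TOKEN_COUNT, 0.01)  # rate hovering around 1%
```

    At exactly 1% the estimate comes to 8,500 occurrences, the same order of
    magnitude as the "about 10,000" quoted above; at 5% it would be 42,500.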

    Martin Mueller
    Professor of English and Classics
    Department of English
    Northwestern University
    Evanston, Illinois 60208
    martinmueller@northwestern.edu
    847-864-3496

    --[2]------------------------------------------------------------------
             Date: Tue, 08 Jun 2004 09:49:13 +0100
             From: Gerry Mckiernan <gerrymck@IASTATE.EDU>
             Subject: Announcing: _RSS(sm): Rich Site Services_

    RSS(sm): Rich Site Services

    I am pleased to announce the establishment of my latest Web registry,
    titled _RSS(sm): Rich Site Services_. _RSS(sm)_ is a categorized registry
    of library services that are delivered or provided through RSS/XML feeds
    and is available at:

    [ http://www.public.iastate.edu/~CYBERSTACKS/RSS.htm ]

    RSS is an initialism for RDF Site Summary / Rich Site Summary / Really
    Simple Syndication.

       [ http://www.libraryjournal.com/article/CA296443 ]

    RSS(sm) has been seeded with examples in the following groups:

    | ANNOUNCEMENTS | INTERNET RESOURCES GUIDES | NEW BOOKS | NEW JOURNAL
    ISSUES | NEWS |

    For each entry within a category, a link is provided to an RSS (and/or
    XML) feed for the item, or to an information page that provides a
    subsequent link (or more).
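    For readers unfamiliar with what such a feed contains, here is a minimal
    sketch of reading one with Python's standard library. The sample feed,
    its titles, and the example.org addresses are invented for illustration
    and do not correspond to any entry in the registry; the sketch assumes
    the RSS 2.0 layout (rss / channel / item), not the RDF-based variant.

```python
import xml.etree.ElementTree as ET

# Hypothetical "new books" feed, written out by hand for this sketch.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Library: New Books</title>
    <link>http://library.example.org/newbooks</link>
    <description>Invented feed for illustration</description>
    <item>
      <title>First new title</title>
      <link>http://library.example.org/newbooks/1</link>
    </item>
    <item>
      <title>Second new title</title>
      <link>http://library.example.org/newbooks/2</link>
    </item>
  </channel>
</rss>"""


def list_items(feed_xml: str) -> list[tuple[str, str]]:
    """Return (title, link) pairs for every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


items = list_items(SAMPLE_FEED)
```

    A library service delivered this way is simply a list of titled links
    that a reader program fetches and re-reads on a schedule.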

    I am greatly interested in learning about other examples of ANY and ALL
    RSS service(s) provided by any type of Library (academic, corporate,
    public, research, special, etc.) for potential inclusion in this new
    registry.

    I am particularly interested in the use of syndication services such as
    the one provided by Amazon.com
    [ http://www.amazon.com/exec/obidos/subst/xs/syndicate.html/ ] for
    Collection Development (or other library service), as well as any that
    relate to other types of library services (e.g., ADMINISTRATION |
    ACQUISITIONS | CATALOGING | CIRCULATION | COLLECTION DEVELOPMENT |
    INSTRUCTION | INTERLIBRARY LOAN | ONLINE PUBLIC ACCESS CATALOGS |
    REFERENCE SERVICES | TABLE OF CONTENTS | ETC.)

    Regards,

    Gerry

    Gerry McKiernan
    Syndicated Librarian
    Iowa State University
    Ames IA 50011

    gerrymck@iastate.edu

    The RSS(sm) registry was inspired by recent postings to the Web4Lib
    e-list
    [ http://sunsite.berkeley.edu/Web4Lib/archive.html ]


