15.096 accuracy rates & OCR research

From: by way of Willard McCarty (willard@lists.village.Virginia.EDU)
Date: Sun Jun 10 2001 - 06:10:41 EDT

                    Humanist Discussion Group, Vol. 15, No. 96.
           Centre for Computing in the Humanities, King's College London

       [1] From: Willard McCarty <willard.mccarty@kcl.ac.uk> (25)
             Subject: OCR research

       [2] From: Barbara Bordalejo <bb268@nyu.edu> (8)
             Subject: Re: 15.093 accuracy rates for proof-reading

             Date: Sun, 10 Jun 2001 10:18:03 +0100
             From: Willard McCarty <willard.mccarty@kcl.ac.uk>
             Subject: OCR research

    Those interested in following the current thread on OCR of hand-printed
    texts might look at the following:

    1. Center of Excellence for Document Analysis and Recognition
    <http://www.cedar.Buffalo.EDU/>, a research centre "concerned with the
    science of recognition, analysis and interpretation of digital documents".

    2. The Document Understanding and Character Recognition WWW Server
    (Maryland) <http://documents.cfar.umd.edu/>, which "serves as a repository
    for Document Image Understanding and Optical Character Recognition (OCR)
    information and resources"; see esp. the page on commercial character
    recognition resources, <http://documents.cfar.umd.edu/resources/products/>.

    3. Information Science Research Institute (Nevada)
    <http://www.isri.unlv.edu/>. This institute once published yearly results
    from its "OCR Technology Assessment" programme but does not appear to do so
    any longer.

    4. OCR and Text Recognition: Academic Research Projects
    <http://hera.itc.it:3003/~messelod/OCR/ResearchProjects.html>. A
    bibliography of academic research projects in the area.

    Other recommendations welcome.


    Dr Willard McCarty / Senior Lecturer /
    Centre for Computing in the Humanities / King's College London /
    Strand / London WC2R 2LS / U.K. /
    +44 (0)20 7848-2784 / ilex.cc.kcl.ac.uk/wlm/

             Date: Sun, 10 Jun 2001 10:18:34 +0100
             From: Barbara Bordalejo <bb268@nyu.edu>
             Subject: Re: 15.093 accuracy rates for proof-reading

    The final check of the Canterbury Tales Project publications should have
    "less than one correction for every four thousand characters." (Cf. p.
    45, Robinson and Solopova, "Transcription Guidelines" in Blake and
    Robinson, eds., _The Canterbury Tales Project Occasional Papers Volume I_,
    Oxford: OHC, 1993).
    I have the idea that this translated to one mistake per hundred lines of
    transcription. Of course, the CDs include digitized images of the
    manuscripts, which can be compared with the transcriptions.
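
    As a rough check of the arithmetic, the two figures above agree only if
    a transcribed line averages about 40 characters. That average is an
    inference from the two numbers quoted here, not a figure stated in the
    Guidelines, and the function names below are illustrative only. A
    minimal sketch:

```python
# Figures quoted in the message; everything derived from them is an inference.
CHARS_PER_CORRECTION = 4000  # "one correction for every four thousand characters"
LINES_PER_MISTAKE = 100      # "one mistake per hundred lines of transcription"


def implied_chars_per_line(chars_per_correction, lines_per_mistake):
    """Average line length at which the two quoted figures coincide."""
    return chars_per_correction / lines_per_mistake


def error_rate(chars_per_correction):
    """Corrections per character at the quoted threshold."""
    return 1 / chars_per_correction


def passes_final_check(corrections, total_chars,
                       chars_per_correction=CHARS_PER_CORRECTION):
    """True if there is strictly less than one correction per threshold span."""
    return corrections * chars_per_correction < total_chars


print(implied_chars_per_line(CHARS_PER_CORRECTION, LINES_PER_MISTAKE))  # 40.0
print(f"{error_rate(CHARS_PER_CORRECTION):.3%}")
```

    On these assumptions, a hypothetical transcription of 48,000 characters
    would pass the final check with 10 corrections but not with 12.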

    Barbara Bordalejo
