Humanist Discussion Group, Vol. 14, No. 699.
Centre for Computing in the Humanities, King's College London
<http://www.princeton.edu/~mccarty/humanist/>
<http://www.kcl.ac.uk/humanities/cch/humanist/>
Date: Wed, 28 Feb 2001 10:09:48 +0000
From: NINCH-ANNOUNCE <david@ninch.org>
Subject: European Metadata Engine Project
NINCH ANNOUNCEMENT
News on Networking Cultural Heritage Resources
from across the Community
February 27, 2001
The Metadata Engine Project (METAe)
<http://meta-e.uibk.ac.at/>
METAe Newsletter Now Available
<http://meta-e.uibk.ac.at/newsletter/news.htm>
A promising European project, the METADATA ENGINE, is described below,
along with the first issue of the project's newsletter. Essentially, the
project is developing software that will automatically generate metadata
during the digitization of printed material, with the aim of making
"large scale digitisation of printed material, such as books and journals,
more reliable in terms of digital preservation, more cost-effective in
terms of automation, and more user-oriented in terms of future applications."
David Green
===========
>Date: Mon, 26 Feb 2001 10:06:02 +0000
>>list <DIGLIB@INFOSERV.NLC-BNC.CA>
>From: Simon Tanner <S.G.Tanner@HERTS.AC.UK>
>
*** Apologies for cross-postings ***
The Metadata Engine Project (METAe) - Newsletter now available.
The first issue of the METAe Newsletter is now available from:
<http://meta-e.uibk.ac.at/newsletter/news.htm>
(for an introduction to METAe see the base of this email)
In this first issue we introduce our project and report on progress to
date. Our next issue, due out in April 2001, will have even more detail
and information. The METAe homepage has further information, and of course
the METAe team welcomes contact at any time:
<http://meta-e.uibk.ac.at/>. The METAe Project is funded under the
European Union IST Programme.
In this issue, Günter Mühlberger, from the Project Co-ordination team at
the University of Innsbruck, explains the genesis of the idea that led to
the Metadata Engine Project. The influence of the METAe project is also
already being felt on the international scene, and Alexander Eggar explains
why METAe has been invited to attend the next MOA2 DTD meeting in New York.
We also introduce the 14 partners that make up the Metadata Engine project.
In future issues two partners per issue will showcase their expertise and
involvement in METAe. This will be a good opportunity to find out more
about the backgrounds of our various partners.
We will endeavour to keep you up to date with the METAe project's progress
and to give details of forthcoming events that METAe is organising or
presenting at. The newsletter may also include reports on meetings attended
by METAe partners, as this issue does, with an article by Gerd Prasthofer
on the SCHEMAS workshop held in Bonn in November 2000.
We hope you will find this newsletter useful and informative. Any feedback
can be directed to Simon Tanner, Editor of the METAe Newsletter at
<mailto:s.g.tanner@herts.ac.uk>
Best regards,
Simon Tanner
Senior Digitisation Consultant (HEDS)
Higher Education Digitisation Service
Web: <http://heds.herts.ac.uk>
Some further information about METAe:
The METADATA ENGINE Project
"Metadata" are playing a significant role in "digital preservation":
Firstly, they are, in conjunction with emerging standards (such as XML,
EAD, Dublin Core or RDF), among the most promising ways to keep digital
material "alive" over the years and decades. Secondly, metadata are needed
for all kinds of resource discovery, i.e. using and accessing digital
collections in a user-friendly way. The METADATA ENGINE project picks up
these considerations and will develop software modules to automate
metadata capture by introducing layout and document analysis as a key
technology for digitisation software. METAe aims to dramatically improve
the quality of creating and maintaining digital collections of printed
material such as books and journals.
Objectives
The METAe project will address the need for automated generation of
metadata during the conversion of printed documents and thus make large
scale digitisation of printed material, such as books and journals, more
reliable in terms of digital preservation, more cost-effective in terms of
automation, and more user-oriented in terms of future applications.
In order to achieve these aims the METADATA ENGINE project will
(1) introduce layout and document analysis to be employed as a key
technology in future digitisation software,
(2) develop capturing and conversion tools for the automated recording and
generation of administrative and descriptive metadata,
(3) develop an omnifont OCR-engine specialising in processing old European
typefaces of the 19th century,
(4) strictly obey emerging standards in the fields of digital preservation
and resource description, such as XML, EAD, TEI, or ISO 12083,
(5) develop an XML search engine capable of retrieving the tagged full text
and the images.
Description of work
The METAe project will develop a software package which extensively
automates and improves the generation of metadata by applying new
technologies for character, layout and document recognition, and converts
the captured information into XML documents. These XML files will serve as
a basis for a variety of applications, such as new XML search engines,
navigation tools, electronic books, audio books, or the automated
production of HTML, XHTML, PDF or PS files.
The METAe package consists of
(1) an input module for scanning printed material and importing existing
bibliographic metadata,
(2) an omnifont character recognition module (OCR-engine) specialising in
typefaces of the 19th century,
(3) a document analysis module capable of classifying pages according to
their physical and logical structure (items such as title pages, table of
contents pages, etc., will be recognised automatically),
(4) a page layout analysis module capable of analysing and segmenting page
elements such as page numbers, headings, captions, footnotes, pictures,
highlighted phrases, or graphical separators,
(5) a knowledge base providing a controlled vocabulary and rules for the
recognition process (for example, a table of contents is in most cases
headed "contents"),
(6) a conversion module assembling an XML document containing all
recognised metadata, and
(7) an export module for the XML enriched document and the scanned image.
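To make the interplay of the knowledge base and the conversion module more
concrete, here is a minimal sketch in Python. It is not METAe code: the
vocabulary entries, the function names (classify_page, build_document) and
the element names (document, page, heading, text) are all assumptions made
for illustration, and a real system would follow a standard such as TEI or
EAD rather than this ad hoc layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical controlled vocabulary: recognised heading -> page class.
# Illustrative entries only, not taken from the METAe knowledge base.
VOCABULARY = {
    "contents": "table_of_contents",
    "table of contents": "table_of_contents",
    "index": "index",
    "preface": "preface",
}


def classify_page(heading):
    """Return a page class for a recognised heading, or 'body' if unknown."""
    # Normalise case and whitespace before the vocabulary lookup.
    key = " ".join(heading.lower().split())
    return VOCABULARY.get(key, "body")


def build_document(title, pages):
    """Assemble an XML tree from (page_number, heading, text) tuples."""
    root = ET.Element("document")
    ET.SubElement(root, "title").text = title
    for number, heading, text in pages:
        page = ET.SubElement(
            root, "page", n=str(number), type=classify_page(heading)
        )
        ET.SubElement(page, "heading").text = heading
        ET.SubElement(page, "text").text = text
    return root


def to_xml(root):
    """Serialise the tree to a Unicode string."""
    return ET.tostring(root, encoding="unicode")
```

Calling build_document("A Sample Book", [(1, "Contents", "..."), (2, "Chapter I", "...")])
would tag the first page as a table of contents and the second as ordinary
body text. The lookup table stands in for the far richer rules a real
knowledge base would need.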
The XML documents will be generated according to emerging standards for
digital preservation and the electronic interchange of information such as
RDF, DC, EAD, TEI, or ISO 12083.
In order to introduce a wide public to the new features of accessing and
browsing images and XML-marked full texts, a METAe search engine and web
application will be developed as well.
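As a rough illustration of what retrieval over XML-marked full text linked
to page images could look like, the following Python sketch searches a small
sample document. The markup (page elements with n and image attributes), the
file names and the search function are assumptions for this example only and
do not describe the planned METAe search engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of an XML-enriched document with image references.
SAMPLE = """<document>
  <page n="1" image="0001.tif"><heading>Contents</heading><text>Chapter one</text></page>
  <page n="2" image="0002.tif"><heading>Chapter one</heading><text>The river rose in spring.</text></page>
</document>"""


def search(xml_text, term, element="text"):
    """Return (page number, image file) for pages whose <element> contains term."""
    root = ET.fromstring(xml_text)
    needle = term.lower()
    hits = []
    for page in root.iter("page"):
        for node in page.iter(element):
            # Case-insensitive substring match on the element's text.
            if needle in (node.text or "").lower():
                hits.append((page.get("n"), page.get("image")))
                break  # one hit per page is enough
    return hits
```

Because the text is tagged, the same query can be restricted to a structural
element: search(SAMPLE, "chapter one", element="heading") matches only the
page where the phrase is a heading, not the contents page that merely lists
it.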
============================================================
Simon Tanner
Senior Digitisation Consultant (HEDS)
Higher Education Digitisation Service
University of Hertfordshire
Phone: +44 (0) 1707 286078
Fax: +44 (0) 1707 286079
Web: <http://heds.herts.ac.uk>
METAe Project: <http://meta-e.uibk.ac.at/>
******************************************************************
Sun Microsystems, Inc. has published the second edition of its
popular "Digital Library Toolkit", a valuable resource for anyone
planning a digital collection. To download a free copy, go to:
<http://www.sun.com/products-n-solutions/edu/libraries/digitaltoolkit.html>
******************************************************************
==============================================================
NINCH-Announce is an announcement listserv, produced by the National
Initiative for a Networked Cultural Heritage (NINCH). The subjects of
announcements are not the projects of NINCH, unless otherwise noted;
neither does NINCH necessarily endorse the subjects of announcements. We
attempt to credit all re-distributed news and announcements and appreciate
reciprocal credit.
For questions, comments or requests to un-subscribe, contact the editor:
<mailto:david@ninch.org>
==============================================================
See and search back issues of NINCH-ANNOUNCE at
<http://www.cni.org/Hforums/ninch-announce/>.
==============================================================