21.502 Second Linguistic Annotation Workshop

From: Humanist Discussion Group (by way of Willard McCarty <willard.mccarty_at_kcl.ac.uk>)
Date: Sat, 26 Jan 2008 09:51:54 +0000

               Humanist Discussion Group, Vol. 21, No. 502.
       Centre for Computing in the Humanities, King's College London
  www.kcl.ac.uk/schools/humanities/cch/research/publications/humanist.html
                        www.princeton.edu/humanist/
                     Submit to: humanist_at_princeton.edu

         Date: Sat, 26 Jan 2008 09:46:10 +0000
         From: Willard McCarty <willard.mccarty_at_kcl.ac.uk>
         Subject: Second Linguistic Annotation Workshop

From: Nancy Ide <ide_at_cs.vassar.edu>
Date: Fri, 25 Jan 2008 10:59:02 -0500

CALL FOR PARTICIPATION
ACL Special Interest Group on Annotation (SIGANN)
Sharable Corpus and Best Practice Guidelines Working Group Sessions

1-6PM, May 27, 2008

The Second Linguistic Annotation Workshop
Held in conjunction with LREC 2008
Marrakech, Morocco

<http://verbs.colorado.edu/LAW2008/>

The SIGANN Sharable Corpus and Best Practice Guidelines Working
Groups will hold a joint session at the Second Linguistic Annotation
Workshop on the afternoon of May 27, 2008, in Marrakech, Morocco. The
session will be devoted to issues surrounding the merging and
harmonization of linguistic annotations representing various
phenomena that may have been produced by different groups using
different formats, and may be based on different theoretical
approaches. The discussions will use as a point of departure
linguistic annotations of a portion of the SIGANN Sharable Corpus
contributed by members of the computational linguistics community.

We solicit contributions of manually or automatically produced
annotations of the SIGANN Sharable Corpus for any linguistic
phenomenon, including but not limited to morpho-syntax, syntax,
semantic roles, word senses, named entities, temporal elements,
events, co-reference and other discourse-level phenomena. The
annotations will be collected in early April, after which the session
organizers will coordinate an effort to merge and compare the
contributed annotations. Based on the experience of this exercise,
discussion points including examples will be drawn up for
consideration in the joint session. Issues to be considered will include:

(1) What are the issues/problems of merging diverse annotations of
different phenomena into a single multi-layer annotation, in terms of
harmonizing different physical formats?

(2) What are the issues/problems of merging diverse annotations of
different phenomena into a single multi-layer annotation, in terms of
enabling a coherent and comprehensive linguistic description?

(3) Are there phenomena for which an attempt at
compatibility/harmonization is not desirable?

(4) What are the implications and/or suggestions of this exercise
for the development of best practice guidelines for linguistic annotation?

(5) Are there certain phenomena (e.g. segmentation into tokens,
phrases, etc.) that lend themselves more readily to the specification
of standard practices, and for which the existence of a common method
would enhance annotation interoperability?

(6) What are the good and bad consequences of introducing a
theoretical bias into the merging process? A theoretically biased
merging procedure creates essentially a new annotation that uses
previously created annotation as input in a destructive manner so
that the input annotation cannot be read directly from the merged
output. Can the creation of a merged annotation that is consistent
with a theory justify making these changes? Can "errors" in input
annotation be detected in this way?

Those who wish to contribute annotations and/or be involved in
discussions at the session should consult the LAW II website for
details: <http://verbs.colorado.edu/LAW2008/>,
or contact the session organizers.

Session organizers:

Best Practices Working Group
Nancy Ide, Vassar College (ide [at] cs.vassar.edu)

Sharable Corpus Working Group
Adam Meyers, New York University (meyers [at] cs.nyu.edu)

Received on Sat Jan 26 2008 - 05:20:26 EST
