17.005 POS tagging for Latin, consumptively viewed

From: Humanist Discussion Group (by way of Willard McCarty willard.mccarty@kcl.ac.uk)
Date: Thu May 08 2003 - 01:53:20 EDT

                Humanist Discussion Group, Vol. 17, No. 5.
       Centre for Computing in the Humanities, King's College London
                   www.kcl.ac.uk/humanities/cch/humanist/
                     Submit to: humanist@princeton.edu

         Date: Thu, 08 May 2003 06:45:27 +0100
         From: Neven Jovanovic <neven.jovanovic@zg.tel.hr>
         Subject: Re: 16.610 POS tagging for Latin?

Some time ago, a member of the list asked as follows:

>I am looking for something equivalent to the CLAWS POS tagger that will work
>with a Latin text. I poked around on the web but nothing leaped out.

This is, of course, a problem of flective languages (with relatively free
word order), connected with the problem of parsing (in Latin, Russian,
Croatian, Greek...). As far as I know, there is some research into parsers
for Latin (the Italian LEMLAT project, the Portuguese OLISSIPO project--both
traceable on the WWW), but it still seems to linger at a purely academic level
(restricted to certain word types, or to certain text groups--in any case,
nothing readily available for us end-users).

However, this seems related to the _consumptive humanities_ theme. If I
want to parse, or to tag parts of speech in a Latin text, or texts--do I
build a parser first, or do I do it the _old-fashioned_ way, relying on
human linguistic intelligence? The first way has an obvious advantage--once
I have built the parser, with the necessary adaptations, I can sell it to any
and all who need to parse / spellcheck any flective language, and get quite
comfortably rich (so that I can even devote the rest of my life to purely
academic classical philology).

Neven
