[tei-council] Report from Berlin

Martin Holmes mholmes at uvic.ca
Tue Oct 23 16:50:47 EDT 2012


Thanks Lou for such a clear and detailed account. This looks very 
interesting.

I've just been doing some work with ISOCAT myself, in an effort to start 
mapping feature structures from a TEI dictionary onto the GOLD 2010 
ontology, as expressed in ISOCAT. There are a couple of minor issues we 
should look at here (see below for boring detail).

I've also been talking to Laurent a little about the possibility of TEI 
as a serialization for the Lexical Markup Framework (LMF), and I'm expecting to
work more on this ahead of a presentation at the ICLDC3 conference in 
Hawaii next year. LMF is also an ISO standard, so Laurent's mission to 
bring ISO standards and TEI into closer integration is moving forward.

Issues I've found with ISOCAT/TEI integration:

It's not precisely clear to me how the attributes dcr:datcat and 
dcr:valueDatcat should map into a system with multiple hierarchical 
levels; and given such a system, I don't see the value of dcr:datcat 
since whatever level of ancestor it's supposed to point to can be 
discovered from what dcr:valueDatcat is pointing at. Meanwhile, it
remains impossible to point to a particular ontology in ISOCAT, because
ontologies there don't have unique ids/URIs. Individual ontological
components can, however, be imported into multiple ontologies, so AFAICS
you can't specify that you're pointing at the instance of 3062
(AcousticProperty) that's in RELISH / GOLD 2010, or the one that's in
GOLD / GOLD 2010, or one that's somewhere else (assuming this could be
said to matter).
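
To make that concrete, here's roughly the kind of markup I've been
experimenting with (a sketch only; the feature name and enclosing
structure are invented for illustration, though DC-3062 is the real
AcousticProperty category):

   <fs xmlns:dcr="http://www.isocat.org/ns/dcr">
     <f name="acousticProperty"
        dcr:datcat="http://www.isocat.org/datcat/DC-3062">
       <!-- the PID identifies the data category itself, but says
            nothing about which ontology's copy of it is meant -->
     </f>
   </fs>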

This could all be due to my own ignorance, of course.

Cheers,
Martin

On 12-10-23 01:07 PM, Lou Burnard wrote:
> *EIT MMI Meeting, Berlin, 22 Oct 2012*
>
> As noted at the last FTF, Laurent Romary in his capacity as ISO TC 37 WG 3
> chair has proposed a new ISO/TEI joint activity in the area of speech
> transcription, which comes with the slightly obscure label of EIT MMI:
> the last part of which is short for “multimodal interaction”, although
> it seems the activity is really only concerned with speech
> transcription. I was invited to attend the third EIT MMI workshop, held
> at the DIN's offices in Berlin. Prime movers in the activity, apart from
> Laurent, appear to be Thomas Schmidt and Andreas Witt from the Institut
> für Deutsche Sprache in Mannheim, but a number of other European
> research labs, mostly concerned with analysis of corpora of human
> computer interaction, were also represented; specifically: Nadia Mana
> from FBK (Trento, Italy); Tatjana Scheffler (DFKI, Germany); Khiet
> Truong (Univ of Twente); Benjamin Weiss (TU Berlin); Mathias Wilhelm
> (DAI Labor); Bertrand Gaiffe (ATILF, Nancy). This being an ISO activity,
> the real world of commerce and industry was also represented by Felix
> Burkhardt from Deutsche Telekom's Innovation Lab.
>
> Related ISO activities mentioned by Laurent included the work on
> Discourse Relations led by Harry Bunt and the long-awaited MAF
> (morpho-syntactic annotation framework), both due to appear Real Soon
> Now. A quick tour de table confirmed my impression that most of the
> attendees were primarily researchers in Human Computer Interaction
> with little direct experience of the construction or encoding of
> spoken corpora, but Thomas Schmidt more than made up for that.
>
> The main business of the day
> was to go through his preliminary draft working document, the objective
> of which is to confer ISO authority on a subset of the existing TEI
> proposals for spoken text transcription, with some possible
> modification. The underlying work is well described in Schmidt's recent
> excellent article in TEIJ, so I won't repeat it: essentially, it
> consists of a close look at the majority of transcription formats used
> by the relevant research community/ies and tools, a synthesis of what
> they have in common, and suggestions of how that synthesis maps to TEI.
> This is to a large extent motivated by concerns about preservation and
> migration of data in “legacy” formats.
>
> The discussion began by establishing boundaries: despite my proposal to
> the contrary, it seems there was little appetite to extend the work into
> the area of truly multimodal transcriptions, which was still generally
> felt to be insufficiently understood for a practice-based standard to be
> appropriate. Concern was expressed that we should not make ad hoc
> premature suggestions. So the document really only concerns transcribed
> speech. There was no disagreement with the general approach, which is to
> distinguish a small number of macro-structural features and to provide
> guidelines about how to mark up specific units of analysis at the
> micro-structural level, using a subset of the TEI.
>
> I was also much cheered by two further remarks Thomas made:
>
> - the graph-based “annotation framework” formalisation proposed by Bird
>   and Liberman was theoretically complete but so generic as to be
>   practically useless (I paraphrase);
> - at the micro level, “everything you need is there in the TEI” (I quote).
>
> Discussion focussed on the following points raised by the working document:
>
> *Tiers*
>
> Many existing tools organise transcriptions into “tiers” of annotation.
> These seem to be purely technical artefacts, which can be addressed more
> exactly by use of XML markup. Unlike “levels” of annotation, they have
> no semantics. It's doubtful that we need a <tier> element.
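>
> For illustration only (the @type value and alignment targets are
> invented, not part of the draft): a tool's “phonetic” tier could
> simply become a <spanGrp> of time-aligned <span>s, with the tier name
> demoted to an ordinary attribute value:
>
>    <spanGrp type="phonetic">
>      <span from="#T0" to="#T1">ai nevə did</span>
>    </spanGrp>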
>
> *Metadata-1*
>
> How many of the (very rich) TEI proposals should be included, or
> mentioned? And how should the three things Thomas had found missing be
> supplied? I suggested that <appInfo> was an appropriate way to record
> information about the transcription tool used; that the definition of
> the transcription system used belonged in the <encodingDesc>; and agreed
> that there was nothing specifically provided for recording pointers or
> links to the original video or audio transcribed. In the meeting, I
> speculated that maybe there was scope for extending (or misusing)
> <facsimile> for this last purpose; another possibility which occurs to
> me as I type these notes is that one could also extend <recordingStmt>.
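>
> A sketch of the <appInfo> suggestion (the tool name and version are
> invented placeholders):
>
>    <encodingDesc>
>      <appInfo>
>        <application ident="TranscriptionTool" version="1.5">
>          <label>Transcription Tool</label>
>        </application>
>      </appInfo>
>      <!-- the definition of the transcription system would also live
>           here, e.g. in an <editorialDecl> -->
>    </encodingDesc>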
>
> *Timing*
>
> The timeline is fundamental to the macrostructure of a transcript.
> Thomas' examples all used absolute times for its <when>s, but I
> suggested that relative ones might be easier. The document ordering both
> of <when>s and of transcribed speech should reflect the temporal order
> as far as possible; this would allegedly facilitate interoperability.
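>
> A sketch of the relative-timing alternative (ids and values invented):
> only the origin carries an absolute value, and every other <when> is
> an offset from its predecessor:
>
>    <timeline unit="s" origin="#T0">
>      <when xml:id="T0" absolute="2012-10-22T10:00:00"/>
>      <when xml:id="T1" interval="1.5" since="#T0"/>
>      <when xml:id="T2" interval="2.3" since="#T1"/>
>    </timeline>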
>
> *Metadata-2*
>
> What metadata was needed, required, recommended for the description of
> participants? (@sex raised its ugly head here). Could we use <person> to
> refer to artificial respondents in MMI experiments? (yes, if they have
> person-like characteristics; no otherwise)
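>
> For concreteness, a minimal participant description might look like
> this (ids and values invented; @sex shown with an ISO 5218 code):
>
>    <particDesc>
>      <person xml:id="SPK1" sex="2" age="34">
>        <langKnowledge>
>          <langKnown tag="de">German (L1)</langKnown>
>        </langKnowledge>
>      </person>
>    </particDesc>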
>
> It was noted that almost any personal trait or state might be crucial to
> the analysis of some corpora. We noted that CMDI now recommended using
> the ISOCAT data category registry as an independent way of defining
> metadata terminology; also that ISOCAT was now available within the TEI
> scheme (though whether it fits into personal metadata I am less sure).
> There was (I think) general agreement that we'd reference the various
> options available in the TEI but not incorporate all of them.
>
> We agreed that the principles underlying a given transcription should be
> clearly documented, either in associated articles, in the formal
> specification for an encoding, or in the header of individual documents.
>
> *Utterances*
>
> Several people disliked the expanded element name <u> and its
> definition, for various theoretical reasons. Its definition should be
> modified to remove the implication that it necessarily followed a
> silence, though we seemed to agree that a <u> could only contain a
> stretch of speech from a single speaker.
>
> The temporal alignment of a <u> can be indicated either by @start and
> @end or by nested <anchor/>s: the standard should probably recommend
> use of one or the other method, but not both. We discussed whether or
> not the fact that existing tools did not support the (even simpler) use
> of @trans to indicate overlap should lead us not to recommend it.
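>
> The two alignment methods side by side (speaker and timeline ids
> invented for illustration):
>
>    <!-- alignment via @start and @end -->
>    <u who="#SPK1" start="#T0" end="#T2">no I never did</u>
>
>    <!-- alignment via nested anchors -->
>    <u who="#SPK1">
>      <anchor synch="#T0"/>no I never did<anchor synch="#T2"/>
>    </u>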
>
> *U-plus*
>
> Thomas wanted some method of associating with a <u> the whole block of
> annotations made on it (represented as one or more <interpGrp>s). His
> document suggested using <div> for this purpose. A lighter-weight
> solution might be to include <interpGrp> within <u>, or to propose a new
> wrapper <annotatedU> element.
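>
> The two lighter-weight alternatives, sketched (<annotatedU> is only a
> proposed name, not an existing TEI element):
>
>    <!-- annotations inside the utterance -->
>    <u who="#SPK1" start="#T0" end="#T2">no I never did
>      <interpGrp type="pos">...</interpGrp>
>    </u>
>
>    <!-- new wrapper element -->
>    <annotatedU>
>      <u who="#SPK1" start="#T0" end="#T2">no I never did</u>
>      <interpGrp type="pos">...</interpGrp>
>    </annotatedU>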
>
> *Tokenization*
>
> Laurent noted that MAF recommended use of <w> for individual tokens; we
> didn't need to take a stand on the definition of “word” but could simply
> refer to MAF. We needed some way of signalling the things that older
> transcription formats had found important, e.g. words considered
> incomplete, false starts, repetitions, abbreviations etc. so we needed
> to choose an appropriate TEI construct for them, even if we thought the
> concept ill-defined or not useful. The general-purpose <seg> element
> might be the simplest solution, but some diplomacy would be needed about
> how to define its application and its possible @type or @function values.
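>
> A sketch of the <seg> approach (the @type values here are invented;
> settling on real ones is exactly where the diplomacy comes in):
>
>    <u who="#SPK1">
>      <seg type="falseStart"><w>I</w> <w>wa</w></seg>
>      <w>I</w> <w>want</w> <w>it</w>
>      <seg type="incomplete"><w>tomorr</w></seg>
>    </u>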
>
> *Conclusions*
>
> This workgroup will probably produce a useful document describing an
> important use case for the TEI recommendations on spoken language. It is
> currently a Google Doc which the group has agreed to share with the
> Council. I undertook to help turn this into an ODD, which could
> eventually become one of our Exemplars. Work on standardising other
> aspects of transcribed multimodal interactions probably needs to be
> deferred to a later stage.
>
>

-- 
Martin Holmes
University of Victoria Humanities Computing and Media Centre
(mholmes at uvic.ca)

