[tei-council] Values of @xml:lang on <exemplum>

David Sewell dsewell at virginia.edu
Thu Apr 30 10:37:51 EDT 2009


On Thu, 30 Apr 2009, Lou Burnard wrote:

> I haven't looked, but is there really no official language code for "mixed" or
> "macaronic" or "unidentified"?

Indeed there are codes for two of those in ISO 639-2: "mul" = "multiple
languages" and "und" = "undetermined".

However, the XML spec we've already cited on language identification
refers to BCP 47 (ftp://ftp.isi.edu/in-notes/bcp/bcp47.txt), which
deprecates the use of "mul" in favor of individually tagging the
languages. And it giveth and taketh away with respect to "und":

   4.  The 'und' (Undetermined) primary language subtag SHOULD NOT be
       used to label content, even if the language is unknown.  Omitting
       the language tag altogether is preferred to using a tag with a
       primary language subtag of 'und'.  The 'und' subtag MAY be useful
       for protocols that require a language tag to be provided.  The
       'und' subtag MAY also be useful when matching language tags in
       certain situations.

We could just be wicked and use xml:lang="und" for the language-neutral
cases. That would allow us to require @xml:lang on <exemplum> but avoid
the empty value.
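
As a rough sketch of what that might look like (the example content here
is invented for illustration, not taken from the Guidelines source), a
language-neutral exemplum would carry "und" while a commented one keeps
its real language tag:

```xml
<!-- language-neutral: markup only, no prose to translate -->
<exemplum xml:lang="und">
  <egXML xmlns="http://www.tei-c.org/ns/Examples">
    <date when="2009-04-30"/>
  </egXML>
</exemplum>

<!-- language-specific: contains English prose -->
<exemplum xml:lang="en">
  <egXML xmlns="http://www.tei-c.org/ns/Examples">
    <date when="2009-04-30">the last day of April</date>
  </egXML>
</exemplum>
```

Every <exemplum> then has a non-empty @xml:lang, at the cost of using
"und" in a way BCP 47 discourages.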

But James & Sebastian's proposal of exemplum/@targetLang as a processing
tool has attractions as well. Unlike @xml:lang, it could contain more
than one language tag (oneOrMore values of xsd:language). We could still
use @targetLang="und" for language-neutral exempla.
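
To make the "more than one language tag" point concrete, here is a
minimal sketch of how such an attribute could be declared in RELAX NG
and then used. The attribute name @targetLang comes from the proposal;
the declaration and the sample values are my assumptions about how it
might be written, not anything agreed:

```xml
<!-- hypothetical declaration: a whitespace-separated list of one or
     more xsd:language values -->
<attribute xmlns="http://relaxng.org/ns/structure/1.0"
           datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
           name="targetLang">
  <list>
    <oneOrMore>
      <data type="language"/>
    </oneOrMore>
  </list>
</attribute>

<!-- hypothetical usage: one exemplum serving two translations -->
<exemplum targetLang="en fr">
  <egXML xmlns="http://www.tei-c.org/ns/Examples">
    <date when="2009-04-30"/>
  </egXML>
</exemplum>

<!-- hypothetical usage: language-neutral exemplum -->
<exemplum targetLang="und">
  <egXML xmlns="http://www.tei-c.org/ns/Examples">
    <date when="2009-04-30"/>
  </egXML>
</exemplum>
```

Since "und" is a lexically valid xsd:language value, the same datatype
would cover both the multi-language and the language-neutral cases.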

David


-- 
David Sewell, Editorial and Technical Manager
ROTUNDA, The University of Virginia Press
PO Box 801079, Charlottesville, VA 22904-4318 USA
Courier: 310 Old Ivy Way, Suite 302, Charlottesville VA 22903
Email: dsewell at virginia.edu   Tel: +1 434 924 9973
Web: http://rotunda.upress.virginia.edu/

