[tei-council] Values of @xml:lang on <exemplum>
David Sewell
dsewell at virginia.edu
Thu Apr 30 10:37:51 EDT 2009
On Thu, 30 Apr 2009, Lou Burnard wrote:
> I haven't looked, but is there really no official language code for "mixed" or
> "macaronic" or "unidentified"?
Indeed there are, in ISO 639-2: "mul" = "multiple languages" and "und"
= "undetermined".
However, the XML spec we've already cited on language identification
refers to BCP 47 (ftp://ftp.isi.edu/in-notes/bcp/bcp47.txt), which
deprecates the use of "mul" in favor of individually tagging the
languages. And it giveth and taketh away with respect to "und":
    4. The 'und' (Undetermined) primary language subtag SHOULD NOT be
       used to label content, even if the language is unknown. Omitting
       the language tag altogether is preferred to using a tag with a
       primary language subtag of 'und'. The 'und' subtag MAY be useful
       for protocols that require a language tag to be provided. The
       'und' subtag MAY also be useful when matching language tags in
       certain situations.
We could just be wicked and use xml:lang="und" for the language-neutral
cases. That would allow us to require @xml:lang on <exemplum> but avoid
the empty value.
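
For illustration, a language-neutral example under that approach might
look roughly like the sketch below. The content of the egXML element is
invented for the purpose, and making @xml:lang required on <exemplum> is
the assumption under discussion here, not current practice.

```xml
<!-- Sketch only: a language-neutral exemplum tagged "und" instead of
     carrying an empty xml:lang value. The egXML content is hypothetical. -->
<exemplum xml:lang="und">
  <egXML xmlns="http://www.tei-c.org/ns/Examples">
    <ptr target="#n1"/>
  </egXML>
</exemplum>
```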
But James & Sebastian's proposal of exemplum/@targetLang as a processing
tool has attractions as well. Unlike @xml:lang, it could contain more
than one language tag (oneOrMore values of xsd:language). We could still
use @targetLang="und" for language-neutral exempla.
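
As a rough sketch of how that might look in an ODD: @targetLang is only
the attribute proposed in this thread, not an existing TEI attribute, and
the example content and language choices below are made up.

```xml
<!-- Sketch only: @targetLang is the proposed attribute, not part of TEI.
     Here it lists two language tags, which @xml:lang could not do. -->
<exemplum targetLang="en fr">
  <egXML xmlns="http://www.tei-c.org/ns/Examples">
    <p>She said <foreign xml:lang="fr">bonjour</foreign> and left.</p>
  </egXML>
</exemplum>

<!-- A language-neutral example under the same proposal. -->
<exemplum targetLang="und">
  <egXML xmlns="http://www.tei-c.org/ns/Examples">
    <ptr target="#n1"/>
  </egXML>
</exemplum>
```

A processor could then select exempla by testing whether the wanted
language tag appears among the whitespace-separated values.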
David
--
David Sewell, Editorial and Technical Manager
ROTUNDA, The University of Virginia Press
PO Box 801079, Charlottesville, VA 22904-4318 USA
Courier: 310 Old Ivy Way, Suite 302, Charlottesville VA 22903
Email: dsewell at virginia.edu Tel: +1 434 924 9973
Web: http://rotunda.upress.virginia.edu/