[tei-council] xml:lang="eng"

Paul F. Schaffner PFSchaffner at umich.edu
Thu Dec 8 12:14:23 EST 2011

Am I right in thinking that the MARC language codes (the original
source of the ISO 639-2 3-letter code list) remain the standard
for MARC records? And that those of us who depend on a lot of
back-and-forth TEI<->MARC interchange will probably find it
most convenient to continue to use those codes throughout our
files? If so, would we be better served by continuing to use
@lang rather than @xml:lang? Or is there a lossless 'crosswalk'
out there somewhere?
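
A crosswalk of the kind asked about might be sketched roughly as below. This is only an illustration, not an existing tool: the table is a small hand-picked sample, the names MARC_TO_BCP47, marc_to_xml_lang, xml_lang_to_marc and is_lossless are hypothetical, and it assumes that the MARC codes in question coincide with the ISO 639-2 bibliographic codes, which would need checking against the MARC list itself.

```python
# Sample entries only (an assumption for illustration): MARC / ISO 639-2
# bibliographic codes mapped to the primary language subtag used in xml:lang.
MARC_TO_BCP47 = {
    "eng": "en",
    "fre": "fr",
    "ger": "de",
    "gre": "el",   # Modern Greek
    "grc": "grc",  # Ancient Greek (to 1453): no 2-letter code exists
    "lat": "la",
}

# Reverse table, built from the forward one.
BCP47_TO_MARC = {tag: marc for marc, tag in MARC_TO_BCP47.items()}


def is_lossless(table):
    """True if no two source codes map to the same target value."""
    return len(set(table.values())) == len(table)


def marc_to_xml_lang(code):
    """Return the xml:lang primary subtag for a MARC language code."""
    return MARC_TO_BCP47[code.strip().lower()]


def xml_lang_to_marc(tag):
    """Return the MARC code for an xml:lang value.

    Only the primary subtag is used, so script, region and other
    subtags (e.g. the 'Grek' in 'el-Grek') are dropped on the way back.
    """
    primary = tag.split("-")[0].lower()
    return BCP47_TO_MARC[primary]
```

In this sketch the round trip MARC -> xml:lang -> MARC is lossless as long as is_lossless() holds for the full table; the reverse trip is not, since MARC has no place for the extra subtags that xml:lang allows.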


On Thu, 8 Dec 2011, Martin Holmes wrote:

> I really didn't mean to step on Piotr's toes there -- sorry about that.
> I had the mistaken impression that the ticket was done and closed, but
> that these examples had slipped through because they're in CDATA islands
> and so hadn't been accessible to XQuery or XPath discovery.
> Cheers,
> Martin
> On 11-12-08 06:26 AM, Kevin Hawkins wrote:
>> Will someone agree to take on adding this to CH?  I would but am not
>> sure of the precise status of
>> http://www.iana.org/assignments/language-subtag-registry (and of
>> http://www.iana.org/assignments/language-tag-extensions-registry ), so
>> I'll get the details wrong.
>> I think we will want to add a similar note to att.global.
>> For the record, in Paris we discussed these invalid values based on this
>> ticket that Syd submitted:
>> https://sourceforge.net/tracker/index.php?func=detail&aid=3304622&group_id=106328&atid=644062
>> We assigned the cleanup to Piotr, but I'm sure he appreciates Martin
>> taking care of the things he encounters.
>> On 12/8/2011 8:33 AM, Lou Burnard wrote:
>>> Ah, we should probably update the reference in CH to point to this in
>>> that case.
>>> On 08/12/11 12:24, Gabriel Bodard wrote:
>>>> By the way, I recently discovered that there is now a single IANA
>>>> registry of language codes so we no longer need to worry about looking
>>>> in the 2-letter list, and then if we don't find what we're looking for
>>>> move on to the 3-letter list. Instead, all codes, both 2- and
>>>> (occasional) 3-letter (and 4-letter script codes) are listed on a single
>>>> page at <http://www.iana.org/assignments/language-subtag-registry>, so
>>>> there is no longer any danger of accidentally using a 3-letter code that
>>>> has been deprecated in favour of its 2-letter replacement.
>>>> (For example, "Ancient Greek (to 1453)" is now unambiguously listed as
>>>> "grc", so should not be confused with "el".)
>>>> I've found this very helpful while trying to decide how to tag Nabatean
>>>> script in papyri...
> -- 
> Martin Holmes
> University of Victoria Humanities Computing and Media Centre
> (mholmes at uvic.ca)
> -- 
> tei-council mailing list
> tei-council at lists.village.Virginia.EDU
> http://lists.village.Virginia.EDU/mailman/listinfo/tei-council
> PLEASE NOTE: postings to this list are publicly archived

Paul Schaffner | PFSchaffner at umich.edu | http://www.umich.edu/~pfs/
316-C Hatcher Library N, Univ. of Michigan, Ann Arbor MI 48109-1190
