[tei-council] biblscope and imprint

Martin Holmes mholmes at uvic.ca
Mon Nov 5 12:25:42 EST 2012


Hi Kevin,

The bit of the Guidelines with the links is here:

<http://www.tei-c.org/release/doc/tei-p5-doc/en/html/CH.html#CHSH>

You're right that the att.global definition of @xml:lang should probably 
be revised too.

I guess the Wikipedia list of schemes Lou found is the one here:

<http://en.wikipedia.org/wiki/Romanization_of_Russian>

Of those, ALA-LC is already in the subtag registry:

Type: variant
Subtag: alalc97
Description: ALA-LC Romanization, 1997 edition
Added: 2009-12-09
Comments: Romanizations recommended by the American Library Association
   and the Library of Congress, in "ALA-LC Romanization Tables:
   Transliteration Schemes for Non-Roman Scripts" (1997), ISBN
   978-0-8444-0940-5.

so if that were the one used in this case, you could make the tag
ru-Latn-alalc97. But I can't find any of the others in the registry.
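
A rough sketch in Python of how such a tag might be assembled, in case it
helps. The table holds only the two variant records quoted in this thread,
copied by hand; the names VARIANTS and build_tag are made up for
illustration, and rejecting a variant whose Prefix does not match is a
simplification (the registry's Prefix field is a recommendation).

```python
# Two variant records copied by hand from this thread. A real tool would
# read the full IANA Language Subtag Registry instead of this sample table.
VARIANTS = {
    "alalc97": {"description": "ALA-LC Romanization, 1997 edition",
                "prefixes": []},           # the record shown lists no Prefix
    "wadegile": {"description": "Wade-Giles romanization",
                 "prefixes": ["zh-Latn"]},
}


def build_tag(language, script=None, variant=None):
    """Join subtags in BCP 47 order: language, then script, then variant."""
    parts = [language.lower()]
    if script:
        parts.append(script.title())       # conventional casing, e.g. "Latn"
    base = "-".join(parts)

    if variant:
        variant = variant.lower()
        record = VARIANTS.get(variant)
        if record is None:
            raise ValueError(f"variant {variant!r} is not in the sample table")
        prefixes = [p.lower() for p in record["prefixes"]]
        # Simplification: treat a listed Prefix as a hard requirement.
        if prefixes and not any(
            base.lower() == p or base.lower().startswith(p + "-")
            for p in prefixes
        ):
            raise ValueError(
                f"{variant!r} is registered for {record['prefixes']}, "
                f"not for {base!r}"
            )
        parts.append(variant)

    return "-".join(parts)


print(build_tag("ru", "Latn", "alalc97"))
print(build_tag("zh", "Latn", "wadegile"))
```

With this table, asking for build_tag("ru", "Latn", "wadegile") raises a
ValueError, because that variant is registered with the prefix zh-Latn only.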

Cheers,
Martin

On 12-11-05 09:12 AM, Kevin Hawkins wrote:
> On 11/5/12 12:06 PM, Martin Holmes wrote:
>> Hi Kevin,
>>
>> On 12-11-05 06:29 AM, Kevin Hawkins wrote:
>>> That document is helpful.  What wasn't clear to me about BCP 47 is
>>> whether you could only use a script subtag in combination with a
>>> language subtag if they were listed in combination in the IANA registry
>>> (as some are).
>>
>> Absolutely not. The idea of the "suppress script" thing, if I understand
>> it correctly, is that you _don't_ need to specify that script, because
>> it's the default or obvious, so when you use just "ru", the script
>> "Cyrl" is understood; but if a different script is used, then you should
>> specify it.
>
> Yes, I understand that.  I meant it's not clear that the non-default
> script subtags have to be enumerated in the registry or whether you can
> freely combine as needed.
>
>>>    Whereas this W3C guide explicitly says you can only use
>>> extended language subtags with certain languages, I see that it
>>> explicitly says you can use a script subtag with any language when it's
>>> not written in the script given for "suppress script".  And, as we see,
>>> it even gives the examples of Russian "transcribed into the Latin script".
>>>
>>> So if you encounter "ru-Latn", you're stuck figuring out which
>>> transliteration scheme was used (or whether the author just invented one).
>>
>> Yes, that's an interesting point; in many cases, there are ways to
>> specify the transliteration scheme used:
>>
>> Type: variant
>> Subtag: wadegile
>> Description: Wade-Giles romanization
>> Added: 2008-10-03
>> Prefix: zh-Latn
>>
>> which specifies one particular romanization of Chinese. Use of variants
>> is explained here:
>>
>> <http://www.w3.org/International/questions/qa-choosing-language-tags#xxxshortcomings>
>>
>> However, there are no such variants for Russian. If there are multiple
>> Latin transliteration schemes in use for Russian, it would be a good
>> idea to register subtags for them.
>
> Yes.  All 10 of them that Lou found in Wikipedia.
>
>> To my mind, BCP 47 itself is actually quite hard to understand, but
>> we've linked in the Guidelines to two W3C documents (including the one
>> above) which really help to clarify the situation.
>
> You know, when you sent that link, I recalled that we were going to add
> it to the Guidelines, but I don't see it at:
>
> http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-att.global.html
>
> which is where I go to look for advice on @xml:lang.  Perhaps we could
> add the two W3C documents to ref-att.global.xml as well?
>
> --Kevin
>

-- 
Martin Holmes
University of Victoria Humanities Computing and Media Centre
(mholmes at uvic.ca)

