[tei-council] biblscope and imprint

Kevin Hawkins kevin.s.hawkins at ultraslavonic.info
Mon Nov 5 00:49:18 EST 2012


On 11/4/12 7:52 AM, Lou Burnard wrote:
> Firstly the comment that using "ru" for Russian transliterated in Roman
> characters is simply "underspecified" seems to me rather to miss the
> point. If I see something in a Unicode document which says it has
> xml:lang="ru" I expect to see proper Russian Unicode characters.

Perhaps.  I meant that while you might think that, it wasn't clear to me 
that the semantics of @xml:lang license that inference.  However, once I 
looked at BCP 47 and the discussion of "suppress script" further, I 
think it might indeed license Lou's inference.

> Secondly, even if I am prepared to accept Romanized versions of those
> characters and figure out for myself what the Russian should have been,
> this is not entirely easy. There are several different (Wikipedia lists
> ten) possible Romanization schemes, which vary quite considerably. In
> some, for example, the sequence "ye" stands for the Russian letter that
> looks like a Roman "e"; in others this character is represented by "e",
> unless it is iotated by a preceding soft sign. So generating a correct
> Cyrillic version of this citation isn't easy, and neither is deciding
> which scheme we're dealing with here!

BCP 47 allows variant subtags to be registered for systems of 
transliteration, but it does not require this.  However, per the 
discussion of "suppress script", it seems you effectively need to do so 
for transliterated text.

This is puzzling.
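
To make the "suppress script" point concrete, here is a minimal Python 
sketch of how a consumer might work out which script a tag implies.  The 
table is a hand-copied excerpt (the real data lives in the IANA Language 
Subtag Registry), and the function name effective_script is hypothetical, 
not part of any library.

```python
# Hand-copied excerpt for illustration only; the authoritative data is the
# Suppress-Script field in the IANA Language Subtag Registry.
SUPPRESS_SCRIPT = {"ru": "Cyrl", "en": "Latn"}


def effective_script(tag):
    """Return the script a language tag implies, or None if unknown.

    Hypothetical helper: it only looks at the language subtag and an
    optional script subtag in second position, ignoring extended
    language subtags, regions, variants, and extensions.
    """
    subtags = tag.split("-")
    language = subtags[0].lower()
    # A script subtag is four letters and follows the language subtag.
    if len(subtags) > 1 and len(subtags[1]) == 4 and subtags[1].isalpha():
        return subtags[1].title()
    # No explicit script: fall back to the language's suppressed script.
    return SUPPRESS_SCRIPT.get(language)
```

Under these assumptions, effective_script("ru") gives "Cyrl" while 
effective_script("ru-Latn") gives "Latn", which is why a bare "ru" on 
romanized text reads as a claim that the text is in Cyrillic.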

> Thirdly, this particular example is actually taken verbatim from a
> rather elderly ISO standard on bibliographic reference (ISO 690, 1987).
> Hence we probably should not mess with its representation at all.

I fully agree that as long as we are citing a citation in a source 
document, we shouldn't go de-transliterating it!

> (You
> can see it cited as a example in the Wikipedia entry for ISO_690,
> curiously enough).

I imagine that someone writing or improving the Wikipedia article on ISO 
690 googled around to see what they could find and stumbled upon the 
Guidelines ...

> My guess, but I defer to the Russian expert in our midst, is that this
> uses the now deprecated ISO/R:1968 but without access to the original,
> it's hard to be sure, and without being sure I'd rather not try to
> convert it into proper Russian.

Well, it looks like Lou not only tried, but as your resident Russian 
expert I can say that he also succeeded.

> All of which I suppose we can side-step cheerfully, by saying "ru-Latn",
> even though this particular combination isn't actually proposed in
> http://www.iana.org/assignments/language-subtag-registry, and even
> though this won't help anyone who *does* want to see the original title
> as it should have been presented!

I, like Martin in a later message, used to think that BCP 47 allowed the 
various types of subtags to be combined as you see fit, meaning that 
"ru-Latn" would be allowed.  But a closer reading of BCP 47 now makes me 
think that you can use only what is in the IANA registry unless you use 
a private use subtag.
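
One way to test that reading is to check a tag subtag by subtag.  The 
Python sketch below assumes that the registry constraint applies to each 
individual subtag rather than to the tag as a whole; that is an 
assumption of the sketch, not something settled in this thread.  The 
registry excerpt is hand-copied and the function name subtags_registered 
is hypothetical.

```python
# Hand-copied excerpt for illustration; the IANA Language Subtag Registry
# is the authoritative list and is far larger.
REGISTRY_EXCERPT = {
    "language": {"ru", "en"},
    "script": {"Cyrl", "Latn"},
}


def subtags_registered(tag):
    """Return True if every subtag is in the excerpt or is private use.

    Hypothetical helper: only language and script subtags are modeled,
    and any other kind of subtag is conservatively rejected.
    """
    parts = tag.split("-")
    if parts[0].lower() == "x":
        return True  # the whole tag is private use
    if parts[0].lower() not in REGISTRY_EXCERPT["language"]:
        return False
    for part in parts[1:]:
        if part.lower() == "x":
            return True  # everything after the "x" singleton is private use
        if len(part) == 4 and part.isalpha():
            if part.title() not in REGISTRY_EXCERPT["script"]:
                return False
        else:
            return False  # region, variant, extension: not modeled here
    return True
```

On that assumption, "ru-Latn" passes because "ru" and "Latn" each appear 
in the excerpt, and "ru-x-translit" passes because of the private use 
subtag.  If the constraint were instead on whole tags, the check would 
have to look quite different.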

We could bring in Syd Bauman or Deborah Anderson to help us sort this 
out, or we could take a shortcut by simply removing the @xml:lang on 
this transliterated title.

--Kevin

