[tei-council] textual attrs (was "Re: Chapter 1")

Syd Bauman Syd_Bauman at Brown.edu
Fri Feb 8 09:54:50 EST 2008


> >> 1.3.1.1 For the definition of xml:lang, it can not only indicate
> >> the language of the element content, but also potentially of a
> >> text attribute, no?
> >>
> > Only "potentially" because we have gone to some lengths to
> > abolish "text" attributes.
> >
> So you'd rather not mention this because you're gradually
> deprecating away the still-existing text attributes?

Which still-existing text attributes? 

And remember, what we're talking about here is not attributes which
merely have textual content, a category into which replacementPattern=
and rend= fall, but rather attributes whose values are text expressed
in a natural language.

Off the top of my head I think there are only a few dictionary
attributes and reason= that might still be "textual". But let's have
a look at the possible candidates; lists appended. (In those lists I
use the parameter entity notation to indicate class names.)

So in truth, it may be worth mentioning that xml:lang= governs the
natural language of attribute values as well.
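
To make the point concrete, here is a minimal Python sketch of how the
xml:lang= in scope on an element would also be taken to describe the
language of that element's attribute values. The sample fragment, the
reason= values in it, and the function name effective_langs are all
made up for illustration; this is not part of any TEI tooling.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predeclared xml: prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample: the reason= values are natural-language phrases.
SAMPLE = """<p xml:lang="en">The scribe wrote
  <supplied xml:lang="fr" reason="tache d'encre">ink</supplied> here, then
  <gap reason="illegible"/>.</p>"""


def effective_langs(root):
    """Map each element to the xml:lang value in scope for it (or None)."""
    langs = {}

    def walk(elem, inherited):
        # An element's own xml:lang overrides whatever it inherits.
        lang = elem.get(XML_LANG, inherited)
        langs[elem] = lang
        for child in elem:
            walk(child, lang)

    walk(root, None)
    return langs


root = ET.fromstring(SAMPLE)
langs = effective_langs(root)

# Pair every reason= value with the language in scope on its element.
report = [
    (elem.tag, elem.get("reason"), langs[elem])
    for elem in root.iter()
    if elem.get("reason") is not None
]

for tag, reason, lang in report:
    print(f"{tag}/@reason = {reason!r} is governed by xml:lang {lang!r}")
```

Under this reading, reason= on <supplied> is French because of the
xml:lang= on that same element, while reason= on <gap> falls back to
the English inherited from <p>.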

Can we have a volunteer from Council to go through these lists of
attributes and for each note whether its value

a) comes from a formal language or controlled vocabulary like
   replacementPattern= of <cRefPattern> or encoding= of
   <binaryObject>

b) not (a), but nonetheless has nothing to do with a natural
   language, like delim= of <refState> or from= and to= of <locus>

c) is problematic because it really may contain natural language
   phrases, like reason= of <supplied>

d) something else?
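
One possible way to record the results, sketched in Python. Only the
examples already named in (a) through (c) are filled in; the names
CATEGORIES, JUDGEMENTS, and group_by_category are hypothetical, and the
remaining attributes would be added by whoever volunteers.

```python
from collections import defaultdict

# The four categories, lettered as in the request above.
CATEGORIES = {
    "a": "formal language or controlled vocabulary",
    "b": "not (a), but nothing to do with a natural language",
    "c": "problematic: may contain natural language phrases",
    "d": "something else",
}

# (attribute, where it is defined, category). Only the examples given
# in the request are listed; nothing else has been judged here.
JUDGEMENTS = [
    ("replacementPattern", "<cRefPattern>", "a"),
    ("encoding", "<binaryObject>", "a"),
    ("delim", "<refState>", "b"),
    ("from", "<locus>", "b"),
    ("to", "<locus>", "b"),
    ("reason", "<supplied>", "c"),
]


def group_by_category(judgements):
    """Group 'attr= of owner' labels under their category letter."""
    grouped = defaultdict(list)
    for attr, owner, category in judgements:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category {category!r} for {attr}=")
        grouped[category].append(f"{attr}= of {owner}")
    return dict(grouped)


grouped = group_by_category(JUDGEMENTS)

for category in sorted(grouped):
    print(f"({category}) {CATEGORIES[category]}")
    for label in grouped[category]:
        print(f"    {label}")
```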

For convenience in performing this task, the following lists are
sorted by the attribute class or element in which the attribute is
defined. 

Attributes that have datatype of rng:text
---------- ---- ---- -------- -- --------
expand= of %att.lexicographic;
norm= of %att.lexicographic;
orig= of %att.lexicographic;
split= of %att.lexicographic;
value= of %att.lexicographic;
replacementPattern= of <cRefPattern>
delim= of <refState>

Attributes whose datatype is some number of data.word
---------- ----- -------- -- ---- ------ -- ---------
extent= of %att.damaged;
sortKey= of %att.entryLike;
n= of %att.global;
rend= of %att.global;
mimeType= of %att.internetMedia;
commodity= of %att.measurement;
targFunc= of %att.pointing.group;
version= of %att.translatable;
loc= of <app>
name= of <attRef>
encoding= of <binaryObject>
assertedValue= of <certainty>
lang= of <code>
reason= of <gap>
level= of <langKnown>
from= of <locus>
to= of <locus>
baseForm= of <m>
value= of <metSym>
role= of <org>
age= of <person>
role= of <person>
age= of <personGrp>
size= of <personGrp>
cRef= of <ptr>
cRef= of <ref>
label= of <rhyme>
subtype= of <seg>
reason= of <supplied>
value= of <symbol>
sortKey= of <term>
reason= of <unclear>
name= of <vLabel>
lemma= of <w>


Not directly relevant, but while I was at it I ascertained the
following: 

Elements that contain <rng:text> somewhere in their content model
-------- ---- ------- ---------- --------- -- ----- ------- -----
<att>
<bibl>
<binaryObject>
<byline>
<castItem>
<catDesc>
<change>
<charName>
<closer>
<code>
<date>
<defaultVal>
<dictScrap>
<docImprint>
<eg>
<egXML>
<entryFree>
<etym>
<form>
<formula>
<g>
<geo>
<gi>
<glyphName>
<gramGrp>
<ident>
<idno>
<lem>
<localName>
<m>
<measureGrp>
<oVar>
<opener>
<origDate>
<pVar>
<postBox>
<postCode>
<rdg>
<re>
<sense>
<series>
<stringVal>
<tag>
<time>
<u>
<unicodeName>
<val>
<w>
<xr>

It seems to me there are at least a few cases that should probably be
macro.xtext or data.name instead. I will try to investigate these over
the weekend.
