conference call reminder, agenda and materials

Sun Nov 24 18:10:45 EST 2002

> Expected to participate:
>  Syd Bauman, Alex Bia, David Birnbaum, Lou Burnard, Matthew Driscoll, David 
>  Durand, Tomaz Erjavec, Merrilee Proffitt, Sebastian Rahtz, Laurent Romary, 
>  Susan Schreibman, John Unsworth, Perry Willett, Christian Wittern.

Is this list the current Council membership plus the editors?

> The only minutes I can find from our June 24, 2002 conference call are at 
> http://lists.village.virginia.edu/lists_archive/tei-council/0254.html.

Eeek! I think that's my fault. Looks like I never folded JU's notes
into my own. For his notes see the message to which the one above is
a reply, i.e.
http://lists.village.virginia.edu/lists_archive/tei-council/0250.html. 

Without suggesting an answer (for I haven't one), I'd like to clarify
Christian's question a bit, at least from my point of view. (Feel
free to correct me if I'm wrong.)

> - Should/could  P5 limit the content of attribute values to tokens
>    (and similar material) as opposed to the many attribute values in
>    P4, which allow essentially the same content as in PCDATA.

We're actually only talking about CDATA attributes whose values are
intended to hold captured source document content or similar data,
e.g., the orig= attribute of <reg>. We are not worried
about those attributes which are used to describe the encoding. E.g.,
the type= of <div> is not a problem. I say this because I, for one,
do not think it at all an imposition to insist that values of such
attributes be limited to Unicode characters. I'm betting that pretty
much all attributes that are described with "sample values include"
or whatever in the Guidelines are non-problems. Here is a hastily
created list of the attributes I think we are discussing.

 reg= of the various name elements and <orig> (of <measure>, too?)
 orig= of <reg>
 expan= of <abbr>
 abbr= of <expan>
 sic= of <corr>
 corr= of <sic>
 key= of <entry>, <entryFree>, <superEntry>
 lemma= of <w>
 baseform= of <m> (should be baseForm=)
 sort= of the various personal name parts (%a.personPart;) 

and perhaps
 expand=, norm=, split=, value=, and orig= of the various dictionary
                                           elements 
 reg= of the various date and time elements?

>    Background: Attribute values are different from PCDATA in that
>    they can not contain other markup constructs. This makes it
>    impossible, for example, to specify language, writing system,
>    readings and the like for the content of attribute values.

"Impossible" seems a bit too strong to me. "Difficult" or even
"obnoxiously difficult" might be better. I say this because I think
we could develop a mechanism for this. E.g., if we were to say that
such information (language, writing system, glyph variation, etc.)
is always specified by indirection using the global IDREF attribute
ws= to point to an element with the detailed information, we would
encode such stuff in normal PCDATA with something like
   <head ws="AG7c">
However, in order to indicate that the value of orig= was in the same
language, writing system, etc., we would have to add an attribute to
<reg> to indicate the ws= of the orig=, e.g.
   <reg orig="icky-chars-go-here" orig_ws="AG7c">
Of course, the real problems occur when the characters inside the
orig= attribute are not all from the same ws= set. (This is actually
a very reasonable case to consider: one of the characters of reg= is
not a Unicode character.) Nonetheless, a stand-off markup solution
*could* be developed:
   <reg id="reg123" orig="icky-chars-go-here">
   
   <ws-link target="reg123" attr="orig" offset="4/6" apply-ws="AG7c"/>
would indicate that characters 5 through 7 of the orig= value are in
the writing system, language, etc., specified by AG7c.
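
To make the offset convention concrete, here is a minimal sketch in
Python of how a processor might resolve such a link. It assumes that
offset= holds zero-based, inclusive start/end positions counted in
characters (one reading of "4/6" meaning characters 5 through 7).
The function name resolve_ws_link is hypothetical, as is the
<ws-link> element itself.

```python
def resolve_ws_link(attr_value: str, offset: str) -> str:
    """Return the part of attr_value covered by an offset such as "4/6".

    Assumption: both numbers are zero-based, inclusive character
    positions, so "4/6" selects the 5th through 7th characters.
    """
    start_text, end_text = offset.split("/")
    start, end = int(start_text), int(end_text)
    if not 0 <= start <= end < len(attr_value):
        raise ValueError(
            f"offset {offset!r} is out of range for a value "
            f"of length {len(attr_value)}"
        )
    # Python slices exclude the end index, so add 1 to make it inclusive.
    return attr_value[start:end + 1]


# The orig= value and offset from the example above.
span = resolve_ws_link("icky-chars-go-here", "4/6")
```

Here span holds the three characters to which the AG7c writing system
would apply. A real processor would also have to decide whether
positions count code points, code units, or something else, which
this sketch does not address.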

I'm not suggesting that the above is necessarily a *good* solution,
only that it is *a* solution, and one that we might want to think
about a little before jumping on the "use no content-containing
attributes" bandwagon.

> Additionally, there is some area of conflict between XML:lang and
> language specification in TEI, which could be cleared up as well.

I think it would be a good idea, Christian, if you were prepared to
briefly explain the relationship on the call.

> 5) Report from Perry on the proposed TEI in Libraries Working Group.

Is this a proposed working group or a fait accompli? Does it require a
budget?