extended pointers for P4
Syd Bauman
Syd_Bauman at brown.edu
Thu Feb 14 15:38:07 EST 2002
As you all know, David Durand is heading up our new work group on
linking, hypertext, and the like. The new work group is charged with
coming up with recommendations for P5. My question for the council is
what to do with extended pointers for P4. There are two issues at
hand. The first is case:
P3> Note that the keywords, though shown here quoted in uppercase, are
P3> not case sensitive.
Several people, not least Sebastian, have pointed out that this
is not very XML-like (he refers to it as "XML-immoral"), and that in
the modern world these keywords (more correctly called location
types) should be one case or the other, preferably lower case because
it's easier on the eyes.
Both editors (and Sebastian) think it would probably be a good idea
to change these to just lowercase (for P4). However, this is not a
"corrigible error" that we editors can haul off and do on our own:
such a change would likely break some existing documents. Of course,
it's not like there are hundreds of programs out there processing
thousands of documents with millions of extended pointers. :-(
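For anyone who does have older documents to convert, here is a minimal sketch of what a migration could look like. It assumes that everything outside parentheses in an extended pointer is a location type, and that location values never contain nested or quoted parentheses; the function name is made up for illustration and is not part of any TEI tool.

```python
def lowercase_location_types(pointer: str) -> str:
    """Lowercase the text outside parentheses; leave location values untouched.

    Assumption: parentheses are balanced and never appear inside quoted values.
    """
    out = []
    depth = 0  # how many parentheses we are currently inside
    for ch in pointer:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        # Only characters at depth 0 (location types and the spaces around
        # them) are lowercased; IDs and element names inside values are kept.
        out.append(ch.lower() if depth == 0 else ch)
    return "".join(out)


print(lowercase_location_types("ID (Chap2) CHILD (2 P)"))
```

Values inside the parentheses are deliberately left alone, since IDs and element names there may well be case sensitive.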
The second problem is whitespace. P3 is a little bit confusing, but
seems to insist on whitespace between location types and values, and
between separate parenthesized steps of the value:
P3> Location types and values, and the parameters within a location
P3> value, must be separated by white space characters.
and a bit later, discussing the steps:
P3> The value is a series of parenthesized steps, separated by white
P3> space.
How come? Why not allow
"child(2 p)(1 list)(7 item)"
instead of insisting on
"child (2 p) (1 list) (7 item)"?
Both editors (and Sebastian) think it would probably be a good idea
to change this so that the whitespace is optional (in P4). However,
this is at best a borderline "corrigible error", in that it is not
obviously wrong, or at least someone might be able to make a case for
it to work that way. The good news is that making the white space
optional should not break any existing documents.
I have asked Steve DeRose and David Durand (2 of the original 4 on
TR3, the work group on hypertext and hypermedia; since we have Lou's
input, and he was 1 of the 2 editors back then, we've consulted with
50% of the people who made this decision), and neither thinks there
is a particularly strong reason not to effect these changes now.
So -- what does the Council think? On the issue of case of location
types, should we
a) make no changes, leave them case insensitive;
b) change to case sensitive, always lower case;
c) change to case sensitive, always upper case;
d) change to case sensitive, some bizarre mixture of capitalized or
camel case;
e) change to case sensitive, always lower case for XML, but leave as
case insensitive for SGML texts (seems like a bad idea to me, but
I have to admit I haven't put any thought into it yet)?
On the issue of whitespace should we
a) make no changes, leave the whitespace required;
b) make the whitespace between location types and location values,
and between parenthesized steps, optional.
Note for those who actually think about how these things will be
parsed: per the formal definition of "locterm" in 4.2.2.2, all
location types are followed either by nothing (i.e., the end of the
attribute value or whitespace followed by another locterm), something
in parentheses, or by one of the non-terminals "steps", "regs", or
"parms" (or "pointpair", after a name in parens); later in the
section all four of these non-terminals are defined to be enclosed in
parentheses. Thus it is always the case that a location type is
followed either by nothing, whitespace and the next location type, or
by whitespace, an open parenthesis, and some location value(s).
Since there is always that open parenthesis, the whitespace is not
needed.
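To make that argument concrete, here is a rough sketch of a tokenizer that treats the whitespace before an open parenthesis as optional. It is not the formal "locterm" grammar: it assumes location types are purely alphabetic and that values contain no nested or quoted parentheses, and the names TOKEN and parse_pointer are invented for this example.

```python
import re

# One token is either a location type (letters only) or one parenthesized
# value; any whitespace in front of a token is skipped, so it is optional.
TOKEN = re.compile(r"\s*(?:([A-Za-z]+)|\(([^()]*)\))")


def parse_pointer(pointer: str):
    """Return a list of (location_type, [values]) pairs."""
    terms = []
    text = pointer.rstrip()
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}: {text[pos:]!r}")
        name, value = m.group(1), m.group(2)
        if name is not None:
            # A new location type starts a new term.
            terms.append((name, []))
        else:
            if not terms:
                raise ValueError("parenthesized value before any location type")
            # A parenthesized value belongs to the most recent location type.
            terms[-1][1].append(value.split())
        pos = m.end()
    return terms


tight = parse_pointer("child(2 p)(1 list)(7 item)")
loose = parse_pointer("child (2 p) (1 list) (7 item)")
print(tight == loose)
```

Both spellings yield the same structure, because the open parenthesis alone marks where the location type ends. Whitespace is still needed between a location type that has no value and the location type that follows it, since there is no parenthesis there to separate the two names.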