[tei-council] datatype of @n in att.global

Lou Burnard lou.burnard at computing-services.oxford.ac.uk
Fri Jun 29 06:09:22 EDT 2007


Sebastian Rahtz wrote:
> James Cummings wrote:
>> It has a space, which makes it two numbers or labels, and the definition
>> says it should have one.  Space is a magical thing which although it is a
>> character just like others has this weird semantics of creating two
>> separate things in people's minds when you put it in the middle of a
>> string of alphanumeric characters.
>>
> I reject this spurious notion entirely. Were we in France, we'd write
> 1000 as 1{tiny space}000.
> 
> what about "1,000"? a number or not?
> 

When people decode strings of characters they bring to them two 
different kinds of grammar: the formal grammar of the language, and the 
equally powerful but less frequently formalized grammar of expectation 
and context.  To insist on either grammar as the sole arbiter is to 
invite derision. So to a French reader the {tiny space} between two 
digit strings is as effective a way of combining them as the comma is 
to the non-French reader. In either case a new exception rule to the 
default behaviour of the space or the comma has to be learned, but 
learning exception rules is one thing people are really, really good at.

So James is right to say that space "separates things in people's minds" 
and Sebastian is right to say "not always", and I think we just have to 
make an arbitrary decision here about which schema datatype is closest 
to the combination of rules we want to enforce.
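
To make the trade-off concrete, here is a minimal sketch of how two candidate datatypes would treat the strings discussed above. Taking xsd:NMTOKEN and xsd:token as the candidates is an assumption for illustration, not something settled in this thread. The function names are hypothetical, and the NMTOKEN check is an ASCII-only approximation (the real datatype admits many more Unicode name characters).

```python
import re

# ASCII-only approximation of xsd:NMTOKEN: letters, digits, '.', '-', '_', ':'.
# (Assumption: the full definition also allows non-ASCII name characters.)
NMTOKEN_ASCII = re.compile(r"[A-Za-z0-9._:\-]+")


def looks_like_nmtoken(value: str) -> bool:
    """True if the value is one unbroken run of (ASCII) name characters."""
    return NMTOKEN_ASCII.fullmatch(value) is not None


def looks_like_token(value: str) -> bool:
    """Approximate xsd:token: no tab, CR or LF, no leading or trailing
    space, and no run of two or more spaces. Single internal spaces are fine."""
    if any(c in value for c in "\t\n\r"):
        return False
    if value != value.strip(" "):
        return False
    return "  " not in value


def count_space_separated(value: str) -> int:
    """How many 'things' a whitespace-splitting reader would see."""
    return len(value.split())


if __name__ == "__main__":
    for sample in ["1000", "1 000", "1,000"]:
        print(
            repr(sample),
            "nmtoken:", looks_like_nmtoken(sample),
            "token:", looks_like_token(sample),
            "parts:", count_space_separated(sample),
        )
```

Under these assumptions the stricter datatype rejects both "1 000" and "1,000", since neither the space nor the comma is a name character, while the looser one accepts both and leaves it to the reader to decide whether "1 000" is one number or two.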


