[tei-council] datatype of @n in att.global
Lou Burnard
lou.burnard at computing-services.oxford.ac.uk
Fri Jun 29 06:09:22 EDT 2007
Sebastian Rahtz wrote:
> James Cummings wrote:
>> It has a space, which makes it two numbers or labels, and the definition
>> says it should have one. Space is a magical thing which although it is a
>> character just like others has this weird semantics of creating two
>> separate things in people's minds when you put it in the middle of a
>> string of alphanumeric characters.
>>
> I reject this spurious notion entirely. Were we in France, we'd write
> 1000 as 1{tiny space}000.
>
> what about "1,000"? a number or not?
>
When people decode strings of characters they bring to them two
different kinds of grammar: the formal grammar of the language, and the
equally powerful but less frequently formalized grammar of expectation
and context. To insist on either grammar as the sole arbiter is to
invite derision. So to a French reader the {tiny space} between two
digit strings joins them into a single number just as effectively as
the comma does for the non-French reader. In either case a new
exception rule to the default behaviour of the space or the comma has to
be learned, but learning exception rules is one thing people are really,
really good at.
So James is right to say that a space creates "two separate things in
people's minds" and Sebastian is right to say "not always", and I think
we just have to make an arbitrary decision here about which schema
datatype is closest to the combination of rules we want to enforce.