[tei-council] datatypes -- syd's comments
Syd Bauman
Syd_Bauman at Brown.edu
Tue Sep 20 18:32:25 EDT 2005
Lou wrote:
> ... tei.data.numeric maps to xsd:decimal which does support
> floating point numbers.
Right, you have instantiated tei.data.numeric as xsd:decimal, quite
in opposition to the recommendation, which was to use xsd:long |
xsd:double. I don't recall anyone presenting any argument as to why
xsd:decimal should be preferred. Not only does it not support
scientific notation, I believe it is harder for vendors to implement.
Did I miss something? (Always a possibility :-)
Arguments favoring xsd:decimal over xsd:double could be made (e.g.,
based on the approximation required of xsd:double), and I'd be happy
to entertain them. But as of now, I don't see why you made this
change.
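
To make the difference concrete, here is a rough Python sketch of the
lexical forms each choice would accept. The regular expressions are my
own simplified approximations of the XSD lexical spaces, not something
taken from the schema spec or from the TEI sources, and the function
names are made up for illustration. It assumes surrounding whitespace
has already been stripped from the value.

```python
import re

# Simplified approximations of the lexical spaces (assumption, not normative).
_DECIMAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")
_DOUBLE = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?|INF|-INF|NaN")
_INTEGER = re.compile(r"[+-]?\d+")

LONG_MIN = -2**63        # -9223372036854775808
LONG_MAX = 2**63 - 1     #  9223372036854775807


def is_decimal(value: str) -> bool:
    """True if value looks like an xsd:decimal (no exponent allowed)."""
    return _DECIMAL.fullmatch(value) is not None


def is_long(value: str) -> bool:
    """True if value is an integer that fits in a signed 64-bit word."""
    if _INTEGER.fullmatch(value) is None:
        return False
    return LONG_MIN <= int(value) <= LONG_MAX


def is_double(value: str) -> bool:
    """True if value looks like an xsd:double, including INF, -INF, NaN."""
    return _DOUBLE.fullmatch(value) is not None


def is_long_or_double(value: str) -> bool:
    """The originally recommended union: xsd:long | xsd:double."""
    return is_long(value) or is_double(value)


if __name__ == "__main__":
    for sample in ["42", "3.14", "6.02e23", "INF", "1 000"]:
        print(sample, is_decimal(sample), is_long_or_double(sample))
```

With these patterns "6.02e23" passes the long | double check but fails
the decimal one, which is the scientific-notation point made above.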
---------
For those still wrapping their minds around these datatypes, I've
copied the following explanations from Eric van der Vlist's book[1].
xsd:decimal --
This datatype represents decimal numbers. The number of digits
can be arbitrarily long (the datatype doesn't impose any
restrictions), ... Leading and
trailing zeros aren't considered significant and may be trimmed.
The decimal separator is always a dot (.), and a leading sign (+
or -) may be used, but any characters other than the 10 digits
zero through nine are forbidden, including whitespace inside the
value. ...
xsd:long --
Contains an integer between -9223372036854775808 and
9223372036854775807; i.e., the values that can be stored in a
64-bit word.
xsd:double --
... represents IEEE ... double (64 bits) precision
floating-point types. These store the values in the form of a
mantissa and an exponent of a power of 2 (m x 2^e), allowing a large
scale of numbers in a storage that has a fixed length.
Fortunately, the lexical space doesn't require powers of 2 (in
fact, it doesn't accept powers of 2), but instead uses a
traditional scientific notation based on integer powers of 10.
Because the value spaces (powers of 2) don't exactly match the
values from the lexical space (powers of 10), the recommendation
specifies that the closest value is taken. The consequence of
this approximate matching is that float datatypes are the domain
of approximation; most of the float values can't be considered
exact and are approximate.
These datatypes accept several special values: positive zero (0),
negative zero (-0) (which is less than positive 0 but greater
than any negative value); infinity (INF), which is greater than
any value; negative infinity (-INF), which is less than any
value; and "not a number" (NaN).
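
The approximation issue and the special values described above can be
seen in a few lines of Python, whose built-in float is an IEEE double
on the usual implementations and whose decimal module behaves much
more like xsd:decimal. This is only an illustrative sketch: the helper
parse_xsd_double is a hypothetical name of mine, and it is not a
validator (Python's float() accepts spellings that a schema processor
would reject).

```python
import math
from decimal import Decimal

# Decimal("0.1") is exactly one tenth; Decimal(0.1) exposes the exact
# value of the nearest binary double, which is slightly different.
exact_tenth = Decimal("0.1")
nearest_double = Decimal(0.1)
tenth_is_exact_as_double = (exact_tenth == nearest_double)

# The same effect shows up in ordinary arithmetic.
double_sum_matches = (0.1 + 0.2 == 0.3)
decimal_sum_matches = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))


def parse_xsd_double(lexical: str) -> float:
    """Map an xsd:double-style string to a Python float.

    Hypothetical helper: handles the INF, -INF and NaN spellings
    explicitly and hands everything else to float().
    """
    special = {"INF": math.inf, "-INF": -math.inf, "NaN": math.nan}
    if lexical in special:
        return special[lexical]
    return float(lexical)


if __name__ == "__main__":
    print(tenth_is_exact_as_double)   # False
    print(double_sum_matches)         # False
    print(decimal_sum_matches)        # True
    print(parse_xsd_double("1.5E3"))  # 1500.0
    print(parse_xsd_double("-INF") < parse_xsd_double("-1E308"))  # True
```

Note that Python's own comparison treats -0.0 and 0.0 as equal, so the
ordering of the two zeros is a property of the schema datatype rather
than something this sketch demonstrates.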
Note
----
[1] http://books.xmlschemata.org/relaxng/relax-CHP-8-SECT-1.html