[tei-council] Feature request 1933198: 'precision'
Gabriel Bodard
gabriel.bodard at kcl.ac.uk
Thu Apr 9 13:43:45 EDT 2009
David Sewell wrote:
> A detailed response from Tim.
> (2) Isn't @atLeast the same as @min and @atMost the same as @max? In my
> view it is essential to have @assertedValue (or, better, @mostLikely) as
> well so that you can give what you think is the most probable value
> within an interval.
No, @atLeast and @atMost express the likely smallest and largest values
that a single measurement or number might have, while @min and @max
represent the start and end points of a range of numbers (such as the
dimensions of a non-rectangular papyrus, or the start and end years of a
time period). In any case, these attributes are already defined
elsewhere; we're not creating them anew for this purpose.
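To illustrate the distinction, here is a minimal sketch using these
ranging attributes as they already exist on the dimension elements. The
measurements and units are invented for illustration and do not come
from a real document.

```xml
<!-- One measurement whose exact value is uncertain:
     the height is a single number, probably between 11 and 13 cm. -->
<height unit="cm" atLeast="11" atMost="13"/>

<!-- A genuine range of values: the width of an irregular fragment
     varies from 8 cm at its narrowest to 14 cm at its widest. -->
<width unit="cm" min="8" max="14"/>
```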
Bear in mind also that this element is not meant to be used in isolation
to represent a number, but (like <certainty/>) will usually point to an
attribute on another element that already contains a numerical value;
that attribute could, I should have thought, be considered to contain
the most likely value, no? (Borrowing @assertedValue from <certainty/>
is also possible, however.)
> In humanities, our assertions are rarely based on a statistical
> analysis. However, a person often has a reasonable idea of the upper and
> lower limits of probable values and an idea of the most likely value
> within the interval defined by these limits.
Yes, this is how we envisaged using @atLeast and @atMost on
<precision/>, if that was what you wanted to do.
> (3) OK, but I would prefer to use @conf (for confidence) rather than
> @degree seeing that "confidence" is the conventional term in this
> context. I would not be unhappy with @cert, but think that @conf is
> better.
I don't have strong feelings about the attribute name (I rarely do), but
I see two compelling reasons to keep @degree: (i) we are borrowing these
three attributes from <certainty/>, so keeping the same names is both
technically convenient and helpful to people, who already use them
elsewhere; (ii) while one important use for this attribute is to express
confidence in the formal way you describe, it also has the more general
use of simply expressing the degree of precision that this element is
recording. Since the element is called <precision/>, it seems slightly
contradictory to talk about the confidence of that precision.
> implies that you have done a statistical analysis.) If you haven't done
> an analysis then you need to use forensic categories instead (e.g.
> beyond reasonable doubt, more probable than not, doubtful). I believe
> that in these circumstances, the encoder should not create an illusion
> by plucking a percentage (or probability) out of the air but should
> instead use categories (e.g. high, medium, low) which correspond to
Yes, the default values would be high, medium, low (as with certainty).
User-defined values are also possible, of course.
> The forensic categories (high, medium, low) also need to include
> "unknown" for cases when the confidence level is unknown (e.g. a circa
> date).
I'm not sure I follow this--when you express a circa date, you're saying
that your degree of precision is lower than usual, not that you don't
know whether or not you're being precise.
> (4) I don't think that @stdDev is useful or necessary if @min, @max, and
> @conf are available. The standard deviation of a set of measurements can
> be used to construct a confidence interval under certain circumstances.
> However, your average punter has no idea what range of possible values a
> standard deviation implies or when it is a bad idea to use because the
> sample is too small or not randomly selected or the sampling
> distribution is not normal, etc. People who do know these things can
Right. I think @stdDeviation will only be used by someone who really
knows what it means and for whom this is precisely what they want to
record. It's certainly not compulsory, and most people may well choose to
record their precision and confidence using @notBefore and @notAfter
etc., as you describe very usefully above.
Thanks for the feedback.
G
--
Dr Gabriel BODARD
(Epigrapher & Digital Classicist)
Centre for Computing in the Humanities
King's College London
26-29 Drury Lane
London WC2B 5RL
Email: gabriel.bodard at kcl.ac.uk
Tel: +44 (0)20 7848 1388
Fax: +44 (0)20 7848 2980
http://www.digitalclassicist.org/
http://www.currentepigraphy.org/