[tei-council] Feature request 1933198: 'precision'

Gabriel Bodard gabriel.bodard at kcl.ac.uk
Thu Apr 9 13:43:45 EDT 2009


David Sewell wrote:
> A detailed response from Tim.
> (2) Isn't @atLeast the same as @min and @atMost the same as @max? In my
> view it is essential to have @assertedValue (or, better, @mostLikely) as
> well so that you can give what you think is the most probable value
> within an interval.

No, @atLeast and @atMost express the likely smallest and largest values 
that a single measurement or number might have, while @min and @max 
represent the start and end points of a range of numbers (such as the 
dimensions of a non-rectangular papyrus, or the start and end years of a 
time period). In any case, these attributes are already defined 
elsewhere; we're not creating them anew for this purpose.
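
To make the distinction concrete, here is a minimal sketch using the 
existing <height> and <width> elements; the measurements are invented 
purely for illustration:

```xml
<dimensions unit="cm">
  <!-- One measurement, imprecisely known: the height is a single
       value lying somewhere between 10 and 12 cm. -->
  <height atLeast="10" atMost="12"/>
  <!-- A genuine range: the width of a non-rectangular papyrus
       varies from 8 cm at one end to 15 cm at the other. -->
  <width min="8" max="15"/>
</dimensions>
```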

Bear in mind also that this element is not meant to be used in isolation 
to represent a number, but (like <certainty/>) will usually point to an 
attribute on another element that already contains a numerical value; 
that attribute could, I should have thought, be considered to contain 
the most likely value, no? (Borrowing @assertedValue from <certainty/> 
is also possible, however.)
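
A rough sketch of what I mean, on the understanding that nothing here 
is settled: the use of @target as the pointer is an assumption for the 
sake of the example (this message does not specify how the particular 
attribute is selected, so the sketch points only at the element), and 
the values are invented:

```xml
<!-- Hypothetical: the @quantity attribute already holds the
     value we consider most likely. -->
<height xml:id="h1" unit="cm" quantity="11"/>
<!-- Hypothetical: <precision/> qualifies that value rather than
     restating it. @target is assumed; @atLeast, @atMost and
     @degree are the attributes under discussion in this thread. -->
<precision target="#h1" atLeast="10" atMost="12" degree="medium"/>
```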

> In humanities, our assertions are rarely based on a statistical
> analysis. However, a person often has a reasonable idea of the upper and
> lower limits of probable values and an idea of the most likely value
> within the interval defined by these limits.

Yes, this is how we envisaged using @atLeast and @atMost on 
<precision/>, if that was what you wanted to do.

> (3) OK, but I would prefer to use @conf (for confidence) rather than
> @degree seeing that "confidence" is the conventional term in this
> context. I would not be unhappy with @cert, but think that @conf is
> better.

I don't have strong feelings about the attribute name (I rarely do) but 
I see two compelling reasons to keep @degree: (i) we are borrowing these 
three attributes from <certainty/>, so keeping the same names is both 
technically convenient and helpful to people, who already use them 
elsewhere; (ii) while one important use for this attribute is to express 
confidence in the formal way you describe, it also has the more general 
use of simply expressing the degree of precision that this element is 
recording: since the element is called <precision/>, it seems slightly 
contradictory to talk about the confidence of that precision.

> implies that you have done a statistical analysis.) If you haven't done
> an analysis then you need to use forensic categories instead (e.g.
> beyond reasonable doubt, more probable than not, doubtful). I believe
> that in these circumstances, the encoder should not create an illusion
> by plucking a percentage (or probability) out of the air but should
> instead use categories (e.g. high, medium, low) which correspond to

Yes, the default values would be high, medium, low (as with 
<certainty/>). User-defined values are also possible, of course.

> The forensic categories (high, medium, low) also need to include
> "unknown" for cases when the confidence level is unknown (e.g. a circa
> date).

I'm not sure I follow this--when you express a circa date, you're saying 
that your degree of precision is lower than usual, not that you don't 
know whether or not you're being precise.

> (4) I don't think that @stdDev is useful or necessary if @min, @max, and
> @conf are available. The standard deviation of a set of measurements can
> be used to construct a confidence interval under certain circumstances.
> However, your average punter has no idea what range of possible values a
> standard deviation implies or when it is a bad idea to use because the
> sample is too small or not randomly selected or the sampling
> distribution is not normal, etc. People who do know these things can

Right. I think @stdDeviation will only be used by someone who really 
knows what it means and for whom this is precisely what they want to 
record. It's certainly not compulsory, and most people may well choose 
to record their precision and confidence using @notBefore and @notAfter 
etc., as you describe very usefully above.

Thanks for the feedback.

G

-- 
Dr Gabriel BODARD
(Epigrapher & Digital Classicist)

Centre for Computing in the Humanities
King's College London
26-29 Drury Lane
London WC2B 5RL
Email: gabriel.bodard at kcl.ac.uk
Tel: +44 (0)20 7848 1388
Fax: +44 (0)20 7848 2980

http://www.digitalclassicist.org/
http://www.currentepigraphy.org/

