[tei-council] regularizing names

Julia Flanders Julia_Flanders at Brown.edu
Thu Jul 14 17:38:15 EDT 2005


Perry and I did discuss the possibility of providing regularization 
of the individual name components; I append our example below.

Your point about the name linking to a PERSON rather than to a NAME 
is an interesting problem. The intention of the examples we gave, and 
their underlying spirit, is that the <regName> element documents a 
name, not a person; it may point onward to a person (i.e. a name 
authority record about a person), but it needn't. However, I can see 
that that pointing inflects the semantics of the <regName> element: 
it can't simply regularize once it points to an authority record.

I guess the question is whether this is a practical problem. In the 
P4 universe, people often use reg= with an implicit semantics of key= 
(in the sense that a given regularization really does refer to a 
person, and may even be used explicitly that way) in cases where they 
have only unique name-person mappings. We regard this as showing want 
of judgment (because they might in future come across another John 
Smith) but not as a tagging error, as long as the value of reg= is 
something like "Smith, John" rather than "jsm0111". In the system 
we're discussing now, is there a disadvantage to using the same 
element for both functions? That is, if you want to regularize the 
name, you can do so; if you want to link to information about a 
person, you can do so. Is it ever unclear which is happening?
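
To make the P4 distinction concrete, here is a rough sketch (the name 
is invented for illustration; "jsm0111" is the identifier-style value 
mentioned above):

<!-- acceptable: reg= holds a regularized form of the name -->
<name reg="Smith, John">Jno. Smith</name>
<!-- what we would count as a tagging error: reg= holds an identifier -->
<name reg="jsm0111">Jno. Smith</name>
<!-- both functions kept apart: key= for the person, reg= for the name -->
<name key="jsm0111" reg="Smith, John">Jno. Smith</name>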

If you had a document in which there were many different people named 
John Smith, all spelled and abbreviated in wacky ways, all the 
references might legitimately point to a single <regName> element 
that regularized them all simply as names. If you wanted to use 
<regName> to link these to persons, you would need instead multiple 
<regName> elements with links to person data. Or you could just use 
key= if you didn't want to regularize the names at all.
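
As a sketch only, the three cases might look something like this. The 
ids, the key= value, and the target= values are all invented for 
illustration, and target= is just one of the two attribute names 
floated in our original proposal, so treat the details as assumptions:

<!-- 1. regularizing only: every John Smith shares one <regName> -->
<regName xml:id="rn20">Smith, John</regName>
<!-- ... -->
<persName reg="#rn20">Jno. Smith</persName>
<persName reg="#rn20">J. Smyth</persName>

<!-- 2. linking to persons: one <regName> per person -->
<regName xml:id="rn21" target="persons.xml#js1">Smith, John</regName>
<regName xml:id="rn22" target="persons.xml#js2">Smith, John</regName>
<!-- ... -->
<persName reg="#rn21">Jno. Smith</persName>
<persName reg="#rn22">J. Smyth</persName>

<!-- 3. no regularization at all: key= only -->
<persName key="js1">Jno. Smith</persName>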

But I may not be taking sufficient account of the advantages of 
knowing which is meant, programmatically, without having to draw 
inferences based on what kind of information is present.

Here's the example we had come up with for encoding/regularizing name parts:

<regName xml:id="rn17">Clinton, William J.</regName>
<regName xml:id="rn18">
	<regName xml:id="rn18.1">Jones</regName>,
	<regName xml:id="rn18.2">Herbert</regName>
	<regName xml:id="rn18.3">Fizzlebaum</regName></regName>
<!-- above <regName> elements may be somewhere in <teiHeader> or <hyperDiv>,
            and may be part of a new prosopographic element -->
<!-- ... -->
<name reg="#rn17">Bill Clinton</name>
<persName reg="#rn18"><foreName reg="#rn18.2">Herb</foreName>
<foreName reg="#rn18.3">F.</foreName>
<foreName reg="#rn18.3">"Fizzy"</foreName>
<surname reg="#rn18.1">Jones</surname></persName>

Julia

At 9:44 PM +0100 7/11/05, Lou Burnard wrote:
>The main problem I have with this is that all the examples cited 
>seem to be about linking the name to a PERSON not to a NAME, in 
>which case the existing key attribute would surely be more 
>appropriate? (That's not to say that having a <regName> child within 
><person> wouldn't be a good idea). But the only use for having both 
>a reg attribute and a key attribute on a name ought to be for one to 
>regularize the name, and the other to regularize the thing-named. In 
>which case, we have to be able to support the reg attribute on 
>components of names, e.g.
><name key="JF">
><forename reg="JULIA">Juley</forename>
><surname reg="FLANDERS">Flanderes</surname>
></name>
>
><person ident="JF">
><regName>
><forename ident="JULIA">Julia</forename>
><surname ident="FLANDERS">Flanders</surname>
></regName>
><!-- other demographic info here -->
></person>
>
>[I've used co-labelling here rather than ID/IDREF but the same 
>principle applies]
>
>in haste
>
>Lou
>Julia Flanders wrote:
>
>>  Here's what Perry and I have come up with concerning the 
>>regularization of names.
>>
>>  We propose, tentatively, that in P5 we handle the regularization 
>>of names as follows:
>>
>>  Name elements (<name> and the other "primary" name elements 
>>including <persName>, <placeName>, <orgName>, <geogName>) may carry 
>>a reg= attribute, which points to a <regName> element which 
>>contains information on regularization. This element might live in 
>>the TEI header or in some other convenient place. The datatype for 
>>reg= would be tei.data.code. The element name (<regName> as against 
>><reg>) is chosen to help distinguish the function of this 
>>regularization structure from other kinds of regularization in the 
>>document.
>>
>>  The nested naming elements such as <foreName>, <surname>, 
>>etc.--i.e. those which cannot occur on their own but must be nested 
>>inside a primary naming element--would not carry the reg= attribute 
>>and cannot be regularized independently. We considered a mechanism 
>>for doing this and if any Council members think it would be a good 
>>idea we can add this bit.
>>
>>  The <regName> element may either contain a regularized version of 
>>the name as its content, or it may carry a pointer to an external 
>>resource (either locally maintained or an authority file external 
>>to the project, such as Library of Congress Name Authority files). 
>>It could also do both, and we don't see a need to police this 
>>choice (e.g. by making these options exclusive).
>>
>>  The <regName> element would carry three attributes:
>>  --authority= indicates what authority, if any, is being used as 
>>the source of the regularization, with values such as "NACO", other 
>>possibilities
>>  --url= (or target=) points to an external resource
>>  --xml:id= allows the element to be pointed to from the naming 
>>element in the text.
>>
>>  Example:
>>
>>  <regName authority="NACO" xml:id="rn17">Clinton, William J.</regName>
>>  <regName authority="NACO" xml:id="rn18">Aldrin, Edwin Eugene, Jr.</regName>
>>
>>  <!-- above <regName> elements may be somewhere in <teiHeader> or <hyperDiv>,
>>  and may be part of a new prosopographic element -->
>>  <!-- ... -->
>>  <name reg="#rn17">Bill Clinton</name>
>>  <persName reg="#rn18">Buzz Aldrin</persName>
>>
>>  We are still concerned about how this use of reg= and <regName> 
>>will fit in with other forms of regularization (e.g. spelling). 
>>However, since all other uses of reg= will probably now be 
>>addressed as child elements, the possibility of confusion is 
>>diminished.
>>
>>  Comments welcome--
>>
>>  Julia
>>  _______________________________________________
>>  tei-council mailing list
>>  tei-council at lists.village.Virginia.EDU
>>  http://lists.village.Virginia.EDU/mailman/listinfo/tei-council
>>
>>



