[tei-council] Proposal <idno> coverage -SF 2493417
Peter Boot
pboot at xs4all.nl
Wed Jan 21 16:44:24 EST 2009
I just added the following comment to SF feature request 2493417. I cc
this to Syd and Kevin, as their postings show some interest in the issue.
Peter
SF 2493417 consists of two parts. The first part asks for some extra
examples showing that <idno> values are not necessarily numeric. Syd provided some
examples in SF bug 2457147.
The second part of the feature request asks that we change <idno> to
'extend its scope so that it can treat unique identifiers for core
components of a bibliographical reference, in particular, authors (it
should thus be part of the content model of <author> among others)'. The
rest of this comment discusses that second request.
Unique identification of scholarly authors has clear advantages:
finding an author’s other articles, finding an author’s current
affiliation, and relating non-article publications (weblog entries,
etc.) to an author all require a more robust way of identifying a person
than by name. [1] gives an illustration: the Mathematical Reviews author
database contains 32 authors called "Wang, Wei" with no additional
names. For more literature, see [2, 3, 4].
It should therefore be possible to identify scholarly authors by
something other than their name. There exist, perhaps unfortunately,
several initiatives to assign unique IDs to scholarly authors, such as
Researcher ID (http://www.researcherid.com/) and Digital Author
Identifier (http://www.surffoundation.nl/smartsite.dws?ch=ENG&id=13480).
Others have argued researchers should be identified through their OpenID
accounts (http://openid.net/). National libraries have their
(overlapping) authority files. There is an upcoming ISO standard for
identifying names/entities: International Standard Name Identifier
(http://www.isni.org/). Elsevier has its Scopus IDs.
It should be possible to store these author identifiers in (TEI)
bibliographies. We could achieve that effect in a number of ways:
(1) use @key on an <author>’s <name>
(2) use @ref on an <author>’s <name>
(3) add <author> to att.canonical and use @key or @ref on <author>
(4) create a new element <authorid> and add it to <author>’s content model
(5) extend the scope of the existing element <idno> and add <idno> to
<author>’s content model
Any solution will, however, have to cater for the fact that authors may
have multiple digital author identifiers, corresponding to different
schemes. E.g.:
- an International Standard Name Identifier might one day look like
urn:isni:12341234
- a Researcher ID looks like C-1234-2008 or
http://www.researcherid.com/rid/C-1234-2008
- a Dutch DAI looks like: info:eu-repo/dai/nl/12456454
- an OpenID might look like: https://me.yahoo.com/johndoe61
This means that any solution that relies on attributes will either need
to somehow store the identification scheme in the attribute, or have to
rely on parsing the value to guess what scheme is applicable. @key has
the added problem that it holds by definition only one value, so even if
key="researcherid:C-1234-2008"
would work, it could not at the same time hold the International
Standard Name Identifier for the researcher. @ref could hold multiple
values, but must contain URIs; we could have
ref="info:eu-repo/dai/nl/12456454 https://me.yahoo.com/johndoe61"
but then software would have to guess what scheme is applicable.
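To make the guessing problem concrete, here is a minimal Python sketch of
what such software might have to do with a multi-valued @ref. The prefix
table and the function names are hypothetical; the prefixes are simply
read off the example identifiers in this message and do not come from any
registry.

```python
# Hypothetical prefix table, derived only from the example identifiers
# in this message; no official registry of these prefixes is assumed.
SCHEME_PREFIXES = {
    "urn:isni:": "isni",
    "info:eu-repo/dai/nl/": "nldai",
    "http://www.researcherid.com/rid/": "researcherid",
    "https://me.yahoo.com/": "openid",
}


def guess_scheme(uri):
    """Return a guessed scheme name for one identifier, or None."""
    for prefix, scheme in SCHEME_PREFIXES.items():
        if uri.startswith(prefix):
            return scheme
    return None


def guess_schemes(ref_value):
    """Split a whitespace-separated @ref value and guess each scheme."""
    return {uri: guess_scheme(uri) for uri in ref_value.split()}


guessed = guess_schemes(
    "info:eu-repo/dai/nl/12456454 https://me.yahoo.com/johndoe61"
)
```

The guess is only as good as the table: an OpenID can live at any URL, so
the "openid" prefix above covers just this one provider, and a bare
Researcher ID such as C-1234-2008 is not recognised at all.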
This implies that for a robust solution we need a repeatable element
that stores the identifier’s scheme as a type or scheme attribute, and
the value either as text or as a value attribute. We can either create a
new element for the purpose, e.g. <authorid>, or reuse an existing element.
The proposal here is to use the existing <idno> element. The need to
identify authors is exactly analogous to the need to identify
bibliographic elements such as articles or monographs, the element
already has an appropriately generic name, and I see no reason not to
use it. This does not involve what Syd, on the TEI in Libraries
mailing list
(https://listserv.indiana.edu/cgi-bin/wa-iub.exe?A2=ind0901B&L=TEILIB-L&T=0&F=&S=&P=2774),
called a ‘semantic shift’: <idno> would have the same meaning it always
had, it would just be applied to new elements.
This would involve:
- changing the definition of <idno> from ‘supplies any standard or
non-standard number used to identify a bibliographic item’ to e.g.
‘supplies any standard or non-standard number used to identify
bibliographic elements’
- adding <idno> to <author>’s content model, presumably as its first
element.
We could then have e.g.
<author>
  <idno type="nldai">info:eu-repo/dai/nl/12456454</idno>
  <idno type="openid">https://me.yahoo.com/johndoe61</idno>
  John Doe
</author>
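
With the scheme stored in @type, software no longer has to guess. As a
rough illustration, the following Python sketch reads the example above
with the standard library's ElementTree. The function names are
hypothetical, and the TEI namespace is left out for brevity; a real TEI
document would need namespace-aware lookups.

```python
import xml.etree.ElementTree as ET

# The example <author> from this message, without the TEI namespace
# (an assumption made to keep the sketch short).
AUTHOR_XML = """<author>
  <idno type="nldai">info:eu-repo/dai/nl/12456454</idno>
  <idno type="openid">https://me.yahoo.com/johndoe61</idno>
  John Doe
</author>"""


def author_identifiers(author):
    """Map each child <idno>'s @type to its text value."""
    return {
        idno.get("type"): (idno.text or "").strip()
        for idno in author.findall("idno")
    }


def author_name(author):
    """Collect the text that sits outside the <idno> children."""
    # Text before the first child, plus the tail text after each child.
    parts = [author.text or ""] + [child.tail or "" for child in author]
    return " ".join("".join(parts).split())


author = ET.fromstring(AUTHOR_XML)
ids = author_identifiers(author)
name = author_name(author)
```

Here each identifier is looked up by its declared scheme, e.g.
ids["nldai"], and adding a further <idno> for another scheme requires no
change to the code.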
[1] TePaske-King, B. and Richert, N. (2001), 'The identification of
authors in the Mathematical Reviews Database', Issues in Science and
Technology Librarianship, 31.
[2] Bourne, Philip E. and Fink, J. Lynn (2008), 'I Am Not a Scientist, I
Am a Number', PLoS Computational Biology, 4 (12), e1000247.
[3] Danskin, Alan, et al. (2008), 'A review of the current landscape in
relation to a proposed Name Authority Service for UK repositories of
research outputs', (JISC).
[4] Cals, J. W. L. and Kotz, D. (2008), 'Researcher identification: the
right needle in the haystack', The Lancet, 371 (9631), 2152-53.