[tei-council] remaining EEBO TCP issues
Paul Schaffner
PFSchaffner at umich.edu
Mon Oct 21 16:27:34 EDT 2013
On Mon, Oct 21, 2013, at 15:26, Sebastian Rahtz wrote:
> does that mean you'll change your sources? or do you expect
> people like me to make the change afterwards?
Could do. Haven't yet. But yes, changes (at present only a few
changes, such as the corrections that Martin Mueller sent me this
morning) happen to the earlier files all the time. Every new release
is a re-release of all the earlier files, including some with (usually
very minor) changes. And no, these are not (yet) being seriously
tracked.
Dealing with a similar issue this week and last (and next too,
probably), which is why I've been silent lately: head down, focusing
on the
matter of re-generating headers for all 131,000 EEBO items*, without
losing any of the corrections that we've made to ProQuest metadata,
or any of the de-duping and correcting that ProQuest itself has
made to the MARC that it releases. An 'overlay' problem, as they say
in the library world.
*That is, all the unique combinations of bibliographic record and
image-set identifier: only about half of those will ever have to map
to a TCP file, but all of them are mapped to a TCP identifier, since
we create the headers and the IDs first for everything we might
ever transcribe, rather than transcribing first and then describing
them.
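
To make the 'overlay' idea concrete, here is a minimal sketch in
Python. It is not how the headers are actually generated. It assumes
the freshly released metadata and the local corrections are both
available as simple dictionaries keyed by record identifier. The
function name, the field names and the identifiers are all made up
for illustration.

```python
def overlay(fresh, corrections):
    """Return regenerated records with local corrections re-applied.

    fresh:       dict of record id -> dict of field -> value
                 (hypothetical stand-in for newly released metadata)
    corrections: dict of record id -> dict of field -> corrected value
                 (hypothetical stand-in for locally made fixes)
    """
    merged = {}
    for rec_id, fields in fresh.items():
        record = dict(fields)                       # copy the upstream values
        record.update(corrections.get(rec_id, {}))  # local corrections win
        merged[rec_id] = record
    return merged


# Illustrative data only; identifiers and values are invented.
fresh = {
    "A00001": {"title": "A sermon", "date": "1641"},
    "A00002": {"title": "A treatise", "date": "1650"},
}
corrections = {"A00001": {"date": "1642"}}

merged = overlay(fresh, corrections)
```

The point of the sketch is only the precedence rule: upstream values
are taken wholesale, and a local correction replaces a field only
where one exists, so neither side's work is lost.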
pfs
--
Paul Schaffner Digital Library Production Service
PFSchaffner at umich.edu | http://www.umich.edu/~pfs/