[tei-council] update on inclusion of TEI output in Google Books

James Cummings James.Cummings at it.ox.ac.uk
Mon Dec 16 11:06:35 EST 2013


Hi Kevin,

This all sounds good to me. I think Council would like updates 
(perhaps from Paul) on how it goes... if Council can do anything 
to assist with it, then certainly let it know. I think it is 
perfectly fine for such activity to be taking place in the 
community and only become an official TEI-C activity when any 
such work is requested.

-James

On 15/12/13 21:40, Kevin Hawkins wrote:
> Fellow Technical Council members (cc'ing Laurent Romary and Stefan
> Majewski),
>
> As those of you who have been on Council for a few years know, Google
> has expressed interest in providing TEI as a download format in Google
> Books -- I believe just for public-domain titles that were scanned at
> one of Google's Library Partners.  An engineer at Google began work on
> this in 2011 in response to a request from the Google Library Quality
> working group, which is comprised of staff from Google's various Library
> Partners and deals mostly with questions concerning the quality of scans
> and OCR.  I believe this group occasionally holds conference calls.
>
> The way we know about this is that Peter Gorman (Wisconsin) is a member
> of that group, and he looped in Martin Mueller, then Chair of the TEI's
> Board, who then looped in Council.  We were provided with a few sample
> encoded texts, and various of us sent feedback on those samples and more
> general matters.  In particular I'll mention that since Ranjith was
> interested in using the Best Practices for TEI in Libraries (at Peter
> Gorman's suggestion, I believe), I urged the engineer to aim for a level
> of encoding between Level 3 and Level 4 (rather than throwing out
> structural data he has beyond Level 3) and not worry about validation to
> the schemas provided with the BP.  We reviewed some early samples and
> heard from the engineer that he needs to lobby further for its inclusion
> in order to get it deployed to the public.
>
> Council created an ad-hoc committee on TEI for Google Books in September
> 2011 to provide further guidance. The group included James, Martin
> Holmes, Laurent Romary, and me.  We began work on a document intended to
> address the questions from others at Google:
>
> https://docs.google.com/document/d/1PWBt_y-svn8ESAFDz1KxZinKXxc9dfn6kj5sbsIbBR0/edit
>
> Unfortunately, the engineer has been slow to respond in general, having
> been pulled to other projects.  Peter Gorman tried to restart the work
> on this in February 2013, writing to a mix of people interested in the
> question, not all of whom were on the Google Library Quality working
> group or Council's ad-hoc committee.  The engineer responded that he
> would come back to this project in March 2013.
>
> At some point the Google Library Quality working group formed a TEI
> sub-group to deal just with suggesting improvements on Google's use of
> TEI markup.  In March a colleague of mine at Michigan asked that Paul
> Schaffner and I (both at Michigan as well) be added to the TEI sub-group
> of the Google Library Quality working group, though I, and I assume
> Paul, are not on the main Google Library Quality working group.  At
> about the same time, Stefan Majewski was added to the group
> (representing the Austrian National Library) and tried to kick-start the
> process of evaluating samples.
>
> Some additional samples were provided by Google:
>
> https://drive.google.com/folderview?id=0B_I1dv3x62jERUVqSFktZ3RKeXc&usp=sharing
>
> for the following items in Google Books (using Google Books'
> identifiers, based on barcodes stuck in the copy in the library from
> which it was scanned):
>
> +Z137414909
> +Z136964409
> +Z169495603
> +Z156881802
> +Z156987604
> +Z170360609
> +Z155001508
> +Z150106808
> +Z159009101
> +Z119545503
> +Z152825208
> +Z156332508
>
> Stefan looked at them recently and responded to the group with some ways
> that they might be involved.
>
> I have asked the TEI sub-group of the Google Library Quality working
> group and Ranjith whether they want the Technical Council to continue to
> have a role in this, but no one has responded.  My suggestion is that
> Council decide that since TEI expertise and Council representation are
> now being provided through folks like Paul Schaffner (who was recently
> reelected to Council for two more years), Stefan Majewski, and me, the
> ad-hoc committee can officially disband (not that it was ever official
> in much of any way!) and cede this work to the TEI sub-group.  I will,
> of course, continue to urge the group and the Google engineer in
> particular to seek input from the broader TEI community.
>
> Regardless, I will shortly share
> https://docs.google.com/document/d/1PWBt_y-svn8ESAFDz1KxZinKXxc9dfn6kj5sbsIbBR0/edit
> with the TEI sub-group since I don't believe they ever saw this document.
>
> --Kevin
>


-- 
Dr James Cummings, James.Cummings at it.ox.ac.uk
Academic IT Services, University of Oxford

