21.494 Google hosting research data

From: Humanist Discussion Group (by way of Willard McCarty <willard.mccarty_at_kcl.ac.uk>)
Date: Wed, 23 Jan 2008 07:11:33 +0000

               Humanist Discussion Group, Vol. 21, No. 494.
       Centre for Computing in the Humanities, King's College London
  www.kcl.ac.uk/schools/humanities/cch/research/publications/humanist.html
                        www.princeton.edu/humanist/
                     Submit to: humanist_at_princeton.edu

         Date: Wed, 23 Jan 2008 07:05:37 +0000
         From: James Cummings <James.Cummings_at_oucs.ox.ac.uk>
         Subject: Google hosting research data

Dear all,

I was reading recently about the plans of Google to start hosting
research data.[1] They are launching a service in which they are
starting with 120 terabytes of Hubble Space Telescope data, and in
addition the images of the Archimedes Palimpsest, which has partly
inspired this Google project. The site will supposedly feature ways
for members of the public to tag and annotate data, like images on
Flickr and videos on YouTube. (And these features and services have
certainly created tangible benefits to our global society.) While
making more data available online, especially to the
information-poor, can only be seen as a good thing, announcements
like this always leave me with a slightly uncertain sense of unease.
If there is an increasing amount of annotation on such data, what
technologies will develop to sort the wheat from the chaff? Sometimes
my unease is because I'm uncertain of how the service will
operate (i.e. will just anyone be able to annotate? Can I store my
TEI texts there?) and sometimes because I worry about Google's
motives (they don't really care *what* content, they just want as
much as possible). The issues of online storage are, of course,
quite interesting in their problems, limitations, and true costs.[3]
I was interested partly because all the announcements are quite plain
in citing both the Hubble telescope and the Archimedes Palimpsest,
and I was wondering if this was a conscious pairing of science and
humanities data. Obviously my interest in this is piqued partly by
the upcoming death of the AHDS (though the OTA will continue!).

A place where one can dump, and then successively add to and
annotate, research data certainly has its place and I will watch this
development with interest. But this, and many other developments,
always seem to fall far short of the co-operative virtual research
development platform that I've always wanted. What I want is a
SourceForge-for-humanists, where we can undertake truly collaborative
research with the benefits of software development tools (such as
a free Subversion repository, web presence, bug tracking, forums,
etc.), and alongside that all the tools one might wish for in a
variety of humanities endeavours. (Say, if a small group of us were
working on a critical edition, then image, image+text proofreading,
variant reading tools, assisted editing, online markup editors,
etc. But the point isn't just to allow this one activity, but any
number of collaborative workflows.) There have been a
number of attempts at this kind of thing, which always seem to fall
short of the possibilities. I can't picture such a thing happening
without significant commercial backing, and can't envision a
company's business model which would help to create such a thing.

In any case, I was wondering what other readers thought about this
latest development in Google's mission to control, I mean 'organize',
the world's information? Is my unease warranted, or am I just paranoid?

-James

[1]http://blog.wired.com/wiredscience/2008/01/google-to-provi.html
[2]http://searchstorage.techtarget.com/originalContent/0,289142,sid5_gci1246719,00.html
[3]http://pimm.wordpress.com/2007/09/25/googles-palimpsest-project-promiscuous-distribution-of-all-science-data-sets/

-- 
Dr James Cummings, Oxford Text Archive, University of Oxford
James dot Cummings at oucs dot ox dot ac dot uk
Received on Wed Jan 23 2008 - 02:34:19 EST
