Humanist Discussion Group, Vol. 14, No. 482.
Centre for Computing in the Humanities, King's College London
<http://www.princeton.edu/~mccarty/humanist/>
<http://www.kcl.ac.uk/humanities/cch/humanist/>

  [1] From: John Bradley <john.bradley@kcl.ac.uk> (56)
      Subject: Re: 14.0469 XML & WWW; XML references; a broader question

  [2] From: "David Halsted" <halstedd@tcimet.net> (29)
      Subject: Re: 14.0469 XML & WWW

  [3] From: "Fotis Jannidis" <fotis.jannidis@lrz.uni-muenchen.de> (22)
      Subject: Re: 14.0469 XML & WWW

  [4] From: lachance@chass.utoronto.ca (Francois Lachance) (42)
      Subject: imprint, edition, publication

--[1]------------------------------------------------------------------
Date: Wed, 08 Nov 2000 09:39:03 +0000
From: John Bradley <john.bradley@kcl.ac.uk>
Subject: Re: 14.0469 XML & WWW; XML references; a broader question

>btw, I don't think that xml aware clients will be the solution for this
>problem, because of the size of the editions.

Wendell: I also share the view that Fotis is expressing here, although I must say that I have had so little time to do serious work in this area that I'm not sure my opinions should count for TOO much these days!

Nonetheless, it seems to me that the WWW (and also much of the development work at the W3C) is predicated on the unspoken assumption that the amount of data to be exchanged between server and client is relatively small. This model may be fine for the kind of transaction-oriented B2B applications that seem to be driving developments these days, but it appears to be a serious problem when one looks at the scholarly use of texts.

I recall the first time this observation struck me -- several years ago, when I went to the text archive site at the University of Virginia (or was it Michigan?) and fetched their relatively lightly marked-up SGML-TEI documents using (as I recall) Panorama. By the nature of the web access and the "document-oriented" nature of SGML (and, to be fair, perhaps the way Panorama worked then), I had to fetch the entire document before seeing it. It took a very long time -- about 30 minutes, as I recall (this was when I was still at U of Toronto) -- before I saw anything of the document at all. Suppose that instead of looking at (merely trying to read!) a novel by Dickens I had been trying to do some analysis on all of Dickens's works. The slowness would have been only one of the problems. At the time it seemed to me that this approach -- shipping the entire document in a single gulp over the Internet before anything could be done with it -- was not going to gain wide acceptance for material of this kind.

The HTML representation of the same material was easier to handle because it had been split up into chunks -- but for scholarly use of text, at least, this chunking (except for straightforward reading on screen or printing out) was unfortunate, and, of course, the only markup one had to work with was HTML.

It might be possible to divide the document into chunks for XML processing as well, although (it seems to me at least), by the nature of the way SGML and XML work, the chunked version becomes in some sense a different document from the unchunked one when it is split into separate pieces. I know, of course, that XPointer links can be made between separate documents, and someday widely available software will be able to deal with them -- but the chunking of materials into separate XML documents, not just the linking between them, is, I think, undesirable.
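To make the chunking concrete, here is a minimal sketch of such a split -- plain Python with the standard library only; the file names and the simplified, namespace-free TEI structure are invented for illustration, not anyone's actual workflow:

    import copy
    import xml.etree.ElementTree as ET

    # Invented input: one large TEI-style edition in a single file,
    # namespaces ignored for brevity.
    tree = ET.parse("edition.xml")
    root = tree.getroot()

    # Split on the top-level text divisions. Each <div> becomes a
    # separate document -- and references that used to be internal
    # to one document now have to become inter-document links.
    body = root.find(".//body")
    for i, div in enumerate(body.findall("div")):
        wrapper = ET.Element(root.tag)
        wrapper.append(copy.deepcopy(div))
        ET.ElementTree(wrapper).write("chunk-%03d.xml" % i,
                                      encoding="utf-8",
                                      xml_declaration=True)

Any reference that pointed across one of those cut points now needs an XPointer-style target into another file -- which is exactly where the chunked version stops being "the same" text as the unchunked one.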
This becomes more and more of an issue as the amount of text in the document grows and the links between different parts (and thereby the kind of processing one might want to do on those links) become more intricate -- think, for example, about analysing text passages that cross the boundaries between the chunks provided by the electronic publisher. You may recall that I raised this problem in my presentation at Virginia, and proposed there an architectural model that is XML-based but not based on the HTTP-WWW document-chunking model. Whether it is any good or not, of course, would require me to develop it further!

All the best. ... john b

----------------------
John Bradley
john.bradley@kcl.ac.uk

--[2]------------------------------------------------------------------
Date: Wed, 08 Nov 2000 09:39:56 +0000
From: "David Halsted" <halstedd@tcimet.net>
Subject: Re: 14.0469 XML & WWW

Edition size could be addressed in a number of ways. It's true that it's probably not useful to think of individual desktops chunking through a large number of very large XML documents retrieved on the fly from remote machines, but it might be possible to think of, say, individual servers indexing a group of XML documents that are actually "stored" on other servers and making the index available to a set of users with shared interests. In addition, sites with lots of XML behind them could make useful drill-downs available to users and expose the results in XML.

So you could have a very nice set of mixed modes: sites with lots of XML could use server-side tools (including databases) to optimize searching, but could also expose the XML data stores themselves, enabling anybody with enough machine to run their own searches against the data. Users who find the site-provided tools inadequate could beef up their RAM and manipulate the data themselves to meet their own needs; in fact, those users could expose the results of their research as XML and enable the original store to link to their results. Depending on the field, the results might become part of the underlying data store or simply build a searchable interpretive layer on top of the raw data.

Eventually, we get to move beyond thinking about servers and clients, to thinking about servers talking to servers and people "peeking in" to the data, asking the servers to provide the information they want from a connected series of other servers with their data exposed in XML -- that is, publicly queryable.
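A rough sketch of the smallest version of that idea -- any machine pulling the exposed XML from several stores and running its own query. The URLs and the <persName> harvest below are invented stand-ins, assuming nothing more than plain HTTP GET over raw XML:

    import urllib.request
    import xml.etree.ElementTree as ET

    # Invented addresses: archives that expose their raw XML.
    STORES = [
        "http://archive-a.example/dickens/bleak-house.xml",
        "http://archive-b.example/dickens/hard-times.xml",
    ]

    # "Servers talking to servers": fetch each exposed document and
    # run a local query -- here a trivial harvest of <persName>
    # elements (namespaces ignored), standing in for whatever
    # drill-down a scholar actually wants.
    for url in STORES:
        with urllib.request.urlopen(url) as resp:
            root = ET.parse(resp).getroot()
        names = sorted({el.text for el in root.iter("persName") if el.text})
        print(url, names[:10])

An indexing server of the kind described above would do the same fetch on a schedule, store what it finds, and expose the resulting index as XML in its turn.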
It'd be nice to see Humanities computing develop some things here; texts and published research can be public in a way that corporate data can't, so perhaps the true potential of distributed XML models can be realized more quickly in online Humanities computing.

Dave Halsted

***
David G. Halsted, PhD
Consultant (XML/RDBMS/Web apps)
halstedd@tcimet.net

--[3]------------------------------------------------------------------
Date: Wed, 08 Nov 2000 09:40:30 +0000
From: "Fotis Jannidis" <fotis.jannidis@lrz.uni-muenchen.de>
Subject: Re: 14.0469 XML & WWW

From: Wendell Piez <wapiez@mulberrytech.com>

> How large do you expect these editions to be?

What we have now are electronic editions of a few megabytes. To give an example: the rather small edition "Der junge Goethe in seiner Zeit" ("The Young Goethe in His Time") comes to about 35 MB. But this will grow quickly, and I expect editions on a single server to run to several gigabytes in 10-20 years. I am not talking about commercial editions like the ones offered by Chadwyck-Healey, because they can solve these interoperability problems within their company, but about editions put on the net by the scholars who created them.

> Why would server-side
> processing be better for large editions?

At the moment: because the browsers can't offer any kind of processing that would be useful for solving this problem. In the future: probably there will be a division of labor between XML browsers and servers. It would make our work easier if we agreed early on a common solution.

> Or possibly I mistake you. If you mean to say XML-aware clients will not be
> the *entire* solution to the problem, I agree.

Yes, that is exactly what I wanted to say. But your question sounds to me as if you have some ideas about how to handle these problems. I am very interested in any ideas.

Fotis Jannidis

--[4]------------------------------------------------------------------
Date: Wed, 08 Nov 2000 09:41:42 +0000
From: lachance@chass.utoronto.ca (Francois Lachance)
Subject: imprint, edition, publication

Patrick,

How would your argument about the open-endedness of electronic editions work if the volatility of texts were a consequence of social practices and less so of technologically determined paradigms? (The question is of course moot if you consider "paradigms" to be expressions of social practice.) I am just a little wary of a quasi-ahistorical assertion of a single monolithic "print-medium paradigm of publication". And so I like to generalize in a most grandiose fashion: all texts are volatile. Electronic distribution may actually help preserve the variants that contribute to the creation of an edition. The vapours are captured in many media. Paper plus voice plus screen contribute to the preservation of variation.

A consideration of the multimedia and audiovisual components of textual expression certainly challenges the often dichotomous, crypto-McLuhanesque debate over print versus electronic. If an edition is a set of readings of records of performances, then by its very matricial structure it is not only a gathering of what was witnessed but also an index of what might have been. Whatever the medium in which it is expressed, an edition contains a certain amount of conjecture. And it is the opening of an edition's working hypotheses to testing that contributes to its incompleteness (in the sense of possible-world semantics) -- not the medium in which the expression of those working hypotheses is fixed.

I just wonder how the link between systems of distribution and authorial control is any different for the written word, the spoken word, the film, the song, the symphony, the painting either hung in a gallery or reproduced as a digital image. We can ask ourselves what cultural conditions result in gallery spaces where viewers can adjust the lighting, or concert spaces where the sound is not uniform for every point in the space (for example, Morton Feldman's _Rothko Chapel_). There is a wholesale attitude towards temporality and the possibility of intersubjective experience that accompanies people's use of media and their discourse about the use of media. Some of us begin from a non-Parmenidean position: change is the very basis upon which we can build shared experiences. Media can help in two ways: as facilitators of change and preservation, and as facilitators of sharing (and hoarding). I'm not quite sure whether a necessary (as opposed to fortuitous) connection exists between the two types of facilitation.

Any thoughts?
--
Francois Lachance, Scholar-at-large
http://www.chass.utoronto.ca/~lachance
Member of the Evelyn Letters Project
http://www.chass.utoronto.ca/~dchamber/evelyn/evtoc.htm