Humanist Discussion Group, Vol. 14, No. 299.
Centre for Computing in the Humanities, King's College London
<http://www.princeton.edu/~mccarty/humanist/>
<http://www.kcl.ac.uk/humanities/cch/humanist/>

Date: Tue, 03 Oct 2000 13:16:02 +0100
From: Wendell Piez <wapiez@mulberrytech.com>
Subject: Re: 14.0295 primitives

Osher Doctorow writes:

> Could I prevail
>upon Wendell to possibly restate his thesis, if any, in one sentence
>comparable to my political history-prehistory declaration that permutations
>of A, B, and N in Shakespearean play contexts contain all the content of
>political history-prehistory?
>
>Yours Faithfully,
>
>Osher Doctorow

I'm afraid not: I really have no talent for such flights at least in the context of an e-mail list. Rather, let Osher take the post for what it's worth to him -- if that's not much, that's perfectly fine; I don't expect any post I write to be on target for all readers.

Instead (and as long as I'm being summoned back to the floor), I'd like to try and take the discussion a step further -- I accept Mr. Doctorow's challenge to be more abstract and far-reaching, even if I'm not more concise and conclusive. There are five points; please feel free to use your delete key (or the moral equivalent thereof).

1. There is apparently a difference between "methodological primitives" in the sense that Ott, Bradley and myself were taking them, which is to say core operations to be performed on a specified data set via an automated process, and in the sense that Prof. Unsworth is meaning them, as irreducible operations performed by a scholar as he or she goes about the work of tracing, understanding, and presenting a thesis about a text or subject of research. (I'll let Willard speak for himself.) There is also, at least potentially, a relation between these two things, as many of us have experienced in our own work.
The implication has been that if we have the first (paraphrase this as "if we can teach our computers to help us read, find, sort, filter and so forth") we can facilitate the second.

2. A key difference between what a computer does in performing operations on a text, and what a human reader does, is that the data set (the "input") on which the computer operates is finite and bounded, whereas what the human reader brings is unknown and variable. It may be finite, although large, but since its bounds are unknown, and since no two human readers (or even readings) bring the same context to bear on a text, it is, practically speaking, infinite and unknowable. (Caveat: the Internet and the web now make it possible for a computer's inputs themselves to be practically infinite and unbounded, because unknowable; nevertheless we have hardly begun to think about what this may mean for automated processing of texts.)

3. One ancient technique for bridging this gap is to teach the computer something about what we know about a text, and to design its interfaces and its processes in such a way as to give us better access to the full range of this knowledge than we can ourselves achieve unaided. I say "ancient" because this work is far older than digital processing. Add a table of contents or an index to a text, or line and verse numbering, or lay out the text on the page with chapter titles in a larger type face, and you are beginning to "teach [the book] to help us read, find, sort, filter and so forth". With computers, examples of this practice would include text encoding, or markup, as well as the addition of external sources of information such as databases, dictionaries, "knowledge bases" etc.

4. Historically, one barrier to this work has been (as far as computers and automation have been concerned) that to design these interfaces and processes, we have had to invest in technologies and methods that mask the processes as much as they reveal them.
This has largely been because of the design of our tools and the esoteric knowledge they have themselves required. It is as if we had created indexed commentaries on Classical Chinese poetry, but written them in English (finding that with our keyboards it is easier to compose an alphabetical index in English), thereby requiring our Chinese audience to learn English (on top of Classical Chinese) to get the benefit of the commentaries. (Not only that, but we have used a dialect of English that will be largely obsolete in five years.)

This problem has been faced not only by "Computing Humanists" but also by the culture as a whole (or marketplace, if you like), which has invested untold millions in systems of computer-based automation that, whatever benefits they have delivered, have always fallen short of promises. Consequently, there have been waves of development working to ameliorate the problem in one way or another. The emergence of object-oriented programming methodologies, including the notion of "strong data typing", is one such wave; the emergence of standards-based markup languages is another. My earlier post tried to trace how these two developments should in theory complement one another, and how industry is now moving forward quickly on that basis to deal with its own analogous problems. Nevertheless, I argued, in the context of Humanities research we have a considerable way to go, even to match what has long been done with such structures as indexes and footnotes in the printed book -- at least, that is to say, if we want to do it on a basis that can reach beyond that five-year half-life that computer applications have faced.

5. Even so, the gap remains between an automated process, working on known inputs, and a human process, working with who-knows-what "extraneous" but all-important -- all-pervasive and all-conditioning -- knowledge, memory, intuition, assumptions, imagination.
Human readers perceive in a text (just for example) the implicit logics of narrative ordering; intertextual references; metaphorical correspondences; ironies. What would it take to teach a computer to perceive these on our behalf? Out of what methodological primitives, subject to automation, can such operations be built?

Respectfully,
Wendell

======================================================================
Wendell Piez                            mailto:wapiez@mulberrytech.com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD 20850                                  Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================