Humanist Discussion Group, Vol. 16, No. 209.
Centre for Computing in the Humanities, King's College London
<http://www.princeton.edu/~mccarty/humanist/>
<http://www.kcl.ac.uk/humanities/cch/humanist/>
Date: Sun, 15 Sep 2002 07:43:18 -0700
From: Patrick Durusau <pdurusau@emory.edu>
Subject: Re: 16.201 a for-the-first-time residue?
Willard,
The notions that SGML/XML allowed "discovery" of overlap and that
overlapping is a "residual" problem in markup are seriously flawed. The
first confuses the limitations of a technique with the subject under
examination, and the second confuses the "solution" with the problem space.
On the first point, note that Michael Sperberg-McQueen says:
<snip>
> > Overlap, for instance, was not a
> > problem before SGML. Pre-SGML systems had no trouble encoding what we
> > would refer to as overlapping structures. Of course, those systems and
> > their users didn't think of them as overlapping structures: overlap was
> > not something that you would conveniently describe before SGML, because
> > before SGML the notion that documents had structure was hardly something
> > you could talk about coherently.
and,
<snip>
> > It is a problem which emerged
> > which allowed us to see it and formulate it only when we adopted SGML
> > and XML. SGML and XML can in some sense be said to have allowed us to
> > discover overlap, in that they have provided the conceptual framework
> > within which the problem of overlap can be formulated concisely for the
> > first time.
It is true that overlap could not be described as an SGML/XML parsing
problem prior to the invention of those markup languages, but that seems
to me to be a description of the poverty of structures possible (in XML at
least) rather than a commentary on the problem space. As Michael noted,
prior solutions had no such problems, but he did not contend that texts
lacked such structures prior to the invention of SGML/XML.
It is in fact unfair to SGML to lump it in with the poverty of structures
that are possible to express in XML, where overlapping structures are
simply ignored for the sake of the solution. SGML could in fact represent
overlapping structures (through its optional CONCUR feature), a capability
that was dropped from XML. One strategy to support the XML solution is to
marginalize overlap as a "residual" problem and hence "interesting" but
trivial in light of the major problems being solved. (SGML solves the same
problems without the limitations of XML, a fact that is often overlooked.)
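To make the XML limitation concrete, here is a minimal Python sketch. The element names (text, l, s), the sample strings, the character offsets and the helper names is_well_formed and crosses are all illustrative assumptions, not drawn from any particular encoding scheme. It shows that an XML parser rejects a sentence that crosses a line boundary, while the same two structures can be recorded side by side as plain character ranges outside the markup.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the standard-library XML parser accepts the string."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def crosses(a: tuple, b: tuple) -> bool:
    """True if two (start, end) ranges overlap without one containing the other."""
    return a[0] < b[0] < a[1] < b[1] or b[0] < a[0] < b[1] < a[1]


# Sentences <s> that sit neatly inside lines <l>: a single hierarchy.
nested = "<text><l><s>One sentence.</s></l><l><s>Another.</s></l></text>"

# A sentence that starts in one line and ends in the next: two hierarchies
# that overlap, which XML's single-tree model cannot express directly.
overlapping = (
    "<text><l>Start of line, <s>a sentence that starts here</l>"
    "<l>and ends here,</s> then more.</l></text>"
)

print(is_well_formed(nested))
print(is_well_formed(overlapping))

# The same situation as hypothetical character ranges kept apart from the text.
line_1 = (0, 30)
line_2 = (30, 60)
sentence = (10, 45)
print(crosses(line_1, sentence), crosses(line_2, sentence))
```

The text itself has both structures in either case; only the second representation is able to say so without giving one of them up.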
The second, and perhaps more serious, flaw I see in Michael's argument is
that it confuses the solution with the problem space. I can best illustrate
that with the following analogy:
Consider the need for and use of maps prior to the invention of the
Mercator projection technique in the 16th century. Maps, which were
produced on flat surfaces, could not account for the known fact that the
surface being represented was in fact curved. This led to serious
navigational errors and problems for sailors who wanted to venture beyond
the safety of shore lines.
If the problem space is defined as movement from one place to another where
the distances are not affected by the mapping distortion, that would mean
that pre-Mercator maps solve all but "residual" problems. After all, the
majority of travel (at least in the 16th century) involves distances that
are not affected by such problems. That solves the majority of cases and
leaves only a "residue" that is an "interesting" but hardly compelling problem.
To take the self-imposed limitations of XML as a definition of the problem
space puts markup languages in a similar position to pre-Mercator maps. It
works for a large number of cases but I would hardly describe the remaining
portion of the problem space as a "residue." In fact, I would suggest that
the flat-text view of XML makes apparent a number of interesting issues
with texts but leaves them just beyond our reach.
It is possible to simply flatten texts to conform to the limitations of
XML, but torturing a text to fit our technique seems to me to be a poor
solution. Just as living in a post-Mercator world had more possibilities
for successful navigation, markup strategies that more closely approximate
texts (rather than the reverse) will lead to richer analysis and discoveries.
Patrick
--
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
pdurusau@emory.edu