Markup: footnotes and apparatus (238)

Willard McCarty (MCCARTY@VM.EPAS.UTORONTO.CA)
Mon, 6 Mar 89 19:47:55 EST


Humanist Mailing List, Vol. 2, No. 676. Monday, 6 Mar 1989.


(1) Date: 6 March 1989 09:50:54 CST (168 lines)
From: "Michael Sperberg-McQueen 312 996-2477 -2981" <U35395@UICVM>
Subject: footnotes and apparatus

(2) Date: 6 March 1989 15:44:00 CST (50 lines)
From: "Michael Sperberg-McQueen 312 996-2477 -2981" <U35395@UICVM>
Subject: apparatus, cont'd -- if only it were so simple ...

(1) --------------------------------------------------------------------
Date: 6 March 1989 09:50:54 CST
From: "Michael Sperberg-McQueen 312 996-2477 -2981" <U35395@UICVM>
Subject: footnotes and apparatus

Charles Faulhaber asks for a utility to make text and apparatus "usable"
in a Unix environment; Bob Kraft asks what's wrong with the form he's
got (with the apparatus intercalated at the point of reference), and
further asks "where *should* footnotes and apparatus go?" I can never
resist a discussion of textual apparatus . . .

I can't answer Charles Faulhaber's question because he doesn't say what
will count as usable for him. If he wants a screen editor which will
scroll an apparatus window in synch with the text window, I wish him
luck but don't think he'll get one soon. If he just wants to separate
the text from the apparatus, shouldn't awk do that? (How you'll ever
get them back together, though, I can't imagine; I'd leave them be if I
were you.)

But if one wanted to *write* an editor or other software to understand
critical apparatus, where should the apparatus go? Bob rightly focuses
on this as the interesting question. The Text Representation Committee
of the Text Encoding Initiative will, as Bob says, surely kick this one
around a good deal. But it seems to me there are at least two
plausible answers (which I want to describe here to get my two cents'
worth in before the committee takes up the issue).

1. If I control the text myself, I'd almost always just as soon have
the notes and the apparatus embedded at their reference points in the
running text. Notes, then, would be embedded at the point where the
footnote symbol should appear in the text -- at the end of the passage
or phrase being annotated. Apparatus, similarly, would have to be
embedded at the point in the reading text associated with the variation
(where the Nestle symbol would go if you were using Nestle symbols).
<note>Nestle symbols are little marks in the reading text designed to
draw your attention to omissions, additions, transpositions, and
variants in other manuscripts, given in full by the apparatus; the idea
is to allow the reader to monitor the apparatus without having to look
down at it constantly to see whether it has anything for the current
line. They were developed by an editor of the Greek New Testament and
seem to be rarely used outside it; most people seem to find them
distracting.</note> Embedding the apparatus complicates things a bit,
because variants in the apparatus are associated with the point where
the variation begins, not where it ends, so they sometimes seem to run
backwards.

This isn't too bad for just a few variants:

<poem n=214>
<stanza>
I taste a liquor never brewed--
From Tankards scooped in Pearl--
Not all the <var lemma='Frankfort Berries' ms='autograph alternate'
reading='Vats upon the Rhine'>Frankfort Berries
Yield such an Alcohol!
<stanza> ...

(The variant is an alternate entered in the autograph ms.)

But it gets hard to read with even a few more variants, even if we
allow ourselves to abbreviate the lemma to the first letter or so of
each word. We also have to begin marking verse boundaries explicitly,
since the tags make the verses run over the line ends so often.

<poem n=214>
<stanza>
<v>I taste a liquor never <var lemma='b.--' ms=SDR
reading='brewed,'> brewed--
<v>From <var lemma='T.' ms=SDR reading='tankards'>Tankards scooped
in <var lemma='P.--' ms=SDR reading='pearl;'>Pearl--
<v>Not <var lemma='a. t.' ms=SDR reading='[omisit]'>all the
<var lemma='Frankfort Berries' ms='autograph alternate'
reading='Vats upon the Rhine'>Frankfort
<var lemma='B.' ms=SDR reading='berries yield the sense'>Berries
<v><var lemma='Y.s.a.A.' ms=SDR reading='Such a delirious whirl.'>
Yield such an Alcohol!
<stanza> ...

(The new variants are from the poem's publication in the Springfield
Daily Republican of 4 May 1861.)

I am improvising an SGML markup which I hope is fairly obvious: '<var>'
marks the beginning of a variation, and takes the attributes 'lemma'
(the reading in the base text), 'ms' (which mss have the variant about
to be given) and 'reading' (what they read instead of the lemma). The
tag 'var' marks only the beginning of the lemma, not the end, because,
as we see in verse 3, multiple variations can hit overlapping lemmata,
and SGML doesn't handle that very well. (We *could* mark begin and end of
the variation explicitly, and then we wouldn't need the 'lemma'
attribute, but I believe the result would be even uglier than this
approach. This approach also at least looks something like a normal
positive apparatus.)
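
The embedded form above can be pulled apart again mechanically. The
following is a minimal Python sketch, not a real SGML parser: it assumes
the improvised <var> tag just described, assumes that no attribute value
contains a '>' or a single quote, and uses invented names
(split_apparatus, VAR_TAG, ATTR) purely for illustration.

```python
import re

# One improvised <var ...> tag; [^>]* also matches newlines, so a tag
# may be wrapped over several lines as in the examples above.
VAR_TAG = re.compile(r"<var\b([^>]*)>")
# name='quoted value' or name=bare-value inside a tag.
ATTR = re.compile(r"(\w+)\s*=\s*(?:'([^']*)'|([^\s>]+))")


def split_apparatus(marked_up):
    """Return (reading_text, variants) for text with embedded <var> tags."""
    variants = []

    def collect(match):
        attrs = {}
        for name, quoted, bare in ATTR.findall(match.group(1)):
            attrs[name] = quoted if quoted else bare
        variants.append(attrs)
        return ""  # remove the tag from the reading text

    reading_text = VAR_TAG.sub(collect, marked_up)
    return reading_text, variants


sample = (
    "<v>I taste a liquor never <var lemma='b.--' ms=SDR\n"
    "reading='brewed,'> brewed--\n"
    "<v>From <var lemma='T.' ms=SDR reading='tankards'>Tankards scooped\n"
)
text, variants = split_apparatus(sample)
```

Here 'variants' holds one dictionary of attributes per <var> tag, in
document order, and 'text' is the reading text with the <v> tags left in
place. Going the other way (re-embedding the variants) would need the
position of each tag to be recorded as well, which this sketch does not
attempt.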

2. For the sake of easy legibility, we might prefer to bunch the
apparatus sentence by sentence (or stanza by stanza for poetry), to keep
the machine-readable form human-readable:

<poem n=214>
<stanza>
<v id=1.1>I taste a liquor never brewed--
<v id=1.2>From Tankards scooped in Pearl--
<v id=1.3>Not all the Frankfort Berries
<v id=1.4>Yield such an Alcohol!
<apparatus>
<var line=1.1 lemma='b.--' ms=SDR reading='brewed,'>
<var line=1.2 lemma='T.' ms=SDR reading='tankards'>
<var line=1.2 lemma='P.--' ms=SDR reading='pearl;'>
<var line=1.3 lemma='a. t.' ms=SDR reading='[omisit]'>
<var line=1.3 lemma='Frankfort Berries' ms='autograph alternate'
reading='Vats upon the Rhine'>
<var line=1.3 lemma='B.' ms=SDR reading='berries yield the sense'>
<var line=1.4 lemma='Y.s.a.A.' ms=SDR reading='Such a delirious
whirl.'></apparatus></stanza>
<stanza> ...

Separating apparatus from the base text has forced us to add an 'id'
attribute on the <v> (verse) tags, to identify each line within the
stanza, and to add a 'line' attribute to the <var> tag to link (the
beginning of) each variation to a line of text. Apart from the labeling
of each attribute, this looks to me a lot like the apparatus criticus I
know and love from good critical editions.

If we were confident of the absolute consistency of the linear file, we
might even use the SGML 'Shortref' facility to make the whole thing look
more like a normal apparatus: defining 'new line plus number plus
period' as '<var line=[the number]', the next string up to a right
square bracket as the lemma, what follows, up to a space followed by an
'=' and a non-blank, as the variant reading, and the string after the
'=' as the list of mss with that reading:

<poem n=214>
<stanza>
<v id=1.1>I taste a liquor never brewed--
<v id=1.2>From Tankards scooped in Pearl--
<v id=1.3>Not all the Frankfort Berries
<v id=1.4>Yield such an Alcohol!
<apparatus>
1.1 b.-- ] brewed, =SDR
1.2 T. ] tankards =SDR
1.2 P.-- ] pearl; =SDR
1.3 a. t. ] [om.] =SDR
1.3 Frankfort Berries ] Vats upon the Rhine =autograph alt.
1.3 B. ] berries yield the sense =SDR
1.4 Y.s.a.A. ] Such a delirious whirl. =SDR
</apparatus></stanza>
<stanza> ...
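
The splitting rule just described for the compact apparatus lines can be
tried out without any SGML machinery. This is a rough Python sketch of
that rule only, under the same assumption of an absolutely consistent
file; it says nothing about how a Shortref declaration would actually be
written, and the names ENTRY and parse_entry are made up for the example.

```python
import re

# line number, lemma up to the first ']', reading up to ' =' followed by
# a non-blank, then the witnesses to the end of the line.
ENTRY = re.compile(r"^(\d+\.\d+)\s+(.*?)\s*\]\s*(.*?)\s=(\S.*)$")


def parse_entry(line):
    """Split one compact apparatus line into its four parts."""
    m = ENTRY.match(line.strip())
    if m is None:
        raise ValueError(f"not an apparatus entry: {line!r}")
    number, lemma, reading, witnesses = m.groups()
    return {"line": number, "lemma": lemma, "reading": reading, "ms": witnesses}


entries = [
    parse_entry("1.1 b.-- ] brewed, =SDR"),
    parse_entry("1.3 a. t. ] [om.] =SDR"),
    parse_entry("1.3 Frankfort Berries ] Vats upon the Rhine =autograph alt."),
]
```

Because the lemma stops at the first right square bracket, a reading
such as '[om.]' survives intact, but a lemma that itself contained a ']'
would be split in the wrong place.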

3. With the rise of CDs, it will not be uncommon for us to work with
texts frozen on CDs -- for these texts, we will have no choice but to
keep our annotations and our apparatus separate from the text. Unless
we want to copy the file entire onto a read/write medium, we will have
to find a way of linking our apparatus to a text possibly encoded with
no expectation of receiving any apparatus at all.

This requires nothing new, except that (a) we have to identify the poem
and stanza as well as the line, and (b) we have no guarantee that the
lines of the base text were given identifiers, so our SGML parser may
not be able to verify that our 'line=' values in the apparatus point at
real lines, and our application program will have to be responsible for
finding the lines they do point at.
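
What "finding the lines" might involve can be sketched in Python. The
sketch assumes a good deal that a real frozen text may not offer: that
each poem is available as plain text keyed by its number, that stanzas
are separated by blank lines, and that verses are one per line. The
names locate and lemma_found are hypothetical.

```python
import re


def locate(base_texts, poem, stanza, line):
    """Return the verse that (poem, stanza, line) points at, or None."""
    text = base_texts.get(poem)
    if text is None:
        return None
    # Stanzas are assumed to be separated by one or more blank lines.
    stanzas = [s.splitlines() for s in re.split(r"\n\s*\n", text.strip())]
    if not 1 <= stanza <= len(stanzas):
        return None
    verses = stanzas[stanza - 1]
    if not 1 <= line <= len(verses):
        return None
    return verses[line - 1]


def lemma_found(base_texts, poem, stanza, line, lemma):
    """Check that the unabbreviated lemma really occurs in the verse."""
    verse = locate(base_texts, poem, stanza, line)
    return verse is not None and lemma in verse


base_texts = {
    214: (
        "I taste a liquor never brewed--\n"
        "From Tankards scooped in Pearl--\n"
        "Not all the Frankfort Berries\n"
        "Yield such an Alcohol!\n"
    )
}
verse = locate(base_texts, 214, 1, 3)
```

The check in lemma_found only works for lemmata given in full; the
abbreviated lemmata used earlier ('b.--', 'Y.s.a.A.') would need a
looser comparison.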


Speaking for myself, it seems to me that any of these three approaches
ought to be possible, though I don't know that the last one can be done
cleanly (pointers to unmarked passages seem fraught with complications).

Of course, this is just my opinion, and I am eager to be instructed by
those with different ideas about the problem of apparatus.

Michael Sperberg-McQueen
(2) --------------------------------------------------------------54----
Date: 6 March 1989 15:44:00 CST
From: "Michael Sperberg-McQueen 312 996-2477 -2981" <U35395@UICVM>
Subject: apparatus, cont'd -- if only it were so simple ...

Before anyone else points it out, I should probably admit that the SGML
markup sketched out in my last note was too simple in at least one way:
it cannot provide more than one alternate reading for the same lemma
without repeating the lemma. The problem did not arise in Dickinson
poem 214, but it does in (say) poem 636:

. . .
<v id=2.4>And slowly pick the lock--
<apparatus>
<var line=2.4 lemma='slowly' ms='margin' reading='slily'>
<var line=2.4 lemma='slowly' ms='margin' reading='softly'>
. . .

We really ought to be able to define a variation as comprising one lemma
and one or more pairs of reading-plus-ms, so we can code it thus:

. . .
<v id=2.4>And slowly pick the lock--
<apparatus>
<variation line=2.4 lemma='slowly'>
<variant ms='margin' reading='slily'>
<variant ms='margin' reading='softly'>
</variation>
. . .

Or thus:

. . .
<v id=2.4>And slowly pick the lock--
<apparatus>
<variation>
<line>2.4</>
<lemma>slowly</>
<reading>slily <ms>margin</></reading>
<reading>softly <ms>margin</></reading>
</variation>
. . .

This last version has the advantage of allowing tags for highlighting or
other features to appear within lemmata and variants, and the obvious
disadvantage of crowding the text with a *lot* of markup. It also makes
it slightly less obvious that one 'variation' corresponds to exactly one
'line' and one 'lemma', which is why I tend offhand to prefer the other
versions.
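
For software reading such markup, the "one lemma, one or more
reading-plus-ms pairs" idea amounts to a small data structure. Here is a
minimal Python sketch of it, with hypothetical class and function names
(Reading, Variation, group_variants); it regroups flat records of the
kind the earlier <var> tag carries.

```python
from dataclasses import dataclass, field


@dataclass
class Reading:
    text: str
    ms: str


@dataclass
class Variation:
    line: str
    lemma: str
    readings: list = field(default_factory=list)


def group_variants(flat):
    """Merge flat records that share a line and a lemma into Variations."""
    grouped = {}
    for rec in flat:
        key = (rec["line"], rec["lemma"])
        variation = grouped.setdefault(key, Variation(rec["line"], rec["lemma"]))
        variation.readings.append(Reading(rec["reading"], rec["ms"]))
    return list(grouped.values())


flat = [
    {"line": "2.4", "lemma": "slowly", "ms": "margin", "reading": "slily"},
    {"line": "2.4", "lemma": "slowly", "ms": "margin", "reading": "softly"},
]
variations = group_variants(flat)
```

Grouping on line plus lemma is itself an assumption: if the same word
occurred twice in one verse, the two occurrences would be merged, so a
real application would need some further way of telling them apart.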

Michael Sperberg-McQueen