[tei-council] Content for <pb/> etc. [was: soft hyphens (again)]

O'Donnell, Dan daniel.odonnell at uleth.ca
Fri Jun 25 11:57:30 EDT 2010

On 10-06-25 07:48 AM, Elena Pierazzo wrote:
> Sorry if I intervene only now, but I have a case for which a content
> (even a textual content!!) for <pb/> might be very important.
> When you transcribe manuscripts you want to be able to record
> paginations and foliation for which you normally would use
> <pb n="my page number"/>.
> But what happens when there is a correction on a page number? I have
> found this case and many others while working on Austen manuscripts.
> Here are the cases we found:
> - page number missing (in a manuscript that normally has page numbers):
> it would be good to be able to use <pb><supplied>45</supplied></pb> (or
> similar)
> - page number corrected: Austen writes 78, then corrects the 8 into 7:
> <pb>7<subst><del>8</del><add>7</add></subst></pb>
> - page number is wrong: Austen writes 56 instead of 57:
> <pb>5<choice><sic>6</sic><corr>7</corr></choice></pb>
Interesting question. I guess I'd ask, from a theoretical perspective,
whether pb/@n is not supposed to refer to an editorially determined
number rather than to text in the witness, whereas the example you are
discussing is really content. I.e., whether @n is not metadata, and
Austen's mistakes and corrections content. For me, pb has always been a
metadata tag describing the concept of a page break rather than actual
textual content.

If you go with your interpretation, it raises other questions. For
example, if Austen writes her page numbers in the bottom corner, should
you use pb to mark the ends of pages rather than the beginnings? Or if,
as also happens, you end up with multiple paginations in multiple places
(top right, corrected in bottom right, for example), what happens? It
seems to me that if the goal is to record the fact that the witness has
some interesting pagination content, that should be treated as part of
the content rather than part of pb--this is a great place for the new
genetic module proposal.
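
A minimal sketch of that separation, under an assumption that is not
settled anywhere in this thread: the written number is carried by the
existing <fw> (forme work) element, while pb/@n keeps the editorially
determined number. The type and place values are illustrative only.

```xml
<!-- metadata: the editorially determined page number -->
<pb n="57"/>
<!-- content: what the witness actually shows (56 written in error for 57) -->
<fw type="pageNum" place="top-right">5<choice><sic>6</sic><corr>7</corr></choice></fw>
```

With this split, pb stays empty and the correction markup sits on
transcribed text, wherever on the page the number happens to appear.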

A place where pb/@n is not adequate even from a strictly theoretical
perspective is when a document has multiple
canonical/historical/editorial paginations, e.g. when the actual
pagination is in dispute or has changed over time (as happens in
some manuscripts that have been rebound or have lost pages).
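
One possible workaround, offered only as a sketch: repeat pb at the same
point and distinguish the numbering schemes with the @ed attribute. The
values "original" and "rebound" are hypothetical labels, and whether
this stretches @ed beyond its intended use is an open question.

```xml
<!-- one physical page break, numbered under two different schemes -->
<pb ed="original" n="57"/>
<pb ed="rebound" n="55"/>
```
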
> One of the leading ideas behind P5 was to move any textual content from
> attributes to elements; I think <pb> has escaped this revision. While
> <lb/> elements have numbers that are rarely written on the page, page
> numbers are often actual symbols on the page and therefore an editor
> would like to be able to transcribe them with all the possible features
> they may have (correction, alteration, underlining, etc.), in the same
> way one can transcribe other words written by the author.
I think whatever we do, it should be allowed for all the *b/ elements: 
page numbers may be more common than with others, but lines and columns 
(and quires and every other kind of break) do show up with the same 
problems, so whatever the solution, it's just inviting trouble if we 
don't allow it for all the relevant milestones.
> Elena
> Gabriel Bodard wrote:
>> On 21/06/2010 16:02, Kevin Hawkins wrote:
>>> I'm afraid I still don't understand.  Are these elements no longer empty
>>> elements?  They still appear that way even in Sebastian's test release.
>>> When did this change happen?
>> Which elements? <gap/> (always an "empty"--i.e. permitting no
>> text-content--element) has taken a child <desc> ever since P5's first
>> release; <space/> for some reason did not, although it does now. Both
>> also take <certainty/>, <precision/> etc. It makes sense to me that a
>> so-called empty element could contain another non-text-bearing element
>> such as <certainty/>, which in this case serves as a much richer
>> analogue to a cert attribute.
>> At the moment, of course, <lb/> and the other milestoneLike elements are
>> still literally empty, but I am arguing (lightly, as I don't have a
>> specific use-case for them) that they could just as rationally be
>> allowed to take a child from the certainty class, although still no text
>> content, of course.
>> Sorry if this was confusing.
>> G

Daniel Paul O'Donnell
Professor of English
University of Lethbridge

Chair and CEO, Text Encoding Initiative (http://www.tei-c.org/)
Co-Chair, Digital Initiatives Advisory Board, Medieval Academy of America
President-elect (English), Society for Digital Humanities/Société pour l'étude des médias interactifs (http://sdh-semi.org/)
Founding Director (2003-2009), Digital Medievalist Project (http://www.digitalmedievalist.org/)

Vox: +1 403 329-2377
Fax: +1 403 382-7191 (non-confidential)
Home Page: http://people.uleth.ca/~daniel.odonnell/
