[tei-council] Von Braun toc with some content

Peter Boot pboot at xs4all.nl
Wed Aug 12 17:33:02 EDT 2009


This is what I imagine the body of the response might look like. Your 
thoughts are welcome.

Peter


1. Introduction

TEI is a standard that has been successfully and widely used in the 
digital transcription of texts from many periods. It has been used both 
for mass digitisation in digital libraries and for digitisation of 
literary manuscripts.
It is successful for a number of reasons:
* it contains modules both for commonly occurring textual features, 
such as lists or tables, and for specialised features such as 
linguistic analysis. Recently a module for technical documentation was 
developed for ISO. Where new features are necessary, the system can be 
easily extended;
* it focuses on the creation of an application-independent digital 
representation of the source document. Because this representation does 
not depend on the capabilities of specific software, it will outlast 
today's software and can be used for many different purposes;
* [more bragging]
The basic idea is that each document is encoded as an XML file. (There 
is a glossary at the back of this document that explains technical 
terms.) This XML file contains the full text (typed and hand-written) 
and describes the structure of this text: it defines the hierarchical 
structure that groups the individual notes, specifies features like 
underlining, and identifies, for example, the person who wrote a 
particular piece of text. The XML file also contains pointers to the 
files that contain the page images. A document header contains (among 
other things) the meta-information that is necessary for cataloguing: 
author, date, information about attachments, etc.
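
To make this description concrete, here is a minimal sketch in Python 
of what such a file might contain and how software could read it. The 
sample is an assumption made for illustration only: the title, author, 
image file name and hand identifier are invented, the choice of 
elements (hi, add, facsimile/graphic) is just one possible encoding, 
and the fragment has not been validated against any TEI schema.

```python
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample document: all names and values are invented.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Weekly Notes (sample)</title>
        <author>Example Author</author>
      </titleStmt>
      <publicationStmt><p>Unpublished illustration</p></publicationStmt>
      <sourceDesc><p>Typed page with a hand-written addition</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <facsimile>
    <graphic url="page-001.jpg"/>
  </facsimile>
  <text>
    <body>
      <div type="note">
        <pb n="1"/>
        <p>Typed text with an <hi rend="underline">underlined phrase</hi>.
          <add hand="#hand1">Hand-written remark</add></p>
      </div>
    </body>
  </text>
</TEI>"""


def summarize(xml_text):
    """Pull out the kinds of information the paragraph above mentions."""
    root = ET.fromstring(xml_text)
    return {
        # Cataloguing metadata from the header
        "title": root.findtext(".//tei:titleStmt/tei:title", namespaces=NS),
        "author": root.findtext(".//tei:titleStmt/tei:author", namespaces=NS),
        # Pointers to the page image files
        "images": [g.get("url")
                   for g in root.iterfind(".//tei:facsimile/tei:graphic", NS)],
        # A textual feature: underlined passages
        "underlined": [h.text
                       for h in root.iterfind(".//tei:hi[@rend='underline']", NS)],
        # Who wrote a particular piece of text (hand identifier, text)
        "handwritten": [(a.get("hand"), a.text)
                        for a in root.iterfind(".//tei:add", NS)],
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(key, value)
```

The point of the sketch is only that text, structure, image pointers 
and cataloguing data sit in one file and can be read back by any 
XML-aware tool.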
	A phase of document preparation will thus result in a collection of XML 
files. We describe the workflow of this process in section 2 of this 
response. A number of possible components of these files are presented 
in section 3. How a working system can be created on the basis of the 
XML files is discussed in section 4. Based on these discussions, in 
section 5 we address the specific questions formulated in the Request 
for Information. Section 6 contains pointers to a number of web sites 
that present different sorts of documents based on the technologies we 
advocate here. Finally, section 7 is a glossary that explains technical 
terminology.
	[Do we need to say why we are interested in this? If so, I'd say our 
main interest is in seeing that they use the proper technology. We want 
that because we believe it's best for everyone, and also because of the 
publicity value it would have for us.]


2. Workflow for document preparation

The workflow might consist of the following phases:
(1) High-quality digital photography (the samples on the web show some 
scans where part of the page is missing)
(2) Creation of an inventory of all pages: what pages are there, what 
are their dates and authors, to what sets of notes do they belong, are 
they notes proper or attachments to notes, are they possibly duplicates 
of other pages, etc.
[To me this seems to call for a simple database; from that database, the 
outline of the TEI documents (basic headers, facsimile section, pb 
elements) can then be generated]
(3) Creation of guidelines for the desired encoding
This will involve weighing the desiderata that emerge from study of the 
material against technical possibilities, available funding and time
(4) Transcribing typed content, presumably by outsourcing this overseas
(5) Transcribing hand-written notes
(6) Extending the encoding with more complex phenomena, such as internal 
references, indexing, identifying persons and projects
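
The idea in phase (2), generating the outline of a TEI document from 
the page inventory, could be sketched roughly as follows in Python. 
This is an assumption-laden illustration, not a proposed tool: the 
function name build_skeleton, the field names 'n' and 'image', and the 
sample values are all hypothetical, and a plain list of dictionaries 
stands in for rows read from the inventory database.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def tei(tag):
    """Return the namespace-qualified name for a TEI element."""
    return "{%s}%s" % (TEI_NS, tag)


def build_skeleton(title, author, date, pages):
    """Build a document outline: basic header, facsimile section, pb elements.

    pages is a list of dicts with hypothetical keys 'n' (page number)
    and 'image' (image file name), e.g. rows from the inventory.
    """
    ET.register_namespace("", TEI_NS)  # serialise TEI as the default namespace
    root = ET.Element(tei("TEI"))

    # Basic header filled in from the inventory record
    header = ET.SubElement(root, tei("teiHeader"))
    file_desc = ET.SubElement(header, tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title
    ET.SubElement(title_stmt, tei("author")).text = author
    pub = ET.SubElement(file_desc, tei("publicationStmt"))
    ET.SubElement(pub, tei("p")).text = "Generated from the page inventory"
    src = ET.SubElement(file_desc, tei("sourceDesc"))
    ET.SubElement(src, tei("p")).text = "Source dated %s" % date

    # Facsimile section and an empty body with one page break per page
    facsimile = ET.SubElement(root, tei("facsimile"))
    text = ET.SubElement(root, tei("text"))
    body = ET.SubElement(text, tei("body"))
    div = ET.SubElement(body, tei("div"))
    for page in pages:
        ET.SubElement(facsimile, tei("graphic"), {"url": page["image"]})
        ET.SubElement(div, tei("pb"),
                      {"n": str(page["n"]), "facs": page["image"]})
    return root


if __name__ == "__main__":
    # Invented inventory rows for illustration
    inventory = [{"n": 1, "image": "note-001.jpg"},
                 {"n": 2, "image": "note-002.jpg"}]
    skeleton = build_skeleton("Weekly Notes (sample)", "Example Author",
                              "1960-01-01", inventory)
    print(ET.tostring(skeleton, encoding="unicode"))
```

Transcribers in phases (4) and (5) would then fill in the text between 
the generated page breaks rather than starting from an empty file.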


3. TEI components

This section will explain that a TEI schema can be created that 
contains just those components that NASA decides to use. It will 
explain some of the available components, but only very briefly, and 
relate the different encodings to what they offer in terms of enhanced 
access. It will explain ODD in qualitative terms.

4. Possible technical architecture of a working system

[This section would discuss what to do once the XML has been created. 
I'd stress there are multiple options, e.g. Cocoon + stylesheets + 
Lucene (or eXist). Mention some of the options from Lou's presentation 
at http://tei.oucs.ox.ac.uk/Oxford/2007-02-13-oucs/talk-publishing.xml]
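
As a rough illustration of the search component in such an 
architecture, the Python sketch below builds a toy full-text index over 
the text of a set of TEI files. It is only a stand-in for what a real 
engine such as Lucene or eXist provides, not a use of either product; 
the function names build_index and search and the sample documents are 
hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI_NS = "{http://www.tei-c.org/ns/1.0}"


def build_index(documents):
    """Map each lower-cased word to the ids of the documents containing it.

    documents is a dict from a document id to its TEI XML as a string.
    Only the <text> part is indexed, so header metadata is not searched.
    """
    index = defaultdict(set)
    for doc_id, xml_text in documents.items():
        root = ET.fromstring(xml_text)
        text_el = root.find(TEI_NS + "text")
        if text_el is None:
            continue
        content = "".join(text_el.itertext()).lower()
        for word in re.findall(r"\w+", content):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    terms = re.findall(r"\w+", query.lower())
    if not terms:
        return set()
    return set.intersection(*[index.get(term, set()) for term in terms])


if __name__ == "__main__":
    wrapper = ('<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader/>'
               '<text><body><p>%s</p></body></text></TEI>')
    # Invented sample documents
    docs = {"note-a": wrapper % "Engine test delayed",
            "note-b": wrapper % "Engine schedule"}
    idx = build_index(docs)
    print(sorted(search(idx, "engine")))
    print(sorted(search(idx, "engine test")))
```

A production system would add ranking, stemming and searches restricted 
to particular elements, which is where the encoding pays off.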

5. Approach to concepts

This would answer NASA's specific questions, in so far as we have answers:
[1. How should NASA catalogue the Weekly Notes? Do you have specific 
ideas on how to implement the approach or strategy?
2. What format(s) should the Weekly Notes be available in?
3. How should the Weekly Notes be indexed?
4. What timeframe do you expect this work to require?
5. What other strategies or approaches do you recommend that NASA pursue 
that would contribute to successful cooperation between NASA and other 
entities to create a successful and useful product from the Weekly 
Notes? Could these notes form the basis for understanding management 
best practices? Could engineering design and operational considerations 
be derived from these notes? Could these notes form the basis for formal 
classroom training? ]

6. Links

Links to sample projects, to the Guidelines, and to some introductory 
material.


7. Glossary



