[tei-council] Von Braun toc with some content

Thu Aug 13 10:02:23 EDT 2009

I went ahead and started a Google Doc. Invitations were sent to all of
you at the email addresses I have for you, so please let me know if you
haven't received one. (James, your email didn't work for some reason;
can you please send me an email address off-list that I can use to
subscribe you to the document?)

Peter Boot is the document "owner" but we all have permission to edit/comment.

http://docs.google.com/Doc?docid=0AWaTLxlvC6kUZGYzOThqcG1fNmY2dHJiNGZt&hl=en

Dot

P.S. I hope that I am not the only person who thinks of Tom Lehrer
every time Von Braun's name is mentioned.

On Thu, Aug 13, 2009 at 10:23 AM, Elena Pierazzo <elena.pierazzo at kcl.ac.uk> wrote:
> I would stress the fact that TEI is particularly well suited to
> transcribing manuscripts, as it allows one to capture the text while
> preserving most of the features of the manuscripts.
>
> It also allows names, dates, events and keywords to be normalised,
> which makes such entities easy to index.
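
The point about normalisation and indexing can be made concrete with a
minimal sketch in Python (standard library only). The sample paragraph,
the names, the dates and the "#p1" identifier are all invented for
illustration; the only TEI features assumed are persName with @ref and
date with @when. Two different spellings of the same person end up
under one index key, and the dates sort correctly whatever their
wording in the text.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical fragment: wording, names, dates and the #p1 id are invented.
SAMPLE = """<p xmlns="http://www.tei-c.org/ns/1.0">
  Meeting with <persName ref="#p1">Dr. Smith</persName> on
  <date when="1961-03-06">March 6th</date>;
  <persName ref="#p1">J. Smith</persName> asked for a follow-up by
  <date when="1961-03-13">next Monday</date>.
</p>"""


def build_index(xml_text):
    """Group name spellings by their normalised @ref and collect @when dates."""
    root = ET.fromstring(xml_text)
    people = defaultdict(list)
    for name in root.iter(f"{{{TEI_NS}}}persName"):
        people[name.get("ref")].append("".join(name.itertext()).strip())
    dates = sorted(
        d.get("when") for d in root.iter(f"{{{TEI_NS}}}date") if d.get("when")
    )
    return dict(people), dates


people, dates = build_index(SAMPLE)
print(people)
print(dates)
```
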
>
> E
>
> On 12 Aug 2009, at 22:33, Peter Boot wrote:
>
>>
>> This is what I imagine the body of the response might look like. Your
>> thoughts are welcome.
>>
>> Peter
>>
>>
>> 1. Introduction
>>
>> TEI is a standard that has been successfully and widely used in the
>> digital transcription of texts from many periods. It has been used
>> both for mass digitisation in digital libraries and for digitisation
>> of literary manuscripts.
>> It is successful for a number of reasons:
>> * it contains modules both for very regularly occurring textual
>> features, such as lists or tables, and for specialised features, such
>> as linguistic analysis. Recently a module for technical documentation
>> was developed for ISO. Where new features are necessary, the system
>> can be easily extended;
>> * it focuses on the creation of an application-independent digital
>> representation of the source document. Because the representation
>> does not depend on the capabilities of specific software, it will
>> outlast today's software and can be used for many different purposes;
>> * [more bragging]
>> The basic idea is that each document is encoded as an XML file. (There
>> is a glossary at the back of this document that explains technical
>> terms.) This XML file contains the full text (typed and hand-written)
>> and describes the structure of this text: it defines the hierarchical
>> structure that groups the individual notes, specifies features like
>> underlining, and identifies, for example, the person who wrote a
>> particular piece of text. The XML file also contains pointers to the
>> files that contain the page images. A document header contains (among
>> other things) the meta-information that is necessary for cataloguing:
>> author, date, information about attachments, etc.
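
A rough sketch of what one such file might contain, built here with
Python's standard library so that it can be run and inspected. All
titles, names, text and file paths are placeholders; the element choices
(hi for underlining, add with @hand for a hand-written remark, pb with
@facs pointing into a facsimile section) are one plausible encoding, not
a decision anyone has made, and the output has not been validated
against a TEI schema.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
ET.register_namespace("", TEI_NS)  # serialise TEI as the default namespace


def sub(parent, tag, text=None, **attrs):
    """Append a TEI-namespaced child element and return it."""
    el = ET.SubElement(parent, f"{{{TEI_NS}}}{tag}", attrs)
    el.text = text
    return el


def build_note():
    root = ET.Element(f"{{{TEI_NS}}}TEI")

    # Header: the cataloguing metadata (all values are placeholders).
    header = sub(root, "teiHeader")
    file_desc = sub(header, "fileDesc")
    title_stmt = sub(file_desc, "titleStmt")
    sub(title_stmt, "title", "Weekly note (placeholder title)")
    sub(title_stmt, "author", "Placeholder Author")
    sub(sub(file_desc, "publicationStmt"), "p", "Unpublished sketch.")
    sub(sub(file_desc, "sourceDesc"), "p", "Typescript with annotations.")
    hand_notes = sub(sub(header, "profileDesc"), "handNotes")
    sub(hand_notes, "handNote", "Annotating hand (placeholder).").set(XML_ID, "h1")

    # Facsimile: pointers to the page images.
    surface = sub(sub(root, "facsimile"), "surface")
    surface.set(XML_ID, "s1")
    sub(surface, "graphic", url="images/page-001.jpg")

    # Text: structure, underlining, and who wrote what.
    div = sub(sub(sub(root, "text"), "body"), "div", type="note")
    sub(div, "pb", facs="#s1")
    p = sub(div, "p", "Status is ")
    sub(p, "hi", "on schedule", rend="underline").tail = ". "
    sub(p, "add", "Placeholder remark", hand="#h1", place="margin")
    return root


print(ET.tostring(build_note(), encoding="unicode"))
```
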
>>       A phase of document preparation will thus result in a collection
>> of XML files. We describe the workflow of this process in section 2 of
>> this response. A number of possible components of these files are
>> presented in section 3. How a working system can be created on the
>> basis of the XML files is discussed in section 4. Based on these
>> discussions, in section 5 we address the specific questions formulated
>> in the Request for Information. Section 6 contains pointers to a
>> number of web sites that present different sorts of documents based on
>> the technologies we advocate here. Finally, section 7 is a glossary
>> that explains technical terminology.
>>       [Do we need to say why we are interested in this? If so, I'd say
>> our main interest is seeing that they use the proper technology. We
>> want that partly because we believe it's best for everyone, and partly
>> because of the publicity value this would have for us.]
>>
>>
>> 2. Workflow for document preparation
>>
>> Might consist of the following phases:
>> (1) High-quality digital photography (the samples on the web show some
>> scans where part of the page is missing)
>> (2) Creation of an inventory of all pages: what pages are there, what
>> are their dates and authors, to what sets of notes do they belong, are
>> they notes proper or attachments to notes, are they possibly
>> duplicates of other pages, etc.
>> [To me this seems to call for a simple database; from that database,
>> the outline of the TEI documents (basic headers, facsimile section, pb
>> elements) can then be generated]
>> (3) Creation of guidelines for the desired encoding
>> This will involve a consideration of the desiderata that emerge from
>> study of the material, technical possibilities, available funding and
>> time
>> (4) Transcribing typed content, presumably by sending this overseas
>> (5) Transcribing hand-written notes
>> (6) Extending the encoding with more complex phenomena, such as
>> internal references, indexing, identifying persons and projects
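
The idea in phase (2) of generating TEI outlines from a simple database
might look roughly like the following sketch. It uses an in-memory
SQLite table as the "simple database"; the table layout, column names,
set identifiers, image paths, authors and dates are all hypothetical.
For one set of notes it produces a basic header, a facsimile section
with one surface per page image, and a pb element per page, leaving the
transcription itself to the later phases.

```python
import sqlite3
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
ET.register_namespace("", TEI_NS)


def el(parent, tag, text=None, **attrs):
    """Append a TEI-namespaced child element and return it."""
    node = ET.SubElement(parent, f"{{{TEI_NS}}}{tag}", attrs)
    node.text = text
    return node


def make_inventory():
    """Build a toy page inventory; every row here is invented."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE page (note_set TEXT, seq INTEGER, image TEXT,"
        " written TEXT, author TEXT)"
    )
    db.executemany(
        "INSERT INTO page VALUES (?, ?, ?, ?, ?)",
        [
            ("set-001", 1, "img/0001.jpg", "1961-01-09", "Author A"),
            ("set-001", 2, "img/0002.jpg", "1961-01-09", "Author A"),
            ("set-002", 1, "img/0003.jpg", "1961-01-16", "Author B"),
        ],
    )
    return db


def skeleton(db, note_set):
    """Generate header, facsimile section and pb elements for one set."""
    rows = db.execute(
        "SELECT seq, image, written, author FROM page"
        " WHERE note_set = ? ORDER BY seq",
        (note_set,),
    ).fetchall()
    root = ET.Element(f"{{{TEI_NS}}}TEI")
    file_desc = el(el(root, "teiHeader"), "fileDesc")
    title_stmt = el(file_desc, "titleStmt")
    el(title_stmt, "title", f"Weekly Notes, {note_set}")
    el(title_stmt, "author", rows[0][3])
    el(el(file_desc, "publicationStmt"), "p", "Generated skeleton.")
    el(el(file_desc, "sourceDesc"), "p",
       f"{len(rows)} page image(s), dated {rows[0][2]}.")
    facsimile = el(root, "facsimile")
    body = el(el(root, "text"), "body")
    for seq, image, _written, _author in rows:
        surface = el(facsimile, "surface")
        surface.set(XML_ID, f"{note_set}-p{seq}")
        el(surface, "graphic", url=image)
        # One page break per image, linked back to its surface.
        el(body, "pb", n=str(seq), facs=f"#{note_set}-p{seq}")
    return root


doc = skeleton(make_inventory(), "set-001")
print(ET.tostring(doc, encoding="unicode"))
```

Keeping the inventory in a database rather than in the XML itself means
that duplicates and misfiled pages can be sorted out before any
transcription starts.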
>>
>>
>> 3. TEI components
>>
>> Will explain that a TEI schema can be created that contains just those
>> components that NASA has decided they will want to use. Explain some
>> of the available components, but only very briefly. Relate this to
>> what the different encodings mean in terms of enhanced access. Explain
>> ODD in qualitative terms.
>>
>> 4. Possible technical architecture of a working system
>>
>> [This section would discuss what to do once the XML has been created.
>> I'd stress there are multiple options, e.g. Cocoon + stylesheets +
>> Lucene (or eXist). Mention some of the options from Lou's presentation
>> at
>> http://tei.oucs.ox.ac.uk/Oxford/2007-02-13-oucs/talk-publishing.xml]
>>
>> 5. Approach to concepts
>>
>> This would answer NASA's specific questions, insofar as we have
>> answers:
>> [1. How should NASA catalogue the Weekly Notes? Do you have specific
>> ideas on how to implement the approach or strategy?
>> 2. What format(s) should the Weekly Notes be available in?
>> 3. How should the Weekly Notes be indexed?
>> 4. What timeframe do you expect this work to require?
>> 5. What other strategies or approaches do you recommend that NASA
>> pursue that would contribute to successful cooperation between NASA
>> and other entities to create a successful and useful product from the
>> Weekly Notes? Could these notes form the basis for understanding
>> management best practices? Could engineering design and operational
>> considerations be derived from these notes? Could these notes form the
>> basis for formal classroom training?]
>>
>> 6. Links
>>
>> Links to sample projects, to the Guidelines, and to some introductory
>> material.
>>
>>
>> 7. Glossary
>>
>>
>> _______________________________________________
>> tei-council mailing list
>> tei-council at lists.village.Virginia.EDU
>> http://lists.village.Virginia.EDU/mailman/listinfo/tei-council
>
> --
> Dr Elena Pierazzo
> Research Associate
> Centre for Computing in the Humanities
> King's College London
> 26-29 Drury Lane
> London WC2B 5RL
>
> Phone: 0207-848-1949
> Fax: 0207-848-2980
> elena.pierazzo at kcl.ac.uk
> www.kcl.ac.uk
>

-- 
*~*~*~*~*~*~*~*~*~*~*
Dot Porter (MA, MSLS)          Metadata Manager
Digital Humanities Observatory (RIA), Regus House, 28-32 Upper
Pembroke Street, Dublin 2, Ireland
-- A Project of the Royal Irish Academy --
Phone: +353 1 234 2444        Fax: +353 1 234 2400
http://dho.ie          Email: dot.porter at gmail.com
*~*~*~*~*~*~*~*~*~*~*