[tei-council] FW: first stab at Google > TEI

Martin Mueller martinmueller at northwestern.edu
Wed Jun 22 20:56:48 EDT 2011

It would be worth following up on Stuart Yeates' suggestion and picking a set
of texts that have different types of problems at the paragraph-like

What about verse? It is characteristically indented. Can it be
distinguished from indented prose?
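
One way to make the verse question concrete is a rough heuristic, sketched here in Python. It is only a sketch under assumptions that are not from this thread: that verse lines are nearly all indented and stop well short of the full measure, while indented prose is indented on its first line only and then runs to the full width. The function name and the thresholds (full_width, min_ratio, the 0.75 factor) are hypothetical and would need tuning against real texts.

```python
def looks_like_verse(block, full_width=70, min_ratio=0.8):
    """Guess whether a block of lines is verse rather than indented prose.

    Assumptions (hypothetical, not from the thread): verse lines are mostly
    indented and shorter than the full line width; indented prose is
    indented on its first line only and then fills the measure.
    """
    # Ignore blank lines; they carry no evidence either way.
    lines = [ln for ln in block.splitlines() if ln.strip()]
    if len(lines) < 2:
        return False  # a single line is too little evidence

    # Count lines that start with whitespace.
    indented = sum(1 for ln in lines if ln[0] in " \t")
    # Count lines that stop well short of the assumed full width.
    short = sum(1 for ln in lines if len(ln.rstrip()) < 0.75 * full_width)

    return (indented / len(lines) >= min_ratio
            and short / len(lines) >= min_ratio)


verse = (
    "    Shall I compare thee to a summer's day?\n"
    "    Thou art more lovely and more temperate:\n"
    "    Rough winds do shake the darling buds of May,\n"
)
prose = (
    "    My father had a small estate in Nottinghamshire; I was the third of\n"
    "five sons. He sent me to Emanuel College in Cambridge at fourteen years\n"
    "old, where I resided three years, and applied myself close to my studies.\n"
)
```

A test like this would misfire on block quotations and on short indented prose, which is part of why a deliberately chosen set of problem texts would be useful.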

If you know that a text is a play or contains only plays, can you guess
speaker changes? 
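
For the play case, a first guess could look for a short name at the start of a line followed by a period or colon. The Python sketch below assumes that convention, which is not stated in this thread and certainly does not hold for every edition; the pattern and the function name are hypothetical.

```python
import re

# Hypothetical pattern: a capitalised or all-caps name of 2 to 15 letters at
# the start of a line, followed by "." or ":" and then the speech text,
# e.g. "HAMLET. To be, or not to be,"
SPEAKER = re.compile(r"^\s*([A-Z][A-Za-z]{1,14})[.:]\s+(.*)$")


def guess_speeches(lines):
    """Group lines into [speaker, text] pairs using the pattern above."""
    speeches = []
    for line in lines:
        m = SPEAKER.match(line)
        if m:
            # A new speaker prefix starts a new speech.
            speeches.append([m.group(1), m.group(2)])
        elif speeches and line.strip():
            # Any other non-blank line continues the current speech.
            speeches[-1][1] += " " + line.strip()
    return speeches


sample = [
    "HAMLET. To be, or not to be,",
    "  that is the question.",
    "OPHELIA. Good my lord,",
]
```

A line of dialogue that happens to begin with a one-word sentence ("Yes. I will.") would be misread as a speaker change, so knowing in advance that the text is a play helps but does not settle the matter.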

And so forth, not to speak of pursuing all this in different languages and

On 6/22/11 6:57 PM, "Sebastian Rahtz" <sebastian.rahtz at oucs.ox.ac.uk>
wrote:

>The revised Gulliver looks fine, within its limitations. Until/unless
>chapter detection is available,
>probably not much point doing any more work there.
>perhaps try something a bit more complex? eg
>A bigger challenge would be
>&f=false .....
>Sebastian Rahtz   
>Head of Information and Support Group, Oxford University Computing
>13 Banbury Road, Oxford OX2 6NN. Phone +44 1865 283431
>Sólo le pido a Dios
>que el futuro no me sea indiferente
