[tei-council] More thoughts about examples.
James Cummings
James.Cummings at oucs.ox.ac.uk
Thu Apr 30 06:40:24 EDT 2009
In Lyon we had a discussion about examples and we've been moving forward
on some of the results of that discussion (thanks especially to
Sebastian and David).
One of the things that I suggested was "decoupling examples from the
Guidelines and making them referenceable" (according to the minutes, I'm
sure I didn't put it so cogently). I just wanted to go on record about
what I meant by this. The idea was that, as with the element/class/macro
specifications, the underlying code for the examples should be stored
separately and then pointed to by the Guidelines in some manner (a
tei:ptr that is resolved when generating the Guidelines, an XInclude,
whatever).
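To make the pointer idea concrete, here is a minimal sketch in Python of
what "resolving when generating the Guidelines" could look like. It is
only an illustration under stated assumptions: the type="example" value
on tei:ptr, the eg-cb-01 identifier, and the tiny CORPUS and CHAPTER
documents are all made up for the sketch, not part of the existing P5
source or build.

```python
import copy
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
EG = "http://www.tei-c.org/ns/Examples"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical corpus file: examples stored once, each with an xml:id.
CORPUS = f"""<div xmlns="{TEI}">
  <egXML xmlns="{EG}" xml:id="eg-cb-01" xml:lang="en"><p>one<cb n="2"/>two</p></egXML>
</div>"""

# Hypothetical chapter prose that points at the example instead of containing it.
# The type="example" convention is an assumption made for this sketch.
CHAPTER = f"""<div xmlns="{TEI}">
  <p>Column breaks are marked with cb:</p>
  <ptr type="example" target="#eg-cb-01"/>
</div>"""


def resolve_example_pointers(chapter, corpus):
    """Replace each <ptr type="example"> with a copy of the example it targets."""
    by_id = {el.get(XML_ID): el for el in corpus.iter() if el.get(XML_ID)}
    # Take a snapshot of the elements first, because children are swapped in place.
    for parent in list(chapter.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == f"{{{TEI}}}ptr" and child.get("type") == "example":
                target = child.get("target", "").lstrip("#")
                example = copy.deepcopy(by_id[target])  # KeyError if the target is missing
                example.tail = child.tail  # keep the whitespace that followed the pointer
                parent[index] = example
    return chapter


chapter = resolve_example_pointers(ET.fromstring(CHAPTER), ET.fromstring(CORPUS))
print(ET.tostring(chapter, encoding="unicode"))
```

An XInclude would achieve much the same thing declaratively; the sketch
only shows that the rendered chapter need not differ from what we
generate today.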
The arguments against this (according to the minutes) were that a) this
is not our most important task right now and b) decontextualised
examples don't always make sense. I cannot help but agree with a), but
I don't think that is a reason not to think about it, just a reason to
give its implementation a low priority. I also agree with b), but the
point isn't to decontextualise the examples in the rendered version of
the Guidelines; it is to give the underlying infrastructure a more
resilient and flexible structure for re-use and maintenance of the
examples in an internationalised context.
One of my arguments for decoupling the examples was that they are
currently stored in multiple places. If memory serves, my learned
colleagues here in Oxford argued against that during the meeting, saying
that the Guidelines are a 'single resource' and although the element
specifications are stored separately from the prose they are really all
part of one big document.
That is, indeed, true. Well, to a point, and only as long as we have a
single English source. As soon as we have multiple translations of the
Guidelines prose, then this logic falls down. A quick review for those
who haven't looked at the P5/Source/ directory in Subversion for a while:
There is a master file (P5/Source/guidelines.xml) which is a symbolic
link to P5/Source/Guidelines/en/guidelines-en.xml. (i.e. the English
version is the current source of the Guidelines.) This file contains a
whole bunch of entity references to each of the chapters, the
appendices, and every element/class/macro specification as sub-files.
(That including things by entity reference makes me feel icky is a
matter for another discussion...) In the separate files which make up
the chapters, these entities can then be used to virtually include the
specifications at the correct place.
However, my reasons for decoupling the examples are brought about mainly
by our attempts towards internationalisation. We have a good system of
separation for the element/class/macro specifications which helps to
enable the internationalisation of the contents of those files. In most
cases (e.g. element descriptions) the specifications are the only
source for that kind of thing in the Guidelines. This is not the case
with examples: they appear both in the specifications and in the prose
of the Guidelines.
As noted above, the source of the guidelines.xml file is currently the
English one. While there is an 'fr' directory for a French version of
the Guidelines, its files are all symbolic links back to the English
version. (There is an HD-Header.xml with a now dated translation of the
Header chapter, but it doesn't appear to be used in generation of the
French version of the Guidelines online.) But the point is that our
underlying infrastructure should accommodate the possibility of
translations of the Guidelines in an efficient manner.
Hypothetically, let's say the French/Chinese produce a full translation
of the Guidelines. For some of the examples they will have substituted
better French/Chinese ones; for others they may have just translated
them into French/Chinese because their textual content is less
significant than the structure of the elements being demonstrated.
In other cases they might not have translated the example and just
used the English one, for whatever reason. Let's say they are showing an
example of the 'cb' element. They might produce a new example which
shows something interesting peculiar to French/Chinese texts, and in our
revision of the Guidelines we might want to use it. Sebastian's work on
producing element example reference pages means that someone reading the
Guidelines will now be able to see this, but if we want to use that
French example in our Chinese translation (or the English one!) then we
have to duplicate the content. This seems to go against the XML and TEI
doctrines of storing the information once and using it in multiple
places/manners.
To sum up, I think that we should decouple the examples from both the
prose and element/class/macro specifications and store them in a single
place, a corpus of examples. These should at the very least all have
@xml:id attributes to make them referenceable and @xml:lang attributes
to indicate their language. Any place they existed before should point
to or otherwise include these examples when rendering the Guidelines.
If just that were done then there would be no visible change to users.
However, it enables many more possibilities:
- examples which are duplicated in more than one place (multiple
Chapters, multiple element specifications, etc.) only need to exist once
and be pointed to multiple times
- examples which are translations can be stored next to their originals,
so that it is recognised that the one needs to change when the other
does (as with element descriptions)
- examples which are used verbatim in different language versions of the
Guidelines only need to be stored once
- it gives us a place to store new examples that people have created
(for later inclusion in the Guidelines/specs).
- it could be implemented gradually in a modular way, people choosing to
use it or not
- but overall it seems a more flexible and modular system than storing
them in multiple places.
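As a rough illustration of the corpus idea, the following Python sketch
stores each example once with @xml:id and @xml:lang and picks the
version for a requested language, falling back to English when no
translation exists. Everything specific here is an assumption made for
the sketch: the <examples> wrapper element, the identifiers, and the use
of @corresp to tie a translated example to its original are not an
agreed design.

```python
import xml.etree.ElementTree as ET

EG = "http://www.tei-c.org/ns/Examples"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical corpus: a translation sits next to its original and points
# back to it with @corresp (an assumed convention for this sketch).
CORPUS = f"""<examples xmlns:eg="{EG}">
  <eg:egXML xml:id="eg-cb-01" xml:lang="en"><eg:cb n="2"/></eg:egXML>
  <eg:egXML xml:id="eg-cb-01-fr" xml:lang="fr" corresp="#eg-cb-01"><eg:cb n="2"/></eg:egXML>
  <eg:egXML xml:id="eg-pb-01" xml:lang="en"><eg:pb n="3"/></eg:egXML>
</examples>"""


def pick_example(corpus, example_id, lang, fallback="en"):
    """Return the version of an example in `lang`, else the fallback-language one."""
    versions = {}
    for eg in corpus.iter(f"{{{EG}}}egXML"):
        own_id = eg.get(XML_ID, "")
        # An original counts as corresponding to itself.
        original = eg.get("corresp", "#" + own_id).lstrip("#")
        if original == example_id:
            versions[eg.get(XML_LANG)] = eg
    # Explicit membership test: an Element's truth value depends on its children.
    return versions[lang] if lang in versions else versions.get(fallback)


corpus = ET.fromstring(CORPUS)
french_cb = pick_example(corpus, "eg-cb-01", "fr")  # a French version exists
french_pb = pick_example(corpus, "eg-pb-01", "fr")  # falls back to the English one
print(french_cb.get(XML_ID), french_pb.get(XML_ID))
```

The same lookup would serve the prose and the element/class/macro
specifications alike, which is the point of storing the examples in one
place.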
I'd argue that storing examples both in the prose of the Guidelines and
in the element/class/macro specifications is inefficient and makes us
prone to producing inconsistent examples. While some may like to argue
that theoretically these are all in one document because of those entity
references, from a practical maintenance point of view they are in
separate files. I believe this impacts negatively on the way we produce
and use examples. There are all sorts of problems/discussions inherent
in the implementation of this, and I've intentionally left them out
because I think they are separate from the question of whether it is
theoretically better to separate the examples from their multiple
storage places.
Fielding, paraphrasing Seneca I believe, said it best:
"It is a trite but true observation, that examples work more forcibly on
the mind than precepts."
Sorry for the length,
-James
--
Dr James Cummings, Research Technologies Service, University of Oxford
James dot Cummings at oucs dot ox dot ac dot uk