[tei-council] quotation marks, quotes, etc.
Syd Bauman
Syd_Bauman at Brown.edu
Sun Apr 15 18:44:20 EDT 2007
I'd like to address Trac ticket #304.
Passages offset by quotation marks in the source may be encoded as
a specific type of feature, e.g. mentioned not used (<mentioned>),
authorial distancing (<soCalled>), quotation (<quote>), speech or
thought (<q>); or may be encoded as "taken from elsewhere, details
unknown or unsaid" (<q>).
The problem here is that <q> is overloaded, serving two purposes.
Need to develop a proposal to leave <q> as a generic (perhaps even
more generic?) element, and introduce a new element for the "speech
or thought" function.
I think the way forward here is pretty clear, and Lou & I agree on
the basic game plan sketched out above. But there are still a couple
of potentially controversial issues. So here is a slightly more
detailed proposal, followed by questions.
* Retain <quote> as it is: passage attributed to some agency external
to the text, i.e. a quotation from a written source. Remains a
member of model.quoteLike, also a member of new model.quoted.
* New element <quo> for direct speech or thought. (I.e., not a
quotation of a written source, not authorial distancing, not an
example in a dictionary entry.) A member of new model.quoted.
* Change semantics of <q> to be a bit broader, basically covering
anything that was indicated in the source with quotation marks, but
about which the encoder does not wish to say more. Essentially
syntactic sugar for <hi rend="quotation marks">. A member of
model.hiLike.
* <cit> remains as is, becomes a member of new model.quoted.
This system has the advantage of a clean break between quoting of
passages external to the text and direct speech or thought of, e.g.,
a character. But it also permits <q> to be used quite loosely, which
is good, because it reflects what lots of projects already do.
That is,
- quotation could be encoded with <quote> or <q>
- that which a character speaks could be encoded with <quo> or <q>
- authorial distance could be encoded with <soCalled> or <q>
- words mentioned not used could be encoded with <mentioned> or <q>
- dictionary examples could be encoded with <quote> or <q> (what if
they are contrived?)
- a filename that appeared in quotes could be encoded as <name> or
<q>
- a filepath that appeared in quotes could be encoded as <ident> or
<q>
- a newly introduced term could be encoded as <term> or <q>
You can well imagine projects that encode all this stuff with <q> on
the first pass (because it is easier and thus less expensive), and
then on a second pass convert to more nuanced encoding for those
aspects they care about, and not others.
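To make the two-pass idea concrete, here is a rough sketch of what a second pass might look like. Everything in it is an assumption for illustration: the sample paragraph, the type= values, and the REFINEMENTS mapping are made up, the sample omits the TEI namespace for brevity, and <quo> is only the element name proposed above, not an existing element.

```python
import xml.etree.ElementTree as ET

# Hypothetical first-pass encoding: everything that appeared in quotation
# marks is tagged <q>. The type= values are illustrative only.
FIRST_PASS = """<p>
<q type="spoken">Come in,</q> she said, pointing at the
<q type="soCalled">office</q>. The sign read
<q type="written">Abandon hope</q>, and a note said
<q>back soon</q>.</p>"""

# Assumed mapping from a project's own type values to the elements named
# in the proposal. <quo> is the proposed element, not an existing one.
REFINEMENTS = {
    "spoken": "quo",
    "written": "quote",
    "soCalled": "soCalled",
    "mentioned": "mentioned",
}


def refine(root):
    """Rename each <q> whose type has a known refinement; leave the rest as <q>."""
    # Materialize the list first so renaming cannot disturb the traversal.
    for el in list(root.iter("q")):
        target = REFINEMENTS.get(el.get("type"))
        if target is not None:
            el.tag = target
            del el.attrib["type"]
    return root


if __name__ == "__main__":
    refined = refine(ET.fromstring(FIRST_PASS))
    print(ET.tostring(refined, encoding="unicode"))
```

The untyped <q> ("back soon") is left alone, which is the point: a project refines only the distinctions it cares about and keeps the generic <q> for the rest.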
Questions:
* Does anyone strongly object to the basic idea?
* Is the name <quo> OK? (If not, please provide a suggested
alternative :-)
* Have I got the model divisions correct?