[tei-council] <said> proposals available

Syd Bauman Syd_Bauman at Brown.edu
Sun Jul 15 22:32:29 EDT 2007


The two separate proposals for the new <said> element are now
available in http://www.tei-c.org/Drafts/said/ (or will be as soon as
the server syncs). The two proposals are in files called
  said-asis
  said-q-hi
and each has .odd source and derived .doc.html, .rnc, and .rng files. 

The two proposals are very similar. The only difference is how the
<q> element is defined. In the latter ("said-q-hi") proposal, the
Guidelines are explicit that <q> can be used for any of the various
underlying reasons that get represented with quotation marks. I
prefer this proposal, in part because I think lots of people already
use <q> this way.

Here is a quick executive summary:

<said> is for direct speech (or its discursive equivalents: e.g.
       reported thought or speech, dialog, etc.), whether real or
       contrived, typically as part of the current text, although I
       suppose one could imagine otherwise. Most common usage is
       likely to be a character's spoken words in a novel or a
       person's spoken words reported in a non-fiction article. In
       English prose it will very often be associated with phrases
       like "he said", or "she asked". <said> is not a viable child
       of <cit>.

<quote> is for material that is quoted from sources outside the text,
        whether correctly or not, whether real or contrived, whether
        originally spoken or written. Most common usage is likely to
        be quoting passages from other documents. May be used in a
        dictionary for real or contrived examples of usage. <quote>
        is still a viable child of <cit>.
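
As a rough illustration of the <said> / <quote> split described above,
here is a minimal sketch. The sample text and the <bibl> content are
invented for illustration and come from neither proposal; only the
element names and the <cit> relationship are from the summary.

```xml
<!-- Hypothetical sample text; not taken from either proposal. -->

<!-- Direct speech that is part of the current text: <said> -->
<p><said>Where have you put the manuscript?</said> she asked.</p>

<!-- Material quoted from a source outside the text: <quote>,
     here as a child of <cit> alongside its (invented) citation -->
<p>As the preface puts it,
  <cit>
    <quote>a text is never finished, only abandoned</quote>
    <bibl>Preface to an invented edition</bibl>
  </cit>.</p>
```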

--------- said-asis: ---------
<q> is for passages quoted from elsewhere; in narrative, either
    direct or indirect speech or something being quoted from outside
    the text; in dictionaries, real or contrived examples of usage.
    <q> is still a viable child of <cit>, for those who don't use the
    more specific <quote>.

--------- said-q-hi: ---------
<q> is for any of a number of features when differentiating among
    them is not desired, e.g. because it is economically not feasible
    or simply not of interest for the current purpose. Items that may
    be encoded this way include
    - representation of speech or thought
    - quotation
    - technical terms and glosses
    - passages mentioned, not used
    - authorial distance
    and perhaps even
    - from a foreign language
    - linguistically distinct
    - emphasized
    - any other use of quotation marks in the source
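
A minimal sketch of what the said-q-hi reading of <q> might look like
in practice, assuming an encoder who records that quotation marks are
present but does not want to say why. The sentence is invented for
illustration.

```xml
<!-- Hypothetical sample. The first two <q>s mark authorial distance
     or a gloss, the third marks speech; all are tagged alike because
     the encoder has chosen not to differentiate them. -->
<p>He called the plan <q>bold</q>, which she took to mean
  <q>reckless</q>; <q>We shall see</q>, she replied.</p>
```

Under said-asis, by contrast, only the third of these would fall
clearly within the stated definition of <q>.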


Some tangentially related items I noticed: 

* I think the example with <list type="speakers"> should be re-worked
  so that the who= attributes are pointing to <person>s, but as I
  don't speak French (that is French, right? -- there should really
  be an xml:lang= on the <egXML>, no?) I am not a good choice to do
  that work.

* In the last example of the section, the word "language" is encoded
  as a <mentioned>, but I don't think that's right. I'm not very
  confident about what *is* right, but I'd prefer <term> to
  <mentioned>. (I suppose we could ask the co-author of the source
  for the example, Terry Langendoen, who chaired the committee on
  text analysis and interpretation back in the early 1990s :-)
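
For the first item above (who= pointing to <person>s), a rough sketch
of the shape such a rework might take. This is not the Guidelines
example itself: the identifiers, names, and utterances are all
hypothetical, and it assumes who= carries a URI-style pointer to an
xml:id.

```xml
<!-- Hypothetical identifiers and names, for illustration only. -->
<listPerson>
  <person xml:id="spk1"><persName>First Speaker</persName></person>
  <person xml:id="spk2"><persName>Second Speaker</persName></person>
</listPerson>

<!-- Each who= points at the xml:id of a <person> above -->
<q who="#spk1">Are you coming?</q>
<q who="#spk2">In a moment.</q>
```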


Some unrelated changes I've made in the ODD:

* lowercase 'm' -> uppercase 'M' in description of em dash

* "the quotation is marked up as part of a concurrent but independent
  hierarchy" changed to "the quotation is marked up using stand-off
  markup", as we don't do concurrent hierarchies any more

* "the quotation boundaries are represented by empty milestone tags"
  changed to "the quotation boundaries are represented by empty
  segment boundary delimiter elements", as they're *not* milestones!



